
Brendan Haire, Atlassian, Presentation at Chief Data & Analytics Officer Forum, Melbourne

Written by Corinium on Nov 11, 2015 11:13:00 AM

Big Data

 

  1. CDAO Forum Presentations. Building a data lake in the sky: Data lake on AWS; Agile lake delivery
  2. Who am I? Brendan Haire
     Data experience. Through my career I have built and managed:
     • a reporting platform for an Australian university
     • a data warehouse and BI solution for a telco in Europe
     • a data warehouse and real-time data integration platform for a bank
     ... and finally, I have led the Analytics and Data Integration team at Atlassian for the past year, delivering on our data strategy.
     About myself:
     • At Atlassian for over 4 years
     • In IT for 20 years
     • Roles from developer, dev manager, and architect to project management
     • Software engineering background
     • Developer at heart
  3. Atlassian: Context, Data, Starting point
     • Software company
     • Fast growing
     • Data driven
     • IPO
     • 200 TB of data
     • ~1000 users per week (~800 reporting, ~200 ad hoc)
     • 30k queries per day
     • Team of 4
     • Legacy EDW
     • Multiple data silos
     • Emerging problem
  4. The Problem: Scale/Cost; Data everywhere; Slow analysis; Duplication of effort
  5. Data lake on AWS: “A lake in the clouds”
  6. Principles
     Vision: a data pipeline and analytic platform that:
     • handles large and small data sets
     • supports real-time and batch functions
     • makes it easy to add raw data for immediate use
     • allows value to be progressively added through stages
     • supports self-service analysis and integration functions
     Themes: Scale, Friction, Enabling Analytics
  7. Conceptual: Source Systems; Data Applications; Business Intelligence; (1) Data Lake; (2) Data Stream
  8. Solution
  9. The Good, the Bad, the Ugly
     • New analytics capability
     • Less ETL and moving data
     • Performance
     • AWS: flexibility
     • Scaling: compute vs storage
     • Cost: control + predictability
     • High learning curve
     • New tooling
     • Data governance
     • ‘Cutting edge’ hurts
  10. Agile lake delivery: “From pond to lake”
  11. by Henrik Kniberg
  12. Minimum Viable Product (MVP)
  13. Weekly Active Usage (WAU)
  14. Enabling Innovation: Test, Feedback
      • Problem statement
      • Vision
      • Research
      • Talk to people
      • ShipIT / Hackathons
      • Spikes
      • Minimum Viable Product
      • User feedback
      • Usage hypothesis
  15. Takeaways
      • Self service: self service is key in reducing friction and enabling scale
      • Raw data: providing analysts access to raw data is a game changer
      • Incremental, feedback: incremental delivery and feedback drive innovation
      • Usage: when building a platform, usage is a great proxy for value
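
The slides name Weekly Active Usage (WAU) as the usage metric but do not say how it was computed. As a minimal sketch only, assuming a query log of (user, date) pairs, WAU could be derived by counting distinct users per ISO week. The function name `weekly_active_users` and the sample data are hypothetical and do not come from the presentation.

```python
from collections import defaultdict
from datetime import date


def weekly_active_users(events):
    """Count distinct users per ISO (year, week) from (user_id, date) pairs."""
    users_by_week = defaultdict(set)
    for user_id, day in events:
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        users_by_week[(iso[0], iso[1])].add(user_id)
    # Sort by (year, week) so the result reads chronologically.
    return {week: len(users) for week, users in sorted(users_by_week.items())}


# Hypothetical query-log sample: one entry per query run on the platform.
events = [
    ("ana", date(2015, 11, 2)),
    ("ana", date(2015, 11, 3)),  # same user, same week: counted once
    ("raj", date(2015, 11, 4)),
    ("ana", date(2015, 11, 9)),  # following week
]

wau = weekly_active_users(events)
for (year, week), count in wau.items():
    print(f"{year}-W{week:02d}: {count} active users")
```

Counting distinct users rather than raw queries keeps a single heavy user from inflating the figure, which matters if usage is being treated as a proxy for value.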
