Data Derp
Wide Breadth
Give enough context across the whole data domain to enable deeper dives into relevant and advanced topics later
Real World Slice
- Storage
- Data Ingestion
- Data Transformation
- Levels of Curation (Medallion Architecture)
- Visualization
- Streaming
- Business Value
Real World Scenario
- Wrangle EV Charging Data
- Perform analytics to interpret and communicate findings to stakeholders
- Treat data problems mindfully
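To make the scenario concrete, here is a minimal sketch of the curation levels (bronze, silver, gold) applied to EV charging records. It uses plain Python rather than Spark so it runs anywhere. The field names (`station_id`, `kwh`, `started_at`), the sample records, and the cleaning rules are hypothetical assumptions for illustration, not the course's actual dataset or pipeline.

```python
from collections import defaultdict

# Bronze: raw records as ingested. All fields and values here are
# hypothetical; a real charging dataset will have a different schema.
bronze = [
    {"station_id": "A", "kwh": "12.5", "started_at": "2023-01-01T08:00:00"},
    {"station_id": "A", "kwh": "7.5", "started_at": "2023-01-01T18:30:00"},
    {"station_id": "B", "kwh": "", "started_at": "2023-01-01T09:15:00"},      # missing reading
    {"station_id": "B", "kwh": "20.0", "started_at": "2023-01-02T10:00:00"},
    {"station_id": None, "kwh": "5.0", "started_at": "2023-01-02T11:00:00"},  # missing station
]


def to_silver(records):
    """Silver: drop rows with missing keys or readings and cast types."""
    silver = []
    for r in records:
        if not r["station_id"] or not r["kwh"]:
            continue  # assumed rule: unrepairable rows are dropped
        silver.append(
            {
                "station_id": r["station_id"],
                "kwh": float(r["kwh"]),
                "date": r["started_at"][:10],  # keep only the YYYY-MM-DD part
            }
        )
    return silver


def to_gold(silver_records):
    """Gold: aggregate cleaned sessions into total energy per station."""
    totals = defaultdict(float)
    for r in silver_records:
        totals[r["station_id"]] += r["kwh"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Dropping incomplete rows is only one possible policy. Treating data problems mindfully may instead mean quarantining such rows or flagging them for review, depending on what stakeholders need.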
Prerequisites
- Data Curiosity
- Python
- SQL
- Comfort with multi-system architectures
Concepts
- Batch vs. Streaming
- Distributed Systems: CAP Theorem
- Storage & Compute Considerations
- Apache Spark
- Intro to Modern Data Science
- Intro to Streaming Technologies
- Challenges with Big Data
- Delta Lake
- Platform- and consumer-thinking
- Data Lakehouse
- Data Mesh
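As a small illustration of the first concept in this list, batch vs. streaming, the sketch below computes the same total two ways: once over a complete, bounded dataset, and once incrementally as events arrive. It is plain Python, not Spark, and the event shape (`kwh`) and function names are hypothetical assumptions chosen for illustration.

```python
def batch_total(events):
    """Batch: the whole bounded dataset is available before computing."""
    return sum(e["kwh"] for e in events)


def streaming_totals(event_stream):
    """Streaming: keep running state and emit an updated result per event."""
    running = 0.0
    for e in event_stream:
        running += e["kwh"]
        yield running


# Hypothetical events; in a real stream these would arrive over time.
events = [{"kwh": 10.0}, {"kwh": 5.0}, {"kwh": 2.5}]

batch_result = batch_total(events)
stream_results = list(streaming_totals(iter(events)))

print(batch_result)
print(stream_results)
```

The final streaming value matches the batch result, but the streaming version produces intermediate answers along the way and never needs the full dataset in memory.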
Practice
- Databricks exercises
- Data Wrangling in Apache Spark
- Interactive Analysis via Notebooks
- Data Visualization & Storytelling
- Practice with Spark Streaming
- Intro to Delta Lake