Day 15: Explaining ML's Neglected Concepts
𝗢𝗟𝗔𝗣 (𝗢𝗻𝗹𝗶𝗻𝗲 𝗔𝗻𝗮𝗹𝘆𝘁𝗶𝗰𝗮𝗹 𝗣𝗿𝗼𝗰𝗲𝘀𝘀𝗶𝗻𝗴): The reason your "fast" database still can't answer a simple analytics question
You trained the model. The pipeline runs. Then someone asks "show me weekly accuracy by region by data source" - and your stack chokes.
The query isn't complex. The database is just the wrong shape.
What actually happens:
•OLTP databases optimize for fast row-level reads and writes; analytics queries, which scan many rows to aggregate a few columns, are the opposite workload entirely.
•OLAP systems pre-organize data into columnar formats so aggregations scan only the columns they need.
•A "cube" is a mental model: slice by time, dice by category, drill down or roll up on demand.
•The query that took 40 seconds on Postgres can run in 400ms on a columnar store - same data, different physics.
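The row-vs-column point above can be sketched in plain Python. This is an analogy, not a real storage engine: the data and field names are made up, and the "column store" is just the same records pivoted into one list per field.

```python
# Hypothetical evaluation records with several fields each.
rows = [
    {"week": w, "region": r, "source": s, "accuracy": 0.9}
    for w in range(52) for r in ("eu", "us") for s in ("web", "app")
]

# Row layout: every whole record is touched even though we need one field.
row_total = sum(rec["accuracy"] for rec in rows)

# Columnar layout: the same data pivoted into one list per field;
# the aggregation now scans only the 'accuracy' column.
columns = {key: [rec[key] for rec in rows] for key in rows[0]}
col_total = sum(columns["accuracy"])

assert row_total == col_total  # same answer, far less data touched per field
```

The speedup in real engines comes from exactly this: an aggregation over one column reads a fraction of the bytes a row store would, and columnar data compresses far better too.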
Key approaches in practice:
•Star schemas denormalize intentionally, trading storage for join-free query speed.
•Materialized views precompute expensive aggregations so dashboards don't recompute on every load.
•Partitioning by a time column is often the single highest-leverage OLAP optimization.
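The first two approaches can be sketched with stdlib SQLite. Table and column names here are illustrative, and since SQLite has no `CREATE MATERIALIZED VIEW`, the precomputed aggregate is written to a plain table - the same idea, done by hand.

```python
import sqlite3

# A minimal star schema: one fact table joined to one dimension table.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_eval (week INTEGER, region_id INTEGER, accuracy REAL);
    INSERT INTO dim_region VALUES (1, 'eu'), (2, 'us');
    INSERT INTO fact_eval VALUES
        (1, 1, 0.90), (1, 2, 0.85), (2, 1, 0.92), (2, 2, 0.88);

    -- "Materialized view" by hand: precompute the expensive aggregation
    -- into a table so dashboards never re-run the join + GROUP BY.
    CREATE TABLE mv_weekly_accuracy AS
        SELECT week, name AS region, AVG(accuracy) AS avg_accuracy
        FROM fact_eval JOIN dim_region USING (region_id)
        GROUP BY week, name;
""")

# Dashboard queries read the small precomputed table directly.
for row in con.execute(
    "SELECT * FROM mv_weekly_accuracy ORDER BY week, region"
):
    print(row)
```

In a real warehouse the fact table would hold millions of rows, and the materialized view would be refreshed on a schedule rather than built once.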
What happens in real stacks:
•BigQuery, Snowflake, Redshift, and DuckDB are all columnar OLAP engines under the hood.
•ML teams hit OLAP limits first when building feature stores or evaluation dashboards at scale.
•Hybrid Transactional/Analytical Processing (HTAP) is closing the gap - but most teams don't need it yet.
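Why time partitioning is such high leverage can be shown with a toy sketch in plain Python - a stand-in for what engines like BigQuery or DuckDB do with partition pruning, using made-up data:

```python
from collections import defaultdict

# Records grouped into one bucket per week, mimicking time partitions.
partitions = defaultdict(list)
for week in range(52):
    for accuracy in (0.9, 0.8):
        partitions[week].append({"week": week, "accuracy": accuracy})

def weekly_avg(week):
    # Partition pruning: only the matching bucket is read;
    # the other 51 weeks of data are never touched.
    bucket = partitions[week]
    return sum(r["accuracy"] for r in bucket) / len(bucket)
```

A query filtered on the partition column skips entire partitions outright, which is why "weekly accuracy" dashboards stay fast even as history accumulates.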
Your model metrics are only as queryable as your data architecture allows.