Data products, not pipelines
A pipeline is a one-way commitment from a source you do not control to a target that does not ask permission. A data product is a published contract with an owner, a consumer registry, an SLA and a quality definition.
The shift from pipeline thinking to product thinking is the single highest-return architectural decision the average enterprise data team can make in 2026. It is also the one that takes the longest culturally.
The lakehouse default
Databricks lakehouse remains our default reference architecture, with Snowflake or Microsoft Fabric serving the BI workload. The choice is driven by client landing zones, BI vendor preference and licensing - never by Moweb's convenience.
Domain-aligned data products live on the lakehouse. The semantic layer (dbt, Cube, semantic models in Fabric) makes them consumable. The feature store unifies analytics and ML.
Lineage as audit infrastructure
Every production AI system we ship has end-to-end lineage from source data through transformations to retrieval index and model output. When a regulator asks 'where did this number come from', we answer in five minutes, not five weeks.
OpenLineage, Marquez, Atlan, Microsoft Purview - the tooling is mature. The discipline of using it consistently is not.