Architecture & Philosophy

SqlCAD is built on a specific data warehouse architecture that favors denormalization and strong contracts.

The Layers

SqlCAD enforces a 3-layer architecture via specific Block Types:

1. Source (Bronze) - DEFINE TABLE

  • Purpose: Represents raw data as it lands in the warehouse.
  • Behavior:
    • Strict schema definition (columns metadata).
    • Can be marked external: true to skip schema enforcement during migration.
    • No transformations allowed.

2. Silver Layer - DEFINE DIM / DEFINE FACT

  • Purpose: Represents clean, conformed business objects (Dimensions) and events (Facts).
  • Dimensions (DIM):
    • Must have a Primary Key.
    • Stateful: Automatically tracked over time (SCD Type 2).
    • Normalized: Represents the state of an object (e.g. Customer, Product).
  • Facts (FACT):
    • Must have an Event Time.
    • Immutable: Represents a discrete event (e.g. Order Placed).
    • Point-in-Time: Joins to Dimensions respects temporal validity.
  • Legacy Note: DEFINE ENTITY is supported but DIM/FACT are preferred for explicit semantics.

3. Mart (Gold) - DEFINE REPORT / AGGREGATION / FEATURE_SET

  • Purpose: Consumable tables for BI, Reporting, or ML.
  • Types:
    • REPORT: Final, user-facing tables. Must have explicit schemas (no SELECT *). Auto-injects reference_time.
    • AGGREGATION: Reusable metric tables with GROUP BY.
    • FEATURE_SET: Collections of ML features joined to an entity with Point-in-Time correctness automatically handled.
  • Legacy: DEFINE MART matches the old style but is less opinionated.

The Graph

SqlCAD builds a dependency graph of your entire warehouse.

  • Nodes: Tables, Entities, Marts.
  • Edges: Explicit JOIN blocks.

"One Big Table" (OBT) Generation

Unlike traditional SQL where you manually write LEFT JOINs in every query, SqlCAD asks you to define the relationship once in a DEFINE JOIN block.

When you create a Mart and select columns from multiple entities, SqlCAD searches the graph for the shortest path between them and injects the necessary SQL.

Benefits:

  1. DRY (Don't Repeat Yourself): Define the complex join logic (e.g., temporal joins, multi-key joins) once.
  2. Consistency: Every analyst uses exactly the same logic to join Orders to Customers.
  3. Refactoring: Change the underlying join logic in one place, and recompile all Marts.