Loading Strategies

SqlCAD supports several strategies for moving data from the compiled "compiler" views into physical tables in your data warehouse.

Overview

By default, everything in SqlCAD is generated as a VIEW. This ensures that your logic is always "live" and respects the graph-based dependencies. However, for large datasets, you often need to persist data into tables using different loading patterns.

LayerTypeDefaultMaterialization Options
SourceTABLEExternalView (over files)
SilverENTITYViewview, table
GoldMARTViewview, table

Materialization (View vs Table)

You can control how each object is realized in the database using the materialization metadata (default: view).

materialization: view

  • Result: CREATE OR REPLACE VIEW ...
  • Location: setup.sql
  • Execution: Logic is executed at query time. Always fresh.

materialization: table

  • Result: CREATE TABLE IF NOT EXISTS ...
  • Location (DDL): setup.sql
  • Location (Load): recurring/{layer}/{name}.sql
  • Execution: Logic is executed when the recurring script is run. Good for performance optimization.

The build output is split to support this:

  • setup.sql: Run this ONCE (or on schema change) to define the structure of all Views and Tables.
  • recurring/: Schedule these scripts to run periodically (e.g., via Airflow or cron) to populate the Tables.

SCD Type 2 (Entities)

For Entities in the Silver layer, you can maintain a full history of changes.

Configuration

Adding a temporal block to an ENTITY enables the SCD2 load script generator.

DEFINE ENTITY Users primary_key: user_id temporal: {} AS ( ... )

How it works

SqlCAD generates a load script in recurring/silver/ that:

  1. Creates a temporary table with the new data.
  2. Compares the hash of non-PK columns between the new data and the current active records in the target table.
  3. Updates valid_to for changed records.
  4. Inserts new versions of changed records and completely new records.

Incremental Marts (Gold)

For Marts in the Gold layer, you can use an incremental append strategy to avoid re-calculating the entire history.

Configuration

Set STRATEGY: INCREMENTAL and provide an INCREMENT_BY column.

DEFINE MART DailyRevenue STRATEGY: INCREMENTAL INCREMENT_BY: order_date AS ( ... )

How it works

SqlCAD generates a load script in recurring/gold/ that handles the "high-water mark" logic:

  1. Schema Check: It uses CREATE TABLE IF NOT EXISTS to ensure the target table exists with the correct schema (inferred from the Mart's logic).
  2. High-Water Mark: It finds the MAX(INCREMENT_BY) currently in the target table.
  3. Filtered Insert: It executes the Mart's full flattened SQL but filters the results to only include rows where the INCREMENT_BY value is strictly greater than the current maximum.
INSERT INTO DailyRevenue SELECT * FROM ( -- Full Mart Logic here ) AS source WHERE order_date > (SELECT COALESCE(MAX(order_date), '1900-01-01') FROM DailyRevenue);

Limitations

  • Append Only: The current implementation does not support "Upsert" (Merge). It is designed for immutable event streams or time-series data.
  • Duplicate Handling: It relies on the INCREMENT_BY column being strictly increasing. If data arrives out of order, it may be missed.