Migration and Legacy Integration

SqlCAD is designed to be opinionated about data warehouse structure, but it also provides flexible entry points for onboarding legacy data warehouses (e.g., from dbt, stored procedures, or raw SQL scripts).

This document outlines strategies for migrating to SqlCAD incrementally.

1. Incremental Adoption with External Tables

Onboarding hundreds of raw tables can be daunting if you have to define every column. You can mark source tables as external to skip mandatory column definitions.

Strategy

  1. Define your raw sources as TABLE with external: true.
  2. Reference them in ENTITY or MART blocks using SELECT *.
  3. SqlCAD will allow these to pass validation without a strict schema contract.

Example

DEFINE TABLE LegacyCustomers { external: true, source_name: "raw.customers" } AS ( SELECT * FROM raw.customers ) -- You can start modeling immediately DEFINE ENTITY Customers { primary_key: CustomerID } AS ( SELECT * FROM LegacyCustomers )

2. Referencing Existing SQL Files

If you have complex stored procedures or long SQL files, you don't need to move them into .sqlcad files immediately. Use the source_file metadata property.

Example

DEFINE MART FinancialReport { source_file: "legacy/reports/monthly_accruals.sql" } AS ()

Note: The AS () body is ignored when source_file is present. SqlCAD will read the content of the referenced file during compilation.


3. Migrating from dbt

SqlCAD provides a built-in utility to scaffold a project from a dbt manifest.json.

Command

sqlcad migrate dbt --manifest path/to/dbt/manifest.json --output ./my_new_project

What happens:

  • Sources: Converted to DEFINE TABLE blocks with external: true.
  • Models: Converted to DEFINE ENTITY or DEFINE MART based on dbt tags (e.g., tags silver or entity map to ENTITY).
  • SQL: The raw SQL code from dbt is preserved inside the AS (...) block.

4. The "Strangle-and-Replace" Workflow

We recommend an incremental migration:

  1. Map the DAG: Use the migrate dbt tool or external: true tables to replicate your existing dependency graph in SqlCAD.
  2. Verify Lineage: Run sqlcad check to ensure SqlCAD understands the relationships.
  3. Harden Contracts: One by one, remove external: true and SELECT * from your most important entities. Define explicit columns and primary_key to enable SqlCAD's advanced features like:
    • SCD Type 2 Generation: Automatically generating loading scripts.
    • Auto-Flattening: Injecting PIT (Point-in-Time) joins.
    • Type Safety: Full schema validation across the graph.

5. Dependency Resolution for Legacy SQL

SqlCAD's parser is intelligent enough to distinguish between internal entities and external references. If your legacy SQL refers to schema.table_name, SqlCAD treats it as an external reference and won't confuse it with an internal ENTITY named table_name.

This allows you to keep your SQL mostly "as-is" during the initial migration phase.