Migration and Legacy Integration
SqlCAD is designed to be opinionated about data warehouse structure, but it also provides flexible entry points for onboarding legacy data warehouses (e.g., from dbt, stored procedures, or raw SQL scripts).
This document outlines strategies for migrating to SqlCAD incrementally.
1. Incremental Adoption with External Tables
Onboarding hundreds of raw tables can be daunting if you have to define every column. You can mark source tables as external to skip mandatory column definitions.
Strategy
- Define your raw sources as
TABLEwithexternal: true. - Reference them in
ENTITYorMARTblocks usingSELECT *. - SqlCAD will allow these to pass validation without a strict schema contract.
Example
DEFINE TABLE LegacyCustomers { external: true, source_name: "raw.customers" } AS ( SELECT * FROM raw.customers ) -- You can start modeling immediately DEFINE ENTITY Customers { primary_key: CustomerID } AS ( SELECT * FROM LegacyCustomers )
2. Referencing Existing SQL Files
If you have complex stored procedures or long SQL files, you don't need to move them into .sqlcad files immediately. Use the source_file metadata property.
Example
DEFINE MART FinancialReport { source_file: "legacy/reports/monthly_accruals.sql" } AS ()
Note: The AS () body is ignored when source_file is present. SqlCAD will read the content of the referenced file during compilation.
3. Migrating from dbt
SqlCAD provides a built-in utility to scaffold a project from a dbt manifest.json.
Command
sqlcad migrate dbt --manifest path/to/dbt/manifest.json --output ./my_new_project
What happens:
- Sources: Converted to
DEFINE TABLEblocks withexternal: true. - Models: Converted to
DEFINE ENTITYorDEFINE MARTbased on dbt tags (e.g., tagssilverorentitymap toENTITY). - SQL: The raw SQL code from dbt is preserved inside the
AS (...)block.
4. The "Strangle-and-Replace" Workflow
We recommend an incremental migration:
- Map the DAG: Use the
migrate dbttool orexternal: truetables to replicate your existing dependency graph in SqlCAD. - Verify Lineage: Run
sqlcad checkto ensure SqlCAD understands the relationships. - Harden Contracts: One by one, remove
external: trueandSELECT *from your most important entities. Define explicitcolumnsandprimary_keyto enable SqlCAD's advanced features like:- SCD Type 2 Generation: Automatically generating loading scripts.
- Auto-Flattening: Injecting PIT (Point-in-Time) joins.
- Type Safety: Full schema validation across the graph.
5. Dependency Resolution for Legacy SQL
SqlCAD's parser is intelligent enough to distinguish between internal entities and external references. If your legacy SQL refers to schema.table_name, SqlCAD treats it as an external reference and won't confuse it with an internal ENTITY named table_name.
This allows you to keep your SQL mostly "as-is" during the initial migration phase.