How dbt Macros Actually Work Under the Hood
You write {{ ref('stg_users') }} a hundred times a day without thinking about it. But ref() is a macro. It is Jinja. And once you understand what Jinja actually does during compilation, a lot of dbt behavior that seemed magical starts making sense.
Jinja Is String Replacement
That is it. The entire Jinja engine in dbt is a string preprocessor. It takes your SQL file, processes the Jinja template expressions, and produces a plain SQL string. The database never sees Jinja. It only sees the compiled SQL.
People sometimes say this dismissively, like Jinja is unsophisticated. It is actually the design choice that makes dbt portable across every SQL database. There is no runtime overhead, no custom query engine. Your warehouse runs exactly the SQL that dbt compiles. Nothing more.
The compilation step happens before any SQL is executed. dbt reads all your model files, resolves all the Jinja expressions, writes the compiled SQL to the target/ directory, and then executes those compiled files against your database.
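The compile-then-execute model can be sketched in a few lines. This is a toy illustration, not dbt's implementation: the RELATIONS lookup table here stands in for dbt's parsed manifest and active target, and the real ref() also handles packages, versions, and ephemeral models.

```python
import re

# Hypothetical lookup table: in real dbt this comes from the parsed
# project manifest and the active target (dev, prod, ...).
RELATIONS = {
    "stg_users": '"analytics"."dev_luke"."stg_users"',
    "stg_orders": '"analytics"."dev_luke"."stg_orders"',
}

def compile_sql(template: str) -> str:
    """Replace each {{ ref('name') }} expression with a relation name."""
    pattern = re.compile(r"\{\{\s*ref\('([^']+)'\)\s*\}\}")
    return pattern.sub(lambda m: RELATIONS[m.group(1)], template)

model = "SELECT * FROM {{ ref('stg_users') }}"
print(compile_sql(model))
# SELECT * FROM "analytics"."dev_luke"."stg_users"
```

The database only ever sees the output string; the template never leaves the compiler.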
ref() Is a Macro
When you write {{ ref('stg_users') }}, dbt does two things. First, it registers a dependency in the DAG. Second, it replaces the expression with the fully qualified table name.
```sql
SELECT
    users.user_id,
    users.name,
    orders.total
FROM {{ ref('stg_users') }} AS users
JOIN {{ ref('stg_orders') }} AS orders
    ON users.user_id = orders.user_id
```

In development, {{ ref('stg_users') }} might compile to "analytics"."dev_luke"."stg_users". In production, the same expression compiles to "analytics"."production"."stg_users". Same code, different environments, correct table names everywhere.
The dependency registration is the part most people do not think about. Every ref() call tells dbt that model A depends on model B. dbt uses this to build the execution order, determine what needs to rebuild when something changes, and generate the DAG visualization. Without ref(), dbt is just a SQL runner.
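The dependency half of ref() amounts to building a graph and sorting it. A minimal sketch using Python's standard library; the deps map is a hypothetical stand-in for what dbt extracts from ref() calls while parsing:

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map harvested from ref() calls:
# each model maps to the set of models it refs.
deps = {
    "stg_users": set(),
    "stg_orders": set(),
    "fct_orders": {"stg_users", "stg_orders"},
}

# static_order() yields every node after all of its dependencies,
# which is exactly the execution order dbt needs.
order = list(TopologicalSorter(deps).static_order())
print(order)  # staging models first, fct_orders last
```

The same graph drives rebuild selection and the DAG visualization; execution order is just one read of it.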
How Compilation Actually Works
When you run dbt build, the compilation pipeline looks like this:
1. Parse every model and macro file in the project.
2. Render the Jinja in each model, resolving ref(), source(), and macro calls into plain SQL.
3. Write the compiled SQL to the target/compiled/ directory.
4. Execute the compiled SQL against the database in DAG order.

You can inspect the compiled output at any time:
```sql
-- target/compiled/my_project/models/marts/fct_orders.sql
SELECT
    o.order_id,
    o.customer_id,
    o.amount,
    c.customer_name
FROM "analytics"."production"."stg_orders" AS o
JOIN "analytics"."production"."dim_customers" AS c
    ON o.customer_id = c.customer_id
```

Looking at compiled SQL is the fastest way to debug macro issues. If the compiled output is wrong, the problem is in your Jinja. If the compiled output looks correct but the query fails, the problem is in your SQL or your data.
Writing Useful Macros
Custom macros are worth writing when you have SQL patterns that repeat across multiple models with slight variations. The classic example is a unit conversion:
```sql
{% macro cents_to_dollars(column_name) %}
    ({{ column_name }} / 100)::numeric(10,2)
{% endmacro %}
```

Used in a model:
```sql
SELECT
    order_id,
    {{ cents_to_dollars('amount_cents') }} AS amount_dollars,
    {{ cents_to_dollars('tax_cents') }} AS tax_dollars,
    {{ cents_to_dollars('shipping_cents') }} AS shipping_dollars
FROM {{ ref('stg_orders') }}
```

This compiles to straightforward SQL:
```sql
SELECT
    order_id,
    (amount_cents / 100)::numeric(10,2) AS amount_dollars,
    (tax_cents / 100)::numeric(10,2) AS tax_dollars,
    (shipping_cents / 100)::numeric(10,2) AS shipping_dollars
FROM "analytics"."production"."stg_orders"
```

The test for a good macro is simple: can you predict the compiled SQL without running it? If yes, keep the macro. If you have to run dbt compile every time to check whether it did what you expected, the macro is too clever.
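One way to internalize the predictability test: a good macro behaves like a pure string-to-string function. A sketch of cents_to_dollars as plain Python (illustrative only; dbt actually renders it through Jinja):

```python
def cents_to_dollars(column_name: str) -> str:
    # A macro is a string -> string function. The return value below is
    # exactly the text that lands in the compiled SQL.
    return f"({column_name} / 100)::numeric(10,2)"

print(cents_to_dollars("amount_cents"))
# (amount_cents / 100)::numeric(10,2)
```

If you can write this function in your head from the call site, the macro passes the test.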
When Macros Make Things Worse
Macros have a cost: they add a layer of indirection. When someone reads your model, they have to mentally resolve the macro to understand the SQL. This is fine for simple, well-named macros like cents_to_dollars. It is terrible for complex macros that generate entire queries.
The worst macro pattern is the "dynamic model generator" that takes a list of columns and generates an entire SELECT statement. These macros are clever and they make debugging miserable. When the generated query returns wrong results, you have to reason about the macro logic and the SQL logic simultaneously.
Rules of thumb:

- Keep macros to expressions and small fragments, not whole queries.
- Give macros names that say what they compute, like cents_to_dollars.
- If you cannot predict the compiled output from the call site, the macro is too clever.
The Macro Resolution Order
dbt resolves macros in a specific order that matters when you override built-in behavior:
dbt searches for a macro definition in this order:

1. Your project (the macros/ directory)
2. Installed packages (declared in packages.yml)
3. dbt's built-in global macros

This means you can override any built-in macro by defining one with the same name in your project. The most common use case is overriding generate_schema_name to customize how dbt names schemas in different environments.
```sql
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- if target.name == 'production' -%}
        {{ custom_schema_name | trim if custom_schema_name else target.schema }}
    {%- elif custom_schema_name -%}
        {{ target.schema }}_{{ custom_schema_name | trim }}
    {%- else -%}
        {{ target.schema }}
    {%- endif -%}
{%- endmacro %}
```

Be careful with macro overrides. They are powerful and they are invisible. Someone reading a model has no indication that generate_schema_name has been overridden. Document these overrides prominently.
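To sanity-check an override like this, it helps to model the branching outside Jinja. A hypothetical Python mirror of the logic (the function name and signature are illustrative, not dbt's internal API):

```python
def generate_schema_name(custom_schema_name, target_name, target_schema):
    """Mirror of the Jinja override: production uses the custom schema
    as-is; every other target gets a target-schema prefix."""
    if target_name == "production":
        return custom_schema_name.strip() if custom_schema_name else target_schema
    if custom_schema_name:
        return f"{target_schema}_{custom_schema_name.strip()}"
    return target_schema

# A model configured with schema='marts' lands in "marts" in production
# and in "dev_luke_marts" for a dev target named dev_luke.
print(generate_schema_name("marts", "production", "analytics"))
print(generate_schema_name("marts", "dev", "dev_luke"))
```

Writing the branches out like this makes it easy to table the behavior per target before committing the override.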
What Squish Does With This
This is where Squish connects to the macro story. Squish exports discovered relationships as dbt YAML, including relationship test definitions that reference models and columns by name. Because dbt resolves those names through ref() at compile time, the same export works in dev and prod without changes.
The practical workflow: run Squish against your database, export the dbt YAML, merge it into your schema.yml, and run dbt test. Any relationship that Squish discovered but your project did not define will now be tested on every build.
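For concreteness, a fragment like the following is the kind of thing such an export might contain. The relationships test syntax is standard dbt; the specific model and column names are illustrative, and the exact shape of Squish's output is an assumption here:

```yaml
# schema.yml fragment -- hypothetical export for a discovered relationship.
version: 2
models:
  - name: fct_orders
    columns:
      - name: customer_id
        tests:
          - relationships:
              to: ref('dim_customers')
              field: customer_id
```

Because the target model is referenced through ref(), this fragment resolves correctly in any environment, which is what makes the export portable between dev and prod.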