How dbt Macros Actually Work Under the Hood
You write {{ ref('stg_users') }} a hundred times a day without thinking about it. But ref() is a macro. It is Jinja. And once you understand what Jinja actually does during compilation, a lot of dbt behavior that seemed magical starts making sense.
Jinja Is String Replacement
That is it. The entire Jinja engine in dbt is a string preprocessor. It takes your SQL file, processes the Jinja template expressions, and produces a plain SQL string. The database never sees Jinja. It only sees the compiled SQL.
People sometimes say this dismissively, like Jinja is unsophisticated. It is actually the design choice that makes dbt portable across every SQL database. There is no runtime overhead, no custom query engine. Your warehouse runs exactly the SQL that dbt compiles. Nothing more.
The compilation step happens before any SQL is executed. dbt reads all your model files, resolves all the Jinja expressions, writes the compiled SQL to the target/ directory, and then executes those compiled files against your database.
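The compile-then-execute model can be sketched in a few lines. This is a toy illustration, not dbt's implementation: the RELATIONS lookup table here stands in for dbt's parsed manifest and active target, and the real ref() also handles packages, versions, and ephemeral models.

```python
import re

# Hypothetical lookup table: in real dbt this comes from the parsed
# project manifest and the active target (dev, prod, ...).
RELATIONS = {
    "stg_users": '"analytics"."dev_luke"."stg_users"',
    "stg_orders": '"analytics"."dev_luke"."stg_orders"',
}

def compile_sql(template: str) -> str:
    """Replace each {{ ref('name') }} expression with a relation name."""
    pattern = re.compile(r"\{\{\s*ref\('([^']+)'\)\s*\}\}")
    return pattern.sub(lambda m: RELATIONS[m.group(1)], template)

model = "SELECT * FROM {{ ref('stg_users') }}"
print(compile_sql(model))
# SELECT * FROM "analytics"."dev_luke"."stg_users"
```

The database only ever sees the output string; the template never leaves the compiler.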
ref() Is a Macro
When you write {{ ref('stg_users') }}, dbt does two things. First, it registers a dependency in the DAG. Second, it replaces the expression with the fully qualified table name.
```sql
SELECT
    users.user_id,
    users.name,
    orders.total
FROM {{ ref('stg_users') }} AS users
JOIN {{ ref('stg_orders') }} AS orders
    ON users.user_id = orders.user_id
```

In development, {{ ref('stg_users') }} might compile to "analytics"."dev_luke"."stg_users". In production, the same expression compiles to "analytics"."production"."stg_users". Same code, different environments, correct table names everywhere.
The dependency registration is the part most people do not think about. Every ref() call tells dbt that model A depends on model B. dbt uses this to build the execution order, determine what needs to rebuild when something changes, and generate the DAG visualization. Without ref(), dbt is just a SQL runner.
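The dependency half of ref() amounts to building a graph and sorting it. A minimal sketch using Python's standard library; the deps map is a hypothetical stand-in for what dbt extracts from ref() calls while parsing:

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map harvested from ref() calls:
# each model maps to the set of models it refs.
deps = {
    "stg_users": set(),
    "stg_orders": set(),
    "fct_orders": {"stg_users", "stg_orders"},
}

# static_order() yields every node after all of its dependencies,
# which is exactly the execution order dbt needs.
order = list(TopologicalSorter(deps).static_order())
print(order)  # staging models first, fct_orders last
```

The same graph drives rebuild selection and the DAG visualization; execution order is just one read of it.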
How Compilation Actually Works
When you run dbt build, the compilation pipeline looks like this:
1. Parse every model and macro file in the project.
2. Render the Jinja in each model, resolving ref(), source(), and macro calls into plain SQL.
3. Write the compiled SQL to the target/compiled/ directory.
4. Execute the compiled SQL against the database in DAG order.

You can inspect the compiled output at any time:
```sql
-- target/compiled/my_project/models/marts/fct_orders.sql
SELECT
    o.order_id,
    o.customer_id,
    o.amount,
    c.customer_name
FROM "analytics"."production"."stg_orders" AS o
JOIN "analytics"."production"."dim_customers" AS c
    ON o.customer_id = c.customer_id
```

Looking at compiled SQL is the fastest way to debug macro issues. If the compiled output is wrong, the problem is in your Jinja. If the compiled output looks correct but the query fails, the problem is in your SQL or your data.
Writing Useful Macros
Custom macros are worth writing when you have SQL patterns that repeat across multiple models with slight variations. The classic example is a unit conversion:
```sql
{% macro cents_to_dollars(column_name) %}
    ({{ column_name }} / 100)::numeric(10,2)
{% endmacro %}
```

Used in a model:
```sql
SELECT
    order_id,
    {{ cents_to_dollars('amount_cents') }} AS amount_dollars,
    {{ cents_to_dollars('tax_cents') }} AS tax_dollars,
    {{ cents_to_dollars('shipping_cents') }} AS shipping_dollars
FROM {{ ref('stg_orders') }}
```

This compiles to straightforward SQL:
```sql
SELECT
    order_id,
    (amount_cents / 100)::numeric(10,2) AS amount_dollars,
    (tax_cents / 100)::numeric(10,2) AS tax_dollars,
    (shipping_cents / 100)::numeric(10,2) AS shipping_dollars
FROM "analytics"."production"."stg_orders"
```

The test for a good macro is simple: can you predict the compiled SQL without running it? If yes, keep the macro. If you have to run dbt compile every time to check whether it did what you expected, the macro is too clever.
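One way to internalize the predictability test: a good macro behaves like a pure string-to-string function. A sketch of cents_to_dollars as plain Python (illustrative only; dbt actually renders it through Jinja):

```python
def cents_to_dollars(column_name: str) -> str:
    # A macro is a string -> string function. The return value below is
    # exactly the text that lands in the compiled SQL.
    return f"({column_name} / 100)::numeric(10,2)"

print(cents_to_dollars("amount_cents"))
# (amount_cents / 100)::numeric(10,2)
```

If you can write this function in your head from the call site, the macro passes the test.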
When Macros Make Things Worse
Macros have a cost: they add a layer of indirection. When someone reads your model, they have to mentally resolve the macro to understand the SQL. This is fine for simple, well-named macros like cents_to_dollars. It is terrible for complex macros that generate entire queries.
The worst macro pattern is the "dynamic model generator" that takes a list of columns and generates an entire SELECT statement. These macros are clever and they make debugging miserable. When the generated query returns wrong results, you have to reason about the macro logic and the SQL logic simultaneously.
Rules of thumb:

- Keep macros to expressions and small fragments, not whole queries.
- Give macros names that say what they compute, like cents_to_dollars.
- If you cannot predict the compiled output from the call site, the macro is too clever.
The Macro Resolution Order
dbt resolves macros in a specific order that matters when you override built-in behavior:
dbt searches for a macro definition in this order:

1. Your project (the macros/ directory)
2. Installed packages (declared in packages.yml)
3. dbt's built-in global macros

This means you can override any built-in macro by defining one with the same name in your project. The most common use case is overriding generate_schema_name to customize how dbt names schemas in different environments.
```sql
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- if target.name == 'production' -%}
        {{ custom_schema_name | trim if custom_schema_name else target.schema }}
    {%- elif custom_schema_name -%}
        {{ target.schema }}_{{ custom_schema_name | trim }}
    {%- else -%}
        {{ target.schema }}
    {%- endif -%}
{%- endmacro %}
```

Be careful with macro overrides. They are powerful and they are invisible. Someone reading a model has no indication that generate_schema_name has been overridden. Document these overrides prominently.
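To sanity-check an override like this, it helps to model the branching outside Jinja. A hypothetical Python mirror of the logic (the function name and signature are illustrative, not dbt's internal API):

```python
def generate_schema_name(custom_schema_name, target_name, target_schema):
    """Mirror of the Jinja override: production uses the custom schema
    as-is; every other target gets a target-schema prefix."""
    if target_name == "production":
        return custom_schema_name.strip() if custom_schema_name else target_schema
    if custom_schema_name:
        return f"{target_schema}_{custom_schema_name.strip()}"
    return target_schema

# A model configured with schema='marts' lands in "marts" in production
# and in "dev_luke_marts" for a dev target named dev_luke.
print(generate_schema_name("marts", "production", "analytics"))
print(generate_schema_name("marts", "dev", "dev_luke"))
```

Writing the branches out like this makes it easy to table the behavior per target before committing the override.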
What Squish Does With This
This is where Squish connects to the macro story. Squish exports discovered relationships as dbt YAML, including relationship test definitions that reference models and columns by name. Because dbt resolves those names through ref() at compile time, the same export works in dev and prod without changes.
The practical workflow: run Squish against your database, export the dbt YAML, merge it into your schema.yml, and run dbt test. Any relationship that Squish discovered but your project did not define will now be tested on every build.
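For concreteness, a fragment like the following is the kind of thing such an export might contain. The relationships test syntax is standard dbt; the specific model and column names are illustrative, and the exact shape of Squish's output is an assumption here:

```yaml
# schema.yml fragment -- hypothetical export for a discovered relationship.
version: 2
models:
  - name: fct_orders
    columns:
      - name: customer_id
        tests:
          - relationships:
              to: ref('dim_customers')
              field: customer_id
```

Because the target model is referenced through ref(), this fragment resolves correctly in any environment, which is what makes the export portable between dev and prod.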