The dbt Relationships Test is Not Enough

If you use dbt, you probably have some relationships tests in your YAML. They look like this:

models:

- name: orders

columns:

- name: user_id

tests:

- relationships:

to: ref('users')

field: id

This test verifies that every user_id in the orders table exists in the users table. It catches orphaned references, which is genuinely useful. The problem is what it does not catch.

What the Test Actually Does

The dbt relationships test runs a query that looks roughly like:

SELECT COUNT(*) FROM orders WHERE user_id NOT IN (SELECT id FROM users)

If the count is zero, the test passes. If any user_id values in orders do not have a matching id in users, the test fails. Simple, reliable, useful.

What It Misses

Relationships You Did Not Define

The test only runs on relationships you explicitly configure. If your orders table has a promo_code column that references a promotions table, but nobody added that relationship test, it does not get checked. The test cannot find relationships it does not know about.

This is the fundamental limitation. You are testing the relationships you already documented, which are the ones you were least likely to get wrong. The dangerous relationships are the undocumented ones.

Relationships Across Sources

dbt relationships tests work within your dbt project. If your staging models reference source tables that live outside your dbt project, or if relationships span multiple dbt projects, testing gets complicated. You can test against sources, but in practice teams rarely achieve complete coverage.

Cardinality Issues

The relationships test confirms referential integrity (every child value exists in the parent). It does not check cardinality. If your data model assumes a one-to-one relationship between users and profiles, but the profiles table actually has duplicates, the relationships test passes while your JOINs silently produce wrong results.

Stale Definitions

As schemas evolve, new columns get added that establish new relationships. The dbt YAML does not update itself. Unless someone manually adds a new relationships test every time a column is added, coverage degrades over time.

Closing the Gap

Automated Discovery

Run a relationship discovery tool against your database periodically. Compare discovered relationships against your dbt YAML. Any relationship that exists in the database but is not tested in dbt is a gap.

Squish does this automatically. Connect your database, run discovery, and export the results. Diff them against your existing dbt YAML to find untested relationships.

Cardinality Tests

Supplement relationships tests with cardinality assertions. dbt-utils and dbt-expectations both provide tests for uniqueness and accepted values that can catch cardinality violations.

For critical one-to-one relationships, add an explicit uniqueness test on the child column:

models:

- name: user_profiles

columns:

- name: user_id

tests:

- unique

- relationships:

to: ref('users')

field: id

The relationships test confirms every user_id exists in users. The unique test confirms there is only one profile per user. Together they enforce a true one-to-one relationship.

Coverage Tracking

Track what percentage of your columns with _id or _key suffixes have corresponding relationships tests. This is a rough but useful proxy for relationship test coverage. If you have 80 columns ending in _id and only 30 relationships tests, you have gaps.

Source Freshness as a Proxy

If a source table changes schema (new columns added, columns renamed), that is a signal that your relationship definitions may need updating. Pair source freshness checks with periodic schema diffing to catch drift.

A Practical Workflow

Run Squish against your database to discover all relationships

Export discovered relationships as dbt YAML

Diff against your existing schema.yml

Add missing relationships tests

Run dbt test to validate

Schedule periodic re-discovery to catch drift

The dbt relationships test is useful but not complete. Pair it with automated discovery for more comprehensive relationship validation.