Data Catalog

A catalog people actually use

Enterprise data catalogs exist, but nobody opens them. They sit outside the workflow, require manual curation, and go stale within weeks. Squish builds a catalog that stays accurate because it's connected to your actual data.

dim_customers | table | analytics | pii, core
fact_orders | table | analytics | finance, core
stg_payments | view | staging | pci, staging

The Alation problem

Why existing catalogs fail

Catalogs like Alation and Atlan require manual curation. Nobody wakes up thinking "time to update the catalog," so nobody does, and the catalog becomes another ignored tool.

What happens today | With Squish Catalog
Slack messages to find tables | AI-powered hybrid search across all metadata with vector embeddings
Documentation in outdated Confluence pages | Descriptions live next to the actual schema, always in sync
No idea if your data is trustworthy | Trust scores measure data quality across 7 dimensions automatically
PII scattered across unknown columns | Automatic classification and tagging for governance

What makes us different

Discovery-first cataloging

Not another tool to ignore

Part of Your Workflow

Other catalogs require a separate login and separate mental context. Squish integrates where you already work. Documentation feels like editing dbt YAML, not switching tools.

Discovery-first

Auto-Populated from Discovery

Relationships and structural metadata come from automated discovery. You add business context. The catalog stays accurate because the source of truth is your actual data.

AI writes, you review

AI-Powered Documentation

AI generates table and column descriptions using OpenAI or Anthropic. Review and refine what it writes instead of starting from scratch on every table.

Everything you need to organize your data

AI-Powered Search

Hybrid vector + full-text search with OpenAI embeddings. Find what you need even when you don't know the exact name.
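
To make "hybrid" concrete: one common way to merge a vector-similarity ranking with a full-text ranking is reciprocal rank fusion. The sketch below is an illustration of that general technique, not a description of how Squish ranks results; the function name, the constant k=60, and the sample hit lists are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of asset names into one ranking.

    Each list contributes 1 / (k + rank) per item, so an asset that ranks
    well in both the vector list and the full-text list rises to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, name in enumerate(ranking, start=1):
            scores[name] += 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for stability.
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical results from the two search backends.
vector_hits = ["dim_customers", "stg_payments", "fact_orders"]
text_hits = ["fact_orders", "dim_customers"]

merged = reciprocal_rank_fusion([vector_hits, text_hits])
print(merged)
```

Rank-based fusion avoids having to normalize cosine similarities against full-text relevance scores, which live on different scales.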

Trust Scores

7-factor quality scoring across completeness, accuracy, freshness, consistency, validity, uniqueness, and documentation.
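
As a rough illustration of how seven dimension scores could roll up into one number, here is a weighted-average sketch. The equal default weights, the 0-to-1 scale, and the function name are assumptions made for the example; the page does not say how Squish weights or combines the dimensions.

```python
DIMENSIONS = (
    "completeness",
    "accuracy",
    "freshness",
    "consistency",
    "validity",
    "uniqueness",
    "documentation",
)


def trust_score(scores, weights=None):
    """Combine per-dimension scores (each assumed to be in 0..1) into one value."""
    # Assumption: every dimension counts equally unless weights are supplied.
    weights = weights or {dim: 1.0 for dim in DIMENSIONS}
    missing = [dim for dim in DIMENSIONS if dim not in scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")
    total_weight = sum(weights[dim] for dim in DIMENSIONS)
    return sum(scores[dim] * weights[dim] for dim in DIMENSIONS) / total_weight


# Hypothetical scores for a single table.
example_scores = {
    "completeness": 0.9,
    "accuracy": 0.8,
    "freshness": 1.0,
    "consistency": 0.7,
    "validity": 0.9,
    "uniqueness": 1.0,
    "documentation": 0.4,
}
overall = trust_score(example_scores)
print(round(overall, 3))
```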

Quality Testing

5 test types (not-null, unique, range, pattern, custom SQL) with test suites and a quality dashboard.
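
Each of the five test types can be thought of as a query that counts failing rows. The sketch below shows that idea against an in-memory SQLite table; the columns of stg_payments, the thresholds, and the check names are invented for illustration and are not Squish's test syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical columns; the real stg_payments schema is not given on this page.
conn.execute("CREATE TABLE stg_payments (id INTEGER, amount REAL, currency TEXT)")
conn.executemany(
    "INSERT INTO stg_payments VALUES (?, ?, ?)",
    [(1, 10.0, "USD"), (2, -5.0, "EUR"), (2, 7.5, None)],
)

# One failing-row count per test type; 0 means the check passes.
CHECKS = {
    "not_null": "SELECT COUNT(*) FROM stg_payments WHERE currency IS NULL",
    "unique": (
        "SELECT COUNT(*) FROM "
        "(SELECT id FROM stg_payments GROUP BY id HAVING COUNT(*) > 1)"
    ),
    "range": "SELECT COUNT(*) FROM stg_payments WHERE amount < 0 OR amount > 10000",
    # GLOB is SQLite's case-sensitive pattern match; NULLs are skipped here.
    "pattern": "SELECT COUNT(*) FROM stg_payments WHERE currency NOT GLOB '[A-Z][A-Z][A-Z]'",
    "custom_sql": "SELECT COUNT(*) FROM stg_payments WHERE currency = 'EUR' AND amount < 0",
}

results = {name: conn.execute(sql).fetchone()[0] for name, sql in CHECKS.items()}
for name, failures in results.items():
    print(f"{name}: {'pass' if failures == 0 else f'{failures} failing'}")
```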

Business Glossary

Hierarchical business terms with relationships and approval workflows. Define once, reference everywhere.

Domains

Organize assets into hierarchical domains. Assign tables and columns to business areas for clear ownership.

AI Description Generation

Auto-generate column and table descriptions using OpenAI or Anthropic. Review and refine instead of writing from scratch.

Classification & PII Detection

Classify columns as PII, PCI, or internal. Automatic detection suggests classifications, you approve them.
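
The suggest-then-approve flow can be sketched with simple column-name rules. This is a deliberately naive illustration: the patterns, labels, and function name are assumptions, and real detection would likely also inspect sampled values.

```python
import re

# Hypothetical name-based rules, checked in order; the first match wins.
RULES = [
    ("PII", re.compile(r"(email|phone|ssn|first_name|last_name|birth)", re.I)),
    ("PCI", re.compile(r"(card_number|card_num|cvv|iban)", re.I)),
]


def suggest_classification(column_name):
    """Return a suggested label for a column; a human still has to approve it."""
    for label, pattern in RULES:
        if pattern.search(column_name):
            return label
    return "internal"


columns = ["customer_email", "card_number", "order_total"]
suggestions = {name: suggest_classification(name) for name in columns}

# Approval is a separate, explicit step: only reviewed columns get tagged.
reviewed = {"customer_email"}
approved = {name: label for name, label in suggestions.items() if name in reviewed}
print(suggestions)
print(approved)
```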

Comments & Collaboration

Threaded comments with voting on any table or column. Discussion lives where the data is, not in a separate tool.

Task Management

Track documentation and data quality tasks. Assign work, set deadlines, and monitor progress.

Notifications

Follow assets and get alerts via Slack, email, or webhook when changes happen.

Webhooks

HMAC-signed event notifications with retry logic. Integrate catalog events into your existing workflows.
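
On the receiving side, an HMAC-signed webhook is typically verified by recomputing the signature over the raw request body and comparing it in constant time. The sketch below assumes SHA-256 and a hex-encoded signature; the actual algorithm, encoding, and header name used by Squish are not stated on this page, and the secret and payload are placeholders.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Check a received signature without leaking timing information."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected, received_signature)


# Placeholder secret and payload for illustration only.
secret = b"example-secret"
body = b'{"event": "table.updated"}'
signature = sign(secret, body)

print(verify(secret, body, signature))         # signature matches the body
print(verify(secret, body + b" ", signature))  # body was altered
```

Verify against the exact bytes received, before any JSON parsing, since re-serializing the payload can change whitespace and invalidate the signature.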

Custom Properties

Extend metadata with organization-specific fields. Add any attribute you need without waiting for a feature request.

Your data deserves documentation that stays accurate

Start with automated discovery. Add business context where it matters. Get a catalog that reflects reality, not last quarter's best intentions.