For `where_clause_suffix`-enabled warehouses, we follow the amundsen-databuilder pattern, for those familiar with it. For simplicity, we provide a list of metadata tables (e.g. `information_schema`) and their aliases, as well as a few notable column names that are commonly used for filtering. If you write a statement that includes a `%`, you'll need to escape it (replace it with `%%`), because we use the Python library SQLAlchemy to execute your query, and Python reserves `%` for string formatting.

During subsequent metadata scrapes (`wh pull`, run manually or through cron), we will never delete any table stubs, to avoid removing personal documentation. If you want to remove filtered tables that were scraped before you added a filter, you'll have to delete them manually.
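The `%%` escaping rule above can be seen directly in Python; the clause text below is illustrative:

```python
# A where_clause_suffix containing a literal SQL LIKE wildcard must
# double the percent sign, because the query string is later run
# through %-style interpolation on its way to the database driver.
clause = "AND t.TBL_NAME NOT LIKE 'tmp%%'"

# What the interpolation step effectively does to the string:
rendered = clause % ()
print(rendered)  # AND t.TBL_NAME NOT LIKE 'tmp%'
```

A single unescaped `%` would instead be read as the start of a format specifier and raise an error.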
For warehouses other than BigQuery (where `project_id` is already an accepted configuration key), we supply a configuration key `database`, which enables users to pull data only from specific databases/clusters/catalogs under a connection. We call this layer `database` for simplicity, though it is often described differently across warehouse types: for Postgres and Snowflake it is typically called the "database", the ANSI SQL standard labels it the "catalog", and amundsen calls this layer the "cluster", following the typical pattern in Hive- and Presto-based setups with production-development replication.

BigQuery does not support a `where_clause_suffix` like the other warehouses. You can instead add the `included_tables_regex` key with an associated regular expression: if the expression matches, the table will be indexed. The full table reference takes the form `project_id.dataset.table_name`, so dataset-level restrictions can also be applied here.
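To illustrate `included_tables_regex`, the sketch below checks full table references of the form `project_id.dataset.table_name` against a pattern. The project and dataset names are made up, and matching from the start of the string is an assumption here:

```python
import re

# Hypothetical pattern restricting indexing to a single dataset.
included_tables_regex = r"my-project\.analytics\..*"

tables = [
    "my-project.analytics.daily_revenue",
    "my-project.scratch.tmp_join",
]

# Tables whose full reference matches the pattern get indexed.
indexed = [t for t in tables if re.match(included_tables_regex, t)]
print(indexed)  # ['my-project.analytics.daily_revenue']
```

Note that the dots in a real pattern should be escaped (`\.`), since a bare `.` matches any character.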
### Hive metastore

Tables:

* `TBLS` (alias: `t`)
* `DBS` (alias: `d`)
* `PARTITION_KEYS` (alias: `p`)
* `TABLE_PARAMS` (alias: `tp`)

Notable columns:

* `d.NAME` (table schema)
* `t.TBL_NAME` (table name)
### Postgres

Tables:

* `INFORMATION_SCHEMA.COLUMNS` (alias: `c`)
* `PG_CATALOG.PG_STATIO_ALL_TABLES` (alias: `st`)
* `PG_CATALOG.PG_DESCRIPTION` (alias: `pgtd`)

Notable columns:

* `c.table_schema`
* `c.table_name`
### Snowflake

Tables:

* `information_schema.columns` (alias: `a`)
* `information_schema.views` (alias: `b`)

Notable columns:

* `a.table_catalog`
* `a.table_schema`
* `a.table_name`
Notable fields:

* `cluster`
* `schema`
* `name` (table name)
Tables:

* `TABLES` (alias: `t`)
* `COLUMNS` (alias: `c`)

Notable columns:

* `c.TABLE_NAME`
* `c.TABLE_SCHEMA`
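The role of the suffix in the amundsen-databuilder pattern can be sketched as simple template substitution. The query below is illustrative only, not the actual extraction query, which differs per warehouse:

```python
# Illustrative metadata query template; real extractors differ per warehouse.
QUERY_TEMPLATE = """
SELECT c.TABLE_SCHEMA, c.TABLE_NAME
FROM TABLES t
JOIN COLUMNS c ON c.TABLE_NAME = t.TABLE_NAME
WHERE 1=1
{where_clause_suffix}
"""

# The configured suffix is appended to the WHERE clause verbatim.
suffix = "AND c.TABLE_SCHEMA != 'information_schema'"
query = QUERY_TEMPLATE.format(where_clause_suffix=suffix)
print(query)
```

Because the suffix is spliced in verbatim, it must reference the table aliases used by the query it lands in, which is why the alias lists above matter.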