BigQuery and ClickHouse: equivalent concepts

The tables below map each BigQuery concept to its ClickHouse equivalent. For the end-to-end migration walkthrough, see Migrating from BigQuery to ClickHouse Cloud.

Resource hierarchy

BigQuery	ClickHouse	Notes
Organization	Organization	Root node of the hierarchy in both.
Project	Service (region-scoped); warehouse for grouping services with shared storage and independent compute	A service is the data-container analogue; use a warehouse to group services that share storage but scale compute independently. Billing, quotas, and tier are set at the ClickHouse organization level, not per service.
Dataset	Database	Same role — logical container for tables and grant scope.
IAM permissions	Console roles plus SQL grants	Two-layer access: roles in `console.clickhouse.cloud` plus SQL grants in the database. Console users can also be granted DB roles for SQL Console use.

Compute and capacity

BigQuery	ClickHouse	Notes
Slot	Replica (whole node); queries can parallelize across replicas	A replica is a static compute node in a ClickHouse service. See the callout below.
Slot reservation	Vertical and horizontal autoscaling bounds; warehouses for workload isolation	Where BigQuery reservations guarantee slot capacity, ClickHouse sizes via min/max autoscaling bounds. Setting min = max effectively fixes the size.
Quotas	Workload classes plus per-query limits	Workload classes cover memory, CPU, concurrency, and I/O scheduling at runtime. Spend caps have no direct primitive; there's no setting that suspends a service on hitting a cost threshold.

Slot vs replica

Unlike a BigQuery slot, a ClickHouse replica is a static compute node that serves many concurrent queries; it isn't allocated to a query from a pool. Replicas are also much coarser-grained: a slot is closer to a unit of CPU than to a whole node.

Billing and pricing model

ClickHouse Cloud bills compute as replica-hours (sized by the autoscaling bounds) rather than per-TB scanned or flat-rate slot commitments, storage as compressed bytes, and backups as a separate line item. There's no equivalent to BigQuery's logical-vs-physical storage choice. Both platforms charge for public internet egress and cross-region data transfer, and both offer committed-spend discounts. Managed ingestion (ClickPipes) is metered separately; other features (materialized view refresh, secondary index maintenance, dictionary loads) run on the service's compute. See ClickHouse Cloud pricing for current rates, tiers, and commitment options.

Storage and tables

In ClickHouse, a table's behavior is set at creation time: the engine (MergeTree family) determines merge and storage semantics, and ORDER BY / PARTITION BY / TTL clauses configure physical layout and retention. Many BigQuery per-feature settings map to a clause in the ClickHouse CREATE TABLE statement. Physical schema design also differs between platforms; see the migration guide for design tradeoffs.

BigQuery	ClickHouse	Notes
Table	MergeTree-family table	Engine choice determines storage and merge behavior; pick by access pattern (`MergeTree` for append-mostly facts, `ReplacingMergeTree` for upserts, `AggregatingMergeTree` for pre-aggregations).
Column schema modes (`NULLABLE`, `REQUIRED`, `REPEATED`)	`Nullable(T)` for optional; omit for required; `Array(T)` for repeated; `Array(Tuple(...))` or `Nested` for repeated records	In ClickHouse, columns are non-nullable unless wrapped with `Nullable(T)`. Nullability has a small storage and query cost, so use it only when the column needs nulls.
Schema evolution (add / drop / modify columns)	`ALTER TABLE ... ADD / DROP / MODIFY COLUMN`	Same DDL surface as BigQuery. Many column changes are metadata-only.
Partitioning	`PARTITION BY` clause on the table	Where BigQuery limits partitioning to time-unit, integer-range, or ingestion-time columns, ClickHouse takes an arbitrary expression. Use it for retention (drop a partition) and pruning.
Clustering	`ORDER BY` columns in the table definition	Where BigQuery clustering is best-effort and reorganized in the background, ClickHouse's `ORDER BY` is enforced at insert time and drives the sparse primary index.
External tables / BigLake	`s3` / `gcs` / `azureBlobStorage` table functions for direct file access; Iceberg engine for open catalogs	Object storage and open-table formats are read directly through these functions and engines. ClickHouse does not provide a unified-governance layer over external storage.
Object tables (SQL access to unstructured files)	`s3` / `gcs` table functions over binary formats	Unstructured objects (images, PDFs, audio) are queried directly through these table functions, not via a dedicated table type.
In-SQL ML inference on objects (`ML.GENERATE_TEXT`, `ML.GENERATE_EMBEDDING`)	—	No in-SQL LLM-on-objects equivalent; call hosted models from the application layer.
Apache Iceberg	Iceberg engine	Reads Iceberg tables stored in S3, Azure, HDFS, or local storage. Writes are not supported. See the engine page for the current list of supported features.
Default table / partition / dataset expiration	`TTL` clause on the table, column, or partition	`TTL` can be set at table creation or via `ALTER TABLE ... MODIFY TTL`, and applies at the table, column, or partition level rather than the dataset default.
Table snapshot	Service-level backup	Granularity differs significantly; see the callout below.
Time travel	Point-in-time backup restore into a new service	No inline historical query; restore is the only way to reach prior state.
Authorized views	View with `SQL SECURITY DEFINER` (runs with the view-owner's privileges)	See CREATE VIEW for the syntax and the `INVOKER` / `DEFINER` / `NONE` modes.
Row-level security	Row policy — a `WHERE`-style expression evaluated per user	Same role; the policy expression is attached per user (or role) rather than as a BigQuery RLS row access policy resource.
Wildcard tables (`_TABLE_SUFFIX`)	`Merge` table engine (persistent grouping) or `merge()` function (inline)	Same idea, different syntax. `Merge` is a persistent table-of-tables; `merge()` is inline without creating one.
Table clone	`CREATE TABLE ... AS SELECT` copy, or backup restore into a new service	ClickHouse has no copy-on-write primitive; every copy reads the source data fully.

Backups

ClickHouse Cloud backups are per-service. Restoring a backup creates a new service; a single table cannot be restored back into the original service.

Updates and deletes

ClickHouse is append-optimized. There's no SQL MERGE statement (unrelated to the Merge and MergeTree engines), and ALTER TABLE … UPDATE / DELETE run as background mutations rather than transactional row writes. BigQuery DML patterns (MERGE, UPDATE, DELETE, dbt incremental updates) typically port to engine choice in ClickHouse: ReplacingMergeTree keeps the latest row by sort key, CollapsingMergeTree marks deletes inline, and AggregatingMergeTree maintains aggregated state. Engine choice is set at table creation and is non-trivial to change later.

Query model and performance

Query acceleration in ClickHouse comes from three layers: primary-key ordering (a sparse index over the on-disk sort order), secondary indexes on non-key columns, and materialized views.

BigQuery	ClickHouse	Notes
Primary key (advisory)	Primary key — drives the on-disk sort order and the sparse primary index	Where BigQuery's PK is advisory only, ClickHouse's PK is load-bearing — it determines physical layout and is used to prune granules, avoid re-sorts, and short-circuit `LIMIT`. Neither system enforces uniqueness.
Foreign key (advisory)	Wide tables or dictionaries for lookups	ClickHouse doesn't accept foreign-key declarations even as advisory hints.
Search index	Full-text index	Token index over string columns. Operator surface differs: BigQuery's `SEARCH()` maps to the `hasToken` / `hasAllTokens` / `hasAnyTokens` / `hasPhrase` family in ClickHouse; `LIKE` and `match` can also use the index when the pattern is tokenizable (see the index doc for restrictions).
Vector index	`Array(Float32)` plus a vector ANN index	ClickHouse has no dedicated vector type. Embeddings are stored as `Array(Float32)` and accelerated with an ANN index for approximate nearest-neighbor lookups.
Materialized view	Incremental MV — updates on each insert into a base table	Cost is paid at insert time. BigQuery's auto-refreshed MVs map to this incremental model, not to refreshable MVs.
Scheduled query	Refreshable MV for query-driven scheduled work; external orchestrator (dbt, Airflow) for procedural pipelines	Refreshable MVs run on a cron-style schedule and maintain a result table.
Streaming inserts (`tabledata.insertAll`, Storage Write API)	Asynchronous inserts for server-side batching of small row-by-row writes; `Buffer` engine for in-memory accumulation; ClickPipes for managed streaming pipelines	The raw API surface is `INSERT` over HTTP or the native protocol, but for the row-by-row semantics of BigQuery streaming inserts, async inserts are the closer fit, batching on the server. For deduplication, ClickHouse exposes block-level dedup on Replicated tables (`insert_deduplicate`, default-on, recent-blocks window) and user-supplied per-insert dedup via `insert_deduplication_token`; for row-level dedup, use `ReplacingMergeTree`. None of these reproduces the Storage Write API's full exactly-once-by-stream-offset model.
Continuous queries	Streaming table engine (Kafka, NATS, RabbitMQ) feeding a materialized view that writes to a destination table; ClickPipes for managed streaming from Pub/Sub or Kinesis; an incremental MV on a regular table for inserts already landing in ClickHouse	"Always-on transform" maps to engine + MV rather than a single declarative resource. Pick by source: streaming queue → table engine or ClickPipes; ongoing inserts into an existing table → MV.
Dry run	`EXPLAIN ESTIMATE` — reports rows, parts, and marks the query would read; other `EXPLAIN` variants (`PLAN`, `PIPELINE`, `SYNTAX`) cover deeper plan inspection	Covers the plan-inspection role of BigQuery's dry run, not the cost-estimation role. ClickHouse billing isn't per-query, so there's no "bytes that will be billed" answer to return.
Federated queries (Spanner, Cloud SQL, AlloyDB, Bigtable)	External OLTP attached via database engine (PostgreSQL, MySQL, MongoDB, SQLite)	Distinct from external tables in object storage; these attach a live source so its tables are queryable directly. Bigtable has no ClickHouse database-engine equivalent.
Cached results	Query cache	ClickHouse's query cache lives in each replica's memory and is per-user by default; identical queries to different replicas don't share results. Not transactionally consistent.
Sessions / multi-statement queries	Per-statement execution; multi-step state managed in the client or an orchestrator	ClickHouse has no per-session variables or shared state.

Security and governance

Authorized views and row-level security are listed under Storage and tables.

BigQuery	ClickHouse	Notes
Policy tags / column-level access control	Column-level grants on specific columns of a table	Grants apply at the column level. BigQuery's centralized taxonomy/policy-tag governance has no direct equivalent.
Data masking	Views, row policies, or function-based transforms; see data masking patterns	No column-mask primitive; patterns are SQL-level.
Customer-managed encryption keys (CMEK)	CMEK on the service	Available on AWS (KMS) and GCP (Cloud KMS), with rotation and revocation.
AEAD / SQL-level encryption functions	Encryption functions (`encrypt` / `decrypt`)	Covers AES-128/256-CBC/GCM and AEAD modes.
Differential privacy	—	Not a managed feature. Applying noise in a UDF doesn't reproduce BigQuery's epsilon/delta-controlled privacy guarantees; for true DP, use an external library.
VPC Service Controls	Private connectivity — Private Service Connect (GCP), PrivateLink (AWS, Azure), and IP allowlists for ingress restriction	Restricts ingress; doesn't replicate VPC SC's data-exfiltration boundary.

BigQuery	ClickHouse	Notes
Analytics Hub / data exchanges / listings	Read access to a shared database, or a dedicated service with consumer-specific row policies	ClickHouse has no in-product data marketplace; sharing uses standard access primitives.
Data clean rooms	Row policies and authorized views	No managed clean-room product.

Operations and ecosystem

ClickHouse surfaces operational state through system.* tables (queries, sessions, replication, parts, metrics) and the cloud console; managed ingestion is handled by ClickPipes; ML, BI, and notebook workflows are typically handled in external systems that read from ClickHouse.

BigQuery	ClickHouse	Notes
BigQuery ML	External training and serving (notebooks, Spark, Vertex AI, feature stores) reading from ClickHouse; see AI/ML in Cloud for managed-side features	ClickHouse has no in-database ML. Use it as the analytical store and run training elsewhere.
BI Engine	Direct querying — no separate acceleration tier to provision	Sub-second BI latency comes from the storage engine itself; there's no in-memory cache layer to size or pay for separately.
OMNI / cross-cloud federated query	—	ClickHouse doesn't query in place across clouds. The closest pattern is one service per supported region with data staged into the target service before being queried.
Data sources / file formats	File-format and connector library	Managed connectors (ClickPipes) for sources like Kafka, Pub/Sub, MySQL, Postgres, and object storage; SQL table functions for ad-hoc reads of files in object storage.
Query jobs (ID, history, cancel)	`system.query_log` and `system.processes` for inspection; `KILL QUERY` to cancel	Same information, exposed through system tables instead of a job API.
`INFORMATION_SCHEMA`	Native `system.*` tables for ClickHouse-specific detail, or the ANSI `information_schema` views for tool compatibility	Both surfaces available.
Data Transfer Service	ClickPipes — scheduled and streaming ingestion from SaaS, storage, and OLTP sources	ClickPipes is ClickHouse Cloud's managed connector platform; coverage spans streaming systems, OLTP sources, and object storage.
Audit logs (admin activity, data access, system events)	Cloud audit log for org and service admin events; `system.query_log` for data-access activity within the service	BigQuery's three audit streams collapse to two on ClickHouse Cloud: admin activity to the cloud audit log, query/data access to `system.query_log`. System-event telemetry is exposed through other `system.*` tables rather than a dedicated audit stream.
Change data capture ingestion	ClickPipes for Postgres, MySQL, or Kafka	Managed CDC from OLTP sources.
BigQuery Studio notebooks / BigQuery DataFrames	Jupyter with `clickhouse-connect` or another client library	No in-product notebook environment or pandas-compatible in-DB API; notebook-side libraries cover the same workflow.
Data Canvas	—	No drag-and-drop NL canvas. SQL Console covers ad-hoc query authoring; visual data prep happens in an external orchestrator.
Gemini in BigQuery (SQL generation, code completion)	ClickHouse Agents in the cloud console; Ask-AI in the docs	Agents (currently in beta) are conversational: natural-language queries against your data with tool calls and chat workflows. In-SQL LLM functions analogous to `BQ.ML.GENERATE_TEXT` are handled outside SQL today; check the Agents page for the current capability surface.
Knowledge Catalog / data lineage / data quality	`system.*` tables for metadata; external tools (dbt, DataHub) for lineage and quality	ClickHouse exposes metadata via system tables rather than a managed catalog product.
Cross-region replication / managed disaster recovery	Multi-AZ HA within a region (automatic); cross-region resiliency via external backups exported to a customer-owned bucket; see Data resiliency	ClickHouse Cloud has no automatic cross-region replication, failover, or active-active sync today. Cross-region DR is backup-based.

Next steps

For the end-to-end migration walkthrough, see Migrating from BigQuery to ClickHouse Cloud.
For the conceptual side-by-side, see Comparing ClickHouse Cloud and BigQuery.
For loading data, see Loading data from BigQuery.

Resource hierarchy​

Compute and capacity​

Billing and pricing model​

Storage and tables​

Query model and performance​

Security and governance​

Data sharing​

Operations and ecosystem​

Next steps​