User Guide Getting Started Help Center Documentation Community Training
New LookML
Old LookML
New LookML
Looker
  
English
Français
Deutsch
日本語
cluster_keys

Go to View Parameter List

Usage

view: view_name {
  derived_table: {
    cluster_keys: ["customer_city", "customer_state"]
    …
  }
}

Hierarchy

cluster_keys

Default Value

None

Accepts

One or more clustered column names

Special Rules

cluster_keys is supported only on Google BigQuery

Definition

Google BigQuery has the ability to cluster partitioned tables. Clustering sorts the data in a partition based on the values in the clustered columns and organizes the clustered columns in optimally sized storage blocks. Clustering can improve the performance and reduce the cost of queries that filter on or aggregate by the clustered columns.

To add a clustered column to a persistent derived table (PDT), use the cluster_keys parameter and supply the names of the columns you want clustered in the database table.

Examples

Create a customer_order_facts native derived table on a Google BigQuery database, partitioned on the date column and clustered on the city, age_tier, and gender columns to optimize queries that are filtered or aggregated on those columns:

view: customer_order_facts { derived_table: { explore_source: order { column: customer_id { field: order.customer_id } column: date { field: order.order_time } column: city { field: users.city} column: age_tier { field: users.age_tier } column: gender { field: users.gender } derived_column: num_orders { sql: COUNT(order.customer_id) ;; } } partition_keys: [ "date" ] cluster_keys: [ "city", "age_tier", "gender" ] datagroup_trigger: daily_datagroup } }

Common Challenges

cluster_keys Only Works with Derived Tables That Are Persisted and Partitioned

Derived tables can be calculated at query time, or they can be made persistent using datagroup_trigger, persist_for, or sql_trigger_value. The cluster_keys parameter works only with persistent derived tables.

In addition, Google BigQuery supports clustering on partitioned tables only. The cluster_keys parameter only works with PDTs that are also partitioned using the partition_keys parameter.

Google BigQuery Tables Can Partition Only on Date Fields

Google BigQuery tables can only be partitioned on a date or timestamp column. If you want to add clustered columns to a PDT that does not include date or time-based data, one way to do that is to add a date column using a SQL statement such as SELECT CURRENT_DATETIME() as now, and then use partition_keys to partition on the new column. You can then use clustering on other columns in your PDT.

Top