home User Guide Getting Started Help Center Documentation Community Training Certification
Looker keyboard_arrow_down
language keyboard_arrow_down
Using Derived Tables

Derived tables are important tools in Looker that enable you to expand the sophistication of your analyses. In some cases, they can also play a valuable role in enhancing query performance.

At a high level, Looker’s derived table functionality provides a way to create new tables that don’t already exist in your database.

You define a derived table in one of these ways:

You can then base a LookML view on the derived table, just like any other table.

Simple Example

Let’s consider an example to clarify the concept. Suppose your database already contains a table called order, and you’d like to summarize some of that order data by customer. We can create a new derived table named customer_order_facts to do this:

Here’s the LookML to create the customer_order_facts derived table as an NDT and as a SQL-based derived table:

Native Derived Table
view: customer_order_facts { derived_table: { explore_source: orders { column: customer_id { field: orders.customer_id } column: first_order { field: orders.first_order } column: lifetime_amount { field: orders.lifetime_amount } } } dimension: customer_id { type: number primary_key: yes sql: ${TABLE}.customer_id ;; } dimension_group: first_order { type: time timeframes: [date, week, month] sql: ${TABLE}.first_order_date ;; } dimension: lifetime_amount { type: number value_format: "0.00" sql: ${TABLE}.lifetime_amount ;; } }
SQL-Based Derived Table
view: customer_order_facts { derived_table: { sql: SELECT customer_id, MIN(DATE(time)) AS first_order_date, SUM(amount) AS lifetime_amount FROM orders GROUP BY customer_id ;; } dimension: customer_id { type: number primary_key: yes sql: ${TABLE}.customer_id ;; } dimension_group: first_order { type: time timeframes: [date, week, month] sql: ${TABLE}.first_order_date ;; } dimension: lifetime_amount { type: number value_format: "0.00" sql: ${TABLE}.lifetime_amount ;; } }

There are some things to note:

Temporary and Persistent Derived Tables

Derived tables can be temporary (ephemeral), or they can be stored in your database (persistent).

Regular or “Ephemeral” Derived Table

A regular derived table — sometimes called an ephemeral derived table — is temporary and not written to your database. When a user runs an Explore query involving one or more derived tables, Looker constructs a SQL query using a dialect-specific combination of the SQL for the derived table(s) plus the requested fields, joins, and filter values. If the combination has been run before and the results are still valid in cache, Looker uses those cached results.

Because an ephemeral derived table runs a new query on your database every time a user requests data from it, you’ll want to be sure that the derived table is performant and won’t put excessive strain on your database. In cases where the query will take some time to run, a persistent derived table is often a better option.

Supported Database Dialects for Derived Tables

Looker’s ability to provide derived tables depends on whether the database dialect supports them. The following list shows which dialects support derived tables in the most recent Looker release:

Persistent Derived Table

A persistent derived table (PDT) is written into a scratch schema on your database and regenerated on a schedule of your choosing. In most cases, when the user requests data from the table it has already been created, reducing query time and database load.

To use this feature you need to create a scratch schema on your database, and your database needs to support writes. The connection that connects Looker to your database must also have the Persistent Derived Tables checkbox enabled. Most companies set this up when they initially configure Looker (see the Looker Dialects documentation page for instructions for your database dialect), and of course you can also do so after your initial setup.

There are some read-only database configurations that don’t allow persistence to work (most commonly Postgres hot-swap replica databases). In these cases, you can use ephemeral derived tables instead.

Supported Database Dialects for PDTs

Looker’s ability to provide PDTs depends on whether the database dialect supports them. The following list shows which dialects support PDTs in the most recent Looker release:

PDTs are not supported for Snowflake or Google BigQuery connections that use OAuth.

Adding Persistence

If your database allows writes and you have created a scratch schema (see the Looker Dialects documentation page for instructions for your database dialect), you can store derived tables in your database to increase query speed and reduce database load. Many users who can use persistence always use persistence, as typically the additional storage space required is not an issue.

There are some situations where you should avoid persistence. Persistence should not be added to derived tables that will be extended, because each extension of a PDT will create a new copy of the table in your database. Also, persistence cannot be added to derived tables that make use of templated filters or Liquid parameters. There are potentially an infinite number of possible user inputs with those features, so the number of persistent tables in your database could become unmanageable.

Making a Derived Table Persistent

There is no parameter that means “make this derived table persistent.” Rather, persistence is created by the addition of the datagroup_trigger, sql_trigger_value, or persist_for parameter.


If you have defined a datagroup for your caching policies, you can use the datagroup_trigger parameter to cause the persistent derived table (PDT) to rebuild on the same schedule as your caching policies. This means that users will not have to wait for the PDT to be created.

Datagroups are the most flexible method of creating persistence. You can define a datagroup to trigger a PDT rebuild at the time of an ETL rebuild, for example, or you can trigger the datagroup separately from your database ETL cycle.

See the Caching Queries and Rebuilding PDTs with Datagroups documentation page for an overview of caching and rebuilding PDTs in Looker.


You can also specify a schedule on which the PDT will rebuild. This means that users will not have to wait for the PDT to be created. If the user happens to request data from the PDT while it is being rebuilt, data will be returned from the existing table until the new table is ready.

You achieve this kind of persistence by using the sql_trigger_value parameter.

By default, Looker checks to see if it should regenerate a table every 5 minutes. However, you can change this schedule as desired by using the PDT And Datagroup Maintenance Schedule setting in Looker’s admin settings.


Another option is to set the length of time the derived table should be stored before it is dropped. If a user queries the table before that length of time has been reached, they’ll receive data from the existing table. Otherwise, it will be rebuilt the next time a user requests data from it, meaning that particular user will need to wait for the rebuild. You achieve this kind of persistence by using the persist_for parameter.

Indexing PDTs

Because PDTs are actually stored in your database, you should specify indexes using the indexes parameter for SQL dialects that support them. Or, if you’re using Redshift, you can specify regular sort keys (using sortkeys), interleaved sort keys (using indexes), and distribution keys (using distribution).

To add persistence to our original derived table example, we could set it to rebuild when the order_datagroup datagroup triggers, and add indexes on both customer_id and first_order_date, like this:

view: customer_order_facts { derived_table: { explore_source: orders { … } datagroup_trigger: orders_datagroup indexes: ["customer_id", "first_order_date"] } }

If you don’t add an index (or Redshift equivalent), Looker will warn you that you should do so to improve query performance.

Generating PDTs

If you are going to use persistence, be mindful of the underlying query’s duration and trigger time, because they will have an impact on how PDTs are generated. These are some things to keep in mind:

The takeaway is that long-running PDTs can delay the builds of other PDTs. They can also slow down other user queries against a PDT while the database is working hard to generate the large PDT.

Persistence in Development Mode

When you add persistence to a derived table, it will exhibit some specific behaviors in Development Mode that you should know about.

Derived Table Behavior in Development Mode

If you query a PDT in Development Mode without making any changes to its definition, Looker will query the production version of that PDT. However, as soon as you make any change to the PDT definition and query it, a new development version of the PDT is created. This enables you to test changes without disturbing end users.

If a development version of a PDT is created, it will always be persisted for a maximum of 24 hours, regardless of the persistence method you’ve used. This ensures that Development Mode tables are cleaned up on a frequent basis and don’t clutter your database.

After you push changes to production, Looker will immediately begin treating your development table as the production table (as long as you haven’t used the conditional SQL described below). This means that users will not have to wait for a new version of the production table to be built.

Table Sharing and Cleanup

Within any given Looker instance, Looker will share PDTs between users if the PDTs have the same definition and have the same persistence method setting. Additionally, if a PDT’s definition ever ceases to exist, Looker drops the table.

This has several benefits:

Working Faster in Development Mode

Finally, there are cases when the PDT you’re creating may take a long time to generate, which can be frustrating when testing lots of changes in Development Mode. For SQL-based derived tables, Looker supports a conditional WHERE clause for Development Mode that helps manage this:

view: my_view { derived_table: { sql: SELECT columns FROM my_table WHERE -- if prod -- date > '2000-01-01' -- if dev -- date > '2015-01-01' ;; } }

In this example, the query will include all data from 2000 onward when in production, but only the data from 2015 onward when in Development Mode. Using this feature strategically to limit your result set, and increase query speed, can make Development Mode changes much easier to validate.

If you use this feature, please keep in mind that tables will never be shared between Production Mode and Development Mode. This will use more storage on your database, and it means that the table in Production Mode will need to rebuild after you commit your changes to production.

Native derived tables do not support a conditional WHERE clause, but you can create filters in your NDTs, or limit the NDT rows to help test and develop your NDTs.

Users Can Override Persistence

A user with develop permission can override the persistence setting and rebuild the PDTs to get the most up-to-date data. To rebuild the tables, use the Rebuild Derived Tables & Run option from the Explore’s menu:

You must wait for the Explore query to load before this option is available. This option will rebuild all the persistent tables (all the PDTs and aggregate tables) referenced in the query. In addition, this also rebuilds all the persistent tables that depend upon those aggregate tables and PDTs.

For the user that initiates the Rebuild Derived Tables & Run option, the query will wait for the tables to rebuild before loading results. Other users’ queries will still use the existing tables. Once the persistent tables are rebuilt, then all users will use the rebuilt tables.

Although this process is designed to not interrupt other users’ queries while the tables are rebuilding, those users could still be affected by the additional load on your database. If you are in a situation where triggering a rebuild during business hours could put an unacceptable strain on your database, you may need to communicate to your users that they should never rebuild certain PDTs or aggregate tables during those hours.

Referencing Derived Tables in Other Derived Tables

It is possible to reference one derived table in the definition of another, creating something called “cascading” derived tables. To use them, you’ll need to use the syntax ${derived_table_or_view_name.SQL_TABLE_NAME}. For example, you can reference the clean_events derived table by typing ${clean_events.SQL_TABLE_NAME}. You can also use the ${derived_table_or_view_name.SQL_TABLE_NAME} syntax to refer to a LookML view.

When referring to another derived table or view name, type SQL_TABLE_NAME as a literal string.

Although datagroups are the recommended way to trigger PDT rebuilds, the ${derived_table_or_view_name.SQL_TABLE_NAME} syntax is not supported with a datagroup’s sql_trigger parameter. However, if you are using the sql_trigger_value parameter to trigger your PDT rebuilds, you can use the ${derived_table_or_view_name.SQL_TABLE_NAME} syntax with sql_trigger_value.

Although it’s not always required, when referring to a derived table in this manner, it’s often useful to alias the table. For example:

${derived_table_or_view_name.SQL_TABLE_NAME} AS derived_table_or_view_name

This is because, behind the scenes, persistent derived tables are named with lengthy codes in your database. In some cases (especially with ON clauses) it’s easy to forget that you need to use the ${derived_table_or_view_name.SQL_TABLE_NAME} syntax to retrieve this lengthy name. An alias can help to prevent this type of mistake.

In this example we’re creating a derived table from events that cleans out unwanted rows, and then creating a summary of that derived table. The event_summary table regenerates whenever new rows are added to clean_events.

view: clean_events { derived_table: { sql: SELECT * FROM events WHERE type NOT IN ('test', 'staff') ;; sql_trigger_value: SELECT CURDATE() ;; } }   view: events_summary { derived_table: { sql: SELECT type, date, COUNT(*) AS num_events FROM ${clean_events.SQL_TABLE_NAME} AS clean_events GROUP BY type, date ;; sql_trigger_value: SELECT MAX(id) FROM ${clean_events.SQL_TABLE_NAME} ;; } }

This particular example could be done more efficiently in a single derived table, but it’s useful for demonstrating derived table references.

Monitoring and Troubleshooting Persistent Derived Tables

Companies that use many PDTs — especially if they are cascading — often want to understand the status of those PDTs. Looker offers several tools, described on the Admin Settings - Persistent Derived Tables documentation page, to investigate these tables.

When attempting to troubleshoot persistent derived tables: