home User Guide Getting Started Help Center Documentation Community Training Certification
menu
close
settings
Looker keyboard_arrow_down
language keyboard_arrow_down
English
Français
Deutsch
日本語
search
print
Derived tables in Looker

In Looker, a derived table is a query whose results are used as if it were an actual table in the database.

For example, let’s say we have a database table called orders that has many columns. We can create a derived table named customer_order_summary and include a subset of the orders table’s columns:

We can now work with the customer_order_summary derived table just as if it were any other table in our database.

Native derived tables and SQL-based derived tables

To create a derived table in your Looker project, use the derived_table parameter under a view parameter. Inside the derived_table parameter, you can define the query for the derived table in one of two ways:

For example, the following view files show how you could use LookML to create a view from the customer_order_summary derived table. The two versions of the LookML illustrate how you can create equivalent derived tables using LookML or SQL to define the query for the derived table:

Native derived table version
view: customer_order_summary { derived_table: { explore_source: orders { column: customer_id { field: orders.customer_id } column: first_order { field: orders.first_order } column: total_amount { field: orders.total_amount } } } dimension: customer_id { type: number primary_key: yes sql: ${TABLE}.customer_id ;; } dimension_group: first_order { type: time timeframes: [date, week, month] sql: ${TABLE}.first_order ;; } dimension: total_amount { type: number value_format: "0.00" sql: ${TABLE}.total_amount ;; } }
SQL-based derived table version
view: customer_order_summary { derived_table: { sql: SELECT customer_id, MIN(DATE(time)) AS first_order, SUM(amount) AS total_amount FROM orders GROUP BY customer_id ;; } dimension: customer_id { type: number primary_key: yes sql: ${TABLE}.customer_id ;; } dimension_group: first_order { type: time timeframes: [date, week, month] sql: ${TABLE}.first_order ;; } dimension: total_amount { type: number value_format: "0.00" sql: ${TABLE}.total_amount ;; } }

Both versions create a view called customer_order_summary that is based on the orders table, with the columns customer_id, first_order, and total_amount.

Other than the derived_table parameter and its subparameters, this customer_order_summary view works just like any other view file. Whether you define the derived table’s query with LookML or with SQL, you can create LookML measures and dimensions based on the columns of the derived table.

Once you define your derived table, you can use it like any other table in your database.

Native derived tables

Native derived tables are based on queries that you define using LookML terms. To create a native derived table, you use the explore_source parameter inside the derived_table parameter of a view parameter. You create the columns of your native derived table by referring to the LookML dimensions or measures in your model. See the native derived table view file in the example above.

Compared to SQL-based derived tables, native derived tables are much easier to read and understand as you model your data.

See the Creating native derived tables documentation page for details on creating native derived tables.

SQL-based derived tables

To create a SQL-based derived table, you define a query in SQL terms, creating columns in the table using a SQL query. You cannot refer to LookML dimensions and measures in a SQL-based derived table. See the SQL-based derived table view file in the example above.

Most commonly, you define the SQL query using the sql parameter inside the derived_table parameter of a view parameter.

A helpful shortcut for creating SQL-based queries in Looker is to use SQL Runner to create the SQL query and turn it into a derived table definition.

Certain edge cases won’t permit the use of the sql parameter. In such cases, Looker supports the following parameters for defining a SQL query for persistent derived tables (PDTs):

Whether you are using the sql, create_process, or sql_create parameter, in all these cases you are defining the derived table with a SQL query, so these are all considered SQL-based derived tables.

When you define a SQL-based derived table, make sure to give each column a clean alias by using AS. This is because you will need to reference the column names of your result set in your dimensions, such as ${TABLE}.first_order. This is why in our example above we used MIN(DATE(time)) AS first_order instead of simply MIN(DATE(time)).

To be used as an incremental PDT, a SQL-based derived table must be defined using the sql parameter. SQL-based derived tables that are defined with the sql_create parameter or the create_process parameter cannot be incrementally built.

Temporary and persistent derived tables

In addition to the distinction between native derived tables and SQL-based derived tables, there is also a distinction between a temporary derived table (not written to the database) and a persistent derived table (written into a scratch schema on your database).

Native derived tables and SQL-based derived tables can be either temporary or persistent.

Temporary derived tables

The derived tables shown above are examples of temporary derived tables. They are temporary because there is no persistence strategy defined in the derived_table parameter.

Temporary derived tables are not written to the database. When a user runs an Explore query involving one or more derived tables, Looker constructs a SQL query using a dialect-specific combination of the SQL for the derived table(s) plus the requested fields, joins, and filter values. If the combination has been run before and the results are still valid in the cache, Looker uses the cached results. See the Caching queries and rebuilding PDTs with datagroups documentation page for more information on query caching in Looker.

Otherwise, if Looker can’t use cached results, Looker must run a new query on your database every time a user requests data from a temporary derived table. Because of this, you should be sure that your temporary derived tables are performant and won’t put excessive strain on your database. In cases where the query will take some time to run, a persistent derived table is often a better option.

Supported database dialects for temporary derived tables

For Looker to support derived tables in your Looker project, your database dialect must also support them. The following table shows which dialects support derived tables in Looker 21.16:

Persistent derived tables (PDTs)

A persistent derived table (PDT) is a derived table that is written into a scratch schema on your database and regenerated on the schedule that you specify with a persistence strategy.

A PDT can be either a native derived table or a SQL-based derived table.

PDTs are helpful because when the user requests data from the table, often the table will have already been created, reducing query time and database load.

To use PDTs in your Looker project, you need the following:

Supported database dialects for PDTs

For Looker to support PDTs in your Looker project, your database dialect must also support them.

To support any type of persistent derived tables (either LookML-based or SQL-based), the dialect must support writes to the database, among other requirements. There are some read-only database configurations that don’t allow persistence to work (most commonly Postgres hot-swap replica databases). In these cases, you can use temporary derived tables instead.

The following table shows the dialects that support persistent SQL-based derived tables in Looker 21.16:

To support persistent native derived tables (which have LookML-based queries), the dialect must also support a CREATE TABLE DDL function. Here is a list of the dialects that support persistent native (LookML-based) derived tables in Looker 21.16:

PDTs are not supported for Snowflake or Google BigQuery connections that use OAuth.

Incrementally building PDTs

You can create incremental PDTs in your project if your dialect supports them. An incremental PDT is a persistent derived table (PDT) that Looker builds by appending fresh data to the table instead of rebuilding the table in its entirety. See the Incremental PDTs documentation page for more information.

Supported database dialects for incremental PDTs

For Looker to support incremental PDTs in your Looker project, your database dialect must also support them. The following table shows which dialects support incremental PDTs in Looker 21.16:

Creating persistent derived tables

To make a derived table into a persistent derived table, you define a persistence strategy for the table. To optimize performance, you should also add an optimization strategy.

Persistence strategies

The persistence of a derived table can be managed by Looker or, for dialects that support them, by your database using materialized views.

To make a derived table persistent, add one of the following parameters to the derived_table:

Certain users can override the persistence setting of a PDT. A user with develop permission can use the Rebuild Derived Tables & Run option from an Explore’s menu to override the persistence setting and rebuild all the PDTs required for the current query in the Explore. See the Manually rebuilding PDTs for a query section on this page for details.

datagroup_trigger

Datagroups are the most flexible method of creating persistence. If you have defined a datagroup for your caching policies, you can use the datagroup_trigger parameter to initiate the rebuilding of the PDT on the same schedule as your caching policies.

Looker maintains the PDT in the database until its datagroup is triggered. When the datagroup is triggered, Looker rebuilds the PDT to replace the previous version. This means that, in most cases, your users will not have to wait for the PDT to be built. If a user requests data from the PDT while it is being built and the query results aren’t in the cache, Looker will return data from the existing PDT until the new PDT is built. See the Caching queries and rebuilding PDTs with datagroups documentation page for an overview of datagroups.

See the Looker regenerator section for more information on how the regenerator builds PDTs.

sql_trigger_value

Another option is to use the sql_trigger_value parameter, which triggers the regeneration of a PDT based on a SQL statement that you provide. If the result of the SQL statement is different from the previous value, the PDT is regenerated. Otherwise, the existing PDT is maintained in the database. This means that,in most cases, your users will not have to wait for the PDT to be built. If a user requests data from the PDT while it is being built, and the query results aren’t in the cache, Looker will return data from the existing PDT until the new PDT is built.

See the Looker regenerator section for more information on how the regenerator builds PDTs.

persist_for

Yet another option is to use the persist_for parameter to set the length of time the derived table should be stored before it is marked as expired, so that it is no longer used for queries and will be dropped from the database.

A persist_for PDT is built when a user first runs a query on it. Looker then maintains the PDT in the database for the length of time specified in the PDT’s persist_for parameter. If a user queries the PDT within the persist_for time, Looker uses cached results if possible or else runs the query on the PDT.

After the persist_for time, Looker clears the PDT from your database, and the PDT will be rebuilt the next time a user queries it, meaning that query will need to wait for the rebuild.

PDTs that use persist_for aren’t automatically rebuilt by Looker’s regenerator, except in the case of a dependency cascade of PDTs. When a persist_for table is part of a dependency cascade with PDTs that use the datagroup_trigger or sql_trigger_value persistence strategy, the regenerator will monitor and rebuild the persist_for table in order to rebuild other tables in the cascade. See the How Looker builds cascading derived tables section on this page.

materialized_view: yes

The materialized view functionality is an advanced feature. Depending on your dialect, a materialized view can consume large resources, so it is important that you understand your dialect’s implementation of materialized views. See your dialect’s documentation for information on the behavior of your dialect and the frequency with which the dialect refreshes data for materialized views.

Materialized views allow you to leverage your database’s functionality to persist derived tables in your Looker project. If your database dialect supports materialized views and your Looker connection is configured with the Persistent Derived Tables option enabled, you can create a materialized view by specifying materialized_view: yes for a derived table. Materialized views are supported for both native derived tables and SQL-based derived tables.

Similar to a persistent derived table (PDT), a materialized view is a query result that is stored as a table in the scratch schema of your database. The key difference between a PDT and a materialized view is in how tables are refreshed:

For this reason, the materialized view functionality requires advanced knowledge of your dialect and its features. In most cases, your database will refresh the materialized view any time the database detects new data in the tables that are queried by the materialized view. Materialized views are optimal for scenarios that require real-time data.

See the materialized_view parameter documentation page for information on dialect support, requirements, and important considerations.

Optimization strategies

Because PDTs are stored in your database, you should optimize your PDTs using the following strategies, as supported by your dialect:

For example, to add persistence to our derived table example, we could set it to rebuild when the order_datagroup datagroup triggers, and add indexes on both customer_id and first_order, like this:

view: customer_order_summary { derived_table: { explore_source: orders { … } datagroup_trigger: orders_datagroup indexes: ["customer_id", "first_order"] } }

If you don’t add an index (or equivalent for your dialect), Looker will warn you that you should do so to improve query performance.

Cascading derived tables

It is possible to reference one derived table in the definition of another, creating something called cascading derived tables, or cascading PDTs, as the case may be. You can set up a chain of derived tables, where TABLE_D is dependent on TABLE_C, which is dependent on TABLE_B, which is dependent on TABLE_A, and so on:

Before you set up cascading derived tables, it is important to know how cascading tables are automatically rebuilt by Looker’s regenerator and how they can be manually rebuilt by users. There are also other important considerations for implementing persisted tables. All these topics are covered below on this page.

Syntax for referencing a derived table

To reference a derived table in another derived table, use this syntax:

${derived_table_or_view_name.SQL_TABLE_NAME}

In this format, SQL_TABLE_NAME is a literal string. For example, you can reference the clean_events derived table with this syntax:

${clean_events.SQL_TABLE_NAME}

You can use this same syntax to refer to a LookML view. Again, in this case, the SQL_TABLE_NAME is a literal string.

The ${derived_table_or_view_name.SQL_TABLE_NAME} syntax is not supported in the sql_trigger parameter of a datagroup, so you can’t use a derived table to trigger a datagroup. But you can use the ${derived_table_or_view_name.SQL_TABLE_NAME} syntax in the sql_trigger_value parameter to trigger your PDT rebuilds.

In the example below, the clean_events PDT is created from the events table in our database. The clean_events PDT leaves out unwanted rows from the events database table. And then we’ve created another PDT, the event_summary PDT, which is a summary of the clean_events PDT. The event_summary table regenerates whenever new rows are added to clean_events.

The event_summary PDT and the clean_events PDT are cascading PDTs, where event_summary is dependent on clean_events (since event_summary is defined using the clean_events PDT). (This particular example could be done more efficiently in a single PDT, but it’s useful for demonstrating derived table references.)

view: clean_events { derived_table: { sql: SELECT * FROM events WHERE type NOT IN ('test', 'staff') ;; datagroup_trigger: events_datagroup } }   view: events_summary { derived_table: { sql: SELECT type, date, COUNT(*) AS num_events FROM ${clean_events.SQL_TABLE_NAME} AS clean_events GROUP BY type, date ;; datagroup_trigger: events_datagroup } }

Although it’s not always required, when referring to a derived table in this manner, it’s often useful to create an alias for the table using this format:

${derived_table_or_view_name.SQL_TABLE_NAME} AS derived_table_or_view_name

The example above does this:

${clean_events.SQL_TABLE_NAME} AS clean_events

It is helpful to use an alias because, behind the scenes, PDTs are named with lengthy codes in your database. In some cases (especially with ON clauses) it’s easy to forget that you need to use the ${derived_table_or_view_name.SQL_TABLE_NAME} syntax to retrieve this lengthy name. An alias can help to prevent this type of mistake.

How Looker builds cascading derived tables

In the case of cascading temporary derived tables, if a user’s query results aren’t in the cache, Looker will build all the derived tables that are needed for the query. If you have a TABLE_D whose definition contains a reference to TABLE_C, then TABLE_D is dependent on TABLE_C. This means that if you query TABLE_D and the query is not in Looker’s cache, Looker will rebuild TABLE_D. But first, it must rebuild TABLE_C.

Now let’s take a scenario of cascading temporary derived tables where TABLE_D is dependent on TABLE_C, which is dependent on TABLE_B, which is dependent on TABLE_A. If Looker doesn’t have valid results for a query on TABLE_C in the cache, Looker will build all the tables it needs for the query. So Looker will build TABLE_A, then TABLE_B, then TABLE_C:

In this scenario, TABLE_A must finish generating before Looker can start generating TABLE_B, and so on, until TABLE_C is finished and Looker can provide the query results. (Since TABLE_D isn’t needed to answer this query, Looker won’t rebuild TABLE_D at this time.)

The same basic logic applies for PDTs: Looker will build any table required to answer a query, all the way up the chain of dependencies. But with PDTs, it is often the case that the tables already exist and don’t need to be rebuilt. With standard user queries on cascading PDTs, Looker rebuilds the PDTs in the cascade only if there’s no valid version of the PDTs in the database. If you want to force a rebuild for all PDTs in a cascade, you can manually rebuild the tables for a query through an Explore.

An important logical point to understand is that in the case of a PDT cascade, a dependent PDT is essentially querying the PDT it depends on. This is significant especially for PDTs that use the persist_for strategy. Typically, persist_for PDTs are built when a user queries them, remain in the database until their persist_for interval is up, and then are not rebuilt until they are next queried by a user. However, if a persist_for PDT is part of a cascade with PDTs using datagroup_trigger or sql_trigger_value, the persist_for PDT is essentially being queried whenever its dependent PDTs are rebuilt. So, in this case, the persist_for PDT will be rebuilt on the schedule of its dependent PDTs. This means that persist_for PDTs can be affected by the persistence strategy of their dependents.

Manually rebuilding persistent tables for a query

Users can select the Rebuild Derived Tables & Run option from an Explore’s menu to override the persistence settings and rebuild all the PDTs and aggregate tables required for the current query in the Explore:

This option is visible only to users with develop permission, and only after the Explore query has loaded.

The Rebuild Derived Tables & Run option rebuilds all the persistent tables (all the PDTs and aggregate tables) that are required to answer the query, regardless of their persistence strategy. This includes any aggregate tables and PDTs in the current query, and it also includes any aggregate tables and PDTs that are referenced by the aggregate tables and PDTs in the current query.

In the case of incremental PDTs, the Rebuild Derived Tables & Run option triggers the build of a new increment. With incremental PDTs, an increment includes the time period specified in the increment_key parameter, and also the number of previous time periods specified in the increment_offset parameter, if any. See the Incremental PDTs documentation page for some example scenarios that show how incremental PDTs build, depending on their configuration.

In the case of cascading PDTs, this means rebuilding all the derived tables in the cascade, starting at the top. This is the same behavior as when you query a table in a cascade of temporary derived tables:

For the user who initiates the Rebuild Derived Tables & Run operation, the query will wait for the tables to rebuild before loading results. Other users’ queries will still use the existing tables. Once the persistent tables are rebuilt, then all users will use the rebuilt tables.

Although this process is designed not to interrupt other users’ queries while the tables are rebuilding, those users could still be affected by the additional load on your database. If you are in a situation where triggering a rebuild during business hours could put an unacceptable strain on your database, you may need to communicate to your users that they should never rebuild certain PDTs or aggregate tables during those hours.

Persisted tables in Development Mode

Looker has some special behaviors for managing persisted tables in Development Mode.

These behaviors apply to any type of persisted table, which includes aggregate tables as well as PDTs.

If you query a persisted table in Development Mode without making any changes to its definition, Looker will query the production version of that table. If you do make a change to the table definition that affects the data in the table or the way that the table is queried, a new development version of the table will be created the next time you query the table in Development Mode. Having such a development table enables you to test changes without disturbing end users.

What prompts Looker to create a development table

When possible, Looker uses the existing production table to answer queries, whether or not you are in Development Mode. But there are certain cases where Looker cannot use the production table for queries in Development Mode:

Looker will build a development table if you are in Development Mode and you query a SQL-based derived table that is defined using a conditional WHERE clause with if prod and if dev statements. (This is not the case for native derived tables with the dev_filters parameter. For native derived tables with dev_filters, Looker has the logic to use the production table to answer queries in Development Mode, unless you change the definition of the table and then query the table in Development Mode.)

For persisted tables that don’t have a parameter to narrow the dataset in Development Mode, Looker uses the production version of the table to answer queries in Development Mode, unless you change the definition of the table and then query the table in Development Mode. This goes for any changes to the table that affect the data in the table or the way that the table is queried.

Here are some examples of the types of changes that will prompt Looker to create a development version of a persistent table (Looker will create the table only if you subsequently query the table after making these changes):

For changes that do not modify the table’s data or affect the way that Looker queries the table, Looker will not create a development table. The publish_as_db_view parameter is a good example: In Development Mode, if you change only the publish_as_db_view setting for a derived table, Looker doesn’t need to rebuild the derived table so will not create a development table.

How long Looker persists development tables

Regardless of the table’s actual persistence strategy, Looker treats development persisted tables as if they had a persistence strategy of persist_for: "24 hours". Looker does this to ensure that development tables aren’t persisted for more than a day, since a Looker developer may query many iterations of a table during development, and each time a new development table is built. To prevent the development tables from cluttering the database, Looker applies the persist_for: "24 hours" strategy to be sure the tables are cleaned from the database on a frequent basis.

Otherwise, Looker builds PDTs and aggregate tables in Development Mode the same way it builds persisted tables in Production Mode.

If a development table is persisted on your database when you deploy changes to a PDT or an aggregate table, Looker can often use the development table as the production table so that your users don’t have to wait for the table to build when they query the table.

Note that when you deploy your changes, the table may still need to be rebuilt to be queried in production, depending on the situation:

Otherwise, if you deploy your changes when there is no valid development table that can be used as the production table, Looker will rebuild the table the next time the table is queried in Production Mode (for persisted tables that use the persist_for strategy), or the next time the regenerator runs (for persisted tables that use datagroup_trigger or sql_trigger_value).

Checking for unbuilt PDTs in Development Mode

If a development table is persisted on your database when you deploy changes to a PDT or an aggregate table, Looker can often use the development table as the production table so that your users don’t have to wait for the table to build when they query the table. See the How long Looker persists development tables and What prompts Looker to create a development table sections on this page for more details.

Therefore, it is optimal that all of your PDTs are built when you deploy to production so that the tables can be used immediately as the production versions.

You can check your project for unbuilt PDTs in the Project Health panel. Click the Project Health icon, then click the Validate PDT Status button:

If there are unbuilt PDTs, the Project Health panel will list them:

If you have see_pdts permission, you can click the Go to PDT Management button. Looker will open the Development tab of the Persistent Derived Tables page and filter the results to your specific LookML project. From there, you can see which development PDTs are built and unbuilt and access other troubleshooting information. See the Admin settings - Persistent Derived Tables documentation page for more information.

Once you identify an unbuilt PDT in your project, you can build a development version it by opening an Explore that queries the table, then using the Rebuild Derived Tables & Run option from the Explore menu. See the Manually rebuilding persistent tables for a query section on this page.

If a persisted table is a SQL-based derived table that is defined using a conditional WHERE clause with if prod and if dev statements, you will not be able to use the development version of the table for production, since the development version of the table has an abbreviated dataset. See the How long Looker persists development tables section of this page for more information. For native derived tables with the dev_filters parameter, Looker does use the production table to answer queries in Development Mode, unless you change the definition of the table and then query the table in Development Mode (see the What prompts Looker to create a development table section on this documentation page).

Table sharing and cleanup

Within any given Looker instance, Looker will share persisted tables between users if the tables have the same definition and the same persistence method setting. Additionally, if a table’s definition ever ceases to exist, Looker marks the table as expired.

This has several benefits:

Working faster in Development Mode

There are situations when the PDT you’re creating takes a long time to generate, which can be time-consuming if you are testing lots of changes in Development Mode. For these cases, you can prompt Looker to create smaller versions of a derived table when you’re in Development Mode.

For native derived tables, you can use the dev_filters subparameter of explore_source to specify filters that are only applied to development versions of the derived table:

view: e_faa_pdt { derived_table: { … datagroup_trigger: e_faa_shared_datagroup explore_source: flights { dev_filters: [flights.event_date: "90 days"] filters: [flights.event_date: "2 years", flights.airport_name: "Yucca Valley Airport"] column: id {} column: airport_name {} column: event_date {} } } … }

This example includes a dev_filters parameter that filters the data to the last 90 days and a filters parameter that filters the data to the last 2 years and to the Yucca Valley Airport.

The dev_filters parameter acts in conjunction with the filters parameter so that all filters are applied to the development version of the table. If both dev_filters and filters specify filters for the same column, dev_filters takes precedence for the development version of the table. In this example, the development version of the table will filter the data to the last 90 days for the Yucca Valley Airport.

For SQL-based derived tables, Looker supports a conditional WHERE clause with different options for production (if prod) and development (if dev) versions of the table:

view: my_view { derived_table: { sql: SELECT columns FROM my_table WHERE -- if prod -- date > '2000-01-01' -- if dev -- date > '2020-01-01' ;; } }

In this example, the query will include all data from 2000 onward when in Production Mode but only the data from 2020 onward when in Development Mode. Using this feature strategically to limit your result set, and increase query speed, can make Development Mode changes much easier to validate.

If a persisted table has the dev_filters parameter or the conditional WHERE clause, the development table cannot be used as the production version, since the development version has an abbreviated dataset. If this is the case, after you’ve finished developing the table and before you deploy your changes, you can comment out the dev_filters parameter or the conditional WHERE clause and then query the table in Development Mode. Looker will then build a full version of the table that can be used for production when you deploy your changes. See the How long Looker persists development tables section on this page for more details about using development tables in production.

The Looker regenerator

The Looker regenerator checks the status and initiates rebuilds for persisted tables (PDTs and aggregate tables) that use a SQL trigger as a persistence strategy:

The Looker regenerator also initiates rebuilds for persisted tables that use the persist_for strategy, but only when the persist_for table is part of a dependency cascade of other persisted tables that use SQL triggers for persistence. In this case, the Looker regenerator will initiate rebuilds for a persist_for table, since the table is needed to rebuild the other tables in the cascade. Otherwise, the regenerator doesn’t monitor persisted tables that use the persist_for strategy.

The regenerator maintains a SQL-triggered persisted table in the database and then rebuilds the table as prescribed by the table’s SQL trigger. If a user requests data from the persisted table while it is being built and the query results aren’t in the cache, Looker checks to see if the existing table is still valid. (The previous table may not be valid if it is not compatible with the new version of the table, which can happen for reasons such as the new table has a different definition, the new table uses a different database connection, or the new table was created with a different version of Looker.) If the existing table is still valid, Looker will return data from the existing table until the new table is built. Otherwise, if the existing table is not valid, Looker will provide query results once the new table is rebuilt.

By default, Looker’s regenerator checks SQL trigger queries every five minutes to see if the persisted table is triggered and should be rebuilt. However, other factors can affect the time it takes to rebuild your tables, as described in the Important considerations for implementing persisted tables section on this page.

Important considerations for implementing persisted tables

Considering the usefulness of persisted tables (PDTs and aggregate tables), it is easy to accumulate many of them on your Looker instance. It is possible to create a scenario in which the Looker regenerator needs to build many tables at the same time. Especially with cascading tables, or long-running tables, you can create a scenario where tables have a long delay before rebuilding, or where users experience a delay in getting query results from a table while the database is working hard to generate the table.

By default, Looker’s regenerator checks SQL trigger queries every five minutes to see if it should rebuild SQL-triggered persisted tables (PDTs and aggregate tables that use the datagroup_trigger or sql_trigger_value persistence strategy). However, other factors can affect the time required to rebuild your tables:

In addition to the considerations above, there are also some situations in which you should avoid adding persistence to a derived table:

Depending on the number and complexity of persisted tables on your Looker connection, the queue might contain many persisted tables needing to be checked and rebuilt at each cycle, so it is important to keep these factors in mind when implementing derived tables on your Looker instance.

Monitoring and troubleshooting PDTs

If you use PDTs, and especially cascading PDTs, it is helpful to see the status of your PDTs. You can use Looker’s Persistent Derived Tables admin page to see the status of your PDTs. See the Admin settings - Persistent derived tables documentation page for information.

When attempting to troubleshoot persistent derived tables:

Query comments for PDTs

Database administrators can easily differentiate normal queries from those that generate PDTs. Looker adds a comment to the CREATE TABLE ... AS SELECT ... statement that includes the PDT’s LookML model and view, plus a unique identifier (slug) for the Looker instance. If the PDT is being generated on behalf of a user in Development Mode, the comment will indicate the user’s ID. This comment can be seen in a database console, as shown in the following example of a PDT being generated from Development Mode:

The PDT generation comment can also be seen in an Explore’s SQL tab if Looker had to generate a PDT for the Explore’s query. The comment appears at the top of the SQL statement, as shown in the following example:

Finally, the PDT generation comment appears in the details of the Queries admin page, as shown in the following example:

Rebuilding PDTs after a failure

When a PDT has a failure, here is what happens when that PDT is queried:

With cascading PDTs, the same logic applies, except that with cascading PDTs:

Revisiting our previous example of cascading tables, where TABLE_D is dependent on TABLE_C, which is dependent on TABLE_B, which is dependent on TABLE_A:

If TABLE_B has a failure, all the standard (non-cascade) behavior applies for TABLE_B: If TABLE_B is queried, Looker first tries to use the cache to return results, then tries to use a previous version of the table if possible, then tries to rebuild the table, then finally will return an error if TABLE_B can’t rebuild. Looker will try again to rebuild TABLE_B when the table is next queried or when the table’s persistence strategy next triggers a rebuild.

The same also applies for the dependents of TABLE_B. So if TABLE_B can’t be built, and there is a query on TABLE_C:

Once you resolve the problem with TABLE_B, then TABLE_B and each of the dependent tables will attempt to rebuild according to their persistence strategies, or the next time they are queried (which includes the next time a dependent PDT attempts to rebuild). Or, if a development version of the PDTs in the cascade was built in Development Mode, the development versions may be used as the new production PDTs. (See the Persisted tables in Development Mode section on this page for how this works.) Or you can use an Explore to run a query on TABLE_D and then manually rebuild the PDTs for the query, which will force a rebuild of all the PDTs going up the dependency cascade.

Top