home User Guide Getting Started Help Center Documentation Community Training Certification
menu
close
settings
Looker keyboard_arrow_down
language keyboard_arrow_down
English
Français
Deutsch
日本語
search
print
Incremental PDTs

In Looker, persistent derived tables (PDTs) are written to the scratch schema of your database. Looker persists and rebuilds a PDT based on its persistence strategy. When a PDT is triggered to rebuild, by default Looker rebuilds the entire table.

If your dialect supports incremental PDTs, you can choose to append fresh data to the PDT instead of rebuilding a PDT in its entirety:

The first time you run a query on an incremental PDT, Looker builds the entire PDT to get the initial data. If the table is large, the initial build may take a significant amount of time, as would building any large table. Once the initial table is built, subsequent builds will be incremental and will take less time, if the incremental PDT is set up strategically.

In addition, make sure that the incremental PDT’s source table is optimized for time-based queries. Specifically, the time-based column used for the increment key must have an optimization strategy, such as partitioning, sortkeys, indexes, or whatever optimization strategy is supported for your dialect. Source table optimization is strongly recommended because each time the incremental table is updated, Looker queries the source table to determine the latest values of the time-based column used for the increment key. If the source table is not optimized for these queries, Looker’s query for the latest values may be slow and expensive.

If your dialect supports incremental PDTs, you can make the following types of PDTs into incremental PDTs:

Defining an incremental PDT

You can use the following parameters to make a PDT into an incremental PDT:

See the increment_key parameter documentation page for examples showing how to create incremental PDTs from persistent native derived tables, persistent SQL-based derived tables, and aggregate tables.

Here is a simple example of a view file that defines an incremental LookML-based PDT:

view: flights_lookml_incremental_pdt { derived_table: { indexes: ["id"] increment_key: "departure_date" increment_offset: 3 datagroup_trigger: flights_default_datagroup distribution_style: all explore_source: flights { column: id {} column: carrier {} column: departure_date {} } } dimension: id { type: number } dimension: carrier { type: string } dimension: departure_date { type: date } }

This table will build in its entirety the first time a query is run on it. After that, the PDT will be rebuilt in increments of one day (increment_key: departure_date), going back three days (increment_offset: 3).

The increment key is based on the departure_date dimension, which is actually the date timeframe from the departure dimension group. (See the dimension_group parameter documentation page for an overview of how dimension groups work.) The dimension group and timeframe are both defined in the flights view, which is the explore_source for this PDT. Here is how the departure dimension group is defined in the flights view file:

… dimension_group: departure { type: time timeframes: [ raw, date, week, month, year ] sql: ${TABLE}.dep_time ;; } …

Interaction of increment parameters and persistence strategy

A PDT’s increment_key and increment_offset settings are independent of the PDT’s persistence strategy:

The following example scenarios illustrate how incremental PDTs are updated, by showing the interaction of increment_key, increment_offset, and persistence strategy.

Example 1

This example uses a PDT with these properties:

Here is how this table will be updated:

In SQL terms, here is the command that the PDT builder will run on June 1st to determine the rows from the existing PDT that should be rebuilt:

# Example SQL for BiqQuery:
SELECT FORMAT_TIMESTAMP('%F %T',TIMESTAMP_ADD(MAX(pdt_name),INTERVAL -3 DAY))

# Example SQL for other dialects:
SELECT CAST(DATE_ADD(MAX(pdt_name),INTERVAL -3 DAY) AS CHAR)

And here is the SQL command that the PDT builder will run on June 1st to build the latest increment:

# Example SQL for BiqQuery:

MERGE INTO [pdt_name] USING (SELECT [columns]
   WHERE created_at >= TIMESTAMP(‘4/28/21 12:00:00 AM’))
   AS tmp_name ON FALSE
WHEN NOT MATCHED BY SOURCE AND created_date >= TIMESTAMP(‘4/28/21 12:00:00 AM’)
   THEN DELETE
WHEN NOT MATCHED THEN INSERT [columns]

# Example SQL for other dialects:

START TRANSACTION;
DELETE FROM [pdt_name]
   WHERE created_date >= TIMESTAMP(‘4/28/21 12:00:00 AM’);
INSERT INTO [pdt_name]
   SELECT [columns]
   FROM [source_table]
   WHERE created_at >= TIMESTAMP(‘4/28/21 12:00:00 AM’);
COMMIT;

Example 2

This example uses a PDT with these properties:

Here is how this table will be updated on June 1st:

Here is how this table will be updated on June 2nd:

Example 3

This example uses a PDT with these properties:

This scenario illustrates a poor setup for an incremental PDT, since it’s a daily triggering PDT with a three-month offset. This means that at least three months of data will be rebuilt every day, which would be a very inefficient use of an incremental PDT. However, it is an interesting scenario to examine as a way of understanding of how incremental PDTs work.

Here is how this table will be updated on June 1st:

Here is how this table will be updated on June 2nd:

Testing an incremental PDT in Development Mode

Before deploying a new incremental PDT to your production environment, you can test the PDT to be sure it builds and increments. To test an incremental PDT in Development Mode:

  1. Create an Explore for the PDT:

    • In an associated model file, use the include parameter to include the PDT’s view file in the model file.
    • In the same model file, use the explore parameter to create an Explore for the incremental PDT’s view.

    include: "/views/e_faa_pdt.view" explore: e_faa_pdt {}

  2. Open the Explore for the PDT.

    TIP: Once you’ve included the view in your model file and created an Explore, you can navigate directly to the Explore from the view file or model file in your LookML project by using the file actions menu at the top of the LookML file:

  1. In the Explore, select some dimensions or measures and click Run. Looker will then build the entire PDT. If this is the first query you have run on the incremental PDT, the PDT builder will build the entire PDT to get the initial data. If the table is large, the initial build may take a significant amount of time, as would building any large table.

  2. You can verify that the initial PDT was built in the following ways:

    • If you have the see_logs permission, you can verify that the table was built by looking in the PDT Event Log. If you don’t see the PDT create events in the PDT Event Log, check the status information at the top of the PDT Event Log Explore. If it says “from cache,” you can select Clear Cache & Refresh to get more recent information.
    • Otherwise, you can look at the comments in the SQL tab of the Explore’s Data bar. The SQL tab shows the query and the actions that will be taken when you run the query in the Explore. For example, if the comments in the SQL tab say -- generate derived table e_incremental_pdt, that is the action that will be taken when you click Run.
  3. Once you create the initial build of the PDT, prompt an incremental build of the PDT by using the Rebuild Derived Tables & Run option from the Explore.

  4. You can use the same methods as before to verify that the PDT builds incrementally:

    • If you have the see_logs permission, you can use the PDT Event Log to see create increment complete events for the incremental PDT. If you don’t see this event in the PDT Event Log and the query status says “from cache,” select Clear Cache & Refresh to get more recent information.
    • Look at the comments in the SQL tab of the Explore’s Data bar. In this case, the comments will indicate that the PDT was incremented. For example: -- increment persistent derived table e_incremental_pdt to generation 2
  5. Once you’ve verified the PDT is built and incrementing correctly, if you don’t want to keep the dedicated Explore for the PDT, you can remove or comment out the PDT’s explore and include parameters from your model file.

After the PDT is built in Development Mode, the same table will be used for production once you deploy your changes, unless you make further changes to the table’s definition. See the Persisted tables in Development Mode section of the Derived tables in Looker documentation page for more information.

Supported database dialects for incremental PDTs

For Looker to support incremental PDTs in your Looker project, your database dialect must support Data Definition Language (DDL) commands that enable deleting and inserting rows.

The following table shows which dialects support incremental PDTs in Looker 21.12:

Top