home User Guide Getting Started Help Center Documentation Community Training Certification
menu
close
settings
Looker keyboard_arrow_down
language keyboard_arrow_down
English
Français
Deutsch
日本語
search
print
Incremental PDTs

In Looker, persistent derived tables (PDTs) are written to the scratch schema of your database. Looker persists and rebuilds a PDT based on its persistence strategy. When a PDT is triggered to rebuild, by default Looker rebuilds the entire table.

If your admin has enabled the Incremental PDTs Labs feature, instead of rebuilding a PDT in its entirety, you can choose to append fresh data to the PDT:

The first time you run a query on an incremental PDT, the entire PDT will be built to get the initial data. If the table is large, the initial build may take a significant amount of time, as would building any large table. Once the initial table is built, subsequent builds will be incremental and will take less time, if the incremental PDT is set up strategically.

If your dialect supports incremental PDTs, you can make any type of PDT into an incremental PDT. This includes:

Defining an incremental PDT

You can use the following parameters to make a PDT into an incremental PDT:

See the increment_key parameter documentation page for examples showing how to create incremental PDTs from persistent native derived tables, persistent SQL-based derived tables, and aggregate tables.

Here is a simple example of a view file that defines an incremental LookML-based PDT:

view: flights_lookml_incremental_pdt { derived_table: { indexes: ["id"] increment_key: "departure_date" increment_offset: 3 datagroup_trigger: flights_default_datagroup distribution_style: all explore_source: flights { column: id {} column: carrier {} } } dimension: id { type: number } dimension: carrier { type: string } }

This table will build in its entirety the first time a query is run on it. After that, the PDT will be rebuilt in increments of one day (increment_key: departure_date), going back three days (increment_offset: 3). The increment key is based on a dimension_group that is defined in the flights view, which is the explore_source for this PDT. In this example, the dimension used for the increment key isn’t included in the PDT itself. However, because the dimension is defined in the view file on which the PDT is based, you can use that dimension as the increment key.

Interaction of increment parameters and persistence strategy

A PDT’s increment_key and increment_offset settings are independent of the PDT’s persistence strategy:

The following example scenarios illustrate how incremental PDTs are updated, by showing the interaction of increment_key, increment_offset, and persistence strategy.

Example 1

This example uses a PDT with these properties:

Here is how this table will be updated:

In SQL terms, here is the command that the PDT builder will run on June 1st to determine the rows from the existing PDT that should be rebuilt:

# Example SQL for BiqQuery:
SELECT FORMAT_TIMESTAMP('%F %T',TIMESTAMP_ADD(MAX(pdt_name),INTERVAL -3 DAY))

# Example SQL for other dialects:
SELECT CAST(DATE_ADD(MAX(pdt_name),INTERVAL -3 DAY) AS CHAR)

And here is the SQL command that the PDT builder will run on June 1st to build the latest increment:

# Example SQL for BiqQuery:

MERGE INTO [pdt_name] USING (SELECT [columns]
   WHERE created_at >= TIMESTAMP(‘4/28/21 12:00:00 AM’))
   AS tmp_name ON FALSE
WHEN NOT MATCHED BY SOURCE AND created_date >= TIMESTAMP(‘4/28/21 12:00:00 AM’)
   THEN DELETE
WHEN NOT MATCHED THEN INSERT [columns]

# Example SQL for other dialects:

START TRANSACTION;
DELETE FROM [pdt_name]
   WHERE created_date >= TIMESTAMP(‘4/28/21 12:00:00 AM’);
INSERT INTO [pdt_name]
   SELECT [columns]
   FROM [source_table]
   WHERE created_at >= TIMESTAMP(‘4/28/21 12:00:00 AM’);
COMMIT;

Example 2

This example uses a PDT with these properties:

Here is how this table will be updated on June 1st:

Here is how this table will be updated on June 2nd:

Example 3

This example uses a PDT with these properties:

This scenario illustrates a poor setup for an incremental PDT, since it’s a daily-triggering PDT with a three-month offset. This means that at least three months of data will be rebuilt every day, which would be a very inefficient use of an incremental PDT. However, it is an interesting scenario to examine as a way of understanding of how incremental PDTs work.

Here is how this table will be updated on June 1st:

Here is how this table will be updated on June 2nd:

Testing an incremental PDT in Development Mode

Before deploying a new incremental PDT to your production environment, you can test the PDT to be sure it builds and increments. To test an incremental PDT in Development Mode:

  1. Create an Explore for the PDT:

    • In an associated model file, use the include parameter to include the PDT’s view file in the model file.
    • In the same model file, use the explore parameter to create an Explore for the incremental PDT’s view.

    include: "/views/e_faa_pdt.view" explore: e_faa_pdt {}

  2. Open the Explore for the PDT.

    TIP: Once you’ve included the view in your model file and created an Explore, you can navigate directly to the Explore from the view file or model file in your LookML project by using the file actions menu at the top of the LookML file:

  1. In the Explore, select some dimensions or measures and click Run. Looker will then build the entire PDT. If this is the first query you have run on the incremental PDT, the PDT builder will build the entire PDT to get the initial data. If the table is large, the initial build may take a significant amount of time, as would building any large table.

  2. You can verify that the initial PDT was built in the following ways:

    • If you have the see_logs permission, you can verify that the table was built by looking in the PDT Event Log. If you don’t see the PDT create events in the PDT Event Log, check the status information at the top of the PDT Event Log Explore. If it says “from cache,” you can select Clear Cache & Refresh to get more recent information.
    • Otherwise, you can look at the comments in the SQL tab of the Explore’s Data bar. The SQL tab shows the query and the actions that will be taken when you run the query in the Explore. For example, if the comments in the SQL tab say -- generate derived table e_incremental_pdt, that is the action that will be taken when you click Run.
  3. Once you create the initial build of the PDT, prompt an incremental build of the PDT by using the Rebuild Derived Tables & Run option from the Explore.

  4. You can use the same methods as before to verify that the PDT builds incrementally:

    • If you have the see_logs permission, you can use the PDT Event Log to see create increment complete events for the incremental PDT. If you don’t see this event in the PDT Event Log and the query status says “from cache,” select Clear Cache & Refresh to get more recent information.
    • Look at the comments in the SQL tab of the Explore’s Data bar. In this case, the comments will indicate that the PDT was incremented. For example: -- increment persistent derived table e_incremental_pdt to generation 2
  5. Once you’ve verified the PDT is built and incrementing correctly, if you don’t want to keep the dedicated Explore for the PDT, you can remove or comment out the PDT’s explore and include parameters from your model file.

After the PDT is built in Development Mode, the same table will be used for production once you deploy your changes, unless you make further changes to the table’s definition. See the Persisted tables in Development Mode section of the Derived tables in Looker documentation page for more information.

Supported database dialects for incremental PDTs

For Looker to support incremental PDTs in your Looker project, your database dialect must support Data Definition Language (DDL) commands that enable deleting and inserting rows.

The following table shows which dialects support incremental PDTs in Looker 21.6:

Top