Forecasting in visualizations

Starting with Looker 22.8, the Forecasting Labs feature is enabled by default.

Forecasting lets analysts quickly add data projections to new or existing Explore queries to help users predict and monitor specific data points. Forecasted Explore results and visualizations can be added to dashboards and saved as Looks. Forecasted results and visualizations can also be created and viewed on embedded Looker content.

You can use the Forecasting Labs feature to forecast data if your Looker admin has enabled it for your Looker instance.

How forecasted results are created and displayed

The Forecasting Labs feature uses the data results in an Explore’s data table to calculate future data points. Forecast calculations include only the displayed results of an Explore query; any results that do not display because of row limits are not included. For more information about the algorithm that is used to calculate forecasts, see the ARIMA algorithm section on this page.

Forecasted results display as a continuation of existing Explore visualizations and are subject to configured visualization settings. Forecasted data points are distinguished from non-forecasted data points in the following ways:

  1. In Cartesian charts, forecasted data points are rendered in a lighter shade or with dashed lines.
  2. In Table, Table Legacy, and Explore data tables, forecasted data points are italicized and appended with an asterisk.

Forecasted data is also explicitly identified in the tooltip that appears when you hover your cursor over a forecasted data point.

Only certain types of visualizations support forecasted data, as discussed in the Supported visualization types section on this page.

ARIMA algorithm

Forecasting leverages an AutoRegressive Integrated Moving Average (ARIMA) algorithm to create an equation that best matches the data that is input into a forecast. To find the best match for the data, Looker runs ARIMA with a set of initial variables, creates a list of variations of the initial variables, and runs ARIMA again with those variations. If any of the variations create an equation that better fits the input data, Looker uses those variations as the new initial variables and creates additional variations that are then evaluated. Looker continues to repeat this process until the best variables are identified or until all options or the allocated compute time are exhausted.

This process can be thought of as a genetic algorithm, where individuals throughout hundreds of generations create 1 to 10 offspring each (variations of variables based on the parent), and the best offspring survive to potentially create “better” generations. The way Looker uses many invocations of ARIMA in a genetic algorithm approach is called AutoARIMA.
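The iterative search described above can be sketched in a few lines of Python. This is an illustration only, not Looker's implementation: it uses a simplified greedy search rather than a full genetic algorithm, and the names `neighbors`, `search_orders`, and `toy_score`, as well as the (p, d, q) tuple used to represent the variables, are assumptions made for the example. In a real setting, the score would come from fitting ARIMA with the candidate variables and measuring how well the resulting equation fits the input data.

```python
from typing import Callable, Iterable, Tuple

Order = Tuple[int, int, int]  # hypothetical (p, d, q) ARIMA variables


def neighbors(order: Order, max_value: int = 5) -> Iterable[Order]:
    """Yield variations of `order` that differ by +/-1 in one variable."""
    for i in range(3):
        for step in (-1, 1):
            value = order[i] + step
            if 0 <= value <= max_value:
                yield order[:i] + (value,) + order[i + 1:]


def search_orders(initial: Order,
                  score: Callable[[Order], float],
                  max_evaluations: int = 200) -> Tuple[Order, float]:
    """Greedy search: adopt the best variation while it improves the score.

    Lower scores are better. The search stops when no variation beats the
    current variables or when the evaluation budget is exhausted.
    """
    best, best_score = initial, score(initial)
    evaluations = 1
    improved = True
    while improved and evaluations < max_evaluations:
        improved = False
        candidate, candidate_score = best, best_score
        for variation in neighbors(best):
            if evaluations >= max_evaluations:
                break
            evaluations += 1
            variation_score = score(variation)
            if variation_score < candidate_score:
                candidate, candidate_score = variation, variation_score
        if candidate_score < best_score:
            # A variation fits better, so it becomes the new starting point.
            best, best_score = candidate, candidate_score
            improved = True
    return best, best_score


def toy_score(order: Order) -> float:
    """Stand-in for a model-fit metric; its minimum is at (2, 1, 1)."""
    p, d, q = order
    return (p - 2) ** 2 + (d - 1) ** 2 + (q - 1) ** 2


best_order, best_value = search_orders((0, 0, 0), toy_score)
print(best_order, best_value)
```

The `max_evaluations` parameter plays the role of the allocated compute time: the search returns the best variables found so far when the budget runs out.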

For additional details about AutoARIMA, see the Tips to using auto_arima section of the pmdarima User Guide. Although this is not the library that Looker uses to run AutoARIMA, pmdarima provides the best explanation of the process and the different variables that are used.

Supported visualization types

The following Cartesian visualization types support rendering forecasted data:

The following text and table chart types support rendering forecasted data:

Other visualization types, including custom visualizations, cannot currently render forecasted data.

Explore query requirements for forecasting

To create a forecast, an Explore must meet these requirements:

Things to consider

The following are additional criteria to consider when you create a new Explore query to forecast or add a forecast to an existing Explore query:

For additional tips and troubleshooting resources, see the Common issues and things to know section on this page.

Typically, a dataset with more rows, in conjunction with a shorter forecast length, will result in a more accurate forecast.

Forecast menu options

You can use the options in the Forecast menu to customize forecasted data. The Forecast menu includes the following options:

Select Fields

The Select Fields drop-down menu displays the measures or custom measures in the Explore query that are available for forecasting. Up to five measures or custom measures may be selected.

Length

The Length option indicates the number of rows, or the length of time, for which to forecast data values. The forecast duration interval is automatically populated based on the timeframe dimension in the Explore query.

Typically, a dataset with more rows, in conjunction with a shorter forecast length, results in a more accurate forecast.

Prediction Interval

The Prediction Interval option lets analysts express some uncertainty in forecasts to aid in accuracy. When enabled, the Prediction Interval option lets you select the bounds of the forecasted data values. For example, a prediction interval of 95% indicates a 95% chance that forecasted measure values will fall between the upper and lower bounds of the forecast.

The larger the selected prediction interval, the wider the upper and lower bounds.
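The relationship between the selected interval and the width of the bounds can be illustrated with a short Python sketch. It assumes normally distributed forecast errors and a known standard error, which is a simplification for illustration and not a description of how Looker computes its bounds; the function name `prediction_bounds` and the sample values are hypothetical.

```python
from statistics import NormalDist


def prediction_bounds(point_forecast: float, std_error: float, level: float):
    """Return (lower, upper) bounds for a two-sided interval at `level`.

    Assumes normally distributed forecast errors (an illustrative
    assumption, not a statement about Looker's calculation).
    """
    if not 0 < level < 1:
        raise ValueError("level must be between 0 and 1, exclusive")
    # Two-sided interval: split the excluded probability across both tails.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * std_error
    return point_forecast - margin, point_forecast + margin


# Hypothetical forecast of 100 with a standard error of 10.
for level in (0.80, 0.95, 0.99):
    lower, upper = prediction_bounds(100.0, 10.0, level)
    print(f"{level:.0%}: {lower:.1f} to {upper:.1f}")
```

Running the loop shows the bounds widening as the interval grows from 80% to 99%.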

Seasonality

The Seasonality option lets analysts account for known cycles or repetitive data trends in a forecast, and it refers to the number of rows of data in the cycle. For example, if an Explore data table has one row per hour and the data cycles daily, the seasonality is 24.

With default forecast settings, Looker references the date dimension in an Explore and scans several possible seasonality cycles to find the best match for the final forecast. For example, when using hourly data, Looker may try daily, weekly, and four-week seasonality cycles. Looker also takes into account the frequency of the dimension — if a dimension represents a six-hour period, Looker knows there will only be four rows in a day and will adjust the seasonality accordingly.
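Because seasonality is expressed as a number of rows, it can be derived by dividing the length of the cycle by the time span that each row represents. The following Python sketch shows the arithmetic for the examples above; the helper name `rows_per_cycle` is hypothetical and is not part of Looker.

```python
from datetime import timedelta


def rows_per_cycle(cycle: timedelta, row_interval: timedelta) -> int:
    """Return the number of data rows that make up one seasonal cycle."""
    rows, remainder = divmod(cycle, row_interval)
    if remainder:
        raise ValueError("cycle is not a whole number of rows")
    return rows


hourly = timedelta(hours=1)
print(rows_per_cycle(timedelta(days=1), hourly))              # 24 (daily cycle)
print(rows_per_cycle(timedelta(weeks=1), hourly))             # 168 (weekly cycle)
print(rows_per_cycle(timedelta(weeks=4), hourly))             # 672 (four-week cycle)
print(rows_per_cycle(timedelta(days=1), timedelta(hours=6)))  # 4 (six-hour rows)
```

The resulting value is the number that you would enter as a custom period if you wanted to set the seasonality yourself.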

For common use cases, the Automatic option detects the best seasonality for a given dataset. If you are aware of specific cycles in the dataset, the Custom option lets you specify the number of rows that make up a cycle for individual measures in a forecast.

When forecasting data values for multiple measures, you can select different seasonality options, including none, for each individual measure. The Seasonality drop-down menu has several options:

Forecasting applies the Automatic seasonality option to forecasts by default, even when the Seasonality option is not enabled.

Automatic

With the Automatic seasonality option, Looker selects the best option for your data from multiple common seasonality periods, such as daily, hourly, monthly, and so on.

Custom

When you know the specific number of rows that make up each season or cycle in your dataset, select Custom and specify that number in the Period field.

When you are working with data that cycles in months but is expressed in greater granularity (for example, using a date or week granularity in an Explore), generally a 4-week or 30-day period fits monthly cycles.

None

Seasonality is a powerful component of forecasting; however, depending on the input data, it’s not always recommended. If there are no predictable cycles in the data, enabling seasonality can occasionally lead to inaccurate forecasts, because the algorithm attempts to find a pattern and then fits the false pattern to the forecast. This can result in an obscure prediction.

When you are forecasting data values for multiple measures and want to enable Seasonality only for one or a few, you can select None for all measures for which you don’t want to enable Seasonality.

Creating a forecast

To create a forecast:

  1. Ensure that your Explore meets the forecast requirements. In the example for this procedure, the Explore is sorted by Users Created Month in descending order, with dimension fill enabled.
  2. Select the Forecast tab to open the Forecast menu.
  3. Select the Select Fields drop-down menu to choose up to five measures or custom measures to forecast.
  4. Enter the length of time in the future you wish to forecast.
  5. Select either the Prediction Interval or the Seasonality switch to enable each function and customize the associated options.
  6. Select the x in the menu tab next to Forecast to save your settings and exit the menu.
  7. Select Run to re-run the Explore query. (You must re-run the Explore after making any changes to the forecast.)

Your Explore results and visualization will now display forecasted values for the length of time specified.

Because forecasted calculations are dependent on the order in which data is sorted, sorting is disabled once a forecasted query has run.

Editing a forecast

To edit a forecast:

  1. Optionally, edit the Explore query as needed to add or remove different measures or timeframe fields. Ensure that your Explore meets the forecast requirements.
  2. Select the Forecast tab to open the Forecast menu.
  3. Select the Select Fields drop-down menu to make changes to the forecasted fields. To remove forecasted fields:
    • Select the boxes next to the forecasted fields in the expanded Select Fields drop-down menu to remove the fields from the forecast.
    • Alternatively, select the x next to the field name in the collapsed Select Fields menu.
  4. Edit the specified length of time in the future to forecast, as desired.
  5. Select either the Prediction Interval or the Seasonality switch to enable each function and customize the associated options.
    • If either Prediction Interval or Seasonality was already enabled, the customizations will be displayed. Edit custom settings as desired, or select the switch to remove the function from the forecast.
  6. Select the x in the menu tab next to Forecast to save your settings and exit the menu.
  7. Select Run to re-run the Explore query. (You must re-run the Explore after making any changes to the forecast.)

Your Explore results and visualization will now display the amended forecast. Because forecasted calculations are dependent on the order in which data is sorted, sorting is disabled once a forecasted query has run.

Removing a forecast

To remove a forecast from an Explore:

  1. Open the Forecast tab.
  2. Select Clear.

The query will automatically re-run to produce the results without a forecast applied.

Common issues and things to know

“How accurate is it?”

The accuracy of a forecast depends on the input data. Looker’s AutoARIMA implementation can make incredibly accurate predictions that successfully combine many nuances from the input data. There are also cases in which the algorithm gets caught up in odd patterns in the input data and overemphasizes them in the prediction. Make sure that enough data is provided and that the data is as accurate as possible to get the most out of forecasting.

A forecast could not be generated

There are legitimate reasons that a forecast cannot be generated. These usually involve too little input data or a requested forecast length that is too long. There is no specific limit to either factor, and there is no exact ratio of required input data for a certain length of forecast. The more scattered and unpredictable the input data, the more difficult it will be for the AutoARIMA algorithm to find a match. The most effective way to generate a forecast is to increase the amount of clean input data, make sure the seasonality settings are correct, and reduce the forecast length to only what’s needed. When using the Prediction Interval option, it may help to choose a lower interval.

Cleaning input data can involve:

The query result returned without forecasts, and I received an obscure error

This should not occur; if it does, try removing the measure or measures from the forecast configuration and then re-adding them.

The forecast displays but it is obviously wrong or unhelpful

The best thing to do in this case is to add more input data, clean it up as much as possible, and potentially set a custom seasonality (if you are aware of specific cycles in the data) or disable the Seasonality option altogether by selecting None.

Cleaning input data can involve the following tasks:
