Boxplot charts help you visualize the distribution and spread of values in your dataset. They can be especially useful for comparing values across categories.
To create a boxplot, your data should be separated into quartiles, or quarters. Quartiles are created when your data values are organized from smallest to largest and then that list is divided into quarters. See below for how to use Looker to divide your data into quartiles.
The box portion of the chart represents the values between the first and third quartiles, where 50% of your data is contained. The “whisker” portions of the chart, which are the lines that extend vertically from the top and bottom of the box and end at the maximum and minimum values in your data, represent the remaining 50% of values. A horizontal line through the box represents the median value.
The boxplot chart below shows the values for the Lifetime Orders field based on the Traffic Source dimension. Let’s examine the Display traffic source.
You can see that Display has a median value of 3 lifetime orders per user, with minimum and maximum values of 1 and 14 lifetime orders per user, respectively. It also has a third quartile value of 5 lifetime orders, showing that three quarters of users from the Display traffic source have 5 or fewer lifetime purchases.
Compared to other traffic sources like Email, which has a third quartile value of 10 lifetime purchases, users from the Display traffic source tend to make fewer lifetime purchases.
Building a Boxplot Chart
You can choose to use a boxplot chart by clicking the ellipsis (…) in the Visualization bar and choosing Boxplot. Each row in the Data table for your query will become one box in the chart.
Once your data is organized, you can use the visualization options for editing boxplot charts, as described on this page. The visualization option menu can be accessed by clicking the gear in the upper right corner of the visualization tab.
Building a Boxplot with Five Measures
Traditional boxplot visualizations, described in the section above, require at least one dimension, as well as the following five types of measures (which must be in this order, from left to right):
- Minimum: A measure representing the minimum data value. This can be defined in LookML as a measure of
- 25th percentile: A measure representing the 25th percentile, or the first quartile. One quarter of your data values are less than or equal to this value. This can be defined in LookML as a measure of type: percentile with the value for
- Median: A measure representing the median or midpoint of the dataset, or the second quartile. Half of your data values are less than or equal to this value. This can be defined in LookML as a measure of
- 75th percentile: A measure representing the 75th percentile, or the third quartile. Three quarters of your data values are less than or equal to this value. This can be defined in LookML as a measure of type: percentile with the value for
- Maximum: A measure representing the maximum value. This can be defined in LookML as a measure of
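The five measures listed above could be declared in LookML along the following lines. This is a minimal sketch rather than a definitive model: the view name, table name, and field names are hypothetical, and it assumes LookML's min, percentile (with a percentile parameter), median, and max measure types.

```lookml
# Hypothetical view; the view name, table name, and field names are
# assumptions made for illustration only.
view: users {
  sql_table_name: public.users ;;

  dimension: traffic_source {
    type: string
    sql: ${TABLE}.traffic_source ;;
  }

  dimension: lifetime_orders {
    type: number
    sql: ${TABLE}.lifetime_orders ;;
  }

  # Minimum
  measure: lifetime_orders_min {
    type: min
    sql: ${lifetime_orders} ;;
  }

  # 25th percentile (first quartile)
  measure: lifetime_orders_25th_percentile {
    type: percentile
    percentile: 25
    sql: ${lifetime_orders} ;;
  }

  # Median (second quartile)
  measure: lifetime_orders_median {
    type: median
    sql: ${lifetime_orders} ;;
  }

  # 75th percentile (third quartile)
  measure: lifetime_orders_75th_percentile {
    type: percentile
    percentile: 75
    sql: ${lifetime_orders} ;;
  }

  # Maximum
  measure: lifetime_orders_max {
    type: max
    sql: ${lifetime_orders} ;;
  }
}
```

In a query built on such a view, the measures would be selected in this same order (minimum, 25th percentile, median, 75th percentile, maximum) alongside the Traffic Source dimension, so that each row produces one box.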
Building a Boxplot with Fewer Than Five Measures
You can also create a boxplot using any two or three of the measures described in the section above, in order from left to right. If your query includes only two or three measures, the boxplot visualization displays only the box portion of the chart, without the whiskers.
The boxplot chart in this example is based on the Traffic Source dimension and three measures representing the minimum, median, and maximum data values for the Lifetime Orders dimension.
Series Menu Options
You can define the color palette for a chart in the Color Configuration section.
Choose a color collection from the Collection drop-down menu. A collection allows you to create themed visualizations and dashboards that look good together. You can see all the palettes in each of Looker's built-in color collections on the Color Collections documentation page. Your Looker admin may also create a custom color collection for your organization.
Once you select a color collection, the Palette section updates with the first categorical palette from that collection, and the first color in that palette is assigned to your visualization.
Specifying a Custom Color
To choose a custom color for your visualization, first select the Custom tab on the palette picker. You can edit your palette in one of the following ways:
- Click on the first color in the palette to edit it.
- Click EDIT ALL at the bottom right of the menu, then add the desired color to the beginning of the comma-separated list of color values for that palette.
To change a selected color, or edit all colors in a palette at once, you can input hex strings, such as #2ca6cd, or CSS color names, such as mediumblue, into the color value box at the bottom of the picker.
You can also click the color wheel to the right of the color value box to bring up a color picker, which can be used to select a color. The corresponding hex value for that color appears in the color value box.
If you click EDIT ALL, you'll see that the color value box is populated with the hex codes of the color palette you've chosen or customized. Copying and pasting this list is the best way to copy custom color palettes from one chart to another.
Select Reverse colors to apply the last color in the palette to your visualization.
Style Menu Options
Show Full Field Name
Show Full Field Name determines whether to show the view name along with the field name for each axis title and series name. When Show Full Field Name is turned off, generally only the field name shows; however, measures of type count display only the view name instead.
X Menu Options
Show Axis Name
Show Axis Name toggles the appearance of the x-axis name label.
Custom Axis Name
Custom Axis Name sets the name for the x-axis. It accepts any string value. This option is only available when Show Axis Name is ON.
Axis Value Labels
Axis Value Labels toggles the appearance of value labels on the x-axis.
Gridlines
Gridlines toggles the appearance of gridlines extending from the x-axis. Gridlines are spaced based on the scaling of the x-axis.
Y Menu Options
Show Axis Names
Show Axis Names toggles the appearance of y-axis name labels.
Custom Axis Names
Custom Axis Names defines the label for the y-axis. This option is only available when Show Axis Names is enabled.
Axis Value Labels
Axis Value Labels toggles the appearance of value labels on the y-axis.
Gridlines
Gridlines toggles the appearance of gridlines extending from the y-axis. Gridlines are spaced based on the scaling of the y-axis.
Minimum Values
Minimum Values defines the minimum value for each y-axis. This parameter accepts a comma-separated list of integers. If there is more than one y-axis, minimum values will be assigned to each y-axis in the order of the measures in your query.
Maximum Values
Maximum Values defines the maximum value for each y-axis. This parameter accepts a comma-separated list of integers. If there is more than one y-axis, maximum values will be assigned to each y-axis in the order of the measures in your query.
Tick Density
Tick Density sets the density of tick marks on the y-axis. The following options are available:
- Default: Sets ticks to the default density.
- Custom: Enables you to set ticks with a custom density. Selecting this option will display a slider bar where you can set the custom density.
Y Axis Format
Y Axis Format specifies the number format of the y-axis values, independent of the underlying dimension or measure. The parameter accepts Excel-style formatting. If no formatting is specified, the values are displayed in the format of the underlying dimension or measure.
Excel's documentation provides a complete guide to specifying these formats. However, date formatting and color formatting are not currently supported in Looker.
Some of the most common formatting options are shown here:
| Format | Description |
|---|---|
| | Integer zero-padded to 3 places (001). |
| | Number up to 2 decimals (1. or 1.2 or 1.23). |
| | Number with exactly 2 decimals (1.23). |
| | Number zero-padded to 3 places and exactly 2 decimals (01.23). |
| | Number with comma between thousands (1,234). |
| | Number with comma between thousands and 2 decimals (1,234.00). |
| | Number in millions with 3 decimals (1.234 M). Division by 1 million happens automatically. |
| | Dollars with 0 decimals ($123). |
| | Dollars with 2 decimals ($123.00). |
| | Dollars with comma between thousands and 2 decimals ($1,234.00). |
| | Percent with 0 decimals (1%). Multiplication by 100 happens automatically. |
| | Percent with 2 decimals (1.00%). Multiplication by 100 happens automatically. |
| | Percent with 2 decimals (1.00%). Multiplication by 100 does NOT happen automatically. |
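When no Y Axis Format is set, the chart falls back to the format of the underlying dimension or measure. As a minimal sketch of how that underlying format might be declared, the following assumes LookML's value_format parameter, which also takes Excel-style format strings. The measure name and its sql reference are hypothetical, and the format string "0.00" (a number with exactly 2 decimals) is a standard Excel-style code chosen for illustration, not one recovered from the table above.

```lookml
# Hypothetical measure; the name and the referenced field are assumptions.
measure: lifetime_orders_median {
  type: median
  sql: ${lifetime_orders} ;;
  # Excel-style format string: exactly 2 decimal places (for example, 1.23).
  value_format: "0.00"
}
```

A Y Axis Format entered in the visualization settings would then take precedence over this measure-level format on the y-axis.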