Joins enable you to connect different views so that you can explore data from more than one view at the same time and see how different parts of your data relate to each other.
For example, your database might include tables like order_items
, orders
, and users
. We can use joins to explore data from all these tables at the same time. This page will walk through joining in LookML, including specific join parameters and joining patterns.
Joins Start with an Explore
Joins are defined in the model file to establish the relationship between an Explore and a view. Joins connect one or more views to a single Explore, either directly, or through another joined view.
Let’s consider two database tables: order_items
and orders
. Once you have generated views for both of these tables, you can declare one or more of them under the explore
parameter in the model file, like this:
When we run a query from the order_items
Explore, order_items
will appear in the FROM
clause of the generated SQL, like this:
We can also join additional information to our order_items
Explore. For example, suppose we wanted to add data about the order
that the order_item
is a part of. You might do something like this:
The LookML above accomplishes two things. First, you will be able to see fields from both orders
and order_items
in the UI, like so:
Second, the LookML describes how to join orders
and order_items
together. That LookML would translate to the following SQL:
Below, we’ll step through these LookML parameters one by one to explain them in greater detail. You can also read the join
parameter reference page to learn more about how this LookML is translated into SQL.
Chat Team Tip: The validation error we are asked about most is “Unknown or inaccessible field,” which can be caused by a missing join. Check out the Help Center article about this error for more information.
Join Parameters
There are four main parameters that are used to join: joins
, join
, type
, relationship
, and sql_on
. Let’s walk through each of them from the example above.
Step 1: Starting the Explore
First, create the order_items
Explore, like so:
Step 2: join
If you want to join a table, you must first declare it in a view. In this example, orders
is an existing view in our model.
We can then use the join
parameter to declare that we want to join the orders
view to order_items
:
Step 3: type
We can next consider which type
of join to perform. Looker supports LEFT JOIN
, INNER JOIN
, FULL OUTER JOIN
, and CROSS JOIN
. These correspond to the type
parameter values of left_outer
, inner
, full_outer
, and cross
.
The default value of type
is left_outer
, which will generally be your most popular join type.
Step 4: relationship
We can then define a join relationship
between order_items
and orders
. Properly declaring the relationship
of a join is important for Looker to calculate accurate measures. The relationship is defined from the order_items
Explore to the orders
view. The possible options are one_to_one
, many_to_one
, one_to_many
, and many_to_many
.
In our example, there can be many order_items
for a single order
, so the relationship from order_items
to orders
is many_to_one
:
If you do not include a relationship
in your join, Looker will default to many_to_one
.
Step 5: sql_on
The final step is declaring how to join these two tables together. This can be done with either the sql_on
or the foreign_key
parameter. We usually suggest sql_on
since it can do everything foreign_key
can do, but is typically easier to understand.
sql_on
is equivalent to the ON
clause in the generated SQL for a query. With this parameter, we can declare which fields should be matched up to perform the join:
We can also write more complex joins. For example, you may want to join only orders with id
greater than 1000:
Check out substitution operators to learn more about the ${ ... }
syntax in these examples.
Step 6: Testing
We’ve now created a new join in LookML! You can test that this join is functioning as expected by going to the Order Items Explore. You should see fields from both order_items
and orders
.
Check out Model Development to learn more about testing LookML changes.
Joining Through Another View
You may want to join a view to an Explore through another view. In the example above, we joined orders
to order_items
via the order_id
field. We might also want to join the data from a view called users
to the order_items
Explore, even though they don’t share a common field. This can be done by joining through the orders
view.
We can use sql_on
or foreign_key
to join users
to orders
instead of order_items
. We do this by correctly scoping the field from orders
as orders.user_id
.
Here is an example using sql_on
:
Joining a View More Than Once
Let’s say we have a users
view that contains data for both buyers and sellers. We may want to join data from this view into order_items
, but do so separately for buyers and sellers. We can join users
twice, with different names, using the from
parameter.
The from
parameter allows you to specify which view to use in a join, while giving the join a unique name. For example:
In this case, only buyer data will be joined in as buyers
, while only seller data will be joined in as sellers
.
Note: the users
view must now be referred to by its aliased names buyers
and sellers
in the join.
Limiting Fields from a Join
The fields
parameter can be used to specify which fields are brought from a join into an Explore. By default, all fields from a view will be brought in when joined. However, sometimes you might want to bring through only a subset of fields.
For example, when orders
is joined to order_items
, we may want to bring only the shipping
and tax
fields through the join, like so:
You can also reference a set of fields, such as [set_a*]
. Each set is defined within a view using the set
parameter. Suppose we have the following set defined in the orders
view:
We can then choose to bring only these three fields through when we join orders
to order_items
, like so:
Symmetric Aggregates
Looker uses a feature called “symmetric aggregates” to calculate aggregations (like sums and averages) correctly, even when joins result in a fanout. Symmetric aggregates are described in more detail in this Community topic, and the fanout problem they solve is explained in this blog post on Aggregate Functions Gone Bad.
Primary Keys Required
In order to have measures (aggregations) come through joins, you must define primary keys in all of the views involved in the join.
You can do this by adding the primary_key
parameter to the primary key field definition in each view:
To correctly handle joined measures, Looker relies on you specifying a primary key where the values are completely unique, non-NULL values. If your data does not contain a primary key, consider whether the concatenation of several fields would result in a primary key of completely unique, non-NULL values. If your primary key is not unique or contains NULL values and your query includes data that reveal those issues, then Looker returns an error as described in this Help Center article.
Supported SQL Dialects
For Looker to support symmetric aggregates in your Looker project, your database dialect must also support them. The following table shows which dialects support symmetric aggregates in Looker 21.0:
If your dialect does not support symmetric aggregates, you will need to be careful when executing joins in Looker, as some types of joins can result in inaccurate aggregations (like sums and averages). This problem, and the workarounds for it, are described in great detail in this blog post on Aggregate Functions Gone Bad.
Apache Druid Does Not Support Joins
The Apache Druid database dialect does not support SQL joins. Using the join
parameter with Apache Druid will result in an error.
Learn More About Joins
This has been a quick tutorial about joining in Looker. To learn more about join parameters in LookML, check out the Join Reference documentation.