Looker supports connections to Amazon Athena, an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Amazon Athena is serverless, so there is no infrastructure to manage. You are charged only for the queries that are run.
This article describes how to connect Looker to an Amazon Athena instance.
Ensure that you have the following:
- A pair of Amazon AWS access keys.
- An S3 bucket. The Amazon AWS access keys must have read-write access to this bucket.
- Knowledge of where your Amazon Athena instance data is located. The region name can be found in the upper right-hand portion of the Amazon Console.
In the Admin section of Looker, navigate to the Connections page and click New Connection. Looker displays this page:
Fill out the connection details:
- Name: Specify the name of the connection. This is how you will refer to the connection in LookML projects.
- Dialect: Select Amazon Athena.
- Host:Port: Specify the name of the host and port. As described in the Athena documentation on the JDBC URL format, the host should be a valid Amazon endpoint (like
athena.eu-west-1.amazonaws.com), and the port should stay at
443. An up-to-date list of endpoints that support Athena can be found here.
- Database: Specify the default database that you would like modeled. Other databases can be accessed, but Looker treats this database as the default database.
- Username: Specify the AWS access key ID.
- Password: Specify the AWS secret access key.
- Additional Params: Configure the bucket that is used for staging using the
staging-bucketis the name of the staging bucket. See the Athena documentation on JDBC Driver Options for more information and a list of all available JDBC driver options.
- SSL: Choose whether or not you want to use SSL encryption to protect data as it passes between Looker and your database.
- Max Connections: By default, this is set to 5 because that is the maximum number of concurrent queries to Amazon Athena per account. See the Athena service limits documentation for more details about the service limits. See this documentation page for more information.
- Connection Pool Timeout: Specify the connection pool timeout. By default, the timeout is set to 120 seconds. See this Looker documentation page for more information.
- Database Time Zone: Specify the time zone used in the database. Leave this field blank if you do not want time zone conversion. See this documentation page for more information.
Do Not Whitelist Looker IPs
Looker connects to Amazon Athena via a public channel. As a result, whitelisting Looker’s IP addresses, as described on the Enabling Secure Database Access documentation page, is unnecessary. If you do whitelist Looker’s IP addresses for Amazon Athena, all other IP addresses are blacklisted, causing Looker’s public connection to Amazon Athena to fail.
- Amazon Athena documentation
- Amazon Web Services Console for Athena (requires an AWS login)
- Amazon Athena SQL and HiveQL reference
log_path JDBC driver options for debugging connections. To use them, add
&log_level=DEBUG&log_path=/tmp/athena_debug.log to the end of the Additional Params field and test the connection again.
If Looker is hosting the instance, then Looker Support or your analyst will need to retrieve this file to continue debugging.
Looker’s ability to provide some features depends on whether the database dialect can support them.
In the current Looker release, Amazon Athena supports the following Looker features:
After completing the database connection, configure authentication options.