This page discusses how to make an Apache Presto connection, along with potential issues and workarounds.
Configuring a Connection
In the Admin section of Looker, navigate to the Connections page and click New Connection.
Fill out the page as follows:
- Name: The name of the connection. This is how the connection will be referred to in LookML.
- Dialect: Select PrestoDB.
- Host:Port: The database hostname and port. The default port is 8080.
- Database: The “catalog” or “connector,” in Presto terms. This is most likely
- Username: The username of the user that will run queries. This information isn’t sent to the Presto server.
- Password: This field is optional. This information isn’t sent to the Presto server.
- Schema: The default schema to use when there is no schema specified.
- Persistent Derived Tables: Check this to enable PDTs.
- Temp Database: The schema where PDTs are written. (Version 3.50 added PDT support to Presto. See below for more information about how to configure Presto for PDT support.)
- Additional Params: Leave this blank unless you need to customize the JDBC URL.
Click Test These Settings to verify a connection. Looker will run a SELECT 1 query to verify a basic connection and perform a query test. It will not validate that the catalog and schema combination exists or that the user has access to it.
Click Update Connection to save these settings.
For more information about connection settings, see the Connecting Looker to Your Database documentation page.
Configuring Apache Presto for PDTs
This section explains the necessary configuration settings for a scratch database.
Currently only Hive connectors are supported for PDTs. You may want to set up a separate Hive catalog properties file for the PDT scratch schema, or modify the existing Hive catalog properties file.
There are a few configuration properties and values that the Hive catalog properties file should contain.
The following is required because Presto caches the Hive metastore results, and Looker needs to be able to see the tables right away:
hive.metastore-cache-ttl = 0s
These two properties are required because Looker needs to be able to drop and rename PDTs:
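As a rough check, the sketch below scans a catalog properties file for the cache setting shown earlier plus the two drop and rename settings. It assumes those two settings are the Presto Hive connector's hive.allow-drop-table and hive.allow-rename-table; the function name check_hive_props and the example path are hypothetical, so adjust both the names and the expected values to match your Presto version.

```shell
# check_hive_props FILE
# Prints "ok: ..." or "missing: ..." for each expected property and
# returns non-zero if any are missing.
# Assumption: the drop/rename properties are hive.allow-drop-table and
# hive.allow-rename-table, as named in the Presto Hive connector docs.
check_hive_props() {
  props_file="$1"
  rc=0
  for expected in \
    "hive.metastore-cache-ttl=0s" \
    "hive.allow-drop-table=true" \
    "hive.allow-rename-table=true"
  do
    # Strip whitespace so "key = value" and "key=value" compare equal.
    if sed 's/[[:space:]]//g' "$props_file" | grep -qxF "$expected"; then
      echo "ok: $expected"
    else
      echo "missing: $expected"
      rc=1
    fi
  done
  return $rc
}
```

For example, check_hive_props /etc/presto/catalog/hive.properties (a hypothetical path) would list which of the three settings are present in that file.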
For reference, in our internal Presto testing servers we use the following hive.properties file, which is used for all Hive schemas:
For more information about configuring your Hive connector, see the Presto documentation.
If using EMR with a Hive 1.0.0 version, there is a bug in the permissions of /user/hive/warehouse that prevents ALTER TABLE...RENAME TO from working.
To fix it, change the ownership of that database’s directory to hive:hadoop, with something like:
hadoop dfs -chown hive:hadoop -R /user/hive/warehouse/scratch_db.db
See the Presto on EMR documentation for more detail.
- Try connecting your web browser to host:8080 on the Presto server to verify that SELECT 1 was sent to the server successfully.
If you aren’t able to bring up a browser on a machine to test the connection, try using curl, with a command something like this:
curl -H "Content-Type: text/plain" \
  -H "X-Presto-Catalog: hive" \
  -H "X-Presto-Schema: default" \
  -X POST \
  -d "SELECT 1" \
  http://<host>:8080/v1/statement
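A POST to /v1/statement only submits the query. In Presto's client protocol, the JSON response carries a nextUri field that the client is expected to fetch repeatedly until no nextUri is returned. The sketch below follows that chain with curl and sed. The variable PRESTO_URL, the function names, and the X-Presto-User value looker are all hypothetical; some Presto versions may reject requests that omit X-Presto-User, which is why it is included here.

```shell
# Extract the "nextUri" value from a /v1/statement JSON response on stdin.
# This uses sed rather than a real JSON parser, so treat it as a rough sketch.
next_uri() {
  sed -n 's/.*"nextUri"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Submit SELECT 1 and print every response until the server stops
# returning a nextUri.
# PRESTO_URL is a hypothetical variable, e.g. http://presto.example.com:8080
poll_select_1() {
  response=$(curl -s -X POST \
    -H "X-Presto-User: looker" \
    -H "X-Presto-Catalog: hive" \
    -H "X-Presto-Schema: default" \
    -d "SELECT 1" "$PRESTO_URL/v1/statement")
  printf '%s\n' "$response"
  uri=$(printf '%s\n' "$response" | next_uri)
  while [ -n "$uri" ]; do
    response=$(curl -s "$uri")
    printf '%s\n' "$response"
    uri=$(printf '%s\n' "$response" | next_uri)
  done
}
```

With PRESTO_URL set, running poll_select_1 prints each JSON response in turn. The query result, if any, appears in a data field of one of them.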
It may be necessary to increase the number of file handles for the user running Looker to 64k, especially if there is very low latency between Looker and the Presto server.
- This is generally set in a file under /etc/security/limits.d.
- You can also check the limit currently in effect for that user; lsof -p <pid_of_looker> | wc -l is a quick way to check how many file handles Looker has open at any given time.
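The file handle checks above can be sketched as two small shell helpers. This assumes a Linux host with lsof installed. The function names are hypothetical, as are the user name looker and the limits entry shown in the comments.

```shell
# Hypothetical /etc/security/limits.d entry raising the limit to 64k
# for a user named "looker" (limits.conf syntax: domain type item value):
#   looker  soft  nofile  65536
#   looker  hard  nofile  65536

# Print the open-file limit in effect for the current shell's user.
# Run this as the user that runs Looker.
fd_limit() {
  ulimit -n
}

# Print roughly how many file handles a process has open.
# lsof prints a header line, so the count is one higher than the true number.
open_fd_count() {
  lsof -p "$1" | wc -l
}
```

Comparing open_fd_count <pid_of_looker> against fd_limit gives a quick sense of how close Looker is to the limit.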
Looker’s ability to provide some features depends on whether the database dialect can support them.
In the current Looker release, PrestoDB supports the following Looker features:
After you have connected your database to Looker, you’re ready to configure sign-in options for your users.