Cloudera Impala

Looker connects to any database through a JDBC connection. For Impala, the default target is the server running the impalad daemon, on port 21050.

Configuring Looker to Connect to Cloudera Impala

In the Admin section of Looker, navigate to the Connections page and click New Connection.

The Looker connection configuration depends on the security that is being used with Impala:

Connecting to a Cluster without Kerberos or User Authentication

To configure an Impala connection that isn’t using Kerberos or user authentication:

  1. On the Connection Settings page, leave the Username and Password fields blank. (The * next to the field names implies that these fields are required, but they are not.)
  2. In the Additional Params field, enter ;auth=noSasl.
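As a rough illustration of what this setting produces, the sketch below assembles a hive2-style JDBC URL with ;auth=noSasl appended. The helper name impala_jdbc_url, the host impala.company.com, and the default database are assumptions for illustration only. The exact string Looker builds may differ, so treat the log as the source of truth.

```shell
# Hypothetical helper (not part of Looker): build a hive2-style JDBC URL
# from host, port, database, and the Additional Params string.
impala_jdbc_url() {
  url_host="$1"
  url_port="${2:-21050}"     # impalad's default JDBC port
  url_db="${3:-default}"
  url_params="${4:-}"        # for example ';auth=noSasl'
  printf 'jdbc:hive2://%s:%s/%s%s\n' "$url_host" "$url_port" "$url_db" "$url_params"
}

# The host name here is an assumption.
impala_jdbc_url impala.company.com 21050 default ';auth=noSasl'
```

A URL built this way can also be passed to beeline -u to test the cluster outside Looker.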

Verifying the Connection String

To verify the JDBC connection string in the log files, in the Looker Admin panel, click Log in the left menu. Then filter the log on a term such as jdbc or noSasl.

For more information about configuring Cloudera Impala to work with JDBC, see this Cloudera documentation.

Connecting to a Cluster Requiring LDAP Authentication

For a cluster that requires LDAP authentication, including a cluster with Apache Sentry and Kerberos, on the Connection Settings page, enter a Username and Password with access to the schemas Looker will access.

Connecting to a Cluster Secured with Kerberos, but Not Using Apache Sentry

The Looker analyst team may need to assist in configuring this correctly.

Usually, Kerberos authentication with Cloudera environments is handled through Apache Sentry. See this Cloudera documentation for more detail.

If you want to configure Looker to connect directly to Cloudera Impala using Kerberos authentication, follow the steps below.

Setting Up the Kerberos Client Configuration

First, make sure that the required software is installed and the required files are present on the Looker machine.

Kerberos Client

Verify that the Kerberos client is installed on the Looker machine by trying to run kinit. If it’s not, install the Kerberos client’s binaries.

For example, on Red Hat/CentOS, this would be:

sudo yum install krb5-workstation krb5-libs krb5-auth-dialog
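The check described above can be scripted. This is a minimal sketch that only tests whether kinit is on the PATH. The helper name have_cmd is an assumption, and the check does not validate the Kerberos configuration itself.

```shell
# Return success if the named command is on the PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd kinit; then
  echo "Kerberos client found at $(command -v kinit)"
else
  echo "kinit not found; install the Kerberos client packages first"
fi
```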

Java 8

Java 8 must be installed on the Looker machine and in the PATH and JAVA_HOME of the Looker user. If necessary, install it locally in the looker directory.

Java Cryptography Extension

  1. Download and install the Java Cryptography Extension (JCE) for Java 8 from this page.

    • Locate the jre/lib/security directory for the Java installation.
    • Remove the following JAR files from this directory: local_policy.jar and US_export_policy.jar.
    • Replace these two files with the JAR files included in the JCE Unlimited Strength Jurisdiction Policy Files download.

    It may be possible to use versions of Java prior to Java 8 with the JCE installed, but this is not recommended.

  2. Update JAVA_HOME and PATH in ~looker/.bash_profile to point to the correct installation of Java, then run source ~/.bash_profile or log out and back in.

  3. Verify the Java version with java -version.

  4. Verify the JAVA_HOME environment variable with echo $JAVA_HOME.
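Steps 2 through 4 might look like the following sketch. The path /path/to/java8 is a placeholder, not a real location; substitute the directory of the actual Java 8 installation.

```shell
# Lines like these could go in ~looker/.bash_profile.
# /path/to/java8 is a placeholder for the real Java 8 install directory.
JAVA_HOME="/path/to/java8"
export JAVA_HOME
PATH="$JAVA_HOME/bin:$PATH"
export PATH

# Confirm that the settings took effect.
echo "JAVA_HOME is $JAVA_HOME"
if command -v java >/dev/null 2>&1; then
  java -version
else
  echo "java not found on PATH; check the JAVA_HOME path above"
fi
```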

gss-jaas.conf

Create a gss-jaas.conf file in the looker directory with these contents:

com.sun.security.jgss.initiate {
    com.sun.security.auth.module.Krb5LoginModule required
    useTicketCache=true
    doNotPrompt=true;
};

If necessary for testing, debug=true can be added to this file like this:

com.sun.security.jgss.initiate {
    com.sun.security.auth.module.Krb5LoginModule required
    useTicketCache=true
    doNotPrompt=true
    debug=true;
};

krb5.conf

The server running Looker should also have a valid krb5.conf file. By default, this file is in /etc/krb5.conf. If it is in another location, that must be indicated in the environment (KRB5_CONFIG in the shell environment).

You may need to copy this from another Kerberos client machine.

lookerstart.cfg

Point to the gss-jaas.conf and krb5.conf files by making a file in the looker directory (the same directory that contains the looker startup script) called lookerstart.cfg that contains the following lines:

JAVAARGS="-Djava.security.auth.login.config=/path/to/gss-jaas.conf -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.krb5.conf=/etc/krb5.conf"
LOOKERARGS=""

If the krb5.conf file is not at /etc/krb5.conf, set this option to the file's actual location instead:

-Djava.security.krb5.conf=/path/to/krb5.conf

For debugging, add these variables:

-Dsun.security.jgss.debug=true -Dsun.security.krb5.debug=true

Then restart Looker with ./looker restart.
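For reference, here is a sketch of a lookerstart.cfg with the debugging options included. It assumes the debug flags belong inside JAVAARGS alongside the other -D options, since they are JVM system properties. The gss-jaas.conf path is a placeholder.

```shell
# lookerstart.cfg with the debug options appended (sketch).
# Assumption: the debug flags go inside JAVAARGS with the other -D options.
JAVAARGS="-Djava.security.auth.login.config=/path/to/gss-jaas.conf -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.krb5.conf=/etc/krb5.conf -Dsun.security.jgss.debug=true -Dsun.security.krb5.debug=true"
LOOKERARGS=""
```

Remove the two debug flags again once the connection works, because they make the logs considerably noisier.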

Authenticating with Kerberos

User Authentication

  1. If krb5.conf is not in /etc/, then use the environment variable KRB5_CONFIG to indicate its location.

  2. Run the command klist to make sure there is a valid ticket in the Kerberos ticket cache.

  3. If there is no ticket, run kinit username@REALM or kinit username to create the ticket.

  4. The account used with Looker will likely be headless, so you can get a keytab file from Kerberos to store the credential for long-term use. Use a command like kinit -k -t looker_user.keytab username@REALM to get the Kerberos ticket.
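The steps above can be sketched as one shell function. The name ensure_ticket, the keytab path, and the principal are assumptions for illustration. The function relies on klist -s, which runs silently and exits with status 0 only when the cache holds a valid ticket.

```shell
# Hypothetical helper: make sure the ticket cache holds a valid ticket,
# obtaining one from a keytab if it does not.
ensure_ticket() {
  et_keytab="$1"
  et_principal="$2"
  if ! command -v klist >/dev/null 2>&1 || ! command -v kinit >/dev/null 2>&1; then
    echo "Kerberos client tools not found" >&2
    return 127
  fi
  # klist -s produces no output and exits 0 only when a valid ticket is cached.
  if klist -s; then
    echo "valid ticket already in cache"
    return 0
  fi
  kinit -k -t "$et_keytab" "$et_principal"
}

# If krb5.conf is not in /etc/, point the tools at it first (placeholder path):
# export KRB5_CONFIG=/path/to/krb5.conf

# Example call (keytab path and principal are placeholders):
# ensure_ticket /path/to/looker_user.keytab username@REALM
```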

Automatically Renewing the Ticket

Set up a cron job that runs periodically to keep an active ticket in the Kerberos ticket cache. How often it should run depends on the ticket lifetime configured for the cluster. The expiry times reported by klist indicate how soon tickets expire.
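As a sketch, the helper below prints a crontab entry that re-runs kinit from a keytab every few hours. The six-hour interval, the keytab path, and the principal are assumptions; choose an interval shorter than the ticket lifetime that klist reports.

```shell
# Hypothetical helper: print a crontab line that renews the ticket
# every N hours, on the hour, from the given keytab.
renewal_cron_line() {
  printf '0 */%s * * * kinit -k -t %s %s\n' "$1" "$2" "$3"
}

# Interval, keytab path, and principal are placeholders.
renewal_cron_line 6 /path/to/looker_user.keytab username@REALM
```

The printed line can be added to the Looker user's crontab with crontab -e. Because cron runs with a minimal environment, it may be necessary to use the full path to kinit.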

Setting Up the Looker Connection

Fill out the Connection Settings page as follows:

It is a best practice for the server name (impala.company.com in this example) to be the canonical name of the server, and for a reverse DNS lookup of its IP address to return that name. However, the server name should be whatever is listed in the Kerberos domain controller:

nslookup servername  # get canonical server name and IP address
nslookup ipaddress   # get the canonical name back

Sometimes the server name is set to be the hostname, and not the fully qualified domain name. In this case it may be necessary to modify the /etc/hosts and /etc/nsswitch.conf files to make sure that reverse lookups resolve as intended.

Test the connection to make sure that it is configured correctly.

Connecting to a Cluster That is Secured with SSL Certificate Authentication

Impala supports SSL network encryption between the JDBC client and Impala server.

The first step is to verify that you can connect to the SSL-secured Impala using a JDBC client other than Looker. The JDBC client beeline, which comes as part of CDH, is the best option.

If you are not using authentication, try something like this:

beeline -u 'jdbc:hive2://impala.company.com:21050/default;ssl=true;sslTrustStore=/etc/trust_store.jks;trustStorePassword=abcxyz'

If you are using user authentication, try something like this:

beeline -u 'jdbc:hive2://impala.company.com:21050/default;ssl=true;sslTrustStore=/etc/trust_store.jks;trustStorePassword=abcxyz' -n user -p passwd

See this link for more detail about how to form the correct JDBC URLs for Impala.

Once you can establish a connection via beeline, use Looker to connect to Impala.

  1. Copy the certificate to a location on your Looker instance and use chmod to set its permissions to something like 400 or 600. (Or have your analyst do this for you if Looker is hosting.)

  2. Create a new connection for Cloudera Impala.

  3. Enter your username and password into the appropriate fields if using user authentication, or leave them blank.

  4. For Additional Params there are two options:

    • Enter the entire string:
      ;ssl=true;sslTrustStore=/etc/trust_store.jks;trustStorePassword=abcxyz
    • Enter the string ;ssl=true, and in the Looker startup script add:
      JAVAARGS="-Djavax.net.ssl.trustStore=/etc/trust_store.jks -Djavax.net.ssl.trustStorePassword=abcxyz"
      Then chmod looker to something like 700 or 500. Then restart Looker with ./looker restart.
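The permission changes in steps 1 and 4 can be tried safely on a stand-in file first. This sketch works in a scratch directory; the file it creates is an empty placeholder, not a real trust store, and the variable names are assumptions.

```shell
# Demonstrate the permission change on a stand-in file in a scratch directory.
# In practice the target would be the trust store copied to the Looker instance.
workdir="$(mktemp -d)"
trust_store="$workdir/trust_store.jks"   # stand-in, not a real trust store
: > "$trust_store"

chmod 400 "$trust_store"                 # owner read-only, no access for others
perms="$(ls -l "$trust_store" | cut -c1-10)"
echo "$perms"                            # prints -r--------

rm -rf "$workdir"                        # clean up the scratch directory
```

The same approach applies to the looker startup script, using 700 or 500 so that it stays executable by its owner.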

Notes

The user connecting to the Impala scratch schema for PDTs must have read/write permissions.

Reference

For more information, see the Configuring Impala to work with JDBC section in the Cloudera documentation.

Feature Support

Looker’s ability to provide some features depends on whether the database dialect can support them.

In the current Looker release, Cloudera Impala supports the following Looker features:

Next Step

After you have connected your database to Looker, you’re ready to configure sign-in options for your users.
