How to connect impala using Beeline (Direct and via Apache Knox)
There are multiple ways you can connect with Impala. In this post, We are going to discuss connecting Impala using Beeline (Direct and via Apache Knox).
Why do we need to connect Impala using beeline
— There are scenarios, Where we need to troubleshoot or test a JDBC connection, every-time we can’t able to write a code to mimic the client connection.
— Also, It comes in handy while replicating the issue in a different environment
We can connect Impala via the hive JDBC driver or using the Impala driver. It is always recommended to use the Impala driver to connect impala
— For that, We need to download IMPALA JDBC DRIVER here (For Quick short cut to download of the driver check the end of this post 🙂)
— Unzip and upload to some temporary location (example: /tmp/ClouderaImpala_JDBC-2.6.23.1028/ImpalaJDBC41.jar)
— Once done, We just have to add the jar to the class path and refer that in the beeline client
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/tmp/ClouderaImpala_JDBC-2.6.23.1028/ImpalaJDBC41.jar
beeline -d "com.cloudera.impala.jdbc41.Driver" -u 'jdbc:impala://<Impala Hostname>:21050/;AuthMech=1;KrbRealm=<Realm>;KrbHostFQDN=<FQDN>;KrbServiceName=impala;SSL=1;
> show tables; +------------+--+ | name | +------------+--+ | customer | | service | | Hadoop | +------------+--+ 3 rows selected (0.109 seconds)
Now, Let’s see, How we can connect Impala via Apache KNOX using the above JDBC driver
— It’s the same as the above, But, We need to do a few changes as below
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/tmp/ClouderaImpala_JDBC-2.6.23.1028/ImpalaJDBC41.jar
beeline -d "com.cloudera.impala.jdbc41.Driver" -u 'jdbc:impala://<Knox hostname>:8443/;transportMode=http;httpPath=gateway/kerberos/impala;AuthMech=1;KrbHostFQDN=<FQDN>;KrbServiceName=HTTP;SSL=1;SSLTrustStore=/tmp/gateway-client-trust.jks;SSLTrustStorePwd=changeit'
> show tables; +------------+--+ | name | +------------+--+ | customer | | service | | Hadoop | +------------+--+ 3 rows selected (0.119 seconds)
KNOX uses “transportMode=http”
httpPath=gateway/kerberos/impala — This policy has to be created in the KNOX config
Good that you made this far (Cheers)
Downloading the jar from the Cloudera portal is time-consuming. As it requires to fill a lot of details before downloading. For quick download
— Find the latest driver
— Right-click the “Get it now” button -> copy the link address

— ssh to linux host -> wget <copied link>
TADA !! You got the driver-jar already 🙂
Good Luck with your Leanring, Use the forum or chat for any questions