Thursday, October 9, 2014

Running Qlikview with HDFS as Data Source on Kerberos enabled Cloudera Hadoop CDH 4.7


In this post we will install Kerberos on Cloudera CDH 4.7
I am using the Cloudera Quick Start VM running on CentOS 6


Step 1 : 

For installing Kerberos Please check the following post by my friend.It is quite descriptive and easy to understand, so I won't be writing that again.



Step 2 :

On client (Here Windows server R2 , any Windows OS which supports Qlikview can  be used) .
Install QlikView and install MIT Kerberos for Windows from the following page



Copy the Keytab file  to Kerberos bin directory and  the Krb5.conf file to (Program Data folder is hidden)

C:\ProgramData\MIT\Kerberos5

Also , on Windows we have Krb5.ini file which needs to be modified a bit.

The .ini file should look something like this


[libdefaults]

    default_realm = BSIL.COM

    default_tkt_enctypes = rc4-hmac des3-cbc-sha1 des-cbc-crc des-cbc-md5

    default_tgs_enctypes = rc4-hmac des3-cbc-sha1 des-cbc-crc des-cbc-md5

    permitted_enctypes = rc4-hmac des3-cbc-sha1 des-cbc-crc des-cbc-md5



[realms]

    BSIL.COM = {

        kdc = 192.168.0.253

        admin_server = 192.168.0.253

       
    }

[domain_realms]

    .localhost.localdomain = BSIL.COM

    localhost.localdomain = BSIL.COM



Now go to C:\Program Files\MIT\Kerberos\bin
and initialise the configuration using 

C:\Program Files\MIT\Kerberos\bin>kinit.exe

It will ask for password , enter the password.



Now let us check using klist.exe 


So , we can see that ticket has been generated for user administrator@BSIL.COM


Step 3 : Install Cloudera ODBC Connector for Hive from Cloudera Downloads Page.

Install ODBC Driver.

Next, Let Us Create a DSN for the Connection to Hadoop Server

Go to  Control Panel\System and Security\Administrative Tools

Click on Data Sources
System DSN- Let us Modify the Default DSN with following settings.



Now let us check the connectivity 

It Shows Connection Successful.



Now , Let us open Qlikview and try to connect to Hive.



But Now as We see it gives the following error.

SQL##f - SqlState: S1000, ErrorCode: 38, ErrorMsg: [Cloudera][HiveODBC] (38) The gssapi DLL can not be loaded and it is needed for authenticating using Kerberos. Please ensure MIT Kerberos is installed.

Now to resolve this , we will copy  krb5_32.dll from Kerberos bin Directory to C:\Program Files\QlikView.

Let us restart Qlikview to reload the new .dll file.



Now , as we see we are able to connect from Qlikview as well.

Let us try to run a simple Hive Query.

Select id from default.test;





So,now we are able to call Hive running on a Kerberos Cluster  from Qlikview .