Configure a Tenant on ADLS Gen2

You can configure an Incorta Tenant to use Azure Data Lake Storage (ADLS) Gen2 as its file system for storing files, including Parquet and snapshot files. The steps are as follows:

Create a core-site.xml file

Create a file and name it core-site.xml. You will need to add your ADLS Gen2 credentials to the file. The following is the content of the core-site.xml file:

<configuration>
<!-- other configuration -->
<property>
<name>fs.azure.account.auth.type</name>
<value>OAuth</value>
</property>
<property>
<name>fs.azure.account.oauth.provider.type</name>
<value>org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider</value>
</property>
<property>
<name>fs.azure.account.oauth2.client.endpoint</name>
<value>https://login.microsoftonline.com/$$TENANT_ID$$/oauth2/token</value>
</property>
<property>
<name>fs.azure.account.oauth2.client.id</name>
<value>$$CLIENT_ID$$</value>
</property>
<property>
<name>fs.azure.account.oauth2.client.secret</name>
<value>$$CLIENT_SECRET$$</value>
</property>
<!-- other configuration -->
</configuration>
Note

Replace $$TENANT_ID$$, $$CLIENT_ID$$, and $$CLIENT_SECRET$$ with the credentials for your ADLS Gen2 account. You can get these values from the Azure portal.
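
The placeholder replacement described in the note can be scripted. The following is a minimal sketch, assuming you keep the XML above in a template file; the function names render_core_site and escape_sed_replacement are hypothetical and not part of Incorta or Hadoop.

```shell
# Hypothetical helper (not an Incorta or Hadoop tool): escape the characters
# that are special in a sed replacement string (backslash, &, and the | delimiter).
escape_sed_replacement() {
  printf '%s' "$1" | sed -e 's/[\\&|]/\\&/g'
}

# Hypothetical helper: fill the $$...$$ placeholders in a core-site.xml template.
# Arguments: <template> <output> <tenant_id> <client_id> <client_secret>
render_core_site() {
  local template="$1" output="$2"
  local tenant_id client_id client_secret
  tenant_id="$(escape_sed_replacement "$3")"
  client_id="$(escape_sed_replacement "$4")"
  client_secret="$(escape_sed_replacement "$5")"
  # [$][$] matches the two literal dollar signs around each placeholder name.
  sed -e "s|[\$][\$]TENANT_ID[\$][\$]|${tenant_id}|g" \
      -e "s|[\$][\$]CLIENT_ID[\$][\$]|${client_id}|g" \
      -e "s|[\$][\$]CLIENT_SECRET[\$][\$]|${client_secret}|g" \
      "$template" > "$output"
}

# Example (all values are placeholders):
#   render_core_site core-site.template.xml core-site.xml <tenant_id> <client_id> <client_secret>
```

This sketch does not XML-escape the values, so a secret containing characters such as & or < would need additional handling. Values passed as arguments may also be visible in shell history and process listings.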

Encrypt credentials (optional)

Credentials can be stored in plain text or encrypted. To encrypt them, use the Hadoop command line interface (CLI) to create the credentials in a keystore as follows:

cd <INCORTA_INSTALLATION_PATH>/IncortaNode/hadoop/bin
./hadoop credential create fs.azure.account.oauth2.client.id -value <client_id> -provider jceks://file/<KEY_STORE_PATH>.jceks
./hadoop credential create fs.azure.account.oauth2.client.secret -value <client_secret> -provider jceks://file/<KEY_STORE_PATH>.jceks
./hadoop credential create fs.azure.account.oauth2.client.endpoint -value <endpoint> -provider jceks://file/<KEY_STORE_PATH>.jceks
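
After creating the entries, you may want to confirm that the keystore contains all three aliases. The following is a minimal sketch that uses the Hadoop CLI's credential list subcommand; the function names provider_uri and list_adls_credentials are hypothetical, and the sketch assumes the keystore is given as an absolute path ending in .jceks.

```shell
# Hypothetical helper: turn an absolute keystore path into a jceks provider URI,
# for example /opt/keys/adls.jceks -> jceks://file/opt/keys/adls.jceks
provider_uri() {
  local path="$1"
  case "$path" in
    /*) printf 'jceks://file%s\n' "$path" ;;
    *)  echo "keystore path must be absolute: $path" >&2; return 1 ;;
  esac
}

# Hypothetical helper: list the aliases stored in the keystore.
# Arguments: <path to the hadoop executable> <absolute keystore path>
list_adls_credentials() {
  local hadoop_bin="$1" keystore="$2" uri
  uri="$(provider_uri "$keystore")" || return 1
  "$hadoop_bin" credential list -provider "$uri"
}

# Example (paths are placeholders):
#   list_adls_credentials <INCORTA_INSTALLATION_PATH>/IncortaNode/hadoop/bin/hadoop /opt/keys/adls.jceks
```

The listing shows alias names only, not the stored values.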

Then edit the core-site.xml file so that it references the keystore instead of containing the credentials:

<configuration>
<property>
<name>hadoop.security.credential.provider.path</name>
<value>jceks://file/<KEY_STORE_PATH>.jceks</value>
<description>Path to interrogate for protected credentials.</description>
</property>
<property>
<name>fs.azure.account.auth.type</name>
<value>OAuth</value>
</property>
<property>
<name>fs.azure.account.oauth.provider.type</name>
<value>org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider</value>
</property>
</configuration>

Copy the core-site.xml file to your Incorta host

Copy the core-site.xml file to the following locations on the host running your Incorta nodes:

  • <INCORTA_INSTALLATION_PATH>/cmc/lib/core-site.xml
  • <INCORTA_INSTALLATION_PATH>/cmc/tmt/core-site.xml
  • <INCORTA_INSTALLATION_PATH>/IncortaNode/hadoop/etc/hadoop/core-site.xml
  • <INCORTA_INSTALLATION_PATH>/IncortaNode/runtime/lib/core-site.xml
  • <INCORTA_INSTALLATION_PATH>/IncortaNode/runtime/webapps/incorta/WEB-INF/lib/core-site.xml
Note

You will need to restart Spark, the CMC, and the Analytics and Loader services after you copy the core-site.xml file.
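
The copy step above can be scripted so that no location is missed. The following is a minimal sketch; the function name copy_core_site is hypothetical, and the directory list simply mirrors the locations listed above.

```shell
# Hypothetical helper: copy core-site.xml into each location listed in this guide.
# Arguments: <path to core-site.xml> <INCORTA_INSTALLATION_PATH>
copy_core_site() {
  local src="$1" root="$2" dir
  for dir in \
    cmc/lib \
    cmc/tmt \
    IncortaNode/hadoop/etc/hadoop \
    IncortaNode/runtime/lib \
    IncortaNode/runtime/webapps/incorta/WEB-INF/lib
  do
    if [ -d "$root/$dir" ]; then
      # -p preserves the mode and timestamps of the source file.
      cp -p "$src" "$root/$dir/core-site.xml" || return 1
    else
      echo "skipping missing directory: $root/$dir" >&2
    fi
  done
}

# Example (paths are placeholders):
#   copy_core_site ./core-site.xml <INCORTA_INSTALLATION_PATH>
```

Directories that do not exist on a given host are skipped with a warning rather than created, on the assumption that a missing directory means the component is not installed there.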

Declare an environment variable on your Incorta host

On the host running your Incorta nodes, declare the following environment variable in ~/.bash_profile or ~/.bashrc:

export INCORTA_USE_AZURE_APIS=true
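
If you script the host setup, it helps to add the line only when it is not already present, so that repeated runs do not duplicate it. The following is a minimal sketch; the function name ensure_azure_env is hypothetical.

```shell
# Hypothetical helper: append the export line to a profile file only if it is missing.
# Argument: <profile file>, defaults to ~/.bashrc
ensure_azure_env() {
  local profile="${1:-$HOME/.bashrc}"
  local line='export INCORTA_USE_AZURE_APIS=true'
  touch "$profile" || return 1
  # -x matches whole lines only, -F treats the line as a fixed string.
  grep -qxF "$line" "$profile" || printf '%s\n' "$line" >> "$profile"
}

# Example:
#   ensure_azure_env ~/.bash_profile
```

The variable takes effect in new shell sessions, or after you source the profile file in the current one.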

Create an ADLS Gen2 Tenant in the CMC

To create an ADLS Gen2 Tenant in the CMC, follow these steps:

  • Sign in to the CMC.

  • In the Navigation bar, select Clusters.

  • In the cluster list, select a Cluster name.

  • In the canvas tabs, select Tenants.

  • Select +Create Tenant.

    • Enter a Tenant Name, Username, Password, and Email.

    • Enter the Shared Storage Path. For ADLS Gen2, the path will be:

      abfs://<CONTAINER_NAME>@<STORAGE_ACCOUNT_NAME>.dfs.core.windows.net/<DIRECTORY_PATH>
Note

You will need to have read/write permission to the ADLS Gen2 path.
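
A malformed Shared Storage Path is easy to produce by hand. The following is a minimal sketch that assembles the path from its three parts; the function name abfs_path is hypothetical, and the name check reflects Azure's rule that storage account names are 3 to 24 characters long and contain only lowercase letters and digits.

```shell
# Hypothetical helper: build the Shared Storage Path for an ADLS Gen2 Tenant.
# Arguments: <container name> <storage account name> <directory path>
abfs_path() {
  local container="$1" account="$2" dir="${3#/}"   # drop one leading slash, if any
  # Reject empty names and names with characters outside a-z and 0-9.
  case "$account" in
    ""|*[!a-z0-9]*) echo "invalid storage account name: $account" >&2; return 1 ;;
  esac
  if [ "${#account}" -lt 3 ] || [ "${#account}" -gt 24 ]; then
    echo "storage account name must be 3-24 characters: $account" >&2
    return 1
  fi
  printf 'abfs://%s@%s.dfs.core.windows.net/%s\n' "$container" "$account" "$dir"
}

# Example (names are placeholders):
#   abfs_path incorta mystorage tenants/demo
```

The sketch only checks the shape of the path. It does not verify that the container exists or that the credentials have read/write permission on it.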

Whitelist the ADLS Gen2 endpoints

Whitelist the following ADLS Gen2 endpoints to ensure they are accessible:

*.dfs.core.windows.net

*.blob.core.windows.net
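
To check that the endpoints are reachable from the Incorta host, you can probe the two hostnames for your own storage account. The following is a minimal sketch, assuming curl is installed on the host; the function names adls_hosts and check_adls_hosts are hypothetical.

```shell
# Hypothetical helper: print the two endpoint hostnames for one storage account.
# Argument: <storage account name>
adls_hosts() {
  local account="$1"
  printf '%s.dfs.core.windows.net\n%s.blob.core.windows.net\n' "$account" "$account"
}

# Hypothetical helper: try an HTTPS connection to each endpoint.
# Any HTTP response counts as reachable, including an error status,
# because the goal is only to confirm the network path is open.
check_adls_hosts() {
  local host rc=0
  for host in $(adls_hosts "$1"); do
    if curl -sS --max-time 10 -o /dev/null "https://$host/"; then
      echo "reachable: $host"
    else
      echo "NOT reachable: $host" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Example (the account name is a placeholder):
#   check_adls_hosts mystorage
```

A successful probe shows that DNS resolution and the TLS connection work. It does not confirm that the configured credentials are valid.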

Verify the Wildfly JAR version

On certain operating systems, the files wildfly-openssl-1.0.4.Final.jar and wildfly-openssl-1.0.7.Final.jar both exist under the following path: <INCORTA_INSTALLATION_PATH>/IncortaNode/hadoop/share/hadoop/tools/lib/

In this situation, remove wildfly-openssl-1.0.4.Final.jar so that only wildfly-openssl-1.0.7.Final.jar remains. You can back up wildfly-openssl-1.0.4.Final.jar to a different directory if needed.

Note

You will need to restart Spark, the CMC, and the Analytics and Loader services after you remove wildfly-openssl-1.0.4.Final.jar.
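
The check and removal described above can be sketched as a small function that acts only when both JAR versions are present. The function name retire_old_wildfly_jar and the backup directory are hypothetical.

```shell
# Hypothetical helper: move the 1.0.4 JAR out of the lib directory,
# but only when the 1.0.7 JAR is also present.
# Arguments: <lib directory> <backup directory>
retire_old_wildfly_jar() {
  local lib_dir="$1" backup_dir="$2"
  local old="$lib_dir/wildfly-openssl-1.0.4.Final.jar"
  local new="$lib_dir/wildfly-openssl-1.0.7.Final.jar"
  if [ -f "$old" ] && [ -f "$new" ]; then
    mkdir -p "$backup_dir" && mv "$old" "$backup_dir/"
  else
    echo "nothing to do: both JAR versions are not present in $lib_dir" >&2
  fi
}

# Example (paths are placeholders):
#   retire_old_wildfly_jar <INCORTA_INSTALLATION_PATH>/IncortaNode/hadoop/share/hadoop/tools/lib /tmp/wildfly-backup
```

Moving the file instead of deleting it keeps a copy available in case you need to restore it.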