Tools → Data Agent

About a Data Agent

Rather than opening a VPN or SSH tunnel between your external database and your Incorta cluster, you can install and configure a Data Agent service on the database host or on a host in the same subnet as the database host. Typically, the database host resides behind a corporate firewall or on another interdepartmental subnet.

The Data Agent service supports the following data sources:

The data agent service enables the extraction of data from one or more databases behind a firewall to an Incorta cluster. This means that a single data agent can connect to multiple data sources in your organization. Your Incorta cluster can reside on-premises or in the cloud.

The connection between Incorta and a data agent service uses TLS/SSL. Authentication requires a valid CA certificate or self-signed certificate. To learn more about TLS/SSL, please review Security → HTTPS for Apache Tomcat with OpenSSL. The data agent encodes data for transfer using Google’s Protocol Buffers (protobuf) library.

Important

A CMC Administrator must enable and configure an Incorta cluster to support the use of Data Agents. Only a Tenant Administrator (Super User) or a user who belongs to a group with the SuperRole role for a given tenant can create a data agent that connects to a data agent service. For a given data agent, the Tenant Administrator or similar user must generate an authentication file. A system administrator then copies the generated authentication file to the conf directory of the data agent service installation on the remote host. With the authentication file in place, a system administrator starts the data agent service on the remote host. A Tenant Administrator or similar user must then confirm the connected status of the Data Agent service in the Data Manager. Once connected, a user who belongs to a group with the Schema Manager or SuperRole role can create an external data source using the required database connector and the connected data agent.

Requirements for the Data Agent service host

Here are the requirements to run a Data Agent service on a host:

  • Minimum of 16 GB of RAM and 4 CPUs
  • Ability to install Java or OpenJDK
  • The host must not block outgoing connections
  • The Incorta cluster must allow for two additional incoming ports
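
The memory and CPU minimums above can be checked with a short pre-flight script before you install anything. This is a minimal sketch for a Linux host, assuming /proc/meminfo and nproc are available. The function names and the 15 GiB default floor are assumptions for illustration; the floor leaves some slack because the kernel usually reports slightly less than the installed RAM.

```shell
# Pre-flight sketch for a Linux Data Agent host (assumes /proc/meminfo and nproc).

# meets_minimums MEM_KB CPUS [MIN_MEM_KB] [MIN_CPUS]
# Succeeds when both values reach the minimums. The default memory floor is
# 15 GiB expressed in kB (an assumption, see the note above).
meets_minimums() {
  local mem_kb=$1 cpus=$2
  local min_mem_kb=${3:-15728640} min_cpus=${4:-4}
  [ "$mem_kb" -ge "$min_mem_kb" ] && [ "$cpus" -ge "$min_cpus" ]
}

# preflight: read the live values from the host and report the result.
preflight() {
  local mem_kb cpus
  mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  cpus=$(nproc)
  if meets_minimums "$mem_kb" "$cpus"; then
    echo "OK: ${mem_kb} kB RAM, ${cpus} CPUs"
  else
    echo "Below minimum: ${mem_kb} kB RAM, ${cpus} CPUs" >&2
    return 1
  fi
}
```

Run preflight on the candidate host. A non-zero exit status means that the host is below one of the two minimums.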

There are two configurations available for the Data Agent:

You can download the Data Agent from the Cloud Console for an Incorta cluster. For on-premises, you must contact Incorta Support directly for the download file.

Enable the Data Agent for a cloud.incorta.com Incorta cluster

Important: cloud.incorta.com

When you enable the Data Agent from the Cloud Console, cloud.incorta.com automatically enables and configures the server configurations for the cluster in the Cluster Management Console.

  • Sign in to the Cloud Console at http://cloud.incorta.com as the Cloud Administrator.
  • In the Cloud Console, select an Incorta cluster.
  • In Cluster Details, enable the Data Agent.
  • In Data Agent, select Download.

Enable the Data Agent for an on-premises Incorta cluster

A data agent service connects to the Analytics Service and Loader Service through specific ports. The data agent service host must be able to send and receive communications over the specified ports.

Warning

After you enable the Data Agent feature, you must restart the Incorta cluster. Changes to individual Agent Ports require restarting the related Analytics or Loader Services.

  • Sign in as the CMC Administrator.
  • In the Clusters Manager, select the cluster.
  • In the Cluster Manager, select Cluster Configurations.
  • In Server Configurations, in the left panel, select Data Agent.
  • Toggle on the Enable Data Agent property.
  • Specify the Data Agent properties:
    • Analytics Data Agent Port
    • Loader Data Agent Port
    • Analytics Public Hosts and Ports
    • Loader Public Hosts and Ports
  • Select Save.

Note

Depending on your requirements, the HOST_IP or HOST_DNS can be a PUBLIC_IP, PUBLIC_DNS, PRIVATE_IP, or PRIVATE_DNS.

Property | Description
Analytics Data Agent Port | The Analytics Service listens to a data agent service on this local port.
Loader Data Agent Port | The Loader Service listens to a data agent service on this local port.
Analytics Public Hosts and Ports | The HOST IP or HOST DNS and the port. The data agent service connects to the Analytics Service using this HOST:PORT. The connection is forwarded to the specified Analytics Data Agent Port.
Loader Public Hosts and Ports | The HOST IP or HOST DNS and the port. The data agent service connects to the Loader Service using this HOST:PORT. The connection is forwarded to the specified Loader Data Agent Port.

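
After you save the properties above, you can check from the data agent host that each public HOST:PORT value accepts a TCP connection. The following is a minimal sketch, assuming bash and the coreutils timeout command are available on the host. The function names and the example endpoint incorta.example.com:9091 are hypothetical; substitute the values you entered in the CMC.

```shell
# split_host / split_port: break a HOST:PORT value into its two parts.
split_host() { printf '%s\n' "${1%:*}"; }
split_port() { printf '%s\n' "${1##*:}"; }

# port_open HOST PORT [TIMEOUT_SECONDS]
# Succeeds if a TCP connection can be opened. Uses bash's /dev/tcp
# pseudo-device and the coreutils timeout command (both assumed present).
port_open() {
  local host=$1 port=$2 wait=${3:-5}
  timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# check_public_endpoint HOST:PORT: report whether the endpoint is reachable.
check_public_endpoint() {
  local host port
  host=$(split_host "$1")
  port=$(split_port "$1")
  if port_open "$host" "$port"; then
    echo "reachable: $host:$port"
  else
    echo "NOT reachable: $host:$port" >&2
    return 1
  fi
}

# Example (hypothetical endpoint):
#   check_public_endpoint incorta.example.com:9091
```

This check only confirms that the port accepts connections. It does not validate the TLS/SSL handshake or the authentication file.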
Contact Incorta Support to download the Data Agent

For an on-premises installation of the Data Agent, you must contact Incorta Support directly for the download binary.

Install Java or the OpenJDK

Before installing the data agent service on a host, you must first install Java. The supported versions of Java are:

  • Oracle Java 8
  • OpenJDK 8
  • OpenJDK 11

You can download OpenJDK 11 from https://jdk.java.net/archive/

The host environment must have a JAVA_HOME system environment variable set to the Java installation directory. The PATH environment variable must include:

  • $JAVA_HOME/bin for Linux
  • %JAVA_HOME%\bin for Windows
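
On a Linux host, the two variables can be set as in the following sketch. The directory /opt/jdk-11 is a hypothetical location; replace it with the directory where you installed Java. The helper function java_home_on_path is also an assumption added for illustration.

```shell
# Hypothetical Java location; replace with the actual directory on your host.
export JAVA_HOME=/opt/jdk-11
export PATH="$JAVA_HOME/bin:$PATH"

# java_home_on_path: succeeds when $JAVA_HOME/bin is an entry in PATH.
java_home_on_path() {
  case ":$PATH:" in
    *":$JAVA_HOME/bin:"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

To keep the settings across sessions, add the two export lines to the shell profile of the user that runs the data agent service, for example ~/.bashrc. You can then run java -version to confirm that the expected Java version is found.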

Install the data agent service on a Windows host

Here are the steps to install the data agent service for a Windows host:

  • Copy the incorta.dataagent-X.Y.Z.zip download file to the Windows host.
  • Unzip the incorta.dataagent-X.Y.Z.zip file to any local directory on the Windows host.

Install the data agent service on a Linux host

Here are the steps to install the data agent service for a Linux host:

  • Secure copy the download file to the Linux host. Here is an example:
HOST_IP=192.168.128.100
HOST_KEY_FILE=private.pem
HOST_USER=incorta
DATA_AGENT_FILE=incorta.dataagent-1.1.0.zip
cd ~/Downloads
scp -i ~/.ssh/${HOST_KEY_FILE} ${DATA_AGENT_FILE} ${HOST_USER}@${HOST_IP}:/tmp
  • Secure shell into the Linux host.
ssh -i ~/.ssh/${HOST_KEY_FILE} ${HOST_USER}@${HOST_IP}
  • Unzip the ZIP file.
DATA_AGENT_FILE=incorta.dataagent-1.1.0.zip
cd /tmp
unzip ${DATA_AGENT_FILE}
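
After unzipping, you may want to confirm that the installation directory contains the items that the later sections of this guide refer to: agent.sh, options.properties, and the conf directory. The following sketch assumes that all three sit at the top level of the unzipped directory. The function name verify_install is hypothetical.

```shell
# verify_install DIR
# Check the unzipped directory for the items used later in this guide.
# Assumes they all sit at the top level of DIR.
verify_install() {
  local dir=$1 missing=0
  [ -f "$dir/agent.sh" ] || { echo "missing: $dir/agent.sh" >&2; missing=1; }
  [ -f "$dir/options.properties" ] || { echo "missing: $dir/options.properties" >&2; missing=1; }
  [ -d "$dir/conf" ] || { echo "missing: $dir/conf" >&2; missing=1; }
  return "$missing"
}

# Example:
#   verify_install /path/to/unzipped/directory
```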

Configure data agent service properties

The default memory size for the data agent is 2G. You can increase this to a higher amount, such as 4G of memory. Here are the steps:

  • Secure shell into the Linux host.
HOST_IP=192.168.128.100
HOST_KEY_FILE=private.pem
HOST_USER=incorta
ssh -i ~/.ssh/${HOST_KEY_FILE} ${HOST_USER}@${HOST_IP}
  • Using vim or a similar editor, edit the options.properties file.
DATA_AGENT_PATH=/tmp/incorta.dataagent/
cd $DATA_AGENT_PATH
vim options.properties
  • Modify the memorySize property (press i to enter Insert mode).
memorySize=4G
  • Save your changes to the file (press Esc to return to Normal mode, then type :wq! and press Enter to save and quit).
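
If you prefer not to edit the file interactively, the same change can be made with sed. This is a minimal sketch, assuming that options.properties uses the key=value form shown above. The function name set_memory_size is hypothetical.

```shell
# set_memory_size FILE SIZE
# Set the memorySize property in a key=value properties file.
set_memory_size() {
  local file=$1 size=$2
  if grep -q '^memorySize=' "$file"; then
    # Replace the existing value; sed keeps the original as FILE.bak.
    sed -i.bak "s/^memorySize=.*/memorySize=${size}/" "$file"
  else
    # The property is not present yet, so append it.
    echo "memorySize=${size}" >> "$file"
  fi
}

# Example, run from the data agent installation directory:
#   set_memory_size options.properties 4G
```

The -i.bak form keeps a backup copy next to the edited file, which makes it easy to revert the change.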

Create a data agent in the Data Manager

You create a data agent in the Data Manager to authenticate and monitor a remote data agent service. Only a Tenant Administrator (Super User) or a user who belongs to a group with the SuperRole role for a given tenant can create a data agent instance that connects to a data agent service. A user who belongs to a group with the Schema Manager role can see a list of data agents in the Data Manager.

When you create a data agent in the Data Manager, you can generate and download an encrypted authentication file. The data agent service on the remote, on-premises host requires the generated .auth file. You must then copy the .auth file to the conf directory of the data agent service installation.

Here are the steps to create a data agent in the Data Manager:

  • Sign in to the tenant as the Tenant Administrator (Super User) or as a user who belongs to a group with the SuperRole role.
  • In the Navigation bar, select Data.
  • In the Action bar, select + New > Add Data Agent.
  • In the Create Data Agent dialog, enter a Data Agent Name and optionally enter a description.
  • In Generate Authentication File, select Generate Now.

Note

You can regenerate an authentication file from the Data Manager for a given data agent. The file contains all the information needed by the data agent service to connect to the Incorta cluster. The file includes information regarding the various hosts, ports, and TLS/SSL certificates.

Copy the authentication file to the remote host

  • Secure copy or upload the .auth file to the remote Linux or Windows host.
  • Move the .auth file to the conf directory of the data agent service installation.
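
For a Linux host, the two steps above can be sketched as follows. The file name my_agent.auth and the function name install_auth_file are hypothetical. Restricting the file permissions with chmod 600 is an optional precaution added here, not a documented requirement.

```shell
# From your workstation (hypothetical file name, key file, and host values):
#   scp -i ~/.ssh/private.pem my_agent.auth incorta@192.168.128.100:/tmp

# install_auth_file AUTH_FILE INSTALL_DIR
# Move the .auth file into the conf directory of the installation.
install_auth_file() {
  local auth_file=$1 install_dir=$2
  [ -f "$auth_file" ] || { echo "no such file: $auth_file" >&2; return 1; }
  [ -d "$install_dir/conf" ] || { echo "no conf directory in: $install_dir" >&2; return 1; }
  mv "$auth_file" "$install_dir/conf/" &&
    # Optional precaution: make the file readable by its owner only.
    chmod 600 "$install_dir/conf/$(basename "$auth_file")"
}

# Example, on the remote host:
#   install_auth_file /tmp/my_agent.auth /path/to/data/agent/installation
```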

Start the data agent service for a Windows host

It is recommended that you use a service helper utility to monitor the data agent’s state and automatically restart it if it goes down. Here are the steps:

  • Install a service helper utility if you do not already have one.
  • Create a new service for the data agent and provide the full path to the agent.bat file.
  • Start the service.
  • Log out of the host and log back in to test that the service is still running.
  • View the events for the service in the Event Viewer.

Start the data agent service for a Linux host

  • Secure shell into the remote host.
  • Navigate to the installation directory of the data agent service.
  • Run ./agent.sh start

Confirm data agent service connection in the Data Manager

  • Sign in to the tenant as the Tenant Administrator (Super User) or as a user who belongs to a group with the SuperRole role.
  • In the Navigation bar, select Data.
  • In the Action bar, select the Data Agents tab.
  • Verify the status of the data agent as connected.

Create or edit external data source using the data agent

You can now create a new external data source or edit an existing one using the data agent. To learn more about how to create and edit an external data source, please review Tools → Data Manager.

  • In the Create or Edit Data Source dialog, enable the Use Data Agent toggle.
  • For the Data Agent property, in the drop-down list, select the data agent.
  • Specify a connection string that the data agent service host can reach. Use the private IP or private DNS of the database, such as 127.0.0.1 when the database runs on the data agent host, or 192.168.128.100 (replace as required) when the database is on the same subnet as the data agent host.
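
As an illustration of the last step, the following sketch builds a JDBC-style connection string from a host that is resolved from the data agent host, not from the Incorta cluster. MySQL, port 3306, and the database name sales are assumptions used purely as an example; use the connection string format that your connector requires.

```shell
# jdbc_mysql_url HOST PORT DATABASE
# Print a MySQL-style JDBC URL (illustrative connector and format).
jdbc_mysql_url() {
  printf 'jdbc:mysql://%s:%s/%s\n' "$1" "$2" "$3"
}

# Database on the data agent host itself:
#   jdbc_mysql_url 127.0.0.1 3306 sales
# Database on the same subnet as the data agent host:
#   jdbc_mysql_url 192.168.128.100 3306 sales
```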