Guides → Upgrade from 6.x to 2024.1.3

Context

This content applies to On-Premises installations only.

Prior to upgrade, please review the Release Model. The Release Model describes the release statuses: Preview and Generally Available. Refer to the Release Model to identify the release version and status that is most suitable for upgrading your Sandbox, Developer, User Acceptance Testing (UAT), or Production environment.

Upgrade Considerations

Note

Refer to Guides → Upgrade to the 2024.1.3 On-Premises Release for a list of the 2024.1.3 general upgrade considerations.

  • You must have a valid license key before you start the upgrade process. To obtain your license key, please contact Incorta Support.

  • With the introduction of the new-generation loader, Incorta automatically detects inter-object dependencies within a load plan during the Planning phase of a load job. The Loader Service uses these detected dependencies and the user-defined load order within the schema to create an execution plan for loading objects. However, combining the MVs' user-defined load order with the automatically detected dependencies may produce an execution plan with cyclic dependencies, which causes load jobs to fail. To avoid such failures, it is recommended to delete the MVs' user-defined load order before upgrading to the 2024.1.3 release.


Upgrade from Incorta 6.x to 2024.1.3

This guide details how to upgrade a standalone Incorta cluster. Upgrading your Incorta cluster to Release 2024.1.3 requires the following team resources:

  • a System Administrator with root access to the host or hosts running Incorta Nodes, the host running the Cluster Management Console (CMC), and the host or hosts running Apache Spark
  • a CMC Administrator
  • a Database Administrator
  • a Super User that can access each tenant in the Incorta environment
  • an Incorta Developer to resolve identified issues with formula expressions, schema alias, joins between tables, and dependencies between objects such as dashboards and business schemas

It also requires time, as the following general timelines for the various procedures and processes indicate:

Stage                                    Estimated Time
Prepare for Upgrade Readiness            15 minutes to 3 hours
Achieve Upgrade Readiness                2 hours to 3 days
Stop the Incorta cluster                 5 minutes to 30 minutes
Create backups                           15 minutes to 3 hours
Upgrade Apache Zookeeper                 15 minutes to 1 hour
Upgrade the Incorta cluster              15 minutes to 3 hours
Start the Incorta CMC                    5 minutes to 5 hours
Upgrade the Incorta Metadata database    5 minutes to 3 hours
Start the Incorta cluster                5 minutes to 5 hours
Verify the successful upgrade            15 minutes to 1 day

Important

If you have a cloud file system, run the update core site script after the upgrade and before starting your cluster. Do not paste the core-site.xml file into place manually.


Prepare for Upgrade Readiness

Preparing for Upgrade Readiness requires:

  • running the Spark 3.3 validation tool
  • a System Administrator with root access to the host or hosts running Incorta Nodes as well as the host running the Cluster Management Console (CMC)
  • a CMC Administrator
  • a Database Administrator

The estimated time to complete the following is from 15 minutes to 3 hours:

  • Back up the core-site.xml file if you have a cloud file system
  • Pause all scheduled jobs
  • Export all tenants
  • Add a Create View database grant

Spark 3.3 Upgrade Readiness

Incorta uses Spark 3.3 in the 2024.1.3 release. Before you start upgrading to this release, make sure to run the Spark compatibility check tool spark-upgrade-issues-detector.sh located in the 2024.1.3 extracted folder, and to fulfill the Spark upgrade prerequisites.

To run the tool, use the following command:

sh <EXTRACTION_DIR>/spark-upgrade-issues-detector.sh

The tool will create two files:

  • potential-upgrade-issues.csv, which contains the potential compatibility issues that you have.
  • spark-issues-references.xlsx, which contains examples and references to the compatibility concerns and how you can fix them.

Make sure to handle the compatibility concerns reported in the .csv file before you proceed to upgrade.
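
As a rough way to gauge how much work the .csv report represents, the sketch below counts its data rows. It assumes the file has a single header row and one issue per line, neither of which this guide confirms, and count_reported_issues is a hypothetical helper name rather than part of the tool.

```shell
# Count the data rows in the report written by spark-upgrade-issues-detector.sh.
# Assumption: line 1 is a header and each remaining non-blank line is one issue.
count_reported_issues() {
  local csv_file="$1"
  if [ ! -f "$csv_file" ]; then
    echo "missing: $csv_file" >&2
    return 1
  fi
  # Skip the first line and ignore blank lines; "n + 0" prints 0 when nothing matched.
  awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$csv_file"
}

# Example call (run from the directory that holds the report):
#   count_reported_issues potential-upgrade-issues.csv
```

If fields in the report can contain embedded line breaks, the count is only an approximation, so treat it as a quick indicator and open the file to review the actual entries.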

Spark Upgrade Prerequisites

  • Customers who are using external Spark must upgrade it to 3.3.
  • Customers who are upgrading from previous Incorta releases and are using ADLS storage must create a copy of the IncortaNode/hadoop/etc/hadoop/core-site.xml file under IncortaNode/spark/conf/.
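
For the ADLS prerequisite, the copy could look like the following sketch. The function name copy_core_site_to_spark is hypothetical, and its argument is assumed to be the full path to your IncortaNode directory.

```shell
# Copy core-site.xml from the Hadoop configuration directory to the Spark
# configuration directory inside one IncortaNode directory.
copy_core_site_to_spark() {
  local node_dir="$1"   # assumption: path to the IncortaNode directory
  local src="$node_dir/hadoop/etc/hadoop/core-site.xml"
  local dest_dir="$node_dir/spark/conf"
  if [ ! -f "$src" ]; then
    echo "missing: $src" >&2
    return 1
  fi
  if [ ! -d "$dest_dir" ]; then
    echo "missing: $dest_dir" >&2
    return 1
  fi
  # -p keeps the original timestamps and permissions on the copy.
  cp -p "$src" "$dest_dir/"
}

# Example call:
#   copy_core_site_to_spark /path/to/IncortaNode
```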

Important notes

  • While Spark 3.3 supports Python versions from 3.7 to 3.10, Incorta supports Python 3.8 and 3.9 only.
  • PySpark requires Pandas 1.0.5 or later.
  • PySpark requires PyArrow 0.12.1 or later.

Important

If you have multiple Python binaries on the same machine, set the following attributes:

  • python.path to the Python 3 path in IncortaAnalytics/IncortaNode/node.properties
  • export PYSPARK_PYTHON=/usr/bin/python3 in IncortaNode/notebooks/services/<SERVICE-ID>/conf/zeppelin-env.sh
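
To confirm that a given interpreter falls inside the Python range that Incorta supports (3.8 and 3.9), a small check such as the following may help. The function name is hypothetical, and the check only compares version strings; it does not inspect Pandas or PyArrow.

```shell
# Succeed only for Python 3.8.x or 3.9.x version strings.
is_supported_python_version() {
  case "$1" in
    3.8|3.8.*|3.9|3.9.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example call against the interpreter referenced by PYSPARK_PYTHON:
#   version="$(/usr/bin/python3 -c 'import platform; print(platform.python_version())')"
#   is_supported_python_version "$version" || echo "unsupported Python: $version"
```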

Contact Incorta Support if you require further assistance.

Pause all scheduled jobs in the CMC

Enable the following settings to pause active scheduled schema loads, dashboards, and data alerts. This is helpful when importing or exporting an existing tenant. Here are the steps to enable the required options as default tenant configuration:

  • In the Navigation bar, select Clusters.
  • In the cluster list, select a Cluster name.
  • In the canvas tabs, select Cluster Configurations.
  • In the panel tabs, select Default Tenant Configurations.
  • In the left pane, select Data Management.
  • Enable the following settings:
    • Pause Load Plans
    • Pause Dashboard Scheduler
    • Pause Data Notification
  • Select Save.

Export all tenants with the Tenant Management Tool

A System Administrator with root access to the host running the Cluster Management Console (CMC) is able to run the Tenant Management Tool (TMT). Here are the steps:

  • Secure shell in to the CMC host.
  • As the incorta user, navigate to the installation path of the TMT. The default installation path for the TMT is:
<CMC_INSTALLATION_PATH>/IncortaAnalytics/cmc/tmt
  • Export ALL tenants
./exportAlltenants.sh -c <CLUSTER_NAME> -f False /tmp/<TENANT_EXPORT>.zip

Add a Create View database grant

A Database Administrator with root access to the MySQL or Oracle database server that runs the Incorta Metadata database is able to add the Create View database grant. The estimated time to complete the following is at least 5 minutes.

MySQL

Here are the steps for MySQL:

  • Sign in to the MySQL Incorta metadata database as the root user.
mysql -h0 -uroot -proot_password incorta_metadata
Note

-h = host, where 0 is a shorthand reference for localhost
-u = user, where root is the user
-p = password, where the password is root_password
incorta_metadata is the database

  • Verify the incorta database user for the incorta_metadata database.
SELECT User, Host FROM mysql.user WHERE user = 'incorta';
  • Verify the current grants for all users.
SHOW GRANTS for 'incorta'@'localhost';
SHOW GRANTS for 'incorta'@'127.0.0.1';
SHOW GRANTS for 'incorta'@'192.168.128.101';
  • If needed, add the CREATE VIEW grant to all incorta users.
GRANT CREATE VIEW ON `incorta_metadata`.* TO 'incorta'@'localhost';
GRANT CREATE VIEW ON `incorta_metadata`.* TO 'incorta'@'127.0.0.1';
GRANT CREATE VIEW ON `incorta_metadata`.* TO 'incorta'@'192.168.128.101';
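
When the incorta user exists for several hosts, the GRANT statements above can be generated rather than typed one by one. The following is a minimal sketch; print_create_view_grants is a hypothetical helper, and the host list is whatever the SELECT on mysql.user returned in your environment.

```shell
# Print one GRANT CREATE VIEW statement per host for a database and user.
# Arguments: database name, user name, then one or more hosts.
print_create_view_grants() {
  local database="$1" user="$2"
  shift 2
  for host in "$@"; do
    # The escaped backticks quote the database name, as in the statements above.
    printf "GRANT CREATE VIEW ON \`%s\`.* TO '%s'@'%s';\n" "$database" "$user" "$host"
  done
}

# Example call:
#   print_create_view_grants incorta_metadata incorta localhost 127.0.0.1
```

The function only prints SQL. Review the output first, then paste it into the mysql session or pipe it to the mysql client.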

Oracle

To add grants for a user in an Oracle database, please refer to the Oracle Database SQL Language Reference.


Achieve Upgrade Readiness

Please review Concepts → Upgrade Readiness. Achieving Upgrade Readiness requires:

  • a System Administrator with root access to the host or hosts running Incorta Nodes as well as the host running the Cluster Management Console (CMC)
  • a CMC Administrator
  • a Super User that can access each tenant in the Incorta environment
  • an Incorta Developer to resolve identified issues with formula expressions

The estimated time to complete the following is from 2 hours to 3 days:

  • Resolve alias issues with the Alias Sync Tool
  • Resolve Severity-1 issues that the Inspector Tool identifies

Resolve alias issues with the Alias Sync Tool

Here are the resources required to run the Alias Sync Tool:

  • A System Administrator with root access to the host running an Incorta Node is able to run the Alias Sync Tool.

To resolve issues with Alias tables, you must download the alias_sync.py file, secure copy the file to the IncortaNode/bin directory, and then run the script for each tenant in your cluster.

To learn more, please review Tools → Alias Sync Tool.

Resolve Severity-1 issues that the Inspector Tool identifies

Here are the resources required to run the Inspector Tool:

  • For a given tenant, a CMC Administrator enables the Inspector Tool Scheduler and schedules an Inspector Tool job. A CMC Administrator also downloads the Inspector Tool related schema, business schema, and dashboard files for all tenants.
  • A Super User that can access each tenant in the Incorta cluster.
  • An Incorta Developer to resolve the identified issues in the 1- Validation UseCases dashboard.

For a given tenant, the Inspector Tool checks the lineage references of Incorta metadata objects including tables, schemas, business schemas, business schema views, dashboards, and session variables. It also checks for inconsistencies and validation errors in joins, tables, views, formulas, and dashboards.

Important

Prior to upgrading Incorta, you must enable and configure the Inspector Tool for all tenants. In addition, you must resolve all Severity-1 issues.

To learn more, please review Tools → Inspector Tool.


Stop the Incorta cluster

Here are the resources required to stop all the services in the Incorta cluster:

  • a System Administrator with root access to the host or hosts running Incorta Nodes, the host running the Cluster Management Console (CMC), and the host or hosts running Apache Spark

The estimated time to stop the Incorta cluster and all related services is from 5 minutes to 30 minutes. Here are the steps involved in stopping the Incorta cluster:

Stop the Notebook Add-on Service

Note

Your Incorta cluster may not have enabled and configured the Notebook Add-on Service for a given tenant. You enable the Notebook Add-on as an Incorta Labs feature.

In order to stop the Notebook Add-on Service, you need to know the name of the service. You can read the services.index file to find out the name of the Notebook Add-on Service running on an Incorta Node that is running the Analytics Service.

cat <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/notebooks/services/services.index

Once you know the name of the Notebook Add-on Service, then execute the following:

NOTEBOOK_ADD_ON=<SERVICE_NAME>
<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/stopNotebook.sh ${NOTEBOOK_ADD_ON}

Stop the Analytics Service

In order to stop the Analytics Service, you need to know the name of the service. You can read the services.index file to find out the name of the services running on an Incorta Node.

cat <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/services/services.index

Once you know the name of the Analytics Service, you can then execute the following:

ANALYTICS_SERVICE=<SERVICE_NAME>
<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/stopService.sh ${ANALYTICS_SERVICE}

Stop the Loader Service

In order to stop the Loader Service, you need to know the name of the service. You can read the services.index file to find out the name of the services running on an Incorta Node.

cat <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/services/services.index

Once you know the name of the Loader Service, you can then execute the following:

LOADER_SERVICE=<SERVICE_NAME>
<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/stopService.sh ${LOADER_SERVICE}

Stop Apache Spark

You can stop Apache Spark using the stopSpark.sh shell script:

<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/stopSpark.sh

Stop the CMC

The default directory for the CMC is ~/IncortaAnalytics/cmc. Stop the CMC with the stop-cmc.sh shell script:

<CMC_INSTALLATION_PATH>/cmc/stop-cmc.sh

Stop the Node Agent

For each Incorta Node, run the following:

<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/nodeAgent/agent.sh stop

Stop Apache Zookeeper

To stop Apache Zookeeper, run the following:

<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/stop-zookeeper.sh
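
The stop order described in this section can be collected into a dry-run helper that prints the commands without running them. This is only a sketch: print_stop_sequence is a hypothetical name, the service names are the ones you read from the services.index files, and on a multi-host cluster each command belongs on the host that runs that service.

```shell
# Print (do not run) the stop commands in the order used by this guide.
# Arguments: node installation path, CMC installation path, Notebook Add-on
# service name, Analytics Service name, Loader Service name.
print_stop_sequence() {
  local node_path="$1" cmc_path="$2"
  local notebook="$3" analytics="$4" loader="$5"
  printf '%s\n' \
    "$node_path/IncortaNode/stopNotebook.sh $notebook" \
    "$node_path/IncortaNode/stopService.sh $analytics" \
    "$node_path/IncortaNode/stopService.sh $loader" \
    "$node_path/IncortaNode/stopSpark.sh" \
    "$cmc_path/cmc/stop-cmc.sh" \
    "$node_path/IncortaNode/nodeAgent/agent.sh stop" \
    "$node_path/IncortaNode/stop-zookeeper.sh"
}

# Example call:
#   print_stop_sequence /path/to/node /path/to/cmc_install notebook1 analytics1 loader1
```

Printing the commands first makes it easy to review the order and paths before running each line by hand. If the Notebook Add-on is not enabled, skip the first command.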

Create backups

Here are the resources required to create the various backups:

  • A Database Administrator with root access to the MySQL database server that runs the Incorta Metadata database.
  • A System Administrator with root access to the host or hosts running Incorta Nodes, the Cluster Management Console (CMC), and Apache Spark.

The estimated time to complete the following is from 30 minutes to 3 hours:

  • Create a backup of the Incorta Metadata database
  • Create a backup of the IncortaAnalytics directory
  • Create a backup of the Apache Spark configuration files

Create a backup of the Incorta Metadata database

Here are the resources required to create a backup of the Incorta Metadata database:

  • A Database Administrator with root access to the MySQL database server that runs the Incorta Metadata database.

MySQL

To create a backup of the Incorta metadata database, use the mysqldump command line utility:

mysqldump -u [user] -p [database_name] > [filename].sql
Example

Here is an example with the MySQL user as root with the password incorta_root:

mysqldump -uroot -pincorta_root incorta_metadata > /tmp/incorta_metadata.sql
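
Before relying on the dump, it may be worth checking that it finished. By default, mysqldump ends its output with a comment line that starts with "-- Dump completed", so a quick test could look like the sketch below. The function name is hypothetical, and the check does not apply if comments were disabled, for example with --skip-comments.

```shell
# Succeed when the dump file is non-empty and its last line is the
# "-- Dump completed" comment that mysqldump writes by default.
dump_looks_complete() {
  local dump_file="$1"
  # -s is true only for an existing file with a size greater than zero.
  [ -s "$dump_file" ] || return 1
  tail -n 1 "$dump_file" | grep -q '^-- Dump completed'
}

# Example call:
#   dump_looks_complete /tmp/incorta_metadata.sql && echo "dump finished"
```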

Create a backup of the Incorta installation directory

To create a backup of the Incorta installation directory, use the following command:

zip -r IncortaAnalytics_Backup.zip <INCORTA_NODE_INSTALLATION_PATH>

Create a backup of the Apache Spark configuration files

Create a backup of the following Spark configuration files present in the $SPARK_HOME/conf directory:

  • spark-defaults.conf
  • spark-env.sh
SPARK_HOME=<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/spark
cd $SPARK_HOME/conf
zip -r Spark_Conf_Backup.zip spark-defaults.conf spark-env.sh

Upgrade Apache Zookeeper

This release requires Apache Zookeeper v3.6.1 to support SSL. To enable SSL for Zookeeper, please review Security → Enable Zookeeper SSL.

If you are using an external version of Zookeeper that is not bundled with Incorta, you must upgrade your Zookeeper instance manually with the following steps:

  • Replace the existing zookeeper folder with the one from <INCORTA_INSTALLATION_PATH>/IncortaNode, with the exception of the zookeeper/conf/zoo.cfg file.
  • Add the admin.enableServer=false property to zoo.cfg.
  • Delete any files inside the <INCORTA_INSTALLATION_PATH>/IncortaNode/zookeeper_data folder.
  • Restart Zookeeper.

If you have multiple nodes, repeat the above steps for each Zookeeper node.
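
Adding the admin.enableServer=false property can be scripted so that running it twice does not create duplicate entries. The following is a minimal sketch, with ensure_admin_server_disabled as a hypothetical helper name. It edits the file in place, so keep a copy of zoo.cfg before using it.

```shell
# Make sure zoo.cfg contains admin.enableServer=false exactly as a property line.
ensure_admin_server_disabled() {
  local cfg="$1"
  if [ ! -f "$cfg" ]; then
    echo "missing: $cfg" >&2
    return 1
  fi
  if grep -q '^admin\.enableServer=' "$cfg"; then
    # The property already exists: rewrite its value through a temporary file.
    local tmp
    tmp="$(mktemp)" || return 1
    sed 's/^admin\.enableServer=.*/admin.enableServer=false/' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rm -f "$tmp"
  else
    # Add a line break first if the file does not already end with one.
    if [ -n "$(tail -c 1 "$cfg")" ]; then
      printf '\n' >> "$cfg"
    fi
    printf 'admin.enableServer=false\n' >> "$cfg"
  fi
}

# Example call:
#   ensure_admin_server_disabled /path/to/zookeeper/conf/zoo.cfg
```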

Note

The Zookeeper upgrade to v3.6.1 is backward compatible with all Incorta versions.


Upgrade the Incorta cluster

Here are the resources required to upgrade the Incorta cluster:

  • a System Administrator with root access to the host or hosts running Incorta Nodes, the host running the Cluster Management Console (CMC), and the host or hosts running Apache Spark
  • start the Incorta CMC. Do not start the other Incorta cluster services.

To begin, run the incorta-installer.jar file from the shell:

java -jar incorta-installer.jar -i console

In the Incorta Installer console, enter these values for a standalone (Typical) upgrade:

Welcome : Enter
License Agreement/Copyright : Enter
License Agreement/Copyright : Y
Installation Type : 2- Upgrade
Installation Set : 1- Typical
Choose Installation Folder : Enter- Default
Installation Status : Enter
Start CMC : 3- Finish without starting CMC

Kill unwanted processes

After upgrading, you will want to kill any processes related to Incorta as you will start Incorta manually. To kill any unwanted processes, run the following commands:

sudo kill -9 $(ps -aux | grep '[n]odeAgent.jar' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[d]erby' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[e]xportServer' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[z]ookeeper' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[c]mc' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[s]park' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[h]adoop' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[p]ostgres' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[I]ncortaNode' | awk '{print $2}')
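
When a pattern matches no process, the commands above call kill with no PID and print an error. The sketch below is one way to skip empty matches. Both function names are hypothetical, the patterns are the same ones listed above, and, as with the original commands, a broad pattern such as '[s]park' can match unrelated processes, so review what is running first.

```shell
# Read `ps aux` style output on stdin and print the PID (second column)
# of every line that matches the given pattern.
pids_from_ps_output() {
  awk -v pat="$1" '$0 ~ pat { print $2 }'
}

# Kill leftover processes pattern by pattern, skipping patterns with no match.
# Defined here but not called; invoke it explicitly when you are ready.
kill_leftover_incorta_processes() {
  local pattern pids
  for pattern in '[n]odeAgent.jar' '[d]erby' '[e]xportServer' '[z]ookeeper' \
                 '[c]mc' '[s]park' '[h]adoop' '[p]ostgres' '[I]ncortaNode'; do
    pids="$(ps aux | pids_from_ps_output "$pattern")"
    if [ -n "$pids" ]; then
      # Word splitting on $pids is intentional: one argument per PID.
      sudo kill -9 $pids
    fi
  done
}
```

The bracketed first letter keeps the pattern text itself from matching, which is the same trick the grep commands above rely on.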

Upgrade an external Apache Spark environment

If the Incorta cluster is using an external Apache Spark environment, you must also upgrade the Apache Spark environment by following these steps:

  • Zip the bundled spark directory under IncortaNode:
zip -r Incorta-Bundled-Spark.zip <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/spark
  • Zip the bundled hadoop directory under IncortaNode:
zip -r Incorta-Bundled-Hadoop.zip <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/hadoop
  • Copy Incorta-Bundled-Spark.zip and Incorta-Bundled-Hadoop.zip to the external Apache Spark environment.
  • In the external Apache Spark environment, remove the spark directory.
  • Unzip Incorta-Bundled-Spark.zip to recreate the Spark environment.
  • Unzip Incorta-Bundled-Hadoop.zip to recreate the Hadoop environment.

Review the Upgrade logs

Check to see if there are any critical errors with the upgrade in the following log files and directories:

  • Installer log
cat /tmp/DebuggingLog.log
  • Incorta Node upgrade logs
cd <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/logs/
  • CMC logs
ls -l <CMC>/logs/
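
To scan a log quickly, you could count the lines that look like errors. The sketch below assumes that critical entries contain ERROR, SEVERE, or FATAL, which this guide does not specify, so adjust the pattern to what your logs actually contain. The function name is hypothetical.

```shell
# Print the number of lines in a log file that contain an error keyword.
# Assumption: critical entries are marked ERROR, SEVERE, or FATAL.
count_error_lines() {
  local log_file="$1"
  if [ ! -f "$log_file" ]; then
    echo "missing: $log_file" >&2
    return 1
  fi
  # grep -c prints the count; "|| true" keeps a zero count from looking like a failure.
  grep -c -E 'ERROR|SEVERE|FATAL' "$log_file" || true
}

# Example call:
#   count_error_lines /tmp/DebuggingLog.log
```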

Configure SSO

If you already have an SSO configuration and are upgrading from Incorta version 6.0 or above, your SSO configurations will carry over in the CMC during the upgrade automatically.

Run the update core site script

This step is applicable only if you have a cloud file system.

  • Place the backup core-site.xml under cmc/bin and IncortaNode/bin.
  • Run the following script files using the python command (for example, python3 update_core_site_incortaNode.py):
    • <INCORTA_HOME>/cmc/bin/update_core_site_cmc.py
    • <INCORTA_HOME>/IncortaNode/bin/update_core_site_incortaNode.py

The core-site.xml file is automatically copied to the following locations:

  • Under IncortaNode:

    • <INCORTA_HOME>/IncortaNode/spark/conf/
    • <INCORTA_HOME>/IncortaNode/runtime/lib/
    • <INCORTA_HOME>/IncortaNode/runtime/webapps/incorta/WEB-INF/lib/
    • <INCORTA_HOME>/IncortaNode/sqli/runtime/lib/
  • Under cmc:

    • <INCORTA_HOME>/cmc/lib/
    • <INCORTA_HOME>/cmc/tmt/lib/
    • <INCORTA_HOME>/cmc/inspector/
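
After the scripts finish, you may want to confirm that core-site.xml reached every location listed above. The following sketch prints each listed directory that does not contain the file. missing_core_site_copies is a hypothetical helper, and its argument is assumed to be the directory that contains both cmc and IncortaNode.

```shell
# Print every expected directory under the given home that lacks core-site.xml.
# No output means all listed locations have a copy.
missing_core_site_copies() {
  local incorta_home="$1" dir
  for dir in \
    IncortaNode/spark/conf \
    IncortaNode/runtime/lib \
    IncortaNode/runtime/webapps/incorta/WEB-INF/lib \
    IncortaNode/sqli/runtime/lib \
    cmc/lib \
    cmc/tmt/lib \
    cmc/inspector
  do
    [ -f "$incorta_home/$dir/core-site.xml" ] || printf '%s\n' "$incorta_home/$dir"
  done
}

# Example call:
#   missing_core_site_copies /path/to/IncortaAnalytics
```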

Start the Incorta cluster

Here are the resources required to start all the services in the Incorta cluster:

  • a System Administrator with root access to the host or hosts running Incorta Nodes, the host running the Cluster Management Console (CMC), and the host or hosts running Apache Spark

The estimated time to start the Incorta cluster and all related services is from 5 minutes to 5 hours. Depending on schema data size and various tenant configurations, it may take the Incorta Analytics Service several hours to load schemas into memory.

Here are the steps to start the Incorta cluster:

Start the CMC

The default directory for the CMC is ~/IncortaAnalytics/cmc. Start the CMC with the start-cmc.sh shell script:

<CMC_INSTALLATION_PATH>/cmc/start-cmc.sh

Start Apache Zookeeper

To start Apache Zookeeper, run the following:

<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/start-zookeeper.sh

Start Apache Spark

You can start Apache Spark using the startSpark.sh shell script:

<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/startSpark.sh

Start the Node Agent

For each Incorta Node, run the following to start the node agent:

<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/nodeAgent/agent.sh start

Start the Loader Service

In order to start the Loader Service, you need to know the name of the service. You can read the services.index file to find out the name of the services running on an Incorta Node.

cat <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/services/services.index

Once you know the name of the Loader Service, you can then execute the following:

LOADER_SERVICE=<SERVICE_NAME>
<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/startService.sh ${LOADER_SERVICE}

Start the Analytics Service

In order to start the Analytics Service, you need to know the name of the service. You can read the services.index file to find out the name of the services running on an Incorta Node.

cat <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/services/services.index

Once you know the name of the Analytics Service, you can then execute the following:

ANALYTICS_SERVICE=<SERVICE_NAME>
<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/startService.sh ${ANALYTICS_SERVICE}

Start the Notebook Add-on Service

Note

Your Incorta cluster may not have enabled and configured the Notebook Add-on Service for a given tenant. You enable the Notebook Add-on as an Incorta Labs feature.

In order to start the Notebook Add-on Service, you need to know the name of the service. You can read the services.index file to find out the name of the Notebook Add-on Service running on an Incorta Node that is running the Analytics Service.

cat <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/notebooks/services/services.index

Once you know the name of the Notebook Add-on Service, then execute the following:

NOTEBOOK_ADD_ON=<SERVICE_NAME>
<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/startNotebook.sh ${NOTEBOOK_ADD_ON}

Upgrade the Incorta metadata database

A CMC Administrator is able to upgrade the Incorta metadata database. Depending on the number of tenants and schemas in your Incorta cluster, the process can take between 5 minutes and 3 hours.

To sign in to the Cluster Management Console (CMC), visit your CMC host at one of the following:

  • http://<Public_IP>:6060/cmc
  • http://<Public_DNS>:6060/cmc
  • http://<Private_IP>:6060/cmc
  • http://<Private_DNS>:6060/cmc

The default port for the CMC is 6060. Sign in to the CMC using your CMC administrator username and password.

To upgrade the Cluster Metadata database, follow these steps:

  • In the Navigation bar, select Clusters.
  • For each cluster name in the Cluster list, in the Actions column, select Upgrade Cluster Metadata.
Note

A dialog prompts you to restart the Incorta Services. In the dialog, select OK.



Verify the successful upgrade

Next, verify the successful upgrade. Here are the resources required:

  • a System Administrator with root access to the host or hosts running Incorta Nodes, the host running the Cluster Management Console (CMC), and the host or hosts running Apache Spark
  • a CMC Administrator
  • a Super User that can access each tenant in the Incorta environment
  • an Incorta Developer to resolve identified issues with formula expressions, schema alias, joins between tables, and dependencies between objects such as dashboards and business schemas

Resume all scheduled jobs

A CMC Administrator is able to resume scheduled jobs. Here are the steps to disable the required options as a default tenant configuration:

  • In the Navigation bar, select Clusters.
  • In the cluster list, select a Cluster name.
  • In the canvas tabs, select Cluster Configurations.
  • In the panel tabs, select Default Tenant Configurations.
  • In the left pane, select Data Management.
  • Disable the following settings:
    • Pause Load Plans
    • Pause Dashboard Scheduler
    • Pause Data Notification
  • Select Save.

Review and Monitor scheduled jobs

As the tenant Super User, sign in to each tenant and review the scheduled jobs.