Notebook for Business Users

Introduction

Notebooks provide a user-friendly interface for exploring and querying data against physical and business schemas. In previous releases, Notebooks were limited to schema managers, who used them when creating materialized views.

In recent releases (2024.1.0 for Cloud and 2024.7.x for On-Premises), business users can access an enhanced version of the Notebook Editor from within the Content Manager or the Business Schema Designer to query verified business views they have access to.

Important

Before 2024.7.x, you only need the Analyze User role to access the Notebook for Business Users.

Starting 2024.7.x, only users with the Advanced Analyze User or the SuperRole roles can access the Notebook for Business Users. Thus, after upgrading to 2024.7.x, you must assign the Advanced Analyze User role to Analyze users who created notebooks in earlier releases.

Note: Notebook for Business Users is an Incorta Labs feature

An Incorta Labs feature is experimental and functionality may produce unexpected results. For this reason, an Incorta Labs feature is not ready for use in a production environment. Incorta Support will investigate issues with an Incorta Labs feature. In a future release, an Incorta Labs feature may be either promoted to a product feature ready for use in a production environment or deprecated without notice.


Prerequisites for Notebook for Business Users

Prepare a Spark instance

Releases before 2024.7.x require a dedicated SparkX instance with Spark 3.3.0 or later for the Advanced SQLi and, consequently, for the Notebook for Business Users.

Starting 2024.7.x, all Incorta applications, including the Advanced SQLi and Notebook for Business Users, will use one unified Spark instance to handle different requests. The supported version for the unified Spark instance is 3.4.1 or later.

Supported Python Versions

Spark 3.4.1 requires a Python version later than 3.7.
Running PySpark paragraphs in Notebooks gets stuck or throws errors if the Python Path in the CMC > Tenant Configurations > Advanced is set to Python 2.
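Before setting the Python Path in the CMC, it can help to confirm which version the target interpreter reports. The following is a minimal sketch, not an Incorta tool: the function name is_supported_python is hypothetical, and it assumes that "later than 3.7" means Python 3.8 or newer.

```python
import sys


def is_supported_python(version_info=None):
    """Return True if the (major, minor) version is later than 3.7.

    Assumption: "later than 3.7" is read as Python 3.8 or newer.
    """
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    return (major, minor) > (3, 7)


if __name__ == "__main__":
    status = "OK" if is_supported_python() else "too old for Spark 3.4.1"
    print(f"Python {sys.version_info[0]}.{sys.version_info[1]}: {status}")
```

Run the script with the same interpreter that the Python Path setting points to, so that the reported version matches what PySpark paragraphs will use.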

Enable and configure Advanced SQLi

The configurations of the Advanced SQLi vary according to the installation type: Cloud or On-Premises.

For common configurations of the Advanced SQLi, refer to References → Advanced SQL Interface. However, additional configurations are required for the Notebook for Business Users on On-Premises clusters. For details, see Enable and configure the Notebook for Business Users on On-Premises installations.

Create a Personal Access Token (PAT)

To run queries in the Notebook for Business Users in releases before 2024.7.x, you must create a personal access token (PAT) and set it in the editor to allow communication between the Notebook for Business Users and the Advanced SQLi.
To create a PAT, you must have access to the Public API. You can then create PATs from your Profile Manager.

For more details, refer to References → Public API v2.

Enable Incorta Premium

Starting 2024.7.2, Notebook for Business Users is a Premium feature that requires an Incorta Premium cluster to run. You must enable Incorta Premium before you can access existing notebooks or create new ones. The steps to enable Incorta Premium vary according to the installation type: Cloud or On-Premises. For more details, refer to Incorta Premium.


Enable and configure the Notebook for Business Users

On Cloud installations

The Notebook for Business Users feature is available for Incorta Cloud clusters starting 2024.1.0.

  • To enable it for your cluster, contact Incorta Support.
  • You must also turn on the Notebook Integration feature from the CMC > Tenant Configurations > Incorta Labs or the Cloud Admin Portal > Cluster Advanced Configurations > Default Tenant Configurations > Incorta Labs.
  • Turn on the Enable Advanced SQL Interface toggle in the Cloud Admin Portal > Cluster Configurations.

On On-Premises installations

The Notebook for Business Users feature is available for On-Premises clusters starting 2024.7.x.

To configure it on On-Premises clusters, follow these steps:

  1. Configure the Advanced SQLi.
  2. Enable the integration with the Apache Zeppelin notebook.
  3. Enable the Notebook for Business Users.

Additional Advanced SQLi configurations

While setting up the Advanced SQLi on On-Premises clusters, you grant the Spark Metastore admin the permissions required to manage tables in all or specific databases in the Metastore MySQL server. However, you must grant the admin additional permission to create other users with the grant option.

Notes

When running commands to grant the Spark Metastore admin the required permissions, replace the following:

  • SPARK_Metastore_Admin_USER_NAME: The actual username of the Spark Metastore admin
  • localhost: The hostname or IP address of the MySQL server in case you are granting permissions remotely

Important

If the Spark Metastore admin lacks permission to create other users, the system throws an error when you configure the Spark Metastore, whether during cluster creation or later in the CMC. You can still complete the cluster creation or update successfully; however, the Notebook for Business Users will not function properly.

Configure clusters with new Advanced SQLi installations

If you are configuring the Spark Metastore for the first time, run the following command:

GRANT CREATE, DROP, INDEX, REFERENCES, INSERT, DELETE, UPDATE, SELECT, ALTER, LOCK TABLES, EXECUTE, CREATE USER ON *.* TO 'SPARK_Metastore_Admin_USER_NAME'@'localhost' WITH GRANT OPTION;
FLUSH PRIVILEGES;
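
The placeholder substitution described in the notes above can be scripted so that the username and host are filled in consistently. The following is a rough sketch under stated assumptions: render_grant and its character check are hypothetical helpers, not part of Incorta or MySQL, and the script only prints the statements for you to review and run in your MySQL client.

```python
import re

# Privileges copied from the first-time GRANT command in this document.
FULL_PRIVILEGES = (
    "CREATE, DROP, INDEX, REFERENCES, INSERT, DELETE, UPDATE, SELECT, "
    "ALTER, LOCK TABLES, EXECUTE, CREATE USER"
)

# Hypothetical safety check: allow only simple account and host names so the
# values cannot break out of the quoted identifiers.
_SAFE = re.compile(r"[A-Za-z0-9_.%-]+")


def render_grant(user, host="localhost", privileges=FULL_PRIVILEGES):
    """Return the GRANT and FLUSH statements for the Spark Metastore admin."""
    for value in (user, host):
        if not _SAFE.fullmatch(value):
            raise ValueError(f"unsupported characters in {value!r}")
    return (
        f"GRANT {privileges} ON *.* TO '{user}'@'{host}' WITH GRANT OPTION;\n"
        "FLUSH PRIVILEGES;"
    )


if __name__ == "__main__":
    # "metastore_admin" and "10.0.0.5" are made-up example values.
    print(render_grant("metastore_admin", "10.0.0.5"))
```

For a cluster that already has the Advanced SQLi configured, the same helper can be called with privileges="CREATE USER" to produce the shorter statement.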

Configure clusters with existing Advanced SQLi installations

If you already have Advanced SQLi configured, you can grant the Spark Metastore admin the required permission before or after the upgrade.

Before the upgrade

Run the following command:

GRANT CREATE USER ON *.* TO 'SPARK_Metastore_Admin_USER_NAME'@'localhost' WITH GRANT OPTION;
FLUSH PRIVILEGES;

After the upgrade

  1. Run the following command:

    GRANT CREATE USER ON *.* TO 'SPARK_Metastore_Admin_USER_NAME'@'localhost' WITH GRANT OPTION;
    FLUSH PRIVILEGES;
  2. Run the following command via the Tenant Management Tool (TMT):

    ./tmt.sh --force --cluster-name <CLUSTER_NAME> --create-metastore-read-only-user

    Replace <CLUSTER_NAME> with your cluster’s name.

  3. Restart all services.

Post upgrade considerations

After upgrading an On-Premises cluster, you must manually update the following tenant configurations in the Incorta Metadata database so that they point to the Primary Analytics node used by the Advanced SQLi. Use the update-property TMT command to do so:

  • inx.sql.incorta.host
  • inx.incorta.auth.service.host

Note

During fresh Advanced SQLi installations, these configurations are set when defining a new Kyuubi service, as the information is provided as input.

Run the following:

  • ./tmt.sh --force --cluster-name "<ClusterName>" --update-property system inx.sql.incorta.host <incorta_analytics_node_hostname>

  • ./tmt.sh --force --cluster-name "<ClusterName>" --update-property system inx.incorta.auth.service.host <incorta_analytics_node_hostname>

Note

For the previous commands, replace the following:

  • <ClusterName>: Your cluster’s name
  • <incorta_analytics_node_hostname>: The hostname or IP address of the machine where the Primary Analytics node is installed.

After running these commands, restart all services.
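
Because the two update-property commands differ only in the property name, they can be generated from one template to avoid typos. The following is a minimal sketch, assuming the flags exactly as shown in the commands above; build_update_commands is a hypothetical helper, and the script prints the commands rather than running them.

```python
import shlex

# Tenant configuration keys listed in this section.
HOST_PROPERTIES = ("inx.sql.incorta.host", "inx.incorta.auth.service.host")


def build_update_commands(cluster_name, analytics_host):
    """Return one tmt.sh argument list per property to update."""
    return [
        ["./tmt.sh", "--force", "--cluster-name", cluster_name,
         "--update-property", "system", prop, analytics_host]
        for prop in HOST_PROPERTIES
    ]


if __name__ == "__main__":
    # "myCluster" and "analytics01.example.com" are made-up example values.
    for args in build_update_commands("myCluster", "analytics01.example.com"):
        # shlex.join quotes each argument so the line is safe to paste in a shell.
        print(shlex.join(args))
```

Keeping each command as an argument list means a cluster name that contains spaces is quoted correctly when printed. The same lists could be passed to subprocess.run from the directory that contains tmt.sh, if you prefer to run them from Python.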

Enable the integration with the Apache Zeppelin notebook

  1. Sign in to the Cluster Management Console (CMC).
  2. Select a cluster, and then select the Tenants tab. For the tenant you want, select Configure.
  3. In the Tenant Configurations, select Incorta Labs, and then turn on the Notebook Integration option.
  4. Start the Notebook service. Go to Nodes > Notebook service > More Options > Start.

You can also enable it in the Default Tenant Configurations.

For more details about enabling the integration with Notebook, refer to Tools → Notebook Editor.

Enable the Notebook for Business Users

To enable the Notebook for Business Users on On-Premises clusters:

  1. As a user with root access to the Incorta machine, navigate to <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/.
  2. Edit the node.properties file, add the following line, and then save the file: notebook.for.analyzer.enabled=true
  3. Restart the related Analytics services.
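
If you manage several nodes, the edit in step 2 can be scripted so that the property is added once and never duplicated. The following is a rough sketch, assuming node.properties uses plain key=value lines; enable_notebook_flag and update_file are hypothetical helpers, not Incorta utilities.

```python
from pathlib import Path

KEY = "notebook.for.analyzer.enabled"


def enable_notebook_flag(text):
    """Return properties text with the notebook flag set to true exactly once."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith(("#", "!"))
        # Drop any existing assignment of the key so it is not duplicated.
        if not is_comment and stripped.split("=", 1)[0].strip() == KEY:
            continue
        kept.append(line)
    kept.append(f"{KEY}=true")
    return "\n".join(kept) + "\n"


def update_file(path):
    """Rewrite the properties file in place, keeping a .bak copy."""
    path = Path(path)
    original = path.read_text()
    path.with_name(path.name + ".bak").write_text(original)
    path.write_text(enable_notebook_flag(original))
```

Call update_file with the full path to node.properties under the IncortaNode directory, and then restart the Analytics services as described in step 3.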

Access the Notebook for Business Users

You can access the Notebook for Business Users from the Content Manager or the Business Schema Designer.

Create Notebooks using the Content Manager

  • In the navigation pane, select Content.
  • In the Content Manager, on the Action bar, select + New > Add Notebook.
  • In the Notebook Editor, in the Data panel, select Manage Dataset, and then select the verified business views you want to query.
  • For releases before 2024.7.x, run the command: set_pat("<PAT>"), replacing <PAT> with the PAT you have created.
  • Start using the Notebook Editor.
  • Select Save.

Create Notebooks using the Business Schema

  • In the navigation pane, select Business Schema, and then open a business schema.
  • In the Business Schema Designer, on the Action bar, select Explore Data > via Notebook.
  • In the Notebook Editor, in the Data panel, select Manage Dataset, and then select the verified business views you want to query.
  • For releases before 2024.7.x, run the command: set_pat("<PAT>"), replacing <PAT> with the PAT you have created.
  • Start using the Notebook Editor.
  • Select Save.

Additional Information

Supported languages

The Notebook for Business Users supports the following languages:

  • Spark SQL
  • PySpark
  • Scala
  • SparkR

Tips and Tricks

  • You can use the More Options (vertical ellipsis icon) beside a view name in the Data panel to create a Notebook statement.
  • You can drag and drop a view or a column name into a Notebook paragraph.
  • If you have the Incorta Copilot enabled, you can type in natural language commands to have the Copilot write the queries for you.

Known limitations

  • You cannot create or import Notebooks in the Content Manager’s folders.
  • Sharing Notebooks is currently not available.
  • A cluster allows only 10 concurrent connections for creating Notebooks. Each open Notebook tab counts as a separate session, so a user with two open tabs uses two connections.