References → Notebook for Business Users
Introduction
Notebooks provide a user-friendly interface for exploring and querying data in physical and business schemas. In previous releases, only schema managers could use Notebooks, and only when creating materialized views.
In recent releases (2024.1.0 for Cloud and 2024.7.x for On-Premises), business users can access an enhanced version of the Notebook Editor from within the Content Manager or the Business Schema Designer to query verified business views they have access to.
Before 2024.7.x, only the Analyze User role was needed to access the Notebook for Business Users.
Starting 2024.7.x, only users with the Advanced Analyze User or the SuperRole roles can access the Notebook for Business Users. Thus, after upgrading to 2024.7.x, you must assign the Advanced Analyze User role to Analyze users who created notebooks in earlier releases.
An Incorta Labs feature is experimental and functionality may produce unexpected results. For this reason, an Incorta Labs feature is not ready for use in a production environment. Incorta Support will investigate issues with an Incorta Labs feature. In a future release, an Incorta Labs feature may be either promoted to a product feature ready for use in a production environment or deprecated without notice.
Prerequisites for Notebook for Business Users
- A supported Spark version that varies according to the Incorta release
- The Advanced SQL Interface enabled and configured
- Notebook integration enabled
- A Personal Access Token (PAT) to run the queries (before 2024.7.x)
- Incorta Premium enabled (required starting 2024.7.2 and 2024.1.7)
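The release-dependent prerequisites and roles described so far can be summarized in a small lookup. The following is a minimal sketch, not an Incorta API: the names Requirements, parse_release, and requirements_for are hypothetical, release strings are assumed to look like 2024.7.2, and Premium is assumed to apply on the 2024.1.x line only from 2024.1.7 onward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    needs_pat: bool       # a Personal Access Token must be set in the editor
    needs_premium: bool   # Incorta Premium must be enabled
    roles: tuple          # roles this page names as sufficient for access


def parse_release(release: str) -> tuple:
    """Turn '2024.7.2' into (2024, 7, 2); missing parts default to 0."""
    parts = [int(p) for p in release.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def requirements_for(release: str) -> Requirements:
    """Hypothetical helper that encodes the rules stated on this page."""
    version = parse_release(release)
    is_7x_or_later = version >= (2024, 7, 0)
    # Assumption: on the 2024.1.x line, Premium applies from 2024.1.7 onward.
    on_1x_premium = version[:2] == (2024, 1) and version[2] >= 7
    needs_premium = version >= (2024, 7, 2) or on_1x_premium
    if is_7x_or_later:
        roles = ("Advanced Analyze User", "SuperRole")
    else:
        roles = ("Analyze User",)
    return Requirements(
        needs_pat=not is_7x_or_later,
        needs_premium=needs_premium,
        roles=roles,
    )
```

For example, requirements_for("2024.7.2") reports that no PAT is needed, that Incorta Premium is required, and that access needs the Advanced Analyze User or SuperRole role.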
Prepare a Spark instance
Releases before 2024.7.x require a dedicated SparkX instance with Spark 3.3.0 or later for the Advanced SQLi and, consequently, for the Notebook for Business Users.
Starting 2024.7.x, all Incorta applications, including the Advanced SQLi and the Notebook for Business Users, use one unified Spark instance to handle different requests. The supported version for the unified Spark instance is 3.4.1 or later.
Spark 3.4.1 requires a Python version later than 3.7.
PySpark paragraphs in Notebooks hang or throw errors if the Python Path in the CMC > Tenant Configurations > Advanced points to Python 2.
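To confirm which interpreter a given Python Path resolves to, you can run a short check with that interpreter. This is a minimal sketch; the function name python_is_supported is hypothetical, and "later than 3.7" is interpreted here as Python 3.8 or later, which is an assumption.

```python
import sys


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is later than Python 3.7.

    Assumption: "later than 3.7" is read as 3.8 or newer.
    """
    return tuple(version_info[:2]) > (3, 7)


if __name__ == "__main__":
    # Print the running interpreter's version and whether it passes the check.
    print("Python", sys.version.split()[0], "supported:", python_is_supported())
```

Run the script with the same executable that the Python Path setting points to, so the result reflects the interpreter that PySpark paragraphs will use.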
Enable and configure Advanced SQLi
The configurations of the Advanced SQLi vary according to the installation type: Cloud or On-Premises.
For common configurations of the Advanced SQLi, refer to References → Advanced SQL Interface. However, additional configurations are required for the Notebook for Business Users on On-Premises clusters. For details, see Enable and configure the Notebook for Business Users on On-Premises installations.
Create a Personal Access Token (PAT)
To run the queries via the Notebook for Business Users in releases before 2024.7.x, you must create a personal access token (PAT) and set it in the editor to allow communication between the Notebook for Business Users and Advanced SQLi.
To be able to create a PAT, you must be granted access to the Public API. You can then use your Profile Manager to create PATs.
For more details, refer to References → Public API v2.
Enable Incorta Premium
Starting 2024.7.2 and 2024.1.7, Notebook for Business Users is a Premium feature that requires an Incorta Premium cluster to run. You must enable Incorta Premium before you can access existing notebooks or create new ones. Steps to enable Incorta Premium vary according to the Incorta installation type: Cloud or On-Premises. For more details, refer to Incorta Premium.
Enable and configure the Notebook for Business Users
On Cloud installations
The Notebook for Business Users feature is available for Incorta Cloud clusters starting 2024.1.0.
- To enable it for your cluster, contact Incorta Support.
- You must also turn on the Notebook Integration feature from the CMC > Tenant Configurations > Incorta Labs or the Cloud Admin Portal > Cluster Advanced Configurations > Default Tenant Configurations > Incorta Labs.
- Turn on the Enable Advanced SQL Interface toggle in the Cloud Admin Portal > Cluster Configurations.
On On-Premises installations
The Notebook for Business Users feature is available for On-Premises clusters starting 2024.7.x.
To configure it on On-Premises clusters, follow these steps:
- Configure the Advanced SQLi.
- Enable the integration with the Apache Zeppelin notebook.
- Enable the Notebook for Business Users.
Additional Advanced SQLi configurations
While setting up the Advanced SQLi on On-Premises clusters, you grant the Spark Metastore admin the permissions required to manage tables in all or specific databases in the Metastore MySQL server. However, you must grant the admin additional permission to create other users with the grant option.
When running commands to grant the Spark Metastore admin the required permissions, replace the following:
- SPARK_Metastore_Admin_USER_NAME: The actual username of the Spark Metastore admin
- localhost: The hostname or IP address of the MySQL server in case you are granting permissions remotely
If the Spark Metastore admin does not have permission to create other users, the system throws an error when you configure the Spark Metastore, whether during cluster creation or afterward in the CMC. You can still continue the cluster creation or update, and it will succeed; however, the Notebook for Business Users will not function properly.
Configure clusters with new Advanced SQLi installations
If this is the first time you are configuring the Spark Metastore, run the following commands:
GRANT CREATE, DROP, INDEX, REFERENCES, INSERT, DELETE, UPDATE, SELECT, ALTER, LOCK TABLES, EXECUTE, CREATE USER ON *.* TO 'SPARK_Metastore_Admin_USER_NAME'@'localhost' WITH GRANT OPTION;
FLUSH PRIVILEGES;
Configure clusters with existing Advanced SQLi installations
If you already have Advanced SQLi configured, you can grant the Spark Metastore admin the required permission before or after the upgrade.
Before the upgrade
Run the following commands:
GRANT CREATE USER ON *.* TO 'SPARK_Metastore_Admin_USER_NAME'@'localhost' WITH GRANT OPTION;
FLUSH PRIVILEGES;
After the upgrade
- Run the following commands:
GRANT CREATE USER ON *.* TO 'SPARK_Metastore_Admin_USER_NAME'@'localhost' WITH GRANT OPTION;
FLUSH PRIVILEGES;
- Run the following command via the Tenant Management Tool (TMT):
./tmt.sh --force --cluster-name <CLUSTER_NAME> --create-metastore-read-only-user
Replace <CLUSTER_NAME> with your cluster’s name.
- Restart all services.
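The GRANT statements for new and existing Advanced SQLi installations differ only in the privilege list, so the placeholder substitution can be sketched as a small helper that renders the statements for review before you run them in MySQL. The function name metastore_grant_sql and the example username are hypothetical; the statements themselves are the ones listed above.

```python
FULL_PRIVILEGES = (
    "CREATE, DROP, INDEX, REFERENCES, INSERT, DELETE, UPDATE, SELECT, "
    "ALTER, LOCK TABLES, EXECUTE, CREATE USER"
)


def metastore_grant_sql(user: str, host: str = "localhost",
                        first_time: bool = False) -> list:
    """Render the GRANT and FLUSH statements with the placeholders filled in.

    first_time=True renders the full privilege list for a new Spark Metastore;
    otherwise only CREATE USER is granted, as for existing installations.
    """
    for value in (user, host):
        # Refuse characters that would break out of the quoted account name.
        if "'" in value or "\\" in value:
            raise ValueError(f"unsupported character in {value!r}")
    privileges = FULL_PRIVILEGES if first_time else "CREATE USER"
    return [
        f"GRANT {privileges} ON *.* TO '{user}'@'{host}' WITH GRANT OPTION;",
        "FLUSH PRIVILEGES;",
    ]


# Hypothetical username; prints the statements for an existing installation.
for statement in metastore_grant_sql("spark_admin"):
    print(statement)
```

The helper only builds text. Running the statements still requires a MySQL account that is allowed to grant these privileges.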
Post upgrade considerations
After upgrading an On-Premises cluster, you must manually update the following tenant configurations in the Incorta Metadata database so that they point to the Primary Analytics node for the Advanced SQLi. Use the update-property TMT command to set them:
- inx.sql.incorta.host
- inx.incorta.auth.service.host
During fresh Advanced SQLi installations, these configurations are set when defining a new Kyuubi service, as the information is provided as input.
Run the following:
./tmt.sh --force --cluster-name "<ClusterName>" --update-property system inx.sql.incorta.host <incorta_analytics_node_hostname>
./tmt.sh --force --cluster-name "<ClusterName>" --update-property system inx.incorta.auth.service.host <incorta_analytics_node_hostname>
For the previous commands, replace the following:
- <ClusterName>: Your cluster’s name
- <incorta_analytics_node_hostname>: The hostname or IP address of the machine where the Primary Analytics node is installed
After running these commands, restart all services.
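Because the two update-property commands differ only in the property name, they can be generated from one list. The following is a minimal sketch that builds the argument lists without executing anything; the function name update_property_commands, the example cluster name, and the example hostname are hypothetical, while the flags are the ones shown in the commands above.

```python
import shlex

HOST_PROPERTIES = ("inx.sql.incorta.host", "inx.incorta.auth.service.host")


def update_property_commands(cluster_name: str, analytics_host: str) -> list:
    """Build one tmt.sh argument list per property that must be updated."""
    return [
        ["./tmt.sh", "--force", "--cluster-name", cluster_name,
         "--update-property", "system", prop, analytics_host]
        for prop in HOST_PROPERTIES
    ]


# Hypothetical values; shlex.join quotes arguments that contain spaces.
for argv in update_property_commands("exampleCluster", "analytics.example.internal"):
    print(shlex.join(argv))
```

Keeping each command as an argument list avoids quoting mistakes when a cluster name contains spaces. The printed lines can be pasted into a shell opened in the directory that contains tmt.sh.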
Enable the integration with the Apache Zeppelin notebook
- Sign in to the Cluster Management Console (CMC).
- Select a cluster, and then select the Tenants tab. For the tenant you want, select Configure.
- In the Tenant Configurations, select Incorta Labs, and then turn on the Notebook Integration option.
- Start the Notebook service. Go to Nodes > Notebook service > More Options > Start.
You can also enable it in the Default Tenant Configurations.
For more details about enabling the integration with Notebook, refer to Tools → Notebook Editor.
Enable the Notebook for Business Users
To enable the Notebook for Business Users on On-Premises clusters:
- Using a user with root access to the Incorta machine, navigate to <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/.
- Edit the node.properties file, add the following line, and then save the file:
notebook.for.analyzer.enabled=true
- Restart the related Analytics services.
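If you script the node.properties change, it helps to make the edit idempotent so that repeated runs do not add duplicate lines. The following is a minimal sketch that works on the file content as text; the function name enable_notebook_flag is hypothetical, and reading or writing the actual file is left to the caller.

```python
FLAG_KEY = "notebook.for.analyzer.enabled"


def enable_notebook_flag(properties_text: str) -> str:
    """Return the properties text with the flag set to true exactly once."""
    updated = []
    found = False
    for line in properties_text.splitlines():
        stripped = line.strip()
        key = stripped.split("=", 1)[0].strip()
        if not stripped.startswith("#") and key == FLAG_KEY:
            # Keep a single entry for the flag and force its value to true.
            if not found:
                updated.append(f"{FLAG_KEY}=true")
                found = True
        else:
            # Every other line, including comments, is preserved unchanged.
            updated.append(line)
    if not found:
        updated.append(f"{FLAG_KEY}=true")
    return "\n".join(updated) + "\n"
```

Back up node.properties before overwriting it, and restart the related Analytics services afterward as described in the steps above.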
Access the Notebook for Business Users
You can access the Notebook for Business Users from the Content Manager or the Business Schema Designer.
Create Notebooks using the Content Manager
- In the navigation pane, select Content.
- In the Content Manager, on the Action bar, select + New > Add Notebook.
- In the Notebook Editor, in the Data panel, select Manage Dataset, and then select the verified business views you want to query.
- For releases before 2024.7.x, run the command set_pat("<PAT>"), replacing <PAT> with the PAT you have created.
- Start using the Notebook Editor.
- Select Save.
Create Notebooks using the Business Schema
- In the navigation pane, select Business Schema, and then open a business schema.
- In the Business Schema Designer, on the Action bar, select Explore Data > via Notebook.
- In the Notebook Editor, in the Data panel, select Manage Dataset, and then select the verified business views you want to query.
- For releases before 2024.7.x, run the command set_pat("<PAT>"), replacing <PAT> with the PAT you have created.
- Start using the Notebook Editor.
- Select Save.
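Once the dataset is selected, a typical first step is to preview a view. The following is a minimal sketch of a helper that builds a Spark SQL preview statement. The function name preview_query, the schema and view names, and the schema.view naming convention are all assumptions made for illustration, not confirmed Notebook behavior.

```python
import re

# Conservative identifier pattern: letters, digits, and underscores only.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def preview_query(schema: str, view: str, limit: int = 10) -> str:
    """Build a Spark SQL statement that previews the first rows of a view."""
    for name in (schema, view):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Backticks are Spark SQL's identifier quoting.
    return f"SELECT * FROM `{schema}`.`{view}` LIMIT {int(limit)}"


# Hypothetical business schema and view names.
print(preview_query("Sales", "Orders"))
# In a PySpark paragraph, and assuming the notebook provides a `spark` session,
# the statement could be run with: spark.sql(preview_query("Sales", "Orders")).show()
```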
Additional Information
Supported languages
The Notebook for Business Users supports the following languages:
- Spark SQL
- PySpark
- Scala
- SparkR
Tips and Tricks
- You can use the More Options (vertical ellipsis icon) beside a view name in the Data panel to create a Notebook statement.
- You can drag and drop a view or a column name into a Notebook paragraph.
- If you have the Incorta Copilot enabled, you can type in natural language commands to have the Copilot write the queries for you.
Known limitations
- You cannot create or import Notebooks in the Content Manager’s folders.
- Sharing Notebooks is currently not available.
- Only 10 concurrent Notebook sessions are allowed per cluster. Each open Notebook tab counts as a separate session, so a user with two open Notebook tabs uses two sessions.
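The session limit is easiest to reason about as simple arithmetic over open tabs. The following is a minimal sketch of that count, assuming each open Notebook tab is one session; the function names and the example users are hypothetical, and this is not how Incorta itself tracks sessions.

```python
MAX_CONCURRENT_SESSIONS = 10  # per-cluster limit stated in the limitation above


def sessions_in_use(open_tabs_by_user: dict) -> int:
    """Every open Notebook tab counts as its own session."""
    return sum(open_tabs_by_user.values())


def can_open_another(open_tabs_by_user: dict,
                     limit: int = MAX_CONCURRENT_SESSIONS) -> bool:
    """Return True while the cluster is still below the session limit."""
    return sessions_in_use(open_tabs_by_user) < limit


# Hypothetical users: two tabs for one user and one tab for another.
tabs = {"analyst_a": 2, "analyst_b": 1}
print(sessions_in_use(tabs), can_open_another(tabs))
```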