Release Notes Incorta 2024.7.x

Release Highlights

2024.7.1

In the 2024.7.1 release, Incorta introduces new features and significant enhancements, including:

  • A General Availability (GA) version of Incorta Data Studio
  • Enhancements to Incorta Copilot
  • New advanced options for data management, allowing data purging, syncing deleted records from data sources, and loading a portion of data to memory
  • Support for load plan distribution on multiple loaders
  • A new capability to restart a load plan from a specified group
  • Support for schema evolution: handling added and deleted columns without performing full loads
  • Extending the support for handling null values to data loading, join calculations, visualizations, and additional built-in functions
  • Multiple enhancements to dashboards and visualizations, including the support for dynamic analytics in Pivot Tables, an improved user experience to drill down into another dashboard tab, default options for the Dynamic Group-By, new date part options, downloading to PowerPoint, and improvements to the Summary and Slicer components
  • Enhancements to the data encryption, including 256-bit encryption keys and the integration with Azure Key Vault (On-Premises only)
  • Improved custom configuration persistence during upgrades
  • A new security role: the Advanced Analyze User
  • A preview feature for Spark SQL Views
  • A new Apache Kafka connector with a simplified implementation and multiple enhancements
  • Enhancements to the Connectors Marketplace, including the support for installing custom connector versions
  • Enhancements to the Data Agent management, allowing Schema Managers to start and stop data agent services from the Data Manager
  • Enhancing system resilience by introducing new options for the Analytics workload management and SQLApp enhancements
  • Enhancements to the external table and schema deletion endpoints
  • Data source lineage
  • Enhancements to multiple connectors, including SFTP, Oracle OTM/GTM, Oracle EPM, and Oracle Cloud Applications (BICC)

2024.7.2

In the 2024.7.2 maintenance pack, Incorta introduces the following:

  • An offering for Incorta Premium to unlock advanced features
  • Enhancements to Incorta Copilot
  • Query functions with limits
  • New Public API endpoints

2024.7.3

In the 2024.7.3 maintenance pack, Incorta introduces the following:

  • Decoupling Null Handling from Advanced SQLi, providing more flexibility
  • Handling nulls during join calculations based on the Null Handling CMC setting
  • Relaxing the required permissions to query Spark SQL views and to access insights based on these views
  • Conversational support in Incorta Copilot
  • Searching data within a prompt in Incorta Copilot
  • New Oracle ERP Manufacturing and Supply Chain Data App

Upgrade Considerations

Important

Upgrade considerations for a specific release also apply to subsequent 2024.7.x releases unless stated otherwise.

2024.7.1 Upgrade Considerations

Considerations for MySQL 8

Before upgrading clusters that use a MySQL 8 metadata database from a 6.0.x or earlier release to the 2024.7.x release, execute the following against the Incorta metadata database:

ALTER TABLE `NOTIFICATION` MODIFY COLUMN `EXPIRATION_DATE` TIMESTAMP NULL DEFAULT NULL;
UPDATE `NOTIFICATION` SET `EXPIRATION_DATE` = NULL WHERE CAST(`EXPIRATION_DATE` AS CHAR(19)) = '0000-00-00 00:00:00';
COMMIT;

Resolved

This issue has been resolved in 2024.7.2.

JDK supported versions

JDK 8u144 is no longer supported, and you must upgrade to JDK 8u171 or later.

Spark 3.4.1

Starting this release, Incorta uses Spark 3.4.1. All Incorta applications, including the Advanced SQL Interface, will use one unified Spark instance (<incorta_home>/IncortaNode/spark/) to handle the different requests. For more details about the difference between the Spark versions, refer to the 3.4.1 Migration Guide.

As a result of using a unified Spark instance, you must consider the following when upgrading On-Premises clusters that have the Advanced SQLi previously enabled and configured:

  • Take into consideration the resources required by the Advanced SQLi when configuring the Spark worker resources in the CMC.
  • After the upgrade, make sure that the CMC is running before you start the services related to Advanced SQLi, such as the Advanced SQL service and Kyuubi, so that the configuration values are updated to refer to the unified Spark instance instead of SparkX.

Deprecating Incorta custom Delta Lake reader

Starting with this release, Incorta has deprecated its custom Delta Lake reader and uses only the Spark Delta Lake reader to read Delta Lake files.

Load jobs and scheduled plans

Before starting the upgrade process, do the following:

  • For each tenant in your cluster, pause the load plan schedulers: In the Cluster Management Console (CMC) > Tenant Configurations > Data Management, turn on the Pause Load Plans toggle.
  • Stop all load jobs, including in-progress and in-queue jobs: In the Schema Manager, select Load Jobs, and then select Stop for each job you want to stop.

Load filters

During the upgrade, existing load filters for a table will be migrated as partial loading filters. After the upgrade, tables with load filters will require a load from staging.

Accounting for nulls in the Loader Service

Starting 2024.7.x, Incorta Loader accounts for null values in the following contexts whether or not you enable Null Handling in the CMC:

  • Deduplication and PK-index calculations:
    The Loader Service will consider null values as undefined values. Thus, null values are no longer considered zeros or empty values. When loading data in full or incremental load jobs, the Loader Service will retrieve or update records with null key values apart from records with zero or empty key values. However, to preserve backward compatibility, Incorta considers a null value equal to another null value when retrieving or updating data. Accordingly, only one record with a null value will be loaded or updated if the Enforce Primary Key Constraint option is enabled.


  • Join calculations and join filters:
    The Loader Service will account for records with null values individually. During join calculations, a null value does not equal another null value, zero, or empty value. Additionally, the Loader Service will account for null values when evaluating the join filter.
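
The deduplication rule in the first bullet above can be illustrated with a minimal Python sketch. This is not Incorta's implementation: the helper name `deduplicate_by_key` and the record layout are hypothetical. It only shows a null key (`None`) being kept apart from zero and empty-string keys, while two null keys still collapse into one record, as happens when Enforce Primary Key Constraint is enabled.

```python
from typing import Any, Dict, Hashable, Iterable, List, Tuple


def deduplicate_by_key(records: Iterable[Dict[str, Any]], key: str) -> List[Dict[str, Any]]:
    """Keep the last record seen for each key value (hypothetical helper)."""
    latest: Dict[Tuple[str, Hashable], Dict[str, Any]] = {}
    for record in records:
        value = record.get(key)
        # Tag the value with its type name so that None, 0, and "" can never
        # collide, while None still equals None.
        latest[(type(value).__name__, value)] = record
    return list(latest.values())


rows = [
    {"id": None, "v": 1},
    {"id": 0, "v": 2},
    {"id": "", "v": 3},
    {"id": None, "v": 4},  # same null key as the first row, so it replaces it
]
deduplicated = deduplicate_by_key(rows, "id")
```

In this sketch, `deduplicated` holds three records: one for the null key (the later one), one for zero, and one for the empty string.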

Note

Starting 2024.7.3, the Loader Service will handle nulls during join calculations based on the Null Handling CMC setting.

Intelligent Ingest

After upgrading to 2024.7.x, previously configured incremental ingest jobs may fail or cause data inconsistencies at the destination. To prevent these issues, perform a full ingest before running the first incremental ingest post-upgrade.

Note: This requirement does not apply to schemas configured for ingestion for the first time after the upgrade as they will automatically undergo a full ingest during the initial ingest job.

Resolved

This issue has been resolved in 2024.7.2-P1.

Notebook for business users

Users who created business Notebooks in previous releases of Incorta will need the newly released Advanced Analyze User role for continued access in 2024.7.x.

Upgrades from 2024.1.7

A new version of the Excel Add-in was introduced in 2024.1.7. This new version is not yet supported in the 2024.7.x releases. Therefore, upgrading from 2024.1.7 to 2024.7.1 or 2024.7.2 is not recommended for Excel Add-in users.


2024.7.2 Upgrade Considerations

After upgrading to 2024.7.2 and with the introduction of Incorta Premium, the following features will be impacted:

  • Notebook for business users: Enable Incorta Premium to be able to access existing business notebooks and create new ones.
  • Spark SQL Views: To create, explore, edit, and visualize Spark SQL Views, enable Incorta Premium and Spark SQL View in Server Configurations > Incorta Labs.
  • Data Studio: To access the Data Studio tab in the Analytics platform, enable Incorta Premium, enable Data Studio for Cloud clusters in the Cloud Admin Portal > Configurations, and enable and configure it in the CMC > Server Configurations > Incorta Data Studio.
  • Copilot: To have access to Incorta Copilot and generative AI capabilities, enable Incorta Premium, enable and configure Copilot for Cloud clusters in the Cloud Admin Portal > Configurations, and enable and configure it in the CMC > Server Configurations > Incorta Copilot.

Semantic Search in Business Schemas

Semantic search within the Business Schemas list is not supported in 2024.7.2, even after enabling Incorta Premium and Copilot.


2024.7.3 Upgrade Considerations

  • Null Handling is now an independent feature, separate from Advanced SQL Interface (Advanced SQLi). You can enable or disable it as needed. After upgrading to 2024.7.3, clusters with Advanced SQLi enabled will automatically have Null Handling enabled as well.
  • During join calculations, the Loader Service will handle null values based on the Null Handling CMC setting.
    • Disabled: The Loader Service treats Null values as zeros for numeric columns, empty strings for text columns, and empty dates for date columns.
    • Enabled: The Loader Service treats Null values as distinct values, not equivalent to zeros, empty strings, or other null values.
  • You must sync the Spark Metastore after upgrading to this release to ensure access to Spark SQL views and insights leveraging them. Go to the CMC > Clusters > <cluster_name> > Tenants > <tenant_name> > More Options, and then select Sync Spark Metastore.
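
The two join behaviors listed above for the Null Handling setting can be sketched as a key comparison. This is an illustrative Python sketch only, not the Loader Service's code. The function `join_keys_match` and its parameters are hypothetical, and only numeric and text keys are modeled (date columns are left out).

```python
from typing import Optional, Union

Key = Optional[Union[int, float, str]]


def join_keys_match(left: Key, right: Key, null_handling: bool, text_column: bool = False) -> bool:
    """Decide whether two join key values match (hypothetical helper)."""
    if null_handling:
        # Enabled: a null is a distinct value that matches nothing,
        # not even another null.
        if left is None or right is None:
            return False
        return left == right
    # Disabled: a null is replaced by the column type's default
    # (zero for numeric columns, empty string for text columns).
    default: Union[int, str] = "" if text_column else 0
    left = default if left is None else left
    right = default if right is None else right
    return left == right
```

For example, `join_keys_match(None, 0, null_handling=False)` returns `True`, while `join_keys_match(None, None, null_handling=True)` returns `False`.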

Data Agent Considerations

The 2024.7.1 release uses the Data Agent version 9.2.0 while 2024.7.2 and 2024.7.3 use 9.2.1. Please upgrade to the required version.

Important

The Data Agent package no longer includes the MySQL driver; you must provide your own driver.

To download and deploy the MySQL jar, do the following before starting the Data Agent service:

  • From the unzipped incorta.dataagent directory, run one of the following scripts depending on the OS of the machine you install the Data Agent on:
    • For Windows, run patch-mysql.bat <unzipped_data_agent_path>.
    • For Linux, run ./patch-mysql.sh <unzipped_data_agent_path>.

These scripts download the MySQL jar file version 5.1.48 from the Maven repository and deploy it mainly to these directories: <unzipped_data_agent_path>/incorta.dataagent/lib and <unzipped_data_agent_path>/incorta.dataagent.controller/lib.

To inspect the script’s output, check <unzipped_data_agent_path>/patch-mysql.log.

Notes

This release introduces the Data Agent Controller to allow managing data agents from within the Incorta Analytics UI. Here are the required steps to start it:

  • On the Data Agent remote host:
    • Copy the .auth file to <unzipped_data_agent_path>/incorta.dataagent.controller/conf/ and <unzipped_data_agent_path>/incorta.dataagent/conf/.
    • Navigate to the <unzipped_data_agent_path>/incorta.dataagent.controller/ directory, and then run one of the following scripts depending on the host machine’s OS:
      • Linux: ./bin/controller.sh start
      • Windows: bin/controller.bat start
  • For On-Premises installations, in the CMC > Server Configurations > Data Agent, enter the ports required for the connection between the Analytics Service and the Data Agent Controller.

2024.7.1 New Features

Dashboards, Visualizations, and Analytics

Data Management Layer

Incorta Copilot

Architecture and Application Layer


Feature Details

Dashboards, Visualizations, and Analytics

Dynamic Pivot Table analysis

In this release, pivot tables gain two powerful enhancements:

  • The ability to define a dynamic group by logic (previously only available in aggregated tables)
  • The ability to define dynamic columns

As a result, users can interactively select and de-select rows and columns to display in their pivot table. As a bonus, these two new features also allow users to set a default view.

Dashboard presentation enhancements

In this release, Incorta has improved the slideshow experience for presenting dashboards within the platform and added an easy option to download dashboards to PowerPoint.

Previously, dashboards could only be presented once added to favorites. Now in the top right corner of any open dashboard is a presentation icon. Once opened, the users can pan through tabs, set the time interval of tab changes, and even quickly move to a tab through a dropdown selection.

To export tabs to a PowerPoint presentation, from any open dashboard, select More Options (⋮) on any tab > Download as. Select the format, then choose which tabs should be downloaded.

Note

For tabs to be eligible for export to PowerPoint, the dashboard page dimensions should be set to the PowerPoint resolution in the freeform layout options.

New Date Part options for dates

Additional date part options such as "Year Quarter" and "Year Month" are available for date columns. This enhancement provides you with increased flexibility in your date-based analysis.

  • Year-Quarter Format: Date parts will be formatted as YYYY followed by the quarter number (e.g., 202301 for the first quarter of 2023).
  • Year-Month Format: Date parts will be formatted as YYYY followed by the month number (e.g., 202404 for April 1, 2024).

As a result, you can easily perform continuous monthly or quarterly charting without using formulas.
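
The two formats above can be reproduced with simple arithmetic. The following Python sketch is only an illustration of the documented format (YYYY followed by a two-digit quarter or month number). The function names are hypothetical, and returning an integer rather than a string is an assumption.

```python
from datetime import date


def year_quarter(d: date) -> int:
    """Return YYYY followed by the two-digit quarter number."""
    quarter = (d.month - 1) // 3 + 1  # months 1-3 -> 1, 4-6 -> 2, and so on
    return d.year * 100 + quarter


def year_month(d: date) -> int:
    """Return YYYY followed by the two-digit month number."""
    return d.year * 100 + d.month


print(year_quarter(date(2023, 2, 10)))  # prints 202301
print(year_month(date(2024, 4, 1)))     # prints 202404
```

Because both values increase monotonically over time, sorting by them keeps months and quarters in chronological order across year boundaries.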

Conditional formatting enhancements

Due to the popularity of conditional formatting, Incorta continues to invest in making virtually every element capable of bearing a color. Additionally, the release introduces a revamped conditional formatting interface for more straightforward rule creation, sorting, and management.

Components that now support conditional formatting options include:

  • Dashboard shapes
  • Measures in Combo and Tornado charts

Negative number format options

Many financial and accounting standards recommend or require using parentheses for negative numbers. This format is visually distinct and avoids confusing the negative sign with hyphens and dashes. With this release, Incorta introduces an additional option to toggle the format for negative values from a negative symbol (e.g., -123) to parentheses (e.g., (123)). The new option is found under the number format options for any measure.

Alternatively, you can set a default format in the business view by selecting any numeric column and setting the negative format to parentheses.
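
The effect of the option can be illustrated with a short Python sketch. The function `format_number` and its parameters are hypothetical and are not part of Incorta. The sketch only shows the two representations of a negative value.

```python
def format_number(value: float, negative_in_parentheses: bool = False, decimals: int = 0) -> str:
    """Format a number, optionally wrapping negatives in parentheses."""
    text = f"{abs(value):,.{decimals}f}"  # magnitude with thousands separators
    if value < 0:
        return f"({text})" if negative_in_parentheses else f"-{text}"
    return text


print(format_number(-123))                                # prints -123
print(format_number(-123, negative_in_parentheses=True))  # prints (123)
```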

New Slicer component formatting options

Starting this release, new enhancements have been added to the slicer insight settings to personalize the component further:

  • When a dropdown is selected, the size of the dropdown can be set to Small, Medium, or Large.
  • Slicer labels and values can change font types, size, and color.
  • In the listing mode, the Slicer can be represented horizontally or vertically.

Direct drill down

Incorta introduces a drill-down configuration in this release that will enable Direct Drill Down. When toggled on, the previous behavior of selecting a drill down from a menu will be unavailable. Instead, the user can instantly drill down to a dashboard tab with a single click.

It is important to note that only one dashboard destination is eligible for Direct Drill Down. When multiple Direct Drill Downs are active, the bolded dashboard name (not in grayscale) receives priority for Direct Drill Down.

Default options for Dynamic Group-By

The dynamic group-by is a powerful feature in Incorta's aggregated tables. In the latest update, users can override the default grouping behavior in which the first dimension in the grouping dimension tray is displayed. Instead, users can select which grouping dimension(s) to show as a default for dashboard consumers.

Query Timeout enhancements

The query timeout feature now includes dashboard searches, preventing long-running queries from impacting system resources.

The options used to manage this feature have been moved to the new tab Analytics Workload Management under Tenant Configurations in the Cluster Management Console (CMC). Turn the Query Automatic Interruption toggle on, and then set the time limit in Query timeout (in minutes). When queries exceed this limit, they are automatically terminated, releasing resources and ensuring ongoing system performance.

An additional option is now available when you turn the Query Automatic Interruption toggle on: Query Max Off-heap Memory (%). Use this option to specify the maximum percentage of the off-heap memory that a query or a dashboard search request can use for processing, excluding data columns. When this threshold is exceeded, the query or dashboard search request is automatically terminated, and all its associated resources are released. The valid values are between 0 and 20 inclusively. Setting the value to 0 disables this functionality. A value greater than 20 will be automatically reset to 20.
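
The value rules for Query Max Off-heap Memory (%) can be summarized in a small Python sketch. The function names are hypothetical. Rejecting negative values is an assumption, since the text above only defines behavior for 0, for 1 through 20, and for values above 20.

```python
from typing import Optional

MAX_OFF_HEAP_PERCENT = 20  # upper bound stated in the release notes


def effective_off_heap_limit(configured: int) -> Optional[int]:
    """Return the effective limit in percent, or None when the check is disabled."""
    if configured < 0:
        # Assumption: negative values are outside the documented range.
        raise ValueError("value must be 0 or greater")
    if configured == 0:
        return None  # 0 disables the functionality
    return min(configured, MAX_OFF_HEAP_PERCENT)  # values above 20 reset to 20


def should_terminate(used_percent: float, configured: int) -> bool:
    """Report whether a query exceeds the configured off-heap threshold."""
    limit = effective_off_heap_limit(configured)
    return limit is not None and used_percent > limit
```

For example, a configured value of 35 yields an effective limit of 20, and a configured value of 0 never terminates a query on memory grounds.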

Notebook for Business Users enhancements

In this release, creating and managing a personal access token (PAT) for an analyze notebook is no longer required when creating a notebook. This release empowers notebook users to directly use SparkX by granting direct access to PySpark, Scala, and R DataFrame APIs.

Note

Starting 2024.7.x, the Notebook for Business Users is available for On-Premises installations.

For details on the required configurations, refer to Notebook for Business Users.

Unified Y-axis in the Stacked Column and Line Chart

There is now an option to unify the Y-axis values in the Stacked Column and Line chart, consistent with the "Combo Dual Axis" chart behavior. To unify the Y-axis, go to the settings of the pill added to the Line tray, and set the Axis Direction to Left. Then, in the general settings, enable the Unify Y-axis (Left) toggle. The insight now shows one unified Y-axis on its left side.

Editable label for Dynamic Fields

The dynamic field now has a customizable label with the option to rename the suffix of the dynamic measure during the insight design time. This allows users to use dynamic measures multiple times within an insight, but still differentiate their intent of use by providing a written value or providing a presentation variable as a suffix.

Dashed Line Format for Line Type Charts

Now, with charts containing a line to display information, you can update the format from solid to dashed lines. For example, if you represent multiple columns with multiple lines on the same chart, you can choose one or more lines to be dashed. To set the format, select the measure pill, and under format, set the type to “Dashed Line”.

Wrap the Label text in the Dual X-axis chart

There is now a new label format option to wrap the text label for the Dual X-axis chart. This format option applies to the upper X-axis. You can keep the original behavior, where the text exceeding the reserved area of the label is trimmed, or toggle label wrapping in the pill settings.

Improved UX for searching and displaying columns in the Data panel

When you search for columns within the Data panel, whether in the Analyzer, Formula Builder, Notebook, or Business Schema Designer, Incorta searches within both the column's name and label. The name and label also appear when displaying the column details. Additionally, you have the option to control what the Data panel displays: either the column name or the label.

Data Management Layer

Introducing Spark SQL Views (Beta)

A new Spark SQL View feature is now available to allow business users to write fully Spark-SQL-compliant queries directly within Incorta using the Advanced SQL Interface. Additionally, users can now leverage session and presentation variables within Spark SQL views, enhancing the flexibility and power of their data queries. Spark SQL views can also be accessed from external BI tools. Spark SQL View is located under Business Schema Designer > Add New View.

Notes
  • The Advanced SQL Interface must be enabled for this feature to work.
  • The Spark SQL Views cannot be referenced by a materialized view.

For more details, refer to Concepts → Incorta Spark SQL View.

The New Data Studio

The Incorta Data Studio tool is now generally available. The Data Studio is a powerful tool that enables users with minimal technical expertise to create Materialized Views (MVs) using a simple drag-and-drop interface.

Using the Data Studio, schema managers can perform multiple actions and transformations on your data using sequential steps and recipes. You use your raw data and apply steps and pre-built recipes to transform your data and create MVs.

To enable the Data Studio, please contact Incorta Support.

Null handling enhancements

In this release, both the Analytics and Loader services have expanded their support for null handling.

  • When the Null Handling option is enabled in the Cluster Management Console (CMC):
    • Formulas defined in schemas, business schemas, and insights now account for null values. The following formulas do not support null handling:
      • and()
      • bin()
      • case()
      • caseContains()
      • decode()
      • in()
      • lookup()
      • or()
    • The Loader Service stores nulls as nulls when materializing formula columns.
    • The Loader Service accounts for null values while partially loading data based on a custom condition.
  • Regardless of the Null Handling option:
    • The PK-index and join calculations: The Loader service accounts for null values and no longer considers them zeros or empty values.
    • The Analyzer and Dashboard now handle null representations when creating or displaying tabular insights, charts, Analyzer tables, and Analyzer views.

For more details, refer to References → Null Handling.

Notes
  • Enabling the Null Handling feature is strongly advised.
  • Incremental loading is now the standard to enforce changes required when enabling or disabling the Null Handling option, eliminating the need for full or staging load operations.

Note

Starting 2024.7.3, the Loader Service will handle nulls during join calculations based on the Null Handling CMC setting.

Partial data loading

With this release, Incorta introduces new table-level settings that partially load data into memory by specifying time-window configurations or custom formula conditions. By limiting the amount of data loaded into memory, users can improve analytics performance and concentrate analytics on recent periods.

Since this feature is focused on loading data to memory, the data will persist fully in storage. Therefore, Incorta functions that read from parquet, like MVs and SQLi, will continue to see the full dataset. However, the SQLi Engine port will honor the conditions set for partial loading.

To set partial loading, select a table from a schema. Inside the table, select Advanced Settings. Under Loaded Data into Memory, select Partial. Configure a time window or a custom condition. Any data satisfying those conditions will be loaded into memory on the next staging load.

Notes
  • During the upgrade, existing load filters for a table will be migrated as partial loading filters.
  • After the upgrade, tables with load filters will require a load from staging.
  • The incremental load does not apply the partial loading condition on the incremental data. It is only applied during a full load or a load from staging.
  • The Preview Data feature will preview the full data, not the partially loaded data.
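
The idea of a time-window condition can be sketched in Python as a filter applied when data is loaded into memory, while the stored data stays complete. This is a conceptual illustration only. The function name, the record layout, and the inclusive cutoff are assumptions, not Incorta behavior.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List


def rows_loaded_into_memory(
    stored_rows: List[Dict[str, Any]],
    date_column: str,
    window_days: int,
    now: datetime,
) -> List[Dict[str, Any]]:
    """Return only the rows that fall inside the time window (hypothetical helper)."""
    cutoff = now - timedelta(days=window_days)
    # The stored rows are not modified; only the in-memory subset is filtered.
    return [row for row in stored_rows if row[date_column] >= cutoff]


stored = [
    {"id": 1, "created": datetime(2024, 6, 25)},
    {"id": 2, "created": datetime(2023, 1, 1)},
]
in_memory = rows_loaded_into_memory(stored, "created", 30, now=datetime(2024, 7, 1))
```

Here `in_memory` contains only the row with `id` 1, while `stored` still holds both rows, mirroring how the full dataset persists in storage.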

Data retention

With this release, Incorta introduces new table-level settings to enable the deletion of data from the disk by specifying time-window configurations or custom conditions. This allows users to limit disk space usage.

To set data retention, select a table from a schema. Inside the table, select Advanced Settings. Toggle on Data Retention, then choose to keep data for a time window or for rows where a condition evaluates to true. The load options for a schema now include a new Purge Data option. Once a purge job has been completed, the data is permanently and irretrievably removed from the disk.

Support delete via an exclusion set

With this release, Incorta introduces new table-level settings to delete records identified by an exclusion set. This allows users to synchronize delete operations with operational data sources to remove inconsistencies in data. Additionally, this capability can also be leveraged for privacy reasons, like GDPR Article 17 (The right to be forgotten).

To apply an exclusion set for a table, select a table from a schema. Inside the table, select Advanced Settings. Toggle on Synchronize Delete Operations, then select the Incorta schema and table that contains the exclusion set. Once complete, set the data source column to match the exclusion set.

The load options for a schema now include a new Purge Data option. Once a purge job has been completed, the data is permanently and irretrievably removed from the disk.

Assign Loaders to load groups

Prior to this release, Incorta selected the loader for a load plan automatically based on availability. To better support scalability and availability, Incorta introduces additional Load Group settings to allow users to assign one or multiple loaders to run a specific load group.

To assign one or more loaders for the Load Group in a load plan, select More Options (⋮), and then select Additional Settings. There, all available loaders can be assigned to the load group.

  • Schema distribution settings are deprecated.
  • Schemas loaded manually will check the assigned loaders.
  • If the assigned loaders are not available, the job will be automatically assigned to other available loaders.
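
The fallback rule in the last bullet can be sketched as follows. This Python snippet is illustrative only. The function `select_loaders` and the loader names are hypothetical, and preferring any assigned loader that is still available is an assumption about the partially available case.

```python
from typing import List, Sequence


def select_loaders(assigned: Sequence[str], available: Sequence[str]) -> List[str]:
    """Pick the loaders that may run a load group (hypothetical helper)."""
    usable = [loader for loader in assigned if loader in available]
    # Fall back to every available loader when none of the assigned ones is up.
    return usable if usable else list(available)


print(select_loaders(["loader1"], ["loader2", "loader3"]))             # prints ['loader2', 'loader3']
print(select_loaders(["loader1", "loader2"], ["loader2", "loader3"]))  # prints ['loader2']
```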

Prevent a load group from continuing in case of error

In the case of sequential load groups with dependencies, the load plan execution needs to stop if the extraction or transformation of a table fails. Incorta now introduces a load-group-level setting, “Stop on First Error”, to give schema managers more control over the flow of a load plan.

  • If the Stop on first error is enabled in the tenant settings, it will override the load group settings.
  • Stop on First Error does not apply to post-load calculations errors.

Restart a load plan from a selected load group

With this release, the ability to Start/Restart a load plan is introduced in the load job details page. This allows the users to avoid restarting the whole load plan and start the load plan from a load group if any load group execution fails.

To restart a load plan from a selected group:

  1. In the Load Job Details Viewer, in the dropdown selection for the Start menu, select Restart from....
  2. Choose which load group to restart from.
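
Conceptually, restarting from a group skips the groups that precede it in the plan. The Python sketch below illustrates this under the assumption that the selected group and all groups after it are executed. The function and group names are hypothetical.

```python
from typing import List, Sequence


def groups_to_run(ordered_groups: Sequence[str], restart_from: str) -> List[str]:
    """Return the selected group and every group after it (hypothetical helper)."""
    start = list(ordered_groups).index(restart_from)  # raises ValueError if unknown
    return list(ordered_groups[start:])


print(groups_to_run(["extract", "transform", "publish"], "transform"))
# prints ['transform', 'publish']
```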

Enhanced dependency warning messages

This release improves the dependency warning messages that appear when you try to update or delete items with dependent entities. The warning includes a link to the entity’s Data Lineage so that you can review all dependencies before proceeding with the intended action. Actions that trigger these warnings are:

  • Physical table deletions
  • Physical table formula deletions
  • Variable updates or deletions
  • Table join updates
  • Business view deletions
  • Business view column and formula updates or deletions
  • Business view column format change

Note

Dependency warning messages are unavailable for updating or deleting entire physical or business schemas.

Schema evolution support

In this release, Incorta leverages Delta Lake column mapping to support adding or removing columns without the need to fully load the data.

  • No migration is required.
  • Column names in Incorta are case-sensitive. Therefore, if you change the case of a column name in the data source, Incorta considers it a new column. This results in deleting the original column and its data and adding the new column with the new case.

For more details, refer to Concepts → Schema Evolution.
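
The effect of case sensitivity on column changes can be illustrated with a set comparison. This Python sketch is not how Incorta or Delta Lake column mapping is implemented. The function `diff_columns` is hypothetical and only shows why a renamed-by-case column appears as one removal plus one addition.

```python
from typing import Iterable, Set, Tuple


def diff_columns(existing: Iterable[str], source: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (added, removed) column names using case-sensitive comparison."""
    existing_set, source_set = set(existing), set(source)
    added = source_set - existing_set
    removed = existing_set - source_set
    return added, removed


added, removed = diff_columns(["id", "Name"], ["id", "name", "email"])
```

In this example, `added` is `{"name", "email"}` and `removed` is `{"Name"}`: changing `Name` to `name` in the source is treated as dropping one column and adding another.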

New LLM Recipe for Data Studio

A new Large Language Model (LLM) recipe is available in Data Studio. This recipe enables users to greatly simplify tasks like classification, summarization, sentiment analysis, and translation without the need for code. The LLM recipe can contain multiple LLMs so that users can switch between models for different use cases.

Before this can be activated, the LLM Keys must be added to the CMC. In the CMC, go to Clusters > Select your cluster > Cluster Configurations > Server Configurations > Incorta Copilot.

To learn more about the recipe and how to enable it, please review this guide.

Fill in DataFlow LLM Models with the requisite JSON.

Note

You need to install the aiXplain and openai libraries. You can do so in the cloud portal from Configurations > Python Packages.

Warning

Running data flows with LLM recipes, as well as their output materialized views, can accrue costs per row of data processed. Token usage estimates are available within the LLM configuration to calculate the cost of processing the entire dataset. We encourage users to exercise caution when running MVs that contain LLMs.

Import and export Data Studio Flows

Data flows can now be migrated between environments. To export a data flow, select More Options (⋮) > Export. To import, select + New > Import. If you are importing a new data flow with a Save MV recipe, the MV will need to be redeployed to your desired schema.

Supporting joins using cross-table formulas

In this release, the Analytics Service now supports one level of joins using cross-table formulas. The Loader Service executes post-load calculations using a plan generated based on the dependencies between calculations, while the Analytics Service performs calculations in stages: first in-table formulas, then joins, and finally cross-table formulas. After evaluating the cross-table formulas, this solution appends a second stage for join calculations.

Enhancements to the external table and schema deletion endpoints

Starting 2024.7.x, the Delete External Table endpoint supports deleting related Parquet files and the target path with the external table definition. Additionally, you can enforce the deletion of external tables and their files when deleting the parent external schema.

  • Set the purge option to true in the Delete External Table endpoint to delete the target path and Parquet files related to the given external table, along with the external table definition in the Spark Metastore.
  • Set the cascade option to true in the Delete External Schema endpoint to delete the target paths and Parquet files related to the external tables in the given external schema, along with their definitions in the Spark Metastore.

Data Source Lineage

Incorta has enhanced the data lineage reports for supported entities to include the underlying data sources, improving transparency and traceability. Additionally, you can access the lineage of external data sources, which visualizes where an external data source is utilized across different layers, from physical schema objects down to business schema views, variables, dashboards, and insights.

Limitations:

  • Data source lineage for external session variables is not currently supported.
  • Folders share the same icon as local files.

New Apache Kafka connector

The new Apache Kafka connector V2 provides multiple enhancements over the older connector version, including:

  • An enhanced and streamlined user experience
  • Supporting multiple topics per data source
  • Automatic discovery; no need to generate an Avro schema
  • Transformation of JSON messages (Flattening objects and arrays)
  • Consuming messages in a batch mode during the load job
  • Native support for Data Agent
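
To illustrate what flattening objects and arrays in a JSON message means, here is a minimal Python sketch. It is not the connector's implementation. The function `flatten`, the dot separator, and the use of array indexes in column names are assumptions chosen for the example.

```python
import json
from typing import Any, Dict


def flatten(value: Any, prefix: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten nested objects and arrays into a single-level dictionary."""
    flat: Dict[str, Any] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            name = f"{prefix}{sep}{key}" if prefix else str(key)
            flat.update(flatten(child, name, sep))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            name = f"{prefix}{sep}{index}" if prefix else str(index)
            flat.update(flatten(child, name, sep))
    else:
        flat[prefix] = value  # scalar leaf: keep the value under its full path
    return flat


message = json.loads('{"order": {"id": 7, "items": [{"sku": "A"}, {"sku": "B"}]}}')
print(flatten(message))
# prints {'order.id': 7, 'order.items.0.sku': 'A', 'order.items.1.sku': 'B'}
```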

Connector enhancements

In this release, Incorta has introduced multiple enhancements to different connectors, including:

  • OAuth 2.0 authentication for Oracle EPM and Cloud Applications (BICC) connectors
  • Reading PGP-encrypted files from SFTP and Oracle Cloud Applications (BICC) data sources
  • Oracle OTM/GTM connector:
    • An optional list of tables that can be accessed by Incorta
    • Support for asynchronous export via the Data Export API
    • Support for deleted Primary Keys (PKs) extraction
  • Support for SSH authentication for SFTP
  • Support for UCM discovery in Oracle Cloud Applications (BICC) (The connector has been renamed to Oracle Cloud Applications starting with connector version 2.2.3.0.)

Incorta Copilot

Notebook Copilot

The interactive notebook experience for materialized views and analyst notebooks has been upgraded. Instead of interacting with Copilot in the notebook cells, you can now open Copilot by clicking the Ask button.

This launches the Copilot sidebar, where you can create code, summarize previously written code, or troubleshoot. Any code generated in Copilot can be quickly added to the notebook for a simpler and more productive scripting experience.

Summary component enhancements

The Summary component is an excellent option for highlighting key insights from charts within a dashboard. In this release, Incorta extends the Summary component to highlight insights from the following chart types:

  • Bubble
  • Spider
  • Waterfall
  • Tag Cloud
  • Time Line Series
  • Time Series

Architecture and Application Layer

New data encryption options

Context

This feature is available for On-Premises installations only.

Incorta has enhanced data encryption and security by introducing two new features:

256-bit data encryption keys

A new 256-bit data encryption key is now generated per cluster during cluster creation or upgrade. This new enhancement requires importing the encryption key from the source cluster to the destination when moving data between clusters.

Alternatively, you can provide your own 256-bit data encryption key.

Integration with Azure Key Vault

Incorta now supports integrating with the Azure Key Vault service to provide and maintain a master key to encrypt the cluster’s data encryption key. Incorta will automatically decrypt and re-encrypt the cluster’s data encryption key whenever it detects a rotation of the master key.

These options are managed via the CMC > Clusters > <cluster_name> > Details page.

  • Under Data Encryption > Data Encryption Key, select BYOK to upload your 256-bit data encryption key. The uploaded file should be in the .key format.
  • Under Data Encryption > Data Encryption Key, select Import or Export as required.
  • Under Data Encryption > Key Management Service, select Integrate and provide the required details, including:
    • The Vault URI and a key name
    • The client ID and secret
    • The tenant ID

For details, refer to Guides → Data Encryption.

Data Agent enhancements

In this release, Incorta introduces the Data Agent Controller to remove the need for human intervention when a data agent stops unexpectedly or needs to be restarted. You can now view the data agent status and start, stop, or restart the data agent as needed from the Data Manager. This enhancement ensures smoother operation and reduces downtime, especially during after-hours or weekend upgrades.

Note

You need to start the Data Agent Controller. Here are the required steps to start it:

  • On the Data Agent remote host:
    • Copy the .auth file to /incorta.dataagent.controller/conf/ and /incorta.dataagent/conf/.
    • Navigate to the <unzipped_data_agent_path>/incorta.dataagent.controller/ directory, and then run one of the following scripts depending on the host machine’s OS:
      • Linux: ./bin/controller.sh start
      • Windows: bin/controller.bat start
  • For On-Premises installations, in the CMC > Server Configurations > Data Agent, enter the ports required for the connection between the Analytics Service and the Data Agent Controller.

Materialized View usability enhancement

In this release, Incorta has added a few small features to improve materialized view (MV) usability:

  • Error messages for ambiguous errors have been rewritten to better guide users to corrective actions.
  • Data types can no longer be set by the schema manager in the UI. Instead, the UI reflects the output data type determined through MV script validation. This prevents discrepancies between the data model and the Parquet storage, and ultimately prevents compaction errors.
  • An MV now fails instead of getting stuck indefinitely in a ‘waiting’ status when it requires more resources than are available on the cluster or when the Spark driver crashes.

Analytics Workload Management

Incorta is introducing new Server and Tenant configurations to enhance the monitoring, management, and resilience of the Analytics infrastructure, ensuring proper system utilization and stability.

The Analytics Workload Management feature provides:

  • Enhanced Auditing: Provides comprehensive visibility into the system activities and usage patterns, enabling better troubleshooting and performance analysis. For details, refer to References → Engine Audit.
  • Memory warning and starvation thresholds: Logs alerts when off-heap memory usage exceeds the warning or starvation threshold, facilitating early detection of potential issues.
  • Support for automatic recovery (Beta feature): Attempts to recover the Analytics platform when a starvation threshold is exceeded, ensuring continuity and stability by managing memory resources effectively.

You can find the new options under the new tab Analytics Workload Management in the Server Configurations and Tenant Configurations.

SQLApp enhancements

Previously, both the Analytics and the SQLi services could start SQLApp, which resulted in resilience issues and scattered logs. The new enhancements aim to improve resilience and address recent issues related to SQLApp.

Ownership and startup
  • Only the SQLi service can start SQLApp. As a result, the SQLi service must be started and the Spark port must be enabled when creating or running Incorta-over-Incorta tables or SQL views based on non-optimized tables.
  • Starting a SQLi service will start its associated SQLApp and terminate any existing SQLApp.
  • Shutting down a SQLi service will shut down the associated SQLApp.
Logging improvements
  • Logs are now appropriately managed and not cluttered within the Analytics services.
  • Enhanced SQLApp logging for easier troubleshooting.
Connection throttling

A new CMC option (Server configurations > SQL Interface > Max number of concurrent connections from SQLi to SQLApp) is now available to control the maximum number of connections from SQLi to SQLApp. The default is 100.

When the limit is reached, the SQLi service waits for a configurable interval (5 minutes by default) before trying to acquire a connection to SQLApp. If it fails to acquire a connection after this interval, it throws a timeout exception.
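
The throttling behavior described above can be pictured with a small Python sketch. It is an illustration under stated assumptions, not Incorta's implementation: the class and exception names (ConnectionThrottle, ConnectionLimitTimeout) are hypothetical, and a semaphore is simply one common way to cap concurrency. The defaults of 100 connections and a 5-minute wait mirror the values quoted above.

```python
import threading


class ConnectionLimitTimeout(Exception):
    """Raised when no connection slot frees up within the wait interval."""


class ConnectionThrottle:
    """Caps concurrent connections and bounds how long a caller waits for a slot.

    Hypothetical helper for illustration only.
    """

    def __init__(self, max_connections=100, wait_seconds=300.0):
        # 100 connections and 300 seconds (5 minutes) mirror the quoted defaults.
        self._slots = threading.BoundedSemaphore(max_connections)
        self._wait_seconds = wait_seconds

    def acquire(self):
        # Block until a slot is free or the wait interval elapses.
        if not self._slots.acquire(timeout=self._wait_seconds):
            raise ConnectionLimitTimeout(
                f"no connection available after {self._wait_seconds} seconds"
            )

    def release(self):
        # Return the slot so that a waiting caller can proceed.
        self._slots.release()
```

In this sketch, a caller that cannot obtain a slot within the interval gets an exception rather than waiting indefinitely, which matches the timeout behavior described above.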

Advanced Analyze User role

In this release, we've introduced the Advanced Analyze User role under the Security tab → Roles. This role grants permissions identical to the Analyzer role. Additionally, the Advanced Analyze User or Admin role is now mandatory for using Copilot and business Notebooks and for installing the component SDK.

Important

If you have any business Notebooks created before the upgrade, it's important to ensure that users who created these Notebooks are added to a group containing the new Advanced Analyze User role.

In-app notifications

Incorta Now Keeps You Informed with In-App Notifications!

  • Stay informed: Get notified about important events within the platform.
  • Never miss a beat: Notifications grouped by day, week, and month with 30-day history.
  • Actionable updates: Links embedded in messages for easy access.
  • Initial focus: Scheduled jobs, password resets, authentication updates, comment mentions, impersonation, and API key management.

This feature is enabled by default. CMC admins can disable it using the Enable In-App Notifications option in the CMC > Server Configurations > Notifications.

Cloud installations

You will need to disable this option for Cloud installations if you face a Loading chunk failed error. Contact Incorta Support to disable this feature on your Cloud clusters.

Improved custom configuration persistence during upgrades

In this release, custom CSS and session time-out configurations are preserved during Incorta platform upgrades. Previously, these settings might have required reapplication after an upgrade. This enhancement streamlines the upgrade process and minimizes disruption.

Monitoring file system usage enhancements

Incorta can now collect more metrics when monitoring how Incorta services use the Google Cloud Storage (GCS) file system. The following are some of the metrics that Incorta captures in the new audit file:

  • The shortest and longest time taken by different backend method calls related to file operations, including reading and writing operations on audit and Parquet files
  • The path of the files that record the shortest and longest time
  • The smallest and largest file sizes

Contact Incorta Support to enable this feature.

Note

The Parquet file sizes in the audit file may slightly differ from the actual sizes due to procedures taken to improve performance and reduce disk space usage.

Performance enhancements on GCS

This release introduces multiple enhancements to reading Parquet segments on the Google Cloud Storage (GCS) file system, improving system performance in the following areas:

  • Full PK Index calculations
  • Incremental PK Index calculations
  • Full column reading from Parquet via the Loader or Analytics services

The new enhancements include:

  • Getting Parquet segments from multiple increment directories in parallel
  • Checking the existence of offsets in parallel by default (This feature was introduced before but disabled by default.)
  • Using a Cached Thread Pool when creating threads instead of a Fixed Thread Pool

Connectors Marketplace enhancements

  • Enhanced user experience when updating the connector, allowing for selecting a specific version to install instead of upgrading to the latest version by default.
  • Support for installing a custom version of a connector from the Marketplace. You will need to provide the custom version number shared by Incorta Support.

Enhancements and Fixes

In addition to the new features and major enhancements mentioned above, this release introduces more enhancements and fixes.

Enhancements

To highlight its compatibility with PostgreSQL, the Incorta SQL View is now labeled PostgreSQL View in the following locations:
  ●  Cluster Management Console (CMC): Tenant Configurations > Incorta Labs > Enable creation of PostgreSQL Views
  ●  Business Schema: New > Add New View > Create Via PostgreSQL
Area: CMC / Business Schema

Cloud admin users can access the External Visualization Tools from the following locations:
  ●  The Cloud Admin Portal → Advanced Configurations → Tenant Configurations (Tenant Name)
  ●  Cluster Management Console (CMC): Tenant Configurations
Area: CMC / Cloud Admin portal

WMS connector enhancements:
  ●  Allow chunking
  ●  Handle nested objects
Area: Connectors

The Timeline Component now supports drill down options.
Area: Dashboards

Improved performance for dashboards without filters or prompts by eliminating redundant searches of repeated fields, streamlining data retrieval, and enhancing overall efficiency.
Area: Dashboards

Continued improvement for keyboard and voice users.
Area: Dashboards

The schema limit in a load plan group has been increased from 50 to 300 schemas.
Area: Data Management

Extended the liveness check between the Loader Service and Spark to be bi-directional. If the Spark driver crashes, the MV will fail after a configurable interval.
Area: Loader Service

Incorta log files will now include a record each time the Engine Concurrent Service Thread Throttling initiates for a query.
Area: Logs

Introduced new structured logs in JSON format, facilitating detailed tracking of Engine task state updates, including search and query operations. These logs are invaluable for debugging and troubleshooting issues effectively.
Only Admin users can now enable this feature under Cluster Management Console (CMC) → Cluster Configurations → Server Configurations → Diagnostics → Engine Tasks Lifecycle Structured Logs.
Area: Logs

MV dynamic resource allocation enhancements:
  ●  The default value of the maximum executors is now limited to 1000.
  ●  You can enable or disable the resource dynamic allocation at the MV level and also change the maximum executors limit.
Area: Materialized Views (MV)

Introduced a retry mechanism for scheduler queries sent via email when there's a sync block during lock acquisition or when the "The underlying data is being updated" error occurs.
  ●  Queries will be retried automatically every 30 seconds, with a maximum of 4 attempts.
  ●  This retry mechanism is configurable and enabled by default.
  ●  To disable this feature, set enable.mail.scheduler.retry.read.lock=false in the service.properties file. Please contact support in case this property needs any adjustments.
Please remember to restart the service for changes to take effect.
Area: Scheduler

Increased the number of schemas that can be added to a load plan group from 50 to 300 schemas.
Area: Scheduler

Schema Managers can change a schema or table name in the schema wizard.
Area: Schemas

On Cloud installations, schema managers can now download the Events and Driver logs of failed MV load jobs from the Job Errors dialog.
Area: Schemas

Improved the system resilience and stability by resolving issues related to infinite loops and overflow exceptions caused by cyclic dependencies in variables.
However, it's essential to acknowledge that a standardized error message or expected output has not yet been established. Users may encounter varied outputs, including #Error, 0 rows, empty strings, or variable name displays, depending on the context. Ongoing efforts are dedicated to achieving uniform error handling and output consistency.
Area: Variables

You can now copy the text added to shapes as you would any native text.
Area: Visualizations
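
The scheduler retry behavior listed above (a retry every 30 seconds, with a maximum of 4 attempts) can be sketched in Python as follows. This is only an illustration, not Incorta's code: the names run_with_retry and ReadLockUnavailable are hypothetical, and the sketch assumes that "4 attempts" counts the first try.

```python
import time


class ReadLockUnavailable(Exception):
    """Stand-in for a sync block or an 'underlying data is being updated' error."""


def run_with_retry(query, max_attempts=4, delay_seconds=30, sleep=time.sleep):
    """Run query(), retrying on ReadLockUnavailable with a fixed delay.

    Assumption: max_attempts includes the first try.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return query()
        except ReadLockUnavailable:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            sleep(delay_seconds)  # Fixed wait before the next attempt.
```

The sleep parameter is injectable only so that the delay can be replaced in tests. With these defaults, a query that keeps failing gives up after about 90 seconds of waiting.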

Fixes

Updating the source physical schemas during data extraction of related business views caused these updates to fail or get stuck.
Area: Advanced SQLi

Maintenance Packs

2024.7.2 maintenance pack

In this maintenance pack:

Incorta Premium

With this release, we are happy to package some of our newest and most innovative features into the Incorta Premium Package. Features packaged as a part of Incorta Premium include:

  • The Incorta Copilot - The platform integrates generative AI experiences, collectively known as "Copilot," across various interfaces to enhance user interactions. These include assistants for business queries, dashboards, story generation, metadata, notebooks, and data studio, all using natural language to simplify tasks and generate insights.
  • The Incorta Data Studio - The Incorta Data Studio is a new suite enabling Schema Managers to accelerate Materialized View development through Dataflows—low-code workflows made of interconnected Recipes. Users can perform data quality checks, blending, preparation, analysis, and complex operations with Python, SQL, and LLM Recipes, ultimately deploying the workflow into a Physical Schema.
  • Spark SQL Views - Incorta has now extended its Advanced SQL interface to support building Business views in Spark SQL, enabling complex data transformations, dynamic filtering with variables, and seamless visualization within dashboards.
  • Business Notebooks - The Analyzer Notebook enables analysts to develop in PySpark, Spark SQL, Scala, and Spark R, providing access to verified Business Schema views with row-level security, ensuring secure and governed data analysis.

To explore our Premium features in depth, check Incorta Premium.

Enable Views for Copilot

Incorta increases the security and governance of which views can be queried through the Copilot. Previously, any verified view could be queried by the Copilot by default. Now, a new Business Schema setting enables or disables querying a view through the Copilot.

To set a view as eligible for use in Copilot, edit a Business Schema, open a Verified View, and then select More Options (⋮) -> Enable for Copilot.

Important

Views are not enabled for Copilot by default.

Copilot User Role

This release introduces a new role, the Copilot User Role. This role is required for any user who intends to use the Copilot capabilities. Because Copilot is built into multiple experiences, Copilot features are made available to users based on their level of access to the platform. For example, Schema Managers can leverage Copilot's natural-language-to-SQL capabilities when building materialized views, while an Individual Analyzer can leverage natural-language-to-Insight capabilities.

Enhanced querying with limits in variables and In-Query filters

In this release, Incorta introduces two query functions designed for efficient data retrieval when working with internal session and global variables. These functions enable you to specify a limit on the number of records returned, allowing more targeted querying based on defined conditions.

  • queryWithLimit: Returns a specified number of records from a given column, with the option to apply additional filters to further narrow down the results.
  • queryDistinctWithLimit: Returns a limited set of unique (distinct) values from a specified column, with the ability to apply filters to refine the selection of distinct values. It is supported in In-Query filters as well.
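
To make the difference between the two functions concrete, the following Python sketch emulates the behavior described above on an in-memory list of rows. It does not show Incorta formula syntax: the Python function names, parameters, and argument order are hypothetical, and the sketch assumes that results keep the order in which rows are encountered.

```python
def query_with_limit(rows, column, limit, predicate=None):
    """Return up to `limit` values of `column` from rows that pass `predicate`."""
    values = []
    for row in rows:
        if len(values) >= limit:
            break
        if predicate is None or predicate(row):
            values.append(row[column])
    return values


def query_distinct_with_limit(rows, column, limit, predicate=None):
    """Return up to `limit` distinct values of `column` from matching rows."""
    seen = []
    for row in rows:
        if len(seen) >= limit:
            break
        if (predicate is None or predicate(row)) and row[column] not in seen:
            seen.append(row[column])
    return seen
```

In this emulation, the first function can return repeated values, whereas the second skips values it has already collected before applying the limit.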

New Public API endpoints

Two new endpoints are now available:

  • Create Catalog Folder (/catalog/folder): Creates a new folder after providing the folder name and the ID of its parent folder.
  • Rename and move Dashboards (/dashboards): Renames a list of dashboards, moves them to other folders, or transfers their ownership.

Enhancements

Incorta now supports analyzing and querying unstructured data, such as PDFs and MS Word files.
Area: Incorta Copilot and Data Studio

A new version of the Kyuubi Hive JDBC Jar is now available for connecting Tableau to Incorta via Advanced SQLi. The new Jar introduces enhanced error messaging, providing clearer feedback when incorrect user credentials are supplied.
Area: Advanced SQLi

Fixes

Comments at the beginning or end of the Spark SQL view code resulted in query failure.
Area: Spark SQL

Incremental Mode activated in the Save MV Recipe in a Dataflow would fail to deploy an MV to the target schema.
Area: Data Studio

Upgrading clusters that used a MySQL 8 metadata database from a 6.0.x or earlier release to 2024.7.x might fail.
Area: Metadata Database

After the cleanup job ran and removed the load job tracking data, the Schema Designer displayed 0 rows for non-optimized tables if they did not have any successful load jobs during the retention period, or if all load jobs during this period resulted in 0 rows, although these tables might still have data on disk.
Area: Schemas

2024.7.3 maintenance pack

In this maintenance pack:

Decoupling Null Handling from Advanced SQLi

Before this release, enabling the Advanced SQL Interface (SQLi) automatically enforced the Null Handling feature.
With this release, Null Handling is now an independent feature. You can enable or disable it separately. However, for consistent query results when using Advanced SQLi, it is recommended to enable Null Handling.

For more details, refer to Advanced SQLi and Null Handling.

After upgrading to 2024.7.3, clusters that previously had the Advanced SQLi enabled will automatically have both features enabled.

Updates to null handling during join calculations

In 2024.7.3 and later releases, the Loader Service handles null values during join calculations based on the Null Handling setting in the CMC.

  • Disabled: The Loader Service treats Null values as zeros for numeric columns, empty strings for text columns, and empty dates for date columns.
  • Enabled: The Loader Service treats Null values as distinct values, not equivalent to zeros, empty strings, or other null values.

For more details, refer to References → Null Handling.
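
As a rough illustration of the two modes, the following Python sketch decides whether two join-key values match. It is a simplified model under stated assumptions, not the Loader Service's implementation: the function name join_keys_match and its parameters are hypothetical, and None stands for a null value.

```python
def join_keys_match(left, right, null_handling_enabled, null_substitute=0):
    """Decide whether two join-key values match under each Null Handling mode.

    None stands for a null value. null_substitute is what a null becomes when
    Null Handling is disabled (0 for numeric columns, "" for text columns).
    """
    if null_handling_enabled:
        # Enabled: a null never matches anything, including another null.
        if left is None or right is None:
            return False
        return left == right
    # Disabled: replace nulls with the column's substitute before comparing.
    left = null_substitute if left is None else left
    right = null_substitute if right is None else right
    return left == right
```

In this model, a row with a null key joins to a row with a zero key only when Null Handling is disabled, which is why join results can change when the setting is switched.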

Enhanced access to Spark SQL Views and their insights

To query Spark SQL views from BI tools or access insights based on these views, you only need permissions to the business schema containing the Spark SQL view or the dashboard containing the insight, respectively. You no longer require permissions to the underlying entities referenced in the Spark SQL view or the insight.

Important

You must sync the Spark Metastore after upgrading to this release to ensure access to Spark SQL views and insights leveraging them. Go to the CMC > Clusters > <cluster_name> > Tenants > <tenant_name> > More Options, and then select Sync Spark Metastore.

Conversational support in Incorta Copilot

Previously, prompts sent to Incorta Copilot were treated as discrete conversations. Now, with the conversational context feature, Incorta Copilot offers an improved user experience. This means that the context of previous prompts and responses is used to return the best possible answers in updated prompts.

For example, if a dataset is returned for “Show me sales for this past year.” and the user then asks “Now, show me for each month.”, Copilot infers from the conversation that the user is still asking about sales.

To clear the conversation's context, select the 'paper shredder' icon located to the left of the prompt window. This icon serves as a reset button for the conversation, allowing you to start a new conversation context.

Searching data within a prompt in Incorta Copilot

While no data is transmitted to the LLM underlying Incorta Copilot, results were sometimes blank because the Copilot guessed at values when filtering data. For example, when trying to return only data for “California”, the LLM is unaware whether the data is stored as the full name or as the standard two-letter US abbreviation (“CA”).

To address this, we’ve added a capability that allows users to explicitly state a data value by entering ‘#’ followed by a value. If the value exists in multiple columns, a selection dialogue will appear. It's important to note that this capability currently only applies to string values.

New Oracle ERP Manufacturing and Supply Chain Data App

In this release, we are happy to announce our new data app: Oracle Cloud ERP Manufacturing and Supply Chain Analytics. This new data app delivers comprehensive visibility across work order costs, inventory management, and production planning through ready-to-use dashboards and schemas.

Get immediate answers to critical questions about material usage variances, inventory levels, and BOM hierarchies while empowering business users to create custom reports independently.

For further requirements, please check the ERP installation documentation.


Patch Releases

2024.7.2-P1

Enhancements

More flexible conditions for parallel column reading into memory. Previously, only Parquet files with at least 10 million rows and 4 segments qualified for parallel processing. Now, Incorta reads columns from Parquet files in parallel if they meet any of these criteria, significantly optimizing load times, particularly for clusters using Google Cloud Storage (GCS).
Area: Parallel column reading
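
The change in eligibility can be read as moving from an "all criteria" check to an "any criterion" check. The Python sketch below shows that reading. It assumes the thresholds themselves (10 million rows and 4 segments) are unchanged, which the note above does not state explicitly. The function names are hypothetical.

```python
MIN_ROWS = 10_000_000  # row threshold quoted in the release note
MIN_SEGMENTS = 4       # segment threshold quoted in the release note


def qualifies_before(rows, segments):
    # Earlier behavior: both thresholds had to be met.
    return rows >= MIN_ROWS and segments >= MIN_SEGMENTS


def qualifies_now(rows, segments):
    # Assumed reading of the new behavior: meeting either threshold is enough.
    return rows >= MIN_ROWS or segments >= MIN_SEGMENTS
```

Under this reading, a file with 20 million rows in 2 segments was previously read serially and now qualifies for parallel reading.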

Fixes

After upgrading to 2024.7.x, previously configured incremental ingest jobs might fail or result in data inconsistencies at the destination.
Area: Intelligent Ingest

Known issues

Issue: Incremental Mode activated in the Save MV Recipe in a Dataflow will fail to deploy an MV to the target schema.
Workaround: Resolved in 2024.7.2. For 2024.7.1, toggle off Incremental Mode, deploy the MV, open the MV in the Physical Schema, and then apply incremental logic on the MV.

Issue: The load plan won’t appear on the Load Jobs list if the latest execution does not include the load group with the schema that the logged-in user has access to, for example:
  ●  When aborting a load plan execution before the related load group starts.
  ●  When restarting the execution from a group that follows the related load group.

Issue: A Loading chunk failed error occurs when trying to open a dashboard or schema on Cloud installations.
Workaround: Contact Incorta Support to disable the in-app notifications feature in the CMC.

Issue: Upgrading clusters that use a MySQL 8 metadata database from a 6.0.x or earlier release to the 2024.7.x release might fail.
Workaround: Resolved in 2024.7.2. For 2024.7.1, execute the following against the Incorta metadata database before the upgrade:
ALTER TABLE `NOTIFICATION` MODIFY COLUMN `EXPIRATION_DATE` TIMESTAMP NULL DEFAULT NULL;

UPDATE `NOTIFICATION` SET `EXPIRATION_DATE` = NULL WHERE CAST(`EXPIRATION_DATE` AS CHAR(19)) = '0000-00-00 00:00:00';
COMMIT;