Release Notes 6.0.3

Release Highlights

In this release, Incorta introduces new features and multiple enhancements, including Incorta Data Studio, improvements to the Google Sheets and ServiceNow connectors, more responsive interruption of load jobs during Post-load calculations, support for key columns in derived tables, detection of key columns that result in duplicate values, and a new bundled version of Tomcat. This release also comes with fixes to issues that you might have encountered in previous releases.

Upgrade considerations

Caching mechanism enhancements

The caching mechanism for Analyzer views used in dashboards has been enhanced to cache all columns in the view, preventing query inconsistencies. To optimize performance and reduce off-heap memory usage, it is recommended to create views with only the essential columns used in your dashboards.

What’s new?

Incorta Data Studio

Incorta is adding the new Data Studio tool to this release. Data Studio is a powerful tool that enables users with minimal technical expertise to create Materialized Views (MVs) using a simple graphical interface.

To get the Data Studio, please contact Incorta Support.

Note: The Data Studio is an Incorta Labs feature.

An Incorta Labs feature is experimental, and its functionality may produce unexpected results. For this reason, an Incorta Labs feature is not ready for use in a production environment. Incorta Support will investigate issues with an Incorta Labs feature. In a future release, an Incorta Labs feature may be either promoted to a product feature ready for use in a production environment or deprecated without notice.

Using the Data Studio, schema managers can perform multiple actions and transformations on their data using sequential steps and recipes. As recipes are added to the canvas, the corresponding PySpark code is generated automatically. When the code is ready to be saved, use the Save MV recipe to push it into a schema.

Important

The Data Studio is available in this release as a trial version.

There are multiple recipes available; in addition, you can create your own recipe. The following list briefly describes the available recipes, and a short PySpark sketch of a sample recipe chain follows the list:

  • Filter – Filter data based on a condition.
  • Fuzzy Join – Match similar words based on another reference, which can be a table or a file.
  • Change Type – Change the data type of one or more columns within a table.
  • Code – Simply create a recipe by writing your own code or using the Incorta Copilot to generate code on your behalf.
  • Select Columns – Include or exclude columns from a table or another recipe.
  • Join – Join two tables or recipe outputs, choosing between different join types.
  • Data Quality – Check data compliance against various rules that you define.
  • Union – Combine two tables that have the same number of columns.
  • Unpivot – Change columns into rows and vice versa.
  • Sort – Sort a table in ascending or descending order.
  • New Column – Add a new column to your table by writing a formula, or using Copilot to write it on your behalf.
  • Sample – Retrieve sample data from the table with a specified fraction.
  • Group – Create an aggregated view of your data with granularity defined by grouping.
  • Rename Column – Rename columns of the table.
  • Split – Split a table using a specified ratio.
  • SQL – Write your own SQL statement, optionally with the assistance of Copilot.
  • Save MV – Save the PySpark code generated by the series of recipes as a PySpark materialized view. The materialized view should then be loaded from the schema.
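
As a rough illustration, and assuming the generated code follows standard PySpark DataFrame conventions, a chain of Join, Group, and Save MV recipes might translate to code along these lines. The table names, column names, output path, and the read/write calls are placeholders for this sketch, not the exact code that Data Studio emits:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical inputs; in Data Studio these would come from source tables or earlier recipes.
    orders = spark.read.parquet("/data/orders")
    customers = spark.read.parquet("/data/customers")

    # Join recipe: inner join on the customer key.
    joined = orders.join(customers, on="customer_id", how="inner")

    # Group recipe: aggregate order amounts per country.
    grouped = joined.groupBy("country").agg(F.sum("amount").alias("total_amount"))

    # Save MV recipe: persist the result so it can be loaded as a materialized view from the schema.
    grouped.write.mode("overwrite").parquet("/data/mv_country_totals")
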
Known Limitations
  • Any change made to the MV from a physical schema will not be reflected in Data Studio.
  • While a schema is in use by Data Studio, Incorta locks that schema, preventing any updates.
  • Users who do not have access to schemas will have view-only access to data flows where these schemas are used.
  • Deleting a user also deletes any data flows that user has created.
  • Data Studio performance may degrade when using large tables with many columns.
  • Data flows cannot be exported or imported.

Google Sheets connector enhancements

This release introduces a new option for datasets created based on Google Sheets data sources: the Page Size option. You can now customize the number of records to discover and retrieve at a time. This option is especially beneficial when dealing with large sheets.
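
Conceptually, a page size of N means rows are fetched in batches of N rather than in one large request. The following sketch illustrates the idea using the Google Sheets API client for Python; the spreadsheet ID, sheet name, column range, and page size are placeholders, credentials are assumed to be configured, and this is not how the Incorta connector is implemented internally:

    from googleapiclient.discovery import build

    PAGE_SIZE = 5000                        # hypothetical "Page Size" value
    SPREADSHEET_ID = "your-spreadsheet-id"  # placeholder
    SHEET_NAME = "Sheet1"                   # placeholder

    # Assumes application default credentials or an API key are already configured.
    service = build("sheets", "v4")

    def read_in_pages(start_row=1):
        """Yield rows in fixed-size batches instead of fetching the whole sheet at once."""
        row = start_row
        while True:
            cell_range = f"{SHEET_NAME}!A{row}:ZZ{row + PAGE_SIZE - 1}"
            response = service.spreadsheets().values().get(
                spreadsheetId=SPREADSHEET_ID, range=cell_range).execute()
            values = response.get("values", [])
            if not values:
                break
            yield values
            row += PAGE_SIZE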

Enhanced the responsiveness of Post-load interruption

This release enhances the responsiveness of interrupting load jobs during Post-load calculations. Previously, the Loader Service would wait for running calculations to complete before stopping the load job.
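
The underlying idea is cooperative interruption: rather than waiting for the whole calculation phase to finish, the running work checks for a stop request at safe points between steps. The following is a generic, hypothetical illustration of this pattern in Python, not Incorta's actual Loader Service code:

    import threading

    stop_requested = threading.Event()

    def run_post_load_calculations(calculations):
        """Run calculations one by one, checking the stop flag between steps."""
        for calc in calculations:
            if stop_requested.is_set():
                print("Load job interrupted; skipping remaining calculations.")
                return
            calc()  # run a single calculation step

    # Elsewhere, an "interrupt load job" request simply sets the flag:
    # stop_requested.set()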

Derived tables’ support for key columns

Now, you can specify key columns for Incorta Analyzer and SQL tables. Adding, removing, or changing key columns does not require running a load job as derived tables are refreshed as part of schema update jobs. The derived table’s unique index is calculated and saved as a snapshot DDM file each time the key columns are updated or the schema or table is loaded.

Make sure that the column or columns that you designate as key columns maintain row uniqueness because no deduplication is performed for derived tables (a quick way to check for duplicates is sketched after the following list). If the selected key columns result in duplicate key values:

  • During the schema update job, duplicate values are kept, and the Engine will return the first matching value whenever a single value of the key columns is required. The schema update logs will point out the unique index issue.
  • During the schema or table load job, the unique index calculation will fail, resulting in a finished-with-error load job. No value is returned when the unique index is required. You must select the correct key columns to have the unique index calculated.
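
As a quick, hypothetical way to verify row uniqueness before designating key columns (the table path and column names below are placeholders, and this check is not part of Incorta itself), you can count duplicate key combinations in PySpark:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.read.parquet("/data/derived_table")    # placeholder input

    key_cols = ["order_id", "line_number"]            # candidate key columns

    # Group by the candidate keys and keep combinations that occur more than once.
    duplicates = df.groupBy(*key_cols).count().filter(F.col("count") > 1)

    if duplicates.count() > 0:
        print("The candidate key columns do NOT guarantee row uniqueness:")
        duplicates.show()
    else:
        print("The candidate key columns are unique.")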

Detecting duplicates during unique index calculations

In previous releases, when the Enforce Primary Key Constraint option was disabled for physical tables or MVs and the selected key columns resulted in duplicate key values, unique index calculations would not fail; instead, the first matching value was returned whenever a single value of the key columns was required.

Starting with this release, in such a case, the unique index calculation will fail, and the load job will finish with errors. You must either select key columns that ensure row uniqueness and perform a full load or enable the Enforce Primary Key Constraint option and load tables from staging to have the unique index correctly calculated.

Enhancements and fixed issues

Enhancements

  • Connectors – Implemented a retry mechanism to resubmit requests when ServiceNow schemas fail to load due to excessive requests for the same session, resulting in this error: INC_03070000: Failed to load data from [ServiceNow] due to [Cannot auto-detect encoding, not enough chars].
  • Security – Incorta now bundles Apache Tomcat 9.0.83 to catch up with the security enhancements and fixes in this version.

Fixed issues

  • Dashboards – Session and presentation variables were not resolved when used in Rich Text insights.
  • Dashboards – Drilling down into a dashboard from an insight with only one measure and no grouping dimensions erroneously filtered the dashboard with the measure value.
  • Loader Service – Formula calculations took longer than usual just after restarting the Loader Service.
  • Materialized Views – An issue prevented forcing data sampling during MV discovery, resulting in loading data rows from source tables instead of reading only the metadata and ignoring data rows during MV discovery.
  • Scheduler – The Scheduler didn’t respect the end time when scheduling jobs using the Between option.