About the Data Lineage Viewer

The Data Lineage Viewer was introduced as a beta feature in the 2022.4.0 release and has been enhanced in later releases. With this feature, you can track the dependencies of each column (data column or formula column) in your physical schema and business schema objects. You can preview a list of entities where a column is referenced, whether directly or via another dependent entity. In addition to the tabular view, the 2022.7.0 release adds a graphical representation of the data lineage that gives you quick insight into the dependent entities and their number. This feature aims to help you, as a schema developer, better manage schema updates and load jobs.

Note

Starting with the 2022.11.1 release, the Data Lineage Viewer is a General Availability (GA) feature that is ready for production use.

Important

This version of the Data Lineage Viewer will not be available starting with the 2023.7.0 release. A new version will be available that displays both the source of data (upstream lineage) and its destination (downstream lineage). Additionally, it shows the data lineage of most entities in the system, such as tables, views, dashboards, insights, session variables, and global variables.

The Data Lineage Viewer shows a great deal of granularity regarding dependent entities. The list of dependent entities is not limited to the entities where the column is directly referenced. All entities that can be impacted by the update or deletion of this column are listed. Dependent entities can be columns in other physical or business schema objects, formula columns, joins, dashboard insights, or filters.
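The difference between direct and indirect dependents can be illustrated with a small graph traversal. The following is a minimal sketch, not Incorta's implementation: the `references` mapping, the entity names in it, and the `dependents` function are all hypothetical, and a breadth-first search is used here only as one plausible way to collect everything that an update or deletion could affect.

```python
from collections import deque

# Hypothetical example data: each key is an entity, and each value lists the
# entities that reference it directly. None of these names come from Incorta.
references = {
    "sales.orders.amount": [
        "sales.orders.amount_with_tax",
        "biz.orders_view.amount",
    ],
    "sales.orders.amount_with_tax": ["dashboards/Revenue"],
    "biz.orders_view.amount": ["dashboards/Revenue", "dashboards/Orders"],
}


def dependents(column, references):
    """Return every entity that depends on `column`, directly or indirectly."""
    seen = set()
    order = []
    queue = deque([column])
    while queue:
        current = queue.popleft()
        for dependent in references.get(current, []):
            if dependent not in seen:  # report each entity only once
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


impacted = dependents("sales.orders.amount", references)
```

In this sample, the two dashboards never reference `sales.orders.amount` directly, yet both end up in `impacted` because they reference entities that depend on it.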

Note

In this document, the word “column” refers to both data columns (non-calculated columns) and formula columns.

Enhancements through different releases
  • Incorta extended the data lineage feature in the 2022.7.0 release to track the dependencies of columns (data or formula columns) in the runtime business views (business schema views, Incorta Analyzer Views, and Incorta SQL Views). The data lineage list and diagram show the entities where a business schema column is referenced directly or via other dependent entities as applicable.
  • Starting with the 2022.9.0 release, you can track the dependencies of formula columns in physical schema tables and materialized views (MVs).
  • Starting with the 2023.4.0 release, the Data Lineage Viewer differentiates between columns referenced in an insight and columns referenced in dashboard filters or prompts. Both dependent insights and dashboards appear under Dependent Dashboards. When the column is referenced in dashboard filters or prompts, only the dashboard name appears. When the column is referenced in an insight, whether as a column or a filter, the dashboard name, tab name, and insight name (if any) appear. In addition, the tabular view shows the exact dependency type: Dashboard or Insight.

Data Lineage Viewer permissions and access rights

A user that belongs to a group with the SuperRole or the Schema Manager role can access the Data Lineage Viewer for a column in a given physical schema object or business view if the user owns or has at least View access rights to the physical schema or business schema.

If the Enable Super User Mode option is enabled in the Cluster Management Console (CMC) → Tenant Configurations → Security, the Super User that is a Tenant Administrator and any user with the SuperRole can access the Data Lineage Viewer for columns in all physical schema objects and business views regardless of the access rights.
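The two access rules above can be read as a single decision. The following is a minimal sketch of that logic under stated assumptions: the `User` class, its fields, and the `can_open_lineage` function are hypothetical names introduced for illustration and are not part of any Incorta API.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical stand-in for a user and the roles granted via groups."""
    roles: set = field(default_factory=set)
    is_tenant_admin_super_user: bool = False


def can_open_lineage(user, has_schema_access, super_user_mode_enabled):
    """Return True if the user may open the Data Lineage Viewer for a column.

    has_schema_access: the user owns the physical or business schema, or has
    at least View access rights to it.
    """
    # With Super User Mode enabled, the Super User and any SuperRole user
    # get access regardless of access rights.
    if super_user_mode_enabled and (
        user.is_tenant_admin_super_user or "SuperRole" in user.roles
    ):
        return True
    # Otherwise, both a qualifying role and schema access are required.
    has_role = bool(user.roles & {"SuperRole", "Schema Manager"})
    return has_role and has_schema_access
```

For example, `can_open_lineage(User(roles={"Schema Manager"}), False, True)` evaluates to `False`, because Super User Mode does not extend to the Schema Manager role.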

Access the Data Lineage Viewer

To access the Data Lineage Viewer of a column in a physical schema object:

  • Open the Tables Editor for the respective object.
  • On the Columns tab, hover over the column you want, and then select the DNA icon.

To access the Data Lineage Viewer of a column in a business view:

  • Open the Business Schema Designer of the respective business schema in the view mode.
  • Expand the business view you want.
  • In the canvas, hover over the column you want, and then select the DNA icon.

Data Lineage Viewer anatomy

The Data Lineage Viewer consists of the following:

Action bar

The action bar shows the following:

  • Dependent entity type list: Shows the available types of dependent entities: Physical Schema, Business Schema, and Dashboard. Select an option to view the respective dependent entities.
  • Diagram view (partition icon): Shows the data lineage diagram. Selected by default.
  • Tabular view (table icon): Shows the data lineage list.
  • Search box: Enter a search term to find dependent entities on the data lineage diagram. Select an item from the result list to highlight it on the diagram. For now, search is not supported in the tabular view.
  • Export (download icon): Downloads the data lineage list in a CSV file format.

Data lineage diagram

  • The data lineage diagram shows the granularity of dependent entities as nodes.
  • The diagram shows a header that indicates the node type. The header titles may vary according to the selected entity type on the action bar.
  • Each node that represents a dependent entity has an icon that denotes the entity type. For example, physical schemas are denoted by a hierarchy diagram, dashboards by a sliced pie, join columns by a link icon, and formula columns by a flask.
  • The column node shows the total number of the column’s dependent entities, and each node shows the number of child nodes if any.
  • As applicable, hover over a node and select the plus icon to expand the child nodes. When you expand a node, the entity name appears in bold and is displayed under the node header. In addition, the connecting lines become bold.
  • To hide a child node, select the X icon in the respective node header.
  • Nodes for dependent dashboards have an information icon to show the dashboard owner and path.

Data lineage list

The data lineage list shows dependent entities categorized into three types: Physical Schema, Business Schema, and Dashboard. Select the entity type from the drop-down list on the action bar to display the related dependent entities. The details vary according to the dependent entity type.

Data lineage details

The Data Lineage Viewer shows different details in the diagram view and the tabular view depending on the selected entity type. The Data Lineage Viewer lists all references to the column or its dependent entities in physical schema objects, business schema objects, and dashboards.

The following are the details categorized by the dependent entity type.

Note

The available dependent entities may vary according to the parent object type where the column exists. For example, a column in an Incorta Analyzer table cannot have a dependent column from an Incorta SQL table.

Dependent physical schemas

Dependent physical schema entities can be any of the following:

  • Columns in physical schema tables, MVs, Incorta SQL tables, Incorta Analyzer tables, or aliases
  • Formula columns in physical schema tables, MVs, Incorta SQL tables, Incorta Analyzer tables, or aliases
  • Join columns
  • Runtime security filters in physical schema tables, MVs, Incorta SQL tables, or Incorta Analyzer tables
  • Load filters in physical schema tables or MVs

The Data Lineage Viewer shows the following details, as applicable, for each dependent entity in a physical schema. Each property is listed with its description and comments.

  • Column Name
    • Description: The name of the column that you select to show its dependent entities.
    • Comments: This field is displayed in the CSV file in the fully qualified column name format: physical_schema.object.column.
  • Dependency Type
    • Description: The type of the dependent entity:
      • Alias column
      • Physical column
      • Physical formula column
      • Incorta Table (Analyzer Table)
      • Incorta Table column
      • Incorta Table formula column
      • Incorta MV table
      • Incorta MV table column
      • Incorta MV table formula column
      • Incorta SQL Table
      • Incorta SQL Table column
      • Join column (condition column or filter column)
      • Load Filter node
      • Security Filter node
    • Comments: On the diagram, the Dependent Column node shows an icon that indicates the dependency type:
      • Data columns: the letter A regardless of the column data type and the object type
      • Formula columns: a flask icon
      • Load and security filters: a filter icon
      • Join columns: a chain icon
  • Dependent Schema
    • Description: The name of the physical schema where the selected column or one of its dependent entities is referenced.
    • Comments: There is no dedicated field for the dependent schema in the CSV file. However, it is implicitly indicated in the fully qualified name of the dependent object and column.
  • Dependent Table
    • Description: The name of the dependent physical schema object.
    • Comments: When a column is referenced as a data column in an MV, Incorta SQL Table, or Incorta Analyzer Table, the dependent object itself has a dedicated row in addition to the row of the dependent column. On the other hand, when a column is referenced in one of these objects as a filter or in a condition, only the row of the dependent object exists.
  • Dependent Column
    • Description: The name of the dependent column, if any. This field is empty if the dependency type is an object or a join column.
    • Comments: In the case of a dependent column, this field is displayed in the CSV file in the fully qualified column name format: physical_schema.object.column.

Dependent business schemas

The Data Lineage Viewer lists all references to the column or its dependent entities in business schema objects. Dependent entities can be any of the following:

  • Columns in business schema views, Analyzer Views, or SQL Views
  • Formula columns in business schema views or Analyzer Views

The Data Lineage Viewer shows the following details, as applicable, for each dependent entity in a business schema. Each property is listed with its description and comments.

  • Column Name
    • Description: The name of the column that you select to show its dependent entities.
    • Comments: This field is displayed in the CSV file in the fully qualified column name format: physical_schema.object.column.
  • Dependency Type
    • Description: The type of the dependent entity:
      • Business View column
      • Business View formula column
      • Incorta View (Analyzer View)
      • Incorta View column
      • Incorta View formula column
      • Incorta SQL View
      • Incorta SQL View column
    • Comments: On the diagram, the Dependent Column node shows an icon that indicates the dependency type:
      • Data columns: the letter A regardless of the column data type and the object type
      • Formula columns: a flask icon
  • Dependent Schema
    • Description: The name of the business schema where the selected column or one of its dependent entities is referenced.
    • Comments: There is no dedicated field for the dependent business schema in the CSV file. However, it is implicitly indicated in the fully qualified name of the dependent view and column.
  • Dependent View
    • Description: The name of the dependent business view.
    • Comments: When a column is referenced as a column in an Incorta Analyzer View or Incorta SQL View, or as a formula column in an Incorta Analyzer View, the dependent view itself has a dedicated row in addition to the row of the dependent column.
  • Dependent Column
    • Description: The name of the dependent column.
    • Comments: This field is displayed in the CSV file in the fully qualified column name format: business_schema.view.column.

Dependent dashboards

The Data Lineage Viewer lists all references to the column or its dependent entities in dashboards. The column or its dependent entities can be referenced as a column or in a formula column in any of the following:

  • Insight trays (depending on the visualization type)
    • The Measure, Grouping Dimension, and Coloring Dimension trays in chart visualizations, and any visualization-specific trays, such as the Row or Column trays in Pivot Tables and Target and Source trays in Sankey insights
    • Filter trays, including Individual Filters, Distinct Filters, and Aggregate Filters
    • The Sort By tray
  • Dashboard Filters
    • Prompts
    • Applied Filters
    • Filter Options

Starting with the 2023.4.0 release, the Data Lineage Viewer differentiates between columns referenced in an insight and columns referenced in dashboard filters or prompts.

The Data Lineage Viewer shows the following details, as applicable, for each dependent dashboard or insight. Each property is listed with its description and comments.

  • Column Name
    • Description: The name of the column that you select to show its dependent entities.
    • Comments: This field is displayed in the CSV file in the fully qualified column name format: physical_schema.object.column.
  • Dependency Type
    • Description: Before 2023.4.0, the dependency type is always “Dashboard” regardless of how the column is referenced in the dashboard. Starting with 2023.4.0, it shows either Dashboard or Insight depending on the reference type.
  • Dashboard Name
    • Description: The name of the dependent dashboard.
    • Comments: Starting with 2023.4.0, if the column is referenced in an insight (whether as a column or a filter), the dashboard name, tab name, and insight name (if any) appear.
  • Dashboard Path
    • Description: The path to the folder where the dependent dashboard exists in the Content Manager (Catalog).
    • Comments: A single slash (/) means that the dashboard is not in a folder; it exists at the root of the Content Manager. On the diagram, point to the information icon on the Dependent Dashboard node to show the dashboard path.
  • Dashboard Owner
    • Description: The owner or creator of the dependent dashboard.
    • Comments: On the diagram, point to the information icon on the Dependent Dashboard node to show the dashboard owner.

Important

When you create or update objects in a physical schema, you must save the changes to a published version to have the dependency lists of referenced columns and their source columns updated.

After you successfully create an MV that references an object and save the changes, the dependency list of the referenced columns or their dependent entities will be automatically updated to show the dependent MV.

Note

In the CSV file, the names of the physical schema, object, and column are combined into one fully qualified column name: physical_schema.object.column. Likewise, the names of the business schema, view, and column are combined as business_schema.view.column. In addition, dependent dashboards are displayed in the following format: full_path/dashboard_name.
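If you process the exported values in a script, the combined names can be split back into their parts. The following is a minimal sketch that assumes schema and object names contain no dots and that a dashboard at the root appears as `/dashboard_name`; both assumptions, as well as the function names and sample values, are hypothetical and should be checked against a real export.

```python
def split_qualified_name(name):
    """Split 'schema.object.column' into (schema, object, column).

    Assumes the schema and object names contain no dots. A value with fewer
    than three parts raises ValueError.
    """
    schema, obj, column = name.split(".", 2)
    return schema, obj, column


def split_dashboard_path(value):
    """Split 'full_path/dashboard_name' into (folder_path, dashboard_name).

    A dashboard at the root of the Content Manager yields '/' as the folder.
    """
    folder, _, dashboard = value.rpartition("/")
    return (folder or "/", dashboard)


# Hypothetical sample values
parts = split_qualified_name("sales.orders.amount")
location = split_dashboard_path("/Finance/Q1/Revenue")
```

Here `parts` holds the schema, object, and column as three separate strings, and `location` holds the folder path and the dashboard name.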

Data Lineage Viewer actions

With the Data Lineage Viewer, you can do the following:

Change between dependent entity types

  • In the Data Lineage Viewer, on the action bar, select the entity type.
    • Physical Schema
    • Business Schema
    • Dashboard

Change between diagram and tabular views

  • In the Data Lineage Viewer, on the action bar, select the required view.
    • Select the partition icon to view the data lineage diagram
    • Select the table icon to view the data lineage list

Search for a dependent entity

For now, the search feature is supported in the diagram view only.

  • In the Data Lineage Viewer, on the action bar, in the search box, enter a search term.
  • In the result list, select an item. The nodes of the selected entity are highlighted and expanded.

Download the column dependency list

  • In the Data Lineage Viewer, on the action bar, select the download icon. A CSV file with the fully qualified name of the respective column is downloaded to the default download directory.
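Once downloaded, the CSV file can be summarized with a few lines of Python. The following is a minimal sketch under stated assumptions: the header names and the rows in `sample` are hypothetical and only imitate an export, so adjust `type_header` to match the headers in your actual file.

```python
import csv
import io
from collections import Counter

# Hypothetical sample that imitates an exported file. The real header names
# and values may differ.
sample = """Column Name,Dependency Type,Dependent Column
sales.orders.amount,Physical formula column,sales.orders.amount_with_tax
sales.orders.amount,Join column,
sales.orders.amount,Physical formula column,sales.orders.net_amount
"""


def count_by_type(csv_file, type_header="Dependency Type"):
    """Count the rows of a dependency list per dependency type."""
    reader = csv.DictReader(csv_file)
    return Counter(row[type_header] for row in reader)


counts = count_by_type(io.StringIO(sample))
```

To run the same count on a real export, open the file with `open(path, newline="")` and pass the file object to `count_by_type` instead of the in-memory sample.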

Feature limitations and known issues

The Data Lineage Viewer does not track references to the column in the following contexts:

  • Result sets or insights that are created over result sets
  • Session or global variables
  • Dashboard presentation variables
  • The base field of a measure in a dashboard insight
  • The filter column of a measure in a dashboard insight
  • The sorting column of a dimension in a dashboard insight
  • The sorting column of a dashboard prompt
  • SQL Views that are created over SQL Tables
  • MVs that don't depend on physical tables that exist in Incorta

The following is a list of the feature's known issues.

  • When you only copy a dependent dashboard, the new copy does not appear on the dependency list until you make a change to it, such as editing or renaming the dashboard, or moving it into or out of a folder.
  • After updating the data source of a dependent object, the dependency list does not accurately reflect the changes.
  • There might be some UI issues with huge dependency lists.
  • The search feature introduced in the 2022.7.0 release is not supported in the tabular view.
  • If you create your own data in an MV, read data from Parquet, or reference a physical table column as an alias (using the as keyword), the Data Lineage Viewer throws an error.

The following is a list of the feature's known issues in releases before 2022.6.0.

  • Dependent SQL Views appear under the Incorta View dependency type rather than their own type.
  • Updating a physical table formula column is not reflected in the dependency list.
  • After importing a physical schema, the dependency lists of all objects are missing the entries of all dependent MVs and MV-related entities, even after loading data. In addition, displaying the dependency list of an MV column throws errors.