Tools → Data Lineage Viewer v2

About

This new version of the Data Lineage Viewer is available as a preview feature. It promotes data governance and analytics catalog capabilities in the Incorta platform. You can now track and identify both the source (upstream lineage) and the destination (downstream lineage) of data. Upstream data lineage refers to the entities that the current entity references and is impacted by, while downstream data lineage refers to the entities that reference the current entity.

Additionally, you can track the data lineage of not only columns and formula columns but also physical schema objects, runtime business views, dashboards, insights, and session and global variables. Starting 2024.7, Incorta has enhanced the data lineage reports for supported entities to include the underlying data sources. You can also access the lineage of external data sources, which visualizes where an external data source is utilized across different layers, from physical schema objects down to business schema views, columns, variables, dashboards, and insights.

The new version also comes with an enhanced user experience (UX) that displays a multi-level, hierarchical data lineage diagram. With the new version, you can determine the relationships between entities and objects in your system and identify the impacted areas when you update or delete an entity, such as a column, table, view, or variable.

Note

In this document, the word “table” refers to all kinds of objects in a physical schema (physical tables, materialized views (MVs), Incorta Analyzer tables, Incorta SQL tables, and aliases) while the word “view” refers to the three types of runtime business views (business schema views, Analyzer views, and PostgreSQL (Incorta SQL) views).

Data Lineage Viewer Access Permissions

The Data Lineage Viewer can display the data lineage of different entity types in the system, including data sources, tables, views, dashboards, insights, session variables, global variables, and columns and formula columns in tables and views. If you can access the respective tool and have at least view access rights to an entity, you can access its data lineage. For example, if you can access the Business Schema Manager and own or have access to a business schema, you can view the data lineage of all views in this business schema and of their columns and formula columns, as applicable. This applies to all entity types except global variables, for which you only need access to the Schema Manager.
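The access rule above can be read as a simple predicate. The following is a minimal Python sketch of that reading, not Incorta's implementation: the `User` class, the `can_view_lineage` function, and the tool and entity-type labels are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    # Tools the user can open (labels are illustrative, not product identifiers).
    tools: set = field(default_factory=set)
    # Names of entities the user owns or has at least view access to.
    viewable: set = field(default_factory=set)


def can_view_lineage(user, entity_name, entity_type, tool):
    """Return True if the rule described above would grant lineage access."""
    # The user must be able to open the tool that exposes the entity.
    if tool not in user.tools:
        return False
    # Global variables are the exception: tool access alone is enough.
    if entity_type == "global_variable":
        return True
    # Every other entity type also needs at least view access to the entity.
    return entity_name in user.viewable


# Hypothetical user with access to one tool and one business schema.
analyst = User(tools={"Business Schema Manager"}, viewable={"Sales"})
print(can_view_lineage(analyst, "Sales", "business_schema", "Business Schema Manager"))
print(can_view_lineage(analyst, "Finance", "business_schema", "Business Schema Manager"))
```

In this sketch, access to a business schema stands in for access to all of its views and their columns, which mirrors the example in the paragraph above.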

The steps to access the Data Lineage Viewer v2 vary according to the type of entity whose lineage you want to view.

  • To view the data lineage of a data source (available starting 2024.7):
    • On the navigation bar, select Data.
    • In the Data Manager, in the Context bar, select the External Data Sources tab.
    • In the list view of external data sources, for a given data source, select Lineage.
  • To view the data lineage of a table:
    • On the navigation bar, select Schema, and then select the physical schema.
    • In the Schema Designer > Tables tab, for the table you want, select More Options (vertical ellipsis icon ⋮), and then select Open Lineage.
  • To view the data lineage of a column or formula column in a table:
    • Open the physical schema, and then select the table you want.
    • In the Table Editor > Columns tab, for the column or formula column you want, in the Data Lineage column, select Lineage.
  • To view the data lineage of a view:
  • To view the data lineage of a column or formula column in a view:
    • Open the business schema, and then expand the view you want.
    • For the column or formula column you want, in the Data Lineage column, select Lineage.
  • To view the data lineage of a session variable:
    • On the navigation bar, select Schema > Session Variables.
    • For the session variable you want, in the Data Lineage column, select Lineage.
  • To view the data lineage of a global variable:
    • On the navigation bar, select Schema > Global Variables.
    • For the global variable you want, in the Data Lineage column, select Lineage.
  • To view the data lineage of a dashboard:
    • On the navigation bar, select Content, and then do one of the following:
      • On the Dashboards list, for the dashboard you want, select More Options (vertical ellipsis icon ⋮), and then select Open Lineage.
      • On the Dashboards list, select the dashboard you want. In the Dashboard Manager, select More Options (vertical ellipsis icon ⋮), and then select Open Lineage.
  • To view the data lineage of an insight:
    • On the navigation bar, select Content, and then on the Dashboards list, select the dashboard you want.
    • In the Dashboard Manager, for the insight you want, select More Options (vertical ellipsis icon ⋮), and then select Open Lineage.

Anatomy of the Data Lineage Viewer

The Data Lineage Viewer consists of the following:

  • Title bar
  • Overview window
  • Diagram canvas
  • Diagram legend

Title bar

The title bar shows:

  • The name of the entity whose data lineage you are previewing.
  • An X icon that you can select to close the Data Lineage Viewer.

Overview window

The Overview window provides a high-level view of the entire data lineage diagram. Use the Overview window to focus on a specific part of the diagram. Drag the light-blue rectangle or change its width and height to set or change the focus.

You can change the size of the Overview window or you can hide it.

  • To increase the size of the Overview window, select maximize (diverging arrows icon).
  • To decrease the size of the Overview window, select minimize (converging arrows icon).
  • To collapse the Overview window, select v.
  • To expand the Overview window, select >.

Diagram canvas

The diagram canvas shows the data lineage of the respective entity in a multi-level, hierarchical form.

  • The Origin node represents the current entity, whose data lineage the diagram shows.
  • Arrows represent data flow, where the base of the arrow is the data source, and the head of the arrow represents the data destination.
  • Nodes to the left of the Origin node represent the source entities that the current entity references. Updating or deleting any of them impacts the current entity. These nodes make up the upstream data lineage.
  • Nodes to the right of the Origin node represent the entities that reference the current entity. Updating or deleting the current entity impacts them. These nodes make up the downstream data lineage.

The diagram by default shows only one level for each of the upstream and downstream data lineage, which are the direct lineage of the current entity. You can select the plus or minus icons to expand or collapse additional levels respectively.
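The diagram structure described above can be modeled as a directed graph in which each arrow is a (source, destination) pair. The following Python sketch is only an illustration under that assumption and does not reflect how Incorta stores or computes lineage. The function names and the sample entity names are hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Index (source, destination) data-flow edges in both directions."""
    upstream = defaultdict(set)
    downstream = defaultdict(set)
    for src, dst in edges:
        downstream[src].add(dst)  # src feeds dst
        upstream[dst].add(src)    # dst references src
    return upstream, downstream


def lineage_levels(neighbors, origin, max_levels=1):
    """Breadth-first walk from origin; returns {level: sorted entity names}."""
    levels = {}
    seen = {origin}
    frontier = [origin]
    for level in range(1, max_levels + 1):
        next_frontier = []
        for node in frontier:
            for other in neighbors.get(node, ()):
                if other not in seen:
                    seen.add(other)
                    next_frontier.append(other)
        if not next_frontier:
            break  # no further levels to expand
        levels[level] = sorted(next_frontier)
        frontier = next_frontier
    return levels


# Hypothetical entities: a data source feeds a table, which feeds a view,
# which is used by a dashboard.
edges = [
    ("sales_db", "orders"),
    ("orders", "orders_view"),
    ("orders_view", "revenue_dashboard"),
]
upstream, downstream = build_graph(edges)

# Default: one level in each direction (the direct lineage).
print(lineage_levels(upstream, "orders_view"))       # {1: ['orders']}
print(lineage_levels(downstream, "orders_view"))     # {1: ['revenue_dashboard']}
# Expanding one more upstream level, like selecting the plus icon.
print(lineage_levels(upstream, "orders_view", 2))    # {1: ['orders'], 2: ['sales_db']}
```

Raising `max_levels` corresponds to expanding additional levels in the diagram, and the downstream result at full depth is the set of entities that an update or deletion of the origin could impact.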

For dashboard and insight nodes, you can hover over the node and select the square with an arrow icon to open the respective dashboard in another browser tab. You can open only the dashboards that you have access rights to; otherwise, an error message appears when you try to open the dashboard.

Note

The data lineage is automatically updated if you apply changes to related entities. However, there might be some latency when an underlying process is still running. For example, when a schema model update job is still running, this might cause the lineage not to show recent changes on the related physical schemas.

Diagram legend

A legend key exists at the bottom of the diagram, indicating the physical schema object’s entity type by color.

Here are the entity types in the diagram legend and corresponding colors.

  • Physical schema table - blue
  • MV - green
  • Alias - red
  • Incorta Analyzer table - orange
  • Incorta SQL table - purple

Additionally, the diagram shows icon codes that indicate the type of other entities, including dashboards, insights, runtime business views, variables, and data sources.

Known issues and limitations

  • The following entities that reference a business schema view won’t appear on its lineage diagram:
    • PostgreSQL (Incorta SQL) views
    • SQL tables
    • PostgreSQL MVs
  • If the Analyzer view is referenced in a PostgreSQL (Incorta SQL) view, the PostgreSQL view won’t appear on the lineage diagram of the Analyzer view, and vice versa. As a result, the diagram of the PostgreSQL view may not show any node for the upstream lineage.
  • Renaming physical schema objects and runtime business views is not reflected in their dependent objects, and accordingly not in the data lineage, which may cause the diagram to display incorrect dependencies. For example, if you change the name of a business schema view referenced in an Analyzer view, the business schema view won’t appear when you check the lineage of the Analyzer view, and vice versa.
  • Data source lineage for external session variables is not currently supported.
  • Folders and local files share the same icon.