Concepts → Incorta SQL Table

About an Incorta SQL Table

An Incorta SQL Table is a new type of derived table that a schema developer creates in a physical schema using a SELECT statement. It uses a new SQL engine that orchestrates complex query execution using the existing engine as a query processor.

The SELECT statement also supports the use of the WITH clause for a non-recursive Common Table Expression (CTE).

The Loader Service automatically determines the load order for the Incorta SQL table.

Note

Starting with the 2023.1.0 release, the Incorta SQL table is a fully supported General Availability (GA) feature. Before this release, it was a labs feature.

Object data lineage

Starting with 2023.7.0, you can view the data lineage of Incorta SQL tables in the Data Lineage Viewer v2 that displays a diagram with the entities referenced in the SQL table (upstream lineage) and entities where the SQL table is referenced (downstream lineage). Additionally, you can view the data lineage per column in the SQL table.

Release 2021.3.2 updates

With the introduction of the Incorta Cloud 2021.3.2 release, the Loader Service materializes Incorta SQL tables to parquet files rather than Direct Data Mapping (ddm) files. As a result of materializing Incorta SQL tables to parquet files, external tools, such as Tableau and Power BI, now have better support to access Incorta SQL tables through SQL Interface (SQLi). In addition, you can reference columns in Incorta SQL tables in materialized views in other physical schemas.

The Loader Service creates a directory under the source directory for each Incorta SQL table once created. As Incorta SQL tables do not support incremental loads, the Loader Service creates a new parquet version directory for each of these tables during a load job or during a schema update job that impacts the columns referenced in the Incorta SQL table.

The Loader Service creates a new parquet file for an Incorta SQL table in the following cases:

  • When you create or update the Incorta SQL table
  • When you update the source table column(s) that you reference in the Incorta SQL table in the same physical schema
  • When you update the source table column(s) that you reference in the Incorta SQL table in another physical schema and load the Incorta SQL table
  • When you delete the source table of the Incorta SQL table in the same physical schema (resulting in an empty parquet file)
  • When you delete the source table column(s) that you reference in the Incorta SQL table in another physical schema and load the Incorta SQL table (resulting in an empty parquet file)
Note

When you delete (or update) the source table that you reference in an Incorta SQL table in another physical schema, the Loader Service will NOT automatically update the Incorta SQL table. However, when you load the Incorta SQL table that references a deleted source table, whether in the same or another physical schema, the load fails and the Loader Service writes an empty parquet file for the Incorta SQL table.

Important

After upgrading to release 2021.3.2, you must load all physical schemas with Incorta SQL tables at least from staging.

Considerations for an Incorta SQL Table

Here are some important considerations for an Incorta SQL Table:

  • All referenceable objects must be performance optimized physical schema tables or materialized views.
  • There is no support for an incremental load.
  • There is no fallback routing for SQL processing to Apache Spark.
  • It is not possible to specify a load order for the table in a physical schema.
  • It is not possible to specify a load filter.
  • It is not possible to specify a key column (starting with release 2021.3.2).
  • It is not possible to specify an encrypted column (starting with release 2021.3.2).
  • It is not possible to specify a self join relationship (starting with release 2021.3.2).
  • You must not disable performance optimization for Incorta SQL tables.
  • It is a single-source table.

Referenceable objects

In the FROM clause, you can reference the following physical schema objects:

  • Physical_Schema.Alias
  • Physical_Schema.Table
  • Physical_Schema.Materialized_View
  • Business_Schema.View
Important

Starting with the 2021.4.3 release, Incorta SQL table supports referencing Busineses Views only if the Business Views are flat and not referencing other Incorta views or Incorta tables.

It is not possible to reference the following objects:

  • Business_Schema.Incorta_View
  • Physical_Schema.Incorta_Analyzer_Table
  • Physical_Schema.Incorta_SQL_Table

Join Types

Here are the join types for schema tables and SQL queries:

  • CROSS JOIN
  • INNER JOIN
  • FULL OUTER
  • LEFT OUTER
  • RIGHT OUTER

Join Conditions

Here are the supported join conditions:

  • table1.column1 (<), (<=), (>), (>=), (=) table2.column1
  • table1.column1 (<), (<=), (>), (>=), (=), (<>) table1.column1
  • table1.column (<), (<=), (>), (>=), (=), (<>) constant
  • table1.column IN (constant1, … constantN)
  • table1.column BETWEEN constant1 AND constant2
  • table1.column LIKE ‘pattern’ ['escape']
  • table1.column is [NOT] NULL
  • startsWith(table1.column, ‘STRING’)
  • endsWith(table1.column, ‘STRING’)
  • contains(table1.column, ‘STRING’)

Supported predicates between join conditions and filters

The following are supported predicates between join condition and filter

  • AND
  • OR

Data types for Join Constraints

The following conversion table describes how an Incorta SQL Table handles a join constraint where the data types differ for the join columns.

long or integerDOUBLEdatetimestampstring
long or integerlongdoublelonglongdouble
doubledoubledoubledouble*
datelonglonglongstring
timestamplonglonglongstring
stringdouble*double*stringstringstring
Note

For a join between a string that stores a numeric value and a numeric column, the string converts into a double.

Set operators

Here is a list of supported set operators:

  • UNION
  • UNION ALL

Group filters

Here is a list of supported Group filters:

  • HAVING

Row filters

Here is a list of supported Row filters:

  • column (<), (<=), (>), (>=), (=), (<>) constant
  • column [NOT] IN (constant1, … constantN)
  • column BETWEEN constant1 AND constant2
  • column is [NOT] NULL
  • column [NOT] LIKE

In addition to the standard row level filters, you can use specify in a WHERE clause the following:

  • [NOT] EXISTS (SELECT… )
  • [NOT] IN (SELECT… )
  • WHERE column (<), (<=), (>), (>=), (=) ANY(SELECT… correlated subquery)
  • WHERE column (<), (<=), (>), (>=), (=) ALL(SELECT… non-correlated subquery)

Aggregation functions

Here is a list of supported aggregations functions

  • AVG()
  • COUNT()
  • COUNT(DISTINCT )
  • MIN()
  • MAX()
  • STDDEV()
  • STDDEV_SAMP()
  • STDDEV_POP()
  • SUM()
  • VARIANCE()
  • VAR_SAMP()
  • VAR_POP()

Window functions

A window function performs a calculation across a set of table rows that are somehow related to the current row. An Incorta SQL Table supports windows functions such as

  • OVER ()
  • OVER (PARTITION BY column)
  • OVER (ORDER BY column)
  • OVER (PARTITION BY column ORDER BY column)

You use these aggregate functions in a window function:

  • AVG(), MIN(), MAX(), SUM(), LEAD(), and LAG()

You can use the advanced analytics functions in a window function:

  • DENSE_RANK(), RANK(), and INDEX()

Considerations when using a window function

  • The syntax of any of the DENSE_RANK(), RANK(), and INDEX() window functions is as follows:
function() OVER (
PARTITION BY dimension
ORDER BY (AggregateFunction(measure))
ASC/DESC
)
  • You can use these window functions only in aggregated queries. Thus the query should include a GROUP BY clause with all columns in the SELECT clause and the PARTITION BY parameter.
  • The ORDER BY parameter must include an aggregate function.

Example of a Windows functions query

Here is an example of a windows function query using the HR physical schema:

SELECT
HR.Employees.DEPARTMENT_ID,
HR.Employees.Job_Title,
SUM(HR.Employees.SALARY) as "Total_Salary",
(
RANK() OVER (
PARTITION BY HR.Employees.DEPARTMENT_ID
ORDER BY
(SUM(HR.Employees.SALARY)) ASC
)
) AS "Rank_by_Salary"
FROM
HR.Employees
GROUP BY
HR.Employees.DEPARTMENT_ID,
HR.Employees.Job_Title

Common Table Expressions

An Incorta SQL Table supports a non-recursive CTE for a SELECT statement.

The WITH keyword signifies a CTE, followed by a CTE name and the body of the CTE, a SELECT statement. Optionally, a list of column names can be specified in the CTE SELECT statement. A primary SELECT statement references the CTE by name in a FROM clause. Here is a an example:

WITH cte_analysts AS
( SELECT * FROM tbl_Employees
WHERE dept = 'Finance' )
SELECT * FROM cte_analysts
WHERE ...

A non-recursive CTE is more readable than a nested SELECT statement such as:

SELECT v1.* FROM (
SELECT * FROM tbl_Employees
WHERE dept = 'Finance') v1

A WITH query can only reference sibling WITH queries that are earlier in the WITH list. A WITH query evaluates only once per execution of the primary query, even if the primary query refers to WITH query more than once.

Examples of a CTE

Here are is an example of a year-over-year comparison using the SALES physical schema:

year-over-year

WITH sales_product_year AS (
SELECT t1.PROD_ID, YEAR(t1.TIME_ID) AS year,
SUM(t1.AMOUNT_SOLD) AS total_amt
FROM SALES.SALES t1
GROUP BY t1.PROD_ID, YEAR(t1.TIME_ID)
)
SELECT t2.PROD_ID, t2.total_amt as cur_total_amount, t2.year as cur_year, t3.total_amt as prev_total_amt, t3.year as prev_year
FROM sales_product_year t2 INNER JOIN
sales_product_year t3 ON t2.PROD_ID = t3.PROD_ID
AND t2.year = t3.year + 1
ORDER BY t2.PROD_ID, t2.year

Steps to enable the Incorta SQL Table in the CMC

Here are the steps to enable this option as the default tenant configuration in the CMC:

  • Sign in to the CMC as the CMC Administrator.
  • In the Navigation bar, select Clusters.
  • In the cluster list, select a Cluster name.
  • In the canvas tabs, select Cluster Configurations.
  • In the panel tabs, select Default Tenant Configurations.
  • In the left pane, select Incorta Labs.
  • In the right pane, toggle on Enable Incorta SQL Table.
  • Select Save.

Here are the steps to enable this option for a specific tenant configuration in the CMC:

  • Sign in to the CMC as the CMC Administrator.
  • In the Navigation bar, select Clusters.
  • In the cluster list, select a Cluster name.
  • In the canvas tabs, select the Tenants tab.
  • In the Tenant list, for the given tenant, select Configure.
  • In the left pane, select Incorta Labs.
  • In the right pane, toggle on Enable Incorta SQL Table.
  • Select Save.