References → SQL interface (SQLi)
About the SQL interface (SQLi)
The Analytics Service exposes all tenants in an Incorta cluster as individual databases over the SQL interface (SQLi) using the PostgreSQL protocol.
The PostgreSQL protocol supports a JDBC connection using the appropriate PostgreSQL JDBC driver. With the SQLi, an external application such as Microsoft Power BI or Tableau can query a tenant database using Structured Query Language (SQL).
Internally, the SQLi manages how to process an external SQL query. Based off the query shape, the SQLi determines if the in-memory query engine of the Analytics Service processes the query or if Apache Spark processes the query using the SQL App and Spark SQL.
If you are connecting to your data source using a recent version of Looker Studio, the Show Table feature does not work as it uses default schema that is named public. For more information, refer to Looker official documentation.
About SQL query shapes
To learn more about the typical query shapes, data models, scale sizes, and benchmark queries, visit tpc.org.
Here are few that are relevant to data warehousing and analytics related benchmarks that most professional in the industry find relevant:
Name | Link |
---|---|
TPC-DS | http://tpc.org/tpcds/default5.asp |
TPC-H | http://tpc.org/tpch/default5.asp |
Analytics Service native support for Left Outer Joins
By default, the Loader Service builds a Direct Data Map using LEFT OUTER JOINS
for a table relationships in a physical schema. For this reason, the Analytics Service in-memory engine supports SQL query shapes that feature LEFT OUTER JOINS
in the SQL statement.
Spark SQL
Apache Sparks supports numerous query shapes including various JOIN types natively:
INNER JOIN
LEFT JOIN
RIGHT JOIN
FULL JOIN
CROSS JOIN
SEMI JOIN
ANTI JOIN
To learn more, please review Apache Spark SQL Join Syntax
External integrations often require robust JOIN support for various query shapes.
Joins are not recommended when there are no relationships between the tables in the query.
For example, using a CROSS JOIN (Cartesian join) can produce very large result sets, which may lead to performance issues, especially with large tables. When it’s not recommended:
- With large data sets: Since a Cartesian join returns every possible combination of rows from two tables, the result set can grow exponentially if the tables are large, which may degrade performance.
- Unintentional Usage: Often, a CROSS JOIN occurs unintentionally when the query does not include a WHERE or JOIN condition, for example,
SELECT * FROM products, employees
. This can result in an unexpectedly large number of rows being returned.
SQli ports
There are two default ports for the SQLi.
5436
directs external queries to the SQLi interface5442
directs external queries to the Apache Spark
The SQLi directs a query that conform to a LEFT OUTER JOIN
shape to the Analytics Service. For all other query shapes, the SQLi directs the query to Apache Spark using Spark SQL. A listener application receives and process the job. The listener application is com.incorta.sqlApp.App
.