References → Advanced SQL Interface (SQLi)
Overview
The current SQL interface presents challenges due to its incomplete adherence to SQL standards. Some queries encounter difficulties and fall back to Spark, which may result in syntax errors. Additionally, the existing SQL Interface incorporates any defined views within Incorta, even those not SQL-compliant, potentially leading to errors when non-compliant queries revert to Spark.
We are introducing the new Advanced SQL Interface that aligns with Spark SQL compliance in response to these identified shortcomings. This pivotal feature offers an improved and more seamless experience.
In this release, Incorta is introducing a new advanced SQLi that is fully Spark SQL compliant. With this compliance, the new SQLi introduces enhanced performance and compatibility with more external tools. You can also use the advanced SQLi with Incorta using the Kyuubi connector.
Incorta unveiled the advanced SQLi for on-prem installations starting the 2024.1.3 release.
The Advanced SQL Interface is not available on Azure cloud deployments.
Key features of the Advanced SQL Interface
- Compatibility with External Tools: Seamless integration with popular tools like Tableau, Power BI, and other SQL tools like DBVisualizer.
- Spark SQL compliance: Enables the composition of Spark SQL queries against physical tables or verified business views, ensuring adherence to Spark SQL standards.
- Utilization of Spark 3.4.1: Incorporates Spark 3.4.1 for enhanced performance, leveraging the latest advancements in the Spark framework.
What will happen when Advanced SQL is activated
- Before 2024.7.3, the Null Handling feature was automatically enabled internally, irrespective of user-defined values. Starting 2024.7.3, Null Handling is an independent feature, separate from Advanced SQL Interface (Advanced SQLi). You can enable or disable it as needed. For more details, refer to Advanced SQLi and Null Handling.
- The Loader and Analytics Services must be restarted.
Advanced SQL Interface connectivity
The first connection attempt to the Advanced SQLi may take some time to initialize the necessary resources.
The connection with the Advanced SQLi can be through the Binary port or the HTTP port. You can find these ports by hovering over the Advanced SQLi service or editing it in the CMC > Cluster > Nodes. The Binary port uses the TCP protocols while the Port uses HTTP protocols.
To get the Advanced SQLi JDBC connection string, follow these steps:
For Cloud clusters, once you turn on the Enable Advanced SQL Interface in the Cloud Admin Portal, you can copy the Advanced SQL Interface connection string. Example:
jdbc:hive2://myCluster.spark-cloud.incorta.com:443/;transportMode=http;httpPath=gc-us-west1-a-0/cliservice;ssl=trueFor On-Premises clusters, the Advanced SQLi JDBC connection string is in the following format:
jdbc:subprotocol://host:port[/catalog]/[schema];<clientProperties;><[#|?]sessionProperties>Binary format:
jdbc:hive2://<AdvancedSQLi_host>:10009/;?kyuubi.operation.incremental.collect=true;kyuubi.operation.result.format=arrow;SSL Secured Binary example:
jdbc:hive2://10.10.16.15:10009/default;ssl=true;sslTrustStore=/home/MYincorta/INX/certdir/truststore;trustStorePassword=1234HTTP format example:
jdbc:hive2://<AdvancedSQLi_host>:10010/;transportMode=http;httpPath=cliservice?kyuubi.operation.incremental.collect=true;kyuubi.operation.result.format=arrow;SSL Secured HTTP example:
jdbc:hive2://10.10.16.15:10010/default;ssl=true;transportMode=http;httpPath=cliservice;sslTrustStore=/home/Myincorta/INX/certdir/truststore;trustStorePassword=1234
You must create a personal access token (PAT) to use it as your password when connecting to the Advanced SQLi.
To be able to create a PAT, you must be granted access to the Public API. You can then use your Profile Manager to create PATs.
For more details, refer to References → Public API v2.
Additional resources:
- Tableau – Connect Tableau to the Advanced SQL Interface using the new preview driver Incorta has developed specifically for that purpose. For more details, refer to Integrations → Tableau for Advanced SQLi.
- DB Visualizer – Use Kyuubi Hive JDBC driver to connect to Incorta.
- Incorta Over Incorta (IOI) – Use the Kyuubi connector to create an Incorta over Incorta connection.
- Power BI – Use Spark connector or Simba Spark ODBC to connect to Incorta.
Querying non-optimized tables via Advanced SQL Interface
Starting 2024.1.3 Cloud and On-Premises releases, non-optimized tables can be accessible via the Advanced SQL Interface. As a result, external tools, such as Tableau and Power BI, can now discover and query non-optimized tables that do not have any of the following:
- Security filters
- Formula columns
- Encrypted columns
For more details, refer to References → Discovery of non-optimized tables via Advanced SQLi.
Data extraction via Advanced SQLi and Public API
By leveraging Advanced SQLi and Public API, Incorta has streamlined data extraction from verified business views to different destinations in Parquet format. This feature is available as a preview feature starting 2024.1.4.
For more details, refer to Data extraction via Advanced SQLi and Public API.
Monitor Advanced SQLi queries
Incorta monitors the queries run via the Advanced SQL Interface and records the details of these queries on log files under the kyuubi_audit folder in the local data. You can create a schema based on these log files to explore the query details.
For Cloud installations and On-premises installations on local servers, Incorta handles the creation and configurations required so that the monitoring works fine, However, for On-Premises installations with a cloud storage file system, such as AWS, Azure, or Google Cloud Storage, you must manually allow SparkX to connect to cloud storage to have monitoring work correctly.
For details, refer to References → SparkX Access to Cloud Storage.
Known limitations and issues
The Advanced SQLi can communicate with only a single Analytics service. Starting 2024.1.3, for multi-Analytics clusters, you can set one as the primary service that the Advanced SQLi can use and communicate with.
Geo data type is not supported.
The Spark Metastore does not support synchronizing PostgreSQL (Incorta SQL) views, non-optimized tables (before 2024.1.3), and Analyzer views.
Only one schema is synchronized in case multiple exist with the same name, regardless of case sensitivity.
For On-Premises installations, attempting to browse the Advanced SQL (Beta) section under the Connect to Other Visualization Tools > Tableau page results in an error (resolved in 2024.1.4).
The Sync Spark Metastore option is available for all tenants. However, the Advanced SQL Interface toggle must be on for a successful sync. If the Advanced SQL Interface toggle is turned off, sync requests will be submitted but will not be completed successfully.
You can check the status of the sync requests in the Analytics service logs. Sign in to the Cluster Management Console (CMC) and select the cluster. On the Nodes tab, select Logs for the Analytics node.