

Server Configurations

Set the following options to configure your server.

Clustering Server Configurations

Configuration Property: Zookeeper Connect String

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required? Yes
  • Description: Provide the Zookeeper connection string.

Configuration Property: Distributed Task Manager Type

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required? Yes
  • Description:

Configuration Property: Kafka Consumer Service Name

  • Analytics Service Restart Required?
  • Loader Service Restart Required? Yes
  • Description: Provide the Kafka consumer node name (if applicable).

SQL Interface Server Configurations

Configuration Property: Default SQL interface port

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required?
  • Description: Provide the port number that other BI tools use to connect to the Incorta engine and run queries against the data loaded in memory. If the Incorta engine does not support a query, the query is automatically routed to Spark for execution. To bypass the Incorta engine and run queries directly in Spark against data loaded in the staging area, use the “Data Store (DS) port” property.

Configuration Property: Data Store (DS) port

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required?
  • Description: Provide the port number to use for running queries directly using Spark against data loaded in the staging area.

Configuration Property: Enable Connection Pooling

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required?
  • Description: Enable this option to create a pool of open connections between external BI tools and Incorta Analytics. With this option enabled, multiple connections are kept available, which improves query response time and saves the time of establishing a new connection every time the SQL interface needs data.

Configuration Property: Connection pool size

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required?
  • Description: Provide the number of SQL interface connections to keep available when executing queries from external BI tools. The right value depends on the following factors:
      - Multithreading support from external BI tools
      - Query complexity
      - The Incorta host machine specs
      - Available resources
    Choose this value carefully: setting it too high reserves machine resources that go unused, while setting it too low can degrade query execution performance.

Configuration Property: Concurrency

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required?
  • Description: This property sets the number of metadata gathering processes that Incorta can run in parallel when executing queries against the Incorta engine.

Configuration Property: Default Schemas

  • Analytics Service Restart Required?
  • Loader Service Restart Required?
  • Description: Provide a comma-separated list of schemas to use when a SQL query contains non-qualified table names (wrong table path) or does not specify a schema.

Configuration Property: Enable Cache

  • Analytics Service Restart Required?
  • Loader Service Restart Required?
  • Description: Enable this option to cache repeated SQL operations and enhance the performance of executing queries, if there is enough available cache size.

Configuration Property: Enable Caching size (In gigabytes)

  • Analytics Service Restart Required?
  • Loader Service Restart Required?
  • Description: Set the maximum cache size per user for the data returned by SQLi queries. When this size is exceeded, the least recently used (LRU) data is evicted to make space for newer entries. The right value depends on the available memory in the Incorta host server and on the size of the result sets of common queries. For example, a result that is larger than this value is never cached, in which case increasing the cache size is recommended.
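
The eviction behavior described above can be illustrated with a minimal sketch. This is not Incorta's implementation; the class name LruResultCache, its methods, and the use of byte sizes are assumptions made only for illustration.

```python
from collections import OrderedDict


class LruResultCache:
    """Hypothetical size-bounded LRU cache (illustration only)."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        # Maps query text -> (result, size); oldest entries come first.
        self._entries = OrderedDict()

    def get(self, query):
        if query not in self._entries:
            return None
        # A hit marks the entry as most recently used.
        self._entries.move_to_end(query)
        return self._entries[query][0]

    def put(self, query, result, size_bytes):
        # A result larger than the whole cache is never cached.
        if size_bytes > self.max_bytes:
            return False
        if query in self._entries:
            self.used_bytes -= self._entries.pop(query)[1]
        # Evict least recently used entries until the new result fits.
        while self.used_bytes + size_bytes > self.max_bytes:
            _, (_, evicted_size) = self._entries.popitem(last=False)
            self.used_bytes -= evicted_size
        self._entries[query] = (result, size_bytes)
        self.used_bytes += size_bytes
        return True
```

In this sketch, a result that exceeds the configured maximum is rejected outright, which mirrors the case where increasing the cache size is recommended.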

Configuration Property: Cached query result max size

  • Analytics Service Restart Required?
  • Loader Service Restart Required?
  • Description: Configure this property to set the maximum size of each cached query result. The size is measured as the table cell count, that is, the number of rows multiplied by the number of columns.
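
As a quick illustration of the cell-count arithmetic, the following sketch computes rows multiplied by columns and compares the product to a limit. The function names are hypothetical, and treating a result exactly at the limit as cacheable is an assumption.

```python
def result_cell_count(rows, columns):
    """Cell count of a result table: rows multiplied by columns."""
    return rows * columns


def fits_cached_result_limit(rows, columns, max_cells):
    # Assumption: a result exactly at the limit still qualifies.
    return result_cell_count(rows, columns) <= max_cells
```

For example, a result with 1,000 rows and 12 columns has 12,000 cells.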

Configuration Property: Enable cache auto refresh

  • Analytics Service Restart Required?
  • Loader Service Restart Required?
  • Description: Enable this option to automatically refresh the cache at specified intervals.

Configuration Property: Refresh cache interval (In minutes)

This option appears only if you enable the “Enable cache auto refresh” option.

  • Analytics Service Restart Required?
  • Loader Service Restart Required?
  • Description: Enter how often, in minutes, the cache is refreshed. This applies only if “Enable cache auto refresh” is enabled.

Spark Integration Server Configurations

Configuration Property: Spark master URL

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required? Yes
  • Description: Provide the Spark Master connection string for the Apache Spark instance that executes materialized views or SQL queries. This option is required to connect to Apache Spark. To find the connection string, open the Spark host server UI in any browser, using the following format:
      <SPARK_HOST_SERVER>:<SPARK_PORT_NO>
    Then copy the Spark Master connection string (usually found in the top center of the UI), which has the following format:
      spark://<CONNECTION_STRING>:<SPARK_PORT_NO>
    The default port number for Spark installed with Incorta is 7077.
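
Before pasting the connection string into this property, it can help to check that it matches the spark://host:port format. The sketch below is one way to do that with the Python standard library; the function name parse_spark_master_url and the example host are hypothetical.

```python
from urllib.parse import urlsplit


def parse_spark_master_url(url):
    """Return (host, port) for a spark://host:port string, or raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme != "spark":
        raise ValueError("expected a URL that starts with spark://")
    if not parts.hostname or parts.port is None:
        raise ValueError("expected both a host and a port")
    return parts.hostname, parts.port
```

For instance, parse_spark_master_url("spark://sparkhost.example.com:7077") returns the host and the port 7077, while a string that uses another scheme or omits the port raises ValueError.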

Configuration Property: Enable SQL App

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required?
  • Description: The SQL App is an application that runs within Spark to handle all incoming SQLi queries. Enable this option to start the SQL App, and keep it up and running, to execute incoming SQL queries. Changing this value requires a server restart.

Configuration Property: SQL App driver memory

  • Analytics Service Restart Required?
  • Loader Service Restart Required?
  • Description: Allocate the memory (in GB) that the SQL interface Spark application uses to construct (not calculate) the final results. Consult your Spark admin when setting this value.

Configuration Property: Spark App Cores

  • Analytics Service Restart Required?
  • Loader Service Restart Required?
  • Description: Set the number of CPU cores dedicated to the SQLi Spark App only. Ensure that enough cores in your setup remain reserved for the OS, applications, and other services.

Configuration Property: Spark App Memory

  • Analytics Service Restart Required?
  • Loader Service Restart Required?
  • Description: Provide the maximum memory that will be used by SQLi Spark queries, leaving extra memory for MVs if needed. The memory required for both applications combined cannot exceed the Worker Memory.

Configuration Property: SQL App executors

  • Analytics Service Restart Required?
  • Loader Service Restart Required?
  • Description: Provide the maximum number of executors that can be spawned on a single worker. Each executor gets a share of the cores defined in the “SQL App cores” property and a share of the memory defined in the “SQL App memory” property. The cores and memory assigned to each executor are the same for all executors. Thus, the number of executors should be a divisor of both the number of SQL App cores and the SQL App memory, and must be smaller than or equal to each of them. If it is not a divisor, some of the total cores will not be utilized. (Refer to the adminUI online help for an example.)
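
The divisor rule can be checked with simple arithmetic. The sketch below assumes that cores and memory are split evenly using integer division and that any remainder is left unused; the function name plan_executors and the returned field names are hypothetical, and the actual allocation logic may differ.

```python
def plan_executors(app_cores, app_memory_gb, executors):
    """Split cores and memory evenly across executors (illustration only)."""
    if executors < 1 or executors > app_cores or executors > app_memory_gb:
        raise ValueError("executors must be between 1 and the cores/memory values")
    # Every executor gets the same share; remainders stay unused.
    cores_each = app_cores // executors
    memory_each = app_memory_gb // executors
    return {
        "cores_per_executor": cores_each,
        "memory_gb_per_executor": memory_each,
        "unused_cores": app_cores - cores_each * executors,
        "unused_memory_gb": app_memory_gb - memory_each * executors,
    }
```

Under these assumptions, 8 cores and 16 GB with 4 executors gives 2 cores and 4 GB each with nothing left over, whereas 3 executors leaves 2 cores and 1 GB unused.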

Configuration Property: SQL App shuffle partitions

  • Analytics Service Restart Required?
  • Loader Service Restart Required?
  • Description: A single shuffle partition represents a block of data processed when executing joins and/or aggregations. For a fixed number of partitions, the size of each shuffle partition grows as the processed data size grows. The optimal shuffle partition size is approximately 128 MB, so it is recommended to increase this value as the processed data size increases, although this also increases CPU utilization. On the other hand, if a query operates on a trivial amount of data, a large number of partitions leads to a small partition size, which can increase the query execution time due to the overhead of managing needless partitions. Insufficient partitions can cause a query to fail.
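
As a rough starting point, the partition count can be estimated by dividing the processed data size by the 128 MB target mentioned above. This is only a back-of-the-envelope sketch; the function name is hypothetical, and treating 128 MB as 128 * 1024 * 1024 bytes is an assumption.

```python
TARGET_PARTITION_BYTES = 128 * 1024 * 1024  # assumed meaning of "128 MB"


def estimate_shuffle_partitions(processed_bytes):
    """Estimate a partition count so each partition is about 128 MB."""
    if processed_bytes <= 0:
        return 1
    # Ceiling division, so a partial block still gets its own partition.
    return (processed_bytes + TARGET_PARTITION_BYTES - 1) // TARGET_PARTITION_BYTES
```

For example, about 10 GiB of processed data suggests 80 partitions, while a query over a few kilobytes needs only one.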

Configuration Property: SQL App extra options

  • Analytics Service Restart Required?
  • Loader Service Restart Required?
  • Description: Extra Spark options can be passed to the SQL interface Spark application. These options can be used to override the default configurations.

Configuration Property: Enable SQL App Dynamic Allocation

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required?
  • Description: This property controls the dynamic allocation of the Data Hub Spark application. If it is enabled, Spark will dynamically allocate executors depending on the workload. This is bounded by the resources assigned in other configurations (e.g. CPUs and memory per executor). When a query gets fired, it starts with one executor and dynamically generates others if needed. This option helps optimize resource utilization as it removes idle executors to save resources. If the workload increases, Spark claims them again.

Configuration Property: Spark App port

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required?
  • Description: The port used by Incorta to connect to Spark and access the Data Hub.

Configuration Property: Spark App control channel port

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required?
  • Description: The port used to send a shutdown signal to the Spark SQL app if the Incorta server needs to shut down Spark.

Configuration Property: Spark App fetch size

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required?
  • Description: This property sets the number of rows that Incorta fetches at a time from the Data Hub while aggregating the query result-set.

Configuration Property: SQL App spark home (Optional)

  • Analytics Service Restart Required?
  • Loader Service Restart Required?
  • Description: Provide the file system path of the Apache Spark instance used to execute queries that are sent to either the Incorta engine or the Data Hub. If this option is not set, the SPARK_HOME environment variable is used instead. If that is not set either, the Spark home value of the Spark instance used for Incorta materialized views and compaction is used.
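
The fallback order described above can be summarized in a short sketch. The function and parameter names are hypothetical, and the environment is passed in as a plain dictionary so the example stays self-contained.

```python
def resolve_spark_home(configured_path, environment, mv_spark_home):
    """Pick the Spark home using the documented fallback order."""
    # 1. The value set in this property, if any.
    if configured_path:
        return configured_path
    # 2. Otherwise, the SPARK_HOME environment variable.
    if environment.get("SPARK_HOME"):
        return environment["SPARK_HOME"]
    # 3. Otherwise, the Spark home used for materialized views and compaction.
    return mv_spark_home
```

In practice, the environment argument could be os.environ; the paths used with this sketch are placeholders, not defaults shipped with the product.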

Configuration Property: SQL App Spark master URL (Optional)

  • Analytics Service Restart Required?
  • Loader Service Restart Required?
  • Description: Provide the Spark master server URL for executing the SQLi queries sent to the Incorta engine or Data Store. If not entered, the shipped Spark master URL will be used instead.

Tuning Server Configurations

Configuration Property: Max In-Memory Data (%)

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required?
  • Description: This property sets the maximum data allowed in memory as a percentage of the JVM memory. The default (and recommended) value is 75, to leave enough room for the engine to perform calculations. Note that setting this value to higher than 75% can result in server stability issues.
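
To see what the percentage means in absolute terms, the following sketch converts it to gigabytes for a given JVM memory size. The function name and the 64 GB figure used in the note are illustrative assumptions.

```python
def max_in_memory_data_gb(jvm_memory_gb, max_in_memory_percent=75):
    """Memory available for data, as a percentage of the JVM memory."""
    if not 0 < max_in_memory_percent <= 100:
        raise ValueError("percentage must be in the range (0, 100]")
    return jvm_memory_gb * max_in_memory_percent / 100
```

With the default of 75, a JVM configured with 64 GB would hold at most 48 GB of data, leaving the remaining 16 GB for the engine's calculations.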

Configuration Property: Max Concurrent Queries

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required?
  • Description: This property determines the maximum number of queries that can run at the same time. Note that dragging a column in the Incorta application UI (Analyzer mode) executes a query in the background. The default value is 0, meaning that the number of concurrent jobs is set automatically depending on the number of physical cores.

Configuration Property: Max CPU cores (%)

  • Analytics Service Restart Required? Yes
  • Loader Service Restart Required? Yes
  • Description: Configure this property to set the maximum allowed percentage of the CPU cores for the Incorta engine to utilize.

NOTE

After making configuration changes on any page, you must select Save before navigating away from that page to avoid losing unsaved data.

