References → Null Handling
Introduction
Before the 2022.12.0 release, Incorta handled null values differently within the system. Null values were not considered true nulls, but instead real values. This resulted in zero values for numeric columns and empty strings for string and date columns. This approach caused discrepancies in various aspects of Incorta, particularly between Spark-based objects (such as Materialized Views or MVs) and Incorta-native objects (physical schema tables, Incorta SQL tables, and Incorta Analyzer tables). The discrepancies also extended to other functionalities like filters, grouping, functions, and subtotals.
Null handling initiative
Starting with 2022.12.0, Incorta initiated a project to align its handling of null values with the behavior of ANSI SQL. In 2023.7.0, Incorta Analytics Service started to account for null values when sorting and filtering data in dashboards, insights, Analyzer tables, and Analyzer views based on physical columns. Starting with the 2024.1.0 release, Incorta has extended its support of handling null values to more functions and operations where the Analytics Service is handling data: in dashboards, insights, Analyzer tables, and business schema objects.
The Null Handling feature is a preview feature for now. As of the 2024.1.0 release, only Incorta Analytics Service accounts for null values in the supported contexts. Other components, such as Incorta Loader Service, session and global variables, and joins, do not account for null values for now.
By distinguishing between null values and real values in the supported contexts, Incorta provides more accurate and meaningful results in your data analysis and reporting, ensuring that null values are handled consistently and appropriately in various operations.
Null value representation in the supported contexts has not been handled yet.
Controlling null handling
As this project is ongoing and not all system components fully handle null values as true nulls, Incorta provides administrators with the ability to control how the system handles null values in the supported contexts. Cluster Management Console (CMC) administrators can manage how Incorta Analytics Service handles null values by enabling or disabling the Null Handling option.
- Location:
- Before 2024.1.0: CMC > Server Configurations > Customizations > Null Handling.
- Starting 2024.1.0: CMC > Server Configurations > Incorta Labs > Null Handling.
- Status: By default, the Null Handling option is disabled.
Enabling Incorta Advanced SQL Interface or Notebook for Analyzer User automatically enforces accounting for null values in the supported contexts regardless of enabling or disabling the Null Handling option.
Contexts accounting for Null values
When you enable Null Handling in Incorta, the system distinguishes between null values and real values in the following contexts:
Release | Context | Behavior when enabling null handling |
---|---|---|
2022.12.0 | Aggregation functions (Including formulas involving aggregation functions) | Null values are ignored. For details and examples, see Null handling in detail > Aggregation functions. |
2022.12.0 | Insight Grouping Dimensions based on physical columns | Null values in a grouping dimension column are considered a distinct group or category. Note: There is no change in behavior when the feature is disabled. |
2023.7.0 | Sorting based on physical columns | ● In ascending order, null values come first. ● In descending order, null values come last. |
2023.7.0 | Insight and dashboard filters based on physical columns | Null values are not considered zeros nor empty strings. For details, see Null handling in detail > Filters. |
2024.1.0 | Sorting based on formulas | ● In ascending order, null values come first. ● In descending order, null values come last. |
2024.1.0 | Insight and dashboard filters based on formulas | Null values are not considered zeros nor empty strings. For details, see Null handling in detail > Filters. |
2024.1.0 | Filters in other contexts, such as conditional formatting and formula runtime security filters | Null values are not considered zeros nor empty strings. For details, see Null handling in detail > Filters. |
2024.1.0 | More functions and operations in dashboards, insights, Analyzer tables, and business schema objects | The behavior varies according to the function or operation. For details and examples, see the different sections under Null handling in detail. |
Null handling in detail
Enabling the Null Handling option may result in a different output for existing calculations. The following sections describe in detail how Incorta Analytics Service handles null values when enabling and disabling this option.
The following contexts do not differentiate between null values and real values for now.
- Physical schema table formula columns
- Materialized view formula columns
- Load filters
- Runtime security filters based on session variables
- Joins and join filters
Filters
Filter type | Behavior when enabling null handling | Release |
---|---|---|
Insight and dashboard filters based on physical columns: ● Individual filters ● Prompts ● Runtime filters ● Drilldown ● Aggregate filters ● Distinct filters Notes: The following filters are not supported: ● Dynamic filters inside dashboard ● Formula filters based on session or global variables Filters using the following operators are not supported: ● In Query ● Is Descendant ● First Version ● Last Version | Null values are not considered zeros or empty strings. Consequently, ● If the filter column contains only nulls, no rows are returned, except for the Null operator, all rows are returned. ● If the filter column contains real values and nulls, using the Null operator returns only rows with null values while in the case of all other supported operators, null values are ignored. ● For Distinct filter, if the column contains null values, the result list will contain only one row to represent these values. | 2023.7.0 |
Insight and dashboard filters based on formulas: ● Individual filters ● Prompts ● Runtime filters ● Drilldown ● Aggregate filters ● Is Descendant (if the self-join columns do not contain null values; otherwise, the filtering results are not consistent.) Notes: ● Only formulas with the supported functions account for null values. ● The following filters are not supported: formula filters based on session or global variables and dynamic filters. ● Filters using the In Query operator are not supported. | The same as filtering based on physical columns | 2024.1.0 |
Filters in other contexts: ● Both regular and formula runtime security filters (However, runtime security filters based on session variables are not supported.) ● Conditional formatting | Null values are not considered zeros nor empty strings. | 2024.1.0 |
Aggregation functions
Function | Behavior when enabling null handling | Example | Result when disabling the option | Result when enabling the option | Release |
---|---|---|---|---|---|
● sum() ● median() ● average() ● min() ● max() | ● If all column values are null, the function returns NULL. ● Otherwise, the function ignores all null values and considers real values only. | Col1 NULL NULL NULL -1000.23 0 1000 2000.25 5000 8000.37 10000.45 | median(Col1) 500 | median(Col1) 2000.25 | 2022.12.0 |
● count() ● distinct() | ● If all column values are null, the function returns 0 . ● Otherwise, the function ignores null values and returns the count of all real values or distinct real values. | ID,Col1 1,1000 1,2000.25 1,5000 1,-1000.23 1,-1000.23 1,0 1,0 1,NULL 1,NULL 1,NULL 1,10000.45 1,8000.37 1,8000.37 | distinct(Col1) 8 | distinct(Col1) 7 | 2022.12.0 |
Analytics functions
Function | Behavior when enabling null handling | Example | Result when disabling the option | Result when enabling the option | Release |
---|---|---|---|---|---|
● ago() ● toDate() ● toDate() with ago() | ● If the date column value in a row is NULL, the function returns NULL. ● Otherwise, there is no change in behavior. | Col1,Col2 10,23/1/2000 20,12/8/2000 25,NULL 40,15/4/2002 | Quarter(Col2),sum(Col1, toDate(Col2, "Quarter")) 1,10 2,40 3,45 (The column with a value of NULL for Col2 returns quarter 3.) | Quarter(Col2),sum(Col1, toDate(Col2, "Quarter")) NULL,25 1,10 2,40 3,20 | 2024.1.0 |
Arithmetic Functions
Function | Behavior when enabling null handling | Result when disabling the option | Result when enabling the option | Release |
---|---|---|---|---|
● abs() ● ceil() ● floor() ● mod() ● round() ● exp() ● sqrt() ● trunc() | ● If the column value in a row is NULL, the function returns NULL. ● Otherwise, there is no change in behavior. | Col1,abs(Col1) NULL,0 -1,1 | Col1,abs(Col1) NULL,NULL -1,1 | 2024.1.0 |
Boolean functions
Function | Behavior when enabling null handling | Result when disabling the option | Result when enabling the option | Release |
---|---|---|---|---|
● contains() ● startsWith() ● endsWith() ● like() | ● If the column value in a row is NULL, the function returns NULL (instead of true or false ). ● Otherwise, there is no change in behavior. | Col1,contains(Col1, 'M') NULL,false Manager,true | Col1,contains(Col1, 'M') NULL,NULL Manager,true | 2024.1.0 |
not | ● If the column value in a row is NULL, the function returns NULL (instead of true or false ). ● Otherwise, there is no change in behavior. | Col1,not(Col1>100) NULL,true 90,true 200,false | Col1,not(Col1>100) NULL,NULL 90,true 200,false | 2024.1.0 |
Conversion functions
Some conversion functions have two versions, one whose parameter is a column and the other whose parameter is the current date, such as day()
and month()
. There is no change in behavior for functions whose parameter is the current date, and the functions mentioned in the following table are the ones that take columns as a parameter.
Function | Behavior when enabling null handling | Result when disabling the option | Result when enabling the option | Release |
---|---|---|---|---|
● epoch() ● int() ● long() ● parseDate() ● parseDouble() ● parseTimestamp() ● string() ● timestamp() ● double() ● decimal() ● toChar() ● quarter() ● quarterDay() ● quarterMonth() ● quarterWeek() ● month() ● monthName() ● monthsBetween() ● monthWeek() ● weekDay() ● weeknum() ● quarterStartDate() ● quarterEndDate() ● monthStartDate() ● monthEndDate() ● weekStartDate() ● weekEndDate() ● day() ● dayOfYear() ● hour() ● minute() ● second() ● year() | ● If the column value in a row is NULL, the function returns NULL. ● Otherwise, there is no change in behavior. | Col1,long(Col1) NULL,0 20.33,20 50.67,51 | Col1,long(Col1) NULL,NULL 20.33,20 50.67,51 | 2024.1.0 |
● fiscalPeriod() ● fiscalPeriodEndDate() ● fiscalPeriodStartDate() ● fiscalQuarter() ● fiscalQuarterEndDate() ● fiscalQuarterStartDate() ● fiscalYear() ● fiscalYearEndDate() ● fiscalYearStartDate() | ● If the column value in a row is NULL, the function returns NULL. ● Otherwise, there is no change in behavior. | (If the fiscal year start is July) Col1,fiscalQuarter(Col1) 20/10/2023,2 19/7/2020,1 NULL,1 (inconsistent value) 15/1/2020,3 | (If the fiscal year start is July) Col1,fiscalQuarter(Col1) 20/10/2023,2 19/7/2020,1 NULL,NULL 15/1/2020,3 | 2024.1.0 |
Date functions
Function | Behavior when enabling null handling | Result when disabling the option | Result when enabling the option | Release |
---|---|---|---|---|
● addMilliseconds() ● addSeconds() ● addMinutes() ● addHours() ● addDays() ● addMonths() ● addQuarters() ● addYears() | ● If the column value in a row is NULL, the function returns NULL. ● Otherwise, there is no change in behavior. | Col1,addYears(Col1,3) 20/10/2023,20/10/2026 19/7/2020,19/7/2023 NULL,2/12/29226905 15/1/2020,15/1/2023 | Col1,addYears(Col1,3) 20/10/2023,20/10/2026 19/7/2020,19/7/2023 NULL,NULL 15/1/2020,15/1/2023 | 2024.1.0 |
● date() ● formatDate() ● daysBetween ● timeBetween() ● yearsBetween() ● removeTime() ● dateTrunc() ● formatDuration() | ● If the column value in a row is NULL, the function returns NULL. ● Otherwise, there is no change in behavior. | Col1,yearsBetween(Col1,$currentDate) 20/10/2023,-0.07 19/7/2020,3.19 NULL,-292276970.89 15/1/2020,3.69 | Col1,yearsBetween(Col1,$currentDate) 20/10/2023,-0.07 19/7/2020,3.19 NULL,NULL 15/1/2020,3.69 | 2024.1.0 |
● yearMonth(date_timestamp exp) ● yearQuarter(date_timestamp) | ● If the column value in a row is NULL or the expression is evaluated to NULL, the function returns NULL. ● Otherwise, there is no change in behavior. | Col1,yearMonth(Col1) 20/10/2023,202310 19/7/2020,202007 NULL,-836871669 (inconsistent value) 15/1/2020,202001 | Col1,yearMonth(Col1) 20/10/2023,202310 19/7/2020,202007 NULL,NULL 15/1/2020,202001 | 2024.1.0 |
● yearMonth ● yearQuarter | No change in behavior as these functions depend on the current date | Today,yearMonth() 20/1/2023,202301 | Today,yearMonth() 20/1/2023,202301 | 2024.1.0 |
String functions
Function | Behavior when enabling null handling | Result when disabling the option | Result when enabling the option | Release |
---|---|---|---|---|
subString() | ● If the column value in a row is NULL, the function returns NULL. ● Otherwise, there is no change in behavior. | Col1(),subString(Col1,2) Lisa,sa NULL,"" Mark,rk | Col1,subString(Col1,2) Lisa,sa NULL,NULL Mark,rk | 2024.1.0 |
replace() | ● If any of the function parameters (string field, old string, or new string) is NULL, the function returns NULL. ● Otherwise, there is no change in behavior. | Col1,Col2,Col3,replace(Col1, Col2, Col3) Lisa,L,l,lisa NULL,Z,z,"" Mark,NULL,m,mMmamrmkm Rose,R,NULL,ose | Col1,Col2,Col3,replace(Col1, Col2, Col3) Lisa,L,l,lisa NULL, Z,z,NULL Mark,NULL,m,NULL Rose,R,NULL,NULL | 2024.1.0 |
● find() ● findLast() | ● If any of the function parameters is NULL, the function returns NULL. ● Otherwise, there is no change in behavior. | Col1,Col2,find(Col2, Col1) Lisa,L,0 NULL,Z,-1 Mark,NULL,0 Rose,S,2 | Col1,Col2,find(Col2, Col1) Lisa,L,0 NULL,Z,NULL Mark,NULL,NULL Rose,S,2 | 2024.1.0 |
splitPart() | ● If any of the function parameters is NULL, the function returns NULL. ● If the int findNum parameter is greater than the string length, the function returns NULL. ● Otherwise, there is no change in behavior. | Col1,splitPart(Col1, "-",3) A-B-C,C NULL,"" D-E,"" | Col1,splitPart(Col1, "-", 3) A-B-C,C NULL,NULL D-E, NULL | 2024.1.0 |
● lower() ● upper() ● lTrim() ● rTrim() ● trim() ● length() ● repeat() ● reverse() | ● If the column value in a row is NULL, the function returns NULL. ● Otherwise, there is no change in behavior. | Col1,lower(Col1) ABC,abc NULL,"" | Col1,lower(Col1) ABC,abc NULL,NULL | 2024.1.0 |
Arithmetic operations
Operation | Behavior when enabling null handling | Result when disabling the option | Result when enabling the option | Release |
---|---|---|---|---|
Arithmetic operations: ● Addition (+) ● Subtraction (-) ● Multiplication (*) ● Division (/) ● Remainder (%) | If any of the operands is NULL, the operation result is NULL. ● If the operand data type is double, long, or integer, the result is NULL with a data type of double. ● If the operand data type is decimal, the result is NULL with a data type of decimal. | Col1,Col1+100 NULL,100 0,100 1000,1100 | Col1,Col1+100 NULL,NULL 0,100 1000,1100 | 2024.1.0 |
Logical and Comparison operators
Operator | Behavior when enabling null handling | Result when option disabled | Result when option enabled | Release |
---|---|---|---|---|
● Equal (=) ● Less than (<) ● Less than or equal to (<=) ● Greater than (>) ● Greater than or equal to (>=) ● Not equal (<> or !=) | ● If any or both operation operands are NULL, the operation result is NULL. ● Otherwise, the operation result is true or false . | Col1,Col2,Col1>Col2 1000,5000,false 1000,0,true NULL,1000,false NULL,NULL,false 5000,NULL,true | Col1,Col2,Col1>Col2 1000,5000,false 1000,0,true NULL,1000,NULL NULL,NULL,NULL 5000,NULL,NULL | 2024.1.0 |
Limitations and known issues
The following table shows the feature limitations and known issues per release.
Issue | Affected Versions | Fix Version |
---|---|---|
A formula column that only references a physical column with null values shows wrong results as the Loader Service does not respect the null values when materializing the formula column. | 2022.12.0 and later | |
The count and distinct functions return null values instead of zeros when using them in a flat table or a session variable. | 2022.12.0 and later | |
Applying an average aggregation on a decimal column with null values only throws an error and causes the insight to fail to render. | 2022.12.0 - 2023.7.2 | 2024.1.0 |
Running total is not displayed correctly when the measure has null values. | 2022.12.0 - 2023.7.2 | 2024.1.0 |
Sorting based on a physical table column from a joined table does not account for null values in the sorting column when the joined columns do not include null values. Consequently, the original column is not sorted as expected. For example, consider that Tbl1 joins Tbl2 based on Tbl1.Col1 = Tbl2.Col1 and both columns contain no null values. When creating an insight with Tbl1.Col2 and Tbl1.Col3 and sorting Tbl1.Col2 by Tbl2.Col2 that contains null values, the Analytics Engine will not differentiate between nulls and the column’s default value (zero or empty string). Consequently, Tbl1.Col2 will not be sorted as expected. | 2023.7.x | 2024.1.0 |
Geo data types do not support null handling. | 2024.1.0 | |
The Public API does not support null handling. | 2022.12.0 and later |