Multi-Source Table

A physical schema table may have more than one data source. A multi-source table allows a schema developer to union disparate datasets into one physical schema table. For example, a multi-source table may define one source as a File System dataset and another as a SQL Database dataset.

A multi-source table represents a parallelized workload: its data sources are extracted in parallel. This parallel extraction improves load performance and approximates the parallelism available to connectors that support chunking.
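The idea of parallel extraction can be illustrated with a minimal Python sketch. It is not Incorta's implementation: the `extract_all` function and the per-source extractor callables are hypothetical names used only to show several sources being read concurrently before they are unioned.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Row = Dict[str, object]


def extract_all(extractors: Dict[str, Callable[[], List[Row]]]) -> Dict[str, List[Row]]:
    """Run every (hypothetical) extractor in its own thread and collect the rows per source."""
    names = list(extractors)
    # One worker per source; fall back to a single worker for an empty mapping.
    with ThreadPoolExecutor(max_workers=len(names) or 1) as pool:
        # pool.map preserves the order of `names`, so results line up with sources.
        results = list(pool.map(lambda name: extractors[name](), names))
    return dict(zip(names, results))


if __name__ == "__main__":
    rows = extract_all({
        "file_system": lambda: [{"id": 1}],
        "sql_database": lambda: [{"id": 2}],
    })
    print(rows)
```

Each extractor here is a stand-in for whatever reads one data source; the point is only that the sources do not wait on each other.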

When using multi-source tables, consider the following:

  • Mapped columns should have a common data type or data types that can be cast successfully to a common type. See Type Conflicts for more information.
  • When possible, define a column that identifies the source of each row to help debug any issues with that source.
  • A key column enforces row uniqueness for the union of datasets.
  • If enabling an incremental load, you must enable the Incremental property for all the data sources in the multi-source table. In addition, if you specify the Maximum Value of a Column for the Incremental Extract Using property, the column you specify for the corresponding Incremental Column must have the same range for all data sources. If this is not the case, specify the Last Successful Extract Time.
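As a rough illustration of the source-indicator and key-column considerations above, the following Python sketch unions rows from several sources, tags each row with its origin, and reports key values that appear more than once. The `union_sources` function and the `source_name` column are hypothetical; how Incorta itself treats duplicate keys is not described here, so the sketch only reports them.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def union_sources(
    sources: Dict[str, Iterable[Row]], key: str
) -> Tuple[List[Row], List[Tuple[object, str, str]]]:
    """Union rows from named sources and list key values seen more than once."""
    merged: List[Row] = []
    first_seen: Dict[object, str] = {}
    duplicates: List[Tuple[object, str, str]] = []
    for source_name, rows in sources.items():
        for row in rows:
            # Tag each row with its origin so a bad row can be traced to its source.
            merged.append({**row, "source_name": source_name})
            value = row[key]
            if value in first_seen:
                # Record (key value, source that had it first, source that repeated it).
                duplicates.append((value, first_seen[value], source_name))
            else:
                first_seen[value] = source_name
    return merged, duplicates


if __name__ == "__main__":
    merged, duplicates = union_sources(
        {"files": [{"id": 1}, {"id": 2}], "sql": [{"id": 2}, {"id": 3}]},
        key="id",
    )
    print(len(merged), duplicates)
```

A check like this can be useful before choosing a key column, because a key that is unique within each source may still collide across sources.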

Create a multi-source table

After you create a physical schema table, you can add more data sources to it.

To create a multi-source table:

  • From the Schema Manager, select the desired physical schema.
  • Select the desired table from the physical schema.
  • Select the large plus icon in the Data Sources section.
  • Complete the steps to add another data source to the table.

Manage multi-source output

Once a table has multiple data sources, the Manage Output option is available next to the table name in the Table Editor. The Manage Output dialog shows the data source for each column and lets you edit the column data type.

Data-source conflicts

If a conflict exists between data sources, Incorta displays a warning in the Table Editor. You must resolve conflicts in the data sources before the table can load successfully. Conflicts can arise from column function, data type, ordering, or naming.

Column merging by name

If columns from different data sources have the same name, Incorta will merge them into a single column for your multi-source table.

Ordering Conflicts

When you create a multi-source table, columns from the source tables are merged if they have the same names and are in the same column order. If there is an ordering conflict between the tables, Incorta displays a warning and prompts you to correct it.
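The following Python sketch shows one way to check two column lists before adding a second data source. It assumes that an ordering conflict means the shared columns appear in a different relative order in each source; that interpretation, and the function names, are assumptions made for illustration rather than Incorta's documented rule.

```python
from typing import List


def shared_columns(cols_a: List[str], cols_b: List[str]) -> List[str]:
    """Columns present in both sources, in the order they appear in cols_a."""
    in_b = set(cols_b)
    return [c for c in cols_a if c in in_b]


def has_ordering_conflict(cols_a: List[str], cols_b: List[str]) -> bool:
    """True when the shared columns appear in a different relative order in each source."""
    return shared_columns(cols_a, cols_b) != shared_columns(cols_b, cols_a)


if __name__ == "__main__":
    files = ["id", "name", "amount"]
    sql = ["amount", "id", "region"]
    print(shared_columns(files, sql))         # columns that would merge by name
    print(has_ordering_conflict(files, sql))  # whether their order differs
```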

Function Conflicts

When two columns are matched by Incorta to merge, Incorta will automatically cast the column function if there is a conflict. You can use the Manage Output menu to customize the column function if needed.

The following table shows how Incorta will automatically resolve column function conflicts:

column1 in dataset1    column1 in dataset2    casting
-------------------    -------------------    ---------
key                    key                    key
key                    dimension              key
key                    measure                key
dimension              dimension              dimension
dimension              measure                dimension
measure                measure                measure
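The table above follows a simple precedence: key wins over dimension, and dimension wins over measure. A minimal Python sketch of that rule follows. The `resolve_function` name is hypothetical, and treating the rule as symmetric (for example, dimension with key also giving key) is an assumption, since the table lists each pair in one order only.

```python
# Lower rank wins when two column functions conflict.
FUNCTION_RANK = {"key": 0, "dimension": 1, "measure": 2}


def resolve_function(first: str, second: str) -> str:
    """Return the column function that results from merging two columns."""
    for name in (first, second):
        if name not in FUNCTION_RANK:
            raise ValueError(f"unknown column function: {name!r}")
    return min(first, second, key=FUNCTION_RANK.__getitem__)


if __name__ == "__main__":
    for pair in [("key", "measure"), ("dimension", "measure"), ("measure", "measure")]:
        print(pair, "->", resolve_function(*pair))
```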

Type Conflicts

When column types conflict, Incorta automatically resolves most data type conflicts. In several cases, you are prompted to resolve the type conflict yourself.

Incorta automatically performs data type casting when needed, or you can perform manual type casting. When you perform manual type casting, you need to ensure that the data can be cast into the new data type.

The following table shows the automatic type casting that will occur and when the user will be prompted to manually select the column type:

column1 in dataset_1   column1 in dataset_2   casting
--------------------   --------------------   ----------------------------------------------
int                    int                    int
int                    long                   long
int                    double                 double
int                    string                 string
int                    null                   int
int                    timestamp              User will be prompted to resolve type conflict
int                    date                   User will be prompted to resolve type conflict
int                    text                   text
long                   int                    long
long                   long                   long
long                   double                 double
long                   string                 string
long                   null                   long
long                   timestamp              User will be prompted to resolve type conflict
long                   date                   User will be prompted to resolve type conflict
long                   text                   text
double                 int                    double
double                 long                   double
double                 double                 double
double                 string                 string
double                 null                   double
double                 timestamp              User will be prompted to resolve type conflict
double                 date                   User will be prompted to resolve type conflict
double                 text                   text
string                 int                    string
string                 long                   string
string                 double                 string
string                 string                 string
string                 null                   string
string                 timestamp              string
string                 date                   string
string                 text                   text
null                   int                    int
null                   long                   long
null                   double                 double
null                   string                 string
null                   null                   null
null                   timestamp              timestamp
null                   date                   date
null                   text                   text
timestamp              int                    User will be prompted to resolve type conflict
timestamp              long                   User will be prompted to resolve type conflict
timestamp              double                 User will be prompted to resolve type conflict
timestamp              string                 string
timestamp              null                   timestamp
timestamp              timestamp              timestamp
timestamp              date                   timestamp
timestamp              text                   text
date                   int                    User will be prompted to resolve type conflict
date                   long                   User will be prompted to resolve type conflict
date                   double                 User will be prompted to resolve type conflict
date                   string                 string
date                   null                   date
date                   timestamp              timestamp
date                   date                   date
date                   text                   text
text                   int                    text
text                   long                   text
text                   double                 text
text                   string                 text
text                   null                   text
text                   timestamp              text
text                   date                   text
text                   text                   text
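The rows above can be summarized by a handful of rules: identical types stay as they are, null yields to the other type, text wins over everything, string wins over everything except text, numeric types widen (int, then long, then double), date with timestamp becomes timestamp, and a numeric type paired with date or timestamp needs a manual choice. The Python sketch below encodes those rules as read from the table. The `resolve_type` function is a hypothetical helper for checking a pair of types, not an Incorta API.

```python
from typing import Optional

NUMERIC_RANK = {"int": 0, "long": 1, "double": 2}
TEMPORAL = {"date", "timestamp"}
KNOWN_TYPES = set(NUMERIC_RANK) | TEMPORAL | {"string", "text", "null"}


def resolve_type(first: str, second: str) -> Optional[str]:
    """Return the merged column type, or None when the user must resolve the conflict."""
    for name in (first, second):
        if name not in KNOWN_TYPES:
            raise ValueError(f"unknown data type: {name!r}")
    if first == second:
        return first
    if first == "null":
        return second
    if second == "null":
        return first
    if "text" in (first, second):
        return "text"
    if "string" in (first, second):
        return "string"
    if first in NUMERIC_RANK and second in NUMERIC_RANK:
        # Widen to the larger numeric type.
        return max(first, second, key=NUMERIC_RANK.__getitem__)
    if first in TEMPORAL and second in TEMPORAL:
        return "timestamp"
    # Remaining case: a numeric type paired with date or timestamp.
    return None


if __name__ == "__main__":
    for pair in [("int", "long"), ("date", "timestamp"), ("double", "date")]:
        print(pair, "->", resolve_type(*pair) or "manual resolution required")
```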