Concepts → Multi-Source Table

A physical schema table may have more than one data source. A multi-source table allows a schema developer to union disparate datasets into one physical schema table. For example, a multi-source table may define one source as a File System dataset and another as a SQL Database dataset.

A multi-source table represents a parallelized workload. The parallelized extraction optimizes the load performance and simulates the parallelism afforded to connectors that support chunking.

When using multi-source tables, consider the following:

  • Mapped columns should have a common data type or can be successfully type-casted. See Type Conflicts for more information.

  • When possible, define a column to indicate the source itself to help debug any issues with that source.

  • A key column enforces row uniqueness for the union of datasets.

  • If enabling an incremental load, you must enable the Incremental property for all the data sources in the multi-source table. In addition, if you specify the Maximum Value of a Column for the Incremental Extract Using property, the column you specify for the corresponding Incremental Column must have the same range for all data sources. If this is not the case, specify the Last Successful Extract Time.

Create a multi-source table

After you create a physical schema table you can then add additional data sources to the table.

Following are the steps to create a multisource table:

  • From the Schema Manager, select the desired physical schema.
  • Select the desired table from the physical schema.
  • Select the large plus icon in the Data Sources section.
  • Complete the steps to add an additional data source to the table.

Manage multi-source output

Once a table has multiple data sources, the Manage Output option will be available next to the table name in the Table Editor. The Manage Output dialog will allow you to see the data source for each column and edit the column data type.

Data-source conflicts

If a conflict exists between the two data sources, Incorta will provide a warning in the Table Editor. Conflicts in the data sources must be rectified before the table will load successfully. Conflicts can arise from column function, data type, ordering, or naming.

Column merging by name

If columns from different data sources have the same name, Incorta will merge them into a single column for your multi-source table.

Ordering Conflicts

When creating a multi-source table, columns from source tables will be merged if they have the same names and are in the same column order. You will receive a warning if there is an ordering conflict between the tables and be prompted to correct the ordering conflict.

Function Conflicts

When two columns are matched by Incorta to merge, Incorta will automatically cast the column function if there is a conflict. You can use the Manage Output menu to customize the column function if needed.

The following table shows how Incorta will automatically resolve column function conflicts:

column1 in data_set_1column1 in data_set_2casting
keykeykey
keydimensionkey
keymeasurekey
dimensiondimensiondimension
dimensionmeasuredimension
measuremeasuremeasure

Type Conflicts

When there is a conflict in the column type Incorta will automatically resolve most data type conflicts. There are several cases where you will be prompted to resolve the type conflict.

Incorta will automatically perform data type casting when needed or you can perform manual type casing. When you perform manual type casting, you need to ensure data can be cast into the new data type.

The following table shows the automatic type casting that will occur and when the user will be prompted to manually select the column type:

column_1 in data_set_1column_1 in data_set_2casting
intintint
intlonglong
intdoubledouble
intstringstring
intnullint
inttimestampUser will be prompted to resolve type conflict
intdateUser will be prompted to resolve type conflict
inttexttext
longintlong
longlonglong
longdoubledouble
longstringstring
longnulllong
longtimestampUser will be prompted to resolve type conflict
longdateUser will be prompted to resolve type conflict
longtexttext
doubleintdouble
doublelongdouble
doubledoubledouble
doublestringstring
doublenulldouble
doubletimestampUser will be prompted to resolve type conflict
doubledateUser will be prompted to resolve type conflict
doubletexttext
stringintstring
stringlongstring
stringdoublestring
stringstringstring
stringnullstring
stringtimestampstring
stringdatestring
stringtexttext
nullintint
nulllonglong
nulldoubledouble
nullstringstring
nullnullnull
nulltimestamptimestamp
nulldatedate
nulltexttext
timestampintUser will be prompted to resolve type conflict
timestamplongUser will be prompted to resolve type conflict
timestampdoubleUser will be prompted to resolve type conflict
timestampstringstring
timestampnulltimestamp
timestamptimestamptimestamp
timestampdatetimestamp
timestamptexttext
dateintUser will be prompted to resolve type conflict
datelongUser will be prompted to resolve type conflict
datedoubleUser will be prompted to resolve type conflict
datestringstring
datenulldate
datetimestamptimestamp
datedatedate
datetexttext
textinttext
textlongtext
textdoubletext
textstringtext
textnulltext
texttimestamptext
textdatetext
texttexttext