Union All Transformation:
Like Merge, Union All transformation is also used to merge datasets. But unlike Merge, this transformation can merge more than two datasets and Sorted data is not a mandatory requirement for this one. First Input that is mapped to this transformation becomes the output of this transformation. To map the columns they must have same datatype. This transformation does not support Error handling, neither does it eliminate duplicates.Using this transformation, we can join ‘Heterogeneous’ sources as well, for e.g. we can join database table, flat file and a excel file in a single Union All transformation. The only mandate for this is, there data types should match.
In the below example, we will merge three excel datasets:
Dataset 1:
Id
|
Name
|
Location
|
1
|
Test User 1
|
Loc1
|
7
|
Test user 7
|
Loc2
|
3
|
Test User 3
|
Loc3
|
Dataset 2:
Id
|
Name
|
2
|
Test User 2
|
6
|
Test User 6
|
8
|
Test User 8
|
4
|
Test user 4
|
10
|
Test User 10
|
Dataset 3:
Id
|
Name
|
Location
|
5
|
Test User 5
|
Loc4
|
9
|
Test user 9
|
Loc5
|
As shown below, use ‘Excel source’ to configure above:

Now drag, ‘Union All’ transform from Data Flow Transformation, first map the output of ‘Dataset 1’ to Union All, followed by ‘Dataset 2’ and ‘Dataset 3’.

As we mapped ‘Dataset 1’ as the first input to Union All transformation, we can see all the columns of ‘Dataset 1’ are been shown as Output Columns of Union All transformation.

In the editor, we can change the ‘Output Column name’ as well if needed. As ‘Dataset 2’ has only 2 columns, hence in the editor for output column Location under ‘Union All Input 2’ we can see ‘<ignore>’.
Below shown is the output:

In the data viewer, we can see under column ‘Location’ 5 rows have NULL, all these columns are from ‘Dataset 2’, as in the editor it was been mapped as ‘<ignore>’, under ‘Union All input 2’. Also unlike MERGE transformation, the output data is not sorted.
No comments:
Post a Comment