Power Query: Merge Columns vs Merge Columns
21 November 2018
Welcome to our Power Query blog. This week, I look at merging columns to show that not all ‘Merge Columns’ options are born equal.
There are currently some problems that can occur when merging columns. To demonstrate, I will use the following subset of my tent data:
I want to create a new column which will contain all of my dimension data.
In the ‘Transform’ data tab, I can choose to merge columns. First, I must select them, which I can do by holding down the CTRL key as I click on them.
I have selected all the columns containing dimensional data, then I choose to merge:
I choose to separate my data by commas, and I call my new column Dimensions.
I have several problems here:
- My original columns have disappeared
- The data is not in the order I would have expected
- There are extra commas.
The first problem could be resolved by copying my columns before I merge them:
In the ‘Add Column’ tab, I can create a duplicate of the selected column. I have to do this one column at a time, but having created the duplicates I can select them instead of the original columns.
Surprisingly, there is an easier way – I can merge columns from the ‘Add Column’ tab instead:
I can merge columns here too, but the process behaves slightly differently.
This part looks the same, I pick the separator and the name of my new column.
However, this time I get to keep my original columns. Confusing, but useful when I know that ‘Merge Columns’ on the ‘Transform’ tab deletes the original columns, whereas ‘Merge Columns’ from the ‘Add Column’ tab keeps the originals.
They are still in the wrong order though, and now I can see my original columns, I can work out the order. The merge is constructed in the order in which I select my columns. I selected colour first and worked backwards, so my merge does too. I can rectify this by selecting my columns in the order I want them to merge.
Having selected Length, I can select my remaining columns in the right order by holding down the SHIFT key and then selecting colour. This is quicker and will ensure the order is right.
Having created my merge (from the ‘Add Column’ tab), I check the order again.
The order is correct, and looking more closely, my last problem (extra commas) has also been resolved! This is because there are more differences to the way the ‘Merge Columns’ option on the ‘Add Column’ tab works compared to the ‘Transform’ tab. To show the difference in the results, I will merge the columns from the ‘Transform’ tab:
I give the column a different name so I can see what is happening.
The ‘Merge Column’ from the ‘Transform’ tab does not cope with the nulls, but the ‘Merge Column’ from the ‘Add Column’ tab does!
The answer to this discrepancy is in the Advanced Editor, where I can see what each ‘Merge Columns’ has done in M.
Firstly, consider the ‘Transform’ tab version:
#"Merged Columns" = Table.CombineColumns(#"Inserted Merged Column",{"Length", "width", "height", "volume", "temperature", "colour"}, Combiner.CombineTextByDelimiter(",", QuoteStyle.None),"Dimensions (from Transform)")
This version of the merge has created a new column using the M function Combiner.CombineTextByDelimiter. Clearly, this function does not cope with nulls.
The ‘Add Column’ tab version has used a different M function:
#"Inserted Merged Column" = Table.AddColumn(#"Changed Type", "Dimensions", each Text.Combine({[Length], [width], [height], [volume],[temperature], [colour]}, ","), type text)
This time, the M function used is Text.Combine, which does cope with the nulls.
This does leave the question of whether the ‘Add Column’ merge column feature will cope equally well with numerical columns…
The answer is yes, the price is included too, and looking at the M code, I can see why:
= Table.AddColumn(#"Changed Type", "Details from Add Column", each Text.Combine({[Length], [width], [height], [volume], [temperature], Text.From([Price], "en-GB"), [colour]}, ","), type text)
The price is converted to text so that it can be combined with the other columns.
In summary, if merging columns, it’s currently best to do so from the ‘Add Column’ tab. It keeps the original columns and copes with nulls. This is correct at the time of writing, but I wouldn’t be surprised to find that the ‘Transform’ tab is soon updated to use the same functionality for the ‘Merge Columns’ option. I still have to remember to add the columns in the order I want to merge them, but allowing that functionality means that I don’t need to reorder my columns if I do need them to be merged in a different order.
Come back next time for more ways to use Power Query!