Power Query: Project Population – Part 3
3 May 2023
Welcome to our Power Query blog. This week, I keep on keeping on the transformation of selected data from a public source.
I have found some information on population growth provided by The World Bank, which I am using as an example of how to transform real-life data.
data:image/s3,"s3://crabby-images/b5984/b5984c6e6cd4a763a0fcc683cd2180f6473ff19c" alt=""
I have downloaded the Excel file, and in Part 1, I extracted the queries I needed:
data:image/s3,"s3://crabby-images/9e612/9e6124c0e0b3e8f7aba70e07f4fe25f0579f92a8" alt=""
I reduced the data by selecting only the columns I wanted to keep:
data:image/s3,"s3://crabby-images/9e1e8/9e1e89edcb174968821a987a4a4f8c453818d2c2" alt=""
Last time, I identified and removed any unnecessary rows:
data:image/s3,"s3://crabby-images/82b16/82b163f7a2b10f8f31fb249728302626818a8f8f" alt=""
I also transformed the System of National Accounts column:
data:image/s3,"s3://crabby-images/c82a7/c82a7e1ba3d64085ebd03e501687ea024bbb6717" alt=""
I renamed these steps to make it easier to follow the transformations:
data:image/s3,"s3://crabby-images/a1289/a12895a1d771dbebcfb13dd632ebdb82a1e308d6" alt=""
This time, I am going to look at the column Latest Population Census:
data:image/s3,"s3://crabby-images/c5056/c505607942cfec6227cbc2646b5681fb50bdfd81" alt=""
This column currently contains mixed data types, which is why Power Query has given it a type of ‘Any’. I would like to transform this column into two [2] columns, one with the year, and one with any additional notes. On the Transform tab, I choose to split the column ‘By Positions’:
data:image/s3,"s3://crabby-images/985d3/985d3c23e369972fc034929b3c6177bd973dae2c" alt=""
This prompts me for where to split the column, remembering that Power Query counts from zero [0]:
data:image/s3,"s3://crabby-images/2f0de/2f0ded1e138e35054516deabe82e774387692b3b" alt=""
Clicking on ‘Advanced options’ also indicates how to enter multiple positions to split the data. For this example, I just need to split at position 4, however, if I only enter 4, then the column after position 4 would be retained. In order to preserve the year and have two [2] columns, I need to enter ‘0,4’ in the Positions box. I take the default to ‘Split into Columns’.
data:image/s3,"s3://crabby-images/3b563/3b56321c28d7b55ebf56e07f46215d4b6b7b2dd7" alt=""
This gives me two [2] new columns.
data:image/s3,"s3://crabby-images/8cddd/8cdddec9e4f99fae1859234ac2d2e872784cd44b" alt=""
Note that I did not have an option to name my split columns. I can add a step to rename them, but instead I will change the M code in the ‘Split Column by Positions’ step:
data:image/s3,"s3://crabby-images/e2b4d/e2b4d8237df7e802ee7e3b7431c27c4bb586fdac" alt=""
I change the code from:
= Table.SplitColumn(Table.TransformColumnTypes(#"Changed SNA
to Number", {{"Latest population census", type text}},
"en-GB"), "Latest population census",
Splitter.SplitTextByPositions({0, 4}), {"Latest population census.1",
"Latest population census.2"})
to:
= Table.SplitColumn(Table.TransformColumnTypes(#"Changed SNA
to Number", {{"Latest population census", type text}},
"en-GB"), "Latest population census",
Splitter.SplitTextByPositions({0, 4}), {"Latest population census",
"Population census notes"})
data:image/s3,"s3://crabby-images/53f84/53f84c44e835f3f0dd724a9c5ec0aaf6b30183a3" alt=""
I rename the step, and delete ‘Changed Type1’. It no longer uses the correct column names, and it left Latest population census with data type ‘Text’.
data:image/s3,"s3://crabby-images/8c2ec/8c2ec0be4160eb3c78db9808680eba8e8ccead50" alt=""
Next time, I will look at why the column Latest population census was not assigned a numerical data type.
Come back next time for more ways to use Power Query!