Power Query: Project Population – Part 2
26 April 2023
Welcome to our Power Query blog. This week, I continue to transform selected data from a public source.
I have found some information on population growth provided by The World Bank, which I am using as an example of how to transform real-life data.
data:image/s3,"s3://crabby-images/0df4a/0df4a8d80b9448797c2565f48d9b4aac1c11e59b" alt=""
I have downloaded the Excel file, and last week I extracted the queries I needed to Power Query:
data:image/s3,"s3://crabby-images/67680/67680a34010dfee35e884330d67fdedfb9f65ebb" alt=""
I reduced the data by selecting only the columns I wanted to keep:
data:image/s3,"s3://crabby-images/b2e9e/b2e9e0e657f4a09502dbc60665271b115778ea3a" alt=""
This time, I will identify and remove any unnecessary rows and continue to transform the data. Looking at the country names, I can see that some are regions and not countries:
data:image/s3,"s3://crabby-images/be3ae/be3aeff0494a1ed4f4d9510382695a62cb569eff" alt=""
For example, row 2 is ‘African Eastern and Southern’ and row 8 is ‘Arab World’. Since I only want the country data, I can remove these by filtering on Currency Unit, and choosing ‘Remove Empty’:
data:image/s3,"s3://crabby-images/c7137/c7137791875e63117edccdaa9fcdd683d99a9c8f" alt=""
This leaves me with a smaller dataset of 13 columns and 217 rows.
data:image/s3,"s3://crabby-images/05165/05165a36a496ccf31230ceeea39df331706e654e" alt=""
Now I can turn my attention to transforming the remaining data. The System of National Accounts column has excess words:
data:image/s3,"s3://crabby-images/ece30/ece30d86ab179fdd3da5a21b10e8f1f8c19545cf" alt=""
I could split the column into numerical and text data, using the ‘Split column’ feature on the Transform tab, but I only need the year, so the remaining column would just need to be deleted.
data:image/s3,"s3://crabby-images/30241/3024162280654eebee5e0b20e5db214f8e8c756c" alt=""
Instead, I can extract the year as a range, since it always appears in the same position:
data:image/s3,"s3://crabby-images/532bc/532bc756187674792d7e328e13d578b5af3e7c82" alt=""
I want to start at position 17 (remembering the first position is zero [0]) and extract four [4] characters:
data:image/s3,"s3://crabby-images/2cf78/2cf78cc65685ae940282f04ddfd382352f01423d" alt=""
This gives me the years where the data is available, and I can change the data type to ‘Whole number’:
data:image/s3,"s3://crabby-images/252d7/252d7d290cccc611894f1c31f3941b8a60c4cdef" alt=""
This is a good point to look at the names of the steps that have been generated:
data:image/s3,"s3://crabby-images/dd758/dd75879dd3e48019f67a4e69bb188057cc0ce070" alt=""
The first six [6] steps are fine, as I am reducing the data. However, ‘Extracted Text Range’ and ‘Changed Type 1’ could be improved:
data:image/s3,"s3://crabby-images/9c926/9c9267680c6fd7aca7819869b3e656678b5fdb1f" alt=""
Changing the names has been achieved simply by right-clicking on the step and clicking ‘Rename’. Note that I could have chosen to remove spaces from the step names, which would remove the leading hash (#) from the step names in the Advanced Editor:
data:image/s3,"s3://crabby-images/9cca0/9cca03b82969c1fd90ec13ef5a084d57e607b8b5" alt=""
This depends how familiar you (and anyone else that might need to understand the query) are with M code.
That’s it for this week, next time I’ll continue transforming the queries.
Come back next time for more ways to use Power Query!