Power Query: Tidy Texting
13 June 2018
Welcome to our Power Query blog. Following on from last week’s blog, this time I look at some useful transforming Text() functions in M.
It’s no secret that cleaning up data involves tidying up text strings so that I have consistent information to work with. There are a few Text() functions in M which are particularly useful. Having looked at true / false functions last week, this time I take a look at some Text() functions that can transform my data and I will give an example for each one.
Text.Insert
Text.Insert(text as nullable text, offset as number, newText as text) as nullable text
Returns a text value with newText inserted into a text value starting at a zero-based offset.
This is a useful function for formatting a column to make it easier to read. In the next screen I have a column of serial numbers. If different sections of the serial number have specific meanings (for example department t or product type), then it makes sense to break down the number to make it easier to read. This also makes it clearer for other users who need to extract parts of the serial number to new columns.
I can now use the M Text.Insert() formula to transform my column.
The formula I have used for my new column is
= Text.Insert([Column1],4,"-")
It is also possible to nest this function so that I insert multiple characters, as shown in the formula, I use next to create column Product Segments.
In this case, the formula is
= Text.Insert((Text.Insert([Product ID],9, "-")),13, "-"))
The product ID has now been fully broken down into segments and is much easier to read and compare to other ID’s.
Text.PadStart
Text.PadStart(text as nullable text, length as number, pad as nullable text) as nullable text
Returns a text value padded at the start with pad to make it at least length characters.
This is a useful M function for presenting generated reference numbers correctly. Note that if no ‘pad’ character is specified, then the default is a space. In the next screen I have a list of generated reference numbers, but they all need to be 10 characters long.
It’s important to make sure that Power Query hasn’t ‘helpfully’ converted the reference to a number – if it has, delete the automatically created ‘Changed Type’ step. In this case I have converted my column to text.
I create a new column using Text.PadStart().
My formula is
= Text.PadStart([Column1],10,"0")
I am expanding all my references to be 10 characters, by adding zeros at the beginning (zero filling).
My references are now presented correctly.
Text.PadEnd
Text.PadEnd(text as nullable text, length as number, pad as nullable text) as nullable text
Returns a text value padded at the end with pad to make it at least length characters.
This is not an M function I use a great deal, but it’s useful to know that text can be padded at either end. I will add zeros to the end of my reference number so that each number is 15 characters in length.
The M formula I use is
= Text.PadEnd([Reference Number], 15, "0")
All the references are 15 characters long, and the zeros appear at the end of the references.
Text.Remove
Text.Remove(text as nullable text, removeChars as any) as nullable text
Removes all occurrences of a character or list of characters from a text value. The removeChars parameter can be a character value or a list of character values.
This M function is very useful for getting rid of unwanted characters, so there is consistency.
On the next screen I have a list of contacts where the phone numbers have not been recorded consistently.
I need to get rid of spaces, brackets and hyphens so that the numbers are all in the same format.
The M formula I have used is
= Text.Remove([Phone Number], {" ", "(", ")", "-"})
This should remove all spaces, brackets and hyphens. I have specified a list by the use of the curly brackets ({ }).
My phone numbers in the Phone No column are now displayed consistently.
Next time I’ll look at some text functions that can be used to extract sections of a text string.
Come back next time for more ways to use Power Query!