Power Query: Emerging from a Muddle
3 March 2021
Welcome to our Power Query blog. This week, I look at a more efficient way of merging the data from last week, again when there is not a clear link between the tables,
I begin by repeating the scenario from last time. I have some sales contact data for my imaginary business. Unfortunately, I have some complaints and I want to allocate them to the correct salesperson. I have the date of the complaint and also the date the salesperson took over as the primary.
I start by extracting my data to Power Query. I have two tables. I click somewhere in my first table, and I choose ‘From Table/Range’ on the ‘Get & Transform’ section of the Data tab.
I take the usual defaults and click OK.
I ‘Close & Load’ my first table and repeat this process for the other table.
I now have two tables, which I have called ‘Key Salesperson’ and Complaints. At this point last week, I used a full outer join to connect my tables. This week, I am going to join them conditionally. I will write a function to get the correct salesperson for each complaint.
I start with my ‘Key Salesperson’ table. Considering this table from the perspective of the Complaints table, I will need to know which salesperson was the primary contact for a given Complaint Date, so I start by generating some of the M code I will need. I will need to find the salesperson who takes over before the date of the complaint.
I filter on the date, and for now, I just choose to filter before one of the dates already in the table. It doesn’t matter what date I use at this point; I just need the syntax. My next consideration is that this could return more than one salesperson, and I want to get just the latest one.
I need to sort the data in reverse date order, to give me the most recent record at the top.
Finally, I drill down into the top value to ensure that my function will return the name of the salesperson.
This gives me a single value:
Now I have the code I need; I am going to turn this into a function.
The code I have in the advanced editor so far is:
let
Source = #"Key Salesperson",
#"Filtered Rows" = Table.SelectRows(Source, each [Date] < #datetime(2020, 1, 15, 0, 0, 0)),
#"Sorted Rows" = Table.Sort(#"Filtered Rows",{{"Date", Order.Descending}}),
#"Primary Sales Person" = #"Sorted Rows"{0}[Primary Sales Person]
in
#"Primary Sales Person"
I will pass a parameter p_complaintdate, which I will use instead of the hardcoded date in the #Filtered Rows step.
The M code is now:
(p_complaintdate as date) =>
let
Source = #"Key Salesperson",
#"Filtered Rows" = Table.SelectRows(Source, each [Date] < p_complaintdate),
#"Sorted Rows" = Table.Sort(#"Filtered Rows",{{"Date", Order.Descending}}),
#"Primary Sales Person" = #"Sorted Rows"{0}[Primary Sales Person]
in
#"Primary Sales Person"
I can then save this as a function fn_salesperson and go back to my Complaints table.
I am going to add a column from the ‘Add Column’ tab to ‘Invoke Custom Function’.
I choose my function and select to use Complaint Date as the parameter.
This gives me the correct salesperson for each complaint.
Come back next time for more ways to use Power Query!