Power Query: Mind the Overlap
17 June 2020
Welcome to our Power Query blog. This week, I look at a method to spot overlapping periods.
My salespeople have recorded time spent with suppliers, but there seem to be some discrepancies!
data:image/s3,"s3://crabby-images/3d594/3d59468404372eebf651a6b7e4776ef8365e9be1" alt=""
Clearly, there is some overlap, and I intend to show this using Power Query. To begin, I extract my data to Power Query using ‘From Table’ in the ‘Get & Transform’ section of the Data tab.
data:image/s3,"s3://crabby-images/c6fea/c6fea4438946338d7852f1898058951d6cdd031b" alt=""
I will name the headings once I get into Power Query.
data:image/s3,"s3://crabby-images/d00c5/d00c53ee3dc75f9137fa80d8f7d42d7aa5383aaf" alt=""
I need to calculate the overlapping time for each salesperson. To do this, I create a new column based on the source so that I may compare rows.
data:image/s3,"s3://crabby-images/580e1/580e18298d9fe861a9134a827091a658f1878cc1" alt=""
I access ‘Custom Column’ from the ‘Add Column’ tab.
data:image/s3,"s3://crabby-images/cf877/cf877ca6e3d3aa7af42edf550b330cd88488a599" alt=""
Next, I expand GetSource and just for once, I want to use the column name as a prefix in my new columns.
data:image/s3,"s3://crabby-images/248f1/248f15b628919a11360cded9dc0bc22d438a82d1" alt=""
I click OK to create my new columns.
data:image/s3,"s3://crabby-images/d20eb/d20eb544be9391fdd6f8e97e1b5ad51bc660b1ca" alt=""
I now have a product of my columns, so instead of five rows I now have 25. Next week, I will look at a way of doing this where I don’t create all 25 rows. However, let’s continue for this week.
The columns GetSource.Column3 and GetSource.Column4 should be type DateTime, so I change the type on the Transform tab.
data:image/s3,"s3://crabby-images/7f409/7f40953f248180b25d1576cc56f10f0c437a4341" alt=""
I need to create a filtering step to pick those rows I need to make my comparison (i.e. those where the supplier is different so I may compare site times). To get the frame for my filter logic, I create a standard filter. I will then amend the M code.
data:image/s3,"s3://crabby-images/903b2/903b23d91cf01ec396c03011558e4eac9787b027" alt=""
I choose to filter on ‘Amazing Awnings’ and click OK.
data:image/s3,"s3://crabby-images/0f178/0f17873abea19b22b608a3bd44d8327512b85076" alt=""
The M code for this step is
= Table.SelectRows(#"Changed Type1", each ([Supplier] = "Amazing Awnings"))
I need to change this to
= Table.SelectRows(#"Changed Type1", each ([Supplier] <> [GetSource.Column1] and
[Salesperson] = [GetSource.Column3]))
This will result in rows where the supplier is different, but the salesperson is the same
data:image/s3,"s3://crabby-images/a376a/a376a8d82ff60f9aae78498edd9f7f933affd6ed" alt=""
I click on the tick icon to enter my changed step.
data:image/s3,"s3://crabby-images/5f6bf/5f6bf2beb8e872da75c70ab6bd36630b2212c40c" alt=""
I can see that John and Derek have some overlapping time on site.
In order to calculate the overlapping time on site, I will create functions to find the MinSiteTime and MaxSiteTime to apply to my ‘on site’ columns. I close and load my current query, and create a new blank query from the ‘New Query’ dropdown:
data:image/s3,"s3://crabby-images/ed8ab/ed8ab2926ec59afb3ca5d843edc6b3f2b34ca8fd" alt=""
In the Advanced Editor of my new blank query, I add some M code:
data:image/s3,"s3://crabby-images/2837e/2837ec5da2f86ade725e88937d4ba80736a751a5" alt=""
The M code I have added is
(date1, date2)=>
if date1 < date2 then date1 else date2
data:image/s3,"s3://crabby-images/c2f6e/c2f6e7247484f1e035310632cd108e54914ec035" alt=""
I save my function as MinSiteTime and create a new blank query.
data:image/s3,"s3://crabby-images/88eab/88eab3c967f31c2b15535480c7b2bb2735e1f8d2" alt=""
This time, the M code is:
(date1, date2)=>
if date1 >= date2 then date1 else date2
data:image/s3,"s3://crabby-images/47e69/47e69e5de8997bd950ba92c9fc9bc690e7d4ed29" alt=""
I save MaxSiteTime.
Going back to my original query, I can now create a new custom column from the ‘Add Column’ tab. I call my new column Overlap.
data:image/s3,"s3://crabby-images/cba0e/cba0eab6b26cad9e3140c91da1d5e1b0211c73d6" alt=""
The M code I have used is:
= Duration.TotalHours(
MinSiteTime([On site to], [GetSource.Column4]) -
MaxSiteTime([On site from], [GetSource.Column3]))
This formula will give me the number of hours overlapping: if the result is negative, then there is no overlap.
data:image/s3,"s3://crabby-images/6d51b/6d51b6818202bf5c80efeea0083ffa70ac916651" alt=""
As expected, both salespeople have overlapping time on supplier sites. I can add another supplier for John with no overlap to show that this will not give a positive value in the overlap column.
data:image/s3,"s3://crabby-images/64d06/64d0627fbab0ebe2b97d0fb6c308f435a4bfab09" alt=""
If I refresh my query, I can see the results:
data:image/s3,"s3://crabby-images/3f22f/3f22fce7684d2521c35e63f3c75efe9a00db0878" alt=""
The rows involving ‘New Supplier’ have a negative value in Overlap,
showing that there is no overlapping time on site. I can filter on Overlap to remove the negative values:
data:image/s3,"s3://crabby-images/a18da/a18da40718b1b6e5be77c3d047af3800c30a09da" alt=""
I choose values greater than zero (0).
data:image/s3,"s3://crabby-images/2740e/2740e557ddb344679b3cf49cdb48eea825a2d85b" alt=""
I can see the filter applied:
data:image/s3,"s3://crabby-images/0597c/0597cdc63951d1ba3787f207a8849184e89e6dc6" alt=""
I only see those rows where there’s an overlap. If I wish, I can show the overlap in more detail by creating columns to show the beginning and end of the overlap. To do this, I add two custom columns and use my MaxSiteTime and MinSiteTime functions. I use the ‘Invoke Custom Function’ button on the ‘Add Column’ tab to do this.
data:image/s3,"s3://crabby-images/436b5/436b5c176254cb43878bad6ed108ce85db21580c" alt=""
I take the On Site From and GetSource.Column3 (i.e. the linked On Site From), and find the later of the two.
data:image/s3,"s3://crabby-images/78e74/78e74154daeaa8c32040532af7429d6059f50af6" alt=""
I create another column for the end of the overlap using a similar method.
data:image/s3,"s3://crabby-images/9f690/9f6903d1057165a18f20e5272210a1d2844910e0" alt=""
I take the On Site To and GetSource.Column4 (i.e. the linked On Site To), and find the earlier of the two.
data:image/s3,"s3://crabby-images/4f819/4f8190bac3a61f438e4fcbcb42506b91299a632a" alt=""
I can now remove the ‘GetSource’ columns to see my results:
data:image/s3,"s3://crabby-images/61729/6172982bb1119437829c97972bbb49f91eb6ff72" alt=""
I can now see the overlapping site time in the records for my salespeople and can chase up to see what the recorded times should have been. Next time, I will look at a method of doing this where I don’t have to create so many rows during my calculations.
Come back next time for more ways to use Power Query!