Please note javascript is required for full website functionality.

Blog

Power Pivot Principles: The A to Z of DAX Functions – GROUPBY

27 February 2024

In our long-established Power Pivot Principles articles, we continue our series on the A to Z of Data Analysis eXpression (DAX) functions.  This week, we look at GROUPBY

 

The GROUPBY function

The GROUPBY
function is one of table manipulation functions and, it creates a summary of the input table grouped by the specified columns. 

It employs the following syntax:

GROUPBY (table [, groupBy_columnName [, groupBy_columnName [, …]]] [, name, expression [, name, expression [, …]]])

This function primarily involves four [4] key parameters:

  • table: this is required, and it refers to any DAX expression that results in a table of data
  • groupBy_columnName: this is optional and repeatable, and it is the name of an existing column in the table (or a related table), used for grouping the data.  It must be a column name and cannot be an expression
  • name: this is optional and repeatable, and it is a label assigned to a new column that is added to the GroupBy columns, and it should be enclosed in double quotes
  • expression: this is optional and repeatable and is parameter is one of the X aggregation functions where the first argument is CURRENTGROUP().  The section CURRENTGROUP below details the full range of supported X aggregation functions for this purpose.

The following should be noted:

  • the GROUPBY function does not perform an implicit CALCULATE when GROUPBY adds new data columns which mean it does not automatically recalculate these columns, unlike SUMMARIZE, which does update them
  • the GROUPBY function allows the use of CURRENTGROUP, a special function for detailed calculations within the newly added columns
  • the GROUPBY function is efficient for performing several aggregations in a single table scan

Here are some more remarks about this function:

  • Let's break down the GROUPBY function in simpler terms:
  1. the GROUPBY begins with a chosen table and all related tables directly linked to it ("to-one" direction)
  2. create groups using all the GroupBy columns.  These columns must exist in the table from step 1
  3. each group that GROUPBY function creates is shown as a single row in the final result.  However, this single row is the related rows from the original table
  4. for every group, GROUPBY calculates new columns (called extension columns).  However, unlike the SUMMARIZE function, GROUPBY does not automatically adjust these new columns based on other filters or calculations in your data.  Also, it does not use the group itself to filter the rest of the data.
  • when we name a column, we need to provide a formula for it.  If we name a column without a formula, it causes an error
  • groupBy_columnName must be either in a table or in a related table
  • GROUPBY is primarily used to perform aggregations over intermediate results from DAX table expressions.  For efficient aggregations over physical tables in the model, consider using the SUMMARIZECOLUMNS or SUMMARIZE functions
  • each name must be enclosed in double quotation marks
  • this function is not supported for use in DirectQuery mode when used in calculated columns or row-level security (RLS) rules.

In particular, it is worth noting its interaction with CURRENTGROUP:

  • CURRENTGROUP is specifically for use within the GROUPBY function to create a new column
  • In-effect, CURRENTGROUP picks out rows from the initial table that correspond to each row in the GROUPBY results.
  • The CURRENTGROUP function takes no arguments and is only supported as the first argument to one of the following aggregation functions: AVERAGEX, COUNTAX, COUNTX, GEOMEANX, MAXX, MINX, PRODUCTX,STDEVX.S, STDEVX.P, SUMX, VARX.S, VARX.P.

Let’s consider an example here where we have these following tables:

We can write the following DAX expression on the edit DAX:

EVALUATE

    GROUPBY(Fact_Sales,[Country])

This will give the unique list of the country:

Let’s say we want to know the sales amount of each country we can employ the following expression:

EVALUATE

    GROUPBY(Fact_Sales,

    [Country],

    "Sales by Country",

    SUMX(

            CURRENTGROUP(),

            Fact_Sales[SalesAmount]

        )

    )

This will give the sales amount of each country that we want


Come back next week for our next post on Power Pivot in the Blog section.  In the meantime, please remember we have training in Power Pivot which you can find out more about here.  If you wish to catch up on past articles in the meantime, you can find all of our Past Power Pivot blogs here.


Newsletter