We are often required to present data where a business needs to analyse the average, median or any type of aggregation over time in order to see how the business is doing and where improvement is required. Data aggregation is simple when a record is stored in a single row in your data table, but often this is not the case. For example a single sales transaction to a customer may be captured in multiple rows. Each row can contain the data on a SKU and temperature category for example. To just simply calculate the average sales per customer transaction will not be possible by using a measure such as ‘AVERAGE’ by itself. We will shed more light on this in the blog post below.

 

Data Aggregation through Measures and Functions

We will use the example as explained in the introduction where a single sales order is captured across multiple rows. What we want to see is the average sales per customer order over a period of time. Let’s see our sample data below:

Customer #Sales OrderProduct CodeProduct TempQTY SoldDate
1587ab1FR202021/04/01
1587ab2FR152021/04/01
1587ab3FR502021/04/01
1587ab4CH242021/04/01
1587ab5CH152021/04/01
3558ab7FR302021/04/01
3558ab2FR52021/04/01
3558ab8FR102021/04/01
3558ab9CH22021/04/01
3558ab5CH82021/04/01
2454ab1FR252021/04/01
2454ab6FR162021/04/01
2454ab8FR562021/04/01
2454ab4CH882021/04/01
2454ab5CH442021/04/01

 

We are first going to calculate the Average sales on 2021/04/01. We will use the AVERAGE function:

Average = AVERAGE(‘Sales Data'[QTY Sold])

The result that we get is the average units sold per row in the data set.

This is not exactly what we require. The average units sold per sales order is 136.

Another function that we will use in the final result is AVERAGEX. The AVERAGEX function reference a table and returns the average value. The DAX for AVERAGEX is shown as below:

AVERAGEX(<table>,<expression>)

The result will be the same as the Average function.

 

The  Simple Solution

In order to achieve the desired result we need to make use of two measures. Firstly, we use the SUM measure. This is a simple measure that sums the values in a certain column. We sum the “QTY Sold” in the dataset :

Total QTY = SUM ( ‘Sales Data'[QTY Sold] )
Next we will make use of the AVERAGEX function. We need to create a table that contains all the unique Sales Orders. The function that creates a single column table with unique values is the VALUES function. The VALUES function requires a Table or Column as input.
VALUES(<TableName Or ColumnName>)

From the dataset we use the Sales order column as the input column for the VALUES function:

VALUES ( ‘Sales Data'[Sales Order])

Now we use the AVERAGEX function with the VALUES and Total QTY measures to calculate the average sales per sales order.  See the formula below:

Avg QTY per Sales Order = AVERAGEX ( VALUES ( ‘Sales Data'[Sales Order]), [Total QTY])
The results of the above measure vs just using the AVERAGE or AVERAGEX function is shown below:
Aggregate Data
We can clearly see the difference here. By using AVERAGE we determined the average of each row in our data. Through using our ‘Avg QTY per Sales Order‘ measure we solved the data aggregation challenge by providing a scalable solution to show the average QTY sold of all our products per sales order.
That is it for today’s blog post, please visit our other blog posts for some awesome content that helps you to overcome real life business requirements through Power BI.
Learn more about using Power BI here.