DAX RANKX function – Part II

In a previous post, I explained how to use RANKX function to calculate revenue ranking for each provider in each department. In this post I’m going to explain how to rank at different levels.

I’m using my previous example which is a list of providers in different hospital clinical departments, I want to rank department and rank providers in each department by revenue (two ranking levels). Here is my sample table I started with creating a table in Excel “Table1” and add it to Power Pivot model.

clip_image001

I created following measures to rank provider by revenue:

TotalRevenue:=SUM(Table1[Revenue])

Rank:=RANKX(All(Table1[Provider]),[TotalRevenue],,,Dense)

The Rank measure will rank the [TotalRevenue] for the current Row against the [TotalRevenue] for all other rows in the Revenue column. The ALL(Table1[Provider]) is the list of values that the rank function use to compare the current row against. Since I used ALL function to ignore Provider filter context, I will get the following:

image

Now let’s add the department to pivot table and see how Rank measure works:

image

As you can see the Rank by provider still working within each department by the rank by department is not, it uses the department total and compare it against list of provider values. I want to rank each department against all departments, and rank providers within each department. In order to do that in Power Pivot, I want to create an IsDepartmentLevel measure that will return TRUE if the current selection in the PivotTable has a valid correspondent department definition, FALSE otherwise. If it is TRUE, Rank department, otherwise Rank providers. Let’s see how to write the IsDepartmentLevel measure:

IsDeptartmentLevel:=(

                                                 COUNTROWS(Table1)

                                                 =

                                                 CALCULATE(COUNTROWS(Table1) ,

                                                                          ALL(Table1),

                                                                           VALUES(Table1[Department])

                                                                          )

                                              )

 

The IsDepartmentLevel measure checks that the number of rows that are active in Table1 equal to the number of rows that would be active by removing from the filter context any column except Department.

As you can see in the following Pivot table, I added the IsDepartmentLevel measure to Pivot table, the IsDepartmentLevel measure is FALSE for every provider level but TRUE for department.

image

Now we are able to identify the department level which we need in order to evaluate the Rank measure and Rank department vs. provider.

Let’s re-write the Rank measure and see how to use the IsDepartmentLevel measure and RANKX function to rank departments and providers by revenue.

Rank:= IF(

                    [IsDeptartmentLevel],

                    RANKX(Filter(All(Table1[Department]), [TotalRevenue]),[TotalRevenue],,,Dense),

                    RANKX(Filter(All(Table1[Provider]), [TotalRevenue]),[TotalRevenue],,,Dense)

                  )

I used IF function to evaluate IsDepartmentLevel measure, if it is TRUE, it will calculate the revenue ranking for each department, otherwise it will calculate the revenue ranking for each provider.

The following Pivot table shows the ranking for each department.

image

When I expand the departments, it shows the ranking for each provider in each department and maintained the ranking for each department.

image

In this post, you have seen how to use RANKX function and calculate at different levels (department and provider) using one measure.

T-SQL Linear Regression Function

Predictive Analytics is the wave of the future and has become vital in today’s data evolution.

Predictive analytics incorporates a variety of techniques from statistics, modeling, machine learning, and data mining that analyze current and historical facts to make predictions about future.

I’ll describe the linear regression approach and how to write a T-SQL function to calculate the regression and produce the Intercept, Slope and R2 which are used in a regression equation to predict a value. In simple linear regression, the topic of this post, the predictions of Y when plotted as a function of X form a straight line.

The following table lists a sample data of car drivers driving experiences (in years) and monthly auto insurance premiums.

 

Driving Experience

(X value)

Insurance premium

(Y values)

7

75

10

68

12

65

18

60

20

57

25

50

Table1

Here is the T-SQL code will take the X and Y values from table 1 and calculate the regression coefficients:

declare @n int,           

              @Intercept DECIMAL(38, 10),

              @Slope DECIMAL(38, 10),

              @R2 DECIMAL(38, 10)

Select @n=count(*) from Table1

Select

@Slope = ((@n * sum(x*y)) (sum(x)*sum(y)))/ ((@n * sum(Power(x,2)))-Power(Sum(x),2)),

@Intercept = avg(y) ((@n * sum(x*y)) (sum(x)*sum(y)))/((@n * sum(Power(x,2)))-Power(Sum(x),2)) * avg(x)

From Table1

 

Select @R2 = (@Intercept * SUM(Y) + @Slope * SUM(x*y)-SUM(Y)*SUM(y)/@n) / (SUM(y*y) SUM(Y)* SUM(Y) / @n)

Select @Slope as Slope, @Intercept as Intercept, @R2 AS R2

From Table1

 

So to predict what will be the insurance premium if the driving experience years = 15

Y = Intercept + 15 * Slope (where y is the insurance premium)

For an in depth article about the linear regression, please refer to the Wikipedia Entry,