# My blog has moved to jimdehner.com

Thank you for all the support over the past 2 years - I recently updated my blog to make it easier to read, improve the rendering of images and tie back to video presentations that are associated with the content - Check it out at  Jim's new blog  -

# My blog has moved to jimdehner.com

Thank you for all the support over the past 2 years - I recently updated my blog to make it easier to read, improve the rendering of images and tie back to video presentations that are associated with the content - Check it out at  Jim's new blog  -

Frequently we see questions on how identify the Top N and group the "Other".

A common problem with many business applications - determining the Top N can be easily done in a number of ways.   (Presented on VizConnect - see the recording at Link to Recording)

1 The Top N filter

Just apply a Top N filter to the dimension

A more typical question adds some complexity where the user wants to see the Top N across multiple dimensions which can be solved using Index() or Rank()

2 The Index() solution

Index is based on the position of the record in the data table so start by sorting the data:

Add the Index() (note can be hidden)

and finally add a filter set to N

It works well but there is a need to pre sort the data and when there are multiple dimensions you will need to determine which column to use in the sort:

3 The Rank solution

An alternative is to use a ranking based on either a measure or an LOD. All forms of Rank are table calculations and look something like this:

RANK_UNIQUE(sum([Sales]),'desc')

and each segment can be treated independently without presorting the data:

4 Top N and Other

Users often want to see the Top N individually and group the remaining as "Other".  It requires a complex solution that combines both LOD's, Table Calculation, the use of Show/Hide filters and a hack

Start with an LOD to find the Top N and "Other" based on subcategory and region (Note I am using Include to support drilling down into the Region Hierarchy) :

{ INCLUDE  [Region],[Category]:sum([Sales])}

and then Rank the LOD

rank(sum([6 lod sales region subcat]),'desc')

to start building a data table for the viz like this

for flexibility I added a parameter to set the N level for the Top N and created this T/F calculation

[6 rank for lod sub reg]<=[N to top N]

and used this to add a Label to the "Other" group

if [6 rank tf ] then attr([Sub-Category]) else "Other" end

the data table now look like this:

use a table calculation to total the "Other"

window_sum(if [6 subcat and other label]="Other" then sum([6 lod sales region subcat]) end)

and restart every Region places the Other Total on all subcategories not in the Top N - after sorting the table now looks like this

Now the hack, we only want to see each of the Top N subcategories and 1 of the subcategories labeled "Other"  in each region  - That is done with a show hide filter:

if [6 rank for lod sub reg]<=[N to top N]+1 then "show" else "hide" end

After applying the filter and hiding the real subcategory labels the data table is complete

as a bar chart the results look like this

5 Drill down in Hierarchy

One advantage of using the Include LOD is that it supports drill into the hierarchy finding the Top N at subsequently lower levels

the LOD expression used here was

{ INCLUDE  [Region],[Category]:sum([Sales])}

which will support drilling down from Region to the State level - at the Region level and filtered for Central the bar chart looks like this

Drilling down in the Region hierarchy to the State level will return the Top N and Other unique to each state

Hope this helps the next time you need to find the Top N and group the Other

The examples used here,and a few others,  can be downloaded at workbook here

Enjoy

Jim

# My blog has moved to jimdehner.com

Thank you for all the support over the past 2 years - I recently updated my blog to make it easier to read, improve the rendering of images and tie back to video presentations that are associated with the content - Check it out at  Jim's new blog  -

Using  Filters and Sets is nothing new  - but how often do you use the "Condition" or "Top N" tabs - and have you ever used them together.

Here we are going to look at how to use Conditions and Top N in combination with Sets and Filters -

The Condition and the Top N tabs are on the setup for Sets and Filter - They look the same on each and provide the capability to further filter the dimension used in the set or filter.  The can interact with each other - in an   AND  fashion and can be used in combinations with filters on other dimensions or sets -  (Presented at VizConnect Link to Recording)

Example 1 - Simple Filter

Using Superstore data, I've created a parameter on Region and a filter on State with this formula  on the condition tab -

Sum([Sales])>10000  and Max(Region) = [Region Parameter]

The Formula window works like and LOD - dimensions and measures must be aggregated and you can not use ATTR(). The condition will filter out all States with total sales less tha10000 and only states in the Region selected in the parameter will be included.

Now add in a Top N type filter ( I used the Bottom 4)

And the Bottom N is applied in addition ( AND filtering )  to the filters applied on the condition tab.

Before seeing more complex examples we need to see how the Order of Operation will affect the filtering.

The Top N filter takes place in Step 4 in the Order of Operations along with the creation of Sets and Fixed LOD's expressions.   That is after Context Filters but before Dimension filters are applied.  It is also before any other type of calculations including Table Calculations are determined in steps 7 through 10 -

Example 2 - Top N Filter with Context Filter

We will start with a Superstore example to find the Top 7 States based on sales

With Region in Context the filter will return in 7 state with the greatest sales from the regions in the view

if Region is removed from Context the top 7 are determined before the Region filter is applied and

Example 3 - Multi Dimension Top N  Sets

Now let's look at combining additional dimensions in the Top N.  We still want the top 7 States but the users also wants the top 7 based on the total sales of selected subcategories -

First we need an LOD to determine the sales at the combination of State and Subcategory

{ FIXED [Sub-Category],[State]:sum([Sales])}

Next create a set with this condition

sum({ FIXED [State],[Sub-Category]: sum([Sales])} )=

sum([3 LOD sales by subcat and state])

and add in the Category and subcategory Filter to Context -  and the set will include the 7 states that have the max total sales based on the filter subcategories

when the subcategories are changed the Top 7 adjust :

Example 4 - Fixed 7 States find the Top N subcategories

Users want to find the top categories within the 7 overall best States.

Now the Order of Operation and the use of Table Calculations come into play.  First, we need to first determine the top 7 states and then within each state find use Rank to order the subcategories -

Ranking functions are table calculations that are determined in Step 10 of the OOO after the sets and lods are determined in Step 4.   Using the LOD we calculated in the previous example

{ FIXED [Sub-Category],[State]:sum([Sales])}

add that to a Ranking function

RANK_UNIQUE(sum([3 LOD sales by subcat and state]),'desc')

and replace the set with the ranking function the number of categories to show

It is easier to see the result in a text table format

The top 7 States overall are first determined then the top 3 subcategories are identified in each state

Example 5 - the Reset Button

Ever have a dashboard where you've applied a several filters and you would like to "Reset them"  with a single click - It can be done with a "Reset Button" - No magic here just the creative use of an action filter

Create a worksheet with a shape and a label -

then on a dashboard add an Action Filter - from the Rest button to sheet where the filters are applied  - the select each filter you want to reset - you will get a Warning Message  but that is not a problem

Select the rest button will return all values to the filters

Example 6 Default Values for filters

Some users want set default values for filters and it can be done with some creativity using a parameter ( Credit for this goes to Joshua Milligan - Tableau Zen Master Hall of Fame -  Link to Joshua's work  )  It is done by first creating a parameter on Region and adding in 2 values - All and Default

Next create a case statement for Default   In this example there are 4 States we want for the when the user selects Default - they are connected with OR

CASE [Region Param with all and default]

When "Default" then

[State] ="New York" or

[State] = "Illinois" or

[State]= "Florida" or

[State]= "Washington"

end

then we just need a TF filter

[Region Param with all and default] = "All" or

[Region]=[Region Param with all and default]

or CASE [Region Param with all and default]

when "Default" then [8 Default state from region]

end

When All is selected in the Region parameter the statement returns a True.  if the user selects and single region just that region will return a True. When Default is selected the Default case statement returns a True for the 4 states and returns this

Example 7  Cascading Parameters

Frequently users want to have the drop down values in one parameters be dependent on the value of another parameter.   Currently that is not capable in "out of the box Tableau" and I have responded to users to use a parameter and a "Relevant Value Filter or Set" on the Condition Tab like this:

Which would create a set base on the parameter and only show the IN values in the set

and from there you can use the set to drill down

If you can use extensions there is another option

First you need to create 2 parameters - one to drive the extension - here on Category

Then a second parameter that will receive the values from the cascading filter and place both on the worksheet

Search Parameter and select Data Drive Parameter

the extension needs to be configured - on the first page Identify the parameter that will accept the values (the second above), the worksheet it is on and the dimension on that sheet that relates to the values ( here subcategory)

on the Option Tab select 'Update List when Parameter Changes' and the parameter that will drive the list (the 1st parameter - here Category Parameter)

Then create the viz and the first parameter will drive the drop down list in the second

I hope this help explain the use of the Condition and the Top N tabs and has given you some ideas on how you can use them in your vizes

The workbook used here can be found at   Link to Workbook

Enjoy

Jim