The ebook version of this tutorial includes a dataset and exercise to employ these techniques.
Right at the start of my book on Excel for journalists I talk about sorting data to find out which values come top or bottom. However, there is a family of functions which will give you a lot more control in finding out not just who is top or bottom, but the rank of any value in any series of values.
This is particularly useful if you want to compare ranks.
For example, say you had a table showing school performance across the last two years.
Each table shows the percentage of pupils achieving the top grades in that year. You can use RANK to find out what rank each percentage would have placed the school in for each year.
But once you have both ranks, you can also calculate if they have moved up (from a lower rank to a higher one), or down, and which schools have moved the most.
Another common situation is where you have a table for different institutions or individuals showing dozens of scores on various measures and different scales. These can be converted to ranks for easier comparison – and also to calculate ‘average’ ranks (or highest or lowest) for each institution or individual to help focus or speed up your process of identifying newsworthy entries.
Similarly, you can use PERCENTRANK to see what percentage of other values a number ranks above. For example, if a team has conceded 45 goals this season the PERCENTRANK function could tell you ‘They have conceded more goals than 95% of teams in the league’.
Here is how both functions work:
Using the RANK function
The RANK function needs at least two ingredients:
- The value you want to rank
- And the range of values you want to rank it within (an array of cells)
There is also a third optional argument:
- Whether you want to rank in descending or ascending order
A typical formula looks like this:
The first ingredient, then, is the cell containing our value:
And the second ingredient is the range of cells we want to rank it within:
C1:C3000. Text values are ignored.
By default it will give us the rank where values are ranked in descending order: that is, from biggest to smallest.
Of course this makes sense if we are dealing with values where more is better, like test scores, goals scored, or crimes solved, or (sometimes) income.
However, if we are dealing with values where less is better, like goals conceded, or crimes committed, or (sometimes) money spent then we need to rank it in ascending order.
To do this, add
1 as a third argument like so:
(If there is more than one of the same value, then RANK will return the earliest rank for the value. In other words, if the value occupies 3rd, 4th and 5th position then the result of the formula will be
RANK is replaced by RANK.AVG and RANK.EQ
The RANK function has actually been replaced by two slightly more specific versions of the function: RANK.AVG and RANK.EQ. It may be that later versions of Excel stop supporting the more simply-named function.
RANK.EQ works exactly the same as explained above.
RANK.AVG, however, returns an average rank rather than the rank of the value when it first appears.
What this means is that if the specified value occupies 3rd, 4th and 5th position when placed in order (descending or ascending) then the result of the formula will be
If the value just occupies 4th and 5th position then the result of the formula will be
Finding values at a particular rank: LARGE and SMALL
Two further functions come in very handy if you want to reverse the process – and get an answer to a question like ‘What value is 10th highest?’ or ‘Which value is second from bottom?’
LARGE and SMALL count from the top and bottom respectively. LARGE will tell you the value which is a specified position from the top (the ‘nth largest’) and SMALL will tell you the value which is a specified position from the bottom.
Each needs two ingredients: the array of cells you want to look at, and the rank you want to grab (from the top or bottom, depending on the function).
So, in a range of ten cells containing the numbers 1 to 10, this formula:
…will return the answer
8 is the third largest value in that sequence (10 being the first largest, and 9 being the second largest).
This formula, however:
…will return the answer
3 is the third smallest value in the same sequence (1 being the smallest, and 2 being the second smallest).
It does not matter if the values are ordered in any way, and text and empty cells will be ignored.
It is also worth pointing out that if the same number appears twice, it will be ranked accordingly. So, if you put this sequence in column A:
1, 2, 1, 5, 3, 2, 5, 9, 100
…the result of
=SMALL(A:A,4) would both be
2, because there are two instances of
2, and when arranged from smallest to largest
2 appears third and then fourth.
- RANK allows you to find out the rank of any specified value within a specified range of numbers.
Ranking can be particularly useful as a way of quickly seeing where your country or local area ranks well or badly across a range of measures – and spotting that story quicker.
The RANK function has been replaced by RANK.AVG and RANK.EQ.
All RANK functions have three arguments: the value you want to rank; the range of values you want to rank it within (an array of cells); and (optionally) whether you want to rank in descending or ascending order.
By default values will be ranked in descending order (from biggest to smallest).
To specify that you want it ranked in ascending order, use
1as the third argument like so:
If the value appears more than once both RANK and RANK.EQ will give you the highest rank at which it appears.
RANK.AVG, however, will give you the average rank if the value appears more than once.
LARGE will give you the value ranked in a specified number position, from a specified array. For example
=LARGE(A:A,1)will give you the first ranked (largest) value in column A.
SMALL will give you the value ranked in a specified number position of smallest values, from a specified array. For example
=SMALL(A:A,1)will give you the smallest ranked value in column A, and
=SMALL(A:A,9)will give you the 9th smallest value in that column.
If you’re calculating ranks and copying the formula down, remember to fix the range reference using the dollar sign to avoid the range changing for each formula.
Watch out for grand totals in your ranges: these will always be ranked top, meaning the ‘worst’ or ‘best’ places will end up ranking ‘second’. Make sure to delete grand total rows before writing your formulae.
Make sure data is comparable – for example it is per capita or proportional, rather than absolute totals where the populations differ. Of course in some contexts totals are comparable because the story compares a single person, team or organisation (e.g. the top goalscorer; team conceding the most goals; organisation making the most money).
Pingback: The most-read posts on Online Journalism Blog — and on Medium — in 2016 | Online Journalism Blog
Thanks for this, very helpful. Could you tell me how to rank time values (e.g. 12:35:26, minutes, seconds, milliseconds)? I’m having trouble with this. Cheers 🙂