In the past, getting a count of distinct items in Excel using formulas only was quite cumbersome. With the help of dynamic arrays, we can get it done quickly. Let’s look at an example using a list of the top 100 highest-grossing movies as of January 2022. Data source: Kaggle.
Listing the distinct items with UNIQUE()
The Distributor
column of this dataset tells us what distributor is associated with each film. We’d like to know how many distinct distributors are represented in this data. The UNIQUE()
function works like a charm here:
=UNIQUE(movies[Distributor])
It’s a good idea before proceeding to review the results in case any of these categories should be merged. For example, what if the data had both Twentieth Century Fox
and 20th Century Fox
? Should these really be counted as two unique categories?
Fortunately, these categories look clean so we can move on without any adjustments. We have a valid list of distinct distributors. But what about the count of those items?
Counting the distinct items
To get their count using a formula, we can use COUNTA()
. We won’t use COUNT()
here because that function does not count text-based cells, which is what we have.
Let’s modify our formula by wrapping what we have in COUNTA()
:
=COUNTA(UNIQUE(movies[Distributor]))
So now we know that of the 1000 highest-grossing films, 12 distributors are represented. I don’t think I’ll be opening a studio anytime soon!
How have you found a count of distinct items in the past? What do you think of dynamic arrays and this approach? Let me know in the comments.
Leave a Reply