In a previous blog post I shared some reasons for storing data in Excel tables and why to get started now. If you’ve made the leap to tables, perhaps you’re perplexed by the format of their formulas. This post on these so-called structured references is for you.
Let’s demonstrate on a small dataset about the five boroughs of New York City:
Structured versus unstructured references
Unstructured references change by location
Think of the typical Excel formula and you’re probably imagining SUM(A1:A4)
, B5/E5
and so forth. These are known as unstructured references because they point to a location in the workbook, which could change. For example, we could take the sum of A1:C1
below, but add any row or columns and that formula changes.
Unstructured references have their place, but it’s worth keeping in mind that this property makes them unstable.
Structured references are based on fixed names
In contrast, structured references refer to specific named ranges in the workbook. These references do not change based on the location of the data.
For example, I’ve converted A1:C1
into a named range called my_range
. Now, no matter where this range is located, references to it remain the same.
Structured references in Excel tables
As a two-dimensional named range, Excel tables similarly take structured references. Typically you’ll see these expressed as aggregations (grand totals, averages, etc) or calculated columns. Let’s look at each.
Aggregating with structured references
Taking grand totals or averages are examples of aggregation. We can refer to a column in aggregation using the following structured reference notation:
table_name[column_name]
Let’s combine this notation with SUM()
to find the total population of New York.
This table is named nyc
. You can confirm that with the Name Manager. We can find the total of the population
column like so:
=SUM(nyc[population])
If you’re used to writing classic formulas like =SUM(B2:B6)
, this notation can take some getting used to. But in the long run, the stability and readability of structured references make them hard to beat.
Creating calculated columns
Now let’s take a look at creating calculated columns. For example, we could divide population
by area
to get a population density column.
To get started, type density
in cell D1
and you’ll see a new column automatically gets inserted into your table. Pretty nifty! Next, press =
in cell D2
to start the typical formula, but instead of typing in your cell references, hit the left arrow key to grab the relevant columns:
Once you hit Enter, this calculation will automatically apply to the entire column.
This makes the structured reference for creating calculated columns:
[@column_name]
Staying structured
There are a number of other structured references for Excel tables, such as how to pull the header names from a table, its totals and more. For now, focus on these basics and don’t abandon them. It can be tempting to revert back to the familiar, unstructured references. Sticking with structured, however, is the short-term pain, long-term gain you need to succeed as an analyst.
What questions to you have about structured referenes in tables? Do you have examples to share? Still debating whether tables are worth the effort? Let me know in the comments.
Leave a Reply