(Before you read this, if you want to try these out in Excel for yourself, download the exercise file below. It includes the sample data and the Python snippets from this post, so you can actually run the examples instead of just nodding along politely.)
Python in Excel has a favorite child, and it is absolutely the pandas DataFrame.
That is not a bad thing. A DataFrame is basically a table with rows, columns, and labels, so it fits the Excel grid really nicely. Read a range in with xl() and you usually get a DataFrame back. Most tutorials stop there, and honestly, you can get pretty far knowing nothing else.
But DataFrames are not the whole story.
Once you start doing real work in Python in Excel, a few other structures start showing up: lists, dictionaries, tuples, NumPy arrays, and pandas Series. Sometimes you create them on purpose. Other times they show up as the output of something you ran, and you need to know what you are looking at.
So here are six common Python data structures, what they mean in plain English, and when you would actually use them in a finance, accounting, or operations setting.
1. List
What it is: an ordered, changeable collection. You write it with square brackets.
A list is just a sequence of things. The items stay in order, you can add or remove items, and they do not all have to be the same type.
If you have ever thought through twelve monthly budget numbers in order, you already understand the basic idea.
Where you would use it: when you have a small group of items you want to keep together or move through one at a time. Budget months, column names, tickers, entity codes, departments, scenario names, that kind of thing.
targets = [250000, 260000, 275000, 280000, 290000, 300000,
310000, 315000, 320000, 330000, 340000, 360000]
sum(targets)
This returns:
3630000
You also use lists all the time without realizing it. For example, when you select several columns from a DataFrame, the inner brackets are a list:
gl = xl("GL[#All]", headers=True)
gl[["Account", "Department", "Amount"]]
This returns a smaller DataFrame with just those three columns:
| Account | Department | Amount |
|---|---|---|
| Office Supplies | Operations | 1240.55 |
| Software | IT | 8750.00 |
| Travel | Sales | 2180.75 |
In the grid: a plain list usually spills down a column, or appears as a Python object you can expand.
2. Dictionary
What it is: a set of key-value pairs. You look something up by its label, not its position. You write it with curly braces.
If a list is a sequence, a dictionary is a lookup table. This is probably a structure that feels natural to Excel users, even if it looks unusual, because it is very XLOOKUP-ish. You have one thing, and you want to map it to another thing.
Where you would use it: currency codes to FX rates, account codes to names, department numbers to business units, messy labels to clean labels, or old column names to new column names.
fx = {"USD": 1.00, "EUR": 1.09, "GBP": 1.27, "JPY": 0.0067}
fx["EUR"]
This returns:
1.09
Renaming columns also uses a dictionary:
gl = gl.rename(columns={"amt": "Amount", "dt": "Date", "acct": "Account"})
gl.head()
This returns the same DataFrame with cleaner column names:
| Account | Date | Amount |
|---|---|---|
| Office Supplies | 2025-05-23 | 1240.55 |
| Software | 2025-05-24 | 8750.00 |
| Travel | 2025-05-25 | 2180.75 |
And if you want to create a one-row summary, a dictionary is a clean way to do it:
summary = {"Revenue": 4_800_000, "COGS": 2_900_000, "OpEx": 1_100_000}
pd.DataFrame([summary])
This returns:
| Revenue | COGS | OpEx |
|---|---|---|
| 4800000 | 2900000 | 1100000 |
In the grid: a dictionary can show as key-value pairs, but a lot of the time you feed it into pandas so it becomes a DataFrame.
3. Tuple
What it is: like a list, but locked. Once you create it, you cannot change it. You usually write it with parentheses.
Tuples are not usually the first thing you reach for as a beginner. They are more often something Python gives back to you. The point of a tuple is that the pieces belong together and should not change.
Where you would use it: fixed pairs or fixed records. The most common example is probably the shape of a DataFrame.
gl.shape
That might return something like:
(4213, 6)
That means 4,213 rows and 6 columns.
You may also see tuples when a function returns more than one result:
def spread(series):
return series.mean(), series.std()
avg, sd = spread(gl["Amount"])
avg, sd
This returns something like:
(2486.72, 1394.58)
Here, the function returns two values together, and then you unpack them into avg and sd.
You can also run into tuples when grouping by more than one column. For example, a group might be identified by something like:
("West", "2025-Q1")
That is one fixed pair: region and quarter.
In the grid: you usually see tuples as compact readouts, like (rows, columns), or you unpack them into separate values.
4. NumPy Array
What it is: a grid or sequence of numbers built for fast math.
A NumPy array can look a lot like a list, but it behaves very differently.
This is one of those things that feels small until it clicks. If you multiply a regular Python list by 1.05, Python does not increase every item by 5%. It gets weird because lists are not designed for that kind of math.
But if you multiply a NumPy array by 1.05, every number gets increased by 5%.
That is the magic: NumPy is built for vectorized calculation.
Where you would use it: Monte Carlo simulations, portfolio math, risk models, matrix calculations, or any calculation where you want to apply math across lots of numbers quickly.
np.random.seed(0)
sims = np.random.normal(0.08, 0.15, 10000)
sims.mean()
This returns approximately:
0.0772
And this:
np.percentile(sims, 5)
returns approximately:
-0.1649
That is a simple 5th percentile read, which you might use as a quick downside-risk estimate.
A simpler example:
base = np.array([100, 200, 300, 400])
base * 1.05
That returns:
array([105., 210., 315., 420.])
In the grid: a NumPy array can spill like a range. You will often move between arrays and DataFrames depending on whether you need fast math or labeled tables.
5. pandas Series
What it is: a single labeled column.
A Series is what you get when you pull one column out of a DataFrame. It is one-dimensional, but it still carries labels with it.
Think of it as one worksheet column with the row labels still attached.
Where you would use it: one metric over time, one column from a table, daily prices, invoice amounts, status counts, totals by region, averages by vendor, and so on.
gl["Amount"]
This returns a Series, something like:
0 1240.55
1 8750.00
2 2180.75
3 3624.68
4 1986.26
Name: Amount, dtype: float64
Group-by calculations often return a Series too:
gl.groupby("Region")["Amount"].sum()
You might get something like:
Region
Central 8874537
East 11648517
West 13997353
Name: Amount, dtype: int64
Counting values in a messy field also gives you a Series:
gl["Status"].value_counts()
This might return:
Status
Paid 1842
Open 1267
Overdue 614
Pending 490
Name: count, dtype: int64
In the grid: a Series usually spills as one column, often with its index beside it.
6. pandas DataFrame
What it is: the two-dimensional table. Rows, columns, and labels.
This is home base for Python in Excel. A DataFrame is what you usually get from xl(), and it is what you usually want to return to the grid. It looks and behaves like tabular data, which is why it feels so natural in Excel.
One useful way to think about it: a DataFrame is a bundle of Series that share the same index. Each column is a Series. Put several Series together, and you have a DataFrame.
Where you would use it: basically any normal business dataset. General ledger, trial balance, transaction register, AR aging, sales export, headcount file, budget table, whatever.
gl = xl("GL[#All]", headers=True)
(gl.groupby("Department")["Amount"]
.sum()
.sort_values(ascending=False))
This might return:
Department
Sales 9662163
Operations 7083117
Marketing 5069311
IT 4973422
Finance 4149455
HR 3582941
Name: Amount, dtype: int64
If you want the result to look more like a normal two-column table in the Excel grid, you could write it this way:
(gl.groupby("Department", as_index=False)["Amount"]
.sum()
.sort_values("Amount", ascending=False))
That returns a DataFrame:
| Department | Amount |
|---|---|
| Sales | 9662163 |
| Operations | 7083117 |
| Marketing | 5069311 |
| IT | 4973422 |
| Finance | 4149455 |
| HR | 3582941 |
In the grid: a DataFrame renders as a proper table. This is the main reason Python in Excel works so well for spreadsheet users.
How they fit together
These six structures are not separate little islands. A NumPy array is the fast numeric core. Put labels on one dimension and you have a Series. Line up several Series into columns and you have a DataFrame.
Lists, dictionaries, and tuples are the everyday plumbing that help you select, map, rename, group, and move small pieces of data around.
| Structure | Shape | Changeable? | Labels or Index? | Where You Meet It |
|---|---|---|---|---|
| List | 1-D sequence | Yes | No | Budget months, column selections, tickers, codes |
| Dictionary | Key-value pairs | Yes | Keys | FX rates, mappings, renaming columns |
| Tuple | Fixed sequence | No | No | (rows, columns), multiple return values |
| NumPy array | 1-D or 2-D numbers | Yes | No | Simulations, portfolio math, vectorized calculations |
| pandas Series | 1-D labeled data | Yes | Yes | One column, totals by group, time series |
| pandas DataFrame | 2-D labeled table | Yes | Yes | Ledgers, trial balances, transaction tables |
In short:
- A list helps you hold a sequence of items.
- A dictionary maps one thing to another.
- A tuple gives you a fixed pair or record.
- A NumPy array does fast numeric math.
- A Series is one labeled column.
- A DataFrame is the full table.
Once you can recognize these shapes, Python in Excel gets a lot less mysterious. Many errors are really just Python saying, “You gave me the wrong kind of thing.”
If you want help bringing Python in Excel into your own workflow or your team’s training plan, take a look at my How I Work page. I run practical Excel and analytics training for teams that want to use these tools on real business data, not toy examples that disappear the second the class ends.
