Just like in biology, where cells are the building blocks of life, in your Excel workbook, cells serve as the foundational elements. In this post, you’ll discover how to interact with individual cells and then extend your knowledge to managing entire rows and columns using openpyxl.
For this demo, I’ll assume you’re familiar with installing and working with Python packages, along with basic operations such as creating variables, indexing, and so on. If you need assistance with these topics before proceeding, consider checking out my book Advancing into Analytics:
To kick things off, start by importing workbook.Workbook
and load_workbook
from openpyxl.
from openpyxl.workbook import Workbook
from openpyxl import load_workbook
Next, use the load_workbook()
function to read in the regions.xlsx
file. I’m going to define the active worksheet as ws
, and now that it’s an Excel object I can index it to access individual cells.
wb = load_workbook('regions.xlsx')
ws = wb.active
For example, I can access cell A1
of this worksheet by indexing ws
at A1
. Print my_cell
and you’ll see the result gives the cell location, but what about its underlying value?
my_cell = ws['A1']
my_cell
<Cell 'Sheet1'.A1>
That can be found by accessing the value
attribute of my_cell
:
my_cell.value
'Region'
Now let’s look at working with rows and columns, which, like cells, are possible to index.
I’m going to start with rows. First, create a variable called cell_range
, then specify which row to grab with the index position 1
, or the first row in the workbook.
cell_range = ws[1]
cell_range
(<Cell 'Sheet1'.A1>,
<Cell 'Sheet1'.B1>,
<Cell 'Sheet1'.C1>,
<Cell 'Sheet1'.D1>)
So now cell_range
contains all the cells with data in row 1. You could extend this out to rows 1 through 3 with ws[1:3]
.
ws[1:3]
((<Cell 'Sheet1'.A1>,
<Cell 'Sheet1'.B1>,
<Cell 'Sheet1'.C1>,
<Cell 'Sheet1'.D1>),
(<Cell 'Sheet1'.A2>,
<Cell 'Sheet1'.B2>,
<Cell 'Sheet1'.C2>,
<Cell 'Sheet1'.D2>),
(<Cell 'Sheet1'.A3>,
<Cell 'Sheet1'.B3>,
<Cell 'Sheet1'.C3>,
<Cell 'Sheet1'.D3>))
I can do the same with my column data, except this time I will index by column letter. For example column C will give you all those cells in the third column.
col_c = ws['C']
col_c
(<Cell 'Sheet1'.C1>,
<Cell 'Sheet1'.C2>,
<Cell 'Sheet1'.C3>,
<Cell 'Sheet1'.C4>,
<Cell 'Sheet1'.C5>,
<Cell 'Sheet1'.C6>,
<Cell 'Sheet1'.C7>)
If you’d prefer to index columns by number, not letter, it’s as simple as bringing in the get_column_letter()
function and indexing the worksheet that way.
from openpyxl.utils import get_column_letter
ws[get_column_letter(3)]
(<Cell 'Sheet1'.C1>,
<Cell 'Sheet1'.C2>,
<Cell 'Sheet1'.C3>,
<Cell 'Sheet1'.C4>,
<Cell 'Sheet1'.C5>,
<Cell 'Sheet1'.C6>,
<Cell 'Sheet1'.C7>)
And of course, you can index multiple columns just like rows by, for example, printing everything in columns A through C like in col_range
:
col_range = ws['A':'C']
col_range
((<Cell 'Sheet1'.A1>,
<Cell 'Sheet1'.A2>,
<Cell 'Sheet1'.A3>,
<Cell 'Sheet1'.A4>,
<Cell 'Sheet1'.A5>,
<Cell 'Sheet1'.A6>,
<Cell 'Sheet1'.A7>),
(<Cell 'Sheet1'.B1>,
<Cell 'Sheet1'.B2>,
<Cell 'Sheet1'.B3>,
<Cell 'Sheet1'.B4>,
<Cell 'Sheet1'.B5>,
<Cell 'Sheet1'.B6>,
<Cell 'Sheet1'.B7>),
(<Cell 'Sheet1'.C1>,
<Cell 'Sheet1'.C2>,
<Cell 'Sheet1'.C3>,
<Cell 'Sheet1'.C4>,
<Cell 'Sheet1'.C5>,
<Cell 'Sheet1'.C6>,
<Cell 'Sheet1'.C7>))
OK, now it’s pretty cool that openpyxl can go through your rows and columns, printing which contain data. But you probably want to know what that data actually is.
There are a number of ways to do this, but I am going to use a loop to run through a specified range of cells and print the row contents of each: that’s going to be the iter_rows()
function, and I’ll say that I want to start at row 1, column 1, then go to row 2, column 3 and print the value of each cell.
for row in ws.iter_rows(min_row=1, min_col=1,
max_row=2, max_col=3):
for cell in row:
print(cell.value)
Region
Units
Sales
South
54
332
And there you have it. You now possess the ability to navigate any part of an Excel workbook using these techniques. Do you have any questions regarding how to manipulate individual cells, rows, and columns in Excel with openpyxl? Feel free to ask in the comments. To delve deeper into these topics and for additional practice, consider exploring my course “Using Python with Excel” on LinkedIn Learning.
You can also check out my book Modern Data Analytics in Excel:
Leave a Reply