One of the most exciting developments in Excel recently has been the integration of Python. This long-awaited feature has highlighted Python’s significance within the software stack for typical Excel users.
However, the version of Python in Excel represents only a limited, and arguably unrepresentative, portion of the language’s full capabilities. In this blog post, we’ll explore what Python can and cannot do within the context of Excel.
What Python in Excel is good for
For a comprehensive understanding of Python’s capabilities in Excel, navigate to the Formulas tab on the home ribbon. From there, select Insert Python and then choose Try Python Samples:
In this section, you’ll encounter a range of tasks centered on data analysis, visualization, and statistics. Examples include plotting a pairplot and constructing a correlation matrix.
These tasks are fundamental to any analyst’s toolkit. Python offers many advantages for tasks that are challenging in Excel but are much simpler in Python, as demonstrated in these examples for data analysis and visualization:
Again, because Python was not fundamentally built for data analysis tasks, a lot of this functionality comes from a suite of packages that comes with Excel’s environment for Python. Here is the initialization script for Python in Excel:
While it’s commendable that Python integration in Excel includes some of the most significant packages for data analysis, visualization, and statistical analysis, it notably overlooks a crucial aspect: the automation of Excel tasks using Python. This omission represents a significant gap in the toolset provided. In the following discussion, we’ll delve deeper into this oversight to better understand what’s missing and the potential impact of including Python-based automation tools in Excel.
So, what does this leave out?
You can now do a lot with Python in Excel, but here are some additonal tasks that can otherwise be done with Python for Excel:
- Advanced formatting options for cells, such as applying specific styles or conditional formatting
- The ability to manipulate worksheets, such as renaming, adding, or deleting sheets
- Fine-grained control over workbook properties, such as password protection or worksheet visibility
- Working with named ranges and tables
- Adding images, shapes, and charts to Excel files
- Handling print settings, page layout, and page breaks
- Working with formulas and formula-related functionality in Excel
Indeed, Python’s current presence in Excel represents only a fraction of its extensive capabilities. To fully harness Python’s power, exploring external packages beyond those available in Excel’s native environment is often necessary. These external packages can unlock a wider range of functionalities, further enhancing Excel’s data processing, automation, and integration capabilities.
Python and Excel automation packages
Specifically, Python’s integration in Excel lacks access to packages that specialize in automating and developing Excel workbooks using Python:
openpyxl
: This library is versatile for both reading and writing Excel files in the .xlsx format. While it may not excel in performance for large datasets likexlsxwriter
, it offers a balanced blend of functionality, including basic cell formatting and data manipulation. Unlikexlsxwriter
,openpyxl
supports reading Excel files, making it a more comprehensive solution for handling Excel data in Python.xlwings
: This package is used for automating Excel tasks, including interacting with Excel workbooks and running VBA macros. It uniquely offers access to Excel’s COM API on Windows, facilitating two-way communication between Excel and Python, a capability not provided byopenpyxl
.xlwings
requires a complex development environment and its full feature set is mainly available on Windows.xlsxwriter
: This library is designed for writing data, formatting, and charts to Excel files in the .xlsx format. It’s optimized for performance, particularly with large datasets. It offers advanced cell formatting options, exceeding those available inopenpyxl
. However,xlsxwriter
is limited to writing data only, whereasopenpyxl
supports both reading and writing Excel files.pyxll
: A premium library,pyxll
allows users to create Excel add-ins using Python. It focuses on enabling developers to build standalone applications for sectors like data science and financial trading, rather than automating Excel workbooks.
Learning about all these packages might seem daunting, but there’s no need to master them all at once. I recommend exploring the basics of these packages in the order I’ve listed, which coincidentally aligns with their ranking on Google Trends. (If the page doesn’t load right away, try refreshing your browser.)
Indeed, these packages will unlock much of Python’s functionality that has contributed to its popularity, which, unfortunately, isn’t available with the current Python integration in Excel.
Getting the best of both worlds
For these reasons, if you’re keen on maximizing Python’s potential for Excel, I suggest starting outside of Python in Excel. Consider it a specific use case. Begin by getting comfortable with running Python independently, perhaps through Jupyter notebooks in an Anaconda environment.
Felix Zumstein’s Python for Excel is an excellent, comprehensive resource that adopts this approach, guiding you on how to effectively combine the strengths of both tools.
To truly grasp how Python and Excel can work together, you need to look beyond the current Python integration in Excel. It’s unfortunate that these automation tools aren’t included yet, but given Python’s capabilities, I wouldn’t be surprised to see more native support in the future.
Do you have any questions about using these tools? Feel free to ask in the comments.
Leave a Reply