The recent incorporation of Python into Excel marks a substantial leap in data analytics, combining Python’s powerful features with the accessible Excel workbook interface.
Employing Python directly within Excel’s formula bar is noteworthy, yet add-ins further enrich this capability. This enhancement is particularly important considering that Python programming typically involves extensive scripts or segments in a notebook, featuring elements like syntax highlighting and line numbering, which Excel’s standard Python cell lacks.
A significant development from Microsoft is the complimentary Python Editor from Excel Labs. Further details about this tool and its usage are discussed in this post:
Although the Python Editor is equipped with powerful features, it assumes a high level of Python proficiency among Excel users, expecting them to code independently. Nevertheless, this requirement is becoming less critical with the emergence of generative AI, which simplifies and aids in coding.
The Anaconda Toolbox enhances Python coding in Excel, its capabilities further bolstered by generative AI. This integration not only simplifies the coding process but also enhances its efficiency.
Getting started with Anaconda Toolbox and availability
The Anaconda Toolbox, currently in beta and offered for free by Anaconda, is a significant development from the company renowned for its popular Python environment geared towards data analytics. Anaconda is also in charge of managing and executing the Python environment within Excel.
To access Anaconda Toolbox, users need to be part of Anaconda’s Early Adopter Program and have Microsoft’s Python in Excel preview installed. For a comprehensive understanding of the setup and the collaborative relationship between Anaconda and Microsoft, refer to the Anaconda Toolbox documentation.
First steps with Anaconda Toolbox
Once you have gained access to the Anaconda Toolbox, you’ll find the Anaconda Toolbox button located at the far right of the Formulas tab in Excel’s home ribbon. To start, click this button. You will then be prompted to sign into Anaconda Cloud through the Toolbox menu, situated on the right side of your worksheet:
Upon successful login, you’ll be presented with the Anaconda Toolbox menu, which is organized into three main categories:
- Anaconda Cloud Data: This feature allows you to save and retrieve data from Anaconda Cloud, a cloud-based platform provided by Anaconda for storing and executing data science projects.
- Visualize with Python: In this section, a user-friendly, point-and-click menu enables you to automatically generate Python code for creating specific types of plots.
- Anaconda Assistant: This innovative tool employs generative AI to assist in crafting the appropriate Python code based on the task you specify.
Let’s walk through each section.
Loading a sample dataset from Anaconda Cloud
A prevalent approach in learning Python involves using conventional sample datasets. Conveniently, many well-known datasets are accessible directly from Anaconda Cloud, which can be extremely beneficial. You also have the option to upload your own datasets to Anaconda Cloud, but for now, let’s concentrate on accessing the existing ones.
To accomplish this, navigate to Import Data from Cloud under the Anaconda Cloud Data section, and then select Import Data > Public Catalog:
Next, proceed to the last dataset, ‘Pittsburgh Bike Share’, and choose ‘rides’. Afterward, you have the option to send this data to ‘Sheet1’. It’s worth noting that you can also write a custom SQL statement at this stage to selectively import data into Excel or to make other modifications as needed:
Great! Now that the data is loaded, we can explore the interesting analyses and operations we can perform with it. To gain more workspace, go ahead and undock or detach this menu from its current position.
Visualizing data
Now, let’s move to the next feature in the Anaconda Toolbox, Visualize with Python. Here, you have the chance to create Python plots using a user-friendly, point-and-click interface. If you’re interested in viewing the total number of rides by month, for example, a line chart would be an appropriate choice. To begin, select the line chart as your desired chart type.
In the Design section, you’ll find various customization options for your chart. While you have the freedom to experiment with these options, it’s highly recommended to add a title to your chart for clarity. For instance, in your case, you could title the chart Total Rides by Month to accurately represent the data being visualized.
When you navigate to the Preview section, you’ll be able to view both the generated Python code responsible for creating the plot and the resulting plot itself. It’s important to remember that this process encompasses all the necessary steps for utilizing Python within Excel. This includes the conversion of the Excel dataset into a Pandas DataFrame using the XL()
function.
While the code displayed in the ‘Preview’ section cannot be edited directly there, you have the option to copy it and paste it into a Python cell within Excel. This allows you to make any desired modifications. Additionally, you can also adjust the settings using the menu options available to fine-tune your plot.
If you’re satisfied with the plot, the final step is to insert it into your workbook. To do this, navigate towards the bottom of the Setup tab in the menu. There, you’ll find the Click to select option under the Output group. By clicking this, you can manually choose where to place the plot in your Excel sheet. For instance, you could choose to insert it at cell I1:
The resulting plot wil be barely visible, placed inside just one cell. To make this bigger, make sure it spills across multiple cells by right-clicking on the cell and selecting Picture in Cell > Place over Cells:
To enhance the visibility of your plot, which initially appears quite small within a single cell, you can expand it to span multiple cells. Right-click on the cell containing the plot and select ‘Picture in Cell > Place over Cells’. This action will allow the plot to cover a larger area, making it more visible and detailed.
If you anticipate needing to edit the source code or the data to modify the graph, consider creating a linked reference to the cell containing the Python code. This approach maintains a connection between the plot and its source, facilitating easier updates.
For more on these alternatives to resizing a Python plot in Excel, check out this post:
As we proceed, go ahead and delete this initial plot. We’ll next explore creating custom visuals and more with the assistance of the Anaconda Assistant chatbot.
Anaconda Assistant
This plotting wizard is undoubtedly an impressive tool, yet it has its limitations. For instance, it doesn’t include every type of plot available in Python in its menu options. More importantly, it lacks the capability to assist with data manipulation, a crucial step in numerous data analysis and visualization tasks.
Consider a scenario where we wish to visualize the number of riders in a unique manner. Say, for instance, I aim to view the average number of daily riders per week in the dataset. Achieving this would necessitate altering the Date
column in ways this menu simply can’t handle. We were fortunate to have a Month
column available to easily do this, but this won’t always be the case.
Here, the Anaconda Assistant becomes invaluable. This tool is a comprehensive generative AI, similar to ChatGPT. We can provide it a specific prompt to get help with, such as “Create a line plot showing average n_rides by week of date:”
.It will generate instructions and output code in the style of ChatGPT. Then, you can insert it into your worksheet.
Go ahead and run the code, and you’re likely to encounter an error. This happens because the DataFrame _df
was never defined, which is a surprising mismatch. There are several methods to rectify this, but perhaps the simplest solution is to add the following piece of code to the Python snippet being inserted:
Now, for whatever reason, _df
is the default DataFrame name in Anaconda Assistant, and you can pretty much count on it using this conventino, but since this DataFrame has been explicitly defined now in your workbook I am going to keep calling out to _df
in the code.
For example, maybe I just want a plain tally of riders on weekends versus during the week, here’s the code that I would use:
Note that Anaconda Assistant performs certain unnecessary actions, such as importing pandas separately even though we know it’s part of the Python environment. Additionally, it creates its own weekend variable, despite one already existing in the DataFrame.
This is typical of generative AI behavior. With time, we hope the algorithm will improve, gaining a better understanding of Python’s application in Excel and more accurately identifying column names. However, for now, it’s essential to validate the output and make any necessary adjustments.
The future of Excel is not evenly distributed
What I appreciate about the Anaconda Toolbox is how it effortlessly merges two of Excel’s most exciting features. It allows you to use Python for creating reproducible scripts to perform tasks that are challenging or impossible in Excel alone. Moreover, it simplifies certain processes that Excel doesn’t readily facilitate, like building plots directly from a menu.
Additionally, it integrates the capabilities of a generative AI chatbot within Excel. While not flawless, it acts as an impressive co-pilot for learning and applying Excel in a secure environment.
This truly feels like the future of Excel, although it’s not yet widely adopted. But you are ahead of the curve by using it!
Feel free to compare your progress with mine and let me know if you have any questions about the Anaconda Toolbox or Python’s application in Excel more broadly in the comments.
Leave a Reply