Download this post as a PDF:
One of the greatest wins for Excel users learning Python is access to a comprehensive suite of data visualization tools. However, compared to Excel, the default plots produced with Python in Excel can seem somewhat static. For example, there’s no straightforward way to add tooltips or interactive elements that allow users to click on parts of the chart to understand the data’s origins.
Nonetheless, it is possible to make Python plots created in Excel feel more interactive and user-friendly. This post expands on a previous article where I explored using moving averages with Python in Excel. We’ll extend this concept further to develop an interactive plot. You can follow along using the exercise file provided below:
The first step I will take is to incorporate several user inputs that will guide the remainder of the model. Specifically, I will allow the user to determine the sampling interval for the data and the number of periods to use for the moving average:
In an Excel model that requires user-defined inputs, it is a good ida to incorporate data validation controls to enhance usability and robustness. Specifically, I will includ a dropdown menu to designate permissible interval periods within the model (“Daily,” “Monthly,” etc.). Accurate spelling of these intervals is essential, as it will dictate the subsequent operations of the model.
Therefore, I will create a lookup table that serves two purposes: it will identify the corresponding pandas date offset values required for Python integration, and establish a foundation of accuracy for the data validation. This approach ensures both precision and ease of use, streamlining the model’s functionality.
In a new worksheet, go ahead and list out the three options we are going to let the user choose between: Daily, Weekly and Monthly. Then in the next column, add the proper pandas date offset frequency string. In pandas, a date offset frequency string specifies the increment at which to apply a frequency conversion or generate a date range. These strings represent base time frequencies, such as “D” for day, “M” for month, and “W” for minute, with additional modifiers (like “S” for start or “E” for end) to indicate specific points within the given period.
Your table should look like this. Name it period_offsets
:
Now that we’ve established our “source of truth,” we’d like to use the period
column as the basis for the data validation dropdown. This involves the slightly roundabout step of creating a named range from the Period table and using it as the foundation for the data validation dropdown. For more information, check out this post from MyOnlineTrainingHub.
After you have created the named range periods
, select cell E2 in your main worksheet. Then, go to the ribbon, click on “Data,” and choose “Data Validation” to apply a dropdown list based on this range to your cell.
Excellent. We only need to define one more input, which will be offstage as it derives from the user’s selection. I plan to use the XLOOKUP()
function to identify the selected date offset frequency string. This will then be passed into Python to dictate the resampling of the time series data.
Great job! We now have all the data and user-defined inputs necessary to create this chart. Let’s proceed by adding a Python cell that will load both the main dataset and the user-defined inputs as named objects:
Next, we will resample the dataset based on whichever interval the user requested.
First, resampling in pandas requires the date column to be set as the index of the DataFrame, so I’ll do that using set_index()
.
Then, I will use the resample()
method to calculate total sales over the specified interval. Remember, this is the value found in the period_offset
variable, so we will pass that into resample()
to obtain the desired resampling. I will then chain the sum()
method to get the total sales for the specified period. If you want to further validate and experiment with this code, you can preview or display the resulting Excel values in the downloaded exercise file.
The next step is to construct the plot, which only requires a few lines of code. I will plot the actual, “unsmoothed” resampled data, then add the smoothed sales data.
The trickier, more dynamic aspect here is the title of the plot. I would like it to dynamically be relabeled based on the period that is being resampled and the number of periods being used to smooth. I will use f-strings from Python to make this happen.
Go ahead and display and resize the Python plot in Excel, and you should see something like the following. The user now controls both the sampling interval and the number of smoothing periods, and the plot updates accordingly.
While the resulting plot still doesn’t have tooltips or the ability to click into the data to understand its series, the ability to work smoothly with time series data and easily customize the plot makes this a very worthwhile skill to have in your toolkit.
What questions do you have about Python in Excel in general, or about plotting and interactive visualizations with Python in Excel particularly? Let me know in the comments.
Leave a Reply