A scatterplot is a chart that displays the relationship between two variables by plotting data points along the X and Y axes, with each point representing a unique pair of values.
By conditionally formatting points based on their category, we can reveal patterns, clusters, or outliers, making it easier to interpret and draw insights from different groups within a single chart.
Excel doesn’t natively support conditional formatting for scatterplot points by category without manual adjustments, but with some creative Excel workarounds—such as creating separate series for each category—or by using Python for enhanced customization, we can produce scatterplots that visually differentiate each category effectively. In this post, we’ll walk through how to achieve this.
To follow along, download the exercise file:
In this dataset, we have information on total bill amounts, corresponding tips, and whether each receipt is from lunch or dinner. Creating a basic scatterplot in Excel to compare total bill versus tip was straightforward—just select the data, go to Insert > Recommended Charts, and choose a scatterplot from the options.
But how can we adjust this plot to color each point by meal type (lunch or dinner), helping us see if time of day impacts these values or the overall relationship? Let’s explore how to do it.
Option 1: Legacy Excel and the NA()
function
To add conditional formatting to a scatterplot in basic Excel, we’ll start by adding two extra columns to our table: Lunch and Dinner. The goal is to use conditional logic to populate these columns with tip values only for the corresponding meal type, mapping lunch tips to the Lunch column and dinner tips to the Dinner column.
Great! Let’s add the conditional formatting. This might look a bit different since we’re using structured table references, but the key concept is straightforward: if the value in the “Time” column is “Dinner,” then the formula will return a value in the “Dinner” column; otherwise, it will use the NA()
function to signal to Excel that this value isn’t available, which prevents it from being plotted.
=IF([@time]=scatter_format[[#Headers],[Dinner]], [@tip], NA())
Now, write a similar function for the “Lunch” column, so it only returns a value where “Time” is “Lunch.” Once you’re done, select the data in the “Total Bill,” “Dinner,” and “Lunch” columns, and go to Insert > Recommended Charts to visualize it.
You should still see a scatterplot recommended, but this time it’s color-coded by time of day. Great work!
It appears that while both total bills and tips tend to be lower at lunch than at dinner, the overall relationship between the numbers follows a similar pattern for both meals.
Option 2: Python in Excel and the seaborn package
Now, let’s see how to accomplish this using Python in Excel. The biggest advantage of this approach is its flexibility. Unlike in Excel, where we need helper columns for conditional formatting, here we can simply add an extra argument to Seaborn’s scatterplot()
function. This is particularly useful if we anticipate adding more categories to our data. For instance, if the restaurant starts serving breakfast, we won’t need to adjust the chart’s formatting manually.
With Seaborn, we can achieve this visualization in a single line of code. Here’s how it looks (the first line is just for data import, no visualization yet):
tips_df = xl("tips[#All]", headers=True)
sns.scatterplot(data=tips_df, hue='time', x='total_bill', y='tip')
We’ll get the following plot returned to us in Excel:
The main advantage here is definitely the ease of extensibility and the fact that we don’t need to create additional helper columns. However, there are a few limitations to keep in mind. For instance, the lack of tooltips and the inability to click on each data point for details can’t be addressed directly in Python. But we can enhance the chart by adding a title, improving the axis labels, and formatting the axes as currency.
This is where Python might start to feel a bit more cumbersome than Excel, as we’ll need to add several lines of code to achieve these adjustments:
You should now have a chart like the following:
To close, here’s a summary table comparing and contrasting these two methods for conditionally formatting a scatterplot in Excel:
Feature | Native Excel | Python in Excel (Seaborn) |
---|---|---|
Conditional Formatting Approach | Requires adding helper columns (e.g., Lunch and Dinner) and using conditional formulas with NA() |
Conditional coloring done directly via hue argument |
Ease of Setup | Moderate – needs manual setup and workaround for each category | High – single line of code for basic scatterplot with color differentiation |
Customizability | Limited – manual adjustments for new categories, series separation | High – easily extendable with minimal code for multiple categories |
Chart Interactivity | Moderate – clickability, some built-in tooltips available | Limited – lacks tooltips and direct interactivity in Excel |
Required Skill Level | Basic to Intermediate Excel knowledge | Intermediate Python knowledge (Seaborn library) |
Maintenance | More time-consuming with new categories | Easier to scale and maintain for additional categories |
Which method do you think best suits your needs, and do you have any questions about implementing either approach? Let me know in the comments.
Jon Peltier
I’d say that customizability of the native Excel chart is high. Once you’ve split the data into two series, splitting it into three or more is straightforward. Also there is a great deal of customizability available through Excel’s user interface.
Customizability of the Seaborn chart may also be high; I don’t have enough experience to know what is available. But rather than a point-and-click interface for applying the desired changes, one has to use additional code. Once you’re moving up the learning curve, it’s probably pretty easy, but for a newcomer I’m sure it’s overwhelming.
George Mount
Thanks for sharing your experience, Jon! I’ve found with Python and Seaborn, it’s often straightforward to cover the basic 80% of needs, but that last stretch of customization can quickly become a real challenge.