Python’s integration into Excel has opened a new world of possibilities for data analysis and visualization, and Seaborn is one of the most exciting tools to leverage within this environment. Seaborn builds on Matplotlib, simplifying complex visualizations while maintaining flexibility and producing professional, aesthetically pleasing charts.
This post introduces Seaborn and its core functionality, with practical examples to help Excel users get started with this powerful library. Download the exercise file below to follow along:
The building blocks of Seaborn
Seaborn stands out for its intuitive design, built to simplify the process of creating plots. Here are its key features:
- Dataset-oriented approach: Seaborn integrates seamlessly with Pandas DataFrames, allowing you to directly reference column names in your visualizations. This makes it easier to create charts without manually restructuring your data, streamlining the workflow for Python in Excel users.
- Plot types: Seaborn offers two categories of plotting functions:
  - Axis-level functions: Functions like sns.scatterplot() and sns.barplot() are used for single plots, such as scatterplots and bar charts. These functions offer fine-grained control for customization and are a great starting point for simpler, more direct visualizations.
  - Figure-level functions: Functions like sns.relplot() and sns.catplot() operate at the figure level, which means they can generate complex visualizations composed of multiple subplots. These functions are particularly useful for exploratory data analysis, as they allow you to facet your data across multiple categories or dimensions automatically. For example, you can use sns.relplot() to create scatterplots that are grouped by a categorical variable into a grid of subplots, providing deeper insights into your dataset. While powerful, figure-level functions require a slightly more involved setup and are beyond the scope of this post.
- Themes and styles: With several built-in themes and styles, Seaborn ensures that your charts are professional and readable without requiring extensive manual formatting.
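To make the axis-level versus figure-level distinction concrete, here is a minimal sketch. The DataFrame and its values are hypothetical stand-ins, not the exercise workbook data; only the column names follow the ones used later in this post.

```python
import pandas as pd
import seaborn as sns

# Hypothetical sample data (not the exercise workbook data)
df = pd.DataFrame({
    "Store": ["North", "North", "South", "South", "East", "East"],
    "Advertising Spend": [1200, 1800, 900, 1500, 2000, 2600],
    "Monthly Sales": [15000, 21000, 11000, 18000, 24000, 30500],
})

# Axis-level: draws onto a single Matplotlib Axes and returns it
ax = sns.scatterplot(data=df, x="Advertising Spend", y="Monthly Sales")

# Figure-level: builds its own figure and returns a FacetGrid,
# here with one subplot (column) per store
grid = sns.relplot(data=df, x="Advertising Spend", y="Monthly Sales", col="Store")
```

The practical difference is the return type: an Axes object for the axis-level call, and a FacetGrid wrapping several Axes for the figure-level call.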
Seaborn relies heavily on specific functions to create different plot types, each with arguments to control data sources, variables, colors, and other aesthetics. As an Excel user, you’re already well-versed in functions—they’re the bread and butter of the Excel programming language. This familiarity puts you in a great position to dive into Seaborn.
Data import and creating your first Seaborn plot
In the following examples, we’ll work with two datasets from the exercise workbook, loading them into Python in Excel as ad_spend_df and ratings_df, respectively:
In the following code, we’ll create a scatter plot to visualize the relationship between advertising spend and monthly sales across different stores.
sns.set_theme(style='white') sets the plot’s overall style to a clean, white background, giving the visualization a minimalist and professional look. This theme will be applied to all subsequent Seaborn plots unless explicitly changed.
The scatterplot() function generates the scatter plot. It uses the ad_spend_df dataset, mapping the ‘Advertising Spend’ column to the x-axis and the ‘Monthly Sales’ column to the y-axis. The hue='Store' argument applies color coding to the points based on the ‘Store’ column, making it easy to distinguish data points by store.
The remaining functions are from matplotlib.pyplot, which Seaborn relies on for additional customizations. We’ll use these frequently in other examples to enhance visualizations. Here, plt.title sets the plot’s title, “Advertising Spend vs. Monthly Sales,” providing context for the chart. Similarly, plt.xlabel and plt.ylabel define the axis labels as “Advertising Spend ($)” and “Monthly Sales ($),” clarifying the data being displayed and adding essential details to the visualization.
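Put together, the steps described above might look like the following sketch. The rows in ad_spend_df are hypothetical placeholders so the example runs on its own; in Python in Excel the DataFrame would come from the exercise workbook instead.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical stand-in for the workbook's ad spend table
ad_spend_df = pd.DataFrame({
    "Store": ["North", "North", "South", "South", "East", "East"],
    "Advertising Spend": [1200, 1800, 900, 1500, 2000, 2600],
    "Monthly Sales": [15000, 21000, 11000, 18000, 24000, 30500],
})

# Clean white background for this and later plots
sns.set_theme(style="white")

# Map columns to the axes and color the points by store
ax = sns.scatterplot(
    data=ad_spend_df,
    x="Advertising Spend",
    y="Monthly Sales",
    hue="Store",
)

# Matplotlib calls for the title and axis labels
plt.title("Advertising Spend vs. Monthly Sales")
plt.xlabel("Advertising Spend ($)")
plt.ylabel("Monthly Sales ($)")
```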
The resulting scatterplot will look something like this. To refine it further, we might consider enhancing the axis formatting, though this requires a bit more effort compared to Excel’s point-and-click interface. However, we were able to effortlessly add color to each point by category—a task that would have been more challenging to achieve in Excel.
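One possible way to handle the axis formatting mentioned above is Matplotlib's StrMethodFormatter, which formats tick values with a format string. This is a sketch under assumptions rather than the approach used in the exercise file: the data is hypothetical, and the currency format with no decimals is just one reasonable choice.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.ticker import StrMethodFormatter

# Hypothetical sample data (not the exercise workbook data)
ad_spend_df = pd.DataFrame({
    "Store": ["North", "South", "East"],
    "Advertising Spend": [1200, 1500, 2600],
    "Monthly Sales": [15000, 18000, 30500],
})

ax = sns.scatterplot(
    data=ad_spend_df, x="Advertising Spend", y="Monthly Sales", hue="Store"
)

# Show tick values as whole dollars with thousands separators;
# each axis gets its own formatter instance
ax.xaxis.set_major_formatter(StrMethodFormatter("${x:,.0f}"))
ax.yaxis.set_major_formatter(StrMethodFormatter("${x:,.0f}"))
```

With this in place, a tick at 21000 would be labeled $21,000 instead of the raw number.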
Exploring Seaborn’s chart types
Next, let’s explore a range of other basic plots that can be created using a similar approach with Seaborn. Some of these plots are simpler to replicate in Excel than others:
Line plot
Let’s start with a basic line chart using the lineplot() function. Here, we specify the x-axis and y-axis to map the data appropriately. The marker='o' argument adds circular markers to each data point along the line, making individual values stand out and emphasizing key points in the trend.
The ci=None argument removes the default confidence interval shading typically added by Seaborn, keeping the chart clean and focused. (In seaborn 0.12 and later, ci is deprecated in favor of errorbar=None, which has the same effect.) Finally, we’ll include chart and axis titles. Since we’ll consistently add these in future examples, we won’t call attention to them again.
The resulting chart should look like this:
KDE plot
Now let’s create a KDE (kernel density estimate) plot, which offers a smoothed alternative to a histogram. Unlike histograms, which display data frequency within bins, KDE plots estimate the probability density function of a variable using kernels to create a smooth curve. The resulting continuous line can make the overall shape of the distribution easier to read, although the curve depends on how much smoothing is applied.
The kdeplot() function is used here with the ad_spend_df dataset, mapping the ‘Advertising Spend’ column to the x-axis. The fill=True argument fills the area under the curve, enhancing the visual appeal and clarity of the distribution. The color='purple' argument sets the curve’s color to purple, adding a touch of differentiation and aesthetics.
Violin plot
Now, let’s explore creating a violin plot. A violin plot combines the features of a boxplot and a KDE plot. It displays both the summary statistics (like a boxplot) and the data’s full distribution (like a KDE plot). The “violin” shape represents the density of the data at different values, providing deeper insights into the distribution compared to a boxplot alone.
The violinplot() function is used with the ratings_df dataset. The x='Store' argument places the store names on the x-axis, while y='Customer Rating' maps the customer ratings to the y-axis, visualizing their distribution for each store.
The inner='box' argument adds a boxplot inside each violin, summarizing the central tendency and variability of the data, such as the median and interquartile range. The palette='pastel' argument applies a soft pastel color scheme, enhancing the visual appeal of the plot.
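Here is a minimal sketch of that violin plot. The ratings are hypothetical sample values, not the exercise workbook data, and the title is illustrative.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical stand-in for the workbook's ratings table
ratings_df = pd.DataFrame({
    "Store": ["North"] * 5 + ["South"] * 5 + ["East"] * 5,
    "Customer Rating": [
        4.1, 4.5, 3.8, 4.7, 4.2,
        3.2, 3.9, 3.5, 4.0, 2.9,
        4.8, 4.4, 4.9, 4.6, 4.3,
    ],
})

# One violin per store, with a small boxplot drawn inside each
ax = sns.violinplot(
    data=ratings_df,
    x="Store",
    y="Customer Rating",
    inner="box",
    palette="pastel",
)

plt.title("Customer Ratings by Store")
```

On seaborn 0.13 and later, passing palette without hue triggers a deprecation warning that suggests also setting hue='Store' and legend=False; the plot still renders.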
Bar plot
Finally, let’s dive into creating a classic bar plot. To start, we calculate the average monthly sales for each store, sorting the results in descending order with .sort_values(ascending=False). The .index attribute retrieves the sorted store names, which are used to customize the x-axis order, ensuring the data is displayed from highest to lowest.
The sns.set_theme(style="dark") line applies a dark background theme to the plot. This theme will now act as the default for subsequent plots.
To generate the bar plot, we use the barplot() function, mapping the ‘Store’ column to the x-axis and ‘Monthly Sales’ to the y-axis. The estimator='mean' argument ensures the bars represent the mean of ‘Monthly Sales’ for each store.
By setting ci=None, we remove confidence intervals to simplify the chart and emphasize the averages (seaborn 0.12 and later use errorbar=None for this). The palette='Blues_r' argument applies a reversed blue gradient; because the bars are sorted from highest to lowest, darker shades end up on the higher values. Finally, order=sorted_stores arranges the bars in descending order of average sales, as defined earlier.
Conclusion
Seaborn offers an incredible variety of plots, and by now, you’re likely noticing a consistent pattern of using Seaborn functions to map variables and aesthetics, followed by customizing the results with both high-level Seaborn options and Matplotlib tools. This flexibility allows you to tailor your visualizations to your exact needs.
For Excel users, Seaborn is a game-changer, enabling the creation of advanced, visually compelling charts that go beyond the basics. By integrating Python into your workflow, you gain access to powerful analytical tools and precise customization, making it easier to build sophisticated, statistically informed plots with a level of detail and polish that can be difficult to beat in Excel alone.
Have questions about using Seaborn or Python for data visualization in Excel? Drop them in the comments—I’d love to help! Don’t forget to check out the Seaborn documentation for inspiration and examples. Be sure to visit the gallery for some amazing visualization ideas.