Data and tech trends come and go, but linear regression has remained one of the most reliable tools in a data analyst’s toolbox. It helps you identify relationships, test hypotheses, and make predictions with a clear view of how different factors influence outcomes. Whether you’re working in finance, marketing, manufacturing, or real estate, few methods match linear regression for both clarity and ease of use.
For all its benefits, however, Excel’s built-in regression tools were never very user-friendly. You had to load the Analysis ToolPak (if you could find it), step through a dated wizard, and then make sense of an output sheet that offered little guidance. Changing your model or presenting the results to others was awkward.
With Copilot, things are much smoother. You can build more advanced models with Python in Excel, understand how they work, and interpret the results directly within your workbook. It’s easier to see what your data is telling you and focus on meaningful conclusions rather than the mechanics.
We’ll explore this using a fuel economy dataset. Download the exercise file below to follow along.
If you haven’t used the Advanced Analysis with Python feature yet, take a look at this post:
To get started, we’ll run a very simple linear regression: just one dependent and one independent variable. It’s a good habit to make the scope of your model explicit, even when you’re testing something small. In this case, it makes sense to treat mpg as the variable we’re trying to explain and weight as the factor we think influences it.
Here’s the Copilot prompt I used:
“Run a linear regression in Python where mpg is the dependent variable and weight is the independent variable. Include a summary of the results.”
Copilot automatically fitted the model using the dataset and produced the following regression summary:

The interpretation is straightforward: as a car’s weight increases, its fuel efficiency tends to decline. The negative coefficient for weight means heavier vehicles use more fuel. The very small p-value confirms the relationship is statistically significant.
This is the classic starting point for regression analysis: one variable at a time, clear direction, and easily interpretable results. From here, we can begin layering in more predictors to see how horsepower, displacement, or cylinder count refine the story.
Adding more predictors and checking model diagnostics
Now that we’ve built our first model, it’s natural to wonder what other factors might influence fuel economy. Weight appears significant, but horsepower and acceleration could also play a part.
As we start refining our models, we need a way to tell if each new version is actually improving. Two standard metrics help with this: R-squared, which shows how much of the variation in mpg is explained by the predictors, and RMSE, which measures the average prediction error in miles per gallon.
Here’s the Copilot prompt:

The R-squared value of about 0.71 means roughly 71 percent of the variation in fuel efficiency is explained by these three variables. The RMSE of 4.22 means the model’s predictions are off by about four miles per gallon on average. It’s a noticeable improvement over our single-variable model and a good sign that we’re moving in the right direction.
Visualizing predicted versus actual values
Once you’ve built a model and reviewed the metrics, it’s important to see how well the predictions line up with reality. A quick visual check often reveals patterns or problems that numbers alone can miss.
Here’s the Copilot prompt:
“Plot the predicted vs actual mpg values from the model to check how well the regression fits. Include a line showing perfect predictions for reference.”

Copilot produced a scatter plot comparing the model’s predicted mpg values with the actual ones. Each point represents a car in the dataset. The red dashed line shows what perfect predictions would look like, where predicted and actual values are exactly equal.
This visualization gives a quick gut check on model performance. The tighter the points hug that line, the stronger the predictive power. And while the model isn’t perfect, it’s doing a solid job of explaining how weight, horsepower, and acceleration interact to influence fuel efficiency.
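If you want to reproduce this kind of chart yourself, a minimal sketch looks like this (the arrays here are synthetic stand-ins; in your workbook, actual and predicted would come from the fitted model):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripting
import matplotlib.pyplot as plt
import numpy as np

# Stand-in values; use your observed mpg and model predictions instead
rng = np.random.default_rng(1)
actual = rng.uniform(10, 40, 60)
predicted = actual + rng.normal(0, 3, 60)

fig, ax = plt.subplots()
ax.scatter(actual, predicted, alpha=0.6)
lims = [actual.min(), actual.max()]
ax.plot(lims, lims, "r--", label="Perfect prediction")  # 45-degree reference line
ax.set_xlabel("Actual mpg")
ax.set_ylabel("Predicted mpg")
ax.legend()
fig.savefig("pred_vs_actual.png")
```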
Interpreting model coefficients
You might be wondering how each variable contributes. That’s where interpretation comes in, and Copilot can help you reason through it, not just calculate.
Here’s the prompt:
“Interpret the coefficients of the model using statsmodels. Which features have the biggest impact on mpg and in what direction? Explain in plain language.”

Copilot returned a summary showing that both weight and horsepower have negative coefficients. This means that as either of these increases, fuel efficiency tends to decrease. Weight has the strongest influence. Each additional unit of weight leads to the largest drop in miles per gallon. Horsepower also lowers mpg, though not quite as sharply.
Acceleration, on the other hand, shows a very small and statistically insignificant coefficient, suggesting it doesn’t meaningfully affect fuel economy in this dataset. In other words, how quickly a car accelerates doesn’t matter much for mpg once weight and horsepower are already accounted for.
Together, these results tell a clear story: heavier and more powerful cars use more fuel, while quick acceleration on its own doesn’t add much explanatory value.
Checking model assumptions
Once you’ve built and interpreted your model, it’s a good idea to run a few quick diagnostics to make sure the basic assumptions of linear regression hold. One of the most important checks is to look at the residuals, or the differences between the predicted and actual values.
Here’s the Copilot prompt:
“Plot the residuals of the model. Are they randomly distributed? Is there evidence of non-linearity or heteroskedasticity?”

Copilot produced a residuals vs. predicted values plot. Ideally, the points should be scattered randomly around the zero line. That pattern suggests the model is capturing the data well and that errors are evenly spread across all prediction levels.
In this case, the residuals look mostly random, but there’s a slight funnel shape as mpg increases. That widening spread hints that the model may fit smaller cars a bit more consistently than larger ones, a mild sign of heteroskedasticity. It’s not severe, but it’s worth noting.
Residual plots are one of several ways to check whether your model is behaving properly. You can also look at whether the relationships between predictors and mpg appear roughly linear, whether residuals seem normally distributed, or whether there’s evidence that one error predicts the next. These checks help confirm that the model’s estimates are trustworthy.
Copilot can guide you through these steps, not just by generating plots or statistics, but by explaining what they mean and why they matter. In that sense, it acts less like a calculator and more like a coach, helping you understand the reasoning behind good modeling practice.
Making predictions
Finally, let’s put the model to work in a real-world example. In business settings, the real value of regression often isn’t just understanding relationships. It’s using those relationships to make predictions. Decision-makers care less about the exact slope of a line and more about what it means for future outcomes: how a change in product weight, horsepower, or price might affect performance or profit. A well-built model lets you turn analysis into foresight.
Here’s the Copilot prompt:
“Given a car with 3000 lbs weight, 130 horsepower, and 15 seconds of acceleration, use the regression model to predict mpg.”

Copilot returned a predicted fuel efficiency of about 21.6 miles per gallon.
That means for a car with those specifications, the model expects it to travel roughly 21.6 miles on a gallon of fuel. This is where regression analysis becomes more than just theory. You can use it to estimate outcomes for new observations, guide design tradeoffs, or compare how different features affect performance.
Conclusion
Linear regression remains one of the most practical and interpretable tools in data analysis, and Copilot makes it easier than ever to use inside Excel. Even a simple model can uncover useful insights when built thoughtfully and checked carefully. Metrics like R-squared and RMSE help quantify performance, but visuals and diagnostics often reveal the places where your model fits well and where it struggles.
And in the business world, the real power of regression lies in prediction. The ability to estimate how changes in one factor might influence another turns analysis into something decision-ready.
That said, linear regression isn’t magic. It assumes straight-line relationships and evenly distributed errors, which don’t always hold up with messy real-world data. Outliers, overlapping variables, or curved relationships can throw things off, and that’s where judgment comes in. Copilot can automate the steps, but it still takes a human eye to decide what makes sense.
From here, you might explore adding interaction terms, adjusting variables to handle nonlinearity, or comparing results to more flexible models like decision trees or random forests. You could even use Copilot to test cross-validation or experiment with feature selection to see how stable your model really is.
