Python in Excel: How to build the capital asset pricing model (CAPM)

CAPM is one of those finance models that can sound more complicated than it really is.

At the core, we are trying to answer a practical question: how closely does a stock move with the overall market? If the market is up, does the stock usually rise too? If the market falls, does the stock tend to fall more, less, or about the same? The Capital Asset Pricing Model, or CAPM, gives us a framework for studying that relationship, with beta as one of the key outputs. Beta measures how sensitive a stock’s returns are to market returns.

In this post, we will use Python in Excel to estimate CAPM for a fictional company called TechCo. The exercise file includes daily closing prices for TechCo and a market ETF. We will convert those prices into daily log returns, run a regression, visualize the relationship between TechCo and the market, and then use bootstrapping to get a better sense of how stable the beta estimate is.

This is a good example of where Python in Excel can make an ordinary spreadsheet analysis more efficient. You could do much of this with standard Excel tools: formulas for returns, a scatterplot, a trendline, and maybe the Data Analysis ToolPak for the regression. But Python gives us a cleaner and more repeatable workflow, especially once we start using statistical libraries and simulation. To follow along, download the exercise file below.

Download the exercise file here

Before we can estimate CAPM, we need a clean price history for both the asset we care about and the market benchmark we are comparing it against. In this example, TechCo is the individual stock, and the market ETF stands in for the broader market. That pairing is the whole idea behind CAPM: we are not just asking whether TechCo went up or down. We are asking how TechCo behaved relative to the market.

Importing & preparing the data

The first step, as always, is to import the necessary packages and datasets:

from scipy.stats import norm
import statsmodels.api as sm

prices = xl("PriceData[#All]", headers=True)
prices["Date"] = pd.to_datetime(prices["Date"])
prices = prices.set_index("Date").sort_index()
prices.head()

The is arranged as daily closing prices by date, which is a natural starting point because prices are what we usually receive from financial data sources. But CAPM is about returns, not raw prices. A $5 move means something very different for a $20 stock than it does for a $500 stock, so we will soon convert these price series into log returns.

That date ordering matters because returns are calculated from one period to the next. If the rows are out of sequence, the return calculation will be wrong before the model even starts. Once the prices are loaded, dated, indexed, and sorted, we have the structure we need to move from raw price history into the return-based CAPM analysis.

Now that the price data is in order, we can convert it into returns. This is the point where the dataset starts to become useful for CAPM.

rtns = np.log(prices[["Market_ETF_Close","TechCo_Close"]]).diff() * 100
rtns = rtns.dropna()
rtns.columns = ["market_rtn","techco_rtn"]
rtns.head()

A raw price series tells us where TechCo and the market ETF traded each day, but beta is based on how those values change over time. By calculating daily log returns for both series, we can compare their movements on the same scale.

Multiplying by 100 expresses the returns as percentages, while dropna() removes the first blank row created by the return calculation. We then rename the columns to market_rtn and techco_rtn, which sets up the regression in the next step.

Estimating beta with a regression

With the returns calculated, we can now estimate the relationship between TechCo and the market.

Here, market_rtn becomes X, the explanatory variable, and techco_rtn becomes y, the value we are trying to explain. In CAPM terms, we are asking how much TechCo’s return tends to change when the market return changes.

X = rtns[["market_rtn"]]
y = rtns["techco_rtn"]

model = sm.OLS(y, X)
results = model.fit()
results.params

Here, market_rtn becomes the explanatory variable, and techco_rtn becomes the value we are trying to explain. In CAPM terms, we are asking how much TechCo’s return tends to change when the market return changes.

The coefficient for market_rtn is the beta estimate. For example, if the beta is about 1.2, that means a 1 percentage point move in the market is associated with about a 1.2 percentage point move in TechCo, on average, in this sample. So if the market ETF is up 1%, the model would estimate TechCo up about 1.2%. If the market ETF is down 1%, the model would estimate TechCo down about 1.2%.

That does not mean TechCo will move exactly that way every day. It means that, across the sample, TechCo appears to be more sensitive to market movement than the market benchmark itself. In plain English, this would be an above-market beta stock.

Visualizing the CAPM relationship

Now we can visualize the CAPM relationship.

Each dot in the scatterplot represents one trading day. The x-axis shows the market ETF’s return, and the y-axis shows TechCo’s return on that same day. The upward pattern suggests that TechCo generally moves in the same direction as the market: when the market has a positive day, TechCo often does too, and when the market falls, TechCo often falls as well.

rtns["techco_hat"] = results.predict(X)

fig, ax = plt.subplots(figsize=(7,5))
ax.scatter(rtns["market_rtn"], rtns["techco_rtn"], alpha=.5)
ax.plot(rtns["market_rtn"], rtns["techco_hat"], linewidth=2)
ax.set_xlabel("Market return (%)")
ax.set_ylabel("TechCo return (%)")

The fitted line gives us the CAPM relationship visually. Its slope is the beta estimate. Because the line rises fairly sharply, TechCo appears to have a positive relationship with the market and may be more sensitive to market movements than a lower-beta stock would be.

The scatter around the line is just as important as the line itself. The points do not sit perfectly on the line, which tells us the market does not explain everything about TechCo’s returns. Some days TechCo does better than the market relationship would predict, and some days it does worse. That leftover variation reflects company-specific movement, noise, and other factors outside this simple one-factor model.

This is a useful point to keep in mind with CAPM. Beta is not a complete personality profile for a stock. It is a summary of market sensitivity. That summary can be useful, but it should not be confused with a full explanation of everything driving returns.

The regression gives us more than just the fitted line from the chart. It also produces a statistical summary of the beta estimate.

summary_df = pd.DataFrame(
    results.summary().tables[1].data[1:],
    columns=results.summary().tables[1].data[0]
)

This code pulls the coefficient table from the statsmodels output and turns it into a regular DataFrame so it is easier to view in Excel. The most important row is market_rtn, which contains the estimated beta. The coefficient tells us how much TechCo’s return is expected to change, on average, for a 1 percentage point change in the market return.

The other columns help us judge how reliable that estimate appears to be. The standard error gives a sense of uncertainty around the beta. The t-statistic, p-value, and confidence interval help indicate whether the relationship is statistically meaningful in this sample.

This is where we move from “the chart looks upward sloping” to a more formal estimate of TechCo’s market sensitivity. If the beta estimate is around 1.2 and the confidence interval is comfortably above zero, that supports the idea that TechCo has a clear positive relationship with the market. We are not just eyeballing a scatterplot. We are quantifying the relationship and checking whether it looks meaningful in the data.

Testing the beta estimate with bootstrapping

The regression gives us one beta estimate, but it is still an estimate based on this particular sample of daily returns. To get a better sense of how much that estimate might vary, we can use bootstrapping.

def bootstrap_beta(df, x, y, n=2000, seed=42):
    rng = np.random.default_rng(seed)
    betas = []
    for i in range(n):
        sample = df.sample(len(df), replace=True,
                           random_state=rng.integers(1_000_000))
        fit = sm.OLS(sample[y], sample[[x]]).fit()
        betas.append(fit.params[x])
    return pd.Series(betas, name="beta")

This function repeatedly resamples the return data with replacement, runs the CAPM regression on each resampled dataset, and stores the beta from each run. In this example, we repeat that process 2,000 times. Instead of relying only on a single beta value, we now have a distribution of possible beta estimates based on the observed data.

That is helpful because a single regression coefficient can look more precise than it really is. Bootstrapping gives us a more practical way to ask: if the sample had been a little different, would the beta estimate still tell roughly the same story?

boot_betas = bootstrap_beta(rtns, "market_rtn", "techco_rtn")
boot_betas.describe()

The describe() output summarizes the bootstrapped beta estimates. The mean gives the average beta across the resamples, while the standard deviation shows how much the beta estimates vary. The minimum, maximum, and quartiles help show the range of plausible values.

The main thing to look for is whether the bootstrapped estimates cluster tightly around the original regression beta or whether they are spread all over the place. If most of the estimates sit near the original beta, that suggests the result is fairly stable. The exact beta may vary a bit depending on the sample, but the broader story remains the same: TechCo appears to have a positive, above-market beta.

Finally, we can visualize the bootstrapped beta estimates.

The histogram shows the 2,000 beta values generated from the resampling process. Instead of treating beta as a single fixed number, this chart lets us see the range of estimates we might get if the return sample were slightly different. Most of the estimates cluster around the original regression beta, which is a good sign that the result is fairly stable.

beta = results.params["market_rtn"]
beta_se = results.bse["market_rtn"]
x = np.linspace(beta - 3 * beta_se, beta + 3 * beta_se, 200)

fig, ax = plt.subplots(figsize=(7,5))
ax.hist(boot_betas, bins=40, density=True, alpha=.6)
ax.plot(x, norm.pdf(x, beta, beta_se), linewidth=2)
ax.set_xlabel("Estimated beta")

Most of the estimates cluster around the original regression beta, which is a good sign. If the histogram were extremely wide or oddly shaped, we might be more cautious about leaning too hard on the original estimate. Here, the estimates appear to center around the same general value, suggesting that TechCo’s beta is not being driven by one strange sample of dates.

The smooth curve overlays a normal distribution using the original beta estimate and its standard error from the regression model. This gives us a useful comparison between the theoretical uncertainty from the regression output and the simulated uncertainty from bootstrapping. In this case, the two line up reasonably well, which gives us more confidence in the general result.

The takeaway is not that beta is a perfect measure of risk. It is not. CAPM is a simple one-factor model, and real stocks are affected by company news, industry conditions, interest rates, earnings, and many other variables. But beta gives us a useful first pass at market sensitivity.

Conclusion and next steps

In this post, we used Python in Excel to estimate CAPM for TechCo. We started with daily closing prices, converted them into log returns, ran a regression to estimate beta, visualized the market relationship, and used bootstrapping to check how stable that beta estimate appeared to be.

The main takeaway is that TechCo appears to have a positive, above-market beta. In practical terms, it tends to move in the same direction as the market, but with somewhat larger moves on average. That makes beta a useful first pass for understanding market sensitivity and portfolio risk.

That said, CAPM is still a simple one-factor model. It does not capture company news, sector effects, interest rates, earnings surprises, or other drivers of returns. The result also depends on the time period, benchmark, and return calculation used.

Good next steps would be to compare multiple stocks, test different benchmarks, add a risk-free rate and alpha term, or look at rolling beta over time.

The best part is that we can do all of this directly in Excel. With Python in Excel, we can keep the familiar spreadsheet workflow while adding real statistical modeling, simulation, and visualization through Python libraries. That is a pretty powerful combination for analysts who already spend much of their day in workbooks.

Python in Excel: How to build the capital asset pricing model (CAPM)

Importing & preparing the data

Estimating beta with a regression

Visualizing the CAPM relationship

Testing the beta estimate with bootstrapping

Conclusion and next steps

Like this:

Related

Want more Excel + AI insights? Join my newsletter.

Thank you for signing up. I look forward to becoming savvier about data with you.

Importing & preparing the data

Estimating beta with a regression

Visualizing the CAPM relationship

Testing the beta estimate with bootstrapping

Conclusion and next steps

Share this:

Like this:

Related

Want more Excel + AI insights? Join my newsletter.

Thank you for signing up. I look forward to becoming savvier about data with you.