Calculating the Intercept and Regression Coefficient using the Matrix Inverse in Python

2 minute read

In linear regression, we map the predictor variable(s) x to the dependent variable y with the linear function y = f(x) = a + bx.

You can think about this as solving a system of linear equations. For example, if we had one independent and one dependent variable with n observations, we could write out the system of equations like this:

\begin{equation}\begin{aligned}y_{1} &= a + bx_{1}\\ y_{2} &= a + bx_{2}\\ &\;\;\vdots\\ y_{n} &= a + bx_{n}\end{aligned}\end{equation}

We can change the representation of the linear system using matrix notation.

Matrix equation: Ax = b, where A is the matrix containing the predictors (with a leading column of ones for the intercept), x is the vector of coefficients (a and b), and b is the vector of target/dependent values.

\begin{equation}\begin{bmatrix}1 & x_{1}\\ 1 & x_{2}\\ \vdots & \vdots\\ 1 & x_{n}\end{bmatrix}\begin{bmatrix}a\\ b\end{bmatrix} = \begin{bmatrix}y_{1}\\ y_{2}\\ \vdots\\ y_{n}\end{bmatrix}\end{equation}

We can solve for x (a and b) by left-multiplying both sides of Ax = b by the transpose of A (this gives the normal equations) and then left-multiplying by the inverse:

\begin{equation}A^{T}Ax = A^{T}b \implies (A^{T}A)^{-1}(A^{T}A)x = (A^{T}A)^{-1}A^{T}b \implies x = (A^{T}A)^{-1}A^{T}b\end{equation}

Let’s import some data and use Python to test some of what we discussed

# Import libraries
import pandas as pd 
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
# Import synthetic dataset 
data = pd.read_csv("C:\\Users\\steve\\OneDrive\\Documents\\website projects\\statistics\\synthetic data\\simple_regression.csv")
#View dataframe
data.head()
   Unnamed: 0         x           y
0           1  4.373518   89.996377
1           2  4.723557   96.019228
2           3  5.671688  116.386921
3           4  4.860543  100.563548
4           5  4.797259  100.377688
# Visualize data using matplotlib
plt.scatter(data.x, data.y, edgecolor = "black")
plt.xlabel("x - Independent Variable")
plt.ylabel("y - Dependent Variable")

[Scatter plot of x (independent variable) vs. y (dependent variable)]

Based on the above figure, we can see that there is a clear linear relationship between x and y (what a strange coincidence).

Note: The numpy library has a host of linear algebra functions (numpy.linalg).

Creating the A and B matrices

#Create column of ones (This is for the intercept)
col1 = np.ones((data.shape[0], 1)) 
#Create a numpy array from the x variable in the above pandas data frame
col2 = np.array(data.x)
#Create A
A = np.column_stack((col1, col2))
#Create B
B = np.array(data.y)
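
As a quick aside (not part of the original write-up), we can peek at the first few rows of A to confirm that the first column is all ones and the second holds the x values:

# Inspect the design matrix: column of ones for the intercept, then x
print(A[:3])
print("A shape:", A.shape, " B shape:", B.shape)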

Perform the matrix operations described above

#Left multiply A by A transpose
AtA = np.transpose(A) @ A
# Perform the explicit inverse on AtA
AtAinv = np.linalg.inv(AtA)

Before solving for x, let’s do a quick sanity check to see if the inverse was calculated correctly.

Remember that

\begin{equation}(A^{T}A)^{-1}(A^{T}A) = I, \ \text{thus} \ (A^{T}A)^{-1}(A^{T}A)x = Ix = x\end{equation}

print(AtAinv @ AtA)
[[1. 0.]
 [0. 1.]]
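
Rather than eyeballing the printed matrix, we could also ask NumPy to compare the product against the identity directly (an added check, not in the original post):

# Numerically compare AtAinv @ AtA to the 2x2 identity matrix
print(np.allclose(AtAinv @ AtA, np.eye(2)))  # expected: True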

Solving for x

x = AtAinv @ (np.transpose(A) @ B)
print("a = ", x[0])
print("b1 = ", x[1])
a =  3.2436220896970553
b1 =  19.9273221320193

Let’s check our work using sklearn. Note that A already contains a column of ones and LinearRegression fits its own intercept by default, so the coefficient on the ones column ends up at (effectively) zero; the intercept is reported by reg.intercept_ and the slope by reg.coef_[1].

reg = LinearRegression().fit(A, B)
print("a = ", reg.intercept_)
print("b1 =", reg.coef_[1])
a =  3.2436220896965153
b1 = 19.927322132019434
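
One last aside: explicitly inverting AtA works fine for a small, well-conditioned problem like this one, but in practice a dedicated least-squares solver is the more numerically stable route. A minimal sketch with np.linalg.lstsq, which should reproduce the same intercept and slope (up to floating-point error):

# Solve the same least-squares problem without forming an explicit inverse
x_lstsq, residuals, rank, sv = np.linalg.lstsq(A, B, rcond=None)
print("a = ", x_lstsq[0])
print("b1 = ", x_lstsq[1])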