Fit Nonlinear Data with a Linear Model!


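Fitting nonlinear data with a linear model is a technique called Polynomial Regression. The model stays linear in its coefficients; the nonlinearity comes from adding powers of the input as extra features. The intuition is that these extra features give the model more degrees of freedom to fit the data.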

First, we generate the data (note that y is a quadratic function of X):
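The original snippet is not reproduced here, so the following is only a minimal sketch of what such a data set could look like. The sample count, the input range, the coefficients (0.5, 1, 2), and the Gaussian noise are all assumptions chosen for illustration, not the author's values.

```python
import numpy as np

# Assumed setup: 100 samples, X uniform in [-3, 3), fixed seed for repeatability.
rng = np.random.default_rng(42)
m = 100
X = 6 * rng.random((m, 1)) - 3

# Assumed quadratic: y = 0.5 * X^2 + X + 2, plus standard normal noise.
y = 0.5 * X**2 + X + 2 + rng.normal(size=(m, 1))
```

Both X and y are kept as column vectors of shape (m, 1), which is the layout most regression libraries expect.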

The linear regression model (without Polynomial features):
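As a rough illustration of this baseline, the sketch below fits a straight line to quadratic data. It assumes scikit-learn's `LinearRegression` and an illustrative data set (the coefficients 0.5, 1, 2 and the noise are assumptions, not taken from the original post).

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data (assumed): y = 0.5 * X^2 + X + 2 plus noise.
rng = np.random.default_rng(42)
X = 6 * rng.random((100, 1)) - 3
y = 0.5 * X**2 + X + 2 + rng.normal(size=(100, 1))

# Fit a straight line to the raw feature only.
lin_reg = LinearRegression().fit(X, y)
r2_linear = lin_reg.score(X, y)  # R^2 on the training data

print("intercept:", lin_reg.intercept_, "slope:", lin_reg.coef_)
print("R^2 without polynomial features:", r2_linear)
```

A straight line cannot follow the curvature, so its R^2 on this kind of data should be noticeably below what a quadratic fit can reach.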

Adding polynomial features (X, X**2):
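One way to build these features, assuming scikit-learn's `PolynomialFeatures` is the transform in use, is sketched below. The data is illustrative, and `include_bias=False` is a choice made here so that the output holds exactly the two columns X and X**2.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Illustrative input (assumed): 100 samples, uniform in [-3, 3).
rng = np.random.default_rng(42)
X = 6 * rng.random((100, 1)) - 3

# degree=2 expands each sample x into [x, x^2];
# include_bias=False leaves out the constant column of ones.
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)

print(X_poly.shape)  # two columns per sample: X and X**2
```

The constant column is left out because the linear model fits its own intercept.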

The first five samples from X and from X_poly:
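The original output is not preserved, so the sketch below only shows how the two arrays could be inspected. The data and the use of scikit-learn's `PolynomialFeatures` are assumptions, and the printed values depend on the random seed.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Illustrative input (assumed): 100 samples, uniform in [-3, 3).
rng = np.random.default_rng(42)
X = 6 * rng.random((100, 1)) - 3
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)

first_X = X[:5]            # one column: the raw feature
first_X_poly = X_poly[:5]  # two columns: the raw feature and its square

print(first_X)
print(first_X_poly)
```

Each row of X_poly repeats the original value in its first column and holds the square of that value in its second.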

The linear regression model (with Polynomial features):
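A minimal sketch of this step, again assuming scikit-learn and an illustrative data set generated from y = 0.5 * X^2 + X + 2 plus noise (these coefficients are assumptions, not the author's):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Illustrative data (assumed): y = 0.5 * X^2 + X + 2 plus noise.
rng = np.random.default_rng(42)
X = 6 * rng.random((100, 1)) - 3
y = 0.5 * X**2 + X + 2 + rng.normal(size=(100, 1))

# Expand X into [X, X^2], then fit an ordinary linear model on both columns.
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)
poly_reg = LinearRegression().fit(X_poly, y)

# Training R^2 of the plain linear fit, for comparison.
r2_linear = LinearRegression().fit(X, y).score(X, y)
r2_poly = poly_reg.score(X_poly, y)

print("intercept:", poly_reg.intercept_)       # estimate of the constant term
print("coefficients:", poly_reg.coef_)         # estimates for X and X^2
print("R^2 linear vs polynomial:", r2_linear, r2_poly)
```

The estimator is unchanged from the plain linear case; only the input columns differ, which is why this still counts as a linear model.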

The model’s fitted intercept and coefficients are almost identical to those of the quadratic function used to generate y.

This trick has many applications in machine learning (such as Support Vector Machines). However, polynomial features can cause overfitting, especially at high degrees. One solution is to use a cross-validated grid search to pick the optimal degree for the polynomial feature function.
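The degree search could look roughly like the sketch below. This is not necessarily the author's implementation; it assumes scikit-learn's `Pipeline` and `GridSearchCV`, an illustrative quadratic data set, and an arbitrary candidate range of degrees 1 to 7.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Illustrative data (assumed): y = 0.5 * X^2 + X + 2 plus noise.
rng = np.random.default_rng(42)
X = 6 * rng.random((100, 1)) - 3
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + rng.normal(size=100)

# Chain the feature expansion and the linear model so the degree
# can be tuned like any other hyperparameter.
pipe = Pipeline([
    ("poly", PolynomialFeatures(include_bias=False)),
    ("lin", LinearRegression()),
])

# 5-fold cross-validation over the candidate degrees (assumed range 1..7).
param_grid = {"poly__degree": list(range(1, 8))}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)

best_degree = search.best_params_["poly__degree"]
print("best degree:", best_degree)
```

Scoring on held-out folds rather than on the training data is what lets the search penalize degrees that only fit the noise.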

Bonus: The grid search implementation is in the link below:

https://github.com/booletic/medium/blob/main/poly.ipynb

Programming for now!