week 2: reflections on regression

In Data Science, by Parker Tichko

As Henry mentioned, Chapter 3 highlighted one of the most common linear methods that we've previously encountered in our statistics courses – linear regression. This chapter essentially consolidated a semester's coursework on regression into a single reading. As such, many of the topics were familiar to me (e.g., simple regression, multiple regression, outliers, leverage), but reviewing the material again was helpful. In addition to revisiting the conceptual material, I worked through the R regression exercises and learned a few new functions along the way, such as I(), names(), and, most importantly, predict(). Finally, I analyzed a data set that showed a clear curvilinear (i.e., non-linear) trend using polynomial regression. This approach yielded a polynomial regression model with a polynomial term raised to the 7th degree. I might be overfitting the data, so I hope to return to this analysis once we cover the section on model selection. Below, I've included some plots of the fitted values that increasingly capture the non-linear trend of the data set.
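As a rough sketch of the workflow described above, here is how a polynomial fit might look in R. The data frame `df` and the variable names `x` and `y` are hypothetical stand-ins, not the actual data set from the exercise:

```r
# Hypothetical data frame `df` with predictor `x` and response `y`
fit1 <- lm(y ~ x, data = df)            # simple linear regression
fit7 <- lm(y ~ poly(x, 7), data = df)   # polynomial term raised to the 7th degree

# I() protects arithmetic inside a formula, e.g. a quadratic term:
fit2 <- lm(y ~ x + I(x^2), data = df)

# predict() computes fitted values over a grid of new predictor values
grid  <- data.frame(x = seq(min(df$x), max(df$x), length.out = 100))
preds <- predict(fit7, newdata = grid)

# Overlay the fitted curve on the raw data
plot(df$x, df$y)
lines(grid$x, preds, col = "red")
```

Comparing fits of increasing degree this way (fit1, fit2, ... fit7) is what produces the sequence of plots below, where higher-degree curves track the non-linear trend more closely, at the risk of overfitting.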

[Figure: Rplot_PolynomialRegression] Using polynomial regression to model a curvilinear relationship in the data. The model with a polynomial term raised to the 7th degree seemed to capture the non-linear trend the best.
