Using real estate data with over 10,000 data points and 28 explanatory variates, we analyze the housing market of Washington, D.C. with several modelling techniques: Linear Regression, Smoothing, Random Forests and Gradient Boosting. An accompanying Final Report analyzes key findings from the data and evaluates the process of selecting the "best" model.
We use R to visualize the real estate variates and understand the influence they have on the models. Our focus as a class was primarily on obtaining the best RMSLE (root mean squared logarithmic error) score against the test data on Kaggle.
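
For readers unfamiliar with the metric, here is a minimal sketch of how RMSLE can be computed. It is written in Python purely for illustration (the project itself was done in R), the function name `rmsle` is hypothetical, and it assumes the common log(1 + x) form of the metric rather than the exact definition used by the competition.

```python
import math


def rmsle(actual, predicted):
    """Root mean squared logarithmic error (assumed log(1 + x) form).

    Both inputs are equal-length sequences of non-negative numbers,
    for example sale prices.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("inputs must not be empty")

    # Squared difference of the log-transformed values, one per observation.
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for a, p in zip(actual, predicted)
    ]
    # Average the squared errors, then take the square root.
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


if __name__ == "__main__":
    # Made-up prices, used only to show the call.
    print(rmsle([350000, 720000, 1250000], [365000, 690000, 1400000]))
```

Because the error is measured on the log scale, it reflects the ratio between prediction and actual value rather than the absolute difference, so a miss on an expensive property does not swamp the score.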