Introduction
This is post four of a multi-part analysis of college basketball game outcomes. I will probably refer back to things I covered in prior posts, so here are the links if you need to catch up.
Post 1 Post 2 Post 3
This post covers a non-hierarchical clustering method, k-means clustering. The concept behind k-means clustering is relatively simple and can generally be broken down into a few steps.
Introduction
This is post three of a multi-part analysis of college basketball game outcomes. Part one shows how the data was acquired and loaded, and part two is a brief exploratory data analysis of a few teams for the 2005 season. Links to those can be found below.
Post 1 Post 2
This post will focus on some data munging and feature engineering. The data set is not yet in a good shape for modeling.
Introduction
This post is part two of a multi-part analysis of college basketball outcomes. Part one showed a couple of cool features of R markdown code chunks and how to use the Kaggle API to download data. If you didn’t see that one, you can find it here.
In this post, I am going to dig into the data set a bit to understand some of the fields and their relationships with one another.