Tidyverse | Naming Things is Hard

NCAA Basketball Analysis Part 4 - A Cluster Analysis

Introduction

This is post four of a multi-part analysis of college basketball game outcomes. I will probably mention things in this post that I covered in prior posts, so here are the links if you need to catch up: Post 1, Post 2, Post 3. This post covers a non-hierarchical clustering method, k-means clustering. The concept behind k-means clustering is relatively simple, and can generally be broken down into a few steps.

NCAA Basketball Analysis Part 3 - Data Preprocessing and Feature Engineering

Introduction

This is post three of a multi-part analysis of college basketball game outcomes. Part one shows how the data was acquired and loaded, and part two is a brief exploratory data analysis of a few teams from the 2005 season. Links to those can be found below: Post 1, Post 2. This post will focus on some data munging and feature engineering. The data set is currently not set up very well for modeling.

NCAA Basketball Analysis Part 2 - Understanding the Dataset

Introduction

This post is part two of a multi-part analysis of college basketball outcomes. Part one showed a couple of cool features of R Markdown code chunks, and how to use the Kaggle API to download data. If you didn’t see that one, you can find it here. In this post, I am going to dig into the data set a bit to understand some of the fields and their relationships with one another.