R | Naming Things is Hard

NCAA Basketball Analysis Part 7 - Regression and Classification with XGBoost

Introduction So up to this point, we pulled down some college basketball data, feature engineered the heck out of it, and used keras to predict wins/losses as well as the point differential. In this post, I will show the implementation of the xgboost algorithm for both a regression and a classification task, the same tasks that we tackled in the last two posts. Here are links to the other posts in this series if you need a reference.

NCAA Basketball Analysis Part 5 - A Simple Neural Network with Keras

Introduction This is post five of a multi-part analysis of college basketball game outcomes. There are probably going to be things I mention in this post that I talked about in prior posts, so here are the links if you need to catch up. Post 1: Data Prep Post 2: Understanding the Dataset Post 3: Data Preprocessing Post 4: A Cluster Analysis One of the end products of this series of posts is going to be a handful of models that predict the outcome of a given game.

NCAA Basketball Analysis Part 4 - A Cluster Analysis

Introduction This is post four of a multi-part analysis of college basketball game outcomes. There are probably going to be things I mention in this post that I talked about in prior posts, so here are the links if you need to catch up. Post 1 Post 2 Post 3 This post specifically talks about a non-hierarchical clustering method, kmeans clustering. The concept behind kmeans clustering is relatively simple, and can generally be broken down into a few steps.
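
As a rough illustration of those steps, here is a minimal sketch using base R's `kmeans()`. The data is made up (two synthetic blobs, not the basketball data from this series), and the names `pts` and `fit` are just placeholders. I am assuming the classic assign-and-update loop, so the sketch asks for `algorithm = "Lloyd"` rather than the function's default.

```r
# Hypothetical toy data: two well-separated 2-D blobs, 20 points each.
set.seed(42)
pts <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2)
)

# The usual k-means steps:
#   1. Pick k and choose k starting centers.
#   2. Assign every point to its nearest center.
#   3. Move each center to the mean of the points assigned to it.
#   4. Repeat steps 2 and 3 until the assignments stop changing.
# nstart = 10 reruns the whole procedure from 10 random starts and
# keeps the run with the lowest total within-cluster sum of squares.
fit <- kmeans(pts, centers = 2, nstart = 10, algorithm = "Lloyd")

fit$size     # number of points in each cluster
fit$centers  # final cluster centers
```

On real data you would normally scale the features first and try several values of k, since the algorithm needs k chosen up front.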

NCAA Basketball Analysis Part 3 - Data Preprocessing and Feature Engineering

Introduction This is post three of a multi-part analysis of college basketball game outcomes. Part one shows how the data was acquired and loaded, and part two is a brief exploratory data analysis of a few teams for the 2005 season. Links to those can be found below. Post 1 Post 2 This post will focus on some data munging and feature engineering. The data set is currently not set up very well for modeling.

NCAA Basketball Analysis Part 2 - Understanding the Dataset

Introduction This post is part two of a multi-part analysis of college basketball outcomes. Part one showed a couple cool features of R markdown code chunks, and how to use the Kaggle API to download data. If you didn’t see that one, you can find it here. In this post, I am going to dig into the data set a bit, to understand some of the fields and their relationships with one another.

NCAA Basketball Analysis Part 1 - Downloading and Preparing the data for Analysis

Introduction Well hello there world. This is my first blog post. Yay. I am going to be starting a series of posts revolving around a Kaggle competition from a couple of years back that used NCAA basketball regular season and tournament data. The data used in this analysis comes from the Kaggle competition ‘March Machine Learning Mania 2016’. This tutorial will walk through a few things to get us ready for an analysis.