Introduction
In baseball forecasting, it is widely assumed that more data is better when modeling future performance. Today we examine that assumption for pitchers and find that a smaller data set is occasionally better. We also explore the point at which recent data becomes more significant than historical data.
Methodology
We will use pitching data from 2010-2017, since 2010 is the season in which Baseball Info Solutions began using an algorithm to classify quality of contact. For annual data, we use pitchers with ≥ 120 IP. For monthly data, we use only pitchers with ≥ 25 IP in that month.
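As a rough illustration of the sample selection described above, the sketch below filters pitcher records by season range and an innings-pitched minimum. It assumes the thresholds are minimums (at least 120 IP per season, at least 25 IP per month) and that innings are stored as plain decimals. The record layout, field names, and function name are hypothetical, and the sample rows are invented placeholders rather than real pitcher data.

```python
# Thresholds taken from the methodology; treating them as minimums is an assumption.
ANNUAL_MIN_IP = 120.0
MONTHLY_MIN_IP = 25.0
FIRST_SEASON, LAST_SEASON = 2010, 2017


def qualifying_rows(rows, min_ip):
    """Keep rows inside the 2010-2017 window that meet the innings minimum.

    Each row is a dict with hypothetical keys "season" and "ip".
    """
    return [
        row for row in rows
        if FIRST_SEASON <= row["season"] <= LAST_SEASON and row["ip"] >= min_ip
    ]


# Invented placeholder records, not real data.
annual_rows = [
    {"pitcher": "Pitcher A", "season": 2012, "ip": 180.0},
    {"pitcher": "Pitcher B", "season": 2012, "ip": 95.0},   # below 120 IP
    {"pitcher": "Pitcher C", "season": 2009, "ip": 200.0},  # before 2010
]

annual_sample = qualifying_rows(annual_rows, ANNUAL_MIN_IP)
```

The same function can be applied to monthly records by passing `MONTHLY_MIN_IP` instead.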
Throughout the article we will use R² as a measure of correlation between data sets. The R² value describes...
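For readers who want to reproduce this kind of comparison, here is a minimal sketch of computing R² between two paired data sets. It assumes R² means the square of the Pearson correlation coefficient, which the visible text does not confirm. The function name and the sample numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def r_squared(xs, ys):
    """Return the squared Pearson correlation of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("R^2 is undefined when either sample is constant")

    return (cov * cov) / (var_x * var_y)


# Invented numbers: a perfectly linear pair and a noisier pair.
perfect = r_squared([1, 2, 3, 4], [2, 4, 6, 8])
noisy = r_squared([1, 2, 3, 4], [1, 3, 2, 4])
```

In practice the two inputs would be the same pitchers' values for a metric in two different periods, for example one season and the next.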