In 1988, when the statistical revolution that would become Moneyball was in its infancy, Bill James wrote:
A power pitcher has a dramatically higher expectation for future wins than does a finesse pitcher of the same age and ability.
A quarter century later, this claim remains somewhat controversial, as can be seen in the boards. I do not propose to end the argument today. Indeed, I'm hoping to start a debate on it. To that end, I'd like to share a nascent data set and some early conclusions. I believe that the initial results will surprise some.
To begin tackling the larger claim, I'd like to address a slightly different question: are power pitchers more likely to repeat a strong year, controlling for age? This approximates James's original claim by making the (dubious) assumption that two pitchers who had a strong year are of equal ability, and it controls for age to address the "same age" part of the claim.
I started by taking every pitcher in 2009 who threw more than 100 innings, which left me with 130 pitchers. I then needed a way to divide pitchers into power pitchers vs. finesse pitchers. As a first pass, I decided to calculate their average velocity. The idea behind this is that true power pitchers, such as Verlander and Felix Hernandez, tend to throw everything hard. The velocities went from a high of 91.8 MPH for Josh Johnson to a low of 65.63 (1 million CubsDen Dollars to the poster who can guess who that was. Cash value of 1 million CubsDen Dollars = $0.00 USD). I then divided pitchers into thirds based on velocity and considered the top third to be power pitchers and the bottom third to be finesse pitchers.
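For anyone who wants to try the split on their own numbers, here is a minimal sketch of dividing pitchers into thirds by rank. The function name and the velocities are made up for illustration; they are not the actual 2009 data.

```python
def split_into_thirds(values_by_name):
    """Label each name 'low', 'middle', or 'high' by the rank of its value."""
    ranked = sorted(values_by_name, key=values_by_name.get)  # ascending
    n = len(ranked)
    labels = {}
    for rank, name in enumerate(ranked):
        if rank < n / 3:
            labels[name] = "low"
        elif rank < 2 * n / 3:
            labels[name] = "middle"
        else:
            labels[name] = "high"
    return labels

# Made-up average velocities (MPH) for six hypothetical pitchers.
velocities = {"A": 91.0, "B": 89.5, "C": 88.2, "D": 86.9, "E": 85.0, "F": 82.3}
velocity_group = split_into_thirds(velocities)
```

Under this labeling, "high" would be the power pitchers and "low" the finesse pitchers. The same function works for any numeric stat, as long as you keep track of which end of the ranking is the good one.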
Next, I needed to determine a performance metric. I determined that ERA was appropriate. Why ERA? Because metrics such as FIP are based on James's work and may give a slight edge to power pitchers, due to the emphasis on strikeouts. As I did with velocity, I divided pitchers into thirds based on ERA.
The results are summarized in the figure above. A trend can be seen in this data. High velocity pitchers tend to have lower ERAs and low velocity pitchers tend to have higher ERAs. Thus, based on this data set, we find evidence that is consistent with Friday's work: high velocity (strikeout) pitchers are more likely to have a lower ERA.
However, the question I want to address here is whether pitchers who have a good year are likely to repeat that year. Therefore, I took the pitchers in the high velocity-low ERA box and the low velocity-low ERA box and tracked them into the future.
The power pitchers were Edwin Jackson, Yovani Gallardo, Zack Greinke, Josh Johnson, CC Sabathia, Justin Verlander, Clayton Kershaw, Jake Peavy, Hiroki Kuroda, Ubaldo Jimenez, Jon Lester, Felix Hernandez, Roy Halladay, Josh Beckett, Matt Cain, and Cub favorite Carlos Zambrano. The finesse pitchers were Javier Vazquez, Jarrod Washburn, Bronson Arroyo, Adam Wainwright, Ted Lilly, Dallas Braden, Mark Buehrle, Wandy Rodriguez, Kenshin Kawakami, Jered Weaver, John Lannan, and Randy Wolf.
The first thing I did was enter their ages and ERAs for the 2010, 2011, 2012, and 2013 seasons. Jarrod Washburn's final season was 2009, so he was dropped from the data set. Since we're interested in whether pitchers could repeat a good season, I calculated the difference in ERA between their strong 2009 season and the four following seasons. Hereafter, this will be referred to as the ERA differential.
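A quick sketch of that calculation, under two assumptions of mine: each later season gets its own differential, and the sign is later-season ERA minus 2009 ERA, so a positive number means the pitcher got worse. The ERAs below are invented for a hypothetical pitcher.

```python
def era_differentials(era_by_season, base_season=2009):
    """Return {season: ERA that season minus ERA in base_season}.

    Seasons with no ERA (None) are skipped. The sign convention
    (later minus base) is an assumption, not something fixed by the data.
    """
    base = era_by_season[base_season]
    return {
        season: round(era - base, 2)
        for season, era in era_by_season.items()
        if season != base_season and era is not None
    }

# Made-up ERAs; None marks a season the pitcher did not pitch.
example = {2009: 3.00, 2010: 3.50, 2011: 2.75, 2012: None, 2013: 4.25}
diffs = era_differentials(example)
```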
Warning: the next section is kind of mathy. You can skip it and not miss anything.
First, I ran a simple comparison of the two groups: a t-test, which checks whether the average ERA differential for power pitchers is statistically lower than the average for finesse pitchers.
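For the curious, here is roughly what a t-test computes. This sketch uses Welch's version, which does not assume the two groups have equal variance; that choice is my assumption for illustration. The numbers are made up, and turning the t statistic into a p-value takes a t-distribution table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each group's mean.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Made-up ERA differentials for two hypothetical groups.
power = [0.2, 0.9, -0.1, 1.4, 0.6]
finesse = [0.5, 1.1, 0.3, 0.8, 1.3]
t_stat, dof = welch_t(power, finesse)
```

A t statistic close to zero means the gap between the group averages is small relative to the scatter within the groups.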
I then regressed the difference on a dummy variable for finesse, age, and age squared. The dummy variable is a variable equal to 1 if the pitcher is a finesse pitcher and 0 otherwise. In a regression, this will give the impact of being a finesse pitcher on ERA differential. Age and age squared are included to reflect the idea of a "prime." Players should get better up to a certain age, and then start to slowly get worse. If this is true, the coefficient on age will be positive and the coefficient on age squared will be negative.
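Here is a bare-bones sketch of that kind of regression, written without any libraries so the mechanics are visible. The `ols` helper and the sample rows are hypothetical, and the sketch only produces coefficients; judging whether a coefficient is statistically different from zero also needs standard errors, which a statistics package would report.

```python
def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one list of predictors per observation (an intercept is added here).
    Returns the coefficients as [intercept, b1, b2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Made-up rows: (finesse dummy, age, ERA differential). Illustration only.
sample = [
    (0, 25, 0.40), (0, 28, 0.10), (0, 31, 0.35), (0, 34, 0.90),
    (1, 26, 0.55), (1, 29, 0.30), (1, 32, 0.60), (1, 35, 1.20),
]
predictors = [[finesse, age, age ** 2] for finesse, age, _ in sample]
outcome = [diff for _, _, diff in sample]
intercept, b_finesse, b_age, b_age_sq = ols(predictors, outcome)
```

Here `b_finesse` plays the role of the finesse coefficient discussed above: the estimated shift in ERA differential for a finesse pitcher, holding age fixed.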
If you skipped the mathy part, you can pick up again here.
Simply comparing the two sets of differentials does not indicate any difference between the groups. In fact, the t-test comes nowhere close to statistical significance: a gap this size would show up about 80% of the time by chance alone, even if power pitchers and finesse pitchers had identical differentials.
When this is extended to regression analysis, I find the same thing. The coefficient on finesse pitchers, while positive, is not statistically different from zero. The coefficients on age have the expected signs, though the magnitude suggests the "prime" occurs in the early to mid-30s.
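For readers wondering how an age for the "prime" falls out of two coefficients, here is the arithmetic. The symbols are my own notation for the regression described earlier, not output from the actual fit.

```latex
\Delta\mathrm{ERA} = \beta_0 + \beta_1\,\mathrm{finesse} + \beta_2\,\mathrm{age} + \beta_3\,\mathrm{age}^2 + \varepsilon

\frac{\partial\,\Delta\mathrm{ERA}}{\partial\,\mathrm{age}} = \beta_2 + 2\beta_3\,\mathrm{age} = 0
\quad\Longrightarrow\quad
\mathrm{age}^{*} = -\frac{\beta_2}{2\beta_3}
```

With a positive coefficient on age and a negative coefficient on age squared, the curve turns over at age*, so the implied prime is just the ratio of the two estimated age coefficients.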
The results give hope to both sides of the debate. It is possible that the data set is simply not big enough. If more players and years are added, it may turn out that the positive coefficient on finesse pitchers remains and becomes significant.
However, for now, the conclusion I draw is that, while a fireballer is more likely to have a low ERA, a good pitcher is a good pitcher and just as likely to remain so year over year.