A few days ago, we were wondering in the comments if there was a way to isolate how variable a pitcher’s performance is. In other words, if Pitcher A always throws seven innings of two-run baseball, and Pitcher B alternates between nine-inning shutouts and five-inning starts with four runs given up, they have the same ERA and the same innings pitched, but probably a very different fan perception. What I found out is that there is no real stat to track this; the closest might be game score. So, I did what any crazy baseball fan with access to game logs from Baseball Reference and Microsoft Excel might do: I made up my own stat.
The formula: First, I found the standard deviation in the number of innings pitched, and in the number of earned runs given up, per start by each Twins pitcher. We could factor in things like walks, hits, and home runs, but I didn’t, to keep this simple. This alone yields an interesting data set.
Next I took the standard deviation in number of innings pitched, and multiplied it by the average number of innings pitched per start for the league, and did the same for runs given up. Then I took those numbers, added them together, and divided the sum by the number of starts each pitcher made.
The whole thing can be expressed as ((pitcher’s stdev IP * league average IP) + (pitcher’s stdev ER * league average runs against)) / pitcher’s number of starts. Currently for 2019, starting pitchers are averaging 5.3 innings pitched per start, and 4.78 runs per game. This yields a number I will express to three decimal places. We will call this number the Variability Factor (VF) below.
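For anyone who would rather script this than build it in Excel, here is a minimal Python sketch of the formula above. The function name `variability_factor` and the constant names are made up for illustration, and it assumes the sample standard deviation (what Excel's `STDEV` computes), since the exact Excel function used isn't stated here. Innings are assumed to be entered as true decimals (5 1/3 innings as 5.333), not in box-score notation (5.1).

```python
from statistics import stdev

# 2019 league averages for starters quoted above.
LEAGUE_AVG_IP = 5.3     # innings pitched per start
LEAGUE_AVG_RUNS = 4.78  # runs per game


def variability_factor(innings, earned_runs,
                       league_ip=LEAGUE_AVG_IP,
                       league_runs=LEAGUE_AVG_RUNS):
    """Return the VF for one pitcher from per-start game logs.

    innings and earned_runs are equal-length lists with one entry per start.
    Assumption: sample standard deviation (n - 1 denominator).
    """
    if len(innings) != len(earned_runs):
        raise ValueError("need one innings value and one runs value per start")
    starts = len(innings)
    if starts < 2:
        raise ValueError("standard deviation needs at least two starts")

    # (stdev IP * league avg IP) + (stdev ER * league avg runs), over starts
    weighted = stdev(innings) * league_ip + stdev(earned_runs) * league_runs
    return round(weighted / starts, 3)


# The two hypothetical pitchers from the intro, over four starts each.
pitcher_a = ([7, 7, 7, 7], [2, 2, 2, 2])  # always seven innings, two runs
pitcher_b = ([9, 5, 9, 5], [0, 4, 0, 4])  # shutout, then five innings and four runs

vf_a = variability_factor(*pitcher_a)
vf_b = variability_factor(*pitcher_b)
```

Both pitchers total 28 innings and 8 earned runs, so their ERAs match, but Pitcher A comes out with a VF of zero while Pitcher B's is well above it. Because the formula divides by number of starts, a tiny four-start sample like this one is not comparable to a pitcher with a full season of game logs.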
A perfectly consistent pitcher will have a VF of zero. The closer they are to zero, the less variability in their appearances. The lower realistic limit is around .500. The upper limit is technically infinite, but is a little below three within the bounds of actual, realistic MLB numbers. Therefore, we want to realistically look at between .5 and 2.5 as being the “range” that a pitcher will fall into, and grade accordingly. It would take an incredible amount of data entry, and/or some computer programming, neither of which I am up for, to get a truly useful sample size of pitchers, but we can easily compare amongst a smaller sample.
Here is how the Twins starters stack up.
Twins Starter VF
**I only counted starts, so Martin Perez’ three relief outings are not included.
Keep in mind, this is only measuring consistency of performance, not quality. In other words, a guy who always pitches five innings and gives up five runs is very consistent. So, as you can see above, Jose Berrios has been very consistent. He has also been consistently very good. We can tell that by comparing his FIP to his VF. Low FIP and low VF will imply a pitcher is consistently good. High FIP and low VF implies he is consistently not good. And of course, the inverse also applies. (ERA also works here, but FIP removes team defense as an element.)
Twins starter VF vs FIP
By inserting FIP into this table, we are able to better put the VF into context. Jose Berrios, as I mentioned, is consistently good. Michael Pineda is also very consistent, but consistently mediocre. Jake Odorizzi is a much more inconsistent pitcher, but when he is good, he is very good, and he had a couple of outliers skewing his numbers. (Remember that game in Philadelphia where he didn’t get out of the first inning?)
Perhaps I shouldn’t be surprised, as I’ve been defending him for years, but Kyle Gibson is sitting exactly where he belongs, in the middle of the Twins staff. I think we can finally declare the old good Gibby/bad Gibby myth busted, as Gibson has been a fairly consistent pitcher across his fourteen starts this season.
While I’m convinced this is an utterly worthless stat, especially for starters, I still find it fascinating, and it provides some great context the next time a guy struggles through a start, or utterly shuts down an opponent. Is that one start truly indicative of who the pitcher is, or is it more likely to be an aberrant data point? Furthermore, much of the basis of sabermetric thought is that the aggregate is more important than individual data points. In other words, it doesn’t matter exactly how a guy gets to his average numbers, only what those averages are. That’s incredibly unsatisfying as a fan, but is an important caveat to remember for an analysis such as this.
Does a study on consistency change your mind on any of the Twins pitchers?