You don’t need me to tell you that the Twins’ bullpen failures have been an “energetic” topic of discussion in recent weeks. Much internet ink has been spilled on this site, and many others, detailing the team’s bullpen woes and suggesting ideas for how to fix them.
Meltdown after meltdown after meltdown. What a mess. – Aaron Gleeman (@AaronGleeman), June 30, 2022
In the #MNTwins last five losses to Cleveland they had win probabilities as high as 92%, 98%, 82%, 95%, and 89%.
If the Twins had won all five, they'd be up 11.0 games in the AL Central.
Instead they lead by only 1.0 game.
I don’t intend to hash through all of that in this article.
Instead, I thought it might be interesting to go through some thinking surrounding a topic that commonly comes up in our game thread discussions: how the MLB-wide strategic shift to regularly feature a parade of relievers is probably increasing the risk that a bullpen will blow up a game.
The (logical) argument I have seen is that the more pitchers a team uses in a game, the more likely it is that one of them won’t have it that day. There’s sound reasoning to that theory because relievers, almost as a rule, are prone to the occasional blowup – be it from bad luck, poor performance, or any number of other things. And, the thinking goes, if the relief pitchers were more talented or consistently able to perform at a high level, they’d probably be starters.
Last fall, then-FanGraphs writer (and now Twins front office staffer) Kevin Goldstein wrote about the inherent risk in using many different relievers, focusing on the Tampa Bay Rays’ bullpenning strategy.
The beauty of some of the most valuable “advanced” stats that we have in baseball today is that they are simple and straightforward. Fielding Independent Pitching and its focus on walks, strikeouts, and home runs is one example. Along the lines of keeping it simple, Goldstein proposed a definition for a reliever blowup, which he, amusingly, called “crap-the-bed” innings:
- An ERA of 18.00 or more, and/or:
- Allowing more baserunners than outs recorded
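As a minimal sketch of how this definition could be applied to a single relief outing, the Python function below flags an appearance as a blowup. The function name, its arguments, and the handling of an outing with zero outs are assumptions made here for illustration; Goldstein's piece supplies the rule, not this code.

```python
def is_blowup(outs: int, earned_runs: int, baserunners: int) -> bool:
    """Return True if one relief outing meets either blowup condition.

    Assumed inputs (hypothetical names): outs recorded, earned runs
    allowed, and baserunners allowed in that single appearance.
    """
    # Condition 2: more baserunners allowed than outs recorded.
    if baserunners > outs:
        return True
    if outs == 0:
        # With no outs, the outing ERA is undefined (no runs) or
        # infinite (any runs); treat any earned run as a blowup.
        return earned_runs > 0
    # Condition 1: outing ERA of 18.00 or more.
    # ERA = 9 * earned runs / innings, and innings = outs / 3.
    outing_era = 27 * earned_runs / outs
    return outing_era >= 18.0
```

Under this sketch, a full inning with one walk and no runs is not a blowup, while two earned runs in one inning (an 18.00 ERA for the outing) is.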
With this definition, we don’t need to dive deep into any acronym soup metrics or x-stats to understand the likelihood that the Twins bullpen might wreck a close game. Using the available data at FanGraphs and the definition of a relief blowup above, I calculated this season's blowup numbers for each of the Twins' go-to relievers in close games.
The Twins' preferred pitching lineup this season has emerged as something like 5-6 innings from a starter, then a mix of four or five relievers to get through the rest of the game. Usually, in close games the Twins have a decent shot at winning, that’s meant Griffin Jax, followed by one or both of Caleb Thielbar and Joe Smith, and then a combination of Tyler Duffey, Emilio Pagán, and Jhoan Duran to close it out.
With those numbers calculated for each reliever, we can use simple probability to estimate the chances the team will get through a game without one of these bad outings occurring.
For example, if Jax, Thielbar, Pagán, and Duran are used in the same game, multiplying their respective non-blowup percentages together gives us a 50% likelihood that a blowup would not occur (or occur, as it were in this case). If the ‘pen parade in a game were Duffey, Smith, and Duran, the same approach would yield 58%.
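The multiplication described above can be sketched in a few lines of Python. It treats each reliever's outing as independent of the others, which is an assumption of this back-of-the-envelope method rather than an established fact, and the rates below are hypothetical placeholders, not the actual figures for any Twins pitcher.

```python
from math import prod

def no_blowup_probability(non_blowup_rates):
    """Chance that every pitcher in the list avoids a blowup,
    assuming the outings are independent of one another."""
    return prod(non_blowup_rates)

# Hypothetical non-blowup rates for a four-reliever game (placeholders).
four_arms = [0.90, 0.80, 0.78, 0.90]
p_clean = no_blowup_probability(four_arms)
print(f"No blowup: {p_clean:.1%}, at least one blowup: {1 - p_clean:.1%}")
```

With these placeholder rates the product lands near 50%, in line with the four-reliever example above.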
We don’t need to do this very many times to see that trotting out several relievers in a game comes with a significant risk of a blowup and that, as has been hypothesized, the more relievers the team uses, the more likely a reliever blowup becomes. These numbers make it plain that even when the Twins have everything lined up to throw their “A” bullpen plan this season, they have something approaching a coin flip’s chance of one of these blowups occurring.
It’s also informative to understand the probabilities if the number of relievers used can be limited. To illustrate, let’s say it’s one of the rare days where the Twins get seven innings from a starter and then can finish the game with only Jax and Duran – inarguably the Twins' two most effective and consistent relievers this season. In that scenario, the chances neither of them has a bad outing are about 85%, a significant improvement over the scenarios above when four relievers were used.
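To see how quickly the risk compounds, here is a small sketch that assumes every reliever used has the same non-blowup rate and that outings are independent. The 90% figure is a round number chosen for illustration rather than any pitcher's actual mark, and the function name is hypothetical.

```python
def clean_game_probability(rate: float, relievers: int) -> float:
    """Chance that `relievers` pitchers, each with the same non-blowup
    rate, all avoid a blowup (outings assumed independent)."""
    return rate ** relievers

for n in range(1, 6):
    print(f"{n} reliever(s): {clean_game_probability(0.90, n):.1%}")
```

Under these assumptions, two relievers at 90% come in around 81% and four at about 66%; mixing in lower-rated arms pulls the figure down further.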
Now, not every bullpen blowup, by this definition, equates to a loss or even a lost lead. Often these pitchers are working with a three-run or greater lead or the bad outing can be salvaged by stranding the extra baserunners without them scoring. But the point remains that each additional new resident on the mound increases the club’s risk of a bad outcome, especially if it means the pitchers coming in are farther down the bullpen hierarchy.
With this in mind, a common response from people irritated by the relief-heavy ways of the game today is that teams and their spreadsheet nerd front offices need to trust their starting pitchers to work deeper into games. But it’s not as simple as throwing fewer relievers to mitigate the blowup risk – someone has to throw those innings, and those pitchers have blowup risk, too.
Relying on the starters for more would usually mean entrusting them with facing the opposing batting order for the third time (or potentially fourth). The penalty pitchers experience on the third trip through the opposing lineup has been well studied. Today, it is generally accepted across baseball that, the third time through the order (TTO), starters are expected, on average, to pitch around 0.35 runs per nine innings worse than they do overall.
But that’s comparing the starting pitchers to themselves. We know they’ll be worse than they were earlier in the game. We’re interested in a different, but somewhat related question. Is a tiring starting pitcher a better bet to avoid a blowup facing the lineup a third (or fourth) time through the order than a fresh reliever might be?
Thanks again to FanGraphs and its splits tool, we can investigate that question for the Twins' starting pitchers. Using the same per outing definition of a blowup as I did with the relievers above, here’s how the Twins' most common starters this season have fared in their careers when they have been allowed to face batters a third or fourth time:
It’s worth noting that the vast majority of the opportunities for the veterans Sonny Gray, Dylan Bundy, and Chris Archer to work deep into a game like this came earlier in their careers before the management of the game had fully adopted the times through the order penalty (and before their workloads needed to be managed quite as carefully as they do now).
Over the past three seasons, their respective non-blowup percentages have been significantly worse than the career marks in the table above — Gray: 20 for 40, 50%; Archer: 3 for 5, 60%; Bundy: 18 for 31, 58%.
With this data, the managerial calculus is not quite as clean-cut as the argument for stretching the starters further would make it seem.
Let’s play out an example to illustrate. Say Gray completes five innings having faced the opposing order twice. Rocco Baldelli’s choices at that point are to stick with him for the sixth inning and gamble with the TTO penalty or go to a fresh reliever to start what’s likely to be (at least) a four-arm bullpen parade.
Suppose that if he pulls Gray now, he would use Duffey, Thielbar, Pagán, and Duran to cover the last four innings. If he stays with Gray now, he’ll still use Thielbar, Pagán, and Duran to finish the game. Here’s how the blowup math looks for those two paths:
- Stay with Gray: 40.6% chance of no blowups
- Go with Duffey: 41.3% chance of no blowups
It’s basically a toss-up. And that estimate uses Gray’s full career marks, not his (much worse) more recent numbers. In fairness, that bias is likely offset by using Duffey’s and Pagán’s 2022 numbers. Both of them have career non-blowup marks of around 85%, notably better than this season. If we give the two relievers the benefit of the doubt and use their larger-sample career marks for this math, the chances look something like this:
- Stay with Gray: 51.7% chance of no blowups
- Go with Duffey: 56.5% chance of no blowups
That gives a clearer edge in favor of going to the bullpen. I fully acknowledge this is far from a perfect approach. This is more of a rough, back-of-the-envelope calculation than a definitive, scientific analysis. There are all kinds of legitimate critiques someone could level against this approach. (For instance, it does not consider things like managing workloads and reliever appearances on back-to-back days, platoon matchups, player slumps, and many other variables that might come into play.)
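One way to read the two comparisons above: because Thielbar, Pagán, and Duran appear in both paths, their rates cancel when one path is divided by the other, so the choice comes down to Gray's third-time-through rate versus Duffey's rate. The sketch below works only from the percentages quoted in this article, treats the "around 85%" career mark for Duffey as exactly 0.85 (an assumption), and uses hypothetical variable names. The implied rates it backs out are rough estimates, not published figures.

```python
# Path probabilities quoted in the article.
stay_2022, go_2022 = 0.406, 0.413      # relievers at 2022 marks
stay_career, go_career = 0.517, 0.565  # Duffey and Pagán at career marks

# The shared relievers cancel, leaving the ratio Gray / Duffey.
ratio_2022 = stay_2022 / go_2022
ratio_career = stay_career / go_career

# Assumption: Duffey's career non-blowup rate is exactly 0.85.
duffey_career = 0.85
implied_gray = ratio_career * duffey_career
implied_duffey_2022 = implied_gray / ratio_2022

print(f"Implied Gray rate (third time through): {implied_gray:.3f}")
print(f"Implied Duffey 2022 rate: {implied_duffey_2022:.3f}")
```

Under these assumptions both implied rates land in the high 70s as percentages, which is why the two paths sit so close together when the 2022 reliever marks are used.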
But as a rough tool, this approach is useful enough for illustrating the risk balancing that comes with bullpen management decisions. In general, I think we’re on reasonably solid ground to buy what this data suggests: that the Twins' best relievers have around a 90% chance of getting through their outings without a big blowup. The rest of the bullpen members have something like a 75% to 85% chance that decreases as you move down the pecking order. And the starters, given the veterans' recent results and the youngsters' small sample sizes when facing the opposing lineup a third time, probably have something in the bottom half of that range or lower.
While the differences are not huge and you can rightly quibble with the specifics, I think this gives a little (very high-level) analytic insight into why teams are quick to hook their starters and turn games over to the reliever assembly line. Yes, using more relievers in a game increases the risk one of them might blow up. But, even with the Twins’ mostly underperforming 2022 bullpen cast, the numbers still come down slightly in favor of turning to a fresh reliever instead of having a starter face the lineup a third time.