In 2020, everything has been out of whack: wacky sports seasons, altered day-to-day life, Cleveland refusing to pay their good players (oh wait, that one’s normal). And much like our news sources these days, it’s hard to know how much we should trust the player statistics recorded in the 2020 season. At 60 games, 37% of a normal season, the sample is hardly large enough to draw meaningful conclusions from (hear that, White Sox fans?).
Further complicating the question are the 2019 offensive breakouts from Max Kepler and Jorge Polanco. Both had reasonably expected breakout seasons in 2019, followed by steeper-than-expected regression in 2020. However, particularly in these two cases, other factors may have contributed to the drop-offs. Polanco underwent ankle surgery immediately after the season, and it was revealed that he had played hurt for most of the year, which may help explain the cratering of his power numbers. Kepler spoke out about the mental health challenges of playing in the unique off-field circumstances of 2020, which may have contributed to his struggles.
Case 1: Jorge Polanco
Using Polanco as an example, let’s examine how (un)stable a sample of 55 games (the number he played in 2020) really is.
Polanco 55-Game Samples
As this table illustrates, there’s a lot of variance in a sample this small. Polanco almost certainly isn’t a .338 hitter, as he was through 55 games of 2019, and he almost certainly isn’t a .258 hitter, as he was this season. His 2020 was the worst (non-overlapping) 55-game sample I examined by OPS, and the second-worst by batting average. Coming off a career year, some regression was expected. However, for a player just entering his prime, regressing all the way back to his age-23 production would be very unusual. I think it’s reasonable to pin 2020 on a bad small sample, aided and abetted by a balky ankle.
ⓘ Official sources stated that Jorge Polanco’s 2020 is false and misleading
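To put a rough number on how noisy a 55-game stretch is, here is a minimal sketch that treats every at-bat as an independent coin flip with a fixed hit probability. That model is a simplification, and both the 4 at-bats per game and the .280 "true" average are assumptions picked for illustration, not figures from the samples above.

```python
import math


def batting_average_sd(true_average, at_bats):
    """Standard deviation of an observed batting average under a binomial model."""
    return math.sqrt(true_average * (1 - true_average) / at_bats)


AT_BATS_PER_GAME = 4   # assumption: a rough figure for an everyday hitter
TRUE_AVERAGE = 0.280   # hypothetical "true talent" level, not a measured value

for games in (55, 162):
    at_bats = games * AT_BATS_PER_GAME
    sd = batting_average_sd(TRUE_AVERAGE, at_bats)
    print(f"{games} games ({at_bats} AB): +/- {sd * 1000:.0f} points of batting average")
```

Under those assumptions, one standard deviation is about 30 points over 55 games versus about 18 points over 162, so a true .280 hitter landing anywhere from roughly .250 to .310 in a 55-game sample would be unremarkable.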
Case 2: Max Kepler
Max Kepler has never really been a high-average hitter, but what he lacks in average, he makes up for in extra-base hits. His 48-game sample in 2020 signaled some stark regression from his 2019 numbers, but not necessarily from his career norms.
Kepler 48-Game Samples
As we can see, Kepler is nearly as volatile in 48-game samples as Polanco is in 55-game samples. However, his 2020 really isn’t out of the ordinary: it falls toward the median of his samples rather than at the bottom end (as Polanco’s did). On the other hand, Kepler has shown a weak upward trend in his batting numbers over these samples, which is pretty typical of hitters as they progress toward their primes, and that trend did not continue in 2020. I think we should expect Kepler to wind up closer to his 2019 numbers than his 2020 numbers in 2021, but his 2020 is more telling than Polanco’s.
ⓘ Official sources stated that Max Kepler’s 2020 is not really false and misleading
Case 3: Byron Buxton
LOL Just kidding. There are no conclusions to be made from any sample size in Buxton’s career.
Statistically speaking, Nelson Cruz had the most telling (and simultaneously useless) season of all the Twins in 2020. Cruz had another great season, his 13th (out of 16) with an OPS over .800. Seasons like his, which affirm what we have already concluded (that Cruz is a great hitter), are more trustworthy than seasons that buck trends or are outliers (Kepler and Polanco, respectively). However, we already knew that Cruz was a great hitter, and we’re still waiting to find out what Polanco and Kepler will be in their final forms, so even though his stats were trustworthy, they’re just as useless as anyone else’s were in 2020. Had Kepler or Polanco produced closer to their 2019 levels in 2020, we would all be pointing to their seasons and saying “See! They really are that good!” But their stats would have been just as meaningless had they been good.
The Twins also have a pitcher who is a clear example of the small sample size problem. Taylor Rogers disappointed Twins fans in 2020, posting the worst ERA of his career (4.05). He pitched only about a third of his normal innings load, at 20.0 innings. Had he given up one earned run over his next 11 innings (pretty reasonable for a guy who finished 2018 with 26 consecutive scoreless innings), his ERA would have come in at 2.90, and we would all feel very differently about his season than we do now.
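The Rogers arithmetic can be sketched in a few lines. ERA is earned runs per nine innings; the 9 earned runs used here is an inferred figure (it is the whole number that reproduces the 2.90 scenario over 31 innings), not a number stated above, so treat it as an assumption.

```python
def era(earned_runs, innings):
    """Earned run average: earned runs allowed per nine innings pitched."""
    return 9 * earned_runs / innings


EARNED_RUNS_2020 = 9   # assumption: inferred from the figures in the text
INNINGS_2020 = 20.0    # from the text

# The what-if from the text: one more earned run over the next 11 innings.
what_if = era(EARNED_RUNS_2020 + 1, INNINGS_2020 + 11)
print(f"What-if ERA over 31 innings: {what_if:.2f}")

# How much one extra earned run moves the ERA at this workload.
swing = era(EARNED_RUNS_2020 + 1, INNINGS_2020) - era(EARNED_RUNS_2020, INNINGS_2020)
print(f"One extra earned run in 20 innings: +{swing:.2f} ERA")
```

At a 20-inning workload, every single earned run is worth 0.45 of ERA, which is why one bad outing (or one good month that never got played) can swing the whole season line.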
As far as player statistics go, there just isn’t much to be gleaned from 2020. The extenuating circumstances, paired with the small sample sizes, render 2020’s stats untrustworthy. Luckily for the Twins, they can mostly expect positive regression in 2021, while the White Sox, er, some other teams will likely experience the opposite.