Checking PECOTA -- Batting Average

I'm kind of tempted to just post the graph without comment, but that might not be very instructive.


I'm using the same methods I used way back in this post from mid-April, where I evaluated the start by the Twins' hitters.

Batting average check-up

Reading the graph is fairly simple.  Plotted along the horizontal axis is the number of hits we'd expect each hitter to have, given his ABs this season and his PECOTA-projected batting average (that is, ABs times projected average).  Plotted along the vertical axis is the number of actual hits each Twins hitter has this season.
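
As a quick illustration of the horizontal-axis number, here is a minimal sketch of the expected-hits calculation.  The function name and the 300-AB hitter are made up for the example and are not taken from the actual data.

```python
def expected_hits(at_bats, projected_avg):
    """Hits a hitter would have if he hit exactly his projected average."""
    return at_bats * projected_avg


# Hypothetical hitter: 300 ABs with a .302 projected average.
print(round(expected_hits(300, 0.302), 1))  # about 90.6 expected hits
```

A hitter with that many ABs and more than about 91 actual hits would plot above the x=y line, and one with fewer would plot below it.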

The solid black line is the best-fit line between actual and predicted hits.  Its R^2 value of 0.89 would seem to indicate a good fit, but as I mentioned in the original post, R^2 is not really a good measure of fit quality in this case.

The red line is the x=y line.  Any points that lie on this line represent a hitter who has exactly as many hits as we would have expected given his PECOTA forecast.  Any points above this line represent a player out-performing his PECOTA forecast, and points below this line are players under-performing their PECOTA forecasts.  How far to the right or left a point sits just reflects whether the player has a lot of hits or not many (mostly a matter of how many ABs he's had), and doesn't indicate whether his batting average has been good or bad this year.  Only the distance above or below the line indicates whether he has over-performed or under-performed expectations.

The reduced chi square value for the fit to the red line is 1.54.  (This is explained in more detail in the original post.)  If the scatter around the line were exactly what you'd expect from statistical noise alone, the reduced chi square value would be 1.00.  1.54 is on the high side, but not bad.  It represents about as good a fit as you would expect given the R^2 value of 0.89, so in this case the R^2 and chi square statistics give us about the same information.  (This is in contrast to the mid-April results.  One big reason is that the standard deviation in the number of hits is pretty similar for each of the hitters at this point in the season, whereas it varied more at the beginning of the season.)
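
For anyone who wants to play along at home, here is a minimal sketch of one common way to compute a reduced chi square against the x=y line.  It assumes each hitter's hit total is binomial, so the variance in hits is ABs times p times (1 - p), with p the projected average.  That variance model, the function name, and the choice to divide by the number of hitters (since the x=y line has no fitted parameters) are assumptions made for this sketch, and may not match the original calculation exactly.

```python
def reduced_chi_square(actual_hits, at_bats, projected_avgs):
    """Reduced chi square of actual hits against the x=y (projection) line.

    Assumes binomial noise: variance in hits = AB * p * (1 - p).
    No parameters are fitted, so we divide by the number of hitters.
    """
    total = 0.0
    for hits, ab, p in zip(actual_hits, at_bats, projected_avgs):
        expected = ab * p
        variance = ab * p * (1 - p)
        total += (hits - expected) ** 2 / variance
    return total / len(actual_hits)


# Made-up example: one hitter exactly one standard deviation above projection.
# 100 ABs at a .500 projection gives 50 expected hits with a variance of 25.
print(reduced_chi_square([55], [100], [0.5]))  # 1.0
```

Each hitter contributes the square of how many standard deviations he is away from his projection, which is why a value near 1 means the misses are about the size of ordinary noise.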

What does this tell us about the players?

Basically, it tells us that Rondell sucks, Mauer rocks, and everyone else has been about as expected.  (At least as far as batting average is concerned.)

Now that Rondell is officially RonDL as a Twin, we can blame his poor start on injury.  Back in mid-April, his extremely poor start was the only one suspiciously far from the prediction, and that should have been a big tip-off that he was injured.

Mauer's PECOTA-forecasted batting average was a really good .302.  That was one of the best expected batting averages in baseball.  Mauer is obliterating that projection.  One possible explanation for why his projection is so much lower than his performance is that PECOTA is punishing him too much for the time he missed two years ago with an injury that hasn't seemed to affect him so far this year.  Another possibility is that catchers as a group tend to wear down over the course of the year, so they should be expected to be ahead of their yearly averages at the All-Star break.

I doubt Mauer will wind up finishing the year hitting .390-something, although it would be one of the most exciting stories in baseball should he make a run at hitting .400 on the season.  But Mauer's performance has been so much better than expected that I think it's reasonable to adjust his future performance expectation upwards.  Mauer's 90th percentile forecast has him with a .330 batting average, and if I had to guess something for the second half of the season, I'd go with Mauer hitting .330 after the ASB.

The Rest

If you take Mauer and RonDL out of the pool, then the reduced chi square value for the rest of the team is 1.02.  So other than those two guys, as a group, the rest of the team is hitting so close to their expectations that their actual performance is indistinguishable from expected statistical noise around the predictions.
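
Pulling players out of the pool is easy to sketch: compute each hitter's contribution to the chi square, then average over whoever is left.  This uses the same binomial-variance assumption as a typical chi square calculation for hit totals, and the hitter names and numbers below are invented placeholders, not actual Twins stats.

```python
def chi_square_terms(players):
    """Map each name to its (actual - expected)^2 / variance term.

    players: dict of name -> (actual_hits, at_bats, projected_avg).
    Assumes binomial noise: variance in hits = AB * p * (1 - p).
    """
    terms = {}
    for name, (hits, ab, p) in players.items():
        expected = ab * p
        variance = ab * p * (1 - p)
        terms[name] = (hits - expected) ** 2 / variance
    return terms


def reduced_chi_square_excluding(players, excluded=()):
    """Reduced chi square over every hitter whose name is not in `excluded`."""
    terms = chi_square_terms(players)
    kept = [term for name, term in terms.items() if name not in excluded]
    return sum(kept) / len(kept)


# Invented placeholder data, chosen so the arithmetic is easy to follow.
sample = {
    "Hitter A": (55, 100, 0.5),  # 5 hits above expectation
    "Hitter B": (50, 100, 0.5),  # exactly on expectation
    "Hitter C": (35, 100, 0.5),  # 15 hits below expectation
}
print(reduced_chi_square_excluding(sample))                           # whole pool
print(reduced_chi_square_excluding(sample, excluded={"Hitter C"}))    # outlier removed
```

In the toy data, one hitter far from his projection dominates the total, and dropping him brings the statistic way down, which is the same pattern as taking Mauer and RonDL out here.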

Rondell's worse than we thought he would be going into the season, Mauer's better (amazingly) than we thought he would be going into the season, and the rest of the team is essentially as expected.  The Twins are currently 10th in the AL in runs scored, and that's right about where we should have expected them to be given their roster choices.