During the thread on Ramon Ortiz, I ended up doing a small experiment using Tom Tango's Marcel projections for 2007. I'm just going to copy what I wrote in that thread for clarity's sake:
"With all that said, I downloaded Tango's 2007 Marcel projections (http://www.insidethebook.com/ee/index.php/site/comments/marcel_player_forecasts_2007/), split out all the totals for players on the 2007 Twins roster according to ESPN.com (http://sports.espn.go.com/mlb/teams/roster?team=min), and ran the numbers through the 2002 Runs Created formula (http://en.wikipedia.org/wiki/Runs_created). I actually got an astonishing total of 871 runs created, which surprised me no end."
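For anyone who hasn't run this kind of calculation, here is a rough Python sketch of the idea. To be clear about the assumptions: it uses the original, basic Runs Created formula, (H + BB) x TB / (AB + BB), not the more involved 2002 version quoted above, and the team totals are made-up round numbers, not the Marcel figures. The function name is only illustrative.

```python
def basic_runs_created(ab, h, bb, tb):
    """Basic Runs Created: (H + BB) * TB / (AB + BB).

    This is the simple, original form of the formula, NOT the 2002 version.
    """
    return (h + bb) * tb / (ab + bb)


# Hypothetical team totals, for illustration only (not Marcel data).
team = {"ab": 5600, "h": 1600, "bb": 500, "tb": 2400}

rc = basic_runs_created(**team)
print(f"Runs created: {rc:.0f}")
```

Note that the denominator is AB+BB, which is why that total is the natural measure of opportunities in what follows.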
Given that I posted that I thought (and still think) that the Twins are unlikely to exceed last year's offensive total of 801 runs, this difference requires some kind of detailed explanation - either my gut-level presumption is wrong, or the projection is wrong. More important than which is wrong, however, is why the wrong one is wrong. So I set out to try to prove that the projections I made based on Tango's Marcel numbers were off.
I started by trying to come up with explanations as to why the numbers, when aggregated, would be unrealistic.
1. Too many plate appearances
The Marcel projections include everyone who was a Twins regular last year, plus some new additions who weren't regulars last year, such as Jeff Cirillo. My first thought was that perhaps the total number of plate appearances projected for the club was excessive; you can project even a team full of so-called 'replacement players' to a thousand runs created if you give them enough plate appearances to do it in, even though there's no way an actual major league team will reach the number of plate appearances required in a single season.
Marcel projects the 2007 Twins offense to 6412 AB+BB, considerably more than the 2006 total of 6092 AB+BB. That is quite high: the two offenses that scored around 870 runs in 2006 (Cleveland at 870 and the White Sox at 868) collected about 6170 AB+BB (Cleveland actually had 6175 AB+BB, while Chicago had 6159). The best offense in the league - New York's - racked up exactly 6300 AB+BB, and they scored 930 runs. For the Twins to need 6412 AB+BB to reach 871 runs, they would have to be playing more (or much longer) games than other clubs, or their offense would have to be woefully inefficient by comparison.
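One way to put those totals on a common footing is to look at runs per 100 AB+BB. The sketch below uses only the figures quoted in this post; keep in mind that the last row mixes in a runs created estimate where the others are actual runs scored, so the comparison is rough at best.

```python
# (runs, AB+BB) pairs quoted in the text. The last row is the Marcel-based
# projection, so its "runs" are runs created, not runs actually scored.
teams = {
    "2006 Cleveland": (870, 6175),
    "2006 Chicago": (868, 6159),
    "2006 New York": (930, 6300),
    "2006 Minnesota": (801, 6092),
    "2007 Minnesota (Marcel)": (871, 6412),
}


def runs_per_100(runs, ab_bb):
    """Runs per 100 AB+BB, a crude measure of offensive efficiency."""
    return 100.0 * runs / ab_bb


rates = {name: runs_per_100(*totals) for name, totals in teams.items()}

# Print the teams from most to least efficient.
for name, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<26}{rate:6.2f}")
```

By this measure the projected Twins score at a better rate than the 2006 Twins but at a lower rate than Cleveland or Chicago, so the 871 total leans on the extra trips to the plate rather than on efficiency.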
It's hard to accurately revise this number, though. You'd think that better offenses send more men to the plate - looking at the numbers for Boston vs Minnesota would support that conclusion, after all. However, the difference in AB+BB between many teams in the 'middle' isn't really all that great; Minnesota's and Chicago's totals, for example, aren't far apart (the Twins had 6092 AB+BB while Chicago had 6159, a difference of fewer than 80 AB+BB which nevertheless goes along with a difference of 67 runs). In fact, consider the following four 2006 AL teams:
Regardless, the Twins won't send men to the plate nearly 6500 times in 2007; the 1999 Cleveland Indians had just under 6400 AB+BB, and they scored over a thousand runs, and a casual search doesn't turn up anybody who did better than that.
For this reason, we already know that the 871-run prediction is too high. How much too high is open to debate, but normalizing the prediction to 6100 AB+BB lowers the projection to 829 runs.
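The normalization can be sketched as a straight proportional scaling, which assumes the team creates runs at the same rate per AB+BB no matter how many opportunities it gets. That assumption is a simplification, and the function name is just for illustration, but it does reproduce the 829 figure.

```python
def normalize_runs(projected_runs, projected_ab_bb, target_ab_bb):
    """Scale a run projection linearly to a different AB+BB total.

    Assumes runs per AB+BB stay constant as opportunities change.
    """
    return projected_runs * target_ab_bb / projected_ab_bb


# Marcel-based projection: 871 runs created in 6412 AB+BB,
# rescaled to a more realistic 6100 AB+BB.
normalized = normalize_runs(871, 6412, 6100)
print(round(normalized))  # 829
```

The same function can be reused with any other target total if 6100 AB+BB seems too high or too low.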
2. Injuries
Injuries are by their very nature unpredictable, but Marcel does work some presumptions into its projections: Torii Hunter is projected to only 495 AB and 547 PA in 2007, which seems reasonable when you consider the number of nagging and minor injuries Hunter has dealt with since his major injury in 2005. A number of other players, however, are slated for about a full season's worth of PA, including Luis Castillo, Michael Cuddyer, Justin Morneau, and Joe Mauer. While there's no guarantee that any of these players will actually be injured, odds are that at least one of them will miss significant time due to injury (most would point at Castillo as the most likely candidate); if this happens, the run expectation goes down again, though by how much would again be pure speculation.
3. Overprojection of minor players
Marcel admits that its projections aren't perfect, or even likely to be - the Marcel spreadsheets include an index of reliability that in theory ranges from 0 to 1 (least to most), but which tends to run from about .67 (Redmond, Cirillo) to .8 or more (Mauer, Morneau), based largely on playing time both past and expected. There are two players in the Twins Marcel list, however, who have a reliability index of .02 - in other words, Marcel's saying it's just guessing at these two guys:
Alexi Casilla (.285/797 in 203 PA)
Alejandro Machado (.279/791 in 201 PA)
Now, given Casilla's (http://www.thebaseballcube.com/players/C/Alexi-Casilla.shtml) and Machado's (http://www.thebaseballcube.com/players/M/Alejandro-Machado.shtml) minor-league numbers, it's not unreasonable to say they could perform at that level in limited action in 2007. However, keep in mind that Terry Tiffee had good minor league hitting numbers, too (http://www.thebaseballcube.com/players/T/Terry-Tiffee.shtml), and his 239 major-league ABs produced a .226/634 hitting line, so anything is possible. I also tend to discount the odds of Casilla and Machado doing well out of the gate as young Twins, given the organization's general lack of success with young Hispanic hitters - David Ortiz is, of course, the ur-case, but you can go from Guzman and Rivas all the way back to Petey Munoz and not find a single Hispanic hitter that the Twins got young and developed; their successes, like Luis Castillo, were veterans developed in other organizations.
If this is a real organizational weakness and not just a small-sample-size illusion, and it results in Machado and Casilla being over-projected, that will cost a certain number of runs. Dropping Machado to Marty Cordova's 1998 hitting line (.253/710) and Casilla to Luis Rivas's 2002 hitting line (.256/697) lowers the Twins' projected 2007 total to 860 runs, and to 818 after normalizing to 6100 AB+BB as in point 1 above.
(It should also be noted that, given the small sample sizes for Machado's and Casilla's expected playing time, reducing their numbers from promising to disappointing didn't take much: just adding three ABs that end in outs and changing two singles and two home runs to outs makes Machado look like Cordova, and simply changing three homers and three singles to outs while changing another single to a double makes Casilla look much more like Rivas. Keep this in mind when judging the production of a player with fewer than 200 big-league ABs.)
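The batting-average side of that parenthetical can be sketched as follows. The hit and at-bat counts here are hypothetical, picked only so that the starting average rounds to Machado's projected .279; they are not the actual counts from the Marcel spreadsheet, and the sketch ignores the OPS half of the hitting line.

```python
def batting_average(hits, at_bats):
    """Batting average: hits divided by at-bats."""
    return hits / at_bats


# Hypothetical line chosen to round to a .279 average (not Marcel's counts).
hits, at_bats = 51, 183
before = batting_average(hits, at_bats)

# Add three at-bats that end in outs, and turn four hits
# (two singles and two home runs) into outs.
after = batting_average(hits - 4, at_bats + 3)

print(f"{before:.3f} -> {after:.3f}")
```

With a sample this small, seven plate appearances going the other way are enough to move the average by about 25 points.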
4. The Chuck Tanner Effect
Back when Bill James was still writing his Baseball Abstracts, he spent one year's comment on the Pittsburgh Pirates talking about manager Chuck Tanner. Despite Tanner having been the manager of a world champion Pirates team some years earlier, it didn't seem as if James thought Tanner was a very good manager. Noting the disappointing seasons that followed the 1979 World Series season in Pittsburgh, James coined what he termed the Chuck Tanner Theory of Baseball: if everyone has a good year, we'll win.
Though few would think of putting the sunny, positive Tanner in the same boat with the oft-surly Ron Gardenhire, they do both have one thing in common: at least one season in which all their key players had good years. Nothing makes this more obvious than looking at the Marcel projections for the 2007 Twins.
Marcel is said to regress players toward the major league mean, which in 2006 for batting average was .269. If you look at the Twins Marcel projections, however, you see that only two players are projected to hit below the major league mean: Luis Rodriguez at .268, and Jason Kubel at .266, neither of whom is very far below the mean. Even if you adjust and use the 2006 AL mean for batting average of .275, only six of the seventeen Twins offensive players in Marcel are projected to hit below that mean. In other words, when the 2006 Twins were regressed to the mean, most of the offensive players got regressed downward, not upward.
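To make the direction of that regression concrete, here is a generic sketch of regressing a rate toward a league mean. It is not Marcel's actual procedure, which this post doesn't spell out; it simply assumes a reliability number between 0 and 1 acts as a blending weight, and the observed averages are hypothetical.

```python
def regress_to_mean(observed, league_mean, reliability):
    """Blend an observed rate with the league mean.

    reliability = 1 keeps the observed rate; reliability = 0 returns the mean.
    This is a generic shrinkage sketch, not Marcel's actual method.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return reliability * observed + (1.0 - reliability) * league_mean


ML_MEAN_2006 = 0.269  # major league batting average quoted in the text

# Hypothetical observed averages, for illustration only.
for observed in (0.320, 0.240):
    for reliability in (0.80, 0.02):
        projected = regress_to_mean(observed, ML_MEAN_2006, reliability)
        print(f"observed {observed:.3f}, reliability {reliability:.2f} "
              f"-> projected {projected:.3f}")
```

A hitter above the mean gets pulled down and one below it gets pulled up, so a roster that still projects almost entirely at or above the mean must have started out well above it.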
As many as six Twins can be said to have had 'career years' in 2006: Mauer, Morneau, Punto, Bartlett, Cuddyer, and Tyner. Granted, in Bartlett's and Tyner's cases, 'career year' is a bit misleading, as neither has had a career before now, and one could argue that Morneau and even Mauer aren't likely to slip too far from their 2006 totals, and might even surpass them at some future point in their careers.
But if you think all six players are going to repeat or even exceed what they did in 2006, I've got some beachfront property in Omaha I'd like to sell you.
For the most part, Marcel's penchant for regressing players to league means results in conservative estimates; for instance, Mauer is projected to hit .328, which would be good but probably not enough to win a second straight batting title, while nobody on the team is projected to hit 30 homers (as you'd expect, Morneau comes in highest with a projected 28). Nevertheless, that so many players are projected to hit essentially at or above league average only points out how far above league average these same players were in 2006, and, to a contrarian mind, how unlikely it is that all of them will remain above league average in 2007.
After further review, I'm reassured that my seat-of-the-pants projection of below 800 runs for the Twins offense remains reasonable, if not guaranteed accurate.