FanPost

fWAR, rWAR and Free Agent Pitchers

A couple of weeks ago in one of his Saturday Notebooks, Andrew discussed how Fangraphs and Baseball Reference have very different opinions about Mike Pelfrey's value during the past season. To quote his entry in its entirety:

Twins pitcher (or perhaps former Twins pitcher) Mike Pelfrey allows us to do an exercise in the differing calculations of WAR. Twins Daily's Parker Hageman tweeted that FanGraphs' WAR calculation had Pelfrey at 2.1 WAR in 2013, while Baseball Reference listed it at -0.3. Based on what we saw last season, I think most of us would say that Pelfrey looked more like a -0.3 WAR pitcher than a 2.1 WAR pitcher, but we need to understand the inputs used for these two websites' WAR statistic. FanGraphs uses a pitcher's FIP, which is an ERA predictor based entirely on strikeouts, walks, home runs, and now infield flies (the things pitchers can control) while Baseball Reference sticks with runs allowed per 9 innings (RA/9). Simply looking at Pelfrey's ERA and FIP, the differences in WAR appear much clearer. Pelfrey's RA/9 was 5.42, leading to his bWAR (Baseball Reference WAR) of -0.3. However, his FIP was a league average 3.99, creating a much better looking 2.1 fWAR (FanGraphs WAR). Despite the positive look by FanGraphs and the fact that it's one of my favorite baseball websites, I cannot agree that Pelfrey was worth over 2 wins last season, and I think all of you could agree with me. While I do understand that FIP is used because it measures what the pitcher can control, just simply from watching Pelfrey pitch I think we can all agree that his bWAR seems like a better representation of how he pitched last year. Oh, and if you're letting Pelfrey's glacial speed on the mound influence your opinion of him like I am, his pace (time between pitches) of 24.6 was a full 3 seconds longer than his career average. I don't know, I might have to look at the Twins coaching staff rather than Pelfrey himself to explain that one.

Now, I agree with Andrew in his opinion of Pelfrey's value. However, bringing this up got me curious about the other pitchers in the league. In particular, I was curious about the free agent pitchers that are available this offseason. Do any of them have major splits between the two WAR metrics? And if so, what does that difference mean?

In order to explain the differences between the two systems, it is important to know the components that go into each calculation. Fangraphs and Baseball-Reference both have good explanations on their websites, and I strongly encourage everyone to read these overviews. I'll do my best to sum up the big ideas, but I'm sure I'll miss some of the details.

fWAR

At Fangraphs, WAR (fWAR) is based on a pitcher's Fielding Independent Pitching (FIP). FIP is a function of three values: home run rate, strikeout rate and walk rate. (I know there has been discussion of adding a fourth value, infield popup rate, to the calculation. I'm not sure if that has been incorporated or not.) Those three values are linearly weighted, and then a constant is added to put the result on the same scale as ERA. This means that the league-average FIP and the league-average ERA are the same value. The FIP value is used to determine what a context-neutral run environment is for that pitcher, and from there, the fWAR value for the player. The details of the fWAR calculation are fairly involved, and Fangraphs gives a full explanation in seven parts. The key takeaway, however, is that the underlying metric for determining fWAR is FIP.
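To make the "linearly weighted plus a constant" idea concrete, here is a minimal Python sketch. It uses the commonly published FIP weights (13 for home runs, 3 for walks, -2 for strikeouts, all per inning). Several things here are my assumptions rather than anything from the Fangraphs write-up: the function names, the 3.05 constant (the real constant is recomputed for each season), the decision to leave out hit batters to match the three inputs described above, and the season line, which is made up.

```python
def innings(ip: float) -> float:
    """Convert box-score innings (152.2 means 152 and 2/3) to a true decimal."""
    whole = int(ip)
    outs = round((ip - whole) * 10)  # the digit after the point counts outs
    return whole + outs / 3


def fip(hr: int, bb: int, k: int, ip: float, constant: float = 3.05) -> float:
    """Weighted HR, BB and K per inning, shifted onto the ERA scale.

    The default constant is a placeholder assumption, not an official value.
    """
    return (13 * hr + 3 * bb - 2 * k) / innings(ip) + constant


# Hypothetical season line, not a real pitcher's totals.
print(round(fip(hr=20, bb=50, k=150, ip=180.0), 2))
```

Notice that nothing about hits, errors or runs scored appears anywhere in the inputs, which is the whole point of the stat.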

FIP is a defense-independent metric, insofar as it only measures things that a pitcher has complete control of: walks, strikeouts and home runs. Everything else that happens in a baseball game requires some interaction with the fielders, so FIP ignores it. For all its benefits, FIP is not without limitations. First, it pays no attention to the competition the pitcher faces; results against Boston and Houston count the same. Second, it assumes that all pitchers have essentially the same batted-ball profile. This doesn't always hold. For example, a really bad pitcher with terrible stuff will give up a lot more line drives and hard contact, which may cause his BABIP to be legitimately higher than normal rather than something that should be ignored as unlucky. Finally, it ignores context. For the most part this is good, but in certain cases (say, for pitchers who struggle from the stretch) it masks some of their actual deficiencies.
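Since BABIP comes up repeatedly in the pitcher notes, here is a small sketch of how it is usually computed: hits that stayed in the park divided by balls that were put in play. The function name and the sample counts are hypothetical, chosen only to show the arithmetic.

```python
def babip(hits: int, home_runs: int, at_bats: int,
          strikeouts: int, sac_flies: int = 0) -> float:
    """Batting average on balls in play, from the pitcher's allowed totals."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    return (hits - home_runs) / balls_in_play


# Made-up totals for illustration only.
print(round(babip(hits=180, home_runs=20, at_bats=700, strikeouts=150, sac_flies=5), 3))
```

Strikeouts and home runs are removed from both sides because no fielder has a chance at them, so BABIP covers exactly the events that FIP throws away.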

rWAR

At Baseball-Reference, WAR (rWAR) is based on the runs actually allowed by the pitcher. The basic philosophy is to count all the runs (not just earned runs) the pitcher gives up during the season to create a Runs Allowed per 9 innings stat (RA9). Then it looks at several components to determine what an "average" pitcher would give up in the same situations. It looks at the competition faced, the defensive metrics of the team(s) he pitched for, and the ballparks pitched in to create an RA9 value for this "average" pitcher. The difference between these two values determines the pitcher's runs above or below average, and from there the pitcher's rWAR.
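The comparison step at the heart of that description can be sketched in a few lines of Python. This is only the "pitcher versus average pitcher" arithmetic, not Baseball-Reference's actual calculation, which builds the average RA9 from opponent, defense and park adjustments and then converts runs into wins above replacement. The function name is mine, and the 4.50 figure for the adjusted average is a hypothetical placeholder. Pelfrey's 5.42 RA/9 comes from Andrew's quote and the innings from his 152.2 IP.

```python
def runs_above_average(pitcher_ra9: float, average_ra9: float, innings: float) -> float:
    """Runs saved (positive) or cost (negative) relative to the 'average' pitcher.

    average_ra9 is what an average pitcher is assumed to allow in the same
    parks, against the same opponents, with the same defense behind him.
    """
    return (average_ra9 - pitcher_ra9) * innings / 9


# Pelfrey's RA/9 and innings, with an assumed (not actual) average RA9 of 4.50.
pelfrey_innings = 152 + 2 / 3
print(round(runs_above_average(5.42, 4.50, pelfrey_innings), 1))
```

Every run that crosses the plate lands in pitcher_ra9, so anything that inflates that number, whether it is the pitcher's fault or not, flows straight through to the result.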

There are two major limitations with this metric. First, the initial RA9 value is completely context dependent. It is based on the actual runs allowed during the games, and as such it depends on the dozens of little things that are more or less beyond the pitcher's control when giving up runs: errors, the bullpen letting inherited runners score, the ordering of hits, etc. Second, it depends on calculating the other values (ballpark factors, defense, etc.) correctly. Defensive metrics in particular are an inexact science, so there is definitely the possibility of introducing significant errors by incorporating these metrics into the "average" pitcher calculation.

Pitchers With Large fWAR-rWAR Splits

Looking at some of the prominent names in this offseason's free agent pitcher class, there are six names (in addition to Mike Pelfrey) that had fairly significant fWAR-rWAR splits and have been mentioned as targets for the Twins this offseason. For many of these players, the difference is fairly large and quite stark: one system values the player at replacement level or worse while the other system values that player as league-average.

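| Name | Team | IP | ERA | FIP | ERA - FIP | fWAR | rWAR | WAR gap |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mike Pelfrey | Twins | 152.2 | 5.19 | 3.99 | 1.20 | 2.1 | -0.3 | 2.4 |
| Phil Hughes | Yankees | 145.2 | 5.19 | 4.50 | 0.69 | 1.5 | -0.7 | 2.2 |
| Josh Johnson | Blue Jays | 81.1 | 6.20 | 4.62 | 1.58 | 0.5 | -1.6 | 2.1 |
| Bronson Arroyo | Reds | 202 | 3.79 | 4.49 | -0.70 | 0.8 | 2.5 | 1.7 |
| Dan Haren | Nationals | 169.2 | 4.67 | 4.09 | 0.58 | 1.5 | -0.1 | 1.6 |
| Scott Kazmir | Indians | 158 | 4.04 | 3.51 | 0.53 | 2.5 | 1.1 | 1.4 |
| Ricky Nolasco | Marlins/Dodgers | 199.1 | 3.70 | 3.34 | 0.36 | 3.0 | 1.8 | 1.2 |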
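For anyone who wants to check the two difference columns, here is a short Python sketch that recomputes them from the ERA, FIP, fWAR and rWAR values in the table. The variable and function names are mine. One thing it makes visible is the direction of each split: it prints the signed fWAR minus rWAR value, whereas the table lists the size of the gap.

```python
# (name, ERA, FIP, fWAR, rWAR), copied from the table above
ROWS = [
    ("Mike Pelfrey", 5.19, 3.99, 2.1, -0.3),
    ("Phil Hughes", 5.19, 4.50, 1.5, -0.7),
    ("Josh Johnson", 6.20, 4.62, 0.5, -1.6),
    ("Bronson Arroyo", 3.79, 4.49, 0.8, 2.5),
    ("Dan Haren", 4.67, 4.09, 1.5, -0.1),
    ("Scott Kazmir", 4.04, 3.51, 2.5, 1.1),
    ("Ricky Nolasco", 3.70, 3.34, 3.0, 1.8),
]


def signed_gaps(rows):
    """Return (name, ERA - FIP, fWAR - rWAR) for each pitcher, rounded for display."""
    return [(name, round(era - fip, 2), round(fwar - rwar, 1))
            for name, era, fip, fwar, rwar in rows]


# Largest disagreement first, regardless of which system is more generous.
for name, era_gap, war_gap in sorted(signed_gaps(ROWS), key=lambda g: -abs(g[2])):
    print(f"{name:<15} ERA-FIP {era_gap:+.2f}   fWAR-rWAR {war_gap:+.1f}")
```

Arroyo is the only pitcher whose signed WAR gap comes out negative, meaning he is the one case here where rWAR is the more generous system. He is also the only one whose ERA beat his FIP.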

Phil Hughes (1.5 fWAR vs -0.7 rWAR)

I am leaning toward fWAR in Hughes's case, though not without some reservations. His fielding-independent rate stats didn't drop off much from 2012, but the results were much, much worse. There is certainly a case to be made that this was due to some bad luck. In particular, he had a sharp spike in BABIP (.286 to .324) and a drop in Left On Base % (74% to 69%). Additionally, he has had fairly massive home-road splits, so getting out of Yankee Stadium should help his results. However, it should be noted that his batted-ball profile shifted this past year, as he gave up the most line drives and fly balls of his career, and the fewest infield pop-ups. Hughes has always been a fly-ball pitcher but would compensate by getting a lot of pop-ups, and I'm concerned this may be a change in his ability rather than a one-year fluke.

Josh Johnson (0.5 vs -1.6)

I don't have any strong feelings one way or the other with Johnson. One metric says he was bad, and the other says he was really, really bad. For Johnson, the biggest outliers for this past season were his HR rate (1.66 HR/9 vs career average of 0.67 HR/9) and BABIP (.356). Going forward, his effectiveness will have more to do with his health than anything else.

Bronson Arroyo (0.8 vs 2.5)

Leaning toward fWAR. Arroyo benefited from some good luck: he had a very low BABIP (.267) and a career high in percentage of runners left on base (77.9%). Also, the Reds bullpen was very good at preventing inherited runners from scoring. All of these factors have a high probability of regressing back toward league-average, so treat Arroyo's nice-looking ERA and rWAR with caution.

Dan Haren (1.5 vs -0.1)

Leaning toward rWAR. He faced rather weak competition this season, as much of the NL East (Marlins, Mets, Phillies) was fairly terrible offensively. Still, he gave up a lot of runs throughout the year. There is some evidence that these runs were due to bad pitching rather than bad luck. He had the worst fly ball rate, line drive rate and home run rate of his career; further, those numbers have been trending the wrong way the past few years. Opponents also posted a .760 OPS against him last year.

Scott Kazmir (2.5 vs 1.1)

Split the difference. Kazmir had one of the easiest opponent schedules of all AL pitchers: 10 of his 29 starts were against the Royals and the Twins, and 8 of his other starts were against a combination of the Astros (2), Mariners (2), Phillies, Mets, White Sox and Marlins. Overall, only two pitchers who threw more than 100 IP had a lower opponent rating than Kazmir, and they were his teammates Ubaldo Jimenez and Corey Kluber. By comparison, Chris Tillman of the Orioles started 15 games against the Red Sox (6), Blue Jays (5) and Rays (4).

On the other hand, Kazmir had a very high BABIP (.324), so there is a reasonable argument to be made that he was unlucky throughout the year and potentially gave up more runs than he should have.

Overall, I think that Kazmir's weak competition boosted his FIP stats, which inflated his fWAR beyond what it should be. For anyone interested in adding Kazmir to the Twins' staff, taking out his 5 starts against the Twins degrades all of his FIP stats:


| Stat | 2013 | 2013 w/o Twins |
| --- | --- | --- |
| K/9 | 9.23 | 8.93 |
| BB/9 | 2.68 | 2.83 |
| HR/9 | 1.08 | 1.20 |
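To get a rough sense of what those three shifts add up to, the per-nine rates can be run through the commonly published FIP weights (13 for home runs, 3 for walks, -2 for strikeouts). This is a back-of-the-envelope Python sketch, not Fangraphs' calculation: it ignores hit batters, and because the league constant is the same in both columns it drops out of the comparison and is left off entirely. The function name is my own.

```python
def fip_core(k9: float, bb9: float, hr9: float) -> float:
    """FIP before the league constant is added, built from per-nine rates.

    Assumption: hit batters are ignored, since the table only lists K, BB and HR.
    """
    return (13 * hr9 + 3 * bb9 - 2 * k9) / 9


full_season = fip_core(k9=9.23, bb9=2.68, hr9=1.08)
without_twins = fip_core(k9=8.93, bb9=2.83, hr9=1.20)

# How many runs per nine innings the Twins starts were shaving off his FIP.
print(round(without_twins - full_season, 2))
```

Under those assumptions, removing the Twins starts raises Kazmir's FIP by about 0.29 runs per nine innings, with the home run rate accounting for most of the change because it carries the heaviest weight.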

Ricky Nolasco (3.0 vs 1.8)

I'm leaning toward the fWAR. It appears that the difference in value is based almost entirely on one start toward the end of the season when he gave up 7 runs in 1.1 innings. This game was an epic disaster for Nolasco, but not entirely his fault as there were 2 errors behind him. Anyway, this is a good example of the limitations of the rWAR system, as this one bad outing cut roughly ⅓ of Nolasco's value for the season.
