The Statistical Case Against Cabrera for M.V.P.



eddhead
November 14th, 2012, 01:31 PM
Knowing that there are at least a few baseball fans on this site who do not think a lot of sabermetrics, I hesitate to post this. Truth be told, I was leaning toward Cabrera for MVP prior to reading this, but Silver builds a solid case for Trout, and he won me over.

This argument seems much more rounded than what we usually get from sabermetricians in that it includes baserunning skills, defense, and clutch hitting.

I am interested in hearing what other members think.


--------------------------------------------------------------------------------------------------------------------------------

November 14, 2012, 9:13 am
The Statistical Case Against Cabrera for M.V.P.
By NATE SILVER (http://wirednewyork.com/author/nate-silver/)

On Thursday, the American League will announce the recipient of its Most Valuable Player award. The winner is likely to be Miguel Cabrera, the Detroit Tigers star who won the league's triple crown by leading in batting average (.330), home runs (44) and runs batted in (139).

It might seem as if these statistics make Cabrera, the first triple crown winner in either league since 1967, a shoo-in for the M.V.P. But most statistically minded fans would prefer that it go to another player, Mike Trout of the Los Angeles Angels.

The argument on Trout's behalf isn't all that complicated: he provided the greater overall contribution to his team. Trout was a much better defensive player than Cabrera, and a much better base runner. And if Cabrera was the superior hitter, it wasn't by nearly as much as the triple crown statistics might suggest.

It is an argument enabled by the improved ability to measure different elements of the game - defense, base running, and situational hitting - that were once weak points of statistical analysis.

Take Trout's base running, for example. The "Moneyball" paradigm is sometimes associated with de-emphasizing the value of the stolen base. In large part, this is because being caught stealing hurts a team about twice as much as a successful stolen base attempt helps it. Thus, a player who steals 20 bases, but who was caught stealing 10 times, provides little added benefit to his club.

But this wasn't a problem for Trout, who was successful on 49 of 54 stolen base attempts, one of the highest percentages ever for a player who attempted to steal so many times.
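The rule of thumb in the two paragraphs above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the "about twice as much" weight is exactly 2.0 and measuring value in steal-equivalents rather than runs; the function names are illustrative and not taken from any statistics package.

```python
# Assumed weight: one caught stealing cancels about two successful steals,
# per the article's "about twice as much" rule of thumb.
CAUGHT_STEALING_PENALTY = 2.0


def net_steal_value(steals: int, caught: int) -> float:
    """Net value of a base stealer, in units of one successful steal."""
    return steals - CAUGHT_STEALING_PENALTY * caught


def success_rate(steals: int, caught: int) -> float:
    """Fraction of steal attempts that succeeded."""
    return steals / (steals + caught)


# The article's hypothetical runner: 20 steals, caught 10 times.
hypothetical_runner = net_steal_value(20, 10)

# Trout in 2012: successful on 49 of 54 attempts, so caught 5 times.
trout_net = net_steal_value(49, 54 - 49)
trout_rate = success_rate(49, 54 - 49)

print(hypothetical_runner, trout_net, round(trout_rate, 3))
```

Under this weighting the break-even success rate is two out of three, which is why the 20-for-30 runner nets out to nothing while Trout comes out well ahead.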

Trout, with his speed, aggressiveness and good judgment on the bases, was also able to help the Angels in other ways, such as by scoring more often from second base when one of his teammates got a base hit. With the more detailed data available on everything that happens on the field, it is now possible to quantify these contributions as well.

Over all, Trout contributed about 12 additional runs on the basepaths (http://www.fangraphs.com/statss.aspx?playerid=10155&position=OF) when compared with an average runner. The bulky Cabrera, by contrast, cost the Tigers about three runs on the bases.

Trout is also the much better defensive player. Major League Baseball now records in detail exactly where each batted ball is hit. The best systems for measuring defense rely on this physical evidence, rather than pure statistical inference, in order to see whether a player makes more or fewer plays than his peers at the same position.

One of these systems, Ultimate Zone Rating (http://en.wikipedia.org/wiki/Ultimate_zone_rating), estimates that Trout saved the Angels 11 runs with his defense in the outfield. Cabrera, a clumsy defender at third base who is more naturally suited to play first base, cost the Tigers 10 runs with his.

Between his defense and his base running, therefore, Trout was about 35 runs more valuable to the Angels than Cabrera was to the Tigers. By contrast, the 14 additional home runs that Cabrera hit (44 against Trout's 30) were worth about 22 extra runs for the Tigers, based on measures that convert players' contributions to a common scale.
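The tally behind "about 35 runs" can be reproduced from the figures quoted above. This is a sketch that uses only the article's rounded run estimates; the dictionary layout and variable names are illustrative assumptions.

```python
# Run estimates quoted in the article, relative to an average player.
trout = {"base_running": 12, "defense": 11}
cabrera = {"base_running": -3, "defense": -10}


def total_runs(components: dict) -> int:
    """Sum a player's run contributions across the listed components."""
    return sum(components.values())


# Trout's edge from defense and base running combined.
non_batting_gap = total_runs(trout) - total_runs(cabrera)

# Cabrera's home run edge and the article's estimate of its run value.
extra_home_runs = 44 - 30
home_run_edge_runs = 22
runs_per_extra_home_run = home_run_edge_runs / extra_home_runs

print(non_batting_gap, extra_home_runs, round(runs_per_extra_home_run, 2))
```

The rounded inputs give a gap of 36, in line with the article's "about 35". The comparison covers only these components, not the rest of either player's batting line.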

Didn't Cabrera also hit for a higher batting average? Yes, but barely: he hit .330 against Trout's .326. And Trout had the slight edge in on-base percentage, .399 to .393.

Trout also made his offensive contributions in a more difficult ballpark for hitters. Detroit's Comerica Park once had a reputation as a pitcher's haven, but that has not really been true since the Tigers moved the fences in before the 2005 season.

Angel Stadium of Anaheim, instead, is more pitcher-friendly, measuring 387 feet to the left-field power alley, one of the deepest distances in the majors.

Although there are statistical formulas to adjust for these "park effects," it is now also possible to measure the impact of ballpark dimensions through a visual inspection of the data.

Of the 159 home runs hit at Comerica Park this season, for example, about 20 or 25 were not hit deep enough to leave the field at Angel Stadium, according to ESPN's Home Run Tracker (http://hittrackeronline.com/detail2.php?id=2012_4586&type=ballpark). Another 15 or 20 would have been borderline cases.

Angel Stadium is shallower in straight center field, making up for much of the difference, but since most of Cabrera's home runs came to the power alleys (http://hittrackeronline.com/detail.php?id=2012_1549&type=hitter), playing in Anaheim would likely have hurt his statistics on balance. Trout, who hits to all fields, is less sensitive to his ballpark, and had slightly better overall numbers than Cabrera in road games.

What about Cabrera's superior R.B.I. total? Isn't that evidence that he helped his team when it had the most on the line?

In general, the consensus among statistical analysts is that the best hitters in the clutch are simply the best hitters over all. With the possible exception of a few outlying cases (http://www.cbc.ca/gfx/images/sports/photos/2012/10/12/rodriguez-alex_940-8col.jpg), most players' statistics in clutch situations are similar to their overall batting statistics over the long run.

Even if one believes this, however - that there is little predictive power in clutch hitting statistics - one could nevertheless form a coherent argument that they deserve consideration in the retrospective evaluation of players, such as in determining who had the more valuable season. A grand slam still counts more than a solo home run.

Cabrera, in fact, was a very good clutch hitter in 2012, hitting .356 with nine home runs and 89 R.B.I.'s with runners in scoring position. Trout, by contrast, had 53 R.B.I.'s with runners in scoring position.

But much of the difference simply reflects the fact that Cabrera hits third in the batting order, and had more opportunities to hit with runners on base. His 89 R.B.I.'s with runners in scoring position came in 205 plate appearances, a rate of 0.43 R.B.I.'s per opportunity. Trout's 53 R.B.I.'s came in just 135 opportunities, since he is the Angels' leadoff hitter. That yields a similar rate of production: 0.39 R.B.I.'s per plate appearance with runners in scoring position.
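The per-opportunity rates above are simple division, and the same numbers show how much of the raw R.B.I. gap comes from opportunity alone. A minimal sketch using only the figures in the paragraph; the helper name and the "equal chances" projection are illustrative assumptions, not part of the article's method.

```python
def rbi_per_opportunity(rbi: int, plate_appearances: int) -> float:
    """R.B.I. per plate appearance with runners in scoring position."""
    return rbi / plate_appearances


cabrera_rate = rbi_per_opportunity(89, 205)
trout_rate = rbi_per_opportunity(53, 135)

# Hypothetical: Trout's rate applied to Cabrera's 205 opportunities.
trout_with_cabrera_chances = trout_rate * 205

print(round(cabrera_rate, 2), round(trout_rate, 2), round(trout_with_cabrera_chances))
```

At his own rate, Trout would project to roughly 80 R.B.I. in Cabrera's 205 chances, so most of the 36-R.B.I. gap disappears once opportunities are equalized.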

Furthermore, leading off the inning, as Trout frequently did, represents a sort of clutch situation of its own. Advanced statistics have validated the conventional wisdom that getting the leadoff hitter on base greatly increases a team's chance of success: a plate appearance to lead off the inning is more than twice as important (http://www.insidethebook.com/li.shtml) as one with two outs but nobody on base.

Trout was very good when leading off the inning, hitting .339 with a .398 on-base percentage. (He also stole 16 bases with nobody out.) Cabrera hit .301 with a .342 on-base percentage in leadoff situations. That counteracts much of the advantage from Cabrera's superior performance with runners on base.

In fact, there are now systems, like Win Probability Added, that measure all aspects of clutch performance in a comprehensive way. They account not just for the number of runners on base and the number of outs, but also the game score and the inning. A grand slam when a team trails by three runs with two outs in the bottom of the ninth turns a near-certain loss into a win, giving a player maximal credit by this system. A grand slam when a team already leads 7-0 gets little credit, since the game is already in hand.
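The idea behind Win Probability Added can be sketched as the change in a team's chance of winning caused by a single play. The "before" probabilities below are hypothetical placeholders chosen for illustration, not values from any published win-expectancy table; only the direction of the comparison comes from the article.

```python
def win_probability_added(before: float, after: float) -> float:
    """Credit for one play: win probability after it minus before it."""
    return after - before


# Hypothetical: trailing by three, two outs, bottom of the ninth.
# A grand slam ends the game, so the "after" probability is 1.0.
walk_off_slam = win_probability_added(before=0.05, after=1.00)

# Hypothetical: already leading 7-0, so the game is nearly decided.
blowout_slam = win_probability_added(before=0.99, after=0.995)

print(round(walk_off_slam, 3), round(blowout_slam, 3))
```

The same four runs earn very different credit depending on the game state, which is the point of the measure.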

According to this measure, Trout was actually slightly more valuable than Cabrera (http://www.baseball-reference.com/leaders/wpa_bat_top_ten.shtml) as an offensive player, considering the timing of his contributions. Add in his defense and base running, and it isn't all that close a call.

It may seem hard to argue against a player who won the triple crown. But Cabrera's numbers, while worthy of an M.V.P. award in many seasons, weren't historically great. His batting average, R.B.I. and home run totals would also have qualified for the American League's triple crown in 2008. Before that, however, you would have to go back to 1972 to find a year in which his numbers were good enough to lead the league in all three categories.

There is also the fact that Cabrera's Tigers made the playoffs, while Trout's Angels did not. But the Angels won more games (89) than the Tigers (88), missing the playoffs because they played in a harder division. Trout, moreover, began the year in the minors; the Angels went 81-58 in games in which he participated, equivalent to their winning 94 games over a full season.
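The "94 games" figure is the Angels' winning percentage with Trout scaled to a full schedule. A short sketch, assuming the standard 162-game season:

```python
SEASON_GAMES = 162  # standard major league schedule

wins_with_trout, losses_with_trout = 81, 58
games_with_trout = wins_with_trout + losses_with_trout

# Winning percentage in games Trout played, projected over a full season.
win_pct = wins_with_trout / games_with_trout
full_season_pace = win_pct * SEASON_GAMES

print(games_with_trout, round(win_pct, 3), round(full_season_pace))
```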

Still, the real progress in the statistical analysis of baseball is in the ability to evaluate the contributions that a player makes on the field in a more reliable and comprehensive way.

Perhaps 10 or 20 years ago, when evaluations of base running, defense and clutch hitting were murkier, stat geeks would have argued that Cabrera deserved the M.V.P. on the basis of the hard evidence.

Now that some of the "intangibles" have become measurable, we know that Trout did more of the little things to help his team win.

It's the traditionalists who are using statistics in a way that misses the forest for the trees.

ZippyTheChimp
November 14th, 2012, 05:09 PM
It all depends on what you consider valuable to a team.

Down the stretch, Trout fell off somewhat, the Angels didn't make the playoffs and went home.

If anyone watched the games, the Tigers seemed out of gas late in the season. Cabrera carried the team on his back down the stretch, drove in 28 runs, hit 11 HR. He did it under the media pressure of the first possible triple crown winner in decades.

Trout had a great season, but to me, what separates a great athlete from the numbers is doing it when the pressure is on, when it matters most.

That's an MVP.

eddhead
November 14th, 2012, 09:18 PM
I do have a different definition of value than you do. To me, value is defined by who would command the highest price in a competitive free market based solely on the year he had. This is kind of like the Karl Marx definition of "Value Added" which is akin to contribution in a way.

Either way, you can make a good case for either one of them, but what I like about Silver's analysis is that he includes components not often taken into account when making sabermetrics arguments; defense, clutch hitting, and base running. It is a combination of new and old school thinking and I guess for that reason appeals to me.

In fairness, Trout did do a lot when the pressure was on. He hit well in the clutch for most of the year. I think you have to consider the entire body of work of both players, not just what happened over a small period of time.

And it is unquestionably true that Cabrera was a bit of a liability in the field who hit well, but in a hitter-friendly park. On the other hand, Trout hit in a park that favors pitchers and was a much better defensive player and base runner.

Still Cabrera had a great year and will probably win the award on the basis of winning the triple crown.

ZippyTheChimp
November 14th, 2012, 10:02 PM
I know we're never going to agree, but I find it strange that instead of using pure stats, you say it should be based on "who would demand the highest price in a competitive free market."

How would we do that, given that out-of-whack contracts seem to be the norm? It's an outside reality thought experiment.

Also strange is suddenly regarding Comerica as hitter friendly, especially for a power hitter who led the league in HR. Comparative dimensions:

            LF    LCF    CF    RCF    RF
Detroit:   345    370   420    365   330
Anaheim:   330    387   400    370   330

Seems like a wash to me.

And I'm not considering Cabrera only "during a small period of time." I wouldn't have considered him for MVP if he had a mediocre year, and just got hot for a month. C'mon, he had a great year.
His performance down the stretch, where he became the feared player in the lineup tips the scales for me. I'm considering the stats of both players, plus that something extra that defines a great player.

Baseball is a very long season, and there's opportunity to build huge numbers in any segment of it, fall off, and still have a good season overall. Adrian Gonzalez in 2011 comes to mind.

In June and July, Trout hit .372 and .392. In August and September .284 and .289.
Cabrera had a so-so start, hitting .298 the first month, but the last four were .311, .344, .357, .333.

I know on paper all the games count the same, but they really don't. Look at the Texas Rangers.

eddhead
November 15th, 2012, 09:04 AM
The reason you are confused is I oversimplified the definition of value. I don't really want to get too caught up in this because I am not really sure how relevant it is to the debate but essentially to me a ballplayer is a commodity on the market place, no different in certain respects than a car. A car has a marketplace value that is marked by its price. The more desirable the car, the greater its value. The greater its value the higher the price.

Similarly, ballplayers are commodities in the labor market. To understand my definition of value, you have to picture a fictional market wherein players' salaries (= price for their services) are based solely on their current performance. Factors such as age, fan appeal, irrational bidding, injuries, etc. do not enter into the equation. It is a given that all other factors, except performance, are equal for all players.

What I am trying to do here is isolate performance as the primary measure of value in order to determine the 'Most' Valuable player based on performance alone, using marketplace theory as the means to measure. It is really an economics model, no different than other models that make assumptions that don't exist in the real world in order to analyze socio-economic systems. Capitalism (in which 5 assumptions are made that don't necessarily exist in the real world) is one such model; Socialism is another.

But I am off the point. In order to identify the Most Valuable Player, I ask myself, "all other things being equal, whose performance last year would yield the most 'value' and thus the highest price in an open and free market?" Again, please understand, this market does not exist in the real world.

The rub is that each person will define value in a different way. To some it is based on statistical analysis, to others intangibles come into play. That is fine. But at the end of the day the market as a collective defines the value of a commodity.

Now you know why I did not want to go into detail last night. I realize some people find this a bit convoluted, but it works for me.

Even after taking this view, I was of the opinion that Cabrera was the league MVP before reading Silver's blog. But after reading it, I now think that, based solely on last year's performance, with all else being equal, I would have to give the nod to Trout.


As to Anaheim and Comerica, I think the big difference is the foul area, which is larger in Anaheim and provides hitters with more swings because foul balls that are out of play in Comerica are playable in Anaheim.

Cabrera had excellent stats both at home and on the road, but his home stats were definitely better than his road stats. HR's were 28 vs 16, RBI's were 78 vs 64, and OPS was 1.095 vs .913. This is one illustration that Comerica is an easier park to hit in, at least for him, and I would presume others as well.

Trout, on the other hand, hit better on the road, .332 vs .318, but had a slightly higher OPS and 2 additional HR's at home. Best case, Anaheim is agnostic as a hitter's park.

Then there is the matter of defense and base running.

None of this takes away from the year either one had. Both had great years, and both are deserving. But if I were voting, I would have to go with Trout. It kind of reminds me in a way of the year Williams hit .406 but lost the MVP voting to DiMaggio (except in that case I think Williams should have won it).

ZippyTheChimp
November 15th, 2012, 11:09 AM
The reason you are confused is I oversimplified the definition of value. I don't really want to get too caught up in this because I am not really sure how relevant it is to the debate but essentially to me a ballplayer is a commodity on the market place, no different in certain respects than a car.

When I once said that Sabermetrics would ruin baseball, it was misinterpreted that I was referring to an expanded statistical base. But the above is what I meant - statistics as an end in itself.

Statistics are linear, but the way players perform, and the way we watch them, isn't linear.

Statistics would put equal weight on a free throw missed in the first two minutes; it has consequence in the final score. But in the real world, it's not the same as missing one in the final minute of a tie game. A player can lead the league in field goal percentage and ppg, but if he's shooting 30% in the last 8 minutes of a game, I don't think he's an MVP.

Statistics work for things like BA, HR totals, wins, strikeouts, ERA. But categories like MVP and Cy Young are vague, and I think better for baseball. Would a thread be opened on the batting title?


But I am off the point. In order to identify the Most Valuable Player, I ask myself, "all other things being equal, whose performance last year would yield the most 'value' and thus the highest price in an open and free market?" Again, please understand, this market does not exist in the real world.

But we watch sports in the real world.


But at the end of the day the market as a collective defines the value of a commodity.

The market you are basing this on, as you said above, doesn't exist. The real market is influenced by the number of buyers, their resources and needs, the total talent pool, and events like the recent salary dump by the Marlins. Unless we can wait to let a theoretical "pure market of player value" determine an MVP, it has no place in this discussion.


As to Anaheim and Comerica, I think the big difference is the foul area, which is larger in Anaheim and provides hitters with more swings because foul balls that are out of play in Comerica are playable in Anaheim.

I think you may be confusing Anaheim with Oakland. I don't think there's much difference, but to your point, Comerica has the larger foul territory.

Comparison:
http://www.andrewclem.com/Baseball/ComericaPark.html
http://www.andrewclem.com/Baseball/AnaheimStadium.html


Cabrera had excellent stats both at home and on the road, but his home stats were definitely better than his road stats. HR's were 28 vs 16, RBI's were 78 vs 64, and OPS was 1.095 vs .913. This is one illustration that Comerica is an easier park to hit in, at least for him, and I would presume others as well.

Well, if you want some stats:
In almost all parameters of Clutch Hitting tracked by Baseball Reference (especially those with the game on the line), Cabrera outperforms Trout. Here are just three:

Two outs, RISP
Trout: 286/435/347
Cabrera: 420/491/720

Late and Close
Trout: 277/338/446
Cabrera: 337/422/618

Tie Game
Trout: 330/390/573
Cabrera: 343/403/578

For a power hitter who led the league in HR to hit .420 with RISP and two out is a remarkable stat. It defines his year - the toughest out in baseball.

eddhead
November 15th, 2012, 11:56 AM
As I posted, the creation of a fictional market to evaluate player value is a model, nothing more. It is just a tool that helps me isolate player performance for a given year and use that as a means of identifying an MVP. It is a given that the market would have the characteristics of the type of 'pure' market that Adam Smith envisioned but that doesn't exist in the real world. For instance, all participants would have complete knowledge of the commodity and the marketplace dynamics, all participants have unencumbered access to the market, etc. The fact that this market does not exist in the real world did not prevent Adam Smith from using it to develop his theory on capitalism, and it is not going to prevent eddhead from using it to evaluate MVP candidates. ;)

As to the clutch hitting stats, please again look at Silver's stats in this category:



Furthermore, leading off the inning, as Trout frequently did, represents a sort of clutch situation of its own. Advanced statistics have validated the conventional wisdom that getting the leadoff hitter on base greatly increases a team's chance of success: a plate appearance to lead off the inning is more than twice as important (http://www.insidethebook.com/li.shtml) as one with two outs but nobody on base.

Trout was very good when leading off the inning, hitting .339 with a .398 on-base percentage. (He also stole 16 bases with nobody out.) Cabrera hit .301 with a .342 on-base percentage in leadoff situations. That counteracts much of the advantage from Cabrera's superior performance with runners on base.

In other words, in order to hit well with RISP, someone has to actually be in scoring position. Trout did an excellent job of getting himself into scoring position.

I think you can make a good case for either player, and you make an excellent one for Cabrera. Two days ago, I would have agreed with you, but Silver's argument won me over.

ZippyTheChimp
November 15th, 2012, 01:34 PM
As I posted, the creation of a fictional market to evaluate player value is a model, nothing more.

So how is it tested and validated?


It is just a tool that helps me isolate player performance for a given year and use that as a means of identifying an MVP. It is a given that the market would have the characteristics of the type of 'pure' market that Adam Smith envisioned but that doesn't exist in the real world.

Adam Smith had a theoretical model of economics, to explain economics. Not to evaluate something like baseball performance.


For instance, all participants would have complete knowledge of the commodity and the marketplace dynamics, all participants have unencumbered access to the market, etc. The fact that this market does not exist in the real world did not prevent Adam Smith from using it to develop his theory on capitalism, and it is not going to prevent eddhead from using it to evaluate MVP candidates. ;)

If you're going to declare Trout the MVP based on an economic model, then I have to see the results of that model in the market place before I can accept it.

The significant statement from Silver is:
That counteracts much of the advantage from Cabrera's superior performance with runners on base.

Well, it didn't work out that way, did it.

You had to have watched games and seen the way things unfolded to get a clear picture, rather than a season ending analysis. It's the same as my basketball example. Cabrera took a struggling team on his back, and willed them to the desired result - the postseason. Trout was not quite the same dominating presence on the Angels, and he took them to the golf course. You can't argue woulda. Maybe it's not fair to include team success in the equation, but Cabrera had the opportunity, and got it done. For two players with similar seasons, it's a better tie-breaker than a mythical market value in the following year.

And if you are going to take a statistical view, Trout did not start playing until the end of April. So for 16% of the season, he added zero value to the team. That's significant. But I reject that for the same reasons I've already explained.

Anyway, Adam Smith and baseball aren't good together. It makes both boring.

eddhead
November 15th, 2012, 05:09 PM
Model may have been too strong a word. I was trying to illustrate a thought process that isolates a specific attribute - player performance - by imagining it is the only criterion driving value in a free market. Smith's thought process was similar. As for Smith making baseball boring, you could say the same of Bill James and maybe even Nate Silver.

Come to think of it, you may have a point. ;)

I think the 16% argument is really quite significant. To me the bottom line is that it takes a certain number of wins to get into the playoffs. I realize this is counter-intuitive but to me it doesn't really make all that much of a difference when you get those wins. To illustrate, I recall a season in the 80's when I think Yogi was managing, and the Yankees got off to a horrible start. They played fairly well afterwards, but never well enough to make up the ground they lost early on. The important games for them were the first 30 or so because that is what put them in such a hole.

The other thing I don't quite like about your argument is that it at least hints at the notion that only players from playoff teams are qualified to win the MVP. I know of others who share that view, but it doesn't sit right with me.

ZippyTheChimp
November 15th, 2012, 06:34 PM
To me the bottom line is that it takes a certain number of wins to get into the playoffs. I realize this is counter-intuitive but to me it doesn't really make all that much of a difference when you get those wins.

That seems more like the argument I would make, but from a strictly statistical point of view, Trout had zero value for 16% of the season. If it's going to be argued that "this" is twice as important as "that," the total absence from the scene for a month is important; you're not adding anything. Again, this line of reasoning is not one of my primary arguments, because it is too rigidly statistical. But looking at it from a non-statistical viewpoint...

[Warning: shameless self-promotion by Zippy follows]

In another thread, you, Hbcat and I were discussing the batting title race. I thought there was a higher probability that Trout would fall off because he was in uncharted territory late in the season, when most players hit a wall. That's exactly what happened. So what would have happened if you added a month of playing time to Trout. It's not something you can easily quantify, but it's there.

That's why I reject the title of the thread; there's more to it than statistics.


The other thing I don't quite like about your argument is that it at least hints at the notion that only players from playoff teams are qualified to win the MVP.

I only mentioned it as a counter to the economic model thing, which I don't want to talk about anymore.

In a year where two players are close, I think it matters at least as much as statistical advantages. Not just making the playoffs, but the environment the player has to perform in. Although I'm discounting the Triple Crown as a deciding factor, there was a media circus around Cabrera down the stretch. While it was less than a big deal to me, I'm sure it was something he wanted to do.

eddhead
November 16th, 2012, 08:44 AM
I knew I would wear you out on the modelling thing.

Well, regardless of my leanings, I do think Cabrera will win the award (it is shocking that the BBWAA doesn't consult with me), and if he does it won't be a scandal. He had a great year and, I would add, up until this season was not recognized for being as good a hitter as he really is.

eddhead
November 16th, 2012, 09:16 AM
Cabrera, Posey capture Major League MVP trophies
http://gantdaily.com/2012/11/16/cabrera-posey-capture-major-league-mvp-trophies/

And so this thread comes to an end...

hbcat
November 21st, 2012, 08:55 AM
Just getting caught up. I can't comment on Trout much since all he is to me is stats. I don't recall his performances against the Yankees, and I don't follow the Angels. He's certainly made a big splash.

Cabrera had a fantastic year, and once he locked up the Triple Crown, this award was a cinch. Zippy pointed out that Trout was falling off, and would continue to fall off, when we were discussing the batting title late in the season and that turned out to be the case. We all watch the statistics -- fans, writers, and players -- and yet all but the most die-hard statisticians have to admit that you have to watch the games to be able to gauge a player's total contribution.

It looked like Jeter had a growing shot at that batting title for a couple weeks but then *he* fell off as well -- or rather, his late-season hot streak cooled ever so slightly. I mention this because just going by stats it is pretty clear that Jeter did not have a strong case for an MVP, but we who watched this season know he had one of his best ever years, and is always there in the post-season -- which of course doesn't count for MVP voting but it does demonstrate how much such a player contributes when it matters most.

So, for most of us, the view is skewed toward players from the one or two teams we follow closely. I am sure LA fans can recall many exciting Trout moments from this season, and could argue in great detail for his case. The rest of us are left with a few anecdotes from a game or two we might have seen against the Tigers or the Angels, but most of our info is coming from sports writers and stats.

Cabrera won this one fair and square. Trout seems to be off to an elite and exciting career. Who would be surprised if he doesn't haul in an MVP season or two over the next 5 or 10 years?

Who would be surprised, following Jeter, if he didn't do the same next year? ;)