The David Ortiz Debate

Normally, we would cover all of the modern first basemen at the same time, but occasionally a case deserves more focus. David Ortiz retired in 2016 following a seemingly brilliant career. He had one of the best closing seasons in baseball history, and that should be enough to get him into the Hall of Fame. After all, he has more than 500 home runs and 1700 RBI.

The index has always been meant to be a guide. No statistical formula is meant to solve every puzzle, and if David Ortiz’s candidacy is anything, it is a puzzle. His counting numbers are no more a definitive statement on his fitness than anything else. However, they are a good place to start when looking at what the voters will be considering.

Runs: 1419 (89th)

RBI: 1768 (22nd)

HR: 541 (17th)

2B: 632 (10th)

OPS: .931 (35th)

EBH: 1192 (8th)

Of course, this is only a smattering of the career numbers that people would consider, but all of those numbers say pretty much the same thing: Ortiz was one of the best hitters of the modern era. So, it stands to reason that he would be a shoo-in for the Hall of Fame. Of course he is. The question, though, is whether he should be.

Index

Before we take a look at the index, we should look at some of the peculiarities around it. There is a secret sauce that goes into it. Essentially, what we need to know for players like Ortiz is that bWAR and fWAR have, in a way, built in a penalty for designated hitters. It can be seen more readily in bWAR with its defensive WAR statistic. Win shares builds from zero, so if a player never plays in the field he is worth zero in terms of fielding shares. bWAR takes the replacement-level player overall and builds defensive value from there. Therefore, most first basemen are already below replacement level defensively, and it is that much worse for designated hitters.

In most instances we would trust the index to give us an accurate assessment of a player’s value. In this case (and in Edgar Martinez’s) we are not quite sure. Is it fair to penalize designated hitters? Of course, this is the main reason we use three sources of data for the index. We want to have a consensus. Unfortunately, consensus is not necessarily the same thing as accuracy or precision.

  bWAR fWAR WS/5 Total
Career 55.3 50.7 65.2 171.2
Peak 38.5 36.2 41.2 115.9
Total 93.8 86.9 106.4 287.1

If we were to follow the strict rules of the index, we would immediately eliminate Ortiz from consideration. The problem is that most of the BBWAA holds to those counting numbers. Counting numbers have to be considered in the context they were accrued. Ortiz played in a hitter’s ballpark, in an offensive era, and on one of the better teams in the time period. So, it is hard to discount the numbers above out of hand, but most people would agree with the notion that something doesn’t feel quite right.
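For readers who want to check the arithmetic, here is a minimal sketch of how the table above adds up, assuming the index is simply the career total plus the peak total across the three sources. The dictionary layout and the function name are illustrative, not part of any published formula.

```python
# Ortiz's figures from the table above. The layout is an illustrative assumption.
CAREER = {"bWAR": 55.3, "fWAR": 50.7, "WS/5": 65.2}
PEAK = {"bWAR": 38.5, "fWAR": 36.2, "WS/5": 41.2}


def index_score(career, peak):
    """Return (career total, peak total, index), each rounded to one decimal."""
    career_total = round(sum(career.values()), 1)
    peak_total = round(sum(peak.values()), 1)
    return career_total, peak_total, round(career_total + peak_total, 1)


career_total, peak_total, index = index_score(CAREER, PEAK)
print(career_total, peak_total, index)
```

The three values printed should match the Total column in the table: career, peak, and the combined index.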

  bWAR Rank RC Rank OW% Rank
2003 3.4 * 98 * .698 9
2004 4.3 * 135 4 .711 5
2005 5.2 8 149 2 .752 4
2006 5.8 6 152 1 .759 4
2007 6.4 6 156 2 .800 1
2008 1.7 * 79 * .627 *
2009 0.7 * 86 * .524 *
2010 2.8 * 102 * .671 9
2011 4.0 * 110 * .713 5
2012 3.2 * 84 * .799 *
2013 4.4 * 111 7 .727 4
2014 2.6 * 93 * .657 *
2015 3.1 * 104 10 .677 9
2016 5.2 * 130 5 .756 2

*did not finish in the top ten

It should be noted that Ortiz played in only 90 games in 2012, so he did not qualify to finish in the top ten in either runs created or offensive winning percentage. Had he played a full season, he would likely have had top-five finishes in both of those categories. We obviously see that not playing in the field hurt him in terms of the value numbers we use for the index. That undoubtedly should be the case, but to what degree is anyone’s best guess. Runs created and offensive winning percentage demonstrate he was one of the most valuable offensive players of the period.

Playoff Success

One of the many arguments used to support a player like Ortiz and his greatness is his reputation as a clutch performer. There are any number of ways to measure that, but we begin with the notion itself. Does clutch performance exist? The working theories have varied over time, with the luminaries of the sport vacillating from one extreme to the other. Conventional wisdom says it must exist. When sabermetrics first began, we assumed that performance would level out once the sample size was large enough.

The pendulum has swung the other way as data sources have been able to more clearly define clutch performance. The working theory is that a neutral player would perform just as well in the playoffs as during the regular season. That may or may not be the case. Pitching should be better in the postseason, so you would actually expect hitters to perform a little worse in the postseason than they would in the regular season. Ortiz is known for some pretty big hits in big moments. How did he do overall in the playoffs?

  PA AVG OBP SLG HR Runs RBI SO/BB
ALDS 139 .270 .388 .513 6 16 17 29/23
ALCS 171 .255 .357 .490 8 21 30 38/22
WS 59 .455 .576 .795 3 14 14 5/14
TOT 369 .289 .404 .543 17 51 61 72/59

The usual standard for clutch performers is that they get better when the stakes are higher. Fortunately, the good folks at baseball-reference have broken down performance based on low-, medium-, and high-leverage situations. We don’t need that here. Ortiz was better in the World Series than he was during the league championship series and during the divisional round. The Red Sox won every World Series they participated in. He obviously had a lot to do with that.

The relevance of clutch performance is one of the big bones of contention between those behind WAR and Bill James and his win shares formula. The when of performance is often overlooked in WAR because it assumes everything evens out over time. It focuses on the what. Yet, if there is a positive variance in when players perform at their best, then it could be argued that someone like Ortiz would be more valuable than his index numbers would otherwise indicate.

  PA AVG OBP SLG HR Runs RBI SO/BB
Low 4162 .271 .368 .532 228 443 448 1.41
Medium 3921 .299 .388 .571 212 493 655 1.31
High 1998 .292 .388 .556 101 481 619 1.20

Anyone that has paid attention to baseball knows this is not a normal distribution. Some players seem to get worse as the situations grow more tense. The best you can usually hope for is for performance to remain level as the situations get more tense. Ortiz was far better in clutch situations than he was in low pressure situations. This could be seen in his rates of contact as well. Either way you slice it, Ortiz has to be considered in a different light due to the fact that he was such a clutch performer.
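The postseason table lists SO/BB as raw counts while the leverage table lists a single number, so the two are hard to compare at a glance. The sketch below assumes the leverage column is simply strikeouts divided by walks and converts the postseason counts to the same form. The function name is made up for illustration.

```python
# Postseason SO/BB counts copied from the playoff table above.
POSTSEASON_SO_BB = {"ALDS": "29/23", "ALCS": "38/22", "WS": "5/14", "TOT": "72/59"}


def so_bb_ratio(pair):
    """Turn a 'strikeouts/walks' string into a ratio rounded to two decimals."""
    strikeouts, walks = (int(part) for part in pair.split("/"))
    return round(strikeouts / walks, 2)


postseason_ratios = {rnd: so_bb_ratio(pair) for rnd, pair in POSTSEASON_SO_BB.items()}
for rnd, ratio in postseason_ratios.items():
    print(f"{rnd}: {ratio:.2f}")
```

A lower ratio means fewer strikeouts per walk, which is the direction the leverage table moves as the pressure rises.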

The Steroid Issue 

To make matters more complicated, David Ortiz has been linked to steroid use through MLB’s 2003 survey testing. The New York Times reported in 2009 that he was among the players who tested positive when MLB was going through its non-penalty phase of steroid testing. Of course, no official statement was ever made to that effect, and nothing was ever reported as to what substance he tested positive for, if he tested positive at all.

Ortiz has always denied using and he never tested positive once testing was made official after 2003. What’s more, his performance didn’t really change following 2003 when he joined the Red Sox. We could assume that he used steroids to go from being cut in Minnesota to being an all-star in Boston, but we would then have to suppose that he continued using throughout his career to maintain that level of performance.

This is where the guessing game goes off the tracks. History shows that players discover their game at different times. There are general rules of thumb that we see with most players, but not every player fits the model. Are we to assume that everyone that suddenly discovers more power discovered it based on a needle? We can in some instances when we know what happened. We can when there is no other logical explanation. Ortiz didn’t necessarily get stronger in Boston. He played in a better ballpark for him and he played in a better lineup. Those factors had just as much to do with his power surge as any potential steroid use.

Final Verdict 

The addition of the steroids question makes Ortiz’s candidacy hard to predict. Usually, steroids would destroy that candidacy as they did for the likes of Roger Clemens and Barry Bonds. On the other hand, the rumors came relatively early and he seemingly rebounded well. When in doubt I usually follow the index, but I’m just not sure how relevant the numbers are in this case. I just don’t think either WAR formula treats DHs fairly.

When James designed his win shares formula, he postulated that 300 win shares was the normal dividing line between being in and out of the Hall of Fame. Like our own guidelines, that wasn’t meant to be a hard and fast rule, but Ortiz’s 325 win shares would seem to put him in based on those numbers. Add in the playoff numbers and he would seemingly be put over the top. I’m inclined to put him in at this point, but it isn’t as easy as his counting numbers would make it seem.

On the Outside Looking In: First Basemen

Looking at those on the outside looking in is normally an exercise in identifying value where others have not. So, it shouldn’t be a surprise that the first base class is no exception. However, these players each hit on their own category of bias. It’s almost an exercise of opening up our minds to all different kinds of value.

We could certainly expand our list over a number of eras, but we will focus on the immediate list of players that could theoretically be selected by the new version of the Veterans Committee. The committee has been more destructive than helpful historically, but they have eliminated a large part of their largesse and have minimized their errors. Of course, their process is still less desirable than the BBWAA process.

When you picture a first baseman you normally don’t picture a gap hitter that is slick around the bag, but in this section we will look at two such players. Keith Hernandez and John Olerud challenge our very sensibility in what makes a first baseman valuable. We think of guys that hit 30 home runs or more a season and drive in 100 or more runs. If we haven’t found out by now then this article will hopefully prove that value can come in many different forms. A run saved is as good as a run produced.

The third member of our team had the type of numbers that we normally like to see, but his career was brief. Unfortunately, the BBWAA didn’t quite recognize all of his greatness because they were not clued into the importance of OBP and they were distracted by his prickly nature. Dick Allen labored in the 1960s and 1970s when players were not free to move yet. Instead, he was shipped from team to team when he had worn out his welcome. Since the media votes for the honor, relationships with the media matter. Sadly, many were not dispassionate enough to set aside his nature.

Career Value 

  bWAR fWAR WS/5 Total
Dick Allen 58.7 61.3 68.4 188.4
Keith Hernandez 60.4 59.4 62.2 182.0
John Olerud 58.2 57.3 61.0 176.5

We begin with Allen because he appears on top of our list and we have an immediate issue to address. Allen stands here because he played more games at first base than any other position. However, he did play a number of games at third and it could be argued that he belongs there. More of his valuable seasons came there and you could argue that more of his peak value came there.

How much it matters is debatable. In the index we compare players with players from their own position. Our standard floats from position to position and it is likely that the standard at third won’t be as rigorous as it is at first base. The fact that he clears the bar here indicates that he would likely clear the bar there.

Hernandez and Olerud are clearly first basemen, but they don’t have the kind of counting numbers that most of the voters look for. Yet, they played on very good teams that utilized their skill sets to create wins. As Billy Beane famously said in Moneyball, “are we finding players or selling jeans?” Whether they looked the part or not matters not. As we will see when we break down the numbers, they have a strong case.

Peak Value 

  bWAR fWAR WS/5 Total Index
Dick Allen 54.5 54.9 59.0 168.4 356.8
Keith Hernandez 51.3 50.9 51.0 153.2 335.2
John Olerud 46.8 47.1 46.4 140.3 316.8

As you can see, Allen clears the bar with plenty of distance. Any time you clear the 350 win plateau your place in history is secure. If you stack his numbers up against the likes of Willie McCovey, Tony Perez, and Harmon Killebrew you will see that his name is missing in Cooperstown. All three are a testament to the importance of on base percentage. Stealing first base is the most important skill in the game and it is the one skill that the BBWAA often overlooked 40 and 50 years ago.

If we were into following straight mathematical guidelines we would say Hernandez is in and Olerud is out. While that would be easy, the index was never designed to make quick and easy decisions. There are a number of players I have gone back and forth on, and Olerud is one of those. Often, it takes time for a player like Hernandez or Olerud to be fully appreciated. Both players have considerable defensive reputations, but we should be systematic in our approach. So, we will begin by looking at their offensive numbers.

Offensive Production

  wOBA wRC+ OW% BPO
Dick Allen .400 155 .741 .899
Keith Hernandez .365 131 .668 .786
John Olerud .377 130 .638 .857

We had been using OPS+ and wRC+, but they are largely redundant statistics. Plus, wOBA (weighted on base average) describes their on base skills more. The problem with all of these numbers is that they measure different things. They are also weighted differently. Weighted runs created plus and offensive winning percentage are measured against the league average. wOBA and bases per out are not.

It is pretty clear that Allen is a cut above the other two across the board, but the debate between Hernandez and Olerud may rage on forever. Hernandez played in the 1970s and 1980s, when offensive numbers were depressed, so we would expect his numbers to be a little worse. As we can see, when compared to the league average he is either equal to or superior to Olerud. When we combine this with the fielding data we get a justification for taking one and not the other. This ignores the importance of some of the numbers and what they mean in practical terms.

Offensive winning percentage is an easy number to wrap our heads around. A team of Dick Allens would win 120 games with average pitching and fielding. He isn’t the most valuable player ever, but he might be the best offensive player not in the Hall of Fame. Of course, fielding can’t be ignored, but that’s a staggering number. A team of Hernandezes would win 108 games. That doesn’t even consider his defense, which would make a team slightly above average overall defensively. Olerud would produce 103 wins, and when we think back to the actual list of Hall of Famers we see he would be in line with the bottom of the list.
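Those win totals come straight from multiplying offensive winning percentage by the length of the schedule. Here is a quick sketch, assuming a 162-game season and rounding to the nearest whole win; the function name is just for illustration.

```python
SEASON_GAMES = 162  # assumed modern schedule length

# Offensive winning percentages from the table above.
OW_PCT = {"Dick Allen": 0.741, "Keith Hernandez": 0.668, "John Olerud": 0.638}


def projected_wins(ow_pct, games=SEASON_GAMES):
    """Wins for a lineup of nine such hitters with average pitching and fielding."""
    return round(ow_pct * games)


wins = {name: projected_wins(pct) for name, pct in OW_PCT.items()}
for name, total in wins.items():
    print(f"{name}: {total} wins")
```

The same function works for any other player in the tables: plug in the OW% and read off the win total.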

Simply put, I love bases per out. Outs are the blood currency of baseball. If you look at the majority of strategy development in the game, it is in the development of strategies to either get more outs or prevent outs. We’ve seen the sacrifice bunt go by the wayside. We’ve seen extreme shifting on the defensive end to garner more outs. Teams have slowly caught on to the benefit of having players that get on base more often and accrue more value per out. The only group that moves slower than teams is the fans. We are often stuck in the batting average and home runs paradigm. Walks matter. They matter a great deal and each of these three demonstrates that in spades.

Ignoring the players from the Live Ball Era, we can see that these three players belong simply by looking at these two numbers. In many cases, the players in the Hall of Fame embody both the power numbers and the production that we look for. Still, if someone gives you the production then who cares how they do it?

  OW% BPO
Eddie Murray .624 .787
Willie McCovey .718 .900
Harmon Killebrew .706 .899
Tony Perez .619 .737
Dick Allen .741 .899
Keith Hernandez .668 .786
John Olerud .638 .857

The more I look at Tony Perez, the worse he looks. Numbers take on more meaning when we have a larger frame of reference. In other words, when we can compare players to a standard that is accepted as good enough, then we have a better idea whether new candidates are good enough. Clearly, Dick Allen should have been in all along. It is fair to point out that all four players in Cooperstown enjoyed longer peaks and longer careers. Still, when someone is as good if not better, it is a glaring omission.

Hernandez and Olerud are not as good as some but are better than both Murray and Perez. Granted, those players did enjoy lengthy careers, but placing these players together gives us an inkling that both might belong. Naturally, we haven’t even mentioned defense and this is where Hernandez supposedly shines.

Hernandez Fielding Contemporaries

  dWAR DWS TZ TZGG WSGG GG
Keith Hernandez 1.3 34.9 117 8 1 11
Eddie Murray -11.6 36.9 61 1 6 3
Steve Garvey -11.6 37.9 0 0 7 4
Don Mattingly -6.2 29.0 33 1 3 9
Pete O’Brien -1.3 29.6 69 1 5 0
Chris Chambliss -7.9 29.1 31 1 1 1
Wally Joyner -5.1 28.9 52 0 2 0
Tony Perez -6.6 24.2 13 1 0 0
Kent Hrbek -7.7 28.8 16 1 1 0

When someone has the reputation of a Keith Hernandez we have to first compare him with his contemporaries to see if that reputation is warranted. Here we see a hodgepodge of numbers that all mean something different, but we see Hernandez on top of the heap for most of them. There are a few notable exceptions that we should look at. First, we have win shares. Win shares build from zero rather than from replacement level, so no one has negative value in win shares. So, the players that play the most have the most value. Steve Garvey and Eddie Murray enjoyed much longer careers, so their place on top is more a testament to their durability than their greatness.

The win share Gold Gloves are relevant (more relevant than the actual Gold Gloves) but they are largely tied to durability and whether a player played on a good team or not. Interestingly enough, while Pete O’Brien did not play long, he got short-changed by the Rawlings company. He was a shade better than Don Mattingly at the same time, but Mattingly won nine Gold Gloves. Life isn’t fair.

Whether Hernandez is the best defensive first baseman in history as some claim remains to be seen. The numbers above demonstrate that Hernandez was objectively the best of the time period and definitely deserves a spot in Cooperstown based on the combination of his fielding and hitting. Since Allen and Olerud played in different eras we have to check in and see how they stack up.

Defensive Production

  dWAR TZ DWS FG
Dick Allen -16.3 -110 26.8 -109.0
Keith Hernandez 1.3 117 34.9 119.0
John Olerud -1.4 103 39.1 98.8

Dick Allen was a butcher by all accepted standards of fielding. That might be one reason why he is not in the Hall of Fame. The thing is that fielding’s value is relative depending on the position. First base is just not that important compared to the other positions on the diamond. That leaves us Hernandez and Olerud. Both were very good fielders and it enhances both of their overall values when it comes to Cooperstown.

Fangraphs (the last column) had numerous iterations of fielding numbers before they adopted UZR. They ended up being very similar to total zone runs (which they also kept). These numbers say the same thing as the other categories, but we include them because we include fWAR and we want to make sure Fangraphs is represented on the table.

My first inclination is to put all three in the Hall of Fame. Allen and Hernandez definitely deserve their spot while Olerud is more debatable. Still, he was a very good all-around player for a number of years. He also has a prominent place on the best regular season team of all time. I’m not sure how much extra credit that affords him, but if it breaks a tie then so be it.

First Base: The Juicers

Nothing has caused more problems for the Hall of Fame than the specter of steroids and other performance enhancing drugs. It creates a legalistic, moral, and problematic statistical argument all at the same time. So, navigating through these requires that we do some compartmentalizing. Mark McGwire and Rafael Palmeiro have both reentered the news in the past year for ludicrous reasons. Palmeiro made some noise about returning to the game nearly 15 years after he left. McGwire commented that he thought he could hit 70 home runs without the help of steroids.

Both developments were likely motivated by the desire to reboot their Hall of Fame chances. In Palmeiro’s case, the clock would start over and he could potentially enter the ballot fresh at a time when voters were more sympathetic. McGwire’s comments were more transparent and pathetic in a way. All that being said, that likely muddies the water. If we follow that down the rabbit hole we will come away more confused. The case for confirmed users is three-fold. First, you consider their numbers on their own without adjustment as if they were normal. Secondly, you consider the historical account of their use and try to find a reasonable adjustment based on those facts. Finally, you consider the moral implications.

The Raw Numbers

The beauty of the index is that it allows us to distill out the noise and focus on the facts. Granted, the addition of use creates quagmires because many of the statistical arguments are colored by our feelings about the use. This is particularly true of McGwire. His detractors often look at batting average, a low hits total, and few contributions outside of the batter’s box. This is true but is largely irrelevant when looking at the overall value. Overall value according to the index considers all of those factors.

So, an argument against him based on a lack of speed, defensive value, or ability to hit to all fields or in small ball situations is just a mask for the disdain of the cheating. He wouldn’t be the first guy in the Hall of Fame to have significant weaknesses in his game. He wouldn’t be the last one either.

Palmeiro was more well-rounded, so his problem is that the perception of the drug use clouds more of his career as a whole. This has little to do with the facts of use and more to do with the nature of his numbers. Palmeiro never had a signature season, so his signature was his consistency. Even without the drugs, a vote for Palmeiro would be a vote for 3000 hits and 500 home runs and not necessarily any specified greatness.

Career Value 

  bWAR fWAR WS/5 Total
Mark McGwire 62.2 66.3 68.4 196.9
Rafael Palmeiro 71.9 70.0 68.4 210.3

Joe Morgan released a letter on behalf of the living Hall of Famers in opposition to any known user getting into the Hall of Fame. I suppose we shouldn’t be shocked by his black and white views on the subject and we should give him some benefit of the doubt. After all, he was representing not only himself, but those that feel they have a stake in the debate. However, while my views have changed some over the years, they have always been pragmatic in nature and it is why I use this approach.

A player that is a borderline Hall of Famer can easily be dismissed if he has the stain of PEDs, but someone that is clearly superior cannot be dismissed so easily and really shouldn’t be. So, were Palmeiro and McGwire clearly superior? We won’t know for sure until we look at the peak value, but based on the numbers above we have to say they appear to be pretty solid Hall of Famers based on the numbers alone.

Peak Value 

  bWAR fWAR WS/5 Total
Mark McGwire 46.7 48.5 48.8 142.0
Rafael Palmeiro 49.6 50.7 39.4 139.7

Before we consider drugs we must look at the numbers because there are interesting things going on here. The index changed some from the book version. The book called on us to find the ten best seasons in their career where here we are looking at the best consecutive ten seasons. That seems like a battle of semantics, but it had an impact on McGwire because of his frequent injuries. He lost a couple of productive seasons with this model. That might seem unfair, but it probably correctly categorizes his career. Attendance matters and these numbers reflect his problems with staying healthy.
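The difference between the two peak definitions is easier to see with numbers. The sketch below uses an invented career (these are not McGwire’s actual seasons) with two injury-shortened years in the middle, and compares the book’s “best ten seasons” with the “best consecutive ten seasons” used here. Function names are illustrative.

```python
# Hypothetical season-by-season values, invented for illustration only.
HYPOTHETICAL_SEASONS = [5.0, 3.0, 4.0, 5.5, 3.5, 6.0, 0.5, 0.2, 6.5, 7.0, 7.5, 5.0, 4.5, 1.0]


def best_n_seasons(values, n=10):
    """Book version: sum of the n best seasons, wherever they fall."""
    return round(sum(sorted(values, reverse=True)[:n]), 1)


def best_consecutive_n(values, n=10):
    """Current version: best total over any run of n consecutive seasons."""
    if len(values) <= n:
        return round(sum(values), 1)
    windows = (sum(values[i:i + n]) for i in range(len(values) - n + 1))
    return round(max(windows), 1)


book_peak = best_n_seasons(HYPOTHETICAL_SEASONS)
consecutive_peak = best_consecutive_n(HYPOTHETICAL_SEASONS)
print(book_peak, consecutive_peak)
```

The consecutive version can never exceed the book version, and the gap grows when lost seasons sit in the middle of a player’s prime, since any ten-year window has to swallow them.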

As for Palmeiro, his numbers reflect a tremendous consistency and durability, but also demonstrate a noticeable lack of greatness. The win shares deficit can be explained through methodology. WAR is based on runs while win shares are based on wins. The Rangers and Orioles were decent clubs, but they never had the success of the Athletics or Cardinals during McGwire’s career. Is it fair to penalize Palmeiro for the shortcomings of his teammates? That’s certainly one way to look at it. Another would be to simply demonstrate that he was never good enough to carry a team to anything.

The Historical Account

McGwire and Palmeiro are clearly Hall of Famers according to the basic numbers, but the next question comes in how much the drug use helped them. This is a two-fold discussion. The first asks us to determine when use started. This can be difficult when the players and witnesses aren’t being particularly helpful. It often involves looking at their numbers and offering our best guess.

Palmeiro famously wagged his finger at Congress and said he had never ever used steroids. He tested positive later that same season. He still has never admitted to long-term use, but the numbers would seem to say otherwise. The Cubs traded Palmeiro because they also had Mark Grace. They felt Grace would develop more power. The two were similar players when Palmeiro left and the two brought similar skills to the table when you remove power. So, a simple look at Grace’s career value would give us some clue.

  bWAR fWAR WS/5 Total
Mark Grace 46.4 45.5 58.8 150.7
Rafael Palmeiro 71.9 70.0 68.4 210.3

Grace was a heck of a player. He brought good defensive value to the table and was a productive hitter for more than a decade. However, his numbers don’t quite add up and that would have likely been the same fate for Palmeiro. Palmeiro had enough of a defensive reputation to win a Gold Glove at first when he played more than 100 games that year at DH. Still, it is pretty clear he wouldn’t have been Hall of Fame worthy without the power.

McGwire’s case is a lot murkier. His recent comments were obviously aimed at making the case that much more debatable. If we assume that he began his heavy use in the 1990s then we have enough seasons before that to demonstrate that he did have the potential to hit 40 plus home runs without the drugs. In his case, the drugs served two purposes. Yes, they made him stronger, but they also helped him stay healthy. The best evidence of that comes when he broke down so quickly following the use.

Anyone that watched him during that 1998 season had to know something wasn’t kosher. In batting practice he would hit home runs to parts of the ballpark no one could touch in a million years. So, he could claim that he would have hit 70 home runs without the PEDs, but that just doesn’t make a whole lot of sense given what we saw.

The Moral Argument

I have a few major problems with the Joe Morgan (I realize it is more than him) view of PEDs. The most obvious problem is the obvious hypocrisy from at least a few of those players. Players from Morgan’s era popped “greenies” like they were candy. So, hearing from at least some of those folks is like the guy smoking a pack of cigarettes looking down on the drinker. Granted, I’m not going to pretend to know who in that group partook of the greenies, but the point is that it is highly unlikely that none of them did.

Then, you get the idea that one could go into a GNC or other major supplement store and buy many of these substances. One could also buy legal supplements that have many of the same effects (although in different proportion). Add in the surgical advancements of the age (Tommy John surgery) and it is hard to look at the modern game in the same light as the past.

All of this is to say that we have to take each player on an individual basis. Why did they begin using? When did they begin using? What exactly did they use? Some players wanted to recover from injury. Yes, that absolutely affects their numbers (thus the need for the second part of the test) but they didn’t necessarily turn into Paul Bunyan. In other cases they may have been attempting to prolong an already Hall of Fame caliber career.

All of this is to say that I think the drug use excludes McGwire and Palmeiro, but it doesn’t do it for moral reasons. We simply cannot say that they would have been worthy without the use. However, other players we will cover along our journey could have a different result. Specifically to these two, their use started early in their career and was designed to boost performance in addition to health.

First Base: The What Abouts

It’s often difficult to give every candidate for the Hall of Fame their just due when considering them at the same time. Most fans are stuck on one or two and are not ready to hear anything about the rest until we address those few. These are the guys I call the “what abouts”. In other words, “what about Gil Hodges?” or “what about Fred McGriff?” Those two tend to dominate the conversation, so we will focus on them this time around.

The focus on these two is based on similar considerations that we can generally place under the guise of counting numbers. Fans are fixated on home runs, runs scored, RBI, or the number of hits a player collected. This isn’t to say that those numbers are completely useless, but there is usually something else going on behind the scenes that skews our perception of those players. Hodges might be the most egregious offender. Naturally, this isn’t his fault. He didn’t ask for this and passed on long before this debate raged on. He just happens to be Ground Zero for the debate between a strict adherence to numbers and the happenstance of history.

As everyone knows, Hodges was a beloved part of one of the most beloved teams in baseball history. The Boys of Summer Dodgers won 92 or more games nine times between 1946 and 1956. Those seasons came in the days when teams played 154 games. In a 162-game schedule they might have eclipsed the 92-win mark one or two more times. However, there are a few facts that elude us this many years later. First, they won 100 or more games only once. Granted, the 154-game schedule had an effect on that fact. Secondly, they won only one World Series title during those years. Naturally, many will point to the three titles they won between 1959 and 1965, but he was hardly a big part of those teams.
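To put the schedule difference in concrete terms, here is a rough sketch that pro-rates win totals between a 154-game and a 162-game season. Straight linear pro-rating is an assumption made for illustration; it ignores everything else that changed between eras.

```python
import math


def prorated_wins(wins, games_played=154, target_games=162):
    """Scale a win total to a different schedule length at the same winning pace."""
    return wins * target_games / games_played


def equivalent_threshold(target_wins=92, games_played=154, target_games=162):
    """Smallest whole win total in the shorter schedule that matches the target pace."""
    return math.ceil(target_wins * games_played / target_games)


print(round(prorated_wins(92), 1))   # a 92-win pace stretched over 162 games
print(equivalent_threshold())        # wins needed in 154 games to match 92 in 162
```

Under that assumption, any 154-game season at or above the printed threshold would count as a 92-win pace in a modern schedule, which is how a team could pick up an extra qualifying season or two.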

The fact people hold onto is the seven consecutive 100 RBI seasons Hodges had during that first stretch. That means he was the most consistent force that team had during that stretch. That would be true if RBI were the most important statistic in baseball, but we are more sophisticated these days. To illustrate, I’m going to do something I lovingly call the Player A and B test. I’ve plucked another player that played for a similar dynasty and compared him with Hodges using traditional numbers.

  AVG OBP SLG OPS HR Runs RBI
Player A .273 .359 .487 .846 370 1105 1274
Player B .271 .344 .471 .815 339 1009 1271

Obviously, there are a number of caveats here, but that’s kind of the point. We don’t know when these two players played and we don’t know where they played. We just know that they played for great teams that experienced a ton of success. However, that success is always relative. Player B joined his team for six of the seasons in question and drove in 100 or more runs in five of those six seasons. He drove in 91 runs in that other season, so we could argue that he had a similar statistical impact on those teams.
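The Player A and B test is simple enough to sketch in a few lines: strip the names, keep the stat lines, and look at the gaps. The numbers below are the career lines from the table above; the helper names are made up for illustration.

```python
# Anonymous career lines from the table above.
PLAYER_A = {"AVG": 0.273, "OBP": 0.359, "SLG": 0.487, "HR": 370, "Runs": 1105, "RBI": 1274}
PLAYER_B = {"AVG": 0.271, "OBP": 0.344, "SLG": 0.471, "HR": 339, "Runs": 1009, "RBI": 1271}


def ops(line):
    """On-base plus slugging, rounded to three decimals."""
    return round(line["OBP"] + line["SLG"], 3)


def stat_gaps(a, b):
    """Difference (A minus B) for every stat the two lines share."""
    return {stat: round(a[stat] - b[stat], 3) for stat in a if stat in b}


gaps = stat_gaps(PLAYER_A, PLAYER_B)
print(ops(PLAYER_A), ops(PLAYER_B))
print(gaps)
```

Nothing here adjusts for era or ballpark, which is exactly the caveat: the raw gaps are small, and the names are doing more work than the numbers.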

How did those teams do in those six seasons? They won the World Series four times in six seasons including three times in a row. They won 100 or more games only once like the Dodgers, but that season was a doozy. They won 114 games and set the modern record that would go down only a few seasons later. Hodges’ defenders will claim that he had much more to do with his team’s success in those seasons than the other player. Each player played in a similar number of World Series during their respective spans. The other player has divisional round and league championship experience, but we will only include World Series play to keep it fair.

  AVG OBP SLG OPS HR Runs RBI
Player A .267 .349 .412 .761 5 15 21
Player B .268 .355 .390 .745 3 11 14

If playoff performance is a tiebreaker then we have to give Player A the slight edge. He has better playoff numbers and better regular season numbers. However, no one can deny how close the numbers are before we consider the effects of time and space. I don’t want to get accused of cherry-picking, so here are the overall playoff numbers for Player A and Player B. Player B comes out looking far worse, but it could also be argued that he performed better under the brighter lights and was more instrumental in his teams winning four rings.

  AVG OBP SLG OPS HR Runs RBI
Player A .267 .349 .412 .761 5 15 21
Player B .233 .321 .351 .672 9 44 38

No one would argue that Player B was a superior player and that really isn’t the point of the Player A/B test. We remove names and find someone that might be more similar than we might have previously thought. Player A is Hodges and Player B is Tino Martinez of the Yankees. No one in their right mind would trumpet Martinez’s cause for the Hall of Fame, but that raises the question of why so many do so for Hodges. Martinez was almost as good for a team that experienced far more success. Fortunately for Hodges, he benefits when we apply our usual adjustments for time and place. Yet, he is still a great example of how a happenstance of history can vault someone up the ladder when their performance doesn’t necessarily warrant it. As usual, the index will ultimately solve this case.

  bWAR fWAR WS/5 Total
Career 44.9 42.1 52.6 139.6
Peak 42.1 39.4 45.0 126.5
Total 87.0 81.5 97.6 266.1

These results alone should put the issue to bed, but some Hodges defenders will also bring up his place as the manager of the Miracle Mets in 1969. The argument is somewhat compelling on a certain level. You take someone that would never make the Hall of Fame as a hitter or as a manager but combined could be said to have contributed enough to the history of the game. If he doesn’t die prematurely maybe he would have been good enough as a manager. Maybe if he had two or three more prime years he might have made it as a player. Unfortunately, that kind of sentiment can be attached to any number of guys.

Fred McGriff has similar statistical issues, but his shortcomings are wholly different. His career value ends up being fairly similar to guys on the bottom of the Hall of Fame list. He is very comparable with Tony Perez, Hank Greenberg, and George Sisler. He is also very comparable to other guys on the outside looking in that get more support. So, some are overwhelmed by 493 home runs and more than 1500 RBI. He is so close to 500. If he had 500 or more home runs he likely would be in. That kind of thinking looks more and more faulty the more we break it down. Do seven home runs really generate that much value? Does an additional 100 RBI over a twenty-year career create that much more value?

I can fully appreciate how that argument might extend to the index, but there is a very subtle difference. Especially when we get to peak value, the numbers do describe something else entirely. McGriff’s fairly strong career value makes him a borderline candidate but consider his case even if we compare him with Hodges above.

  bWAR fWAR WS/5 Total
Career 52.6 56.9 65.1 174.6
Peak 38.6 43.2 46.0 127.8
Total 91.2 100.1 111.1 302.4
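For readers who want to check the arithmetic, here is a minimal sketch of how the totals in the two index tables add up. The numbers are copied from the tables above; the dictionary layout and the helper name `index_total` are ours and are only one way to organize the calculation.

```python
# Career and peak values in the order bWAR, fWAR, WS/5 (win shares divided by five),
# copied from the Hodges and McGriff tables.
index_rows = {
    "Gil Hodges": {"career": (44.9, 42.1, 52.6), "peak": (42.1, 39.4, 45.0)},
    "Fred McGriff": {"career": (52.6, 56.9, 65.1), "peak": (38.6, 43.2, 46.0)},
}

def index_total(row):
    """Sum the three career values and the three peak values into one index score."""
    return round(sum(row["career"]) + sum(row["peak"]), 1)

for name, row in index_rows.items():
    career = round(sum(row["career"]), 1)
    peak = round(sum(row["peak"]), 1)
    print(f"{name}: career {career}, peak {peak}, index {index_total(row)}")
```

Laid out this way, the gap between the two is almost entirely career value. Their peak totals are within a couple of points of each other.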

Getting to the 300-win barrier is fairly significant when we compare him historically to every Hall of Famer. Unfortunately, we aren’t comparing him to every Hall of Famer. We are comparing him with first basemen in the Hall of Fame. In that environment he comes up a tad short. Simply put, he was good for a very long time, but he was never really great. Even Hodges had a nearly identical peak value (126.5 to 127.8) and he wasn’t even good for a solid decade. The BBWAA writers themselves weighed in on this issue while the two were playing. Following are their finishes in the MVP races.

  MVP Top 5 Top 10 Top 30
Gil Hodges 0 0 3 6
Fred McGriff 0 1 5 2

We should be careful not to read too much into awards voting. This doesn’t necessarily mean that they were treated fairly by the writers, but it does indicate where those writers viewed these players at the time. Nine and eight seasons registering on the ballot is impressive on some level, but neither player had a signature season we could hang our hat on. At the end of the day, that is often the difference between glory and coming close.

First Base Hitting

When people think of first basemen they think of hitters first and for good reason. You could argue that they are the best hitters overall in baseball. Certainly, outfielders will have a huge say in that, but most teams have a better hitter at first base than at any other position. As we saw with the catchers, there are different metrics that we use. Some of them compare with the league average, so they can be used to compare players over time. Others exist on their own, so it is more difficult to compare players across eras.

We will be adding base running runs from Fangraphs this time around. While it may not mean a lot in the grand scheme of things, it does help us differentiate between OPS+ and wRC+. wRC+ includes base running, so it ends up being a bit lower in most cases. However, there are a few notable exceptions. It may not make a huge difference, but when the overall picture is close, every little thing helps.

  OPS+ wRC+ OW% BPO BsR
Lou Gehrig 179 173 .803 1.205 -27.2
Jimmie Foxx 163 158 .781 1.125 -18.9
Jeff Bagwell 149 149 .722 .975 6.5
Frank Thomas 156 154 .732 1.033 -38.2
Eddie Murray 129 127 .624 .787 3.4
Jim Thome 147 145 .713 1.020 -35.2
Willie McCovey 147 145 .718 .900 -0.4
Harmon Killebrew 143 142 .706 .899 -0.1
Hank Greenberg 158 154 .765 1.086 -2.3
Tony Perez 122 121 .619 .737 -0.7
George Sisler 125 123 .672 .756 19.4

We could break this down in any number of ways, so we will look at the metrics themselves. Four of the five are normalized by comparing the player with the average player. The lone holdout is a statistic called bases per out (BPO). Even though it isn’t normed, it is an incredibly valuable statistic. Sure, the top three guys all played in the Live Ball Era, but you can still compare players from the same period.

Tony Perez didn’t play at exactly the same time as Willie McCovey and Harmon Killebrew, but they were contemporaries. You can see that both were vastly superior to Perez. It’s to the point where you have to question his place in the Hall of Fame. Couple that with his offensive winning percentage and it’s enough to raise the question again. Offensive winning percentage assumes all eight position players produced the same as he did and the team gave up an average number of runs.

The three Live Ball Era first sackers would win well over 120 games if all the hitters produced like they did. A team of Perezes would win 100 games on the nose. Even a team with that record is usually one of the favorites to win the World Series. Obviously, all eleven guys produce numbers that would make a team full of those guys favorites to win the World Series.
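The win totals above come from multiplying offensive winning percentage by the length of the schedule. Here is a minimal sketch of that conversion. The 162-game season is our assumption, and the function name `projected_wins` is only for illustration.

```python
def projected_wins(ow_pct, games=162):
    """Wins for a lineup of identical hitters with average pitching and defense."""
    return ow_pct * games

# Offensive winning percentages taken from the table above
ow = {
    "Lou Gehrig": 0.803,
    "Jimmie Foxx": 0.781,
    "Hank Greenberg": 0.765,
    "Tony Perez": 0.619,
}

for name, pct in ow.items():
    print(f"{name}: {projected_wins(pct):.1f} wins")
```

Run against the table, Gehrig, Foxx, and Greenberg all clear 120 wins while Perez lands right at 100.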

These numbers tend to make a lot more sense when we place the players in groups from the same era. Even though most of the numbers are normed, it is still hard to compare players from the Live Ball Era with the players that played in the 1960s and 1970s. The current era might be comparable to the Live Ball Era, but there is still some separation. We are taking the fielding out of it, so these rankings are not complete, but many would swear by them.

Live Ball Era 

  OPS+ wRC+ OW% BPO BsR
Lou Gehrig 179 173 .803 1.205 -27.2
Jimmie Foxx 163 158 .781 1.125 -18.9
Hank Greenberg 158 154 .765 1.086 -2.3

These numbers demonstrate how good Hank Greenberg really was. Give him back the three full seasons he lost to World War II and his career numbers would have been almost as good as those of Foxx and Gehrig. When you look at the rate statistics you can see that he was just a cut beneath Foxx. Gehrig is in a league of his own. Even when we throw Albert Pujols into the equation, Gehrig still winds up better than the rest.

Expansion Era 

  OPS+ wRC+ OW% BPO BsR
Willie McCovey 147 145 .718 .900 -0.4
Harmon Killebrew 143 142 .706 .899 -0.1
Eddie Murray 129 127 .624 .787 3.4
Tony Perez 122 121 .619 .737 -0.7

It’s always interesting to see how similar players are from various eras. You could argue that McCovey and Killebrew are practically the same player based on this profile. Both Murray and Perez enjoyed longer careers, so you could argue that their rate numbers were suppressed by playing longer. Murray played in over 3000 games (around 400 more than McCovey and nearly 600 more than Killebrew). Perez really only played in about 200 more games than McCovey and 300 more than Killebrew. So, it is harder to make that excuse for him.

Both Murray and Perez drove in more runs than Killebrew and McCovey in that span of time and that is why they are in the Hall of Fame. Of course, in addition to the increased number of games, there is no accounting for the number of opportunities those players had. However, ignoring that we could look at the number of games they played and then take the number of runs and RBI on a per game basis.

  Games Runs RBI TRP TRP/G
Willie McCovey 2588 1229 1555 2784 1.076
Harmon Killebrew 2435 1283 1584 2867 1.177
Eddie Murray 3026 1627 1917 3544 1.171
Tony Perez 2777 1272 1652 2924 1.053
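The table above is simple to reproduce. Here is a minimal sketch that adds runs and RBI into total run production (TRP) and divides by games played. The data comes straight from the table; the function name `trp_per_game` is ours.

```python
# (games, runs, RBI) for each player, copied from the table above
players = {
    "Willie McCovey": (2588, 1229, 1555),
    "Harmon Killebrew": (2435, 1283, 1584),
    "Eddie Murray": (3026, 1627, 1917),
    "Tony Perez": (2777, 1272, 1652),
}

def trp_per_game(games, runs, rbi):
    """Total run production (runs + RBI) divided by games played."""
    return (runs + rbi) / games

for name, (games, runs, rbi) in players.items():
    print(f"{name}: TRP {runs + rbi}, TRP/G {trp_per_game(games, runs, rbi):.3f}")
```

Keep in mind this is still a raw rate. It says nothing about the run environment or the hitters batting around each player.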

It’s hard to take these numbers at face value. They are raw numbers that don’t account for the hitting environment or the quality of the teams around them. With the exception of Willie Mays, there were no Hall of Famers that surrounded either McCovey or Killebrew. Perez had Johnny Bench, Joe Morgan, and Pete Rose. Murray had Cal Ripken Jr. on his team for most of his time with the Orioles. Even still, this is just another spot where Perez comes out a bit behind his colleagues.

Steroid Era 

  OPS+ wRC+ OW% BPO BsR
Frank Thomas 156 154 .732 1.033 -38.2
Jeff Bagwell 149 149 .722 .975 6.5
Jim Thome 147 145 .713 1.020 -35.2

Again, we notice how similar these players are when we break them down by era. Bagwell’s base-running keeps him in the conversation. He also spent a good portion of his career in the Astrodome where power was suppressed. However, you could put a blanket over all of these guys offensively. As we saw in the last article, the fielding numbers were not necessarily this close.

Daily fantasy baseball is taking the world by storm. While total points (and total points per game) are not as scientific as the sabermetric numbers, the game is a lot of fun to play, and looking at the historical numbers can be fun too. The total points formula changes depending on the source, so we came up with our own.

Total Points = TB + Runs + RBI + SB + BB + HBP - SO - CS - GIDP

  Games TP TP/G
Lou Gehrig 2164 9661 4.46
Jimmie Foxx 2317 8715 3.76
Jeff Bagwell 2150 7005 3.26
Frank Thomas 2322 7801 3.36
Eddie Murray 3026 8510 2.81
Jim Thome 2543 6982 2.75
Willie McCovey 2588 6626 2.56
Harmon Killebrew 2435 6628 2.72
Hank Greenberg 1394 5436 3.90
Tony Perez 2777 6262 2.25
George Sisler 2055 6726 3.27
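For anyone who wants to run the total points formula on their own favorite player, here is a minimal sketch. The stat line below is hypothetical and is not any real player’s season. The function and argument names are ours.

```python
def total_points(tb, runs, rbi, sb, bb, hbp, so, cs, gidp):
    """Total Points = TB + Runs + RBI + SB + BB + HBP - SO - CS - GIDP."""
    return tb + runs + rbi + sb + bb + hbp - so - cs - gidp

# Hypothetical season line, made up purely for illustration
season = dict(tb=300, runs=100, rbi=110, sb=10, bb=80, hbp=5, so=90, cs=3, gidp=12)
games = 150

tp = total_points(**season)
print(tp)                    # 500
print(round(tp / games, 2))  # 3.33
```

The per-game figure is just total points divided by games played, which is how the TP/G column above is built.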

These numbers confirm what we already saw from the sabermetric numbers, but they reveal them in another way. Gehrig comes out on top in total points and total points per game. Perez comes out lacking again. Naturally, someone has to be the worst of the group, so it doesn’t mean he doesn’t belong yet, but it will be interesting to see once we start comparing him with the guys on the outside looking in.

First Base Fielding

When we look at fielding at any position we must differentiate between terms like greatness and value. Even the most discerning fans often interchange those terms as if they mean the same thing. Greatness is an esoteric term that often gets debated at sports bars and on television shows on the MLB Network and ESPN. We can look at a number like the index because all three platforms use like terms. Unfortunately, the fielding numbers compare players with the replacement level player and the average player. Putting those together is problematic.

Even the same platform can treat players differently depending on the metric being used. For instance, total zone appears on multiple platforms, but is primarily used by Baseball-Reference. It compares players to the average fielder at their position. It makes it easy to understand with zero being average. Their dWAR statistic is intriguing to say the least. It compares players with the replacement level performer, but it doesn’t compare them with the replacement level first baseman. It compares them with an overall replacement level player and since first base is the least valuable defensive position, most first basemen are automatically worse than a replacement level performer at another position.

So, we treat the data like ordinal data. The idea is to see if the various sources agree on who the best fielder is and who the worst fielder is. So, we cannot combine the numbers and get any real idea from them. We just show them all to see if the various sources agree on the value of the fielder.

  Innings TZ DWS DWAR DWS/1000
Lou Gehrig 18831 2 33.0 -8.9 1.75
Jimmie Foxx 16775 21 35.6 -5.5 2.12
Jeff Bagwell 18523 46 32.4 -7.9 1.75
Frank Thomas 8383 -68 6.5 -23.4 0.78
Eddie Murray 21151 -3 36.9 -12.8 1.74
Jim Thome 9539 -24 15.1 -17.2 1.58
Willie McCovey 16718 -65 23.6 -21.8 1.41
Harmon Killebrew 7810 -6 14.0 -18.8 1.79
Hank Greenberg 9999 21 20.5 -4.3 2.05
Tony Perez 14366 16 24.2 -6.9 1.68
George Sisler 17441 6 23.8 -7.6 1.36
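The DWS/1000 column is simply defensive win shares scaled to 1000 innings at first base. Here is a minimal sketch of that rate calculation using three rows from the table. The function name `dws_per_1000` is ours.

```python
def dws_per_1000(dws, innings):
    """Defensive win shares per 1000 innings played at the position."""
    return dws / innings * 1000

# (defensive win shares, innings) taken from the table above
sample = {
    "Jimmie Foxx": (35.6, 16775),
    "Frank Thomas": (6.5, 8383),
    "Harmon Killebrew": (14.0, 7810),
}

for name, (dws, innings) in sample.items():
    print(f"{name}: {dws_per_1000(dws, innings):.2f}")
```

Killebrew is the best example of why the rate matters. His raw total is near the bottom, but per 1000 innings he rates among the better fielders in the group.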

We include the innings because the difference between some of these players in terms of productivity is tremendous. The win shares per 1000 innings has a way of evening those numbers out. This is particularly true when looking at the value numbers. Win shares compares with the replacement level first baseman (unlike total zone runs). Before we look at the year by year data we can get a better idea of where these guys are if we look at the rankings in each category. If we can get universal agreement then we have a good idea of what value each player’s fielding adds to their overall value.

  TZ DWS DWAR DWS/1000 AVG
Lou Gehrig 6 3 6 4 4.8
Jimmie Foxx 3 2 2 1 2.0
Jeff Bagwell 1 4 5 5 3.8
Frank Thomas 11 11 11 11 11.0
Eddie Murray 7 1 7 6 5.3
Jim Thome 9 9 8 8 8.5
Willie McCovey 10 7 10 9 9.0
Harmon Killebrew 8 10 9 3 7.5
Hank Greenberg 2 8 1 2 3.3
Tony Perez 4 5 3 7 4.8
George Sisler 5 6 4 10 6.3
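Treating the data as ordinal means ranking each column separately and then averaging the ranks. Here is a minimal sketch of that process using the raw fielding numbers from the earlier table. One assumption to flag loudly: ties are broken by the order the players are listed, so a tied player can land one spot away from a hand-built ranking. The function name `average_ranks` is ours.

```python
# (TZ, DWS, DWAR, DWS/1000) for each player, copied from the fielding table
fielding = {
    "Lou Gehrig": (2, 33.0, -8.9, 1.75),
    "Jimmie Foxx": (21, 35.6, -5.5, 2.12),
    "Jeff Bagwell": (46, 32.4, -7.9, 1.75),
    "Frank Thomas": (-68, 6.5, -23.4, 0.78),
    "Eddie Murray": (-3, 36.9, -12.8, 1.74),
    "Jim Thome": (-24, 15.1, -17.2, 1.58),
    "Willie McCovey": (-65, 23.6, -21.8, 1.41),
    "Harmon Killebrew": (-6, 14.0, -18.8, 1.79),
    "Hank Greenberg": (21, 20.5, -4.3, 2.05),
    "Tony Perez": (16, 24.2, -6.9, 1.68),
    "George Sisler": (6, 23.8, -7.6, 1.36),
}

def average_ranks(table):
    """Rank every column from best (1) to worst, then average each player's ranks."""
    names = list(table)
    n_metrics = len(next(iter(table.values())))
    ranks = {name: [] for name in names}
    for col in range(n_metrics):
        # Higher is better in every column. sorted() is stable, so tied
        # players keep their listing order (an assumption, not a rule).
        ordered = sorted(names, key=lambda name: table[name][col], reverse=True)
        for position, name in enumerate(ordered, start=1):
            ranks[name].append(position)
    return {name: sum(r) / len(r) for name, r in ranks.items()}

avg = average_ranks(fielding)
for name in sorted(avg, key=avg.get):
    print(f"{name}: {avg[name]:.2f}")
```

The exact decimals matter less than the order. When four different measures put the same names at the top and the bottom, we can feel pretty good about the conclusion.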

Here is where the rubber meets the road. Recently, I got into a long, drawn-out discussion on social media about the need for complex metrics and data. As the usual argument goes, we don’t need complex statistics to tell us who the best player is. Of course, this is true to a certain extent, but it ignores a few things and is certainly not true all the time. More often than not, it is simply an overreaction to developments we either don’t understand or don’t have much need for. We see it in other walks of life and all of us are guilty of that thinking to one extent or another.

Those that argue against the use of statistics often use statistics to argue their point. They use batting average, home runs, RBI, and runs scored. They use fielding percentage, errors, putouts, and assists in fielding. It’s incredibly ironic for anyone to use statistics to argue against the use of statistics. Now, we certainly can argue as to which statistics are the most descriptive and accurate, but we should at least be honest about what we are talking about.

That being said, we don’t need statistics to tell us Jeff Bagwell was a better defensive first baseman than Frank Thomas. You could look at them with a glove in their hand and decipher that much. However, looking at them doesn’t answer the question of how much better one is than the other. What is the relative or precise value of Bagwell’s glove in comparison with Thomas? How does that interact with their overall value? Then, we get to more complex questions like how fielding value compares at positions like first base versus say a shortstop or catcher. Is a team better off with a Thomas type or should they go after a Keith Hernandez type?

Whether this kind of data interests the common fan is certainly debatable. No one really needs to know what someone’s secondary average is to really enjoy the game. However, teams increasingly need to be on the leading edge of data to put together the best rosters and determine how much each player should be paid and whether they should be invested in long-term. So, those of us that enjoy looking at data enjoy looking at it because we want to know not only who was the best, but how much better they were than someone else.

  TZ WS GG
Lou Gehrig 4
Jimmie Foxx 5
Jeff Bagwell 0 1 1
Frank Thomas 0 0 0
Eddie Murray 1 5 3
Jim Thome 0 0 0
Willie McCovey 0 0 0
Harmon Killebrew 0 1 0
Hank Greenberg 2
Tony Perez 1 0 0
George Sisler 1

When we look at awards breakdowns like these we usually see a discrepancy between the total zone and win share awards and the Gold Gloves. Gold Gloves are voted in by the coaches. Rafael Palmeiro won once after a season where he played over 100 games at DH. This isn’t to say that the coaches are always wrong, but given the fact that they may see an opposing fielder a maximum of 19 times during a season, we can’t really take their suggestions all that seriously. It’s not that they can’t identify a good fielder. They know more about baseball than most of us. The problem is a lack of evidence. So, their voting is largely based on reputations that may or may not be deserved.

The problem here is that value at first base is overwhelmingly concentrated on the offensive end. The best fielders are rarely ever the most valuable unless they also happen to be great hitters. We don’t see that very often because the skills needed to be a good fielder (quickness, speed) are not usually present in power hitters. There are notable exceptions, but if you think of the best fielding first basemen in history they usually did not hit for prodigious power. So, while some of us find fielding fascinating and love to see how players are rated with the glove, in terms of value it is not particularly important in finding out which first baseman was the best in history.

Lazy Analysis

Every once in a while I stumble onto something on the internet that interrupts my train of thought. I’ve been trying to go through the positions in an organized fashion. I posted the first base index earlier in the week and should be moving to fielding shortly. The great thing about having a website instead of a book is that I can respond to things in real time. So, today I’m interrupting the process to respond to something that I read on the interwebs.

It was innocent enough really. Someone posted a link on Facebook to an article they had written about Jimmy Rollins and his chances of getting into the Hall of Fame. I read the article because I am obviously interested in the Hall of Fame and arguments surrounding it. I won’t mention the author or all of the specifics because I don’t believe in snark as a general rule. I highlight the arguments because they are something you see quite often.

The two arguments that came up are similar, but there are enough differences that we should address them separately.

  1. In 2007, Rollins won the MVP award with a season where he had 20 triples, 30 home runs, and more than 40 steals. He is the only person in history to produce such a season.

I could call these what my wife calls them (“random ass statistics”) but that would serve to cheapen the argument and add the kind of snark I usually detest. The closest comparison I could make here is the so-called triple double in basketball. If someone scores 12 points, has 11 rebounds, and 11 assists did he have a better game than a player with 28 points, 9 rebounds, and 8 assists?

In the baseball vernacular we could look at that individual season and the shortstops involved. Keep in mind, I’m leaving WAR and win shares out of this discussion. It’s disingenuous to use a metric to justify its own utility. In other words, we are talking in general about the need for more rigorous analysis. So, Rollins was the only player to put together that particular statistical profile. Unfortunately, that presupposes that each of those events carries the same value. What we have learned is that stolen bases are not all that valuable in comparison with other events. It is much more valuable to steal first base than it is second or third.

Most of you are familiar with OPS. It stands for on base percentage plus slugging percentage. It is said to explain 90 percent of the variance in run production. It is much more descriptive than simply looking at runs scored, RBI, home runs, steals, and triples. Considering his MVP award and highfalutin statistical profile we would expect Rollins to be the best shortstop in baseball that season. However, if we look at OPS+ (OPS normalized with home ballpark effects removed and compared to the league average) then we see that is not the case.

  1. Hanley Ramirez: 145
  2. Edgar Renteria: 124
  3. Carlos Guillen: 122
  4. Derek Jeter: 121
  5. Jimmy Rollins: 119
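For those curious how numbers like the ones above are built, here is a minimal sketch of OPS and a simplified OPS+. It uses the commonly published form, 100 times (OBP over league OBP plus SLG over league SLG minus 1), and deliberately leaves out the ballpark adjustment. The league averages and the sample hitter are hypothetical, and the function names are ours.

```python
def ops(obp, slg):
    """On base percentage plus slugging percentage."""
    return obp + slg

def ops_plus(obp, slg, lg_obp, lg_slg):
    """Simplified OPS+ with no park adjustment; 100 is league average."""
    return 100 * (obp / lg_obp + slg / lg_slg - 1)

# Player A's regular season line from the earlier Player A and B test
print(round(ops(0.359, 0.487), 3))  # 0.846

# Hypothetical hitter measured against a hypothetical league
print(round(ops_plus(0.350, 0.500, lg_obp=0.330, lg_slg=0.420)))  # 125
```

A hitter who exactly matches the league in both categories comes out at 100, which is what makes the number easy to compare across seasons.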

Does this mean he was the fifth best shortstop in baseball in 2007? Of course it doesn’t. We haven’t even looked at fielding and OPS+ doesn’t explain everything. It does explain a whole lot more than that analyst was attempting to explain. In essence, it just means that putting together a list of the number of times a player meets specific statistical markers is an inexact way of arriving at value.

  2. Jimmy Rollins had a similar number of hits, runs, RBI, home runs, stolen bases, and Gold Glove awards as Alan Trammell and Barry Larkin.

I will simply leave Gold Gloves where they are. That requires a whole separate article that we will likely get to someday. That day won’t be today. Suffice it to say that Gold Gloves have about as much to do with identifying fielding greatness as good penmanship has to do with stock car racing. The other comparisons seem compelling and that is why we have to spend more time on it.

Basic statistics have three separate issues that make using them problematic. The most obvious area of bias is one we identified with the first problem. Scouts love to talk about the five tools and love to swoon over guys they call “five tool players.” This is where a guy can hit, hit for power, run, throw, and field. Even ignoring the fact that those tools ignore plate discipline (which might be more important than all of them) we have to recognize that not all of those tools are equally valuable or important. A home run is more valuable than a stolen base. A run scored is more valuable than a hit. So, stacking these numbers next to each other can cloud our judgment as to how valuable a player really is.

The second issue is the issue of place. Where a player produces these numbers can further be divided into two different considerations. First, we have the ballpark the player played in. Coors Field and Petco Park are two very different environments, so we would expect that environment to impact each player differently. You cannot simply stack those counting numbers next to each other without considering how those environments impacted those numbers. Hitting .300 in the Astrodome for instance is far different than hitting .300 in Fenway Park.

The second issue with place is the team that the player played for. Inserting any player onto a team like the 1970s Reds, late 1990s Yankees, or 1950s Dodgers will impact their numbers in a very positive way. Do the same for the 1960s Mets, Senators from any period, or the 1930s Phillies and you would see those numbers depressed in comparison with players from the same period that were on good teams. Geography matters. In the instance of Rollins, we have to acknowledge that the Phillies were good throughout that period. Sure, Rollins had a hand in that, but so did Ryan Howard, Chase Utley, Bobby Abreu, and others.

This brings us to the bias of time. Time is simple. When did the player play? An average player in 1930 would produce far different numbers than an average player in 1968. This is true even if we normalize for the quality of the team he played on and the ballpark he played in. One cannot compare Alan Trammell, Barry Larkin, and Jimmy Rollins and not account for those three areas of bias. To simply say that they have similar totals in runs scored, RBI, and stolen bases ignores a great deal.

I know I said I would not include WAR or win shares, but I feel the point has been made. What those metrics do is distill the effects of time, place, and the randomness of how much each unit means in terms of helping his team win. They are all included in the secret sauce. Below are the bWAR for each player.

  • Alan Trammell: 70.4
  • Barry Larkin: 70.2
  • Jimmy Rollins: 46.0

I’m not saying that Rollins is not a Hall of Famer. I haven’t done the full analysis on that yet, but it certainly isn’t looking good right now. I find the above a little more compelling than comparing how many bases each stole. In short, this is why something like the index is so valuable. We miss a lot when we only play around with the basic numbers.