Can pitchers prevent hits on balls in play?
July 21, 2003
In January, 2001, Voros McCracken published an article that shook the baseball analysis community.
In an attempt to better understand how to separate the contributions of pitching and defense, McCracken divided the traditional pitching stats into two groups -- those that are under the direct control of the pitcher (hit batsmen, walks, strikeouts, homers) and those that aren't (hits on balls in play). He called the first group defense-independent pitching stats, or DIPS for short.
I'll get into the details shortly. First, though, it's worth saying why McCracken's work caused such a stir: he reached a conclusion that seems very counter-intuitive and, if true, extremely important. He stated his major finding in two ways, once at the beginning of the article and once at the end:
"hits allowed are not particularly meaningful in the evaluation of pitchers"
"major-league pitchers don't appear to have the ability to prevent hits on balls in play"
McCracken wasn't able to give a reason why this would be true, but stated rather emphatically that it is true.
Ever since I read that article, I've been wondering how this could possibly be. It seems so obvious that certain pitchers must be able to get more than their share of easy outs. Doesn't Greg Maddux produce more than his share of routine ground balls? Doesn't Mariano Rivera's cutter eat up opposing hitters even when they don't strike out? Doesn't a flame-thrower like Roger Clemens induce a lot of weak swings from hitters who are down in the count? Wouldn't a knuckleball lead to more lazy popups from hitters who are just guessing at where that pitch will dance next?
McCracken's analysis used a stat that I'll call in-play average (or IPAvg), which he defined as (H - HR) / (BF - HR - HBP - BB - K). That's just non-homer hits divided by balls in play, and because all but a handful of homers leave the yard, it's a good reflection of how well pitchers and defenses are able to turn batted balls (that stay in the field of play) into outs.
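For readers who like to see a formula as code, here is a minimal Python sketch of the IPAvg calculation. The function name and the sample stat line are hypothetical and used only for illustration; the formula itself is the one quoted above.

```python
def in_play_average(h, hr, bf, hbp, bb, k):
    """In-play average: (H - HR) / (BF - HR - HBP - BB - K)."""
    balls_in_play = bf - hr - hbp - bb - k
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play


# Made-up stat line for illustration only (not a real pitcher-season).
example = in_play_average(h=200, hr=20, bf=900, hbp=5, bb=60, k=150)
print(f"{example:.3f}")
```

The denominator removes every plate appearance that ended without a ball landing in the field of play, so what remains is the set of chances the defense had to make an out.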
He found that:
- there are "massive differences in the ability of pitchers" even before considering balls in play. To put it another way, a lot of a pitcher's ERA is explained by his walk rate, strikeout rate, and ability to prevent homers.
- the correlation between a pitcher's IPAvg one year and the next is low, suggesting that pitching ability might not have a major impact on IPAvg, as compared to other factors such as defense and luck
- some of the best pitchers in the game, such as Greg Maddux and Pedro Martinez, have gone from the top to the bottom and back to the top in IPAvg in subsequent seasons, again suggesting that these results are largely out of their control
- the variations in IPAvg decrease when you add park effects and the quality of the defense to the analysis
- projections of next-year pitching stats are more accurate if you use a team's collective IPAvg than if you use each pitcher's personal IPAvg from the year before
My reaction was to think that McCracken was on to something but may have gone too far, so I began to think about how to dig a little deeper.
McCracken appears to have done most of his work using stats from two seasons. I wasn't sure whether those two seasons were representative or not, so I decided to apply his method to all pitcher-seasons since 1913. Why 1913? Because that's the first year my historical database has all of the stats needed to compute IPAvg and the DIPS for every pitcher. And I figured that 90 years would be more than enough to prove the point one way or the other.
After compiling this information and studying it for a while, I discovered a pair of columns by Rob Neyer of ESPN.com. In the first column, Rob described the McCracken article. In the second one, which appeared a couple of days later, Rob included email messages from Craig Wright and Bill James with their take on McCracken's assertion.
Wright described his own work in this area:
"Like McCracken, I've studied hits allowed per ball in play (though with the small difference that I subtract sacrifice hits) ... I agree that this type of hit rate is not as heavily influenced by the pitcher as is commonly believed, but at the same time I am distinctly uncomfortable with McCracken's conclusion."
James wrote that he hadn't studied this issue, but that he shared Wright's reservations and suggested that someone do a large-scale study to find out whether the idea would hold up. It appears that the work I had just finished doing was exactly what Bill was proposing.
In addition, Bill wrote about McCracken's work in the New Bill James Historical Baseball Abstract. Based on a review of an unspecified number of pitching careers and about 400 pitcher-seasons, he concluded that pitchers do have an influence on these outcomes but confirmed McCracken's finding that there's still a lot of random variation in single-season performances.
Finally, in recent months, I've seen more and more references to McCracken's assertion in various baseball articles and posts to baseball research forums. There's enough momentum building behind this idea that a few of our customers have asked how we might change the design of our Diamond Mind Baseball game to reflect this new knowledge about how baseball works.
Before making any changes to our game or our method for projecting player performance, I figured it was worth spending some time looking at this question.
NOTE: In an article published on Baseball Primer last year, McCracken softened his original conclusion a little, saying that there are small differences among pitchers in their ability to prevent hits on balls in play, and those differences are "statistically significant if generally not very relevant." Except for the regulars on Baseball Primer, I don't think many people in the baseball research community are aware of this update to McCracken's thinking.
For every pitcher who appeared in the big leagues since 1913, I computed his HBP rate, walk rate, strikeout rate, homerun rate, and IPAvg for each of his seasons. The first four numbers are computed quite simply -- take the relevant stat and divide by batters faced. The IPAvg figures were computed according to McCracken's formula, which I wrote out a few paragraphs back.
To establish a baseline against which to evaluate those figures, I also computed those stats for each league-season and each team-season since 1913.
This enables us to evaluate every pitcher relative to the norms for his league. Last year, for example, Roger Clemens faced 768 batters and fanned 192 of them. That's a strikeout rate of .250 in a league where the average was only .163. His advantage over the league can be stated in two ways: (a) his rate was .087 higher than the league, and (b) he had 67 more strikeouts than the league-average pitcher would have had if he faced the same number of batters as Clemens. The same method was used to determine how many hit batsmen, walks, and homeruns each pitcher yielded above or below the league average.
For balls in play, I took the in-play batting average for each pitcher and subtracted the corresponding in-play batting average for the league. As was the case with strikeouts, the result can be expressed either as a number of batting average points above/below the league or a number of hits above/below the league.
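The league comparison can be sketched in a few lines of Python. The function name is hypothetical; the inputs are the Clemens strikeout figures quoted above, and the result reproduces the 67-strikeout advantage.

```python
def versus_league(count, opportunities, league_rate):
    """Return (rate difference, net count) relative to a league-average rate."""
    rate = count / opportunities
    expected = league_rate * opportunities  # what a league-average pitcher would post
    return rate - league_rate, count - expected


# Clemens: 192 strikeouts in 768 batters faced, league strikeout rate of .163.
rate_diff, net_k = versus_league(192, 768, 0.163)
print(f"{rate_diff:+.3f} {net_k:+.0f}")
```

The same function works for balls in play if `count` is non-homer hits, `opportunities` is balls in play, and `league_rate` is the league's in-play average.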
But hits on balls in play are subject to some outside influences that make comparisons with the league average a little suspect. Some parks (like Coors Field) tend to inflate batting averages. Some defenses are much better than others. If Jamie Moyer allows 15 fewer hits than normal, how can we decide whether to give Moyer the credit or chalk it up to Safeco Field and the talents of Mike Cameron and Ichiro?
To account for the effects of park and defense, I also computed the in-play average for each team-season in the period from 1913 to 2002. If McCracken is correct when he says that pitchers have virtually no influence over these outcomes, every pitcher on a given team should have roughly the same IPAvg. After all, those pitchers share a common park and a common defense.
If we then (a) compute the IPAvg for each team, (b) compare the IPAvg for each pitcher to that of his team, and (c) study those differences, we should find that the differences in IPAvg between a pitcher and his teammates are random. In other words, those differences should be centered around zero, equally likely to be above zero as below zero, and have no predictive value from one year to the next.
If we find that these differences are not random, there must be another factor, apart from defense and park effects, that accounts for them. And it follows that the missing factor must be an attribute of the pitcher. Because if the pitcher had nothing to do with it, there'd be no reason for that external factor to be evident only for this pitcher.
Studying career totals
At this stage of the process, we now know how much a pitcher exceeded or fell short of his league in five categories -- HBP, BB, K, HR and hits on balls in play -- for every season of his career. And we also know how much a pitcher exceeded or fell short of his teammates on in-play hits for every season of his career. The last step is to sum these values to obtain career totals (from 1913 forward) for every pitcher.
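Here is a rough sketch of the team comparison and the career sum, assuming each pitcher-season is stored as a tuple of four counts. The function name and the two season lines are invented for illustration and do not belong to any real pitcher.

```python
def net_in_play_hits(pitcher_hits, pitcher_bip, team_hits, team_bip):
    """In-play hits above (+) or below (-) what the team's IPAvg predicts."""
    team_ipavg = team_hits / team_bip
    return pitcher_hits - team_ipavg * pitcher_bip


# Hypothetical seasons:
# (pitcher in-play hits, pitcher balls in play, team in-play hits, team balls in play)
seasons = [
    (190, 700, 1200, 4000),  # team IPAvg .300, so 210 hits expected: net -20
    (180, 600, 1160, 4000),  # team IPAvg .290, so 174 hits expected: net +6
]

career_net = sum(net_in_play_hits(*s) for s in seasons)
print(f"{career_net:+.0f}")
```

A negative career total means the pitcher allowed fewer in-play hits than his teams' rates would predict.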
McCracken asserted that pitchers have a lot of control over the defense-independent pitching stats, so I would expect to see substantial differences among pitchers in their career HBP, walk, strikeout, and homerun rates, even after normalizing all of these figures against the league averages for each season.
After crunching the numbers for a total of 29,973 seasons by 6,004 pitchers, we did indeed find very large differences among pitchers in some of the defense-independent statistics, especially walks and strikeouts. That's not likely to surprise any of you. It didn't surprise me, and it's entirely consistent with McCracken's findings.
More importantly, McCracken asserted that pitchers have almost no control over balls in play. If he's right, we would expect to see essentially random values for the career rates of in-play hits, especially for net in-play hits relative to the team baseline.
But we also found meaningful differences in the number of hits allowed on balls in play. In other words, a large number of pitchers consistently demonstrated the ability to limit the number of those hits. Their influence on these outcomes isn't as great as it is on the defense-independent stats, but it is real, and it is large enough to be important.
Here's a partial list of the top pitchers based on the number of career hits they saved relative to the IPAvg of their teams. The list includes two figures for each pitcher, the first without adjustments for park and defense and the second with those adjustments:
Pitcher            IPHits vsLg  IPHits vsTm
-----------------  -----------  -----------
Charlie Hough         -371         -299
Walter Johnson        -277*        -214*
Tom Seaver            -269         -201
Catfish Hunter        -296         -185
Warren Spahn          -266         -183
Fergie Jenkins        -128         -182
Pete Alexander        -197*        -177*
Phil Niekro           -147         -172
Jim Palmer            -315         -170
Ned Garver             -71         -168

* excludes seasons before 1913
Charlie Hough has prevented more hits on balls in play than any other pitcher in our study, and our sample includes the last ninety years, so we've covered most of baseball history. Compared with the league-average pitcher, Hough has allowed 371 fewer hits on balls in play. Compared with his teammates, that figure drops to 299 hits, suggesting that his parks and defenses deserve some of the credit.
How important is 299 hits? Hough would have given up an extra run every three games or so if he had allowed hits on balls in play at the same rate as his teammates over the course of his career. That's a pretty big deal.
Could this happen by chance? No, it couldn't. Hough allowed batters to put 11,586 balls in play over the course of his career. If these results were random, there'd be a 95% chance that his net hits allowed would fall between +93 and -93 and a 99% chance they would fall between +116 and -116. The probability that a pitcher could reduce hits by 299 totally by chance is exceedingly small. (For the statisticians among you, Hough was more than six standard deviations from the mean.)
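One way to sanity-check a claim like this is a simple binomial model in which every ball in play is an independent trial with the same hit probability. The sketch below does that in Python. The .280 in-play average is an assumption chosen as a typical value, not a figure from this study, and the function name is hypothetical, so the output should land near the ranges quoted above without matching them exactly.

```python
import math


def chance_sd(balls_in_play, in_play_avg):
    """Standard deviation of in-play hits under a binomial model."""
    return math.sqrt(balls_in_play * in_play_avg * (1.0 - in_play_avg))


BIP = 11_586          # Hough's career balls in play (from the text)
NET_HITS = -299       # hits saved relative to his teams (from the text)
ASSUMED_IPAVG = 0.28  # assumption: a typical in-play average, not from the study

sd = chance_sd(BIP, ASSUMED_IPAVG)
z = NET_HITS / sd     # how many standard deviations from zero
print(f"sd = {sd:.1f} hits, z = {z:.1f}")
```

Under that assumption one standard deviation is a little under 50 hits, which puts 299 hits saved more than six standard deviations from zero.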
And Hough wasn't the only one, not by a long shot. In a sample of 351 pitchers with at least 6000 career balls in play, more than 12% of them posted results that would happen less than 1% of the time by chance. And that understates the case, too, because you get to keep pitching if you're that much better than the league, but you usually don't make it to 6000 balls in play if you're that much worse than the league. If one end of the distribution hadn't been truncated by job losses, approximately 20% of those pitchers would have fallen outside the range that can be explained by chance.
There are two knuckle-ballers on this list, and while you can't see it here, I can tell you that if I had run this list a little further, you'd have seen 6 knuckle-ballers in the top 35. (The other four are Eddie Rommel, Ted Lyons, Hoyt Wilhelm and Tim Wakefield.)
NOTE: The observation that knuckleball pitchers are especially good in this area is not new. Craig Wright noted the same thing in his email to Rob Neyer in January, 2001, and McCracken made this point in an article on Baseball Primer last year.
Some pitchers got a lot of help from their defense and park -- almost half of Jim Palmer's hits saved can be attributed to his defense (mostly) and his park -- while others look even better after the defense/park adjustment.
Of course, when you rank players based on counts, rather than averages, you're going to see a lot of guys with very long careers at the top of the list. So let's rank them again, this time dividing career hits saved by career balls in play, and setting a minimum of 5000 balls in play:
Pitcher            IPAvg vs Lg  IPAvg vs Tm
-----------------  -----------  -----------
Charlie Hough         -.032        -.026
Don Wilson            -.015        -.023
Andy Messersmith      -.033        -.021
Ned Garver            -.008        -.020
Tim Wakefield         -.020        -.019
Catfish Hunter        -.028        -.017
Bud Black             -.020        -.017
Oral Hildebrand       -.015        -.017
Walter Johnson        -.021        -.016
Dave Stieb            -.022        -.016
Hough remains the career leader by holding enemy hitters to an in-play batting average that was 26 points lower than that of the pitchers on his teams. That's a very substantial advantage, and one that is entirely inconsistent with McCracken's conclusion.
To recap, this examination of career totals suggests very strongly that a meaningful number of pitchers have demonstrated the ability to reduce the rate of hits on balls in play.
Year-to-year variations, part one
By comparing the results for two seasons, McCracken concluded that "there is little correlation between what a pitcher does one year in the stat and what he will do the next." I'll start by looking at a few of the pitchers mentioned in the McCracken article, then expand the study and get a little more scientific.
McCracken pointed out that Greg Maddux had one of the best marks in baseball in 1998, then had one of the worst in 1999, and bounced back with a good in-play average in 2000. The following chart shows his entire career, with bars going up indicating an IPAvg that was worse than average and bars going down indicating a lower-than-average rate of hits on balls in play:
The wild swings of 1998-2000 look like an anomaly when you examine Maddux's entire career. In fact, it appears that he struggled a bit as a youngster, reeled off a decade of good-to-great performances, then began to lose it as he got into his mid-30s. That sounds like a pretty normal career progression to me.
Pedro Martinez was another pitcher who gave up a lot of in-play hits in 1999 but bounced back in 2000. It should be noted that Pedro had a 2.07 ERA despite all those in-play hits in 1999, so we can only imagine what he would have done if he'd been a little less unlucky. Here's Pedro's career:
There's really only one bad year in this line, but it happened to fall in one of the years McCracken looked at. I think it's fair to say that Pedro has shown an above-average ability to prevent hits on balls in play, but his influence on these results is much less than on strikeouts, where he consistently mowed down an extra 90 or more hitters a year, and an incredible 181 more than average in 1999.
McCracken wrote that "You'll often hear people use names like Randy Johnson, Jamie Moyer and Andy Pettitte [as being very good at preventing hits on balls in play], but by any definition you want to use, these guys are not particularly good in the stat." Here's Moyer's career:
Moyer wasn't very good in this respect, or in most other respects, for the first half of his career. But he figured something out in 1996 and has been consistently better than the league ever since, with the exception of 2000. If I were McCracken and I were looking only at the 1999 and 2000 seasons, I would have concluded that Moyer isn't particularly effective in preventing hits, but his last seven years say otherwise.
By the way, it's tempting to assume that Safeco Field and a very good Seattle defense are responsible for these recent successes, but that wouldn't be true. First of all, the 1996-1999 numbers were accumulated in a mix of Fenway Park, the Kingdome, and Safeco, with only the second half of 1999 in Safeco. More importantly, these numbers are relative to the in-play average for his teams, so they already factor out the impact of the park and the defense. The bottom line is that Jamie Moyer has been a master at preventing hits on balls in play since 1996.
How about Andy Pettitte? Here's his career:
McCracken was quite correct in pointing out that Pettitte is not a pitcher who prevents hits on balls in play. On the other hand, Pettitte is a very good counter-example to the claim that pitchers are not consistent in this regard.
Randy Johnson is the third pitcher mentioned by McCracken in the quote I cited above. Here's how Johnson has fared on balls in play over his career:
That's nine straight seasons at or better than the league average, followed by five seasons that were league-average or worse. The shift occurred at the very moment that he moved from the AL to the NL. I'm not sure whether that's meaningful, or whether it has more to do with the fact that he turned 35 in 1998. As with Pedro, Johnson's main asset is not his ability to prevent hits on balls in play but his ability to prevent balls in play in the first place. Still, Johnson was pretty good on those balls in play for nine years.
McCracken also claimed that "Randy Johnson gives up fewer hits than Scott Karl. That's not because batters hit the ball harder off Karl than Johnson, but because they hit the ball more often off Karl than Johnson." Here's Karl's career:
You might be able to make the case that Karl in his prime wasn't any worse than Randy Johnson in his late 30s, but if you compare the two pitchers at the same age, there's a noticeable edge for Johnson.
While we're on the subject of consistency from year to year, let's take a look at some of the knuckleballers, starting with Charlie Hough:
This chart is a little misleading in one respect. There are two bars for 1980, one for each of the teams he played for that year. Hough's IPAvg was awful in his 32 innings with the Dodgers and quite good in his 61 innings with Texas. Overall, he was a little worse than average for the year. The bottom line is that Hough was remarkably good at preventing hits on balls in play for a very long time.
Here's another knuckleballer, Tim Wakefield:
And a third knuckleballer, Phil Niekro:
Hough and Wakefield were remarkably good throughout their careers, and if you ignore the years after his 43rd birthday (1983 to the end), you could say the same about Niekro, too.
Number two on the all-time list was Walter Johnson, whose career looked like this:
Remember, I cut things off at 1913, so this leaves out his early years. It's quite possible that he would have been the all-time leader if those seasons had been included.
Sandy Koufax got some help from Dodger Stadium, but that wasn't the only reason he was so dominant during the last five years of his career. Even with the park and defense factored out, his IPAvg was consistently good during those years:
Finally, here's Jim Palmer, another Hall-of-Famer who was consistently good on balls in play during his career, except for the very beginning and end of his time in the big leagues:
If I had run Palmer's chart showing his performance relative to the league average (instead of his team), it would have been twice as impressive.
We could go on and do a lot more pitchers, but I think we've seen enough to make the point that it's not too hard to find examples where these in-play averages appear to be anything but random. In other words, this is highly persuasive evidence that these pitchers did indeed have the ability to prevent hits on balls in play.
Year-to-year variations, part two
It goes without saying that one cannot prove or disprove the idea that "there is little correlation between what a pitcher does one year in the stat and what he will do the next" by examining only ten or twelve careers.
To get a better handle on this phenomenon, I compiled a database consisting of all pairs of consecutive seasons in which a pitcher faced at least 400 batters in each season. Using this sample of 7,486 season-pairs, I computed the correlation coefficient for the net HBP rate, BB rate, K rate, HR rate, and in-play hit rate.
I found the highest correlation (.73) for strikeout rates. Walk rates (.66) were also highly correlated. The correlation coefficients dropped to .36 for hit batsmen, .29 for homeruns, and .16 for in-play batting average relative to the league. The lowest correlation (.09) was seen for in-play batting average relative to the team.
It may appear to be contradictory to say that certain pitchers appear to be consistently good while the overall correlation rate is quite low. But that's not necessarily so.
If McCracken is right, the difference between a pitcher's IPAvg and that of his team should vary randomly around zero as he moves through his career, and the correlation would be quite weak.
But if pitchers do have some influence over these outcomes, they could still exhibit a weak correlation by varying around some value other than zero that reflects the ability of the pitcher.
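A small simulation makes this point concrete. The Python sketch below is purely illustrative: the number of pitchers, the 500 balls in play per season, the .285 baseline, and the .008 spread in true talent are all assumptions picked for the example, not values estimated from the study, and the season-to-season noise uses a normal approximation to the binomial.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(n_pitchers=2000, n_seasons=10, bip=500,
             base=0.285, talent_sd=0.008, seed=1):
    """Each pitcher has a fixed true IPAvg; each season adds sampling noise."""
    rng = random.Random(seed)
    careers = []
    for _ in range(n_pitchers):
        true_avg = base + rng.gauss(0.0, talent_sd)
        noise_sd = math.sqrt(true_avg * (1.0 - true_avg) / bip)
        careers.append([rng.gauss(true_avg, noise_sd) for _ in range(n_seasons)])
    return careers


careers = simulate()
# Correlation between one season and the next.
r_consecutive = pearson([c[0] for c in careers], [c[1] for c in careers])
# Correlation between the average of the first and second halves of a career.
half = len(careers[0]) // 2
r_halves = pearson([sum(c[:half]) / half for c in careers],
                   [sum(c[half:]) / half for c in careers])
print(f"season-to-season r = {r_consecutive:.2f}, half-career r = {r_halves:.2f}")
```

Every simulated pitcher has a real, fixed skill level, yet the correlation between consecutive seasons stays low because a single season's sampling noise swamps the skill differences. Averaging over several seasons lets the skill show through, which is why career totals can look anything but random.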
What about the weaker pitchers?
Most of our work to this point has focused on pitchers who had long and mostly successful careers in the big leagues. How do the DIPS and IPAvg stats of these players compare to those of players who weren't good enough to last that long?
The following table shows how eleven groups of pitchers compared with the overall averages. The first row includes all pitchers who faced fewer than 1,000 batters in their careers. The second row includes all pitchers who faced at least 1,000 batters but fewer than 2,000 batters during their careers. And so on.
Career BF             BF     HBP     BB      K     HR   vsLg   vsTm
1 - 999          401,138    .002   .027  -.017   .002   .017   .015
1000 - 1999      931,981    .001   .013  -.009   .001   .006   .004
2000 - 2999    1,105,712    .001   .007  -.005   .000   .002   .001
3000 - 3999    1,179,916    .000   .006  -.003   .000   .000   .000
4000 - 4999      906,271    .000   .002  -.002   .000   .000   .001
5000 - 5999      920,680    .000   .001   .000   .000   .000   .000
6000 - 6999      647,553    .000  -.004  -.002   .001  -.001  -.001
7000 - 7999      843,937    .000  -.003   .000   .000  -.002  -.001
8000 - 8999      716,200   -.001  -.005   .005   .000  -.002  -.002
9000 - 9999      788,532    .000  -.008  -.001  -.001  -.002  -.001
10000+         2,589,409   -.001  -.010   .008  -.001  -.004  -.003
Let's walk through the first row so it's clear how to read this table. Those pitchers, as a group:
- faced a total of 401,138 batters in their careers
- hit batters at a rate that was .002 above the league average. In other words, they hit two more batters per 1000 BF than did the average pitcher.
- walked 27 more batters per 1000 BF
- struck out 17 fewer batters per 1000 BF
- gave up 2 more homers per 1000 BF
- gave up 17 more hits per 1000 balls in play when compared with the league-average pitcher
- gave up 15 more hits per 1000 balls in play when compared with the in-play averages of their teammates
As you can see from the table, the pitchers with longer careers were progressively better than their shorter-career counterparts in every respect. They walked fewer batters, struck out more hitters, gave up fewer homeruns, and gave up fewer hits on balls in play. The ability to prevent hits on balls in play appears to be as much of a skill as anything else.
It might be easier to see this in chart form, so here are the walk rate, strikeout rate, homerun rate, and in-play averages for these groups of pitchers:
Another interesting aspect of this breakdown by career length is the total number of batters faced by each group. Only a very small percentage of batters are faced by pitchers with short careers. Of the roughly 11 million plate appearances since 1913 (including the Federal League of 1914-15), only 3.6% featured pitchers who finished their careers with fewer than 1000 batters faced.
In fact, the midpoint falls in the 6000-6999 group. A little more than half of the plate appearances since 1913 have been initiated by a pitcher who faced at least 6000 hitters in his career. We, along with other baseball analysts, often compare pitchers to the league average. Those league averages reflect the fact that the majority of plate appearances involve pitchers who are good enough to face thousands of big-league hitters.
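These shares can be checked directly from the batters-faced column of the career-length table. The short Python sketch below does the arithmetic; the variable names are just for illustration.

```python
# Batters faced by career-length group, copied from the career-length table.
groups = [
    ("1 - 999", 401_138),
    ("1000 - 1999", 931_981),
    ("2000 - 2999", 1_105_712),
    ("3000 - 3999", 1_179_916),
    ("4000 - 4999", 906_271),
    ("5000 - 5999", 920_680),
    ("6000 - 6999", 647_553),
    ("7000 - 7999", 843_937),
    ("8000 - 8999", 716_200),
    ("9000 - 9999", 788_532),
    ("10000+", 2_589_409),
]

total = sum(bf for _, bf in groups)
short_share = groups[0][1] / total  # share faced by the shortest-career group

# Find the group in which the running total passes half of all batters faced.
running = 0
midpoint_group = None
for label, bf in groups:
    running += bf
    if running >= total / 2:
        midpoint_group = label
        break

# Share faced by pitchers with at least 6000 career batters faced.
at_least_6000 = sum(bf for _, bf in groups[6:]) / total

print(total, f"{short_share:.1%}", midpoint_group, f"{at_least_6000:.1%}")
```

The total comes to a little over 11 million, the shortest-career group accounts for about 3.6% of it, and the running total crosses the halfway mark in the 6000-6999 group.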
That's a very high standard. And that may explain why it's difficult for any pitcher to consistently perform at a level higher than the league average. The table shows that the pitchers with the longest careers are only a little better than average. (They peak at a higher level, of course, but if you take their entire careers, there's not a huge difference.)
A better indicator may be the comparison of the short-career pitchers to the league averages. The chart shows that these marginal hurlers are far worse than the average in every way. In particular, they give up a lot more hits on balls in play than do the pitchers who are good enough to be big-league regulars for several years.
What's the right baseline?
At this point, we've seen (a) career totals that demonstrate that pitchers do influence these outcomes over the course of their careers, (b) several examples of pitchers who have been very consistent in IPAvg during their careers, and (c) that pitchers with longer careers are better than pitchers with shorter careers in every respect, including IPAvg.
In other words, pitchers do affect the rate of hits on balls in play. That means we can no longer use the team's IPAvg as a baseline against which to evaluate a pitcher. McCracken asserted that the team's IPAvg depended only on the park and the defense, but we've found that it depends on the park, the defense, and the quality of the pitchers on that team. If we use team IPAvg as the baseline, a good pitcher on a good staff is going to look worse than he really is. A good pitcher on a bad staff is going to look better than he really is. A good pitcher on an average team is still going to look a little worse than he really is because his own good performance is included in the team's IPAvg.
That leads to a good question, one that is not easily resolved. Is it better to compare a pitcher's IPAvg to that of his league or his team? If we use the league IPAvg as our baseline, we leave out the impact of the park and the defense. If we use the team's IPAvg as the baseline, we adjust for the park and the defense, but we introduce the quality of the fellow pitchers as a variable that can skew the results.
Neither approach is completely satisfactory. It's probably best to evaluate each pitcher's IPAvg against that of his team but make some accommodation for the quality of the pitching staff before making any judgments about that pitcher and before making any predictions about future performance.
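One simple accommodation, offered here only as a sketch and not as the method used in this study, is to remove the pitcher's own balls in play from the team total before computing the baseline. That deals with the problem of a pitcher's own performance being included in his team's IPAvg, though it does nothing about the quality of the other pitchers on the staff. The function name and the numbers are hypothetical.

```python
def teammates_ipavg(team_hits, team_bip, pitcher_hits, pitcher_bip):
    """Team in-play average with the pitcher's own balls in play removed."""
    other_bip = team_bip - pitcher_bip
    if other_bip <= 0:
        raise ValueError("pitcher accounts for all of the team's balls in play")
    return (team_hits - pitcher_hits) / other_bip


# Made-up season: the team allowed 1200 in-play hits on 4000 balls in play,
# of which this pitcher allowed 190 hits on 700 balls in play.
team_avg = 1200 / 4000
pitcher_avg = 190 / 700
others_avg = teammates_ipavg(1200, 4000, 190, 700)

print(f"vs team: {pitcher_avg - team_avg:+.3f}, "
      f"vs teammates only: {pitcher_avg - others_avg:+.3f}")
```

In this made-up case the pitcher looks about six points better against his teammates than against a team figure that includes his own work.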
In addition to ranking pitchers on IPAvg, this exercise provides a different way of looking at pitching careers. By putting each pitcher's career totals for net HBP, BB, K, HR, and IPHits side by side, we get a very clear picture of the reasons why they were successful.
Let's do a few, starting with Roger Clemens:
How's that for a picture of all-around greatness? Sure, he hit a few more batters than the average pitcher, but compared to the league averages, he walked 173 fewer and struck out 1,355 more, allowed 138 fewer homers, and surrendered 101 fewer hits on balls in play. (The IPHits figures include the defense/park adjustments for all of these profiles.)
Pedro Martinez shows a very similar pattern to that of Roger Clemens, but based on less than half of Clemens' batters faced.
Greg Maddux demonstrates awesome control, an above-average K rate, and the ability to keep the ball in the park. He had some influence on IPAvg, but that was only a part of his success.
By the way, some of those 69 hits saved might be attributable to his own defensive skill rather than his pitching skill. It's also quite possible that the -69 figure significantly understates his contribution. Maddux saved 97 hits relative to the league averages, and now that we've shown that the team IPAvg reflects the ability of the other pitchers on the staff, that figure may represent Maddux's talent more accurately.
Randy Johnson's line shows only one dominating characteristic -- the strikeouts. But if you're going to dominate in one area, that's a good one, because they can't get a hit if they can't put the ball in play. Fortunately for Johnson, his control is only a little worse than the norm, and got better in the later stages of his career.
Guys with below-average strikeout rates aren't supposed to be successful, but Moyer's exceptional control and low IPAvg have been the keys, especially in the later stages of his career.
Now here's a guy who didn't strike anyone out and gave up a lot of hits on balls in play, but survived because he had excellent control and kept the ball in the park. In particular, he kept the ball on the ground, meaning that a lot of those extra hits were singles and that a good number of potential rallies were killed by double plays.
John's profile made me think that it would have been a good idea to extend McCracken's work to measure GDP rates, but that notion didn't hit me until it was too late. Some day, I'll go back and add that to the study and see what pops out.
We can't leave this section without looking at the all-time leader in in-play hits saved. As you can see, Hough hit more batters, walked more batters, struck out only a few more batters, and gave up more homers than the average pitcher. His ability to prevent hits on balls in play is the biggest reason he had a long and successful career.
Is there really any doubt that Don Sutton is a Hall-of-Famer when you look at this profile?
Groups of similar pitchers
We could go on forever this way, so let's speed things up by looking at groups of pitchers with similar styles. Maybe we'll see some patterns.
                   HBP     BB      K     HR  IPHits
Nolan Ryan         +44   +878  +2578   -117    -133
Randy Johnson      +44   +107  +1769    -52     -10
Roger Clemens      +17   -173  +1355   -138    -101
Dazzy Vance        +19    -65  +1122    -20     -19
Steve Carlton      -49     -1  +1042     +5     -31
Bob Feller          -1   +149  +1022    -42     -53
Sandy Koufax       -35    +64  +1015    -12     -94
Pedro Martinez     +27   -152   +974    -60     -47
Obviously, the defining characteristic of these pitchers was their ability to retire batters without help from anyone else. As a group, with the exception of Ryan, they had average control. All of them were better than average on hits per ball in play, but that wasn't the main reason for their success.
                   HBP     BB      K     HR  IPHits
Rich Gossage       +10    +90   +492    -31     -57
Lee Smith          -17    +25   +447    -21     +12
Tom Henke          -10    -24   +391    -12     -20
Rollie Fingers      +4   -109   +358    -16     +12
Armando Benitez     -3    +78   +332     +1     -41
Trevor Hoffman     -17    -34   +317     -8     -49
John Wetteland      -5    -25   +310     -5     -39
Billy Wagner         0    +17   +295     -6     -23
Robb Nen           -18     -3   +283    -27     +12
Troy Percival        0    +30   +279     -9     -55
Bruce Sutter        -4    -48   +269     +1     -54
This is just a special case of the power pitchers group, but it's interesting to see how many of these guys have posted impressive IPHits numbers even though they pitch many fewer innings than do the power pitchers in the previous table.
                    HBP     BB      K     HR  IPHits
Robin Roberts       -40   -772    -15    +56     -82
Pete Alexander      -50   -570   +247     -1    -177
Jim Kaat            +19   -566   -264     -4    +144
Ferguson Jenkins    -12   -534   +635   +125    -182
Greg Maddux          -4   -507   +150   -147     -69
Ted Lyons           -43   -481   -366     -7    -121
Dutch Leonard        +7   -477   -100    -53     -64
Don Sutton          -25   -476   +512    +42    -138
Lew Burdette         -9   -445   -611    -13     +32
Walter Johnson      +25   -442   +847    -20    -214
Some of these guys (Roberts, Jenkins, and Sutton) gave up more than their share of homers, but with control this good, plus the ability to reduce hits on balls in play, a lot of those homers were solo shots.
                    HBP     BB      K     HR  IPHits
Warren Spahn        -63   -437    -36    -44    -183
Bud Black            +2   -110   -204    +23    -114
Randy Jones         -18   -189   -346    -13     -97
Wilbur Wood           0   -238   -135    -13     -84
John Tudor           -3   -146    -50     +5     -82
Kenny Rogers         +3    -39   -105    -40     -74
Larry Gura           +4   -127   -276    +21     -72
Jim Deshaies         -7    +21    -34    +44     -72
Jamie Moyer           0   -238   -153    +15     -65
Don Carman           +4    +44     -4    +36     -65
This is a list of left-handed pitchers with below-average strikeout rates. Most had very good control, but six of them were at least as susceptible to the long ball as the average pitcher. A significant part of their success is/was the ability to keep hitters off balance and keep their in-play batting averages down.
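Those statements can be checked directly against the table. Here is a minimal Python sketch, with the BB, K, HR, and IPHits columns copied from the table above; the variable names are invented for the example.

```python
# (name, BB, K, HR, IPHits), each relative to the league average.
lefties = [
    ("Warren Spahn", -437, -36, -44, -183),
    ("Bud Black", -110, -204, 23, -114),
    ("Randy Jones", -189, -346, -13, -97),
    ("Wilbur Wood", -238, -135, -13, -84),
    ("John Tudor", -146, -50, 5, -82),
    ("Kenny Rogers", -39, -105, -40, -74),
    ("Larry Gura", -127, -276, 21, -72),
    ("Jim Deshaies", 21, -34, 44, -72),
    ("Jamie Moyer", -238, -153, 15, -65),
    ("Don Carman", 44, -4, 36, -65),
]

# Every pitcher in the group struck out fewer batters than average.
below_avg_strikeouts = all(k < 0 for _, _, k, _, _ in lefties)

# Pitchers who walked fewer batters than average.
fewer_walks = [name for name, bb, _, _, _ in lefties if bb < 0]

# Pitchers who allowed at least as many homers as average.
homer_prone = [name for name, _, _, hr, _ in lefties if hr >= 0]

# In-play hits saved by the group as a whole (IPHits is negative when hits are saved).
total_hits_saved = -sum(iphits for *_, iphits in lefties)

print(below_avg_strikeouts, len(fewer_walks), len(homer_prone), total_hits_saved)
```

Eight of the ten walked fewer batters than average, six were at or above average in homers allowed, and together the group saved 908 hits on balls in play.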
Putting the pieces together
We've seen that there's more than one way to succeed as a big-league pitcher. Robin Roberts walked 772 fewer batters than his peers. Roger Clemens struck out 1355 more batters than average. Greg Maddux yielded 147 fewer home runs. And Charlie Hough prevented somewhere between 299 and 371 hits on balls in play.
So what's the most important element of a pitcher's repertoire?
Well, the value of various baseball events depends on the era. When scoring is up, as it has been in recent years, an extra baserunner comes around to score more often than during a period like the 1960s. In The Hidden Game of Baseball, Pete Palmer provided a table of run values for various periods in the 20th century, and I'll use those values to evaluate these events.
Palmer puts the value of a walk at about a third of a run, so the 772 walks saved by Robin Roberts are worth about 250 runs over the course of a career. That's not bad.
Clemens struck out 1355 more batters, but if he hadn't, some of those batters would have reached base, and some would have been retired in other ways. If his strikeout rate had been at the league average, it's possible that he would have allowed another 125 walks, 35 homers, and 320 more hits on balls in play. Using Palmer's run values and reasonable assumptions about the distribution of those hits among singles, doubles, and triples, those strikeouts are worth about 250-280 runs.
Palmer puts the value of a homer at about 1.4 runs, so Maddux saved about 200 runs by keeping his home run rate down.
And the 300+ hits saved by Hough are worth about 150-175 runs.
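The arithmetic behind these four estimates can be sketched in a few lines of Python. The walk and homer values are the approximate Palmer figures quoted above. The per-hit value is not stated in the text, so the sketch backs it out of the 150-175 run estimate for roughly 300 in-play hits; treat that implied range as an assumption. The function and variable names are only illustrative.

```python
# Approximate run values quoted in the text (from Palmer's table).
WALK_RUNS = 1 / 3
HOMER_RUNS = 1.4

# Assumption: per-hit value implied by "300+ hits ... about 150-175 runs".
HIT_RUNS_LOW = 150 / 300
HIT_RUNS_HIGH = 175 / 300


def runs_saved(walks=0, homers=0, hits=0, hit_runs=HIT_RUNS_LOW):
    """Runs saved by preventing the given numbers of walks, homers, and in-play hits."""
    return walks * WALK_RUNS + homers * HOMER_RUNS + hits * hit_runs


roberts = runs_saved(walks=772)    # Roberts: 772 walks saved
maddux = runs_saved(homers=147)    # Maddux: 147 homers saved

# Clemens: the text's guess at what his 1355 extra strikeouts replaced.
clemens_low = runs_saved(walks=125, homers=35, hits=320, hit_runs=HIT_RUNS_LOW)
clemens_high = runs_saved(walks=125, homers=35, hits=320, hit_runs=HIT_RUNS_HIGH)

# The rest of those batters would have been retired some other way.
other_outs = 1355 - (125 + 35 + 320)

print(round(roberts), round(maddux), round(clemens_low), round(clemens_high), other_outs)
```

With these inputs Roberts comes out near 257 runs, Maddux near 206, and Clemens between about 251 and 277, in line with the rounded figures above. Of the 1355 batters in the Clemens scenario, 875 are assumed to make outs anyway.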
Those are impressive figures, and they'd be even more impressive if we were evaluating them against replacement-level pitchers instead of the league average. As we noted before, the league average is a very high standard.
The bottom line is that success in all four areas is important. You can have a good career if you're average in all four areas or if you can offset one weak area with a strength in another. You can have a very good career if you have no major weaknesses and you have a special ability in one of these respects. And you can have a great career if you're better than average in all four areas.
Having completed this study, I can sum up my own beliefs as follows:
1. Pitchers have more influence over in-play hit rates than McCracken suggested. In fact, some pitchers (like Charlie Hough and Jamie Moyer) owe much of their careers to the ability to excel in this respect.
2. Their influence over in-play hit rates is weaker than their influence over walk and strikeout rates. The most successful pitchers in history have saved only a few hits per season on balls in play, when compared with the league or team average. That seems less impressive than it really is, because the league average is such a high standard. Compared to a replacement-level pitcher, the savings are much greater.
3. The low correlation coefficients for in-play batting average suggest that there's a lot more room for random variation in these outcomes than in the defense-independent outcomes. I believe this follows quite naturally from the physics of the game. When a round bat meets a round ball at upwards of 90 miles per hour, and when that ball has laces and some sort of spin, minuscule differences in the nature of that impact can make the difference between a hit and an out. In other words, there's quite a bit of luck involved.
4. Year-to-year variations in IPAvg-versus-team can occur if the quality of a pitcher's teammates varies from year to year, even if that pitcher's performance is fairly consistent.
5. The fact that there's room for random variation doesn't necessarily mean a pitcher doesn't have any influence over the outcomes. It just means that his year-to-year performances can vary randomly around a value other than zero, a value that reflects his skills.
6. Unusually good or bad in-play hit rates aren't likely to be repeated the next year. This has significant implications for projections of future performance.
7. Even if a pitcher has less influence on in-play averages than on walks and strikeouts, that doesn't necessarily mean that in-play outcomes are less important. Nearly three quarters of all plate appearances result in a ball being put in play. Because these plays are much more frequent, small differences in these in-play hit rates can have a bigger impact on scoring than larger differences in walk and strikeout rates.
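Points 3, 5, and 6 can be put on a rough numerical footing by treating each ball in play as an independent trial with a fixed chance of becoming a hit. That binomial model, the 600 balls in play per season, the 15-season career, and the .290 and .280 hit rates are all illustrative assumptions, not figures from the study.

```python
import math

# Hypothetical inputs, chosen only for illustration.
LEAGUE_IPAVG = 0.290      # assumed league in-play average
PITCHER_IPAVG = 0.280     # assumed true skill of a good hit-preventer
BIP_PER_SEASON = 600      # assumed balls in play per season
SEASONS = 15              # assumed career length


def ipavg_std_dev(true_rate, balls_in_play):
    """Standard deviation of observed in-play average under a binomial model."""
    return math.sqrt(true_rate * (1 - true_rate) / balls_in_play)


skill_gap = LEAGUE_IPAVG - PITCHER_IPAVG
hits_saved_per_season = skill_gap * BIP_PER_SEASON

season_sd = ipavg_std_dev(PITCHER_IPAVG, BIP_PER_SEASON)
career_sd = ipavg_std_dev(PITCHER_IPAVG, BIP_PER_SEASON * SEASONS)

# How large the true skill gap is relative to the random noise.
season_ratio = skill_gap / season_sd
career_ratio = skill_gap / career_sd

print(f"hits saved per season: {hits_saved_per_season:.1f}")
print(f"one-season noise (sd): {season_sd:.4f}, skill/noise: {season_ratio:.2f}")
print(f"career noise (sd):     {career_sd:.4f}, skill/noise: {career_ratio:.2f}")
```

Under these assumptions the pitcher saves about six hits a season. That ten-point skill gap is only about half the size of one season's random noise, which is why a single season's in-play average is hard to repeat. Over a 15-season career the same gap is roughly twice the noise, which is why a real skill can still show up in career totals.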
The process of separating pitching stats into defense-independent and defense-dependent groups is illuminating. The notion that pitchers don't have as much control over in-play outcomes as they do over defense-independent outcomes is both obvious (in retrospect) and very important. Voros McCracken deserves a lot of credit for introducing this way of thinking.
The bottom line, though, is that I am convinced that pitchers do influence in-play outcomes to a significant degree. There's a reason why Charlie Hough and Jamie Moyer and Phil Niekro and Tom Glavine and Bud Black have had successful careers despite mediocre strikeout rates. There's a reason why the top strikeout pitchers have also suppressed in-play hits at a good rate. Using power or control or deception or a knuckleball, pitchers can keep hitters off balance and induce more than their share of routine grounders, popups, and lazy fly balls.