DMB News December 2002
Diamond Mind Email Newsletter
December 13, 2002
Written by Tom Tippett
Welcome to the sixth edition of the Diamond Mind email newsletter for the year 2002. Through these newsletters, we will try to keep you up to date on the latest product and technical information about the Diamond Mind Baseball game, related player disks, and our ongoing baseball research efforts. Back issues are available on our website.
Topics for this issue:
We're happy to report that we began shipping the 2002 Season Disk on schedule earlier this week, and all advance orders have been shipped. If you ordered in advance for overnight or email delivery, the season disk should already be in your hands. And if you requested delivery by first class mail, priority mail, or air mail, your package should arrive by the end of next week.
Earlier today, we updated our master copy of the 2002 Season Disk to make a minor change to two ballparks.
If your copy of the season disk was shipped on or after December 13th, you already have the new version and you don't need to do anything. If you received the original version, please read the following so you can decide what action, if any, you wish to take.
The weather this year was unusually cool in Pacific Bell Park in San Francisco and Network Associates Coliseum in Oakland. As a result, we assigned these parks an average temperature of Cool.
This setting has been supported by the game for many years and does not cause any problems on its own. But it can lead to a serious problem if you follow the steps described in the next paragraph.
The Modify Park window has a drop-down list of available temperature settings, but Cool is not one of the options on that list. As a result, if you choose to Modify one of these parks and click on the weather tab, you will see a blank value for average temperature. If you leave it blank and click OK, an invalid temperature setting will be stored, and you'll see ridiculously high scores for any future games played in that park. (If you choose another temperature setting before clicking OK, or if you click Cancel instead, you're fine.)
There's nothing wrong with the Cool setting other than the fact that the Modify Park window doesn't recognize it. While testing the season disk, we ran well over a hundred simulated seasons with excellent results. So if you don't think you'll ever use the Modify Park window in exactly this way, no action needs to be taken.
However, if you're worried that you might accidentally follow that sequence at some point, we recommend that you use the Modify Park window to assign the Comfortable setting to these two parks.
Changing the setting from Cool to Comfortable will not have an impact on your simulation results. Both parks were right on the boundary between these two settings and we could just as easily have coded them as Comfortable in the first place.
The Cool setting has been supported within the Diamond Mind game engine for many years even though it doesn't appear on the Modify Park window and has almost never been used. (Prior to this year, the only other park given this setting was Candlestick Park in 1999.) We will add the Cool setting to the Modify Park window for version 9.
Here are a few tips regarding the use of this season disk:
1. We have prepared four notes that you can view through the Notes page of the Organizer window. We recommend that you read these notes soon, as they contain useful information that may answer questions you might have about using the season disk, the statistics and ratings on the disk, and what you can expect when you start playing games with it.
2. The 2002 Season Disk is shipped with the real-life transactions and game-by-game starting lineups feature turned on, real-life opening day rosters (meaning that players who were disabled on opening day in real life are also disabled on this season disk), and the "as-played" 2002 schedule installed. By "as-played", we mean that postponed games are listed on the dates they were actually played.
The use of real-life transactions and lineups requires that the rosters and schedule be exactly as they were in real life. Feel free to change rosters or switch to the original ("as-scheduled") schedule, but if you do, remember to change the settings in your organization or league so the use of real-life transactions and lineups is turned off.
3. The season disk includes multiple player records for anyone who appeared with more than one team this year. These players have one record for each team and one combined record that reflects their overall performance.
If you wish to release all players into free agency and draft new rosters from scratch, start by using the "Release all players" command and then use "Delete team-specific records". Both commands can be found on the Tools menu.
If you don't run the "Delete team-specific records" command, these multi-team players will be drafted more than once. And this command must be used AFTER releasing the players, because it deletes those team-specific records from the list of free agents, not from team rosters.
4. If you ran a draft league using our 2001 Season Disk, remember that you can use the Migrate command on the File menu to automatically set up the 2002 Season Disk with the structure of your league and your team rosters. See the DMB help system for more information on how to use the Migrate feature.
If you use Migrate, remember that:
a) the "source" database is your 2001 league database and your "target" database is the 2002 Season Disk. (You can install the 2002 Season Disk more than once if you want to migrate your league to one copy and have another with the real-life rosters still intact.)
b) Migrate does not assign home parks to each team, so you'll have to do that yourself.
c) When Migrate is placing a multi-team player on a roster, it's the combined record that is used. His team-specific records for the 2002 season are placed in the free agent pool. Use the "Delete team-specific records" command on the Tools menu to remove them before running a draft.
d) Migrate does not create manager profiles, so you'll need to generate new ones or use the "Roster / manager profile" window to set them up the way you want before playing games.
5. Before starting a season, take a look at the organization and leagues options. The disk ships with the generation of game-by-game stats turned on, but game accounts, boxscores and scoresheets turned off.
If you want faster autoplay results and you don't care about being able to look at batting logs, pitching logs, or reports based on time intervals, turn off the generation of game-by-game stats.
If you run a league and you're planning to use the Transfer features to exchange game results, statistics, and manager profiles with the managers in that league, you'll need to turn on the generation of game accounts.
And you may want to turn on the automatic generation of boxscores and scoresheets.
6. If you plan to set up a pair of leagues whose champions will meet in a "world series" at the end of your postseason, remember to create an organization to link those leagues BEFORE your season begins. DMB won't allow you to create an organization after the season starts, and you'll need that organization in place to take full advantage of the game's support for postseason play.
In November and December, we added these new articles to our web site:
- a list of all of the players who made their big-league debuts this season, along with their batting or pitching stats for 2002
- a recap of the preseason predictions that were made by various pundits and publications, along with accuracy rankings for 2002 and for the past several seasons
- an evaluation of the offensive production each team received from players at each position, providing an interesting look at each team's offensive strengths and weaknesses.
- our annual review of the Gold Glove selections along with a substantially revised edition of our Evaluating Defense article.
- a new way to look at how efficient each team was in the real-life 2002 season. To measure efficiency, we used (a) the familiar Bill James Pythagorean method that shows how each team's win/loss record related to the runs it scored and allowed and (b) a new statistic that we're calling Run Efficiency Average (REA). By relating offensive events (hits, extra-base hits, walks) to runs scored, REA measures how efficiently a team turned those events into runs and how efficiently it prevented the other team from scoring. For both the Pythagorean method and REA, we look at several decades of history to see what we can learn about the chances for each team in 2003.
Two of these articles were published by ESPN.com. All of them can be found on our web site by clicking on the "Baseball Articles" link that appears in the banner at the top of our web pages.
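For readers who haven't seen the Pythagorean method mentioned above, here is a minimal sketch in Python. It assumes the classic form of the Bill James formula with an exponent of 2; the function names and the sample team are made up for illustration and are not taken from the article or from the DMB game. REA is not shown because its formula isn't given in this newsletter.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed.

    Assumes the classic exponent of 2; other exponents are sometimes used.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def wins_above_expected(wins, losses, runs_scored, runs_allowed):
    """Actual wins minus the wins the Pythagorean method would predict."""
    games = wins + losses
    return wins - pythagorean_pct(runs_scored, runs_allowed) * games


# Hypothetical team (not real 2002 data): 90-72, 800 runs scored, 700 allowed.
print(round(pythagorean_pct(800, 700), 3))
print(round(wins_above_expected(90, 72, 800, 700), 1))
```

A positive result from wins_above_expected means the team won more games than its run totals would suggest, and a negative result means it won fewer.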
In response to comments by Tim McCarver during two postseason telecasts, Dave Smith of Retrosheet (www.retrosheet.org) posted some very interesting analysis to SABR's online forum. Dave was gracious enough to give us permission to share the following with our newsletter subscribers . . .
Here is the analysis that I mentioned on SABR-L yesterday concerning the consequences of starting an inning with a walk. I have three tables of data which address the basic topic in different ways.
Recall that the immediate impetus was [another SABR member's] quote of Tim McCarver, who said on Sunday night's broadcast something to the effect that "there are more multirun innings that begin with a walk".
Last week, during one of the LCS games, McCarver asserted that "the one thing I would tell a young pitcher is 'never walk the leadoff man, he *always* scores; he *always* scores'" (repetition and emphasis in the original).
I examined the second of these two quotes in 1998 at the request of the San Diego Padres, although for the life of me I do not recall what use, if any, they made of what I gave them. I have expanded my data set since that 1998 study and for the present report I checked every game from 1974 through 2002. This 29-year period covered 61,365 games and 1,101,019 half-innings. There were over 4.5 million plate appearances in these games.
Table 1. For all methods for a leadoff batter to reach base, this table shows the number of times each event occurred, the number of times that batter scored, and the frequency of each. Note that the "E" category includes all times the leadoff batter reached on an error, which includes those cases when he went past first. The frequency for batters with leadoff walks scoring is insignificantly different from the frequency for leadoff singles; both are a tiny bit lower than the value for reaching via a hit by pitch.
CONCLUSION: A leadoff batter who walks does NOT "always score"; the walk has the same effect as the other ways to reach first base.
         Reach    Score     Freq
1B      183468    72841     .397
2B       48364    30961     .640
3B        6573     5753     .875
HR       27205    27205    1.000
BB       82637    33002     .399
HP        6217     2543     .409
INT         81       22     .272
E        12105     5298     .438
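As an editorial aside, one way to see why the walk and single frequencies in Table 1 count as "insignificantly different" is to attach a standard error to each rate. The sketch below uses the ordinary binomial standard error and a two-sample z-score. That technique is our assumption for illustration, not something described in the analysis itself, and the function names are made up. The counts are copied from Table 1.

```python
from math import sqrt

# (times reached, times scored) copied from Table 1.
TABLE_1 = {
    "1B": (183468, 72841),
    "BB": (82637, 33002),
    "HP": (6217, 2543),
    "INT": (81, 22),
}


def scoring_rate(reached, scored):
    """Fraction of leadoff batters reaching this way who went on to score."""
    return scored / reached


def standard_error(reached, scored):
    """Binomial standard error of the scoring rate (an assumed model)."""
    p = scored / reached
    return sqrt(p * (1 - p) / reached)


def z_score(a, b):
    """Difference in scoring rates between two rows, in standard-error units."""
    pa, pb = scoring_rate(*a), scoring_rate(*b)
    se = sqrt(standard_error(*a) ** 2 + standard_error(*b) ** 2)
    return (pa - pb) / se


for event, row in TABLE_1.items():
    print(f"{event:<4}{scoring_rate(*row):.3f} +/- {standard_error(*row):.3f}")

print("BB vs 1B z-score:", round(z_score(TABLE_1["BB"], TABLE_1["1B"]), 2))
```

Under this model, the interference row, with only 81 cases, carries a far larger standard error than the walk or single rows.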
Table 2. For all possible outcomes for leadoff batters (the 8 categories from Table 1 plus making an out), this shows the number of times the indicated number of runs were scored. For example, batters led off an inning with a single 183,468 times, and in 104,074 of those innings the batting team did not score. One run was scored 35,868 times, two runs on 22,726 occasions, and so on, with all innings of six or more runs combined.
         Total        0        1        2        3        4        5       >5
1B      183468   104074    35868    22726    11329     5375     2415     1681
2B       48364    17671    17657     6772     3427     1632      683      522
3B        6573      984     3696     1019      467      228      101       78
HR       27205        0    19690     4130     1816      871      386      312
BB       82637    46794    15837    10481     5167     2503     1100      755
HP        6217     3453     1209      776      427      203       93       56
INT         81       56        9        7        6        1        0        2
E        12105     6427     2726     1580      744      355      159      114
OUT     734369   616379    70656    28839    11379     4441     1679      996
Total  1101019   795838   167348    76330    34762    15609     6616     4516
These raw totals are not easy to compare, especially since the various outcomes occur with very different frequencies. Therefore, I created Table 3.
Table 3 takes the data from Table 2 and normalizes it per number of occurrences of each outcome. For example, a leadoff single led to no runs with a frequency of .567 (56.7%), one run was scored after the leadoff single with a frequency of .196, etc.
CONCLUSION: The values for leadoff singles and leadoff walks are virtually indistinguishable. The hit by pitch data are only slightly lower in the "no runs" category.
           0      1      2      3      4      5     >5
1B      .567   .196   .124   .061   .029   .013   .009
2B      .365   .365   .140   .070   .033   .014   .010
3B      .150   .562   .155   .071   .034   .015   .011
HR      .000   .724   .152   .066   .032   .014   .011
BB      .566   .192   .127   .062   .030   .013   .009
HP      .555   .194   .125   .068   .032   .014   .009
INT     .691   .111   .086   .074   .012      0   .024
E       .531   .225   .131   .061   .029   .013   .009
OUT     .839   .096   .039   .015   .006   .002   .001
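The normalization behind Table 3 is simple enough to reproduce. The sketch below, a minimal illustration rather than the code actually used for the study, divides each Table 2 count by its row total. It also adds a helper of our own, multi_run_share, that combines the columns for two or more runs, since the first McCarver quote was about multirun innings. Only three rows are included to keep it short; the counts are copied from Table 2.

```python
# Innings with 0, 1, 2, 3, 4, 5, and >5 runs, copied from Table 2.
TABLE_2 = {
    "1B": [104074, 35868, 22726, 11329, 5375, 2415, 1681],
    "BB": [46794, 15837, 10481, 5167, 2503, 1100, 755],
    "OUT": [616379, 70656, 28839, 11379, 4441, 1679, 996],
}


def normalize(counts):
    """Turn a row of counts into frequencies that sum to 1 (Table 3 style)."""
    total = sum(counts)
    return [c / total for c in counts]


def multi_run_share(counts):
    """Fraction of innings in which two or more runs scored."""
    return sum(counts[2:]) / sum(counts)


for outcome, counts in TABLE_2.items():
    freqs = "  ".join(f"{f:.3f}" for f in normalize(counts))
    print(f"{outcome:<5}{freqs}   multi-run: {multi_run_share(counts):.3f}")
```

Adding the remaining rows from Table 2 to the dictionary extends the output to the full table.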
OVERALL CONCLUSION: Both of McCarver's assertions are clearly contradicted by this huge body of evidence. Having the leadoff batter reach base is certainly an advantage for the offense (compare the values for the "OUT" row in Table 3). The data for reaching on interference are far too limited to be useful. When the leadoff man collects an extra base hit or reaches on an error (with the occasional cases of going past first on the error included), it is even better than reaching first, as expected. However, if we just look at those instances when the leadoff batter reaches first, then it does not matter how he got there.
SUMMARY and personal views: Even if we allow Tim some poetic license for his hyperbole (it is his job, after all), we do not need to accept his opinion as authoritative. I have great respect for anyone who played in the Major Leagues for 22 years, as McCarver did. However, anecdotal observations and gut feelings are just that and have no inherent credibility, no matter what the source. Since we can now check these opinions with evidence, and McCarver definitely has at his disposal the talents of people who can do such checking, we should expect him and other announcers to get it right.
Andruw Jones was my pick for NL Defensive Player of the Year in 1998. And among players with at least 1500 outfield innings from 1997 to 2002, Jones ranks second in the majors in putouts per game despite playing about 20% of his games behind a ground ball machine named Greg Maddux. (The top six on this list are Torii Hunter, Andruw Jones, Mike Cameron, Chris Singleton, Darin Erstad, and Tsuyoshi Shinjo.)
But relative to the norms for his position, Jones was making a lot more plays in 1997 and 1998 than in 2001 and 2002, and this year he was only 10th in putouts per nine innings among players with at least 500 innings. Was that ranking depressed by the ground ball nature of Atlanta's staff? Not really. Atlanta's pitchers were 7th in the league in ground ball percentage, only a few points above the league average.
We again rated Jones as one of the best center fielders in the game. But there's a big difference between being one of the best and being head-and-shoulders better than anyone in the game today and perhaps the best who ever played the position. That's how he looked four years ago and how he is widely regarded by the media.
In fact, the arc of Jones' career as a hitter and a fielder is more consistent with someone who's 29 years old than his listed age of 25. It's widely believed that hitters peak around age 27, and my experience in doing fielding analysis for the past 15 years suggests that defensive range tends to peak around 24 or 25. (Error rates may improve later, but the ability to get to the ball tends to peak early.) Jones had his best year at the plate in 2000 and his best defensive year in 1998.
Of course, there are at least three reasons why this pattern could mean absolutely nothing. One, there's no evidence that I know of to suggest that his listed age isn't legit. Two, things will look very different if Andruw has a monster season in the next year or two. And, three, there are plenty of other players who don't fit the "normal" career arc of building to a peak around age 27, staying on that plateau for a while, and declining slowly after that. After all, Barry Bonds has been a better hitter in his late thirties than at any other time in his career.
But if it did come out that Andruw was really 22 or 23, not 19, when we first saw him in 1996, our perception of his career to date and his potential for future growth would be very different. We would be less surprised that he was able to hold his own at the big-league level at such a young age. Many of us would stop thinking that he's due to take another great leap forward and start thinking that we may have already seen him at his peak.
My point isn't that Andruw IS older than we think. As I said, I have absolutely no evidence upon which to conclude that. I'm just saying that his career LOOKS more like that of an older player SO FAR.
My point is that age has become a major factor in the thinking of baseball analysts, perhaps too much so. Hundreds of player ages have been revised in the last year or two, which makes me wonder whether there are more that haven't been discovered, past and present. And even if we could assume that all published birthdates are correct, many players don't follow the "normal" career arc anyway.
Maybe it's time for baseball analysts (including us) to find ways other than age to assess the past and likely future path of a player's career.