DMB News December 2002

Diamond Mind Email Newsletter

December 13, 2002
Written by Tom Tippett

Welcome to the sixth edition of the Diamond Mind email newsletter for the year 2002. Through these newsletters, we will try to keep you up to date on the latest product and technical information about the Diamond Mind Baseball game, related player disks, and our ongoing baseball research efforts. Back issues are available on our website.

Topics for this issue:

2002 Season Disk now shipping
2002 Season Disk update
Tips for using the 2002 Season Disk
Several new articles available
Leadoff walks
How age affects our view of the game

2002 Season Disk now shipping

We're happy to report that we began shipping the 2002 Season Disk on schedule earlier this week, and all advance orders have been shipped. If you ordered in advance for overnight or email delivery, the season disk should already be in your hands. And if you requested delivery by first class mail, priority mail, or air mail, your package should arrive by the end of next week.

2002 Season Disk update

Earlier today, we updated our master copy of the 2002 Season Disk to make a minor change to two ballparks.

If your copy of the season disk was shipped on or after December 13th, you already have the new version and you don't need to do anything. If you received the original version, please read the following so you can decide what action, if any, you wish to take.

The weather this year was unusually cool in Pacific Bell Park in San Francisco and Network Associates Coliseum in Oakland. As a result, we assigned these parks an average temperature of Cool.

This setting has been supported by the game for many years and does not cause any problems on its own. But it can lead to a serious problem if you follow the steps described in the next paragraph.

The Modify Park window has a drop-down list of available temperature settings, but Cool is not one of the options on that list. As a result, if you choose to Modify one of these parks and click on the weather tab, you will see a blank value for average temperature. If you leave it blank and click OK, an invalid temperature setting will be stored, and you'll see ridiculously high scores for any future games played in that park. (If you choose another temperature setting before clicking OK, or if you click Cancel instead, you're fine.)

There's nothing wrong with the Cool setting other than the fact that the Modify Park window doesn't recognize it. While testing the season disk, we ran well over a hundred simulated seasons with excellent results. So if you don't think you'll ever use the Modify Park window in exactly this way, no action needs to be taken.

However, if you're worried that you might accidentally follow that sequence at some point, we recommend that you use the Modify Park window to assign the Comfortable setting to these two parks.

Changing the setting from Cool to Comfortable will not have an impact on your simulation results. Both parks were right on the boundary between these two settings and we could just as easily have coded them as Comfortable in the first place.

The Cool setting has been supported within the Diamond Mind game engine for many years even though it doesn't appear on the Modify Park window and has almost never been used. (Prior to this year, the only other park given this setting was Candlestick Park in 1999.) We will add the Cool setting to the Modify Park window for version 9.

Tips for using the 2002 Season Disk

Here are a few tips regarding the use of this season disk:

1. We have prepared four notes that you can view through the Notes page of the Organizer window. We recommend that you read these notes soon, as they contain useful information that may answer questions you might have about using the season disk, the statistics and ratings on the disk, and what you can expect when you start playing games with it.

2. The 2002 Season Disk is shipped with the real-life transactions and game-by-game starting lineups feature turned on, real-life opening day rosters (meaning that players who were disabled on opening day in real life are also disabled on this season disk), and the "as-played" 2002 schedule installed. By "as-played", we mean that postponed games are listed on the dates they were actually played.

The use of real-life transactions and lineups requires that the rosters and schedule be exactly as they were in real-life. Feel free to change rosters or switch to the original ("as-scheduled") schedule, but if you do, remember to change the settings in your organization or league so the use of real-life transactions and lineups is turned off.

3. The season disk includes multiple player records for anyone who appeared with more than one team this year. These players have one record for each team and one combined record that reflects their overall performance.

If you wish to release all players into free agency and draft new rosters from scratch, start by using the "Release all players" command and then use "Delete team-specific records". Both commands can be found on the Tools menu.

If you don't run the "Delete team-specific records" command, these multi-team players will be drafted more than once. And this command must be used AFTER releasing the players, because it deletes those team-specific records from the list of free agents, not from team rosters.

4. If you ran a draft league using our 2001 Season Disk, remember that you can use the Migrate command on the File menu to automatically set up the 2002 Season Disk with the structure of your league and your team rosters. See the DMB help system for more information on how to use the Migrate feature.

If you use Migrate, remember that:

a) The "source" database is your 2001 league database and your "target" database is the 2002 Season Disk. (You can install the 2002 Season Disk more than once if you want to migrate your league to one copy and have another with the real-life rosters still intact.)

b) Migrate does not assign home parks to each team, so you'll have to do that yourself.

c) When Migrate is placing a multi-team player on a roster, it's the combined record that is used. His team-specific records for the 2002 season are placed in the free agent pool. Use the "Delete team-specific records" command on the Tools menu to remove them before running a draft.

d) Migrate does not create manager profiles, so you'll need to generate new ones or use the "Roster / manager profile" window to set them up the way you want before playing games.

5. Before starting a season, take a look at the organization and league options. The disk ships with the generation of game-by-game stats turned on, but game accounts, boxscores and scoresheets turned off.

If you want faster autoplay results and you don't care about being able to look at batting logs, pitching logs, or reports based on time intervals, turn off the generation of game-by-game stats.

If you run a league and you're planning to use the Transfer features to exchange game results, statistics, and manager profiles with the managers in that league, you'll need to turn on the generation of game accounts.

And you may want to turn on the automatic generation of boxscores and scoresheets.

6. If you plan to set up a pair of leagues whose champions will meet in a "world series" at the end of your postseason, remember to create an organization to link those leagues BEFORE your season begins. DMB won't allow you to create an organization after the season starts, and you'll need that organization in place to take full advantage of the game's support for postseason play.

Several new articles available

In November and December, we added these new articles to our web site:

- a list of all of the players who made their big-league debuts this season, along with their batting or pitching stats for 2002

- a recap of the preseason predictions that were made by various pundits and publications, along with accuracy rankings for 2002 and for the past several seasons

- an evaluation of the offensive production each team received from players at each position, providing an interesting look at each team's offensive strengths and weaknesses

- our annual review of the Gold Glove selections along with a substantially revised edition of our Evaluating Defense article

- a new way to look at how efficient each team was in the real-life 2002 season. To measure efficiency, we used (a) the familiar Bill James Pythagorean method that shows how each team's win/loss record related to the runs it scored and allowed and (b) a new statistic that we're calling Run Efficiency Average (REA). By relating offensive events (hits, extra-base hits, walks) to runs scored, REA measures how efficiently a team turned those events into runs and how efficiently it prevented the other team from scoring. For both the Pythagorean method and REA, we look at several decades of history to see what we can learn about the chances for each team in 2003.
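For readers who want to try the Pythagorean method themselves, here is a minimal Python sketch. It assumes the commonly cited form of the Bill James formula, with expected winning percentage equal to runs scored squared divided by the sum of runs scored squared and runs allowed squared. The function names and the sample run totals are hypothetical, and the article on our web site may use a different exponent or additional adjustments. REA is not sketched here because its formula is not given in this newsletter.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed.

    Uses the commonly cited Bill James form with an exponent of 2;
    the exponent is a parameter because other values are sometimes used.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def wins_above_expectation(wins, losses, runs_scored, runs_allowed):
    """Actual wins minus the wins implied by the Pythagorean estimate."""
    games = wins + losses
    expected_wins = pythagorean_win_pct(runs_scored, runs_allowed) * games
    return wins - expected_wins


if __name__ == "__main__":
    # Hypothetical team: 800 runs scored, 700 allowed, 95-67 record.
    print(round(pythagorean_win_pct(800, 700), 3))
    print(round(wins_above_expectation(95, 67, 800, 700), 1))
```

A positive value from wins_above_expectation means the team won more games than its run totals would suggest, which is one way to read "efficiency" in this sense.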

Two of these articles were published by ESPN.com. All of them can be found on our web site by clicking on the "Baseball Articles" link that appears in the banner at the top of our web pages.

Leadoff walks

In response to comments by Tim McCarver during two postseason telecasts, Dave Smith of Retrosheet (www.retrosheet.org) posted some very interesting analysis to SABR's online forum. Dave was gracious enough to give us permission to share the following with our newsletter subscribers . . .

Here is the analysis that I mentioned on SABR-L yesterday concerning the consequences of starting an inning with a walk. I have three tables of data which address the basic topic in different ways.

Recall that the immediate impetus was [another SABR member's] quote of Tim McCarver, who said on Sunday night's broadcast something to the effect that "there are more multirun innings that begin with a walk".

Last week, during one of the LCS games, McCarver asserted that "the one thing I would tell a young pitcher is 'never walk the leadoff man, he *always* scores; he *always* scores'" (repetition and emphasis in the original).

I examined the second of these two quotes in 1998 at the request of the San Diego Padres, although for the life of me I do not recall what use, if any, they made of what I gave them. I have expanded my data set since that 1998 study and for the present report I checked every game from 1974 through 2002. This 29-year period covered 61,365 games and 1,101,019 half-innings. There were over 4.5 million plate appearances in these games.

Table 1. For each way a leadoff batter can reach base, this table shows the number of times the event occurred, the number of times that batter scored, and the resulting frequency. Note that the "E" category includes all times the leadoff batter reached on an error, including those cases when he went past first. The scoring frequency after a leadoff walk is not significantly different from the frequency after a leadoff single; both are a tiny bit lower than the value for reaching via a hit by pitch.

CONCLUSION: A leadoff batter who walks does NOT "always score"; the walk has the same effect as the other ways to reach first base.

       Reach  Score   Freq
1B    183468  72841   .397
2B     48364  30961   .640
3B      6573   5753   .875
HR     27205  27205  1.000
BB     82637  33002   .399
HP      6217   2543   .409
INT       81     22   .272
E      12105   5298   .438
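The Table 1 frequencies, and the claim that walks and singles are not significantly different, can be checked with a short Python sketch. The counts are copied from Table 1. The two-proportion z-test is a standard textbook technique swapped in here for illustration; it is an assumption, not necessarily the method used in the original analysis, and the function names are hypothetical.

```python
import math

# (times reached, times scored) for leadoff batters, copied from Table 1.
TABLE_1 = {
    "1B": (183468, 72841),
    "BB": (82637, 33002),
    "HP": (6217, 2543),
}


def scoring_freq(event):
    """Fraction of leadoff batters reaching via `event` who went on to score."""
    reached, scored = TABLE_1[event]
    return scored / reached


def two_proportion_z(event_a, event_b):
    """Pooled two-proportion z statistic for freq(event_b) - freq(event_a)."""
    n_a, x_a = TABLE_1[event_a]
    n_b, x_b = TABLE_1[event_b]
    pooled = (x_a + x_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (x_b / n_b - x_a / n_a) / std_err


if __name__ == "__main__":
    for event in TABLE_1:
        print(event, round(scoring_freq(event), 3))
    print("z (BB vs 1B):", round(two_proportion_z("1B", "BB"), 2))
```

Under the usual convention, an absolute z value below about 1.96 means the difference is not statistically significant at the 5% level.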

Table 2. For all possible outcomes for leadoff batters (the 8 categories from Table 1 plus making an out), this shows the number of times the indicated number of runs were scored. For example, batters led off an inning with a single 183,468 times, and in 104,074 of those innings the batting team did not score. One run was scored 35,868 times, two runs on 22,726 occasions, and so on, with all innings of six or more runs combined.

        Total      0      1      2      3      4      5     >5
1B     183468 104074  35868  22726  11329   5375   2415   1681
2B      48364  17671  17657   6772   3427   1632    683    522
3B       6573    984   3696   1019    467    228    101     78
HR      27205      0  19690   4130   1816    871    386    312
BB      82637  46794  15837  10481   5167   2503   1100    755
HP       6217   3453   1209    776    427    203     93     56
INT        81     56      9      7      6      1      0      2
E       12105   6427   2726   1580    744    355    159    114
OUT    734369 616379  70656  28839  11379   4441   1679    996
Total 1101019 795838 167348  76330  34762  15609   6616   4516

These raw totals are not easy to compare, especially since the various outcomes occur with very different frequencies. Therefore, I created Table 3.

Table 3 takes the data from Table 2 and normalizes it per number of occurrences of each outcome. For example, a leadoff single led to no runs with a frequency of .567 (56.7%), one run was scored after the leadoff single with a frequency of .196, etc.

CONCLUSION: The values for leadoff singles and leadoff walks are virtually indistinguishable. The hit by pitch data are only slightly lower in the "no runs" category.

        0     1     2     3     4     5    >5
1B   .567  .196  .124  .061  .029  .013  .009
2B   .365  .365  .140  .070  .033  .014  .010
3B   .150  .562  .155  .071  .034  .015  .011
HR   .000  .724  .152  .066  .032  .014  .011
BB   .566  .192  .127  .062  .030  .013  .009
HP   .555  .194  .125  .068  .032  .014  .009
INT  .691  .111  .086  .074  .012  .000  .024
E    .531  .225  .131  .061  .029  .013  .009
OUT  .839  .096  .039  .015  .006  .002  .001
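The normalization that turns Table 2 into Table 3 can be sketched in Python as follows. The counts are copied from Table 2, and the helper names are hypothetical. Each row is simply divided by its own total; a few values may differ from the published table in the third decimal place because of rounding. The multi_run_share helper is an added illustration that speaks to the "multirun innings" quote by summing the columns for two or more runs.

```python
# Innings with 0, 1, 2, 3, 4, 5, and more than 5 runs, copied from Table 2.
TABLE_2 = {
    "1B": [104074, 35868, 22726, 11329, 5375, 2415, 1681],
    "2B": [17671, 17657, 6772, 3427, 1632, 683, 522],
    "3B": [984, 3696, 1019, 467, 228, 101, 78],
    "HR": [0, 19690, 4130, 1816, 871, 386, 312],
    "BB": [46794, 15837, 10481, 5167, 2503, 1100, 755],
    "HP": [3453, 1209, 776, 427, 203, 93, 56],
    "INT": [56, 9, 7, 6, 1, 0, 2],
    "E": [6427, 2726, 1580, 744, 355, 159, 114],
    "OUT": [616379, 70656, 28839, 11379, 4441, 1679, 996],
}


def normalize(counts):
    """Convert a row of counts into frequencies that sum to 1."""
    total = sum(counts)
    return [count / total for count in counts]


# Frequency of each run total, per leadoff event (the Table 3 layout).
FREQS = {event: normalize(counts) for event, counts in TABLE_2.items()}


def multi_run_share(event):
    """Share of innings with two or more runs after the given leadoff event."""
    return sum(FREQS[event][2:])


if __name__ == "__main__":
    for event, row in FREQS.items():
        print(f"{event:<4}" + "".join(f"{freq:7.3f}" for freq in row))
    print("multi-run share, BB:", round(multi_run_share("BB"), 3))
    print("multi-run share, 1B:", round(multi_run_share("1B"), 3))
```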

OVERALL CONCLUSION: Both of McCarver's assertions are clearly contradicted by this huge body of evidence. Having the leadoff batter reach base is certainly an advantage for the offense (compare the values for the "OUT" row in Table 3). The data for reaching on interference are far too limited to be useful. When the leadoff man collects an extra base hit or reaches on an error (with the occasional cases of going past first on the error included), it is even better than reaching first, as expected. However, if we just look at those instances when the leadoff batter reaches first, then it does not matter how he got there.

SUMMARY and personal views: Even if we allow Tim some poetic license for his hyperbole (it is his job, after all), we do not need to accept his opinion as authoritative. I have great respect for anyone who played in the Major Leagues for 22 years, as McCarver did. However, anecdotal observations and gut feelings are just that and have no inherent credibility, no matter what the source. Since we can now check these opinions with evidence, and McCarver definitely has at his disposal the talents of people who can do such checking, we should expect him and other announcers to get it right.

Dave Smith

How age affects our view of the game

Andruw Jones was my pick for NL Defensive Player of the Year in 1998. And among players with at least 1500 outfield innings from 1997 to 2002, Jones ranks second in the majors in putouts per game despite playing about 20% of his games behind a ground ball machine named Greg Maddux. (The top six on this list are Torii Hunter, Andruw Jones, Mike Cameron, Chris Singleton, Darin Erstad, and Tsuyoshi Shinjo.)

But relative to the norms for his position, Jones was making a lot more plays in 1997 and 1998 than in 2001 and 2002, and this year he was only 10th in putouts per nine innings among players with at least 500 innings. Was that ranking depressed by the ground ball nature of Atlanta's staff? Not really. Atlanta's pitchers were 7th in the league in ground ball percentage, only a few points above the league average.

We again rated Jones as one of the best center fielders in the game. But there's a big difference between being one of the best and being head-and-shoulders better than anyone in the game today and perhaps the best who ever played the position. That's how he looked four years ago and how he is still widely regarded by the media.

In fact, the arc of Jones' career as a hitter and a fielder is more consistent with someone who's 29 years old than his listed age of 25. It's widely believed that hitters peak around age 27, and my experience in doing fielding analysis for the past 15 years suggests that defensive range tends to peak around 24 or 25. (Error rates may improve later, but the ability to get to the ball tends to peak early.) Jones had his best year at the plate in 2000 and his best defensive year in 1998.

Of course, there are at least three reasons why this pattern could mean absolutely nothing. One, there's no evidence that I know of to suggest that his listed age isn't legit. Two, things will look very different if Andruw has a monster season in the next year or two. And, three, there are plenty of other players who don't fit the "normal" career arc of building to a peak around age 27, staying on that plateau for a while, and declining slowly after that. After all, Barry Bonds has been a better hitter in his late thirties than at any other time in his career.

But if it did come out that Andruw was really 22 or 23, not 19, when we first saw him in 1996, our perception of his career to date and his potential for future growth would be very different. We would be less surprised that he was able to hold his own at the big-league level at such a young age. Many of us would stop thinking that he's due to take another great leap forward and start thinking that we may have already seen him at his peak.

My point isn't that Andruw IS older than we think. As I said, I have absolutely no evidence upon which to conclude that. I'm just saying that his career LOOKS more like that of an older player SO FAR.

My point is that age has become a major factor in the thinking of baseball analysts, perhaps too much so. Hundreds of player ages have been revised in the last year or two, which makes me wonder whether there are more that haven't been discovered, past and present. And even if we could assume that all published birthdates are correct, many players don't follow the "normal" career arc anyway.

Maybe it's time for baseball analysts (including us) to find better ways, beyond age alone, to assess the past and likely future path of a player's career.