DMB News October 2002

Diamond Mind Email Newsletter

October 16, 2002
Written by Tom Tippett

Welcome to the fifth edition of the Diamond Mind email newsletter for the year 2002. Through these newsletters, we will try to keep you up to date on the latest product and technical information about the Diamond Mind Baseball game, related player disks, and our ongoing baseball research efforts. Back issues are available on our website.

Topics for this issue:

2002 Season Disk / October mailing
New articles
NetMeeting tip
Win Shares question
Professor Carl Morris and Barry Bonds

2002 Season Disk / October mailing

We have started work on the 2002 Season Disk, which will be ready for shipment on or before December 10th, and we are now taking advance orders.

As usual, you'll receive a ton of information with this season disk, including everything you need to start playing games immediately upon installation:

- full rosters with every player who appeared in the big leagues this season

- official batting, pitching and fielding statistics, including left/right splits for all batters and pitchers and modern statistics such as inherited runners, holds, blown saves, pickoffs, and stolen bases versus pitchers and catchers

- games started by position versus left- and right-handed pitchers

- updated park factors

- a full set of real-life transactions and game-by-game lineups for season replays

- two schedules: the original (as-scheduled) schedule and another (as-played) reflecting rainouts and other rescheduled games

- real-life salaries for all players

- complete manager profiles for all teams

The 2002 Season Disk will be available only in version 8 format.

You can place a credit card order now, either through our web store (follow the link from www.diamond-mind.com) or by calling us at 800-400-4803 during business hours (9-5 Eastern time, Mon-Fri).

As we do each year at this time, we are sending a mailing to all current customers and newsletter subscribers. The mailing includes an order form and postage-paid reply envelope for those of you who prefer to order by mail. Your letter should arrive in the next three weeks or so. If you'd rather not wait, you can print an order form via the "How to Order" page of our web site and mail it in.

New articles

At the beginning of this year's playoffs, ESPN.com published our latest article, "May the best team win ... at least some of the time". In that article, we describe a model for assessing the probability that each team will advance through the playoffs given its regular-season record, emphasizing that it's very difficult, even for the teams with the best records, to make it through three rounds against very good opponents. A copy of the article is on our web site.
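To give a feel for why three rounds are so hard to survive, here is a minimal sketch in Python. It is not the model from the article. It assumes a single, fixed chance of winning each game (the 55% used below is a hypothetical figure, not taken from the article) and the current format of one best-of-five round followed by two best-of-seven rounds. The function names are made up for this illustration.

```python
from math import comb


def series_win_prob(p_game: float, wins_needed: int) -> float:
    """Chance of winning a series that ends when one side has wins_needed wins.

    Assumes every game is won independently with probability p_game
    (a simplifying assumption, not the article's model).
    """
    q = 1.0 - p_game
    # Sum over k = number of losses suffered before the clinching win.
    return sum(
        comb(wins_needed - 1 + k, k) * p_game**wins_needed * q**k
        for k in range(wins_needed)
    )


def advance_prob(p_game: float, rounds=(3, 4, 4)) -> float:
    """Chance of winning every round; rounds lists the wins needed in each."""
    prob = 1.0
    for wins_needed in rounds:
        prob *= series_win_prob(p_game, wins_needed)
    return prob


if __name__ == "__main__":
    # Hypothetical team that wins 55% of its games against every opponent.
    print(round(advance_prob(0.55), 3))
```

Under these assumptions, a team with a 55% edge in every single game still wins all three rounds only about 22% of the time, and an evenly matched team does so one time in eight.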

In the coming weeks, we will post several new articles to our web site. Between now and the end of November, you'll see our annual ranking of pre-season predictions, stats for all of the players who made their debuts in 2002, a reprise of our 2001 article ranking the offensive production each team received from players at each position, and our annual Gold Glove commentary.

NetMeeting tip

As many of you know, it's possible to play Diamond Mind Baseball games head-to-head over the internet using a free product from Microsoft called NetMeeting. In case you haven't already seen it, there's a technical note on the use of NetMeeting in the version 8 support area of our web site.

There are two things that make it easy to share the mouse with your opponent. First, when the host of the NetMeeting session is choosing DMB as the program to be shared, he can check the box labeled "Automatically accept requests for control". This makes sure he doesn't have to say "yes" each time the other player wants the mouse.

Second, the guest can ask for control with a menu command, but there's a quicker way to do the same thing. Simply position your mouse cursor (which looks like a circle with a line through it when the host has control) over the host's mouse cursor (which is a grayed-out version of the normal pointer-shaped cursor) and double-click.

All tactics can be chosen using the keyboard so you don't reveal your plans to the other manager by moving the mouse over the buttons on the game window. Because the mouse is needed only for things like substitutions, you can leave the cursor in a central location so it's readily available when you want to take control.

Using these two techniques, you can play a NetMeeting game with a minimum of wasted effort.

Win Shares question

A few months ago, we received this question from a DMB customer:

 

I just finished reading the two new books by Bill James, "The New Historical Abstract" and "Win Shares". In them, he details a new way to evaluate players for both offensive and defensive skills. Do you have plans to incorporate Bill James ratings into your game or do you see flaws in his ratings?

 

We read both books soon after they were published. We're still in the process of evaluating his methods, but we can point out a few things to keep in mind about Bill's approach to rating fielders:

1. He begins by evaluating overall team defense and then tries to break that down and assign credit/blame to positions and then players. We've been doing that for many years.

2. His method is intended to work with players from all eras, so he chose to develop new techniques for coping with the biases inherent in traditional fielding stats. We've been aware of those biases for a long time and have always kept them in mind while evaluating traditional fielding stats.

Bill's system is an attempt to make better estimates of the number of opportunities to make plays and the number of plays made, and it appears that he has come up with at least a few useful ways to do that.

On the other hand, using play-by-play data from the 1990s, we can now count those things directly, and we want to spend some time seeing whether Bill's estimates match up with the actual data for that period.

If they do, he's made a giant contribution to the field, because we can confidently apply his techniques to seasons for which we don't have first-rate play-by-play data. If they don't, we'll have to figure out why and proceed from there.

3. Bill's method is intended to aggregate all aspects of fielding skill into one number, while our goal is to isolate specific skills. We have separate ratings for range, errors and throwing, and we cannot assume that a high number of defensive win shares necessarily indicates a fielder who should get a top range rating. It's possible that his range is average and his value lies in a strong arm and good hands.

4. We're not yet sure about the weights Bill put on different fielding skills when coming up with his fielding win shares. To some extent, that doesn't matter to us because we're more interested in rating the individual components of defense anyway. But as fans of baseball analysis, we're curious to see whether Win Shares really works, so we hope to find time to look at this part of his system, too.

The bottom line is that we will continue to rate fielders for modern seasons based on our analysis of play-by-play data. But we're always on the lookout for new and better ways to evaluate fielders, and if our review suggests that the fielding portion of the Win Shares model provides us with some new tools, we'll use them.

Professor Carl Morris and Barry Bonds

In a recent article for ESPN.com, Alan Schwarz described a model for evaluating offensive production that was developed by Professor Carl Morris of Harvard University. Here's an excerpt:

 

Morris ... has built a way to determine how many runs per game a team of players of various calibers will score.

What Morris has done, he will assert, is devise the first approach that is not an estimate, nor a computer simulation, but a relatively straightforward algebraic formula that comes to an answer that is probabilistically true -- and can be backed up by rigorous mathematical proof.

"This is an exact calculation, not an estimate," says Morris, the former chairman of Harvard's statistics department. "It is correct."

 

Prof. Morris used his formula to compute that a batting order consisting of nine clones of Barry Bonds, each based on Barry's 2002 season, would score 22.4 runs per game, easily the highest total in baseball history.

I found this claim to be very interesting for a couple of reasons. First, I'm always interested in new ideas. Second, in the first half of this year I wrote a program that does something similar.

My program computes the expected number of runs for any sequence of hitters facing any pitcher in any ballpark in any era. My purpose was to create a tool for evaluating baseball strategies such as the bunt, stolen base and intentional walk, and I gave a presentation at the 2002 SABR convention based on the application of this method. (Sometime this winter, we'll publish an article based on that presentation.)

Even though it was developed for a different purpose, there's no reason why our program couldn't be used to evaluate a batting order consisting of nine clones of the same player.

I wouldn't expect our models to give exactly the same results, however. Prof. Morris included all of Barry's intentional walks, while my program excludes them because no manager in his right mind would walk Barry to get to another Barry. And my program took advantage of some things (such as adjusting for era and park effects) that we routinely do in Diamond Mind Baseball.

As Schwarz pointed out in the article, Prof. Morris' model "doesn't take into account park effects or the era in which a hitter plays. Double plays, stolen bases and other outs on the bases are not included, either, because doing so would ... forbid the use of a straightforward formula."
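To make that simplified version of baseball concrete, here is a small Monte Carlo sketch in Python. It is neither Prof. Morris' formula (which is algebraic, not a simulation) nor my expected runs program. It only illustrates the kind of game being modeled: nine identical hitters, no double plays, no steals, and no outs on the bases. The event probabilities are hypothetical and are NOT Barry's actual rates, and the advancement rules (walks move only forced runners, every runner advances as many bases as the batter on a hit) are assumptions made for this example.

```python
import random

# Hypothetical per-plate-appearance probabilities, for illustration only.
EXAMPLE_PROBS = {"out": 0.50, "walk": 0.25, "single": 0.10,
                 "double": 0.05, "triple": 0.00, "homer": 0.10}

HIT_BASES = {"single": 1, "double": 2, "triple": 3, "homer": 4}


def advance(bases, event):
    """Apply one non-out event. bases is (first, second, third) as booleans.

    Returns (new_bases, runs_scored).
    """
    first, second, third = bases
    if event == "walk":
        # Only forced runners move on a walk.
        runs = 1 if (first and second and third) else 0
        return (True, first or second, third or (first and second)), runs
    n = HIT_BASES[event]
    # Batter starts at position 0; runners are at positions 1, 2 and 3.
    positions = [0] + [i + 1 for i, occupied in enumerate(bases) if occupied]
    moved = [pos + n for pos in positions]
    runs = sum(1 for pos in moved if pos >= 4)
    return tuple(base in moved for base in (1, 2, 3)), runs


def runs_per_game(probs, games=2000, innings=9, seed=0):
    """Average runs per game for a lineup of identical hitters.

    probs must give "out" a positive probability, or an inning never ends.
    """
    rng = random.Random(seed)
    events = list(probs)
    weights = [probs[e] for e in events]
    total = 0
    for _ in range(games):
        for _ in range(innings):
            outs, bases = 0, (False, False, False)
            while outs < 3:
                event = rng.choices(events, weights)[0]
                if event == "out":
                    outs += 1
                else:
                    bases, runs = advance(bases, event)
                    total += runs
    return total / games


if __name__ == "__main__":
    print(runs_per_game(EXAMPLE_PROBS))
```

Because every hitter is the same, the batting order never matters here, which is part of what makes the nine-clones case so much simpler than a real lineup.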

Curious about how the two models might compare, I cloned the 2001 version of Bonds 8 times and created a lineup with Bonds from top to bottom. (I would have used 2002 if our season disk were ready.)

Prof. Morris computed that such a lineup would score 17.1 runs per game. Our model puts the number at 14.7 when Bonds is playing at Pac Bell and 16.9 when he's in a neutral park, for an average of 15.8 over the course of a full season.

I gave some thought to using simulation to test the validity of this number, but quickly realized that there are some thorny issues:

- an all-Bonds lineup would be playing with a big lead most of the time, and that would have a big impact on strategy. Neither team would have any reason to bunt or steal bases, and the intentional walk would be out of the question when the Bonds team was up.

- an all-Bonds lineup would be totally left-handed, affecting how the opposition uses its pitching staff.

- an all-Bonds lineup would put enough runners on base to get most opposition starting pitchers to 100 pitches by the third inning. By the end of the first game of any series, this lineup would have used up all of the pitches a modern bullpen normally throws in 3-4 games. So the other team would be seriously short-handed if it wasn't carrying sixteen or seventeen pitchers.

In other words, it's not obvious that we can use simulation as the basis for a scientific study of such an extreme lineup because of these side-effects.

Nevertheless, I autoplayed the 2001 season one time using an all-Bonds lineup, just for fun. I used the Giants pitching staff, the 2001 schedule, changed the league rules to permit the DH so Bonds would bat in all nine spots, and created another Bonds clone to serve as backup catcher so the Bonds who started at catcher wouldn't see his performance suffer because of fatigue.

One simulated season isn't enough to prove anything, of course, but here's how that season turned out:

- Bonds' team cruised to the division title with a 157-5 record. The season began with a 68-game winning streak that was broken by a loss to Mark Mulder and the A's. They also lost one game in Seattle, one game at home against Arizona, one game at the Cubs (with Kerry Wood pitching), and one game at home against Houston (with Wade Miller pitching).

- the lineup averaged 14.8 runs per game, but when you figure that his team hardly ever batted in the bottom of the ninth, that corresponds to a rate of about 15.7 runs per nine innings. Recall that our expected runs program computed that this lineup would average 15.8 runs per nine innings in an even mix of Pac Bell and a neutral park.

- no intentional walks were issued. Bonds was lifted for a pinch hitter a few times, but the pinch hitter was always Bonds-the-backup-catcher, and the move was made only because our computer manager rests superstars in the late innings of a blowout.

- Bonds attempted to steal only about 40% as often as he did in the real 2001 season, mainly because his team was winning big so often.
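The per-nine-innings adjustment in the list above can be checked with a few lines of Python. This is a rough sketch, not the exact calculation I did. It assumes 81 home games in which the team bats only eight innings, 81 road games in which it bats nine, and it ignores extra innings and the handful of losses. The function name is made up for this example.

```python
def runs_per_nine(runs_per_game, home_games=81, road_games=81,
                  home_innings=8, road_innings=9):
    """Convert runs per game to runs per nine innings batted.

    Assumes a fixed number of innings batted at home and on the road.
    """
    games = home_games + road_games
    innings_per_game = (home_games * home_innings
                        + road_games * road_innings) / games
    return runs_per_game * 9 / innings_per_game


# Simulated season: 14.8 runs per game.
simulated = runs_per_nine(14.8)

# Expected runs program: even mix of Pac Bell (14.7) and a neutral park (16.9).
expected = (14.7 + 16.9) / 2

if __name__ == "__main__":
    print(round(simulated, 1), round(expected, 1))
```

With those assumptions the team bats 8.5 innings per game on average, so 14.8 runs per game works out to about 15.7 per nine innings, within a tenth of a run of the 15.8 from the expected runs program.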

It's not surprising that my model and this simulation each came up with a lower number than Prof. Morris did. We adjusted for park effects, which pushes our number up because Barry's home park is tough on lefty hitters. On the other hand, my program and our simulation leave out the intentional walks and account for caught stealing, double plays, and other outs on the bases, all of which pull the number down.

Prof. Morris claims that his model is "exact" and "correct". This may be true for (a) a lineup involving the same player nine times and (b) a simplified version of baseball that excludes outs on the bases and treats intentional walks as regular ones.

I look forward to reading Prof. Morris's paper when it is published on his web site. In the meantime, you should be able to find Alan Schwarz's article at

http://espn.go.com/mlb/columns/schwarz_alan/1436689.html