Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site lasspvax.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!mhuxt!houxm!vax135!cornell!lasspvax!rokhsar
From: rokhsar@lasspvax.UUCP (Dan Rokhsar)
Newsgroups: net.sport.baseball
Subject: World Series Probabilities
Message-ID: <538@lasspvax.UUCP>
Date: Thu, 19-Sep-85 22:17:13 EDT
Article-I.D.: lasspvax.538
Posted: Thu Sep 19 22:17:13 1985
Date-Received: Sat, 21-Sep-85 04:02:31 EDT
Distribution: net
Organization: LASSP, Cornell University
Lines: 59

In preparing a text on probability for nonscientists, a professor of ours
considered the possibility that the two teams in a World Series are equally
likely to win each game, and computed the probabilities that the Series
would go 4, 5, 6 or 7 games based on this assumption.
A newspaper article from 1981 claimed that the Series had been tied at two
games apiece 30 times in the 78 years of the World Series; 30/78 = .38 agrees
well with the 3/8 predicted by the "equal probability" argument.  The article
goes on to say that in 22 out of these 30, the winner of the fifth game won
it all; 22/30 = .73 is in agreement with the 3/4 prediction.
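
The 3/8 and 3/4 figures can be checked with a few lines of Python.  This is
only a sketch under the "equal probability" assumption (each game is an
independent coin flip); the variable names are ours.

```python
from fractions import Fraction
from math import comb

half = Fraction(1, 2)

# Tied 2-2 after four games: one team wins exactly 2 of the first 4.
p_tied = comb(4, 2) * half**4

# The game-5 winner leads 3-2 and loses the Series only by
# dropping both game 6 and game 7.
p_game5_winner_wins = 1 - half**2

print(p_tied, float(p_tied), "observed", round(30 / 78, 3))
print(p_game5_winner_wins, float(p_game5_winner_wins), "observed", round(22 / 30, 3))
```
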
Using the above assumption, the expected number of Series of each length in
a 25-year span can be calculated; comparing with the number of Series of
each length in 1926-50 and in 1951-75, we find

           1926-50   Calculation   1951-75
7-games       7          7.8          15
6-games       5          7.8           3
5-games       7          6.2           4
4-games       6          3.1           3

The 15 seven-game Series in 1951-75 lie more than 3 standard deviations from
the calculated value, and the 3 six-game Series lie over 2 standard
deviations away.
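
For anyone who wants to reproduce the Calculation column and the standard
deviations, here is a minimal sketch.  It assumes each game is an independent
50/50 trial, and it treats the count of n-game Series in 25 years as binomial,
which is our assumption about how the deviations were measured.

```python
from fractions import Fraction
from math import comb, sqrt

def length_probs(p=Fraction(1, 2)):
    """P(Series lasts n games), n = 4..7, if one team wins each game with probability p."""
    q = 1 - p
    # The Series winner takes game n and exactly 3 of the previous n-1 games.
    return {n: comb(n - 1, 3) * (p**4 * q**(n - 4) + q**4 * p**(n - 4))
            for n in range(4, 8)}

N_SERIES = 25
observed_1951_75 = {7: 15, 6: 3, 5: 4, 4: 3}   # from the table above

probs = length_probs()
z = {}
for n in (7, 6, 5, 4):
    mean = float(N_SERIES * probs[n])
    sd = sqrt(float(N_SERIES * probs[n] * (1 - probs[n])))   # binomial standard deviation
    z[n] = (observed_1951_75[n] - mean) / sd
    print(f"{n}-games  expected {mean:4.1f}  sd {sd:4.2f}  z {z[n]:+5.2f}")
```
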

To explain this anomaly we decided to test the hypothesis that home field
advantage was the cause.  Since the Series is played with 2 games at home,
3 games on the road, and the last 2 (if needed) back at home, a significant
home field advantage would tend to increase the number of games expected
in a Series.  In the last 30 years, the team which started the Series at
home went on to win it 21 times (assuming that the advantage has alternated
from league to league, and that the 2-3-2 format has been unchanged).
Assuming that the probability of team A winning at home is the same as
the probability of team B winning at home (i.e. the teams are evenly
matched except for the home field advantage), this 21/30 ratio
corresponds to a .87 probability of the home team winning any single game.
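
One way to turn the 21/30 ratio into a single-game home probability is to
enumerate all outcomes of the 2-3-2 schedule and solve numerically.  The
sketch below is not necessarily how we did it originally; the function names
are made up for illustration, and the bisection assumes the Series win
probability rises with h between 0.5 and 1.

```python
from itertools import product

# True where the team that opens at home (team A) is the host: 2-3-2 format.
A_HOSTS = (True, True, False, False, False, True, True)

def opener_win_prob(h):
    """P(team A wins the Series) when the host wins each game with probability h."""
    total = 0.0
    # Enumerate all 2**7 outcomes as if all seven games were played;
    # the team with at least 4 wins is the Series winner either way.
    for outcome in product((True, False), repeat=7):   # True = A wins that game
        prob = 1.0
        for a_wins, a_home in zip(outcome, A_HOSTS):
            host_wins = (a_wins == a_home)
            prob *= h if host_wins else 1 - h
        if sum(outcome) >= 4:
            total += prob
    return total

def solve_home_prob(target, lo=0.5, hi=1.0, tol=1e-9):
    """Bisection for h such that opener_win_prob(h) is approximately target."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if opener_win_prob(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

h = solve_home_prob(21 / 30)
print(round(h, 3))
```

The result should land close to the .87 quoted above.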

Using this .87 probability, the expected numbers of 4, 5, 6, and 7 game
Series in a 25-year span are:

           1926-50   Calculation   1951-75
7-games       7         13.1          15
6-games       5          6.9           3
5-games       7          4.4           4
4-games       6          0.7           3
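
The Calculation column here can be recomputed by stepping through the 2-3-2
schedule game by game.  This is a sketch under the same assumptions as the
text (evenly matched teams, host wins each game with probability .87,
independent games); the function name is ours.

```python
# True where the team that opens at home (team A) is the host: 2-3-2 format.
A_HOSTS = (True, True, False, False, False, True, True)

def length_probs_home(h):
    """P(Series ends in n games), n = 4..7, when the host wins each game with probability h."""
    states = {(0, 0): 1.0}        # (A wins, B wins) -> probability, Series undecided
    ended = {4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0}
    for game, a_home in enumerate(A_HOSTS, start=1):
        p_a = h if a_home else 1 - h          # chance A wins this game
        nxt = {}
        for (a, b), pr in states.items():
            for a2, b2, w in ((a + 1, b, p_a), (a, b + 1, 1 - p_a)):
                if a2 == 4 or b2 == 4:
                    ended[game] += pr * w     # Series decided in this game
                else:
                    nxt[(a2, b2)] = nxt.get((a2, b2), 0.0) + pr * w
        states = nxt
    return ended

expected = {n: 25 * p for n, p in length_probs_home(0.87).items()}
for n in (7, 6, 5, 4):
    print(f"{n}-games  {expected[n]:5.1f}")
```

Setting h = 0.5 in the same function recovers the "equal probability"
numbers of the first table.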

This helps with the 1951-75 data but misses badly with the early data.
One explanation is that in 1926-50 the Yankees won 12 World Series,
casting serious doubt on the assumption of evenly matched teams.  A
dynasty would clearly lead to shorter Series, since the dominant
team would win no matter where it played.  In fact the Yankees won
1 Series in 7 games, 1 in 6 games, 5 in 5 games, and 5 in 4 games.
When all the Series which the Yankees won are removed, we are
left with 6 7-game Series, 4 6-gamers, 2 5-gamers, and 1 4-gamer,
which matches the trend predicted by the principle of equally matched
teams.  Since 1955, no team has won more than 3 times in a row, and
that only happened once (Oakland '72,'73,'74).
We don't have statistics for individual games; those who have them and
are interested could check the home field hypothesis directly.  Any other
information relating to these issues would be appreciated.

		Dan Rokhsar
		Eric Grannan