Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site mmintl.UUCP
Path: utzoo!decvax!decwrl!greipa!pesnta!amd!amdcad!lll-crg!seismo!cmcl2!philabs!pwa-b!mmintl!franka
From: franka@mmintl.UUCP (Frank Adams)
Newsgroups: net.sport.baseball
Subject: Re: Re: playoff slugging + onbase avg.
Message-ID: <757@mmintl.UUCP>
Date: Fri, 1-Nov-85 22:55:40 EST
Article-I.D.: mmintl.757
Posted: Fri Nov  1 22:55:40 1985
Date-Received: Mon, 4-Nov-85 21:34:16 EST
References: <483@philabs.UUCP> <941@water.UUCP> <489@philabs.UUCP>
Reply-To: franka@mmintl.UUCP (Frank Adams)
Distribution: na
Organization: Multimate International, E. Hartford, CT
Lines: 169

In article <489@philabs.UUCP> dpb@philabs.UUCP (Paul Benjamin) writes:
>> > This is just a short (thank God!) note on team slugging and
>> > on-base averages. The Cards did beat LA in those stats, as well
>> > as on the field, but the opposite is true for KC vs. Tor, and
>> > so far in the World Series.
>> > 
>> >          BA      SA     OBA     SA+OBA
>> > 
>> > KC     .225     .366   .294     .660
>> > Tor    .269     .372   .319     .681
>> 
>> > 
>> > Doesn't look like team OBA+SA is so important, does it?
>> > 
>> >                                  Paul Benjamin
>> 
>> Two quick observations.
>> 
>> 1)   The results of one seven game series are not going to convince
>>    the average person of anything. You may remember that in the 1960
>>    W.S. the Yankees outscored the Pirates by about 30 or so runs,
>>    yet Pittsburgh won it in seven games. By your reasoning we could
>>    conclude that scoring runs isn't so important.
>
>Actually, I would agree that just the total of runs is not important.
>What counts is when they are scored. The NY-Pitt series you mention is
>the most extreme example of this. But it is definitely true that the
>gross total (or differential) of runs is not a strong indicator of 
>winning games. The same holds in other sports, such as tennis, where it
>is often the case that the winner has won fewer games, but won more sets.
>This is particularly true between strong players.

The relation between statistics like SA+OBA and scoring runs is pretty
much like that between scoring runs and winning games.  I think we are
getting close to the meat of the argument here.

Let's look at the 1960 Series here.  The question, I believe, is the following:
(to put it baldly) was the outcome the result of luck or skill?  That is,
which of the following descriptions of the series is more accurate:

(1) The Pirates proved themselves the better team by their ability to score
    runs in clutch situations.  Although the Yankees were better at getting
    men to cross the plate, Pittsburgh got them when they needed them.

(2) Although it is hard to tell from such a short series, the Yankees'
    dominance in all statistical departments makes it seem quite likely
    that they have the better team.  However, the Pirates were fortunate
    enough to win all the close ones and only lose blowouts, and so won
    the series.

Let me paint the second picture in more extreme terms.  From this point of
view, all that matters in each at bat is the talents of the pitcher, the
batter, and the fielders.  Everything else is randomness.  Sometimes the
batter gets lucky, sometimes the pitcher does.  (I have left base stealing
out for simplicity; this point of view would hold that the chances of the
runner stealing depend on the runner and the relevant fielders, and that
whether the runner attempts a steal or not, and whether he is successful
or not, does not affect the batter.)

A similarly extreme picture of the first option holds that nothing is random.
Everything happens as it must happen, given the players and the situation
they are involved in.

Now, I think it is obvious that neither of these extremes is correct.  But
I think that number 2 is closer to the truth than number 1.

The main argument for randomness is that it suffices to explain the kind of
effects that are being talked about.  Some fraction of the time, one team
will score more runs than the other, yet lose the series.  Some fraction of
the time, a team will get more men on base and slug better, yet score fewer
runs.  These things don't just happen occasionally, either; they are fairly
common JUST ON THE ASSUMPTION OF RANDOMNESS.
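
A minimal sketch of how one might check this by simulation.  The scoring
model is entirely my assumption for illustration: each team's runs in a game
are drawn independently from a Poisson distribution (real run totals are not
exactly Poisson), both teams have the same made-up mean of 4.5 runs, and tied
games are simply redrawn.  The function names are hypothetical.

```python
import math
import random


def poisson(rng, mean):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_series(rng, mean_a, mean_b):
    """Play one best-of-seven; return (a_won, total_runs_a, total_runs_b)."""
    wins_a = wins_b = runs_a = runs_b = 0
    while wins_a < 4 and wins_b < 4:
        a, b = poisson(rng, mean_a), poisson(rng, mean_b)
        while a == b:  # assumption: a tied game is just redrawn
            a, b = poisson(rng, mean_a), poisson(rng, mean_b)
        runs_a += a
        runs_b += b
        if a > b:
            wins_a += 1
        else:
            wins_b += 1
    return wins_a == 4, runs_a, runs_b


def outscored_winner_fraction(trials=10000, mean_a=4.5, mean_b=4.5, seed=1985):
    """Fraction of simulated series whose winner scored fewer total runs."""
    rng = random.Random(seed)
    count = 0
    for _ in range(trials):
        a_won, runs_a, runs_b = simulate_series(rng, mean_a, mean_b)
        if (a_won and runs_a < runs_b) or (not a_won and runs_b < runs_a):
            count += 1
    return count / trials


if __name__ == "__main__":
    print(outscored_winner_fraction())
```

Nothing here but chance separates the two teams, so whatever fraction this
prints is the baseline that any "clutch" explanation has to beat.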

Now, in principle, non-randomness could either increase or decrease the
frequency of such events.  But all the kinds of non-randomness I have seen
proposed (some players or teams perform better in certain kinds of
situations) will in fact increase this frequency.  So in principle, a
statistical analysis should be able to determine to what extent such
factors are present.

But a fairly large sample is required for such a study.  All the World Series
played are not nearly enough for a study of runs scored vs winning the series
to be statistically significant.  That *might* be enough for a study of
OBA+SA vs. runs scored, but it might not.  (The effective sample size is
higher in the latter case, being approximately the number of games played,
whereas in the former, it is the number of Series played.) (It is not at all
clear what number of runs per game to predict from a given OBA+SA; the
prediction of wins from runs is simpler, but also non-trivial.  For hockey,
the corresponding calculation (wins from goals) is fairly easy, but runs in
baseball are not always scored
one at a time.)  LOOKING JUST AT THREE OR FOUR SERIES IS COMPLETELY
MEANINGLESS.
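
To put rough numbers on the sample-size point, here is a small sketch using
the usual standard error of a proportion, sqrt(p*(1-p)/n).  The counts are
round illustrative figures that I am assuming, not actual totals, and
treating games as independent trials is itself a simplification.

```python
import math


def standard_error(p, n):
    """Standard error of a proportion p observed over n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)


# Round illustrative counts (assumptions, not real totals).
N_SERIES = 80
N_GAMES = 480

for label, n in (("series", N_SERIES), ("games", N_GAMES)):
    print(label, n, round(standard_error(0.5, n), 3))
```

Three or four series correspond to n = 3 or 4 in the same formula, where the
standard error is about a quarter: as large as the effect being argued over.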

A complete solution to the expected number of runs scored given certain
probabilities for each event involves solving a system of 24 simultaneous
equations (8 possible states of having runners on base times 3 possible
numbers of outs).  Doing so requires some numbers not generally available,
such as the chance that a runner will advance from first to third on a
single.
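
A rough sketch of that 24-equation system.  Everything numeric in it is an
assumption for illustration: the event probabilities in PROBS are made up,
on a hit every runner is assumed to advance exactly as many bases as the
batter, a walk only forces runners, and an out never moves anyone.  The
equations are solved by repeated substitution rather than by elimination.

```python
# Hypothetical per-plate-appearance probabilities (illustrative; they sum to 1).
PROBS = {"walk": 0.08, "single": 0.16, "double": 0.04,
         "triple": 0.005, "homer": 0.025, "out": 0.69}
BASES_GAINED = {"single": 1, "double": 2, "triple": 3, "homer": 4}


def advance(bases, event):
    """Return (new_bases, runs).  bases is a 3-bit mask, bit 0 = first base."""
    if event == "walk":
        if bases == 7:
            return 7, 1              # bases loaded: the walk forces in a run
        if (bases & 1) == 0:
            return bases | 1, 0      # first base open: nobody is forced
        if (bases & 2) == 0:
            return bases | 2, 0      # runner on first is forced to second
        return 7, 0                  # first and second occupied, third open
    n = BASES_GAINED[event]
    moved = ((bases << 1) | 1) << n  # bit 0 is the batter at the plate
    return (moved >> 1) & 7, bin(moved >> 4).count("1")


def expected_runs(probs, sweeps=500):
    """Expected runs in the rest of the inning for each (bases, outs) state."""
    exp = {(b, o): 0.0 for b in range(8) for o in range(3)}
    for _ in range(sweeps):
        new = {}
        for (b, o) in exp:
            total = 0.0
            for event, p in probs.items():
                if event == "out":
                    # The third out ends the inning with nothing more to come.
                    total += p * (exp[(b, o + 1)] if o < 2 else 0.0)
                else:
                    nb, runs = advance(b, event)
                    total += p * (runs + exp[(nb, o)])
            new[(b, o)] = total
        exp = new
    return exp


RUNS = expected_runs(PROBS)
print("bases empty, none out:", round(RUNS[(0, 0)], 3))
```

Perturbing one entry of PROBS slightly (taking the difference out of "out")
and re-solving gives a numerical version of the derivative for that event.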

*** begin digression ***

I did this a few years ago, using typical major league numbers
for available statistics, and guessing at those that were unavailable.
By taking derivatives, one can get estimates of the values of each possible
result in the context of a typical offense.

The raw results of this computation are not currently available to me.  I do
remember some scaled and rounded results, which are as follows:

Walk:    8
Single: 10
Double: 14
Triple: 17
Homer:  22
DP ball:-1
Out:     0

(A DP ball is a ball which will result in a DP if there is a runner on first
and zero or one out.  Otherwise it is a ground out.  An "Out" is an out which
does not change the positions of base runners.)

By scaled, I mean that if you multiply the frequency with which a batter does
each of these things (as well as others, e.g., hitting a possible sacrifice
fly) by the factor above, add up the products, multiply the sum by an
appropriate constant, and subtract another appropriate constant, you get an
estimate of how many runs that batter will produce per game.

Note that this is approximated fairly well by 2*SA+3*OBA (with scaling),
except that walks are underestimated thereby.  A better approximation is
2*SA+4*OBA-BA.
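
The comparison can be checked event by event.  This sketch assumes, as a
simplification, that at bats and plate appearances share one denominator, so
each event just contributes its total bases to SA, 1 to OBA if the batter
reaches, and 1 to BA if it is a hit.  The scale factor of 2 is my choice to
line the results up with the table above; the DP ball is left out because it
has no separate SA or OBA value.

```python
# Scaled run values from the table above (DP ball omitted).
WEIGHTS = {"walk": 8, "single": 10, "double": 14,
           "triple": 17, "homer": 22, "out": 0}

# Per-event contributions to each statistic (shared-denominator assumption).
TOTAL_BASES = {"walk": 0, "single": 1, "double": 2,
               "triple": 3, "homer": 4, "out": 0}
ON_BASE = {event: int(event != "out") for event in WEIGHTS}
HIT = {event: int(event not in ("walk", "out")) for event in WEIGHTS}


def implied_weights(sa_coef, oba_coef, ba_coef, scale=2):
    """Per-event value implied by sa_coef*SA + oba_coef*OBA + ba_coef*BA."""
    return {event: scale * (sa_coef * TOTAL_BASES[event]
                            + oba_coef * ON_BASE[event]
                            + ba_coef * HIT[event])
            for event in WEIGHTS}


first = implied_weights(2, 3, 0)     # 2*SA + 3*OBA
second = implied_weights(2, 4, -1)   # 2*SA + 4*OBA - BA

for event in WEIGHTS:
    print(f"{event:7s} {WEIGHTS[event]:3d} {first[event]:3d} {second[event]:3d}")
```

Under these assumptions the first formula gives a walk 6 where the table has
8, and the second gives 8; both put the triple at 18 against the table's 17
and agree with the table on the single, double, and homer.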

*** end digression ***

This method could be expanded on a bit to compute standard deviations in
number of expected runs for an offense, as well as the means.  It would be
interesting to see such a study done for the entire history of the World
Series, comparing expected and actual runs scored.

Of course, this calculation is still not fully what the "randomness" theory
predicts, since it assumes each player has the same chance of producing
each result.  A more accurate calculation would have 216 equations (24*9),
for each situation and each hitter.  This still pretends all pitchers are
the same, and ignores pinch-hitting, platooning, and other lineup changes.
It also ignores the different stealing abilities of different runners.

---------------------

There are two established departures from the randomness theory.  One is
that left-handed batters hit better against right-handed pitchers, and
right-handed batters hit better against left-handed pitchers.  Another is
that batters hit better with runners on base.  The former effect is fairly
significant, and seems to be different for different players.  (So that it
would be more accurate to talk about a player's hitting or pitching ability
vs. lefties and vs. righties separately, rather than together.)  The
latter is comparable in size, perhaps a bit smaller.  I do not believe it
has been established that the effect depends on the individuals, or how
large that effect might be.

It is not yet well established that some players are better in the clutch,
but based on the Elias data, it appears that this is the case.  The size of
this effect appears to be about .020 to .040 points, measured in terms of
batting average, for the most extreme players.  It would take half a dozen
such players to significantly affect a team's winning probabilities.

This ran on much longer than I intended.  Thank you to those of
you who read it all.

Frank Adams                           ihpn4!philabs!pwa-b!mmintl!franka
Multimate International    52 Oakland Ave North    E. Hartford, CT 06108