Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 (Denver Mods 7/26/84) 6/24/83; site drutx.UUCP
Path: utzoo!linus!decvax!tektronix!uw-beaver!cornell!vax135!houxm!mtuxo!drutx!djvh
From: djvh@drutx.UUCP (VanHandelDJ)
Newsgroups: net.sport.baseball
Subject: RE: Carter vs Pena (offense)
Message-ID: <161@drutx.UUCP>
Date: Wed, 14-Aug-85 16:22:50 EDT
Article-I.D.: drutx.161
Posted: Wed Aug 14 16:22:50 1985
Date-Received: Mon, 19-Aug-85 23:47:53 EDT
Organization: AT&T Information Systems Laboratories, Denver
Lines: 171

> 
> Now, why don't we expand this whole discussion so that others
> can take part, too? After all, no one else has made any
> contributions to this, and we might as well conduct this by
> mail, rather than over the net.
>
	No one else?  I, for one, have posted twice regarding this
discussion.  Did you ignore these just like you ignored David Rubin's
remarks that didn't agree with yours?
 
> You feel that by computing statistics based upon the raw numbers
> which are available in various publications, you can attain an
> understanding of the inner workings of the game, to the point
> that you can describe the strengths and weaknesses of players,
> discuss strategies, etc. I feel that this is not true, that instead,
> statistics lead to a superficial appreciation of correlation, but
> no definite understanding of cause-and-effect. 
>
	Statistics may not PROVE a point in baseball, but they sure as
hell are better than anything else we can use.  Analysis (and there has
been a lot of it) supports the argument that stats can be used to
understand ALL of the items you list above.
 
> This can be compared to the type of knowledge which current "Artificial
> Intelligence" systems employ. Such systems, e.g. MYCIN and its
> descendants, use associational knowledge to correlate stimuli and
> desired responses, e.g., "If symptom A and symptom B are present,
> then prescribe remedy X." This can be effective in many cases. But
> there is a great deal of knowledge which cannot be captured this
> way - the structural knowledge, i.e., WHY A and B tend to respond
> to X. Even adding probabilities to this knowledge does not change
> this limitation.
>
 	As long as stats show which type of player will help the team
win, who cares WHY?  (Analysis of stats makes the WHY obvious anyway.)

> In the same way, baseball stats can reveal many correlations, and
> can thus lead to a greater appreciation of the game and its
> intricacies. But the structural knowledge, e.g., "Is player A
> better defensively than player B?","Will player A contribute more
> to my team than B?", is subjective, and hence dependent upon the
> knowledge that a person has acquired about the game. 
>
	Who's to say this is subjective?  Any good manager will base
his decisions on past performance.
 
> Also, these baseball pros have access to three sources of info that
> you and I do not:
> 
>      1) They keep charts of every pitch, and every hit and play in
>         the field. Thus, they have MUCH better data to rate the
>         players;

		So they have better stats.  They are still stats, which
	shows the value of using the right statistics for the game.

> (or do you really think you know more
> than they all do?)
>
	It's possible that David Rubin does know more.  The people that
vote for these awards aren't God; do you honestly feel that the best players
get to the all-star game each year?  You seem to accept the "Authority's"
decisions as law.
 
>                Responses to statistical verbiage:
> 
> > Also, we really don't want to consider years
> > before the principals established themselves in their respective teams'
> > starting lineups, so we consider Carter in 1975 and 1977-1984 (in 1975,
> > Carter spent most of his playing time in the outfield (as the starting
> > left fielder, generally) while serving as the #2 catcher; in 1976, he
> > was just the back-up catcher) and Pena from 1981-1984.  
> 
> Why consider any years before 1984? The original question was which
> should start the 1985 all-star game, not which was better in 1975.
> I knew Carter was better than Pena before 1984 even without reading
> any stats :-)
>
	I agree with this.  However, we did need some way to measure ability.
 
> > It is widely recognized that the purpose of the offense is run
> > production, and there are two distinct ways in which a hitter may
> > contribute to it.  The first is to score runs, the second to drive
> > them in.  Thus, traditionally, fans have placed great store in the
> > most obvious measures of that production, runs scored and runs batted
> > in.  Unfortunately, those traditional measures are heavily dependent
> > on circumstances beyond the hitter's control: how well his teammates
> > fare in doing THEIR job.  You can't score if no one drives you in, and
> > you can't drive some one in if no one is on base.  If we are to
> > evaluate individual performance, we must look at statistics that are
> > NOT dependent on the action of anyone save the individual in question.
> 
> EXACTLY!!! If you can find a statistic that is truly independent of
> teammates' contributions, I'd love to see it. All the stats you list
> below (Putouts, %thrown out, DP's, BA, OBA, HR, R, RBI, slugging,
> etc.) are dependent on: teammates' seasons, manager's tactics, place
> in the lineup, ballparks, and others. Apparently one of your favorite
> stats for pitchers is Earned Runs Prevented. You love to post a detailed
> list to the net every so often. This stat is no more independent than
> the old ERA. For example, if a pitcher pitches for a team which scores
> fewer runs for him, then he may be lifted earlier, on the average.
> This leads to fewer innings for him, and a lower score on ERP. This
> effect can also be achieved by playing for a manager who loves to go
> to the bullpen early (e.g. Chuck Tanner.) These stats can be revealing
> at times (Gooden leads by a large margin no matter how you measure
> things) but using them to make finer distinctions is meaningless.
>
	David Rubin repeatedly stated that no statistic is independent
of other players, only that some stats have less correlation to what the
other players do.  After years of studying stats, it is obvious to me that
slugging pct. and on-base pct. are by far the most useful for discussion
of offensive contribution.  These are not perfect, but are the best we have.
 
> > Thus, we look at On-Base Percentage (a.k.a. On-Base
> > Average) to evaluate how well a player performs this function.  
> 
> You really love this stat. Fine. This is a free country. I think
> it is just another meaningless stat. Hits are better than walks
> any day. Since you love to compute, why not analyze how often runs
> are scored with only walks, versus how often runs are scored with
> only hits? Or how often runs are scored with no hits at all, versus
> how often runs are scored without walks? This is baseball, not the 
> on-base derby.
>
	You don't make any sense here.  Are you saying that a player
cannot do both?  Of course a hit is better than a walk.  But, if two
players each get 600 plate appearances, #1 gets 130 hits and 100 walks,
#2 gets 180 hits and 20 walks, who would you say contributed more?
Player #1 has a BA of .260 while Player #2 has a BA of .310.  So 
Player #2 is clearly better, right?  Assuming power figures are equal,
I would say Player #1 was more valuable, as he reached base 30 more
times in the same 600 plate appearances.
	This is reality: you can get on base either way (many times
both ways in the same inning!!!)
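
	For anyone who wants to check the arithmetic, here is a rough
Python sketch of the comparison.  It uses only the figures given above
and assumes, as the example does, that every plate appearance is either
an at-bat or a walk (no hit batsmen, sacrifices, etc.).  The function
names are mine, chosen for illustration.

```python
def batting_average(hits, walks, plate_appearances):
    # Simplifying assumption: at-bats are plate appearances minus walks.
    at_bats = plate_appearances - walks
    return hits / at_bats


def on_base_pct(hits, walks, plate_appearances):
    # Simplified on-base percentage: times on base per plate appearance.
    return (hits + walks) / plate_appearances


def times_on_base(hits, walks):
    return hits + walks


# The two hypothetical players from the example above.
player1 = {"hits": 130, "walks": 100, "plate_appearances": 600}
player2 = {"hits": 180, "walks": 20, "plate_appearances": 600}

for name, p in (("Player #1", player1), ("Player #2", player2)):
    print(
        f"{name}: BA {batting_average(**p):.3f}, "
        f"OBP {on_base_pct(**p):.3f}, "
        f"on base {times_on_base(p['hits'], p['walks'])} times"
    )
# Player #1: BA 0.260, OBP 0.383, on base 230 times
# Player #2: BA 0.310, OBP 0.333, on base 200 times
```

Player #2 wins on batting average by 50 points, but Player #1 reaches
base 30 more times in the same 600 plate appearances.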

> 
> > Both Carter and Pena batted in the middle of their respective orders
> > (generally fifth for Carter, sixth for Pena), and probably have about
> > 45 such opportunities in a season.  Assuming that the opportunities
> > are uniformly distributed among out counts, 30 of these occurred with
> > none or one out (actually, this probably overestimates the number of
> > such opportunities, as outs accumulate as batters bat, thus implying
> > that more runners are on base, on average, with two out than with none
> > out).  Pena has good speed for a catcher, average for all runners, and
> > would probably advance to third about 33% of the time (choosing the
> > median value from Texas regulars); for Carter, my best guess is 20%
> > (he's not as hopeless on the bases as Sundberg).  The difference,
> > then, is probably about 30/3 - 30/5 = 4.  It does not make up for
> > Pena's negative contribution in his stolen base attempts.
> 
> By your argument, speed is a negligible factor in baseball, at least
> offensively. You apparently feel that breaking up double-plays at
> second base, or avoiding them at first, or taking the extra base,
> or causing an errant throw, are small factors. You like HRs and 
> walks. Well, why not take up this argument some time with a professional
> baseball person (which neither of us is) and tell him that speed is
> negligible offensively? From everything I have heard and read, this is
> not so. Again, we are fans. I trust what I hear from pros more than
> your amateur judgement (or my own). 
>
	Whitey Herzog has been the only manager who has consistently won
with speed in his lineup.  Power is far more valuable, and if you want it
proven, look at the top scoring teams each year.  Also look at the top
(Slugging Pct + OB Pct) teams each year.  You will see that the same teams
are at the top of each list.  Baltimore (late 60's and early 70's) won nearly
100 games annually, but rarely did they have a .300 hitter.  They did it
with power and the ability to draw walks.  They had a few decent pitchers
as well.
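
	As for how much speed adds on the bases, the estimate quoted
above is easy to re-run.  This is only a sketch of that quoted
calculation, using its own figures (45 opportunities a season,
two-thirds of them with fewer than two out, advance rates of 1/3 for
Pena and 1/5 for Carter).  The function and variable names are mine.

```python
from fractions import Fraction


def expected_advances(opportunities, share_before_two_out, advance_rate):
    # Only opportunities with none or one out are counted, as in the quote.
    usable = opportunities * share_before_two_out
    return usable * advance_rate


OPPORTUNITIES = 45                    # per season, from the quoted text
SHARE_BEFORE_TWO_OUT = Fraction(2, 3)  # uniform spread over 0, 1, 2 outs

pena = expected_advances(OPPORTUNITIES, SHARE_BEFORE_TWO_OUT, Fraction(1, 3))
carter = expected_advances(OPPORTUNITIES, SHARE_BEFORE_TWO_OUT, Fraction(1, 5))

print(pena, carter, pena - carter)  # 10 6 4
```

Four extra bases a season is the whole difference under those
assumptions, which is the point of the quoted estimate.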
 
						Dave Van Handel
						drutx!djvh