Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site fisher.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!princeton!astrovax!fisher!david
From: david@fisher.UUCP (David Rubin)
Newsgroups: net.sport.baseball
Subject: Re: NL catchers, AI, and philosophy
Message-ID: <748@fisher.UUCP>
Date: Mon, 19-Aug-85 12:29:52 EDT
Article-I.D.: fisher.748
Posted: Mon Aug 19 12:29:52 1985
Date-Received: Tue, 20-Aug-85 22:11:25 EDT
References: <409@philabs.UUCP>
Distribution: na
Organization: Princeton University.Mathematics
Lines: 187

[Frankly, I'm now eager to move on to other things. Apparently, Paul
will continue to insist upon ignoring my arguments and setting up
straw men to knock down (he continues, for example, to point out the
shortcomings of SA and OB without addressing my contention that their
indirect shortcomings are far smaller in magnitude than those of
statistics directly influenced by teammates, and to attribute falsely
to me a belief that these statistics reveal all). Rather than answer
him point-by-point (and wind up repeating myself incessantly in again
presenting an argument for him to again ignore by again answering the
argument he wishes or thinks I made instead of the one I did), I will
answer carefully only the "philosophy" arguments (they are relatively
fresh) and will summarize why I find it useless to continue the
"statistical" argument.]

>............................... So since the disagreement is
>one of underlying philosophy, let's argue on that ground....

If the disagreement were truly one of philosophy, we would have
immediately headed for that ground. Instead, you began by attempting
to support Pena statistically. This is inconsistent with the general
indictment of those statistical methods that follows. I cannot help
but wonder whether these philosophical questions would have even been
appealed to had I accepted your original argument based upon BA and
R+RBI-HR.

>You feel that by computing statistics based upon the raw numbers
>which are available in various publications, you can attain an
>understanding of the inner workings of the game, to the point
>that you can describe the strengths and weaknesses of players,
>discuss strategies, etc.

I am not that sanguine. I realize, though, that if I am to achieve ANY
understanding of the game, I cannot ignore what I and others observe.
Statistics are no more or less than summaries of those observations.
Poor summaries can obfuscate and mislead; good summaries can
illuminate and inform.

> I feel that this is not true, that instead,
>statistics lead to a superficial appreciation of correlation, but
>no definite understanding of cause-and-effect.

Whether the appreciation is superficial or fundamental depends on the
quality of the analysis. Cause-and-effect is something statistics can
only suggest. We must eventually face up to the decision on whether a
given statistical association is causal or not; we usually decide this
by examining whether a plausible mechanism exists to link the proposed
cause and effect. Statistics can, however, guide us in a choice
between two plausible causes, for we may find that one proposed cause
is far more closely associated with the effect than another.

You seem to think that I advocate using statistical methods without
consideration for previous knowledge; nothing could be further from
the truth. Our differences arise not because I use the statistics
without regard for established knowledge, but because some things that
you accept as established I do not.

>................ baseball stats can reveal many correlations, and
>can thus lead to a greater appreciation of the game and its
>intricacies. But the structural knowledge, e.g., "Is player A
>better defensively than player B?","Will player A contribute more
>to my team than B?", is subjective, and hence dependent upon the
>knowledge that a person has acquired about the game.

We may have a semantic problem here. You apparently use "subjective"
to describe anything that is not directly observable; I use it to
describe that which cannot be logically inferred. I do not contest
that these questions cannot be answered in any intelligent way by
watching a few games or reading the box score every day; I do believe
that, for most players, we can make such comparisons once we have
enough experience available.

>.........................................................Thus, the
>only people who are qualified to make these judgements are baseball
>professionals, not statisticians.

This is the real philosophic difference. In any field of endeavor, the
relative merit of two proposals should be decided (in my view) by
force of argument; the origin of the proposals, though it might affect
our original estimation of their likelihood of correctness, ought not
affect our final evaluation. I will not leave war to generals,
religion to clergy, government to politicians, education to teachers,
nor baseball to its "professionals".

> Even though Davey Johnson uses a
>compendium of stats to help him manage the Mets, HE still makes the
>decisions, and can easily decide to ignore a stat. (After all, he
>initially decides which stats to put into his machine, and which
>to leave out.)

There is no other intelligent way to use them; remember, statistics
are summaries, and one should feel free to incorporate information
that is not contained in them or properly weighted by them (though the
latter case strongly urges the selection of a new statistic). When I
questioned your arguments for considering other factors, it was not
because I objected to such a consideration, but because I viewed the
introduction of such factors, with no attempt to substantiate them, as
speculative. I did not question, for example, Pena's better speed, but
did question the extent to which it contributed to the Pirates in ways
that were not already accounted for in the evidence that I presented.

>Also, these baseball pros have access to three sources of info that
>you and I do not:
> 1) They keep charts of every pitch, and every hit and play in
> the field. Thus, they have MUCH better data to rate the
> players;

But, as they do not make this information generally available, we have
no assurance that they even use it intelligently. You seem to be
arguing that baseball professionals are uniformly capable folk in the
handling of data; ho!

> 2) They see many more games, and from a better vantage point than
> we do (they have access to the tapes of the games, too, so
> they miss nothing that we see);

Even a baseball professional does not have more than 24 hours in a day
to watch tapes; they, too, must eventually synthesize what they see
and accept summaries of what they don't see.

> 3) They have the accumulated experience from their careers, which
> enables them to interpret what they see in ways that can be
> very different from the way we see things.

a) It is not as different as you imply.
b) Where it differs, the "expert" may be wrong.
c) The "experts" themselves differ, in which case who is truly expert
   is an open question....

> It is precisely
> this lack of understanding that prompts people like us to
> revert to statistics.

Actually, it is the limited storage capacity of the brain, combined
with the unachievability of omniprescience, that drives us to
statistics. Summarize we must.

>So, I leave the decisions to the pros, and hope that my hometown
>pros make good decisions.

This myth of a great gulf between expert knowledge and layman
knowledge has been created in order to protect your hometown pros from
the consequences of their decisions. With your attitude, all you can
do is hope. I prefer to criticize, and by criticizing, provoke change
(well, at least provoke argument!).

To shorten my article, I will content myself with noting that you have
been all too slippery. First you used statistics, then you declared
them entirely irrelevant.
First you appealed to 1984's performance, then you declared everything
before this year irrelevant. First you claimed that Pena is a better
catcher than Carter, then you reduced that claim to defense, or to
1985 (April-June inclusive), or whatever appears to be the path of
least resistance. In just the latest instance, you responded to my
overwhelming evidence that the Pirates of 1984 were NOT less
productive than the Expos of 1984 by declaring 1984 irrelevant to the
issue at hand -- while in the same article, you again bring up the
supposedly critical 1984 Gold Glove!

I have been clear about my claims: that Carter has been, is, and
likely will be substantially more productive offensively; that in the
past, his defense has been about as keen as Pena's, although I have no
information for this year; and that his offense so clearly outdoes
Pena's that even if Pena is having a better year defensively, Carter
has more merit as a starting all-star this year. You have claimed
superiority for Pena, offensively and defensively, this year and in
years past, retracting those claims as appears seemly, and
occasionally reintroducing them once the evidence I presented against
them recedes from memory.

Regarding statistics, it appears to me that you consider them targets
of opportunity to be exploited for your argument's benefit; you use
your position to select your statistics, rather than the other way
around. You may naturally presume others treat them the same way, and
thus can lament that they cannot yield any information. A more
accurate statement is that they cannot yield any information when so
abused. You lament the flaws any statistic must possess, and then
falsely infer that all statistics are thus equally worthless. My
"alternate" statistics were never presented as perfect, but rather as
substantial improvements.

I mean none of this in a hostile or ill-mannered way.
Rather, I am somewhat saddened at finding yet another person who so
fundamentally misunderstands what Statistics is all about. It is NOT a
collection of techniques used to crank out numbers that we use in some
prescribed fashion; it is instead the search for patterns, and the
interpretation of patterns (or lack of them) found.

					David Rubin
			{allegra|astrovax|princeton}!fisher!david