Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site wateng.UUCP
Path: utzoo!watmath!wateng!dclee
From: dclee@wateng.UUCP (David C. Lee)
Newsgroups: net.math
Subject: Re: Pistachio Probabilities
Message-ID: <2772@wateng.UUCP>
Date: Fri, 23-Aug-85 14:49:43 EDT
Article-I.D.: wateng.2772
Posted: Fri Aug 23 14:49:43 1985
Date-Received: Sat, 24-Aug-85 19:14:30 EDT
References: <285@ihnet.UUCP>
Distribution: net
Organization: U of Waterloo, Ontario
Lines: 72

Here is the original question about Pistachio Probabilities:

>Suppose you begin with a bag containing O openable pistachios,
>and U unopenable pistachios. A trial consists of selecting a nut at random,
>and eating it (if possible), or returning it to the bag.
>How many trials, on the average, are required to consume all the "openable"
>pistachios? Express the answer in terms of U and O.
>Any ideas on the variance/distribution of trials(U,O)?

Rather than using Openable & Unopenable (O & 0 is confusing), I will use
Good and Bad.

Suppose we associate a stage of the experiment with the consumption of a
good pistachio. For example at the beginning of stage (1), we have (G) good
ones and (B) bad ones; at stage (2), [G-1,B]; ... ; at stage (i), [G-i+1,B];
... ; and finally at stage (G), [1,B].

Let us analyze this experiment stage-by-stage.
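
As a sanity check before the analysis, the experiment in the quoted question can be simulated directly. What follows is only a minimal sketch in Python, not part of the original question: the function name simulate_trials, the example bag of 3 good and 2 bad nuts, the seed, and the number of runs are all illustrative assumptions.

```python
import random


def simulate_trials(good, bad, rng):
    """Count the draws needed until every good (openable) nut is eaten.

    Hypothetical helper: 'good' and 'bad' are the nut counts in the bag,
    'rng' is a random.Random instance.
    """
    trials = 0
    while good > 0:
        trials += 1
        # Each trial picks one nut uniformly from everything in the bag.
        if rng.random() < good / (good + bad):
            good -= 1  # a good nut is eaten and leaves the bag
        # otherwise a bad nut was drawn; it goes back, the bag is unchanged
    return trials


rng = random.Random(1985)  # fixed seed so reruns give the same sample
runs = [simulate_trials(3, 2, rng) for _ in range(20000)]
print(sum(runs) / len(runs))  # sample mean of the number of trials
```

The sample mean printed here can be compared with whatever formula for the expected number of trials one derives for the same G and B.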

At any stage (i) with [G-i+1,B], we have the following results:

    p = #good / (#good+#bad) = (G-i+1) / (G-i+1 + B)
        - probability of picking a good one at each trial
    q = 1-p
    P(k) = p q^(k-1)
        - probability of finding the first good one at the k-th trial
        - recall the geometric distribution, with expected value
          E(k) = 1/p and variance V(k) = q/(p^2)

So at stage (1), [G,B] (finding and consuming the 1st pistachio)
    the expected number of trials is
        E(k) = (G+B)/G = 1 + B/G
    with the variance of
        V(k) = (B/(G+B)) / (G/(G+B))^2 = (B/G) + (B/G)^2

At stage (2), [G-1,B] (finding and consuming the 2nd pistachio)
    the expected number of trials is
        E(k) = 1 + B/(G-1)
    with the variance of
        V(k) = (B/(G-1)) + (B/(G-1))^2
...
At stage (G), [1,B] (Ha-Ha, the last one!)
    the expected number of trials is
        E(k) = 1 + B
    with the variance of
        V(k) = (B) + (B)^2

The expected number of trials to consume all the good pistachios is:

    N    = k(1) + k(2) + ... + k(G)
    E(N) = E(k(1)) + ... + E(k(G))
         = { 1 + B/G } + ... + { 1 + B }
         = G + B { 1/G + 1/(G-1) + ... + 1 }            (reducible?)

Since k(1), k(2), ..., k(G) are independent random variables
(i.e. each stage of the experiment is independent of the others),
the variance of N is:

    V(N) = V(k(1)) + V(k(2)) + ... + V(k(G))
         = B { 1/G + 1/(G-1) + ... + 1 }
           + B^2 { (1/G)^2 + (1/(G-1))^2 + ... + 1 }    (reducible?)

Further comments:

No doubt about it! The task of finding a good pistachio gets harder with
each one consumed. To get the (i+1)-th pistachio we would need an
extra effort of B/{(G-i)(G-i+1)} trials compared to the (i)-th pistachio,
i.e. an increase of no more than about 100/G% at the beginning for large
values of G and B, but at the end the expected effort nearly doubles (from
1 + B/2 to 1 + B trials, an increase approaching 100%) for a large B.

Finally, by the Central Limit Theorem, N tends to be approximately normally
distributed with mean E(N) and variance V(N) as derived above, for large
values of B and G.

David C. Lee @ University of Waterloo, Ontario
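
The sums for E(N) and V(N) above can be evaluated exactly for any particular G and B. The bracketed sum in E(N) is the G-th harmonic number, which has no elementary closed form, so summing it term by term is the practical route. The following is only a sketch in Python under that reading; the names pistachio_moments and extra_effort, and the example values G = 3, B = 2, are illustrative assumptions and not part of the derivation.

```python
from fractions import Fraction


def pistachio_moments(good, bad):
    """Exact E(N) and V(N) as sums of the per-stage geometric moments.

    Hypothetical helper: with j good nuts left, the stage has
    E(k) = 1 + B/j and V(k) = B/j + (B/j)^2.
    """
    mean = Fraction(0)
    var = Fraction(0)
    for j in range(1, good + 1):  # j = number of good nuts still in the bag
        ratio = Fraction(bad, j)
        mean += 1 + ratio
        var += ratio + ratio ** 2
    return mean, var


def extra_effort(good, bad, i):
    """Expected extra trials for the (i+1)-th good nut versus the i-th."""
    return Fraction(bad, (good - i) * (good - i + 1))


mean, var = pistachio_moments(3, 2)
print(mean, var)  # prints: 20/3 82/9
```

Fractions keep the arithmetic exact, so the result for G = 3, B = 2 can be checked by hand: E(N) = 3 + 2(1/3 + 1/2 + 1) = 20/3 and V(N) = 2(11/6) + 4(49/36) = 82/9.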