Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.PCS 1/10/84; site mtgzz.UUCP
Path: utzoo!linus!decvax!harpo!whuxlm!whuxl!houxm!ihnp4!drutx!mtuxo!mtgzz!dmt
From: dmt@mtgzz.UUCP (d.m.tutelman)
Newsgroups: net.micro
Subject: Re: EPROM memory lifetime query
Message-ID: <1043@mtgzz.UUCP>
Date: Wed, 14-Aug-85 12:29:53 EDT
Article-I.D.: mtgzz.1043
Posted: Wed Aug 14 12:29:53 1985
Date-Received: Tue, 20-Aug-85 07:42:37 EDT
References: <562@wdl1.UUCP> <1686@hao.UUCP>
Organization: AT&T Information Systems Labs, Middletown NJ
Lines: 48
Cc: dmt

> >       How long do current EPROMS and EEPROMS hold their memory? 
> Intel states (in their EPROM Applications Manual AFN-01648A) that you will
> have 5% cell failures after only 220,000 years at 70C in storage.  This is
> from tests taken at high temperatures to accelerate failures.  Assuming that
> these failures are linear over time gives a cell failure rate of 0.0001%
> in 4.4 years at 70 degrees C.
> ...
>         {ucbvax!hplabs | allegra!nbires | harpo!seismo } !hao!hull

Thanks, Howard. I really enjoyed your response.
Just to put a number on it that's meaningful to me, I used your data
to get a failure rate per chip.  Assuming a 128K-bit chip (current
state-of-the-industry), and counting the failure of any one bit as
a failure of the chip, I get an annual failure rate of about 2%.
Put another way, the MTBF of the chip is about 50 years.
If I used a 64K chip, it would be more like 100 years.
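
For anyone who wants to redo the arithmetic, here is a rough sketch in
Python.  It assumes only what is stated above: the quoted figure of 5%
cell failures in 220,000 years, failures linear in time, independent
cells, and any single bad bit counting as a dead chip.  The name
chip_stats and the reading of "K" as 1024 bits are my own choices for
illustration, not anything from the Intel manual.

```python
# Per-cell failure rate from the quoted figure: 5% of cells fail in
# 220,000 years, assumed linear in time (as in the quoted article).
CELL_RATE_PER_YEAR = 0.05 / 220_000


def chip_stats(bits):
    """Return (annual failure rate, MTBF in years) for a chip of `bits` cells.

    Assumes independent cells and that any one cell failure kills the chip,
    so the chip rate is simply the per-cell rate times the number of cells.
    """
    annual_rate = bits * CELL_RATE_PER_YEAR
    return annual_rate, 1.0 / annual_rate


# Chip sizes in bits; "K" is taken as 1024 here (an assumption).
for label, bits in [("64K", 64 * 1024), ("128K", 128 * 1024), ("1 Meg", 1024 * 1024)]:
    rate, mtbf = chip_stats(bits)
    print(f"{label:>5}: {rate * 100:5.1f}% per year, MTBF about {mtbf:5.1f} years")
```

Taken literally, this puts the 128K chip nearer 3% per year (MTBF in the
mid-30s of years) than the rounded figures in the text, and the 1 Meg
chip a little over 4 years.  Same ballpark, same conclusion.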

Implication: as we learn to put more bits on a chip, we'll get
to the point that we need to improve the failure rate per cell
to keep the chips from having an unacceptably high failure rate.
We're probably one order of magnitude from there now. I.e. - a
1 Meg chip would have a mean lifetime of 5 years; that's probably
unacceptable.  A few thoughts on that:
   -	First the bad news - if we do it by higher densities alone,
	we're probably hurting the MTBF, not helping it.  In fact,
	I doubt that the same failure rate per cell should be quoted
	for all PROMs from 8K to 128K, but I didn't see the spec sheet.
   -	Now some good news - there's probably a residual factor in
	the MTBF (due to the package, etc.) that prevents failure rate
	from being quite proportional to bits.
   -	Some more good news - if we can put a meg on a chip, we can reclaim
	some of that silicon real estate for an error-correcting code.
	If cell failures are really independent, that should yield a
	big improvement.  
By the way, the independence assumption is (1) important to the MTBF
calculations above, and (2) probably wrong.  (That is, cell failures
on a chip are probably correlated.)  If the assumption of independence
is wrong, then the chip MTBF is better than computed above, but the
failures that do occur are multiple-cell failures (making
error-correcting codes less effective).
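
To put a rough size on the error-correcting-code point, here is a
sketch in Python under the independence assumption only.  The layout is
hypothetical: a 1 Meg chip organized as 64-bit data words, each with 7
check bits (enough for a single-error-correcting Hamming code), and a
5-year horizon.  The function names are mine.  A word is counted as
lost once two or more of its 71 cells have failed.

```python
from math import comb

# Same linear per-cell rate as the quoted figure (5% in 220,000 years).
CELL_RATE_PER_YEAR = 0.05 / 220_000


def word_fail_prob(p, word_bits):
    """Probability that 2 or more of `word_bits` independent cells have failed.

    Summing the binomial tail directly avoids subtracting nearly equal numbers.
    """
    return sum(
        comb(word_bits, k) * p**k * (1 - p) ** (word_bits - k)
        for k in range(2, word_bits + 1)
    )


def chip_fail_no_ecc(p, data_bits):
    """Chip fails if any one of its cells has failed."""
    return 1 - (1 - p) ** data_bits


def chip_fail_with_sec(p, data_bits, word_data_bits=64, check_bits=7):
    """Chip fails if any word has more failed cells than SEC can correct.

    word_data_bits and check_bits describe a hypothetical layout.
    """
    words = data_bits // word_data_bits
    w = word_fail_prob(p, word_data_bits + check_bits)
    return 1 - (1 - w) ** words


years = 5
p_cell = CELL_RATE_PER_YEAR * years   # per-cell failure probability so far
one_meg = 1024 * 1024                 # data bits on the chip

print("no ECC  :", chip_fail_no_ecc(p_cell, one_meg))
print("with SEC:", chip_fail_with_sec(p_cell, one_meg))
```

With independent cells the corrected chip comes out several orders of
magnitude better than the uncorrected one.  None of this holds if the
failures are correlated; then the double failures in a word that this
model treats as rare are exactly what you would expect to see.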

....for what it's worth....

			Dave Tutelman
			Physical - AT&T Information Systems
				   Holmdel, NJ 07733
			Logical  - ...ihnp4!mtuxo!mtgzz!dmt
			Audible  - (201)-834-2895