Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!cbosgd!ucbvax!sdcsvax!darrell
From: darrell@sdcsvax.UUCP
Newsgroups: comp.os.research,ca.unix
Subject: REQUEST FOR DATA
Message-ID: <3430@sdcsvax.UCSD.EDU>
Date: Tue, 7-Jul-87 16:03:13 EDT
Article-I.D.: sdcsvax.3430
Posted: Tue Jul 7 16:03:13 1987
Date-Received: Fri, 10-Jul-87 04:16:32 EDT
Sender: darrell@sdcsvax.UCSD.EDU
Lines: 42
Approved: mod-os@sdcsvax.uucp
Xref: utgpu comp.os.research:67 junk:5416

Hello.

We're doing research on the behaviour of distributed systems. And I'd
like to solicit your help!

What I need is data on the failure modes of systems. An example follows.
All times are specified in HOURS.

CPU      OS      MTTF(S)  MTTF(H)  MTTR(S)  MTTR(H)  PMI  PMD
---      --      -------  -------  -------  -------  ---  ---
VAX-780  4.3BSD  240+60   8640+0   1+1      4+2      720  4
3B2      V.3     720+120  4320+0   3+1      48+8     -    -

NET      MTTF(I)  MTTF(R)  MTTF(C)  MTTR(I)  MTTR(R)  MTTR(C)
---      -------  -------  -------  -------  -------  -------
Ether    17280+0  2160+24  8640+0   8+168    1+1      4+4

MTTF = mean time to failure
MTTR = mean time to repair
PMI  = preventative maintenance interval
PMD  = preventative maintenance duration

S = software
H = hardware
I = interface
R = routing
C = cabling

The means are given as a constant term plus an exponentially distributed
(random) term.

A software failure typically means that the system hangs and has to be
rebooted. A hardware failure means it hangs, halts or whatever and the
man-from-DEC has to be called. Similar things for the network failures.

Please look around your site, talk to your systems folk. Then send to me
the information that you discover. I'll summarize and post it to the net.

Your help is greatly appreciated!

Darrell Long
Department of Computer Science and Engineering, C-014
University of California, San Diego
La Jolla, California 92093

ARPA: Darrell@Beowulf.UCSD.EDU
UUCP: sdcsvax!beowulf!darrell
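
For anyone preparing figures in the "constant plus exponentially distributed term" form used in the example tables (e.g. 240+60), the notation can be read roughly as in the sketch below. This is only an illustration, not part of the original request: it assumes the second number is the mean of the exponential term, and the names parse_term, sample_time and expected_time are hypothetical.

```python
import random


def parse_term(spec):
    """Split an 'a+b' entry into (constant, exponential_mean), both in hours."""
    const, _, exp_mean = spec.partition("+")
    return float(const), float(exp_mean or 0)


def sample_time(spec, rng=random):
    """Draw one time (in hours) for an entry such as '240+60'.

    Assumption: the value after '+' is the MEAN of the exponential term.
    """
    const, exp_mean = parse_term(spec)
    if exp_mean == 0:
        # An entry like '8640+0' has no random part.
        return const
    # random.expovariate takes the rate, i.e. 1 / mean.
    return const + rng.expovariate(1.0 / exp_mean)


def expected_time(spec):
    """Expected value of an entry: constant plus the exponential mean."""
    const, exp_mean = parse_term(spec)
    return const + exp_mean
```

Under that reading, expected_time("240+60") gives 300.0 hours for the VAX-780 software MTTF in the example, and every sample drawn for that entry is at least 240 hours.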