Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!psuvax1!ukma!tut.cis.ohio-state.edu!cs.utexas.edu!oakhill!mikes
From: mikes@oakhill.UUCP (Mike Schultz)
Newsgroups: comp.mail.elm
Subject: Serious elm problems on Sun OS 3.5.2
Message-ID: <2311@oakhill.UUCP>
Date: 16 Aug 89 14:25:51 GMT
Reply-To: mikes@oakhill.UUCP ()
Distribution: usa
Organization: Motorola Inc., Austin, Texas
Lines: 53

Our site has been running elm 2.2 ever since it was available. We are
now running patch level 10. Shortly after I installed elm 2.2, I began
getting complaints from users that elm was crashing. I blamed the
problem on system overloading: elm doesn't always check all system
calls for the less frequent error returns caused by stuff like process
table overflows (which do happen occasionally :-)).

But then it happened to me one day and I could repeat the failure, so I
got out adb and went bug hunting! And what did I find? I found an
illegal instruction in the middle of the object file. No, I didn't
disassemble in the middle of an instruction; it was right there in the
middle of the disassembly of the legal instructions... And then it
wasn't! Yes, that's right. While I was poking around trying to figure
out how, on a virtual memory machine with page protection, elm managed
to get an instruction trashed, the instruction fixed itself and elm ran
my test case without failure! Whoa!

I took my evidence to the system managers and reported that something
was wrong. For my trouble I got blank looks and was told that no one
else was having this problem, and that usenet software always has bugs
in it. (So they're human and overworked, who isn't?) I let the problem
drop, confident that people would begin to report other mysterious
problems and that the problem would eventually get corrected. But they
didn't, and the elm problem complaints went away too.

Unfortunately, the problems are coming back again, now several months
later. The symptoms are still the same.
Elm randomly crashes, and adb disassemblies show illegal instructions
in the code, but almost never in the same place. The most frequent
sequence that causes the crash is to start elm, change folders, use *
to advance to the end, back up one message, and then perform a reply. I
don't know if it is significant, but the folder has 51 messages in it,
and when I back up a message, elm has to scroll backwards one page. The
routine that most frequently shows the problem is get_return. It has
happened using a vt100 termcap entry, and the Sun OS is 3.5.2.

Anybody heard of this problem? Is there a bug in the Sun OS? Is anybody
currently running elm 2.2 on Sun OS 3.5.2? Please respond using email.

Hurry up guys! The natives are beginning to want 2.1 back.

Mike Schultz
mikes@oakhill.uucp
...!uunet!cs.utexas.edu!oakhill!mikes

"At Motorola, we make many types of high speed microprocessor chips.
So NUMBER CRUNCH all you want, we'll make more!"