Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!psuvax1!ukma!tut.cis.ohio-state.edu!cs.utexas.edu!oakhill!mikes
From: mikes@oakhill.UUCP (Mike Schultz)
Newsgroups: comp.mail.elm
Subject: Serious elm problems on Sun OS 3.5.2
Message-ID: <2311@oakhill.UUCP>
Date: 16 Aug 89 14:25:51 GMT
Reply-To: mikes@oakhill.UUCP ()
Distribution: usa
Organization: Motorola Inc., Austin, Texas
Lines: 53

Our site has been running elm 2.2 ever since it became available.  We are now
running patch level 10.

Shortly after I installed elm 2.2, I began getting complaints from users that
elm was crashing.  I blamed the problem on system overloading:  elm doesn't
always check all system calls for the less frequent error returns caused by
stuff like process table overflows (which do happen occasionally :-)).
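
A minimal sketch of the kind of check meant here, assuming a POSIX-style
system.  This is not code from the elm source: run_checked is a hypothetical
helper name, and the retry count and one-second delay are arbitrary choices.
It treats EAGAIN from fork() (the error returned when the process table is
full) as a temporary condition instead of ignoring it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of elm): run a program and return its
 * exit status, or -1 if the child could not be created or waited for.
 */
int run_checked(const char *path, char *const argv[])
{
    pid_t pid = -1;
    int err = 0;
    int status;
    int tries;

    /* Retry a few times if fork() fails because the process table is full. */
    for (tries = 0; tries < 3; tries++) {
        pid = fork();
        if (pid >= 0)
            break;
        err = errno;            /* save errno before any other call */
        if (err != EAGAIN)
            break;              /* not a temporary shortage; give up */
        sleep(1);
    }

    if (pid < 0) {
        fprintf(stderr, "fork failed: %s\n", strerror(err));
        return -1;
    }

    if (pid == 0) {
        execv(path, argv);
        _exit(127);             /* reached only if execv() itself failed */
    }

    /* Parent: wait for the child, restarting if a signal interrupts us. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller that gets -1 back can tell the user the system is overloaded rather
than carrying on as if the child had run.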

But then it happened to me one day and I could repeat the failure, so I got out
adb and went bug hunting!  And what did I find?  I found an illegal instruction
in the middle of the object file.  No, I didn't start disassembling in the
middle of an instruction; it was right there in the disassembly, in among the
legal instructions... And then it wasn't!

Yes, that's right.  While I was poking around trying to figure out how, on a
virtual memory machine with page protection, elm managed to get an instruction
trashed, the instruction fixed itself and elm ran my test case without failure!

Whoa!  I took my evidence to the system managers and reported that something
was wrong.  For my trouble I got blank looks and was told that no one else was
having this problem and that usenet software always has bugs in it.  (So
they're human and overworked; who isn't?)

I let the problem drop, confident people would begin to report other
mysterious problems and that the problem would eventually get corrected.

But they didn't, and the complaints about elm went away too.

Unfortunately, the problems are coming back again, now several months later.

The symptoms are still the same.  Elm randomly crashes, adb disassemblies
show illegal instructions in the code, but almost never in the same place.
The most frequent sequence that causes the crash is to start elm, change 
folders, use * to advance to the end, back up one message, and then perform
a reply.  I don't know if it is significant, but the folder has 51 messages
in it, and when I back up a message, elm has to scroll backwards one page.
The routine that most frequently shows the problem is get_return.

It has happened using a vt100 termcap entry, and the Sun OS is 3.5.2.

Anybody heard of this problem?  Is there a bug in the Sun OS?  Is anybody
currently running elm 2.2 on a Sun OS 3.5.2?

Please respond using email.

Hurry up guys!  The natives are beginning to want 2.1 back.

Mike Schultz
mikes@oakhill.uucp
...!uunet!cs.utexas.edu!oakhill!mikes

"At Motorola, we make many types of high speed microprocessor chips.
	So NUMBER CRUNCH all you want, we'll make more!"