Xref: utzoo comp.arch:11590 comp.sys.ibm.pc.rt:1002
Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!uunet!mcsun!tuvie!edvvie!eliza!johnny
From: johnny@edvvie.at (Johann Schweigl)
Newsgroups: comp.arch,comp.sys.ibm.pc.rt
Subject: integer alignment problems on RT
Keywords: RT 6150 032 ROMP alignment
Message-ID: <162@eliza.edvvie.at>
Date: 26 Sep 89 09:15:45 GMT
Organization: Edv GesmbH, Austria/Europa
Lines: 110

Environment: IBM PC/RT, ROMP RISC CPU, AIX 2.2.1, standard AIX C compiler

After two nights of hunting for a bug, the big enlightenment came over me,
and with it the memory of the old law 'thou shalt write four-byte
integers to word boundaries'.

The story was as follows:
I'm producing an output stream consisting of an int that contains the
length of the following string, then the string itself, repeated for every
string to be written. The strings have arbitrary length, so the next int
(4 bytes) can land at any address, even or odd, word-aligned or not.
Because I paid no attention to alignment rules, the tail of each string
was destroyed by the int that followed it.
That's it. The CPU writes every int to the word boundary <= the actual address.
This is the assembly code:
	...

#       	*msgBufCurr.Integer = curColLen;
	l	4,8+L.1L(1)    	# load msgBufCurr.Integer into R4
	l	3,12+L.1L(1)    # load curColLen into R3
	st	3,0(4)    	# store R3 to *msgBufCurr.Integer
	...

Nothing is visible from outside the CPU.
What looks very suspect to me is that the CPU simply aligns the address
internally and writes the int to the new, aligned address.

I tried the same on my '386 AIX machine and, lo and behold, this one
does not care at all. If you write a 4-byte int to any address,
odd or otherwise, the CPU simply does it.

This leads me to the final questions: 
- Is it acceptable that the CPU changes the address you supplied without any
  warning and does something you wouldn't expect?
- How do other CPUs behave (e.g. 88000, 68000, SPARC, MIPS)?
- Would you prefer getting an 'alignment violation trap' or something like it?
- Does any CPU implement such a trap?

Besides the discussion, which I would like to follow on the net (if there is
any response), I include the C program source I used to prove my shame. If
you've got any of the above CPUs or another weirdo, and have a bit of time to
spend, please compile it and email me the output of the program, your CPU type
and the assembler listing of the program. Just because I love to read assembler
listings of CPUs I don't know.

Thank you.
----- start of code ----------------------------------------------------------

#include <stdio.h>
#include <ctype.h>

void memHexDump();

union _ptr {
	int  	*Integer;
	char	*Character;
};

typedef union _ptr	ptr;

int main()
{
	int 	iArr[4];
	ptr	foo;
	ptr	bar;

	iArr[0] = 0;
	iArr[1] = 1;
	iArr[2] = 2;
	iArr[3] = 3;

	foo.Integer = iArr;
	bar.Integer = iArr;
	memHexDump(foo.Character,16,"iArr[4] before hacking around");
	foo.Character += 5; /* Har har ack ack barf barf */
	*foo.Integer = -1;  /* 0xffffffff, a nice pattern */
	memHexDump(bar.Character,16,"iArr[4] after hacking around");
	return 0;
}

void memHexDump(source,n,name)
char *source;
int n;
char *name;
{
	register int 	i;
	static char	hexChars[] = "0123456789abcdef";

	printf("memHexDump: %d bytes dump of %s\n",n,name);
	printf("memHexDump: starting at address %08lx\n",(unsigned long)source);
	for (i = 0; i < n; i++) {
		putchar(hexChars[i % 16]);	/* column ruler 0..f */
	}
	putchar('\n');
	for (i = 0; i < n; i++) {
		putchar(isprint((unsigned char)*(source + i)) ? *(source + i) : '.');
	}
	putchar('\n');
	for (i = 0; i < n; i++) {
		putchar(hexChars[(unsigned char)*(source + i) >> 4]);	/* high nibble; cast avoids a negative index where char is signed */
	}
	putchar('\n');
	for (i = 0; i < n; i++) {
		putchar(hexChars[*(source + i) & 0x0f]);	/* low nibble */
	}
	putchar('\n');
}
-- 
       ------------------------------------------------------------------
       EDV Ges.m.b.H Vienna              Johann Schweigl    
       Hofmuehlgasse 3 - 5               USENET: johnny@edvvie.at
       A-1060 Vienna, Austria      Tel: (0043) (222) 59907 257 (8-19 CET)