From: utzoo!decvax!decwrl!turtleva!ken
Newsgroups: net.unix-wizards
Title: Re: Optimize my kernel?
Article-I.D.: turtleva.166
Posted: Wed Feb 23 23:37:42 1983
Received: Thu Feb 24 06:21:39 1983
References: syteka.286

It is all right to optimize the kernel, but it is not recommended to
invoke the C optimizer on the kernel code.

The kernel is very different from other programs because it accesses
devices sitting out on the bus, which are not necessarily memory.  The
optimizer assumes that every location it accesses
behaves according to a model that resembles generic memory.  Many of
the control, status, command, and data registers of devices such as
UARTs, disk controllers, graphics boards, and the like DO NOT behave
like memory.  Some registers are not read/write; that is, they may be
writable but return garbage when read.  Likewise, others may hold
status-type bits, which can be read but not written.  One of the worst
hardware designs (found quite a bit in practice) is to have the same
memory address refer to two totally different registers when reading as
opposed to writing.  There may be a memory subsystem out there on the
bus which responds to every byte address in a certain range, but which
cannot be accessed in word quantities.  Some types of registers modify
the contents of other registers when they are read; e.g. reading the
data register of a UART usually clears the FULL bit in the status
register.
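
To make the UART case concrete, here is a minimal sketch in C of a
software model of such a device.  Everything in it is hypothetical
(the struct, the function names, and the position of the FULL bit are
invented for illustration and do not describe any particular UART).
It only shows that a read of the data register changes what the next
read of the status register returns, which ordinary memory never does.

```c
#include <stdint.h>

#define UART_STATUS_FULL 0x01u  /* hypothetical bit assignment */

/* Hypothetical software model of a UART; a real one sits on the bus. */
struct sim_uart {
    uint8_t status;   /* status register contents */
    uint8_t rx_data;  /* receive data register contents */
};

/* Model a character arriving from the line: latch it and set FULL. */
static void sim_uart_receive(struct sim_uart *u, uint8_t ch)
{
    u->rx_data = ch;
    u->status |= UART_STATUS_FULL;
}

/* Model a bus read of the data register.  The read itself has a
 * side effect: it clears the FULL bit in the status register. */
static uint8_t sim_uart_read_data(struct sim_uart *u)
{
    u->status &= (uint8_t)~UART_STATUS_FULL;
    return u->rx_data;
}

/* Model a bus read of the status register (no side effects here). */
static uint8_t sim_uart_read_status(const struct sim_uart *u)
{
    return u->status;
}
```

In this model, calling sim_uart_read_data() twice in a row is not the
same as calling it once, so a read that gets added or removed behind
the programmer's back changes what the device does.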

What difference does this make, you say?  Well, the optimizer will do
things like not even bother to read a location it has just written,
even if the C code does; it already [thinks] it knows what is contained
in that location because it just wrote it!  Some (probably most)
computers have a special instruction that performs the CLEAR function.
At least one UNIX-supporting minicomputer that I know of accomplishes
this by subtracting the location from itself, generating a
read-modify-write cycle instead of just a write cycle.  This can wreak
havoc with registers that have side effects (such as the UART data
register).  It is better to write the constant 0 than to invoke the
CLEAR instruction.
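
The write-then-read-back pattern can be sketched as follows.  This is
only an illustration under stated assumptions: the function names are
hypothetical, and the sketch assumes a compiler that supports the
`volatile` qualifier (standardized later in ANSI C, and not something
this article relies on), which tells the compiler that it may not
reuse the value it just stored in place of an actual read.

```c
#include <stdint.h>

/* Write a pattern to a register and read it back.
 * Returns 1 if the value read matches the value written, else 0.
 * Because reg points to a volatile object, the compiler must perform
 * the read instead of assuming it already knows the contents. */
static int reg_readback_ok(volatile uint16_t *reg, uint16_t pattern)
{
    *reg = pattern;          /* the write */
    return *reg == pattern;  /* the read-back, not elided */
}

/* Clear a register by storing the constant 0.  Which machine
 * instruction this becomes is still up to the compiler, so the
 * generated assembly should be checked on the target machine. */
static void reg_clear(volatile uint16_t *reg)
{
    *reg = 0;
}
```

Without the qualifier, an optimizer is free to turn the comparison
into a constant "true", which is one way bad hardware can end up
testing good.  On a real device the pointer would be set to the
register's bus address; an ordinary variable can stand in for it when
trying the code out.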

I discovered these problems by writing diagnostic code for imaging &
graphics peripherals.  Bad hardware tested good when I used the
optimizer, and tested bad when I didn't.

It would probably be all right to sic the optimizer on portions of the
kernel that do not deal with devices, but unless you are intimately
familiar with the hardware of a particular device, it is recommended
NOT TO OPTIMIZE ANY DEVICE DRIVER.

			Ken Turkowski
		{ucbvax,decvax}!decwrl!turtlevax!ken