Path: utzoo!utgpu!attcan!uunet!lll-winken!lll-tis!ames!mailrus!purdue!i.cc.purdue.edu!k.cc.purdue.edu!l.cc.purdue.edu!cik
From: cik@l.cc.purdue.edu (Herman Rubin)
Newsgroups: comp.arch
Subject: Re: Software Distribution
Summary: It would be very useful, but it is impossible.  Maybe an
	 approximation could be made
Message-ID: <883@l.cc.purdue.edu>
Date: 20 Aug 88 11:58:18 GMT
References: <891@taux01.UUCP>
Organization: Purdue University Statistics Department
Lines: 96

In article <891@taux01.UUCP>, chaim@taux01.UUCP (Chaim Bendelac) writes:
> Previous articles (under "Standard Un*x HW"/"Open Systems"/"ABI", etc) have
> expressed the wish for portability standards. Many organizations are spending
> tremendous resources to promote such standards. Nothing new there.
> 
> I wondered, if there is no room for another standard layer, specially 
> designed for software DISTRIBUTION.  Imagine an Intermediate Program 
> Representation Standard (IPRS) along the lines of an intermediate language
> within a compiler.  Language independent, architecture independent.
> The distributor compiles and optimizes his program with his favorite 
> language compiler into its IPR, copies the IPR onto a tape and sells. 
> The buyer uses a variation on 'tar' to unload the tape and to post-process 
> the IPR program with the system-supplied architecture-optimized IPRS-to-binary 
> compiler backend.

This would be very useful, _but_ consider the following problems.  I could 
probably list over 1000 hardware-type operations (I am not including such
things as the elementary transcendental functions; only those things for
which I can come up with a nanocode-type bit-handling algorithm, such as
multiplication and division) which I would find useful.  My decision as to
which algorithm to use for a particular problem would be highly dependent on
the timing of these operations.  To give a simple example, one would be well-
advised to avoid too many divisions on a CRAY.  Integer division on a CRAY or
a CYBER 205 is more expensive than floating-point division, and on the CRAY
extra work is even needed to ensure the correct quotient.  Packing or unpacking a
floating point number is trivial on some machines but much more difficult
on others.
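
To make the division point concrete, here is a rough sketch in C (certainly
not actual CRAY code; the function name and the restriction to nonnegative
operands are only for illustration) of getting a quotient by reciprocal
multiplication, and of the correction it takes to make the quotient come
out right:

    #include <assert.h>

    /* Quotient by reciprocal multiplication, with a fixup pass.  The
       reciprocal is only approximate, so the trial quotient can be off
       by a unit or so; without the correction the result need not be
       the true integer quotient. */
    long div_via_recip(long n, long d)
    {
        assert(n >= 0 && d > 0);             /* sketch: nonnegative case only */
        double r = 1.0 / (double)d;          /* approximate reciprocal        */
        long   q = (long)((double)n * r);    /* trial quotient                */
        while (q * d > n)                    /* guessed high: back off        */
            q--;
        while ((q + 1) * d <= n)             /* guessed low: bump up          */
            q++;
        return q;                            /* now 0 <= n - q*d < d          */
    }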

Thus one cannot optimize a program without knowing the explicit properties of
operations on the target machine.  We used to have a CDC6500 and a 6600 at
Purdue.  These machines had exactly the same instruction set, and unless there
was a fancy attempt at speedup, using parallel I/O and computing in an unsafe
manner, exactly the same results would occur.  However, optimization for the
two was totally different.

I suggest instead that we have a highly flexible intermediate language, with
relatively easy but flexible syntax, and a versatile macro processor.  This
would be enough by itself in many situations, but no such tool now exists.  An example
of a macro is
		x = y - z

which I would like to treat as the (= -) macro.  Then we could have various
algorithms which an optimizing macro assembler could assemble and estimate
the timing.
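
A rough sketch of the selection idea, with the machine names, expansion
texts, and cycle counts all invented for illustration: the (= -) macro has
several candidate expansions, and the assembler takes the cheapest one for
the target.

    #include <stdio.h>
    #include <string.h>

    /* Candidate expansions of the (= -) macro, i.e. x = y - z, per target.
       Everything in this table is made up; the point is only the selection
       by estimated cycle count. */
    struct expansion { const char *target; const char *text; int cycles; };

    static const struct expansion assign_sub[] = {
        { "machA", "LOAD  R1,y ; SUB   R1,z ; STORE R1,x", 3 },
        { "machA", "NEG   R1,z ; ADD   R1,y ; STORE R1,x", 4 },
        { "machB", "SUBM  x,y,z", 5 },   /* memory-to-memory form */
    };

    /* Return the cheapest known expansion for the given target, or NULL. */
    static const char *expand(const char *target)
    {
        const struct expansion *best = NULL;
        for (size_t i = 0; i < sizeof assign_sub / sizeof assign_sub[0]; i++)
            if (strcmp(assign_sub[i].target, target) == 0 &&
                (best == NULL || assign_sub[i].cycles < best->cycles))
                best = &assign_sub[i];
        return best ? best->text : NULL;
    }

    int main(void)
    {
        printf("x = y - z on machA:  %s\n", expand("machA"));
        return 0;
    }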

Another advantage of something like this, and one particularly relevant to this
group, is that it would point up the multitude of situations in which simple
hardware instructions not now available could greatly speed up operations.
I personally consider the present "CISC" machines as RISCy.

> 
> No need for cumbersome source-distributions, no more different binary copies
> of the software. Utopia! You introduce a weirdly new, non-compatible 
> architecture? Just supply a Standard Unix (ala X/Open or OSF or whatever), 
> an IPRS-to-binary backend, and you are in business. The Software Crisis 
> is over! 
> 
See above.  Something can be done, but it will be simpler than what was proposed.
> :-)   ?   :-(
> 
> No free lunch, of course. The programmer still has to write "portable"
> software, which is a difficult problem. A truly language- and architecture-
> independent interface is almost as difficult to design as the old "universal 
> assembler" idea. But with enough incentives, perhaps?  Questions:
> 
The "universal assembler" is more practical, if it written more like CAL
with overloaded operators.  The interface would require a macro processor,
but could be done with little more.  However, as I have pointed out, the
portable software mentioned above cannot exist.  The examples above are not
for truly parallel machines.  On a parallel machine, how would one break a
vector into the positive and negative elements, use a separate algorithm to
compute a function for these cases, and put the results back together in the
order of the original arguments?  Something can be done, but I suggest one
would be better served by kludging in additional hardware.  Now if one is 
stuck with this situation, and does not have the additional hardware, a somewhat
slower algorithm may be in order.
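
Here is a scalar sketch in C of the gather/compute/scatter pattern just
described; f_pos and f_neg stand for the two algorithms, and on a real vector
machine one would want compress/expand hardware to do the index work in the
gather and scatter steps.

    #include <stdlib.h>

    /* Split x[0..n-1] by sign, run a different routine on each packed part,
       and scatter the results back into the original order.  Error handling
       is minimal; this is only a sketch of the data movement involved. */
    void split_compute_merge(const double *x, double *y, size_t n,
                             double (*f_pos)(double), double (*f_neg)(double))
    {
        double *pack = malloc(n * sizeof *pack);
        size_t *idx  = malloc(n * sizeof *idx);
        size_t  np = 0, nn = 0;

        if (pack == NULL || idx == NULL) { free(pack); free(idx); return; }

        /* gather: nonnegative elements at the front, negative at the back */
        for (size_t i = 0; i < n; i++) {
            if (x[i] >= 0.0) { idx[np] = i; pack[np] = x[i]; np++; }
            else             { nn++; idx[n-nn] = i; pack[n-nn] = x[i]; }
        }

        /* compute: each packed part gets its own algorithm */
        for (size_t i = 0;  i < np; i++) pack[i] = f_pos(pack[i]);
        for (size_t i = np; i < n;  i++) pack[i] = f_neg(pack[i]);

        /* scatter: restore the original order of the arguments */
        for (size_t i = 0; i < n; i++) y[idx[i]] = pack[i];

        free(pack);
        free(idx);
    }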

We must face the fact that there cannot be efficient portable software.  We
may be able to produce reasonably efficient semi-portable software, and we
should try for that.  I believe that the tools for that can be developed.

We should also try to improve the hardware so that the "crazy" instructions
implementable in nanocode or hardwired can be used.  There are useful
operations, not present in the HLLs, which are so slow in software as to be
impractical unless done in hardware.
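
One example of my own choosing (population count; no particular instruction
was named above): the portable loop below is what an HLL leaves you with,
while a hardwired instruction does the same job in a cycle or two.

    /* Count the one bits in a word, portably.  Each pass clears the lowest
       set bit, so the loop runs once per bit set -- cheap in hardware,
       painfully slow as the inner step of a large computation. */
    unsigned popcount(unsigned long w)
    {
        unsigned c = 0;
        while (w != 0) {
            w &= w - 1;     /* clear lowest set bit */
            c++;
        }
        return c;
    }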

> 	1. How desperate is the need for such a standard? (I know: GNU
> 	   does not need ISPRs nor ABIs...)
> 	2. Assuming LOTS of need, how practical might this be?
> 	3. What are the main obstacles? Economical? Political? Technical?
> 	4. What are the other advantages or disadvantages?


-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette, IN 47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)