Path: utzoo!attcan!uunet!cs.utexas.edu!tut.cis.ohio-state.edu!mailrus!ncar!tank!mimsy!chris
From: chris@mimsy.UUCP (Chris Torek)
Newsgroups: comp.lang.c
Subject: entry at other than main (was want to know)
Message-ID: <19164@mimsy.UUCP>
Date: 19 Aug 89 13:16:52 GMT
References: <8487@bsu-cs.bsu.edu> <2980@solo9.cs.vu.nl> <182@sunquest.UUCP> <2563@trantor.harris-atd.com>
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 122

In many articles many people write this, that, and the other argument for or against `main()' as the program entry point.

Personally, I do not see this as much of an issue. There must be *some* way to label something as the program entry point. The obvious way to do this is with a `reserved word'. Many languages use a special syntax:

	PROGRAM FOO
	IMPLICIT UNDEFINED (A-Z)
	...
	END

or

	program blivet(input, output);
	type goo = record ... end;
	var a, b, c : integer;
	begin
	...
	end.

Others simply `enter from the top' (SNOBOL does this, making subroutines exciting, since the subroutine must be defined before it is used, yet usually cannot be run before the main program itself begins). Still others (like C) reserve a particular function name. In languages with true reserved words, this has the trivial advantage of not `using up' another word.

Only a very few languages---particularly interpreted or `symbolic' languages---have historically allowed several program entry points. These get away with it by preserving enough of the symbol table---often this means `all of the symbol table'---to know the names of every function, and the types of arguments, and so on. Many compiled languages discard the symbols at the end of compilation, at least virtually (e.g., global symbols are retained for use with debuggers, unless you use `strip'), and C has historically taken this approach.
Once the symbols are gone, there is no good way to bind names to machine code locations, necessitating a simple convention like `start at the first byte' or `start at offset'.

Anyway, this gives us some background with which to consider the options available. We have four standard approaches available:

  a) program begins at procedure or function declared with some special syntax;
  b) program begins at top;
  c) program begins at reserved name (`main');
  d) program begins at any function (Lisp, APL, etc).

Of these, only one allows programmers and users to `do lots more', and that is the last approach. It is certainly very useful during debugging. But it has drawbacks: it uses more resources (you have to carry those symbols around, and provide a way to look them up). A more subtle drawback is that you may not *want* users to start your program anywhere---a canned application is only meant to be started in some particular way(s). Compiler vendors are probably not interested in their users' being able to invoke individual functions and perhaps `steal compiler technology' that way.

At any rate, you can, right now, go out and *buy* approach (d) for C: there are at least two C interpreters on the market. If you want it, go pay for it.

That leaves us with (a), (b), and (c). Of these, I would personally reject (b) out of hand, having had some experience with it, leaving only (a) and (c).

So: what does (a), adding a special syntax, buy us? Well, for one, we can name our programs. Instead of

	/* calculate prime factors */
	int main(int argc, char **argv) { ... }

we can write

	{ calculate prime factors }
	program primefactors(input, output) ...

That this is good, I think most will agree. That it is worth the `cost' of a program keyword is a bit more debatable. More intriguing to me is the fact that many compilers actually discard the program name almost immediately---the program name acts like a comment. If it acts like one, maybe it should just *be* one, as in C.
Either way, I think this is ultimately unimportant. One either learns `main is the program, look near it to figure out what the program is about' or `the program name is discarded, look away from it when the debugger prints locations' or whatever.

But there is another advantage to the special syntax, if we design it properly. We could allow programs to declare each entry point with a `program' or `entry' statement, and thus share subroutines and get the effect of switching on argv[0] on Unix machines, as ex/vi/view/edit/e and compress/uncompress do. To do this we must have the compiler and the linker cooperate: the compiler has to `leave behind' the names of all the program entry points, and the linker must include code to select the appropriate one at runtime. If there is only one entry point, the linker could skip the selection code. The benefits we know; the cost of this is some special syntax, some code in the compiler, and some more code in the linker.

Is this an advantage? Certainly, at least for programs like ex/vi/view/edit/e and compress/uncompress; they could leave out the `magic' used to decide how to operate, relying on the `magic' in the runtime library instead.

Is it worth it? Again, this is debatable. For every application that has several entry points you can find many that have only one. (In fact, ex/vi/... has only one: it sets flags based on argv[0], does some startup common to all variants, and only then looks at the flags. The same flags can be set or cleared under program control [e.g., `set magic', `set readonly'], so ex/vi/... is not such a great example. Compress/uncompress is a much better example.) Moreover, one of the philosophies underlying both C and Unix is (or at least was) `there is no magic': the language and the programs are (or at least, once were) generally simple and straightforward.
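
For concreteness, here is a rough sketch of the kind of argv[0] switching described above, written by hand in C. It is not the actual compress source: the names compress_main, uncompress_main, select_entry, and dispatch are made up for illustration, and the table stands in for whatever a cooperating compiler and linker might generate.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical entry points; a real program would do actual work here. */
static int compress_main(int argc, char **argv)
{
    (void)argc; (void)argv;
    puts("compressing");
    return 0;
}

static int uncompress_main(int argc, char **argv)
{
    (void)argc; (void)argv;
    puts("uncompressing");
    return 0;
}

/* One table row per entry point: the name the program was invoked
   under, and the function to start in. */
struct entry {
    const char *name;
    int (*fn)(int, char **);
};

static const struct entry entries[] = {
    { "compress",   compress_main },
    { "uncompress", uncompress_main },
};

/* Return the last component of a '/'-separated path. */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash != NULL ? slash + 1 : path;
}

/* Find the entry whose name matches the last component of argv0,
   or NULL if there is none. */
static const struct entry *select_entry(const char *argv0)
{
    size_t i;

    if (argv0 == NULL)
        return NULL;
    for (i = 0; i < sizeof entries / sizeof entries[0]; i++)
        if (strcmp(base_name(argv0), entries[i].name) == 0)
            return &entries[i];
    return NULL;
}

/* The selection code: pick an entry point from argv[0] and run it. */
int dispatch(int argc, char **argv)
{
    const struct entry *e = argc > 0 ? select_entry(argv[0]) : NULL;

    if (e == NULL) {
        fprintf(stderr, "unknown program name\n");
        return 1;
    }
    return e->fn(argc, argv);
}
```

With this in place, main() shrinks to `return dispatch(argc, argv);', and installing the binary under both names (e.g., as links) selects the behavior. Under approach (a) the table and the selection code would come from the compiler and linker rather than from the programmer.
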

At any rate, C uses the `reserved procedure name' approach, with its single merit of simplicity and its drawbacks as discussed above, and arguments in this newsgroup are unlikely to change this. If you really want multiple entry points *and* debuggability in C, go buy a C interpreter. If you want something in between, go write it yourself. Maybe, after demonstrating how wonderful it is, you can get it into C00 (or whatever the next standard may be called).
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu	Path: uunet!mimsy!chris