Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site brl-tgr.ARPA
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!godot!harvard!seismo!brl-tgr!tgr!Schauble@MIT-MULTICS.ARPA
From: Paul Schauble
Newsgroups: net.lang.c
Subject: Re: length of external names
Message-ID: <7035@brl-tgr.ARPA>
Date: Sun, 6-Jan-85 02:21:17 EST
Article-I.D.: brl-tgr.7035
Posted: Sun Jan 6 02:21:17 1985
Date-Received: Mon, 7-Jan-85 03:18:07 EST
Sender: news@brl-tgr.ARPA
Organization: Ballistic Research Lab
Lines: 85

I'm not sure that I should post this to the net, but I can't resist.

Henry Spencer, who seems to be one of the chief exponents of short external names, just posted a convincing explanation of the need to not break existing linkers. I understand why, and I understand the issues involved. I even mostly agree.

In a previous incarnation I worked on COBOL and PL/1 for a manufacturer that had the same problem: a language that required long names and a linker that only handled short ones. The solution that was used, and that worked, was to have the COMPILER use the external "name" to store a hashed value. During the recent net discussion I posted a description of this technique and some analysis of the chance and cost of collisions. This is done entirely in the compiler, and has no effect on the linker.

I have not seen any reasonable statement of why this would not be workable. The only objection that I can recall was that having to look up the name translation during debugging was extra work. True, but consider: would you rather have the extra work on the few occasions that you need to look up a symbol on the load map, or on the far more frequent occasions that you are dealing with C source and have to guess what "dtfmdu" or something means? You know which way I will vote.

More recent discussion prompts me to post a small modification of the technique.
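
For reference, the basic hashing step can be sketched roughly as follows. This is only an illustration under stated assumptions: the six-character, single-case linker limit (EXT_NAME_LEN), the choice of a 32-bit FNV-1a hash, and the names fnv1a and hashed_external_name are all made up for the example and are not taken from the COBOL or PL/1 compilers mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

#define EXT_NAME_LEN 6  /* assumed linker limit; not a figure from the post */

/* 32-bit FNV-1a hash. An illustrative choice only; any hash with a
   reasonable spread would serve the same purpose. */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s != '\0') {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Write a hashed external name of EXT_NAME_LEN characters plus a
   terminating NUL into out, which must hold EXT_NAME_LEN + 1 bytes.
   The first character is always an upper-case letter and the rest are
   upper-case letters or digits, so the result is acceptable to a
   linker that ignores case and wants names to start with a letter. */
void hashed_external_name(const char *source_name, char out[EXT_NAME_LEN + 1])
{
    static const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    uint32_t h = fnv1a(source_name);
    size_t i;

    out[0] = alphabet[h % 26];          /* letters only */
    h /= 26;
    for (i = 1; i < EXT_NAME_LEN; i++) {
        out[i] = alphabet[h % 36];      /* letters or digits */
        h /= 36;
    }
    out[EXT_NAME_LEN] = '\0';
}
```

The compiler would call hashed_external_name once for each long global name and emit the result in the object file, and would also write the long-name to short-name pairs to a listing so that the load map can be translated back during debugging. The same source name always yields the same external name, so separately compiled files agree without any help from the linker.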

Several people have pointed out the desirability of a language feature that would let the internal and external names of a global item be different, e.g.

    extern int date_and_time() entry "SYS$TIME";
    extern int memory_size entry "CSYS$MEMSIZ";

I like this: other languages have it, it's useful, and it would have saved me from having to write a number of assembler routines whose only purpose was to change names.

It also allows me to suggest a modification of the hashing technique. Note that this only applies to systems with deficient linkers.

If the declaration contains an entry clause, use that as the external name. Otherwise, if the item name is short enough, use the item name. Otherwise, hash the item name and use the result as the external name.

This allows programming using the full names, and using the entry clause for those cases where you really care what the external name is, or in the rare cases when the hash causes a duplication of external names.

----------------------------------------------------------------------

Now, my questions:

To the standards committee people:

1. Suppose that the standard required longer names and suggested hashing as an implementation technique. That would force manufacturers to update either linker or compiler to meet the standard. Is this politically possible?

2. In some other areas, I am told, the standard describes a relatively high-level language rather than the minimum common to existing implementations. This will prevent some present compilers from meeting the standard. Why should it pick the minimum here?

3. How can I get a copy of the draft standard?

4. Is this an adequate method of getting comments and questions to the committee? If not, what is a useful channel?

To the net at large:

1. What are specific objections to the hashing technique?

2. Are there any machines where it won't work, and why?

Please copy me on any answers. Service from the list has been erratic lately.

Thanks for all the fish...

Paul Schauble@MIT-Multics.ARPA
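
The three-way rule for choosing an external name (entry clause, then short item name, then hash) might look something like the sketch below. It is a rough illustration, not a description of any existing compiler: the six-character limit EXT_NAME_LEN, the FNV-1a style hash inside hash_name, and the function name external_name_for are all assumptions made for the example, and the entry clause itself is only a proposed feature.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXT_NAME_LEN 6  /* assumed linker limit, chosen for illustration */

/* Hash a long name into EXT_NAME_LEN upper-case characters plus NUL.
   The hash (32-bit FNV-1a) is an illustrative choice only. */
static void hash_name(const char *name, char out[EXT_NAME_LEN + 1])
{
    static const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    uint32_t h = 2166136261u;
    size_t i;

    while (*name != '\0') {
        h ^= (unsigned char)*name++;
        h *= 16777619u;
    }
    out[0] = alphabet[h % 26];          /* first character is a letter */
    h /= 26;
    for (i = 1; i < EXT_NAME_LEN; i++) {
        out[i] = alphabet[h % 36];
        h /= 36;
    }
    out[EXT_NAME_LEN] = '\0';
}

/* Pick the external name for a global item.
   entry_name is the string from the entry clause, or NULL if the
   declaration has none. hash_buf is scratch space supplied by the
   caller and is used only when the third rule applies. */
const char *external_name_for(const char *item_name,
                              const char *entry_name,
                              char hash_buf[EXT_NAME_LEN + 1])
{
    if (entry_name != NULL)
        return entry_name;              /* rule 1: explicit entry clause */
    if (strlen(item_name) <= EXT_NAME_LEN)
        return item_name;               /* rule 2: name already fits */
    hash_name(item_name, hash_buf);     /* rule 3: hash the long name */
    return hash_buf;
}
```

With this arrangement an entry clause always wins, so it doubles as the escape hatch when two long names happen to hash to the same external name: the programmer gives one of them an explicit entry name and the clash goes away. The sketch passes an entry-clause name through unchanged, on the assumption that the programmer who writes one knows what the target linker accepts.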