Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!uflorida!novavax!twwells!bill
From: bill@twwells.com (T. William Wells)
Newsgroups: comp.lang.c
Subject: Re: pointer poison (was: effect of free())
Message-ID: <1989Oct2.160047.25461@twwells.com>
Date: 2 Oct 89 16:00:47 GMT
References: <184@bbxsda.UUCP>
Organization: None, Ft. Lauderdale, FL
Lines: 154

In article <184@bbxsda.UUCP> scott@bbxsda.UUCP (Scott Amspoker) writes:
: >I'm not sure what you are talking about. The original posting in this
: >thread (mine) said that the C standard permits the compiler writer to
: >assume that you don't use a pointer after a free...
: >[...]
: >If I write the program so that it is not designed to reference
: >pointers after they are freed, assuming I get it right, it doesn't
: >matter whether the generated code trashes the pointer...
: >[...]
: >The conclusion is that I should avoid referencing the pointer after
: >it is freed, so that this perfectly legit optimization won't break my
: >code.
:
: I agree, the original posting had to do with testing a pointer after
: a free() call. However, the discussion quickly expanded into pointer
: handling in general. Many readers were concerned that their code
: could possibly be testing or moving a pointer that did not contain
: a valid address. Even though their programs may have the appropriate
: logic in place that would prevent ultimate *dereferencing* of such a
: pointer, it was suggested that merely handling such a pointer is
: considered a bug: (ex: p1 = p2 causes a trap if p2 contains an
: invalid address). While I would agree that handling a pointer
: that you won't ultimately be using (because of some later
: condition) is questionable style - it's hardly an outright *bug*.

I'm going to explain this at length, so that we can stop arguing about it.

Your assertion amounts to this: using a freed pointer's value doesn't break anything, so it is OK. And I'm saying that that is not true.
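
To make the disputed case concrete, here is a minimal sketch in C. The names release() and demo_release() are hypothetical, invented for illustration; nothing like them appears in the thread. The sketch shows one way to stay inside the model: overwrite the pointer variable as soon as the block is freed, so that no later test or copy ever reads a freed pointer's value.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): free the block and
 * overwrite the caller's pointer, so the stale value is never read. */
void release(char **pp)
{
    assert(pp != NULL);
    free(*pp);          /* free(NULL) is a no-op, so a repeat call is harmless */
    *pp = NULL;         /* storing INTO the variable is fine */
}

/* Hypothetical demo: returns 1 if the pointer reads as null after
 * release(), or -1 if the allocation failed. */
int demo_release(void)
{
    char *p = malloc(16);
    char *q;

    if (p == NULL)
        return -1;      /* allocation failed; nothing to show */

    /* Outside the model, even though nothing is dereferenced:
     *     free(p);
     *     q = p;           copies a freed pointer's value
     *     if (p != NULL)   tests a freed pointer's value
     */
    release(&p);
    q = p;              /* fine: p now holds a null pointer */
    return q == NULL;
}
```

Provided the allocation succeeds, demo_release() returns 1, because the only pointer values the code ever reads are a live pointer and a null pointer.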

A C program can operate in one of two modes: within the C model, and outside it. Programs that operate within the C model may do different things due to implementation differences, yet, until they stray outside the model, will do predictable things. (Compiler bugs permitting, anyway. :-) Programs that operate outside the C model might do *anything*.

Obviously, you want to never write a C program that goes outside the C model. Unfortunately, this is not always possible, for example when writing code that has to reference specific addresses. But such programs are always nonportable, and so should never be written that way unless their purposes are inherently nonportable. (And then, the only parts of the program which go outside the model should be the ones that must.)

It is nice when a compiler has a wider model than the C model, making it possible to write such programs, and making it easier to debug your program when it goes outside the usual model. But such a compiler can also be misused, especially by those who take its model as "the" C model.

What exactly is "the C model"? This is a list of assertions associated with each part of a program. Each assertion may say something about the source code of that part or the state of the "C machine" when executing that code.

For example, the C model includes a statement that the right hand operand of the divide operator must be nonzero when the code is executed. If you do divide by zero, *anything* can happen. It is the case that most machines will either quietly ignore this error condition, stop the program execution and return to the OS, or trap to an error routine, but if your program played taps over your machine's speaker and went into an infinite loop, you shouldn't be too surprised. :-)

The question arises as to which exact set of assertions should comprise the C model. Obviously, the "dictatorship" view: *my* compiler defines the C model, is right out.
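
As an aside, the divide example above can be sketched in C. This is only an illustration under stated assumptions: checked_div() is a hypothetical name, not something from the standard or from this thread. It checks the model's assertion (a nonzero right operand) before dividing. It also rejects INT_MIN / -1, a second case, added here and not mentioned in the post, in which the quotient is not representable and the division is likewise undefined.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: divide only when the operation is inside the
 * C model. Returns 1 and stores the quotient on success; returns 0
 * and leaves *quot untouched otherwise. */
int checked_div(int num, int den, int *quot)
{
    assert(quot != NULL);

    if (den == 0)
        return 0;               /* right operand must be nonzero */
    if (num == INT_MIN && den == -1)
        return 0;               /* quotient would not fit in an int */

    *quot = num / den;
    return 1;
}
```

A caller tests the return value and reads the quotient only on success, so the program never leaves the model, whatever a particular machine would have done with the bad division.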
On the other hand, the "liberal" (American sense) view, the view that this set is null or as close to it as possible, is also right out.

Another view, the "democratic" view, says that the C model is the intersection of the models of some set of popular compilers. This too is out. Like any absolute democracy, it tramples on those who are not in the majority, by declaring that their particular problems are of little concern.

Yet another view, the "anarchic" view, says that the C model is the intersection of every C model. This view, too, is out. Should we cater to the quirks of, say, one of the brain-damaged "C compilers" for the 8051? Or what about compiler bugs? What about compiler "features" (like 8086 compilers' near and far keywords)?

Should we even try to define the C model in terms of what existing compilers do? Never mind that this really is begging the question; the answer is the same as in politics: you *can* define the C model in terms of existing compilers, but this is going to result in compromise and dissatisfaction all 'round. And eventual chaos.

There is, as in politics, one way that can work, the "constitutional" method. In this method, there is a piece of paper which defines the language, the standard. The standard serves as the touchstone by which we determine the model: any assertion stated or implied by the standard is part of the model; any other assertion is not.

Just as with constitutions, standards require interpretation, will contain ambiguity and incompleteness and downright error, and will generate endless debate. Such is the consequence of our being finite; a standard represents the best we can do at the time, but we aren't going to have a *perfect* standard (not, at least, till programming becomes an engineering discipline instead of an art. No, it isn't!)

In spite of this, many of the views mentioned above have some merit. Obviously, a standard that is largely inconsistent with existing practice is going to be worthless.
And a standard that ignores the needs of the minorities is going to alienate a large part of the community. (We are all, after all, likely to become a part of that minority at some time or another. :-)

So, having a standard doesn't really solve the problems. Instead, however, it gives those problems to a small group of people who will do their best to satisfy as many of the conflicting desires of the C community as possible. It is guaranteed that some minorities will be left out in the cold. And some parts of the standard will even offend the majority (6 character monocase externals, faugh!). But once it is done we have a *single* (well, within the parameters of "implementation defined") C model which we can all look to and which a programmer, who you can usually bet is not as conversant with the problems of many different machines as the standard writers, can follow and have, as a consequence, a justified belief that his program will be portable.

(A similar reasoning applies to the de facto standards that exist in the absence of a real standard. See "democracy" above and apply that de facto standard instead of a real standard in the following paragraphs.)

Now, to get out of the ether and back to the real world, we have a practical question: should a programmer limit his portable programs to the C model in the standard or should he use a wider or even different model? The answer to the latter should be clear: a programmer writing portable code that is inconsistent with the standard C model is just fooling himself.

But that still leaves the question open: should we use a wider model? The answer to that should still be "no". For the "yes" answer implies that you know about all those zillions of systems out there and are willing to gamble that none of them breaks your model. And also that you know about all those zillions of systems *that do not exist yet* and are willing to make that gamble about them as well.

So, to summarize my point so far: if you are writing portable programs, you must write to the actual standard or to some kind of "democratic" de facto standard. But best of all is to write your programs so that they don't violate the de facto standard and are easily modified, as the two converge, to meet the actual standard.

The freed pointer thing is, as has been argued, acceptable within the de facto standard's model. No one has shown a real system where this fails. Fine. But, once we are following the C standard, using a freed pointer will not be within the C model. Since (unlike, e.g., prototypes) there is no contradiction involved (one can simply not use freed pointers), one should never do it at all.

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com