Xref: utzoo comp.sources.d:3975 comp.os.vms:16869
Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!mailrus!iuvax!cica!tut.cis.ohio-state.edu!ucbvax!bloom-beacon!adam.pika.mit.edu!scs
From: scs@adam.pika.mit.edu (Steve Summit)
Newsgroups: comp.sources.d,comp.os.vms
Subject: Re: Flex (latest 2.1 beta) on VMS
Keywords: read() in VMS library
Message-ID: <13603@bloom-beacon.MIT.EDU>
Date: 18 Aug 89 05:46:31 GMT
References: <1629@naucse.UUCP>
Reply-To: scs@adam.pika.mit.edu (Steve Summit)
Lines: 64

In article <1629@naucse.UUCP> jdc@naucse.UUCP (John Campbell) writes:
>The new flex reads a large chunk at a
>time.  With VMS STREAM-LF files this works just fine--but with "normal"
>VFC editor text files (darn these RMS things) the VMS 'C' rtl will only
>return at most 1 record full of characters for any large read() byte
>request.

>During processing on a flex input file, flex complains of a "NULL in
>input."  This seems to be because yyunput() wants to "shift things up
>to make room" and assumes that the end of the valid buffer is around
>YY_BUF_SIZE deep.

This sounds like a bug in flex.  If I understand the complaint
correctly, the code gets confused when the buffer is not (?)
substantially full.  (This sounds odd; code usually fails when
buffers fill up, not when they stay relatively empty.)

Flex should certainly be fixed to handle "short" reads.  The set
of conditions under which read() is guaranteed to return the full
count given as its third argument is much smaller than the set of
exception cases -- those in which, though succeeding (neither
error nor EOF), read() returns fewer characters than requested.
In fact, the set of "normal" cases has exactly one element: reads
from disk files in which as many bytes as are requested exist
between the current offset and end-of-file.
This set can further be restricted to Unix systems (VMS and
MS-DOS read emulations do not necessarily comply), and I wouldn't
be surprised if there are distributed filesystems or other
wrinkles existing under apparently pristine Unix variants which
also cause the assumption to break down.

The message is clear: never assume read() will return everything
you ask for.  This is usually straightforward, and I can't
imagine why flex is having trouble with it.  (flex is probably
doing something wildly inappropriate in its input buffering
strategy, doubtless out of efficiency concerns, which actually
might be acceptable in a lexer, lexical analysis being a frequent
bottleneck, but still no excuse for incorrect or unportable
code.)

                                            Steve Summit
                                            scs@adam.pika.mit.edu

>oriented file systems read() will, of course, only do this on the last
>buffer.)
>
>So my problem is not that I can't understand what is causing the "NULL
>in input" message, but a request for what the best solution might be.
>I could, of course, create a special VMS YY_INPUT macro that fills
>the buffer like a unix read() would using getc, or I could try to
>patch yyunput() and hope that the read() behavior assumption is isolated
>to this spot in flex's code.
>
>If you have an opinion on which way to go I'd like to hear it.  If you
>have already solved the problem I'd like to know what you did.  If you
>can think of a reason why this read() assumption is a bad idea for unix
>(streams and producer/consumers that might not always behave like flex
>is assuming) I'd like to know about that also.
>
>If I've been unclear and you want to know what the he-- I'm talking about
>just mail me a message indicating where I was unclear.  I'm going to try
>hard to shelve this project for a day or so...
>--
>    John Campbell               ...!arizona!naucse!jdc
>                                CAMPBELL@NAUVAX.bitnet
>    unix?  Sure send me a dozen, all different colors.
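
The advice above (never assume read() will return everything you
ask for) might look roughly like this in practice.  This is only
a sketch: read_fully() is a made-up name, not anything in flex or
in the VMS C RTL, and it assumes a POSIX-style read() that returns
0 at end-of-file and -1 with errno set on error.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of flex: keep calling read() until
 * `want` bytes have arrived, end-of-file is reached, or an error occurs.
 *
 * Returns the number of bytes actually stored in buf.  That count may be
 * smaller than `want` if end-of-file (or an error after some data had
 * already arrived) cut the transfer short.  Returns -1 only when an
 * error occurs before any byte has been read.
 */
long read_fully(int fd, char *buf, size_t want)
{
    size_t got = 0;

    while (got < want) {
        ssize_t n = read(fd, buf + got, want - got);

        if (n > 0) {
            got += (size_t)n;   /* short read: go back and ask for the rest */
        } else if (n == 0) {
            break;              /* genuine end-of-file */
        } else if (errno == EINTR) {
            continue;           /* interrupted by a signal: retry */
        } else {
            return got > 0 ? (long)got : -1;
        }
    }
    return (long)got;
}
```

Whichever way the buffer gets filled, the caller still has to use
the returned count as the end of valid data rather than assuming
the buffer is full.  Looping like this is also not always what you
want: on a terminal or a pipe it will block until the whole request
is satisfied, so an interactive scanner may be better off accepting
a single short read and simply recording its length.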