Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!decvax!ima!johnl
From: johnl@ima.UUCP
Newsgroups: mod.compilers
Subject: Re: advice needed re: parsing C decl syntax
Message-ID: <843@basser.oz>
Date: Tue, 9-Dec-86 19:26:41 EST
Article-I.D.: basser.843
Posted: Tue Dec  9 19:26:41 1986
Date-Received: Tue, 16-Dec-86 03:32:03 EST
References: <282@ima.UUCP>
Reply-To: steve@basser.oz (Stephen Michael Russell)
Organization: Dept of Comp Sci, Uni of Sydney, Australia
Lines: 48
Approved: 
Summary: Recursive descent vs table-driven parsers

In article <282@ima.UUCP> Paul Tarvydas  writes:

>Recently, I've tried to parse the C declaration syntax using recursive-descent
>and have found that it's a fairly difficult task.

It is difficult regardless of the parsing technique used. Using LR
(or LALR), rather than LL, parsing helps slightly. C's declaration syntax
is a mess.

>I'm still reluctant to use a straight LALR table parser for reasons of
>(ahem) efficiency. [...] Lists of thingies also fall out into
>simple code:
>
>       thingie ();
>       while (inputChoice (COMMA)) {
>               thingie ();
>       }

I doubt this is more efficient than a table driven parser. The
cost of the call to "inputChoice" is at least 3 instructions (say
10 bytes or more). A parsing table could encode this in 2 bytes (2
bits for the action (shift/reduce/accept), leaving 14 bits for
the symbol).  The execution time of the table driven parser
should be at least equal to, if not faster than the above code.
The parser's code to check that the current symbol equals the
value in the table would have similar speed to the code in the
"inputChoice" routine, but without the overhead of the procedure
call (register saving/restoring, allocating local vars, etc).

My experience is that table driven parsers (whether LL or LR) are
smaller and faster than recursive descent. They also offer the
significant advantage of allowing automatic syntax error
correction/recovery. Doing this in a procedural recursive-descent
parser requires passing recovery sets to each parsing procedure,
as described in Hartmann's book on the Concurrent Pascal
compiler, or Pemberton's paper in "Software - Practice &
Experience" (around 1981). This slows the parser (even when
parsing a syntactically correct program), and increases the
stack space requirements.

Recursive descent parsers have the advantages of being easy to
write by hand, automatic allocation of temporary variables while
analysing recursively defined structures (such as statements and
expressions), and the ability to pass values down to lower level
parsing routines. YACC's method to do the last two is much less
clean, although a pre-processor such as Prep (van Katwijk,
ACM SIGPLAN Notices, Oct 1983) helps.
-- 
Send compilers mail to ima!compilers or, in a pinch to Levine@YALE.EDU
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | bbncca}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request