Path: utzoo!mnetor!uunet!mcvax!ukc!its63b!csnjr
From: csnjr@its63b.ed.ac.uk (Nick Rothwell)
Newsgroups: comp.lang.misc
Subject: Concrete Syntax in ML (was: CAML Release 2.5)
Message-ID: <1304@its63b.ed.ac.uk>
Date: 10 May 88 15:01:04 GMT
References: <7756@mcdchg.UUCP>
Reply-To: nick%ed.lfcs@uk.ac.ucl.cs.nss
Organization: LFCS, University of Edinburgh
Lines: 51
Keywords: ML, Yacc, Antiquoting

From article <7756@mcdchg.UUCP>, by mauny@inria.UUCP (Michel Mauny):
> CAML, a functional programming language of the ML family developed
> at INRIA, is now available on SUN-3 and VAX11 under Unix BSD 4.2 and 4.3.
> CAML (an acronym for Categorical Abstract Machine Language) is close
> in functionality to Standard ML (although release 2.5 does not have modules
> yet). It differs mostly in its syntax (closer to the syntax of LCF ML), and
> in its ability to manipulate concrete syntaxes through an interface with Yacc.

Actually, we're working on a parser generator interface for Standard ML as
well, complete with antiquotation (concrete syntaxes) and so on. Our system
doesn't rely on Yacc syntax, and doesn't generate a parser in source form
(I don't recall how the CAML parser is generated and loaded). Instead, the
grammar is presented as a data structure (in fact, as an environment with
the various types for terminals, nonterminals and the associated actions),
and the parser generator returns a new environment containing a parsing
function.
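
To make the "grammar in, parsing function out" shape concrete, here is a
minimal sketch in Standard ML (using the modern Basis library). Every name in
it (GRAMMAR, tokenValue, ListGrammar and so on) is hypothetical and is not
taken from the LFCS system. For brevity the functor body uses a naive
ordered-choice descent rather than LR table construction, so it loops on
left-recursive grammars; only the interface is the point.

```sml
(* Hypothetical sketch: all names are illustrative assumptions. *)
signature GRAMMAR =
  sig
    eqtype terminal
    eqtype nonterminal
    type value                              (* result of semantic actions *)
    datatype symbol = T of terminal | N of nonterminal
    val start       : nonterminal
    val tokenValue  : terminal -> value     (* value carried by a token *)
    val productions : (nonterminal * symbol list * (value list -> value)) list
  end

functor ParserGen (structure G : GRAMMAR) :
  sig val parse : G.terminal list -> G.value option end =
  struct
    (* Stand-in for a real LR generator: try each production for a
       nonterminal in order and commit to the first one that matches. *)
    fun parseNT nt input =
          tryAlts (List.filter (fn (lhs, _, _) => lhs = nt) G.productions) input
    and tryAlts [] _ = NONE
      | tryAlts ((_, rhs, action) :: rest) input =
          (case parseSeq rhs input of
               SOME (vals, remaining) => SOME (action vals, remaining)
             | NONE => tryAlts rest input)
    and parseSeq [] input = SOME ([], input)
      | parseSeq (G.T t :: syms) (tok :: toks) =
          if t = tok
          then Option.map (fn (vs, r) => (G.tokenValue tok :: vs, r))
                          (parseSeq syms toks)
          else NONE
      | parseSeq (G.T _ :: _) [] = NONE
      | parseSeq (G.N nt :: syms) input =
          (case parseNT nt input of
               SOME (v, r) =>
                 Option.map (fn (vs, r') => (v :: vs, r')) (parseSeq syms r)
             | NONE => NONE)

    (* Succeed only if the start symbol consumes the whole input. *)
    fun parse input =
          case parseNT G.start input of
              SOME (v, []) => SOME v
            | _ => NONE
  end

(* A toy grammar as plain data: bracketed, comma-separated "x" items.
   The semantic actions count the items. *)
structure ListGrammar =
  struct
    type terminal = string
    datatype nonterminal = Bracketed | Items
    type value = int
    datatype symbol = T of terminal | N of nonterminal
    val start = Bracketed
    fun tokenValue _ = 0
    val productions =
      [ (Bracketed, [T "[", N Items, T "]"], fn [_, n, _] => n | _ => 0),
        (Bracketed, [T "[", T "]"],          fn _ => 0),
        (Items,     [T "x", T ",", N Items], fn [_, _, n] => n + 1 | _ => 0),
        (Items,     [T "x"],                 fn _ => 1) ]
  end

structure P = ParserGen (structure G = ListGrammar)

val result = P.parse ["[", "x", ",", "x", ",", "x", "]"]   (* SOME 3 *)
```

Because the grammar is an ordinary structure, it can be built or rewritten by
other ML code before the functor is applied.
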
   Writing grammars in this way is a little tedious, but the whole idea is
that a simple grammar is written to generate a parser which understands
grammars. Then, full grammars can be bootstrapped into this system by
antiquoting. The strength of the system is that grammars can be quite
flexible, since they are data structures which can be transformed into the
form that an LR parser generator wants. I am (at this very moment) putting the
finishing touches to an interface which lets you specify productions
containing regular expressions. Coupled with the antiquoting mechanism (not
yet implemented), this will let you say things rather like:

	structure Grammar =
	   struct
	      ...
	      val Gram = << StartSymbol -> Expr
	                    Expr        -> "nil"
	                                 | Expr "::" Expr
	                                 | "[" (Expr / ",")* "]" {Sexy, eh?}
	                                 | ...
	                 >>
	   end

	structure Parser = ParserGen(structure G = Grammar)

	fun parse() = Parser.parse(lexer);
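
One way to picture the "grammars are data which can be transformed" step is a
pass that lowers regular-expression right-hand sides to plain productions
before an LR generator sees them. The sketch below is an assumption-laden
illustration in Standard ML, not the actual interface: the rx datatype, the
desugar function and the generated helper names are all hypothetical, and
reading (Expr / ",")* as "zero or more Expr separated by commas" is a guess
at the intended meaning.

```sml
(* Hypothetical sketch: lowering regular-expression productions to BNF. *)
datatype rx
  = Tok of string              (* terminal, e.g. "nil" *)
  | NT of string               (* nonterminal, e.g. Expr *)
  | Seq of rx list             (* juxtaposition *)
  | Alt of rx list             (* a | b | ... *)
  | Star of rx                 (* r* *)
  | SepStar of rx * rx         (* (item / sep)*, assumed meaning *)

datatype sym = T of string | N of string
type rule = string * sym list  (* lhs -> rhs, with no regex operators left *)

(* desugar (lhs, body): plain rules that make lhs derive body. *)
fun desugar (lhs : string, body : rx) : rule list =
  let
    val counter = ref 0
    (* Helper nonterminals are named lhs'1, lhs'2, ... *)
    fun fresh () =
          (counter := !counter + 1; lhs ^ "'" ^ Int.toString (!counter))

    (* symOf r: one symbol deriving r, plus any helper rules it needs. *)
    fun symOf (Tok s) = (T s, [])
      | symOf (NT s)  = (N s, [])
      | symOf r       = let val n = fresh () in (N n, rulesFor (n, r)) end

    (* rulesFor (n, r): rules making nonterminal n derive r. *)
    and rulesFor (n, Seq rs) =
          let val (syms, extra) = ListPair.unzip (map symOf rs)
          in (n, syms) :: List.concat extra end
      | rulesFor (n, Alt rs) =
          List.concat (map (fn r => rulesFor (n, r)) rs)
      | rulesFor (n, Star r) =
          (* n -> <empty> | r n *)
          let val (s, extra) = symOf r
          in (n, []) :: (n, [s, N n]) :: extra end
      | rulesFor (n, SepStar (item, sep)) =
          (* n -> <empty> | m      m -> item | m sep item *)
          let
            val m = fresh ()
            val (i, ei) = symOf item
            val (s, es) = symOf sep
          in
            (n, []) :: (n, [N m]) :: (m, [i]) :: (m, [N m, s, i]) :: ei @ es
          end
      | rulesFor (n, r) =
          let val (s, extra) = symOf r in (n, [s]) :: extra end
  in
    rulesFor (lhs, body)
  end

(* The Expr production from the example, minus the elided alternatives. *)
val exprRules =
  desugar ("Expr",
           Alt [ Tok "nil",
                 Seq [NT "Expr", Tok "::", NT "Expr"],
                 Seq [Tok "[", SepStar (NT "Expr", Tok ","), Tok "]"] ])
```

The separated list is lowered to a left-recursive helper, which is the form
LR generators tend to handle most cheaply.
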

There's a *lot* of polishing still to do, but the basic mechanisms work.

		Nick.
-- 
Nick Rothwell,	Laboratory for Foundations of Computer Science, Edinburgh.
		nick%lfcs.ed.ac.uk@nss.cs.ucl.ac.uk
		!mcvax!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
...while the builders of the cages sleep with bullets, bars and stone,
they do not see your road to freedom that you build with flesh and bone.