SMLYACC(1)                General Commands Manual                SMLYACC(1)
smlyacc - the parser generator for SML#
smlyacc [-s] [-p output_prefix] filename
SMLYacc is a parser generator in the style of ML-Yacc. It accepts ML-Yacc grammar files, but the generated programs and their usage are not compatible with those of ML-Yacc. Generated programs can be compiled by the SML# compiler.
By default, for an input file X.grm, smlyacc generates X.grm.sml for the generated parser, X.grm.sig for the signature of tokens, and optionally X.grm.desc for the description of LALR parser construction. To compile the generated program with SML#, you need to write an interface file X.grm.smi yourself according to the generated signature X.grm.sig.
The following is a minimal example of an input file ex.grm:
%%
%term LPAREN | RPAREN | EOF
%nonterm start of word | exp of word
%pos int
%eop EOF
%name Example
%%
start : exp (exp)
exp : (0w0)
    | LPAREN exp RPAREN exp (exp1 + exp2)
By running smlyacc on this file,
smlyacc ex.grm
you obtain two files, ex.grm.sml and ex.grm.sig. Only ex.grm.sml needs to be compiled. To compile it, you need to create the following ex.grm.smi file yourself:
_require "basis.smi"
_require local "ml-yacc-lib.smi"
_require local "./ex.grm.sig"
structure ExampleLrVals =
struct
  structure Parser =
  struct
    type token (= boxed)
    type stream (= ref)
    type result = word
    type pos = int
    type arg = unit
    val makeStream : {lexer : unit -> token} -> stream
    val consStream : token * stream -> stream
    val getStream : stream -> token * stream
    val sameToken : token * token -> bool
    val parse : {lookahead : int,
                 stream : stream,
                 error : string * pos * pos -> unit,
                 arg : arg}
                -> result * stream
  end
  structure Tokens =
  struct
    type pos = Parser.pos
    type token = Parser.token
    val EOF: pos * pos -> token
    val RPAREN: pos * pos -> token
    val LPAREN: pos * pos -> token
  end
end
The types of the token constructors (EOF, RPAREN, and LPAREN) are copied by hand from the generated signature file ex.grm.sig.
The parse function in the generated program is the parser. To invoke it, an imperative lexer function of type unit -> token is needed. When SMLYacc is combined with SMLLex, SMLLex generates this lexer. Suppose that SMLLex generates a lexer with the following interface:
structure ExampleLex =
struct
  exception LexError
  val makeLexer : (int -> string) -> unit -> ExampleLrVals.Parser.token
end
Typical code joining SMLLex and SMLYacc looks like the following, where instream, errorFn, and parserArg are supplied by the user:
fun inputN n = TextIO.inputN (instream, n)
val lexer = ExampleLex.makeLexer inputN
val stream = ExampleLrVals.Parser.makeStream {lexer = lexer}
val (result, stream) =
    ExampleLrVals.Parser.parse {lookahead = 0,
                                stream = stream,
                                error = errorFn,
                                arg = parserArg}
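The identifiers instream, errorFn, and parserArg used above are not defined by the generated program. The following is a minimal sketch of how they might be defined, assuming the ex.grm example (%pos int, no parser argument). The file name input.txt and the message format are illustrative assumptions, not something smlyacc prescribes.

```sml
(* Hypothetical input source; "input.txt" is a placeholder file name. *)
val instream = TextIO.openIn "input.txt"

(* Error callback of type string * pos * pos -> unit.  With "%pos int"
   both positions are ints; what they denote (for example line numbers
   or character offsets) depends on how the lexer computes them. *)
fun errorFn (msg : string, leftPos : int, rightPos : int) : unit =
    TextIO.output (TextIO.stdErr,
                   Int.toString leftPos ^ "-" ^ Int.toString rightPos
                   ^ ": " ^ msg ^ "\n")

(* In the ex.grm.smi interface above, arg is unit,
   so the only possible argument value is (). *)
val parserArg : unit = ()
```

These definitions would be placed before the code that builds the lexer and calls the parser. Once parsing has finished, the stream can be released with TextIO.closeIn instream.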
SMLYacc is a derivative of ML-Yacc, which was originally developed by David R. Tarditi and Andrew W. Appel. When ML-Yacc was ported to SML#, the source code was restructured to replace functor applications with SML#'s separate compilation and linking. See the SML# document for major changes from the original ML-Yacc.
smllex(1)
ML-Yacc User's Manual, available at https://www.smlnj.org/doc/ML-Yacc/
SML# Document, available at https://www.pllab.riec.tohoku.ac.jp/smlsharp/docs/