RE: Generalized parser without generation

Quinn Tyler Jackson <qjackson@shaw.ca>
8 Feb 2004 21:52:42 -0500

          From comp.compilers

Related articles
RE: Generalized parser without generation qjackson@shaw.ca (Quinn Tyler Jackson) (2004-02-04)
RE: Generalized parser without generation qjackson@shaw.ca (Quinn Tyler Jackson) (2004-02-04)
Re: Generalized parser without generation cfc@shell01.TheWorld.com (Chris F Clark) (2004-02-04)
RE: Generalized parser without generation qjackson@shaw.ca (Quinn Tyler Jackson) (2004-02-08)
RE: Generalized parser without generation qjackson@shaw.ca (Quinn Tyler Jackson) (2004-02-08)
| List of all articles for this month |
From: Quinn Tyler Jackson <qjackson@shaw.ca>
Newsgroups: comp.compilers
Date: 8 Feb 2004 21:52:42 -0500
Organization: Compilers Central
References: 04-02-060
Keywords: parse
Posted-Date: 08 Feb 2004 21:52:42 EST

Chris Clark said:


> There are such things available. For example, the commercial tool
> Visual Parse++ has a graphical debugging utility which does roughly
> what you want. I believe that has been leveraged by Quinn Tyler
> Jackson in his Meta-S parser. (More on that in a moment.)


The graphical debugging front end of Meta-S evolved from PAISLEI (Pattern
Amalgamator Intended to Simplify Lpm's Egregious Interface), which in turn
evolved from my having experienced the joy of Visual Parse++'s grammar
debugging power. The concept of graphically debugging a grammar is what was
leveraged there -- but the particulars in Meta-S evolved over time with user
feedback.


> The one caveat to such tools is that unless you have a fairly powerful
> parsing mechanism, your parser will probably need "hacks" to work on
> "real" languages--stock LL and LR parsing can't handle any of the
> mainline programming languages (e.g. C or Pascal or any of their
> derivatives) out-of-the-box and need hacks. These hacks usually
> require escaping from the world of context-free-languages into the
> underlying support code. As a result, most grammar visualization
> systems can handle only toy languages.


Yes: Whenever ad-hockery must be added in order to parse a language
correctly, the mathematically correct grammar model is broken. That is to
say type 2 parser that must enter reduction code to know if something is
legal or not during the parsing phase is essentially a Type 0 parser -- the
formal methods required to know whether or not such a parser always
terminates enter into the domain of the Halting Problem. On a purely
practical level, jumping from grammar-to-host-language requires two phase
development, and one quickly loses the benefits of having a graphical
grammar debugger.


I tried to address this issue by allow Meta-S grammars to "host" Lua
reduction events, thusly:


grammar Foobar host Lua
{
        S ::= a b c;
        a ::= '[a-z]+';
        b ::= '[0-9]+';
        c ::= '[a-z]+';


        @func_impls = :{
                function b_event(N)
                        if(N:MATCHED()) then
                              print(N:LEXEME());
                        then
                end
        }:
};


For a more complete example of this, here is a tutorial that demonstrates
the Meta-S/Lua interface for a simple expression grammar:


http://members.shaw.ca/qjackson/cs/tutorials/expr_grammar.html


The nice thing about the Meta-S/Lua marriage is that the Lua code can be
single-stepped, break-pointed from *WITHIN* the very same graphical grammar
debugger as the grammar. (Variable values can be viewed as tooltips, as
well.) No need to switch tools to develop a correct grammar.


> The one exception I'm certain of is Meta-S. Its grammar formalism is
> Turing complete, so you can coding any hacks you need directly into
> the grammar. Handling everything inside the grammar was one of his
> design goals as was handling on-the-gly grammars.


Yes. For purists, the $-grammar formalism is possibly TP iff a given
$-grammar contains *both* predicates and classes (implemented as PDA-Ts --
push-down automata with tries). It has been shown in the formal sense that
the set of languages that does not contain both predicates and PDA-Ts (that
is, contains one or the other, but not both) is Type 1. This means that a
$-grammar cannot "accidentally" become TP. This was a fun finding, and I
hope to explore this further.


--
Quinn Tyler Jackson


http://members.shaw.ca/qjackson/


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.