Commit de1ffa74 authored by Igor Dejanovic

Initial import.

syntax: glob
*pyc
*orig
*bak
Arpeggio.egg-info
Arpeggio - Parser interpreter based on PEG grammars
Author: Igor R. Dejanović <igor DOT dejanovic AT gmail DOT com>
Changelog for Arpeggio
Author: Igor R. Dejanovic <igor DOT dejanovic AT gmail DOT com>
Copyright: (c) Igor R. Dejanovic, 2009
Licence: MIT Licence
2009-09-15 - Initial release (v0.1-dev)
Implemented features:
- Basic error reporting.
- Basic support for comment handling (needs refactoring).
- Raw parse tree.
- Support for semantic actions with the ability to transform the parse
tree to a semantic representation - a.k.a. Abstract Semantic Graphs (see examples).
Arpeggio is released under the terms of the MIT License
-------------------------------------------------------
Copyright (c) 2009 Igor R. Dejanović <igor DOT dejanovic AT gmail DOT com>
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
Arpeggio - Packrat parser interpreter
=====================================
Arpeggio is a parser interpreter based on PEG grammars, implemented as a recursive
descent parser with memoization (a.k.a. packrat parser).
Arpeggio is part of a research project whose main goal is building an environment
for DSL development. Its primary application domain is IDEs for DSL development,
but it can be used for all sorts of general-purpose parsing.
Essential planned/implemented features are error reporting and error recovery, as well
as access to the raw parse tree in order to support syntax highlighting and
other nice features of today's IDEs.
For more information on PEG and packrat parsers see:
http://pdos.csail.mit.edu/~baford/packrat/
http://pdos.csail.mit.edu/~baford/packrat/thesis/
http://en.wikipedia.org/wiki/Parsing_expression_grammar
INSTALLATION
------------
Arpeggio is written in the Python programming language and is distributed with setuptools support.
Install it with the following command:
python setup.py install
After installation you should be able to import the arpeggio Python module with:
import arpeggio
There is no documentation at the moment. See the examples (and the short sketch below)
for some ideas of how it can be used.
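
For a first taste, here is a minimal sketch written in the style of the bundled
examples (the rule names below are illustrative and are not part of the distribution):

    from arpeggio import *
    from arpeggio import RegExMatch as _

    # Grammar rules are ordinary python functions returning PEG expressions.
    def number():     return _(r'\d+')
    def expression(): return number, ZeroOrMore(["+", "-"], number)
    def simple():     return expression, EOF

    # The parser model is built once from the python definition...
    parser = ParserPython(simple)
    # ...and can then be reused for many inputs.
    parse_tree = parser.parse("3 + 4 - 5")
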
OVERVIEW
--------
Here is a basic explanation of how arpeggio works and definitions of some terms
used in the arpeggio project.
A language grammar is specified using PEG's textual notation (similar to EBNF) or
Python language constructs (lists, tuples, functions...). This grammar representation,
whether in textual or Python form, is referred to as "the parser model".
The parser is constructed from the parser model.
The parser is a tree of Python objects where each object is an instance of a class
which represents a parsing expression from PEG (e.g. Sequence, OrderedChoice, ZeroOrMore).
This tree is referred to as "the parser model tree".
This design choice requires some upfront work during the initialization phase, so arpeggio
may not be well suited for one-shot parsing where the parser needs to be initialized
every time parsing is performed and speed is of the utmost importance.
Arpeggio is designed to be used in integrated development environments where the parser
is constructed once (usually during IDE start-up) and used many times.
Once constructed, the parser can be used to transform input text into a tree
representation whose structure adheres to the parser model.
This tree representation is called the "parse tree".
After construction of the parse tree it is possible to construct an Abstract Syntax Tree or,
more generally, an Abstract Semantic Graph (ASG) using semantic actions.
The ASG is constructed using a two-pass bottom-up walk of the parse tree.
An ASG generally has a graph structure, but it can be any specialization of it
(a tree or even a single node - see calc.py for an example of an ASG constructed as
a single node/value).
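
As an illustration, here is a small sketch of a semantic action in the spirit of
calc.py (the grammar and action below are examples only, not part of the library):

    from arpeggio import *
    from arpeggio import RegExMatch as _

    def number():  return _(r'\d+')
    def numbers(): return OneOrMore(number), EOF

    class ToFloat(SemanticAction):
        '''First pass: convert the matched terminal to a Python float.'''
        def first_pass(self, parser, node, nodes):
            return float(node.value)

    parser = ParserPython(numbers)
    parser.parse("12 3 4")
    # getASG walks the parse tree bottom-up; semantic actions can be passed
    # as a dictionary keyed by rule name (or attached as rule.sem beforehand).
    asg = parser.getASG({"number": ToFloat()})
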
The Python module arpeggio.peg is a good demonstration of how semantic actions can be used
to build the PEG parser itself. See also the peg_peg.py example where the PEG parser is
bootstrapped using a description given in the PEG language itself.
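
When the grammar is given in textual PEG notation, the parser is created with the
ParserPEG class instead (compare calc_peg.py; the grammar below is illustrative):

    from arpeggio.peg import ParserPEG

    simple_grammar = """
    number <- r'\d+';
    expression <- number (("+" / "-") number)*;
    simple <- expression EndOfFile;
    """
    # The second argument names the root rule of the grammar.
    parser = ParserPEG(simple_grammar, "simple")
    parse_tree = parser.parse("3 + 4 - 5")
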
CONTRIBUTION
------------
If you have ideas, suggestions or code that you wish to contribute to the project,
please use the Google Code issue tracker at http://arpeggio.googlecode.com/
Arpeggio is built with or influenced by the following free technologies.
Python Programming Language (http://www.python.org)
- Arpeggio is implemented 100% in the Python programming language.
PyPEG - a PEG Parser-Interpreter in Python (http://www.fdik.org/pyPEG/)
- PyPEG is a parser interpreter based on PEG grammars, like Arpeggio, but
with a different design and implementation approach and different goals in mind.
The idea of defining Arpeggio parsers using Python language constructs is
taken from the PyPEG project. Arpeggio also supports parser definition using PEG
textual notation.
pyparsing (http://pyparsing.wikispaces.com/)
- pyparsing is, in my opinion, currently the most advanced parser written 100% in Python.
At the moment there is not much similarity between pyparsing and Arpeggio, but
there are some nice features and ideas from pyparsing that I think would be nice to
have implemented in Arpeggio.
Although not directly related to Arpeggio, I also wish to thank the
following free software projects that make the development of Arpeggio (and some other
projects I am working on) easier and more fun:
- Arch Linux (http://www.archlinux.org/) - Linux distro that I'm using on my dev machine.
- Editra (http://www.editra.org/) - Nice programmer's editor written in Python and wxWidgets.
- Mercurial (www.selenic.com/mercurial/) - Distributed version control system written in Python.
... and many more
Arpeggio parser - TODO
----------------------
Author: Igor R. Dejanovic <igor DOT dejanovic AT gmail DOT com>
Copyright: (c) Igor R. Dejanovic, 2009
Licence: MIT Licence
Some stuff that should be done in the near future:
- Documentation.
- Test suite.
- Error recovery.
This is an essential requirement for Arpeggio because IDE usage is the main
motivation that started Arpeggio development.
It would be nice to find all errors, whether syntactic or semantic, in one
parsing session. If an error is found, the parser should report it and then
try to recover and continue parsing.
# -*- coding: utf-8 -*-
#######################################################################
# Name: arpeggio.py
# Purpose: PEG parser interpreter
# Author: Igor R. Dejanović <igor DOT dejanovic AT gmail DOT com>
# Copyright: (c) 2009 Igor R. Dejanović <igor DOT dejanovic AT gmail DOT com>
# License: MIT License
#
# This is an implementation of a packrat parser interpreter based on PEG grammars.
# Parsers are defined using Python language constructs or the PEG language.
#######################################################################
import re
import bisect
DEBUG = False
DEFAULT_WS='\t\n\r '
class ArpeggioError(Exception):
'''Base class for arpeggio errors.'''
def __init__(self, message):
self.message = message
def __str__(self):
return repr(self.message)
class GrammarError(ArpeggioError):
'''
Error raised during the parser building phase, used to indicate an error in the
grammar definition.
'''
class SemanticError(ArpeggioError):
'''
Error raised during the semantic analysis phase, used to indicate a semantic
error.
'''
class NoMatch(Exception):
'''
Exception raised by the Match classes during parsing to indicate that the
match is not successful.
'''
def __init__(self, value, position, parser):
self.value = value
self.position = position # Position in the input stream where the error occurred
self.parser = parser
self._up = True # By default when NoMatch is thrown we will go up the Parse Model Tree.
def _log(msg):
if DEBUG:
print msg
def flatten(_iterable):
'''Flattening of python iterables.'''
result = []
for e in _iterable:
if hasattr(e, "__iter__") and not type(e) is str:
result.extend(flatten(e))
else:
result.append(e)
return result
# ---------------------------------------------------------
# Parser Model (PEG Abstract Semantic Graph) elements
class ParsingExpression(object):
"""
Represents node of the Parser Model.
A root parser expression node will create a non-terminal parse tree node, while a non-root
node will create a list of terminals and non-terminals.
"""
def __init__(self, rule=None, root=False, nodes=None):
'''
@param rule - the name of the parser rule if this node is the root of a parser rule.
@param root - does this parser expression represent the root of a parser rule?
The root parser rule will create a non-terminal node of the
parse tree during parsing.
@param nodes - list of child parser expression nodes.
'''
# Memoization. Every node caches the parsing result for a given input position.
self.result_cache = {} # position -> parse tree
self.nodes = nodes
if nodes is None:
self.nodes = [] # child expressions
self.rule = rule
self.root = root
@property
def desc(self):
return self.name
@property
def name(self):
if self.root:
return "%s(%s)" % (self.__class__.__name__, self.rule)
else:
return self.__class__.__name__
@property
def id(self):
if self.root:
return self.rule
else:
return id(self)
def _parse_intro(self, parser):
_log("Parsing %s" % self.name)
results = []
parser._skip_ws()
self.c_pos = parser.position
def parse(self, parser):
self._parse_intro(parser)
# Memoization.
# If this position has already been parsed by this parser expression then
# reuse the cached result.
if self.result_cache.has_key(self.c_pos):
_log("Result for [%s, %s] founded in result_cache." % (self, self.c_pos))
result, new_pos = self.result_cache[self.c_pos]
parser.position = new_pos
return result
# We are descending down
if parser.nm:
parser.nm._up = False
result = self._parse(parser)
if self.root and result:
result = NonTerminal(self.rule, self.c_pos, result)
# Result caching for use by memoization.
self.result_cache[self.c_pos] = (result, parser.position)
return result
#TODO: _nm_change_rule should be called from every parser expression parse
# method that can potentially be the root parser rule.
def _nm_change_rule(self, nm, parser):
'''
Change the rule of the given NoMatch object to a more generic one if
we did not consume any input and we are moving up the parser model tree.
Used to report the most generic language element expected at the place of
the NoMatch exception.
'''
if self.root and self.c_pos == nm.position and nm._up:
nm.value = self.rule
class Sequence(ParsingExpression):
'''
Will match a sequence of parser expressions in the exact order they are defined.
'''
def __init__(self, elements=None, rule=None, root=False, nodes=None):
'''
@param elements - list used as a staging structure for python-based grammar definitions.
Used in _from_python for building the nodes list of child parser expressions.
'''
super(Sequence, self).__init__(rule, root, nodes)
self.elements = elements
def _parse(self, parser):
results = []
try:
for e in self.nodes:
result = e.parse(parser)
if result:
results.append(result)
except NoMatch, m:
self._nm_change_rule(m, parser)
raise
return results
class OrderedChoice(Sequence):
'''
Will match one of the parser expressions specified. The parser will try to
match the expressions in the order they are defined.
'''
def _parse(self, parser):
result = None
match = False
for e in self.nodes:
try:
result = e.parse(parser)
match = True
except NoMatch, m:
parser.position = self.c_pos # Backtracking
self._nm_change_rule(m, parser)
else:
break
if not match:
parser.position = self.c_pos # Backtracking
raise parser.nm
return result
class Repetition(ParsingExpression):
'''
Base class for all repetition-like parser expressions (?,*,+)
'''
def __init__(self, *elements):
super(Repetition, self).__init__(None)
if len(elements)==1:
elements = elements[0]
self.elements = elements
class Optional(Repetition):
'''
Optional will try to match the parser expression specified but will not fail in
case the match is not successful.
'''
def _parse(self, parser):
result = None
try:
result = self.nodes[0].parse(parser)
except NoMatch:
parser.position = self.c_pos # Backtracking
pass
return result
class ZeroOrMore(Repetition):
'''
ZeroOrMore will try to match parser expression specified zero or more times.
It will never fail.
'''
def _parse(self, parser):
results = []
while True:
try:
self.c_pos = parser.position
results.append(self.nodes[0].parse(parser))
except NoMatch:
parser.position = self.c_pos # Backtracking
break
return results
class OneOrMore(Repetition):
'''
OneOrMore will try to match parser expression specified one or more times.
'''
def _parse(self, parser):
results = []
first = False
while True:
try:
self.c_pos = parser.position
results.append(self.nodes[0].parse(parser))
first = True
except NoMatch:
parser.position = self.c_pos # Backtracking
if not first:
raise
break
return results
class SyntaxPredicate(ParsingExpression):
'''
Base class for all syntax predicates (and, not).
Predicates are parser expressions that will do the match but will not consume
any input.
'''
def __init__(self, *elements):
if len(elements)==1:
elements = elements[0]
self.elements = elements
super(SyntaxPredicate, self).__init__(None)
class And(SyntaxPredicate):
'''
This predicate will succeed if the specified expression matches the current input.
'''
def _parse(self, parser):
for e in self.nodes:
try:
e.parse(parser)
except NoMatch:
parser.position = self.c_pos
raise
parser.position = self.c_pos
class Not(SyntaxPredicate):
'''
This predicate will succeed if the specified expression doesn't match the current input.
'''
def _parse(self, parser):
for e in self.nodes:
try:
e.parse(parser)
except NoMatch:
parser.position = self.c_pos
return
parser.position = self.c_pos
parser._nm_raise(self.name, self.c_pos, parser)
class Match(ParsingExpression):
'''
Base class for all classes that will try to match something from the input.
'''
def __init__(self, rule, root=False):
super(Match,self).__init__(rule, root)
@property
def name(self):
return "%s(%s)" % (self.__class__.__name__, self.to_match)
def parse(self, parser):
self._parse_intro(parser)
if parser._in_parse_comment:
return self._parse(parser)
comments = []
try:
match = self._parse(parser)
except NoMatch, nm:
# If not matched try to match comment
#TODO: Comment handling refactoring. Should think of better way to
# handle comments.
if parser.comments_model:
try:
parser._in_parse_comment = True
while True:
comments.append(parser.comments_model.parse(parser))
parser._skip_ws()
except NoMatch:
# If the comment match was successful, try the terminal match again
if comments:
match = self._parse(parser)
match.comments = NonTerminal('comment', self.c_pos, comments)
else:
parser._nm_raise(nm)
finally:
parser._in_parse_comment = False
else:
parser._nm_raise(nm)
return match
class RegExMatch(Match):
'''
This Match class will perform input matching based on Regular Expressions.
'''
def __init__(self, to_match, rule=None):
'''
@param to_match - regular expression string to match.
'''
super(RegExMatch, self).__init__(rule)
self.to_match = to_match
self.regex = re.compile(to_match)
def _parse(self, parser):
m = self.regex.match(parser.input[parser.position:])
if m:
parser.position += len(m.group())
_log("Match %s at %d" % (m.group(), self.c_pos))
return Terminal(self.rule if self.root else '', self.c_pos, m.group())
else:
_log("NoMatch at %d" % self.c_pos)
parser._nm_raise(self.rule if self.root else self.name, self.c_pos, parser)
class StrMatch(Match):
'''
This Match class will perform input matching by a string comparison.
'''
def __init__(self, to_match, rule=None, root=False):
'''
@param to_match - string to match.
'''
super(StrMatch, self).__init__(rule, root)
self.to_match = to_match
def _parse(self, parser):
if parser.input[parser.position:].startswith(self.to_match):
parser.position += len(self.to_match)
_log("Match %s at %d" % (self.to_match, self.c_pos))
return Terminal(self.rule if self.root else '', self.c_pos, self.to_match)
else:
_log("NoMatch at %d" % self.c_pos)
parser._nm_raise(self.to_match, self.c_pos, parser)
def __str__(self):
return self.to_match
def __eq__(self, other):
return self.to_match == str(other)
# HACK: Kwd class is a bit hackish. Need to find a better way to
# introduce different classes of string tokens.
class Kwd(StrMatch):
'''
Specialization of StrMatch to specify keywords of the language.
'''
def __init__(self, to_match):
super(Kwd, self).__init__(to_match, rule=None)
self.to_match = to_match
self.root = True
self.rule = 'keyword'
class EndOfFile(Match):
'''
Match class that will succeed in case end of input is reached.
'''
def __init__(self, rule=None):
super(EndOfFile, self).__init__(rule)
@property
def name(self):
return "EOF"
def _parse(self, parser):
if len(parser.input) == parser.position:
return Terminal(self.rule if self.root else '', self.c_pos, 'EOF')
else:
_log("EOF not matched.")
parser._nm_raise(self.name, self.c_pos, parser)
def EOF(): return EndOfFile()
# ---------------------------------------------------------
#---------------------------------------------------
# Parse Tree node classes
class ParseTreeNode(object):
'''
Abstract base class representing a node of the Parse Tree.
A node can be terminal (a leaf of the parse tree) or non-terminal.
'''
def __init__(self, type, position, error):
'''
@param type - the name of the rule that created this node, or an empty string in case
this node is created by a non-root parser model node.
@param position - position in the input stream where the match occurred.
@param error - is this a false parse tree node created during error recovery?
'''
self.type = type
self.position = position
self.error = error
self.comments = None
@property
def name(self):
return "%s [%s]" % (self.type, self.position)
class Terminal(ParseTreeNode):
'''
Leaf node of the Parse Tree. Represents matched string.
'''
def __init__(self, type, position, value, error=False):
'''
@param value - matched string or missing token name in case of an error node.
'''
super(Terminal, self).__init__(type, position, error)
self.value = value
@property
def desc(self):
return "%s \'%s\' [%s]" % (self.type, self.value, self.position)
def __str__(self):
return self.value
def __eq__(self, other):
return str(self)==str(other)
class NonTerminal(ParseTreeNode):
'''
Non-leaf node of the Parse Tree. Represents a language syntax construct.
'''
def __init__(self, type, position, nodes, error=False):
'''
@param nodes - child ParseTreeNode objects
'''
super(NonTerminal, self).__init__(type, position, error)
self.nodes = flatten([nodes])
@property
def desc(self):
return self.name
# ----------------------------------------------------
# Semantic Actions
#
class SemanticAction(object):
'''
Semantic actions are executed during semantic analysis. They are in charge
of producing an Abstract Semantic Graph (ASG) out of the parse tree.
Every non-terminal and terminal can have a semantic action defined which will be
triggered during semantic analysis.
Semantic action triggering is separated into two passes. The first_pass method is
required; the second_pass method is optional and, if it exists, will be called after
the first pass. The second pass can be used for forward referencing,
e.g. linking to declarations registered in the first pass.
'''
def first_pass(self, parser, node, nodes):
'''
Called in the first pass of tree walk.
'''
raise NotImplementedError()
# ----------------------------------------------------
# Parsers
class Parser(object):
def __init__(self, skipws=True, ws=DEFAULT_WS):
self.skipws = skipws
self.ws = ws
self.comments_model = None
self.sem_actions = {}
self.parse_tree = None
self._in_parse_comment = False
def parse(self, _input):
self.position = 0 # Input position
self.nm_pos = 0 # Position for last NoMatch exception
self.nm = None # Last NoMatch exception
self.line_ends = []
self.input = _input
self.parse_tree = self._parse()
return self.parse_tree
def getASG(self, sem_actions=None):
'''
Creates Abstract Semantic Graph (ASG) from the parse tree.
@param sem_actions - semantic actions dictionary to use for semantic analysis.
Rule names are the keys and semantic action objects are values.
'''
if not self.parse_tree:
raise Exception("Parse tree is empty. You did call parse(), didn't you?")
if sem_actions is None:
if not self.sem_actions:
raise Exception("Semantic actions not defined.")
else:
sem_actions = self.sem_actions
if type(sem_actions) is not dict:
raise Exception("Semantic actions parameter must be a dictionary.")
for_second_pass = []
def tree_walk(node):
'''
Walks the parse tree calling first_pass for every registered semantic
action and builds a list of objects that need to be processed in the second pass.
'''
nodes = []
if isinstance(node, NonTerminal):
for n in node.nodes:
nodes.append(tree_walk(n))
if sem_actions.has_key(node.type):
retval = sem_actions[node.type].first_pass(self, node, nodes)
if hasattr(sem_actions[node.type], "second_pass"):
for_second_pass.append((node.type,retval))
else:
if isinstance(node, NonTerminal):
retval = NonTerminal(node.type, node.position, nodes)
else:
retval = node
return retval
_log("ASG: First pass")
asg = tree_walk(self.parse_tree)
_log("ASG: Second pass")
# Second pass
for sa_name, asg_node in for_second_pass:
sem_actions[sa_name].second_pass(self, asg_node)
return asg
def pos_to_linecol(self, pos):
'''
Calculate (line, column) tuple for the given position in the stream.
'''
if not self.line_ends:
try:
#TODO: Check this implementation on Windows.
self.line_ends.append(self.input.index("\n"))
while True:
try:
self.line_ends.append(self.input.index("\n", self.line_ends[-1]+1))
except ValueError:
break
except ValueError:
pass
line = bisect.bisect_left(self.line_ends, pos)
col = pos
if line > 0:
col -= self.line_ends[line-1]
if self.input[self.line_ends[line-1]] in '\n\r':
col -= 1
return line+1, col+1
def _skip_ws(self):
'''
Skips whitespace characters.
'''
if self.skipws:
while self.position<len(self.input) and self.input[self.position] in self.ws:
self.position += 1
def _skip_comments(self):
# We do not want to recurse into parsing comments
if self.comments_model and not self._in_parse_comment:
self._in_parse_comment = True
comments = self.comments_model.parse(self)
self._in_parse_comment = False
return comments
def _nm_raise(self, *args):
'''
Register a new NoMatch object if the input is consumed beyond the
last NoMatch position and raise the last NoMatch.
@param args - a NoMatch instance or (value, position, parser)
'''
if not self._in_parse_comment:
if len(args)==1 and isinstance(args[0], NoMatch):
if self.nm is None or args[0].position > self.nm.position:
self.nm = args[0]
else:
value, position, parser = args
if self.nm is None or position > self.nm.position:
self.nm = NoMatch(value, position, parser)
raise self.nm
class ParserPython(Parser):
def __init__(self, language_def, comment_def=None, skipws=True, ws=DEFAULT_WS):
super(ParserPython, self).__init__(skipws, ws)
self._init_caches()
# PEG Abstract Syntax Graph
self.parser_model = self._from_python(language_def)
self.comments_model = self._from_python(comment_def) if comment_def else None
# Comments should be optional and there can be more of them
if self.comments_model: # and not isinstance(self.comments_model, ZeroOrMore):
self.comments_model.root = True
self.comments_model.rule = comment_def.__name__
def _init_caches(self):
self.__rule_stack = [] # Used to keep track of the "path" from the initial parser model node
self.__rule_cache = {} # Used for recursive definitions
self.__rule_cache["EndOfFile"] = EndOfFile()
def _parse(self):
return self.parser_model.parse(self)
def _from_python(self, expression):
"""
Creates a parser model from a definition given in the form of Python functions returning
lists, tuples, callables, strings and ParsingExpression objects.
@returns - Parser Model (PEG Abstract Semantic Graph)
"""
root = False
while callable(expression): # Is this expression a parser rule?
if self.__rule_cache.has_key(expression.__name__):
_log("Rule %s founded in cache." % expression.__name__)
return self.__rule_cache.get(expression.__name__)
rule = expression.__name__
root = True
# Semantic action for the rule
if hasattr(expression, "sem"):
self.sem_actions[rule] = expression.sem
_log("push : %s" % rule)
self.__rule_stack.append(rule)
expression = expression()
rule = self.__rule_stack[-1]
retval = None
if isinstance(expression, Match):
retval = expression
retval.rule = rule
retval.root = root
elif isinstance(expression, Repetition) or isinstance(expression, SyntaxPredicate):
retval = expression
retval.rule = rule
retval.root = root
retval.nodes.append(self._from_python(retval.elements))
elif type(expression) in [list, tuple]:
if type(expression) is list:
retval = OrderedChoice(expression, rule, root)
else:
retval = Sequence(expression, rule, root)
# If this expression is a rule then we cache it
# in order to support recursive definitions
if root:
self.__rule_cache[rule] = retval
_log("New rule: %s -> %s" % (rule, retval.__class__.__name__))
retval.nodes = [self._from_python(e) for e in expression]
elif type(expression) is str:
retval = StrMatch(expression, rule, root)
else:
raise GrammarError("Unrecognized grammar element '%s' in rule %s." % (str(expression), rule))
if root:
self.__rule_cache[rule] = retval
if root:
name = self.__rule_stack.pop()
_log("pop: %s" % name)
return retval
def errors(self):
pass
# -*- coding: utf-8 -*-
#######################################################################
# Name: export.py
# Purpose: Export support for arpeggio
# Author: Igor R. Dejanović <igor DOT dejanovic AT gmail DOT com>
# Copyright: (c) 2009 Igor R. Dejanović <igor DOT dejanovic AT gmail DOT com>
# License: MIT License
#######################################################################
import StringIO
from arpeggio import Terminal
class Export(object):
'''
Base class for all Exporters.
'''
def __init__(self):
super(Export, self).__init__()
# Export initialization
self._render_set = set() # Used in rendering to prevent rendering
# of the same node multiple times
self._adapter_map = {} # Used as a registry of adapters to ensure
# that the same adapter is
# returned for the same adaptee object
def export(self, obj):
'''Export of obj to a string.'''
self._outf = StringIO.StringIO()
self._export(obj)
return self._outf.getvalue()
def exportFile(self, obj, file_name):
'''Export of obj to a file.'''
self._outf = open(file_name, "w")
self._export(obj)
self._outf.close()
def _export(self, obj):
self._outf.write(self._start())
self._render_node(obj)
self._outf.write(self._end())
def _start(self):
'''
Override this to specify the beginning of the graph representation.
'''
return ""
def _end(self):
'''
Override this to specify the end of the graph representation.
'''
return ""
class ExportAdapter(object):
'''
Base adapter class for the export support.
Adapter should be defined for every graph type.
'''
def __init__(self, node, export):
'''
@param node - node to adapt
@param export - export object used as a context of the export.
'''
self.adaptee = node # adaptee is adapted graph node
self.export = export
# -------------------------------------------------------------------------
# Support for DOT language
class DOTExportAdapter(ExportAdapter):
'''
Base adapter class for the DOT export support.
'''
@property
def id(self):
'''Graph node unique identification.'''
raise NotImplementedError()
@property
def desc(self):
'''Graph node textual description.'''
raise NotImplementedError()
@property
def children(self):
'''Children of the graph node.'''
raise NotImplementedError()
class PMDOTExportAdapter(DOTExportAdapter):
'''
Adapter for ParsingExpression graph types (parser model).
'''
@property
def id(self):
return id(self.adaptee)
@property
def desc(self):
return self.adaptee.desc
@property
def children(self):
if not hasattr(self, "_children"):
self._children = []
adapter_map = self.export._adapter_map # Registry of adapters used in this export
for c,n in enumerate(self.adaptee.nodes):
if isinstance(n, PMDOTExportAdapter): # if child node is already adapted use that adapter
self._children.append((str(c+1), n))
elif adapter_map.has_key(id(n)): # current node is adaptee -> there is registered adapter
self._children.append((str(c+1), adapter_map[id(n)]))
else:
adapter = PMDOTExportAdapter(n, self.export)
self._children.append((str(c+1), adapter))
adapter_map[adapter.id] = adapter
return self._children
class PTDOTExportAdapter(PMDOTExportAdapter):
'''
Adapter for ParseTreeNode graph types.
'''
@property
def children(self):
if isinstance(self.adaptee, Terminal):
return []
else:
if not hasattr(self, "_children"):
self._children = []
for c,n in enumerate(self.adaptee.nodes):
adapter = PTDOTExportAdapter(n, self.export)
self._children.append((str(c+1), adapter))
return self._children
class DOTExport(Export):
'''
Export to DOT language (part of GraphViz, see http://www.graphviz.org/)
'''
def _render_node(self, node):
if not node in self._render_set:
self._render_set.add(node)
self._outf.write('\n%s [label="%s"];' % (node.id, self._dot_label_esc(node.desc)))
#TODO Comment handling
# if hasattr(node, "comments") and root.comments:
# retval += self.node(root.comments)
# retval += '\n%s->%s [label="comment"]' % (id(root), id(root.comments))
for name, n in node.children:
self._outf.write('\n%s->%s [label="%s"]' % (node.id, n.id, name))
self._outf.write('\n')
self._render_node(n)
def _start(self):
return "digraph arpeggio_graph {"
def _end(self):
return "\n}"
def _dot_label_esc(self, to_esc):
to_esc = to_esc.replace("\\", "\\\\")
to_esc = to_esc.replace('\"', '\\"')
to_esc = to_esc.replace('\n', '\\n')
return to_esc
class PMDOTExport(DOTExport):
'''
Convenience DOTExport extension that uses PMDOTExportAdapter
'''
def export(self, obj):
return super(PMDOTExport, self).\
export(PMDOTExportAdapter(obj, self))
def exportFile(self, obj, file_name):
return super(PMDOTExport, self).\
exportFile(PMDOTExportAdapter(obj, self), file_name)
class PTDOTExport(DOTExport):
'''
Convenience DOTExport extension that uses PTDOTExportAdapter
'''
def export(self, obj):
return super(PTDOTExport, self).\
export(PTDOTExportAdapter(obj, self))
def exportFile(self, obj, file_name):
return super(PTDOTExport, self).\
exportFile(PTDOTExportAdapter(obj, self), file_name)
# -*- coding: utf-8 -*-
#######################################################################
# Name: peg.py
# Purpose: Implementing PEG language
# Author: Igor R. Dejanovic <igor DOT dejanovic AT gmail DOT com>
# Copyright: (c) 2009 Igor R. Dejanovic <igor DOT dejanovic AT gmail DOT com>
# License: MIT License
#######################################################################
__all__ = ['ParserPEG']
from arpeggio import *
from arpeggio import _log
from arpeggio import RegExMatch as _
# PEG Grammar
def grammar(): return OneOrMore(rule), EOF
def rule(): return identifier, LEFT_ARROW, ordered_choice, ";"
def ordered_choice(): return sequence, ZeroOrMore(SLASH, sequence)
def sequence(): return OneOrMore(prefix)
def prefix(): return Optional([AND,NOT]), sufix
def sufix(): return expression, Optional([QUESTION, STAR, PLUS])
def expression(): return [regex,(identifier, Not(LEFT_ARROW)),
(OPEN, ordered_choice, CLOSE),
literal]
def regex(): return "r", "'", _(r"(\\\'|[^\'])*"),"'"
def identifier(): return _(r"[a-zA-Z_]([a-zA-Z_]|[0-9])*")
#def literal(): return [_(r"\'(\\\'|[^\'])*\'"),_(r'"[^"]*"')]
def literal(): return _(r'(\'(\\\'|[^\'])*\')|("[^"]*")')
def LEFT_ARROW(): return "<-"
def SLASH(): return "/"
def STAR(): return "*"
def QUESTION(): return "?"
def PLUS(): return "+"
def AND(): return "&"
def NOT(): return "!"
def OPEN(): return "("
def CLOSE(): return ")"
def comment(): return "//", _(".*\n")
# ------------------------------------------------------------------
# PEG Semantic Actions
class PEGSemanticAction(SemanticAction):
def second_pass(self, parser, node):
if isinstance(node, Terminal):
return
for i,n in enumerate(node.nodes):
if isinstance(n, Terminal):
if parser.peg_rules.has_key(n.value):
node.nodes[i] = parser.peg_rules[n.value]
else:
raise SemanticError("Rule \"%s\" does not exists." % n)
class SemGrammar(SemanticAction):
def first_pass(self, parser, node, nodes):
return parser.peg_rules[parser.root_rule_name]
class SemRule(PEGSemanticAction):
def first_pass(self, parser, node, nodes):
rule_name = nodes[0].value
if len(nodes)>4:
retval = Sequence(nodes=nodes[2:-1])
else:
retval = nodes[2]
retval.rule = rule_name
retval.root = True
if not hasattr(parser, "peg_rules"):
parser.peg_rules = {} # Used for linking phase
parser.peg_rules["EndOfFile"] = EndOfFile()
parser.peg_rules[rule_name] = retval
return retval
class SemSequence(PEGSemanticAction):
def first_pass(self, parser, node, nodes):
if len(nodes)>1:
return Sequence(nodes=nodes)
else:
return nodes[0]
class SemOrderedChoice(PEGSemanticAction):
def first_pass(self, parser, node, nodes):
if len(nodes)>1:
retval = OrderedChoice(nodes=nodes[::2])
else:
retval = nodes[0]
return retval
class SemPrefix(PEGSemanticAction):
def first_pass(self, parser, node, nodes):
_log("Prefix: %s " % str(nodes))
if len(nodes)==2:
if nodes[0] == NOT():
retval = Not()
else:
retval = And()
if type(nodes[1]) is list:
retval.nodes = nodes[1]
else:
retval.nodes = [nodes[1]]
else:
retval = nodes[0]
return retval
class SemSufix(PEGSemanticAction):
def first_pass(self, parser, node, nodes):
_log("Sufix : %s" % str(nodes))
if len(nodes) == 2:
_log("Sufix : %s" % str(nodes[1]))
if nodes[1] == STAR():
retval = ZeroOrMore(nodes[0])
elif nodes[1] == QUESTION():
retval = Optional(nodes[0])
else:
retval = OneOrMore(nodes[0])
if type(nodes[0]) is list:
retval.nodes = nodes[0]
else:
retval.nodes = [nodes[0]]
else:
retval = nodes[0]
return retval
class SemExpression(PEGSemanticAction):
def first_pass(self, parser, node, nodes):
_log("Expression : %s" % str(nodes))
if len(nodes)==1:
return nodes[0]
else:
return nodes[1]
class SemIdentifier(SemanticAction):
def first_pass(self, parser, node, nodes):
_log("Identifier %s." % node.value)
return node
class SemRegEx(SemanticAction):
def first_pass(self, parser, node, nodes):
_log("RegEx %s." % nodes[2].value)
return RegExMatch(nodes[2].value)
class SemLiteral(SemanticAction):
def first_pass(self, parser, node, nodes):
_log("Literal: %s" % node.value)
match_str = node.value[1:-1]
match_str = match_str.replace("\\'", "'")
match_str = match_str.replace("\\\\", "\\")
return StrMatch(match_str)
class SemTerminal(SemanticAction):
def first_pass(self, parser, node, nodes):
return StrMatch(node.value)
grammar.sem = SemGrammar()
rule.sem = SemRule()
ordered_choice.sem = SemOrderedChoice()
sequence.sem = SemSequence()
prefix.sem = SemPrefix()
sufix.sem = SemSufix()
expression.sem = SemExpression()
regex.sem = SemRegEx()
identifier.sem = SemIdentifier()
literal.sem = SemLiteral()
for sem in [LEFT_ARROW, SLASH, STAR, QUESTION, PLUS, AND, NOT, OPEN, CLOSE]:
sem.sem = SemTerminal()
class ParserPEG(Parser):
def __init__(self, language_def, root_rule_name, comment_rule_name=None, skipws=True, ws=DEFAULT_WS):
super(ParserPEG, self).__init__(skipws, ws)
self.root_rule_name = root_rule_name
# PEG Abstract Syntax Graph
self.parser_model = self._from_peg(language_def)
# Comments should be optional and there can be more of them
if self.comments_model: # and not isinstance(self.comments_model, ZeroOrMore):
self.comments_model.root = True
self.comments_model.rule = comment_rule_name
def _parse(self):
return self.parser_model.parse(self)
def _from_peg(self, language_def):
parser = ParserPython(grammar, comment)
parser.root_rule_name = self.root_rule_name
parse_tree = parser.parse(language_def)
return parser.getASG()
if __name__ == "__main__":
try:
from arpeggio.export import PMDOTExport
parser = ParserPython(grammar, None)
# Export the parser model to a dot file in order to visualise it.
PMDOTExport().exportFile(parser.parser_model, "peg_parser_model.dot")
except NoMatch, e:
print "Expected %s at position %s." % (e.value, str(e.parser.pos_to_linecol(e.position)))
#######################################################################
# Name: calc.py
# Purpose: Simple expression evaluator example
# Author: Igor R. Dejanovic <igor DOT dejanovic AT gmail DOT com>
# Copyright: (c) 2009 Igor R. Dejanovic <igor DOT dejanovic AT gmail DOT com>
# License: MIT License
#
# This example demonstrates grammar definition using python constructs as
# well as using semantic actions to evaluate a simple expression in infix
# notation.
#######################################################################
from arpeggio import *
from arpeggio.export import PMDOTExport, PTDOTExport
from arpeggio import RegExMatch as _
from arpeggio import _log
def number(): return _(r'\d*\.\d*|\d+')
def factor(): return [number, ("(", expression, ")")]
def term(): return factor, ZeroOrMore(["*","/"], factor)
def expression(): return Optional(["+","-"]), term, ZeroOrMore(["+", "-"], term)
def calc(): return expression, EndOfFile
# Semantic actions
class ToFloat(SemanticAction):
'''Converts node value to float.'''
def first_pass(self, parser, node, nodes):
_log("Converting %s." % node.value)
return float(node.value)
class Factor(SemanticAction):
'''Removes parentheses if they exist and returns what was contained inside.'''
def first_pass(self, parser, node, nodes):
_log("Factor %s" % nodes)
if nodes[0] == "(":
return nodes[1]
else:
return nodes[0]
class Term(SemanticAction):
'''
Divides or multiplies factors.
Factor nodes will be already evaluated.
'''
def first_pass(self, parser, node, nodes):
_log("Term %s" % nodes)
term = nodes[0]
for i in range(2, len(nodes), 2):
if nodes[i-1]=="*":
term *= nodes[i]
else:
term /= nodes[i]
_log("Term = %f" % term)
return term
class Expr(SemanticAction):
'''
Adds or subtracts terms.
Term nodes will be already evaluated.
'''
def first_pass(self, parser, node, nodes):
_log("Expression %s" % nodes)
expr = 0
start = 0
# Check for unary + or - operator
if str(nodes[0]) in "+-":
start = 1
for i in range(start, len(nodes), 2):
if i and nodes[i-1]=="-":
expr -= nodes[i]
else:
expr += nodes[i]
_log("Expression = %f" % expr)
return expr
class Calc(SemanticAction):
def first_pass(self, parser, node, nodes):
return nodes[0]
# Connecting rules with semantic actions
number.sem = ToFloat()
factor.sem = Factor()
term.sem = Term()
expression.sem = Expr()
calc.sem = Calc()
if __name__ == "__main__":
try:
import arpeggio
# Setting DEBUG to true will show log messages.
arpeggio.DEBUG = True
# First we will make a parser - an instance of the calc parser model.
# Parser model is given in the form of python constructs therefore we
# are using ParserPython class.
parser = ParserPython(calc)
# Then we export it to a dot file in order to visualise it. This is
# particularly handy for debugging purposes.
# We can make a jpg out of it using dot (part of graphviz) like this
# dot -O -Tjpg calc_parse_tree_model.dot
PMDOTExport().exportFile(parser.parser_model,
"calc_parse_tree_model.dot")
# An expression we want to evaluate
input = "-(4-1)*5+(2+4.67)+5.89/(.2+7)"
# We create a parse tree or abstract syntax tree out of textual input
parse_tree = parser.parse(input)
# Then we export it to a dot file in order to visualise it.
PTDOTExport().exportFile(parse_tree,
"calc_parse_tree.dot")
# getASG will start semantic analysis.
# In this case semantic analysis will evaluate expression and
# returned value will be the result of the input expression.
print "%s = %f" % (input, parser.getASG())
except NoMatch, e:
print "Expected %s at position %s." % (e.value, str(e.parser.pos_to_linecol(e.position)))
#######################################################################
# Name: calc_peg.py
# Purpose: Simple expression evaluator example using PEG language
# Author: Igor R. Dejanovic <igor DOT dejanovic AT gmail DOT com>
# Copyright: (c) 2009 Igor R. Dejanovic <igor DOT dejanovic AT gmail DOT com>
# License: MIT License
#
# This example is functionally equivalent to calc.py. The difference is that
# in this example the grammar is specified using the PEG language instead of Python constructs.
# Semantic actions are used to calculate expression during semantic
# analysis.
# Parser model as well as parse tree exported to dot files should be
# the same as parser model and parse tree generated in calc.py example.
#######################################################################
from arpeggio import *
from arpeggio.peg import ParserPEG
from arpeggio.export import PMDOTExport, PTDOTExport
# Semantic actions
from calc import ToFloat, Factor, Term, Expr, Calc
# Grammar is defined using textual specification based on PEG language.
calc_grammar = """
number <- r'\d*\.\d*|\d+';
factor <- number / "(" expression ")";
term <- factor (( "*" / "/") factor)*;
expression <- ("+" / "-")? term (("+" / "-") term)*;
calc <- expression EndOfFile;
"""
# Rules are mapped to semantic actions
sem_actions = {
"number" : ToFloat(),
"factor" : Factor(),
"term" : Term(),
"expression" : Expr(),
"calc" : Calc()
}
try:
# Turning debugging on
import arpeggio
arpeggio.DEBUG = True
# First we will make a parser - an instance of the calc parser model.
# Parser model is given in the form of PEG notation therefore we
# are using ParserPEG class. Root rule name (parsing expression) is "calc".
parser = ParserPEG(calc_grammar, "calc")
# Then we export it to a dot file.
PMDOTExport().exportFile(parser.parser_model,
"calc_peg_parser_model.dot")
# An expression we want to evaluate
input = "-(4-1)*5+(2+4.67)+5.89/(.2+7)"
# Then parse tree is created out of the input expression.
parse_tree = parser.parse(input)
# We save it to dot file in order to visualise it.
PTDOTExport().exportFile(parse_tree,
"calc_peg_parse_tree.dot")
# getASG will start semantic analysis.
# In this case semantic analysis will evaluate expression and
# the returned value will be the evaluated result of the input expression.
# Semantic actions are supplied to the getASG function.
print "%s = %f" % (input, parser.getASG(sem_actions))
except NoMatch, e:
print "Expected %s at position %s." % (e.value, str(e.parser.pos_to_linecol(e.position)))
##############################################################################
# Name: json.py
# Purpose: Implementation of a simple JSON parser in arpeggio.
# Author: Igor R. Dejanovic <igor DOT dejanovic AT gmail DOT com>
# Copyright: (c) 2009 Igor R. Dejanovic <igor DOT dejanovic AT gmail DOT com>
# License: MIT License
#
# This example is based on jsonParser.py from pyparsing project
# (see http://pyparsing.wikispaces.com/).
##############################################################################
json_bnf = """
object
{ members }
{}
members
string : value
members , string : value
array
[ elements ]
[]
elements
value
elements , value
value
string
number
object
array
true
false
null
"""
from arpeggio import *
from arpeggio.export import PMDOTExport, PTDOTExport
from arpeggio import RegExMatch as _
def TRUE(): return "true"
def FALSE(): return "false"
def NULL(): return "null"
def jsonString(): return '"', _('[^"]*'),'"'
def jsonNumber(): return _('-?\d+((\.\d*)?((e|E)(\+|-)?\d+)?)?')
def jsonValue(): return [jsonString, jsonNumber, jsonObject, jsonArray, TRUE, FALSE, NULL]
def jsonArray(): return "[", Optional(jsonElements), "]"
def jsonElements(): return jsonValue, ZeroOrMore(",", jsonValue)
def memberDef(): return jsonString, ":", jsonValue
def jsonMembers(): return memberDef, ZeroOrMore(",", memberDef)
def jsonObject(): return "{", Optional(jsonMembers), "}"
def jsonFile(): return jsonObject, EOF
if __name__ == "__main__":
testdata = """
{
"glossary": {
"title": "example glossary",
"GlossDiv": {
"title": "S",
"GlossList":
{
"ID": "SGML",
"SortAs": "SGML",
"GlossTerm": "Standard Generalized Markup Language",
"TrueValue": true,
"FalseValue": false,
"Gravity": -9.8,
"LargestPrimeLessThan100": 97,
"AvogadroNumber": 6.02E23,
"EvenPrimesGreaterThan2": null,
"PrimesLessThan10" : [2,3,5,7],
"Acronym": "SGML",
"Abbrev": "ISO 8879:1986",
"GlossDef": "A meta-markup language, used to create markup languages such as DocBook.",
"GlossSeeAlso": ["GML", "XML", "markup"],
"EmptyDict": {},
"EmptyList" : []
}
}
}
}
"""
try:
import arpeggio
arpeggio.DEBUG = True
# Creating parser from parser model.
parser = ParserPython(jsonFile)
# Exporting parser model to dot file in order to visualise it.
PMDOTExport().exportFile(parser.parser_model,
"json_parser_model.dot")
parse_tree = parser.parse(testdata)
PTDOTExport().exportFile(parser.parse_tree,
"json_parse_tree.dot")
except NoMatch, e:
print "Expected %s at position %s." % (e.value, str(e.parser.pos_to_linecol(e.position)))
# -*- coding: utf-8 -*-
##############################################################################
# Name: peg_peg.py
# Purpose: PEG parser definition using PEG itself.
# Author: Igor R. Dejanovic <igor DOT dejanovic AT gmail DOT com>
# Copyright: (c) 2009 Igor R. Dejanovic <igor DOT dejanovic AT gmail DOT com>
# License: MIT License
#
# PEG can be used to describe PEG.
# This example demonstrates building a PEG parser using a PEG-based grammar of the PEG
# grammar definition language.
##############################################################################
from arpeggio import *
from arpeggio.export import PMDOTExport, PTDOTExport
from arpeggio import _log
from arpeggio import RegExMatch as _
from arpeggio.peg import ParserPEG
# Semantic actions
from arpeggio.peg import SemGrammar, SemRule, SemOrderedChoice, SemSequence, SemPrefix, \
SemSufix, SemExpression, SemRegEx, SemIdentifier, SemLiteral, SemTerminal
sem_actions = {
"grammar" : SemGrammar(),
"rule" : SemRule(),
"ordered_choice" : SemOrderedChoice(),
"sequence" : SemSequence(),
"prefix" : SemPrefix(),
"sufix" : SemSufix(),
"expression" : SemExpression(),
"regex" : SemRegEx(),
"identifier" : SemIdentifier(),
"literal" : SemLiteral()
}
for sem in ["LEFT_ARROW", "SLASH", "STAR", "QUESTION", "PLUS", "AND", "NOT", "OPEN", "CLOSE"]:
sem_actions[sem] = SemTerminal()
# PEG defined using PEG itself.
peg_grammar = r"""
grammar <- rule+ EndOfFile;
rule <- identifier LEFT_ARROW ordered_choice ';';
ordered_choice <- sequence (SLASH sequence)*;
sequence <- prefix+;
prefix <- (AND/NOT)? sufix;
sufix <- expression (QUESTION/STAR/PLUS)?;
expression <- regex / (identifier !LEFT_ARROW)
/ ("(" ordered_choice ")") / literal;
identifier <- r'[a-zA-Z_]([a-zA-Z_]|[0-9])*';
regex <- 'r' '\'' r'(\\\'|[^\'])*' '\'';
literal <- r'\'(\\\'|[^\'])*\'|"[^"]*"';
LEFT_ARROW <- '<-';
SLASH <- '/';
AND <- '&';
NOT <- '!';
QUESTION <- '?';
STAR <- '*';
PLUS <- '+';
OPEN <- '(';
CLOSE <- ')';
DOT <- '.';
comment <- '//' r'.*\n';
"""
try:
import arpeggio
arpeggio.DEBUG = True
# ParserPEG will use ParserPython to parse peg_grammar definition and
# create parser_model for parsing PEG based grammars
parser = ParserPEG(peg_grammar, 'grammar')
# Exporting parser model to dot file in order to visualise.
PMDOTExport().exportFile(parser.parser_model,
"peg_peg_parser_model.dot")
# Now we will use the created parser to parse the same peg_grammar used for parser
# initialization. We can parse peg_grammar because it is specified using
# PEG itself.
parser.parse(peg_grammar)
PTDOTExport().exportFile(parser.parse_tree,
"peg_peg_parse_tree.dot")
# ASG should be the same as parser.parser_model because semantic
# actions will create PEG parser (tree of ParsingExpressions).
asg = parser.getASG(sem_actions)
# This graph should be the same as peg_peg_parser_model.dot because
# they define the same parser.
PMDOTExport().exportFile(asg,
"peg_peg_asg.dot")
# If we replace parser_model with the ASG-constructed parser it will still
# parse PEG grammars.
parser.parser_model = asg
parser.parse(peg_grammar)
except NoMatch, e:
print "Expected %s at position %s." % (e.value, str(e.parser.pos_to_linecol(e.position)))
#######################################################################
# Name: simple.py
# Purpose: Simple language based on example from pyPEG
# Author: Igor R. Dejanovic <igor DOT dejanovic AT gmail DOT com>
# Copyright: (c) 2009 Igor R. Dejanovic <igor DOT dejanovic AT gmail DOT com>
# License: MIT License
#
# This example demonstrates grammar definition using python constructs.
# It is taken and adapted from pyPEG project (see http://www.fdik.org/pyPEG/).
#######################################################################
from arpeggio import *
from arpeggio.export import PMDOTExport, PTDOTExport
from arpeggio import RegExMatch as _
def comment(): return [_("//.*"), _("/\*.*\*/")]
def literal(): return _(r'\d*\.\d*|\d+|".*?"')
def symbol(): return _(r"\w+")
def operator(): return _(r"\+|\-|\*|\/|\=\=")
def operation(): return symbol, operator, [literal, functioncall]
def expression(): return [literal, operation, functioncall]
def expressionlist(): return expression, ZeroOrMore(",", expression)
def returnstatement(): return Kwd("return"), expression
def ifstatement(): return Kwd("if"), "(", expression, ")", block, Kwd("else"), block
def statement(): return [ifstatement, returnstatement], ";"
def block(): return "{", OneOrMore(statement), "}"
def parameterlist(): return "(", symbol, ZeroOrMore(",", symbol), ")"
def functioncall(): return symbol, "(", expressionlist, ")"
def function(): return Kwd("function"), symbol, parameterlist, block
def simpleLanguage(): return function
try:
import arpeggio
arpeggio.DEBUG = True
# Parser instantiation. simpleLanguage is the root rule and comment is the
# grammar rule for comments.
parser = ParserPython(simpleLanguage, comment)
# We save parser model to dot file in order to visualise it.
# We can make a jpg out of it using dot (part of graphviz) like this
# dot -Tjpg -O simple_parser.dot
PMDOTExport().exportFile(parser.parser_model,
"simple_parser_model.dot")
# Parser model for comments is handled as separate model
PMDOTExport().exportFile(parser.comments_model,
"simple_parser_comments.dot")
input = """
function fak(n) {
if (n==0) {
// For 0! result is 0
return 0;
} else { /* And for n>0 result is calculated recursively */
return n * fak(n - 1);
};
}
"""
parse_tree = parser.parse(input)
PTDOTExport().exportFile(parse_tree,
"simple_parse_tree.dot")
except NoMatch, e:
print "Expected %s at position %s." % (e.value, str(e.parser.pos_to_linecol(e.position)))
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#######################################################################
# Name: arpeggio.py
# Purpose: PEG parser interpreter
# Author: Igor R. Dejanović <igor DOT dejanovic AT gmail DOT com>
# Copyright: (c) 2009 Igor R. Dejanović <igor DOT dejanovic AT gmail DOT com>
# License: MIT License
#
# Arpeggio is an implementation of a packrat parser interpreter based on PEG grammars.
# Parsers are defined using Python language constructs or the PEG language.
#######################################################################
__author__ = "Igor R. Dejanović <igor DOT dejanovic AT gmail DOT com>"
__version__ = "0.1-dev"
from setuptools import setup
NAME = 'Arpeggio'
VERSION = __version__
DESC = 'Packrat parser interpreter'
AUTHOR = 'Igor R. Dejanovic'
AUTHOR_EMAIL = 'igor DOT dejanovic AT gmail DOT com'
LICENCE = 'MIT'
URL = 'http://arpeggio.googlecode.com/'
setup(
name = NAME,
version = VERSION,
description = DESC,
author = AUTHOR,
author_email = AUTHOR_EMAIL,
maintainer = AUTHOR,
maintainer_email = AUTHOR_EMAIL,
license = LICENCE,
url = URL,
packages = ["arpeggio"],
keywords = "parser pacrat peg",
classifiers=[
'Development Status :: 3 - Alpha',
'Intended Audience :: Developers',
'Intended Audience :: Information Technology',
'Intended Audience :: Science/Research',
'Topic :: Software Development :: Interpreters',
'Topic :: Software Development :: Compilers',
'Topic :: Software Development :: Libraries :: Python Modules',
'License :: OSI Approved :: MIT License',
'Operating System :: OS Independent',
'Programming Language :: Python',
]
)