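from coursepy.lang.parsing import make_parser, make_cfg

# Ambiguous grammar for arithmetic expressions over the digits 1-4.
# LP and RP are separate nonterminals for the parenthesis tokens.
grammar_text = """
E -> E O E
E -> LP E RP
E -> '1'|'2'|'3'|'4'
O -> '+'|'*'
LP -> '('
RP -> ')'
"""

parser = make_parser(make_cfg(grammar_text))

# The expressions are written with space-separated tokens;
# print every parse tree found for each one.
for e in "3 + 4,3 + 4 * 2,( 3 + 4 ) * 2".split(","):
    print(f"Parse(s) for `{e}`:")
    for t in parser(e):
        t.pretty_print()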
Parse(s) for `3 + 4`:
     E     
  ___|___   
 E   O   E 
 |   |   |  
 3   +   4 

Parse(s) for `3 + 4 * 2`:
         E         
      ___|_______   
     E       |   | 
  ___|___    |   |  
 E   O   E   O   E 
 |   |   |   |   |  
 3   +   4   *   2 

         E         
  _______|___       
 |   |       E     
 |   |    ___|___   
 E   O   E   O   E 
 |   |   |   |   |  
 3   +   4   *   2 

Parse(s) for `( 3 + 4 ) * 2`:
             E             
          ___|___________   
         E           |   | 
  _______|_______    |   |  
 |       E       |   |   | 
 |    ___|___    |   |   |  
 LP  E   O   E   RP  O   E 
 |   |   |   |   |   |   |  
 (   3   +   4   )   *   2

Note that this grammar also accepts (3) as well-formed, which is fine. If you want to rule that out, replace the parenthesis rule with one that introduces parentheses only around E O E, for example E -> LP E O E RP.
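The restriction can be checked without any parsing library. The following is a minimal sketch, not the course's parser: it counts parse trees for a hand-written copy of the grammar in which the parenthesis rule is assumed to be E -> LP E O E RP. The dictionary encoding and the name count_parses are made up for this illustration, and the counter only works for grammars without empty or nonterminal-only unit productions.

```python
from functools import lru_cache

# Hypothetical restricted grammar: parentheses may only wrap `E O E`.
# Any symbol that is not a key of GRAMMAR is treated as a terminal token.
GRAMMAR = {
    "E": [("E", "O", "E"), ("LP", "E", "O", "E", "RP"),
          ("1",), ("2",), ("3",), ("4",)],
    "O": [("+",), ("*",)],
    "LP": [("(",)],
    "RP": [(")",)],
}


def count_parses(text, start="E"):
    """Count the parse trees of a space-separated expression."""
    tokens = tuple(text.split())

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # Number of ways `symbol` derives tokens[i:j].
        if symbol not in GRAMMAR:  # terminal: must match exactly one token
            return int(j == i + 1 and tokens[i] == symbol)
        return sum(match(rhs, i, j) for rhs in GRAMMAR[symbol])

    @lru_cache(maxsize=None)
    def match(rhs, i, j):
        # Number of ways the symbol sequence `rhs` derives tokens[i:j].
        if not rhs:
            return int(i == j)
        head, rest = rhs[0], rhs[1:]
        # Each symbol covers at least one token, so leave len(rest) tokens.
        return sum(derive(head, i, k) * match(rest, k, j)
                   for k in range(i + 1, j - len(rest) + 1))

    return derive(start, 0, len(tokens))


for e in ["( 3 )", "3 + 4 * 2", "( 3 + 4 ) * 2"]:
    print(f"{e!r}: {count_parses(e)} parse(s)")
```

Under this assumed rule, ( 3 ) gets no parse, 3 + 4 * 2 keeps its two parses, and ( 3 + 4 ) * 2 keeps its single parse. One side effect of the restriction is that doubled parentheses such as ( ( 3 + 4 ) ) are rejected as well.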

Also, the parenthesis rule could have mentioned '(' and ')' directly, as in E -> '(' E ')', but then the pretty-printed trees do not render correctly, presumably because of a bug in NLTK.

