Solution: Grammar for arithmetic
from coursepy.lang.parsing import make_parser, make_cfg
grammar_text="""
E -> E O E
E -> LP E RP
E -> '1'|'2'|'3'|'4'
O -> '+'|'*'
LP -> '('
RP -> ')'
"""
parser = make_parser(make_cfg(grammar_text))
for e in "3 + 4,3 + 4 * 2,( 3 + 4 ) * 2".split(","):
    print(f"Parse(s) for `{e}`:")
    for t in parser(e):
        t.pretty_print()
Parse(s) for `3 + 4`:
     E
  ___|___
 E   O   E
 |   |   |
 3   +   4
Parse(s) for `3 + 4 * 2`:
         E
      ___|_______
     E       |   |
  ___|___    |   |
 E   O   E   O   E
 |   |   |   |   |
 3   +   4   *   2

         E
  _______|___
 |   |       E
 |   |    ___|___
 E   O   E   O   E
 |   |   |   |   |
 3   +   4   *   2
Parse(s) for `( 3 + 4 ) * 2`:
             E
          ___|___________
         E           |   |
  _______|_______    |   |
 |       E       |   |   |
 |    ___|___    |   |   |
 LP  E   O   E   RP  O   E
 |   |   |   |   |   |   |
 (   3   +   4   )   *   2
Note that this grammar also accepts ( 3 ) as well-formed, which is fine. If you want to avoid that, replace E -> LP E RP with a rule that introduces parentheses only around E O E, that is, E -> LP E O E RP.
Also, the parenthesis rule could have mentioned '(' and ')' directly, as in E -> '(' E ')', but then the pretty print does not render correctly, presumably due to a bug in nltk.
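The stricter variant described above, where parentheses may only wrap E O E, can be checked with a small sketch. This is not the course's make_parser: it is a stand-in parse counter written in plain Python so that it runs without any library. The names LOOSE, STRICT, and count_parses are hypothetical, and the counter assumes a grammar with no empty productions and no cycles of single-nonterminal rules.

```python
from functools import lru_cache

# Hypothetical grammars as plain dicts: nonterminal -> list of right-hand sides.
# Any symbol that is not a key of the dict is treated as a terminal token.
COMMON = {
    "O": [["+"], ["*"]],
    "LP": [["("]],
    "RP": [[")"]],
}
DIGITS = [["1"], ["2"], ["3"], ["4"]]

# Grammar from the solution: parentheses may wrap any E.
LOOSE = {"E": [["E", "O", "E"], ["LP", "E", "RP"]] + DIGITS, **COMMON}
# Stricter variant: parentheses may only wrap E O E.
STRICT = {"E": [["E", "O", "E"], ["LP", "E", "O", "E", "RP"]] + DIGITS, **COMMON}


def count_parses(grammar, text, start="E"):
    """Count the parse trees of space-separated `text` under `grammar`.

    Assumes no empty productions and no unit-rule cycles, so every
    recursive call works on a strictly smaller span or a terminal.
    """
    tokens = tuple(text.split())

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # Number of ways `symbol` derives tokens[i:j].
        if symbol not in grammar:  # terminal: must match exactly one token
            return int(j - i == 1 and tokens[i] == symbol)
        return sum(seq(tuple(rhs), i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def seq(symbols, i, j):
        # Number of ways the sequence `symbols` derives tokens[i:j].
        if not symbols:
            return int(i == j)
        head, rest = symbols[0], symbols[1:]
        # Each symbol covers at least one token, so leave room for `rest`.
        return sum(
            derive(head, i, k) * seq(rest, k, j)
            for k in range(i + 1, j - len(rest) + 1)
        )

    return derive(start, 0, len(tokens))


for e in ["( 3 )", "3 + 4 * 2", "( 3 + 4 ) * 2"]:
    print(f"{e!r}: loose={count_parses(LOOSE, e)}, strict={count_parses(STRICT, e)}")
```

Under these assumptions, ( 3 ) has one parse with the original rule and none with the stricter rule, while 3 + 4 * 2 keeps its two parses and ( 3 + 4 ) * 2 keeps its single parse under both grammars.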