In the following suite of exercises, you are asked to use the Python module nltk to build and use phrase structure grammars. First install the nltk module if you haven’t already. You can do this using pip:

pip install nltk

Or, if you use conda:

conda install -c conda-forge nltk

Import the packages you need:

import nltk
from nltk.tokenize import word_tokenize
from nltk import CFG
from nltk.parse import ChartParser
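One setup detail that is easy to miss: word_tokenize relies on tokenizer models that are distributed separately from the nltk package, and it raises a LookupError when they are missing. The sketch below fetches them only on demand. The helper name ensure_tokenizer_data is hypothetical, and the resource names are an assumption about your NLTK version (recent releases look for punkt_tab, older ones for punkt).

```python
import nltk
from nltk.tokenize import word_tokenize


def ensure_tokenizer_data():
    """Download the tokenizer models only if word_tokenize cannot find them."""
    try:
        # Succeeds immediately when the models are already installed.
        word_tokenize("every student walks")
    except LookupError:
        # Assumption: which of these names applies depends on the NLTK version.
        for resource in ("punkt_tab", "punkt"):
            nltk.download(resource, quiet=True)
```

Calling ensure_tokenizer_data() once before the first use of word_tokenize should be enough; the first call needs network access if the models are not installed yet.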

Here is an example grammar in nltk:

grammar = CFG.fromstring("""
S  -> NP VP
NP -> Det N
VP -> V
VP -> V NP

Det -> 'every' | 'a' | 'the' | 'some'
N   -> 'student' | 'professor' | 'dog'
V   -> 'walks' | 'saw' | 'talks'
""")

You build a parser based on the grammar you defined:

parser = ChartParser(grammar)

Here is a helper function that maps a sentence (str) to a list of resulting parse trees:

def parse_sentence(parser, sentence):
    """Map a sentence to the list of its parse trees.

    In:
        parser: nltk.parse.chart.ChartParser object
        sentence: str
    Out:
        trees: list of nltk Tree objects
    """
    # Tokenize the sentence, parse the tokens, and collect the trees in a list.
    return list(parser.parse(word_tokenize(sentence)))
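It may be worth knowing how parsing behaves on input the grammar cannot handle, since the exercises ask you to reject some sentences. The sketch below illustrates the two cases with the example grammar. As a simplifying assumption, it splits on whitespace instead of calling word_tokenize, so it runs without the tokenizer data; the names no_parse and uncovered are only illustrative.

```python
from nltk import CFG
from nltk.parse import ChartParser

grammar = CFG.fromstring("""
S  -> NP VP
NP -> Det N
VP -> V
VP -> V NP

Det -> 'every' | 'a' | 'the' | 'some'
N   -> 'student' | 'professor' | 'dog'
V   -> 'walks' | 'saw' | 'talks'
""")
parser = ChartParser(grammar)

# Case 1: every word is known, but no S covers the whole input.
# The result is simply an empty list.
no_parse = list(parser.parse("every student".split()))
print(no_parse)  # []

# Case 2: 'cat' is not a terminal of the grammar.
# nltk raises a ValueError instead of returning an empty list.
try:
    list(parser.parse("every cat walks".split()))
    uncovered = None
except ValueError as error:
    uncovered = str(error)
print(uncovered)
```

Note that the terminals are matched literally, so a capitalized "Every" counts as an unknown word unless you lowercase the input or add it to the grammar.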

You can use the parse_sentence function to get the parse trees for a given sentence. For convenience, define a shorthand that always uses the parser built above:

parse = lambda sentence: parse_sentence(parser, sentence)

Now you can parse sentences like this:

trees = parse("every student walks")

This gives you a list of parse trees for the sentence “every student walks”. To print the first tree to the console and see its structure, use:

trees[0].pretty_print()

Alternatively, use trees[0].draw() to visualize the tree in a separate window (this needs a graphical display).
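Besides printing and drawing, a parse tree can be inspected programmatically, which is handy when comparing the readings of an ambiguous sentence. The sketch below uses the example grammar and, as a simplifying assumption, splits the sentence on whitespace rather than calling word_tokenize. The name np_leaves is only illustrative.

```python
from nltk import CFG
from nltk.parse import ChartParser

grammar = CFG.fromstring("""
S  -> NP VP
NP -> Det N
VP -> V
VP -> V NP

Det -> 'every' | 'a' | 'the' | 'some'
N   -> 'student' | 'professor' | 'dog'
V   -> 'walks' | 'saw' | 'talks'
""")
parser = ChartParser(grammar)

tree = list(parser.parse("every student walks".split()))[0]

print(tree.label())   # S
print(tree.leaves())  # ['every', 'student', 'walks']
print(tree)           # (S (NP (Det every) (N student)) (VP (V walks)))

# Collect the words under every NP node in the tree.
np_leaves = [t.leaves() for t in tree.subtrees(lambda t: t.label() == "NP")]
print(np_leaves)      # [['every', 'student']]
```

Printing the bracketed form of each tree in a list of parses makes it easy to see where two readings attach a phrase differently.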

You are given the following grammar:

S  -> NP VP
NP -> Det N
VP -> V
VP -> V NP

Det -> 'every' | 'a' | 'the' | 'some'
N   -> 'student' | 'professor' | 'dog'
V   -> 'walks' | 'saw' | 'talks'

  1. Extend your grammar so that you can parse sentences like “Every student saw a dog with a telescope”, with its two readings (i.e., the one in which the dog has a telescope, and the one in which the student has a telescope).
  2. Your grammar generates ungrammatical sentences like *“Every student saw” (English requires an explicit object in this case). Modify your grammar so that it no longer generates such sentences. Be careful, though: both “Every student walks” and “Every student walks a dog” are grammatical, and your grammar should still accept both.
  3. Extend your grammar so that it accepts sentences like “Every student talks to a professor” and rejects *“Every student talks a professor”.

Feel free to introduce new non-terminal symbols to your grammar.
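To check your progress on the three exercises, one option is a small acceptance test that compares what a grammar accepts with what it should accept. The sketch below is one possible way to set this up and contains no solution: it runs against the starting grammar and lists the sentences that are still handled wrongly. The names STARTING_GRAMMAR, EXPECTED, count_parses and mismatches are hypothetical, and lowercasing plus whitespace splitting is a simplifying assumption that stands in for word_tokenize.

```python
from nltk import CFG
from nltk.parse import ChartParser

STARTING_GRAMMAR = CFG.fromstring("""
S  -> NP VP
NP -> Det N
VP -> V
VP -> V NP

Det -> 'every' | 'a' | 'the' | 'some'
N   -> 'student' | 'professor' | 'dog'
V   -> 'walks' | 'saw' | 'talks'
""")

# Should the finished grammar accept the sentence?
EXPECTED = {
    "Every student saw a dog with a telescope": True,
    "Every student walks": True,
    "Every student walks a dog": True,
    "Every student saw": False,
    "Every student talks to a professor": True,
    "Every student talks a professor": False,
}


def count_parses(grammar, sentence):
    """Return the number of parse trees the grammar assigns to the sentence."""
    # The terminals are lowercase, so lowercase the input before parsing.
    tokens = sentence.lower().split()
    try:
        return len(list(ChartParser(grammar).parse(tokens)))
    except ValueError:
        # A word that is not a terminal of the grammar means no parse.
        return 0


def mismatches(grammar, expected):
    """List the sentences whose accept/reject status differs from expected."""
    return [
        sentence
        for sentence, should_accept in expected.items()
        if (count_parses(grammar, sentence) > 0) != should_accept
    ]


for sentence in mismatches(STARTING_GRAMMAR, EXPECTED):
    print("still wrong:", sentence)
```

With the starting grammar, four of the six sentences are reported. Replace STARTING_GRAMMAR with your own grammar and rerun until the list is empty; for exercise 1 you would additionally expect count_parses to report both readings of the telescope sentence.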