    3.4 Unbounded Dependency Constructions

    Consider the following grammar:

    >>> nltk.data.show_cfg('grammars/book_grammars/feat1.fcfg')
    % start S
    # ###################
    # Grammar Productions
    # ###################
    S[-INV] -> NP VP
    S[-INV]/?x -> NP VP/?x
    S[-INV] -> NP S/NP
    S[-INV] -> Adv[+NEG] S[+INV]
    S[+INV] -> V[+AUX] NP VP
    S[+INV]/?x -> V[+AUX] NP VP/?x
    SBar -> Comp S[-INV]
    SBar/?x -> Comp S[-INV]/?x
    VP -> V[SUBCAT=intrans, -AUX]
    VP -> V[SUBCAT=trans, -AUX] NP
    VP/?x -> V[SUBCAT=trans, -AUX] NP/?x
    VP -> V[SUBCAT=clause, -AUX] SBar
    VP/?x -> V[SUBCAT=clause, -AUX] SBar/?x
    VP -> V[+AUX] VP
    VP/?x -> V[+AUX] VP/?x
    # ###################
    # Lexical Productions
    # ###################
    V[SUBCAT=intrans, -AUX] -> 'walk' | 'sing'
    V[SUBCAT=trans, -AUX] -> 'see' | 'like'
    V[SUBCAT=clause, -AUX] -> 'say' | 'claim'
    V[+AUX] -> 'do' | 'can'
    NP[-WH] -> 'you' | 'cats'
    NP[+WH] -> 'who'
    Adv[+NEG] -> 'rarely' | 'never'
    NP/NP ->
    Comp -> 'that'
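    As an aside, the SUBCAT values in the lexical productions above are what keep each verb together with the right complements. The same mechanism can be seen in isolation with a tiny self-contained fragment (a toy grammar of our own, not feat1.fcfg itself):

```python
import nltk
from nltk.parse import FeatureChartParser

# Toy fragment illustrating SUBCAT-driven complement selection
# (our own mini grammar, much smaller than the book's feat1.fcfg).
toy = nltk.grammar.FeatureGrammar.fromstring("""
S -> NP VP
VP -> V[SUBCAT=intrans]
VP -> V[SUBCAT=trans] NP
V[SUBCAT=intrans] -> 'sing'
V[SUBCAT=trans] -> 'like'
NP -> 'you' | 'cats'
""")
parser = FeatureChartParser(toy)

print(len(list(parser.parse('you sing'.split()))))       # 1 parse
print(len(list(parser.parse('you sing cats'.split()))))  # 0: 'sing' takes no object
```

    Because V[SUBCAT=trans] fails to unify with the lexical entry V[SUBCAT=intrans] for 'sing', the production VP -> V[SUBCAT=trans] NP can never apply to it, and 'you sing cats' is correctly rejected.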

    The grammar in 3.1 contains a single "gap-introduction" production, namely S[-INV] -> NP S/NP. In order to percolate the slash feature correctly, we need to add slashes with variable values to both sides of the arrow in the productions that expand S, VP and NP. For example, VP/?x -> V SBar/?x is the slashed version of VP -> V SBar; it says that a slash value can be specified on the VP parent of a constituent as long as the same value is also specified on the SBar child. Finally, NP/NP -> allows the slash information on NP to be discharged as the empty string. Using the grammar in 3.1, we can parse the sequence who do you claim that you like:
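    The notation VP/?x is shorthand for a VP whose slash feature is the variable ?x; when a production is applied, unification binds the variable, so parent and child end up sharing a single slash value. The same binding can be seen with plain feature structures (the CAT and SLASH names below are ours, chosen just for illustration):

```python
import nltk

# SLASH=?x is a variable; unifying against a structure with a concrete
# SLASH value binds ?x, so the result carries that value.
mother = nltk.FeatStruct("[CAT='VP', SLASH=?x]")
daughter = nltk.FeatStruct("[CAT='VP', SLASH=[CAT='NP']]")
print(mother.unify(daughter))  # SLASH is now [CAT='NP']
```

    In the parser, this binding is what passes the category of the gap down from the filler site in S[-INV] -> NP S/NP to the empty NP/NP at the extraction site.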

    >>> tokens = 'who do you claim that you like'.split()
    >>> from nltk import load_parser
    >>> cp = load_parser('grammars/book_grammars/feat1.fcfg')
    >>> for tree in cp.parse(tokens):
    ...     print(tree)
    (S[-INV]
      (NP[+WH] who)
      (S[+INV]/NP[]
        (V[+AUX] do)
        (NP[-WH] you)
        (VP[]/NP[]
          (V[-AUX, SUBCAT='clause'] claim)
          (SBar[]/NP[]
            (Comp[] that)
            (S[-INV]/NP[]
              (NP[-WH] you)
              (VP[]/NP[] (V[-AUX, SUBCAT='trans'] like) (NP[]/NP[] )))))))

    A more readable version of this tree is shown in (52). The grammar in 3.1 will also allow us to parse sentences that have no gaps:

    >>> tokens = 'you claim that you like cats'.split()
    >>> for tree in cp.parse(tokens):
    ...     print(tree)
    (S[-INV]
      (NP[-WH] you)
      (VP[]
        (V[-AUX, SUBCAT='clause'] claim)
        (SBar[]
          (Comp[] that)
          (S[-INV]
            (NP[-WH] you)
            (VP[] (V[-AUX, SUBCAT='trans'] like) (NP[-WH] cats))))))

    In addition, it admits inverted sentences that do not involve wh constructions:

    >>> tokens = 'rarely do you sing'.split()
    >>> for tree in cp.parse(tokens):
    ...     print(tree)
    (S[-INV]
      (Adv[+NEG] rarely)
      (S[+INV]
        (V[+AUX] do)
        (NP[-WH] you)
        (VP[] (V[-AUX, SUBCAT='intrans'] sing))))
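    The whole filler-gap pipeline can also be exercised with a stripped-down slash grammar built inline, which makes it easy to experiment without the book's grammar file. This is a sketch using our own simplified categories (no INV, AUX or SUBCAT features), not feat1.fcfg itself:

```python
import nltk
from nltk.parse import FeatureChartParser

# A toy slash grammar: NP/NP -> introduces the gap, VP/?x and S/?x
# percolate it upward, and S -> NP[+WH] S/NP discharges it against a
# wh filler.  (Our own fragment, much smaller than feat1.fcfg.)
grammar = nltk.grammar.FeatureGrammar.fromstring("""
S -> NP[+WH] S/NP
S -> NP[-WH] VP
S/?x -> NP[-WH] VP/?x
VP -> V NP[-WH]
VP/?x -> V NP/?x
V -> 'like'
NP[+WH] -> 'who'
NP[-WH] -> 'you' | 'cats'
NP/NP ->
""")
parser = FeatureChartParser(grammar)

# 'who you like': the filler 'who' licenses the object gap.
for tree in parser.parse('who you like'.split()):
    print(tree)

# 'cats you like': the filler must be [+WH], so no parse is found.
print(len(list(parser.parse('cats you like'.split()))))  # -> 0
```

    The same FeatureChartParser class is what load_parser instantiates for .fcfg files, so moving between this toy fragment and feat1.fcfg changes only the grammar, not the parsing machinery.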