Abstract
Previous work on treebank parsing with discontinuous constituents using Linear Context-Free Rewriting Systems (LCFRS) has been limited to sentences of up to 30 words, for reasons of computational complexity. There have been some results on binarizing an LCFRS in a manner that minimizes parsing complexity, but the present work shows that parsing long sentences with such an optimally binarized grammar remains infeasible. Instead, we introduce a technique that removes this length restriction while maintaining respectable accuracy. The resulting parser has been applied to a discontinuous treebank with favorable results.
Original language | English |
---|---|
Title | Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Avignon, France |
Pages | 460-470 |
Status | Published - 2012 |