EusEduSeg: syntax-based text segmentation tool for Basque
Contact: mikel.iruskieta at ehu.eus
Within the framework of Rhetorical Structure Theory (RST; Mann and Thompson, 1987), this segmenter was developed as a first step towards automatic rhetorical analysis for Basque. The segmenter uses the parser MALTIXA (Diaz de Ilarraza et al. 2005), and our purpose is to automatically detect Elementary Discourse Units (EDUs), also called discourse segments (propositions). EDU segmentation is defined in Iruskieta (2014). In the future, this segmentation will be the basis for automatically building the corresponding RST tree and for many other NLP applications.
NOTE: To preserve paragraphs, this tool treats every line break as a paragraph boundary. Each paragraph of the input should therefore be on a single line.
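To illustrate what the note means for input text, here is a minimal Python sketch of line-break-based paragraph splitting. It is not the tool's own code: the function name split_paragraphs is hypothetical, and skipping blank lines is an assumption made only for this illustration.

```python
def split_paragraphs(text: str) -> list[str]:
    """Illustrative only: treat every line break as a paragraph boundary.

    Hypothetical helper, not part of EusEduSeg. Dropping blank lines is an
    assumption of this sketch; the tool may handle them differently.
    """
    # str.splitlines() splits on \n, \r\n and \r (and a few other
    # Unicode line boundaries), so each physical line becomes one item.
    return [line.strip() for line in text.splitlines() if line.strip()]


# One logical paragraph that was hard-wrapped over two lines,
# followed by a second paragraph on its own line.
wrapped = "This paragraph was wrapped\nover two lines.\nSecond paragraph."
paragraphs = split_paragraphs(wrapped)
print(len(paragraphs))  # 3: the wrapped paragraph counts as two
```

Under this reading, hard-wrapped text yields more paragraphs than intended, so it may help to join each paragraph onto one line before submitting it.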