RST Basque TreeBank




   The RST Basque Treebank was annotated at the subsentential level following Tofiloski et al. (2009), using the extended classification of discourse relations of Rhetorical Structure Theory (RST) by Mann and Thompson (1988). The annotated corpus contains 60 abstracts from three domains: medical, terminological and scientific. RSTTool (O'Donnell 2000), an annotation interface for RST, was used to annotate the corpus, and RhetDataBase was used to annotate the signals of the rhetorical relations.

On this website the user may look up:
  • all the occurrences of any relation in the corpus,
  • the relations of a chosen text,
  • the linear segmentation of a text,
  • the rhetorical relations that are linked to the central unit in the discourse structure,
  • the signals of the rhetorical relations, and
  • any information in the corpus based on part of speech.

You are free to use any information from this website, but we would appreciate an acknowledgement. The proper way to cite the RST Basque Treebank is the following:

    Iruskieta, M.; Aranzabe, M.J.; Diaz de Ilarraza, A.; Gonzalez, I.; Lersundi, M.; Lopez de Lacalle, O. 2013. The RST Basque TreeBank: an online search interface to check rhetorical relations. Paper presented at the 4th Workshop "RST and Discourse Studies", Brazil, October 21-23.

    Team:
    • Oier Lopez de Lacalle
    • Esther Miranda
    • Kike Fernandez
    • Maxux Aranzabe
    • Itziar Gonzalez
    • Mikel Lersundi
    • Arantza Diaz de Ilarraza
    • Mikel Iruskieta