RST Chinese Treebank
The RST Spanish-Chinese Treebank is a corpus of specialized texts in Spanish and their parallel texts in Chinese. All texts are manually annotated with discourse relations within the framework of Rhetorical Structure Theory (RST) (Mann and Thompson, 1988). The corpus is annotated with RSTTool (O’Donnell, 2000), and the annotation results are saved with rstWeb (Zeldes, 2016).
In total, the corpus contains 100 texts. The genres of these texts are: (a) scientific abstract; (b) advertisement; (c) news; and (d) announcement. The topics of the corpus are: (a) terminology; (b) culture; (c) language; (d) economy; (e) education; (f) art; and (g) international affairs.
On this website, you can find:
  • The texts and a search tool for querying the corpus by part of speech (POS)
  • The occurrences of each discourse relation
  • Discourse structure of a text
  • Linear segmentation of each text
How to use this corpus correctly?
To use this corpus appropriately, please cite the following references:
  • Cao Shuyuan, Xue Nianwen, da Cunha Iria, Iruskieta Mikel, and Wang Chuan. 2017. Discourse Segmentation for Building a RST Chinese Treebank. In Proceedings of 6th Workshop “Recent Advances in RST and Related Formalisms”, 73-81.
  • Cao Shuyuan, da Cunha Iria, and Iruskieta Mikel. 2017. Toward the Elaboration of a Spanish-Chinese Parallel Annotated Corpus. EPiC Series in Language and Linguistics, 2: 315-324.
  • Cao Shuyuan, da Cunha Iria, and Iruskieta Mikel. 2016. A Corpus-based Approach for Spanish-Chinese Language Learning. In Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLP-TEA3), 97-106.
Who are we?
Shuyuan Cao (Universitat Pompeu Fabra)
Mikel Iruskieta (University of the Basque Country UPV/EHU)
Iria da Cunha (Universidad Nacional de Educación a Distancia)
Nianwen Xue (Brandeis University)
Esther Miranda (University of the Basque Country UPV/EHU)
Kike Fernandez (University of the Basque Country UPV/EHU)