RST Chinese Treebank
The RST Chinese Treebank is a corpus of specialized texts in Chinese and their parallel texts in Spanish. All the texts are annotated manually with discourse relations under the theoretical framework of Rhetorical Structure Theory (RST) (Mann and Thompson, 1988). The annotation tool is RSTTool (O’Donnell, 2000).
The corpus contains 100 texts. Their genres are: (a) scientific abstract; (b) advertisement; (c) news; and (d) announcement. Their topics are: (a) terminology; (b) culture; (c) language; (d) economy; (e) education; (f) art; and (g) international affairs.
On this website, you can find:
  • The texts and a tool for searching the corpus by part of speech.
  • The occurrences of each discourse relation (Forthcoming)
  • Discourse structure of a text (Forthcoming)
  • Linear segmentation of each text (Forthcoming)
  • A protocol that contains discourse information for Spanish-Chinese language translation and learning (Forthcoming)
How should this corpus be used correctly?
If you use this corpus, we would appreciate it if you cited the following reference:
Cao, S. Y.; da Cunha, I.; Iruskieta, M. (2016). «Elaboration of a Spanish-Chinese parallel corpus with translation and language learning purposes». In 34th International Conference of the Spanish Society for Applied Linguistics (AESLA).

Who are we?
Participants:

Shuyuan Cao (曹书源) (Universitat Pompeu Fabra)
Mikel Iruskieta (University of the Basque Country UPV/EHU)
Iria da Cunha (Universidad Nacional de Educación a Distancia)
NianWen Xue (薛念文) (Brandeis University)
Esther Miranda (University of the Basque Country UPV/EHU)
Kike Fernandez (University of the Basque Country UPV/EHU)



RST 中文树库是一个包含专业文章的中文语料库,同时其西班牙语平行语料库也被囊括其中。采用Rhetorical Structure Theory (RST) (Mann and Thompson, 1988)作为框架理论,所有的文章都采用了人工标注的方式进行语篇关系的标注。语料库的标注工具是 RSTTool (O’Donnell, 2000)。
本语料库包含了100篇文章。文章的类型分别为:(a)科学概要;(b)广告信息;(c)新闻和(d)通告;文章所包含的话题囊括了:(a)术语;(b)文化;(c)语言;(d)经济;(e)教育;(f)艺术以及 (g)国际事务。
在本网站您可以找到下列信息:
  • 语料库文章和基于词汇分类的语料库信息搜索。
  • 语篇关系(即将公布)
  • 文章的语篇结构(即将公布)
  • 文章的线性切分(即将公布)
  • 包含了语篇结构并可以用于西班牙语和中文中间的翻译以及第二外语学习的信息汇编(即将公布)
如何正确的使用语料库?
如果您能引用以下的文章对语料库进行正确的使用,我们将不胜感激。
Cao, S. Y.; da Cunha, I.; Iruskieta, M. (2016). «Elaboration of a Spanish-Chinese parallel corpus with translation and language learning purposes». In 34th International Conference of the Spanish Society for Applied Linguistics (AESLA).