"One Entity per Discourse" and "One Entity per Collocation" Improve Named-Entity Disambiguation


This web page complements the paper published at COLING 2014 (Barrena et al. 2014) and contains the manually annotated data used in its analysis and experiments. The data consists of excerpts from TAC-KBP entity linking track documents (http://nlp.cs.rpi.edu/kbp/2014/) and was annotated by the authors.

The release comprises a dataset and a gold standard for each of the "One sense" hypotheses discussed in the paper.


Please send any questions or problems to ander -dot- barrena -at- ehu -dot- es

Release data

One Sense per Discourse - Dataset
One Sense per Discourse - Gold Standard

One Sense per Syntactic Collocation - Dataset
One Sense per Syntactic Collocation - Gold Standard

One Sense per Proposition - Dataset
One Sense per Proposition - Gold Standard


Ander Barrena, Eneko Agirre, Bernardo Cabaleiro, Anselmo Peñas, Aitor Soroa. "One Entity per Discourse" and "One Entity per Collocation" Improve Named-Entity Disambiguation. Proceedings of COLING. 2014. (pdf, bibtex)


This work was partially funded by MINECO (CHIST-ERA READERS project – PCIN-2013-002-C02-01) and the European Commission (QTLEAP – FP7-ICT-2013.4.1-610516, OPENER – FP7-ICT-2011-SME-DCL-296451). Ander Barrena is supported by a PhD grant from the University of the Basque Country.

IXA group