New in STS 2013, we provide STS common, a shared annotation and inference pipeline for STS. Strong open-source baselines such as DKPro can be found in the STS wiki, a collaboratively maintained site open to the STS community that offers a comprehensive list of evaluation tasks, datasets, software, and papers related to STS.

The data for STS 2013 comprises the following:

STS Core task:

  • Initial training data, consisting of the 2012 train and test data. It spans 5 datasets: paraphrase sentence pairs (MSRpar), sentence pairs from video descriptions (MSRvid), MT evaluation sentence pairs (MTnews and MTeuroparl) and gloss pairs (OnWN). This data is now included in the trial data for the core STS task (see below).
  • Trial data for the core STS task, including all the data from STS 2012 (additional details). Note that there is no new training data in 2013, but you can use the 2012 data.
  • Test data with gold standard annotations. IMPORTANT NOTICE: Due to license restrictions, the SMT data needs to be downloaded from LDC, see
  • System submissions (you can also download 2012 system submissions).

STS Typed-similarity pilot task:



- *SEM program (incl. papers)
- *SEM registration open
- System runs available
- Gold standard data now available
- Results now available
- Train data for pilot on typed similarity available
- Trial data available
- STS selected as shared task of *SEM 2013
- Please join the mailing list for updates