Package info
The R6j (derived from R6f)
- Changed the settings of s_rel, s_rel1 and s_pos again (see Extractor)
- Cleaned up the ExtractorR.encodeCat method
- Changed the feature calculation in class ExtractorR from the c methods to the d methods, e.g. c4 -> d4
The R6f0 (derived from R6f)
- Feature extraction for sets of states in ExtractorR: found the bug of version R6h.
- In a final test on the Penn Treebank (penn-to-malt conversion), we reached LAS/UAS/Tag: 92.31/93.35/97.23.
- settings: beam: 40; threshold=0.2; pos tag considered 2; hsize 500 000 001; tagger-hsize 90000001 ; -tt 25 ; -tx 2; -ti 10; -tnumber 10
- Measured a speed improvement of about 20% compared to version R6f.
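
For readers unfamiliar with the LAS/UAS figures above, the following is a minimal sketch of how the two attachment scores are commonly computed. It is not the evaluation code used for this package; the class name AttachmentScores and the method score are hypothetical, and the sketch assumes that every token is counted (some evaluation setups exclude punctuation).

```java
final class AttachmentScores {
    /**
     * Returns {LAS, UAS} in percent.
     * heads[i] is the head index of token i, labels[i] its dependency label.
     * Assumption: all tokens are scored, including punctuation.
     */
    static double[] score(int[] goldHeads, String[] goldLabels,
                          int[] predHeads, String[] predLabels) {
        int n = goldHeads.length;
        if (n == 0 || goldLabels.length != n
                || predHeads.length != n || predLabels.length != n) {
            throw new IllegalArgumentException("inputs must be non-empty and of equal length");
        }
        int correctHeads = 0;   // head is right (counts towards UAS)
        int correctArcs = 0;    // head and label are right (counts towards LAS)
        for (int i = 0; i < n; i++) {
            if (goldHeads[i] == predHeads[i]) {
                correctHeads++;
                if (goldLabels[i].equals(predLabels[i])) {
                    correctArcs++;
                }
            }
        }
        return new double[] {100.0 * correctArcs / n, 100.0 * correctHeads / n};
    }
}
```

UAS counts a token as correct when its predicted head matches the gold head; LAS additionally requires the dependency label to match, so LAS can never exceed UAS.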
The R6f (derived from R6b)
- Corrected or cleaned up cases in the completion model
- Changed the else value in the tagger setting of the static feature model of the transition-based parser from else v+=2 -> +0
The R6c (derived from R6b)
- I moved the label creation outside of the main loop so that it is done in a uniform way.
The R6b (derived from R6a)
- In this version, I measured the time consumed by the different components and improved the feature extraction:
- transition-based features
The R6a (derived from ysp7b)
- I improved the speed of this version and made the class state slimmer.
- I fixed two bugs:
(1) One in the tagger: the same POS tag was provided several times in the n-best list of tags. I am still surprised that this did not cause more harm to the scores.
(2) The swap operation causes a second shift. Because of this, we get the same combination of state and POS tag a second time with the same score. Now, we avoid getting the same variant again.
To resolve these two cases, we did not add an equal candidate when its score was as high as that of the top-ranked one. I guess we lost a couple of good results.
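
The duplicate problem described in (1) and (2) can be illustrated with a minimal sketch. This is not the parser's actual code: the class BeamDedup, the Candidate type and its fields are hypothetical, and the sketch assumes a candidate is identified by a state signature plus a POS tag. It keeps only the first occurrence of each combination.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class BeamDedup {
    /** Hypothetical beam entry: a parser-state signature, a POS tag and a score. */
    static final class Candidate {
        final String stateSignature;
        final String posTag;
        final double score;

        Candidate(String stateSignature, String posTag, double score) {
            this.stateSignature = stateSignature;
            this.posTag = posTag;
            this.score = score;
        }
    }

    /** Keeps the first occurrence of each (state, tag) pair and preserves the input order. */
    static List<Candidate> dropDuplicates(List<Candidate> candidates) {
        Set<String> seen = new HashSet<>();
        List<Candidate> unique = new ArrayList<>();
        for (Candidate c : candidates) {
            // Assumption: the tab character never occurs in a signature or tag.
            String key = c.stateSignature + "\t" + c.posTag;
            if (seen.add(key)) {   // add() returns false if the key was already present
                unique.add(c);
            }
        }
        return unique;
    }
}
```

Filtering on the (state, tag) key rather than on the score alone avoids discarding distinct candidates that merely happen to tie with the top-ranked one.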
Adapted the tagger for use in the shared task on parsing web text
- Use predicted POS tags in the completion model
- Use model 7 and retag
Fixed a bug in the Tagger class: moved the best tag inside the loop of the static tagger
Comparison between complete-only scoring and incomplete scoring
This version uses incomplete scoring.