Automatic construction of labeled clusters of named entities for IR
Henock Tilahun Teffera
Broschiertes Buch

Automatic construction of labeled clusters of named entities for IR

Thesis for European Master's in Language and Communication Technology

Versandkostenfrei!
Versandfertig in 6-10 Tagen
32,99 €
inkl. MwSt.
PAYBACK Punkte
16 °P sammeln!
In this study we have tried to harvest labeled clusters of semantically similar named entities which can be used as a first step for web document clustering. We first collect ~44,000 named entities from a thesaurus which is constructed by Dekang Lin applying a word similarity measure based on their distributional pattern. Using their similarity metrics and CLUTO clustering software, we create 2000 semantically similar clusters of the named entities. Then we collect ~305,500 label-instance pairs from the 2007 English Wikipedia dump and implement a labeling algorithm presented by Benjamin Van Du...