
Effective Techniques for Indonesian Text Retrieval
In this thesis, we investigate information retrieval techniques for Indonesian. Stemming is the process of reducing morphological variants of a word to a common stem form. Although several stemming algorithms have been proposed for Indonesian, there is no consensus on which performs best. We empirically explore these stemming algorithms, propose novel extensions to the best algorithm, develop a new Indonesian stemmer, and show that these can improve stemming correctness.
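As a rough illustration of what stemming means for Indonesian, the sketch below strips a few common affixes and checks each candidate against a root-word list. It is not the stemmer developed in the thesis: the tiny `ROOT_WORDS` set, the affix lists, and the function names are all assumptions chosen for the example, and the affix lists are far from complete.

```python
# Toy root-word list (assumption); a real stemmer would load a full dictionary.
ROOT_WORDS = {"baca", "makan", "buku", "ajar", "main"}

# Illustrative, incomplete affix lists (assumptions for this sketch).
PARTICLES = ("lah", "kah", "pun")
POSSESSIVES = ("nya", "ku", "mu")
DERIVATIONAL_SUFFIXES = ("kan", "an", "i")
PREFIXES = ("meng", "mem", "men", "me", "ber", "bel", "di", "ke", "pe", "ter")


def strip_suffix(word, suffixes):
    """Remove the first matching suffix, keeping at least two characters."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word


def stem(word):
    """Return a root from ROOT_WORDS if affix stripping finds one, else the word."""
    word = word.lower()
    if word in ROOT_WORDS:
        # Already a root: do not strip anything (avoids over-stemming).
        return word

    # Strip suffixes in order: particle, possessive, derivational suffix.
    candidate = word
    for suffixes in (PARTICLES, POSSESSIVES, DERIVATIONAL_SUFFIXES):
        candidate = strip_suffix(candidate, suffixes)
        if candidate in ROOT_WORDS:
            return candidate

    # Then try removing a single prefix from the suffix-stripped form.
    for prefix in PREFIXES:
        if candidate.startswith(prefix) and candidate[len(prefix):] in ROOT_WORDS:
            return candidate[len(prefix):]

    # No known root found: leave the word unchanged.
    return word


for w in ["membaca", "makanan", "bukunya", "belajar", "dibacakan", "komputer"]:
    print(w, "->", stem(w))
```

Checking every candidate against the root-word list is what keeps a word such as "makan" from being cut down further, at the cost of leaving out-of-dictionary words unstemmed.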
We propose a range of techniques to enhance the performance of Indonesian information retrieval. Our experiments show that many of these techniques can increase retrieval performance.
We also address the problem of automatically creating parallel corpora, which are essential for cross-lingual information retrieval and other natural language processing tasks, including machine translation. We describe algorithms that we have developed to automatically identify parallel documents for Indonesian and English. We also investigate the applicability of our identification algorithms to other languages that use the Latin alphabet, including German and French.