EXTRACTING PARALLEL PHRASES FROM ENGLISH-PUNJABI CORPORA
Manpreet Singh Lehal
Paperback

AN INTEGRATED APPROACH

52,99 €
incl. VAT
This study presents a novel approach to extracting parallel data from a comparable English-Punjabi corpus, addressing the scarcity of parallel corpora for this language pair. Unlike previous research, the approach focuses on producing high-precision parallel data with minimal resources. The data is drawn from diverse sources, including Wikipedia articles, TDIL's noisy parallel sentences, and Gyan Nidhi reports. The methodology consists of three phases: extracting and aligning documents, translating the Punjabi texts into English using OpenNMT-py, and calculating content similarity through three me...
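
As a rough illustration of the third phase, the sketch below scores English sentences against Punjabi sentences that have already been machine-translated into English, and keeps only the pairs whose score clears a threshold. The description above is cut off before it names its three similarity measures, so the token-overlap (Jaccard) score used here is a stand-in chosen for illustration and not the author's method. The names `tokens`, `jaccard`, and `extract_pairs`, and the 0.5 threshold, are hypothetical.

```python
import re


def tokens(text):
    """Lowercase a sentence and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; an assumed stand-in measure."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def extract_pairs(english, translated, threshold=0.5):
    """For each translated sentence, keep its best-scoring English match
    if the score reaches the (hypothetical) threshold.

    Returns a list of (english_index, translated_index, score) tuples.
    """
    pairs = []
    for j, t in enumerate(translated):
        best_i, best = -1, 0.0
        for i, e in enumerate(english):
            score = jaccard(e, t)
            if score > best:
                best_i, best = i, score
        if best_i >= 0 and best >= threshold:
            pairs.append((best_i, j, best))
    return pairs


# Toy data: the second list stands in for translated Punjabi sentences.
english = ["The river flows through the city.", "He was born in 1950."]
translated = ["The river flows across the city.", "Cricket is a popular sport."]
pairs = extract_pairs(english, translated)
```

A higher threshold trades recall for precision, which matches the stated goal of high-precision output: fewer pairs survive, but those that do are more likely to be true translations.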