Content Extraction
Thomas Gottron
Paperback

Identifying the Main Content in HTML Documents

Besides the article that forms the main content, most HTML documents on the WWW contain additional content such as navigation menus, design elements, or commercial banners. Several applications need to distinguish between main and additional content automatically. Content extraction and template detection are the two approaches to this task. This book gives an extensive overview and a detailed description of existing and newly developed algorithms from both areas. The described content extraction algorithms are evaluated under different aspects using object...