This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation. With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication.
Packed with examples and exercises, Natural Language Processing with Python will help you:
Extract information from unstructured text, either to guess the topic or identify "named entities"
Analyze linguistic structure in text, including parsing and semantic analysis
Access popular linguistic databases, including WordNet and treebanks
Integrate techniques drawn from fields as diverse as linguistics and artificial intelligence
This book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful.
Steven Bird is Associate Professor in the Department of Computer Science and Software Engineering at the University of Melbourne, and Senior Research Associate in the Linguistic Data Consortium at the University of Pennsylvania. He completed a PhD on computational phonology at the University of Edinburgh in 1990, supervised by Ewan Klein. He later moved to Cameroon to conduct linguistic fieldwork on the Grassfields Bantu languages under the auspices of the Summer Institute of Linguistics. More recently, he spent several years as Associate Director of the Linguistic Data Consortium, where he led an R&D team to create models and tools for large databases of annotated text. At Melbourne University, he established a language technology research group and has taught at all levels of the undergraduate computer science curriculum. Steven is the 2009 President of the Association for Computational Linguistics.

Ewan Klein is Professor of Language Technology in the School of Informatics at the University of Edinburgh. He completed a PhD on formal semantics at the University of Cambridge in 1978. After some years working at the Universities of Sussex and Newcastle upon Tyne, Ewan took up a teaching position at Edinburgh. He was involved in the establishment of Edinburgh's Language Technology Group in 1993, and has been closely associated with it ever since. From 2000 to 2002, he took leave from the University to act as Research Manager for the Edinburgh-based Natural Language Research Group of Edify Corporation, Santa Clara, and was responsible for spoken dialogue processing. Ewan is a past President of the European Chapter of the Association for Computational Linguistics and was a founding member and Coordinator of the European Network of Excellence in Human Language Technologies (ELSNET).

Edward Loper has recently completed a PhD on machine learning for natural language processing at the University of Pennsylvania. Edward was a student in Steven's graduate course on computational linguistics in the fall of 2000, and went on to be a TA and share in the development of NLTK. In addition to NLTK, he has helped develop two packages for documenting and testing Python software, epydoc and doctest.
Table of Contents
Chapter 1 Language Processing and Python
  Computing with Language: Texts and Words
  A Closer Look at Python: Texts as Lists of Words
  Computing with Language: Simple Statistics
  Back to Python: Making Decisions and Taking Control
  Automatic Natural Language Understanding
  Summary
  Further Reading
  Exercises
Chapter 2 Accessing Text Corpora and Lexical Resources
  Accessing Text Corpora
  Conditional Frequency Distributions
  More Python: Reusing Code
  Lexical Resources
  WordNet
  Summary
  Further Reading
  Exercises
Chapter 3 Processing Raw Text
  Accessing Text from the Web and from Disk
  Strings: Text Processing at the Lowest Level
  Text Processing with Unicode
  Regular Expressions for Detecting Word Patterns
  Useful Applications of Regular Expressions
  Normalizing Text
  Regular Expressions for Tokenizing Text
  Segmentation
  Formatting: From Lists to Strings
  Summary
  Further Reading
  Exercises
Chapter 4 Writing Structured Programs
  Back to the Basics
  Sequences
  Questions of Style
  Functions: The Foundation of Structured Programming
  Doing More with Functions
  Program Development
  Algorithm Design
  A Sample of Python Libraries
  Summary
  Further Reading
  Exercises
Chapter 5 Categorizing and Tagging Words
  Using a Tagger
  Tagged Corpora
  Mapping Words to Properties Using Python Dictionaries
  Automatic Tagging
  N-Gram Tagging
  Transformation-Based Tagging
  How to Determine the Category of a Word
  Summary
  Further Reading
  Exercises
Chapter 6 Learning to Classify Text
  Supervised Classification
  Further Examples of Supervised Classification
  Evaluation
  Decision Trees
  Naive Bayes Classifiers
  Maximum Entropy Classifiers
  Modeling Linguistic Patterns
  Summary
  Further Reading
  Exercises
Chapter 7 Extracting Information from Text
  Information Extraction
  Chunking
  Developing and Evaluating Chunkers
  Recursion in Linguistic Structure
  Named Entity Recognition
  Relation Extraction
  Summary
  Further Reading
  Exercises
Chapter 8 Analyzing Sentence Structure
  Some Grammatical Dilemmas
  What's the Use of Syntax?
  Context-Free Grammar
  Parsing with Context-Free Grammar
  Dependencies and Dependency Grammar
  Grammar Development
  Summary
  Further Reading
  Exercises
Chapter 9 Building Feature-Based Grammars
  Grammatical Features
  Processing Feature Structures
  Extending a Feature-Based Grammar
  Summary
  Further Reading
  Exercises
Chapter 10 Analyzing the Meaning of Sentences
  Natural Language Understanding
  Propositional Logic
  First-Order Logic
  The Semantics of English Sentences
  Discourse Semantics
  Summary
  Further Reading
  Exercises
Chapter 11 Managing Linguistic Data
  Corpus Structure: A Case Study
  The Life Cycle of a Corpus
  Acquiring Data
  Working with XML
  Working with Toolbox Data
  Describing Language Resources Using OLAC Metadata
  Summary
  Further Reading
  Exercises
Appendix Afterword: The Language Challenge
  Language Processing Versus Symbol Processing
  Contemporary Philosophical Divides
  NLTK Roadmap
  Envoi...
Appendix Bibliography
NLTK Index
General Index
Colophon
Reviews
"Natural Language Processing with Python is a successful beginner's book on computer-assisted text analysis. Concrete code examples are accompanied by theoretical explanations. The book encourages readers to experiment on their own and so quickly delivers a sense of achievement. It offers pragmatic foundational knowledge and makes the case for further engagement with the topic." -- IT-Rezensionen.de, January 2010