J. Benesty / M. M. Sondhi / Y. Huang (eds.)
Springer Handbook of Speech Processing, with DVD-ROM
Edited by Jacob Benesty, Yiteng Huang and M. Sondhi
Offered at € 79.94
- Paperback
From common consumer products such as cell phones and MP3 players to more sophisticated projects such as human-machine interfaces and responsive robots, speech technologies are now everywhere. Many think that it is just a matter of time before more applications of the science of speech become inescapable in our daily life. This handbook is meant to play a fundamental role in sustaining progress in speech research and development. Springer Handbook of Speech Processing targets three categories of readers: graduate students; professors and active researchers in academia and research labs; and engineers in industry who need to understand or implement specific algorithms for their speech-related products. The handbook could also be used as a sourcebook for one or more graduate courses on signal processing for speech and on different aspects of speech processing and applications. A quickly accessible source of application-oriented, authoritative and comprehensive information about these technologies, it combines the established knowledge derived from research in such fast-evolving disciplines as signal processing and communications, acoustics, computer science and linguistics.
Product details
- Publisher: Springer, Berlin
- Number of pages: 1176
- Publication date: November 2007
- Language: English
- Dimensions: 245mm
- Weight: 2482g
- ISBN-13: 9783540491255
- ISBN-10: 3540491252
- Item no.: 20924788
J. Benesty, Université du Québec, Montréal, QC, Canada / M. M. Sondhi, Avaya Labs Research, Basking Ridge, NJ, USA / Y. Huang, Bell Labs, Murray Hill, NJ, USA
Foreword by J. L. Flanagan
Chap. 1 Introduction to Speech Processing
Part A: Production, Perception, and Modeling of Speech (M. M. Sondhi)
Part A presents contemporary views on human phonatory and articulatory mechanisms to illustrate the physiological processes of speech production. It also covers nonlinear cochlear signal processing and auditory masking, the perception of speech and sound by humans, and various methods for speech quality assessment, with a focus on standardized methods.
Chap. 2 Physiological Processes of Speech Production
Chap. 3 Nonlinear Cochlear Signal Processing and Masking in Speech Perception
Chap. 4 Perception of Speech and Sound
Chap. 5 Speech Quality Estimation
Part B: Signal Processing for Speech (Y. Huang, J. Benesty)
Part B presents a large number of signal processing concepts and algorithms that are widely used in speech processing and its applications.
Chap. 6 Wiener and Adaptive Filters
Chap. 7 Linear Prediction
Chap. 8 Kalman Filter
Chap. 9 Homomorphic Systems and Cepstrum Analysis of Speech
Chap. 10 Pitch and Voicing Determination of Speech with an Extension Toward Music Signals
Chap. 11 Formant Estimation and Tracking
Chap. 12 The STFT, Sinusoidal Models, and Speech Modification
Chap. 13 Adaptive Blind Multichannel Identification
Part C: Speech Coding (W. B. Kleijn)
Part C discusses the attributes of speech coders as well as the underlying principles that determine their behavior and architecture. Coders for both traditional and packet networks are discussed, as well as low-bit-rate speech coding, various speech coding standards, and perceptual audio coders.
Chap. 14 Principles of Speech Coding
Chap. 15 Voice over IP: Speech Transmission over Packet Networks
Chap. 16 Low-Bit-Rate Speech Coding
Chap. 17 Analysis-by-Synthesis Speech Coding
Chap. 18 Perceptual Audio Coding of Speech Signals
Part D: Text-to-Speech Synthesis (S. Narayanan)
Part D presents different techniques for speech synthesis, including rule-based, corpus-based, and a combination of both. Linguistic analysis and prosodic processing, which are important parts of a text-to-speech (TTS) system, are reviewed. Other aspects of interest for TTS such as voice transformation and synthesis of expressive speech are also discussed.
Chap. 19 Basic Principles of Speech Synthesis
Chap. 20 Rule-Based Speech Synthesis
Chap. 21 Corpus-Based Speech Synthesis
Chap. 22 Linguistic Processing for Speech Synthesis
Chap. 23 Prosodic Processing
Chap. 24 Voice Transformation
Chap. 25 Expressive/Affective Speech Synthesis
Part E: Speech Recognition (L. Rabiner, B.-H. Juang)
Part E describes the most important speech recognition technologies. The approach based on hidden Markov models is presented in depth, and some other promising approaches are outlined. Robustness issues concerning the acoustical environment are studied. Finally, several fundamental applications are discussed.
Chap. 26 Historical Perspective of the Field of ASR/NLU
Chap. 27 HMMs and Related Speech Technologies
Chap. 28 Speech Recognition with Weighted Finite-State Transducers
Chap. 29 A Machine Learning Framework for Spoken-Dialog Classification
Chap. 30 Towards Superhuman Speech Recognition
Chap. 31 Natural Language Understanding
Chap. 32 Transcription and Distillation of Spontaneous Speech
Chap. 33 Environmental Robustness
Chap. 34 The Business of Speech Technologies
Chap. 35 Spoken Dialog Systems
Part F: Speaker Recognition (S. Parthasarathy)
Part F develops the field of speaker recognition. It covers text-dependent and text-independent speaker recognition and their applications.
Chap. 36 Overview of Speaker Recognition
Chap. 37 Text-Dependent Speaker Recognition
Chap. 38 Text-Independent Speaker Recognition
Part G: Language Recognition (C.-H. Lee)
Part G provides an overview of the principles of state-of-the-art language recognition approaches. Language characterization, identification, and modeling are addressed. Vector space characterization approaches to converting speech utterances into spoken document vectors for modeling and classification are also presented.
Chap. 39 Principle of Spoken Language Recognition
Chap. 40 Spoken Language Characterization
Chap. 41 Automatic Language Recognition via Spectral and Token Based Approaches
Chap. 42 Vector Based Spoken Language Classification
Part H: Speech Enhancement (J. Chen, S. Gannot, J. Benesty)
Part H develops all classical aspects of speech enhancement: noise reduction, dereverberation, echo cancellation, feedback control, and active noise control.
Chap. 43 Fundamentals of Noise Reduction
Chap. 44 Spectral Enhancement Methods
Chap. 45 Echo Cancellation
Chap. 46 Dereverberation
Chap. 47 Adaptive Beamforming and Postfiltering
Chap. 48 Feedback Control in Hearing Aids
Chap. 49 Active Noise Control
Part I: Multichannel Speech Processing (J. Benesty, I. Cohen, Y. Huang)
Part I presents modern aspects of multichannel processing for acoustic scene analysis, speech acquisition, and presentation when a large number of microphones and loudspeakers are available.
Chap. 50 Microphone Arrays
Chap. 51 Time Delay Estimation and Source Localization
Chap. 52 Convolutive Blind Source Separation Methods
Chap. 53 Sound Field Reproduction
About the Authors
Subject Index
Foreword by J. L. Flanagan
Chap. 1 Introduction to Speech Processing
Part A: Production, Perception, and Modeling of Speech (M. M. Sondhi)
Part A describes the contemporary views on phonatory and articulatory mechanisms of humans to illustrate the physiological processes of speech production. It also describes the nonlinear cochlear speech processing in auditory masking, the perception of speech and sound by humans, and various methods for speech quality assessment with a focus on standardized methods.
Chap. 2 Physiological Processes of Speech Production
Chap. 3 Nonlinear Cochlear Signal Processing and Masking in Speech Perception
Chap. 4 Perception of Speech and Sound
Chap. 5 Speech Quality Estimation
Part B: Signal Processing for Speech (Y. Huang, J. Benesty)
Part B gives a large number of signal processing concepts and algorithms that are widely used in speech processing and in the applications of speech.
Chap. 6 Wiener and Adaptive Filters
Chap. 7 Linear Prediction
Chap. 8 Kalman Filter
Chap. 9 Homomorphic Systems and Cepstrum Analysis of Speech
Chap. 10 Pitch and Voicing Determination of Speech with an Extension Toward Music Signals
Chap. 11 Formant Estimation and Tracking
Chap. 12 The STFT, Sinusoidal Models, and Speech Modification
Chap. 13 Adaptive Blind Multichannel Identification
Part C: Speech Coding (W. B. Kleijn)
Part C discusses the attributes of speech coders as well as the underlying principles that determine their behavior and architecture. Coders for both traditional and packet networks are discussed, as well as low-bit-rate speech coding, various speech coding standards, and perceptual audio coders.
Chap. 14 Principles of Speech Coding
Chap. 15 Voice over IP: Speech Transmission over Packet Networks
Chap. 16 Low-Bit-Rate Speech Coding
Chap. 17 Analysis-by-Synthesis Speech Coding
Chap. 18 Perceptual Audio Coding of Speech Signals
Part D: Text-to-Speech Synthesis (S. Narayanan)
Part D presents different techniques for speech synthesis, including rule-based, corpus-based, and a combination of both. Linguistic analysis and prosodic processing, which are important parts of a text-to-speech (TTS) system, are reviewed. Other aspects of interest for TTS such as voice transformation and synthesis of expressive speech are also discussed.
Chap. 19 Basic Principles of Speech Synthesis
Chap. 20 Rule-Based Speech Synthesis
Chap. 21 Corpus-Based Speech Synthesis
Chap. 22 Linguistic Processing for Speech Synthesis
Chap. 23 Prosodic Processing
Chap. 24 Voice Transformation
Chap. 25 Expressive/Affective Speech Synthesis
Part E: Speech Recognition (L. Rabiner, B.-H. Juang)
Part E describes the most important speech recognition technologies. The approach based on the powerful hidden Markov models is generously presented and some other promising approaches are outlined. The robustness issues concerning the acoustical environment are studied. Finally, several fundamental applications are also discussed.
Chap. 26 Historical Perspective of the Field of ASR/NLU
Chap. 27 HMMs and Related Speech Technologies
Chap. 28 Speech Recognition with Weighted Finite-State Transducers
Chap. 29 A Machine Learning Framework for Spoken-Dialog Classification
Chap. 30 Towards Superhuman Speech Recognition
Chap. 31 Natural Language Understanding
Chap. 32 Transcription and Distillation of Spontaneous Speech
Chap. 33 Environmental Robustness
Chap. 34 The Business of Speech Technologies
Chap. 35 Spoken Dialog Systems
Part F: Speaker Recognition (S. Parthasarathy)
Part F develops the field of speaker recognition. It covers text-dependent and text-independent speaker recognition and their applications.
Chap. 36 Overview of Speaker Recognition
Chap. 37 Text-Dependent Speaker Recognition
Chap. 38 Text-Independent Speaker Recognition
Part G: Language Recognition (C.-H. Lee)
Part G provides an overview of the principles behind state-of-the-art language recognition approaches. Language characterization, identification, and modeling are addressed. Vector space characterization approaches, which convert speech utterances into spoken document vectors for modeling and classification, are also presented.
Chap. 39 Principles of Spoken Language Recognition
Chap. 40 Spoken Language Characterization
Chap. 41 Automatic Language Recognition via Spectral and Token Based Approaches
Chap. 42 Vector Based Spoken Language Classification
Part H: Speech Enhancement (J. Chen, S. Gannot, J. Benesty)
Part H develops all classical aspects of speech enhancement: noise reduction, dereverberation, echo cancellation, feedback control, and active noise control.
Chap. 43 Fundamentals of Noise Reduction
Chap. 44 Spectral Enhancement Methods
Chap. 45 Echo Cancellation
Chap. 46 Dereverberation
Chap. 47 Adaptive Beamforming and Postfiltering
Chap. 48 Feedback Control in Hearing Aids
Chap. 49 Active Noise Control
Part I: Multichannel Speech Processing (J. Benesty, I. Cohen, Y. Huang)
Part I presents modern aspects of multichannel processing for acoustic scene analysis, speech acquisition, and presentation in settings where a large number of microphones and loudspeakers are available.
Chap. 50 Microphone Arrays
Chap. 51 Time Delay Estimation and Source Localization
Chap. 52 Convolutive Blind Source Separation Methods
Chap. 53 Sound Field Reproduction
About the Authors
Subject Index
From the reviews:
"This massive volume contains 53 chapters and covers just about all aspects of the field of speech processing. ... The editors are commended for producing a valuable tool in the understanding of speech and speech synthesis/recognition. The book is a valuable addition to the bookshelf of researchers, speech scientists, and engineers." (Richard J. Peppin, International Journal of Acoustics and Vibration, Vol. 13 (1), 2008)
"This book is a comprehensive overview of most of the major topics associated with speech processing written by the most renowned authors in each topic. The book is well structured with a clearly organized topics. It is intended for use by the researcher ... . book is organized in nine sections that cover all current speech applications. ... In conclusion, I would highly recommend that anyone interested in speech processing have a copy of this encyclopaedic work." (Eduardo López Gonsalo, The Phonetician, Vol. I-II (97/98), 2008)