Jeff Bilmes, an assistant professor in the
Department of Electrical Engineering at the University of Washington, Seattle,
gave a lecture on
Graphical Models for Speech Recognition.
Professor Bilmes is also an adjunct assistant
professor in linguistics. He co-founded the Signal, Speech, and Language
Interpretation Laboratory at the university. He received a master's degree from
MIT and a Ph.D. in computer science from the University of California, Berkeley,
in 1999. His primary research interests lie in statistical modeling
(particularly graphical model approaches) and signal processing for speech and
pattern recognition, and language and audio processing.
"Graphical Models for Speech Recognition"
Graphical models (GMs) are flexible statistical
abstractions that offer a promising path
toward new approaches to automatic speech recognition (ASR).
This talk provided a brief overview of
GMs, covering the four main components (semantics, structure, implementation,
and parameters) needed to associate a given GM with a probabilistic model.
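To make those four components concrete, here is a minimal sketch (not from the talk; the variable names and probabilities are illustrative only) of a two-node Bayesian network, showing how a graph's structure and parameters define a joint probability model and how a simple inference is carried out:

```python
# Toy Bayesian network: Rain -> WetGrass.
# Structure: the single edge encodes a direct dependency.
# Parameters: the conditional probability tables (CPTs) below.
# Semantics: the joint factorizes as P(Rain) * P(WetGrass | Rain).
# Implementation: inference here is brute-force summation.

# P(Rain)
p_rain = {True: 0.2, False: 0.8}

# P(WetGrass | Rain), keyed as (wet, rain)
p_wet_given_rain = {
    (True, True): 0.9,  (False, True): 0.1,
    (True, False): 0.1, (False, False): 0.9,
}

def joint(rain, wet):
    """Joint probability from the factorization the graph defines."""
    return p_rain[rain] * p_wet_given_rain[(wet, rain)]

def marginal_wet():
    """Marginal P(WetGrass=True) by summing out the hidden variable."""
    return sum(joint(r, True) for r in (True, False))

print(round(marginal_wet(), 3))  # 0.2*0.9 + 0.8*0.1 = 0.26
```

Real ASR models involve many more variables and approximate inference, but the same four ingredients appear in the same roles.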
Three types of graphical models were discussed. The first type is used to represent the
process of classifier combination, yielding novel combination rules that
improved phonetic-classification accuracy over
previous combination schemes. A second type of GM, the
hidden-articulator Markov model, represents hidden information
about articulatory gestures within the human vocal tract. Finally, buried
Markov models (BMMs) augment the dependencies of an HMM
in discriminative, data-derived ways. This is a useful property for a GM
when used for classification tasks such as ASR. Relative to other
generative models, such "discriminative-generative" models promise
greater parsimony (i.e., smaller memory and compute demands)
while improving recognition accuracy.
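As a toy illustration of the classifier-combination idea behind the first model family above (the talk's GM-derived rules are not reproduced here; the sum and product rules below are standard baselines, and the classes and scores are made up):

```python
# Two classifiers each output a posterior distribution over phone
# classes for one frame; a combination rule merges them into a
# single decision.

def normalize(p):
    """Rescale scores so they sum to one."""
    s = sum(p.values())
    return {k: v / s for k, v in p.items()}

def sum_rule(p1, p2):
    """Average the posteriors (robust to independent errors)."""
    return normalize({k: p1[k] + p2[k] for k in p1})

def product_rule(p1, p2):
    """Multiply the posteriors (assumes the classifiers err independently)."""
    return normalize({k: p1[k] * p2[k] for k in p1})

# Hypothetical posteriors over the phones /b/, /d/, /g/:
c1 = {"b": 0.6, "d": 0.3, "g": 0.1}
c2 = {"b": 0.5, "d": 0.2, "g": 0.3}

fused = product_rule(c1, c2)
best = max(fused, key=fused.get)
print(best)  # "b", the class both classifiers favor
```

A GM formulation generalizes such rules by making the assumed dependencies between classifiers explicit in the graph, rather than fixing an independence assumption up front.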