Kernel eigenvoice

Speaker adaptation is an important technology to fine-tune either features or speech models for mis-match due to inter-speaker variation. In the last decade, eigenvoice (EV) speaker adaptation has been developed. It makes use of the prior knowledge of training speakers to provide a fast adaptation algorithm (in other words, only a small amount of adaptation data is needed). Inspired by the kernel eigenface idea in face recognition, kernel eigenvoice (KEV) is proposed.[1] KEV is a non-linear generalization to EV. This incorporates Kernel principal component analysis, a non-linear version of Principal Component Analysis, to capture higher order correlations in order to further explore the speaker space and enhance recognition performance.

See also

References

  1. "Kernel Eigenvoice Thesis" (PDF). Archived from the original (PDF) on 2011-06-10. Retrieved 2009-07-17.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.