Stress and emotion classification using jitter and shimmer features

Xi Li, Jidong Tao, Michael T. Johnson, Joseph Soltis, Anne Savage, Kirsten M. Leong, John D. Newman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

61 Scopus citations

Abstract

In this paper, we evaluate the use of appended jitter and shimmer speech features for the classification of human speaking styles and of animal vocalization arousal levels. Jitter and shimmer features are extracted from the fundamental frequency contour and added to baseline spectral features, specifically Mel-frequency Cepstral Coefficients (MFCCs) for human speech and Greenwood Function Cepstral Coefficients (GFCCs) for animal vocalizations. Hidden Markov Models (HMMs) with Gaussian Mixture Model (GMM) state distributions are used for classification. The appended jitter and shimmer features result in an increase in classification accuracy for several illustrative datasets, including the SUSAS dataset for human speaking styles as well as vocalizations labeled by arousal level for African Elephant and Rhesus Monkey species.
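The jitter and shimmer measures the abstract refers to are standard perturbation statistics computed from the pitch contour: local jitter is the mean absolute difference between consecutive pitch periods normalized by the mean period, and local shimmer is the same measure applied to peak amplitudes. The sketch below illustrates these common definitions; the exact extraction details used in the paper may differ.

```python
def local_jitter(periods):
    """Local jitter: mean absolute difference between consecutive
    pitch periods, normalized by the mean period.

    `periods` is a sequence of pitch-period durations (seconds)
    taken from voiced frames of the fundamental frequency contour.
    """
    diffs = [abs(periods[i] - periods[i - 1]) for i in range(1, len(periods))]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return mean_abs_diff / mean_period


def local_shimmer(amplitudes):
    """Local shimmer: the analogous measure on cycle peak amplitudes."""
    diffs = [abs(amplitudes[i] - amplitudes[i - 1]) for i in range(1, len(amplitudes))]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_amplitude = sum(amplitudes) / len(amplitudes)
    return mean_abs_diff / mean_amplitude


# A perfectly steady voice has zero jitter; alternating periods do not.
print(local_jitter([0.010, 0.010, 0.010]))         # 0.0
print(local_jitter([0.010, 0.011, 0.010, 0.011]))  # ~0.095
```

In the paper's setup these scalar perturbation features are appended to the per-frame spectral vectors (MFCCs or GFCCs) before HMM/GMM training.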

Original language: English (US)
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 4
DOIs
State: Published - 2007
Externally published: Yes
Event: 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07 - Honolulu, HI, United States
Duration: Apr 15 2007 – Apr 20 2007


Keywords

  • GFCC
  • HMM
  • Jitter
  • MFCC
  • Shimmer

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Signal Processing
  • Acoustics and Ultrasonics
