High-Level Analysis of Audio Features for Identifying Emotional Valence in Human Singing

Stuart Cunningham, Jonathan Weinel, Richard Picking

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding

Abstract

Emotional analysis continues to receive much attention in the audio and music community. The potential to link human affective state with the emotional content or intention of musical audio has applications in fields such as improving the user experience of digital music libraries and music therapy. Less work has been directed toward the emotional analysis of human a cappella singing. Recently, the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) was released, which includes emotionally validated human singing samples. In this work, we apply established audio analysis features to determine whether they can be used to detect underlying emotional valence in human singing. Results indicate that the short-term audio features of energy, spectral centroid (mean), spectral centroid (spread), spectral entropy, spectral flux, spectral rolloff, and fundamental frequency can be useful predictors of emotion, although their efficacy is not consistent across positive and negative emotions.
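The short-term features named in the abstract are standard frame-based spectral descriptors. As a minimal illustrative sketch (using common MIR definitions, not necessarily the exact formulations or parameters used in the paper; fundamental-frequency estimation is omitted as it requires a dedicated pitch tracker), they can be computed with NumPy as follows:

```python
import numpy as np

def short_term_features(x, fs, frame_len=1024, hop=512):
    """Per-frame energy, spectral centroid, spread, entropy, flux, rolloff.

    Illustrative only: definitions follow common music-information-retrieval
    conventions; frame length and hop size are arbitrary choices here.
    """
    feats = []
    prev_mag = None
    window = np.hanning(frame_len)
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len] * window
        mag = np.abs(np.fft.rfft(frame))
        freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)

        energy = np.sum(frame ** 2) / frame_len
        p = mag / (np.sum(mag) + 1e-12)                # normalised spectrum
        centroid = np.sum(freqs * p)                   # spectral centroid (mean)
        spread = np.sqrt(np.sum(((freqs - centroid) ** 2) * p))
        entropy = -np.sum(p * np.log2(p + 1e-12))      # spectral entropy
        flux = 0.0 if prev_mag is None else np.sqrt(np.sum((mag - prev_mag) ** 2))
        prev_mag = mag
        cum = np.cumsum(mag ** 2)
        rolloff = freqs[np.searchsorted(cum, 0.85 * cum[-1])]  # 85% rolloff

        feats.append((energy, centroid, spread, entropy, flux, rolloff))
    return np.array(feats)

# Usage: for a pure 440 Hz tone, the centroid should sit near 440 Hz.
fs = 16000
t = np.arange(fs) / fs
feats = short_term_features(np.sin(2 * np.pi * 440 * t), fs)
```

Each row of the returned array is one analysis frame, so statistics such as the mean or spread of a feature over a sung phrase can then be compared across emotion classes.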
Original language: English
Title of host publication: Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion
Publisher: Association for Computing Machinery (ACM)
ISBN (Electronic): 978-1-4503-6609-0
DOI: 10.1145/3243274.3243313
Publication status: Published - 2018
Event: Audio Mostly 2018: Sound in Immersion and Emotion - Wrexham, United Kingdom
Duration: 12 Sep 2018 - 14 Sep 2018
http://audiomostly.com/



Cite this

Cunningham, S., Weinel, J., & Picking, R. (2018). High-Level Analysis of Audio Features for Identifying Emotional Valence in Human Singing. In Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion (Article 37). Association for Computing Machinery (ACM). https://doi.org/10.1145/3243274.3243313

