Deep learning for real time facial expression recognition in social robots

Ariel Ruiz-Garcia, Nicola Webb, Vasile Palade, Mark Eastwood, Mark Elshaw

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding

Abstract

Human-robot interaction is a rapidly growing topic of interest in today’s society. The development of real-time emotion recognition will further improve the relationship between humans and social robots. However, contemporary real-time emotion recognition in unconstrained environments has yet to reach the accuracy levels achieved on controlled static datasets. In this work, we propose a Deep Convolutional Neural Network (CNN), pre-trained as a Stacked Convolutional Autoencoder (SCAE) in a greedy layer-wise unsupervised manner, for emotion recognition from facial expression images taken by a NAO robot. The SCAE model is trained to learn an illumination-invariant down-sampled feature vector. The weights of the encoder element are then used to initialize the CNN model, which is fine-tuned for classification. We train the model on a corpus composed of gamma-corrected versions of the CK+, JAFFE, FEEDTUM and KDEF datasets. The emotion recognition model produces a state-of-the-art accuracy rate of 99.14% on this corpus. We also show that the proposed training approach significantly improves the CNN’s generalisation ability by over 30% on non-uniform data collected with the NAO robot in unconstrained environments.
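The training scheme the abstract describes (greedy layer-wise unsupervised pre-training of stacked autoencoders, with the learned encoder weights then used to initialise a supervised network for fine-tuning) can be sketched in plain NumPy. This is an illustrative toy only: it uses dense rather than convolutional layers, and all layer sizes, learning rates and data are arbitrary assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, epochs=200, lr=0.5):
    """Train one dense autoencoder on X (MSE loss); return its encoder weights."""
    n_in = X.shape[1]
    W = rng.normal(0.0, 0.1, (n_in, n_hidden))   # encoder
    b = np.zeros(n_hidden)
    V = rng.normal(0.0, 0.1, (n_hidden, n_in))   # decoder
    c = np.zeros(n_in)
    for _ in range(epochs):
        H = sigmoid(X @ W + b)
        R = sigmoid(H @ V + c)                   # reconstruction of the input
        dR = (R - X) * R * (1.0 - R)             # MSE gradient through sigmoid
        dH = (dR @ V.T) * H * (1.0 - H)
        V -= lr * H.T @ dR / len(X)
        c -= lr * dR.mean(axis=0)
        W -= lr * X.T @ dH / len(X)
        b -= lr * dH.mean(axis=0)
    return W, b

def greedy_pretrain(X, layer_sizes):
    """Greedy layer-wise pre-training: each autoencoder learns to
    reconstruct the codes produced by the previously trained layer."""
    weights, H = [], X
    for n_hidden in layer_sizes:
        W, b = train_autoencoder(H, n_hidden)
        weights.append((W, b))
        H = sigmoid(H @ W + b)                   # feed codes to the next layer
    return weights

# Toy stand-in for face images: 32 samples of 64 features each.
X = rng.random((32, 64))
encoder = greedy_pretrain(X, [32, 16])

# Transfer the pre-trained encoder weights into a classifier stack;
# supervised fine-tuning of these weights would follow from here.
H = X
for W, b in encoder:
    H = sigmoid(H @ W + b)
print(H.shape)  # (32, 16)
```

The key point mirrored from the paper is the weight transfer: the unsupervised encoder is not discarded, but becomes the initialisation of the supervised network before fine-tuning on labelled expressions.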

Original language: English
Title of host publication: Neural Information Processing - 25th International Conference, ICONIP 2018, Proceedings
Publisher: Springer-Verlag London Ltd
Pages: 392-402
Number of pages: 11
ISBN (Electronic): 978-3-030-04221-9
ISBN (Print): 978-3-030-04220-2
DOI: 10.1007/978-3-030-04221-9_35
Publication status: E-pub ahead of print - 17 Nov 2018
Event: 25th International Conference on Neural Information Processing, ICONIP 2018 - Siem Reap, Cambodia
Duration: 13 Dec 2018 - 16 Dec 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11305 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 25th International Conference on Neural Information Processing, ICONIP 2018
Country: Cambodia
City: Siem Reap
Period: 13/12/18 - 16/12/18

Keywords

  • Deep convolutional neural networks
  • Emotion recognition
  • Greedy layer-wise training
  • Social robots
  • Stacked convolutional autoencoders

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Ruiz-Garcia, A., Webb, N., Palade, V., Eastwood, M., & Elshaw, M. (2018). Deep learning for real time facial expression recognition in social robots. In Neural Information Processing - 25th International Conference, ICONIP 2018, Proceedings (pp. 392-402). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11305 LNCS). Springer-Verlag London Ltd. https://doi.org/10.1007/978-3-030-04221-9_35

