Identifying speech acts in a corpus of historical migrant correspondence

Rachele De Felice, Emma Moreton

Research output: Contribution to journal (Article)

Abstract

A full account of the pragmatics of personal correspondence requires speech act annotation. Because manual annotation of large datasets can be extremely difficult, this study proposes to use an automated speech act tagger developed by the first author. The tagger was originally designed for use with business emails; however, its latest iteration can be applied to other datasets, such as personal correspondence, providing a useful resource for the corpus linguistics community. In this study, the speech act tagger is tested on a collection of letters written by Irish migrants at the end of the nineteenth century. After discussing issues related to the digitisation, transcription and annotation of historical migrant correspondence, the article reports on the results of this trial study, demonstrating how the tagger can perform with some success even on corpora with very different characteristics. Although the dataset used for this trial study is small, the findings show the potential for carrying out this type of analysis across larger digital archives, allowing different datasets to be compared while taking into consideration sociobiographic variables such as the author’s sex, class and role within the notional familial hierarchy.

Original language: English
Pages (from-to): 154-174
Number of pages: 21
Journal: Studia Neophilologica
Volume: 91
Issue number: 2
Early online date: 20 Jun 2019
DOIs: 10.1080/00393274.2019.1616216
Publication status: E-pub ahead of print - 20 Jun 2019

Keywords

  • corpus linguistics
  • correspondence corpora
  • Historical migrant letters
  • speech acts

ASJC Scopus subject areas

  • Philosophy

Cite this

Identifying speech acts in a corpus of historical migrant correspondence. / De Felice, Rachele; Moreton, Emma.

In: Studia Neophilologica, Vol. 91, No. 2, 20.06.2019, p. 154-174.

@article{7f6b3167fd5a42758d99966b421a1c2f,
title = "Identifying speech acts in a corpus of historical migrant correspondence",
abstract = "A full account of the pragmatics of personal correspondence requires speech act annotation, and as manual annotation of large datasets can be extremely difficult, this study proposes to use an automated speech act tagger developed by the first author. It was originally designed for use with business emails; however, the latest iteration of the tagger can be applied to other datasets–such as personal correspondence–providing a useful resource for the corpus linguistics community. In this study, the speech act tagger is tested on a collection of letters written by Irish migrants at the end of the nineteenth century. After discussing issues to do with the digitisation, transcription and annotation of historical migrant correspondence, the article will report on the results of this trial study, demonstrating how the tagger can perform with some success even on corpora with very different characteristics. Although the dataset used for this trial study is small, the findings show the potential for carrying out this type of analysis across larger digital archives allowing for different datasets to be compared, taking into consideration sociobiographic variables such as the author’s sex, class and role within the notional familial hierarchy.",
keywords = "corpus linguistics, correspondence corpora, Historical migrant letters, speech acts",
author = "{De Felice}, Rachele and Emma Moreton",
year = "2019",
month = "6",
day = "20",
doi = "10.1080/00393274.2019.1616216",
language = "English",
volume = "91",
pages = "154--174",
journal = "Studia Neophilologica",
issn = "0039-3274",
publisher = "Taylor and Francis",
number = "2",

}
