Corpora, Catalogues and Correspondence: The Item-Level Identification and Digitisation of Business Letters for the British Telecom Correspondence Corpus

Ralph Morton, Hilary Nesi

    Research output: Chapter in Book/Report/Conference proceeding (Chapter)

    Abstract

    This paper explores some of the challenges in working with archive material to produce language corpora. It takes as a case study the British Telecom Correspondence Corpus (BTCC), which contains a selection of the letters held in the BT Archives, housed in Holborn Telephone Exchange. One of the essential differences between a corpus and an archive is that a corpus is intended to be representative of a language variety. Material makes its way into historical archives in a variety of ways, and whilst archives may preserve a breadth of material, they are not generally collected to be representative, nor are they primarily designed to facilitate linguistic investigation. Work on the BTCC began as part of a Jisc-funded project to digitise the BT Archives and create a ‘research resource for the higher education sector’ (Hay, 2014:12). The BT Digital Archives became available to the public in July 2013. Our experiences using this resource inform the second half of the paper, in particular regarding the identification of corpus material and the difficulty of identifying letters at an item level. This leads to a wider discussion of how best to digitise physical archives.
    Original language: English
    Title of host publication: Proceedings of the Digital Humanities Congress 2014
    Editors: Clare Mills, Michael Pidd, Jessica Williams
    Place of Publication: Sheffield
    Publisher: HRI Online Publications
    Publication status: Published - 2016
    Event: Digital Humanities Congress 2014 - Sheffield University, Sheffield, United Kingdom
    Duration: 4 Sept 2014 to 6 Sept 2014
    https://www.digitalpanopticon.org/?p=618

    Conference

    Conference: Digital Humanities Congress 2014
    Country/Territory: United Kingdom
    City: Sheffield
    Period: 4/09/14 to 6/09/14
    Bibliographical note

    The full text is also available from http://www.hrionline.ac.uk/openbook/chapter/dhc2014-morton
    This is an open access publication with a Creative Commons Attribution-NoDerivatives 4.0 International License.
