Corpus from scratch: Collecting and processing a sizeable EAP corpus in a (relatively) resource-poor context

Priya Mathew, Hilary Nesi, Benet Vincent

Research output: Chapter in Book/Report/Conference proceeding; Conference proceeding; peer-review


Carefully designed home-made corpora are a useful source of highly discipline-specific language data. They enable EAP practitioners not only to find out more about disciplinary practice in their own contexts, but also to create bespoke materials and activities for learners with specific communicative needs. The process of collecting and preparing corpus data is often rather daunting, however, especially if the corpus is not solely for personal use, and if it is to include unpublished texts. This paper explains the process of corpus creation from the perspective of an EAP practitioner working in Oman. The project under discussion was undertaken without special funding, as part of the day-to-day activity of a busy college writing centre. Steps in the process included seeking ethics clearance, liaising with lecturers in the selected discipline (civil engineering), collecting student assignments via an online submission portal, converting, categorising and annotating files, and making them available to students and colleagues via the Sketch Engine corpus query interface. The paper also reports on the practical uses of this project in supporting Omani engineering students studying through the medium of English, and discusses how working together with students and faculty staff brings benefits to all.
Original language: English
Title of host publication: Proceedings of the 2017 BALEAP Conference
Subtitle of host publication: Addressing the state of the union: Working together = learning together
Editors: Maxine Gillway
Place of Publication: Reading
Publisher: Garnet Education
Number of pages: 10
ISBN (Print): 978-1-78260-676-5
Publication status: Published - 2019
Event: BALEAP Biannual Conference - University of Bristol, Bristol, United Kingdom
Duration: 7 Apr 2017 to 9 Apr 2017


Conference: BALEAP Biannual Conference
Abbreviated title: BALEAP 2017
Country/Territory: United Kingdom


  • EAP Corpora
  • Student written assessed genres
  • discipline-specific language

ASJC Scopus subject areas

  • Language and Linguistics

