This paper presents the case for compiling small, manually sampled corpora, rich in contextual information, for research, teaching and learning in ESP (English for Specific Purposes) and EAP (English for Academic Purposes). Large datasets such as the British National Corpus, more recent web-derived corpora, and the web itself are major sources of information about the lexis and grammar of general English. Progress has also been made in automatic corpus compilation methods: it is now possible to mine the internet for documents on specified topics, within specified domains, and with the linguistic features associated with particular genres. However, many types of text of interest to ESP and EAP practitioners are absent from these general corpora or are under-represented in them. Perhaps even more importantly, the compilation methods used for general corpora do not capture all the background information ESP and EAP practitioners need in order to understand and explain the linguistic choices speakers and writers make. This paper suggests that the process of designing a good corpus for ESP or EAP is similar to the process of needs analysis. It illustrates this process with examples from various corpus projects, showing how contextual information can be added to corpora in the form of textual annotations or as supplementary material.
Bibliographical note: The journal homepage is available from http://asp.revues.org.
- English for Specific Purposes
- English for Academic Purposes
- BASE corpus
- BAWE corpus
- Engineering Lecture Corpus