Linguistic Corpora And Lexicography
Published online by Cambridge University Press: 19 November 2008
Extract
Over the past ten to fifteen years, the discipline of lexicography has changed almost beyond recognition. This change is due to the technological revolution which has computerized the lexicographers' working environment to a very high degree and which has permitted a veritable quantum leap in the amount and variety of resources that can be brought to bear on the lexicographical process. The most important of these resources are computerized corpora of real, mostly written, but now increasingly also spoken, running text. When the first entirely corpus-based dictionary, COBUILD1, came out in 1987, it was on the basis of a corpus of around 20 million words of connected text. Now all major British dictionary publishers use corpora of at least one hundred million words of text. Harrap/Chambers, Longman, and Oxford University Press have built the 100-million-word British National Corpus (BNC), HarperCollins has the 200-million-plus-word Cobuild Bank of English (BoE), and Cambridge University Press has compiled the 100-million-word Cambridge Language Survey corpus (CLS).
- Type: Technology and Language Analysis
- Copyright © Cambridge University Press 1996