Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction: goals and methods of the corpus-based approach
- Part I Investigating the use of language features
- Part II Investigating the characteristics of varieties
- Part III Summing up and looking ahead
- Part IV Methodology boxes
- 1 Issues in corpus design
- 2 Issues in diachronic corpus design
- 3 Concordancing packages versus programming for corpus analysis
- 4 Characteristics of tagged corpora
- 5 The process of tagging
- 6 Norming frequency counts
- 7 Statistical measures of lexical associations
- 8 The unit of analysis in corpus-based studies
- 9 Significance tests and the reporting of statistics
- 10 Factor loadings and dimension scores
- Appendix: commercially available corpora and analytical tools
- References
- Index
3 - Concordancing packages versus programming for corpus analysis
Published online by Cambridge University Press: 05 June 2012
Summary
In recent years concordancing packages for analyzing corpus data have become increasingly available, and they can be very useful for investigating word frequencies, word associations, and even certain morphological characteristics. With grammatically tagged corpora, concordancing packages can also be used for looking at the grammatical class of words. However, concordancing packages are very constrained with respect to the kinds of analyses they can do, the type of output they give, and, in many cases, even the size of the corpus that can be analyzed. Therefore, many linguistic research questions are either impossible or very difficult to address with currently available concordancing packages. Computers are capable of much more complex and varied analyses than these packages allow, but to take full advantage of a computer's capability, a researcher needs to know how to write programs.
What is a computer program?
Fundamentally, a computer program is a set of instructions to a computer. For our purposes, a program tells the computer how to analyze a corpus. It tells what texts to use as input, what linguistic features to analyze and how to identify them, and what kind of output to produce (counts, KWIC files, etc.). As the sample analyses throughout this book show, many different kinds of analyses and output are possible. In essence, the program controls the computer. Therefore, when you can write programs, you harness the power of the computer to fit your own research goals.
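To make the three parts of that description concrete (input texts, a rule for identifying a feature, and a chosen kind of output), here is a minimal sketch in Python. It is not taken from the book: the function names `tokenize`, `word_counts`, and `kwic`, the two-sentence sample corpus, and the very simple tokenization rule are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple, assumed rule)."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(texts):
    """Output type 1: frequency counts of every word across all input texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def kwic(texts, keyword, window=3):
    """Output type 2: key-word-in-context lines as (left, keyword, right) tuples."""
    lines = []
    for text in texts:
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == keyword:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                lines.append((left, tok, right))
    return lines


# Hypothetical two-text "corpus" standing in for real input files.
corpus = [
    "The program controls the computer.",
    "A program is a set of instructions to a computer.",
]

counts = word_counts(corpus)
concordance = kwic(corpus, "program", window=2)

for left, key, right in concordance:
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>20} [{key}] {right}")
```

Changing the research question means changing only the relevant piece: a different `tokenize` rule alters how features are identified, and a different output function alters what is reported, which is the flexibility a fixed concordancing package does not offer.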
- Type: Chapter
- Information: Corpus Linguistics: Investigating Language Structure and Use, pp. 254-256. Publisher: Cambridge University Press. Print publication year: 1998