Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction: goals and methods of the corpus-based approach
- Part I Investigating the use of language features
- Part II Investigating the characteristics of varieties
- Part III Summing up and looking ahead
- Part IV Methodology boxes
- 1 Issues in corpus design
- 2 Issues in diachronic corpus design
- 3 Concordancing packages versus programming for corpus analysis
- 4 Characteristics of tagged corpora
- 5 The process of tagging
- 6 Norming frequency counts
- 7 Statistical measures of lexical associations
- 8 The unit of analysis in corpus-based studies
- 9 Significance tests and the reporting of statistics
- 10 Factor loadings and dimension scores
- Appendix: commercially available corpora and analytical tools
- References
- Index
8 - The unit of analysis in corpus-based studies
Published online by Cambridge University Press: 05 June 2012
Summary
One of the very first decisions required when carrying out a corpus-based analysis is to determine what your unit of analysis is. This is a crucial decision because it determines the object of your research; if you do not identify the unit of analysis properly, you may not be able to address the research questions that you have posed.
As you see in this book, corpus-based studies usually have one of two primary research goals: describing a linguistic structure and its variants (as in Part I of the book), or describing some group of texts (as in Part II of the book). The unit of analysis in corpus-based studies is therefore typically one of two kinds: either an occurrence of a linguistic feature or a text. These units of analysis are called the “observations” for the study. In a study that characterizes a linguistic structure, each observation is an occurrence of the structure in question. In a study that seeks to describe a group of texts, each observation is a text.
Suppose that we plan a study to analyze variants of that-complement clauses and to determine which contextual factors are most strongly associated with each variant. The study should be designed so that each occurrence of a that-complement clause is a separate observation.
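As a rough illustration of this design, the sketch below stores one record per occurrence of a that-complement clause and then tabulates the variant against a contextual factor. It is only a minimal sketch: the field names (`text_id`, `register`, `matrix_verb`, `that_retained`), the helper functions, and the toy rows are all assumptions invented for this example, not data or categories taken from the chapter.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ThatClauseObservation:
    """One occurrence of a that-complement clause (hypothetical schema)."""
    text_id: str         # text the clause was found in
    register: str        # contextual factor (assumed for illustration)
    matrix_verb: str     # contextual factor (assumed for illustration)
    that_retained: bool  # the variant: True if "that" is present


# Toy rows invented for illustration only; not real corpus data.
observations = [
    ThatClauseObservation("t1", "conversation", "think", False),
    ThatClauseObservation("t1", "conversation", "say", False),
    ThatClauseObservation("t1", "conversation", "think", True),
    ThatClauseObservation("t2", "academic", "show", True),
    ThatClauseObservation("t2", "academic", "suggest", True),
]


def variant_by_factor(obs, factor):
    """Count observations for each (factor value, variant) pair."""
    counts = Counter()
    for o in obs:
        counts[(getattr(o, factor), o.that_retained)] += 1
    return counts


def per_text_counts(obs):
    """Collapse clause-level observations into one count per text."""
    return dict(Counter(o.text_id for o in obs))


by_register = variant_by_factor(observations, "register")
texts = per_text_counts(observations)
```

Here each of the five rows is a separate observation, so the variant can be related to any contextual factor recorded for that clause. The `per_text_counts` helper shows the other kind of unit: once the rows are collapsed, each text is a single observation, and the clause-level factors are no longer available.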
- Type: Chapter
- Information: Corpus Linguistics: Investigating Language Structure and Use, pp. 269-274. Publisher: Cambridge University Press. Print publication year: 1998.