Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction: goals and methods of the corpus-based approach
- Part I Investigating the use of language features
- Part II Investigating the characteristics of varieties
- Part III Summing up and looking ahead
- Part IV Methodology boxes
- 1 Issues in corpus design
- 2 Issues in diachronic corpus design
- 3 Concordancing packages versus programming for corpus analysis
- 4 Characteristics of tagged corpora
- 5 The process of tagging
- 6 Norming frequency counts
- 7 Statistical measures of lexical associations
- 8 The unit of analysis in corpus-based studies
- 9 Significance tests and the reporting of statistics
- 10 Factor loadings and dimension scores
- Appendix: commercially available corpora and analytical tools
- References
- Index
9 - Significance tests and the reporting of statistics
Published online by Cambridge University Press: 05 June 2012
Summary
Inferential statistics provide an important tool for assessing whether observed patterns are meaningful. There are many different statistical techniques, depending on the types of variables that are being compared. Statistical techniques can be used to measure the differences between groups (as with a t-test or ANOVA) or the extent of the relationship between variables (as with a correlation). Each technique can be used to produce a test of significance, assessing the likelihood that the observed difference or relationship could be due to chance, and a measure of strength, assessing the importance of the difference or relationship. This methodology box introduces you to some of the most common statistical techniques used in corpus-based studies.
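As a minimal sketch of the distinction between a test of significance and a measure of strength, the following Python computes a Pearson correlation by hand. The counts are hypothetical, invented for illustration only, and the function names `pearson_r` and `t_for_r` are not from the book. Here r squared is used as one common measure of strength, and the t statistic for r (with n - 2 degrees of freedom) is the basis for one common test of significance.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


def t_for_r(r, n):
    """t statistic for testing r against zero; degrees of freedom = n - 2."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical normed counts (per 1,000 words) in five texts.
nouns = [200.0, 220.0, 250.0, 270.0, 300.0]
pronouns = [90.0, 85.0, 60.0, 55.0, 30.0]

r = pearson_r(nouns, pronouns)
r_squared = r ** 2                    # measure of strength
t_value = t_for_r(r, len(nouns))      # input to the test of significance

print(f"r = {r:.3f}, r squared = {r_squared:.3f}, t = {t_value:.2f}")
```

The two numbers answer different questions: r squared describes how much of the variation in one variable is shared with the other, while the t statistic, read against a t distribution with n - 2 degrees of freedom, addresses whether a relationship of that size could plausibly be due to chance.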
A t-test is used to determine whether a significant difference exists between two groups. The statistical procedure compares the distance between the mean scores for the two groups relative to the amount of variation that exists within each group. The resulting t-value is the score used to assess the likelihood that the observed difference could be due to chance. To evaluate the significance associated with a value of t, it is also necessary to consider the number of observations analyzed in the study. A relatively small difference in mean scores can be significant if it is based on a large number of observations, while a relatively large difference might not be significant if it is based on few observations.
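The effect of sample size can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the book's own procedure: it uses the pooled-variance (Student's) form of the two-sample t-test, the function name `pooled_t` is hypothetical, and the counts are invented. The larger samples are built by repeating the small ones, so the difference in means stays the same while the number of observations grows.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Return (t, degrees_of_freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # statistics.variance is the sample variance (divides by n - 1).
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Hypothetical normed counts (per 1,000 words) for texts in two groups.
small_a = [10.0, 12.0, 14.0]
small_b = [12.0, 14.0, 16.0]

# Same means, ten times as many observations (artificial, for illustration).
large_a = small_a * 10
large_b = small_b * 10

t_small, df_small = pooled_t(small_a, small_b)
t_large, df_large = pooled_t(large_a, large_b)

print(f"3 texts per group:  t = {t_small:.2f}, df = {df_small}")
print(f"30 texts per group: t = {t_large:.2f}, df = {df_large}")
```

Both comparisons involve the same difference of 2 between the group means, but the absolute value of t is much larger with 30 texts per group than with 3. Against standard two-tailed critical values at the .05 level (2.776 for 4 degrees of freedom, roughly 2.0 for 58), only the larger sample would count as significant.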
- Type: Chapter
- Information: Corpus Linguistics: Investigating Language Structure and Use, pp. 275-277
- Publisher: Cambridge University Press
- Print publication year: 1998