Published online by Cambridge University Press: 14 July 2006
What is the probability of students cheating?
I recently found myself asking this question quite literally, when 10 of my students in an upper-level undergraduate methods course turned in identical results for a take-home exercise. Of course, on some exercises I expect students to produce identical findings, such as when I ask for the mean and variance of a particular variable. In this case, however, I had asked students to produce a new random variable and to summarize its values in a frequency distribution. My initial reaction was to suspect the students of collaborating on the exercise (contrary to my syllabus and the assignment's instructions), though the number of students who produced identical results, 10 out of a class of 30, made me skeptical that so many students could conspire so effectively. The makeup of the group also made collaboration unlikely. I teach at a large public-service university, with students who represent a broad variety of backgrounds, nationalities, interests, and ages. The course is also cross-listed among disciplines, so the 10 students included both political science and geography majors. My review of the names of the 10 students persuaded me that it was highly unlikely that they had cheated. I have found, furthermore, that many students will not voluntarily work in groups. So how did this diverse group of students produce an identical “random” variable?
My investigation of this question took me well beyond issues of student conduct. To answer it to my satisfaction, I had to understand how campus computer networks operate and, ultimately, how the statistical software my students use works. My journey took me into the arcane world of “random” numbers in computers, and required me to understand how statistical software generates so-called pseudo-random numbers. When I finally found an answer to my puzzle, I learned that the problem was not with my students, but with the software on which we all rely for research and, increasingly, pedagogy. Many researchers today know that there is no such thing as a truly random computer-generated number. Although many political scientists know this has profound consequences for their work, whether for sampling purposes or Monte Carlo experiments, to my knowledge instructors of quantitative methods courses have given little thought to its implications in the classroom. For one, it is easy and tempting to mistake the problems of pseudo-random number generation for student malfeasance. For another, it speaks to the students' (and instructor's) conceptual grasp of the slippery idea of “randomness.” For these reasons, I offer my own experience as a cautionary tale.
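The general point about pseudo-random numbers can be illustrated with a minimal sketch. It uses Python's standard `random` module rather than the statistical software used in the course, and the function name `simulate_student`, the seed values, and the die-roll variable are all hypothetical choices made only for illustration. It shows that two generators started from the same seed produce the same "random" variable, and therefore the same frequency distribution.

```python
import random
from collections import Counter


def simulate_student(seed, n=10):
    """Draw n pseudo-random integers from 1 to 6 from a generator started at `seed`.

    Hypothetical stand-in for one student creating a "random" variable.
    """
    rng = random.Random(seed)  # independent generator with a fixed starting state
    return [rng.randint(1, 6) for _ in range(n)]


# Assumption for illustration: two students' software starts from the same seed.
student_a = simulate_student(seed=12345)
student_b = simulate_student(seed=12345)

# The draws, and hence the frequency distributions, match exactly.
freq_a = Counter(student_a)
freq_b = Counter(student_b)

print(student_a == student_b)  # True
print(freq_a == freq_b)        # True
```

Nothing in this sketch depends on the students sharing work: identical starting states alone are enough to yield identical output.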