Data matter. Protest data are no exception. Not all datasets are
equal, nor do all have equal validity and reliability. Thus far,
there has been no serious debate regarding the quality of protest
datasets. Increasing attention to protest studies suggests that now
is the time to talk about the sine qua non for
growing knowledge in this sub-field of comparative politics. In this
paper I emphasize the importance of the underlying data sources. In
particular, I argue that protest datasets should be drawn from multiple
sources, including as many local sources as possible, in order to
provide accurate and meaningful data. To support this contention, I
first discuss traditional protest datasets and their limitations and
provide suggestions about how to overcome these limitations. Second,
I explain an alternative type of dataset coding and its
characteristics. Finally, I use statistical tests to illustrate
the importance of local sources.

I give
my sincere and deep thanks to Erik S. Herron and Ronald A.
Francisco for reviewing drafts of this paper and giving me
indispensable comments and continuous encouragement. I would
also like to thank Elizabeth Collins and Omur Yilmaz, who edited
drafts of this manuscript. My appreciation also goes to the
anonymous reviewers of this paper and the editors of PS:
Political Science and Politics. I would like to
thank my late father, Ki-Dae, whose love and support were simply
endless. I will share any compliments with them, but I alone
take responsibility for any errors.