Published online by Cambridge University Press: 06 April 2009
Researchers are increasingly using data from the Nasdaq market to examine pricing behavior, market design, and other microstructure phenomena. The validity of any study that classifies trades as buys or sells depends on the accuracy of the classification method. Using a Nasdaq proprietary data set that identifies trade direction, we examine the validity of several trade classification algorithms. We find that the quote rule, the tick rule, and the Lee and Ready (1991) rule correctly classify 76.4%, 77.66%, and 81.05% of the trades, respectively. However, all classification rules have only a very limited success in classifying trades executed inside the quotes, introducing a bias in the accuracy of classifying large trades, trades during high volume periods, and ECN trades. We also find that extant algorithms do a mediocre job when used for calculating effective spreads. For Nasdaq trades, we propose a new and simple classification algorithm that improves over extant algorithms.
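For readers unfamiliar with the three rules compared above, the following is a minimal sketch of how they are commonly defined in the microstructure literature, not the authors' implementation. The function names, the +1/-1 encoding for buys and sells, and the use of contemporaneous quotes are assumptions made for illustration. In particular, the sketch omits the quote lag that Lee and Ready (1991) recommend.

```python
from typing import Optional, Sequence

BUY, SELL = 1, -1  # illustrative encoding, not taken from the paper


def quote_rule(price: float, bid: float, ask: float) -> Optional[int]:
    """Classify by position relative to the quote midpoint.

    Returns None for trades exactly at the midpoint, which the
    quote rule alone cannot classify.
    """
    mid = (bid + ask) / 2
    if price > mid:
        return BUY
    if price < mid:
        return SELL
    return None


def tick_rule(price: float, prev_prices: Sequence[float]) -> Optional[int]:
    """Classify by the most recent price change.

    prev_prices is ordered oldest to newest. Zero ticks fall back to
    the last differing price; None means no earlier price differs.
    """
    for prev in reversed(prev_prices):
        if price > prev:
            return BUY
        if price < prev:
            return SELL
    return None


def lee_ready(price: float, bid: float, ask: float,
              prev_prices: Sequence[float]) -> Optional[int]:
    """Quote rule first; tick rule only for midpoint trades."""
    side = quote_rule(price, bid, ask)
    return side if side is not None else tick_rule(price, prev_prices)
```

In this sketch, a trade inside the quotes but off the midpoint is still assigned by its side of the midpoint, which is the case the abstract reports as the hardest for all three rules to classify.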
Ellis, Australian Graduate School of Management; Michaely, Cornell University and Tel-Aviv University; O'Hara, Cornell University. The authors thank Dean Furbush, Tim McCormick, and Jennifer Drake of the NASD Economic Research Department for extensive help in providing the data set used in this paper. We thank the editor (Paul Malatesta), Jeffrey Harris, Soeren Hvidkjaer, Tim McCormick, and Paul Schultz for useful comments. We are particularly grateful to Hank Bessembinder (associate editor and referee) for helpful suggestions. Please address correspondence to Roni Michaely ([email protected]) or Maureen O'Hara ([email protected]) at the Johnson Graduate School of Management, Cornell University, Ithaca, NY 14853.