Previous research suggests that frame-of-reference (FOR) training is effective at improving assessor ratings in many organizational settings. Yet no research has thoroughly examined the systematic sources of variance (assessor-related effects, evaluation settings, and measurement design features) that might influence training effectiveness. Using factorial ANOVA and variance components analyses on a database of four studies of FOR assessor training, we found that (a) training is most effective at identifying low levels of performance and (b) the training setting makes little difference to training effectiveness. We also present evidence that rater training is a key determinant of the quality of performance ratings in general. Implications for FOR training theory and practice are discussed.