Political scientists commonly use Grambsch and Therneau’s (1994, Biometrika 81, 515–526) ubiquitous Schoenfeld-based test to diagnose proportional hazard violations in Cox duration models. However, some statistical packages have changed how they implement the test’s calculation. The traditional implementation makes a simplifying assumption about the test’s variance–covariance matrix, while the newer implementation does not. Recent work suggests the test’s performance differs, depending on its implementation. I use Monte Carlo simulations to more thoroughly investigate whether the test’s implementation affects its performance. Surprisingly, I find the newer implementation performs very poorly with correlated covariates, with a false positive rate far above 5%. By contrast, the traditional implementation has no such issues in the same situations. This shocking finding raises new, complex questions for researchers moving forward. It appears to suggest, for now, researchers should favor the traditional implementation in situations where its simplifying assumption is likely met, but researchers must also be mindful that this implementation’s false positive rate can be high in misspecified models.