This paper presents an account of a Welfare Quality® assessment of 92 dairy farms carried out by seven experienced assessors. The aim was to evaluate the potential of the Welfare Quality® assessment protocol with respect to its uptake by UK farm assurance schemes. Data collection and measure aggregation were performed according to the Welfare Quality® protocol for dairy cows. The study examined the data themselves, by testing how hypothetical interventions would be reflected in changes in the aggregated scores, and also investigated human-related aspects through inter-assessor standardisation sessions to evaluate reliability and an assessor focus group to collect feedback. Overall, three main ‘challenges’ were identified. The first challenge was the large amount of missing data. Unexpectedly, so much data was missing that an overall classification could be calculated for only 7% of farms. The second challenge was that aggregated scores did not always reflect hypothetical interventions. The final challenge was inter-assessor reliability: not all assessors had achieved acceptable levels of agreement on a number of outcome measures by the third training session. Suggestions for managing these challenges included follow-up to assessor training, the use of multiple imputation methods to fill in missing data and, where applicable, not aggregating the scores. The study concluded that the protocol provided useful information from which to make an informed selection of measures, but that the challenges, combined with the lengthy assessment time, were too great for its use as a certification tool.