Scalar inference (SI), as when an utterance containing “some” is enriched to mean “some but not all,” is a central topic in semantics and pragmatics. Of recent interest in the experimental literature is scalar diversity: lexical scales differ in how likely they are to give rise to SI. Studies of scalar diversity have almost exclusively relied on the so-called inference task. In this article, we highlight two shortcomings of the inference task: it biases participants by providing them with the stronger alternative, and it obscures pragmatic inferences other than SI. As an alternative, we offer a degree estimate task for investigating utterances containing scalar terms. We validate the degree estimate task by, among other things, replicating a previous finding about scalar diversity: that the distinctness of scalar terms (“some” versus “all”) is a significant predictor of scalar diversity. We then use degree estimates to reassess previous findings based on the inference task. Our results show that biasing discourse contexts lead to lower degree estimates (i.e., more strengthened meanings) than a manipulation with “only,” which contrasts with findings in the prior literature. The article concludes that the inference and degree estimate tasks both have advantages: the former offers a straightforward definition of SI calculation, while the latter avoids explicitly mentioning a negated stronger alternative.