Iconicity in language is receiving increased attention from many fields, but our understanding of iconicity is only as good as the measures we use to quantify it. We collected iconicity measures for 304 Japanese words from English-speaking participants, using rating and guessing tasks. The words included ideophones (structurally marked depictive words) along with regular lexical items from similar semantic domains (e.g., fuwafuwa ‘fluffy’, jawarakai ‘soft’). The two measures correlated, speaking to their validity. However, ideophones received consistently higher iconicity ratings than other items, even when guessed at the same accuracies, suggesting the rating task is more sensitive to cues like structural markedness that frame words as iconic. These cues did not always guide participants to the meanings of ideophones in the guessing task, but they did make them more confident in their guesses, even when they were wrong. Consistently poor guessing results reflect the role different experiences play in shaping construals of iconicity. Using multiple measures in tandem allows us to explore the interplay between iconicity and these external factors. To facilitate this, we introduce a reproducible workflow for creating rating and guessing tasks from standardised wordlists, while also making improvements to the robustness, sensitivity and discriminability of previous approaches.