Book contents
- Frontmatter
- Contents
- List of Panels
- Preface
- PART I A LONG-PONDERED OUTFIT
- PART II THE EVALUATION DISCORDANCE
- 3 The Evaluation of Human Behaviour
- 4 The Evaluation of Non-human Natural Behaviour
- 5 The Evaluation of Artificial Intelligence
- 6 The Boundaries against a Unified Evaluation
- PART III THE ALGORITHMIC CONFLUENCE
- PART IV THE SOCIETY OF MINDS
- PART V THE KINGDOM OF ENDS
- References
- Index
- Plate section
6 - The Boundaries against a Unified Evaluation
from PART II - THE EVALUATION DISCORDANCE
Published online by Cambridge University Press: 19 January 2017
Summary
[Edsger Dijkstra] asked me what I was working on. Perhaps just to provoke a memorable exchange I said, “AI”. To that he immediately responded, “Why don't you work on I?”
He was right, of course, that if “I” is more general than “AI”, one should work on the more general problem, especially if it is the one that is the natural phenomenon, which in this case it is.
– Leslie Valiant, Probably Approximately Correct: Nature's Algorithms for Learning and Prospering in a Complex World (2013)

IN THE PREVIOUS three chapters, we have seen three very different approaches to the evaluation of behaviour. Psychometrics uses well-defined test batteries, usually composed of abstract culture-fair problems or questionnaires, different from everyday tasks. Comparative psychology also presents tasks to animals, not necessarily so abstract, but careful attention is paid to interfaces and motivation, with rewards being key. Artificial intelligence evaluation is significantly different, relying on benchmarks and competitions. What happens if we use definitions, tools and tests from one discipline to evaluate subjects in the others? How often has this been done, or advocated? Why has it not worked so far?
THE FRAGMENTED EVALUATION OF BEHAVIOUR
There was a time when a certain fragmentation existed between psychology, evolutionary biology and artificial intelligence. However, with the increasing relevance of evolutionary psychology and cognitive science, the boundaries between these disciplines have been repeatedly crossed, and new areas have appeared in between, such as artificial life, evolutionary computing, evolutionary robotics, human-machine interfaces, developmental robotics and swarm computing. Unfortunately, we cannot say the same for the evaluation of behavioural features. The preceding three chapters presented different terminologies, principles, tools and, ultimately, tests.
Table 6.1 shows a simplified picture of some of the distinctions between psychometrics, comparative psychology and AI evaluation.
Each discipline is extremely diverse. It makes a great difference whether we evaluate a small child or an adult, a chimpanzee or a bacterium, an ‘intelligent’ vacuum cleaner or a reinforcement learning system playing games. Extending the simplification to cover further differences would turn it into distortion, so the table omits dimensions such as what is evaluated (an individual, a group or a species), whether the measurement is quantitative or qualitative, how difficulty is inferred, and the relevance of physical traits (e.g., sensorimotor abilities) to the measurement.
The Measure of All Minds: Evaluating Natural and Artificial Intelligence, pp. 152–172. Publisher: Cambridge University Press. Print publication year: 2017.