Despite their impressive capabilities, leading AI models understand relatively little about the world around them. This is because the text and image data they are trained on contains only indirect knowledge of physical reality. This limits their ability to both understand and act in dynamic real-world environments.
Future Horizons:
10-year horizon
AI agents acquire more powerful world models
25-year horizon
Embodied AI achieves human-like capabilities
Internal world models, which allow an agent to predict environmental states given actions, are central to intelligence, supporting zero-shot generalisation and planning [9]. Embodied intelligence, meaning intelligence rooted in the connection between perception, action and reality, can be likened to evolutionary and child-development processes: human children’s learning, characterised as that of “scientists in the crib”, provides a model for active exploration and self-directed experiment in intelligent systems. However, current text-based models lack the necessary grounding and are orders of magnitude less data-efficient than humans. Real-world, multimodal data is critical for building future models, and robotics offers a promising path: here, the integration of language, vision, memory and manipulation capabilities supports hierarchical planning and lifelong learning [10].
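To make the idea of an internal world model concrete, the following is a minimal sketch in Python, not a description of any particular system. It assumes a hypothetical toy environment (a six-cell corridor) and hypothetical helper names (`env_step`, `learn_world_model`, `plan`). The agent first explores by acting at random and recording what happens, then plans a route to a goal purely by "imagining" transitions in its learned model, without touching the real environment again.

```python
import random
from collections import deque

ACTIONS = (-1, +1)   # move left / move right
SIZE = 6             # corridor positions 0..5 (hypothetical toy environment)

def env_step(state, action):
    """True dynamics, hidden from the agent: a corridor with walls at both ends."""
    return min(max(state + action, 0), SIZE - 1)

def learn_world_model(n_steps=1000, seed=0):
    """Explore at random and record a table (state, action) -> next state."""
    rng = random.Random(seed)
    model, state = {}, 0
    for _ in range(n_steps):
        action = rng.choice(ACTIONS)
        nxt = env_step(state, action)
        model[(state, action)] = nxt   # the agent's prediction for this pair
        state = nxt
    return model

def plan(model, start, goal):
    """Breadth-first search over imagined transitions; returns a list of actions or None."""
    queue, seen = deque([(start, [])]), {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for action in ACTIONS:
            nxt = model.get((state, action))   # None if never experienced
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [action]))
    return None

model = learn_world_model()
actions = plan(model, start=0, goal=SIZE - 1)
```

The lookup table stands in for what would, in a real embodied system, be a learned predictive model over rich sensory input. The point of the sketch is only the division of labour: experience builds the model, and the model then supports planning for goals the agent was never explicitly trained on.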
In the long term, AI robots may develop complex self-awareness and seamlessly acquire new skills from rich sensory experience, moving closer to human-like capabilities. Safety boundaries must be set, though. For example, reinforcement-learning strategies that incentivise pure survival should be avoided to ensure that AI acts as a tool aligned with human interests. Some researchers feel that highly abstract reasoning and planning must always be reducible to linguistic or symbolic representations in order to maintain transparency, but there is debate about this. There is also debate about whether LLM-based approaches can generalise to richly structured, non-textual reality such as images and 3D worlds. Overall, embodied learning will be important for credible AI advancement.
World-modelling and embodied AI - Anticipation Scores
The Anticipation Potential of a research field is determined by the capacity for impactful action in the present, considering possible future transformative breakthroughs in a field over a 25-year outlook. A field with a high Anticipation Potential, therefore, combines the potential range of future transformative possibilities engendered by a research area with a wide field of opportunities for action in the present. We asked researchers in the field to anticipate:
- The uncertainty related to future science breakthroughs in the field
- The transformative effect anticipated breakthroughs may have on research and society
- The scope for action in the present in relation to anticipated breakthroughs.
This chart summarises their responses to each of these elements, which, when combined, provide the Anticipation Potential for the topic. See methodology for more information.
