
AI Systems & Architectures
Attention and Internal Representations: How AI “Sees” the World
Excerpt:
Attention and internal representations are key to how AI processes information. This is the invisible “gaze” that allows language and vision models to make sense of data.
AI has no eyes or consciousness. Yet modern models have developed mechanisms that allow them to focus on relevant data and build a kind of internal “mental map.”
This is what enables them to understand language, recognize images, or keep track of context in a conversation.
1. What Is Attention in AI?
The attention mechanism became widely known through the 2017 paper “Attention Is All You Need,” which introduced the Transformer architecture behind today’s language models.
Instead of treating all information equally, AI learns to detect which parts are most relevant at each step.
Simple metaphor:
It’s like reading a book and underlining only the important sentences. The model doesn’t “read” everything with equal focus—it prioritizes.
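To make this concrete, here is a minimal sketch of scaled dot-product attention, the core operation from “Attention Is All You Need,” written in plain Python with NumPy. The query, key, and value matrices are tiny made-up examples; in a real model they are learned from data.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Return the attention output and the attention weights."""
    d_k = Q.shape[-1]
    # Similarity between each query and every key, scaled for numerical stability.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax turns scores into weights that sum to 1 for each query.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted mix of the value vectors.
    return weights @ V, weights

# Toy example: 3 tokens, 4-dimensional vectors (values are illustrative only).
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))

output, weights = scaled_dot_product_attention(Q, K, V)
print(weights.round(2))  # each row shows how much one token "attends" to the others
```

Each row of the printed weight matrix is the “underlining”: the higher a value, the more that position’s output draws on the corresponding input.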
2. Internal Representations: AI’s “Mental Map”
Every word, image, or sound processed by AI becomes a vector, a mathematical representation summarizing its meaning.
These vectors are organized in an embedding space, where similar concepts cluster close together.
Metaphor:
Think of mental flashcards: “cat” stays near “animal” and far from “airplane.”
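A quick sketch of that intuition: cosine similarity measures how closely two vectors point in the same direction in the embedding space. The four-dimensional vectors below are invented for illustration; real embeddings have hundreds or thousands of learned dimensions.

```python
import numpy as np

def cosine_similarity(a, b):
    """Closeness of two vectors: near 1.0 means similar, near 0 means unrelated."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hand-made "embeddings" for illustration only; real models learn these values.
embeddings = {
    "cat":      np.array([0.9, 0.8, 0.1, 0.0]),
    "animal":   np.array([0.8, 0.9, 0.2, 0.1]),
    "airplane": np.array([0.1, 0.0, 0.9, 0.8]),
}

print(cosine_similarity(embeddings["cat"], embeddings["animal"]))    # high: nearby concepts
print(cosine_similarity(embeddings["cat"], embeddings["airplane"]))  # low: distant concepts
```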
3. How They Work Together
Attention decides what to look at, and internal representations connect that information with prior knowledge.
- In language: when you ask “Who wrote One Hundred Years of Solitude?”, attention focuses on “wrote” and the book title, while the representations link them to “Gabriel García Márquez.”
- In vision: models like CLIP compare visual patterns with text descriptions, connecting pixels to concepts.
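For the vision side, the sketch below shows one common way to try this with the CLIP implementation in the Hugging Face transformers library: the model scores how well an image matches each text description. The file name photo.jpg is a placeholder, and the exact scores depend on the checkpoint and image used.

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Load a pretrained CLIP checkpoint (downloads weights on first run).
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")  # placeholder path to any local image
texts = ["a photo of a cat", "a photo of an airplane"]

# Encode image and texts together, then compare them in the shared embedding space.
inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=-1)

print(dict(zip(texts, probs[0].tolist())))  # higher probability = better text match
```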
4. Does AI Truly Understand What It Sees or Reads?
Here comes the philosophical debate:
- Some argue these representations are just symbol manipulation with no real understanding.
- Others see them as a functional proto-understanding: not consciousness, but a useful way of modeling the world.
Conclusion
AI doesn’t see or think like we do, but its ability to focus on the right information and organize it into internal maps is what makes it so powerful.
Reflective question:
If AI keeps building increasingly complex representations, could we say it will one day have its own “vision” of the world?