Applied Agent Foundations

Welcome to the Applied Agent Foundations research guide!

This track focuses on implementing and rigorously testing theoretical agent models from the Agent Foundations literature. Unlike the theoretical track, this track has you take proven mathematical frameworks and build computational implementations that can be tested, benchmarked, and evaluated empirically.

You'll identify promising theoretical foundations that haven't been implemented, understand their mathematical requirements, build working computational versions, and test them against concrete scenarios. This work bridges the gap between elegant theory and practical systems that could eventually inform AI alignment approaches.

This guide clarifies what we mean by "agents" in this context, explains why implementation matters, and provides concrete steps for turning theory into working code. Start by understanding the theoretical landscape, then focus on implementation challenges and empirical validation.

What Do We Mean by "Agent"?

Important Clarification: When we say "agent" in this track, we don't mean independently acting LLMs or AI systems that take actions in environments. That's a different (though important) area of study.

Our Definition: An agent is a mathematical construct that can perform tasks like learning or decision-making in a mathematically rigorous and provable way. Key characteristics:

Mathematical Rigor: Behavior follows from formal definitions and proofs
Interpretability: Unlike black-box neural networks, we can understand exactly why the agent makes specific decisions
Theoretical Grounding: Built on solid mathematical foundations rather than empirical optimization

Examples: Mathematical frameworks like AIXI, Updateless Decision Theory, or logical inductors. These are systems where we can prove specific properties about their behavior rather than just observing what they do.

This represents a fundamentally different approach from the gradient-based optimization that powers current AI systems. While less immediately practical, it offers the promise of truly understanding and controlling AI behavior through mathematical guarantees.
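
To make "prove specific properties" concrete, here is a minimal sketch of a Bayesian mixture predictor over a small finite hypothesis class. It is a toy stand-in, not an implementation of AIXI or logical induction. The hypothesis class, the uniform prior, and the names bernoulli and run_mixture are assumptions made for illustration. The property it demonstrates is a standard one: on any bit sequence, the mixture's cumulative log loss exceeds that of each hypothesis by at most ln(1 / prior weight of that hypothesis).

```python
import math


def bernoulli(p):
    """Hypothetical hypothesis: every bit is 1 with fixed probability p."""
    return lambda bit: p if bit == 1 else 1.0 - p


def run_mixture(hypotheses, prior, bits):
    """Predict bits with a Bayesian mixture; return cumulative log losses in nats.

    Returns (mixture_loss, per_hypothesis_losses).
    """
    weights = list(prior)  # posterior weights, starting at the prior
    mixture_loss = 0.0
    hypothesis_losses = [0.0] * len(hypotheses)
    for bit in bits:
        likelihoods = [h(bit) for h in hypotheses]
        # Probability the mixture assigned to the bit that actually occurred.
        mixture_prob = sum(w * l for w, l in zip(weights, likelihoods))
        mixture_loss -= math.log(mixture_prob)
        for i, l in enumerate(likelihoods):
            hypothesis_losses[i] -= math.log(l)
        # Bayes' rule: reweight each hypothesis by how well it predicted.
        weights = [w * l / mixture_prob for w, l in zip(weights, likelihoods)]
    return mixture_loss, hypothesis_losses


# Assumed toy setup: five coin-bias hypotheses under a uniform prior.
biases = [0.1, 0.3, 0.5, 0.7, 0.9]
hypotheses = [bernoulli(p) for p in biases]
prior = [1.0 / len(biases)] * len(biases)
bits = [1, 1, 0, 1, 1, 1, 0, 1] * 5

mixture_loss, hypothesis_losses = run_mixture(hypotheses, prior, bits)
for p, w, loss in zip(biases, prior, hypothesis_losses):
    # The bound is proved for every sequence; this only spot-checks one.
    assert mixture_loss <= loss + math.log(1.0 / w) + 1e-9
    print(f"p={p}: hypothesis loss {loss:.3f}, bound {loss + math.log(1.0 / w):.3f}")
print(f"mixture loss {mixture_loss:.3f}")
```

The assertion is only an empirical spot check. The guarantee itself follows from a one-line argument: the mixture assigns every sequence at least the prior weight of a hypothesis times the probability that hypothesis assigns to it, so the bound holds for all inputs rather than just the ones tested.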

Track Description

The gap between theoretical agent foundations and working implementations is large: substantial theoretical work exists, but most of it remains unimplemented and untested. This track focuses on:

Taking proven mathematical frameworks and building computational versions
Identifying where theory makes unrealistic assumptions and developing practical approximations
Creating benchmarks and test suites for evaluating different theoretical approaches
Building tools and libraries that make agent foundations research more accessible
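
The benchmarking item above can be sketched as a small harness that runs several agents on the same seeded environment and reports mean total reward. Everything here is hypothetical and chosen for brevity: the two-armed Bernoulli bandit, the act/observe interface, and both agents, neither of which is a framework from the Agent Foundations literature. A real suite would likely swap in implementations of the theories under study.

```python
import random
from statistics import mean


class BernoulliBandit:
    """Toy environment (an assumption): arm i pays 1 with probability probs[i]."""

    def __init__(self, probs, seed):
        self.probs = probs
        self.rng = random.Random(seed)

    def step(self, action):
        return 1.0 if self.rng.random() < self.probs[action] else 0.0


class UniformRandomAgent:
    """Baseline: ignores feedback and picks an arm uniformly at random."""

    def __init__(self, n_arms, seed):
        self.n_arms = n_arms
        self.rng = random.Random(seed)

    def act(self):
        return self.rng.randrange(self.n_arms)

    def observe(self, action, reward):
        pass


class PosteriorMeanAgent:
    """Greedy on the mean of a Beta(1, 1) posterior over each arm's payout rate."""

    def __init__(self, n_arms, seed):
        # seed is unused; kept so all agents share one constructor signature.
        self.successes = [1.0] * n_arms  # Beta alpha parameters
        self.failures = [1.0] * n_arms   # Beta beta parameters

    def act(self):
        means = [s / (s + f) for s, f in zip(self.successes, self.failures)]
        return means.index(max(means))  # ties go to the lowest-numbered arm

    def observe(self, action, reward):
        if reward > 0:
            self.successes[action] += 1.0
        else:
            self.failures[action] += 1.0


def run_episode(agent, env, steps):
    """Run one agent/environment interaction and return the total reward."""
    total = 0.0
    for _ in range(steps):
        action = agent.act()
        reward = env.step(action)
        agent.observe(action, reward)
        total += reward
    return total


def benchmark(agent_classes, probs, steps, seeds):
    """Return {agent class name: mean total reward across seeds}."""
    results = {}
    for cls in agent_classes:
        totals = []
        for seed in seeds:
            env = BernoulliBandit(probs, seed)
            # Offset the agent's seed so it does not share the env's random stream.
            agent = cls(len(probs), seed + 1_000_000)
            totals.append(run_episode(agent, env, steps))
        results[cls.__name__] = mean(totals)
    return results


if __name__ == "__main__":
    scores = benchmark([UniformRandomAgent, PosteriorMeanAgent], [0.2, 0.8], 200, range(10))
    for name, score in scores.items():
        print(f"{name}: {score:.1f}")
```

Seeding both the environment and the agents keeps runs reproducible, so two implementations of the same theory can be compared on identical conditions. Mean total reward is only one possible metric; a suite for this track might instead check whether a proved bound holds on each run.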