# AI Agent Core Architecture: LLM + Memory + Tools + Planning
This is Part 1 of the AI Agent Tech Stack series. This article breaks down the core architecture from first principles. The rest of the series: Design Patterns · Framework Comparison · Protocol Layer · Production Practices.
## What Is an AI Agent?
An AI Agent is far more than "LLM + Prompt." It is an autonomous system that can perceive its environment, plan actions, use tools, and execute iteratively. Unlike traditional single-turn Q&A, an agent has:
- Autonomy — decides the next action on its own, without waiting for explicit user commands
- Tool Use — invokes external APIs, executes code, queries databases
- Memory — retains conversation context, interaction history, and learned knowledge
- Planning — decomposes complex tasks into executable sub-steps
- Reflection — evaluates its own output quality and self-corrects
## Core Architecture
### 1. LLM (Brain)
The LLM is the agent's reasoning engine — it understands user intent, formulates plans, selects tools, and generates responses. Leading choices:
| Model | Provider | Strengths | Best For |
|---|---|---|---|
| GPT-4o / o3 | OpenAI | Strongest all-round, stable tool calling | General agents, complex reasoning |
| Claude 4 Opus/Sonnet | Anthropic | Long context, strong coding | Code agents, document analysis |
| Gemini 2.5 Pro | Google | Multimodal, long context | Multimodal agents |
| DeepSeek-V3 | DeepSeek | Excellent cost-performance ratio | Cost-sensitive workloads |
| Llama 3.3 / 4 | Meta | Open-source, self-hostable | Data-privacy-critical scenarios |
| Qwen 3 | Alibaba | Strong multilingual support | Multilingual agents |
### 2. Memory
Agent memory operates at three levels:
- Short-term — current conversation context, limited by the context window. Implementation: message list + summary compression.
- Long-term — persistent knowledge store, typically via RAG (Retrieval-Augmented Generation). Vector DB options: Pinecone, ChromaDB, pgvector, Milvus.
- Episodic — past task execution experiences and feedback. Notable project: Mem0.
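The short-term layer above can be sketched in a few lines. This is a minimal illustration, not any particular framework's API: the class names are hypothetical, and the compression step just truncates evicted messages where a real agent would summarize them with an LLM call.

```python
from dataclasses import dataclass, field

@dataclass
class ShortTermMemory:
    """Message-list memory with naive summary compression (illustrative only)."""
    max_messages: int = 6
    messages: list = field(default_factory=list)
    summary: str = ""

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})
        if len(self.messages) > self.max_messages:
            self._compress()

    def _compress(self) -> None:
        # A real agent would ask the LLM to summarize the evicted half;
        # here we keep truncated snippets as a stand-in.
        half = len(self.messages) // 2
        evicted, self.messages = self.messages[:half], self.messages[half:]
        self.summary += " ".join(m["content"][:40] for m in evicted) + " "

    def context(self) -> list:
        """Build the prompt context: rolling summary first, then recent messages."""
        head = [{"role": "system", "content": f"Summary: {self.summary.strip()}"}] if self.summary else []
        return head + self.messages
```

Once the window fills, older turns collapse into the summary while recent turns stay verbatim — the standard trade-off between context-window cost and fidelity.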
### 3. Tools
Tools are the agent's interface to the outside world, implemented via two primary mechanisms:
- Function Calling — natively supported by OpenAI/Claude; the LLM outputs structured tool-call requests
- MCP (Model Context Protocol) — an open protocol by Anthropic that abstracts tools as independent services. See Protocol Layer for details.
Common tool types: code execution, web search, database queries, API calls, file I/O, browser automation.
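The function-calling mechanism boils down to three pieces: a schema the model sees, an implementation the runtime owns, and a dispatcher that executes the model's structured call. A minimal sketch, with a hypothetical `tool` decorator and a stubbed weather tool standing in for a real API:

```python
import json

# Hypothetical registry: tool name -> JSON-schema declaration + implementation.
TOOLS = {}

def tool(name: str, description: str, parameters: dict):
    """Register a function together with the schema the LLM will be shown."""
    def register(fn):
        TOOLS[name] = {"description": description, "parameters": parameters, "fn": fn}
        return fn
    return register

@tool("get_weather", "Look up current weather",
      {"type": "object",
       "properties": {"city": {"type": "string"}},
       "required": ["city"]})
def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stub; a real tool would call a weather API

def dispatch(tool_call: str) -> str:
    """Execute a structured tool-call request as the LLM would emit it."""
    call = json.loads(tool_call)
    entry = TOOLS[call["name"]]
    return entry["fn"](**call["arguments"])

# The LLM emits JSON like this; the runtime dispatches and returns the result.
print(dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}'))  # → Sunny in Paris
```

MCP follows the same schema-plus-dispatch shape, but moves the registry out of the process into an independent server the agent discovers at runtime.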
### 4. Planning
An agent's planning capability determines its ceiling for handling complex tasks. Core approaches:
- ReAct — alternates between Reasoning and Acting; think first, then act at each step
- Chain-of-Thought (CoT) — step-by-step reasoning to improve accuracy
- Task Decomposition — break large tasks into executable sub-tasks
- Plan-and-Execute — formulate a complete plan first, then execute step by step
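Plan-and-Execute is the easiest of these to show in code. In the sketch below, the planner and executor are hard-coded stubs (a real agent would prompt the LLM for both); only the control flow — full plan first, then step-by-step execution — is the point.

```python
def plan(task: str) -> list[str]:
    # Stand-in for an LLM planning call that decomposes the task.
    return [f"research: {task}", f"draft: {task}", f"review: {task}"]

def execute(step: str) -> str:
    # Stand-in for a tool call or LLM completion carrying out one step.
    return f"done({step})"

def plan_and_execute(task: str) -> list[str]:
    """Formulate the complete plan up front, then execute it in order."""
    return [execute(step) for step in plan(task)]

print(plan_and_execute("quarterly report"))
```

Contrast with ReAct, where no full plan exists up front: each step's reasoning sees the previous step's observation before deciding what to do next.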
## The Agent Loop
All these components come together in the Perception → Reasoning → Action → Observation loop:
1. Perceive — the agent takes in the user's request and any environmental context
2. Reason — the LLM decides what to do next (planning + chain-of-thought)
3. Act — call a tool or generate a response
4. Observe — inspect the result, reflect, and decide whether to continue or finish
This loop repeats until the task is complete or a maximum iteration limit is reached.
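The whole loop fits in a short function. This sketch stubs the LLM with a rule-based `reason` function and uses a toy calculator tool — both are illustrative assumptions, but the perceive → reason → act → observe structure and the iteration cap are exactly what production loops implement.

```python
MAX_ITERATIONS = 5  # hard stop so a confused agent cannot loop forever

def reason(history: list) -> dict:
    """Stand-in for the LLM: pick the next action from the transcript.
    A real agent would send `history` to a model and parse its reply."""
    if any(entry.startswith("result:") for entry in history):
        return {"type": "finish", "answer": "42"}
    return {"type": "tool", "name": "calculator", "input": "6 * 7"}

def act(action: dict) -> str:
    """Toy tool execution: evaluate the arithmetic the 'model' requested."""
    return f"result: {eval(action['input'])}"  # never eval untrusted input in real code

def run_agent(request: str) -> str:
    history = [f"user: {request}"]           # perceive
    for _ in range(MAX_ITERATIONS):
        action = reason(history)             # reason
        if action["type"] == "finish":
            return action["answer"]
        observation = act(action)            # act
        history.append(observation)          # observe, then loop
    return "stopped: iteration limit reached"

print(run_agent("what is 6 * 7?"))  # → 42
```

Note the observation is appended to history before the next reasoning step — that feedback edge is what separates an agent loop from a one-shot tool call.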
## What's Next
This article covered the foundational architecture. The rest of the series dives deeper:
- Part 2: Design Patterns — ReAct, Multi-Agent, Graph Workflows, Handoffs
- Part 3: Framework Comparison — LangGraph vs OpenAI Agents vs PydanticAI vs CrewAI
- Part 4: Protocol Layer — MCP, A2A, AG-UI, Function Calling
- Part 5: Production Practices — Observability, Evaluation, Guardrails, Deployment