Tutorial 4: Agent Memory and State
- Contributor
- Jun 10
- 4 min read
A stateless agent can't learn from previous interactions. Memory makes the agent useful across turns and sessions. This tutorial walks through the three layers.
What You'll Build
An agent with three memory types: working (current task), conversation (current session), and persistent (across sessions).
Step 1: Working Memory (5 min)
Working memory is the agent's loop state: the messages array you pass to the API on each call. It needs no extra storage.
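messages = []  # Current task's working memory
done = False
while not done:
    response = client.messages.create(messages=messages, ...)
    # Append the assistant turn as a message dict, not the raw response object
    messages.append({"role": "assistant", "content": response.content})
    # ... run tools, append their results, set done when the task finishes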
Cleared when the task completes.
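To watch working memory grow and then disappear without calling a real API, here is a minimal sketch. `FakeClient` and `run_task` are hypothetical names invented for this example, and the "DONE" marker is an assumed stop signal, not a convention of any SDK.

```python
class FakeClient:
    """Hypothetical stand-in for an LLM client; returns canned replies in order."""

    def __init__(self, replies):
        self._replies = list(replies)

    def create(self, messages):
        # A real client would send `messages` to the model here.
        return self._replies.pop(0)


def run_task(client, task):
    # Working memory exists only for the duration of this call.
    messages = [{"role": "user", "content": task}]
    while True:
        reply = client.create(messages)
        messages.append({"role": "assistant", "content": reply})
        if "DONE" in reply:  # assumed stop signal for this sketch
            break
        messages.append({"role": "user", "content": "continue"})
    return messages  # returned only so the caller can inspect it


history = run_task(FakeClient(["step 1", "step 2 DONE"]), "do the thing")
print(len(history))
```

Once `run_task` returns and `history` goes out of scope, nothing from the task survives, which is why the next two layers exist.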
Step 2: Conversation Memory (15 min)
History of the current session. Maintained between agent calls.
class Session:
    def __init__(self, session_id):
        self.session_id = session_id
        self.messages = self.load_messages()  # From DB

    def chat(self, user_message):
        self.messages.append({"role": "user", "content": user_message})
        response = call_agent(self.messages)
        self.messages.append({"role": "assistant", "content": response})
        self.save_messages()
        return response
Within a session, the agent has context.
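The class above leaves `load_messages` and `save_messages` undefined. One possible shape for them, sketched here as standalone functions over an in-memory SQLite database so the example runs anywhere: the `sessions` table and its columns are assumptions of this sketch, and SQLite uses `?` placeholders where the rest of this tutorial uses `%s`.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (session_id TEXT PRIMARY KEY, messages TEXT)")


def load_messages(session_id):
    """Return the stored message list, or an empty list for a new session."""
    row = conn.execute(
        "SELECT messages FROM sessions WHERE session_id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []


def save_messages(session_id, messages):
    """Overwrite the stored history for this session with the full message list."""
    conn.execute(
        "INSERT OR REPLACE INTO sessions (session_id, messages) VALUES (?, ?)",
        (session_id, json.dumps(messages)),
    )
    conn.commit()


save_messages("s1", [{"role": "user", "content": "hi"}])
print(load_messages("s1"))
```

Storing the whole history as one JSON blob keeps the sketch short; a table with one row per message would make it easier to trim or page through long sessions.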
Step 3: Persistent (User) Memory (30 min)
Facts about the user that persist across sessions:
-- Schema
CREATE TABLE user_memory (
    user_id TEXT,
    fact TEXT,
    confidence FLOAT,
    last_referenced TIMESTAMPTZ,
    PRIMARY KEY (user_id, fact)
);
Insert facts as discovered:
def extract_and_store_facts(user_id, user_message):
    facts = extract_facts(user_message)  # LLM call
    for fact in facts:
        db.execute("""
            INSERT INTO user_memory (user_id, fact, confidence, last_referenced)
            VALUES (%s, %s, %s, NOW())
            ON CONFLICT (user_id, fact) DO UPDATE
            SET last_referenced = NOW()
        """, [user_id, fact, 0.8])
Inject into agent's system prompt:
def get_user_context(user_id):
    facts = db.query("SELECT fact FROM user_memory WHERE user_id = %s", [user_id])
    return "\n".join(f["fact"] for f in facts)
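The tutorial treats `extract_facts` as an opaque LLM call. A rough sketch of one way to write it follows. The prompt wording, the JSON-array output format, and the `call_llm` parameter (a hypothetical callable that takes a prompt string and returns the completion string) are all assumptions of this example, and the signature differs from the one-argument call used above.

```python
import json

EXTRACTION_PROMPT = (
    "List durable facts about the user stated in the message below as a JSON "
    "array of short strings. Return [] if there are none.\n\nMessage: {message}"
)


def extract_facts(user_message, call_llm):
    """Ask the model for facts and defensively parse whatever comes back."""
    raw = call_llm(EXTRACTION_PROMPT.format(message=user_message))
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return []  # Unparseable output: store nothing rather than guess
    if not isinstance(parsed, list):
        return []
    # Keep only non-empty strings, trimmed and de-duplicated in order
    seen, facts = set(), []
    for item in parsed:
        if isinstance(item, str) and item.strip() and item.strip() not in seen:
            seen.add(item.strip())
            facts.append(item.strip())
    return facts


fake_llm = lambda prompt: '["Works in marketing", "Works in marketing", ""]'
print(extract_facts("I work in marketing.", fake_llm))
```

Returning an empty list on malformed output is a deliberate choice here: a missed fact costs little, while a garbage row ends up in every future prompt.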
Step 4: Wire Memory Into the Agent (20 min)
def run_agent(session_id, user_message):
    session = load_session(session_id)
    user_id = session.user_id

    # Load user context
    user_context = get_user_context(user_id)
    system_prompt = f"""
{BASE_SYSTEM_PROMPT}
Known about this user:
{user_context}
"""

    # Run with conversation history
    response = agent_loop(
        system=system_prompt,
        messages=session.messages + [{"role": "user", "content": user_message}],
    )

    # Extract new facts; store
    extract_and_store_facts(user_id, user_message)
    extract_and_store_facts(user_id, response)

    # Update conversation memory
    session.append_messages([
        {"role": "user", "content": user_message},
        {"role": "assistant", "content": response},
    ])
    return response
Three memory layers all in play.
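For a first-time user the memory section of the prompt is empty, and a dangling "Known about this user:" header can be left out. A small sketch of a prompt builder that handles that case; `build_system_prompt` is a hypothetical helper and the base prompt text is a placeholder.

```python
BASE_SYSTEM_PROMPT = "You are a helpful assistant."  # placeholder value


def build_system_prompt(user_context, base=BASE_SYSTEM_PROMPT):
    """Append the memory section only when there is something to show."""
    context = user_context.strip()
    if not context:
        return base
    return f"{base}\n\nKnown about this user:\n{context}"


print(build_system_prompt("Name is Pat\nWorks in marketing"))
```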
Step 5: Memory Decay (15 min)
Old facts may become stale:
def cleanup_stale_memory():
    db.execute("""
        DELETE FROM user_memory
        WHERE last_referenced < NOW() - INTERVAL '6 months'
    """)
Or weight by recency:
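def get_user_context(user_id, limit=20):
    facts = db.query("""
        SELECT fact FROM user_memory
        WHERE user_id = %s
        ORDER BY last_referenced DESC
        LIMIT %s
    """, [user_id, limit])
    # Return a string, like the Step 3 version, so it can go straight into the prompt
    return "\n".join(f["fact"] for f in facts)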
Step 6: Episodic Memory (advanced, varies)
Memory of past interactions:
CREATE TABLE episode_summaries (
    user_id TEXT,
    session_id TEXT,
    summary TEXT,
    embedding VECTOR(1536),  -- VECTOR type comes from the pgvector extension
    created_at TIMESTAMPTZ
);
When a session ends, summarize:
def end_session(session):
    summary = summarize_session(session.messages)
    embedding = embed(summary)
    db.execute("""
        INSERT INTO episode_summaries
        (user_id, session_id, summary, embedding, created_at)
        VALUES (%s, %s, %s, %s, NOW())
    """, [session.user_id, session.session_id, summary, embedding])
Retrieve relevant past episodes for current context:
def relevant_episodes(user_id, current_question, top_k=3):
    embedding = embed(current_question)
    # <=> is pgvector's cosine-distance operator: smallest distance first
    return db.query("""
        SELECT summary FROM episode_summaries
        WHERE user_id = %s
        ORDER BY embedding <=> %s
        LIMIT %s
    """, [user_id, embedding, top_k])
The agent remembers relevant past interactions.
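To see what the `ORDER BY embedding <=> %s` ranking computes, here is the same idea in plain Python: cosine distance is one minus cosine similarity, and the closest summaries come first. The two-dimensional vectors are toy data for illustration; real embeddings come from whatever `embed` wraps, and `nearest_summaries` is a hypothetical name.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 1.0  # zero vector has no direction; a choice made for this sketch
    return 1.0 - dot / norm


def nearest_summaries(query_vec, episodes, top_k=3):
    """episodes: list of (summary, embedding) pairs. Returns the closest summaries."""
    ranked = sorted(episodes, key=lambda e: cosine_distance(query_vec, e[1]))
    return [summary for summary, _ in ranked[:top_k]]


episodes = [
    ("Asked about billing", [1.0, 0.0]),
    ("Asked about onboarding", [0.0, 1.0]),
    ("Asked about refunds", [0.9, 0.1]),
]
print(nearest_summaries([1.0, 0.0], episodes, top_k=2))
```

Doing this in Python scans every row, which is fine for a demo. The database operator exists so the ranking can run next to the data and use an index.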
Step 7: Tool-Triggered Memory (10 min)
Tools can also write to memory:
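def remember(user_id: str, fact: str) -> str:
    """Explicitly remember something about the user."""
    # Upsert: a plain INSERT fails on the (user_id, fact) primary key
    # when the fact was already stored by automatic extraction
    db.execute("""
        INSERT INTO user_memory (user_id, fact, confidence, last_referenced)
        VALUES (%s, %s, 1.0, NOW())
        ON CONFLICT (user_id, fact) DO UPDATE
        SET confidence = 1.0, last_referenced = NOW()
    """, [user_id, fact])
    return f"Remembered: {fact}"

# Add to tools
TOOLS.append({
    "name": "remember",
    "description": "Save something important about the user for future conversations.",
    # ...
})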
The agent can explicitly say "I should remember this."
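One design point worth sketching: the model should supply only the fact, while `user_id` comes from the session, so a prompt-injected tool call cannot write into another user's memory. `dispatch_tool`, `TOOL_HANDLERS`, and the in-memory `memory_store` are hypothetical stand-ins for this example, not part of any SDK.

```python
memory_store = []  # stand-in for the user_memory table


def remember(user_id, fact):
    memory_store.append((user_id, fact))
    return f"Remembered: {fact}"


TOOL_HANDLERS = {"remember": remember}


def dispatch_tool(name, tool_input, session_user_id):
    """Route a model-requested tool call, binding user_id on the server side."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return f"Unknown tool: {name}"
    # Drop any user_id the model supplied; trust only the session's.
    args = {k: v for k, v in tool_input.items() if k != "user_id"}
    return handler(user_id=session_user_id, **args)


result = dispatch_tool(
    "remember", {"fact": "Prefers email", "user_id": "someone-else"}, "u1"
)
print(result, memory_store)
```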
Step 8: Test Memory Across Sessions (varies)
# Session 1
agent.chat("session1", "My name is Pat and I work in marketing.")
# Session 2 (different session ID)
response = agent.chat("session2", "What's my role again?")
assert "marketing" in response.lower()
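Both sessions must belong to the same user. The fact stored during session 1 should reach session 2 through user memory, not through conversation history.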
Step 9: Privacy Considerations (15 min)
Memory has privacy implications:
- Users should be able to see what is remembered about them.
- Users should be able to delete individual facts or everything.
- Sensitive data (PII, financial details) shouldn't be auto-extracted.
- Requests to forget should be honored.
def list_user_memory(user_id):
    return db.query("SELECT fact FROM user_memory WHERE user_id = %s", [user_id])

def forget_user_memory(user_id, fact=None):
    if fact:
        db.execute("DELETE FROM user_memory WHERE user_id=%s AND fact=%s", [user_id, fact])
    else:
        # If you built Step 6, episode_summaries also holds this user's data;
        # a full wipe should delete those rows too
        db.execute("DELETE FROM user_memory WHERE user_id=%s", [user_id])
Build the UI for these operations.
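One way to keep obviously sensitive strings out of auto-extracted memory is to screen facts before they are stored. The patterns below are deliberately rough assumptions for illustration (email addresses, card-like digit runs, US SSN format) and are nowhere near a complete PII detector; `is_sensitive` and `filter_facts` are hypothetical helpers.

```python
import re

# Rough illustrative patterns only; real PII detection needs much more than this.
SENSITIVE_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),   # email addresses
    re.compile(r"\b(?:\d[ -]?){13,16}\b"),     # card-like runs of 13 to 16 digits
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),      # US SSN format
]


def is_sensitive(fact):
    return any(p.search(fact) for p in SENSITIVE_PATTERNS)


def filter_facts(facts):
    """Drop facts that match any sensitive pattern before they are stored."""
    return [f for f in facts if not is_sensitive(f)]


candidates = [
    "Works in marketing",
    "Email is pat@example.com",
    "Card 4111 1111 1111 1111",
    "SSN 123-45-6789",
]
print(filter_facts(candidates))
```

A filter like this would sit between `extract_facts` and the INSERT in `extract_and_store_facts`. It does not replace asking for consent.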
Step 10: Monitor Memory Quality (ongoing)
- Are stored facts useful?
- Does memory inclusion improve responses?
- Are facts accurate?
Periodically review memory contents. Stale or wrong facts degrade the agent.
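The second question can be turned into a number by running the same test questions with and without memory and comparing pass rates. A minimal sketch, assuming a substring match is an acceptable pass criterion for your cases; `pass_rate` and `memory_lift` are hypothetical helpers, and the two lambdas stand in for real agent calls.

```python
def pass_rate(answer_fn, cases):
    """cases: list of (question, expected_substring). answer_fn: question -> answer."""
    if not cases:
        return 0.0
    hits = sum(
        1 for question, expected in cases
        if expected.lower() in answer_fn(question).lower()
    )
    return hits / len(cases)


def memory_lift(with_memory_fn, without_memory_fn, cases):
    """Positive means memory helped on these cases; near zero means it did not."""
    return pass_rate(with_memory_fn, cases) - pass_rate(without_memory_fn, cases)


cases = [("What's my role?", "marketing"), ("What's my name?", "Pat")]
with_memory = lambda q: "You are Pat from marketing."
without_memory = lambda q: "I don't know."
print(memory_lift(with_memory, without_memory, cases))
```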
What You Just Did
You added three-layer memory to your agent. Working memory handles the current task; conversation memory handles the session; user memory persists across sessions.
Common Failure Modes
- Unbounded memory. Token cost grows; eventually exceeds context.
- Auto-extracted PII. Sensitive data stored without consent.
- No forget mechanism. Privacy violation; can't update wrong facts.
- Stale memory injected. Old facts confuse the agent.
- Memory without testing. Don't know if it's actually helping.
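The first and fourth failure modes can be handled in the same place: rank facts by a recency-decayed confidence, then keep only as many as fit a prompt budget. This is a sketch under stated assumptions. The 90-day half-life and the four-characters-per-token estimate are placeholder values, and `recency_score` and `select_facts` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone

HALF_LIFE_DAYS = 90   # assumed decay rate; tune for your product
CHARS_PER_TOKEN = 4   # rough heuristic, not a real tokenizer


def recency_score(confidence, last_referenced, now):
    """Halve a fact's weight for every HALF_LIFE_DAYS since it was last referenced."""
    age_days = max((now - last_referenced).total_seconds() / 86400, 0.0)
    return confidence * 0.5 ** (age_days / HALF_LIFE_DAYS)


def select_facts(rows, max_tokens=500, now=None):
    """rows: (fact, confidence, last_referenced) tuples. Returns prompt-ready text."""
    now = now or datetime.now(timezone.utc)
    ranked = sorted(rows, key=lambda r: recency_score(r[1], r[2], now), reverse=True)
    budget, kept, used = max_tokens * CHARS_PER_TOKEN, [], 0
    for fact, _, _ in ranked:
        cost = len(fact) + 1  # +1 for the joining newline
        if used + cost > budget:
            break
        kept.append(fact)
        used += cost
    return "\n".join(kept)


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
rows = [
    ("old fact", 1.0, now - timedelta(days=180)),
    ("new fact", 0.8, now),
]
print(select_facts(rows, max_tokens=100, now=now))
```

With these inputs the older fact scores 0.25 and the newer one 0.8, so the newer fact is listed first even though its stored confidence is lower.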
Next Tutorial
Real agents fail. Handle it: Tutorial 5: Handle Agent Failures.
Related reading
Keep learning. This article is part of the AI in Quality & Delivery path in the ShiftQuality Learning Center. Use AI in delivery — and evaluate it honestly — without the hype.


