Tutorial 3: Add Multiple Tools to an Agent

Contributor
Jun 9

A single-tool agent is limited. Adding tools gives it more capability, but also more ways to pick the wrong one. This tutorial walks through doing it well.

What You'll Build

An agent with 4-6 well-chosen tools that knows when to use each.

Step 1: Design the Tool Set (15 min)

For a customer support agent:

TOOLS = [
    search_knowledge_base,
    look_up_order,
    look_up_customer,
    create_support_ticket,
    refund_order,
    send_email,
]

Six tools, each doing one specific thing.

Don't add tools speculatively. Each tool the agent has access to is one more thing it can pick wrong.

Step 2: Write Distinct Descriptions (30 min)

TOOLS = [
    {
        "name": "search_knowledge_base",
        "description": (
            "Search documentation and articles for general product info. "
            "Use for 'how to' questions and policy lookups. "
            "Don't use for customer-specific data."
        ),
        # ...
    },
    {
        "name": "look_up_order",
        "description": (
            "Get details about a specific order using its ID. "
            "Use when the customer mentions an order number. "
            "Don't use for general product questions."
        ),
        # ...
    },
    {
        "name": "look_up_customer",
        "description": (
            "Get details about a customer using their email. "
            "Use when you need account info, billing history, or subscription "
            "details. Don't use for order-specific lookups."
        ),
        # ...
    },
    # ...
]

Each description includes "use for X" and often "don't use for Y." This is critical for tool selection.
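One way to keep descriptions honest is a small lint that flags any tool whose description lacks "use" or "don't use" guidance. The sketch below assumes tools are dicts with "name" and "description" keys, as above. The function name find_vague_descriptions and the phrases it looks for are illustrative choices, not part of any library.

```python
def find_vague_descriptions(tools):
    """Return {tool_name: [missing guidance]} for descriptions that lack it."""
    problems = {}
    for tool in tools:
        text = tool["description"].lower()
        # Strip negative phrases first so "don't use for" is not
        # mistaken for positive "use for" guidance.
        positive_text = text.replace("don't use", "").replace("do not use", "")
        missing = []
        if "use for" not in positive_text and "use when" not in positive_text:
            missing.append("when to use")
        if "don't use" not in text and "do not use" not in text:
            missing.append("when not to use")
        if missing:
            problems[tool["name"]] = missing
    return problems


example_tools = [
    {
        "name": "look_up_order",
        "description": (
            "Get details about a specific order using its ID. "
            "Use when the customer mentions an order number. "
            "Don't use for general product questions."
        ),
    },
    # Deliberately vague description, made up for this example.
    {"name": "send_email", "description": "Sends an email."},
]

print(find_vague_descriptions(example_tools))
```

A check like this only catches missing phrases. Whether the guidance actually separates two tools still needs a human read.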

Step 3: Add Disambiguation Examples (15 min)

In the system prompt:

SYSTEM_PROMPT = """
You are a support agent. Use the tools to help customers.

Tool selection guide:
- Question about how something works → search_knowledge_base
- Question about a specific order → look_up_order
- Question about account/billing → look_up_customer
- Need to escalate → create_support_ticket
- Customer needs refund → refund_order (after confirming with customer)

Examples:
- "What's your refund policy?" → search_knowledge_base("refund policy")
- "Where's my order ORD-12345?" → look_up_order("ORD-12345")
- "What's on my account?" → look_up_customer(<their email>)
"""

Examples teach the pattern and reduce ambiguity.
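If different agents get different tool sets, the guide can drift out of sync with the tools actually available. A minimal sketch of generating the guide from a mapping follows. SELECTION_GUIDE and build_system_prompt are hypothetical names introduced for this example.

```python
# Hypothetical mapping of situations to tools; entries mirror the guide above.
SELECTION_GUIDE = {
    "Question about how something works": "search_knowledge_base",
    "Question about a specific order": "look_up_order",
    "Question about account/billing": "look_up_customer",
    "Need to escalate": "create_support_ticket",
    "Customer needs refund": "refund_order (after confirming with customer)",
}


def build_system_prompt(guide, available_tool_names):
    """Render the selection guide, skipping tools this agent does not have."""
    lines = [
        "You are a support agent. Use the tools to help customers.",
        "",
        "Tool selection guide:",
    ]
    for situation, target in guide.items():
        tool_name = target.split()[0]  # drop any trailing note such as "(after ...)"
        if tool_name in available_tool_names:
            lines.append(f"- {situation} → {target}")
    return "\n".join(lines)


prompt = build_system_prompt(
    SELECTION_GUIDE,
    {"search_knowledge_base", "look_up_order", "look_up_customer"},
)
print(prompt)
```

With only the three lookup tools available, the rendered guide omits the ticket and refund lines, so the prompt never mentions a tool the agent cannot call.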

Step 4: Implement the Tools (per tool)

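def look_up_order(order_id: str) -> dict:
    # `db` is the application's database handle. query() is assumed to return
    # a single row as a dict, or None when nothing matches.
    order = db.query("SELECT * FROM orders WHERE id = %s", [order_id])
    if not order:
        return {"error": "Order not found"}
    # Return only the fields the agent needs, in JSON-friendly types.
    return {
        "id": order["id"],
        "status": order["status"],
        "items": order["items"],
        "total": order["total"],
        "created": order["created_at"].isoformat(),
    }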

Return structured data. The agent uses it to compose responses.
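Structured results matter just as much when a tool fails. The sketch below shows one possible approach: a decorator that turns exceptions into the same kind of error dict the tool already returns for a missing order. safe_tool, FAKE_ORDERS, and look_up_order_demo are names made up for this example.

```python
import functools


def safe_tool(func):
    """Wrap a tool so exceptions come back as structured errors."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:  # broad on purpose: the agent needs some answer
            return {"error": f"{func.__name__} failed: {exc}"}
    return wrapper


# Stand-in data store for the example; a real tool would query the database.
FAKE_ORDERS = {"ORD-12345": {"id": "ORD-12345", "status": "shipped"}}


@safe_tool
def look_up_order_demo(order_id: str) -> dict:
    return FAKE_ORDERS[order_id]  # raises KeyError for unknown IDs


print(look_up_order_demo("ORD-12345"))
print(look_up_order_demo("ORD-00000"))
```

The second call returns an error dict instead of raising, so the agent can report the problem or try another tool rather than stalling.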

Step 5: Handle Permissions (15 min)

Not every agent should have every tool:

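def get_tools_for_role(role: str) -> list:
    # Every role gets the read-only tools.
    tools = [search_knowledge_base, look_up_order, look_up_customer]

    # Roles are cumulative: admins can do everything senior support can.
    if role in ("senior_support", "admin"):
        tools += [create_support_ticket, refund_order]

    if role == "admin":
        tools += [delete_account, override_pricing]

    return tools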

The agent's authority comes from its tool set.
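Filtering the tool list controls what the model sees. It can also help to enforce the same rule at the point where tool calls are executed. This is a sketch under assumptions: the ROLE_TOOLS mapping, the "support" role name, and the dispatch function are all hypothetical.

```python
# Hypothetical role-to-tool-name mapping, in the spirit of get_tools_for_role.
ROLE_TOOLS = {
    "support": {"search_knowledge_base", "look_up_order", "look_up_customer"},
    "senior_support": {
        "search_knowledge_base", "look_up_order", "look_up_customer",
        "create_support_ticket", "refund_order",
    },
}


def dispatch(role, tool_name, registry, **kwargs):
    """Run a tool only if the role is allowed to use it."""
    allowed = ROLE_TOOLS.get(role, set())
    if tool_name not in allowed:
        return {"error": f"Role '{role}' may not call {tool_name}"}
    return registry[tool_name](**kwargs)


# Stub implementation so the example runs on its own.
registry = {
    "refund_order": lambda order_id, amount: {"refunded": amount, "order_id": order_id},
}

print(dispatch("support", "refund_order", registry, order_id="ORD-1", amount=10))
print(dispatch("senior_support", "refund_order", registry, order_id="ORD-1", amount=10))
```

The first call is rejected and the second goes through, even though both reach the same registry.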

Step 6: Test Tool Selection (varies)

Test cases:

TEST_CASES = [
    ("How do I change my password?", "search_knowledge_base"),
    ("Where is my order ORD-789?", "look_up_order"),
    ("Can I get a refund for order ORD-123?", ["look_up_order", "refund_order"]),
    ("I want to delete my account.", "create_support_ticket"),  # Escalate
]

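for question, expected in TEST_CASES:
    trace = agent(question, return_trace=True)
    used_tools = [step["tool"] for step in trace["steps"]]

    # Normalize so single-tool and multi-tool cases share one check.
    expected_tools = [expected] if isinstance(expected, str) else expected
    missing = [t for t in expected_tools if t not in used_tools]
    assert not missing, f"{question!r}: missing {missing}, got {used_tools}"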

Verify the agent picks the right tools.
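Because model output varies between runs, a pass rate is often more useful than a hard assert on every case. A minimal sketch follows, assuming the agent can be wrapped in a function that returns the list of tool names it used. score_tool_selection and fake_agent are hypothetical, and fake_agent routes on a keyword only so the example runs without a model.

```python
def score_tool_selection(cases, run_agent):
    """Return (pass_rate, failures) for (question, expected_tools) cases."""
    failures = []
    for question, expected in cases:
        expected = [expected] if isinstance(expected, str) else list(expected)
        used = run_agent(question)
        missing = [tool for tool in expected if tool not in used]
        if missing:
            failures.append({"question": question, "missing": missing, "used": used})
    pass_rate = 1 - len(failures) / len(cases) if cases else 0.0
    return pass_rate, failures


# Stub agent for the example: routes on a keyword instead of calling a model.
def fake_agent(question):
    return ["look_up_order"] if "ORD-" in question else ["search_knowledge_base"]


cases = [
    ("How do I change my password?", "search_knowledge_base"),
    ("Where is my order ORD-789?", "look_up_order"),
    ("Can I get a refund for order ORD-123?", ["look_up_order", "refund_order"]),
]
pass_rate, failures = score_tool_selection(cases, fake_agent)
print(f"pass rate: {pass_rate:.0%}")
for failure in failures:
    print(failure)
```

The failures list shows which questions went wrong and which tools were missing, which is the starting point for fixing descriptions.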

Step 7: Handle Tool Confusion (15 min)

If the agent consistently picks wrong:

Tighten descriptions ("don't use this for X")
Add explicit examples in the system prompt
Reduce overlap between tools (combine tools that do similar things)
Reduce tool count if possible

Models have limits on how well they can discriminate between tools. As a rule of thumb, 6-10 tools is usually manageable, while selection tends to degrade at 20 or more.
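To see which descriptions need tightening, it helps to count which tool gets mistaken for which. The sketch below assumes you have logged pairs of (expected tool, tool the agent picked). The results list is made-up data and confusion_pairs is a hypothetical helper.

```python
from collections import Counter


def confusion_pairs(results):
    """Count (expected_tool, picked_tool) pairs where the agent picked wrong."""
    pairs = Counter()
    for expected, picked in results:
        if expected != picked:
            pairs[(expected, picked)] += 1
    return pairs.most_common()  # most frequent confusion first


# Made-up results for illustration: (expected tool, tool the agent picked).
results = [
    ("look_up_order", "look_up_order"),
    ("look_up_customer", "look_up_order"),
    ("look_up_customer", "look_up_order"),
    ("search_knowledge_base", "look_up_customer"),
]
print(confusion_pairs(results))
```

In this data, look_up_customer is mistaken for look_up_order twice, so those two descriptions are the first to revisit.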

Step 8: Cost-Sensitive Tools (10 min)

Some tools are expensive, whether in latency or in real money:

def refund_order(order_id, amount):
    # Real action; costs money
    confirm_refund_with_user_first()  # Placeholder: confirmation has to happen somewhere
    return process_refund(order_id, amount)

For high-impact tools, require confirmation:

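def refund_order(order_id, amount, confirmed=False):
    if not confirmed:
        # Structured response, so the agent can tell this apart from a refund result.
        return {
            "status": "confirmation_required",
            "message": "Confirm with the customer before refunding, then call again with confirmed=true.",
        }
    return process_refund(order_id, amount)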

The agent has to confirm explicitly, which prevents accidental actions.
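The confirmed flag is set by the model itself, so it is only as reliable as the model's judgment. A stricter variant, sketched here as an assumption rather than as this tutorial's method, splits the action in two: the agent can only request a refund, and application code confirms it once the customer approves. All names below are hypothetical.

```python
import secrets

PENDING_REFUNDS = {}  # token -> (order_id, amount); in-memory for the example


def request_refund(order_id, amount):
    """Tool exposed to the agent: records the request but moves no money."""
    token = secrets.token_hex(8)
    PENDING_REFUNDS[token] = (order_id, amount)
    return {"status": "pending_confirmation", "token": token}


def confirm_refund(token, process_refund):
    """Called by application code (not the agent) once the customer approves."""
    if token not in PENDING_REFUNDS:
        return {"error": "Unknown or already used token"}
    order_id, amount = PENDING_REFUNDS.pop(token)
    return process_refund(order_id, amount)


# Stub payment function so the example runs on its own.
def fake_process_refund(order_id, amount):
    return {"status": "refunded", "order_id": order_id, "amount": amount}


pending = request_refund("ORD-123", 25.0)
print(confirm_refund(pending["token"], fake_process_refund))
print(confirm_refund(pending["token"], fake_process_refund))  # second use is rejected
```

Each token works once, so a repeated or replayed confirmation cannot trigger a second refund.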

Step 9: Tool Composition (varies)

Sometimes one tool's output is another's input:

# Agent flow
# Step 1: look_up_customer("user@example.com") → {customer_id: 42, ...}
# Step 2: look_up_order(<order ID from the customer record>) → order details
# Step 3: search_knowledge_base("refund policy") → policy text
# Step 4: refund_order(order_id="X", amount=Y, confirmed=True) → success

The agent composes the workflow. With good descriptions, this happens naturally.
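The data flow is easier to see when the chain is written out by hand. In the sketch below the tools are stubs with made-up data, and the recent_order_id field on the customer record is an assumption made so that the first result can feed the second call.

```python
# Stub tools with made-up data; real ones would hit a database or API.
def look_up_customer(email):
    return {"customer_id": 42, "email": email, "recent_order_id": "ORD-12345"}


def look_up_order(order_id):
    return {"id": order_id, "status": "delivered", "total": 30.0}


def run_refund_flow(email):
    """Chain the lookups by hand, the way the agent would across turns."""
    customer = look_up_customer(email)
    # The output of the first tool supplies the argument for the second.
    order = look_up_order(customer["recent_order_id"])
    return {
        "customer_id": customer["customer_id"],
        "order_id": order["id"],
        "refundable_amount": order["total"],
    }


print(run_refund_flow("user@example.com"))
```

This only works if each tool returns the identifiers the next one needs, which is one more reason to return structured data.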

Step 10: Monitor Tool Usage (ongoing)

SELECT tool_name, COUNT(*) AS calls, AVG(success::int) AS success_rate
FROM tool_invocations
WHERE timestamp > NOW() - INTERVAL '1 day'
GROUP BY tool_name;

Which tools are overused? Underused? Failing?

Tool never called: it may not be useful, or the agent may be failing to select it
Tool always called: it could probably be merged into another flow
Tool failing often: likely an implementation issue
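These three checks can be automated once the stats are pulled into code. The sketch below assumes per-tool stats of the form (conversations that used the tool, success rate). The 0.9 success threshold, the sample numbers, and the flag_tools name are illustrative assumptions.

```python
def flag_tools(stats, all_tools, total_conversations, min_success=0.9):
    """Flag tools that are never called, always called, or failing often.

    stats maps tool name -> (conversations_using_tool, success_rate).
    """
    flags = {}
    for tool in all_tools:
        used_in, success_rate = stats.get(tool, (0, None))
        if used_in == 0:
            flags[tool] = "never called"
        elif success_rate < min_success:
            flags[tool] = "failing often"
        elif used_in == total_conversations:
            flags[tool] = "always called"
    return flags


# Made-up numbers for illustration.
stats = {
    "search_knowledge_base": (100, 0.99),
    "look_up_order": (40, 0.70),
    "look_up_customer": (35, 0.97),
}
all_tools = ["search_knowledge_base", "look_up_order", "look_up_customer", "send_email"]

print(flag_tools(stats, all_tools, total_conversations=100))
```

Passing the full tool list matters here: a tool with zero calls never shows up in the invocation table, so the query alone cannot reveal it.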

What You Just Did

You built a multi-tool agent with clear selection. The agent picks the right tool for the right job, composes them when needed, and respects permissions.

Common Failure Modes

Overlapping tools. Agent picks the wrong one because two could fit.

Vague descriptions. Agent doesn't know when to use each.

Too many tools. Discrimination breaks down.

No permissions. Agent has tools it shouldn't.

Tool failures unhandled. Agent confused; gives up.

Next Tutorial

Add memory for context-aware decisions: Tutorial 4: Agent Memory and State.

ShiftQuality