Tools¶

Overview¶

Tools are the Achilles heel of AI systems. Poorly designed tools- vague descriptions, unclear usage conditions, unpredictable failures-create massive cognitive friction and lead to unreliable behavior, especially in agentic workflows.

This section provides a complete engineering framework for tool design, not just "better descriptions."

You'll learn the:

Three-Part Tool Definition Standard (Trigger Logic, Negative Constraints, Return Contract),
Risk Classification (Read-Only, State-Change, Computational), and systematic approaches to testing, composition, and error handling.

Mastering tools transforms them from a source of confusion into a source of reliability.

What You'll Learn¶

Tool Literacy: Designing Tools
The complete framework for engineering tools models can actually use reliably:

Three-Part Tool Definition Standard
Trigger Logic (when to use this tool)
Negative Constraints (what NOT to do)
Return Contract (success states + failure modes with recovery actions)
Tool Classification System
Class A: Read-Only (low risk, use freely)
Class B: State-Change (high risk, requires confirmation)
Class C: Computational (when reliability requires it)
Decision Trees - When to use reasoning vs. tools
Standard Library Pattern - Why fewer tools win
Testing Framework - Validating tool comprehension
Common Antipatterns - Mistakes to avoid

Tool Templates
Ready-to-use templates implementing the three-part standard for common tool patterns.

Programmatic Tool Calling
How to move from model-driven tool invocation to code-controlled orchestration for production-grade reliability:

Why Programmatic Calling Wins - Three reliability advantages over model-driven invocation
The Context Window Problem - Why model deliberation tokens contaminate the evidential stream
Four Orchestration Patterns
Sequential calling (dependent steps)
Conditional calling (logic-gated routing)
Parallel calling (independent data gathering)
Error handling and fallback chains
Architectural Classification Enforcement - Moving Class A/B/C from persuasive to structural
Multi-Agent Considerations - Preventing cascading evidential contamination
Common Antipatterns - Mistakes that cause drift and production failures

Why Tools Are Critical¶

Tools multiply cognitive friction when poorly designed:

Vague descriptions → Models guess wrong
No trigger logic → Models use tools at wrong times
Missing failure modes → Models hallucinate error recovery
Tool overload → Decision paralysis
Unclear risk classification → State-change accidents

The cost: Wasted context, unreliable behavior, agent confusion, production failures.

Well-engineered tools eliminate guessing by explicitly defining:

When to use them (and when not to)
What NOT to do (common mistakes, safety constraints)
What to expect (success + every failure mode + recovery actions)

The Three-Part Tool Definition Standard¶

Every tool should define:

1. Trigger Logic (When to Use This)¶

Explicit scenarios for when the model should (and shouldn't) use this tool.

Example:

use_when:

User asks about a specific person's information
Need to verify user exists before action

dont_use_when:

Asking about multiple users (use search_users)
General questions (explain from knowledge)

### 2. Negative Constraints (What NOT to Do)
Common mistakes and safety rules.

**Example:**
```text
do_not:

- Execute queries on null input (validate first)
- Assume column names not in schema

safety:

- Never return password fields
- Limit results to 100 rows maximum

3. Return Contract (What to Expect)¶

Success schema + every failure mode with recovery actions.

Example:

failure_modes:
  rate_limit:
    error: "Rate limit exceeded"
    recovery_action: "Wait 60 seconds, retry automatically"
    user_action: "Email will be sent in 60 seconds"

This eliminates guessing. The model knows exactly what to do for each failure- no loops, no hallucination.

Tool Risk Classification¶

Every tool must be classified by risk:

Class A: Read-Only (Low Risk)
No side effects, safe to retry, safe to use speculatively
Examples: get_user, search_documents, calculate_statistics

Class B: State-Change (High Risk)
Irreversible, has side effects, requires confirmation
Examples: delete_file, send_email, update_database
Rule: ALWAYS confirm with user before execution

Class C: Computational (When Reliability Requires It)
Tasks where accumulated steps, scale, or precision requirements make in-context reasoning unreliable
Examples: 360-month mortgage calculations, large dataset analysis
NOT: "tasks a model finds hard" but "tasks where error accumulates faster than reasoning can correct"

Beyond Basic Descriptions¶

This isn't about writing prettier tool descriptions. It's about:

Engineering for reliability - Systematic error handling
Risk management - Explicit classification and guardrails
Cognitive load reduction - Decision trees, standard libraries
Testing discipline - Validating tool comprehension
Production readiness - Real failure modes, real recovery actions

Getting Started¶

New to Tool Design?¶

Start here:
Tool Literacy: Designing Tools

Work through:

Three-Part Tool Definition Standard
Tool Classification System
Decision Trees (reasoning vs. tools)
Standard Library Pattern
Testing Framework
Common Antipatterns

Need Templates?¶

Jump to Tool Templates for ready-to-use patterns implementing the three-part standard.

Building Production Systems?¶

Pay special attention to:

Return Contracts (failure modes + recovery actions)
Risk Classification (especially Class B confirmation patterns)
Testing Framework (validate tool comprehension before deployment)
Programmatic Tool Calling (architectural enforcement for multi-agent and long-running systems)

Key Principle¶

Tools are the Achilles heel of AI systems- but only when poorly engineered.

Well-designed tools with explicit trigger logic, negative constraints, and return contracts transform uncertainty into reliability. In production systems and multi-agent architectures, programmatic orchestration elevates that reliability from persuasive to architectural.

Ready to begin? Start with Tool Literacy: Designing Tools →