How to Improve an Agent: Part 2 - Simple Task Management

In the previous post, we enhanced our basic agent with parallel tool calling and concurrent execution. Now we’ll add a crucial component for multi-agent coordination: a task management system.

Why Task Management Matters

When agents work on complex problems, they need to:

Break down large tasks into manageable steps
Track progress across multiple operations
Maintain context about what’s been done and what remains
Coordinate between multiple sub-agents (coming in Part 3!)

Without proper task management, agents can lose track of their goals, repeat work unnecessarily, or miss important steps. A todo system provides the shared memory and structure needed for systematic task completion.

What We’re Building

We’ll implement a simple but effective todo system that allows our agent to:

Create todos for planning complex tasks
Track completion status
Maintain visibility into ongoing work
Prepare for multi-agent coordination

This isn’t just about organization – it’s about giving the agent a working memory that persists across tool calls and provides a foundation for more sophisticated multi-agent behaviors.

Prerequisites

The enhanced agent from Part 1 with parallel tool calling
Go installed on your system
Basic understanding of concurrent programming in Go
An Anthropic API key

Implementing the Todo System

Let’s start by defining our todo structure and storage:

import (
    "encoding/json"
    "fmt"
    "sync"
    "time"
)

// Todo represents a task with tracking information
type Todo struct {
    ID          string
    Text        string
    Completed   bool
    CreatedAt   time.Time
    CompletedAt *time.Time
}

// Global storage (in production, use a database)
var todos []Todo
var todoMutex sync.Mutex // Guards todos and todoCounter during concurrent tool execution
var todoCounter int

// Helper to generate unique IDs.
// The caller must hold todoMutex. sync.Mutex is not reentrant, so locking it
// here as well would deadlock when TodoWrite calls this with the lock held.
func generateTodoID() string {
    todoCounter++
    return fmt.Sprintf("todo_%d", todoCounter)
}

Design decisions:

Simple in-memory storage for now (easily replaceable with a database)
Thread-safe with mutex (important for concurrent tool execution)
Unique IDs for reliable tracking
Timestamps for potential analytics and debugging
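
The same design decisions can also be packaged into a single type, so that the slice, the counter, and the mutex that guards them travel together. The following is a minimal sketch under that assumption, not part of the agent from Part 1; `TodoStore`, `NewTodoStore`, `Add`, and `Complete` are hypothetical names introduced here for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Todo mirrors the struct defined earlier in this post.
type Todo struct {
	ID          string
	Text        string
	Completed   bool
	CreatedAt   time.Time
	CompletedAt *time.Time
}

// TodoStore is a hypothetical wrapper that keeps the slice, the counter,
// and the mutex guarding them in one place.
type TodoStore struct {
	mu      sync.Mutex
	todos   []Todo
	counter int
}

// NewTodoStore returns an empty store.
func NewTodoStore() *TodoStore {
	return &TodoStore{}
}

// Add creates a todo and returns a copy of it. The ID is generated while the
// lock is held, so concurrent callers never receive the same ID.
func (s *TodoStore) Add(text string) (Todo, error) {
	if text == "" {
		return Todo{}, fmt.Errorf("todo text cannot be empty")
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.counter++
	t := Todo{
		ID:        fmt.Sprintf("todo_%d", s.counter),
		Text:      text,
		CreatedAt: time.Now(),
	}
	s.todos = append(s.todos, t)
	return t, nil
}

// Complete marks the todo with the given ID as done. Completing a todo that
// is already completed changes nothing, which keeps the call idempotent.
func (s *TodoStore) Complete(id string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	for i := range s.todos {
		if s.todos[i].ID == id {
			if !s.todos[i].Completed {
				now := time.Now()
				s.todos[i].Completed = true
				s.todos[i].CompletedAt = &now
			}
			return nil
		}
	}
	return fmt.Errorf("todo with ID %s not found", id)
}
```

Because each method takes the lock exactly once and never calls another locking method, this layout avoids the risk of locking the same mutex twice. Swapping in a database later would mean replacing the method bodies while keeping the same signatures.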

Creating the Todo Read Tool

First, let’s implement the tool that lists all todos:

// TodoRead tool - lists all todos
var TodoReadDefinition = ToolDefinition{
    Name:        "todoread",
    Description: "List all todos with their current status",
    InputSchema: GenerateSchema[struct{}](),
    Function:    TodoRead,
}

func TodoRead(input json.RawMessage) (string, error) {
    todoMutex.Lock()
    defer todoMutex.Unlock()

    if len(todos) == 0 {
        return "No todos found", nil
    }

    result := "Todos:\n"
    for _, todo := range todos {
        status := "[ ]"
        if todo.Completed {
            status = "[x]"
        }
        result += fmt.Sprintf("%s %s (ID: %s)\n", status, todo.Text, todo.ID)
    }
    return result, nil
}

Why this format: The checkbox format [x] and [ ] is immediately recognizable to both humans and LLMs, making the output intuitive and easy to parse.
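
To make the format concrete, here is a small sketch of the same rendering logic in isolation. It swaps string concatenation for `strings.Builder`, which avoids building a new string on every loop iteration. The names `item` and `renderTodos` are hypothetical and exist only for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// item is a hypothetical, trimmed-down stand-in for the Todo struct.
type item struct {
	ID        string
	Text      string
	Completed bool
}

// renderTodos formats items as a checkbox list, one todo per line,
// using the same layout as the TodoRead tool.
func renderTodos(items []item) string {
	if len(items) == 0 {
		return "No todos found"
	}
	var b strings.Builder
	b.WriteString("Todos:\n")
	for _, it := range items {
		status := "[ ]"
		if it.Completed {
			status = "[x]"
		}
		fmt.Fprintf(&b, "%s %s (ID: %s)\n", status, it.Text, it.ID)
	}
	return b.String()
}
```

For a completed "Scan codebase" todo with ID todo_1, the rendered line is `[x] Scan codebase (ID: todo_1)`.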

Creating the Todo Write Tool

Now for the tool that creates and completes todos:

// TodoWrite tool - creates or completes todos
var TodoWriteDefinition = ToolDefinition{
    Name:        "todowrite",
    Description: "Create new todos or mark existing ones as completed",
    InputSchema: GenerateSchema[TodoWriteInput](),
    Function:    TodoWrite,
}

type TodoWriteInput struct {
    Action string `json:"action"`     // "add" or "complete"
    Text   string `json:"text,omitempty"`
    ID     string `json:"id,omitempty"`
}

func TodoWrite(input json.RawMessage) (string, error) {
    var todoInput TodoWriteInput
    if err := json.Unmarshal(input, &todoInput); err != nil {
        return "", err
    }

    todoMutex.Lock()
    defer todoMutex.Unlock()

    switch todoInput.Action {
    case "add":
        if todoInput.Text == "" {
            return "", fmt.Errorf("todo text cannot be empty")
        }

        newTodo := Todo{
            ID:        generateTodoID(),
            Text:      todoInput.Text,
            Completed: false,
            CreatedAt: time.Now(),
        }
        todos = append(todos, newTodo)
        return fmt.Sprintf("Added todo: %s (ID: %s)", newTodo.Text, newTodo.ID), nil

    case "complete":
        for i := range todos {
            if todos[i].ID == todoInput.ID {
                if todos[i].Completed {
                    return fmt.Sprintf("Todo already completed: %s", todos[i].Text), nil
                }
                now := time.Now()
                todos[i].Completed = true
                todos[i].CompletedAt = &now
                return fmt.Sprintf("Completed todo: %s", todos[i].Text), nil
            }
        }
        return "", fmt.Errorf("todo with ID %s not found", todoInput.ID)

    default:
        return "", fmt.Errorf("invalid action: %s", todoInput.Action)
    }
}

Design choices:

Single tool with action parameter (reduces tool count, clearer intent)
Idempotent completion (safe to call multiple times)
Clear error messages for debugging

Integrating Todos into the Agent

Add the tools to your agent in main():

func main() {
    // ... existing setup ...
    
    tools := []ToolDefinition{
        ReadFileDefinition, 
        ListFilesDefinition, 
        EditFileDefinition,
        TodoReadDefinition,    // New!
        TodoWriteDefinition,   // New!
    }
    
    agent := NewAgent(&client, getUserMessage, tools, systemPrompt)
    // ... rest of main ...
}

Enhancing the System Prompt

The key to effective todo usage is proper prompting. Add this to your system prompt:

systemPrompt += `

# Task Management
You have access to the todowrite and todoread tools for task management. Use these tools to:
- Plan complex tasks by breaking them into steps
- Track your progress
- Ensure nothing is forgotten

IMPORTANT: Mark todos as completed immediately after finishing each task.
This helps maintain clarity about what's done and what remains.

Example workflow:
1. User asks for a complex task
2. Use todowrite to create a list of steps
3. Work through each step
4. Mark each todo complete as you finish
5. Check todoread to ensure all tasks are done`

Why immediate completion matters: Agents can lose track during long conversations. Marking each todo complete as soon as the step is finished keeps the list in sync with the actual state of the work.

Real-World Example

Let’s see how this transforms agent behavior:

Before (without todos):

User: Refactor all TypeScript files to use new import syntax
Claude: I'll help you refactor the TypeScript files. Let me start by finding all TypeScript files...
[Works on files randomly, might miss some, no clear progress tracking]

After (with todos):

User: Refactor all TypeScript files to use new import syntax
Claude: I'll help you refactor the TypeScript files. Let me start by planning this task.

tool: todowrite({"action":"add","text":"Scan codebase for all TypeScript files"})
tool: todowrite({"action":"add","text":"Update imports in components directory"})
tool: todowrite({"action":"add","text":"Update imports in utils directory"})
tool: todowrite({"action":"add","text":"Update imports in types directory"})
tool: todowrite({"action":"add","text":"Run type checker to verify changes"})

Added todos for tracking. Now let me scan for TypeScript files.

tool: list_files({"path":"src"})
tool: todowrite({"action":"complete","id":"todo_1"})

Found 15 TypeScript files across three directories. Starting with components...

[Continues with clear progress tracking]

Advanced Patterns

1. Hierarchical Todos

While our implementation is flat, agents can still express hierarchy through naming and ordering, for example by listing a parent task followed by its sub-steps:

- Refactor authentication system
  - Analyze current auth flow
  - Update login component
  - Update logout component
  - Test authentication flow

2. Progress Reporting

Agents can provide status updates using todos:

// Agent can check progress
tool: todoread({})
// Returns:
// Todos:
// [x] Scan codebase (ID: todo_1)
// [x] Update components (ID: todo_2)
// [ ] Update utils (ID: todo_3)
// [ ] Run tests (ID: todo_4)

"I've completed 2 of 4 tasks (50%). Currently working on updating utility files..."

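The percentage in a status update like the one above is simple arithmetic over the todo list. As a rough sketch, assuming a trimmed-down `todoStatus` type and a hypothetical `progressSummary` helper (neither is part of the agent), it could be computed like this:

```go
package main

import "fmt"

// todoStatus is a hypothetical stand-in holding only the fields needed here.
type todoStatus struct {
	Text      string
	Completed bool
}

// progressSummary counts completed todos and reports an integer percentage.
// Integer division truncates, so 1 of 3 is reported as 33%.
func progressSummary(items []todoStatus) string {
	if len(items) == 0 {
		return "No todos yet"
	}
	done := 0
	for _, it := range items {
		if it.Completed {
			done++
		}
	}
	pct := done * 100 / len(items)
	return fmt.Sprintf("%d of %d tasks complete (%d%%)", done, len(items), pct)
}
```

For the four todos listed above, two of which are checked off, this returns "2 of 4 tasks complete (50%)".
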
3. Error Recovery

Todos help agents recover from errors:

// If an error occurs, the agent can:
1. Check todoread to see what's been done
2. Identify the failed step
3. Retry or work around the issue
4. Continue with remaining tasks
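
The "identify the failed step" part of this recovery flow usually comes down to finding the first todo that is still open. A minimal sketch of that lookup, using a hypothetical `step` type and `nextPending` helper that are not part of the tools above:

```go
package main

// step is a hypothetical, trimmed-down view of a todo.
type step struct {
	ID        string
	Text      string
	Completed bool
}

// nextPending returns the first incomplete step and true, or a zero-value
// step and false when every step is already done.
func nextPending(steps []step) (step, bool) {
	for _, s := range steps {
		if !s.Completed {
			return s, true
		}
	}
	return step{}, false
}
```

In the agent itself this reasoning happens in the model, which reads the todoread output; the helper only illustrates what "resume from the first open todo" means.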

Performance Considerations

Our todo system is designed for efficiency:

O(1) adds: Appending a todo is amortized constant time
O(n) reads and completions: Listing todos and finding one by ID are both linear scans, which is fine for typical todo counts (<100)
Thread-safe: Mutex ensures safe concurrent access
Memory-efficient: Simple structure, minimal overhead
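
If the list ever grew large enough for the linear scan to matter, one option would be to keep a map from ID to slice position alongside the slice. The sketch below shows that idea only; `indexedTodos` and its methods are hypothetical, and locking is left out to keep the example short, so it is not safe for concurrent use as written.

```go
package main

import "fmt"

// entry is a hypothetical, trimmed-down todo.
type entry struct {
	ID        string
	Text      string
	Completed bool
}

// indexedTodos keeps insertion order in a slice and an ID-to-position map,
// so completing a todo by ID avoids scanning the slice.
type indexedTodos struct {
	items []entry
	byID  map[string]int
}

func newIndexedTodos() *indexedTodos {
	return &indexedTodos{byID: make(map[string]int)}
}

// add appends a todo and records its position. IDs are derived from the
// slice length, which works because this sketch never deletes entries.
func (t *indexedTodos) add(text string) string {
	id := fmt.Sprintf("todo_%d", len(t.items)+1)
	t.byID[id] = len(t.items)
	t.items = append(t.items, entry{ID: id, Text: text})
	return id
}

// complete looks the todo up through the map instead of scanning.
func (t *indexedTodos) complete(id string) error {
	i, ok := t.byID[id]
	if !ok {
		return fmt.Errorf("todo with ID %s not found", id)
	}
	t.items[i].Completed = true
	return nil
}
```

For the todo counts discussed here the plain slice is likely simpler and just as fast, so this is a possible later optimization rather than a recommendation.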

Conclusion

With roughly a hundred lines of code, we’ve given our agent a working memory that improves its ability to handle complex, multi-step tasks. The todo system provides:

Structure: Complex tasks become manageable steps
Visibility: Clear progress tracking for both agent and user
Reliability: Fewer forgotten steps and less repeated work
Foundation: Ready for multi-agent coordination

But todos alone don’t make a multi-agent system. In the next post, we’ll implement the key innovation: agents that can spawn other agents, creating true multi-agent behavior with the run_task tool.