How to Improve an Agent: Part 2 - Simple Task Management
In the previous post, we enhanced our basic agent with parallel tool calling and concurrent execution. Now we’ll add a crucial component for multi-agent coordination: a task management system.
Why Task Management Matters
When agents work on complex problems, they need to:
- Break down large tasks into manageable steps
- Track progress across multiple operations
- Maintain context about what’s been done and what remains
- Coordinate between multiple sub-agents (coming in Part 3!)
Without proper task management, agents can lose track of their goals, repeat work unnecessarily, or miss important steps. A todo system provides the shared memory and structure needed for systematic task completion.
What We’re Building
We’ll implement a simple but effective todo system that allows our agent to:
- Create todos for planning complex tasks
- Track completion status
- Maintain visibility into ongoing work
- Prepare for multi-agent coordination
This isn’t just about organization – it’s about giving the agent a working memory that persists across tool calls and provides a foundation for more sophisticated multi-agent behaviors.
Prerequisites
- The enhanced agent from Part 1 with parallel tool calling
- Go installed on your system
- Basic understanding of concurrent programming in Go
- An Anthropic API key
Implementing the Todo System
Let’s start by defining our todo structure and storage:
import (
	"encoding/json"
	"fmt"
	"sync"
	"time"
)
// Todo represents a task with tracking information
type Todo struct {
	ID          string
	Text        string
	Completed   bool
	CreatedAt   time.Time
	CompletedAt *time.Time
}
// Global storage (in production, use a database)
var todos []Todo
var todoMutex sync.Mutex // Thread safety for concurrent access
var todoCounter int
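// Helper to generate unique IDs.
// The caller must hold todoMutex. sync.Mutex is not reentrant, so locking
// here as well would deadlock when TodoWrite calls this with the lock held.
func generateTodoID() string {
	todoCounter++
	return fmt.Sprintf("todo_%d", todoCounter)
}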
Design decisions:
- Simple in-memory storage for now (easily replaceable with a database)
- Thread-safe with mutex (important for concurrent tool execution)
- Unique IDs for reliable tracking
- Timestamps for potential analytics and debugging
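The "easily replaceable with a database" point is easiest to honor when the tools reach storage through an interface. The following is a minimal sketch under that assumption; `TodoStore` and `memoryStore` are hypothetical names introduced here for illustration and are not part of the code in this post.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Todo mirrors the struct defined above.
type Todo struct {
	ID          string
	Text        string
	Completed   bool
	CreatedAt   time.Time
	CompletedAt *time.Time
}

// TodoStore is a hypothetical abstraction over todo storage.
// A database-backed type could implement the same methods later.
type TodoStore interface {
	Add(text string) Todo
	Complete(id string) (Todo, error)
	List() []Todo
}

// memoryStore is an in-memory TodoStore guarded by its own mutex.
type memoryStore struct {
	mu      sync.Mutex
	todos   []Todo
	counter int
}

func (s *memoryStore) Add(text string) Todo {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.counter++
	t := Todo{ID: fmt.Sprintf("todo_%d", s.counter), Text: text, CreatedAt: time.Now()}
	s.todos = append(s.todos, t)
	return t
}

func (s *memoryStore) Complete(id string) (Todo, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for i := range s.todos {
		if s.todos[i].ID == id {
			// Completing twice leaves the first timestamp untouched.
			if !s.todos[i].Completed {
				now := time.Now()
				s.todos[i].Completed = true
				s.todos[i].CompletedAt = &now
			}
			return s.todos[i], nil
		}
	}
	return Todo{}, fmt.Errorf("todo with ID %s not found", id)
}

func (s *memoryStore) List() []Todo {
	s.mu.Lock()
	defer s.mu.Unlock()
	// Return a copy so callers cannot mutate the store's slice.
	return append([]Todo(nil), s.todos...)
}

func main() {
	var store TodoStore = &memoryStore{}
	t := store.Add("Scan codebase")
	if _, err := store.Complete(t.ID); err != nil {
		fmt.Println("error:", err)
	}
	for _, todo := range store.List() {
		fmt.Println(todo.ID, todo.Text, todo.Completed) // prints "todo_1 Scan codebase true"
	}
}
```

With this shape, the tool functions would depend only on the interface, so swapping the backing store would not touch the tool code.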
Creating the Todo Read Tool
First, let’s implement the tool that lists all todos:
// TodoRead tool - lists all todos
var TodoReadDefinition = ToolDefinition{
	Name:        "todoread",
	Description: "List all todos with their current status",
	InputSchema: GenerateSchema[struct{}](),
	Function:    TodoRead,
}

func TodoRead(input json.RawMessage) (string, error) {
	todoMutex.Lock()
	defer todoMutex.Unlock()

	if len(todos) == 0 {
		return "No todos found", nil
	}

	result := "Todos:\n"
	for _, todo := range todos {
		status := "[ ]"
		if todo.Completed {
			status = "[x]"
		}
		result += fmt.Sprintf("%s %s (ID: %s)\n", status, todo.Text, todo.ID)
	}
	return result, nil
}
Why this format: The checkbox format [x] and [ ] is immediately recognizable to both humans and LLMs, making the output intuitive and easy to parse.
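To illustrate the "easy to parse" claim, here is a small sketch that reads one line of the output back into fields. The `parsedTodo` type and `parseTodoLine` function are hypothetical helpers for this example, and the pattern assumes the exact line format produced by `TodoRead` above.

```go
package main

import (
	"fmt"
	"regexp"
)

// todoLine matches lines such as "[x] Scan codebase (ID: todo_1)".
var todoLine = regexp.MustCompile(`^\[( |x)\] (.+) \(ID: ([^)]+)\)$`)

// parsedTodo is a hypothetical struct holding the fields of one parsed line.
type parsedTodo struct {
	ID        string
	Text      string
	Completed bool
}

// parseTodoLine extracts status, text, and ID from one line of todoread
// output. It returns false for lines that are not todo entries, such as
// the "Todos:" header.
func parseTodoLine(line string) (parsedTodo, bool) {
	m := todoLine.FindStringSubmatch(line)
	if m == nil {
		return parsedTodo{}, false
	}
	return parsedTodo{ID: m[3], Text: m[2], Completed: m[1] == "x"}, true
}

func main() {
	p, ok := parseTodoLine("[x] Scan codebase (ID: todo_1)")
	fmt.Println(p.ID, p.Text, p.Completed, ok) // prints "todo_1 Scan codebase true true"
}
```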
Creating the Todo Write Tool
Now for the tool that creates and completes todos:
// TodoWrite tool - creates or completes todos
var TodoWriteDefinition = ToolDefinition{
	Name:        "todowrite",
	Description: "Create new todos or mark existing ones as completed",
	InputSchema: GenerateSchema[TodoWriteInput](),
	Function:    TodoWrite,
}

type TodoWriteInput struct {
	Action string `json:"action"`         // "add" or "complete"
	Text   string `json:"text,omitempty"` // required for "add"
	ID     string `json:"id,omitempty"`   // required for "complete"
}

func TodoWrite(input json.RawMessage) (string, error) {
	var todoInput TodoWriteInput
	if err := json.Unmarshal(input, &todoInput); err != nil {
		return "", err
	}

	todoMutex.Lock()
	defer todoMutex.Unlock()

	switch todoInput.Action {
	case "add":
		if todoInput.Text == "" {
			return "", fmt.Errorf("todo text cannot be empty")
		}
		newTodo := Todo{
			ID:        generateTodoID(),
			Text:      todoInput.Text,
			Completed: false,
			CreatedAt: time.Now(),
		}
		todos = append(todos, newTodo)
		return fmt.Sprintf("Added todo: %s (ID: %s)", newTodo.Text, newTodo.ID), nil

	case "complete":
		if todoInput.ID == "" {
			return "", fmt.Errorf("todo ID cannot be empty")
		}
		for i := range todos {
			if todos[i].ID == todoInput.ID {
				if todos[i].Completed {
					return fmt.Sprintf("Todo already completed: %s", todos[i].Text), nil
				}
				now := time.Now()
				todos[i].Completed = true
				todos[i].CompletedAt = &now
				return fmt.Sprintf("Completed todo: %s", todos[i].Text), nil
			}
		}
		return "", fmt.Errorf("todo with ID %s not found", todoInput.ID)

	default:
		return "", fmt.Errorf("invalid action: %s", todoInput.Action)
	}
}
Design choices:
- Single tool with action parameter (reduces tool count, clearer intent)
- Idempotent completion (safe to call multiple times)
- Clear error messages for debugging
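Because one tool serves two actions, each action needs different fields. One way to keep error messages clear is to validate the decoded input before touching any state. This is a sketch only: the `validate` method is a hypothetical addition, not part of the code above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// TodoWriteInput mirrors the input struct defined above.
type TodoWriteInput struct {
	Action string `json:"action"`
	Text   string `json:"text,omitempty"`
	ID     string `json:"id,omitempty"`
}

// validate is a hypothetical helper that checks the fields each action needs.
func (in TodoWriteInput) validate() error {
	switch in.Action {
	case "add":
		if in.Text == "" {
			return errors.New(`action "add" requires text`)
		}
	case "complete":
		if in.ID == "" {
			return errors.New(`action "complete" requires id`)
		}
	default:
		return fmt.Errorf("invalid action: %q", in.Action)
	}
	return nil
}

func main() {
	inputs := []string{
		`{"action":"add","text":"Run type checker"}`,
		`{"action":"complete"}`,
		`{"action":"delete","id":"todo_1"}`,
	}
	for _, raw := range inputs {
		var in TodoWriteInput
		if err := json.Unmarshal([]byte(raw), &in); err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		// A nil result means the input is acceptable for its action.
		fmt.Println(raw, "->", in.validate())
	}
}
```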
Integrating Todos into the Agent
Add the tools to your agent in main():
func main() {
	// ... existing setup ...

	tools := []ToolDefinition{
		ReadFileDefinition,
		ListFilesDefinition,
		EditFileDefinition,
		TodoReadDefinition,  // New!
		TodoWriteDefinition, // New!
	}

	agent := NewAgent(&client, getUserMessage, tools, systemPrompt)
	// ... rest of main ...
}
Enhancing the System Prompt
The key to effective todo usage is proper prompting. Add this to your system prompt:
systemPrompt += `
# Task Management
You have access to the todowrite and todoread tools for task management. Use these tools to:
- Plan complex tasks by breaking them into steps
- Track your progress
- Ensure nothing is forgotten
IMPORTANT: Mark todos as completed immediately after finishing each task.
This helps maintain clarity about what's done and what remains.
Example workflow:
1. User asks for a complex task
2. Use todowrite to create a list of steps
3. Work through each step
4. Mark each todo complete as you finish
5. Check todoread to ensure all tasks are done`
Why immediate completion matters: Agents can lose track during long conversations. Immediate completion ensures accurate progress tracking.
Real-World Example
Let’s see how this transforms agent behavior:
Before (without todos):
User: Refactor all TypeScript files to use new import syntax
Claude: I'll help you refactor the TypeScript files. Let me start by finding all TypeScript files...
[Works on files randomly, might miss some, no clear progress tracking]
After (with todos):
User: Refactor all TypeScript files to use new import syntax
Claude: I'll help you refactor the TypeScript files. Let me start by planning this task.
tool: todowrite({"action":"add","text":"Scan codebase for all TypeScript files"})
tool: todowrite({"action":"add","text":"Update imports in components directory"})
tool: todowrite({"action":"add","text":"Update imports in utils directory"})
tool: todowrite({"action":"add","text":"Update imports in types directory"})
tool: todowrite({"action":"add","text":"Run type checker to verify changes"})
Added todos for tracking. Now let me scan for TypeScript files.
tool: list_files({"path":"src"})
tool: todowrite({"action":"complete","id":"todo_1"})
Found 15 TypeScript files across three directories. Starting with components...
[Continues with clear progress tracking]
Advanced Patterns
1. Hierarchical Todos
While our implementation is flat, agents naturally create hierarchical structures through naming:
- Refactor authentication system
  - Analyze current auth flow
  - Update login component
  - Update logout component
  - Test authentication flow
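Since the storage itself stays flat, any hierarchy has to live in the todo text. One possible convention is a "Parent: child" prefix, which a few lines of code can group back into a tree. Both the convention and the `groupByPrefix` helper are assumptions made for this sketch, not something the agent is guaranteed to follow.

```go
package main

import (
	"fmt"
	"strings"
)

// groupByPrefix groups flat todo texts by the part before the first ": ".
// Texts without the separator are collected under the empty-string key.
func groupByPrefix(texts []string) map[string][]string {
	groups := make(map[string][]string)
	for _, text := range texts {
		parent, child, found := strings.Cut(text, ": ")
		if !found {
			parent, child = "", text
		}
		groups[parent] = append(groups[parent], child)
	}
	return groups
}

func main() {
	groups := groupByPrefix([]string{
		"Auth refactor: Analyze current auth flow",
		"Auth refactor: Update login component",
		"Run type checker",
	})
	fmt.Println(len(groups["Auth refactor"]), groups[""]) // prints "2 [Run type checker]"
}
```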
2. Progress Reporting
Agents can provide status updates using todos:
// Agent can check progress
tool: todoread({})
// Returns:
// [x] Scan codebase (ID: todo_1)
// [x] Update components (ID: todo_2)
// [ ] Update utils (ID: todo_3)
// [ ] Run tests (ID: todo_4)
"I've completed 2 of 4 tasks (50%). Currently working on updating utility files..."
3. Error Recovery
Todos help agents recover from errors:
// If an error occurs, the agent can:
1. Check todoread to see what's been done
2. Identify the failed step
3. Retry or work around the issue
4. Continue with remaining tasks
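Both the progress report and the recovery steps above reduce to simple queries over the todo list. The sketch below shows one way to compute them; `progress` and `nextPending` are hypothetical helpers, and the `Todo` type is a trimmed copy of the struct from earlier in this post.

```go
package main

import "fmt"

// Todo is a trimmed copy of the struct defined earlier.
type Todo struct {
	ID        string
	Text      string
	Completed bool
}

// progress returns how many todos are completed, the total, and the percentage.
func progress(todos []Todo) (done, total int, percent float64) {
	total = len(todos)
	for _, t := range todos {
		if t.Completed {
			done++
		}
	}
	if total > 0 {
		percent = float64(done) / float64(total) * 100
	}
	return done, total, percent
}

// nextPending returns the first todo that is not completed, which is where
// an agent would resume after an error. ok is false when nothing remains.
func nextPending(todos []Todo) (next Todo, ok bool) {
	for _, t := range todos {
		if !t.Completed {
			return t, true
		}
	}
	return Todo{}, false
}

func main() {
	todos := []Todo{
		{ID: "todo_1", Text: "Scan codebase", Completed: true},
		{ID: "todo_2", Text: "Update components", Completed: true},
		{ID: "todo_3", Text: "Update utils"},
		{ID: "todo_4", Text: "Run tests"},
	}
	done, total, pct := progress(todos)
	fmt.Printf("Completed %d of %d tasks (%.0f%%)\n", done, total, pct)
	// prints "Completed 2 of 4 tasks (50%)"
	if next, ok := nextPending(todos); ok {
		fmt.Printf("Resume at: %s (ID: %s)\n", next.Text, next.ID)
		// prints "Resume at: Update utils (ID: todo_3)"
	}
}
```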
Performance Considerations
Our todo system is designed for efficiency:
- O(1) adds: Appending a todo is amortized constant time
- O(n) reads and completions: Listing todos and looking one up by ID are linear scans, which is fine for typical todo counts (<100)
- Thread-safe: Mutex ensures safe concurrent access
- Memory-efficient: Simple structure, minimal overhead
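If the list ever grew large enough for the linear lookup to matter, a map from ID to slice position would make completion constant time while keeping creation order for listing. This is a sketch of that idea under stated assumptions: `indexedStore` and its methods are hypothetical and not part of the implementation above.

```go
package main

import (
	"fmt"
	"sync"
)

// Todo is a trimmed copy of the struct defined earlier.
type Todo struct {
	ID        string
	Text      string
	Completed bool
}

// indexedStore is a hypothetical variant of the todo storage.
type indexedStore struct {
	mu    sync.Mutex
	todos []Todo         // preserves creation order for listing
	byID  map[string]int // todo ID -> position in todos
}

func newIndexedStore() *indexedStore {
	return &indexedStore{byID: make(map[string]int)}
}

// add appends a todo and records its position under the same lock.
func (s *indexedStore) add(text string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	id := fmt.Sprintf("todo_%d", len(s.todos)+1)
	s.byID[id] = len(s.todos)
	s.todos = append(s.todos, Todo{ID: id, Text: text})
	return id
}

// complete finds the todo through the map instead of scanning the slice.
func (s *indexedStore) complete(id string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	i, ok := s.byID[id]
	if !ok {
		return fmt.Errorf("todo with ID %s not found", id)
	}
	s.todos[i].Completed = true
	return nil
}

func main() {
	store := newIndexedStore()
	var wg sync.WaitGroup
	// Simulate concurrent tool calls adding todos at the same time.
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			store.add(fmt.Sprintf("task %d", n))
		}(i)
	}
	wg.Wait()
	fmt.Println(len(store.todos), store.complete("todo_50"))
}
```

For the todo counts discussed here the slice scan is likely fast enough, so the extra map is a trade of a little memory and bookkeeping for lookup speed.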
Conclusion
With roughly a hundred lines of code, we’ve given our agent a working memory that dramatically improves its ability to handle complex tasks. The todo system provides:
- Structure: Complex tasks become manageable steps
- Visibility: Clear progress tracking for both agent and user
- Reliability: Fewer forgotten steps and less repeated work
- Foundation: Ready for multi-agent coordination
But todos alone don’t make a multi-agent system. In the next post, we’ll implement the key innovation: agents that can spawn other agents, creating true multi-agent behavior with the run_task tool.