
Speed Optimization

Estimated time: ~30 minutes

Prerequisite: Module 14.1 (Task Optimization)

Outcome: After this module, you will know techniques to reduce Claude Code response time, understand speed/quality tradeoffs, and optimize for different scenarios.


You’re waiting. Claude is “thinking” for 2 minutes on what should be a simple task. Or worse, it’s been 5 minutes and you’re not sure if it’s stuck or working. Time adds up — 10 slow tasks per day × 3 extra minutes each = 30 minutes wasted daily.

Speed optimization gives you back that time. Clear prompts, clean context, right model choice — these compound into significant productivity gains.


| Factor | Slow | Fast |
|---|---|---|
| Prompt | Vague, ambiguous | Clear, specific |
| Context | 100K tokens | 10K tokens |
| Task | Complex, multi-step | Focused, single |
| Model | Opus (smartest) | Haiku (fastest) |
| Output | Long explanation | Just code |

Response Time = f(Context Size, Task Complexity, Output Length, Model)
Optimize each:
- Context: /clear frequently, exclude irrelevant files
- Complexity: Break into simpler tasks (Module 14.1)
- Output: "Code only, no explanation"
- Model: Use fastest model that works

Heavy context:                  Light context:
─────────────────────────────────────────────────
50 files loaded                 3 relevant files
Full conversation history       Fresh session
All project documentation       Just what's needed
          ↓                              ↓
Result: 60 sec response         Result: 10 sec response

Task Complexity                 → Model Choice
────────────────────────────────────────────────
Simple (format, small edit)     → Haiku (fastest)
Medium (implement feature)      → Sonnet (balanced)
Complex (architecture, debug)   → Opus (smartest)

  • “Code only, no explanation” — saves output generation time
  • “One file at a time” — faster than multiple files
  • “Diff format” — faster than full file rewrite
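
The output tips above can be applied consistently by appending them to each task description. The following is a minimal sketch, not part of Claude Code: `buildPrompt`, `OutputConstraints`, and the option names are hypothetical, and only the constraint phrases themselves come from the tips.

```typescript
// Hypothetical helper: names and structure are illustrative only.
// The constraint phrases are the ones suggested in the tips above.
interface OutputConstraints {
  codeOnly?: boolean;   // append "Code only, no explanation."
  diffFormat?: boolean; // append "Diff format only."
  onlyFile?: string;    // append "Only modify <file>."
}

function buildPrompt(task: string, c: OutputConstraints = {}): string {
  const parts: string[] = [task.trim()];
  if (c.onlyFile) parts.push(`Only modify ${c.onlyFile}.`);
  if (c.diffFormat) parts.push("Diff format only.");
  if (c.codeOnly) parts.push("Code only, no explanation.");
  return parts.join(" ");
}

// Example: a focused, single-file request with minimal output.
console.log(
  buildPrompt("Add a slugify(str) helper.", {
    onlyFile: "src/utils/strings.ts",
    codeOnly: true,
  }),
);
```

Keeping the constraints in one place makes it harder to forget them on quick tasks, which is where the extra output time is least worth paying.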

Beyond general speed, different models excel at different task types:

| Task Category | Haiku | Sonnet | Opus | Recommendation |
|---|---|---|---|---|
| Formatting/Linting | Excellent | Overkill | Overkill | Haiku: speed matters most, quality sufficient |
| Simple CRUD | Good | Excellent | Overkill | Sonnet: needs pattern understanding |
| Complex Feature | Poor | Good | Excellent | Opus: requires architectural reasoning |
| Bug Fixing | Simple bugs | Most bugs | Complex bugs | Match model to bug complexity |
| Code Review | Style issues | Logic issues | Architecture issues | Match model to review depth needed |
| Test Generation | Basic tests | Thorough tests | Edge case tests | Sonnet for most, Opus for critical paths |
| Documentation | Good | Excellent | Overkill | Sonnet: needs context understanding |
| Refactoring | Rename/move | Restructure | Architectural | Match model to refactoring scope |

         Speed               Quality             Cost
Haiku:   ██████████ 10/10    ████░░░░░░ 4/10     █░░░░░░░░░ 1/10
Sonnet:  ██████░░░░ 6/10     ████████░░ 8/10     ████░░░░░░ 4/10
Opus:    ███░░░░░░░ 3/10     ██████████ 10/10    ████████░░ 8/10

Rule of thumb: Default to Sonnet for daily work. Drop to Haiku for batch operations and simple tasks. Escalate to Opus only when Sonnet’s output quality is insufficient for the specific task. One Vietnamese startup reduced Claude costs from $1,200 to $380/month (roughly a 68% reduction) by following this approach.
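
The rule of thumb can be written down as a small decision function. This is only a sketch of one way to encode it: the `Task` shape and `pickModel` are hypothetical, and the tier names are plain labels rather than exact model identifiers. It also combines the complexity mapping shown earlier (complex tasks go to Opus) with the escalation rule, which is one possible reading.

```typescript
type Complexity = "simple" | "medium" | "complex";
type ModelTier = "haiku" | "sonnet" | "opus";

// Hypothetical task description; the field names are illustrative.
interface Task {
  complexity: Complexity;
  batch?: boolean;                 // many small, repetitive operations
  sonnetWasInsufficient?: boolean; // Sonnet was tried and fell short
}

function pickModel(task: Task): ModelTier {
  // Escalate only for complex work or after Sonnet proved insufficient.
  if (task.sonnetWasInsufficient || task.complexity === "complex") return "opus";
  // Drop down for batch operations and simple tasks.
  if (task.batch || task.complexity === "simple") return "haiku";
  // Default for daily work.
  return "sonnet";
}

console.log(pickModel({ complexity: "medium" }));              // sonnet
console.log(pickModel({ complexity: "medium", batch: true })); // haiku
```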


Scenario: Implement utility functions (string helpers).

[Session with 50K context from previous work]
You: Create utility functions for our app. We need string helpers,
date formatters, and validators. Explain each function and
add comprehensive documentation.
Claude: [Thinking... 90 seconds]
[Long response with explanations, 200+ lines]
Total time: ~3 minutes
$ claude
# Fresh session, clean context
You: Create src/utils/strings.ts with these functions:
- capitalize(str): Capitalize first letter
- slugify(str): Convert to URL slug
- truncate(str, len): Truncate with ellipsis
Code only, no explanation.
Claude: [Thinking... 15 seconds]
export function capitalize(str: string): string {
  return str.charAt(0).toUpperCase() + str.slice(1);
}

export function slugify(str: string): string {
  return str.toLowerCase().replace(/\s+/g, '-').replace(/[^\w-]/g, '');
}

export function truncate(str: string, len: number): string {
  return str.length > len ? str.slice(0, len) + '...' : str;
}

Total time: ~20 seconds

| Metric | Slow | Fast |
|---|---|---|
| Time | 3 minutes | 20 seconds |
| Speedup | - | 9x faster |

What changed:

  • Fresh context (no 50K baggage)
  • Specific scope (one file, three functions)
  • “Code only” (no explanation overhead)
  • Clear requirements (exact function signatures)
# Run three independent tasks simultaneously
claude -p "Create src/utils/strings.ts: capitalize, slugify" &
claude -p "Create src/utils/dates.ts: formatDate, parseDate" &
claude -p "Create src/utils/validators.ts: isEmail, isURL" &
wait
# Total: ~25 seconds (instead of 75 seconds sequential)
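
The same fan-out can be scripted when the results need to be collected in one place. The sketch below shows only the concurrency pattern: `runParallel`, `Runner`, and `TaskResult` are hypothetical names, and the function that actually executes a prompt (for example by spawning `claude -p`) is passed in by the caller rather than implemented here.

```typescript
// Hypothetical: a Runner executes one prompt and resolves with its output.
// In practice it might spawn `claude -p <prompt>`; that part is left to the caller.
type Runner = (prompt: string) => Promise<string>;

interface TaskResult {
  prompt: string;
  ok: boolean;
  output: string; // command output, or the error message when ok is false
}

async function runParallel(prompts: string[], run: Runner): Promise<TaskResult[]> {
  // Start every task immediately, then wait for all of them.
  // Each task is wrapped so that one failure does not reject the whole batch.
  const pending = prompts.map((prompt) =>
    run(prompt).then(
      (output): TaskResult => ({ prompt, ok: true, output }),
      (err): TaskResult => ({ prompt, ok: false, output: String(err) }),
    ),
  );
  return Promise.all(pending);
}
```

Results come back in the same order as the prompts. As with the shell version, this only helps when the tasks are independent, such as writing to separate files.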

Goal: Experience the impact of context size.

Instructions:

  1. Note your current context size
  2. Use /clear and reload only essential files
  3. Run the same task
  4. Compare response times
💡 Hint

The /cost command shows token usage. Compare before and after /clear.

✅ Solution

Typical results:

  • Heavy context (50K tokens): 45-90 second responses
  • Light context (5K tokens): 10-20 second responses
  • Speedup: 3-5x faster with clean context
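
To compare runs fairly, it helps to average a few measurements before computing the speedup. The following is a small sketch for that arithmetic; the helper names and the sample timings are made up for illustration (chosen to fall inside the typical ranges above) and are not measured results.

```typescript
// Hypothetical record of response times, in seconds, for one setup.
interface Timing {
  label: string;
  seconds: number[];
}

function mean(xs: number[]): number {
  if (xs.length === 0) throw new Error("need at least one measurement");
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Speedup = average time before / average time after.
function speedup(before: Timing, after: Timing): number {
  return mean(before.seconds) / mean(after.seconds);
}

// Illustrative numbers only.
const heavy: Timing = { label: "heavy context", seconds: [60, 75, 45] }; // mean 60
const light: Timing = { label: "light context", seconds: [15, 10, 20] }; // mean 15
console.log(`${speedup(heavy, light).toFixed(1)}x faster`); // 4.0x faster
```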

Goal: Measure the impact of output length.

Instructions:

  1. Ask Claude to implement something with full explanation
  2. Time it
  3. Ask the same thing with “code only, no explanation”
  4. Compare times
💡 Hint

Output generation takes time. Less output = faster response.

✅ Solution

Typical results:

  • With explanation: 30-60 seconds, 100+ lines output
  • Code only: 10-20 seconds, 20 lines output
  • Speedup: 2-3x faster

Goal: Understand model speed/quality tradeoffs.

Instructions:

  1. Pick a medium-complexity task
  2. Try with different models if available
  3. Compare: time, quality, appropriateness
💡 Hint

Haiku is fastest but may miss nuance. Opus is smartest but slower. Sonnet balances both.

✅ Solution

  • For simple formatting: Haiku (fast, sufficient quality)
  • For feature implementation: Sonnet (balanced)
  • For complex debugging: Opus (worth the wait)

Match model to task complexity.


# Fresh context
/clear
# Minimal output
"Code only, no explanation"
"Just the function, no tests"
"Diff format only"
# Focused scope
"Only modify [file]"
"Just the [component]"

| Task Type | Model | Why |
|---|---|---|
| Simple edits | Haiku | Fastest |
| Features | Sonnet | Balanced |
| Complex debug | Opus | Smartest |

claude -p "task 1" &
claude -p "task 2" &
wait
  • /clear between unrelated tasks
  • Load only files you’re working on
  • Exclude node_modules, build artifacts
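
When deciding which files to hand over, a simple path filter keeps dependency and build directories out of the list. This is a sketch under assumptions: `isRelevant` and the default directory list are hypothetical, and only `node_modules` and "build artifacts" are named in the checklist above (`dist`, `build`, and `.git` are assumed examples).

```typescript
// Assumed defaults; adjust to the project's actual layout.
const EXCLUDED_DIRS: string[] = ["node_modules", "dist", "build", ".git"];

// True when no segment of the path is an excluded directory.
function isRelevant(path: string, excluded: string[] = EXCLUDED_DIRS): boolean {
  const segments = path.split("/");
  return !segments.some((s) => excluded.indexOf(s) !== -1);
}

const candidates = [
  "src/utils/strings.ts",
  "node_modules/lodash/index.js",
  "dist/bundle.js",
  "src/build-info.ts", // kept: "build-info.ts" is a file name, not the "build" directory
];
console.log(candidates.filter((p) => isRelevant(p)));
```

Matching on whole path segments, rather than substrings, avoids dropping source files whose names merely contain a word like "build".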

| ❌ Mistake | ✅ Correct Approach |
|---|---|
| Never clearing context | /clear for fresh starts |
| Always using Opus | Match model to task complexity |
| Asking for explanations you won’t read | “Code only” for speed |
| Loading entire codebase | Load only relevant files |
| Sequential tasks that could be parallel | Use multiple sessions |
| Optimizing prematurely | Get it working first, then speed up |
| Sacrificing quality for speed | Speed should maintain quality |

Scenario: A Vietnamese agency had developers complaining that Claude was “too slow”: response times of 2-3 minutes made it unusable for quick tasks.

Audit Findings:

  • Average context: 80K tokens (accumulated over days)
  • Asking for explanations on every task
  • Using Opus for simple formatting
  • Never using /clear

Speed Optimization Protocol:

| Change | Before | After |
|---|---|---|
| Daily fresh session | Never | Every morning |
| Context clearing | Never | Between projects |
| Output style | With explanation | Code only (default) |
| Model matching | Always Opus | Task-appropriate |

Results:

  • Average response: 2.5 min → 30 sec (5x faster)
  • Developer satisfaction: “Claude feels snappy now”
  • No quality reduction

Quote: “We were making Claude carry an 80K token backpack everywhere. No wonder it was slow. Traveling light made all the difference.”


Next: Module 14.3: Quality Optimization