Reading & Understanding Codebases
Module 3.1: Reading & Understanding Codebases
Estimated time: ~30 minutes
Prerequisite: Module 1.3 (Context Window Basics)
Outcome: After this module, you will be able to use Claude Code to rapidly understand any unfamiliar codebase — its architecture, key files, patterns, and dependencies.
1. WHY — Why This Matters
You just joined a new team. There are 200K+ lines of code, sparse documentation, and the previous tech lead left two weeks ago. Your manager expects you to ship a critical bug fix by Friday. Reading every file is impossible; you’d spend three weeks just understanding the basics. Traditional approaches like grepping through files or following imports manually are slow and error-prone. You need a codebase navigator that can answer “What does this do?” and “Where should I look?” in seconds, not days. Claude Code can be that navigator, but only if you know how to ask the right questions in the right order.
2. CONCEPT — Core Ideas
Understanding an unfamiliar codebase is about strategic reading, not exhaustive reading. You can’t dump 200K lines into Claude’s context window, so you need a layered approach.
The 3-Layer Reading Strategy
Think of a codebase like a city. You don’t memorize every street before navigating; you learn the map first, then the main routes, then specific streets as needed.
Layer 1: Structure (The Map). Start with the high-level layout. What directories exist? What’s the entry point? What config files define the project? This gives you orientation: you know where things are before understanding what they do.
Layer 2: Patterns (The Routes). Identify the architecture pattern (MVC, Clean Architecture, microservices?). Map dependencies and data flow. Understand how requests move through the system. This reveals the “highways” of your codebase, the critical paths data takes.
Layer 3: Details (The Streets). Only now do you dive into specific function logic, edge cases, and implementation details. This is targeted reading based on what you learned in Layers 1 and 2.
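The “map dependencies” part of Layer 2 can also be approximated mechanically, which gives you something to compare Claude’s answer against. The following is a minimal sketch, assuming a JavaScript project that uses relative `require()` calls or `import ... from` statements. The helper names `extractLocalImports` and `buildDependencyGraph` are hypothetical, and the regex-based scan will miss dynamic or computed imports.

```javascript
// Hypothetical helper: collect relative imports from one file's source text.
// Regex-based, so dynamic or computed imports are missed.
function extractLocalImports(source) {
  const patterns = [
    /require\(\s*['"](\.{1,2}\/[^'"]+)['"]\s*\)/g, // require('./x')
    /\bfrom\s+['"](\.{1,2}\/[^'"]+)['"]/g,          // import x from './x'
  ];
  const found = new Set();
  for (const pattern of patterns) {
    for (const match of source.matchAll(pattern)) {
      found.add(match[1]);
    }
  }
  return [...found].sort();
}

// Build { fileName: [relative imports] } from an in-memory map of sources.
function buildDependencyGraph(files) {
  const graph = {};
  for (const [name, source] of Object.entries(files)) {
    graph[name] = extractLocalImports(source);
  }
  return graph;
}

// Illustrative input only; these file contents are made up.
const exampleGraph = buildDependencyGraph({
  'src/index.js': "const routes = require('./routes');\nconst express = require('express');",
  'src/routes/index.js': "import { create } from '../controllers/OrderController';",
});
console.log(exampleGraph);
```

Package imports such as `express` are deliberately ignored, since the goal here is the internal “route map” between your own files.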
Context Budget Strategy
Claude’s context window is large but finite. Don’t waste it on irrelevant files. Use breadth-first exploration (scan many files shallowly) at Layer 1, then switch to depth-first (deep dive on specific modules) at Layer 3.

    graph TD
        subgraph "Layer 1: Structure (The Map)"
            A[Directory Layout] --> B[Entry Points]
            B --> C[Config Files]
        end
        subgraph "Layer 2: Patterns (The Routes)"
            D[Architecture Pattern] --> E[Dependency Graph]
            E --> F[Data Flow]
        end
        subgraph "Layer 3: Details (The Streets)"
            G[Function Logic] --> H[Edge Cases]
            H --> I[Implementation]
        end
        C --> D
        F --> G

Key principle: Always establish context before drilling down. Never ask “how does login work?” before knowing where the auth module even lives.
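One way to make the context budget concrete is to estimate how much of it a set of files would consume before you ask Claude to read them. This is a rough sketch under stated assumptions: the 4-characters-per-token ratio is a common rule of thumb, not Claude’s actual tokenizer, and `estimateTokens` and `selectWithinBudget` are hypothetical helper names.

```javascript
// Rough heuristic only: about 4 characters per token. Real tokenizers differ.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Walk files in priority order and keep each one that still fits the budget.
// `files` is an array of { path, content }; `budget` is a token count.
function selectWithinBudget(files, budget) {
  const selected = [];
  let used = 0;
  for (const file of files) {
    const cost = estimateTokens(file.content);
    if (used + cost > budget) continue; // skip files that would overflow
    selected.push(file.path);
    used += cost;
  }
  return { selected, used };
}

// Illustrative inputs; sizes are made up to show the skipping behavior.
const candidates = [
  { path: 'README.md', content: 'x'.repeat(400) },
  { path: 'src/index.js', content: 'x'.repeat(800) },
  { path: 'src/generated/schema.js', content: 'x'.repeat(4000) },
  { path: 'package.json', content: 'x'.repeat(200) },
];
console.log(selectWithinBudget(candidates, 400));
// selected: README.md, src/index.js, package.json; used: 350
```

Ordering the candidates by importance (README, entry point, manifests first) is what makes this breadth-first: large generated files get skipped instead of crowding out the map.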
3. DEMO — Step by Step
Let’s explore a realistic Node.js/Express REST API project. Assume you’ve just cloned a repository and know nothing about it.
Step 1: Start a Claude Code session in the project

    $ cd my-unfamiliar-project
    $ claude

Claude starts an interactive session. No output yet, just the prompt.
Step 2: Get project overview (Layer 1 start)
Inside the Claude session, type:

    What does this project do? Explain the directory structure.

Expected output:

    This is a Node.js/Express REST API for managing e-commerce orders.

    Directory structure:
    ├── src/
    │   ├── controllers/   # HTTP request handlers (business logic entry points)
    │   ├── models/        # Mongoose schemas for MongoDB
    │   ├── routes/        # Express route definitions
    │   ├── middleware/    # Auth, validation, error handling
    │   ├── services/      # Business logic layer
    │   ├── utils/         # Helper functions
    │   └── index.js       # Application entry point
    ├── tests/             # Jest test suites
    ├── config/            # Environment-specific configs
    ├── package.json       # Dependencies and scripts
    └── README.md

    Key technologies: Express 4.x, MongoDB, JWT authentication, Jest for testing.

Why this matters: You now have the map. You know there’s a services layer (business logic is separated from controllers), MongoDB is used, and tests exist.
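Because the directory listing comes from Claude, it is worth comparing against the real file system. A minimal sketch, using only Node’s built-in `fs` and `path` modules; the function name `listTree`, the depth limit, and the ignore list are assumptions chosen for illustration.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// List a directory tree down to `maxDepth`, skipping noisy folders.
// Returns an array of indented lines, directories marked with a trailing "/".
function listTree(root, maxDepth = 2, ignore = ['node_modules', '.git']) {
  const lines = [];
  function walk(dir, depth) {
    if (depth > maxDepth) return;
    const entries = fs.readdirSync(dir, { withFileTypes: true })
      .filter((entry) => !ignore.includes(entry.name))
      .sort((a, b) => a.name.localeCompare(b.name));
    for (const entry of entries) {
      const suffix = entry.isDirectory() ? '/' : '';
      lines.push('  '.repeat(depth - 1) + entry.name + suffix);
      if (entry.isDirectory()) walk(path.join(dir, entry.name), depth + 1);
    }
  }
  walk(root, 1);
  return lines;
}
```

Running `console.log(listTree('.').join('\n'))` from the project root prints a two-level outline you can hold next to Claude’s description. Any directory Claude mentioned that does not appear is a cue to ask a follow-up question.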
Step 3: Identify entry points (Layer 1 continued)

    Where is the main entry point? Trace the startup flow.

Expected output:

    Entry point: src/index.js

    Startup flow:
    1. src/index.js loads environment variables from config/
    2. Connects to MongoDB via mongoose.connect()
    3. Initializes Express app
    4. Registers middleware: body-parser, CORS, auth middleware
    5. Registers routes from src/routes/ (mapped to controllers)
    6. Starts HTTP server on port 3000

    Critical file: src/routes/index.js (this is where all API endpoints are defined).

Why this matters: You know the bootstrap sequence. If the app fails to start, you check config → database connection → routes in that order.
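The “check in startup order” idea can be sketched as a small runner that executes each startup step in sequence and reports the first one that fails. This is an illustration, not the demo project’s actual `src/index.js`: the `bootstrap` function is hypothetical, and the step bodies are stubs standing in for config loading, the database connection, and route registration.

```javascript
// Hypothetical sketch: run startup steps in order and report the first failure.
// `steps` is an array of [name, asyncFunction] pairs.
async function bootstrap(steps) {
  const completed = [];
  for (const [name, run] of steps) {
    try {
      await run();
      completed.push(name);
    } catch (error) {
      return { ok: false, failedAt: name, completed, reason: error.message };
    }
  }
  return { ok: true, failedAt: null, completed, reason: null };
}

// Stub steps; the database step fails on purpose to show the report.
const startupSteps = [
  ['config', async () => { /* e.g. read environment variables */ }],
  ['database', async () => { throw new Error('connection refused'); }],
  ['routes', async () => { /* e.g. register route handlers */ }],
];

bootstrap(startupSteps).then((result) => {
  console.log(result.failedAt);  // database
  console.log(result.completed); // [ 'config' ]
});
```

The report tells you both where startup stopped and what already succeeded, which is the same elimination you do by hand when reading the bootstrap sequence.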
Step 4: Map dependencies (Layer 1 → Layer 2 transition)

    What are the key dependencies and what role does each play?

Expected output:

    Production dependencies:
    - express (4.18.x): Web framework
    - mongoose (7.x): MongoDB ODM
    - jsonwebtoken (9.x): JWT token generation/validation
    - bcrypt (5.x): Password hashing
    - dotenv (16.x): Environment variable management
    - express-validator (7.x): Request validation

    Dev dependencies:
    - jest (29.x): Testing framework
    - supertest (6.x): HTTP assertion library for API tests
    - eslint (8.x): Code linting

    Key insight: This is a standard REST API stack. Authentication is JWT-based.

Why this matters: You know the tech stack. If you see a JWT error, you look at the jsonwebtoken configuration.
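The names and versions in a dependency summary are easy to verify, because they come straight from `package.json`. A small sketch, assuming the standard `dependencies` and `devDependencies` fields; the function name `summarizeDependencies` and the example manifest are made up for illustration. The roles (“Web framework”, “MongoDB ODM”) still have to come from Claude or the package docs.

```javascript
// Summarize a parsed package.json into "name (range): section" lines.
function summarizeDependencies(pkg) {
  const sections = [
    ['dependencies', 'production'],
    ['devDependencies', 'dev'],
  ];
  const rows = [];
  for (const [key, label] of sections) {
    for (const [name, range] of Object.entries(pkg[key] ?? {})) {
      rows.push(`${name} (${range}): ${label}`);
    }
  }
  return rows;
}

// Illustrative manifest; the ranges mirror the demo output above.
const examplePkg = {
  dependencies: { express: '4.18.x', jsonwebtoken: '9.x' },
  devDependencies: { jest: '29.x' },
};
console.log(summarizeDependencies(examplePkg).join('\n'));
```

In a real project you would pass `JSON.parse(fs.readFileSync('package.json', 'utf8'))` instead of the inline example.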
Step 5: Understand architecture (Layer 2)

    What architecture pattern does this project follow? Show me the data flow for a typical API request.

Expected output:

    Architecture: Layered architecture with separation of concerns

    Layers:
    1. Routes (src/routes/): Define endpoints, map to controllers
    2. Controllers (src/controllers/): Handle HTTP concerns (req/res)
    3. Services (src/services/): Business logic
    4. Models (src/models/): Data access

    Data flow for POST /api/orders:
    1. Request hits src/routes/orders.js → routes to OrderController.create
    2. Middleware chain runs: auth verification → request validation
    3. OrderController.create calls OrderService.createOrder(data)
    4. OrderService.createOrder performs business logic (stock check, price calc)
    5. OrderService calls Order.create() (Mongoose model) to persist
    6. Response bubbles back up: Model → Service → Controller → Client

    This pattern keeps business logic testable and decoupled from HTTP.

Why this matters: Now you know where to make changes. Need to add a discount feature? That’s business logic, so add it in services/, not controllers/.
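The controller → service → model flow above can be sketched without Express or MongoDB to show where each responsibility sits. This is a simplified illustration under stated assumptions: the `Order` object is an in-memory stand-in for the Mongoose model, the price calculation is invented, the code is synchronous for brevity, and `fakeResponse` is a hypothetical test double for Express’s `res`.

```javascript
// In-memory stand-in for the Mongoose model (assumption: no real database).
const Order = {
  records: [],
  create(data) {
    const saved = { id: this.records.length + 1, ...data };
    this.records.push(saved);
    return saved;
  },
};

// Service layer: business logic only, no req/res.
const OrderService = {
  createOrder({ items }) {
    if (!Array.isArray(items) || items.length === 0) {
      throw new Error('Order must contain at least one item');
    }
    const total = items.reduce((sum, item) => sum + item.price * item.quantity, 0);
    return Order.create({ items, total });
  },
};

// Controller layer: translates HTTP input/output and delegates to the service.
const OrderController = {
  create(req, res, next) {
    try {
      const order = OrderService.createOrder(req.body);
      res.status(201).json(order);
    } catch (error) {
      next(error); // hand off to the global error middleware
    }
  },
};

// Minimal fake response object so the sketch runs without Express.
function fakeResponse() {
  return {
    statusCode: null,
    body: null,
    status(code) { this.statusCode = code; return this; },
    json(payload) { this.body = payload; return this; },
  };
}

const res = fakeResponse();
OrderController.create({ body: { items: [{ price: 5, quantity: 2 }] } }, res, console.error);
console.log(res.statusCode, res.body.total); // 201 10
```

A discount feature would change only `OrderService.createOrder`; the controller and model stay untouched, which is the practical payoff of the layering.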
Step 6: Dive into a specific module (Layer 3)

    Explain the authentication module. How does the login flow work end to end?

Expected output:

    Authentication module: src/middleware/auth.js + src/controllers/AuthController.js

    Login flow (POST /api/auth/login):
    1. Client sends { email, password }
    2. AuthController.login receives request
    3. Calls UserService.findByEmail(email)
    4. Compares password using bcrypt.compare(password, user.hashedPassword)
    5. If valid: generates JWT via jsonwebtoken.sign({ userId: user.id }, SECRET, { expiresIn: '7d' })
    6. Returns { token, user } to client

    Protected route flow:
    1. Client sends request with header: Authorization: Bearer <token>
    2. auth.js middleware intercepts
    3. Extracts token, verifies with jsonwebtoken.verify(token, SECRET)
    4. If valid: attaches decoded userId to req.user, calls next()
    5. If invalid: returns 401 Unauthorized

    Security note: Tokens expire after 7 days. Refresh token mechanism not implemented.

Why this matters: You now understand exactly how auth works. If you need to add role-based access control, you know to modify the middleware and add role checks.
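To see what the sign and verify steps in this flow involve, here is a dependency-free sketch. It deliberately swaps in Node’s built-in `crypto` module for the `jsonwebtoken` library so that it runs as-is, and it omits the bcrypt password comparison entirely. The `SECRET` value and the helper names `signToken`, `verifyToken`, and `authMiddleware` are assumptions for illustration; a real project should keep using a maintained JWT library.

```javascript
const crypto = require('node:crypto');

const SECRET = 'demo-secret'; // assumption: real projects load this from config

const toBase64Url = (value) => Buffer.from(JSON.stringify(value)).toString('base64url');

// Sign a payload as header.payload.signature using HMAC-SHA256 (HS256-style).
function signToken(payload, secret, expiresInSeconds) {
  const header = toBase64Url({ alg: 'HS256', typ: 'JWT' });
  const exp = Math.floor(Date.now() / 1000) + expiresInSeconds;
  const body = toBase64Url({ ...payload, exp });
  const signature = crypto.createHmac('sha256', secret)
    .update(`${header}.${body}`).digest().toString('base64url');
  return `${header}.${body}.${signature}`;
}

// Return the decoded payload, or null when the token is invalid or expired.
function verifyToken(token, secret) {
  const [header, body, signature] = token.split('.');
  if (!header || !body || !signature) return null;
  const expected = crypto.createHmac('sha256', secret)
    .update(`${header}.${body}`).digest();
  const actual = Buffer.from(signature, 'base64url');
  if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected)) {
    return null;
  }
  const payload = JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
  return payload.exp > Math.floor(Date.now() / 1000) ? payload : null;
}

// Middleware shape from the protected-route flow: 401 on failure, next() on success.
function authMiddleware(req, res, next) {
  const [scheme, token] = (req.headers.authorization || '').split(' ');
  const payload = scheme === 'Bearer' && token ? verifyToken(token, SECRET) : null;
  if (!payload) return res.status(401).json({ error: 'Unauthorized' });
  req.user = { userId: payload.userId };
  return next();
}

// Seven-day expiry, matching the demo output above.
const demoToken = signToken({ userId: 'u1' }, SECRET, 7 * 24 * 60 * 60);
console.log(verifyToken(demoToken, SECRET).userId); // u1
```

Role-based access control would slot into `authMiddleware`: include a role in the signed payload, then check it before calling `next()`.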
Step 7: Find patterns and conventions (Layer 2 refinement)

    What coding conventions does this project follow? What patterns are used consistently?

Expected output:

    Conventions:
    - Naming: Controllers end with "Controller", Services with "Service"
    - Error handling: All errors thrown in services are caught by global error middleware
    - Async/await: Used throughout (no raw promises)
    - Validation: express-validator schemas defined in routes/
    - File structure: One file per controller/service/model

    Patterns:
    - Repository pattern: Models act as repositories
    - Dependency injection: Services are imported and called (not instantiated)
    - Middleware chain: Auth → Validation → Controller
    - Test structure: Each controller has a corresponding .test.js file

    Anti-patterns detected:
    - Business logic leaking into some controllers (e.g., OrderController has price calculations)
    - No logging framework (uses console.log)

Why this matters: You now code like the team. New endpoint? Follow the same patterns. You also spotted tech debt to address later.
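The “errors thrown in services are caught by global error middleware” convention usually depends on a small wrapper, because Express 4 does not forward rejected promises from async handlers on its own. The sketch below shows one common way to do it. The names `asyncHandler`, `errorMiddleware`, and `getOrder`, the `error.status` field, and the fake response object are all assumptions; the demo project’s real implementation may differ.

```javascript
// Wrap a controller so both thrown errors and rejected promises reach next().
const asyncHandler = (handler) => (req, res, next) =>
  Promise.resolve()
    .then(() => handler(req, res, next))
    .catch(next);

// Global error middleware. Express recognizes error handlers by their
// four-argument signature, so `next` stays in the parameter list.
function errorMiddleware(error, req, res, next) {
  const status = error.status || 500;
  res.status(status).json({ error: error.message });
}

// Hypothetical controller whose service call fails.
const getOrder = asyncHandler(async () => {
  const error = new Error('Order not found');
  error.status = 404; // assumption: services tag errors with an HTTP status
  throw error;
});

// Fake response so the sketch runs without Express.
const fakeRes = {
  statusCode: null,
  body: null,
  status(code) { this.statusCode = code; return this; },
  json(payload) { this.body = payload; return this; },
};

getOrder({}, fakeRes, (error) => errorMiddleware(error, {}, fakeRes, () => {}))
  .then(() => console.log(fakeRes.statusCode, fakeRes.body));
// 404 { error: 'Order not found' }
```

When you add a new endpoint, following the convention means wrapping the controller the same way and throwing from the service, not writing a local try/catch that formats its own error response.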
4. PRACTICE — Try It Yourself
Exercise 1: Codebase Speed Run
Goal: Map an unfamiliar codebase in under 10 minutes.
Instructions:
- Clone a public repository:

      $ git clone https://github.com/fastify/fastify.git
      $ cd fastify
      $ claude

- Using Claude Code, answer these 5 questions:
  - What does this project do?
  - What’s the architecture pattern?
  - Where would you add a new plugin?
  - What testing framework is used?
  - What’s the most complex module?
- Time yourself. Stop at 10 minutes.
Expected result: You should have high-confidence answers to all 5 questions within 10 minutes.
💡 Hint
Start with Layer 1. Ask: “What does this project do? Explain directory structure.” Then ask: “What’s the plugin system architecture?” Don’t dive into code yet — get the map first.
✅ Solution
Recommended prompt sequence:
- “What does this project do? Explain the directory structure.”
  - Expected: Fastify is a web framework focused on speed. The repository has lib/ (core), test/, docs/, etc.
- “What architecture pattern is used? How does the plugin system work?”
  - Expected: Plugin-based architecture. Plugins register via fastify.register(). The encapsulation model prevents plugin pollution.
- “Where would I add a new plugin? Show me an example.”
  - Expected: Plugins normally live in your own application code or in a separate package, not inside Fastify’s core. Use fastify.decorate() to add functionality.
- “What testing framework is used? What’s the test structure?”
  - Expected: Not Jest. Fastify has historically used tap; confirm against the devDependencies and test scripts in package.json for the version you cloned. Tests live in the test/ directory, and each core feature has a corresponding test file.
- “What’s the most complex module? Why?”
  - Expected: Likely lib/reply.js or lib/request.js, which handle the HTTP lifecycle, serialization, and hooks.
Time check: This should take 6-8 minutes if you follow Layer 1 → Layer 2 → Layer 3.
Exercise 2: New Team Member Onboarding Document
Goal: Generate a reusable onboarding document for a codebase.
Instructions:
- Navigate to any project you’re actively working on (or use a sample project)
- Start Claude Code:

      $ cd your-project
      $ claude

- Ask: “Generate a new team member onboarding guide for this codebase. Include: architecture overview, key files to read first, development setup, common workflows, and gotchas.”
- Review the generated document. Refine by asking: “Add a section on where to find X” (e.g., where API routes are defined, where database migrations live, etc.)
- Save the output to ONBOARDING.md in your project
Expected result: A 2-3 page document that reduces new developer onboarding time from days to hours.
💡 Hint
Be specific about what “onboarding” means for your team. If your team works on features end-to-end, ask Claude to include “how to add a new feature from API to UI.” If your team is backend-focused, ask for “how to add a new API endpoint with tests.”
✅ Solution
Example prompt:

    Generate a new team member onboarding guide for this codebase. Structure:

    1. What does this project do? (2 paragraphs)
    2. Architecture overview (diagram if possible)
    3. Key files to read first (top 10)
    4. Development setup (commands to run)
    5. Common workflows:
       - How to add a new feature
       - How to run tests
       - How to debug issues
    6. Gotchas and conventions
    7. Who to ask for help (if info available in docs)

    Format as Markdown.

Expected output: A structured document you can commit to your repository. Review it for accuracy: Claude might hallucinate details if your project lacks documentation. Correct any errors before sharing with the team.
5. CHEAT SHEET
| Prompt | What It Does | When to Use |
|---|---|---|
| What does this project do? Explain the directory structure. | High-level overview + file layout | First thing when exploring any codebase |
| Where is the main entry point? Trace the startup flow. | Identifies bootstrap sequence | Understanding initialization and app lifecycle |
| What are the key dependencies and their roles? | Maps external libraries to purposes | Assessing tech stack and potential risks |
| What architecture pattern is used here? | Identifies MVC, Clean, layered, etc. | Understanding design decisions and where to add features |
| Trace the data flow for [specific feature] | Shows request → response lifecycle | Understanding how a feature works end-to-end |
| Explain the [module name] module end to end | Deep dive into one area | Before modifying code in that module |
| What conventions does this project follow? | Reveals patterns, naming, structure | Before writing new code to match team style |
| What are the potential issues or code smells? | Identifies tech debt, anti-patterns | Assessing code quality or planning refactors |
| Generate an onboarding doc for this codebase | Creates structured documentation | Team knowledge sharing and new hire onboarding |
| Where should I add [specific feature]? | Suggests location based on architecture | Planning where new code should live |
6. PITFALLS — Common Mistakes
| ❌ Mistake | ✅ Correct Approach |
|---|---|
| Asking “explain this code” without context | Establish project context first: “What does this project do?” THEN ask about specific code |
| Dumping entire repo into context at once | Start with structure (Layer 1), then patterns (Layer 2), then details (Layer 3) |
| Trusting Claude’s answers without verification | Cross-check responses by asking Claude to show file paths and snippets you can verify |
| Reading files linearly (file by file) | Follow the data flow — trace how a request moves through the system |
| Ignoring test files | Tests are living documentation — ask “Show me tests for [feature]” to understand expected behavior |
| Asking vague questions like “how does auth work?” | Be specific: “Trace the login flow from POST /login to JWT generation” |
| Skipping Layer 2 (patterns) and jumping to code | Always identify the architecture before diving into implementation details |
| Not using /compact when context fills up | Run /compact periodically to compress conversation history and free up context |
| Asking about everything in one prompt | Ask one layer at a time. Let Claude build understanding progressively |
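The verification advice in the table can be partly automated: once Claude has named the files it is describing, check that those files actually exist. A minimal sketch using Node’s built-in `fs` and `path` modules; `findMissingPaths` is a hypothetical helper, and it only catches invented paths, not wrong descriptions of real files.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Return the claimed paths that do not exist under `root`.
function findMissingPaths(claimedPaths, root = process.cwd()) {
  return claimedPaths.filter(
    (relative) => !fs.existsSync(path.join(root, relative))
  );
}
```

For example, `findMissingPaths(['src/index.js', 'src/routes/index.js'])` returns an empty array when both files exist. Any path it returns is one to ask Claude about again before you rely on the explanation.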
7. REAL CASE — Production Story
Scenario: Susan, a senior Android developer in Ho Chi Minh City, joins a fintech startup building a KMP (Kotlin Multiplatform) mobile banking app. The codebase is 3 years old with 150,000 lines of code across shared Kotlin, Android-specific, and iOS-specific modules. The previous tech lead left suddenly, with no handover and no architecture docs. The CEO expects Susan to ship a critical payment bug fix by Friday (4 days away). Traditional onboarding would take 2 weeks.
Problem: How do you understand a 150K-line codebase in 4 days when you’ve never seen Kotlin Multiplatform Mobile before?
Solution: Susan used the 3-layer reading strategy with Claude Code.
Day 1 — Layer 1 (Structure):
- Asked: “What does this project do? Explain the module structure.”
- Learned: 47 modules organized by feature (payments, accounts, loans, etc.). Shared business logic in shared/, platform code in androidApp/ and iosApp/.
- Asked: “Where is the payment feature? Show me the directory structure for payments.”
- Identified: shared/src/commonMain/kotlin/payments/ contains core logic.
Day 1 — Layer 2 (Patterns):
- Asked: “What architecture pattern is used? How does data flow in this app?”
- Learned: Clean Architecture with MVVM. Repository → UseCase → ViewModel → UI.
- Asked: “Trace the payment flow from user tapping ‘Pay’ to transaction completion.”
- Claude mapped the entire flow through 12 files.
Day 2 — Layer 3 (Details):
- Asked: “Explain the PaymentRepository class. What does processPayment() do?”
- Deep-dived into the payment processing logic.
- Asked: “What are potential race conditions in the payment queue?”
- Claude identified a threading issue: PaymentQueue.enqueue() wasn’t synchronized.
Day 3 — Bug Fix:
- Used Claude to generate the fix: “Add thread-safe queueing to PaymentQueue using a mutex.”
- Reviewed generated code, tested locally.
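The story does not show Susan’s actual fix, which would have been written in Kotlin. To illustrate the general idea of serializing access to a queue with a mutex, here is a sketch in JavaScript, used only to stay consistent with the other examples in this module. Everything in it is an assumption: the `Mutex` class, the shape of `PaymentQueue`, and the duplicate check that stands in for the unspecified race condition.

```javascript
// Minimal promise-based mutex: each caller waits for the previous one to finish.
class Mutex {
  constructor() {
    this.tail = Promise.resolve();
  }
  runExclusive(task) {
    const result = this.tail.then(() => task());
    this.tail = result.catch(() => {}); // keep the chain alive after failures
    return result;
  }
}

// Hypothetical queue: enqueue() checks for duplicates, then pushes.
class PaymentQueue {
  constructor() {
    this.items = [];
    this.mutex = new Mutex();
  }
  enqueue(payment) {
    return this.mutex.runExclusive(async () => {
      // Without the mutex, two callers could both pass this check
      // during the simulated delay before either one pushes.
      const alreadyQueued = this.items.some((item) => item.id === payment.id);
      await new Promise((resolve) => setTimeout(resolve, 5));
      if (!alreadyQueued) this.items.push(payment);
      return !alreadyQueued;
    });
  }
}

const queue = new PaymentQueue();
Promise.all([queue.enqueue({ id: 'p1' }), queue.enqueue({ id: 'p1' })])
  .then((results) => console.log(results, queue.items.length)); // [ true, false ] 1
```

Whatever the language, the review step matters as much as the generated code: confirm that every path that touches the shared queue goes through the lock, not just the one named in the prompt.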
Day 4 — Documentation:
- Asked: “Generate an architecture document for the payment module.”
- Created a 15-page onboarding doc with diagrams, data flow charts, and key file references.
Result:
- Full codebase mapped in 2 hours instead of 2 weeks
- 47 modules documented with their responsibilities and dependencies
- Bug traced and fixed in 3 days — shipped Thursday, a day ahead of schedule
- Onboarding doc saved — now used by all new hires, reducing onboarding from 2 weeks to 3 days
- Team velocity increased — other developers started using the same technique
Key insight: The 3-layer strategy works because it mirrors how experienced developers actually learn codebases — structure first, patterns second, details last. Claude Code accelerates each layer from hours to minutes.