
AI Coding - Beyond the Prompt

Mastering Context Engineering in Claude Code

I recently ran into a great YouTube video about Claude Code and how to best use it. It offers some great hints on how to make the most of the context window. A lot of it I was already doing in my own workflow, but it also surfaced some great nuggets about the more advanced features of Claude Code.

Claude Code is a revolutionary tool for developers, but many users hit a frustrating wall: after a promising start, the AI's performance begins to degrade, its answers become inconsistent, and it seems to "forget" earlier instructions. This isn't a flaw in the model; it's a challenge of context management.

The video masterclass with experts Ray Fernando and Eric Buess provides a clear roadmap for moving beyond simple prompting and becoming a true "context engineer." The core principle is treating the AI's context window not as an infinite canvas, but as a finite, high-value resource that must be managed strategically.

The following is a recent video from the IBM Technology YouTube channel, which I'm a big fan of; it is the best explanation I've seen of the difference between Prompt Engineering and Context Engineering.

The Context Window: Your AI's Short-Term Memory

Think of the context window as the AI's working memory. Everything that happens within a single session fills up this space:

Your Prompts (User Input): Every instruction and piece of code you provide.

The AI's "Thinking" (Agent Thinking): The internal processing and planning the AI does before responding. This is a hidden but significant cost.

Tool Calls: Any time Claude searches files, reads documentation, or runs a command.

The Response (Code Output): The final code or text generated by the AI.

As demonstrated by Ray Fernando's context visualizer, this window can fill up surprisingly fast. Once it's cluttered with multiple requests, long code blocks, and extensive thinking processes, the AI begins to suffer from "context rot." It loses track of the most important information, leading to less effective and often incorrect results.

The Thinking Budget: A Key Lever in Context Engineering

A crucial aspect of managing the context window is controlling how much the AI "thinks." Claude Code offers four distinct "thinking modes" that allow you to allocate a specific token budget for its internal planning process. Understanding these modes is essential for efficient context management.

Thinking Mode | Keyword to Use in Prompt | Approx. Token Cost | Best Use Case
Think | "think" | ~4,000 tokens | Simple, direct commands and small, iterative changes.
Think Hard | "think hard" | ~8,000 tokens | Moderately complex tasks that require some planning.
Think Harder | "think harder" | ~16,000 tokens | Complex tasks requiring a detailed plan or analysis of multiple components.
Ultrathink | "ultrathink" | ~32,000 tokens | Initial high-level planning, architectural design, and complex research tasks.

Note: These keywords should be included directly in your prompt, for example, "ultrathink and create a comprehensive testing plan for this feature."
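For a rough sense of scale, the sketch below compares ten turns of "think" with ten turns of "ultrathink" against a roughly 200,000-token window. The window size and the simple arithmetic are assumptions for illustration; exact figures vary by model and session.

```python
# A rough, back-of-the-envelope illustration of how thinking budgets eat into the
# context window. The ~200,000-token window size is an assumption; check your model.
CONTEXT_WINDOW = 200_000

THINKING_COST = {
    "think": 4_000,
    "think hard": 8_000,
    "think harder": 16_000,
    "ultrathink": 32_000,
}

# Ten turns of each mode, before counting any prompts, tool calls, or code output.
for mode in ("think", "ultrathink"):
    used = 10 * THINKING_COST[mode]
    print(f"{mode}: {used:,} thinking tokens (~{100 * used / CONTEXT_WINDOW:.0f}% of the window)")
```

Ten "ultrathink" turns alone would exceed the entire window, which is exactly why the next strategies reserve the big budgets for the moments that deserve them.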

Practical Strategies for Effective Context Management

Here are the key strategies discussed in the video for keeping your context window clean, relevant, and effective.

1. The "Architect then Execute" Strategy

Instead of using the most powerful mode for every single step, adopt a two-phase approach:

Phase 1: Architect (High-Cost): Start a new session by using "ultrathink" or "think harder" to have Claude create a high-level plan. Ask it to analyze the problem, outline the necessary steps, identify relevant files, and propose a solution structure.

Phase 2: Execute (Low-Cost): Once you have this plan, switch to "think" or "think hard" for subsequent prompts. You are no longer asking the AI to solve the whole problem, but merely to execute the next step in the plan it already created. This dramatically reduces the thinking tokens consumed in each turn, preserving your context window.
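If you drive Claude Code from a script rather than an interactive session, the two phases might look like the sketch below. It assumes the claude CLI's non-interactive print mode (-p) and its continue flag (-c) for following up in the same conversation; the prompts themselves are purely illustrative, not taken from the video.

```python
# A minimal sketch of the "Architect then Execute" flow, assuming the claude CLI's
# print mode (-p) and continue flag (-c); the prompts below are illustrative only.
import subprocess

def ask(prompt: str, continue_session: bool = False) -> str:
    """Send one prompt to Claude Code and return its reply as text."""
    cmd = ["claude", "-p", prompt]
    if continue_session:
        cmd.insert(1, "-c")  # stay in the same conversation so the plan remains in context
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# Phase 1 (high cost): one big planning pass.
plan = ask("ultrathink: analyze the signup flow and outline a step-by-step plan to add input validation")
print(plan)

# Phase 2 (low cost): execute the plan one small step at a time.
step_one = ask("think: implement step 1 of the plan you just produced", continue_session=True)
```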

2. Isolate and Conquer with Sub-Agents

When a conversation starts to branch into a complex but separate sub-problem, don't pollute your main context. Instead, use a sub-agent.

A sub-agent is a fresh, isolated Claude Code session with its own clean context window. You can delegate a specific, complex task to a sub-agent (e.g., "Research the best practices for this authentication flow" or "Write unit tests for this specific function"). The sub-agent can use a high-cost thinking mode to solve its dedicated problem without affecting your main workflow. When it's done, it provides a concise answer or code block that you can then feed back into your main agent, adding only the high-signal result to the primary context.
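One simple way to get that isolation from a script is to run the delegated task as its own non-interactive invocation, so it receives a completely fresh context window. The sketch below assumes the claude CLI's print mode (-p); the research prompt is only an example.

```python
# A minimal sketch of delegating a self-contained research task to an isolated session,
# assuming the claude CLI's print mode (-p): each call starts with a fresh context window.
import subprocess

research_prompt = (
    "think harder: research best practices for refresh-token rotation in an OAuth flow "
    "and summarize them in ten bullet points"
)
result = subprocess.run(["claude", "-p", research_prompt],
                        capture_output=True, text=True, check=True)

# Only this concise summary gets pasted back into the main session, so the sub-agent's
# lengthy thinking and tool calls never touch the primary context window.
print(result.stdout)
```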

3. Master the Clean Slate

Sometimes, the best way to manage context is to start over. Don't be afraid to use the /clear command to wipe the slate clean (or /compact to condense the history into a short summary). If you've just completed a major feature and are about to start on a completely different one, it's far more effective to begin a new, focused session than to carry over the baggage of the previous conversation.

4. Automate Context with Hooks and Indexers

For advanced users, Eric Buess detailed how he uses hooks to automate context management. His "Project Indexer" is a prime example. This is a hook that runs automatically, creating a minified, high-signal summary of the entire project's structure, dependencies, and key components. This PROJECT_INDEX.json file gives Claude a bird's-eye view of the codebase without needing to read every file individually, which is a massive token saver. This pre-processing of context is a hallmark of sophisticated AI-driven development.
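To make the idea concrete, here is a minimal sketch of what such an indexer might do. It is not Eric Buess's actual hook, just an illustration that walks a Python codebase and writes a compact PROJECT_INDEX.json of its files, functions, and classes.

```python
# A minimal sketch of the indexer idea (not the actual hook from the video):
# walk the repo, record each file's functions and classes, and write a compact
# PROJECT_INDEX.json that Claude can read instead of opening every file.
import ast
import json
from pathlib import Path

def index_project(root: str = ".") -> dict:
    index = {}
    for path in Path(root).rglob("*.py"):
        if any(part.startswith(".") or part == "node_modules" for part in path.parts):
            continue  # skip hidden directories and vendored dependencies
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue
        index[str(path)] = {
            "functions": [n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)],
            "classes": [n.name for n in ast.walk(tree) if isinstance(n, ast.ClassDef)],
        }
    return index

if __name__ == "__main__":
    # Minified output (no indentation) keeps the index as token-cheap as possible.
    Path("PROJECT_INDEX.json").write_text(json.dumps(index_project(), separators=(",", ":")))
```

A real version would also capture dependencies and other languages, and would be wired up as a hook so the index stays fresh as the code changes.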

By applying these context engineering principles, you can maintain a high signal-to-noise ratio in your interactions with Claude Code, leading to more reliable, accurate, and productive coding sessions.