Context Engineering: A Practical Guide
By Oliver Kriška
2025
Introduction
Imagine meeting a random colleague on the street and telling them:
"Dark Roasted Peru, Bean Lovers, sweet taste from South America"
They probably won't understand what you want. Maybe they'll think you're talking about some vacation. But if you say:
"I want to buy Dark Roasted Peru coffee from Bean Lovers brand. It should have a sweet taste and is grown in South America."
Now they know what's going on. It works the same way with AI.
The Problem No One Talks About
"AI output is unusable." I hear this constantly. Many people don't use AI precisely because of this frustration. They give it a task, get garbage back, and conclude the technology doesn't work.
But the problem isn't AI. The problem is we're giving little and expecting much.
When I started with AI, I made the same mistakes. I gave large tasks, wasn't specific, provided minimal context, and expected brilliant results. I blamed the tool when things went wrong. Sound familiar?
Here's what changed everything: I realized AI is like a person who knows nothing about your project, your product, or your problems—but can learn everything to extreme depth if you explain it correctly. AI knows nothing about your specific situation, but can mathematically express how atoms split. The gap isn't capability. It's context.
This isn't just my opinion. Andrej Karpathy, co-founder of OpenAI, put it perfectly:
"Context engineering is the delicate art and science of filling the context window with just the right information for the next step."
He's right. And that's what this guide is about.
What You'll Learn
This guide won't give you magic prompts to copy-paste. Those become outdated the moment a new model releases. Instead, you'll learn principles that work across any AI tool—ChatGPT, Claude, Gemini, or whatever comes next.
By the end, you will:
- Understand why prompts fail and how context changes everything
- Know the five components of effective AI context
- Have templates you can use immediately for any task
- Recognize when to iterate versus when to start fresh
- Apply these principles to teams and scale beyond individual use
This isn't theory. Every concept comes from real work—debugging production bugs, selecting technologies, writing documentation, refactoring legacy code, and even personal projects like analyzing home electricity consumption for solar panel decisions.
Why Listen to Me
I've been programming for over 25 years. When AI coding assistants appeared, I jumped in enthusiastically—and failed like everyone else. I wasted hours arguing with AI about simple fixes. I watched it generate 500 lines of generic code when I needed 50 lines of something specific.
Then I started paying attention to what worked. I noticed patterns. I developed a systematic approach. My AI interactions went from frustrating to productive. A task that took 2 hours of fighting became 2 minutes of collaboration.
This guide is what I wish existed when I started. It would have saved me months of trial and error.
How This Guide Is Organized
Part 1: Understanding Context Engineering covers the foundation—what context engineering is, why it matters more than prompt engineering, and how to recognize good context.
Part 2: The Practice gets hands-on with specific techniques: what to do before, during, and after giving AI a task, plus which tools work best for which purposes.
Part 3: Real-World Application shows context engineering in action through detailed examples, then explains how to bring these practices to entire teams.
Part 4: The Bigger Picture addresses the industry context—why "vibe coding" isn't enough, and what AI means for your career (spoiler: it won't replace you, but it will amplify you).
The Appendix includes quick reference materials you'll return to often: the 10 key recommendations, templates, and a tool comparison.
How to Use This Guide
If you're completely new to AI: Read Part 1 first. The foundation matters. Don't skip to the practical chapters until you understand why context beats prompts.
If you're already using AI but frustrated: Skim Chapter 1, then focus on Chapters 3-5 for practical techniques. Keep the Appendix handy while working.
If you're implementing AI for a team: Start with Chapter 7 on team implementation, then read backwards for the underlying principles.
For everyone: Try at least one example from each chapter before moving on. Context engineering is a skill—you learn by doing, not just reading.
A Simple Test
Before every AI task, ask yourself: "Could a junior developer who started yesterday complete this task with just this information?"
If the answer is no, you're missing context. If the answer is yes, AI can handle it too.
That's context engineering in one sentence. The rest of this guide shows you exactly how to do it.
Let's begin.
Part 1: Understanding Context Engineering
Chapter 1: Why Your AI Prompts Fail
For a long time, I thought AI was like a search engine on steroids. Give it a question, get an answer. Then I thought it was like a junior colleague—just give them a task. Both approaches were wrong.
In this chapter, you'll learn:
- Why "prompt engineering" is only half the story
- The real difference between prompts and context
- How AI is different from both Google and colleagues
- Why the industry is shifting to "context engineering"
1.1 The "Giving Little, Expecting Much" Trap
When I started with AI, I made a fundamental mistake that most people make: I gave large tasks, wasn't specific, provided little context, and expected a lot.
Here's what that looked like in practice:
❌ Bad Request: "Build me an expense tracking application"
Result: AI generated 500 lines of generic code using a framework I didn't know, with features I didn't need. Completely unusable.
The problem? I was giving little and expecting much. I wanted AI to read my mind—to know that I needed something simple, that I'd be the only user, that I wanted vanilla JavaScript, that the data could live in localStorage.
Most people who tell me "AI output is unusable" are making this exact mistake. They're treating AI like a magic box that should just know what they want.
Here's the truth: AI doesn't know your project. It doesn't know your constraints. It doesn't know your priorities. It doesn't know your preferences. If you don't tell it, it can't know.
✅ Good Request (broken down):
- "Create HTML table with 3 columns: amount, category, date. Users can add new rows"
- "Add validation—amount must be a number, category from dropdown list"
- "Store data in localStorage, load on page refresh"
- "Add basic CSS styling for clean look"
- "Create delete button for each row"
Result: Each step produced exactly what I asked for. Clean, understandable code I could actually use.
The same user need—simple expense tracking—but completely different results. The difference was breaking down the request and providing specific context for each part.
1.2 Prompt vs Context—The Real Difference
I divide what we call "prompts" into two parts:
Prompt = the task, question, or instruction
Context = everything else that helps AI understand the task correctly
Most people focus obsessively on the "perfect prompt" and ignore context entirely. That's like giving a junior developer a task title and expecting perfect production code. You wouldn't do that to a person—so don't do it to AI.
Here's what context looks like in practice:
❌ Prompt Only: "Write me an article about AI"
Result: Generic 2,000-word essay about AI history, current applications, and future implications. Nothing useful.
✅ Prompt + Context: "I need a 1000-word article for LinkedIn about how Context Engineering improves AI outputs.
Target audience: technical managers
Tone: direct, no pathos
Style examples: [attached two previous articles]
Main point: context is more important than prompts
Source for technical details: [link to relevant research]"
Result: Article that sounds like me, makes the right points, and is actually publishable with minor edits.
Notice what the second request includes that the first doesn't:
- Scope: 1000 words
- Platform: LinkedIn
- Target audience: technical managers
- Tone: direct
- Style reference: examples of my previous work
- Main argument: clear thesis
- Source: authoritative reference
This is context. Without it, AI is guessing. With it, AI knows exactly what success looks like.
1.3 AI as Your New Colleague
AI is like a person who knows nothing about your project, your product, or your problems—but can learn everything to extreme depth if you explain it correctly.
Think about that for a moment. AI knows nothing about your specific situation, but it can mathematically express how atoms split. The gap isn't capability. It's context.
How AI Differs from Google
When you search Google, you're looking for existing information. You scan results, click links, read pages. Google is a finding tool.
AI is different. It's a synthesis tool. It takes what you give it and creates something new. But it can only work with what you provide. If your input is vague, your output will be vague.
How AI Differs from a Colleague
A colleague has shared context with you. They know the project, the history, the politics. When you say "fix the login bug," they know which login, which codebase, which bug you probably mean.
AI has none of that. Every conversation starts from zero. You have to provide the context that a colleague would already have.
The AI Advantage
But AI has advantages colleagues don't:
- Available 24/7—no waiting for someone to be free
- Responds immediately—no "I'll look into it tomorrow"
- No bad moods—consistent quality of engagement
- No ego—never defensive about suggestions
A colleague would need time to think, study documentation, and read articles before giving you a comparable answer. AI can do that synthesis instantly—if you provide the right context.
The trade-off: AI lacks human experience. It doesn't have the situational awareness that comes from years in an industry, from seeing patterns play out, from knowing what usually goes wrong. That irreplaceable human element is why AI amplifies you rather than replaces you.
1.4 Industry Validation
This isn't just my observation. The industry is recognizing this shift.
Andrej Karpathy, co-founder of OpenAI and former Tesla AI director, put it perfectly:
"+1 for 'context engineering' over 'prompt engineering'. People associate prompts with short task descriptions. In every industrial-strength LLM app, context engineering is the delicate art and science of filling the context window with just the right information for the next step."
— Andrej Karpathy, Twitter/X, 2025
The term matters because it changes how we think about the problem. "Prompt engineering" suggests there's a magic incantation that will make AI work. "Context engineering" recognizes that what matters is providing the right information—not finding the right words.
Research supports this too. Studies show that AI performance can actually decrease as you give it more information—when that information is irrelevant or overwhelming. More isn't better. The right information is better.
That's what context engineering is about: the delicate art of giving AI just what it needs—no more, no less.
Chapter Summary
Key Takeaways:
- Most AI failures come from giving little and expecting much—the fix is providing proper context, not finding better prompts
- Prompt = task, Context = everything else—most people ignore the "everything else" and wonder why results are poor
- AI is a knowledgeable stranger—it knows everything but your situation; you have to fill that gap
Try This: Take your last failed AI interaction. Look at what you provided. Ask yourself: "Could a stranger who knows nothing about my situation complete this task?" If not, add the missing context and try again. Notice the difference.
Next: Now that you understand why prompts fail, let's look at exactly what good context looks like—the practical patterns that get results on the first try.
Chapter 2: What Good Context Looks Like
In the previous chapter, we saw why prompts fail without context. Now let's look at exactly what "good context" looks like—the practical patterns that get results on the first try.
In this chapter, you'll learn:
- The five components every effective context needs
- A universal task template that works for any AI request
- Real examples showing bad versus good context
- The simple test to know if your context is good enough
2.1 The Five Components of Context
Every effective AI context includes five components. Miss one, and your results suffer.
1. The Task
What you want AI to do—clearly and specifically.
Not just "write code" but "write a function that validates email addresses."
2. The Constraints
What AI must NOT do—the boundaries and limitations.
"Don't use regex. Must handle international emails. Maximum 50 lines. No external libraries."
3. The Background
Why you need this—the purpose and context.
"Users are signing up with invalid emails. This function will run on every form submit in our registration flow."
4. The Examples
What good output looks like—references and samples.
"Here's a similar function we use for phone validation: [code]. Match this style."
5. The Success Criteria
How you'll judge the result—what "done" means.
"Should handle edge cases like: test+filter@gmail.com, name@subdomain.company.co.uk. Must pass our existing test suite."
When you provide all five, AI has everything it needs. When you skip components, AI has to guess—and guesses are often wrong.
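To make the five components concrete, here's a minimal sketch of the validator they describe. The chapter only states the constraints (no regex, no external libraries, under 50 lines, must accept test+filter@gmail.com and name@subdomain.company.co.uk), so the function name and the exact rules below are my assumptions, not a definitive implementation:

```javascript
// Sketch of the email validator described by the five components above.
// Assumptions: plain string checks instead of regex (per the constraints),
// no external libraries, rules chosen to satisfy the success criteria.
function isValidEmail(email) {
  if (typeof email !== "string") return false;
  const atIndex = email.indexOf("@");
  // Exactly one "@", with a non-empty local part before it
  if (atIndex < 1 || email.indexOf("@", atIndex + 1) !== -1) return false;
  // Domain must have at least two dot-separated, non-empty labels
  const labels = email.slice(atIndex + 1).split(".");
  if (labels.length < 2) return false;
  return labels.every((label) => label.length > 0);
}
```

Note how each component shaped the code: the constraints ruled out regex, the examples forced support for plus-addressing and multi-level subdomains, and the success criteria give you a checklist to verify against.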
2.2 The Universal Task Template
Here's a template that works for any AI task. I use it daily:
## Problem
[What needs to be done or what broke]
## Context
- Files: [specific files and lines if applicable]
- History: [relevant previous changes or attempts]
- Constraints: [what must not change]
## Goal
[Clear success criterion—when is it done]
## Possible Solutions
1. [First option to explore]
2. [Second option]
3. [Third option]
## Tests/Verification
[How we verify it works]
Real Example: Bug Fix
Here's how this template looks for a real debugging task:
## Problem
User ID: 12345 can't edit their profile via the UI
## Context
- Support verified: user has 'admin' role in database
- Works: API endpoint PUT /api/profile/:id (tested manually)
- Doesn't work: "Edit" button in ProfileView.tsx
- Console error: "Permission denied at ProfileView.tsx:156"
- File: src/components/ProfileView.tsx, lines 150-160
## Goal
Edit button must work for users with 'admin' role
## Possible Solutions
1. Check how permissions are validated in ProfileView
2. Verify if user role loads correctly from state
3. Compare API permission check vs UI permission check
## Tests
- User with admin role can click Edit and see the form
- Existing admin permissions aren't broken
- Create test to catch this before users see it
Notice what's included:
- Verified facts from support, not assumptions
- Specific location in code (file + line numbers)
- What works vs doesn't (API yes, UI no)
- Clear goal not vague "fix it"
- Directions to explore not commands to execute
With this context, AI usually finds the exact problem on the first try. In this case: the UI was checking for 'editor' role instead of 'admin'. Two-minute fix.
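For illustration, here's a hypothetical sketch of the kind of check behind that bug—the helper name and user shape are my assumptions, since the source only tells us the UI compared against 'editor' instead of 'admin':

```javascript
// Hypothetical permission check from a component like ProfileView.
// The buggy version compared against the wrong role:
//   const canEdit = user.role === "editor";   // bug: should be "admin"
function canEditProfile(user) {
  // Fixed check: the Edit button is for users with the 'admin' role
  return user != null && user.role === "admin";
}
```

A one-line diff—but only findable quickly because the context pinpointed the file, the line range, and the fact that the API-side check worked.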
2.3 Before and After: Real Examples
Let's see this pattern across different task types.
Example: Article Writing
❌ Without Context: "Write me an article about AI"
Result: Generic essay nobody wants to read.
✅ With Context: "I need a 1000-word article for LinkedIn about Context Engineering.
Audience: technical managers who use AI occasionally
Tone: direct, practical, no hype
Style: here are my last 2 articles [attached]
Main point: context matters more than prompts
Include: one concrete example, Karpathy quote
Exclude: generic AI history, predictions"
Result: Article that sounds like me and makes my actual point.
Example: Code Refactoring
❌ Without Context: "Refactor this function to be cleaner"
Result: AI randomly splits function, breaks business logic, changes things that shouldn't change.
✅ With Context: "Refactor processOrder function (attached).
Context:
- TypeScript 4.9, Express + TypeORM
- All existing tests pass (attached)
- Problem: 800 lines, nobody wants to touch it
Goals:
- Split into smaller functions (max 50 lines each)
- Keep all functionality identical
- Add TypeScript types where missing
Don't change:
- Business logic
- Database table names
- API response format
- Error messages (backwards compatibility)
Process:
- Identify independent parts
- Propose split (don't code yet)
- Wait for my approval
- Implement with tests"
Result: Clean refactoring I can deploy with confidence.
Example: Technology Selection
❌ Without Context: "What's the best framework for admin panel?"
Result: Generic list of frameworks from 2023.
✅ With Context: "Need framework for internal admin app.
Project:
- 20 internal users
- Mainly CRUD + reports
- Team: 2 developers with React experience
- Timeline: MVP in 1 month
- Integration: existing NestJS REST API
- Budget: prefer open source
Requirements:
- Fast development (components out-of-box)
- TypeScript support
- Good documentation
- Active community (2024+)
Exclude:
- Paid solutions (Retool, Forest Admin)
- PHP frameworks
- Need to learn new language
Output I need: Top 3 options with time-to-MVP estimate and starter template links"
Result: React Admin, Refine, and Ant Design Pro—specific comparison with exactly what I asked for.
2.4 The Junior Developer Test
Before every AI task, ask yourself one question:
"Could a junior developer who started yesterday complete this task with just this information?"
If the answer is no—you're missing context. If the answer is yes—AI can handle it too.
This test works because:
- Junior developers need explicit instructions (AI does too)
- Junior developers can't read minds (AI can't either)
- Junior developers need examples (AI learns from them too)
- Junior developers ask clarifying questions (AI makes assumptions instead)
The difference: AI won't ask you for clarification. It will just make assumptions and proceed. Bad assumptions lead to bad output. Good context prevents bad assumptions.
The 80% Test
Here's a more precise benchmark: if the first response is at least 80% correct, you have good context.
If you have to iterate more than 2-3 times, the problem isn't AI—it's your task description. Stop, improve your context, start fresh.
Chapter Summary
Key Takeaways:
- Five components: Task, Constraints, Background, Examples, Success Criteria—miss one and AI guesses
- Use the template: Problem → Context → Goal → Solutions → Tests works for any task
- Junior developer test: If a new colleague couldn't complete it, neither can AI
Try This: Take a task you need to do this week. Before going to AI, fill out the universal template. All five components. Then give it to AI. Notice how much better the first response is compared to your usual "just ask and see" approach.
Next: Now that you know what good context looks like, let's explore the most common mistakes people make—and how to avoid them.
Part 2: The Practice
Chapter 3: Before You Prompt - Preparation
You wouldn't hand a new colleague a sticky note saying "fix the app" and expect perfect results. Yet that's exactly how most people approach AI. The magic happens before you ever hit send.
In this chapter, you'll learn:
- Five preparation principles that transform AI results
- How to break tasks into atomic parts that AI handles easily
- Why showing beats telling every time
- The critical role of exclusions and success criteria
3.1 Give the WHY, Not Just WHAT
AI doesn't know your priorities. It can't distinguish between "nice to have" and "deal-breaker" unless you tell it.
Here's what I mean. I needed a car seat for my son. I could have asked:
❌ Without WHY: "Find the best car seat for a 120cm child"
Result: Generic list sorted by overall ratings—features I didn't care about mixed with safety I did.
Instead, I explained my reasoning:
✅ With WHY: "My son is 120cm tall. Get safety ratings from ADAC tests (not overall ratings!) and create a table. For me, safety is a higher priority than having to take the seat out of the car once a year. Exclude marketplace sellers (Amazon, etc.)"
Result: Table sorted by exactly what mattered to me. Perfect starting point.
The difference? AI understood that safety was my priority, not convenience features. It optimized its search accordingly.
This works everywhere:
- Code review: "Check for security issues" vs "Check for SQL injection specifically—we had an incident last month and need to ensure it's not happening elsewhere"
- Documentation: "Write API docs" vs "Write API docs for external partners who need to integrate quickly—assume no prior knowledge of our system"
- Analysis: "Analyze this data" vs "Analyze this data to find why conversion dropped 40% last Tuesday—we think it's related to the checkout flow"
When you explain WHY, AI can make intelligent trade-offs. Without WHY, it guesses—and guesses are often wrong.
3.2 Break Into Atomic Parts
One of the most counter-intuitive truths about working with AI: giving a large task often takes more time than breaking it into small steps.
Here's why. When I asked AI to "build an expense tracking application," I got 500 lines of generic code using a framework I didn't know, with features I didn't need. Completely unusable.
But when I broke it down:
✅ Atomic Approach:
- "Create HTML table with 3 columns: amount, category, date. Users can add new rows."
- "Add validation—amount must be a number, category from dropdown list."
- "Store data in localStorage, load on page refresh."
- "Add basic CSS styling for clean look."
- "Create delete button for each row."
Result: Each step produced exactly what I asked for. Clean, understandable code I could actually use.
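As an illustration, here's what atomic step 2 might produce—a small validation function and nothing else. The function name and category list are assumptions for the sketch; the point is the narrow scope:

```javascript
// Sketch of atomic step 2: validate one expense row before adding it.
// Categories mirror a dropdown list; the specific values are assumed.
const CATEGORIES = ["food", "transport", "housing", "other"];

function validateExpense(amount, category) {
  const errors = [];
  // Amount must be an actual number, not a numeric string or NaN
  if (typeof amount !== "number" || Number.isNaN(amount)) {
    errors.push("amount must be a number");
  }
  // Category must come from the dropdown options
  if (!CATEGORIES.includes(category)) {
    errors.push("category must be one of the dropdown options");
  }
  return { valid: errors.length === 0, errors };
}
```

Because the step is this small, you can read every line, verify it does exactly what you asked, and move on to step 3 with confidence.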
The atomization principle works because:
- Smaller scope = better focus. AI concentrates on one thing.
- Easier to provide context. Each step gets only the relevant files or information.
- Faster feedback loops. Catch errors at step 2 instead of discovering everything is wrong at the end.
- Higher success rate. When tasks are atomic, AI is far more likely to produce a correct result on the first iteration.
How to Atomize
Ask yourself: "Can I explain this task to someone in under a minute?"
If not, it's too big. Break it down further.
A practical workflow:
- First analysis (read-only): "Analyze this task and break it into steps. Don't write code yet."
- Then atomic tasks: "Do only step #1." Add context specific to that step.
- Record progress: Write analysis to a markdown file for reference.
- Checkpoint every 8-12 interactions: Ask for "SUMMARY + 3 TODO" to stay aligned.
This approach feels slower initially. But you'll rarely need to restart from scratch—something that happens constantly with large, vague requests.
3.3 One Example Beats 1000 Words
You can write paragraphs describing what you want. Or you can show AI an example in 30 seconds.
Examples win. Every time.
For Writing Tasks
❌ Without Example: "Write me a LinkedIn article. Make it professional but conversational, direct without being harsh, insightful but practical."
Result: Generic article that sounds like every other AI-generated content.
✅ With Example: "Write me a LinkedIn article about context engineering. Here are my last two articles as style examples [attached]. Match this tone and structure."
Result: Article that sounds like me and follows my established patterns.
For Code Tasks
❌ Without Example: "Write a validation function that follows our coding standards."
Result: AI's guess at what "your standards" means.
✅ With Example: "Write a validation function for phone numbers. Here's our existing email validation [code attached]. Match this pattern—same error handling, same naming conventions."
Result: Consistent code that fits your codebase.
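Here's a hypothetical sketch of what "match this pattern" produces. Both functions are my invention—the attached email validator sets a house convention (return `{ ok, error }`, never throw), and the new phone validator follows it:

```javascript
// Hypothetical "existing" validator that establishes the house pattern:
// validateX(value) returns { ok, error } and never throws.
function validateEmail(value) {
  if (typeof value !== "string" || !value.includes("@")) {
    return { ok: false, error: "invalid email" };
  }
  return { ok: true, error: null };
}

// New validator written to match that pattern — same return shape,
// same naming convention. The 7-15 digit rule is an assumption.
function validatePhone(value) {
  const digits = String(value).replace(/\D/g, "");
  if (digits.length < 7 || digits.length > 15) {
    return { ok: false, error: "invalid phone number" };
  }
  return { ok: true, error: null };
}
```

Without the attached example, AI might return a boolean, throw exceptions, or invent its own error format—each individually reasonable, none consistent with your codebase.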
For Design Tasks
❌ Without Example: "Create a dashboard that looks modern and clean."
Result: Generic interpretation of "modern and clean."
✅ With Example: "Create a dashboard for our analytics page. Here's our existing settings page [screenshot]. Match the component styles, spacing, and color scheme."
Result: Consistent design that fits your application.
What Makes Good Examples
- Specific to the task: Don't attach random previous work—attach work similar to what you need.
- Clear what to copy: If you want them to copy the structure but not the content, say so.
- Recently validated: Use examples that represent your current standards, not legacy code you're trying to move away from.
The principle is simple: show, don't tell. AI learns better from demonstration than description.
3.4 Say What to EXCLUDE
AI is eager to please. Without constraints, it includes everything it thinks might be helpful. This creates bloat, irrelevance, and confusion.
Explicit exclusions solve this.
Research Tasks
❌ Without Exclusions: "Find React admin panel frameworks."
Result: Mix of paid solutions, outdated frameworks, PHP alternatives, and enterprise options you can't afford.
✅ With Exclusions: "Find React admin panel frameworks. Exclude: paid solutions (Retool, Forest Admin), anything not updated in 2024, non-TypeScript options."
Result: Focused list of exactly what you can actually use.
Code Generation
❌ Without Exclusions: "Refactor this function to be cleaner."
Result: AI changes variable names, adds comments, restructures logic, "improves" things you didn't ask to change.
✅ With Exclusions: "Refactor this function into smaller functions. Don't change: business logic, error messages, API response format, variable names outside the refactored functions."
Result: Targeted refactoring that doesn't break existing behavior.
Common Exclusions to Consider
- Don't add comments (unless you want them)
- Don't change naming conventions (match existing)
- Don't add features (only what was requested)
- Don't use external libraries (if you want vanilla solutions)
- Don't explain basic concepts (if you're experienced)
- Don't include deprecated options (current only)
- Don't suggest alternatives (just answer the question)
The rule is simple: AI includes everything unless told otherwise. Be explicit about what you don't want.
3.5 Define What DONE Looks Like
Vague goals produce vague results. "Make it better" means nothing. "Make it pass our test suite" means everything.
Clear success criteria serve two purposes:
- Guide AI's work: It knows what to optimize for.
- Enable your verification: You know when to stop iterating.
What Good Success Criteria Look Like
❌ Vague: "Fix the performance issues."
✅ Specific: "Query must complete in under 500ms with 100k rows. Currently takes 8 seconds."
❌ Vague: "Write good tests."
✅ Specific: "Write tests that cover: happy path, invalid input (empty string, null), edge case (exactly 255 characters—the limit)."
❌ Vague: "Make the code cleaner."
✅ Specific: "Split into functions under 50 lines each. All existing tests must still pass."
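To show what a verifiable criterion buys you, here's a sketch built from the "write good tests" example above: a function with a 255-character limit whose "done" is a set of checks that either pass or fail. The function name and its purpose are assumptions for illustration:

```javascript
// A verifiable "done": each check below passes or it doesn't —
// no subjective judgment needed. The name is assumed for the sketch.
function isValidComment(text) {
  // Must be a non-empty string of at most 255 characters (the limit)
  return typeof text === "string" && text.length > 0 && text.length <= 255;
}

// Success criteria as executable checks: happy path, invalid input
// (empty string, null), and the edge case of exactly 255 characters.
console.assert(isValidComment("hello") === true);
console.assert(isValidComment("") === false);
console.assert(isValidComment(null) === false);
console.assert(isValidComment("a".repeat(255)) === true);
console.assert(isValidComment("a".repeat(256)) === false);
```

When the criteria are this concrete, both you and AI can run them and agree on whether the task is finished.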
The Verifiability Test
Can you check if it's done without subjective judgment?
- "Better performance" → Subjective, not verifiable
- "Under 500ms" → Objective, verifiable
- "Good code" → Subjective, not verifiable
- "All tests pass" → Objective, verifiable
- "Nice documentation" → Subjective, not verifiable
- "Every public function has a docstring" → Objective, verifiable
If your success criterion requires "I'll know it when I see it," it's not specific enough.
Why This Matters
Without clear criteria, you end up in an endless loop of "not quite right" iterations. With clear criteria, both you and AI know exactly when the task is complete.
Done = test passes without errors. Done = loads in under 2 seconds. Done = handles all edge cases in the specification.
This isn't bureaucracy. It's efficiency.
The Preparation Checklist
Before you prompt, run through these five checks:
| Check | Question | If No... |
|-------|----------|----------|
| WHY | Did I explain my priorities? | AI will guess what matters most |
| Atomic | Can I explain this in 1 minute? | Break it down further |
| Example | Did I show what good looks like? | Attach reference work |
| Exclude | Did I say what NOT to do? | AI will include everything |
| Done | Can I verify success objectively? | Make criteria specific |
Five questions. Thirty seconds to answer. Results that actually work on the first try.
Chapter Summary
Key Takeaways:
- Always explain WHY—AI can optimize for your actual priority when it knows the reason
- Break big tasks into atomic steps—each step gets precise context and higher success rate
- Show examples instead of describing—one example beats 1000 words of explanation
- Explicitly exclude what you don't want—AI defaults to including everything
- Define verifiable success criteria—"done = X" enables clear evaluation
Try This: Take your next AI task. Before you send it, check all five boxes: WHY, Atomic, Example, Exclude, Done. Notice how the response changes compared to your usual approach.
Next: Now that you know how to prepare before prompting, let's look at what happens during and after—how to iterate effectively when AI doesn't get it right.
Chapter 4: During and After - Iteration
AI gave you something wrong. Do you send "sorry, I meant..." and hope for the best? Or do you know when to fix, when to restart, and how to actually teach AI what you need?
In this chapter, you'll learn:
- The 2-minute rule that signals when something's wrong
- How to explain errors so AI actually learns
- When context contamination means starting fresh is faster
- The question patterns that get real feedback, not praise
4.1 The 2-Minute Rule
Here's a signal most people miss: if AI takes longer than 1-2 minutes for a usable first response, something is wrong.
This isn't about patience. It's about recognizing that long processing almost always produces bad results. When I see AI doing many operations, searching extensively, or taking forever to generate—I know the output will be off target.
Why does this happen? Usually one of three reasons:
- Task too vague: AI is trying everything because it doesn't know what you actually need
- Task too big: AI is attempting to solve too much at once
- Wrong approach: AI took a path that doesn't match your requirements
What to do when this happens:
Don't wait for the bad result. Stop and adjust.
Instead of: Waiting 5 minutes, getting 500 lines of wrong code, then fixing
Do this: Stop at 2 minutes, ask "What are you working on?" or cancel and clarify your request
The 2-minute rule isn't arbitrary. In my experience, good AI responses for well-defined tasks come quickly—usually in seconds. When they don't, the context is wrong.
Practical Application
- Code generation: Should see relevant output within 30-60 seconds
- Research tasks: Initial results should surface in 1-2 minutes
- Writing tasks: First draft sections should appear quickly
If you're staring at a spinning indicator for 3+ minutes, you're not being patient—you're wasting time on a path that won't work.
4.2 Explain WHY It's Wrong
When AI makes a mistake, most people say "that's wrong" or "try again." This doesn't help. AI learns from context—including your feedback.
The pattern is simple: "This is wrong because X."
Example: JavaScript Instead of TypeScript
❌ Unhelpful correction: "That's wrong. Use TypeScript."
Result: AI might switch to TypeScript but still miss your point.
✅ Helpful correction: "This is wrong because we use TypeScript in this project. All our existing functions have typed parameters and return values. See the example I attached."
Result: AI understands the standard AND why it matters.
Example: Wrong Architecture Decision
❌ Unhelpful: "Don't use Redux."
Result: AI switches to something else, might pick another wrong option.
✅ Helpful: "Don't use Redux because our app is small and we're already using React Context. Adding Redux would be over-engineering for 3 components."
Result: AI understands the constraint AND the reasoning, can suggest appropriate alternatives.
Why This Works
When you explain the reason, that reasoning stays in context. AI can now:
- Apply the same logic to similar decisions
- Avoid repeating mistakes for the same reasons
- Understand your constraints better
"Wrong" tells AI to change something. "Wrong because X" teaches AI to think differently.
4.3 Context Contamination
This is one of the most important and least understood aspects of working with AI.
The problem: When AI generates bad output, that output stays in the context. Even after you say "that's wrong," the bad content is still there. AI can—and often does—reference it later.
Here's what this looks like in practice:
- You ask for an article. AI writes something off-tone.
- You say "That's too formal, be more conversational."
- AI rewrites, but some formal phrases keep sneaking back.
- Why? Because the original formal text is still in context, acting as an invisible reference.
The same happens with code:
- You ask for a function. AI writes 200 lines.
- You say "Too long, simplify."
- AI shortens it, but keeps patterns from the bloated version.
- You iterate 5 times. The original bad approach contaminates every attempt.
When to Start Fresh
The fix-or-restart decision comes down to one question: Does the bad context outweigh the good?
Fix when:
- Output is at least 80% correct
- AI understood the task, just made minor mistakes
- Errors are easy to identify and explain
- Dealing with non-technical tasks where iteration is natural
Start fresh when:
- Output is completely off target
- AI keeps repeating the same mistakes after corrections
- You said "exclude X" but X keeps appearing
- Previous wrong steps are influencing current attempts
- You've iterated 3+ times without significant improvement
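The fix-or-restart criteria above can be sketched as a simple heuristic. This is an illustrative TypeScript snippet, not part of any tool; every field name here is my own invention for the sketch:

```typescript
// Illustrative heuristic only: encodes the fix-or-restart checklist above.
// All names are invented for this sketch, not taken from any real tool.
interface SessionState {
  percentCorrect: number;          // rough estimate of how much output is usable
  repeatedMistakes: boolean;       // AI keeps making the same error after corrections
  excludedContentReturns: boolean; // you said "exclude X" but X keeps appearing
  iterationsWithoutProgress: number;
}

function shouldStartFresh(s: SessionState): boolean {
  // Start fresh when the bad context outweighs the good.
  if (s.repeatedMistakes) return true;
  if (s.excludedContentReturns) return true;
  if (s.iterationsWithoutProgress >= 3) return true;
  // Otherwise, fix in place when output is at least ~80% correct.
  return s.percentCorrect < 80;
}
```

The exact threshold numbers matter less than having an explicit rule; the point is to stop renegotiating the decision every time.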
How to Start Fresh Properly
Don't just start a new session and repeat your original request. That'll produce the same bad result.
Instead:
"I'm working on [task]. Here's what I have so far [paste current good content]. I want to continue, but with these changes: [what you learned from failed attempts]."
Or even better:
"Here's my article draft [paste]. It's unfinished. Continue writing, but: don't use passive voice, keep paragraphs under 4 sentences, match the tone of the first section."
You're giving AI a clean context with explicit guidance based on what went wrong before.
4.4 Ask for Opinion, Not Validation
Here's an uncomfortable truth: current AI models are tuned to be agreeable. If you ask "Is my plan good?", you'll get praise—whether the plan deserves it or not.
This isn't AI being deceptive. It's AI being helpful in the way it was trained to be. The problem is you wanted honest feedback.
The Validation Trap
❌ Validation question: "Is my database schema good?"
Typical response: "Your schema looks well-structured! The relationships are clear and..."
What you learned: Nothing useful.
Better Question Patterns
✅ Opinion question: "What problems do you see with this database schema?"
Typical response: "I see a few potential issues: the Users table might benefit from an index on email for faster lookups, the many-to-many relationship could cause..."
What you learned: Actual problems to consider.
✅ Role play technique: "You are a database architect with 15 years of experience. Review this schema and tell me what you'd change."
Typical response: Direct technical feedback from the perspective of an expert.
Question Patterns That Work
| Instead of... | Ask... |
|---------------|--------|
| Is this code good? | What would you change in this code? |
| Is my plan solid? | What are the weakest parts of this plan? |
| Did I cover everything? | What am I missing? |
| Is this approach correct? | What alternative approaches should I consider? |
| Do you like this design? | What would break if we shipped this? |
The pattern: Ask for criticism, not confirmation.
Role Play for Different Fields
The role play technique works across domains:
- Code review: "You are a senior engineer doing a code review. What feedback would you give?"
- Marketing: "You are a marketing director seeing this campaign for the first time. What concerns would you raise?"
- Writing: "You are an editor with no patience for fluff. What would you cut from this article?"
- Architecture: "You are a solutions architect who has to maintain this system for 5 years. What worries you?"
The role gives AI permission to be critical. Use it.
4.5 Editing Prompts Mid-Work
Here's a technique that changed how I work with AI: edit your original prompt instead of sending correction messages.
The Problem with Corrections
When you discover something mid-conversation, the natural instinct is to send a follow-up:
"Actually, I meant just the first form, not both."
"Sorry, I forgot to mention we need TypeScript."
"Wait, also exclude the deprecated options."
Each correction message adds noise to the context. AI now has:
- Your original (incomplete) request
- Its response based on that request
- Your correction
- More context to parse
This creates unnecessary complexity and intermediate steps.
The Better Approach
Edit your original prompt to include the new information. Then continue from the improved starting point.
Before (original prompt): "Add validation to the form."
AI reveals: There are actually 2 form implementations.
After (edited prompt): "Add validation to the registration form (the one in SignupPage.tsx, not the contact form)."
Now AI works with complete information from the start. No wasted intermediate steps.
When This Works
Some tools support this better than others:
- Zed + Claude: Excellent support for editing prompts mid-conversation
- ChatGPT: You can edit messages, but it restarts the conversation from that point
- Claude (web): Similar to ChatGPT—edit resets from that point
- API-based tools: Depends on implementation
The principle applies everywhere: complete information upfront beats corrections later.
For Tools Without Edit Support
If you can't edit, start a new session with the improved prompt. It's often faster than trying to correct course mid-conversation.
New session: "Add validation to the registration form in SignupPage.tsx (not the contact form). Use the same validation pattern as our existing email validator."
Clean context, complete information, better results.
Long Session Management
One more practical technique for extended work sessions:
Every 8-12 interactions, ask for a summary:
"Summarize what we've accomplished and list the 3 most important remaining TODOs."
This does three things:
- Confirms AI still understands the goal
- Clears confusion from accumulated context
- Gives you a checkpoint to restart from if needed
It's like saving your game. If the conversation goes sideways, you have a clean summary to start fresh with.
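If you drive a model through an API rather than a chat UI, the checkpoint rhythm can be automated. A minimal sketch, assuming you control the message loop yourself; the class name, the default interval, and `SUMMARY_PROMPT` are my own choices, not from any library:

```typescript
// Minimal checkpoint helper for an API-driven chat loop (illustrative only).
const SUMMARY_PROMPT =
  "Summarize what we've accomplished and list the 3 most important remaining TODOs.";

class CheckpointTracker {
  private interactions = 0;

  // interval chosen to match the "every 8-12 interactions" rule of thumb
  constructor(private readonly interval: number = 10) {}

  // Call once per exchange; returns the summary prompt when a checkpoint is due.
  next(): string | null {
    this.interactions += 1;
    return this.interactions % this.interval === 0 ? SUMMARY_PROMPT : null;
  }
}
```

In a real loop you would send the returned prompt as the next user message and keep the model's summary as your restart point.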
Chapter Summary
Key Takeaways:
- 2-minute rule: Long processing is a signal to stop and adjust, not wait patiently
- Explain WHY: "Wrong because X" teaches AI; "that's wrong" doesn't
- Context contamination: Bad output pollutes future responses—sometimes fresh is faster
- Ask for criticism: "What would you change?" beats "Is this good?"
- Edit, don't correct: Update original prompts instead of sending follow-up corrections
Try This: Next time AI gives you something wrong, try two approaches: (1) Simply say "try again" and (2) Explain exactly why it's wrong. Compare the responses. Notice how specific feedback produces specific improvements.
Next: Now that you know how to prepare context and iterate effectively, let's look at the specific tools that make this workflow practical.
Chapter 5: Tools and Workflows
People ask me which AI tool is best. Wrong question. The right question is: which tool is best for what?
In this chapter, you'll learn:
- How to divide tools by task type (not by "best overall")
- A research workflow that feeds results into working tools
- The coding workflow with context management
- Which tools combine well and which don't
5.1 Choosing the Right Tool
I use three main AI tools daily:
- Perplexity — roughly 90% of my searches
- Zed + Claude — programming and technical things
- ChatGPT — non-technical things, household, garden
This division isn't random. It developed naturally, based on where each tool worked best for me.
The Specialization Principle
Each tool has strengths. Instead of finding "the best," find what's best for each task type:
| Task Type | Best Tool | Why |
|-----------|-----------|-----|
| Research, information gathering | Perplexity | Quick summaries with sources, easy to extract key parts |
| Programming, technical work | Claude (via Zed) | Better at code, can manage file context, edit prompts mid-work |
| Writing, home projects, kids | ChatGPT | Great at conversational tasks, explanation, brainstorming |
| Complex analysis | Claude | Handles nuance, longer context, technical depth |
Why Not Just Pick One?
You could use ChatGPT for everything. Or Claude. Many people do.
But you'll hit friction:
- ChatGPT for research means you miss Perplexity's source links and quick summaries
- Claude for home projects works but isn't optimized for conversational Q&A
- Perplexity for coding doesn't have the context management you need
Specialization eliminates friction. Each tool does exactly what it's best at.
What About Alternatives?
GitHub Copilot, Cursor, and similar tools exist. I've tried them. They work differently—more autocomplete-focused or with different context philosophies. The principles in this guide apply regardless of tool. If Cursor works better for you, use Cursor. The context engineering matters more than the specific tool.
5.2 Research Workflow
Most of my research starts with Perplexity. Here's the workflow:
Step 1: Short Question with Context (5-10 seconds)
Don't overthink it. Quick question, relevant constraints.
"Best TypeScript ORM 2024, supports PostgreSQL, active development"
Step 2: Get Summary + Links
Perplexity returns a summary with source links. Scan it.
Step 3: Verify Dates and Primary Sources
Important: Perplexity sometimes:
- Lacks latest data (training cutoff)
- "Invents" sources that don't actually exist
- Gives superficial answers on technical topics
Click through to verify critical claims. If something seems off, check the source.
Step 4: Take Only Relevant Parts
Don't copy the entire response. Extract what matters for your next step:
- Key facts
- Specific recommendations
- Source links you verified
Step 5: Insert as Context into Next Tool
Take your extracted research and feed it into Claude or ChatGPT for the actual work.
"I researched TypeScript ORMs. Best options for PostgreSQL in 2024 are Prisma, Drizzle, and TypeORM. [Paste relevant comparison]. I need to add database layer to my NestJS app. Which would you recommend given [my constraints]?"
Why This Two-Step Process?
Perplexity is excellent at gathering and summarizing. Claude and ChatGPT are excellent at working with that information. Using each for what it's best at produces better results than forcing one tool to do both.
Perplexity Limitations
Know these going in:
- Often lacks latest data — verify dates for anything time-sensitive
- Sometimes invents sources — always click through for important claims
- Superficial on deep technical topics — good for overview, use Claude for depth
Perplexity is a starting point, not an authority.
5.3 Coding Workflow
For technical work, I use Zed + Claude. The combination matters because of how Zed handles context.
Why Zed + Claude?
Zed's advantages:
- Excellent UX for context management
- Can edit prompts during execution (not just send corrections)
- Easy to add/remove files from context
- Claude model is "intelligent"—can find relevant files on its own
This means I can:
- Describe a task
- Let Claude find relevant files
- See it working with correct context
- Edit my prompt mid-work if I realize I forgot something
Typical Coding Session
- Task from Linear/Jira: "Fix: user can't save profile form"
- Give to Claude: Describe issue, attach relevant files (or let Claude find them)
- Claude analyzes: May reveal things I didn't know (e.g., "there are 2 form implementations")
- Edit prompt: Add the new information without sending a correction message
- Get solution: Claude works with complete context from the start
When Claude Fails
Even with good context management, Claude struggles with:
- Files with similar names in different directories — be explicit about which file
- Too much context (more than 3-4 files) — reduce scope, focus on relevant files
- Unknown library versions — specify version if behavior differs between versions
When I see these patterns, I don't fight them. I reduce scope or make the request more explicit.
Alternative Setups
If you don't use Zed:
- VS Code + Claude extension — works but different context model
- Claude web interface — paste files manually, less seamless
- API directly — maximum control, more setup required
The principles stay the same: good context management, ability to adjust mid-work, file relevance.
5.4 Tool Combinations
Some tools work together. Others don't.
What Works
Perplexity + ChatGPT
- Research via Perplexity (summaries, links)
- Processing and writing via ChatGPT
- Use case: Researching topic for article, then writing it
Perplexity + Claude
- API documentation via Perplexity (quick lookup)
- Implementation via Claude (actual code)
- Use case: Learning new library, then implementing
What Doesn't Make Sense
ChatGPT + Claude together
They solve the same problems differently. Using both for one task:
- Doubles effort
- Creates conflicting approaches
- No clear benefit
Pick one for each task type. Don't mix for the same task.
The Extraction Pattern
The common thread in good combinations: extraction.
- Tool A gathers or creates something
- You extract the relevant parts
- Tool B works with those parts
You're the filter. You decide what transfers between tools. This prevents context bloat and ensures each tool works with focused, relevant information.
5.5 Tool Selection Advice
If you're just starting with AI tools, here's what I've learned:
1. Start with One
Don't subscribe to everything at once. Pick one tool based on your main task type:
- Lots of research? Start with Perplexity
- Mostly coding? Start with Claude or Copilot
- General productivity? Start with ChatGPT
2. Give It a Month
You need time to develop habits. A week isn't enough to find workflow patterns. A month lets you discover what works and what doesn't.
3. Look for Fit
Not every tool suits everyone. If ChatGPT feels awkward after a month of genuine use, try Claude. If Perplexity doesn't match how you research, try traditional search with ChatGPT for synthesis. There's no wrong answer.
4. Don't Pay Immediately
Most tools have free tiers or trials. Use them first. Only pay when you hit limits that actually matter for your work.
5. Specialize
Once you have multiple tools, give each a purpose. "This tool for this, that tool for that." Specialization creates muscle memory. You stop thinking about which tool to use.
5.6 Cost Reality
Here's what I actually pay:
| Tool | Cost | How I Use It |
|------|------|--------------|
| Perplexity Pro | Free (via Revolut Premium package) | 90% of my searches |
| Zed Pro | ~$50-100/month (usage-based) | All technical work |
| ChatGPT Plus | $20/month | Non-technical, home, kids |
| Total | ~$70-120/month | |
Is It Worth It?
The ROI is huge. I save hours weekly. Not "I feel more productive" hours—actual measurable time:
- Research that took 45 minutes now takes 10
- Boilerplate code I used to write manually now takes seconds
- Documentation I dreaded now flows quickly
$100/month to save 10+ hours/week? Easy decision.
Your Mileage May Vary
If you code occasionally, you might not need Zed Pro. If you don't research much, free Perplexity might be enough. Start free, pay when you hit real limits.
Chapter Summary
Key Takeaways:
- No single "best" tool—specialize each for different task types
- Research workflow: Perplexity → extract relevant → feed into working tool
- Coding workflow: Claude/Zed with context management, edit prompts mid-work
- Good combinations: Perplexity + ChatGPT, Perplexity + Claude; never ChatGPT + Claude for same task
- Start with one tool, give it a month, find your fit before adding more
Try This: For your next research task, try the two-step flow: Perplexity for initial research and summary, then extract the relevant parts and feed into ChatGPT or Claude for the actual work. Notice how separating "gather" from "work" improves both steps.
Next: Now that you have the tools and techniques, let's see context engineering in action with detailed real-world examples.
Part 3: Real-World Application
Chapter 6: Practical Examples
Enough theory. Let me show you exactly how context engineering looks in practice—six real examples with before and after comparisons.
In this chapter, you'll see:
- Debugging a production bug (2 minutes vs 2 hours)
- Selecting technology for a project
- Optimizing SQL from 8 seconds to 0.3 seconds
- Generating documentation that matches your style
- Refactoring legacy code safely
- Personal projects: garden advice and solar panel analysis
6.1 Debugging Production Bug
The Bad Way
❌ Task: "User can't edit profile, fix it"
Result: AI generates 200 lines of generic checks—authentication, permissions, database connections, form validation. None of them solve the actual problem. You spend 2 hours reviewing irrelevant code.
The Good Way
✅ Task:
## Problem
User ID: 12345 can't edit profile via UI

## Context
- Support verified: user has 'admin' role in database
- Works: API endpoint PUT /api/profile/:id (tested manually)
- Doesn't work: "Edit" button in ProfileView.tsx
- Console error: "Permission denied at ProfileView.tsx:156"
- File: src/components/ProfileView.tsx, lines 150-160

## Goal
Edit button must work for users with 'admin' role

## Possible Causes to Check
1. How permissions are validated in ProfileView
2. If user role loads correctly from state
3. Difference between API permission check vs UI permission check

Result: AI finds exact problem in 2 minutes—UI was checking for 'editor' role instead of 'admin'. Single line fix.
What Made the Difference
- Verified facts from support, not assumptions
- Specific file and line numbers where error occurs
- What works vs doesn't (API yes, UI no) narrows scope
- Directions to explore rather than vague "fix it"
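To make that single-line fix concrete, here's what this class of bug typically looks like. The code below is a hypothetical reconstruction (the guide doesn't show the real ProfileView source); only the shape of the mistake is the point:

```typescript
// Hypothetical reconstruction of the ProfileView permission check (not real code).
type Role = "admin" | "editor" | "viewer";

// Buggy version: the UI checked only for 'editor', so 'admin' users
// hit the "Permission denied" branch even though the API allowed them.
function canEditBuggy(role: Role): boolean {
  return role === "editor";
}

// Fixed version: allow every role the product actually intends to edit.
function canEdit(role: Role): boolean {
  return role === "admin" || role === "editor";
}
```

Notice that good context didn't make the fix clever; it made the fix findable.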
6.2 Technology Selection
The Bad Way
❌ Task: "What's the best framework for admin panel?"
Result: Generic list of frameworks with pros/cons from 2023. Half are paid, some require PHP, none match your actual constraints.
The Good Way
✅ Task:
## Project Context
- Internal admin app for 20 users
- Mainly CRUD operations + reports
- Team: 2 developers (React experience)
- Timeline: MVP in 1 month
- Integration: Existing REST API (NestJS)
- Budget: Minimal (prefer open source)

## Requirements
- Fast development (components out-of-box)
- TypeScript support
- Good documentation
- Active community (2024+)

## Exclude
- Paid solutions (Retool, Forest Admin)
- PHP frameworks
- Anything requiring new language

## Expected Output
Top 3 options with:
- Time to MVP estimate
- Specific CRUD components available
- Link to starter template

Result: Specific comparison of React Admin, Refine, and Ant Design Pro—exactly what the team can use, with starter template links and realistic MVP timelines.
What Made the Difference
- Team context (React experience, 2 developers)
- Real constraints (timeline, budget, existing API)
- Explicit exclusions (no paid, no PHP, no new languages)
- Specified output format (what you actually need)
6.3 SQL Optimization
The Bad Way
❌ Task: "Optimize this SQL query: SELECT * FROM orders WHERE status = 'pending'"
Result: AI suggests adding an index on status. You add it. Query is still slow. AI doesn't know why because it doesn't know your data.
The Good Way
✅ Task:
## Query
SELECT o.*, u.name, u.email, p.name as product
FROM orders o
JOIN users u ON o.user_id = u.id
JOIN products p ON o.product_id = p.id
WHERE o.status = 'pending'
AND o.created_at > NOW() - INTERVAL '30 days'

## Environment
- PostgreSQL 14
- orders: 2M rows
- users: 100k rows
- products: 5k rows

## Current Indexes
- orders(status)
- orders(created_at)

## EXPLAIN ANALYZE Output
[paste actual output]

## Constraints
- Can't change schema (production database)
- Can't add materialized view (policy restriction)
- Query runs every 10 seconds (dashboard refresh)

## Goal
Query under 2 seconds (currently 8 seconds)

Result: AI suggests composite index on (status, created_at), reorders WHERE conditions, and removes unnecessary SELECT columns. Query drops from 8 seconds to 0.3 seconds.
What Made the Difference
- Actual data volumes (2M rows changes everything)
- Existing indexes (no point suggesting what's already there)
- EXPLAIN ANALYZE output (AI sees actual query plan)
- Hard constraints (can't change schema)
- Specific target (under 2 seconds, not "faster")
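The key suggestion is worth writing out. This is a sketch of what the composite index might look like, assuming the table and column names from the query above; verify against your own EXPLAIN ANALYZE output before applying anything to production:

```sql
-- Composite index matching the WHERE clause: the equality column first,
-- then the range column, so Postgres can use both in a single index scan.
CREATE INDEX CONCURRENTLY idx_orders_status_created_at
  ON orders (status, created_at);

-- Once the composite index exists, the single-column index on
-- orders(status) may become redundant for this query and could be dropped.
```

`CREATE INDEX CONCURRENTLY` avoids locking writes on the table, which matters for a live production database.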
6.4 Documentation Generation
The Bad Way
❌ Task: "Write API documentation for user endpoint"
Result: Generic documentation in whatever format AI chooses. Doesn't match your existing docs. Wrong style, wrong structure, missing details you need.
The Good Way
✅ Task:
## Endpoint
POST /api/v2/users/bulk-import

## Implementation
[paste endpoint code]

## Format
- OpenAPI 3.0 specification
- Style: Like existing docs [paste example from another endpoint]

## Audience
External developers integrating with our API

## Auth
Bearer token (already documented elsewhere)

## Specifics to Include
- Max 1000 users per request
- Rate limit: 10 requests/minute
- Async processing (returns job_id)
- Validation rules: [paste from code]

## Generate
1. OpenAPI spec
2. Request/response example
3. Error codes table
4. curl example

Result: Documentation ready to paste into Swagger. Matches existing style. Includes all edge cases and errors.
What Made the Difference
- Format specification (OpenAPI 3.0, not random format)
- Style example (AI sees what yours looks like)
- Specific details (rate limits, validation rules)
- Structured output request (exactly what you need)
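For reference, here is the kind of skeleton you'd expect back. This fragment is hypothetical (the real endpoint isn't shown in the guide), sketched against the OpenAPI 3.0 structure:

```yaml
# Hypothetical OpenAPI 3.0 fragment for the bulk-import endpoint (illustration only).
paths:
  /api/v2/users/bulk-import:
    post:
      summary: Import up to 1000 users asynchronously
      security:
        - bearerAuth: []
      responses:
        "202":
          description: Accepted; processing is async and returns a job_id
          content:
            application/json:
              schema:
                type: object
                properties:
                  job_id:
                    type: string
        "429":
          description: Rate limit exceeded (10 requests/minute)
```

When you provide a style example from your existing docs, the generated spec slots into this skeleton instead of inventing its own conventions.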
6.5 Legacy Code Refactoring
The Bad Way
❌ Task: "Refactor this function to be cleaner"
Result: AI randomly splits function, renames variables, changes structure. Breaks business logic you didn't know was embedded in there. Tests fail. You spend hours figuring out what changed.
The Good Way
✅ Task:
## Function
[paste processOrder function - 800 lines]

## Environment
- TypeScript 4.9
- Express + TypeORM
- All existing tests pass

## Problem
Function is 800 lines. Nobody wants to touch it. We need to add a feature but can't understand the code.

## Existing Tests
[paste test file]

## Refactoring Goals
1. Split into smaller functions (max 50 lines each)
2. Keep all functionality identical
3. Add TypeScript types where missing
4. Extract magic numbers to constants

## DO NOT Change
- Business logic
- Database table names
- API response format
- Error messages (backwards compatibility)

## Process
1. First: Identify independent parts of the function
2. Then: Propose split structure (don't code yet)
3. Wait for my approval before implementing
4. Implement with tests proving behavior unchanged

Result: AI identifies 6 logical sections, proposes clean split, waits for approval, then implements. All original tests pass. New structure is readable and maintainable.
What Made the Difference
- Existing tests included (AI knows what must keep working)
- Explicit preservation rules (what must NOT change)
- Step-by-step with approval gates (not one giant change)
- Specific goals (50 lines max, not just "cleaner")
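One of the refactoring goals, extracting magic numbers, deserves a tiny illustration. The values and function names below are invented for the example, not taken from the real processOrder:

```typescript
// Before: magic numbers buried in the logic (invented example values).
function shippingCostBefore(weightKg: number): number {
  return weightKg > 20 ? weightKg * 1.5 + 10 : weightKg * 1.5;
}

// After: identical behavior, but every number has a name a reviewer can check.
const RATE_PER_KG = 1.5;
const HEAVY_THRESHOLD_KG = 20;
const HEAVY_SURCHARGE = 10;

function shippingCost(weightKg: number): number {
  const base = weightKg * RATE_PER_KG;
  return weightKg > HEAVY_THRESHOLD_KG ? base + HEAVY_SURCHARGE : base;
}
```

This is exactly the kind of change the "behavior unchanged" requirement protects: the existing tests should pass for both versions.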
6.6 Personal Projects
Context engineering isn't just for work. Here are two examples from home.
Garden Advice
We were new to gardening. Instead of generic "how to garden" questions:
✅ Task:
## Situation
New house, first garden. Zone 6b, clay-heavy soil.

## Want to Plant
- Tomatoes (beefsteak variety)
- Peppers (bell)
- Herbs (basil, mint, rosemary)

## Questions
1. How deep to plant each?
2. Spacing between plants?
3. How often to water in first month?
4. Which plants should NOT be near each other?
5. When to expect first harvest?

## Format
Table with each plant and specific instructions

Result: Detailed planting guide specific to our zone and soil type. Everything grew. Nothing died.
Solar Panel Analysis
Instead of "Should I buy solar panels?":
✅ Task:
## My Consumption Data
[CSV export from utility company - 12 months, hourly readings]

## House Details
- Location: Western Slovakia
- Roof: South-facing, 45° angle, 80m² usable
- Current tariff: D2 dual rate (0.15€ day / 0.08€ night)
- Annual consumption: 4500 kWh

## Questions
- Will it pay off? In how many years?
- Compare: with vs without battery storage
- Compare: 3kWp vs 5kWp vs 7kWp systems

## Calculate for Each Scenario
- Total investment cost (local installers)
- Annual savings
- Payback period (years)
- Self-sufficiency percentage

## Output
- Comparison table of all scenarios
- Monthly production vs consumption chart
- Clear recommendation with reasoning

Result: AI analyzed actual consumption patterns, showed 5kWp + 5kWh battery had 7-year ROI, created interactive charts to visualize different scenarios. Made the decision easy.
What Made These Work
- Real data (actual consumption, specific location)
- Specific constraints (budget, roof size, zone)
- Clear questions (not "should I..." but specific comparisons)
- Requested format (tables, charts, comparison)
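The core payback arithmetic behind an analysis like this is simple enough to sketch. The figures below are placeholders, not the installer quotes from the example:

```typescript
// Simple payback-period sketch (placeholder figures, not the real analysis).
// Ignores degradation, tariff changes, and financing for clarity.
function paybackYears(investmentEur: number, annualSavingsEur: number): number {
  return investmentEur / annualSavingsEur;
}

// e.g. a hypothetical 5kWp + battery system:
const years = paybackYears(7000, 1000); // 7 years at these assumed figures
```

The value of handing AI real consumption data is that it can estimate `annualSavingsEur` per scenario instead of you guessing it.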
The Pattern Across All Examples
Every successful example shares these elements:
| Principle | What It Means | Example |
|-----------|---------------|---------|
| Specificity | The more precise, the better | File:line, data volumes, exact constraints |
| Constraints | What MUST NOT change | Business logic, table names, error messages |
| Examples | Show what you expect | Style reference, format template |
| Step-by-step | For complex tasks, request approval | Propose first, implement after confirmation |
| Verifiability | How you'll know it worked | Under 2 seconds, tests pass, ROI calculation |
Before every task, ask: "Could a junior developer who started yesterday complete this with just this information?"
If the answer is no, add context until the answer is yes.
Chapter Summary
Key Takeaways:
- Specificity wins—file names, line numbers, data volumes, constraints
- State what must NOT change—business logic, formats, backwards compatibility
- Show, don't describe—style examples, format templates, existing code
- Complex tasks need approval gates—propose before implementing
- Define verifiable success—specific numbers, passing tests, clear criteria
- This works everywhere—debugging, documentation, refactoring, and your garden
Try This: Take a task you're struggling with. Apply the debugging example format: Problem, Context, Goal, Possible Causes. Even if it's not a bug, the structure forces you to provide the information AI needs.
Next: These examples show individual context engineering. Now let's scale up—how do you bring these practices to an entire team?
Chapter 7: Context Engineering for Teams
"AI is hype, it doesn't work." I've heard this from teams who bought licenses, did 2-hour training, and expected magic. Here's what they missed.
In this chapter, you'll learn:
- Why most teams fail with AI (and how to avoid it)
- How documentation becomes your competitive advantage
- A 6-week implementation roadmap
- What each role (PO, Dev, QA, PM) should do differently
- What to measure and what to ignore
7.1 Why Teams Fail with AI
I've seen this pattern multiple times:
- Company buys ChatGPT Teams licenses
- Does 2-hour training on "how to prompt"
- Everyone tries it their own way
- After a month, nobody uses it
- Conclusion: "AI is hype, it doesn't work"
The problem isn't the technology. It's the approach.
AI isn't Excel. You can't just learn a few functions and expect productivity gains. You need to change how you work—how tasks are described, how information flows, how quality is verified.
The Missing Piece: Systematic Approach
Individual AI use is easy: one person, one tool, learn as you go.
Team AI use requires:
- Shared standards for task descriptions
- Consistent context formats
- Review processes for AI output
- Metrics that matter
Without these, you get chaos. Everyone "prompts" differently. Some get good results, most don't. Nobody shares what works. The tool gets abandoned.
The Real Challenge
Technical setup is trivial. Cultural change is hard.
You're not implementing a tool. You're changing how people describe work, document decisions, and verify quality. That takes time, leadership, and patience.
7.2 Documentation as Context
Here's an uncomfortable truth: "Everyone knows how feature X should work" won't be acceptable anymore.
When AI joins the conversation, tribal knowledge becomes a liability. If context only exists in people's heads, AI can't use it. And neither can new team members, remote colleagues, or anyone not in the original discussion.
What to Document
Every task needs:
| Component | What It Means | Example |
|-----------|---------------|---------|
| Task description | What needs to happen | Not "fix login" but "users with 2FA enabled can't log in after password reset" |
| Expected behavior | What success looks like | "User can log in within 5 seconds of password reset confirmation" |
| Decision context | Why we're doing it this way | "We chose JWT over sessions because of the mobile app requirement" |
| Technical constraints | What must not change | "Auth flow must remain compatible with v1 API" |
Documentation Pays Twice
Good documentation serves two audiences:
- AI — gets the context it needs to help effectively
- Humans — new team members, future you, anyone who wasn't in the meeting
The effort is the same. The value doubles.
The Template for Teams
## Problem
[What broke / what needs to be done]
## Context
- Files: [specific files and lines if applicable]
- History: [relevant previous changes or attempts]
- Constraints: [what must not change]
## Goal
[Clear success criterion—when is it done]
## Possible Solutions
1. [First option to explore]
2. [Second option]
3. [Third option]
## Tests/Verification
[How we verify it works]
This template works in Linear, Jira, GitHub issues, or plain markdown. The format matters less than the completeness.
7.3 Implementation Roadmap
Don't roll out AI to everyone at once. Here's a 6-week plan that works:
Week 1-2: Pilot
Goal: Test the approach with low risk.
- Choose one small, non-critical project
- 2-3 people, not the entire team
- Document everything that works and doesn't
- Focus on learning, not productivity
What to pilot:
- Bug fixes (clear success criteria)
- Documentation generation
- Code review assistance
Avoid in pilot:
- Critical features
- Customer-facing work
- Anything with hard deadlines
Week 3-4: Standardization
Goal: Create repeatable processes.
- Develop task templates based on pilot learnings
- Define when AI helps vs when it doesn't
- Set code review rules for AI-generated code
- Create simple guidelines (not a 50-page manual)
Key deliverables:
- Task template (like the one above)
- Review checklist for AI output
- Decision tree: When to use AI, when not to
Week 5-6: Scaling
Goal: Expand while maintaining quality.
- Roll out to additional projects
- Training for entire team (based on what actually worked in pilot)
- Introduce metrics tracking
- Identify and address bottlenecks
Common scaling issues:
- Some people adopt quickly, others resist
- Different projects have different needs
- QA becomes bottleneck (more on this later)
7.4 Roles and Responsibilities
Each role interacts with AI differently. Here's what changes:
Product Owner
New responsibilities:
- Write detailed user stories with context (not just titles)
- Define clear success criteria (not "it should work")
- Prioritize which tasks benefit from AI assistance
- Ensure technical constraints are documented
Mindset shift:
- From: "Developers will figure out the details"
- To: "I provide context that enables faster, better work"
Developer
New responsibilities:
- Break tasks into atomic parts before starting
- Provide technical context (files, versions, patterns)
- Review AI outputs critically (it's not always right)
- Share what works with the team
Mindset shift:
- From: "I write all the code"
- To: "I collaborate with AI and verify quality"
QA/Tester
New responsibilities:
- Define test scenarios upfront (not after development)
- Verify AI-generated code meets requirements
- Create test cases AI can use
- Flag patterns where AI consistently fails
Mindset shift:
- From: "I find bugs after development"
- To: "I define quality criteria that guide development"
George Arrowsmith wrote: "QA is about to become a huge bottleneck in software development. AI lets us churn out HUGE amounts of code extremely fast, but you still need to make sure it works."
He's right. Fast generation without quality verification creates technical debt. QA's role becomes more important, not less.
Project Manager
New responsibilities:
- Coordinate documentation efforts
- Track productivity metrics (the right ones)
- Identify bottlenecks in AI workflows
- Facilitate knowledge sharing
Mindset shift:
- From: "I track tasks and deadlines"
- To: "I ensure context flows where it's needed"
7.5 Measuring Success
You need metrics. But the wrong metrics create wrong incentives.
What to Track
| Metric | Why It Matters |
|--------|----------------|
| Lead time | Time from task creation to done (overall efficiency) |
| Bugs per feature | Quality indicator (AI doesn't mean more bugs) |
| Time spent on review | Review burden (should decrease over time) |
| Team satisfaction | People's experience (frustrated teams don't sustain change) |
What NOT to Track
| Metric | Why It's Misleading |
|--------|---------------------|
| Number of AI uses | More isn't better; right uses matter |
| Lines of code generated | Volume isn't value |
| "Time saved" | Hard to measure accurately; becomes a vanity metric |
| AI vs manual comparison | Creates false competition |
Real Improvements We've Seen
| Task | Before AI | After AI |
|------|-----------|----------|
| Bug investigation | 2 hours searching | 5 min AI finds + 30 min verify/fix |
| Feature development | 3 days | 1 day generation + 1 day review |
| Documentation | Nobody writes it | AI draft + 15 min human edit |
These aren't guaranteed—they depend on context quality. But they're achievable with proper implementation.
7.6 Cultural Change
The hardest part isn't technical. It's changing how people think.
Mindset Shifts Required
| From | To |
|------|-----|
| "AI is a threat to my job" | "AI is a tool that amplifies my work" |
| "I do everything myself" | "I collaborate with AI on appropriate tasks" |
| "Documentation is a waste of time" | "Documentation is an investment in efficiency" |
| "I know best, no need to explain" | "Clear context helps everyone, including me" |
How to Achieve Cultural Change
1. Show Quick Wins: Start with tasks where AI clearly helps (documentation, boilerplate code, research). Visible success builds momentum.
2. Reward Early Adopters: Recognize people who experiment and share learnings. Make them champions, not anomalies.
3. Share Success Stories: When something works, tell everyone. Specific examples beat generic encouragement.
4. Be Patient: Cultural change takes months, not weeks. Expect resistance. Plan for gradual adoption.
5. Address Fears Directly: "Will AI take my job?" is a real concern. Answer honestly: AI amplifies capable people. It doesn't replace judgment, creativity, or domain expertise.
For Non-Technical Teams
Context engineering isn't just for developers. Writers, marketers, and analysts can apply the same principles.
If you're writing articles, reports, or documentation:
- Use structured formats (Markdown, separate files per section)
- Version control your work (Git works for text, not just code)
- Use AI-integrated editors instead of chat interfaces
- Split large documents into manageable pieces
- Provide style examples (previous work, tone guides)
- Export with automation (one command to Word, PDF, whatever you need)
The principles are identical: clear context, atomic tasks, examples, constraints, success criteria. The tools differ; the approach doesn't.
Tools by Team Size
Small Teams (2-5 people)
- Shared prompts in a doc or wiki
- Git for prompt versioning
- Slack/Discord channel for sharing learnings
- No formal process needed—communicate directly
Medium Teams (5-20 people)
- Linear/Jira with custom fields for context
- Central prompt/template repository
- Formal code review process for AI output
- Regular retrospectives on what works
Large Teams (20+ people)
- Consider dedicated "Context Engineer" role
- Custom tooling and integrations
- Automated quality checks
- Documentation as first-class deliverable
Chapter Summary
Key Takeaways:
- Teams fail with AI due to lack of systematic approach, not the technology
- Documentation becomes critical—"everyone knows" doesn't scale to AI or growing teams
- Start small: pilot with 2-3 people, standardize what works, then scale
- Every role changes: PO provides context, Dev collaborates, QA defines quality, PM coordinates
- Track the right metrics: lead time and quality, not "AI usage" or "lines generated"
- Cultural change is the hardest part—show quick wins, be patient, address fears
Try This: Pick one small project for a 2-week pilot. Use the task template for every task. Document what works and what doesn't. At the end, you'll have concrete data on whether (and how) to expand.
Next: We've covered individual and team context engineering. Now let's address the elephant in the room—"vibe coding" and why quick fixes aren't enough.
Part 4: The Bigger Picture
Chapter 8: Vibe Coding vs Context Engineering
Four months of work. Gone. No backups. No Git. Just AI and vibes. This is Vibe Coding's reality check.
In this chapter, you'll learn:
- What "vibe coding" actually looks like in practice
- The real horror stories behind the hype
- When vibe coding works (yes, it has its place)
- Why parallel AI agents are mostly fantasy
- The QA bottleneck nobody talks about
8.1 The Vibe Coding Reality
There's a lot of hype around "Vibe Coding" right now. For those unfamiliar, here's the process:
- Give a prompt: "Program function A that does B"
- AI generates code
- IDE/editor creates application with button press
- Some services automatically publish your code
Almost zero programming knowledge needed. Sounds like a revolution.
Here's what the revolution actually looks like.
Horror Story #1: Four Months Gone
A programmer worked for four months using Vibe Coding. Then AI deleted his entire project. He had no version control and no backups. No Git, nothing. Four months of work—gone.
How does this happen? When you don't understand the code, you don't understand the risks. When AI is your only interface to the system, AI mistakes become your mistakes—with no safety net.
Horror Story #2: The Hacked App
Another developer created a web application via Vibe Coding. Someone hacked it, which led to unexpected bills from overloaded APIs.
Imagine this happening at work. Or worse—when you're 17 using your mom's credit card.
The code "worked." It just wasn't secure. Vibe coding doesn't teach you about SQL injection, authentication flaws, or rate limiting. It just generates code that runs.
The Borovička Analogy
Here's a Slovak perspective: the 1970s version of Vibe Coding was called Borovička (a juniper brandy).
Just as a programmer's "great ideas" after a few drinks seem brilliant in the moment but create chaos later, building software "under the influence" leads to interesting results.
The morning-after code review is brutal.
8.2 When Vibe Coding Works
I'm not saying vibe coding is always bad. It's a tool. Tools have appropriate uses.
Vibe Coding Works For:
Prototypes: When you need to test an idea quickly, vibe coding is perfect. Nobody cares if your prototype is secure or maintainable—it's going to be thrown away anyway.
Simple Scripts: One-off automation scripts that only you will run? Vibe away. The blast radius is small.
Experimentation: Learning a new API, testing a concept, playing with possibilities—vibe coding lowers the barrier to experimentation.
Personal Projects: If you're the only user and the stakes are low, the fastest path to "working" is often good enough.
My Version of Vibe Coding
For me, vibe coding isn't giving a prompt and going for coffee. It's:
- Give a specific prompt
- Get a result in seconds, 1-2 minutes max
- If it takes longer, I consider it an error
Usually, when AI takes a long time, the result is bad too. That's my signal to stop, not to wait.
The difference: I understand what AI is generating. I can evaluate it. I have backups. I know when something is wrong.
8.3 Context Engineering Is the Opposite
If vibe coding is speed and flow, context engineering is precision and scalability.
| Vibe Coding | Context Engineering |
|-------------|---------------------|
| Speed over correctness | Correctness over speed |
| Works for prototypes | Works for production |
| Individual exploration | Team collaboration |
| "It runs" is success | "It's right" is success |
| Low stakes | High stakes |
When You Need Context Engineering:
Production Code: Anything that customers use, that handles money, that stores data—this needs context engineering. The cost of mistakes is too high.
Team Projects: When multiple people work on the same codebase, "vibes" don't transfer. You need explicit context everyone can share.
Critical Systems: Healthcare, finance, infrastructure—anywhere failure has real consequences.
Long-Term Projects: Anything you'll maintain for months or years. Code without context becomes mystery code.
The Mindset Difference
"Vibe Coding isn't bad. It's a tool. But Context Engineering is a mindset."
It's like learning to drive: playing "Need for Speed" doesn't mean you can handle real traffic. Vibe coding teaches you to generate code. Context engineering teaches you to build systems.
8.4 The Parallel Agents Trap
I read a LinkedIn post about "parallel AI agents":
- One analyzes code
- Second writes tests
- Third writes documentation
- Everything in parallel!
This sounds efficient. In practice, it's chaos.
Why Parallel Agents Fail
Task A uses data that Task B is currently changing.
Task A's result will be based on data that's no longer true by the time it completes.
Imagine:
- Agent 1 analyzes the current code and finds 5 functions
- Agent 2 simultaneously refactors and now there are 7 functions
- Agent 3 writes documentation for the 5 functions Agent 1 found
- You now have documentation for code that doesn't exist
The Rule for Parallel Work
I work in parallel only on things that aren't connected.
Safe to parallelize:
- Documentation for Module A + Tests for Module B (unrelated modules)
- Frontend styling + Backend database migration (different systems)
- User research + Infrastructure setup (independent workstreams)
Unsafe to parallelize:
- Analysis + Refactoring + Documentation of the same code
- Tests + Implementation (tests need to know the implementation)
- Any tasks where one depends on the other's output
The hype around parallel agents ignores the fundamental dependency problem. Sequential execution with good context beats parallel chaos every time.
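One way to make the rule concrete: model the tasks and their dependencies explicitly, and only run together what shares no dependency. Here's a minimal sketch using Python's standard-library `graphlib`; the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical agent tasks; each maps to the set of tasks whose
# output it depends on. "document" must wait for "refactor" so it
# describes the code that actually exists afterwards.
tasks = {
    "analyze": set(),
    "refactor": {"analyze"},
    "document": {"analyze", "refactor"},
    "style_frontend": set(),  # unrelated system: safe to run alongside
}

sorter = TopologicalSorter(tasks)
sorter.prepare()
while sorter.is_active():
    batch = sorter.get_ready()  # every task in a batch is independent
    print("safe to run together:", sorted(batch))
    sorter.done(*batch)
# prints:
# safe to run together: ['analyze', 'style_frontend']
# safe to run together: ['refactor']
# safe to run together: ['document']
```

Anything in the same batch could go to parallel agents; anything in a later batch has to wait for the previous batch's output.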
8.5 The QA Bottleneck
George Arrowsmith wrote:
"QA is about to become a huge bottleneck in software development. AI lets us churn out HUGE amounts of code extremely fast, but you still need to make sure it works."
He's right. Teams are once again considering hiring dedicated QA staff.
The Math Problem
Before AI:
- 1 developer writes 100 lines/day
- 1 QA reviews 100 lines/day
- Balance maintained
With AI:
- 1 developer + AI generates 500 lines/day
- 1 QA still reviews 100 lines/day
- 5x bottleneck
The Real Solution
Hiring more QA is only a partial solution. The real solution: Context Engineering for quality, not just speed.
A tester without good context will sign off on bad output as good. They can only verify what they understand. If the requirement was vague, the test will be vague. If context was missing, the verification will be incomplete.
Context engineering isn't just about generating code faster. It's about generating correct code that QA can actually verify.
Quality context → Quality generation → Quality verification
Skip the first step, and the whole chain breaks.
8.6 Context as the Missing Ingredient
AI doesn't read minds (yet). It doesn't know:
- What you discussed in meetings
- Development priorities
- What customers actually require
- Technical decisions made last month
- Why the code is structured this way
All of that context exists in your head and your team's heads. Without it, AI operates blind.
Feeding Context Forward
The pattern that works:
- Read/gather information (5 articles, API docs, team discussions)
- Make a summary (extract what matters)
- Use that summary as context for the next step
This is different from vibe coding's "give prompt, hope for best." You're deliberately building a foundation of understanding before asking AI to build on it.
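The pattern can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical `ask` function that wraps whatever model or API you use; the prompt wording is illustrative, not prescriptive.

```python
from typing import Callable

def feed_context_forward(
    sources: list[str], task: str, ask: Callable[[str], str]
) -> str:
    # Step 1: condense each source on its own (atomic pieces).
    summaries = [ask(f"Summarize the key points of:\n{s}") for s in sources]
    # Step 2: merge the summaries into one compact context block.
    context = "\n".join(f"- {s}" for s in summaries)
    # Step 3: only now ask for the real work, grounded in that context.
    return ask(f"Context:\n{context}\n\nTask: {task}")
```

The structure matters more than the code: summarize first, then build on the summary, instead of dumping raw material and hoping.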
8.7 How This Guide Was Written
Let me be honest—writing these 9 articles took much more than 3 hours. Ironically, it took longer precisely because, at first, I didn't apply everything I describe in them.
What Didn't Work
Typing "generate me articles" into ChatGPT. The result: generic, soulless content that could have been written by anyone (or any AI). No voice, no specific examples, no real value.
What Worked
- Text editor with AI integration (not chat interface)
- Separate files per article (atomic pieces)
- Edit and iterate directly (not correction messages)
- Interview mode (25+ questions to extract real experience)
- Real examples from actual work (not hypotheticals)
The meta-lesson: even a guide about context engineering needed context engineering to write well. The principles apply everywhere.
Chapter Summary
Key Takeaways:
- Vibe coding has real risks—no backups, no understanding, security holes, horror stories
- It's not all bad—prototypes, scripts, and experiments are valid use cases
- Parallel agents are mostly fantasy—dependent tasks can't actually parallelize
- QA becomes a bottleneck when generation is fast but verification is slow
- "Vibe Coding is a tool. Context Engineering is a mindset."
Try This: For your next project, ask yourself: "Is this a prototype I'll throw away, or something I need to maintain?" If the answer is "maintain," invest in context engineering from the start. The time you spend on preparation saves multiples in debugging and rewriting.
Next: We've critiqued the shortcuts. Now let's address the bigger question: what AI means for your career and why it won't replace you.
Chapter 9: AI Won't Replace You
At first, you feel like a fraud. AI generates the code, the text, the design—and you feel like you did nothing. Here's why that feeling is wrong.
In this chapter, you'll learn:
- Why the imposter syndrome around AI is misplaced
- What AI genuinely cannot do (and won't for a long time)
- How AI amplifies your capabilities rather than replacing them
- The right way to think about AI in education
- A philosophy for working with AI long-term
9.1 The Imposter Syndrome
You give a prompt. AI generates code. It works. And you feel like a fraud—like someone else did your job.
I know this feeling. Most people who use AI seriously have felt it.
The Reality Check
If the work is well communicated and the results are good, it's completely fine.
Think about what you actually did:
- You understood the problem
- You provided the right context
- You evaluated the output
- You integrated it into the larger system
- You took responsibility for the result
AI didn't do any of that. You did.
What's Not Okay
Personally, I don't think it's okay to:
- Simplify work thanks to AI and pretend you did it manually
- Claim it took many times longer than it did
- Pretend you're so skilled you could do it that fast alone
That's dishonesty. It's also unnecessary.
The Right Approach
When work gets done faster, don't pretend otherwise. Continue and do more.
Increase your delivery. Improve quality. Take on harder problems. The time AI saves you is time you can reinvest in work that actually needs human judgment.
The goal isn't to look busy. The goal is to create value. AI helps you create more value—own that.
9.2 What AI Still Can't Do
AI knows a lot. It can synthesize information, generate code, write prose, analyze data. But there's a category of understanding it simply doesn't have.
The Traffic Example
Imagine a car approaching an intersection. A pedestrian stands on the sidewalk, looking at the other side of the street.
What AI sees:
- Rule: Cars use roads
- Rule: Pedestrians use crosswalks
- Assessment: Person on sidewalk + car on road = some risk
- Action: Slow down
What a human sees:
- How the person is dressed (are they impaired?)
- That people cross here often (local pattern)
- That there's a shop nearby (likely destination)
- The person's body language (are they about to move?)
- The overall mood of the situation
- Dozens of micro-signals processed unconsciously
AI has all the rules. But it doesn't understand space and situation as a whole. It can't intuitively estimate risk based on gestures, place context, and surrounding mood.
The Irreplaceable Human Element
This gap isn't about data or processing power. It's about situational awareness that comes from living in the world.
You've walked across streets. You've driven through intersections. You've been that pedestrian. That lived experience creates understanding that no amount of training data replicates.
In professional contexts, this translates to:
- Reading a room in a meeting
- Knowing when a client is unhappy (even if they say they're not)
- Understanding organizational politics
- Sensing when a project is going off track
- Judging character and trustworthiness
AI can help with analysis. It can't replace judgment that comes from decades of human experience.
9.3 AI as Amplifier
AI doesn't replace what you do. It amplifies what you're capable of.
Wow Moment: The Garden
We bought a house with a garden. We knew nothing about gardening.
AI advised us:
- How to properly plant different vegetables in our soil type
- At what depth
- When and how much to water
- Which plants go well together (and which don't)
Result? Everything grows. Nothing died.
A task that would have required reading multiple books, asking experienced gardeners, and probably killing some plants through trial and error—handled in an afternoon with AI guidance.
We still did the planting. We still do the watering. AI provided knowledge we didn't have.
Wow Moment: The Electricity Analysis
I wanted to evaluate solar panels for our house. This normally requires:
- Understanding your consumption patterns
- Calculating production estimates for your location
- Comparing different system sizes
- Factoring in battery storage options
- Running ROI calculations for various scenarios
I downloaded consumption data from our utility company. Claude analyzed it and created an interactive web page with charts showing:
- How much different panel configurations would cost
- Expected production vs. our consumption
- ROI for various scenarios (with/without battery, different kWp)
- Month-by-month comparisons
A normal person would spend days on this in Excel. I had an informed decision in an hour.
I still made the decision. AI processed data I couldn't process efficiently myself.
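For the curious, the core of such an analysis is simple arithmetic. Here's a rough sketch of a payback calculation; every number is a made-up placeholder, not data from my actual house.

```python
def simple_payback_years(
    system_kwp: float,
    cost_per_kwp: float,       # installed cost per kWp
    yield_kwh_per_kwp: float,  # annual production per kWp at your location
    self_use_ratio: float,     # share of production you consume yourself
    price_per_kwh: float,      # what you pay the utility per kWh
) -> float:
    # Investment divided by what the panels save you each year.
    investment = system_kwp * cost_per_kwp
    annual_savings = (
        system_kwp * yield_kwh_per_kwp * self_use_ratio * price_per_kwh
    )
    return investment / annual_savings

# Hypothetical 5 kWp system with placeholder prices:
print(f"{simple_payback_years(5, 1200, 1050, 0.6, 0.25):.1f} years")
# prints: 7.6 years
```

The real analysis adds battery scenarios, feed-in tariffs, and month-by-month consumption curves, but the shape stays the same: investment versus avoided cost.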
The Amplification Pattern
In both cases:
- Human provides: The actual goal, real-world constraints, decision criteria
- AI provides: Knowledge, analysis, computation
- Human gets: Better outcomes than either could achieve alone
This is amplification. Not replacement. Not competition. Collaboration.
9.4 Learning to Use AI
Here's the uncomfortable truth: there are no universal rules for AI.
It depends on:
- The current model (they change constantly)
- Your input data
- How you phrase things
- Your specific use case
The only way to learn what works is to use it.
The Practice Approach
Start using AI everywhere:
- At work (obviously)
- At home (garden, cooking, repairs)
- For decisions (research, comparisons)
- For learning (explanations, practice problems)
- With kids (homework help, creative projects)
Communicate with AI. Learn from what works. Adjust based on results.
The people who get the most from AI aren't the ones who read about it. They're the ones who use it constantly and iterate on their approach.
9.5 AI in Education
If AI will be as fundamental as the internet, we need to teach people to use it. Starting young.
The Wrong Approach
AI shouldn't write homework FOR children. That teaches nothing except how to shortcut learning.
The Right Approach
AI should help children LEARN and develop their thinking.
Here's an idea I've been thinking about: A custom AI (like a Custom GPT) with a hidden prompt that acts like a challenging professor.
The AI would:
- Accept the student's work
- Return it with: "And what if it could also do this?"
- Ask: "Have you considered this perspective?"
- Push: "What if this assumption is wrong?"
The goal isn't to do the work. The goal is to develop critical thinking. AI as mentor and challenger, not as homework factory.
The Broader Point
We learn by doing, struggling, and overcoming challenges. AI should create harder, more interesting challenges—not eliminate challenge entirely.
A student who uses AI to shortcut learning will graduate without the skills they need. A student who uses AI to push their thinking further will graduate more capable than previous generations.
9.6 Overcome Yourself
Here's the philosophy I keep coming back to:
"Overcome yourself, and AI is an assistant that will help you with that."
This applies individually:
- Use AI to do things you couldn't do alone
- Learn faster than you could learn alone
- Create better work than you could create alone
- Solve harder problems than you could solve alone
This applies to humanity too:
- AI shouldn't replace us
- It should help us overcome our limitations
- Push us further than we could go alone
Copywriters, programmers, designers, analysts—everyone who fears AI replacing them is asking the wrong question. The question isn't "Will AI do my job?" The question is "How do I use AI to do my job better than anyone who doesn't?"
A Final Thought
If we expect AI to be part of our lives like the internet, we must learn to use it naturally. And that learning is best done by doing—starting today.
Context engineering isn't a one-time skill to acquire. It's an ongoing practice. The models will change. The tools will evolve. But the fundamental insight remains:
AI is only as good as the context you give it.
Give it good context, and it amplifies everything you do. Give it poor context, and you waste everyone's time.
You now know the difference. The rest is practice.
Chapter Summary
Key Takeaways:
- Imposter syndrome is misplaced—you provide context, judgment, and responsibility; AI provides execution
- AI lacks situational awareness—lived experience creates understanding no training data can replicate
- AI is an amplifier—it enhances your capabilities, doesn't replace your judgment
- Learn by using—there are no universal rules; find what works for you through practice
- In education, AI should challenge—push thinking further, don't replace thinking
- "Overcome yourself, and AI is an assistant that will help you with that."
Try This: Pick one area of your life where you haven't used AI yet. Garden, cooking, fitness, finances, whatever. Spend 30 minutes exploring what AI can help you with in that area. Notice how it amplifies what you already know and want—it doesn't replace your goals, just helps you achieve them.
This concludes the main content of the guide. The Appendix contains quick reference materials: the 10 key tips, templates, and a tool comparison.
Appendix
Quick reference materials for daily use.
A: The 10 Key Tips
Before You Prompt (Tips 1-5)
| # | Tip | What It Means |
|---|-----|---------------|
| 1 | Give the WHY, not just WHAT | Explain your priorities and reasoning—AI can then optimize for what actually matters to you |
| 2 | Break into atomic parts | Smaller tasks = dramatically better results. If you can't explain it in 1 minute, it's too big |
| 3 | One example beats 1000 words | Show AI what you want: previous work, style references, format samples |
| 4 | Say what to EXCLUDE | AI includes everything unless told otherwise. Be explicit about what you don't want |
| 5 | Define what DONE looks like | Clear success criteria: "Query under 500ms," not "faster query" |
During and After (Tips 6-10)
| # | Tip | What It Means |
|---|-----|---------------|
| 6 | 2-minute rule | If AI takes longer than 1-2 minutes, something's wrong. Stop and adjust |
| 7 | Explain WHY it's wrong | "Wrong because X" teaches AI; "that's wrong" doesn't |
| 8 | Watch for context contamination | Bad output pollutes future responses. Sometimes starting fresh is faster |
| 9 | Ask for opinion, not validation | "What would you change?" beats "Is this good?" |
| 10 | Edit prompts, don't send corrections | Update the original request instead of adding "sorry, I meant..." |
Quick Test Before Every Task
"Could a junior developer who started yesterday complete this task with just this information?"
If no → add more context. If yes → AI can handle it.
B: Templates
Universal Task Template
## Problem
[What broke / what needs to be done]
## Context
- Files: [specific files and lines if applicable]
- History: [relevant previous changes or attempts]
- Constraints: [what must not change]
## Goal
[Clear success criterion—when is it done]
## Possible Solutions
1. [First option to explore]
2. [Second option]
3. [Third option]
## Tests/Verification
[How we verify it works]
Bug Report Template
## Bug Description
[One sentence: what's wrong]
## Steps to Reproduce
1. [First step]
2. [Second step]
3. [Step where bug occurs]
## Expected Behavior
[What should happen]
## Actual Behavior
[What actually happens]
## Environment
- File: [specific file and line if known]
- Browser/OS: [if relevant]
- Error message: [exact text]
## What I've Tried
- [First attempt and result]
- [Second attempt and result]
## Success Criteria
[How we know it's fixed]
Feature Request Template
## Feature Summary
[One sentence description]
## User Story
As a [role], I want [capability], so that [benefit].
## Context
- Why now: [business reason]
- Who requested: [source]
- Priority: [high/medium/low]
## Requirements
- Must have: [essential features]
- Nice to have: [optional features]
- Out of scope: [explicitly excluded]
## Technical Context
- Affected files: [if known]
- Related features: [existing functionality]
- Constraints: [backwards compatibility, performance, etc.]
## Acceptance Criteria
- [ ] [First criterion]
- [ ] [Second criterion]
- [ ] [Third criterion]
## Test Scenarios
- Happy path: [expected usage]
- Edge case: [unusual but valid usage]
- Error case: [what should fail gracefully]
Code Refactoring Template
## Target Code
[File path and function/class name]
## Current State
- Lines: [count]
- Problem: [why it needs refactoring]
- Tests: [existing test coverage]
## Refactoring Goals
1. [First goal - be specific]
2. [Second goal]
3. [Third goal]
## DO NOT Change
- [First constraint]
- [Second constraint]
- [Business logic / API / etc.]
## Process
1. Analyze and propose structure (don't code yet)
2. Wait for approval
3. Implement with tests
4. Verify all original tests pass
C: Tool Comparison
When to Use Which Tool
| Task Type | Best Tool | Why |
|-----------|-----------|-----|
| Research & Information | Perplexity | Quick summaries with sources, easy to extract key parts |
| Programming & Technical | Claude (via Zed or API) | Better code understanding, context management, longer sessions |
| Writing & Non-Technical | ChatGPT | Conversational, great for brainstorming, home/life tasks |
| Complex Analysis | Claude | Handles nuance, technical depth, longer context |
| Quick Q&A | Any | For simple questions, any tool works |
Tool Combinations That Work
| Combination | Use Case |
|-------------|----------|
| Perplexity → ChatGPT | Research topic → Write article/summary |
| Perplexity → Claude | Find API docs → Implement feature |
| Claude → ChatGPT | Generate code → Explain to non-technical stakeholder |
What NOT to Combine
| Combination | Why It Doesn't Work |
|-------------|---------------------|
| ChatGPT + Claude for same task | Different strengths, creates confusion |
| Multiple tools in parallel on same problem | Context diverges, results conflict |
Tool Limitations
Perplexity:
- Often lacks latest data (verify dates)
- Sometimes "invents" sources (click through to verify)
- Superficial on deep technical topics
Claude:
- Struggles with similar file names in different directories
- Performance drops with more than 3-4 files in context
- May not know latest library versions
ChatGPT:
- Less precise for code generation
- Can be verbose
- May not maintain consistency in long sessions
D: Citations and Resources
Key Quotes
Andrej Karpathy (OpenAI Co-founder, Former Tesla AI Director)
"+1 for 'context engineering' over 'prompt engineering'. People associate prompts with short task descriptions. In every industrial-strength LLM app, context engineering is the delicate art and science of filling the context window with just the right information for the next step."
— Twitter/X, 2025
George Arrowsmith (QA Perspective)
"QA is about to become a huge bottleneck in software development. AI lets us churn out HUGE amounts of code extremely fast, but you still need to make sure it works."
— LinkedIn, 2025
Milan Martiniak (Developer Time Allocation)
"Programmers spend less than 20% of their work time actually programming."
— Survey, 2025
Key Concepts Defined
| Term | Definition |
|------|------------|
| Context Engineering | The practice of providing AI with just the right information for the task—no more, no less |
| Prompt | The specific task or question you give AI |
| Context | Everything else: background, constraints, examples, success criteria |
| Vibe Coding | AI-assisted coding with minimal understanding or verification |
| Context Contamination | When bad AI output remains in conversation and influences future responses |
| Atomization | Breaking large tasks into small, focused pieces |
The Five Components of Good Context
- Task — What you want AI to do
- Constraints — What AI must NOT do
- Background — Why you need this
- Examples — What good output looks like
- Success Criteria — How you'll judge the result
E: Decision Trees
Fix or Start Fresh?
Is output at least 80% correct?
├─ YES → Fix it
│ └─ Does AI keep making same mistake after correction?
│ ├─ YES → Start fresh
│ └─ NO → Continue fixing
└─ NO → Start fresh
└─ When starting fresh:
├─ Include what you learned from failed attempt
├─ Be more specific about constraints
└─ Add examples of what you want
Should I Use AI for This Task?
Is the task well-defined?
├─ NO → Define it first, then reconsider
└─ YES → Continue
Do I have the context needed?
├─ NO → Gather context first
└─ YES → Continue
Is it a one-time task or repeated?
├─ ONE-TIME → Use AI directly
└─ REPEATED → Create template/automation
What's the cost of errors?
├─ HIGH (production, customer-facing) → Use AI with careful review
├─ MEDIUM (internal, non-critical) → Use AI with normal review
└─ LOW (prototype, experiment) → Vibe code away
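If you prefer code to ASCII trees, the same logic can be written as a function. This is purely illustrative; the return strings mirror the branches one-to-one, so adapt them to your own workflow.

```python
def should_i_use_ai(well_defined: bool, have_context: bool,
                    repeated: bool, error_cost: str) -> str:
    """error_cost is 'high', 'medium', or 'low'."""
    if not well_defined:
        return "Define it first, then reconsider"
    if not have_context:
        return "Gather context first"
    # One-time tasks go straight to AI; repeated ones deserve a template.
    approach = "create a template/automation" if repeated else "use AI directly"
    review = {"high": "careful review",
              "medium": "normal review",
              "low": "minimal review (vibe code away)"}[error_cost]
    return f"{approach}, with {review}"

print(should_i_use_ai(True, True, False, "high"))
# prints: use AI directly, with careful review
```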
F: Quick Start Checklist
For Your First Week with Context Engineering
- [ ] Pick one tool to start (Perplexity, Claude, or ChatGPT)
- [ ] Use the Universal Task Template for 5 different tasks
- [ ] Apply the 10 Key Tips consciously
- [ ] Notice when AI fails—what context was missing?
- [ ] Try the "junior developer test" before each prompt
For Your First Month
- [ ] Develop your personal task templates
- [ ] Identify which tool works best for which task type
- [ ] Practice the fix-or-restart decision
- [ ] Share what works with a colleague
- [ ] Track one metric: tasks completed first-try vs. needed iteration
For Team Implementation
- [ ] Start with 2-3 person pilot, non-critical project
- [ ] Document what works and what doesn't
- [ ] Create team task templates
- [ ] Define review process for AI output
- [ ] Set expectations: cultural change takes time
G: Common Mistakes Quick Reference
| Mistake | Fix |
|---------|-----|
| Task too vague | Add specific files, constraints, success criteria |
| Too much context | Focus on relevant information only |
| No examples | Attach previous work or style references |
| Missing constraints | Explicitly state what NOT to change |
| Asking for validation | Ask for criticism instead |
| Sending corrections | Edit the original prompt |
| Waiting too long | Apply the 2-minute rule—stop and adjust |
| Fighting bad context | Start fresh with an improved prompt |
| No success criteria | Define a verifiable "done" state |
| Parallel dependent tasks | Run sequentially; parallelize only independent work |
End of Guide