AI is Not Magic – Why Prompt Engineering Isn't Enough

Briefing

  • Core idea: Prompting is only half the job — the other half is supplying precise, relevant context. Treat AI like a new, very capable colleague who needs a concise onboarding.
  • Three quick actions: 1) Start with a one‑sentence goal; 2) attach the minimal artifact (failing test, 10 lines of code, or one authoritative link); 3) state explicit constraints and the exact desired output shape.
  • Decision rule: If the model's output covers ≳80% of the goal and is correct, iterate inline; otherwise distill and restart in a fresh session.
  • Practical tip: Use a lightweight summary (one paragraph + 3 TODOs) every 8–12 interactions to avoid context‑bleeding and to seed clean follow‑up sessions.
  • Do now (1 minute): Open your last AI session, extract a one‑sentence summary and top 3 TODOs, and paste that into a fresh session as the starting point.

Imagine asking a brilliant new colleague for help on day one: they're fast, curious, and available 24/7, but they know nothing about your codebase, your product, or the tiny tribal rules your team uses. They can learn—fast—but only if you teach them the right things.

This mental model changed how I use AI. It isn't a genie. It isn't Google. It's an apprentice with near‑instant access to a huge library—but no context. If you give it bad input, you'll get bad output. If you teach it the right context, it becomes powerful.


Prompt vs. context — a practical distinction

  • Prompt = the explicit instruction or question you give the model.
  • Context = everything that surrounds that prompt and helps the model interpret it: data, constraints, examples, tooling, and goals.

People obsess about "the perfect prompt" and ignore context. That's like giving a junior engineer a task title and expecting flawless production code. You wouldn't do that — so don't do it with AI.
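The distinction can be made concrete. Here is a minimal sketch (all names hypothetical, not a real API) of assembling the explicit prompt and its surrounding context into a single message before sending it to a model:

```python
def build_message(prompt, goal=None, constraints=None, artifacts=None, output_shape=None):
    """Combine an explicit prompt with the context that surrounds it.

    Illustrative structure only: goal, constraints, artifacts, and
    output shape are the context; `prompt` is the instruction itself.
    """
    parts = []
    if goal:
        parts.append(f"Goal: {goal}")
    if constraints:
        parts.append("Constraints:\n" + "\n".join(f"- {c}" for c in constraints))
    if artifacts:
        parts.append("Relevant material:\n" + "\n".join(artifacts))
    if output_shape:
        parts.append(f"Return the answer as: {output_shape}")
    parts.append(f"Task: {prompt}")
    return "\n\n".join(parts)

# The car-seat example from below, expressed as prompt + context:
message = build_message(
    prompt="Recommend a car seat.",
    goal="Safety is the only priority (not comfort or price).",
    constraints=["Use ADAC 2025 safety metrics", "Ignore 'overall' scores"],
    output_shape="a ranked bullet list with the metric that justifies each pick",
)
```

The point of the sketch is the ratio: one line of prompt, several lines of context.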


Two short examples that make the point

  1. The car seat (real example)
  • Bad prompt: "Recommend the best car seat."
  • Good prompt + context: "My older child is 120 cm; safety is the only priority (not comfort or price). Use ADAC 2025 safety metrics, ignore 'overall' scores, and prioritize frontal and side crash protection. Exclude marketplace‑only sellers (e.g. Amazon)."

Iterating the context (telling the model to ignore overall scores and focus on ADAC safety metrics) changed the recommendations completely. That was context refinement, not magic.

  2. The debugging task (real example)
  • Bad prompt: "User can't edit record."
  • Good prompt + context: "User is admin. Issue appears in records_table.ex (lines 100–200). Edit works on the detail view but not in the table. I need a fix and a test that prevents regression."

With that level of context, the model suggested concrete checks and a test case—fast.


Practical heuristics you can use right now

These are short, actionable rules I use daily.

  1. The 1–2 minute test: If a model's useful response isn't arriving within ~1–2 minutes of focused iteration, something is wrong with your prompt or context. Either the task is too big, the context is noisy, or the model is trying to do everything at once. Split the task or narrow the context.

  2. Atomize the problem: Break large tasks into small, independent steps (atomic tasks). Each atomic task should be something you could hand to a junior engineer with a file name, line range, and one clear goal. Smaller tasks let you give precise context and get high‑quality first‑pass results.

  3. Choose the right tool for the job

    • Google: when freshness and primary sources matter (e.g., the latest ADAC report, new research papers).
    • Perplexity / web‑enabled search assistants: great for guided research and finding links quickly, but always verify dates and original sources.
    • Code‑aware models in the IDE (e.g., Claude in Zed): best when the model can access your repo and relevant files directly.
    • ChatGPT / general models: excellent for ideation, drafting, and non‑code work (my personal preference for writing and synthesis).
  4. Prefer explicit over implicit: Tell the model what you assume: the expected inputs, exact files, library versions, and constraints. If something is new (a framework or project convention), add that explicitly.

  5. Signal when it's okay to stop: If the model produces ≳80% of what you need, iterate and correct in the same session. If it's wrong in most respects, start a fresh session with cleaner context — as I did with the ADAC example when I had to correct which metric to use.


Atomization: why smaller is faster and better

There's an irony: giving big, monolithic tasks often costs you more time. When you split a complex problem into small, well‑scoped steps, two things happen:

  • You can attach the exact context to each step (the right file, right lines, exact test case).
  • You drastically increase the chance of a correct first‑pass response, which keeps the overall process fast.

Think "vibe coding": rapid, focused interactions. Give the model one small objective and a tight context, get a quick result, then move to the next atom.
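One way to make atomization tangible is to represent each atom as a small record that carries only its own context. The sketch below is illustrative (the structure and the `records_table.ex` reference come from the debugging example earlier; nothing here is a real tool):

```python
from dataclasses import dataclass, field

@dataclass
class AtomicTask:
    """One small, independently answerable step."""
    goal: str                                      # one clear objective
    file: str = ""                                 # exact file, if code-related
    lines: str = ""                                # tight line range, e.g. "100-200"
    context: list = field(default_factory=list)    # only what this step needs

    def to_prompt(self) -> str:
        # Emit only the parts that are actually set, keeping the prompt tight.
        header = f"Goal: {self.goal}"
        loc = f"Location: {self.file} (lines {self.lines})" if self.file else ""
        ctx = "\n".join(self.context)
        return "\n".join(p for p in (header, loc, ctx) if p)

tasks = [
    AtomicTask(goal="Reproduce the bug with a failing test",
               file="records_table.ex", lines="100-200",
               context=["Edit works on the detail view but not in the table."]),
    AtomicTask(goal="Propose a minimal fix that makes the failing test pass"),
]
prompts = [t.to_prompt() for t in tasks]
```

Each prompt is small enough to answer in one pass, which is exactly the property atomization buys you.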


Tooling patterns and decision hints

  • If you need the latest facts: search the web first (Google or Perplexity) and confirm sources.
  • If you need code changes that reference your repo: use a code‑aware assistant in the IDE and explicitly name files and line ranges.
  • If you need synthesis, editing, or creative output: use a general LLM but feed it structured context — examples, style notes, and success criteria.
  • When models keep repeating a wrong behavior (e.g., repeatedly recommending marketplace sellers despite an exclusion), start a new session with a concise corrected constraint.

Common failure modes and quick fixes

  • Failure: model returns a generic list or "overall" scores you asked it to ignore.

    • Fix: explicitly state the metric and show an example line: "Use the ADAC frontal and side protection scores only. Ignore the overall rating."
  • Failure: model confuses similarly named files.

    • Fix: provide absolute paths or paste the file header (the unique top lines) into the context.
  • Failure: model goes off into multiple tangents or handles many cases at once.

    • Fix: ask it to perform one action at a time and return results in a strict format (bullet list, JSON, or numbered steps).
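When you ask for a strict format like JSON, it helps to actually validate the reply rather than eyeball it. A minimal sketch of that pattern (the expected keys are hypothetical; the caller is assumed to re-prompt on failure):

```python
import json

def parse_strict_json(reply: str):
    """Check that a model reply is exactly the JSON shape we requested.

    Returns the parsed object, or None so the caller can re-prompt
    with a corrected constraint. Illustrative pattern, not a real API.
    """
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None
    # Require exactly the keys we asked for — nothing extra, nothing missing.
    if not isinstance(data, dict) or set(data) != {"action", "result"}:
        return None
    return data

ok = parse_strict_json('{"action": "check_permissions", "result": "admin ok"}')
bad = parse_strict_json("Sure! Here is the answer: ...")  # prose, not JSON
```

Rejecting the prose reply and re-asking with the format restated is usually faster than trying to salvage a free-form answer.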

References and further reading (hand‑picked)

  • "Search is not magic" — a practical piece on structured queries and context: https://blog.ideax.sk/search-is-not-magic-with-postgresql-613069cb2f21
  • Context engineering primers and discussion:
    • https://www.philschmid.de/context-engineering
    • https://blog.langchain.com/the-rise-of-context-engineering/
    • https://x.com/karpathy/status/1937902205765607626

Takeaway

Prompt engineering matters, but it's only half the story. The real win comes from combining prompts with precise, relevant context: correctly atomized tasks, the right tool at the right time, and a tolerance for iterative correction.

Think of AI as a colleague who knows nothing but can learn everything—if you teach them. If you adopt a few simple heuristics (the 1–2 minute test, atomization, explicit assumptions, and careful tool choice), you'll turn AI from a frustrating black box into a reliable collaborator.

In the next article I’ll show concrete patterns for building "good context" — practical structures and workflows I use for debugging, product research, and drafting content. See you there.

Article Details

Category
context engineering
Published
September 11, 2025
