CodeCon ZA presentation notes
Topic: It's not the prompt, it's the context that delivers results. Description: Most AI failures don't happen because of prompts but because of missing context. In this talk I'll show how an ordinary "write me X" becomes a reliable, repeatable process; how to work with AI systematically: what to add to the context, what to leave out, and when to start over, so the results aren't random but usable. Technical details: PowerPoint presentation, 20 minutes + 10 minutes Q&A
My goals with the presentation: I want to share useful information with the attendees, so I'm thinking about 10-12 examples showing what not to do and what to do instead, or simply what can give you better output. Every example would show a use case for a developer (tech) and for a non-tech person, so a marketer or other regular non-tech user can apply the advice, and a developer can use it in development. The audience will vary, mostly developers, but still... 20 minutes is not much, so: a short intro, 10-12 examples at roughly 1 minute each, and an ending. The intro should show how important context is, that a different context around the same question/image/prompt can give very, very different output. I like the intro of Dan Brown's The Da Vinci Code, where Robert Langdon is presenting at a university in a dark room: he shows some symbols and asks about them, and the students reply that they stand for something sinister, but when he zooms the image out it turns out to come from a different context and means something completely different and positive. Also, before my 10-12 pieces of advice, I should briefly explain the difference between context and prompt from my perspective.
Here are some notes about my presentation:
For example, this article shows it correctly with their chatbot use case: leaving the context up to the AI is a big risk of not getting the proper data at all, or not getting all of it correctly. Their use case was that they know exactly who the user of the chat is, so they prefilled the context with generated data via a clever template. That template approach isn't a good fit for my presentation, because I want to talk to regular users and developers: https://www.linkedin.com/pulse/proactive-vs-reactive-rethinking-how-ai-gets-its-context-denman-bbxye/ (downloaded in articles/context_engineering_new/codecon_materials/1.md)
This article is good as well; I like how it uses the Andrej Karpathy quote. The important point is that it requires you to think about what the LLM/AI needs in order to make a decision, instead of just doing it yourself: you don't need 10 versions of the code or article, you need the proper ingredients, maybe a starting point, what it should contain, etc. It adds another step before you get results, but thanks to that you have a flow that figures out what really needs to be done based on the correct input/context, with a middle step for your decision (or even without it, where the LLM/AI just continues from the processed context) to produce the proper output, whether that's a post or the code itself.

It also nicely explains the difference between junior and senior output: a junior and a senior are the same, the senior just has more and better context (knows how to filter and use the proper context in their head), thanks to years of experience that help them decide what is a good approach and what isn't. You can simulate those years of experience with proper context: not adding a terrible article, or useless frontend code when you need to fix backend code, or a file where some parts are coded wrong, so the "junior AI" assumes that since it's in the example/context, it's probably correct. When you don't say explicitly that something is wrong, it's up to the AI's "mind" to decide whether it's correct, and there's no 100% guarantee it will assume correctly; every use case is different, and you should decide what's right for your case right now. It also nicely shows how leaning on the "AI mind" is tricky: it can give you what you need, but the odds are maybe 1:X that it hits exactly what you have in your head. Do you want to play that roulette, or do you want to increase your chances of winning? Also, when you read "the best prompt for X", it doesn't mean it's the best for YOU. It can be, but AI is not exact, and the same prompt, even with the same context, doesn't guarantee the same results.
Explain yourself: give the AI context (not your whole life) to understand the reasoning behind why you need something, so it can start from the same base you're thinking from and end up closer to your thinking. DO NOT VIBE, WORK TOGETHER. The examples in this article are really good, but they're out of scope for my presentation because they're more about prompt exactness: https://tomascupr.substack.com/p/stop-prompting-start-briefing?r=1w5utk (downloaded in articles/context_engineering_new/codecon_materials/2.md)
A few months ago I wrote some articles on this topic; you can find them here:
articles/context_engineering_new/1_prompt_is_not_enough.md
articles/context_engineering_new/9_vibe_coding_vs_context_engineering.md
articles/context_engineering_new/4_think_like_engineer.md
articles/context_engineering_new/7_context_engineering_for_teams.md
articles/context_engineering_new/8_practical_examples.md
articles/context_engineering_new/5_ai_wont_steal_your_job.md
articles/context_engineering_new/2_what_good_context_looks_like.md
articles/context_engineering_new/3_why_you_give_much_get_little.md
articles/context_engineering_new/6_tools_i_use.md
I also made some nice examples in the posts for these articles:
articles/context_engineering_new/linkedin_post_7.md
articles/context_engineering_new/linkedin_post_3.md
articles/context_engineering_new/linkedin_post_2.md
articles/context_engineering_new/linkedin_post_6.md
articles/context_engineering_new/linkedin_post_9.md
articles/context_engineering_new/linkedin_post_8.md
articles/context_engineering_new/linkedin_post_1.md
articles/context_engineering_new/linkedin_post_5.md
articles/context_engineering_new/linkedin_post_4.md
The articles are written based on an interview I simulated with ChatGPT and Claude: the LLMs asked me questions and interviewed me about this topic. You can find the whole interview here. Sources for the interview: articles/context_engineering/_chatgpt/start.md Interview: articles/context_engineering/_chatgpt/otazky.md
I found this on the internet about this topic; see my notes:
I like these citations:
Shopify CEO Tobi Lutke captured this distinction clearly: “I really like the term ‘context engineering’ over prompt engineering. It describes the core skill better: the art of providing all the context for the task to be plausibly solvable by the LLM”. Harrison Chase, CEO of LangChain, expanded on this definition: “Context engineering is building dynamic systems to provide the right information and tools in the right format such that the LLM can plausibly accomplish the task”
MCP - I don't like it, and a lot of users feel the same, for multiple reasons. One of them is that MCP servers take up a lot of context/tokens. Another is that they tend to pull data wholesale into the context. That's also why Anthropic added Skills, where you can include scripts as well, so you can call a CLI or run a script that gives you exactly what you need. Yes, generally, allowing the AI to query your local DB is a nice feature, but it's dangerous too when the AI decides to run a query that returns wrong or mostly useless data. Also, this week Anthropic published a feature that allows searching over MCP tools instead of loading them all.
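The Skills-style idea above (call a focused script instead of loading everything) can be sketched like this. The table, fields, and helper name are my own hypothetical illustration, not any real MCP or Anthropic Skills API:

```python
# Sketch: a narrow helper that returns exactly the fields a task needs,
# instead of letting a tool dump whole tables into the model's context.
# (Hypothetical schema; not an MCP or Anthropic Skills API.)
import sqlite3

def focused_lookup(db_path: str, user_id: int) -> dict:
    """Return only the two fields relevant to the task, not the full row set."""
    con = sqlite3.connect(db_path)
    try:
        row = con.execute(
            "SELECT name, plan FROM users WHERE id = ?", (user_id,)
        ).fetchone()
    finally:
        con.close()
    return {"name": row[0], "plan": row[1]} if row else {}
```

A CLI wrapper around a query like this hands the model a few dozen tokens instead of an entire table.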
The Chroma Research study "Context Rot: How Increasing Input Tokens Impacts LLM Performance" (summarized with its key findings in the report below) nicely confirms what I'm talking about, so we should mention it somewhere as proof of my theory; it's wonderful.
More good articles are downloaded here: articles/context_engineering_new/codecon_materials/3.md and articles/context_engineering_new/codecon_materials/4.md. Maybe we can use them somewhere as proof that I'm not just making this stuff up.
# Context Engineering for LLMs: A Comprehensive Overview of Recent Research and Industry Developments
The field of LLM development has seen a significant paradigm shift from simple "prompt engineering" to a more sophisticated discipline now called **context engineering**. This term gained widespread traction in mid-2025 when **Andrej Karpathy** famously endorsed it, stating that context engineering is "the delicate art and science of filling the context window with just the right information for the next step"[1][2]. This report synthesizes recent academic research, industry publications, and expert perspectives from the last six months.
## The Emergence of Context Engineering as a Discipline
The term "context engineering" represents a fundamental evolution in how practitioners interact with LLMs. While prompt engineering focused on crafting the right words and phrases, context engineering encompasses the entire information payload provided to models during inference[3][4].
**Shopify CEO Tobi Lutke** captured this distinction clearly: "I really like the term 'context engineering' over prompt engineering. It describes the core skill better: the art of providing all the context for the task to be plausibly solvable by the LLM"[2]. **Harrison Chase**, CEO of LangChain, expanded on this definition: "Context engineering is building dynamic systems to provide the right information and tools in the right format such that the LLM can plausibly accomplish the task"[5][6].
**Simon Willison**, a prominent AI commentator, noted that the inferred definition of "context engineering" is likely to align much more closely with the intended meaning than "prompt engineering" ever did—addressing a longstanding communication gap in the field[2].
## Major Academic Research: A Survey of Over 1,400 Papers
One of the most comprehensive academic contributions is **"A Survey of Context Engineering for Large Language Models"** (arXiv:2507.13334), a 166-page survey analyzing over 1,400 research papers[7][8]. This survey establishes context engineering as a formal discipline with three foundational components:
**Context Retrieval and Generation** involves sourcing relevant information from various external sources, including RAG systems, knowledge bases, and real-time APIs[9].
**Context Processing** encompasses transforming, filtering, and formatting retrieved information to maximize its utility for the model[7].
**Context Management** addresses the strategies for maintaining, updating, and optimizing context over time and across multiple interactions[9].
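A toy sketch of how the survey's three components could compose; the function names and heuristics are mine, not anything defined in the paper:

```python
# Retrieval sources candidate snippets, processing filters and formats them,
# management keeps the running context inside a budget. (Toy heuristics only.)

def retrieve(query: str, knowledge_base: list[str]) -> list[str]:
    """Context retrieval: pull snippets that share words with the query."""
    words = set(query.lower().split())
    return [doc for doc in knowledge_base if words & set(doc.lower().split())]

def process(snippets: list[str], max_items: int = 3) -> str:
    """Context processing: dedupe, rank by length (a crude detail proxy), format."""
    unique = sorted(set(snippets), key=len, reverse=True)[:max_items]
    return "\n".join(f"- {s}" for s in unique)

def manage(history: list[str], new_block: str, budget_chars: int = 500) -> list[str]:
    """Context management: append the new block, drop oldest blocks over budget."""
    history = history + [new_block]
    while sum(len(h) for h in history) > budget_chars and len(history) > 1:
        history.pop(0)
    return history
```

The point of the sketch is only the division of labor: three small, inspectable steps rather than one opaque paste into the prompt.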
The survey reveals a critical research gap: "a fundamental asymmetry exists between model capabilities... models demonstrate remarkable proficiency in understanding complex contexts, they exhibit pronounced limitations in generating equally sophisticated, long-form outputs"[7].
## Context Rot: The Hidden Performance Degradation
A groundbreaking study from **Chroma Research** titled **"Context Rot: How Increasing Input Tokens Impacts LLM Performance"** (July 2025) challenges the assumption that longer context windows automatically translate to better performance[10][11].
The research evaluated 18 leading LLMs, including GPT-4.1, Claude 4, Gemini 2.5, and Qwen3, revealing that model performance degrades non-uniformly as input length increases—even on simple tasks[10]. Key findings include:
**Performance degradation with needle-question similarity**: Lower semantic similarity between questions and target information correlates with faster performance decline as context length grows[10].
**Non-uniform distractor impact**: Distractors—topically related but irrelevant information—have varying negative effects that amplify with longer contexts. Different model families handle distractors distinctively; Claude models tend to abstain under ambiguity while GPT models often hallucinate confidently[10].
**Haystack structure matters**: Surprisingly, models perform better when context structure is disrupted (shuffled sentences) rather than preserved with logical coherence, suggesting structural patterns influence attention mechanism behavior in unexpected ways[10].
## NoLiMa Benchmark: Beyond Literal Matching
The **NoLiMa benchmark** (accepted at ICML 2025) from Adobe Research introduces a more rigorous evaluation methodology that eliminates literal matches between questions and answers, requiring models to perform latent associative reasoning[12][13].
Results were striking: at 32K tokens, **11 out of 13 models dropped below 50% of their short-context baseline performance**. Even GPT-4o, one of the top performers, experienced accuracy reduction from 99.3% to 69.7%[13]. The researchers attribute these declines to "the increased difficulty the attention mechanism faces in longer contexts when literal matches are absent"[14].
## Anthropic's Context Engineering Framework for AI Agents
**Anthropic** published a comprehensive guide titled **"Effective Context Engineering for AI Agents"** (September 2025), establishing practical principles for production systems[4].
The company frames context as "a finite resource with diminishing marginal returns," analogous to human working memory constraints[4]. Key strategies include:
**Compaction**: Summarizing conversation history while preserving critical details when approaching context limits. Claude Code implements this through "auto-compact" after exceeding 95% of the context window[4].
**Structured note-taking**: Agents write persistent notes to external memory that can be retrieved later, enabling coherence across extended interactions[4].
**Sub-agent architectures**: Specialized sub-agents handle focused tasks with clean context windows, returning condensed summaries (typically 1,000-2,000 tokens) rather than raw outputs[4].
The guidance emphasizes finding "the smallest possible set of high-signal tokens that maximize the likelihood of some desired outcome"[4].
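The compaction strategy above can be sketched roughly as follows; this is my simplification with a stand-in summarizer, not Claude Code's actual auto-compact mechanism, which is internal to the product:

```python
# Sketch of auto-compact: once the transcript crosses ~95% of the token
# budget, fold older turns into a summary and keep recent turns verbatim.
# (The "summary" here is a stand-in for a real LLM summarization call.)

def rough_tokens(text: str) -> int:
    """Crude estimate: roughly one token per four characters."""
    return max(1, len(text) // 4)

def compact(messages: list[str], budget: int, keep_recent: int = 2) -> list[str]:
    total = sum(rough_tokens(m) for m in messages)
    if total <= int(budget * 0.95):
        return messages  # still under the threshold, nothing to do
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = "Summary of earlier turns: " + " | ".join(
        (m.splitlines() or [""])[0][:40] for m in old
    )
    return [summary] + recent
```

Keeping the most recent turns verbatim preserves the details the next step is most likely to depend on, which is the "preserving critical details" half of the strategy.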
## ACE Framework: Agentic Context Engineering
A Stanford-affiliated team introduced **ACE (Agentic Context Engineering)** in October 2025, presenting contexts as "evolving playbooks" that accumulate and refine strategies through generation, reflection, and curation[15][16].
ACE addresses two critical failure modes:
**Brevity bias**: Where optimization systems collapse toward short, generic prompts, sacrificing domain-specific detail[15].
**Context collapse**: Where iterative rewriting erodes important information over time[15].
Results demonstrate **+10.6% improvement on agent benchmarks** and **+8.6% on domain-specific tasks** (particularly finance), achieved without labeled supervision by leveraging natural execution feedback[15][16]. On the AppWorld leaderboard, ACE matches top-ranked production agents while using smaller open-source models[15].
## Context Engineering Strategies: The Four-Bucket Framework
**LangChain's analysis** categorizes context engineering strategies into four primary approaches[20]:
**Write Context**: Saving information outside the context window (scratchpads, long-term memories) to help agents perform tasks. Claude Code's `CLAUDE.md` files and Cursor's rules files exemplify this pattern[20].
**Select Context**: Pulling relevant information into the context window through RAG, memory retrieval, or tool descriptions. Code agents like Windsurf combine embedding search, AST parsing, knowledge graph retrieval, and re-ranking[20].
**Compress Context**: Retaining only essential tokens through summarization or trimming. Cognition uses fine-tuned models specifically for compression at agent-agent boundaries[20].
**Isolate Context**: Splitting context across sub-agents with specialized roles and distinct context windows. Anthropic's multi-agent researcher demonstrated that isolated contexts outperform single-agent approaches for complex tasks[20][21].
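The isolate-context bucket can be sketched minimally like this; the classes are my own toy illustration, not LangChain's or Anthropic's code. The sub-agent loads raw materials into its own context and hands only a condensed summary back to the parent:

```python
# Sketch: a sub-agent works in its own fresh context window and returns a
# condensed summary, so raw materials never enter the parent's context.
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    context: list[str] = field(default_factory=list)

    def run(self, task: str, materials: list[str]) -> str:
        # The raw materials go into THIS agent's context only.
        self.context = [task] + materials
        # Stand-in for real agent work: return a condensed summary.
        return f"{self.name}: {task} done ({len(materials)} sources reviewed)"

def delegate(parent: Agent, task: str, materials: list[str]) -> None:
    summary = Agent("researcher").run(task, materials)
    parent.context.append(summary)  # only the summary enters the parent window
```

This is the same shape as Anthropic's multi-agent researcher described above: parallel sub-agents each burn their own context, and the orchestrator sees only their condensed results.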
## Performance Degradation Independent of Retrieval
Research published in EMNLP 2025 Findings demonstrates that **context length alone hurts LLM performance even when models achieve perfect retrieval**[22][23].
Experiments across math, QA, and coding tasks with 5 open and closed-source models revealed performance drops of **13.9% to 85%** as input length increased—crucially, this occurred even when models could retrieve all relevant information with 100% exact match[22]. This finding challenges the common decomposition of long-context capabilities into "retrieval" and "reasoning" components, suggesting input length itself is a decisive degradation factor[22].
## Multi-Agent Systems and Context Orchestration
Context engineering becomes exponentially more critical in multi-agent architectures. **Hypermode's analysis** identifies key patterns for effective multi-agent context management[24]:
**Unified domain memory**: Shared memory systems create consistent understanding across multiple agents by persisting information across interactions[24].
**Collaboration patterns**: Specialized roles (Planner, Researcher, Critic, Executor, Governor) require distinct context requirements tailored to their functions[25].
**Context isolation**: Each agent operates with scoped instructions, enabling parallel execution while avoiding information overload[26].
## Industry Leaders' Perspectives
Several prominent figures have shaped the context engineering discourse:
**Walden Yan** (Cognition AI) declared context engineering "the #1 job of engineers building AI agents" in 2025[20][27].
**Drew Breunig** articulated specific failure modes: Context Poisoning (hallucinations entering context), Context Distraction (training overwhelmed by context), Context Confusion (superfluous context influencing responses), and Context Clash (contradictory context elements)[28][20].
**Phil Schmid** emphasized the operational definition: "Context Engineering is the discipline of designing and building dynamic systems that provides the right information and tools, in the right format, at the right time, to give a LLM everything it needs to accomplish a task"[3].
## Practical Implications and Future Directions
The convergence of research and industry practice points toward several key implications:
**Context windows are not limitless resources**. Despite models advertising million-token windows, practical performance degrades well before these limits, requiring careful context budget management[10][13].
**Format matters as much as content**. How information is structured, ordered, and presented significantly impacts model behavior—sometimes in counterintuitive ways[10].
**Dynamic systems outperform static prompts**. The shift from fixed prompts to adaptive context management systems represents the fundamental evolution from prompt to context engineering[5].
**Domain expertise becomes embeddable**. Context engineering allows domain experts to encode organizational knowledge into AI inputs without requiring AI expertise, democratizing agent development[27].
The field continues evolving rapidly, with researchers investigating hybrid context strategies, context compression techniques, and automated context optimization where one model optimizes context for another[29]. As **Anthropic** summarizes: "Even as models continue to improve, the challenge of maintaining coherence across extended interactions will remain central to building more effective agents"[4].
Sources
[1] +1 for "context engineering" over "prompt ... https://x.com/karpathy/status/1937902205765607626?lang=en
[2] Context engineering https://simonwillison.net/2025/jun/27/context-engineering/
[3] The New Skill in AI is Not Prompting, It's Context Engineering https://www.philschmid.de/context-engineering
[4] Effective context engineering for AI agents https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents
[5] The rise of "context engineering" https://blog.langchain.com/the-rise-of-context-engineering/
[6] What is context engineering? | Harrison Chase posted on ... https://www.linkedin.com/posts/harrison-chase-961287118_the-rise-of-context-engineering-context-activity-7342960325635297281-b3Pg
[7] A Survey of Context Engineering for Large Language Models https://arxiv.org/abs/2507.13334
[8] Meirtz/Awesome-Context-Engineering https://github.com/Meirtz/Awesome-Context-Engineering
[9] Context Engineering for AI Practitioners: Research Paper ... https://www.linkedin.com/pulse/context-engineering-ai-practitioners-research-paper-balaji-vpobc
[10] Context Rot: How Increasing Input Tokens Impacts LLM ... https://research.trychroma.com/context-rot
[11] LLM Context Rot https://cobusgreyling.substack.com/p/llm-context-rot
[12] NoLiMa: Long-Context Evaluation Beyond Literal Matching https://github.com/adobe-research/NoLiMa
[13] NoLiMa: Long-Context Evaluation Beyond Literal Matching https://arxiv.org/abs/2502.05167
[14] NoLiMa: Long-Context Evaluation Beyond Literal Matching https://arxiv.org/html/2502.05167v1
[15] Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models https://arxiv.org/abs/2510.04618
[16] Agentic Context Engineering: Learning Comprehensive ... https://openreview.net/forum?id=eC4ygDs02R
[17] Introducing the Model Context Protocol https://www.anthropic.com/news/model-context-protocol
[18] Model Context Protocol https://en.wikipedia.org/wiki/Model_Context_Protocol
[19] What Is the Model Context Protocol (MCP) and How It Works https://www.descope.com/learn/post/mcp
[20] Context Engineering https://blog.langchain.com/context-engineering-for-agents/
[21] How we built our multi-agent research system https://www.anthropic.com/engineering/multi-agent-research-system
[22] Context Length Alone Hurts LLM Performance Despite ... https://arxiv.org/html/2510.05381v1
[23] Context Length Alone Hurts LLM Performance Despite ... https://aclanthology.org/2025.findings-emnlp.1264.pdf
[24] Context engineering as the foundation of multi-agent AI ... https://hypermode.com/blog/context-engineering-multi-agent
[25] Architecting Smarter Multi-Agent Systems with Context ... https://onereach.ai/blog/smarter-context-engineering-multi-agent-systems/
[26] Best practices for building AI multi agent system https://www.vellum.ai/blog/multi-agent-systems-building-with-context-engineering
[27] What is Context Engineering? https://blog.promptlayer.com/what-is-context-engineering/
[28] Prompts vs. Context https://www.dbreunig.com/2025/06/25/prompts-vs-context.html
[29] The Future of AI: Context Engineering in 2025 and Beyond https://dev.to/lofcz/the-future-of-ai-context-engineering-in-2025-and-beyond-5n9
[30] Context Engineering for Multi-Agent LLM Code Assistants Using Elicit, NotebookLM, ChatGPT, and Claude Code https://arxiv.org/abs/2508.08322
[31] LoCoBench-Agent: An Interactive Benchmark for LLM Agents in Long-Context Software Engineering https://www.semanticscholar.org/paper/7e4d3b819547212190325d0771cafebcd009f241
[32] Optimization through In-Context Learning and Iterative LLM Prompting for Nuclear Engineering Design Problems https://arxiv.org/abs/2503.19620
[33] OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System https://arxiv.org/abs/2509.18091
[34] Context Engineering for Trustworthiness: Rescorla Wagner Steering Under Mixed and Inappropriate Contexts https://arxiv.org/abs/2509.04500
[35] Haystack Engineering: Context Engineering for Heterogeneous and Agentic Long-Context Evaluation https://arxiv.org/abs/2510.07414
[36] QuantMind: A Context-Engineering Based Knowledge Framework for Quantitative Finance https://arxiv.org/abs/2509.21507
[37] Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks https://arxiv.org/abs/2505.00234
[38] Integrating LLM-Based Text Generation with Dynamic Context Retrieval for GUI Testing https://ieeexplore.ieee.org/document/10989041/
[39] InfiniteICL: Breaking the Limit of Context Window Size via Long Short-term Memory Transformation https://arxiv.org/pdf/2504.01707.pdf
[40] LLoCO: Learning Long Contexts Offline http://arxiv.org/pdf/2404.07979.pdf
[41] StreamAdapter: Efficient Test Time Adaptation from Contextual Streams https://arxiv.org/pdf/2411.09289.pdf
[42] Self-Taught Agentic Long Context Understanding http://arxiv.org/pdf/2502.15920.pdf
[43] Improving Large Language Model (LLM) fidelity through context-aware grounding: A systematic approach to reliability and veracity https://arxiv.org/pdf/2408.04023.pdf
[44] GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models http://arxiv.org/pdf/2406.14550.pdf
[45] LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs http://arxiv.org/pdf/2502.06139.pdf
[46] Unlocking Context Constraints of LLMs: Enhancing Context Efficiency of LLMs with Self-Information-Based Content Filtering https://arxiv.org/pdf/2304.12102.pdf
[47] Improving Long-context Large Language Models with AI ... https://aclanthology.org/2025.acl-long.187/
[48] Context Engineering : Andrej Karpathy drops a new term ... https://www.reddit.com/r/PromptEngineering/comments/1llj2ro/context_engineering_andrej_karpathy_drops_a_new/
[49] The Top LLMs for Long Context Windows in 2025 https://www.siliconflow.com/articles/en/top-LLMs-for-long-context-windows
[50] GitHub - davidkimai/Context-Engineering https://github.com/davidkimai/Context-Engineering
[51] The Best Open Source LLM for Context Engineering in 2025 https://www.siliconflow.com/articles/en/the-best-open-source-llm-for-context-enginneering
[52] Top 10 open source LLMs for 2025 https://www.instaclustr.com/education/open-source-ai/top-10-open-source-llms-for-2025/
[53] Context Engineering: Bringing Engineering Discipline to ... https://addyo.substack.com/p/context-engineering-bringing-engineering
[54] What Is Context Engineering? A Guide for AI & LLMs https://intuitionlabs.ai/articles/what-is-context-engineering
[55] Top 9 Large Language Models as of November 2025 https://www.shakudo.io/blog/top-9-large-language-models
[56] LLM Leaderboard - Comparison of over 100 AI models ... https://artificialanalysis.ai/leaderboards/models
[57] Why context engineering beats prompt engineering https://www.linkedin.com/posts/randalolson_andrej-karpathy-karpathy-on-x-activity-7344035363910258690-KXkm
[58] LLM Leaderboard 2025 https://www.vellum.ai/llm-leaderboard
[59] Context Engineering Guide https://www.promptingguide.ai/guides/context-engineering-guide
[60] Top Large Language Models(LLMs) as of August 2025 https://www.prismetric.com/top-llms-right-now/
[61] The 8th International Scientific and Practical Conference: Transport, Education, Logistics and Engineering, 27–28th of June 2025, Riga, Latvia https://journals.anstar.edu.pl/index.php/sti/article/view/669
[62] SCALAR: Scientific Citation-based Live Assessment of Long-context Academic Reasoning https://arxiv.org/abs/2502.13753
[63] Preface: 4th International Conference on Architecture, Geotechnical Engineering and Construction Technology (AGECT 2025) https://hsetdata.org/index.php/ojs/article/view/1
[64] Challenge on Optimization of Context Collection for Code Completion https://arxiv.org/abs/2510.04349
[65] Preface: 8th International Conference on Chemical Engineering and Advanced Materials (CEAM 2025) https://hsetdata.net/index.php/ojs/article/view/2
[66] Preface: 4th International Conference on Food Engineering, Nutriology and Biological Chemistry (FENBC 2025) https://hsetdata.com/index.php/ojs/article/view/992
[67] From Correlation to Context: Evaluating Quantitative Research Practices in Islamic Education https://serambi.org/index.php/managere/article/view/871
[68] ReviewRL: Towards Automated Scientific Review with RL https://arxiv.org/abs/2508.10308
[69] An Evaluation of Large Language Models on Text Summarization Tasks Using Prompt Engineering Techniques https://arxiv.org/abs/2507.05123
[70] Database Theory Column Report on PODS 2025 https://dl.acm.org/doi/10.1145/3767145.3767152
[71] On-Chip Optical Switching with Epsilon-Near-Zero Metamaterials https://arxiv.org/abs/2501.09387
[72] Network Dynamics with Higher-Order Interactions: Coupled Cell Hypernetworks for Identical Cells and Synchrony https://arxiv.org/abs/2201.09379v1
[73] Generator of Neural Network Potential for Molecular Dynamics: Constructing Robust and Accurate Potentials with Active Learning for Nanosecond-scale Simulations https://arxiv.org/abs/2411.17191
[74] Time-optimal transfer of the quantum state in long qubit arrays https://arxiv.org/abs/2501.11933
[75] A unified framework for mechanics. Hamilton-Jacobi equation and applications https://arxiv.org/abs/1001.0482
[76] Summarising and Comparing Agent Dynamics with Contrastive Spatiotemporal Abstraction https://arxiv.org/abs/2201.07749v2
[77] Uncooled Thermal Infrared Detection Near the Fundamental Limit Using a Nanomechanical Resonator with a Broadband Absorber https://arxiv.org/abs/2501.03161
[78] High Throughput Screening of Expression Constructs using Microcapillary Arrays https://arxiv.org/abs/2501.17647
[79] [2510.26493] Context Engineering 2.0 https://arxiv.org/abs/2510.26493
[80] Building a Deep Research Agent https://www.promptingguide.ai/agents/context-engineering-deep-dive
[81] The impact of relevance in context engineering for AI agents https://www.elastic.co/search-labs/blog/context-engineering-relevance-ai-agents-elasticsearch
[82] Best LLMs for Extended Context Windows https://research.aimultiple.com/ai-context-window/
[83] Best Practices for Context Management when Generating ... https://docs.digitalocean.com/products/gradient-ai-platform/concepts/context-management/
[84] LLM Context Management: How to Improve Performance and ... https://eval.16x.engineer/blog/llm-context-management-guide
[85] Context Engineering: The Real Reason AI Agents Fail in ... https://inkeep.com/blog/context-engineering-why-agents-fail
[86] A Comprehensive Review of AI Agents https://arxiv.org/html/2508.11957v1
[87] The Maximum Effective Context Window for Real World ... https://arxiv.org/pdf/2509.21361.pdf
[88] Enhanced Product Review Recommendations Using Collaborative Filtering and Singular Value Decomposition https://invergejournals.com/index.php/ijss/article/view/118
[89] Supporting Contextual Conversational Agent-Based Software Development https://arxiv.org/pdf/2305.00885.pdf
[90] In-Context Translation: Towards Unifying Image Recognition, Processing, and Generation https://arxiv.org/html/2404.09633
[91] In-context Autoencoder for Context Compression in a Large Language Model http://arxiv.org/pdf/2307.06945.pdf
[92] CSM-H-R: A Context Modeling Framework in Supporting Reasoning Automation for Interoperable Intelligent Systems and Privacy Protection http://arxiv.org/pdf/2308.11066.pdf
[93] Context-Aware Semantic Recomposition Mechanism for Large Language Models https://arxiv.org/pdf/2501.17386.pdf
[94] GEMS: Generative Expert Metric System through Iterative Prompt Priming http://arxiv.org/pdf/2410.00880.pdf
[95] Contextual Confidence and Generative AI https://arxiv.org/pdf/2311.01193.pdf
[96] Unleashing the potential of prompt engineering in Large Language Models: a comprehensive review https://arxiv.org/pdf/2310.14735v3.pdf
[97] From Prompt Tricks to Context Engineering: What CIOs ... https://dev.to/rylko_roman_965498de23cd8/from-prompt-tricks-to-context-engineering-what-cios-should-know-and-do-in-2025-51in
[98] AI Index | Stanford HAI https://hai.stanford.edu/ai-index
[99] Building Context-Aware Reasoning Applications with ... https://www.youtube.com/watch?v=cwjs1WAG9CM
[100] Context Engineering Best Practices for Reliable AI in 2025 https://www.kubiya.ai/blog/context-engineering-best-practices
[101] 2025 AI Index Report Now Available | NISO website http://www.niso.org/niso-io/2025/04/2025-ai-index-report-now-available
[102] Context Engineering: The New Backbone of Scalable AI ... https://www.qodo.ai/blog/context-engineering/
[103] Artificial Intelligence Index Report 2025 | Stanford HAI https://hai.stanford.edu/assets/files/hai_ai_index_report_2025.pdf
[104] Artificial Intelligence Index Report 2025 https://iris.imtlucca.it/bitstream/20.500.11771/34398/1/hai_ai_index_report_2025.pdf
[105] Harrison Chase https://blog.langchain.com/author/harrison/
[106] Simon Willison on context-engineering https://simonwillison.net/tags/context-engineering/
[107] Yolanda Gil's Homepage - Knowledge Capture And Discovery https://knowledgecaptureanddiscovery.github.io/yolanda_gil_website/
[108] [2504.07139] Artificial Intelligence Index Report 2025 https://arxiv.org/abs/2504.07139
[109] "Context engineering is building dynamic systems ... https://x.com/hwchase17/status/1937194145074020798
[110] Gemma 3n, Context Engineering and a whole lot of Claude ... https://simonw.substack.com/p/gemma-3n-context-engineering-and
[111] AI Index Report 2025 https://www.kaggle.com/datasets/paultimothymooney/ai-index-report-2025
[112] MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent https://arxiv.org/abs/2507.02259
[113] Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents https://arxiv.org/abs/2509.23040
[114] Optimizing Long-context LLM Serving via Fine-grained Sequence Parallelism https://www.semanticscholar.org/paper/e7488527756e435789e0169e42d958f6a1015354
[115] UniCAIM: A Unified CAM/CIM Architecture with Static-Dynamic KV Cache Pruning for Efficient Long-Context LLM Inference https://ieeexplore.ieee.org/document/11133273/
[116] EMPIRIC: Exploring Missing Pieces in KV Cache Compression for Reducing Computation, Storage, and Latency in Long-Context LLM Inference https://dl.acm.org/doi/10.1145/3759441.3759448
[117] ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference https://arxiv.org/abs/2502.00299
[118] SmartCache: Two-Dimensional KV-Cache Similarity for Efficient Long-Context LLM Decoding https://ieeexplore.ieee.org/document/11207292/
[119] KVO-LLM: Boosting Long-Context Generation Throughput for Batched LLM Inference https://ieeexplore.ieee.org/document/11132542/
[120] U-NIAH: Unified RAG and LLM Evaluation for Long Context Needle-In-A-Haystack https://arxiv.org/abs/2503.00353
[121] Smooth Reading: Bridging the Gap of Recurrent LLM to Self-Attention LLM on Long-Context Tasks https://arxiv.org/abs/2507.19353
[122] Lost in the Middle: How Language Models Use Long Contexts https://arxiv.org/pdf/2307.03172.pdf
[123] LongReD: Mitigating Short-Text Degradation of Long-Context Large
Language Models via Restoration Distillation http://arxiv.org/pdf/2502.07365.pdf
[124] LongGenBench: Long-context Generation Benchmark http://arxiv.org/pdf/2410.04199.pdf
[125] Explaining Context Length Scaling and Bounds for Language Models http://arxiv.org/pdf/2502.01481.pdf
[126] Self-Consistency Falls Short! The Adverse Effects of Positional Bias on
Long-Context Problems http://arxiv.org/pdf/2411.01101.pdf
[127] BABILong: Testing the Limits of LLMs with Long Context
Reasoning-in-a-Haystack https://arxiv.org/html/2406.10149v1
[128] ALR$^2$: A Retrieve-then-Reason Framework for Long-context Question
Answering http://arxiv.org/pdf/2410.03227.pdf
[129] Exploring Context Window of Large Language Models via ... https://arxiv.org/abs/2405.18009
[130] Why Does the Effective Context Length of LLMs Fall Short? https://arxiv.org/html/2410.18745v1
[131] Evaluating Long-Context LLMs - NoLiMa https://portkey.ai/blog/evaluating-long-context-llms/
[132] The Maximum Effective Context Window for Real World ... https://www.arxiv.org/pdf/2509.21361.pdf
[133] NoLiMa: Long-Context Evaluation Beyond Literal Matching https://huggingface.co/papers/2502.05167
[134] LongReD: Mitigating Short-Text Degradation of Long ... https://arxiv.org/abs/2502.07365
[135] Context rot: the emerging challenge that could hold back ... https://www.understandingai.org/p/context-rot-the-emerging-challenge
[136] Does quantization affect models' performance on long- ... https://arxiv.org/abs/2505.20276
[137] LLMs now accept longer inputs, and the best models can ... https://epoch.ai/data-insights/context-windows
[138] Lost in the Middle: How Language Models Use Long ... https://arxiv.org/abs/2307.03172
[139] Reasoning Degradation in LLMs with Long Context Windows https://community.openai.com/t/reasoning-degradation-in-llms-with-long-context-windows-new-benchmarks/906891?page=2
[140] FinRL Contest 2025 Task 1:Market-Aware In-Context Learning Framework for Proximal Policy Optimization in Stock Trading Using DeepSeek https://ieeexplore.ieee.org/document/11038765/
[141] Operand Quant: A Single-Agent Architecture for Autonomous Machine Learning Engineering https://arxiv.org/abs/2510.11694
[142] Review of the Textbook Tyukavin, A.I., & Suchkov, S.V. (Eds.). (2025). Fundamentals of Personalized Biomedicine and Biopharmacy. Moscow: INFRA-M. 460 p.: ill. (Higher Education). DOI: 10.12737/2198519. ISBN 978-5-16-020841-1 (print), ISBN 978-5-16-113520-4 https://journals.eco-vector.com/PharmForm/article/view/692160
[143] The Prompt Engineering Report Distilled: Quick Start Guide for Life Sciences https://arxiv.org/abs/2509.11295
[144] Teaching Environmentalism in Philippine Schools: An Empirical Integrative Review https://bedanjournal.org/index.php/berj/article/view/82
[145] Trust Dynamics in Strategic Coopetition: Computational Foundations for Requirements Engineering in Multi-Agent Systems https://arxiv.org/abs/2510.24909
[146] The evolution of ergonomics – from the concepts of Wojciech Bogumił Jastrzębowski to ergonomic policies of enterprises in the era of Industry 5.0 https://managementpapers.polsl.pl/wp-content/uploads/2025/09/226-Butlewski-Tytyk.pdf
[147] Impact of LLMs on Team Collaboration in Software Development https://arxiv.org/abs/2510.08612
[148] Hydrogels in Peri-Implant Regeneration: Strategies for Modulating Tissue Healing https://www.mdpi.com/1999-4923/17/9/1105
[149] Conceptual Framework for Autonomous Cognitive Entities https://arxiv.org/pdf/2310.06775.pdf
[150] Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in
Realistic Environments https://arxiv.org/html/2501.10893v1
[151] Agent-E: From Autonomous Web Navigation to Foundational Design
Principles in Agentic Systems http://arxiv.org/pdf/2407.13032.pdf
[152] Towards Enterprise-Ready Computer Using Generalist Agent https://arxiv.org/pdf/2503.01861.pdf
[153] Beyond Black-Box Benchmarking: Observability, Analytics, and
Optimization of Agentic Systems https://arxiv.org/pdf/2503.06745.pdf
[154] Autonomous Deep Agent http://arxiv.org/pdf/2502.07056.pdf
[155] Agent AI: Surveying the Horizons of Multimodal Interaction https://arxiv.org/abs/2401.03568
[156] Agentic Context Engineering: Prompting Strikes Back https://shashikantjagtap.net/agentic-context-engineering-prompting-strikes-back/
[157] Agentic Context Engineering: The Complete 2025 Guide ... https://www.sundeepteki.org/blog/agentic-context-engineering
[158] Introducing ACE: A Framework for Self-Improving AI Contexts https://www.linkedin.com/posts/skphd_agentic-context-engineering-activity-7385310211496075264-2gMs
[159] Introduction to Model Context Protocol - Anthropic Courses https://anthropic.skilljar.com/introduction-to-model-context-protocol
[160] ACE (Agentic Context Engineering): A New Framework ... https://www.reddit.com/r/PromptEngineering/comments/1o57k4p/ace_agentic_context_engineering_a_new_framework/
[161] 6 Model Context Protocol alternatives to consider in 2025 https://www.merge.dev/blog/model-context-protocol-alternatives
[162] What is the Model Context Protocol (MCP)? - Model Context ... https://modelcontextprotocol.io
[163] How and when to build multi-agent systems https://blog.langchain.com/how-and-when-to-build-multi-agent-systems/
[164] Your Agents Just Got a Memory Upgrade: ACE Open- ... https://sambanova.ai/blog/ace-open-sourced-on-github
[165] Code execution with MCP: building more efficient AI agents https://www.anthropic.com/engineering/code-execution-with-mcp
# Context Engineering for Content Creation: Key Insights and Best Practices (2025)
## Why Context Engineering Matters for Content Creation
**Context engineering in content writing is about curating and controlling the information, guidance, and knowledge an LLM receives: not just crafting prompts, but deliberately filling the model's context window so it can achieve deeper accuracy, coherence, and originality in long-form outputs.** Key reasons include:
- **Rich context helps maintain narrative flow and thematic consistency** across long articles or multipart posts, preventing abrupt changes in topic or style[1][2].
- **Structured background material improves factual accuracy** and allows models to cite sources or link ideas more reliably[3].
- **Domain-specific instructions enable stylistic adaptation** (tone, audience, format) while avoiding bland, generic results[4][5].
- **Memory and summarization strategies prevent drift and loss of earlier details** in longer or serialized works[3].
## Techniques for Effective Context Engineering in Content Writing
### 1. Curating and Layering Reference Material
Instead of giving models just a short prompt ("Write an article about LLMs"), **context engineering involves layering the following**:
- **Background sections**: Factual or thematic overviews, previous relevant articles, recent news, and technical summaries help ground the writing[1][2].
- **Structured outlines**: High-level bullet points or headings guide logical flow and help the model maintain structure as it writes[4][1].
- **Tone and style exemplars**: Providing samples of the desired writing style (formal, casual, technical) enables the model to adapt and match tone more accurately[6][7].
- **Citation targets and source material**: Including research papers, URLs, or quotes ensures the generated articles reference reliable information throughout[8][3].
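The layering above can be sketched as a simple prompt-assembly step. This is a minimal illustration, not a prescribed API: the `build_context` helper and its section headings are hypothetical names chosen for the example.

```python
def build_context(background, outline, style_sample, sources, task):
    """Assemble a layered context window: background first, then
    structure, a tone exemplar, citation material, and the task last."""
    parts = [
        "## Background\n" + background,
        "## Outline\n" + "\n".join(f"- {h}" for h in outline),
        "## Style exemplar\n" + style_sample,
        "## Sources\n" + "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sources)),
        "## Task\n" + task,
    ]
    return "\n\n".join(parts)

ctx = build_context(
    background="LLMs generate text from a bounded context window.",
    outline=["What is context engineering", "Why prompts alone fail"],
    style_sample="Short, direct sentences. Concrete examples.",
    sources=["https://www.philschmid.de/context-engineering"],
    task="Write an article about context engineering.",
)
print(ctx.splitlines()[0])  # → "## Background"
```

Putting the task last keeps it adjacent to where generation begins, while the grounding material precedes it; the exact ordering is a design choice worth testing per model.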
### 2. Using Memory and Summarization
**Anthropic’s guidance** emphasizes the value of summarized context and persistent notes for content writing tasks[3]. This means:
- Summarizing previous chapters or posts as articles get longer, feeding these summaries into the next context window.
- Using persistent note files (scratchpads) where agents can jot down key details, subtopics, or style rules to recall in subsequent outputs.
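A rolling-memory pattern like the one described can be sketched as below. The `WritingMemory` class and its cap on retained summaries are illustrative assumptions, not a published interface.

```python
class WritingMemory:
    """Rolling memory for serialized writing: short summaries of the
    most recent finished parts, plus a persistent scratchpad of style
    rules and key details that should never drop out of context."""

    def __init__(self, max_summaries=3):
        self.summaries = []     # one short summary per finished part
        self.scratchpad = []    # persistent notes (style rules, names)
        self.max_summaries = max_summaries

    def finish_part(self, summary):
        self.summaries.append(summary)
        # bound context size: keep only the most recent summaries
        self.summaries = self.summaries[-self.max_summaries:]

    def note(self, line):
        self.scratchpad.append(line)

    def context_block(self):
        return ("Previously: " + " ".join(self.summaries)
                + "\nNotes: " + "; ".join(self.scratchpad))

mem = WritingMemory(max_summaries=2)
mem.note("Tone: conversational, second person")
mem.finish_part("Part 1 defined context engineering.")
mem.finish_part("Part 2 contrasted prompts with context.")
mem.finish_part("Part 3 covered context rot.")
```

Note the asymmetry: summaries of old parts are allowed to expire, but scratchpad notes persist, mirroring the distinction between recap material and durable style rules.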
### 3. Dynamic Context Selection
LangChain recommends dynamic context selection: **pulling in just the right background information for each section or paragraph**. For example, when writing a multipart article, retrieve only the history or prior discussions relevant to the current topic instead of dumping all history at once[4][5].
This avoids context bloat, helps models focus, and reduces distractions from unrelated details.
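In spirit, dynamic selection is a retrieval step before generation. The sketch below uses naive word overlap as a stand-in for a real retriever or embedding search; `select_context` and the scoring rule are assumptions for illustration only.

```python
def select_context(chunks, query, k=2):
    """Rank stored chunks by naive word overlap with the query and
    return only the top-k (a toy stand-in for embedding retrieval)."""
    q = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]

history = [
    "Discussion of context rot and long inputs.",
    "Reader feedback about the newsletter layout.",
    "Notes on prompt versus context engineering.",
]
picked = select_context(history, "context engineering versus prompt tricks", k=1)
```

Only the single most relevant history chunk reaches the context window; the rest stays in storage until a section actually needs it.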
### 4. Compressing and Isolating Context for Collaboration
In multi-author workflows or collaborative content projects (such as blogs or newsletters), models should use **compression to retain just key takeaways from prior drafts** and **isolate context windows for specialized contributors (subject experts, editors, style reviewers)**[9].
## Common Pitfalls and Solutions
### Context Rot and Information Degradation
Numerous studies, including those from Chroma Research, warn about **context rot**—performance and quality drop as more information is added without filtering[10]. For articles and posts, this means:
- Excessive or loosely related background can cause hallucinations or diluted core messaging.
- Distractors (irrelevant facts, off-topic sources) can break narrative flow.
**Solution:** Apply filtering and ranking to surface only the most relevant context before generation[10][4].
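One minimal form of such filtering is a relevance threshold that drops distractors before they enter the context at all. The `prune_context` helper and its `min_overlap` cutoff are hypothetical, shown only to make the idea concrete.

```python
def prune_context(chunks, topic_words, min_overlap=1):
    """Drop chunks sharing fewer than min_overlap words with the topic,
    so off-topic material never reaches the model's context window."""
    topic = set(w.lower() for w in topic_words)
    return [
        c for c in chunks
        if len(topic & set(c.lower().split())) >= min_overlap
    ]

drafts = [
    "Context engineering keeps articles grounded in sources.",
    "Our office moved to a new building last week.",
]
kept = prune_context(drafts, ["context", "engineering", "articles"], min_overlap=2)
```

Unlike top-k selection, a threshold also guards against the case where *nothing* in storage is relevant: it will return an empty list rather than force the least-bad distractor into context.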
### Formatting Matters
Research shows that **the organization of information—order, paragraph structure, sections—strongly affects quality**[10]. Inputs that mimic the final structure (headings, parenthetical notes for sources or quotes) yield better results.
### The Democratization of Context Engineering
Recent industry posts note that **domain experts without technical backgrounds can use context engineering to encode expertise directly into the model’s context** for articles or reports, making LLM-powered content creation practical for journalists, educators, marketers, and more[2][1].
## Expert Voices on Content Context Engineering
- **Harrison Chase**: “The real power isn’t in prompts—it’s in coordinating structured context, references, and style guidance for models so that each article or post can reflect deep expertise, not just surface-level summaries.”[5]
- **Simon Willison**: Promoted workflow patterns where reference material, recent events, prior articles, and discussion summaries are continuously fed and compressed to maintain accuracy in serialized or ongoing coverage.[11]
## Best Practices Summary
- **Layer factual background, tone/style examples, and source material in the input context.**
- **Use outlines and summaries to maintain structure and continuity in longer-form writing.**
- **Retrieve and inject only the most relevant prior content for each section, not everything at once.**
- **Compress and isolate context for specialized sections and contributors in multi-author or collaborative environments.**
- **Regularly summarize and prune older content to avoid context rot and drift.**
- **Format guidance and context to closely match the desired article structure.**
Context engineering for LLM-powered content creation is rapidly becoming the standard for high-quality, factually accurate, and engaging articles, posts, and reports[8][2][1][4][3][10][6][7][5][11][9].
## Sources
[1] The New Skill in AI is Not Prompting, It's Context Engineering https://www.philschmid.de/context-engineering
[2] What is Context Engineering? https://blog.promptlayer.com/what-is-context-engineering/
[3] Effective context engineering for AI agents https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents
[4] Context Engineering https://blog.langchain.com/context-engineering-for-agents/
[5] The rise of "context engineering" https://blog.langchain.com/the-rise-of-context-engineering/
[6] Context Engineering Best Practices for Reliable AI in 2025 https://www.kubiya.ai/blog/context-engineering-best-practices
[7] Context Engineering: The New Backbone of Scalable AI ... https://www.qodo.ai/blog/context-engineering/
[8] A Survey of Context Engineering for Large Language Models https://arxiv.org/abs/2507.13334
[9] Best practices for building AI multi agent system https://www.vellum.ai/blog/multi-agent-systems-building-with-context-engineering
[10] Context Rot: How Increasing Input Tokens Impacts LLM ... https://research.trychroma.com/context-rot
[11] Context engineering https://simonwillison.net/2025/jun/27/context-engineering/