# 46 — Fresh literature & market delta (clean-room re-sweep) (https://jackin.tailrocks.com/research/token-optimization/46-fresh-literature-and-market-delta/)


# 46 — Fresh literature & market delta (clean-room re-sweep) [#46--fresh-literature--market-delta-clean-room-re-sweep]

Volume II area file for blind spot 6: a clean-room pass for work
Volume I's 03 market scan does not cite, plus everything token-relevant that shipped or published
around/after Volume I's freeze. Each entry is tagged **timing** (published after the
freeze) or **oversight** (existed before, uncited), with its evidence tier and a verdict on whether
it changes a Volume I number..

**TL;DR**

* **The single biggest market-delta for this operator: Fable 5's free subscription window ends
  .** Anthropic made Fable 5 free on Pro/Max/Team/Enterprise "through June 22"
  and "will remove Fable 5 from those plans" on June 23 (credits required after). Volume I measured on
  Fable 5 *inside* that window. After 06-23 the cheapest same-tokenizer-family model for a subscriber
  flips to **Opus 4.8*&#x2A; ($5/$25 vs Fable's $10/$50, identical tokenizer) — and this environment
  already runs Opus 4.8 on the main loop (measured: 465/560 calls). So Volume I's Fable-5-priced
  dollar figures should be read as **\~half** for an Opus-4.8 subscriber (carried to 49).
* **The KV-cache compression family Volume I missed (SnapKV, H2O, PyramidKV, KVQuant) is real,
  NeurIPS-grade, and entirely self-host-only on a hosted API.** Eviction (H2O 5×@20%, SnapKV 8.2×
  memory\@16K, PyramidKV parity\@12% retention) and quantization (KVQuant 3-bit, \<0.1 ppl, 10M context)
  reduce *server-side GPU memory*, not API-billed tokens — they extend Volume I's self-host tier (19),
  not the hosted-subscription operator's options. Missed by **oversight**; verdict matches Volume I's
  K1/K2 hosted-API block.
* **CAG (Cache-Augmented Generation) is the one genuinely new *buildable-on-hosted-Claude* idea here.**
  The literal mechanism (export/persist KV tensors) is self-host-only, but its *pattern* — preload a
  stable corpus once and reuse it instead of retrieving — is approximable today via Anthropic prompt
  caching: a repo/spec that fits 200K context becomes a long-lived cached prefix read at 0.1×,
  composing *with* caching rather than fighting it (unlike LLMLingua). REAL-NOW.
* **A new risk axis: prompt-compression proxies are an attack surface.** CompressionAttack (arXiv
  2510.22963) flips downstream agent behavior with up to 80% success by perturbing inputs that survive
  compression — adding a *security* reason to Volume I's cost-based "don't put a compressor in the hot
  path" verdict. Two new lossless-compression papers (LoPace at-rest; LTSC meta-tokens) do **not** help
  a hosted-API token bill (one compresses disk bytes, the other needs a fine-tuned model).
* **No post-06-12 change to Anthropic's API cost primitives** (caching multipliers, context-editing
  beta, memory tool, effort ladder all unchanged, re-). The deltas are Claude Code
  *app* features (`/cd` preserves cache; subagents now nest 5 deep; `/usage` gained a cache-miss line)
  and the 06-15 billing split (file 41). Routing research broadened past RouteLLM (LLMRouterBench,
  OrcaRouter) but none produces a Claude-cache-aware number that overrides Volume I's verdict.

Dollar/quota context inherited from Volume I and file 41; tiers per finding.

***

## A. Provider changelog drift (re-verified live) [#a-provider-changelog-drift-re-verified-live]

| Change                                                                                                      | Date      | Tier | Effect on a Volume I number                                  |
| ----------------------------------------------------------------------------------------------------------- | --------- | ---- | ------------------------------------------------------------ |
| **Fable 5 free window ends; removed from plans 06-23**                                                      |           | T1   | Big — model-tiering flips to Opus 4.8 for subscribers (49)   |
| **SDK/`claude -p`/GitHub-Actions usage off the subscription cap** → separate API-rate credit                |           | T1   | Bifurcates the cost model by surface (file 41 Q7)            |
| **Claude Code: subagents nest up to 5 levels deep** (2.1.172)                                               |           | T1   | Compounds the \~7× team blowup geometrically (47 governance) |
| **Claude Code `/cd` preserves the prompt cache** across a dir change (2.1.169)                              |           | T1   | New negative-cost cache lever (technique FL2)                |
| **`/usage` adds cache-miss + subagent/long-context attribution** (2.1.174)                                  |           | T1   | Partially closes Volume I white-space #2 (cache tooling)     |
| OpenAI **GPT-5.5** in API ($5/$30, $0.50 cached = 0.1×; effort none→xhigh; 1.05M ctx; >272K billed 2×/1.5×) |           | T1   | Portability (45); no Claude-side change                      |
| OpenAI **`prompt_cache_retention` defaults to 24h** (non-ZDR)                                               |           | T1   | Strengthens OpenAI cross-provider cache case (45)            |
| OpenAI **container sessions billed per-minute** (5-min min)                                                 |           | T1   | Minor; rewards finishing fast (43)                           |
| Gemini implicit caching default, 90% discount                                                               | live      | T1   | Confirms Volume I baseline; no drift                         |
| **Anthropic API cost primitives** (cache 0.1×/1.25×/2×, context-editing, memory tool, effort)               | unchanged | T1   | Volume I's API math stands as of 06-13                       |

The verdict on the changelog: &#x2A;*Anthropic's billable API primitives did not move; the operator-facing
deltas are app features and the two billing changes (Fable promo, 06-15 split).** Both billing changes
hit the *subscriber* directly and are folded into files 41 and the verdict (49).

## B. The KV-cache compression family (eviction + quantization) — self-host-only [#b-the-kv-cache-compression-family-eviction--quantization--self-host-only]

Two branches, all NeurIPS-grade, all 0-hit in Volume I (oversight). A unified user-reachable
implementation exists (KVCache-Factory, github.com/Zefan-Cai/KVCache-Factory) — but only for
self-hosted open models.

| Method                 | Branch                        | Headline                                               | Venue         | Hosted Claude?     |
| ---------------------- | ----------------------------- | ------------------------------------------------------ | ------------- | ------------------ |
| H2O (2306.14048)       | eviction (heavy-hitter)       | KV memory −5× @20% budget; throughput up to 29×        | NeurIPS 2023  | **No — self-host** |
| SnapKV (2404.14469)    | eviction (observation-window) | 8.2× memory, 3.6× gen speed @16K; "comparable" quality | NeurIPS 2024  | **No — self-host** |
| PyramidKV (2406.02069) | eviction (per-layer budget)   | full-cache parity at 12% retention                     | preprint (T2) | **No — self-host** |
| KVQuant (2401.18079)   | quantization (3-bit)          | \<0.1 ppl loss; 10M context on 8×A100                  | NeurIPS 2024  | **No — self-host** |

**Verdict:** these reduce GPU memory and extend context *inside the inference engine*; the API
boundary still bills the same prompt tokens. They are a self-host serving lever — the same scope as
Volume I's vLLM/SGLang/HiCache section (19) and its K2 (LMCache/CacheGen) — and do **not** give a
hosted-Claude subscriber a token or quota saving. Honest negative result that bounds the gap.

One sharpening for thinking-heavy Claude work: &#x2A;*"Hold Onto That Thought" (arXiv 2512.12008)** finds KV compression *on reasoning* can be perverse — eviction at low budgets
*lengthens* reasoning traces (more output tokens), the same "compress-more-costs-more" direction
Volume I found for LLMLingua. So even on self-host, KV-evicting a thinking model can raise output
cost. (Exact accuracy-drop percentages need the full PDF — flagged.)

## C. CAG vs RAG — and the buildable hosted-Claude pattern [#c-cag-vs-rag--and-the-buildable-hosted-claude-pattern]

**CAG ("Don't Do RAG…", arXiv 2412.15605, WWW '25 short paper)*&#x2A; preloads the whole knowledge base
into context, precomputes and persists its KV cache, and answers queries with no retrieval step. It
meets or beats RAG on BERTScore when the corpus fits the context window (small margins, single 8B
model). The literal mechanism (persist KV tensors) is self-host-only. &#x2A;(The 2.33 s vs 94.35 s latency
and exact BERTScore figures appear only in secondary sources — not cited as primary; flagged.)*

The portable insight is the technique below (FL1): on hosted Claude the **CAG pattern** is
approximable with prompt caching — preload the stable corpus as a cached prefix, reuse at 0.1×.

## D. Prompt compression past LLMLingua / 500xCompressor [#d-prompt-compression-past-llmlingua--500xcompressor]

| Work                                         | What                                                                                      | Verdict for a hosted-Claude token bill                                                                                                                                                    |
| -------------------------------------------- | ----------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **LoPace** (2602.13266, Feb 2026)            | lossless compression *at rest* (Zstd/BPE+packing), 72.2% disk saving, 100% reconstruction | **Irrelevant to tokens** — compresses bytes on disk, not tokens sent; only helps a memory-store's footprint (Vol I 14). Category error to avoid.                                          |
| **LTSC meta-tokens** (2506.00307, May 2025)  | *lossless* token compression, 18–27% sequence reduction                                   | **Self-host-only** — needs a model fine-tuned to decode meta-tokens; a frontier hosted model can't read them. Bounds the "lossless input compression on hosted APIs" hope as still empty. |
| **CompactPrompt** (secondary, 2026)          | pruning + abbreviation + data-quantization, "up to 60%"                                   | **T4, do not propagate** — single vendor guide, lossy NL-oriented; the LLMLingua code-degradation + cache-breaking caveats almost certainly hold.                                         |
| **CompressionAttack** (2510.22963, Oct 2025) | security: perturbations surviving compression flip agent behavior (≤80% ASR)              | **New risk axis** — adds a security reason to Volume I's cost-based "no compressor in the hot path."                                                                                      |

Net: the prompt-compression frontier since LLMLingua-2/500x produced **no new lossy compressor that is
both user-reachable on hosted Claude and safe for code** — the two lossless advances are either
at-rest (no token effect) or self-host (fine-tuned decoder), and the one fresh empirical result is an
*attack* on compressors. Volume I's verdict (LLMLingua-class proxies are a QUALITY-TRADE/cost trap for
coding) is reinforced, now on security grounds too.

## E. Routers past RouteLLM / RouterArena [#e-routers-past-routellm--routerarena]

* **LLMRouterBench** (2601.07206, Jan 2026): 400K+ instances, 33 models, 10 baselines; top routers
  +4% accuracy or 31.7% cost reduction at baseline performance. The larger benchmark Volume I (16)
  did not have; a better *instrument*, but its cost numbers are benchmark-wide, not Claude-pair- or
  cache-aware, so it does **not** override Volume I's verdict that per-request gateway routing breaks
  Claude's model-scoped cache.
* **OrcaRouter** (2605.30736, May 2026): production LinUCB contextual-bandit router, RouterArena #2 at
  submission. Resolves part of Volume I 16's open "no current router eval" question — still a
  stateless-API metric, same cache caveat.
* **RouterEval** (EMNLP 2025): pointer (secondary-sourced); confirms the routing-eval field broadened.

**Verdict:** routing evaluation matured, but the Claude-Code-specific blocker Volume I identified
(per-request routing forfeits the model-scoped prompt cache, and tokenizer-blind routers mis-price
cross-tier moves) is unchanged. No new router produces a cache-aware Claude number.

## F. "Context engineering" tooling — a rename, not new math [#f-context-engineering-tooling--a-rename-not-new-math]

A "context engineering" platform category solidified by mid-2026 (Atlan Context Studio, Packmind,
Tessl, Ruler; native features in Claude Code/Cursor/Copilot), framed as a governed 5-phase context
layer with vendor accuracy claims (94–99% "with governed context" vs 10–31% "without"). The accuracy
figures are vendor-interested and unmeasured (T4). Honest read: this **renames techniques Volume I
already covers** (retrieval, memory, context architecture) under an enterprise-platform banner — it
changes vocabulary, not the dollar or quota math. Logged so the dossier is not blind to the term.

***

## Techniques [#techniques]

### FL1. CAG pattern on hosted Claude — preload the stable corpus as a cached prefix, reuse at 0.1× [#fl1-cag-pattern-on-hosted-claude--preload-the-stable-corpus-as-a-cached-prefix-reuse-at-01]

The buildable residue of CAG: when a knowledge base fits the 200K subscription context, cache it once
instead of retrieving per turn.

* **Coverage-delta:** New. CAG is 0-hit in Volume I (confirmed 40-overview); Volume I 14 debates
  RAG-vs-agentic-search but never the preload-and-cache pattern.
* **Layer:** input / cache / retrieval.
* **Mechanism:** place a stable corpus (the repo's key files, a spec, an API reference) in the prompt
  prefix behind a `cache_control` breakpoint; the first turn writes it (1.25–2×), every later turn
  reads it at 0.1× with no retrieval step, no retrieval errors, and no per-query embedding cost. This
  is the hosted-API analogue of CAG's "preload + reuse KV" — it composes *with* prompt caching, unlike
  LLMLingua which fights it.
* **Expected savings:** converts per-turn RAG retrieval+injection into a one-time cached-prefix write

- 0.1× reads. Wins when the corpus fits context and is queried many times within a TTL; on a
  subscription the cached reads are also \~0.1× against the cap (file 41). Bounded by the 200K
  subscription context (file 41) — large corpora still need retrieval.

* **Evidence tier:** T2 (CAG paper, WWW '25) for the pattern's quality-vs-RAG; T1 for the prompt-cache
  mechanics it rides.
* **Quality risk:** **NEUTRAL-to-NEGATIVE-COST** when the corpus fits (CAG meets/beats RAG on
  BERTScore; removes retrieval errors). RISKY if the corpus exceeds context (silent truncation) or
  goes stale in the cached prefix. Falsify by comparing answer quality preload-vs-retrieve on your
  corpus.
* **Availability:** CLAUDE-CODE-TODAY (a stable prefix / read-only files) / SDK (`cache_control`).
* **Effort to adopt:** hours (curate the corpus into a stable cached block).
* **Composability:** stacks with prompt caching (13) and the codebook/pointer ideas (Volume I 14/20);
  anti-synergy with anything that mutates the prefix (busts the corpus cache).
* **Validation protocol:** for a repeatedly-queried corpus, A/B preload-and-cache vs per-turn
  retrieval; require equal answer quality and confirm later turns show `cache_read` on the corpus.

### FL2. Use `/cd` to change directory without busting the cache [#fl2-use-cd-to-change-directory-without-busting-the-cache]

A shipped post-06-12 cache-preservation lever Volume I's caching file predates.

* **Coverage-delta:** New (Claude Code 2.1.169). Volume I 13 enumerates cache-safe vs
  busting operations but predates `/cd`.
* **Layer:** cache / session habits.
* **Mechanism:** `/cd` moves a session to a new working directory "without breaking the prompt cache
  mid-session"; previously a directory change churned the dynamic prefix (the six-field block, file

44. and forced a full rewrite at 1.25–2×.

* **Expected savings:** one avoided full-history rewrite per directory change — the per-bust economics
  of Volume I 13 tech 1 / file 41 Q2 (≈150K × (1.25–2.0 − 0.1) tokens).
* **Evidence tier:** T1 (Claude Code changelog 2.1.169).
* **Quality risk:** **NEGATIVE-COST** (same context, cache preserved).
* **Availability:** CLAUDE-CODE-TODAY (v2.1.169+).
* **Effort to adopt:** zero (use `/cd` instead of restarting in a new dir).
* **Composability:** extends Volume I 13's cache-safe operation list; pairs with prefix-stability (41 Q2).
* **Validation protocol:** change directory via `/cd` vs a restart; confirm the `/cd` path shows
  `cache_read` continuity in JSONL.

### FL3. Keep prompt-compressors out of the hot path — now for security too [#fl3-keep-prompt-compressors-out-of-the-hot-path--now-for-security-too]

Volume I disfavored LLMLingua-class proxies on cost; CompressionAttack adds an integrity reason.

* **Coverage-delta:** New risk axis. Volume I 19 covers the *cost* trap of compression proxies;
  the compression-as-attack-surface result is 0-hit.
* **Layer:** infra / security (meta).
* **Mechanism:** CompressionAttack (2510.22963) crafts inputs that survive a compression module but
  flip downstream agent behavior (≤80% ASR, 98% preference-flip, high stealth) — targeting both hard
  (LLMLingua/Selective-Context) and soft (ICAE/AutoCompressors) compressors. A compressor in the
  request path is therefore both a cost risk (Volume I) and an exploitable integrity boundary.
* **Expected savings:** none — loss-avoidance. Reinforces routing compression to ingestion-only seams
  (Volume I 20 K15) or avoiding it entirely for code.
* **Evidence tier:** T2 (peer-reviewed-style preprint, Oct 2025).
* **Quality risk:** **QUALITY-TRADE / security** — the proxy can be weaponized.
* **Availability:** n/a (a reason not to deploy a class of tool).
* **Effort to adopt:** minutes (policy: no compressor in the hot path; see file 47).
* **Composability:** strengthens Volume I's LLMLingua kill; relevant to governance (47) and infra (19).
* **Validation protocol:** if a compressor is unavoidable, red-team it with CompressionAttack-style
  perturbations before trusting its output.

### FL4. Treat the KV-compression family as a self-host lever only — and beware it on reasoning [#fl4-treat-the-kv-compression-family-as-a-self-host-lever-only--and-beware-it-on-reasoning]

A clear verdict so the operator stops chasing SnapKV/H2O/PyramidKV/KVQuant on a hosted API.

* **Coverage-delta:** New (0-hit in Volume I). Extends Volume I 19 (self-host) and 20 K2; the
  reasoning-trace-lengthening danger is from "Hold Onto That Thought" (2512.12008).
* **Layer:** infra (self-host serving).
* **Mechanism:** eviction (H2O/SnapKV/PyramidKV) and quantization (KVQuant) shrink the GPU KV cache to
  extend context and cut serving memory/latency — but only inside an inference engine you control.
  On reasoning models, low-budget eviction can lengthen traces (more output), the same perverse
  direction Volume I found for input compression.
* **Expected savings:*&#x2A; &#x2A;*$0 on hosted Claude.** On self-host: 5–8× KV memory / extended context per
  the papers — a capacity lever, not a hosted-API token saver.
* **Evidence tier:** T2 (NeurIPS papers) for ratios; T1 for the hosted-API block.
* **Quality risk:** **QUALITY-TRADE / RISKY** on reasoning workloads (trace lengthening); NEUTRAL for
  memory-bound non-reasoning self-host.
* **Availability:** GATEWAY-OR-SELF-HOST (KVCache-Factory unifies SnapKV/H2O/PyramidKV/StreamingLLM).
* **Effort to adopt:** extreme (self-host stack) — irrelevant for the hosted-subscription operator.
* **Composability:** self-host stack only (with Volume I 19/K2); orthogonal to everything hosted.
* **Validation protocol:** if self-hosting, benchmark the chosen method on *your* reasoning workload
  for trace-length inflation before adopting at low KV budgets.

### FL5. Re-decide model tiering after — Opus 4.8 becomes the cheapest Fable-family option [#fl5-re-decide-model-tiering-after--opus-48-becomes-the-cheapest-fable-family-option]

When the Fable 5 promo ends, the dominant-profile model choice flips for subscribers.

* **Coverage-delta:** Drift, not in Volume I (the promo window post-dates the freeze framing). Volume
  I's tiering (16) and profile assume Fable 5 as the main model.
* **Layer:** routing / model selection.
* **Mechanism:** from 06-23, Fable 5 on Pro/Max/Team/Enterprise requires usage credits (dollars);
  Opus 4.8 — the *same tokenizer family* (identical token counts) at half the sticker ($5/$25 vs
  $10/$50) — remains. So the cheapest model that preserves Volume I's tokenizer math is Opus 4.8, and
  this environment already runs it (465/560 calls measured).
* **Expected savings:** for a subscriber who was on Fable 5, moving the main loop to Opus 4.8 halves
  the per-token dollar cost at identical token counts (and avoids spending credits on Fable post-promo).
  Volume I's Fable-priced dollar figures are \~2× an Opus-4.8 subscriber's (carried to 49).
* **Evidence tier:** T1 (Fable promo announcement; pricing).
* **Quality risk:** &#x2A;*QUALITY-TRADE (mild)** — Fable 5 is the more capable model; the hardest tasks may
  regress on Opus 4.8. Falsify on your hardest task class (Volume I's open quality-parity question).
* **Availability:** CLAUDE-CODE-TODAY (`/model`).
* **Effort to adopt:** minutes.
* **Composability:** the tokenizer family is unchanged (Opus 4.8 ≡ Fable 5 tokenizer), so all of
  Volume I's tokenizer-arbitrage and file-42 image findings transfer directly.
* **Validation protocol:** run the hardest task class on Fable 5 vs Opus 4.8 before 06-23; if parity
  holds, default to Opus 4.8.

***

## Surprising findings [#surprising-findings]

* The whole fresh KV-compression literature (four NeurIPS-grade methods + a unified library) lands on
  the same verdict as Volume I's gist/LMCache frontier: &#x2A;*real, impressive, and $0 for a hosted-Claude
  user.** The hosted/self-host wall is the field's defining constraint, not any single technique.
* The most actionable new idea is the *oldest* one reframed: CAG is just "cache the corpus," which
  hosted prompt caching already does — the paper's contribution to a hosted operator is the framing,
  not the mechanism.
* The compression frontier's freshest empirical result is an **attack** on compressors, not a better
  compressor — the category Volume I disfavored got more dangerous, not more attractive.
* Anthropic's *billable primitives* didn't move in the months around the freeze; the action was all in
  *billing policy* (Fable promo, 06-15 SDK split) — which is exactly the quota axis (41) Volume I
  under-weighted.

## Verification ledger [#verification-ledger]

| #  | Claim                                                                                                                                                                                         | Source (access)                                                                 |
| -- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------- |
| 1  | Fable 5 free through 06-22, removed 06-23 (credits after); Fable $10/$50, Opus 4.8 $5/$25                                                                                                     | anthropic.com/news/claude-fable-5-mythos-5                                      |
| 2  | 06-15 SDK/headless/CI off subscription cap → separate API-rate credit                                                                                                                         | support.claude.com/.../use-the-claude-agent-sdk-with-your-claude-plan           |
| 3  | Claude Code: nested subagents 5-deep (2.1.172); `/cd` preserves cache (2.1.169); `/usage` cache-miss + attribution (2.1.174)                                                                  | code.claude.com/docs/en/changelog                                               |
| 4  | SnapKV 2404.14469 (NeurIPS'24, 8.2×@16K); H2O 2306.14048 (NeurIPS'23, 5×@20%); PyramidKV 2406.02069 (parity\@12%); KVQuant 2401.18079 (NeurIPS'24, 3-bit \<0.1 ppl, 10M ctx); KVCache-Factory | arxiv.org + proceedings.neurips.cc + github.com/Zefan-Cai/KVCache-Factory       |
| 5  | KV compression on reasoning lengthens traces at low budgets                                                                                                                                   | arxiv.org/abs/2512.12008 (excerpt; full numbers need PDF)                       |
| 6  | CAG 2412.15605 (WWW'25): preload+persist KV, meets/beats RAG when corpus fits context; literal mechanism self-host                                                                            | arxiv.org/abs/2412.15605 (latency/BERTScore figures secondary-sourced, flagged) |
| 7  | LoPace 2602.13266 (lossless at-rest, 72.2%); LTSC 2506.00307 (lossless tokens, needs fine-tuned model); CompactPrompt (T4 secondary)                                                          | arxiv.org/html/{2602.13266,2506.00307}; morphllm.com/prompt-compression         |
| 8  | CompressionAttack 2510.22963: ≤80% ASR on prompt-compression modules                                                                                                                          | arxiv.org/html/2510.22963v2                                                     |
| 9  | LLMRouterBench 2601.07206 (+4% acc / 31.7% cost); OrcaRouter 2605.30736 (RouterArena #2); RouterEval (EMNLP'25, pointer)                                                                      | arxiv.org/html/2601.07206; arxiv.org/abs/2605.30736                             |
| 10 | OpenAI GPT-5.5 ($5/$30, $0.50 cached, 1.05M ctx); cache retention default 24h (05-29); container per-min billing (06-02)                                                                      | developers.openai.com/api/docs/changelog                                        |
| 11 | Gemini implicit cache default, 90% discount ; no Anthropic API primitive change post-06-12                                                                                                    | ai.google.dev/gemini-api/docs/caching; platform.claude.com prompt-caching       |
| 12 | "context engineering" tooling category (Atlan/Packmind/Tessl/Ruler); vendor accuracy claims T4                                                                                                | atlan.com/know/context-engineering/...                                          |