
Attach Modes

How agents consume IMPP artifacts at runtime. Covers the four attach modes and their tradeoffs.

When an agent installs an IMPP artifact, it needs a strategy for incorporating that knowledge at runtime. IMPP defines four attach modes that control how artifact content is delivered to the receiving model.

Overview

Mode              | How It Works                                             | Best For
prepend_policy    | Artifact content is prepended to the system prompt       | Small artifacts (under 4K tokens), persistent context
few_shot_examples | Artifact sections are injected as few-shot examples      | Task-specific calibration, pattern demonstration
retrieval_only    | Artifact content is indexed and retrieved on demand      | Large artifacts, selective knowledge access
tool_policy       | Artifact content is exposed as tool/function descriptions | Agentic workflows, structured decision-making

prepend_policy

The simplest mode. The full artifact content is prepended to the agent's system prompt on every invocation.

import impp

artifact = impp.load("defi-risk-assessment@v2.1")
artifact.attach(mode="prepend_policy")

Tradeoffs:

  • Lowest latency (no retrieval step)
  • Uses context window on every call, even when irrelevant
  • Best for small, always-relevant artifacts (risk policies, compliance rules)

few_shot_examples

Artifact sections are formatted as input/output pairs and injected into the prompt as few-shot demonstrations.

artifact.attach(mode="few_shot_examples", max_examples=5)

Tradeoffs:

  • Strong calibration effect — the model mimics demonstrated patterns
  • Requires artifacts with clearly separable sections
  • Token cost scales with example count
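A minimal sketch of what few_shot_examples delivery looks like, assuming artifact sections carry separable `input`/`output` fields (an assumption for illustration; the real section schema is not shown here). Sections are paired into chat-style demonstrations and capped at `max_examples`:

```python
# Illustrative sketch of few_shot_examples delivery (hypothetical section schema).
def to_few_shot(sections: list[dict], max_examples: int = 5) -> list[dict]:
    """Format artifact sections as alternating user/assistant demonstration messages."""
    messages = []
    for section in sections[:max_examples]:  # token cost scales with this cap
        messages.append({"role": "user", "content": section["input"]})
        messages.append({"role": "assistant", "content": section["output"]})
    return messages


sections = [
    {"input": "Assess pool X", "output": "Risk: high (shallow liquidity)"},
    {"input": "Assess pool Y", "output": "Risk: low (deep liquidity)"},
]
shots = to_few_shot(sections, max_examples=5)
```

The `max_examples` cap is the lever for trading calibration strength against token cost.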

retrieval_only

The artifact content is indexed locally. At runtime, the agent's query is matched against artifact sections, and only relevant sections are injected into context.

artifact.attach(mode="retrieval_only", top_k=3)

Tradeoffs:

  • Scales to large artifacts without exhausting context
  • Adds retrieval latency (~50-200ms depending on index size)
  • Relevance depends on query quality
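The matching step can be sketched as follows. A production index would use embeddings; a bag-of-words term-overlap score keeps this sketch self-contained, and the ranking function is an illustration rather than IMPP's actual retriever:

```python
# Illustrative sketch of retrieval_only delivery: rank sections by term
# overlap with the query and inject only the top_k.
def retrieve(sections: list[str], query: str, top_k: int = 3) -> list[str]:
    """Return the top_k sections sharing the most terms with the query."""
    q_terms = set(query.lower().split())
    scored = sorted(
        sections,
        key=lambda s: len(q_terms & set(s.lower().split())),
        reverse=True,
    )
    return scored[:top_k]


sections = [
    "Liquidation thresholds for lending pools",
    "Governance token vesting schedules",
    "Oracle price manipulation risk checks",
]
hits = retrieve(sections, "lending pool liquidation risk", top_k=1)
```

This also makes the last tradeoff concrete: a vague query scores poorly against every section, so the injected context is only as good as the query.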

tool_policy

Artifact knowledge is exposed as tool descriptions or function schemas. The model decides when to invoke the knowledge based on task context.

artifact.attach(mode="tool_policy")

Tradeoffs:

  • Most flexible — model decides when knowledge is relevant
  • Requires a model that supports tool/function calling
  • Higher autonomy, lower predictability
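As a sketch of the shape this takes, the artifact can be surfaced as a function-calling schema (OpenAI-style JSON is assumed here for illustration; IMPP's actual schema format is not shown). The model then decides when to call it:

```python
# Illustrative sketch of tool_policy delivery: wrap artifact knowledge in a
# tool/function schema so the model can invoke it when relevant.
def artifact_as_tool(name: str, description: str) -> dict:
    """Expose an artifact as a function-calling schema (hypothetical format)."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "What to look up in the artifact"},
                },
                "required": ["query"],
            },
        },
    }


tool = artifact_as_tool(
    "defi_risk_assessment",
    "Consult the DeFi risk-assessment artifact for policy guidance.",
)
```

The description field does most of the work here: it is what the model reads when deciding whether the artifact is relevant, which is the source of both the flexibility and the unpredictability noted above.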

Choosing a Mode

The right mode depends on artifact size, how often the knowledge is relevant, and the agent's architecture:

  • Always relevant + small → prepend_policy
  • Calibration / demonstration → few_shot_examples
  • Large reference corpus → retrieval_only
  • Agentic decision-making → tool_policy

Modes can be changed at runtime without re-publishing. The artifact content is the same — only the delivery strategy differs.