Attach Modes
How agents consume IMPP artifacts at runtime. Covers the four attach modes and their tradeoffs.
When an agent installs an IMPP artifact, it needs a strategy for incorporating that knowledge at runtime. IMPP defines four attach modes that control how artifact content is delivered to the receiving model.
Overview
| Mode | How It Works | Best For |
|---|---|---|
| prepend_policy | Artifact content is prepended to the system prompt | Small artifacts (under 4K tokens), persistent context |
| few_shot_examples | Artifact sections are injected as few-shot examples | Task-specific calibration, pattern demonstration |
| retrieval_only | Artifact content is indexed and retrieved on demand | Large artifacts, selective knowledge access |
| tool_policy | Artifact content is exposed as tool/function descriptions | Agentic workflows, structured decision-making |
prepend_policy
The simplest mode. The full artifact content is prepended to the agent's system prompt on every invocation.
```python
import impp

artifact = impp.load("defi-risk-assessment@v2.1")
artifact.attach(mode="prepend_policy")
```

Tradeoffs:
- Lowest latency (no retrieval step)
- Uses context window on every call, even when irrelevant
- Best for small, always-relevant artifacts (risk policies, compliance rules)
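Conceptually, prepend_policy is just string concatenation ahead of the agent's own instructions. A minimal sketch of that mechanic, with illustrative artifact content and prompt text (not the actual IMPP implementation):

```python
# Toy illustration of prepend_policy: the full artifact body is
# concatenated ahead of the agent's system prompt on every call.
# ARTIFACT_BODY and the base prompt here are made-up examples.

ARTIFACT_BODY = "Policy: flag any position with >10x leverage."

def build_system_prompt(base_prompt: str, artifact_body: str) -> str:
    """Prepend the full artifact content to the system prompt."""
    return f"{artifact_body}\n\n{base_prompt}"

prompt = build_system_prompt("You are a DeFi risk analyst.", ARTIFACT_BODY)
```

Because the artifact rides along on every invocation, its token cost is paid whether or not the current query touches it, which is why this mode suits only small, always-relevant content.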
few_shot_examples
Artifact sections are formatted as input/output pairs and injected into the prompt as few-shot demonstrations.
```python
artifact.attach(mode="few_shot_examples", max_examples=5)
```

Tradeoffs:
- Strong calibration effect — the model mimics demonstrated patterns
- Requires artifacts with clearly separable sections
- Token cost scales with example count
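The formatting step described above can be sketched as follows. This is a hypothetical rendering of artifact sections into input/output demonstration pairs; the section contents and the `to_few_shot` helper are illustrative, not part of the IMPP API:

```python
# Toy illustration of few_shot_examples: artifact sections are
# rendered as input/output pairs, capped at max_examples.
# The section data below is made up for demonstration.

sections = [
    {"input": "Assess: 50x leveraged ETH long", "output": "HIGH RISK"},
    {"input": "Assess: unleveraged stablecoin LP", "output": "LOW RISK"},
]

def to_few_shot(sections: list, max_examples: int = 5) -> str:
    """Format the first max_examples sections as few-shot demos."""
    demos = [
        f"User: {s['input']}\nAssistant: {s['output']}"
        for s in sections[:max_examples]
    ]
    return "\n\n".join(demos)

demo_block = to_few_shot(sections, max_examples=5)
```

The `max_examples` cap is what keeps token cost bounded: each additional pair buys calibration at a linear token price.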
retrieval_only
The artifact content is indexed locally. At runtime, the agent's query is matched against artifact sections, and only relevant sections are injected into context.
```python
artifact.attach(mode="retrieval_only", top_k=3)
```

Tradeoffs:
- Scales to large artifacts without exhausting context
- Adds retrieval latency (~50-200ms depending on index size)
- Relevance depends on query quality
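To make the retrieval step concrete, here is a toy lexical version of section matching. Real indexes would typically use embeddings; this sketch scores sections by token overlap with the query purely to show the top_k selection shape. All names and section texts are illustrative:

```python
# Toy illustration of retrieval_only: score each artifact section
# against the query and inject only the top_k matches into context.
# This uses naive token overlap; a real index would use embeddings.

def retrieve(query: str, sections: list, top_k: int = 3) -> list:
    """Return the top_k sections by shared-token count with the query."""
    q = set(query.lower().split())
    scored = sorted(
        sections,
        key=lambda s: len(q & set(s.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

sections = [
    "Liquidation thresholds for leveraged positions",
    "Governance token voting procedures",
    "Oracle price feed failure handling",
]
hits = retrieve("what happens on oracle failure", sections, top_k=1)
```

This also illustrates the last tradeoff: a vague query produces weak overlap scores, so the quality of the injected context tracks the quality of the query.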
tool_policy
Artifact knowledge is exposed as tool descriptions or function schemas. The model decides when to invoke the knowledge based on task context.
```python
artifact.attach(mode="tool_policy")
```

Tradeoffs:
- Most flexible — model decides when knowledge is relevant
- Requires a model that supports tool/function calling
- Higher autonomy, lower predictability
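A sketch of what "exposed as tool descriptions" might look like, assuming a schema shaped like common function-calling APIs. The `section_to_tool` helper and the tool name are hypothetical, not part of IMPP:

```python
# Hypothetical illustration of tool_policy: an artifact section is
# wrapped in a function schema the model can choose to invoke.
# The schema shape mirrors common function-calling APIs.

def section_to_tool(name: str, description: str) -> dict:
    """Wrap an artifact section as a callable tool definition."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {"type": "object", "properties": {}},
        },
    }

tool = section_to_tool(
    "lookup_liquidation_policy",
    "Return the artifact's rules for liquidation risk scoring.",
)
```

Because the model, not the harness, decides when to call the tool, behavior is more flexible but less predictable than the prompt-injection modes.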
Choosing a Mode
The right mode depends on artifact size, how often the knowledge is relevant, and the agent's architecture:
- Always relevant + small → prepend_policy
- Calibration / demonstration → few_shot_examples
- Large reference corpus → retrieval_only
- Agentic decision-making → tool_policy
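The decision rules above can be condensed into a small heuristic. This is an illustrative helper, not an IMPP function; the 4K-token cutoff comes from the prepend_policy guidance earlier in this page:

```python
# Illustrative heuristic encoding the mode-selection guidance:
# size, relevance, and model capability pick the attach mode.

def choose_mode(artifact_tokens: int, always_relevant: bool,
                supports_tools: bool) -> str:
    """Pick an attach mode from artifact size and agent capabilities."""
    if always_relevant and artifact_tokens < 4000:
        return "prepend_policy"          # small and always needed
    if artifact_tokens >= 4000 and not supports_tools:
        return "retrieval_only"          # too big to prepend
    if supports_tools:
        return "tool_policy"             # let the model decide
    return "few_shot_examples"           # default: demonstrate patterns

mode = choose_mode(2000, always_relevant=True, supports_tools=False)
```

Treat this as a starting point; real deployments may weigh retrieval latency or calibration needs differently.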
Modes can be changed at runtime without re-publishing. The artifact content is the same — only the delivery strategy differs.