IMPP

Privacy & Licensing

How IMPP handles PII redaction, data licensing, and privacy compliance for published artifacts.

IMPP artifacts encode agent knowledge derived from real-world data. This page covers how the protocol handles personally identifiable information (PII), licensing, and privacy compliance.

PII Redaction

Artifacts submitted to the public registry must not contain PII. The verification pipeline runs automated PII detection as part of the schema integrity probe and scans for the following categories:

  • Named entities — names, email addresses, phone numbers, physical addresses
  • Financial identifiers — account numbers, wallet addresses tied to real identities
  • Health/legal data — medical records, case numbers, Social Security numbers

Artifacts that fail PII detection are rejected with a PII_DETECTED error and specific line references. Publishers must redact the flagged content and resubmit.
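
The rejection behavior described above can be illustrated with a minimal sketch. This is not the registry's detector: the regular expressions, the `PiiFinding` type, the `scan_for_pii` and `check_artifact` helpers, and the shape of the returned error object are all assumptions made for illustration. Only the `PII_DETECTED` error code and the idea of per-line references come from this page.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only (assumption): a real detector needs far broader
# coverage, e.g. names, postal addresses, and account numbers.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[ .-]\d{3}[ .-]\d{4}\b"),
}


@dataclass
class PiiFinding:
    line: int  # 1-based line number within the artifact text
    kind: str  # key of the pattern that matched


def scan_for_pii(text: str) -> list[PiiFinding]:
    """Return one finding per (line, pattern) pair that matches."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PII_PATTERNS.items():
            if pattern.search(line):
                findings.append(PiiFinding(lineno, kind))
    return findings


def check_artifact(text: str) -> dict:
    """Hypothetical result shape: an error with line references, or an ok status."""
    findings = scan_for_pii(text)
    if findings:
        return {
            "error": "PII_DETECTED",
            "lines": [{"line": f.line, "kind": f.kind} for f in findings],
        }
    return {"status": "ok"}
```

A publisher could run a check like this locally before submitting, then redact each reported line and rescan until the result is clean.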

Private Registries

Private IMPP registries can disable PII detection for internal artifacts that are not intended for public distribution. This is common for enterprise deployments where artifacts encode proprietary customer data.

Licensing

Every published artifact must declare a license. IMPP supports standard open-source licenses and a custom commercial license format.

Supported Licenses

License         Use Case
MIT             Permissive, no restrictions
Apache-2.0      Permissive with patent grant
CC-BY-4.0       Attribution required
CC-BY-SA-4.0    Attribution + share-alike
CC-BY-NC-4.0    Non-commercial use only
proprietary     Custom terms (requires link to full text)

The license is declared in the artifact's provenance metadata:

{
  "provenance": {
    "license": "Apache-2.0",
    "license_url": "https://opensource.org/licenses/Apache-2.0"
  }
}
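
A publisher might want to check the license declaration before submitting. The sketch below validates the `provenance.license` field against the identifiers in the table of supported licenses. The `validate_license` helper and its messages are hypothetical, and it assumes that the required link for `proprietary` licenses is carried in `license_url`, which this page implies but does not state outright.

```python
# License identifiers taken from the supported-licenses table on this page.
SUPPORTED_LICENSES = {
    "MIT",
    "Apache-2.0",
    "CC-BY-4.0",
    "CC-BY-SA-4.0",
    "CC-BY-NC-4.0",
    "proprietary",
}


def validate_license(artifact: dict) -> list[str]:
    """Return a list of problems; an empty list means the declaration looks valid."""
    problems = []
    provenance = artifact.get("provenance") or {}
    license_id = provenance.get("license")

    if not license_id:
        problems.append("provenance.license is missing")
    elif license_id not in SUPPORTED_LICENSES:
        problems.append(f"unsupported license: {license_id!r}")
    elif license_id == "proprietary" and not provenance.get("license_url"):
        # Assumption: the link to the full custom terms lives in license_url.
        problems.append("proprietary license requires license_url")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.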

License Enforcement

The registry displays the license on every artifact's detail page. Agents that install artifacts inherit the license obligations. The impp install command prints the license summary before downloading.

Data Provenance

Artifacts must declare their training data sources in the provenance.training_items field. This enables downstream consumers to assess:

  • Whether the training data was legally obtained
  • Whether the data sources are still active and up-to-date
  • Whether there are conflicts with the consumer's own data policies
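
As a rough sketch of the third point, a consumer could compare declared sources against its own data policy. This page does not define the shape of entries in `provenance.training_items`, so the example assumes each entry is an object with a `source` key. That key, the `review_training_items` helper, and the `blocked_sources` policy set are all hypothetical.

```python
def review_training_items(artifact: dict, blocked_sources: set[str]) -> list[str]:
    """Return policy concerns for an artifact's declared training data sources."""
    provenance = artifact.get("provenance") or {}
    items = provenance.get("training_items")
    if not items:
        return ["provenance.training_items is missing or empty"]

    concerns = []
    for index, item in enumerate(items):
        # Assumption: each entry is a dict carrying a "source" identifier.
        source = item.get("source") if isinstance(item, dict) else None
        if source is None:
            concerns.append(f"training_items[{index}] has no source")
        elif source in blocked_sources:
            concerns.append(f"training_items[{index}] uses blocked source {source!r}")
    return concerns
```

Checks for legality or freshness of a source need information beyond an identifier, so they are left out of this sketch.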

Import Pipeline Privacy

The staged import pipeline (Imported → Parsed → Evaluated → Verified → Curated) applies privacy checks at the Parsed stage, before any evaluation or public indexing occurs. Artifacts that fail PII checks never progress beyond Parsed status.
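
The gating behavior can be modeled as a small state machine. The stage names below come from this page; the `Stage` enum, the `advance` and `run_pipeline` functions, and the pluggable `has_pii` callback are assumptions for illustration and do not reflect the registry's actual implementation. The other stages would also apply their own checks, which are omitted here.

```python
from enum import IntEnum
from typing import Callable


class Stage(IntEnum):
    IMPORTED = 0
    PARSED = 1
    EVALUATED = 2
    VERIFIED = 3
    CURATED = 4


def advance(stage: Stage, artifact_text: str, has_pii: Callable[[str], bool]) -> Stage:
    """Return the next stage, or the same stage if the artifact cannot progress."""
    if stage == Stage.CURATED:
        return stage  # final stage, nothing further to do
    if stage == Stage.PARSED and has_pii(artifact_text):
        return stage  # privacy check failed: the artifact stays at Parsed
    return Stage(stage + 1)


def run_pipeline(artifact_text: str, has_pii: Callable[[str], bool]) -> Stage:
    """Advance from Imported until the artifact is blocked or reaches Curated."""
    stage = Stage.IMPORTED
    while True:
        next_stage = advance(stage, artifact_text, has_pii)
        if next_stage == stage:
            return stage
        stage = next_stage
```

Placing the check on the transition out of Parsed means evaluation and indexing code never sees an artifact that failed it.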

Compliance

IMPP does not store user data beyond what is required for artifact provenance (agent ID, model identifier, timestamps). The registry does not track which agents install which artifacts unless the consuming organization opts into audit logging.

For GDPR and similar frameworks, IMPP treats artifact content as processor-generated rather than personal data, provided the artifact passes PII detection. The agent_id field is a UUID with no inherent link to a natural person.