SECURITY MODEL

How AegisAI holds control, what it makes verifiable, and where the boundary between the model and governance begins.

THREAT CLASSES

Prompt Injection

Adversarial input designed to override policy, escalate privilege, or extract data through the model channel.

Control: LLM output enters an adapter that produces a typed, schema-validated ProposalSpec. The Kernel treats this as an untrusted proposal — not a command. Policies apply before any action can proceed.
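The adapter step can be illustrated with a minimal sketch. The field names (action, target), the ALLOWED_ACTIONS set, and the ProposalRejected exception are assumptions made for illustration; the real ProposalSpec schema is not shown here. The point is that model text either becomes a typed value or is rejected, and nothing in it is executed.

```python
import json
from dataclasses import dataclass

# Hypothetical action vocabulary; not the real AegisAI schema.
ALLOWED_ACTIONS = frozenset({"read_file", "send_email", "query_db"})


class ProposalRejected(ValueError):
    """Raised when model output does not match the expected schema."""


@dataclass(frozen=True)
class ProposalSpec:
    action: str
    target: str


def parse_proposal(raw_llm_output: str) -> ProposalSpec:
    """Turn raw model text into a typed proposal, or reject it outright."""
    try:
        data = json.loads(raw_llm_output)
    except json.JSONDecodeError as exc:
        raise ProposalRejected("output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ProposalRejected("output must be a JSON object")
    # Reject unknown or missing keys so injected fields cannot ride along.
    if set(data) != {"action", "target"}:
        raise ProposalRejected("unexpected or missing fields")
    action, target = data["action"], data["target"]
    if not isinstance(action, str) or not isinstance(target, str):
        raise ProposalRejected("fields must be strings")
    if action not in ALLOWED_ACTIONS:
        raise ProposalRejected(f"unknown action: {action!r}")
    return ProposalSpec(action=action, target=target)
```

A proposal that parses is still only a proposal: in this sketch the returned value carries no authority, and policy evaluation would happen in a separate step.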

Audit Record Tampering

Attempts to delete, modify, or forge governance records to erase evidence of unauthorized actions.

Control: Every DecisionTrace carries a SHA-256 custody_hash and is linked into the Class B chain (internal integrity). When signing is configured, a Class A Ed25519-signed attestation adds external verifiability. A retroactive change to a record breaks the Class B chain and, where an attestation exists, invalidates its signature.
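To make the tamper-detection property concrete, here is a minimal sketch of a hash-plus-HMAC chain. The record layout, the GENESIS value, and the canonical JSON encoding are assumptions for illustration, not the AegisAI format. Each record stores a SHA-256 hash of its payload and an HMAC-SHA256 link over the previous link plus that hash.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hypothetical link value preceding the first record


def custody_hash(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def chain_link(key: bytes, prev_link: str, record_hash: str) -> str:
    """HMAC-SHA256 binding this record's hash to the previous link."""
    message = (prev_link + record_hash).encode("ascii")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def append(chain: list, key: bytes, payload: dict) -> None:
    """Add a record whose link depends on everything before it."""
    prev = chain[-1]["link"] if chain else GENESIS
    digest = custody_hash(payload)
    chain.append({"payload": payload, "custody_hash": digest,
                  "link": chain_link(key, prev, digest)})


def verify(chain: list, key: bytes) -> bool:
    """Recompute every hash and link; an edit, a reorder, or a removal
    from the start or middle of the chain makes verification fail."""
    prev = GENESIS
    for record in chain:
        digest = custody_hash(record["payload"])
        if digest != record["custody_hash"]:
            return False
        expected = chain_link(key, prev, digest)
        if not hmac.compare_digest(expected, record["link"]):
            return False
        prev = record["link"]
    return True
```

This sketch does not detect truncation of the newest records on its own; catching that would need an anchored head value or a record count stored elsewhere.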

Policy Bypass

Exploiting an edge case, race condition, or misconfiguration to allow a HIGH-severity action without an explicit ALLOW.

Control: HIGH-severity violations produce BLOCK by design — this is a hard invariant, not a default. Verified by the benchmark gate on every CI run: security_failures == 0 required.
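A minimal sketch of how such an invariant and its gate could look. The Severity and Outcome enums, the adjudicate signature, and the default-deny behavior for non-HIGH cases are assumptions for illustration; only the HIGH-implies-BLOCK rule and the security_failures == 0 requirement come from the text above.

```python
from enum import Enum


class Severity(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


class Outcome(Enum):
    ALLOW = "ALLOW"
    BLOCK = "BLOCK"


def adjudicate(violations: list, explicit_allow: bool) -> Outcome:
    """Decide an outcome from detected violations and policy allowance."""
    # Invariant first: a HIGH violation blocks regardless of any other input.
    if any(v is Severity.HIGH for v in violations):
        return Outcome.BLOCK
    # Assumed default-deny for everything else: no explicit ALLOW, no action.
    return Outcome.ALLOW if explicit_allow else Outcome.BLOCK


def security_failures(cases: list) -> int:
    """Count benchmark cases where a HIGH violation did not end in BLOCK."""
    return sum(
        1
        for violations, explicit_allow in cases
        if Severity.HIGH in violations
        and adjudicate(violations, explicit_allow) is not Outcome.BLOCK
    )
```

Because the HIGH check runs before any other branch, no combination of the remaining inputs can reach ALLOW in this sketch; a CI gate would then assert that security_failures over the benchmark set is zero.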

Loss of Control

System enters an unpredictable or unrecoverable state during adversarial conditions or unexpected inputs.

Control: Panic mode, a global hard stop across all transitions. No reasoning loop, no degraded mode. The Kernel freezes until a signed recovery probe is presented and validated.

CRYPTOGRAPHIC TRUST MODEL

AegisAI operates a two-class trust model. Understanding the distinction matters for any external audit or governance proof requirement.

CLASS A: External Verifiability

→ Ed25519 signature over the finalized DecisionTrace
→ Artifact: DecisionTraceAttestation
→ Verifiable by any party holding the public key; no Kernel access required
→ Produced only when AEGIS_SIGNING_KEY is configured

CLASS B: Internal Chain Integrity

→ SHA-256 custody_hash on every DecisionTrace
→ HMAC-SHA256 chain linking records (tamper detection only)
→ Always present: the Class B chain is never optional
→ Internal only: HMAC key never crosses the external trust boundary

Key rule: A DecisionTrace without a corresponding DecisionTraceAttestation is an internal record only. Its Class B chain is intact, but it carries no external governance proof. Absence of attestation ≠ absence of governance.
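The key rule can be expressed as a small classification sketch. The dataclass fields (trace_id, signature) and the proof_class helper are hypothetical, and the function only reports which claim a record can support; it does not verify the Ed25519 signature itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DecisionTrace:
    trace_id: str        # hypothetical identifier
    custody_hash: str    # Class B: always present


@dataclass(frozen=True)
class DecisionTraceAttestation:
    trace_id: str        # hypothetical back-reference to the trace
    signature: bytes     # Class A: Ed25519 signature, verified elsewhere


def proof_class(trace: DecisionTrace,
                attestation: Optional[DecisionTraceAttestation]) -> str:
    """Report the strongest claim a record can support."""
    if attestation is not None and attestation.trace_id == trace.trace_id:
        return "A"  # external proof available, on top of the internal chain
    return "B"      # internal record only; governance was still applied
```

An auditor without Kernel access can only rely on records that come back as "A" here; records classified "B" remain valid internally but offer nothing to check from outside.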

FAIL-SAFE CONTROL PATH

Panic Mode

Global hard stop. All kernel transitions blocked. No reasoning loop, no degraded mode. The system freezes until a signed recovery probe is presented and validated.

trigger: detect_loss_of_control()
effect: block all transitions
recovery: signed_probe required
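The freeze-until-recovery behavior might be sketched as follows. The Kernel class shape, the KernelFrozen exception, and the injected verify_probe callable are assumptions for illustration; verify_probe stands in for real signature validation of the recovery probe, which is not implemented here.

```python
from typing import Callable


class KernelFrozen(RuntimeError):
    """Raised for any transition attempted while panic mode is active."""


class Kernel:
    def __init__(self, verify_probe: Callable[[bytes], bool]):
        # verify_probe is a stand-in for signature validation of a probe.
        self._verify_probe = verify_probe
        self._panic = False

    @property
    def frozen(self) -> bool:
        return self._panic

    def panic(self) -> None:
        """Enter panic mode; called when loss of control is detected."""
        self._panic = True

    def transition(self, name: str) -> str:
        """Perform a transition, unless the global stop is in effect."""
        if self._panic:
            raise KernelFrozen(f"transition {name!r} blocked: panic mode")
        return name

    def recover(self, probe: bytes) -> bool:
        """Lift the freeze only if the probe validates; otherwise no change."""
        if self._panic and self._verify_probe(probe):
            self._panic = False
            return True
        return False
```

The check sits at the top of every transition, so there is no partial or degraded path: while frozen, the only call that can change state is recover with a probe that validates.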

Break-Glass Override

One-time, targeted administrative override for emergency scenarios. Requires an Ed25519 signature and a specific correlation_id. Replay protection enforced — each override token is valid exactly once.

type: BypassApproval
requires: Ed25519 signature
scope: single correlation_id
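Scope binding and replay protection can be sketched in a few lines. The token_id field, the BreakGlass class, and the injected verify_signature callable are hypothetical; the callable stands in for Ed25519 verification, which this sketch does not perform.

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass(frozen=True)
class BypassApproval:
    token_id: str        # hypothetical unique id used for replay tracking
    correlation_id: str
    signature: bytes


class BreakGlass:
    def __init__(self, verify_signature: Callable[[BypassApproval], bool]):
        # verify_signature is a stand-in for Ed25519 verification.
        self._verify = verify_signature
        self._used: Set[str] = set()

    def authorize(self, approval: BypassApproval, correlation_id: str) -> bool:
        """Accept an override once, and only for its own correlation_id."""
        if approval.correlation_id != correlation_id:
            return False  # scoped to a single correlation_id
        if approval.token_id in self._used:
            return False  # replay: token already consumed
        if not self._verify(approval):
            return False  # bad signature; token is not consumed
        self._used.add(approval.token_id)
        return True
```

The used-token set lives in memory in this sketch; a real deployment would need it to survive restarts, or a replayed token could be accepted again.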

HARD INVARIANTS

HIGH-severity policy violations always produce BLOCK — no degraded mode, no soft failure

LLM output is parsed into a typed schema before reaching the Kernel. It is never executed directly.

A DecisionTrace is always produced. It is not conditional on outcome type or configuration.

Identical input + identical policy bundle → identical outcome (deterministic)

Class B operations (HMAC chain) never cross the external trust plane

Ed25519 private key is never logged, serialized, or exposed through any API surface

NON-GOALS

AegisAI is a policy control plane, not a general-purpose AI system or security perimeter.

✕ Not a content filter or prompt guardrail: it adjudicates typed proposals, not raw text

✕ Not autonomous without policy: every action disposition requires an explicit rule

✕ Not a network security control: it does not inspect packets, headers, or transport-layer traffic

✕ Not a universal safety certification: benchmark coverage is defined, not exhaustive

DEPLOYMENT MODES

Sidecar

Policy enforcement layer alongside an existing system. Intercepts proposals before action dispatch.

API Gateway

Centralised adjudication across multiple upstream models via a single governance endpoint.

SDK (Embedded)

Inline enforcement within application code. Kernel instantiated per session with a policy bundle.
