Covenant — Policy-Enforced AI Access Control

AI systems that enforce access control inside the prompt aren't enforcing access control — they're asking the model to be a security layer. Covenant puts OPA Rego as a hard gate in the middleware chain: every request is evaluated against a versioned policy before Claude runs, pgvector retrieval is scoped by tenant_id at the query level, and the audit log captures every decision. The AI cannot be prompted around a policy it never participates in enforcing.

Every request

Policy gate

Cross-tenant data exposure

Roles proven end-to-end

Cloud spend

Core Technologies

PythonFastAPIOPAPostgreSQL / pgvectorClaude Sonnet 4.6JWTDocker

Architecture Components

FastAPI middleware: validates JWT signature, extracts role/user_id/tenant_id claims
OPA REST server (Docker sidecar): evaluates covenant.authz.allow against the input bundle — hard ALLOW or DENY
PostgreSQL with pgvector extension: cosine similarity search filtered by tenant_id — cross-tenant results blocked at query level
Claude Sonnet 4.6: generates response from permitted documents only, system prompt cached via cache_control: ephemeral
Audit log: every OPA decision written with user_id, role, tenant_id, query_hash, docs_returned, opa_decision
Docker Compose: OPA server, Postgres+pgvector, and FastAPI start together — $0 to run locally

Problem

AI systems that enforce access control inside the prompt are not enforcing access control — they are asking the model to be a security layer. Any sufficiently creative prompt can talk the model past a soft guard. The question is how to put policy enforcement somewhere Claude cannot touch.

Prompt-based access control is not access control — it's a suggestion the model can be prompted around.
Multi-tenant AI systems need retrieval isolation, not just response filtering — the model should never see cross-tenant data.
Access rules that live in application code or database rows are not auditable, testable, or reviewable the same way code is.

Solution

OPA as middleware, evaluated before the model runs. Rego policies in version control. pgvector retrieval scoped by tenant_id at the query level. Claude only runs on requests that cleared OPA — the AI cannot be a party to the access decision.

OPA REST server as a FastAPI middleware dependency: POST input bundle, read result.allow — hard gate in the request path.
Rego v1 policies with default allow := false: every identity starts with no access, permissions are explicit grants.
pgvector tenant_id filter applied before embedding search — cross-tenant documents never enter the context window.
Claude Sonnet 4.6 with prompt caching: system prompt cached via cache_control: ephemeral, near-zero latency on repeated requests.

Security Design

Policy as the single source of truth: access rules live in Rego files, not application code — version-controlled, independently testable with opa test, reviewable without understanding the application.
Default deny: every Rego policy starts with default allow := false — permissions are explicit grants, not opt-out exceptions.
Retrieval isolation, not response filtering: pgvector search is scoped by tenant_id at the query level — cross-tenant documents never enter the context window; Claude is never in a position to accidentally surface them.
JWT validation in middleware: token signature verified, claims extracted, and OPA input bundle constructed before any route handler runs — the policy gate cannot be bypassed by route ordering.
Audit log as a first-class output: every OPA decision written with timestamp, user_id, role, tenant_id, endpoint, query hash, and the specific Rego rule that fired.

Observability & Operations

Every OPA decision is written to a structured audit log — the audit log is the observability layer. Query it to answer: who accessed what, when, with what role, and what OPA decided. No separate metrics stack needed for a policy enforcement service.

Outcome

Three-role RBAC proven end-to-end with curl demos. Admin sees everything. User is blocked on sensitive data and cross-tenant queries. Auditor reads the audit log and nothing else. Policy changes are Rego diffs, not code changes.

Admin 200, user 403 (sensitive), auditor 403 (query) — all three verified with curl against a running stack.
Full audit log: every OPA decision written with who asked, what role, what was allowed or denied.
Zero cost to run locally — Docker Compose, OPA sidecar, local Postgres with pgvector.
Rego policy is the single source of truth for access rules — reviewable, testable, and version-controlled.

Real-World Use Cases

Covenant's pattern applies to any product adding AI to an existing multi-tenant system where access control cannot be delegated to the model.

Healthcare AI Assistants

A Claude-powered clinical decision tool where a nurse cannot access another department's patient records — not because the prompt says so, but because OPA Rego structurally prevents it and pgvector never returns cross-department documents. HIPAA compliance enforced at the architecture level.

Legal Technology Platforms

Multi-firm document analysis where Firm A's case files are isolated from Firm B's at the vector search level. The model never sees opposing counsel's documents — not filtered from the response, but excluded from the context window entirely.

Financial AI Tools

Role-based access where analysts see aggregate data, managers see individual client records, and auditors see only the audit log. All three boundaries enforced by Rego policy before Claude runs — provably correct, independently testable, version-controlled.

Any Multi-Tenant SaaS Adding AI

If you're integrating an LLM into an existing product with multiple customers, Covenant's architecture is the correct access control model. Prompt-based tenant isolation is not isolation — it's a suggestion the model can be prompted around. OPA Rego evaluated in middleware is a hard gate.

Key Learnings & Decisions

Access Control Design

OPA as middleware is the right pattern for AI access control — policy evaluated before the model runs, not inside the prompt.
default allow := false is the contract: every identity starts with no access, permissions are explicit grants in Rego.
Retrieval scoping at the query level (pgvector tenant_id filter) is stronger than response filtering — the model never sees data it shouldn't.

OPA & Rego

OPA has three deployment modes — REST server, Go library, CLI eval. Pick one before writing integration code. Mixing mental models wastes hours.
Rego v1 requires import rego.v1 or --v1-compatible flag. Pin your OPA image version. The error messages for version mismatches are not obvious.
Mount policies as a Docker volume so edits apply without rebuilding the container — fast iteration during policy development.

Database & Docker

CREATE EXTENSION vector requires superuser. Split init.sql into numbered files: 00_superuser.sql (postgres user) and 01_app.sql (app user). PostgreSQL processes initdb scripts alphabetically.
Use the official pgvector/pgvector:pg16 image — prebuilt binaries, no kernel header issues on WSL2 or CI runners.
Separate demo token TTL from production token TTL in config from day one. Auth debugging during a live demo is the worst kind of debugging.

Key Tasks Completed

Portfolio Entry
Standalone HTML case study written and deployed.

Monitoring & Analysis

OPA Decision Log

Every policy evaluation is written to the audit log with user_id, role, tenant_id, path, query_hash, documents_returned, and the OPA decision (allow/deny). Auditors can query this log. They cannot query documents.

Claude API Usage

Prompt caching via cache_control: ephemeral on the system prompt. Cache hit rate visible in the Anthropic API response headers. Repeated requests in the same session return significantly faster at lower cost.

Covenant query handler — OPA gate before Claude runs

@app.post("/query")
async def query(
    body: QueryRequest,
    claims: dict = Depends(get_jwt_claims),
):
    opa_input = {
        "claims": claims,
        "path": "/query",
        "method": "POST",
        "body": body.model_dump(),
    }
    resp = httpx.post(OPA_URL, json={"input": opa_input})
    if not resp.json().get("result", {}).get("allow", False):
        raise HTTPException(status_code=403, detail="access denied")

    docs = await vector_search(body.query, tenant_id=claims["tenant_id"])
    return await claude_generate(body.query, docs)

Loading code...

View Source Code

Part of a larger arc

The AI Security & Resilience Stack

Three independent projects that together cover the full surface of an AI-augmented infrastructure stack. Warden secures the Kubernetes runtime — Falco and OPA detecting threats as they happen, Claude triaging before an engineer is paged. Covenant controls access at the application layer — OPA as the hard gate between JWT identity and Claude, policy in code not prompts. Watershed closes the loop at the edge — async telemetry buffered through connectivity loss, with Claude flagging anomalies before the data reaches the cloud. Each project stands alone; together they tell one story.

~$2.00

Warden (AKS)

$0.00

Covenant (local Docker)

~$0.05

Watershed (AWS IoT Core)

~$2.05

Combined

Related project

Warden — Self-Healing Kubernetes Security Agent

AI-driven threat triage and auto-remediation on AKS — two-layer security proven end-to-end for ~$2

PythonFastAPIKubernetes / AKSFalco

Related project

Watershed — Edge-Resilient IoT Telemetry Pipeline

Async Python agent with offline buffering and AI anomaly detection — built for edge environments where connectivity is unreliable

PythonMQTTMosquittoSQLite

Warden →Watershed →