Solutions

Real architectures for real operational problems.

Proven, costed, and ready to adapt.

These aren't demo projects. Each solution below was designed to solve a specific class of problem, proven end-to-end on real cloud infrastructure, and documented with enough detail to adapt to a production environment. The common thread: security enforced by architecture rather than assumed by convention, resilience proven by simulation rather than hoped for in production.

Kubernetes Runtime Security

Kubernetes Runtime Security & Auto-Remediation

Detect, triage, and patch Kubernetes threats automatically — without a human in the loop for low-severity incidents.

The Problem

Security teams running Kubernetes clusters are buried in alert noise. Runtime threats require expert triage, manual response, and accurate severity judgment at a speed that doesn't scale with headcount.

The Solution

A two-layer detection architecture: OPA Gatekeeper blocks non-compliant workloads at admission, and Falco eBPF probes catch adversarial behaviour at runtime. Claude Sonnet triages every alert with structured severity classification. Low-severity threats are auto-patched within seconds. High-severity threats surface a Claude-drafted runbook for human review. Every decision is logged with its full reasoning chain.
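The routing step described above can be sketched in a few lines. This is a minimal illustration, not the project's code: the `Triage` shape, the two severity labels, and the `Router` class are assumptions made for the example, and the triage result is taken as input rather than produced by a model call.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    LOW = "low"
    HIGH = "high"


@dataclass
class Triage:
    """Structured triage result (hypothetical shape, not the project's schema)."""
    rule: str           # name of the runtime rule that fired
    severity: Severity  # classification returned by the triage step
    reasoning: str      # explanation kept for the audit trail


@dataclass
class Router:
    """Routes a triaged alert and records every decision."""
    audit_log: list = field(default_factory=list)

    def route(self, triage: Triage) -> str:
        # Only low-severity alerts are remediated without a human.
        # Everything else is handed to a reviewer.
        if triage.severity is Severity.LOW:
            action = "auto_patch"
        else:
            action = "human_review"
        # Log the decision together with the reasoning that led to it.
        self.audit_log.append(
            {
                "rule": triage.rule,
                "severity": triage.severity.value,
                "action": action,
                "reasoning": triage.reasoning,
            }
        )
        return action
```

Keeping the routing rule this small is deliberate: the model classifies, but a fixed, reviewable branch decides what happens next.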

Proven Outcomes

Shell spawn detected → auto-patched in under 3 seconds on live AKS
Privileged container blocked at admission before scheduling
Complete audit trail of every triage decision
Total infrastructure cost: ~$2 per validation session

Best fit for: Healthcare, financial services, multi-tenant SaaS, government contractors, any organisation with Kubernetes in a regulated environment.

View full case study

AI Access Control

Policy-Enforced AI Access Control for Multi-Tenant Systems

OPA Rego as a hard gate between user identity and your AI — the model doesn't decide who sees what.

The Problem

Adding an LLM to a multi-tenant product creates an access control problem that prompt engineering cannot solve. A model that enforces tenant isolation in its system prompt can be prompted around it. The isolation needs to happen before the model runs.

The Solution

OPA Rego policies are evaluated in the FastAPI middleware chain, so every request is policy-checked before Claude is invoked. pgvector semantic search is scoped by tenant_id at the query level, not filtered from the response. The model only ever sees documents the requesting identity is already permitted to see. Every access decision is logged with the specific Rego rule that fired.
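The ordering matters more than the framework, so the sketch below shows it with plain functions rather than FastAPI middleware. Everything named here is an assumption for illustration: the `documents` table and its columns, the `answer` function, and the `allow`, `search`, and `ask_model` callables that stand in for the OPA call, the database driver, and the model client. The `<=>` operator is pgvector's cosine-distance operator.

```python
# Hypothetical table and column names. The tenant filter is part of the
# query itself, so rows from other tenants are never fetched.
TENANT_SCOPED_SEARCH = (
    "SELECT id, content FROM documents "
    "WHERE tenant_id = %s "
    "ORDER BY embedding <=> %s::vector "
    "LIMIT %s"
)


class AccessDenied(Exception):
    """Raised when the policy check rejects a request."""


def answer(identity, question, embedding, *, allow, search, ask_model, limit=5):
    """Policy check first, tenant-scoped retrieval second, model last.

    allow(policy_input) -> bool   stand-in for the OPA decision
    search(sql, params) -> rows   stand-in for a database cursor call
    ask_model(question, rows)     stand-in for the model client
    """
    policy_input = {
        "tenant_id": identity["tenant_id"],
        "role": identity["role"],
        "action": "query",
    }
    # 1. Hard gate: a denied request never reaches retrieval or the model.
    if not allow(policy_input):
        raise AccessDenied(f"role {identity['role']!r} may not query")
    # 2. Retrieval is scoped by tenant_id in the WHERE clause.
    rows = search(TENANT_SCOPED_SEARCH, (identity["tenant_id"], embedding, limit))
    # 3. The model only sees rows the identity was already allowed to read.
    return ask_model(question, rows)
```

In a real deployment `allow` would typically wrap a request to OPA's Data API with the same input document, and `search` would wrap a parameterised cursor call. Both are left as callables here so the control flow can be read and tested on its own.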

Proven Outcomes

Admin, user, and auditor roles proven end-to-end with curl
Cross-tenant retrieval blocked at the pgvector query level
Policy changes are Rego diffs — version-controlled, testable, auditable
Zero cloud cost to run — Docker Compose, local Postgres, OPA sidecar

Best fit for: Healthcare AI tools, legal tech platforms, financial services AI, any SaaS product integrating an LLM into an existing multi-tenant architecture.

View full case study

Edge IoT Telemetry

Edge-Resilient IoT Telemetry with AI Anomaly Detection

Sensor telemetry that works when the network doesn't — with real-time AI anomaly detection at the edge.

The Problem

Standard IoT pipelines assume connectivity. In rural, industrial, and remote environments, connectivity loss is a scheduled event, not an incident. Pipelines that drop data during outages or delay anomaly detection until the next cloud sync are not safe for agricultural, industrial, or community infrastructure.

The Solution

An async Python agent that decouples local processing from cloud sync. Every reading is buffered in SQLite immediately. Claude analyses the rolling sensor window on every reading — not on sync. Cloud replay happens in timestamp order when connectivity returns. Device identity is provisioned by Terraform with X.509 certificates and scoped IoT policies — no shared credentials on edge hardware.
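The buffer-then-replay behaviour can be sketched with the standard-library `sqlite3` module. This is a simplified, synchronous illustration rather than the async agent itself: the `readings` schema, the function names, and the `publish` callable (which stands in for the cloud client and returns `True` on success) are all assumptions for the example.

```python
import sqlite3


def open_buffer(path=":memory:"):
    """Open (or create) the local buffer. The schema is illustrative."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "ts REAL NOT NULL, "
        "value REAL NOT NULL, "
        "synced INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def buffer_reading(conn, ts, value):
    """Persist a reading locally before anything else happens to it."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO readings (ts, value) VALUES (?, ?)", (ts, value))


def replay(conn, publish):
    """Send unsynced readings oldest-first; stop at the first failure.

    Returns the number of readings that were published and marked synced.
    """
    rows = conn.execute(
        "SELECT id, ts, value FROM readings WHERE synced = 0 ORDER BY ts, id"
    ).fetchall()
    sent = 0
    for row_id, ts, value in rows:
        if not publish(ts, value):
            break  # still offline: leave the remainder buffered
        with conn:
            conn.execute("UPDATE readings SET synced = 1 WHERE id = ?", (row_id,))
        sent += 1
    return sent
```

Marking each row as synced only after a successful publish means an interrupted replay resumes from the first unsent reading and never skips one. The cost is that a reading may be sent twice if the process dies between the publish and the update.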

Proven Outcomes

10 readings buffered during simulated outage, synced in order on reconnect
28°C → 60°C thermal escalation detected with specific diagnosis and remediation recommendation
Real-time anomaly detection regardless of cloud connectivity state
Total cloud spend: ~$0.05

Best fit for: Agricultural operations, remote industrial sites, Indigenous and rural community infrastructure, precision agriculture, any deployment where connectivity is intermittent and data loss has operational consequences.

View full case study

Have a problem that fits one of these patterns?

These architectures are designed to be adapted, not just admired. If you're building in a regulated environment, adding AI to a multi-tenant system, or deploying to edge infrastructure, reach out.

Get in Touch