Solutions
Real architectures for real operational problems.
Proven, costed, and ready to adapt.
These aren't demo projects. Each solution below was designed to solve a specific class of problem, proven end-to-end on real cloud infrastructure, and documented with enough detail to adapt to a production environment. The common thread: security enforced by architecture rather than assumed by convention, resilience proven by simulation rather than hoped for in production.
Kubernetes Runtime Security
Kubernetes Runtime Security & Auto-Remediation
Detect, triage, and patch Kubernetes threats automatically — without a human in the loop for low-severity incidents.
The Problem
Security teams running Kubernetes clusters are buried in alert noise. Runtime threats demand expert triage, accurate severity judgment, and a fast manual response, and none of that gets faster by adding headcount.
The Solution
A two-layer detection architecture: OPA Gatekeeper blocks non-compliant workloads at admission, and Falco eBPF probes catch adversarial behaviour at runtime. Claude Sonnet triages every alert with structured severity classification. Low-severity threats are auto-patched within seconds. High-severity threats surface a Claude-drafted runbook for human review. Every decision is logged with its full reasoning chain.
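The routing step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the case study's implementation: `Severity`, `TriageDecision`, and `TriageRouter` are hypothetical names, the model call that produces the classification is assumed to have already happened, and the two handlers stand in for the real patch and runbook steps.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class Severity(Enum):
    LOW = "low"
    HIGH = "high"


@dataclass
class TriageDecision:
    """Structured result of triaging one runtime alert (hypothetical shape)."""
    alert_rule: str
    severity: Severity
    reasoning: str


@dataclass
class TriageRouter:
    """Routes a decision and records it, so every outcome is auditable."""
    auto_patch: Callable[[TriageDecision], None]
    request_review: Callable[[TriageDecision], None]
    audit_log: List[Dict[str, str]] = field(default_factory=list)

    def route(self, decision: TriageDecision) -> str:
        # Only low severity is remediated without a human; anything else is
        # escalated, so an unexpected severity fails towards review.
        action = "auto_patch" if decision.severity is Severity.LOW else "human_review"
        # Log before acting, so a failing handler still leaves a record.
        self.audit_log.append({
            "alert_rule": decision.alert_rule,
            "severity": decision.severity.value,
            "action": action,
            "reasoning": decision.reasoning,
        })
        if action == "auto_patch":
            self.auto_patch(decision)
        else:
            self.request_review(decision)
        return action
```

Writing the audit entry before calling a handler is a deliberate choice in this sketch: if remediation fails partway, the decision and its reasoning are still on record.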
Proven Outcomes
- Shell spawn detected → auto-patched in under 3 seconds on live AKS
- Privileged container blocked at admission before scheduling
- Complete audit trail of every triage decision
- Total infrastructure cost: ~$2 per validation session
Best fit for: Healthcare, financial services, multi-tenant SaaS, government contractors, any organisation with Kubernetes in a regulated environment.
View full case study
AI Access Control
Policy-Enforced AI Access Control for Multi-Tenant Systems
OPA Rego as a hard gate between user identity and your AI — the model doesn't decide who sees what.
The Problem
Adding an LLM to a multi-tenant product creates an access control problem that prompt engineering cannot solve. A model that enforces tenant isolation in its system prompt can be prompted around it. The isolation needs to happen before the model runs.
The Solution
OPA Rego is evaluated in the FastAPI middleware chain, so every request is policy-checked before Claude is invoked. pgvector semantic search is scoped by tenant_id at the query level, not filtered from the response. The model only ever sees documents the requesting identity is already permitted to see. Every access decision is logged with the specific Rego rule that fired.
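The ordering described above (policy check, then tenant-scoped retrieval, then the model) might look roughly like the following. This is a sketch under assumptions rather than the project's code: the `documents` table, its column names, and the `answer`, `policy_allows`, `run_query`, and `ask_model` names are all hypothetical, and the OPA and Claude calls are passed in as plain callables so the example runs without either service. The `<=>` operator is pgvector's cosine-distance operator.

```python
from typing import Callable, List, Sequence, Tuple


class AccessDenied(Exception):
    """Raised when the policy check rejects a request."""


def build_scoped_search(tenant_id: str, query_embedding: Sequence[float],
                        limit: int = 5) -> Tuple[str, tuple]:
    """Build a parameterised pgvector query restricted to one tenant."""
    sql = (
        "SELECT id, content FROM documents "
        "WHERE tenant_id = %s "               # isolation lives in the query itself
        "ORDER BY embedding <=> %s::vector "  # pgvector cosine distance
        "LIMIT %s"
    )
    # pgvector accepts a vector as a text literal such as '[0.1,0.2]'.
    vector_literal = "[" + ",".join(str(float(x)) for x in query_embedding) + "]"
    return sql, (tenant_id, vector_literal, limit)


def answer(identity: dict, question_embedding: Sequence[float], *,
           policy_allows: Callable[[dict], bool],
           run_query: Callable[[str, tuple], List[str]],
           ask_model: Callable[[List[str]], str]) -> str:
    # 1. Policy gate first: a denied request never reaches the model.
    if not policy_allows(identity):
        raise AccessDenied(identity.get("user", "unknown"))
    # 2. Retrieval is scoped to the caller's tenant before any rows come back.
    sql, params = build_scoped_search(identity["tenant_id"], question_embedding)
    documents = run_query(sql, params)
    # 3. The model only receives documents the caller may already see.
    return ask_model(documents)
```

Because the tenant filter is part of the SQL rather than a post-processing step, rows belonging to another tenant are never loaded into the application, so there is nothing for a crafted prompt to extract.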
Proven Outcomes
- Admin, user, and auditor roles proven end-to-end with curl
- Cross-tenant retrieval blocked at the pgvector query level
- Policy changes are Rego diffs — version-controlled, testable, auditable
- Zero cloud cost to run — Docker Compose, local Postgres, OPA sidecar
Best fit for: Healthcare AI tools, legal tech platforms, financial services AI, any SaaS product integrating an LLM into an existing multi-tenant architecture.
View full case study
Edge IoT Telemetry
Edge-Resilient IoT Telemetry with AI Anomaly Detection
Sensor telemetry that works when the network doesn't — with real-time AI anomaly detection at the edge.
The Problem
Standard IoT pipelines assume connectivity. In rural, industrial, and remote environments, connectivity loss is a scheduled event, not an incident. Pipelines that drop data during outages or delay anomaly detection until the next cloud sync are not safe for agricultural, industrial, or community infrastructure.
The Solution
An async Python agent that decouples local processing from cloud sync. Every reading is buffered in SQLite immediately. Claude analyses the rolling sensor window on every reading — not on sync. Cloud replay happens in timestamp order when connectivity returns. Device identity is provisioned by Terraform with X.509 certificates and scoped IoT policies — no shared credentials on edge hardware.
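The buffer-then-replay behaviour described above can be illustrated with Python's standard `sqlite3` module. This is a simplified, synchronous sketch under assumptions, not the agent itself: `ReadingBuffer`, the `readings` schema, and the `send` callback are hypothetical, and the real agent is described as async.

```python
import sqlite3
from typing import Callable


class ReadingBuffer:
    """Stores every reading locally first, then replays unsynced rows in order."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS readings ("
            " id INTEGER PRIMARY KEY,"
            " ts REAL NOT NULL,"
            " value REAL NOT NULL,"
            " synced INTEGER NOT NULL DEFAULT 0)"
        )
        self.db.commit()

    def record(self, ts: float, value: float) -> None:
        # Persist immediately, whether or not the cloud is reachable.
        self.db.execute("INSERT INTO readings (ts, value) VALUES (?, ?)", (ts, value))
        self.db.commit()

    def replay(self, send: Callable[[float, float], bool]) -> int:
        """Send unsynced readings oldest-first; stop at the first failure."""
        rows = self.db.execute(
            "SELECT id, ts, value FROM readings WHERE synced = 0 ORDER BY ts, id"
        ).fetchall()
        sent = 0
        for row_id, ts, value in rows:
            if not send(ts, value):
                break  # still offline: keep the remaining rows for the next attempt
            # Mark each row only after its send succeeded.
            self.db.execute("UPDATE readings SET synced = 1 WHERE id = ?", (row_id,))
            self.db.commit()
            sent += 1
        return sent
```

Marking a row as synced only after `send` reports success means an outage in the middle of a replay loses nothing: the next call resumes from the first unsent reading, still in timestamp order.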
Proven Outcomes
- 10 readings buffered during simulated outage, synced in order on reconnect
- 28°C → 60°C thermal escalation detected with specific diagnosis and remediation recommendation
- Real-time anomaly detection regardless of cloud connectivity state
- Total cloud spend: ~$0.05
Best fit for: Agricultural operations, remote industrial sites, Indigenous and rural community infrastructure, precision agriculture, any deployment where connectivity is intermittent and data loss has operational consequences.
View full case study
Have a problem that fits one of these patterns?
These architectures are designed to be adapted, not just admired. If you're building in a regulated environment, adding AI to a multi-tenant system, or deploying to edge infrastructure — reach out.
Get in Touch