Active / flagship
kube-insight
The missing history layer for Kubernetes AIOps.
Logs have search systems. Metrics have time-series stores. Traces preserve application flows. Kubernetes
infrastructure state is still too often reduced to whatever the apiserver shows right now. kube-insight turns that
gap into an AIOps foundation: it records Kubernetes resource history at low operational cost, extracts facts and
topology, and exposes human- and agent-friendly query surfaces. Agents can work from retained evidence first,
then use live kubectl only for final confirmation instead of rebuilding context from scratch.
Why it matters
Current state is useful. It is not the whole story.
kubectl is still the live-state baseline, but the evidence for many incidents is already gone by the time someone investigates: Events expire, rollouts are reverted, RBAC edits are corrected, EndpointSlices move, and Pods are replaced. kube-insight keeps the missing Kubernetes evidence and shapes it into fast, scoped investigation paths.
Keep the state that disappeared
Events expire, Pods restart, EndpointSlices change, and deleted objects vanish from the apiserver. kube-insight keeps observed versions and timestamps so the old state can still be inspected.
Turn raw history into queryable clues
Extracted facts, changes, and topology edges let operators and agents rank candidate Services, Pods, Events, owners, RBAC, webhooks, and policies before opening full JSON proof.
Reduce the agent blast radius
Configurable filters and extractors redact sensitive data before storage. Future service mode will inherit Kubernetes RBAC so agents see only what they are allowed to inspect.
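As a rough illustration of redaction before storage, the sketch below blanks Secret payloads and selected annotations on a copy of an object before it would be written. The function name redact_object, the placeholder value, and the list of sensitive annotation keys are assumptions made for this example; they are not kube-insight's actual configuration surface.

```python
import copy

REDACTED = "<redacted>"  # hypothetical placeholder value

# Hypothetical set of annotation keys treated as sensitive.
SENSITIVE_ANNOTATIONS = {"kubectl.kubernetes.io/last-applied-configuration"}


def redact_object(obj: dict) -> dict:
    """Return a redacted copy of a Kubernetes object; the input is not modified."""
    out = copy.deepcopy(obj)

    # Secret payloads: keep the key names, drop the values.
    if out.get("kind") == "Secret":
        for field in ("data", "stringData"):
            if isinstance(out.get(field), dict):
                out[field] = {key: REDACTED for key in out[field]}

    # Annotations can embed full manifests, so blank the sensitive ones.
    annotations = (out.get("metadata") or {}).get("annotations") or {}
    for key in list(annotations):
        if key in SENSITIVE_ANNOTATIONS:
            annotations[key] = REDACTED
    return out


if __name__ == "__main__":
    secret = {
        "kind": "Secret",
        "metadata": {"name": "db-credentials"},
        "data": {"password": "cGFzc3dvcmQ="},
    }
    print(redact_object(secret))
```

Keeping the key names while dropping the values preserves the shape of the object for later investigation without retaining the sensitive content itself.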
Performance
Measured as investigation workflows, not isolated database tricks.
The validation compares retained evidence against broad live kubectl paths, then separates SQLite,
ClickHouse, and chDB tradeoffs. The product claim is focused: pre-extracted evidence makes AIOps workflows
faster and more repeatable before the final live-state check.
Validation profile
Evidence queries stay small because the joins are already shaped.
Five retained-evidence workflows over SQLite evidence.
Comparable broad live calls reconstructing the same context.
ClickHouse SQL/API path used 3 operations; raw kubectl used 4 calls.
Agent workflow benchmark
Retained evidence vs broad live kubectl
Same-dataset storage harness
Choose by operating model, not a single latency number.
SQLite
- ingest: 17.42 s
- service: 80.6 ms
- storage: 4.61 MB DB
ClickHouse
- ingest: 7.91 s
- service: 182.0 ms
- storage: 597 KiB active, ~4.9x
chDB
- ingest: 1.52 s
- service: 506.9 ms
- storage: 1.23 MB dir, ~5.7x
Use cases
Actual investigation shapes from the project docs.
A capability list is not enough. These cases show how retained facts, edges, observations, and versions become practical incident evidence.
Expired events
PolicyViolation events after the workload looks healthy
Symptom
A deployment was rejected or repeatedly reconciled with policy warnings. By the time someone investigates, the workload may look healthy and Events may have rotated out.
Why live kubectl is weak later
- Events are short-lived and often rotated.
- Warning Events must be joined back to Deployments, ReplicaSets, and Pods.
- The policy controller may no longer list every affected object.
Evidence kube-insight uses
- k8s_event.reason, type, and message facts
- event edges to involved resources
- Deployment, ReplicaSet, and Pod retained versions
Query shape
where fact_key in ('k8s_event.reason', 'k8s_event.type') and (fact_value = 'Warning' or severity >= 60)
What you get
PolicyViolation warning Events tied back to workload objects, even when the current cluster no longer shows the full incident window.
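The query shape above can be tried against a small in-memory SQLite table. This is a minimal sketch: the facts table name, its columns other than fact_key, fact_value, and severity, and the sample rows are all assumptions for illustration, not the project's real schema.

```python
import sqlite3

# Hypothetical facts table; only fact_key, fact_value, and severity
# appear in the query shape, the remaining columns are assumed.
SCHEMA = """
CREATE TABLE facts (
    object_name TEXT,
    fact_key    TEXT,
    fact_value  TEXT,
    severity    INTEGER,
    observed_at TEXT
);
"""

# Invented sample rows: one warning Event, one normal Event, one Pod fact.
SAMPLE_ROWS = [
    ("ev-1", "k8s_event.reason", "PolicyViolation", 70, "2024-01-01T10:00:00Z"),
    ("ev-1", "k8s_event.type", "Warning", 70, "2024-01-01T10:00:00Z"),
    ("ev-2", "k8s_event.reason", "Scheduled", 10, "2024-01-01T10:01:00Z"),
    ("ev-2", "k8s_event.type", "Normal", 10, "2024-01-01T10:01:00Z"),
    ("web-1", "pod.ready", "false", 80, "2024-01-01T10:02:00Z"),
]

QUERY = """
SELECT object_name, fact_key, fact_value, severity, observed_at
FROM facts
WHERE fact_key IN ('k8s_event.reason', 'k8s_event.type')
  AND (fact_value = 'Warning' OR severity >= 60)
ORDER BY observed_at, fact_key
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding the sample facts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?, ?, ?)", SAMPLE_ROWS)
    return conn


def find_warning_events(conn: sqlite3.Connection) -> list:
    """Run the query shape and return the matching fact rows."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in find_warning_events(build_demo_db()):
        print(row)
```

With these sample rows, only the two facts recorded for ev-1 match. The Pod fact is excluded by the fact_key filter even though its severity is high, which is what keeps the candidate set limited to Events.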
Service topology
Service / EndpointSlice proof after resources changed
Symptom
A Service briefly routed to no endpoints or unready Pods. Later the Service is healthy, old Pods may be replaced, and the useful topology has moved on.
Why live kubectl is weak later
- Current EndpointSlices only show current endpoints.
- Deleted rollout objects and old Pods cannot be reconstructed from live state alone.
- Pod readiness transitions and Events may no longer line up in one live query.
Evidence kube-insight uses
- endpointslice_for_service edges
- endpointslice_targets_pod edges
- Endpoint readiness, Pod readiness, and restart facts
- Service investigation bundle with proof versions
Query shape
endpointslice_for_service -> endpointslice_targets_pod -> Pod readiness facts -> retained versions
What you get
The investigation can show which historical EndpointSlices pointed at which Pods, then use kubectl only as the final live-state comparison.
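The traversal above can be sketched as two edge joins followed by a fact lookup. The edge type names come from this page; the edges and facts table layouts, the edge direction (EndpointSlice as source), the pod.ready fact key, and the sample rows are assumptions for illustration only.

```python
import sqlite3

# Hypothetical tables; the edge direction (EndpointSlice as src) is assumed.
SCHEMA = """
CREATE TABLE edges (edge_type TEXT, src TEXT, dst TEXT);
CREATE TABLE facts (object_name TEXT, fact_key TEXT, fact_value TEXT, observed_at TEXT);
"""

EDGES = [
    ("endpointslice_for_service", "web-abc", "web"),
    ("endpointslice_targets_pod", "web-abc", "web-1"),
    ("endpointslice_targets_pod", "web-abc", "web-2"),
    ("endpointslice_for_service", "api-xyz", "api"),
    ("endpointslice_targets_pod", "api-xyz", "api-1"),
]

FACTS = [
    ("web-1", "pod.ready", "false", "2024-01-01T10:00:00Z"),
    ("web-1", "pod.ready", "true", "2024-01-01T10:05:00Z"),
    ("web-2", "pod.ready", "true", "2024-01-01T10:00:00Z"),
    ("api-1", "pod.ready", "true", "2024-01-01T10:00:00Z"),
]

# Service -> EndpointSlices -> Pods -> readiness facts over time.
QUERY = """
SELECT s.src AS endpointslice, p.dst AS pod, f.fact_value AS ready, f.observed_at
FROM edges AS s
JOIN edges AS p
  ON p.src = s.src AND p.edge_type = 'endpointslice_targets_pod'
LEFT JOIN facts AS f
  ON f.object_name = p.dst AND f.fact_key = 'pod.ready'
WHERE s.edge_type = 'endpointslice_for_service' AND s.dst = ?
ORDER BY f.observed_at, p.dst
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding the sample edges and facts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO edges VALUES (?, ?, ?)", EDGES)
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?, ?)", FACTS)
    return conn


def pods_behind_service(conn: sqlite3.Connection, service: str) -> list:
    """Return (endpointslice, pod, ready, observed_at) rows for one Service."""
    return conn.execute(QUERY, (service,)).fetchall()


if __name__ == "__main__":
    for row in pods_behind_service(build_demo_db(), "web"):
        print(row)
```

The LEFT JOIN keeps Pods that have no readiness fact recorded, so a missing observation shows up as an empty value instead of silently dropping the Pod from the candidate path.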
Architecture
Facts and edges are the candidate path. Versions are the proof.
Kubernetes data is captured once, filtered before storage, extracted into investigation tables, then served through narrow read surfaces: CLI, HTTP API, read-only SQL, MCP tools, and agent prompts.
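To make the extraction step concrete, here is a minimal sketch that turns one already-filtered Pod object into fact and edge tuples. The function name extract_pod, the fact keys pod.ready and pod.restart_count, and the pod_owned_by edge type are hypothetical names chosen for this example; only the Pod fields read from the object are standard Kubernetes fields.

```python
def extract_pod(pod: dict) -> tuple:
    """Derive (facts, edges) from a single Pod object.

    facts are (object_name, fact_key, fact_value) tuples;
    edges are (edge_type, source, target) tuples.
    """
    meta = pod.get("metadata") or {}
    status = pod.get("status") or {}
    name = meta.get("name", "")
    facts, edges = [], []

    # Readiness comes from the Pod's "Ready" condition, when present.
    for cond in status.get("conditions") or []:
        if cond.get("type") == "Ready":
            facts.append((name, "pod.ready", cond.get("status", "Unknown")))

    # Restart count is summed across all reported container statuses.
    restarts = sum(
        cs.get("restartCount", 0) for cs in status.get("containerStatuses") or []
    )
    facts.append((name, "pod.restart_count", str(restarts)))

    # Owner references become topology edges (for example Pod -> ReplicaSet).
    for owner in meta.get("ownerReferences") or []:
        target = f'{owner.get("kind", "")}/{owner.get("name", "")}'
        edges.append(("pod_owned_by", name, target))
    return facts, edges


if __name__ == "__main__":
    pod = {
        "metadata": {
            "name": "web-1",
            "ownerReferences": [{"kind": "ReplicaSet", "name": "web-5d9"}],
        },
        "status": {
            "conditions": [{"type": "Ready", "status": "False"}],
            "containerStatuses": [{"restartCount": 3}],
        },
    }
    print(extract_pod(pod))
```

Small tuples like these are what make the candidate path cheap to query; the full retained object version remains the proof once a candidate is chosen.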
Architecture flow
Same shape as the project architecture: capture, filter, store, query.
SQLite default / chDB local / ClickHouse central
Evidence model
Small tables, useful answers.
Storage modes
Start local. Keep history central when the team needs it.
SQLite
A pure-Go default artifact with one local evidence database for first captures, laptops, CI fixtures, and local agent workflows.
chDB
A chDB-enabled artifact when you want ClickHouse-compatible local tables without operating a ClickHouse server.
ClickHouse
A continuous evidence service for append-heavy history, compression, API/MCP reads, and future cold-tiering work.
Next steps
Start with the repository quickstart and storage notes.
Installation, MCP usage, SQL recipes, security, retention, and storage-mode tradeoffs are kept in project documentation so the website can stay focused on product shape.