Ingestion And Extraction
Sources
MVP sources:
- Kubernetes discovery API.
- Dynamic Kubernetes list/watch API for all allowed resources.
- Periodic full list snapshots.
- Kubernetes Events watch.
- Local kubeconfig collection for PoC.
The default collector should be a global watcher. It discovers every list/watch-capable GroupVersionResource and stores every observed object in the generic resource history store, subject to the configured filter pipeline.
P0 typed extractors:
- Pod
- Deployment
- ReplicaSet
- Service
- EndpointSlice
- Node
- Event
P1 typed extractors:
- StatefulSet
- DaemonSet
- Job
- CronJob
- ConfigMap
- Secret metadata only
- PVC
P2 typed extractors:
- Ingress
- Gateway API resources
- HPA
- PDB
- NetworkPolicy
- custom resources
Unknown CRDs and resources without typed extractors still get:
- resource history,
- latest index,
- generic metadata identity,
- basic ownerReferences topology,
- generic change summary when feasible.
Ingestion Pipeline
discovery/list/watch event -> decode -> apply resource filters -> resolve object identity -> deduplicate by doc_hash -> store resource version -> update latest index -> extract topology -> extract facts -> extract change summaries -> commit offset
Filter Pipeline
Filters are ordered, configurable write-path stages. They can discard a whole observation, transform a resource into its retained form, or attach metadata for storage and later audit.
Default filter chain:
resource_policy_filter -> secret_redaction_filter -> metadata_normalization_filter -> resource_version_normalization_filter -> status_condition_timestamp_normalization_filter -> leader_election_configmap_normalization_filter -> high_churn_field_filter -> change_discard_filter
Filter outcomes:
- `keep`
- `keep_modified`
- `discard_change`
- `discard_resource`
Filter requirements:
- Run before storage and before `doc_hash` calculation for retained documents.
- Persist high-value filter decisions to the audit table, including decisions that discard the resource before a retained version exists.
- Roll up high-volume low-risk normalization decisions, such as `resourceVersion`, `managedFields`, condition timestamp, and leader-election annotation removal, into aggregate time buckets instead of per-object rows.
- For destructive `keep_modified` decisions, include structured audit metadata such as `removedFields`, `redactedFields`, and policy-specific markers like `secretPayloadRemoved` so benchmark and safety checks can aggregate them.
- Make destructive filters explicit so compliance mode can disable or replace them.
- Keep enough object metadata to write tombstones and close topology edges.
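The ordered chain and its four outcomes can be sketched as follows. This is a minimal illustration, not the project's implementation: the `Decision` record, the `run_chain` helper, the rule that the chain stops at the first discard, and the Lease policy are all assumptions made for the example. Only the filter names and outcome names come from the list above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    KEEP = "keep"
    KEEP_MODIFIED = "keep_modified"
    DISCARD_CHANGE = "discard_change"
    DISCARD_RESOURCE = "discard_resource"


@dataclass
class Decision:
    """One audit record per filter invocation (hypothetical shape)."""
    filter_name: str
    outcome: Outcome
    meta: dict = field(default_factory=dict)  # e.g. {"removedFields": [...]}


def run_chain(doc: dict, chain: list) -> tuple:
    """Run filters in order; assumed to stop at the first discard outcome."""
    decisions = []
    for name, fn in chain:
        outcome, doc, meta = fn(doc)
        decisions.append(Decision(name, outcome, meta))
        if outcome in (Outcome.DISCARD_CHANGE, Outcome.DISCARD_RESOURCE):
            return outcome, doc, decisions
    modified = any(d.outcome is Outcome.KEEP_MODIFIED for d in decisions)
    return (Outcome.KEEP_MODIFIED if modified else Outcome.KEEP), doc, decisions


def resource_policy_filter(doc: dict) -> tuple:
    # Hypothetical policy for the example: skip Leases entirely.
    if doc.get("kind") == "Lease":
        return Outcome.DISCARD_RESOURCE, doc, {"reason": "policy_skip"}
    return Outcome.KEEP, doc, {}


def resource_version_normalization_filter(doc: dict) -> tuple:
    meta = dict(doc.get("metadata", {}))
    if "resourceVersion" not in meta:
        return Outcome.KEEP, doc, {}
    del meta["resourceVersion"]
    audit = {"removedFields": ["metadata.resourceVersion"]}
    return Outcome.KEEP_MODIFIED, {**doc, "metadata": meta}, audit


CHAIN = [
    ("resource_policy_filter", resource_policy_filter),
    ("resource_version_normalization_filter", resource_version_normalization_filter),
]
```

Returning the decision list alongside the document keeps auditing separate from filtering, so the caller can decide which decisions become per-object audit rows and which are rolled up into time buckets.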
Redaction Filters
Default behavior:
- Do not store Secret payloads.
- Store Secret metadata, refs, and optionally data hashes.
- Count removed Secret `data` and `stringData` fields as redacted fields and fail the safety metric if retained latest Secret JSON still contains either payload field.
- Redact known token-like env values when configured.
- Keep ConfigMap data by default, but allow path-based exclusion.
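A minimal sketch of the Secret redaction step and the matching safety check follows. The function names, the `keep_hashes` flag, and the `dataHashes` field are hypothetical. The sketch hashes values with plain SHA-256 only to show where optional hashes would go, which is an assumption about the hashing policy.

```python
import copy
import hashlib

PAYLOAD_FIELDS = ("data", "stringData")


def redact_secret(doc: dict, keep_hashes: bool = False) -> tuple:
    """Return (retained_doc, audit) with Secret payload fields removed."""
    if doc.get("kind") != "Secret":
        return doc, {}
    retained = copy.deepcopy(doc)  # never mutate the observed object
    redacted, hashes = [], {}
    for field_name in PAYLOAD_FIELDS:
        payload = retained.pop(field_name, None)
        if payload is None:
            continue
        redacted.append(field_name)
        if keep_hashes:
            for key, value in payload.items():
                hashes[key] = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    audit = {"redactedFields": redacted, "secretPayloadRemoved": bool(redacted)}
    if hashes:
        retained["dataHashes"] = hashes  # hypothetical field name
    return retained, audit


def secret_payload_leaked(retained: dict) -> bool:
    """Safety check: retained Secret JSON must not contain payload fields."""
    if retained.get("kind") != "Secret":
        return False
    return any(name in retained for name in PAYLOAD_FIELDS)
```

Running `secret_payload_leaked` against the latest retained Secret documents gives the pass/fail signal for the safety metric described above.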
Normalization Filters
Default ignored or normalized fields:
- `metadata.resourceVersion`
- `metadata.managedFields`
- `status.conditions[*].lastHeartbeatTime`
- `status.conditions[*].lastTransitionTime`
- `metadata.annotations["control-plane.alpha.kubernetes.io/leader"]` on ConfigMaps
- `metadata.annotations["kubectl.kubernetes.io/last-applied-configuration"]`
Do not blindly drop status. Status contains most incident evidence. Instead, extract useful status facts and normalize timestamp-only condition churn so the retained document hash tracks state changes rather than heartbeat updates.
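The effect of normalization on the document hash can be illustrated with a short sketch. It assumes `doc_hash` is a SHA-256 over canonical JSON of the normalized document. The actual hash function and canonicalization are not specified here, so treat both as placeholders.

```python
import copy
import hashlib
import json

CONDITION_TIMESTAMPS = ("lastHeartbeatTime", "lastTransitionTime")
IGNORED_ANNOTATIONS = ("kubectl.kubernetes.io/last-applied-configuration",)
LEADER_ANNOTATION = "control-plane.alpha.kubernetes.io/leader"


def normalize(doc: dict) -> dict:
    """Return a copy with the default ignored fields removed."""
    out = copy.deepcopy(doc)
    meta = out.get("metadata") or {}
    meta.pop("resourceVersion", None)
    meta.pop("managedFields", None)
    annotations = meta.get("annotations") or {}
    for key in IGNORED_ANNOTATIONS:
        annotations.pop(key, None)
    if out.get("kind") == "ConfigMap":
        annotations.pop(LEADER_ANNOTATION, None)
    # Status is kept; only timestamp-only churn inside conditions is dropped.
    for cond in (out.get("status") or {}).get("conditions") or []:
        for ts in CONDITION_TIMESTAMPS:
            cond.pop(ts, None)
    return out


def doc_hash(doc: dict) -> str:
    """Assumed hash: SHA-256 over canonical JSON of the normalized document."""
    canonical = json.dumps(normalize(doc), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this shape, a Node heartbeat that only bumps `resourceVersion` and `lastHeartbeatTime` hashes identically to the previous observation, while a `Ready` flip from `True` to `False` produces a new hash.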
Change Discard Filters
Some observations are valid but not useful enough to store as new versions.
Examples:
- unchanged document hash,
- only ignored heartbeat fields changed,
- noisy resources such as Leases when policy chooses skip or downsample.
Discarding a change must still update ingestion metrics and offsets. It must not hide a delete, a UID change, or a topology/fact transition.
The ingestion pipeline preserves DELETED observations even if a change discard
filter requests discard_change; the decision is still audited, but the delete
observation continues to storage.
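The discard rules above can be sketched as a small decision function. The `Ingestor` class, its counters, and the shape of `prev`/`curr` are hypothetical. Fact and topology transitions are assumed to be handled by extractors before this step and are not modelled here.

```python
def should_store(event_type: str, prev, curr: dict, filter_outcome: str) -> bool:
    """Decide whether an observation becomes a new stored version.

    prev and curr are small dicts with 'uid' and 'doc_hash'; prev may be None.
    """
    if event_type == "DELETED":
        return True  # deletes always reach storage, even on discard_change
    if prev is None or prev["uid"] != curr["uid"]:
        return True  # first sighting, or delete/recreate under a new UID
    if filter_outcome == "discard_change":
        return False
    return prev["doc_hash"] != curr["doc_hash"]


class Ingestor:
    def __init__(self):
        self.stored = []
        self.discarded = 0
        self.offset = 0

    def observe(self, offset, event_type, prev, curr, filter_outcome="keep"):
        if should_store(event_type, prev, curr, filter_outcome):
            self.stored.append((event_type, curr["uid"], curr["doc_hash"]))
        else:
            self.discarded += 1  # metrics still move on a discard
        self.offset = offset  # the offset is committed either way
```

Checking `DELETED` and UID changes before consulting the filter outcome means a misconfigured discard filter cannot hide a delete or a recreate.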
Resource-Specific Processing
The global watcher provides coverage, but some resources need specialized processing profiles for performance and signal quality. A profile can choose a filter chain, extractor set, retention policy, compaction strategy, and queue priority for one GVR.
Default rule:
- unknown resources: generic history + latest index + generic metadata/change summary
- known high-value resources: generic history + resource-specific filters/extractors/index hints
Resource identity resolution should be data-driven before it falls back to compiled defaults. The order is:
1. discovered api_resources persisted from client-go or kubectl discovery
2. CRD definitions observed in the current batch
3. built-in Kubernetes seed resources
4. conservative fallback for unknown resources
Kind-only and resource-only seed matches must not cross API groups. If a grouped resource is not discovered, the resolver should return an explicit conservative fallback for that group rather than borrowing a built-in resource or scope from another group.
Do not add new independent kind/resource/group switch tables in extractors,
ingest, or storage. Route GVK/GVR/scope lookups through the shared resolver so
CRDs and uncommon API groups keep their real plural, group, version, and scope.
The ingestion pipeline passes the active resolver, including persisted discovery
data and CRDs observed in the batch, into extractors. Reference-style extractors
must use that resolver for ownerReferences, Events, scaleTargetRef, Gateway
refs, RBAC subjects, webhook services, and other object-reference edges.
SQLite edge-target materialization also checks persisted api_resources before
falling back to compiled defaults, so topology queries preserve discovered CRD
plural names and scope.
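The resolution order and the no-cross-group rule can be sketched as a single lookup keyed by `(group, kind)`. Everything named here is illustrative: `ResourceInfo`, the `SEED` table, and in particular the fallback's guessed plural and namespaced scope are assumptions, since the exact conservative fallback is not defined in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceInfo:
    group: str
    kind: str
    plural: str
    namespaced: bool
    source: str  # which tier answered: discovery, crd, seed, or fallback


# Hypothetical built-in seed table: (group, kind) -> (plural, namespaced).
SEED = {
    ("", "Pod"): ("pods", True),
    ("", "Node"): ("nodes", False),
    ("apps", "Deployment"): ("deployments", True),
}


class Resolver:
    def __init__(self, discovered=None, batch_crds=None):
        # Both maps use the same (group, kind) -> (plural, namespaced) shape.
        self.discovered = discovered or {}
        self.batch_crds = batch_crds or {}

    def resolve(self, group: str, kind: str) -> ResourceInfo:
        key = (group, kind)
        tiers = (
            ("discovery", self.discovered),
            ("crd", self.batch_crds),
            ("seed", SEED),
        )
        for source, table in tiers:
            if key in table:
                plural, namespaced = table[key]
                return ResourceInfo(group, kind, plural, namespaced, source)
        # Fallback stays inside the requested group: nothing is borrowed from
        # a same-named kind elsewhere. Plural and scope are guesses (assumption).
        return ResourceInfo(group, kind, kind.lower() + "s", True, "fallback")
```

Because every tier is keyed by the full `(group, kind)` pair, a `Deployment` in an unknown group can never pick up the `apps` plural or scope by accident. Recording `source` on the result makes fallback answers easy to spot in audits.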
Each profile must define five outputs:
- retained versions: which observations become reconstructable evidence
- facts: compact queryable signals written to object_facts
- edges: topology relationships written to object_edges
- status changes: status transitions that should become facts and/or object_changes
- change summaries: small timeline entries written to object_changes
Resource-specific output contract:
| Resource | Facts | Edges | Status changes | Change summaries |
|---|---|---|---|---|
| Pod | phase, reason, last reason, restart count, ready, QoS, scheduled/deleted | Pod -> Node, Pod -> owner, Pod -> ConfigMap/Secret/PVC | container state transitions, readiness transitions, scheduling, deletion | spec, ownerRefs, labels affecting selection, resources, image, probes |
| Node | Ready, MemoryPressure, DiskPressure, PIDPressure, NetworkUnavailable, taints, capacity/allocatable changes | optional Node -> hosted Pod, queried through Pod -> Node rather than duplicated | condition transitions and pressure changes | labels, taints, capacity/allocatable, provider/zone metadata |
| Event | reason, type, involved object, reporting controller, message fingerprint, count | involved object reference when resolvable | count/lastTimestamp changes for repeated Events | rollup bucket opened/updated/closed |
| EndpointSlice | endpoint readiness, serving/terminating when present, endpoint count | Service -> Pod via targetRef, EndpointSlice -> Pod optional for proof | endpoint readiness/serving/terminating transitions | membership added/removed, targetRef changed |
| Service | selector, type, ports, clusterIP/load balancer changes | Service -> EndpointSlice, selector-derived Service -> Pod fallback | load balancer ingress/status transitions | selector, ports, type, external traffic policy |
| Deployment/ReplicaSet | generation, observedGeneration, replica counts, rollout condition, template hash | ReplicaSet -> Deployment, Pod -> ReplicaSet | rollout condition and replica availability transitions | Pod template image/env/resources/probes/selectors |
| ConfigMap/Secret metadata | reference count when used, optional data hash by policy | Pod/workload -> ConfigMap/Secret | normally none | key set/hash changed, metadata changed |
Pod Fast Path
Pods are high-volume and central to most investigations.
Special handling:
- Always detect UID changes, delete/recreate, scheduling changes, and readiness transitions.
- Extract status facts before deciding whether a change is worth storing.
- Write `object_changes` for meaningful status transitions, not only spec changes.
- Discard pure heartbeat changes only when status, owner refs, labels, annotations relevant to selection, spec, and placement are unchanged.
- Store Pod -> Node, Pod -> owner, and Pod -> ConfigMap/Secret/PVC edges from compact typed extraction.
- Preserve enough retained JSON to reconstruct evidence for selected versions.
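The compact typed extraction of Pod edges mentioned above might look like the sketch below. The field paths are the standard Pod spec fields. The `pod_uses` edge type and the tuple shape are hypothetical, and the shared resolver lookup for owner kinds is left out to keep the example short.

```python
def pod_edges(pod: dict) -> set:
    """Return (edge_type, target_kind, target_name) tuples for one Pod."""
    spec = pod.get("spec") or {}
    edges = set()
    if spec.get("nodeName"):
        edges.add(("pod_on_node", "Node", spec["nodeName"]))
    for ref in (pod.get("metadata") or {}).get("ownerReferences") or []:
        edges.add(("pod_owned_by_" + ref["kind"].lower(), ref["kind"], ref["name"]))
    for vol in spec.get("volumes") or []:
        if "configMap" in vol:
            edges.add(("pod_uses", "ConfigMap", vol["configMap"]["name"]))
        if "secret" in vol:
            edges.add(("pod_uses", "Secret", vol["secret"]["secretName"]))
        if "persistentVolumeClaim" in vol:
            claim = vol["persistentVolumeClaim"]["claimName"]
            edges.add(("pod_uses", "PersistentVolumeClaim", claim))
    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    for container in containers:
        for src in container.get("envFrom") or []:
            if "configMapRef" in src:
                edges.add(("pod_uses", "ConfigMap", src["configMapRef"]["name"]))
            if "secretRef" in src:
                edges.add(("pod_uses", "Secret", src["secretRef"]["name"]))
        for env in container.get("env") or []:
            value_from = env.get("valueFrom") or {}
            if "configMapKeyRef" in value_from:
                edges.add(("pod_uses", "ConfigMap", value_from["configMapKeyRef"]["name"]))
            if "secretKeyRef" in value_from:
                edges.add(("pod_uses", "Secret", value_from["secretKeyRef"]["name"]))
    return edges
```

Returning a set deduplicates references that appear more than once, for example a Secret mounted as a volume and also read through `secretKeyRef`, so the edge store sees one relationship per target.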
Node Summary
Nodes are large and status-heavy. Full JSON is useful as evidence, but status heartbeats should not each become a high-cost query path.
Special handling:
- Extract condition transitions, allocatable/capacity changes, taints, labels, and pressure signals as facts.
- Write `object_changes` for condition, taint, capacity, allocatable, and topology-relevant label changes.
- Avoid indexing large status arrays generically.
- Use stronger compression or less frequent full snapshots for large Node documents when reconstruction latency remains bounded.
- Optionally downsample unchanged Node status observations after facts are extracted.
Event Rollup
Events are high-churn and often repetitive.
Special handling:
- Mirror Event reason, type, involved object, reporting controller, count, and first/last timestamp into facts.
- Fingerprint messages to control cardinality.
- Write count changes as Event rollup change summaries so investigation results show repeated Event progression even after native Events expire.
- Roll up repeated Events with the same involved object, reason, and message fingerprint inside a short time bucket.
- Write change summaries when an Event rollup bucket is opened, materially incremented, or closed.
- Keep raw Event versions only according to retention policy; facts are the primary query path after native Kubernetes Event expiry.
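A sketch of message fingerprinting and time-bucketed rollup follows. The bucket width, the token-collapsing regular expression, and the 16-character fingerprint length are arbitrary choices for the example. The Event fields follow the core/v1 Event shape (`involvedObject`, `reason`, `message`).

```python
import hashlib
import re

BUCKET_SECONDS = 300  # hypothetical bucket width


def message_fingerprint(message: str) -> str:
    """Collapse variable tokens (numbers, long hex ids) so similar messages share a key."""
    template = re.sub(r"[0-9a-f]{8,}|\d+", "#", message.lower())
    return hashlib.sha256(template.encode("utf-8")).hexdigest()[:16]


def rollup_key(event: dict, ts: int) -> tuple:
    obj = event["involvedObject"]
    return (
        obj.get("namespace", ""),
        obj["kind"],
        obj["name"],
        event["reason"],
        message_fingerprint(event.get("message", "")),
        ts // BUCKET_SECONDS,  # bucket index
    )


def roll_up(events) -> dict:
    """events: iterable of (unix_ts, event_dict). Returns {rollup_key: observation count}."""
    buckets = {}
    for ts, event in events:
        key = rollup_key(event, ts)
        buckets[key] = buckets.get(key, 0) + 1
    return buckets
```

Fingerprinting the templated message rather than the raw text is what keeps cardinality bounded: back-off messages that differ only in the delay value fall into the same bucket.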
EndpointSlice Topology
EndpointSlices are the preferred source for Service membership.
Special handling:
- Extract namespace-qualified Service -> Pod edges from `endpoints[*].targetRef`.
- Extract endpoint readiness, serving, and terminating state as facts when present.
- Store edge interval changes instead of repeatedly writing identical service membership.
- Write change summaries for membership added/removed and readiness transitions.
- Treat endpoint address churn without targetRef changes as lower priority unless it affects readiness or topology evidence.
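The extraction and the interval-style diff can be sketched as below. The owning Service is read from the standard `kubernetes.io/service-name` label on the EndpointSlice. The function names and the returned dict shapes are hypothetical.

```python
SERVICE_NAME_LABEL = "kubernetes.io/service-name"


def service_pod_edges(endpoint_slice: dict) -> dict:
    """Map (namespace, service, pod) -> endpoint conditions for one EndpointSlice."""
    meta = endpoint_slice.get("metadata") or {}
    namespace = meta.get("namespace", "")
    service = (meta.get("labels") or {}).get(SERVICE_NAME_LABEL)
    edges = {}
    if not service:
        return edges
    for endpoint in endpoint_slice.get("endpoints") or []:
        ref = endpoint.get("targetRef") or {}
        if ref.get("kind") != "Pod":
            continue
        cond = endpoint.get("conditions") or {}
        key = (ref.get("namespace", namespace), service, ref["name"])
        # Keep only the condition fields that are actually present.
        edges[key] = {k: cond[k] for k in ("ready", "serving", "terminating") if k in cond}
    return edges


def membership_changes(old: dict, new: dict) -> dict:
    """Diff two edge maps so only real changes are written."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "readiness_changed": sorted(k for k in set(old) & set(new) if old[k] != new[k]),
    }
```

Writing only the output of `membership_changes` is one way to store edge interval changes rather than rewriting identical membership on every EndpointSlice update.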
Lease And Coordination Resources
Leases are usually high-churn and low troubleshooting value.
Special handling:
- Skip by default in the PoC, or downsample aggressively when enabled.
- Never let Lease churn starve Pod, Event, Node, or EndpointSlice processing.
- Record skipped/downsampled counts by GVR for transparency.
Object Identity
Identity resolution:
`cluster_id + kind_id + metadata.uid`
Fallback:
`cluster_id + kind_id + namespace + name`
Delete/recreate with a new UID should become a new object while keeping the same human-readable key.
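The two keys can be illustrated with a short sketch. The tuple layouts and the leading `"uid"`/`"name"` tag are assumptions made for the example, not the stored schema.

```python
def object_identity(cluster_id: str, kind_id: int, obj: dict) -> tuple:
    """Primary identity uses the UID; fall back to namespace/name when it is missing."""
    meta = obj.get("metadata") or {}
    uid = meta.get("uid")
    if uid:
        return ("uid", cluster_id, kind_id, uid)
    return ("name", cluster_id, kind_id, meta.get("namespace", ""), meta["name"])


def human_key(cluster_id: str, kind_id: int, obj: dict) -> tuple:
    """Stable human-readable key shared by every incarnation of a named object."""
    meta = obj.get("metadata") or {}
    return (cluster_id, kind_id, meta.get("namespace", ""), meta["name"])
```

A Pod that is deleted and recreated under the same name gets a different `object_identity` but the same `human_key`, which lets queries follow the name across incarnations while history stays per object.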
Topology Extraction
Pod -> Node
Source: `Pod.spec.nodeName`
Edge: `pod_on_node`
Pod -> ReplicaSet
Source: `Pod.metadata.ownerReferences`
Edge: `pod_owned_by_replicaset`
ReplicaSet -> Deployment
Source: `ReplicaSet.metadata.ownerReferences`
Edge: `replicaset_owned_by_deployment`
Service -> Pod
Preferred source: `EndpointSlice.endpoints[*].targetRef`
Edges:
- `endpointslice_targets_pod`
- `service_selects_pod`
Selector-based matching is fallback only.
Pod -> ConfigMap / Secret / PVC
Sources:
- `volumes[*].configMap`
- `volumes[*].secret`
- `volumes[*].persistentVolumeClaim`
- `envFrom[*].configMapRef`
- `envFrom[*].secretRef`
- `env[*].valueFrom.configMapKeyRef`
- `env[*].valueFrom.secretKeyRef`
Fact Extraction
Pod Status
Extract:
- `status.phase`
- container restart count changes
- `state.*.reason`
- `lastState.*.reason`
- readiness condition transitions
- start time and deletion time
- QoS class
High severity:
- OOMKilled
- Evicted
- Preempted
- CrashLoopBackOff
- ImagePullBackOff
- Ready=False
Workload Config
Extract from Pod template:
- image
- env and envFrom refs
- ConfigMap and Secret refs
- CPU/memory requests and limits
- liveness/readiness/startup probes
- labels and annotations that affect selection
Write config facts only on changes.
Node Conditions
Extract:
- Ready
- MemoryPressure
- DiskPressure
- PIDPressure
- NetworkUnavailable when present
Write facts on transitions.
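Writing facts only on transitions can be sketched as a comparison of condition maps between two observations. The fact shape (`condition`, `from`, `to`) is hypothetical.

```python
NODE_CONDITIONS = (
    "Ready",
    "MemoryPressure",
    "DiskPressure",
    "PIDPressure",
    "NetworkUnavailable",
)


def condition_map(node: dict) -> dict:
    """Map condition type -> status for the tracked Node conditions."""
    conditions = (node.get("status") or {}).get("conditions") or []
    return {c["type"]: c["status"] for c in conditions if c["type"] in NODE_CONDITIONS}


def condition_transitions(prev, curr: dict) -> list:
    """Return one fact per condition whose status changed; prev may be None."""
    before = condition_map(prev) if prev else {}
    after = condition_map(curr)
    return [
        {"condition": ctype, "from": before.get(ctype), "to": status}
        for ctype, status in sorted(after.items())
        if before.get(ctype) != status
    ]
```

Only the `status` value is compared, so heartbeat updates that change `lastHeartbeatTime` alone produce no facts.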
Event Mirror
Mirror Kubernetes Events into object_facts:
- reason
- type
- involved object
- message fingerprint
- bounded message preview
- action
- reporting controller and instance
- count
- series count
- first/last timestamp
Keep enough message data to search quickly, but avoid indexing unbounded raw
messages. Use k8s_event.message_preview for triage and retained JSON for full
proof text.
Change Summary Extraction
Generate object_changes from semantic diff summaries:
- spec image/resource/probe changes
- selector changes
- owner reference changes
- scheduling and placement changes
- status condition transitions
This supports the timeline UI and fast “what changed near this time” queries.
Rebuildability
Every extractor must be deterministic enough that facts and edges can be rebuilt from stored versions.
The extractor version should be recorded with its output so that a change to an extractor can trigger a selective rebuild.
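One way to drive selective rebuilds is to compare the version recorded with each extractor's stored output against the version currently deployed. The registry, the extractor names, and the version numbers below are all hypothetical.

```python
# Hypothetical registry: extractor name -> current version.
EXTRACTORS = {"pod_status": 3, "node_conditions": 1}


def stale_extractors(recorded: dict) -> list:
    """Names of extractors whose stored output was produced by another version."""
    return sorted(name for name, version in EXTRACTORS.items() if recorded.get(name) != version)


def rebuild_plan(recorded: dict, stored_versions: list) -> list:
    """Pair each stale extractor with every stored version it must re-read."""
    return [(name, version_id) for name in stale_extractors(recorded) for version_id in stored_versions]
```

Because extractors are deterministic over stored versions, replaying only the stale ones reproduces their facts and edges without touching output from extractors that did not change.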