The Oracle corpus is the knowledge base behind Rancher Oracle. It indexes 48,000+ real issue resolutions from 13 open-source repositories across the Rancher and Kubernetes ecosystem.

Indexed repositories

Repository                          Domain
rancher/rancher                     Rancher Manager
k3s-io/k3s                          K3s lightweight Kubernetes
rancher/rke2                        RKE2 Kubernetes
harvester/harvester                 Harvester HCI
longhorn/longhorn                   Longhorn distributed storage
neuvector/neuvector                 NeuVector container security
rancher/fleet                       Fleet GitOps
rancher/system-upgrade-controller   Automated upgrades
rancher/local-path-provisioner      Local path storage
rancher/webhook                     Rancher admission webhooks
rancher/charts                      Helm charts
rancher/kontainer-driver-metadata   Kubernetes version metadata
rancher/norman                      Rancher API framework

What’s indexed

The corpus includes:
  • Issue resolutions — closed issues with confirmed fixes
  • Pull requests — merged PRs with linked issues
  • Discussions — community Q&A with accepted answers
  • Release notes — version-specific changes and known issues

How indexing works

1. Ingestion
   Issues, PRs, discussions, and release notes are collected from each repository via the GitHub API.

2. Chunking
   Documents are split into semantically meaningful chunks, preserving context around error messages, stack traces, and configuration snippets.

3. Embedding
   Each chunk is embedded using a transformer model that captures the semantic meaning of Kubernetes errors, configuration patterns, and troubleshooting procedures.

4. Storage
   Embeddings are stored in a vector database optimized for high-recall semantic search.

5. Retrieval
   On query, semantic search retrieves the top-k matching chunks, which are passed as context to the LLM for grounded response generation.
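
The chunking, embedding, storage, and retrieval steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Oracle's actual implementation. The names chunk_text, embed, and InMemoryVectorStore are hypothetical, and the embedding is a toy hashed bag-of-words vector standing in for the transformer model, so it only captures token overlap rather than meaning. Ingestion is omitted, and the sample document is made up.

```python
import hashlib
import math
import re
from dataclasses import dataclass, field


def chunk_text(text: str, max_chars: int = 400) -> list[str]:
    """Pack blank-line-separated blocks into chunks of at most max_chars.

    Blocks are never split, so a stack trace or config snippet stays intact.
    A single block longer than max_chars becomes its own oversized chunk.
    """
    chunks, current = [], ""
    for block in re.split(r"\n\s*\n", text.strip()):
        block = block.strip()
        if not block:
            continue
        candidate = f"{current}\n\n{block}" if current else block
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = block
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def embed(text: str, dim: int = 256) -> list[float]:
    """Toy hashed bag-of-words embedding, L2-normalized.

    ASSUMPTION: stand-in for the transformer model; no semantics captured.
    """
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


@dataclass
class InMemoryVectorStore:
    """Hypothetical stand-in for the vector database."""

    items: list[tuple[str, list[float]]] = field(default_factory=list)

    def add(self, chunk: str) -> None:
        self.items.append((chunk, embed(chunk)))

    def top_k(self, query: str, k: int = 3) -> list[tuple[float, str]]:
        """Return up to k (cosine similarity, chunk) pairs, best first."""
        q = embed(query)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), chunk)
            for chunk, vec in self.items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


if __name__ == "__main__":
    # Made-up sample document, for illustration only.
    doc = (
        "Symptom: pods stay in ContainerCreating with "
        "'failed to create pod sandbox'.\n\n"
        "Resolution: the node was under disk pressure; freeing space fixed it."
    )
    store = InMemoryVectorStore()
    for chunk in chunk_text(doc, max_chars=80):
        store.add(chunk)
    for score, chunk in store.top_k("failed to create pod sandbox", k=1):
        print(f"{score:.2f}  {chunk}")
```

In a real pipeline the retrieved chunks would be placed into the LLM prompt as context. Swapping embed for a transformer model is what would let a query match resolutions phrased in different terms.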

Why embeddings matter

Traditional keyword search often fails for Kubernetes troubleshooting. An error like "failed to create pod sandbox" has dozens of root causes, including CNI misconfiguration, disk pressure, and container runtime issues. Semantic embeddings capture the meaning behind error messages and stack traces, not just their keywords, so Oracle can match a user’s error against resolutions that describe the same underlying problem in different terms.

Corpus updates

The corpus is updated on a recurring schedule as new issues are resolved across the indexed repositories: new resolutions, PRs, and discussions are ingested, chunked, embedded, and added to the vector database.

Enterprise customers can request custom corpus additions, such as internal runbooks, private repositories, or domain-specific documentation, which are indexed alongside the public corpus.
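
A recurring update job usually needs to avoid re-embedding chunks that are already in the index. The source does not describe how Oracle handles this, so the following is only one possible approach: key each chunk by a hash of its content. The names content_key and incremental_update are hypothetical, and a plain dict stands in for the vector database.

```python
import hashlib


def content_key(text: str) -> str:
    """Stable identifier: SHA-256 of the whitespace-normalized chunk text."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def incremental_update(index: dict[str, str], incoming: list[str]) -> list[str]:
    """Add only unseen chunks to index and return the chunks that were added.

    ASSUMPTION: index is a dict standing in for the vector database; a real
    job would embed each newly added chunk before storing it.
    """
    added = []
    for chunk in incoming:
        key = content_key(chunk)
        if key not in index:
            index[key] = chunk
            added.append(chunk)
    return added
```

Each scheduled run can then pass in everything it collected, and only the chunks returned by incremental_update need to be embedded. The same path would work for custom additions such as internal runbooks.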