Context Graphs: The Future of Agentic IT Operations
Knowledge graphs map what connects. Context graphs add memory by capturing decision traces, guardrails, exceptions, and outcomes so agentic IT operations can learn and scale safely.
7 minutes
24th of June, 2026
About the Author: Vinod Kumar is Head of IT & Engineering Services for Akkodis North America. He works with enterprise leaders to modernize IT operations and engineering environments through AI, automation, and next-generation delivery models.
In many IT operations teams, knowledge graphs are already improving day-to-day work. Agents can triage faster, traverse dependencies with more confidence, and pull the right runbooks without jumping across tools, which often translates into better resolution times even in complex environments.
The friction shows up after the incident closes, because the environment rarely retains what it just learned in a way the next agent can reuse. Similar incidents return, while the best resolution path remains trapped in ticket comments, approval threads, and one-off overrides that never become part of operational memory.
This is where context graphs come in. Knowledge graphs describe what exists and how it connects, while context graphs preserve how decisions unfolded and what worked under which conditions, so the context an AI system reasons over deepens through use rather than resetting after each incident. Foundation Capital makes a strong case for this direction by arguing that decision traces and the “why” behind actions become the durable asset that unlocks the next wave of enterprise AI.
What Knowledge Graphs Still Cannot Do
Knowledge graphs make IT operations more navigable, and that’s a meaningful step for agentic IT operations and modern AIOps. The remaining gap is easier to describe than it is to solve.
Knowledge graphs tend to capture what exists and how it connects, which is why they work so well for dependency awareness, blast radius estimation, ownership routing, and runbook retrieval. What they do not naturally capture is how decisions unfolded during a real incident, including the conditions, constraints, tradeoffs, and outcomes that made an action safe or unsafe.
Three practical gaps show up in enterprise environments.
Resolutions leave no reusable trace. An agent resolves an incident and moves on, while the checks it performed, the guardrails it evaluated, and the approvals it triggered often stay buried in logs or ticket comments. The map remains mostly the same, so the next similar incident still feels new.
Exceptions stay invisible. ITOps runs on exceptions, including freeze windows, vendor quirks, temporary overrides, and escalation paths that differ by service tier. Those patterns exist in practice, although they rarely exist in the knowledge layer in a structured way.
Automation cannot calibrate itself. Guardrails and playbooks only improve when evidence accumulates, and evidence requires a consistent record of what was tried, under which conditions, and with what outcome. Without that record, governance evolves through debate and periodic reviews, rather than through learning from real operations.
These gaps do not mean knowledge graphs fall short. They simply show why a map alone doesn’t create compounding intelligence.
What Context Graphs Capture That Knowledge Graphs Miss
A knowledge graph resembles a city map that shows roads, intersections, neighborhoods, and how places connect. A context graph resembles the log of every trip taken through that city, including the route chosen, the traffic conditions at the time, the detours that worked, and whether the destination was reached without trouble.
That shift turns a static reference into a living record of how the environment actually behaves.
A context graph captures several forms of information that matter deeply in IT operations and autonomous IT operations.
Decision Traces That Preserve How Work Happened
A decision trace records the sequence of observations, checks, and actions that led to resolution. It goes beyond “what was done” and also captures what was ruled out, which dependencies were validated, and which signals were treated as decisive.
In practice, a decision trace might include the alerts present, the correlated change events, the dependency traversal performed, the runbook steps executed, the approvals requested, and the point at which human involvement became necessary.
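To make the list above concrete, here is a minimal sketch of what a decision trace record could look like. The class name, field names, and sample values are all hypothetical and are meant only to show the shape of the data, not a standard schema.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class DecisionTrace:
    """One incident's decision record. Every field name here is illustrative."""
    incident_id: str
    alerts: List[str] = field(default_factory=list)
    correlated_changes: List[str] = field(default_factory=list)
    dependencies_checked: List[str] = field(default_factory=list)
    hypotheses_ruled_out: List[str] = field(default_factory=list)
    runbook_steps: List[str] = field(default_factory=list)
    approvals_requested: List[str] = field(default_factory=list)
    # Index of the runbook step at which a human took over, if any.
    human_handoff_step: Optional[int] = None


# Hypothetical incident used only to show how a trace is populated.
trace = DecisionTrace(
    incident_id="INC-0001",
    alerts=["api-latency-high"],
    correlated_changes=["deploy-checkout-v42"],
    dependencies_checked=["checkout-db", "payment-gateway"],
    hypotheses_ruled_out=["database saturation"],
    runbook_steps=["roll back deploy", "verify latency"],
    approvals_requested=["change-manager"],
)

# A plain dict is easy to serialize into whatever store holds the graph.
record = asdict(trace)
```

Note that the record keeps what was ruled out alongside what was done, since that is the part ticket comments usually lose.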
Conditions and Guardrails That Were in Effect
In agentic IT operations, safe action depends on state. A production freeze window changes what is allowed. A rising customer impact score changes escalation thresholds. A blast radius crossing a boundary changes whether an action can proceed without approval.
When those conditions remain buried as metadata or implied context, the system cannot learn from them. When those conditions become queryable history, governance becomes easier to operationalize and easier to audit.
The Google SRE guidance on toil and automation aligns with this view, since it emphasizes safeguards and the need for automation to default back to humans when conditions are unsafe.
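One way to picture state-dependent guardrails is a small evaluation function that returns both a verdict and the state it was evaluated against, so the conditions stay queryable afterward. The thresholds, field names, and the rule that a freeze window blocks outright are assumptions for illustration, not a recommended policy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Verdict(Enum):
    PROCEED = "proceed"
    NEEDS_APPROVAL = "needs_approval"
    BLOCKED = "blocked"


@dataclass(frozen=True)
class OperatingState:
    in_freeze_window: bool
    customer_impact_score: float  # hypothetical 0.0 to 1.0 scale
    blast_radius: int             # number of downstream services affected


@dataclass(frozen=True)
class GuardrailDecision:
    verdict: Verdict
    reasons: Tuple[str, ...]
    state: OperatingState  # kept with the decision so it can be queried later


def evaluate_guardrails(state, max_blast_radius=3, impact_escalation=0.7):
    """Return a verdict plus the reasons behind it (thresholds are assumed)."""
    reasons = []
    if state.in_freeze_window:
        reasons.append("freeze_window")
    if state.blast_radius > max_blast_radius:
        reasons.append("blast_radius_exceeded")
    if state.customer_impact_score >= impact_escalation:
        reasons.append("high_customer_impact")

    if "freeze_window" in reasons:
        verdict = Verdict.BLOCKED          # assumed policy: freezes block outright
    elif reasons:
        verdict = Verdict.NEEDS_APPROVAL   # default back to a human
    else:
        verdict = Verdict.PROCEED
    return GuardrailDecision(verdict, tuple(reasons), state)
```

Returning the state with the verdict is the design choice that matters here: the decision and the conditions it was made under travel together.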
Outcomes Over Time That Close the Learning Loop
A resolution rarely ends the moment the alert clears, because stability plays out across hours and days. A context graph can retain whether the fix held, whether the incident recurred, and how long stability lasted, which turns incident response into a feedback loop rather than a sequence of isolated events.
That outcome layer matters for AIOps and autonomous IT operations because it links action to impact in a way that supports learning and calibration.
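As a rough sketch of that outcome layer, the function below summarizes whether a fix held by looking at later incidents with the same signature. The function name, the seven-day hold window, and the output keys are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def stability_outcome(resolved_at, later_incidents, now,
                      hold_window=timedelta(days=7)):
    """Summarize how long a fix stayed stable.

    later_incidents: timestamps of incidents sharing the same signature.
    hold_window: assumed period a fix must survive to count as having held.
    """
    recurrences = sorted(t for t in later_incidents if t > resolved_at)
    first_recurrence = recurrences[0] if recurrences else None
    # Stability runs until the first recurrence, or until now if none occurred.
    stable_for = (first_recurrence or now) - resolved_at
    return {
        "recurred": first_recurrence is not None,
        "stable_for": stable_for,
        "held_for_window": stable_for >= hold_window,
    }
```

Attaching a summary like this to the original decision trace is what links the action to its impact.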
Exceptions That Reveal Real Automation Boundaries
In high-performing ITOps teams, overrides and exceptions often carry useful signals, since they reveal where automation boundaries sit in reality. When those overrides become structured precedents, future agents can reason over what was overridden, who approved it, under what conditions, and how the outcome played out.
Foundation Capital frames context graphs as a shift toward decision traces becoming a core enterprise asset, and that framing fits the practical reality of IT operations.
What Agentic ITOps Looks Like When the Map Has Memory
A mature context graph changes the rhythm of IT operations in ways that feel operational rather than theoretical. Recurring incidents stop behaving like new problems because the system can match early signals to prior decision traces, including the conditions that made past resolutions safe and effective, which allows triage to begin with grounded context rather than a blank slate.
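Matching early signals to prior decision traces can be sketched with something as simple as set overlap. The example below uses Jaccard similarity as a stand-in technique; a production system would likely use richer matching, and the trace layout and the 0.5 cutoff are assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of signal names."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_prior_traces(signals, prior_traces, min_score=0.5):
    """Return (score, incident_id) pairs for similar past traces, best first.

    prior_traces: list of dicts with 'incident_id' and 'signals' keys
    (a hypothetical layout for this sketch).
    """
    scored = [(jaccard(signals, t["signals"]), t["incident_id"])
              for t in prior_traces]
    return sorted((pair for pair in scored if pair[0] >= min_score),
                  reverse=True)


# Hypothetical history and a new incident's early signals.
priors = [
    {"incident_id": "INC-1", "signals": ["cpu-high", "db-latency"]},
    {"incident_id": "INC-2", "signals": ["disk-full"]},
]
matches = match_prior_traces(["cpu-high", "db-latency", "deploy-recent"], priors)
```

Here only INC-1 clears the cutoff, so triage would start from that trace and the conditions recorded with it.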
Governance also becomes easier to audit because actions come with traceability by default. Each agentic action can be tied to what was observed, what was decided, what policy was applied, and what outcome followed, which turns post-incident analysis and compliance review into structured queries rather than a reconstruction exercise from ticket comments and chat logs. Risk and accountability frameworks increasingly emphasize governance and transparency, and the NIST AI Risk Management Framework provides a useful reference point for how organizations think about these controls at scale.
Institutional knowledge becomes more durable as well. Patterns that senior engineers carry through experience, including service quirks, non-standard vendor handling, and escalation paths that work in practice, can move from informal memory into structured context that survives turnover.
Guardrails can also evolve through evidence instead of assumptions. Over time, context data surfaces which thresholds get overridden safely and repeatedly, and which overrides correlate with escalations, so policy owners can tune governance using operational history rather than infrequent reviews and guesswork.
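A minimal sketch of that evidence, assuming each override is stored with the guardrail it bypassed and whether an escalation followed, is a per-guardrail summary like the one below. The key names are hypothetical.

```python
from collections import defaultdict


def override_summary(overrides):
    """Count overrides and follow-on escalations per guardrail.

    overrides: iterable of dicts with 'guardrail' (str) and 'escalated' (bool).
    """
    counts = defaultdict(lambda: {"overrides": 0, "escalations": 0})
    for o in overrides:
        c = counts[o["guardrail"]]
        c["overrides"] += 1
        c["escalations"] += int(o["escalated"])
    return {
        g: {**c, "escalation_rate": c["escalations"] / c["overrides"]}
        for g, c in counts.items()
    }


# Hypothetical history: one guardrail is overridden safely, the other is not.
history = [
    {"guardrail": "blast_radius", "escalated": False},
    {"guardrail": "blast_radius", "escalated": False},
    {"guardrail": "freeze_window", "escalated": True},
    {"guardrail": "freeze_window", "escalated": False},
]
summary = override_summary(history)
```

A guardrail that is overridden often with a low escalation rate is a candidate for loosening, while a high rate suggests the threshold is doing its job.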
Most IT organizations are not operating at this level yet, and that gap reflects where many teams still are in the foundational work of consolidating data and building connected knowledge layers.
A clear destination still matters because the teams that build toward compounding learning tend to create an operational advantage that grows with every incident they resolve.
Why the Decisions You Make Now Shape the Context Graph Later
Context graphs rarely succeed as a separate initiative, because they tend to emerge as an extension of knowledge graph infrastructure. The cost and complexity of that extension often depend on small decisions that compound over time, especially when the organization instruments agent actions in a way that supports learning.
Three architecture choices tend to matter most:
Logging Agent Actions With Enough State to Be Useful
Action logs become far more valuable when they capture state, not only activity. The record needs to include what the agent observed, which checks were performed, what policy was applied, and what uncertainty existed, since those elements form the raw material of decision traces.
When that discipline is missing, thousands of incidents can flow through an environment without producing reusable learning.
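The difference between logging activity and logging state can be shown with a small structured log entry. This is a sketch under assumptions: the function name, field names, and the use of a single confidence number to represent uncertainty are illustrative choices.

```python
import json
from datetime import datetime, timezone


def action_log_entry(action, observed, checks, policy, confidence, now=None):
    """Serialize one agent action together with the state around it."""
    now = now or datetime.now(timezone.utc)
    return json.dumps(
        {
            "timestamp": now.isoformat(),
            "action": action,
            "observed": observed,        # what the agent saw
            "checks": checks,            # which checks ran and how they ended
            "policy_applied": policy,    # which rule allowed the action
            "confidence": confidence,    # assumed proxy for uncertainty
        },
        sort_keys=True,
    )


# Hypothetical action taken by an agent.
entry = action_log_entry(
    action="restart_service",
    observed={"alert": "api-latency-high", "error_rate": 0.12},
    checks={"dependency_health": "pass", "freeze_window": "clear"},
    policy="auto-remediate-tier-2",
    confidence=0.8,
)
```

An activity-only log would keep just the timestamp and the action; the other four fields are what make the entry usable as part of a decision trace.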
Capturing Exceptions as Data Instead of Letting Them Disappear
IT operations run on exceptions, and those exceptions often describe the true boundary between automation and human judgment. Workflows that capture overrides as structured signals, including who approved them, under what conditions, and with what outcomes, create a boundary map that improves governance and future decision-making.
Workflows that allow exceptions to evaporate into chat and ticket text lose intelligence that can rarely be recovered later.
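Capturing an override as data can be as light as a small immutable record created at approval time and completed once the outcome is known. The class and field names below are hypothetical.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class OverrideRecord:
    """A structured override, captured when it is approved."""
    incident_id: str
    guardrail: str                 # which guardrail was overridden
    approved_by: str               # who approved it
    conditions: Dict[str, str] = field(default_factory=dict)
    outcome: Optional[str] = None  # unknown until the incident plays out

    def with_outcome(self, outcome):
        """Return a copy with the outcome filled in; the original is unchanged."""
        return replace(self, outcome=outcome)


# Hypothetical override approved during a freeze window.
override = OverrideRecord(
    incident_id="INC-0001",
    guardrail="freeze_window",
    approved_by="on-call-lead",
    conditions={"service_tier": "1", "customer_impact": "high"},
)
closed = override.with_outcome("stable")
```

Keeping the record immutable means the approval-time facts cannot be edited later, which helps when the record is used as a precedent.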
Designing Resolution Paths to Be Queryable
Context data only compounds when it can be retrieved and reasoned over. That requirement pushes teams toward schema design for decision traces, and it rewards teams that define the shape of a decision record before incidents generate thousands of inconsistent versions.
This is where small design choices become expensive later. When decision traces are retrofitted after the fact, teams often discover that they captured the wrong metadata, missed key conditions, or failed to link actions to outcomes in a consistent way.
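To illustrate what "queryable" means in practice, the sketch below defines a tiny schema up front and links each decision trace to its outcome, so a question like "which actions on this service tend to recur?" becomes a single query. The table names, column names, and sample rows are assumptions, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE decision_trace (
    trace_id      INTEGER PRIMARY KEY,
    incident_id   TEXT NOT NULL,
    service       TEXT NOT NULL,
    action        TEXT NOT NULL,
    freeze_window INTEGER NOT NULL   -- condition in effect (0 or 1)
);
CREATE TABLE outcome (
    trace_id INTEGER NOT NULL REFERENCES decision_trace(trace_id),
    recurred INTEGER NOT NULL        -- 1 if the incident came back
);
""")

# Hypothetical history.
conn.executemany(
    "INSERT INTO decision_trace VALUES (?, ?, ?, ?, ?)",
    [
        (1, "INC-1", "checkout", "restart_pod", 0),
        (2, "INC-2", "checkout", "restart_pod", 0),
        (3, "INC-3", "checkout", "rollback", 1),
        (4, "INC-4", "search", "restart_pod", 0),
    ],
)
conn.executemany(
    "INSERT INTO outcome VALUES (?, ?)",
    [(1, 1), (2, 0), (3, 0), (4, 0)],
)

# How often was each action used on one service, and how often did it recur?
rows = conn.execute(
    """
    SELECT d.action, COUNT(*) AS uses, SUM(o.recurred) AS recurrences
    FROM decision_trace d
    JOIN outcome o ON o.trace_id = d.trace_id
    WHERE d.service = ?
    GROUP BY d.action
    ORDER BY d.action
    """,
    ("checkout",),
).fetchall()
```

The join only works because every outcome carries the trace it belongs to, which is exactly the link that is hard to retrofit later.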
Building IT Operations That Learn Over Time
ITOps teams are often not held back by a lack of tools or effort. Setbacks show up in similar places during every incident, since people still have to assemble context, validate risk, and translate scattered knowledge into a safe decision before any meaningful action can happen. That is why speed gains often stay local, even when automation and AIOps mature.
Knowledge graphs change the pace by making the environment navigable, so agents can understand dependencies, ownership, and the right operational workflows without the usual hunt across systems.
Context graphs take the next step by preserving decision traces, conditions, exceptions, and outcomes in a form future agents can reuse, which is what turns incident response into compounding intelligence instead of repeated reinvention.
For a deeper foundation behind the decision-chain bottleneck, check out our first blog post in this series.
If you want the practical bridge on how knowledge graphs create the map agents need, read the second blog post in this series.
The overarching lesson from these three articles is simple. Autonomous IT operations become realistic when the architecture helps agents learn from real operations, not when models get marginally better. That direction starts with foundational work, including clean knowledge layers, instrumented actions, and governance that can be applied consistently, because each layer built well becomes a base that compounds.
To discover more about where your environment sits today and what the next step could look like, connect with our team.