Modern SAP AMS: outcome-driven operations with responsible agentic support
The interface backlog is growing again. Billing is blocked for a subset of orders, the batch chain is half-red, and a “small” change request is waiting because nobody wants to risk another regression during a release freeze. Meanwhile, incidents are closed on time. Green SLAs. And the same defect returns after every transport/import cycle because the real rule sits in someone’s head, not in a runbook or a test.
That’s L2–L4 AMS reality: complex incidents, change requests, problem management, process improvements, and small-to-medium new developments. Ticket closure is necessary, but it’s not the outcome the business pays for.
Why this matters now
Classic AMS reporting can hide four expensive patterns:
- Repeat incidents: the same IDoc/interface family fails, gets restarted, and “resolved” without removing the cause.
- Manual work that never becomes productized: one-off data corrections, recurring monitoring checks, manual reconciliations.
- Knowledge loss: a messy handover, undocumented exceptions, and “tribal” fixes that don’t survive staff rotation.
- Cost drift: run cost increases quietly because every change adds support burden, but nobody measures the run cost delta.
Modern AMS (as I use the term) means day-to-day operations that optimize for repeat reduction, safer change delivery, learning loops, and predictable run costs, not just SLA closure. Agentic / AI-assisted work helps when it reduces friction in evidence gathering, option drafting, and documentation. It should not be used to bypass approvals, change governance, or security controls.
The mental model
Traditional AMS optimizes for throughput: how many tickets closed, how fast, and whether SLAs are met.
Modern AMS optimizes for decision quality: fewer repeats, clearer ownership, and changes that come with verification and rollback. The source record calls this a small decision support factory: consistent templates, comparable options, transparent assumptions, and post-decision learning that compounds trust.
Rules of thumb I use:
- If you can’t explain risk and reversibility in plain words, you’re not ready to change production.
- If uncertainty is high, widen the estimate band and list missing facts (from the source: “bands, not false precision”).
What changes in practice
From incident closure → to root-cause removal
- Mechanism: link recurring incident families to a problem record; define the “done” state as repeat reduction, not workaround.
- Signal: repeat rate and reopen rate trend down for the top demand drivers.
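A minimal sketch of that signal, assuming a simple ticket export with hypothetical field names (family, period, reopened); the point is trending per incident family, not the exact schema:

```python
from collections import defaultdict

# Hypothetical ticket records; field names are illustrative, not a tool's schema.
tickets = [
    {"family": "IDOC_ORDERS_IN", "period": "2024-05", "reopened": False},
    {"family": "IDOC_ORDERS_IN", "period": "2024-06", "reopened": True},
    {"family": "BATCH_CHAIN_BILLING", "period": "2024-06", "reopened": False},
    {"family": "IDOC_ORDERS_IN", "period": "2024-06", "reopened": False},
]

def family_trends(records):
    """Count incidents and reopen share per (family, period)."""
    counts = defaultdict(int)
    reopens = defaultdict(int)
    for r in records:
        key = (r["family"], r["period"])
        counts[key] += 1
        reopens[key] += int(r["reopened"])
    return {
        key: {"incidents": counts[key], "reopen_rate": reopens[key] / counts[key]}
        for key in counts
    }

for key, stats in sorted(family_trends(tickets).items()):
    print(key, stats)
```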
From ad-hoc estimates → to decision packs
- Use three pack types from the source: quick pack (same day), standard pack (3–10 working days), investment pack (2–6 weeks).
- Always include “do nothing” as baseline, declare assumptions, and propose verification signals up front.
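A sketch of what a pack could look like as structured data. The pack types and the mandatory baseline come from the source; the field names and the validation rules here are assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class Option:
    name: str                      # "Do nothing" is the mandatory baseline
    impact_band: str               # bands, not point estimates
    risk_band: str
    effort_band: str
    failure_modes: list[str] = field(default_factory=list)

@dataclass
class DecisionPack:
    pack_type: str                 # "quick" | "standard" | "investment"
    question: str
    options: list[Option]
    assumptions: list[str]
    verification_signals: list[str]
    missing_facts: list[str] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Flag packs that skip the baseline or ship without verification."""
        issues = []
        if not any(o.name.lower().startswith("do nothing") for o in self.options):
            issues.append("missing 'do nothing' baseline option")
        if not self.assumptions:
            issues.append("no declared assumptions")
        if not self.verification_signals:
            issues.append("no verification signals defined up front")
        return issues

pack = DecisionPack(
    pack_type="standard",
    question="Recurring inbound order IDoc failures: fix mapping or keep reprocessing?",
    options=[Option("Do nothing", "low", "medium", "none"),
             Option("Fix mapping + add alert", "medium", "low", "small",
                    failure_modes=["mapping change breaks another partner"])],
    assumptions=["volume stays under 5k IDocs/day"],
    verification_signals=["repeat count for this family over 4 weeks"],
)
print(pack.validate())   # an empty list means the pack meets the minimum bar
```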
From tribal knowledge → to versioned knowledge atoms
- Mechanism: every L3/L4 fix produces a small KB update: symptom, evidence, safe checks, rollback notes, and owner.
- Signal: stakeholders re-use templates; fewer “quick asks” because the process is trusted (source metric).
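One way such an atom could be modeled, with hypothetical fields matching the list above; versioning by returning a new record instead of overwriting is a design choice, not a requirement of the source:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class KnowledgeAtom:
    """One small, versioned knowledge unit produced by an L3/L4 fix."""
    symptom: str
    evidence: list[str]            # links to logs, monitoring screenshots, job names
    safe_checks: list[str]         # read-only checks anyone on L2 can run
    rollback_notes: str
    owner: str
    version: int = 1
    updated: date = field(default_factory=date.today)

    def bump(self, change_note: str) -> "KnowledgeAtom":
        """Return a new version instead of silently overwriting the old one."""
        return KnowledgeAtom(
            symptom=self.symptom,
            evidence=self.evidence + [f"v{self.version + 1}: {change_note}"],
            safe_checks=self.safe_checks,
            rollback_notes=self.rollback_notes,
            owner=self.owner,
            version=self.version + 1,
        )

atom = KnowledgeAtom(
    symptom="Inbound order IDocs stuck in error status after partner change",
    evidence=["link to IDoc monitor screenshot", "link to job log"],
    safe_checks=["filter the IDoc monitor on the affected message type (read-only)"],
    rollback_notes="Restore previous mapping version; reprocess failed IDocs",
    owner="integration team",
)
atom_v2 = atom.bump("added check for missing plant in the order segment")
```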
From manual triage → to AI-assisted triage with guardrails
- Mechanism: AI drafts classification, suggests comparable past incidents/decisions, and lists missing facts. Humans confirm.
- Signal: reduced manual touch time in triage; MTTR trend improves without higher change failure rate.
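A sketch of the hand-off, with hypothetical names: the AI output is a proposal object and a named human confirms or corrects it; nothing is routed automatically:

```python
from dataclasses import dataclass

@dataclass
class TriageDraft:
    suggested_family: str
    impacted_flow: str
    comparable_incidents: list[str]   # ticket IDs the retrieval step surfaced
    missing_facts: list[str]          # kept visible, never silently dropped
    confidence: float                 # the model's own estimate, not ground truth

def confirm_triage(draft: TriageDraft, reviewer: str) -> dict:
    """A human confirms or corrects the draft before anything is routed."""
    # In a real flow the reviewer edits the fields here; the draft is only a proposal.
    return {
        "family": draft.suggested_family,
        "flow": draft.impacted_flow,
        "confirmed_by": reviewer,
        "needs_more_facts": bool(draft.missing_facts),
    }

draft = TriageDraft(
    suggested_family="IDOC_ORDERS_IN",
    impacted_flow="order-to-cash billing",
    comparable_incidents=["INC-1041", "INC-0988"],
    missing_facts=["which partner profile changed last week?"],
    confidence=0.72,
)
print(confirm_triage(draft, reviewer="l3.integration.lead"))
```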
From reactive firefighting → to risk-based prevention
- Mechanism: prioritize work using the source scoring inputs: business flow criticality (SLO impact), cost of delay, repeat load reduction potential, change-induced risk, strategic fit.
- Signal: top demand drivers shrink unless risk is extreme (source decision rule).
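An illustrative scoring helper: the five inputs come from the source, but the weights and the 0-10 scale are assumptions a team would calibrate for itself:

```python
# Illustrative weights; the actual weighting is a team decision, not a given.
WEIGHTS = {
    "flow_criticality": 0.30,      # SLO impact of the affected business flow
    "cost_of_delay": 0.25,
    "repeat_load_reduction": 0.20,
    "change_induced_risk": 0.15,   # scored inversely: riskier changes score lower
    "strategic_fit": 0.10,
}

def prevention_score(inputs: dict[str, float]) -> float:
    """Combine 0-10 scores per input into a single ranking value."""
    score = 0.0
    for name, weight in WEIGHTS.items():
        value = inputs[name]
        if name == "change_induced_risk":
            value = 10 - value     # high risk pushes the item down the list
        score += weight * value
    return round(score, 2)

print(prevention_score({
    "flow_criticality": 9, "cost_of_delay": 7,
    "repeat_load_reduction": 8, "change_induced_risk": 4, "strategic_fit": 5,
}))   # 7.45
```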
From “one vendor” thinking → to explicit decision rights
- Mechanism: each option declares which vendor surface is touched; dependency time separated from execution time (source multi-vendor integrity rules).
- Signal: fewer stalled changes due to hidden dependencies.
From “delivered” → to measured outcomes
- Mechanism: record the decision pack, chosen option, expected bands, and measure actual outcome vs expected; publish a short “what we learned” note (source loop).
- Signal: outcome accuracy improves over time; post-decision regret rate drops.
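A minimal sketch of the expected-versus-actual check, assuming outcomes are committed as numeric bands; the field names are illustrative:

```python
def outcome_review(expected_band: tuple[float, float], actual: float) -> dict:
    """Compare the committed band with the measured outcome and flag misses."""
    low, high = expected_band
    within = low <= actual <= high
    return {
        "within_band": within,
        "miss_direction": None if within else ("under" if actual < low else "over"),
        "learning_note_required": not within,   # a short note, not a blame exercise
    }

# e.g. the pack expected a 20-35% repeat reduction and we measured 12%
print(outcome_review((20.0, 35.0), 12.0))
```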
Agentic / AI pattern (without magic)
“Agentic” here means: a workflow where the system can plan steps, retrieve context, draft actions, and execute only pre-approved safe tasks under human control.
One realistic end-to-end flow for L2–L4 (generalized, because the source does not name specific tools):
Inputs
- Incident/change request text, monitoring alerts, logs, interface/IDoc error payloads, batch chain status, prior transports/import notes, runbooks, known errors, and past decision packs.
Steps
- Classify: suggest incident family, impacted business flow, likely owners.
- Retrieve context (RAG): pull relevant KB atoms, similar past decisions/outcomes (source “suggest comparable past decisions”).
- Propose options: draft 2–3 options including “do nothing”, with impact/risk/effort/coordination bands and failure modes (source estimation language).
- Request approvals: route to the right decision owner (technical lead, security, business) with a stakeholder-specific summary (source: summaries for business/CIO/CFO).
- Execute safe tasks (only if pre-approved): e.g., gather diagnostics, run read-only checks, prepare a draft change plan, generate test evidence checklist.
- Document: update the ticket, attach assumptions checklist, create an outcome tracking entry (source automation outputs).
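A skeletal version of that flow, with toy step implementations standing in for the retrieval index, the model, and the ITSM tool (none of which the source names); the shape to notice is the explicit human approval gate before any task runs:

```python
SAFE_TASKS = ("gather_diagnostics", "run_read_only_checks", "draft_change_plan")

def classify(ticket):
    # Toy heuristic standing in for the model's classification suggestion.
    family = "IDOC" if "idoc" in ticket["text"].lower() else "UNCLASSIFIED"
    return {"family": family, "missing_facts": ["exact message type?"]}

def retrieve_context(ticket, kb):
    # Pull KB atoms and past decision packs for the suggested family.
    family = classify(ticket)["family"]
    return [atom for atom in kb if atom["family"] == family]

def propose_options(context):
    # Always includes the "do nothing" baseline; bands omitted for brevity.
    return [{"name": "Do nothing", "risk": "low"},
            {"name": "Fix mapping and add an alert", "risk": "medium"}]

def handle_request(ticket, kb, approve):
    draft = classify(ticket)
    context = retrieve_context(ticket, kb)
    options = propose_options(context)
    audit = [("draft", draft), ("context", context), ("options", options)]
    executed = []
    if approve({"draft": draft, "options": options}):   # explicit human gate
        # Only pre-approved, non-production-changing tasks run from here.
        executed = list(SAFE_TASKS)
    audit.append(("executed", executed))
    return {"options": options, "executed": executed, "audit": audit}

kb = [{"family": "IDOC", "symptom": "inbound orders stuck in error status"}]
result = handle_request({"text": "IDoc errors on inbound orders"},
                        kb, approve=lambda summary: True)   # human says yes
print(result["executed"])
```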
Guardrails
- Least privilege access; read-only by default.
- Separation of duties: the system can draft, humans approve; production changes require explicit approval.
- Audit trail for every retrieved source, suggestion, and executed task.
- Rollback discipline: every change option must include reversibility and rollback steps (source: “risk controls + rollback” in investment packs).
- Privacy: redact personal data in logs; restrict retrieval scope to approved knowledge bases.
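A sketch of the execution guardrail, assuming a hypothetical allowlist of read-only tasks; the key behaviors are refusal by default and an audit entry for every attempt, including refusals:

```python
import datetime

# Pre-approved, read-only tasks; anything else needs a human to run it.
ALLOWLIST = {"collect_job_log", "check_idoc_status", "export_monitoring_snapshot"}

AUDIT_LOG = []

def execute(task: str, requested_by: str) -> str:
    entry = {
        "task": task,
        "requested_by": requested_by,
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }
    if task not in ALLOWLIST:
        entry["result"] = "refused: not pre-approved, route to a human"
    else:
        entry["result"] = "executed (read-only)"
    AUDIT_LOG.append(entry)          # every attempt is logged, including refusals
    return entry["result"]

print(execute("check_idoc_status", requested_by="agent"))
print(execute("restart_batch_chain", requested_by="agent"))   # refused
```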
What stays human-owned: approving production changes, data corrections with audit impact, authorization/security decisions, and business sign-off on process changes. Honestly, this will slow you down at first because you will surface missing facts and unclear ownership that were previously hidden by heroics.
A limitation: AI summaries can sound confident even when evidence is thin, so you must force citations to logs/runbooks/KB and keep the “missing facts” list visible.
Implementation steps (first 30 days)
Define outcomes for AMS
- How: pick 3–5 outcomes (repeat reduction, change failure rate, backlog aging, MTTR trend, cost-to-serve proxy).
- Signal: outcomes appear in weekly ops review, not only SLA charts.
Introduce decision packs
- How: implement quick/standard/investment pack templates from the source; require “do nothing”, assumptions, verification signals.
- Signal: decision packs delivered per month (source metric).
Standardize estimation language
- How: use bands (impact/risk/effort/coordination) and widen bands when uncertain (source rule).
- Signal: fewer debates about “exact hours”; more clarity on missing facts.
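A small illustration of the band rule; the 15% widening per missing fact is an invented knob, the principle is only that the band grows with the missing-facts list:

```python
def effort_band(base_low: float, base_high: float, missing_facts: list[str]) -> tuple:
    """Widen the band instead of pretending precision we don't have."""
    # Illustrative rule: +15% spread per missing fact, capped at doubling the width.
    spread = min(0.15 * len(missing_facts), 1.0)
    width = base_high - base_low
    return (round(base_low, 1), round(base_high + width * spread, 1), missing_facts)

print(effort_band(3, 5, []))                                          # (3, 5, [])
print(effort_band(3, 5, ["interface volume?", "custom code owner?"])) # wider band
```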
Create an option comparison habit
- How: for meaningful changes, compare outcome, time-to-value, one-off cost, run cost delta, risk/reversibility, dependencies, verification, lock-in delta (source fields).
- Signal: fewer late surprises during testing/release.
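A sketch that renders the comparison fields side by side; the field list follows the source, while the plain-text rendering and the MISSING placeholder are illustrative:

```python
COMPARISON_FIELDS = [
    "outcome", "time_to_value", "one_off_cost", "run_cost_delta",
    "risk_reversibility", "dependencies", "verification", "lock_in_delta",
]

def compare(options: list[dict]) -> str:
    """Render a plain-text comparison so gaps are visible before the review."""
    lines = []
    for name in COMPARISON_FIELDS:
        cells = [str(o.get(name, "MISSING")) for o in options]
        lines.append(f"{name:18} | " + " | ".join(cells))
    return "\n".join(lines)

print(compare([
    {"outcome": "baseline", "run_cost_delta": "0", "risk_reversibility": "n/a"},
    {"outcome": "repeat -30%", "one_off_cost": "medium",
     "run_cost_delta": "-1 support day/month"},
]))
```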
Set decision rights and approval gates
- How: map who approves what (prod change, data correction, interface restart policy, emergency fix).
- Signal: reduced waiting caused by unclear ownership.
Start a post-decision learning loop
- How: record pack + chosen option; measure expected vs actual; publish a short learning note (source loop).
- Signal: outcome accuracy improves; regret rate becomes discussable.
Pilot AI-assisted drafting (not auto-execution)
- How: allow drafting decision packs, assumption checklists, and stakeholder summaries (source copilot moves).
- Signal: faster creation of standard packs without higher error rate.
Protect access and audit
- How: enforce least privilege, logging, and separation of duties before any automation touches production.
- Signal: audit questions can be answered with evidence, not memory.
Pitfalls and anti-patterns
- Automating a broken intake: garbage requests produce confident garbage packs.
- Trusting AI summaries without links to evidence (logs, runbooks, KB).
- Measuring only ticket counts and SLA closure; ignoring repeat load.
- Over-broad access “for convenience”; no separation of duties.
- Skipping rollback planning because “it’s a small change”.
- Treating multi-vendor work as politics instead of declared boundaries and shared SLOs (source integrity rules).
- Over-customizing templates until nobody uses them.
- No owner for prevention work; everything becomes “someone’s backlog”.
- Verification defined after the change, not before.
Checklist
- Top 10 repeat incident families identified and linked to problems
- Quick/standard/investment decision pack templates live
- “Do nothing”, assumptions, verification signals mandatory
- Estimation uses bands; missing facts listed
- Approval gates and decision rights documented
- Audit trail and least privilege enforced for any automation
- Post-decision learning note published weekly
FAQ
Is this safe in regulated environments?
Yes, if you keep least privilege, separation of duties, audit trails, and explicit approvals. The agent can draft and collect evidence; humans approve production-impacting actions.
How do we measure value beyond ticket counts?
Use the source metrics plus ops metrics: outcome accuracy (expected vs actual bands), regret rate, reduction in ad-hoc asks, repeat rate, change failure rate, backlog aging, MTTR trend.
What data do we need for RAG / knowledge retrieval?
Curated runbooks, KB atoms, past decision packs with outcomes, and sanitized logs/alerts. If the knowledge base is messy, start with the top incident families and build from there.
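A deliberately naive retrieval sketch over approved sources only; a real setup would use a proper index and embeddings, but the scoping and provenance behavior is the point here:

```python
# Approved knowledge bases; anything outside this set is never retrieved.
APPROVED_SOURCES = {"runbooks", "kb_atoms", "decision_packs"}

def retrieve(query: str, documents: list[dict], top_k: int = 3) -> list[dict]:
    """Keyword-overlap scoring over approved documents, returned with provenance."""
    terms = set(query.lower().split())
    scored = []
    for doc in documents:
        if doc["source"] not in APPROVED_SOURCES:       # enforce retrieval scope
            continue
        overlap = len(terms & set(doc["text"].lower().split()))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Return whole documents so any summary can cite its sources.
    return [doc for _, doc in scored[:top_k]]

docs = [
    {"source": "kb_atoms", "text": "inbound order IDoc stuck in error status 51"},
    {"source": "email_dump", "text": "random thread about IDoc errors"},  # out of scope
]
print(retrieve("IDoc error status 51 orders", docs))
```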
How to start if the landscape is messy?
Pick one critical business flow and its top demand drivers. Build decision packs and KB atoms there first; don’t try to cover the whole landscape in month one.
Will this slow delivery?
At first, yes—because you force assumptions, verification, and rollback into the open. Over time, repeat reduction and clearer decisions usually pay back the overhead.
How do we keep multi-vendor decisions fair?
Declare which vendor surface each option touches, separate dependency time from execution time, and arbitrate using shared SLOs and contracts (source rules).
Next action
Next week, take the top recurring incident family (interfaces, batch chain, or master data corrections—whatever hurts most) and produce one standard decision pack with three options including “do nothing”, explicit assumptions, a verification plan, and rollback steps. Then schedule a 30-minute review to agree on the owner and record the outcome for learning.
Source: “Decision Support Factory: Making Reliable Estimates for Business Choices” (ams-047), Dzmitryi Kharlanau (SAP Lead). https://dkharlanau.github.io
