Modern SAP AMS: outcome-driven operations, with responsible agentic support
The interface queue is growing again. Billing is blocked, and the business is asking for a “quick fix” before month-end. L2 can clear some stuck messages, but the same pattern returns after every release. L3 suspects a mapping change. L4 says the real issue is a missing validation step in the process and a fragile batch chain. Meanwhile, a change request for a small enhancement is waiting, because nobody wants to touch production during a release freeze caused by regressions.
This is SAP AMS across L2–L4: complex incidents, change requests, problem management, process improvements, and small-to-medium developments. If you only optimize for ticket closure, you can still show green SLAs while the system keeps producing the same demand.
Why this matters now
Green SLAs often hide four expensive realities:
- Repeat incidents: the same “ticket family” comes back with different screenshots. You close fast, but you don’t delete the cause.
- Manual work that never becomes knowledge: evidence is collected in chats, not in runbooks. The next engineer starts from zero.
- Knowledge loss: key users and senior analysts know the “three checks” for known errors, but it’s tribal and not versioned.
- Cost drift: small recurring defects create a permanent tax on L2/L3 time, pushing real improvements to “later.”
Modern AMS (as a working style) looks different day to day: you treat recurring demand as a backlog item, you train against data, and you build learning loops where each resolved pattern becomes reusable knowledge. Agentic / AI-assisted support can help with the boring parts (classification, retrieval, drafting, documentation), but it must not bypass approvals, access rules, or audit needs.
The mental model
Classic AMS optimizes for throughput: close tickets, meet response/resolve times, keep the queue moving.
Modern AMS optimizes for outcomes:
- reduce repeat ticket families,
- improve time-to-first-triage for L1/L2,
- increase self-service resolution,
- deliver safer changes with rollback thinking.
Two rules of thumb I use:
- No training without a target metric: pick a ticket family and expect it to drop. (This is straight from the source record.)
- If training doesn’t reduce repeats, assume design debt: it’s not always “user error”; sometimes the process or validation is broken.
What changes in practice
- From incident closure → to root-cause removal. Every high-repeat symptom gets a problem record and an owner. The outcome is not "resolved"; it is "repeat rate down." You still fix fast, but you also schedule prevention work.
- From weak intake → to request templates that cut back-and-forth. The source JSON calls out "request templates that eliminate back-and-forth." Make "what to provide" explicit: steps, timing, affected objects, evidence. Less ping-pong means faster L2/L3 diagnosis (a minimal intake sketch follows this list).
- From tribal knowledge → to searchable, versioned run cards. Turn recurring fixes into one-page job aids: "If you see X, do Y, attach Z." Store them as KB/runbook entries, not as slide decks. The source rule is clear: every training asset must be reusable as knowledge.
- From ad-hoc triage → to disciplined evidence collection. Train AMS L1/L2 on triage discipline: what logs to capture, what screenshots matter, what "first 3 checks" apply for known errors. This improves time-to-first-triage (a metric in the source).
- From reactive firefighting → to drills and office hours. Short incident simulations with AMS + key users improve coordination and timelines. Weekly office hours focus on one recurring pain point and remove it (source format). This is prevention work with a calendar slot.
- From "one vendor" thinking → to clear decision rights. Define who can approve production data corrections, security changes, transport imports, and business process changes. Without decision rights, agentic support becomes risky because it will "helpfully" propose actions nobody is accountable for.
- From training as an event → to training as a demand-reduction system. Micro-sessions (10–20 minutes, scenario-based) tied to top ticket families by volume and by business impact (source). The point is fewer repeats, not "knowledge coverage."
Honestly, this will slow you down at first because you are adding structure (templates, run cards, approvals) where people are used to improvising.
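A minimal sketch of such an intake template, assuming a simple ticket export and illustrative field names (nothing here is a prescribed schema):

```python
from dataclasses import dataclass

# Minimal sketch of a structured intake template. Field names are assumptions,
# not a prescribed schema; the point is to make "what to provide" explicit so
# L2/L3 do not have to chase missing evidence.
@dataclass
class IncidentIntake:
    summary: str
    steps_to_reproduce: list[str]   # what the user did, in order
    occurred_at: str                # timing: when it happened, how often
    affected_objects: list[str]     # e.g. interface name, document numbers
    evidence: list[str]             # log extracts, screenshots, message IDs
    business_impact: str = "unknown"  # used later to rank ticket families

def missing_fields(intake: IncidentIntake) -> list[str]:
    """Return the fields that are still empty, to stop the back-and-forth before triage starts."""
    gaps = []
    if not intake.steps_to_reproduce:
        gaps.append("steps_to_reproduce")
    if not intake.affected_objects:
        gaps.append("affected_objects")
    if not intake.evidence:
        gaps.append("evidence")
    return gaps
```

A request is only routed to L2/L3 once `missing_fields` comes back empty; everything before that stays with the requester.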
Agentic / AI pattern (without magic)
“Agentic” here means: a workflow where a system can plan steps, retrieve context, draft actions, and execute only pre-approved safe tasks under human control.
One realistic workflow: recurring interface incident → prevention + training asset
Inputs
- Incident tickets and their categories (repeat families)
- Monitoring alerts and logs (generalization: whatever your landscape provides)
- Runbooks / KB
- Change records and transport notes (no tool names assumed)
- Past resolution evidence (sanitized)
Steps
- Classify: group new incidents into known ticket families (volume + impact).
- Retrieve context (RAG): pull the relevant run card content units: symptom pattern, context, first checks, likely causes, fix options, prevention, sanitized evidence examples (these units are defined in the source JSON).
- Propose action: draft a triage checklist for L2 and a likely fix path for L3/L4, with explicit evidence to collect.
- Request approval: if the proposal touches production (data correction, transport import, authorization change), it stops and asks for the right human approver.
- Execute safe tasks (only if pre-approved): create a ticket update, attach the checklist, open a problem record, schedule a drill, draft a micro-session script from anonymized real tickets (source “copilot moves”).
- Document: generate a KB/runbook entry update and a short “post-training impact report” proposal (source output), but require human review before publishing.
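A minimal sketch of this loop, assuming your ticketing and KB tooling provides the classify/retrieve/propose functions; the names, the safe-task list, and the `Proposal` shape are illustrative assumptions, not a specific product's API:

```python
from dataclasses import dataclass

# Actions the assistant may execute without a named human approver. Anything
# outside this set (data corrections, transport imports, authorization changes)
# stops and waits. All names are illustrative assumptions.
SAFE_TASKS = {"update_ticket", "attach_checklist", "open_problem_record",
              "schedule_drill", "draft_micro_session"}

@dataclass
class Proposal:
    ticket_family: str
    action: str              # e.g. "attach_checklist" or "import_transport"
    evidence: list[str]      # what the suggestion was based on (for the audit trail)
    touches_production: bool

def handle_incident(ticket, classify, retrieve_run_card, propose, request_approval, execute, log):
    """Sketch of the classify → retrieve → propose → approve → execute → document loop."""
    family = classify(ticket)                       # group into a known ticket family
    run_card = retrieve_run_card(family)            # RAG over structured run card units
    proposal: Proposal = propose(ticket, run_card)  # triage checklist + likely fix path

    if proposal.touches_production or proposal.action not in SAFE_TASKS:
        approver = request_approval(proposal)       # hard stop: a human decides
        log(proposal, approved_by=approver)
        return proposal                             # nothing is executed automatically

    execute(proposal)                               # pre-approved safe task only
    log(proposal, approved_by="pre-approved-policy")
    return proposal
```

The dependencies are passed in as plain callables on purpose: the loop stays the same whether classification is a rule set or a model, and the approval gate cannot be skipped by swapping tools.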
Guardrails
- Least privilege: the system can read sanitized tickets and KB; it cannot change production data or import transports.
- Approvals & separation of duties: humans approve prod changes; developers don’t self-approve their own fixes.
- Audit trail: every suggestion and executed safe task is logged (who approved, what evidence was used).
- Rollback discipline: any change proposal must include rollback thinking (explicitly called out for developers in the source).
- Privacy: scenario scripts are generated from real tickets only after anonymization (source).
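One way to keep least privilege and the audit trail honest is to express both as plain data. A small sketch, assuming a simple role-to-permission mapping (roles, permission names, and log fields are assumptions; map them to your own authorization model):

```python
from datetime import datetime, timezone

# Least privilege as data: the assistant can read sanitized material and draft,
# nothing more. Role and permission names are illustrative assumptions.
PERMISSIONS = {
    "assistant":   {"read_sanitized_tickets", "read_kb", "draft_kb_update"},
    "l2_engineer": {"read_tickets", "update_tickets", "run_first_checks"},
    "approver":    {"approve_prod_change", "approve_data_fix", "approve_auth_change"},
}

def is_allowed(role: str, action: str) -> bool:
    return action in PERMISSIONS.get(role, set())

def audit_entry(actor: str, action: str, evidence: list[str], approved_by: str | None = None) -> dict:
    """One log entry per suggestion or executed safe task: who, what, based on which evidence."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "evidence": evidence,
        "approved_by": approved_by,
    }
```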
What stays human-owned
- Approving production changes, data corrections, and security decisions
- Business sign-off for process changes
- Final root-cause conclusion (AI can summarize, but it can be wrong)
- Publishing runbooks that affect compliance or audit scope
A real limitation: if your ticket data is inconsistent or full of free-text noise, classification and retrieval will produce confident-looking but unreliable outputs until you fix the basics.
Implementation steps (first 30 days)
- Pick the top ticket families. Purpose: focus. How: rank by volume and by business impact (source). Success signal: a short list everyone agrees is "the real pain" (see the ranking sketch after this list).
- Define target metrics per family. Purpose: avoid training theater. How: "which ticket family should drop" (source rule). Signal: each item has a baseline and an owner.
- Create 3–5 run cards. Purpose: faster triage, fewer repeats. How: one-page "If X, do Y, attach Z" (source job aids). Signal: L1/L2 uses them in ticket updates.
- Run micro-sessions. Purpose: delete repeats. How: 10–20 minutes, scenario-based (source). Signal: repeat incident rate for trained scenarios starts trending down.
- Start weekly office hours. Purpose: continuous removal of one pain point. How: fixed slot, one topic (source). Signal: fewer "where do I find…" tickets.
- Do one drill with key users + AMS. Purpose: improve coordination and the evidence timeline. How: short simulation (source drills). Signal: time-to-first-triage improves for that scenario.
- Set approval gates for risky actions. Purpose: safety. How: document who approves prod changes, data fixes, auth changes. Signal: fewer "urgent but unclear" escalations.
- Pilot agentic support on safe tasks. Purpose: reduce manual admin work without risk. How: auto-select training topics from demand drivers; draft scripts and quizzes tied to runbooks; draft impact reports (source "copilot moves"). Signal: a training backlog ranked by ROI exists and is reviewed weekly.
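A minimal sketch of the ranking in step 1, assuming a plain ticket export with `family` and `impact` fields; the impact weights are assumptions, not a standard:

```python
from collections import Counter

# Illustrative weights for business impact; tune to your own categories.
IMPACT_WEIGHT = {"low": 1, "medium": 3, "high": 9}

def rank_families(tickets: list[dict]) -> list[tuple[str, int]]:
    """Score each family by summing per-ticket impact weights (volume weighted by impact), highest first."""
    scores: Counter = Counter()
    for t in tickets:
        scores[t["family"]] += IMPACT_WEIGHT.get(t.get("impact", "low"), 1)
    return scores.most_common()
```

Feed it the last 30–60 days of incidents and take the top three families as the initial backlog; the "Next action" at the end uses the same window.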
Pitfalls and anti-patterns
- Training that is generic and unrelated to AMS load (source anti-pattern)
- One-time workshops with no follow-up (source anti-pattern)
- Teaching users to live with broken design (source anti-pattern)
- Automating triage when intake quality is poor (garbage in, confident garbage out)
- Trusting AI summaries without checking the evidence trail
- Giving broad access “to make it work faster” (breaks least privilege and audit)
- No owner for prevention work (everything becomes “after the incident”)
- Measuring only ticket counts and missing repeat rate and self-service rate
- Over-customizing runbooks so they can’t be reused across teams
Checklist
- Top ticket families ranked by volume and business impact
- Each family has a target metric and an owner
- Run cards exist for known errors and “first 3 checks”
- Request templates reduce missing evidence
- Weekly office hours scheduled with one recurring topic
- One drill completed with AMS + key users
- Approval gates documented (prod changes, data fixes, security)
- Agentic support limited to safe tasks + full audit trail
- Training assets stored as KB/runbook entries (versioned)
FAQ
Is this safe in regulated environments?
Yes, if you treat agentic support as a drafting and retrieval assistant, not an autonomous operator. Keep least privilege, approvals, audit logs, and separation of duties.
How do we measure value beyond ticket counts?
Use the source metrics: ticket family volume before/after training, repeat incident rate, time-to-first-triage improvement, and self-service resolution rate increase.
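A small sketch of how these could be computed per family and compared before/after a micro-session; the field names (`is_repeat`, `minutes_to_first_triage`, `resolved_by`) are assumptions about what your ticket export contains:

```python
def family_metrics(tickets: list[dict], family: str) -> dict:
    """Per-family metrics for one period: volume, repeat rate, median time-to-first-triage,
    and self-service resolution rate. Field names are assumptions about the ticket export."""
    fam = [t for t in tickets if t["family"] == family]
    if not fam:
        return {"family": family, "volume": 0}
    repeats = sum(1 for t in fam if t.get("is_repeat", False))
    triage = sorted(t["minutes_to_first_triage"] for t in fam if "minutes_to_first_triage" in t)
    self_service = sum(1 for t in fam if t.get("resolved_by") == "self_service")
    return {
        "family": family,
        "volume": len(fam),
        "repeat_rate": repeats / len(fam),
        "median_time_to_first_triage_min": triage[len(triage) // 2] if triage else None,
        "self_service_rate": self_service / len(fam),
    }

# Run it on the period before and the period after training and compare the two dicts.
```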
What data do we need for RAG / knowledge retrieval?
Structured units, not long documents: symptom pattern, context, first checks, likely causes, fix options, prevention, and sanitized evidence examples (all listed in the source).
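A minimal sketch of one run card expressed as those units, so retrieval can return just the relevant unit (for example `first_checks`) instead of a whole document; the example content is illustrative, not taken from a real system:

```python
# One run card as structured content units. Keys follow the list above; the
# values are invented for illustration only.
run_card = {
    "id": "iface-stuck-messages-001",
    "symptom_pattern": "Outbound billing interface messages stuck after a release import",
    "context": "Billing run, month-end, shortly after a transport import",
    "first_checks": [
        "Check the interface/queue monitor for the affected message type",
        "Compare the last successful message with the first failed one",
        "Confirm whether a mapping or validation change was imported recently",
    ],
    "likely_causes": ["mapping change not aligned with the sending system", "missing validation step"],
    "fix_options": ["reprocess after mapping correction", "targeted data correction with approval"],
    "prevention": "Add the validation step to the process; include the interface in regression checks",
    "evidence_examples": ["sanitized message payload extract", "queue screenshot with message IDs"],
}

def retrieve_unit(card: dict, unit: str):
    """Return one content unit from a run card, e.g. retrieve_unit(run_card, 'first_checks')."""
    return card.get(unit)
```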
How do we start if the landscape is messy?
Assumption (because the source doesn’t specify tooling): start with the tickets you already have. Normalize categories for the top families, then build run cards from real resolutions. Don’t wait for perfect monitoring.
Will this reduce L3/L4 load or just shift it?
If you only improve closure speed, it shifts load. If you remove root causes and improve user/key-user behavior, repeats drop and L3/L4 time becomes more predictable.
Where does training fit into change requests and small developments?
Train developers on change hygiene, rollback thinking, and regression awareness (source). Then convert the outcomes into runbook updates and checklists used during build and release.
Next action
Next week, take the last 30–60 days of incidents and pick three ticket families; assign an owner to each, write one run card per family, and schedule one 20‑minute micro-session with a single goal: make that family’s repeat rate go down.
Source: Dzmitryi Kharlanau (SAP Lead). Dataset bytes: https://dkharlanau.github.io (ams-018).
