Security and SoD as a First-Class AMS Flow (and what that means for AI-assisted L2–L4 work)
The week before month-end, a change request lands: adjust a pricing routine, fix a recurring interface failure, and do a one-time data correction because billing is blocked. The developer asks for emergency access “just for today”. The business wants it done in hours. The AMS lead wants zero audit findings. L2 is chasing logs and IDocs, L3 is preparing a transport, L4 is debating whether the role model is already broken. Everyone is “closing tickets”, but the same access drama repeats every release.
That scene is why modern SAP AMS can’t be only about ticket closure. It has to cover L2–L4: complex incidents, change requests, problem management, process improvements, and small-to-medium new developments. And security (authorizations, SoD, emergency access) must be part of the operating flow, not a gate at the end.
Why this matters now
“Green SLAs” can hide expensive patterns:
- Repeat incidents: authorization errors after org/process changes, then quick fixes, then the same thing next month.
- Manual work: SoD checks done late or via emails; emergency access used as a workaround.
- Knowledge loss: the real approval rules live in people’s heads; handovers are messy.
- Cost drift: access changes become a silent backlog that steals time from problem removal and improvements.
The source record is blunt: most security incidents are predictable side effects of change. If you treat security as a last-step approval, you discover issues only when the business fails. If you treat it as a continuous flow, you prevent a chunk of L2 incidents and reduce L3/L4 rework.
AI-assisted support helps in practical ways: better intake, faster context retrieval, clearer SoD explanations, and consistent documentation. It should not be used to make security decisions or to push risky changes to production without explicit human approval.
The mental model
Classic AMS optimizes for throughput: classify ticket → assign → fix → close within SLA.
Modern AMS optimizes for outcomes: reduce repeats → deliver safer changes → shorten recovery time → keep run costs predictable. The unit of value is not “tickets closed”, it’s “incidents not created” and “changes that don’t bounce back”.
Two rules of thumb I use:
- If the same access request repeats, it’s not an approval problem. It’s a role design problem. (This is directly in the source: repeated requests should trigger role redesign, not faster approvals.)
- If you can’t explain “who/why/when” for access or a change, you don’t have control—you have luck. Traceability is an operational requirement, not paperwork.
What changes in practice
- From “security as gate” to “security as flow”
  Use a defined access flow: intake → validation → execution → review. The source suggests concrete elements: declared purpose and duration, linked to a process or change; automated SoD and role compatibility checks; temporary access with auto-expiry; post-access verification and quarterly cleanup based on usage.
- From incident closure to root-cause removal
  Auth-related incidents are often symptoms of role drift. Treat “auth-related incident rate” as a problem backlog input, not just a KPI.
- From chat ping-pong to structured intake
  Still allow access requests via chat (the source allows it), but require business context, purpose, and duration. “No access without declared business intent.” That single rule removes a lot of noise.
- From manual SoD justification emails to evidence snapshots
  Replace long email threads with a stored “SoD risk snapshot” and a plain-language explanation of conflicts (both are listed outputs in the source). The goal is faster decisions with better auditability.
- From permanent emergency access to expiring break-glass
  “Emergency access always expires automatically.” This is non-negotiable if you want emergency access to stay exceptional.
- From tribal knowledge to versioned runbooks and decisions
  For L2–L4, store incident patterns, interface recovery steps, transport/rollback steps, and access decision rationale, searchable and versioned. (Generalization: the source doesn’t name a knowledge tool, so assume whatever repository you already govern.)
- From “one vendor owns it” to clear decision rights
  Security ownership is shared: AMS executes, security governance sets policy, and the business owns intent and risk acceptance. Decision rights must be explicit, or approvals become favors (an anti-pattern in the source).
Agentic / AI pattern (without magic)
“Agentic” here means: a workflow where a system can plan steps, retrieve context, draft actions, and execute only pre-approved safe tasks under human control.
A realistic end-to-end workflow for access + incident prevention:
Inputs
- Access request text (often via chat) with process context, purpose, duration (source).
- Recent incidents tagged as authorization-related (generalization).
- Monitoring signals: repeated authorization failures, batch chain failures due to missing rights (generalization).
- Change context: linked change request / transport plan (source says “linked to process or change”).
- Runbooks: how to grant/revoke standard roles, how to verify after access (generalization).
Steps
- Classify: is this standard access, temporary access, or emergency access?
- Retrieve context: user’s current roles, related process/change, similar past requests.
- Validate (all from source):
  - Automated SoD check
  - Role compatibility check
  - Historical risk pattern check
- Propose action: recommend standard role assignment (fast lane) or temporary access with expiry; generate SoD explanation in plain language (source “copilot moves”).
- Request approval: route to the right approver based on intent and risk. Keep separation of duties: the requester cannot approve their own access (general governance assumption; not specified in source, but consistent with SoD intent).
- Execute safe tasks: apply approved role assignment or temporary access; enforce auto-expiry; capture who/why/when (source).
- Document and review: post-access verification, usage review for emergency roles, and later quarterly cleanup based on real usage (source).
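The classify and validate steps above can be sketched in a few lines. The SoD conflict pairs and role names below are invented for illustration; in practice the ruleset would come from your GRC tooling:

```python
# Hypothetical SoD ruleset: pairs of activities one user must not combine.
SOD_CONFLICTS = {
    frozenset({"CREATE_VENDOR", "POST_PAYMENT"}),
    frozenset({"MAINTAIN_PRICING", "RELEASE_BILLING"}),
}

def classify(request: dict) -> str:
    """Standard, temporary, or emergency access (the three lanes in the flow)."""
    if request.get("emergency"):
        return "emergency"  # always time-boxed, always usage-reviewed
    return "temporary" if request.get("duration_days") else "standard"

def sod_check(current_roles: set[str], requested_role: str) -> list[tuple[str, str]]:
    """Return the conflicting pairs the requested role would introduce."""
    return [
        (role, requested_role)
        for role in sorted(current_roles)
        if frozenset({role, requested_role}) in SOD_CONFLICTS
    ]

req = {"user": "jdoe", "role": "POST_PAYMENT", "duration_days": 7}
print(classify(req))                                 # → temporary
print(sod_check({"CREATE_VENDOR"}, "POST_PAYMENT"))  # → [('CREATE_VENDOR', 'POST_PAYMENT')]
```

A non-empty result from `sod_check` is exactly what feeds the plain-language SoD explanation and the stored risk snapshot; the system drafts, a human decides.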
Guardrails
- Least privilege: the system can draft and recommend; execution is limited to pre-approved standard roles and time-bound access.
- Approvals: no production-affecting access without human approval tied to declared intent.
- Audit: store decision recommendation, SoD snapshot, and final approval outcome.
- Rollback: access changes must be reversible; expiry is a built-in rollback for emergency access.
- Privacy: restrict what ticket/chat content is stored and retrieved; avoid pulling sensitive business data into prompts (generalization; the source doesn’t specify privacy controls).
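The audit guardrail above can be as simple as one stored record per decision. A minimal sketch, assuming your ticketing or GRC tool can hold an opaque JSON attachment (the field names are illustrative, not a standard schema):

```python
import json
from datetime import datetime, timezone

def evidence_snapshot(request: dict, sod_conflicts: list, recommendation: str,
                      approver: str, outcome: str) -> str:
    """One stored record answering who/why/when for an access decision."""
    return json.dumps({
        "who": request["user"],
        "why": request["purpose"],
        "when": datetime.now(timezone.utc).isoformat(),
        "sod_snapshot": sod_conflicts,   # conflicts found at decision time
        "recommendation": recommendation,  # what the system proposed
        "approver": approver,              # the human who decided
        "outcome": outcome,
    }, sort_keys=True)
```

Storing the recommendation separately from the outcome also lets you later measure how often humans overrode the system, which is a useful trust signal.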
What stays human-owned: approving production access, accepting SoD risk, authorizing emergency access, approving data corrections, and signing off business impact.
Honestly, this will slow you down at first because you’ll discover how many “informal” approvals you were relying on.
Implementation steps (first 30 days)
- Define the access intake minimum
  Purpose: stop vague requests.
  How: require purpose + duration + linked process/change in the request.
  Signal: fewer back-and-forth messages; higher “auto-approved (%)” for standard cases (source metric).
- Create the security flow as an AMS runbook
  Purpose: make security repeatable.
  How: document the intake/validation/execution/review steps from the source.
  Signal: a consistent evidence trail (“who/why/when”) on access changes.
- Set operating rules and enforce them
  Purpose: remove exceptions that become normal.
  How: implement “no intent, no access” and “emergency always expires”.
  Signal: emergency access usage and duration trend down (source metric).
- Introduce SoD checks early in change work
  Purpose: prevent SoD violations introduced by changes.
  How: add SoD validation to change request refinement, not just pre-go-live.
  Signal: fewer “SoD violations introduced by changes” (source metric).
- Tag and track auth-related incidents
  Purpose: turn noise into a problem backlog.
  How: classify incidents as auth-related and link them to access/role changes.
  Signal: the auth-related incident rate starts to drop (source metric).
- Build a small “role drift” review cadence
  Purpose: stop slow decay.
  How: monthly review of drift signals; quarterly cleanup based on real usage (source).
  Signal: a role drift report exists and actions are recorded (source output).
- Pilot the copilot moves in a narrow scope
  Purpose: reduce manual touch time without losing control.
  How: start with pre-filling requests and plain-language SoD explanations (source).
  Signal: faster approvals for standard roles; fewer justification emails.
- Define decision rights
  Purpose: stop “access as a favor”.
  How: write down who approves what, and when the business must sign off.
  Signal: fewer escalations; clearer ownership in tickets and changes.
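The “no intent, no access” intake rule above can be enforced mechanically before any human looks at the request. A sketch, assuming hypothetical request-form field names:

```python
# Required intake fields per the "no intent, no access" rule; the field
# names are assumptions about your request form, not a standard.
REQUIRED_FIELDS = ("purpose", "duration_days", "linked_change")

def validate_intake(request: dict) -> list[str]:
    """Return missing-field errors; an empty list means the request may proceed."""
    return [f"missing: {f}" for f in REQUIRED_FIELDS if not request.get(f)]

print(validate_intake({"user": "jdoe", "purpose": "month-end fix"}))
# → ['missing: duration_days', 'missing: linked_change']
```

Rejecting incomplete requests at intake is what lifts the auto-approved share for standard cases: the fast lane only works when the inputs are complete.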
Limitation: if your role model is already inconsistent across processes, recommendations will be noisy until you clean up the basics.
Pitfalls and anti-patterns
- Automating broken intake: faster bad requests are still bad requests.
- Trusting AI summaries without evidence links (logs, role assignments, SoD snapshot).
- Permanent emergency access (explicit anti-pattern in the source).
- Manual SoD justification emails as the primary control (explicit anti-pattern).
- Treating access as a favor instead of a controlled operation (explicit anti-pattern).
- Over-broad execution rights for automation (breaks least privilege and SoD).
- No rollback thinking for access and changes; expiry and reversibility must be designed.
- Metrics that reward closure only; you’ll optimize for hiding work, not removing it.
- Skipping post-access verification and usage review; you won’t learn.
Checklist
- Access requests require intent, duration, and process/change link
- Automated SoD + role compatibility checks happen before execution
- Emergency access expires automatically and is usage-reviewed
- “Who/why/when” is recorded for every access change
- Repeated requests trigger role redesign work, not faster approvals
- Auth-related incidents are tagged and reviewed as problem candidates
- Quarterly cleanup based on real usage is scheduled
- AI assistance is limited to drafting, explaining, and safe pre-approved tasks
FAQ
Is this safe in regulated environments?
Yes, if you treat AI assistance as documentation and recommendation, and keep approvals, SoD decisions, and production access under human control with audit trails and expiry.
How do we measure value beyond ticket counts?
Use the source metrics: auth-related incident rate, emergency access usage/duration, SoD violations introduced by changes, and access requests auto-approved (%). Add general ops signals like reopen rate and change failure rate.
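These metrics can be computed from tagged records. A minimal sketch, assuming hypothetical field names (`type`, `auto_approved`, `tags`) from your ticketing export:

```python
def auto_approved_pct(requests: list[dict]) -> float:
    """Share of standard access requests approved without manual touch."""
    standard = [r for r in requests if r["type"] == "standard"]
    if not standard:
        return 0.0
    return 100.0 * sum(1 for r in standard if r["auto_approved"]) / len(standard)

def auth_incident_rate(incidents: list[dict]) -> float:
    """Share of incidents tagged as authorization-related."""
    if not incidents:
        return 0.0
    return 100.0 * sum(1 for i in incidents if "auth" in i["tags"]) / len(incidents)

requests = [
    {"type": "standard", "auto_approved": True},
    {"type": "standard", "auto_approved": False},
    {"type": "emergency", "auto_approved": False},
]
print(auto_approved_pct(requests))  # → 50.0
```

Both numbers only mean something if tagging is consistent, which is why “tag and track auth-related incidents” comes before any dashboard.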
What data do we need for RAG / knowledge retrieval?
Minimum: past access decisions (who/why/when), SoD snapshots, role definitions, incident classifications, and runbooks. Keep it curated; dumping raw chat logs without structure usually creates noise. (Generalization.)
How to start if the landscape is messy?
Start narrow: one process area with high access volume or repeated auth incidents. Apply the intake rules and expiry first, then work on role drift.
Will this reduce L2 workload or just move it to governance?
It reduces L2 firefighting when role drift and emergency access shrink. But you will spend more time upfront on validation and cleanup. That trade-off is real.
Who should own the flow?
AMS should own execution and operational metrics; security governance owns policy and SoD rules; business owns intent and risk acceptance. If any of these is missing, approvals become political.
Next action
Next week, pick the last five authorization-related incidents and trace them back: what change triggered them, what access was granted, whether emergency access expired, and whether the same request happened before. If you can’t answer “who/why/when” for each, your first improvement is not more tools—it’s enforcing the access intake and review flow described above.
Source attribution (required): Dzmitryi Kharlanau (SAP Lead). Dataset bytes: https://dkharlanau.github.io — ams-025 “Security and SoD as a First-Class AMS Flow”.
MetalHatsCats Operational Intelligence — 2/20/2026
