Contain Custom Code: Stop Z‑Code from Owning SAP AMS
A change request lands late Thursday: “Small pricing tweak, needed for Monday.” L2 sees it touches a custom validation that also feeds an interface mapping. L3 remembers a similar tweak caused a billing block last quarter. L4 says the original developer left years ago, and the only “documentation” is old ticket comments. Everyone can close tickets fast. Nobody can say, with confidence, what will break after the transport.
That is the gap between classic AMS and modern AMS. Not L1 closure. L2–L4 reality: complex incidents, change requests, problem management, process improvements, and small-to-medium new developments—often around custom code that quietly became critical infrastructure.
Why this matters now
Green SLAs can hide the real pain:
- Repeat incidents that return after every release, especially after transports/imports touching Z-objects.
- Manual work that never becomes a runbook: interface reprocessing, batch chain babysitting, master data fixes, authorization workarounds.
- Knowledge loss: “hero developers” and tribal memory around black-box enhancements.
- Cost drift: AMS hours grow, but the system is not more stable.
The source record is blunt: most AMS pain is not SAP standard. It’s unmanaged custom code. A Z-program solves an urgent need, then logic spreads via copy-paste and implicit dependencies, authors leave, and AMS inherits fear instead of understanding. Result: every change becomes risky, slow, and expensive.
Modern AMS is outcome-driven operations: reduce repeats, make changes safer, and build learning loops. Agentic / AI-assisted support helps with context gathering, classification, and evidence trails—but it must not become an ungoverned “auto-fixer”.
The mental model
Classic AMS optimizes for ticket throughput: close incidents within SLA, keep backlog “under control”, move on.
Modern AMS optimizes for operational outcomes:
- Fewer repeat incidents caused by custom code
- Lower change failure rate (regressions after transports)
- Faster root cause analysis because knowledge is searchable and current
- Predictable run cost because hotspots are visible
Two rules of thumb I use:
- If a Z-object consumes disproportionate AMS hours, treat it like a product with an owner, tests, and a roadmap—not a “support artifact”.
- If logic changes frequently, it likely does not belong in SAP core (generalization, but it matches the source’s containment rule).
What changes in practice
From “module-based” reporting to Z-object accountability
Daily execution in the source: track incidents by Z-object, not just by module. Mechanism: tag incidents/problems/changes with the touched Z-objects and enhancements. Signal: “AMS hours by Z-object” becomes a normal weekly view.
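A minimal sketch of that weekly view, assuming incidents are exported with their tagged Z-objects and logged effort (the field names, ticket IDs, and Z-object names here are hypothetical, not from any real system):

```python
from collections import defaultdict

# Hypothetical ticket export: each record carries the manually tagged
# Z-objects and the AMS effort logged against the ticket, in hours.
tickets = [
    {"id": "INC-1001", "z_objects": ["Z_PRICING_CHECK"], "hours": 6.5},
    {"id": "INC-1002", "z_objects": ["Z_IF_MAP_ORDERS", "Z_PRICING_CHECK"], "hours": 3.0},
    {"id": "CHG-2001", "z_objects": ["Z_IF_MAP_ORDERS"], "hours": 4.0},
]

def hours_by_z_object(records):
    """Sum AMS hours per tagged Z-object (one ticket may touch several)."""
    totals = defaultdict(float)
    for rec in records:
        for obj in rec["z_objects"]:
            totals[obj] += rec["hours"]
    # Sort descending so the weekly view starts with the hotspots.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

print(hours_by_z_object(tickets))
# [('Z_PRICING_CHECK', 9.5), ('Z_IF_MAP_ORDERS', 7.0)]
```

Even this trivial aggregation is enough to turn “which custom code costs us the most?” from a guess into a standing report.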
From “we don’t touch this” to classification and contracts
Use the source classification:
- Core-Critical: rare, heavily tested, owned, documented (legal/financial posting depends on it).
- Business Logic: prefer externalization; SAP should not be the brain.
- Utility/Convenience: safe, isolated, replaceable.
Signal: decreasing “custom code with no named owner”.
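One way to make the classification enforceable rather than decorative is a small policy table; the treatment values below are illustrative placeholders, not prescriptions from the source:

```python
# Hypothetical policy table: each classification maps to the minimum
# treatment a change to that Z-object must receive.
POLICY = {
    "core-critical": {"test_level": "full regression", "doc": "mandatory", "exit_path": "keep and stabilize"},
    "business-logic": {"test_level": "scenario tests", "doc": "mandatory", "exit_path": "externalize when volatile"},
    "utility": {"test_level": "smoke test", "doc": "lightweight", "exit_path": "retire or replace freely"},
}

def treatment(classification: str) -> dict:
    """Return the required treatment. Unknown classes fail loudly so
    unclassified custom code cannot quietly slip through as 'utility'."""
    try:
        return POLICY[classification]
    except KeyError:
        raise ValueError(f"unclassified custom code: {classification!r}")
```

The deliberate design choice is the loud failure: an object with no classification blocks the change workflow instead of defaulting to the cheapest treatment.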
From tribal knowledge to living documentation
The source suggests generating living documentation from code + tickets + changes. Mechanism: every fix or change updates a short “what/why/blast radius/rollback” note linked to the Z-object. Signal: fewer escalations that start with “nobody knows how it works”.
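The “what/why/blast radius/rollback” note can be as small as one record per change; a sketch of such a record, with hypothetical field contents:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ZObjectNote:
    """One short living-documentation entry, updated with every fix or change."""
    z_object: str
    what: str           # what the change did
    why: str            # business reason / ticket reference
    blast_radius: str   # interfaces, batch chains, downstream processes touched
    rollback: str       # how to undo after a bad transport
    updated: date = field(default_factory=date.today)

    def render(self) -> str:
        """Plain-text form for the knowledge base entry linked to the Z-object."""
        return (f"{self.z_object} ({self.updated})\n"
                f"What: {self.what}\n"
                f"Why: {self.why}\n"
                f"Blast radius: {self.blast_radius}\n"
                f"Rollback: {self.rollback}")
```

The schema matters more than the tooling: as long as every change produces one of these, the knowledge survives the author leaving.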
From reactive firefighting to hotspot removal
Detect hotspots: code that generates disproportionate AMS load. Then choose: stabilize what must remain, retire/freeze low-value Z-code, or move volatile logic to external services (strategic actions in the source). Signal: repeat incidents caused by custom code trend down.
From manual triage to AI-assisted triage with evidence
AI can map incidents to Z-objects automatically and pull related history. But it must cite sources (tickets, changes, code references) so L2/L3 can verify. Signal: reduced manual “context hunting” time, not blind automation.
From “emergency fix” culture to traceability and rollback discipline
Guardrails from the source: no emergency fixes without traceability; no silent logic in exits without reference documentation. Mechanism: every urgent change still records scope, approval, and rollback steps. Signal: fewer regressions after transports.
From “one vendor thinking” to clear decision rights
Ownership must be explicit: functional owner for each Z-object, technical owner for code quality, and change authority for production moves. Signal: fewer stalled changes due to “waiting for someone to decide”.
Agentic / AI pattern (without magic)
By “agentic” I mean: a workflow where a system can plan steps, retrieve context, draft actions, and execute only pre-approved safe tasks under human control. It is not a free-running bot in production.
One realistic end-to-end workflow for L2–L4:
Inputs
- Incident/change request text, attachments, and history
- Monitoring signals and logs (generalization: whatever you already collect)
- Transport/change records, runbooks, and known error patterns
- Code metadata and the list of related Z-objects/enhancements
Steps
- Classify: incident vs problem vs change; propose severity; detect if Z-code is involved.
- Retrieve context: pull prior tickets linked to the same Z-object, recent transports touching it, and any runbook steps.
- Propose action: draft a hypothesis (“likely mapping logic in custom code”), list checks to confirm (interface queue status, batch chain dependencies, authorization impacts), and suggest a safe containment step.
- Request approval: if any step affects production behavior or data, route to the right human approver (separation of duties).
- Execute safe tasks (only pre-approved): collect logs, generate a comparison report, draft a change description, prepare a rollback plan template, update the knowledge note.
- Document: write the evidence trail: what was checked, what changed, what to monitor after import.
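The steps above can be sketched as control flow. Everything here is illustrative: the helper callables stand in for your ticketing system, code repository, and model calls, and SAFE_TASKS is a hypothetical allow-list — the only point is that drafting runs automatically while anything else stops at a human gate:

```python
# Pre-approved, read-only or draft-only tasks that may run without sign-off.
SAFE_TASKS = {"collect_logs", "draft_change_description", "prepare_rollback_template"}

def handle(ticket, classify, retrieve_context, propose, request_approval, execute, write_trail):
    record = classify(ticket)            # incident / problem / change + severity
    context = retrieve_context(record)   # prior tickets, transports, runbooks
    proposal = propose(record, context)  # hypothesis + checks + containment step
    evidence = []
    for task in proposal["tasks"]:
        if task in SAFE_TASKS:
            # Pre-approved safe tasks execute directly.
            evidence.append(execute(task))
        elif request_approval(task):
            # Anything touching production behavior needs human sign-off first.
            evidence.append(execute(task))
        else:
            evidence.append(f"{task}: rejected, escalated to human owner")
    # Auditable trail: what was checked, what ran, what was refused.
    write_trail(record, context, proposal, evidence)
    return evidence
```

The separation-of-duties property lives in the structure: the same function that proposes tasks never decides approvals, and unapproved tasks leave a visible rejection in the trail instead of silently disappearing.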
Guardrails
- Least privilege: read-only by default; no direct production changes.
- Approvals: human sign-off for production changes, data corrections, and security decisions.
- Audit trail: keep prompts, retrieved sources, and final decisions linked to the ticket/change.
- Rollback: every change proposal includes rollback steps and monitoring checks.
- Privacy: restrict what ticket data can be used in retrieval; avoid copying sensitive business data into free text.
Honestly, this will slow you down at first because you are forcing ownership and documentation where none existed.
What stays human-owned
- Approving production transports/imports
- Business sign-off on process changes and pricing/validation rules
- Any data correction with audit implications
- Authorization and security decisions
- Final root cause statement for problem management (AI can draft, humans must confirm)
A limitation: if your historical tickets are low quality (“fixed, please close”), retrieval will return noise until you improve intake and closure notes.
Implementation steps (first 30 days)
Start tagging by Z-object
Purpose: make custom-code load visible.
How: add a required field in incident/problem/change templates for “affected Z-object(s)” (manual at first).
Success: top 10 Z-objects by AMS hours is measurable.
Name owners for the top hotspots
Purpose: stop orphan critical logic.
How: assign a functional owner + technical owner for each hotspot Z-object.
Success: “custom code with no named owner” decreases week over week.
Classify hotspots into the three types
Purpose: choose the right treatment.
How: quick workshop with L2–L4 + business owner using the source definitions.
Success: each hotspot has a type and a rule (test level, documentation, exit path).
Add “blast radius” and “rollback” to every change
Purpose: safer transports.
How: enforce two short sections in change records; link to impacted interfaces/batch chains.
Success: fewer regression incidents after transports.
Create a living documentation format
Purpose: searchable knowledge that survives turnover.
How: one page per Z-object: purpose, inputs/outputs, dependencies, common failures, tests, rollback notes.
Success: reduced time to onboard a new L3 engineer (qualitative, but visible).
Pilot AI-assisted mapping and drafting
Purpose: reduce manual context hunting.
How: use AI to suggest Z-object links and draft closure notes with citations to evidence.
Success: lower manual touch time per ticket; fewer reopenings due to missing steps.
Open one problem record per repeating Z-incident pattern
Purpose: move from closure to removal.
How: define “repeat” threshold (generalization) and trigger problem management.
Success: repeat rate for that pattern declines.
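A sketch of the trigger, assuming closed incidents carry a Z-object tag and a coarse error-pattern tag; the threshold value is a generalization — pick one that matches your release cadence:

```python
from collections import Counter

# Illustrative threshold, e.g. three occurrences within the review window.
REPEAT_THRESHOLD = 3

def repeating_patterns(incidents):
    """incidents: iterable of (z_object, error_pattern) tags from closed tickets.
    Returns the patterns that should each get exactly one problem record."""
    counts = Counter(incidents)
    return [pattern for pattern, n in counts.items() if n >= REPEAT_THRESHOLD]

incidents = [
    ("Z_IF_MAP_ORDERS", "idoc stuck"),
    ("Z_IF_MAP_ORDERS", "idoc stuck"),
    ("Z_IF_MAP_ORDERS", "idoc stuck"),
    ("Z_PRICING_CHECK", "wrong condition"),
]
print(repeating_patterns(incidents))  # [('Z_IF_MAP_ORDERS', 'idoc stuck')]
```

One pattern, one problem record: the counter only detects; problem management still owns the removal action.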
Decide “freeze / stabilize / move out” per hotspot
Purpose: stop endless patching.
How: for each hotspot, pick one strategic action from the source and track it.
Success: backlog aging improves for changes touching that Z-object.
Pitfalls and anti-patterns
- Automating broken intake: garbage tickets produce confident-looking garbage summaries.
- Trusting AI summaries without evidence links to logs/tickets/changes.
- Over-broad access for assistants (“it needs prod to be useful”).
- No separation of duties: the same person/agent proposes, approves, and executes.
- “Silent logic” in enhancements/exits with no reference documentation (explicitly called out in the source guardrails).
- Emergency fixes that bypass traceability, then become the new baseline.
- No functional owner: technical teams end up deciding business rules by accident.
- Noisy metrics: counting tickets closed while repeats grow.
- Fixing symptoms around broken custom logic (source anti-pattern).
- Hero developers guarding black boxes (source anti-pattern).
Checklist
- Incidents/problems/changes are tagged by Z-object
- Top Z-hotspots are visible by AMS hours and repeat incidents
- Each hotspot Z-object has a functional owner and technical owner
- Hotspots are classified: Core-Critical / Business Logic / Utility
- Every change includes blast radius + rollback + monitoring checks
- Emergency fixes are traceable and documented
- AI assistance is read-only by default, with approvals and audit trail
- One repeating pattern = one problem record with removal actions
FAQ
Is this safe in regulated environments?
Yes, if you enforce least privilege, separation of duties, approvals, and audit trails. The unsafe version is “auto-change in prod” without traceability.
How do we measure value beyond ticket counts?
Use the source metrics: AMS hours by Z-object, repeat incidents caused by custom code, and custom code with no named owner. Add change failure rate and reopen rate as operational signals (generalization).
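Change failure rate is the easiest of these to compute once changes are tagged; a sketch under the assumption that each change record notes whether a regression incident followed the transport:

```python
def change_failure_rate(changes):
    """changes: list of dicts with a boolean 'caused_regression' flag.
    Returns the fraction of transports followed by a regression incident."""
    if not changes:
        return 0.0  # no changes in the window, nothing to fail
    failed = sum(1 for c in changes if c["caused_regression"])
    return failed / len(changes)
```

Tracked per Z-object rather than globally, this metric points directly at the custom code whose changes are riskiest.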
What data do we need for RAG / knowledge retrieval?
Ticket text with decent closure notes, linked Z-objects, change records, runbooks, and post-transport regression notes. If you lack structure, start by tagging and standardizing closure fields.
How to start if the landscape is messy?
Don’t inventory everything. Start with the top 10 Z-objects by pain (hours/repeats). Contain those first, then expand.
Will moving business logic out of SAP always help?
Not always. It reduces volatility in core, but adds integration and ownership needs. Use it where logic changes frequently and causes repeated AMS load (as the source suggests).
Who owns the decision to retire or freeze Z-code?
Functional owner decides value; technical owner advises risk and effort; change authority controls production moves. Make this explicit to avoid stalemates.
Next action
Next week, run a 60-minute review of the last month’s L2–L4 work and produce one list: the top 10 Z-objects by repeat incidents and effort, with a named functional owner for each and a first decision—freeze, stabilize, or plan to move volatile logic out of SAP core.
MetalHatsCats Operational Intelligence — 2/20/2026
