Phase 3 · W17–W18
W17–W18: Ticket Data Modeling & Labeling
Create a structured, action-oriented ticket dataset and labeling baseline you can trust for real AI evaluation.
Suggested time: 4–6 hours/week
Outcomes
- A ticket schema with only the fields you actually need.
- A label taxonomy that matches real AMS actions and outcomes.
- A labeled sample dataset (50–200 tickets) for practical experiments.
- A baseline report showing label distribution and ambiguity patterns.
- A small trusted golden set for future prompt/model evaluation.
Deliverables
- A schema file used in code to validate ticket input.
- A taxonomy and labeling guideline doc with definitions and examples.
- A CSV/JSONL labeled sample with 50–200 tickets.
- A baseline report with distribution and ambiguity notes.
Prerequisites
- W15–W16: Observability (logs, metrics, alerts, dashboards)
W17–W18: Ticket Data Modeling & Labeling
What you’re doing
You’re building the foundation for the AI Ticket Analyzer, but like a grown-up.
Most “AI in support” fails because people do this:
- dump messy ticket text into an LLM
- get random outputs
- call it “automation”
No.
First you need a ticket dataset that has structure and truth.
Time: 4–6 hours/week
Output: a clean ticket dataset + a labeling scheme + a small labeled sample that you can evaluate
The promise (what you’ll have by the end)
By the end of W18 you will have:
- A ticket schema (fields you actually need)
- A label taxonomy that matches real AMS life
- A labeled dataset sample (50–200 tickets)
- A baseline report (label distribution, common patterns)
- A “golden set” for evaluation (small but trusted)
The rule: labels must be useful, not academic
Labels are not for ML papers.
Labels are for:
- routing
- triage
- reporting
- automation decisions
If a label doesn’t change an action, it’s useless.
Step 1: Define your ticket schema (minimum)
Your ticket record should have fields like:
- ticket_id
- created_at / updated_at
- system (AS4/PS4/etc., if you track system IDs)
- component (MDG / SD / FI / Interfaces / etc.)
- short_description
- description
- priority (optional)
- resolution_notes (optional)
- attachments_count (optional)
- related_object_keys (BP number, material, etc.; optional)
- tags (optional)
Keep it minimal. You can add later.
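To show how thin this can stay, here is a minimal sketch of the schema as a Pydantic v2 model (one of the options named under Deliverable A). Field names mirror the list above; the example values and IDs are made up, so adapt both to your landscape.

```python
# Minimal ticket schema sketch (assumes Pydantic v2).
from datetime import datetime
from typing import Optional

from pydantic import BaseModel, Field


class Ticket(BaseModel):
    ticket_id: str
    created_at: datetime
    updated_at: datetime
    system: Optional[str] = None            # e.g. "AS4", "PS4"
    component: Optional[str] = None         # e.g. "MDG", "SD", "FI", "Interfaces"
    short_description: str
    description: str
    priority: Optional[str] = None
    resolution_notes: Optional[str] = None
    attachments_count: int = 0
    related_object_keys: list[str] = Field(default_factory=list)  # BP number, material, ...
    tags: list[str] = Field(default_factory=list)


# Validate raw input before it goes anywhere near an LLM:
raw = {
    "ticket_id": "INC-1001",
    "created_at": "2025-01-10T08:15:00",
    "updated_at": "2025-01-10T09:00:00",
    "short_description": "BP not replicated to S/4",
    "description": "Business partner 4711 changed in MDG but missing in S/4.",
}
ticket = Ticket.model_validate(raw)  # raises ValidationError on bad input
```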
Step 2: Define label taxonomy (your categories)
Use categories that match your real world.
Start with 6–10 top-level labels, for example:
- UI_CLIENT_ACTION (must be done via client/GUI)
- DEV_REQUIRED (code change needed)
- AFS_MASTERDATA_CHANGE (change needed in AFS)
- MDG_MASTERDATA_CHANGE (change needed in MDG)
- MDG_S4_MISMATCH (error between MDG and S/4 sync)
- INTERFACE_ISSUE (inbound/outbound interface failure)
- CONFIG_CUSTOMIZING (settings/customizing issue)
- DATA_QUALITY (bad/invalid data)
- ACCESS_AUTH (authorization issue)
Don’t create 40 labels. Keep it usable.
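To keep the taxonomy closed so that typos from annotators or LLM output can’t invent new labels, one sketch is a plain Python enum. The members below are just the example labels from the list above; rename them to whatever set you settle on.

```python
# Closed label set sketch: unknown strings fail loudly instead of becoming new labels.
from enum import Enum


class TicketLabel(str, Enum):
    UI_CLIENT_ACTION = "UI_CLIENT_ACTION"
    DEV_REQUIRED = "DEV_REQUIRED"
    AFS_MASTERDATA_CHANGE = "AFS_MASTERDATA_CHANGE"
    MDG_MASTERDATA_CHANGE = "MDG_MASTERDATA_CHANGE"
    MDG_S4_MISMATCH = "MDG_S4_MISMATCH"
    INTERFACE_ISSUE = "INTERFACE_ISSUE"
    CONFIG_CUSTOMIZING = "CONFIG_CUSTOMIZING"
    DATA_QUALITY = "DATA_QUALITY"
    ACCESS_AUTH = "ACCESS_AUTH"


label = TicketLabel("INTERFACE_ISSUE")  # a typo here raises ValueError
```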
Step 3: Labeling rules (so it’s consistent)
Write simple labeling guidelines:
- what each label means
- 2–3 examples
- how to handle ambiguous cases
- multi-label allowed or not (choose one)
I recommend:
- allow up to 2 labels max
- always pick a primary label
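Here is one way those two rules could be enforced in code. It’s a sketch that assumes the hypothetical TicketLabel enum from the previous example is in scope; the function name is an assumption, not part of any library.

```python
# Sketch of the labeling rules: 1–2 labels, primary first, no duplicates.
# Assumes the TicketLabel enum from the previous sketch is in scope.
def validate_labels(labels: list[str]) -> tuple[str, str | None]:
    """Return (primary, secondary) or raise if the rules are broken."""
    if not 1 <= len(labels) <= 2:
        raise ValueError("Each ticket needs 1 or 2 labels, primary first.")
    validated = [TicketLabel(name).value for name in labels]  # rejects unknown labels
    primary = validated[0]
    secondary = validated[1] if len(validated) == 2 else None
    if primary == secondary:
        raise ValueError("Primary and secondary label must differ.")
    return primary, secondary


print(validate_labels(["MDG_S4_MISMATCH", "DATA_QUALITY"]))
# ('MDG_S4_MISMATCH', 'DATA_QUALITY')
```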
Step 4: Create your labeled sample
You need 50–200 tickets.
If you don’t have real tickets:
- create synthetic tickets based on patterns you know
- or anonymize old notes
- or build a mock CSV
The point is to have realistic text.
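For reference, a sketch of what one labeled JSONL row could look like and how the file gets written. The file name labeled_sample.jsonl and the fields primary_label/secondary_label are assumptions, not a standard; just keep them consistent across the sample and the golden set.

```python
# Sketch of one labeled JSONL row and how the file gets written.
import json

rows = [
    {
        "ticket_id": "INC-1001",
        "short_description": "BP not replicated to S/4",
        "description": "Business partner 4711 changed in MDG but missing in S/4.",
        "primary_label": "MDG_S4_MISMATCH",
        "secondary_label": "DATA_QUALITY",
    },
    # ... 50–200 of these
]

with open("labeled_sample.jsonl", "w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row, ensure_ascii=False) + "\n")
```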
Step 5: Build a “golden set”
Golden set = small set you trust.
Example: 30 tickets labeled carefully.
This is what you use later to compare models/prompts.
Without this, you will lie to yourself about quality.
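To make “compare models/prompts” concrete, a minimal sketch of how the golden set gets consumed later: read the trusted labels, compare against predictions, report plain accuracy. File and field names follow the earlier sketches and are assumptions.

```python
# Sketch: score predicted primary labels against the golden set.
import json


def evaluate(golden_path: str, predictions: dict[str, str]) -> float:
    """predictions maps ticket_id -> predicted primary label."""
    total = correct = 0
    with open(golden_path, encoding="utf-8") as f:
        for line in f:
            row = json.loads(line)
            total += 1
            if predictions.get(row["ticket_id"]) == row["primary_label"]:
                correct += 1
    return correct / total if total else 0.0


# Run this after every prompt or model change, against the same 30 tickets.
```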
Deliverables (you must ship these)
Deliverable A — Ticket schema
- A schema file exists (JSON schema / TS type / Pydantic)
- It’s used in code to validate input
Deliverable B — Label taxonomy + guidelines
- A doc exists with label definitions + examples
Deliverable C — Labeled dataset sample
- A CSV/JSONL file exists with labeled tickets (50–200)
Deliverable D — Baseline report
- Label distribution chart/table
- Top patterns and common ambiguity notes
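The distribution table is a few lines of Python if the sample is in JSONL (same assumed file and field names as in the earlier sketches):

```python
# Sketch: primary-label distribution for the baseline report.
import json
from collections import Counter

counts = Counter()
with open("labeled_sample.jsonl", encoding="utf-8") as f:
    for line in f:
        counts[json.loads(line)["primary_label"]] += 1

total = sum(counts.values())
for label, n in counts.most_common():
    print(f"{label:<25} {n:>4}  {n / total:6.1%}")
```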
Common traps (don’t do this)
- Trap 1: “I’ll label thousands later.”
No. Start small. Do it correctly.
- Trap 2: “Labels based on SAP modules only.”
No. Labels must reflect actions (dev vs data vs config vs manual).
- Trap 3: “No golden set.”
Without a golden set you can’t tell if you improved or just changed outputs.
Quick self-check (2 minutes)
Answer yes/no:
- Do I have a schema that makes tickets structured?
- Are my labels action-oriented?
- Do I have 50–200 labeled examples?
- Do I have a golden set I trust?
- Can I explain labeling rules in 60 seconds?
If any “no” — fix it before moving on.
Next module preview (W19–W20)
Next: Prompting Patterns for Ops.
We’ll build safe prompts, constraints, and outputs that won’t embarrass you in production.