Phase 3 · W17–W18

W17–W18: Ticket Data Modeling & Labeling

Create a structured, action-oriented ticket dataset and labeling baseline you can trust for real AI evaluation.

Suggested time: 4–6 hours/week

Outcomes

  • A ticket schema with only the fields you actually need.
  • A label taxonomy that matches real AMS actions and outcomes.
  • A labeled sample dataset (50–200 tickets) for practical experiments.
  • A baseline report showing label distribution and ambiguity patterns.
  • A small trusted golden set for future prompt/model evaluation.

Deliverables

  • A schema file used in code to validate ticket input.
  • A taxonomy and labeling guideline doc with definitions and examples.
  • A CSV/JSONL labeled sample with 50–200 tickets.
  • A baseline report with distribution and ambiguity notes.

Prerequisites

  • W15–W16: Observability (logs, metrics, alerts, dashboards)

W17–W18: Ticket Data Modeling & Labeling

What you’re doing

You’re building the foundation for AI Ticket Analyzer, but like a grown-up.

Most “AI in support” fails because people do this:

  • dump messy ticket text into an LLM
  • get random outputs
  • call it “automation”

No.
First you need a ticket dataset that has structure and truth.

Time: 4–6 hours/week
Output: a clean ticket dataset + a labeling scheme + a small labeled sample that you can evaluate


The promise (what you’ll have by the end)

By the end of W18 you will have:

  • A ticket schema (fields you actually need)
  • A label taxonomy that matches real AMS life
  • A labeled dataset sample (50–200 tickets)
  • A baseline report (label distribution, common patterns)
  • A “golden set” for evaluation (small but trusted)

The rule: labels must be useful, not academic

Labels are not for ML papers.
Labels are for:

  • routing
  • triage
  • reporting
  • automation decisions

If a label doesn’t change an action, it’s useless.
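
To make “changes an action” concrete: every label should map to a different next step. A tiny illustration (the label names anticipate Step 2; the actions are placeholders, not your real queues):

```python
# Hypothetical routing map: if two labels would trigger the same action, merge them.
NEXT_ACTION = {
    "DEV_REQUIRED": "route to development backlog",
    "INTERFACE_ISSUE": "route to integration on-call",
    "ACCESS_AUTH": "route to authorization team",
    "DATA_QUALITY": "route to data steward for correction",
    "UI_CLIENT_ACTION": "reply with manual client/GUI steps",
}
```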


Step 1: Define your ticket schema (minimum)

Your ticket record should have fields like:

  • ticket_id
  • created_at / updated_at
  • system (AS4/PS4/etc. if you use that)
  • component (MDG / SD / FI / Interfaces / etc.)
  • short_description
  • description
  • priority (optional)
  • resolution_notes (optional)
  • attachments_count (optional)
  • related_object_keys (BP number, material, etc.; optional)
  • tags (optional)

Keep it minimal. You can add later.
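
A minimal sketch of that schema as a Pydantic model (Pydantic v2 assumed; optional fields are nullable, and the module name ticket_schema.py is only there so the later examples can refer to it):

```python
# ticket_schema.py: the minimal ticket record, used to validate every incoming ticket
from datetime import datetime
from typing import Optional

from pydantic import BaseModel, Field


class Ticket(BaseModel):
    ticket_id: str
    created_at: datetime
    updated_at: datetime
    system: Optional[str] = None            # e.g. AS4/PS4, if you track system IDs
    component: Optional[str] = None         # MDG / SD / FI / Interfaces / ...
    short_description: str
    description: str
    priority: Optional[str] = None
    resolution_notes: Optional[str] = None
    attachments_count: int = 0
    related_object_keys: list[str] = Field(default_factory=list)  # BP number, material, ...
    tags: list[str] = Field(default_factory=list)
```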


Step 2: Define label taxonomy (your categories)

Use categories that match your real world.

Start with 6–10 top-level labels, for example:

  • UI_CLIENT_ACTION (must be done via client/GUI)
  • DEV_REQUIRED (code change needed)
  • AFS_MASTERDATA_CHANGE (change needed in AFS)
  • MDG_MASTERDATA_CHANGE (change needed in MDG)
  • MDG_S4_MISMATCH (error between MDG and S/4 sync)
  • INTERFACE_ISSUE (inbound/outbound interface failure)
  • CONFIG_CUSTOMIZING (settings/customizing issue)
  • DATA_QUALITY (bad/invalid data)
  • ACCESS_AUTH (authorization issue)

Don’t create 40 labels. Keep it usable.
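
Pinning the taxonomy down as an enum keeps labels from drifting into free text, which is what happens when people type categories by hand (a sketch using the example labels above; the module name labels.py is illustrative):

```python
# labels.py: the taxonomy as an enum, so a typo can't invent a new category
from enum import Enum


class TicketLabel(str, Enum):
    UI_CLIENT_ACTION = "UI_CLIENT_ACTION"            # must be done via client/GUI
    DEV_REQUIRED = "DEV_REQUIRED"                    # code change needed
    AFS_MASTERDATA_CHANGE = "AFS_MASTERDATA_CHANGE"  # change needed in AFS
    MDG_MASTERDATA_CHANGE = "MDG_MASTERDATA_CHANGE"  # change needed in MDG
    MDG_S4_MISMATCH = "MDG_S4_MISMATCH"              # error between MDG and S/4 sync
    INTERFACE_ISSUE = "INTERFACE_ISSUE"              # inbound/outbound interface failure
    CONFIG_CUSTOMIZING = "CONFIG_CUSTOMIZING"        # settings/customizing issue
    DATA_QUALITY = "DATA_QUALITY"                    # bad/invalid data
    ACCESS_AUTH = "ACCESS_AUTH"                      # authorization issue
```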


Step 3: Labeling rules (so it’s consistent)

Write simple labeling guidelines:

  • what each label means
  • 2–3 examples
  • how to handle ambiguous cases
  • multi-label allowed or not (choose one)

I recommend (enforced in the sketch below):

  • allow at most 2 labels
  • always pick a primary label
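
A small sketch of how both rules can be enforced at load time, building on the Ticket and TicketLabel sketches above (module and field names are illustrative):

```python
# labeled_ticket.py: one primary label, at most one different secondary label
from typing import Optional

from pydantic import BaseModel, model_validator

from labels import TicketLabel
from ticket_schema import Ticket


class LabeledTicket(BaseModel):
    ticket: Ticket
    primary_label: TicketLabel
    secondary_label: Optional[TicketLabel] = None
    ambiguity_note: Optional[str] = None  # why the case was hard, if it was

    @model_validator(mode="after")
    def check_labels(self) -> "LabeledTicket":
        # A secondary label that repeats the primary adds nothing, so reject it.
        if self.secondary_label == self.primary_label:
            raise ValueError("secondary_label must differ from primary_label")
        return self
```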

Step 4: Create your labeled sample

You need 50–200 tickets.
If you don’t have real tickets:

  • create synthetic tickets based on patterns you know
  • or anonymize old notes
  • or build a mock CSV

The point is to have realistic text.
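
However you produce the 50–200 tickets, run them through the schema once so the file you label against is already clean. A loader sketch, assuming JSONL and the LabeledTicket model above (the file name tickets_labeled.jsonl is hypothetical):

```python
# load_sample.py: load the labeled JSONL sample and reject malformed rows
import json
from pathlib import Path

from pydantic import ValidationError

from labeled_ticket import LabeledTicket


def load_labeled_sample(path: Path) -> list[LabeledTicket]:
    records, rejected = [], 0
    for line_no, line in enumerate(path.read_text(encoding="utf-8").splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            records.append(LabeledTicket.model_validate(json.loads(line)))
        except (json.JSONDecodeError, ValidationError) as exc:
            rejected += 1
            print(f"line {line_no}: rejected ({exc})")
    print(f"loaded {len(records)} labeled tickets, rejected {rejected}")
    return records


if __name__ == "__main__":
    load_labeled_sample(Path("tickets_labeled.jsonl"))
```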


Step 5: Build a “golden set”

Golden set = small set you trust.
Example: 30 tickets labeled carefully.

This is what you use later to compare models/prompts.
Without this, you will lie to yourself about quality.
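
One simple way to carve out a golden set: take a fixed-seed sample from the labeled file, re-review every ticket by hand, then freeze it. A sketch building on the loader above (the 30-ticket size and file names are just the running example):

```python
# make_golden_set.py: carve a reproducible golden set out of the labeled sample
import random
from pathlib import Path

from load_sample import load_labeled_sample

GOLDEN_SIZE = 30  # small but trusted, per the running example
SEED = 17         # fixed seed so the split is reproducible

labeled = load_labeled_sample(Path("tickets_labeled.jsonl"))
random.Random(SEED).shuffle(labeled)
golden = labeled[:GOLDEN_SIZE]

with Path("golden_set.jsonl").open("w", encoding="utf-8") as fh:
    for record in golden:
        fh.write(record.model_dump_json() + "\n")

print(f"wrote {len(golden)} golden tickets; now re-review each one by hand before trusting it")
```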


Deliverables (you must ship these)

Deliverable A — Ticket schema

  • A schema file exists (JSON schema / TS type / Pydantic)
  • It’s used in code to validate input
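
“Used in code” can be as small as rejecting malformed records at the door (assuming the Ticket sketch from Step 1; the sample record is made up):

```python
# validate_input.py: reject a malformed ticket before it touches anything else
from pydantic import ValidationError

from ticket_schema import Ticket

raw = {"ticket_id": "INC-1001", "short_description": "BP sync failed"}  # deliberately incomplete

try:
    ticket = Ticket.model_validate(raw)
except ValidationError as exc:
    # Missing created_at, updated_at and description will show up here.
    print(f"rejected: {exc.error_count()} problems")
```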

Deliverable B — Label taxonomy + guidelines

  • A doc exists with label definitions + examples

Deliverable C — Labeled dataset sample

  • A CSV/JSONL file exists with labeled tickets (50–200)

Deliverable D — Baseline report

  • Label distribution chart/table
  • Top patterns and common ambiguity notes
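
The distribution table is a few lines with collections.Counter once the sample loads cleanly (a sketch building on the loader above):

```python
# baseline_report.py: primary-label distribution for the baseline report
from collections import Counter
from pathlib import Path

from load_sample import load_labeled_sample

labeled = load_labeled_sample(Path("tickets_labeled.jsonl"))
distribution = Counter(record.primary_label.value for record in labeled)

print(f"{'label':<26}{'count':>7}{'share':>8}")
for label, count in distribution.most_common():
    print(f"{label:<26}{count:>7}{count / len(labeled):>8.0%}")
```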

Common traps (don’t do this)

  • Trap 1: “I’ll label thousands later.”

No. Start small. Do it correctly.

  • Trap 2: “Labels based on SAP modules only.”

No. Labels must reflect actions (dev vs data vs config vs manual).

  • Trap 3: “No golden set.”

Without a golden set you can’t tell if you improved or just changed the outputs.

Quick self-check (2 minutes)

Answer yes/no:

  • Do I have a schema that makes tickets structured?
  • Are my labels action-oriented?
  • Do I have 50–200 labeled examples?
  • Do I have a golden set I trust?
  • Can I explain labeling rules in 60 seconds?

If any “no” — fix it before moving on.


Next module preview (W19–W20)

Next: Prompting Patterns for Ops.
We’ll build safe prompts, constraints, and outputs that won’t embarrass you in production.