Phase 3 · W17–W18
W17–W18: Ticket Data Modeling & Labeling
Create a structured, action-oriented ticket dataset and labeling baseline you can trust for real AI evaluation.
Suggested time: 4–6 hours/week
Outcomes
- A ticket schema with only the fields you actually need.
- A label taxonomy that matches real AMS actions and outcomes.
- A labeled sample dataset (50–200 tickets) for practical experiments.
- A baseline report showing label distribution and ambiguity patterns.
- A small trusted golden set for future prompt/model evaluation.
Deliverables
- A schema file used in code to validate ticket input.
- A taxonomy and labeling guideline doc with definitions and examples.
- A CSV/JSONL labeled sample with 50–200 tickets.
- A baseline report with distribution and ambiguity notes.
Prerequisites
- W15–W16: Observability (logs, metrics, alerts, dashboards)
W17–W18: Ticket Data Modeling & Labeling
What you’re doing
You’re building the foundation for the AI Ticket Analyzer, but like a grown-up.
Most “AI in support” fails because people do this:
- dump messy ticket text into an LLM
- get random outputs
- call it “automation”
No.
First you need a ticket dataset that has structure and truth.
Time: 4–6 hours/week
Output: a clean ticket dataset + a labeling scheme + a small labeled sample that you can evaluate
The promise (what you’ll have by the end)
By the end of W18 you will have:
- A ticket schema (fields you actually need)
- A label taxonomy that matches real AMS life
- A labeled dataset sample (50–200 tickets)
- A baseline report (label distribution, common patterns)
- A “golden set” for evaluation (small but trusted)
The rule: labels must be useful, not academic
Labels are not for ML papers.
Labels are for:
- routing
- triage
- reporting
- automation decisions
If a label doesn’t change an action, it’s useless.
Step 1: Define your ticket schema (minimum)
Your ticket record should have fields like:
- ticket_id
- created_at / updated_at
- system (AS4/PS4/etc., if you track system IDs)
- component (MDG / SD / FI / Interfaces / etc.)
- short_description
- description
- priority (optional)
- resolution_notes (optional)
- attachments_count (optional)
- related_object_keys (BP number, material, etc.; optional)
- tags (optional)
Keep it minimal. You can add later.
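To show how thin this can stay, here is a minimal sketch of the schema as a Pydantic v2 model (one of the options named under Deliverable A). Field names mirror the list above; the example values and IDs are made up, so adapt both to your landscape.

```python
# Minimal ticket schema sketch (assumes Pydantic v2).
from datetime import datetime
from typing import Optional

from pydantic import BaseModel, Field


class Ticket(BaseModel):
    ticket_id: str
    created_at: datetime
    updated_at: datetime
    system: Optional[str] = None            # e.g. "AS4", "PS4"
    component: Optional[str] = None         # e.g. "MDG", "SD", "FI", "Interfaces"
    short_description: str
    description: str
    priority: Optional[str] = None
    resolution_notes: Optional[str] = None
    attachments_count: int = 0
    related_object_keys: list[str] = Field(default_factory=list)  # BP number, material, ...
    tags: list[str] = Field(default_factory=list)


# Validate raw input before it goes anywhere near an LLM:
raw = {
    "ticket_id": "INC-1001",
    "created_at": "2025-01-10T08:15:00",
    "updated_at": "2025-01-10T09:00:00",
    "short_description": "BP not replicated to S/4",
    "description": "Business partner 4711 changed in MDG but missing in S/4.",
}
ticket = Ticket.model_validate(raw)  # raises ValidationError on bad input
```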
Step 2: Define label taxonomy (your categories)
Use categories that match your real world.
Start with 6–10 top-level labels, for example:
- UI_CLIENT_ACTION (must be done via client/GUI)
- DEV_REQUIRED (code change needed)
- AFS_MASTERDATA_CHANGE (change needed in AFS)
- MDG_MASTERDATA_CHANGE (change needed in MDG)
- MDG_S4_MISMATCH (error between MDG and S/4 sync)
- INTERFACE_ISSUE (inbound/outbound interface failure)
- CONFIG_CUSTOMIZING (settings/customizing issue)
- DATA_QUALITY (bad/invalid data)
- ACCESS_AUTH (authorization issue)
Don’t create 40 labels. Keep it usable.
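To keep the taxonomy closed so that typos from annotators or LLM output can’t invent new labels, one sketch is a plain Python enum. The members below are just the example labels from the list above; rename them to whatever set you settle on.

```python
# Closed label set sketch: unknown strings fail loudly instead of becoming new labels.
from enum import Enum


class TicketLabel(str, Enum):
    UI_CLIENT_ACTION = "UI_CLIENT_ACTION"
    DEV_REQUIRED = "DEV_REQUIRED"
    AFS_MASTERDATA_CHANGE = "AFS_MASTERDATA_CHANGE"
    MDG_MASTERDATA_CHANGE = "MDG_MASTERDATA_CHANGE"
    MDG_S4_MISMATCH = "MDG_S4_MISMATCH"
    INTERFACE_ISSUE = "INTERFACE_ISSUE"
    CONFIG_CUSTOMIZING = "CONFIG_CUSTOMIZING"
    DATA_QUALITY = "DATA_QUALITY"
    ACCESS_AUTH = "ACCESS_AUTH"


label = TicketLabel("INTERFACE_ISSUE")  # a typo here raises ValueError
```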
Step 3: Labeling rules (so it’s consistent)
Write simple labeling guidelines:
- what each label means
- 2–3 examples
- how to handle ambiguous cases
- multi-label allowed or not (choose one)
I recommend:
- allow up to 2 labels max
- always pick a primary label
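Here is one way those two rules could be enforced in code. It’s a sketch that assumes the hypothetical TicketLabel enum from the previous example is in scope; the function name is an assumption, not part of any library.

```python
# Sketch of the labeling rules: 1–2 labels, primary first, no duplicates.
# Assumes the TicketLabel enum from the previous sketch is in scope.
def validate_labels(labels: list[str]) -> tuple[str, str | None]:
    """Return (primary, secondary) or raise if the rules are broken."""
    if not 1 <= len(labels) <= 2:
        raise ValueError("Each ticket needs 1 or 2 labels, primary first.")
    validated = [TicketLabel(name).value for name in labels]  # rejects unknown labels
    primary = validated[0]
    secondary = validated[1] if len(validated) == 2 else None
    if primary == secondary:
        raise ValueError("Primary and secondary label must differ.")
    return primary, secondary


print(validate_labels(["MDG_S4_MISMATCH", "DATA_QUALITY"]))
# ('MDG_S4_MISMATCH', 'DATA_QUALITY')
```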
Step 4: Create your labeled sample
You need 50–200 tickets.
If you don’t have real tickets:
- create synthetic tickets based on patterns you know
- or anonymize old notes
- or build a mock CSV
The point is to have realistic text.
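For reference, a sketch of what one labeled JSONL row could look like and how the file gets written. The file name labeled_sample.jsonl and the fields primary_label/secondary_label are assumptions, not a standard; just keep them consistent across the sample and the golden set.

```python
# Sketch of one labeled JSONL row and how the file gets written.
import json

rows = [
    {
        "ticket_id": "INC-1001",
        "short_description": "BP not replicated to S/4",
        "description": "Business partner 4711 changed in MDG but missing in S/4.",
        "primary_label": "MDG_S4_MISMATCH",
        "secondary_label": "DATA_QUALITY",
    },
    # ... 50–200 of these
]

with open("labeled_sample.jsonl", "w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row, ensure_ascii=False) + "\n")
```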
Step 5: Build a “golden set”
Golden set = small set you trust.
Example: 30 tickets labeled carefully.
This is what you use later to compare models/prompts.
Without this, you will lie to yourself about quality.
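To make “compare models/prompts” concrete, a minimal sketch of how the golden set gets consumed later: read the trusted labels, compare against predictions, report plain accuracy. File and field names follow the earlier sketches and are assumptions.

```python
# Sketch: score predicted primary labels against the golden set.
import json


def evaluate(golden_path: str, predictions: dict[str, str]) -> float:
    """predictions maps ticket_id -> predicted primary label."""
    total = correct = 0
    with open(golden_path, encoding="utf-8") as f:
        for line in f:
            row = json.loads(line)
            total += 1
            if predictions.get(row["ticket_id"]) == row["primary_label"]:
                correct += 1
    return correct / total if total else 0.0


# Run this after every prompt or model change, against the same 30 tickets.
```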
Deliverables (you must ship these)
Deliverable A — Ticket schema
- A schema file exists (JSON schema / TS type / Pydantic)
- It’s used in code to validate input
Deliverable B — Label taxonomy + guidelines
- A doc exists with label definitions + examples
Deliverable C — Labeled dataset sample
- A CSV/JSONL file exists with labeled tickets (50–200)
Deliverable D — Baseline report
- Label distribution chart/table
- Top patterns and common ambiguity notes
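The distribution table is a few lines of Python if the sample is in JSONL (same assumed file and field names as in the earlier sketches):

```python
# Sketch: primary-label distribution for the baseline report.
import json
from collections import Counter

counts = Counter()
with open("labeled_sample.jsonl", encoding="utf-8") as f:
    for line in f:
        counts[json.loads(line)["primary_label"]] += 1

total = sum(counts.values())
for label, n in counts.most_common():
    print(f"{label:<25} {n:>4}  {n / total:6.1%}")
```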
Common traps (don’t do this)
- Trap 1: “I’ll label thousands later.”
No. Start small. Do it correctly.
- Trap 2: “Labels based on SAP modules only.”
No. Labels must reflect actions (dev vs data vs config vs manual).
- Trap 3: “No golden set.”
Without a golden set you can’t tell if you improved or just changed outputs.
Quick self-check (2 minutes)
Answer yes/no:
- Do I have a schema that makes tickets structured?
- Are my labels action-oriented?
- Do I have 50–200 labeled examples?
- Do I have a golden set I trust?
- Can I explain labeling rules in 60 seconds?
If any “no” — fix it before moving on.
Next module preview (W19–W20)
Next: Prompting Patterns for Ops.
We’ll build safe prompts, constraints, and outputs that won’t embarrass you in production.