Phase 4 · W33–W34

W33–W34: Guardrails & Governance (what is allowed, what is not)

Establish clear governance and guardrails so the knowledge system stays safe, source-grounded, and policy-compliant.

Suggested time: 4–6 hours/week

Outcomes

  • A clear “allowed vs forbidden” content policy.
  • A sensitivity classification (public/internal/restricted).
  • A safe-answer protocol (“answers with sources or no answer”).
  • A redaction/anonymization guideline.
  • Simple enforcement: checks during ingestion and during answering.

Deliverables

  • Governance policy doc with allowed/forbidden content and sensitivity levels.
  • Redaction guideline with rules, examples, and uncertain-case handling.
  • Safe-answer protocol covering cite-or-refuse, unknown, and restricted behavior.
  • Enforcement checks at ingestion and answering time.

Prerequisites

  • W31–W32: Retrieval Quality (search, ranking, relevance)

What you’re doing

You stop treating “knowledge base” like a cute wiki.

A KB + RAG system becomes a liability fast if:

  • it leaks sensitive data
  • it gives confident wrong answers
  • it tells people to make risky changes
  • it invents facts

So you build guardrails and governance like an adult.

Time: 4–6 hours/week
Output: a governance policy + content rules + safe-answer behavior + basic enforcement checks


The promise (what you’ll have by the end)

By the end of W34 you will have:

  • A clear “allowed vs forbidden” content policy
  • A sensitivity classification (public/internal/restricted)
  • A safe-answer protocol (“answers with sources or no answer”)
  • A redaction/anonymization guideline
  • Simple enforcement: checks during ingestion and during answering

The rule: if you can’t cite it, don’t claim it

Your system must prefer:

  • “Here are sources”

over

  • “Trust me”

And it must be allowed to say:

  • “I don’t know”
  • “Not enough info”
  • “This is restricted”

No hero mode.


Step-by-step checklist

1) Define sensitivity levels

Start simple:

  • PUBLIC (safe to publish)
  • INTERNAL (company/team internal)
  • RESTRICTED (PII, credentials, customer data, sensitive configs)

Everything ingested must have a sensitivity label.
If you can’t label it, don’t ingest it.
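A minimal sketch of the "no label, no ingest" rule, assuming documents are plain Python objects. The names `Sensitivity`, `Document`, and `require_label` are made up for illustration; map them onto whatever your pipeline actually uses.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Sensitivity(IntEnum):
    """Ordered levels, so PUBLIC < INTERNAL < RESTRICTED can be compared."""
    PUBLIC = 0
    INTERNAL = 1
    RESTRICTED = 2


@dataclass
class Document:
    # Hypothetical document shape, not a real library type.
    doc_id: str
    text: str
    sensitivity: Optional[Sensitivity] = None


def require_label(doc: Document) -> Document:
    """Reject any document that arrives without a sensitivity label."""
    if not isinstance(doc.sensitivity, Sensitivity):
        raise ValueError(f"{doc.doc_id}: no sensitivity label, refusing to ingest")
    return doc
```

Calling `require_label` as the first step of ingestion means an unlabeled doc fails loudly instead of slipping in with an implicit default.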

2) Define “allowed content”

Examples:

  • runbooks without secrets
  • RCA summaries without personal data
  • mapping rules without customer identifiers
  • interface descriptions at conceptual level
  • known issues summaries

3) Define “forbidden content”

Examples:

  • credentials, tokens, keys
  • full customer records
  • personal addresses/names
  • screenshots with sensitive info
  • anything that violates policy or contracts

Write it down clearly. Not vague.

4) Add redaction/anonymization rules

Rules like:

  • replace IDs with placeholders
  • mask emails
  • remove addresses
  • remove attachments or store only metadata

If you can’t anonymize safely, exclude.
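The masking rules above could look roughly like this. It is a sketch under assumptions: the email pattern is deliberately simple, and the `CUST-<digits>` ID format is invented for the example, so swap in the ID formats your systems really use. Street addresses are not handled here because a regex rarely catches them reliably; those cases belong on the "exclude" path.

```python
import re

# Simple email pattern; good enough for masking, not for validation.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Hypothetical customer ID format: "CUST-" followed by digits.
CUSTOMER_ID_RE = re.compile(r"\bCUST-\d+\b")


def redact(text: str) -> str:
    """Replace emails and customer IDs with fixed placeholders."""
    text = EMAIL_RE.sub("<EMAIL>", text)
    text = CUSTOMER_ID_RE.sub("<CUSTOMER_ID>", text)
    return text


print(redact("Contact jane.doe@example.com about CUST-12345."))
# prints: Contact <EMAIL> about <CUSTOMER_ID>.
```

Fixed placeholders keep the sentence readable for retrieval while removing the identifying value.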

5) Define safe-answer behavior

Your answer policy should be:

  • Always provide citations (source chunks)
  • If no good source found → say “I don’t know” + suggest where to look
  • If restricted content would be required → refuse and explain restriction
  • Never invent configuration steps or transactions if not in sources
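One way to make the three paths (cite, unknown, restricted) explicit is a small decision function that runs before any answer text is generated. This is a sketch: the `Chunk` shape, the `restricted` flag, and the `MIN_SCORE` threshold are assumptions, not part of any particular RAG framework.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    # Hypothetical retrieval result.
    source: str       # document title or path used as the citation
    text: str
    score: float      # retrieval relevance, higher is better
    restricted: bool  # True if the current caller may not see this chunk


MIN_SCORE = 0.5  # assumed threshold; tune it against your own retrieval tests


def decide(chunks: List[Chunk]) -> dict:
    """Return one of three outcomes: answer with citations, restricted, unknown."""
    relevant = [c for c in chunks if c.score >= MIN_SCORE]
    usable = [c for c in relevant if not c.restricted]
    if usable:
        return {"status": "answer", "citations": [c.source for c in usable]}
    if relevant:
        # Good sources exist, but every one of them is off-limits to this caller.
        return {"status": "restricted", "message": "This is restricted."}
    return {"status": "unknown", "message": "I don't know. Not enough info in the sources."}
```

Only the `answer` status should reach the model, and only with the `usable` chunks as context. The other two statuses return a fixed message, which leaves no room for invented steps.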

6) Add enforcement checks

During ingestion:

  • block docs labeled RESTRICTED unless your policy explicitly allows them in this knowledge base
  • run a simple scanner for secrets (patterns like “password=”, “token=”, etc.)
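A basic version of both ingestion checks might look like the sketch below. The patterns are illustrative and will produce false positives and misses; the function names and the `allow_restricted` switch are assumptions for this example. A dedicated secret scanner is the better long-term choice.

```python
import re
from typing import List

# Illustrative patterns only; extend them for the secrets your systems use.
SECRET_PATTERNS = [
    re.compile(r"(?i)(password|passwd|pwd)\s*[=:]\s*\S+"),
    re.compile(r"(?i)(token|api[_-]?key|secret)\s*[=:]\s*\S+"),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]


def find_secrets(text: str) -> List[str]:
    """Return the line numbers (as 'line N') that look like they hold a secret."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if any(p.search(line) for p in SECRET_PATTERNS):
            hits.append(f"line {lineno}")
    return hits


def ingestion_allowed(text: str, label: str, allow_restricted: bool = False) -> bool:
    """Block RESTRICTED docs unless explicitly allowed, and anything with a secret hit."""
    if label == "RESTRICTED" and not allow_restricted:
        return False
    return not find_secrets(text)
```

Reporting line numbers rather than the matched text keeps the secret itself out of your ingestion logs.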

During answering:

  • only retrieve chunks allowed for the current context
  • refuse if the answer would need chunks above the sensitivity level allowed for the current context
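Answer-time filtering can be as small as the sketch below, assuming each chunk carries its label as a plain string under a `sensitivity` metadata key and each caller has a single clearance level. Both are assumptions for illustration. Unlabeled chunks are treated as RESTRICTED so the filter fails closed.

```python
from typing import Dict, List

LEVELS = {"PUBLIC": 0, "INTERNAL": 1, "RESTRICTED": 2}


def filter_chunks(chunks: List[Dict[str, str]], clearance: str) -> List[Dict[str, str]]:
    """Keep only chunks whose label is at or below the caller's clearance."""
    limit = LEVELS[clearance]
    # A missing or unknown label counts as RESTRICTED (fail closed).
    return [
        c for c in chunks
        if LEVELS.get(c.get("sensitivity", ""), LEVELS["RESTRICTED"]) <= limit
    ]
```

Apply the filter before the chunks reach the model, not after generation. If the filter removes everything relevant, that is the sensitivity mismatch case: refuse instead of answering from what is left.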

Start basic. It already helps a lot.


Deliverables (you must ship these)

Deliverable A — Governance policy doc

  • allowed vs forbidden
  • sensitivity levels
  • ownership and update rules

Deliverable B — Redaction guideline

  • clear rules + examples
  • “what to do when unsure”

Deliverable C — Safe-answer protocol

  • “cite or refuse”
  • “unknown path”
  • “restricted path”

Deliverable D — Enforcement checks

  • ingestion-time checks exist
  • answering-time filtering exists (even if basic)

Common traps (don’t do this)

  • Trap 1: “We’ll add governance later.”

Later means you ship a liability.

  • Trap 2: “The model will behave.”

No. You need constraints and rules.

  • Trap 3: “Everything is internal anyway.”

That’s how leaks happen. Label content properly.

Quick self-check (2 minutes)

Answer yes/no:

  • Does every doc/chunk have a sensitivity label?
  • Do I have a written allowed/forbidden policy?
  • Does the system cite sources or refuse?
  • Do ingestion and retrieval enforce sensitivity rules?
  • Do we have a clear redaction/anonymization guideline?

If any “no” — fix it before moving on.


Next module: W35–W36: Runbooks, RCA, and Standard Operating Procedures