Phase 4 · W33–W34

W33–W34: Guardrails & Governance (what is allowed, what is not)

Establish clear governance and guardrails so the knowledge system stays safe, source-grounded, and policy-compliant.

Suggested time: 4–6 hours/week

Outcomes

A clear “allowed vs forbidden” content policy.
A sensitivity classification (public/internal/restricted).
A safe-answer protocol (“answers with sources or no answer”).
A redaction/anonymization guideline.
Simple enforcement: checks during ingestion and during answering.

Deliverables

Governance policy doc with allowed/forbidden content and sensitivity levels.
Redaction guideline with rules, examples, and uncertain-case handling.
Safe-answer protocol covering cite-or-refuse, unknown, and restricted behavior.
Enforcement checks at ingestion and answering time.

Prerequisites

W31–W32: Retrieval Quality (search, ranking, relevance)

What you’re doing

You stop treating “knowledge base” like a cute wiki.

A KB + RAG system becomes a liability fast if:

it leaks sensitive data
it gives confident wrong answers
it tells people to make risky changes
it invents facts

So you build guardrails and governance like an adult.

Output: a governance policy + content rules + safe-answer behavior + basic enforcement checks

The rule: if you can’t cite it, don’t claim it

Your system must prefer “Here are sources” over “Trust me”.

And it must be allowed to say:

“I don’t know”
“Not enough info”
“This is restricted”

No hero mode.

Step-by-step checklist

1) Define sensitivity levels

Start simple:

PUBLIC (safe to publish)
INTERNAL (company/team internal)
RESTRICTED (PII, credentials, customer data, sensitive configs)

Everything ingested must have a sensitivity label.
If you can’t label it, don’t ingest it.
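A minimal sketch of the "no label, no ingestion" rule. The `Document` class and `can_ingest` function are hypothetical names for illustration, not part of any specific framework; only the three levels come from the list above.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Sensitivity(IntEnum):
    """Ordered so that a higher value means more sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    RESTRICTED = 2


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    sensitivity: Optional[Sensitivity] = None  # None means "not labeled yet"


def can_ingest(doc: Document) -> bool:
    """Apply the rule above: a document without a label is not ingested."""
    return doc.sensitivity is not None


labeled = Document("runbook-001", "Restart the queue worker.", Sensitivity.INTERNAL)
unlabeled = Document("notes-017", "Misc notes from a call.")

print(can_ingest(labeled))    # True
print(can_ingest(unlabeled))  # False
```

Using an ordered enum is a design choice: it lets later checks compare levels directly (for example `doc.sensitivity <= Sensitivity.INTERNAL`).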

2) Define “allowed content”

Examples:

runbooks without secrets
RCA summaries without personal data
mapping rules without customer identifiers
interface descriptions at conceptual level
known issues summaries

3) Define “forbidden content”

Examples:

credentials, tokens, keys
full customer records
personal addresses/names
screenshots with sensitive info
anything that violates policy or contracts

Write it down clearly. Not vague.

4) Add redaction/anonymization rules

Rules like:

replace IDs with placeholders
mask emails
remove addresses
remove attachments or store only metadata

If you can’t anonymize safely, exclude.
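The first two rules can be sketched with regular expressions. This is a rough illustration, assuming a made-up customer ID format (`CUST-` followed by digits); your real identifiers will look different. Postal addresses are not handled here because simple patterns do not catch them reliably, so those documents fall under "exclude" until a human has reviewed them.

```python
import re

# Deliberately simple email pattern; good enough for masking, not for validation.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Hypothetical ID format (e.g. "CUST-104233"); adapt to your real identifiers.
CUSTOMER_ID_RE = re.compile(r"\bCUST-\d+\b")


def redact(text: str) -> str:
    """Mask emails and replace customer IDs with placeholders."""
    text = EMAIL_RE.sub("<EMAIL>", text)
    text = CUSTOMER_ID_RE.sub("<CUSTOMER_ID>", text)
    return text


sample = "Order failed for CUST-104233, contact jane.doe@example.com for details."
print(redact(sample))
# Order failed for <CUSTOMER_ID>, contact <EMAIL> for details.
```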

5) Define safe-answer behavior

Your answer policy should be:

Always provide citations (source chunks)
If no good source found → say “I don’t know” + suggest where to look
If restricted content would be required → refuse and explain restriction
Never invent configuration steps or transactions if not in sources
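One way to make this policy concrete is a small decision function that runs before any text is generated. The sketch below is an assumption-heavy illustration: `Chunk`, `Answer`, `decide`, and the `MIN_SCORE` threshold are hypothetical, and "restricted content would be required" is approximated as "the only relevant chunks are restricted".

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Chunk:
    source: str       # citation shown to the user, e.g. a doc id
    text: str
    score: float      # retrieval relevance, higher is better
    restricted: bool  # True if the caller may not see this chunk


@dataclass(frozen=True)
class Answer:
    status: str            # "answered", "unknown", or "restricted"
    message: str
    citations: List[str]


MIN_SCORE = 0.5  # assumed relevance threshold; tune it on your own data


def decide(chunks: List[Chunk]) -> Answer:
    """Pick one of the three paths: cite, refuse as restricted, or say unknown."""
    relevant = [c for c in chunks if c.score >= MIN_SCORE]
    allowed = [c for c in relevant if not c.restricted]
    if allowed:
        # Only allowed chunks are passed on, so every claim can be cited.
        return Answer("answered", "Answer from the cited chunks only.",
                      [c.source for c in allowed])
    if relevant:
        return Answer("restricted",
                      "This is restricted. Ask the content owner for access.", [])
    return Answer("unknown",
                  "I don't know. Not enough info in the indexed sources.", [])


print(decide([Chunk("runbook-001", "Restart the worker.", 0.8, False)]).citations)
# ['runbook-001']
print(decide([]).status)
# unknown
```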

6) Add enforcement checks

During ingestion:

block docs labeled RESTRICTED unless this knowledge base is explicitly approved to hold restricted content
run a simple scanner for secrets (patterns like “password=”, “token=”, etc.)
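A basic secret scanner can be a handful of patterns checked line by line. The patterns and function names below are illustrative assumptions, not a complete rule set. Expect false positives (a sentence like "password: must be rotated" will trip it), which is the safer failure mode at ingestion time.

```python
import re
from typing import List

# Illustrative patterns only; extend them for the secret formats you actually use.
SECRET_PATTERNS = [
    re.compile(r"(password|passwd|pwd)\s*[=:]\s*\S+", re.IGNORECASE),
    re.compile(r"(token|api[_-]?key|secret)\s*[=:]\s*\S+", re.IGNORECASE),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]


def find_secrets(text: str) -> List[str]:
    """Return the lines that look like they contain a secret."""
    hits = []
    for line in text.splitlines():
        if any(p.search(line) for p in SECRET_PATTERNS):
            hits.append(line.strip())
    return hits


def ingestion_allowed(text: str) -> bool:
    """Block the whole document if any line looks like a secret."""
    return not find_secrets(text)


doc = "Restart the worker.\npassword=hunter2\nThen check the logs."
print(find_secrets(doc))       # ['password=hunter2']
print(ingestion_allowed(doc))  # False
```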

During answering:

only retrieve chunks whose sensitivity is allowed for the current context
refuse if the only matching chunks are above the allowed sensitivity

Start basic. It already helps a lot.
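The answering-time check can be a plain filter on the sensitivity label. This sketch assumes each caller or channel has a maximum allowed level (called `clearance` here, a hypothetical name) and that every chunk carries the label it was given at ingestion.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Tuple


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    RESTRICTED = 2


@dataclass(frozen=True)
class Chunk:
    source: str
    text: str
    sensitivity: Sensitivity


def split_by_clearance(
    chunks: List[Chunk], clearance: Sensitivity
) -> Tuple[List[Chunk], List[Chunk]]:
    """Separate chunks the caller may see from chunks that must be withheld."""
    allowed = [c for c in chunks if c.sensitivity <= clearance]
    blocked = [c for c in chunks if c.sensitivity > clearance]
    return allowed, blocked


def should_refuse(chunks: List[Chunk], clearance: Sensitivity) -> bool:
    """Refuse when every retrieved chunk is above the caller's clearance."""
    allowed, blocked = split_by_clearance(chunks, clearance)
    return bool(blocked) and not allowed


retrieved = [
    Chunk("faq-01", "How to reset a queue.", Sensitivity.PUBLIC),
    Chunk("cfg-09", "Production connection settings.", Sensitivity.RESTRICTED),
]
allowed, blocked = split_by_clearance(retrieved, Sensitivity.INTERNAL)
print([c.source for c in allowed])  # ['faq-01']
print([c.source for c in blocked])  # ['cfg-09']
```

Filtering before the chunks reach the model matters: a chunk that never enters the prompt cannot leak into the answer.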

Deliverables (you must ship these)

Deliverable A — Governance policy doc

allowed vs forbidden
sensitivity levels
ownership and update rules

Deliverable B — Redaction guideline

clear rules + examples
“what to do when unsure”

Deliverable C — Safe-answer protocol

“cite or refuse”
“unknown path”
“restricted path”

Deliverable D — Enforcement checks

ingestion-time checks exist
answering-time filtering exists (even if basic)

Common traps (don’t do this)

Trap 1: “We’ll add governance later.”

Later means you ship a liability.

Trap 2: “The model will behave.”

No. You need constraints and rules.

Trap 3: “Everything is internal anyway.”

That’s how leaks happen. Label content properly.

Quick self-check (2 minutes)

Answer yes/no:

Does every doc/chunk have a sensitivity label?
Do I have a written allowed/forbidden policy?
Does the system cite sources or refuse?
Do ingestion and retrieval enforce sensitivity rules?
Do we have a clear redaction/anonymization guideline?

If any “no” — fix it before moving on.

Next module: W35–W36: Runbooks, RCA, and Standard Operating Procedures