Concept Case Study · HelloPM Assignment · March 2026

Enterprise LMS Policy
& Support Assistant

A permission-aware AI assistant designed to help Training Operations Managers answer repeated LMS policy, access, and certification questions — without guessing, overreaching, or exposing restricted information. Built on RAG with role-based retrieval, source citations, and safe escalation.

Type Concept / Assignment
Program HelloPM — Cohort 49
Primary User Training Operations Manager
Architecture RAG + Permission-Aware Retrieval
Date March 2026

At a glance

The Problem

Training Operations Managers spend 3–5 hours a week answering the same LMS questions by manually searching across policy PDFs, SOPs, onboarding guides, and compliance documents. The same questions repeat. The work is fragmented and exception-heavy.

My Role

I defined the product problem, scoped the v1 assistant, designed the RAG architecture, and outlined the permission logic, failure handling, evaluation criteria, rollout stages, and system behavior needed for a trust-sensitive internal tool.

What I Designed

A grounded internal knowledge assistant that retrieves only approved LMS documents, cites sources, handles ambiguity with clarifying questions, and escalates safely when evidence is weak or access is restricted.

Why It Matters

This project shows how I think about AI product design beyond "add a chatbot" — starting from workflow pain, then designing for trust, permissions, safety, and operational realism. The process is as relevant as the output.

Context & Problem

A repetitive workflow with a high cost of being wrong.

LMS support is full of repetitive, policy-heavy questions that still require careful answers. The primary user is a Training Operations Manager (TOM) at a mid-sized company with role-based LMS policies. Their job includes answering questions about access, deadlines, certifications, mandatory learning, and process exceptions — by manually searching multiple documents and checking whether the answer changes by role, department, location, or course type.

The pain was not a lack of information. It was the cost of repeatedly assembling the right answer from fragmented internal knowledge.

A wrong AI answer in this context would damage trust quickly. Groundedness mattered more than speed. That shaped every architecture decision that followed.

The 8-step manual workflow

What a TOM does every time a question arrives — and again when it returns.

  1. Receive LMS question from employee, manager, or HRBP via email, chat, or ticket
  2. Classify query type — access, certification, mandatory training, deadline, exception
  3. Search across LMS guides, policy PDFs, onboarding docs, compliance references
  4. Find the latest version of the relevant document and the relevant section
  5. Cross-check whether the answer varies by role, department, geography, or course type
  6. Confirm with another admin or manager when the answer is unclear
  7. Manually draft and send the response
  8. Repeat the entire process when the same question arrives again

Product Goal

A scoped assistant. Not a general-purpose chatbot.

The goal was to design a permission-aware internal knowledge assistant that helps a TOM answer repeated questions about policy, access, certification, deadlines, and process guidance — using only approved LMS sources.

One-line product statement: Answer policy-heavy LMS questions faster, with grounded sources and safe escalation — not fake automation.

What it does in v1
Answer policy, access, certification, and exception questions from approved documents
Cite source document name, section, version, and last-updated date
Ask clarifying questions when role, geography, or employee type is missing
Refuse out-of-scope requests cleanly with process guidance
Escalate when evidence is insufficient, with a handoff summary for the human reviewer
What it deliberately does not do
Create courses, assign training, or update LMS records
Act on backend LMS data or perform transactions
Guess when evidence is weak or access is restricted
Reveal restricted information to unauthorized users
Architecture Decision

Why RAG — and why not the alternatives

This was not "use AI because AI is trending." The decision was grounded in three specific characteristics of the knowledge itself.

Approach — why it's the wrong fit here

Prompt-only: Can't scale across changing internal documents. Stuffing all policy PDFs into one prompt is impractical and expensive. Fails silently when documents update.

Fine-tuning: LMS documents change monthly. Retraining for every policy update is slow, costly, and unsustainable. Fine-tuning stores knowledge in model weights — the wrong layer for dynamic, permission-sensitive data.

RAG (selected): Retrieves the latest approved documents at query time. Freshness guaranteed. Supports source citations the TOM can verify. Makes permission-aware retrieval possible without retraining whenever policy changes.
Dynamic knowledge

Policy documents change monthly. RAG retrieves the current version at query time — no retraining cycle required.

Grounded answers

Answers must be citable. The TOM needs to verify before forwarding. RAG produces attributed responses that fine-tuning cannot.

Permission sensitivity

Different roles access different documents. RAG supports metadata-based permission filtering at retrieval time.

Product Thinking

Five decisions that defined the product

Each decision had a clear rationale and an explicit tradeoff. These are the ones that shaped the most important behavior.

Decision 01
RAG over fine-tuning

The core challenge was dynamic knowledge retrieval, not teaching static domain knowledge to the model. Fine-tuning would have required retraining every time a policy changed — which happens monthly. The wrong architectural layer for this problem.

Architecture
Decision 02
Permission filter before retrieval

Restricted chunks must never reach the model. Filtering after retrieval creates leakage risk through the prompt. Filtering before retrieval eliminates the problem at the source. This was the most important trust decision in the system.

Trust & Safety
Decision 03
Re-ranking as a trust layer

Broad policy summaries can sound plausible while missing the specific exception clause that matters. Re-ranking the top-8 retrieved chunks to 3–4 before the LLM improves precision and reduces hallucination risk. Slightly more latency — correct tradeoff for a trust-sensitive internal tool.

Retrieval Quality
Decision 04
Narrow v1 scope — knowledge only

No course creation, no training assignment, no LMS record updates. Every request to act becomes a clean refusal with process guidance. This wasn't a technical limitation — it was a deliberate product decision to prove groundedness and permission safety before expanding surface area.

Scope Control
Decision 05
Escalate instead of guessing

In a policy-sensitive internal workflow, "I don't know based on available documents" is better than a confident wrong answer. The escalation path produces a short summary for the human reviewer — original query, sources attempted, reason, suggested next owner — so the handoff adds value rather than creating more work.
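
The handoff summary in Decision 05 could be represented as a small structure. The field names below are illustrative, not a real schema.

```python
from dataclasses import dataclass

@dataclass
class EscalationSummary:
    """Handoff package produced when the assistant declines to answer."""
    original_query: str
    sources_attempted: list
    reason: str
    suggested_owner: str

    def render(self) -> str:
        sources = ", ".join(self.sources_attempted) or "none"
        return (
            f"Query: {self.original_query}\n"
            f"Sources attempted: {sources}\n"
            f"Reason for escalation: {self.reason}\n"
            f"Suggested next owner: {self.suggested_owner}"
        )

summary = EscalationSummary(
    original_query="AI ethics certification policy for contractors in Brazil?",
    sources_attempted=["Certification Policy v2.1", "Contractor Handbook v1.3"],
    reason="No matching clause found in approved sources",
    suggested_owner="Compliance Owner",
)
```

Capturing the sources already attempted is what makes the handoff add value: the reviewer starts from the dead end instead of retracing it.
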

Failure Design
Technical Architecture

The RAG pipeline — simplified

A portfolio-friendly view of how the system processes a query from ingestion to response. The key stages and their rationale.

Ingestion pipeline:
Data sources (PDFs, DOCX, CSV) → Ingestion + OCR (scheduled sync) → Pre-processing (strip noise) → Chunking (250–400 words) → Embedding (text-embedding-3-small) → Vector DB + metadata (roles, geography, ACL)

Query pipeline:
User query (natural language) → Query classification (5 query types) → Permission filter (before retrieval) → Retrieval (top-8 vector search) → Re-ranking ★ (→ top 3–4) → LLM + citations (GPT-5 mini / 5.4)
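
The query path can be sketched end to end. This is a minimal illustration, not a real implementation: `score`, `rerank`, and `generate` stand in for the embedding search, the re-ranker, and the LLM call, and all names are assumptions.

```python
def answer_query(query, user_role, chunks, score, rerank, generate):
    """Sketch of the query pipeline: filter -> retrieve -> re-rank -> generate."""
    # Permission filter runs BEFORE retrieval: restricted chunks never score.
    allowed = [c for c in chunks if user_role in c["acl"]]
    # Vector-search stand-in: keep the top 8 by relevance score.
    top8 = sorted(allowed, key=lambda c: score(query, c), reverse=True)[:8]
    # Re-rank down to the 3-4 chunks that actually reach the prompt.
    context = rerank(query, top8)[:4]
    return generate(query, context)

# Toy demo with keyword overlap standing in for embedding similarity.
chunks = [
    {"text": "certification deadlines for employees", "acl": {"tom", "admin"}},
    {"text": "admin-only troubleshooting SOP", "acl": {"admin"}},
]
score = lambda q, c: len(set(q.split()) & set(c["text"].split()))
rerank = lambda q, cands: cands          # identity re-ranker for the demo
generate = lambda q, ctx: [c["text"] for c in ctx]

result = answer_query("certification deadlines", "tom", chunks, score, rerank, generate)
# The admin-only chunk is excluded before retrieval ever sees it.
```

The point of the sketch is the ordering: the ACL check happens before scoring, so a restricted chunk can never leak into the prompt via retrieval.
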
Chunking strategy

Section-based at 250–400 words, 80–120 word overlap. Section headings stay attached. Sentence-level chunking breaks policy logic. Document-level loses retrieval precision.
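
The section-based strategy above can be sketched as a small splitter. The word counts are the PRD's targets; the function shape is illustrative, assuming each section arrives as a heading plus a word list.

```python
def chunk_section(heading, words, max_words=350, overlap=100):
    """Split one policy section into overlapping chunks, keeping the
    section heading attached to every chunk."""
    chunks, start = [], 0
    while start < len(words):
        piece = words[start:start + max_words]
        chunks.append({"heading": heading, "text": " ".join(piece)})
        if start + max_words >= len(words):
            break
        # The 80-120 word overlap keeps a clause and its exception together.
        start += max_words - overlap
    return chunks

section = chunk_section("Certification Policy", ["word"] * 700)
# -> 3 overlapping chunks, each carrying the section heading
```

Because the heading travels with every chunk, a retrieved exception clause still identifies which policy it belongs to.
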

Model routing

GPT-5 mini handles 80% of queries (~$0.002/query). Complex exception cases route to GPT-5.4 (~$0.018/query). Blended cost: ~$0.005/query — ~$1.60–$2.00 per TOM per month.
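
The blended cost works out as a simple expectation over the routing split. The per-query prices are the PRD's assumptions, and the ~350 queries/TOM/month volume below is an added assumption chosen to reconcile with the stated monthly range.

```python
def blended_cost(share_small, cost_small, cost_large):
    """Expected per-query cost under a two-tier model-routing split."""
    return share_small * cost_small + (1 - share_small) * cost_large

per_query = blended_cost(0.80, 0.002, 0.018)  # 80% small model, 20% large
monthly = per_query * 350                      # assumed ~350 queries/TOM/month
```

At an 80/20 split this gives roughly $0.0052 per query, landing inside the ~$1.60–$2.00 monthly band.
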

Re-ranking rationale

The most-retrieved chunk is not always the most relevant. A broad policy overview can rank above the specific exception clause that actually answers the question.
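
A minimal sketch of the re-ranking stage: candidates surviving vector search are re-scored with a more precise relevance function (a cross-encoder in practice; here a toy stand-in) and only the best few reach the prompt. All names are illustrative.

```python
def rerank(query, candidates, relevance, keep=4):
    """Re-score retrieval candidates and keep only the top `keep`."""
    scored = sorted(candidates, key=lambda c: relevance(query, c), reverse=True)
    return scored[:keep]

candidates = [
    {"id": "overview", "text": "general certification policy overview"},
    {"id": "exception", "text": "missed-deadline exception applies only to contractors"},
]
# Toy relevance: reward the chunk that shares the query's key term.
relevance = lambda q, c: 1.0 if "exception" in q and "exception" in c["text"] else 0.0

top = rerank("missed-deadline exception rules", candidates, relevance, keep=1)
# The specific exception clause outranks the broad overview.
```
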

Trust Design

Permission as a product requirement — not a backend concern.

Different roles need different documents. The permission filter happens before retrieval — restricted chunks never reach the model at all.

Training Ops Manager: policy docs, SOPs, cert rules, onboarding guides (Admin SOPs ✓ · Restricted ✗)
LMS Admin: all indexed LMS docs, including admin-only content (Admin SOPs ✓ · Restricted ✓)
Manager: team training rules, mandatory learning, manager-facing only (Admin SOPs ✗ · Restricted ✗)
Employee: learner-facing policy, onboarding, cert basics (Admin SOPs ✗ · Restricted ✗)
Compliance Owner: regulated-role rules, audit-specific policy (Admin SOPs ✗ · Restricted ✓)
Key edge cases that shaped the design
Employment ends → access revokes immediately, cached sessions expire
Role changes → permissions update on next auth refresh, not hardcoded
Mixed-permission queries → answer only from the allowed document subset
Manager requests employee-private data → refuse or escalate; policy guidance and private account data are separated by design
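
The role matrix above can be expressed as a metadata filter applied before vector search. The role keys and document classes are illustrative labels, not a real schema.

```python
# Role -> document classes the role may retrieve (mirrors the matrix above).
ACCESS = {
    "lms_admin":        {"standard", "admin_sop", "restricted"},
    "training_ops_mgr": {"standard", "admin_sop"},
    "compliance_owner": {"standard", "restricted"},
    "manager":          {"standard"},
    "employee":         {"standard"},
}

def allowed_chunks(role, chunks):
    """Metadata filter applied before retrieval: a chunk whose class is
    outside the role's set is never even a search candidate."""
    permitted = ACCESS.get(role, set())  # unknown roles get nothing
    return [c for c in chunks if c["doc_class"] in permitted]
```

Defaulting unknown roles to an empty set keeps the failure mode safe: a misconfigured role sees nothing rather than everything.
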

The product separates policy guidance from private account access. That boundary is what keeps a helpful assistant from becoming a risky one.

Failure Thinking

The PRD didn't assume good retrieval equals a good product.

Nine failure points were mapped across the pipeline — each with what breaks, how it's detected, the mitigation, and a PM decision. Four are shown here for scannability.

Failure 01 · Data Layer
OCR / Data Quality

Scanned policy PDFs with bad OCR misread role names, certification rules, or deadline clauses. The assistant returns wrong policy guidance with a citation that looks credible.

Data quality is a launch gate. Don't launch broad policy coverage until core LMS docs are clean enough to support trusted retrieval.
Failure 02 · Retrieval Layer
Poor Chunking

Too-aggressive chunking splits policy logic — a missed-deadline exception is separated from the clause specifying it applies only to certain employee types. Correct document retrieved, wrong answer produced.

Optimize for policy meaning, not granularity. The section is the unit of meaning — not the sentence.
Failure 03 · Ranking Layer
Re-ranking Failure

The right chunk is retrieved but not prioritized. A weaker generic policy overview survives into the final prompt instead of the specific exception clause that actually answers the question.

Accept more latency for cleaner context. This product is trust-sensitive — precision beats speed.
Failure 04 · User Layer
Non-Technical User Behavior

Users type vague queries ("certificate?" / "deadline?"), don't include role or geography, and may distrust AI answers despite good retrieval. Low adoption despite technically correct outputs.

Non-technical failure is as serious as technical failure. UX and trust design are not optional layers.
System Behavior

How the assistant should respond — three examples.

Grounded answer, "I don't know," and escalation. The response format is: Direct Answer → Why / Conditions → Source(s) → Limitation or Escalation.

Grounded answer with citation
Employee has CSPO. Is Scrum Fundamentals still mandatory?
Not mandatory if the employee holds a CSPO. The exemption applies only to eligible product roles under the equivalency policy. Source: Certification Equivalency Policy v2.1, Section 4. Confirm role eligibility before the final response if the employee's role is unclear.
"I don't know" — honest escalation
What is the AI ethics certification policy for contractors in Brazil?
I do not know based on available documents. No reliable matching source found. Please escalate to the compliance owner or update the knowledge base if this policy exists in another approved source.
Out-of-scope action — clean refusal
Show me the admin-only troubleshooting SOP and assign the compliance course to all new joiners.
I cannot complete that request. The troubleshooting SOP is access-restricted to authorized admin users, and I do not perform LMS actions in v1. I can provide policy or process guidance only. Contact the LMS admin for restricted troubleshooting access.
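
The four-part response contract behind all three examples (Direct Answer → Why / Conditions → Source(s) → Limitation or Escalation) can be sketched as a tiny formatter. The function and field labels are illustrative, not a fixed spec.

```python
def format_response(answer, conditions, sources, limitation=None):
    """Render the four-part response contract as plain text."""
    parts = [
        answer,
        "Why: " + conditions,
        "Sources: " + "; ".join(sources),
    ]
    if limitation:  # limitation/escalation line is optional
        parts.append("Note: " + limitation)
    return "\n".join(parts)

reply = format_response(
    "Not mandatory if the employee holds a CSPO.",
    "Exemption applies only to eligible product roles.",
    ["Certification Equivalency Policy v2.1, Section 4"],
    "Confirm role eligibility if the employee's role is unclear.",
)
```
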
Proposed Evaluation

How success would be measured — if this shipped.

All metrics are proposed targets, not measured outcomes. Framed as a hybrid evaluation: human audit for truth, LLM-as-judge for scale.

Retrieval Quality
Precision@5 — top results contain the relevant policy
Recall@5 — correct source usually in top 5
Top-1 exact source hit rate
Citation coverage — near 100% for answerable queries
Target: citation coverage ~100% for answerable queries
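
Precision@5 and Recall@5 have standard definitions that the audit could compute directly; the query and source names below are made up for illustration.

```python
def precision_at_k(retrieved, relevant, k=5):
    """Share of the top-k retrieved sources that are actually relevant."""
    return sum(1 for doc in retrieved[:k] if doc in relevant) / k

def recall_at_k(retrieved, relevant, k=5):
    """Share of all relevant sources that made it into the top-k."""
    return sum(1 for doc in relevant if doc in retrieved[:k]) / len(relevant)

# One audited query: two of the three relevant sources surfaced in the top 5.
retrieved = ["policy_a", "guide_b", "sop_c", "faq_d", "memo_e"]
relevant = {"policy_a", "sop_c", "archive_x"}
```

Averaging these per-query scores over an audit set gives the headline retrieval metrics.
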
Generation Quality
Answer accuracy on high-frequency policy questions
Faithfulness to retrieved sources
Completeness on conditional / exception questions
Hallucination rate — target: very low
Target: high completeness on exception-heavy queries
Guardrails & Safety
Permission violation rate — zero tolerance
Unsupported action rate — refused and redirected
Restricted-content citation rate
Out-of-scope action refusal accuracy
Target: zero permission violations before broader rollout
Business Success
Manual lookup time saved per TOM session
Sessions completed without escalation
Repeat usage and adoption from primary users
Satisfaction on grounded answers with citations
Target: measurable reduction in manual search effort
UX & Rollout

The interface is part of the trust model.

Chat-based, embedded in the LMS admin workflow. Each response shows: direct answer → source document and section → version / date → escalation guidance when needed.

Chat was chosen over search-only because the TOM needs grounded answers to conditional questions — not just document discovery. Plain search still forces manual interpretation.

Phase 01
Alpha

TOMs + one LMS admin group. Certification, access, and deadline questions only. One department and geography first. Narrow pilot before any broader exposure.

High citation coverage, no permission leaks, positive qualitative feedback from pilot users.
Phase 02
Beta

Broader TOM group, selected managers, limited employee self-serve for approved learner-facing queries. More document coverage and geographies.

Measurable reduction in manual lookup time. Rising repeat usage. Acceptable escalation rate.
Phase 03
GA

All approved internal users within defined LMS support scope. Stable permission-aware retrieval, governed document update process, operational monitoring in place.

Stable usage across teams. Low permission failure rate. Strong support efficiency gains.
Future Scope

A disciplined roadmap — not a features wish list.

Each phase has a gate condition. The surface area expands only after the previous phase is proven.

  1. v1
    Grounded knowledge assistant — this PRD

    Retrieve approved docs, cite sources, clarify ambiguous queries, escalate safely when evidence is weak or access is restricted.

    Gate: permissions + observability proven in alpha
  2. v2
    Tool-assisted retrieval

    Query structured LMS APIs for live course assignment status and completion state. Richer context without full agentic risk.

    Gate: permissions safe in v1, API integrations available
  3. v3
    Workflow guidance

    Draft escalation summaries, suggest the correct admin owner, pre-fill support checklists. Better human handoff quality.

    Gate: strong answer quality and user trust from v2
  4. v4
    Bounded actions — gated

    Create support tickets, draft course assignment requests, initiate approval workflows. Always with explicit human confirmation before execution. Never fully autonomous.

    Gate: audit trail + human-in-the-loop framework + senior PM and security sign-off

On fine-tuning: Not in v1. If used post-v1, for behavior and tone — not knowledge storage. PEFT/LoRA over full fine-tuning. Goal: improve refusal style and escalation quality, not store dynamic enterprise policy in model weights.

Reflection

What this project sharpened.

The biggest lesson was that a useful assistant isn't defined by how fluent it sounds — it's defined by how well it handles evidence, scope, and uncertainty. The product became stronger when I narrowed the ambition: answer only from approved sources, preserve permission boundaries, ask clarifying questions when context is missing, and escalate instead of guessing.

The second lesson: AI PM is not only about model choice. It's about workflow fit, guardrails, observability, rollout discipline, and designing failure behavior on purpose. The failure analysis was as important as the feature list.

The third: permission isn't a security afterthought. It's a product requirement that shapes retrieval architecture, response format, and what the assistant is allowed to say. Getting that right is as much a product decision as choosing the right embedding model.

Lesson 01

Grounded AI beats fluent AI in enterprise contexts every time.

Lesson 02

Failure behavior is a product feature — not an edge case to handle later.

Lesson 03

Scope control is not timidity. It's the mechanism that makes trust possible.

Portfolio Context

Why this case study is in my portfolio.

I'm including this project because it shows the kind of AI PM work I want to do: a real workflow problem with a clear user, disciplined scope, trust-sensitive design, and product decisions shaped by retrieval quality, permissions, and safe escalation.

It's a concept case study — not shipped production work — but it reflects how I think when turning an AI opportunity into a product that could actually be trusted internally. The value isn't a launch story. It's the architecture decisions, the failure analysis, and the scope discipline.

Interested in how I think through AI product decisions, trust boundaries, or workflow design? This is the kind of problem space I'm actively building toward.

Next step

Seen enough to want a conversation?

I'm open to AI PM roles where the problem is real. This is the kind of structured product thinking I bring to every challenge.