NLU Human Verification
This checklist is the operator-facing control loop for the current AdaOS NLU implementation. It separates what can be verified today from the target Teacher UI that still needs product work.
Scope
Verifiable today:
- Regex-first detection and fallback behavior.
- Rasa dry-run phrase checks through the Teacher probe API.
- Intent ranking, entities, slots, confidence, and stage trace in API responses.
- Dynamic desktop lookup tables built from manifests, plus a live, read-only desktop registry overlay.
- Current NLU Teacher modal smoke behavior: missed requests, candidates, raw event payloads, and Apply.
Not yet verifiable through UI:
- Typing a phrase directly inside the NLU Teacher modal.
- Seeing ranking/entities/action preview as a first-class UI panel.
- Marking an interpretation as Correct/Fix/Save example.
- Editing existing templates by stable template_id.
- Root MCP token issuance and governed LLM-assisted patch apply.
1. Regression Tests
Run the focused NLU tests from the repository root:
.\.venv\Scripts\python.exe -m pytest tests/test_nlu_probe.py tests/test_nlu_rasa_baseline.py tests/test_nlu_lookup_tables.py
Expected result:
- Probe API tests pass.
- Rasa baseline export/tests pass.
- Lookup table export/API tests pass.
2. API Smoke Check
Start the hub API in the normal development environment, then use the same bearer token configured for local API access.
Lookup inspection:
Invoke-RestMethod `
-Headers @{ Authorization = "Bearer $env:ADAOS_TOKEN" } `
-Uri "http://127.0.0.1:8000/api/nlu/teacher/desktop/lookups"
Expected result:
- Response contains modal_id, node_ref, app_id, scenario_id, and webspace_id.
- Values should come from workspace/packaged manifests, plus live desktop registry overlay when the desktop is running.
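The lookup expectations above can be checked mechanically once the response has been parsed into a dictionary. The following is a minimal sketch, not part of AdaOS: the helper name missing_lookups is hypothetical, and it assumes the five lookup names appear as top-level keys of the JSON response, which this checklist does not confirm.

```python
from __future__ import annotations

# Lookup names listed in the expected result above.
EXPECTED_LOOKUPS = ("modal_id", "node_ref", "app_id", "scenario_id", "webspace_id")


def missing_lookups(response: dict) -> list[str]:
    """Return the expected lookup names that are absent from the response.

    Assumption: the lookup tables are top-level keys of the parsed JSON body.
    """
    return [name for name in EXPECTED_LOOKUPS if name not in response]
```

An empty list means every expected lookup table is present. The sketch does not check where the values came from, so the manifest and overlay origin still needs a manual look.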
Phrase probe:
Invoke-RestMethod `
-Method Post `
-Headers @{ Authorization = "Bearer $env:ADAOS_TOKEN" } `
-ContentType "application/json" `
-Body '{"text":"open apps catalog","use_rasa":true,"emit_trace":true}' `
-Uri "http://127.0.0.1:8000/api/nlu/teacher/desktop/probe"
Expected result:
- ok=true.
- intent, confidence, slots, entities, and intent_ranking are visible when a stage accepts the phrase.
- stages[] explains whether regex missed, Rasa accepted, or the phrase fell through to fallback.
- The call does not dispatch desktop actions.
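For repeated probe checks, the expected result above can be turned into a small validator over the parsed response. This is a hedged sketch: probe_problems and its expect_accepted flag are hypothetical names, the fields are assumed to be top-level keys, and the operator states whether a stage should have accepted the phrase because the internal shape of stages[] entries is not documented here.

```python
from __future__ import annotations

from typing import Any

# Fields that should be visible when a stage accepts the phrase.
ACCEPTED_FIELDS = ("intent", "confidence", "slots", "entities", "intent_ranking")


def probe_problems(response: dict[str, Any], expect_accepted: bool = True) -> list[str]:
    """Return human-readable problems found in a probe response.

    Assumption: all fields are top-level keys of the parsed JSON body.
    """
    problems: list[str] = []
    if response.get("ok") is not True:
        problems.append("ok is not true")
    stages = response.get("stages")
    if not isinstance(stages, list) or not stages:
        problems.append("stages[] is missing or empty")
    if expect_accepted:
        for name in ACCEPTED_FIELDS:
            if name not in response:
                problems.append(f"missing field: {name}")
    return problems
```

An empty list means the response matches the checklist. The validator cannot confirm that no desktop action was dispatched, so that point still needs to be observed on the desktop itself.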
3. Trace Verification
After running a real voice/text command or a probe with emit_trace=true, inspect the NLU trace projection:
- UI/YJS path: data.nlu_trace.items[]
- Expected stage names: request, regex, pipeline delegate, rasa, dispatcher action/reject
The trace is sufficient for debugging through developer tools today. The missing product slice is a timeline panel inside NLU Teacher.
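When a snapshot of the YJS document has been copied out of developer tools as JSON, the trace items can be pulled out with a short helper. This is only a sketch under assumptions: trace_items and stage_names are hypothetical helpers, and the "stage" key used to read each item's stage name is a guess at the item shape, so pass a different key if the real payload differs.

```python
from __future__ import annotations

from typing import Any


def trace_items(snapshot: dict[str, Any]) -> list[Any]:
    """Return the list at data.nlu_trace.items, or [] if any level is missing."""
    node: Any = snapshot
    for key in ("data", "nlu_trace", "items"):
        if not isinstance(node, dict) or key not in node:
            return []
        node = node[key]
    return node if isinstance(node, list) else []


def stage_names(items: list[Any], key: str = "stage") -> list[str]:
    """Collect stage names in order. The default key name is an assumption."""
    return [item[key] for item in items if isinstance(item, dict) and key in item]
```

Comparing the returned names against the expected stage list by eye is usually enough to see where a phrase stopped in the pipeline.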
4. Current UI Smoke Check
Open the NLU Teacher modal in the default web desktop scenario.
Expected current behavior:
- The modal has User requests and Candidates tabs.
- Missed NLU requests are grouped by request_id.
- Candidate events are grouped by candidate name and request id.
- Raw JSON payloads are visible for inspection.
- Apply actions can emit nlp.teacher.revision.apply or nlp.teacher.candidate.apply.
Known UI gaps:
- There is no Check phrase field yet.
- There is no first-class ranking/entities/trace panel yet.
- Correct/Fix/Save example are not implemented as operator controls yet.
5. Teacher UI Acceptance Criteria
The NLU Teacher UI becomes useful for non-developer verification when an operator can complete this loop without terminal access:
- Enter a phrase.
- See pipeline trace, intent ranking, entities, slots, lookup matches, confidence, and action preview.
- Confirm the interpretation as correct or open a guided fix form.
- Select scenario/skill training target.
- Preview the diff against the current template/example.
- Save the approved example or template patch with audit metadata.
- Re-run the same phrase and see the improved result.
Until those controls exist, API/CLI verification remains the source of truth.