# NLU in AdaOS
This document describes the current production MVP direction for intent detection in AdaOS.
## MVP baseline

- Pipeline: `regex -> rasa (service-skill) -> teacher (LLM in the loop)`
- System boundary: there is a single NLU runtime codebase; only data varies per scenario/skill.
- Transport: intent detection is integrated into the AdaOS event bus (not CLI-only).
## Event flow (high level)

- UI / Telegram / Voice publishes
  `nlp.intent.detect.request { text, webspace_id, request_id, _meta... }`.
- `nlu.pipeline` tries regex rules:
  - built-in rules (`nlu.pipeline`)
  - dynamic rules loaded centrally from:
    - workspace scenarios (`scenario.json:nlu.regex_rules`)
    - workspace skills (`skill.yaml:nlu.regex_rules`)
    - legacy per-webspace cache (`data.nlu.regex_rules`)
- If regex does not match, `nlp.intent.detect.rasa` is emitted (delegates to the Rasa service skill).
- If an intent is found:
  `nlp.intent.detected { intent, confidence, slots, text, webspace_id, request_id, via }`
- If no intent is obtained:
  `nlp.intent.not_obtained { reason, text, via, webspace_id, request_id }`
- The router emits a human-friendly `io.out.chat.append` and records the request for the NLU Teacher.
- If the teacher is enabled, `nlp.teacher.request { webspace_id, request }` is emitted for teacher runtimes.
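The regex-first stage of this flow can be sketched in-process as follows. This is a minimal illustration, assuming a `publish()` helper standing in for the AdaOS event bus; the rule shape and the sample built-in pattern are illustrative, not the real AdaOS rule set.

```python
import re
import uuid

EMITTED = []  # captured (topic, payload) pairs, standing in for the bus

def publish(topic: str, payload: dict) -> None:
    EMITTED.append((topic, payload))

# Hypothetical rule list: built-in + dynamically loaded rules would be
# merged here; named groups in the pattern become slots.
RULES = [
    {
        "id": "rx.builtin.timer",  # illustrative built-in rule
        "pattern": re.compile(r"set a timer for (?P<minutes>\d+) minutes"),
        "intent": "timer.set",
    },
]

def handle_detect_request(payload: dict) -> None:
    """Try regex rules first; on a miss, delegate to the Rasa service skill."""
    text = payload["text"]
    for rule in RULES:
        m = rule["pattern"].search(text)
        if m:
            publish("nlp.intent.detected", {
                "intent": rule["intent"],
                "confidence": 1.0,  # regex matches are treated as exact
                "slots": m.groupdict(),
                "text": text,
                "webspace_id": payload["webspace_id"],
                "request_id": payload["request_id"],
                "via": "regex.dynamic",
            })
            return
    publish("nlp.intent.detect.rasa", payload)  # no regex match: fall through

handle_detect_request({
    "text": "set a timer for 5 minutes",
    "webspace_id": "ws1",
    "request_id": str(uuid.uuid4()),
})
```

A non-matching text would instead produce a `nlp.intent.detect.rasa` event carrying the original payload.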
## Rasa as a service-skill

Rasa is treated as a service-type skill (separate Python/venv, managed lifecycle) to avoid dependency conflicts with the hub runtime.

The hub supervises:

- health checks
- crash frequency
- request failures/timeouts

Issues can trigger:

- `skill.service.issue`
- `skill.service.doctor.request` -> `skill.service.doctor.report` (an LLM doctor can be plugged in later)
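The escalation step can be sketched as a small fold over health-probe results. The event names come from the document; the failure threshold, the state dict shape, and the `publish` callback are illustrative assumptions.

```python
def on_probe(state: dict, healthy: bool, publish) -> None:
    """Fold one health-probe result into the supervisor state."""
    if healthy:
        state["failures"] = 0  # any success resets the streak
        return
    state["failures"] += 1
    if state["failures"] >= state["max_failures"]:
        publish("skill.service.issue",
                {"skill": state["skill"], "reason": "health_check_failed"})
        # request a diagnosis; an LLM doctor may later answer with
        # skill.service.doctor.report
        publish("skill.service.doctor.request", {"skill": state["skill"]})
        state["failures"] = 0  # reset so the issue is not re-emitted every tick
```

Resetting the counter after escalation keeps the supervisor from flooding the bus while the doctor works.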
## Teacher-in-the-loop (LLM)

When neither regex nor Rasa produces an intent, AdaOS calls an LLM teacher to:

- propose a dataset revision (existing intent + new examples + slots), or
- propose a regex rule to improve the `regex` stage, or
- propose a new capability (skill / scenario candidate), or
- decide to ignore (non-actionable).
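One possible shape for a teacher decision, derived from the four outcomes above; every field name here is an assumption for illustration, not the actual AdaOS schema.

```python
from typing import Literal, TypedDict

class TeacherDecision(TypedDict, total=False):
    # Which of the four outcomes the teacher chose (hypothetical labels):
    kind: Literal["dataset_revision", "regex_rule", "new_capability", "ignore"]
    intent: str          # target intent for a dataset revision
    examples: list[str]  # new training examples
    slots: dict          # proposed slot annotations
    pattern: str         # proposed regex for the regex stage
    candidate_name: str  # proposed skill/scenario candidate
    reason: str          # why the request was ignored

decision: TeacherDecision = {
    "kind": "regex_rule",
    "intent": "lights.off",
    "pattern": r"turn off the lights",
}
```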
The teacher receives scenario + skill context, including:

- current scenario NLU (`scenario.json:nlu`)
- installed catalog (apps/widgets + origins)
- existing dynamic regex rules (from scenarios/skills + the legacy per-webspace cache)
- built-in regex rules (`nlu.pipeline`)
- selected skill-level NLU artifacts (e.g. `interpreter/intents.yml`)
- intent routing hints (`intent_routes`: scenario intent -> `callSkill` topic -> skill)
- system/host actions catalog (`system_actions`, `host_actions`)
Teacher state is projected into YJS under `data.nlu_teacher.*` for UI inspection, and also persisted on disk under `.adaos/state/skills/nlu_teacher/<webspace_id>.json` so it survives YJS reload/reset.
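The on-disk persistence can be sketched as a plain JSON round-trip. The directory layout mirrors the path above; the write-then-rename detail and the helper names are assumptions.

```python
import json
from pathlib import Path

STATE_DIR = Path(".adaos/state/skills/nlu_teacher")

def persist_teacher_state(webspace_id: str, state: dict) -> Path:
    STATE_DIR.mkdir(parents=True, exist_ok=True)
    path = STATE_DIR / f"{webspace_id}.json"
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, ensure_ascii=False, indent=2), encoding="utf-8")
    tmp.replace(path)  # rename is atomic on POSIX, so readers never see half a file
    return path

def load_teacher_state(webspace_id: str) -> dict:
    """Restore state after a YJS reload/reset; empty dict if nothing was saved."""
    path = STATE_DIR / f"{webspace_id}.json"
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
```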
## Web UI: NLU Teacher

In the default web desktop scenario the NLU Teacher UI is a schema-driven modal:

- Tabs: User requests / Candidates
- Grouping:
  - User requests: grouped by `request_id`
  - Candidates: grouped by `candidate.name`, then by `request_id`
- Logs: groups show event payloads inline (raw JSON)
- Apply actions:
  - `nlp.teacher.revision.apply`
  - `nlp.teacher.candidate.apply`:
    - for `regex_rule` candidates: persists the rule into a workspace owner (preferably a skill), then mirrors it into `data.nlu.regex_rules` as a runtime cache so the next request matches immediately (`via="regex.dynamic"`)
    - for `skill`/`scenario` candidates: creates a development plan item
- A successful apply emits `ui.notify` with the owner (skill/scenario) where the rule was installed.
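The apply path for `regex_rule` candidates can be sketched as a two-step write: persist into the owning skill's manifest (source of truth), then mirror into the runtime cache. The in-memory dict and list below stand in for `skill.yaml` and the `data.nlu.regex_rules` cache; the function and field names are illustrative assumptions.

```python
import uuid

def apply_regex_candidate(candidate: dict, skill_manifest: dict,
                          runtime_cache: list) -> dict:
    rule = {
        "id": f"rx.{uuid.uuid4()}",   # every rule gets an rx.<uuid> identity
        "pattern": candidate["pattern"],
        "intent": candidate["intent"],
    }
    # 1) source of truth: nlu.regex_rules[] in the owner's manifest
    skill_manifest.setdefault("nlu", {}).setdefault("regex_rules", []).append(rule)
    # 2) runtime cache: matched with via="regex.dynamic" on the very next request
    runtime_cache.append(rule)
    return rule
```

Writing the manifest first means the cache can always be rebuilt from disk, while the mirror step avoids a reload before the rule takes effect.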
## Dynamic regex rules (current contract)

- Storage (source of truth):
  - skill: `.adaos/workspace/skills/<skill>/skill.yaml` -> `nlu.regex_rules[]`
  - scenario: `.adaos/workspace/scenarios/<scenario>/scenario.json` -> `nlu.regex_rules[]`
- Rule identity:
  - every rule has `id="rx.<uuid>"`
- Observability:
  - every `regex.dynamic` match appends a JSONL record to `state/nlu/regex_usage.jsonl` (`webspace_id`, `scenario_id`, `rule_id`, `intent`, slots…)
- Optional trust policy:
  - `skill.yaml: llm_policy.autoapply_nlu_teacher=true` enables automatic Apply for teacher-proposed regex candidates targeting that skill
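The observability record can be sketched as a simple append. The file location and the listed field names follow the contract above; the timestamp field and helper name are assumptions.

```python
import json
import time
from pathlib import Path

USAGE_LOG = Path("state/nlu/regex_usage.jsonl")

def record_regex_usage(webspace_id: str, scenario_id: str, rule_id: str,
                       intent: str, slots: dict) -> None:
    USAGE_LOG.parent.mkdir(parents=True, exist_ok=True)
    record = {
        "ts": time.time(),  # assumed extra field, not in the listed contract
        "webspace_id": webspace_id,
        "scenario_id": scenario_id,
        "rule_id": rule_id,
        "intent": intent,
        "slots": slots,
    }
    # one JSON object per line (JSONL): append-only, easy to tail and grep
    with USAGE_LOG.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```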
## Later (not MVP)
- Rhasspy / offline NLU
- Retriever-style NLU (graph/context retrieval)
- Multi-step, stateful NLU workflows across scenarios