LLM-Safe Skill Development Guide
Status: current guidance and target contract.
This guide is written for humans and LLM agents that create or update AdaOS skills. Its goal is simple: a generated skill should be useful without being able to overload the shared desktop, hide failures, or bypass runtime governance.
Read this together with:
- the repository note
docs/io/webio.md - UI Addressing
- Named Entities and Canonical Naming
- Semantic State Plane
- Runtime Guarding
- Projection Subscription Roadmap
Golden rule
Do not treat the primary Yjs document as a free-form database.
Normal skill-owned browser-visible state must go through governed SDK helpers and declared projection routes. Direct Yjs mutation from a skill is legacy or explicitly capability-gated, not the default authoring model.
Preferred data-plane choices:
data_projectionsplusctx_subnet.set()/ctx_subnet.set_async()for compact reconnect-stable bootstrap/control state.stream_variable_publish(),stream_publish(), andwebio.receiversfor high-churn live variables, append-heavy data, and operator-facing variables.- skill-local storage for private durable skill state.
- tool/detail endpoints for explicit user-requested details.
- 360log or disk snapshots for later diagnostics, not browser steady-state rendering.
Responsibility model
The skill author chooses the data route. The runtime does not silently move a skill's data between Yjs and streams.
That choice is part of the skill design and must be visible in skill.yaml,
webui.json, handler code, and tests. For LLM-authored work, the route choice
must be treated as a reviewable implementation decision, not an accidental
side effect of the helper API used.
Runtime guardrails still enforce shared safety:
- Yjs owner guards count attempted and applied Yjs writes, attribute pressure to the skill owner, and may warn, throttle, block, or quarantine unsafe owners.
- Stream guards bound payload size, publish rate, snapshot request bursts, and receiver fanout, and must log suppressions or degraded delivery.
- Guards emit diagnostics and quarantine records so the UI and future LLM repair loops can explain the failure.
- Guards are emergency control, not a replacement for a well-designed data route.
The desired failure mode is explicit: a badly routed skill should become visible as a design defect and be returned for repair. It should not be hidden by runtime magic that makes the browser appear healthy while the skill keeps producing unsafe data.
Required data route plan
Before editing a browser-facing skill, write down the route plan. A concise comment in the implementation notes, PR description, or adjacent docs is enough, but the design must be explicit.
For every widget, modal section, status row, and detail view, answer:
surface: what browser surface consumes this data?route:yjs,stream,tool/details,skill-local, ordisk/360log.why: reconnect-stable bootstrap, live variable, explicit drill-down, private durable cache, or diagnostic evidence.first_paint: what does the user see before live data arrives?recovery: how does the surface recover after room rebuild, reconnect, or stream resubscribe?update_source: which events or commands can update it?budget: expected payload size, event rate, coalescing window, and maximum fanout.guard_visibility: what warning, degraded state, or incident is shown when the route is throttled, blocked, or quarantined?
If a route cannot answer these questions, do not add it yet.
Data-plane decision table
| Need | Use | Avoid |
|---|---|---|
| Bootstrap/control state needed for first paint | Yjs projection | Full operational snapshot in Yjs |
| Selected ids, compact health badge, latest stable status | Yjs projection | Rewriting data, ui, or registry broadly |
| Operator-facing variables, active operations, logs, telemetry, chat/event tail | Stream receiver | Unbounded arrays in Yjs |
| Big diagnostics or object inspector payload | Details tool / stream snapshot / disk snapshot | Embedding full diagnostics in primary Yjs |
| Small operator health/guard summary | Status card pointing to stream/tool/details route | Treating statusPlane as a live data route |
| Durable private skill cache | Skill-local files or DB | Hidden browser-only state as source of truth |
| Command from UI to runtime | callHost / tool with small ack |
Large command response used as data transport |
| Raw high-frequency evidence | Stream or disk/360log | Smoothed Yjs status that loses diagnostic truth |
| Smoothed operator status | Debounced stream or compact Yjs badge | Flickering every raw transport event |
Skill manifest checklist
Every browser-facing skill should make its data contract explicit.
Use skill.yaml to declare:
data_routesfor the reviewable route plan: surface, route, first paint, recovery, budget, and guard visibility.toolswith stable input and output schema.exports.toolsfor callable public tools.events.subscribefor command or domain events.data_projectionsonly for browser-visible Yjs branches the skill owns.webui.receiversinwebui.jsonfor live stream variables.- optional lifecycle hooks such as
healthcheck,drain,dispose, andonQuarantine/on_quarantinewhen the skill can clean up or explain a guard action.
Every declared Yjs projection should have a reason to be reconnect-stable. Every stream receiver should have bounded delivery semantics and an initial or snapshot-on-subscribe story.
Example:
data_routes:
- surface: widget:weather_status
route: yjs
projection_slot: weather.snapshot
first_paint: cached compact weather status
recovery: Yjs replay restores the latest compact status
update_source: [weather.refresh.completed]
budget:
max_payload_bytes: 4096
max_publish_hz: 0.2
snapshot_policy: on_subscribe
guard_visibility:
degraded_state: weather status shows stale/degraded
log: service.weather_skill.runtime.log
quarantine: true
- surface: modal:weather_history
route: stream
receiver: weather.history
first_paint: empty history with loading state
recovery: bounded stream snapshot requested on subscribe
data_projections:
- scope: subnet
slot: weather.snapshot
targets:
- backend: yjs
path: data/weather
tools:
- name: get_snapshot
description: Return the compact current weather state.
entry: handlers.main:get_snapshot
input_schema:
type: object
properties:
webspace_id:
type: string
target_node_id:
type: string
output_schema:
type: object
required: [ok]
properties:
ok:
type: boolean
current:
type: object
Browser-visible Yjs writes
Use logical slots, not raw paths, in handler code.
Yjs is for the minimum reconnect-stable state needed to bootstrap the surface, preserve collaborative/control state, and explain health. It is not the normal transport for changing variables, diagnostic tables, event tails, or raw runtime evidence.
Preferred:
from adaos.sdk.data import ctx_subnet
ctx_subnet.set(
"weather.snapshot",
{"current": current},
webspace_id=webspace_id,
)
For async handlers:
Avoid in normal skills:
webspace_ydocget_ydoc()async_get_ydoc()- direct
y_pytransactions - replacing broad roots such as
data,ui,registry,data.catalog,data.installed, ordata.desktop - writing hot telemetry, logs, session churn, transport events, or stream tails into Yjs because a widget needs to see them
If a legacy skill still needs direct Yjs access, document why and keep it on a
short migration path toward ProjectionService / ctx_subnet.
Make hot projection writes idempotent before calling the SDK helper. Runtime
projection code can skip physical no-op mutations, but guard/governance checks
still see the attempted write. For refresh-heavy skills, keep a small
per-(webspace_id, slot) fingerprint and do not call ctx_subnet.set*() when
the semantic payload has not changed. Keep an explicit recovery path, such as a
user/API refresh_snapshot, that can bypass this fingerprint when the browser
reports a missing projection after room rebuild or reconnect.
Do not fan out routine projection refreshes to every webspace by default. Target the webspace from event metadata or the UI action. Reserve all-webspace fanout for boot, activation, migration, or explicit resync events.
Yjs payloads should be small enough to inspect in logs and reason about in code review. If a projection is hard to summarize in one short schema paragraph, it is probably too large for Yjs and should be split into stream variables or details.
Stream data
Use streams for data that changes often, grows by appending, or represents operator-facing variables that should not be durable collaborative state.
Streams are not a free replacement for Yjs. They are active volatile delivery: messages can be missed during reconnect, subscriptions can flap, and duplicate or out-of-order payloads can happen around recovery. Design every stream as a bounded replace or append channel with explicit recovery.
Declare receivers in webui.json:
{
"webio": {
"receivers": {
"voice_chat.messages": {
"mode": "append",
"collectionKey": "items",
"dedupeBy": "id",
"maxItems": 100,
"initialState": { "items": [] }
}
}
}
}
Publish from the skill:
from adaos.sdk.io import stream_publish, stream_variable_publish
stream_variable_publish(
"voice_chat.status",
{"state": "ready", "peer_count": 1},
var_id="status",
ttl_ms=30000,
_meta={"webspace_id": webspace_id},
)
stream_publish(
"voice_chat.messages",
{"items": [message]},
_meta={"webspace_id": webspace_id, "target_node_id": target_node_id},
)
Stream rules:
- keep payloads bounded
- dedupe events with stable ids
- provide snapshot-on-subscribe for widgets that should not open empty
- coalesce repeated snapshot requests per receiver/webspace/node
- include
updated_at,seq, stable ids, or a content fingerprint when the receiver needs to reject stale or duplicate payloads - prefer
stream_variable_publish()for replace-mode current-state variables; it wrapsid,value,seq,updated_at,fingerprint, and optionalttl_msconsistently - use
mode: "replace"for current-state variables and include a complete bounded current value in each snapshot - use
mode: "append"only for true tails, withmaxItems,dedupeBy, and a clear truncation policy - provide an honest
initialStatesuch asloading,stale,degraded, or an empty bounded collection - do not eager-publish a replace stream for the same state that the widget is already reading from Yjs; use streams for separate high-churn state or snapshot-on-subscribe recovery
- do not copy stream tails back into Yjs just to make them visible
Stream variables should be demand-aware. A stream receiver that is not subscribed should not keep rebuilding full snapshots just in case a browser opens later. Prefer receiver-specific builders over one monolithic skill snapshot.
Status cards
Use status cards for small operator summaries that must be cheap to poll, stream, or project. A card is not a detail payload. It carries identity, current state, freshness, and a pointer to the details route.
statusPlane is not a third data route. It is a compact index over the routes
you already declared in data_routes, data_projections, and
webui.receivers. If a card needs rows, inventories, logs, diagnostics, or a
tail, put those values in a stream receiver or details tool and put only the
reference in the card.
from adaos.sdk.status import publish_status, publish_status_stream
publish_status(
id="runtime",
kind="runtime",
scope="infrastate",
status="ready",
summary="runtime ready",
ttl_ms=30000,
details_ref={"kind": "stream", "receiver": "infrastate.runtime"},
route={"kind": "stream", "receiver": "infrastate.runtime"},
webspace_id=webspace_id,
)
publish_status_stream(
"infrastate.runtime",
id="runtime",
kind="runtime",
scope="infrastate",
status="warning",
summary="route reconnecting",
ttl_ms=30000,
webspace_id=webspace_id,
_meta={"webspace_id": webspace_id},
)
Status card rules:
- use
statusvalues that normalize throughCanonicalStatus:ready,online,warning,degraded,down,offline, orunknown - keep
summaryshort and operator-facing - include
ttl_msfor live runtime cards so stale UI can degrade honestly - use
incident_idonly when the card represents a real active warning or incident - put stream/tool references in
details_ref; do not embed logs, tables, inventories, or tails into the card - never declare
route: statusorroute: statusPlane; the route belongs to Yjs, stream, details/tool, skill-local storage, or diagnostic evidence - put the design-time data route in
routeso guard diagnostics can map pressure back to the skill route plan - use
publish_status_stream()when the card itself should also be available as a replace-mode stream variable - verify cards through
GET /api/node/status/cards; the compatibility/api/node/reliability/summarysurface also carries a compactstatusPlaneblock for badge/status UI during migration - polling clients should prefer
GET /api/node/reliability/summary?mode=thin&webspace_id=<id>and sendIf-None-Matchon the next request; unchanged snapshots return304without rebuilding the full reliability payload - use
GET /api/node/reliability/summary/metricsduring soak/debug runs to verify thin/full mode counts, response bytes,304reuse, and the compactacceptanceblock with status-registry, Yjs owner-guard/quarantine, stream guard, stream-control, and per-receiver pressure counters - for a human-readable check, use
adaos node reliability-metrics --webspace <id> --receiver <stream>and include theacceptance.*lines in soak notes - verify
statusPlane.diagnostics.oversizedCardTotal == 0; a nonzero value means a status card is being used as a payload container and needs a route redesign - Yjs pressure, stream guard, and stream-control pressure are also projected as
compact guard cards in
statusPlane; use theirguardRefto map observed pressure back to owner, route, receiver/path, budget, and quarantine context
Hot events and smoothing
Some events are useful evidence but terrible UI clocks. Examples include:
browser.session.changeddevice.registeredwebrtc.peer.state.changed- YWS open, close, guard, quarantine, and reconnect events
- network route flaps
- fast operation progress ticks
Handle these as two different products:
- Raw evidence goes to diagnostics streams, bounded logs, or 360log so the operator and LLM repair loop can see what really happened.
- Operator-facing state is smoothed through debounce, coalescing, or a small state machine so short transport bumps do not shake the UI.
Recommended rules:
- coalesce by the narrowest useful key, usually
(webspace_id, device_id, receiver)or(webspace_id, node_id, section) - set an explicit burst window, for example 10-15 seconds for browser session churn
- publish the latest stable state, plus counters such as
flap_count,last_raw_state, andlast_raw_atwhen useful - let hard states bypass smoothing: revoked, denied, auth required, guard quarantined, explicit user disconnect, or admin shutdown
- never trigger a full skill snapshot rebuild for each raw hot event
- do not write raw hot-event churn into Yjs
- use the shared
HotEventBudgethelper when turning hot raw events into status cards or stream variables; keep the raw event trail in diagnostics and publish only coalesced operator state
from adaos.services.status import HotEventBudget
budget = HotEventBudget(debounce_ms=1000, window_ms=10000, max_events=5)
decision = budget.admit(
"browser.session.changed",
key=f"{webspace_id}:{device_id}",
)
if not decision.admitted:
return
This smoothing is part of the skill design. Runtime guards may limit abusive bursts, but they should not be the main mechanism that keeps the UI calm.
Minimal UI plus details
The primary desktop Yjs document should contain the minimum state needed to render the surface and explain whether it is healthy.
For heavy skills, prefer this split:
- minimal bootstrap/control state in Yjs
- operator-facing variables, active rows, and event tails in stream receivers
- details behind a
Detailsaction or modal - full diagnostic evidence in disk snapshots or 360log
Good shape:
data/infrastate/state
data/infrastate/subscriptions
stream:infrastate.summary
stream:infrastate.nodes
stream:infrastate.operations.active
tool:infrastate.get_details(section="logs")
Bad shape:
Tool and action responses
UI actions should return small acknowledgements.
Preferred response:
Avoid returning:
- full browser snapshots
- full log files
- full scenario materialization payloads
- data already published into Yjs or stream receivers
If the UI needs the data, publish it through the declared data plane and return only enough metadata for the user and logs to correlate the action.
Member-aware skills
Member skills do not own transport. The runtime, router, hub-member link, and browser choose the best delivery path.
Skill tools and handlers should accept optional routing fields:
def get_snapshot(
webspace_id: str | None = None,
node_id: str | None = None,
target_node_id: str | None = None,
_meta: dict | None = None,
**_: object,
) -> dict:
...
Rules:
- preserve
_meta.webspace_idand_meta.target_node_id - do not infer target node from global process state if the request already contains explicit routing metadata
- keep node-owned Yjs state node-scoped when it enters the shared desktop
- publish member stream data with
_meta.webspace_idand node identity
Names, aliases, and localization
Generated skills should treat human-facing names as presentation and input resolution data, not as routing identity.
Use canonical refs for actions and storage:
device:member:<node_id>device:browser:<device_id>webspace:<webspace_id>scenario:<scenario_id>skill:<skill_name>
Do not parse or persist a localized label as the only target id.
If a skill receives a phrase such as work browser or рабочий браузер, it
should let the named-entity resolver produce the canonical ref before dispatch.
Localization rules for generated skills:
- preserve exact user-confirmed names instead of translating them
- use localized aliases as resolver input, not as storage keys
- keep language-neutral observed labels such as hostnames under
locale: "und" - accept
request_localeorpreferred_localesmetadata when the runtime provides it - return canonical refs plus display labels in responses when humans need to see what was targeted
- treat runtime alias resolution as model-training neutral: aliases should
appear in
entity_resolution/ trace evidence, not as required Rasa or neural retraining inputs - propose alias changes through
sdk.data.entities.propose_alias_add,propose_alias_remove, orpropose_alias_deprecateplus the matching apply helper instead of mutating projected registry data directly; the apply result returns lifecycle event envelopes that the authoritative write path can persist and publish - when adding an alias for an actual browser/member device, prefer
sdk.data.entities.add_device_alias(device_ref, alias, locale=...); useremove_device_aliasto stop accepting an alias, anddeprecate_device_aliasto keep compatibility while marking the alias as old vocabulary. These helpers write through the governed access-link source and keep Yjs as a read-only projection - when applying an alias change from a previously read registry item, pass the
item's
fingerprintasbase_fingerprint; if the result isstale, reread the registry instead of retrying blindly - MCP clients can use
add_device_alias,remove_device_alias, anddeprecate_device_aliasfrom NLUAuthoringPlane only with a write-capable session such asProfileOpsControl; read-only sessions should useget_nlu_authoring_contextandget_named_entity_registry
Guarding and quarantine
The runtime may warn, throttle, block, or quarantine a skill owner when either Yjs or stream routes apply unsafe pressure.
Generated skills must not hide that state.
Recommended behavior:
- implement
onQuarantineoron_quarantinewhen the skill can release resources or record context - accept
ttl_s,reason,metrics,webspace_id, andowner - write a compact skill-local incident log for later LLM repair
- return structured errors such as
skill_owner_quarantined - let the Web UI render disabled/quarantined state instead of silently pretending the action succeeded
- expose which route was guarded:
yjs,stream,tool, ormixed - include the affected slot or receiver when safe to disclose
- keep enough local context to repair the data route, not just the symptom
The runtime owns the shared quarantine projection, for example data.yjs_qrnt.
Skills should not write that service branch directly.
Guard responsibilities:
- Yjs guard protects the primary document from oversized, too frequent, or poorly attributed writes.
- Stream guard protects event delivery from oversized payloads, receiver fanout, snapshot request storms, and publish loops.
- Both guards should produce bounded logs and operator-visible degraded state.
- Neither guard decides the normal data route for the skill.
Observability rules
Every skill should make failures diagnosable.
Use:
- stable error codes
- compact
trace_idor operation id - bounded logs
- explicit
retryableflags - visible
degraded/unavailablestates when data cannot be fetched - disk/360log snapshot references for large evidence
Do not:
- swallow exceptions and return stale success
- fall back to another data plane without surfacing that fallback
- report a command
ackas if browser-visible state is already delivered - retry in tight loops
- perform expensive snapshot rebuilds for every browser poll
LLM implementation workflow
Before coding:
- read
skill.yaml - read
webui.json - identify every browser-visible state branch
- write the data route plan for every browser-visible branch or receiver
- choose Yjs projection, stream, details tool, or skill-local storage for each branch, and record why
- check whether the skill is node-aware
- define size and frequency expectations
- identify hot events and define debounce/budget behavior before writing handlers
When coding:
- use SDK helpers instead of direct Yjs primitives
- keep tool responses small
- make updates idempotent
- fingerprint or coalesce heavy projections
- keep arrays bounded
- build stream payloads per receiver when possible, rather than rebuilding the whole skill snapshot
- keep raw diagnostic evidence separate from smoothed operator state
- accept routing metadata and unknown keyword args
- preserve owner attribution where helper APIs require it
Before publishing:
- verify
data_routesexists for browser-facing Yjs, stream, details, or diagnostic surfaces - verify
data_projectionsexist for Yjs state - verify stream receivers have bounded modes and snapshot-on-subscribe behavior
- verify stream receivers have
initialState, freshness metadata, and a recovery path after resubscribe - verify no handler rewrites broad Yjs roots
- verify hot events have debounce/budget tests
- verify SDK projection diagnostics show the expected
by_eventpressure counters for dirty refresh paths before optimizing a noisy event source - verify stream request bursts cannot rebuild every skill section by default
- verify status cards stay small and point to details instead of embedding detail payloads
- verify status-card compact-boundary diagnostics stay clean:
oversizedCardTotal == 0and observed card bytes are comfortably below the card budget - verify no action returns a large payload when a projection/stream is the real data path
- verify Yjs and stream guard errors are visible to the UI
Anti-patterns
Treat these as defects in LLM-generated skills:
- direct skill writes to the primary Yjs document
- broad replacement of
data,ui,registry,data.catalog,data.installed, ordata.desktop - unbounded chat/log/event arrays in Yjs
- returning a huge snapshot from
refresh_snapshot - polling a heavy snapshot endpoint to keep normal UI alive
- duplicating the same data in a tool response and Yjs
- duplicating the same replace-state in both eager stream publishes and Yjs projections on every refresh
- using HTTP/API fallback as steady-state transport for Yjs-rendered data
- hiding degraded state behind "successful" empty UI
- controlling WebRTC/Yjs channel lifecycle from business logic
- doing continuous profiling, deep JSON normalization, or full snapshot serialization inside hot handlers
- treating stream delivery as durable state without snapshot-on-subscribe
- letting subscription flaps rewrite Yjs on every subscribe/unsubscribe
- using runtime quarantine as the normal way to quiet a noisy skill
Current migration priorities
The current workspace audit suggests this priority order:
- migrate
voice_chat_skillto declared projection/stream contracts - make
browsers_skillprojection refreshes idempotent, avoid all-webspace fanout for routine events, and keep streams to snapshot-on-subscribe or genuinely high-churn data - split
infrastate_skillinto minimal summary plus details/streams - split
infrascope_skillinto demanded projection families - decide whether
mediaserverandprompt_engineer_skillshould remain tool-driven or adopt browser-facing projection contracts