LLM-Safe Skill Development Guide

Status: current guidance and target contract.

This guide is written for humans and LLM agents that create or update AdaOS skills. Its goal is simple: a generated skill should be useful without being able to overload the shared desktop, hide failures, or bypass runtime governance.

Read this together with:

Golden rule

Do not treat the primary Yjs document as a free-form database.

Normal skill-owned browser-visible state must go through governed SDK helpers and declared projection routes. Direct Yjs mutation from a skill is legacy or explicitly capability-gated, not the default authoring model.

Preferred data-plane choices:

data_projections plus ctx_subnet.set() / ctx_subnet.set_async() for compact reconnect-stable bootstrap/control state.
stream_variable_publish(), stream_publish(), and webio.receivers for high-churn live variables, append-heavy data, and operator-facing variables.
skill-local storage for private durable skill state.
tool/detail endpoints for explicit user-requested details.
360log or disk snapshots for later diagnostics, not browser steady-state rendering.

Responsibility model

The skill author chooses the data route. The runtime does not silently move a skill's data between Yjs and streams.

That choice is part of the skill design and must be visible in skill.yaml, webui.json, handler code, and tests. For LLM-authored work, the route choice must be treated as a reviewable implementation decision, not an accidental side effect of the helper API used.

Runtime guardrails still enforce shared safety:

Yjs owner guards count attempted and applied Yjs writes, attribute pressure to the skill owner, and may warn, throttle, block, or quarantine unsafe owners.
Stream guards bound payload size, publish rate, snapshot request bursts, and receiver fanout, and must log suppressions or degraded delivery.
Guards emit diagnostics and quarantine records so the UI and future LLM repair loops can explain the failure.
Guards are emergency control, not a replacement for a well-designed data route.

The desired failure mode is explicit: a badly routed skill should become visible as a design defect and be returned for repair. It should not be hidden by runtime magic that makes the browser appear healthy while the skill keeps producing unsafe data.

Required data route plan

Before editing a browser-facing skill, write down the route plan. A concise comment in the implementation notes, PR description, or adjacent docs is enough, but the design must be explicit.

For every widget, modal section, status row, and detail view, answer:

surface: what browser surface consumes this data?
route: yjs, stream, tool/details, skill-local, or disk/360log.
why: reconnect-stable bootstrap, live variable, explicit drill-down, private durable cache, or diagnostic evidence.
first_paint: what does the user see before live data arrives?
recovery: how does the surface recover after room rebuild, reconnect, or stream resubscribe?
update_source: which events or commands can update it?
budget: expected payload size, event rate, coalescing window, and maximum fanout.
guard_visibility: what warning, degraded state, or incident is shown when the route is throttled, blocked, or quarantined?

If a route cannot answer these questions, do not add it yet.

Data-plane decision table

Need	Use	Avoid
Bootstrap/control state needed for first paint	Yjs projection	Full operational snapshot in Yjs
Selected ids, compact health badge, latest stable status	Yjs projection	Rewriting `data`, `ui`, or `registry` broadly
Operator-facing variables, active operations, logs, telemetry, chat/event tail	Stream receiver	Unbounded arrays in Yjs
Big diagnostics or object inspector payload	Details tool / stream snapshot / disk snapshot	Embedding full diagnostics in primary Yjs
Small operator health/guard summary	Status card pointing to stream/tool/details route	Treating `statusPlane` as a live data route
Durable private skill cache	Skill-local files or DB	Hidden browser-only state as source of truth
Command from UI to runtime	`callHost` / tool with small ack	Large command response used as data transport
Raw high-frequency evidence	Stream or disk/360log	Smoothed Yjs status that loses diagnostic truth
Smoothed operator status	Debounced stream or compact Yjs badge	Flickering every raw transport event

Skill manifest checklist

Every browser-facing skill should make its data contract explicit.

Use skill.yaml to declare:

data_routes for the reviewable route plan: surface, route, first paint, recovery, budget, and guard visibility.
tools with stable input and output schema.
exports.tools for callable public tools.
events.subscribe for command or domain events.
data_projections only for browser-visible Yjs branches the skill owns.
webui.receivers in webui.json for live stream variables.
optional lifecycle hooks such as healthcheck, drain, dispose, and onQuarantine / on_quarantine when the skill can clean up or explain a guard action.

Every declared Yjs projection should have a reason to be reconnect-stable. Every stream receiver should have bounded delivery semantics and an initial or snapshot-on-subscribe story.

Example:

data_routes:
- surface: widget:weather_status
  route: yjs
  projection_slot: weather.snapshot
  first_paint: cached compact weather status
  recovery: Yjs replay restores the latest compact status
  update_source: [weather.refresh.completed]
  budget:
    max_payload_bytes: 4096
    max_publish_hz: 0.2
    snapshot_policy: on_subscribe
  guard_visibility:
    degraded_state: weather status shows stale/degraded
    log: service.weather_skill.runtime.log
    quarantine: true
- surface: modal:weather_history
  route: stream
  receiver: weather.history
  first_paint: empty history with loading state
  recovery: bounded stream snapshot requested on subscribe

data_projections:
- scope: subnet
  slot: weather.snapshot
  targets:
  - backend: yjs
    path: data/weather

tools:
- name: get_snapshot
  description: Return the compact current weather state.
  entry: handlers.main:get_snapshot
  input_schema:
    type: object
    properties:
      webspace_id:
        type: string
      target_node_id:
        type: string
  output_schema:
    type: object
    required: [ok]
    properties:
      ok:
        type: boolean
      current:
        type: object

Browser-visible Yjs writes

Use logical slots, not raw paths, in handler code.

Yjs is for the minimum reconnect-stable state needed to bootstrap the surface, preserve collaborative/control state, and explain health. It is not the normal transport for changing variables, diagnostic tables, event tails, or raw runtime evidence.

Preferred:

from adaos.sdk.data import ctx_subnet

ctx_subnet.set(
    "weather.snapshot",
    {"current": current},
    webspace_id=webspace_id,
)

For async handlers:

await ctx_subnet.set_async(
    "adaos_connect.current",
    current,
    webspace_id=webspace_id,
)

Avoid in normal skills:

webspace_ydoc
get_ydoc()
async_get_ydoc()
direct y_py transactions
replacing broad roots such as data, ui, registry, data.catalog, data.installed, or data.desktop
writing hot telemetry, logs, session churn, transport events, or stream tails into Yjs because a widget needs to see them

If a legacy skill still needs direct Yjs access, document why and keep it on a short migration path toward ProjectionService / ctx_subnet.

Make hot projection writes idempotent before calling the SDK helper. Runtime projection code can skip physical no-op mutations, but guard/governance checks still see the attempted write. For refresh-heavy skills, keep a small per-(webspace_id, slot) fingerprint and do not call ctx_subnet.set*() when the semantic payload has not changed. Keep an explicit recovery path, such as a user/API refresh_snapshot, that can bypass this fingerprint when the browser reports a missing projection after room rebuild or reconnect.

Do not fan out routine projection refreshes to every webspace by default. Target the webspace from event metadata or the UI action. Reserve all-webspace fanout for boot, activation, migration, or explicit resync events.

Yjs payloads should be small enough to inspect in logs and reason about in code review. If a projection is hard to summarize in one short schema paragraph, it is probably too large for Yjs and should be split into stream variables or details.

Stream data

Use streams for data that changes often, grows by appending, or represents operator-facing variables that should not be durable collaborative state.

Streams are not a free replacement for Yjs. They are active volatile delivery: messages can be missed during reconnect, subscriptions can flap, and duplicate or out-of-order payloads can happen around recovery. Design every stream as a bounded replace or append channel with explicit recovery.

Declare receivers in webui.json:

{
  "webio": {
    "receivers": {
      "voice_chat.messages": {
        "mode": "append",
        "collectionKey": "items",
        "dedupeBy": "id",
        "maxItems": 100,
        "initialState": { "items": [] }
      }
    }
  }
}

Publish from the skill:

from adaos.sdk.io import stream_publish, stream_variable_publish

stream_variable_publish(
    "voice_chat.status",
    {"state": "ready", "peer_count": 1},
    var_id="status",
    ttl_ms=30000,
    _meta={"webspace_id": webspace_id},
)

stream_publish(
    "voice_chat.messages",
    {"items": [message]},
    _meta={"webspace_id": webspace_id, "target_node_id": target_node_id},
)

Stream rules:

keep payloads bounded
dedupe events with stable ids
provide snapshot-on-subscribe for widgets that should not open empty
coalesce repeated snapshot requests per receiver/webspace/node
include updated_at, seq, stable ids, or a content fingerprint when the receiver needs to reject stale or duplicate payloads
prefer stream_variable_publish() for replace-mode current-state variables; it wraps id, value, seq, updated_at, fingerprint, and optional ttl_ms consistently
use mode: "replace" for current-state variables and include a complete bounded current value in each snapshot
use mode: "append" only for true tails, with maxItems, dedupeBy, and a clear truncation policy
provide an honest initialState such as loading, stale, degraded, or an empty bounded collection
do not eager-publish a replace stream for the same state that the widget is already reading from Yjs; use streams for separate high-churn state or snapshot-on-subscribe recovery
do not copy stream tails back into Yjs just to make them visible

Stream variables should be demand-aware. A stream receiver that is not subscribed should not keep rebuilding full snapshots just in case a browser opens later. Prefer receiver-specific builders over one monolithic skill snapshot.

Status cards

Use status cards for small operator summaries that must be cheap to poll, stream, or project. A card is not a detail payload. It carries identity, current state, freshness, and a pointer to the details route.

statusPlane is not a third data route. It is a compact index over the routes you already declared in data_routes, data_projections, and webui.receivers. If a card needs rows, inventories, logs, diagnostics, or a tail, put those values in a stream receiver or details tool and put only the reference in the card.

from adaos.sdk.status import publish_status, publish_status_stream

publish_status(
    id="runtime",
    kind="runtime",
    scope="infrastate",
    status="ready",
    summary="runtime ready",
    ttl_ms=30000,
    details_ref={"kind": "stream", "receiver": "infrastate.runtime"},
    route={"kind": "stream", "receiver": "infrastate.runtime"},
    webspace_id=webspace_id,
)

publish_status_stream(
    "infrastate.runtime",
    id="runtime",
    kind="runtime",
    scope="infrastate",
    status="warning",
    summary="route reconnecting",
    ttl_ms=30000,
    webspace_id=webspace_id,
    _meta={"webspace_id": webspace_id},
)

Status card rules:

use status values that normalize through CanonicalStatus: ready, online, warning, degraded, down, offline, or unknown
keep summary short and operator-facing
include ttl_ms for live runtime cards so stale UI can degrade honestly
use incident_id only when the card represents a real active warning or incident
put stream/tool references in details_ref; do not embed logs, tables, inventories, or tails into the card
never declare route: status or route: statusPlane; the route belongs to Yjs, stream, details/tool, skill-local storage, or diagnostic evidence
put the design-time data route in route so guard diagnostics can map pressure back to the skill route plan
use publish_status_stream() when the card itself should also be available as a replace-mode stream variable
verify cards through GET /api/node/status/cards; the compatibility /api/node/reliability/summary surface also carries a compact statusPlane block for badge/status UI during migration
polling clients should prefer GET /api/node/reliability/summary?mode=thin&webspace_id=<id> and send If-None-Match on the next request; unchanged snapshots return 304 without rebuilding the full reliability payload
use GET /api/node/reliability/summary/metrics during soak/debug runs to verify thin/full mode counts, response bytes, 304 reuse, and the compact acceptance block with status-registry, Yjs owner-guard/quarantine, stream guard, stream-control, and per-receiver pressure counters
for a human-readable check, use adaos node reliability-metrics --webspace <id> --receiver <stream> and include the acceptance.* lines in soak notes
verify statusPlane.diagnostics.oversizedCardTotal == 0; a nonzero value means a status card is being used as a payload container and needs a route redesign
Yjs pressure, stream guard, and stream-control pressure are also projected as compact guard cards in statusPlane; use their guardRef to map observed pressure back to owner, route, receiver/path, budget, and quarantine context

Hot events and smoothing

Some events are useful evidence but terrible UI clocks. Examples include:

browser.session.changed
device.registered
webrtc.peer.state.changed
YWS open, close, guard, quarantine, and reconnect events
network route flaps
fast operation progress ticks

Handle these as two different products:

Raw evidence goes to diagnostics streams, bounded logs, or 360log so the operator and LLM repair loop can see what really happened.
Operator-facing state is smoothed through debounce, coalescing, or a small state machine so short transport bumps do not shake the UI.

Recommended rules:

coalesce by the narrowest useful key, usually (webspace_id, device_id, receiver) or (webspace_id, node_id, section)
set an explicit burst window, for example 10-15 seconds for browser session churn
publish the latest stable state, plus counters such as flap_count, last_raw_state, and last_raw_at when useful
let hard states bypass smoothing: revoked, denied, auth required, guard quarantined, explicit user disconnect, or admin shutdown
never trigger a full skill snapshot rebuild for each raw hot event
do not write raw hot-event churn into Yjs
use the shared HotEventBudget helper when turning hot raw events into status cards or stream variables; keep the raw event trail in diagnostics and publish only coalesced operator state

from adaos.services.status import HotEventBudget

budget = HotEventBudget(debounce_ms=1000, window_ms=10000, max_events=5)
decision = budget.admit(
    "browser.session.changed",
    key=f"{webspace_id}:{device_id}",
)
if not decision.admitted:
    return

This smoothing is part of the skill design. Runtime guards may limit abusive bursts, but they should not be the main mechanism that keeps the UI calm.

Minimal UI plus details

The primary desktop Yjs document should contain the minimum state needed to render the surface and explain whether it is healthy.

For heavy skills, prefer this split:

minimal bootstrap/control state in Yjs
operator-facing variables, active rows, and event tails in stream receivers
details behind a Details action or modal
full diagnostic evidence in disk snapshots or 360log

Good shape:

data/infrastate/state
data/infrastate/subscriptions
stream:infrastate.summary
stream:infrastate.nodes
stream:infrastate.operations.active
tool:infrastate.get_details(section="logs")

Bad shape:

data/infrastate = <full multi-thousand-line snapshot every refresh>

Tool and action responses

UI actions should return small acknowledgements.

Preferred response:

{
  "ok": true,
  "accepted": true,
  "status": "refresh_scheduled",
  "trace_id": "..."
}

Avoid returning:

full browser snapshots
full log files
full scenario materialization payloads
data already published into Yjs or stream receivers

If the UI needs the data, publish it through the declared data plane and return only enough metadata for the user and logs to correlate the action.

Member-aware skills

Member skills do not own transport. The runtime, router, hub-member link, and browser choose the best delivery path.

Skill tools and handlers should accept optional routing fields:

def get_snapshot(
    webspace_id: str | None = None,
    node_id: str | None = None,
    target_node_id: str | None = None,
    _meta: dict | None = None,
    **_: object,
) -> dict:
    ...

Rules:

preserve _meta.webspace_id and _meta.target_node_id
do not infer target node from global process state if the request already contains explicit routing metadata
keep node-owned Yjs state node-scoped when it enters the shared desktop
publish member stream data with _meta.webspace_id and node identity

Names, aliases, and localization

Generated skills should treat human-facing names as presentation and input resolution data, not as routing identity.

Use canonical refs for actions and storage:

device:member:<node_id>
device:browser:<device_id>
webspace:<webspace_id>
scenario:<scenario_id>
skill:<skill_name>

Do not parse or persist a localized label as the only target id. If a skill receives a phrase such as work browser or рабочий браузер, it should let the named-entity resolver produce the canonical ref before dispatch.

Localization rules for generated skills:

preserve exact user-confirmed names instead of translating them
use localized aliases as resolver input, not as storage keys
keep language-neutral observed labels such as hostnames under locale: "und"
accept request_locale or preferred_locales metadata when the runtime provides it
return canonical refs plus display labels in responses when humans need to see what was targeted
treat runtime alias resolution as model-training neutral: aliases should appear in entity_resolution / trace evidence, not as required Rasa or neural retraining inputs
propose alias changes through sdk.data.entities.propose_alias_add, propose_alias_remove, or propose_alias_deprecate plus the matching apply helper instead of mutating projected registry data directly; the apply result returns lifecycle event envelopes that the authoritative write path can persist and publish
when adding an alias for an actual browser/member device, prefer sdk.data.entities.add_device_alias(device_ref, alias, locale=...); use remove_device_alias to stop accepting an alias, and deprecate_device_alias to keep compatibility while marking the alias as old vocabulary. These helpers write through the governed access-link source and keep Yjs as a read-only projection
when applying an alias change from a previously read registry item, pass the item's fingerprint as base_fingerprint; if the result is stale, reread the registry instead of retrying blindly
MCP clients can use add_device_alias, remove_device_alias, and deprecate_device_alias from NLUAuthoringPlane only with a write-capable session such as ProfileOpsControl; read-only sessions should use get_nlu_authoring_context and get_named_entity_registry

Guarding and quarantine

The runtime may warn, throttle, block, or quarantine a skill owner when either Yjs or stream routes apply unsafe pressure.

Generated skills must not hide that state.

Recommended behavior:

implement onQuarantine or on_quarantine when the skill can release resources or record context
accept ttl_s, reason, metrics, webspace_id, and owner
write a compact skill-local incident log for later LLM repair
return structured errors such as skill_owner_quarantined
let the Web UI render disabled/quarantined state instead of silently pretending the action succeeded
expose which route was guarded: yjs, stream, tool, or mixed
include the affected slot or receiver when safe to disclose
keep enough local context to repair the data route, not just the symptom

The runtime owns the shared quarantine projection, for example data.yjs_qrnt. Skills should not write that service branch directly.

Guard responsibilities:

Yjs guard protects the primary document from oversized, too frequent, or poorly attributed writes.
Stream guard protects event delivery from oversized payloads, receiver fanout, snapshot request storms, and publish loops.
Both guards should produce bounded logs and operator-visible degraded state.
Neither guard decides the normal data route for the skill.

Observability rules

Every skill should make failures diagnosable.

Use:

stable error codes
compact trace_id or operation id
bounded logs
explicit retryable flags
visible degraded / unavailable states when data cannot be fetched
disk/360log snapshot references for large evidence

Do not:

swallow exceptions and return stale success
fall back to another data plane without surfacing that fallback
report a command ack as if browser-visible state is already delivered
retry in tight loops
perform expensive snapshot rebuilds for every browser poll

LLM implementation workflow

Before coding:

read skill.yaml
read webui.json
identify every browser-visible state branch
write the data route plan for every browser-visible branch or receiver
choose Yjs projection, stream, details tool, or skill-local storage for each branch, and record why
check whether the skill is node-aware
define size and frequency expectations
identify hot events and define debounce/budget behavior before writing handlers

When coding:

use SDK helpers instead of direct Yjs primitives
keep tool responses small
make updates idempotent
fingerprint or coalesce heavy projections
keep arrays bounded
build stream payloads per receiver when possible, rather than rebuilding the whole skill snapshot
keep raw diagnostic evidence separate from smoothed operator state
accept routing metadata and unknown keyword args
preserve owner attribution where helper APIs require it

Before publishing:

verify data_routes exists for browser-facing Yjs, stream, details, or diagnostic surfaces
verify data_projections exist for Yjs state
verify stream receivers have bounded modes and snapshot-on-subscribe behavior
verify stream receivers have initialState, freshness metadata, and a recovery path after resubscribe
verify no handler rewrites broad Yjs roots
verify hot events have debounce/budget tests
verify SDK projection diagnostics show the expected by_event pressure counters for dirty refresh paths before optimizing a noisy event source
verify stream request bursts cannot rebuild every skill section by default
verify status cards stay small and point to details instead of embedding detail payloads
verify status-card compact-boundary diagnostics stay clean: oversizedCardTotal == 0 and observed card bytes are comfortably below the card budget
verify no action returns a large payload when a projection/stream is the real data path
verify Yjs and stream guard errors are visible to the UI

Anti-patterns

Treat these as defects in LLM-generated skills:

direct skill writes to the primary Yjs document
broad replacement of data, ui, registry, data.catalog, data.installed, or data.desktop
unbounded chat/log/event arrays in Yjs
returning a huge snapshot from refresh_snapshot
polling a heavy snapshot endpoint to keep normal UI alive
duplicating the same data in a tool response and Yjs
duplicating the same replace-state in both eager stream publishes and Yjs projections on every refresh
using HTTP/API fallback as steady-state transport for Yjs-rendered data
hiding degraded state behind "successful" empty UI
controlling WebRTC/Yjs channel lifecycle from business logic
doing continuous profiling, deep JSON normalization, or full snapshot serialization inside hot handlers
treating stream delivery as durable state without snapshot-on-subscribe
letting subscription flaps rewrite Yjs on every subscribe/unsubscribe
using runtime quarantine as the normal way to quiet a noisy skill

Current migration priorities

The current workspace audit suggests this priority order:

migrate voice_chat_skill to declared projection/stream contracts
make browsers_skill projection refreshes idempotent, avoid all-webspace fanout for routine events, and keep streams to snapshot-on-subscribe or genuinely high-churn data
split infrastate_skill into minimal summary plus details/streams
split infrascope_skill into demanded projection families
decide whether mediaserver and prompt_engineer_skill should remain tool-driven or adopt browser-facing projection contracts