Cohesix is an open-source high-assurance control-plane operating system built on the formally verified seL4 microkernel, designed to keep the trusted computing base intentionally small while enabling deterministic orchestration of edge GPU systems and auditable MLOps. Cohesix is "infrastructure for AGI".
The queen/worker verbs and /queen/ctl schema form the hive control API: one Queen instance uses these interfaces to control many workers over the shared Secure9P namespace.
This document is canonical for control-plane interfaces. Snippets marked coh-rtc are generated from
configs/root_task.toml and must not be edited by hand. If code diverges from this document, update IR,
regenerate artifacts, and then update docs/tests in the same change.
Related docs
- docs/SECURE9P.md: transport invariants and AccessPolicy ordering.
- docs/ROLES_AND_SCHEDULING.md: role-to-namespace rules.
- docs/HOST_TOOLS.md: host tool semantics and interdependencies.
- docs/API_GUIDELINES.md: REST gateway scope and mapping.
- docs/USERLAND_AND_CLI.md: CLI grammar and bounds.

At a glance
- /queen/ctl (3), /queen/*/ctl (3a–3e), /policy/ctl (10).
- /proc/* (6, 6a).
- /gpu/* (7), /host/* (8).
- /updates/*, /models/* (9).
- cohsh framing and verbs (13).
- /proc formats are breaking.

Interface invariants:
- max_bytes/msize budget... traversal.
- ERR responses are deterministic and must be treated as having no side effects unless explicitly documented.

Figure 1. Sequence diagram
sequenceDiagram
autonumber
participant Operator
participant Cohsh as cohsh
participant Console as root-task TCP console
participant ND as NineDoor
participant RT as root-task
participant QCTL as /queen/ctl
participant WT as /shard/<label>/worker/<id>/telemetry
participant LOG as /log/queen.log
participant GPUB as gpu-bridge-host
participant GPU as /gpu/<id>/*
%% =========================
%% Protocol invariants
%% =========================
Note over ND: Secure9P only. Version 9P2000.L. Remove disabled. Msize max 8192.
Note over ND: Paths are UTF-8. No NUL. Max component length 255 bytes.
Note over QCTL: Append-only control file. One command per line.
Note over Console: Line protocol. Max line length 256 bytes. ACK before side effects.
Note over GPU: Provider-backed nodes. info read-only. ctl and job append-only.
%% =========================
%% A) TCP console attachment
%% =========================
Operator->>Cohsh: run cohsh with TCP transport
Cohsh->>Console: ATTACH role ticket
alt ticket and role valid
Console-->>Cohsh: OK ATTACH
else invalid or rate-limited
Console-->>Cohsh: ERR ATTACH
end
%% Keepalive
Cohsh->>Console: PING
Console-->>Cohsh: PONG
%% Tail logs over console
Cohsh->>Console: TAIL path
Console-->>Cohsh: OK TAIL
loop log streaming
Console-->>Cohsh: log line
end
Console-->>Cohsh: END
%% =========================
%% B) Secure9P session setup
%% =========================
Operator->>Cohsh: run cohsh in 9P mode
Cohsh->>ND: TVERSION msize 8192
ND-->>Cohsh: RVERSION
Cohsh->>ND: TATTACH with ticket
alt ticket valid
ND-->>Cohsh: RATTACH
else invalid
ND-->>Cohsh: Rerror Permission
end
%% =========================
%% C) Queen control via /queen/ctl
%% =========================
Cohsh->>ND: TWALK /queen/ctl
ND-->>Cohsh: RWALK
Cohsh->>ND: TOPEN /queen/ctl append
ND-->>Cohsh: ROPEN
Cohsh->>ND: TWRITE spawn heartbeat worker
ND->>RT: validate command and permissions
alt spawn allowed
RT-->>ND: spawn OK
ND-->>Cohsh: RWRITE
else invalid or busy
RT-->>ND: error
ND-->>Cohsh: Rerror
end
%% =========================
%% D) Worker telemetry
%% =========================
RT->>WT: append heartbeat record
RT->>WT: append heartbeat record
%% =========================
%% E) GPU provider registration
%% =========================
GPUB->>ND: connect as Secure9P provider
ND-->>GPUB: provider session ready
GPUB->>GPU: publish info
GPUB->>GPU: publish ctl
GPUB->>GPU: publish job
GPUB->>GPU: publish status
%% =========================
%% F) GPU lease request
%% =========================
Cohsh->>ND: TWRITE spawn gpu lease request
ND->>RT: validate lease request
alt provider available
RT-->>ND: lease queued
ND-->>Cohsh: RWRITE
RT->>GPU: append lease to ctl
RT->>LOG: append lease issued
GPUB->>GPU: update status QUEUED
GPUB->>GPU: update status RUNNING
else provider unavailable
RT-->>ND: error Busy
ND-->>Cohsh: Rerror Busy
end
%% =========================
%% G) GPU job execution
%% =========================
Cohsh->>ND: TWRITE append job
ND-->>Cohsh: RWRITE
GPUB->>GPU: update status OK or ERR
RT->>WT: append job result
%% =========================
%% H) Tail logs via 9P
%% =========================
Cohsh->>ND: TWALK /log/queen.log
ND-->>Cohsh: RWALK
Cohsh->>ND: TOPEN read
ND-->>Cohsh: ROPEN
loop tail polling
Cohsh->>ND: TREAD offset
ND-->>Cohsh: RREAD
end
- version, attach, walk, open, read, write, clunk, stat, remove (disabled)).
- msize negotiated ≤ 8192 bytes; larger requests rejected with Rerror(TooBig).
- clunk invalidates handles immediately.
- secure9p.batch_frames); each response is keyed by its tag and may arrive out-of-order, so clients must match replies by tag instead of FIFO ordering.
- secure9p.tags_per_session) and batch back-pressure return deterministic Rerror(Invalid) or Rerror(Busy) with stable ordering, preserving prior single-request semantics when batching is disabled.

pub struct Ticket(pub [u8; 32]);
pub struct TicketClaims {
    pub role: Role,
    pub budget: Budget,
    pub subject: Option<String>,
    pub mounts: MountSpec,
    pub issued_at_ms: u64,
}
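The tag-matching requirement stated above (replies may arrive out of order and must be matched by tag, not FIFO position) can be sketched on the client side as follows. This is a minimal illustration, not the cohsh implementation: `PendingReplies`, its methods, and the request labels are hypothetical names, and the only protocol fact assumed is that a 9P tag is a 16-bit value.

```rust
use std::collections::HashMap;

/// Hypothetical client-side table of in-flight requests, keyed by 9P tag.
#[derive(Default)]
pub struct PendingReplies {
    in_flight: HashMap<u16, String>,
}

impl PendingReplies {
    /// Records a request label under its tag.
    /// Returns false if the tag is already in use by another request.
    pub fn register(&mut self, tag: u16, label: &str) -> bool {
        if self.in_flight.contains_key(&tag) {
            return false;
        }
        self.in_flight.insert(tag, label.to_string());
        true
    }

    /// Resolves a reply by its tag, regardless of arrival order.
    /// Returns None when no request with that tag is outstanding.
    pub fn complete(&mut self, tag: u16) -> Option<String> {
        self.in_flight.remove(&tag)
    }

    /// Number of requests still waiting for a reply.
    pub fn outstanding(&self) -> usize {
        self.in_flight.len()
    }
}
```

A real client would also bound the table by the configured per-session tag limit before registering a new request; that check is omitted here because the limit value is manifest-defined.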
attach.

Path: /queen/ctl (append-only JSON lines)
{"spawn":"heartbeat","ticks":100,"budget":{"ttl_s":120,"ops":500}}
{"kill":"worker-7"}
{"bind":{"from":"/shard","to":"/shadow"}}
{"mount":{"service":"gpu-bridge","at":"/gpu"}}
{"spawn":"gpu","lease":{"gpu_id":"GPU-0","mem_mb":4096,"streams":2,"ttl_s":120}}
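The command lines above are plain JSON objects, one per append. A minimal sketch of producing three of them from typed arguments, using only the standard library, might look like this. The function names are hypothetical, and the identifier check (ASCII alphanumerics, `-`, `_`) is an assumption made so that no JSON escaping is needed; the authoritative schema lives in the root task.

```rust
/// Assumed identifier rule for this sketch: non-empty ASCII alphanumerics, '-' or '_'.
fn is_simple_token(s: &str) -> bool {
    !s.is_empty()
        && s.bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_')
}

/// Renders a heartbeat spawn command as one /queen/ctl line (newline-terminated).
pub fn spawn_heartbeat_line(ticks: u32, ttl_s: u32, ops: u32) -> String {
    format!(
        "{{\"spawn\":\"heartbeat\",\"ticks\":{},\"budget\":{{\"ttl_s\":{},\"ops\":{}}}}}\n",
        ticks, ttl_s, ops
    )
}

/// Renders a kill command; returns None if the worker id fails the assumed token rule.
pub fn kill_line(worker_id: &str) -> Option<String> {
    if !is_simple_token(worker_id) {
        return None;
    }
    Some(format!("{{\"kill\":\"{}\"}}\n", worker_id))
}

/// Renders a GPU lease spawn command; returns None if the GPU id fails the token rule.
pub fn gpu_lease_line(gpu_id: &str, mem_mb: u32, streams: u32, ttl_s: u32) -> Option<String> {
    if !is_simple_token(gpu_id) {
        return None;
    }
    Some(format!(
        "{{\"spawn\":\"gpu\",\"lease\":{{\"gpu_id\":\"{}\",\"mem_mb\":{},\"streams\":{},\"ttl_s\":{}}}}}\n",
        gpu_id, mem_mb, streams, ttl_s
    ))
}
```

Each returned string is intended to be written in a single append so that the one-command-per-line rule holds.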
- ERR (schema is strict).
- spawn:"gpu" queues a lease request for the host GPU bridge; if the bridge is unavailable the command returns Error::Busy.
- /gpu/<id> entries via install_gpu_nodes; lease issuance is mirrored to /log/queen.log and /gpu/<id>/ctl.
- priority fields raise scheduling weight on the host bridge when multiple leases compete.
- cohsh, and any GUI client is expected to speak the same protocol.
- /policy/rules present), writes to /queen/ctl require approvals queued in /actions/queue.

Path: /queen/lifecycle/ctl (append-only, queen-only)
cordon
drain
resume
quiesce
reset
- ERR.
- /log/queen.log:
  - lifecycle transition old=<STATE> new=<STATE> reason=<reason>
  - lifecycle denied action=<cmd> state=<STATE> reason=<invalid-transition|outstanding-leases|invalid-command|gate-denied>
- /proc/lifecycle/state: state=<BOOTING|DEGRADED|ONLINE|DRAINING|QUIESCED|OFFLINE>
- /proc/lifecycle/reason: reason=<text>
- /proc/lifecycle/since: since_ms=<u64>

Path: /gpu/bridge/ctl (append-only, queen-only; lifecycle gate: host_publish)
Publish lines (one per append):
begin bytes=<payload_bytes> sha256=<hex>
b64:<base64_chunk>
...
end
- begin defines the expected payload byte size and SHA-256 of the decoded wire payload.
- b64: lines stream the base64-encoded wire payload in bounded chunks.
- end finalizes the snapshot; invalid size/hash results in deterministic ERR.
- /gpu/<id>/*, /gpu/models/*, and /gpu/telemetry/schema.json.

Status path: /gpu/bridge/status (read-only)
- state=idle: no active publish.
- state=receiving bytes=<n>: ingesting snapshot.
- state=ok bytes=<n> sha256=<hex>: last publish succeeded.
- state=err reason=<detail>: last publish failed (detail is bounded).

Path: /queen/schedule/ctl (append-only JSONL)
{"id":"sched-1","role":"worker-gpu","priority":2,"ticks":3,"budget_ms":120}
- id and role must be short ASCII tokens (alphanumeric, -, _); ticks and budget_ms must be > 0.
- control_plane.schedule.queue_max_entries; duplicate id entries are rejected.
- control_plane.schedule.ctl_max_bytes; overflow returns deterministic ERR.
- /proc/schedule/summary and /proc/schedule/queue expose read-only snapshots of the queue (see /proc observability).

Path: /queen/lease/ctl (append-only JSONL)
{"op":"grant","id":"lease-1","subject":"queen","resource":"gpu0","ttl_s":300,"priority":5}
{"op":"renew","id":"lease-1","ttl_s":600,"priority":6}
{"op":"preempt","id":"lease-1","reason":"timeout"}
{"op":"quota","subject":"queen","resource":"gpu0","max_active":4,"max_preemptions":8}
- op = grant|renew|preempt|quota; unknown fields are rejected.
- id, subject, and resource are bounded ASCII tokens; ttl_s, max_active, and max_preemptions must be > 0.
- control_plane.lease.*_max_entries; overflow returns deterministic ERR.
- control_plane.lease.ctl_max_bytes.
- /proc/lease/summary, /proc/lease/active, and /proc/lease/preemptions expose read-only snapshots when enabled.

Path: /queen/export/ctl (append-only JSONL)
{"op":"open","id":"export-1","ttl_s":900}
{"op":"close","id":"export-1","reason":"window-complete"}
- op = open|close; unknown fields are rejected.
- id and reason are bounded ASCII tokens; ttl_s must be > 0.
- control_plane.export.ctl_max_bytes.

- /shard/<label>/worker/<id>/telemetry (append-only, newline-delimited records).
- /worker/<id>/telemetry.
- {"tick":42,"ts_ms":123456789}.
- {"job":"jid-9","state":"RUNNING","detail":"scheduled"} followed by {"job":"jid-9","state":"OK","detail":"completed"}.
- telemetry.ring_bytes_per_worker caps the per-worker append-only ring.
- telemetry.cursor.retain_on_boot preserves or resets cursor state after reboot.
- telemetry.frame_schema gates legacy plain-text vs CBOR framing.

- /gpu/telemetry/schema.json (read-only, versioned)
- schema_version, device_id, model_id, time_window, token_count, latency_histogram.
- lora_id, confidence, entropy, drift, feedback_flags.
- /queen/telemetry/*.

/queen/telemetry/<device_id>/
- ctl: append-only control log. Accepts JSON lines of the form {"new":"segment","mime":"text/plain"}.
- seg/: directory containing OS-named segments (append-only).
- latest: read-only pointer to the newest segment (single line: <seg_id>).
- ctl; names are assigned seg-000001, seg-000002, … per device.
- u64::MAX). Random writes, truncation, and renames are rejected.
- telemetry_ingest.*:
  - max_segments_per_device
  - max_bytes_per_segment
  - max_total_bytes_per_device
  - max_reference_entries_per_segment
  - max_reference_manifest_bytes_per_segment
  - max_reference_bytes_per_segment
  - eviction_policy (refuse | evict-oldest)

/queen/export/lora_jobs/<job_id>/
- telemetry.cbor: CBOR telemetry bundle (bounded).
- base_model.ref: base model identifier (single line).
- policy.toml: export policy snapshot (TOML).

cohsh telemetry push emits UTF-8 JSON lines, one per append:
{"schema":"cohsh-telemetry-push/v1","seq":1,"mime":"text/plain","payload":"telemetry demo line 1"}
| Field | Type | Required | Description |
|---|---|---|---|
| schema | text | yes | Schema identifier; must be cohsh-telemetry-push/v1. |
| seq | uint | yes | Monotonic per-segment sequence number (starts at 1). |
| mime | text | yes | MIME type of the source payload (e.g. text/plain). |
| payload | text | yes | Opaque UTF-8 payload chunk; cohsh chunks to stay within max_record_bytes (4096). |
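The record shape in the table above can be produced by a small chunking routine. The sketch below is an illustration only, not the cohsh emitter: `push_records` and `json_escape` are hypothetical names, the chunk limit is applied to the raw payload bytes before JSON escaping (an assumption; a caller that must keep the whole record within max_record_bytes should pass a smaller limit), and chunks are cut on UTF-8 character boundaries.

```rust
/// Escapes a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Splits `payload` into cohsh-telemetry-push/v1 records with seq starting at 1.
/// Each returned string is one JSON line (without the trailing newline).
pub fn push_records(payload: &str, mime: &str, max_chunk_bytes: usize) -> Vec<String> {
    let mut records = Vec::new();
    let mut start = 0;
    let mut seq: u64 = 1;
    while start < payload.len() {
        // Tentative end of the chunk, then back off to a char boundary.
        let mut end = (start + max_chunk_bytes).min(payload.len());
        while end > start && !payload.is_char_boundary(end) {
            end -= 1;
        }
        if end == start {
            // Limit is smaller than one character: emit that character alone.
            end = start + payload[start..].chars().next().map(char::len_utf8).unwrap_or(1);
        }
        records.push(format!(
            "{{\"schema\":\"cohsh-telemetry-push/v1\",\"seq\":{},\"mime\":\"{}\",\"payload\":\"{}\"}}",
            seq,
            json_escape(mime),
            json_escape(&payload[start..end])
        ));
        seq += 1;
        start = end;
    }
    records
}
```

An empty payload yields no records in this sketch; whether the real tool emits an empty record in that case is not stated in this document.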
For large host artifacts, cohsh telemetry push and the Python SDK emit reference-manifest lines instead of inline payload transfer:
{"schema":"coh-ref-c/v1","seq":1,"off":0,"len":16777216,"sha256":"QmFzZTY0RGlnZXN0Li4u"}
| Field | Type | Required | Description |
|---|---|---|---|
| schema | text | yes | Schema identifier; must be coh-ref-c/v1. |
| seq | uint | yes | Monotonic record sequence (starts at 1). |
| off | uint | yes | Referenced byte offset. Must be contiguous (off == prior off + len). |
| len | uint | yes | Referenced chunk bytes (>= 1). |
| sha256 | text | yes | Chunk digest token (bounded ASCII digest alphabet). |
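The contiguity and length constraints in the table above can be checked before a manifest is appended. The following is a sketch under stated assumptions: `RefEntry`, `RefError`, and `check_manifest` are hypothetical names, "monotonic" is read as strictly increasing from 1, and the digest field is left out because its exact alphabet is not specified here.

```rust
/// One reference-manifest record, reduced to the fields checked in this sketch.
pub struct RefEntry {
    pub seq: u64,
    pub off: u64,
    pub len: u64,
}

/// Reasons a manifest is rejected; the payload is the index of the offending entry.
#[derive(Debug, PartialEq)]
pub enum RefError {
    BadSeq(usize),
    EmptyChunk(usize),
    Gap(usize),
    Overflow(usize),
}

/// Validates seq ordering, len >= 1, and off == prior off + len.
/// Returns the total referenced bytes on success.
pub fn check_manifest(entries: &[RefEntry]) -> Result<u64, RefError> {
    let mut prev: Option<&RefEntry> = None;
    let mut total: u64 = 0;
    for (i, e) in entries.iter().enumerate() {
        if e.len == 0 {
            return Err(RefError::EmptyChunk(i));
        }
        match prev {
            None => {
                if e.seq != 1 {
                    return Err(RefError::BadSeq(i));
                }
            }
            Some(p) => {
                if e.seq <= p.seq {
                    return Err(RefError::BadSeq(i));
                }
                // p.off + p.len was overflow-checked on the previous iteration.
                if e.off != p.off + p.len {
                    return Err(RefError::Gap(i));
                }
            }
        }
        e.off.checked_add(e.len).ok_or(RefError::Overflow(i))?;
        total = total.checked_add(e.len).ok_or(RefError::Overflow(i))?;
        prev = Some(e);
    }
    Ok(total)
}
```

The returned total could then be compared against the per-segment reference byte bound before any line is written.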
Deterministic ingest rules:
- cohsh-telemetry-push/v1) or reference-manifest (coh-ref-c/v1); mixing modes in one segment is rejected.
- max_reference_entries_per_segment, max_reference_manifest_bytes_per_segment, and max_reference_bytes_per_segment.
- msize <= 8192, record payload <= 4096 bytes).

- sharding.enabled: true
- sharding.shard_bits: 8
- sharding.legacy_worker_alias: true
- 00..ff (count: 256)
- /shard/<label>/worker/<id>/telemetry
- /worker/<id>/telemetry

Generated from configs/root_task.toml (sha256: afc015e7a9f9bea1625f43a291c485760b380eebedb622af15ebcc40f6ba2fc9).
telemetry-frame/v11

| Field | CBOR type | Required | Description |
|---|---|---|---|
| schema | text | yes | Schema identifier; must be telemetry-frame/v1. |
| worker_id | text | yes | Worker identifier emitting the record. |
| role | text | yes | Worker role label (worker-heartbeat, worker-gpu). |
| seq | uint | yes | Monotonic frame sequence number. |
| emitted_ms | uint | yes | Unix epoch milliseconds captured by the worker. |
| payload | map | yes | Schema-specific payload map (e.g., heartbeat or GPU job data). |

Generated by coh-rtc (sha256: d1906bce668a4d73d95a8262734f1ec04a1480610ebfd9b6c3f3c8ad2e402b7e).
Sidecar namespaces are manifest-gated; mounts appear only when sidecars.*.enable = true and adapter labels are compiler-resolved (hash-prefixed on collision).
/bus/<adapter> (MODBUS/DNP3)
- ctl: append-only control log for sidecar coordination (byte-for-byte, bounded by secure9p.msize).
- telemetry: append-only; when link is offline, payloads are spooled for deterministic replay.
- link: append-only; accepts online or offline to toggle link state.
- replay: append-only; any write drains the spool into telemetry and appends replay entries=<n> bytes=<m>.
- spool: read-only; lines entries=<n> bytes=<m> max_entries=<n> max_bytes=<m> plus per-frame seq=<n> bytes=<m> payload=<text>.

/lora/<adapter>
- ctl: append-only transmit attempts; duty-cycle guard enforces window/percent limits, violations return ERR and record tamper entries.
- telemetry: read-only mirror of accepted payloads (populated by ctl writes).
- tamper: read-only; lines tamper ts_ms=<ms> reason=<payload-oversize|duty-cycle> bytes=<n>.

- ERR plus a sidecar-deny audit line in /log/queen.log.

- /proc/9p/sessions (read-only, max 8192 bytes): sessions total=<u64> worker=<u64> shard_bits=<u8> shard_count=<u16> plus shard <hex> <count> lines.
- /proc/9p/outstanding (read-only, max 128 bytes): outstanding current=<u64> limit=<u64>.
- /proc/9p/short_writes (read-only, max 128 bytes): short_writes total=<u64> retries=<u64>.
- /proc/9p/session/active (read-only, max 128 bytes): active=<u64> draining=<u64>.
- /proc/9p/session/<id>/state (read-only, max 64 bytes): state=SETUP|ACTIVE|DRAINING|CLOSED.
- /proc/9p/session/<id>/since_ms (read-only, max 64 bytes): since_ms=<u64>.
- /proc/9p/session/<id>/owner (read-only, max 96 bytes): owner=<identity>.
- /proc/ingest/p50_ms (read-only, max 64 bytes): p50_ms=<u32> (milliseconds).
- /proc/ingest/p95_ms (read-only, max 64 bytes): p95_ms=<u32> (milliseconds).
- /proc/ingest/backpressure (read-only, max 64 bytes): backpressure=<u64>.
- /proc/ingest/dropped (read-only, max 64 bytes): dropped=<u64>.
- /proc/ingest/queued (read-only, max 64 bytes): queued=<u32>.
- /proc/ingest/watch (append-only, max_entries=16, line_bytes=192, min_interval_ms=50): watch ts_ms=<u64> p50_ms=<u32> p95_ms=<u32> queued=<u32> backpressure=<u64> dropped=<u64> ui_reads=<u64> ui_denies=<u64>.
- /proc/root/reachable (read-only, max 32 bytes): reachable=yes|no.
- /proc/root/last_seen_ms (read-only, max 64 bytes): last_seen_ms=<u64>.
- /proc/root/cut_reason (read-only, max 64 bytes): cut_reason=<none|network_unreachable|session_revoked|policy_denied|lifecycle_offline>.
- /proc/pressure/busy (read-only, max 64 bytes): busy=<u64>.
- /proc/pressure/quota (read-only, max 64 bytes): quota=<u64>.
- /proc/pressure/cut (read-only, max 64 bytes): cut=<u64>.
- /proc/pressure/policy (read-only, max 64 bytes): policy=<u64>.
- /proc/schedule/summary (read-only, max 128 bytes): queue=<u64> dequeued=<u64> dropped=<u64> max_entries=<u32>.
- /proc/schedule/queue (read-only, max 256 bytes): id=<id> role=<role> priority=<u32> ticks=<u32> budget_ms=<u32> seq=<u64>.
- /proc/lease/summary (read-only, max 160 bytes): active=<u64> preemptions=<u64> quotas=<u64> max_active=<u32> max_preemptions=<u32>.
- /proc/lease/active (read-only, max 256 bytes): id=<id> subject=<subject> resource=<resource> ttl_s=<u32> priority=<u32> state=<STATE> seq=<u64>.
- /proc/lease/preemptions (read-only, max 256 bytes): id=<id> subject=<subject> resource=<resource> reason=<reason> seq=<u64>.

Generated by coh-rtc (sha256: 4ff0d485329b917eeaa1b604f8adfb28fd0a75924e7d55ac818d9359b81379b5).
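Most of the /proc lines listed above share one shape: an optional leading label followed by space-separated key=value fields. A reader for that shape can be sketched as below. This is an illustrative parser, not part of cohsh; `parse_proc_line` and `field_u64` are hypothetical names, and the sketch assumes values never contain whitespace.

```rust
use std::collections::HashMap;

/// Splits a /proc line into an optional leading label and its key=value fields.
/// Tokens without '=' after the first position are ignored in this sketch.
pub fn parse_proc_line(line: &str) -> (Option<&str>, HashMap<&str, &str>) {
    let mut label = None;
    let mut fields = HashMap::new();
    for (i, tok) in line.split_whitespace().enumerate() {
        match tok.split_once('=') {
            Some((k, v)) => {
                fields.insert(k, v);
            }
            None if i == 0 => label = Some(tok),
            None => {}
        }
    }
    (label, fields)
}

/// Reads one field as a u64; returns None if it is missing or not numeric.
pub fn field_u64(fields: &HashMap<&str, &str>, key: &str) -> Option<u64> {
    fields.get(key)?.parse().ok()
}
```

For example, a line read from /proc/9p/outstanding would yield the label `outstanding` and numeric `current` and `limit` fields, while /proc/root/reachable yields no label and a single `reachable` field.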
- ui_providers.* gates visibility; /proc UI providers require corresponding observability.*, /policy/preflight/* requires ecosystem.policy.enable, /updates/* requires cas.enable.
- ERR and emit ui-provider audit lines.
- /proc/9p/sessions(.cbor): text output matches /proc observability; CBOR map: total, worker, shard_bits, shard_count, shards[] {label, count}.
- /proc/9p/outstanding(.cbor): text output matches /proc observability; CBOR map: current, limit.
- /proc/9p/short_writes(.cbor): text output matches /proc observability; CBOR map: total, retries.
- /proc/ingest/p50_ms(.cbor): text output matches /proc observability; CBOR map: p50_ms.
- /proc/ingest/p95_ms(.cbor): text output matches /proc observability; CBOR map: p95_ms.
- /proc/ingest/backpressure(.cbor): text output matches /proc observability; CBOR map: backpressure.
- /policy/preflight/req(.cbor): text: req total=<u64> queued=<u64> consumed=<u64> plus req id=<id> target=<path> decision=<allow|deny> state=<queued|consumed>.
- /policy/preflight/req.cbor: CBOR map: total, queued, consumed, actions[] {id, target, decision, state}.
- /policy/preflight/diff(.cbor): text: diff rules=<u64> actions=<u64> unmatched=<u64> plus rule id=<id> target=<path> queued=<u64> consumed=<u64>.
- /policy/preflight/diff.cbor: CBOR map: rules, actions, unmatched, entries[] {id, target, queued, consumed}.
- /updates/<epoch>/manifest.cbor: read-only; schema cohesix-cas/manifest-v1.
- /updates/<epoch>/status(.cbor): text lines: status epoch=<epoch> state=<empty|manifest_pending|chunks_pending|ready>, manifest_bytes=<u64> manifest_pending_bytes=<u64>, chunks_expected=<u64> chunks_committed=<u64> chunks_pending=<u64> chunks_missing=<u64>, payload_bytes=<u64> payload_sha256=<hex|none>, delta_base_epoch=<epoch|none> delta_base_sha256=<hex|none>.
- /updates/<epoch>/status.cbor: CBOR map: epoch, state, manifest_bytes, manifest_pending_bytes, chunks_expected, chunks_committed, chunks_pending, chunks_missing, payload_bytes, payload_sha256 (bytes or null), delta (map {base_epoch, base_sha256} or null).

- cohsh; SWARMUI_TRANSPORT=9p enables Secure9P. No new verbs or in-VM services are introduced.
- tail transcripts must emit OK ..., stream lines, and terminate with END exactly as the CLI does.
- /proc/schedule/* and /proc/lease/*; they never append or mutate control files.
- $DATA_DIR/snapshots/; no network access or retries occur while offline.
- ERR -> error pulse, other lines -> telemetry); the frontend applies bounded diffs and makes no per-event draw guarantees.
- swarmui.hive.poll_workers_per_tick workers per poll, caps per-worker pending lines with swarmui.hive.pending_lines_per_worker, and drops oldest events beyond swarmui.hive.pending_event_cap to keep newest telemetry visible.
- /proc/root/reachable, /proc/root/cut_reason, /proc/9p/session/active, and /proc/pressure/* (text-only). UI badges render ROOT OK vs CUT, session counts highlight DRAINING, and pressure counters are displayed inline.
- swarmui.hive.status_poll_ms and cached between polls; the Live Hive poll loop follows the same interval (clamped to ≥250 ms) so telemetry deltas refresh once per poll within the configured event budget.
- ERR lines tagged with reason=busy|quota|cut|policy are classified and displayed by category instead of a single generic failure bucket.
- --replay <snapshot> accepts CBOR snapshots or transcripts; offline mode reads $DATA_DIR/snapshots/hive:<key>.cbor.
- swarmui.hive.frame_cap_fps), step (swarmui.hive.step_ms), budgets (swarmui.hive.lod_event_budget, swarmui.hive.snapshot_max_events, swarmui.hive.pending_event_cap), and pressure threshold (swarmui.hive.degrade_pressure) are compiler-emitted.
- cohsh-core tail buffers, not in UI code.
- splice); a debug metrics hook is reserved for UI performance harnesses.
- apps/swarmui/frontend/assets/fonts (mono ligatures off by default via .mono, opt-in .mono.liga), colors and spacing live in apps/swarmui/frontend/styles/colors.css and apps/swarmui/frontend/styles/tokens.css, hive tokens mirror CSS in apps/swarmui/frontend/hive/tokens.js, icons use apps/swarmui/frontend/assets/icons/sprite.svg via apps/swarmui/frontend/components/icon.js, and layout spacing is limited to 4/8/12/16/24/32 with no shadows.

| Path | Mode | Description |
|---|---|---|
| /gpu/bridge/ctl | append-only | GPU bridge snapshot publish channel (begin/b64:/end). |
| /gpu/bridge/status | read-only | Publish status (state=idle\|receiving\|ok\|err). |
| /gpu/<id>/info | read-only | JSON metadata: vendor, model, memory, SMs, driver/runtime versions |
| /gpu/<id>/ctl | append-only | Lease management: LEASE, RELEASE, PRIORITY <n> |
| /gpu/<id>/lease | append-only | Lease/ticket log entries (gpu-lease/v1) with active/release state |
| /gpu/<id>/job | append-only | JSON job descriptors (validated hash, grid/block dims, optional payload_b64) |
| /gpu/<id>/status | read-only append stream | Job lifecycle entries (QUEUED/RUNNING/OK/ERR) |
| /gpu/models/available/<model_id>/manifest.toml | read-only | Host-authored model manifests; no uploads from the VM |
| /gpu/models/active | append-only pointer | Symlink-like pointer to the active model (atomic swap on host) |
| /gpu/telemetry/schema.json | read-only | Versioned schema descriptor (gpu-telemetry/v1) with field and size limits |
| /gpu/telemetry/* | host-only | Telemetry records remain host-side; only the schema is mirrored into the VM. |
- memory_mb, sm_count, and version fields.
- dev-virt QEMU runs without a host GPU bridge, the root-task exposes mock /gpu/<id>/info, /gpu/<id>/lease, and /gpu/<id>/status entries (GPU-0/GPU-1) for CLI demos; /gpu/models and /gpu/telemetry/schema.json remain host-mirrored only.

- coh.run.lease.schema: gpu-lease/v1
- coh.run.lease.active_state: ACTIVE
- coh.run.lease.max_bytes: 1024
- coh.run.breadcrumb.schema: gpu-breadcrumb/v1
- coh.run.breadcrumb.max_line_bytes: 512
- coh.run.breadcrumb.max_command_bytes: 256
- schema, state, gpu_id, worker_id, mem_mb, streams, ttl_s, priority.
- schema, event, command, status, exit_code (optional).

Generated by coh-rtc (sha256: 80eff6277e0b97c54fc8996ffc01a54ccff20b899bcd0e9f63c30de1afb02f80).

- /gpu/models/active before emitting telemetry and propagate the model_id/lora_id into every record.
- max_record_bytes or omit required fields must be rejected by host-side emitters; the VM does not accept /gpu/telemetry/* writes.

/host)

| Path | Mode | Description |
|---|---|---|
| /host/systemd/<unit>/status | append-only | Host-published unit status snapshots (mock or live) |
| /host/systemd/<unit>/start | append-only | Control sink for start requests (queen-only) |
| /host/systemd/<unit>/stop | append-only | Control sink for stop requests (queen-only) |
| /host/systemd/<unit>/restart | append-only | Control sink for restart requests (queen-only) |
| /host/k8s/node/<name>/cordon | append-only | Control sink for cordon requests (queen-only) |
| /host/k8s/node/<name>/drain | append-only | Control sink for drain requests (queen-only) |
| /host/docker/status | append-only | Host-published Docker status snapshot (mock or live) |
| /host/docker/restart | append-only | Control sink for restart requests (queen-only) |
| /host/docker/stop | append-only | Control sink for stop requests (queen-only) |
| /host/nvidia/gpu/<id>/status | append-only | Host-published GPU status snapshots (mock or live) |
| /host/nvidia/gpu/<id>/power_cap | append-only | Control sink for power-cap changes (queen-only) |
| /host/nvidia/gpu/<id>/thermal | append-only | Host-published thermal snapshots (mock or live) |
| /host/tickets/spec | append-only JSONL | Host control ticket requests (host-ticket/v1) |
| /host/tickets/status | append-only JSONL | Host control ticket lifecycle receipts (host-ticket-result/v1) |
| /host/tickets/deadletter | append-only JSONL | Terminal failure/expiry receipts (host-ticket-result/v1) |
| /host/tickets/spec.snapshot | read-only | Bounded snapshot view of /host/tickets/spec |
| /host/tickets/status.snapshot | read-only | Bounded snapshot view of /host/tickets/status |
| /host/tickets/deadletter.snapshot | read-only | Bounded snapshot view of /host/tickets/deadletter |
Line formats (append-only snapshots; values are sanitized and lines capped at 256 bytes):
- state=<state> sub=<substate>
- state=<ready|unknown|...> role=<role> version=<version>
- version=<ver> containers=<n> running=<n> paused=<n> stopped=<n>
- util_pct=<n> mem_used_mb=<n> mem_total_mb=<n> temp_c=<n> power_w=<n>
- temp_c=<n>
- state=unknown reason=<detail> and thermal falls back to temp_c=unknown.

- schema, id, idempotency_key, action; optional target, args, expires_unix_ms.
- source_hive, target_hive, relay_hop, relay_correlation_id.
- schema, id, idempotency_key, action, state; optional message.
- source_hive, target_hive, relay_hop, relay_correlation_id.
- id/idempotency_key tokens are bounded ASCII ([A-Za-z0-9._:-], max 128 bytes).
- source_hive and target_hive are pair-required (both set or both unset).
- relay_hop must be in range 1..=32 when present.
- relay_correlation_id follows the same bounded token charset ([A-Za-z0-9._:-]).
- <= ecosystem.host.tickets.max_line_bytes and <= secure9p.msize.
- queued -> claimed -> running -> succeeded|failed|expired.
- id + idempotency_key
- id + idempotency_key + source_hive + target_hive

Federation relay policy is manifest-gated under ecosystem.host.federation.* (peer inventory, allowlisted actions, queue/WAL bounds, timeout).
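The host ticket field rules above (token charset and length, paired hive fields, relay_hop range, and the lifecycle order) lend themselves to small predicate functions. The sketch below is one possible reading, not the host tool's code: all names are hypothetical, empty tokens are rejected by assumption, and the lifecycle chain is read literally, so only `running` may move to a terminal state.

```rust
/// Token rule from the ticket schema: [A-Za-z0-9._:-], at most 128 bytes.
/// Rejecting the empty string is an assumption of this sketch.
pub fn is_ticket_token(s: &str) -> bool {
    !s.is_empty()
        && s.len() <= 128
        && s.bytes()
            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'.' | b'_' | b':' | b'-'))
}

/// Relay fields: source_hive and target_hive are both set or both unset,
/// and relay_hop, when present, lies in 1..=32.
pub fn relay_fields_ok(
    source_hive: Option<&str>,
    target_hive: Option<&str>,
    relay_hop: Option<u32>,
) -> bool {
    let paired = source_hive.is_some() == target_hive.is_some();
    let hop_ok = relay_hop.map_or(true, |h| (1..=32).contains(&h));
    paired && hop_ok
}

/// Ticket lifecycle states in the documented order.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum TicketState {
    Queued,
    Claimed,
    Running,
    Succeeded,
    Failed,
    Expired,
}

/// Allows only the documented forward steps:
/// queued -> claimed -> running -> succeeded|failed|expired.
pub fn can_advance(from: TicketState, to: TicketState) -> bool {
    use TicketState::*;
    matches!(
        (from, to),
        (Queued, Claimed)
            | (Claimed, Running)
            | (Running, Succeeded)
            | (Running, Failed)
            | (Running, Expired)
    )
}
```

If the real implementation lets a queued or claimed ticket expire directly, `can_advance` would need extra pairs; the source text does not say either way.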
- /host is only mounted when ecosystem.host.enable = true; providers are selected from ecosystem.host.providers[] and mounted at ecosystem.host.mount_at.
- Permission (EPERM) errors and emit audit lines that include the ticket and path.
- /log/queen.log logging path; no new logging protocol is introduced.

- /updates/<epoch> and written via append-only manifest.cbor and chunks/<sha256> nodes; chunk payloads must exactly match cas.store.chunk_bytes.
- delta.base_epoch, delta.base_sha256), and the payload hash covers base + delta bytes.
- /models/<sha256>/{weights,schema,signature} and become read-only once committed; the entire registry is gated by ecosystem.models.enable.
- bind /models/<sha256> /worker/worker-1/model.

- cas.store.chunk_bytes: 128
- cas.delta.enable: true
- cas.signing.required: true
- /updates/<epoch>/manifest.cbor, /updates/<epoch>/chunks/<sha256>.
- /models/<sha256>/weights, /models/<sha256>/schema, /models/<sha256>/signature.
- delta.base_epoch and delta.base_sha256, referencing a non-delta base.
- b64:-prefixed base64.

{
  "chunk_bytes": 128,
  "chunks": [
    "<sha256-hex>"
  ],
  "delta": {
    "base_epoch": "<epoch>",
    "base_sha256": "<sha256-hex>"
  },
  "epoch": "<epoch>",
  "payload_bytes": "<payload-bytes>",
  "payload_sha256": "<sha256-hex>",
  "schema": "cohesix-cas/manifest-v1",
  "signature": "<ed25519-signature-hex>"
}
Generated by coh-rtc (sha256: 1bd13b5ce9da8c2e5442e87cfca3e95daa90ee3fbba7de30e21855f19a3ae8a5).
/policy, /actions)

| Path | Mode | Description |
|---|---|---|
| /policy/ctl | append-only | Policy control JSONL commands (validated UTF-8, manifest-bounded) |
| /policy/rules | read-only | Manifest-derived policy rules snapshot |
| /actions/queue | append-only | JSONL approvals/denials (id, target, decision) |
| /actions/<id>/status | read-only | Status snapshot (queued → consumed) |
- ecosystem.policy.enable = true.
- EPERM and append a policy audit line to /log/queen.log.
- configs/root_task.toml and emitted verbatim in /policy/rules for deterministic inspection.

Policy control (/policy/ctl) JSONL:
{"op":"apply","id":"rev-2026-02-03","sha256":"0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"}
{"op":"rollback","id":"rev-2026-02-03"}
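A client can reject malformed apply commands locally before appending to /policy/ctl. The sketch below assumes the sha256 field must be exactly 64 hex characters, as this section requires; the id bound used here (non-empty, at most 64 bytes, ASCII alphanumerics plus `-`, `_`, `.`) is an assumption for illustration, since the manifest-defined bound is not given in this document. `policy_apply_line` is a hypothetical helper name.

```rust
/// True when `s` is exactly 64 ASCII hex characters.
pub fn is_sha256_hex(s: &str) -> bool {
    s.len() == 64 && s.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Builds one newline-terminated apply command for /policy/ctl.
/// The id rule is an assumed bound, not the manifest's actual limit.
pub fn policy_apply_line(id: &str, sha256: &str) -> Result<String, &'static str> {
    let id_ok = !id.is_empty()
        && id.len() <= 64
        && id
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'-' | b'_' | b'.'));
    if !id_ok {
        return Err("invalid id");
    }
    if !is_sha256_hex(sha256) {
        return Err("invalid sha256");
    }
    Ok(format!(
        "{{\"op\":\"apply\",\"id\":\"{}\",\"sha256\":\"{}\"}}\n",
        id, sha256
    ))
}
```

Local validation of this kind only saves a round trip; the server-side check remains authoritative and still returns a deterministic ERR for rejected lines.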
- op = apply|rollback; unknown fields are rejected.
- id must be a bounded ASCII token; sha256 must be 64 hex characters.
- id.
- ecosystem.policy.ctl_max_bytes; overflow returns deterministic ERR.

/audit, /replay)

| Path | Mode | Description |
|---|---|---|
| /audit/journal | append-only | JSONL audit journal of Cohesix control actions (bounded by manifest) |
| /audit/decisions | append-only | Policy approvals/denials (policy-action, policy-gate) with role/ticket metadata |
| /audit/export | read-only | Snapshot of retention bounds (journal_base, journal_next, decisions_base, decisions_next) plus replay flags |
| /replay/ctl | append-only | Replay command JSON ({"from":<cursor>}) |
| /replay/status | read-only | Replay status (idle/ok/err) with deterministic sequence_fnv1a |
- ecosystem.audit.enable = true; ReplayFS nodes require ecosystem.audit.replay_enable = true.
- /audit/journal and /replay/ctl enforce append-only semantics; offset mismatches return deterministic Invalid errors and emit audit lines.
- ERR and produce no side effects.

pub trait RootTaskControl {
    fn spawn(&self, role: Role, spec: WorkerSpec) -> Result<WorkerId, SpawnError>;
    fn kill(&self, id: WorkerId) -> Result<(), KillError>;
    fn bind(&self, session: SessionId, from: &str, to: &str) -> Result<(), NamespaceError>;
    fn mount(&self, session: SessionId, service: &str, at: &str) -> Result<(), NamespaceError>;
}
- WorkerSpec includes budget, initial telemetry seed, and optional GPU lease request.

cohsh) Protocol

- msize, then issues 9P ops corresponding to shell commands.
- tail uses repeated read calls with offset tracking; NineDoor enforces append-only by ignoring provided offsets.
- bind and mount commands are no-ops for non-queen roles.
- --transport tcp connects to the root-task console listener (default 127.0.0.1:31337) and speaks a Secure9P-style framed protocol:
  - ATTACH <role> <ticket?> → OK ATTACH role=<role> on success or ERR ATTACH reason=<cause> on failure.
  - TAIL <path> emits OK TAIL path=<path> before newline-delimited log entries; the stream still terminates with END.
  - CAT <path> emits OK CAT path=<path> data=<summary> before newline-delimited contents; the stream still terminates with END.
  - LS <path> currently returns ERR LS reason=unsupported path=<path> until directory listings are exposed.
  - LOG, ECHO, SPAWN) mirror serial behaviour and return a single acknowledgement before triggering side effects.
  - PING / PONG probes keep sessions alive; the client sends PING every 15 seconds of inactivity and expects an immediate PONG even when the server is mid-stream.
  - ERR FRAME reason=invalid-length and the session remains open. cohsh additionally validates worker tickets locally, rejecting whitespace or malformed values so automation does not leak failed attempts over the wire.
- .coh format consumed by coh> test; see the canonical spec in USERLAND_AND_CLI.md for syntax and assertion rules.
- dev-virt, QEMU forwards 127.0.0.1:{31337/tcp,31338/udp,31339/tcp} to 10.0.2.15 for the console and self-test ports; the virtio-net backend is the default (net-backend-virtio), with RTL8139 available as a fallback by removing that feature. Operators generally do not need to care which NIC is active, but the backend label appears in boot logs for diagnostics.
- cohsh is the authoritative implementation of this protocol, and the planned WASM GUI is conceptually another client that wraps the same verbs without introducing a new control surface.

| Error | Meaning |
|---|---|
| Permission | Role not permitted to access path or mode |
| NotFound | Path or worker ID missing |
| Busy | Resource in use (GPU lease, worker slot) |
| Invalid | JSON parse failure or malformed 9P frame |
| TooBig | Frame exceeds negotiated msize |
| Closed | Fid used after clunk or revoked ticket |
| RateLimited | Console authentication locked out due to repeated failures |
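Client code often wants the error names in the table above as a closed set rather than as free-form strings. One way to model that is sketched below; `ControlError` and its methods are hypothetical names, and `is_transient` encodes an assumed client-side retry policy (Busy and RateLimited may clear on their own), not a rule stated by this document.

```rust
/// The error names from the table above, modelled as a closed set.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ControlError {
    Permission,
    NotFound,
    Busy,
    Invalid,
    TooBig,
    Closed,
    RateLimited,
}

impl ControlError {
    /// Maps an error name as it appears in the table to a variant.
    /// Matching is exact and case-sensitive.
    pub fn from_name(name: &str) -> Option<Self> {
        match name {
            "Permission" => Some(Self::Permission),
            "NotFound" => Some(Self::NotFound),
            "Busy" => Some(Self::Busy),
            "Invalid" => Some(Self::Invalid),
            "TooBig" => Some(Self::TooBig),
            "Closed" => Some(Self::Closed),
            "RateLimited" => Some(Self::RateLimited),
            _ => None,
        }
    }

    /// Assumed client policy: only resource contention and lockouts are worth retrying.
    pub fn is_transient(self) -> bool {
        matches!(self, Self::Busy | Self::RateLimited)
    }
}
```

Because ERR responses are documented as deterministic and free of side effects, retrying a transient error does not risk a duplicated action in this model.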
ROLES_AND_SCHEDULING.md and BUILD_PLAN.md before implementation.