Cohesix is an open-source, high-assurance control-plane operating system built on the formally verified seL4 microkernel. It keeps the trusted computing base intentionally small while enabling deterministic orchestration and telemetry of edge GPU nodes under a Queen/Worker hive model, with auditable MLOps; Cohesix is "infrastructure for AGI".

This document describes the current as-built system for the QEMU aarch64/virt target, the uefi-aarch64 manifest profile, and the macOS host; manifest-gated features are called out explicitly.
Scope:
- Target: aarch64/virt (GICv3) running upstream seL4, plus UEFI boot packaging (elfloader.efi path) for uefi-aarch64; userspace is a pure Rust CPIO rootfs.
- Host tooling: cohsh/cohsh-core.

Non-goals: POSIX or libc-style emulation layers and an in-VM GPU stack (GPU nodes are host-published).

Trust boundary:
- The seL4 kernel and the in-VM userspace (root-task and worker binaries) form the trusted computing base.
- Host tooling (cohsh, coh, swarmui, gpu-bridge-host, host-sidecar-bridge, cas-tool) is outside the TCB and interacts only through Secure9P or the console.

SMP:
- SMP builds require SMP=ON and KernelMaxNumNodes >= 2.
- Set COHESIX_QEMU_SMP (or QEMU_SMP) to match KernelMaxNumNodes, or provide a full topology via COHESIX_QEMU_SMP_TOPO, when running scripts/qemu-run.sh or scripts/cohesix-build-run.sh.
- The PSCI method must be smc; the SMP kernel build must ingest a DTB dumped with virtualization=on so the elfloader emits PSCI_METHOD_SMC for the platform.
- Core affinity is configured via root_task.affinity inside configs/root_task.toml; all core indices must be < max_cores, and max_cores must match the kernel build's node count when affinity is enabled. Defaults enable affinity with authority_core=0, ninedoor_cores=[1], provider_cores=[2, 3], and worker_cores=[2, 3] unless overridden in the manifest. When enabled, root-task pins its init TCB to authority_core during bootstrap, temporarily applies the NineDoor core when attaching the bridge, and applies worker-core affinity during worker spawns. The smp debug command cycles the init TCB across the configured role cores and emits per-core scheduler dumps to prove core reachability.

Components:
- root-task (apps/root-task): seL4 bootstrap, CSpace management, event pump, console (serial + TCP), ticket issuance, log buffer (/log/queen.log), HAL, and the in-VM NineDoor bridge.
- NineDoor (apps/nine-door): Secure9P server for host builds and in-process tests. On seL4, apps/root-task/src/ninedoor.rs provides NineDoorBridge, a namespace/control shim used by the console path.
- Secure9P crates (crates/secure9p-*): 9P2000.L codec and session logic used by NineDoor, cohsh, and coh.
- Workers (apps/worker-heart, apps/worker-gpu, apps/worker-bus, apps/worker-lora): role-specific binaries; orchestration is file-driven via /queen/ctl and role-scoped mounts.
- Host tools: cohsh CLI (apps/cohsh), coh host bridge (apps/coh), swarmui UI (apps/swarmui), hive-gateway REST gateway, gpu-bridge-host, host-sidecar-bridge, and cas-tool.
- tools/coh-rtc generates root-task tables, policies, and docs snippets from configs/root_task.toml into apps/root-task/src/generated, out/manifests/, and docs/snippets/.

Secure9P surface:
- Operations: version, attach, walk, open, read, write, clunk, stat; remove is disabled.
- Bounds: msize <= 8192, walk depth <= 8, UTF-8 only, no ..; walks validate each component and reject invalid or oversized segments.
- Append-only control files and sinks: /queen/ctl, /queen/lifecycle/ctl, /queen/schedule/ctl, /queen/lease/ctl, /queen/export/ctl, /policy/ctl, /gpu/bridge/ctl, /log/*, /queen/telemetry/*, telemetry, and policy/audit sinks.

Console:
- The serial console presents a cohesix> prompt when serial-console is built; it is used for bring-up and bootinfo checks.
- The TCP console is available when net-console is built; frames are length-prefixed (4-byte little-endian) and capped by Secure9P bounds (msize <= 8192).
- Clients complete an AUTH <token> handshake before any console verbs; failed auth is rate-limited.
- The console grammar lives in cohsh-core; acknowledgements (OK / ERR) precede side effects, and streamed commands terminate with END.

Clients and host tools:
- cohsh is the canonical operator client. It speaks Secure9P for in-process/host NineDoor sessions and the console grammar over TCP for QEMU/VM sessions.
- coh is the host bridge for GPU leases, telemetry pulls, and PEFT lifecycle; it reuses the same console grammar and manifests.
- swarmui is observational only and reuses cohsh-core tailers; it does not add verbs or protocols.
- hive-gateway is a host-only REST projection of LS/CAT/ECHO with manifest-derived bounds; it uses a bounded broker dispatcher (control + telemetry queues, fair scheduling) and never introduces new control semantics.
- gpu-bridge-host and host-sidecar-bridge publish provider data into /gpu/* and /host/* via Secure9P; they never run inside the VM.
- cas-tool uploads CAS bundles via append-only /updates/* flows over the TCP console.

Boot and CSpace:
- Bootstrap reads seL4_CapInitThreadCNode and bootinfo.initThreadCNodeSizeBits, validates the bootinfo.empty window, and logs copy/mint/retype tuples before consuming slots.
- Generated tables and policies are compiled in from apps/root-task/src/generated.
- The NIC gate (hw.no_nic) suppresses net-console bring-up.
- The attestation gate (hw.attestation.*) is evaluated and can abort boot deterministically.
- The local-seat gate (hw.local_seat.*) is evaluated with fail-fast/degrade semantics.
- The log buffer (/log/queen.log) and NineDoorBridge are initialized.
- The CSpace is sized by initThreadCNodeSizeBits, with seL4_CapInitThreadCNode as the root and offsets fixed at 0.
- New slots are consumed only from the bootinfo.empty window; reserved slots remain untouched.

Role-scoped namespace views:

| Role | Namespace view (as-built) | Notes |
| --- | --- | --- |
| Queen | Full tree (/, /queen, /log, /proc, /shard/*/worker/*, legacy /worker/* when enabled) plus manifest-gated /gpu, /host, /policy, /actions, /audit, /replay, /updates, /models | Queen tickets are optional; worker tickets are required. |
| WorkerHeartbeat | /proc/boot, /proc/lifecycle/*, /shard/<label>/worker/<id>/telemetry, /log/queen.log (RO); legacy /worker/<id>/telemetry when enabled | Ticket must include a subject identity. |
| WorkerGpu | WorkerHeartbeat view + /gpu/<id>/* when GPU nodes are present | GPU nodes are host-published; no in-VM GPU stack. |
| WorkerBus | WorkerHeartbeat view + /bus/<adapter>/* when MODBUS/DNP3 sidecars are enabled | Scope is derived from ticket subject. |
| WorkerLora | WorkerHeartbeat view + /lora/<adapter>/* when LoRa sidecars are enabled | Scope is derived from ticket subject. |
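The role views in the table above reduce to a per-role path filter. The sketch below is illustrative only (names like `visible` are hypothetical, not the root-task API), assuming sharding is enabled and the legacy alias is off:

```rust
// Hypothetical sketch of the role-scoped namespace views; the real
// enforcement lives in root-task/NineDoor, not in a helper like this.
#[derive(Clone, Copy)]
enum Role { Queen, WorkerHeartbeat, WorkerGpu, WorkerBus, WorkerLora }

/// Return true when `path` is visible to `role` (simplified).
fn visible(role: Role, path: &str) -> bool {
    // Shared WorkerHeartbeat baseline from the table.
    let base = path.starts_with("/proc/boot")
        || path.starts_with("/proc/lifecycle/")
        || path.starts_with("/shard/")
        || path == "/log/queen.log";
    match role {
        Role::Queen => true, // full tree plus manifest-gated subtrees
        Role::WorkerHeartbeat => base,
        Role::WorkerGpu => base || path.starts_with("/gpu/"),
        Role::WorkerBus => base || path.starts_with("/bus/"),
        Role::WorkerLora => base || path.starts_with("/lora/"),
    }
}

fn main() {
    assert!(visible(Role::Queen, "/queen/ctl"));
    assert!(visible(Role::WorkerGpu, "/gpu/0/info"));
    assert!(!visible(Role::WorkerHeartbeat, "/gpu/0/info"));
}
```

In the real system the filter is derived from the ticket subject and manifest gates rather than hard-coded prefixes.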
Mount and bind semantics:
- bind and mount are queen-only.
- mount is limited to manifest-provided namespace mounts (see generated mounts; logs maps to /log by default).
- The canonical telemetry path is /shard/<label>/worker/<id>/telemetry; the legacy /worker/<id>/telemetry alias exists only when sharding.legacy_worker_alias = true.

Protocol limits:
- msize <= 8192; walk depth <= 8; UTF-8 paths; no ..; no fid reuse after clunk; remove disabled.
- Append-only control files and sinks (/queen/ctl, /queen/lifecycle/ctl, /queen/schedule/ctl, /queen/lease/ctl, /queen/export/ctl, /policy/ctl, /log/*, telemetry, policy/audit sinks, /gpu/bridge/ctl, and /queen/telemetry/*) ignore offsets and reject writes that break bounds.
- Binary size budgets are enforced in CI (scripts/ci/size_guard.sh).
- Userspace is no_std; there are no POSIX or libc-style emulation layers.

Control and data flows:
- Worker control writes are appended to /queen/ctl; NineDoor validates them, and the root task updates worker state and bind tables and audits to /log/queen.log.
- Lifecycle transitions are appended to /queen/lifecycle/ctl; /proc/lifecycle/* exposes state, reason, and since-ms.
- Scheduling, leases, and exports are driven via /queen/schedule/ctl, /queen/lease/ctl, and /queen/export/ctl; /proc/schedule/* and /proc/lease/* expose bounded read-only snapshots.
- Worker telemetry is appended to /shard/<label>/worker/<id>/telemetry; ring sizes and schema selection are manifest-driven (telemetry.ring_bytes_per_worker, telemetry.frame_schema).
- Device telemetry ingest lands under /queen/telemetry/<device_id>/; quotas and eviction are manifest-driven (telemetry_ingest.*). Large artifacts use bounded reference manifests (coh-ref-c/v1) instead of generic file transfer, and /proc/ingest/* reports ingest health.
- Audit records are appended to /log/queen.log; only queen/host tools append.
- /proc/boot exposes manifest fingerprints, Milestone 26 hardware gates, and attestation evidence hashes; /proc/tests/* carries regression scripts; /proc/9p/*, /proc/root/*, /proc/pressure/*, /proc/ingest/*, /proc/schedule/*, and /proc/lease/* surface bounded stats when enabled.
- The host bridge publishes /gpu/<id>/*, /gpu/models/*, and /gpu/telemetry/schema.json via /gpu/bridge/ctl; worker-gpu reads info/status and appends to job/ctl within ticket scope.
- /host/* is present only when enabled in the manifest; providers are published by host-sidecar-bridge.
- /policy, /actions, /audit, /replay appear only when enabled in the manifest; /policy/ctl drives apply/rollback and /actions/queue carries approvals/denials; writes are append-only and audited.
- /updates/* and /models/* are available when the CAS and model registry gates are enabled (cas.enable, ecosystem.models.enable).

Tickets and operator workflows:
- Tickets are keyed MACs (blake3::keyed_hash) bound to role, budget, subject, and mount scope.
- All clients share cohsh-core to keep ACK/ERR/END lines deterministic across transports.
- Cache behavior is manifest-driven (cache.*) and audited; misconfiguration is rejected by coh-rtc.
- Use cohsh --transport tcp for authenticated remote workflows.
- cohsh appends to /queen/ctl and /queen/lifecycle/ctl, then tails /log/queen.log or worker telemetry files.
- Run gpu-bridge-host --publish (or coh peft import --publish) to refresh /gpu/*, then coh gpu lease/run for host-side GPU workflows.
- hive-gateway exposes a host-only HTTP projection of LS/CAT/ECHO for automation; bounds and semantics match the console/file grammar.
- coh> test executes the preinstalled /proc/tests/*.coh scripts; it is the canonical regression gate for console and Secure9P behavior.
- scripts/cohsh/run_regression_batch.sh runs the full .coh suite across base and gated manifests using QEMU.

```mermaid
flowchart LR
  subgraph Host["Host (outside VM/TCB)"]
    Cohsh["cohsh (CLI)"]
    GPUB["gpu-bridge-host"]
    HS["host-sidecar-bridge"]
  end
  subgraph VM["Cohesix target (seL4 VM)"]
    subgraph K["seL4 kernel"]
      SEL4["seL4"]
    end
    subgraph U["Userspace (CPIO rootfs)"]
      RT["root-task\n(event pump + console + HAL)"]
      ND["NineDoor\n(Secure9P namespace; bridge in seL4 build)"]
      WH["worker-heart"]
      WG["worker-gpu"]
      WB["worker-bus"]
      WL["worker-lora"]
    end
  end
  subgraph NS["Secure9P namespace (role-scoped)"]
    QCTL["/queen/ctl"]
    LOG["/log/queen.log"]
    PROC["/proc/*"]
    TEL["/shard/<label>/worker/<id>/telemetry"]
    GPU["/gpu/<id>/* (when enabled)"]
    HOST["/host/* (when enabled)"]
  end
  SEL4 --> RT
  RT --> ND
  WH --> ND
  WG --> ND
  WB --> ND
  WL --> ND
  ND --> QCTL
  ND --> LOG
  ND --> PROC
  ND --> TEL
  ND --> GPU
  ND --> HOST
  Cohsh -->|"TCP console"| RT
  Cohsh -->|"Secure9P client"| ND
  GPUB -->|"Secure9P provider"| ND
  HS -->|"Secure9P provider"| ND
```
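The Secure9P walk bounds described earlier (msize <= 8192, walk depth <= 8, UTF-8 only, no `..`) can be sketched as a component validator. This is illustrative only; the real checks live in the secure9p crates, and the per-component byte cap here is an assumption:

```rust
// Sketch of Secure9P walk validation. MAX_NAME_BYTES is an assumed
// per-component cap, not a documented Cohesix constant.
const MAX_WALK_DEPTH: usize = 8;
const MAX_NAME_BYTES: usize = 255;

fn validate_walk(components: &[&str]) -> Result<(), &'static str> {
    if components.len() > MAX_WALK_DEPTH {
        return Err("walk depth exceeded");
    }
    for c in components {
        if c.is_empty() || *c == ".." {
            return Err("invalid component");
        }
        if c.len() > MAX_NAME_BYTES {
            return Err("oversized component");
        }
        // &str is UTF-8 by construction; a wire decoder would verify
        // UTF-8 here before accepting the component.
    }
    Ok(())
}

fn main() {
    assert!(validate_walk(&["queen", "ctl"]).is_ok());
    assert!(validate_walk(&["..", "etc"]).is_err());
    assert!(validate_walk(&["a"; 9]).is_err()); // depth 9 > 8
}
```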
```mermaid
sequenceDiagram
  autonumber
  participant Operator
  participant Cohsh as cohsh
  participant TCP as root-task TCP console
  participant ND as NineDoorBridge
  participant Log as /log/queen.log
  Operator->>Cohsh: run cohsh --transport tcp
  Cohsh->>TCP: AUTH <token>
  TCP-->>Cohsh: OK AUTH (or ERR AUTH)
  Cohsh->>TCP: ATTACH queen [ticket]
  TCP->>ND: validate role + ticket
  ND-->>TCP: accept/deny
  TCP-->>Cohsh: OK ATTACH role=queen (or ERR ATTACH)
  Cohsh->>TCP: TAIL /log/queen.log
  TCP->>ND: open + snapshot
  TCP-->>Cohsh: OK TAIL path=/log/queen.log
  loop stream
    TCP-->>Cohsh: log line
  end
  TCP-->>Cohsh: END
```
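The net-console framing used in this exchange — a 4-byte little-endian length prefix per frame, capped by the Secure9P msize, with OK/ERR acks preceding side effects and streams terminated by END — can be sketched as follows. Function and enum names are illustrative, not the cohsh-core API:

```rust
// Sketch of net-console framing and reply classification; illustrative only.
const MAX_FRAME: usize = 8192; // Secure9P msize bound

/// Encode one console line as a length-prefixed frame.
fn encode_frame(line: &str) -> Result<Vec<u8>, &'static str> {
    let payload = line.as_bytes();
    if payload.len() > MAX_FRAME {
        return Err("frame exceeds msize bound");
    }
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(payload);
    Ok(out)
}

/// Classify a reply line per the console grammar.
enum Reply { Ok, Err, End, Data }

fn classify(line: &str) -> Reply {
    if line.starts_with("OK") { Reply::Ok }
    else if line.starts_with("ERR") { Reply::Err }
    else if line == "END" { Reply::End }
    else { Reply::Data } // streamed payload (e.g. a tailed log line)
}

fn main() {
    let f = encode_frame("AUTH secret-token").unwrap();
    assert_eq!(&f[..4], &17u32.to_le_bytes()); // 17-byte payload
    assert!(matches!(classify("OK AUTH"), Reply::Ok));
    assert!(matches!(classify("END"), Reply::End));
}
```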
The live publish path keeps /gpu/models/* and /gpu/telemetry/schema.json out of the VM until the host bridge pushes a bounded snapshot. PEFT import optionally refreshes the live model registry immediately after updating the host registry.
```mermaid
flowchart LR
  subgraph Host["Host"]
    GBH["gpu-bridge-host"]
    COH["coh peft import"]
    REG["host model registry"]
  end
  subgraph VM["VM"]
    ND["NineDoor /gpu/bridge/ctl"]
    GPU["/gpu/<id>/*"]
    MODELS["/gpu/models/*"]
    SCHEMA["/gpu/telemetry/schema.json"]
  end
  REG -->|writes| COH
  COH -->|"--publish/--refresh-gpu-models"| GBH
  GBH -->|"bounded snapshot"| ND
  ND --> GPU
  ND --> MODELS
  ND --> SCHEMA
```
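A "bounded snapshot" here means the host bridge never streams an unbounded registry into the VM. A minimal sketch, assuming each publish write is capped by the Secure9P msize (8192); the function name is hypothetical:

```rust
// Split a serialized snapshot into msize-bounded append writes.
// Illustrative only; the real publisher lives in gpu-bridge-host.
const MSIZE: usize = 8192;

fn bounded_writes(snapshot: &[u8]) -> Vec<&[u8]> {
    snapshot.chunks(MSIZE).collect()
}

fn main() {
    let snapshot = vec![0u8; 20_000];
    let writes = bounded_writes(&snapshot);
    assert_eq!(writes.len(), 3); // 8192 + 8192 + 3616
    assert!(writes.iter().all(|w| w.len() <= MSIZE));
}
```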
Live Hive renders only what the backend tailers ingest. Polling bounds and line caps live in cohsh-core, not in the UI.
```mermaid
flowchart LR
  W["worker telemetry file\n/shard/<label>/worker/<id>/telemetry"] --> TAIL["cohsh-core tailer"]
  TAIL --> BUF["bounded line buffers"]
  BUF --> UI["SwarmUI Live Hive overlays + detail panel"]
  UI -->|"read-only"| PROC["/proc/root/*, /proc/pressure/*, /proc/9p/session/active"]
```
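The tailer bounds that cohsh-core enforces can be sketched from the manifest values (tail.poll_ms_default 1500, min 500, max 10000; swarmui per-line cap 160 bytes). Function names are illustrative, not the cohsh-core API:

```rust
// Sketch of cohsh-core tail bounds; illustrative helpers over the
// manifest-derived constants.
const POLL_MS_MIN: u64 = 500;
const POLL_MS_MAX: u64 = 10_000;
const POLL_MS_DEFAULT: u64 = 1_500;
const LINE_CAP_BYTES: usize = 160;

/// Clamp a requested poll interval into the manifest window.
fn clamp_poll_ms(requested: Option<u64>) -> u64 {
    requested.unwrap_or(POLL_MS_DEFAULT).clamp(POLL_MS_MIN, POLL_MS_MAX)
}

/// Truncate a telemetry line to the per-line byte cap (UTF-8 safe).
fn cap_line(line: &str) -> &str {
    if line.len() <= LINE_CAP_BYTES {
        return line;
    }
    let mut end = LINE_CAP_BYTES;
    while !line.is_char_boundary(end) {
        end -= 1; // back off to a valid UTF-8 boundary
    }
    &line[..end]
}

fn main() {
    assert_eq!(clamp_poll_ms(None), 1_500);
    assert_eq!(clamp_poll_ms(Some(50)), 500);
    assert_eq!(clamp_poll_ms(Some(60_000)), 10_000);
    assert_eq!(cap_line(&"x".repeat(200)).len(), 160);
}
```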
Related documents and paths:
- AGENTS.md
- README.md
- docs/BUILD_PLAN.md
- docs/INTERFACES.md
- docs/SECURE9P.md
- docs/USERLAND_AND_CLI.md
- docs/ROLES_AND_SCHEDULING.md
- docs/REPO_LAYOUT.md
- docs/GPU_NODES.md
- docs/HOST_TOOLS.md
- configs/root_task.toml
- out/manifests/root_task_resolved.json
- apps/root-task, apps/nine-door, apps/cohsh, apps/coh, apps/swarmui
- tools/coh-rtc
- scripts/cohsh/run_regression_batch.sh
- tests/integration

The following block is generated by coh-rtc and mirrored from docs/snippets/root_task_manifest.md. Do not edit by hand.
```text
meta.author: Lukas Bower
meta.purpose: Root-task manifest input for coh-rtc.
root_task.schema: 1.5
profile.name: virt-aarch64
profile.kernel: true
event_pump.tick_ms: 5
secure9p.msize: 8192
secure9p.walk_depth: 8
secure9p.tags_per_session: 16
secure9p.batch_frames: 1
secure9p.short_write.policy: reject
ticket_limits.max_scopes: 8
ticket_limits.max_scope_path_len: 128
ticket_limits.max_scope_rate_per_s: 64
ticket_limits.bandwidth_bytes: 131072
ticket_limits.cursor_resumes: 16
ticket_limits.cursor_advances: 256
cas.enable: true
cas.store.chunk_bytes: 128
cas.delta.enable: true
cas.signing.required: true
cas.signing.key_path: resources/fixtures/cas_signing_key.hex
telemetry.ring_bytes_per_worker: 1024
telemetry.frame_schema: legacy-plaintext
telemetry.cursor.retain_on_boot: false
telemetry_ingest.max_segments_per_device: 4
telemetry_ingest.max_bytes_per_segment: 32768
telemetry_ingest.max_total_bytes_per_device: 131072
telemetry_ingest.eviction_policy: evict-oldest
lifecycle.initial_state: BOOTING
lifecycle.auto_transitions: BOOTING->ONLINE
control_plane.schedule.enable: true
control_plane.schedule.queue_max_entries: 64
control_plane.schedule.ctl_max_bytes: 8192
control_plane.lease.enable: true
control_plane.lease.active_max_entries: 64
control_plane.lease.preemptions_max_entries: 64
control_plane.lease.ctl_max_bytes: 8192
control_plane.export.enable: true
control_plane.export.ctl_max_bytes: 2048
observability.proc_9p.sessions: true
observability.proc_9p.outstanding: true
observability.proc_9p.short_writes: true
observability.proc_9p.sessions_bytes: 8192
observability.proc_9p.outstanding_bytes: 128
observability.proc_9p.short_writes_bytes: 128
observability.proc_9p_session.active: true
observability.proc_9p_session.state: true
observability.proc_9p_session.since_ms: true
observability.proc_9p_session.owner: true
observability.proc_9p_session.active_bytes: 128
observability.proc_9p_session.state_bytes: 64
observability.proc_9p_session.since_ms_bytes: 64
observability.proc_9p_session.owner_bytes: 96
observability.proc_ingest.p50_ms: true
observability.proc_ingest.p95_ms: true
observability.proc_ingest.backpressure: true
observability.proc_ingest.dropped: true
observability.proc_ingest.queued: true
observability.proc_ingest.watch: true
observability.proc_ingest.p50_ms_bytes: 64
observability.proc_ingest.p95_ms_bytes: 64
observability.proc_ingest.backpressure_bytes: 64
observability.proc_ingest.dropped_bytes: 64
observability.proc_ingest.queued_bytes: 64
observability.proc_ingest.watch_max_entries: 16
observability.proc_ingest.watch_line_bytes: 192
observability.proc_ingest.watch_min_interval_ms: 50
observability.proc_ingest.latency_samples: 32
observability.proc_ingest.latency_tolerance_ms: 5
observability.proc_ingest.counter_tolerance: 1
observability.proc_root.reachable: true
observability.proc_root.last_seen_ms: true
observability.proc_root.cut_reason: true
observability.proc_root.reachable_bytes: 32
observability.proc_root.last_seen_ms_bytes: 64
observability.proc_root.cut_reason_bytes: 64
observability.proc_pressure.busy: true
observability.proc_pressure.quota: true
observability.proc_pressure.cut: true
observability.proc_pressure.policy: true
observability.proc_pressure.busy_bytes: 64
observability.proc_pressure.quota_bytes: 64
observability.proc_pressure.cut_bytes: 64
observability.proc_pressure.policy_bytes: 64
observability.proc_schedule.summary: true
observability.proc_schedule.queue: true
observability.proc_schedule.summary_bytes: 128
observability.proc_schedule.queue_bytes: 256
observability.proc_lease.summary: true
observability.proc_lease.active: true
observability.proc_lease.preemptions: true
observability.proc_lease.summary_bytes: 160
observability.proc_lease.active_bytes: 256
observability.proc_lease.preemptions_bytes: 256
ui_providers.proc_9p.sessions: true
ui_providers.proc_9p.outstanding: true
ui_providers.proc_9p.short_writes: true
ui_providers.proc_ingest.p50_ms: true
ui_providers.proc_ingest.p95_ms: true
ui_providers.proc_ingest.backpressure: true
ui_providers.policy_preflight.req: false
ui_providers.policy_preflight.diff: false
ui_providers.updates.manifest: true
ui_providers.updates.status: true
client_policies.cohsh.pool.control_sessions: 2
client_policies.cohsh.pool.telemetry_sessions: 4
client_policies.cohsh.tail.poll_ms_default: 1500
client_policies.cohsh.tail.poll_ms_min: 500
client_policies.cohsh.tail.poll_ms_max: 10000
client_policies.cohsh.host_telemetry.nvidia_poll_ms: 1000
client_policies.cohsh.host_telemetry.systemd_poll_ms: 2000
client_policies.cohsh.host_telemetry.docker_poll_ms: 2000
client_policies.cohsh.host_telemetry.k8s_poll_ms: 5000
client_policies.coh.mount.root: /
client_policies.coh.mount.allowlist: /proc, /queen, /worker, /log, /gpu, /host
client_policies.coh.telemetry.root: /queen/telemetry
client_policies.coh.telemetry.max_devices: 32
client_policies.coh.telemetry.max_segments_per_device: 4
client_policies.coh.telemetry.max_bytes_per_segment: 32768
client_policies.coh.telemetry.max_total_bytes_per_device: 131072
client_policies.retry.max_attempts: 3
client_policies.retry.backoff_ms: 200
client_policies.retry.ceiling_ms: 2000
client_policies.retry.timeout_ms: 5000
client_policies.heartbeat.interval_ms: 15000
client_paths.queen_ctl: /queen/ctl
client_paths.queen_lifecycle_ctl: /queen/lifecycle/ctl
client_paths.queen_schedule_ctl: /queen/schedule/ctl
client_paths.queen_lease_ctl: /queen/lease/ctl
client_paths.queen_export_ctl: /queen/export/ctl
client_paths.policy_ctl: /policy/ctl
client_paths.log: /log/queen.log
swarmui.ticket_scope: per-ticket
swarmui.cache.enabled: false
swarmui.cache.max_bytes: 262144
swarmui.cache.ttl_s: 3600
swarmui.hive.frame_cap_fps: 30
swarmui.hive.step_ms: 16
swarmui.hive.lod_zoom_out: 0.7
swarmui.hive.lod_zoom_in: 1.25
swarmui.hive.lod_event_budget: 512
swarmui.hive.snapshot_max_events: 4096
swarmui.hive.overlay_lines: 3
swarmui.hive.detail_lines: 50
swarmui.hive.line_cap_bytes: 160
swarmui.hive.per_worker_bytes: 2048
swarmui.paths.telemetry_root: /worker
swarmui.paths.proc_ingest_root: /proc/ingest
swarmui.paths.worker_root: /worker
swarmui.paths.namespace_roots: /proc, /queen, /worker, /log, /gpu
cache.kernel_ops: true
cache.dma_clean: true
cache.dma_invalidate: true
cache.unify_instructions: false
features.net_console: true
features.serial_console: true
features.std_console: false
features.std_host_tools: false
namespaces.role_isolation: true
sharding.enabled: true
sharding.shard_bits: 8
sharding.legacy_worker_alias: true
tickets: 5 entries
manifest.sha256: 0884f452da6fe84e7148c3c1e01d605b45f09f1da09914d87e961cc2c256b905
mounts: logs → /log
shard labels: 00..ff (count: 256)
canonical telemetry path: /shard/<label>/worker/<id>/telemetry
legacy alias path: /worker/<id>/telemetry
sidecars.modbus.enable: false
sidecars.modbus.mount_at: /bus
sidecars.modbus.adapters: (none)
sidecars.dnp3.enable: false
sidecars.dnp3.mount_at: /bus
sidecars.dnp3.adapters: (none)
sidecars.lora.enable: false
sidecars.lora.mount_at: /lora
sidecars.lora.adapters: (none)
ecosystem.host.enable: true
ecosystem.host.mount_at: /host
ecosystem.host.providers: systemd, k8s, docker, nvidia
ecosystem.host.tickets.enable: true
ecosystem.host.tickets.request_schema: host-ticket/v1
ecosystem.host.tickets.result_schema: host-ticket-result/v1
ecosystem.host.tickets.max_line_bytes: 2048
ecosystem.host.tickets.action_allowlist: gpu.lease.grant|renew|release, peft.import|activate|rollback, systemd.start|stop|restart|status-check, docker.restart|stop|status-check, k8s.cordon|drain|lease.sync
ecosystem.host.tickets.lifecycle: queued, claimed, running, succeeded, failed, expired
ecosystem.host.federation.enable: true
ecosystem.host.federation.local_hive: hive-a
ecosystem.host.federation.relay_queue_max_entries: 256
ecosystem.host.federation.relay_queue_max_bytes: 32768
ecosystem.host.federation.wal_max_entries: 1024
ecosystem.host.federation.wal_max_bytes: 524288
ecosystem.host.federation.relay_timeout_ms: 2000
ecosystem.host.federation.peers: hive-b -> http://127.0.0.1:8081 (auth_ref=COHESIX_RELAY_HIVE_B_TOKEN)
ecosystem.host.federation.peers: hive-c -> http://127.0.0.1:8082 (auth_ref=COHESIX_RELAY_HIVE_C_TOKEN)
ecosystem.host.federation.action_allowlist: gpu.lease.grant, gpu.lease.renew, gpu.lease.release, peft.import, peft.activate, peft.rollback, systemd.start, systemd.stop, systemd.restart, systemd.status-check, docker.restart, docker.stop, docker.status-check, k8s.cordon, k8s.drain, k8s.lease.sync
ecosystem.audit.enable: false
ecosystem.audit.journal_max_bytes: 8192
ecosystem.audit.decisions_max_bytes: 4096
ecosystem.audit.replay_enable: false
ecosystem.audit.replay_max_entries: 64
ecosystem.audit.replay_ctl_max_bytes: 1024
ecosystem.audit.replay_status_max_bytes: 1024
ecosystem.policy.enable: true
ecosystem.policy.queue_max_entries: 32
ecosystem.policy.queue_max_bytes: 4096
ecosystem.policy.ctl_max_bytes: 2048
ecosystem.policy.status_max_bytes: 512
ecosystem.policy.rules: queen-ctl → /queen/ctl
ecosystem.policy.rules: systemd-restart → /host/systemd/*/restart
ecosystem.models.enable: false
```

The /host namespace is mounted at /host when enabled.

Host ticket flow:
- Ticket requests are appended to /host/tickets/spec.
- host-ticket-agent (host-only) tails spec, performs allowlisted adapters, and appends lifecycle receipts: /host/tickets/status for claimed|running|succeeded, and /host/tickets/deadletter for failed|expired.
- host-ticket-agent --relay forwards allowlisted intents to peer hives using REST /v1/fs/echo.
- The relay write-ahead log is stored at out/host-ticket-agent/relay-wal.json by default.
- Relay metrics include relay_queue_depth, relay_deduped, and relay_remote_write_failures.
- Federated receipts record source_hive, target_hive, relay_hop, and relay_correlation_id.
- Deduplication keys are id + idempotency_key for the local flow and id + idempotency_key + source_hive + target_hive for the federated flow.
- Related in-VM paths: /queen/ctl, /queen/lease/ctl, /gpu/*, /gpu/models/*. Host-side paths: /host/*.

Generated from configs/root_task.toml (sha256: 0884f452da6fe84e7148c3c1e01d605b45f09f1da09914d87e961cc2c256b905).
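The ticket deduplication keys described above (id + idempotency_key locally; plus source_hive + target_hive when federated) can be sketched as a hashable key type. The struct and field grouping are illustrative, not the host-ticket-agent's actual data model:

```rust
// Sketch of host-ticket relay deduplication; illustrative types only.
use std::collections::HashSet;

#[derive(Hash, PartialEq, Eq, Clone)]
struct TicketKey {
    id: String,
    idempotency_key: String,
    /// Some((source_hive, target_hive)) for federated intents, None locally.
    hop: Option<(String, String)>,
}

fn main() {
    let mut seen: HashSet<TicketKey> = HashSet::new();
    let local = TicketKey {
        id: "t1".into(),
        idempotency_key: "k1".into(),
        hop: None,
    };
    let federated = TicketKey {
        id: "t1".into(),
        idempotency_key: "k1".into(),
        hop: Some(("hive-a".into(), "hive-b".into())),
    };
    assert!(seen.insert(local.clone()));
    assert!(!seen.insert(local)); // duplicate local intent is deduped
    assert!(seen.insert(federated)); // same id on a relay hop is distinct
}
```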