cohesix

Cohesix is an open-source high-assurance control-plane operating system built on the formally verified seL4 microkernel, designed to keep the trusted computing base intentionally small while enabling deterministic orchestration of edge GPU systems and auditable MLOps. Cohesix is "infrastructure for AGI".

Project repository: lukeb-aidev/cohesix (GitHub)

Cohesix Architecture (As-Built)

Cohesix is a control-plane OS for secure orchestration and telemetry of edge GPU nodes using a Queen/Worker hive model. This document describes the current as-built system for the QEMU aarch64/virt target, the uefi-aarch64 manifest profile, and the macOS host; manifest-gated features are called out explicitly.

1. Scope and Non-Goals

Scope:

Non-goals:

2. System Boundaries and TCB

2.1 SMP Execution Model (Task Isolation)

3. Top-Level Architecture

4. Control Surfaces

Secure9P namespace (NineDoor)

Console surfaces

Host tooling

5. Boot and Bring-Up Flow

  1. seL4 elfloader enters the root-task entry point.
  2. Root task reconstructs canonical CSpace addressing using seL4_CapInitThreadCNode and bootinfo.initThreadCNodeSizeBits, validates the bootinfo.empty window, and logs copy/mint/retype tuples before consuming slots.
  3. UART is mapped and the serial logger is activated; the boot banner is emitted.
  4. HAL setup, timer initialization, and IPC endpoints are established.
  5. Manifest-generated tables (tickets, Secure9P limits, policy/audit flags) are loaded from apps/root-task/src/generated.
  6. Milestone 26 hardware gates execute before ticket publication:
    • no-NIC baseline (hw.no_nic) suppresses net-console bring-up.
    • attestation policy (hw.attestation.*) is evaluated and can abort boot deterministically.
    • local-seat policy (hw.local_seat.*) is evaluated with fail-fast/degrade semantics.
  7. The log buffer (/log/queen.log) and NineDoorBridge are initialized.
  8. Serial console starts; TCP console is started only when networking is enabled by profile/policy.
  9. The event pump enters its cooperative loop (serial, timer, networking, IPC, NineDoorBridge), avoiding busy waits.
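The cooperative loop in step 9 can be sketched as follows. This is an illustrative model only, not the real root-task API: the `Event` variants mirror the sources named above (serial, timer, networking, IPC, NineDoorBridge), while `pump_once` and its arguments are hypothetical stand-ins that show the no-busy-wait structure.

```rust
// Hypothetical sketch of the root-task cooperative event pump (step 9).
// Names and signatures are illustrative, not the real implementation.

#[derive(Debug, PartialEq)]
enum Event {
    Serial(u8), // byte arrived on the serial console
    TimerTick,  // HAL timer fired
    NetPacket,  // networking (only when enabled by profile/policy)
    Ipc(u64),   // message on an IPC endpoint
    Bridge,     // NineDoorBridge work item
    Idle,       // nothing ready: yield/block instead of spinning
}

/// One pump iteration: drain whichever source is ready, in a fixed
/// priority order, and report what was handled.
fn pump_once(serial_rx: &mut Vec<u8>, timer_due: bool) -> Event {
    if let Some(byte) = serial_rx.pop() {
        return Event::Serial(byte);
    }
    if timer_due {
        return Event::TimerTick;
    }
    // In the real system this path would block on an seL4 endpoint
    // (e.g. seL4_Wait) rather than returning an Idle marker.
    Event::Idle
}

fn main() {
    let mut serial_rx = vec![b'a'];
    assert_eq!(pump_once(&mut serial_rx, true), Event::Serial(b'a'));
    assert_eq!(pump_once(&mut serial_rx, true), Event::TimerTick);
    assert_eq!(pump_once(&mut serial_rx, false), Event::Idle);
    println!("pump ok");
}
```

The key property, as stated above, is that an empty iteration blocks rather than spins, so the pump never busy-waits.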

CSpace bootstrap invariants

6. Role Model and Mounts

| Role | Namespace view (as-built) | Notes |
| --- | --- | --- |
| Queen | Full tree (`/`, `/queen`, `/log`, `/proc`, `/shard/*/worker/*`, legacy `/worker/*` when enabled) plus manifest-gated `/gpu`, `/host`, `/policy`, `/actions`, `/audit`, `/replay`, `/updates`, `/models` | Queen tickets are optional; worker tickets are required. |
| WorkerHeartbeat | `/proc/boot`, `/proc/lifecycle/*`, `/shard/<label>/worker/<id>/telemetry`, `/log/queen.log` (RO); legacy `/worker/<id>/telemetry` when enabled | Ticket must include a subject identity. |
| WorkerGpu | WorkerHeartbeat view + `/gpu/<id>/*` when GPU nodes are present | GPU nodes are host-published; no in-VM GPU stack. |
| WorkerBus | WorkerHeartbeat view + `/bus/<adapter>/*` when MODBUS/DNP3 sidecars are enabled | Scope is derived from ticket subject. |
| WorkerLora | WorkerHeartbeat view + `/lora/<adapter>/*` when LoRa sidecars are enabled | Scope is derived from ticket subject. |
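The role table can be read as a path-prefix policy. The sketch below shows how the WorkerHeartbeat view might be checked against a request path; the function and its shape are hypothetical illustrations (the real checks live in NineDoor and are ticket-driven), but the allowed prefixes mirror the table row above.

```rust
// Illustrative path check for the WorkerHeartbeat row of the role table.
// The function name and signature are hypothetical; prefixes mirror
// the documented namespace view.

fn worker_heartbeat_allows(path: &str, label: &str, id: &str) -> bool {
    // Telemetry is scoped to this worker's own shard/id, derived
    // from the ticket subject in the real system.
    let telemetry = format!("/shard/{label}/worker/{id}/telemetry");
    path == "/proc/boot"
        || path.starts_with("/proc/lifecycle/")
        || path == telemetry
        || path == "/log/queen.log" // read-only in the real namespace
}

fn main() {
    assert!(worker_heartbeat_allows("/proc/boot", "edge0", "w1"));
    assert!(worker_heartbeat_allows(
        "/shard/edge0/worker/w1/telemetry",
        "edge0",
        "w1"
    ));
    // Another worker's telemetry, and queen control, are out of scope.
    assert!(!worker_heartbeat_allows(
        "/shard/edge0/worker/w2/telemetry",
        "edge0",
        "w1"
    ));
    assert!(!worker_heartbeat_allows("/queen/ctl", "edge0", "w1"));
    println!("scope ok");
}
```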

Mount and bind semantics:

7. Key Invariants (Red Lines)

8. Data Flows

9. Security Posture

10. Operational Workflows

11. Diagrams

Boundary and components

```mermaid
flowchart LR
  subgraph Host["Host (outside VM/TCB)"]
    Cohsh["cohsh (CLI)"]
    GPUB["gpu-bridge-host"]
    HS["host-sidecar-bridge"]
  end

  subgraph VM["Cohesix target (seL4 VM)"]
    subgraph K["seL4 kernel"]
      SEL4["seL4"]
    end
    subgraph U["Userspace (CPIO rootfs)"]
      RT["root-task\n(event pump + console + HAL)"]
      ND["NineDoor\n(Secure9P namespace; bridge in seL4 build)"]
      WH["worker-heart"]
      WG["worker-gpu"]
      WB["worker-bus"]
      WL["worker-lora"]
    end
  end

  subgraph NS["Secure9P namespace (role-scoped)"]
    QCTL["/queen/ctl"]
    LOG["/log/queen.log"]
    PROC["/proc/*"]
    TEL["/shard/<label>/worker/<id>/telemetry"]
    GPU["/gpu/<id>/* (when enabled)"]
    HOST["/host/* (when enabled)"]
  end

  SEL4 --> RT
  RT --> ND
  WH --> ND
  WG --> ND
  WB --> ND
  WL --> ND
  ND --> QCTL
  ND --> LOG
  ND --> PROC
  ND --> TEL
  ND --> GPU
  ND --> HOST

  Cohsh -->|"TCP console"| RT
  Cohsh -->|"Secure9P client"| ND
  GPUB -->|"Secure9P provider"| ND
  HS -->|"Secure9P provider"| ND
```

TCP console attach + tail

```mermaid
sequenceDiagram
  autonumber
  participant Operator
  participant Cohsh as cohsh
  participant TCP as root-task TCP console
  participant ND as NineDoorBridge
  participant Log as /log/queen.log

  Operator->>Cohsh: run cohsh --transport tcp
  Cohsh->>TCP: AUTH <token>
  TCP-->>Cohsh: OK AUTH (or ERR AUTH)

  Cohsh->>TCP: ATTACH queen [ticket]
  TCP->>ND: validate role + ticket
  ND-->>TCP: accept/deny
  TCP-->>Cohsh: OK ATTACH role=queen (or ERR ATTACH)

  Cohsh->>TCP: TAIL /log/queen.log
  TCP->>ND: open + snapshot
  TCP-->>Cohsh: OK TAIL path=/log/queen.log
  loop stream
    TCP-->>Cohsh: log line
  end
  TCP-->>Cohsh: END
```
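The attach-and-tail handshake is line-oriented, so the client side can be modeled as a tiny command parser. The command verbs (`AUTH`, `ATTACH`, `TAIL`) come from the sequence diagram; the enum and parsing code below are an illustrative sketch, not the actual cohsh wire protocol implementation.

```rust
// Minimal parser for the console commands in the sequence diagram.
// Types and error handling are illustrative only.

#[derive(Debug, PartialEq)]
enum Cmd<'a> {
    Auth(&'a str),                                  // AUTH <token>
    Attach { role: &'a str, ticket: Option<&'a str> }, // ATTACH <role> [ticket]
    Tail(&'a str),                                  // TAIL <path>
}

fn parse(line: &str) -> Option<Cmd<'_>> {
    let mut parts = line.split_whitespace();
    match parts.next()? {
        "AUTH" => Some(Cmd::Auth(parts.next()?)),
        "ATTACH" => Some(Cmd::Attach {
            role: parts.next()?,
            ticket: parts.next(), // ticket is optional for the queen role
        }),
        "TAIL" => Some(Cmd::Tail(parts.next()?)),
        _ => None,
    }
}

fn main() {
    assert_eq!(parse("AUTH s3cret"), Some(Cmd::Auth("s3cret")));
    assert_eq!(
        parse("ATTACH queen"),
        Some(Cmd::Attach { role: "queen", ticket: None })
    );
    assert_eq!(
        parse("TAIL /log/queen.log"),
        Some(Cmd::Tail("/log/queen.log"))
    );
    assert_eq!(parse("NOPE"), None);
    println!("parse ok");
}
```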

Live GPU publish + PEFT refresh (Milestone 24b)

The live publish path keeps /gpu/models/* and /gpu/telemetry/schema.json out of the VM until the host bridge pushes a bounded snapshot. PEFT import optionally refreshes the live model registry immediately after updating the host registry.

```mermaid
flowchart LR
  subgraph Host["Host"]
    GBH["gpu-bridge-host"]
    COH["coh peft import"]
    REG["host model registry"]
  end
  subgraph VM["VM"]
    ND["NineDoor /gpu/bridge/ctl"]
    GPU["/gpu/<id>/*"]
    MODELS["/gpu/models/*"]
    SCHEMA["/gpu/telemetry/schema.json"]
  end
  REG -->|writes| COH
  COH -->|"--publish/--refresh-gpu-models"| GBH
  GBH -->|"bounded snapshot"| ND
  ND --> GPU
  ND --> MODELS
  ND --> SCHEMA
```
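The "bounded snapshot" edge implies a hard cap on what the host bridge may push into the VM. The sketch below illustrates one way such a bound could be enforced before publication; the struct, caps, and constants are invented for illustration (the real limits are the manifest-generated Secure9P limits), only the bounded-push idea comes from the text above.

```rust
// Illustrative bound check for a host-pushed model snapshot.
// Struct and cap values are hypothetical; in Cohesix the actual
// limits are manifest-generated Secure9P limits.

struct Snapshot {
    /// (model name, serialized size in bytes)
    entries: Vec<(String, usize)>,
}

const MAX_ENTRIES: usize = 64;
const MAX_TOTAL_BYTES: usize = 256 * 1024;

/// Reject oversized snapshots on the host side, so nothing unbounded
/// ever reaches the VM namespace.
fn within_bounds(s: &Snapshot) -> bool {
    let total: usize = s.entries.iter().map(|(_, n)| n).sum();
    s.entries.len() <= MAX_ENTRIES && total <= MAX_TOTAL_BYTES
}

fn main() {
    let ok = Snapshot {
        entries: vec![("adapter-a".into(), 1024)],
    };
    assert!(within_bounds(&ok));

    let too_big = Snapshot {
        entries: vec![("adapter-b".into(), 512 * 1024)],
    };
    assert!(!within_bounds(&too_big));
    println!("bounds ok");
}
```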

Live Hive telemetry path (Milestone 24b)

Live Hive renders only what the backend tailers ingest. Polling bounds and line caps live in cohsh-core, not in the UI.

```mermaid
flowchart LR
  W["worker telemetry file\n/shard/<label>/worker/<id>/telemetry"] --> TAIL["cohsh-core tailer"]
  TAIL --> BUF["bounded line buffers"]
  BUF --> UI["SwarmUI Live Hive overlays + detail panel"]
  UI -->|"read-only"| PROC["/proc/root/*, /proc/pressure/*, /proc/9p/session/active"]
```
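A bounded line buffer of the kind the tailers feed can be sketched as a queue that evicts its oldest line at capacity, so memory stays fixed regardless of telemetry volume. The type below is an illustrative sketch, not the real cohsh-core API.

```rust
use std::collections::VecDeque;

// Illustrative bounded line buffer: at the cap, the oldest line is
// evicted, keeping memory use constant under any ingest rate.
struct LineBuf {
    cap: usize,
    lines: VecDeque<String>,
}

impl LineBuf {
    fn new(cap: usize) -> Self {
        Self { cap, lines: VecDeque::with_capacity(cap) }
    }

    fn push(&mut self, line: String) {
        if self.lines.len() == self.cap {
            self.lines.pop_front(); // drop oldest to preserve the bound
        }
        self.lines.push_back(line);
    }
}

fn main() {
    let mut buf = LineBuf::new(2);
    buf.push("t=1".into());
    buf.push("t=2".into());
    buf.push("t=3".into()); // evicts "t=1"
    let tail: Vec<_> = buf.lines.iter().cloned().collect();
    assert_eq!(tail, ["t=2", "t=3"]);
    println!("tail ok");
}
```

This matches the statement above: the cap is a property of the buffer (cohsh-core), so the UI can only ever render a bounded window.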

12. References

Manifest snapshot (generated)

The following block is generated by coh-rtc and mirrored from docs/snippets/root_task_manifest.md. Do not edit by hand.

Root-task manifest schema (generated)

Namespace mounts (generated)

Sharded worker namespace (generated)

Sidecars section (generated)

Ecosystem section (generated)

Host ticket control plane (Milestones 25g + 25h)

Generated from configs/root_task.toml (sha256: 0884f452da6fe84e7148c3c1e01d605b45f09f1da09914d87e961cc2c256b905).