Cohesix is an open-source high-assurance control-plane operating system built on the formally verified seL4 microkernel, designed to keep the trusted computing base intentionally small while enabling deterministic orchestration of edge GPU systems and auditable MLOps. Cohesix is "infrastructure for AGI".
This document lists deterministic failure behavior and the required operator responses. All behavior here is as-built: observed via /proc nodes, control files, and /log/queen.log audit lines.
Operating principles
ERR implies no side effects unless explicitly documented.
/log/queen.log is the authoritative audit trail for control denials and lifecycle gates.
/proc/* is the authoritative read-only source for lifecycle, pressure, and queue state.
Quick triage checklist
cat /proc/lifecycle/state and cat /proc/lifecycle/reason
cat /proc/root/reachable and cat /proc/root/cut_reason
cat /proc/pressure/* (busy/quota/cut/policy)
tail /log/queen.log (or cat for bounded inspection)
Signal
ERR on /queen/lifecycle/ctl write.
/log/queen.log line:
lifecycle denied action=<cmd> state=<STATE> reason=invalid-transition
Impact
Recovery
Check /proc/lifecycle/state and choose a valid command:
cordon only from ONLINE or DEGRADED
drain only from DRAINING
quiesce from ONLINE, DEGRADED, or DRAINING
resume from any non-ONLINE state
reset from any non-BOOTING state
drain, quiesce, or reset
Signal
ERR on /queen/lifecycle/ctl write.
/log/queen.log line:
lifecycle denied action=<cmd> state=<STATE> reason=outstanding-leases leases=<n>
Impact
Recovery
/worker or /shard/.../worker).
/queen/ctl.
Signal
ERR on a gated path (worker attach, telemetry ingest, host publishes, or GPU job writes).
/log/queen.log line:
lifecycle denied action=<gate> state=<STATE> reason=gate-denied
Impact
Recovery
ONLINE or DEGRADED).
cordon → drain → quiesce instead of forcing actions in blocked states.
Signal
ERR ECHO reason=policy ... EPERM when writing to a gated path (for example /queen/ctl).
/log/queen.log line indicating policy denial.
Impact
Recovery
/policy/rules to confirm the target is gated.
/actions/queue with id, target, and decision.
Signal
ERR on a gated write even though an approval was previously queued.
/log/queen.log indicates a consumed or replayed action.
Impact
Recovery
/actions/queue, then retry the write.
Signal
cohsh or a tool hangs on connect, or a tool reports a busy/locked console.
Impact
Recovery
cohsh, swarmui, hive-gateway, coh, gpu-bridge-host, host-sidecar-bridge).
Signal
Impact
Recovery
127.0.0.1:31337).
Telemetry ingest refusal is deterministic and policy-driven.
Signals
ERR on /queen/telemetry/<device>/seg/<id> append when over limits.
/log/queen.log entries indicate quota or wrap behavior (for example telemetry quota reject or telemetry ring wrap).
Recovery
telemetry_ingest.* quotas in the manifest and regenerate with coh-rtc.
/proc/spool/status once available.
Host providers are gated by lifecycle state and policy.
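Gate and transition denials in this document all surface in /log/queen.log as a "lifecycle denied" line followed by key=value fields (action, state, reason, and sometimes leases). The following is a minimal Python sketch for pulling those fields out when scanning the log offline. The function name parse_lifecycle_denial is hypothetical, and the sketch assumes fields are whitespace-separated key=value pairs exactly as in the examples shown in this document; any prefix before the marker (such as a timestamp) is ignored.

```python
import re

# Matches whitespace-separated key=value fields such as action=drain or leases=3.
# Assumption: keys are simple word characters and values contain no spaces.
_FIELD = re.compile(r"(\w+)=(\S+)")

def parse_lifecycle_denial(line):
    """Return the key=value fields of a 'lifecycle denied' line, or None.

    Hypothetical helper: only the marker text and field names come from
    the audit-line examples in this document.
    """
    marker = "lifecycle denied"
    idx = line.find(marker)
    if idx < 0:
        return None  # not a lifecycle denial line
    # Only parse what follows the marker, so any leading prefix is ignored.
    return dict(_FIELD.findall(line[idx + len(marker):]))

if __name__ == "__main__":
    sample = "lifecycle denied action=drain state=ONLINE reason=invalid-transition"
    print(parse_lifecycle_denial(sample))
```

All values are returned as strings, so a leases count would need an explicit int() conversion before comparison.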
Signals
ERR on /host/... append when state disallows host publishes.
/log/queen.log contains a lifecycle denied gate line.
Recovery
ONLINE or DEGRADED.
/actions/queue.
Worker roles cannot attach when lifecycle gates are closed.
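Reopening closed gates means issuing a lifecycle command that is valid from the current state. As a rough aid, the command/state rules listed earlier in this document (under recovery from an invalid-transition denial) can be written as a lookup to consult before writing to /queen/lifecycle/ctl. This is a sketch only: command_allowed is a hypothetical name, the rules are copied literally from that list, and it checks state alone, so a command it accepts can still be denied for other reasons such as outstanding leases.

```python
def command_allowed(cmd, state):
    """Return True if cmd is listed as valid from state.

    Mirrors the rules stated in this document; it does not model
    leases, policy, or any other gate.
    """
    if cmd == "cordon":
        return state in ("ONLINE", "DEGRADED")
    if cmd == "drain":
        return state == "DRAINING"
    if cmd == "quiesce":
        return state in ("ONLINE", "DEGRADED", "DRAINING")
    if cmd == "resume":
        return state != "ONLINE"
    if cmd == "reset":
        return state != "BOOTING"
    # Assumption: commands not listed in the document are treated as invalid.
    return False

if __name__ == "__main__":
    # The state value would normally come from /proc/lifecycle/state.
    for cmd in ("cordon", "drain", "quiesce", "resume", "reset"):
        print(cmd, command_allowed(cmd, "DRAINING"))
```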
Signals
ERR and /log/queen.log shows lifecycle denied action=worker-attach.
Recovery
resume) once maintenance is complete.
/gpu or /gpu/models is empty
Signal
ls /gpu returns empty or ERR and /gpu/models is missing.
Impact
Recovery
./bin/gpu-bridge-host --publish ... and verify /gpu/bridge/status.
/host is empty
Signal
ls /host returns empty or ERR.
Impact
Recovery
./bin/host-sidecar-bridge --watch --provider ... and re-check /host/*.
Signal
path exceeds max length, path component '..' is not permitted, or read exceeds max bytes.
Impact
Recovery
/v1/meta/bounds (REST) or the manifest-derived limits to size requests within bounds.
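A client can screen requests against the three bounds errors above before sending them. The following Python sketch illustrates the idea under stated assumptions: check_request is a hypothetical helper, the limits are passed in by the caller (taken from /v1/meta/bounds or the manifest; no real values are assumed here), path length is measured in UTF-8 bytes, and the order of checks is arbitrary. The server remains authoritative.

```python
def check_request(path, read_len, max_path_len, max_read_bytes):
    """Return None if the request looks within bounds, else the matching error text.

    Hypothetical client-side pre-check; the error strings are the ones
    listed in this document, the limits are caller-supplied.
    """
    # Assumption: the path limit is counted in bytes, not characters.
    if len(path.encode("utf-8")) > max_path_len:
        return "path exceeds max length"
    # Reject any path component that is exactly '..'.
    if ".." in path.split("/"):
        return "path component '..' is not permitted"
    if read_len > max_read_bytes:
        return "read exceeds max bytes"
    return None

if __name__ == "__main__":
    # 255 and 4096 are placeholder limits for illustration only.
    print(check_request("/proc/lifecycle/state", 128, 255, 4096))
    print(check_request("/proc/../log/queen.log", 128, 255, 4096))
```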