feat: integrate 5 feature branches, upstream fixes, and MIME detection fix

Squashed 5 commits:

Features (from 5 feature branches):
- MCP fix, pipe mute, stub recovery
- KAIROS activation, openclaw autonomy
- Daemon/job command hierarchy + cross-platform bg engine

Upstream fixes:
- fix: Bun.hash compatibility
- chore: chrome dependency update
- docs: browser support guide

MIME detection fix:
- Screenshot detectMimeFromBase64(): decode raw bytes from base64 instead of broken charCodeAt comparison
- Fixes API 400 on Windows (JPEG) and macOS (PNG) screenshots
2026-06-15 21:05:51 +00:00 · 2026-04-14 18:32:19 +08:00
93 changed files with 9653 additions and 1400 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -15,7 +15,7 @@ src/utils/vendor/
 .claude/
 .codex/
 .omx/
-
+.docs/task/
 # Binary / screenshot files (root only)
 /*.png
 *.bmp
--- a/(1).md
+++ b/(1).md
@@ -0,0 +1,204 @@
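+# KAIROS: the Claude that never shuts down
+
+> Source locations: `src/assistant/`, `src/proactive/`, `src/services/autoDream/`
+> Compile-time flags: `feature('KAIROS')`, `feature('KAIROS_BRIEF')`, `feature('KAIROS_CHANNELS')`
+> Remote flag: GrowthBook `tengu_kairos`
+
+A persistent assistant mode in which Claude keeps running after the terminal is closed. KAIROS is one of the most complex hidden features in Claude Code.
+
+---
+
+## Core concepts
+
+KAIROS turns Claude from a "one-shot conversation tool" into a "persistently running AI assistant":
+
+- Claude keeps running in the background after the terminal is closed
+- It writes a log every day
+- It "dreams" at night to consolidate its memory
+- It finds work for itself when nobody is talking to it
+- Commands that run longer than 15 seconds are moved to the background automatically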
+
+---
+
+## Activation flow
+
+Defined in `src/main.tsx` (around lines 1054-1092). Activation has to pass five layers of checks:
+
+```
+1. feature('KAIROS')          ← compile-time flag
+2. settings.assistant: true   ← .claude/settings.json
+3. Directory trust check      ← guards against hijacking by a malicious repository
+4. tengu_kairos               ← GrowthBook remote flag
+5. setKairosActive(true)      ← global state activated
+```
+
+The `--assistant` CLI flag skips the remote flag check (used for Agent SDK daemon mode).
+
+Global state is stored in `src/bootstrap/state.ts`:
+- `kairosActive: boolean` (default `false`)
+- `getKairosActive()` / `setKairosActive(true)`
+
+---
+
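+## Persistence across sessions
+
+### Session recovery
+
+`src/utils/conversationRecovery.ts` conditionally imports `BriefTool` and `SendUserFileTool` behind `feature('KAIROS')`. When a session is deserialized, results from these tools are recognized as "terminal tool results", which is how the code decides whether a turn completed normally or was interrupted.
+
+### Persistent cron tasks
+
+The key file is `.claude/scheduled_tasks.json`. Tasks marked `permanent: true` are exempt from the 7-day automatic expiry:
+
+- `catch-up`: resume interrupted work
+- `morning-checkin`: daily morning check-in
+- `dream`: memory consolidation
+
+### Session history API
+
+`src/assistant/sessionHistory.ts` loads remote session history through the OAuth API, using the `v1/sessions/{sessionId}/events` endpoint with paginated fetching.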
+
+---
+
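+## The dream mechanism (Dream)
+
+The most intricate KAIROS subsystem: a sub-agent running in the background that consolidates scattered session memories into durable, structured knowledge.
+
+### Trigger conditions (three gates, cheapest first)
+
+Defined in `src/services/autoDream/autoDream.ts`:
+
+```
+1. Time gate: more than 24 hours since the last consolidation (minHours)
+2. Session gate: at least 5 new sessions (minSessions)
+3. Lock gate: no other process is currently consolidating
+```
+
+The thresholds are controlled dynamically through the GrowthBook remote config `tengu_onyx_plover`.
+
+### Four-phase consolidation flow
+
+Defined in `src/services/autoDream/consolidationPrompt.ts`:
+
+| Phase | Action |
+|------|------|
+| **Orient** | List the memory directory, read the `MEMORY.md` index, browse existing topic files |
+| **Gather** | Collect new signals from daily logs, existing memories, and JSONL transcripts |
+| **Consolidate** | Merge new signals into topic files, convert relative dates to absolute dates, delete outdated facts |
+| **Prune** | Update the `MEMORY.md` index and keep it within the line-count and size limits |
+
+### Lock mechanism
+
+`src/services/autoDream/consolidationLock.ts`:
+
+- Uses a `.consolidate-lock` file
+- File mtime = `lastConsolidatedAt`
+- File content = PID of the holder
+- Supports a PID liveness check (1-hour timeout)
+- After the double write, a re-read verifies the result to guard against races
+
+### Daily log
+
+The path is computed by `getAutoMemDailyLogPath()` in `src/memdir/paths.ts`:
+
+```
+<autoMemPath>/logs/YYYY/MM/YYYY-MM-DD.md
+```
+
+### UI presentation
+
+- A footer pill label shows **"dreaming"**
+- `src/components/tasks/DreamDetailDialog.tsx` provides a dedicated detail dialog
+- Live progress can be viewed and the run can be aborted manually
+- `Shift+Down` opens the background tasks dialog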
+
+---
+
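+## Proactive Mode
+
+Claude finds work for itself when nobody is talking to it.
+
+### Core state
+
+`src/proactive/index.ts` maintains three pieces of state:
+
+| State | Description |
+|------|------|
+| `active` | Whether the mode is activated |
+| `paused` | Whether the mode is paused (pauses when the user cancels with Esc, resumes on the next input) |
+| `contextBlocked` | Blocks ticks after an API error to prevent a tick-error-tick loop |
+
+### How to activate
+
+- The `--proactive` CLI flag
+- The `CLAUDE_CODE_PROACTIVE` environment variable
+- Guarded by `feature('PROACTIVE') || feature('KAIROS')`
+
+### System prompt
+
+After activation, the following is appended: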
+
+```
+# Proactive Mode
+
+You are in proactive mode. Take initiative -- explore, act, and make progress
+without waiting for instructions.
+
+Start by briefly greeting the user.
+
+You will receive periodic <tick> prompts. These are check-ins. Do whatever
+seems most useful, or call Sleep if there's nothing to do.
+```
+
+### SleepTool integration
+
+The `minSleepDurationMs` and `maxSleepDurationMs` settings bound the Sleep duration and throttle the proactive tick rate. When there is nothing to do, Claude calls Sleep and waits.
+
+---
+
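+## Background task management
+
+### Cron scheduler
+
+`src/utils/cronScheduler.ts`:
+
+- Ticks once per second (`CHECK_INTERVAL_MS = 1000`)
+- Watches `.claude/scheduled_tasks.json` with chokidar
+- Supports a scheduler lock (`src/utils/cronTasksLock.ts`) so that multiple instances do not fire the same task twice
+- Probes the lock every 5 seconds and takes over automatically if the holder crashes
+
+### Task types
+
+| Type | Description |
+|------|------|
+| One-shot (`recurring: false`) | Deleted automatically after firing; supports missed-task detection |
+| Recurring (`recurring: true`) | Rescheduled after firing; expires after 7 days by default |
+| Permanent (`permanent: true`) | Exempt from expiry (KAIROS only) |
+| Session-scoped (`durable: false`) | In memory only; gone when the process exits |
+
+### Jitter against thundering herds
+
+`src/utils/cronJitterConfig.ts`:
+
+- Recurring tasks: a deterministic delay derived from the taskId (10% of the interval, capped at 15 minutes)
+- One-shot tasks: fire up to 90 seconds early at :00 and :30
+- Operators can push a config change during an incident, and it takes effect on all clients within 60 seconds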
+
+---
+
+## Key source files
+
+| File | Responsibility |
+|------|------|
+| `src/bootstrap/state.ts` | KAIROS global state |
+| `src/assistant/index.ts` | Assistant mode entry point |
+| `src/assistant/sessionHistory.ts` | Remote session history API |
+| `src/proactive/index.ts` | Proactive mode state management |
+| `src/services/autoDream/autoDream.ts` | Auto-Dream engine |
+| `src/services/autoDream/consolidationPrompt.ts` | Consolidation prompt (four phases) |
+| `src/services/autoDream/consolidationLock.ts` | Consolidation lock |
+| `src/services/autoDream/config.ts` | Dream configuration |
+| `src/tasks/DreamTask/DreamTask.ts` | Dream task definition |
+| `src/utils/cronScheduler.ts` | Cron scheduler |
+| `src/utils/cronTasks.ts` | Cron task persistence |
+| `src/skills/bundled/dream.ts` | `/dream` skill (stub) |
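
The three dream gates listed in the KAIROS document above are ordered so that the cheap checks run before the expensive ones. The sketch below illustrates that ordering only. `checkDreamGates`, `DreamGateInput`, and the callback names are hypothetical and are not taken from `autoDream.ts`; only the 24-hour and 5-session thresholds come from the document, and the exact comparison at the boundary is an assumption.

```typescript
// Hypothetical model of the three-gate check; not the code in autoDream.ts.
const MS_PER_HOUR = 60 * 60 * 1000;

interface DreamGateConfig {
  minHours: number;    // time gate threshold (24 in the document)
  minSessions: number; // session gate threshold (5 in the document)
}

interface DreamGateInput {
  nowMs: number;
  lastConsolidatedAtMs: number;   // the document stores this as the lock file mtime
  countNewSessions: () => number; // assumed to be more expensive than a clock read
  tryAcquireLock: () => boolean;  // assumed to be the most expensive step
}

type GateResult = "run" | "too-soon" | "too-few-sessions" | "locked";

function checkDreamGates(input: DreamGateInput, cfg: DreamGateConfig): GateResult {
  // Gate 1: time. Pure arithmetic, so it runs first.
  const hoursSince = (input.nowMs - input.lastConsolidatedAtMs) / MS_PER_HOUR;
  if (hoursSince < cfg.minHours) return "too-soon";

  // Gate 2: sessions. Only evaluated once the time gate has passed.
  if (input.countNewSessions() < cfg.minSessions) return "too-few-sessions";

  // Gate 3: lock. Only attempted once both cheaper gates have passed.
  if (!input.tryAcquireLock()) return "locked";

  return "run";
}
```

Passing the session count and the lock attempt as callbacks is what makes the short-circuiting visible: a caller that fails the time gate never pays for a session scan or a lock-file write.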
--- a/README.md
+++ b/README.md
@@ -22,8 +22,10 @@
 | Web Search | Built-in web search tool | [Docs](https://ccb.agent-aura.top/docs/features/web-browser-tool) |
 | Custom model providers | OpenAI/Anthropic/Gemini/Grok compatible | [Docs](https://ccb.agent-aura.top/docs/features/custom-platform-login) |
 | Voice Mode | Push-to-Talk voice input | [Docs](https://ccb.agent-aura.top/docs/features/voice-mode) |
-| Computer Use / Chrome Use | Screenshots, keyboard and mouse control, browser control | [Computer Use](https://ccb.agent-aura.top/docs/features/computer-use)<br>[Chrome Use](https://ccb.agent-aura.top/docs/features/claude-in-chrome-mcp) |
-| Sentry / GrowthBook enterprise monitoring | Enterprise-grade error tracking and feature flags | [Sentry](https://ccb.agent-aura.top/docs/internals/sentry-setup)<br>[GrowthBook](https://ccb.agent-aura.top/docs/internals/growthbook-adapter) |
+| Computer Use | Screenshots, keyboard and mouse control | [Docs](https://ccb.agent-aura.top/docs/features/computer-use) |
+| Chrome Use | Browser automation, form filling, data scraping | [Modified build](docs/features/chrome-use-mcp) [Native build](https://ccb.agent-aura.top/docs/features/claude-in-chrome-mcp) |
+| Sentry | Enterprise-grade error tracking | [Docs](https://ccb.agent-aura.top/docs/internals/sentry-setup) |
+| GrowthBook | Enterprise-grade feature flags | [Docs](https://ccb.agent-aura.top/docs/internals/growthbook-adapter) |
 | Langfuse monitoring | End-to-end tracing of LLM calls, tool execution, and multi-agent runs | [Docs](https://ccb.agent-aura.top/docs/features/langfuse-monitoring) |
 | Poor Mode | Budget mode: turns off memory extraction and typing suggestions | Toggle with /poor |

--- a/build.ts
+++ b/build.ts
@@ -40,6 +40,8 @@ const DEFAULT_BUILD_FEATURES = [
  'KAIROS',
  'COORDINATOR_MODE',
  'LAN_PIPES',
+  'BG_SESSIONS',
+  'TEMPLATES',
  // 'REVIEW_ARTIFACT', // API requests get no response; schema compatibility needs further investigation
  // P3: poor mode (disable extract_memories + prompt_suggestion)
  'POOR',
@@ -145,7 +147,15 @@ if (typeof globalThis.Bun === "undefined") {
  function $(parts, ...args) {
    throw new Error("Bun.$ shell API is not available in Node.js. Use Bun runtime for this feature.");
  }
-  globalThis.Bun = { which, $ };
+  function hash(data, seed) {
+    let h = ((seed || 0) ^ 0x811c9dc5) >>> 0;
+    for (let i = 0; i < data.length; i++) {
+      h ^= data.charCodeAt(i);
+      h = Math.imul(h, 0x01000193) >>> 0;
+    }
+    return h;
+  }
+  globalThis.Bun = { which, $, hash };
 }
 import "./cli.js"
 `
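
The `hash` fallback added to `build.ts` above is a 32-bit FNV-1a loop over the string's UTF-16 code units, with the seed XORed into the offset basis. A typed standalone sketch of the same loop follows; the name `fnv1aHash` and the constant names are hypothetical. Two caveats, stated as assumptions about how the shim is used: it only accepts strings (a typed array has no `charCodeAt`), and its output is not expected to match values produced by Bun's native `Bun.hash`, so it should only be relied on where any stable hash will do.

```typescript
// Hypothetical standalone version of the Node.js fallback; mirrors the loop in the diff.
const FNV_OFFSET_BASIS = 0x811c9dc5;
const FNV_PRIME = 0x01000193;

function fnv1aHash(data: string, seed: number = 0): number {
  // Fold the seed into the offset basis, then force an unsigned 32-bit value.
  let h = (seed ^ FNV_OFFSET_BASIS) >>> 0;
  for (let i = 0; i < data.length; i++) {
    h ^= data.charCodeAt(i);           // XOR in one UTF-16 code unit
    h = Math.imul(h, FNV_PRIME) >>> 0; // 32-bit multiply, kept unsigned
  }
  return h;
}
```

With a seed of 0 and ASCII input this reproduces the standard FNV-1a 32-bit test vectors, which is a quick way to check the shim after a refactor.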
--- a/bun.lock
+++ b/bun.lock
@@ -5,7 +5,7 @@
    "": {
      "name": "claude-code-best",
      "dependencies": {
-        "@claude-code-best/mcp-chrome-bridge": "^2.0.6",
+        "@claude-code-best/mcp-chrome-bridge": "^2.0.7",
      },
      "devDependencies": {
        "@alcalzone/ansi-tokenize": "^0.3.0",
@@ -443,7 +443,7 @@

    "@claude-code-best/builtin-tools": ["@claude-code-best/builtin-tools@workspace:packages/builtin-tools"],

-    "@claude-code-best/mcp-chrome-bridge": ["@claude-code-best/mcp-chrome-bridge@2.0.6", "", { "dependencies": { "@fastify/cors": "^11.0.1", "@modelcontextprotocol/sdk": "^1.11.0", "chalk": "^5.4.1", "chrome-mcp-shared": "^1.0.2", "commander": "^13.1.0", "fastify": "^5.3.2", "is-admin": "^4.0.0", "pino": "^9.6.0", "uuid": "^11.1.0" }, "bin": { "mcp-chrome-bridge": "dist/cli.js", "mcp-chrome-stdio": "dist/mcp/mcp-server-stdio.js" } }, "sha512-eKHXl+prvuNgU6NFti9qpYD1/jnddNiNjNSyLhcvsNhBIn695cZn3KJG65yV+e8OXbjYIf9DeIMGYGVwqrClqA=="],
+    "@claude-code-best/mcp-chrome-bridge": ["@claude-code-best/mcp-chrome-bridge@2.0.7", "", { "dependencies": { "@fastify/cors": "^11.0.1", "@modelcontextprotocol/sdk": "^1.11.0", "chalk": "^5.4.1", "chrome-mcp-shared": "^1.0.2", "commander": "^13.1.0", "fastify": "^5.3.2", "is-admin": "^4.0.0", "pino": "^9.6.0", "uuid": "^11.1.0" }, "bin": { "mcp-chrome-bridge": "dist/cli.js", "mcp-chrome-stdio": "dist/mcp/mcp-server-stdio.js" } }, "sha512-gb64+Ga6li3A8Ll9NKV+ePBn5/U0fccCdrH43tGYveLKZIZxURz8cbY+Z3BdbTdYSPVdFXtfUlp3TMxu4OT5gg=="],

    "@claude-code-best/mcp-client": ["@claude-code-best/mcp-client@workspace:packages/mcp-client"],

--- a/docs/features/chrome-use-mcp.md
+++ b/docs/features/chrome-use-mcp.md
@@ -0,0 +1,30 @@
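+# Chrome Use: browser automation quick-start guide
+
+Let Claude Code control your Chrome browser directly and carry out web tasks from natural-language instructions.
+
+## Quick start (3 minutes)
+
+### Step 1: Install the Chrome extension
+
+1. Download the extension: https://github.com/hangwin/mcp-chrome/releases (get the latest zip)
+2. Unzip the file
+3. Open Chrome and go to `chrome://extensions/`
+4. Turn on "Developer mode" in the top-right corner
+5. Click "Load unpacked" and select the unzipped folder
+
+### Step 2: Start Claude Code
+
+```bash
+bun run dev
+ccb # or, if you have the installed ccb build, run this instead
+```
+
+### Step 3: Enable Chrome MCP
+
+1. Type `/mcp` in the REPL to open the MCP panel
+2. Find `mcp-chrome` and press Space to enable it
+3. Press Enter to confirm
+
+## Related documentation
+
+- GitHub repository: https://github.com/hangwin/mcp-chrome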
--- a/docs/features/stub-recovery-design-1-4.md
+++ b/docs/features/stub-recovery-design-1-4.md
@@ -0,0 +1,310 @@
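+# Stub recovery design 1-4
+
+> Date: 2026-04-12
+> Goal: based on the current code boundaries, give implementable designs for the 4 stub or half-stub command surfaces planned for the next phase.
+> Ordering: items are listed in the recommended implementation order, not by severity.
+
+## Design principles
+
+- Start with items that can be closed out independently, have a clear payoff, and have well-defined change boundaries.
+- Split large items into `MVP` and `Phase 2+` so the work does not sprawl into a wide-ranging recovery in one go.
+- Prefer reusing existing state, transport, logging, and configuration facilities; do not reinvent protocols.
+- The design follows the code actually in the repository, not the idealized state described in older documents.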
+
+## 1. `claude daemon status` / `claude daemon stop`
+
+### Current state
+
+- The `start` path already has a complete supervisor + worker lifecycle:
+  [src/daemon/main.ts](</e:/Source_code/Claude-code-bast/src/daemon/main.ts:1>)
+  [src/daemon/workerRegistry.ts](</e:/Source_code/Claude-code-bast/src/daemon/workerRegistry.ts:1>)
+- `status` / `stop` currently only print placeholder output:
+  [src/daemon/main.ts](</e:/Source_code/Claude-code-bast/src/daemon/main.ts:49>)
+- `/remote-control-server` has its own in-command UI state, but it only tracks the `daemonProcess` of the current process, so it is not a suitable basis for cross-process CLI management:
+  [src/commands/remoteControlServer/remoteControlServer.tsx](</e:/Source_code/Claude-code-bast/src/commands/remoteControlServer/remoteControlServer.tsx:32>)
+
+### Goal
+
+- Make `claude daemon status` and `claude daemon stop` work correctly from a different CLI process.
+- Do not depend on in-memory TUI state, and do not require the invoking process to be the one that started the daemon.
+
+### MVP plan
+
+- Add a daemon state file, for example:
+  `~/.claude/daemon/remote-control.json`
+- On `start`, write:
+  - supervisor pid
+  - cwd
+  - startedAt
+  - worker kinds
+  - latest status
+- `status`:
+  - read the state file
+  - verify that the pid is alive using the existing process-probing facility
+  - print `running / stopped / stale`
+  - clean up the state file automatically when stale
+- `stop`:
+  - read the pid
+  - send `SIGTERM`
+  - wait for exit
+  - send `SIGKILL` after a timeout
+  - clean up the state file
+
+### Code scope
+
+- Add `src/daemon/state.ts`
+- Modify [src/daemon/main.ts](</e:/Source_code/Claude-code-bast/src/daemon/main.ts:1>)
+- Lightly modify [src/commands/remoteControlServer/remoteControlServer.tsx](</e:/Source_code/Claude-code-bast/src/commands/remoteControlServer/remoteControlServer.tsx:32>) so the UI reads the same state file where possible
+
+### Verification
+
+1. `claude daemon start`
+2. Open a new terminal and run `claude daemon status`
+3. Run `claude daemon stop`
+4. Run `claude daemon status` again and confirm it returns `stopped` or a clear `stale cleaned`
+
+### Risks
+
+- The Windows signal model differs from Unix, so `stop` needs a timeout fallback.
+- The current design assumes a single supervisor and does not handle multiple concurrent instances.
+
+### Effort estimate
+
+- Small
+- A good first item to implement next
+
+## 2. `BG_SESSIONS`
+
+### Current state
+
+- The fast-path is already wired up:
+  [src/entrypoints/cli.tsx](</e:/Source_code/Claude-code-bast/src/entrypoints/cli.tsx:218>)
+- The session registry already has a real implementation:
+  [src/utils/concurrentSessions.ts](</e:/Source_code/Claude-code-bast/src/utils/concurrentSessions.ts:1>)
+- Inside a bg session, `exit` already runs `tmux detach-client`:
+  [src/commands/exit/exit.tsx](</e:/Source_code/Claude-code-bast/src/commands/exit/exit.tsx:20>)
+- But the CLI handlers are still completely empty:
+  [src/cli/bg.ts](</e:/Source_code/Claude-code-bast/src/cli/bg.ts:1>)
+- The task summary is still a stub:
+  [src/utils/taskSummary.ts](</e:/Source_code/Claude-code-bast/src/utils/taskSummary.ts:1>)
+
+### Goal
+
+- First make `ps` / `logs` / `kill` genuinely useful session management commands.
+- Do not force `attach` / `--bg` to completion in the first phase.
+
+### Phase 2A: MVP
+
+- Implement `ps`
+  - read live sessions from the registry
+  - show pid, kind, sessionId, cwd, name, startedAt, bridgeSessionId
+  - show activity/status as well when available
+- Implement `logs`
+  - support lookup by `sessionId / pid / name`
+  - prefer reusing the local transcript/log reading facility
+  - support tailing the file when the registry has a `logPath`
+- Implement `kill`
+  - resolve the target session
+  - send an exit signal
+  - clean up stale registry entries
+
+### Phase 2B: follow-up
+
+- Implement `attach`
+- Implement `--bg`
+- Implement mid-run status updates for `taskSummary`
+
+### Why split it
+
+- The existing registry records `pid / sessionId / name / logPath`
+- But it has no reliable tmux attach target
+- So `attach` and `--bg` are not a matter of filling in handlers; they need a design for launch/attach metadata
+
+### Code scope
+
+- Modify [src/cli/bg.ts](</e:/Source_code/Claude-code-bast/src/cli/bg.ts:1>)
+- Modify [src/utils/concurrentSessions.ts](</e:/Source_code/Claude-code-bast/src/utils/concurrentSessions.ts:1>) to allow the later attach/--bg extension
+- Modify [src/utils/taskSummary.ts](</e:/Source_code/Claude-code-bast/src/utils/taskSummary.ts:1>)
+- Reuse:
+  [src/utils/sessionStorage.ts](</e:/Source_code/Claude-code-bast/src/utils/sessionStorage.ts:3870>)
+  [src/utils/udsClient.ts](</e:/Source_code/Claude-code-bast/src/utils/udsClient.ts:1>)
+
+### Verification
+
+1. `ps` lists live sessions
+2. `logs <sessionId|pid|name>` prints the matching logs
+3. `kill <sessionId|pid|name>` terminates the target session
+
+### Risks
+
+- `attach` / `--bg` in the second phase need a tmux metadata design
+- The tmux path on Windows needs an explicit fallback strategy
+
+### Effort estimate
+
+- `ps/logs/kill`: medium
+- `attach/--bg`: clearly larger, should be phased
+
+## 3. `TEMPLATES`
+
+### Current state
+
+- The command entry point only has a fast-path:
+  [src/entrypoints/cli.tsx](</e:/Source_code/Claude-code-bast/src/entrypoints/cli.tsx:249>)
+- The handler is empty:
+  [src/cli/handlers/templateJobs.ts](</e:/Source_code/Claude-code-bast/src/cli/handlers/templateJobs.ts:1>)
+- `markdownConfigLoader` already includes `templates` among the config directories:
+  [src/utils/markdownConfigLoader.ts](</e:/Source_code/Claude-code-bast/src/utils/markdownConfigLoader.ts:29>)
+- `query / stopHooks` already reserves a job classifier path:
+  [src/query/stopHooks.ts](</e:/Source_code/Claude-code-bast/src/query/stopHooks.ts:103>)
+- `jobs/classifier.ts` is still a stub:
+  [src/jobs/classifier.ts](</e:/Source_code/Claude-code-bast/src/jobs/classifier.ts:1>)
+
+### Goal
+
+- Turn `new / list / reply` into a usable template job system.
+- Leave complex automatic classification and automatic execution out of the first phase.
+
+### MVP plan
+
+- Template source:
+  `.claude/templates/*.md`
+- Template format:
+  reuse the existing markdown + frontmatter parsing; do not design a separate DSL
+- `list`
+  - list all templates
+  - show the template name, description, and path
+- `new <template> [args...]`
+  - parse the template
+  - create a job directory under `~/.claude/jobs/<job-id>/`
+  - write `template.md`, `input.txt`, `state.json`
+  - return the job id and directory
+- `reply <job-id> <text>`
+  - write the reply to `replies.jsonl` or `input.txt`
+  - update `state.json`
+
+### Phase 2
+
+- Restore [src/jobs/classifier.ts](</e:/Source_code/Claude-code-bast/src/jobs/classifier.ts:1>)
+- Have job sessions that carry `CLAUDE_JOB_DIR` update `state.json` automatically after a turn completes
+- Then decide whether to add an automatic job runner
+
+### Why split it
+
+- Current evidence indicates these are "template job commands", not just a template listing
+- But there is not enough existing implementation for an automatic job run path, so starting with a filesystem job lifecycle is safer
+
+### Code scope
+
+- Modify [src/cli/handlers/templateJobs.ts](</e:/Source_code/Claude-code-bast/src/cli/handlers/templateJobs.ts:1>)
+- Add `src/jobs/state.ts`
+- Add `src/jobs/templates.ts`
+- Change [src/jobs/classifier.ts](</e:/Source_code/Claude-code-bast/src/jobs/classifier.ts:1>) in Phase 2
+
+### Verification
+
+1. `list` lists `.claude/templates`
+2. `new` creates the job directory and state file
+3. `reply` updates the job content and state
+4. Verify that the classifier writes state in Phase 2
+
+### Risks
+
+- The frontmatter schema needs a minimal field set defined first
+- Once this extends to "automatically running jobs", the scope grows noticeably
+
+### Effort estimate
+
+- MVP: medium
+- Full job system: on the large side
+
+## 4. `assistant [sessionId]`
+
+### Current state
+
+- The main attach flow actually already exists:
+  [src/main.tsx](</e:/Source_code/Claude-code-bast/src/main.tsx:4708>)
+- The base modules needed by the remote viewer already exist:
+  [src/remote/RemoteSessionManager.ts](</e:/Source_code/Claude-code-bast/src/remote/RemoteSessionManager.ts:1>)
+  [src/hooks/useAssistantHistory.ts](</e:/Source_code/Claude-code-bast/src/hooks/useAssistantHistory.ts:1>)
+  [src/assistant/sessionHistory.ts](</e:/Source_code/Claude-code-bast/src/assistant/sessionHistory.ts:1>)
+- What is really still stubbed is mainly:
+  [src/assistant/sessionDiscovery.ts](</e:/Source_code/Claude-code-bast/src/assistant/sessionDiscovery.ts:1>)
+  [src/assistant/AssistantSessionChooser.ts](</e:/Source_code/Claude-code-bast/src/assistant/AssistantSessionChooser.ts:1>)
+  [src/commands/assistant/assistant.ts](</e:/Source_code/Claude-code-bast/src/commands/assistant/assistant.ts:7>)
+  [src/assistant/index.ts](</e:/Source_code/Claude-code-bast/src/assistant/index.ts:1>)
+
+### Goal
+
+- Do not restore the whole KAIROS assistant system in one go.
+- First make "viewer attach with an explicit sessionId" work, then add discovery / chooser / install step by step.
+
+### Phase 4A: MVP
+
+- Support only `claude assistant <sessionId>`
+- For `claude assistant` with no arguments, return a clear message for now:
+  - the current version requires an explicit `sessionId`
+  - discovery is not enabled yet
+- This reuses the existing attach branch directly, without first restoring the chooser/install wizard
+
+### Phase 4B
+
+- Restore `discoverAssistantSessions()`
+- For the data source, prefer reusing the existing sessions / bridge / teleport APIs over a new protocol
+- Let `claude assistant` with no arguments obtain a list of candidate sessions
+
+### Phase 4C
+
+- Restore `AssistantSessionChooser`
+- Allow interactive selection when there are multiple sessions
+
+### Phase 4D
+
+- Consider the install wizard helper functions last
+- This part is about "how to guide the user when there is no session", not the core attach path
+
+### Why split it
+
+- Most of the attach rendering layer and the remote message channel is already in place
+- What is really missing is "how to discover the target session" and "how to choose interactively"
+- Pulling in the whole KAIROS normal mode from `src/assistant/index.ts` as well would put the scope out of control
+
+### Code scope
+
+- Phase 4A:
+  - [src/main.tsx](</e:/Source_code/Claude-code-bast/src/main.tsx:4708>)
+  - [src/commands/assistant/index.ts](</e:/Source_code/Claude-code-bast/src/commands/assistant/index.ts:1>)
+- Phase 4B:
+  - [src/assistant/sessionDiscovery.ts](</e:/Source_code/Claude-code-bast/src/assistant/sessionDiscovery.ts:1>)
+- Phase 4C:
+  - [src/assistant/AssistantSessionChooser.ts](</e:/Source_code/Claude-code-bast/src/assistant/AssistantSessionChooser.ts:1>)
+- Phase 4D:
+  - [src/commands/assistant/assistant.ts](</e:/Source_code/Claude-code-bast/src/commands/assistant/assistant.ts:7>)
+
+### Verification
+
+1. `claude assistant <sessionId>` enters the remote viewer
+2. Lazy loading of history works correctly
+3. The no-argument mode gives a clear message for now
+4. Verify discovery / chooser / install separately in later phases
+
+### Risks
+
+- This has the largest scope of the four items
+- Pulling in the KAIROS normal mode as a whole would inflate "viewer attach" into "full assistant mode recovery"
+
+### Effort estimate
+
+- Phase 4A: medium
+- All of 4A-4D: very large
+
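+## Recommended execution order
+
+1. `claude daemon status` / `claude daemon stop`
+2. `BG_SESSIONS`: do `ps/logs/kill` first
+3. `TEMPLATES`: do the filesystem job MVP first
+4. `assistant [sessionId]`: do attach with an explicit sessionId first, then add discovery/chooser/install
+
+## Brief conclusion
+
+Of these four items, `daemon status/stop` is the best candidate for immediate implementation. `BG_SESSIONS` and `TEMPLATES` are suited to an MVP that first fills in the handlers and closes the filesystem loop. `assistant [sessionId]` cannot be forced through as a single block and should be restored in stages: attach → discovery → chooser → install.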
--- a/docs/task/task-001-daemon-status-stop.md
+++ b/docs/task/task-001-daemon-status-stop.md
@@ -0,0 +1,77 @@
+# Task 001: daemon status / stop
+
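+> Source: item 1 of [stub-recovery-design-1-4.md](../features/stub-recovery-design-1-4.md)
+> Priority: P0 (preferred first item)
+> Effort: small
+> Status: DONE
+
+## Goal
+
+Make `claude daemon status` and `claude daemon stop` work correctly from any CLI process, without depending on in-memory TUI state.
+
+## Background
+
+- The `start` path already has a complete supervisor + worker lifecycle (`src/daemon/main.ts`, `src/daemon/workerRegistry.ts`)
+- `status` / `stop` currently only print placeholder output (`src/daemon/main.ts:49`)
+- `/remote-control-server` has its own in-command UI state, but it only tracks the `daemonProcess` of the current process and is not suitable for cross-process management
+
+## Implementation plan
+
+### New files
+
+| File | Description |
+|------|------|
+| `src/daemon/state.ts` | Module for reading and writing the daemon state file |
+
+### Modified files
+
+| File | Change |
+|------|------|
+| `src/daemon/main.ts` | `start` writes the state file; `status`/`stop` call the state module |
+| `src/commands/remoteControlServer/remoteControlServer.tsx` | Reads the same state file (light change) |
+
+### State file
+
+Path: `~/.claude/daemon/remote-control.json`
+
+```json
+{
+  "pid": 12345,
+  "cwd": "/path/to/project",
+  "startedAt": "2026-04-12T10:00:00Z",
+  "workerKinds": ["bridge", "rcs"],
+  "lastStatus": "running"
+}
+```
+
+### status logic
+
+1. Read the state file
+2. Verify that the pid is alive with a process probe
+3. Print `running` / `stopped` / `stale`
+4. Clean up the state file automatically when stale
+
+### stop logic
+
+1. Read the pid
+2. Send `SIGTERM`
+3. Wait for exit (with a timeout fallback)
+4. Send `SIGKILL` after the timeout
+5. Clean up the state file
+
+## Verification steps
+
+- [ ] `claude daemon start` starts normally and writes the state file
+- [ ] Running `claude daemon status` in a new terminal shows `running`
+- [ ] Running `claude daemon stop` makes the daemon exit normally
+- [ ] Running `claude daemon status` again returns `stopped` or `stale cleaned`
+- [ ] The stop timeout fallback works correctly on Windows
+
+## Risks
+
+- The Windows signal model differs from Unix, so `stop` needs a timeout fallback
+- The current design assumes a single supervisor and does not handle multiple concurrent instances
+
+## Dependencies
+
+No external dependencies; can be implemented independently.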
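
The `status` logic in Task 001 above comes down to a three-way classification of the state file plus a liveness probe. A minimal sketch follows, assuming the state file has already been read and parsed. `classifyDaemon` and `DaemonStatus` are hypothetical names; the field names follow the JSON example in the task, and the liveness probe is injected rather than implemented. In Node.js a common probe is `process.kill(pid, 0)`, which throws when the process does not exist, but whether `src/daemon/state.ts` uses it is not stated in the task.

```typescript
// Hypothetical sketch of the status decision; not the code in src/daemon/state.ts.
interface DaemonState {
  pid: number;
  cwd: string;
  startedAt: string;
  workerKinds: string[];
  lastStatus: string;
}

type DaemonStatus =
  | { kind: "stopped" }                      // no state file on disk
  | { kind: "running"; state: DaemonState }  // state file present, pid alive
  | { kind: "stale"; state: DaemonState };   // state file present, pid gone

function classifyDaemon(
  state: DaemonState | null,
  isAlive: (pid: number) => boolean,
): DaemonStatus {
  if (state === null) return { kind: "stopped" };
  return isAlive(state.pid)
    ? { kind: "running", state }
    : { kind: "stale", state };
}
```

Keeping the probe as a parameter means the classification can be unit-tested without spawning processes. The caller is still responsible for deleting the state file when the result is `stale`.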
--- a/docs/task/task-002-bg-sessions-ps-logs-kill.md
+++ b/docs/task/task-002-bg-sessions-ps-logs-kill.md
@@ -0,0 +1,80 @@
+# Task 002: BG_SESSIONS (ps / logs / kill)
+
+> Source: item 2 of [stub-recovery-design-1-4.md](../features/stub-recovery-design-1-4.md)
+> Priority: P1
+> Effort: medium
+> Status: DONE
+> Phase: Phase 2A (MVP)
+
+## Goal
+
+Make `ps` / `logs` / `kill` genuinely useful session management commands. `attach` / `--bg` are not completed in the first phase.
+
+## Background
+
+- The fast-path is already wired up (`src/entrypoints/cli.tsx:218`)
+- The session registry already has a real implementation (`src/utils/concurrentSessions.ts`)
+- Inside a bg session, `exit` already runs `tmux detach-client` (`src/commands/exit/exit.tsx:20`)
+- The CLI handlers are still completely empty (`src/cli/bg.ts`)
+- The task summary is still a stub (`src/utils/taskSummary.ts`)
+
+## Implementation plan
+
+### Modified files
+
+| File | Change |
+|------|------|
+| `src/cli/bg.ts` | Implement the `ps` / `logs` / `kill` handlers |
+| `src/utils/concurrentSessions.ts` | Extend for later use by attach/--bg |
+| `src/utils/taskSummary.ts` | Add a basic implementation |
+
+### Reused modules
+
+- `src/utils/sessionStorage.ts`: session storage
+- `src/utils/udsClient.ts`: UDS communication
+
+### ps command
+
+- Read live sessions from the registry
+- Show: pid, kind, sessionId, cwd, name, startedAt, bridgeSessionId
+- Show activity/status as well when available
+
+### logs command
+
+- Support lookup by `sessionId` / `pid` / `name`
+- Prefer reusing the local transcript/log reading facility
+- Support tailing the file when the registry has a `logPath`
+
+### kill command
+
+- Resolve the target session
+- Send an exit signal
+- Clean up stale registry entries
+
+## Verification steps
+
+- [ ] `ps` lists the current live sessions
+- [ ] `logs <sessionId|pid|name>` prints the matching logs
+- [ ] `kill <sessionId|pid|name>` terminates the target session and cleans up the registry
+- [ ] Each command gives a clear message when there are no live sessions
+
+## Phase 2B (follow-up)
+
+- [ ] Implement `attach`
+- [ ] Implement `--bg`
+- [ ] Implement mid-run status updates for `taskSummary`
+
+### Why split it
+
+- The existing registry records `pid / sessionId / name / logPath`
+- But it has no reliable tmux attach target
+- `attach` and `--bg` need a design for launch/attach metadata; they are not a matter of filling in handlers
+
+## Risks
+
+- `attach` / `--bg` in the second phase need a tmux metadata design
+- The tmux path on Windows needs an explicit fallback strategy
+
+## Dependencies
+
+- Task 001 (the daemon state management pattern is reusable, but this is not a hard dependency)
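
Both `logs` and `kill` in Task 002 above accept a target given as `sessionId`, `pid`, or `name`, so they need one shared resolution step. A minimal sketch of such a resolver follows. `resolveSession` and `SessionEntry` are hypothetical names, the entry fields are the subset of registry fields named in the task, and the lookup order (sessionId, then pid, then name) is an assumption rather than something the task specifies.

```typescript
// Hypothetical resolver; the real registry lives in src/utils/concurrentSessions.ts.
interface SessionEntry {
  pid: number;
  sessionId: string;
  name?: string;
  logPath?: string;
}

function resolveSession(
  sessions: SessionEntry[],
  target: string,
): SessionEntry | undefined {
  // 1. An exact sessionId match wins.
  const bySessionId = sessions.find((s) => s.sessionId === target);
  if (bySessionId) return bySessionId;

  // 2. A purely numeric target is tried as a pid.
  if (/^\d+$/.test(target)) {
    const pid = Number(target);
    const byPid = sessions.find((s) => s.pid === pid);
    if (byPid) return byPid;
  }

  // 3. Otherwise fall back to the session name.
  return sessions.find((s) => s.name === target);
}
```

Trying the pid before the name means a session that happens to be named `1234` is still reachable whenever no live session has that pid.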
--- a/docs/task/task-003-templates-job-mvp.md
+++ b/docs/task/task-003-templates-job-mvp.md
@@ -0,0 +1,87 @@
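+# Task 003: TEMPLATES (filesystem job MVP)
+
+> Source: item 3 of [stub-recovery-design-1-4.md](../features/stub-recovery-design-1-4.md)
+> Priority: P2
+> Effort: medium
+> Status: DONE
+> Phase: MVP
+
+## Goal
+
+Turn `new` / `list` / `reply` into a usable template job system. Complex automatic classification and automatic execution are left out of the first phase.
+
+## Background
+
+- The command entry point only has a fast-path (`src/entrypoints/cli.tsx:249`)
+- The handler is empty (`src/cli/handlers/templateJobs.ts`)
+- `markdownConfigLoader` already includes `templates` among the config directories (`src/utils/markdownConfigLoader.ts:29`)
+- `query/stopHooks` already reserves a job classifier path (`src/query/stopHooks.ts:103`)
+- `jobs/classifier.ts` is still a stub (`src/jobs/classifier.ts`)
+
+## Implementation plan
+
+### New files
+
+| File | Description |
+|------|------|
+| `src/jobs/state.ts` | Job state management |
+| `src/jobs/templates.ts` | Template parsing and listing |
+
+### Modified files
+
+| File | Change |
+|------|------|
+| `src/cli/handlers/templateJobs.ts` | Implement the `new` / `list` / `reply` handlers |
+
+### Template source
+
+`.claude/templates/*.md`
+
+### Template format
+
+Reuse the existing markdown + frontmatter parsing; do not design a separate DSL.
+
+### list command
+
+- List all templates
+- Show: template name, description, path
+
+### new command
+
+- Parse the template
+- Create a job directory under `~/.claude/jobs/<job-id>/`
+- Write `template.md`, `input.txt`, `state.json`
+- Return the job id and directory path
+
+### reply command
+
+- Write the reply to `replies.jsonl` or `input.txt`
+- Update `state.json`
+
+## Verification steps
+
+- [ ] `list` lists all templates under `.claude/templates`
+- [ ] `new <template> [args...]` creates the job directory and state file
+- [ ] `reply <job-id> <text>` updates the job content and state
+- [ ] The minimal field set of the frontmatter schema is defined
+
+## Phase 2 (follow-up)
+
+- [ ] Restore `src/jobs/classifier.ts`
+- [ ] Have job sessions that carry `CLAUDE_JOB_DIR` update `state.json` automatically after a turn completes
+- [ ] Then decide whether to add an automatic job runner
+
+### Why split it
+
+- These are "template job commands", not just a template listing
+- There is not enough existing implementation for an automatic job run path
+- Starting with a filesystem job lifecycle is safer
+
+## Risks
+
+- The frontmatter schema needs a minimal field set defined first
+- Once this extends to "automatically running jobs", the scope grows noticeably
+
+## Dependencies
+
+No hard dependencies; can be implemented independently.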
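
The `new` and `reply` commands in Task 003 above are defined by the files they write under `~/.claude/jobs/<job-id>/`. The sketch below builds those file contents in memory so that the layout is easy to see. The three file names come from the task; the `JobState` fields, the `status` values, and the function names `buildJobFiles` and `applyReply` are hypothetical, since the task does not define the `state.json` schema.

```typescript
// Hypothetical shapes: the task names the three files but not their schema.
interface JobState {
  jobId: string;
  template: string;
  status: "created" | "replied";
  createdAt: string;
  updatedAt: string;
  replyCount: number;
}

interface JobFiles {
  "template.md": string;
  "input.txt": string;
  "state.json": string;
}

// Contents `new <template> [args...]` would write into the job directory.
function buildJobFiles(
  jobId: string,
  templateName: string,
  templateText: string,
  args: string[],
  nowIso: string,
): JobFiles {
  const state: JobState = {
    jobId,
    template: templateName,
    status: "created",
    createdAt: nowIso,
    updatedAt: nowIso,
    replyCount: 0,
  };
  return {
    "template.md": templateText,
    "input.txt": args.join(" "),
    "state.json": JSON.stringify(state, null, 2),
  };
}

// State transition `reply <job-id> <text>` would apply before rewriting state.json.
function applyReply(state: JobState, nowIso: string): JobState {
  return {
    ...state,
    status: "replied",
    updatedAt: nowIso,
    replyCount: state.replyCount + 1,
  };
}
```

Separating "what to write" from the actual filesystem calls keeps the handler thin and lets the state transitions be tested without touching `~/.claude`.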
--- a/docs/task/task-004-assistant-session-attach.md
+++ b/docs/task/task-004-assistant-session-attach.md
@@ -0,0 +1,103 @@
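+# Task 004: assistant [sessionId] (phased recovery)
+
+> Source: item 4 of [stub-recovery-design-1-4.md](../features/stub-recovery-design-1-4.md)
+> Priority: P3
+> Effort: Phase 4A is medium; all of 4A-4D is very large
+> Status: Phase 4A DONE, 4B-4D TODO
+
+## Goal
+
+Do not restore the whole KAIROS assistant system in one go. First make "viewer attach with an explicit sessionId" work, then add discovery / chooser / install step by step.
+
+## Background
+
+- The main attach flow already exists (`src/main.tsx:4708`)
+- The base modules needed by the remote viewer already exist:
+  - `src/remote/RemoteSessionManager.ts`
+  - `src/hooks/useAssistantHistory.ts`
+  - `src/assistant/sessionHistory.ts`
+- What is really still stubbed is mainly:
+  - `src/assistant/sessionDiscovery.ts`
+  - `src/assistant/AssistantSessionChooser.ts`
+  - `src/commands/assistant/assistant.ts:7`
+  - `src/assistant/index.ts`
+
+## Phased implementation
+
+### Phase 4A: MVP (attach with an explicit sessionId)
+
+**Modified files:**
+
+| File | Change |
+|------|------|
+| `src/main.tsx` | Make sure the attach branch is usable |
+| `src/commands/assistant/index.ts` | Implement the explicit sessionId argument entry point |
+
+**Behavior:**
+- `claude assistant <sessionId>`: enters the remote viewer
+- `claude assistant` (no arguments): returns a clear message that the current version requires an explicit sessionId and that discovery is not enabled yet
+
+**Verification:**
+- [ ] `claude assistant <sessionId>` enters the remote viewer
+- [ ] Lazy loading of history works correctly
+- [ ] The no-argument mode gives a clear message
+
+### Phase 4B: session discovery
+
+**Modified files:**
+
+| File | Change |
+|------|------|
+| `src/assistant/sessionDiscovery.ts` | Restore `discoverAssistantSessions()` |
+
+**Behavior:**
+- For the data source, prefer reusing the existing sessions / bridge / teleport APIs; add no new protocol
+- `claude assistant` with no arguments obtains a list of candidate sessions
+
+**Verification:**
+- [ ] A no-argument call lists the available sessions
+- [ ] The data source reuses existing channels
+
+### Phase 4C: session chooser
+
+**Modified files:**
+
+| File | Change |
+|------|------|
+| `src/assistant/AssistantSessionChooser.ts` | Restore the interactive chooser |
+
+**Behavior:**
+- Allow interactive selection when there are multiple sessions
+
+**Verification:**
+- [ ] The chooser appears when there are multiple sessions
+- [ ] The selected session is attached correctly
+
+### Phase 4D: install wizard
+
+**Modified files:**
+
+| File | Change |
+|------|------|
+| `src/commands/assistant/assistant.ts` | Restore the install wizard helper functions |
+
+**Behavior:**
+- Guide the user when there is no session
+
+**Verification:**
+- [ ] When no session is available, the user is guided to create or connect one
+
+## Why split it
+
+- Most of the attach rendering layer and the remote message channel is already in place
+- What is really missing is "how to discover the target session" and "how to choose interactively"
+- Pulling in the whole KAIROS normal mode from `src/assistant/index.ts` as well would put the scope out of control
+
+## Risks
+
+- This has the largest scope of the four items
+- Pulling in the KAIROS normal mode as a whole would inflate "viewer attach" into "full assistant mode recovery"
+
+## Dependencies
+
+- The session registry pattern from Task 002 can be reused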
--- a/docs/test-plans/openclaw-autonomy-baseline.md
+++ b/docs/test-plans/openclaw-autonomy-baseline.md
@@ -0,0 +1,88 @@
+# OpenClaw Autonomy Baseline Test Spec
+
+## Purpose
+
+This test spec locks the current behavior of the existing trigger and context layers before any formal autonomy-subsystem implementation begins.
+
+At this stage, production code is read-only. Only test files, fixtures, and planning documents may change.
+
+## Goal
+
+Establish a stable baseline around the parts of `Claude-code-bast` that later autonomy work is most likely to touch:
+
+- proactive state handling
+- cron task storage semantics
+- cron scheduler helper semantics
+- user-context cache and `CLAUDE.md` injection behavior
+
+## Out of Scope for This Baseline Round
+
+- New authority behavior (`AGENTS.md` / `HEARTBEAT.md`)
+- New detached-run ledger behavior
+- New flow behavior
+- UI redesign
+
+## Files Under Baseline Protection
+
+- `src/proactive/index.ts`
+- `src/utils/cronTasks.ts`
+- `src/utils/cronScheduler.ts`
+- `src/context.ts`
+
+## Test Files Added In This Round
+
+- `src/proactive/__tests__/state.baseline.test.ts`
+- `src/commands/__tests__/proactive.baseline.test.ts`
+- `src/utils/__tests__/cronTasks.baseline.test.ts`
+- `src/utils/__tests__/cronScheduler.baseline.test.ts`
+- `src/__tests__/context.baseline.test.ts`
+
+## Baseline Assertions
+
+### Proactive state
+
+1. Activating proactive mode sets active state and activation source.
+2. Pausing proactive mode suppresses `shouldTick()` and clears `nextTickAt`.
+3. Blocking context suppresses `shouldTick()` and clears `nextTickAt`.
+4. Subscribers are notified on state transitions.
+5. The `/proactive` command enables proactive mode and emits the expected hidden reminder.
+6. The `/proactive` command disables proactive mode on the second invocation.
+
+### Cron task storage
+
+1. Session-only cron tasks remain in memory only.
+2. Durable cron tasks are persisted to `.claude/scheduled_tasks.json`.
+3. Daemon-style `dir`-scoped reads exclude session-only cron tasks.
+4. `removeCronTasks()` without `dir` can remove session-only tasks.
+5. `removeCronTasks()` with `dir` does not mutate session-only task storage.
+
+### Cron scheduler helpers
+
+1. `isRecurringTaskAged()` preserves current aging semantics.
+2. `buildMissedTaskNotification()` preserves the current AskUserQuestion safety wording.
+3. `buildMissedTaskNotification()` preserves code-fence hardening for prompt bodies that contain backticks.
+
+### User context caching
+
+1. `getUserContext()` includes `currentDate`.
+2. `getUserContext()` includes mocked `claudeMd` content when memory loading is enabled.
+3. `CLAUDE_CODE_DISABLE_CLAUDE_MDS` suppresses `claudeMd`.
+4. `setSystemPromptInjection()` clears the memoized user-context cache.
+5. `getSystemContext()` reflects the injection after cache invalidation.
+
+## Remaining Baseline Gaps
+
+The following areas are intentionally deferred because they require higher-cost harnessing and should still avoid production-code changes:
+
+1. `useScheduledTasks.ts` hook-level runtime behavior
+2. `src/cli/print.ts` full headless scheduler loop behavior
+3. `useProactive.ts` hook timer behavior
+4. end-to-end queue interaction between proactive ticks and `SleepTool`
+
+## Acceptance
+
+This baseline round is complete when:
+
+1. The five new test files pass.
+2. No production source files are modified.
+3. The tests are stable enough to serve as a pre-implementation guardrail.
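
The proactive-state assertions in the baseline spec above (pausing or blocking context suppresses `shouldTick()` and clears `nextTickAt`) can be pictured with a small toy model. The sketch below is not the implementation in `src/proactive/index.ts`; the immutable state shape and the function names other than `shouldTick` are hypothetical and only mirror the behavior the spec asserts.

```typescript
// Toy model of the asserted behavior; not the code in src/proactive/index.ts.
interface ProactiveState {
  active: boolean;
  paused: boolean;
  contextBlocked: boolean;
  nextTickAt: number | null;
}

const initialState: ProactiveState = {
  active: false,
  paused: false,
  contextBlocked: false,
  nextTickAt: null,
};

function activate(s: ProactiveState, nextTickAt: number): ProactiveState {
  return { ...s, active: true, nextTickAt };
}

// Pausing suppresses ticks and clears the scheduled tick.
function pause(s: ProactiveState): ProactiveState {
  return { ...s, paused: true, nextTickAt: null };
}

// Blocking context (for example after an API error) has the same effect.
function blockContext(s: ProactiveState): ProactiveState {
  return { ...s, contextBlocked: true, nextTickAt: null };
}

function shouldTick(s: ProactiveState): boolean {
  return s.active && !s.paused && !s.contextBlocked;
}
```

A baseline test then only needs to assert on the returned state, which matches the spirit of assertions 2 and 3 in the "Proactive state" list.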
--- a/package.json
+++ b/package.json
@@ -55,7 +55,7 @@
    "rcs": "bun run scripts/rcs.ts"
  },
  "dependencies": {
-    "@claude-code-best/mcp-chrome-bridge": "^2.0.6"
+    "@claude-code-best/mcp-chrome-bridge": "^2.0.7"
  },
  "devDependencies": {
    "@types/he": "^1.2.3",
--- a/packages/@ant/computer-use-mcp/src/toolCalls.ts
+++ b/packages/@ant/computer-use-mcp/src/toolCalls.ts
@@ -37,16 +37,21 @@
 import type { CallToolResult } from "@modelcontextprotocol/sdk/types.js";
 import { randomUUID } from "node:crypto";

-/** Detect actual image MIME type from base64 data using magic bytes. */
+/** Detect actual image MIME type from base64 data by decoding the magic bytes. */
 function detectMimeFromBase64(b64: string): string {
-  // First byte is enough to distinguish PNG (0x89) from JPEG (0xFF)
-  const c = b64.charCodeAt(0);
-  if (c === 0x89) return "image/png";
-  if (c === 0xFF) return "image/jpeg";
-  // RIFF = WebP
-  if (c === 0x52) return "image/webp";
-  // GIF
-  if (c === 0x47) return "image/gif";
+  // Decode first 12 raw bytes (16 base64 chars is enough) and check standard magic bytes.
+  // PNG:  89 50 4E 47
+  // JPEG: FF D8 FF
+  // RIFF+WEBP: "RIFF" at 0..3 + "WEBP" at 8..11
+  // GIF:  "GIF" at 0..2
+  const raw = Buffer.from(b64.slice(0, 16), "base64");
+  if (raw[0] === 0x89 && raw[1] === 0x50 && raw[2] === 0x4e && raw[3] === 0x47) return "image/png";
+  if (raw[0] === 0xff && raw[1] === 0xd8 && raw[2] === 0xff) return "image/jpeg";
+  if (
+    raw[0] === 0x52 && raw[1] === 0x49 && raw[2] === 0x46 && raw[3] === 0x46 && // RIFF
+    raw[8] === 0x57 && raw[9] === 0x45 && raw[10] === 0x42 && raw[11] === 0x50  // WEBP
+  ) return "image/webp";
+  if (raw[0] === 0x47 && raw[1] === 0x49 && raw[2] === 0x46) return "image/gif";
  return "image/png";
 }
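A minimal standalone sketch of why the old check could never match, and how decoding the leading bytes fixes it. The names `decodeHead` and `sniffMime` are hypothetical and not part of the patch; the hand-rolled decoder is used only so the example does not depend on `Buffer`, and unlike the patched function it returns `undefined` instead of defaulting to PNG.

```typescript
// Hypothetical helpers for illustration only; the patch itself uses Buffer.from(..., "base64").
const B64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/** Decode at most `maxBytes` leading bytes of a base64 string. */
function decodeHead(b64: string, maxBytes: number): number[] {
  const out: number[] = [];
  let acc = 0;
  let bits = 0;
  for (let i = 0; i < b64.length && out.length < maxBytes; i++) {
    const v = B64_ALPHABET.indexOf(b64.charAt(i));
    if (v < 0) break; // padding or an invalid character ends the decode
    acc = ((acc << 6) | v) & 0xffff; // keep only the bits still needed
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out.push((acc >> bits) & 0xff);
    }
  }
  return out;
}

/** Return the MIME type implied by the magic bytes, or undefined if unknown. */
function sniffMime(b64: string): string | undefined {
  const raw = decodeHead(b64, 12);
  const has = (sig: number[], offset = 0): boolean =>
    sig.every((b, i) => raw[offset + i] === b);
  if (has([0x89, 0x50, 0x4e, 0x47])) return "image/png";
  if (has([0xff, 0xd8, 0xff])) return "image/jpeg";
  if (has([0x52, 0x49, 0x46, 0x46]) && has([0x57, 0x45, 0x42, 0x50], 8)) return "image/webp";
  if (has([0x47, 0x49, 0x46])) return "image/gif";
  return undefined;
}

// The first base64 *character* of a PNG is "i" (0x69), not the raw byte 0x89,
// which is why comparing charCodeAt(0) against magic bytes never matched PNG or JPEG.
console.log("iVBORw0KGgo=".charCodeAt(0).toString(16));
console.log(sniffMime("iVBORw0KGgo="), sniffMime("/9j/4AAQ"));
```

The old check only appeared to work for WebP and GIF because both happen to start with the base64 character "R" (0x52), so GIF data was reported as WebP.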

--- a/packages/builtin-tools/src/tools/PushNotificationTool/PushNotificationTool.ts
+++ b/packages/builtin-tools/src/tools/PushNotificationTool/PushNotificationTool.ts
@@ -1,7 +1,9 @@
+import { feature } from 'bun:bundle'
 import { z } from 'zod/v4'
 import type { ToolResultBlockParam } from 'src/Tool.js'
 import { buildTool } from 'src/Tool.js'
 import { lazySchema } from 'src/utils/lazySchema.js'
+import { logForDebugging } from 'src/utils/debug.js'

 const PUSH_NOTIFICATION_TOOL_NAME = 'PushNotification'

@@ -74,14 +76,58 @@ Requires Remote Control to be configured. Respects user notification settings (t
    }
  },

-  async call(_input: PushInput) {
-    // Push delivery is handled by the Remote Control / KAIROS transport layer.
-    // Without the KAIROS runtime, this tool is not available.
-    return {
-      data: {
-        sent: false,
-        error: 'PushNotification requires the KAIROS transport layer.',
-      },
+  async call(input: PushInput, context) {
+    const appState = context.getAppState()
+
+    // Try bridge delivery first (for remote/mobile viewers)
+    if (appState.replBridgeEnabled) {
+      if (feature('BRIDGE_MODE')) {
+        try {
+          const { getBridgeAccessToken, getBridgeBaseUrl } = await import(
+            'src/bridge/bridgeConfig.js'
+          )
+          const { getSessionId } = await import('src/bootstrap/state.js')
+          const token = getBridgeAccessToken()
+          const sessionId = getSessionId()
+          if (token && sessionId) {
+            const baseUrl = getBridgeBaseUrl()
+            const axios = (await import('axios')).default
+            const response = await axios.post(
+              `${baseUrl}/v1/sessions/${sessionId}/events`,
+              {
+                events: [
+                  {
+                    type: 'push_notification',
+                    title: input.title,
+                    body: input.body,
+                    priority: input.priority ?? 'normal',
+                  },
+                ],
+              },
+              {
+                headers: {
+                  Authorization: `Bearer ${token}`,
+                  'Content-Type': 'application/json',
+                  'anthropic-version': '2023-06-01',
+                },
+                timeout: 10_000,
+                validateStatus: (s: number) => s < 500,
+              },
+            )
+            if (response.status >= 200 && response.status < 300) {
+              logForDebugging(`[PushNotification] delivered via bridge session=${sessionId}`)
+              return { data: { sent: true } }
+            }
+            logForDebugging(`[PushNotification] bridge delivery failed: status=${response.status}`)
+          }
+        } catch (e) {
+          logForDebugging(`[PushNotification] bridge delivery error: ${e}`)
+        }
+      }
    }
+
+    // Fallback: no bridge available, push was not delivered to a remote device.
+    logForDebugging(`[PushNotification] no bridge available, not delivered: ${input.title}`)
+    return { data: { sent: false, error: 'No Remote Control bridge configured. Notification not delivered.' } }
  },
 })
--- a/packages/builtin-tools/src/tools/SendUserFileTool/SendUserFileTool.ts
+++ b/packages/builtin-tools/src/tools/SendUserFileTool/SendUserFileTool.ts
@@ -70,14 +70,51 @@ Guidelines:
    }
  },

-  async call(_input: SendUserFileInput) {
-    // File transfer is handled by the KAIROS assistant transport layer.
-    // Without the KAIROS runtime, this tool is not available.
+  async call(input: SendUserFileInput, context) {
+    const { file_path } = input
+    const { stat } = await import('fs/promises')
+
+    // Verify file exists and is readable
+    let fileSize: number
+    try {
+      const fileStat = await stat(file_path)
+      if (!fileStat.isFile()) {
+        return {
+          data: { sent: false, file_path, error: 'Path is not a file.' },
+        }
+      }
+      fileSize = fileStat.size
+    } catch {
+      return {
+        data: { sent: false, file_path, error: 'File does not exist or is not readable.' },
+      }
+    }
+
+    // Attempt bridge upload if available (so web viewers can download)
+    const appState = context.getAppState()
+    let fileUuid: string | undefined
+    if (appState.replBridgeEnabled) {
+      try {
+        const { uploadBriefAttachment } = await import(
+          '@claude-code-best/builtin-tools/tools/BriefTool/upload.js'
+        )
+        fileUuid = await uploadBriefAttachment(file_path, fileSize, {
+          replBridgeEnabled: true,
+          signal: context.abortController.signal,
+        })
+      } catch {
+        // Best-effort upload — local path is always available
+      }
+    }
+
+    const delivered = !appState.replBridgeEnabled || Boolean(fileUuid)
    return {
      data: {
-        sent: false,
-        file_path: _input.file_path,
-        error: 'SendUserFile requires the KAIROS assistant transport layer.',
+        sent: delivered,
+        file_path,
+        size: fileSize,
+        ...(fileUuid ? { file_uuid: fileUuid } : {}),
+        ...(!delivered ? { error: 'Bridge upload failed. File available at local path.' } : {}),
      },
    }
  },
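The `sent` flag computed above follows a small truth table: with no bridge, the local path counts as delivery; with a bridge, an upload UUID is required. A sketch of that rule in isolation, where `isDelivered` is a hypothetical helper name rather than anything in the patch:

```typescript
// Illustrative only: mirrors `!appState.replBridgeEnabled || Boolean(fileUuid)`.
function isDelivered(bridgeEnabled: boolean, fileUuid: string | undefined): boolean {
  // No bridge: the file is reachable at its local path, so report it as sent.
  // Bridge enabled: only a successful upload (a non-empty UUID) counts.
  return !bridgeEnabled || Boolean(fileUuid);
}

console.log(isDelivered(false, undefined)); // local-only session
console.log(isDelivered(true, undefined)); // bridge upload failed
console.log(isDelivered(true, "file-uuid-example")); // bridge upload succeeded
```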
--- a/scripts/dev.ts
+++ b/scripts/dev.ts
@@ -47,6 +47,8 @@ const DEFAULT_FEATURES = [
  "KAIROS",
  "COORDINATOR_MODE",
  "LAN_PIPES",
+  "BG_SESSIONS",
+  "TEMPLATES",
  // "REVIEW_ARTIFACT", // API requests get no response; schema compatibility needs further investigation
  // P3: poor mode (disable extract_memories + prompt_suggestion)
  "POOR",
--- a/src/tests/context.baseline.test.ts
+++ b/src/tests/context.baseline.test.ts
@@ -0,0 +1,91 @@
+import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
+import {
+  resetStateForTests,
+  setOriginalCwd,
+  setProjectRoot,
+} from '../bootstrap/state'
+import {
+  getSystemContext,
+  getUserContext,
+  setSystemPromptInjection,
+} from '../context'
+import { clearMemoryFileCaches } from '../utils/claudemd'
+import { cleanupTempDir, createTempDir, writeTempFile } from '../../tests/mocks/file-system'
+
+let tempDir = ''
+let projectClaudeMdContent = ''
+
+beforeEach(async () => {
+  tempDir = await createTempDir('context-baseline-')
+  projectClaudeMdContent = `baseline-${Date.now()}`
+
+  resetStateForTests()
+  setOriginalCwd(tempDir)
+  setProjectRoot(tempDir)
+  await writeTempFile(tempDir, 'CLAUDE.md', projectClaudeMdContent)
+
+  clearMemoryFileCaches()
+  getUserContext.cache.clear?.()
+  getSystemContext.cache.clear?.()
+  setSystemPromptInjection(null)
+  delete process.env.CLAUDE_CODE_DISABLE_CLAUDE_MDS
+})
+
+afterEach(async () => {
+  clearMemoryFileCaches()
+  getUserContext.cache.clear?.()
+  getSystemContext.cache.clear?.()
+  setSystemPromptInjection(null)
+  delete process.env.CLAUDE_CODE_DISABLE_CLAUDE_MDS
+  resetStateForTests()
+  if (tempDir) {
+    await cleanupTempDir(tempDir)
+  }
+})
+
+describe('context baseline', () => {
+  test('getUserContext includes currentDate and project CLAUDE.md content', async () => {
+    const ctx = await getUserContext()
+
+    expect(ctx.currentDate).toContain("Today's date is")
+    expect(ctx.claudeMd).toContain(projectClaudeMdContent)
+  })
+
+  test('CLAUDE_CODE_DISABLE_CLAUDE_MDS suppresses claudeMd loading', async () => {
+    process.env.CLAUDE_CODE_DISABLE_CLAUDE_MDS = '1'
+
+    const ctx = await getUserContext()
+
+    expect(ctx.currentDate).toContain("Today's date is")
+    expect(ctx.claudeMd).toBeUndefined()
+  })
+
+  test('setSystemPromptInjection clears the memoized user-context cache', async () => {
+    const first = await getUserContext()
+    process.env.CLAUDE_CODE_DISABLE_CLAUDE_MDS = '1'
+
+    const second = await getUserContext()
+    expect(first.claudeMd).toContain(projectClaudeMdContent)
+    expect(second.claudeMd).toContain(projectClaudeMdContent)
+
+    setSystemPromptInjection('cache-break')
+
+    const third = await getUserContext()
+    expect(third.claudeMd).toBeUndefined()
+  })
+
+  test('getSystemContext reflects system prompt injection after cache invalidation', async () => {
+    const first = await getSystemContext()
+    expect(first.gitStatus).toBeUndefined()
+    expect(first.cacheBreaker).toBeUndefined()
+
+    setSystemPromptInjection('baseline-cache-break')
+
+    const second = await getSystemContext()
+    if ('cacheBreaker' in second) {
+      expect(second.cacheBreaker).toContain('baseline-cache-break')
+    } else {
+      expect(second.gitStatus).toBeUndefined()
+    }
+  })
+})
--- a/src/assistant/AssistantSessionChooser.ts
+++ b/src/assistant/AssistantSessionChooser.ts
@@ -1,3 +0,0 @@
-// Auto-generated stub — replace with real implementation
-export {};
-export const AssistantSessionChooser: (props: Record<string, unknown>) => null = () => null;
--- a/src/assistant/AssistantSessionChooser.tsx
+++ b/src/assistant/AssistantSessionChooser.tsx
@@ -0,0 +1,54 @@
+import * as React from 'react';
+import { useState } from 'react';
+import { Box, Text } from '@anthropic/ink';
+import { Dialog } from '../components/design-system/Dialog.js';
+import { ListItem } from '../components/design-system/ListItem.js';
+import { useRegisterOverlay } from '../context/overlayContext.js';
+import { useKeybindings } from '../keybindings/useKeybinding.js';
+import type { AssistantSession } from './sessionDiscovery.js';
+
+interface Props {
+  sessions: AssistantSession[];
+  onSelect: (id: string) => void;
+  onCancel: () => void;
+}
+
+/**
+ * Interactive session chooser for `claude assistant` when multiple
+ * CCR sessions are discovered. Renders a Dialog with up/down navigation.
+ *
+ * Session IDs are in `session_*` compat format — passed directly to
+ * createRemoteSessionConfig() for viewer attach.
+ */
+export function AssistantSessionChooser({ sessions, onSelect, onCancel }: Props): React.ReactNode {
+  useRegisterOverlay('assistant-session-chooser');
+  const [focusIndex, setFocusIndex] = useState(0);
+
+  useKeybindings(
+    {
+      'select:next': () => setFocusIndex(i => (i + 1) % sessions.length),
+      'select:previous': () => setFocusIndex(i => (i - 1 + sessions.length) % sessions.length),
+      'select:accept': () => onSelect(sessions[focusIndex]!.id),
+    },
+    { context: 'Select' },
+  );
+
+  return (
+    <Dialog title="Select Assistant Session" onCancel={onCancel} hideInputGuide>
+      <Box flexDirection="column" gap={1}>
+        <Text>Multiple sessions found. Select one to attach:</Text>
+        <Box flexDirection="column">
+          {sessions.map((s, i) => (
+            <ListItem key={s.id} isFocused={focusIndex === i}>
+              <Box>
+                <Text>{s.title || s.id.slice(0, 20)}</Text>
+                <Text dimColor> [{s.status}]</Text>
+              </Box>
+            </ListItem>
+          ))}
+        </Box>
+        <Text dimColor>↑↓ navigate · Enter select · Esc cancel</Text>
+      </Box>
+    </Dialog>
+  );
+}
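The focus handling in the chooser relies on modular arithmetic to wrap around at both ends of the list. A minimal sketch of the two updates, assuming a non-empty list (`nextIndex` and `previousIndex` are hypothetical names):

```typescript
// Illustrative helpers; `n` is the number of sessions and must be greater than 0.
function nextIndex(i: number, n: number): number {
  return (i + 1) % n;
}

function previousIndex(i: number, n: number): number {
  // Adding n before the modulo keeps the result non-negative when i is 0.
  return (i - 1 + n) % n;
}

console.log(nextIndex(2, 3), previousIndex(0, 3));
```

With an empty list both expressions evaluate to `NaN`, so a caller would presumably render the chooser only when at least one session exists.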
--- a/src/assistant/gate.ts
+++ b/src/assistant/gate.ts
@@ -5,21 +5,20 @@ import { getFeatureValue_CACHED_MAY_BE_STALE } from '../services/analytics/growt
 /**
 * Runtime gate for KAIROS features.
 *
- * Build-time: feature('KAIROS') must be on (checked by caller before
- * this module is required).
+ * Two-layer gate:
+ *   1. Build-time: feature('KAIROS') must be on
+ *   2. Runtime: tengu_kairos_assistant GrowthBook flag (remote kill switch)
 *
- * Runtime: tengu_kairos_assistant GrowthBook flag acts as a remote kill
- * switch, and kairosActive state must be true (set during bootstrap when
- * the session qualifies for KAIROS features).
+ * Called by main.tsx BEFORE setKairosActive(true), so it must NOT check
+ * kairosActive: that check would be circular (the gate would need the active state, which is only set once the gate passes).
+ * The caller (main.tsx L1826-1832) sets kairosActive after this returns true.
 */
 export async function isKairosEnabled(): Promise<boolean> {
  if (!feature('KAIROS')) {
    return false
  }
-  if (
-    !getFeatureValue_CACHED_MAY_BE_STALE('tengu_kairos_assistant', false)
-  ) {
+  if (!getFeatureValue_CACHED_MAY_BE_STALE('tengu_kairos_assistant', false)) {
    return false
  }
-  return getKairosActive()
+  return true
 }
--- a/src/assistant/index.ts
+++ b/src/assistant/index.ts
@@ -1,9 +1,64 @@
-// Auto-generated stub — replace with real implementation
-export {}
-export const isAssistantMode: () => boolean = () => false
-export const initializeAssistantTeam: () => Promise<void> = async () => {}
-export const markAssistantForced: () => void = () => {}
-export const isAssistantForced: () => boolean = () => false
-export const getAssistantSystemPromptAddendum: () => string = () => ''
-export const getAssistantActivationPath: () => string | undefined = () =>
-  undefined
+import { readFileSync } from 'fs'
+import { join } from 'path'
+import { getKairosActive } from '../bootstrap/state.js'
+import { getClaudeConfigHomeDir } from '../utils/envUtils.js'
+
+let _assistantForced = false
+
+/**
+ * Whether the current session is in assistant (KAIROS) daemon mode.
+ * Wraps the bootstrap kairosActive state set by main.tsx after gate check.
+ */
+export function isAssistantMode(): boolean {
+  return getKairosActive()
+}
+
+/**
+ * Mark this session as forced assistant mode (--assistant flag).
+ * Skips the GrowthBook gate check — daemon is pre-entitled.
+ */
+export function markAssistantForced(): void {
+  _assistantForced = true
+}
+
+export function isAssistantForced(): boolean {
+  return _assistantForced
+}
+
+/**
+ * Pre-create an in-process team so Agent(name) can spawn teammates
+ * without TeamCreate.
+ *
+ * Phase 1: returns undefined so main.tsx's `assistantTeamContext ?? computeInitialTeamContext()`
+ * correctly falls back. Returning {} would bypass the ?? operator since {} is truthy.
+ *
+ * Phase 2: should return a full team context object matching AppState.teamContext shape.
+ */
+export async function initializeAssistantTeam(): Promise<undefined> {
+  return undefined
+}
+
+/**
+ * Assistant-specific system prompt addendum loaded from ~/.claude/agents/assistant.md.
+ * Returns empty string if the file doesn't exist.
+ */
+export function getAssistantSystemPromptAddendum(): string {
+  try {
+    return readFileSync(
+      join(getClaudeConfigHomeDir(), 'agents', 'assistant.md'),
+      'utf-8',
+    )
+  } catch {
+    return ''
+  }
+}
+
+/**
+ * How assistant mode was activated. Used for diagnostics/analytics.
+ * - 'daemon': via --assistant flag (Agent SDK daemon)
+ * - 'gate': via GrowthBook gate check
+ */
+export function getAssistantActivationPath(): string | undefined {
+  if (!isAssistantMode()) return undefined
+  return _assistantForced ? 'daemon' : 'gate'
+}
--- a/src/assistant/sessionDiscovery.ts
+++ b/src/assistant/sessionDiscovery.ts
@@ -1,3 +1,51 @@
-// Auto-generated stub — replace with real implementation
-export type AssistantSession = { id: string; [key: string]: unknown };
-export const discoverAssistantSessions: () => Promise<AssistantSession[]> = () => Promise.resolve([]);
+import { logForDebugging } from '../utils/debug.js'
+
+/**
+ * Minimal session type for assistant discovery.
+ * Only `id` is consumed by main.tsx (L4757); other fields are for chooser display.
+ * ID format is `session_*` (compat prefix) — viewer endpoints use /v1/sessions/*.
+ */
+export type AssistantSession = {
+  id: string
+  title: string
+  status: string
+  created_at: string
+}
+
+/**
+ * Discover assistant sessions on Anthropic CCR.
+ *
+ * Reuses the existing fetchCodeSessionsFromSessionsAPI() which calls
+ * GET /v1/sessions with proper OAuth + anthropic-beta headers.
+ *
+ * Throws on failure — main.tsx L4720-4725 catch displays the error.
+ * Does NOT return [] on error (that would silently redirect to install wizard).
+ */
+export async function discoverAssistantSessions(): Promise<AssistantSession[]> {
+  const { fetchCodeSessionsFromSessionsAPI } = await import(
+    '../utils/teleport/api.js'
+  )
+
+  let allSessions
+  try {
+    allSessions = await fetchCodeSessionsFromSessionsAPI()
+  } catch (err) {
+    logForDebugging(
+      `[assistant:discovery] fetchCodeSessionsFromSessionsAPI failed: ${err}`,
+    )
+    throw err
+  }
+
+  // Filter to active/working sessions only — completed/archived are not attachable
+  return allSessions
+    .filter(
+      s =>
+        s.status === 'idle' || s.status === 'working' || s.status === 'waiting',
+    )
+    .map(s => ({
+      id: s.id,
+      title: s.title || 'Untitled',
+      status: s.status,
+      created_at: s.created_at ?? '',
+    }))
+}
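The filter-and-map step at the end of `discoverAssistantSessions()` can be exercised without the network. The sketch below assumes a raw session shape (`RawSession`) with optional `title` and `created_at`; that shape is an assumption for illustration, since the real API type is not shown in this diff.

```typescript
// Assumed input shape; the actual type returned by the sessions API may differ.
type RawSession = { id: string; title?: string; status: string; created_at?: string };
type Attachable = { id: string; title: string; status: string; created_at: string };

const ATTACHABLE_STATUSES = ["idle", "working", "waiting"];

/** Keep only sessions that can be attached to, filling in display defaults. */
function toAttachable(all: RawSession[]): Attachable[] {
  return all
    .filter(s => ATTACHABLE_STATUSES.indexOf(s.status) !== -1)
    .map(s => ({
      id: s.id,
      title: s.title || "Untitled",
      status: s.status,
      created_at: s.created_at ?? "",
    }));
}

const sample: RawSession[] = [
  { id: "session_a", title: "Refactor", status: "working", created_at: "2026-04-14" },
  { id: "session_b", status: "idle" },
  { id: "session_c", title: "Old", status: "archived" },
];
console.log(toAttachable(sample).map(s => `${s.id}:${s.title}`));
```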
--- a/src/cli/bg.ts
+++ b/src/cli/bg.ts
@@ -1,7 +1,348 @@
-// Auto-generated stub — replace with real implementation
-export {};
-export const psHandler: (args: string[]) => Promise<void> = (async () => {}) as (args: string[]) => Promise<void>;
-export const logsHandler: (sessionId: string | undefined) => Promise<void> = (async () => {}) as (sessionId: string | undefined) => Promise<void>;
-export const attachHandler: (sessionId: string | undefined) => Promise<void> = (async () => {}) as (sessionId: string | undefined) => Promise<void>;
-export const killHandler: (sessionId: string | undefined) => Promise<void> = (async () => {}) as (sessionId: string | undefined) => Promise<void>;
-export const handleBgFlag: (args: string[]) => Promise<void> = (async () => {}) as (args: string[]) => Promise<void>;
+import { readdir, readFile, unlink } from 'fs/promises'
+import { join } from 'path'
+import { randomUUID } from 'crypto'
+import { spawnSync } from 'child_process'
+import { getClaudeConfigHomeDir } from '../utils/envUtils.js'
+import { isProcessRunning } from '../utils/genericProcessUtils.js'
+import { jsonParse } from '../utils/slowOperations.js'
+import { execFileNoThrow } from '../utils/execFileNoThrow.js'
+import { quote } from '../utils/bash/shellQuote.js'
+
+interface SessionEntry {
+  pid: number
+  sessionId: string
+  cwd: string
+  startedAt: number
+  kind: string
+  name?: string
+  logPath?: string
+  entrypoint?: string
+  status?: string
+  waitingFor?: string
+  updatedAt?: number
+  bridgeSessionId?: string
+  agent?: string
+  tmuxSessionName?: string
+}
+
+function getSessionsDir(): string {
+  return join(getClaudeConfigHomeDir(), 'sessions')
+}
+
+async function listLiveSessions(): Promise<SessionEntry[]> {
+  const dir = getSessionsDir()
+  let files: string[]
+  try {
+    files = await readdir(dir)
+  } catch {
+    return []
+  }
+
+  const sessions: SessionEntry[] = []
+  for (const file of files) {
+    if (!/^\d+\.json$/.test(file)) continue
+    const pid = parseInt(file.slice(0, -5), 10)
+
+    if (!isProcessRunning(pid)) {
+      void unlink(join(dir, file)).catch(() => {})
+      continue
+    }
+
+    try {
+      const raw = await readFile(join(dir, file), 'utf-8')
+      const entry = jsonParse(raw) as SessionEntry
+      sessions.push(entry)
+    } catch {
+      // Corrupt file — skip
+    }
+  }
+
+  return sessions
+}
+
+function findSession(
+  sessions: SessionEntry[],
+  target: string,
+): SessionEntry | undefined {
+  const asNum = /^\d+$/.test(target) ? Number(target) : NaN
+  return sessions.find(
+    s =>
+      s.sessionId === target ||
+      s.pid === asNum ||
+      (s.name && s.name === target),
+  )
+}
+
+function formatTime(ts: number): string {
+  return new Date(ts).toLocaleString()
+}
+
+/**
+ * `claude ps` — list live sessions.
+ */
+export async function psHandler(_args: string[]): Promise<void> {
+  const sessions = await listLiveSessions()
+
+  if (sessions.length === 0) {
+    console.log('No active sessions.')
+    return
+  }
+
+  console.log(
+    `${sessions.length} active session${sessions.length > 1 ? 's' : ''}:\n`,
+  )
+
+  for (const s of sessions) {
+    const parts: string[] = [
+      `  PID: ${s.pid}`,
+      `  Kind: ${s.kind}`,
+      `  Session: ${s.sessionId}`,
+      `  CWD: ${s.cwd}`,
+    ]
+
+    if (s.name) parts.push(`  Name: ${s.name}`)
+    if (s.startedAt) parts.push(`  Started: ${formatTime(s.startedAt)}`)
+    if (s.status) parts.push(`  Status: ${s.status}`)
+    if (s.waitingFor) parts.push(`  Waiting for: ${s.waitingFor}`)
+    if (s.bridgeSessionId) parts.push(`  Bridge: ${s.bridgeSessionId}`)
+    if (s.tmuxSessionName) parts.push(`  Tmux: ${s.tmuxSessionName}`)
+
+    console.log(parts.join('\n'))
+    console.log()
+  }
+}
+
+/**
+ * `claude logs <target>` — show logs for a session.
+ */
+export async function logsHandler(target: string | undefined): Promise<void> {
+  const sessions = await listLiveSessions()
+
+  if (!target) {
+    if (sessions.length === 0) {
+      console.log('No active sessions.')
+      return
+    }
+    if (sessions.length === 1) {
+      target = sessions[0]!.sessionId
+    } else {
+      console.log('Multiple sessions active. Specify one:')
+      for (const s of sessions) {
+        const label = s.name ? `${s.name} (${s.sessionId})` : s.sessionId
+        console.log(`  ${label}  PID=${s.pid}`)
+      }
+      return
+    }
+  }
+
+  const session = findSession(sessions, target)
+  if (!session) {
+    console.error(`Session not found: ${target}`)
+    process.exitCode = 1
+    return
+  }
+
+  if (!session.logPath) {
+    console.log(`No log path recorded for session ${session.sessionId}`)
+    return
+  }
+
+  try {
+    const content = await readFile(session.logPath, 'utf-8')
+    process.stdout.write(content)
+  } catch (e) {
+    console.error(`Failed to read log file: ${session.logPath}`)
+    console.error(e instanceof Error ? e.message : String(e))
+    process.exitCode = 1
+  }
+}
+
+/**
+ * `claude attach <target>` — attach to a background tmux session.
+ */
+export async function attachHandler(target: string | undefined): Promise<void> {
+  // Check tmux availability
+  const { code: tmuxCode } = await execFileNoThrow('tmux', ['-V'])
+  if (tmuxCode !== 0) {
+    console.error(
+      'tmux is required for attach. Install tmux to use background sessions.',
+    )
+    console.error(getTmuxHint())
+    process.exitCode = 1
+    return
+  }
+
+  const sessions = await listLiveSessions()
+
+  if (!target) {
+    // Find bg sessions with tmux metadata
+    const bgSessions = sessions.filter(s => s.tmuxSessionName)
+    if (bgSessions.length === 0) {
+      console.log(
+        'No background sessions to attach to. Start one with `claude --bg`.',
+      )
+      return
+    }
+    if (bgSessions.length === 1) {
+      target = bgSessions[0]!.sessionId
+    } else {
+      console.log('Multiple background sessions. Specify one:')
+      for (const s of bgSessions) {
+        const label = s.name ? `${s.name} (${s.sessionId})` : s.sessionId
+        console.log(`  ${label}  PID=${s.pid}  tmux=${s.tmuxSessionName}`)
+      }
+      return
+    }
+  }
+
+  const session = findSession(sessions, target)
+  if (!session) {
+    console.error(`Session not found: ${target}`)
+    process.exitCode = 1
+    return
+  }
+
+  if (!session.tmuxSessionName) {
+    console.error(
+      `Session ${session.sessionId} was not started with --bg (no tmux session).`,
+    )
+    process.exitCode = 1
+    return
+  }
+
+  // tmux attach is a blocking call — replaces this process's terminal
+  const result = spawnSync(
+    'tmux',
+    ['attach-session', '-t', session.tmuxSessionName],
+    {
+      stdio: 'inherit',
+    },
+  )
+
+  if (result.status !== 0) {
+    console.error(
+      `Failed to attach to tmux session '${session.tmuxSessionName}'.`,
+    )
+    process.exitCode = 1
+  }
+}
+
+/**
+ * `claude kill <target>` — kill a session.
+ */
+export async function killHandler(target: string | undefined): Promise<void> {
+  const sessions = await listLiveSessions()
+
+  if (!target) {
+    if (sessions.length === 0) {
+      console.log('No active sessions to kill.')
+      return
+    }
+    console.log('Specify a session to kill:')
+    for (const s of sessions) {
+      const label = s.name ? `${s.name} (${s.sessionId})` : s.sessionId
+      console.log(`  ${label}  PID=${s.pid}`)
+    }
+    return
+  }
+
+  const session = findSession(sessions, target)
+  if (!session) {
+    console.error(`Session not found: ${target}`)
+    process.exitCode = 1
+    return
+  }
+
+  console.log(`Killing session ${session.sessionId} (PID: ${session.pid})...`)
+
+  try {
+    process.kill(session.pid, 'SIGTERM')
+  } catch {
+    console.log('Session already exited.')
+    return
+  }
+
+  await new Promise(resolve => setTimeout(resolve, 2000))
+
+  if (isProcessRunning(session.pid)) {
+    try {
+      process.kill(session.pid, 'SIGKILL')
+      console.log('Session force-killed.')
+    } catch {
+      console.log('Session exited during grace period.')
+    }
+  } else {
+    console.log('Session stopped.')
+  }
+
+  const pidFile = join(getSessionsDir(), `${session.pid}.json`)
+  void unlink(pidFile).catch(() => {})
+}
+
+/**
+ * `claude --bg [args]` — start a session in a background tmux pane.
+ */
+export async function handleBgFlag(args: string[]): Promise<void> {
+  // Check tmux availability
+  const { code: tmuxCode } = await execFileNoThrow('tmux', ['-V'])
+  if (tmuxCode !== 0) {
+    console.error(
+      'tmux is required for --bg. Install tmux to use background sessions.',
+    )
+    console.error(getTmuxHint())
+    process.exitCode = 1
+    return
+  }
+
+  const sessionName = `claude-bg-${randomUUID().slice(0, 8)}`
+  const logPath = join(
+    getClaudeConfigHomeDir(),
+    'sessions',
+    'logs',
+    `${sessionName}.log`,
+  )
+
+  // Strip --bg/--background from args
+  const filteredArgs = args.filter(a => a !== '--bg' && a !== '--background')
+
+  // Build the command to run inside tmux — use array form to avoid shell injection
+  const entrypoint = process.argv[1]!
+  const tmuxEnv = {
+    ...process.env,
+    CLAUDE_CODE_SESSION_KIND: 'bg',
+    CLAUDE_CODE_SESSION_NAME: sessionName,
+    CLAUDE_CODE_SESSION_LOG: logPath,
+    CLAUDE_CODE_TMUX_SESSION: sessionName,
+  }
+  const cmd = quote([process.execPath, entrypoint, ...filteredArgs])
+
+  const result = spawnSync(
+    'tmux',
+    ['new-session', '-d', '-s', sessionName, cmd],
+    { stdio: 'inherit', env: tmuxEnv },
+  )
+
+  if (result.status !== 0) {
+    console.error('Failed to create tmux session.')
+    process.exitCode = 1
+    return
+  }
+
+  console.log(`Background session started: ${sessionName}`)
+  console.log(`  tmux session: ${sessionName}`)
+  console.log(`  log: ${logPath}`)
+  console.log()
+  console.log(`Use \`claude attach ${sessionName}\` to reconnect.`)
+  console.log(`Use \`claude ps\` to check status.`)
+  console.log(`Use \`claude kill ${sessionName}\` to stop.`)
+}
+
+function getTmuxHint(): string {
+  if (process.platform === 'darwin') {
+    return 'Install with: brew install tmux'
+  }
+  if (process.platform === 'win32') {
+    return 'tmux is not natively available on Windows. Consider using WSL.'
+  }
+  return 'Install with: sudo apt install tmux  (or your package manager)'
+}
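A lookup that accepts either a PID or a session ID has to be careful with `parseInt`, which stops at the first non-digit and so turns an ID such as `12ab34cd-...` into the number 12. A small sketch of a stricter parse, where `parsePidStrict` is a hypothetical helper and not part of the patch:

```typescript
// Illustrative only: treat the target as a PID only when it is made up entirely of digits.
function parsePidStrict(target: string): number | undefined {
  return /^\d+$/.test(target) ? Number(target) : undefined;
}

console.log(parseInt("12ab34cd", 10)); // lenient parse accepts the numeric prefix
console.log(parsePidStrict("12ab34cd")); // strict parse rejects the mixed string
console.log(parsePidStrict("4242"));
```

The strict form avoids matching an unrelated process whose PID happens to equal the leading digits of a session ID or name.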
--- a/src/cli/handlers/ant.ts
+++ b/src/cli/handlers/ant.ts
@@ -1,13 +1,216 @@
-// Auto-generated stub — replace with real implementation
-import type { Command } from '@commander-js/extra-typings';
+import type { Command } from '@commander-js/extra-typings'
+import {
+  createTask,
+  getTask,
+  updateTask,
+  listTasks,
+  getTasksDir,
+} from '../../utils/tasks.js'
+import { getRecentActivity } from '../../utils/logoV2Utils.js'
+import type { LogOption } from '../../types/logs.js'

-export {};
-export const logHandler: (logId: string | number | undefined) => Promise<void> = (async () => {}) as (logId: string | number | undefined) => Promise<void>;
-export const errorHandler: (num: number | undefined) => Promise<void> = (async () => {}) as (num: number | undefined) => Promise<void>;
-export const exportHandler: (source: string, outputFile: string) => Promise<void> = (async () => {}) as (source: string, outputFile: string) => Promise<void>;
-export const taskCreateHandler: (subject: string, opts: { description?: string; list?: string }) => Promise<void> = (async () => {}) as (subject: string, opts: { description?: string; list?: string }) => Promise<void>;
-export const taskListHandler: (opts: { list?: string; pending?: boolean; json?: boolean }) => Promise<void> = (async () => {}) as (opts: { list?: string; pending?: boolean; json?: boolean }) => Promise<void>;
-export const taskGetHandler: (id: string, opts: { list?: string }) => Promise<void> = (async () => {}) as (id: string, opts: { list?: string }) => Promise<void>;
-export const taskUpdateHandler: (id: string, opts: { list?: string; status?: string; subject?: string; description?: string; owner?: string; clearOwner?: boolean }) => Promise<void> = (async () => {}) as (id: string, opts: { list?: string; status?: string; subject?: string; description?: string; owner?: string; clearOwner?: boolean }) => Promise<void>;
-export const taskDirHandler: (opts: { list?: string }) => Promise<void> = (async () => {}) as (opts: { list?: string }) => Promise<void>;
-export const completionHandler: (shell: string, opts: { output?: string }, program: Command) => Promise<void> = (async () => {}) as (shell: string, opts: { output?: string }, program: Command) => Promise<void>;
+const DEFAULT_LIST = 'default'
+
+// ─── Group C: Task CRUD ──────────────────────────────────────────────────────
+
+export async function taskCreateHandler(
+  subject: string,
+  opts: { description?: string; list?: string },
+): Promise<void> {
+  const listId = opts.list || DEFAULT_LIST
+  const id = await createTask(listId, {
+    subject,
+    description: opts.description || '',
+    status: 'pending',
+    blocks: [],
+    blockedBy: [],
+  })
+  console.log(`Created task ${id}: ${subject}`)
+}
+
+export async function taskListHandler(opts: {
+  list?: string
+  pending?: boolean
+  json?: boolean
+}): Promise<void> {
+  const listId = opts.list || DEFAULT_LIST
+  let tasks = await listTasks(listId)
+
+  if (opts.pending) {
+    tasks = tasks.filter(t => t.status === 'pending')
+  }
+
+  if (opts.json) {
+    console.log(JSON.stringify(tasks, null, 2))
+    return
+  }
+
+  if (tasks.length === 0) {
+    console.log('No tasks found.')
+    return
+  }
+
+  for (const t of tasks) {
+    console.log(`  [${t.status}] ${t.id}: ${t.subject}`)
+    if (t.description) console.log(`    ${t.description}`)
+    if (t.owner) console.log(`    owner: ${t.owner}`)
+  }
+}
+
+export async function taskGetHandler(
+  id: string,
+  opts: { list?: string },
+): Promise<void> {
+  const listId = opts.list || DEFAULT_LIST
+  const task = await getTask(listId, id)
+  if (!task) {
+    console.error(`Task not found: ${id}`)
+    process.exitCode = 1
+    return
+  }
+  console.log(JSON.stringify(task, null, 2))
+}
+
+export async function taskUpdateHandler(
+  id: string,
+  opts: {
+    list?: string
+    status?: string
+    subject?: string
+    description?: string
+    owner?: string
+    clearOwner?: boolean
+  },
+): Promise<void> {
+  const listId = opts.list || DEFAULT_LIST
+  const updates: Record<string, unknown> = {}
+
+  if (opts.status) updates.status = opts.status
+  if (opts.subject) updates.subject = opts.subject
+  if (opts.description) updates.description = opts.description
+  if (opts.owner) updates.owner = opts.owner
+  if (opts.clearOwner) updates.owner = undefined
+
+  const task = await updateTask(listId, id, updates)
+  if (!task) {
+    console.error(`Task not found: ${id}`)
+    process.exitCode = 1
+    return
+  }
+  console.log(`Updated task ${id}: [${task.status}] ${task.subject}`)
+}
+
+export async function taskDirHandler(opts: { list?: string }): Promise<void> {
+  const listId = opts.list || DEFAULT_LIST
+  console.log(getTasksDir(listId))
+}
+
+// ─── Group B: Log / Error / Export ───────────────────────────────────────────
+
+export async function logHandler(
+  logId: string | number | undefined,
+): Promise<void> {
+  const logs = await getRecentActivity()
+
+  if (logId === undefined) {
+    if (logs.length === 0) {
+      console.log('No recent sessions.')
+      return
+    }
+    for (let i = 0; i < Math.min(logs.length, 20); i++) {
+      const log = logs[i]!
+      const date = log.modified
+        ? new Date(log.modified).toLocaleString()
+        : 'unknown'
+      const title =
+        (log as Record<string, unknown>).title || log.sessionId || 'untitled'
+      console.log(`  ${i}: ${title}  (${date})`)
+    }
+    return
+  }
+
+  const idx = typeof logId === 'string' ? (/^\d+$/.test(logId) ? Number(logId) : NaN) : logId
+  const log =
+    Number.isFinite(idx) && idx >= 0 && idx < logs.length
+      ? logs[idx]
+      : logs.find(l => l.sessionId === String(logId))
+
+  if (!log) {
+    console.error(`Session not found: ${logId}`)
+    process.exitCode = 1
+    return
+  }
+
+  console.log(JSON.stringify(log, null, 2))
+}
+
+export async function errorHandler(num: number | undefined): Promise<void> {
+  // Lists the most recent sessions; per-session error details are not printed here
+  const logs = await getRecentActivity()
+  const count = num ?? 5
+
+  console.log(`Last ${count} sessions:`)
+  for (let i = 0; i < Math.min(count, logs.length); i++) {
+    const log = logs[i]!
+    const date = log.modified
+      ? new Date(log.modified).toLocaleString()
+      : 'unknown'
+    console.log(`  ${i}: ${log.sessionId}  (${date})`)
+  }
+}
+
+export async function exportHandler(
+  source: string,
+  outputFile: string,
+): Promise<void> {
+  const { writeFile, readFile } = await import('fs/promises')
+  const logs = await getRecentActivity()
+
+  // Try as index first (digits only, so an ID such as "3f2a..." is not misread as 3)
+  const idx = /^\d+$/.test(source) ? Number(source) : NaN
+  let log: LogOption | undefined
+  if (Number.isFinite(idx) && idx >= 0 && idx < logs.length) {
+    log = logs[idx]
+  } else {
+    log = logs.find(l => l.sessionId === source)
+  }
+
+  if (!log) {
+    // Try as file path
+    try {
+      const content = await readFile(source, 'utf-8')
+      await writeFile(outputFile, content, 'utf-8')
+      console.log(`Exported ${source} → ${outputFile}`)
+      return
+    } catch {
+      console.error(`Source not found: ${source}`)
+      process.exitCode = 1
+      return
+    }
+  }
+
+  await writeFile(outputFile, JSON.stringify(log, null, 2), 'utf-8')
+  console.log(`Exported session ${log.sessionId} → ${outputFile}`)
+}
+
+// ─── Group D: Completion ─────────────────────────────────────────────────────
+
+export async function completionHandler(
+  shell: string,
+  opts: { output?: string },
+  _program: Command,
+): Promise<void> {
+  const { regenerateCompletionCache } = await import(
+    '../../utils/completionCache.js'
+  )
+
+  if (opts.output) {
+    // NOTE: opts.output is not honored yet; this branch only regenerates the cache
+    await regenerateCompletionCache()
+    console.log(`Completion cache regenerated for ${shell}.`)
+  } else {
+    // Same behavior as above: the script itself is not printed to stdout
+    await regenerateCompletionCache()
+    console.log(`Completion cache regenerated for ${shell}.`)
+  }
+}
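
Review note: `logHandler` and `exportHandler` both accept either a list index or a session ID. A minimal sketch of that lookup as one shared helper follows. The name `resolveBySelector` and the `HasSessionId` shape are assumptions for illustration and are not part of this diff. The selector counts as an index only when it consists entirely of digits, because `parseInt` also accepts strings that merely start with digits.

```typescript
// Hypothetical helper, not part of the diff: resolves a user-supplied
// selector to an entry. The selector is an index only when it is all digits.
interface HasSessionId {
  sessionId?: string
}

export function resolveBySelector<T extends HasSessionId>(
  entries: readonly T[],
  selector: string | number,
): T | undefined {
  const text = String(selector)
  // parseInt('3f2a', 10) yields 3, so require the whole string to be digits.
  if (/^\d+$/.test(text)) {
    const idx = Number(text)
    if (idx < entries.length) return entries[idx]
  }
  // Otherwise (or when the index is out of range) fall back to an ID match.
  return entries.find(entry => entry.sessionId === text)
}
```

Both handlers could call this helper and keep only their own not-found handling.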
--- a/src/cli/handlers/templateJobs.ts
+++ b/src/cli/handlers/templateJobs.ts
@@ -1,3 +1,131 @@
-// Auto-generated stub — replace with real implementation
-export {};
-export const templatesMain: (args: string[]) => Promise<void> = () => Promise.resolve();
+import { randomUUID } from 'crypto'
+import { listTemplates, loadTemplate } from '../../jobs/templates.js'
+import {
+  createJob,
+  readJobState,
+  appendJobReply,
+  getJobDir,
+} from '../../jobs/state.js'
+
+/**
+ * Entry point for template job commands: `new`, `list`, `reply`.
+ * Called from cli.tsx fast-path.
+ */
+export async function templatesMain(args: string[]): Promise<void> {
+  const subcommand = args[0]
+
+  switch (subcommand) {
+    case 'list':
+      handleList()
+      break
+    case 'new':
+      handleNew(args.slice(1))
+      break
+    case 'reply':
+      handleReply(args.slice(1))
+      break
+    default:
+      console.error(`Unknown template command: ${subcommand ?? '(none)'}`)
+      printUsage()
+      process.exitCode = 1
+  }
+}
+
+function printUsage(): void {
+  console.log(`
+Template Job Commands:
+
+  claude list                    List available templates
+  claude new <template> [args]   Create a new job from a template
+  claude reply <job-id> <text>   Reply to an existing job
+`)
+}
+
+function handleList(): void {
+  const templates = listTemplates()
+
+  if (templates.length === 0) {
+    console.log('No templates found.')
+    console.log('Place .md files in .claude/templates/ or ~/.claude/templates/')
+    return
+  }
+
+  console.log(
+    `${templates.length} template${templates.length > 1 ? 's' : ''} found:\n`,
+  )
+
+  for (const t of templates) {
+    console.log(`  ${t.name}`)
+    console.log(`    ${t.description}`)
+    console.log(`    Path: ${t.filePath}`)
+    console.log()
+  }
+}
+
+function handleNew(args: string[]): void {
+  const templateName = args[0]
+  if (!templateName) {
+    console.error('Usage: claude new <template> [args...]')
+    process.exitCode = 1
+    return
+  }
+
+  const template = loadTemplate(templateName)
+  if (!template) {
+    console.error(`Template not found: ${templateName}`)
+    console.log('\nAvailable templates:')
+    for (const t of listTemplates()) {
+      console.log(`  ${t.name}`)
+    }
+    process.exitCode = 1
+    return
+  }
+
+  const jobId = randomUUID().slice(0, 8)
+  const inputText = args.slice(1).join(' ')
+  const rawContent = `---\n${Object.entries(template.frontmatter)
+    .map(([k, v]) => `${k}: ${v}`)
+    .join('\n')}\n---\n${template.content}`
+
+  const dir = createJob(
+    jobId,
+    templateName,
+    rawContent,
+    inputText,
+    args.slice(1),
+  )
+
+  console.log(`Job created: ${jobId}`)
+  console.log(`  Template: ${templateName}`)
+  console.log(`  Directory: ${dir}`)
+  if (inputText) {
+    console.log(`  Input: ${inputText}`)
+  }
+}
+
+function handleReply(args: string[]): void {
+  const jobId = args[0]
+  const text = args.slice(1).join(' ')
+
+  if (!jobId || !text) {
+    console.error('Usage: claude reply <job-id> <text>')
+    process.exitCode = 1
+    return
+  }
+
+  const state = readJobState(jobId)
+  if (!state) {
+    console.error(`Job not found: ${jobId}`)
+    process.exitCode = 1
+    return
+  }
+
+  const ok = appendJobReply(jobId, text)
+  if (ok) {
+    console.log(`Reply added to job ${jobId}`)
+    console.log(`  Directory: ${getJobDir(jobId)}`)
+  } else {
+    console.error(`Failed to append reply to job ${jobId}`)
+    process.exitCode = 1
+  }
+}
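
Review note: `handleNew` rebuilds the template source by interpolating each frontmatter value with `${v}`, which renders arrays as comma-joined text and objects as `[object Object]`. The sketch below shows one way to keep non-string values intact. The name `serializeFrontmatter` is hypothetical. It assumes the frontmatter parser accepts JSON-style values, which YAML parsers generally do; that has not been checked against the parser used by `src/jobs/templates.js`.

```typescript
// Hypothetical helper, not part of the diff: rebuilds a frontmatter block
// from parsed key/value pairs without losing arrays or nested objects.
export function serializeFrontmatter(
  frontmatter: Record<string, unknown>,
  content: string,
): string {
  const lines = Object.entries(frontmatter).map(([key, value]) => {
    // JSON.stringify(undefined) returns undefined, so map missing values to null.
    return `${key}: ${JSON.stringify(value ?? null)}`
  })
  return `---\n${lines.join('\n')}\n---\n${content}`
}
```

Strings come out quoted (for example `name: "weekly"`), so this is only suitable if the round-tripped file does not need to match the original byte for byte.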
--- a/src/cli/print.ts
+++ b/src/cli/print.ts
@@ -320,6 +320,17 @@ import {
  logQueryProfileReport,
 } from 'src/utils/queryProfiler.js'
 import { asSessionId } from 'src/types/ids.js'
+import {
+  commitAutonomyQueuedPrompt,
+  createAutonomyQueuedPrompt,
+  createProactiveAutonomyCommands,
+  finalizeAutonomyRunCompleted,
+  finalizeAutonomyRunFailed,
+  markAutonomyRunCompleted,
+  markAutonomyRunFailed,
+  markAutonomyRunRunning,
+} from 'src/utils/autonomyRuns.js'
+import { prepareAutonomyTurnPrompt } from 'src/utils/autonomyAuthority.js'
 import { jsonStringify } from '../utils/slowOperations.js'
 import { skillChangeDetector } from '../utils/skills/skillChangeDetector.js'
 import { getCommands, clearCommandsCache } from '../commands.js'
@@ -1839,15 +1850,23 @@ function runHeadlessStreaming(
            ) {
              return
            }
-            const tickContent = `<${TICK_TAG}>${new Date().toLocaleTimeString()}</${TICK_TAG}>`
-            enqueue({
-              mode: 'prompt' as const,
-              value: tickContent,
-              uuid: randomUUID(),
-              priority: 'later',
-              isMeta: true,
-            })
-            void run()
+            void (async () => {
+              const commands = await createProactiveAutonomyCommands({
+                basePrompt: `<${TICK_TAG}>${new Date().toLocaleTimeString()}</${TICK_TAG}>`,
+                currentDir: cwd(),
+                shouldCreate: () => !inputClosed,
+              })
+              for (const command of commands) {
+                if (inputClosed) {
+                  return
+                }
+                enqueue({
+                  ...command,
+                  uuid: randomUUID(),
+                })
+              }
+              void run()
+            })()
          }, 0)
        }
      : undefined
@@ -2092,6 +2111,9 @@ function runHeadlessStreaming(
          }

          const input = command.value
+          const autonomyRunIds = batch
+            .map(item => item.autonomy?.runId)
+            .filter((runId): runId is string => Boolean(runId))

          if (structuredIO instanceof RemoteIO && command.mode === 'prompt') {
            logEvent('tengu_bridge_message_received', {
@@ -2141,8 +2163,12 @@ function runHeadlessStreaming(
          // const-capture: TS loses `while ((command = dequeue()))` narrowing
          // inside the closure.
          const cmd = command
-          await runWithWorkload(cmd.workload ?? options.workload, async () => {
-            for await (const message of ask({
+          for (const runId of autonomyRunIds) {
+            await markAutonomyRunRunning(runId)
+          }
+          try {
+            await runWithWorkload(cmd.workload ?? options.workload, async () => {
+              for await (const message of ask({
              commands: uniqBy(
                [...currentCommands, ...appState.mcp.commands],
                'name',
@@ -2241,7 +2267,30 @@ function runHeadlessStreaming(
                output.enqueue(message as StdoutMessage)
              }
            }
-          }) // end runWithWorkload
+            }) // end runWithWorkload
+            for (const runId of autonomyRunIds) {
+              const nextCommands = await finalizeAutonomyRunCompleted({
+                runId,
+                currentDir: cwd(),
+                priority: 'later',
+                workload: cmd.workload ?? options.workload,
+              })
+              for (const nextCommand of nextCommands) {
+                enqueue({
+                  ...nextCommand,
+                  uuid: randomUUID(),
+                })
+              }
+            }
+          } catch (error) {
+            for (const runId of autonomyRunIds) {
+              await finalizeAutonomyRunFailed({
+                runId,
+                error: String(error),
+              })
+            }
+            throw error
+          }

          for (const uuid of batchUuids) {
            notifyCommandLifecycle(uuid, 'completed')
@@ -2706,22 +2755,69 @@ function runHeadlessStreaming(
    cronScheduler = cronSchedulerModule.createCronScheduler({
      onFire: prompt => {
        if (inputClosed) return
-        enqueue({
-          mode: 'prompt',
-          value: prompt,
-          uuid: randomUUID(),
-          priority: 'later',
-          // System-generated — matches useScheduledTasks.ts REPL equivalent.
-          // Without this, messages.ts metaProp eval is {} → prompt leaks
-          // into visible transcript when cron fires mid-turn in -p mode.
-          isMeta: true,
-          // Threaded to cc_workload= in the billing-header attribution block
-          // so the API can serve cron requests at lower QoS. drainCommandQueue
-          // reads this per-iteration and hoists it into bootstrap state for
-          // the ask() call.
-          workload: WORKLOAD_CRON,
-        })
-        void run()
+        void (async () => {
+          const prepared = await prepareAutonomyTurnPrompt({
+            basePrompt: prompt,
+            trigger: 'scheduled-task',
+            currentDir: cwd(),
+          })
+          if (inputClosed) return
+          const command = await commitAutonomyQueuedPrompt({
+            prepared,
+            currentDir: cwd(),
+            workload: WORKLOAD_CRON,
+          })
+          if (inputClosed) return
+          enqueue({
+            ...command,
+            uuid: randomUUID(),
+          })
+          void run()
+        })()
+      },
+      onFireTask: task => {
+        if (inputClosed) return
+        void (async () => {
+          if (task.agentId) {
+            const prepared = await prepareAutonomyTurnPrompt({
+              basePrompt: task.prompt,
+              trigger: 'scheduled-task',
+              currentDir: cwd(),
+            })
+            if (inputClosed) return
+            const command = await commitAutonomyQueuedPrompt({
+              prepared,
+              currentDir: cwd(),
+              sourceId: task.id,
+              sourceLabel: task.prompt,
+              workload: WORKLOAD_CRON,
+            })
+            await markAutonomyRunFailed(
+              command.autonomy!.runId,
+              `No teammate runtime available for scheduled task owner ${task.agentId} in headless mode.`,
+            )
+            return
+          }
+          const prepared = await prepareAutonomyTurnPrompt({
+            basePrompt: task.prompt,
+            trigger: 'scheduled-task',
+            currentDir: cwd(),
+          })
+          if (inputClosed) return
+          const command = await commitAutonomyQueuedPrompt({
+            prepared,
+            currentDir: cwd(),
+            sourceId: task.id,
+            sourceLabel: task.prompt,
+            workload: WORKLOAD_CRON,
+          })
+          if (inputClosed) return
+          enqueue({
+            ...command,
+            uuid: randomUUID(),
+          })
+          void run()
+        })()
      },
      isLoading: () => running || inputClosed,
      getJitterConfig: cronJitterConfigModule?.getCronJitterConfig,
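
Review note: the `runHeadlessStreaming` hunks above wrap each turn in a run lifecycle: mark every autonomy run as running, execute the turn, then finalize each run as completed or failed. A minimal sketch of that pattern in isolation follows. `RunLifecycle` and `withRunLifecycle` are hypothetical names standing in for the functions imported from `src/utils/autonomyRuns.js`, whose real signatures take option objects, as the diff shows.

```typescript
// Hypothetical lifecycle hooks; the real functions in the diff take option objects.
interface RunLifecycle {
  markRunning(runId: string): Promise<void>
  finalizeCompleted(runId: string): Promise<void>
  finalizeFailed(runId: string, error: string): Promise<void>
}

export async function withRunLifecycle<T>(
  runIds: readonly string[],
  lifecycle: RunLifecycle,
  work: () => Promise<T>,
): Promise<T> {
  for (const runId of runIds) await lifecycle.markRunning(runId)

  let result: T
  try {
    result = await work()
  } catch (error) {
    // Record the failure on every run in the batch, then rethrow.
    for (const runId of runIds) {
      await lifecycle.finalizeFailed(runId, String(error))
    }
    throw error
  }

  // Finalize outside the try block so an error raised while finalizing
  // is not also recorded as a failure of the turn itself.
  for (const runId of runIds) await lifecycle.finalizeCompleted(runId)
  return result
}
```

In the diff, completion is finalized inside the `try`, so an error thrown by `finalizeAutonomyRunCompleted` also marks the batch as failed. The sketch separates the two cases; which behavior is preferable depends on how run records are consumed.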
--- a/src/cli/rollback.ts
+++ b/src/cli/rollback.ts
@@ -1,2 +1,69 @@
-// Auto-generated stub
-export async function rollback(target?: string, options?: { list?: boolean; dryRun?: boolean; safe?: boolean }): Promise<void> {}
+/**
+ * `claude rollback [target]` — roll back to a previous Claude Code version.
+ *
+ * ANT-only command (USER_TYPE === "ant").
+ *
+ * Options:
+ *   --list      List recent published versions
+ *   --dry-run   Show what would be installed without installing
+ *   --safe      Roll back to the server-pinned safe version
+ */
+export async function rollback(
+  target?: string,
+  options?: { list?: boolean; dryRun?: boolean; safe?: boolean },
+): Promise<void> {
+  if (options?.list) {
+    console.log('Recent versions:')
+    console.log('  (version listing requires access to the release registry)')
+    console.log('  Use `claude update --list` for available versions.')
+    return
+  }
+
+  if (options?.safe) {
+    console.log('Safe rollback: would install the server-pinned safe version.')
+    if (options.dryRun) {
+      console.log('  (dry run — no changes made)')
+      return
+    }
+    console.log('  Safe version pinning requires access to the release API.')
+    console.log('  Contact oncall for the current safe version.')
+    return
+  }
+
+  if (!target) {
+    console.log(
+      'Usage: claude rollback [target]\n\n' +
+        'Options:\n' +
+        '  -l, --list     List recent published versions\n' +
+        '  --dry-run      Show what would be installed\n' +
+        '  --safe         Roll back to server-pinned safe version\n\n' +
+        'Examples:\n' +
+        '  claude rollback 2.1.880\n' +
+        '  claude rollback --list\n' +
+        '  claude rollback --safe',
+    )
+    return
+  }
+
+  console.log(`Rolling back to version ${target}...`)
+
+  if (options?.dryRun) {
+    console.log(`  (dry run — would install ${target})`)
+    return
+  }
+
+  // Version rollback via a global npm install (bun is not used here)
+  const { spawnSync } = await import('child_process')
+  const result = spawnSync(
+    'npm',
+    ['install', '-g', `@anthropic-ai/claude-code@${target}`],
+    { stdio: 'inherit' },
+  )
+
+  if (result.status !== 0) {
+    console.error(`Rollback failed with exit code ${result.status}`)
+    process.exitCode = result.status ?? 1
+  } else {
+    console.log(`Rolled back to ${target} successfully.`)
+  }
+}
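
Review note: `rollback` reports failures using `result.status` only. When `spawnSync` cannot launch the process at all (for instance when `npm` cannot be resolved on the PATH), `status` is `null` and the cause is in `result.error`, so the message reads "exit code null". The sketch below shows a fuller interpretation of the result plus a simple version check. `SpawnOutcome`, `describeSpawnFailure` and `isPlausibleVersion` are hypothetical names, and the version pattern is an assumption about what release identifiers look like.

```typescript
// Structural subset of the object returned by child_process.spawnSync.
interface SpawnOutcome {
  status: number | null
  signal: string | null
  error?: Error
}

// Hypothetical helper: returns a failure description, or null on success.
export function describeSpawnFailure(result: SpawnOutcome): string | null {
  if (result.error) return `could not launch: ${result.error.message}`
  if (result.signal) return `terminated by signal ${result.signal}`
  if (result.status !== 0) return `exit code ${result.status}`
  return null
}

// Hypothetical helper: accepts x.y.z with an optional pre-release suffix.
export function isPlausibleVersion(target: string): boolean {
  return /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/.test(target)
}
```

Checking `isPlausibleVersion(target)` before spawning would reject typos early. It is not a shell-injection guard, since the arguments are already passed as an array without a shell.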
--- a/src/cli/up.ts
+++ b/src/cli/up.ts
@@ -1,2 +1,95 @@
-// Auto-generated stub
-export async function up(): Promise<void> {}
+import { readFileSync } from 'fs'
+import { join } from 'path'
+import { spawnSync } from 'child_process'
+import { findGitRoot } from '../utils/git.js'
+
+/**
+ * `claude up`: run the "# claude up" section from CLAUDE.md.
+ *
+ * Checks the git root first, then the CWD, for a CLAUDE.md file, extracts
+ * the section under the `# claude up` heading, and executes it with bash.
+ *
+ * ANT-only command (USER_TYPE === "ant").
+ */
+export async function up(): Promise<void> {
+  const cwd = process.cwd()
+  const gitRoot = findGitRoot(cwd)
+  const searchDirs = gitRoot ? [gitRoot, cwd] : [cwd]
+
+  let upSection: string | null = null
+
+  for (const dir of searchDirs) {
+    const claudeMdPath = join(dir, 'CLAUDE.md')
+    try {
+      const content = readFileSync(claudeMdPath, 'utf-8')
+      upSection = extractUpSection(content)
+      if (upSection) {
+        console.log(`Found "# claude up" in ${claudeMdPath}`)
+        break
+      }
+    } catch {
+      // File not found — continue searching
+    }
+  }
+
+  if (!upSection) {
+    console.log(
+      'No "# claude up" section found in CLAUDE.md.\n' +
+        'Add a section like:\n\n' +
+        '  # claude up\n' +
+        '  ```bash\n' +
+        '  npm install\n' +
+        '  npm run build\n' +
+        '  ```',
+    )
+    return
+  }
+
+  console.log('Running:\n')
+  console.log(upSection)
+  console.log()
+
+  const result = spawnSync('bash', ['-c', upSection], {
+    cwd,
+    stdio: 'inherit',
+  })
+
+  if (result.status !== 0) {
+    console.error(`\nclaude up failed with exit code ${result.status}`)
+    process.exitCode = result.status ?? 1
+  } else {
+    console.log('\nclaude up completed successfully.')
+  }
+}
+
+/**
+ * Extract the content under the "# claude up" heading from markdown.
+ * Returns the text up to the next line starting with `# ` (or EOF), so a
+ * `# comment` line also ends the section. Strips an enclosing code fence.
+ */
+function extractUpSection(markdown: string): string | null {
+  const lines = markdown.split('\n')
+  let inSection = false
+  const sectionLines: string[] = []
+
+  for (const line of lines) {
+    if (/^#\s+claude\s+up\b/i.test(line)) {
+      inSection = true
+      continue
+    }
+    if (inSection && /^#\s/.test(line)) {
+      break
+    }
+    if (inSection) {
+      sectionLines.push(line)
+    }
+  }
+
+  if (sectionLines.length === 0) return null
+
+  // Strip fenced code block markers
+  let text = sectionLines.join('\n').trim()
+  text = text.replace(/^```\w*\n?/, '').replace(/\n?```\s*$/, '')
+
+  return text.trim() || null
+}
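
Review note: `extractUpSection` stops at the first line matching `/^#\s/`, which also matches a shell comment such as `# install deps` inside the fenced block. A fence-aware variant could look like the sketch below. `extractUpSectionFenceAware` is a hypothetical name. Unlike the original, it drops every fence marker line rather than only the outermost pair, and it keeps any prose outside the fence, so it is a starting point rather than a drop-in replacement.

```typescript
// Sketch only: a fence-aware variant of extractUpSection.
export function extractUpSectionFenceAware(markdown: string): string | null {
  const collected: string[] = []
  let inSection = false
  let inFence = false

  for (const line of markdown.split('\n')) {
    if (!inSection) {
      // Skip everything until the "# claude up" heading.
      if (/^#\s+claude\s+up\b/i.test(line)) inSection = true
      continue
    }
    if (/^```/.test(line)) {
      // Toggle fence state and drop the marker line itself.
      inFence = !inFence
      continue
    }
    // A top-level heading ends the section only outside a fence.
    if (!inFence && /^#\s/.test(line)) break
    collected.push(line)
  }

  const text = collected.join('\n').trim()
  return text || null
}
```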
--- a/src/commands.ts
+++ b/src/commands.ts
@@ -25,6 +25,7 @@ import ide from './commands/ide/index.js'
 import init from './commands/init.js'
 import initVerifiers from './commands/init-verifiers.js'
 import keybindings from './commands/keybindings/index.js'
+import lang from './commands/lang/index.js'
 import login from './commands/login/index.js'
 import logout from './commands/logout/index.js'
 import installGitHubApp from './commands/install-github-app/index.js'
@@ -182,6 +183,7 @@ import sandboxToggle from './commands/sandbox-toggle/index.js'
 import chrome from './commands/chrome/index.js'
 import stickers from './commands/stickers/index.js'
 import advisor from './commands/advisor.js'
+import autonomy from './commands/autonomy.js'
 import provider from './commands/provider.js'
 import { logError } from './utils/log.js'
 import { toError } from './utils/errors.js'
@@ -290,6 +292,7 @@ export const INTERNAL_ONLY_COMMANDS = [
 const COMMANDS = memoize((): Command[] => [
  addDir,
  advisor,
+  autonomy,
  provider,
  agents,
  branch,
@@ -315,6 +318,7 @@ const COMMANDS = memoize((): Command[] => [
  ide,
  init,
  keybindings,
+  lang,
  installGitHubApp,
  installSlackApp,
  mcp,
--- a/src/commands/tests/autonomy.test.ts
+++ b/src/commands/tests/autonomy.test.ts
@@ -0,0 +1,237 @@
+import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
+import autonomyCommand from '../autonomy'
+import {
+  resetStateForTests,
+  setOriginalCwd,
+  setProjectRoot,
+} from '../../bootstrap/state'
+import { listAutonomyFlows } from '../../utils/autonomyFlows'
+import {
+  createAutonomyQueuedPrompt,
+  markAutonomyRunCompleted,
+  startManagedAutonomyFlowFromHeartbeatTask,
+} from '../../utils/autonomyRuns'
+import {
+  enqueuePendingNotification,
+  getCommandQueueSnapshot,
+  resetCommandQueue,
+} from '../../utils/messageQueueManager'
+import { cleanupTempDir, createTempDir } from '../../../tests/mocks/file-system'
+
+let tempDir = ''
+
+beforeEach(async () => {
+  tempDir = await createTempDir('autonomy-command-')
+  resetStateForTests()
+  resetCommandQueue()
+  setOriginalCwd(tempDir)
+  setProjectRoot(tempDir)
+})
+
+afterEach(async () => {
+  resetStateForTests()
+  resetCommandQueue()
+  if (tempDir) {
+    await cleanupTempDir(tempDir)
+  }
+})
+
+describe('/autonomy', () => {
+  test('status reports autonomy runs and managed flows separately', async () => {
+    const plainRun = await createAutonomyQueuedPrompt({
+      basePrompt: 'scheduled prompt',
+      trigger: 'scheduled-task',
+      rootDir: tempDir,
+      currentDir: tempDir,
+      sourceLabel: 'nightly',
+    })
+    await markAutonomyRunCompleted(plainRun.autonomy!.runId, tempDir)
+
+    await startManagedAutonomyFlowFromHeartbeatTask({
+      task: {
+        name: 'weekly-report',
+        interval: '7d',
+        prompt: 'Ship the weekly report',
+        steps: [
+          {
+            name: 'gather',
+            prompt: 'Gather weekly inputs',
+          },
+          {
+            name: 'draft',
+            prompt: 'Draft the weekly report',
+          },
+        ],
+      },
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+
+    const mod = await autonomyCommand.load()
+    const result = await mod.call('', {} as any)
+
+    expect(result.type).toBe('text')
+    expect(result.value).toContain('Autonomy runs: 2')
+    expect(result.value).toContain('Autonomy flows: 1')
+    expect(result.value).toContain('Completed: 1')
+    expect(result.value).toContain('Queued: 1')
+  })
+
+  test('runs subcommand lists recent autonomy runs', async () => {
+    const queued = await createAutonomyQueuedPrompt({
+      basePrompt: '<tick>12:00:00</tick>',
+      trigger: 'proactive-tick',
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+
+    const mod = await autonomyCommand.load()
+    const result = await mod.call('runs 5', {} as any)
+
+    expect(result.type).toBe('text')
+    expect(result.value).toContain(queued.autonomy!.runId)
+    expect(result.value).toContain('proactive-tick')
+  })
+
+  test('flows subcommand lists managed flows and flow subcommand shows detail', async () => {
+    await startManagedAutonomyFlowFromHeartbeatTask({
+      task: {
+        name: 'weekly-report',
+        interval: '7d',
+        prompt: 'Ship the weekly report',
+        steps: [
+          {
+            name: 'gather',
+            prompt: 'Gather weekly inputs',
+          },
+          {
+            name: 'draft',
+            prompt: 'Draft the weekly report',
+          },
+        ],
+      },
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+
+    const [flow] = await listAutonomyFlows(tempDir)
+    const mod = await autonomyCommand.load()
+
+    const flowsResult = await mod.call('flows 5', {} as any)
+    expect(flowsResult.type).toBe('text')
+    expect(flowsResult.value).toContain(flow!.flowId)
+    expect(flowsResult.value).toContain('managed')
+
+    const flowResult = await mod.call(`flow ${flow!.flowId}`, {} as any)
+    expect(flowResult.type).toBe('text')
+    expect(flowResult.value).toContain(`Flow: ${flow!.flowId}`)
+    expect(flowResult.value).toContain('Mode: managed')
+    expect(flowResult.value).toContain('Current step: gather')
+  })
+
+  test('flow resume queues the next waiting step', async () => {
+    const waitingStart = await startManagedAutonomyFlowFromHeartbeatTask({
+      task: {
+        name: 'weekly-report',
+        interval: '7d',
+        prompt: 'Ship the weekly report',
+        steps: [
+          {
+            name: 'gather',
+            prompt: 'Gather weekly inputs',
+            waitFor: 'manual',
+          },
+          {
+            name: 'draft',
+            prompt: 'Draft the weekly report',
+          },
+        ],
+      },
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+
+    expect(waitingStart).toBeNull()
+    const [flow] = await listAutonomyFlows(tempDir)
+
+    const mod = await autonomyCommand.load()
+    const result = await mod.call(`flow resume ${flow!.flowId}`, {} as any)
+
+    expect(result.type).toBe('text')
+    expect(result.value).toContain('Queued the next managed step')
+    expect(getCommandQueueSnapshot()).toHaveLength(1)
+    expect(getCommandQueueSnapshot()[0]!.autonomy?.flowId).toBe(flow!.flowId)
+  })
+
+  test('flow cancel removes queued managed steps and marks the flow cancelled', async () => {
+    const queued = await startManagedAutonomyFlowFromHeartbeatTask({
+      task: {
+        name: 'weekly-report',
+        interval: '7d',
+        prompt: 'Ship the weekly report',
+        steps: [
+          {
+            name: 'gather',
+            prompt: 'Gather weekly inputs',
+          },
+          {
+            name: 'draft',
+            prompt: 'Draft the weekly report',
+          },
+        ],
+      },
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+
+    expect(queued).not.toBeNull()
+    enqueuePendingNotification(queued!)
+    expect(getCommandQueueSnapshot()).toHaveLength(1)
+    const [flow] = await listAutonomyFlows(tempDir)
+    const mod = await autonomyCommand.load()
+    const result = await mod.call(`flow cancel ${flow!.flowId}`, {} as any)
+    const [cancelledFlow] = await listAutonomyFlows(tempDir)
+
+    expect(result.type).toBe('text')
+    expect(result.value).toContain('Cancelled flow')
+    expect(cancelledFlow!.status).toBe('cancelled')
+    expect(getCommandQueueSnapshot()).toHaveLength(0)
+  })
+
+  test('flow cancel refuses to rewrite a terminal managed flow', async () => {
+    const queued = await startManagedAutonomyFlowFromHeartbeatTask({
+      task: {
+        name: 'weekly-report',
+        interval: '7d',
+        prompt: 'Ship the weekly report',
+        steps: [
+          {
+            name: 'gather',
+            prompt: 'Gather weekly inputs',
+          },
+        ],
+      },
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+
+    await markAutonomyRunCompleted(queued!.autonomy!.runId, tempDir)
+
+    const [flow] = await listAutonomyFlows(tempDir)
+    const mod = await autonomyCommand.load()
+    const result = await mod.call(`flow cancel ${flow!.flowId}`, {} as any)
+    const [terminalFlow] = await listAutonomyFlows(tempDir)
+
+    expect(result.type).toBe('text')
+    expect(result.value).toContain('already terminal')
+    expect(terminalFlow!.status).toBe('succeeded')
+  })
+
+  test('invalid subcommands return usage text', async () => {
+    const mod = await autonomyCommand.load()
+    const result = await mod.call('unknown', {} as any)
+
+    expect(result.type).toBe('text')
+    expect(result.value).toContain('Usage: /autonomy')
+  })
+})
--- a/src/commands/tests/proactive.baseline.test.ts
+++ b/src/commands/tests/proactive.baseline.test.ts
@@ -0,0 +1,54 @@
+import { beforeEach, describe, expect, test } from 'bun:test'
+import proactiveCommand from '../proactive'
+import {
+  activateProactive,
+  deactivateProactive,
+  isProactiveActive,
+} from '../../proactive/index'
+
+beforeEach(() => {
+  deactivateProactive()
+})
+
+describe('/proactive baseline', () => {
+  test('invoking the command enables proactive mode and emits a system reminder', async () => {
+    const mod = await proactiveCommand.load()
+    let resultText: string | undefined
+    let options: Parameters<Parameters<typeof mod.call>[0]>[1] | undefined
+
+    await mod.call(
+      (result, opts) => {
+        resultText = result
+        options = opts
+      },
+      {} as any,
+      '',
+    )
+
+    expect(isProactiveActive()).toBe(true)
+    expect(resultText).toContain('Proactive mode enabled')
+    expect(options?.display).toBe('system')
+    expect(options?.metaMessages?.[0]).toContain('Proactive mode is now enabled')
+  })
+
+  test('invoking the command again disables proactive mode', async () => {
+    const mod = await proactiveCommand.load()
+    activateProactive('test')
+
+    let resultText: string | undefined
+    let options: Parameters<Parameters<typeof mod.call>[0]>[1] | undefined
+
+    await mod.call(
+      (result, opts) => {
+        resultText = result
+        options = opts
+      },
+      {} as any,
+      '',
+    )
+
+    expect(isProactiveActive()).toBe(false)
+    expect(resultText).toBe('Proactive mode disabled')
+    expect(options?.display).toBe('system')
+  })
+})
--- a/src/commands/assistant/assistant.ts
+++ b/src/commands/assistant/assistant.ts
@@ -1,53 +0,0 @@
-import * as React from 'react'
-import type { LocalJSXCommandContext } from '../../commands.js'
-import type { LocalJSXCommandOnDone } from '../../types/command.js'
-import type { AppState } from '../../state/AppState.js'
-
-/** Stub — install wizard is not yet restored. */
-export async function computeDefaultInstallDir(): Promise<string> {
-  return ''
-}
-
-/** Stub — install wizard is not yet restored. */
-export function NewInstallWizard(_props: {
-  defaultDir: string
-  onInstalled: (dir: string) => void
-  onCancel: () => void
-  onError: (message: string) => void
-}): React.ReactNode {
-  return null
-}
-
-/**
- * /assistant command implementation.
- *
- * Opens the Kairos assistant panel. In the current build the panel is
- * rendered by the REPL layer when kairosActive is true; the slash command
- * simply toggles visibility and prints a confirmation line.
- */
-export async function call(
-  onDone: LocalJSXCommandOnDone,
-  context: LocalJSXCommandContext,
-  _args: string,
-): Promise<React.ReactNode> {
-  const { setAppState, getAppState } = context
-
-  const current = getAppState()
-  const isVisible = (current as Record<string, unknown>).assistantPanelVisible
-
-  if (isVisible) {
-    setAppState((prev: AppState) => ({
-      ...prev,
-      assistantPanelVisible: false,
-    } as AppState))
-    onDone('Assistant panel hidden.', { display: 'system' })
-  } else {
-    setAppState((prev: AppState) => ({
-      ...prev,
-      assistantPanelVisible: true,
-    } as AppState))
-    onDone('Assistant panel opened.', { display: 'system' })
-  }
-
-  return null
-}
--- a/src/commands/assistant/assistant.tsx
+++ b/src/commands/assistant/assistant.tsx
@@ -0,0 +1,175 @@
+import { spawn } from 'child_process';
+import * as React from 'react';
+import { useEffect, useState } from 'react';
+import { resolve } from 'path';
+import { Box, Text } from '@anthropic/ink';
+import { Dialog } from '../../components/design-system/Dialog.js';
+import { ListItem } from '../../components/design-system/ListItem.js';
+import { useRegisterOverlay } from '../../context/overlayContext.js';
+import { useKeybindings } from '../../keybindings/useKeybinding.js';
+import { findGitRoot } from '../../utils/git.js';
+import { getKairosActive, setKairosActive } from '../../bootstrap/state.js';
+import type { LocalJSXCommandContext } from '../../commands.js';
+import type { LocalJSXCommandOnDone } from '../../types/command.js';
+import type { AppState } from '../../state/AppState.js';
+
+/**
+ * Compute the default directory for assistant daemon installation.
+ * Prefers git root of cwd; falls back to cwd itself.
+ */
+export async function computeDefaultInstallDir(): Promise<string> {
+  const cwd = process.cwd();
+  const gitRoot = findGitRoot(cwd);
+  return gitRoot || resolve(cwd);
+}
+
+interface WizardProps {
+  defaultDir: string;
+  onInstalled: (dir: string) => void;
+  onCancel: () => void;
+  onError: (message: string) => void;
+}
+
+/**
+ * Install wizard for assistant mode. Shown when `claude assistant` finds
+ * zero CCR sessions. Guides the user to start a daemon that registers
+ * a bridge → CCR cloud session.
+ *
+ * After installation, main.tsx tells the user to run `claude assistant`
+ * again in a few seconds (daemon needs time to register the bridge session).
+ */
+export function NewInstallWizard({ defaultDir, onInstalled, onCancel, onError }: WizardProps): React.ReactNode {
+  useRegisterOverlay('assistant-install-wizard');
+  const [focusIndex, setFocusIndex] = useState(0);
+  const [starting, setStarting] = useState(false);
+
+  useKeybindings(
+    {
+      'select:next': () => setFocusIndex(i => (i + 1) % 2),
+      'select:previous': () => setFocusIndex(i => (i - 1 + 2) % 2),
+      'select:accept': () => {
+        if (focusIndex === 0) {
+          startDaemon();
+        } else {
+          onCancel();
+        }
+      },
+    },
+    { context: 'Select' },
+  );
+
+  function startDaemon(): void {
+    if (starting) return;
+    setStarting(true);
+
+    const dir = defaultDir || resolve('.');
+
+    try {
+      const execArgs = [...process.execArgv, process.argv[1]!, 'daemon', 'start', `--dir=${dir}`];
+
+      const child = spawn(process.execPath, execArgs, {
+        cwd: dir,
+        stdio: 'ignore',
+        detached: true,
+      });
+
+      child.unref();
+
+      // Give the daemon a moment to initialize, then report success.
+      // The daemon still needs several more seconds to register the bridge
+      // and create a CCR session; main.tsx will tell the user to reconnect.
+      const installedTimer = setTimeout(() => onInstalled(dir), 1500);
+
+      // Spawn failures arrive asynchronously; cancel the pending success report.
+      child.on('error', err => {
+        clearTimeout(installedTimer);
+        onError(`Failed to start daemon: ${err.message}`);
+      });
+    } catch (err) {
+      onError(`Failed to start daemon: ${err instanceof Error ? err.message : String(err)}`);
+    }
+  }
+
+  if (starting) {
+    return (
+      <Dialog title="Assistant Setup" onCancel={onCancel} hideInputGuide>
+        <Text>Starting daemon in {defaultDir}...</Text>
+      </Dialog>
+    );
+  }
+
+  return (
+    <Dialog title="Assistant Setup" onCancel={onCancel} hideInputGuide>
+      <Box flexDirection="column" gap={1}>
+        <Text>No active assistant sessions found.</Text>
+        <Text>
+          Start a daemon in <Text bold>{defaultDir || '.'}</Text> to create a cloud session?
+        </Text>
+        <Box flexDirection="column">
+          <ListItem isFocused={focusIndex === 0}>
+            <Text>Start assistant daemon</Text>
+          </ListItem>
+          <ListItem isFocused={focusIndex === 1}>
+            <Text>Cancel</Text>
+          </ListItem>
+        </Box>
+        <Text dimColor>Enter to select · Esc to cancel</Text>
+      </Box>
+    </Dialog>
+  );
+}
+
+/**
+ * /assistant command implementation.
+ *
+ * First invocation activates KAIROS (sets kairosActive, enables brief
+ * and proactive tools). Subsequent invocations toggle the assistant panel.
+ */
+export async function call(
+  onDone: LocalJSXCommandOnDone,
+  context: LocalJSXCommandContext,
+  _args: string,
+): Promise<React.ReactNode> {
+  const { setAppState, getAppState } = context;
+
+  // First invocation: activate KAIROS
+  if (!getKairosActive()) {
+    setKairosActive(true);
+    setAppState(
+      (prev: AppState) =>
+        ({
+          ...prev,
+          kairosEnabled: true,
+          assistantPanelVisible: true,
+        }) as AppState,
+    );
+    onDone('KAIROS assistant mode activated.', { display: 'system' });
+    return null;
+  }
+
+  // Subsequent invocations: toggle panel visibility
+  const current = getAppState();
+  const isVisible = (current as Record<string, unknown>).assistantPanelVisible;
+
+  if (isVisible) {
+    setAppState(
+      (prev: AppState) =>
+        ({
+          ...prev,
+          assistantPanelVisible: false,
+        }) as AppState,
+    );
+    onDone('Assistant panel hidden.', { display: 'system' });
+  } else {
+    setAppState(
+      (prev: AppState) =>
+        ({
+          ...prev,
+          assistantPanelVisible: true,
+        }) as AppState,
+    );
+    onDone('Assistant panel opened.', { display: 'system' });
+  }
+
+  return null;
+}
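The activate-then-toggle behaviour of `call()` in assistant.tsx can be modelled as a small pure function. This is a minimal sketch, not the shipped implementation: `AssistantState`, `AssistantResult` and `invokeAssistant` are hypothetical names, and the real `AppState` carries many more fields than the two shown here.

```typescript
// Hypothetical, simplified state; the real AppState has many more fields.
interface AssistantState {
  kairosActive: boolean;
  assistantPanelVisible: boolean;
}

interface AssistantResult {
  state: AssistantState;
  message: string;
}

// First call activates KAIROS and opens the panel;
// every later call only flips panel visibility.
function invokeAssistant(state: AssistantState): AssistantResult {
  if (!state.kairosActive) {
    return {
      state: { kairosActive: true, assistantPanelVisible: true },
      message: 'KAIROS assistant mode activated.',
    };
  }
  const visible = !state.assistantPanelVisible;
  return {
    state: { ...state, assistantPanelVisible: visible },
    message: visible ? 'Assistant panel opened.' : 'Assistant panel hidden.',
  };
}

// Simulate three consecutive /assistant invocations.
let current: AssistantState = { kairosActive: false, assistantPanelVisible: false };
const messages: string[] = [];
for (let i = 0; i < 3; i++) {
  const result = invokeAssistant(current);
  current = result.state;
  messages.push(result.message);
}
console.log(messages.join('\n'));
```

Because activation also opens the panel, the second invocation hides it and the third reopens it.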
--- a/src/commands/assistant/gate.ts
+++ b/src/commands/assistant/gate.ts
@@ -1,25 +1,21 @@
 import { feature } from 'bun:bundle'
-import { getKairosActive } from '../../bootstrap/state.js'
 import { getFeatureValue_CACHED_MAY_BE_STALE } from '../../services/analytics/growthbook.js'

 /**
- * Runtime gate for the /assistant command.
+ * Runtime gate for the /assistant command visibility.
 *
- * Build-time: feature('KAIROS') must be on (checked in commands.ts before
- * the module is even required).
+ * Build-time: feature('KAIROS') must be on.
+ * Runtime: tengu_kairos_assistant GrowthBook flag (remote kill switch).
 *
- * Runtime: tengu_kairos_assistant GrowthBook flag acts as a remote kill
- * switch, and kairosActive state must be true (set during bootstrap when
- * the session qualifies for KAIROS features).
+ * Does NOT require kairosActive — the /assistant command is visible
+ * before activation so users can invoke it to activate KAIROS.
 */
 export function isAssistantEnabled(): boolean {
  if (!feature('KAIROS')) {
    return false
  }
-  if (
-    !getFeatureValue_CACHED_MAY_BE_STALE('tengu_kairos_assistant', false)
-  ) {
+  if (!getFeatureValue_CACHED_MAY_BE_STALE('tengu_kairos_assistant', false)) {
    return false
  }
-  return getKairosActive()
+  return true
 }
--- a/src/commands/autonomy.ts
+++ b/src/commands/autonomy.ts
@@ -0,0 +1,125 @@
+import type { Command, LocalCommandCall } from '../types/command.js'
+import {
+  formatAutonomyFlowDetail,
+  formatAutonomyFlowsList,
+  formatAutonomyFlowsStatus,
+  getAutonomyFlowById,
+  listAutonomyFlows,
+  requestManagedAutonomyFlowCancel,
+} from '../utils/autonomyFlows.js'
+import {
+  formatAutonomyRunsList,
+  formatAutonomyRunsStatus,
+  listAutonomyRuns,
+  markAutonomyRunCancelled,
+  resumeManagedAutonomyFlowPrompt,
+} from '../utils/autonomyRuns.js'
+import {
+  enqueuePendingNotification,
+  removeByFilter,
+} from '../utils/messageQueueManager.js'
+
+function parseRunsLimit(raw?: string): number {
+  const parsed = Number.parseInt(raw ?? '', 10)
+  if (!Number.isFinite(parsed) || parsed <= 0) {
+    return 10
+  }
+  return Math.min(parsed, 50)
+}
+
+const call: LocalCommandCall = async (args: string) => {
+  const [subcommand = 'status', arg1, arg2] = args.trim().split(/\s+/, 3)
+  const runs = await listAutonomyRuns()
+  const flows = await listAutonomyFlows()
+
+  if (subcommand === 'runs') {
+    return {
+      type: 'text',
+      value: formatAutonomyRunsList(runs, parseRunsLimit(arg1)),
+    }
+  }
+
+  if (subcommand === 'flows') {
+    return {
+      type: 'text',
+      value: formatAutonomyFlowsList(flows, parseRunsLimit(arg1)),
+    }
+  }
+
+  if (subcommand === 'flow') {
+    if (arg1 === 'cancel') {
+      const flowId = arg2 ?? ''
+      const cancelled = await requestManagedAutonomyFlowCancel({ flowId })
+      if (!cancelled) {
+        return {
+          type: 'text',
+          value: 'Autonomy flow not found.',
+        }
+      }
+      if (!cancelled.accepted) {
+        return {
+          type: 'text',
+          value: `Autonomy flow ${flowId} is already terminal (${cancelled.flow.status}).`,
+        }
+      }
+      const removed = removeByFilter(cmd => cmd.autonomy?.flowId === flowId)
+      for (const command of removed) {
+        if (command.autonomy?.runId) {
+          await markAutonomyRunCancelled(command.autonomy.runId)
+        }
+      }
+      return {
+        type: 'text',
+        value:
+          cancelled.flow.status === 'running'
+            ? `Cancellation requested for flow ${flowId}. The current step is still running, and no new steps will be started.`
+            : `Cancelled flow ${flowId}. Removed ${removed.length} queued step(s).`,
+      }
+    }
+
+    if (arg1 === 'resume') {
+      const flowId = arg2 ?? ''
+      const command = await resumeManagedAutonomyFlowPrompt({ flowId })
+      if (!command) {
+        return {
+          type: 'text',
+          value: 'Autonomy flow is not waiting or was not found.',
+        }
+      }
+      enqueuePendingNotification(command)
+      return {
+        type: 'text',
+        value: `Queued the next managed step for flow ${flowId}.`,
+      }
+    }
+
+    return {
+      type: 'text',
+      value: formatAutonomyFlowDetail(await getAutonomyFlowById(arg1 ?? '')),
+    }
+  }
+
+  if (subcommand !== 'status' && subcommand !== '') {
+    return {
+      type: 'text',
+      value:
+        'Usage: /autonomy [status|runs [limit]|flows [limit]|flow <id>|flow cancel <id>|flow resume <id>]',
+    }
+  }
+
+  return {
+    type: 'text',
+    value: [formatAutonomyRunsStatus(runs), formatAutonomyFlowsStatus(flows)].join('\n'),
+  }
+}
+
+const autonomy = {
+  type: 'local',
+  name: 'autonomy',
+  description:
+    'Inspect autonomy runs and flows recorded for proactive ticks and scheduled tasks; cancel or resume managed flows',
+  supportsNonInteractive: true,
+  load: () => Promise.resolve({ call }),
+} satisfies Command
+
+export default autonomy
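The argument handling in autonomy.ts has one subtle edge: splitting an empty string yields `['']`, so the `'status'` destructuring default never applies and the later `subcommand !== ''` check is what routes an empty invocation to the status view. The sketch below reproduces `parseRunsLimit` from the diff and wraps the destructuring in a hypothetical `splitArgs` helper (not part of the patch) to show this.

```typescript
// Same clamping rule as parseRunsLimit in the diff: default 10, cap 50.
function parseRunsLimit(raw?: string): number {
  const parsed = Number.parseInt(raw ?? '', 10);
  if (!Number.isFinite(parsed) || parsed <= 0) {
    return 10;
  }
  return Math.min(parsed, 50);
}

// Hypothetical helper mirroring the destructuring at the top of call().
function splitArgs(args: string): {
  subcommand: string;
  arg1: string | undefined;
  arg2: string | undefined;
} {
  const [subcommand = 'status', arg1, arg2] = args.trim().split(/\s+/, 3);
  return { subcommand, arg1, arg2 };
}

console.log(parseRunsLimit(undefined)); // missing limit falls back to the default
console.log(parseRunsLimit('200'));     // large values are capped
console.log(parseRunsLimit('abc'));     // non-numeric input falls back to the default
console.log(splitArgs('flow cancel f-1'));
console.log(JSON.stringify(splitArgs('').subcommand)); // empty string, not 'status'
```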
--- a/src/commands/init.ts
+++ b/src/commands/init.ts
@@ -1,6 +1,7 @@
 import { feature } from 'bun:bundle'
 import type { Command } from '../commands.js'
 import { maybeMarkProjectOnboardingComplete } from '../projectOnboardingState.js'
+import { AUTONOMY_AGENTS_PATH_POSIX } from '../utils/autonomyAuthority.js'
 import { isEnvTruthy } from '../utils/envUtils.js'

 const OLD_INIT_PROMPT = `Please analyze this codebase and create a CLAUDE.md file, which will be given to future instances of Claude Code to operate in this repository.
@@ -43,7 +44,7 @@ Use AskUserQuestion to find out what the user wants:

 ## Phase 2: Explore the codebase

-Launch a subagent to survey the codebase, and ask it to read key files to understand the project: manifest files (package.json, Cargo.toml, pyproject.toml, go.mod, pom.xml, etc.), README, Makefile/build configs, CI config, existing CLAUDE.md, .claude/rules/, AGENTS.md, .cursor/rules or .cursorrules, .github/copilot-instructions.md, .windsurfrules, .clinerules, .mcp.json.
+Launch a subagent to survey the codebase, and ask it to read key files to understand the project: manifest files (package.json, Cargo.toml, pyproject.toml, go.mod, pom.xml, etc.), README, Makefile/build configs, CI config, existing CLAUDE.md, .claude/rules/, ${AUTONOMY_AGENTS_PATH_POSIX}, .cursor/rules or .cursorrules, .github/copilot-instructions.md, .windsurfrules, .clinerules, .mcp.json.

 Detect:
 - Build, test, and lint commands (especially non-standard ones)
@@ -105,7 +106,7 @@ Include:
 - Repo etiquette (branch naming, PR conventions, commit style)
 - Required env vars or setup steps
 - Non-obvious gotchas or architectural decisions
-- Important parts from existing AI coding tool configs if they exist (AGENTS.md, .cursor/rules, .cursorrules, .github/copilot-instructions.md, .windsurfrules, .clinerules)
+- Important parts from existing AI coding tool configs if they exist (${AUTONOMY_AGENTS_PATH_POSIX}, .cursor/rules, .cursorrules, .github/copilot-instructions.md, .windsurfrules, .clinerules)

 Exclude:
 - File-by-file structure or component lists (Claude can discover these by reading the codebase)
--- a/src/commands/lang/index.ts
+++ b/src/commands/lang/index.ts
@@ -0,0 +1,12 @@
+import type { Command } from '../../commands.js'
+
+const lang = {
+  type: 'local-jsx',
+  name: 'lang',
+  description: 'Set display language (en/zh/auto)',
+  immediate: true,
+  argumentHint: '<en|zh|auto>',
+  load: () => import('./lang.js'),
+} satisfies Command
+
+export default lang
--- a/src/commands/lang/lang.ts
+++ b/src/commands/lang/lang.ts
@@ -0,0 +1,49 @@
+import type { ToolUseContext } from '../../Tool.js'
+import type {
+  LocalJSXCommandContext,
+  LocalJSXCommandOnDone,
+} from '../../types/command.js'
+import { getGlobalConfig, saveGlobalConfig } from '../../utils/config.js'
+import {
+  type PreferredLanguage,
+  getLanguageDisplayName,
+  getResolvedLanguage,
+} from '../../utils/language.js'
+
+const VALID_LANGS: readonly PreferredLanguage[] = ['en', 'zh', 'auto']
+
+export async function call(
+  onDone: LocalJSXCommandOnDone,
+  _context: ToolUseContext & LocalJSXCommandContext,
+  args: string,
+): Promise<null> {
+  const arg = args.trim().toLowerCase()
+
+  if (!arg) {
+    const pref = getGlobalConfig().preferredLanguage ?? 'auto'
+    const resolved = getResolvedLanguage()
+    const suffix =
+      pref === 'auto' ? ` → ${getLanguageDisplayName(resolved)}` : ''
+    onDone(`Language: ${getLanguageDisplayName(pref)}${suffix}`, {
+      display: 'system',
+    })
+    return null
+  }
+
+  if (!VALID_LANGS.includes(arg as PreferredLanguage)) {
+    onDone(`Invalid language "${arg}". Use: en, zh, or auto`, {
+      display: 'system',
+    })
+    return null
+  }
+
+  const lang = arg as PreferredLanguage
+  saveGlobalConfig(current => ({ ...current, preferredLanguage: lang }))
+
+  const resolved = getResolvedLanguage()
+  const suffix = lang === 'auto' ? ` → ${getLanguageDisplayName(resolved)}` : ''
+  onDone(`Language set to ${getLanguageDisplayName(lang)}${suffix}`, {
+    display: 'system',
+  })
+  return null
+}
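lang.ts validates the argument with `includes` and then casts with `as PreferredLanguage`. A type guard is one way to get the same check without the cast. The sketch below is an illustration under assumptions: it redeclares `PreferredLanguage` locally as `'en' | 'zh' | 'auto'` (the real type lives in utils/language.js and may differ), and `isPreferredLanguage` and `parseLangArg` are hypothetical helpers.

```typescript
// Assumption: local stand-in for the type exported by utils/language.js.
type PreferredLanguage = 'en' | 'zh' | 'auto';

const VALID_LANGS: readonly PreferredLanguage[] = ['en', 'zh', 'auto'];

// Type guard: narrows a plain string to PreferredLanguage on success.
function isPreferredLanguage(value: string): value is PreferredLanguage {
  return (VALID_LANGS as readonly string[]).includes(value);
}

// Normalizes user input the same way lang.ts does (trim + lowercase),
// returning null for anything that is not a supported language.
function parseLangArg(args: string): PreferredLanguage | null {
  const arg = args.trim().toLowerCase();
  return isPreferredLanguage(arg) ? arg : null;
}

console.log(parseLangArg(' ZH '));
console.log(parseLangArg('fr'));
```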
--- a/src/commands/send/send.ts
+++ b/src/commands/send/send.ts
@@ -1,6 +1,7 @@
 import type { LocalCommandCall } from '../../types/command.js'
 import { getSlaveClient } from '../../hooks/useMasterMonitor.js'
 import { getPipeIpc } from '../../utils/pipeTransport.js'
+import { addSendOverride, removeMasterPipeMute, removeSendOverride } from '../../utils/pipeMuteState.js'

 export const call: LocalCommandCall = async (args, context) => {
  const currentState = context.getAppState()
@@ -48,6 +49,12 @@ export const call: LocalCommandCall = async (args, context) => {
  }

  try {
+    // Temporarily override mute for this slave so its response is visible.
+    // Override lasts until the slave emits 'done' or 'error' (cleared by
+    // useMasterMonitor's attachPipeEntryEmitter handler).
+    addSendOverride(targetName)
+    removeMasterPipeMute(targetName)
+    client.send({ type: 'relay_unmute' })
    client.send({
      type: 'prompt',
      data: message,
@@ -89,6 +96,8 @@ export const call: LocalCommandCall = async (args, context) => {
      value: `Sent to "${targetName}": ${message.slice(0, 100)}${message.length > 100 ? '...' : ''}`,
    }
  } catch (err) {
+    // Roll back override on send failure to prevent permanent unmute
+    removeSendOverride(targetName)
    return {
      type: 'text',
      value: `Failed to send to "${targetName}": ${err instanceof Error ? err.message : String(err)}`,
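The override lifecycle in send.ts (set the override, lift the mute, send, roll back on failure) can be sketched in isolation. Everything here is a stand-in: the two `Set`s replace the real pipeMuteState.js module, and `Client` and `sendWithOverride` are hypothetical names chosen for illustration.

```typescript
// Hypothetical in-memory stand-ins for the state kept in pipeMuteState.js.
const sendOverrides = new Set<string>();
const mutedPipes = new Set<string>(['worker-a']);

interface Client {
  send(msg: { type: string; data?: string }): void;
}

function sendWithOverride(target: string, client: Client, message: string): string {
  try {
    // Make the target's reply visible for this one turn.
    sendOverrides.add(target);
    mutedPipes.delete(target);
    client.send({ type: 'relay_unmute' });
    client.send({ type: 'prompt', data: message });
    return `Sent to "${target}"`;
  } catch (err) {
    // Roll back the override so a failed send does not leave it set forever.
    sendOverrides.delete(target);
    return `Failed to send to "${target}": ${err instanceof Error ? err.message : String(err)}`;
  }
}

const failing: Client = {
  send() {
    throw new Error('pipe closed');
  },
};
const sent: string[] = [];
const working: Client = {
  send(msg) {
    sent.push(msg.type);
  },
};

const failed = sendWithOverride('worker-a', failing, 'hello');
const overrideAfterFailure = sendOverrides.has('worker-a');
const ok = sendWithOverride('worker-a', working, 'hello');
console.log(failed, overrideAfterFailure, ok, sent);
```

Like the patch, the sketch rolls back only the override on failure and does not restore the master-side mute. Whether another component re-applies the mute is not visible in this part of the diff.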
--- a/src/commands/torch.ts
+++ b/src/commands/torch.ts
@@ -1 +1,19 @@
-export default null
+import type { Command, LocalJSXCommandOnDone } from '../types/command.js'
+import type { ReactNode } from 'react'
+
+const call = async (onDone: LocalJSXCommandOnDone): Promise<ReactNode> => {
+  onDone(
+    'torch: Reserved internal debug command. No implementation is available in this build.',
+    { display: 'system' },
+  )
+  return null
+}
+
+export default {
+  type: 'local-jsx',
+  name: 'torch',
+  description: '[INTERNAL] Development debug command (reserved)',
+  isEnabled: () => true,
+  isHidden: true,
+  load: () => Promise.resolve({ call }),
+} satisfies Command
--- a/src/daemon/tests/state.test.ts
+++ b/src/daemon/tests/state.test.ts
@@ -0,0 +1,185 @@
+/**
+ * Tests for src/daemon/state.ts
+ *
+ * Uses real temp directories and CLAUDE_CONFIG_DIR env var
+ * instead of mocking fs/envUtils, to avoid cross-test mock pollution.
+ */
+import { describe, expect, test, beforeEach, afterAll } from 'bun:test'
+import { mkdtempSync, rmSync, existsSync, readFileSync } from 'fs'
+import { join } from 'path'
+import { tmpdir } from 'os'
+import { getClaudeConfigHomeDir } from '../../utils/envUtils.js'
+
+// ─── setup: real temp dir via env var ──────────────────────────────────────
+
+const tempBase = mkdtempSync(join(tmpdir(), 'daemon-state-test-'))
+
+beforeEach(() => {
+  // Clear lodash memoize cache so CLAUDE_CONFIG_DIR env var takes effect
+  if (
+    typeof getClaudeConfigHomeDir === 'function' &&
+    'cache' in getClaudeConfigHomeDir
+  ) {
+    ;(getClaudeConfigHomeDir as any).cache.clear?.()
+  }
+  const tempHome = mkdtempSync(join(tempBase, 'home-'))
+  process.env.CLAUDE_CONFIG_DIR = tempHome
+})
+
+afterAll(() => {
+  delete process.env.CLAUDE_CONFIG_DIR
+  // Clear memoize cache after all tests so other files see fresh state
+  if (
+    typeof getClaudeConfigHomeDir === 'function' &&
+    'cache' in getClaudeConfigHomeDir
+  ) {
+    ;(getClaudeConfigHomeDir as any).cache.clear?.()
+  }
+  try {
+    rmSync(tempBase, { recursive: true, force: true })
+  } catch {
+    // best-effort cleanup
+  }
+})
+
+// ─── import ─────────────────────────────────────────────────────────────────
+
+const {
+  getDaemonStateFilePath,
+  writeDaemonState,
+  readDaemonState,
+  removeDaemonState,
+  queryDaemonStatus,
+} = await import('../state.js')
+
+// ─── tests ─────────────────────────────────────────────────────────────────
+
+describe('getDaemonStateFilePath', () => {
+  test('returns default path with remote-control name', () => {
+    const p = getDaemonStateFilePath()
+    expect(p).toContain('daemon')
+    expect(p).toContain('remote-control.json')
+  })
+
+  test('returns path with custom name', () => {
+    const p = getDaemonStateFilePath('my-daemon')
+    expect(p).toContain('my-daemon.json')
+  })
+})
+
+describe('writeDaemonState', () => {
+  test('writes state JSON to disk', () => {
+    const state = {
+      pid: 1234,
+      cwd: '/test',
+      startedAt: '2026-01-01T00:00:00Z',
+      workerKinds: ['rcs'],
+      lastStatus: 'running' as const,
+    }
+    writeDaemonState(state, 'test')
+    const filePath = getDaemonStateFilePath('test')
+    expect(existsSync(filePath)).toBe(true)
+    const parsed = JSON.parse(readFileSync(filePath, 'utf-8'))
+    expect(parsed.pid).toBe(1234)
+    expect(parsed.cwd).toBe('/test')
+  })
+
+  test('creates directory recursively', () => {
+    writeDaemonState(
+      {
+        pid: 1,
+        cwd: '/',
+        startedAt: '',
+        workerKinds: [],
+        lastStatus: 'running',
+      },
+      'dir-test',
+    )
+    const filePath = getDaemonStateFilePath('dir-test')
+    expect(existsSync(filePath)).toBe(true)
+  })
+})
+
+describe('readDaemonState', () => {
+  test('returns null when no state file', () => {
+    expect(readDaemonState('nonexistent')).toBeNull()
+  })
+
+  test('returns parsed state when file exists', () => {
+    const state = {
+      pid: 42,
+      cwd: '/x',
+      startedAt: '',
+      workerKinds: [],
+      lastStatus: 'running' as const,
+    }
+    writeDaemonState(state, 'read-test')
+    const result = readDaemonState('read-test')
+    expect(result).not.toBeNull()
+    expect(result!.pid).toBe(42)
+  })
+})
+
+describe('removeDaemonState', () => {
+  test('removes existing state file', () => {
+    writeDaemonState(
+      {
+        pid: 1,
+        cwd: '/',
+        startedAt: '',
+        workerKinds: [],
+        lastStatus: 'running',
+      },
+      'rm-test',
+    )
+    const filePath = getDaemonStateFilePath('rm-test')
+    expect(existsSync(filePath)).toBe(true)
+    removeDaemonState('rm-test')
+    expect(existsSync(filePath)).toBe(false)
+  })
+
+  test('does not throw when file does not exist', () => {
+    expect(() => removeDaemonState('no-file')).not.toThrow()
+  })
+})
+
+describe('queryDaemonStatus', () => {
+  test('returns stopped when no state file', () => {
+    const result = queryDaemonStatus('empty')
+    expect(result.status).toBe('stopped')
+    expect(result.state).toBeUndefined()
+  })
+
+  test('returns running when PID is alive (current process)', () => {
+    writeDaemonState(
+      {
+        pid: process.pid,
+        cwd: process.cwd(),
+        startedAt: new Date().toISOString(),
+        workerKinds: ['test'],
+        lastStatus: 'running',
+      },
+      'alive-test',
+    )
+    const result = queryDaemonStatus('alive-test')
+    expect(result.status).toBe('running')
+    expect(result.state).toBeDefined()
+    expect(result.state!.pid).toBe(process.pid)
+  })
+
+  test('returns stale when PID is dead and cleans up', () => {
+    writeDaemonState(
+      {
+        pid: 999999,
+        cwd: '/',
+        startedAt: '',
+        workerKinds: [],
+        lastStatus: 'running',
+      },
+      'stale-test',
+    )
+    const result = queryDaemonStatus('stale-test')
+    expect(result.status).toBe('stale')
+    expect(existsSync(getDaemonStateFilePath('stale-test'))).toBe(false)
+  })
+})
--- a/src/daemon/main.ts
+++ b/src/daemon/main.ts
@@ -1,6 +1,12 @@
 import { spawn, type ChildProcess } from 'child_process'
 import { resolve } from 'path'
 import { errorMessage } from '../utils/errors.js'
+import {
+  writeDaemonState,
+  removeDaemonState,
+  queryDaemonStatus,
+  stopDaemonByPid,
+} from './state.js'

 /**
 * Exit code used by workers for permanent (non-retryable) failures.
@@ -46,10 +52,10 @@ export async function daemonMain(args: string[]): Promise<void> {
      await runSupervisor(args.slice(1))
      break
    case 'status':
-      console.log('daemon status: not yet implemented (requires IPC)')
+      showDaemonStatus()
      break
    case 'stop':
-      console.log('daemon stop: not yet implemented (requires PID file)')
+      await handleDaemonStop()
      break
    case '--help':
    case '-h':
@@ -85,6 +91,57 @@ OPTIONS
 `)
 }

+/**
+ * Show daemon status by reading the state file and probing the PID.
+ */
+function showDaemonStatus(): void {
+  const result = queryDaemonStatus()
+
+  switch (result.status) {
+    case 'running': {
+      const s = result.state!
+      console.log(`daemon status: running`)
+      console.log(`  PID:        ${s.pid}`)
+      console.log(`  CWD:        ${s.cwd}`)
+      console.log(`  Started:    ${s.startedAt}`)
+      console.log(`  Workers:    ${s.workerKinds.join(', ')}`)
+      break
+    }
+    case 'stopped':
+      console.log('daemon status: stopped')
+      break
+    case 'stale':
+      console.log('daemon status: stale (cleaned up)')
+      break
+  }
+}
+
+/**
+ * Stop a running daemon from another CLI process.
+ */
+async function handleDaemonStop(): Promise<void> {
+  const result = queryDaemonStatus()
+
+  if (result.status === 'stopped') {
+    console.log('daemon is not running')
+    return
+  }
+
+  if (result.status === 'stale') {
+    console.log('daemon was stale (cleaned up)')
+    return
+  }
+
+  console.log(`stopping daemon (PID: ${result.state!.pid})...`)
+  const stopped = await stopDaemonByPid()
+
+  if (stopped) {
+    console.log('daemon stopped')
+  } else {
+    console.log('daemon could not be stopped (may have already exited)')
+  }
+}
+
 /**
 * Parse supervisor arguments from CLI.
 */
@@ -140,12 +197,22 @@ async function runSupervisor(args: string[]): Promise<void> {
    },
  ]

+  // Write daemon state file so other CLI processes can query/stop us
+  writeDaemonState({
+    pid: process.pid,
+    cwd: dir,
+    startedAt: new Date().toISOString(),
+    workerKinds: workers.map(w => w.kind),
+    lastStatus: 'running',
+  })
+
  const controller = new AbortController()

  // Graceful shutdown
  const shutdown = () => {
    console.log('[daemon] supervisor shutting down...')
    controller.abort()
+    removeDaemonState()
    for (const w of workers) {
      if (w.process && !w.process.killed) {
        w.process.kill('SIGTERM')
--- a/src/daemon/state.ts
+++ b/src/daemon/state.ts
@@ -0,0 +1,157 @@
+import { readFileSync, writeFileSync, mkdirSync, unlinkSync } from 'fs'
+import { join, dirname } from 'path'
+import { getClaudeConfigHomeDir } from '../utils/envUtils.js'
+
+/**
+ * Daemon state persisted to disk so that `status` / `stop` can work
+ * from a different CLI process than the one that started the daemon.
+ */
+export interface DaemonStateData {
+  pid: number
+  cwd: string
+  startedAt: string
+  workerKinds: string[]
+  lastStatus: 'running' | 'stopped' | 'error'
+}
+
+export type DaemonStatus = 'running' | 'stopped' | 'stale'
+
+/**
+ * Returns the path to the daemon state file for a given daemon name.
+ */
+export function getDaemonStateFilePath(name = 'remote-control'): string {
+  return join(getClaudeConfigHomeDir(), 'daemon', `${name}.json`)
+}
+
+/**
+ * Write daemon state to disk. Called by the supervisor on startup.
+ */
+export function writeDaemonState(
+  state: DaemonStateData,
+  name = 'remote-control',
+): void {
+  const filePath = getDaemonStateFilePath(name)
+  mkdirSync(dirname(filePath), { recursive: true })
+  writeFileSync(filePath, JSON.stringify(state, null, 2), 'utf-8')
+}
+
+/**
+ * Read daemon state from disk. Returns null if no state file exists.
+ */
+export function readDaemonState(
+  name = 'remote-control',
+): DaemonStateData | null {
+  const filePath = getDaemonStateFilePath(name)
+  try {
+    const raw = readFileSync(filePath, 'utf-8')
+    return JSON.parse(raw) as DaemonStateData
+  } catch {
+    return null
+  }
+}
+
+/**
+ * Remove the daemon state file.
+ */
+export function removeDaemonState(name = 'remote-control'): void {
+  const filePath = getDaemonStateFilePath(name)
+  try {
+    unlinkSync(filePath)
+  } catch {
+    // File may not exist — that's fine
+  }
+}
+
+/**
+ * Check if a process with the given PID is alive.
+ */
+function isProcessAlive(pid: number): boolean {
+  try {
+    process.kill(pid, 0)
+    return true
+  } catch {
+    return false
+  }
+}
+
+/**
+ * Query the daemon status by reading the state file and probing the PID.
+ *
+ * Returns:
+ *  - { status: 'running', state } — PID is alive
+ *  - { status: 'stopped' }       — no state file
+ *  - { status: 'stale' }         — state file exists but PID is dead (auto-cleaned)
+ */
+export function queryDaemonStatus(name = 'remote-control'): {
+  status: DaemonStatus
+  state?: DaemonStateData
+} {
+  const state = readDaemonState(name)
+  if (!state) {
+    return { status: 'stopped' }
+  }
+
+  if (isProcessAlive(state.pid)) {
+    return { status: 'running', state }
+  }
+
+  // Stale — process is dead but state file remains
+  removeDaemonState(name)
+  return { status: 'stale' }
+}
+
+/**
+ * Stop a running daemon by sending SIGTERM, waiting, then SIGKILL if needed.
+ * Cleans up the state file afterward.
+ *
+ * @returns true if the daemon was stopped, false if it wasn't running
+ */
+export async function stopDaemonByPid(
+  name = 'remote-control',
+  timeoutMs = 10_000,
+): Promise<boolean> {
+  const state = readDaemonState(name)
+  if (!state) {
+    return false
+  }
+
+  const { pid } = state
+
+  if (!isProcessAlive(pid)) {
+    removeDaemonState(name)
+    return false
+  }
+
+  // Send SIGTERM
+  try {
+    process.kill(pid, 'SIGTERM')
+  } catch {
+    removeDaemonState(name)
+    return false
+  }
+
+  // Wait for exit with timeout
+  const deadline = Date.now() + timeoutMs
+  const pollInterval = 200
+
+  while (Date.now() < deadline) {
+    if (!isProcessAlive(pid)) {
+      removeDaemonState(name)
+      return true
+    }
+    await new Promise(resolve => setTimeout(resolve, pollInterval))
+  }
+
+  // Force kill
+  try {
+    process.kill(pid, 'SIGKILL')
+  } catch {
+    // Already dead
+  }
+
+  // Brief wait for SIGKILL to take effect
+  await new Promise(resolve => setTimeout(resolve, 500))
+
+  removeDaemonState(name)
+  return true
+}
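The liveness probe in state.ts relies on `process.kill(pid, 0)`, which sends no signal and only checks whether the PID can be signalled. That call can fail for two different reasons: the process does not exist (`ESRCH`), or it exists but belongs to a user the caller may not signal (`EPERM`). The sketch below separates the two cases. `probePid` and the `Liveness` type are hypothetical and not part of the patch.

```typescript
type Liveness = 'alive' | 'dead' | 'alive-no-permission';

// Probe a PID without delivering a signal (signal 0 is an existence check).
function probePid(pid: number): Liveness {
  try {
    process.kill(pid, 0);
    return 'alive';
  } catch (err) {
    // Node attaches the errno name to the thrown error as `code`.
    const code = (err as { code?: string }).code;
    return code === 'EPERM' ? 'alive-no-permission' : 'dead';
  }
}

// The current process is always alive and signalable by itself.
console.log(probePid(process.pid));
```

For a daemon started by the same user, `EPERM` most likely means the PID was recycled by an unrelated process. Treating it as "not our daemon" is therefore a defensible choice, and the three-way result mainly helps with diagnostics.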
--- a/src/hooks/useAwaySummary.ts
+++ b/src/hooks/useAwaySummary.ts
@@ -48,7 +48,6 @@ export function useAwaySummary(
    'tengu_sedge_lantern',
    false,
  )
-
  useEffect(() => {
    if (!feature('AWAY_SUMMARY')) return
    if (!gbEnabled) return
--- a/src/hooks/useMasterMonitor.ts
+++ b/src/hooks/useMasterMonitor.ts
@@ -18,6 +18,11 @@ import {
  type PipeIpcSlaveState,
 } from '../utils/pipeTransport.js'
 import { logForDebugging } from '../utils/debug.js'
+import {
+  isMasterPipeMuted,
+  hasSendOverride,
+  removeSendOverride,
+} from '../utils/pipeMuteState.js'

 /** Session history entry for pipe IPC monitoring. */
 export type SessionEntry = {
@@ -113,6 +118,28 @@ function isMonitoredPipeEntryType(type: string): boolean {
  return MONITORED_PIPE_ENTRY_TYPES.includes(type)
 }

+/** Business message types that should be dropped when a slave is muted. */
+const MUTED_DROPPABLE_TYPES = new Set([
+  'prompt_ack',
+  'stream',
+  'tool_start',
+  'tool_result',
+  'done',
+  'error',
+  'permission_request',
+  'permission_cancel',
+])
+
+/**
+ * Centralized mute check used by both attachPipeEntryEmitter and
+ * useMasterMonitor's inline handler — keeps the two gates in sync.
+ */
+export function shouldDropMutedMessage(slaveName: string, msgType: string): boolean {
+  if (hasSendOverride(slaveName)) return false
+  if (!isMasterPipeMuted(slaveName)) return false
+  return MUTED_DROPPABLE_TYPES.has(msgType)
+}
+
 function pipeMessageToSessionEntry(
  slaveName: string,
  msg: PipeMessage,
@@ -153,6 +180,35 @@ function attachPipeEntryEmitter(name: string, client: PipeClient): void {
  if (typeof client.on !== 'function') return
  const handler = (msg: PipeMessage) => {
    if (!isMonitoredPipeEntryType(msg.type)) return
+
+    // Mute gate: drop business messages from muted slaves
+    if (shouldDropMutedMessage(name, msg.type)) {
+      // Auto-deny permission_request to prevent slave deadlock
+      if (msg.type === 'permission_request') {
+        try {
+          const payload = JSON.parse(msg.data ?? '{}')
+          if (payload.requestId) {
+            client.send({
+              type: 'permission_response',
+              data: JSON.stringify({
+                requestId: payload.requestId,
+                behavior: 'deny',
+                feedback: 'Permission auto-denied: pipe is logically disconnected.',
+              }),
+            })
+          }
+        } catch {
+          // Malformed payload — safe to ignore
+        }
+      }
+      return
+    }
+
+    // Clear /send override when slave turn completes
+    if ((msg.type === 'done' || msg.type === 'error') && hasSendOverride(name)) {
+      removeSendOverride(name)
+    }
+
    emitPipeEntry(name, pipeMessageToSessionEntry(name, msg))
  }
  _pipeEntryHandlers.set(name, handler)
@@ -166,14 +222,14 @@ function emitSlaveClientRegistryChanged(): void {
  }
 }

-function subscribeToSlaveClientRegistry(listener: () => void): () => void {
+export function subscribeToSlaveClientRegistry(listener: () => void): () => void {
  _slaveClientRegistryListeners.add(listener)
  return () => {
    _slaveClientRegistryListeners.delete(listener)
  }
 }

-function getSlaveClientRegistryVersion(): number {
+export function getSlaveClientRegistryVersion(): number {
  return _slaveClientRegistryVersion
 }

@@ -248,13 +304,23 @@ export function useMasterMonitor(): void {

    for (const [slaveName, client] of _slaveClients.entries()) {
      const handler = (msg: PipeMessage) => {
-        const entry = pipeMessageToSessionEntry(slaveName, msg)
-
        // Only record relevant message types
        if (!isMonitoredPipeEntryType(msg.type)) {
          return
        }

+        // Mute gate (second gate, same helper as attachPipeEntryEmitter)
+        if (shouldDropMutedMessage(slaveName, msg.type)) {
+          return
+        }
+
+        // Clear /send override when slave turn completes
+        if ((msg.type === 'done' || msg.type === 'error') && hasSendOverride(slaveName)) {
+          removeSendOverride(slaveName)
+        }
+
+        const entry = pipeMessageToSessionEntry(slaveName, msg)
+
        setAppState(prev => {
          const slave = getPipeIpc(prev).slaves[slaveName]
          if (!slave) return prev
@@ -294,6 +360,8 @@ export function useMasterMonitor(): void {
      // Handle slave disconnect
      const onDisconnect = () => {
        logForDebugging(`[MasterMonitor] Slave "${slaveName}" disconnected`)
+        // Clear any lingering /send override before removing client
+        removeSendOverride(slaveName)
        removeSlaveClient(slaveName)
        setAppState(prev => {
          const { [slaveName]: _removed, ...remainingSlaves } =
--- a/src/hooks/usePipeIpc.ts
+++ b/src/hooks/usePipeIpc.ts
@@ -246,6 +246,15 @@ function registerMessageHandlers(
    }
  })

+  // Handle relay mute/unmute from master
+  server.onMessage((msg: PipeMessage, _reply) => {
+    if (msg.type === 'relay_mute') {
+      pp().setRelayMuted(true)
+    } else if (msg.type === 'relay_unmute') {
+      pp().setRelayMuted(false)
+    }
+  })
+
  // Handle detach
  server.onMessage((msg: PipeMessage, _reply) => {
    if (msg.type !== 'detach') return
--- a/src/hooks/usePipeMuteSync.ts
+++ b/src/hooks/usePipeMuteSync.ts
@@ -0,0 +1,141 @@
+/**
+ * usePipeMuteSync — Sync master's UI selection state to slave relay mute flags.
+ *
+ * Watches routeMode, selectedPipes, slave client registry, and send-override
+ * changes. When a slave is deselected or routeMode switches to 'local', sends
+ * relay_mute. When re-selected, sends relay_unmute. Also maintains the
+ * master-side muted set for in-flight message filtering.
+ *
+ * Feature-gated by UDS_INBOX (conditional import in REPL.tsx).
+ */
+import { useEffect, useRef, useSyncExternalStore } from 'react'
+import { useAppState } from '../state/AppState.js'
+import { getPipeIpc } from '../utils/pipeTransport.js'
+import {
+  setMasterMutedPipes,
+  clearMasterMutedPipes,
+  hasSendOverride,
+  clearSendOverrides,
+  subscribeSendOverride,
+  getSendOverrideVersion,
+} from '../utils/pipeMuteState.js'
+import {
+  getAllSlaveClients,
+  subscribeToSlaveClientRegistry,
+  getSlaveClientRegistryVersion,
+} from './useMasterMonitor.js'
+
+type UsePipeMuteSyncDeps = {
+  setToolUseConfirmQueue: (action: React.SetStateAction<Record<string, unknown>[]>) => void
+}
+
+export function usePipeMuteSync({
+  setToolUseConfirmQueue,
+}: UsePipeMuteSyncDeps): void {
+  // Subscribe to individual scalars to avoid object-selector re-render churn
+  // (AppState.tsx warns against object-returning selectors)
+  const routeMode = useAppState(
+    s => (getPipeIpc(s).routeMode as 'selected' | 'local') ?? 'selected',
+  )
+  const selectedPipes: string[] = useAppState(
+    s => (getPipeIpc(s).selectedPipes as string[]) ?? [],
+  )
+
+  // Subscribe to slave client registry changes
+  const registryVersion = useSyncExternalStore(
+    subscribeToSlaveClientRegistry,
+    getSlaveClientRegistryVersion,
+    getSlaveClientRegistryVersion,
+  )
+
+  // Subscribe to send-override changes so mute recalculates after /send completes
+  const sendOverrideVersion = useSyncExternalStore(
+    subscribeSendOverride,
+    getSendOverrideVersion,
+    getSendOverrideVersion,
+  )
+
+  const prevMutedRef = useRef<Set<string>>(new Set())
+
+  useEffect(() => {
+    const slaves = getAllSlaveClients()
+
+    // Compute which slaves should be muted now
+    const nextMuted = new Set<string>()
+    if (routeMode === 'local') {
+      // All connected slaves muted
+      for (const name of slaves.keys()) {
+        if (!hasSendOverride(name)) {
+          nextMuted.add(name)
+        }
+      }
+    } else {
+      // routeMode === 'selected': mute slaves NOT in selectedPipes
+      const selectedSet = new Set(selectedPipes)
+      for (const name of slaves.keys()) {
+        if (!selectedSet.has(name) && !hasSendOverride(name)) {
+          nextMuted.add(name)
+        }
+      }
+    }
+
+    // Step 1: Update master-side muted set FIRST (before sending control packets)
+    setMasterMutedPipes(nextMuted)
+
+    const prevMuted = prevMutedRef.current
+
+    // Step 2: For newly muted slaves — abort pending permissions, then send relay_mute
+    for (const name of nextMuted) {
+      if (!prevMuted.has(name)) {
+        // Abort pending permission prompts for this slave
+        setToolUseConfirmQueue((queue: Record<string, unknown>[]) => {
+          const toAbort = queue.filter(
+            (item: Record<string, unknown>) => item.pipeName === name,
+          )
+          for (const item of toAbort) {
+            try {
+              ;(item.onAbort as (() => void) | undefined)?.()
+            } catch {
+              // onAbort may throw if client disconnected — safe to ignore
+            }
+          }
+          return queue.filter((item: Record<string, unknown>) => item.pipeName !== name)
+        })
+
+        // Send relay_mute to slave
+        const client = slaves.get(name)
+        if (client?.connected) {
+          try {
+            client.send({ type: 'relay_mute' })
+          } catch {
+            // send may fail if socket is closing — non-fatal
+          }
+        }
+      }
+    }
+
+    // Step 3: For newly unmuted slaves — send relay_unmute
+    for (const name of prevMuted) {
+      if (!nextMuted.has(name)) {
+        const client = slaves.get(name)
+        if (client?.connected) {
+          try {
+            client.send({ type: 'relay_unmute' })
+          } catch {
+            // non-fatal
+          }
+        }
+      }
+    }
+
+    prevMutedRef.current = nextMuted
+  }, [routeMode, selectedPipes, registryVersion, sendOverrideVersion, setToolUseConfirmQueue])
+
+  // Cleanup on unmount: clear all master-side mute state
+  useEffect(() => {
+    return () => {
+      clearMasterMutedPipes()
+      clearSendOverrides()
+    }
+  }, [])
+}
--- a/src/hooks/usePipePermissionForward.ts
+++ b/src/hooks/usePipePermissionForward.ts
@@ -90,6 +90,7 @@ export function usePipePermissionForward({
                input: payload.input,
                toolUseContext,
                toolUseID: `pipe:${payload.requestId}`,
+                pipeName,
                permissionResult: payload.permissionResult,
                permissionPromptStartTimeMs:
                  payload.permissionPromptStartTimeMs,
--- a/src/hooks/usePipeRelay.ts
+++ b/src/hooks/usePipeRelay.ts
@@ -6,7 +6,7 @@
 * `getPipeRelay()` singleton set by usePipeIpc's attach handler.
 */
 import { useRef, useCallback } from 'react'
-import { getPipeRelay } from '../utils/pipePermissionRelay.js'
+import { getPipeRelay, isRelayMuted } from '../utils/pipePermissionRelay.js'
 import type { PipeMessage } from '../utils/pipeTransport.js'

 export type PipeRelayHandle = {
@@ -29,6 +29,9 @@ export function usePipeRelay(): PipeRelayHandle {
      if (typeof relay !== 'function') {
        return false
      }
+      if (isRelayMuted()) {
+        return false
+      }
      relay(message)
      return true
    },
--- a/src/hooks/useScheduledTasks.ts
+++ b/src/hooks/useScheduledTasks.ts
@@ -7,9 +7,14 @@ import {
 } from '../tasks/InProcessTeammateTask/InProcessTeammateTask.js'
 import { isKairosCronEnabled } from '@claude-code-best/builtin-tools/tools/ScheduleCronTool/prompt.js'
 import type { Message } from '../types/message.js'
+import { getCwd } from '../utils/cwd.js'
 import { getCronJitterConfig } from '../utils/cronJitterConfig.js'
 import { createCronScheduler } from '../utils/cronScheduler.js'
 import { removeCronTasks } from '../utils/cronTasks.js'
+import {
+  createAutonomyQueuedPrompt,
+  markAutonomyRunFailed,
+} from '../utils/autonomyRuns.js'
 import { logForDebugging } from '../utils/debug.js'
 import { enqueuePendingNotification } from '../utils/messageQueueManager.js'
 import { createScheduledTaskFireMessage } from '../utils/messages.js'
@@ -68,50 +73,92 @@ export function useScheduledTasks({
    // forward isMeta, so their messages remain visible in the
    // transcript. This is acceptable since normal mode is not the
    // primary use case for scheduled tasks.
-    const enqueueForLead = (prompt: string) =>
-      enqueuePendingNotification({
-        value: prompt,
-        mode: 'prompt',
-        priority: 'later',
-        isMeta: true,
-        // Threaded through to cc_workload= in the billing-header
-        // attribution block so the API can serve cron-initiated requests
-        // at lower QoS when capacity is tight. No human is actively
-        // waiting on this response.
+    const enqueueForLead = async (prompt: string) => {
+      const command = await createAutonomyQueuedPrompt({
+        basePrompt: prompt,
+        trigger: 'scheduled-task',
+        currentDir: getCwd(),
        workload: WORKLOAD_CRON,
      })
+      if (!command) {
+        return
+      }
+      enqueuePendingNotification(command)
+    }

    const scheduler = createCronScheduler({
      // Missed-task surfacing (onFire fallback). Teammate crons are always
      // session-only (durable:false) so they never appear in the missed list,
      // which is populated from disk at scheduler startup — this path only
      // handles team-lead durable crons.
-      onFire: enqueueForLead,
+      onFire: prompt => {
+        void enqueueForLead(prompt)
+      },
      // Normal fires receive the full CronTask so we can route by agentId.
      onFireTask: task => {
-        if (task.agentId) {
-          const teammate = findTeammateTaskByAgentId(
-            task.agentId,
-            store.getState().tasks,
-          )
-          if (teammate && !isTerminalTaskStatus(teammate.status)) {
-            injectUserMessageToTeammate(teammate.id, task.prompt, setAppState)
+        void (async () => {
+          if (task.agentId) {
+            const teammate = findTeammateTaskByAgentId(
+              task.agentId,
+              store.getState().tasks,
+            )
+            if (teammate && !isTerminalTaskStatus(teammate.status)) {
+              const command = await createAutonomyQueuedPrompt({
+                basePrompt: task.prompt,
+                trigger: 'scheduled-task',
+                currentDir: getCwd(),
+                sourceId: task.id,
+                sourceLabel: task.prompt,
+                workload: WORKLOAD_CRON,
+              })
+              if (!command) {
+                return
+              }
+              const injected = injectUserMessageToTeammate(
+                teammate.id,
+                command.value as string,
+                {
+                  autonomyRunId: command.autonomy?.runId,
+                  origin: command.origin,
+                },
+                setAppState,
+              )
+              if (!injected && command.autonomy?.runId) {
+                await markAutonomyRunFailed(
+                  command.autonomy.runId,
+                  `Teammate ${task.agentId} exited before the scheduled message could be delivered.`,
+                )
+              }
+              return
+            }
+            // Teammate is gone — clean up the orphaned cron so it doesn't keep
+            // firing into nowhere every tick. One-shots would auto-delete on
+            // fire anyway, but recurring crons would loop until auto-expiry.
+            logForDebugging(
+              `[ScheduledTasks] teammate ${task.agentId} gone, removing orphaned cron ${task.id}`,
+            )
+            void removeCronTasks([task.id])
            return
          }
-          // Teammate is gone — clean up the orphaned cron so it doesn't keep
-          // firing into nowhere every tick. One-shots would auto-delete on
-          // fire anyway, but recurring crons would loop until auto-expiry.
-          logForDebugging(
-            `[ScheduledTasks] teammate ${task.agentId} gone, removing orphaned cron ${task.id}`,
+
+          const command = await createAutonomyQueuedPrompt({
+            basePrompt: task.prompt,
+            trigger: 'scheduled-task',
+            currentDir: getCwd(),
+            sourceId: task.id,
+            sourceLabel: task.prompt,
+            workload: WORKLOAD_CRON,
+          })
+          if (!command) {
+            return
+          }
+
+          const msg = createScheduledTaskFireMessage(
+            `Running scheduled task (${formatCronFireTime(new Date())})`,
          )
-          void removeCronTasks([task.id])
-          return
-        }
-        const msg = createScheduledTaskFireMessage(
-          `Running scheduled task (${formatCronFireTime(new Date())})`,
-        )
-        setMessages(prev => [...prev, msg])
-        enqueueForLead(task.prompt)
+          setMessages(prev => [...prev, msg])
+          enqueuePendingNotification(command)
+        })()
      },
      isLoading: () => isLoadingRef.current,
      assistantMode,
--- a/src/jobs/tests/classifier.test.ts
+++ b/src/jobs/tests/classifier.test.ts
@@ -0,0 +1,140 @@
+/**
+ * Tests for src/jobs/classifier.ts
+ *
+ * Uses real temp directories instead of mocking fs to avoid
+ * cross-test mock pollution in bun test.
+ *
+ * classifier.ts takes jobDir as a parameter, so no envUtils mock needed.
+ */
+import { describe, expect, test, beforeEach, afterAll } from 'bun:test'
+import { mkdtempSync, writeFileSync, readFileSync, rmSync } from 'fs'
+import { join } from 'path'
+import { tmpdir } from 'os'
+import type { AssistantMessage } from '../../types/message.js'
+import { classifyAndWriteState } from '../classifier.js'
+
+// ─── setup: real temp dir ──────────────────────────────────────────────────
+
+// Shared base dir; each test gets a fresh job dir beneath it (see freshJobDir).
+const tempBase = mkdtempSync(join(tmpdir(), 'classifier-test-'))
+
+let jobDir: string
+let stateFile: string
+
+function freshJobDir(): void {
+  jobDir = mkdtempSync(join(tempBase, 'job-'))
+  stateFile = join(jobDir, 'state.json')
+}
+
+// ─── helpers ────────────────────────────────────────────────────────────────
+
+function makeAssistantMessage(
+  content: any[],
+  extra: Record<string, any> = {},
+): AssistantMessage {
+  return {
+    type: 'assistant',
+    uuid: '00000000-0000-0000-0000-000000000000' as any,
+    message: {
+      role: 'assistant',
+      content,
+      ...extra,
+    },
+  } as any
+}
+
+// ─── lifecycle ─────────────────────────────────────────────────────────────
+
+beforeEach(() => {
+  freshJobDir()
+})
+
+afterAll(() => {
+  try {
+    rmSync(tempBase, { recursive: true, force: true })
+  } catch {
+    // best-effort cleanup
+  }
+})
+
+// ─── tests ──────────────────────────────────────────────────────────────────
+
+describe('classifyAndWriteState', () => {
+  test('does nothing when state.json is missing', async () => {
+    await classifyAndWriteState(jobDir, [])
+    // stateFile should still not exist
+    let exists = false
+    try {
+      readFileSync(stateFile, 'utf-8')
+      exists = true
+    } catch {
+      // expected
+    }
+    expect(exists).toBe(false)
+  })
+
+  test('sets status to running when last message has tool_use block', async () => {
+    writeFileSync(
+      stateFile,
+      JSON.stringify({ status: 'created', updatedAt: '2026-01-01' }),
+      'utf-8',
+    )
+
+    const msg = makeAssistantMessage([
+      { type: 'text', text: 'Let me check...' },
+      { type: 'tool_use', id: 'toolu_1', name: 'bash', input: {} },
+    ])
+
+    await classifyAndWriteState(jobDir, [msg])
+
+    const state = JSON.parse(readFileSync(stateFile, 'utf-8'))
+    expect(state.status).toBe('running')
+  })
+
+  test('sets status to completed when stop_reason is end_turn', async () => {
+    writeFileSync(
+      stateFile,
+      JSON.stringify({ status: 'running', updatedAt: '2026-01-01' }),
+      'utf-8',
+    )
+
+    const msg = makeAssistantMessage([{ type: 'text', text: 'All done.' }], {
+      stop_reason: 'end_turn',
+    })
+
+    await classifyAndWriteState(jobDir, [msg])
+
+    const state = JSON.parse(readFileSync(stateFile, 'utf-8'))
+    expect(state.status).toBe('completed')
+  })
+
+  test('sets status to running for empty messages (state exists)', async () => {
+    writeFileSync(
+      stateFile,
+      JSON.stringify({ status: 'created', updatedAt: '2026-01-01' }),
+      'utf-8',
+    )
+
+    await classifyAndWriteState(jobDir, [])
+
+    const state = JSON.parse(readFileSync(stateFile, 'utf-8'))
+    expect(state.status).toBe('running')
+  })
+
+  test('sets status to running when stop_reason is max_tokens', async () => {
+    writeFileSync(
+      stateFile,
+      JSON.stringify({ status: 'running', updatedAt: '2026-01-01' }),
+      'utf-8',
+    )
+
+    const msg = makeAssistantMessage([{ type: 'text', text: 'I need more' }], {
+      stop_reason: 'max_tokens',
+    })
+
+    await classifyAndWriteState(jobDir, [msg])
+
+    const state = JSON.parse(readFileSync(stateFile, 'utf-8'))
+    expect(state.status).toBe('running')
+  })
+})
--- a/src/jobs/tests/state.test.ts
+++ b/src/jobs/tests/state.test.ts
@@ -0,0 +1,91 @@
+/**
+ * Tests for src/jobs/state.ts
+ *
+ * Uses real temp directories and CLAUDE_CONFIG_DIR env var
+ * instead of mocking fs, to avoid cross-test mock pollution.
+ */
+import { describe, expect, test, beforeEach, afterAll } from 'bun:test'
+import { mkdtempSync, rmSync, readFileSync, existsSync } from 'fs'
+import { join } from 'path'
+import { tmpdir } from 'os'
+
+// ─── setup: real temp dir via env var ──────────────────────────────────────
+
+const tempBase = mkdtempSync(join(tmpdir(), 'jobs-state-test-'))
+
+beforeEach(() => {
+  // Each test gets a fresh config dir
+  const tempHome = mkdtempSync(join(tempBase, 'home-'))
+  process.env.CLAUDE_CONFIG_DIR = tempHome
+})
+
+afterAll(() => {
+  delete process.env.CLAUDE_CONFIG_DIR
+  try {
+    rmSync(tempBase, { recursive: true, force: true })
+  } catch {
+    // best-effort cleanup
+  }
+})
+
+// ─── import ─────────────────────────────────────────────────────────────────
+
+const { createJob, readJobState, appendJobReply, getJobDir } = await import(
+  '../state.js'
+)
+
+// ─── tests ──────────────────────────────────────────────────────────────────
+
+describe('createJob', () => {
+  test('creates job directory and writes state, template, and input files', () => {
+    const dir = createJob('job-1', 'my-template', '# Template', 'hello', [
+      '--flag',
+    ])
+    expect(dir).toContain('job-1')
+    expect(existsSync(dir)).toBe(true)
+
+    const stateFile = join(dir, 'state.json')
+    expect(existsSync(stateFile)).toBe(true)
+    const state = JSON.parse(readFileSync(stateFile, 'utf-8'))
+    expect(state.jobId).toBe('job-1')
+    expect(state.templateName).toBe('my-template')
+    expect(state.status).toBe('created')
+    expect(state.args).toEqual(['--flag'])
+
+    expect(readFileSync(join(dir, 'template.md'), 'utf-8')).toBe('# Template')
+    expect(readFileSync(join(dir, 'input.txt'), 'utf-8')).toBe('hello')
+  })
+})
+
+describe('readJobState', () => {
+  test('returns null when job does not exist', () => {
+    expect(readJobState('nonexistent')).toBeNull()
+  })
+
+  test('returns parsed state when job exists', () => {
+    createJob('job-2', 'tpl', 'content', 'input', [])
+    const result = readJobState('job-2')
+    expect(result).not.toBeNull()
+    expect(result!.jobId).toBe('job-2')
+    expect(result!.status).toBe('created')
+  })
+})
+
+describe('appendJobReply', () => {
+  test('returns false when job does not exist', () => {
+    expect(appendJobReply('no-job', 'hello')).toBe(false)
+  })
+
+  test('appends reply and updates state', () => {
+    createJob('job-3', 'tpl', 'content', 'input', [])
+
+    const result = appendJobReply('job-3', 'my reply')
+    expect(result).toBe(true)
+
+    const dir = getJobDir('job-3')
+    const repliesPath = join(dir, 'replies.jsonl')
+    expect(existsSync(repliesPath)).toBe(true)
+    const replyLine = JSON.parse(readFileSync(repliesPath, 'utf-8').trim())
+    expect(replyLine.text).toBe('my reply')
+  })
+})
--- a/src/jobs/tests/templates.test.ts
+++ b/src/jobs/tests/templates.test.ts
@@ -0,0 +1,87 @@
+/**
+ * Tests for src/jobs/templates.ts
+ *
+ * Uses real temp directories and CLAUDE_CONFIG_DIR env var
+ * instead of mocking fs, to avoid cross-test mock pollution.
+ */
+import { describe, expect, test, beforeEach, afterAll } from 'bun:test'
+import { mkdtempSync, mkdirSync, writeFileSync, rmSync } from 'fs'
+import { join } from 'path'
+import { tmpdir } from 'os'
+
+// ─── setup: real temp dir via env var ──────────────────────────────────────
+
+const tempBase = mkdtempSync(join(tmpdir(), 'jobs-templates-test-'))
+
+beforeEach(() => {
+  const tempHome = mkdtempSync(join(tempBase, 'home-'))
+  process.env.CLAUDE_CONFIG_DIR = tempHome
+})
+
+afterAll(() => {
+  delete process.env.CLAUDE_CONFIG_DIR
+  try {
+    rmSync(tempBase, { recursive: true, force: true })
+  } catch {
+    // best-effort cleanup
+  }
+})
+
+// ─── import ─────────────────────────────────────────────────────────────────
+
+const { listTemplates, loadTemplate } = await import('../templates.js')
+
+// ─── tests ──────────────────────────────────────────────────────────────────
+
+describe('listTemplates', () => {
+  test('returns empty array when no template dirs exist', () => {
+    const result = listTemplates()
+    expect(result).toEqual([])
+  })
+
+  test('discovers templates from user-level dir', () => {
+    const userDir = join(process.env.CLAUDE_CONFIG_DIR!, 'templates')
+    mkdirSync(userDir, { recursive: true })
+    writeFileSync(
+      join(userDir, 'greeting.md'),
+      '---\ndescription: A greeting template\n---\nHello {{name}}',
+      'utf-8',
+    )
+
+    const result = listTemplates()
+    expect(result.length).toBe(1)
+    expect(result[0]!.name).toBe('greeting')
+    expect(result[0]!.description).toBe('A greeting template')
+    expect(result[0]!.content).toBe('Hello {{name}}')
+  })
+
+  test('skips non-md files', () => {
+    const userDir = join(process.env.CLAUDE_CONFIG_DIR!, 'templates')
+    mkdirSync(userDir, { recursive: true })
+    writeFileSync(join(userDir, 'notes.txt'), 'not a template', 'utf-8')
+    writeFileSync(join(userDir, 'data.json'), '{}', 'utf-8')
+
+    const result = listTemplates()
+    expect(result).toEqual([])
+  })
+})
+
+describe('loadTemplate', () => {
+  test('returns null when template not found', () => {
+    expect(loadTemplate('nonexistent')).toBeNull()
+  })
+
+  test('returns template by name', () => {
+    const userDir = join(process.env.CLAUDE_CONFIG_DIR!, 'templates')
+    mkdirSync(userDir, { recursive: true })
+    writeFileSync(
+      join(userDir, 'deploy.md'),
+      '---\ndescription: Deploy script\n---\nrun deploy',
+      'utf-8',
+    )
+
+    const result = loadTemplate('deploy')
+    expect(result).not.toBeNull()
+    expect(result!.name).toBe('deploy')
+  })
+})
--- a/src/jobs/classifier.ts
+++ b/src/jobs/classifier.ts
@@ -1,3 +1,67 @@
-// Auto-generated stub — replace with real implementation
-export {};
-export const classifyAndWriteState: (...args: unknown[]) => Promise<void> = () => Promise.resolve();
+import { readFileSync, writeFileSync } from 'fs'
+import { join } from 'path'
+import type { AssistantMessage } from '../types/message.js'
+
+/**
+ * Classify the job status from the turn's assistant messages and update state.json.
+ *
+ * Called by stopHooks.ts after each repl_main_thread turn when CLAUDE_JOB_DIR is set.
+ * Only the main thread calls this (not subagents).
+ *
+ * @param jobDir - Path to the job directory (from CLAUDE_JOB_DIR env)
+ * @param assistantMessages - Assistant messages from this turn
+ */
+export async function classifyAndWriteState(
+  jobDir: string,
+  assistantMessages: AssistantMessage[],
+): Promise<void> {
+  const stateFile = join(jobDir, 'state.json')
+
+  let state: Record<string, unknown>
+  try {
+    state = JSON.parse(readFileSync(stateFile, 'utf-8'))
+  } catch {
+    // No state file or corrupt — not a valid job directory
+    return
+  }
+
+  const newStatus = classifyStatus(assistantMessages)
+  state.status = newStatus
+  state.updatedAt = new Date().toISOString()
+
+  writeFileSync(stateFile, JSON.stringify(state, null, 2), 'utf-8')
+}
+
+/**
+ * Determine job status from the last assistant message of the turn.
+ *
+ * - Last message has tool_use blocks → running (tools still executing)
+ * - stop_reason === 'end_turn' → completed (model finished)
+ * - Anything else (no messages, max_tokens, unknown) → running
+ */
+function classifyStatus(messages: AssistantMessage[]): string {
+  if (messages.length === 0) return 'running'
+
+  const lastMessage = messages[messages.length - 1]!
+  const content = lastMessage.message?.content
+
+  // Check if the last message has tool_use blocks (still executing)
+  if (Array.isArray(content)) {
+    const hasToolUse = content.some(
+      block =>
+        typeof block === 'object' &&
+        block !== null &&
+        'type' in block &&
+        block.type === 'tool_use',
+    )
+    if (hasToolUse) return 'running'
+  }
+
+  // Check stop_reason via index signature
+  const stopReason = (lastMessage.message as Record<string, unknown>)
+    ?.stop_reason
+  if (stopReason === 'end_turn') return 'completed'
+  if (stopReason === 'max_tokens') return 'running'
+
+  return 'running'
+}
--- a/src/jobs/state.ts
+++ b/src/jobs/state.ts
@@ -0,0 +1,96 @@
+import { appendFileSync, mkdirSync, readFileSync, writeFileSync } from 'fs'
+import { join } from 'path'
+import { getClaudeConfigHomeDir } from '../utils/envUtils.js'
+
+export interface JobState {
+  jobId: string
+  templateName: string
+  createdAt: string
+  updatedAt: string
+  status: 'created' | 'running' | 'completed' | 'failed'
+  args: string[]
+}
+
+function getJobsDir(): string {
+  return join(getClaudeConfigHomeDir(), 'jobs')
+}
+
+export function getJobDir(jobId: string): string {
+  return join(getJobsDir(), jobId)
+}
+
+/**
+ * Create a new job directory with initial state.
+ */
+export function createJob(
+  jobId: string,
+  templateName: string,
+  templateContent: string,
+  inputText: string,
+  args: string[],
+): string {
+  const dir = getJobDir(jobId)
+  mkdirSync(dir, { recursive: true })
+
+  const state: JobState = {
+    jobId,
+    templateName,
+    createdAt: new Date().toISOString(),
+    updatedAt: new Date().toISOString(),
+    status: 'created',
+    args,
+  }
+
+  writeFileSync(
+    join(dir, 'state.json'),
+    JSON.stringify(state, null, 2),
+    'utf-8',
+  )
+  writeFileSync(join(dir, 'template.md'), templateContent, 'utf-8')
+  writeFileSync(join(dir, 'input.txt'), inputText, 'utf-8')
+
+  return dir
+}
+
+/**
+ * Read job state from disk.
+ */
+export function readJobState(jobId: string): JobState | null {
+  try {
+    const raw = readFileSync(join(getJobDir(jobId), 'state.json'), 'utf-8')
+    return JSON.parse(raw) as JobState
+  } catch {
+    return null
+  }
+}
+
+/**
+ * Append a reply to a job.
+ */
+export function appendJobReply(jobId: string, text: string): boolean {
+  const dir = getJobDir(jobId)
+  const state = readJobState(jobId)
+  if (!state) return false
+
+  const repliesPath = join(dir, 'replies.jsonl')
+  const entry = JSON.stringify({
+    text,
+    timestamp: new Date().toISOString(),
+  })
+
+  try {
+    appendFileSync(repliesPath, entry + '\n', 'utf-8')
+  } catch {
+    return false // append failed; never overwrite existing replies
+  }
+
+  // Update state
+  state.updatedAt = new Date().toISOString()
+  writeFileSync(
+    join(dir, 'state.json'),
+    JSON.stringify(state, null, 2),
+    'utf-8',
+  )
+
+  return true
+}
--- a/src/jobs/templates.ts
+++ b/src/jobs/templates.ts
@@ -0,0 +1,105 @@
+import { readdirSync, readFileSync, statSync } from 'fs'
+import { join, basename } from 'path'
+import { parseFrontmatter } from '../utils/frontmatterParser.js'
+import type { FrontmatterData } from '../utils/frontmatterParser.js'
+import { getClaudeConfigHomeDir } from '../utils/envUtils.js'
+
+export interface TemplateInfo {
+  name: string
+  description: string
+  filePath: string
+  frontmatter: FrontmatterData
+  content: string
+}
+
+/**
+ * Discover .claude/templates directories from CWD up to root,
+ * plus the user-level ~/.claude/templates.
+ */
+function getTemplatesDirs(): string[] {
+  const dirs: string[] = []
+
+  // Project-level: walk up from CWD
+  let dir = process.cwd()
+  const seen = new Set<string>()
+  while (true) {
+    const candidate = join(dir, '.claude', 'templates')
+    if (!seen.has(candidate)) {
+      seen.add(candidate)
+      try {
+        if (statSync(candidate).isDirectory()) {
+          dirs.push(candidate)
+        }
+      } catch {
+        // Not found — keep walking
+      }
+    }
+
+    const parent = join(dir, '..')
+    if (parent === dir) break
+    dir = parent
+  }
+
+  // User-level
+  const userDir = join(getClaudeConfigHomeDir(), 'templates')
+  try {
+    if (statSync(userDir).isDirectory()) {
+      dirs.push(userDir)
+    }
+  } catch {
+    // Not found
+  }
+
+  return dirs
+}
+
+/**
+ * List all available templates; on name clashes the nearest directory wins.
+ */
+export function listTemplates(): TemplateInfo[] {
+  const templates: TemplateInfo[] = []
+  const seenNames = new Set<string>()
+
+  for (const dir of getTemplatesDirs()) {
+    let files: string[]
+    try {
+      files = readdirSync(dir)
+    } catch {
+      continue
+    }
+
+    for (const file of files) {
+      if (!file.endsWith('.md')) continue
+      const name = basename(file, '.md')
+      if (seenNames.has(name)) continue
+      seenNames.add(name)
+
+      const filePath = join(dir, file)
+      try {
+        const raw = readFileSync(filePath, 'utf-8')
+        const { frontmatter, content } = parseFrontmatter(raw, filePath)
+        const description =
+          (frontmatter.description as string) ||
+          content
+            .split('\n')
+            .find(l => l.trim().length > 0)
+            ?.trim() ||
+          'No description'
+
+        templates.push({ name, description, filePath, frontmatter, content })
+      } catch {
+        // Skip unreadable files
+      }
+    }
+  }
+
+  return templates
+}
+
+/**
+ * Load a specific template by name.
+ */
+export function loadTemplate(name: string): TemplateInfo | null {
+  const all = listTemplates()
+  return all.find(t => t.name === name) ?? null
+}
--- a/src/main.tsx
+++ b/src/main.tsx
@@ -1802,9 +1802,11 @@ async function run(): Promise<CommanderCommand> {
 			}
 			if (
 				feature("KAIROS") &&
-				assistantModule?.isAssistantMode() &&
+				assistantModule &&
+				(assistantModule.isAssistantForced() ||
+					(options as Record<string, unknown>).assistant === true) &&
 				// Spawned teammates share the leader's cwd + settings.json, so
-				// isAssistantMode() is true for them too. --agent-id being set
+				// the flag is true for them too. --agent-id being set
 				// means we ARE a spawned teammate (extractTeammateOptions runs
 				// ~170 lines later so check the raw commander option) — don't
 				// re-init the team or override teammateMode/proactive/brief.
--- a/src/proactive/tests/state.baseline.test.ts
+++ b/src/proactive/tests/state.baseline.test.ts
@@ -0,0 +1,80 @@
+import { beforeEach, describe, expect, test } from 'bun:test'
+import {
+  activateProactive,
+  deactivateProactive,
+  getActivationSource,
+  getNextTickAt,
+  isContextBlocked,
+  isProactiveActive,
+  isProactivePaused,
+  pauseProactive,
+  resumeProactive,
+  setContextBlocked,
+  setNextTickAt,
+  shouldTick,
+  subscribeToProactiveChanges,
+} from '../index'
+
+function resetProactiveState() {
+  activateProactive('reset')
+  setContextBlocked(false)
+  setNextTickAt(null)
+  deactivateProactive()
+}
+
+beforeEach(() => {
+  resetProactiveState()
+})
+
+describe('proactive state baseline', () => {
+  test('activateProactive enables proactive mode and records the source', () => {
+    activateProactive('baseline_test')
+
+    expect(isProactiveActive()).toBe(true)
+    expect(isProactivePaused()).toBe(false)
+    expect(isContextBlocked()).toBe(false)
+    expect(getActivationSource()).toBe('baseline_test')
+    expect(shouldTick()).toBe(true)
+  })
+
+  test('pauseProactive suppresses ticking and clears nextTickAt', () => {
+    activateProactive('pause_case')
+    setNextTickAt(Date.now() + 30_000)
+
+    pauseProactive()
+
+    expect(isProactivePaused()).toBe(true)
+    expect(getNextTickAt()).toBeNull()
+    expect(shouldTick()).toBe(false)
+
+    resumeProactive()
+    expect(isProactivePaused()).toBe(false)
+    expect(shouldTick()).toBe(true)
+  })
+
+  test('setContextBlocked clears nextTickAt and blocks ticking', () => {
+    activateProactive('blocked_case')
+    setNextTickAt(Date.now() + 5_000)
+
+    setContextBlocked(true)
+
+    expect(isContextBlocked()).toBe(true)
+    expect(getNextTickAt()).toBeNull()
+    expect(shouldTick()).toBe(false)
+  })
+
+  test('subscribers are notified on state changes', () => {
+    let notifications = 0
+    const unsubscribe = subscribeToProactiveChanges(() => {
+      notifications += 1
+    })
+
+    activateProactive('subscriber_case')
+    setNextTickAt(Date.now() + 1_000)
+    setContextBlocked(true)
+    deactivateProactive()
+    unsubscribe()
+
+    expect(notifications).toBeGreaterThanOrEqual(3)
+  })
+})
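Review note: the baseline tests above pin down a small contract for the proactive state module. Pausing or blocking clears the pending tick deadline, ticking needs the mode to be active, unpaused, and unblocked, and every mutation notifies subscribers. The sketch below is a minimal in-memory model of that contract. `createProactiveStore` and its internals are hypothetical; the real `src/proactive/index.ts` is not shown in this diff and may differ, for example in whether `deactivateProactive` also resets the context-blocked flag.

```typescript
// Hypothetical stand-in for src/proactive/index.ts, which this diff does not show.
type Listener = () => void

function createProactiveStore() {
  let active = false
  let paused = false
  let contextBlocked = false
  let nextTickAt: number | null = null
  let source: string | null = null
  const listeners = new Set<Listener>()
  const notify = (): void => {
    listeners.forEach(listener => listener())
  }

  return {
    activate(from: string): void {
      active = true
      paused = false
      source = from
      notify()
    },
    deactivate(): void {
      active = false
      paused = false
      nextTickAt = null
      source = null
      notify()
    },
    pause(): void {
      paused = true
      nextTickAt = null // a paused loop must not keep a stale deadline
      notify()
    },
    resume(): void {
      paused = false
      notify()
    },
    setContextBlocked(blocked: boolean): void {
      contextBlocked = blocked
      if (blocked) nextTickAt = null
      notify()
    },
    setNextTickAt(at: number | null): void {
      nextTickAt = at
      notify()
    },
    getNextTickAt: (): number | null => nextTickAt,
    getActivationSource: (): string | null => source,
    // Ticking requires all three conditions at once.
    shouldTick: (): boolean => active && !paused && !contextBlocked,
    subscribe(listener: Listener): () => void {
      listeners.add(listener)
      return () => {
        listeners.delete(listener)
      }
    },
  }
}
```

Keeping `shouldTick` as a pure function of the three flags is what lets the tests assert it directly after each transition.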
--- a/src/proactive/useProactive.ts
+++ b/src/proactive/useProactive.ts
@@ -6,7 +6,10 @@
 * proactive mode is active and not blocked.
 */
 import { useEffect, useRef } from 'react'
+import type { QueuedCommand } from '../types/textInputTypes.js'
 import { TICK_TAG } from '../constants/xml.js'
+import { getCwd } from '../utils/cwd.js'
+import { createProactiveAutonomyCommands } from '../utils/autonomyRuns.js'
 import {
  isProactiveActive,
  isProactivePaused,
@@ -24,8 +27,7 @@ type UseProactiveOpts = {
  queuedCommandsLength: number
  hasActiveLocalJsxUI: boolean
  isInPlanMode: boolean
-  onSubmitTick: (prompt: string) => void
-  onQueueTick: (prompt: string) => void
+  onQueueTick: (command: QueuedCommand) => void
 }

 export function useProactive(opts: UseProactiveOpts): void {
@@ -70,14 +72,19 @@ export function useProactive(opts: UseProactiveOpts): void {
          return
        }

-        const tickContent = `<${TICK_TAG}>${new Date().toLocaleTimeString()}</${TICK_TAG}>`
-
-        // If nothing is in the queue, submit directly; otherwise queue
-        if (queuedCommandsLength === 0) {
-          optsRef.current.onSubmitTick(tickContent)
-        } else {
-          optsRef.current.onQueueTick(tickContent)
-        }
+        void (async () => {
+          const commands = await createProactiveAutonomyCommands({
+            basePrompt: `<${TICK_TAG}>${new Date().toLocaleTimeString()}</${TICK_TAG}>`,
+            currentDir: getCwd(),
+          })
+          for (const command of commands) {
+            // Always queue proactive turns. This avoids races where the prompt
+            // is built asynchronously, a user turn starts meanwhile, and a
+            // direct-submit path would silently drop the autonomy turn after
+            // consuming its heartbeat due-state.
+            optsRef.current.onQueueTick(command)
+          }
+        })()

        // Schedule next tick
        scheduleTick()
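Review note: the hunk above removes the direct-submit path and always queues proactive turns. The sketch below models why, under the assumption that a direct submit is refused while another turn is running. `TurnLoop` and its methods are hypothetical names for illustration and are not part of the REPL code.

```typescript
// Hypothetical model of the REPL turn loop; names are illustrative only.
type QueuedCommand = { mode: 'prompt'; value: string; isMeta?: boolean }

class TurnLoop {
  private busy = false
  private readonly queue: QueuedCommand[] = []
  readonly executed: string[] = []

  // A user turn starts and occupies the loop.
  startUserTurn(): void {
    this.busy = true
  }

  // Direct submit refuses while a turn is running. The caller gets `false`
  // and the command is lost unless the caller retries.
  submitDirect(command: QueuedCommand): boolean {
    if (this.busy) return false
    this.executed.push(command.value)
    return true
  }

  // Queueing never refuses; the command waits for the loop to go idle.
  enqueue(command: QueuedCommand): void {
    this.queue.push(command)
  }

  // The running turn finishes, then queued commands drain in order.
  finishTurn(): void {
    this.busy = false
    while (this.queue.length > 0) {
      const next = this.queue.shift()
      if (next) this.executed.push(next.value)
    }
  }
}
```

If a user turn starts while the tick prompt is still being built asynchronously, `submitDirect` returns `false` and the tick is gone, whereas `enqueue` keeps it until `finishTurn` drains the queue.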
--- a/src/screens/REPL.tsx
+++ b/src/screens/REPL.tsx
@@ -72,6 +72,11 @@ import { QueryGuard } from '../utils/QueryGuard.js';
 import { isEnvTruthy } from '../utils/envUtils.js';
 import { formatTokens, truncateToWidth } from '../utils/format.js';
 import { consumeEarlyInput } from '../utils/earlyInput.js';
+import {
+  finalizeAutonomyRunCompleted,
+  finalizeAutonomyRunFailed,
+  markAutonomyRunRunning,
+} from '../utils/autonomyRuns.js';

 import { setMemberActive } from '../utils/swarm/teamHelpers.js';
 import {
@@ -346,6 +351,9 @@ const usePipeRelay = feature('UDS_INBOX')
 const usePipePermissionForward = feature('UDS_INBOX')
  ? require('../hooks/usePipePermissionForward.js').usePipePermissionForward
  : () => undefined;
+const usePipeMuteSync = feature('UDS_INBOX')
+  ? require('../hooks/usePipeMuteSync.js').usePipeMuteSync
+  : () => undefined;
 const usePipeRouter = feature('UDS_INBOX')
  ? require('../hooks/usePipeRouter.js').usePipeRouter
  : () => ({ routeToSelectedPipes: () => false });
@@ -4299,7 +4307,7 @@ export function REPL({
          });
        }
      } else {
-        injectUserMessageToTeammate(task.id, input, setAppState);
+        injectUserMessageToTeammate(task.id, input, undefined, setAppState);
      }
      setInputValue('');
      helpers.setCursorOffset(0);
@@ -4804,7 +4812,10 @@ export function REPL({
  // Submits incoming prompts from teammate messages or tasks mode as new turns
  // Returns true if submission succeeded, false if a query is already running
  const handleIncomingPrompt = useCallback(
-    (content: string, options?: { isMeta?: boolean }): boolean => {
+    (
+      input: string | QueuedCommand,
+      options?: { isMeta?: boolean },
+    ): boolean => {
      if (queryGuard.isActive) return false;

      // Defer to user-queued commands — user input always takes priority
@@ -4816,16 +4827,53 @@ export function REPL({
        return false;
      }

+      const queuedCommand =
+        typeof input === 'string'
+          ? ({
+              value: input,
+              mode: 'prompt',
+              isMeta: options?.isMeta ? true : undefined,
+            } satisfies QueuedCommand)
+          : input
+
      const newAbortController = createAbortController();
      setAbortController(newAbortController);

      // Create a user message with the formatted content (includes XML wrapper)
      const userMessage = createUserMessage({
-        content,
-        isMeta: options?.isMeta ? true : undefined,
+        content: queuedCommand.value as string,
+        isMeta: queuedCommand.isMeta ? true : undefined,
+        origin: queuedCommand.origin,
      });

-      void onQuery([userMessage], newAbortController, true, [], mainLoopModel);
+      const autonomyRunId = queuedCommand.autonomy?.runId
+      if (autonomyRunId) {
+        void markAutonomyRunRunning(autonomyRunId)
+      }
+
+      void onQuery([userMessage], newAbortController, true, [], mainLoopModel)
+        .then(() => {
+          if (autonomyRunId) {
+            void finalizeAutonomyRunCompleted({
+              runId: autonomyRunId,
+              currentDir: getCwd(),
+              priority: 'later',
+            }).then(nextCommands => {
+              for (const command of nextCommands) {
+                enqueue(command);
+              }
+            })
+          }
+        })
+        .catch((error: unknown) => {
+          if (autonomyRunId) {
+            void finalizeAutonomyRunFailed({
+              runId: autonomyRunId,
+              error: String(error),
+            })
+          }
+          logError(toError(error))
+        })
      return true;
    },
    [onQuery, mainLoopModel, store],
@@ -4856,6 +4904,7 @@ export function REPL({
  const pipeIpcState = useAppState(s => getPipeIpc(s as any));

  usePipePermissionForward({ store, tools, setMessages, setToolUseConfirmQueue, getToolUseContext, mainLoopModel });
+  usePipeMuteSync({ setToolUseConfirmQueue });

  // Pipe IPC lifecycle — extracted to usePipeIpc hook
  usePipeIpc({ store, handleIncomingPrompt });
@@ -4898,8 +4947,7 @@ export function REPL({
    queuedCommandsLength: queuedCommands.length,
    hasActiveLocalJsxUI: isShowingLocalJSXCommand,
    isInPlanMode: toolPermissionContext.mode === 'plan',
-    onSubmitTick: (prompt: string) => handleIncomingPrompt(prompt, { isMeta: true }),
-    onQueueTick: (prompt: string) => enqueue({ mode: 'prompt', value: prompt, isMeta: true }),
+    onQueueTick: (command: QueuedCommand) => enqueue(command),
  });

  // Abort the current operation when a 'now' priority message arrives
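Review note: `handleIncomingPrompt` now accepts either a raw string or a prebuilt `QueuedCommand` and normalizes both to one shape before creating the user message. The sketch below isolates that normalization step. The `QueuedCommand` type here is a simplified assumption (the real one is imported from `../types/textInputTypes.js` and likely has more fields), and `toQueuedCommand` is a hypothetical helper name; the diff inlines this logic.

```typescript
// Simplified QueuedCommand for illustration; field types are assumptions.
type QueuedCommand = {
  value: string
  mode: 'prompt'
  isMeta?: boolean
  origin?: string
  autonomy?: { runId: string }
}

// Hypothetical helper: the diff performs this inline in handleIncomingPrompt.
function toQueuedCommand(
  input: string | QueuedCommand,
  options?: { isMeta?: boolean },
): QueuedCommand {
  // Prebuilt commands pass through untouched, so their own isMeta, origin,
  // and autonomy metadata win over `options`.
  if (typeof input !== 'string') return input
  return {
    value: input,
    mode: 'prompt',
    isMeta: options?.isMeta ? true : undefined,
  }
}
```

One consequence worth noting for callers: `options.isMeta` only applies to string input. A caller that passes a `QueuedCommand` must set `isMeta` on the command itself.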
--- a/src/services/analytics/growthbook.ts
+++ b/src/services/analytics/growthbook.ts
@@ -466,6 +466,7 @@ const LOCAL_GATE_DEFAULTS: Record<string, unknown> = {
  tengu_birch_trellis: true, // Tree-sitter bash security analysis
  tengu_collage_kaleidoscope: true, // macOS clipboard image reading
  tengu_compact_cache_prefix: true, // Reuse prompt cache during compaction
+  tengu_kairos_assistant: true, // KAIROS assistant mode activation
  tengu_kairos_cron_durable: true, // Persistent cron tasks
  tengu_attribution_header: true, // API request attribution header
  tengu_slate_prism: true, // Agent progress summaries
@@ -830,6 +831,16 @@ export function getFeatureValue_CACHED_MAY_BE_STALE<T>(
    return localDefault !== undefined ? (localDefault as T) : defaultValue
  }

+  // LOCAL_GATE_DEFAULTS take priority over remote values and disk cache.
+  // In fork/self-hosted deployments, the GrowthBook server may push false
+  // for gates we intentionally enable. Local defaults represent the
+  // project's intentional configuration and override everything except
+  // env/config overrides (which are explicit user intent).
+  const localDefault = getLocalGateDefault(feature)
+  if (localDefault !== undefined) {
+    return localDefault as T
+  }
+
  // Log experiment exposure if data is available, otherwise defer until after init
  if (experimentDataByFeature.has(feature)) {
    logExposureForFeature(feature)
@@ -838,10 +849,6 @@ export function getFeatureValue_CACHED_MAY_BE_STALE<T>(
  }

  // In-memory payload is authoritative once processRemoteEvalPayload has run.
-  // Disk is also fresh by then (syncRemoteEvalToDisk runs synchronously inside
-  // init), so this is correctness-equivalent to the disk read below — but it
-  // skips the config JSON parse and is what onGrowthBookRefresh subscribers
-  // depend on to read fresh values the instant they're notified.
  if (remoteEvalFeatureValues.has(feature)) {
    return remoteEvalFeatureValues.get(feature) as T
  }
@@ -853,14 +860,9 @@ export function getFeatureValue_CACHED_MAY_BE_STALE<T>(
      return cached as T
    }
  } catch {
-    // Config not yet initialized — fall through to local gate defaults
+    // Config not yet initialized — fall through to defaultValue
  }
-  // Disk cache miss (or config not initialized) — use local gate defaults
-  // before falling back to the caller's defaultValue. This covers:
-  // 1. GrowthBook "enabled" but never connected (caches empty)
-  // 2. Config not yet initialized (early in startup)
-  const localDefault = getLocalGateDefault(feature)
-  return localDefault !== undefined ? (localDefault as T) : defaultValue
+  return defaultValue
 }

 /**
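Review note: after this change the lookup order in `getFeatureValue_CACHED_MAY_BE_STALE` reads as: env/config overrides, then local gate defaults, then the in-memory remote payload, then the disk cache, then the caller's `defaultValue`. The sketch below expresses that order as a layered lookup. The `Sources` shape and `resolveFeatureValue` are hypothetical, and it uses `Map.has` where the real code compares against `undefined`.

```typescript
// Hypothetical container for the lookup layers; the real code reads these
// from module state rather than from a single object.
type Sources = {
  overrides: Map<string, unknown> // env/config overrides (explicit user intent)
  localDefaults: Map<string, unknown> // stands in for LOCAL_GATE_DEFAULTS
  remote: Map<string, unknown> // in-memory remote eval payload
  diskCache: Map<string, unknown> // last values persisted to disk
}

function resolveFeatureValue<T>(
  feature: string,
  defaultValue: T,
  sources: Sources,
): T {
  // Earlier layers win. Local defaults sit above remote and disk on purpose.
  const layers = [
    sources.overrides,
    sources.localDefaults,
    sources.remote,
    sources.diskCache,
  ]
  for (const layer of layers) {
    if (layer.has(feature)) return layer.get(feature) as T
  }
  return defaultValue
}
```

The trade-off is that a gate listed in the local defaults can no longer be turned off remotely; only an env or config override can change it.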
--- a/src/services/api/openai/tests/queryModelOpenAI.isolated.ts
+++ b/src/services/api/openai/tests/queryModelOpenAI.isolated.ts
@@ -0,0 +1,487 @@
+/**
+ * Tests for queryModelOpenAI in index.ts.
+ *
+ * Focused on the two bugs fixed:
+ *  1. stop_reason was always null in the assembled AssistantMessage because
+ *     partialMessage (from message_start) has stop_reason: null, and the
+ *     stop_reason captured from message_delta was never applied.
+ *  2. partialMessage was not reset to null after message_stop, so the safety
+ *     fallback at the end of the loop would yield a second identical
+ *     AssistantMessage (causing doubled content in the next API request).
+ *
+ * Strategy: mock getOpenAIClient + adaptOpenAIStreamToAnthropic so we can
+ * feed pre-built Anthropic events directly into queryModelOpenAI and inspect
+ * what it emits — without any real HTTP calls.
+ */
+import { describe, expect, test, mock } from 'bun:test'
+import type { BetaRawMessageStreamEvent } from '@anthropic-ai/sdk/resources/beta/messages/messages.mjs'
+import type { AssistantMessage, StreamEvent } from '../../../../types/message.js'
+
+// ─── helpers ─────────────────────────────────────────────────────────────────
+
+/** Build a minimal message_start event */
+function makeMessageStart(overrides: Record<string, any> = {}): BetaRawMessageStreamEvent {
+  return {
+    type: 'message_start',
+    message: {
+      id: 'msg_test',
+      type: 'message',
+      role: 'assistant',
+      content: [],
+      model: 'test-model',
+      stop_reason: null,
+      stop_sequence: null,
+      usage: { input_tokens: 0, output_tokens: 0, cache_creation_input_tokens: 0, cache_read_input_tokens: 0 },
+      ...overrides,
+    },
+  } as any
+}
+
+/** Build a content_block_start event for the given block type */
+function makeContentBlockStart(index: number, type: 'text' | 'tool_use' | 'thinking', extra: Record<string, any> = {}): BetaRawMessageStreamEvent {
+  const block =
+    type === 'text'
+      ? { type: 'text', text: '' }
+      : type === 'tool_use'
+        ? { type: 'tool_use', id: 'toolu_test', name: 'bash', input: {} }
+        : { type: 'thinking', thinking: '', signature: '' }
+  return { type: 'content_block_start', index, content_block: { ...block, ...extra } } as any
+}
+
+/** Build a text_delta content_block_delta event */
+function makeTextDelta(index: number, text: string): BetaRawMessageStreamEvent {
+  return { type: 'content_block_delta', index, delta: { type: 'text_delta', text } } as any
+}
+
+/** Build an input_json_delta content_block_delta event */
+function makeInputJsonDelta(index: number, json: string): BetaRawMessageStreamEvent {
+  return { type: 'content_block_delta', index, delta: { type: 'input_json_delta', partial_json: json } } as any
+}
+
+/** Build a thinking_delta content_block_delta event */
+function makeThinkingDelta(index: number, thinking: string): BetaRawMessageStreamEvent {
+  return { type: 'content_block_delta', index, delta: { type: 'thinking_delta', thinking } } as any
+}
+
+/** Build a content_block_stop event */
+function makeContentBlockStop(index: number): BetaRawMessageStreamEvent {
+  return { type: 'content_block_stop', index } as any
+}
+
+/** Build a message_delta event with stop_reason and output_tokens */
+function makeMessageDelta(stopReason: string, outputTokens: number): BetaRawMessageStreamEvent {
+  return {
+    type: 'message_delta',
+    delta: { stop_reason: stopReason, stop_sequence: null },
+    usage: { output_tokens: outputTokens },
+  } as any
+}
+
+/** Build a message_stop event */
+function makeMessageStop(): BetaRawMessageStreamEvent {
+  return { type: 'message_stop' } as any
+}
+
+/** Async generator from a fixed array of events */
+async function* eventStream(events: BetaRawMessageStreamEvent[]) {
+  for (const e of events) yield e
+}
+
+/** Collect all outputs from queryModelOpenAI into typed buckets */
+async function runQueryModel(
+  events: BetaRawMessageStreamEvent[],
+  envOverrides: Record<string, string | undefined> = {},
+) {
+  // Wire events into the mocked stream adapter
+  _nextEvents = events
+  // Save + apply env overrides
+  const saved: Record<string, string | undefined> = {}
+  for (const [k, v] of Object.entries(envOverrides)) {
+    saved[k] = process.env[k]
+    if (v === undefined) delete process.env[k]
+    else process.env[k] = v
+  }
+
+  try {
+    // The mock.module() calls are registered once at module level (see the
+    // "mock setup" section below), so this dynamic import resolves against
+    // the mocked dependencies; only _nextEvents changes between tests.
+    const { queryModelOpenAI } = await import('../index.js')
+
+    const assistantMessages: AssistantMessage[] = []
+    const streamEvents: StreamEvent[] = []
+    const otherOutputs: any[] = []
+
+    const minimalOptions: any = {
+      model: 'test-model',
+      tools: [],
+      agents: [],
+      querySource: 'main_loop',
+      getToolPermissionContext: async () => ({
+        alwaysAllow: [],
+        alwaysDeny: [],
+        needsPermission: [],
+        mode: 'default',
+        isBypassingPermissions: false,
+      }),
+    }
+
+    for await (const item of queryModelOpenAI(
+      [],
+      { type: 'text', text: '' } as any,
+      [],
+      new AbortController().signal,
+      minimalOptions,
+    )) {
+      if (item.type === 'assistant') {
+        assistantMessages.push(item as AssistantMessage)
+      } else if (item.type === 'stream_event') {
+        streamEvents.push(item as StreamEvent)
+      } else {
+        otherOutputs.push(item)
+      }
+    }
+
+    return { assistantMessages, streamEvents, otherOutputs }
+  } finally {
+    // Restore env
+    for (const [k, v] of Object.entries(saved)) {
+      if (v === undefined) delete process.env[k]
+      else process.env[k] = v
+    }
+  }
+}
+
+// ─── mock setup ──────────────────────────────────────────────────────────────
+
+// We mock at module level. Bun's mock.module replaces the module for the
+// entire file, so we configure the stream per-test via a shared variable.
+let _nextEvents: BetaRawMessageStreamEvent[] = []
+
+/** Captured arguments from the last chat.completions.create() call */
+let _lastCreateArgs: Record<string, any> | null = null
+
+mock.module('../client.js', () => ({
+  getOpenAIClient: () => ({
+    chat: {
+      completions: {
+        create: async (args: Record<string, any>) => {
+          _lastCreateArgs = args
+          return { [Symbol.asyncIterator]: async function* () {} }
+        },
+      },
+    },
+  }),
+}))
+
+mock.module('../streamAdapter.js', () => ({
+  adaptOpenAIStreamToAnthropic: (_stream: any, _model: string) => eventStream(_nextEvents),
+}))
+
+mock.module('../modelMapping.js', () => ({
+  resolveOpenAIModel: (m: string) => m,
+}))
+
+mock.module('../convertMessages.js', () => ({
+  anthropicMessagesToOpenAI: () => [],
+}))
+
+mock.module('../convertTools.js', () => ({
+  anthropicToolsToOpenAI: () => [],
+  anthropicToolChoiceToOpenAI: () => undefined,
+}))
+
+mock.module('../../../../utils/context.js', () => ({
+  MODEL_CONTEXT_WINDOW_DEFAULT: 200_000,
+  COMPACT_MAX_OUTPUT_TOKENS: 20_000,
+  CAPPED_DEFAULT_MAX_TOKENS: 8_000,
+  ESCALATED_MAX_TOKENS: 64_000,
+  is1mContextDisabled: () => false,
+  has1mContext: () => false,
+  modelSupports1M: () => false,
+  getModelMaxOutputTokens: () => ({ upperLimit: 8192, default: 8192 }),
+  getContextWindowForModel: () => 200_000,
+  getSonnet1mExpTreatmentEnabled: () => false,
+  calculateContextPercentages: () => ({ usedPercent: 0, remainingPercent: 100 }),
+  getMaxThinkingTokensForModel: () => 0,
+}))
+
+mock.module('../../../../utils/messages.js', () => ({
+  normalizeMessagesForAPI: (msgs: any) => msgs,
+  normalizeContentFromAPI: (blocks: any[]) => blocks,
+  createAssistantAPIErrorMessage: (opts: any) => ({
+    type: 'assistant',
+    message: { content: [{ type: 'text', text: opts.content }], apiError: opts.apiError },
+    uuid: 'error-uuid',
+    timestamp: new Date().toISOString(),
+  }),
+}))
+
+mock.module('../../../../utils/api.js', () => ({
+  toolToAPISchema: async (t: any) => t,
+}))
+
+mock.module('../../../../utils/toolSearch.js', () => ({
+  isToolSearchEnabled: async () => false,
+  extractDiscoveredToolNames: () => new Set(),
+}))
+
+mock.module('../../../../tools/ToolSearchTool/prompt.js', () => ({
+  isDeferredTool: () => false,
+  TOOL_SEARCH_TOOL_NAME: '__tool_search__',
+}))
+
+mock.module('../../../../cost-tracker.js', () => ({
+  addToTotalSessionCost: () => {},
+}))
+
+mock.module('../../../../utils/modelCost.js', () => ({
+  COST_TIER_3_15: {},
+  COST_TIER_15_75: {},
+  COST_TIER_5_25: {},
+  COST_TIER_30_150: {},
+  COST_HAIKU_35: {},
+  COST_HAIKU_45: {},
+  getOpus46CostTier: () => ({}),
+  MODEL_COSTS: {},
+  getModelCosts: () => ({}),
+  calculateUSDCost: () => 0,
+  calculateCostFromTokens: () => 0,
+  formatModelPricing: () => '',
+  getModelPricingString: () => undefined,
+}))
+
+mock.module('../../../../utils/debug.js', () => ({
+  logForDebugging: () => {},
+  logAntError: () => {},
+  isDebugMode: () => false,
+  isDebugToStdErr: () => false,
+  getDebugFilePath: () => null,
+  getDebugLogPath: () => '',
+  getDebugFilter: () => null,
+  getMinDebugLogLevel: () => 'debug',
+  enableDebugLogging: () => false,
+  setHasFormattedOutput: () => {},
+  getHasFormattedOutput: () => false,
+  flushDebugLogs: async () => {},
+}))
+
+// ─── tests ───────────────────────────────────────────────────────────────────
+
+describe('queryModelOpenAI — stop_reason propagation', () => {
+  test('assembled AssistantMessage has stop_reason end_turn (not null)', async () => {
+    _nextEvents = [
+      makeMessageStart(),
+      makeContentBlockStart(0, 'text'),
+      makeTextDelta(0, 'Hello'),
+      makeContentBlockStop(0),
+      makeMessageDelta('end_turn', 10),
+      makeMessageStop(),
+    ]
+
+    const { assistantMessages } = await runQueryModel(_nextEvents)
+
+    expect(assistantMessages).toHaveLength(1)
+    expect(assistantMessages[0]!.message.stop_reason).toBe('end_turn')
+  })
+
+  test('assembled AssistantMessage has stop_reason tool_use', async () => {
+    _nextEvents = [
+      makeMessageStart(),
+      makeContentBlockStart(0, 'tool_use'),
+      makeInputJsonDelta(0, '{"cmd":"ls"}'),
+      makeContentBlockStop(0),
+      makeMessageDelta('tool_use', 20),
+      makeMessageStop(),
+    ]
+
+    const { assistantMessages } = await runQueryModel(_nextEvents)
+
+    expect(assistantMessages).toHaveLength(1)
+    expect(assistantMessages[0]!.message.stop_reason).toBe('tool_use')
+  })
+
+  test('assembled AssistantMessage has stop_reason max_tokens', async () => {
+    _nextEvents = [
+      makeMessageStart(),
+      makeContentBlockStart(0, 'text'),
+      makeTextDelta(0, 'truncated'),
+      makeContentBlockStop(0),
+      makeMessageDelta('max_tokens', 8192),
+      makeMessageStop(),
+    ]
+
+    const { assistantMessages } = await runQueryModel(_nextEvents)
+
+    // Two assistant-typed items: the content message + the max_output_tokens error signal.
+    // The error signal is emitted as a synthetic assistant message by createAssistantAPIErrorMessage.
+    expect(assistantMessages).toHaveLength(2)
+    const contentMsg = assistantMessages[0]!
+    expect(contentMsg.message.stop_reason).toBe('max_tokens')
+    // Second item is the error signal (has apiError set)
+    const errorMsg = assistantMessages[1]!.message as any
+    expect(errorMsg.apiError).toBe('max_output_tokens')
+  })
+
+  test('stop_reason is null when no message_delta was received (safety fallback path)', async () => {
+    // Stream ends without message_stop — triggers the safety fallback branch.
+    // stop_reason stays null since no message_delta was ever seen.
+    _nextEvents = [
+      makeMessageStart(),
+      makeContentBlockStart(0, 'text'),
+      makeTextDelta(0, 'partial'),
+      makeContentBlockStop(0),
+      // No message_delta / message_stop
+    ]
+
+    const { assistantMessages } = await runQueryModel(_nextEvents)
+
+    // Safety fallback should yield the partial content
+    expect(assistantMessages).toHaveLength(1)
+    expect(assistantMessages[0]!.message.stop_reason).toBeNull()
+  })
+})
+
+describe('queryModelOpenAI — usage accumulation', () => {
+  test('usage in assembled message reflects all four fields from message_delta', async () => {
+    // message_start has all fields=0 (trailing-chunk pattern: usage not yet available).
+    // message_delta carries the real values after stream ends.
+    // The spread in the message_delta handler must override all zeros from message_start,
+    // including cache_read_input_tokens which was previously missing from message_delta.
+    _nextEvents = [
+      makeMessageStart({ usage: { input_tokens: 0, output_tokens: 0, cache_creation_input_tokens: 0, cache_read_input_tokens: 0 } }),
+      makeContentBlockStart(0, 'text'),
+      makeTextDelta(0, 'response'),
+      makeContentBlockStop(0),
+      // message_delta carries all four Anthropic usage fields (as emitted by the fixed streamAdapter)
+      {
+        type: 'message_delta',
+        delta: { stop_reason: 'end_turn', stop_sequence: null },
+        usage: { input_tokens: 30011, output_tokens: 190, cache_read_input_tokens: 19904, cache_creation_input_tokens: 0 },
+      } as any,
+      makeMessageStop(),
+    ]
+
+    const { assistantMessages } = await runQueryModel(_nextEvents)
+
+    expect(assistantMessages).toHaveLength(1)
+    const usage = assistantMessages[0]!.message.usage as any
+    expect(usage.input_tokens).toBe(30011)
+    expect(usage.output_tokens).toBe(190)
+    // cache_read_input_tokens from message_delta overrides the 0 from message_start
+    expect(usage.cache_read_input_tokens).toBe(19904)
+    expect(usage.cache_creation_input_tokens).toBe(0)
+  })
+
+  test('usage fields stay numeric when message_delta reports zero output tokens', async () => {
+    // Zero usage makes tokenCountWithEstimation undercount, so at minimum
+    // verify that the fields exist and are numeric (to detect regressions).
+    _nextEvents = [
+      makeMessageStart(),
+      makeContentBlockStart(0, 'text'),
+      makeTextDelta(0, 'hi'),
+      makeContentBlockStop(0),
+      makeMessageDelta('end_turn', 0),
+      makeMessageStop(),
+    ]
+
+    const { assistantMessages } = await runQueryModel(_nextEvents)
+
+    const usage = assistantMessages[0]!.message.usage as any
+    expect(typeof usage.input_tokens).toBe('number')
+    expect(typeof usage.output_tokens).toBe('number')
+  })
+})
+
+describe('queryModelOpenAI — no duplicate AssistantMessage (partialMessage reset)', () => {
+  test('yields exactly one AssistantMessage per message_stop when content is present', async () => {
+    _nextEvents = [
+      makeMessageStart(),
+      makeContentBlockStart(0, 'text'),
+      makeTextDelta(0, 'only once'),
+      makeContentBlockStop(0),
+      makeMessageDelta('end_turn', 5),
+      makeMessageStop(),
+    ]
+
+    const { assistantMessages } = await runQueryModel(_nextEvents)
+
+    // Before the fix, partialMessage was not reset to null, so the safety
+    // fallback at the end of the loop would yield a second message with the
+    // same message.id — causing mergeAssistantMessages to concatenate content.
+    expect(assistantMessages).toHaveLength(1)
+  })
+
+  test('thinking + text response yields exactly one AssistantMessage', async () => {
+    _nextEvents = [
+      makeMessageStart(),
+      makeContentBlockStart(0, 'thinking'),
+      makeThinkingDelta(0, 'let me think'),
+      makeContentBlockStop(0),
+      makeContentBlockStart(1, 'text'),
+      makeTextDelta(1, 'answer'),
+      makeContentBlockStop(1),
+      makeMessageDelta('end_turn', 30),
+      makeMessageStop(),
+    ]
+
+    const { assistantMessages } = await runQueryModel(_nextEvents)
+
+    expect(assistantMessages).toHaveLength(1)
+  })
+
+  test('safety fallback path still yields message when stream ends without message_stop', async () => {
+    // Simulates a stream that cuts off without the normal termination sequence.
+    _nextEvents = [
+      makeMessageStart(),
+      makeContentBlockStart(0, 'text'),
+      makeTextDelta(0, 'abrupt end'),
+      // No content_block_stop, no message_delta, no message_stop
+    ]
+
+    const { assistantMessages } = await runQueryModel(_nextEvents)
+
+    expect(assistantMessages).toHaveLength(1)
+  })
+})
+
+describe('queryModelOpenAI — stream_events forwarded', () => {
+  test('every adapted event is also yielded as stream_event for real-time display', async () => {
+    _nextEvents = [
+      makeMessageStart(),
+      makeContentBlockStart(0, 'text'),
+      makeTextDelta(0, 'hello'),
+      makeContentBlockStop(0),
+      makeMessageDelta('end_turn', 5),
+      makeMessageStop(),
+    ]
+
+    const { streamEvents } = await runQueryModel(_nextEvents)
+
+    const eventTypes = streamEvents.map(e => (e as any).event?.type)
+    expect(eventTypes).toContain('message_start')
+    expect(eventTypes).toContain('content_block_start')
+    expect(eventTypes).toContain('content_block_delta')
+    expect(eventTypes).toContain('content_block_stop')
+    expect(eventTypes).toContain('message_delta')
+    expect(eventTypes).toContain('message_stop')
+  })
+})
+
+describe('queryModelOpenAI — max_tokens forwarded to request', () => {
+  test('buildOpenAIRequestBody includes max_tokens in the request payload', async () => {
+    _nextEvents = [
+      makeMessageStart(),
+      makeContentBlockStart(0, 'text'),
+      makeTextDelta(0, 'hi'),
+      makeContentBlockStop(0),
+      makeMessageDelta('end_turn', 5),
+      makeMessageStop(),
+    ]
+
+    await runQueryModel(_nextEvents)
+
+    expect(_lastCreateArgs).not.toBeNull()
+    expect(_lastCreateArgs!.max_tokens).toBe(8192)
+  })
+})
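Review note: the isolated tests above cover two fixes described in the file header: the `stop_reason` from `message_delta` must be applied to the assembled message, and the partial message must be cleared after `message_stop` so the end-of-stream fallback does not emit a duplicate. The sketch below reproduces that control flow on a reduced event model. The event and message types are simplified assumptions and `assemble` is a hypothetical function; it does not model the extra error message emitted on `max_tokens`.

```typescript
// Reduced event model for illustration; the real stream uses
// BetaRawMessageStreamEvent from the Anthropic SDK.
type StreamEvent =
  | { type: 'message_start'; id: string }
  | { type: 'text_delta'; text: string }
  | { type: 'message_delta'; stop_reason: string }
  | { type: 'message_stop' }

type Assembled = { id: string; text: string; stop_reason: string | null }

function assemble(events: StreamEvent[]): Assembled[] {
  const out: Assembled[] = []
  let partial: Assembled | null = null
  let stopReason: string | null = null

  for (const event of events) {
    switch (event.type) {
      case 'message_start':
        partial = { id: event.id, text: '', stop_reason: null }
        stopReason = null
        break
      case 'text_delta':
        if (partial) partial.text += event.text
        break
      case 'message_delta':
        // Fix 1: remember the stop reason instead of discarding it.
        stopReason = event.stop_reason
        break
      case 'message_stop':
        if (partial) {
          out.push({ ...partial, stop_reason: stopReason })
          // Fix 2: clear the partial so the fallback below cannot re-emit it.
          partial = null
        }
        break
    }
  }

  // Safety fallback: the stream ended without message_stop.
  if (partial) out.push(partial)
  return out
}
```

Without the reset in the `message_stop` branch, the fallback would push a second message with the same `id`, which is the duplicate the "exactly one AssistantMessage" tests guard against.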
--- a/src/services/api/openai/tests/queryModelOpenAI.test.ts
+++ b/src/services/api/openai/tests/queryModelOpenAI.test.ts
@@ -1,454 +1,40 @@
-/**
- * Tests for queryModelOpenAI in index.ts.
- *
- * Focused on the two bugs fixed:
- *  1. stop_reason was always null in the assembled AssistantMessage because
- *     partialMessage (from message_start) has stop_reason: null, and the
- *     stop_reason captured from message_delta was never applied.
- *  2. partialMessage was not reset to null after message_stop, so the safety
- *     fallback at the end of the loop would yield a second identical
- *     AssistantMessage (causing doubled content in the next API request).
- *
- * Strategy: mock getOpenAIClient + adaptOpenAIStreamToAnthropic so we can
- * feed pre-built Anthropic events directly into queryModelOpenAI and inspect
- * what it emits — without any real HTTP calls.
- */
-import { describe, expect, test, mock, beforeEach, afterEach } from 'bun:test'
-import type { BetaRawMessageStreamEvent } from '@anthropic-ai/sdk/resources/beta/messages/messages.mjs'
-import type { AssistantMessage, StreamEvent } from '../../../../types/message.js'
+import { describe, expect, test } from 'bun:test'
+import { fileURLToPath } from 'node:url'

-// ─── helpers ─────────────────────────────────────────────────────────────────
+const isolatedPath = fileURLToPath(
+  new URL('./queryModelOpenAI.isolated.ts', import.meta.url),
+)

-/** Build a minimal message_start event */
-function makeMessageStart(overrides: Record<string, any> = {}): BetaRawMessageStreamEvent {
-  return {
-    type: 'message_start',
-    message: {
-      id: 'msg_test',
-      type: 'message',
-      role: 'assistant',
-      content: [],
-      model: 'test-model',
-      stop_reason: null,
-      stop_sequence: null,
-      usage: { input_tokens: 0, output_tokens: 0, cache_creation_input_tokens: 0, cache_read_input_tokens: 0 },
-      ...overrides,
-    },
-  } as any
-}
+describe('queryModelOpenAI', () => {
+  test('passes in isolated subprocess', async () => {
+    const proc = Bun.spawn({
+      cmd: [process.execPath, 'test', isolatedPath],
+      cwd: process.cwd(),
+      stdout: 'pipe',
+      stderr: 'pipe',
+      env: process.env,
+    })

-/** Build a content_block_start event for the given block type */
-function makeContentBlockStart(index: number, type: 'text' | 'tool_use' | 'thinking', extra: Record<string, any> = {}): BetaRawMessageStreamEvent {
-  const block =
-    type === 'text'
-      ? { type: 'text', text: '' }
-      : type === 'tool_use'
-        ? { type: 'tool_use', id: 'toolu_test', name: 'bash', input: {} }
-        : { type: 'thinking', thinking: '', signature: '' }
-  return { type: 'content_block_start', index, content_block: { ...block, ...extra } } as any
-}
+    const [stdout, stderr, exitCode] = await Promise.all([
+      new Response(proc.stdout).text(),
+      new Response(proc.stderr).text(),
+      proc.exited,
+    ])

-/** Build a text_delta content_block_delta event */
-function makeTextDelta(index: number, text: string): BetaRawMessageStreamEvent {
-  return { type: 'content_block_delta', index, delta: { type: 'text_delta', text } } as any
-}
-
-/** Build an input_json_delta content_block_delta event */
-function makeInputJsonDelta(index: number, json: string): BetaRawMessageStreamEvent {
-  return { type: 'content_block_delta', index, delta: { type: 'input_json_delta', partial_json: json } } as any
-}
-
-/** Build a thinking_delta content_block_delta event */
-function makeThinkingDelta(index: number, thinking: string): BetaRawMessageStreamEvent {
-  return { type: 'content_block_delta', index, delta: { type: 'thinking_delta', thinking } } as any
-}
-
-/** Build a content_block_stop event */
-function makeContentBlockStop(index: number): BetaRawMessageStreamEvent {
-  return { type: 'content_block_stop', index } as any
-}
-
-/** Build a message_delta event with stop_reason and output_tokens */
-function makeMessageDelta(stopReason: string, outputTokens: number): BetaRawMessageStreamEvent {
-  return {
-    type: 'message_delta',
-    delta: { stop_reason: stopReason, stop_sequence: null },
-    usage: { output_tokens: outputTokens },
-  } as any
-}
-
-/** Build a message_stop event */
-function makeMessageStop(): BetaRawMessageStreamEvent {
-  return { type: 'message_stop' } as any
-}
-
-/** Async generator from a fixed array of events */
-async function* eventStream(events: BetaRawMessageStreamEvent[]) {
-  for (const e of events) yield e
-}
-
-/** Collect all outputs from queryModelOpenAI into typed buckets */
-async function runQueryModel(
-  events: BetaRawMessageStreamEvent[],
-  envOverrides: Record<string, string | undefined> = {},
-) {
-  // Wire events into the mocked stream adapter
-  _nextEvents = events
-  // Save + apply env overrides
-  const saved: Record<string, string | undefined> = {}
-  for (const [k, v] of Object.entries(envOverrides)) {
-    saved[k] = process.env[k]
-    if (v === undefined) delete process.env[k]
-    else process.env[k] = v
-  }
-
-  try {
-    // We inline mock.module inside the try block.
-    // Bun resolves mock.module at the call site synchronously (hoisted),
-    // so we register once per test file, then re-import each time.
-    const { queryModelOpenAI } = await import('../index.js')
-
-    const assistantMessages: AssistantMessage[] = []
-    const streamEvents: StreamEvent[] = []
-    const otherOutputs: any[] = []
-
-    const minimalOptions: any = {
-      model: 'test-model',
-      tools: [],
-      agents: [],
-      querySource: 'main_loop',
-      getToolPermissionContext: async () => ({
-        alwaysAllow: [],
-        alwaysDeny: [],
-        needsPermission: [],
-        mode: 'default',
-        isBypassingPermissions: false,
-      }),
+    if (exitCode !== 0) {
+      throw new Error(
+        [
+          `isolated queryModelOpenAI test failed with exit code ${exitCode}`,
+          '',
+          'STDOUT:',
+          stdout,
+          '',
+          'STDERR:',
+          stderr,
+        ].join('\n'),
+      )
    }

-    for await (const item of queryModelOpenAI(
-      [],
-      { type: 'text', text: '' } as any,
-      [],
-      new AbortController().signal,
-      minimalOptions,
-    )) {
-      if (item.type === 'assistant') {
-        assistantMessages.push(item as AssistantMessage)
-      } else if (item.type === 'stream_event') {
-        streamEvents.push(item as StreamEvent)
-      } else {
-        otherOutputs.push(item)
-      }
-    }
-
-    return { assistantMessages, streamEvents, otherOutputs }
-  } finally {
-    // Restore env
-    for (const [k, v] of Object.entries(saved)) {
-      if (v === undefined) delete process.env[k]
-      else process.env[k] = v
-    }
-  }
-}
-
-// ─── mock setup ──────────────────────────────────────────────────────────────
-
-// We mock at module level. Bun's mock.module replaces the module for the
-// entire file, so we configure the stream per-test via a shared variable.
-let _nextEvents: BetaRawMessageStreamEvent[] = []
-
-/** Captured arguments from the last chat.completions.create() call */
-let _lastCreateArgs: Record<string, any> | null = null
-
-mock.module('../client.js', () => ({
-  getOpenAIClient: () => ({
-    chat: {
-      completions: {
-        create: async (args: Record<string, any>) => {
-          _lastCreateArgs = args
-          return { [Symbol.asyncIterator]: async function* () {} }
-        },
-      },
-    },
-  }),
-}))
-
-mock.module('../streamAdapter.js', () => ({
-  adaptOpenAIStreamToAnthropic: (_stream: any, _model: string) => eventStream(_nextEvents),
-}))
-
-mock.module('../modelMapping.js', () => ({
-  resolveOpenAIModel: (m: string) => m,
-}))
-
-mock.module('../convertMessages.js', () => ({
-  anthropicMessagesToOpenAI: () => [],
-}))
-
-mock.module('../convertTools.js', () => ({
-  anthropicToolsToOpenAI: () => [],
-  anthropicToolChoiceToOpenAI: () => undefined,
-}))
-
-mock.module('../../../../utils/context.js', () => ({
-  getModelMaxOutputTokens: () => ({ upperLimit: 8192, default: 8192 }),
-  getContextWindowForModel: () => 200_000,
-}))
-
-mock.module('../../../../utils/messages.js', () => ({
-  normalizeMessagesForAPI: (msgs: any) => msgs,
-  normalizeContentFromAPI: (blocks: any[]) => blocks,
-  createAssistantAPIErrorMessage: (opts: any) => ({
-    type: 'assistant',
-    message: { content: [{ type: 'text', text: opts.content }], apiError: opts.apiError },
-    uuid: 'error-uuid',
-    timestamp: new Date().toISOString(),
-  }),
-}))
-
-mock.module('../../../../utils/api.js', () => ({
-  toolToAPISchema: async (t: any) => t,
-}))
-
-mock.module('../../../../utils/toolSearch.js', () => ({
-  isToolSearchEnabled: async () => false,
-  extractDiscoveredToolNames: () => new Set(),
-}))
-
-mock.module('../../../../tools/ToolSearchTool/prompt.js', () => ({
-  isDeferredTool: () => false,
-  TOOL_SEARCH_TOOL_NAME: '__tool_search__',
-}))
-
-mock.module('../../../../cost-tracker.js', () => ({
-  addToTotalSessionCost: () => {},
-}))
-
-mock.module('../../../../utils/modelCost.js', () => ({
-  calculateUSDCost: () => 0,
-}))
-
-mock.module('../../../../utils/debug.js', () => ({
-  logForDebugging: () => {},
-}))
-
-// ─── tests ───────────────────────────────────────────────────────────────────
-
-describe('queryModelOpenAI — stop_reason propagation', () => {
-  test('assembled AssistantMessage has stop_reason end_turn (not null)', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'Hello'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 10),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-    expect(assistantMessages[0]!.message.stop_reason).toBe('end_turn')
-  })
-
-  test('assembled AssistantMessage has stop_reason tool_use', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'tool_use'),
-      makeInputJsonDelta(0, '{"cmd":"ls"}'),
-      makeContentBlockStop(0),
-      makeMessageDelta('tool_use', 20),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-    expect(assistantMessages[0]!.message.stop_reason).toBe('tool_use')
-  })
-
-  test('assembled AssistantMessage has stop_reason max_tokens', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'truncated'),
-      makeContentBlockStop(0),
-      makeMessageDelta('max_tokens', 8192),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    // Two assistant-typed items: the content message + the max_output_tokens error signal.
-    // The error signal is emitted as a synthetic assistant message by createAssistantAPIErrorMessage.
-    expect(assistantMessages).toHaveLength(2)
-    const contentMsg = assistantMessages[0]!
-    expect(contentMsg.message.stop_reason).toBe('max_tokens')
-    // Second item is the error signal (has apiError set)
-    const errorMsg = assistantMessages[1]!.message as any
-    expect(errorMsg.apiError).toBe('max_output_tokens')
-  })
-
-  test('stop_reason is null when no message_delta was received (safety fallback path)', async () => {
-    // Stream ends without message_stop — triggers the safety fallback branch.
-    // stop_reason stays null since no message_delta was ever seen.
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'partial'),
-      makeContentBlockStop(0),
-      // No message_delta / message_stop
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    // Safety fallback should yield the partial content
-    expect(assistantMessages).toHaveLength(1)
-    expect(assistantMessages[0]!.message.stop_reason).toBeNull()
-  })
-})
-
-describe('queryModelOpenAI — usage accumulation', () => {
-  test('usage in assembled message reflects all four fields from message_delta', async () => {
-    // message_start has all fields=0 (trailing-chunk pattern: usage not yet available).
-    // message_delta carries the real values after stream ends.
-    // The spread in the message_delta handler must override all zeros from message_start,
-    // including cache_read_input_tokens which was previously missing from message_delta.
-    _nextEvents = [
-      makeMessageStart({ usage: { input_tokens: 0, output_tokens: 0, cache_creation_input_tokens: 0, cache_read_input_tokens: 0 } }),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'response'),
-      makeContentBlockStop(0),
-      // message_delta carries all four Anthropic usage fields (as emitted by the fixed streamAdapter)
-      {
-        type: 'message_delta',
-        delta: { stop_reason: 'end_turn', stop_sequence: null },
-        usage: { input_tokens: 30011, output_tokens: 190, cache_read_input_tokens: 19904, cache_creation_input_tokens: 0 },
-      } as any,
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-    const usage = assistantMessages[0]!.message.usage as any
-    expect(usage.input_tokens).toBe(30011)
-    expect(usage.output_tokens).toBe(190)
-    // cache_read_input_tokens from message_delta overrides the 0 from message_start
-    expect(usage.cache_read_input_tokens).toBe(19904)
-    expect(usage.cache_creation_input_tokens).toBe(0)
-  })
-
-  test('usage is zero when no usage events arrive (prevents false autocompact)', async () => {
-    // If usage stays 0, tokenCountWithEstimation will undercount — so at least
-    // verify the field exists and is numeric (to detect regressions).
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'hi'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 0),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    const usage = assistantMessages[0]!.message.usage as any
-    expect(typeof usage.input_tokens).toBe('number')
-    expect(typeof usage.output_tokens).toBe('number')
-  })
-})
-
-describe('queryModelOpenAI — no duplicate AssistantMessage (partialMessage reset)', () => {
-  test('yields exactly one AssistantMessage per message_stop when content is present', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'only once'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 5),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    // Before the fix, partialMessage was not reset to null, so the safety
-    // fallback at the end of the loop would yield a second message with the
-    // same message.id — causing mergeAssistantMessages to concatenate content.
-    expect(assistantMessages).toHaveLength(1)
-  })
-
-  test('thinking + text response yields exactly one AssistantMessage', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'thinking'),
-      makeThinkingDelta(0, 'let me think'),
-      makeContentBlockStop(0),
-      makeContentBlockStart(1, 'text'),
-      makeTextDelta(1, 'answer'),
-      makeContentBlockStop(1),
-      makeMessageDelta('end_turn', 30),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-  })
-
-  test('safety fallback path still yields message when stream ends without message_stop', async () => {
-    // Simulates a stream that cuts off without the normal termination sequence.
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'abrupt end'),
-      // No content_block_stop, no message_delta, no message_stop
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-  })
-})
-
-describe('queryModelOpenAI — stream_events forwarded', () => {
-  test('every adapted event is also yielded as stream_event for real-time display', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'hello'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 5),
-      makeMessageStop(),
-    ]
-
-    const { streamEvents } = await runQueryModel(_nextEvents)
-
-    const eventTypes = streamEvents.map(e => (e as any).event?.type)
-    expect(eventTypes).toContain('message_start')
-    expect(eventTypes).toContain('content_block_start')
-    expect(eventTypes).toContain('content_block_delta')
-    expect(eventTypes).toContain('content_block_stop')
-    expect(eventTypes).toContain('message_delta')
-    expect(eventTypes).toContain('message_stop')
-  })
-})
-
-describe('queryModelOpenAI — max_tokens forwarded to request', () => {
-  test('buildOpenAIRequestBody includes max_tokens in the request payload', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'hi'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 5),
-      makeMessageStop(),
-    ]
-
-    await runQueryModel(_nextEvents)
-
-    expect(_lastCreateArgs).not.toBeNull()
-    expect(_lastCreateArgs!.max_tokens).toBe(8192)
+    expect(exitCode).toBe(0)
  })
 })
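Review note: the rewritten test above runs the mocked suite in a child process and, on a non-zero exit code, throws an error that carries the child's stdout and stderr. The reporting step can be sketched as a pure helper, which makes it checkable without spawning anything. The names `IsolatedRunResult`, `formatIsolatedFailure`, and `assertIsolatedSuccess` are hypothetical and not part of this change.

```typescript
/** Output captured from a finished child process (hypothetical shape). */
interface IsolatedRunResult {
  stdout: string
  stderr: string
  exitCode: number
}

/** Build the same multi-line failure report that the test above throws. */
function formatIsolatedFailure(label: string, result: IsolatedRunResult): string {
  return [
    `${label} failed with exit code ${result.exitCode}`,
    '',
    'STDOUT:',
    result.stdout,
    '',
    'STDERR:',
    result.stderr,
  ].join('\n')
}

/** Throw with the full child output unless the child exited cleanly. */
function assertIsolatedSuccess(label: string, result: IsolatedRunResult): void {
  if (result.exitCode !== 0) {
    throw new Error(formatIsolatedFailure(label, result))
  }
}
```

Keeping the formatting separate from the spawn call would let the same report be reused by other isolated suites, assuming they capture output the same way.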
--- a/src/services/api/openai/tests/streamAdapter.test.ts
+++ b/src/services/api/openai/tests/streamAdapter.test.ts
@@ -1,6 +1,21 @@
 import { describe, expect, test } from 'bun:test'
-import { adaptOpenAIStreamToAnthropic } from '../streamAdapter.js'
 import type { ChatCompletionChunk } from 'openai/resources/chat/completions/completions.mjs'
+import { join, dirname } from 'path'
+import { fileURLToPath } from 'url'
+import { readFileSync, writeFileSync, mkdirSync } from 'fs'
+import { tmpdir } from 'os'
+
+// Guard against mock pollution from queryModelOpenAI.test.ts which replaces
+// ../streamAdapter.js process-wide via mock.module (bun has no un-mock API).
+// We copy the source to a unique temp path so the import bypasses bun's
+// module mock cache completely.
+const _testDir = dirname(fileURLToPath(import.meta.url))
+const _realSource = readFileSync(join(_testDir, '..', 'streamAdapter.ts'), 'utf-8')
+const _tempDir = join(tmpdir(), `stream-adapter-test-${Date.now()}`)
+mkdirSync(_tempDir, { recursive: true })
+const _tempFile = join(_tempDir, 'streamAdapter.ts')
+writeFileSync(_tempFile, _realSource, 'utf-8')
+const { adaptOpenAIStreamToAnthropic } = await import(_tempFile)

 /** Helper to create a mock async iterable from chunk array */
 function mockStream(chunks: ChatCompletionChunk[]): AsyncIterable<ChatCompletionChunk> {
@@ -31,6 +46,11 @@ function makeChunk(overrides: Partial<ChatCompletionChunk> & any = {}): ChatComp

 /** Collect all emitted Anthropic events from the stream adapter for assertion */
 async function collectEvents(chunks: ChatCompletionChunk[]) {
+  const realModuleUrl = new URL(
+    `../streamAdapter.js?real=${Date.now()}-${Math.random().toString(36).slice(2)}`,
+    import.meta.url,
+  ).href
+  const { adaptOpenAIStreamToAnthropic } = await import(realModuleUrl)
  const events: any[] = []
  for await (const event of adaptOpenAIStreamToAnthropic(mockStream(chunks), 'gpt-4o')) {
    events.push(event)
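Review note: `collectEvents` above re-imports the adapter through a URL with a unique `real=` query string so that each call resolves a fresh module specifier. A minimal sketch of that URL construction follows. `uniqueModuleUrl` and the counter are hypothetical additions for illustration, and whether a query string actually bypasses a runtime's module mock registry is an assumption that should be verified against the runtime in use.

```typescript
// Monotonic counter so two calls within the same millisecond still differ.
let importCounter = 0

/**
 * Resolve `specifier` against `baseUrl` and append a unique query string.
 * The `real` parameter name mirrors the diff; the counter replaces Math.random.
 */
function uniqueModuleUrl(specifier: string, baseUrl: string): string {
  const url = new URL(specifier, baseUrl)
  url.searchParams.set('real', `${Date.now()}-${importCounter++}`)
  return url.href
}
```

In a test file this would be used as `await import(uniqueModuleUrl('../streamAdapter.js', import.meta.url))`.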
--- a/src/services/awaySummary.ts
+++ b/src/services/awaySummary.ts
@@ -8,6 +8,7 @@ import {
 } from '../utils/messages.js'
 import { getSmallFastModel } from '../utils/model/model.js'
 import { asSystemPrompt } from '../utils/systemPromptType.js'
+import { getResolvedLanguage } from '../utils/language.js'
 import { queryModelWithoutStreaming } from './api/claude.js'
 import { getSessionMemoryContent } from './SessionMemory/sessionMemoryUtils.js'

@@ -15,11 +16,18 @@ import { getSessionMemoryContent } from './SessionMemory/sessionMemoryUtils.js'
 // large sessions. 30 messages ≈ ~15 exchanges, plenty for "where we left off."
 const RECENT_MESSAGE_WINDOW = 30

+const PROMPT_EN =
+  'The user stepped away and is coming back. Write exactly 1-3 short sentences. Start by stating the high-level task — what they are building or debugging, not implementation details. Next: the concrete next step. Skip status reports and commit recaps.'
+
+const PROMPT_ZH =
+  '用户离开后回来了。用中文写 1-3 句话。先说明用户在做什么（高层目标，不是实现细节），然后说明下一步具体操作。不要写状态报告或提交总结。'
+
 function buildAwaySummaryPrompt(memory: string | null): string {
  const memoryBlock = memory
    ? `Session memory (broader context):\n${memory}\n\n`
    : ''
-  return `${memoryBlock}The user stepped away and is coming back. Write exactly 1-3 short sentences. Start by stating the high-level task — what they are building or debugging, not implementation details. Next: the concrete next step. Skip status reports and commit recaps.`
+  const prompt = getResolvedLanguage() === 'zh' ? PROMPT_ZH : PROMPT_EN
+  return `${memoryBlock}${prompt}`
 }

 /**
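Review note: `buildAwaySummaryPrompt` now picks a prompt by calling `getResolvedLanguage()` internally. A sketch of the same selection with the language injected as a parameter is below, which would make the branch testable without touching global settings. `buildPrompt` and `PROMPTS` are hypothetical names, the prompt strings are abbreviated, and the assumption that the resolved language is a plain string such as `'zh'` comes only from the comparison in the diff.

```typescript
type PromptLanguage = 'en' | 'zh'

// Abbreviated stand-ins for PROMPT_EN / PROMPT_ZH from the diff.
const PROMPTS: Record<PromptLanguage, string> = {
  en: 'The user stepped away and is coming back. Write exactly 1-3 short sentences.',
  zh: '用户离开后回来了。用中文写 1-3 句话。',
}

/** Prepend optional session memory, then choose the prompt by language. */
function buildPrompt(memory: string | null, language: string): string {
  const memoryBlock = memory
    ? `Session memory (broader context):\n${memory}\n\n`
    : ''
  // Any language other than 'zh' falls back to English, as in the diff.
  const prompt = language === 'zh' ? PROMPTS.zh : PROMPTS.en
  return `${memoryBlock}${prompt}`
}
```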
--- a/src/services/langfuse/tests/langfuse.isolated.ts
+++ b/src/services/langfuse/tests/langfuse.isolated.ts
@@ -0,0 +1,569 @@
+import { mock, describe, test, expect, beforeEach } from 'bun:test'
+
+// Mock @langfuse/otel before any imports
+const mockForceFlush = mock(() => Promise.resolve())
+const mockShutdown = mock(() => Promise.resolve())
+
+mock.module('@langfuse/otel', () => ({
+  LangfuseSpanProcessor: class MockLangfuseSpanProcessor {
+    forceFlush = mockForceFlush
+    shutdown = mockShutdown
+    onStart = mock(() => {})
+    onEnd = mock(() => {})
+  },
+}))
+
+// Mock @opentelemetry/sdk-trace-base
+mock.module('@opentelemetry/sdk-trace-base', () => ({
+  BasicTracerProvider: class MockBasicTracerProvider {
+    constructor(_opts?: unknown) {}
+  },
+}))
+
+// Mock @langfuse/tracing
+const mockChildUpdate = mock(() => {})
+const mockChildEnd = mock(() => {})
+const mockRootUpdate = mock(() => {})
+const mockRootEnd = mock(() => {})
+
+// Mock LangfuseOtelSpanAttributes (re-exported from @langfuse/core)
+const mockLangfuseOtelSpanAttributes: Record<string, string> = {
+  TRACE_SESSION_ID: 'session.id',
+  OBSERVATION_TYPE: 'observation.type',
+  OBSERVATION_INPUT: 'observation.input',
+  OBSERVATION_OUTPUT: 'observation.output',
+  OBSERVATION_MODEL: 'observation.model',
+  OBSERVATION_COMPLETION_START_TIME: 'observation.completionStartTime',
+  OBSERVATION_USAGE_DETAILS: 'observation.usageDetails',
+}
+
+const mockSpanContext = { traceId: 'test-trace-id', spanId: 'test-span-id', traceFlags: 1 }
+const mockSetAttribute = mock(() => {})
+
+// Child observation mock (returned by rootSpan.startObservation for tools)
+const mockChildStartObservation = mock(() => ({
+  id: 'child-id',
+  update: mockChildUpdate,
+  end: mockChildEnd,
+}))
+
+const mockStartObservation = mock(() => ({
+  id: 'test-span-id',
+  traceId: 'test-trace-id',
+  type: 'span',
+  otelSpan: {
+    spanContext: () => mockSpanContext,
+    setAttribute: mockSetAttribute,
+  },
+  update: mockRootUpdate,
+  end: mockRootEnd,
+  // Instance method — used by recordToolObservation
+  startObservation: mockChildStartObservation,
+}))
+const mockSetLangfuseTracerProvider = mock(() => {})
+
+mock.module('@langfuse/tracing', () => ({
+  startObservation: mockStartObservation,
+  LangfuseOtelSpanAttributes: mockLangfuseOtelSpanAttributes,
+  propagateAttributes: mock((_params: unknown, fn?: () => void) => fn?.()),
+  setLangfuseTracerProvider: mockSetLangfuseTracerProvider,
+}))
+
+// Mock debug logger
+mock.module('src/utils/debug.js', () => ({
+  logForDebugging: mock(() => {}),
+}))
+
+describe('Langfuse integration', () => {
+  beforeEach(() => {
+    // Reset env
+    process.env.HOME = '/Users/testuser'
+    delete process.env.LANGFUSE_PUBLIC_KEY
+    delete process.env.LANGFUSE_SECRET_KEY
+    delete process.env.LANGFUSE_BASE_URL
+    mockStartObservation.mockClear()
+    mockChildStartObservation.mockClear()
+    mockChildUpdate.mockClear()
+    mockChildEnd.mockClear()
+    mockRootUpdate.mockClear()
+    mockRootEnd.mockClear()
+    mockForceFlush.mockClear()
+    mockShutdown.mockClear()
+    mockSetAttribute.mockClear()
+  })
+
+  // ── sanitize tests ──────────────────────────────────────────────────────────
+
+  describe('sanitizeToolInput', () => {
+    test('replaces home dir in file_path', async () => {
+      const { sanitizeToolInput } = await import('../sanitize.js')
+      const home = process.env.HOME ?? '/Users/testuser'
+      const result = sanitizeToolInput('FileReadTool', { file_path: `${home}/project/file.ts` }) as Record<string, string>
+      expect(result.file_path).toBe('~/project/file.ts')
+    })
+
+    test('redacts sensitive keys', async () => {
+      const { sanitizeToolInput } = await import('../sanitize.js')
+      const result = sanitizeToolInput('MCPTool', { api_key: 'secret123', token: 'abc' }) as Record<string, string>
+      expect(result.api_key).toBe('[REDACTED]')
+      expect(result.token).toBe('[REDACTED]')
+    })
+
+    test('returns non-object input unchanged', async () => {
+      const { sanitizeToolInput } = await import('../sanitize.js')
+      expect(sanitizeToolInput('BashTool', 'raw string')).toBe('raw string')
+      expect(sanitizeToolInput('BashTool', null)).toBe(null)
+    })
+  })
+
+  describe('sanitizeToolOutput', () => {
+    test('redacts FileReadTool output', async () => {
+      const { sanitizeToolOutput } = await import('../sanitize.js')
+      const result = sanitizeToolOutput('FileReadTool', 'file content here')
+      expect(result).toBe('[file content redacted, 17 chars]')
+    })
+
+    test('redacts FileWriteTool output', async () => {
+      const { sanitizeToolOutput } = await import('../sanitize.js')
+      const result = sanitizeToolOutput('FileWriteTool', 'written content')
+      expect(result).toBe('[file content redacted, 15 chars]')
+    })
+
+    test('truncates BashTool output over 500 chars', async () => {
+      const { sanitizeToolOutput } = await import('../sanitize.js')
+      const longOutput = 'x'.repeat(600)
+      const result = sanitizeToolOutput('BashTool', longOutput)
+      expect(result).toContain('[truncated]')
+      expect(result.length).toBeLessThan(600)
+    })
+
+    test('does not truncate BashTool output under 500 chars', async () => {
+      const { sanitizeToolOutput } = await import('../sanitize.js')
+      const shortOutput = 'hello world'
+      expect(sanitizeToolOutput('BashTool', shortOutput)).toBe('hello world')
+    })
+
+    test('redacts ConfigTool output', async () => {
+      const { sanitizeToolOutput } = await import('../sanitize.js')
+      const result = sanitizeToolOutput('ConfigTool', 'config data')
+      expect(result).toBe('[ConfigTool output redacted, 11 chars]')
+    })
+
+    test('redacts MCPTool output', async () => {
+      const { sanitizeToolOutput } = await import('../sanitize.js')
+      const result = sanitizeToolOutput('MCPTool', 'mcp data')
+      expect(result).toBe('[MCPTool output redacted, 8 chars]')
+    })
+  })
+
+  describe('sanitizeGlobal', () => {
+    test('replaces home dir in strings', async () => {
+      const { sanitizeGlobal } = await import('../sanitize.js')
+      const home = process.env.HOME ?? '/Users/testuser'
+      expect(sanitizeGlobal(`path: ${home}/file`)).toBe('path: ~/file')
+    })
+
+    test('recursively sanitizes nested objects', async () => {
+      const { sanitizeGlobal } = await import('../sanitize.js')
+      const result = sanitizeGlobal({ nested: { api_key: 'secret', name: 'test' } }) as Record<string, Record<string, string>>
+      expect(result.nested.api_key).toBe('[REDACTED]')
+      expect(result.nested.name).toBe('test')
+    })
+
+    test('returns non-string/object values unchanged', async () => {
+      const { sanitizeGlobal } = await import('../sanitize.js')
+      expect(sanitizeGlobal(42)).toBe(42)
+      expect(sanitizeGlobal(true)).toBe(true)
+    })
+  })
+
+  // ── client tests ────────────────────────────────────────────────────────────
+
+  describe('isLangfuseEnabled', () => {
+    test('returns false when keys not configured', async () => {
+      const { isLangfuseEnabled } = await import('../client.js')
+      expect(isLangfuseEnabled()).toBe(false)
+    })
+
+    test('returns true when both keys are set', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      const { isLangfuseEnabled } = await import('../client.js')
+      expect(isLangfuseEnabled()).toBe(true)
+    })
+  })
+
+  describe('initLangfuse', () => {
+    test('returns false when keys not configured', async () => {
+      const { initLangfuse } = await import('../client.js')
+      expect(initLangfuse()).toBe(false)
+    })
+
+    test('reports enabled when keys are configured (checked via isLangfuseEnabled)', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      // client.js is a singleton — test via isLangfuseEnabled which reads env directly
+      const { isLangfuseEnabled } = await import('../client.js')
+      expect(isLangfuseEnabled()).toBe(true)
+    })
+
+    test('is idempotent — multiple calls do not re-initialize', async () => {
+      // client.js singleton: once processor is set, initLangfuse returns true immediately
+      // We verify this by checking that calling it multiple times doesn't throw
+      const { initLangfuse } = await import('../client.js')
+      expect(() => { initLangfuse(); initLangfuse() }).not.toThrow()
+    })
+  })
+
+  describe('shutdownLangfuse', () => {
+    test('resolves without error when no processor is set', async () => {
+      // Verify shutdown is callable without error even when no processor is set
+      const { shutdownLangfuse } = await import('../client.js')
+      await expect(shutdownLangfuse()).resolves.toBeUndefined()
+    })
+  })
+
+  // ── tracing tests ───────────────────────────────────────────────────────────
+
+  describe('createTrace', () => {
+    test('returns null when langfuse not enabled', async () => {
+      const { createTrace } = await import('../tracing.js')
+      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
+      expect(span).toBeNull()
+    })
+
+    test('creates root span when enabled', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      const { createTrace } = await import('../tracing.js')
+      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty', input: [] })
+      expect(span).not.toBeNull()
+      expect(mockStartObservation).toHaveBeenCalledWith('agent-run', expect.objectContaining({
+        metadata: expect.objectContaining({ provider: 'firstParty', model: 'claude-3' }),
+      }), { asType: 'agent' })
+    })
+  })
+
+  describe('recordLLMObservation', () => {
+    test('no-ops when rootSpan is null', async () => {
+      const { recordLLMObservation } = await import('../tracing.js')
+      recordLLMObservation(null, { model: 'm', provider: 'firstParty', input: [], output: [], usage: { input_tokens: 10, output_tokens: 5 } })
+      expect(mockStartObservation).toHaveBeenCalledTimes(0)
+    })
+
+    test('records generation child observation via global startObservation', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      const { createTrace, recordLLMObservation } = await import('../tracing.js')
+      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
+      mockStartObservation.mockClear()
+      recordLLMObservation(span, {
+        model: 'claude-3',
+        provider: 'firstParty',
+        input: [{ role: 'user', content: 'hello' }],
+        output: [{ role: 'assistant', content: 'hi' }],
+        usage: { input_tokens: 10, output_tokens: 5 },
+      })
+      // Should call the global startObservation with asType: 'generation' and parentSpanContext
+      expect(mockStartObservation).toHaveBeenCalledWith('ChatAnthropic', expect.objectContaining({
+        model: 'claude-3',
+      }), expect.objectContaining({
+        asType: 'generation',
+        parentSpanContext: mockSpanContext,
+      }))
+      expect(mockRootUpdate).toHaveBeenCalledWith(expect.objectContaining({
+        usageDetails: { input: 10, output: 5 },
+      }))
+      expect(mockRootEnd).toHaveBeenCalled()
+    })
+  })
+
+  describe('recordToolObservation', () => {
+    test('no-ops when rootSpan is null', async () => {
+      const { recordToolObservation } = await import('../tracing.js')
+      recordToolObservation(null, { toolName: 'BashTool', toolUseId: 'id1', input: {}, output: 'out' })
+      expect(mockStartObservation).not.toHaveBeenCalled()
+    })
+
+    test('records tool child observation via global startObservation', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      const { createTrace, recordToolObservation } = await import('../tracing.js')
+      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
+      mockStartObservation.mockClear()
+      mockRootUpdate.mockClear()
+      mockRootEnd.mockClear()
+      recordToolObservation(span, {
+        toolName: 'BashTool',
+        toolUseId: 'tu-1',
+        input: { command: 'ls' },
+        output: 'file.ts',
+      })
+      // Should call the global startObservation with asType: 'tool' and parentSpanContext
+      expect(mockStartObservation).toHaveBeenCalledWith('BashTool', expect.objectContaining({
+        input: expect.any(Object),
+      }), expect.objectContaining({
+        asType: 'tool',
+        parentSpanContext: mockSpanContext,
+      }))
+      expect(mockRootUpdate).toHaveBeenCalled()
+      expect(mockRootEnd).toHaveBeenCalled()
+    })
+
+    test('passes startTime to global startObservation', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      const { createTrace, recordToolObservation } = await import('../tracing.js')
+      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
+      mockStartObservation.mockClear()
+      const startTime = new Date('2026-01-01T00:00:00Z')
+      recordToolObservation(span, {
+        toolName: 'BashTool',
+        toolUseId: 'tu-2',
+        input: {},
+        output: 'out',
+        startTime,
+      })
+      expect(mockStartObservation).toHaveBeenCalledWith('BashTool', expect.any(Object), expect.objectContaining({
+        startTime,
+        parentSpanContext: mockSpanContext,
+      }))
+    })
+
+    test('sanitizes FileReadTool output', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      const { createTrace, recordToolObservation } = await import('../tracing.js')
+      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
+      mockRootUpdate.mockClear()
+      recordToolObservation(span, {
+        toolName: 'FileReadTool',
+        toolUseId: 'tu-2',
+        input: { file_path: '/tmp/file.ts' },
+        output: 'file content here',
+      })
+      expect(mockRootUpdate).toHaveBeenCalledWith(expect.objectContaining({
+        output: '[file content redacted, 17 chars]',
+      }))
+    })
+
+    test('sets ERROR level for error observations', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      const { createTrace, recordToolObservation } = await import('../tracing.js')
+      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
+      mockRootUpdate.mockClear()
+      recordToolObservation(span, {
+        toolName: 'BashTool',
+        toolUseId: 'tu-3',
+        input: {},
+        output: 'error occurred',
+        isError: true,
+      })
+      expect(mockRootUpdate).toHaveBeenCalledWith(expect.objectContaining({ level: 'ERROR' }))
+    })
+  })
+
+  describe('endTrace', () => {
+    test('no-ops when rootSpan is null', async () => {
+      const { endTrace } = await import('../tracing.js')
+      endTrace(null)
+      expect(mockRootEnd).not.toHaveBeenCalled()
+    })
+
+    test('calls span.end()', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      const { createTrace, endTrace } = await import('../tracing.js')
+      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
+      endTrace(span)
+      expect(mockRootEnd).toHaveBeenCalled()
+    })
+
+    test('calls span.update() with output when provided', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      const { createTrace, endTrace } = await import('../tracing.js')
+      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
+      endTrace(span, 'final output')
+      expect(mockRootUpdate).toHaveBeenCalledWith({ output: 'final output' })
+      expect(mockRootEnd).toHaveBeenCalled()
+    })
+  })
+
+  describe('createSubagentTrace', () => {
+    test('returns null when langfuse not enabled', async () => {
+      const { createSubagentTrace } = await import('../tracing.js')
+      const span = createSubagentTrace({
+        sessionId: 's1',
+        agentType: 'Explore',
+        agentId: 'agent-1',
+        model: 'claude-3',
+        provider: 'firstParty',
+      })
+      expect(span).toBeNull()
+    })
+
+    test('creates trace with agentType and agentId metadata', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      const { createSubagentTrace } = await import('../tracing.js')
+      const span = createSubagentTrace({
+        sessionId: 's1',
+        agentType: 'Explore',
+        agentId: 'agent-1',
+        model: 'claude-3',
+        provider: 'firstParty',
+        input: [{ role: 'user', content: 'search for X' }],
+      })
+      expect(span).not.toBeNull()
+      expect(mockStartObservation).toHaveBeenCalledWith('agent:Explore', expect.objectContaining({
+        metadata: expect.objectContaining({
+          agentType: 'Explore',
+          agentId: 'agent-1',
+          provider: 'firstParty',
+          model: 'claude-3',
+        }),
+      }), { asType: 'agent' })
+      // Verify session.id attribute is set
+      expect(mockSetAttribute).toHaveBeenCalledWith('session.id', 's1')
+    })
+
+    test('returns null on SDK error', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      mockStartObservation.mockImplementationOnce(() => { throw new Error('SDK error') })
+      const { createSubagentTrace } = await import('../tracing.js')
+      const span = createSubagentTrace({
+        sessionId: 's1',
+        agentType: 'Plan',
+        agentId: 'agent-2',
+        model: 'claude-3',
+        provider: 'firstParty',
+      })
+      expect(span).toBeNull()
+    })
+  })
+
+  describe('createTrace with querySource', () => {
+    test('includes querySource in metadata', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      const { createTrace } = await import('../tracing.js')
+      const span = createTrace({
+        sessionId: 's1',
+        model: 'claude-3',
+        provider: 'firstParty',
+        querySource: 'user',
+      })
+      expect(span).not.toBeNull()
+      expect(mockStartObservation).toHaveBeenCalledWith('agent-run:user', expect.objectContaining({
+        metadata: expect.objectContaining({
+          agentType: 'main',
+          querySource: 'user',
+        }),
+      }), { asType: 'agent' })
+    })
+
+    test('omits querySource when not provided', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      mockStartObservation.mockClear()
+      const { createTrace } = await import('../tracing.js')
+      createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
+      const calls = mockStartObservation.mock.calls as unknown[][]
+      const secondArg = calls[0]?.[1] as Record<string, unknown> | undefined
+      const metadata = (secondArg?.metadata ?? {}) as Record<string, unknown>
+      expect(metadata).not.toHaveProperty('querySource')
+    })
+  })
+
+  describe('nested agent scenario', () => {
+    test('sub-agent trace shares sessionId with parent', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      const { createTrace, createSubagentTrace } = await import('../tracing.js')
+      mockSetAttribute.mockClear()
+
+      // Create parent trace
+      const parentSpan = createTrace({
+        sessionId: 'shared-session',
+        model: 'claude-3',
+        provider: 'firstParty',
+      })
+
+      // Create sub-agent trace with same sessionId
+      const subSpan = createSubagentTrace({
+        sessionId: 'shared-session',
+        agentType: 'Explore',
+        agentId: 'agent-explore-1',
+        model: 'claude-3',
+        provider: 'firstParty',
+      })
+
+      expect(parentSpan).not.toBeNull()
+      expect(subSpan).not.toBeNull()
+
+      // Both should have set session.id attribute
+      const sessionAttributeCalls = mockSetAttribute.mock.calls.filter(
+        (call: unknown[]) => Array.isArray(call) && call[0] === 'session.id' && call[1] === 'shared-session',
+      )
+      expect(sessionAttributeCalls.length).toBeGreaterThanOrEqual(2)
+    })
+
+    test('query reuses passed langfuseTrace instead of creating new one', async () => {
+      // This mirrors the pattern used in query.ts (query.ts itself is not invoked here):
+      //   const ownsTrace = !params.toolUseContext.langfuseTrace
+      //   const langfuseTrace = params.toolUseContext.langfuseTrace ?? createTrace(...)
+      // When langfuseTrace is already set, createTrace should NOT be called
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      const { createSubagentTrace } = await import('../tracing.js')
+
+      // Simulate what runAgent does: create subTrace, then pass it as langfuseTrace
+      const subTrace = createSubagentTrace({
+        sessionId: 's1',
+        agentType: 'Explore',
+        agentId: 'agent-1',
+        model: 'claude-3',
+        provider: 'firstParty',
+      })
+      expect(subTrace).not.toBeNull()
+
+      // Simulate query.ts logic: if langfuseTrace already set, don't create new one
+      const ownsTrace = false  // Would be: !params.toolUseContext.langfuseTrace
+      const langfuseTrace = subTrace  // Would be: params.toolUseContext.langfuseTrace ?? createTrace(...)
+
+      expect(ownsTrace).toBe(false)
+      expect(langfuseTrace).toBe(subTrace)
+    })
+  })
+
+  describe('SDK exceptions do not affect main flow', () => {
+    test('createTrace returns null on SDK error', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      mockStartObservation.mockImplementationOnce(() => { throw new Error('SDK error') })
+      const { createTrace } = await import('../tracing.js')
+      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
+      expect(span).toBeNull()
+    })
+
+    test('recordLLMObservation silently fails on SDK error', async () => {
+      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
+      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
+      const { createTrace, recordLLMObservation } = await import('../tracing.js')
+      // createTrace must succeed so span is non-null; a null span makes recordLLMObservation a no-op
+      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
+      // The next call to startObservation (for the generation) will throw
+      mockStartObservation.mockImplementationOnce(() => { throw new Error('SDK error') })
+      expect(() => recordLLMObservation(span, {
+        model: 'm',
+        provider: 'firstParty',
+        input: [],
+        output: [],
+        usage: { input_tokens: 1, output_tokens: 1 },
+      })).not.toThrow()
+    })
+  })
+})
--- a/src/services/langfuse/tests/langfuse.test.ts
+++ b/src/services/langfuse/tests/langfuse.test.ts
@@ -1,683 +1,40 @@
-import { mock, describe, test, expect, beforeEach } from 'bun:test'
+import { describe, expect, test } from 'bun:test'
+import { fileURLToPath } from 'node:url'

-// Mock @langfuse/otel before any imports
-const mockForceFlush = mock(() => Promise.resolve())
-const mockShutdown = mock(() => Promise.resolve())
-
-mock.module('@langfuse/otel', () => ({
-  LangfuseSpanProcessor: class MockLangfuseSpanProcessor {
-    forceFlush = mockForceFlush
-    shutdown = mockShutdown
-    onStart = mock(() => {})
-    onEnd = mock(() => {})
-  },
-}))
-
-// Mock @opentelemetry/sdk-trace-base
-mock.module('@opentelemetry/sdk-trace-base', () => ({
-  BasicTracerProvider: class MockBasicTracerProvider {
-    constructor(_opts?: unknown) {}
-  },
-}))
-
-// Mock @langfuse/tracing
-const mockChildUpdate = mock(() => {})
-const mockChildEnd = mock(() => {})
-const mockRootUpdate = mock(() => {})
-const mockRootEnd = mock(() => {})
-
-// Mock LangfuseOtelSpanAttributes (re-exported from @langfuse/core)
-const mockLangfuseOtelSpanAttributes: Record<string, string> = {
-  TRACE_SESSION_ID: 'session.id',
-  TRACE_USER_ID: 'user.id',
-  OBSERVATION_TYPE: 'observation.type',
-  OBSERVATION_INPUT: 'observation.input',
-  OBSERVATION_OUTPUT: 'observation.output',
-  OBSERVATION_MODEL: 'observation.model',
-  OBSERVATION_COMPLETION_START_TIME: 'observation.completionStartTime',
-  OBSERVATION_USAGE_DETAILS: 'observation.usageDetails',
-}
-
-const mockSpanContext = { traceId: 'test-trace-id', spanId: 'test-span-id', traceFlags: 1 }
-const mockSetAttribute = mock(() => {})
-
-// Child observation mock (returned by rootSpan.startObservation for tools)
-const mockChildStartObservation = mock(() => ({
-  id: 'child-id',
-  update: mockChildUpdate,
-  end: mockChildEnd,
-}))
-
-const mockStartObservation = mock(() => ({
-  id: 'test-span-id',
-  traceId: 'test-trace-id',
-  type: 'span',
-  otelSpan: {
-    spanContext: () => mockSpanContext,
-    setAttribute: mockSetAttribute,
-  },
-  update: mockRootUpdate,
-  end: mockRootEnd,
-  // Instance method — used by recordToolObservation
-  startObservation: mockChildStartObservation,
-}))
-const mockSetLangfuseTracerProvider = mock(() => {})
-
-mock.module('@langfuse/tracing', () => ({
-  startObservation: mockStartObservation,
-  LangfuseOtelSpanAttributes: mockLangfuseOtelSpanAttributes,
-  propagateAttributes: mock((_params: unknown, fn?: () => void) => fn?.()),
-  setLangfuseTracerProvider: mockSetLangfuseTracerProvider,
-}))
-
-// Mock debug logger
-mock.module('src/utils/debug.js', () => ({
-  logForDebugging: mock(() => {}),
-}))
-
-// Mock user data — resolveLangfuseUserId uses getCoreUserData().email and .deviceId
-mock.module('src/utils/user.js', () => ({
-  getCoreUserData: mock(() => ({
-    email: 'test-device-id',
-    deviceId: 'test-device-id',
-  })),
-}))
+const isolatedPath = fileURLToPath(
+  new URL('./langfuse.isolated.ts', import.meta.url),
+)

 describe('Langfuse integration', () => {
-  beforeEach(() => {
-    // Reset env
-    delete process.env.LANGFUSE_PUBLIC_KEY
-    delete process.env.LANGFUSE_SECRET_KEY
-    delete process.env.LANGFUSE_BASE_URL
-    mockStartObservation.mockClear()
-    mockChildStartObservation.mockClear()
-    mockChildUpdate.mockClear()
-    mockChildEnd.mockClear()
-    mockRootUpdate.mockClear()
-    mockRootEnd.mockClear()
-    mockForceFlush.mockClear()
-    mockShutdown.mockClear()
-    mockSetAttribute.mockClear()
-  })
-
-  // ── sanitize tests ──────────────────────────────────────────────────────────
-
-  describe('sanitizeToolInput', () => {
-    test('replaces home dir in file_path', async () => {
-      const { sanitizeToolInput } = await import('../sanitize.js')
-      const home = process.env.HOME ?? '/Users/testuser'
-      const result = sanitizeToolInput('FileReadTool', { file_path: `${home}/project/file.ts` }) as Record<string, string>
-      expect(result.file_path).toBe('~/project/file.ts')
+  test('passes in isolated subprocess', async () => {
+    const proc = Bun.spawn({
+      cmd: [process.execPath, 'test', isolatedPath],
+      cwd: process.cwd(),
+      stdout: 'pipe',
+      stderr: 'pipe',
+      env: process.env,
    })

-    test('redacts sensitive keys', async () => {
-      const { sanitizeToolInput } = await import('../sanitize.js')
-      const result = sanitizeToolInput('MCPTool', { api_key: 'secret123', token: 'abc' }) as Record<string, string>
-      expect(result.api_key).toBe('[REDACTED]')
-      expect(result.token).toBe('[REDACTED]')
-    })
+    const [stdout, stderr, exitCode] = await Promise.all([
+      new Response(proc.stdout).text(),
+      new Response(proc.stderr).text(),
+      proc.exited,
+    ])

-    test('returns non-object input unchanged', async () => {
-      const { sanitizeToolInput } = await import('../sanitize.js')
-      expect(sanitizeToolInput('BashTool', 'raw string')).toBe('raw string')
-      expect(sanitizeToolInput('BashTool', null)).toBe(null)
-    })
-  })
-
-  describe('sanitizeToolOutput', () => {
-    test('redacts FileReadTool output', async () => {
-      const { sanitizeToolOutput } = await import('../sanitize.js')
-      const result = sanitizeToolOutput('FileReadTool', 'file content here')
-      expect(result).toBe('[file content redacted, 17 chars]')
-    })
-
-    test('redacts FileWriteTool output', async () => {
-      const { sanitizeToolOutput } = await import('../sanitize.js')
-      const result = sanitizeToolOutput('FileWriteTool', 'written content')
-      expect(result).toBe('[file content redacted, 15 chars]')
-    })
-
-    test('truncates BashTool output over 500 chars', async () => {
-      const { sanitizeToolOutput } = await import('../sanitize.js')
-      const longOutput = 'x'.repeat(600)
-      const result = sanitizeToolOutput('BashTool', longOutput)
-      expect(result).toContain('[truncated]')
-      expect(result.length).toBeLessThan(600)
-    })
-
-    test('does not truncate BashTool output under 500 chars', async () => {
-      const { sanitizeToolOutput } = await import('../sanitize.js')
-      const shortOutput = 'hello world'
-      expect(sanitizeToolOutput('BashTool', shortOutput)).toBe('hello world')
-    })
-
-    test('redacts ConfigTool output', async () => {
-      const { sanitizeToolOutput } = await import('../sanitize.js')
-      const result = sanitizeToolOutput('ConfigTool', 'config data')
-      expect(result).toBe('[ConfigTool output redacted, 11 chars]')
-    })
-
-    test('redacts MCPTool output', async () => {
-      const { sanitizeToolOutput } = await import('../sanitize.js')
-      const result = sanitizeToolOutput('MCPTool', 'mcp data')
-      expect(result).toBe('[MCPTool output redacted, 8 chars]')
-    })
-  })
-
-  describe('sanitizeGlobal', () => {
-    test('replaces home dir in strings', async () => {
-      const { sanitizeGlobal } = await import('../sanitize.js')
-      const home = process.env.HOME ?? '/Users/testuser'
-      expect(sanitizeGlobal(`path: ${home}/file`)).toBe('path: ~/file')
-    })
-
-    test('recursively sanitizes nested objects', async () => {
-      const { sanitizeGlobal } = await import('../sanitize.js')
-      const result = sanitizeGlobal({ nested: { api_key: 'secret', name: 'test' } }) as Record<string, Record<string, string>>
-      expect(result.nested.api_key).toBe('[REDACTED]')
-      expect(result.nested.name).toBe('test')
-    })
-
-    test('returns non-string/object values unchanged', async () => {
-      const { sanitizeGlobal } = await import('../sanitize.js')
-      expect(sanitizeGlobal(42)).toBe(42)
-      expect(sanitizeGlobal(true)).toBe(true)
-    })
-  })
-
-  // ── client tests ────────────────────────────────────────────────────────────
-
-  describe('isLangfuseEnabled', () => {
-    test('returns false when keys not configured', async () => {
-      const { isLangfuseEnabled } = await import('../client.js')
-      expect(isLangfuseEnabled()).toBe(false)
-    })
-
-    test('returns true when both keys are set', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { isLangfuseEnabled } = await import('../client.js')
-      expect(isLangfuseEnabled()).toBe(true)
-    })
-  })
-
-  describe('initLangfuse', () => {
-    test('returns false when keys not configured', async () => {
-      const { initLangfuse } = await import('../client.js')
-      expect(initLangfuse()).toBe(false)
-    })
-
-    test('returns true when keys are configured', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      // client.js is a singleton — test via isLangfuseEnabled which reads env directly
-      const { isLangfuseEnabled } = await import('../client.js')
-      expect(isLangfuseEnabled()).toBe(true)
-    })
-
-    test('is idempotent — multiple calls do not re-initialize', async () => {
-      // client.js singleton: once processor is set, initLangfuse returns true immediately
-      // We verify this by checking that calling it multiple times doesn't throw
-      const { initLangfuse } = await import('../client.js')
-      expect(() => { initLangfuse(); initLangfuse() }).not.toThrow()
-    })
-  })
-
-  describe('shutdownLangfuse', () => {
-    test('calls forceFlush and shutdown on processor', async () => {
-      // Verify shutdown is callable without error even when no processor is set
-      const { shutdownLangfuse } = await import('../client.js')
-      await expect(shutdownLangfuse()).resolves.toBeUndefined()
-    })
-  })
-
-  // ── tracing tests ───────────────────────────────────────────────────────────
-
-  describe('createTrace', () => {
-    test('returns null when langfuse not enabled', async () => {
-      const { createTrace } = await import('../tracing.js')
-      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
-      expect(span).toBeNull()
-    })
-
-    test('creates root span when enabled', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { createTrace } = await import('../tracing.js')
-      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty', input: [] })
-      expect(span).not.toBeNull()
-      expect(mockStartObservation).toHaveBeenCalledWith('agent-run', expect.objectContaining({
-        metadata: expect.objectContaining({ provider: 'firstParty', model: 'claude-3' }),
-      }), { asType: 'agent' })
-    })
-  })
-
-  describe('recordLLMObservation', () => {
-    test('no-ops when rootSpan is null', async () => {
-      const { recordLLMObservation } = await import('../tracing.js')
-      recordLLMObservation(null, { model: 'm', provider: 'firstParty', input: [], output: [], usage: { input_tokens: 10, output_tokens: 5 } })
-      expect(mockStartObservation).toHaveBeenCalledTimes(0)
-    })
-
-    test('records generation child observation via global startObservation', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { createTrace, recordLLMObservation } = await import('../tracing.js')
-      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
-      mockStartObservation.mockClear()
-      recordLLMObservation(span, {
-        model: 'claude-3',
-        provider: 'firstParty',
-        input: [{ role: 'user', content: 'hello' }],
-        output: [{ role: 'assistant', content: 'hi' }],
-        usage: { input_tokens: 10, output_tokens: 5 },
-      })
-      // Should call the global startObservation with asType: 'generation' and parentSpanContext
-      expect(mockStartObservation).toHaveBeenCalledWith('ChatAnthropic', expect.objectContaining({
-        model: 'claude-3',
-      }), expect.objectContaining({
-        asType: 'generation',
-        parentSpanContext: mockSpanContext,
-      }))
-      expect(mockRootUpdate).toHaveBeenCalledWith(expect.objectContaining({
-        usageDetails: { input: 10, output: 5 },
-      }))
-      expect(mockRootEnd).toHaveBeenCalled()
-    })
-
-    test('includes cache tokens in usageDetails when provided', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { createTrace, recordLLMObservation } = await import('../tracing.js')
-      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
-      mockStartObservation.mockClear()
-      mockRootUpdate.mockClear()
-      recordLLMObservation(span, {
-        model: 'claude-3',
-        provider: 'firstParty',
-        input: [],
-        output: [],
-        usage: { input_tokens: 10000, output_tokens: 50, cache_creation_input_tokens: 2000, cache_read_input_tokens: 7000 },
-      })
-      expect(mockRootUpdate).toHaveBeenCalledWith(expect.objectContaining({
-        usageDetails: {
-          input: 19000, // 10000 + 2000 + 7000
-          output: 50,
-          cache_read: 7000,
-          cache_creation: 2000,
-        },
-      }))
-    })
-
-    test('omits cache fields when not provided', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { createTrace, recordLLMObservation } = await import('../tracing.js')
-      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
-      mockRootUpdate.mockClear()
-      recordLLMObservation(span, {
-        model: 'claude-3',
-        provider: 'firstParty',
-        input: [],
-        output: [],
-        usage: { input_tokens: 100, output_tokens: 20 },
-      })
-      expect(mockRootUpdate).toHaveBeenCalledWith(expect.objectContaining({
-        usageDetails: { input: 100, output: 20 },
-      }))
-    })
-  })
-
-  describe('recordToolObservation', () => {
-    test('no-ops when rootSpan is null', async () => {
-      const { recordToolObservation } = await import('../tracing.js')
-      recordToolObservation(null, { toolName: 'BashTool', toolUseId: 'id1', input: {}, output: 'out' })
-      // startObservation should not be called beyond the initial trace creation (none here)
-    })
-
-    test('records tool child observation via global startObservation', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { createTrace, recordToolObservation } = await import('../tracing.js')
-      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
-      mockStartObservation.mockClear()
-      mockRootUpdate.mockClear()
-      mockRootEnd.mockClear()
-      recordToolObservation(span, {
-        toolName: 'BashTool',
-        toolUseId: 'tu-1',
-        input: { command: 'ls' },
-        output: 'file.ts',
-      })
-      // Should call the global startObservation with asType: 'tool' and parentSpanContext
-      expect(mockStartObservation).toHaveBeenCalledWith('BashTool', expect.objectContaining({
-        input: expect.any(Object),
-      }), expect.objectContaining({
-        asType: 'tool',
-        parentSpanContext: mockSpanContext,
-      }))
-      expect(mockRootUpdate).toHaveBeenCalled()
-      expect(mockRootEnd).toHaveBeenCalled()
-    })
-
-    test('passes startTime to global startObservation', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { createTrace, recordToolObservation } = await import('../tracing.js')
-      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
-      mockStartObservation.mockClear()
-      const startTime = new Date('2026-01-01T00:00:00Z')
-      recordToolObservation(span, {
-        toolName: 'BashTool',
-        toolUseId: 'tu-2',
-        input: {},
-        output: 'out',
-        startTime,
-      })
-      expect(mockStartObservation).toHaveBeenCalledWith('BashTool', expect.any(Object), expect.objectContaining({
-        startTime,
-        parentSpanContext: mockSpanContext,
-      }))
-    })
-
-    test('sanitizes FileReadTool output', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { createTrace, recordToolObservation } = await import('../tracing.js')
-      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
-      mockRootUpdate.mockClear()
-      recordToolObservation(span, {
-        toolName: 'FileReadTool',
-        toolUseId: 'tu-2',
-        input: { file_path: '/tmp/file.ts' },
-        output: 'file content here',
-      })
-      expect(mockRootUpdate).toHaveBeenCalledWith(expect.objectContaining({
-        output: '[file content redacted, 17 chars]',
-      }))
-    })
-
-    test('sets ERROR level for error observations', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { createTrace, recordToolObservation } = await import('../tracing.js')
-      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
-      mockRootUpdate.mockClear()
-      recordToolObservation(span, {
-        toolName: 'BashTool',
-        toolUseId: 'tu-3',
-        input: {},
-        output: 'error occurred',
-        isError: true,
-      })
-      expect(mockRootUpdate).toHaveBeenCalledWith(expect.objectContaining({ level: 'ERROR' }))
-    })
-  })
-
-  describe('endTrace', () => {
-    test('no-ops when rootSpan is null', async () => {
-      const { endTrace } = await import('../tracing.js')
-      endTrace(null)
-      expect(mockRootEnd).not.toHaveBeenCalled()
-    })
-
-    test('calls span.end()', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { createTrace, endTrace } = await import('../tracing.js')
-      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
-      endTrace(span)
-      expect(mockRootEnd).toHaveBeenCalled()
-    })
-
-    test('calls span.update() with output when provided', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { createTrace, endTrace } = await import('../tracing.js')
-      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
-      endTrace(span, 'final output')
-      expect(mockRootUpdate).toHaveBeenCalledWith({ output: 'final output' })
-      expect(mockRootEnd).toHaveBeenCalled()
-    })
-  })
-
-  describe('createSubagentTrace', () => {
-    test('returns null when langfuse not enabled', async () => {
-      const { createSubagentTrace } = await import('../tracing.js')
-      const span = createSubagentTrace({
-        sessionId: 's1',
-        agentType: 'Explore',
-        agentId: 'agent-1',
-        model: 'claude-3',
-        provider: 'firstParty',
-      })
-      expect(span).toBeNull()
-    })
-
-    test('creates trace with agentType and agentId metadata', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { createSubagentTrace } = await import('../tracing.js')
-      const span = createSubagentTrace({
-        sessionId: 's1',
-        agentType: 'Explore',
-        agentId: 'agent-1',
-        model: 'claude-3',
-        provider: 'firstParty',
-        input: [{ role: 'user', content: 'search for X' }],
-      })
-      expect(span).not.toBeNull()
-      expect(mockStartObservation).toHaveBeenCalledWith('agent:Explore', expect.objectContaining({
-        metadata: expect.objectContaining({
-          agentType: 'Explore',
-          agentId: 'agent-1',
-          provider: 'firstParty',
-          model: 'claude-3',
-        }),
-      }), { asType: 'agent' })
-      // Verify session.id attribute is set
-      expect(mockSetAttribute).toHaveBeenCalledWith('session.id', 's1')
-    })
-
-    test('returns null on SDK error', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      mockStartObservation.mockImplementationOnce(() => { throw new Error('SDK error') })
-      const { createSubagentTrace } = await import('../tracing.js')
-      const span = createSubagentTrace({
-        sessionId: 's1',
-        agentType: 'Plan',
-        agentId: 'agent-2',
-        model: 'claude-3',
-        provider: 'firstParty',
-      })
-      expect(span).toBeNull()
-    })
-  })
-
-  describe('createTrace with querySource', () => {
-    test('includes querySource in metadata', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { createTrace } = await import('../tracing.js')
-      const span = createTrace({
-        sessionId: 's1',
-        model: 'claude-3',
-        provider: 'firstParty',
-        querySource: 'user',
-      })
-      expect(span).not.toBeNull()
-      expect(mockStartObservation).toHaveBeenCalledWith('agent-run:user', expect.objectContaining({
-        metadata: expect.objectContaining({
-          agentType: 'main',
-          querySource: 'user',
-        }),
-      }), { asType: 'agent' })
-    })
-
-    test('omits querySource when not provided', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      mockStartObservation.mockClear()
-      const { createTrace } = await import('../tracing.js')
-      createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
-      const calls = mockStartObservation.mock.calls as unknown[][]
-      const secondArg = calls[0]?.[1] as Record<string, unknown> | undefined
-      const metadata = (secondArg?.metadata ?? {}) as Record<string, unknown>
-      expect(metadata).not.toHaveProperty('querySource')
-    })
-  })
-
-  describe('createTrace with username', () => {
-    test('sets user.id attribute when username is provided', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      mockSetAttribute.mockClear()
-      const { createTrace } = await import('../tracing.js')
-      const span = createTrace({
-        sessionId: 's1',
-        model: 'claude-3',
-        provider: 'firstParty',
-        username: 'user@example.com',
-      })
-      expect(span).not.toBeNull()
-      expect(mockSetAttribute).toHaveBeenCalledWith('user.id', 'user@example.com')
-    })
-
-    test('falls back to LANGFUSE_USER_ID env when username not provided', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      process.env.LANGFUSE_USER_ID = 'env-user@test.com'
-      mockSetAttribute.mockClear()
-      const { createTrace } = await import('../tracing.js')
-      const span = createTrace({
-        sessionId: 's1',
-        model: 'claude-3',
-        provider: 'firstParty',
-      })
-      expect(span).not.toBeNull()
-      expect(mockSetAttribute).toHaveBeenCalledWith('user.id', 'env-user@test.com')
-      delete process.env.LANGFUSE_USER_ID
-    })
-
-    test('falls back to deviceId when neither username nor env is provided', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      delete process.env.LANGFUSE_USER_ID
-      mockSetAttribute.mockClear()
-      const { createTrace } = await import('../tracing.js')
-      createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
-      // Falls back to getCoreUserData().deviceId (mocked as 'test-device-id')
-      expect(mockSetAttribute).toHaveBeenCalledWith('user.id', 'test-device-id')
-    })
-
-    test('username takes precedence over LANGFUSE_USER_ID env', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      process.env.LANGFUSE_USER_ID = 'env-user@test.com'
-      mockSetAttribute.mockClear()
-      const { createTrace } = await import('../tracing.js')
-      createTrace({
-        sessionId: 's1',
-        model: 'claude-3',
-        provider: 'firstParty',
-        username: 'param-user@test.com',
-      })
-      const userIdCalls = mockSetAttribute.mock.calls.filter(
-        (call: unknown[]) => Array.isArray(call) && call[0] === 'user.id',
+    if (exitCode !== 0) {
+      throw new Error(
+        [
+          `isolated langfuse test failed with exit code ${exitCode}`,
+          '',
+          'STDOUT:',
+          stdout,
+          '',
+          'STDERR:',
+          stderr,
+        ].join('\n'),
      )
-      expect(userIdCalls.length).toBe(1)
-      expect((userIdCalls[0] as unknown[])[1]).toBe('param-user@test.com')
-      delete process.env.LANGFUSE_USER_ID
-    })
-  })
+    }

-  describe('nested agent scenario', () => {
-    test('sub-agent trace shares sessionId with parent', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { createTrace, createSubagentTrace } = await import('../tracing.js')
-      mockSetAttribute.mockClear()
-
-      // Create parent trace
-      const parentSpan = createTrace({
-        sessionId: 'shared-session',
-        model: 'claude-3',
-        provider: 'firstParty',
-      })
-
-      // Create sub-agent trace with same sessionId
-      const subSpan = createSubagentTrace({
-        sessionId: 'shared-session',
-        agentType: 'Explore',
-        agentId: 'agent-explore-1',
-        model: 'claude-3',
-        provider: 'firstParty',
-      })
-
-      expect(parentSpan).not.toBeNull()
-      expect(subSpan).not.toBeNull()
-
-      // Both should have set session.id attribute
-      const sessionAttributeCalls = mockSetAttribute.mock.calls.filter(
-        (call: unknown[]) => Array.isArray(call) && call[0] === 'session.id' && call[1] === 'shared-session',
-      )
-      expect(sessionAttributeCalls.length).toBeGreaterThanOrEqual(2)
-    })
-
-    test('query reuses passed langfuseTrace instead of creating new one', async () => {
-      // This validates the pattern used in query.ts:
-      //   const ownsTrace = !params.toolUseContext.langfuseTrace
-      //   const langfuseTrace = params.toolUseContext.langfuseTrace ?? createTrace(...)
-      // When langfuseTrace is already set, createTrace should NOT be called
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      const { createSubagentTrace } = await import('../tracing.js')
-
-      // Simulate what runAgent does: create subTrace, then pass it as langfuseTrace
-      const subTrace = createSubagentTrace({
-        sessionId: 's1',
-        agentType: 'Explore',
-        agentId: 'agent-1',
-        model: 'claude-3',
-        provider: 'firstParty',
-      })
-      expect(subTrace).not.toBeNull()
-
-      // Simulate query.ts logic: if langfuseTrace already set, don't create new one
-      const ownsTrace = false  // Would be: !params.toolUseContext.langfuseTrace
-      const langfuseTrace = subTrace  // Would be: params.toolUseContext.langfuseTrace ?? createTrace(...)
-
-      expect(ownsTrace).toBe(false)
-      expect(langfuseTrace).toBe(subTrace)
-    })
-  })
-
-  describe('SDK exceptions do not affect main flow', () => {
-    test('createTrace returns null on SDK error', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      mockStartObservation.mockImplementationOnce(() => { throw new Error('SDK error') })
-      const { createTrace } = await import('../tracing.js')
-      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
-      expect(span).toBeNull()
-    })
-
-    test('recordLLMObservation silently fails on SDK error', async () => {
-      process.env.LANGFUSE_PUBLIC_KEY = 'pk-test'
-      process.env.LANGFUSE_SECRET_KEY = 'sk-test'
-      mockStartObservation.mockImplementationOnce(() => { throw new Error('SDK error') })
-      const { createTrace, recordLLMObservation } = await import('../tracing.js')
-      const span = createTrace({ sessionId: 's1', model: 'claude-3', provider: 'firstParty' })
-      // The second call to startObservation (for the generation) will throw
-      mockStartObservation.mockImplementationOnce(() => { throw new Error('SDK error') })
-      expect(() => recordLLMObservation(span, {
-        model: 'm',
-        provider: 'firstParty',
-        input: [],
-        output: [],
-        usage: { input_tokens: 1, output_tokens: 1 },
-      })).not.toThrow()
-    })
+    expect(exitCode).toBe(0)
  })
 })
--- a/src/tasks/InProcessTeammateTask/InProcessTeammateTask.tsx
+++ b/src/tasks/InProcessTeammateTask/InProcessTeammateTask.tsx
@@ -82,8 +82,13 @@ export function appendTeammateMessage(
 export function injectUserMessageToTeammate(
  taskId: string,
  message: string,
+  options: {
+    autonomyRunId?: string
+    origin?: Message['origin']
+  } | undefined,
  setAppState: SetAppState,
-): void {
+): boolean {
+  let injected = false
  updateTaskState<InProcessTeammateTaskState>(taskId, setAppState, task => {
    // Allow message injection when teammate is running or idle (waiting for input)
    // Only reject if teammate is in a terminal state
@@ -94,15 +99,29 @@ export function injectUserMessageToTeammate(
      return task
    }

+    injected = true
    return {
      ...task,
-      pendingUserMessages: [...task.pendingUserMessages, message],
+      pendingUserMessages: [
+        ...task.pendingUserMessages,
+        {
+          message,
+          ...(options?.autonomyRunId
+            ? { autonomyRunId: options.autonomyRunId }
+            : {}),
+          ...(options?.origin ? { origin: options.origin } : {}),
+        },
+      ],
      messages: appendCappedMessage(
        task.messages,
-        createUserMessage({ content: message }),
+        createUserMessage({
+          content: message,
+          ...(options?.origin ? { origin: options.origin } : {}),
+        }),
      ),
    }
  })
+  return injected
 }

 /**
--- a/src/tasks/InProcessTeammateTask/types.ts
+++ b/src/tasks/InProcessTeammateTask/types.ts
@@ -1,7 +1,7 @@
 import type { TaskStateBase } from '../../Task.js'
 import type { AgentToolResult } from '@claude-code-best/builtin-tools/tools/AgentTool/agentToolUtils.js'
 import type { AgentDefinition } from '@claude-code-best/builtin-tools/tools/AgentTool/loadAgentsDir.js'
-import type { Message } from '../../types/message.js'
+import type { Message, MessageOrigin } from '../../types/message.js'
 import type { PermissionMode } from '../../utils/permissions/PermissionMode.js'
 import type { AgentProgress } from '../LocalAgentTask/LocalAgentTask.js'

@@ -19,6 +19,12 @@ export type TeammateIdentity = {
  parentSessionId: string // Leader's session ID
 }

+export type PendingTeammateUserMessage = {
+  message: string
+  autonomyRunId?: string
+  origin?: MessageOrigin
+}
+
 export type InProcessTeammateTaskState = TaskStateBase & {
  type: 'in_process_teammate'

@@ -56,7 +62,7 @@ export type InProcessTeammateTaskState = TaskStateBase & {
  inProgressToolUseIDs?: Set<string>

  // Queue of user messages to deliver when viewing teammate transcript
-  pendingUserMessages: string[]
+  pendingUserMessages: PendingTeammateUserMessage[]

  // UI: random spinner verbs (stable across re-renders, shared between components)
  spinnerVerb?: string
--- a/src/types/textInputTypes.ts
+++ b/src/types/textInputTypes.ts
@@ -355,6 +355,19 @@ export type QueuedCommand = {
   * unified the queue but lost the isolation the dual-queue accidentally had).
   */
  agentId?: AgentId
+  /**
+   * Autonomy-run provenance for system-generated automatic turns.
+   * Used by the autonomy ledger to track queue → execution lifecycle.
+   */
+  autonomy?: {
+    runId: string
+    trigger: 'scheduled-task' | 'proactive-tick' | 'managed-flow-step'
+    sourceId?: string
+    sourceLabel?: string
+    flowId?: string
+    flowStepId?: string
+    flowStepName?: string
+  }
 }

 /**
--- a/src/utils/tests/autonomyAuthority.test.ts
+++ b/src/utils/tests/autonomyAuthority.test.ts
@@ -0,0 +1,241 @@
+import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
+import { join } from 'node:path'
+import {
+  AUTONOMY_AGENTS_PATH_POSIX,
+  AUTONOMY_DIR,
+  buildAutonomyTurnPrompt,
+  loadAutonomyAuthority,
+  resetAutonomyAuthorityForTests,
+} from '../autonomyAuthority'
+import {
+  cleanupTempDir,
+  createTempDir,
+  createTempSubdir,
+  writeTempFile,
+} from '../../../tests/mocks/file-system'
+
+const AGENTS_REL = join(AUTONOMY_DIR, 'AGENTS.md')
+const HEARTBEAT_REL = join(AUTONOMY_DIR, 'HEARTBEAT.md')
+
+let tempDir = ''
+
+beforeEach(async () => {
+  tempDir = await createTempDir('autonomy-authority-')
+})
+
+afterEach(async () => {
+  resetAutonomyAuthorityForTests()
+  if (tempDir) {
+    await cleanupTempDir(tempDir)
+  }
+})
+
+describe('autonomyAuthority', () => {
+  test('loadAutonomyAuthority merges AGENTS.md files from root to current directory', async () => {
+    const nestedDir = await createTempSubdir(tempDir, 'packages/app')
+    await writeTempFile(tempDir, AGENTS_REL, 'root authority')
+    await writeTempFile(nestedDir, AGENTS_REL, 'nested authority')
+    await writeTempFile(
+      tempDir,
+      HEARTBEAT_REL,
+      [
+        '# Heartbeat',
+        'tasks:',
+        '  - name: inbox',
+        '    interval: 30m',
+        '    prompt: "Check inbox"',
+      ].join('\n'),
+    )
+
+    const snapshot = await loadAutonomyAuthority({
+      rootDir: tempDir,
+      currentDir: nestedDir,
+    })
+
+    expect(snapshot.agentsFiles.map(file => file.relativePath)).toEqual([
+      AUTONOMY_AGENTS_PATH_POSIX,
+      `packages/app/${AUTONOMY_AGENTS_PATH_POSIX}`,
+    ])
+    expect(snapshot.agentsContent).toContain('root authority')
+    expect(snapshot.agentsContent).toContain('nested authority')
+    expect(snapshot.heartbeatContent).toContain('# Heartbeat')
+    expect(snapshot.heartbeatTasks).toEqual([
+      {
+        name: 'inbox',
+        interval: '30m',
+        prompt: 'Check inbox',
+        steps: [],
+      },
+    ])
+  })
+
+  test('loadAutonomyAuthority reads HEARTBEAT.md only from the workspace root', async () => {
+    const nestedDir = await createTempSubdir(tempDir, 'child')
+    await writeTempFile(
+      tempDir,
+      HEARTBEAT_REL,
+      '# Root heartbeat\nRemember the root task',
+    )
+    await writeTempFile(
+      nestedDir,
+      HEARTBEAT_REL,
+      '# Nested heartbeat\nThis should not be used',
+    )
+
+    const snapshot = await loadAutonomyAuthority({
+      rootDir: tempDir,
+      currentDir: nestedDir,
+    })
+
+    expect(snapshot.heartbeatFile?.path).toBe(join(tempDir, HEARTBEAT_REL))
+    expect(snapshot.heartbeatContent).toContain('Root heartbeat')
+    expect(snapshot.heartbeatContent).not.toContain('Nested heartbeat')
+  })
+
+  test('buildAutonomyTurnPrompt returns the original prompt when no authority files exist', async () => {
+    const prompt = await buildAutonomyTurnPrompt({
+      basePrompt: 'Run the scheduled task.',
+      trigger: 'scheduled-task',
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+
+    expect(prompt).toBe('Run the scheduled task.')
+  })
+
+  test('buildAutonomyTurnPrompt injects AGENTS.md and HEARTBEAT.md for automated turns', async () => {
+    const nestedDir = await createTempSubdir(tempDir, 'nested')
+    await writeTempFile(tempDir, AGENTS_REL, 'root rules')
+    await writeTempFile(nestedDir, AGENTS_REL, 'nested rules')
+    await writeTempFile(tempDir, HEARTBEAT_REL, 'Check heartbeat directives')
+
+    const scheduledPrompt = await buildAutonomyTurnPrompt({
+      basePrompt: 'Review the nightly report.',
+      trigger: 'scheduled-task',
+      rootDir: tempDir,
+      currentDir: nestedDir,
+    })
+    const tickPrompt = await buildAutonomyTurnPrompt({
+      basePrompt: '<tick>12:00:00</tick>',
+      trigger: 'proactive-tick',
+      rootDir: tempDir,
+      currentDir: nestedDir,
+    })
+
+    expect(scheduledPrompt).toContain(
+      'This prompt was generated automatically. Follow the workspace authority below before acting.',
+    )
+    expect(scheduledPrompt).toContain('<autonomy_authority>')
+    expect(scheduledPrompt).toContain('root rules')
+    expect(scheduledPrompt).toContain('nested rules')
+    expect(scheduledPrompt).toContain('Check heartbeat directives')
+    expect(scheduledPrompt).toContain('Review the nightly report.')
+
+    expect(tickPrompt).toContain(
+      'This is an autonomous proactive turn. Follow the workspace authority below before acting.',
+    )
+    expect(tickPrompt).toContain('<tick>12:00:00</tick>')
+  })
+
+  test('proactive prompts surface due HEARTBEAT.md tasks only when their interval elapses', async () => {
+    await writeTempFile(
+      tempDir,
+      HEARTBEAT_REL,
+      [
+        'tasks:',
+        '  - name: inbox',
+        '    interval: 30m',
+        '    prompt: "Check inbox"',
+      ].join('\n'),
+    )
+
+    const first = await buildAutonomyTurnPrompt({
+      basePrompt: '<tick>12:00:00</tick>',
+      trigger: 'proactive-tick',
+      rootDir: tempDir,
+      currentDir: tempDir,
+      nowMs: 0,
+    })
+    const second = await buildAutonomyTurnPrompt({
+      basePrompt: '<tick>12:10:00</tick>',
+      trigger: 'proactive-tick',
+      rootDir: tempDir,
+      currentDir: tempDir,
+      nowMs: 10 * 60_000,
+    })
+    const third = await buildAutonomyTurnPrompt({
+      basePrompt: '<tick>12:31:00</tick>',
+      trigger: 'proactive-tick',
+      rootDir: tempDir,
+      currentDir: tempDir,
+      nowMs: 31 * 60_000,
+    })
+
+    expect(first).toContain('Due HEARTBEAT.md tasks:')
+    expect(first).toContain('- inbox (30m): Check inbox')
+    expect(second).not.toContain('Due HEARTBEAT.md tasks:')
+    expect(third).toContain('Due HEARTBEAT.md tasks:')
+  })
+
+  test('managed HEARTBEAT.md tasks parse nested steps and are not duplicated into the inline due-task section', async () => {
+    await writeTempFile(
+      tempDir,
+      HEARTBEAT_REL,
+      [
+        'tasks:',
+        '  - name: inbox',
+        '    interval: 30m',
+        '    prompt: "Check inbox"',
+        '  - name: weekly-report',
+        '    interval: 7d',
+        '    prompt: "Ship the weekly report"',
+        '    steps:',
+        '      - name: gather',
+        '        prompt: "Gather weekly inputs"',
+        '      - name: draft',
+        '        prompt: "Draft the weekly report"',
+        '        wait_for: manual',
+      ].join('\n'),
+    )
+
+    const snapshot = await loadAutonomyAuthority({
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+    const prompt = await buildAutonomyTurnPrompt({
+      basePrompt: '<tick>12:00:00</tick>',
+      trigger: 'proactive-tick',
+      rootDir: tempDir,
+      currentDir: tempDir,
+      nowMs: 0,
+    })
+
+    expect(snapshot.heartbeatTasks).toEqual([
+      {
+        name: 'inbox',
+        interval: '30m',
+        prompt: 'Check inbox',
+        steps: [],
+      },
+      {
+        name: 'weekly-report',
+        interval: '7d',
+        prompt: 'Ship the weekly report',
+        steps: [
+          {
+            name: 'gather',
+            prompt: 'Gather weekly inputs',
+          },
+          {
+            name: 'draft',
+            prompt: 'Draft the weekly report',
+            waitFor: 'manual',
+          },
+        ],
+      },
+    ])
+    expect(prompt).toContain('- inbox (30m): Check inbox')
+    expect(prompt).not.toContain('- weekly-report (7d): Ship the weekly report')
+    expect(prompt).not.toContain('- gather (')
+  })
+})
--- a/src/utils/tests/autonomyRuns.test.ts
+++ b/src/utils/tests/autonomyRuns.test.ts
@@ -0,0 +1,411 @@
+import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
+import { mkdir, writeFile } from 'fs/promises'
+import { join } from 'path'
+import {
+  resetStateForTests,
+  setCwdState,
+  setOriginalCwd,
+  setProjectRoot,
+} from '../../bootstrap/state'
+import {
+  formatAutonomyRunsList,
+  formatAutonomyRunsStatus,
+  listAutonomyRuns,
+  createAutonomyQueuedPrompt,
+  createProactiveAutonomyCommands,
+  finalizeAutonomyRunCompleted,
+  markAutonomyRunCompleted,
+  markAutonomyRunFailed,
+  markAutonomyRunRunning,
+  recoverManagedAutonomyFlowPrompt,
+  resolveAutonomyRunsPath,
+  startManagedAutonomyFlowFromHeartbeatTask,
+} from '../autonomyRuns'
+import {
+  formatAutonomyFlowsList,
+  getAutonomyFlowById,
+  listAutonomyFlows,
+} from '../autonomyFlows'
+import {
+  AUTONOMY_DIR,
+  resetAutonomyAuthorityForTests,
+} from '../autonomyAuthority'
+import { resetCommandQueue } from '../messageQueueManager'
+import {
+  cleanupTempDir,
+  createTempDir,
+  createTempSubdir,
+  writeTempFile,
+} from '../../../tests/mocks/file-system'
+
+const AGENTS_REL = join(AUTONOMY_DIR, 'AGENTS.md')
+const HEARTBEAT_REL = join(AUTONOMY_DIR, 'HEARTBEAT.md')
+
+let tempDir = ''
+
+beforeEach(async () => {
+  tempDir = await createTempDir('autonomy-runs-')
+  resetStateForTests()
+  resetAutonomyAuthorityForTests()
+  resetCommandQueue()
+  setOriginalCwd(tempDir)
+  setProjectRoot(tempDir)
+})
+
+afterEach(async () => {
+  resetStateForTests()
+  resetAutonomyAuthorityForTests()
+  resetCommandQueue()
+  if (tempDir) {
+    await cleanupTempDir(tempDir)
+  }
+})
+
+describe('autonomyRuns', () => {
+  test('createAutonomyQueuedPrompt records a queued automatic run and returns a prompt command', async () => {
+    const currentDir = await createTempSubdir(tempDir, 'nested')
+    await writeTempFile(tempDir, AGENTS_REL, 'root authority')
+
+    const command = await createAutonomyQueuedPrompt({
+      basePrompt: 'Review nightly report',
+      trigger: 'scheduled-task',
+      rootDir: tempDir,
+      currentDir,
+      sourceId: 'cron-1',
+      sourceLabel: 'nightly-report',
+      workload: 'cron',
+    })
+
+    const runs = await listAutonomyRuns(tempDir)
+    const flows = await listAutonomyFlows(tempDir)
+
+    expect(command.mode).toBe('prompt')
+    expect(command.isMeta).toBe(true)
+    expect(command.autonomy?.trigger).toBe('scheduled-task')
+    expect(command.autonomy?.sourceId).toBe('cron-1')
+    expect(command.origin).toBeDefined()
+    expect(command.value).toContain('root authority')
+    expect(runs).toHaveLength(1)
+    expect(runs[0]).toMatchObject({
+      runId: command.autonomy?.runId,
+      runtime: 'automatic',
+      trigger: 'scheduled-task',
+      status: 'queued',
+      ownerKey: 'main-thread',
+      sourceId: 'cron-1',
+      sourceLabel: 'nightly-report',
+    })
+    expect(flows).toHaveLength(0)
+    expect(resolveAutonomyRunsPath(tempDir)).toContain('.claude')
+  })
+
+  test('createAutonomyQueuedPrompt defaults currentDir to the active cwd for nested authority', async () => {
+    const nestedDir = await createTempSubdir(tempDir, 'nested')
+    await writeTempFile(tempDir, AGENTS_REL, 'root authority')
+    await writeTempFile(nestedDir, AGENTS_REL, 'nested authority')
+    setOriginalCwd(nestedDir)
+    setCwdState(nestedDir)
+
+    const command = await createAutonomyQueuedPrompt({
+      basePrompt: '<tick>12:00:00</tick>',
+      trigger: 'proactive-tick',
+      rootDir: tempDir,
+    })
+
+    expect(command.value).toContain('root authority')
+    expect(command.value).toContain('nested authority')
+  })
+
+  test('markAutonomyRunRunning/completed/failed update persisted lifecycle state for plain runs', async () => {
+    const command = await createAutonomyQueuedPrompt({
+      basePrompt: '<tick>12:00:00</tick>',
+      trigger: 'proactive-tick',
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+    const runId = command.autonomy!.runId
+
+    await markAutonomyRunRunning(runId, tempDir, 100)
+    let runs = await listAutonomyRuns(tempDir)
+    expect(runs[0]).toMatchObject({
+      runId,
+      status: 'running',
+      startedAt: 100,
+    })
+
+    await markAutonomyRunCompleted(runId, tempDir, 200)
+    runs = await listAutonomyRuns(tempDir)
+    expect(runs[0]).toMatchObject({
+      runId,
+      status: 'completed',
+      endedAt: 200,
+    })
+
+    await markAutonomyRunFailed(runId, 'boom', tempDir, 300)
+    runs = await listAutonomyRuns(tempDir)
+    expect(runs[0]).toMatchObject({
+      runId,
+      status: 'failed',
+      endedAt: 300,
+      error: 'boom',
+    })
+  })
+
+  test('formatters produce readable status and run listings', async () => {
+    const first = await createAutonomyQueuedPrompt({
+      basePrompt: 'scheduled prompt',
+      trigger: 'scheduled-task',
+      rootDir: tempDir,
+      currentDir: tempDir,
+      sourceId: 'cron-1',
+      sourceLabel: 'nightly',
+    })
+    const second = await createAutonomyQueuedPrompt({
+      basePrompt: '<tick>12:00:00</tick>',
+      trigger: 'proactive-tick',
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+
+    await markAutonomyRunRunning(first.autonomy!.runId, tempDir, 100)
+    await markAutonomyRunCompleted(first.autonomy!.runId, tempDir, 200)
+    await markAutonomyRunFailed(second.autonomy!.runId, 'stopped', tempDir, 300)
+
+    const runs = await listAutonomyRuns(tempDir)
+    const status = formatAutonomyRunsStatus(runs)
+    const list = formatAutonomyRunsList(runs, 5)
+    const flows = await listAutonomyFlows(tempDir)
+    const flowList = formatAutonomyFlowsList(flows, 5)
+
+    expect(status).toContain('Autonomy runs: 2')
+    expect(status).toContain('Completed: 1')
+    expect(status).toContain('Failed: 1')
+    expect(list).toContain(first.autonomy!.runId)
+    expect(list).toContain(second.autonomy!.runId)
+    expect(list).toContain('nightly')
+    expect(list).toContain('stopped')
+    expect(flowList).toBe('No autonomy flows recorded.')
+  })
+
+  test('same-process concurrent run creation does not lose updates', async () => {
+    await Promise.all([
+      createAutonomyQueuedPrompt({
+        basePrompt: 'scheduled one',
+        trigger: 'scheduled-task',
+        rootDir: tempDir,
+        currentDir: tempDir,
+        sourceId: 'cron-1',
+      }),
+      createAutonomyQueuedPrompt({
+        basePrompt: 'scheduled two',
+        trigger: 'scheduled-task',
+        rootDir: tempDir,
+        currentDir: tempDir,
+        sourceId: 'cron-2',
+      }),
+    ])
+
+    const runs = await listAutonomyRuns(tempDir)
+
+    expect(runs).toHaveLength(2)
+    expect(new Set(runs.map(run => run.sourceId))).toEqual(
+      new Set(['cron-1', 'cron-2']),
+    )
+  })
+
+  test('listAutonomyRuns keeps older persisted records by normalizing missing runtime and owner metadata', async () => {
+    const runsPath = resolveAutonomyRunsPath(tempDir)
+    await mkdir(join(tempDir, '.claude', 'autonomy'), { recursive: true })
+    await writeFile(
+      runsPath,
+      `${JSON.stringify(
+        {
+          runs: [
+            {
+              runId: 'legacy-run',
+              trigger: 'scheduled-task',
+              status: 'completed',
+              rootDir: tempDir,
+              promptPreview: 'legacy prompt',
+              createdAt: 123,
+            },
+          ],
+        },
+        null,
+        2,
+      )}\n`,
+      'utf-8',
+    )
+
+    const [legacy] = await listAutonomyRuns(tempDir)
+
+    expect(legacy).toMatchObject({
+      runId: 'legacy-run',
+      runtime: 'automatic',
+      ownerKey: 'main-thread',
+      currentDir: tempDir,
+      status: 'completed',
+    })
+  })
+
+  test('createAutonomyQueuedPrompt does not consume heartbeat tasks or create runs when shouldCreate rejects commit', async () => {
+    await writeTempFile(
+      tempDir,
+      HEARTBEAT_REL,
+      [
+        'tasks:',
+        '  - name: inbox',
+        '    interval: 30m',
+        '    prompt: "Check inbox"',
+      ].join('\n'),
+    )
+
+    const skipped = await createAutonomyQueuedPrompt({
+      basePrompt: '<tick>12:00:00</tick>',
+      trigger: 'proactive-tick',
+      rootDir: tempDir,
+      currentDir: tempDir,
+      shouldCreate: () => false,
+    })
+    const committed = await createAutonomyQueuedPrompt({
+      basePrompt: '<tick>12:01:00</tick>',
+      trigger: 'proactive-tick',
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+
+    const runs = await listAutonomyRuns(tempDir)
+
+    expect(skipped).toBeNull()
+    expect(committed).not.toBeNull()
+    expect(committed!.value).toContain('Due HEARTBEAT.md tasks:')
+    expect(runs).toHaveLength(1)
+  })
+
+  test('createProactiveAutonomyCommands queues one managed flow step command per due HEARTBEAT flow', async () => {
+    await writeTempFile(
+      tempDir,
+      HEARTBEAT_REL,
+      [
+        'tasks:',
+        '  - name: inbox',
+        '    interval: 30m',
+        '    prompt: "Check inbox"',
+        '  - name: weekly-report',
+        '    interval: 7d',
+        '    prompt: "Ship the weekly report"',
+        '    steps:',
+        '      - name: gather',
+        '        prompt: "Gather weekly inputs"',
+        '      - name: draft',
+        '        prompt: "Draft the weekly report"',
+      ].join('\n'),
+    )
+
+    const commands = await createProactiveAutonomyCommands({
+      basePrompt: '<tick>12:00:00</tick>',
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+
+    const runs = await listAutonomyRuns(tempDir)
+    const flows = await listAutonomyFlows(tempDir)
+
+    expect(commands).toHaveLength(2)
+    expect(commands[0]!.autonomy?.trigger).toBe('proactive-tick')
+    expect(commands[0]!.value).toContain('- inbox (30m): Check inbox')
+    expect(commands[1]!.autonomy?.trigger).toBe('managed-flow-step')
+    expect(commands[1]!.value).toContain(
+      'This is step 1/2 of the managed autonomy flow',
+    )
+    expect(runs).toHaveLength(2)
+    expect(flows).toHaveLength(1)
+    expect(flows[0]).toMatchObject({
+      status: 'queued',
+      currentStep: 'gather',
+      goal: 'Ship the weekly report',
+    })
+  })
+
+  test('finalizeAutonomyRunCompleted advances managed flows to the next queued step', async () => {
+    const command = await startManagedAutonomyFlowFromHeartbeatTask({
+      task: {
+        name: 'weekly-report',
+        interval: '7d',
+        prompt: 'Ship the weekly report',
+        steps: [
+          {
+            name: 'gather',
+            prompt: 'Gather weekly inputs',
+          },
+          {
+            name: 'draft',
+            prompt: 'Draft the weekly report',
+          },
+        ],
+      },
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+
+    expect(command).not.toBeNull()
+    await markAutonomyRunRunning(command!.autonomy!.runId, tempDir, 100)
+
+    const nextCommands = await finalizeAutonomyRunCompleted({
+      runId: command!.autonomy!.runId,
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+
+    const runs = await listAutonomyRuns(tempDir)
+    const [flow] = await listAutonomyFlows(tempDir)
+    const detail = await getAutonomyFlowById(flow!.flowId, tempDir)
+
+    expect(nextCommands).toHaveLength(1)
+    expect(nextCommands[0]!.autonomy?.trigger).toBe('managed-flow-step')
+    expect(nextCommands[0]!.value).toContain('Current step: draft')
+    expect(runs).toHaveLength(2)
+    expect(flow).toMatchObject({
+      status: 'queued',
+      currentStep: 'draft',
+      runCount: 2,
+    })
+    expect(detail?.stateJson?.steps.map(step => step.status)).toEqual([
+      'completed',
+      'queued',
+    ])
+  })
+
+  test('recoverManagedAutonomyFlowPrompt rehydrates a queued managed step with the same run id', async () => {
+    const command = await startManagedAutonomyFlowFromHeartbeatTask({
+      task: {
+        name: 'weekly-report',
+        interval: '7d',
+        prompt: 'Ship the weekly report',
+        steps: [
+          {
+            name: 'gather',
+            prompt: 'Gather weekly inputs',
+          },
+          {
+            name: 'draft',
+            prompt: 'Draft the weekly report',
+          },
+        ],
+      },
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+
+    const [flow] = await listAutonomyFlows(tempDir)
+    const recovered = await recoverManagedAutonomyFlowPrompt({
+      flowId: flow!.flowId,
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+
+    expect(recovered).not.toBeNull()
+    expect(recovered!.autonomy?.runId).toBe(command!.autonomy?.runId)
+    expect(recovered!.autonomy?.flowId).toBe(flow!.flowId)
+  })
+})
--- a/src/utils/tests/cronScheduler.baseline.test.ts
+++ b/src/utils/tests/cronScheduler.baseline.test.ts
@@ -0,0 +1,79 @@
+import { describe, expect, test } from 'bun:test'
+import {
+  buildMissedTaskNotification,
+  isRecurringTaskAged,
+} from '../cronScheduler'
+
+describe('cronScheduler baseline helpers', () => {
+  test('isRecurringTaskAged returns false when maxAgeMs is zero', () => {
+    expect(
+      isRecurringTaskAged(
+        { id: 'a', cron: '* * * * *', prompt: 'x', createdAt: 0, recurring: true },
+        10_000,
+        0,
+      ),
+    ).toBe(false)
+  })
+
+  test('isRecurringTaskAged only ages recurring non-permanent tasks', () => {
+    expect(
+      isRecurringTaskAged(
+        { id: 'a', cron: '* * * * *', prompt: 'x', createdAt: 0 },
+        10_000,
+        100,
+      ),
+    ).toBe(false)
+
+    expect(
+      isRecurringTaskAged(
+        {
+          id: 'b',
+          cron: '* * * * *',
+          prompt: 'x',
+          createdAt: 0,
+          recurring: true,
+          permanent: true,
+        },
+        10_000,
+        100,
+      ),
+    ).toBe(false)
+
+    expect(
+      isRecurringTaskAged(
+        { id: 'c', cron: '* * * * *', prompt: 'x', createdAt: 0, recurring: true },
+        10_000,
+        100,
+      ),
+    ).toBe(true)
+  })
+
+  test('buildMissedTaskNotification preserves AskUserQuestion safety instruction', () => {
+    const msg = buildMissedTaskNotification([
+      {
+        id: 'a1b2c3d4',
+        cron: '* * * * *',
+        prompt: 'check deployment',
+        createdAt: new Date('2026-04-12T10:00:00Z').getTime(),
+      },
+    ])
+
+    expect(msg).toContain('AskUserQuestion')
+    expect(msg).toContain('Do NOT execute this prompt yet')
+    expect(msg).toContain('check deployment')
+  })
+
+  test('buildMissedTaskNotification widens the code fence when the prompt contains backticks', () => {
+    const msg = buildMissedTaskNotification([
+      {
+        id: 'z9y8x7w6',
+        cron: '* * * * *',
+        prompt: 'run ```dangerous``` only if approved',
+        createdAt: new Date('2026-04-12T10:00:00Z').getTime(),
+      },
+    ])
+
+    expect(msg).toContain('````')
+    expect(msg).toContain('run ```dangerous``` only if approved')
+  })
+})
--- a/src/utils/tests/cronTasks.baseline.test.ts
+++ b/src/utils/tests/cronTasks.baseline.test.ts
@@ -0,0 +1,203 @@
+import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
+import { existsSync } from 'node:fs'
+import { readFile } from 'node:fs/promises'
+import { join } from 'node:path'
+import {
+  getSessionCronTasks,
+  resetStateForTests,
+  setOriginalCwd,
+  setProjectRoot,
+} from '../../bootstrap/state'
+import {
+  addCronTask,
+  findMissedTasks,
+  getCronFilePath,
+  hasCronTasksSync,
+  listAllCronTasks,
+  markCronTasksFired,
+  nextCronRunMs,
+  oneShotJitteredNextCronRunMs,
+  readCronTasks,
+  removeCronTasks,
+  writeCronTasks,
+} from '../cronTasks'
+import { cleanupTempDir, createTempDir } from '../../../tests/mocks/file-system'
+
+let tempDir = ''
+
+beforeEach(async () => {
+  tempDir = await createTempDir('cron-baseline-')
+  resetStateForTests()
+  setOriginalCwd(tempDir)
+  setProjectRoot(tempDir)
+})
+
+afterEach(async () => {
+  resetStateForTests()
+  if (tempDir) {
+    await cleanupTempDir(tempDir)
+  }
+})
+
+describe('cronTasks baseline', () => {
+  test('session-only cron tasks remain in memory and do not create the cron file', async () => {
+    const id = await addCronTask('* * * * *', 'session-only prompt', true, false)
+
+    const tasks = await listAllCronTasks()
+
+    expect(id).toHaveLength(8)
+    expect(getSessionCronTasks()).toHaveLength(1)
+    expect(tasks).toHaveLength(1)
+    expect(tasks[0]).toMatchObject({
+      id,
+      prompt: 'session-only prompt',
+      durable: false,
+      recurring: true,
+    })
+    expect(existsSync(getCronFilePath())).toBe(false)
+  })
+
+  test('durable cron tasks are written to .claude/scheduled_tasks.json', async () => {
+    const id = await addCronTask('* * * * *', 'durable prompt', true, true)
+
+    const filePath = getCronFilePath()
+    const fileTasks = await readCronTasks()
+
+    expect(existsSync(filePath)).toBe(true)
+    expect(filePath).toBe(join(tempDir, '.claude', 'scheduled_tasks.json'))
+    expect(fileTasks).toHaveLength(1)
+    expect(fileTasks[0]).toMatchObject({
+      id,
+      prompt: 'durable prompt',
+      recurring: true,
+    })
+    expect(fileTasks[0].durable).toBeUndefined()
+  })
+
+  test('writeCronTasks strips runtime-only durable flags from disk', async () => {
+    await writeCronTasks([
+      {
+        id: 'abc12345',
+        cron: '* * * * *',
+        prompt: 'strip durable',
+        createdAt: 123,
+        recurring: true,
+        durable: false,
+      },
+    ])
+
+    const raw = await readFile(getCronFilePath(), 'utf-8')
+    expect(raw).not.toContain('"durable"')
+  })
+
+  test('hasCronTasksSync reflects whether the durable cron file has entries', async () => {
+    expect(hasCronTasksSync()).toBe(false)
+
+    await writeCronTasks([
+      {
+        id: 'sync0001',
+        cron: '* * * * *',
+        prompt: 'present',
+        createdAt: 1,
+      },
+    ])
+
+    expect(hasCronTasksSync()).toBe(true)
+  })
+
+  test('daemon-style listAllCronTasks(dir) excludes session-only tasks', async () => {
+    await addCronTask('* * * * *', 'session prompt', true, false)
+    const durableId = await addCronTask('* * * * *', 'durable prompt', true, true)
+
+    const sessionView = await listAllCronTasks()
+    const daemonView = await listAllCronTasks(tempDir)
+
+    expect(sessionView).toHaveLength(2)
+    expect(daemonView).toHaveLength(1)
+    expect(daemonView[0]).toMatchObject({
+      id: durableId,
+      prompt: 'durable prompt',
+    })
+  })
+
+  test('removeCronTasks without dir removes session-only tasks from memory', async () => {
+    const sessionId = await addCronTask('* * * * *', 'remove me', true, false)
+
+    await removeCronTasks([sessionId])
+
+    expect(getSessionCronTasks()).toHaveLength(0)
+    expect(await listAllCronTasks()).toHaveLength(0)
+  })
+
+  test('removeCronTasks with dir does not mutate session-only task storage', async () => {
+    const sessionId = await addCronTask('* * * * *', 'keep session task', true, false)
+    await addCronTask('* * * * *', 'durable prompt', true, true)
+
+    await removeCronTasks([sessionId], tempDir)
+
+    expect(getSessionCronTasks()).toHaveLength(1)
+    expect(getSessionCronTasks()[0]?.id).toBe(sessionId)
+  })
+
+  test('markCronTasksFired persists lastFiredAt for durable tasks', async () => {
+    await writeCronTasks([
+      {
+        id: 'fire0001',
+        cron: '* * * * *',
+        prompt: 'persist fired',
+        createdAt: 100,
+        recurring: true,
+      },
+    ])
+
+    await markCronTasksFired(['fire0001'], 123456789)
+
+    const tasks = await readCronTasks()
+    expect(tasks[0]?.lastFiredAt).toBe(123456789)
+  })
+
+  test('findMissedTasks returns tasks whose first scheduled run is in the past', () => {
+    const nowMs = new Date('2026-04-12T10:10:00').getTime()
+    const tasks = findMissedTasks(
+      [
+        {
+          id: 'missed01',
+          cron: '* * * * *',
+          prompt: 'old task',
+          createdAt: new Date('2026-04-12T10:00:00').getTime(),
+        },
+        {
+          id: 'future01',
+          cron: '59 23 31 12 *',
+          prompt: 'far future',
+          createdAt: nowMs,
+        },
+      ],
+      nowMs,
+    )
+
+    expect(tasks.map(t => t.id)).toEqual(['missed01'])
+  })
+
+  test('nextCronRunMs returns null for invalid cron expressions', () => {
+    expect(nextCronRunMs('invalid cron', Date.now())).toBeNull()
+  })
+
+  test('oneShotJitteredNextCronRunMs never returns a time earlier than fromMs', () => {
+    const fromMs = new Date('2026-04-12T10:59:50').getTime()
+    const next = oneShotJitteredNextCronRunMs('0 11 * * *', fromMs, '00000000')
+
+    expect(next).not.toBeNull()
+    expect(next!).toBeGreaterThanOrEqual(fromMs)
+  })
+
+  test('oneShotJitteredNextCronRunMs still returns a fire time at or after fromMs when no second match exists in range', () => {
+    const fromMs = new Date('2026-04-12T10:00:00').getTime()
+    const exact = nextCronRunMs('0 0 29 2 *', fromMs)
+    const jittered = oneShotJitteredNextCronRunMs('0 0 29 2 *', fromMs, '89abcdef')
+
+    expect(exact).not.toBeNull()
+    expect(jittered).not.toBeNull()
+    expect(jittered!).toBeGreaterThanOrEqual(fromMs)
+  })
+})
--- a/src/utils/tests/language.test.ts
+++ b/src/utils/tests/language.test.ts
@@ -0,0 +1,82 @@
+import { describe, test, expect, mock } from 'bun:test'
+
+// Mock dependencies before importing the module under test
+let mockPreferredLanguage: string | undefined
+let mockSystemLocale: string | undefined
+
+mock.module('src/utils/config.js', () => ({
+  getGlobalConfig: () => ({
+    preferredLanguage: mockPreferredLanguage,
+  }),
+}))
+
+mock.module('src/utils/intl.js', () => ({
+  getSystemLocaleLanguage: () => mockSystemLocale,
+}))
+
+const { getResolvedLanguage, getLanguageDisplayName } = await import(
+  'src/utils/language.js'
+)
+
+describe('getResolvedLanguage', () => {
+  test('returns en when config is explicitly en', () => {
+    mockPreferredLanguage = 'en'
+    mockSystemLocale = 'zh'
+    expect(getResolvedLanguage()).toBe('en')
+  })
+
+  test('returns zh when config is explicitly zh', () => {
+    mockPreferredLanguage = 'zh'
+    mockSystemLocale = 'en'
+    expect(getResolvedLanguage()).toBe('zh')
+  })
+
+  test('falls back to system locale zh when config is auto', () => {
+    mockPreferredLanguage = 'auto'
+    mockSystemLocale = 'zh'
+    expect(getResolvedLanguage()).toBe('zh')
+  })
+
+  test('falls back to en when config is auto and system locale is not zh', () => {
+    mockPreferredLanguage = 'auto'
+    mockSystemLocale = 'en'
+    expect(getResolvedLanguage()).toBe('en')
+  })
+
+  test('falls back to en when config is auto and system locale is undefined', () => {
+    mockPreferredLanguage = 'auto'
+    mockSystemLocale = undefined
+    expect(getResolvedLanguage()).toBe('en')
+  })
+
+  test('falls back to auto behavior when config preferredLanguage is undefined', () => {
+    mockPreferredLanguage = undefined
+    mockSystemLocale = 'zh'
+    expect(getResolvedLanguage()).toBe('zh')
+  })
+
+  test('defaults to en when both config and locale are undefined', () => {
+    mockPreferredLanguage = undefined
+    mockSystemLocale = undefined
+    expect(getResolvedLanguage()).toBe('en')
+  })
+})
+
+describe('getLanguageDisplayName', () => {
+  test('returns Auto (follow system) for auto', () => {
+    expect(getLanguageDisplayName('auto')).toBe('Auto (follow system)')
+  })
+
+  test('returns English for en', () => {
+    expect(getLanguageDisplayName('en')).toBe('English')
+  })
+
+  test('returns 中文 for zh', () => {
+    expect(getLanguageDisplayName('zh')).toBe('中文')
+  })
+
+  test('returns the input string for unknown language codes', () => {
+    expect(getLanguageDisplayName('fr')).toBe('fr')
+    expect(getLanguageDisplayName('unknown')).toBe('unknown')
+  })
+})
--- a/src/utils/tests/pipeMuteState.test.ts
+++ b/src/utils/tests/pipeMuteState.test.ts
@@ -0,0 +1,124 @@
+import { describe, test, expect, beforeEach } from 'bun:test'
+import {
+  setMasterMutedPipes,
+  isMasterPipeMuted,
+  removeMasterPipeMute,
+  clearMasterMutedPipes,
+  addSendOverride,
+  removeSendOverride,
+  hasSendOverride,
+  clearSendOverrides,
+} from '../pipeMuteState.js'
+
+describe('setMasterMutedPipes', () => {
+  beforeEach(() => {
+    clearMasterMutedPipes()
+    clearSendOverrides()
+  })
+
+  test('sets muted pipes from iterable', () => {
+    setMasterMutedPipes(['pipe-a', 'pipe-b'])
+    expect(isMasterPipeMuted('pipe-a')).toBe(true)
+    expect(isMasterPipeMuted('pipe-b')).toBe(true)
+    expect(isMasterPipeMuted('pipe-c')).toBe(false)
+  })
+
+  test('replaces previous muted set', () => {
+    setMasterMutedPipes(['pipe-a'])
+    setMasterMutedPipes(['pipe-b'])
+    expect(isMasterPipeMuted('pipe-a')).toBe(false)
+    expect(isMasterPipeMuted('pipe-b')).toBe(true)
+  })
+})
+
+describe('isMasterPipeMuted', () => {
+  beforeEach(() => {
+    clearMasterMutedPipes()
+  })
+
+  test('returns false for unknown pipe', () => {
+    expect(isMasterPipeMuted('unknown')).toBe(false)
+  })
+})
+
+describe('removeMasterPipeMute', () => {
+  beforeEach(() => {
+    clearMasterMutedPipes()
+  })
+
+  test('removes a single muted pipe', () => {
+    setMasterMutedPipes(['pipe-a', 'pipe-b'])
+    removeMasterPipeMute('pipe-a')
+    expect(isMasterPipeMuted('pipe-a')).toBe(false)
+    expect(isMasterPipeMuted('pipe-b')).toBe(true)
+  })
+
+  test('no-ops for non-existent pipe', () => {
+    removeMasterPipeMute('nonexistent')
+    expect(isMasterPipeMuted('nonexistent')).toBe(false)
+  })
+})
+
+describe('clearMasterMutedPipes', () => {
+  test('clears all muted pipes', () => {
+    setMasterMutedPipes(['pipe-a', 'pipe-b', 'pipe-c'])
+    clearMasterMutedPipes()
+    expect(isMasterPipeMuted('pipe-a')).toBe(false)
+    expect(isMasterPipeMuted('pipe-b')).toBe(false)
+    expect(isMasterPipeMuted('pipe-c')).toBe(false)
+  })
+})
+
+describe('addSendOverride', () => {
+  beforeEach(() => {
+    clearSendOverrides()
+  })
+
+  test('adds a send override', () => {
+    addSendOverride('pipe-x')
+    expect(hasSendOverride('pipe-x')).toBe(true)
+  })
+
+  test('adding same override twice is idempotent', () => {
+    addSendOverride('pipe-x')
+    addSendOverride('pipe-x')
+    expect(hasSendOverride('pipe-x')).toBe(true)
+  })
+})
+
+describe('removeSendOverride', () => {
+  beforeEach(() => {
+    clearSendOverrides()
+  })
+
+  test('removes a send override', () => {
+    addSendOverride('pipe-x')
+    removeSendOverride('pipe-x')
+    expect(hasSendOverride('pipe-x')).toBe(false)
+  })
+
+  test('no-ops for non-existent override', () => {
+    removeSendOverride('nonexistent')
+    expect(hasSendOverride('nonexistent')).toBe(false)
+  })
+})
+
+describe('hasSendOverride', () => {
+  beforeEach(() => {
+    clearSendOverrides()
+  })
+
+  test('returns false when no overrides set', () => {
+    expect(hasSendOverride('pipe-x')).toBe(false)
+  })
+})
+
+describe('clearSendOverrides', () => {
+  test('clears all send overrides', () => {
+    addSendOverride('pipe-a')
+    addSendOverride('pipe-b')
+    clearSendOverrides()
+    expect(hasSendOverride('pipe-a')).toBe(false)
+    expect(hasSendOverride('pipe-b')).toBe(false)
+  })
+})
--- a/src/utils/tests/taskSummary.test.ts
+++ b/src/utils/tests/taskSummary.test.ts
@@ -0,0 +1,93 @@
+/**
+ * Tests for src/utils/taskSummary.ts
+ *
+ * Covers: shouldGenerateTaskSummary, maybeGenerateTaskSummary
+ *
+ * Note: bun:bundle's feature() is a compile-time construct, so it is stubbed
+ * below to always return false. With the feature off, these tests only check
+ * that the gate reports false and that varied inputs do not throw.
+ */
+import { describe, expect, test, mock, beforeEach } from 'bun:test'
+
+// ─── mocks ──────────────────────────────────────────────────────────────────
+
+let _updateCalls: any[] = []
+
+mock.module('bun:bundle', () => ({
+  feature: (_name: string) => false,
+}))
+
+mock.module('../concurrentSessions.js', () => ({
+  isBgSession: () => false,
+  updateSessionActivity: async (data: any) => {
+    _updateCalls.push(data)
+  },
+}))
+
+mock.module('../debug.js', () => ({
+  logForDebugging: () => {},
+}))
+
+// ─── import after mocks ─────────────────────────────────────────────────────
+
+const { shouldGenerateTaskSummary, maybeGenerateTaskSummary } = await import(
+  '../taskSummary.js'
+)
+
+// ─── tests ──────────────────────────────────────────────────────────────────
+
+beforeEach(() => {
+  _updateCalls = []
+})
+
+describe('shouldGenerateTaskSummary', () => {
+  test('returns false when feature is disabled', () => {
+    // bun:bundle feature mock returns false
+    expect(shouldGenerateTaskSummary()).toBe(false)
+  })
+})
+
+describe('maybeGenerateTaskSummary', () => {
+  test('does not throw with empty messages', () => {
+    expect(() =>
+      maybeGenerateTaskSummary({ forkContextMessages: [] }),
+    ).not.toThrow()
+  })
+
+  test('does not throw with undefined messages', () => {
+    expect(() => maybeGenerateTaskSummary({})).not.toThrow()
+  })
+
+  test('does not throw with assistant message containing tool_use', () => {
+    expect(() =>
+      maybeGenerateTaskSummary({
+        forkContextMessages: [
+          {
+            type: 'assistant',
+            message: {
+              content: [
+                { type: 'text', text: 'Let me check' },
+                { type: 'tool_use', name: 'bash' },
+              ],
+            },
+          },
+        ],
+      }),
+    ).not.toThrow()
+  })
+
+  test('does not throw with non-array content', () => {
+    expect(() =>
+      maybeGenerateTaskSummary({
+        forkContextMessages: [
+          {
+            type: 'assistant',
+            message: {
+              content: 'plain text response',
+            },
+          },
+        ],
+      }),
+    ).not.toThrow()
+  })
+})
--- a/src/utils/autonomyAuthority.ts
+++ b/src/utils/autonomyAuthority.ts
@@ -0,0 +1,522 @@
+import {
+  basename,
+  dirname,
+  isAbsolute,
+  join,
+  relative,
+  resolve,
+} from 'node:path'
+import { getProjectRoot } from '../bootstrap/state.js'
+import { getCwd } from './cwd.js'
+import { getFsImplementation } from './fsOperations.js'
+import { normalizePathForConfigKey } from './path.js'
+
+export const AUTONOMY_DIR = join('.claude', 'autonomy')
+export const AUTONOMY_DIR_POSIX = '.claude/autonomy'
+export const AUTONOMY_AGENTS_FILENAME = 'AGENTS.md'
+export const AUTONOMY_HEARTBEAT_FILENAME = 'HEARTBEAT.md'
+export const AUTONOMY_AGENTS_PATH_POSIX = `${AUTONOMY_DIR_POSIX}/${AUTONOMY_AGENTS_FILENAME}`
+export const AUTONOMY_HEARTBEAT_PATH_POSIX = `${AUTONOMY_DIR_POSIX}/${AUTONOMY_HEARTBEAT_FILENAME}`
+
+export type HeartbeatAuthorityTask = {
+  name: string
+  interval: string
+  prompt: string
+  steps: HeartbeatAuthorityTaskStep[]
+}
+
+export type HeartbeatAuthorityTaskStep = {
+  name: string
+  prompt: string
+  waitFor?: string
+}
+
+export type AutonomyAuthorityFile = {
+  path: string
+  relativePath: string
+  content: string
+}
+
+export type AutonomyAuthoritySnapshot = {
+  rootDir: string
+  currentDir: string
+  agentsFiles: AutonomyAuthorityFile[]
+  agentsContent: string | null
+  heartbeatFile: AutonomyAuthorityFile | null
+  heartbeatContent: string | null
+  heartbeatTasks: HeartbeatAuthorityTask[]
+}
+
+type AutonomyAuthorityParams = {
+  rootDir?: string
+  currentDir?: string
+}
+
+export type AutonomyTriggerKind =
+  | 'proactive-tick'
+  | 'scheduled-task'
+  | 'managed-flow-step'
+
+export type PreparedAutonomyTurn = {
+  rootDir: string
+  currentDir: string
+  trigger: AutonomyTriggerKind
+  prompt: string
+  dueHeartbeatTasks: HeartbeatAuthorityTask[]
+  nowMs: number
+}
+
+const heartbeatTaskLastRunByKey = new Map<string, number>()
+
+function isPathWithinRoot(rootDir: string, currentDir: string): boolean {
+  const delta = relative(rootDir, currentDir)
+  return delta === '' || (!delta.startsWith('..') && !isAbsolute(delta))
+}
+
+function listAuthorityDirectories(
+  rootDir: string,
+  currentDir: string,
+): string[] {
+  const resolvedRoot = resolve(rootDir)
+  const resolvedCurrent = resolve(currentDir)
+  if (!isPathWithinRoot(resolvedRoot, resolvedCurrent)) {
+    return [resolvedRoot]
+  }
+
+  const dirs: string[] = []
+  let cursor = resolvedCurrent
+  for (;;) {
+    dirs.push(cursor)
+    if (cursor === resolvedRoot) {
+      break
+    }
+    const parent = dirname(cursor)
+    if (parent === cursor) {
+      break
+    }
+    cursor = parent
+  }
+  return dirs.reverse()
+}
+
+async function readAuthorityFile(
+  filePath: string,
+  rootDir: string,
+): Promise<AutonomyAuthorityFile | null> {
+  try {
+    const content = (await getFsImplementation().readFile(filePath, {
+      encoding: 'utf-8',
+    })) as string
+    const trimmed = content.trim()
+    if (!trimmed) {
+      return null
+    }
+    return {
+      path: filePath,
+      relativePath:
+        normalizePathForConfigKey(relative(rootDir, filePath)) ||
+        basename(filePath),
+      content: trimmed,
+    }
+  } catch {
+    return null
+  }
+}
+
+function mergeAgentsAuthority(files: AutonomyAuthorityFile[]): string | null {
+  if (files.length === 0) {
+    return null
+  }
+
+  return files
+    .map(file => `## ${file.relativePath}\n${file.content}`)
+    .join('\n\n')
+}
+
+export function parseHeartbeatAuthorityTasks(
+  content: string,
+): HeartbeatAuthorityTask[] {
+  const tasks: HeartbeatAuthorityTask[] = []
+  const lines = content.split('\n')
+  const getIndent = (line: string): number =>
+    line.length - line.trimStart().length
+  const parseScalar = (line: string, key: string): string =>
+    line
+      .replace(key, '')
+      .trim()
+      .replace(/^["']|["']$/g, '')
+
+  function parseSteps(
+    startIndex: number,
+    stepsIndent: number,
+  ): { steps: HeartbeatAuthorityTaskStep[]; nextIndex: number } {
+    const steps: HeartbeatAuthorityTaskStep[] = []
+    let index = startIndex
+
+    while (index < lines.length) {
+      const line = lines[index]!
+      const trimmed = line.trim()
+      const indent = getIndent(line)
+
+      if (!trimmed) {
+        index += 1
+        continue
+      }
+
+      if (indent <= stepsIndent) {
+        break
+      }
+
+      if (!trimmed.startsWith('- name:')) {
+        index += 1
+        continue
+      }
+
+      const stepIndent = indent
+      const name = parseScalar(trimmed, '- name:')
+      let prompt = ''
+      let waitFor: string | undefined
+      index += 1
+
+      while (index < lines.length) {
+        const nextLine = lines[index]!
+        const nextTrimmed = nextLine.trim()
+        const nextIndent = getIndent(nextLine)
+
+        if (!nextTrimmed) {
+          index += 1
+          continue
+        }
+
+        if (nextIndent <= stepIndent) {
+          break
+        }
+
+        if (nextTrimmed.startsWith('prompt:')) {
+          prompt = parseScalar(nextTrimmed, 'prompt:')
+        } else if (nextTrimmed.startsWith('wait_for:')) {
+          waitFor = parseScalar(nextTrimmed, 'wait_for:')
+        }
+
+        index += 1
+      }
+
+      if (name && prompt) {
+        steps.push({
+          name,
+          prompt,
+          ...(waitFor ? { waitFor } : {}),
+        })
+      }
+    }
+
+    return { steps, nextIndex: index }
+  }
+
+  const tasksLineIndex = lines.findIndex(line => line.trim() === 'tasks:')
+  if (tasksLineIndex === -1) {
+    return tasks
+  }
+
+  const tasksIndent = getIndent(lines[tasksLineIndex]!)
+  let index = tasksLineIndex + 1
+
+  while (index < lines.length) {
+    const line = lines[index]!
+    const trimmed = line.trim()
+    const indent = getIndent(line)
+
+    if (!trimmed) {
+      index += 1
+      continue
+    }
+
+    if (indent <= tasksIndent) {
+      break
+    }
+
+    if (!trimmed.startsWith('- name:')) {
+      index += 1
+      continue
+    }
+
+    const taskIndent = indent
+    const name = parseScalar(trimmed, '- name:')
+    let interval = ''
+    let prompt = ''
+    let steps: HeartbeatAuthorityTaskStep[] = []
+    index += 1
+
+    while (index < lines.length) {
+      const nextLine = lines[index]!
+      const nextTrimmed = nextLine.trim()
+      const nextIndent = getIndent(nextLine)
+
+      if (!nextTrimmed) {
+        index += 1
+        continue
+      }
+
+      if (nextIndent <= tasksIndent) {
+        break
+      }
+
+      if (nextIndent === taskIndent && nextTrimmed.startsWith('- name:')) {
+        break
+      }
+
+      if (nextIndent <= taskIndent) {
+        break
+      }
+
+      if (nextTrimmed.startsWith('interval:')) {
+        interval = parseScalar(nextTrimmed, 'interval:')
+        index += 1
+        continue
+      }
+
+      if (nextTrimmed.startsWith('prompt:')) {
+        prompt = parseScalar(nextTrimmed, 'prompt:')
+        index += 1
+        continue
+      }
+
+      if (nextTrimmed === 'steps:') {
+        const parsed = parseSteps(index + 1, nextIndent)
+        steps = parsed.steps
+        index = parsed.nextIndex
+        continue
+      }
+
+      index += 1
+    }
+
+    if (name && interval && prompt) {
+      tasks.push({
+        name,
+        interval,
+        prompt,
+        steps,
+      })
+    }
+  }
+
+  return tasks
+}
+
+function parseHeartbeatIntervalMs(interval: string): number | null {
+  const match = interval.trim().match(/^(\d+)\s*(ms|s|m|h|d)?$/i)
+  if (!match) {
+    return null
+  }
+
+  const value = Number.parseInt(match[1]!, 10)
+  const unit = (match[2] ?? 'm').toLowerCase()
+  switch (unit) {
+    case 'ms':
+      return value
+    case 's':
+      return value * 1_000
+    case 'm':
+      return value * 60_000
+    case 'h':
+      return value * 60 * 60_000
+    case 'd':
+      return value * 24 * 60 * 60_000
+    default:
+      return null
+  }
+}
+
+function heartbeatTaskKey(
+  rootDir: string,
+  task: HeartbeatAuthorityTask,
+): string {
+  return `${rootDir}::${task.name}::${task.interval}::${task.prompt}`
+}
+
+function collectDueHeartbeatTasks(
+  snapshot: AutonomyAuthoritySnapshot,
+  nowMs: number,
+): HeartbeatAuthorityTask[] {
+  const due: HeartbeatAuthorityTask[] = []
+  for (const task of snapshot.heartbeatTasks) {
+    const intervalMs = parseHeartbeatIntervalMs(task.interval)
+    if (intervalMs == null) {
+      continue
+    }
+    const key = heartbeatTaskKey(snapshot.rootDir, task)
+    const lastRunMs = heartbeatTaskLastRunByKey.get(key)
+    if (lastRunMs !== undefined && nowMs - lastRunMs < intervalMs) {
+      continue
+    }
+    due.push(task)
+  }
+  return due
+}
+
+function markHeartbeatTasksConsumed(
+  snapshot: AutonomyAuthoritySnapshot,
+  tasks: HeartbeatAuthorityTask[],
+  nowMs: number,
+): void {
+  for (const task of tasks) {
+    heartbeatTaskLastRunByKey.set(
+      heartbeatTaskKey(snapshot.rootDir, task),
+      nowMs,
+    )
+  }
+}
+
+export function resetAutonomyAuthorityForTests(): void {
+  heartbeatTaskLastRunByKey.clear()
+}
+
+export async function loadAutonomyAuthority(
+  params: AutonomyAuthorityParams = {},
+): Promise<AutonomyAuthoritySnapshot> {
+  const rootDir = resolve(params.rootDir ?? getProjectRoot())
+  const currentDir = resolve(params.currentDir ?? getCwd())
+  const authorityDirs = listAuthorityDirectories(rootDir, currentDir)
+
+  const [agentsResults, heartbeatFile] = await Promise.all([
+    Promise.all(
+      authorityDirs.map(async dir =>
+        readAuthorityFile(
+          join(dir, AUTONOMY_DIR, AUTONOMY_AGENTS_FILENAME),
+          rootDir,
+        ),
+      ),
+    ),
+    readAuthorityFile(
+      join(rootDir, AUTONOMY_DIR, AUTONOMY_HEARTBEAT_FILENAME),
+      rootDir,
+    ),
+  ])
+  const agentsFiles = agentsResults.filter(
+    (file): file is AutonomyAuthorityFile => file !== null,
+  )
+
+  return {
+    rootDir,
+    currentDir,
+    agentsFiles,
+    agentsContent: mergeAgentsAuthority(agentsFiles),
+    heartbeatFile,
+    heartbeatContent: heartbeatFile?.content ?? null,
+    heartbeatTasks: heartbeatFile
+      ? parseHeartbeatAuthorityTasks(heartbeatFile.content)
+      : [],
+  }
+}
+
+export async function buildAutonomyTurnPrompt(params: {
+  basePrompt: string
+  trigger: AutonomyTriggerKind
+  rootDir?: string
+  currentDir?: string
+  nowMs?: number
+}): Promise<string> {
+  const prepared = await prepareAutonomyTurnPrompt(params)
+  commitPreparedAutonomyTurn(prepared)
+  return prepared.prompt
+}
+
+export async function prepareAutonomyTurnPrompt(params: {
+  basePrompt: string
+  trigger: AutonomyTriggerKind
+  rootDir?: string
+  currentDir?: string
+  nowMs?: number
+}): Promise<PreparedAutonomyTurn> {
+  const snapshot = await loadAutonomyAuthority({
+    rootDir: params.rootDir,
+    currentDir: params.currentDir,
+  })
+  const nowMs = params.nowMs ?? Date.now()
+  const dueHeartbeatTasks =
+    params.trigger === 'proactive-tick'
+      ? collectDueHeartbeatTasks(snapshot, nowMs)
+      : []
+  const duePromptTasks = dueHeartbeatTasks.filter(
+    task => task.steps.length === 0,
+  )
+
+  const sections: string[] = []
+  if (snapshot.agentsContent) {
+    sections.push(
+      `Workspace authority from ${AUTONOMY_AGENTS_FILENAME}:\n${snapshot.agentsContent}`,
+    )
+  }
+  if (snapshot.heartbeatContent) {
+    sections.push(
+      `Workspace heartbeat guidance from ${AUTONOMY_HEARTBEAT_FILENAME}:\n${snapshot.heartbeatContent}`,
+    )
+  }
+  if (duePromptTasks.length > 0) {
+    sections.push(
+      [
+        `Due ${AUTONOMY_HEARTBEAT_FILENAME} tasks:`,
+        ...duePromptTasks.map(
+          task => `- ${task.name} (${task.interval}): ${task.prompt}`,
+        ),
+      ].join('\n'),
+    )
+  }
+
+  if (sections.length === 0) {
+    return {
+      rootDir: snapshot.rootDir,
+      currentDir: snapshot.currentDir,
+      trigger: params.trigger,
+      prompt: params.basePrompt,
+      dueHeartbeatTasks,
+      nowMs,
+    }
+  }
+
+  const prelude =
+    params.trigger === 'proactive-tick'
+      ? 'This is an autonomous proactive turn. Follow the workspace authority below before acting.'
+      : 'This prompt was generated automatically. Follow the workspace authority below before acting.'
+
+  return {
+    rootDir: snapshot.rootDir,
+    currentDir: snapshot.currentDir,
+    trigger: params.trigger,
+    prompt: [
+      prelude,
+      '<autonomy_authority>',
+      ...sections,
+      '</autonomy_authority>',
+      params.basePrompt,
+    ].join('\n\n'),
+    dueHeartbeatTasks,
+    nowMs,
+  }
+}
+
+export function commitPreparedAutonomyTurn(
+  prepared: PreparedAutonomyTurn,
+): void {
+  if (
+    prepared.trigger !== 'proactive-tick' ||
+    prepared.dueHeartbeatTasks.length === 0
+  ) {
+    return
+  }
+  const snapshot: AutonomyAuthoritySnapshot = {
+    rootDir: prepared.rootDir,
+    currentDir: prepared.currentDir,
+    agentsFiles: [],
+    agentsContent: null,
+    heartbeatFile: null,
+    heartbeatContent: null,
+    heartbeatTasks: prepared.dueHeartbeatTasks,
+  }
+  markHeartbeatTasksConsumed(
+    snapshot,
+    prepared.dueHeartbeatTasks,
+    prepared.nowMs,
+  )
+}
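Reviewer sketch for the heartbeat scheduling in `autonomyAuthority.ts` above: a minimal, standalone restatement of the interval grammar (a bare number is read as minutes) and of the "due" rule (a task that has never run is due immediately). The names `UNIT_MS`, `intervalToMs`, and `isDue` are hypothetical and exist only for this illustration; they are not part of the diff.

```typescript
// Illustrative only: mirrors the regex and unit table used by the diff's
// parseHeartbeatIntervalMs, under hypothetical names.
const UNIT_MS: Record<string, number> = {
  ms: 1,
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
}

function intervalToMs(interval: string): number | null {
  const match = interval.trim().match(/^(\d+)\s*(ms|s|m|h|d)?$/i)
  if (!match) {
    return null
  }
  // A missing unit defaults to minutes.
  const unit = (match[2] ?? 'm').toLowerCase()
  return Number.parseInt(match[1]!, 10) * UNIT_MS[unit]!
}

function isDue(
  lastRunMs: number | undefined,
  nowMs: number,
  intervalMs: number,
): boolean {
  // Never-run tasks are due at once; otherwise wait out the full interval.
  return lastRunMs === undefined || nowMs - lastRunMs >= intervalMs
}
```

Because the last-run map lives in process memory, a restarted process would treat every heartbeat task as never run, so each task is due again on the first proactive tick.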
--- a/src/utils/autonomyFlows.ts
+++ b/src/utils/autonomyFlows.ts
--- a/src/utils/autonomyPersistence.ts
+++ b/src/utils/autonomyPersistence.ts
@@ -0,0 +1,48 @@
+import { mkdir, writeFile } from 'fs/promises'
+import { join, resolve } from 'path'
+import { lock } from './lockfile.js'
+
+const persistenceLocks = new Map<string, Promise<void>>()
+
+export async function withAutonomyPersistenceLock<T>(
+  rootDir: string,
+  fn: () => Promise<T>,
+): Promise<T> {
+  const key = resolve(rootDir)
+  const lockPath = join(key, '.claude', 'autonomy', '.lock')
+  const previous = persistenceLocks.get(key) ?? Promise.resolve()
+
+  let release!: () => void
+  const current = new Promise<void>(done => {
+    release = done
+  })
+  // Keep a handle on the chained tail: the map stores this promise, not
+  // `current`, so the cleanup check below must compare against it.
+  const tail = previous.then(() => current)
+  persistenceLocks.set(key, tail)
+
+  await previous
+  try {
+    await mkdir(join(key, '.claude', 'autonomy'), { recursive: true })
+    await writeFile(lockPath, '', { flag: 'a' })
+    const unlock = await lock(lockPath, {
+      lockfilePath: `${lockPath}.lock`,
+      retries: {
+        retries: 10,
+        factor: 1.2,
+        minTimeout: 10,
+        maxTimeout: 100,
+      },
+    })
+    try {
+      return await fn()
+    } finally {
+      await unlock().catch(() => {})
+    }
+  } finally {
+    release()
+    if (persistenceLocks.get(key) === tail) {
+      persistenceLocks.delete(key)
+    }
+  }
+}
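Reviewer sketch for the in-process half of `withAutonomyPersistenceLock` above: a promise chain that serializes async callbacks per key. The cross-process file lock is deliberately left out. `tails` and `withKeyLock` are hypothetical names used only here, and holding the chained promise in a local `tail` variable is this sketch's own choice for making the cleanup comparison possible.

```typescript
// Hypothetical helper: serializes async callbacks per key within one process.
const tails = new Map<string, Promise<void>>()

async function withKeyLock<T>(
  key: string,
  fn: () => Promise<T>,
): Promise<T> {
  const previous = tails.get(key) ?? Promise.resolve()

  let release!: () => void
  const current = new Promise<void>(done => {
    release = done
  })
  // The next caller waits on this tail, which settles once we release.
  const tail = previous.then(() => current)
  tails.set(key, tail)

  await previous
  try {
    return await fn()
  } finally {
    release()
    // Drop the entry only if nobody queued behind this caller.
    if (tails.get(key) === tail) {
      tails.delete(key)
    }
  }
}
```

Two calls made back to back with the same key run strictly one after the other, and the map entry is removed once the last queued caller finishes.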
--- a/src/utils/autonomyRuns.ts
+++ b/src/utils/autonomyRuns.ts
@@ -0,0 +1,782 @@
+import { randomUUID } from 'crypto'
+import { mkdir, writeFile } from 'fs/promises'
+import { dirname, join, resolve } from 'path'
+import { getProjectRoot } from '../bootstrap/state.js'
+import type { MessageOrigin } from '../types/message.js'
+import type { QueuedCommand } from '../types/textInputTypes.js'
+import {
+  AUTONOMY_DIR,
+  buildAutonomyTurnPrompt,
+  commitPreparedAutonomyTurn,
+  prepareAutonomyTurnPrompt,
+  type AutonomyTriggerKind,
+  type HeartbeatAuthorityTask,
+} from './autonomyAuthority.js'
+import { getCwd } from './cwd.js'
+import {
+  DEFAULT_AUTONOMY_OWNER_KEY,
+  getAutonomyFlowById,
+  markManagedAutonomyFlowStepCancelled,
+  markManagedAutonomyFlowStepCompleted,
+  markManagedAutonomyFlowStepFailed,
+  markManagedAutonomyFlowStepRunning,
+  queueManagedAutonomyFlowStepRun,
+  resumeManagedAutonomyFlow,
+  startManagedAutonomyFlow,
+  type AutonomyFlowRecord,
+  type AutonomyFlowSyncMode,
+  type ManagedAutonomyFlowStepDefinition,
+} from './autonomyFlows.js'
+import { withAutonomyPersistenceLock } from './autonomyPersistence.js'
+import { getFsImplementation } from './fsOperations.js'
+
+const AUTONOMY_RUNS_MAX = 200
+const AUTONOMY_RUNS_RELATIVE_PATH = join(AUTONOMY_DIR, 'runs.json')
+
+export type AutonomyRunStatus =
+  | 'queued'
+  | 'running'
+  | 'completed'
+  | 'failed'
+  | 'cancelled'
+
+export type AutonomyRunRuntime = 'automatic' | 'flow_step'
+
+export type AutonomyRunRecord = {
+  runId: string
+  runtime: AutonomyRunRuntime
+  trigger: AutonomyTriggerKind
+  status: AutonomyRunStatus
+  rootDir: string
+  currentDir: string
+  ownerKey: string
+  sourceId?: string
+  sourceLabel?: string
+  parentFlowId?: string
+  parentFlowKey?: string
+  parentFlowSyncMode?: AutonomyFlowSyncMode
+  flowStepId?: string
+  flowStepName?: string
+  promptPreview: string
+  createdAt: number
+  startedAt?: number
+  endedAt?: number
+  error?: string
+}
+
+type AutonomyRunsFile = {
+  runs: AutonomyRunRecord[]
+}
+
+type AutonomyRunFlowRef = {
+  flowId: string
+  flowKey: string
+  syncMode: AutonomyFlowSyncMode
+  ownerKey: string
+  stepId: string
+  stepName: string
+}
+
+function truncatePromptPreview(prompt: string): string {
+  const singleLine = prompt.replace(/\s+/g, ' ').trim()
+  return singleLine.length <= 240
+    ? singleLine
+    : `${singleLine.slice(0, 237)}...`
+}
+
+function cloneRunRecord(run: AutonomyRunRecord): AutonomyRunRecord {
+  return { ...run }
+}
+
+export function resolveAutonomyRunsPath(
+  rootDir: string = getProjectRoot(),
+): string {
+  return join(resolve(rootDir), AUTONOMY_RUNS_RELATIVE_PATH)
+}
+
+export async function listAutonomyRuns(
+  rootDir: string = getProjectRoot(),
+): Promise<AutonomyRunRecord[]> {
+  try {
+    const raw = (await getFsImplementation().readFile(
+      resolveAutonomyRunsPath(rootDir),
+      {
+        encoding: 'utf-8',
+      },
+    )) as string
+    const parsed = JSON.parse(raw) as Partial<AutonomyRunsFile>
+    if (!Array.isArray(parsed.runs)) {
+      return []
+    }
+    return parsed.runs
+      .filter((run): run is AutonomyRunRecord => {
+        return Boolean(
+          run &&
+            typeof run.runId === 'string' &&
+            typeof run.trigger === 'string' &&
+            typeof run.status === 'string' &&
+            typeof run.rootDir === 'string' &&
+            typeof run.promptPreview === 'string' &&
+            typeof run.createdAt === 'number',
+        )
+      })
+      .map(run => ({
+        ...cloneRunRecord(run),
+        runtime: run.runtime === 'flow_step' ? 'flow_step' : 'automatic',
+        currentDir: run.currentDir || run.rootDir,
+        ownerKey: run.ownerKey || DEFAULT_AUTONOMY_OWNER_KEY,
+      }))
+      .sort((left, right) => right.createdAt - left.createdAt)
+  } catch {
+    return []
+  }
+}
+
+async function writeAutonomyRuns(
+  runs: AutonomyRunRecord[],
+  rootDir: string = getProjectRoot(),
+): Promise<void> {
+  const path = resolveAutonomyRunsPath(rootDir)
+  await mkdir(dirname(path), { recursive: true })
+  await writeFile(
+    path,
+    `${JSON.stringify(
+      {
+        runs: runs
+          .slice()
+          .map(cloneRunRecord)
+          .sort((left, right) => right.createdAt - left.createdAt)
+          .slice(0, AUTONOMY_RUNS_MAX),
+      } satisfies AutonomyRunsFile,
+      null,
+      2,
+    )}\n`,
+    'utf-8',
+  )
+}
+
+async function updateAutonomyRun(
+  runId: string,
+  updater: (current: AutonomyRunRecord) => AutonomyRunRecord,
+  rootDir: string = getProjectRoot(),
+): Promise<AutonomyRunRecord | null> {
+  return withAutonomyPersistenceLock(rootDir, async () => {
+    const runs = await listAutonomyRuns(rootDir)
+    const index = runs.findIndex(run => run.runId === runId)
+    if (index === -1) {
+      return null
+    }
+    const updated = cloneRunRecord(updater(cloneRunRecord(runs[index]!)))
+    runs[index] = updated
+    await writeAutonomyRuns(runs, rootDir)
+    return updated
+  })
+}
+
+export async function getAutonomyRunById(
+  runId: string,
+  rootDir: string = getProjectRoot(),
+): Promise<AutonomyRunRecord | null> {
+  const runs = await listAutonomyRuns(rootDir)
+  return runs.find(run => run.runId === runId) ?? null
+}
+
+export async function createAutonomyRun(params: {
+  trigger: AutonomyTriggerKind
+  prompt: string
+  rootDir?: string
+  currentDir?: string
+  sourceId?: string
+  sourceLabel?: string
+  runtime?: AutonomyRunRuntime
+  ownerKey?: string
+  flow?: AutonomyRunFlowRef
+  nowMs?: number
+}): Promise<AutonomyRunRecord> {
+  const rootDir = resolve(params.rootDir ?? getProjectRoot())
+  const currentDir = resolve(params.currentDir ?? rootDir)
+  const record: AutonomyRunRecord = {
+    runId: randomUUID(),
+    runtime: params.runtime ?? (params.flow ? 'flow_step' : 'automatic'),
+    trigger: params.trigger,
+    status: 'queued',
+    rootDir,
+    currentDir,
+    ownerKey:
+      params.flow?.ownerKey ?? params.ownerKey ?? DEFAULT_AUTONOMY_OWNER_KEY,
+    ...(params.sourceId ? { sourceId: params.sourceId } : {}),
+    ...(params.sourceLabel ? { sourceLabel: params.sourceLabel } : {}),
+    ...(params.flow
+      ? {
+          parentFlowId: params.flow.flowId,
+          parentFlowKey: params.flow.flowKey,
+          parentFlowSyncMode: params.flow.syncMode,
+          flowStepId: params.flow.stepId,
+          flowStepName: params.flow.stepName,
+        }
+      : {}),
+    promptPreview: truncatePromptPreview(params.prompt),
+    createdAt: params.nowMs ?? Date.now(),
+  }
+  await withAutonomyPersistenceLock(rootDir, async () => {
+    const runs = await listAutonomyRuns(rootDir)
+    runs.unshift(record)
+    await writeAutonomyRuns(runs, rootDir)
+  })
+  if (
+    record.parentFlowId &&
+    record.flowStepId &&
+    record.parentFlowSyncMode === 'managed'
+  ) {
+    const stepIndex =
+      (
+        await getAutonomyFlowById(record.parentFlowId, rootDir)
+      )?.stateJson?.steps.findIndex(
+        step => step.stepId === record.flowStepId,
+      ) ?? 0
+    await queueManagedAutonomyFlowStepRun({
+      flowId: record.parentFlowId,
+      stepId: record.flowStepId,
+      stepIndex: stepIndex >= 0 ? stepIndex : 0,
+      runId: record.runId,
+      rootDir,
+      nowMs: record.createdAt,
+    })
+  }
+  return record
+}
+
+function buildManagedFlowStepPrompt(
+  flow: AutonomyFlowRecord,
+  stepIndex: number,
+): string {
+  const state = flow.stateJson
+  const step = state?.steps[stepIndex]
+  if (!state || !step) {
+    return flow.goal
+  }
+  const completed = state.steps
+    .slice(0, stepIndex)
+    .filter(candidate => candidate.status === 'completed')
+    .map(candidate => `- ${candidate.name}`)
+  const remaining = state.steps
+    .slice(stepIndex + 1)
+    .map(candidate => `- ${candidate.name}`)
+
+  return [
+    `This is step ${stepIndex + 1}/${state.steps.length} of the managed autonomy flow "${flow.goal}".`,
+    '<autonomy_flow>',
+    `Flow ID: ${flow.flowId}`,
+    `Flow source: ${flow.sourceLabel ?? flow.sourceId ?? 'automatic'}`,
+    `Current step: ${step.name}`,
+    completed.length > 0
+      ? ['Completed steps:', ...completed].join('\n')
+      : 'Completed steps: none',
+    remaining.length > 0
+      ? ['Remaining steps after this one:', ...remaining].join('\n')
+      : 'Remaining steps after this one: none',
+    '</autonomy_flow>',
+    step.prompt,
+  ].join('\n\n')
+}
+
+async function createOrRecoverManagedFlowStepCommand(params: {
+  flowId: string
+  rootDir?: string
+  currentDir?: string
+  priority?: 'now' | 'next' | 'later'
+  workload?: string
+}): Promise<QueuedCommand | null> {
+  const rootDir = resolve(params.rootDir ?? getProjectRoot())
+  const flow = await getAutonomyFlowById(params.flowId, rootDir)
+  if (!flow || flow.status !== 'queued' || !flow.stateJson) {
+    return null
+  }
+  const stepIndex = flow.stateJson.currentStepIndex
+  const step = flow.stateJson.steps[stepIndex]
+  if (!step) {
+    return null
+  }
+  if (step.status === 'queued' && step.runId) {
+    const run = await getAutonomyRunById(step.runId, rootDir)
+    if (run && run.status === 'queued' && !run.startedAt && !run.endedAt) {
+      const value = await buildAutonomyTurnPrompt({
+        basePrompt: buildManagedFlowStepPrompt(flow, stepIndex),
+        trigger: 'managed-flow-step',
+        rootDir,
+        currentDir: params.currentDir ?? flow.currentDir,
+      })
+      const origin = {
+        kind: 'autonomy',
+        trigger: 'managed-flow-step',
+        runId: run.runId,
+        ...(run.sourceId ? { sourceId: run.sourceId } : {}),
+      } as unknown as MessageOrigin
+      return {
+        value,
+        mode: 'prompt',
+        priority: params.priority ?? 'later',
+        isMeta: true,
+        origin,
+        workload: params.workload,
+        autonomy: {
+          runId: run.runId,
+          trigger: 'managed-flow-step',
+          sourceId: run.sourceId,
+          sourceLabel: run.sourceLabel,
+          ...(run.parentFlowId ? { flowId: run.parentFlowId } : {}),
+          ...(run.flowStepId ? { flowStepId: run.flowStepId } : {}),
+          ...(run.flowStepName ? { flowStepName: run.flowStepName } : {}),
+        },
+      }
+    }
+    return null
+  }
+  if (step.status !== 'pending' || step.runId) {
+    return null
+  }
+  return createAutonomyQueuedPrompt({
+    basePrompt: buildManagedFlowStepPrompt(flow, stepIndex),
+    trigger: 'managed-flow-step',
+    rootDir,
+    currentDir: params.currentDir ?? flow.currentDir,
+    sourceId: flow.sourceId ?? flow.flowId,
+    sourceLabel: flow.sourceLabel ?? flow.goal,
+    workload: params.workload,
+    priority: params.priority,
+    flow: {
+      flowId: flow.flowId,
+      flowKey: flow.flowKey,
+      syncMode: 'managed',
+      ownerKey: flow.ownerKey,
+      stepId: step.stepId,
+      stepName: step.name,
+    },
+  })
+}
+
+async function queueCurrentManagedFlowStepCommand(params: {
+  flowId: string
+  rootDir?: string
+  currentDir?: string
+  priority?: 'now' | 'next' | 'later'
+  workload?: string
+}): Promise<QueuedCommand | null> {
+  return createOrRecoverManagedFlowStepCommand(params)
+}
+
+export async function startManagedAutonomyFlowFromHeartbeatTask(params: {
+  task: HeartbeatAuthorityTask
+  rootDir?: string
+  currentDir?: string
+  ownerKey?: string
+  priority?: 'now' | 'next' | 'later'
+  workload?: string
+}): Promise<QueuedCommand | null> {
+  if (params.task.steps.length === 0) {
+    return null
+  }
+  const rootDir = resolve(params.rootDir ?? getProjectRoot())
+  const currentDir = resolve(params.currentDir ?? getCwd())
+  const started = await startManagedAutonomyFlow({
+    trigger: 'proactive-tick',
+    goal: params.task.prompt,
+    steps: params.task.steps.map<ManagedAutonomyFlowStepDefinition>(step => ({
+      name: step.name,
+      prompt: step.prompt,
+      ...(step.waitFor ? { waitFor: step.waitFor } : {}),
+    })),
+    rootDir,
+    currentDir,
+    ownerKey: params.ownerKey,
+    sourceId: `heartbeat:${params.task.name}`,
+    sourceLabel: params.task.name,
+  })
+  if (!started) {
+    return null
+  }
+  return createOrRecoverManagedFlowStepCommand({
+    flowId: started.flow.flowId,
+    rootDir,
+    currentDir,
+    priority: params.priority,
+    workload: params.workload,
+  })
+}
+
+export async function markAutonomyRunRunning(
+  runId: string,
+  rootDir?: string,
+  nowMs?: number,
+): Promise<AutonomyRunRecord | null> {
+  const updated = await updateAutonomyRun(
+    runId,
+    current => ({
+      ...current,
+      status: 'running',
+      startedAt: nowMs ?? Date.now(),
+    }),
+    rootDir,
+  )
+  if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
+    await markManagedAutonomyFlowStepRunning({
+      flowId: updated.parentFlowId,
+      runId: updated.runId,
+      rootDir,
+      nowMs: updated.startedAt,
+    })
+  }
+  return updated
+}
+
+export async function markAutonomyRunCompleted(
+  runId: string,
+  rootDir?: string,
+  nowMs?: number,
+): Promise<AutonomyRunRecord | null> {
+  const updated = await updateAutonomyRun(
+    runId,
+    current => ({
+      ...current,
+      status: 'completed',
+      endedAt: nowMs ?? Date.now(),
+      error: undefined,
+    }),
+    rootDir,
+  )
+  if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
+    await markManagedAutonomyFlowStepCompleted({
+      flowId: updated.parentFlowId,
+      runId: updated.runId,
+      rootDir,
+      nowMs: updated.endedAt,
+    })
+  }
+  return updated
+}
+
+export async function markAutonomyRunFailed(
+  runId: string,
+  error: string,
+  rootDir?: string,
+  nowMs?: number,
+): Promise<AutonomyRunRecord | null> {
+  const updated = await updateAutonomyRun(
+    runId,
+    current => ({
+      ...current,
+      status: 'failed',
+      endedAt: nowMs ?? Date.now(),
+      error,
+    }),
+    rootDir,
+  )
+  if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
+    await markManagedAutonomyFlowStepFailed({
+      flowId: updated.parentFlowId,
+      runId: updated.runId,
+      error,
+      rootDir,
+      nowMs: updated.endedAt,
+    })
+  }
+  return updated
+}
+
+export async function markAutonomyRunCancelled(
+  runId: string,
+  rootDir?: string,
+  nowMs?: number,
+): Promise<AutonomyRunRecord | null> {
+  const updated = await updateAutonomyRun(
+    runId,
+    current => ({
+      ...current,
+      status: 'cancelled',
+      endedAt: nowMs ?? Date.now(),
+      error: undefined,
+    }),
+    rootDir,
+  )
+  if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
+    await markManagedAutonomyFlowStepCancelled({
+      flowId: updated.parentFlowId,
+      runId: updated.runId,
+      rootDir,
+      nowMs: updated.endedAt,
+    })
+  }
+  return updated
+}
+
+export async function finalizeAutonomyRunCompleted(params: {
+  runId: string
+  rootDir?: string
+  currentDir?: string
+  priority?: 'now' | 'next' | 'later'
+  workload?: string
+  nowMs?: number
+}): Promise<QueuedCommand[]> {
+  const updated = await markAutonomyRunCompleted(
+    params.runId,
+    params.rootDir,
+    params.nowMs,
+  )
+  if (!updated?.parentFlowId || updated.parentFlowSyncMode !== 'managed') {
+    return []
+  }
+  const next = await queueCurrentManagedFlowStepCommand({
+    flowId: updated.parentFlowId,
+    rootDir: params.rootDir,
+    currentDir: params.currentDir ?? updated.currentDir,
+    priority: params.priority,
+    workload: params.workload,
+  })
+  return next ? [next] : []
+}
+
+export async function finalizeAutonomyRunFailed(params: {
+  runId: string
+  error: string
+  rootDir?: string
+  nowMs?: number
+}): Promise<void> {
+  await markAutonomyRunFailed(
+    params.runId,
+    params.error,
+    params.rootDir,
+    params.nowMs,
+  )
+}
+
+export async function recoverManagedAutonomyFlowPrompt(params: {
+  flowId: string
+  rootDir?: string
+  currentDir?: string
+  priority?: 'now' | 'next' | 'later'
+  workload?: string
+}): Promise<QueuedCommand | null> {
+  return createOrRecoverManagedFlowStepCommand(params)
+}
+
+export async function resumeManagedAutonomyFlowPrompt(params: {
+  flowId: string
+  rootDir?: string
+  currentDir?: string
+  priority?: 'now' | 'next' | 'later'
+  workload?: string
+  nowMs?: number
+}): Promise<QueuedCommand | null> {
+  const resumed = await resumeManagedAutonomyFlow({
+    flowId: params.flowId,
+    rootDir: params.rootDir,
+    nowMs: params.nowMs,
+  })
+  if (!resumed) {
+    return recoverManagedAutonomyFlowPrompt({
+      flowId: params.flowId,
+      rootDir: params.rootDir,
+      currentDir: params.currentDir,
+      priority: params.priority,
+      workload: params.workload,
+    })
+  }
+  return createOrRecoverManagedFlowStepCommand({
+    flowId: resumed.flow.flowId,
+    rootDir: params.rootDir,
+    currentDir: params.currentDir ?? resumed.flow.currentDir,
+    priority: params.priority,
+    workload: params.workload,
+  })
+}
+
+export async function createAutonomyQueuedPrompt(params: {
+  trigger: AutonomyTriggerKind
+  basePrompt: string
+  rootDir?: string
+  currentDir?: string
+  sourceId?: string
+  sourceLabel?: string
+  workload?: string
+  priority?: 'now' | 'next' | 'later'
+  shouldCreate?: () => boolean
+  flow?: AutonomyRunFlowRef
+}): Promise<QueuedCommand | null> {
+  const rootDir = resolve(params.rootDir ?? getProjectRoot())
+  const currentDir = resolve(params.currentDir ?? getCwd())
+  const prepared = await prepareAutonomyTurnPrompt({
+    basePrompt: params.basePrompt,
+    trigger: params.trigger,
+    rootDir,
+    currentDir,
+  })
+  if (params.shouldCreate && !params.shouldCreate()) {
+    return null
+  }
+  return commitAutonomyQueuedPrompt({
+    prepared,
+    rootDir,
+    currentDir,
+    sourceId: params.sourceId,
+    sourceLabel: params.sourceLabel,
+    workload: params.workload,
+    priority: params.priority,
+    flow: params.flow,
+  })
+}
+
+export async function commitAutonomyQueuedPrompt(params: {
+  prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
+  rootDir?: string
+  currentDir?: string
+  sourceId?: string
+  sourceLabel?: string
+  workload?: string
+  priority?: 'now' | 'next' | 'later'
+  flow?: AutonomyRunFlowRef
+}): Promise<QueuedCommand> {
+  const rootDir = resolve(
+    params.rootDir ?? params.prepared.rootDir ?? getProjectRoot(),
+  )
+  const currentDir = resolve(
+    params.currentDir ?? params.prepared.currentDir ?? getCwd(),
+  )
+  commitPreparedAutonomyTurn(params.prepared)
+  const value = params.prepared.prompt
+  const run = await createAutonomyRun({
+    trigger: params.prepared.trigger,
+    prompt: value,
+    rootDir,
+    currentDir,
+    sourceId: params.sourceId,
+    sourceLabel: params.sourceLabel,
+    flow: params.flow,
+  })
+  const origin = {
+    kind: 'autonomy',
+    trigger: params.prepared.trigger,
+    runId: run.runId,
+    ...(params.sourceId ? { sourceId: params.sourceId } : {}),
+  } as unknown as MessageOrigin
+
+  return {
+    value,
+    mode: 'prompt',
+    priority: params.priority ?? 'later',
+    isMeta: true,
+    origin,
+    workload: params.workload,
+    autonomy: {
+      runId: run.runId,
+      trigger: params.prepared.trigger,
+      sourceId: params.sourceId,
+      sourceLabel: params.sourceLabel,
+      ...(run.parentFlowId ? { flowId: run.parentFlowId } : {}),
+      ...(run.flowStepId ? { flowStepId: run.flowStepId } : {}),
+      ...(run.flowStepName ? { flowStepName: run.flowStepName } : {}),
+    },
+  }
+}
+
+export async function createProactiveAutonomyCommands(params: {
+  basePrompt: string
+  rootDir?: string
+  currentDir?: string
+  workload?: string
+  priority?: 'now' | 'next' | 'later'
+  shouldCreate?: () => boolean
+}): Promise<QueuedCommand[]> {
+  const rootDir = resolve(params.rootDir ?? getProjectRoot())
+  const currentDir = resolve(params.currentDir ?? getCwd())
+  const prepared = await prepareAutonomyTurnPrompt({
+    basePrompt: params.basePrompt,
+    trigger: 'proactive-tick',
+    rootDir,
+    currentDir,
+  })
+  if (params.shouldCreate && !params.shouldCreate()) {
+    return []
+  }
+
+  const commands: QueuedCommand[] = [
+    await commitAutonomyQueuedPrompt({
+      prepared,
+      rootDir,
+      currentDir,
+      workload: params.workload,
+      priority: params.priority,
+    }),
+  ]
+
+  for (const task of prepared.dueHeartbeatTasks) {
+    if (task.steps.length === 0) {
+      continue
+    }
+    if (params.shouldCreate && !params.shouldCreate()) {
+      break
+    }
+    const flowCommand = await startManagedAutonomyFlowFromHeartbeatTask({
+      task,
+      rootDir,
+      currentDir,
+      priority: params.priority,
+      workload: params.workload,
+    })
+    if (flowCommand) {
+      commands.push(flowCommand)
+    }
+  }
+
+  return commands
+}
+
+export function formatAutonomyRunsStatus(runs: AutonomyRunRecord[]): string {
+  const counts = {
+    queued: 0,
+    running: 0,
+    completed: 0,
+    failed: 0,
+    cancelled: 0,
+  }
+  for (const run of runs) {
+    counts[run.status] += 1
+  }
+  const latest = runs[0]
+  const latestLine = latest
+    ? `Latest: ${latest.trigger} ${latest.status} (${new Date(latest.createdAt).toLocaleString()})`
+    : 'Latest: none'
+  return [
+    `Autonomy runs: ${runs.length}`,
+    `Queued: ${counts.queued}`,
+    `Running: ${counts.running}`,
+    `Completed: ${counts.completed}`,
+    `Failed: ${counts.failed}`,
+    `Cancelled: ${counts.cancelled}`,
+    latestLine,
+  ].join('\n')
+}
+
+export function formatAutonomyRunsList(
+  runs: AutonomyRunRecord[],
+  limit = 10,
+): string {
+  const slice = runs.slice(0, limit)
+  if (slice.length === 0) {
+    return 'No autonomy runs recorded.'
+  }
+  return slice
+    .map(run => {
+      const source = run.sourceLabel ?? run.sourceId ?? 'auto'
+      const flow =
+        run.parentFlowId && run.flowStepName
+          ? ` | flow=${run.parentFlowId} step=${run.flowStepName}`
+          : ''
+      const ended =
+        run.endedAt != null
+          ? ` -> ${new Date(run.endedAt).toLocaleTimeString()}`
+          : ''
+      const error = run.error ? ` | ${run.error}` : ''
+      return `${run.runId} | ${run.runtime} | ${run.trigger} | ${run.status} | ${source}${flow} | ${new Date(run.createdAt).toLocaleTimeString()}${ended}\n  ${run.promptPreview}${error}`
+    })
+    .join('\n')
+}
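
Reviewer note on src/utils/autonomyRuns.ts: the run lifecycle above (queued, then running, then completed, failed, or cancelled) can be sketched without the file system. This is a minimal sketch, not the module itself: `RunStore`, `RunRecord`, `preview`, and `RUNS_MAX` are hypothetical names, and an in-memory array stands in for `runs.json` and the persistence lock. It keeps only the two rules visible in the diff, the 240-character prompt preview and the newest-first cap of 200 records.

```typescript
// Hypothetical in-memory stand-in for runs.json; not the real module.
type RunStatus = 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'

type RunRecord = {
  runId: string
  status: RunStatus
  promptPreview: string
  createdAt: number
  startedAt?: number
  endedAt?: number
  error?: string
}

const RUNS_MAX = 200

// Same rule as truncatePromptPreview: collapse whitespace, cap at 240 chars.
function preview(prompt: string): string {
  const singleLine = prompt.replace(/\s+/g, ' ').trim()
  return singleLine.length <= 240
    ? singleLine
    : `${singleLine.slice(0, 237)}...`
}

class RunStore {
  private runs: RunRecord[] = []

  // New runs always start as 'queued'.
  create(runId: string, prompt: string, nowMs: number): RunRecord {
    const record: RunRecord = {
      runId,
      status: 'queued',
      promptPreview: preview(prompt),
      createdAt: nowMs,
    }
    // Newest first, then drop everything past the cap.
    this.runs = [record, ...this.runs]
      .sort((left, right) => right.createdAt - left.createdAt)
      .slice(0, RUNS_MAX)
    return { ...record }
  }

  // Returns null when the run is unknown, like updateAutonomyRun.
  update(runId: string, patch: Partial<RunRecord>): RunRecord | null {
    const index = this.runs.findIndex(run => run.runId === runId)
    if (index === -1) return null
    const updated: RunRecord = { ...this.runs[index]!, ...patch }
    this.runs[index] = updated
    return { ...updated }
  }

  // Callers get copies, so they cannot mutate stored records.
  list(): RunRecord[] {
    return this.runs.map(run => ({ ...run }))
  }
}

const store = new RunStore()
store.create('run-1', 'first\n  prompt', 1)
store.update('run-1', { status: 'running', startedAt: 2 })
store.update('run-1', { status: 'completed', endedAt: 3 })
```

Returning copies from `create`, `update`, and `list` follows the `cloneRunRecord` habit in the diff.
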
--- a/src/utils/config.ts
+++ b/src/utils/config.ts
@@ -334,6 +334,9 @@ export type GlobalConfig = {
  overageCreditUpsellSeenCount?: number // Number of times the overage credit upsell has been shown
  hasVisitedExtraUsage?: boolean // Whether the user has visited /extra-usage — hides credit upsells

+  // Display language preference
+  preferredLanguage?: 'auto' | 'en' | 'zh' // auto = follow system locale, en = English, zh = Chinese
+
  // Voice mode notice tracking
  voiceNoticeSeenCount?: number // Number of times the voice-mode-available notice has been shown
  voiceLangHintShownCount?: number // Number of times the /voice dictation-language hint has been shown
@@ -556,7 +559,6 @@ export type GlobalConfig = {
  // Speculation configuration (ant-only)
  speculationEnabled?: boolean // Whether speculation is enabled (default: true)

-
  // Client data for server-side experiments (fetched during bootstrap).
  clientDataCache?: Record<string, unknown> | null

--- a/src/utils/handlePromptSubmit.ts
+++ b/src/utils/handlePromptSubmit.ts
@@ -26,6 +26,12 @@ import { fileHistoryEnabled, fileHistoryMakeSnapshot } from './fileHistory.js'
 import { gracefulShutdownSync } from './gracefulShutdown.js'
 import { enqueue } from './messageQueueManager.js'
 import { resolveSkillModelOverride } from './model/model.js'
+import {
+  finalizeAutonomyRunCompleted,
+  finalizeAutonomyRunFailed,
+  markAutonomyRunFailed,
+  markAutonomyRunRunning,
+} from './autonomyRuns.js'
 import type { ProcessUserInputContext } from './processUserInput/processUserInput.js'
 import { processUserInput } from './processUserInput/processUserInput.js'
 import type { QueryGuard } from './QueryGuard.js'
@@ -460,6 +466,7 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
      commands.every(c => c.workload === firstWorkload)
        ? firstWorkload
        : undefined
+    const autonomyRunIds = new Set<string>()

    // Wrap the entire turn (processUserInput loop + onQuery) in an
    // AsyncLocalStorage context. This is the ONLY way to correctly
@@ -469,10 +476,15 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
    // context — isolated from the parent's continuation. A process-global
    // mutable slot would be clobbered at the detached closure's first
    // await by this function's synchronous return path. See state.ts.
-    await runWithWorkload(turnWorkload, async () => {
+    try {
+      await runWithWorkload(turnWorkload, async () => {
      for (let i = 0; i < commands.length; i++) {
        const cmd = commands[i]!
        const isFirst = i === 0
+        if (cmd.autonomy?.runId) {
+          autonomyRunIds.add(cmd.autonomy.runId)
+          await markAutonomyRunRunning(cmd.autonomy.runId)
+        }
        const result = await processUserInput({
          input: cmd.value,
          preExpansionInput: cmd.preExpansionValue,
@@ -593,7 +605,26 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
          params.onInputChange(nextInput)
        }
      }
-    }) // end runWithWorkload — ALS context naturally scoped, no finally needed
+      }) // end runWithWorkload — ALS context naturally scoped, no finally needed
+      for (const runId of autonomyRunIds) {
+        const nextCommands = await finalizeAutonomyRunCompleted({
+          runId,
+          priority: 'later',
+          workload: turnWorkload,
+        })
+        for (const nextCommand of nextCommands) {
+          enqueue(nextCommand)
+        }
+      }
+    } catch (error) {
+      for (const runId of autonomyRunIds) {
+        await finalizeAutonomyRunFailed({
+          runId,
+          error: String(error),
+        })
+      }
+      throw error
+    }
  } finally {
    // Safety net: release the guard reservation if processUserInput threw
    // or onQuery was skipped. No-op if onQuery already ran (guard is idle
--- a/src/utils/language.ts
+++ b/src/utils/language.ts
@@ -0,0 +1,26 @@
+import { getGlobalConfig } from './config.js'
+import { getSystemLocaleLanguage } from './intl.js'
+
+export type PreferredLanguage = 'auto' | 'en' | 'zh'
+export type ResolvedLanguage = 'en' | 'zh'
+
+/**
+ * Resolve the effective display language.
+ * Priority: GlobalConfig.preferredLanguage → system locale → default 'en'.
+ */
+export function getResolvedLanguage(): ResolvedLanguage {
+  const pref = getGlobalConfig().preferredLanguage ?? 'auto'
+  if (pref === 'en' || pref === 'zh') return pref
+  const sysLang = getSystemLocaleLanguage()
+  return sysLang === 'zh' ? 'zh' : 'en'
+}
+
+const DISPLAY_NAMES: Record<string, string> = {
+  auto: 'Auto (follow system)',
+  en: 'English',
+  zh: '中文',
+}
+
+export function getLanguageDisplayName(lang: string): string {
+  return DISPLAY_NAMES[lang] ?? lang
+}
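
Reviewer note on src/utils/language.ts: the priority order in the doc comment (explicit preference, then system locale, then 'en') is easier to check as a pure function. The sketch below is an illustration only. `resolveLanguage` is a hypothetical name, and it takes both inputs as parameters, whereas the real `getResolvedLanguage()` reads them from `getGlobalConfig()` and `getSystemLocaleLanguage()`.

```typescript
type PreferredLanguage = 'auto' | 'en' | 'zh'
type ResolvedLanguage = 'en' | 'zh'

// Hypothetical pure variant of getResolvedLanguage(), for illustration.
function resolveLanguage(
  pref: PreferredLanguage | undefined,
  systemLanguage: string | undefined,
): ResolvedLanguage {
  // A missing preference behaves like 'auto'.
  const effective = pref ?? 'auto'
  // An explicit choice always wins over the system locale.
  if (effective === 'en' || effective === 'zh') return effective
  // 'auto': follow the system locale, falling back to English.
  return systemLanguage === 'zh' ? 'zh' : 'en'
}

const explicit = resolveLanguage('en', 'zh')
const followsSystem = resolveLanguage('auto', 'zh')
const fallback = resolveLanguage(undefined, 'fr')
```

Any system language other than 'zh', including an undefined one, resolves to 'en'.
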
--- a/src/utils/pipeMuteState.ts
+++ b/src/utils/pipeMuteState.ts
@@ -0,0 +1,78 @@
+/**
+ * pipeMuteState — Master-side logical disconnect state.
+ *
+ * Tracks which slave pipes are currently "muted" (logically disconnected)
+ * and which have a temporary `/send` override active.
+ *
+ * This is local master state only — not part of the socket protocol.
+ */
+
+// ---------------------------------------------------------------------------
+// Muted set: slaves whose business messages should be dropped by master
+// ---------------------------------------------------------------------------
+
+const _mutedPipes = new Set<string>()
+
+export function setMasterMutedPipes(names: Iterable<string>): void {
+  _mutedPipes.clear()
+  for (const n of names) _mutedPipes.add(n)
+}
+
+export function isMasterPipeMuted(name: string): boolean {
+  return _mutedPipes.has(name)
+}
+
+export function removeMasterPipeMute(name: string): void {
+  _mutedPipes.delete(name)
+}
+
+export function clearMasterMutedPipes(): void {
+  _mutedPipes.clear()
+}
+
+// ---------------------------------------------------------------------------
+// Send override set: slaves temporarily unmuted by explicit `/send` command.
+// Override lasts until the slave emits `done` or `error`.
+// ---------------------------------------------------------------------------
+
+const _sendOverrides = new Set<string>()
+let _sendOverrideVersion = 0
+const _sendOverrideListeners = new Set<() => void>()
+
+function emitSendOverrideChanged(): void {
+  _sendOverrideVersion += 1
+  for (const listener of _sendOverrideListeners) {
+    listener()
+  }
+}
+
+export function addSendOverride(name: string): void {
+  _sendOverrides.add(name)
+  emitSendOverrideChanged()
+}
+
+export function removeSendOverride(name: string): void {
+  if (_sendOverrides.delete(name)) {
+    emitSendOverrideChanged()
+  }
+}
+
+export function hasSendOverride(name: string): boolean {
+  return _sendOverrides.has(name)
+}
+
+export function clearSendOverrides(): void {
+  if (_sendOverrides.size > 0) {
+    _sendOverrides.clear()
+    emitSendOverrideChanged()
+  }
+}
+
+export function subscribeSendOverride(listener: () => void): () => void {
+  _sendOverrideListeners.add(listener)
+  return () => { _sendOverrideListeners.delete(listener) }
+}
+
+export function getSendOverrideVersion(): number {
+  return _sendOverrideVersion
+}
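
Reviewer note on src/utils/pipeMuteState.ts: this file only stores the two sets, and the call sites that combine them are not part of this hunk. The sketch below shows one plausible reading of the comments, under stated assumptions: a muted slave's messages are dropped unless a `/send` override is active, and the override ends once the slave emits `done` or `error`. `shouldDrop` and `onSlaveMessage` are hypothetical names, and the combination rule is an assumption rather than the actual master-side logic.

```typescript
// Local stand-ins for the module's private sets.
const muted = new Set<string>()
const overrides = new Set<string>()

// Assumed rule: drop only when muted and no override is active.
function shouldDrop(name: string): boolean {
  return muted.has(name) && !overrides.has(name)
}

// Assumed hook: returns whether the message is delivered, and ends the
// override after the slave reports `done` or `error`.
function onSlaveMessage(name: string, type: string): boolean {
  const deliver = !shouldDrop(name)
  if (type === 'done' || type === 'error') {
    overrides.delete(name)
  }
  return deliver
}

muted.add('worker-a')
const droppedWhileMuted = !onSlaveMessage('worker-a', 'stream')
overrides.add('worker-a')
const deliveredWithOverride = onSlaveMessage('worker-a', 'stream')
const deliveredDone = onSlaveMessage('worker-a', 'done')
const droppedAfterDone = !onSlaveMessage('worker-a', 'stream')
```

The terminating `done` message is itself delivered, because the override is removed only after the delivery decision is made.
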
--- a/src/utils/pipePermissionRelay.ts
+++ b/src/utils/pipePermissionRelay.ts
@@ -19,8 +19,22 @@ const pendingPipePermissions = new Map<string, PendingPipePermission>()
 type PipeRelayFn = (message: PipeMessage) => void
 let _pipeRelay: PipeRelayFn | null = null

+// Slave-side mute flag: when true, relayPipeMessage() and permission
+// relay functions will short-circuit. Set by relay_mute / relay_unmute
+// control messages from master.
+let _relayMuted = false
+
+export function setRelayMuted(muted: boolean): void {
+  _relayMuted = muted
+}
+
+export function isRelayMuted(): boolean {
+  return _relayMuted
+}
+
 export function setPipeRelay(fn: PipeRelayFn | null): void {
  _pipeRelay = fn
+  if (!fn) _relayMuted = false // reset on disconnect
 }

 export function getPipeRelay(): PipeRelayFn | null {
@@ -37,6 +51,7 @@ export function tryRelayPipePermissionRequest(
  toolUseConfirm: ToolUseConfirm,
  onResponse: (payload: PipePermissionResponsePayload) => void,
 ): string | null {
+  if (_relayMuted) return null
  const send = getPipeSender()
  if (!send) return null

@@ -93,6 +108,7 @@ export function notifyPipePermissionCancel(
  reason?: string,
 ): void {
  if (!requestId) return
+  if (_relayMuted) return
  const send = getPipeSender()
  if (!send) return
  send({
--- a/src/utils/pipeTransport.ts
+++ b/src/utils/pipeTransport.ts
@@ -31,7 +31,8 @@ import { attachNdjsonFramer } from './ndjsonFramer.js'
 * Message types exchanged over the pipe.
 *
 * Basic:        ping, pong
- * Control:      attach_request, attach_accept, attach_reject, detach
+ * Control:      attach_request, attach_accept, attach_reject, detach,
+ *               relay_mute, relay_unmute
 * Data (M→S):   prompt           — master sends user input to slave
 * Data (S→M):   stream           — slave streams AI output fragments
 *               tool_start       — slave notifies tool execution start
@@ -49,6 +50,9 @@ export type PipeMessageType =
  | 'attach_accept'
  | 'attach_reject'
  | 'detach'
+  // Mute control (master → slave): logical disconnect without dropping transport
+  | 'relay_mute'
+  | 'relay_unmute'
  // Data flow (master → slave)
  | 'prompt'
  // Data flow (slave → master)
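
Reviewer note on the `relay_mute` / `relay_unmute` control messages: the slave-side handler is not shown in these hunks, so the following is a sketch under assumptions. `ControlMessage`, `handleControl`, and `relay` are hypothetical names. Only the two message type strings and the short-circuit behavior come from the diff.

```typescript
// Hypothetical message shape; only the `type` values come from the diff.
type ControlMessage = { type: 'relay_mute' | 'relay_unmute' }

let relayMuted = false
const sent: string[] = []

// Assumed slave-side handler for the two control messages.
function handleControl(message: ControlMessage): void {
  relayMuted = message.type === 'relay_mute'
}

// Mirrors the `if (_relayMuted) return null` guard: nothing is sent
// while muted, and the transport itself stays connected.
function relay(payload: string): boolean {
  if (relayMuted) return false
  sent.push(payload)
  return true
}

handleControl({ type: 'relay_mute' })
const sentWhileMuted = relay('first')
handleControl({ type: 'relay_unmute' })
const sentAfterUnmute = relay('second')
```

Because the mute is a flag rather than a socket close, unmuting needs no reconnect or re-attach handshake.
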
--- a/src/utils/swarm/inProcessRunner.ts
+++ b/src/utils/swarm/inProcessRunner.ts
@@ -66,6 +66,11 @@ import { evictTerminalTask } from '../../utils/task/framework.js'
 import { tokenCountWithEstimation } from '../../utils/tokens.js'
 import { createAbortController } from '../abortController.js'
 import { type AgentContext, runWithAgentContext } from '../agentContext.js'
+import {
+  markAutonomyRunCompleted,
+  markAutonomyRunFailed,
+  markAutonomyRunRunning,
+} from '../autonomyRuns.js'
 import { count } from '../array.js'
 import { logForDebugging } from '../debug.js'
 import { cloneFileStateCache } from '../fileStateCache.js'
@@ -668,6 +673,7 @@ type WaitResult =
  | {
      type: 'new_message'
      message: string
+      autonomyRunId?: string
      from: string
      color?: string
      summary?: string
@@ -710,7 +716,7 @@ async function waitForNextPromptOrShutdown(
      task.type === 'in_process_teammate' &&
      task.pendingUserMessages.length > 0
    ) {
-      const message = task.pendingUserMessages[0]! // Safe: checked length > 0
+      const pending = task.pendingUserMessages[0]! // Safe: checked length > 0
      // Pop the message from the queue
      setAppState(prev => {
        const prevTask = prev.tasks[taskId]
@@ -731,9 +737,13 @@ async function waitForNextPromptOrShutdown(
      logForDebugging(
        `[inProcessRunner] ${identity.agentName} found pending user message (poll #${pollCount})`,
      )
+      if (pending.autonomyRunId) {
+        await markAutonomyRunRunning(pending.autonomyRunId)
+      }
      return {
        type: 'new_message',
-        message,
+        message: pending.message,
+        autonomyRunId: pending.autonomyRunId,
        from: 'user',
      }
    }
@@ -1010,6 +1020,7 @@ export async function runInProcessTeammate(
    description,
  )
  let currentPrompt = wrappedInitialPrompt
+  let currentAutonomyRunId: string | undefined
  let shouldExit = false

  // Try to claim an available task immediately so the UI can show activity
@@ -1306,6 +1317,13 @@ export async function runInProcessTeammate(
          }),
          setAppState,
        )
+        if (currentAutonomyRunId) {
+          await markAutonomyRunFailed(currentAutonomyRunId, ERROR_MESSAGE_USER_ABORT)
+          currentAutonomyRunId = undefined
+        }
+      } else if (currentAutonomyRunId) {
+        await markAutonomyRunCompleted(currentAutonomyRunId)
+        currentAutonomyRunId = undefined
      }

      // Check if already idle before updating (to skip duplicate notification)
@@ -1378,6 +1396,7 @@ export async function runInProcessTeammate(
            createUserMessage({ content: currentPrompt }),
            setAppState,
          )
+          currentAutonomyRunId = undefined
          break

        case 'new_message':
@@ -1389,6 +1408,7 @@ export async function runInProcessTeammate(
          // Messages from other teammates get XML wrapper for identification
          if (waitResult.from === 'user') {
            currentPrompt = waitResult.message
+            currentAutonomyRunId = waitResult.autonomyRunId
          } else {
            currentPrompt = formatAsTeammateMessage(
              waitResult.from,
@@ -1404,6 +1424,7 @@ export async function runInProcessTeammate(
              createUserMessage({ content: currentPrompt }),
              setAppState,
            )
+            currentAutonomyRunId = undefined
          }
          break

@@ -1459,7 +1480,6 @@ export async function runInProcessTeammate(
        summary: identity.agentId,
      })
    }
-
    unregisterPerfettoAgent(identity.agentId)
    return { success: true, messages: allMessages }
  } catch (error) {
@@ -1511,6 +1531,9 @@ export async function runInProcessTeammate(
        summary: identity.agentId,
      })
    }
+    if (currentAutonomyRunId) {
+      await markAutonomyRunFailed(currentAutonomyRunId, errorMessage)
+    }

    // Send idle notification with failure via file-based mailbox
    await sendIdleNotification(
--- a/src/utils/swarm/spawnInProcess.ts
+++ b/src/utils/swarm/spawnInProcess.ts
@@ -24,6 +24,7 @@ import type {
  TeammateIdentity,
 } from '../../tasks/InProcessTeammateTask/types.js'
 import { createAbortController } from '../abortController.js'
+import { markAutonomyRunFailed } from '../autonomyRuns.js'
 import { formatAgentId } from '../agentId.js'
 import { registerCleanup } from '../cleanupRegistry.js'
 import { logForDebugging } from '../debug.js'
@@ -304,6 +305,15 @@ export function killInProcessTeammate(
  }

  if (killed) {
+    const pendingAutonomyRunIds = teammateTask.pendingUserMessages
+      .map(message => message.autonomyRunId)
+      .filter((runId): runId is string => Boolean(runId))
+    for (const runId of pendingAutonomyRunIds) {
+      void markAutonomyRunFailed(
+        runId,
+        `Teammate ${agentId ?? taskId} was stopped before it could consume the queued autonomy prompt.`,
+      )
+    }
    void evictTaskOutput(taskId)
    // notified:true was pre-set so no XML notification fires; close the SDK
    // task_started bookend directly. The in-process runner's own
--- a/src/utils/taskSummary.ts
+++ b/src/utils/taskSummary.ts
@@ -1,3 +1,78 @@
-// Auto-generated stub — replace with real implementation
-export const shouldGenerateTaskSummary: () => boolean = () => false;
-export const maybeGenerateTaskSummary: (options: Record<string, unknown>) => void = () => {};
+import { feature } from 'bun:bundle'
+import { isBgSession, updateSessionActivity } from './concurrentSessions.js'
+import { logForDebugging } from './debug.js'
+
+/**
+ * Minimum interval between task summary generations (ms).
+ * Prevents excessive updates during rapid tool-call loops.
+ */
+const SUMMARY_INTERVAL_MS = 30_000
+
+let lastSummaryTime = 0
+
+/**
+ * Whether a task summary should be generated this turn.
+ * Only generates in bg sessions, and rate-limits to avoid churn.
+ */
+export function shouldGenerateTaskSummary(): boolean {
+  if (!feature('BG_SESSIONS')) return false
+  if (!isBgSession()) return false
+
+  const now = Date.now()
+  return now - lastSummaryTime >= SUMMARY_INTERVAL_MS
+}
+
+/**
+ * Generate a task summary from the current turn's messages and push it
+ * to the session registry so `claude ps` can display live status.
+ *
+ * Fire-and-forget from query.ts — errors are logged, never thrown.
+ */
+export function maybeGenerateTaskSummary(
+  options: Record<string, unknown>,
+): void {
+  lastSummaryTime = Date.now()
+
+  try {
+    const messages = options.forkContextMessages as
+      | Array<{
+          type: string
+          message?: { content?: unknown }
+        }>
+      | undefined
+
+    if (!messages || messages.length === 0) return
+
+    // Extract a short status from the most recent assistant message
+    const lastAssistant = [...messages]
+      .reverse()
+      .find(m => m.type === 'assistant')
+
+    let status = 'working'
+    let waitingFor: string | undefined
+
+    if (lastAssistant?.message?.content) {
+      const content = lastAssistant.message.content
+      // A trailing tool_use block means the turn is waiting on that tool
+      if (Array.isArray(content)) {
+        const lastBlock = content[content.length - 1] as
+          | Record<string, unknown>
+          | undefined
+        if (lastBlock?.type === 'tool_use') {
+          status = 'busy'
+          waitingFor = `tool: ${lastBlock.name || 'unknown'}`
+        }
+      }
+    }
+
+    // Fire-and-forget update to session registry (the cast below also lets the default 'working' through)
+    void updateSessionActivity({
+      status: status as 'busy' | 'idle',
+      waitingFor,
+    }).catch(err => {
+      logForDebugging(`[taskSummary] updateSessionActivity failed: ${err}`)
+    })
+  } catch (err) {
+    logForDebugging(`[taskSummary] error: ${err}`)
+  }
+}
--- a/tests/mocks/file-system.ts
+++ b/tests/mocks/file-system.ts
@@ -1,15 +1,13 @@
-import { mkdtemp, rm, writeFile, mkdir } from "node:fs/promises";
-import { tmpdir } from "node:os";
-import { join } from "node:path";
+import { mkdtemp, rm, writeFile, mkdir } from 'node:fs/promises'
+import { tmpdir } from 'node:os'
+import { dirname, join } from 'node:path'

-export async function createTempDir(
-  prefix = "claude-test-",
-): Promise<string> {
-  return mkdtemp(join(tmpdir(), prefix));
+export async function createTempDir(prefix = 'claude-test-'): Promise<string> {
+  return mkdtemp(join(tmpdir(), prefix))
 }

 export async function cleanupTempDir(dir: string): Promise<void> {
-  await rm(dir, { recursive: true, force: true });
+  await rm(dir, { recursive: true, force: true })
 }

 export async function writeTempFile(
@@ -17,16 +15,18 @@ export async function writeTempFile(
  name: string,
  content: string,
 ): Promise<string> {
-  const path = join(dir, name);
-  await writeFile(path, content, "utf-8");
-  return path;
+  const path = join(dir, name)
+  const parentDir = dirname(path)
+  await mkdir(parentDir, { recursive: true })
+  await writeFile(path, content, 'utf-8')
+  return path
 }

 export async function createTempSubdir(
  dir: string,
  name: string,
 ): Promise<string> {
-  const path = join(dir, name);
-  await mkdir(path, { recursive: true });
-  return path;
+  const path = join(dir, name)
+  await mkdir(path, { recursive: true })
+  return path
 }
--- a/tsconfig.json
+++ b/tsconfig.json
@@ -14,9 +14,13 @@
        "paths": {
            "src/*": ["./src/*"],
            "@claude-code-best/builtin-tools/*": ["./packages/builtin-tools/src/*"],
-            "@claude-code-best/builtin-tools": ["./packages/builtin-tools/src/index.ts"]
+            "@claude-code-best/builtin-tools": ["./packages/builtin-tools/src/index.ts"],
+            "@claude-code-best/mcp-client/*": ["./packages/mcp-client/src/*"],
+            "@claude-code-best/mcp-client": ["./packages/mcp-client/src/index.ts"],
+            "@claude-code-best/agent-tools/*": ["./packages/agent-tools/src/*"],
+            "@claude-code-best/agent-tools": ["./packages/agent-tools/src/index.ts"]
        }
    },
-    "include": ["src/**/*.ts", "src/**/*.tsx", "packages/builtin-tools/src/**/*.ts", "packages/builtin-tools/src/**/*.tsx"],
-    "exclude": ["node_modules"]
+    "include": ["src/**/*.ts", "src/**/*.tsx", "packages/builtin-tools/src/**/*.ts", "packages/builtin-tools/src/**/*.tsx", "packages/mcp-client/src/**/*.ts"],
+    "exclude": ["node_modules", "packages/mcp-client/src/__tests__"]
 }