feat: dynamic-workflow is here (#1271)

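* feat(workflow): add workflow engine, /workflows panel, /ultracode skill

Squashes the 20 workflow-related commits from the feat/sdk-backend branch into a single commit:

- Workflow engine core: phase / agent / parallel / pipeline orchestration primitives (packages/workflow-engine/)
- /workflows panel: three-zone focus layout (run tabs on top + phase sidebar on the left + agent list on the right)
- /ultracode skill: entry point for multi-agent workflow orchestration
- Progress store / journal / notification system
- WorkflowService lifecycle management + SentryErrorBoundary
- Script sandbox: dynamic import() disabled, defensive normalization of JSON args
- journal and named-workflow paths unified under projectRoot
- Error handling: error logging in parallel/pipeline hooks, failure routing, semaphore abort
- workflow tools promoted to core tools + PascalCase naming

Co-Authored-By: glm-5.1 <zai-org@claude-code-best.win>

* feat(workflow): replicate the ultracode manual and close three gaps (worktree/inline/opt-in)

After reviewing agent-system consistency around the ultracode skill:

- ultracode.ts: replace the condensed Chinese version with the full system-prompt version of the Workflow orchestration manual
- HIGH#1 isolation:'worktree': claudeCodeBackend.run() wraps runAgent in createAgentWorktree + runWithCwdOverride, with cleanup in finally, to get real cwd isolation; the slug is derived from sha256(runId:agentId) so that it matches the cleanup regex in cleanupStaleAgentWorktrees (fixes the leak blind spot caused by runId being w+base36 rather than a UUID); the comments in worktree.ts are corrected to match
- HIGH#2 inline persistence: add persistInlineScript; the two inline paths (WorkflowTool + service) symmetrically persist to .claude/workflow-runs/<runId>/script.js and return a reusable scriptPath (closing the inline → edit → resubmit-by-scriptPath iteration loop)
- HIGH#3 opt-in division of responsibility: ultracode/WorkflowTool/effort now note that the session reminder is injected by the harness and that the repo contains no ultracode signal; the two-layer gate of feature('WORKFLOW_SCRIPTS') + isEnabled is kept, and no custom injection is invented
- Tests: add persistInline.test.ts; extend claudeCodeBackend (4 isolation cases) / WorkflowTool (inline) / service (scriptPath) / ultracode (harness)

Includes the accompanying workflow engine/panel improvements and the run-state-persistence design doc.

Co-Authored-By: Claude <noreply@anthropic.com>

* feat(workflow): persist terminal run state to state.json for recovery across restarts

The terminal RunProgress (including returnValue/error) previously lived only in the in-memory ProgressStore and was lost on process restart. This change persists it to .claude/workflow-runs/<runId>/state.json so that (a) the return value can be fetched by runId after a restart and (b) the /workflows panel shows historical runs across restarts. Cross-process resume is explicitly out of scope.

- persistence.ts: getRunsDir/writeRunState/readRunState/listPersistedRuns + attachRunStatePersistence; atomic overwrite (tmp+rename), fault-tolerant reads (missing file / corrupt file / schemaVersion mismatch → null), best-effort writes (IO failures only log a warning)
- progress/store.ts: add hydrate(run) to inject a run from disk directly (skipped if the runId already exists; memory takes precedence)
- service.ts: getWorkflowService() wires up attachRunStatePersistence(bus, store), which subscribes to run_done (shared by the three states completed/failed/killed; shutdown-kill takes the same path, so no extra hook is needed); WorkflowService gains getRunAsync(id), which falls back to reading from disk on a memory miss (without injecting into memory), and loadPersistedRuns(), which scans the disk and hydrates (the persistedLoaded flag guards idempotency)
- panel/WorkflowsPanel.tsx: call loadPersistedRuns once on mount (not repeated on remount)
- ports.ts: runsDir now uses getRunsDir() to eliminate duplicated path joining
- Tests: persistence.test.ts (11) / runStatePersistence.test.ts (5) / progressStore (2) / service (5) / WorkflowsPanel (1), 24 new tests in total; precheck 5629 pass / 0 fail

Design deviation: the plan originally called for monkey-patching getRunsDir to point at tmpdir, which is not feasible because Bun ESM namespaces are immutable. Instead, an optional runsDirProvider parameter (default getRunsDir) is injected via DI and added to attachRunStatePersistence and makeService (4th parameter, after cwdOverride), consistent with the existing cwdOverride pattern. makeService's cwdOverride is unchanged, so the inline persistence feature is not broken.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(workflow): lower default concurrency to 3 and support per-run maxConcurrency injection

- DEFAULT_MAX_CONCURRENCY=3 replaces the old min(16, cores-2); MAX_CONCURRENCY_CAP=16 is kept as the absolute upper bound on user input
- Add clampMaxConcurrency() to handle the undefined/<1/>CAP boundaries
- WorkflowInput schema gains maxConcurrency: number.int().min(1).max(16).optional()
- Threaded end to end through the engine's context/runWorkflow: the semaphore capacity comes from the per-run input
- WorkflowTool prompt adds guidance: in fan-out scenarios, confirm the concurrency with the user via AskUserQuestion before starting
- Sync the concurrency wording in the ultracode skill + audit workflow spec (remove the cpu-cores formula)
- Sync the old formula in docs/features/workflow-scripts.md

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(workflow): anglicize panel UI strings

The 4 user-facing Chinese strings in WorkflowsPanel (onDone error messages, key-hint line) are changed to English; the other panel components (AgentList/TabsBar) were already in English. Code comments stay in Chinese, consistent with the workflow module's convention.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(workflow): interrupt system (x kills a single agent / K kills the whole workflow, with a confirmation Dialog)

- claudeCodeBackend bridges ctx.signal → runAgent.override.abortController (fixes the root cause of 'x' having no effect: the abort never reached the internal fetch)
- AbortError is recognized and rethrown as WorkflowAbortedError (no longer swallowed as dead, so the workflow can tell it was killed)
- ports.taskRegistrar adds registerAgentAbort/unregisterAgentAbort/killAgent; service.killAgent(runId, agentId) interrupts precisely
- Panel keys: 'x' kills the current agent (when the agents column is focused) / 'K' kills the whole workflow; Dialog confirmation, and confirm mode swallows navigation keys to prevent accidental presses
- 8 new tests (backend signal bridge / hooks inject / ports killAgent / service killAgent)

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* docs(workflow): add model tier selection guidance to the ultracode skill (haiku/sonnet/opus/best matched to scenarios)

Supplies the decision criteria missing from agent()'s existing model parameter: lists the cost/latency order of magnitude and typical scenarios for the 4 tiers, and states the rule of thumb "if you cannot articulate why you are switching tiers, omit it".

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(workflow): maxConcurrency≠3 must go through AskUserQuestion first (3 is the recommended default)

Changes "ask only when fanning out" to "any maxConcurrency≠3 must ask". The only exception: the user has already stated a concurrency explicitly in the current session ("use 6" / "maxConcurrency 9"). Synced in three places: prompt (WorkflowTool.ts) + skill (ultracode.ts) + audit spec.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(workflow): automatically retry a failed agent once (dead or non-abort throw)

- hooks.agent wraps invokeBackend: a first dead or non-abort throw → retry once
- WorkflowAbortedError (kill) is not retried, because it reflects user intent
- Configuration errors from registry.resolve (AdapterNotFoundError, etc.) are thrown directly outside the try and bypass the retry; retrying a configuration problem is pointless and masks bugs
- If the retry also fails: dead stays dead; a throw is downgraded to dead (it does not crash the workflow, consistent with the null-on-error contract of parallel/pipeline)
- The budget is not charged twice: dead does not call addOutputTokens; tokens are charged once, only when the retry succeeds
- 7 new hooks-level retry tests + 1 service-level downgrade test

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(workflow): panel label truncation keeps the #number suffix (so multiple findings in the same dim are distinguishable)

The audit workflow names its verify agents verify:\${dim}#\${findingIdx}. The old logic, slice(0, 18), dropped everything past the 18th character and swallowed the #idx entirely, so multiple findings in the same dimension could not be told apart by eye. New logic: when the label has a #number suffix, keep the suffix and truncate the prefix, adding an … ellipsis. Examples:

verify:correctness#0 → verify:correctn…#0
verify:architecture#15 → verify:archite…#15

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(workflow): return to the main chat immediately after killing the whole workflow

The run_done→store→notifications.ts notification path already existed, but after confirmYes the panel stayed up and blocked the main chat, so the user saw no "stopped" feedback. After the kill, onDone() is now called to exit the panel immediately, making the main chat's `Workflow "<name>" was stopped` notification directly visible.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(workflow): agent dead carries reason/detail + prompt reinforces StructuredOutput

In a 12-agent audit workflow, 8 agents went dead, and the journal recorded only {kind:"dead"} with no further information, so it was impossible afterwards to distinguish "agent produced no StructuredOutput" from "runAgent threw". The evidence pointed to the main cause: after a long tool chain, sonnet forgets to call StructuredOutput, extractStructuredOutput returns null, and the result is downgraded to dead.

- types.ts: AgentRunResult.dead gains optional reason/detail fields (no-structured-output / runagent-threw / worktree-failed / unknown), compatible with old journals (both are optional).
- claudeCodeBackend.ts: the three dead sites fill in reason + detail; no-structured-output uses the first 200 characters of the finalized text as detail, so logs and the panel immediately show what the agent last said.
- claudeCodeBackend.ts: in schema mode the prompt states the StructuredOutput requirement once at the start and once at the end, targeting sonnet forgetting to wrap up after a long tool chain.
- hooks.ts: retry logs include reason; when the retry still throws, the downgrade to dead also fills in reason=runagent-threw + detail.
- types.test.ts: add a reason JSON round-trip test + an old-journal compatibility test.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(workflow): schema mode drops the StructuredOutput tool contract in favor of robust JSON text parsing

The previous round, 70a2f76, treated "agent forgets to call StructuredOutput after a long tool chain" as the cause of death and added a double requirement at the head and tail of the prompt. But in a real run of 5 review agents, 4 went dead, and every detail was "StructuredOutput tool is not available as a deferred tool". The root cause is that this tool was never injected into the workflow sub-agent's tool set (the default assembleToolPool pool does not include it; only the stop_hook path, execAgentHook.ts, explicitly calls createStructuredOutputTool()). The prompt repeatedly demanded a call to an unreachable tool, so the agent got confused, argued at length, and in the end produced no JSON.

- claudeCodeBackend.ts:
  - extractStructuredOutput rewritten: a bracket-stack scan replaces indexOf/lastIndexOf and handles nested objects, brackets inside strings, and escape characters; adds a fenced-code-block priority path (```json / ```); with multiple JSON blocks, takes the first one that parses successfully; returns only plain objects (rejects array/number/string/null). No syntax repair is done (trailing commas / single quotes / comments), to avoid mis-editing inside strings (e.g. "http://" being eaten by a // comment regex).
  - schema-mode prompt simplified: removes the double STRUCTURED OUTPUT requirement at head and tail (600+ tokens) and instead instructs the agent to emit raw JSON in its final text block; states explicitly that "StructuredOutput is not available in this environment" to eliminate hallucinated calls.
- hooks.ts: detail.slice is guarded with typeof === 'string'; the catch block uses e instanceof Error ? e.message : String(e) (old journals / third-party adapters may write a non-string detail, and calling .slice directly would throw a TypeError and break logging).
- claudeCodeBackend.test.ts: +9 tests covering fenced / nested / brackets inside strings / escaped quotes / first of multiple blocks / type guard / corrupt JSON.

precheck: 5663 pass / 0 fail.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* docs(effort): add design spec for the /effort interactive panel

Design highlights:

- /effort with no arguments → horizontal slider panel (low/medium/high/xhigh/max/ultracode)
- ←/→ move the cursor, Enter confirms, Esc cancels
- ultracode is a visual placeholder only; on confirm it directs the user to /ultracode <context>
- With an env override: double marker + warning at the top
- Panel disabled when the model does not support effort
- Two-phase delivery: commit the basic panel first, then the ultracode ripple animation

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* docs(effort): add the EffortPanel basic panel implementation plan (phase one)

Split into 6 tasks following TDD: pure-function state → keybinding registration → component → command mounting → branch tests → precheck. The ripple animation is committed separately in phase two.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* docs(effort): plan adds q/ctrl+c cancel bindings, aligning with the spec §5 state machine

Gap caught by the verifier: spec §5 states that Esc / Ctrl+C / q are all cancel events, but plan Task 2.3 only bound escape. Adds q and ctrl+c → effortPanel:cancel. Also writes Step 2.2 directly as the 6-action version (home/end), removing the roundabout wording.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* docs(effort): plan revised for 5 gaps found in pre-execution review

- Task 3.3 EffortPanel.tsx draft: rewrite the garbled Faster/Smarter padEnd syntax; fix the useKeybindings import path from @anthropic/ink to ../../keybindings/useKeybinding.js; remove the redundant renderSeparatorLine; keep renderPaddedLine
- Task 5.2 computeConfirmOutcome changed to an injected-ApplyFn pattern: avoids the effortPanelState → effort.tsx → EffortPanel circular dependency; tests can inject mockApply with no need to mock settings
- Step 5.3 test code aligned with the injected signature

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(effort): add EffortPanel pure-function state module (PanelPosition + movement/initial cursor)

Contains only pure functions and types, with no React/Ink dependency, for easy unit testing.

- PANEL_POSITIONS: low → medium → high → xhigh → max → ultracode
- moveLeft/moveRight: clamped at the boundaries (low does not move further left, ultracode does not move further right)
- getInitialCursor: env override > displayed level

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(keybindings): register the EffortPanel context and 6 actions

Binds ←/→/h/l/home/end/enter/escape/q/ctrl+c to effortPanel:* actions. Follows the same pattern as the ModelPicker context, so the left/right keys are not intercepted by global keybindings.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(effort): implement the EffortPanel component body (rendering + keyboard interaction + confirm/cancel branches)

- Horizontal slider layout: Faster ↔ Smarter poles, 6 tick positions
- useKeybindings registers the EffortPanel context (←/→/h/l/home/end/enter/escape/q/ctrl+c)
- Enter on one of the 5 levels → calls executeEffort to write settings + AppState
- Enter on ultracode → prints the guidance text and writes no state
- Esc/q → "Effort unchanged."
- Yellow warning at the top when an env override is active
- computeConfirmOutcome takes an injected ApplyFn for testability (tests added in Task 5)

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(effort): mount the EffortPanel interactive panel when /effort has no arguments

- No arguments → <EffortPanelWrapper> passes AppState.effortValue through
- current/status → still shows text (unchanged)
- With an argument → goes straight to executeEffort (unchanged)
- help/-h/--help → unchanged

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* test(effort): add computeConfirmOutcome branch tests (injecting mockApply)

- ultracode → kind=ultracode-hint, applyFn is not called
- low → kind=apply, message/effortUpdate come from applyFn
- When applyFn returns no effortUpdate, outcome.effortUpdate is undefined
- CANCEL_MESSAGE / ULTRACODE_HINT constants

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(effort): cast cursor to EffortValue in tests to avoid the TS error triggered by PanelPosition including ultracode

computeConfirmOutcome's ApplyFn contract requires EffortValue, but the test's mockApply receives a PanelPosition. At runtime, computeConfirmOutcome takes the hint branch at the ultracode position and never calls applyFn, so the cast is safe. Full precheck passes: 5688 tests / 0 fail.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(effort): panel alignment and color fixes

- Alignment: use Box width={SEGMENT} + justifyContent="center" so that ▲ and the position label are strictly center-aligned, replacing the 1-column offset caused by the earlier mismatch between string padEnd(11) and SEGMENT=12
- Colors: all panel text now uses theme.claude (Claude Orange rgb(215,119,87)) instead of the terminal default purple; separator / sublabel / footer use theme.subtle; the env warning uses theme.warning
- The label at the cursor position is also bold, strengthening the visual focus

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(effort): panel text changed to purple, ULTRACODE_HINT anglicized

- Color: theme.claude (orange) → theme.purple_FOR_SUBAGENTS_ONLY (Purple 600, rgb(147,51,234)), covering the title, Faster/Smarter, ▲, and position labels
- ULTRACODE_HINT: Chinese → English "ultracode is not an effort level. Use /ultracode <context> to start a multi-agent workflow."

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(effort): unified color scheme: selected uses suggestion (blue), unselected uses subtle (gray)

Drops purple_FOR_SUBAGENTS_ONLY (reserved for subagents). The panel is now consistent with the project's other panels:

- Selected position + ▲: color="suggestion" (Medium blue rgb(87,105,247)) + bold
- Unselected position + empty ▲ placeholder: color="subtle" (Light gray rgb(175,175,175))
- Title / Faster / Smarter: color="suggestion"
- Separator / sublabel / footer: color="subtle"

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(workflow): emit the missing phase_done before the terminal state; panel exits automatically on the running→terminal transition

runWorkflow: when the script ends, hook.phase does not trigger phase_done for the last phase, so the UI's left column would show running forever. All three paths (completed/killed/failed) now emit emitTerminalPhaseDone before run_done.

WorkflowsPanel: extracts the pure function isRunTerminatedTransition to detect running → terminal; when the panel's useEffect detects the transition, it automatically exits focus.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(effort): ripple animation pure functions pickChar/computeRippleLine/mergeLayers + 18 tests

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(effort): useRippleFrame hook wraps useAnimationFrame and subscribes to the clock on demand

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(effort): EffortPanel integrates the ripple background: switches to ripple mode when the cursor rests on ultracode

useRippleFrame is enabled only when cursor === 'ultracode', rendering 5 rows of ripple background + overlay text (Faster/Smarter, separator, ▲, position labels, sublabel). The other positions keep the original PlainContent render path untouched.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* refactor(effort): ripple animation changed from character density to color gradient

Following the original style, the ripple background changes from INTENSITY_CHARS density characters ('·∙░▒▓') to a gradient of suggestion-family colors (transparent → dark deep violet-blue → suggestion → highlight):

rippleAnimation.ts:
- Remove pickChar / INTENSITY_CHARS / WAVE_PEAK_CHARS / mergeLayers
- Add intensityToColor(intensity) → 'transparent' | '#xxxxxx'
- Add computeRippleCells returning Cell[] (char+color per position)
- Add applyOverlaysToCells(cells, overlays), replacing mergeLayers
- Add cellsToSegments(cells), merging adjacent same-color runs (fewer Text nodes)

EffortPanel.tsx:
- RippleContent renders via cells→segments→tokens
- Space runs are painted as blocks with BaseText backgroundColor (solid color block look)
- Text runs are tinted with Text color (bright, stands out)
- tokens are split a second time by space/text to avoid ambiguity when rendering mixed segments

Tests: 29 rippleAnimation tests covering intensityToColor boundaries, computeRippleCells length/source/falloff, applyOverlaysToCells overwrite/truncation/defensive copy, and cellsToSegments merge logic.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(effort): ripple parameter tuning: cover the left side + slow down + background color across the whole panel

Three issues from user feedback:

1. "The troughs show no color change" → intensity ≤ 0.1 returned transparent, which made trough positions invisible. Changed to never return transparent; the lowest level, #0a0d1a, serves as the panel background (dark violet-black ocean), with crests flowing over it.
2. "The waves are too fast" → time coefficient 0.012 → 0.004 (about 1/3 speed). Crest speed drops from 34 cell/s to 11 cell/s, and the per-frame color change from 45% to 36%.
3. "The waves only reach the middle and do not cover the left side" → falloff coverage radius 40 → 90. The source is at x=65 and the left edge is at dist=65 < 90, so the ripple reaches the far left (about 30-50% coverage).

Color scale adjustments:
- Remove the transparent level; add #0a0d1a as the darkest level (background)
- Top level changed from #8aa0ff (highlight) to #5769F7 (suggestion), so that it does not share a color with the text overlay and the two do not swallow each other
- 7 levels: #0a0d1a → #15182b → #1f2543 → #2a3360 → #3a4582 → #4a5bb0 → #5769F7

Tests: remove the transparent expectations in favor of concrete colors (#0a0d1a, etc.). Add a "coverage radius expanded" test verifying that dist=65 still yields a color other than the darkest.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(effort): ripple v3: remove the black edge + remove the high-frequency center ripple + extend y coverage to the shortcut row

Three issues from user feedback:

1. "The black edge feels off": the darkest level, #0a0d1a (rgb 10,13,26), was too close to pure black, so distant troughs looked like a hard black edge. Changed to #1a1f3a (rgb 26,31,58), which reads as violet-blue rather than pure black.
2. "The fast ripple at the center is a bit odd": removes the high-frequency ripple overlaid near the source at dist<6 (time*0.02, 5 times the main ripple frequency). It was meant to strengthen the "water" feel near the source, but in practice it looked like rapid flickering and was jarring. The main ripple is sufficient on its own.
3. "Cover the shortcut row in the y direction": RippleContent adds a y=2 row that renders the shortcut overlay ("←/→ adjust · Enter confirm · Esc cancel"). The PlainContent path keeps its original Box marginTop=1 + Text rendering.

Color scale adjustments (stronger violet-blue):
- #1a1f3a (was #0a0d1a): darkest level
- #1f2543 / #252c55 / #2e3870 / #3a4582 / #4a5bb0 / #5769F7 (middle levels slightly brightened, keeping a smooth transition)

Tests: the source-point test is updated to "at time=0 the trough is darkest; as time advances the crest sweeps through and it brightens", reflecting pure main-ripple behavior after the high-frequency ripple was removed.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* chore(workflow): anglicize all Chinese text in workflow-related code

Covers Chinese comments, user-visible error messages, and string literals in the source (src/workflow/ + packages/workflow-engine/src/); titles and comments in test files; and 6 hard-coded assertions synced to the anglicized error messages.

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(effort): ripple v4: smooth wave + full color-wheel rotation + fade in/out + adaptive width

- Wave function changed to (sin+1)/2: eliminates the flat dark band of max(0,sin) (about 6 rows wide)
- Primary hue rotates continuously (0.03°/ms, one full color wheel every 12s): blue→violet→magenta→red→orange→yellow→green→cyan
- Text overlay hue rotates in sync (rotateHue applied to Faster/▲/position labels/separator/sublabel)
- Fade in/out animation: fadeColor/fadeCells + a fade state machine, ~300ms transition in and out
- Sublabel fixed below the ultracode segment; it does not follow the cursor
- One pure-ripple row added at the top and one at the bottom, for visual consistency
- Width adapts to the terminal column count: 72 when narrow, full width when wide (computeSegment/computeRippleSourceX)
- Shortcut hints changed to plain Text, no longer part of ripple background rendering
- 18 new tests (fadeColor/fadeCells/rotateHue/getHueShiftAtTime)

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* refactor: remove CYBER_RISK_MITIGATION_REMINDER from FileReadTool

Co-Authored-By: deepseek-v4-pro <deepseek-ai@claude-code-best.win>

* fix: prevent ReDoS in extractMeta regex by anchoring to splice boundary

Co-Authored-By: deepseek-v4-pro <deepseek-ai@claude-code-best.win>

* chore: update scripts

---------

Co-authored-by: glm-5.1 <zai-org@claude-code-best.win>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: deepseek-v4-pro <deepseek-ai@claude-code-best.win>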
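The label-truncation fix described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the panel's actual code: the helper name `truncateLabel` and the `max` parameter are hypothetical, and the 18-column limit is taken from the `slice(0, 18)` mentioned in the message. How labels without a `#number` suffix are shortened is an assumption (plain slice).

```typescript
// Hypothetical helper illustrating "keep the #number suffix, truncate the prefix".
// The name, signature, and fallback behavior are assumptions, not the real panel code.
const MAX_LABEL = 18;

function truncateLabel(label: string, max: number = MAX_LABEL): string {
  if (label.length <= max) return label;
  const suffixMatch = /#\d+$/.exec(label);
  // Assumption: labels without a numeric suffix keep the old plain-slice behavior.
  if (suffixMatch === null) return label.slice(0, max);
  const suffix = suffixMatch[0];
  // Reserve one column for the ellipsis and the rest for the suffix.
  const room = max - suffix.length - 1;
  if (room <= 0) return label.slice(0, max);
  return label.slice(0, room) + '…' + suffix;
}

console.log(truncateLabel('verify:correctness#0')); // verify:correctn…#0
console.log(truncateLabel('verify:architecture#15')); // verify:archite…#15
```

Both printed values match the two examples given in the commit message, and each result is exactly 18 characters wide.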
2026-06-17 22:05:50 +00:00 · 2026-06-14 18:13:49 +08:00
parent 3e3e1de81b
commit 58ee6419b1
130 changed files with 23347 additions and 885 deletions
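To make the schema-mode parsing change from the commit message more concrete, here is a minimal sketch of brace-depth scanning with a fenced-block priority path. It is written under assumptions: the names `findBalancedObjects`, `parsePlainObject`, and `extractJsonObject` are hypothetical, a depth counter stands in for the bracket stack, and the real `extractStructuredOutput` in claudeCodeBackend.ts may differ in structure and edge-case handling.

```typescript
type JsonObject = Record<string, unknown>;

// Built at runtime so this snippet never contains a literal fence marker.
const FENCE = '`'.repeat(3);

/** Collect every balanced {...} span, left to right, ignoring braces inside JSON strings. */
function findBalancedObjects(text: string): string[] {
  const found: string[] = [];
  let depth = 0; // current brace nesting depth
  let start = -1;
  let inString = false;
  let escaped = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"' && depth > 0) {
      inString = true; // only track strings once inside a candidate object
    } else if (ch === '{') {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === '}' && depth > 0) {
      depth--;
      if (depth === 0) found.push(text.slice(start, i + 1));
    }
  }
  return found;
}

/** Accept a candidate only if it parses to a plain object. No syntax repair is attempted. */
function parsePlainObject(candidate: string): JsonObject | null {
  try {
    const value: unknown = JSON.parse(candidate);
    if (typeof value === 'object' && value !== null && !Array.isArray(value)) {
      return value as JsonObject;
    }
  } catch {
    // Not valid JSON: the caller moves on to the next candidate.
  }
  return null;
}

/** Fenced blocks are searched first, then the whole text; the first parseable object wins. */
function extractJsonObject(text: string): JsonObject | null {
  const sources: string[] = [];
  const fenceRe = new RegExp(FENCE + '(?:json)?[ \\t]*\\n([\\s\\S]*?)' + FENCE, 'g');
  let match: RegExpExecArray | null;
  while ((match = fenceRe.exec(text)) !== null) sources.push(match[1]);
  sources.push(text);
  for (const source of sources) {
    for (const candidate of findBalancedObjects(source)) {
      const obj = parsePlainObject(candidate);
      if (obj !== null) return obj;
    }
  }
  return null;
}
```

One limitation of this sketch: an unbalanced `{` in the surrounding prose keeps the depth above zero and can hide later objects. Whether the real implementation guards against that is not stated in the commit message.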
--- a/src/commands.ts
+++ b/src/commands.ts
@@ -483,7 +483,7 @@ async function getSkills(cwd: string): Promise<{
 /* eslint-disable @typescript-eslint/no-require-imports */
 const getWorkflowCommands = feature('WORKFLOW_SCRIPTS')
  ? (
-      require('@claude-code-best/builtin-tools/tools/WorkflowTool/createWorkflowCommand.js') as typeof import('@claude-code-best/builtin-tools/tools/WorkflowTool/createWorkflowCommand.js')
+      require('./workflow/namedWorkflowCommands.js') as typeof import('./workflow/namedWorkflowCommands.js')
    ).getWorkflowCommands
  : null
 /* eslint-enable @typescript-eslint/no-require-imports */
--- a/src/commands/effort/effort.tsx
+++ b/src/commands/effort/effort.tsx
@@ -1,4 +1,5 @@
 import * as React from 'react';
+import { EffortPanel } from '../../components/EffortPanel/EffortPanel.js';
 import { useMainLoopModel } from '../../hooks/useMainLoopModel.js';
 import {
  type AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
@@ -161,9 +162,18 @@ export async function call(onDone: LocalJSXCommandOnDone, _context: unknown, arg
  }

  if (!args || args === 'current' || args === 'status') {
-    return <ShowCurrentEffort onDone={onDone} />;
+    if (args === 'current' || args === 'status') {
+      return <ShowCurrentEffort onDone={onDone} />;
+    }
+    // No arguments at all → open the interactive panel
+    return <EffortPanelWrapper onDone={onDone} />;
  }

  const result = executeEffort(args);
  return <ApplyEffortAndClose result={result} onDone={onDone} />;
 }
+
+function EffortPanelWrapper({ onDone }: { onDone: (result: string) => void }): React.ReactNode {
+  const effortValue = useAppState(s => s.effortValue);
+  return <EffortPanel appStateEffort={effortValue} onDone={onDone} />;
+}
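The argument routing in the hunk above can be read as a three-way decision. The sketch below restates it as a pure function for clarity; `routeEffortArgs` and the `EffortRoute` type are hypothetical names introduced only for illustration, and the help handling that precedes this branch in the real `call` is left out.

```typescript
// Hypothetical restatement of the /effort argument routing; not part of the actual diff.
type EffortRoute =
  | { kind: 'panel' } // no arguments: open the interactive EffortPanel
  | { kind: 'show-current' } // 'current' or 'status': print the current level as text
  | { kind: 'apply'; level: string }; // anything else: hand the argument to executeEffort

function routeEffortArgs(args: string | undefined): EffortRoute {
  if (!args) return { kind: 'panel' };
  if (args === 'current' || args === 'status') return { kind: 'show-current' };
  return { kind: 'apply', level: args };
}

console.log(routeEffortArgs(undefined).kind); // panel
console.log(routeEffortArgs('status').kind); // show-current
console.log(routeEffortArgs('high').kind); // apply
```

Written this way the empty-argument case is checked first, which avoids testing `current`/`status` twice as the nested `if` in the hunk does.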
--- a/src/commands/workflows/index.ts
+++ b/src/commands/workflows/index.ts
@@ -1,28 +1,11 @@
-import type { Command, LocalCommandCall } from '../../types/command.js'
-import { getWorkflowCommands } from '@claude-code-best/builtin-tools/tools/WorkflowTool/createWorkflowCommand.js'
-import { getCwd } from '../../utils/cwd.js'
-
-const call: LocalCommandCall = async (_args, _context) => {
-  const commands = await getWorkflowCommands(getCwd())
-  if (commands.length === 0) {
-    return {
-      type: 'text',
-      value:
-        'No workflows found. Add workflow files to .claude/workflows/ (YAML or Markdown).',
-    }
-  }
-  const list = commands
-    .map(cmd => `  /${cmd.name} - ${cmd.description}`)
-    .join('\n')
-  return { type: 'text', value: `Available workflows:\n${list}` }
-}
+import type { Command } from '../../types/command.js'

 const workflows = {
-  type: 'local',
+  type: 'local-jsx',
  name: 'workflows',
-  description: 'List available workflow scripts',
-  supportsNonInteractive: true,
-  load: () => Promise.resolve({ call }),
+  description: 'Workflow monitoring panel: live run/phase/agent progress with keyboard control',
+  // Lazy-load the panel implementation so startup does not pull in the Ink/React dependencies.
+  load: () => import('../../workflow/panel/panelCall.js'),
 } satisfies Command

 export default workflows
--- a/src/components/EffortPanel/EffortPanel.tsx
+++ b/src/components/EffortPanel/EffortPanel.tsx
@@ -0,0 +1,408 @@
+import * as React from 'react';
+import { BaseText, Box, Text, useTerminalSize } from '@anthropic/ink';
+import { useKeybindings } from '../../keybindings/useKeybinding.js';
+import { type EffortValue, getDisplayedEffortLevel, getEffortEnvOverride } from '../../utils/effort.js';
+import {
+  type PanelPosition,
+  CANCEL_MESSAGE,
+  computeConfirmOutcome,
+  getInitialCursor,
+  moveLeft,
+  moveRight,
+  PANEL_POSITIONS,
+} from './effortPanelState.js';
+import { executeEffort } from '../../commands/effort/effort.js';
+import { useMainLoopModel } from '../../hooks/useMainLoopModel.js';
+import { useSetAppState } from '../../state/AppState.js';
+import { useRippleFrame } from './useRippleFrame.js';
+import {
+  TRANSPARENT,
+  type Overlay,
+  type Segment,
+  applyOverlaysToCells,
+  cellsToSegments,
+  computeRippleCells,
+  fadeCells,
+  getHueShiftAtTime,
+  rotateHue,
+} from './rippleAnimation.js';
+
+/**
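+ * Minimum width per position (enough for the 9-character 'ultracode' plus centering padding).
+ * Used when the terminal is narrow, to guarantee a minimum level of readability.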
+ */
+const MIN_SEGMENT = 12;
+
+const SUBLABEL_ULTRACODE = 'xhigh + workflows';
+
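+// Colors: aligned with the project theme (suggestion = Medium blue #5769F7).
+const COLOR_LABEL_SELECTED = '#5769F7'; // selected position (suggestion)
+const COLOR_LABEL_DEFAULT = '#7a8eff'; // unselected position (pale violet-blue, matches the ripple background)
+const COLOR_OVERLAY = '#5769F7'; // overlay text such as Faster / Smarter / ▲
+
+// Fade step per frame: at a 60ms interval, 5 frames reach the target ≈ 300ms of animation.
+const FADE_STEP = 0.2;
+
+// y coordinate of the ripple source (in ripple-area coordinates; y=0 is the position-label row).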
+const RIPPLE_SOURCE_Y = 0;
+
+/**
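+ * Compute the actual width of each position (SEGMENT) from the terminal width.
+ *
+ * Rules:
+ * - Reserve 1 column on each side for paddingX={1} → available width = columns - 2
+ * - If the available width <= MIN_SEGMENT * 6 (72), use MIN_SEGMENT (keep the current narrow layout)
+ * - Otherwise fill the width: floor(available width / 6)
+ *
+ * In short: unchanged when narrow, full width when wide. The minimum width guarantees that the 9-character 'ultracode' displays properly.
+ */
+function computeSegment(terminalColumns: number): number {
+  const available = terminalColumns - 2; // paddingX={1} on both sides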
+  const minNeeded = MIN_SEGMENT * PANEL_POSITIONS.length;
+  if (available <= minNeeded) return MIN_SEGMENT;
+  return Math.floor(available / PANEL_POSITIONS.length);
+}
+
+/**
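+ * Compute the x coordinate of the ripple source (the center column of the 'ultracode' label within the ultracode segment).
+ *
+ * 'ultracode' is 9 characters, centered within SEGMENT columns:
+ *   offset = floor((SEGMENT - 9) / 2)
+ *   labelCenter = SEGMENT * 5 + offset + 4  (4 is the center offset of a 9-character string)
+ *
+ * SEGMENT=12 → 60 + 1 + 4 = 65 (matches the historical value)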
+ * SEGMENT=20 → 100 + 5 + 4 = 109
+ */
+function computeRippleSourceX(segment: number): number {
+  const LABEL_LEN = 9; // 'ultracode'
+  const offset = Math.max(0, Math.floor((segment - LABEL_LEN) / 2));
+  const labelCenter = Math.floor(LABEL_LEN / 2); // 4
+  return segment * (PANEL_POSITIONS.length - 1) + offset + labelCenter;
+}
+
+/**
+ * Compute the starting column of text centered within segment idx.
+ * Dynamic segment: textLen characters are centered within segment columns.
+ */
+function segmentTextStartX(idx: number, textLen: number, segment: number): number {
+  return segment * idx + Math.max(0, Math.floor((segment - textLen) / 2));
+}
+
+type Props = {
+  appStateEffort: EffortValue | undefined;
+  onDone: (message: string) => void;
+};
+
+export function EffortPanel({ appStateEffort, onDone }: Props): React.ReactNode {
+  const setAppState = useSetAppState();
+  const model = useMainLoopModel();
+  const { columns } = useTerminalSize();
+
+  // Adaptive width: compute the per-position width from the terminal column count.
+  // When the terminal is resized, columns changes → recompute → re-render.
+  const segment = React.useMemo(() => computeSegment(columns), [columns]);
+  const panelWidth = segment * PANEL_POSITIONS.length;
+  const rippleSourceX = React.useMemo(() => computeRippleSourceX(segment), [segment]);
+
+  const envOverride = getEffortEnvOverride();
+  const displayed = getDisplayedEffortLevel(model, appStateEffort);
+  const initialCursor = getInitialCursor({ envOverride, appStateEffort, displayed });
+
+  const [cursor, setCursor] = React.useState<PanelPosition>(initialCursor);
+  const [done, setDone] = React.useState(false);
+
+  const isOnUltracode = cursor === 'ultracode';
+  const [fade, setFade] = React.useState(0);
+  // Still in ripple mode: the cursor is on ultracode, or the exit animation has not finished (fade > 0)
+  const showingRipple = isOnUltracode || fade > 0.001;
+  const [rippleRef, time] = useRippleFrame(showingRipple);
+
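+  // Fade driver: on every tick (time advances), step toward the target by FADE_STEP.
+  // Once the exit animation completes, fade returns to zero, showingRipple becomes false, and the clock subscription stops.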
+  React.useEffect(() => {
+    if (!showingRipple) return;
+    const target = isOnUltracode ? 1 : 0;
+    setFade(prev => {
+      if (prev === target) return prev;
+      const next = target > prev ? prev + FADE_STEP : prev - FADE_STEP;
+      return target > prev ? Math.min(target, next) : Math.max(target, next);
+    });
+  }, [time, isOnUltracode, showingRipple]);
+
+  const handleConfirm = React.useCallback(() => {
+    if (done) return;
+    setDone(true);
+    const outcome = computeConfirmOutcome(cursor, executeEffort);
+    if (outcome.kind === 'apply' && outcome.effortUpdate) {
+      setAppState(prev => ({
+        ...prev,
+        effortValue: outcome.effortUpdate!.value,
+      }));
+    }
+    onDone(outcome.message);
+  }, [cursor, done, onDone, setAppState]);
+
+  const handleCancel = React.useCallback(() => {
+    if (done) return;
+    setDone(true);
+    onDone(CANCEL_MESSAGE);
+  }, [done, onDone]);
+
+  useKeybindings(
+    {
+      'effortPanel:decrease': () => setCursor(c => moveLeft(c)),
+      'effortPanel:increase': () => setCursor(c => moveRight(c)),
+      'effortPanel:home': () => setCursor('low'),
+      'effortPanel:end': () => setCursor('ultracode'),
+      'effortPanel:confirm': handleConfirm,
+      'effortPanel:cancel': handleCancel,
+    },
+    { context: 'EffortPanel' },
+  );
+
+  const envActive = envOverride !== null && envOverride !== undefined;
+  const envRaw = process.env.CLAUDE_CODE_EFFORT_LEVEL;
+
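+  // Compute the cells of one ripple row: returns every cell in the row (including overlay text).
+  // fade controls the background color brightness (0 → fully transparent, 1 → full ripple).
+  // Overlay text is also scaled by fade, so the enter/exit animation fades in and out as a whole.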
+  const renderRippleRow = React.useCallback(
+    (relY: number, overlays: Overlay[]): Segment[] => {
+      const cells = computeRippleCells({
+        y: relY + RIPPLE_SOURCE_Y,
+        width: panelWidth,
+        time,
+        sourceX: rippleSourceX,
+        sourceY: RIPPLE_SOURCE_Y,
+      });
+      const overlayed = applyOverlaysToCells(cells, overlays);
+      const faded = fadeCells(overlayed, fade);
+      return cellsToSegments(faded);
+    },
+    [time, fade, panelWidth, rippleSourceX],
+  );
+
+  return (
+    <Box ref={rippleRef} flexDirection="column" paddingX={1} width={panelWidth + 2}>
+      <Text bold color="suggestion">
+        Effort
+      </Text>
+      {envActive && <Text color="warning">{`⚠ CLAUDE_CODE_EFFORT_LEVEL=${envRaw} overrides this session`}</Text>}
+      {showingRipple ? (
+        <RippleContent
+          renderRow={renderRippleRow}
+          cursor={cursor}
+          fade={fade}
+          segment={segment}
+          panelWidth={panelWidth}
+          time={time}
+        />
+      ) : (
+        <>
+          <PlainContent cursor={cursor} segment={segment} panelWidth={panelWidth} />
+          <Box marginTop={1}>
+            <Text color="subtle">←/→ adjust · Enter confirm · Esc cancel</Text>
+          </Box>
+        </>
+      )}
+    </Box>
+  );
+}
+
+// ---- Plain mode (no ripple) ----
+
+function PlainContent({
+  cursor,
+  segment,
+  panelWidth,
+}: {
+  cursor: PanelPosition;
+  segment: number;
+  panelWidth: number;
+}): React.ReactNode {
+  return (
+    <>
+      <Box marginTop={1} flexDirection="row" justifyContent="space-between">
+        <Text color="suggestion">Faster</Text>
+        <Text color="suggestion">Smarter</Text>
+      </Box>
+      <Text color="subtle">{'─'.repeat(panelWidth)}</Text>
+      <Box flexDirection="row">
+        {PANEL_POSITIONS.map(p => (
+          <Box key={`cursor-${p}`} width={segment} justifyContent="center">
+            <Text bold color={cursor === p ? 'suggestion' : 'subtle'}>
+              {cursor === p ? '▲' : ' '}
+            </Text>
+          </Box>
+        ))}
+      </Box>
+      <Box flexDirection="row">
+        {PANEL_POSITIONS.map(p => (
+          <Box key={`label-${p}`} width={segment} justifyContent="center">
+            <Text bold={cursor === p} color={cursor === p ? 'suggestion' : 'subtle'}>
+              {p}
+            </Text>
+          </Box>
+        ))}
+      </Box>
+      <Box flexDirection="row">
+        <Box width={segment * (PANEL_POSITIONS.length - 1)} />
+        <Box width={segment} justifyContent="center">
+          <Text color="subtle">{SUBLABEL_ULTRACODE}</Text>
+        </Box>
+      </Box>
+    </>
+  );
+}
+
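+// ---- Ripple mode (cursor === 'ultracode') ----
+//
+// Rendering strategy:
+// - For each row, computeRippleCells first produces the intensity→color cell array (background is spaces + color)
+// - applyOverlaysToCells writes the overlay text (Faster/▲/position labels/sublabel) into the matching cells
+// - cellsToSegments merges adjacent cells of the same color
+// - The render layer walks the segments and decides whether each is a "space ripple segment" or a "text segment"
+//   - Space segment: use backgroundColor to paint the spaces as a pure color block
+//   - Text segment: use color to tint the text (background stays at the terminal default so the text is clearest)
+//   - Mixed segment (both spaces and text, rare): split into two Text nodes, before and after
+//
+// Note: a Segment may contain both spaces and non-space characters (e.g. centered text like "  Faster  ").
+// When such a segment is rendered with color, the space part shows no color block, so the block visually "breaks".
+// Fix: at render time, split the segment a second time by character type (runs of whitespace vs non-whitespace).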
+
+type RippleContentProps = {
+  renderRow: (relY: number, overlays: Overlay[]) => Segment[];
+  cursor: PanelPosition;
+  fade: number;
+  segment: number;
+  panelWidth: number;
+  time: number;
+};
+
+function RippleContent({ renderRow, cursor, segment, panelWidth, time }: RippleContentProps): React.ReactNode {
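+  // The cursor index follows cursor (during the exit animation the cursor has already moved elsewhere,
+  // so the ▲ overlay moves away with it and the ultracode segment returns to the normal background color).
+  const cursorIdx = PANEL_POSITIONS.indexOf(cursor);
+  // The sublabel stays fixed below the ultracode segment and does not follow the cursor.
+  const ultracodeIdx = PANEL_POSITIONS.length - 1;
+
+  // Text color follows the wave's hue rotation: take the hueShift for the current time
+  // and apply it to every overlay color, so the text stays in sync with the background color wheel.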
+  const hueShift = getHueShiftAtTime(time);
+  const overlayColor = rotateHue(COLOR_OVERLAY, hueShift);
+  const labelSelectedColor = rotateHue(COLOR_LABEL_SELECTED, hueShift);
+  const labelDefaultColor = rotateHue(COLOR_LABEL_DEFAULT, hueShift);
+
+  const fasterOverlay: Overlay = { text: 'Faster', x: 0, color: overlayColor };
+  const smarterOverlay: Overlay = {
+    text: 'Smarter',
+    x: panelWidth - 'Smarter'.length,
+    color: overlayColor,
+  };
+  const separatorOverlay: Overlay = {
+    text: '─'.repeat(panelWidth),
+    x: 0,
+    color: labelDefaultColor,
+  };
+  const cursorOverlay: Overlay = {
+    text: '▲',
+    x: segmentTextStartX(cursorIdx, 1, segment),
+    color: overlayColor,
+  };
+  const labelOverlays: Overlay[] = PANEL_POSITIONS.map((p, idx) => ({
+    text: p,
+    x: segmentTextStartX(idx, p.length, segment),
+    color: p === cursor ? labelSelectedColor : labelDefaultColor,
+  }));
+  const sublabelOverlay: Overlay = {
+    text: SUBLABEL_ULTRACODE,
+    x: segmentTextStartX(ultracodeIdx, SUBLABEL_ULTRACODE.length, segment),
+    color: labelDefaultColor,
+  };
+
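+  // y coordinate of each row (relative to the source, RIPPLE_SOURCE_Y = position-label row)
+  //   y=-4: top pure-ripple row (visual consistency, no overlay)
+  //   y=-3: Faster/Smarter
+  //   y=-2: separator line
+  //   y=-1: ▲
+  //   y=0:  position labels (source)
+  //   y=1:  sublabel
+  //   y=2:  bottom pure-ripple row (visual consistency, no overlay)
+  //
+  // Shortcut row: plain Text, not part of ripple rendering (no background animation), directly below the bottom ripple row.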
+  return (
+    <>
+      <RippleRow segments={renderRow(-4, [])} />
+      <RippleRow segments={renderRow(-3, [fasterOverlay, smarterOverlay])} />
+      <RippleRow segments={renderRow(-2, [separatorOverlay])} />
+      <RippleRow segments={renderRow(-1, [cursorOverlay])} />
+      <RippleRow segments={renderRow(0, labelOverlays)} />
+      <RippleRow segments={renderRow(1, [sublabelOverlay])} />
+      <RippleRow segments={renderRow(2, [])} />
+      <Text color={COLOR_LABEL_DEFAULT}>←/→ adjust · Enter confirm · Esc cancel</Text>
+    </>
+  );
+}
+
+/**
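+ * Render one row of ripple segments.
+ *
+ * Each segment may mix spaces and text (e.g. "  Faster  "):
+ * - Spaces are painted as color blocks via backgroundColor (the ripple color)
+ * - Text is tinted via color (bright; the background stays at the terminal default)
+ *
+ * Simplified strategy: walk the segment's characters and split them into tokens by "is it a space".
+ * Adjacent characters of the same kind are merged into one token to avoid an explosion of React keys.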
+ */
+function RippleRow({ segments }: { segments: Segment[] }): React.ReactNode {
+  const tokens: Array<{ text: string; kind: 'space' | 'text'; color: string }> = [];
+  for (const seg of segments) {
+    // Split seg.text into space runs and non-space runs
+    let buf = '';
+    let bufIsSpace: boolean | null = null;
+    const flush = (): void => {
+      if (buf === '' || bufIsSpace === null) return;
+      tokens.push({
+        text: buf,
+        kind: bufIsSpace ? 'space' : 'text',
+        color: seg.color,
+      });
+      buf = '';
+      bufIsSpace = null;
+    };
+    for (const ch of seg.text) {
+      const isSpace = ch === ' ';
+      if (bufIsSpace === null) {
+        buf = ch;
+        bufIsSpace = isSpace;
+      } else if (isSpace === bufIsSpace) {
+        buf += ch;
+      } else {
+        flush();
+        buf = ch;
+        bufIsSpace = isSpace;
+      }
+    }
+    flush();
+  }
+
+  return (
+    <Box flexDirection="row">
+      {tokens.map((tok, i) =>
+        tok.kind === 'space' ? (
+          tok.color === TRANSPARENT ? (
+            <BaseText key={i}>{tok.text}</BaseText>
+          ) : (
+            <BaseText key={i} backgroundColor={tok.color as `#${string}`}>
+              {tok.text}
+            </BaseText>
+          )
+        ) : (
+          <Text key={i} color={tok.color as `#${string}`} bold>
+            {tok.text}
+          </Text>
+        ),
+      )}
+    </Box>
+  );
+}
--- a/src/components/EffortPanel/tests/EffortPanel.test.tsx
+++ b/src/components/EffortPanel/tests/EffortPanel.test.tsx
@@ -0,0 +1,24 @@
+import { expect, test } from 'bun:test';
+import React from 'react';
+import { EffortPanel } from '../EffortPanel.js';
+
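+// EffortPanel is a UI component; its render dependency chain (useMainLoopModel / GrowthBook / settings)
+// is expensive and brittle to mock in a test environment. This file only sanity-checks the "component contract":
+// 1) the named export EffortPanel is a valid React component
+// 2) it accepts the correct props types (guaranteed at compile time)
+// 3) onDone has type (message: string) => void
+//
+// Rendered output and keyboard interaction are covered by the Step 6.2 manual acceptance check;
+// the confirm/cancel branches are covered by pure-function tests of computeConfirmOutcome (see effortPanelState.test.ts).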
+
+test('EffortPanel is a valid React component', () => {
+  expect(typeof EffortPanel).toBe('function');
+});
+
+test('EffortPanel accepts props and returns a React element (without mounting)', () => {
+  const element = React.createElement(EffortPanel, {
+    appStateEffort: undefined,
+    onDone: () => {},
+  });
+  expect(React.isValidElement(element)).toBe(true);
+});
--- a/src/components/EffortPanel/tests/effortPanelState.test.ts
+++ b/src/components/EffortPanel/tests/effortPanelState.test.ts
@@ -0,0 +1,163 @@
+import { describe, expect, test } from 'bun:test'
+import type { EffortValue } from '../../../utils/effort.js'
+import {
+  CANCEL_MESSAGE,
+  type ApplyFn,
+  ULTRACODE_HINT,
+  END_POSITION,
+  HOME_POSITION,
+  PANEL_POSITIONS,
+  type PanelPosition,
+  computeConfirmOutcome,
+  getInitialCursor,
+  isUltracode,
+  moveLeft,
+  moveRight,
+} from '../effortPanelState.js'
+
+describe('effortPanelState', () => {
+  test('PANEL_POSITIONS is ordered low → ultracode', () => {
+    expect(PANEL_POSITIONS).toEqual([
+      'low',
+      'medium',
+      'high',
+      'xhigh',
+      'max',
+      'ultracode',
+    ])
+  })
+
+  test('moveLeft at low stays at low', () => {
+    expect(moveLeft('low')).toBe('low')
+  })
+
+  test('moveLeft moves one step left', () => {
+    expect(moveLeft('high')).toBe('medium')
+    expect(moveLeft('ultracode')).toBe('max')
+  })
+
+  test('moveRight at ultracode stays at ultracode', () => {
+    expect(moveRight('ultracode')).toBe('ultracode')
+  })
+
+  test('moveRight moves one step right', () => {
+    expect(moveRight('medium')).toBe('high')
+    expect(moveRight('max')).toBe('ultracode')
+  })
+
+  test('HOME_POSITION equals low', () => {
+    expect(HOME_POSITION).toBe('low')
+  })
+
+  test('END_POSITION equals ultracode', () => {
+    expect(END_POSITION).toBe('ultracode')
+  })
+
+  test('isUltracode guard', () => {
+    expect(isUltracode('ultracode')).toBe(true)
+    expect(isUltracode('max')).toBe(false)
+  })
+
+  test('getInitialCursor: returns the env value when the env override is a valid level', () => {
+    expect(
+      getInitialCursor({
+        envOverride: 'high',
+        appStateEffort: 'medium',
+        displayed: 'high',
+      }),
+    ).toBe('high')
+  })
+
+  test('getInitialCursor: uses displayed when env is null (unset)', () => {
+    expect(
+      getInitialCursor({
+        envOverride: null,
+        appStateEffort: undefined,
+        displayed: 'medium',
+      }),
+    ).toBe('medium')
+  })
+
+  test('getInitialCursor: uses displayed when env is undefined', () => {
+    expect(
+      getInitialCursor({
+        envOverride: undefined,
+        appStateEffort: 'high',
+        displayed: 'high',
+      }),
+    ).toBe('high')
+  })
+
+  test('getInitialCursor: falls back to displayed when env is numeric (ant-only)', () => {
+    // A number is not a valid PanelPosition, so fall back
+    expect(
+      getInitialCursor({
+        envOverride: 75,
+        appStateEffort: 'medium',
+        displayed: 'medium',
+      }),
+    ).toBe('medium')
+  })
+
+  test('PanelPosition compile-time type check (implicit)', () => {
+    const p: PanelPosition = 'xhigh'
+    expect(p).toBe('xhigh')
+  })
+})
+
+describe('computeConfirmOutcome', () => {
+  const mockApply: ApplyFn = cursor => ({
+    message: `applied:${cursor}`,
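+    // In tests cursor is a PanelPosition (which includes ultracode), but the ApplyFn contract requires an EffortValue.
+    // At runtime computeConfirmOutcome only calls mockApply for non-ultracode levels,
+    // so the cast is safe. Production code uses the real executeEffort, which never sees ultracode.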
+    effortUpdate: { value: cursor as unknown as EffortValue },
+  })
+
+  test('ultracode → kind=ultracode-hint, with /ultracode guidance', () => {
+    const out = computeConfirmOutcome('ultracode', mockApply)
+    expect(out.kind).toBe('ultracode-hint')
+    if (out.kind === 'ultracode-hint') {
+      expect(out.message).toBe(ULTRACODE_HINT)
+      expect(out.message).toContain('/ultracode')
+    }
+  })
+
+  test('ultracode does not call applyFn (no side effects are triggered)', () => {
+    let called = false
+    const spy: ApplyFn = c => {
+      called = true
+      return { message: `applied:${c}` }
+    }
+    computeConfirmOutcome('ultracode', spy)
+    expect(called).toBe(false)
+  })
+
+  test('low → kind=apply, message comes from applyFn, effortUpdate is passed through', () => {
+    const out = computeConfirmOutcome('low', mockApply)
+    expect(out.kind).toBe('apply')
+    if (out.kind === 'apply') {
+      expect(out.message).toBe('applied:low')
+      expect(out.effortUpdate?.value).toBe('low')
+    }
+  })
+
+  test('high → the apply path does not take the ultracode branch', () => {
+    const out = computeConfirmOutcome('high', mockApply)
+    expect(out.kind).toBe('apply')
+  })
+
+  test('when applyFn returns no effortUpdate, outcome.effortUpdate is undefined', () => {
+    const noUpdate: ApplyFn = c => ({ message: `applied:${c}` })
+    const out = computeConfirmOutcome('medium', noUpdate)
+    expect(out.kind).toBe('apply')
+    if (out.kind === 'apply') {
+      expect(out.effortUpdate).toBeUndefined()
+    }
+  })
+})
+
+test('constant strings', () => {
+  expect(CANCEL_MESSAGE).toBe('Effort unchanged.')
+  expect(ULTRACODE_HINT).toContain('/ultracode <context>')
+})
--- a/src/components/EffortPanel/tests/rippleAnimation.test.ts
+++ b/src/components/EffortPanel/tests/rippleAnimation.test.ts
@@ -0,0 +1,501 @@
+import { describe, expect, test } from 'bun:test'
+import {
+  type Cell,
+  type Overlay,
+  TRANSPARENT,
+  applyOverlaysToCells,
+  cellsToSegments,
+  computeRippleCells,
+  fadeCells,
+  fadeColor,
+  getHueShiftAtTime,
+  intensityToColor,
+  rotateHue,
+} from '../rippleAnimation.js'
+
+describe('intensityToColor', () => {
+  test('intensity=0 → darkest stop (no longer transparent; serves as the panel base color)', () => {
+    expect(intensityToColor(0)).toBe('#1a1f3a')
+  })
+
+  test('intensity < 0 clamps to 0 → darkest stop', () => {
+    expect(intensityToColor(-0.5)).toBe('#1a1f3a')
+  })
+
+  test('intensity > 0 → always a #hex color string (never transparent)', () => {
+    for (const v of [0.05, 0.1, 0.2, 0.5, 0.8]) {
+      const c = intensityToColor(v)
+      expect(c).not.toBe(TRANSPARENT)
+      expect(c).toMatch(/^#[0-9a-fA-F]{6}$/)
+    }
+  })
+
+  test('intensity > 1 clamps to 1 → highest-intensity color', () => {
+    expect(intensityToColor(1.5)).toBe(intensityToColor(1))
+  })
+
+  test('increasing intensity → increasing color stops (at least 3 distinct)', () => {
+    const samples = [0.2, 0.4, 0.6, 0.8, 1.0]
+    const colors = samples.map(intensityToColor)
+    const unique = new Set(colors)
+    expect(unique.size).toBeGreaterThanOrEqual(3)
+  })
+
+  test('intensity=1 → suggestion stop (highest crest stop)', () => {
+    expect(intensityToColor(1)).toBe('#5769F7')
+  })
+
+  test('hueShift=0 → same as omitting hueShift (fast path)', () => {
+    for (const v of [0, 0.2, 0.5, 0.8, 1]) {
+      expect(intensityToColor(v, 0)).toBe(intensityToColor(v))
+    }
+  })
+
+  test('hueShift ≠ 0 → returns a different color (still valid hex)', () => {
+    const base = intensityToColor(0.8)
+    const shifted = intensityToColor(0.8, 30)
+    expect(shifted).toMatch(/^#[0-9a-fA-F]{6}$/)
+    expect(shifted).not.toBe(base)
+  })
+
+  test('hueShift 180° → roughly the complementary color (blue-violet becomes the yellow family)', () => {
+    // #5769F7 ≈ HSL(233, 91, 65); rotating 180° → HSL(53, 91, 65) ≈ yellow family
+    const shifted = intensityToColor(1, 180)
+    expect(shifted).toMatch(/^#[0-9a-fA-F]{6}$/)
+    // No longer blue-violet (the R component should be clearly larger than the B component)
+    const r = parseInt(shifted.slice(1, 3), 16)
+    const b = parseInt(shifted.slice(5, 7), 16)
+    expect(r).toBeGreaterThan(b)
+  })
+})
+
+describe('rotateHue', () => {
+  test('hueShift=0 → returned unchanged (fast path, no round-trip error)', () => {
+    expect(rotateHue('#5769F7', 0)).toBe('#5769F7')
+    expect(rotateHue('#1a1f3a', 0)).toBe('#1a1f3a')
+  })
+
+  test('rotating 360° → same as the original color (one full turn, case-insensitive)', () => {
+    expect(rotateHue('#5769F7', 360).toLowerCase()).toBe('#5769f7')
+    expect(rotateHue('#5769F7', -360).toLowerCase()).toBe('#5769f7')
+  })
+
+  test('rotating ±n*360° → same as the original color (any whole number of turns)', () => {
+    expect(rotateHue('#3a4582', 720).toLowerCase()).toBe('#3a4582')
+    expect(rotateHue('#3a4582', -1080).toLowerCase()).toBe('#3a4582')
+  })
+
+  test('grayscale color (saturation=0) is unchanged by rotation', () => {
+    // #808080 = (128,128,128), saturation=0, so rotation has no effect
+    expect(rotateHue('#808080', 90)).toBe('#808080')
+  })
+
+  test('invalid hex → returned unchanged (defensive)', () => {
+    expect(rotateHue('not-a-color', 90)).toBe('not-a-color')
+    expect(rotateHue('#123', 90)).toBe('#123')
+  })
+
+  test('rotation keeps the 6-digit hex format', () => {
+    const rotated = rotateHue('#5769F7', 45)
+    expect(rotated).toMatch(/^#[0-9a-fA-F]{6}$/)
+  })
+})
+
+describe('getHueShiftAtTime', () => {
+  test('time=0 → 0', () => {
+    expect(getHueShiftAtTime(0)).toBe(0)
+  })
+
+  test('time > 0 → within [0, 360) (continuous rotation, non-negative)', () => {
+    for (const t of [100, 500, 1000, 2000, 5000, 10000, 50000, 100000]) {
+      const shift = getHueShiftAtTime(t)
+      expect(shift).toBeGreaterThanOrEqual(0)
+      expect(shift).toBeLessThan(360)
+    }
+  })
+
+  test('as time advances → hueShift increases monotonically (mod 360)', () => {
+    // Within one period (12000ms), hueShift should increase monotonically
+    const samples = [0, 1000, 2000, 3000, 4000, 5000, 6000]
+    const shifts = samples.map(getHueShiftAtTime)
+    for (let i = 1; i < shifts.length; i++) {
+      expect(shifts[i]).toBeGreaterThan(shifts[i - 1])
+    }
+  })
+
+  test('period is 12000ms (time=12000 wraps back to 0, mod 360)', () => {
+    // 12000ms * 0.03 = 360, and 360 % 360 = 0
+    const shift = getHueShiftAtTime(12000)
+    expect(shift).toBe(0)
+  })
+
+  test('half period 6000ms → hueShift=180 (opposite hue)', () => {
+    // 6000ms * 0.03 = 180
+    expect(getHueShiftAtTime(6000)).toBe(180)
+  })
+
+  test('quarter period 3000ms → hueShift=90', () => {
+    expect(getHueShiftAtTime(3000)).toBe(90)
+  })
+
+  test('multiple periods: time=24000 and time=36000 are equivalent to time=0', () => {
+    expect(getHueShiftAtTime(24000)).toBe(0)
+    expect(getHueShiftAtTime(36000)).toBe(0)
+  })
+})
+
+describe('computeRippleCells', () => {
+  test('returned array length equals width', () => {
+    const cells = computeRippleCells({
+      y: 2,
+      width: 30,
+      time: 100,
+      sourceX: 25,
+      sourceY: 2,
+    })
+    expect(cells.length).toBe(30)
+  })
+
+  test('every cell char is a space', () => {
+    const cells = computeRippleCells({
+      y: 0,
+      width: 10,
+      time: 0,
+      sourceX: 5,
+      sourceY: 0,
+    })
+    for (const cell of cells) {
+      expect(cell.char).toBe(' ')
+    }
+  })
+
+  test('every cell color is a valid string', () => {
+    const cells = computeRippleCells({
+      y: 0,
+      width: 10,
+      time: 0,
+      sourceX: 5,
+      sourceY: 0,
+    })
+    for (const cell of cells) {
+      expect(typeof cell.color).toBe('string')
+      expect(
+        cell.color === TRANSPARENT || /^#[0-9a-fA-F]{6}$/.test(cell.color),
+      ).toBe(true)
+    }
+  })
+
+  test('width=0 → empty array', () => {
+    expect(
+      computeRippleCells({ y: 0, width: 0, time: 0, sourceX: 0, sourceY: 0 }),
+    ).toEqual([])
+  })
+
+  test('width<0 → empty array', () => {
+    expect(
+      computeRippleCells({ y: 0, width: -5, time: 0, sourceX: 0, sourceY: 0 }),
+    ).toEqual([])
+  })
+
+  test('source cell at time=0 is the middle stop ((sin+1)/2 → intensity=0.5); advancing time sweeps through crests and troughs', () => {
+    // v5 smooth wave: dist=0, at time=0 phase=0, sin(0)=0, (0+1)/2=0.5 → intensity=0.5 → middle stop
+    const t0 = computeRippleCells({
+      y: 5,
+      width: 11,
+      time: 0,
+      sourceX: 5,
+      sourceY: 5,
+    })
+    // 0.5 * 7 = 3.5, floor = 3, RIPPLE_COLOR_STOPS[3] = '#2e3870'
+    expect(t0[5].color).toBe('#2e3870')
+
+    // As time advances the phase changes, and the source sweeps through crests (bright stops) and troughs (dark stops)
+    const t1 = computeRippleCells({
+      y: 5,
+      width: 11,
+      time: 1500,
+      sourceX: 5,
+      sourceY: 5,
+    })
+    // Different time, different color (the animation advances)
+    expect(t1[5].color).not.toBe('#2e3870')
+  })
+
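+  test('widened coverage radius: dist=65 (far left) still gets non-darkest colors', () => {
+    // Source at x=65, far end at x=0 → dist=65
+    // falloff = max(0, 1 - 65/90) = 0.278, so at a crest intensity ≈ 0.278,
+    // which should map to a non-darkest stop (#1f2543 or brighter)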
+    const cells = computeRippleCells({
+      y: 0,
+      width: 66,
+      time: 0,
+      sourceX: 65,
+      sourceY: 0,
+    })
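+    // Column 0 has dist=65; at time=0 phase = 65*0.35 = 22.75 rad
+    // sin(22.75) ≈ -0.69 → wave = (sin+1)/2 ≈ 0.16 → intensity ≈ 0.04 → darkest stop,
+    // but as time advances a crest sweeps through here and the intensity rises.
+    // Here we only check that the cell has a valid color (the darkest stop also counts)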
+    expect(cells[0].color).toMatch(/^#[0-9a-fA-F]{6}$/)
+    // After advancing time, non-darkest colors should appear on the left (a crest sweeps through)
+    const t1 = computeRippleCells({
+      y: 0,
+      width: 66,
+      time: 2000,
+      sourceX: 65,
+      sourceY: 0,
+    })
+    const nonDarkest = t1.filter(c => c.color !== '#1a1f3a')
+    expect(nonDarkest.length).toBeGreaterThan(0)
+  })
+
+  test('color distribution changes as time advances (animation effect)', () => {
+    const t0 = computeRippleCells({
+      y: 2,
+      width: 30,
+      time: 0,
+      sourceX: 25,
+      sourceY: 2,
+    })
+    const t1 = computeRippleCells({
+      y: 2,
+      width: 30,
+      time: 500,
+      sourceX: 25,
+      sourceY: 2,
+    })
+    // At least one position has a different color
+    const diffs = t0.filter((c, i) => c.color !== t1[i].color)
+    expect(diffs.length).toBeGreaterThan(0)
+  })
+})
+
+describe('applyOverlaysToCells', () => {
+  function makeCells(colors: string[]): Cell[] {
+    return colors.map(c => ({ char: ' ', color: c }))
+  }
+
+  test('with no overlays, returns equal cells (but as a new array)', () => {
+    const cells = makeCells(['#111', '#222', '#333'])
+    const out = applyOverlaysToCells(cells, [])
+    expect(out).toEqual(cells)
+    expect(out).not.toBe(cells) // defensive copy
+  })
+
+  test('overlay replaces char but keeps the underlying color (when color is not specified)', () => {
+    const cells = makeCells([
+      TRANSPARENT,
+      TRANSPARENT,
+      TRANSPARENT,
+      TRANSPARENT,
+    ])
+    const overlays: Overlay[] = [{ text: 'hi', x: 1 }]
+    const out = applyOverlaysToCells(cells, overlays)
+    expect(out[1].char).toBe('h')
+    expect(out[2].char).toBe('i')
+    expect(out[1].color).toBe(TRANSPARENT) // keeps the underlying color
+    expect(out[0].char).toBe(' ')
+  })
+
+  test('overlay with a color overrides both char and color', () => {
+    const cells = makeCells([TRANSPARENT, TRANSPARENT, TRANSPARENT])
+    const overlays: Overlay[] = [{ text: 'AB', x: 0, color: '#5769F7' }]
+    const out = applyOverlaysToCells(cells, overlays)
+    expect(out[0]).toEqual({ char: 'A', color: '#5769F7' })
+    expect(out[1]).toEqual({ char: 'B', color: '#5769F7' })
+    expect(out[2]).toEqual({ char: ' ', color: TRANSPARENT })
+  })
+
+  test('overlay past the right edge is truncated', () => {
+    const cells = makeCells([TRANSPARENT, TRANSPARENT, TRANSPARENT])
+    const overlays: Overlay[] = [{ text: 'abcdef', x: 1 }]
+    const out = applyOverlaysToCells(cells, overlays)
+    expect(out[0].char).toBe(' ')
+    expect(out[1].char).toBe('a')
+    expect(out[2].char).toBe('b')
+    // 'cdef' is truncated
+  })
+
+  test('negative overlay x → truncated from the start (no overflow to the left)', () => {
+    const cells = makeCells([TRANSPARENT, TRANSPARENT, TRANSPARENT])
+    const overlays: Overlay[] = [{ text: 'abc', x: -1 }]
+    const out = applyOverlaysToCells(cells, overlays)
+    expect(out[0].char).toBe('b') // skips 'a'; 'b' lands at index 0
+    expect(out[1].char).toBe('c')
+    expect(out[2].char).toBe(' ')
+  })
+
+  test('with multiple overlays, later ones override earlier ones (same position)', () => {
+    const cells = makeCells([TRANSPARENT, TRANSPARENT, TRANSPARENT])
+    const overlays: Overlay[] = [
+      { text: 'AAA', x: 0, color: '#111' },
+      { text: 'B', x: 1, color: '#222' },
+    ]
+    const out = applyOverlaysToCells(cells, overlays)
+    expect(out[0]).toEqual({ char: 'A', color: '#111' })
+    expect(out[1]).toEqual({ char: 'B', color: '#222' }) // the second overlay overrides
+    expect(out[2]).toEqual({ char: 'A', color: '#111' })
+  })
+
+  test('overlay start >= array length → skipped entirely', () => {
+    const cells = makeCells([TRANSPARENT, TRANSPARENT])
+    const overlays: Overlay[] = [{ text: 'X', x: 5 }]
+    const out = applyOverlaysToCells(cells, overlays)
+    expect(out.every(c => c.char === ' ')).toBe(true)
+  })
+
+  test('does not mutate the input array (defensive copy)', () => {
+    const cells = makeCells([TRANSPARENT])
+    const snapshot = cells.map(c => ({ ...c }))
+    applyOverlaysToCells(cells, [{ text: 'X', x: 0 }])
+    expect(cells).toEqual(snapshot)
+  })
+})
+
+describe('cellsToSegments', () => {
+  test('empty array → empty array', () => {
+    expect(cellsToSegments([])).toEqual([])
+  })
+
+  test('single cell → single segment', () => {
+    const cells: Cell[] = [{ char: 'a', color: '#111' }]
+    expect(cellsToSegments(cells)).toEqual([{ text: 'a', color: '#111' }])
+  })
+
+  test('all same color → merged into one segment', () => {
+    const cells: Cell[] = [
+      { char: 'a', color: '#111' },
+      { char: 'b', color: '#111' },
+      { char: 'c', color: '#111' },
+    ]
+    expect(cellsToSegments(cells)).toEqual([{ text: 'abc', color: '#111' }])
+  })
+
+  test('alternating colors → each cell is its own segment', () => {
+    const cells: Cell[] = [
+      { char: 'a', color: '#111' },
+      { char: 'b', color: '#222' },
+      { char: 'c', color: '#111' },
+    ]
+    expect(cellsToSegments(cells)).toEqual([
+      { text: 'a', color: '#111' },
+      { text: 'b', color: '#222' },
+      { text: 'c', color: '#111' },
+    ])
+  })
+
+  test('adjacent same-color cells merge; different colors stay separate', () => {
+    const cells: Cell[] = [
+      { char: 'a', color: TRANSPARENT },
+      { char: 'b', color: TRANSPARENT },
+      { char: 'X', color: '#5769F7' },
+      { char: 'Y', color: '#5769F7' },
+      { char: 'c', color: TRANSPARENT },
+    ]
+    expect(cellsToSegments(cells)).toEqual([
+      { text: 'ab', color: TRANSPARENT },
+      { text: 'XY', color: '#5769F7' },
+      { text: 'c', color: TRANSPARENT },
+    ])
+  })
+
+  test('segment text is concatenated in the original order', () => {
+    const cells: Cell[] = [
+      { char: '1', color: '#111' },
+      { char: '2', color: '#111' },
+      { char: '3', color: '#111' },
+    ]
+    expect(cellsToSegments(cells)[0].text).toBe('123')
+  })
+})
+
+describe('fadeColor', () => {
+  test('fade=1 → original color (unchanged apart from lowercase hex)', () => {
+    expect(fadeColor('#5769F7', 1)).toBe('#5769f7')
+  })
+
+  test('fade=0 → TRANSPARENT (cell is not rendered)', () => {
+    expect(fadeColor('#5769F7', 0)).toBe(TRANSPARENT)
+  })
+
+  test('fade ≤ 0.01 → TRANSPARENT (threshold)', () => {
+    expect(fadeColor('#5769F7', 0.01)).toBe(TRANSPARENT)
+    expect(fadeColor('#5769F7', 0.009)).toBe(TRANSPARENT)
+  })
+
+  test('fade=0.5 → each RGB component halved', () => {
+    // #5769F7 = (87, 105, 247), halved → (44, 53, 124) = #2c357c
+    // Math.round(87*0.5)=44, Math.round(105*0.5)=53, Math.round(247*0.5)=124
+    expect(fadeColor('#5769F7', 0.5)).toBe('#2c357c')
+  })
+
+  test('TRANSPARENT input → returned unchanged (not processed)', () => {
+    expect(fadeColor(TRANSPARENT, 1)).toBe(TRANSPARENT)
+    expect(fadeColor(TRANSPARENT, 0.5)).toBe(TRANSPARENT)
+  })
+
+  test('invalid hex format → returned unchanged (defensive)', () => {
+    expect(fadeColor('not-a-color', 0.5)).toBe('not-a-color')
+    expect(fadeColor('#123', 0.5)).toBe('#123') // not 6-digit hex
+  })
+
+  test('fade < 0 clamps to 0 → TRANSPARENT', () => {
+    expect(fadeColor('#5769F7', -0.5)).toBe(TRANSPARENT)
+  })
+
+  test('fade > 1 clamps to 1 → original color', () => {
+    expect(fadeColor('#5769F7', 1.5)).toBe('#5769f7')
+  })
+
+  test('result is always 6-digit hex (leading zeros padded)', () => {
+    // #010203 = (1, 2, 3), fade=0.5 → (1, 1, 2) after Math.round = #010102,
+    // since 1*0.5 = 0.5 and Math.round(0.5) = 1 (JS Math.round rounds half toward +∞, not banker's rounding)
+    // Verify the format: 6-digit hex
+    const result = fadeColor('#010203', 0.5)
+    expect(result).toMatch(/^#[0-9a-f]{6}$/)
+  })
+})
+
+describe('fadeCells', () => {
+  test('empty array → empty array', () => {
+    expect(fadeCells([], 0.5)).toEqual([])
+  })
+
+  test('each cell color is scaled by fade; char is preserved', () => {
+    const cells: Cell[] = [
+      { char: ' ', color: '#5769F7' },
+      { char: 'A', color: '#ffffff' },
+    ]
+    const out = fadeCells(cells, 0.5)
+    expect(out[0]).toEqual({ char: ' ', color: '#2c357c' })
+    // #ffffff = (255, 255, 255), fade=0.5 → (128, 128, 128) = #808080
+    expect(out[1]).toEqual({ char: 'A', color: '#808080' })
+  })
+
+  test('does not mutate the input array (defensive copy)', () => {
+    const cells: Cell[] = [{ char: ' ', color: '#5769F7' }]
+    const snapshot = cells.map(c => ({ ...c }))
+    fadeCells(cells, 0.5)
+    expect(cells).toEqual(snapshot)
+  })
+
+  test('TRANSPARENT cells stay TRANSPARENT', () => {
+    const cells: Cell[] = [
+      { char: ' ', color: TRANSPARENT },
+      { char: ' ', color: '#5769F7' },
+    ]
+    const out = fadeCells(cells, 0.5)
+    expect(out[0].color).toBe(TRANSPARENT)
+    expect(out[1].color).toBe('#2c357c')
+  })
+
+  test('fade=0 → all non-transparent colors become TRANSPARENT', () => {
+    const cells: Cell[] = [
+      { char: ' ', color: '#5769F7' },
+      { char: ' ', color: '#1a1f3a' },
+    ]
+    const out = fadeCells(cells, 0)
+    expect(out[0].color).toBe(TRANSPARENT)
+    expect(out[1].color).toBe(TRANSPARENT)
+  })
+})
--- a/src/components/EffortPanel/effortPanelState.ts
+++ b/src/components/EffortPanel/effortPanelState.ts
@@ -0,0 +1,126 @@
+import type { EffortValue } from '../../utils/effort.js'
+
+/**
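+ * Cursor position on the panel. Used only inside the panel; it never enters AppState / settings / API.
+ * 'ultracode' is not an EffortLevel; in this panel it is only a visual placeholder with guidance text.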
+ */
+export type PanelPosition =
+  | 'low'
+  | 'medium'
+  | 'high'
+  | 'xhigh'
+  | 'max'
+  | 'ultracode'
+
+export const PANEL_POSITIONS: readonly PanelPosition[] = [
+  'low',
+  'medium',
+  'high',
+  'xhigh',
+  'max',
+  'ultracode',
+] as const
+
+export const HOME_POSITION: PanelPosition = 'low'
+export const END_POSITION: PanelPosition = 'ultracode'
+
+/**
+ * Check whether a value can be used as a panel cursor position (excluding ultracode, which only the panel itself produces).
+ */
+function isNonUltracodePosition(
+  value: unknown,
+): value is Exclude<PanelPosition, 'ultracode'> {
+  return (
+    typeof value === 'string' &&
+    value !== 'ultracode' &&
+    (PANEL_POSITIONS as readonly string[]).includes(value)
+  )
+}
+
+/**
+ * Normalize an EffortValue into a cursor position the panel can use.
+ * - null / undefined / number (ant-only) / ultracode → undefined (the caller falls back to displayed)
+ * - a valid string level → that level
+ */
+function normalizeToPanelPosition(
+  value: EffortValue | null | undefined,
+): PanelPosition | undefined {
+  if (value === null || value === undefined) return undefined
+  if (typeof value === 'number') return undefined
+  if (isNonUltracodePosition(value)) {
+    return value
+  }
+  return undefined
+}
+
+export function moveLeft(cursor: PanelPosition): PanelPosition {
+  const idx = PANEL_POSITIONS.indexOf(cursor)
+  if (idx <= 0) return PANEL_POSITIONS[0]
+  return PANEL_POSITIONS[idx - 1]
+}
+
+export function moveRight(cursor: PanelPosition): PanelPosition {
+  const idx = PANEL_POSITIONS.indexOf(cursor)
+  if (idx === -1 || idx >= PANEL_POSITIONS.length - 1) {
+    return PANEL_POSITIONS[PANEL_POSITIONS.length - 1]
+  }
+  return PANEL_POSITIONS[idx + 1]
+}
+
+export function isUltracode(cursor: PanelPosition): boolean {
+  return cursor === 'ultracode'
+}
+
+/**
+ * Decide the initial cursor position when the panel mounts.
+ * Priority: env override (if it is a valid level) > displayed level
+ *
+ * @param envOverride    return value of getEffortEnvOverride(): EffortValue | null | undefined
+ * @param appStateEffort AppState.effortValue
+ * @param displayed      getDisplayedEffortLevel(model, appStateEffort); required, so this function need not depend on model
+ */
+export function getInitialCursor(args: {
+  envOverride: EffortValue | null | undefined
+  appStateEffort: EffortValue | undefined
+  displayed: PanelPosition
+}): PanelPosition {
+  const fromEnv = normalizeToPanelPosition(args.envOverride)
+  if (fromEnv !== undefined) return fromEnv
+  // displayed is already an EffortLevel (never ultracode), so it is valid
+  return args.displayed
+}
+
+// ---- Confirm/cancel decision (ApplyFn is injected to avoid a circular dependency and to ease testing) ----
+
+export type ConfirmOutcome =
+  | {
+      kind: 'apply'
+      message: string
+      effortUpdate?: { value: EffortValue | undefined }
+    }
+  | { kind: 'ultracode-hint'; message: string }
+
+export type ApplyFn = (cursor: PanelPosition) => {
+  message: string
+  effortUpdate?: { value: EffortValue | undefined }
+}
+
+export const ULTRACODE_HINT =
+  'ultracode is not an effort level. Use /ultracode <context> to start a multi-agent workflow.'
+
+export const CANCEL_MESSAGE = 'Effort unchanged.'
+
+export function computeConfirmOutcome(
+  cursor: PanelPosition,
+  applyFn: ApplyFn,
+): ConfirmOutcome {
+  if (isUltracode(cursor)) {
+    return { kind: 'ultracode-hint', message: ULTRACODE_HINT }
+  }
+  const result = applyFn(cursor)
+  return {
+    kind: 'apply',
+    message: result.message,
+    effortUpdate: result.effortUpdate,
+  }
+}
--- a/src/components/EffortPanel/rippleAnimation.ts
+++ b/src/components/EffortPanel/rippleAnimation.ts
@@ -0,0 +1,361 @@
+/**
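+ * Background ripple animation for the EffortPanel ultracode level: a pure-function module (color-driven).
+ *
+ * Design:
+ * - Starts only while the cursor rests on ultracode (the clock subscription is controlled by useRippleFrame)
+ * - Source: bottom-right of the panel (where the ultracode text sits), radiating concentric waves left/up
+ * - Per-position intensity (0~1) → color (dark violet-blue gradient in the suggestion family)
+ * - Text overlays sit on top of the ripple (last-write-wins; color can be specified separately)
+ *
+ * Render model: one cell (char + color) per position; adjacent same-color cells merge into a segment.
+ * The render layer outputs a Box flexDirection="row" with multiple Text segments (one color per segment).
+ *
+ * All functions are pure: same input → same output, which eases unit tests and frame snapshots.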
+ */
+
+/**
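+ * Suggestion-family color gradient (dark background → suggestion color).
+ *
+ * Design: every intensity maps to a concrete color (never transparent), so the whole panel is a
+ * "dark violet-blue ocean" base with crests flowing over it. This makes ripple color changes more
+ * visible, and troughs keep a dark color instead of "disappearing".
+ *
+ * The darkest stop is #1a1f3a (violet-black, ~12% brightness), not pure black, so that far-away
+ * troughs do not look like a "hard black edge". Crests are capped at suggestion (#5769F7) so that
+ * they do not clash with and swallow text overlays (which also use the suggestion family).
+ *
+ * These are the base colors (returned when hueShift=0). Production code passes a hueShift to rotate
+ * the whole gradient around the hue wheel, so the dominant color drifts over time.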
+ */
+const RIPPLE_COLOR_STOPS = [
+  '#1a1f3a', // [0.000, 0.143): darkest (violet-black base, not pure black)
+  '#1f2543', // [0.143, 0.286)
+  '#252c55', // [0.286, 0.429)
+  '#2e3870', // [0.429, 0.571)
+  '#3a4582', // [0.571, 0.714)
+  '#4a5bb0', // [0.714, 0.857)
+  '#5769F7', // [0.857, 1.000]: suggestion (crest)
+] as const
+
+/**
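+ * Continuous hue rotation speed (degrees per ms).
+ * Period = 360 / 0.03 = 12000ms = 12s, far slower than the ripple phase (~1.6s),
+ * so the color drift feels "ambient" rather than "animated".
+ *
+ * Continuous rotation (not a sin oscillation) visits the full 0~360° hue wheel:
+ * blue 233° → purple 270° → magenta 300° → red 0° → orange 30° → yellow 60° →
+ * green 120° → cyan 180° → blue 233° (one full turn).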
+ */
+const HUE_ROTATION_DEG_PER_MS = 0.03
+
+/**
+ * hex → {h, s, l} (h in degrees, s/l in 0~1).
+ *
+ * Standard RGB → HSL conversion. Invalid hex (not #rrggbb) → h=0, s=0, l=0 (black).
+ */
+function hexToHsl(hex: string): { h: number; s: number; l: number } {
+  if (!/^#[0-9a-fA-F]{6}$/.test(hex)) return { h: 0, s: 0, l: 0 }
+  const r = parseInt(hex.slice(1, 3), 16) / 255
+  const g = parseInt(hex.slice(3, 5), 16) / 255
+  const b = parseInt(hex.slice(5, 7), 16) / 255
+  const max = Math.max(r, g, b)
+  const min = Math.min(r, g, b)
+  const l = (max + min) / 2
+  const d = max - min
+  if (d === 0) return { h: 0, s: 0, l }
+  const s = d / (1 - Math.abs(2 * l - 1))
+  let h: number
+  if (max === r) {
+    h = 60 * (((g - b) / d) % 6)
+  } else if (max === g) {
+    h = 60 * ((b - r) / d + 2)
+  } else {
+    h = 60 * ((r - g) / d + 4)
+  }
+  if (h < 0) h += 360
+  return { h, s, l }
+}
+
+/**
+ * {h, s, l} → hex.
+ *
+ * Standard HSL → RGB conversion. h is normalized mod 360 automatically.
+ */
+function hslToHex(h: number, s: number, l: number): string {
+  const hNorm = ((h % 360) + 360) % 360
+  const c = (1 - Math.abs(2 * l - 1)) * s
+  const hPrime = hNorm / 60
+  const x = c * (1 - Math.abs((hPrime % 2) - 1))
+  let r = 0
+  let g = 0
+  let b = 0
+  if (hPrime < 1) {
+    r = c
+    g = x
+  } else if (hPrime < 2) {
+    r = x
+    g = c
+  } else if (hPrime < 3) {
+    g = c
+    b = x
+  } else if (hPrime < 4) {
+    g = x
+    b = c
+  } else if (hPrime < 5) {
+    r = x
+    b = c
+  } else {
+    r = c
+    b = x
+  }
+  const m = l - c / 2
+  const toHex = (v: number): string =>
+    Math.round((v + m) * 255)
+      .toString(16)
+      .padStart(2, '0')
+  return `#${toHex(r)}${toHex(g)}${toHex(b)}`
+}
+
+/**
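+ * Rotate a hex color around the hue wheel by hueShift degrees.
+ *
+ * Saturation and lightness are kept; only hue rotates. Used to drift RIPPLE_COLOR_STOPS as a whole
+ * to other hues (blue → purple → red → yellow → green → cyan → blue), so the dominant color changes over time.
+ *
+ * Invalid hex is returned unchanged (defensive).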
+ */
+export function rotateHue(hex: string, hueShift: number): string {
+  if (!/^#[0-9a-fA-F]{6}$/.test(hex)) return hex
+  if (hueShift === 0) return hex // fast path: skip a pointless round-trip
+  const { h, s, l } = hexToHsl(hex)
+  return hslToHex(h + hueShift, s, l)
+}
+
+/**
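+ * Compute the current hue shift from time (degrees, continuous rotation).
+ *
+ * For time ≥ 0 the return value is always in [0, 360) and increases monotonically (mod 360).
+ * One full turn takes about 12s and covers the whole hue wheel.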
+ */
+export function getHueShiftAtTime(time: number): number {
+  return (time * HUE_ROTATION_DEG_PER_MS) % 360
+}
+
+/**
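+ * Intensity (any real number) → color string.
+ *
+ * Clamped to [0, 1], then bucketed by RIPPLE_COLOR_STOPS. Never returns transparent.
+ * intensity=0 → darkest stop (#1a1f3a, the panel base color).
+ *
+ * @param hueShift degrees to rotate the whole gradient around the hue wheel (0 = base colors).
+ *                 Production code passes getHueShiftAtTime(time) for the color drift.
+ *                 Tests pass 0 (the default) for deterministic output.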
+ */
+export function intensityToColor(intensity: number, hueShift = 0): string {
+  const v = intensity < 0 ? 0 : intensity > 1 ? 1 : intensity
+  const idx = Math.min(
+    RIPPLE_COLOR_STOPS.length - 1,
+    Math.floor(v * RIPPLE_COLOR_STOPS.length),
+  )
+  const base = RIPPLE_COLOR_STOPS[idx]
+  return hueShift === 0 ? base : rotateHue(base, hueShift)
+}
+
+/**
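+ * The 'transparent' literal. intensityToColor never returns it (kept as a compatibility export); fadeColor returns it when fade ≤ 0.01.
+ * The render layer can use this constant for semantic checks (e.g. a cell is overlay text, not ripple background).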
+ */
+export const TRANSPARENT = 'transparent'
+
+/**
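+ * A single-position cell: char + color.
+ * - When color is 'transparent' the render layer applies no color (the background stays the terminal default).
+ * - Text overlay cells use a concrete color (suggestion / warning, etc.).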
+ */
+export type Cell = {
+  char: string
+  color: string
+}
+
+/**
+ * Render segment: adjacent cells with the same color, merged.
+ * Reduces the number of React Text nodes (one row drops from 72 Text nodes to ~5-10).
+ */
+export type Segment = {
+  text: string
+  color: string
+}
+
+/**
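+ * Text overlay: writes the text string over a row starting at column x.
+ * - color undefined: keep the underlying ripple cell's own color (only char is replaced)
+ * - color specified: override both char and color
+ *
+ * A later overlay overrides an earlier one at the same position (last-write-wins).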
+ */
+export type Overlay = {
+  text: string
+  /** Start column; may be negative (the prefix is truncated) */
+  x: number
+  /** Overlay character color; undefined = keep the underlying ripple color */
+  color?: string
+}
+
+/**
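+ * Ripple background character.
+ * A space leaves the cell empty so that only color paints it (visually a "color blob").
+ * A space has a stable width (always 1 column), unlike variable-width Unicode characters.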
+ */
+const RIPPLE_BG_CHAR = ' '
+
+/**
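+ * Compute the full list of ripple cells for one panel row y.
+ *
+ * Ripple math (v6.1: smooth breathing + full hue-wheel rotation of the dominant color):
+ *   dx = x - sourceX
+ *   dy = (y - sourceY) * 1.5    (visual stretch in y: row height > character width)
+ *   dist = sqrt(dx² + dy²)
+ *   phase = dist * 0.35 - time * 0.004   (slowed to 1/3 of the original speed)
+ *   wave = (sin(phase) + 1) / 2          ([−1,1] → [0,1], smooth, no flat band)
+ *   falloff = max(0, 1 - dist / 90)       (coverage radius widened to 90)
+ *   intensity = wave * falloff
+ *   hueShift = (time * 0.03) % 360        (continuous rotation, one full turn per 12s)
+ *   color = intensityToColor(intensity, hueShift)
+ *
+ * v6.1 makes hueShift a continuous rotation (v6 was a sin ±25° oscillation, too narrow a range to
+ * reach red/yellow). Each 12s now covers the full wheel: blue→purple→magenta→red→orange→yellow→green→cyan→blue.
+ * The two time constants (phase 0.004 vs hue 0.03) are decoupled, so "flow" and "color change" stay out of sync.
+ *
+ * Each position's intensity goes through intensityToColor → color string (never transparent) and is written to the cell.
+ *
+ * @returns a Cell array whose length is exactly width (empty when width ≤ 0)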
+ */
+export function computeRippleCells(args: {
+  y: number
+  width: number
+  time: number
+  sourceX: number
+  sourceY: number
+}): Cell[] {
+  const { y, width, time, sourceX, sourceY } = args
+  if (width <= 0) return []
+
+  const hueShift = getHueShiftAtTime(time)
+
+  const cells: Cell[] = new Array(width)
+  for (let x = 0; x < width; x++) {
+    const dx = x - sourceX
+    const dy = (y - sourceY) * 1.5
+    const dist = Math.sqrt(dx * dx + dy * dy)
+
+    // Main ripple phase (slowed: 0.012 → 0.004, about 1/3 speed)
+    const phase = dist * 0.35 - time * 0.004
+    // Smooth breathing: [−1,1] → [0,1], no flat band, no frequency doubling
+    const wave = (Math.sin(phase) + 1) / 2
+
+    // Distance falloff (coverage radius widened from 40 to 90)
+    const falloff = Math.max(0, 1 - dist / 90)
+    const intensity = wave * falloff
+
+    cells[x] = {
+      char: RIPPLE_BG_CHAR,
+      color: intensityToColor(intensity, hueShift),
+    }
+  }
+  return cells
+}
+
+/**
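+ * Write overlay text onto cells.
+ *
+ * Behavior:
+ * - Text characters always win (they replace the underlying cell.char)
+ * - When overlay.color is undefined, the underlying cell.color is kept (only char is replaced)
+ * - When overlay.color is specified, both char and color are overridden
+ * - Text past the right edge is truncated
+ * - When x is negative, the first |x| characters are skipped
+ *
+ * Does not mutate the input array; returns a new array (defensive copy).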
+ */
+export function applyOverlaysToCells(
+  cells: Cell[],
+  overlays: Overlay[],
+): Cell[] {
+  const out: Cell[] = cells.map(c => ({ ...c }))
+  for (const overlay of overlays) {
+    const start = overlay.x
+    if (start >= out.length) continue
+    for (let i = 0; i < overlay.text.length; i++) {
+      const targetIdx = start + i
+      if (targetIdx < 0) continue
+      if (targetIdx >= out.length) break
+      out[targetIdx] = {
+        char: overlay.text[i],
+        color: overlay.color ?? out[targetIdx].color,
+      }
+    }
+  }
+  return out
+}
+
+/**
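+ * Merge adjacent cells of the same color into segments.
+ *
+ * This cuts the number of render nodes: a 72-cell row may have only 5-10 color change points,
+ * so after merging it renders a handful of Text segments instead of 72 single-character Text nodes.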
+ */
+export function cellsToSegments(cells: Cell[]): Segment[] {
+  if (cells.length === 0) return []
+  const segments: Segment[] = []
+  let current: Segment = { text: cells[0].char, color: cells[0].color }
+  for (let i = 1; i < cells.length; i++) {
+    const cell = cells[i]
+    if (cell.color === current.color) {
+      current.text += cell.char
+    } else {
+      segments.push(current)
+      current = { text: cell.char, color: cell.color }
+    }
+  }
+  segments.push(current)
+  return segments
+}
+
+/**
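+ * Scale the brightness of a hex color by a fade factor (clamped to 0~1).
+ *
+ * Used for the enter/exit animation:
+ * - fade ≤ 0.01 → TRANSPARENT (the cell renders no background, same as the terminal default)
+ * - fade = 0.5  → each RGB component is halved (dark purple-blue)
+ * - fade = 1    → original color (full ripple)
+ *
+ * An invalid hex value (not in #rrggbb form) is returned unchanged (defensive).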
+ */
+export function fadeColor(color: string, fade: number): string {
+  if (color === TRANSPARENT) return TRANSPARENT
+  const f = fade < 0 ? 0 : fade > 1 ? 1 : fade
+  if (f <= 0.01) return TRANSPARENT
+  if (!/^#[0-9a-fA-F]{6}$/.test(color)) return color
+  const r = parseInt(color.slice(1, 3), 16)
+  const g = parseInt(color.slice(3, 5), 16)
+  const b = parseInt(color.slice(5, 7), 16)
+  const fr = Math.round(r * f)
+    .toString(16)
+    .padStart(2, '0')
+  const fg = Math.round(g * f)
+    .toString(16)
+    .padStart(2, '0')
+  const fb = Math.round(b * f)
+    .toString(16)
+    .padStart(2, '0')
+  return `#${fr}${fg}${fb}`
+}
+
+/**
+ * Scale the colors of a whole row of cells by fade (used for the enter/exit animation).
+ *
+ * Does not mutate the input array; returns a new array.
+ */
+export function fadeCells(cells: Cell[], fade: number): Cell[] {
+  return cells.map(c => ({ char: c.char, color: fadeColor(c.color, fade) }))
+}
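As a rough check on the ripple math documented for `computeRippleCells` above, the standalone sketch below restates the formulas from the doc comment and evaluates them at a few points. It is not the module's code: `rippleIntensity` and `hueShiftAt` are hypothetical names, `hueShiftAt` assumes the documented `(time * 0.03) % 360` formula because `getHueShiftAtTime` itself is not shown in this diff, and `time` is assumed to be in milliseconds.

```typescript
// Hypothetical restatement of the documented ripple formulas (not the module's code).
function rippleIntensity(
  x: number,
  y: number,
  time: number,
  sourceX: number,
  sourceY: number,
): number {
  const dx = x - sourceX
  const dy = (y - sourceY) * 1.5 // rows are taller than columns are wide
  const dist = Math.sqrt(dx * dx + dy * dy)
  const phase = dist * 0.35 - time * 0.004
  const wave = (Math.sin(phase) + 1) / 2 // map [-1, 1] to [0, 1]
  const falloff = Math.max(0, 1 - dist / 90) // reaches 0 at distance 90
  return wave * falloff
}

// Assumption: mirrors the documented hueShift formula; time is in milliseconds.
function hueShiftAt(time: number): number {
  return (time * 0.03) % 360
}

// At the source with time = 0: phase = 0, wave = 0.5, falloff = 1.
const atSource = rippleIntensity(10, 5, 0, 10, 5)
// 90 columns away on the same row the falloff is 0, so the ripple vanishes.
const atEdge = rippleIntensity(100, 5, 0, 10, 5)
// Half of the 12 s hue cycle, so roughly 180 degrees.
const halfTurn = hueShiftAt(6000)
```

Because the phase term uses 0.004 and the hue term uses 0.03, the two cycles have different periods, which is what keeps the "flow" and the color change from lining up.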
--- a/src/components/EffortPanel/useRippleFrame.ts
+++ b/src/components/EffortPanel/useRippleFrame.ts
@@ -0,0 +1,25 @@
+import { type DOMElement, useAnimationFrame } from '@anthropic/ink'
+
+const RIPPLE_INTERVAL_MS = 60
+
+/**
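+ * Animation hook for the ultracode ripple.
+ *
+ * Design:
+ * - Subscribes to the clock only while enabled=true (cursor === 'ultracode', or the exit fade has not finished);
+ *   when null is passed, useAnimationFrame does not subscribe to ClockContext, so setInterval never fires.
+ * - Returns [ref, time]: ref attaches to the ripple container (it drives viewport-pause), and time
+ *   is used by computeRippleLine to compute the ripple phase of each row.
+ *
+ * When enabled=false, time is returned as 0 (downstream skips the ripple layer entirely based on enabled,
+ * but 0 is still a valid value, which keeps the phase from unexpectedly evaluating to NaN).
+ *
+ * Note: callers should pass showingRipple (on ultracode || fade > 0), not rippleActive,
+ * so the clock keeps advancing during the exit animation and the fade useEffect has ticks to fire on.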
+ */
+export function useRippleFrame(
+  enabled: boolean,
+): [ref: (element: DOMElement | null) => void, time: number] {
+  const [ref, time] = useAnimationFrame(enabled ? RIPPLE_INTERVAL_MS : null)
+  return [ref, enabled ? time : 0]
+}
--- a/src/components/permissions/PermissionRequest.tsx
+++ b/src/components/permissions/PermissionRequest.tsx
@@ -45,14 +45,12 @@ const ReviewArtifactPermissionRequest = feature('REVIEW_ARTIFACT')
  : null;

 const WorkflowTool = feature('WORKFLOW_SCRIPTS')
-  ? (
-      require('@claude-code-best/builtin-tools/tools/WorkflowTool/WorkflowTool.js') as typeof import('@claude-code-best/builtin-tools/tools/WorkflowTool/WorkflowTool.js')
-    ).WorkflowTool
+  ? (require('../../workflow/wiring.js') as typeof import('../../workflow/wiring.js')).createWorkflowToolCore()
  : null;

 const WorkflowPermissionRequest = feature('WORKFLOW_SCRIPTS')
  ? (
-      require('@claude-code-best/builtin-tools/tools/WorkflowTool/WorkflowPermissionRequest.js') as typeof import('@claude-code-best/builtin-tools/tools/WorkflowTool/WorkflowPermissionRequest.js')
+      require('../../workflow/WorkflowPermissionRequest.js') as typeof import('../../workflow/WorkflowPermissionRequest.js')
    ).WorkflowPermissionRequest
  : null;

--- a/src/components/tasks/BackgroundTasksDialog.tsx
+++ b/src/components/tasks/BackgroundTasksDialog.tsx
@@ -1,6 +1,5 @@
 import { feature } from 'bun:bundle';
 import figures from 'figures';
-import type { AgentId } from '../../types/ids.js';
 import React, { type ReactNode, useEffect, useEffectEvent, useMemo, useRef, useState } from 'react';
 import { isCoordinatorMode } from 'src/coordinator/coordinatorMode.js';
 import { useTerminalSize } from 'src/hooks/useTerminalSize.js';
@@ -107,15 +106,12 @@ type ListItem =
 // ~1.3K lines into external builds. Gate with feature() + require so the
 // bundler can dead-code-eliminate the branch.
 /* eslint-disable @typescript-eslint/no-require-imports */
-const WorkflowDetailDialog = feature('WORKFLOW_SCRIPTS')
-  ? (require('./WorkflowDetailDialog.js') as typeof import('./WorkflowDetailDialog.js')).WorkflowDetailDialog
-  : null;
+// WorkflowDetailDialog removed: workflow details are now shown in the /workflows panel.
 const workflowTaskModule = feature('WORKFLOW_SCRIPTS')
  ? (require('src/tasks/LocalWorkflowTask/LocalWorkflowTask.js') as typeof import('src/tasks/LocalWorkflowTask/LocalWorkflowTask.js'))
  : null;
 const killWorkflowTask = workflowTaskModule?.killWorkflowTask ?? null;
-const skipWorkflowAgent = workflowTaskModule?.skipWorkflowAgent ?? null;
-const retryWorkflowAgent = workflowTaskModule?.retryWorkflowAgent ?? null;
+// skipWorkflowAgent / retryWorkflowAgent are now called only from the /workflows panel (the old detail dialog was removed).
 // Relative path, not `src/...` path-mapping — Bun's DCE can statically
 // resolve + eliminate `./` requires, but path-mapped strings stay opaque
 // and survive as dead literals in the bundle. Matches tasks.ts pattern.
@@ -440,29 +436,58 @@ export function BackgroundTasksDialog({ onDone, toolUseContext, initialDetailTas
            key={`teammate-${task.id}`}
          />
        );
-      case 'local_workflow':
-        if (!WorkflowDetailDialog) return null;
+      case 'local_workflow': {
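+        // Workflow detail view entered via Shift+Down/Enter. The old WorkflowDetailDialog was removed and
+        // details now live in the /workflows panel, but this still needs a placeholder view the user can exit.
+        // Otherwise Esc/←/q would all do nothing once the user is in, and the UI would be stuck. Follows the MonitorMcpDetailDialog pattern:
+        // ←/Esc go back (goBackToList: closes for a single task, returns to the list for multiple), x kills (while running).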
+        const onKill =
+          task.status === 'running' && killWorkflowTask ? () => killWorkflowTask(task.id, setAppState) : undefined;
        return (
-          <WorkflowDetailDialog
-            workflow={task}
-            onDone={onDone as (message?: string, options?: { display?: string }) => void}
-            onKill={
-              task.status === 'running' && killWorkflowTask ? () => killWorkflowTask(task.id, setAppState) : undefined
-            }
-            onSkipAgent={
-              task.status === 'running' && skipWorkflowAgent
-                ? (agentId: string) => skipWorkflowAgent(task.id, agentId as AgentId, setAppState)
-                : undefined
-            }
-            onRetryAgent={
-              task.status === 'running' && retryWorkflowAgent
-                ? (agentId: string) => retryWorkflowAgent(task.id, agentId as AgentId, setAppState)
-                : undefined
-            }
-            onBack={goBackToList}
+          <Box
            key={`workflow-${task.id}`}
-          />
+            flexDirection="column"
+            tabIndex={0}
+            borderStyle="round"
+            onKeyDown={(e: KeyboardEvent) => {
+              if (e.key === 'left') {
+                e.preventDefault();
+                goBackToList();
+              } else if (e.key === 'x' && onKill) {
+                e.preventDefault();
+                onKill();
+              }
+            }}
+          >
+            <Dialog
+              title={task.workflowName}
+              subtitle={
+                <Text dimColor>
+                  {task.status}
+                  {task.summary ? ` · ${task.summary}` : ''}
+                </Text>
+              }
+              onCancel={goBackToList}
+              inputGuide={() => (
+                <Byline>
+                  <KeyboardShortcutHint shortcut="←" action="go back" />
+                  <KeyboardShortcutHint shortcut="Esc" action="close" />
+                  {onKill && <KeyboardShortcutHint shortcut="x" action="stop" />}
+                </Byline>
+              )}
+            >
+              {task.status === 'failed' && task.error ? (
+                <Box flexDirection="column">
+                  <Text color="error">Failure reason: {task.error}</Text>
+                  <Text color="subtle">Use /workflows to view live phase and agent progress</Text>
+                </Box>
+              ) : (
+                <Text color="subtle">Use /workflows to view live phase and agent progress</Text>
+              )}
+            </Dialog>
+          </Box>
        );
+      }
      case 'monitor_mcp':
        if (!MonitorMcpDetailDialog) return null;
        return (
--- a/src/components/tasks/WorkflowDetailDialog.tsx
+++ b/src/components/tasks/WorkflowDetailDialog.tsx
@@ -1,103 +0,0 @@
-import React, { useCallback } from 'react';
-import type { DeepImmutable } from 'src/types/utils.js';
-import { useElapsedTime } from '../../hooks/useElapsedTime.js';
-import { Box, Text, type KeyboardEvent } from '@anthropic/ink';
-import { useKeybindings } from '../../keybindings/useKeybinding.js';
-import type { LocalWorkflowTaskState } from '../../tasks/LocalWorkflowTask/LocalWorkflowTask.js';
-import { Byline } from '../design-system/Byline.js';
-import { Dialog } from '../design-system/Dialog.js';
-import { KeyboardShortcutHint } from '../design-system/KeyboardShortcutHint.js';
-
-type Props = {
-  workflow: DeepImmutable<LocalWorkflowTaskState>;
-  onDone: (message?: string, options?: { display?: string }) => void;
-  onKill?: () => void;
-  onSkipAgent?: (agentId: string) => void;
-  onRetryAgent?: (agentId: string) => void;
-  onBack?: () => void;
-};
-
-/**
- * Detail dialog for local workflow tasks shown in the Shift+Down background
- * tasks overlay. Displays the workflow name, file, status, and output.
- * Follows the DreamDetailDialog/ShellDetailDialog pattern.
- */
-export function WorkflowDetailDialog({
-  workflow,
-  onDone: _onDone,
-  onKill,
-  onSkipAgent: _onSkipAgent,
-  onRetryAgent: _onRetryAgent,
-  onBack,
-}: Props): React.ReactNode {
-  const elapsedTime = useElapsedTime(workflow.startTime, workflow.status === 'running', 1000, 0);
-
-  useKeybindings({}, { context: 'WorkflowDetail' });
-
-  const handleKeyDown = useCallback(
-    (e: KeyboardEvent): void => {
-      if (e.key === 'left' && onBack) {
-        e.preventDefault();
-        onBack();
-      } else if (e.key === 'x' && workflow.status === 'running' && onKill) {
-        e.preventDefault();
-        onKill();
-      }
-    },
-    [onBack, onKill, workflow.status],
-  );
-
-  return (
-    <Box flexDirection="column" tabIndex={0} borderStyle="round" onKeyDown={handleKeyDown}>
-      <Dialog
-        title="Workflow"
-        subtitle={
-          <Text dimColor>
-            {elapsedTime} · {workflow.workflowName}
-          </Text>
-        }
-        onCancel={onBack ?? (() => {})}
-        inputGuide={() => (
-          <Byline>
-            {onBack && <KeyboardShortcutHint shortcut={'\u2190'} action="go back" />}
-            <KeyboardShortcutHint shortcut="Esc" action="close" />
-            {workflow.status === 'running' && onKill && <KeyboardShortcutHint shortcut="x" action="stop" />}
-          </Byline>
-        )}
-      >
-        <Box flexDirection="column" gap={1}>
-          <Text>
-            <Text bold>Status:</Text>{' '}
-            {workflow.status === 'running' ? (
-              <Text color="ansi:green">running</Text>
-            ) : workflow.status === 'completed' ? (
-              <Text color="ansi:green">{workflow.status}</Text>
-            ) : (
-              <Text color="ansi:red">{workflow.status}</Text>
-            )}
-          </Text>
-          <Text>
-            <Text bold>Description:</Text> {workflow.description}
-          </Text>
-          <Text>
-            <Text bold>Workflow:</Text> {workflow.workflowName}
-          </Text>
-          <Text>
-            <Text bold>File:</Text> {workflow.workflowFile}
-          </Text>
-          {workflow.summary && (
-            <Text>
-              <Text bold>Summary:</Text> {workflow.summary}
-            </Text>
-          )}
-          {workflow.output && (
-            <Box flexDirection="column">
-              <Text bold>Output:</Text>
-              <Text dimColor>{workflow.output}</Text>
-            </Box>
-          )}
-        </Box>
-      </Dialog>
-    </Box>
-  );
-}
--- a/src/constants/tools.ts
+++ b/src/constants/tools.ts
@@ -32,7 +32,7 @@ import { TEAM_DELETE_TOOL_NAME } from '@claude-code-best/builtin-tools/tools/Tea
 import { EXECUTE_TOOL_NAME } from '@claude-code-best/builtin-tools/tools/ExecuteTool/constants.js'
 import { ENTER_WORKTREE_TOOL_NAME } from '@claude-code-best/builtin-tools/tools/EnterWorktreeTool/constants.js'
 import { EXIT_WORKTREE_TOOL_NAME } from '@claude-code-best/builtin-tools/tools/ExitWorktreeTool/constants.js'
-import { WORKFLOW_TOOL_NAME } from '@claude-code-best/builtin-tools/tools/WorkflowTool/constants.js'
+import { WORKFLOW_TOOL_NAME } from '@claude-code-best/workflow-engine'
 import {
  CRON_CREATE_TOOL_NAME,
  CRON_DELETE_TOOL_NAME,
@@ -165,6 +165,11 @@ export const CORE_TOOLS = new Set([
  LSP_TOOL_NAME, // 'LSP'
  // Skills
  SKILL_TOOL_NAME, // 'Skill'
+  // Workflow orchestration: a first-class primitive that /ultracode directs
+  // the model to call directly. Kept core (not deferred) so it is always visible
+  // and callable without a SearchExtraTools round-trip. Registration itself
+  // is still feature-gated (feature('WORKFLOW_SCRIPTS')) in tools.ts.
+  WORKFLOW_TOOL_NAME, // 'Workflow'
  // Scheduling & monitoring
  SLEEP_TOOL_NAME, // 'Sleep'
  // Tool discovery (always loaded)
--- a/src/keybindings/defaultBindings.ts
+++ b/src/keybindings/defaultBindings.ts
@@ -326,6 +326,22 @@ export const DEFAULT_BINDINGS: KeybindingBlock[] = [
      space: 'modelPicker:toggle1M',
    },
  },
+  // Effort panel (slash /effort without args)
+  {
+    context: 'EffortPanel',
+    bindings: {
+      left: 'effortPanel:decrease',
+      right: 'effortPanel:increase',
+      h: 'effortPanel:decrease',
+      l: 'effortPanel:increase',
+      home: 'effortPanel:home',
+      end: 'effortPanel:end',
+      enter: 'effortPanel:confirm',
+      escape: 'effortPanel:cancel',
+      q: 'effortPanel:cancel',
+      'ctrl+c': 'effortPanel:cancel',
+    },
+  },
  // Select component navigation (used by /model, /resume, permission prompts, etc.)
  {
    context: 'Select',
--- a/src/keybindings/schema.ts
+++ b/src/keybindings/schema.ts
@@ -154,6 +154,13 @@ export const KEYBINDING_ACTIONS = [
  'modelPicker:decreaseEffort',
  'modelPicker:increaseEffort',
  'modelPicker:toggle1M',
+  // Effort panel actions (slash /effort without args)
+  'effortPanel:decrease',
+  'effortPanel:increase',
+  'effortPanel:home',
+  'effortPanel:end',
+  'effortPanel:confirm',
+  'effortPanel:cancel',
  // Select component actions (distinct from confirm: to avoid collisions)
  'select:next',
  'select:previous',
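The `EffortPanel` binding block and the `effortPanel:*` entries added to `KEYBINDING_ACTIONS` presumably need to stay in sync. A minimal sketch of that consistency check, plus a context-scoped key lookup, is shown below. The binding and action values are copied from this diff, but `EFFORT_PANEL_BINDINGS`, `EFFORT_PANEL_ACTIONS`, and `resolveAction` are hypothetical names for illustration; the real resolver and validation code are not part of this diff.

```typescript
// Values copied from the EffortPanel block added to DEFAULT_BINDINGS.
const EFFORT_PANEL_BINDINGS: Record<string, string> = {
  left: 'effortPanel:decrease',
  right: 'effortPanel:increase',
  h: 'effortPanel:decrease',
  l: 'effortPanel:increase',
  home: 'effortPanel:home',
  end: 'effortPanel:end',
  enter: 'effortPanel:confirm',
  escape: 'effortPanel:cancel',
  q: 'effortPanel:cancel',
  'ctrl+c': 'effortPanel:cancel',
}

// Values copied from the entries added to KEYBINDING_ACTIONS.
const EFFORT_PANEL_ACTIONS = new Set<string>([
  'effortPanel:decrease',
  'effortPanel:increase',
  'effortPanel:home',
  'effortPanel:end',
  'effortPanel:confirm',
  'effortPanel:cancel',
])

// Hypothetical lookup: map a key name to its action, or undefined if unbound.
function resolveAction(key: string): string | undefined {
  return Object.prototype.hasOwnProperty.call(EFFORT_PANEL_BINDINGS, key)
    ? EFFORT_PANEL_BINDINGS[key]
    : undefined
}

// Any bound action missing from the schema list would show up here.
const unknownActions = Object.values(EFFORT_PANEL_BINDINGS).filter(
  action => !EFFORT_PANEL_ACTIONS.has(action),
)
```

With the values in this diff, every bound action is declared, and several keys (`escape`, `q`, `ctrl+c`) deliberately share the cancel action.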
--- a/src/main.tsx
+++ b/src/main.tsx
@@ -753,6 +753,15 @@ export async function main() {

  process.on('exit', () => {
    resetCursor();
+    // Kill every running workflow so no orphaned task is left behind in AppState
+    try {
+      const { peekWorkflowService } = require('./workflow/service.js') as {
+        peekWorkflowService: () => { shutdown: () => void } | null;
+      };
+      peekWorkflowService()?.shutdown();
+    } catch {
+      // Workflow is not enabled or already unloaded: ignore
+    }
  });
  process.on('SIGINT', () => {
    // In print mode, print.ts registers its own SIGINT handler that aborts
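The exit hook in this hunk relies on two properties: it only peeks at the workflow service rather than creating one, and it swallows any failure so that process exit is never disturbed. A minimal sketch of that pattern follows, assuming a lazily created singleton. `FakeService`, `getService`, `peekService`, and `shutdownOnExit` are hypothetical stand-ins, not the actual `./workflow/service.js` API.

```typescript
type FakeService = { shutdown: () => void }

let instance: FakeService | null = null
let shutdownCalls = 0

// Hypothetical factory: creates the singleton on first use.
function getService(): FakeService {
  if (instance === null) {
    instance = {
      shutdown: () => {
        shutdownCalls += 1
      },
    }
  }
  return instance
}

// Returns the singleton only if it already exists; never creates it.
function peekService(): FakeService | null {
  return instance
}

// Exit-time cleanup: synchronous, and errors are swallowed on purpose.
function shutdownOnExit(): void {
  try {
    peekService()?.shutdown()
  } catch {
    // Nothing useful can be done this late in the process lifetime.
  }
}

shutdownOnExit() // no service yet: nothing happens and nothing is created
const callsBeforeStart = shutdownCalls
getService()
shutdownOnExit() // service exists: shutdown runs once
const callsAfterStart = shutdownCalls
```

Node.js runs `'exit'` listeners synchronously and does not wait for asynchronous work queued inside them, which is why the sketch keeps the shutdown path synchronous.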
--- a/src/skills/bundled/tests/ultracode.test.ts
+++ b/src/skills/bundled/tests/ultracode.test.ts
@@ -0,0 +1,97 @@
+import { afterEach, describe, expect, test } from 'bun:test'
+
+import type { PromptCommand } from '../../../types/command.js'
+import { clearBundledSkills, getBundledSkills } from '../../bundledSkills.js'
+import { registerUltracodeSkill } from '../ultracode.js'
+
+// Command is a union; source/getPromptForCommand only exist on the prompt
+// variant. Narrow via type assertion once we've confirmed type === 'prompt'.
+function asPrompt(c: { type: string }): PromptCommand {
+  return c as unknown as PromptCommand
+}
+
+// bundledSkills is a process-global registry (per CLAUDE.md mock/state rules,
+// module-level singletons leak across test files in one bun test process).
+// Clear after each test so `ultracode` never leaks into other suites that
+// enumerate registered skills (e.g. skill-search prefetch discovery).
+afterEach(() => {
+  clearBundledSkills()
+})
+
+describe('registerUltracodeSkill', () => {
+  test('registers a user-invocable prompt command named ultracode', () => {
+    clearBundledSkills()
+    registerUltracodeSkill()
+
+    const skills = getBundledSkills()
+    const ultracode = skills.find(s => s.name === 'ultracode')
+    expect(ultracode).toBeDefined()
+    expect(ultracode!.type).toBe('prompt')
+    expect(ultracode!.userInvocable).toBe(true)
+    expect(ultracode!.whenToUse).toBeTruthy()
+    expect(ultracode!.description).toContain('workflow')
+    const promptCmd = asPrompt(ultracode!)
+    expect(promptCmd.source).toBe('bundled')
+  })
+
+  test('getPromptForCommand injects the orchestration playbook with key sections', async () => {
+    clearBundledSkills()
+    registerUltracodeSkill()
+
+    const ultracode = getBundledSkills().find(s => s.name === 'ultracode')!
+    const blocks = await asPrompt(ultracode).getPromptForCommand(
+      '',
+      {} as never,
+    )
+    expect(blocks).toHaveLength(1)
+    expect(blocks[0]!.type).toBe('text')
+
+    const text = (blocks[0] as { type: 'text'; text: string }).text
+    // Title + opt-in rule + harness-injection note
+    expect(text).toContain('Workflow Orchestration Playbook')
+    expect(text).toContain('explicitly opted into multi-agent orchestration')
+    expect(text).toContain('harness')
+    // Orchestration primitives
+    expect(text).toContain('Script body hooks')
+    expect(text).toContain('parallel')
+    expect(text).toContain('pipeline')
+    // Determinism / script-execution-model constraints (JS not TS; Date.now/Math.random throw)
+    expect(text).toContain('plain JavaScript, NOT TypeScript')
+    expect(text).toContain('Date.now()')
+    // Barrier vs pipeline guidance, quality patterns, resume, hard limits
+    expect(text).toContain('DEFAULT TO pipeline()')
+    expect(text).toContain('Quality patterns')
+    expect(text).toContain('resumeFromRunId')
+    expect(text).toContain('4096')
+  })
+
+  test('appends user-provided args to the prompt when given', async () => {
+    clearBundledSkills()
+    registerUltracodeSkill()
+
+    const ultracode = getBundledSkills().find(s => s.name === 'ultracode')!
+    const blocks = await asPrompt(ultracode).getPromptForCommand(
+      '迁移 auth 模块',
+      {} as never,
+    )
+    const text = (blocks[0] as { type: 'text'; text: string }).text
+    expect(text.endsWith('迁移 auth 模块\n')).toBe(true)
+    expect(text).toContain('User input')
+  })
+
+  test('is not gated behind USER_TYPE — registers with no env set', () => {
+    // Unset USER_TYPE for the duration of this test. If the skill were
+    // ant-gated (like stuck.ts), it would not appear here.
+    const previousUserType = process.env.USER_TYPE
+    delete process.env.USER_TYPE
+    try {
+      clearBundledSkills()
+      registerUltracodeSkill()
+      expect(getBundledSkills().some(s => s.name === 'ultracode')).toBe(true)
+    } finally {
+      // Restore even on failure so we never mutate the process env for other test files.
+      if (previousUserType === undefined) delete process.env.USER_TYPE
+      else process.env.USER_TYPE = previousUserType
+    }
+  })
+})
--- a/src/skills/bundled/index.ts
+++ b/src/skills/bundled/index.ts
@@ -9,6 +9,7 @@ import { registerRememberSkill } from './remember.js'
 import { registerSimplifySkill } from './simplify.js'
 import { registerSkillifySkill } from './skillify.js'
 import { registerStuckSkill } from './stuck.js'
+import { registerUltracodeSkill } from './ultracode.js'
 import { registerCronDeleteSkill, registerCronListSkill } from './cronManage.js'
 import { registerLoopSkill } from './loop.js'
 import { registerDreamSkill } from './dream.js'
@@ -35,6 +36,7 @@ export function initBundledSkills(): void {
  registerSimplifySkill()
  registerBatchSkill()
  registerStuckSkill()
+  registerUltracodeSkill()
  registerLoopSkill()
  registerCronListSkill()
  registerCronDeleteSkill()
--- a/src/skills/bundled/ultracode.ts
+++ b/src/skills/bundled/ultracode.ts
@@ -0,0 +1,235 @@
+import { registerBundledSkill } from '../bundledSkills.js'
+
+/**
+ * /ultracode — multi-agent workflow orchestration playbook (knowledge-only prompt skill).
+ *
+ * Injects the Workflow orchestration manual into context with zero runtime side
+ * effects: it doesn't change the main loop or toggle any behavior switch. The
+ * user/model uses it to decide when to call the Workflow tool, how to script
+ * fan-out and verification, and how to keep runs deterministic and resumable.
+ *
+ * General-purpose skill (not ant-only); available to all users.
+ */
+const ULTRACODE_PROMPT = `# /ultracode — Workflow Orchestration Playbook
+
+Execute a workflow script that orchestrates multiple subagents deterministically. Workflows run in the background — this tool returns immediately with a task ID, and a \`<task-notification>\` arrives when the workflow completes. Use \`/workflows\` to watch live progress.
+
+A workflow structures work across many agents — to be comprehensive (decompose and cover in parallel), to be confident (independent perspectives and adversarial checks before committing), or to take on scale one context can't hold (migrations, audits, broad sweeps). The script is where you encode that structure: what fans out, what verifies, what synthesizes.
+
+ONLY call this tool when the user has explicitly opted into multi-agent orchestration. Workflows can spawn dozens of agents and consume a large amount of tokens; the user must request that scale, not have it inferred. Explicit opt-in means one of:
+
+- The user included the keyword "ultracode" in their prompt (you'll see a system-reminder confirming it).
+- Ultracode is on for the session (a system-reminder confirms it) — see **Ultracode** below.
+- The user directly asked you to run a workflow or use multi-agent orchestration in their own words ("use a workflow", "run a workflow", "fan out agents", "orchestrate this with subagents"). The ask must be in the user's words — a task that would merely benefit from a workflow does not count.
+- The user invoked a skill or slash command whose instructions tell you to call Workflow.
+- The user asked you to run a specific named or saved workflow.
+
+For any other task — even one that would clearly benefit from parallelism — do NOT call this tool. Use the Agent tool for individual subagents, or briefly describe what a multi-agent workflow could do and how much it would roughly cost, and ask the user whether to run it. Mention they can ask for one with "use a workflow" in a future message to skip the ask.
+
+When you do call it, the right move is often **hybrid**: scout inline first (list the files, find the channels, scope the diff) to discover the work-list, then call Workflow to pipeline over it. You don't need to know the shape before the *task* — only before the *orchestration step*.
+
+Common single-phase workflows you can chain across turns:
+
+- **Understand** — parallel readers over relevant subsystems → structured map
+- **Design** — judge panel of N independent approaches → scored synthesis
+- **Review** — dimensions → find → adversarially verify (example below)
+- **Research** — multi-modal sweep → deep-read → synthesize
+- **Migrate** — discover sites → transform each (worktree isolation) → verify
+
+For larger work, run several in sequence — read each result before deciding the next phase. You stay in the loop; each workflow is one well-scoped fan-out.
+
+**Opt-in is harness-injected, not repo-controlled.** The "ultracode is on for the session" system-reminder is produced by the harness (claude.ai/client), not by this skill or any repo state. This skill is knowledge-only — it injects the playbook and toggles nothing. There is no in-repo feature flag, env var, or effort level for ultracode; do not add one. The Workflow tool is compiled in via the build-time WORKFLOW_SCRIPTS feature flag and exposes no per-session runtime gate.
+
+**Ultracode.** When a system-reminder confirms ultracode is on, that opt-in is standing: author and run a workflow for every substantive task by default. The goal is the most exhaustive, correct answer you can produce — token cost is not a constraint. For multi-phase work (understand → design → implement → review), that often means several workflows in sequence — one per phase — so you stay in the loop between them. The quality patterns below (adversarial verify, multi-modal sweep, completeness critic, loop-until-dry) are the tools; pick what fits the task. Lean toward orchestrating with workflows and adversarially verifying your findings — unless the work is trivial or already verified. Solo only on conversational turns or trivial mechanical edits. When a reminder says ultracode is off, revert to the opt-in rule above.
+
+Pass the script inline via \`script\` — do not Write it to a file first. Every invocation automatically persists its script to a file under the session directory and returns the path in the tool result. To iterate on a workflow, edit that file with Write/Edit and re-invoke Workflow with \`{scriptPath: "<path>"}\` instead of resending the full script.
+
+Every script must begin with \`export const meta = {...}\`:
+
+\`\`\`js
+export const meta = {
+  name: 'find-flaky-tests',
+  description: 'Find flaky tests and propose fixes',   // one-line, shown in permission dialog
+  phases: [                                            // one entry per phase() call
+    { title: 'Scan', detail: 'grep test logs for retries' },
+    { title: 'Fix', detail: 'one agent per flaky test' },
+  ],
+}
+// script body starts here — use agent()/parallel()/pipeline()/phase()/log()
+phase('Scan')
+const flaky = await agent('grep CI logs for retry markers', {schema: FLAKY_SCHEMA})
+...
+\`\`\`
+
+The \`meta\` object must be a PURE LITERAL — no variables, function calls, spreads, or template interpolation. Required fields: \`name\`, \`description\`. Optional: \`whenToUse\` (shown in the workflow list), \`phases\`. Use the SAME phase titles in meta.phases as in phase() calls — titles are matched exactly; a phase() call with no matching meta entry just gets its own progress group. Add \`model\` to a phase entry when that phase uses a specific model override.
+
+Script body hooks:
+
+- \`agent(prompt: string, opts?: {label?: string, phase?: string, schema?: object, model?: string, isolation?: 'worktree', agentType?: string}): Promise<any>\` — spawn a subagent. Without schema, returns its final text as a string. With schema (a JSON Schema), the subagent is forced to call a StructuredOutput tool and agent() returns the validated object — no parsing needed. Returns null if the user skips the agent mid-run or the subagent dies on a terminal API error after retries (filter with .filter(Boolean)). opts.label overrides the display label. opts.phase explicitly assigns this agent to a progress group (use this inside pipeline()/parallel() stages to avoid races on the global phase() state — same phase string → same group box). opts.model overrides the model for this agent call. Default to omitting it — the agent inherits the main-loop model (the resolved session model), which is almost always correct. Only set it when you're highly confident a different tier fits the task; when unsure, omit. opts.isolation: 'worktree' runs the agent in a fresh git worktree — EXPENSIVE (~200-500ms setup + disk per agent), use ONLY when agents mutate files in parallel and would otherwise conflict; the worktree is auto-removed if unchanged. opts.agentType uses a custom subagent type (e.g. 'Explore', 'code-reviewer') instead of the default workflow subagent — resolved from the same registry as the Agent tool; composes with schema (the custom agent's system prompt gets a StructuredOutput instruction appended).
+- \`pipeline(items, stage1, stage2, ...): Promise<any[]>\` — run each item through all stages independently, NO barrier between stages. Item A can be in stage 3 while item B is still in stage 1. This is the DEFAULT for multi-stage work. Wall-clock = slowest single-item chain, not sum-of-slowest-per-stage. Every stage callback receives (prevResult, originalItem, index) — use originalItem/index in later stages to label work without threading context through stage 1's return value. A stage that throws drops that item to \`null\` and skips its remaining stages.
+- \`parallel(thunks: Array<() => Promise<any>>): Promise<any[]>\` — run tasks concurrently. This is a BARRIER: awaits all thunks before returning. A thunk that throws (or whose agent errors) resolves to \`null\` in the result array — the call itself never rejects, so \`.filter(Boolean)\` before using the results. Use ONLY when you genuinely need all results together.
+- \`log(message: string): void\` — emit a progress message to the user (shown as a narrator line above the progress tree)
+- \`phase(title: string): void\` — start a new phase; subsequent agent() calls are grouped under this title in the progress display
+- \`args: any\` — the value passed as Workflow's \`args\` input, verbatim (undefined if not provided). Pass arrays/objects as actual JSON values in the tool call, NOT as a JSON-encoded string — \`args: ["a.ts", "b.ts"]\`, not \`args: "[\\"a.ts\\", ...]"\` (a stringified list reaches the script as one string, so \`args.filter\`/\`args.map\` throw). Use this to parameterize named workflows — e.g. pass a research question, target path, or config object directly instead of via a side-channel file.
+- \`budget: {total: number|null, spent(): number, remaining(): number}\` — the turn's token target from the user's "+500k"-style directive. \`budget.total\` is null if no target was set. \`budget.spent()\` returns output tokens spent this turn across the main loop and all workflows — the pool is shared, not per-workflow. \`budget.remaining()\` returns \`max(0, total - spent())\`, or \`Infinity\` if no target. The target is a HARD ceiling, not advisory: once \`spent()\` reaches \`total\`, further \`agent()\` calls throw. Use for dynamic loops: \`while (budget.total && budget.remaining() > 50_000) { ... }\`, or static scaling: \`const FLEET = budget.total ? Math.floor(budget.total / 100_000) : 5\`.
+- \`workflow(nameOrRef: string | {scriptPath: string}, args?: any): Promise<any>\` — run another workflow inline as a sub-step and return whatever it returns. Pass a name to invoke a saved workflow (same registry as {name: "..."}), or {scriptPath} to run a script file you Wrote earlier. The child shares this run's concurrency cap, agent counter, abort signal, and token budget — its agents appear under a "▸ name" group in /workflows and its tokens count toward budget.spent(). The args param becomes the child's \`args\` global. Nesting is one level only: workflow() inside a child throws. Throws on unknown name / unreadable scriptPath / child syntax error; catch to handle gracefully.
+
+Concurrent agent() calls are capped at 3 by default per workflow — excess calls queue and run as slots free up. The Workflow tool accepts an optional \`maxConcurrency\` input (1–16) to override per-run. OMIT it to use 3. To set maxConcurrency to ANY value other than 3, you MUST first ask the user via AskUserQuestion (offer 3 / 6 / 9 with 3 marked "(Recommended)") — the ONLY exception is when the user has already specified a number this session ("use 6", "maxConcurrency 9"). Never silently raise concurrency above 3 just because the workflow fans out; 3 is the recommended default. You can still pass 100 items to parallel()/pipeline() and they all complete; only the configured number run at any moment. Total agent count across a workflow's lifetime is capped at 1000 — a runaway-loop backstop set far above any real workflow. A single parallel()/pipeline() call accepts at most 4096 items; passing more is an explicit error, not a silent truncation.
+
+Model tier per task — when you DO override opts.model. Valid aliases: 'haiku' | 'sonnet' | 'opus' | 'best' | 'sonnet[1m]' | 'opus[1m]' | 'opusplan'. The main loop already runs on the user's chosen tier (usually sonnet), so omit model for most agents. Override only when the task clearly fits a different tier:
+
+- 'haiku' — fast and cheap (~5x cheaper/faster than sonnet). Use for: classification, extraction, labeling, regex-like pattern matching, "does this match X?" gating, simple format conversions. Wrong choice for anything reasoning over multiple concepts or producing code.
+- 'sonnet' — the workhorse. Most code edits, multi-file reading, tool-use chains, schema/structured output, code review, refactoring, debugging. When in doubt, OMIT model and let the agent inherit this.
+- 'opus' — strongest reasoning, slowest and most expensive (~5x sonnet cost). Use for: architecture decisions, deep root-causing across modules, novel algorithm design, adversarial verification of sonnet's findings, security review. Reserve for the 1-2 agents per workflow where reasoning actually matters.
+- 'best' — provider's "best available" (currently opus-tier). Use when you want max intelligence and don't care about cost or pinning a tier.
+
+Rule of thumb: if you can't articulate WHY this agent needs a different tier, omit model. A workflow that mixes tiers deliberately (haiku to triage → sonnet for the work → opus to verify) usually beats uniform opus-everywhere on cost AND quality. Don't put opus on every dimension of a 9-dimension review — sonnet finds the bugs, opus verifies the few that matter.
+
+Subagents are told their final text IS the return value (not a human-facing message), so they return raw data. For structured output, use the schema option — validation happens at the tool-call layer so the model retries on mismatch.
+
+Workflow agents can reach all session-connected MCP tools via ToolSearch — schemas load on demand per agent. Caveat: interactively-authenticated MCP servers (e.g. claude.ai) may be absent in headless/cron runs.
+
+Scripts are plain JavaScript, NOT TypeScript — type annotations (\`: string[]\`), interfaces, and generics fail to parse. The script body runs in an async context — use \`await\` directly. Standard JS built-ins (JSON, Math, Array, etc.) are available — EXCEPT \`Date.now()\`/\`Math.random()\`/argless \`new Date()\`, which throw (they would break resume); pass timestamps in via \`args\`, stamp results after the workflow returns, and for randomness vary the agent prompt/label by index. No filesystem or Node.js API access.
+
+DEFAULT TO pipeline(). Only reach for a barrier (parallel between stages) when you genuinely need ALL prior-stage results together.
+
+A barrier is correct ONLY when stage N needs cross-item context from all of stage N-1:
+
+- Dedup/merge across the full result set before expensive downstream work
+- Early-exit if the total count is zero ("0 bugs found → skip verification entirely")
+- Stage N's prompt references "the other findings" for comparison
+
+A barrier is NOT justified by:
+
+- "I need to flatten/map/filter first" — do it inside a pipeline stage: \`pipeline(items, stageA, r => transform([r]).flat(), stageB)\`
+- "The stages are conceptually separate" — that's what pipeline() models. Separate stages ≠ synchronized stages.
+- "It's cleaner code" — barrier latency is real. If 5 finders run and the slowest takes 3× as long as the fastest, a barrier leaves the fastest finder's results sitting idle for 2/3 of the stage's wall-clock time.
+
+Smell test: if you wrote
+
+\`\`\`js
+const a = await parallel(...)
+const b = transform(a)        // flatten, map, filter — no cross-item dependency
+const c = await parallel(b.map(...))
+\`\`\`
+
+that middle transform doesn't need the barrier. Rewrite as a pipeline with the transform inside a stage. When in doubt: pipeline.
+
+The canonical multi-stage pattern — pipeline by default, each dimension verifies as soon as its review completes:
+
+\`\`\`js
+export const meta = {
+  name: 'review-changes',
+  description: 'Review changed files across dimensions, verify each finding',
+  phases: [{ title: 'Review' }, { title: 'Verify' }],
+}
+const DIMENSIONS = [{key: 'bugs', prompt: '...'}, {key: 'perf', prompt: '...'}]
+const results = await pipeline(
+  DIMENSIONS,
+  d => agent(d.prompt, {label: \`review:\${d.key}\`, phase: 'Review', schema: FINDINGS_SCHEMA}),
+  review => parallel(review.findings.map(f => () =>
+    agent(\`Adversarially verify: \${f.title}\`, {label: \`verify:\${f.file}\`, phase: 'Verify', schema: VERDICT_SCHEMA})
+      .then(v => ({...f, verdict: v}))
+  ))
+)
+const confirmed = results.flat().filter(Boolean).filter(f => f.verdict?.isReal)
+return { confirmed }
+// Dimension 'bugs' findings verify while dimension 'perf' is still reviewing. No wasted wall-clock.
+\`\`\`
+
+When a barrier IS correct — dedup across all findings before expensive verification:
+
+\`\`\`js
+const all = await parallel(DIMENSIONS.map(d => () => agent(d.prompt, {schema: FINDINGS_SCHEMA})))
+const deduped = dedupeByFileAndLine(all.filter(Boolean).flatMap(r => r.findings))  // <-- genuinely needs ALL at once
+const verified = await parallel(deduped.map(f => () => agent(verifyPrompt(f), {schema: VERDICT_SCHEMA})))
+\`\`\`
+
+Loop-until-count pattern — accumulate to a target:
+
+\`\`\`js
+const bugs = []
+while (bugs.length < 10) {
+  const result = await agent("Find bugs in this codebase.", {schema: BUGS_SCHEMA})
+  bugs.push(...result.bugs)
+  log(\`\${bugs.length}/10 found\`)
+}
+\`\`\`
+
+Loop-until-budget pattern — scale depth to the user's "+500k" directive. Guard on budget.total: with no target set, remaining() is Infinity and the loop would run straight to the 1000-agent cap.
+
+\`\`\`js
+const bugs = []
+while (budget.total && budget.remaining() > 50_000) {
+  const result = await agent("Find bugs in this codebase.", {schema: BUGS_SCHEMA})
+  bugs.push(...result.bugs)
+  log(\`\${bugs.length} found, \${Math.round(budget.remaining()/1000)}k remaining\`)
+}
+\`\`\`
+
+Composing patterns — exhaustive review (find → dedup vs seen → diverse-lens panel → loop-until-dry):
+
+\`\`\`js
+const seen = new Set(), confirmed = []
+let dry = 0
+while (dry < 2) {                                              // loop-until-dry
+  const found = (await parallel(FINDERS.map(f => () =>          // barrier: collect all finders this round
+    agent(f.prompt, {phase: 'Find', schema: BUGS})))).filter(Boolean).flatMap(r => r.bugs)
+  const fresh = found.filter(b => !seen.has(key(b)))           // dedup vs ALL seen — plain code, not an agent
+  if (!fresh.length) { dry++; continue }
+  dry = 0; fresh.forEach(b => seen.add(key(b)))
+  const judged = await parallel(fresh.map(b => () =>           // every fresh bug judged concurrently...
+    parallel(['correctness','security','repro'].map(lens => () =>   // ...each by 3 distinct lenses
+      agent(\`Judge "\${b.desc}" via the \${lens} lens — real?\`, {phase: 'Verify', schema: VERDICT})))
+      .then(vs => ({ b, real: vs.filter(Boolean).filter(v => v.real).length >= 2 }))))
+  confirmed.push(...judged.filter(v => v.real).map(v => v.b))
+}
+return confirmed
+// dedup vs \`seen\`, NOT \`confirmed\` — else judge-rejected findings reappear every round and it never converges.
+\`\`\`
+
+Quality patterns — common shapes; pick by task and compose freely:
+
+- Adversarial verify: spawn N independent skeptics per finding, each prompted to REFUTE. Kill if ≥majority refute. Prevents plausible-but-wrong findings from surviving.
+
+\`\`\`js
+const votes = await parallel(Array.from({length: 3}, () => () =>
+  agent(\`Try to refute: \${claim}. Default to refuted=true if uncertain.\`, {schema: VERDICT})))
+const survives = votes.filter(Boolean).filter(v => !v.refuted).length >= 2
+\`\`\`
+
+- Perspective-diverse verify: when a finding can fail in more than one way, give each verifier a distinct lens (correctness, security, perf, does-it-reproduce) instead of N identical refuters — diversity catches failure modes redundancy can't.
+- Judge panel: generate N independent attempts from different angles (e.g. MVP-first, risk-first, user-first), score with parallel judges, synthesize from the winner while grafting the best ideas from runners-up. Beats one-attempt-iterated when the solution space is wide.
+- Loop-until-dry: for unknown-size discovery (bugs, issues, edge cases), keep spawning finders until K consecutive rounds return nothing new. Simple counters (while count < N) miss the tail.
+- Multi-modal sweep: parallel agents each searching a different way (by-container, by-content, by-entity, by-time). Each is blind to what the others surface; useful when one search angle won't find everything.
+- Completeness critic: a final agent that asks "what's missing — modality not run, claim unverified, source unread?" What it finds becomes the next round of work.
+- No silent caps: if a workflow bounds coverage (top-N, no-retry, sampling), \`log()\` what was dropped — silent truncation reads as "covered everything" when it didn't.
+
+Scale to what the user asked for. "find any bugs" → a few finders, single-vote verify. "thoroughly audit this" or "be comprehensive" → larger finder pool, 3–5 vote adversarial pass, synthesis stage. When unsure, lean toward thoroughness for research/review/audit requests and toward brevity for quick checks.
+
+These patterns aren't exhaustive — compose novel harnesses when the task calls for it (tournament brackets, self-repair loops, staged escalation, whatever fits).
+
+Use this tool for multi-step orchestration where control flow should be deterministic (loops, conditionals, fan-out) rather than model-driven.
+
+## Resume
+
+The tool result includes a runId. To resume after a pause, kill, or script edit, relaunch with \`Workflow({scriptPath, resumeFromRunId})\` — the longest unchanged prefix of agent() calls returns cached results instantly; the first edited/new call and everything after it runs live. Same script + same args → 100% cache hit. Date.now()/Math.random()/argless new Date() throw in scripts (they would break cache replay) — stamp results after the workflow returns, or pass timestamps via args. Fallback when no journal is available: Read agent-<id>.jsonl files in the transcript directory and hand-author a continuation script.
+`
+
+export function registerUltracodeSkill(): void {
+  registerBundledSkill({
+    name: 'ultracode',
+    description:
+      'Enter multi-agent workflow orchestration mode: when to use the Workflow tool, script primitives, quality patterns, determinism constraints, resume/budget, and files/commands.',
+    whenToUse:
+      'When a task can be decomposed or parallelized, needs multi-perspective confidence (e.g. find then adversarially verify), exceeds a single context (large migrations, broad audits, long-tail enumeration), or needs resume/auditability — orchestrate multiple subagents with the Workflow tool.',
+    userInvocable: true,
+    async getPromptForCommand(args) {
+      let prompt = ULTRACODE_PROMPT
+      if (args) {
+        prompt += `\n## User input\n\n${args}\n`
+      }
+      return [{ type: 'text', text: prompt }]
+    },
+  })
+}
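Review note: the `budget` contract described in the prompt text above (`remaining()` is `max(0, total - spent())`, or `Infinity` with no target, and the target is a hard ceiling) can be sketched as below. This is a minimal illustration under stated assumptions, not the engine's implementation: `makeBudget` and `charge` are hypothetical names introduced here, and the real engine may account for tokens differently.

```typescript
// Hypothetical sketch of the budget contract described in the prompt text.
// `makeBudget` and `charge` are illustrative names, not engine APIs.
type Budget = {
  total: number | null
  spent(): number
  remaining(): number
}

function makeBudget(total: number | null): {
  budget: Budget
  charge(tokens: number): void
} {
  let spent = 0
  const budget: Budget = {
    total,
    spent: () => spent,
    // max(0, total - spent), or Infinity when no target was set
    remaining: () => (total === null ? Infinity : Math.max(0, total - spent)),
  }
  return {
    budget,
    charge(tokens: number): void {
      // Hard ceiling: once spent has reached total, further work is refused.
      if (total !== null && spent >= total) {
        throw new Error('token budget exhausted')
      }
      spent += tokens
    },
  }
}
```

With `total === null`, `remaining()` is `Infinity`, which is why the loop guard in the prompt checks `budget.total` first: `while (budget.total && budget.remaining() > 50_000)` short-circuits when no target is set instead of running to the agent cap.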
--- a/src/tasks/LocalWorkflowTask/LocalWorkflowTask.ts
+++ b/src/tasks/LocalWorkflowTask/LocalWorkflowTask.ts
@@ -22,6 +22,8 @@ export type LocalWorkflowTaskState = TaskStateBase & {
  agentCount?: number
  /** Captured output from workflow execution. */
  output?: string
+  /** Failure reason surfaced to BackgroundTasksDialog (parallels RunProgress.error). */
+  error?: string
  /** Agent that spawned this task. Used for orphan cleanup. */
  agentId?: AgentId
  /** Abort controller for cancellation. */
@@ -96,6 +98,7 @@ export function completeWorkflowTask(
 export function failWorkflowTask(
  taskId: string,
  setAppState: SetAppState,
+  error?: string,
 ): void {
  updateTaskState<LocalWorkflowTaskState>(taskId, setAppState, task => ({
    ...task,
@@ -103,6 +106,7 @@ export function failWorkflowTask(
    endTime: Date.now(),
    notified: true,
    abortController: undefined,
+    ...(error !== undefined ? { error } : {}),
  }))
 }

--- a/src/tasks/LocalWorkflowTask/tests/LocalWorkflowTask.test.ts
+++ b/src/tasks/LocalWorkflowTask/tests/LocalWorkflowTask.test.ts
@@ -0,0 +1,90 @@
+import { describe, expect, mock, test } from 'bun:test'
+import { debugMock } from '../../../../tests/mocks/debug.js'
+import { logMock } from '../../../../tests/mocks/log.js'
+
+// ─── Mocks (only the side-effecting dependency chain is mocked) ───
+
+mock.module('src/utils/debug.ts', debugMock)
+mock.module('src/utils/log.ts', logMock)
+
+mock.module('src/constants/xml.js', () => ({
+  TASK_NOTIFICATION_TAG: 'task_notification',
+  TASK_ID_TAG: 'task_id',
+  TOOL_USE_ID_TAG: 'tool_use_id',
+  OUTPUT_FILE_TAG: 'output_file',
+  STATUS_TAG: 'status',
+  SUMMARY_TAG: 'summary',
+  WORKTREE_TAG: 'worktree',
+  WORKTREE_PATH_TAG: 'worktree_path',
+  WORKTREE_BRANCH_TAG: 'worktree_branch',
+  TASK_TYPE_TAG: 'task_type',
+}))
+
+mock.module('src/utils/messageQueueManager.js', () => ({
+  enqueuePendingNotification: () => {},
+}))
+
+mock.module('src/utils/sdkEventQueue.js', () => ({
+  enqueueSdkEvent: () => {},
+}))
+
+mock.module('src/utils/task/diskOutput.js', () => ({
+  getTaskOutputDelta: async () => null,
+  getTaskOutputPath: (id: string) => `/tmp/${id}`,
+  evictTaskOutput: () => {},
+  initTaskOutputAsSymlink: async () => {},
+}))
+
+// ─── Import after mocks ───
+
+const { registerLocalWorkflowTask, failWorkflowTask } = await import(
+  '../LocalWorkflowTask.js'
+)
+
+// ─── Helpers ───
+
+type AppStateLike = { tasks: Record<string, any> }
+type SetAppStateLike = (f: (prev: AppStateLike) => AppStateLike) => void
+
+function createSetState(): {
+  setAppState: SetAppStateLike
+  getState: () => AppStateLike
+} {
+  let state: AppStateLike = { tasks: {} }
+  return {
+    setAppState: f => {
+      state = f(state)
+    },
+    getState: () => state,
+  }
+}
+
+// ─── Tests ───
+
+describe('failWorkflowTask', () => {
+  test('stores the error string in state (so BackgroundTasksDialog can show the failure reason)', () => {
+    const { setAppState, getState } = createSetState()
+    const taskId = registerLocalWorkflowTask(setAppState as any, {
+      description: 'test',
+      workflowName: 'wf',
+      workflowFile: '/tmp/wf.ts',
+    })
+    failWorkflowTask(taskId, setAppState as any, 'agent X threw Error: boom')
+    const task = getState().tasks[taskId]
+    expect(task.status).toBe('failed')
+    expect(task.error).toBe('agent X threw Error: boom')
+  })
+
+  test('state.error stays undefined when no error is passed (backward compatible with existing callers)', () => {
+    const { setAppState, getState } = createSetState()
+    const taskId = registerLocalWorkflowTask(setAppState as any, {
+      description: 'test',
+      workflowName: 'wf',
+      workflowFile: '/tmp/wf.ts',
+    })
+    failWorkflowTask(taskId, setAppState as any)
+    const task = getState().tasks[taskId]
+    expect(task.status).toBe('failed')
+    expect(task.error).toBeUndefined()
+  })
+})
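Review note: the two tests above depend on the conditional spread in `failWorkflowTask`, which omits the `error` key entirely when the argument is not passed rather than writing `error: undefined`. A standalone sketch of that pattern follows; `TaskLike` and `applyFailure` are hypothetical names used only for illustration, and the real function also sets `endTime`, `notified`, and `abortController`.

```typescript
// Hypothetical stand-in for the task state; only the fields needed here.
type TaskLike = { status: string; error?: string }

// Mirrors the conditional-spread pattern from the diff: the `error` key is
// written only when a value was actually passed.
function applyFailure(task: TaskLike, error?: string): TaskLike {
  return {
    ...task,
    status: 'failed',
    ...(error !== undefined ? { error } : {}),
  }
}

const withoutReason = applyFailure({ status: 'running' })
const withReason = applyFailure({ status: 'running' }, 'boom')
console.log('error' in withoutReason, withReason.error)
```

One consequence of this shape is that a task which already carries an `error` keeps it when `applyFailure` is called without a new reason, because the spread of `task` is not overwritten.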
--- a/src/tools.ts
+++ b/src/tools.ts
@@ -154,11 +154,7 @@ const ListPeersTool = feature('UDS_INBOX')
      .ListPeersTool
  : null
 const WorkflowTool = feature('WORKFLOW_SCRIPTS')
-  ? (() => {
-      require('@claude-code-best/builtin-tools/tools/WorkflowTool/bundled/index.js').initBundledWorkflows()
-      return require('@claude-code-best/builtin-tools/tools/WorkflowTool/WorkflowTool.js')
-        .WorkflowTool
-    })()
+  ? require('./workflow/wiring.js').createWorkflowToolCore()
  : null
 /* eslint-enable custom-rules/no-process-env-top-level, @typescript-eslint/no-require-imports */
 import type { ToolPermissionContext } from './Tool.js'
--- a/src/utils/effort.ts
+++ b/src/utils/effort.ts
@@ -16,6 +16,10 @@ import {

 export type { EffortLevel }

+// NOTE: 'ultracode' is NOT an effort level. It is a session-scoped multi-agent
+// orchestration opt-in injected by the harness (claude.ai/client) as a
+// system-reminder, orthogonal to the effort parameter. EffortLevel / EffortValue
+// must never include 'ultracode'; /effort only accepts the levels below.
 export const EFFORT_LEVELS = [
  'low',
  'medium',
--- a/src/utils/permissions/classifierDecision.ts
+++ b/src/utils/permissions/classifierDecision.ts
@@ -42,7 +42,7 @@ const VERIFY_PLAN_EXECUTION_TOOL_NAME =
    : null
 const WORKFLOW_TOOL_NAME = feature('WORKFLOW_SCRIPTS')
  ? (
-      require('@claude-code-best/builtin-tools/tools/WorkflowTool/constants.js') as typeof import('@claude-code-best/builtin-tools/tools/WorkflowTool/constants.js')
+      require('@claude-code-best/workflow-engine') as typeof import('@claude-code-best/workflow-engine')
    ).WORKFLOW_TOOL_NAME
  : null
 /* eslint-enable @typescript-eslint/no-require-imports */
--- a/src/utils/worktree.ts
+++ b/src/utils/worktree.ts
@@ -1021,11 +1021,13 @@ export async function removeAgentWorktree(

 /**
 * Slug patterns for throwaway worktrees created by AgentTool (`agent-a<7hex>`,
- * from earlyAgentId.slice(0,8)), WorkflowTool (`wf_<runId>-<idx>` where runId
- * is randomUUID().slice(0,12) = 8 hex + `-` + 3 hex), and bridgeMain
- * (`bridge-<safeFilenameId>`). These leak when the parent process is killed
- * (Ctrl+C, ESC, crash) before their in-process cleanup runs. Exact-shape
- * patterns avoid sweeping user-named EnterWorktree slugs like `wf-myfeature`.
+ * from earlyAgentId.slice(0,8)), workflow engine isolation:'worktree'
+ * (`wf_<8hex>-<3hex>-<n>` derived from sha256(runId:agentId) in
+ * claudeCodeBackend — runId is `w`+base36, not a UUID, so the slug cannot
+ * embed runId directly and is hashed to satisfy this hex pattern), and
+ * bridgeMain (`bridge-<safeFilenameId>`). These leak when the parent process
+ * is killed (Ctrl+C, ESC, crash) before their in-process cleanup runs.
+ * Exact-shape patterns avoid sweeping user-named EnterWorktree slugs like `wf-myfeature`.
 */
 const EPHEMERAL_WORKTREE_PATTERNS = [
  /^agent-a[0-9a-f]{7}$/,
--- a/src/workflow/WorkflowPermissionRequest.tsx
+++ b/src/workflow/WorkflowPermissionRequest.tsx
@@ -0,0 +1,145 @@
+import React, { useCallback, useMemo } from 'react';
+import { Box, Text, useTheme } from '@anthropic/ink';
+import { getTheme, type Theme } from 'src/utils/theme.js';
+import { env } from 'src/utils/env.js';
+import { shouldShowAlwaysAllowOptions } from 'src/utils/permissions/permissionsLoader.js';
+import { logUnaryEvent } from 'src/utils/unaryLogging.js';
+import { PermissionDialog } from 'src/components/permissions/PermissionDialog.js';
+import { PermissionPrompt, type PermissionPromptOption } from 'src/components/permissions/PermissionPrompt.js';
+import type { PermissionRequestProps } from 'src/components/permissions/PermissionRequest.js';
+import { PermissionRuleExplanation } from 'src/components/permissions/PermissionRuleExplanation.js';
+
+type OptionValue = 'yes' | 'yes-dont-ask-again' | 'no';
+
+/**
+ * Permission request UI for the WorkflowTool. Asks the user to confirm
+ * executing a workflow script.
+ * Follows the MonitorPermissionRequest / FallbackPermissionRequest pattern.
+ */
+export function WorkflowPermissionRequest({
+  toolUseConfirm,
+  onDone,
+  onReject,
+  workerBadge,
+}: PermissionRequestProps): React.ReactNode {
+  const [themeName] = useTheme();
+  const theme = getTheme(themeName);
+
+  const input = toolUseConfirm.input as {
+    workflow: string;
+    args?: string;
+  };
+
+  const showAlwaysAllowOptions = useMemo(() => shouldShowAlwaysAllowOptions(), []);
+
+  const options: PermissionPromptOption<OptionValue>[] = useMemo(() => {
+    const opts: PermissionPromptOption<OptionValue>[] = [
+      {
+        label: 'Yes',
+        value: 'yes',
+        feedbackConfig: { type: 'accept' as const },
+      },
+    ];
+    if (showAlwaysAllowOptions) {
+      opts.push({
+        label: (
+          <Text>
+            Yes, and don{'\u2019'}t ask again for <Text bold>{toolUseConfirm.tool.name}</Text> commands
+          </Text>
+        ),
+        value: 'yes-dont-ask-again',
+      });
+    }
+    opts.push({
+      label: 'No',
+      value: 'no',
+      feedbackConfig: { type: 'reject' as const },
+    });
+    return opts;
+  }, [showAlwaysAllowOptions, toolUseConfirm.tool.name]);
+
+  const handleSelect = useCallback(
+    (value: OptionValue, feedback?: string) => {
+      switch (value) {
+        case 'yes':
+          logUnaryEvent({
+            completion_type: 'tool_use_single',
+            event: 'accept',
+            metadata: {
+              language_name: 'none',
+              message_id: toolUseConfirm.assistantMessage.message.id ?? '',
+              platform: env.platform,
+            },
+          });
+          toolUseConfirm.onAllow(toolUseConfirm.input, [], feedback);
+          onDone();
+          break;
+        case 'yes-dont-ask-again':
+          logUnaryEvent({
+            completion_type: 'tool_use_single',
+            event: 'accept',
+            metadata: {
+              language_name: 'none',
+              message_id: toolUseConfirm.assistantMessage.message.id ?? '',
+              platform: env.platform,
+            },
+          });
+          toolUseConfirm.onAllow(toolUseConfirm.input, [
+            {
+              type: 'addRules',
+              rules: [{ toolName: toolUseConfirm.tool.name }],
+              behavior: 'allow',
+              destination: 'localSettings',
+            },
+          ]);
+          onDone();
+          break;
+        case 'no':
+          logUnaryEvent({
+            completion_type: 'tool_use_single',
+            event: 'reject',
+            metadata: {
+              language_name: 'none',
+              message_id: toolUseConfirm.assistantMessage.message.id ?? '',
+              platform: env.platform,
+            },
+          });
+          toolUseConfirm.onReject(feedback);
+          onReject();
+          onDone();
+          break;
+      }
+    },
+    [toolUseConfirm, onDone, onReject],
+  );
+
+  const handleCancel = useCallback(() => {
+    logUnaryEvent({
+      completion_type: 'tool_use_single',
+      event: 'reject',
+      metadata: {
+        language_name: 'none',
+        message_id: toolUseConfirm.assistantMessage.message.id ?? '',
+        platform: env.platform,
+      },
+    });
+    toolUseConfirm.onReject();
+    onReject();
+    onDone();
+  }, [toolUseConfirm, onDone, onReject]);
+
+  return (
+    <PermissionDialog title="Workflow" workerBadge={workerBadge}>
+      <Box flexDirection="column" gap={1}>
+        <Box flexDirection="column">
+          <Text bold color={theme.permission as keyof Theme}>
+            Execute workflow: {input.workflow}
+          </Text>
+          {input.args && <Text dimColor>Arguments: {input.args}</Text>}
+        </Box>
+        <PermissionRuleExplanation permissionResult={toolUseConfirm.permissionResult} toolType="command" />
+        <PermissionPrompt<OptionValue> options={options} onSelect={handleSelect} onCancel={handleCancel} />
+      </Box>
+    </PermissionDialog>
+  );
+}
--- a/src/workflow/tests/WorkflowsPanel.test.tsx
+++ b/src/workflow/tests/WorkflowsPanel.test.tsx
@@ -0,0 +1,197 @@
+import { expect, test } from 'bun:test';
+import { PassThrough } from 'node:stream';
+import React from 'react';
+import { wrappedRender as render } from '@anthropic/ink';
+import { SentryErrorBoundary } from '../../components/SentryErrorBoundary.js';
+import type { RunProgress } from '../progress/store.js';
+import { call as panelCall } from '../panel/panelCall.js';
+import { clampSelected, isRunTerminatedTransition, WorkflowsPanel } from '../panel/WorkflowsPanel.js';
+import { truncateLabel } from '../panel/AgentList.js';
+import { STATUS_DOT } from '../panel/status.js';
+import { __resetWorkflowServiceForTests, getWorkflowService } from '../service.js';
+
+// Pure function: clamp selection to valid range (same source as clampSelected inside the panel).
+test('clampSelected: empty list → 0; out of bounds → last; negative/NaN → 0; normal → original', () => {
+  expect(clampSelected(5, 0)).toBe(0);
+  expect(clampSelected(5, 3)).toBe(2);
+  expect(clampSelected(-3, 3)).toBe(0);
+  expect(clampSelected(1, 3)).toBe(1);
+  expect(clampSelected(0, 1)).toBe(0);
+  // NaN (e.g. uninitialized state) safely falls back to 0
+  expect(clampSelected(Number.NaN, 3)).toBe(0);
+});
+
+// truncateLabel: a short label is returned as-is; with a `#number` suffix, keep the suffix and truncate the prefix with an ellipsis;
+// without a suffix, cut from the right. This keeps the audit workflow's verify:${dim}#${idx} labels distinguishable when one dimension has multiple findings.
+test('truncateLabel: short label as-is; with #number suffix keep suffix and truncate prefix; without suffix cut from right', () => {
+  // short label as-is
+  expect(truncateLabel('agent-1', 18)).toBe('agent-1');
+  expect(truncateLabel('review:bugs', 18)).toBe('review:bugs');
+  // exactly max length (boundary)
+  expect(truncateLabel('review:correctness', 18)).toBe('review:correctness');
+  // over max + with #number suffix: keep suffix, truncate prefix + ellipsis
+  expect(truncateLabel('verify:correctness#0', 18)).toBe('verify:correctn…#0');
+  expect(truncateLabel('verify:architecture#15', 18)).toBe('verify:archite…#15');
+  // multi-digit #idx also distinguishable
+  expect(truncateLabel('verify:correctness#2', 18)).toBe('verify:correctn…#2');
+  // without #number suffix: cut from right (legacy behavior)
+  expect(truncateLabel('a-very-long-label-no-suffix', 18)).toBe('a-very-long-label-');
+});
+
+// STATUS_DOT covers four states, all visible dot characters.
+test('STATUS_DOT covers running/completed/failed/killed with a non-empty character for each', () => {
+  const statuses = ['running', 'completed', 'failed', 'killed'] as const;
+  for (const s of statuses) {
+    expect(STATUS_DOT[s]).toBeTruthy();
+    expect(STATUS_DOT[s].length).toBeGreaterThan(0);
+  }
+});
+
+// Progress data shape contract: fields read by the panel exist/are readable on a typical RunProgress,
+// preventing silent panel render breakage from store.ts structural drift.
+test('RunProgress field contract: keys read by panel all exist', () => {
+  const run: RunProgress = {
+    runId: 'r1',
+    workflowName: 'review',
+    status: 'running',
+    phases: [{ title: 'Find', status: 'done' }],
+    declaredPhases: ['Find', 'Review'],
+    currentPhase: 'Review',
+    agents: [{ id: 1, label: 'review:api', phase: 'Review', status: 'running' }],
+    agentCount: 1,
+    startedAt: 1,
+    updatedAt: 1,
+  };
+  // paths read by panel WorkflowList/Detail
+  expect(run.status).toBe('running');
+  expect(STATUS_DOT[run.status]).toBe('●');
+  expect(run.currentPhase).toBe('Review');
+  expect(run.agents.length).toBe(run.agentCount);
+  expect(run.phases[0]?.title).toBe('Find');
+  expect(run.phases[0]?.status).toBe('done');
+  expect(run.agents[0]?.label).toBe('review:api');
+});
+
+// Completed/failed shape: returnValue / error only shown when not running.
+test('RunProgress completed/failed shape: returnValue/error optional', () => {
+  const completed: RunProgress = {
+    runId: 'r2',
+    workflowName: 'w',
+    status: 'completed',
+    phases: [],
+    declaredPhases: [],
+    currentPhase: null,
+    agents: [],
+    agentCount: 0,
+    returnValue: 'ok',
+    startedAt: 2,
+    updatedAt: 2,
+  };
+  const failed: RunProgress = {
+    runId: 'r3',
+    workflowName: 'w',
+    status: 'failed',
+    phases: [],
+    declaredPhases: [],
+    currentPhase: null,
+    agents: [],
+    agentCount: 0,
+    error: 'boom',
+    startedAt: 3,
+    updatedAt: 3,
+  };
+  expect(completed.returnValue).toBe('ok');
+  expect(completed.error).toBeUndefined();
+  expect(failed.error).toBe('boom');
+  expect(failed.returnValue).toBeUndefined();
+  expect(STATUS_DOT['completed']).toBe('✓');
+  expect(STATUS_DOT['failed']).toBe('✗');
+});
+
+// Fix M: a throw from useSyncExternalStore, listNamed, or a child component must not propagate into the REPL.
+// panelCall must wrap WorkflowsPanel in SentryErrorBoundary.
+test('panelCall wraps WorkflowsPanel in SentryErrorBoundary (fix M regression)', async () => {
+  const element = (await (panelCall as unknown as (a: unknown, b: unknown, c: unknown) => Promise<React.ReactNode>)(
+    () => {},
+    { canUseTool: undefined },
+    '',
+  )) as React.ReactElement<{ name?: string; children: React.ReactNode }>;
+  expect(element.type).toBe(SentryErrorBoundary);
+  expect(element.props.name).toBe('WorkflowsPanel');
+  const child = element.props.children as React.ReactElement<{
+    onDone: () => void;
+  }>;
+  expect(child.type).toBe(WorkflowsPanel);
+  expect(React.isValidElement(child)).toBe(true);
+  expect(typeof child.props.onDone).toBe('function');
+});
+
+// ---- Task 6: panel mount triggers loadPersistedRuns once ----
+// Verify that WorkflowsPanel mount calls svc.loadPersistedRuns() exactly once.
+// The persistedLoaded flag inside service guards idempotency; re-render / re-mount does not repeat the call.
+// Use a spy to replace the singleton's loadPersistedRuns, render to a PassThrough stream, wait for useEffect to trigger.
+
+test('WorkflowsPanel mount triggers loadPersistedRuns once', async () => {
+  __resetWorkflowServiceForTests();
+  const svc = getWorkflowService();
+  let calls = 0;
+  const orig = svc.loadPersistedRuns.bind(svc);
+  svc.loadPersistedRuns = async () => {
+    calls++;
+  };
+
+  const stdout = new PassThrough();
+  // consume data to avoid buffer overflow (render writes multiple frames)
+  stdout.on('data', () => {});
+  let instance: { unmount: () => void; waitUntilExit: () => Promise<void> } | undefined;
+  try {
+    instance = await render(
+      React.createElement(WorkflowsPanel, {
+        onDone: () => {},
+        context: { canUseTool: undefined } as never,
+      }),
+      { stdout: stdout as unknown as NodeJS.WriteStream, patchConsole: false },
+    );
+    // after mount useEffect triggers asynchronously; wait a tick for React commit + effect to complete
+    await new Promise(r => setTimeout(r, 30));
+
+    expect(calls).toBe(1);
+  } finally {
+    instance?.unmount();
+    svc.loadPersistedRuns = orig;
+    __resetWorkflowServiceForTests();
+  }
+});
+
+// When the focused run transitions from running to terminal, the panel automatically calls onDone() (an 800ms delay lets the user see the terminal state).
+// Only a state transition on the same runId triggers this: switching to a completed tab does not exit, and neither does opening the history panel.
+// The transition detection logic is extracted into the pure function isRunTerminatedTransition for offline unit testing (Ink test mode does not
+// auto-pump concurrent state updates, so integration tests are unreliable).
+test('isRunTerminatedTransition: same runId running → terminal triggers; other cases do not trigger', () => {
+  const running = { runId: 'r1', status: 'running' as const };
+  const completed = { runId: 'r1', status: 'completed' as const };
+  const failed = { runId: 'r1', status: 'failed' as const };
+  const killed = { runId: 'r1', status: 'killed' as const };
+
+  // same run running → terminal: all three terminal states trigger
+  expect(isRunTerminatedTransition(running, completed)).toBe(true);
+  expect(isRunTerminatedTransition(running, failed)).toBe(true);
+  expect(isRunTerminatedTransition(running, killed)).toBe(true);
+
+  // prev=null (open history panel): does not trigger
+  expect(isRunTerminatedTransition(null, completed)).toBe(false);
+  // curr=null (runs cleared): does not trigger
+  expect(isRunTerminatedTransition(running, null)).toBe(false);
+
+  // different runId (switch tab): does not trigger
+  expect(isRunTerminatedTransition({ runId: 'r1', status: 'running' }, { runId: 'r2', status: 'completed' })).toBe(
+    false,
+  );
+
+  // same run but prev not running (already terminal and re-rendered): does not trigger
+  expect(isRunTerminatedTransition(completed, completed)).toBe(false);
+  expect(isRunTerminatedTransition(killed, completed)).toBe(false);
+
+  // same run running → running (no change): does not trigger
+  expect(isRunTerminatedTransition(running, running)).toBe(false);
+});
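Review note: the implementations of `isRunTerminatedTransition` and `truncateLabel` are not part of this section of the diff. The sketch below shows one way to satisfy the assertions in the tests above. It is an assumption reconstructed from the test expectations only; the names carry a `Sketch` suffix to make clear they are hypothetical and not the panel's actual code.

```typescript
type RunStatus = 'running' | 'completed' | 'failed' | 'killed'
type RunRef = { runId: string; status: RunStatus }

// True only when the same run went from running to a terminal status.
function isRunTerminatedTransitionSketch(
  prev: RunRef | null,
  curr: RunRef | null,
): boolean {
  if (prev === null || curr === null) return false // panel opened or runs cleared
  if (prev.runId !== curr.runId) return false // tab switch, not a transition
  return prev.status === 'running' && curr.status !== 'running'
}

// Keep a trailing `#<digits>` suffix visible; otherwise cut from the right.
function truncateLabelSketch(label: string, max: number): string {
  if (label.length <= max) return label
  const match = /#\d+$/.exec(label)
  if (match === null) return label.slice(0, max)
  const suffix = match[0]
  const keep = max - suffix.length - 1 // reserve one column for the ellipsis
  if (keep <= 0) return label.slice(0, max)
  return label.slice(0, keep) + '…' + suffix
}
```

Keeping both as pure functions matches the rationale given in the test comments: the logic can be checked without rendering the Ink panel.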
--- a/src/workflow/tests/claudeCodeBackend.test.ts
+++ b/src/workflow/tests/claudeCodeBackend.test.ts
@@ -0,0 +1,398 @@
+import { expect, test, mock } from 'bun:test'
+
+// Note: mock specifier must resolve to the same module that impl actually imports (bun mock.module
+// matches by resolved module). impl uses '@claude-code-best/builtin-tools/...' and 'src/*' alias
+// path imports, so the same specifier is used here.
+mock.module(
+  '@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
+  () => ({
+    runAgent: async function* () {
+      yield {
+        type: 'assistant',
+        message: { content: [{ type: 'text', text: 'agent-text' }] },
+      }
+    },
+  }),
+)
+mock.module(
+  '@claude-code-best/builtin-tools/tools/AgentTool/agentToolUtils.js',
+  () => ({
+    finalizeAgentTool: () => ({
+      content: [{ type: 'text', text: 'agent-text' }],
+      usage: { output_tokens: 42 },
+      totalTokens: 42,
+      totalToolUseCount: 3,
+    }),
+  }),
+)
+mock.module(
+  '@claude-code-best/builtin-tools/tools/AgentTool/loadAgentsDir.js',
+  () => ({
+    isBuiltInAgent: () => true,
+  }),
+)
+mock.module('src/tools.js', () => ({ assembleToolPool: () => ({ tools: [] }) }))
+mock.module('src/utils/messages.js', () => ({
+  // Return a shape that satisfies UserMessage consumers process-wide.
+  // Bun's mock.module is process-global (last-write-wins), so an incomplete
+  // mock here corrupts every later test that imports the real createUserMessage
+  // (e.g. bridgeMessaging.test.ts's `type !== 'user'` early-exit, or
+  // processSlashCommand.test.ts's `message.content` access). Mirror the real
+  // shape from src/utils/messages.ts: type + message envelope + passthrough.
+  createUserMessage: (
+    o: {
+      content: string
+    } & Record<string, unknown>,
+  ) => ({
+    type: 'user' as const,
+    message: { role: 'user', content: o.content },
+    ...o,
+  }),
+  extractTextContent: () => 'agent-text',
+}))
+mock.module('src/utils/uuid.js', () => ({ createAgentId: () => 'agent-1' }))
+mock.module('src/services/analytics/index.js', () => ({ logEvent: () => {} }))
+mock.module('src/utils/debug.js', () => ({ logForDebugging: () => {} }))
+
+// isolation:'worktree' tests: mock worktree trio (to avoid actually running git worktree add).
+// Note mock.module is process-global; worktreeState is defined outside the factory for test reset.
+// Do not mock cwd.js: runWithCwdOverride actually running AsyncLocalStorage is harmless to mocked runAgent,
+// and avoids polluting other tests in the same process that depend on pwd/getCwd.
+const worktreeState = {
+  shouldThrow: false,
+  hasChanges: false,
+  created: [] as string[],
+  removed: [] as string[],
+  changesCalls: 0,
+}
+mock.module('src/utils/worktree.js', () => ({
+  createAgentWorktree: async (slug: string) => {
+    if (worktreeState.shouldThrow) throw new Error('wt boom')
+    worktreeState.created.push(slug)
+    return {
+      worktreePath: '/fake/wt',
+      worktreeBranch: 'wt-branch',
+      headCommit: 'abc123',
+      gitRoot: '/fake',
+      hookBased: false,
+    }
+  },
+  hasWorktreeChanges: async () => {
+    worktreeState.changesCalls++
+    return worktreeState.hasChanges
+  },
+  removeAgentWorktree: async (path: string) => {
+    worktreeState.removed.push(path)
+    return true
+  },
+}))
+
+import { WorkflowAbortedError } from '@claude-code-best/workflow-engine'
+import {
+  claudeCodeBackend,
+  resolveAgentDefinition,
+  mapWorkflowModel,
+  extractStructuredOutput,
+  WORKFLOW_AGENT,
+} from '../backends/claudeCodeBackend.js'
+import { makeHostHandle } from '../hostHandle.js'
+
+function ctx() {
+  return {
+    host: makeHostHandle({
+      toolUseContext: {
+        options: {
+          agentDefinitions: { activeAgents: [] },
+          querySource: 'workflow',
+          mainLoopModel: 'm',
+        },
+        getAppState: () => ({
+          toolPermissionContext: {
+            mode: 'acceptEdits',
+            alwaysAllowRules: {},
+          },
+          mcp: { tools: [] },
+        }),
+      } as never,
+      canUseTool: (() => Promise.resolve({ behavior: 'allow' })) as never,
+      // run() does not read parentMessage; use an empty object placeholder to satisfy the WorkflowHostBundle type.
+      parentMessage: {} as never,
+    }),
+    signal: new AbortController().signal,
+    runId: 'r1',
+    agentId: 1,
+  }
+}
+
+test('text agent → ok + token/tool/model accounting', async () => {
+  const res = await claudeCodeBackend.run({ prompt: 'do it' }, ctx())
+  expect(res.kind).toBe('ok')
+  if (res.kind === 'ok') {
+    expect(res.output).toBe('agent-text')
+    expect(res.usage.outputTokens).toBe(42)
+    // panel display fields: tokenCount(=totalTokens) / toolCount / model (fallback mainLoopModel 'm')
+    expect(res.tokenCount).toBe(42)
+    expect(res.toolCount).toBe(3)
+    expect(res.model).toBe('m')
+  }
+})
+
+test('isolation:worktree → create worktree + auto-cleanup on no changes; slug matches cleanup regex', async () => {
+  worktreeState.shouldThrow = false
+  worktreeState.hasChanges = false
+  worktreeState.created = []
+  worktreeState.removed = []
+  worktreeState.changesCalls = 0
+  const res = await claudeCodeBackend.run(
+    { prompt: 'do', isolation: 'worktree' },
+    ctx(),
+  )
+  expect(res.kind).toBe('ok')
+  expect(worktreeState.created).toHaveLength(1)
+  // slug must match cleanupStaleAgentWorktrees cleanup regex ^wf_[0-9a-f]{8}-[0-9a-f]{3}-\d+$
+  expect(worktreeState.created[0]).toMatch(/^wf_[0-9a-f]{8}-[0-9a-f]{3}-\d+$/)
+  expect(worktreeState.changesCalls).toBe(1)
+  expect(worktreeState.removed).toHaveLength(1) // no changes → auto-remove
+})
+
+test('isolation:worktree has changes → keep worktree (no remove)', async () => {
+  worktreeState.hasChanges = true
+  worktreeState.created = []
+  worktreeState.removed = []
+  worktreeState.changesCalls = 0
+  const res = await claudeCodeBackend.run(
+    { prompt: 'do', isolation: 'worktree' },
+    ctx(),
+  )
+  expect(res.kind).toBe('ok')
+  expect(worktreeState.removed).toHaveLength(0) // has changes → keep
+  expect(worktreeState.changesCalls).toBe(1)
+})
+
+test('isolation:worktree creation fails → fail-closed returns dead (does not silently degrade to shared cwd)', async () => {
+  worktreeState.shouldThrow = true
+  const res = await claudeCodeBackend.run(
+    { prompt: 'do', isolation: 'worktree' },
+    ctx(),
+  )
+  expect(res.kind).toBe('dead')
+  worktreeState.shouldThrow = false
+})
+
+test('no isolation → no worktree created', async () => {
+  worktreeState.created = []
+  const res = await claudeCodeBackend.run({ prompt: 'do' }, ctx())
+  expect(res.kind).toBe('ok')
+  expect(worktreeState.created).toHaveLength(0)
+})
+
+test('runAgent throws → dead', async () => {
+  // override mock so runAgent throws (last-write-wins)
+  mock.module(
+    '@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
+    () => ({
+      // biome-ignore lint/correctness/useYield: intentionally throws to test dead branch (no yield)
+      runAgent: async function* () {
+        throw new Error('boom')
+      },
+    }),
+  )
+  const res = await claudeCodeBackend.run({ prompt: 'fail' }, ctx())
+  expect(res.kind).toBe('dead')
+})
+
+// The next three tests cover the fix for 'x' being ineffective: the backend must bridge ctx.signal to runAgent's override.abortController
+// and recognize AbortError as an abort (throw WorkflowAbortedError instead of swallowing it as dead).
+// They also verify registerAgentAbort injection so that service.kill(runId, agentId) can abort a single agent precisely.
+
+test('ctx.signal pre-abort → backend bridge: override.abortController.signal.aborted=true', async () => {
+  // use capturedController to expose the agentAbort created by the backend (the override.abortController received by the mock)
+  let capturedController: AbortController | undefined
+  mock.module(
+    '@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
+    () => ({
+      runAgent: async function* (opts: {
+        override?: { abortController?: AbortController }
+      }) {
+        capturedController = opts.override?.abortController
+        yield {
+          type: 'assistant',
+          message: { content: [{ type: 'text', text: 'x' }] },
+        }
+      },
+    }),
+  )
+  const parentAbort = new AbortController()
+  parentAbort.abort()
+  // The mock does not throw, so the backend takes the normal return path; the bridge `if (ctx.signal.aborted) agentAbort.abort()`
+  // has already fired synchronously, so capturedController.signal.aborted must be true (root cause of kill bridge)
+  await claudeCodeBackend.run(
+    { prompt: 'pre-aborted' },
+    { ...ctx(), signal: parentAbort.signal },
+  )
+  expect(capturedController?.signal.aborted).toBe(true)
+})
+
+test('runAgent throws AbortError → backend throws WorkflowAbortedError (not swallowed as dead)', async () => {
+  mock.module(
+    '@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
+    () => ({
+      // biome-ignore lint/correctness/useYield: intentionally throws AbortError to test recognition branch
+      runAgent: async function* () {
+        const e = new Error('aborted by parent')
+        e.name = 'AbortError'
+        throw e
+      },
+    }),
+  )
+  await expect(
+    claudeCodeBackend.run({ prompt: 'abort' }, ctx()),
+  ).rejects.toBeInstanceOf(WorkflowAbortedError)
+})
+
+test('registerAgentAbort/unregisterAgentAbort injection: key=ctx.agentId (number), controller from bridge', async () => {
+  // restore default mock (previous test changed it to throw AbortError)
+  mock.module(
+    '@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
+    () => ({
+      runAgent: async function* () {
+        yield {
+          type: 'assistant',
+          message: { content: [{ type: 'text', text: 'agent-text' }] },
+        }
+      },
+    }),
+  )
+  const registered: Array<{ id: number; controller: AbortController }> = []
+  const unregistered: number[] = []
+  await claudeCodeBackend.run(
+    { prompt: 'wiring' },
+    {
+      ...ctx(),
+      agentId: 42,
+      registerAgentAbort: (id, ac) => registered.push({ id, controller: ac }),
+      unregisterAgentAbort: id => unregistered.push(id),
+    },
+  )
+  expect(registered).toHaveLength(1)
+  expect(registered[0]?.id).toBe(42) // engine numeric agentId (not coreAgentId string)
+  expect(registered[0]?.controller).toBeInstanceOf(AbortController)
+  expect(unregistered).toEqual([42]) // finally cleanup idempotent
+})
+
+test('id and capabilities shape', () => {
+  expect(claudeCodeBackend.id).toBe('claude-code')
+  expect(claudeCodeBackend.capabilities.structuredOutput).toBe(true)
+  expect(claudeCodeBackend.capabilities.tools).toBe(true)
+})
+
+test('resolveAgentDefinition: no agentType → WORKFLOW_AGENT fallback', () => {
+  const tuc = {
+    options: { agentDefinitions: { activeAgents: [] } },
+  } as never
+  expect(resolveAgentDefinition(undefined, tuc)).toBe(WORKFLOW_AGENT)
+})
+
+test('resolveAgentDefinition: hits activeAgents', () => {
+  const fake = { agentType: 'Explore', permissionMode: 'plan' } as never
+  const tuc = {
+    options: { agentDefinitions: { activeAgents: [fake] } },
+  } as never
+  expect(resolveAgentDefinition('Explore', tuc)).toBe(fake)
+  // miss still falls back
+  expect(resolveAgentDefinition('Nope', tuc)).toBe(WORKFLOW_AGENT)
+})
+
+test('mapWorkflowModel passthrough', () => {
+  expect(mapWorkflowModel(undefined)).toBeUndefined()
+  expect(mapWorkflowModel('claude-haiku-*')).toBe('claude-haiku-*')
+})
+
+test('extractStructuredOutput: valid JSON extracted; invalid returns null', () => {
+  expect(
+    extractStructuredOutput([
+      { type: 'text', text: 'prefix {"a":1,"b":2} suffix' },
+    ]),
+  ).toEqual({ a: 1, b: 2 })
+  expect(
+    extractStructuredOutput([{ type: 'text', text: 'no json here' }]),
+  ).toBeNull()
+  expect(extractStructuredOutput([])).toBeNull()
+})
+
+test('extractStructuredOutput: fenced code block (strip fence + strip language tag)', () => {
+  expect(
+    extractStructuredOutput([
+      {
+        type: 'text',
+        text: 'Here are the findings:\n```json\n{"findings":[{"title":"x"}]}\n```\nDone.',
+      },
+    ]),
+  ).toEqual({ findings: [{ title: 'x' }] })
+  // no language tag
+  expect(
+    extractStructuredOutput([{ type: 'text', text: '```\n{"a":1}\n```' }]),
+  ).toEqual({ a: 1 })
+})
+
+test('extractStructuredOutput: nested object (bracket-balanced scan; legacy indexOf/lastIndexOf would cross-block concat)', () => {
+  const text = 'Result: {"outer":{"inner":{"deep":true}},"n":3} trailing'
+  expect(extractStructuredOutput([{ type: 'text', text }])).toEqual({
+    outer: { inner: { deep: true } },
+    n: 3,
+  })
+})
+
+test('extractStructuredOutput: brackets inside strings are not counted as pairing', () => {
+  // } inside a string does not zero out depth, scan can skip to the real pairing }
+  const text = '{"note":"this } char is in a string","ok":true}'
+  expect(extractStructuredOutput([{ type: 'text', text }])).toEqual({
+    note: 'this } char is in a string',
+    ok: true,
+  })
+})
+
+test('extractStructuredOutput: escaped quotes do not break string boundary', () => {
+  const text = '{"escaped":"he said \\"hi\\"","n":1}'
+  expect(extractStructuredOutput([{ type: 'text', text }])).toEqual({
+    escaped: 'he said "hi"',
+    n: 1,
+  })
+})
+
+test('extractStructuredOutput: multiple JSON blocks → return first parse success', () => {
+  // first one unbalanced (no pairing }), skip to the second
+  const text = 'broken { stuff\n{"real":1}\n{"ignored":2}'
+  expect(extractStructuredOutput([{ type: 'text', text }])).toEqual({ real: 1 })
+})
+
+test('extractStructuredOutput: array / number / string / null do not count as object', () => {
+  expect(
+    extractStructuredOutput([{ type: 'text', text: '[1,2,3]' }]),
+  ).toBeNull()
+  expect(extractStructuredOutput([{ type: 'text', text: '42' }])).toBeNull()
+  expect(
+    extractStructuredOutput([{ type: 'text', text: '"raw string"' }]),
+  ).toBeNull()
+  expect(extractStructuredOutput([{ type: 'text', text: 'null' }])).toBeNull()
+})
+
+test('extractStructuredOutput: multiple text blocks → cross-block find first success', () => {
+  expect(
+    extractStructuredOutput([
+      { type: 'text', text: 'no json' },
+      { type: 'text', text: '```json\n{"k":"v"}\n```' },
+    ]),
+  ).toEqual({ k: 'v' })
+})
+
+test('extractStructuredOutput: broken JSON returns null (does not throw)', () => {
+  expect(
+    extractStructuredOutput([
+      { type: 'text', text: '{broken: missing quotes}' },
+    ]),
+  ).toBeNull()
+  expect(
+    extractStructuredOutput([{ type: 'text', text: '{"a":1,}' }]), // trailing comma — no syntax repair
+  ).toBeNull()
+})
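The `extractStructuredOutput` cases above describe a bracket-balanced scan that ignores braces inside strings, honors escaped quotes, skips unbalanced or unparseable candidates, and accepts only plain objects. The sketch below illustrates one way to get that behavior. `extractFirstJsonObject`, `findBalancedEnd`, and `TextBlock` are hypothetical names; the actual implementation in `claudeCodeBackend.ts` is not shown in this part of the diff and may be structured differently (for example, it may strip code fences explicitly, whereas here the scan simply finds the first `{` inside the fence).

```typescript
// Hypothetical sketch inferred from the test cases; not the backend's actual code.
type TextBlock = { type: string; text?: string }

// Returns the index of the '}' that balances the '{' at `start`, or -1 if none.
function findBalancedEnd(text: string, start: number): number {
  let depth = 0
  let inString = false
  let escaped = false
  for (let i = start; i < text.length; i++) {
    const ch = text[i]
    if (inString) {
      // Inside a string: braces are ignored, and a backslash escapes the next char.
      if (escaped) escaped = false
      else if (ch === '\\') escaped = true
      else if (ch === '"') inString = false
      continue
    }
    if (ch === '"') inString = true
    else if (ch === '{') depth++
    else if (ch === '}') {
      depth--
      if (depth === 0) return i
    }
  }
  return -1
}

function extractFirstJsonObject(
  blocks: TextBlock[],
): Record<string, unknown> | null {
  for (const block of blocks) {
    if (block.type !== 'text' || typeof block.text !== 'string') continue
    const text = block.text
    let start = text.indexOf('{')
    while (start !== -1) {
      const end = findBalancedEnd(text, start)
      if (end !== -1) {
        try {
          const parsed: unknown = JSON.parse(text.slice(start, end + 1))
          if (
            parsed !== null &&
            typeof parsed === 'object' &&
            !Array.isArray(parsed)
          ) {
            return parsed as Record<string, unknown>
          }
        } catch {
          // Candidate is balanced but not valid JSON: try the next '{'.
        }
      }
      start = text.indexOf('{', start + 1)
    }
  }
  return null
}
```

Compared with slicing from the first `{` to the last `}`, the balanced scan avoids gluing two separate objects into one invalid candidate, which is the failure mode the nested-object test name calls out.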
--- a/src/workflow/tests/notifications.test.ts
+++ b/src/workflow/tests/notifications.test.ts
@@ -0,0 +1,176 @@
+import { describe, expect, test } from 'bun:test'
+import type { RunProgress } from '../progress/store.js'
+import type { WorkflowService } from '../service.js'
+
+function makeMockService(runs: RunProgress[]): {
+  service: WorkflowService
+  emit: () => void
+  setRuns: (runs: RunProgress[]) => void
+} {
+  let current = runs
+  const listeners = new Set<() => void>()
+  return {
+    service: {
+      ports: {},
+      launch: async () => ({ runId: 'x' }),
+      kill: () => {},
+      listRuns: () => current,
+      getRun: () => undefined,
+      subscribe: (fn: () => void) => {
+        listeners.add(fn)
+        return () => {
+          listeners.delete(fn)
+        }
+      },
+      listNamed: async () => [],
+    } as unknown as WorkflowService,
+    emit: () => {
+      for (const fn of listeners) fn()
+    },
+    setRuns: r => {
+      current = r
+    },
+  }
+}
+
+function makeRun(
+  runId: string,
+  status: RunProgress['status'],
+  overrides: Partial<RunProgress> = {},
+): RunProgress {
+  return {
+    runId,
+    workflowName: 'wf',
+    status,
+    phases: [],
+    declaredPhases: [],
+    currentPhase: null,
+    agents: [],
+    agentCount: 0,
+    startedAt: Date.now(),
+    updatedAt: Date.now(),
+    ...overrides,
+  }
+}
+
+describe('installWorkflowNotifications', () => {
+  test('running → completed triggers notification (incl. workflow name)', async () => {
+    const { installWorkflowNotifications } = await import('../notifications.js')
+    const { service, emit, setRuns } = makeMockService([
+      makeRun('r1', 'running'),
+    ])
+    const calls: string[] = []
+    const unsubscribe = installWorkflowNotifications(service, msg =>
+      calls.push(msg),
+    )
+
+    // first emit: listener records initial running state, no notification
+    emit()
+    expect(calls.length).toBe(0)
+
+    setRuns([makeRun('r1', 'completed')])
+    emit()
+
+    expect(calls.length).toBe(1)
+    expect(calls[0]).toMatch(/task-notification/)
+    expect(calls[0]).toMatch(/completed successfully/)
+    expect(calls[0]).toMatch(/"wf"/)
+    unsubscribe()
+  })
+
+  test('running → failed triggers notification, includes error text', async () => {
+    const { installWorkflowNotifications } = await import('../notifications.js')
+    const { service, emit, setRuns } = makeMockService([
+      makeRun('r1', 'running'),
+    ])
+    const calls: string[] = []
+    installWorkflowNotifications(service, msg => calls.push(msg))
+
+    emit() // record initial running
+    setRuns([makeRun('r1', 'failed', { error: 'agent X boom' })])
+    emit()
+
+    expect(calls.length).toBe(1)
+    expect(calls[0]).toMatch(/failed/)
+    expect(calls[0]).toMatch(/agent X boom/)
+  })
+
+  test('running → killed triggers notification', async () => {
+    const { installWorkflowNotifications } = await import('../notifications.js')
+    const { service, emit, setRuns } = makeMockService([
+      makeRun('r1', 'running'),
+    ])
+    const calls: string[] = []
+    installWorkflowNotifications(service, msg => calls.push(msg))
+
+    emit() // record initial running
+    setRuns([makeRun('r1', 'killed')])
+    emit()
+
+    expect(calls.length).toBe(1)
+    expect(calls[0]).toMatch(/was stopped/)
+  })
+
+  test('first time seeing run (no prev) does not notify (avoid notifying historical runs on startup)', async () => {
+    const { installWorkflowNotifications } = await import('../notifications.js')
+    const { service, emit, setRuns } = makeMockService([])
+    const calls: string[] = []
+    installWorkflowNotifications(service, msg => calls.push(msg))
+
+    // first emit after startup, sees r1 already completed — should not notify (not a transition from running)
+    setRuns([makeRun('r1', 'completed')])
+    emit()
+
+    expect(calls.length).toBe(0)
+  })
+
+  test('running → running does not notify', async () => {
+    const { installWorkflowNotifications } = await import('../notifications.js')
+    const { service, emit, setRuns } = makeMockService([
+      makeRun('r1', 'running'),
+    ])
+    const calls: string[] = []
+    installWorkflowNotifications(service, msg => calls.push(msg))
+
+    emit() // record initial running
+    setRuns([makeRun('r1', 'running', { agentCount: 1 })])
+    emit()
+
+    expect(calls.length).toBe(0)
+  })
+
+  test('already completed run emitting again does not repeat notification', async () => {
+    const { installWorkflowNotifications } = await import('../notifications.js')
+    const { service, emit, setRuns } = makeMockService([
+      makeRun('r1', 'running'),
+    ])
+    const calls: string[] = []
+    installWorkflowNotifications(service, msg => calls.push(msg))
+
+    emit() // record initial running
+    setRuns([makeRun('r1', 'completed')])
+    emit()
+    expect(calls.length).toBe(1)
+
+    emit()
+    expect(calls.length).toBe(1)
+  })
+
+  test('after unsubscribe no more notifications', async () => {
+    const { installWorkflowNotifications } = await import('../notifications.js')
+    const { service, emit, setRuns } = makeMockService([
+      makeRun('r1', 'running'),
+    ])
+    const calls: string[] = []
+    const unsubscribe = installWorkflowNotifications(service, msg =>
+      calls.push(msg),
+    )
+
+    emit() // record initial running
+    unsubscribe()
+    setRuns([makeRun('r1', 'completed')])
+    emit()
+
+    expect(calls.length).toBe(0)
+  })
+})
--- a/src/workflow/tests/persistence.test.ts
+++ b/src/workflow/tests/persistence.test.ts
@@ -0,0 +1,199 @@
+import { expect, test } from 'bun:test'
+import {
+  mkdir,
+  mkdtemp,
+  readFile,
+  readdir,
+  rm,
+  writeFile as fsWriteFile,
+} from 'node:fs/promises'
+import { tmpdir } from 'node:os'
+import { join } from 'node:path'
+import {
+  getRunsDir,
+  listPersistedRuns,
+  readRunState,
+  writeRunState,
+} from '../persistence.js'
+import type { RunProgress } from '../progress/store.js'
+
+function makeRun(over: Partial<RunProgress> = {}): RunProgress {
+  return {
+    runId: 'r1',
+    workflowName: 'w',
+    status: 'completed',
+    phases: [],
+    declaredPhases: [],
+    currentPhase: null,
+    agents: [],
+    agentCount: 0,
+    startedAt: 1000,
+    updatedAt: 2000,
+    ...over,
+  } as RunProgress
+}
+
+test('writeRunState → readRunState round-trip consistent (returnValue is object)', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-'))
+  try {
+    const run = makeRun({
+      returnValue: { confirmedCount: 2, items: ['a', 'b'] },
+    })
+    await writeRunState(dir, run)
+    const got = await readRunState(dir, 'r1')
+    expect(got).not.toBeNull()
+    expect(got!.runId).toBe('r1')
+    expect(got!.returnValue).toEqual({ confirmedCount: 2, items: ['a', 'b'] })
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('readRunState missing file → null', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-'))
+  try {
+    const got = await readRunState(dir, 'never-exists')
+    expect(got).toBeNull()
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('readRunState corrupt JSON → null', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-'))
+  try {
+    await mkdir(join(dir, 'rX'), { recursive: true })
+    await fsWriteFile(join(dir, 'rX', 'state.json'), '{not valid json', 'utf-8')
+    const got = await readRunState(dir, 'rX')
+    expect(got).toBeNull()
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('readRunState schemaVersion mismatch → null', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-'))
+  try {
+    await mkdir(join(dir, 'rX'), { recursive: true })
+    await fsWriteFile(
+      join(dir, 'rX', 'state.json'),
+      JSON.stringify({ schemaVersion: 999, run: makeRun({ runId: 'rX' }) }),
+      'utf-8',
+    )
+    const got = await readRunState(dir, 'rX')
+    expect(got).toBeNull()
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('writeRunState atomic write: no tmp residue after success', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-'))
+  try {
+    await writeRunState(dir, makeRun({ runId: 'rAtom' }))
+    const sub = await readdir(join(dir, 'rAtom'))
+    expect(sub).toContain('state.json')
+    expect(sub).not.toContain('state.json.tmp')
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('listPersistedRuns scans multiple subdirs, skips dirs without state.json, sorts by updatedAt desc', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-'))
+  try {
+    // three valid runs + one half-broken dir without state.json (stands in for a run dir that only has a journal)
+    await writeRunState(dir, makeRun({ runId: 'old', updatedAt: 1000 }))
+    await writeRunState(dir, makeRun({ runId: 'mid', updatedAt: 2000 }))
+    await writeRunState(dir, makeRun({ runId: 'new', updatedAt: 3000 }))
+    await mkdir(join(dir, 'half-broken'), { recursive: true })
+
+    const runs = await listPersistedRuns(dir)
+    expect(runs.map(r => r.runId)).toEqual(['new', 'mid', 'old'])
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('listPersistedRuns scans a corrupt state.json → skip that single one, continue scanning the rest', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-'))
+  try {
+    await writeRunState(dir, makeRun({ runId: 'good' }))
+    await mkdir(join(dir, 'bad'), { recursive: true })
+    await fsWriteFile(join(dir, 'bad', 'state.json'), 'corrupt', 'utf-8')
+
+    const runs = await listPersistedRuns(dir)
+    expect(runs.map(r => r.runId)).toEqual(['good'])
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('writeRunState does not throw when returnValue is null/string/array', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-'))
+  try {
+    await writeRunState(dir, makeRun({ runId: 'n', returnValue: null }))
+    await writeRunState(dir, makeRun({ runId: 's', returnValue: 'text' }))
+    await writeRunState(dir, makeRun({ runId: 'a', returnValue: [1, 2, 3] }))
+    expect((await readRunState(dir, 'n'))!.returnValue).toBeNull()
+    expect((await readRunState(dir, 's'))!.returnValue).toBe('text')
+    expect((await readRunState(dir, 'a'))!.returnValue).toEqual([1, 2, 3])
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('writeRunState overwrite: same runId second write overwrites old content', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-'))
+  try {
+    await writeRunState(dir, makeRun({ runId: 'rOV', status: 'running' }))
+    await writeRunState(dir, makeRun({ runId: 'rOV', status: 'completed' }))
+    const got = await readRunState(dir, 'rOV')
+    expect(got!.status).toBe('completed')
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('writeRunState writes full AgentProgress (no output content, includes label/phase/token etc.)', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-'))
+  try {
+    const run = makeRun({
+      runId: 'rAg',
+      agents: [
+        {
+          id: 1,
+          label: 'review:hooks',
+          phase: 'Review',
+          status: 'done',
+          outputShape: 'object',
+          tokenCount: 12345,
+          toolCount: 3,
+          model: 'claude-sonnet-4-6',
+        },
+      ],
+      agentCount: 1,
+    })
+    await writeRunState(dir, run)
+    const got = await readRunState(dir, 'rAg')
+    expect(got!.agents).toHaveLength(1)
+    expect(got!.agents[0]).toEqual({
+      id: 1,
+      label: 'review:hooks',
+      phase: 'Review',
+      status: 'done',
+      outputShape: 'object',
+      tokenCount: 12345,
+      toolCount: 3,
+      model: 'claude-sonnet-4-6',
+    })
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('getRunsDir returns <projectRoot>/.claude/workflow-runs shape', () => {
+  const dir = getRunsDir()
+  // do not hard-code projectRoot (differs across machines), only check suffix structure
+  expect(dir.endsWith(join('.claude', 'workflow-runs'))).toBe(true)
+})
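The commit message describes `writeRunState` as an atomic overwrite (tmp + rename), and the "no tmp residue" test checks the visible result of that. The sketch below shows the general pattern using Node's synchronous fs API. `writeJsonAtomicSync` is a hypothetical helper, the `schemaVersion: 1` value is an assumption for illustration, and the real `persistence.ts` uses the promise-based API and its own envelope.

```typescript
import {
  mkdirSync,
  mkdtempSync,
  readFileSync,
  readdirSync,
  renameSync,
  rmSync,
  writeFileSync,
} from 'node:fs'
import { tmpdir } from 'node:os'
import { dirname, join } from 'node:path'

// Hypothetical helper: write to a sibling tmp file, then rename over the target.
// A reader therefore sees either the old file or the complete new one.
function writeJsonAtomicSync(filePath: string, value: unknown): void {
  mkdirSync(dirname(filePath), { recursive: true })
  const tmpPath = `${filePath}.tmp`
  writeFileSync(tmpPath, JSON.stringify(value), 'utf-8')
  renameSync(tmpPath, filePath)
}

// Usage: write one state file into a scratch dir and inspect the result.
const scratchDir = mkdtempSync(join(tmpdir(), 'wf-sketch-'))
const target = join(scratchDir, 'rAtom', 'state.json')
// schemaVersion value is assumed for illustration only.
writeJsonAtomicSync(target, { schemaVersion: 1, run: { runId: 'rAtom' } })
const entries = readdirSync(join(scratchDir, 'rAtom'))
const roundTrip = JSON.parse(readFileSync(target, 'utf-8')) as {
  run: { runId: string }
}
rmSync(scratchDir, { recursive: true, force: true })
```

The rename only replaces the target in one step when the tmp file is on the same filesystem, which is why the tmp file is placed next to the target rather than in the system temp directory.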
--- a/src/workflow/tests/ports.test.ts
+++ b/src/workflow/tests/ports.test.ts
@@ -0,0 +1,198 @@
+import { expect, test } from 'bun:test'
+// Note: this test does not mock bootstrap/state, utils/cwd, analytics, debug.
+// Reason: mock.module is process-global (last-write-wins); mocking these common modules would pollute
+// other tests in the same process (e.g. src/commands/__tests__/autonomy.test.ts imports the real
+// bootstrap/state via its dependency chain). ports can resolve getProjectRoot/getCwd normally in the test env,
+// logEvent/logForDebugging are silent no-ops when sink is not attached, no need to mock.
+
+import { buildRegistry } from '../registry.js'
+import { createWorkflowPorts } from '../ports.js'
+import { createProgressBus } from '../progress/bus.js'
+import { createProgressStoreFromBus } from '../progress/store.js'
+import { getProjectRoot } from '../../bootstrap/state.js'
+import type { SetAppState } from '../../Task.js'
+import type { AppState } from '../../state/AppState.tsx'
+
+test('buildRegistry registers claude-code as default and resolve hits', () => {
+  const reg = buildRegistry()
+  expect(reg.has('claude-code')).toBe(true)
+  expect(reg.resolve({ prompt: 'x' }).id).toBe('claude-code')
+  expect(reg.resolve({ prompt: 'x', agentType: 'whatever' }).id).toBe(
+    'claude-code',
+  )
+})
+
+test('createWorkflowPorts assembles full ports (incl. agentAdapterRegistry and progressEmitter→bus)', () => {
+  const bus = createProgressBus()
+  const store = createProgressStoreFromBus(bus)
+  const ports = createWorkflowPorts({ bus, store })
+
+  expect(ports.agentAdapterRegistry).toBeDefined()
+  expect(ports.agentAdapterRegistry!.resolve({ prompt: 'x' }).id).toBe(
+    'claude-code',
+  )
+  expect(typeof ports.taskRegistrar.register).toBe('function')
+  expect(typeof ports.taskRegistrar.kill).toBe('function')
+  expect(typeof ports.hostFactory).toBe('function')
+  // agentRunner fallback fields still exist (WorkflowPorts required)
+  expect(ports.agentRunner).toBeDefined()
+  expect(typeof ports.agentRunner.runAgentToResult).toBe('function')
+
+  // progressEmitter via bus → store: emit a run_started, store can see it
+  ports.progressEmitter.emit({
+    type: 'run_started',
+    runId: 't',
+    workflowName: 'w',
+    meta: null,
+  })
+  expect(store.get('t')?.workflowName).toBe('w')
+})
+
+test('taskRegistrar.register/complete/kill routes via RunBinding (real setAppState, no mock)', () => {
+  const bus = createProgressBus()
+  const store = createProgressStoreFromBus(bus)
+  const ports = createWorkflowPorts({ bus, store })
+
+  // real setAppState: use a local AppState object to hold tasks, registerTask goes through the real code path.
+  const state = { tasks: {} } as unknown as AppState
+  const setAppState: SetAppState = f => {
+    Object.assign(state, f(state))
+  }
+
+  const hostCtx = ports.hostFactory({
+    context: {
+      agentId: 'a-1',
+      toolUseId: 'tu-1',
+      setAppState,
+    },
+    canUseTool: (() => Promise.resolve({ behavior: 'allow' })) as never,
+    parentMessage: {} as never,
+  })
+
+  const { runId, signal } = ports.taskRegistrar.register(
+    {
+      workflowName: 'wf',
+      summary: 'summary',
+      workflowFile: 'wf.ts',
+      toolUseId: 'tu-1',
+    },
+    hostCtx.handle,
+  )
+  expect(typeof runId).toBe('string')
+  expect(signal).toBeInstanceOf(AbortSignal)
+
+  // complete/kill do not throw (RunBinding hit)
+  expect(() => ports.taskRegistrar.complete(runId, 'done')).not.toThrow()
+  expect(() => ports.taskRegistrar.kill(runId)).not.toThrow()
+  // unknown runId safe no-op
+  expect(() => ports.taskRegistrar.complete('nope')).not.toThrow()
+  expect(ports.taskRegistrar.pendingAction('nope')).toBeNull()
+
+  // once the run is terminal its binding is reclaimed: calling complete/kill again on the same runId should be a safe no-op (no throw, no repeated call to the workflow task fn)
+  ports.taskRegistrar.complete(runId)
+  ports.taskRegistrar.kill(runId)
+})
+
+// agent-level kill bridge: register → killAgent precisely aborts; kill(runId) aborts all agents.
+test('taskRegistrar agentAbortControllers: register/killAgent precise abort; kill(runId) batch abort', () => {
+  const bus = createProgressBus()
+  const store = createProgressStoreFromBus(bus)
+  const ports = createWorkflowPorts({ bus, store })
+  // impl always provides these — cast flattens optional to required (avoids per-line ! assertion)
+  const tr = ports.taskRegistrar as Required<typeof ports.taskRegistrar>
+
+  const state = { tasks: {} } as unknown as AppState
+  const setAppState: SetAppState = f => {
+    Object.assign(state, f(state))
+  }
+  const hostCtx = ports.hostFactory({
+    context: { agentId: 'a-1', toolUseId: 'tu-1', setAppState },
+    canUseTool: (() => Promise.resolve({ behavior: 'allow' })) as never,
+    parentMessage: {} as never,
+  })
+  const { runId } = tr.register(
+    {
+      workflowName: 'wf',
+      summary: 'summary',
+      workflowFile: 'wf.ts',
+      toolUseId: 'tu-1',
+    },
+    hostCtx.handle,
+  )
+
+  // register an AbortController for each of two agents (simulates what the backend does when it launches an agent)
+  const ac1 = new AbortController()
+  const ac2 = new AbortController()
+  tr.registerAgentAbort(runId, 1, ac1)
+  tr.registerAgentAbort(runId, 2, ac2)
+  expect(ac1.signal.aborted).toBe(false)
+  expect(ac2.signal.aborted).toBe(false)
+
+  // killAgent precisely aborts agent #1: only ac1 aborts, ac2 unaffected
+  expect(tr.killAgent(runId, 1)).toBe(true)
+  expect(ac1.signal.aborted).toBe(true)
+  expect(ac2.signal.aborted).toBe(false)
+  // repeat kill on same agent: controller already deleted, returns false (idempotent)
+  expect(tr.killAgent(runId, 1)).toBe(false)
+
+  // unknown agentId / unknown runId: safely returns false
+  expect(tr.killAgent(runId, 999)).toBe(false)
+  expect(tr.killAgent('nope', 1)).toBe(false)
+
+  // kill(runId) batch aborts remaining agent (ac2)
+  tr.kill(runId)
+  expect(ac2.signal.aborted).toBe(true)
+
+  // once the run is terminal its binding is reclaimed: killAgent returns false
+  expect(tr.killAgent(runId, 2)).toBe(false)
+})
+
+test('unregisterAgentAbort deletes from Map (backend finally cleanup idempotent)', () => {
+  const bus = createProgressBus()
+  const store = createProgressStoreFromBus(bus)
+  const ports = createWorkflowPorts({ bus, store })
+  const tr = ports.taskRegistrar as Required<typeof ports.taskRegistrar>
+
+  const state = { tasks: {} } as unknown as AppState
+  const setAppState: SetAppState = f => {
+    Object.assign(state, f(state))
+  }
+  const hostCtx = ports.hostFactory({
+    context: { agentId: 'a-1', toolUseId: 'tu-1', setAppState },
+    canUseTool: (() => Promise.resolve({ behavior: 'allow' })) as never,
+    parentMessage: {} as never,
+  })
+  const { runId } = tr.register(
+    {
+      workflowName: 'wf',
+      summary: 'summary',
+      workflowFile: 'wf.ts',
+      toolUseId: 'tu-1',
+    },
+    hostCtx.handle,
+  )
+  const ac = new AbortController()
+  tr.registerAgentAbort(runId, 5, ac)
+  // after unregister killAgent has no target, returns false (does not throw)
+  tr.unregisterAgentAbort(runId, 5)
+  expect(tr.killAgent(runId, 5)).toBe(false)
+  // repeat unregister idempotent (backend finally does not throw)
+  expect(() => tr.unregisterAgentAbort(runId, 5)).not.toThrow()
+  // unknown runId safe no-op
+  expect(() => tr.unregisterAgentAbort('nope', 5)).not.toThrow()
+})
+
+test('hostFactory.cwd and journalStore share root (getProjectRoot) — fix K regression', () => {
+  // historical bug: hostFactory.cwd used getCwd() while journalStore used getProjectRoot().
+  // When the user entered a worktree/subdirectory the two differed → named workflow resolution and journal persistence fell out of sync.
+  // After the fix both use projectRoot; this test locks in that choice to prevent regression.
+  const bus = createProgressBus()
+  const store = createProgressStoreFromBus(bus)
+  const ports = createWorkflowPorts({ bus, store })
+  const hostCtx = ports.hostFactory({
+    context: { agentId: 'a', toolUseId: 'tu' },
+    canUseTool: (() => Promise.resolve({ behavior: 'allow' })) as never,
+    parentMessage: {} as never,
+  })
+  expect(hostCtx.cwd).toBe(getProjectRoot())
+})
--- a/src/workflow/tests/progressBus.test.ts
+++ b/src/workflow/tests/progressBus.test.ts
@@ -0,0 +1,23 @@
+import { expect, test, mock } from 'bun:test'
+import { createProgressBus } from '../progress/bus.js'
+
+test('emit broadcasts to all subscribers', () => {
+  const bus = createProgressBus()
+  const a = mock(() => {})
+  const b = mock(() => {})
+  bus.subscribe(a)
+  bus.subscribe(b)
+  const ev = { type: 'log' as const, runId: 'r', message: 'hi' }
+  bus.emit(ev)
+  expect(a).toHaveBeenCalledTimes(1)
+  expect(b).toHaveBeenCalledWith(ev)
+})
+
+test('subscribe returns unsubscribe', () => {
+  const bus = createProgressBus()
+  const fn = mock(() => {})
+  const unsub = bus.subscribe(fn)
+  unsub()
+  bus.emit({ type: 'log', runId: 'r', message: 'x' })
+  expect(fn).not.toHaveBeenCalled()
+})
--- a/src/workflow/tests/progressStore.test.ts
+++ b/src/workflow/tests/progressStore.test.ts
@@ -0,0 +1,289 @@
+import { expect, test } from 'bun:test'
+import { createProgressBus, type ProgressBus } from '../progress/bus.js'
+import {
+  createProgressStoreFromBus,
+  type RunProgress,
+} from '../progress/store.js'
+import type { AgentRunResult } from '@claude-code-best/workflow-engine'
+
+const ok = (o: string): AgentRunResult => ({
+  kind: 'ok',
+  output: o,
+  usage: { outputTokens: 1 },
+})
+
+function newStore() {
+  const bus: ProgressBus = createProgressBus()
+  return { bus, store: createProgressStoreFromBus(bus) }
+}
+
+test('run_started creates entry; phase_started/done updates phases', () => {
+  const { bus, store } = newStore()
+  bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
+  bus.emit({ type: 'phase_started', runId: 'r1', phase: 'A' })
+  bus.emit({ type: 'phase_started', runId: 'r1', phase: 'B' })
+  bus.emit({ type: 'phase_done', runId: 'r1', phase: 'A' })
+  const r = store.get('r1')!
+  expect(r.phases.map(p => [p.title, p.status])).toEqual([
+    ['A', 'done'],
+    ['B', 'running'],
+  ])
+  expect(r.currentPhase).toBe('B')
+})
+
+test('concurrent agent_done correlates by agentId precisely (regression of old LIFO race)', () => {
+  const { bus, store } = newStore()
+  bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
+  bus.emit({
+    type: 'agent_started',
+    runId: 'r1',
+    agentId: 0,
+    label: 'a',
+    phase: 'A',
+  })
+  bus.emit({
+    type: 'agent_started',
+    runId: 'r1',
+    agentId: 1,
+    label: 'b',
+    phase: 'A',
+  })
+  bus.emit({
+    type: 'agent_done',
+    runId: 'r1',
+    agentId: 1,
+    label: 'b',
+    phase: 'A',
+    result: ok('b-out'),
+  })
+  bus.emit({
+    type: 'agent_done',
+    runId: 'r1',
+    agentId: 0,
+    label: 'a',
+    phase: 'A',
+    result: ok('a-out'),
+  })
+  const agents = store.get('r1')!.agents
+  expect(agents.find(x => x.id === 0)?.status).toBe('done')
+  expect(agents.find(x => x.id === 1)?.status).toBe('done')
+  expect(agents.find(x => x.id === 0)?.label).toBe('a')
+  expect(agents.find(x => x.id === 1)?.label).toBe('b')
+})
+
+test('journal hit (agent_done without started) backfills done entry by id', () => {
+  const { bus, store } = newStore()
+  bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
+  bus.emit({
+    type: 'agent_done',
+    runId: 'r1',
+    agentId: 7,
+    label: 'c',
+    phase: 'A',
+    result: ok('c'),
+  })
+  const a = store.get('r1')!.agents.find(x => x.id === 7)!
+  expect(a.status).toBe('done')
+})
+
+test('run_done terminal state + list sort + subscribe notification', () => {
+  const { bus, store } = newStore()
+  let calls = 0
+  store.subscribe(() => calls++)
+  bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
+  bus.emit({
+    type: 'run_done',
+    runId: 'r1',
+    status: 'completed',
+    returnValue: 42,
+  })
+  const r = store.get('r1')!
+  expect(r.status).toBe('completed')
+  expect(r.returnValue).toBe(42)
+  expect(store.list().map(x => x.runId)).toEqual(['r1'])
+  expect(calls).toBe(2)
+})
+
+test('run_done failed terminal state records error', () => {
+  const { bus, store } = newStore()
+  bus.emit({ type: 'run_started', runId: 'r2', workflowName: 'w', meta: null })
+  bus.emit({ type: 'run_done', runId: 'r2', status: 'failed', error: 'boom' })
+  const r = store.get('r2')!
+  expect(r.status).toBe('failed')
+  expect(r.error).toBe('boom')
+})
+
+test('log event does not trigger notify', () => {
+  const { bus, store } = newStore()
+  let calls = 0
+  store.subscribe(() => calls++)
+  bus.emit({ type: 'run_started', runId: 'r3', workflowName: 'w', meta: null })
+  const before = calls
+  bus.emit({ type: 'log', runId: 'r3', message: 'hi' })
+  expect(calls).toBe(before) // log should not trigger notify
+})
+
+test('run_started persists declaredPhases (from meta.phases, order preserved)', () => {
+  const { bus, store } = newStore()
+  bus.emit({
+    type: 'run_started',
+    runId: 'r1',
+    workflowName: 'w',
+    meta: {
+      name: 'w',
+      description: 'd',
+      phases: [{ title: 'Find' }, { title: 'Review' }, { title: 'Verify' }],
+    },
+  })
+  expect(store.get('r1')!.declaredPhases).toEqual(['Find', 'Review', 'Verify'])
+})
+
+test('run_started meta is null → declaredPhases = []', () => {
+  const { bus, store } = newStore()
+  bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
+  expect(store.get('r1')!.declaredPhases).toEqual([])
+})
+
+test('agent_done persists outputShape (ok·object / ok·text / dead none)', () => {
+  const { bus, store } = newStore()
+  bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
+  bus.emit({ type: 'agent_started', runId: 'r1', agentId: 0, phase: 'A' })
+  bus.emit({ type: 'agent_started', runId: 'r1', agentId: 1, phase: 'A' })
+  bus.emit({ type: 'agent_started', runId: 'r1', agentId: 2, phase: 'A' })
+  bus.emit({
+    type: 'agent_done',
+    runId: 'r1',
+    agentId: 0,
+    phase: 'A',
+    result: { kind: 'ok', output: { x: 1 }, usage: { outputTokens: 1 } },
+  })
+  bus.emit({
+    type: 'agent_done',
+    runId: 'r1',
+    agentId: 1,
+    phase: 'A',
+    result: { kind: 'ok', output: 'hi', usage: { outputTokens: 1 } },
+  })
+  bus.emit({
+    type: 'agent_done',
+    runId: 'r1',
+    agentId: 2,
+    phase: 'A',
+    result: { kind: 'dead' },
+  })
+  const agents = store.get('r1')!.agents
+  expect(agents.find(a => a.id === 0)?.outputShape).toBe('object')
+  expect(agents.find(a => a.id === 1)?.outputShape).toBe('text')
+  expect(agents.find(a => a.id === 2)?.outputShape).toBeUndefined()
+})
+
+test('agent_progress real-time updates token/tool (correlated by agentId)', () => {
+  const { bus, store } = newStore()
+  bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
+  bus.emit({
+    type: 'agent_started',
+    runId: 'r1',
+    agentId: 0,
+    label: 'a',
+    phase: 'A',
+  })
+  bus.emit({
+    type: 'agent_progress',
+    runId: 'r1',
+    agentId: 0,
+    tokenCount: 1200,
+    toolCount: 2,
+  })
+  let a = store.get('r1')!.agents.find(x => x.id === 0)!
+  expect(a.tokenCount).toBe(1200)
+  expect(a.toolCount).toBe(2)
+  bus.emit({
+    type: 'agent_progress',
+    runId: 'r1',
+    agentId: 0,
+    tokenCount: 2400,
+    toolCount: 3,
+  })
+  a = store.get('r1')!.agents.find(x => x.id === 0)!
+  expect(a.tokenCount).toBe(2400)
+  expect(a.toolCount).toBe(3)
+})
+
+test('agent_done persists model/tokenCount/toolCount (ok variant)', () => {
+  const { bus, store } = newStore()
+  bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
+  bus.emit({ type: 'agent_started', runId: 'r1', agentId: 0, phase: 'A' })
+  bus.emit({
+    type: 'agent_done',
+    runId: 'r1',
+    agentId: 0,
+    phase: 'A',
+    result: {
+      kind: 'ok',
+      output: 'x',
+      usage: { outputTokens: 5 },
+      model: 'glm-5.2',
+      tokenCount: 22900,
+      toolCount: 1,
+    },
+  })
+  const a = store.get('r1')!.agents.find(x => x.id === 0)!
+  expect(a.model).toBe('glm-5.2')
+  expect(a.tokenCount).toBe(22900)
+  expect(a.toolCount).toBe(1)
+})
+
+// ---- hydrate: inject historical run from disk (cross-restart recovery) ----
+
+test('hydrate injects new run → get hits + list includes it + notifies listener', () => {
+  const { store } = newStore()
+  let notified = 0
+  store.subscribe(() => notified++)
+
+  const historical: RunProgress = {
+    runId: 'hist-1',
+    workflowName: 'old-job',
+    status: 'completed',
+    phases: [],
+    declaredPhases: [],
+    currentPhase: null,
+    agents: [],
+    agentCount: 5,
+    returnValue: { summary: 'past' },
+    startedAt: 1,
+    updatedAt: 2,
+  }
+  store.hydrate(historical)
+
+  expect(store.get('hist-1')).toBe(historical)
+  expect(store.list().map(r => r.runId)).toContain('hist-1')
+  expect(notified).toBeGreaterThan(0)
+})
+
+test('hydrate existing runId → skip (memory first, not overwritten by disk)', () => {
+  const { bus, store } = newStore()
+  bus.emit({
+    type: 'run_started',
+    runId: 'r1',
+    workflowName: 'live',
+    meta: null,
+  })
+
+  const stale: RunProgress = {
+    runId: 'r1',
+    workflowName: 'STALE-SHOULD-NOT-WIN',
+    status: 'completed',
+    phases: [],
+    declaredPhases: [],
+    currentPhase: null,
+    agents: [],
+    agentCount: 0,
+    startedAt: 1,
+    updatedAt: 2,
+  }
+  store.hydrate(stale)
+
+  const got = store.get('r1')!
+  expect(got.workflowName).toBe('live')
+  expect(got.status).toBe('running')
+})
--- a/src/workflow/tests/runStatePersistence.test.ts
+++ b/src/workflow/tests/runStatePersistence.test.ts
@@ -0,0 +1,177 @@
+import { expect, test } from 'bun:test'
+import { mkdtemp, rm, writeFile } from 'node:fs/promises'
+import { tmpdir } from 'node:os'
+import { join } from 'node:path'
+import { attachRunStatePersistence, readRunState } from '../persistence.js'
+import { createProgressBus } from '../progress/bus.js'
+import { createProgressStoreFromBus } from '../progress/store.js'
+
+/**
+ * Contract test for attachRunStatePersistence (adjusted Task 4):
+ * directly test the bus + store combination, bypassing makeService (keeps makeService signature (ports, store, cwdOverride?) unchanged).
+ *
+ * runsDir is injected as tmpdir via attachRunStatePersistence's third parameter runsDirProvider,
+ * to avoid writing to the real project directory (Bun ESM module namespace is read-only, cannot monkey-patch getRunsDir).
+ */
+
+test('run_done completed → writes state.json to disk, returnValue consistent', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-persist-'))
+  try {
+    const bus = createProgressBus()
+    const store = createProgressStoreFromBus(bus)
+    attachRunStatePersistence(bus, store, () => dir)
+
+    bus.emit({
+      type: 'run_started',
+      runId: 'rW',
+      workflowName: 'w',
+      meta: null,
+    })
+    bus.emit({
+      type: 'run_done',
+      runId: 'rW',
+      status: 'completed',
+      returnValue: { ok: true, n: 3 },
+    })
+
+    // writeRunState is fire-and-forget (void writeRunState(...) in the subscription); wait briefly for the async file write to finish
+    await new Promise(r => setTimeout(r, 50))
+
+    const got = await readRunState(dir, 'rW')
+    expect(got).not.toBeNull()
+    expect(got!.status).toBe('completed')
+    expect(got!.returnValue).toEqual({ ok: true, n: 3 })
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('run_done failed → writes status=failed + error field to disk', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-persist-'))
+  try {
+    const bus = createProgressBus()
+    const store = createProgressStoreFromBus(bus)
+    attachRunStatePersistence(bus, store, () => dir)
+
+    bus.emit({
+      type: 'run_started',
+      runId: 'rF',
+      workflowName: 'w',
+      meta: null,
+    })
+    bus.emit({
+      type: 'run_done',
+      runId: 'rF',
+      status: 'failed',
+      error: 'boom',
+    })
+    await new Promise(r => setTimeout(r, 50))
+
+    const got = await readRunState(dir, 'rF')
+    expect(got).not.toBeNull()
+    expect(got!.status).toBe('failed')
+    expect(got!.error).toBe('boom')
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('run_done killed → writes status=killed to disk', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-persist-'))
+  try {
+    const bus = createProgressBus()
+    const store = createProgressStoreFromBus(bus)
+    attachRunStatePersistence(bus, store, () => dir)
+
+    bus.emit({
+      type: 'run_started',
+      runId: 'rK',
+      workflowName: 'w',
+      meta: null,
+    })
+    bus.emit({ type: 'run_done', runId: 'rK', status: 'killed' })
+    await new Promise(r => setTimeout(r, 50))
+
+    const got = await readRunState(dir, 'rK')
+    expect(got?.status).toBe('killed')
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('writeRunState internal IO exception is swallowed: attachRunStatePersistence does not propagate, bus emit does not break', async () => {
+  const blockerDir = await mkdtemp(join(tmpdir(), 'wf-persist-'))
+  // first create a regular file to use as runsDir, so mkdir of the run subdirectory fails → writeRunState's internal catch swallows it
+  await writeFile(join(blockerDir, 'not-a-dir.txt'), 'blocker', 'utf-8')
+  try {
+    const bus = createProgressBus()
+    const store = createProgressStoreFromBus(bus)
+    // runsDir points at a regular file, so the recursive mkdir beneath it fails
+    attachRunStatePersistence(bus, store, () =>
+      join(blockerDir, 'not-a-dir.txt'),
+    )
+
+    // an extra subscriber to verify it still gets notified (bus emit should not break due to internal exception in persistence listener)
+    let otherNotified = 0
+    bus.subscribe(() => otherNotified++)
+
+    // bus.emit should not throw — writeRunState swallows the exception internally
+    expect(() => {
+      bus.emit({
+        type: 'run_started',
+        runId: 'rErr',
+        workflowName: 'w',
+        meta: null,
+      })
+      bus.emit({
+        type: 'run_done',
+        runId: 'rErr',
+        status: 'completed',
+        returnValue: 'x',
+      })
+    }).not.toThrow()
+
+    // wait briefly for the async writeRunState call to settle (its exception is swallowed internally)
+    await new Promise(r => setTimeout(r, 50))
+
+    // the extra bus subscriber still works normally (it received both run_started and run_done)
+    expect(otherNotified).toBeGreaterThanOrEqual(2)
+    expect(store.get('rErr')?.status).toBe('completed')
+  } finally {
+    await rm(blockerDir, { recursive: true, force: true })
+  }
+})
+
+test('attachRunStatePersistence returns unsubscribe; after calling it no more disk writes', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-persist-'))
+  try {
+    const bus = createProgressBus()
+    const store = createProgressStoreFromBus(bus)
+    const unsub = attachRunStatePersistence(bus, store, () => dir)
+
+    // first emit a run_done, verify disk write takes effect
+    bus.emit({
+      type: 'run_started',
+      runId: 'r1',
+      workflowName: 'w',
+      meta: null,
+    })
+    bus.emit({ type: 'run_done', runId: 'r1', status: 'completed' })
+    await new Promise(r => setTimeout(r, 50))
+    expect(await readRunState(dir, 'r1')).not.toBeNull()
+
+    // after unsubscribe, emit run_done again, should not write to disk
+    unsub()
+    bus.emit({
+      type: 'run_started',
+      runId: 'r2',
+      workflowName: 'w',
+      meta: null,
+    })
+    bus.emit({ type: 'run_done', runId: 'r2', status: 'completed' })
+    await new Promise(r => setTimeout(r, 50))
+    expect(await readRunState(dir, 'r2')).toBeNull()
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
--- a/src/workflow/tests/selectors.test.ts
+++ b/src/workflow/tests/selectors.test.ts
@@ -0,0 +1,82 @@
+import { expect, test } from 'bun:test'
+import type { AgentProgress, RunProgress } from '../progress/store.js'
+import {
+  ALL_PHASE,
+  mergePhases,
+  filterAgentsByPhase,
+  tabLabel,
+} from '../panel/selectors.js'
+
+function run(partial: Partial<RunProgress>): RunProgress {
+  return {
+    runId: 'r1',
+    workflowName: 'w',
+    status: 'running',
+    phases: [],
+    declaredPhases: [],
+    currentPhase: null,
+    agents: [],
+    agentCount: 0,
+    startedAt: 1,
+    updatedAt: 1,
+    ...partial,
+  }
+}
+
+test('mergePhases: declared order first, actual phases append undeclared ones, counts done/total', () => {
+  const r = run({
+    declaredPhases: ['Find', 'Review', 'Verify'],
+    phases: [
+      { title: 'Find', status: 'done' },
+      { title: 'Review', status: 'running' },
+    ],
+    agents: [
+      {
+        id: 1,
+        phase: 'Find',
+        status: 'done',
+        resultKind: 'ok',
+        outputShape: 'text',
+      },
+      { id: 2, phase: 'Find', status: 'done', resultKind: 'dead' },
+      { id: 3, phase: 'Review', status: 'running' },
+    ],
+  })
+  expect(mergePhases(r)).toEqual([
+    { title: 'Find', status: 'done', done: 2, total: 2 },
+    { title: 'Review', status: 'running', done: 0, total: 1 },
+    { title: 'Verify', status: 'pending', done: 0, total: 0 },
+  ])
+})
+
+test('mergePhases: actual but undeclared phase appended to the end', () => {
+  const r = run({
+    declaredPhases: ['Find'],
+    phases: [
+      { title: 'Find', status: 'done' },
+      { title: 'Adhoc', status: 'running' },
+    ],
+    agents: [],
+  })
+  expect(mergePhases(r).map(p => p.title)).toEqual(['Find', 'Adhoc'])
+})
+
+test('filterAgentsByPhase: All / undefined → all; specified → only that phase', () => {
+  const agents: AgentProgress[] = [
+    { id: 1, phase: 'A', status: 'running' },
+    {
+      id: 2,
+      phase: 'B',
+      status: 'done',
+      resultKind: 'ok',
+      outputShape: 'text',
+    },
+  ]
+  expect(filterAgentsByPhase(agents, undefined)).toHaveLength(2)
+  expect(filterAgentsByPhase(agents, ALL_PHASE)).toHaveLength(2)
+  expect(filterAgentsByPhase(agents, 'A')).toEqual([agents[0]])
+})
+
+test('tabLabel: workflow name + last 4 chars short code of runId', () => {
+  expect(tabLabel('review-changes', 'wf_abc123def')).toBe('review-changes#3def')
+})
--- a/src/workflow/tests/service.test.ts
+++ b/src/workflow/tests/service.test.ts
@@ -0,0 +1,594 @@
+import { expect, test } from 'bun:test'
+// DI pattern: do not use mock.module (process-global, last-write-wins, would pollute other tests in the same process such as
+// autonomy.test.ts). Instead hand-construct FAKE WorkflowPorts: registry.run returns a fixed ok
+// result, taskRegistrar maintains abort bindings, journalStore is an in-memory empty impl. The real runWorkflow
+// thus runs to completion without needing LLM or mocks.
+
+import { mkdtemp, rm, writeFile } from 'node:fs/promises'
+import { tmpdir } from 'node:os'
+import { join } from 'node:path'
+import { makeService, __resetWorkflowServiceForTests } from '../service.js'
+import { createProgressBus } from '../progress/bus.js'
+import {
+  createProgressStoreFromBus,
+  type RunProgress,
+} from '../progress/store.js'
+import type {
+  AgentRunResult,
+  ProgressEvent,
+  WorkflowPorts,
+} from '@claude-code-best/workflow-engine'
+
+// Construct FAKE ports: registry.run returns a fixed AgentRunResult, taskRegistrar has bindings,
+// journalStore is an in-memory empty impl. progressEmitter.emit → bus.emit (store subscribes to bus at construction).
+// Note: runWorkflow itself emits run_started/run_done; taskRegistrar only manages abort bindings,
+// does not re-emit events (avoids store reducer receiving duplicate run_done).
+type RegistrarCall =
+  | { kind: 'complete'; runId: string; summary?: string }
+  | { kind: 'fail'; runId: string; error?: string }
+  | { kind: 'kill'; runId: string }
+  | {
+      kind: 'registerAgentAbort'
+      runId: string
+      agentId: number
+      controller: AbortController
+    }
+  | { kind: 'unregisterAgentAbort'; runId: string; agentId: number }
+  | { kind: 'killAgent'; runId: string; agentId: number }
+
+function fakePorts(
+  opts: {
+    /** adapter.run throws (simulates agent backend crash). */
+    adapterThrow?: string
+    /** adapter.run return value (default ok). */
+    adapterResult?: AgentRunResult
+    /** agentRunner.runAgentToResult return value (fallback path, default throws). */
+    runnerResult?: AgentRunResult
+  } = {},
+): {
+  ports: WorkflowPorts
+  store: ReturnType<typeof createProgressStoreFromBus>
+  killed: string[]
+  /** taskRegistrar call records (complete/fail/kill/registerAgentAbort/...). */
+  calls: RegistrarCall[]
+  /** runId → (agentId → AbortController). Used by tests to simulate backend registration. */
+  agentBindings: Map<string, Map<number, AbortController>>
+  /** adapter.run call count (accumulates on retry). Holder object; tests read adapterCallsRef.value. */
+  adapterCallsRef: { value: number }
+} {
+  const bus = createProgressBus()
+  const store = createProgressStoreFromBus(bus)
+  const killed: string[] = []
+  const calls: RegistrarCall[] = []
+  const bindings = new Map<string, { abort: AbortController }>()
+  // agentId → AbortController (per runId). killAgent uses this to abort precisely.
+  const agentBindings = new Map<string, Map<number, AbortController>>()
+  // adapter.run call count (accumulates on retry). A holder object is used because returning a plain
+  // number via property shorthand would copy the current value (0) into the result, so later
+  // increments of the outer variable would not be visible to tests. The holder reference is stable.
+  const adapterCallsRef = { value: 0 }
+  let seq = 0
+  const ports = {
+    // hostFactory is not actually called by the service.launch path (service builds its own host handle),
+    // but the WorkflowPorts type requires it to exist; keep a minimal impl.
+    hostFactory: () => ({
+      handle: {} as never,
+      cwd: '/tmp',
+      budgetTotal: null,
+      toolUseId: 'tu',
+    }),
+    agentAdapterRegistry: {
+      resolve: () => ({
+        id: 'claude-code',
+        capabilities: { structuredOutput: true },
+        run:
+          opts.adapterThrow !== undefined
+            ? async (): Promise<AgentRunResult> => {
+                adapterCallsRef.value++
+                throw new Error(opts.adapterThrow)
+              }
+            : async (): Promise<AgentRunResult> => {
+                adapterCallsRef.value++
+                return (
+                  opts.adapterResult ?? {
+                    kind: 'ok',
+                    output: 'mock-out',
+                    usage: { outputTokens: 1 },
+                  }
+                )
+              },
+      }),
+    },
+    agentRunner: {
+      runAgentToResult:
+        opts.runnerResult !== undefined
+          ? async () => opts.runnerResult
+          : async () => {
+              throw new Error('should not reach')
+            },
+    },
+    progressEmitter: {
+      emit: (e: ProgressEvent) => bus.emit(e),
+    },
+    taskRegistrar: {
+      register: ({ workflowName }: { workflowName: string }) => {
+        const abort = new AbortController()
+        seq += 1
+        const runId = `run-${seq}`
+        bindings.set(runId, { abort })
+        agentBindings.set(runId, new Map())
+        return { runId, signal: abort.signal }
+      },
+      complete: (runId: string, summary?: string) => {
+        calls.push({ kind: 'complete', runId, summary })
+      },
+      fail: (runId: string, error?: string) => {
+        calls.push({ kind: 'fail', runId, error })
+      },
+      kill: (runId: string) => {
+        killed.push(runId)
+        calls.push({ kind: 'kill', runId })
+        bindings.get(runId)?.abort.abort()
+      },
+      registerAgentAbort: (
+        runId: string,
+        agentId: number,
+        controller: AbortController,
+      ) => {
+        calls.push({
+          kind: 'registerAgentAbort',
+          runId,
+          agentId,
+          controller,
+        })
+        agentBindings.get(runId)?.set(agentId, controller)
+      },
+      unregisterAgentAbort: (runId: string, agentId: number) => {
+        calls.push({ kind: 'unregisterAgentAbort', runId, agentId })
+        agentBindings.get(runId)?.delete(agentId)
+      },
+      killAgent: (runId: string, agentId: number) => {
+        calls.push({ kind: 'killAgent', runId, agentId })
+        const ac = agentBindings.get(runId)?.get(agentId)
+        if (!ac) return false
+        ac.abort()
+        agentBindings.get(runId)!.delete(agentId)
+        return true
+      },
+      pendingAction: () => null,
+    },
+    journalStore: {
+      read: async () => [],
+      append: async () => {},
+      truncate: async () => {},
+    },
+    permissionGate: { isAborted: () => false },
+    logger: {
+      debug: () => {},
+      event: () => {},
+      warn: () => {},
+    },
+  } as unknown as WorkflowPorts
+  return { ports, store, killed, calls, agentBindings, adapterCallsRef }
+}
+
+const stubTUC = { agentId: 'a1', toolUseId: 'tu' } as never
+const stubCanUseTool = (() => Promise.resolve({ behavior: 'allow' })) as never
+
+/** Wait for the detached runWorkflow call to finish (launch does not await it, so pending microtasks/macrotasks must drain). */
+async function settle(): Promise<void> {
+  await new Promise(r => setTimeout(r, 60))
+}
+
+test('launch → completed; store shows this run', async () => {
+  __resetWorkflowServiceForTests()
+  const { ports, store } = fakePorts()
+  const svc = makeService(ports, store)
+  const { runId } = await svc.launch(
+    { script: `return agent('compute')` },
+    stubTUC,
+    stubCanUseTool,
+  )
+  await settle()
+  const r = svc.getRun(runId)
+  expect(r).toBeDefined()
+  // detached execution may still be running within the settle window, or already completed — both are acceptable.
+  expect(['completed', 'running']).toContain(r!.status)
+  expect(r!.workflowName).toBe('workflow')
+})
+
+test('launch inline script → returns scriptPath (persisted to cwdOverride dir)', async () => {
+  __resetWorkflowServiceForTests()
+  const dir = await mkdtemp(join(tmpdir(), 'wf-svc-'))
+  try {
+    const { ports, store } = fakePorts()
+    const svc = makeService(ports, store, dir)
+    const result = await svc.launch(
+      { script: `return agent('x')` },
+      stubTUC,
+      stubCanUseTool,
+    )
+    expect(result.scriptPath).toBe(
+      join(dir, '.claude', 'workflow-runs', 'run-1', 'script.js'),
+    )
+    const { readFile } = await import('node:fs/promises')
+    expect(await readFile(result.scriptPath!, 'utf-8')).toBe(
+      `return agent('x')`,
+    )
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('kill goes through taskRegistrar.kill', async () => {
+  __resetWorkflowServiceForTests()
+  const { ports, store, killed } = fakePorts()
+  const svc = makeService(ports, store)
+  const { runId } = await svc.launch(
+    { script: `return agent('x')` },
+    stubTUC,
+    stubCanUseTool,
+  )
+  svc.kill(runId)
+  expect(killed).toContain(runId)
+})
+
+test('killAgent goes through taskRegistrar.killAgent: precisely aborts a single agent', async () => {
+  __resetWorkflowServiceForTests()
+  const { ports, store, calls, agentBindings } = fakePorts()
+  const svc = makeService(ports, store)
+  const { runId } = await svc.launch(
+    { script: `return agent('x')` },
+    stubTUC,
+    stubCanUseTool,
+  )
+  // simulate backend registering AbortController when launching agent
+  const ac = new AbortController()
+  agentBindings.get(runId)!.set(7, ac)
+  // service.killAgent routes to taskRegistrar.killAgent, which actually aborts the corresponding controller
+  expect(svc.killAgent(runId, 7)).toBe(true)
+  expect(ac.signal.aborted).toBe(true)
+  expect(
+    calls.some(
+      c => c.kind === 'killAgent' && c.runId === runId && c.agentId === 7,
+    ),
+  ).toBe(true)
+  // after the abort the controller is deleted from the Map: calling killAgent on the same agent again returns false (idempotent)
+  expect(svc.killAgent(runId, 7)).toBe(false)
+  // unknown agentId / unknown runId: safely returns false
+  expect(svc.killAgent(runId, 999)).toBe(false)
+  expect(svc.killAgent('nope', 1)).toBe(false)
+})
+
+test('listRuns/subscribe come from store', () => {
+  __resetWorkflowServiceForTests()
+  const { ports, store } = fakePorts()
+  const svc = makeService(ports, store)
+  expect(svc.listRuns()).toEqual([])
+  let n = 0
+  const unsub = svc.subscribe(() => {
+    n++
+  })
+  expect(typeof unsub).toBe('function')
+  unsub()
+  expect(n).toBe(0)
+})
+
+test('listNamed delegates to namedWorkflows (empty dir → []; with files → lists)', async () => {
+  __resetWorkflowServiceForTests()
+  const { ports, store } = fakePorts()
+  const svc = makeService(ports, store)
+  // non-existent dir → []
+  const empty = await svc.listNamed(
+    join(tmpdir(), `wf-nope-${Math.random().toString(36).slice(2)}`),
+  )
+  expect(empty).toEqual([])
+  // dir with named files → lists names (extension stripped, sorted)
+  const dir = await mkdtemp(join(tmpdir(), 'wf-named-'))
+  try {
+    await writeFile(
+      join(dir, 'a.ts'),
+      'export const meta = { name: "a", description: "d" }\nreturn 1',
+    )
+    await writeFile(join(dir, 'b.js'), 'return 2')
+    const names = await svc.listNamed(dir)
+    expect(names).toEqual(['a', 'b'])
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('missing script/name/scriptPath → throws', async () => {
+  __resetWorkflowServiceForTests()
+  const { ports, store } = fakePorts()
+  const svc = makeService(ports, store)
+  await expect(svc.launch({}, stubTUC, stubCanUseTool)).rejects.toThrow(
+    /script|name|scriptPath/,
+  )
+})
+
+test('scriptPath reads file content and validates', async () => {
+  __resetWorkflowServiceForTests()
+  const { ports, store } = fakePorts()
+  const svc = makeService(ports, store)
+  const dir = await mkdtemp(join(tmpdir(), 'wf-path-'))
+  const file = join(dir, 's.ts')
+  try {
+    await writeFile(file, `return agent('from-file')`)
+    const { runId } = await svc.launch(
+      { scriptPath: file },
+      stubTUC,
+      stubCanUseTool,
+    )
+    await settle()
+    const r = svc.getRun(runId)
+    expect(r).toBeDefined()
+    expect(['completed', 'running']).toContain(r!.status)
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('parseScript validation failed → launch throws', async () => {
+  __resetWorkflowServiceForTests()
+  const { ports, store } = fakePorts()
+  const svc = makeService(ports, store)
+  // trigger ScriptError: meta literal missing description (validateMeta requires both name+description to be strings)
+  await expect(
+    svc.launch(
+      { script: `export const meta = { name: "x" }\nreturn 1` },
+      stubTUC,
+      stubCanUseTool,
+    ),
+  ).rejects.toThrow(/Script validation failed/i)
+})
+
+// ---- Service-layer failure routing coverage (review gap: .then/.catch → taskRegistrar path) ----
+
+test('script run throws → service routes to taskRegistrar.fail, with error text', async () => {
+  __resetWorkflowServiceForTests()
+  const { ports, store, calls } = fakePorts()
+  const svc = makeService(ports, store)
+  await svc.launch(
+    { script: `throw new Error('script boom')` },
+    stubTUC,
+    stubCanUseTool,
+  )
+  await settle()
+  const fail = calls.find(c => c.kind === 'fail')
+  expect(fail).toBeDefined()
+  expect(fail?.kind === 'fail' && fail.error).toMatch(/script boom/)
+})
+
+test('adapter throws → retry still throws → degrade to dead → workflow completed (not fail)', async () => {
+  __resetWorkflowServiceForTests()
+  // new semantics: agent non-abort throw → retry once → still throws → degrade to dead (agent returns null),
+  // workflow continues and completes. Retry tolerates transient failures (429/network), but a permanently
+  // broken agent does not bring down the entire workflow (consistent with the parallel/pipeline null-on-error contract).
+  const { ports, store, calls, adapterCallsRef } = fakePorts({
+    adapterThrow: 'adapter boom',
+  })
+  const svc = makeService(ports, store)
+  await svc.launch({ script: `return agent('x')` }, stubTUC, stubCanUseTool)
+  await settle()
+  // retry once → adapter called 2 times
+  expect(adapterCallsRef.value).toBe(2)
+  // workflow completed normally, not failed
+  const complete = calls.find(c => c.kind === 'complete')
+  expect(complete).toBeDefined()
+  const fail = calls.find(c => c.kind === 'fail')
+  expect(fail).toBeUndefined()
+})
+
+test('script completes normally → service routes to taskRegistrar.complete', async () => {
+  __resetWorkflowServiceForTests()
+  const { ports, store, calls } = fakePorts()
+  const svc = makeService(ports, store)
+  await svc.launch({ script: `return agent('x')` }, stubTUC, stubCanUseTool)
+  await settle()
+  expect(calls.some(c => c.kind === 'complete')).toBe(true)
+})
+
+// ---- Fix N: shutdown cleanup ----
+
+test('shutdown kills all running runs (taskRegistrar.kill called for each)', async () => {
+  __resetWorkflowServiceForTests()
+  const { ports, store, killed } = fakePorts()
+  // slow the adapter down so both runs are still running when shutdown is called
+  const slowPorts = {
+    ...ports,
+    agentAdapterRegistry: {
+      resolve: () => ({
+        id: 'claude-code',
+        capabilities: { structuredOutput: true },
+        run: async (): Promise<AgentRunResult> => {
+          await new Promise(r => setTimeout(r, 200))
+          return { kind: 'ok', output: 'slow', usage: { outputTokens: 1 } }
+        },
+      }),
+    },
+  } as unknown as typeof ports
+  const slowSvc = makeService(slowPorts, store)
+  const { runId: a } = await slowSvc.launch(
+    { script: `return agent('a')` },
+    stubTUC,
+    stubCanUseTool,
+  )
+  const { runId: b } = await slowSvc.launch(
+    { script: `return agent('b')` },
+    stubTUC,
+    stubCanUseTool,
+  )
+  killed.length = 0
+  slowSvc.shutdown()
+  expect(killed).toContain(a)
+  expect(killed).toContain(b)
+})
+
+test('shutdown does not re-kill completed runs; idempotent (multiple calls safe)', async () => {
+  __resetWorkflowServiceForTests()
+  const { ports, store, killed } = fakePorts()
+  const svc = makeService(ports, store)
+  const { runId } = await svc.launch(
+    { script: `return agent('x')` },
+    stubTUC,
+    stubCanUseTool,
+  )
+  await settle() // complete
+  killed.length = 0
+  svc.shutdown()
+  // already completed should not be killed again
+  expect(killed).not.toContain(runId)
+  // idempotent
+  expect(() => svc.shutdown()).not.toThrow()
+})
+
+// ---- Task 5: loadPersistedRuns + getRunAsync fallback ----
+// runsDirProvider is injected as makeService's fourth optional parameter with tmpdir, to avoid writing to the real project dir
+// (Bun ESM module namespace is read-only, cannot monkey-patch getRunsDir).
+
+test('loadPersistedRuns scans disk to hydrate historical runs; existing in-memory runs are not overwritten', async () => {
+  __resetWorkflowServiceForTests()
+  const dir = await mkdtemp(join(tmpdir(), 'wf-svc-'))
+  try {
+    // seed the disk with two historical runs
+    const { writeRunState } = await import('../persistence.js')
+    const historicalA = {
+      runId: 'hA',
+      workflowName: 'old-A',
+      status: 'completed',
+      phases: [],
+      declaredPhases: [],
+      currentPhase: null,
+      agents: [],
+      agentCount: 1,
+      returnValue: 'a',
+      startedAt: 10,
+      updatedAt: 20,
+    } as RunProgress
+    const historicalB = {
+      runId: 'hB',
+      workflowName: 'old-B',
+      status: 'failed',
+      phases: [],
+      declaredPhases: [],
+      currentPhase: null,
+      agents: [],
+      agentCount: 2,
+      error: 'x',
+      startedAt: 30,
+      updatedAt: 40,
+    } as RunProgress
+    await writeRunState(dir, historicalA)
+    await writeRunState(dir, historicalB)
+
+    const { ports, store } = fakePorts()
+    // seed memory with one current-session run (via ports.progressEmitter.emit through bus → store)
+    ports.progressEmitter.emit({
+      type: 'run_started',
+      runId: 'live',
+      workflowName: 'live-w',
+      meta: null,
+    })
+    const svc = makeService(ports, store, undefined, () => dir)
+
+    await svc.loadPersistedRuns()
+
+    const ids = svc.listRuns().map(r => r.runId)
+    expect(ids).toContain('hA')
+    expect(ids).toContain('hB')
+    expect(ids).toContain('live')
+    // memory takes precedence: live is still running (not overwritten by disk; disk has no live entry, so no STALE is injected)
+    expect(svc.getRun('live')!.status).toBe('running')
+    expect(svc.getRun('hA')!.returnValue).toBe('a')
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('loadPersistedRuns repeated calls scan disk only once (persistedLoaded flag)', async () => {
+  __resetWorkflowServiceForTests()
+  const dir = await mkdtemp(join(tmpdir(), 'wf-svc-'))
+  try {
+    const { ports, store } = fakePorts()
+    const svc = makeService(ports, store, undefined, () => dir)
+
+    await svc.loadPersistedRuns()
+    await svc.loadPersistedRuns()
+    await svc.loadPersistedRuns()
+
+    // repeated calls do not throw, do not change listRuns result (empty dir)
+    expect(svc.listRuns()).toEqual([])
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('getRunAsync memory hit → no disk read', async () => {
+  __resetWorkflowServiceForTests()
+  const dir = await mkdtemp(join(tmpdir(), 'wf-svc-'))
+  try {
+    const { ports, store } = fakePorts()
+    const svc = makeService(ports, store, undefined, () => dir)
+    ports.progressEmitter.emit({
+      type: 'run_started',
+      runId: 'live',
+      workflowName: 'w',
+      meta: null,
+    })
+
+    const got = await svc.getRunAsync('live')
+    expect(got?.runId).toBe('live')
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('getRunAsync memory miss + disk hit → returns disk value, and does not inject into memory (subsequent get still reads disk)', async () => {
+  __resetWorkflowServiceForTests()
+  const dir = await mkdtemp(join(tmpdir(), 'wf-svc-'))
+  try {
+    const { writeRunState } = await import('../persistence.js')
+    const historical = {
+      runId: 'hist-only',
+      workflowName: 'old',
+      status: 'completed',
+      phases: [],
+      declaredPhases: [],
+      currentPhase: null,
+      agents: [],
+      agentCount: 0,
+      returnValue: { x: 1 },
+      startedAt: 1,
+      updatedAt: 2,
+    } as RunProgress
+    await writeRunState(dir, historical)
+
+    const { ports, store } = fakePorts()
+    const svc = makeService(ports, store, undefined, () => dir)
+
+    const got = await svc.getRunAsync('hist-only')
+    expect(got?.returnValue).toEqual({ x: 1 })
+    // not injected into memory: the in-memory list does not contain it (not hydrated)
+    expect(svc.listRuns().map(r => r.runId)).not.toContain('hist-only')
+    // a subsequent get still returns it (each call goes through the readRunState fallback)
+    const got2 = await svc.getRunAsync('hist-only')
+    expect(got2?.returnValue).toEqual({ x: 1 })
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('getRunAsync memory miss + disk miss → undefined', async () => {
+  __resetWorkflowServiceForTests()
+  const dir = await mkdtemp(join(tmpdir(), 'wf-svc-'))
+  try {
+    const { ports, store } = fakePorts()
+    const svc = makeService(ports, store, undefined, () => dir)
+
+    const got = await svc.getRunAsync('no-such-run')
+    expect(got).toBeUndefined()
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
--- a/src/workflow/tests/status.test.ts
+++ b/src/workflow/tests/status.test.ts
@@ -0,0 +1,88 @@
+import { expect, test } from 'bun:test'
+import type { AgentProgress, RunProgress } from '../progress/store.js'
+import {
+  STATUS_DOT,
+  RUN_STATUS_COLOR,
+  RUN_STATUS_TEXT,
+  PHASE_MARK,
+  PHASE_COLOR,
+  agentVisual,
+  formatTokenCount,
+  agentMetaText,
+} from '../panel/status.js'
+
+test('STATUS_DOT / RUN_STATUS_COLOR / RUN_STATUS_TEXT cover four run states', () => {
+  const statuses: RunProgress['status'][] = [
+    'running',
+    'completed',
+    'failed',
+    'killed',
+  ]
+  for (const s of statuses) {
+    expect(STATUS_DOT[s].length).toBeGreaterThan(0)
+    expect(RUN_STATUS_COLOR[s]).toBeTruthy()
+    expect(RUN_STATUS_TEXT[s].length).toBeGreaterThan(0)
+  }
+  expect(STATUS_DOT.running).toBe('●')
+  expect(STATUS_DOT.completed).toBe('✓')
+  expect(STATUS_DOT.failed).toBe('✗')
+  expect(STATUS_DOT.killed).toBe('■')
+  expect(RUN_STATUS_TEXT.completed).toBe('done')
+  expect(RUN_STATUS_TEXT.running).toBe('running')
+})
+
+test('PHASE_MARK / PHASE_COLOR cover running/done/pending', () => {
+  expect(PHASE_MARK.running).toBe('●')
+  expect(PHASE_MARK.done).toBe('✓')
+  expect(PHASE_MARK.pending).toBe('○')
+  expect(PHASE_COLOR.pending).toBe('subtle')
+})
+
+test('agentVisual: running → ● warning', () => {
+  const a: AgentProgress = { id: 1, status: 'running' }
+  expect(agentVisual(a)).toEqual({ mark: '●', color: 'warning' })
+})
+
+test('agentVisual: done·ok → ✓ success (no longer carries outputShape suffix)', () => {
+  const a: AgentProgress = {
+    id: 1,
+    status: 'done',
+    resultKind: 'ok',
+    outputShape: 'object',
+  }
+  expect(agentVisual(a)).toEqual({ mark: '✓', color: 'success' })
+})
+
+test('agentVisual: dead → ✗ error', () => {
+  const a: AgentProgress = { id: 1, status: 'done', resultKind: 'dead' }
+  expect(agentVisual(a)).toEqual({ mark: '✗', color: 'error' })
+})
+
+test('formatTokenCount: <1000 original value, ≥1000 keeps 1 decimal + k', () => {
+  expect(formatTokenCount(undefined)).toBe('0')
+  expect(formatTokenCount(0)).toBe('0')
+  expect(formatTokenCount(42)).toBe('42')
+  expect(formatTokenCount(1000)).toBe('1.0k')
+  expect(formatTokenCount(22900)).toBe('22.9k')
+})
+
+test('agentMetaText: model · Nk tok · N tool', () => {
+  const a: AgentProgress = {
+    id: 1,
+    status: 'done',
+    model: 'glm-5.2',
+    tokenCount: 22900,
+    toolCount: 1,
+  }
+  expect(agentMetaText(a)).toBe('glm-5.2 · 22.9k tok · 1 tool')
+})
+
+test('agentMetaText: omits prefix when no model', () => {
+  const a: AgentProgress = {
+    id: 1,
+    status: 'running',
+    tokenCount: 500,
+    toolCount: 2,
+  }
+  expect(agentMetaText(a)).toBe('500 tok · 2 tool')
+})
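The formatting rules pinned down by the `formatTokenCount` and `agentMetaText` tests above can be sketched as follows. This is a minimal illustration inferred only from the assertions; the real implementation lives in `../panel/status.js` and is not shown in this diff, so the names `formatTokenCountSketch`, `agentMetaTextSketch`, and the `AgentMetaInput` type are hypothetical.

```typescript
// Hypothetical sketch inferred from the test assertions; not the code in ../panel/status.js.
type AgentMetaInput = { model?: string; tokenCount?: number; toolCount?: number }

// Below 1000 the raw number is shown; from 1000 up, one decimal place plus a "k" suffix.
function formatTokenCountSketch(n: number | undefined): string {
  const value = n ?? 0
  return value < 1000 ? String(value) : `${(value / 1000).toFixed(1)}k`
}

// Joins the optional model with the token and tool counts, separated by " · ".
function agentMetaTextSketch(a: AgentMetaInput): string {
  const parts = [
    `${formatTokenCountSketch(a.tokenCount)} tok`,
    `${a.toolCount ?? 0} tool`,
  ]
  if (a.model) parts.unshift(a.model)
  return parts.join(' · ')
}
```

Treating `undefined` as 0 keeps the panel text stable while an agent has not yet reported any usage, which is the behavior the `formatTokenCount(undefined)` assertion expects.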
--- a/src/workflow/tests/useWorkflowKeyboard.test.ts
+++ b/src/workflow/tests/useWorkflowKeyboard.test.ts
@@ -0,0 +1,45 @@
+import { expect, test } from 'bun:test'
+import { routeWorkflowKey } from '../panel/useWorkflowKeyboard.js'
+
+test('Tab → nextTab; Shift+Tab → prevTab', () => {
+  expect(routeWorkflowKey('', { tab: true })).toBe('nextTab')
+  expect(routeWorkflowKey('', { tab: true, shift: true })).toBe('prevTab')
+})
+
+test('q / Esc → quit', () => {
+  expect(routeWorkflowKey('q', {})).toBe('quit')
+  expect(routeWorkflowKey('', { escape: true })).toBe('quit')
+})
+
+test('x → killAgent; K → killWorkflow; r → resume; n → newRun', () => {
+  expect(routeWorkflowKey('x', {})).toBe('killAgent')
+  expect(routeWorkflowKey('K', {})).toBe('killWorkflow')
+  expect(routeWorkflowKey('r', {})).toBe('resume')
+  expect(routeWorkflowKey('n', {})).toBe('newRun')
+})
+
+test('confirm mode: y/Enter → confirmYes; n/Esc/q → confirmNo; other keys → null', () => {
+  expect(routeWorkflowKey('y', {}, 'confirm')).toBe('confirmYes')
+  expect(routeWorkflowKey('Y', {}, 'confirm')).toBe('confirmYes')
+  expect(routeWorkflowKey('', { return: true }, 'confirm')).toBe('confirmYes')
+  expect(routeWorkflowKey('n', {}, 'confirm')).toBe('confirmNo')
+  expect(routeWorkflowKey('N', {}, 'confirm')).toBe('confirmNo')
+  expect(routeWorkflowKey('', { escape: true }, 'confirm')).toBe('confirmNo')
+  expect(routeWorkflowKey('q', {}, 'confirm')).toBe('confirmNo')
+  // confirm mode swallows navigation/edit keys, preventing accidental triggers
+  expect(routeWorkflowKey('x', {}, 'confirm')).toBeNull()
+  expect(routeWorkflowKey('', { tab: true }, 'confirm')).toBeNull()
+  expect(routeWorkflowKey('', { upArrow: true }, 'confirm')).toBeNull()
+})
+
+test('←/→ switch focus column; ↑/↓ move within column', () => {
+  expect(routeWorkflowKey('', { leftArrow: true })).toBe('focusLeft')
+  expect(routeWorkflowKey('', { rightArrow: true })).toBe('focusRight')
+  expect(routeWorkflowKey('', { upArrow: true })).toBe('moveUp')
+  expect(routeWorkflowKey('', { downArrow: true })).toBe('moveDown')
+})
+
+test('unrelated input → null', () => {
+  expect(routeWorkflowKey('z', {})).toBeNull()
+  expect(routeWorkflowKey('', {})).toBeNull()
+})
--- a/src/workflow/backends/claudeCodeBackend.ts
+++ b/src/workflow/backends/claudeCodeBackend.ts
@@ -0,0 +1,409 @@
+// Deeply-integrated backend: resolves agent/model/tools from the live session, delegates to the core runAgent.
+// Implements the AgentAdapter interface, registered and routed by the registry (U5).
+import {
+  type AgentAdapter,
+  type AgentAdapterContext,
+  type AgentRunParams,
+  type AgentRunResult,
+  WorkflowAbortedError,
+} from '@claude-code-best/workflow-engine'
+import { assembleToolPool } from '../../tools.js'
+import { finalizeAgentTool } from '@claude-code-best/builtin-tools/tools/AgentTool/agentToolUtils.js'
+import { runAgent } from '@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js'
+import {
+  isBuiltInAgent,
+  type AgentDefinition,
+  type BuiltInAgentDefinition,
+} from '@claude-code-best/builtin-tools/tools/AgentTool/loadAgentsDir.js'
+import { createUserMessage, extractTextContent } from '../../utils/messages.js'
+import { getTokenCountFromUsage } from '../../utils/tokens.js'
+import { createHash } from 'node:crypto'
+import { createAgentId } from '../../utils/uuid.js'
+import { logForDebugging } from '../../utils/debug.js'
+import { runWithCwdOverride } from '../../utils/cwd.js'
+import {
+  createAgentWorktree,
+  hasWorktreeChanges,
+  removeAgentWorktree,
+} from '../../utils/worktree.js'
+import { logEvent } from '../../services/analytics/index.js'
+import type { ModelAlias } from '../../utils/model/aliases.js'
+import type { Message } from '../../types/message.js'
+import type { ToolUseContext } from '../../Tool.js'
+import { readHostBundle } from '../hostHandle.js'
+
+/** Fallback definition for workflow subagents (used when agentType does not match a real registry entry). */
+export const WORKFLOW_AGENT: BuiltInAgentDefinition = {
+  agentType: 'workflow-worker',
+  whenToUse: 'subtask dispatched by the agent() hook inside a workflow script',
+  tools: ['*'],
+  source: 'built-in',
+  baseDir: 'built-in',
+  getSystemPrompt: () =>
+    'You are a workflow sub-agent. Complete the task concisely; your final text is the return value relayed to the workflow.',
+}
+
+/** agentType -> real agent definition from the registry (used when activeAgents has a match, otherwise falls back to WORKFLOW_AGENT). Exported for unit test coverage. */
+export function resolveAgentDefinition(
+  agentType: string | undefined,
+  toolUseContext: ToolUseContext,
+): AgentDefinition {
+  if (!agentType) return WORKFLOW_AGENT
+  const found = toolUseContext.options.agentDefinitions.activeAgents.find(
+    a => a.agentType === agentType,
+  )
+  return found ?? WORKFLOW_AGENT
+}
+
+/** model alias -> the actual model id of the current provider. v1 passes it through directly (keeps a mapping extension point). Exported for unit test coverage. */
+export function mapWorkflowModel(
+  model: string | undefined,
+): string | undefined {
+  return model
+}
+
+/**
+ * Extract the JSON object produced under schema mode from the agent's final message; returns null on failure. Exported for unit test coverage.
+ *
+ * Robustness strategy (in priority order, returns the first that successfully parses):
+ * 1. fenced code block (```json ... ``` or ``` ... ```) - agents often spontaneously add fences
+ * 2. the first "brace-balanced" {...} fragment in the bare text - handles preceding/trailing narration / multi-segment output
+ *
+ * Uses a brace-stack scan instead of `indexOf('{')..lastIndexOf('}')`: correctly handles nested objects,
+ * `{}` inside string literals, and escape characters. Will not concatenate multiple unrelated JSON fragments (the original version did).
+ *
+ * Does not do syntax repair (trailing commas, single quotes -> double quotes, comment removal) - agents do not produce non-standard JSON,
+ * and repairing it could corrupt string contents (e.g. `"http://..."` getting eaten by a // comment regex).
+ * On parse failure it skips directly to the next candidate.
+ *
+ * Only returns a plain object (typeof === 'object' && !null && !Array);
+ * the schema mode contract is object, array/number/string are all treated as the agent going off-track.
+ */
+export function extractStructuredOutput(
+  content: Array<{ type: string; text?: string }>,
+): unknown | null {
+  for (const block of content) {
+    if (block.type !== 'text' || !block.text) continue
+    const found = findFirstJsonObject(block.text)
+    if (found !== null) return found
+  }
+  return null
+}
+
+/** Find the first JSON fragment in text that can be parsed as a plain object. */
+function findFirstJsonObject(text: string): unknown | null {
+  // 1. fenced code blocks - priority (agents naturally tend to add them; strip the fence and parse the whole block)
+  for (const m of text.matchAll(
+    /```[\t ]*[a-zA-Z0-9_-]*\s*\n([\s\S]*?)\n?```/g,
+  )) {
+    const parsed = tryParseObject(m[1] ?? '')
+    if (parsed !== null) return parsed
+  }
+  // 2. bare text: scan each '{', find a balanced pair and try parse
+  for (let i = 0; i < text.length; i++) {
+    if (text[i] !== '{') continue
+    const end = findBalancedObjectEnd(text, i)
+    if (end < 0) continue
+    const parsed = tryParseObject(text.slice(i, end + 1))
+    if (parsed !== null) return parsed
+  }
+  return null
+}
+
+/**
+ * Find the matching `}` index starting from start (which must be `{`); returns -1 when unbalanced.
+ * Skips braces inside string literals and escape characters. Does not skip comments (the JSON standard does not allow comments,
+ * agents do not produce them, and stripping them is risky - see the extractStructuredOutput doc).
+ */
+function findBalancedObjectEnd(text: string, start: number): number {
+  let depth = 0
+  let inString = false
+  for (let i = start; i < text.length; i++) {
+    const c = text[i]
+    if (inString) {
+      if (c === '\\')
+        i++ // skip the escape char and the next character
+      else if (c === '"') inString = false
+      continue
+    }
+    if (c === '"') inString = true
+    else if (c === '{') depth++
+    else if (c === '}') {
+      depth--
+      if (depth === 0) return i
+    }
+  }
+  return -1
+}
+
+/** Try to parse the candidate; returns only a plain object, anything else (array/number/null) yields null. */
+function tryParseObject(candidate: string): unknown | null {
+  const trimmed = candidate.trim()
+  if (!trimmed.startsWith('{') || !trimmed.endsWith('}')) return null
+  try {
+    const v = JSON.parse(trimmed)
+    return typeof v === 'object' && v !== null && !Array.isArray(v) ? v : null
+  } catch {
+    return null
+  }
+}
+
+type WorkflowWorktreeInfo = Awaited<ReturnType<typeof createAgentWorktree>>
+
+/**
+ * Generate a slug for the worktree isolation of a workflow agent: derive hex segments from sha256(runId:agentId),
+ * matching the cleanup regex of cleanupStaleAgentWorktrees `^wf_[0-9a-f]{8}-[0-9a-f]{3}-\d+$`.
+ * runId is `w`+base36 (not a UUID), so it cannot be placed directly into the regex segments; sha256 is a deterministic mapping,
+ * and agentId ensures slug uniqueness for multiple agents under the same runId (no shared counter, no thread safety issues).
+ */
+function makeWorkflowWorktreeSlug(runId: string, agentId: string): string {
+  const h = createHash('sha256').update(`${runId}:${agentId}`).digest('hex')
+  return `wf_${h.slice(0, 8)}-${h.slice(8, 11)}-${parseInt(h.slice(11, 17), 16) % 100000}`
+}
+
+/**
+ * Clean up the worktree after the agent finishes: hookBased keeps it (cannot detect VCS changes); otherwise uses
+ * hasWorktreeChanges (fail-closed) to detect, auto-removes when there is no change, keeps it on change/detection failure
+ * and logs the path (v1 uses logs rather than extending AgentRunResult, to avoid touching journal serialization).
+ */
+async function cleanupWorkflowWorktree(
+  info: WorkflowWorktreeInfo,
+  agentType: string,
+): Promise<void> {
+  if (info.hookBased || !info.headCommit) return
+  let changed = true
+  try {
+    changed = await hasWorktreeChanges(info.worktreePath, info.headCommit)
+  } catch (e) {
+    logForDebugging(
+      `workflow worktree change-detect failed (${agentType}): ${(e as Error).message}`,
+    )
+    changed = true
+  }
+  if (!changed) {
+    try {
+      await removeAgentWorktree(
+        info.worktreePath,
+        info.worktreeBranch,
+        info.gitRoot,
+      )
+    } catch (e) {
+      logForDebugging(
+        `workflow worktree remove failed (${agentType}): ${(e as Error).message}`,
+      )
+    }
+  } else {
+    logForDebugging(
+      `workflow worktree retained (has changes, ${agentType}): ${info.worktreePath}`,
+    )
+  }
+}
+
+/** Deeply-integrated backend: resolves agent/model/tools from the live session, delegates to the core runAgent. */
+export const claudeCodeBackend: AgentAdapter = {
+  id: 'claude-code',
+  capabilities: { structuredOutput: true, tools: true },
+
+  async run(
+    params: AgentRunParams,
+    ctx: AgentAdapterContext,
+  ): Promise<AgentRunResult> {
+    const { toolUseContext, canUseTool } = readHostBundle(ctx.host)
+    const appState = toolUseContext.getAppState()
+    const agentDef = resolveAgentDefinition(params.agentType, toolUseContext)
+    const model = mapWorkflowModel(params.model)
+    // coreAgentId: the tracking ID for the core-layer subagent (a string, used inside runAgent).
+    // Different from ctx.agentId (the engine's number seq, used for panel / killAgent routing) - two distinct concepts, must not be mixed up.
+    const coreAgentId = createAgentId()
+
+    // isolation:'worktree' - run the agent inside an independent git worktree, so concurrent writes do not conflict.
+    let worktreeInfo: WorkflowWorktreeInfo | null = null
+    if (params.isolation === 'worktree') {
+      try {
+        worktreeInfo = await createAgentWorktree(
+          makeWorkflowWorktreeSlug(ctx.runId, coreAgentId),
+        )
+      } catch (e) {
+        // fail-closed: when isolation fails, do not silently fall back to a shared cwd (otherwise concurrent writes race on data)
+        const detail = (e as Error).message
+        logForDebugging(
+          `workflow worktree creation failed (${agentDef.agentType}): ${detail}`,
+        )
+        return { kind: 'dead', reason: 'worktree-failed', detail }
+      }
+    }
+    // runWithCwdOverride makes tools such as Bash/Read inside the agent see the worktree path
+    // (AsyncLocalStorage is preserved across awaits); the worktreePath parameter of runAgent only writes metadata.
+    const runInCwd = worktreeInfo
+      ? <T>(fn: () => T): T =>
+          runWithCwdOverride(worktreeInfo!.worktreePath, fn)
+      : <T>(fn: () => T): T => fn()
+
+    // Bridge ctx.signal -> runAgent.override.abortController. Otherwise, when the workflow is killed
+    // runAgent is unaware (root cause of 'x' being ineffective): the abort signal cannot reach the internal fetch, and the agent runs to completion.
+    // Single-agent kill goes through service.killAgent(runId, agentId) -> ports.taskRegistrar.killAgent ->
+    // agentAbortControllers.get(agentId).abort(); the same controller serves both paths.
+    const agentAbort = new AbortController()
+    const onParentAbort = (): void => agentAbort.abort()
+    if (ctx.signal.aborted) {
+      agentAbort.abort()
+    } else {
+      ctx.signal.addEventListener('abort', onParentAbort, { once: true })
+    }
+    if (typeof ctx.registerAgentAbort === 'function') {
+      ctx.registerAgentAbort(ctx.agentId, agentAbort)
+    }
+
+    const workerPermissionContext = {
+      ...appState.toolPermissionContext,
+      mode: agentDef.permissionMode ?? 'acceptEdits',
+    }
+    const workerTools = assembleToolPool(
+      workerPermissionContext,
+      appState.mcp.tools,
+    )
+
+    // schema -> instructs the agent to directly emit JSON in the final text block.
+    // Does not require calling the StructuredOutput tool - it is not in the workflow subagent's tool set (only
+    // the stop_hook path explicitly injects it; workflow goes through assembleToolPool whose default pool does not include it).
+    // Historically the prompt required "call StructuredOutput tool", causing 8/12 agents to refuse to wrap up or struggle to call it;
+    // empirically the main cause of dead is the tool being unreachable rather than "forgetting". So the contract changed: raw JSON text, and extractStructuredOutput
+    // tolerates code fences + preceding/trailing narration + multiple segments.
+    const promptText = params.schema
+      ? [
+          params.prompt,
+          '',
+          'After completing the task, emit your final answer as a single JSON object matching this JSON Schema:',
+          '```json',
+          JSON.stringify(params.schema, null, 2),
+          '```',
+          '',
+          'CRITICAL RULES:',
+          '- The JSON object must be the LAST text block in your response. Do not write any prose after it.',
+          '- Emit the JSON as plain text (markdown code fences optional).',
+          '- Do NOT call any "StructuredOutput" or "SyntheticOutput" tool — it is not available in this environment.',
+          '- Your turn must end with the JSON object. Anything after it (prose, tool calls) will be ignored or cause your answer to be discarded.',
+        ].join('\n')
+      : params.prompt
+
+    const promptMessages = [createUserMessage({ content: promptText })]
+    const messages: Message[] = []
+    const startTime = Date.now()
+    // Accumulate running progress (onProgress push -> agent_progress event -> panel refreshes token/tool in real time).
+    let tokenCount = 0
+    let toolCount = 0
+
+    try {
+      await runInCwd(async () => {
+        for await (const msg of runAgent({
+          agentDefinition: agentDef,
+          promptMessages,
+          toolUseContext,
+          canUseTool,
+          isAsync: true,
+          querySource: toolUseContext.options.querySource ?? 'workflow',
+          availableTools: workerTools,
+          // override the same object: coreAgentId (core subagent tracking) + abortController (kill bridge).
+          // runAgent's model is the top-level ModelAlias; workflow's model is an arbitrary alias string,
+          // the types are incompatible and are resolved by the provider layer at runtime. Passed through via a double assertion (better than as any/never).
+          override: { agentId: coreAgentId, abortController: agentAbort },
+          ...(model ? { model: model as unknown as ModelAlias } : {}),
+          ...(worktreeInfo ? { worktreePath: worktreeInfo.worktreePath } : {}),
+        })) {
+          messages.push(msg as Message)
+          // Accumulate running progress: assistant message carries usage (cumulative value -> overwrite), tool_use inside content (incremental).
+          if (msg.type === 'assistant' && msg.message) {
+            const usage = msg.message.usage as
+              | Parameters<typeof getTokenCountFromUsage>[0]
+              | undefined
+            if (usage) tokenCount = getTokenCountFromUsage(usage)
+            const content = msg.message.content as
+              | Array<{ type: string }>
+              | undefined
+            if (content)
+              toolCount += content.filter(b => b.type === 'tool_use').length
+          }
+          ctx.onProgress?.({ tokenCount, toolCount })
+        }
+      })
+    } catch (e) {
+      // abort (kill workflow / kill agent): must rethrow WorkflowAbortedError after detection,
+      // otherwise hooks.agent will swallow the abort as an ordinary failure into dead, and the workflow won't know it was killed
+      // (the other side of the 'x' kill path being ineffective: the signal did arrive, but the result was disguised as a normal completion).
+      if (agentAbort.signal.aborted || (e as Error)?.name === 'AbortError') {
+        throw new WorkflowAbortedError()
+      }
+      const detail = (e as Error).message
+      logForDebugging(
+        `workflow sub-agent error (${agentDef.agentType}): ${detail}`,
+      )
+      logEvent('tengu_workflow_agent', { ok: 0 })
+      return { kind: 'dead', reason: 'runagent-threw', detail }
+    } finally {
+      // cleanup (idempotent): listener removeEventListener / Map.delete are safe to call repeatedly.
+      if (typeof ctx.unregisterAgentAbort === 'function') {
+        ctx.unregisterAgentAbort(ctx.agentId)
+      }
+      ctx.signal.removeEventListener('abort', onParentAbort)
+      if (worktreeInfo) {
+        const info = worktreeInfo
+        worktreeInfo = null
+        await cleanupWorkflowWorktree(info, agentDef.agentType)
+      }
+    }
+
+    const finalized = finalizeAgentTool(messages, coreAgentId, {
+      prompt: params.prompt,
+      resolvedAgentModel: toolUseContext.options.mainLoopModel,
+      isBuiltInAgent: isBuiltInAgent(agentDef),
+      startTime,
+      agentType: agentDef.agentType,
+      isAsync: true,
+    })
+    const outputTokens =
+      finalized.usage?.output_tokens ?? finalized.totalTokens ?? 0
+    // For panel display: total context tokens, tool-call count, and the resolved model id at completion.
+    const finalTokenCount = finalized.totalTokens ?? 0
+    const finalToolCount = finalized.totalToolUseCount ?? 0
+    const resolvedModel = model ?? toolUseContext.options.mainLoopModel
+    logEvent('tengu_workflow_agent', { ok: 1, outputTokens })
+
+    if (params.schema) {
+      const structured = extractStructuredOutput(finalized.content)
+      if (structured === null) {
+        // The agent finished all tool calls but no plain-object JSON was found in the final text block.
+        // Typical scenarios: forgot to emit JSON after a long tool chain, unbalanced JSON nesting, parse failure.
+        // Put a preview of the last text into detail so the hooks retry log and the panel can immediately see what the agent actually said.
+        const preview = extractTextContent(finalized.content, '\n').slice(
+          0,
+          200,
+        )
+        logForDebugging(
+          `workflow sub-agent produced no JSON object (${agentDef.agentType}); preview: ${preview}`,
+        )
+        return {
+          kind: 'dead',
+          reason: 'no-structured-output',
+          detail: preview,
+        }
+      }
+      return {
+        kind: 'ok',
+        output: structured as object,
+        usage: { outputTokens },
+        model: resolvedModel,
+        toolCount: finalToolCount,
+        tokenCount: finalTokenCount,
+      }
+    }
+    const text = extractTextContent(finalized.content, '\n')
+    return {
+      kind: 'ok',
+      output: text,
+      usage: { outputTokens },
+      model: resolvedModel,
+      toolCount: finalToolCount,
+      tokenCount: finalTokenCount,
+    }
+  },
+}
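The progress accumulation in the streaming loop above treats the two counters differently: usage is cumulative, so the latest value overwrites, while tool calls are counted per message and added up. A minimal standalone sketch of that rule follows. The message shape is simplified, and `countTokens` is a hypothetical stand-in for `getTokenCountFromUsage`, whose real formula is not shown in this diff.

```typescript
type Block = { type: string };
type Usage = { input_tokens: number; output_tokens: number };
type Msg =
  | { type: 'assistant'; message: { usage?: Usage; content?: Block[] } }
  | { type: 'user' };

// Hypothetical stand-in for getTokenCountFromUsage (assumption: a plain sum).
function countTokens(u: Usage): number {
  return u.input_tokens + u.output_tokens;
}

function accumulate(msgs: Msg[]): { tokenCount: number; toolCount: number } {
  let tokenCount = 0;
  let toolCount = 0;
  for (const msg of msgs) {
    if (msg.type !== 'assistant') continue;
    // Usage is already cumulative: overwrite rather than add.
    if (msg.message.usage) tokenCount = countTokens(msg.message.usage);
    // tool_use blocks are per message: add to the running total.
    if (msg.message.content) {
      toolCount += msg.message.content.filter(b => b.type === 'tool_use').length;
    }
  }
  return { tokenCount, toolCount };
}

const progress = accumulate([
  {
    type: 'assistant',
    message: {
      usage: { input_tokens: 100, output_tokens: 20 },
      content: [{ type: 'tool_use' }, { type: 'text' }],
    },
  },
  { type: 'user' },
  {
    type: 'assistant',
    message: {
      usage: { input_tokens: 150, output_tokens: 60 },
      content: [{ type: 'tool_use' }],
    },
  },
]);
console.log(progress); // { tokenCount: 210, toolCount: 2 }
```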
--- a/src/workflow/hostHandle.ts
+++ b/src/workflow/hostHandle.ts
@@ -0,0 +1,42 @@
+import {
+  createHostHandle,
+  unwrapHostHandle,
+  type HostHandle,
+} from '@claude-code-best/workflow-engine'
+import type { CanUseToolFn } from '../hooks/useCanUseTool.js'
+import type { AssistantMessage } from '../types/message.js'
+import type { AgentId } from '../types/ids.js'
+import type { ToolUseContext } from '../Tool.js'
+
+/** Opaque bundle held inside HostHandle (unpacked on the core side). */
+export type WorkflowHostBundle = {
+  toolUseContext: ToolUseContext
+  canUseTool: CanUseToolFn
+  parentMessage?: AssistantMessage
+  agentId?: AgentId
+}
+
+/**
+ * Shared: builds the host bundle from toolUseContext/canUseTool.
+ * parentMessage is optional (absent on the panel launch path — claudeCodeBackend never reads it).
+ */
+export function buildHostBundle(
+  toolUseContext: WorkflowHostBundle['toolUseContext'],
+  canUseTool: WorkflowHostBundle['canUseTool'],
+  parentMessage?: AssistantMessage,
+): WorkflowHostBundle {
+  return {
+    toolUseContext,
+    canUseTool,
+    ...(parentMessage !== undefined ? { parentMessage } : {}),
+    agentId: toolUseContext.agentId,
+  }
+}
+
+export function makeHostHandle(bundle: WorkflowHostBundle): HostHandle {
+  return createHostHandle(bundle)
+}
+
+export function readHostBundle(handle: HostHandle): WorkflowHostBundle {
+  return unwrapHostHandle(handle) as WorkflowHostBundle
+}
--- a/src/workflow/namedWorkflowCommands.ts
+++ b/src/workflow/namedWorkflowCommands.ts
@@ -0,0 +1,34 @@
+import { join } from 'node:path'
+import {
+  listNamedWorkflows,
+  WORKFLOW_DIR_NAME,
+} from '@claude-code-best/workflow-engine'
+import type { Command } from '../types/command.js'
+import { getProjectRoot } from '../bootstrap/state.js'
+
+/** Scan *.ts|*.js|*.mjs under .claude/workflows/ and generate a /<name> command for each. */
+export async function getWorkflowCommands(
+  cwd: string = getProjectRoot(),
+): Promise<Command[]> {
+  const dir = join(cwd, WORKFLOW_DIR_NAME)
+  const names = await listNamedWorkflows(dir)
+  return names.map(name => ({
+    type: 'prompt',
+    name,
+    description: `Run workflow: ${name}`,
+    kind: 'workflow',
+    source: 'builtin',
+    progressMessage: `Running workflow ${name}...`,
+    contentLength: 0,
+    async getPromptForCommand(args, _context) {
+      const argText =
+        typeof args === 'string' && args ? `\n\nArguments: ${args}` : ''
+      return [
+        {
+          type: 'text',
+          text: `Run the "${name}" workflow now by calling the Workflow tool with name="${name}".${argText}`,
+        },
+      ]
+    },
+  }))
+}
--- a/src/workflow/notifications.ts
+++ b/src/workflow/notifications.ts
@@ -0,0 +1,88 @@
+/**
+ * Bridge for workflow status-change notifications.
+ *
+ * The engine emits events via progressEmitter.emit({ type: 'run_done', ... }),
+ * and the progress/store reducer records the status into RunProgress. But the
+ * old implementation had no code bridging status transitions to the host
+ * notification mechanism — the "notifies automatically on completion" promise
+ * in WorkflowTool's return text went unfulfilled.
+ *
+ * This module subscribes to WorkflowService.subscribe, watches status transitions
+ * from running → completed/failed/killed, and emits a host notification via the
+ * injected notifier callback (defaults to enqueuePendingNotification task-notification mode).
+ */
+import {
+  STATUS_TAG,
+  SUMMARY_TAG,
+  TASK_ID_TAG,
+  TASK_NOTIFICATION_TAG,
+  TASK_TYPE_TAG,
+} from '../constants/xml.js'
+import { enqueuePendingNotification } from '../utils/messageQueueManager.js'
+import type { RunProgress } from './progress/store.js'
+import type { WorkflowService } from './service.js'
+
+const WORKFLOW_TASK_TYPE = 'local_workflow'
+
+/** Notifier abstraction (lets tests inject a spy). */
+export type WorkflowNotifier = (message: string) => void
+
+const TERMINAL_STATUSES: ReadonlySet<RunProgress['status']> = new Set([
+  'completed',
+  'failed',
+  'killed',
+])
+
+/** Default notifier: uses the host message queue's task-notification mode. */
+const defaultNotifier: WorkflowNotifier = message => {
+  enqueuePendingNotification({ value: message, mode: 'task-notification' })
+}
+
+export function installWorkflowNotifications(
+  service: WorkflowService,
+  notify: WorkflowNotifier = defaultNotifier,
+): () => void {
+  const prevStatus = new Map<string, RunProgress['status'] | undefined>()
+
+  const unsubscribe = service.subscribe(() => {
+    const runs = service.listRuns()
+    for (const run of runs) {
+      const prev = prevStatus.get(run.runId)
+      // First time seeing this run: just record the current status without notifying
+      // (avoids treating existing historical runs as new notifications on install)
+      if (prev === undefined) {
+        prevStatus.set(run.runId, run.status)
+        continue
+      }
+      // Status changed + entered terminal state → emit notification
+      if (prev !== run.status && TERMINAL_STATUSES.has(run.status)) {
+        notify(buildMessage(run))
+      }
+      prevStatus.set(run.runId, run.status)
+    }
+  })
+
+  return () => {
+    unsubscribe()
+    prevStatus.clear()
+  }
+}
+
+function buildMessage(run: RunProgress): string {
+  const statusText =
+    run.status === 'completed'
+      ? 'completed successfully'
+      : run.status === 'failed'
+        ? 'failed'
+        : 'was stopped'
+  const errorSuffix =
+    run.status === 'failed' && run.error ? `: ${run.error}` : ''
+  const summary = `Workflow "${run.workflowName}" ${statusText}${errorSuffix}`
+
+  return `<${TASK_NOTIFICATION_TAG}>
+<${TASK_ID_TAG}>${run.runId}</${TASK_ID_TAG}>
+<${TASK_TYPE_TAG}>${WORKFLOW_TASK_TYPE}</${TASK_TYPE_TAG}>
+<${STATUS_TAG}>${run.status}</${STATUS_TAG}>
+<${SUMMARY_TAG}>${summary}</${SUMMARY_TAG}>
+</${TASK_NOTIFICATION_TAG}>`
+}
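The transition rule in `installWorkflowNotifications` (record on first sighting, notify only when a known run changes into a terminal status) can be exercised in isolation. The sketch below is an illustration under assumptions: `FakeService` is a hypothetical in-memory stand-in for `WorkflowService`, and the notifier receives a short summary string instead of the XML payload built by `buildMessage`.

```typescript
type Status = 'running' | 'completed' | 'failed' | 'killed';
type Run = { runId: string; workflowName: string; status: Status };

// Hypothetical stand-in for WorkflowService: only subscribe/listRuns are modeled.
class FakeService {
  private listeners = new Set<() => void>();
  private runs: Run[] = [];
  subscribe(listener: () => void): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }
  listRuns(): Run[] {
    return this.runs;
  }
  setRuns(runs: Run[]): void {
    this.runs = runs;
    this.listeners.forEach(l => l());
  }
}

const TERMINAL: ReadonlySet<Status> = new Set<Status>(['completed', 'failed', 'killed']);

function watchTransitions(service: FakeService, notify: (summary: string) => void): () => void {
  const prevStatus = new Map<string, Status>();
  return service.subscribe(() => {
    for (const run of service.listRuns()) {
      const prev = prevStatus.get(run.runId);
      // First sighting (prev undefined) only records; later changes into a terminal status notify.
      if (prev !== undefined && prev !== run.status && TERMINAL.has(run.status)) {
        notify(`${run.workflowName}:${run.status}`);
      }
      prevStatus.set(run.runId, run.status);
    }
  });
}

const sent: string[] = [];
const fake = new FakeService();
const stop = watchTransitions(fake, s => {
  sent.push(s);
});

const historical: Run = { runId: 'w1', workflowName: 'audit', status: 'completed' };
fake.setRuns([historical]); // already terminal on first sighting: no notification
fake.setRuns([historical, { runId: 'w2', workflowName: 'build', status: 'running' }]);
fake.setRuns([historical, { runId: 'w2', workflowName: 'build', status: 'completed' }]);
fake.setRuns([historical, { runId: 'w2', workflowName: 'build', status: 'completed' }]); // no duplicate
stop();
console.log(sent); // ['build:completed']
```

One consequence of the first-sighting rule: a run whose first observed status is already terminal never produces a notification.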
--- a/src/workflow/panel/AgentList.tsx
+++ b/src/workflow/panel/AgentList.tsx
@@ -0,0 +1,71 @@
+import React from 'react';
+import { Box, Text, useAnimationFrame } from '@anthropic/ink';
+import type { Theme } from '@anthropic/ink';
+import type { AgentProgress } from '../progress/store.js';
+import { agentMetaText, agentVisual } from './status.js';
+
+const SPINNER_FRAMES = ['·', '✢', '✱', '✶', '✻', '✽'];
+const FRAME_MS = 120;
+const LABEL_MAX = 18;
+
+/**
+ * Truncate the label to at most max characters. A trailing `#number` suffix (the audit workflow's
+ * `verify:${dim}#${findingIdx}` format) is preserved, so verify agents with several findings under the same dimension
+ * stay distinguishable; the prefix is cut and joined with `…`. Without such a suffix, the label is cut from the right (legacy behavior).
+ * Exported for unit test coverage.
+ */
+export function truncateLabel(raw: string, max: number): string {
+  if (raw.length <= max) return raw;
+  const m = raw.match(/#\d+$/);
+  if (!m) return raw.slice(0, max);
+  const suffix = m[0]; // includes the # sign
+  const prefix = raw.slice(0, raw.length - suffix.length);
+  const available = max - suffix.length - 1; // -1 reserved for …
+  return `${prefix.slice(0, available)}…${suffix}`;
+}
+
+/**
+ * Right-side agent list (already filtered by the selected phase).
+ * Selected row: painted with a selectionBg background only when this column has focus (focused=true); the foreground color is kept, not inverted.
+ * When focus is on the other column no background is painted, to avoid a "fake focus".
+ * The status mark of a running agent is a spinner driven by useAnimationFrame (shared clock, so all rows animate in sync);
+ * the right-hand `model · Nk tok · N tool` text is refreshed live by agent_progress / agent_done.
+ */
+export function AgentList({
+  agents,
+  selectedIndex,
+  focused,
+}: {
+  agents: AgentProgress[];
+  selectedIndex: number;
+  focused: boolean;
+}): React.ReactNode {
+  // Subscribe once to the animation frame at the top level: all running agents share the same frame (synchronized animation, avoids a per-row hook).
+  const [ref, time] = useAnimationFrame(FRAME_MS);
+  const frame = SPINNER_FRAMES[Math.floor(time / FRAME_MS) % SPINNER_FRAMES.length];
+
+  if (agents.length === 0) {
+    return <Text color="subtle">(no agents in this phase)</Text>;
+  }
+  return (
+    <Box ref={ref} flexDirection="column">
+      {agents.map((a, i) => {
+        const v = agentVisual(a);
+        const selected = i === selectedIndex;
+        const highlighted = selected && focused;
+        const running = a.status === 'running';
+        const mark = running ? frame : v.mark;
+        const label = truncateLabel(a.label ?? `agent-${a.id}`, LABEL_MAX);
+        return (
+          <Box key={a.id} backgroundColor={highlighted ? 'selectionBg' : undefined} justifyContent="space-between">
+            <Box>
+              <Text color={v.color as keyof Theme}>{mark}</Text>
+              <Text> {label}</Text>
+            </Box>
+            <Text color="subtle">{agentMetaText(a)}</Text>
+          </Box>
+        );
+      })}
+    </Box>
+  );
+}
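A few worked inputs make the suffix-preserving truncation in `truncateLabel` concrete. The block below is a standalone copy of the same logic so it runs on its own; the sample labels are made up for illustration, and 18 is the `LABEL_MAX` value from the file.

```typescript
// Standalone copy of the truncation logic, for illustration only.
function truncateLabel(raw: string, max: number): string {
  if (raw.length <= max) return raw;
  const m = raw.match(/#\d+$/);
  if (!m) return raw.slice(0, max);
  const suffix = m[0]; // includes the # sign
  const prefix = raw.slice(0, raw.length - suffix.length);
  const available = max - suffix.length - 1; // -1 reserved for the ellipsis
  return `${prefix.slice(0, available)}…${suffix}`;
}

// Short labels pass through unchanged.
const short = truncateLabel('scan', 18);
// No `#number` suffix: plain cut from the right, no ellipsis.
const noSuffix = truncateLabel('verify:security-and-auth', 18);
// With a suffix: the prefix is shortened and the suffix survives.
const withSuffix = truncateLabel('verify:security-and-auth#12', 18);

console.log(short); // scan
console.log(noSuffix); // verify:security-an
console.log(withSuffix); // verify:securit…#12
```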
--- a/src/workflow/panel/PhaseSidebar.tsx
+++ b/src/workflow/panel/PhaseSidebar.tsx
@@ -0,0 +1,65 @@
+import React from 'react';
+import { Box, Text, useAnimationFrame } from '@anthropic/ink';
+import type { Theme } from '@anthropic/ink';
+import type { AgentProgress } from '../progress/store.js';
+import { PHASE_COLOR, PHASE_MARK, type PhaseStatus } from './status.js';
+import { ALL_PHASE, type MergedPhase } from './selectors.js';
+
+const SPINNER_FRAMES = ['·', '✢', '✱', '✶', '✻', '✽'];
+const FRAME_MS = 120;
+
+type PhaseRow = {
+  title: string;
+  status?: PhaseStatus;
+  done: number;
+  total: number;
+};
+
+/**
+ * Left phase sidebar: the first row is All (aggregating done/total), followed by the merged phases (including pending ○).
+ * Selected row: painted with a selectionBg background (foreground kept, not inverted) plus a `>` marker only when this column has focus (focused=true);
+ * when focus is on the other column no background is painted, to avoid a "fake focus". The status mark of a running phase is a spinner driven by useAnimationFrame.
+ * Style aligns with the reference image: `> ✓ Scan  3/3`.
+ */
+export function PhaseSidebar({
+  phases,
+  agents,
+  selectedIndex,
+  focused,
+}: {
+  phases: MergedPhase[];
+  agents: AgentProgress[];
+  selectedIndex: number;
+  focused: boolean;
+}): React.ReactNode {
+  const [ref, time] = useAnimationFrame(FRAME_MS);
+  const frame = SPINNER_FRAMES[Math.floor(time / FRAME_MS) % SPINNER_FRAMES.length];
+  const totalAgents = agents.length;
+  const doneAgents = agents.filter(a => a.status === 'done').length;
+  const rows: PhaseRow[] = [{ title: ALL_PHASE, done: doneAgents, total: totalAgents }, ...phases];
+
+  return (
+    <Box ref={ref} flexDirection="column">
+      {rows.map((row, i) => {
+        const selected = i === selectedIndex;
+        const highlighted = selected && focused;
+        const running = row.status === 'running';
+        const mark = running ? frame : row.status ? PHASE_MARK[row.status] : ' ';
+        const color = (row.status ? PHASE_COLOR[row.status] : 'subtle') as keyof Theme;
+        return (
+          <Box key={row.title} backgroundColor={highlighted ? 'selectionBg' : undefined} justifyContent="space-between">
+            <Box>
+              <Text color={selected ? 'claude' : undefined}>{highlighted ? '>' : ' '}</Text>
+              <Text> </Text>
+              <Text color={color}>{mark}</Text>
+              <Text> {row.title}</Text>
+            </Box>
+            <Text color="subtle">
+              {row.done}/{row.total}
+            </Text>
+          </Box>
+        );
+      })}
+    </Box>
+  );
+}
--- a/src/workflow/panel/TabsBar.tsx
+++ b/src/workflow/panel/TabsBar.tsx
@@ -0,0 +1,37 @@
+import React from 'react';
+import { Box, Text } from '@anthropic/ink';
+import type { Theme } from '@anthropic/ink';
+import type { RunProgress } from '../progress/store.js';
+import { RUN_STATUS_COLOR, STATUS_DOT } from './status.js';
+import { tabLabel } from './selectors.js';
+
+/**
+ * Top run tab row: one tab per run (status dot + name + #short code).
+ * The current tab is highlighted with an orange ═ underline.
+ */
+export function TabsBar({ runs, activeRunId }: { runs: RunProgress[]; activeRunId: string | null }): React.ReactNode {
+  if (runs.length === 0) {
+    return <Text color="subtle">(no runs)</Text>;
+  }
+  return (
+    <Box>
+      {runs.map(r => {
+        const active = r.runId === activeRunId;
+        const label = tabLabel(r.workflowName, r.runId);
+        const underline = '═'.repeat(label.length + 2);
+        return (
+          <Box key={r.runId} flexDirection="column" marginRight={2}>
+            <Box>
+              <Text color={RUN_STATUS_COLOR[r.status] as keyof Theme}>{STATUS_DOT[r.status]}</Text>
+              <Text> </Text>
+              <Text color={active ? 'claude' : undefined} bold={active}>
+                {label}
+              </Text>
+            </Box>
+            <Text color={active ? 'claude' : undefined}>{active ? underline : ''}</Text>
+          </Box>
+        );
+      })}
+    </Box>
+  );
+}
--- a/src/workflow/panel/WorkflowsPanel.tsx
+++ b/src/workflow/panel/WorkflowsPanel.tsx
@@ -0,0 +1,283 @@
+import React, { useEffect, useRef, useState, useSyncExternalStore } from 'react';
+import { Box, Dialog, Text, useAnimationFrame } from '@anthropic/ink';
+import type { Theme } from '@anthropic/ink';
+import type { LocalJSXCommandContext, LocalJSXCommandOnDone } from '../../types/command.js';
+import { getWorkflowService } from '../service.js';
+import type { RunProgress } from '../progress/store.js';
+import { AgentList } from './AgentList.js';
+import { PhaseSidebar } from './PhaseSidebar.js';
+import { TabsBar } from './TabsBar.js';
+import { RUN_STATUS_COLOR, RUN_STATUS_TEXT } from './status.js';
+import { type FocusColumn, type WorkflowKeyboardHandlers, useWorkflowKeyboard } from './useWorkflowKeyboard.js';
+import { ALL_PHASE, filterAgentsByPhase, formatDuration, mergePhases } from './selectors.js';
+
+/**
+ * Clamp the selected index to a valid range (empty list -> 0; out of range -> last index; negative/NaN -> 0).
+ * Extracted as a module-level pure function so the panel and the unit tests exercise the same logic, avoiding behavior drift.
+ */
+export function clampSelected(selected: number, len: number): number {
+  if (len === 0) return 0;
+  const n = Math.trunc(selected);
+  if (Number.isNaN(n) || n < 0) return 0;
+  return Math.min(n, len - 1);
+}
+
+/**
+ * Determine whether the focused run completed the running -> terminal state transition (used for panel auto-exit).
+ * Extracted into a pure function for easy unit testing; called directly inside the panel's useEffect.
+ *
+ * Trigger condition: prev and curr are the same runId, prev is running, curr is completed/failed/killed.
+ * - Opening the history panel (prev=null): does not trigger
+ * - Switching to an already completed tab (different runId): does not trigger
+ * - Same run running -> terminal: triggers
+ */
+export function isRunTerminatedTransition(
+  prev: { runId: string; status: RunProgress['status'] } | null,
+  curr: { runId: string; status: RunProgress['status'] } | null,
+): boolean {
+  if (!prev || !curr) return false;
+  if (prev.runId !== curr.runId) return false;
+  if (prev.status !== 'running') return false;
+  return curr.status === 'completed' || curr.status === 'failed' || curr.status === 'killed';
+}
+
+/**
+ * /workflows main panel: three-region focus model (top tab + left phase sidebar + right agent list).
+ *
+ * - useSyncExternalStore subscribes to WorkflowService (the store returns stable snapshots, no re-render without change).
+ * - Focus state: activeRunId / focusColumn('phases'|'agents') / selectedPhaseIndex(0=All) / selectedAgentIndex.
+ * - Keybindings: Tab switch run · Left/Right switch focus column · Up/Down move within column · x kill agent · K kill workflow · r resume · q/Esc quit.
+ */
+export function WorkflowsPanel({
+  onDone,
+  context,
+}: {
+  onDone: LocalJSXCommandOnDone;
+  context: LocalJSXCommandContext;
+}): React.ReactNode {
+  const svc = getWorkflowService();
+  const runs = useSyncExternalStore(
+    svc.subscribe,
+    () => svc.listRuns(),
+    () => [],
+  );
+
+  const [activeRunId, setActiveRunId] = useState<string | null>(null);
+  const [focusColumn, setFocusColumn] = useState<FocusColumn>('phases');
+  const [selectedPhaseIndex, setSelectedPhaseIndex] = useState(0);
+  const [selectedAgentIndex, setSelectedAgentIndex] = useState(0);
+  // kill secondary confirmation. null = no dialog; 'workflow' = kill the whole run; 'agent' = kill the currently selected agent.
+  // When non-null the keyboard enters confirm mode (only y/Enter/n/Esc/q respond).
+  const [confirmKill, setConfirmKill] = useState<null | 'agent' | 'workflow'>(null);
+
+  // On mount, trigger a single disk scan to hydrate historical runs (the service's internal persistedLoaded flag guards idempotency).
+  // Re-mount / re-render does not scan again (guarded by the process-singleton flag). The svc reference is stable (getWorkflowService singleton).
+  useEffect(() => {
+    void svc.loadPersistedRuns();
+  }, [svc]);
+
+  // When runs change: if activeRunId no longer matches any run (first render, or the run is gone), fall back to the first run.
+  useEffect(() => {
+    if (runs.length === 0) {
+      if (activeRunId !== null) setActiveRunId(null);
+      return;
+    }
+    if (!runs.some(r => r.runId === activeRunId)) {
+      setActiveRunId(runs[0]!.runId);
+    }
+  }, [runs, activeRunId]);
+
+  const focused: RunProgress | undefined = runs.find(r => r.runId === activeRunId);
+  const phases = focused ? mergePhases(focused) : [];
+  // The sidebar includes the All row: prepend one item to the phases array -> total rows = phases.length + 1
+  const phaseRowCount = phases.length + 1;
+  const clampedPhase = clampSelected(selectedPhaseIndex, phaseRowCount);
+
+  // Auto-exit the panel when the focused run goes from running to a terminal state (800ms delay so the user sees the final ✓/✗).
+  // Only a transition on the same runId triggers it: switching to an already completed tab (prev was a different run) does not exit, nor does opening
+  // the history panel (prev=null). Without auto-exit the panel would block the agent waiting on the Workflow tool result until the user pressed q.
+  const prevFocusedRef = useRef<{ runId: string; status: RunProgress['status'] } | null>(null);
+  useEffect(() => {
+    const curr = focused ? { runId: focused.runId, status: focused.status } : null;
+    const prev = prevFocusedRef.current;
+    prevFocusedRef.current = curr;
+    if (!isRunTerminatedTransition(prev, curr)) return;
+    const timer = setTimeout(() => onDone(), 800);
+    return (): void => {
+      clearTimeout(timer);
+    };
+  }, [focused?.runId, focused?.status, onDone]);
+
+  // Selected phase title (0 = All = undefined)
+  const selectedPhaseTitle = clampedPhase === 0 ? undefined : phases[clampedPhase - 1]?.title;
+
+  const visibleAgents = focused ? filterAgentsByPhase(focused.agents, selectedPhaseTitle) : [];
+  const clampedAgent = clampSelected(selectedAgentIndex, visibleAgents.length);
+
+  const switchTab = (runId: string): void => {
+    setActiveRunId(runId);
+    setFocusColumn('phases');
+    setSelectedPhaseIndex(0);
+    setSelectedAgentIndex(0);
+  };
+
+  const nextTab = (): void => {
+    if (runs.length === 0) return;
+    const idx = runs.findIndex(r => r.runId === activeRunId);
+    const next = runs[(idx + 1) % runs.length]!;
+    switchTab(next.runId);
+  };
+  const prevTab = (): void => {
+    if (runs.length === 0) return;
+    const idx = runs.findIndex(r => r.runId === activeRunId);
+    const next = runs[(idx - 1 + runs.length) % runs.length]!;
+    switchTab(next.runId);
+  };
+
+  const handlers: WorkflowKeyboardHandlers = {
+    nextTab,
+    prevTab,
+    focusLeft: () => setFocusColumn('phases'),
+    focusRight: () => setFocusColumn('agents'),
+    moveUp: () => {
+      if (focusColumn === 'phases') setSelectedPhaseIndex(s => clampSelected(s - 1, phaseRowCount));
+      else setSelectedAgentIndex(s => clampSelected(s - 1, visibleAgents.length));
+    },
+    moveDown: () => {
+      if (focusColumn === 'phases') setSelectedPhaseIndex(s => clampSelected(s + 1, phaseRowCount));
+      else setSelectedAgentIndex(s => clampSelected(s + 1, visibleAgents.length));
+    },
+    killAgent: () => {
+      // Only open the agent confirmation when the agents column is focused (pressing x in the phases column has no target, so it is a no-op).
+      // confirmKill stores only the kind ('agent'), not the agent id: confirmYes re-reads visibleAgents[clampedAgent]
+      // at confirmation time, so the target can shift if visibleAgents changes while the dialog is open.
+      if (focusColumn !== 'agents' || !focused) return;
+      const agent = visibleAgents[clampedAgent];
+      if (!agent) return;
+      setConfirmKill('agent');
+    },
+    killWorkflow: () => {
+      if (!focused) return;
+      setConfirmKill('workflow');
+    },
+    resumeFocused: () => {
+      if (!focused) return;
+      const canUseTool = context.canUseTool;
+      if (!canUseTool) {
+        onDone('resume needs canUseTool context; run /<name> resume from the main session.');
+        return;
+      }
+      void svc
+        .launch({ resumeFromRunId: focused.runId, name: focused.workflowName }, context, canUseTool)
+        .catch(e => onDone(`resume failed: ${(e as Error).message}`));
+    },
+    newRun: () => onDone('Tip: start a named workflow with /<name>, or pass name via the Workflow tool.'),
+    quit: () => {
+      // In confirm mode q = cancel confirmation (routeWorkflowKey already routed to confirmNo);
+      // only in non-confirm mode does it really exit the panel.
+      if (confirmKill !== null) {
+        setConfirmKill(null);
+        return;
+      }
+      onDone();
+    },
+    confirmYes: () => {
+      if (confirmKill === 'workflow' && focused) {
+        svc.kill(focused.runId);
+        // After killing the entire workflow, immediately return to the main chat: the run_done event -> the store reducer changes the status to
+        // killed -> notifications.ts bridges enqueuePendingNotification, and the main chat shows
+        // `Workflow "<name>" was stopped`. Staying on the panel would instead make the user miss the "stopped" feedback.
+        setConfirmKill(null);
+        onDone();
+        return;
+      } else if (confirmKill === 'agent' && focused) {
+        const agent = visibleAgents[clampedAgent];
+        if (agent) svc.killAgent(focused.runId, agent.id);
+      }
+      setConfirmKill(null);
+    },
+    confirmNo: () => setConfirmKill(null),
+  };
+  useWorkflowKeyboard(handlers, confirmKill !== null ? 'confirm' : 'normal');
+
+  const running = runs.filter(r => r.status === 'running').length;
+  const done = runs.length - running;
+  const phaseHeader = selectedPhaseTitle ?? ALL_PHASE;
+  const agentDone = focused ? focused.agents.filter(a => a.status === 'done').length : 0;
+  // Refresh the header duration every second (shared clock; subscribing triggers re-render, duration follows wall clock).
+  const [clockRef] = useAnimationFrame(1000);
+  const elapsed = focused ? Date.now() - focused.startedAt : 0;
+
+  return (
+    <Box ref={clockRef} flexDirection="column" borderStyle="round" borderColor="claude" paddingX={1}>
+      <Box justifyContent="space-between">
+        <Text bold>{focused?.workflowName ?? 'Workflows'}</Text>
+        {focused ? (
+          <Text color="subtle">
+            {agentDone}/{focused.agentCount} agents · {formatDuration(elapsed)} ·{' '}
+            <Text color={RUN_STATUS_COLOR[focused.status] as keyof Theme}>{RUN_STATUS_TEXT[focused.status]}</Text>
+          </Text>
+        ) : (
+          <Text color="subtle">
+            {running} running · {done} done
+          </Text>
+        )}
+      </Box>
+      {focused?.description ? <Text color="subtle">{focused.description}</Text> : null}
+
+      {runs.length > 1 ? (
+        <Box marginTop={1}>
+          <TabsBar runs={runs} activeRunId={activeRunId} />
+        </Box>
+      ) : null}
+
+      <Box flexDirection="row" marginTop={1}>
+        <Box width="25%" flexDirection="column">
+          <Text color={focusColumn === 'phases' ? 'claude' : 'subtle'} bold>
+            Phases
+          </Text>
+          <PhaseSidebar
+            phases={phases}
+            agents={focused?.agents ?? []}
+            selectedIndex={clampedPhase}
+            focused={focusColumn === 'phases'}
+          />
+        </Box>
+        <Text color="subtle">│</Text>
+        <Box flexGrow={1} flexDirection="column">
+          <Text color={focusColumn === 'agents' ? 'claude' : 'subtle'} bold>
+            {phaseHeader} · {visibleAgents.length} agents
+          </Text>
+          <AgentList agents={visibleAgents} selectedIndex={clampedAgent} focused={focusColumn === 'agents'} />
+        </Box>
+      </Box>
+
+      <Box marginTop={1}>
+        <Text color="subtle">
+          {confirmKill !== null
+            ? 'Confirm: y kill · n/Esc cancel'
+            : 'Tab switch run · ←/→ focus · ↑/↓ move · x kill agent · K kill workflow · r resume · q quit'}
+        </Text>
+      </Box>
+
+      {confirmKill !== null ? (
+        <Dialog
+          title={
+            confirmKill === 'workflow'
+              ? `Kill workflow "${focused?.workflowName ?? ''}"?`
+              : `Kill agent "${visibleAgents[clampedAgent]?.label ?? ''}"?`
+          }
+          subtitle={
+            confirmKill === 'workflow'
+              ? 'All in-flight agents will be aborted. Resume will replay from journal.'
+              : 'Only this agent aborts; other agents in the workflow keep running.'
+          }
+          onCancel={() => setConfirmKill(null)}
+          color="warning"
+        >
+          <Text color="subtle">Press y to confirm, or n/Esc to cancel.</Text>
+        </Dialog>
+      ) : null}
+    </Box>
+  );
+}
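The two pure helpers exported from this file, `clampSelected` and `isRunTerminatedTransition`, are easiest to read through their edge cases. The block below copies both functions so it runs standalone. The `RunStatus` union is taken from the keys of `STATUS_DOT` in status.ts, and the sample run ids are made up.

```typescript
type RunStatus = 'running' | 'completed' | 'failed' | 'killed';
type Focus = { runId: string; status: RunStatus } | null;

function clampSelected(selected: number, len: number): number {
  if (len === 0) return 0;
  const n = Math.trunc(selected);
  if (Number.isNaN(n) || n < 0) return 0;
  return Math.min(n, len - 1);
}

function isRunTerminatedTransition(prev: Focus, curr: Focus): boolean {
  if (!prev || !curr) return false;
  if (prev.runId !== curr.runId) return false;
  if (prev.status !== 'running') return false;
  return curr.status === 'completed' || curr.status === 'failed' || curr.status === 'killed';
}

// Out of range, negative, NaN, and empty-list inputs.
const clampExamples = [
  clampSelected(5, 3),
  clampSelected(-1, 3),
  clampSelected(Number.NaN, 3),
  clampSelected(2, 0),
];

const transitions = [
  // Panel opened on history: no previous focus.
  isRunTerminatedTransition(null, { runId: 'w1', status: 'completed' }),
  // Switched tabs to a run that had already finished.
  isRunTerminatedTransition({ runId: 'w1', status: 'running' }, { runId: 'w2', status: 'completed' }),
  // Same run went from running to a terminal state: the only case that auto-exits.
  isRunTerminatedTransition({ runId: 'w1', status: 'running' }, { runId: 'w1', status: 'failed' }),
  // Same run, already terminal on both sides.
  isRunTerminatedTransition({ runId: 'w1', status: 'completed' }, { runId: 'w1', status: 'completed' }),
];

console.log(clampExamples); // [2, 0, 0, 0]
console.log(transitions); // [false, false, true, false]
```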
--- a/src/workflow/panel/panelCall.tsx
+++ b/src/workflow/panel/panelCall.tsx
@@ -0,0 +1,16 @@
+import type { LocalJSXCommandCall } from '../../types/command.js';
+import { SentryErrorBoundary } from '../../components/SentryErrorBoundary.js';
+import { WorkflowsPanel } from './WorkflowsPanel.js';
+
+/**
+ * local-jsx call for /workflows: builds the panel element and returns it for Ink to render.
+ *
+ * Wrapped in SentryErrorBoundary: when useSyncExternalStore / listNamed / child components
+ * throw, the exception must not break through to the REPL top level and crash the whole session; the boundary falls back to a local error card.
+ * onDone/context are injected by the command runtime; args is unused (the panel has no parameterized behavior).
+ */
+export const call: LocalJSXCommandCall = async (onDone, context, _args) => (
+  <SentryErrorBoundary name="WorkflowsPanel">
+    <WorkflowsPanel onDone={onDone} context={context} />
+  </SentryErrorBoundary>
+);
--- a/src/workflow/panel/selectors.ts
+++ b/src/workflow/panel/selectors.ts
@@ -0,0 +1,71 @@
+import type { AgentProgress, RunProgress } from '../progress/store.js'
+import type { PhaseStatus } from './status.js'
+
+/** Title of the fixed "no filter" item (first row of the sidebar). */
+export const ALL_PHASE = 'All'
+
+/** Merged phase (including pending), with done/total counts of agents under that phase. */
+export type MergedPhase = {
+  title: string
+  status: PhaseStatus
+  done: number
+  total: number
+}
+
+/**
+ * Merge declaredPhases (declared by meta) and run.phases (actually running/done):
+ * - Declared order takes priority; phases present in actual but not declared are appended at the end.
+ * - No actual record -> pending; otherwise take the actual status.
+ * - done/total = done under that phase / total agents under that phase.
+ */
+export function mergePhases(
+  run: Pick<RunProgress, 'declaredPhases' | 'phases' | 'agents'>,
+): MergedPhase[] {
+  const actualByTitle = new Map(run.phases.map(p => [p.title, p]))
+  const seen = new Set<string>()
+  const out: MergedPhase[] = []
+  const push = (title: string): void => {
+    if (seen.has(title)) return
+    seen.add(title)
+    const actual = actualByTitle.get(title)
+    const status: PhaseStatus = !actual ? 'pending' : actual.status
+    const inPhase = run.agents.filter(a => a.phase === title)
+    out.push({
+      title,
+      status,
+      done: inPhase.filter(a => a.status === 'done').length,
+      total: inPhase.length,
+    })
+  }
+  for (const t of run.declaredPhases) push(t)
+  for (const p of run.phases) push(p.title)
+  return out
+}
+
+/**
+ * Filter agents by the selected phase.
+ * selectedPhase undefined or ALL_PHASE -> all.
+ */
+export function filterAgentsByPhase(
+  agents: AgentProgress[],
+  selectedPhase: string | undefined,
+): AgentProgress[] {
+  if (selectedPhase === undefined || selectedPhase === ALL_PHASE) return agents
+  return agents.filter(a => a.phase === selectedPhase)
+}
+
+/** tab label: workflow name + `#` + last 4 chars of runId (disambiguates same-name runs). */
+export function tabLabel(workflowName: string, runId: string): string {
+  return `${workflowName}#${runId.slice(-4)}`
+}
+
+/** milliseconds -> compact duration (<60s -> `Ns`; <60m -> `MmSSs`; otherwise `HhMMm`). Used by the panel header. */
+export function formatDuration(ms: number): string {
+  const s = Math.floor(ms / 1000)
+  if (s < 60) return `${s}s`
+  const m = Math.floor(s / 60)
+  const ss = s % 60
+  if (m < 60) return `${m}m${String(ss).padStart(2, '0')}s`
+  const h = Math.floor(m / 60)
+  return `${h}h${String(m % 60).padStart(2, '0')}m`
+}
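Reviewer note: the merge rule in `mergePhases` is easiest to check against a small worked case. The sketch below restates the same rule with simplified stand-in types (`Run`, `Agent` and `exampleRun` are illustrative names, not part of this PR) and is only meant to show the expected ordering and counts, not to replace the implementation above.

```typescript
// Simplified stand-ins for PhaseStatus / AgentProgress / RunProgress (assumptions for illustration).
type PhaseStatus = 'running' | 'done' | 'pending'
type ActualPhase = { title: string; status: 'running' | 'done' }
type Agent = { phase?: string; status: 'running' | 'done' }
type Run = { declaredPhases: string[]; phases: ActualPhase[]; agents: Agent[] }
type MergedPhase = { title: string; status: PhaseStatus; done: number; total: number }

// Same rule as mergePhases in the diff: declared order first, undeclared phases appended.
function mergePhases(run: Run): MergedPhase[] {
  const actualByTitle = new Map<string, ActualPhase>()
  for (const p of run.phases) actualByTitle.set(p.title, p)
  const seen = new Set<string>()
  const out: MergedPhase[] = []
  const push = (title: string): void => {
    if (seen.has(title)) return
    seen.add(title)
    const actual = actualByTitle.get(title)
    const inPhase = run.agents.filter(a => a.phase === title)
    out.push({
      title,
      status: actual ? actual.status : 'pending',
      done: inPhase.filter(a => a.status === 'done').length,
      total: inPhase.length,
    })
  }
  for (const t of run.declaredPhases) push(t)
  for (const p of run.phases) push(p.title)
  return out
}

// 'verify' is declared but not started; 'hotfix' started without being declared.
const exampleRun: Run = {
  declaredPhases: ['plan', 'build', 'verify'],
  phases: [
    { title: 'plan', status: 'done' },
    { title: 'build', status: 'running' },
    { title: 'hotfix', status: 'running' },
  ],
  agents: [
    { phase: 'plan', status: 'done' },
    { phase: 'plan', status: 'done' },
    { phase: 'build', status: 'done' },
    { phase: 'build', status: 'running' },
  ],
}

const merged = mergePhases(exampleRun)
// Order: plan, build, verify (pending, 0/0), hotfix (appended, 0/0).
console.log(merged.map(m => `${m.title}:${m.status}:${m.done}/${m.total}`).join(' '))
```

In this case the declared-but-unstarted phase stays in its declared slot as `pending`, while the undeclared phase is appended last.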
--- a/src/workflow/panel/status.ts
+++ b/src/workflow/panel/status.ts
@@ -0,0 +1,73 @@
+import type { AgentProgress, RunProgress } from '../progress/store.js'
+
+/** run status -> dot character (used by top tab). */
+export const STATUS_DOT: Record<RunProgress['status'], string> = {
+  running: '●',
+  completed: '✓',
+  failed: '✗',
+  killed: '■',
+}
+
+/** run status -> ink theme color token (follows existing WorkflowList palette). */
+export const RUN_STATUS_COLOR: Record<RunProgress['status'], string> = {
+  running: 'warning',
+  completed: 'success',
+  failed: 'error',
+  killed: 'subtle',
+}
+
+/** run status -> display text (used by header; aligns with reference image done/running). */
+export const RUN_STATUS_TEXT: Record<RunProgress['status'], string> = {
+  running: 'running',
+  completed: 'done',
+  failed: 'failed',
+  killed: 'killed',
+}
+
+/** merged phase status in the sidebar (includes pending: declared by meta but not started). */
+export type PhaseStatus = 'running' | 'done' | 'pending'
+
+export const PHASE_MARK: Record<PhaseStatus, string> = {
+  running: '●',
+  done: '✓',
+  pending: '○',
+}
+
+export const PHASE_COLOR: Record<PhaseStatus, string> = {
+  running: 'warning',
+  done: 'success',
+  pending: 'subtle',
+}
+
+/** visual for an agent row: mark character + color (running has the mark overridden by a spinner animation in UI). */
+export type AgentVisual = { mark: string; color: string }
+
+/**
+ * agent status -> visual.
+ * - running -> ● warning (UI overrides mark with spinner animation)
+ * - done·dead -> ✗ error
+ * - done·ok -> ✓ success
+ */
+export function agentVisual(a: AgentProgress): AgentVisual {
+  if (a.status === 'running') return { mark: '●', color: 'warning' }
+  if (a.resultKind === 'dead') return { mark: '✗', color: 'error' }
+  return { mark: '✓', color: 'success' }
+}
+
+/** token count -> display string (below 1000 the raw value is shown; otherwise thousands with one decimal + k, e.g. 12345 -> 12.3k; undefined or 0 -> '0'). */
+export function formatTokenCount(n: number | undefined): string {
+  if (!n) return '0'
+  return n >= 1000 ? `${(n / 1000).toFixed(1)}k` : String(n)
+}
+
+/**
+ * right-side stats text for an agent row: `model · Nk tok · N tool`.
+ * Omits the model segment when no model is set; the token/tool counts refresh in real time via agent_progress while the agent is running.
+ */
+export function agentMetaText(a: AgentProgress): string {
+  const parts: string[] = []
+  if (a.model) parts.push(a.model)
+  parts.push(`${formatTokenCount(a.tokenCount)} tok`)
+  parts.push(`${a.toolCount ?? 0} tool`)
+  return parts.join(' · ')
+}
--- a/src/workflow/panel/useWorkflowKeyboard.ts
+++ b/src/workflow/panel/useWorkflowKeyboard.ts
@@ -0,0 +1,145 @@
+import { useInput } from '@anthropic/ink'
+
+/** The column that currently has focus. */
+export type FocusColumn = 'phases' | 'agents'
+
+/** Keyboard mode: normal = regular navigation; confirm = a Dialog is open, waiting for the user's y/n confirmation. */
+export type WorkflowKeyboardMode = 'normal' | 'confirm'
+
+/** Subset of the useInput key object (only declares the fields we use, to avoid coupling to the ink Key type). */
+type KeyEvent = {
+  tab?: boolean
+  shift?: boolean
+  escape?: boolean
+  return?: boolean
+  leftArrow?: boolean
+  rightArrow?: boolean
+  upArrow?: boolean
+  downArrow?: boolean
+}
+
+/** key -> action (pure function, easy to unit test; no rendering dependencies). */
+export type WorkflowKeyAction =
+  | 'nextTab'
+  | 'prevTab'
+  | 'focusLeft'
+  | 'focusRight'
+  | 'moveUp'
+  | 'moveDown'
+  | 'killAgent'
+  | 'killWorkflow'
+  | 'resume'
+  | 'newRun'
+  | 'quit'
+  | 'confirmYes'
+  | 'confirmNo'
+
+export function routeWorkflowKey(
+  input: string,
+  key: KeyEvent,
+  mode: WorkflowKeyboardMode = 'normal',
+): WorkflowKeyAction | null {
+  // confirm mode: y/Y/Enter confirms, n/N/Esc/q cancels; all other keys are swallowed (prevents accidental keypresses)
+  if (mode === 'confirm') {
+    if (input === 'y' || input === 'Y' || key.return) return 'confirmYes'
+    if (input === 'n' || input === 'N' || key.escape || input === 'q') {
+      return 'confirmNo'
+    }
+    return null
+  }
+  // @anthropic/ink sets key.tab to true for the Tab key; some environments fall back to '\t'
+  if (key.tab || input === '\t') return key.shift ? 'prevTab' : 'nextTab'
+  if (key.escape || input === 'q') return 'quit'
+  // Capital K = kill the entire workflow; lowercase x = kill the currently selected agent (agents column only).
+  // Case distinction avoids x accidentally triggering workflow kill; K explicitly requires Shift, hinting at a "heavy operation".
+  if (input === 'K') return 'killWorkflow'
+  if (input === 'x') return 'killAgent'
+  if (input === 'r') return 'resume'
+  if (input === 'n') return 'newRun'
+  if (key.leftArrow) return 'focusLeft'
+  if (key.rightArrow) return 'focusRight'
+  if (key.upArrow) return 'moveUp'
+  if (key.downArrow) return 'moveDown'
+  return null
+}
+
+/** Focus model callbacks (injected by WorkflowsPanel). */
+export type WorkflowKeyboardHandlers = {
+  nextTab: () => void
+  prevTab: () => void
+  focusLeft: () => void
+  focusRight: () => void
+  moveUp: () => void
+  moveDown: () => void
+  /** Request killing the currently selected agent (panel pops a Dialog for secondary confirmation). */
+  killAgent: () => void
+  /** Request killing the entire workflow (panel pops a Dialog for secondary confirmation). */
+  killWorkflow: () => void
+  resumeFocused: () => void
+  newRun: () => void
+  quit: () => void
+  /** User confirms in confirm mode (y/Enter). */
+  confirmYes: () => void
+  /** User cancels in confirm mode (n/Esc/q). */
+  confirmNo: () => void
+}
+
+/**
+ * /workflows panel keybindings (focus rotation model):
+ * - Tab / Shift+Tab: switch the top run tab
+ * - Left / Right: switch focus between phases and agents
+ * - Up / Down: move within the currently focused column
+ * - x kill single agent · K kill the entire workflow (with Dialog secondary confirmation) · r resume · n new · q / Esc quit
+ *
+ * @param mode In confirm mode only y/Enter/n/Esc/q are accepted; all other keys are swallowed to avoid accidental navigation inside the confirmation dialog.
+ */
+export function useWorkflowKeyboard(
+  h: WorkflowKeyboardHandlers,
+  mode: WorkflowKeyboardMode = 'normal',
+): void {
+  useInput((input, key) => {
+    const action = routeWorkflowKey(input, key as KeyEvent, mode)
+    if (action === null) return
+    switch (action) {
+      case 'nextTab':
+        h.nextTab()
+        break
+      case 'prevTab':
+        h.prevTab()
+        break
+      case 'focusLeft':
+        h.focusLeft()
+        break
+      case 'focusRight':
+        h.focusRight()
+        break
+      case 'moveUp':
+        h.moveUp()
+        break
+      case 'moveDown':
+        h.moveDown()
+        break
+      case 'killAgent':
+        h.killAgent()
+        break
+      case 'killWorkflow':
+        h.killWorkflow()
+        break
+      case 'resume':
+        h.resumeFocused()
+        break
+      case 'newRun':
+        h.newRun()
+        break
+      case 'quit':
+        h.quit()
+        break
+      case 'confirmYes':
+        h.confirmYes()
+        break
+      case 'confirmNo':
+        h.confirmNo()
+        break
+    }
+  })
+}
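Reviewer note: the handlers above only request a kill; the panel is expected to open a Dialog and switch the keyboard to confirm mode. The panel-side wiring is not in this hunk, so the following is a hypothetical sketch of that mode switch as a pure reducer. `ConfirmState`, `ConfirmAction` and `reduceConfirm` are invented names for illustration and the real WorkflowsPanel may be structured differently.

```typescript
// Hypothetical panel-side state: which kill (if any) is awaiting confirmation.
type Mode = 'normal' | 'confirm'
type ConfirmState = {
  mode: Mode
  pending: 'agent' | 'workflow' | null
  executed: Array<'agent' | 'workflow'> // kills that were actually confirmed
}
type ConfirmAction = 'killAgent' | 'killWorkflow' | 'confirmYes' | 'confirmNo'

function reduceConfirm(state: ConfirmState, action: ConfirmAction): ConfirmState {
  switch (action) {
    case 'killAgent':
      // Request only: enter confirm mode, nothing is killed yet.
      return { ...state, mode: 'confirm', pending: 'agent' }
    case 'killWorkflow':
      return { ...state, mode: 'confirm', pending: 'workflow' }
    case 'confirmYes':
      if (state.pending === null) return state
      return {
        mode: 'normal',
        pending: null,
        executed: [...state.executed, state.pending],
      }
    case 'confirmNo':
      // Cancel: back to normal navigation, nothing executed.
      return { ...state, mode: 'normal', pending: null }
  }
}

const initial: ConfirmState = { mode: 'normal', pending: null, executed: [] }
const cancelled = reduceConfirm(reduceConfirm(initial, 'killWorkflow'), 'confirmNo')
const confirmed = reduceConfirm(reduceConfirm(initial, 'killAgent'), 'confirmYes')
console.log(cancelled.executed.length, confirmed.executed.join(','))
```

Keeping the request and the execution as separate actions is what lets `routeWorkflowKey` stay a pure key-to-action mapping.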
--- a/src/workflow/persistence.ts
+++ b/src/workflow/persistence.ts
@@ -0,0 +1,131 @@
+import { mkdir, readFile, readdir, rename, writeFile } from 'node:fs/promises'
+import { join } from 'node:path'
+import { getProjectRoot } from '../bootstrap/state.js'
+import { logForDebugging } from '../utils/debug.js'
+import type { ProgressBus } from './progress/bus.js'
+import type { ProgressStore, RunProgress } from './progress/store.js'
+
+/** Current schema version of state.json; add a migration chain when this version is bumped. */
+const SCHEMA_VERSION = 1
+const STATE_FILE = 'state.json'
+const STATE_TMP = 'state.json.tmp'
+
+/**
+ * Single source for runsDir: shares the same root as ports.ts journalStore (${projectRoot}/.claude/workflow-runs).
+ * Extracted as a function to avoid duplicating the path concatenation in ports.ts and the persistence logic, and to stay on the same root when entering a worktree/subdirectory.
+ * Tests do not monkey-patch this function (the Bun ESM module namespace is read-only); they inject a tmpdir through the runsDirProvider parameters instead.
+ */
+export function getRunsDir(): string {
+  return join(getProjectRoot(), '.claude', 'workflow-runs')
+}
+
+type StateFile = {
+  schemaVersion: number
+  run: RunProgress
+}
+
+/**
+ * Atomically overwrite the terminal RunProgress to <runsDir>/<runId>/state.json.
+ * Atomicity: writeFile(tmp) → rename(tmp, target); both paths are in the same directory, so the rename is atomic on POSIX filesystems. Worst case a stale tmp is left behind and the next write overwrites it.
+ * Best-effort on failure: IO exceptions are only logged as a warning and never thrown (the run has already reached its terminal state; a persistence failure only means it cannot be retrieved after a restart).
+ */
+export async function writeRunState(
+  runsDir: string,
+  run: RunProgress,
+): Promise<void> {
+  const dir = join(runsDir, run.runId)
+  const target = join(dir, STATE_FILE)
+  const tmp = join(dir, STATE_TMP)
+  const payload: StateFile = { schemaVersion: SCHEMA_VERSION, run }
+  try {
+    await mkdir(dir, { recursive: true })
+    await writeFile(tmp, JSON.stringify(payload), 'utf-8')
+    await rename(tmp, target)
+  } catch (e) {
+    logForDebugging(
+      `[workflow warn] writeRunState failed for ${run.runId}: ${(e as Error).message}`,
+    )
+  }
+}
+
+/**
+ * Read <runsDir>/<runId>/state.json with fault tolerance:
+ * - File missing or unreadable → null (caller treats it as a miss)
+ * - JSON parse failure → null (logs a warning, does not crash); structure mismatch or schemaVersion mismatch → null without logging
+ */
+export async function readRunState(
+  runsDir: string,
+  runId: string,
+): Promise<RunProgress | null> {
+  const target = join(runsDir, runId, STATE_FILE)
+  let raw: string
+  try {
+    raw = await readFile(target, 'utf-8')
+  } catch {
+    return null
+  }
+  try {
+    const parsed = JSON.parse(raw) as Partial<StateFile>
+    if (parsed.schemaVersion !== SCHEMA_VERSION) return null
+    const run = parsed.run
+    if (!run || typeof run !== 'object') return null
+    if (typeof run.runId !== 'string') return null
+    if (typeof run.status !== 'string') return null
+    return run as RunProgress
+  } catch (e) {
+    logForDebugging(
+      `[workflow warn] readRunState parse failed for ${runId}: ${(e as Error).message}`,
+    )
+    return null
+  }
+}
+
+/**
+ * Scan all subdirectories under runsDir, read each state.json, return a list of non-null RunProgress.
+ * - runsDir does not exist → empty array
+ * - A subdirectory without state.json (half-written run) → skip
+ * - A subdirectory whose state.json is corrupted → skip that single one, keep scanning the rest
+ * - Sort by updatedAt descending (consistent with store.list() ordering)
+ */
+export async function listPersistedRuns(
+  runsDir: string,
+): Promise<RunProgress[]> {
+  let entries: string[]
+  try {
+    entries = await readdir(runsDir)
+  } catch {
+    return []
+  }
+  const runs: RunProgress[] = []
+  for (const name of entries) {
+    const run = await readRunState(runsDir, name)
+    if (run) runs.push(run)
+  }
+  return runs.sort((a, b) => b.updatedAt - a.updatedAt)
+}
+
+/**
+ * Subscribe to the bus's run_done event and write the terminal RunProgress to state.json on disk.
+ * Covers all three terminal states (completed/failed/killed; shutdown-kill also routes to run_done killed).
+ * The store subscribes to the bus before this listener does, so by the time this listener runs store.get(runId) is already terminal.
+ * Returns an unsubscribe function (for test cleanup).
+ *
+ * Disk write is best-effort: writeRunState swallows IO exceptions and only logs, does not propagate —
+ * so other bus subscribers (store, etc.) are not affected by persistence failures.
+ *
+ * @param runsDirProvider Optional runsDir resolver (defaults to getRunsDir).
+ *   Production path uses the default; tests inject a tmpdir to avoid writing to the real project directory (Bun ESM module namespace is read-only,
+ *   cannot monkey-patch getRunsDir itself).
+ */
+export function attachRunStatePersistence(
+  bus: ProgressBus,
+  store: ProgressStore,
+  runsDirProvider: () => string = getRunsDir,
+): () => void {
+  return bus.subscribe(event => {
+    if (event.type !== 'run_done') return
+    const run = store.get(event.runId)
+    if (!run) return
+    void writeRunState(runsDirProvider(), run)
+  })
+}
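Reviewer note: the validation inside `readRunState` can be read as a pure function from file contents to `RunProgress | null`. The sketch below isolates that step so the accepted and rejected inputs are easy to see. `parseStateFile` and `PersistedRun` are hypothetical names, and only the `runId` and `status` fields are checked, mirroring the checks in the diff.

```typescript
const SCHEMA_VERSION = 1

// Minimal stand-in for RunProgress: only the fields that readRunState validates.
type PersistedRun = { runId: string; status: string }

// Returns the run when the payload has the expected shape, otherwise null.
function parseStateFile(raw: string): PersistedRun | null {
  let parsed: unknown
  try {
    parsed = JSON.parse(raw)
  } catch {
    return null // corrupted or truncated JSON
  }
  if (typeof parsed !== 'object' || parsed === null) return null
  const file = parsed as { schemaVersion?: unknown; run?: unknown }
  if (file.schemaVersion !== SCHEMA_VERSION) return null // unknown schema
  if (typeof file.run !== 'object' || file.run === null) return null
  const run = file.run as { runId?: unknown; status?: unknown }
  if (typeof run.runId !== 'string') return null
  if (typeof run.status !== 'string') return null
  return { runId: run.runId, status: run.status }
}

const ok = parseStateFile(
  JSON.stringify({ schemaVersion: 1, run: { runId: 'w1', status: 'completed' } }),
)
const wrongVersion = parseStateFile(
  JSON.stringify({ schemaVersion: 2, run: { runId: 'w1', status: 'completed' } }),
)
const garbage = parseStateFile('{"schemaVersion":1,"run":')
console.log(ok?.runId, wrongVersion, garbage)
```

Separating the parse step from the file read would also let it be unit-tested without touching the disk, though that is a design suggestion rather than something this PR does.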
--- a/src/workflow/ports.ts
+++ b/src/workflow/ports.ts
@@ -0,0 +1,202 @@
+import {
+  createFileJournalStore,
+  type ProgressEvent,
+  type WorkflowPorts,
+} from '@claude-code-best/workflow-engine'
+import { logForDebugging } from '../utils/debug.js'
+import { getProjectRoot } from '../bootstrap/state.js'
+import { getRunsDir } from './persistence.js'
+import {
+  type AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
+  logEvent,
+} from '../services/analytics/index.js'
+import {
+  completeWorkflowTask,
+  failWorkflowTask,
+  killWorkflowTask,
+  registerLocalWorkflowTask,
+} from '../tasks/LocalWorkflowTask/LocalWorkflowTask.js'
+import {
+  buildHostBundle,
+  makeHostHandle,
+  readHostBundle,
+  type WorkflowHostBundle,
+} from './hostHandle.js'
+import { buildRegistry } from './registry.js'
+import type { ProgressBus } from './progress/bus.js'
+import type { ProgressStore } from './progress/store.js'
+import type { SetAppState } from '../Task.js'
+import type { AssistantMessage } from '../types/message.js'
+
+type RunBinding = {
+  runId: string
+  taskId: string
+  setAppState: SetAppState
+  abortController: AbortController
+  workflowName: string
+  /** agentId → AbortController. Registered when backend starts an agent; killAgent uses it for precise abort. */
+  agentAbortControllers: Map<number, AbortController>
+}
+
+/** Constructs a WorkflowHostContext from toolUseContext on each tool invocation. */
+function makeHostFactory(): WorkflowPorts['hostFactory'] {
+  return ({ context, canUseTool, parentMessage }) => {
+    const ctx = context as WorkflowHostBundle['toolUseContext'] & {
+      agentId?: string
+    }
+    return {
+      handle: makeHostHandle(
+        buildHostBundle(
+          ctx,
+          canUseTool as WorkflowHostBundle['canUseTool'],
+          parentMessage as AssistantMessage | undefined,
+        ),
+      ),
+      // Use projectRoot rather than getCwd(): shares the same root as journalStore's runsDir,
+      // otherwise named workflow resolution and journal persistence diverge when the user
+      // enters a worktree/sub-directory. The engine's internal ctx.cwd is only used for
+      // resolution (scriptPath/name) and does not affect the agent's execution cwd
+      // (the agent gets its own cwd via the toolUseContext inside the host bundle).
+      cwd: getProjectRoot(),
+      budgetTotal: null, // turn-level budget injection point (read from settings in the future)
+      ...(ctx.toolUseId ? { toolUseId: ctx.toolUseId } : {}),
+    }
+  }
+}
+
+/**
+ * Assembles the complete WorkflowPorts. bus/store are passed in by the caller (shared via the service singleton).
+ * taskRegistrar maintains runId → RunBinding for kill routing.
+ */
+export function createWorkflowPorts(opts: {
+  bus: ProgressBus
+  store: ProgressStore
+}): WorkflowPorts {
+  const bindings = new Map<string, RunBinding>()
+  const runsDir = getRunsDir()
+  const registry = buildRegistry()
+
+  // Telemetry subscription (independent of store). LogEventMetadata only accepts boolean/number/undefined,
+  // and runId is a string — use the brand cast provided by the analytics module (verified non-code/path) to pass it through.
+  opts.bus.subscribe((e: ProgressEvent) => {
+    if (e.type === 'run_done') {
+      logEvent('tengu_workflow_done', {
+        status: e.status === 'completed' ? 0 : e.status === 'failed' ? 1 : 2,
+        runId:
+          e.runId as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
+      })
+    }
+  })
+
+  const taskRegistrar: WorkflowPorts['taskRegistrar'] = {
+    register(regOpts, host) {
+      const bundle = readHostBundle(host)
+      const setAppState =
+        bundle.toolUseContext.setAppStateForTasks ??
+        bundle.toolUseContext.setAppState
+      const abortController = new AbortController()
+      const taskId = registerLocalWorkflowTask(setAppState, {
+        description: regOpts.summary ?? regOpts.workflowName,
+        workflowName: regOpts.workflowName,
+        workflowFile: regOpts.workflowFile ?? '',
+        summary: regOpts.summary,
+        ...(regOpts.toolUseId ? { toolUseId: regOpts.toolUseId } : {}),
+        abortController,
+      })
+      const runId = regOpts.runId ?? taskId
+      bindings.set(runId, {
+        runId,
+        taskId,
+        setAppState,
+        abortController,
+        workflowName: regOpts.workflowName,
+        agentAbortControllers: new Map(),
+      })
+      logForDebugging(
+        `workflow task registered: ${runId} (${regOpts.workflowName})`,
+      )
+      return { runId, signal: abortController.signal }
+    },
+    complete(runId, summary) {
+      const b = bindings.get(runId)
+      if (!b) return
+      completeWorkflowTask(b.taskId, b.setAppState)
+      logForDebugging(`workflow ${runId} completed: ${summary ?? ''}`)
+      bindings.delete(runId)
+    },
+    fail(runId, error) {
+      const b = bindings.get(runId)
+      if (!b) return
+      failWorkflowTask(b.taskId, b.setAppState, error)
+      logForDebugging(`workflow ${runId} failed: ${error}`)
+      bindings.delete(runId)
+    },
+    kill(runId) {
+      const b = bindings.get(runId)
+      if (!b) return
+      killWorkflowTask(b.taskId, b.setAppState) // also aborts the task's internal abort controller
+      // Killing the run also aborts all in-flight agents (guards against the edge timing where the backend misses the task abort)
+      for (const ac of b.agentAbortControllers.values()) {
+        try {
+          ac.abort()
+        } catch {
+          // no-op: abort() is not expected to throw; the catch is purely defensive so the remaining agents are still aborted
+        }
+      }
+      b.agentAbortControllers.clear()
+      bindings.delete(runId)
+    },
+    registerAgentAbort(runId, agentId, ac) {
+      const b = bindings.get(runId)
+      if (!b) return
+      b.agentAbortControllers.set(agentId, ac)
+    },
+    unregisterAgentAbort(runId, agentId) {
+      const b = bindings.get(runId)
+      if (!b) return
+      b.agentAbortControllers.delete(agentId)
+    },
+    killAgent(runId, agentId) {
+      const b = bindings.get(runId)
+      if (!b) return false
+      const ac = b.agentAbortControllers.get(agentId)
+      if (!ac) return false
+      try {
+        ac.abort()
+      } catch {
+        // no-op
+      }
+      b.agentAbortControllers.delete(agentId)
+      return true
+    },
+    pendingAction() {
+      return null // v1: skip/retry not wired (seam retained)
+    },
+  }
+
+  return {
+    hostFactory: makeHostFactory(),
+    agentAdapterRegistry: registry,
+    agentRunner: {
+      // Dead-code fallback: hooks always go through agentAdapterRegistry (required on ports). Reaching here means the registry was not registered — fail-fast.
+      async runAgentToResult() {
+        throw new Error(
+          'workflow agentRunner fallback reached — agentAdapterRegistry must be set on ports',
+        )
+      },
+    },
+    progressEmitter: {
+      emit(event) {
+        opts.bus.emit(event) // → store reducer + telemetry
+      },
+    },
+    taskRegistrar,
+    journalStore: createFileJournalStore(runsDir),
+    permissionGate: { isAborted: () => false }, // engine uses ctx.signal to check abort
+    logger: {
+      debug: msg => logForDebugging(msg),
+      warn: msg => logForDebugging(`[workflow warn] ${msg}`),
+      event: name => logForDebugging(`workflow event: ${name}`),
+    },
+  }
+}
--- a/src/workflow/progress/bus.ts
+++ b/src/workflow/progress/bus.ts
@@ -0,0 +1,20 @@
+import type { ProgressEvent } from '@claude-code-best/workflow-engine'
+
+/** Typed progress event bus. engine progressEmitter.emit -> broadcasts to all subscribers (store / telemetry). */
+export type ProgressBus = {
+  emit(event: ProgressEvent): void
+  subscribe(listener: (event: ProgressEvent) => void): () => void
+}
+
+export function createProgressBus(): ProgressBus {
+  const listeners = new Set<(event: ProgressEvent) => void>()
+  return {
+    emit(event) {
+      for (const fn of listeners) fn(event)
+    },
+    subscribe(listener) {
+      listeners.add(listener)
+      return () => listeners.delete(listener)
+    },
+  }
+}
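Reviewer note: the persistence comments rely on the store's listener running before the persistence listener. With a `Set`-backed bus this follows from insertion-order iteration, which the small sketch below demonstrates. `createBus` is a generic stand-in written for this note (the event type is simplified to `{ type: string }`), not the `createProgressBus` from this file.

```typescript
type Listener<E> = (event: E) => void

// Generic stand-in for the progress bus: Set-backed, returns an unsubscribe function.
function createBus<E>() {
  const listeners = new Set<Listener<E>>()
  return {
    emit(event: E): void {
      // Set iteration follows insertion order, so earlier subscribers run first.
      listeners.forEach(fn => fn(event))
    },
    subscribe(listener: Listener<E>): () => void {
      listeners.add(listener)
      return () => {
        listeners.delete(listener)
      }
    },
  }
}

const bus = createBus<{ type: string }>()
const order: string[] = []

// Subscribed first, like the store reducer.
bus.subscribe(() => {
  order.push('store')
})
// Subscribed second, like the persistence listener.
const unsubscribePersistence = bus.subscribe(() => {
  order.push('persistence')
})

bus.emit({ type: 'run_done' })
unsubscribePersistence()
bus.emit({ type: 'run_done' })

console.log(order.join(','))
```

The ordering guarantee therefore depends on `createProgressStoreFromBus` being called before `attachRunStatePersistence`, as `getWorkflowService` does.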
--- a/src/workflow/progress/store.ts
+++ b/src/workflow/progress/store.ts
@@ -0,0 +1,200 @@
+import type { ProgressEvent } from '@claude-code-best/workflow-engine'
+import type { ProgressBus } from './bus.js'
+
+export type AgentProgress = {
+  /** Unique id stamped by the engine, precisely correlates started/done (fixes the old LIFO race condition). */
+  id: number
+  label?: string
+  phase?: string
+  status: 'running' | 'done'
+  resultKind?: string
+  /** Only meaningful when done·ok: output is an object -> 'object', otherwise -> 'text'. None for dead/skipped. */
+  outputShape?: 'text' | 'object'
+  /** The model id that was actually resolved (delivered by agent_done; absent while running). */
+  model?: string
+  /** Cumulative context tokens (live via agent_progress / final value settled by agent_done). */
+  tokenCount?: number
+  /** Cumulative tool-call count (live via agent_progress / final value settled by agent_done). */
+  toolCount?: number
+}
+
+export type RunProgress = {
+  runId: string
+  workflowName: string
+  status: 'running' | 'completed' | 'failed' | 'killed'
+  phases: Array<{ title: string; status: 'running' | 'done' }>
+  /** From run_started.meta.phases[].title; the panel uses this to show pending(○) phases. [] when no meta. */
+  declaredPhases: string[]
+  currentPhase: string | null
+  agents: AgentProgress[]
+  agentCount: number
+  returnValue?: unknown
+  error?: string
+  /** run_started timestamp (used by the panel to compute run duration). */
+  startedAt: number
+  /** workflow description (from run_started.meta.description). */
+  description?: string
+  updatedAt: number
+}
+
+export type ProgressStore = {
+  apply(event: ProgressEvent): void
+  list(): RunProgress[]
+  get(runId: string): RunProgress | undefined
+  /** Directly inject a run read from disk (bypassing bus); skips existing runId - in-memory takes priority. */
+  hydrate(run: RunProgress): void
+  /** For useSyncExternalStore: subscribe/getSnapshot pair; getSnapshot returns a stable reference (the same array when nothing changed). */
+  subscribe(listener: () => void): () => void
+  getSnapshot(): RunProgress[]
+}
+
+/** Build a reactive store from the bus: subscribe to the bus, reduce events, notify React subscribers. */
+export function createProgressStoreFromBus(bus: ProgressBus): ProgressStore {
+  const byId = new Map<string, RunProgress>()
+  let snapshot: RunProgress[] = []
+  const listeners = new Set<() => void>()
+
+  const notify = (): void => {
+    snapshot = [...byId.values()].sort((a, b) => b.updatedAt - a.updatedAt)
+    for (const fn of listeners) fn()
+  }
+
+  const ensure = (runId: string, workflowName: string): RunProgress => {
+    let p = byId.get(runId)
+    if (!p) {
+      p = {
+        runId,
+        workflowName,
+        status: 'running',
+        phases: [],
+        declaredPhases: [],
+        currentPhase: null,
+        agents: [],
+        agentCount: 0,
+        startedAt: Date.now(),
+        updatedAt: Date.now(),
+      }
+      byId.set(runId, p)
+    }
+    return p
+  }
+
+  const apply = (event: ProgressEvent): void => {
+    // log produces no visible state change (panel has no log view): early exit to avoid pointless snapshot rebuild and React re-render
+    if (event.type === 'log') return
+    const runId = event.runId
+    const p = ensure(
+      runId,
+      'workflowName' in event ? event.workflowName : 'workflow',
+    )
+    p.updatedAt = Date.now()
+    switch (event.type) {
+      case 'run_started':
+        p.workflowName = event.workflowName
+        p.status = 'running'
+        p.declaredPhases = event.meta?.phases?.map(ph => ph.title) ?? []
+        p.description = event.meta?.description ?? undefined
+        break
+      case 'phase_started':
+        if (!p.phases.some(ph => ph.title === event.phase)) {
+          p.phases.push({ title: event.phase, status: 'running' })
+        }
+        p.currentPhase = event.phase
+        break
+      case 'phase_done':
+        for (const ph of p.phases)
+          if (ph.title === event.phase) ph.status = 'done'
+        if (p.currentPhase === event.phase) p.currentPhase = null
+        break
+      case 'agent_started': {
+        let a = p.agents.find(x => x.id === event.agentId)
+        if (!a) {
+          a = {
+            id: event.agentId,
+            label: event.label,
+            phase: event.phase,
+            status: 'running',
+          }
+          p.agents.push(a)
+          p.agentCount = p.agents.length
+        } else {
+          a.status = 'running'
+          a.label = event.label
+          a.phase = event.phase
+        }
+        break
+      }
+      case 'agent_progress': {
+        // live progress: only updates the token/tool counts (emitted once per agent message, so the update rate stays bounded).
+        const ap = p.agents.find(x => x.id === event.agentId)
+        if (ap) {
+          ap.tokenCount = event.tokenCount
+          ap.toolCount = event.toolCount
+        }
+        break
+      }
+      case 'agent_done': {
+        let a = p.agents.find(x => x.id === event.agentId)
+        if (!a) {
+          a = {
+            id: event.agentId,
+            label: event.label,
+            phase: event.phase,
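+            status: 'done', resultKind: event.result.kind, // keep resultKind in sync with the else branch so a dead agent is not rendered as ✓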
+            ...(event.result.kind === 'ok'
+              ? {
+                  outputShape:
+                    typeof event.result.output === 'object' &&
+                    event.result.output !== null
+                      ? ('object' as const)
+                      : ('text' as const),
+                  tokenCount: event.result.tokenCount,
+                  toolCount: event.result.toolCount,
+                  model: event.result.model,
+                }
+              : {}),
+          }
+          p.agents.push(a)
+          p.agentCount = p.agents.length
+        } else {
+          a.status = 'done'
+          a.resultKind = event.result.kind
+          if (event.result.kind === 'ok') {
+            a.outputShape =
+              typeof event.result.output === 'object' &&
+              event.result.output !== null
+                ? 'object'
+                : 'text'
+            a.tokenCount = event.result.tokenCount
+            a.toolCount = event.result.toolCount
+            a.model = event.result.model
+          }
+        }
+        break
+      }
+      case 'run_done':
+        p.status = event.status
+        if (event.returnValue !== undefined) p.returnValue = event.returnValue
+        if (event.error !== undefined) p.error = event.error
+        break
+    }
+    notify()
+  }
+
+  bus.subscribe(apply)
+  return {
+    apply,
+    list: () => snapshot,
+    get: id => byId.get(id),
+    hydrate(run) {
+      if (byId.has(run.runId)) return
+      byId.set(run.runId, run)
+      notify()
+    },
+    subscribe: fn => {
+      listeners.add(fn)
+      return () => listeners.delete(fn)
+    },
+    getSnapshot: () => snapshot,
+  }
+}
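Reviewer note: two properties of the store matter to the panel: `getSnapshot` must return the same array reference until something changes (otherwise `useSyncExternalStore` re-renders on every read), and `hydrate` must not overwrite a run that is already in memory. The sketch below reproduces just those two behaviours with a reduced `Run` type; `createMiniStore` is an illustrative name and omits the event reducer entirely.

```typescript
// Reduced run shape: only what the sketch needs.
type Run = { runId: string; updatedAt: number; status: string }

function createMiniStore() {
  const byId = new Map<string, Run>()
  let snapshot: Run[] = []
  // Rebuild the snapshot only when state actually changes.
  const rebuild = (): void => {
    snapshot = Array.from(byId.values()).sort((a, b) => b.updatedAt - a.updatedAt)
  }
  return {
    hydrate(run: Run): void {
      if (byId.has(run.runId)) return // in-memory entry takes priority
      byId.set(run.runId, run)
      rebuild()
    },
    getSnapshot: (): Run[] => snapshot,
  }
}

const store = createMiniStore()
const empty = store.getSnapshot()
const stillEmpty = store.getSnapshot() // same reference: nothing changed

store.hydrate({ runId: 'w1', updatedAt: 10, status: 'running' })
const afterFirst = store.getSnapshot()

// Same runId read from disk later: ignored, the snapshot reference is unchanged.
store.hydrate({ runId: 'w1', updatedAt: 99, status: 'completed' })
const afterDuplicate = store.getSnapshot()

console.log(empty === stillEmpty, afterFirst === afterDuplicate, afterDuplicate[0]?.status)
```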
--- a/src/workflow/registry.ts
+++ b/src/workflow/registry.ts
@@ -0,0 +1,13 @@
+import { AgentAdapterRegistry } from '@claude-code-best/workflow-engine'
+import { claudeCodeBackend } from './backends/claudeCodeBackend.js'
+
+/**
+ * Build a multi-backend registry. v1 (depth B) only registers a single
+ * claude-code adapter as default, without prefilling routing rules — add
+ * .route(...) when extending with a second provider adapter.
+ */
+export function buildRegistry(): AgentAdapterRegistry {
+  const reg = new AgentAdapterRegistry()
+  reg.register(claudeCodeBackend).default('claude-code')
+  return reg
+}
--- a/src/workflow/service.ts
+++ b/src/workflow/service.ts
@@ -0,0 +1,314 @@
+import {
+  listNamedWorkflows,
+  parseScript,
+  persistInlineScript,
+  resolveNamedWorkflow,
+  runWorkflow,
+  WORKFLOW_DIR_NAME,
+  type WorkflowHostContext,
+  type WorkflowInput,
+  type WorkflowPorts,
+} from '@claude-code-best/workflow-engine'
+import { readFile } from 'node:fs/promises'
+import { join } from 'node:path'
+import { getProjectRoot } from '../bootstrap/state.js'
+import { logForDebugging } from '../utils/debug.js'
+import { buildHostBundle, makeHostHandle } from './hostHandle.js'
+import { installWorkflowNotifications } from './notifications.js'
+import {
+  attachRunStatePersistence,
+  getRunsDir,
+  listPersistedRuns,
+  readRunState,
+} from './persistence.js'
+import { createProgressBus } from './progress/bus.js'
+import {
+  createProgressStoreFromBus,
+  type ProgressStore,
+  type RunProgress,
+} from './progress/store.js'
+import { createWorkflowPorts } from './ports.js'
+import type { CanUseToolFn } from '../hooks/useCanUseTool.js'
+import type { ToolUseContext } from '../Tool.js'
+
+/**
+ * WorkflowService: the single entry shared by the tool (U7) and panel (U9).
+ *
+ * - `ports`: shared WorkflowPorts; tool descriptors are passed through to the engine.
+ * - `launch`: resolve the script → quick validation via parseScript → taskRegistrar.register (gets runId+signal)
+ *   → detached runWorkflow → on completion routes to complete/fail/kill.
+ * - `kill/listRuns/getRun/subscribe/listNamed`: auxiliary queries for panel and tool.
+ */
+export type WorkflowService = {
+  /** Shared ports (used by tool descriptors). */
+  ports: WorkflowPorts
+  /** Panel/tool launches a workflow: parse script → register → detached runWorkflow. */
+  launch(
+    input: Pick<
+      WorkflowInput,
+      | 'script'
+      | 'name'
+      | 'scriptPath'
+      | 'args'
+      | 'description'
+      | 'resumeFromRunId'
+      | 'title'
+      | 'maxConcurrency'
+    >,
+    toolUseContext: ToolUseContext,
+    canUseTool: CanUseToolFn,
+  ): Promise<{ runId: string; scriptPath?: string }>
+  kill(runId: string): void
+  /**
+   * Aborts a single agent (does not affect other agents in the same run; workflow keeps running).
+   * Returns whether an agent was actually aborted (false = the agent already finished or does not exist). An aborted agent yields dead → null.
+   */
+  killAgent(runId: string, agentId: number): boolean
+  /**
+   * Cleanup on process exit / config unload: kill all running runs to avoid orphan tasks.
+   * Completed/failed runs are unaffected. Idempotent — safe to call multiple times.
+   */
+  shutdown(): void
+  listRuns(): RunProgress[]
+  getRun(runId: string): RunProgress | undefined
+  /**
+   * Async lookup by runId: return on memory hit; on miss read state.json from disk (not injected into memory).
+   * Used by the "get historical return by runId" scenario; for panel display use loadPersistedRuns + listRuns.
+   */
+  getRunAsync(runId: string): Promise<RunProgress | undefined>
+  /**
+   * Scans the disk and hydrates state.json of all historical runs into the store (skips existing runIds).
+   * The process singleton only scans the disk once (persistedLoaded flag); repeated calls return immediately.
+   */
+  loadPersistedRuns(): Promise<void>
+  subscribe(listener: () => void): () => void
+  listNamed(workflowDir?: string): Promise<string[]>
+}
+
+let cached: WorkflowService | null = null
+
+/** Process singleton. Tool and panel share the same ports/registry/store. */
+export function getWorkflowService(): WorkflowService {
+  if (cached) return cached
+  const bus = createProgressBus()
+  const store = createProgressStoreFromBus(bus)
+  const ports = createWorkflowPorts({ bus, store })
+  const service = makeService(ports, store)
+  // Subscribe to run_done to write the terminal snapshot to disk (shared entry for completed/failed/killed; shutdown-kill also routes here).
+  // The store registers to the bus before this subscription, so when the listener runs store.get(runId) is already terminal.
+  attachRunStatePersistence(bus, store)
+  // Install the state-change notification bridge (commit 0768d4dc promised "auto-notify on completion" but the old implementation left it unfulfilled)
+  installWorkflowNotifications(service)
+  cached = service
+  return cached
+}
+
+/**
+ * Construct the service (inject ports + store).
+ *
+ * Production path uses {@link getWorkflowService}; tests use this function to inject fake ports directly,
+ * avoiding touching real getProjectRoot/getCwd/analytics and other module-level side effects.
+ *
+ * @param cwdOverride For tests only: inject a temp directory (avoids inline persistence writing to the real project directory).
+ * @param runsDirProvider For tests only: inject a tmpdir (Bun ESM module namespace is read-only, cannot monkey-patch getRunsDir).
+ */
+export function makeService(
+  ports: WorkflowPorts,
+  store: ProgressStore,
+  cwdOverride?: string,
+  runsDirProvider: () => string = getRunsDir,
+): WorkflowService {
+  const buildHost = (
+    toolUseContext: ToolUseContext,
+    canUseTool: CanUseToolFn,
+  ): WorkflowHostContext => ({
+    handle: makeHostHandle(buildHostBundle(toolUseContext, canUseTool)),
+    // Use projectRoot to stay in sync with ports.ts hostFactory / journalStore;
+    // entering a worktree/subdirectory will not desync named workflow resolution from journal persistence.
+    // cwdOverride is for tests only: inject a temp directory (avoids inline persistence writing to the real project directory).
+    cwd: cwdOverride ?? getProjectRoot(),
+    budgetTotal: null, // turn-level budget injection point (to be read from settings in the future)
+    toolUseId: toolUseContext.toolUseId,
+  })
+
+  async function resolveSource(input: {
+    script?: string
+    name?: string
+    scriptPath?: string
+  }): Promise<{
+    script: string
+    workflowFile?: string
+    workflowName: string
+  }> {
+    if (input.script) {
+      return { script: input.script, workflowName: 'workflow' }
+    }
+    if (input.scriptPath) {
+      return {
+        script: await readFile(input.scriptPath, 'utf-8'),
+        workflowFile: input.scriptPath,
+        workflowName: 'workflow',
+      }
+    }
+    if (input.name) {
+      const dir = join(getProjectRoot(), WORKFLOW_DIR_NAME)
+      const found = await resolveNamedWorkflow(dir, input.name)
+      if (!found) {
+        throw new Error(
+          `Named workflow "${input.name}" not found (looked in ${WORKFLOW_DIR_NAME}/)`,
+        )
+      }
+      return {
+        script: found.content,
+        workflowFile: found.path,
+        workflowName: input.name,
+      }
+    }
+    throw new Error('One of script, name, or scriptPath must be provided')
+  }
+
+  // Once-only flag for loadPersistedRuns: set to true on the first call so that subsequent calls return immediately.
+  // Reset on scan failure to allow a retry. The flag lives in the makeService closure, so each service (e.g. one built per test) starts fresh.
+  let persistedLoaded = false
+
+  return {
+    ports,
+
+    async launch(input, toolUseContext, canUseTool) {
+      const { script, workflowFile, workflowName } = await resolveSource(input)
+      try {
+        parseScript(script)
+      } catch (e) {
+        throw new Error(`Script validation failed: ${(e as Error).message}`)
+      }
+
+      const host = buildHost(toolUseContext, canUseTool)
+      const { runId, signal } = ports.taskRegistrar.register(
+        {
+          workflowName,
+          ...(workflowFile ? { workflowFile } : {}),
+          ...(input.description ? { summary: input.description } : {}),
+          ...(host.toolUseId ? { toolUseId: host.toolUseId } : {}),
+          ...(input.resumeFromRunId ? { runId: input.resumeFromRunId } : {}),
+        },
+        host.handle,
+      )
+
+      // Inline entry: persist the script to the run directory (symmetric with WorkflowTool) and return a reusable path.
+      // On write failure, log and continue: the run is not blocked because the script is already in memory.
+      let persistedScriptPath: string | undefined
+      if (!workflowFile && input.script) {
+        try {
+          persistedScriptPath = await persistInlineScript(
+            input.script,
+            runId,
+            host.cwd,
+          )
+        } catch (e) {
+          logForDebugging(
+            `workflow inline script persist failed: ${(e as Error).message}`,
+          )
+        }
+      }
+
+      // Detached: do not await, so the caller gets the runId immediately; the outcome is routed to the registrar on completion.
+      void runWorkflow({
+        script,
+        ...(input.args !== undefined ? { args: input.args } : {}),
+        runId,
+        workflowName,
+        ports,
+        host: host.handle,
+        signal,
+        cwd: host.cwd,
+        budgetTotal: host.budgetTotal,
+        ...(input.maxConcurrency !== undefined
+          ? { maxConcurrency: input.maxConcurrency }
+          : {}),
+        ...(input.resumeFromRunId ? { resume: true } : {}),
+      })
+        .then(result => {
+          if (result.status === 'completed') {
+            ports.taskRegistrar.complete(runId)
+          } else if (result.status === 'failed') {
+            ports.taskRegistrar.fail(runId, result.error ?? 'failed')
+          } else {
+            ports.taskRegistrar.kill(runId)
+          }
+        })
+        .catch(e => ports.taskRegistrar.fail(runId, (e as Error).message))
+
+      logForDebugging(`workflow launched: ${runId} (${workflowName})`)
+      return {
+        runId,
+        ...(persistedScriptPath ? { scriptPath: persistedScriptPath } : {}),
+      }
+    },
+
+    kill(runId) {
+      ports.taskRegistrar.kill(runId)
+    },
+    killAgent(runId, agentId) {
+      return ports.taskRegistrar.killAgent?.(runId, agentId) ?? false
+    },
+
+    shutdown() {
+      // Only kill runs that are still running: for completed/failed runs the taskRegistrar has already reclaimed the binding, so kill is a no-op.
+      // taskRegistrar.kill is also a safe no-op for unknown runIds, which makes shutdown idempotent: calling it repeatedly does not throw.
+      // Each kill is wrapped in its own try/catch: kill routes through setAppState, which can trigger a React re-render during process exit
+      // and throw (renderer already unmounted, etc.); a single failure should not block cleanup of the other runs.
+      for (const run of store.list()) {
+        if (run.status !== 'running') continue
+        try {
+          ports.taskRegistrar.kill(run.runId)
+        } catch (e) {
+          logForDebugging(
+            `workflow shutdown: kill ${run.runId} failed: ${(e as Error).message}`,
+          )
+        }
+      }
+    },
+
+    listRuns: () => store.list(),
+    getRun: id => store.get(id),
+    async getRunAsync(id) {
+      const mem = store.get(id)
+      if (mem) return mem
+      return (await readRunState(runsDirProvider(), id)) ?? undefined
+    },
+    async loadPersistedRuns() {
+      if (persistedLoaded) return
+      persistedLoaded = true
+      try {
+        const runs = await listPersistedRuns(runsDirProvider())
+        for (const run of runs) store.hydrate(run)
+      } catch (e) {
+        // Scan failure does not block the panel: log + reset flag to allow next retry
+        logForDebugging(
+          `[workflow warn] loadPersistedRuns failed: ${(e as Error).message}`,
+        )
+        persistedLoaded = false
+      }
+    },
+    subscribe: fn => store.subscribe(fn),
+
+    async listNamed(workflowDir) {
+      return listNamedWorkflows(
+        workflowDir ?? join(getProjectRoot(), WORKFLOW_DIR_NAME),
+      )
+    },
+  }
+}
+
+/** For tests: reset the singleton (avoid cross-case contamination). */
+export function __resetWorkflowServiceForTests(): void {
+  cached = null
+}
+
+/**
+ * Returns the already-instantiated service without creating one. Used on process exit / config unload to peek:
+ * if the workflow service was never used, cached is still null, which avoids creating the bus/ports as a side effect of the exit hook.
+ */
+export function peekWorkflowService(): WorkflowService | null {
+  return cached
+}
--- a/src/workflow/wiring.ts
+++ b/src/workflow/wiring.ts
@@ -0,0 +1,65 @@
+import {
+  createWorkflowTool,
+  workflowInputSchema,
+  WORKFLOW_TOOL_NAME,
+  type WorkflowToolDescriptor,
+} from '@claude-code-best/workflow-engine'
+import { buildTool, type Tool } from '../Tool.js'
+import { getWorkflowService } from './service.js'
+
+/**
+ * Adapts the engine's self-contained descriptor into a buildTool-compatible Tool.
+ * The descriptor routes through the service singleton (sharing ports/registry/store).
+ *
+ * Ports resolution is deferred to the first real method call (lazy): tools.ts calls
+ * createWorkflowToolCore() during module load (feature-gated), and resolving ports
+ * immediately would instantiate the service, which in turn triggers module-level
+ * side effects like getProjectRoot, yielding wrong paths before bootstrap completes.
+ * The Tool object itself is a singleton via the `cached` variable behind createWorkflowToolCore
+ * (PermissionRequest matches by reference), and the ports singleton is guaranteed by getWorkflowService.
+ */
+function buildWorkflowTool(): Tool {
+  let cachedDescriptor: WorkflowToolDescriptor | null = null
+  const descriptor = (): WorkflowToolDescriptor => {
+    if (!cachedDescriptor) {
+      const { ports } = getWorkflowService()
+      cachedDescriptor = createWorkflowTool(ports)
+    }
+    return cachedDescriptor
+  }
+  return buildTool({
+    name: WORKFLOW_TOOL_NAME,
+    maxResultSizeChars: 50_000,
+    inputSchema: workflowInputSchema,
+    isEnabled: () => descriptor().isEnabled(),
+    isReadOnly: input => descriptor().isReadOnly(input),
+    isConcurrencySafe: () => true,
+    async description() {
+      return descriptor().description()
+    },
+    async prompt() {
+      return descriptor().prompt()
+    },
+    async call(input, context, canUseTool, parentMessage, onProgress) {
+      const result = await descriptor().call(
+        input,
+        context,
+        canUseTool,
+        parentMessage,
+        onProgress,
+      )
+      return { data: result.data }
+    },
+    renderToolUseMessage: input => descriptor().renderToolUseMessage(input),
+    mapToolResultToToolResultBlockParam: (data, toolUseId) =>
+      descriptor().mapToolResultToToolResultBlockParam(data, toolUseId),
+  })
+}
+
+// Singleton: tools.ts registration and PermissionRequest must reference the same instance (switch matches by reference).
+let cached: Tool | null = null
+
+export function createWorkflowToolCore(): Tool {
+  if (!cached) cached = buildWorkflowTool()
+  return cached
+}