feat: dynamic-workflow 来了 (#1271)

* feat(workflow): add workflow engine, /workflows panel, /ultracode skill

将 feat/sdk-backend 分支中 workflow 相关的 20 个 commit 压缩为单 commit:

- 工作流引擎核心:phase / agent / parallel / pipeline 编排原语(packages/workflow-engine/)
- /workflows 面板:三区焦点布局(顶部 run tabs + 左侧 phase 侧栏 + 右侧 agent 列表)
- /ultracode skill:多 agent workflow 编排入口
- 进度存储 / journal / notification 系统
- WorkflowService 生命周期管理 + SentryErrorBoundary
- 脚本沙箱:禁用 dynamic import()、JSON args 防御性归一化
- journal 与 named-workflow 路径统一在 projectRoot
- 错误处理:parallel/pipeline hooks 错误日志、failure routing、semaphore abort
- workflow 工具升级为 core 工具 + PascalCase 命名

Co-Authored-By: glm-5.1 <zai-org@claude-code-best.win>

* feat(workflow): 复刻 ultracode 手册并修复 worktree/inline/opt-in 三处缺口

围绕 ultracode skill 审查 agent 系统一致性后:
- ultracode.ts: 用系统提示版完整 Workflow 编排手册替换中文精简版
- HIGH#1 isolation:'worktree': claudeCodeBackend.run() 用 createAgentWorktree +
  runWithCwdOverride 包裹 runAgent + finally 清理实现真正的 cwd 隔离;slug 用
  sha256(runId:agentId) 派生以匹配 cleanupStaleAgentWorktrees 清理正则
  (修 runId 为 w+base36 非 UUID 导致的泄漏盲区);worktree.ts 注释同步修正
- HIGH#2 inline 持久化: 新增 persistInlineScript,WorkflowTool + service 两条
  inline 路径对称持久化到 .claude/workflow-runs/<runId>/script.js,返回可复用
  scriptPath(闭环 inline→编辑→scriptPath 重提迭代循环)
- HIGH#3 opt-in 分工: ultracode/WorkflowTool/effort 注明 session reminder 由
  harness 注入,repo 内无 ultracode 信号,保持 feature('WORKFLOW_SCRIPTS') +
  isEnabled 两层 gate,不自造注入
- 测试: 新增 persistInline.test.ts;扩展 claudeCodeBackend(isolation 4 用例)/
  WorkflowTool(inline)/service(scriptPath)/ultracode(harness)

含配套 workflow engine/panel 完善与 run-state-persistence design doc。

Co-Authored-By: Claude <noreply@anthropic.com>

* feat(workflow): run 终态落盘 state.json 支持跨重启恢复

终态 RunProgress(含 returnValue/error)此前只在内存 ProgressStore,进程
重启即丢失。本次让其落盘到 .claude/workflow-runs/<runId>/state.json,使
(a) 重启后可按 runId 取 return、(b) /workflows 面板跨重启展示历史 run。
跨进程 resume 明确不在范围。

- persistence.ts: getRunsDir/writeRunState/readRunState/listPersistedRuns
  + attachRunStatePersistence;原子覆盖写(tmp+rename),读容错(缺文件/
  损坏/schemaVersion 不符 → null),写 best-effort(IO 失败只 log warn)
- progress/store.ts: 加 hydrate(run) 直接注入磁盘 run(已存在 runId 跳过,
  内存优先)
- service.ts: getWorkflowService() 接线 attachRunStatePersistence(bus,
  store) 订阅 run_done(completed/failed/killed 三态共用,shutdown-kill
  也走同路径,无需额外钩子);WorkflowService 加 getRunAsync(id) 内存
  miss→读盘 fallback(不注入内存)+ loadPersistedRuns() 扫盘 hydrate
  (persistedLoaded flag 守护幂等)
- panel/WorkflowsPanel.tsx: mount 时调一次 loadPersistedRuns(重 mount
  不重复)
- ports.ts: runsDir 改用 getRunsDir() 消除拼接重复
- 测试: persistence.test.ts(11)/runStatePersistence.test.ts(5)/
  progressStore(2)/service(5)/WorkflowsPanel(1) 共 24 个新测试;
  precheck 5629 pass / 0 fail

设计偏离: 计划原写 monkey-patch getRunsDir 指向 tmpdir,Bun ESM namespace
不可变不可行;改用可选 runsDirProvider 参数(默认 getRunsDir)DI 注入,
加到 attachRunStatePersistence 与 makeService(cwdOverride 之后第 4 参),
与现有 cwdOverride 模式一致。makeService 的 cwdOverride 保持不变,不破坏
inline 持久化特性。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(workflow): 默认并发降为 3 并支持 per-run maxConcurrency 注入

- DEFAULT_MAX_CONCURRENCY=3 替代旧的 min(16, cores-2);MAX_CONCURRENCY_CAP=16 保留为用户输入的绝对上限
- 新增 clampMaxConcurrency() 处理 undefined/<1/>CAP 边界
- WorkflowInput schema 新增 maxConcurrency: number.int().min(1).max(16).optional()
- 引擎层 context/runWorkflow 全链路透传:semaphore 容量来自 per-run 入参
- WorkflowTool prompt 增加指引:fan-out 场景先用 AskUserQuestion 与用户确认并发再启动
- 同步 ultracode skill + audit workflow spec 的并发文字(删 cpu-cores 公式)
- 同步 docs/features/workflow-scripts.md 旧公式

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(workflow): 面板 UI 字符串英文化

WorkflowsPanel 中 4 处面向用户的中文(onDone 错误消息、键位提示行)
改为英文;其他面板组件(AgentList/TabsBar)原本已是英文。代码注释
保留中文,与 workflow 模块惯例一致。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(workflow): 中断系统(x 杀单 agent / K 杀整个 workflow,Dialog 二次确认)

- claudeCodeBackend 桥接 ctx.signal → runAgent.override.abortController(修 'x' 无效根因:abort 到不了内部 fetch)
- AbortError 识别为 throw WorkflowAbortedError(不再吞成 dead,workflow 能感知被 kill)
- ports.taskRegistrar 加 registerAgentAbort/unregisterAgentAbort/killAgent;service.killAgent(runId, agentId) 精确中断
- 面板键位:'x' 杀当前 agent(agents 列聚焦时) / 'K' 杀整个 workflow;Dialog 二次确认 + confirm 模式吞导航键防误触
- 新增测试 8 项(backend signal bridge / hooks inject / ports killAgent / service killAgent)

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* docs(workflow): ultracode skill 加 model tier 选择指引(haiku/sonnet/opus/best 场景匹配)

补足 agent() 已有 model 参数缺的判断依据:列出 4 个 tier 的成本/延迟量级和典型场景,
明确"无法 articulate 为什么换 tier 就 omit"的 rule of thumb。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(workflow): maxConcurrency≠3 必须先 AskUserQuestion(默认 3 推荐值)

把 fan-out 时才问改成任何 maxConcurrency≠3 都必须问。
唯一例外:用户在当前会话已明确说过并发数("use 6" / "maxConcurrency 9")。
prompt (WorkflowTool.ts) + skill (ultracode.ts) + audit spec 三处同步。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(workflow): agent 失败自动重试一次(dead 或非 abort throw)

- hooks.agent 包装 invokeBackend:第一次 dead 或非 abort throw → 重试一次
- WorkflowAbortedError(kill)不重试——是用户意图
- registry.resolve 配置错(AdapterNotFoundError 等)在 try 外直接上抛,不走重试——
  配置问题重试无意义且掩盖 bug
- 重试仍失败:dead 保持 dead;throw 降级 dead(不击穿 workflow,
  与 parallel/pipeline null-on-error 契约一致)
- budget 不重复扣:dead 不 addOutputTokens,重试 ok 才扣一次
- 新增 7 项 hooks 层重试测试 + 1 项 service 层降级测试

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(workflow): 面板 label 截断保留 #数字 后缀(同 dim 多 finding 可区分)

audit workflow 用 verify:\${dim}#\${findingIdx} 命名 verify agent。
旧逻辑 slice(0, 18) 从右切把 #idx 全吃了——同 dimension 多 finding
肉眼无法区分。新逻辑:含 #数字 后缀时保留后缀,前缀截断 + … 省略号。

例:verify:correctness#0 → verify:correctn…#0
   verify:architecture#15 → verify:archite…#15

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(workflow): kill 整个 workflow 后立即回主 chat

run_done→store→notifications.ts 的通知路径已有,但 confirmYes 后面板继续
挂着挡住主 chat,用户看不到"已停止"反馈。kill 后调 onDone() 立即退出面板,
让主 chat 的 `Workflow "<name>" was stopped` 通知直接可见。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(workflow): agent dead 带 reason/detail + prompt 加压 StructuredOutput

12 agent audit workflow 8 个 dead,journal 只记 {kind:"dead"} 无信息,
事后无法区分 "agent 没产 StructuredOutput" vs "runAgent 抛错"。
证据指向主因:sonnet 长 tool chain 后忘记调 StructuredOutput,
extractStructuredOutput 返回 null 即降级 dead。

- types.ts: AgentRunResult.dead 加可选 reason/detail 字段
  (no-structured-output / runagent-threw / worktree-failed / unknown)
  兼容旧 journal(均 optional)。
- claudeCodeBackend.ts: 三处 dead 填 reason + detail;
  no-structured-output 把 finalized 文本前 200 字符做 detail,
  让日志/面板能立刻看到 agent 最后说了什么。
- claudeCodeBackend.ts: schema 模式 prompt 首尾各放一次
  StructuredOutput 强制要求,针对 sonnet 长 tool chain 后忘记收尾。
- hooks.ts: retry 日志带 reason;retry 仍 throw 时降级 dead 也填
  reason=runagent-threw + detail。
- types.test.ts: 加 reason JSON 往返 + 旧 journal 兼容测试。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(workflow): schema 模式弃用 StructuredOutput 工具契约,改鲁棒 JSON 文本解析

上一轮 70a2f76 把"agent 长 tool chain 后忘调 StructuredOutput"当作死因,
加 prompt 头尾双强制。但实测跑 5 个 review agent 4 个 dead,detail 全是
"StructuredOutput tool is not available as a deferred tool"——根因是
该工具从未注入 workflow sub-agent 的工具集(assembleToolPool 默认池不含,
只有 stop_hook 路径 execAgentHook.ts 显式 createStructuredOutputTool())。
prompt 反复要求调一个不可达的工具,agent 困扰、长篇辩解、最终没产 JSON。

- claudeCodeBackend.ts:
  - extractStructuredOutput 重写:括号栈扫描替代 indexOf/lastIndexOf,
    处理嵌套对象、字符串内的括号、转义符;新增 fenced code block
    优先路径(```json / ```),多 JSON 块取第一个 parse 成功的;
    只返回 plain object(拒 array/number/string/null)。不做语法修复
    (尾逗号/单引号/注释)——避免在字符串内误改(如 "http://" 被 // 注释正则吃)。
  - schema 模式 prompt 简化:删首尾双 STRUCTURED OUTPUT 强制(600+ token),
    改成指示 agent 在最后文本块 emit raw JSON;明确告知"StructuredOutput
    is not available in this environment",消除调用幻觉。
- hooks.ts: detail.slice 用 typeof === 'string' 守卫;catch 块用
  e instanceof Error ? e.message : String(e)(旧 journal / 第三方 adapter
  可能写非 string detail,直接 .slice 会抛 TypeError 击穿日志)。
- claudeCodeBackend.test.ts: +9 测试覆盖 fenced / 嵌套 / 字符串内括号 /
  转义引号 / 多块取首 / 类型守卫 / 损坏 JSON。

precheck: 5663 pass / 0 fail。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* docs(effort): 新增 /effort 交互面板设计 spec

设计要点:
- /effort 无参 → 横向 slider 面板(low/medium/high/xhigh/max/ultracode)
- ←/→ 移动光标,Enter 确认,Esc 取消
- ultracode 仅视觉占位,确认后提示走 /ultracode <context>
- env override 时双标记 + 顶部警告
- 模型不支持时面板禁用
- 两阶段交付:先基础面板 commit,再做 ultracode 波纹动画

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* docs(effort): 新增 EffortPanel 基础面板实施计划(第一阶段)

按 TDD 分 6 个 task:纯函数状态 → keybinding 注册 → 组件 → 命令挂载 → 分支测试 → precheck。
波纹动画在第二阶段单独 commit。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* docs(effort): plan 补 q/ctrl+c 取消绑定,对齐 spec §5 状态机

verifier 抓到的 gap:spec §5 写明 Esc / Ctrl+C / q 都是取消事件,
但 plan Task 2.3 只绑了 escape。补上 q 和 ctrl+c → effortPanel:cancel。
同时把 Step 2.2 直接写成 6 个 action 版本(home/end),删除迂回表达。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* docs(effort): plan 修订执行前 review 发现的 5 处 gap

- Task 3.3 EffortPanel.tsx 草稿:Faster/Smarter padEnd 语法错乱重写;
  useKeybindings import 路径从 @anthropic/ink 修正为 ../../keybindings/useKeybinding.js;
  移除冗余 renderSeparatorLine;保留 renderPaddedLine
- Task 5.2 computeConfirmOutcome 改为注入 ApplyFn 模式:
  避免 effortPanelState → effort.tsx → EffortPanel 循环依赖;
  测试可注入 mockApply,无需 mock settings
- Step 5.3 测试代码对齐注入版签名

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(effort): 新增 EffortPanel 纯函数状态模块(PanelPosition + 移动/初始光标)

仅含纯函数与类型,无 React/Ink 依赖,便于单测。
- PANEL_POSITIONS:low → medium → high → xhigh → max → ultracode
- moveLeft/moveRight:边界钳制(low 不再左移、ultracode 不再右移)
- getInitialCursor:env override > displayed level

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(keybindings): 注册 EffortPanel context 与 6 个 action

绑定 ←/→/h/l/home/end/enter/escape/q/ctrl+c 到 effortPanel:* action。
与 ModelPicker context 范式一致,避免左右键被全局 keybinding 拦截。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(effort): 实现 EffortPanel 组件主体(渲染 + 键盘交互 + 确认/取消分支)

- 横向 slider 布局:Faster ↔ Smarter 两极,6 档刻度
- useKeybindings 注册 EffortPanel context(←/→/h/l/home/end/enter/escape/q/ctrl+c)
- Enter 在 5 档之一 → 调 executeEffort 写 settings + AppState
- Enter 在 ultracode → 输出引导文案,不写状态
- Esc/q → "Effort unchanged."
- env override 时顶部黄色警告
- computeConfirmOutcome 注入 ApplyFn,便于测试(Task 5 补测试)

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(effort): /effort 无参时挂载 EffortPanel 交互面板

- 无参 → <EffortPanelWrapper> 透传 AppState.effortValue
- current/status → 仍显示文本(不变)
- 有参 → 直跳 executeEffort(不变)
- help/-h/--help → 不变

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* test(effort): 补 computeConfirmOutcome 分支测试(注入 mockApply)

- ultracode → kind=ultracode-hint,不调 applyFn
- low → kind=apply,message/effortUpdate 来自 applyFn
- applyFn 返回无 effortUpdate 时 outcome.effortUpdate 为 undefined
- CANCEL_MESSAGE / ULTRACODE_HINT 常量

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(effort): 测试里 cursor cast 为 EffortValue,避免 PanelPosition 含 ultracode 触发 TS 错误

computeConfirmOutcome 的 ApplyFn 契约要求 EffortValue,但测试 mockApply 接收 PanelPosition。
实际运行时 computeConfirmOutcome 在 ultracode 档位走 hint 分支不会调 applyFn,
cast 安全。precheck 全量通过:5688 tests / 0 fail。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(effort): 面板对齐与配色修复

- 对齐:用 Box width={SEGMENT} + justifyContent="center" 让 ▲ 与档位名严格居中对齐,
  替代之前 string padEnd(11) 与 SEGMENT=12 不一致导致的 1 列偏移
- 配色:所有面板文字改用 theme.claude(Claude Orange rgb(215,119,87)),
  替代终端默认紫;分隔线/副标签/底栏用 theme.subtle;env 警告用 theme.warning
- 光标档位的档位名也加粗,强化视觉焦点

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(effort): 面板文字改紫色,ULTRACODE_HINT 英文化

- 颜色:theme.claude(橙)→ theme.purple_FOR_SUBAGENTS_ONLY(Purple 600, rgb(147,51,234)),
  覆盖标题、Faster/Smarter、▲、档位名
- ULTRACODE_HINT:中文 → 英文
  "ultracode is not an effort level. Use /ultracode <context> to start a multi-agent workflow."

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(effort): 统一用色版——选中 suggestion(蓝),未选中 subtle(灰)

弃用 purple_FOR_SUBAGENTS_ONLY(subagent 专用)。改与项目其他面板一致:
- 选中档位 + ▲:color="suggestion"(Medium blue rgb(87,105,247))+ bold
- 未选中档位 + 空 ▲ 占位:color="subtle"(Light gray rgb(175,175,175))
- 标题 / Faster / Smarter:color="suggestion"
- 分隔线 / 副标签 / 底栏:color="subtle"

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(workflow): 终态前补发 phase_done,面板自动退出 running→terminal 转换

runWorkflow:脚本结束时 hook.phase 不会触发最后一个 phase 的 phase_done,
UI 左栏会永远显示 running。三路径(completed/killed/failed)统一在 run_done
之前补发 emitTerminalPhaseDone。

WorkflowsPanel:抽 isRunTerminatedTransition 纯函数判定 running → terminal,
面板 useEffect 检测到转换后自动退出聚焦。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(effort): 波纹动画纯函数 pickChar/computeRippleLine/mergeLayers + 18 测试

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(effort): useRippleFrame hook 包装 useAnimationFrame,按需订阅时钟

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(effort): EffortPanel 集成波纹背景——cursor 停在 ultracode 时切换波纹模式

仅在 cursor === 'ultracode' 时启用 useRippleFrame,渲染 5 行波纹背景
+ overlay 文字(Faster/Smarter、分隔线、▲、档位名、副标签)。
其余档位保持原 PlainContent 渲染路径不动。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* refactor(effort): 波纹动画从字符密度改为颜色渐变

按原版风格把波纹背景从 INTENSITY_CHARS 密度字符('·∙░▒▓')改为
suggestion 系颜色渐变(transparent → 暗深紫蓝 → suggestion → 高光):

rippleAnimation.ts:
- 删除 pickChar / INTENSITY_CHARS / WAVE_PEAK_CHARS / mergeLayers
- 新增 intensityToColor(intensity) → 'transparent' | '#xxxxxx'
- 新增 computeRippleCells 返回 Cell[](每位置 char+color)
- 新增 applyOverlaysToCells(cells, overlays) 替代 mergeLayers
- 新增 cellsToSegments(cells) 合并相邻同色段(减少 Text 节点)

EffortPanel.tsx:
- RippleContent 用 cells→segments→tokens 渲染
- 空格段用 BaseText backgroundColor 染色块(纯色块视觉)
- 文字段用 Text color 染色(亮色突出)
- tokens 按空格/文字二次拆分,避免混合段渲染歧义

测试: 29 个 rippleAnimation 测试覆盖 intensityToColor 边界、
computeRippleCells 长度/震源/衰减、applyOverlaysToCells 覆盖/截断/
防御式拷贝、cellsToSegments 合并逻辑。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(effort): 波纹参数调优——铺满左侧 + 速度调慢 + 全面板有底色

用户反馈三个问题:
1. "低峰部分没有颜色变化" → intensity ≤ 0.1 返回 transparent 导致波谷
   位置看不见。改为永不返回 transparent,最低档 #0a0d1a 作为面板
   底色(暗紫黑海洋),波峰在底色上流动。
2. "波浪速度太快" → time 系数 0.012 → 0.004(约 1/3 速)。波峰移动
   速度从 34 cell/s 降到 11 cell/s,每帧颜色变化从 45% 降到 36%。
3. "波浪只到中间部分,没覆盖左侧" → falloff 覆盖半径 40 → 90。
   震源 x=65,左侧 dist=65 < 90,波纹可达最左端(约 30-50% 覆盖)。

色阶调整:
- 删除 transparent 档,新增 #0a0d1a 作最暗档(底色)
- 最高档从 #8aa0ff(高光)改为 #5769F7(suggestion),避免与
  文字 overlay 同色互相吞噬
- 7 档颜色:#0a0d1a → #15182b → #1f2543 → #2a3360 →
  #3a4582 → #4a5bb0 → #5769F7

测试:删除 transparent 期望,改为期望具体颜色(#0a0d1a 等)。
新增"覆盖半径扩大"测试验证 dist=65 仍有非最暗颜色。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* fix(effort): 波纹 v3 — 去黑边 + 删中心高频涟漪 + y 轴覆盖快捷键行

用户反馈三个问题:
1. "黑色边感觉不太对" — 最暗档 #0a0d1a (rgb 10,13,26) 太接近纯黑,
   远端波谷看起来像硬黑边。改为 #1a1f3a (rgb 26,31,58),紫蓝感
   更强而非纯黑。
2. "中心的快速波纹有点奇怪" — 删除震源附近 dist<6 的高频涟漪叠加
   (time*0.02,5 倍主波纹频率)。原本想让震源附近"水波感"更强,
   实际效果像"快速闪烁"反而突兀。主波纹已经足够,无需叠加。
3. "y 方向覆盖快捷键" — RippleContent 新增 y=2 行渲染快捷键 overlay
   ("←/→ adjust · Enter confirm · Esc cancel")。PlainContent 路径
   保持原 Box marginTop=1 + Text 渲染。

色阶调整(紫蓝感更强):
- #1a1f3a (原 #0a0d1a) — 最暗档
- #1f2543 / #252c55 / #2e3870 / #3a4582 / #4a5bb0 / #5769F7
  (中间档略调亮度,保持平滑过渡)

测试:震源点测试更新为"time=0 时波谷最暗,time 推进后扫过波峰变亮",
反映删除高频涟漪后的纯主波纹行为。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* chore(workflow): 工作流相关代码中文文案全部英文化

源码(src/workflow/ + packages/workflow-engine/src/)的中文注释、
用户可见错误消息、字符串字面量;测试文件的标题与注释;同步 6 条
硬编码断言到英文化后的错误消息。

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* feat(effort): 波纹 v4 — 平滑波 + 全色环旋转 + 淡入淡出 + 宽度自适应

- 波函数改 (sin+1)/2:消除 max(0,sin) 平直暗带(约 6 行宽)
- 主色相连续旋转(0.03°/ms,12s/圈全色环):蓝→紫→品红→红→橙→黄→绿→青
- 文字 overlay 同步色相旋转(rotateHue 应用到 Faster/▲/档位名/分隔线/副标签)
- 淡入淡出动画:fadeColor/fadeCells + fade 状态机 ~300ms 进出过渡
- 副标签固定 ultracode 段下方,不跟随光标移动
- 顶部/底部各加一行纯波纹行,视觉一致
- 宽度自适应终端列数:窄则 72,宽则铺满(computeSegment/computeRippleSourceX)
- 快捷键改 plain Text,不参与波纹背景渲染
- 新增 18 测试(fadeColor/fadeCells/rotateHue/getHueShiftAtTime)

Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>

* refactor: remove CYBER_RISK_MITIGATION_REMINDER from FileReadTool

Co-Authored-By: deepseek-v4-pro <deepseek-ai@claude-code-best.win>

* fix: prevent ReDoS in extractMeta regex by anchoring to splice boundary

Co-Authored-By: deepseek-v4-pro <deepseek-ai@claude-code-best.win>

* chore: 更新脚本

---------

Co-authored-by: glm-5.1 <zai-org@claude-code-best.win>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: deepseek-v4-pro <deepseek-ai@claude-code-best.win>
This commit is contained in:
claude-code-best
2026-06-14 18:13:49 +08:00
committed by GitHub
parent 3e3e1de81b
commit 58ee6419b1
130 changed files with 23347 additions and 885 deletions

View File

@@ -0,0 +1,145 @@
import React, { useCallback, useMemo } from 'react';
import { Box, Text, useTheme } from '@anthropic/ink';
import { getTheme, type Theme } from 'src/utils/theme.js';
import { env } from 'src/utils/env.js';
import { shouldShowAlwaysAllowOptions } from 'src/utils/permissions/permissionsLoader.js';
import { logUnaryEvent } from 'src/utils/unaryLogging.js';
import { PermissionDialog } from 'src/components/permissions/PermissionDialog.js';
import { PermissionPrompt, type PermissionPromptOption } from 'src/components/permissions/PermissionPrompt.js';
import type { PermissionRequestProps } from 'src/components/permissions/PermissionRequest.js';
import { PermissionRuleExplanation } from 'src/components/permissions/PermissionRuleExplanation.js';
type OptionValue = 'yes' | 'yes-dont-ask-again' | 'no';
/**
* Permission request UI for the WorkflowTool. Asks the user to confirm
* executing a workflow script.
* Follows the MonitorPermissionRequest / FallbackPermissionRequest pattern.
*/
export function WorkflowPermissionRequest({
toolUseConfirm,
onDone,
onReject,
workerBadge,
}: PermissionRequestProps): React.ReactNode {
const [themeName] = useTheme();
const theme = getTheme(themeName);
const input = toolUseConfirm.input as {
workflow: string;
args?: string;
};
const showAlwaysAllowOptions = useMemo(() => shouldShowAlwaysAllowOptions(), []);
const options: PermissionPromptOption<OptionValue>[] = useMemo(() => {
const opts: PermissionPromptOption<OptionValue>[] = [
{
label: 'Yes',
value: 'yes',
feedbackConfig: { type: 'accept' as const },
},
];
if (showAlwaysAllowOptions) {
opts.push({
label: (
<Text>
Yes, and don{'\u2019'}t ask again for <Text bold>{toolUseConfirm.tool.name}</Text> commands
</Text>
),
value: 'yes-dont-ask-again',
});
}
opts.push({
label: 'No',
value: 'no',
feedbackConfig: { type: 'reject' as const },
});
return opts;
}, [showAlwaysAllowOptions, toolUseConfirm.tool.name]);
const handleSelect = useCallback(
(value: OptionValue, feedback?: string) => {
switch (value) {
case 'yes':
logUnaryEvent({
completion_type: 'tool_use_single',
event: 'accept',
metadata: {
language_name: 'none',
message_id: toolUseConfirm.assistantMessage.message.id ?? '',
platform: env.platform,
},
});
toolUseConfirm.onAllow(toolUseConfirm.input, [], feedback);
onDone();
break;
case 'yes-dont-ask-again':
logUnaryEvent({
completion_type: 'tool_use_single',
event: 'accept',
metadata: {
language_name: 'none',
message_id: toolUseConfirm.assistantMessage.message.id ?? '',
platform: env.platform,
},
});
toolUseConfirm.onAllow(toolUseConfirm.input, [
{
type: 'addRules',
rules: [{ toolName: toolUseConfirm.tool.name }],
behavior: 'allow',
destination: 'localSettings',
},
]);
onDone();
break;
case 'no':
logUnaryEvent({
completion_type: 'tool_use_single',
event: 'reject',
metadata: {
language_name: 'none',
message_id: toolUseConfirm.assistantMessage.message.id ?? '',
platform: env.platform,
},
});
toolUseConfirm.onReject(feedback);
onReject();
onDone();
break;
}
},
[toolUseConfirm, onDone, onReject],
);
const handleCancel = useCallback(() => {
logUnaryEvent({
completion_type: 'tool_use_single',
event: 'reject',
metadata: {
language_name: 'none',
message_id: toolUseConfirm.assistantMessage.message.id ?? '',
platform: env.platform,
},
});
toolUseConfirm.onReject();
onReject();
onDone();
}, [toolUseConfirm, onDone, onReject]);
return (
<PermissionDialog title="Workflow" workerBadge={workerBadge}>
<Box flexDirection="column" gap={1}>
<Box flexDirection="column">
<Text bold color={theme.permission as keyof Theme}>
Execute workflow: {input.workflow}
</Text>
{input.args && <Text dimColor>Arguments: {input.args}</Text>}
</Box>
<PermissionRuleExplanation permissionResult={toolUseConfirm.permissionResult} toolType="command" />
<PermissionPrompt<OptionValue> options={options} onSelect={handleSelect} onCancel={handleCancel} />
</Box>
</PermissionDialog>
);
}

View File

@@ -0,0 +1,197 @@
import { expect, test } from 'bun:test';
import { PassThrough } from 'node:stream';
import React from 'react';
import { wrappedRender as render } from '@anthropic/ink';
import { SentryErrorBoundary } from '../../components/SentryErrorBoundary.js';
import type { RunProgress } from '../progress/store.js';
import { call as panelCall } from '../panel/panelCall.js';
import { clampSelected, isRunTerminatedTransition, WorkflowsPanel } from '../panel/WorkflowsPanel.js';
import { truncateLabel } from '../panel/AgentList.js';
import { STATUS_DOT } from '../panel/status.js';
import { __resetWorkflowServiceForTests, getWorkflowService } from '../service.js';
// Pure function: clamp selection to valid range (same source as clampSelected inside the panel).
test('clampSelected: empty list → 0; out of bounds → last; negative/NaN → 0; normal → original', () => {
expect(clampSelected(5, 0)).toBe(0);
expect(clampSelected(5, 3)).toBe(2);
expect(clampSelected(-3, 3)).toBe(0);
expect(clampSelected(1, 3)).toBe(1);
expect(clampSelected(0, 1)).toBe(0);
// NaN (e.g. uninitialized state) safely falls back to 0
expect(clampSelected(Number.NaN, 3)).toBe(0);
});
// truncateLabel: short label as-is; with `#number` suffix keep suffix, truncate prefix + ellipsis;
// without suffix, cut from the right. Lets audit workflow's verify:${dim}#${idx} multi-finding still be distinguishable.
test('truncateLabel: short label as-is; with #number suffix keep suffix and truncate prefix; without suffix cut from right', () => {
// short label as-is
expect(truncateLabel('agent-1', 18)).toBe('agent-1');
expect(truncateLabel('review:bugs', 18)).toBe('review:bugs');
// exactly max length (boundary)
expect(truncateLabel('review:correctness', 18)).toBe('review:correctness');
// over max + with #number suffix: keep suffix, truncate prefix + ellipsis
expect(truncateLabel('verify:correctness#0', 18)).toBe('verify:correctn…#0');
expect(truncateLabel('verify:architecture#15', 18)).toBe('verify:archite…#15');
// multi-digit #idx also distinguishable
expect(truncateLabel('verify:correctness#2', 18)).toBe('verify:correctn…#2');
// without #number suffix: cut from right (legacy behavior)
expect(truncateLabel('a-very-long-label-no-suffix', 18)).toBe('a-very-long-label-');
});
// STATUS_DOT covers four states, all visible dot characters.
test('STATUS_DOT covers running/completed/failed/killed and is non-empty character', () => {
const statuses = ['running', 'completed', 'failed', 'killed'] as const;
for (const s of statuses) {
expect(STATUS_DOT[s]).toBeTruthy();
expect(STATUS_DOT[s].length).toBeGreaterThan(0);
}
});
// Progress data shape contract: fields read by the panel exist/are readable on a typical RunProgress,
// preventing silent panel render breakage from store.ts structural drift.
test('RunProgress field contract: keys read by panel all exist', () => {
const run: RunProgress = {
runId: 'r1',
workflowName: 'review',
status: 'running',
phases: [{ title: 'Find', status: 'done' }],
declaredPhases: ['Find', 'Review'],
currentPhase: 'Review',
agents: [{ id: 1, label: 'review:api', phase: 'Review', status: 'running' }],
agentCount: 1,
startedAt: 1,
updatedAt: 1,
};
// paths read by panel WorkflowList/Detail
expect(run.status).toBe('running');
expect(STATUS_DOT[run.status]).toBe('●');
expect(run.currentPhase).toBe('Review');
expect(run.agents.length).toBe(run.agentCount);
expect(run.phases[0]?.title).toBe('Find');
expect(run.phases[0]?.status).toBe('done');
expect(run.agents[0]?.label).toBe('review:api');
});
// Completed/failed shape: returnValue / error only shown when not running.
test('RunProgress completed/failed shape: returnValue/error optional', () => {
const completed: RunProgress = {
runId: 'r2',
workflowName: 'w',
status: 'completed',
phases: [],
declaredPhases: [],
currentPhase: null,
agents: [],
agentCount: 0,
returnValue: 'ok',
startedAt: 2,
updatedAt: 2,
};
const failed: RunProgress = {
runId: 'r3',
workflowName: 'w',
status: 'failed',
phases: [],
declaredPhases: [],
currentPhase: null,
agents: [],
agentCount: 0,
error: 'boom',
startedAt: 3,
updatedAt: 3,
};
expect(completed.returnValue).toBe('ok');
expect(completed.error).toBeUndefined();
expect(failed.error).toBe('boom');
expect(failed.returnValue).toBeUndefined();
expect(STATUS_DOT['completed']).toBe('✓');
expect(STATUS_DOT['failed']).toBe('✗');
});
// Fix M: useSyncExternalStore / listNamed / child component throwing should not break through REPL.
// panelCall must wrap WorkflowsPanel in SentryErrorBoundary.
test('panelCall wraps WorkflowsPanel in SentryErrorBoundary (fix M regression)', async () => {
const element = (await (panelCall as unknown as (a: unknown, b: unknown, c: unknown) => Promise<React.ReactNode>)(
() => {},
{ canUseTool: undefined },
'',
)) as React.ReactElement<{ name?: string; children: React.ReactNode }>;
expect(element.type).toBe(SentryErrorBoundary);
expect(element.props.name).toBe('WorkflowsPanel');
const child = element.props.children as React.ReactElement<{
onDone: () => void;
}>;
expect(child.type).toBe(WorkflowsPanel);
expect(React.isValidElement(child)).toBe(true);
expect(typeof child.props.onDone).toBe('function');
});
// ---- Task 6: panel mount triggers loadPersistedRuns once ----
// Verify that WorkflowsPanel mount calls svc.loadPersistedRuns() exactly once.
// The persistedLoaded flag inside service guards idempotency; re-render / re-mount does not repeat the call.
// Use a spy to replace the singleton's loadPersistedRuns, render to a PassThrough stream, wait for useEffect to trigger.
test('WorkflowsPanel mount triggers loadPersistedRuns once', async () => {
__resetWorkflowServiceForTests();
const svc = getWorkflowService();
let calls = 0;
const orig = svc.loadPersistedRuns.bind(svc);
svc.loadPersistedRuns = async () => {
calls++;
};
const stdout = new PassThrough();
// consume data to avoid buffer overflow (render writes multiple frames)
stdout.on('data', () => {});
let instance: { unmount: () => void; waitUntilExit: () => Promise<void> } | undefined;
try {
instance = await render(
React.createElement(WorkflowsPanel, {
onDone: () => {},
context: { canUseTool: undefined } as never,
}),
{ stdout: stdout as unknown as NodeJS.WriteStream, patchConsole: false },
);
// after mount useEffect triggers asynchronously; wait a tick for React commit + effect to complete
await new Promise(r => setTimeout(r, 30));
expect(calls).toBe(1);
} finally {
instance?.unmount();
svc.loadPersistedRuns = orig;
__resetWorkflowServiceForTests();
}
});
// When the focused run transitions from running to terminal, the panel auto onDone() (800ms delay lets the user see the terminal state).
// Only same-runId state transitions trigger: switching to a completed tab does not exit; opening history panel does not exit either.
// Transition detection logic is extracted into the isRunTerminatedTransition pure function for offline unit testing (Ink test mode does not
// auto-pump concurrent state updates, integration tests are unreliable).
test('isRunTerminatedTransition: same runId running → terminal triggers; other cases do not trigger', () => {
const running = { runId: 'r1', status: 'running' as const };
const completed = { runId: 'r1', status: 'completed' as const };
const failed = { runId: 'r1', status: 'failed' as const };
const killed = { runId: 'r1', status: 'killed' as const };
// same run running → terminal: all three terminal states trigger
expect(isRunTerminatedTransition(running, completed)).toBe(true);
expect(isRunTerminatedTransition(running, failed)).toBe(true);
expect(isRunTerminatedTransition(running, killed)).toBe(true);
// prev=null (open history panel): does not trigger
expect(isRunTerminatedTransition(null, completed)).toBe(false);
// curr=null (runs cleared): does not trigger
expect(isRunTerminatedTransition(running, null)).toBe(false);
// different runId (switch tab): does not trigger
expect(isRunTerminatedTransition({ runId: 'r1', status: 'running' }, { runId: 'r2', status: 'completed' })).toBe(
false,
);
// same run but prev not running (already terminal and re-rendered): does not trigger
expect(isRunTerminatedTransition(completed, completed)).toBe(false);
expect(isRunTerminatedTransition(killed, completed)).toBe(false);
// same run running → running (no change): does not trigger
expect(isRunTerminatedTransition(running, running)).toBe(false);
});

View File

@@ -0,0 +1,398 @@
import { expect, test, mock } from 'bun:test'
// Note: mock specifier must resolve to the same module that impl actually imports (bun mock.module
// matches by resolved module). impl uses '@claude-code-best/builtin-tools/...' and 'src/*' alias
// path imports, so the same specifier is used here.
mock.module(
'@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
() => ({
runAgent: async function* () {
yield {
type: 'assistant',
message: { content: [{ type: 'text', text: 'agent-text' }] },
}
},
}),
)
mock.module(
'@claude-code-best/builtin-tools/tools/AgentTool/agentToolUtils.js',
() => ({
finalizeAgentTool: () => ({
content: [{ type: 'text', text: 'agent-text' }],
usage: { output_tokens: 42 },
totalTokens: 42,
totalToolUseCount: 3,
}),
}),
)
mock.module(
'@claude-code-best/builtin-tools/tools/AgentTool/loadAgentsDir.js',
() => ({
isBuiltInAgent: () => true,
}),
)
mock.module('src/tools.js', () => ({ assembleToolPool: () => ({ tools: [] }) }))
mock.module('src/utils/messages.js', () => ({
// Return a shape that satisfies UserMessage consumers process-wide.
// Bun's mock.module is process-global (last-write-wins), so an incomplete
// mock here corrupts every later test that imports the real createUserMessage
// (e.g. bridgeMessaging.test.ts's `type !== 'user'` early-exit, or
// processSlashCommand.test.ts's `message.content` access). Mirror the real
// shape from src/utils/messages.ts: type + message envelope + passthrough.
createUserMessage: (
o: {
content: string
} & Record<string, unknown>,
) => ({
type: 'user' as const,
message: { role: 'user', content: o.content },
...o,
}),
extractTextContent: () => 'agent-text',
}))
mock.module('src/utils/uuid.js', () => ({ createAgentId: () => 'agent-1' }))
mock.module('src/services/analytics/index.js', () => ({ logEvent: () => {} }))
mock.module('src/utils/debug.js', () => ({ logForDebugging: () => {} }))
// isolation:'worktree' tests: mock worktree trio (to avoid actually running git worktree add).
// Note mock.module is process-global; worktreeState is defined outside the factory for test reset.
// Do not mock cwd.js: runWithCwdOverride actually running AsyncLocalStorage is harmless to mocked runAgent,
// and avoids polluting other tests in the same process that depend on pwd/getCwd.
const worktreeState = {
shouldThrow: false,
hasChanges: false,
created: [] as string[],
removed: [] as string[],
changesCalls: 0,
}
mock.module('src/utils/worktree.js', () => ({
createAgentWorktree: async (slug: string) => {
if (worktreeState.shouldThrow) throw new Error('wt boom')
worktreeState.created.push(slug)
return {
worktreePath: '/fake/wt',
worktreeBranch: 'wt-branch',
headCommit: 'abc123',
gitRoot: '/fake',
hookBased: false,
}
},
hasWorktreeChanges: async () => {
worktreeState.changesCalls++
return worktreeState.hasChanges
},
removeAgentWorktree: async (path: string) => {
worktreeState.removed.push(path)
return true
},
}))
import { WorkflowAbortedError } from '@claude-code-best/workflow-engine'
import {
claudeCodeBackend,
resolveAgentDefinition,
mapWorkflowModel,
extractStructuredOutput,
WORKFLOW_AGENT,
} from '../backends/claudeCodeBackend.js'
import { makeHostHandle } from '../hostHandle.js'
function ctx() {
return {
host: makeHostHandle({
toolUseContext: {
options: {
agentDefinitions: { activeAgents: [] },
querySource: 'workflow',
mainLoopModel: 'm',
},
getAppState: () => ({
toolPermissionContext: {
mode: 'acceptEdits',
alwaysAllowRules: {},
},
mcp: { tools: [] },
}),
} as never,
canUseTool: (() => Promise.resolve({ behavior: 'allow' })) as never,
// run() does not read parentMessage; use an empty object placeholder to satisfy the WorkflowHostBundle type.
parentMessage: {} as never,
}),
signal: new AbortController().signal,
runId: 'r1',
agentId: 1,
}
}
test('text agent → ok + token/tool/model accounting', async () => {
const res = await claudeCodeBackend.run({ prompt: 'do it' }, ctx())
expect(res.kind).toBe('ok')
if (res.kind === 'ok') {
expect(res.output).toBe('agent-text')
expect(res.usage.outputTokens).toBe(42)
// panel display fields: tokenCount(=totalTokens) / toolCount / model (fallback mainLoopModel 'm')
expect(res.tokenCount).toBe(42)
expect(res.toolCount).toBe(3)
expect(res.model).toBe('m')
}
})
test('isolation:worktree → create worktree + auto-cleanup on no changes; slug matches cleanup regex', async () => {
worktreeState.shouldThrow = false
worktreeState.hasChanges = false
worktreeState.created = []
worktreeState.removed = []
worktreeState.changesCalls = 0
const res = await claudeCodeBackend.run(
{ prompt: 'do', isolation: 'worktree' },
ctx(),
)
expect(res.kind).toBe('ok')
expect(worktreeState.created).toHaveLength(1)
// slug must match cleanupStaleAgentWorktrees cleanup regex ^wf_[0-9a-f]{8}-[0-9a-f]{3}-\d+$
expect(worktreeState.created[0]).toMatch(/^wf_[0-9a-f]{8}-[0-9a-f]{3}-\d+$/)
expect(worktreeState.changesCalls).toBe(1)
expect(worktreeState.removed).toHaveLength(1) // no changes → auto-remove
})
test('isolation:worktree has changes → keep worktree (no remove)', async () => {
worktreeState.hasChanges = true
worktreeState.created = []
worktreeState.removed = []
worktreeState.changesCalls = 0
const res = await claudeCodeBackend.run(
{ prompt: 'do', isolation: 'worktree' },
ctx(),
)
expect(res.kind).toBe('ok')
expect(worktreeState.removed).toHaveLength(0) // has changes → keep
expect(worktreeState.changesCalls).toBe(1)
})
test('isolation:worktree creation fails → fail-closed returns dead (does not silently degrade to shared cwd)', async () => {
worktreeState.shouldThrow = true
const res = await claudeCodeBackend.run(
{ prompt: 'do', isolation: 'worktree' },
ctx(),
)
expect(res.kind).toBe('dead')
worktreeState.shouldThrow = false
})
test('no isolation → no worktree created', async () => {
worktreeState.created = []
const res = await claudeCodeBackend.run({ prompt: 'do' }, ctx())
expect(res.kind).toBe('ok')
expect(worktreeState.created).toHaveLength(0)
})
test('runAgent throws → dead', async () => {
// override mock so runAgent throws (last-write-wins)
mock.module(
'@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
() => ({
// biome-ignore lint/correctness/useYield: intentionally throws to test dead branch (no yield)
runAgent: async function* () {
throw new Error('boom')
},
}),
)
const res = await claudeCodeBackend.run({ prompt: 'fail' }, ctx())
expect(res.kind).toBe('dead')
})
// The next three groups of tests cover the 'x' invalid fix: backend must bridge ctx.signal to runAgent.override
// .abortController, and recognize AbortError as abort (throw WorkflowAbortedError, not swallow as dead).
// Also verify registerAgentAbort injection so service.kill(runId, agentId) can precisely abort a single agent.
test('ctx.signal pre-abort → backend bridge: override.abortController.signal.aborted=true', async () => {
// use capturedOverride to expose the agentAbort created by backend (the override.abortController received by mock)
let capturedController: AbortController | undefined
mock.module(
'@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
() => ({
runAgent: async function* (opts: {
override?: { abortController?: AbortController }
}) {
capturedController = opts.override?.abortController
yield {
type: 'assistant',
message: { content: [{ type: 'text', text: 'x' }] },
}
},
}),
)
const parentAbort = new AbortController()
parentAbort.abort()
// mock does not throw → backend takes the normal return path; but the bridge `if (ctx.signal.aborted) agentAbort.abort()`
// has already triggered synchronously, capturedController.signal.aborted must be true (root cause of kill bridge)
await claudeCodeBackend.run(
{ prompt: 'pre-aborted' },
{ ...ctx(), signal: parentAbort.signal },
)
expect(capturedController?.signal.aborted).toBe(true)
})
test('runAgent throws AbortError → backend throws WorkflowAbortedError (not swallowed as dead)', async () => {
mock.module(
'@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
() => ({
// biome-ignore lint/correctness/useYield: intentionally throws AbortError to test recognition branch
runAgent: async function* () {
const e = new Error('aborted by parent')
e.name = 'AbortError'
throw e
},
}),
)
await expect(
claudeCodeBackend.run({ prompt: 'abort' }, ctx()),
).rejects.toBeInstanceOf(WorkflowAbortedError)
})
test('registerAgentAbort/unregisterAgentAbort injection: key=ctx.agentId (number), controller from bridge', async () => {
// restore default mock (previous test changed it to throw AbortError)
mock.module(
'@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
() => ({
runAgent: async function* () {
yield {
type: 'assistant',
message: { content: [{ type: 'text', text: 'agent-text' }] },
}
},
}),
)
const registered: Array<{ id: number; controller: AbortController }> = []
const unregistered: number[] = []
await claudeCodeBackend.run(
{ prompt: 'wiring' },
{
...ctx(),
agentId: 42,
registerAgentAbort: (id, ac) => registered.push({ id, controller: ac }),
unregisterAgentAbort: id => unregistered.push(id),
},
)
expect(registered).toHaveLength(1)
expect(registered[0]?.id).toBe(42) // engine numeric agentId (not coreAgentId string)
expect(registered[0]?.controller).toBeInstanceOf(AbortController)
expect(unregistered).toEqual([42]) // finally cleanup idempotent
})
test('id and capabilities shape', () => {
expect(claudeCodeBackend.id).toBe('claude-code')
expect(claudeCodeBackend.capabilities.structuredOutput).toBe(true)
expect(claudeCodeBackend.capabilities.tools).toBe(true)
})
test('resolveAgentDefinition: no agentType → WORKFLOW_AGENT fallback', () => {
const tuc = {
options: { agentDefinitions: { activeAgents: [] } },
} as never
expect(resolveAgentDefinition(undefined, tuc)).toBe(WORKFLOW_AGENT)
})
test('resolveAgentDefinition: hits activeAgents', () => {
const fake = { agentType: 'Explore', permissionMode: 'plan' } as never
const tuc = {
options: { agentDefinitions: { activeAgents: [fake] } },
} as never
expect(resolveAgentDefinition('Explore', tuc)).toBe(fake)
// miss still falls back
expect(resolveAgentDefinition('Nope', tuc)).toBe(WORKFLOW_AGENT)
})
test('mapWorkflowModel passthrough', () => {
expect(mapWorkflowModel(undefined)).toBeUndefined()
expect(mapWorkflowModel('claude-haiku-*')).toBe('claude-haiku-*')
})
test('extractStructuredOutput: valid JSON extracted; invalid returns null', () => {
expect(
extractStructuredOutput([
{ type: 'text', text: 'prefix {"a":1,"b":2} suffix' },
]),
).toEqual({ a: 1, b: 2 })
expect(
extractStructuredOutput([{ type: 'text', text: 'no json here' }]),
).toBeNull()
expect(extractStructuredOutput([])).toBeNull()
})
test('extractStructuredOutput: fenced code block (strip fence + strip language tag)', () => {
expect(
extractStructuredOutput([
{
type: 'text',
text: 'Here are the findings:\n```json\n{"findings":[{"title":"x"}]}\n```\nDone.',
},
]),
).toEqual({ findings: [{ title: 'x' }] })
// no language tag
expect(
extractStructuredOutput([{ type: 'text', text: '```\n{"a":1}\n```' }]),
).toEqual({ a: 1 })
})
test('extractStructuredOutput: nested object (bracket-balanced scan; legacy indexOf/lastIndexOf would cross-block concat)', () => {
const text = 'Result: {"outer":{"inner":{"deep":true}},"n":3} trailing'
expect(extractStructuredOutput([{ type: 'text', text }])).toEqual({
outer: { inner: { deep: true } },
n: 3,
})
})
test('extractStructuredOutput: brackets inside strings are not counted as pairing', () => {
// } inside a string does not zero out depth, scan can skip to the real pairing }
const text = '{"note":"this } char is in a string","ok":true}'
expect(extractStructuredOutput([{ type: 'text', text }])).toEqual({
note: 'this } char is in a string',
ok: true,
})
})
test('extractStructuredOutput: escaped quotes do not break string boundary', () => {
const text = '{"escaped":"he said \\"hi\\"","n":1}'
expect(extractStructuredOutput([{ type: 'text', text }])).toEqual({
escaped: 'he said "hi"',
n: 1,
})
})
test('extractStructuredOutput: multiple JSON blocks → return first parse success', () => {
// first one unbalanced (no pairing }), skip to the second
const text = 'broken { stuff\n{"real":1}\n{"ignored":2}'
expect(extractStructuredOutput([{ type: 'text', text }])).toEqual({ real: 1 })
})
test('extractStructuredOutput: array / number / string / null do not count as object', () => {
expect(
extractStructuredOutput([{ type: 'text', text: '[1,2,3]' }]),
).toBeNull()
expect(extractStructuredOutput([{ type: 'text', text: '42' }])).toBeNull()
expect(
extractStructuredOutput([{ type: 'text', text: '"raw string"' }]),
).toBeNull()
expect(extractStructuredOutput([{ type: 'text', text: 'null' }])).toBeNull()
})
test('extractStructuredOutput: multiple text blocks → cross-block find first success', () => {
expect(
extractStructuredOutput([
{ type: 'text', text: 'no json' },
{ type: 'text', text: '```json\n{"k":"v"}\n```' },
]),
).toEqual({ k: 'v' })
})
test('extractStructuredOutput: broken JSON returns null (does not throw)', () => {
expect(
extractStructuredOutput([
{ type: 'text', text: '{broken: missing quotes}' },
]),
).toBeNull()
expect(
extractStructuredOutput([{ type: 'text', text: '{"a":1,}' }]), // trailing comma — no syntax repair
).toBeNull()
})

View File

@@ -0,0 +1,176 @@
import { describe, expect, test } from 'bun:test'
import type { RunProgress } from '../progress/store.js'
import type { WorkflowService } from '../service.js'
function makeMockService(runs: RunProgress[]): {
service: WorkflowService
emit: () => void
setRuns: (runs: RunProgress[]) => void
} {
let current = runs
const listeners = new Set<() => void>()
return {
service: {
ports: {},
launch: async () => ({ runId: 'x' }),
kill: () => {},
listRuns: () => current,
getRun: () => undefined,
subscribe: (fn: () => void) => {
listeners.add(fn)
return () => {
listeners.delete(fn)
}
},
listNamed: async () => [],
} as unknown as WorkflowService,
emit: () => {
for (const fn of listeners) fn()
},
setRuns: r => {
current = r
},
}
}
function makeRun(
runId: string,
status: RunProgress['status'],
overrides: Partial<RunProgress> = {},
): RunProgress {
return {
runId,
workflowName: 'wf',
status,
phases: [],
declaredPhases: [],
currentPhase: null,
agents: [],
agentCount: 0,
startedAt: Date.now(),
updatedAt: Date.now(),
...overrides,
}
}
describe('installWorkflowNotifications', () => {
test('running → completed triggers notification (incl. workflow name)', async () => {
const { installWorkflowNotifications } = await import('../notifications.js')
const { service, emit, setRuns } = makeMockService([
makeRun('r1', 'running'),
])
const calls: string[] = []
const unsubscribe = installWorkflowNotifications(service, msg =>
calls.push(msg),
)
// first emit: listener records initial running state, no notification
emit()
expect(calls.length).toBe(0)
setRuns([makeRun('r1', 'completed')])
emit()
expect(calls.length).toBe(1)
expect(calls[0]).toMatch(/task-notification/)
expect(calls[0]).toMatch(/completed successfully/)
expect(calls[0]).toMatch(/"wf"/)
unsubscribe()
})
test('running → failed triggers notification, includes error text', async () => {
const { installWorkflowNotifications } = await import('../notifications.js')
const { service, emit, setRuns } = makeMockService([
makeRun('r1', 'running'),
])
const calls: string[] = []
installWorkflowNotifications(service, msg => calls.push(msg))
emit() // record initial running
setRuns([makeRun('r1', 'failed', { error: 'agent X boom' })])
emit()
expect(calls.length).toBe(1)
expect(calls[0]).toMatch(/failed/)
expect(calls[0]).toMatch(/agent X boom/)
})
test('running → killed triggers notification', async () => {
const { installWorkflowNotifications } = await import('../notifications.js')
const { service, emit, setRuns } = makeMockService([
makeRun('r1', 'running'),
])
const calls: string[] = []
installWorkflowNotifications(service, msg => calls.push(msg))
emit() // record initial running
setRuns([makeRun('r1', 'killed')])
emit()
expect(calls.length).toBe(1)
expect(calls[0]).toMatch(/was stopped/)
})
test('first time seeing run (no prev) does not notify (avoid notifying historical runs on startup)', async () => {
const { installWorkflowNotifications } = await import('../notifications.js')
const { service, emit, setRuns } = makeMockService([])
const calls: string[] = []
installWorkflowNotifications(service, msg => calls.push(msg))
// first emit after startup, sees r1 already completed — should not notify (not a transition from running)
setRuns([makeRun('r1', 'completed')])
emit()
expect(calls.length).toBe(0)
})
test('running → running does not notify', async () => {
const { installWorkflowNotifications } = await import('../notifications.js')
const { service, emit, setRuns } = makeMockService([
makeRun('r1', 'running'),
])
const calls: string[] = []
installWorkflowNotifications(service, msg => calls.push(msg))
emit() // record initial running
setRuns([makeRun('r1', 'running', { agentCount: 1 })])
emit()
expect(calls.length).toBe(0)
})
test('already completed run emitting again does not repeat notification', async () => {
const { installWorkflowNotifications } = await import('../notifications.js')
const { service, emit, setRuns } = makeMockService([
makeRun('r1', 'running'),
])
const calls: string[] = []
installWorkflowNotifications(service, msg => calls.push(msg))
emit() // record initial running
setRuns([makeRun('r1', 'completed')])
emit()
expect(calls.length).toBe(1)
emit()
expect(calls.length).toBe(1)
})
test('after unsubscribe no more notifications', async () => {
const { installWorkflowNotifications } = await import('../notifications.js')
const { service, emit, setRuns } = makeMockService([
makeRun('r1', 'running'),
])
const calls: string[] = []
const unsubscribe = installWorkflowNotifications(service, msg =>
calls.push(msg),
)
emit() // record initial running
unsubscribe()
setRuns([makeRun('r1', 'completed')])
emit()
expect(calls.length).toBe(0)
})
})

View File

@@ -0,0 +1,199 @@
import { expect, test } from 'bun:test'
import {
mkdir,
mkdtemp,
readFile,
readdir,
rm,
writeFile as fsWriteFile,
} from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'
import {
getRunsDir,
listPersistedRuns,
readRunState,
writeRunState,
} from '../persistence.js'
import type { RunProgress } from '../progress/store.js'
function makeRun(over: Partial<RunProgress> = {}): RunProgress {
return {
runId: 'r1',
workflowName: 'w',
status: 'completed',
phases: [],
declaredPhases: [],
currentPhase: null,
agents: [],
agentCount: 0,
startedAt: 1000,
updatedAt: 2000,
...over,
} as RunProgress
}
test('writeRunState → readRunState round-trip consistent (returnValue is object)', async () => {
const dir = await mkdtemp(join(tmpdir(), 'wf-'))
try {
const run = makeRun({
returnValue: { confirmedCount: 2, items: ['a', 'b'] },
})
await writeRunState(dir, run)
const got = await readRunState(dir, 'r1')
expect(got).not.toBeNull()
expect(got!.runId).toBe('r1')
expect(got!.returnValue).toEqual({ confirmedCount: 2, items: ['a', 'b'] })
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('readRunState missing file → null', async () => {
const dir = await mkdtemp(join(tmpdir(), 'wf-'))
try {
const got = await readRunState(dir, 'never-exists')
expect(got).toBeNull()
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('readRunState corrupt JSON → null', async () => {
const dir = await mkdtemp(join(tmpdir(), 'wf-'))
try {
await mkdir(join(dir, 'rX'), { recursive: true })
await fsWriteFile(join(dir, 'rX', 'state.json'), '{not valid json', 'utf-8')
const got = await readRunState(dir, 'rX')
expect(got).toBeNull()
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('readRunState schemaVersion mismatch → null', async () => {
const dir = await mkdtemp(join(tmpdir(), 'wf-'))
try {
await mkdir(join(dir, 'rX'), { recursive: true })
await fsWriteFile(
join(dir, 'rX', 'state.json'),
JSON.stringify({ schemaVersion: 999, run: makeRun({ runId: 'rX' }) }),
'utf-8',
)
const got = await readRunState(dir, 'rX')
expect(got).toBeNull()
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('writeRunState atomic write: no tmp residue after success', async () => {
const dir = await mkdtemp(join(tmpdir(), 'wf-'))
try {
await writeRunState(dir, makeRun({ runId: 'rAtom' }))
const sub = await readdir(join(dir, 'rAtom'))
expect(sub).toContain('state.json')
expect(sub).not.toContain('state.json.tmp')
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('listPersistedRuns scans multiple subdirs, skips dirs without state.json, sorts by updatedAt desc', async () => {
const dir = await mkdtemp(join(tmpdir(), 'wf-'))
try {
// three valid runs + one half-broken dir with only journal, no state.json
await writeRunState(dir, makeRun({ runId: 'old', updatedAt: 1000 }))
await writeRunState(dir, makeRun({ runId: 'mid', updatedAt: 2000 }))
await writeRunState(dir, makeRun({ runId: 'new', updatedAt: 3000 }))
await mkdir(join(dir, 'half-broken'), { recursive: true })
const runs = await listPersistedRuns(dir)
expect(runs.map(r => r.runId)).toEqual(['new', 'mid', 'old'])
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('listPersistedRuns scans a corrupt state.json → skip that single one, continue scanning the rest', async () => {
const dir = await mkdtemp(join(tmpdir(), 'wf-'))
try {
await writeRunState(dir, makeRun({ runId: 'good' }))
await mkdir(join(dir, 'bad'), { recursive: true })
await fsWriteFile(join(dir, 'bad', 'state.json'), 'corrupt', 'utf-8')
const runs = await listPersistedRuns(dir)
expect(runs.map(r => r.runId)).toEqual(['good'])
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('writeRunState does not throw when returnValue is null/string/array', async () => {
const dir = await mkdtemp(join(tmpdir(), 'wf-'))
try {
await writeRunState(dir, makeRun({ runId: 'n', returnValue: null }))
await writeRunState(dir, makeRun({ runId: 's', returnValue: 'text' }))
await writeRunState(dir, makeRun({ runId: 'a', returnValue: [1, 2, 3] }))
expect((await readRunState(dir, 'n'))!.returnValue).toBeNull()
expect((await readRunState(dir, 's'))!.returnValue).toBe('text')
expect((await readRunState(dir, 'a'))!.returnValue).toEqual([1, 2, 3])
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('writeRunState overwrite: same runId second write overwrites old content', async () => {
const dir = await mkdtemp(join(tmpdir(), 'wf-'))
try {
await writeRunState(dir, makeRun({ runId: 'rOV', status: 'running' }))
await writeRunState(dir, makeRun({ runId: 'rOV', status: 'completed' }))
const got = await readRunState(dir, 'rOV')
expect(got!.status).toBe('completed')
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('writeRunState writes full AgentProgress (no output content, includes label/phase/token etc.)', async () => {
const dir = await mkdtemp(join(tmpdir(), 'wf-'))
try {
const run = makeRun({
runId: 'rAg',
agents: [
{
id: 1,
label: 'review:hooks',
phase: 'Review',
status: 'done',
outputShape: 'object',
tokenCount: 12345,
toolCount: 3,
model: 'claude-sonnet-4-6',
},
],
agentCount: 1,
})
await writeRunState(dir, run)
const got = await readRunState(dir, 'rAg')
expect(got!.agents).toHaveLength(1)
expect(got!.agents[0]).toEqual({
id: 1,
label: 'review:hooks',
phase: 'Review',
status: 'done',
outputShape: 'object',
tokenCount: 12345,
toolCount: 3,
model: 'claude-sonnet-4-6',
})
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('getRunsDir returns <projectRoot>/.claude/workflow-runs shape', () => {
const dir = getRunsDir()
// do not hard-code projectRoot (differs across machines), only check suffix structure
expect(dir.endsWith(`${join('.claude', 'workflow-runs')}`)).toBe(true)
})

View File

@@ -0,0 +1,198 @@
import { expect, test } from 'bun:test'
// Note: this test does not mock bootstrap/state, utils/cwd, analytics, debug.
// Reason: mock.module is process-global (last-write-wins); mocking these common modules would pollute
// other tests in the same process (e.g. src/commands/__tests__/autonomy.test.ts imports the real
// bootstrap/state via its dependency chain). ports can resolve getProjectRoot/getCwd normally in the test env,
// logEvent/logForDebugging are silent no-ops when sink is not attached, no need to mock.
import { buildRegistry } from '../registry.js'
import { createWorkflowPorts } from '../ports.js'
import { createProgressBus } from '../progress/bus.js'
import { createProgressStoreFromBus } from '../progress/store.js'
import { getProjectRoot } from '../../bootstrap/state.js'
import type { SetAppState } from '../../Task.js'
import type { AppState } from '../../state/AppState.tsx'
test('buildRegistry registers claude-code as default and resolve hits', () => {
const reg = buildRegistry()
expect(reg.has('claude-code')).toBe(true)
expect(reg.resolve({ prompt: 'x' }).id).toBe('claude-code')
expect(reg.resolve({ prompt: 'x', agentType: 'whatever' }).id).toBe(
'claude-code',
)
})
test('createWorkflowPorts assembles full ports (incl. agentAdapterRegistry and progressEmitter→bus)', () => {
const bus = createProgressBus()
const store = createProgressStoreFromBus(bus)
const ports = createWorkflowPorts({ bus, store })
expect(ports.agentAdapterRegistry).toBeDefined()
expect(ports.agentAdapterRegistry!.resolve({ prompt: 'x' }).id).toBe(
'claude-code',
)
expect(typeof ports.taskRegistrar.register).toBe('function')
expect(typeof ports.taskRegistrar.kill).toBe('function')
expect(typeof ports.hostFactory).toBe('function')
// agentRunner fallback fields still exist (WorkflowPorts required)
expect(ports.agentRunner).toBeDefined()
expect(typeof ports.agentRunner.runAgentToResult).toBe('function')
// progressEmitter via bus → store: emit a run_started, store can see it
ports.progressEmitter.emit({
type: 'run_started',
runId: 't',
workflowName: 'w',
meta: null,
})
expect(store.get('t')?.workflowName).toBe('w')
})
test('taskRegistrar.register/complete/kill routes via RunBinding (real setAppState, no mock)', () => {
const bus = createProgressBus()
const store = createProgressStoreFromBus(bus)
const ports = createWorkflowPorts({ bus, store })
// real setAppState: use a local AppState object to hold tasks, registerTask goes through the real code path.
const state = { tasks: {} } as unknown as AppState
const setAppState: SetAppState = f => {
Object.assign(state, f(state))
}
const hostCtx = ports.hostFactory({
context: {
agentId: 'a-1',
toolUseId: 'tu-1',
setAppState,
},
canUseTool: (() => Promise.resolve({ behavior: 'allow' })) as never,
parentMessage: {} as never,
})
const { runId, signal } = ports.taskRegistrar.register(
{
workflowName: 'wf',
summary: 'summary',
workflowFile: 'wf.ts',
toolUseId: 'tu-1',
},
hostCtx.handle,
)
expect(typeof runId).toBe('string')
expect(signal).toBeInstanceOf(AbortSignal)
// complete/fail/kill do not throw (RunBinding hit)
expect(() => ports.taskRegistrar.complete(runId, 'done')).not.toThrow()
expect(() => ports.taskRegistrar.kill(runId)).not.toThrow()
// unknown runId safe no-op
expect(() => ports.taskRegistrar.complete('nope')).not.toThrow()
expect(ports.taskRegistrar.pendingAction('nope')).toBeNull()
// after terminal state binding is reclaimed: calling complete on the same runId again should be safe no-op (no throw, no repeated call to workflow task fn)
ports.taskRegistrar.complete(runId)
ports.taskRegistrar.kill(runId)
})
// agent-level kill bridge: register → killAgent precisely aborts; kill(runId) aborts all agents.
test('taskRegistrar agentAbortControllers: register/killAgent precise abort; kill(runId) batch abort', () => {
const bus = createProgressBus()
const store = createProgressStoreFromBus(bus)
const ports = createWorkflowPorts({ bus, store })
// impl always provides these — cast flattens optional to required (avoids per-line ! assertion)
const tr = ports.taskRegistrar as Required<typeof ports.taskRegistrar>
const state = { tasks: {} } as unknown as AppState
const setAppState: SetAppState = f => {
Object.assign(state, f(state))
}
const hostCtx = ports.hostFactory({
context: { agentId: 'a-1', toolUseId: 'tu-1', setAppState },
canUseTool: (() => Promise.resolve({ behavior: 'allow' })) as never,
parentMessage: {} as never,
})
const { runId } = tr.register(
{
workflowName: 'wf',
summary: 'summary',
workflowFile: 'wf.ts',
toolUseId: 'tu-1',
},
hostCtx.handle,
)
// register AbortController for two agents (simulating backend calling when launching agent)
const ac1 = new AbortController()
const ac2 = new AbortController()
tr.registerAgentAbort(runId, 1, ac1)
tr.registerAgentAbort(runId, 2, ac2)
expect(ac1.signal.aborted).toBe(false)
expect(ac2.signal.aborted).toBe(false)
// killAgent precisely aborts agent #1: only ac1 aborts, ac2 unaffected
expect(tr.killAgent(runId, 1)).toBe(true)
expect(ac1.signal.aborted).toBe(true)
expect(ac2.signal.aborted).toBe(false)
// repeat kill on same agent: controller already deleted, returns false (idempotent)
expect(tr.killAgent(runId, 1)).toBe(false)
// unknown agentId / unknown runId safe returns false
expect(tr.killAgent(runId, 999)).toBe(false)
expect(tr.killAgent('nope', 1)).toBe(false)
// kill(runId) batch aborts remaining agent (ac2)
tr.kill(runId)
expect(ac2.signal.aborted).toBe(true)
// after run terminal state binding is reclaimed: killAgent returns false
expect(tr.killAgent(runId, 2)).toBe(false)
})
test('unregisterAgentAbort deletes from Map (backend finally cleanup idempotent)', () => {
const bus = createProgressBus()
const store = createProgressStoreFromBus(bus)
const ports = createWorkflowPorts({ bus, store })
const tr = ports.taskRegistrar as Required<typeof ports.taskRegistrar>
const state = { tasks: {} } as unknown as AppState
const setAppState: SetAppState = f => {
Object.assign(state, f(state))
}
const hostCtx = ports.hostFactory({
context: { agentId: 'a-1', toolUseId: 'tu-1', setAppState },
canUseTool: (() => Promise.resolve({ behavior: 'allow' })) as never,
parentMessage: {} as never,
})
const { runId } = tr.register(
{
workflowName: 'wf',
summary: 'summary',
workflowFile: 'wf.ts',
toolUseId: 'tu-1',
},
hostCtx.handle,
)
const ac = new AbortController()
tr.registerAgentAbort(runId, 5, ac)
// after unregister killAgent has no target, returns false (does not throw)
tr.unregisterAgentAbort(runId, 5)
expect(tr.killAgent(runId, 5)).toBe(false)
// repeat unregister idempotent (backend finally does not throw)
expect(() => tr.unregisterAgentAbort(runId, 5)).not.toThrow()
// unknown runId safe no-op
expect(() => tr.unregisterAgentAbort('nope', 5)).not.toThrow()
})
test('hostFactory.cwd and journalStore share root (getProjectRoot) — fix K regression', () => {
// historical bug: hostFactory.cwd used getCwd(), journalStore used getProjectRoot(),
// when user enters worktree/subdirectory the two differ → named workflow resolution and journal persist out of sync.
// After fix both use projectRoot, this test locks-in that choice, preventing regression.
const bus = createProgressBus()
const store = createProgressStoreFromBus(bus)
const ports = createWorkflowPorts({ bus, store })
const hostCtx = ports.hostFactory({
context: { agentId: 'a', toolUseId: 'tu' },
canUseTool: (() => Promise.resolve({ behavior: 'allow' })) as never,
parentMessage: {} as never,
})
expect(hostCtx.cwd).toBe(getProjectRoot())
})

View File

@@ -0,0 +1,23 @@
import { expect, test, mock } from 'bun:test'
import { createProgressBus } from '../progress/bus.js'
test('emit broadcasts to all subscribers', () => {
const bus = createProgressBus()
const a = mock(() => {})
const b = mock(() => {})
bus.subscribe(a)
bus.subscribe(b)
const ev = { type: 'log' as const, runId: 'r', message: 'hi' }
bus.emit(ev)
expect(a).toHaveBeenCalledTimes(1)
expect(b).toHaveBeenCalledWith(ev)
})
test('subscribe returns unsubscribe', () => {
const bus = createProgressBus()
const fn = mock(() => {})
const unsub = bus.subscribe(fn)
unsub()
bus.emit({ type: 'log', runId: 'r', message: 'x' })
expect(fn).not.toHaveBeenCalled()
})

View File

@@ -0,0 +1,289 @@
import { expect, test } from 'bun:test'
import { createProgressBus, type ProgressBus } from '../progress/bus.js'
import {
createProgressStoreFromBus,
type RunProgress,
} from '../progress/store.js'
import type { AgentRunResult } from '@claude-code-best/workflow-engine'
const ok = (o: string): AgentRunResult => ({
kind: 'ok',
output: o,
usage: { outputTokens: 1 },
})
function newStore() {
const bus: ProgressBus = createProgressBus()
return { bus, store: createProgressStoreFromBus(bus) }
}
test('run_started creates entry; phase_started/done updates phases', () => {
const { bus, store } = newStore()
bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
bus.emit({ type: 'phase_started', runId: 'r1', phase: 'A' })
bus.emit({ type: 'phase_started', runId: 'r1', phase: 'B' })
bus.emit({ type: 'phase_done', runId: 'r1', phase: 'A' })
const r = store.get('r1')!
expect(r.phases.map(p => [p.title, p.status])).toEqual([
['A', 'done'],
['B', 'running'],
])
expect(r.currentPhase).toBe('B')
})
test('concurrent agent_done correlates by agentId precisely (regression of old LIFO race)', () => {
const { bus, store } = newStore()
bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
bus.emit({
type: 'agent_started',
runId: 'r1',
agentId: 0,
label: 'a',
phase: 'A',
})
bus.emit({
type: 'agent_started',
runId: 'r1',
agentId: 1,
label: 'b',
phase: 'A',
})
bus.emit({
type: 'agent_done',
runId: 'r1',
agentId: 1,
label: 'b',
phase: 'A',
result: ok('b-out'),
})
bus.emit({
type: 'agent_done',
runId: 'r1',
agentId: 0,
label: 'a',
phase: 'A',
result: ok('a-out'),
})
const agents = store.get('r1')!.agents
expect(agents.find(x => x.id === 0)?.status).toBe('done')
expect(agents.find(x => x.id === 1)?.status).toBe('done')
expect(agents.find(x => x.id === 0)?.label).toBe('a')
expect(agents.find(x => x.id === 1)?.label).toBe('b')
})
test('journal hit (agent_done without started) backfills done entry by id', () => {
const { bus, store } = newStore()
bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
bus.emit({
type: 'agent_done',
runId: 'r1',
agentId: 7,
label: 'c',
phase: 'A',
result: ok('c'),
})
const a = store.get('r1')!.agents.find(x => x.id === 7)!
expect(a.status).toBe('done')
})
test('run_done terminal state + list sort + subscribe notification', () => {
const { bus, store } = newStore()
let calls = 0
store.subscribe(() => calls++)
bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
bus.emit({
type: 'run_done',
runId: 'r1',
status: 'completed',
returnValue: 42,
})
const r = store.get('r1')!
expect(r.status).toBe('completed')
expect(r.returnValue).toBe(42)
expect(store.list().map(x => x.runId)).toEqual(['r1'])
expect(calls).toBe(2)
})
test('run_done failed terminal state records error', () => {
const { bus, store } = newStore()
bus.emit({ type: 'run_started', runId: 'r2', workflowName: 'w', meta: null })
bus.emit({ type: 'run_done', runId: 'r2', status: 'failed', error: 'boom' })
const r = store.get('r2')!
expect(r.status).toBe('failed')
expect(r.error).toBe('boom')
})
test('log event does not trigger notify', () => {
const { bus, store } = newStore()
let calls = 0
store.subscribe(() => calls++)
bus.emit({ type: 'run_started', runId: 'r3', workflowName: 'w', meta: null })
const before = calls
bus.emit({ type: 'log', runId: 'r3', message: 'hi' })
expect(calls).toBe(before) // log should not trigger notify
})
test('run_started persists declaredPhases (from meta.phases, order preserved)', () => {
const { bus, store } = newStore()
bus.emit({
type: 'run_started',
runId: 'r1',
workflowName: 'w',
meta: {
name: 'w',
description: 'd',
phases: [{ title: 'Find' }, { title: 'Review' }, { title: 'Verify' }],
},
})
expect(store.get('r1')!.declaredPhases).toEqual(['Find', 'Review', 'Verify'])
})
test('run_started meta is null → declaredPhases = []', () => {
const { bus, store } = newStore()
bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
expect(store.get('r1')!.declaredPhases).toEqual([])
})
test('agent_done persists outputShape (ok·object / ok·text / dead none)', () => {
const { bus, store } = newStore()
bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
bus.emit({ type: 'agent_started', runId: 'r1', agentId: 0, phase: 'A' })
bus.emit({ type: 'agent_started', runId: 'r1', agentId: 1, phase: 'A' })
bus.emit({ type: 'agent_started', runId: 'r1', agentId: 2, phase: 'A' })
bus.emit({
type: 'agent_done',
runId: 'r1',
agentId: 0,
phase: 'A',
result: { kind: 'ok', output: { x: 1 }, usage: { outputTokens: 1 } },
})
bus.emit({
type: 'agent_done',
runId: 'r1',
agentId: 1,
phase: 'A',
result: { kind: 'ok', output: 'hi', usage: { outputTokens: 1 } },
})
bus.emit({
type: 'agent_done',
runId: 'r1',
agentId: 2,
phase: 'A',
result: { kind: 'dead' },
})
const agents = store.get('r1')!.agents
expect(agents.find(a => a.id === 0)?.outputShape).toBe('object')
expect(agents.find(a => a.id === 1)?.outputShape).toBe('text')
expect(agents.find(a => a.id === 2)?.outputShape).toBeUndefined()
})
test('agent_progress real-time updates token/tool (correlated by agentId)', () => {
const { bus, store } = newStore()
bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
bus.emit({
type: 'agent_started',
runId: 'r1',
agentId: 0,
label: 'a',
phase: 'A',
})
bus.emit({
type: 'agent_progress',
runId: 'r1',
agentId: 0,
tokenCount: 1200,
toolCount: 2,
})
let a = store.get('r1')!.agents.find(x => x.id === 0)!
expect(a.tokenCount).toBe(1200)
expect(a.toolCount).toBe(2)
bus.emit({
type: 'agent_progress',
runId: 'r1',
agentId: 0,
tokenCount: 2400,
toolCount: 3,
})
a = store.get('r1')!.agents.find(x => x.id === 0)!
expect(a.tokenCount).toBe(2400)
expect(a.toolCount).toBe(3)
})
test('agent_done persists model/tokenCount/toolCount (ok variant)', () => {
const { bus, store } = newStore()
bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
bus.emit({ type: 'agent_started', runId: 'r1', agentId: 0, phase: 'A' })
bus.emit({
type: 'agent_done',
runId: 'r1',
agentId: 0,
phase: 'A',
result: {
kind: 'ok',
output: 'x',
usage: { outputTokens: 5 },
model: 'glm-5.2',
tokenCount: 22900,
toolCount: 1,
},
})
const a = store.get('r1')!.agents.find(x => x.id === 0)!
expect(a.model).toBe('glm-5.2')
expect(a.tokenCount).toBe(22900)
expect(a.toolCount).toBe(1)
})
// ---- hydrate: inject historical run from disk (cross-restart recovery) ----
test('hydrate injects new run → get hits + list includes it + notifies listener', () => {
const { store } = newStore()
let notified = 0
store.subscribe(() => notified++)
const historical: RunProgress = {
runId: 'hist-1',
workflowName: 'old-job',
status: 'completed',
phases: [],
declaredPhases: [],
currentPhase: null,
agents: [],
agentCount: 5,
returnValue: { summary: 'past' },
startedAt: 1,
updatedAt: 2,
}
store.hydrate(historical)
expect(store.get('hist-1')).toBe(historical)
expect(store.list().map(r => r.runId)).toContain('hist-1')
expect(notified).toBeGreaterThan(0)
})
test('hydrate existing runId → skip (memory first, not overwritten by disk)', () => {
const { bus, store } = newStore()
bus.emit({
type: 'run_started',
runId: 'r1',
workflowName: 'live',
meta: null,
})
const stale: RunProgress = {
runId: 'r1',
workflowName: 'STALE-SHOULD-NOT-WIN',
status: 'completed',
phases: [],
declaredPhases: [],
currentPhase: null,
agents: [],
agentCount: 0,
startedAt: 1,
updatedAt: 2,
}
store.hydrate(stale)
const got = store.get('r1')!
expect(got.workflowName).toBe('live')
expect(got.status).toBe('running')
})

View File

@@ -0,0 +1,177 @@
import { expect, test } from 'bun:test'
import { mkdtemp, rm, writeFile } from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'
import { attachRunStatePersistence, readRunState } from '../persistence.js'
import { createProgressBus } from '../progress/bus.js'
import { createProgressStoreFromBus } from '../progress/store.js'
/**
* Contract test for attachRunStatePersistence (adjusted Task 4):
* directly test the bus + store combination, bypassing makeService (keeps makeService signature (ports, store, cwdOverride?) unchanged).
*
* runsDir is injected as tmpdir via attachRunStatePersistence's third parameter runsDirProvider,
* to avoid writing to the real project directory (Bun ESM module namespace is read-only, cannot monkey-patch getRunsDir).
*/
test('run_done completed → writes state.json to disk, returnValue consistent', async () => {
const dir = await mkdtemp(join(tmpdir(), 'wf-persist-'))
try {
const bus = createProgressBus()
const store = createProgressStoreFromBus(bus)
attachRunStatePersistence(bus, store, () => dir)
bus.emit({
type: 'run_started',
runId: 'rW',
workflowName: 'w',
meta: null,
})
bus.emit({
type: 'run_done',
runId: 'rW',
status: 'completed',
returnValue: { ok: true, n: 3 },
})
// writeRunState is async (void writeRunState(...) in the subscription); let the microtask complete
await new Promise(r => setTimeout(r, 50))
const got = await readRunState(dir, 'rW')
expect(got).not.toBeNull()
expect(got!.status).toBe('completed')
expect(got!.returnValue).toEqual({ ok: true, n: 3 })
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('run_done failed → writes status=failed + error field to disk', async () => {
const dir = await mkdtemp(join(tmpdir(), 'wf-persist-'))
try {
const bus = createProgressBus()
const store = createProgressStoreFromBus(bus)
attachRunStatePersistence(bus, store, () => dir)
bus.emit({
type: 'run_started',
runId: 'rF',
workflowName: 'w',
meta: null,
})
bus.emit({
type: 'run_done',
runId: 'rF',
status: 'failed',
error: 'boom',
})
await new Promise(r => setTimeout(r, 50))
const got = await readRunState(dir, 'rF')
expect(got).not.toBeNull()
expect(got!.status).toBe('failed')
expect(got!.error).toBe('boom')
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('run_done killed → writes status=killed to disk', async () => {
const dir = await mkdtemp(join(tmpdir(), 'wf-persist-'))
try {
const bus = createProgressBus()
const store = createProgressStoreFromBus(bus)
attachRunStatePersistence(bus, store, () => dir)
bus.emit({
type: 'run_started',
runId: 'rK',
workflowName: 'w',
meta: null,
})
bus.emit({ type: 'run_done', runId: 'rK', status: 'killed' })
await new Promise(r => setTimeout(r, 50))
const got = await readRunState(dir, 'rK')
expect(got?.status).toBe('killed')
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('writeRunState internal IO exception is swallowed: attachRunStatePersistence does not propagate, bus emit does not break', async () => {
const blockerDir = await mkdtemp(join(tmpdir(), 'wf-persist-'))
// first create a same-named file, so subdir mkdir fails → writeRunState internal catch swallows it
await writeFile(join(blockerDir, 'not-a-dir.txt'), 'blocker', 'utf-8')
try {
const bus = createProgressBus()
const store = createProgressStoreFromBus(bus)
// runsDir points to a dir whose parent path is a file: mkdir recursive fails
attachRunStatePersistence(bus, store, () =>
join(blockerDir, 'not-a-dir.txt'),
)
// an extra subscriber to verify it still gets notified (bus emit should not break due to internal exception in persistence listener)
let otherNotified = 0
bus.subscribe(() => otherNotified++)
// bus.emit should not throw — writeRunState swallows the exception internally
expect(() => {
bus.emit({
type: 'run_started',
runId: 'rErr',
workflowName: 'w',
meta: null,
})
bus.emit({
type: 'run_done',
runId: 'rErr',
status: 'completed',
returnValue: 'x',
})
}).not.toThrow()
// let writeRunState's microtask complete (exception swallowed internally)
await new Promise(r => setTimeout(r, 50))
// this store subscriber still works normally (received both run_started + run_done events)
expect(otherNotified).toBeGreaterThanOrEqual(2)
expect(store.get('rErr')?.status).toBe('completed')
} finally {
await rm(blockerDir, { recursive: true, force: true })
}
})
test('attachRunStatePersistence returns unsubscribe; after calling it no more disk writes', async () => {
const dir = await mkdtemp(join(tmpdir(), 'wf-persist-'))
try {
const bus = createProgressBus()
const store = createProgressStoreFromBus(bus)
const unsub = attachRunStatePersistence(bus, store, () => dir)
// first emit a run_done, verify disk write takes effect
bus.emit({
type: 'run_started',
runId: 'r1',
workflowName: 'w',
meta: null,
})
bus.emit({ type: 'run_done', runId: 'r1', status: 'completed' })
await new Promise(r => setTimeout(r, 50))
expect(await readRunState(dir, 'r1')).not.toBeNull()
// after unsubscribe, emit run_done again, should not write to disk
unsub()
bus.emit({
type: 'run_started',
runId: 'r2',
workflowName: 'w',
meta: null,
})
bus.emit({ type: 'run_done', runId: 'r2', status: 'completed' })
await new Promise(r => setTimeout(r, 50))
expect(await readRunState(dir, 'r2')).toBeNull()
} finally {
await rm(dir, { recursive: true, force: true })
}
})

View File

@@ -0,0 +1,82 @@
import { expect, test } from 'bun:test'
import type { AgentProgress, RunProgress } from '../progress/store.js'
import {
ALL_PHASE,
mergePhases,
filterAgentsByPhase,
tabLabel,
} from '../panel/selectors.js'
function run(partial: Partial<RunProgress>): RunProgress {
return {
runId: 'r1',
workflowName: 'w',
status: 'running',
phases: [],
declaredPhases: [],
currentPhase: null,
agents: [],
agentCount: 0,
startedAt: 1,
updatedAt: 1,
...partial,
}
}
test('mergePhases: declared order first, actual phases append undeclared ones, counts done/total', () => {
const r = run({
declaredPhases: ['Find', 'Review', 'Verify'],
phases: [
{ title: 'Find', status: 'done' },
{ title: 'Review', status: 'running' },
],
agents: [
{
id: 1,
phase: 'Find',
status: 'done',
resultKind: 'ok',
outputShape: 'text',
},
{ id: 2, phase: 'Find', status: 'done', resultKind: 'dead' },
{ id: 3, phase: 'Review', status: 'running' },
],
})
expect(mergePhases(r)).toEqual([
{ title: 'Find', status: 'done', done: 2, total: 2 },
{ title: 'Review', status: 'running', done: 0, total: 1 },
{ title: 'Verify', status: 'pending', done: 0, total: 0 },
])
})
test('mergePhases: actual but undeclared phase appended to the end', () => {
const r = run({
declaredPhases: ['Find'],
phases: [
{ title: 'Find', status: 'done' },
{ title: 'Adhoc', status: 'running' },
],
agents: [],
})
expect(mergePhases(r).map(p => p.title)).toEqual(['Find', 'Adhoc'])
})
test('filterAgentsByPhase: All / undefined → all; specified → only that phase', () => {
const agents: AgentProgress[] = [
{ id: 1, phase: 'A', status: 'running' },
{
id: 2,
phase: 'B',
status: 'done',
resultKind: 'ok',
outputShape: 'text',
},
]
expect(filterAgentsByPhase(agents, undefined)).toHaveLength(2)
expect(filterAgentsByPhase(agents, ALL_PHASE)).toHaveLength(2)
expect(filterAgentsByPhase(agents, 'A')).toEqual([agents[0]])
})
test('tabLabel: workflow name + last 4 chars short code of runId', () => {
expect(tabLabel('review-changes', 'wf_abc123def')).toBe('review-changes#3def')
})

View File

@@ -0,0 +1,594 @@
import { expect, test } from 'bun:test'
// DI pattern: do not use mock.module (process-global, last-write-wins, would pollute other tests in the same process such as
// autonomy.test.ts). Instead hand-construct FAKE WorkflowPorts: registry.run returns a fixed ok
// result, taskRegistrar maintains abort bindings, journalStore is an in-memory empty impl. The real runWorkflow
// thus runs to completion without needing LLM or mocks.
import { mkdtemp, rm, writeFile } from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'
import { makeService, __resetWorkflowServiceForTests } from '../service.js'
import { createProgressBus } from '../progress/bus.js'
import {
createProgressStoreFromBus,
type RunProgress,
} from '../progress/store.js'
import type {
AgentRunResult,
ProgressEvent,
WorkflowPorts,
} from '@claude-code-best/workflow-engine'
// Construct FAKE ports: registry.run returns a fixed AgentRunResult, taskRegistrar has bindings,
// journalStore is an in-memory empty impl. progressEmitter.emit → bus.emit (store subscribes to bus at construction).
// Note: runWorkflow itself emits run_started/run_done; taskRegistrar only manages abort bindings,
// does not re-emit events (avoids store reducer receiving duplicate run_done).
type RegistrarCall =
| { kind: 'complete'; runId: string; summary?: string }
| { kind: 'fail'; runId: string; error?: string }
| { kind: 'kill'; runId: string }
| {
kind: 'registerAgentAbort'
runId: string
agentId: number
controller: AbortController
}
| { kind: 'unregisterAgentAbort'; runId: string; agentId: number }
| { kind: 'killAgent'; runId: string; agentId: number }
function fakePorts(
opts: {
/** adapter.run throws (simulates agent backend crash). */
adapterThrow?: string
/** adapter.run return value (default ok). */
adapterResult?: AgentRunResult
/** agentRunner.runAgentToResult return value (fallback path, default throws). */
runnerResult?: AgentRunResult
} = {},
): {
ports: WorkflowPorts
store: ReturnType<typeof createProgressStoreFromBus>
killed: string[]
/** taskRegistrar call records (complete/fail/kill/registerAgentAbort/...). */
calls: RegistrarCall[]
/** runId → (agentId → AbortController). Used by tests to simulate backend registration. */
agentBindings: Map<string, Map<number, AbortController>>
/** adapter.run call count (accumulates on retry). holder reference, tests read adapterCalls.value. */
adapterCallsRef: { value: number }
} {
const bus = createProgressBus()
const store = createProgressStoreFromBus(bus)
const killed: string[] = []
const calls: RegistrarCall[] = []
const bindings = new Map<string, { abort: AbortController }>()
// agentId → AbortController (per runId). killAgent uses this to abort precisely.
const agentBindings = new Map<string, Map<number, AbortController>>()
// adapter.run call count (accumulates on retry). Use holder object to avoid closure/getter
// snapshot semantics issues in Bun test runner — when returning, shorthand takes the current value (=0),
// subsequent outer variable ++ does not reflect into the returned object field. holder reference is stable.
const adapterCallsRef = { value: 0 }
let seq = 0
const ports = {
// hostFactory is not actually called by the service.launch path (service builds its own host handle),
// but the WorkflowPorts type requires it to exist; keep a minimal impl.
hostFactory: () => ({
handle: {} as never,
cwd: '/tmp',
budgetTotal: null,
toolUseId: 'tu',
}),
agentAdapterRegistry: {
resolve: () => ({
id: 'claude-code',
capabilities: { structuredOutput: true },
run:
opts.adapterThrow !== undefined
? async (): Promise<AgentRunResult> => {
adapterCallsRef.value++
throw new Error(opts.adapterThrow)
}
: async (): Promise<AgentRunResult> => {
adapterCallsRef.value++
return (
opts.adapterResult ?? {
kind: 'ok',
output: 'mock-out',
usage: { outputTokens: 1 },
}
)
},
}),
},
agentRunner: {
runAgentToResult:
opts.runnerResult !== undefined
? async () => opts.runnerResult
: async () => {
throw new Error('should not reach')
},
},
progressEmitter: {
emit: (e: ProgressEvent) => bus.emit(e),
},
taskRegistrar: {
register: ({ workflowName }: { workflowName: string }) => {
const abort = new AbortController()
seq += 1
const runId = `run-${seq}`
bindings.set(runId, { abort })
agentBindings.set(runId, new Map())
return { runId, signal: abort.signal }
},
complete: (runId: string, summary?: string) => {
calls.push({ kind: 'complete', runId, summary })
},
fail: (runId: string, error?: string) => {
calls.push({ kind: 'fail', runId, error })
},
kill: (runId: string) => {
killed.push(runId)
calls.push({ kind: 'kill', runId })
bindings.get(runId)?.abort.abort()
},
registerAgentAbort: (
runId: string,
agentId: number,
controller: AbortController,
) => {
calls.push({
kind: 'registerAgentAbort',
runId,
agentId,
controller,
})
agentBindings.get(runId)?.set(agentId, controller)
},
unregisterAgentAbort: (runId: string, agentId: number) => {
calls.push({ kind: 'unregisterAgentAbort', runId, agentId })
agentBindings.get(runId)?.delete(agentId)
},
killAgent: (runId: string, agentId: number) => {
calls.push({ kind: 'killAgent', runId, agentId })
const ac = agentBindings.get(runId)?.get(agentId)
if (!ac) return false
ac.abort()
agentBindings.get(runId)!.delete(agentId)
return true
},
pendingAction: () => null,
},
journalStore: {
read: async () => [],
append: async () => {},
truncate: async () => {},
},
permissionGate: { isAborted: () => false },
logger: {
debug: () => {},
event: () => {},
warn: () => {},
},
} as unknown as WorkflowPorts
return { ports, store, killed, calls, agentBindings, adapterCallsRef }
}
const stubTUC = { agentId: 'a1', toolUseId: 'tu' } as never
const stubCanUseTool = (() => Promise.resolve({ behavior: 'allow' })) as never
/** Wait for detached runWorkflow to complete (detached call, need to drain microtasks/macrotasks). */
async function settle(): Promise<void> {
await new Promise(r => setTimeout(r, 60))
}
test('launch → completed; store shows this run', async () => {
__resetWorkflowServiceForTests()
const { ports, store } = fakePorts()
const svc = makeService(ports, store)
const { runId } = await svc.launch(
{ script: `return agent('compute')` },
stubTUC,
stubCanUseTool,
)
await settle()
const r = svc.getRun(runId)
expect(r).toBeDefined()
// detached execution may still be running within the settle window, or already completed — both are acceptable.
expect(['completed', 'running']).toContain(r!.status)
expect(r!.workflowName).toBe('workflow')
})
test('launch inline script → returns scriptPath (persisted to cwdOverride dir)', async () => {
__resetWorkflowServiceForTests()
const dir = await mkdtemp(join(tmpdir(), 'wf-svc-'))
try {
const { ports, store } = fakePorts()
const svc = makeService(ports, store, dir)
const result = await svc.launch(
{ script: `return agent('x')` },
stubTUC,
stubCanUseTool,
)
expect(result.scriptPath).toBe(
join(dir, '.claude', 'workflow-runs', 'run-1', 'script.js'),
)
const { readFile } = await import('node:fs/promises')
expect(await readFile(result.scriptPath!, 'utf-8')).toBe(
`return agent('x')`,
)
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('kill goes through taskRegistrar.kill', async () => {
__resetWorkflowServiceForTests()
const { ports, store, killed } = fakePorts()
const svc = makeService(ports, store)
const { runId } = await svc.launch(
{ script: `return agent('x')` },
stubTUC,
stubCanUseTool,
)
svc.kill(runId)
expect(killed).toContain(runId)
})
test('killAgent goes through taskRegistrar.killAgent: precisely aborts a single agent', async () => {
__resetWorkflowServiceForTests()
const { ports, store, calls, agentBindings } = fakePorts()
const svc = makeService(ports, store)
const { runId } = await svc.launch(
{ script: `return agent('x')` },
stubTUC,
stubCanUseTool,
)
// simulate backend registering AbortController when launching agent
const ac = new AbortController()
agentBindings.get(runId)!.set(7, ac)
// service.killAgent routes to taskRegistrar.killAgent, which actually aborts the corresponding controller
expect(svc.killAgent(runId, 7)).toBe(true)
expect(ac.signal.aborted).toBe(true)
expect(
calls.some(
c => c.kind === 'killAgent' && c.runId === runId && c.agentId === 7,
),
).toBe(true)
// after abort controller is deleted from Map: calling killAgent on same agent again returns false (idempotent)
expect(svc.killAgent(runId, 7)).toBe(false)
// unknown agentId / unknown runId safe returns false
expect(svc.killAgent(runId, 999)).toBe(false)
expect(svc.killAgent('nope', 1)).toBe(false)
})
test('listRuns/subscribe come from store', () => {
__resetWorkflowServiceForTests()
const { ports, store } = fakePorts()
const svc = makeService(ports, store)
expect(svc.listRuns()).toEqual([])
let n = 0
const unsub = svc.subscribe(() => {
n++
})
expect(typeof unsub).toBe('function')
unsub()
expect(n).toBe(0)
})
test('listNamed delegates to namedWorkflows (empty dir → []; with files → lists)', async () => {
__resetWorkflowServiceForTests()
const { ports, store } = fakePorts()
const svc = makeService(ports, store)
// non-existent dir → []
const empty = await svc.listNamed(
join(tmpdir(), `wf-nope-${Math.random().toString(36).slice(2)}`),
)
expect(empty).toEqual([])
// dir with named files → lists names (extension stripped, sorted)
const dir = await mkdtemp(join(tmpdir(), 'wf-named-'))
try {
await writeFile(
join(dir, 'a.ts'),
'export const meta = { name: "a", description: "d" }\nreturn 1',
)
await writeFile(join(dir, 'b.js'), 'return 2')
const names = await svc.listNamed(dir)
expect(names).toEqual(['a', 'b'])
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('missing script/name/scriptPath → throws', async () => {
__resetWorkflowServiceForTests()
const { ports, store } = fakePorts()
const svc = makeService(ports, store)
await expect(svc.launch({}, stubTUC, stubCanUseTool)).rejects.toThrow(
/script|name|scriptPath/,
)
})
test('scriptPath reads file content and validates', async () => {
__resetWorkflowServiceForTests()
const { ports, store } = fakePorts()
const svc = makeService(ports, store)
const dir = await mkdtemp(join(tmpdir(), 'wf-path-'))
const file = join(dir, 's.ts')
try {
await writeFile(file, `return agent('from-file')`)
const { runId } = await svc.launch(
{ scriptPath: file },
stubTUC,
stubCanUseTool,
)
await settle()
const r = svc.getRun(runId)
expect(r).toBeDefined()
expect(['completed', 'running']).toContain(r!.status)
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('parseScript validation failed → launch throws', async () => {
__resetWorkflowServiceForTests()
const { ports, store } = fakePorts()
const svc = makeService(ports, store)
// trigger ScriptError: meta literal missing description (validateMeta requires both name+description to be strings)
await expect(
svc.launch(
{ script: `export const meta = { name: "x" }\nreturn 1` },
stubTUC,
stubCanUseTool,
),
).rejects.toThrow(/Script validation failed/i)
})
// ---- Service-layer failure routing coverage (review gap: .then/.catch → taskRegistrar path) ----
test('script run throws → service routes to taskRegistrar.fail, with error text', async () => {
__resetWorkflowServiceForTests()
const { ports, store, calls } = fakePorts()
const svc = makeService(ports, store)
await svc.launch(
{ script: `throw new Error('script boom')` },
stubTUC,
stubCanUseTool,
)
await settle()
const fail = calls.find(c => c.kind === 'fail')
expect(fail).toBeDefined()
expect(fail?.kind === 'fail' && fail.error).toMatch(/script boom/)
})
test('adapter throws → retry still throws → degrade to dead → workflow completed (not fail)', async () => {
__resetWorkflowServiceForTests()
// new semantics: agent non-abort throw → retry once → still throws → degrade to dead (agent returns null),
// workflow continues and completes. Retry tolerates transient failures (429/network), but a permanently
// broken agent does not break through the entire workflow (consistent with parallel/pipeline null-on-error contract).
const { ports, store, calls, adapterCallsRef } = fakePorts({
adapterThrow: 'adapter boom',
})
const svc = makeService(ports, store)
await svc.launch({ script: `return agent('x')` }, stubTUC, stubCanUseTool)
await settle()
// retry once → adapter called 2 times
expect(adapterCallsRef.value).toBe(2)
// workflow normal completed, not failed
const complete = calls.find(c => c.kind === 'complete')
expect(complete).toBeDefined()
const fail = calls.find(c => c.kind === 'fail')
expect(fail).toBeUndefined()
})
test('script completes normally → service routes to taskRegistrar.complete', async () => {
__resetWorkflowServiceForTests()
const { ports, store, calls } = fakePorts()
const svc = makeService(ports, store)
await svc.launch({ script: `return agent('x')` }, stubTUC, stubCanUseTool)
await settle()
expect(calls.some(c => c.kind === 'complete')).toBe(true)
})
// ---- Fix N: shutdown cleanup ----
test('shutdown kills all running runs (taskRegistrar.kill called for each)', async () => {
__resetWorkflowServiceForTests()
const { ports, store, killed } = fakePorts()
// make adapter slower, so during settle the run is still running
const slowPorts = {
...ports,
agentAdapterRegistry: {
resolve: () => ({
id: 'claude-code',
capabilities: { structuredOutput: true },
run: async (): Promise<AgentRunResult> => {
await new Promise(r => setTimeout(r, 200))
return { kind: 'ok', output: 'slow', usage: { outputTokens: 1 } }
},
}),
},
} as unknown as typeof ports
const slowSvc = makeService(slowPorts, store)
const { runId: a } = await slowSvc.launch(
{ script: `return agent('a')` },
stubTUC,
stubCanUseTool,
)
const { runId: b } = await slowSvc.launch(
{ script: `return agent('b')` },
stubTUC,
stubCanUseTool,
)
killed.length = 0
slowSvc.shutdown()
expect(killed).toContain(a)
expect(killed).toContain(b)
})
test('shutdown does not re-kill completed runs; idempotent (multiple calls safe)', async () => {
__resetWorkflowServiceForTests()
const { ports, store, killed } = fakePorts()
const svc = makeService(ports, store)
const { runId } = await svc.launch(
{ script: `return agent('x')` },
stubTUC,
stubCanUseTool,
)
await settle() // complete
killed.length = 0
svc.shutdown()
// already completed should not be killed again
expect(killed).not.toContain(runId)
// idempotent
expect(() => svc.shutdown()).not.toThrow()
})
// ---- Task 5: loadPersistedRuns + getRunAsync fallback ----
// runsDirProvider is injected as makeService's fourth optional parameter with tmpdir, to avoid writing to the real project dir
// (Bun ESM module namespace is read-only, cannot monkey-patch getRunsDir).
test('loadPersistedRuns scans disk to hydrate historical runs; existing in-memory runs are not overwritten', async () => {
__resetWorkflowServiceForTests()
const dir = await mkdtemp(join(tmpdir(), 'wf-svc-'))
try {
// disk first has two historical runs
const { writeRunState } = await import('../persistence.js')
const historicalA = {
runId: 'hA',
workflowName: 'old-A',
status: 'completed',
phases: [],
declaredPhases: [],
currentPhase: null,
agents: [],
agentCount: 1,
returnValue: 'a',
startedAt: 10,
updatedAt: 20,
} as RunProgress
const historicalB = {
runId: 'hB',
workflowName: 'old-B',
status: 'failed',
phases: [],
declaredPhases: [],
currentPhase: null,
agents: [],
agentCount: 2,
error: 'x',
startedAt: 30,
updatedAt: 40,
} as RunProgress
await writeRunState(dir, historicalA)
await writeRunState(dir, historicalB)
const { ports, store } = fakePorts()
// in-memory first has one current-session run (via ports.progressEmitter.emit through bus → store)
ports.progressEmitter.emit({
type: 'run_started',
runId: 'live',
workflowName: 'live-w',
meta: null,
})
const svc = makeService(ports, store, undefined, () => dir)
await svc.loadPersistedRuns()
const ids = svc.listRuns().map(r => r.runId)
expect(ids).toContain('hA')
expect(ids).toContain('hB')
expect(ids).toContain('live')
// memory first: live is still running (not overwritten by disk; disk has no live so no STALE injected)
expect(svc.getRun('live')!.status).toBe('running')
expect(svc.getRun('hA')!.returnValue).toBe('a')
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('loadPersistedRuns repeated calls scan disk only once (persistedLoaded flag)', async () => {
__resetWorkflowServiceForTests()
const dir = await mkdtemp(join(tmpdir(), 'wf-svc-'))
try {
const { ports, store } = fakePorts()
const svc = makeService(ports, store, undefined, () => dir)
await svc.loadPersistedRuns()
await svc.loadPersistedRuns()
await svc.loadPersistedRuns()
// repeated calls do not throw, do not change listRuns result (empty dir)
expect(svc.listRuns()).toEqual([])
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('getRunAsync memory hit → no disk read', async () => {
__resetWorkflowServiceForTests()
const dir = await mkdtemp(join(tmpdir(), 'wf-svc-'))
try {
const { ports, store } = fakePorts()
const svc = makeService(ports, store, undefined, () => dir)
ports.progressEmitter.emit({
type: 'run_started',
runId: 'live',
workflowName: 'w',
meta: null,
})
const got = await svc.getRunAsync('live')
expect(got?.runId).toBe('live')
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('getRunAsync memory miss + disk hit → returns disk value, and does not inject into memory (subsequent get still reads disk)', async () => {
__resetWorkflowServiceForTests()
const dir = await mkdtemp(join(tmpdir(), 'wf-svc-'))
try {
const { writeRunState } = await import('../persistence.js')
const historical = {
runId: 'hist-only',
workflowName: 'old',
status: 'completed',
phases: [],
declaredPhases: [],
currentPhase: null,
agents: [],
agentCount: 0,
returnValue: { x: 1 },
startedAt: 1,
updatedAt: 2,
} as RunProgress
await writeRunState(dir, historical)
const { ports, store } = fakePorts()
const svc = makeService(ports, store, undefined, () => dir)
const got = await svc.getRunAsync('hist-only')
expect(got?.returnValue).toEqual({ x: 1 })
// not injected into memory: in-memory list does not contain (not hydrated)
expect(svc.listRuns().map(r => r.runId)).not.toContain('hist-only')
// subsequent get still returns (each goes through readRunState fallback)
const got2 = await svc.getRunAsync('hist-only')
expect(got2?.returnValue).toEqual({ x: 1 })
} finally {
await rm(dir, { recursive: true, force: true })
}
})
test('getRunAsync memory miss + disk miss → undefined', async () => {
__resetWorkflowServiceForTests()
const dir = await mkdtemp(join(tmpdir(), 'wf-svc-'))
try {
const { ports, store } = fakePorts()
const svc = makeService(ports, store, undefined, () => dir)
const got = await svc.getRunAsync('no-such-run')
expect(got).toBeUndefined()
} finally {
await rm(dir, { recursive: true, force: true })
}
})

View File

@@ -0,0 +1,88 @@
import { expect, test } from 'bun:test'
import type { AgentProgress, RunProgress } from '../progress/store.js'
import {
STATUS_DOT,
RUN_STATUS_COLOR,
RUN_STATUS_TEXT,
PHASE_MARK,
PHASE_COLOR,
agentVisual,
formatTokenCount,
agentMetaText,
} from '../panel/status.js'
test('STATUS_DOT / RUN_STATUS_COLOR / RUN_STATUS_TEXT cover four run states', () => {
const statuses: RunProgress['status'][] = [
'running',
'completed',
'failed',
'killed',
]
for (const s of statuses) {
expect(STATUS_DOT[s].length).toBeGreaterThan(0)
expect(RUN_STATUS_COLOR[s]).toBeTruthy()
expect(RUN_STATUS_TEXT[s].length).toBeGreaterThan(0)
}
expect(STATUS_DOT.running).toBe('●')
expect(STATUS_DOT.completed).toBe('✓')
expect(STATUS_DOT.failed).toBe('✗')
expect(STATUS_DOT.killed).toBe('■')
expect(RUN_STATUS_TEXT.completed).toBe('done')
expect(RUN_STATUS_TEXT.running).toBe('running')
})
test('PHASE_MARK / PHASE_COLOR cover running/done/pending', () => {
expect(PHASE_MARK.running).toBe('●')
expect(PHASE_MARK.done).toBe('✓')
expect(PHASE_MARK.pending).toBe('○')
expect(PHASE_COLOR.pending).toBe('subtle')
})
test('agentVisual: running → ● warning', () => {
const a: AgentProgress = { id: 1, status: 'running' }
expect(agentVisual(a)).toEqual({ mark: '●', color: 'warning' })
})
test('agentVisual: done·ok → ✓ success (no longer carries outputShape suffix)', () => {
const a: AgentProgress = {
id: 1,
status: 'done',
resultKind: 'ok',
outputShape: 'object',
}
expect(agentVisual(a)).toEqual({ mark: '✓', color: 'success' })
})
test('agentVisual: dead → ✗ error', () => {
const a: AgentProgress = { id: 1, status: 'done', resultKind: 'dead' }
expect(agentVisual(a)).toEqual({ mark: '✗', color: 'error' })
})
test('formatTokenCount: <1000 original value, ≥1000 keeps 1 decimal + k', () => {
expect(formatTokenCount(undefined)).toBe('0')
expect(formatTokenCount(0)).toBe('0')
expect(formatTokenCount(42)).toBe('42')
expect(formatTokenCount(1000)).toBe('1.0k')
expect(formatTokenCount(22900)).toBe('22.9k')
})
test('agentMetaText: model · Nk tok · N tool', () => {
const a: AgentProgress = {
id: 1,
status: 'done',
model: 'glm-5.2',
tokenCount: 22900,
toolCount: 1,
}
expect(agentMetaText(a)).toBe('glm-5.2 · 22.9k tok · 1 tool')
})
test('agentMetaText: omits prefix when no model', () => {
const a: AgentProgress = {
id: 1,
status: 'running',
tokenCount: 500,
toolCount: 2,
}
expect(agentMetaText(a)).toBe('500 tok · 2 tool')
})

View File

@@ -0,0 +1,45 @@
import { expect, test } from 'bun:test'
import { routeWorkflowKey } from '../panel/useWorkflowKeyboard.js'
test('Tab → nextTabShift+Tab → prevTab', () => {
expect(routeWorkflowKey('', { tab: true })).toBe('nextTab')
expect(routeWorkflowKey('', { tab: true, shift: true })).toBe('prevTab')
})
test('q / Esc → quit', () => {
expect(routeWorkflowKey('q', {})).toBe('quit')
expect(routeWorkflowKey('', { escape: true })).toBe('quit')
})
test('x → killAgentK → killWorkflowr → resumen → newRun', () => {
expect(routeWorkflowKey('x', {})).toBe('killAgent')
expect(routeWorkflowKey('K', {})).toBe('killWorkflow')
expect(routeWorkflowKey('r', {})).toBe('resume')
expect(routeWorkflowKey('n', {})).toBe('newRun')
})
test('confirm mode: y/Enter → confirmYes; n/Esc/q → confirmNo; other keys → null', () => {
expect(routeWorkflowKey('y', {}, 'confirm')).toBe('confirmYes')
expect(routeWorkflowKey('Y', {}, 'confirm')).toBe('confirmYes')
expect(routeWorkflowKey('', { return: true }, 'confirm')).toBe('confirmYes')
expect(routeWorkflowKey('n', {}, 'confirm')).toBe('confirmNo')
expect(routeWorkflowKey('N', {}, 'confirm')).toBe('confirmNo')
expect(routeWorkflowKey('', { escape: true }, 'confirm')).toBe('confirmNo')
expect(routeWorkflowKey('q', {}, 'confirm')).toBe('confirmNo')
// confirm mode swallows navigation/edit keys, preventing accidental triggers
expect(routeWorkflowKey('x', {}, 'confirm')).toBeNull()
expect(routeWorkflowKey('', { tab: true }, 'confirm')).toBeNull()
expect(routeWorkflowKey('', { upArrow: true }, 'confirm')).toBeNull()
})
test('←/→ switch focus column; ↑/↓ move within column', () => {
expect(routeWorkflowKey('', { leftArrow: true })).toBe('focusLeft')
expect(routeWorkflowKey('', { rightArrow: true })).toBe('focusRight')
expect(routeWorkflowKey('', { upArrow: true })).toBe('moveUp')
expect(routeWorkflowKey('', { downArrow: true })).toBe('moveDown')
})
test('unrelated input → null', () => {
expect(routeWorkflowKey('z', {})).toBeNull()
expect(routeWorkflowKey('', {})).toBeNull()
})

View File

@@ -0,0 +1,409 @@
// Deeply-integrated backend: parses agent/model/tools from the live session, delegates to the core runAgent.
// Implements the AgentAdapter interface, registered and routed by the registry (U5).
import {
type AgentAdapter,
type AgentAdapterContext,
type AgentRunParams,
type AgentRunResult,
WorkflowAbortedError,
} from '@claude-code-best/workflow-engine'
import { assembleToolPool } from '../../tools.js'
import { finalizeAgentTool } from '@claude-code-best/builtin-tools/tools/AgentTool/agentToolUtils.js'
import { runAgent } from '@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js'
import {
isBuiltInAgent,
type AgentDefinition,
type BuiltInAgentDefinition,
} from '@claude-code-best/builtin-tools/tools/AgentTool/loadAgentsDir.js'
import { createUserMessage, extractTextContent } from '../../utils/messages.js'
import { getTokenCountFromUsage } from '../../utils/tokens.js'
import { createHash } from 'node:crypto'
import { createAgentId } from '../../utils/uuid.js'
import { logForDebugging } from '../../utils/debug.js'
import { runWithCwdOverride } from '../../utils/cwd.js'
import {
createAgentWorktree,
hasWorktreeChanges,
removeAgentWorktree,
} from '../../utils/worktree.js'
import { logEvent } from '../../services/analytics/index.js'
import type { ModelAlias } from '../../utils/model/aliases.js'
import type { Message } from '../../types/message.js'
import type { ToolUseContext } from '../../Tool.js'
import { readHostBundle } from '../hostHandle.js'
/** Fallback definition for workflow subagents (used when agentType does not match a real registry entry). */
export const WORKFLOW_AGENT: BuiltInAgentDefinition = {
agentType: 'workflow-worker',
whenToUse: 'subtask dispatched by the agent() hook inside a workflow script',
tools: ['*'],
source: 'built-in',
baseDir: 'built-in',
getSystemPrompt: () =>
'You are a workflow sub-agent. Complete the task concisely; your final text is the return value relayed to the workflow.',
}
/** agentType -> real agent registry (use if activeAgents hits, otherwise fallback). Exported for unit test coverage. */
export function resolveAgentDefinition(
agentType: string | undefined,
toolUseContext: ToolUseContext,
): AgentDefinition {
if (!agentType) return WORKFLOW_AGENT
const found = toolUseContext.options.agentDefinitions.activeAgents.find(
a => a.agentType === agentType,
)
return found ?? WORKFLOW_AGENT
}
/** model alias -> the actual model id of the current provider. v1 passes it through directly (keeps a mapping extension point). Exported for unit test coverage. */
export function mapWorkflowModel(
model: string | undefined,
): string | undefined {
return model
}
/**
* Extract the JSON object produced under schema mode from the agent's final message; returns null on failure. Exported for unit test coverage.
*
* Robustness strategy (in priority order, returns the first that successfully parses):
* 1. fenced code block (```json ... ``` or ``` ... ```) - agents often spontaneously add fences
* 2. the first "brace-balanced" {...} fragment in the bare text - handles preceding/trailing narration / multi-segment output
*
* Uses a brace-stack scan instead of `indexOf('{')..lastIndexOf('}')`: correctly handles nested objects,
* `{}` inside string literals, and escape characters. Will not concatenate multiple unrelated JSON fragments (the original version did).
*
* Does not do syntax repair (trailing commas, single quotes -> double quotes, comment removal) - agents do not produce non-standard JSON,
* and fixing it may instead cause wrong edits inside strings (e.g. `"http://..."` getting eaten by a // comment regex).
* On parse failure it directly skips to the next candidate.
*
* Only returns a plain object (typeof === 'object' && !null && !Array);
* the schema mode contract is object, array/number/string are all treated as the agent going off-track.
*/
export function extractStructuredOutput(
content: Array<{ type: string; text?: string }>,
): unknown | null {
for (const block of content) {
if (block.type !== 'text' || !block.text) continue
const found = findFirstJsonObject(block.text)
if (found !== null) return found
}
return null
}
/** Find the first JSON fragment in text that can be parsed as a plain object. */
function findFirstJsonObject(text: string): unknown | null {
// 1. fenced code blocks - priority (agents naturally tend to add them; strip the fence and parse the whole block)
for (const m of text.matchAll(
/```[\t ]*[a-zA-Z0-9_-]*\s*\n([\s\S]*?)\n?```/g,
)) {
const parsed = tryParseObject(m[1] ?? '')
if (parsed !== null) return parsed
}
// 2. bare text: scan each '{', find a balanced pair and try parse
for (let i = 0; i < text.length; i++) {
if (text[i] !== '{') continue
const end = findBalancedObjectEnd(text, i)
if (end < 0) continue
const parsed = tryParseObject(text.slice(i, end + 1))
if (parsed !== null) return parsed
}
return null
}
/**
* Find the matching `}` index starting from start (which must be `{`); returns -1 when unbalanced.
* Skips braces inside string literals and escape characters. Does not skip comments (the JSON standard does not allow comments,
* agents do not produce them; doing so is a risk - see the function doc).
*/
function findBalancedObjectEnd(text: string, start: number): number {
let depth = 0
let inString = false
for (let i = start; i < text.length; i++) {
const c = text[i]
if (inString) {
if (c === '\\')
i++ // skip the escape char and the next character
else if (c === '"') inString = false
continue
}
if (c === '"') inString = true
else if (c === '{') depth++
else if (c === '}') {
depth--
if (depth === 0) return i
}
}
return -1
}
/** try parse the candidate; only returns a plain object, others (array/number/null) return null. */
function tryParseObject(candidate: string): unknown | null {
const trimmed = candidate.trim()
if (!trimmed.startsWith('{') || !trimmed.endsWith('}')) return null
try {
const v = JSON.parse(trimmed)
return typeof v === 'object' && v !== null && !Array.isArray(v) ? v : null
} catch {
return null
}
}
type WorkflowWorktreeInfo = Awaited<ReturnType<typeof createAgentWorktree>>
/**
* Generate a slug for the worktree isolation of a workflow agent: derive hex segments from sha256(runId:agentId),
* matching the cleanup regex of cleanupStaleAgentWorktrees `^wf_[0-9a-f]{8}-[0-9a-f]{3}-\d+$`.
* taskId is `w`+base36 (not a UUID), so runId cannot be placed directly into the regex segment; sha256 is a deterministic mapping,
* and agentId ensures slug uniqueness for multiple agents under the same runId (no shared counter, no thread safety issues).
*/
function makeWorkflowWorktreeSlug(runId: string, agentId: string): string {
const h = createHash('sha256').update(`${runId}:${agentId}`).digest('hex')
return `wf_${h.slice(0, 8)}-${h.slice(8, 11)}-${parseInt(h.slice(11, 17), 16) % 100000}`
}
/**
* Clean up the worktree after the agent finishes: hookBased keeps it (cannot detect VCS changes); otherwise uses
* hasWorktreeChanges (fail-closed) to detect, auto-removes when there is no change, keeps it on change/detection failure
* and logs the path (v1 uses logs rather than extending AgentRunResult, to avoid touching journal serialization).
*/
async function cleanupWorkflowWorktree(
info: WorkflowWorktreeInfo,
agentType: string,
): Promise<void> {
if (info.hookBased || !info.headCommit) return
let changed = true
try {
changed = await hasWorktreeChanges(info.worktreePath, info.headCommit)
} catch (e) {
logForDebugging(
`workflow worktree change-detect failed (${agentType}): ${(e as Error).message}`,
)
changed = true
}
if (!changed) {
try {
await removeAgentWorktree(
info.worktreePath,
info.worktreeBranch,
info.gitRoot,
)
} catch (e) {
logForDebugging(
`workflow worktree remove failed (${agentType}): ${(e as Error).message}`,
)
}
} else {
logForDebugging(
`workflow worktree retained (has changes, ${agentType}): ${info.worktreePath}`,
)
}
}
/** Deeply-integrated backend: parses agent/model/tools from the live session, delegates to the core runAgent. */
export const claudeCodeBackend: AgentAdapter = {
id: 'claude-code',
capabilities: { structuredOutput: true, tools: true },
async run(
params: AgentRunParams,
ctx: AgentAdapterContext,
): Promise<AgentRunResult> {
const { toolUseContext, canUseTool } = readHostBundle(ctx.host)
const appState = toolUseContext.getAppState()
const agentDef = resolveAgentDefinition(params.agentType, toolUseContext)
const model = mapWorkflowModel(params.model)
// coreAgentId: the tracking ID for the core-layer subagent (a string, used inside runAgent).
// Different from ctx.agentId (the engine's number seq, used for panel / killAgent routing) - two distinct concepts, must not be mixed up.
const coreAgentId = createAgentId()
// isolation:'worktree' - run the agent inside an independent git worktree, so concurrent writes do not conflict.
let worktreeInfo: WorkflowWorktreeInfo | null = null
if (params.isolation === 'worktree') {
try {
worktreeInfo = await createAgentWorktree(
makeWorkflowWorktreeSlug(ctx.runId, coreAgentId),
)
} catch (e) {
// fail-closed: when isolation fails, do not silently fall back to a shared cwd (otherwise concurrent writes race on data)
const detail = (e as Error).message
logForDebugging(
`workflow worktree creation failed (${agentDef.agentType}): ${detail}`,
)
return { kind: 'dead', reason: 'worktree-failed', detail }
}
}
// runWithCwdOverride makes tools such as Bash/Read inside the agent see the worktree path
// (AsyncLocalStorage is preserved across awaits); the worktreePath parameter of runAgent only writes metadata.
const runInCwd = worktreeInfo
? <T>(fn: () => T): T =>
runWithCwdOverride(worktreeInfo!.worktreePath, fn)
: <T>(fn: () => T): T => fn()
// Bridge ctx.signal -> runAgent.override.abortController. Otherwise, when the workflow is killed
// runAgent is unaware (root cause of 'x' being ineffective): the abort signal cannot reach the internal fetch, and the agent runs to completion.
// Single-agent kill goes through service.kill(runId, agentId) -> ports.taskRegistrar.killAgent ->
// agentAbortControllers.get(agentId).abort(); the same controller takes over both paths.
const agentAbort = new AbortController()
const onParentAbort = (): void => agentAbort.abort()
if (ctx.signal.aborted) {
agentAbort.abort()
} else {
ctx.signal.addEventListener('abort', onParentAbort, { once: true })
}
if (typeof ctx.registerAgentAbort === 'function') {
ctx.registerAgentAbort(ctx.agentId, agentAbort)
}
const workerPermissionContext = {
...appState.toolPermissionContext,
mode: agentDef.permissionMode ?? 'acceptEdits',
}
const workerTools = assembleToolPool(
workerPermissionContext,
appState.mcp.tools,
)
// schema -> instructs the agent to directly emit JSON in the final text block.
// Does not require calling the StructuredOutput tool - it is not in the workflow subagent's tool set (only
// the stop_hook path explicitly injects it; workflow goes through assembleToolPool whose default pool does not include it).
// Historically the prompt required "call StructuredOutput tool", causing 8/12 agents to refuse to wrap up or struggle to call it;
// empirically the main cause of dead is the tool being unreachable rather than "forgetting". Change the contract: raw JSON text, extractStructuredOutput
// tolerates fenced fences + preceding/trailing narration + multiple segments.
const promptText = params.schema
? [
params.prompt,
'',
'After completing the task, emit your final answer as a single JSON object matching this JSON Schema:',
'```json',
JSON.stringify(params.schema, null, 2),
'```',
'',
'CRITICAL RULES:',
'- The JSON object must be the LAST text block in your response. Do not write any prose after it.',
'- Emit the JSON as plain text (markdown code fences optional).',
'- Do NOT call any "StructuredOutput" or "SyntheticOutput" tool — it is not available in this environment.',
'- Your turn must end with the JSON object. Anything after it (prose, tool calls) will be ignored or cause your answer to be discarded.',
].join('\n')
: params.prompt
const promptMessages = [createUserMessage({ content: promptText })]
const messages: Message[] = []
const startTime = Date.now()
// Accumulate running progress (onProgress push -> agent_progress event -> panel refreshes token/tool in real time).
let tokenCount = 0
let toolCount = 0
try {
await runInCwd(async () => {
for await (const msg of runAgent({
agentDefinition: agentDef,
promptMessages,
toolUseContext,
canUseTool,
isAsync: true,
querySource: toolUseContext.options.querySource ?? 'workflow',
availableTools: workerTools,
// override the same object: coreAgentId (core subagent tracking) + abortController (kill bridge).
// runAgent's model is the top-level ModelAlias; workflow's model is an arbitrary alias string,
// the types are incompatible and resolved by the provider layer at runtime. Passes through via double assertion (better than as any/never).
override: { agentId: coreAgentId, abortController: agentAbort },
...(model ? { model: model as unknown as ModelAlias } : {}),
...(worktreeInfo ? { worktreePath: worktreeInfo.worktreePath } : {}),
})) {
messages.push(msg as Message)
// Accumulate running progress: assistant message carries usage (cumulative value -> overwrite), tool_use inside content (incremental).
if (msg.type === 'assistant' && msg.message) {
const usage = msg.message.usage as
| Parameters<typeof getTokenCountFromUsage>[0]
| undefined
if (usage) tokenCount = getTokenCountFromUsage(usage)
const content = msg.message.content as
| Array<{ type: string }>
| undefined
if (content)
toolCount += content.filter(b => b.type === 'tool_use').length
}
ctx.onProgress?.({ tokenCount, toolCount })
}
})
} catch (e) {
// abort (kill workflow / kill agent): must rethrow WorkflowAbortedError after detection,
// otherwise hooks.agent will swallow the abort as an ordinary failure into dead, and the workflow won't know it was killed
// (the other side of the 'x' kill path being ineffective: the signal did arrive, but the result was disguised as a normal completion).
if (agentAbort.signal.aborted || (e as Error)?.name === 'AbortError') {
throw new WorkflowAbortedError()
}
const detail = (e as Error).message
logForDebugging(
`workflow sub-agent error (${agentDef.agentType}): ${detail}`,
)
logEvent('tengu_workflow_agent', { ok: 0 })
return { kind: 'dead', reason: 'runagent-threw', detail }
} finally {
// cleanup (idempotent): listener removeEventListener / Map.delete are safe to call repeatedly.
if (typeof ctx.unregisterAgentAbort === 'function') {
ctx.unregisterAgentAbort(ctx.agentId)
}
ctx.signal.removeEventListener('abort', onParentAbort)
if (worktreeInfo) {
const info = worktreeInfo
worktreeInfo = null
await cleanupWorkflowWorktree(info, agentDef.agentType)
}
}
const finalized = finalizeAgentTool(messages, coreAgentId, {
prompt: params.prompt,
resolvedAgentModel: toolUseContext.options.mainLoopModel,
isBuiltInAgent: isBuiltInAgent(agentDef),
startTime,
agentType: agentDef.agentType,
isAsync: true,
})
const outputTokens =
finalized.usage?.output_tokens ?? finalized.totalTokens ?? 0
// For panel display: total context tokens, tool-call count, parsed model id at completion.
const finalTokenCount = finalized.totalTokens ?? 0
const finalToolCount = finalized.totalToolUseCount ?? 0
const resolvedModel = model ?? toolUseContext.options.mainLoopModel
logEvent('tengu_workflow_agent', { ok: 1, outputTokens })
if (params.schema) {
const structured = extractStructuredOutput(finalized.content)
if (structured === null) {
// The agent finished all tool calls but no plain-object JSON was found in the final text block.
// Typical scenarios: forgot to emit JSON after a long tool chain, unbalanced JSON nesting, parse failure.
// Put a preview of the last text into detail so the hooks retry log and the panel can immediately see what the agent actually said.
const preview = extractTextContent(finalized.content, '\n').slice(
0,
200,
)
logForDebugging(
`workflow sub-agent produced no JSON object (${agentDef.agentType}); preview: ${preview}`,
)
return {
kind: 'dead',
reason: 'no-structured-output',
detail: preview,
}
}
return {
kind: 'ok',
output: structured as object,
usage: { outputTokens },
model: resolvedModel,
toolCount: finalToolCount,
tokenCount: finalTokenCount,
}
}
const text = extractTextContent(finalized.content, '\n')
return {
kind: 'ok',
output: text,
usage: { outputTokens },
model: resolvedModel,
toolCount: finalToolCount,
tokenCount: finalTokenCount,
}
},
}

View File

@@ -0,0 +1,42 @@
import {
createHostHandle,
unwrapHostHandle,
type HostHandle,
} from '@claude-code-best/workflow-engine'
import type { CanUseToolFn } from '../hooks/useCanUseTool.js'
import type { AssistantMessage } from '../types/message.js'
import type { AgentId } from '../types/ids.js'
import type { ToolUseContext } from '../Tool.js'
/** Opaque bundle held inside HostHandle (unpacked on the core side). */
export type WorkflowHostBundle = {
toolUseContext: ToolUseContext
canUseTool: CanUseToolFn
parentMessage?: AssistantMessage
agentId?: AgentId
}
/**
* Shared: builds the host bundle from toolUseContext/canUseTool.
* parentMessage is optional (absent on the panel launch path — claudeCodeBackend never reads it).
*/
export function buildHostBundle(
toolUseContext: WorkflowHostBundle['toolUseContext'],
canUseTool: WorkflowHostBundle['canUseTool'],
parentMessage?: AssistantMessage,
): WorkflowHostBundle {
return {
toolUseContext,
canUseTool,
...(parentMessage !== undefined ? { parentMessage } : {}),
agentId: toolUseContext.agentId,
}
}
export function makeHostHandle(bundle: WorkflowHostBundle): HostHandle {
return createHostHandle(bundle)
}
export function readHostBundle(handle: HostHandle): WorkflowHostBundle {
return unwrapHostHandle(handle) as WorkflowHostBundle
}

View File

@@ -0,0 +1,34 @@
import { join } from 'node:path'
import {
listNamedWorkflows,
WORKFLOW_DIR_NAME,
} from '@claude-code-best/workflow-engine'
import type { Command } from '../types/command.js'
import { getProjectRoot } from '../bootstrap/state.js'
/** Scan *.ts|*.js|*.mjs under .claude/workflows/ and generate a /<name> command for each. */
export async function getWorkflowCommands(
cwd: string = getProjectRoot(),
): Promise<Command[]> {
const dir = join(cwd, WORKFLOW_DIR_NAME)
const names = await listNamedWorkflows(dir)
return names.map(name => ({
type: 'prompt',
name,
description: `Run workflow: ${name}`,
kind: 'workflow',
source: 'builtin',
progressMessage: `Running workflow ${name}...`,
contentLength: 0,
async getPromptForCommand(args, _context) {
const argText =
typeof args === 'string' && args ? `\n\nArguments: ${args}` : ''
return [
{
type: 'text',
text: `Run the "${name}" workflow now by calling the Workflow tool with name="${name}".${argText}`,
},
]
},
}))
}

View File

@@ -0,0 +1,88 @@
/**
* Bridge for workflow status-change notifications.
*
* The engine emits events via progressEmitter.emit({ type: 'run_done', ... }),
* and the progress/store reducer records the status into RunProgress. But the
* old implementation had no code bridging status transitions to the host
* notification mechanism — the "notifies automatically on completion" promise
* in WorkflowTool's return text went unfulfilled.
*
* This module subscribes to WorkflowService.subscribe, watches status transitions
* from running → completed/failed/killed, and emits a host notification via the
* injected notifier callback (defaults to enqueuePendingNotification task-notification mode).
*/
import {
STATUS_TAG,
SUMMARY_TAG,
TASK_ID_TAG,
TASK_NOTIFICATION_TAG,
TASK_TYPE_TAG,
} from '../constants/xml.js'
import { enqueuePendingNotification } from '../utils/messageQueueManager.js'
import type { RunProgress } from './progress/store.js'
import type { WorkflowService } from './service.js'
const WORKFLOW_TASK_TYPE = 'local_workflow'
/** Notifier abstraction (lets tests inject a spy). */
export type WorkflowNotifier = (message: string) => void
const TERMINAL_STATUSES: ReadonlySet<RunProgress['status']> = new Set([
'completed',
'failed',
'killed',
])
/** Default notifier: uses the host message queue's task-notification mode. */
const defaultNotifier: WorkflowNotifier = message => {
enqueuePendingNotification({ value: message, mode: 'task-notification' })
}
export function installWorkflowNotifications(
service: WorkflowService,
notify: WorkflowNotifier = defaultNotifier,
): () => void {
const prevStatus = new Map<string, RunProgress['status'] | undefined>()
const unsubscribe = service.subscribe(() => {
const runs = service.listRuns()
for (const run of runs) {
const prev = prevStatus.get(run.runId)
// First time seeing this run: just record the current status without notifying
// (avoids treating existing historical runs as new notifications on install)
if (prev === undefined) {
prevStatus.set(run.runId, run.status)
continue
}
// Status changed + entered terminal state → emit notification
if (prev !== run.status && TERMINAL_STATUSES.has(run.status)) {
notify(buildMessage(run))
}
prevStatus.set(run.runId, run.status)
}
})
return () => {
unsubscribe()
prevStatus.clear()
}
}
function buildMessage(run: RunProgress): string {
const statusText =
run.status === 'completed'
? 'completed successfully'
: run.status === 'failed'
? 'failed'
: 'was stopped'
const errorSuffix =
run.status === 'failed' && run.error ? `: ${run.error}` : ''
const summary = `Workflow "${run.workflowName}" ${statusText}${errorSuffix}`
return `<${TASK_NOTIFICATION_TAG}>
<${TASK_ID_TAG}>${run.runId}</${TASK_ID_TAG}>
<${TASK_TYPE_TAG}>${WORKFLOW_TASK_TYPE}</${TASK_TYPE_TAG}>
<${STATUS_TAG}>${run.status}</${STATUS_TAG}>
<${SUMMARY_TAG}>${summary}</${SUMMARY_TAG}>
</${TASK_NOTIFICATION_TAG}>`
}

View File

@@ -0,0 +1,71 @@
import React from 'react';
import { Box, Text, useAnimationFrame } from '@anthropic/ink';
import type { Theme } from '@anthropic/ink';
import type { AgentProgress } from '../progress/store.js';
import { agentMetaText, agentVisual } from './status.js';
const SPINNER_FRAMES = ['·', '✢', '✱', '✶', '✻', '✽'];
const FRAME_MS = 120;
const LABEL_MAX = 18;
/**
* Truncate the label to at most max characters. Preserves the trailing `#number` suffix (the audit workflow
* `verify:${dim}#${findingIdx}` format) - so verify agent labels with multiple findings under the same dimension
* stay distinguishable (the prefix is elided with `…`). When there is no suffix, truncates from the right (legacy behavior).
* Exported for unit test coverage.
*/
export function truncateLabel(raw: string, max: number): string {
if (raw.length <= max) return raw;
const m = raw.match(/#\d+$/);
if (!m) return raw.slice(0, max);
const suffix = m[0]; // includes the # sign
const prefix = raw.slice(0, raw.length - suffix.length);
const available = max - suffix.length - 1; // -1 reserved for …
return `${prefix.slice(0, available)}${suffix}`;
}
/**
* Right-side agent list (already filtered by the selected phase).
* Selected row: only when this column has focus (focused=true) does it paint a selectionBg background (keeps fg, not inverse color);
* when focus is not on this column it does not paint the background color, to avoid a "fake focus".
* The status mark of a running agent is driven by useAnimationFrame via a spinner animation (shared clock, globally synchronized);
* the right side `model · Nk tok · N tool` is refreshed in real time by agent_progress / agent_done.
*/
export function AgentList({
agents,
selectedIndex,
focused,
}: {
agents: AgentProgress[];
selectedIndex: number;
focused: boolean;
}): React.ReactNode {
// Subscribe once to the animation frame at the top level: all running agents share the same frame (synchronized animation, avoids a per-row hook).
const [ref, time] = useAnimationFrame(FRAME_MS);
const frame = SPINNER_FRAMES[Math.floor(time / FRAME_MS) % SPINNER_FRAMES.length];
if (agents.length === 0) {
return <Text color="subtle">(no agents in this phase)</Text>;
}
return (
<Box ref={ref} flexDirection="column">
{agents.map((a, i) => {
const v = agentVisual(a);
const selected = i === selectedIndex;
const highlighted = selected && focused;
const running = a.status === 'running';
const mark = running ? frame : v.mark;
const label = truncateLabel(a.label ?? `agent-${a.id}`, LABEL_MAX);
return (
<Box key={a.id} backgroundColor={highlighted ? 'selectionBg' : undefined} justifyContent="space-between">
<Box>
<Text color={v.color as keyof Theme}>{mark}</Text>
<Text> {label}</Text>
</Box>
<Text color="subtle">{agentMetaText(a)}</Text>
</Box>
);
})}
</Box>
);
}

View File

@@ -0,0 +1,65 @@
import React from 'react';
import { Box, Text, useAnimationFrame } from '@anthropic/ink';
import type { Theme } from '@anthropic/ink';
import type { AgentProgress } from '../progress/store.js';
import { PHASE_COLOR, PHASE_MARK, type PhaseStatus } from './status.js';
import { ALL_PHASE, type MergedPhase } from './selectors.js';
const SPINNER_FRAMES = ['·', '✢', '✱', '✶', '✻', '✽'];
const FRAME_MS = 120;
type PhaseRow = {
title: string;
status?: PhaseStatus;
done: number;
total: number;
};
/**
* Left phase sidebar: the first row is All (aggregating done/total), followed by the merged phases (including pending ○).
* Selected row: only when this column has focus (focused=true) does it paint a selectionBg background (keeps fg, not inverse color) + a `>` marker;
* when focus is not on this column it does not paint the background color, to avoid a "fake focus". The status mark of a running phase is driven by useAnimationFrame via a spinner animation.
* Style aligns with the reference image: `> ✓ Scan 3/3`.
*/
export function PhaseSidebar({
phases,
agents,
selectedIndex,
focused,
}: {
phases: MergedPhase[];
agents: AgentProgress[];
selectedIndex: number;
focused: boolean;
}): React.ReactNode {
const [ref, time] = useAnimationFrame(FRAME_MS);
const frame = SPINNER_FRAMES[Math.floor(time / FRAME_MS) % SPINNER_FRAMES.length];
const totalAgents = agents.length;
const doneAgents = agents.filter(a => a.status === 'done').length;
const rows: PhaseRow[] = [{ title: ALL_PHASE, done: doneAgents, total: totalAgents }, ...phases];
return (
<Box ref={ref} flexDirection="column">
{rows.map((row, i) => {
const selected = i === selectedIndex;
const highlighted = selected && focused;
const running = row.status === 'running';
const mark = running ? frame : row.status ? PHASE_MARK[row.status] : ' ';
const color = (row.status ? PHASE_COLOR[row.status] : 'subtle') as keyof Theme;
return (
<Box key={row.title} backgroundColor={highlighted ? 'selectionBg' : undefined} justifyContent="space-between">
<Box>
<Text color={selected ? 'claude' : undefined}>{highlighted ? '>' : ' '}</Text>
<Text> </Text>
<Text color={color}>{mark}</Text>
<Text> {row.title}</Text>
</Box>
<Text color="subtle">
{row.done}/{row.total}
</Text>
</Box>
);
})}
</Box>
);
}

View File

@@ -0,0 +1,37 @@
import React from 'react';
import { Box, Text } from '@anthropic/ink';
import type { Theme } from '@anthropic/ink';
import type { RunProgress } from '../progress/store.js';
import { RUN_STATUS_COLOR, STATUS_DOT } from './status.js';
import { tabLabel } from './selectors.js';
/**
* Top run tab row: one tab per run (status dot + name + #short code).
* The current tab is highlighted with an orange ═ underline.
*/
export function TabsBar({ runs, activeRunId }: { runs: RunProgress[]; activeRunId: string | null }): React.ReactNode {
if (runs.length === 0) {
return <Text color="subtle">(no runs)</Text>;
}
return (
<Box>
{runs.map(r => {
const active = r.runId === activeRunId;
const label = tabLabel(r.workflowName, r.runId);
const underline = '═'.repeat(label.length + 2);
return (
<Box key={r.runId} flexDirection="column" marginRight={2}>
<Box>
<Text color={RUN_STATUS_COLOR[r.status] as keyof Theme}>{STATUS_DOT[r.status]}</Text>
<Text> </Text>
<Text color={active ? 'claude' : undefined} bold={active}>
{label}
</Text>
</Box>
<Text color={active ? 'claude' : undefined}>{active ? underline : ''}</Text>
</Box>
);
})}
</Box>
);
}

View File

@@ -0,0 +1,283 @@
import React, { useEffect, useRef, useState, useSyncExternalStore } from 'react';
import { Box, Dialog, Text, useAnimationFrame } from '@anthropic/ink';
import type { Theme } from '@anthropic/ink';
import type { LocalJSXCommandContext, LocalJSXCommandOnDone } from '../../types/command.js';
import { getWorkflowService } from '../service.js';
import type { RunProgress } from '../progress/store.js';
import { AgentList } from './AgentList.js';
import { PhaseSidebar } from './PhaseSidebar.js';
import { TabsBar } from './TabsBar.js';
import { RUN_STATUS_COLOR, RUN_STATUS_TEXT } from './status.js';
import { type FocusColumn, type WorkflowKeyboardHandlers, useWorkflowKeyboard } from './useWorkflowKeyboard.js';
import { ALL_PHASE, filterAgentsByPhase, formatDuration, mergePhases } from './selectors.js';
/**
* Clamp the selected index to a valid range (empty list -> 0; out of range -> last position; negative/NaN -> 0).
* Extracted into a module-level pure function: called inside the panel + unit tested for the same logic, to avoid behavior drift.
*/
export function clampSelected(selected: number, len: number): number {
if (len === 0) return 0;
const n = Math.trunc(selected);
if (Number.isNaN(n) || n < 0) return 0;
return Math.min(n, len - 1);
}
/**
* Determine whether the focused run completed the running -> terminal state transition (used for panel auto-exit).
* Extracted into a pure function for easy unit testing; called directly inside the panel's useEffect.
*
* Trigger condition: prev and curr are the same runId, prev is running, curr is completed/failed/killed.
* - Opening the history panel (prev=null): does not trigger
* - Switching to an already completed tab (different runId): does not trigger
* - Same run running -> terminal: triggers
*/
export function isRunTerminatedTransition(
prev: { runId: string; status: RunProgress['status'] } | null,
curr: { runId: string; status: RunProgress['status'] } | null,
): boolean {
if (!prev || !curr) return false;
if (prev.runId !== curr.runId) return false;
if (prev.status !== 'running') return false;
return curr.status === 'completed' || curr.status === 'failed' || curr.status === 'killed';
}
/**
* /workflows main panel: three-region focus model (top tab + left phase sidebar + right agent list).
*
* - useSyncExternalStore subscribes to WorkflowService (the store returns stable snapshots, no re-render without change).
* - Focus state: activeRunId / focusColumn('phases'|'agents') / selectedPhaseIndex(0=All) / selectedAgentIndex.
* - Keybindings: Tab switch run · Left/Right switch focus column · Up/Down move within column · x kill · r resume · q/Esc quit.
*/
export function WorkflowsPanel({
onDone,
context,
}: {
onDone: LocalJSXCommandOnDone;
context: LocalJSXCommandContext;
}): React.ReactNode {
const svc = getWorkflowService();
const runs = useSyncExternalStore(
svc.subscribe,
() => svc.listRuns(),
() => [],
);
const [activeRunId, setActiveRunId] = useState<string | null>(null);
const [focusColumn, setFocusColumn] = useState<FocusColumn>('phases');
const [selectedPhaseIndex, setSelectedPhaseIndex] = useState(0);
const [selectedAgentIndex, setSelectedAgentIndex] = useState(0);
// kill secondary confirmation. null = no dialog; 'workflow' = kill the whole run; 'agent' = kill the currently selected agent.
// When non-null the keyboard enters confirm mode (only y/Enter/n/Esc/q respond).
const [confirmKill, setConfirmKill] = useState<null | 'agent' | 'workflow'>(null);
// On mount, trigger a single disk scan to hydrate historical runs (the service's internal persistedLoaded flag guards idempotency).
// Re-mount / re-render does not scan again (guarded by the process-singleton flag). The svc reference is stable (getWorkflowService singleton).
useEffect(() => {
void svc.loadPersistedRuns();
}, [svc]);
// On runs change: activeRunId invalidated (killed / first time) -> clamp to the first one
useEffect(() => {
if (runs.length === 0) {
if (activeRunId !== null) setActiveRunId(null);
return;
}
if (!runs.some(r => r.runId === activeRunId)) {
setActiveRunId(runs[0]!.runId);
}
}, [runs, activeRunId]);
const focused: RunProgress | undefined = runs.find(r => r.runId === activeRunId);
const phases = focused ? mergePhases(focused) : [];
// The sidebar includes the All row: prepend one item to the phases array -> total rows = phases.length + 1
const phaseRowCount = phases.length + 1;
const clampedPhase = clampSelected(selectedPhaseIndex, phaseRowCount);
// Auto-exit the panel when the focused run transitions from running to terminal (800ms delay so the user sees the ✓/✗ terminal state).
// Only triggered by a state transition on the same runId: switching to an already completed tab (prev was a different run) does not exit; opening the history panel
// (prev=null) does not exit either. Otherwise the agent is blocked by the panel while waiting for the Workflow tool result, and the user must press q manually.
const prevFocusedRef = useRef<{ runId: string; status: RunProgress['status'] } | null>(null);
useEffect(() => {
const curr = focused ? { runId: focused.runId, status: focused.status } : null;
const prev = prevFocusedRef.current;
prevFocusedRef.current = curr;
if (!isRunTerminatedTransition(prev, curr)) return;
const timer = setTimeout(() => onDone(), 800);
return (): void => {
clearTimeout(timer);
};
}, [focused?.runId, focused?.status, onDone]);
// Selected phase title (0 = All = undefined)
const selectedPhaseTitle = clampedPhase === 0 ? undefined : phases[clampedPhase - 1]?.title;
const visibleAgents = focused ? filterAgentsByPhase(focused.agents, selectedPhaseTitle) : [];
const clampedAgent = clampSelected(selectedAgentIndex, visibleAgents.length);
const switchTab = (runId: string): void => {
setActiveRunId(runId);
setFocusColumn('phases');
setSelectedPhaseIndex(0);
setSelectedAgentIndex(0);
};
const nextTab = (): void => {
if (runs.length === 0) return;
const idx = runs.findIndex(r => r.runId === activeRunId);
const next = runs[(idx + 1) % runs.length]!;
switchTab(next.runId);
};
const prevTab = (): void => {
if (runs.length === 0) return;
const idx = runs.findIndex(r => r.runId === activeRunId);
const next = runs[(idx - 1 + runs.length) % runs.length]!;
switchTab(next.runId);
};
const handlers: WorkflowKeyboardHandlers = {
nextTab,
prevTab,
focusLeft: () => setFocusColumn('phases'),
focusRight: () => setFocusColumn('agents'),
moveUp: () => {
if (focusColumn === 'phases') setSelectedPhaseIndex(s => clampSelected(s - 1, phaseRowCount));
else setSelectedAgentIndex(s => clampSelected(s - 1, visibleAgents.length));
},
moveDown: () => {
if (focusColumn === 'phases') setSelectedPhaseIndex(s => clampSelected(s + 1, phaseRowCount));
else setSelectedAgentIndex(s => clampSelected(s + 1, visibleAgents.length));
},
killAgent: () => {
// Only pop the agent confirmation when the agents column is focused (pressing x in the phases column has no target, no-op).
// The selected agent is decided by visibleAgents[clampedAgent]; saved into confirmKill and then
// actually executed by confirmYes - to avoid mis-killing caused by visibleAgents changing between two renders.
if (focusColumn !== 'agents' || !focused) return;
const agent = visibleAgents[clampedAgent];
if (!agent) return;
setConfirmKill('agent');
},
killWorkflow: () => {
if (!focused) return;
setConfirmKill('workflow');
},
resumeFocused: () => {
if (!focused) return;
const canUseTool = context.canUseTool;
if (!canUseTool) {
onDone('resume needs canUseTool context; run /<name> resume from the main session.');
return;
}
void svc
.launch({ resumeFromRunId: focused.runId, name: focused.workflowName }, context, canUseTool)
.catch(e => onDone(`resume failed: ${(e as Error).message}`));
},
newRun: () => onDone('Tip: start a named workflow with /<name>, or pass name via the Workflow tool.'),
quit: () => {
// In confirm mode q = cancel confirmation (routeWorkflowKey already routed to confirmNo);
// only in non-confirm mode does it really exit the panel.
if (confirmKill !== null) {
setConfirmKill(null);
return;
}
onDone();
},
confirmYes: () => {
if (confirmKill === 'workflow' && focused) {
svc.kill(focused.runId);
// After killing the entire workflow, immediately return to the main chat: the run_done event -> the store reducer changes the status to
// killed -> notifications.ts bridges enqueuePendingNotification, and the main chat shows
// `Workflow "<name>" was stopped`. Staying on the panel would instead make the user miss the "stopped" feedback.
setConfirmKill(null);
onDone();
return;
} else if (confirmKill === 'agent' && focused) {
const agent = visibleAgents[clampedAgent];
if (agent) svc.killAgent(focused.runId, agent.id);
}
setConfirmKill(null);
},
confirmNo: () => setConfirmKill(null),
};
useWorkflowKeyboard(handlers, confirmKill !== null ? 'confirm' : 'normal');
const running = runs.filter(r => r.status === 'running').length;
const done = runs.length - running;
const phaseHeader = selectedPhaseTitle ?? ALL_PHASE;
const agentDone = focused ? focused.agents.filter(a => a.status === 'done').length : 0;
// Refresh the header duration every second (shared clock; subscribing triggers re-render, duration follows wall clock).
const [clockRef] = useAnimationFrame(1000);
const elapsed = focused ? Date.now() - focused.startedAt : 0;
return (
<Box ref={clockRef} flexDirection="column" borderStyle="round" borderColor="claude" paddingX={1}>
<Box justifyContent="space-between">
<Text bold>{focused?.workflowName ?? 'Workflows'}</Text>
{focused ? (
<Text color="subtle">
{agentDone}/{focused.agentCount} agents · {formatDuration(elapsed)} ·{' '}
<Text color={RUN_STATUS_COLOR[focused.status] as keyof Theme}>{RUN_STATUS_TEXT[focused.status]}</Text>
</Text>
) : (
<Text color="subtle">
{running} running · {done} done
</Text>
)}
</Box>
{focused?.description ? <Text color="subtle">{focused.description}</Text> : null}
{runs.length > 1 ? (
<Box marginTop={1}>
<TabsBar runs={runs} activeRunId={activeRunId} />
</Box>
) : null}
<Box flexDirection="row" marginTop={1}>
<Box width="25%" flexDirection="column">
<Text color={focusColumn === 'phases' ? 'claude' : 'subtle'} bold>
Phases
</Text>
<PhaseSidebar
phases={phases}
agents={focused?.agents ?? []}
selectedIndex={clampedPhase}
focused={focusColumn === 'phases'}
/>
</Box>
<Text color="subtle"></Text>
<Box flexGrow={1} flexDirection="column">
<Text color={focusColumn === 'agents' ? 'claude' : 'subtle'} bold>
{phaseHeader} · {visibleAgents.length} agents
</Text>
<AgentList agents={visibleAgents} selectedIndex={clampedAgent} focused={focusColumn === 'agents'} />
</Box>
</Box>
<Box marginTop={1}>
<Text color="subtle">
{confirmKill !== null
? 'Confirm: y kill · n/Esc cancel'
: 'Tab switch run · ←/→ focus · ↑/↓ move · x kill agent · K kill workflow · r resume · q quit'}
</Text>
</Box>
{confirmKill !== null ? (
<Dialog
title={
confirmKill === 'workflow'
? `Kill workflow "${focused?.workflowName ?? ''}"?`
: `Kill agent "${visibleAgents[clampedAgent]?.label ?? ''}"?`
}
subtitle={
confirmKill === 'workflow'
? 'All in-flight agents will be aborted. Resume will replay from journal.'
: 'Only this agent aborts; other agents in the workflow keep running.'
}
onCancel={() => setConfirmKill(null)}
color="warning"
>
<Text color="subtle">Press y to confirm, or n/Esc to cancel.</Text>
</Dialog>
) : null}
</Box>
);
}

View File

@@ -0,0 +1,16 @@
import type { LocalJSXCommandCall } from '../../types/command.js';
import { SentryErrorBoundary } from '../../components/SentryErrorBoundary.js';
import { WorkflowsPanel } from './WorkflowsPanel.js';
/**
* local-jsx call for /workflows: builds the panel element and returns it for Ink to render.
*
* Wrapped in SentryErrorBoundary: when useSyncExternalStore / listNamed / child components
* throw, the exception must not break through to the REPL top level and crash the whole session; the boundary falls back to a local error card.
* onDone/context are injected by the command runtime; args is unused (the panel has no parameterized behavior).
*/
export const call: LocalJSXCommandCall = async (onDone, context, _args) => (
<SentryErrorBoundary name="WorkflowsPanel">
<WorkflowsPanel onDone={onDone} context={context} />
</SentryErrorBoundary>
);

View File

@@ -0,0 +1,71 @@
import type { AgentProgress, RunProgress } from '../progress/store.js'
import type { PhaseStatus } from './status.js'
/** Title of the fixed "no filter" item (first row of the sidebar). */
export const ALL_PHASE = 'All'
/** Merged phase (including pending), with done/total counts of agents under that phase. */
export type MergedPhase = {
title: string
status: PhaseStatus
done: number
total: number
}
/**
* Merge declaredPhases (declared by meta) and run.phases (actually running/done):
* - Declared order takes priority; phases present in actual but not declared are appended at the end.
* - No actual record -> pending; otherwise take the actual status.
* - done/total = done under that phase / total agents under that phase.
*/
export function mergePhases(
run: Pick<RunProgress, 'declaredPhases' | 'phases' | 'agents'>,
): MergedPhase[] {
const actualByTitle = new Map(run.phases.map(p => [p.title, p]))
const seen = new Set<string>()
const out: MergedPhase[] = []
const push = (title: string): void => {
if (seen.has(title)) return
seen.add(title)
const actual = actualByTitle.get(title)
const status: PhaseStatus = !actual ? 'pending' : actual.status
const inPhase = run.agents.filter(a => a.phase === title)
out.push({
title,
status,
done: inPhase.filter(a => a.status === 'done').length,
total: inPhase.length,
})
}
for (const t of run.declaredPhases) push(t)
for (const p of run.phases) push(p.title)
return out
}
/**
* Filter agents by the selected phase.
* selectedPhase undefined or ALL_PHASE -> all.
*/
export function filterAgentsByPhase(
agents: AgentProgress[],
selectedPhase: string | undefined,
): AgentProgress[] {
if (selectedPhase === undefined || selectedPhase === ALL_PHASE) return agents
return agents.filter(a => a.phase === selectedPhase)
}
/** tab label: workflow name + `#` + last 4 chars of runId (disambiguates same-name runs). */
export function tabLabel(workflowName: string, runId: string): string {
return `${workflowName}#${runId.slice(-4)}`
}
/** milliseconds -> compact duration (<60s -> `Ns`; <60m -> `MmSSs`; otherwise `HhMMm`). Used by the panel header. */
export function formatDuration(ms: number): string {
const s = Math.floor(ms / 1000)
if (s < 60) return `${s}s`
const m = Math.floor(s / 60)
const ss = s % 60
if (m < 60) return `${m}m${String(ss).padStart(2, '0')}s`
const h = Math.floor(m / 60)
return `${h}h${String(m % 60).padStart(2, '0')}m`
}

View File

@@ -0,0 +1,73 @@
import type { AgentProgress, RunProgress } from '../progress/store.js'
/** run status -> dot character (used by top tab). */
export const STATUS_DOT: Record<RunProgress['status'], string> = {
running: '●',
completed: '✓',
failed: '✗',
killed: '■',
}
/** run status -> ink theme color token (follows existing WorkflowList palette). */
export const RUN_STATUS_COLOR: Record<RunProgress['status'], string> = {
running: 'warning',
completed: 'success',
failed: 'error',
killed: 'subtle',
}
/** run status -> display text (used by header; aligns with reference image done/running). */
export const RUN_STATUS_TEXT: Record<RunProgress['status'], string> = {
running: 'running',
completed: 'done',
failed: 'failed',
killed: 'killed',
}
/** merged phase status in the sidebar (includes pending: declared by meta but not started). */
export type PhaseStatus = 'running' | 'done' | 'pending'
export const PHASE_MARK: Record<PhaseStatus, string> = {
running: '●',
done: '✓',
pending: '○',
}
export const PHASE_COLOR: Record<PhaseStatus, string> = {
running: 'warning',
done: 'success',
pending: 'subtle',
}
/** visual for an agent row: mark character + color (running has the mark overridden by a spinner animation in UI). */
export type AgentVisual = { mark: string; color: string }
/**
* agent status -> visual.
* - running -> ● warning (UI overrides mark with spinner animation)
* - done·dead -> ✗ error
* - done·ok -> ✓ success
*/
export function agentVisual(a: AgentProgress): AgentVisual {
if (a.status === 'running') return { mark: '●', color: 'warning' }
if (a.resultKind === 'dead') return { mark: '✗', color: 'error' }
return { mark: '✓', color: 'success' }
}
/** token count -> display string (<1000 keeps the raw value; otherwise keeps 1 decimal + k). */
export function formatTokenCount(n: number | undefined): string {
if (!n) return '0'
return n >= 1000 ? `${(n / 1000).toFixed(1)}k` : String(n)
}
/**
* right-side stats text for an agent row: `model · Nk tok · N tool`.
* Omits the prefix when there is no model; token/tool refresh in real time via agent_progress while running.
*/
export function agentMetaText(a: AgentProgress): string {
const parts: string[] = []
if (a.model) parts.push(a.model)
parts.push(`${formatTokenCount(a.tokenCount)} tok`)
parts.push(`${a.toolCount ?? 0} tool`)
return parts.join(' · ')
}

View File

@@ -0,0 +1,145 @@
import { useInput } from '@anthropic/ink'
/** The column that currently has focus. */
export type FocusColumn = 'phases' | 'agents'
/** Keyboard mode: normal = regular navigation; confirm = a Dialog is open, waiting for the user's y/n confirmation. */
export type WorkflowKeyboardMode = 'normal' | 'confirm'
/** Subset of the useInput key object (only declares the fields we use, to avoid coupling to the ink Key type). */
type KeyEvent = {
tab?: boolean
shift?: boolean
escape?: boolean
return?: boolean
leftArrow?: boolean
rightArrow?: boolean
upArrow?: boolean
downArrow?: boolean
}
/** key -> action (pure function, easy to unit test; no rendering dependencies). */
export type WorkflowKeyAction =
| 'nextTab'
| 'prevTab'
| 'focusLeft'
| 'focusRight'
| 'moveUp'
| 'moveDown'
| 'killAgent'
| 'killWorkflow'
| 'resume'
| 'newRun'
| 'quit'
| 'confirmYes'
| 'confirmNo'
export function routeWorkflowKey(
input: string,
key: KeyEvent,
mode: WorkflowKeyboardMode = 'normal',
): WorkflowKeyAction | null {
// confirm mode: only y/Enter confirms, n/Esc/q cancels, all other keys are swallowed (prevent mis-touch)
if (mode === 'confirm') {
if (input === 'y' || input === 'Y' || key.return) return 'confirmYes'
if (input === 'n' || input === 'N' || key.escape || input === 'q') {
return 'confirmNo'
}
return null
}
// @anthropic/ink sets key.tab to true for the Tab key; some environments fall back to '\t'
if (key.tab || input === '\t') return key.shift ? 'prevTab' : 'nextTab'
if (key.escape || input === 'q') return 'quit'
// Capital K = kill the entire workflow; lowercase x = kill the currently selected agent (agents column only).
// Case distinction avoids x accidentally triggering workflow kill; K explicitly requires Shift, hinting at a "heavy operation".
if (input === 'K') return 'killWorkflow'
if (input === 'x') return 'killAgent'
if (input === 'r') return 'resume'
if (input === 'n') return 'newRun'
if (key.leftArrow) return 'focusLeft'
if (key.rightArrow) return 'focusRight'
if (key.upArrow) return 'moveUp'
if (key.downArrow) return 'moveDown'
return null
}
/** Focus model callbacks (injected by WorkflowsPanel). */
export type WorkflowKeyboardHandlers = {
nextTab: () => void
prevTab: () => void
focusLeft: () => void
focusRight: () => void
moveUp: () => void
moveDown: () => void
/** Request killing the currently selected agent (panel pops a Dialog for secondary confirmation). */
killAgent: () => void
/** Request killing the entire workflow (panel pops a Dialog for secondary confirmation). */
killWorkflow: () => void
resumeFocused: () => void
newRun: () => void
quit: () => void
/** User confirms in confirm mode (y/Enter). */
confirmYes: () => void
/** User cancels in confirm mode (n/Esc/q). */
confirmNo: () => void
}
/**
* /workflows panel keybindings (focus rotation model):
* - Tab / Shift+Tab: switch the top run tab
* - Left / Right: switch focus between phases and agents
* - Up / Down: move within the currently focused column
* - x kill single agent · K kill the entire workflow (with Dialog secondary confirmation) · r resume · n new · q / Esc quit
*
* @param mode In confirm mode only y/n/Esc/q are accepted, all other keys are swallowed - avoid mis-navigation inside the confirmation dialog.
*/
export function useWorkflowKeyboard(
h: WorkflowKeyboardHandlers,
mode: WorkflowKeyboardMode = 'normal',
): void {
useInput((input, key) => {
const action = routeWorkflowKey(input, key as KeyEvent, mode)
if (action === null) return
switch (action) {
case 'nextTab':
h.nextTab()
break
case 'prevTab':
h.prevTab()
break
case 'focusLeft':
h.focusLeft()
break
case 'focusRight':
h.focusRight()
break
case 'moveUp':
h.moveUp()
break
case 'moveDown':
h.moveDown()
break
case 'killAgent':
h.killAgent()
break
case 'killWorkflow':
h.killWorkflow()
break
case 'resume':
h.resumeFocused()
break
case 'newRun':
h.newRun()
break
case 'quit':
h.quit()
break
case 'confirmYes':
h.confirmYes()
break
case 'confirmNo':
h.confirmNo()
break
}
})
}

131
src/workflow/persistence.ts Normal file
View File

@@ -0,0 +1,131 @@
import { mkdir, readFile, readdir, rename, writeFile } from 'node:fs/promises'
import { join } from 'node:path'
import { getProjectRoot } from '../bootstrap/state.js'
import { logForDebugging } from '../utils/debug.js'
import type { ProgressBus } from './progress/bus.js'
import type { ProgressStore, RunProgress } from './progress/store.js'
/** Current schema version of state.json; introduces a migration chain on upgrade. */
const SCHEMA_VERSION = 1
const STATE_FILE = 'state.json'
const STATE_TMP = 'state.json.tmp'
/**
* Single source for runsDir: shares the same root as ports.ts journalStore (${projectRoot}/.claude/workflow-runs).
* Extracted as a function: eliminates duplicated path concatenation between ports.ts and persistence logic, staying in the same root when entering worktree/subdirectory.
* Tests monkey-patch this function to point at a tmpdir.
*/
export function getRunsDir(): string {
return join(getProjectRoot(), '.claude', 'workflow-runs')
}
type StateFile = {
schemaVersion: number
run: RunProgress
}
/**
* Atomically overwrite the terminal RunProgress to <runsDir>/<runId>/state.json.
* Atomicity: writeFile(tmp) → rename(tmp, target), rename is atomic; worst case leaves tmp, next write overwrites it.
* Failure is best-effort: IO exceptions only log a warn, do not throw (workflow already succeeded; persistence failure only means it cannot be retrieved after restart).
*/
export async function writeRunState(
runsDir: string,
run: RunProgress,
): Promise<void> {
const dir = join(runsDir, run.runId)
const target = join(dir, STATE_FILE)
const tmp = join(dir, STATE_TMP)
const payload: StateFile = { schemaVersion: SCHEMA_VERSION, run }
try {
await mkdir(dir, { recursive: true })
await writeFile(tmp, JSON.stringify(payload), 'utf-8')
await rename(tmp, target)
} catch (e) {
logForDebugging(
`[workflow warn] writeRunState failed for ${run.runId}: ${(e as Error).message}`,
)
}
}
/**
* Read <runsDir>/<runId>/state.json with fault tolerance:
* - File does not exist → null (caller treats it as a miss)
* - JSON parse failure / schema structure mismatch / schemaVersion mismatch → null (log warn, do not crash)
*/
export async function readRunState(
runsDir: string,
runId: string,
): Promise<RunProgress | null> {
const target = join(runsDir, runId, STATE_FILE)
let raw: string
try {
raw = await readFile(target, 'utf-8')
} catch {
return null
}
try {
const parsed = JSON.parse(raw) as Partial<StateFile>
if (parsed.schemaVersion !== SCHEMA_VERSION) return null
const run = parsed.run
if (!run || typeof run !== 'object') return null
if (typeof run.runId !== 'string') return null
if (typeof run.status !== 'string') return null
return run as RunProgress
} catch (e) {
logForDebugging(
`[workflow warn] readRunState parse failed for ${runId}: ${(e as Error).message}`,
)
return null
}
}
/**
* Scan all subdirectories under runsDir, read each state.json, return a list of non-null RunProgress.
* - runsDir does not exist → empty array
* - A subdirectory without state.json (half-written run) → skip
* - A subdirectory whose state.json is corrupted → skip that single one, keep scanning the rest
* - Sort by updatedAt descending (consistent with store.list() ordering)
*/
export async function listPersistedRuns(
runsDir: string,
): Promise<RunProgress[]> {
let entries: string[]
try {
entries = await readdir(runsDir)
} catch {
return []
}
const runs: RunProgress[] = []
for (const name of entries) {
const run = await readRunState(runsDir, name)
if (run) runs.push(run)
}
return runs.sort((a, b) => b.updatedAt - a.updatedAt)
}
/**
* Subscribe to the bus's run_done event and write the terminal RunProgress to state.json on disk.
* Covers all three terminal states (completed/failed/killed; shutdown-kill also routes to run_done killed).
* The store registers to the bus before this subscription, so when the listener runs store.get(runId) is already terminal.
* Returns an unsubscribe function (for test cleanup).
*
* Disk write is best-effort: writeRunState swallows IO exceptions and only logs, does not propagate —
* so other bus subscribers (store, etc.) are not affected by persistence failures.
*
* @param runsDirProvider Optional runsDir resolver (defaults to getRunsDir).
* Production path uses the default; tests inject a tmpdir to avoid writing to the real project directory (Bun ESM module namespace is read-only,
* cannot monkey-patch getRunsDir itself).
*/
export function attachRunStatePersistence(
bus: ProgressBus,
store: ProgressStore,
runsDirProvider: () => string = getRunsDir,
): () => void {
return bus.subscribe(event => {
if (event.type !== 'run_done') return
const run = store.get(event.runId)
if (!run) return
void writeRunState(runsDirProvider(), run)
})
}

202
src/workflow/ports.ts Normal file
View File

@@ -0,0 +1,202 @@
import {
createFileJournalStore,
type ProgressEvent,
type WorkflowPorts,
} from '@claude-code-best/workflow-engine'
import { logForDebugging } from '../utils/debug.js'
import { getProjectRoot } from '../bootstrap/state.js'
import { getRunsDir } from './persistence.js'
import {
type AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
logEvent,
} from '../services/analytics/index.js'
import {
completeWorkflowTask,
failWorkflowTask,
killWorkflowTask,
registerLocalWorkflowTask,
} from '../tasks/LocalWorkflowTask/LocalWorkflowTask.js'
import {
buildHostBundle,
makeHostHandle,
readHostBundle,
type WorkflowHostBundle,
} from './hostHandle.js'
import { buildRegistry } from './registry.js'
import type { ProgressBus } from './progress/bus.js'
import type { ProgressStore } from './progress/store.js'
import type { SetAppState } from '../Task.js'
import type { AssistantMessage } from '../types/message.js'
type RunBinding = {
runId: string
taskId: string
setAppState: SetAppState
abortController: AbortController
workflowName: string
/** agentId → AbortController. Registered when backend starts an agent; killAgent uses it for precise abort. */
agentAbortControllers: Map<number, AbortController>
}
/** Constructs a WorkflowHostContext from toolUseContext on each tool invocation. */
function makeHostFactory(): WorkflowPorts['hostFactory'] {
return ({ context, canUseTool, parentMessage }) => {
const ctx = context as WorkflowHostBundle['toolUseContext'] & {
agentId?: string
}
return {
handle: makeHostHandle(
buildHostBundle(
ctx,
canUseTool as WorkflowHostBundle['canUseTool'],
parentMessage as AssistantMessage | undefined,
),
),
// Use projectRoot rather than getCwd(): shares the same root as journalStore's runsDir,
// otherwise named workflow resolution and journal persistence diverge when the user
// enters a worktree/sub-directory. The engine's internal ctx.cwd is only used for
// resolution (scriptPath/name) and does not affect the agent's execution cwd
// (the agent gets its own cwd via the toolUseContext inside the host bundle).
cwd: getProjectRoot(),
budgetTotal: null, // turn-level budget injection point (read from settings in the future)
...(ctx.toolUseId ? { toolUseId: ctx.toolUseId } : {}),
}
}
}
/**
* Assembles the complete WorkflowPorts. bus/store are passed in by the caller (shared via the service singleton).
* taskRegistrar maintains runId → RunBinding for kill routing.
*/
export function createWorkflowPorts(opts: {
bus: ProgressBus
store: ProgressStore
}): WorkflowPorts {
const bindings = new Map<string, RunBinding>()
const runsDir = getRunsDir()
const registry = buildRegistry()
// Telemetry subscription (independent of store). LogEventMetadata only accepts boolean/number/undefined,
// and runId is a string — use the brand cast provided by the analytics module (verified non-code/path) to pass it through.
opts.bus.subscribe((e: ProgressEvent) => {
if (e.type === 'run_done') {
logEvent('tengu_workflow_done', {
status: e.status === 'completed' ? 0 : e.status === 'failed' ? 1 : 2,
runId:
e.runId as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
})
}
})
const taskRegistrar: WorkflowPorts['taskRegistrar'] = {
register(regOpts, host) {
const bundle = readHostBundle(host)
const setAppState =
bundle.toolUseContext.setAppStateForTasks ??
bundle.toolUseContext.setAppState
const abortController = new AbortController()
const taskId = registerLocalWorkflowTask(setAppState, {
description: regOpts.summary ?? regOpts.workflowName,
workflowName: regOpts.workflowName,
workflowFile: regOpts.workflowFile ?? '',
summary: regOpts.summary,
...(regOpts.toolUseId ? { toolUseId: regOpts.toolUseId } : {}),
abortController,
})
const runId = regOpts.runId ?? taskId
bindings.set(runId, {
runId,
taskId,
setAppState,
abortController,
workflowName: regOpts.workflowName,
agentAbortControllers: new Map(),
})
logForDebugging(
`workflow task registered: ${runId} (${regOpts.workflowName})`,
)
return { runId, signal: abortController.signal }
},
complete(runId, summary) {
const b = bindings.get(runId)
if (!b) return
completeWorkflowTask(b.taskId, b.setAppState)
logForDebugging(`workflow ${runId} completed: ${summary ?? ''}`)
bindings.delete(runId)
},
fail(runId, error) {
const b = bindings.get(runId)
if (!b) return
failWorkflowTask(b.taskId, b.setAppState, error)
logForDebugging(`workflow ${runId} failed: ${error}`)
bindings.delete(runId)
},
kill(runId) {
const b = bindings.get(runId)
if (!b) return
killWorkflowTask(b.taskId, b.setAppState) // internal abort controller
// Killing the run also aborts all in-flight agents (guards against the edge timing where the backend misses the task abort)
for (const ac of b.agentAbortControllers.values()) {
try {
ac.abort()
} catch {
// no-op: abort won't throw internally, but fail-closed
}
}
b.agentAbortControllers.clear()
bindings.delete(runId)
},
registerAgentAbort(runId, agentId, ac) {
const b = bindings.get(runId)
if (!b) return
b.agentAbortControllers.set(agentId, ac)
},
unregisterAgentAbort(runId, agentId) {
const b = bindings.get(runId)
if (!b) return
b.agentAbortControllers.delete(agentId)
},
killAgent(runId, agentId) {
const b = bindings.get(runId)
if (!b) return false
const ac = b.agentAbortControllers.get(agentId)
if (!ac) return false
try {
ac.abort()
} catch {
// no-op
}
b.agentAbortControllers.delete(agentId)
return true
},
pendingAction() {
return null // v1: skip/retry not wired (seam retained)
},
}
return {
hostFactory: makeHostFactory(),
agentAdapterRegistry: registry,
agentRunner: {
// Dead-code fallback: hooks always go through agentAdapterRegistry (required on ports). Reaching here means the registry was not registered — fail-fast.
async runAgentToResult() {
throw new Error(
'workflow agentRunner fallback reached — agentAdapterRegistry must be set on ports',
)
},
},
progressEmitter: {
emit(event) {
opts.bus.emit(event) // → store reducer + telemetry
},
},
taskRegistrar,
journalStore: createFileJournalStore(runsDir),
permissionGate: { isAborted: () => false }, // engine uses ctx.signal to check abort
logger: {
debug: msg => logForDebugging(msg),
warn: msg => logForDebugging(`[workflow warn] ${msg}`),
event: name => logForDebugging(`workflow event: ${name}`),
},
}
}

View File

@@ -0,0 +1,20 @@
import type { ProgressEvent } from '@claude-code-best/workflow-engine'
/** Typed progress event bus. engine progressEmitter.emit -> broadcasts to all subscribers (store / telemetry). */
export type ProgressBus = {
emit(event: ProgressEvent): void
subscribe(listener: (event: ProgressEvent) => void): () => void
}
export function createProgressBus(): ProgressBus {
const listeners = new Set<(event: ProgressEvent) => void>()
return {
emit(event) {
for (const fn of listeners) fn(event)
},
subscribe(listener) {
listeners.add(listener)
return () => listeners.delete(listener)
},
}
}

View File

@@ -0,0 +1,200 @@
import type { ProgressEvent } from '@claude-code-best/workflow-engine'
import type { ProgressBus } from './bus.js'
export type AgentProgress = {
/** Unique id stamped by the engine, precisely correlates started/done (fixes the old LIFO race condition). */
id: number
label?: string
phase?: string
status: 'running' | 'done'
resultKind?: string
/** Only meaningful when done·ok: output is an object -> 'object', otherwise -> 'text'. None for dead/skipped. */
outputShape?: 'text' | 'object'
/** Actually parsed model id (carried in by agent_done; none while running). */
model?: string
/** Cumulative context tokens (live via agent_progress / final value settled by agent_done). */
tokenCount?: number
/** Cumulative tool-call count (live via agent_progress / final value settled by agent_done). */
toolCount?: number
}
export type RunProgress = {
runId: string
workflowName: string
status: 'running' | 'completed' | 'failed' | 'killed'
phases: Array<{ title: string; status: 'running' | 'done' }>
/** From run_started.meta.phases[].title; the panel uses this to show pending(○) phases. [] when no meta. */
declaredPhases: string[]
currentPhase: string | null
agents: AgentProgress[]
agentCount: number
returnValue?: unknown
error?: string
/** run_started timestamp (used by the panel to compute run duration). */
startedAt: number
/** workflow description (from run_started.meta.description). */
description?: string
updatedAt: number
}
export type ProgressStore = {
apply(event: ProgressEvent): void
list(): RunProgress[]
get(runId: string): RunProgress | undefined
/** Directly inject a run read from disk (bypassing bus); skips existing runId - in-memory takes priority. */
hydrate(run: RunProgress): void
/** For useSyncExternalStore: returns a stable reference, the same array when no change. */
subscribe(listener: () => void): () => void
getSnapshot(): RunProgress[]
}
/** Build a reactive store from the bus: subscribe to the bus, reduce events, notify React subscribers. */
export function createProgressStoreFromBus(bus: ProgressBus): ProgressStore {
const byId = new Map<string, RunProgress>()
let snapshot: RunProgress[] = []
const listeners = new Set<() => void>()
const notify = (): void => {
snapshot = [...byId.values()].sort((a, b) => b.updatedAt - a.updatedAt)
for (const fn of listeners) fn()
}
const ensure = (runId: string, workflowName: string): RunProgress => {
let p = byId.get(runId)
if (!p) {
p = {
runId,
workflowName,
status: 'running',
phases: [],
declaredPhases: [],
currentPhase: null,
agents: [],
agentCount: 0,
startedAt: Date.now(),
updatedAt: Date.now(),
}
byId.set(runId, p)
}
return p
}
const apply = (event: ProgressEvent): void => {
// log produces no visible state change (panel has no log view): early exit to avoid pointless snapshot rebuild and React re-render
if (event.type === 'log') return
const runId = event.runId
const p = ensure(
runId,
'workflowName' in event ? event.workflowName : 'workflow',
)
p.updatedAt = Date.now()
switch (event.type) {
case 'run_started':
p.workflowName = event.workflowName
p.status = 'running'
p.declaredPhases = event.meta?.phases?.map(ph => ph.title) ?? []
p.description = event.meta?.description ?? undefined
break
case 'phase_started':
if (!p.phases.some(ph => ph.title === event.phase)) {
p.phases.push({ title: event.phase, status: 'running' })
}
p.currentPhase = event.phase
break
case 'phase_done':
for (const ph of p.phases)
if (ph.title === event.phase) ph.status = 'done'
if (p.currentPhase === event.phase) p.currentPhase = null
break
case 'agent_started': {
let a = p.agents.find(x => x.id === event.agentId)
if (!a) {
a = {
id: event.agentId,
label: event.label,
phase: event.phase,
status: 'running',
}
p.agents.push(a)
p.agentCount = p.agents.length
} else {
a.status = 'running'
a.label = event.label
a.phase = event.phase
}
break
}
case 'agent_progress': {
// live progress: only update token/tool (high frequency, but once per agent message, frequency is controllable).
const ap = p.agents.find(x => x.id === event.agentId)
if (ap) {
ap.tokenCount = event.tokenCount
ap.toolCount = event.toolCount
}
break
}
case 'agent_done': {
let a = p.agents.find(x => x.id === event.agentId)
if (!a) {
a = {
id: event.agentId,
label: event.label,
phase: event.phase,
status: 'done',
...(event.result.kind === 'ok'
? {
outputShape:
typeof event.result.output === 'object' &&
event.result.output !== null
? ('object' as const)
: ('text' as const),
tokenCount: event.result.tokenCount,
toolCount: event.result.toolCount,
model: event.result.model,
}
: {}),
}
p.agents.push(a)
p.agentCount = p.agents.length
} else {
a.status = 'done'
a.resultKind = event.result.kind
if (event.result.kind === 'ok') {
a.outputShape =
typeof event.result.output === 'object' &&
event.result.output !== null
? 'object'
: 'text'
a.tokenCount = event.result.tokenCount
a.toolCount = event.result.toolCount
a.model = event.result.model
}
}
break
}
case 'run_done':
p.status = event.status
if (event.returnValue !== undefined) p.returnValue = event.returnValue
if (event.error !== undefined) p.error = event.error
break
}
notify()
}
bus.subscribe(apply)
return {
apply,
list: () => snapshot,
get: id => byId.get(id),
hydrate(run) {
if (byId.has(run.runId)) return
byId.set(run.runId, run)
notify()
},
subscribe: fn => {
listeners.add(fn)
return () => listeners.delete(fn)
},
getSnapshot: () => snapshot,
}
}

13
src/workflow/registry.ts Normal file
View File

@@ -0,0 +1,13 @@
import { AgentAdapterRegistry } from '@claude-code-best/workflow-engine'
import { claudeCodeBackend } from './backends/claudeCodeBackend.js'
/**
* Build a multi-backend registry. v1 (depth B) only registers a single
* claude-code adapter as default, without prefilling routing rules — add
* .route(...) when extending with a second provider adapter.
*/
export function buildRegistry(): AgentAdapterRegistry {
const reg = new AgentAdapterRegistry()
reg.register(claudeCodeBackend).default('claude-code')
return reg
}

314
src/workflow/service.ts Normal file
View File

@@ -0,0 +1,314 @@
import {
listNamedWorkflows,
parseScript,
persistInlineScript,
resolveNamedWorkflow,
runWorkflow,
WORKFLOW_DIR_NAME,
type WorkflowHostContext,
type WorkflowInput,
type WorkflowPorts,
} from '@claude-code-best/workflow-engine'
import { readFile } from 'node:fs/promises'
import { join } from 'node:path'
import { getProjectRoot } from '../bootstrap/state.js'
import { logForDebugging } from '../utils/debug.js'
import { buildHostBundle, makeHostHandle } from './hostHandle.js'
import { installWorkflowNotifications } from './notifications.js'
import {
attachRunStatePersistence,
getRunsDir,
listPersistedRuns,
readRunState,
} from './persistence.js'
import { createProgressBus } from './progress/bus.js'
import {
createProgressStoreFromBus,
type ProgressStore,
type RunProgress,
} from './progress/store.js'
import { createWorkflowPorts } from './ports.js'
import type { CanUseToolFn } from '../hooks/useCanUseTool.js'
import type { ToolUseContext } from '../Tool.js'
/**
* WorkflowService: the single entry shared by the tool (U7) and panel (U9).
*
* - `ports`: shared WorkflowPorts; tool descriptors are passed through to the engine.
* - `launch`: parse script → parseScript quick validation → taskRegistrar.register (gets runId+signal)
* → detached runWorkflow → on completion routes to complete/fail/kill.
* - `kill/listRuns/getRun/subscribe/listNamed`: auxiliary queries for panel and tool.
*/
export type WorkflowService = {
/** Shared ports (used by tool descriptors). */
ports: WorkflowPorts
/** Panel/tool launches a workflow: parse script → register → detached runWorkflow. */
launch(
input: Pick<
WorkflowInput,
| 'script'
| 'name'
| 'scriptPath'
| 'args'
| 'description'
| 'resumeFromRunId'
| 'title'
| 'maxConcurrency'
>,
toolUseContext: ToolUseContext,
canUseTool: CanUseToolFn,
): Promise<{ runId: string; scriptPath?: string }>
kill(runId: string): void
/**
* Aborts a single agent (does not affect other agents in the same run; workflow keeps running).
* Returns whether the agent was hit (false = agent already finished/does not exist). An aborted agent returns dead → null.
*/
killAgent(runId: string, agentId: number): boolean
/**
* Cleanup on process exit / config unload: kill all running runs to avoid orphan tasks.
* Completed/failed runs are unaffected. Idempotent — safe to call multiple times.
*/
shutdown(): void
listRuns(): RunProgress[]
getRun(runId: string): RunProgress | undefined
/**
* Async lookup by runId: return on memory hit; on miss read state.json from disk (not injected into memory).
* Used by the "get historical return by runId" scenario; for panel display use loadPersistedRuns + listRuns.
*/
getRunAsync(runId: string): Promise<RunProgress | undefined>
/**
* Scans the disk and hydrates state.json of all historical runs into the store (skips existing runIds).
* The process singleton only scans the disk once (persistedLoaded flag); repeated calls return immediately.
*/
loadPersistedRuns(): Promise<void>
subscribe(listener: () => void): () => void
listNamed(workflowDir?: string): Promise<string[]>
}
let cached: WorkflowService | null = null
/** Process singleton. Tool and panel share the same ports/registry/store. */
export function getWorkflowService(): WorkflowService {
if (cached) return cached
const bus = createProgressBus()
const store = createProgressStoreFromBus(bus)
const ports = createWorkflowPorts({ bus, store })
const service = makeService(ports, store)
// Subscribe to run_done to write the terminal snapshot to disk (shared entry for completed/failed/killed; shutdown-kill also routes here).
// The store registers to the bus before this subscription, so when the listener runs store.get(runId) is already terminal.
attachRunStatePersistence(bus, store)
// Install the state-change notification bridge (commit 0768d4dc promised "auto-notify on completion" but the old implementation left it unfulfilled)
installWorkflowNotifications(service)
cached = service
return cached
}
/**
* Construct the service (inject ports + store).
*
* Production path uses {@link getWorkflowService}; tests use this function to inject fake ports directly,
* avoiding touching real getProjectRoot/getCwd/analytics and other module-level side effects.
*
* @param cwdOverride For tests only: inject a temp directory (avoids inline persistence writing to the real project directory).
* @param runsDirProvider For tests only: inject a tmpdir (Bun ESM module namespace is read-only, cannot monkey-patch getRunsDir).
*/
export function makeService(
ports: WorkflowPorts,
store: ProgressStore,
cwdOverride?: string,
runsDirProvider: () => string = getRunsDir,
): WorkflowService {
const buildHost = (
toolUseContext: ToolUseContext,
canUseTool: CanUseToolFn,
): WorkflowHostContext => ({
handle: makeHostHandle(buildHostBundle(toolUseContext, canUseTool)),
// Use projectRoot to stay in sync with ports.ts hostFactory / journalStore;
// entering a worktree/subdirectory will not desync named workflow resolution from journal persistence.
// cwdOverride is for tests only: inject a temp directory (avoids inline persistence writing to the real project directory).
cwd: cwdOverride ?? getProjectRoot(),
budgetTotal: null, // turn-level budget injection point (in future read from settings)
toolUseId: toolUseContext.toolUseId,
})
async function resolveSource(input: {
script?: string
name?: string
scriptPath?: string
}): Promise<{
script: string
workflowFile?: string
workflowName: string
}> {
if (input.script) {
return { script: input.script, workflowName: 'workflow' }
}
if (input.scriptPath) {
return {
script: await readFile(input.scriptPath, 'utf-8'),
workflowFile: input.scriptPath,
workflowName: 'workflow',
}
}
if (input.name) {
const dir = join(getProjectRoot(), WORKFLOW_DIR_NAME)
const found = await resolveNamedWorkflow(dir, input.name)
if (!found) {
throw new Error(
`Named workflow "${input.name}" not found (looked in ${WORKFLOW_DIR_NAME}/)`,
)
}
return {
script: found.content,
workflowFile: found.path,
workflowName: input.name,
}
}
throw new Error('One of script, name, or scriptPath must be provided')
}
// Process-singleton flag for loadPersistedRuns: set to true on first call, subsequent calls return immediately.
// Reset on scan failure to allow next retry. Each makeService call has its own closure variable (reset when tests build a new service).
let persistedLoaded = false
return {
ports,
async launch(input, toolUseContext, canUseTool) {
const { script, workflowFile, workflowName } = await resolveSource(input)
try {
parseScript(script)
} catch (e) {
throw new Error(`Script validation failed: ${(e as Error).message}`)
}
const host = buildHost(toolUseContext, canUseTool)
const { runId, signal } = ports.taskRegistrar.register(
{
workflowName,
...(workflowFile ? { workflowFile } : {}),
...(input.description ? { summary: input.description } : {}),
...(host.toolUseId ? { toolUseId: host.toolUseId } : {}),
...(input.resumeFromRunId ? { runId: input.resumeFromRunId } : {}),
},
host.handle,
)
// Inline entry: persist script to the run directory (symmetric with WorkflowTool), return a reusable path.
// Degrade on write failure (log), do not block the run (script is already in memory).
let persistedScriptPath: string | undefined
if (!workflowFile && input.script) {
try {
persistedScriptPath = await persistInlineScript(
input.script,
runId,
host.cwd,
)
} catch (e) {
logForDebugging(
`workflow inline script persist failed: ${(e as Error).message}`,
)
}
}
// detached: do not await, let the caller get runId immediately; on completion route to the registrar.
void runWorkflow({
script,
...(input.args !== undefined ? { args: input.args } : {}),
runId,
workflowName,
ports,
host: host.handle,
signal,
cwd: host.cwd,
budgetTotal: host.budgetTotal,
...(input.maxConcurrency !== undefined
? { maxConcurrency: input.maxConcurrency }
: {}),
...(input.resumeFromRunId ? { resume: true } : {}),
})
.then(result => {
if (result.status === 'completed') {
ports.taskRegistrar.complete(runId)
} else if (result.status === 'failed') {
ports.taskRegistrar.fail(runId, result.error ?? 'failed')
} else {
ports.taskRegistrar.kill(runId)
}
})
.catch(e => ports.taskRegistrar.fail(runId, (e as Error).message))
logForDebugging(`workflow launched: ${runId} (${workflowName})`)
return {
runId,
...(persistedScriptPath ? { scriptPath: persistedScriptPath } : {}),
}
},
kill(runId) {
ports.taskRegistrar.kill(runId)
},
killAgent(runId, agentId) {
return ports.taskRegistrar.killAgent?.(runId, agentId) ?? false
},
shutdown() {
// Only kill running: for completed/failed runs the taskRegistrar has already reclaimed the binding, kill is a no-op.
// taskRegistrar.kill is a safe no-op for unknown runIds, hence idempotent — multiple shutdowns do not throw repeatedly.
// Each kill is wrapped in its own try/catch: kill internally routes through setAppState, and process-exit phase triggers a React re-render
// which may throw (render already unmounted, etc.); a single failure should not block cleanup of other runs.
for (const run of store.list()) {
if (run.status !== 'running') continue
try {
ports.taskRegistrar.kill(run.runId)
} catch (e) {
logForDebugging(
`workflow shutdown: kill ${run.runId} failed: ${(e as Error).message}`,
)
}
}
},
listRuns: () => store.list(),
getRun: id => store.get(id),
async getRunAsync(id) {
const mem = store.get(id)
if (mem) return mem
return (await readRunState(runsDirProvider(), id)) ?? undefined
},
async loadPersistedRuns() {
if (persistedLoaded) return
persistedLoaded = true
try {
const runs = await listPersistedRuns(runsDirProvider())
for (const run of runs) store.hydrate(run)
} catch (e) {
// Scan failure does not block the panel: log + reset flag to allow next retry
logForDebugging(
`[workflow warn] loadPersistedRuns failed: ${(e as Error).message}`,
)
persistedLoaded = false
}
},
subscribe: fn => store.subscribe(fn),
async listNamed(workflowDir) {
return listNamedWorkflows(
workflowDir ?? join(getProjectRoot(), WORKFLOW_DIR_NAME),
)
},
}
}
/** For tests: reset the singleton (avoid cross-case contamination). */
export function __resetWorkflowServiceForTests(): void {
cached = null
}
/**
* Returns the already-instantiated service (does not create one). Used on process exit / config unload to peek;
* if workflow was never used, cached is still null — avoids side-effecting bus/ports creation in the exit hook.
*/
export function peekWorkflowService(): WorkflowService | null {
return cached
}

65
src/workflow/wiring.ts Normal file
View File

@@ -0,0 +1,65 @@
import {
createWorkflowTool,
workflowInputSchema,
WORKFLOW_TOOL_NAME,
type WorkflowToolDescriptor,
} from '@claude-code-best/workflow-engine'
import { buildTool, type Tool } from '../Tool.js'
import { getWorkflowService } from './service.js'
/**
* Adapts the engine's self-contained descriptor into a buildTool-compatible Tool.
* The descriptor routes through the service singleton (sharing ports/registry/store).
*
* ports resolution is deferred to the first real method call (lazy): tools.ts calls
* createWorkflowToolCore() during module-load (feature-gated), and resolving ports
* immediately would trigger service instantiation, which in turn calls module-level
* side effects like getProjectRoot — yielding wrong paths before bootstrap completes.
* The Tool object itself is a singleton via createWorkflowToolCore's cached (PermissionRequest
* matches by reference), and the ports singleton is guaranteed by getWorkflowService.
*/
function buildWorkflowTool(): Tool {
let cachedDescriptor: WorkflowToolDescriptor | null = null
const descriptor = (): WorkflowToolDescriptor => {
if (!cachedDescriptor) {
const { ports } = getWorkflowService()
cachedDescriptor = createWorkflowTool(ports)
}
return cachedDescriptor
}
return buildTool({
name: WORKFLOW_TOOL_NAME,
maxResultSizeChars: 50_000,
inputSchema: workflowInputSchema,
isEnabled: () => descriptor().isEnabled(),
isReadOnly: input => descriptor().isReadOnly(input),
isConcurrencySafe: () => true,
async description() {
return descriptor().description()
},
async prompt() {
return descriptor().prompt()
},
async call(input, context, canUseTool, parentMessage, onProgress) {
const result = await descriptor().call(
input,
context,
canUseTool,
parentMessage,
onProgress,
)
return { data: result.data }
},
renderToolUseMessage: input => descriptor().renderToolUseMessage(input),
mapToolResultToToolResultBlockParam: (data, toolUseId) =>
descriptor().mapToolResultToToolResultBlockParam(data, toolUseId),
})
}
// Singleton: tools.ts registration and PermissionRequest must reference the same instance (switch matches by reference).
let cached: Tool | null = null
export function createWorkflowToolCore(): Tool {
if (!cached) cached = buildWorkflowTool()
return cached
}