feat(workflow): replicate the ultracode playbook and close three gaps (worktree / inline / opt-in)

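After auditing the agent system for consistency against the ultracode skill:

- ultracode.ts: replace the condensed Chinese version with the full system-prompt version of the Workflow orchestration playbook
- HIGH#1 isolation:'worktree': claudeCodeBackend.run() wraps runAgent in createAgentWorktree + runWithCwdOverride with cleanup in finally, giving real cwd isolation; the slug is derived from sha256(runId:agentId) to match the cleanupStaleAgentWorktrees cleanup regex (fixes the leak blind spot caused by runId being w+base36 rather than a UUID); worktree.ts comments updated to match
- HIGH#2 inline persistence: add persistInlineScript; both inline paths (WorkflowTool + service) symmetrically persist to .claude/workflow-runs/<runId>/script.js and return a reusable scriptPath (closes the inline → edit → resubmit-via-scriptPath iteration loop)
- HIGH#3 opt-in division of responsibility: ultracode/WorkflowTool/effort now note that the session reminder is injected by the harness and that the repo holds no ultracode signal; keep the two-layer gate of feature('WORKFLOW_SCRIPTS') + isEnabled and do not invent an injection
- Tests: add persistInline.test.ts; extend claudeCodeBackend (4 isolation cases) / WorkflowTool (inline) / service (scriptPath) / ultracode (harness)

Also includes supporting workflow engine/panel improvements and the run-state-persistence design doc.

Co-Authored-By: Claude <noreply@anthropic.com>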
2026-06-15 12:55:51 +00:00 · 2026-06-13 23:04:33 +08:00
parent d236880bc3
commit 54d2bf6f12
32 changed files with 2253 additions and 196 deletions
--- a/docs/superpowers/plans/2026-06-13-workflow-run-state-persistence.md
+++ b/docs/superpowers/plans/2026-06-13-workflow-run-state-persistence.md
--- a/docs/superpowers/specs/2026-06-13-workflow-run-state-persistence-design.md
+++ b/docs/superpowers/specs/2026-06-13-workflow-run-state-persistence-design.md
@@ -0,0 +1,191 @@
+# Workflow Run State Persistence — Design
+
+**Date**: 2026-06-13
+**Status**: Approved (brainstorming), pending implementation plan
+**Related**: `2026-06-12-workflow-engine-design.md`, `2026-06-13-workflow-panel-redesign.md`
+
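+## Problem Statement
+
+A workflow script's `return` value and the terminal `RunProgress` (status / agents / phases / returnValue / error) live only in the in-memory Map of `ProgressStore` (`src/workflow/progress/store.ts`). Once the Claude Code process exits or restarts, all of it is lost.
+
+The on-disk `.claude/workflow-runs/<runId>/journal.jsonl` records only the structured result of each `agent()` call. It does **not** contain the script's top-level `return` value, and it cannot rebuild the `RunProgress` summary that the `/workflows` panel needs. After a restart the panel is empty, and the conversation agent cannot retrieve a return value by runId.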
+
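+## Goals
+
+- **(a) Fetch the return value by runId after a restart**: in a new process, the conversation agent can get the `returnValue` and `error` of a finished run.
+- **(b) Show history in the panel across restarts**: after a restart, the `/workflows` panel can list historical runs with their status, agents, phases, and duration.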
+
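+## Non-Goals
+
+- **(c) Cross-process resume is explicitly excluded**: abort controllers, agent bindings, and the intermediate state of unfinished phases are not rebuilt. The current resume mechanism (journal replay within the same process) stays unchanged; resuming across processes is a separate, large feature outside this spec.
+- **Automatic cleanup**: `.claude/workflow-runs/` keeps growing and relies on the project's `.gitignore` and manual cleanup by the user. Lifecycle management is a follow-up feature.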
+
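+## Architecture
+
+Add one host-side persistence module plus three integration points. **The engine layer `@claude-code-best/workflow-engine` is untouched**: persistence is a host-side concern and does not leak into the engine interface.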
+
+### Components
+
+| File | Change | Responsibility |
+|---|---|---|
+| `src/workflow/persistence.ts` | New | `writeRunState` / `readRunState` / `listPersistedRuns`; atomic overwrite (tmp + rename); `getRunsDir()` as the single source of the runs directory |
+| `src/workflow/progress/store.ts` | Modified | Adds `hydrate(run: RunProgress): void`, which bypasses the bus and injects an on-disk run directly (used by `loadPersistedRuns`) |
+| `src/workflow/service.ts` | Modified | Subscribes to bus `run_done` → `writeRunState`; `getRun(id)` falls back to `readRunState` on an in-memory miss; adds `loadPersistedRuns(): Promise<void>` |
+| `src/workflow/panel/WorkflowsPanel.tsx` | Modified | Calls `svc.loadPersistedRuns()` once on mount (the flag is guarded inside the service singleton; the panel calls unconditionally and repeat calls are no-ops) |
+| `src/workflow/ports.ts` | Modified | Extracts `${getProjectRoot()}/.claude/workflow-runs` into a shared `getRunsDir()` (removes duplicated path joining; same source as persistence.ts) |
+
+## Data Flow
+
+### Write (triggered on terminal state; a single entry point covers all terminal states in A+)
+
+```
+engine runWorkflow
+  └─ progressEmitter.emit({type:'run_done', status, returnValue, error})
+     └─ bus.emit
+        ├─ store.apply(event)            [store subscribes first; in-memory RunProgress is already updated]
+        └─ service listener              [subscribes later; store.get(runId) returns the latest snapshot]
+           └─ writeRunState(runsDir, runId, snapshot)
+              └─ writeFile(state.json.tmp) → rename(state.json)   [atomic]
+```
+
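+**Subscription order**: the bus is a `Set<listener>`, so registration order equals firing order. `createProgressStoreFromBus(bus)` subscribes the store before the service is created; the service subscribes afterwards. When the service's `run_done` listener runs, `store.get(event.runId)` already holds the post-apply value, so it can be serialized and written to disk directly.
+
+**Why no separate shutdown hook is needed**: `taskRegistrar.kill` → `abortController.abort()` → `runWorkflow` sees the signal → emits `run_done killed` → goes through the same subscription. When `service.shutdown()` explicitly kills a running run, it triggers `run_done` as well. All three terminal states (completed / failed / killed) share one write entry point.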
+
+### Read ①: show panel history across restarts
+
+```
+CLI restart → user runs /workflows → WorkflowsPanel mount
+  └─ useEffect: svc.loadPersistedRuns()   [guarded by the service's internal persistedLoaded flag; only one real disk scan]
+     └─ listPersistedRuns(runsDir)         [scans state.json in every subdirectory]
+        └─ store.hydrate(run)              [existing runIds are skipped; memory wins]
+```
+
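+**Where the `persistedLoaded` flag lives**: on the `WorkflowService` singleton (a `makeService` closure variable), not at panel module level. Rationale: the service is a process singleton, so a flag that follows its lifecycle is the most stable; the panel may mount and unmount many times, and keeping the flag on the service avoids repeated disk scans. The panel's `useEffect` calls `loadPersistedRuns()` unconditionally, and the service returns an already-resolved Promise if loading has happened before.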
+
+### Read ②: agent fetches the return value by runId
+
+```
+service.getRun(id)
+  ├─ store.get(id) hit → return (a run from this session)
+  └─ miss → readRunState(runsDir, id) → return (historical run, not injected into memory)
+```
+
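+**Trade-off of not injecting into memory**: a historical run entering memory would pollute the semantics of this session's store and panel list (the invariant "memory = runs produced in this session" must hold). The cost is that repeatedly querying the same historical run within one session re-reads the disk each time, which is acceptable (queries are infrequent and the files are small).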
+
+## state.json Format
+
+The payload is wrapped with a `schemaVersion` to leave room for migration; the payload itself is every field of the terminal `RunProgress`:
+
+```json
+{
+  "schemaVersion": 1,
+  "run": {
+    "runId": "w12tp1rrk",
+    "workflowName": "audit-agent-system-vs-ultracode",
+    "status": "completed",
+    "phases": [
+      {"title": "Review", "status": "done"},
+      {"title": "Verify", "status": "done"}
+    ],
+    "declaredPhases": ["Review", "Verify"],
+    "currentPhase": null,
+    "agents": [
+      {
+        "id": 1,
+        "label": "review:hooks",
+        "phase": "Review",
+        "status": "done",
+        "outputShape": "object",
+        "tokenCount": 12345,
+        "toolCount": 3,
+        "model": "claude-sonnet-4-6"
+      }
+    ],
+    "agentCount": 11,
+    "returnValue": {"dimensionsAudited": 9, "confirmedCount": 2, "confirmed": []},
+    "startedAt": 1718277600000,
+    "updatedAt": 1718278000000,
+    "description": "Audit workflow engine against ultracode skill spec"
+  }
+}
+```
+
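+### Field Decisions
+
+- `agents[]` stores the full `AgentProgress` (including `label` / `phase` / `status` / `tokenCount` / `toolCount` / `model` / `outputShape` / `resultKind`) but **not the agent's actual output**: the output is already in `journal.jsonl`, so this avoids duplication.
+- For failed runs, the `error` field goes straight into `run.error` (`RunProgress` already has that field).
+- `returnValue?: unknown` is serialized as-is, **without truncation**. Users are responsible for the size of their own return value (if a script returns a whole database dump, the disk usage is on them).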
+
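+## Error Handling
+
+| Scenario | Behavior |
+|---|---|
+| `writeRunState` IO failure (disk full / permissions) | Swallowed with `logForDebugging('[workflow warn] ...')`; **does not block workflow completion**. The workflow itself already succeeded, and a persistence failure only means the run cannot be retrieved after a restart, which is acceptable |
+| `readRunState` file does not exist | Returns `null`; the caller treats it as a miss |
+| `readRunState` JSON parse failure | Returns `null`, logs a warning, treated as a miss (no crash) |
+| `readRunState` schema shape mismatch (missing field / wrong type) | Returns `null`, logs a warning, treated as a miss |
+| `schemaVersion` mismatch in the future | The current version is `1` with no migration chain; any version other than 1 → returns `null` and is treated as a miss (forward-compatibility fallback). A chain of migration functions is introduced when the version is bumped |
+| Crash in the middle of an atomic write | `writeFile(state.json.tmp)` + `rename(tmp, state.json)`; rename is atomic. The worst case leaves a `.tmp` file behind, which the next write overwrites |
+| `loadPersistedRuns` finds a subdirectory with no `state.json` (journal only) | Skipped without error (partial run) |
+| `loadPersistedRuns` finds a corrupted `state.json` | Skips that single file and keeps scanning the rest (one bad file does not block the whole load) |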
+
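+## Key Invariants
+
+1. **An in-memory run always wins over an on-disk run**: `store.hydrate` skips runIds that already exist; `getRun` does not read the disk on a memory hit.
+2. **The disk holds terminal snapshots only**: runs still running in this session are not written. If the process is hit by SIGKILL, power loss, or a crash before the run reaches a terminal state, that run is missing on disk (`run_done` never gets emitted). This is an edge case that A+ accepts.
+3. **On-disk runs are not injected into memory on the `getRun` path**: only `loadPersistedRuns` (panel mount) hydrates; the `getRun` fallback only returns the value and does not hydrate.
+4. **Persistence failure does not block the workflow**: writing to disk is best-effort; IO errors are logged, not thrown.
+5. **The engine layer is untouched**: all persistence logic lives on the host side (`src/workflow/`), and the interface of the engine `@claude-code-best/workflow-engine` does not change.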
+
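+## Test Strategy
+
+### `src/workflow/__tests__/persistence.test.ts` (new): pure fs, uses tmpdir
+
+- `writeRunState` → `readRunState` round-trips consistently (with `returnValue` as an object / array / string / null)
+- `writeRunState` atomicity: construct a leftover-tmp scenario and verify that `state.json` is either complete or absent, never half-written
+- `readRunState` with corrupted JSON / missing file / mismatched schemaVersion / missing required field → returns `null` in every case
+- `listPersistedRuns` scans multiple subdirectories, skips directories without `state.json`, skips corrupted files, and returns results in descending `updatedAt` order
+
+### `src/workflow/__tests__/store.test.ts` (extended)
+
+- `hydrate(run)` with a new runId → `get` hits and `list` includes the entry
+- `hydrate(run)` with an existing runId → skipped (the in-memory value is not overwritten by the disk)
+- After `hydrate`, `subscribe` listeners are notified
+
+### `src/workflow/__tests__/service.test.ts` (new / extended): injects a fake bus / ports / tmpdir
+
+- bus emits `run_done completed` + returnValue → `readRunState(runId)` hits and the returnValue matches
+- bus emits `run_done failed` + error → state.json is written with status=failed and the error field
+- bus emits `run_done killed` → state.json is written with status=killed
+- bus emits `run_done` but `writeRunState` throws an IO error → the service does not throw and other subscribers (the store) still work
+- `getRun(id)` memory hit → no disk read (spy asserts readRunState was not called)
+- `getRun(id)` memory miss + disk hit → returns the on-disk value; a second `getRun(id)` still reads the disk (not injected into memory)
+- `getRun(id)` memory miss + disk miss → returns undefined
+- After `loadPersistedRuns()` scans the disk, `listRuns()` includes historical runs; existing in-memory runIds are not overwritten by the disk
+
+### `src/workflow/__tests__/WorkflowsPanel.test.tsx` (extended)
+
+- WorkflowsPanel mount → calls `loadPersistedRuns` once (spy asserts call count = 1)
+- Repeated mount / re-render → no repeated call (the `persistedLoaded` flag prevents re-entry)
+
+### Regression
+
+- The full `bun test src/workflow/` suite passes
+- `bun run precheck` reports zero errors (typecheck + lint fix + test)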
+
+## Implementation Order Hints (for writing-plans to expand)
+
+1. `persistence.ts` + unit tests (lowest layer, no dependencies)
+2. Add `hydrate` to `store.ts` + unit tests
+3. Extract `getRunsDir()` in `ports.ts`
+4. `service.ts`: subscribe to `run_done` + `getRun` fallback + `loadPersistedRuns` + unit tests
+5. `WorkflowsPanel.tsx`: trigger on mount + tests
+6. Full `precheck`
+
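+## Future Work (explicitly out of scope for this spec)
+
+- **Cross-process resume (c)**: requires rebuilding agent bindings / abort / intermediate state; a separate feature
+- **Lifecycle management**: count cap / time cap / manual cleanup command
+- **Return value size limit**: if abuse shows up, add a schema-level cap and a truncation policy
+- **Schema migration chain**: introduced when `schemaVersion` is bumped to 2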
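Editorial sketch (not part of the commit): the design doc above describes `writeRunState` as an atomic tmp + rename write and `readRunState` as returning `null` for a missing file, corrupted JSON, a non-1 `schemaVersion`, or a malformed payload. The following is a minimal illustration of that contract, assuming a trimmed-down `RunSnapshot` type and omitting the warning logs. The type, the validation depth, and `roundTripDemo` are hypothetical and are not taken from `src/workflow/persistence.ts`.

```typescript
import { mkdir, mkdtemp, readFile, rename, rm, writeFile } from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// Hypothetical, trimmed-down shape; the real RunProgress carries more fields.
type RunSnapshot = {
  runId: string
  status: 'completed' | 'failed' | 'killed'
  returnValue?: unknown
  error?: string
  updatedAt: number
}

const SCHEMA_VERSION = 1

/** Write state.json through a temp file plus rename, so a reader never sees a half-written file. */
export async function writeRunState(
  runsDir: string,
  runId: string,
  run: RunSnapshot,
): Promise<void> {
  const dir = join(runsDir, runId)
  await mkdir(dir, { recursive: true })
  const finalPath = join(dir, 'state.json')
  const tmpPath = `${finalPath}.tmp`
  await writeFile(tmpPath, JSON.stringify({ schemaVersion: SCHEMA_VERSION, run }), 'utf-8')
  await rename(tmpPath, finalPath)
}

/** Return the stored run, or null for a missing file, bad JSON, wrong version, or bad shape. */
export async function readRunState(
  runsDir: string,
  runId: string,
): Promise<RunSnapshot | null> {
  try {
    const raw = await readFile(join(runsDir, runId, 'state.json'), 'utf-8')
    const parsed: unknown = JSON.parse(raw)
    if (typeof parsed !== 'object' || parsed === null) return null
    const { schemaVersion, run } = parsed as { schemaVersion?: unknown; run?: unknown }
    if (schemaVersion !== SCHEMA_VERSION) return null
    if (typeof run !== 'object' || run === null) return null
    const candidate = run as Partial<RunSnapshot>
    if (typeof candidate.runId !== 'string' || typeof candidate.status !== 'string') return null
    return candidate as RunSnapshot
  } catch {
    // Missing file and JSON parse errors both land here and are treated as a miss.
    return null
  }
}

/** Usage demo: write a snapshot into a temp directory and read it back. */
export async function roundTripDemo(): Promise<boolean> {
  const runsDir = await mkdtemp(join(tmpdir(), 'wf-state-'))
  try {
    const run: RunSnapshot = {
      runId: 'r1',
      status: 'completed',
      returnValue: { ok: true },
      updatedAt: 1,
    }
    await writeRunState(runsDir, 'r1', run)
    const loaded = await readRunState(runsDir, 'r1')
    return JSON.stringify(loaded) === JSON.stringify(run)
  } finally {
    await rm(runsDir, { recursive: true, force: true })
  }
}
```

Treating every read failure as a miss keeps the `getRun` fallback and the panel scan tolerant of a single bad file, which matches the error-handling table; a real implementation would presumably also log the warning the table calls for.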
--- a/packages/builtin-tools/src/index.ts
+++ b/packages/builtin-tools/src/index.ts
@@ -62,13 +62,15 @@ export { TerminalCaptureTool } from './tools/TerminalCaptureTool/TerminalCapture
 export { VerifyPlanExecutionTool } from './tools/VerifyPlanExecutionTool/VerifyPlanExecutionTool.js'
 export { WebBrowserTool } from './tools/WebBrowserTool/WebBrowserTool.js'
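 // The WorkflowTool implementation has moved to @claude-code-best/workflow-engine (standalone package, port-adapted).
-// Only the factory and constants are re-exported here, for backward compatibility.
+// Note: this commit removes the WorkflowTool value export and getWorkflowCommands from builtin-tools.
+// - WorkflowTool factory: now provided by createWorkflowTool from @claude-code-best/workflow-engine
+// - getWorkflowCommands: removed; its functionality moved to src/workflow/namedWorkflowCommands.ts
+// Third parties that import these two symbols from this package must switch to the new paths.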
 export {
  createWorkflowTool,
  WORKFLOW_TOOL_NAME,
  type WorkflowToolDescriptor,
 } from '@claude-code-best/workflow-engine'
-export { initBundledWorkflows } from './tools/WorkflowTool/bundled/index.js'

 // Constants
 export {
--- a/packages/builtin-tools/src/tools/WorkflowTool/bundled/index.ts
+++ b/packages/builtin-tools/src/tools/WorkflowTool/bundled/index.ts
@@ -1,15 +0,0 @@
-// Bundled workflow initialization.
-// Called by tools.ts when WORKFLOW_SCRIPTS feature flag is enabled.
-// Sets up any pre-bundled workflow scripts that ship with the CLI.
-
-/**
- * Initialize bundled workflows. Called once at startup when the
- * WORKFLOW_SCRIPTS feature flag is active. This is the hook point
- * for registering any workflow scripts that are compiled into the
- * binary (as opposed to user-authored ones in .claude/workflows/).
- */
-export function initBundledWorkflows(): void {
-  // Bundled workflows are registered here at startup.
-  // Currently a no-op — all workflows are user-authored in .claude/workflows/.
-  // This function exists as the extension point for future built-in workflows.
-}
--- a/packages/workflow-engine/src/tests/WorkflowTool.test.ts
+++ b/packages/workflow-engine/src/tests/WorkflowTool.test.ts
@@ -1,5 +1,5 @@
 import { expect, test } from 'bun:test'
-import { mkdtemp, mkdir, rm, writeFile } from 'node:fs/promises'
+import { mkdtemp, mkdir, readFile, rm, writeFile } from 'node:fs/promises'
 import { tmpdir } from 'node:os'
 import { join } from 'node:path'
 import { createWorkflowTool } from '../tool/WorkflowTool.js'
@@ -74,6 +74,34 @@ test('call 返回 launch 消息并在后台完成', async () => {
  }
 })

+test('inline script 持久化到 run 目录，返回真实 scriptPath', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-tool-'))
+  try {
+    const { ports } = mockPorts(
+      dir,
+      new Map([['x', { kind: 'ok', output: 'x', usage: { outputTokens: 1 } }]]),
+    )
+    const tool = createWorkflowTool(ports)
+    const res = await tool.call(
+      { script: `return agent('x')` },
+      undefined,
+      undefined,
+      undefined,
+    )
+    const expectedPath = join(
+      dir,
+      '.claude',
+      'workflow-runs',
+      'run-x',
+      'script.js',
+    )
+    expect(res.data.output).toContain(expectedPath)
+    expect(await readFile(expectedPath, 'utf-8')).toBe(`return agent('x')`)
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
 test('缺少 script/name/scriptPath → 返回错误（不进后台）', async () => {
  const dir = await mkdtemp(join(tmpdir(), 'wf-tool-'))
  try {
--- a/packages/workflow-engine/src/tests/persistInline.test.ts
+++ b/packages/workflow-engine/src/tests/persistInline.test.ts
@@ -0,0 +1,41 @@
+import { expect, test } from 'bun:test'
+import { mkdtemp, readFile, rm } from 'node:fs/promises'
+import { tmpdir } from 'node:os'
+import { join } from 'node:path'
+
+import { persistInlineScript } from '../tool/persistInline.js'
+
+test('持久化到 <cwd>/.claude/workflow-runs/<runId>/script.js 并返回路径', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-pi-'))
+  try {
+    const path = await persistInlineScript('return 1', 'r1', dir)
+    expect(path).toBe(join(dir, '.claude', 'workflow-runs', 'r1', 'script.js'))
+    expect(await readFile(path, 'utf-8')).toBe('return 1')
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('同 runId 重复写覆盖（mkdir 幂等，不抛错）', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-pi-'))
+  try {
+    await persistInlineScript('first', 'r2', dir)
+    const path = await persistInlineScript('second', 'r2', dir)
+    expect(await readFile(path, 'utf-8')).toBe('second')
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
+test('不同 runId 互不干扰（各自独立子目录）', async () => {
+  const dir = await mkdtemp(join(tmpdir(), 'wf-pi-'))
+  try {
+    const p1 = await persistInlineScript('a', 'run-a', dir)
+    const p2 = await persistInlineScript('b', 'run-b', dir)
+    expect(p1).not.toBe(p2)
+    expect(await readFile(p1, 'utf-8')).toBe('a')
+    expect(await readFile(p2, 'utf-8')).toBe('b')
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
--- a/packages/workflow-engine/src/agentAdapter.ts
+++ b/packages/workflow-engine/src/agentAdapter.ts
@@ -1,6 +1,10 @@
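 // Abstraction for agent backend adapters. The engine gets an adapter from the registry and calls run, without caring about the concrete implementation
 // (Anthropic SDK / core runAgent / OpenAI / local model / mock are all adapter implementations).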
-import type { AgentRunParams, AgentRunResult } from './types.js'
+import type {
+  AgentProgressUpdate,
+  AgentRunParams,
+  AgentRunResult,
+} from './types.js'
 import type { HostHandle } from './ports.js'

 /** Adapter capability declaration. The engine/script degrades based on it (e.g. falls back to text + parsing if the backend does not support schema). */
@@ -21,6 +25,11 @@ export type AgentAdapterContext = {
  signal: AbortSignal
  /** Current workflow runId (for logging/tracing). */
  runId: string
+  /**
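+   * In-flight progress reporting (called as the backend loop accumulates token/tool counts). Optional: standalone backends may skip it;
+   * the engine uses it to emit agent_progress events (the closure carries agentId/runId for correlation) so the panel refreshes live.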
+   */
+  onProgress?: (update: AgentProgressUpdate) => void
 }

 /**
--- a/packages/workflow-engine/src/engine/hooks.ts
+++ b/packages/workflow-engine/src/engine/hooks.ts
@@ -1,5 +1,6 @@
 import { MAX_ITEMS_PER_CALL, MAX_TOTAL_AGENTS } from '../constants.js'
 import type {
+  AgentProgressUpdate,
  AgentRunParams,
  AgentRunResult,
  JournalEntry,
@@ -29,6 +30,14 @@ type HookProgressInit =
      phase?: string
      result: AgentRunResult
    }
+  | {
+      type: 'agent_progress'
+      agentId: number
+      label?: string
+      phase?: string
+      tokenCount: number
+      toolCount: number
+    }
  | { type: 'log'; message: string }

 export function makeHooks(
@@ -104,11 +113,16 @@ export function makeHooks(
      ctx.resources.agentCountBox.value++
      emit({ type: 'agent_started', agentId, label, phase })
      const registry = ctx.ports.agentAdapterRegistry
+      // onProgress closure: the backend loop accumulates token/tool counts → emit an agent_progress event (correlated by agentId)
+      const onProgress = (update: AgentProgressUpdate): void => {
+        emit({ type: 'agent_progress', agentId, label, phase, ...update })
+      }
      const result = registry
        ? await registry.resolve(params).run(params, {
            host: ctx.host,
            signal: ctx.signal,
            runId: ctx.runId,
+            onProgress,
          })
        : await ctx.ports.agentRunner.runAgentToResult(params, ctx.host)
      if (result.kind === 'ok') {
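Editorial sketch (not part of the commit): the hunk above passes an optional `onProgress` callback into the adapter context. One way a backend loop might use it is shown below, assuming a made-up `Step` record per loop iteration. `AgentProgressUpdate` mirrors the type added in `types.ts`; `AdapterContext`, `Step`, and `reportProgress` are hypothetical names used only for illustration.

```typescript
// Mirrors the AgentProgressUpdate type added in types.ts.
type AgentProgressUpdate = { tokenCount: number; toolCount: number }

// Hypothetical subset of the adapter context: only the fields this sketch needs.
type AdapterContext = {
  runId: string
  onProgress?: (update: AgentProgressUpdate) => void
}

// Hypothetical per-iteration record; the real backend message shape is not shown in the diff.
type Step = { tokens: number; usedTool: boolean }

/**
 * Accumulate token and tool counts across loop iterations and report a running
 * total after each one. The optional-chaining call makes onProgress a no-op for
 * hosts that do not supply it.
 */
export function reportProgress(steps: Step[], ctx: AdapterContext): AgentProgressUpdate {
  let tokenCount = 0
  let toolCount = 0
  for (const step of steps) {
    tokenCount += step.tokens
    if (step.usedTool) toolCount++
    ctx.onProgress?.({ tokenCount, toolCount })
  }
  return { tokenCount, toolCount }
}
```

Reporting cumulative totals rather than deltas fits the `agent_progress` event shape in the diff, which carries absolute `tokenCount` and `toolCount` values.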
--- a/packages/workflow-engine/src/index.ts
+++ b/packages/workflow-engine/src/index.ts
@@ -21,4 +21,5 @@ export {
  type WorkflowToolDescriptor,
 } from './tool/WorkflowTool.js'
 export { workflowInputSchema, type WorkflowInput } from './tool/schema.js'
+export { persistInlineScript } from './tool/persistInline.js'
 export { WORKFLOW_TOOL_NAME } from './tool/constants.js'
--- a/packages/workflow-engine/src/tool/WorkflowTool.ts
+++ b/packages/workflow-engine/src/tool/WorkflowTool.ts
@@ -9,6 +9,7 @@ import { containsPath, sanitizeWorkflowName } from '../engine/paths.js'
 import type { WorkflowPorts } from '../ports.js'
 import type { WorkflowRunResult } from '../types.js'
 import { workflowInputSchema, type WorkflowInput } from './schema.js'
+import { persistInlineScript } from './persistInline.js'

 /** Self-contained tool descriptor (the core wiring wraps it with buildTool). Zero dependencies on the core layer. */
 export type WorkflowToolDescriptor = {
@@ -55,6 +56,10 @@ export function createWorkflowTool(
  return {
    name: WORKFLOW_TOOL_NAME,
    inputSchema: workflowInputSchema,
+    // No per-session runtime opt-in gate here: the "ultracode is on for the
+    // session" signal is injected by the harness (claude.ai/client), not held
+    // in any repo state. This tool is compiled in/out via feature('WORKFLOW_SCRIPTS')
+    // in src/tools.ts; beyond that it is always enabled when present.
    isEnabled: () => true,
    isReadOnly: () => false,

@@ -109,6 +114,23 @@ export function createWorkflowTool(
        host.handle,
      )

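+      // The inline entry persists the script to the run directory and returns a reusable path (the
+      // inline → persist → edit → resubmit-via-scriptPath iteration loop the ultracode skill promises).
+      // A failed write degrades to a placeholder + warn and does not block the run (the script is already in memory).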
+      if (!workflowFile && input.script) {
+        try {
+          workflowFile = await persistInlineScript(
+            input.script,
+            runId,
+            host.cwd,
+          )
+        } catch (e) {
+          ports.logger.warn?.(
+            `inline script persist failed: ${(e as Error).message}`,
+          )
+        }
+      }
+
      // detached execution
      void runWorkflow({
        script,
--- a/packages/workflow-engine/src/tool/persistInline.ts
+++ b/packages/workflow-engine/src/tool/persistInline.ts
@@ -0,0 +1,28 @@
+import { mkdir, writeFile } from 'node:fs/promises'
+import { join } from 'node:path'
+
+import { WORKFLOW_RUNS_DIR } from '../constants.js'
+
+/**
+ * Persist an inline workflow script to the run directory so the caller can
+ * iterate via `scriptPath` + `resumeFromRunId` without resending the full script
+ * (the round-trip the ultracode skill promises for the inline entry path).
+ *
+ * Mirrors engine/journal.ts: writes directly via node:fs/promises (no port) to
+ * `<cwd>/<WORKFLOW_RUNS_DIR>/<runId>/script.js` — the same directory as
+ * journal.jsonl, so journalStore.truncate(runId) cleans it up alongside the journal.
+ *
+ * Fixed filename `script.js`: parseScript ignores the extension and the runId
+ * already makes the directory unique, so a stable name aids muscle memory.
+ */
+export async function persistInlineScript(
+  script: string,
+  runId: string,
+  cwd: string,
+): Promise<string> {
+  const dir = join(cwd, WORKFLOW_RUNS_DIR, runId)
+  await mkdir(dir, { recursive: true })
+  const filePath = join(dir, 'script.js')
+  await writeFile(filePath, script, 'utf-8')
+  return filePath
+}
--- a/packages/workflow-engine/src/types.ts
+++ b/packages/workflow-engine/src/types.ts
@@ -27,9 +27,25 @@ export type AgentRunParams = {
  phase?: string
 }

-/** AgentRunner return value. */
+/** In-flight agent progress snapshot (payload of the onProgress callback; the backend loop accumulates token/tool counts). */
+export type AgentProgressUpdate = {
+  tokenCount: number
+  toolCount: number
+}
+
+/** AgentRunner return value. The ok variant carries model/toolCount for panel display (optional; standalone backends may omit them). */
 export type AgentRunResult =
-  | { kind: 'ok'; output: string | object; usage: { outputTokens: number } }
+  | {
+      kind: 'ok'
+      output: string | object
+      usage: { outputTokens: number }
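+      /** The actually resolved model id (for display). */
+      model?: string
+      /** Number of tool calls made during the agent run. */
+      toolCount?: number
+      /** Total context token count at completion (for display; measured the same way as the live agent_progress value). */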
+      tokenCount?: number
+    }
  | { kind: 'skipped' }
  | { kind: 'dead' }

@@ -66,6 +82,15 @@ export type ProgressEvent =
      phase?: string
      result: AgentRunResult
    }
+  | {
+      type: 'agent_progress'
+      runId: string
+      agentId: number
+      label?: string
+      phase?: string
+      tokenCount: number
+      toolCount: number
+    }
  | { type: 'log'; runId: string; message: string }
  | {
      type: 'run_done'
--- a/src/skills/bundled/tests/ultracode.test.ts
+++ b/src/skills/bundled/tests/ultracode.test.ts
@@ -47,16 +47,22 @@ describe('registerUltracodeSkill', () => {
    expect(blocks[0]!.type).toBe('text')

    const text = (blocks[0] as { type: 'text'; text: string }).text
-    expect(text).toContain('编排原语')
+    // Title + opt-in rule + harness-injection note
+    expect(text).toContain('Workflow Orchestration Playbook')
+    expect(text).toContain('explicitly opted into multi-agent orchestration')
+    expect(text).toContain('harness')
+    // Orchestration primitives
+    expect(text).toContain('Script body hooks')
    expect(text).toContain('parallel')
    expect(text).toContain('pipeline')
+    // Determinism / script-execution-model constraints (JS not TS; Date.now/Math.random throw)
+    expect(text).toContain('plain JavaScript, NOT TypeScript')
+    expect(text).toContain('Date.now()')
+    // Barrier vs pipeline guidance, quality patterns, resume, hard limits
+    expect(text).toContain('DEFAULT TO pipeline()')
+    expect(text).toContain('Quality patterns')
    expect(text).toContain('resumeFromRunId')
-    expect(text).toContain('AgentAdapterRegistry')
-    expect(text).toContain('确定性约束')
-    // Script execution model constraints (not ESM / no import / no TS / single export / top-level return)
-    expect(text).toContain('脚本编写约束')
-    expect(text).toContain('不转译 TS')
-    expect(text).toContain('禁 `import`')
+    expect(text).toContain('4096')
  })

  test('appends user-provided args to the prompt when given', async () => {
@@ -70,7 +76,7 @@ describe('registerUltracodeSkill', () => {
    )
    const text = (blocks[0] as { type: 'text'; text: string }).text
    expect(text.endsWith('迁移 auth 模块\n')).toBe(true)
-    expect(text).toContain('用户输入')
+    expect(text).toContain('User input')
  })

  test('is not gated behind USER_TYPE — registers with no env set', () => {
--- a/src/skills/bundled/ultracode.ts
+++ b/src/skills/bundled/ultracode.ts
@@ -1,102 +1,224 @@
 import { registerBundledSkill } from '../bundledSkills.js'

 /**
- * /ultracode — multi-agent workflow orchestration method (knowledge-only prompt skill).
+ * /ultracode — multi-agent workflow orchestration playbook (knowledge-only prompt skill).
 *
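- * Calling it injects the workflow orchestration manual into context with zero runtime side effects: it does not change the main loop
- * or flip any behavior switch. The user/model uses it to judge when to use the Workflow tool, how to orchestrate,
- * and how to ensure quality and recoverability.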
+ * Injects the Workflow orchestration manual into context with zero runtime side
+ * effects: it doesn't change the main loop or toggle any behavior switch. The
+ * user/model uses it to decide when to call the Workflow tool, how to script
+ * fan-out and verification, and how to keep runs deterministic and resumable.
 *
- * General-purpose skill (not ant-only), available to all users.
+ * General-purpose skill (not ant-only); available to all users.
 */
-const ULTRACODE_PROMPT = `# /ultracode — 多 agent workflow 编排工作法
+const ULTRACODE_PROMPT = `# /ultracode — Workflow Orchestration Playbook

-## 何时用 Workflow 工具
+Execute a workflow script that orchestrates multiple subagents deterministically. Workflows run in the background — this tool returns immediately with a task ID, and a \`<task-notification>\` arrives when the workflow completes. Use \`/workflows\` to watch live progress.

-用，当任务满足任一：
- 可**分解 / 并行**（多文件、多维度、可独立推进的子任务）。
- 需要**多视角置信**（如审查：先生成再对抗式验证）。
- **规模超单上下文**（大迁移、广度审计、长尾枚举）。
- 需要 **resume / 可审计**（journal 重放、确定性回放）。
+A workflow structures work across many agents — to be comprehensive (decompose and cover in parallel), to be confident (independent perspectives and adversarial checks before committing), or to take on scale one context can't hold (migrations, audits, broad sweeps). The script is where you encode that structure: what fans out, what verifies, what synthesizes.

-**不要用**：琐碎单文件改、单次问答、一次 Read 能解决的事——直接做。
+ONLY call this tool when the user has explicitly opted into multi-agent orchestration. Workflows can spawn dozens of agents and consume a large amount of tokens; the user must request that scale, not have it inferred. Explicit opt-in means one of:

-## 编排原语（workflow 脚本内可用）
+- The user included the keyword "ultracode" in their prompt (you'll see a system-reminder confirming it).
+- Ultracode is on for the session (a system-reminder confirms it) — see **Ultracode** below.
+- The user directly asked you to run a workflow or use multi-agent orchestration in their own words ("use a workflow", "run a workflow", "fan out agents", "orchestrate this with subagents"). The ask must be in the user's words — a task that would merely benefit from a workflow does not count.
+- The user invoked a skill or slash command whose instructions tell you to call Workflow.
+- The user asked you to run a specific named or saved workflow.

- \`agent(prompt, opts?)\` — 派发一个子 agent；返回其最终文本，或（带 \`opts.schema\` 时）schema 校验对象。可在 opts 指定 \`model\`、\`agentType\`、\`label\`、\`phase\`、\`schema\`。
- \`parallel([() => agent(...), ...])\` — 并发跑 thunk 数组，等全部完成。**单项抛错 → 该项变 \`null\`**，其余保留。是 barrier。
- \`pipeline(items, stage1, stage2, …)\` — 每个 item 链式过各 stage；**item 间无 barrier**（item A 可在 stage 3 时 item B 仍在 stage 1），stage 内顺序。单 item 某 stage 抛错 → 该 item \`null\`。
- \`phase(title)\` — 标记阶段（监控面板按此展示进度分组）。
- \`log(msg)\` — 进度日志（面板展示，无状态变更）。
- \`workflow(name | { scriptPath }, args?)\` — 嵌套一层子 workflow（**仅允许一层**）。
+For any other task — even one that would clearly benefit from parallelism — do NOT call this tool. Use the Agent tool for individual subagents, or briefly describe what a multi-agent workflow could do and how much it would roughly cost, and ask the user whether to run it. Mention they can ask for one with "use a workflow" in a future message to skip the ask.

-## 脚本编写约束（引擎执行模型，违反直接报错）
+When you do call it, the right move is often **hybrid**: scout inline first (list the files, find the channels, scope the diff) to discover the work-list, then call Workflow to pipeline over it. You don't need to know the shape before the *task* — only before the *orchestration step*.

-脚本是 \`new AsyncFunction\` 的**函数体**，不是 ESM 模块，引擎**不转译 TS**。这是脚本报错的首要原因，务必遵守：
+Common single-phase workflows you can chain across turns:

- **禁 \`import\`**：\`agent\`/\`parallel\`/\`pipeline\`/\`phase\`/\`log\`/\`workflow\` 与 \`args\`/\`budget\` 是注入的形参，直接用，不 import 任何东西。
- **禁 TS 语法**：不要类型注解（\`x: number\`）、\`interface\`、\`enum\`、\`as\`、泛型——即便文件扩展名是 \`.ts\`，引擎不转译会原样报语法错。**推荐 \`.js\` / \`.mjs\`**。
- **只允许一处 \`export const meta = {...}\`**（纯字面量，引擎正则提取剥离）；不要 \`export\` 其他任何东西，不要 \`export default\`。
- **顶层 \`return\` 返回结果**（函数体内 return 合法且必需）。
+- **Understand** — parallel readers over relevant subsystems → structured map
+- **Design** — judge panel of N independent approaches → scored synthesis
+- **Review** — dimensions → find → adversarially verify (example below)
+- **Research** — multi-modal sweep → deep-read → synthesize
+- **Migrate** — discover sites → transform each (worktree isolation) → verify
+
+For larger work, run several in sequence — read each result before deciding the next phase. You stay in the loop; each workflow is one well-scoped fan-out.
+
+**Opt-in is harness-injected, not repo-controlled.** The "ultracode is on for the session" system-reminder is produced by the harness (claude.ai/client), not by this skill or any repo state. This skill is knowledge-only — it injects the playbook and toggles nothing. There is no in-repo feature flag, env var, or effort level for ultracode; do not add one. The Workflow tool is compiled in via the build-time WORKFLOW_SCRIPTS feature flag and exposes no per-session runtime gate.
+
+**Ultracode.** When a system-reminder confirms ultracode is on, that opt-in is standing: author and run a workflow for every substantive task by default. The goal is the most exhaustive, correct answer you can produce — token cost is not a constraint. For multi-phase work (understand → design → implement → review), that often means several workflows in sequence — one per phase — so you stay in the loop between them. The quality patterns below (adversarial verify, multi-modal sweep, completeness critic, loop-until-dry) are the tools; pick what fits the task. Lean toward orchestrating with workflows and adversarially verifying your findings — unless the work is trivial or already verified. Solo only on conversational turns or trivial mechanical edits. When a reminder says ultracode is off, revert to the opt-in rule above.
+
+Pass the script inline via \`script\` — do not Write it to a file first. Every invocation automatically persists its script to a file under the session directory and returns the path in the tool result. To iterate on a workflow, edit that file with Write/Edit and re-invoke Workflow with \`{scriptPath: "<path>"}\` instead of resending the full script.
+
+Every script must begin with \`export const meta = {...}\`:

 \`\`\`js
-// .claude/workflows/review-changes.js  ← 纯 JS，无类型注解
-export const meta = { name: 'review-changes', description: '按维度审查改动' }
-
-const DIMENSIONS = [{ key: 'bugs' }, { key: 'perf' }]
-const results = await pipeline(
-  DIMENSIONS,
-  d => agent(\`审查 \${d.key}\`, { phase: 'Review' }),
-  r => parallel(((r && r.findings) || []).map(f => () => agent(\`验证 \${f}\`))),
-)
-return results.flat().filter(Boolean)
+export const meta = {
+  name: 'find-flaky-tests',
+  description: 'Find flaky tests and propose fixes',   // one-line, shown in permission dialog
+  phases: [                                            // one entry per phase() call
+    { title: 'Scan', detail: 'grep test logs for retries' },
+    { title: 'Fix', detail: 'one agent per flaky test' },
+  ],
+}
+// script body starts here — use agent()/parallel()/pipeline()/phase()/log()
+phase('Scan')
+const flaky = await agent('grep CI logs for retry markers', {schema: FLAKY_SCHEMA})
+...
 \`\`\`

-## 确定性约束（关键，违反则 resume 失效）
+The \`meta\` object must be a PURE LITERAL — no variables, function calls, spreads, or template interpolation. Required fields: \`name\`, \`description\`. Optional: \`whenToUse\` (shown in the workflow list), \`phases\`. Use the SAME phase titles in meta.phases as in phase() calls — titles are matched exactly; a phase() call with no matching meta entry just gets its own progress group. Add \`model\` to a phase entry when that phase uses a specific model override.

-脚本内**禁用** \`Date.now()\` / \`Math.random()\` / 无参 \`new Date()\`（破坏 journal 重放）。
-需要时间戳 / 随机种子时，经 \`args\` 传入。\`export const meta = { ... }\` 必须是**纯字面量**（无变量、函数调用、模板插值）。
+Script body hooks:

-上限（引擎硬限）：单次 \`parallel\`/\`pipeline\` ≤ **4096** items；单个 workflow 总 **≤ 1000** agent；并发 cap = \`min(16, cores - 2)\`。
+- \`agent(prompt: string, opts?: {label?: string, phase?: string, schema?: object, model?: string, isolation?: 'worktree', agentType?: string}): Promise<any>\` — spawn a subagent. Without schema, returns its final text as a string. With schema (a JSON Schema), the subagent is forced to call a StructuredOutput tool and agent() returns the validated object — no parsing needed. Returns null if the user skips the agent mid-run or the subagent dies on a terminal API error after retries (filter with .filter(Boolean)). opts.label overrides the display label. opts.phase explicitly assigns this agent to a progress group (use this inside pipeline()/parallel() stages to avoid races on the global phase() state — same phase string → same group box). opts.model overrides the model for this agent call. Default to omitting it — the agent inherits the main-loop model (the resolved session model), which is almost always correct. Only set it when you're highly confident a different tier fits the task; when unsure, omit. opts.isolation: 'worktree' runs the agent in a fresh git worktree — EXPENSIVE (~200-500ms setup + disk per agent), use ONLY when agents mutate files in parallel and would otherwise conflict; the worktree is auto-removed if unchanged. opts.agentType uses a custom subagent type (e.g. 'Explore', 'code-reviewer') instead of the default workflow subagent — resolved from the same registry as the Agent tool; composes with schema (the custom agent's system prompt gets a StructuredOutput instruction appended).
+- \`pipeline(items, stage1, stage2, ...): Promise<any[]>\` — run each item through all stages independently, NO barrier between stages. Item A can be in stage 3 while item B is still in stage 1. This is the DEFAULT for multi-stage work. Wall-clock = slowest single-item chain, not sum-of-slowest-per-stage. Every stage callback receives (prevResult, originalItem, index) — use originalItem/index in later stages to label work without threading context through stage 1's return value. A stage that throws drops that item to \`null\` and skips its remaining stages.
+- \`parallel(thunks: Array<() => Promise<any>>): Promise<any[]>\` — run tasks concurrently. This is a BARRIER: awaits all thunks before returning. A thunk that throws (or whose agent errors) resolves to \`null\` in the result array — the call itself never rejects, so \`.filter(Boolean)\` before using the results. Use ONLY when you genuinely need all results together.
+- \`log(message: string): void\` — emit a progress message to the user (shown as a narrator line above the progress tree)
+- \`phase(title: string): void\` — start a new phase; subsequent agent() calls are grouped under this title in the progress display
+- \`args: any\` — the value passed as Workflow's \`args\` input, verbatim (undefined if not provided). Pass arrays/objects as actual JSON values in the tool call, NOT as a JSON-encoded string — \`args: ["a.ts", "b.ts"]\`, not \`args: "[\\"a.ts\\", ...]"\` (a stringified list reaches the script as one string, so \`args.filter\`/\`args.map\` throw). Use this to parameterize named workflows — e.g. pass a research question, target path, or config object directly instead of via a side-channel file.
+- \`budget: {total: number|null, spent(): number, remaining(): number}\` — the turn's token target from the user's "+500k"-style directive. \`budget.total\` is null if no target was set. \`budget.spent()\` returns output tokens spent this turn across the main loop and all workflows — the pool is shared, not per-workflow. \`budget.remaining()\` returns \`max(0, total - spent())\`, or \`Infinity\` if no target. The target is a HARD ceiling, not advisory: once \`spent()\` reaches \`total\`, further \`agent()\` calls throw. Use for dynamic loops: \`while (budget.total && budget.remaining() > 50_000) { ... }\`, or static scaling: \`const FLEET = budget.total ? Math.floor(budget.total / 100_000) : 5\`.
+- \`workflow(nameOrRef: string | {scriptPath: string}, args?: any): Promise<any>\` — run another workflow inline as a sub-step and return whatever it returns. Pass a name to invoke a saved workflow (same registry as {name: "..."}), or {scriptPath} to run a script file you Wrote earlier. The child shares this run's concurrency cap, agent counter, abort signal, and token budget — its agents appear under a "▸ name" group in /workflows and its tokens count toward budget.spent(). The args param becomes the child's \`args\` global. Nesting is one level only: workflow() inside a child throws. Throws on unknown name / unreadable scriptPath / child syntax error; catch to handle gracefully.

-## 质量模式（每种给最小片段）
+Concurrent agent() calls are capped at min(16, cpu cores - 2) per workflow — excess calls queue and run as slots free up. You can still pass 100 items to parallel()/pipeline() and they all complete; only ~10 run at any moment. Total agent count across a workflow's lifetime is capped at 1000 — a runaway-loop backstop set far above any real workflow. A single parallel()/pipeline() call accepts at most 4096 items; passing more is an explicit error, not a silent truncation.

- **Adversarial verify**：\`parallel([() => agent(claim), () => agent(refute)])\`，多数 refute 即弃。
- **Perspective-diverse verify**：同一发现给多个 verifier 不同 lens（正确性 / 安全 / 复现），红队冗余抓不到的失败模式。
- **Judge panel**：N 个独立方案 → 评分 → 取胜者，嫁接亚军亮点。
- **Loop-until-dry**：\`while (fresh.length) { found = await parallel(...); fresh = dedup(found) }\`，连续 K 轮无新增即停。
- **Multi-modal sweep**：多个 agent 各用不同搜索角度（按容器 / 按内容 / 按实体 / 按时间），互不可见。
- **Completeness critic**：末尾一个 agent 问"还缺什么"，其发现成为下一轮工作。
+Subagents are told their final text IS the return value (not a human-facing message), so they return raw data. For structured output, use the schema option — validation happens at the tool-call layer so the model retries on mismatch.

-## 后端路由
+Workflow agents can reach all session-connected MCP tools via ToolSearch — schemas load on demand per agent. Caveat: interactively-authenticated MCP servers (e.g. claude.ai) may be absent in headless/cron runs.

-\`AgentAdapterRegistry\` v1 为单后端（默认 \`claude-code\`）。由后端**内部**按 \`model\` / \`agentType\` 深度解析当前会话的 provider / model / agent 体系（registry 本身可配路由规则，v1 未配，恒落默认）。例：\`agent({ model: 'claude-haiku-4-5', agentType: 'Explore' })\` 经默认后端命中真实 agent 定义。
+Scripts are plain JavaScript, NOT TypeScript — type annotations (\`: string[]\`), interfaces, and generics fail to parse. The script body runs in an async context — use \`await\` directly. Standard JS built-ins (JSON, Math, Array, etc.) are available — EXCEPT \`Date.now()\`/\`Math.random()\`/argless \`new Date()\`, which throw (they would break resume); pass timestamps in via \`args\`, stamp results after the workflow returns, and for randomness vary the agent prompt/label by index. No filesystem or Node.js API access.

-## resume / budget
+DEFAULT TO pipeline(). Only reach for a barrier (parallel between stages) when you genuinely need ALL prior-stage results together.

- \`resumeFromRunId: '<id>'\` — 重放该 run 的 journal，已完成的 \`agent()\` 秒回缓存结果；首个发散点之后全部现场重跑。
- \`budget.total\` — token 硬顶（默认 \`null\` = 无限）；\`budget.spent()\` / \`budget.remaining()\` 读实时消耗。耗尽后再发 agent 抛错。
+A barrier is correct ONLY when stage N needs cross-item context from all of stage N-1:

-## 文件与命令
+- Dedup/merge across the full result set before expensive downstream work
+- Early-exit if the total count is zero ("0 bugs found → skip verification entirely")
+- Stage N's prompt references "the other findings" for comparison

- 脚本目录：\`.claude/workflows/<name>.ts|.js|.mjs\` → 自动成 \`/<name>\` 命令。
- run 记录：\`.claude/workflow-runs/<runId>/journal.jsonl\`。
- 监控面板：\`/workflows\`（双栏：左 run 列表，右 phase + agent；键位 j/k 选中、r resume、x kill、n 新建提示、q 退出）。
- 工具：\`Workflow\`（input 字段：\`script\` / \`name\` / \`scriptPath\` / \`args\` / \`resumeFromRunId\`）。
+A barrier is NOT justified by:
+
+- "I need to flatten/map/filter first" — do it inside a pipeline stage: \`pipeline(items, stageA, r => transform([r]).flat(), stageB)\`
+- "The stages are conceptually separate" — that's what pipeline() models. Separate stages ≠ synchronized stages.
+- "It's cleaner code" — barrier latency is real. If 5 finders run and the slowest takes 3× as long as the fastest, a barrier leaves the fastest finder idle for 2/3 of the stage's wall-clock.
+
+Smell test: if you wrote
+
+\`\`\`js
+const a = await parallel(...)
+const b = transform(a)        // flatten, map, filter — no cross-item dependency
+const c = await parallel(b.map(...))
+\`\`\`
+
+that middle transform doesn't need the barrier. Rewrite as a pipeline with the transform inside a stage. When in doubt: pipeline.
+
+The canonical multi-stage pattern — pipeline by default, each dimension verifies as soon as its review completes:
+
+\`\`\`js
+export const meta = {
+  name: 'review-changes',
+  description: 'Review changed files across dimensions, verify each finding',
+  phases: [{ title: 'Review' }, { title: 'Verify' }],
+}
+const DIMENSIONS = [{key: 'bugs', prompt: '...'}, {key: 'perf', prompt: '...'}]
+const results = await pipeline(
+  DIMENSIONS,
+  d => agent(d.prompt, {label: \`review:\${d.key}\`, phase: 'Review', schema: FINDINGS_SCHEMA}),
+  review => parallel(review.findings.map(f => () =>
+    agent(\`Adversarially verify: \${f.title}\`, {label: \`verify:\${f.file}\`, phase: 'Verify', schema: VERDICT_SCHEMA})
+      .then(v => ({...f, verdict: v}))
+  ))
+)
+const confirmed = results.flat().filter(Boolean).filter(f => f.verdict?.isReal)
+return { confirmed }
+// Dimension 'bugs' findings verify while dimension 'perf' is still reviewing. No wasted wall-clock.
+\`\`\`
+
+When a barrier IS correct — dedup across all findings before expensive verification:
+
+\`\`\`js
+const all = await parallel(DIMENSIONS.map(d => () => agent(d.prompt, {schema: FINDINGS_SCHEMA})))
+const deduped = dedupeByFileAndLine(all.filter(Boolean).flatMap(r => r.findings))  // <-- genuinely needs ALL at once
+const verified = await parallel(deduped.map(f => () => agent(verifyPrompt(f), {schema: VERDICT_SCHEMA})))
+\`\`\`
+
+Loop-until-count pattern — accumulate to a target:
+
+\`\`\`js
+const bugs = []
+while (bugs.length < 10) {
+  const result = await agent("Find bugs in this codebase.", {schema: BUGS_SCHEMA})
+  bugs.push(...(result?.bugs ?? []))  // agent() returns null if skipped or dead
+  log(\`\${bugs.length}/10 found\`)
+}
+\`\`\`
+
+Loop-until-budget pattern — scale depth to the user's "+500k" directive. Guard on budget.total: with no target set, remaining() is Infinity and the loop would run straight to the 1000-agent cap.
+
+\`\`\`js
+const bugs = []
+while (budget.total && budget.remaining() > 50_000) {
+  const result = await agent("Find bugs in this codebase.", {schema: BUGS_SCHEMA})
+  bugs.push(...(result?.bugs ?? []))  // agent() returns null if skipped or dead
+  log(\`\${bugs.length} found, \${Math.round(budget.remaining()/1000)}k remaining\`)
+}
+\`\`\`
+
+Composing patterns — exhaustive review (find → dedup vs seen → diverse-lens panel → loop-until-dry):
+
+\`\`\`js
+const seen = new Set(), confirmed = []
+let dry = 0
+while (dry < 2) {                                              // loop-until-dry
+  const found = (await parallel(FINDERS.map(f => () =>          // barrier: collect all finders this round
+    agent(f.prompt, {phase: 'Find', schema: BUGS})))).filter(Boolean).flatMap(r => r.bugs)
+  const fresh = found.filter(b => !seen.has(key(b)))           // dedup vs ALL seen — plain code, not an agent
+  if (!fresh.length) { dry++; continue }
+  dry = 0; fresh.forEach(b => seen.add(key(b)))
+  const judged = await parallel(fresh.map(b => () =>           // every fresh bug judged concurrently...
+    parallel(['correctness','security','repro'].map(lens => () =>   // ...each by 3 distinct lenses
+      agent(\`Judge "\${b.desc}" via the \${lens} lens — real?\`, {phase: 'Verify', schema: VERDICT})))
+      .then(vs => ({ b, real: vs.filter(Boolean).filter(v => v.real).length >= 2 }))))
+  confirmed.push(...judged.filter(v => v.real).map(v => v.b))
+}
+return confirmed
+// dedup vs \`seen\`, NOT \`confirmed\` — else judge-rejected findings reappear every round and it never converges.
+\`\`\`
+
+Quality patterns — common shapes; pick by task and compose freely:
+
+- Adversarial verify: spawn N independent skeptics per finding, each prompted to REFUTE. Kill if ≥majority refute. Prevents plausible-but-wrong findings from surviving.
+
+\`\`\`js
+const votes = await parallel(Array.from({length: 3}, () => () =>
+  agent(\`Try to refute: \${claim}. Default to refuted=true if uncertain.\`, {schema: VERDICT})))
+const survives = votes.filter(Boolean).filter(v => !v.refuted).length >= 2
+\`\`\`
+
+- Perspective-diverse verify: when a finding can fail in more than one way, give each verifier a distinct lens (correctness, security, perf, does-it-reproduce) instead of N identical refuters — diversity catches failure modes redundancy can't.
+- Judge panel: generate N independent attempts from different angles (e.g. MVP-first, risk-first, user-first), score with parallel judges, synthesize from the winner while grafting the best ideas from runners-up. Beats one-attempt-iterated when the solution space is wide.
+- Loop-until-dry: for unknown-size discovery (bugs, issues, edge cases), keep spawning finders until K consecutive rounds return nothing new. Simple counters (while count < N) miss the tail.
+- Multi-modal sweep: parallel agents each searching a different way (by-container, by-content, by-entity, by-time). Each is blind to what the others surface; useful when one search angle won't find everything.
+- Completeness critic: a final agent that asks "what's missing — modality not run, claim unverified, source unread?" What it finds becomes the next round of work.
+- No silent caps: if a workflow bounds coverage (top-N, no-retry, sampling), \`log()\` what was dropped — silent truncation reads as "covered everything" when it didn't.
+
+Scale to what the user asked for. "find any bugs" → a few finders, single-vote verify. "thoroughly audit this" or "be comprehensive" → larger finder pool, 3–5 vote adversarial pass, synthesis stage. When unsure, lean toward thoroughness for research/review/audit requests and toward brevity for quick checks.
+
+These patterns aren't exhaustive — compose novel harnesses when the task calls for it (tournament brackets, self-repair loops, staged escalation, whatever fits).
+
+Use this tool for multi-step orchestration where control flow should be deterministic (loops, conditionals, fan-out) rather than model-driven.
+
+## Resume
+
+The tool result includes a runId. To resume after a pause, kill, or script edit, relaunch with \`Workflow({scriptPath, resumeFromRunId})\` — the longest unchanged prefix of agent() calls returns cached results instantly; the first edited/new call and everything after it runs live. Same script + same args → 100% cache hit. Date.now()/Math.random()/new Date() are unavailable in scripts (they would break this) — stamp results after the workflow returns, or pass timestamps via args. Fallback when no journal is available: Read agent-<id>.jsonl files in the transcript directory and hand-author a continuation script.
 `

 export function registerUltracodeSkill(): void {
  registerBundledSkill({
    name: 'ultracode',
    description:
-      '进入多 agent workflow 编排模式：何时用、编排原语、质量模式、确定性约束、后端路由、resume/budget、文件与命令。',
+      'Enter multi-agent workflow orchestration mode: when to use the Workflow tool, script primitives, quality patterns, determinism constraints, resume/budget, and files/commands.',
    whenToUse:
-      '任务可分解/并行、需多视角置信、规模超单上下文、或需 resume/可审计时，用 Workflow 工具编排多个子 agent。',
+      'When a task can be decomposed or parallelized, needs multi-perspective confidence (e.g. find then adversarially verify), exceeds a single context (large migrations, broad audits, long-tail enumeration), or needs resume/auditability — orchestrate multiple subagents with the Workflow tool.',
    userInvocable: true,
    async getPromptForCommand(args) {
      let prompt = ULTRACODE_PROMPT
      if (args) {
-        prompt += `\n## 用户输入\n\n${args}\n`
+        prompt += `\n## User input\n\n${args}\n`
      }
      return [{ type: 'text', text: prompt }]
    },
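The manual above describes `pipeline()` as having no barrier between stages and `parallel()` as a barrier whose failed thunks resolve to `null`. A minimal sketch of those two semantics follows, assuming nothing about the real engine: the function bodies are hypothetical stand-ins, and the concurrency cap, agent counter and budget checks are deliberately left out.

```typescript
// Hypothetical stand-ins for the engine's parallel()/pipeline(); not the real implementation.
type Stage = (prev: unknown, item: unknown, index: number) => unknown

// Barrier: waits for every thunk; a thunk that throws resolves to null instead of rejecting.
async function parallel<T>(thunks: Array<() => Promise<T>>): Promise<Array<T | null>> {
  return Promise.all(
    thunks.map(async thunk => {
      try {
        return await thunk()
      } catch {
        return null
      }
    }),
  )
}

// No barrier between stages: each item walks its own chain of stages independently.
async function pipeline(items: unknown[], ...stages: Stage[]): Promise<unknown[]> {
  return Promise.all(
    items.map(async (item, index) => {
      let prev: unknown = item // stage 1 receives the item itself as prevResult
      try {
        for (const stage of stages) prev = await stage(prev, item, index)
        return prev
      } catch {
        return null // a throwing stage drops the item and skips its remaining stages
      }
    }),
  )
}
```

Because each item awaits only its own chain, a slow item in stage 1 does not hold back a fast item that is already in stage 2, which is the property the "DEFAULT TO pipeline()" guidance relies on.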
--- a/src/utils/effort.ts
+++ b/src/utils/effort.ts
@@ -16,6 +16,10 @@ import {

 export type { EffortLevel }

+// NOTE: 'ultracode' is NOT an effort level. It is a session-scoped multi-agent
+// orchestration opt-in injected by the harness (claude.ai/client) as a
+// system-reminder, orthogonal to the effort parameter. EffortLevel / EffortValue
+// must never include 'ultracode'; /effort only accepts the levels below.
 export const EFFORT_LEVELS = [
  'low',
  'medium',
--- a/src/utils/worktree.ts
+++ b/src/utils/worktree.ts
@@ -1021,11 +1021,13 @@ export async function removeAgentWorktree(

 /**
 * Slug patterns for throwaway worktrees created by AgentTool (`agent-a<7hex>`,
- * from earlyAgentId.slice(0,8)), WorkflowTool (`wf_<runId>-<idx>` where runId
- * is randomUUID().slice(0,12) = 8 hex + `-` + 3 hex), and bridgeMain
- * (`bridge-<safeFilenameId>`). These leak when the parent process is killed
- * (Ctrl+C, ESC, crash) before their in-process cleanup runs. Exact-shape
- * patterns avoid sweeping user-named EnterWorktree slugs like `wf-myfeature`.
+ * from earlyAgentId.slice(0,8)), workflow engine isolation:'worktree'
+ * (`wf_<8hex>-<3hex>-<n>` derived from sha256(runId:agentId) in
+ * claudeCodeBackend — taskId is `w`+base36, not a UUID, so the slug cannot
+ * embed runId directly and is hashed to satisfy this hex pattern), and
+ * bridgeMain (`bridge-<safeFilenameId>`). These leak when the parent process
+ * is killed (Ctrl+C, ESC, crash) before their in-process cleanup runs.
+ * Exact-shape patterns avoid sweeping user-named EnterWorktree slugs like `wf-myfeature`.
 */
 const EPHEMERAL_WORKTREE_PATTERNS = [
  /^agent-a[0-9a-f]{7}$/,
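The comment above explains why the workflow slug is hashed rather than built from `runId` directly. As a rough illustration, the sketch below mirrors the derivation used by `makeWorkflowWorktreeSlug` in this commit and checks it against the regex quoted in `claudeCodeBackend.test.ts`. The ids are made up, and the regex constant name is an assumption rather than the name used in `worktree.ts`.

```typescript
import { createHash } from 'node:crypto'

// Assumed cleanup pattern, copied from the regex asserted in claudeCodeBackend.test.ts.
const WORKFLOW_WORKTREE_SLUG = /^wf_[0-9a-f]{8}-[0-9a-f]{3}-\d+$/

// Hash runId:agentId so that a non-hex runId still yields hex-only segments.
function slugFor(runId: string, agentId: string): string {
  const h = createHash('sha256').update(`${runId}:${agentId}`).digest('hex')
  return `wf_${h.slice(0, 8)}-${h.slice(8, 11)}-${parseInt(h.slice(11, 17), 16) % 100000}`
}

const hashed = slugFor('w1a2b3c', 'agent-1') // hypothetical `w`+base36 runId
const naive = `wf_${'w1a2b3c'}-0` // embedding the base36 runId directly

console.log(WORKFLOW_WORKTREE_SLUG.test(hashed)) // hashed slug matches the cleanup pattern
console.log(WORKFLOW_WORKTREE_SLUG.test(naive)) // naive slug does not, so it would never be swept
```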
--- a/src/workflow/tests/WorkflowsPanel.test.tsx
+++ b/src/workflow/tests/WorkflowsPanel.test.tsx
@@ -34,9 +34,11 @@ test('RunProgress 字段契约：面板读取的 key 均存在', () => {
    workflowName: 'review',
    status: 'running',
    phases: [{ title: 'Find', status: 'done' }],
+    declaredPhases: ['Find', 'Review'],
    currentPhase: 'Review',
    agents: [{ id: 1, label: 'review:api', phase: 'Review', status: 'running' }],
    agentCount: 1,
+    startedAt: 1,
    updatedAt: 1,
  };
  // Paths read by the panel's WorkflowList/Detail
@@ -56,10 +58,12 @@ test('RunProgress 完成/失败形态：returnValue/error 可选', () => {
    workflowName: 'w',
    status: 'completed',
    phases: [],
+    declaredPhases: [],
    currentPhase: null,
    agents: [],
    agentCount: 0,
    returnValue: 'ok',
+    startedAt: 2,
    updatedAt: 2,
  };
  const failed: RunProgress = {
@@ -67,10 +71,12 @@ test('RunProgress 完成/失败形态：returnValue/error 可选', () => {
    workflowName: 'w',
    status: 'failed',
    phases: [],
+    declaredPhases: [],
    currentPhase: null,
    agents: [],
    agentCount: 0,
    error: 'boom',
+    startedAt: 3,
    updatedAt: 3,
  };
  expect(completed.returnValue).toBe('ok');
--- a/src/workflow/tests/claudeCodeBackend.test.ts
+++ b/src/workflow/tests/claudeCodeBackend.test.ts
@@ -21,6 +21,7 @@ mock.module(
      content: [{ type: 'text', text: 'agent-text' }],
      usage: { output_tokens: 42 },
      totalTokens: 42,
+      totalToolUseCount: 3,
    }),
  }),
 )
@@ -42,6 +43,39 @@ mock.module('src/utils/uuid.js', () => ({ createAgentId: () => 'agent-1' }))
 mock.module('src/services/analytics/index.js', () => ({ logEvent: () => {} }))
 mock.module('src/utils/debug.js', () => ({ logForDebugging: () => {} }))

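+// For isolation:'worktree' tests: mock the three worktree helpers (avoids running a real `git worktree add`).
+// Note that mock.module is process-global; worktreeState is defined outside the factory so tests can reset it.
+// cwd.js is deliberately not mocked: runWithCwdOverride running the real AsyncLocalStorage is harmless to the
+// mocked runAgent, and this avoids polluting other tests in the same process that depend on pwd/getCwd.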
+const worktreeState = {
+  shouldThrow: false,
+  hasChanges: false,
+  created: [] as string[],
+  removed: [] as string[],
+  changesCalls: 0,
+}
+mock.module('src/utils/worktree.js', () => ({
+  createAgentWorktree: async (slug: string) => {
+    if (worktreeState.shouldThrow) throw new Error('wt boom')
+    worktreeState.created.push(slug)
+    return {
+      worktreePath: '/fake/wt',
+      worktreeBranch: 'wt-branch',
+      headCommit: 'abc123',
+      gitRoot: '/fake',
+      hookBased: false,
+    }
+  },
+  hasWorktreeChanges: async () => {
+    worktreeState.changesCalls++
+    return worktreeState.hasChanges
+  },
+  removeAgentWorktree: async (path: string) => {
+    worktreeState.removed.push(path)
+    return true
+  },
+}))
+
 import {
  claudeCodeBackend,
  resolveAgentDefinition,
@@ -77,15 +111,68 @@ function ctx() {
  }
 }

-test('文本 agent → ok + token 计量', async () => {
+test('文本 agent → ok + token/tool/model 计量', async () => {
  const res = await claudeCodeBackend.run({ prompt: 'do it' }, ctx())
  expect(res.kind).toBe('ok')
  if (res.kind === 'ok') {
    expect(res.output).toBe('agent-text')
    expect(res.usage.outputTokens).toBe(42)
+    // Fields shown in the panel: tokenCount (= totalTokens) / toolCount / model (falls back to mainLoopModel 'm')
+    expect(res.tokenCount).toBe(42)
+    expect(res.toolCount).toBe(3)
+    expect(res.model).toBe('m')
  }
 })

+test('isolation:worktree → 创建 worktree + 无变更自动清理；slug 匹配清理正则', async () => {
+  worktreeState.shouldThrow = false
+  worktreeState.hasChanges = false
+  worktreeState.created = []
+  worktreeState.removed = []
+  worktreeState.changesCalls = 0
+  const res = await claudeCodeBackend.run(
+    { prompt: 'do', isolation: 'worktree' },
+    ctx(),
+  )
+  expect(res.kind).toBe('ok')
+  expect(worktreeState.created).toHaveLength(1)
+  // The slug must match the cleanupStaleAgentWorktrees cleanup regex ^wf_[0-9a-f]{8}-[0-9a-f]{3}-\d+$
+  expect(worktreeState.created[0]).toMatch(/^wf_[0-9a-f]{8}-[0-9a-f]{3}-\d+$/)
+  expect(worktreeState.changesCalls).toBe(1)
+  expect(worktreeState.removed).toHaveLength(1) // no changes → auto-remove
+})
+
+test('isolation:worktree 有变更 → 保留 worktree（不 remove）', async () => {
+  worktreeState.hasChanges = true
+  worktreeState.created = []
+  worktreeState.removed = []
+  worktreeState.changesCalls = 0
+  const res = await claudeCodeBackend.run(
+    { prompt: 'do', isolation: 'worktree' },
+    ctx(),
+  )
+  expect(res.kind).toBe('ok')
+  expect(worktreeState.removed).toHaveLength(0) // has changes → retain
+  expect(worktreeState.changesCalls).toBe(1)
+})
+
+test('isolation:worktree 创建失败 → fail-closed 返 dead（不静默退化共享 cwd）', async () => {
+  worktreeState.shouldThrow = true
+  const res = await claudeCodeBackend.run(
+    { prompt: 'do', isolation: 'worktree' },
+    ctx(),
+  )
+  expect(res.kind).toBe('dead')
+  worktreeState.shouldThrow = false
+})
+
+test('无 isolation → 不创建 worktree', async () => {
+  worktreeState.created = []
+  const res = await claudeCodeBackend.run({ prompt: 'do' }, ctx())
+  expect(res.kind).toBe('ok')
+  expect(worktreeState.created).toHaveLength(0)
+})
+
 test('runAgent 抛错 → dead', async () => {
  // Override the mock so that runAgent throws (last-write-wins)
  mock.module(
--- a/src/workflow/tests/notifications.test.ts
+++ b/src/workflow/tests/notifications.test.ts
@@ -47,6 +47,7 @@ function makeRun(
    currentPhase: null,
    agents: [],
    agentCount: 0,
+    startedAt: Date.now(),
    updatedAt: Date.now(),
    ...overrides,
  }
--- a/src/workflow/tests/progressStore.test.ts
+++ b/src/workflow/tests/progressStore.test.ts
@@ -173,3 +173,59 @@ test('agent_done 落地 outputShape（ok·object / ok·text / dead 无）', () =
  expect(agents.find(a => a.id === 1)?.outputShape).toBe('text')
  expect(agents.find(a => a.id === 2)?.outputShape).toBeUndefined()
 })
+
+test('agent_progress 实时更新 token/tool（按 agentId 关联）', () => {
+  const { bus, store } = newStore()
+  bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
+  bus.emit({
+    type: 'agent_started',
+    runId: 'r1',
+    agentId: 0,
+    label: 'a',
+    phase: 'A',
+  })
+  bus.emit({
+    type: 'agent_progress',
+    runId: 'r1',
+    agentId: 0,
+    tokenCount: 1200,
+    toolCount: 2,
+  })
+  let a = store.get('r1')!.agents.find(x => x.id === 0)!
+  expect(a.tokenCount).toBe(1200)
+  expect(a.toolCount).toBe(2)
+  bus.emit({
+    type: 'agent_progress',
+    runId: 'r1',
+    agentId: 0,
+    tokenCount: 2400,
+    toolCount: 3,
+  })
+  a = store.get('r1')!.agents.find(x => x.id === 0)!
+  expect(a.tokenCount).toBe(2400)
+  expect(a.toolCount).toBe(3)
+})
+
+test('agent_done 落地 model/tokenCount/toolCount（ok 变体）', () => {
+  const { bus, store } = newStore()
+  bus.emit({ type: 'run_started', runId: 'r1', workflowName: 'w', meta: null })
+  bus.emit({ type: 'agent_started', runId: 'r1', agentId: 0, phase: 'A' })
+  bus.emit({
+    type: 'agent_done',
+    runId: 'r1',
+    agentId: 0,
+    phase: 'A',
+    result: {
+      kind: 'ok',
+      output: 'x',
+      usage: { outputTokens: 5 },
+      model: 'glm-5.2',
+      tokenCount: 22900,
+      toolCount: 1,
+    },
+  })
+  const a = store.get('r1')!.agents.find(x => x.id === 0)!
+  expect(a.model).toBe('glm-5.2')
+  expect(a.tokenCount).toBe(22900)
+  expect(a.toolCount).toBe(1)
+})
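The two tests above pin down how the store treats `agent_progress` (each event overwrites the previous token/tool counts) and `agent_done` (final model and counts land on the agent row). One way to picture that behaviour is the reducer sketched below. It is an illustration only: the event shapes are simplified and hypothetical (the real bus events also carry `runId`, `phase` and a nested `result`), and `applyAgentEvent` is not a function from this repository.

```typescript
// Simplified, hypothetical event shapes; the real bus events also carry runId and a result object.
type AgentEvent =
  | { type: 'agent_started'; agentId: number }
  | { type: 'agent_progress'; agentId: number; tokenCount: number; toolCount: number }
  | { type: 'agent_done'; agentId: number; model?: string; tokenCount?: number; toolCount?: number }

interface AgentRow {
  id: number
  status: 'running' | 'done'
  model?: string
  tokenCount?: number
  toolCount?: number
}

function applyAgentEvent(agents: Map<number, AgentRow>, e: AgentEvent): void {
  if (e.type === 'agent_started') {
    agents.set(e.agentId, { id: e.agentId, status: 'running' })
    return
  }
  const row = agents.get(e.agentId)
  if (!row) return // ignore events for agents that were never started
  if (e.type === 'agent_progress') {
    // Progress events carry running totals, so overwrite rather than add.
    row.tokenCount = e.tokenCount
    row.toolCount = e.toolCount
    return
  }
  // agent_done: final values win; keep the last progress values if none are given.
  row.status = 'done'
  row.model = e.model ?? row.model
  row.tokenCount = e.tokenCount ?? row.tokenCount
  row.toolCount = e.toolCount ?? row.toolCount
}
```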
--- a/src/workflow/tests/selectors.test.ts
+++ b/src/workflow/tests/selectors.test.ts
@@ -17,6 +17,7 @@ function run(partial: Partial<RunProgress>): RunProgress {
    currentPhase: null,
    agents: [],
    agentCount: 0,
+    startedAt: 1,
    updatedAt: 1,
    ...partial,
  }
--- a/src/workflow/tests/service.test.ts
+++ b/src/workflow/tests/service.test.ts
@@ -145,6 +145,29 @@ test('launch → completed；store 出现该 run', async () => {
  expect(r!.workflowName).toBe('workflow')
 })

+test('launch inline script → 返回 scriptPath（持久化到 cwdOverride 目录）', async () => {
+  __resetWorkflowServiceForTests()
+  const dir = await mkdtemp(join(tmpdir(), 'wf-svc-'))
+  try {
+    const { ports, store } = fakePorts()
+    const svc = makeService(ports, store, dir)
+    const result = await svc.launch(
+      { script: `return agent('x')` },
+      stubTUC,
+      stubCanUseTool,
+    )
+    expect(result.scriptPath).toBe(
+      join(dir, '.claude', 'workflow-runs', 'run-1', 'script.js'),
+    )
+    const { readFile } = await import('node:fs/promises')
+    expect(await readFile(result.scriptPath!, 'utf-8')).toBe(
+      `return agent('x')`,
+    )
+  } finally {
+    await rm(dir, { recursive: true, force: true })
+  }
+})
+
 test('kill 走 taskRegistrar.kill', async () => {
  __resetWorkflowServiceForTests()
  const { ports, store, killed } = fakePorts()
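The inline-script test above expects the script text to be written to `.claude/workflow-runs/<runId>/script.js` under the working directory and the resulting path to be returned. A minimal sketch of that behaviour follows. The function name and signature are hypothetical, since the real `persistInlineScript` is not shown in this part of the diff.

```typescript
import { mkdir, writeFile } from 'node:fs/promises'
import { join } from 'node:path'

// Hypothetical sketch; the real persistInlineScript signature is not shown here.
async function persistInlineScriptSketch(
  cwd: string,
  runId: string,
  script: string,
): Promise<string> {
  const dir = join(cwd, '.claude', 'workflow-runs', runId)
  await mkdir(dir, { recursive: true }) // create the run directory if it is missing
  const scriptPath = join(dir, 'script.js')
  await writeFile(scriptPath, script, 'utf-8') // store the inline script verbatim
  return scriptPath // the caller can hand this back as a reusable scriptPath
}
```

Returning the path is what closes the inline → edit → resubmit-by-`scriptPath` loop described in the commit message.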
--- a/src/workflow/tests/status.test.ts
+++ b/src/workflow/tests/status.test.ts
@@ -3,12 +3,15 @@ import type { AgentProgress, RunProgress } from '../progress/store.js'
 import {
  STATUS_DOT,
  RUN_STATUS_COLOR,
+  RUN_STATUS_TEXT,
  PHASE_MARK,
  PHASE_COLOR,
  agentVisual,
+  formatTokenCount,
+  agentMetaText,
 } from '../panel/status.js'

-test('STATUS_DOT / RUN_STATUS_COLOR 覆盖四种 run 状态且为非空字符', () => {
+test('STATUS_DOT / RUN_STATUS_COLOR / RUN_STATUS_TEXT 覆盖四种 run 状态', () => {
  const statuses: RunProgress['status'][] = [
    'running',
    'completed',
@@ -18,11 +21,14 @@ test('STATUS_DOT / RUN_STATUS_COLOR 覆盖四种 run 状态且为非空字符',
  for (const s of statuses) {
    expect(STATUS_DOT[s].length).toBeGreaterThan(0)
    expect(RUN_STATUS_COLOR[s]).toBeTruthy()
+    expect(RUN_STATUS_TEXT[s].length).toBeGreaterThan(0)
  }
  expect(STATUS_DOT.running).toBe('●')
  expect(STATUS_DOT.completed).toBe('✓')
  expect(STATUS_DOT.failed).toBe('✗')
  expect(STATUS_DOT.killed).toBe('■')
+  expect(RUN_STATUS_TEXT.completed).toBe('done')
+  expect(RUN_STATUS_TEXT.running).toBe('running')
 })

 test('PHASE_MARK / PHASE_COLOR 覆盖 running/done/pending', () => {
@@ -32,44 +38,51 @@ test('PHASE_MARK / PHASE_COLOR 覆盖 running/done/pending', () => {
  expect(PHASE_COLOR.pending).toBe('subtle')
 })

-test('agentVisual：running → ● warning running', () => {
+test('agentVisual：running → ● warning', () => {
  const a: AgentProgress = { id: 1, status: 'running' }
-  expect(agentVisual(a)).toEqual({
-    mark: '●',
-    color: 'warning',
-    suffix: 'running',
-  })
+  expect(agentVisual(a)).toEqual({ mark: '●', color: 'warning' })
 })

-test('agentVisual：done·object → ✓ success object', () => {
+test('agentVisual：done·ok → ✓ success（不再带 outputShape 后缀）', () => {
  const a: AgentProgress = {
    id: 1,
    status: 'done',
    resultKind: 'ok',
    outputShape: 'object',
  }
-  expect(agentVisual(a)).toEqual({
-    mark: '✓',
-    color: 'success',
-    suffix: 'object',
-  })
+  expect(agentVisual(a)).toEqual({ mark: '✓', color: 'success' })
 })

-test('agentVisual：done·text → ✓ success text', () => {
+test('agentVisual：dead → ✗ error', () => {
+  const a: AgentProgress = { id: 1, status: 'done', resultKind: 'dead' }
+  expect(agentVisual(a)).toEqual({ mark: '✗', color: 'error' })
+})
+
+test('formatTokenCount：<1000 原值，≥1000 保留 1 位小数 + k', () => {
+  expect(formatTokenCount(undefined)).toBe('0')
+  expect(formatTokenCount(0)).toBe('0')
+  expect(formatTokenCount(42)).toBe('42')
+  expect(formatTokenCount(1000)).toBe('1.0k')
+  expect(formatTokenCount(22900)).toBe('22.9k')
+})
+
+test('agentMetaText：model · Nk tok · N tool', () => {
  const a: AgentProgress = {
    id: 1,
    status: 'done',
-    resultKind: 'ok',
-    outputShape: 'text',
+    model: 'glm-5.2',
+    tokenCount: 22900,
+    toolCount: 1,
  }
-  expect(agentVisual(a)).toEqual({
-    mark: '✓',
-    color: 'success',
-    suffix: 'text',
-  })
+  expect(agentMetaText(a)).toBe('glm-5.2 · 22.9k tok · 1 tool')
 })

-test('agentVisual：dead → ✗ error dead', () => {
-  const a: AgentProgress = { id: 1, status: 'done', resultKind: 'dead' }
-  expect(agentVisual(a)).toEqual({ mark: '✗', color: 'error', suffix: 'dead' })
+test('agentMetaText：无 model 时省略前段', () => {
+  const a: AgentProgress = {
+    id: 1,
+    status: 'running',
+    tokenCount: 500,
+    toolCount: 2,
+  }
+  expect(agentMetaText(a)).toBe('500 tok · 2 tool')
 })
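The `formatTokenCount` and `agentMetaText` assertions above fully determine the display format for the cases they cover. The sketch below is one implementation that would satisfy them. It is inferred from the tests only: the real code lives in `panel/status.ts` and may differ, and the `AgentMeta` interface is a hypothetical subset of `AgentProgress`.

```typescript
// Hypothetical re-implementation inferred from the assertions above; not the code in panel/status.ts.
interface AgentMeta {
  model?: string
  tokenCount?: number
  toolCount?: number
}

// Below 1000 the raw value is shown; from 1000 upward, one decimal place plus "k".
function formatTokenCount(n?: number): string {
  if (!n) return '0'
  return n < 1000 ? String(n) : `${(n / 1000).toFixed(1)}k`
}

// "model · Nk tok · N tool", with the model segment omitted when unknown.
function agentMetaText(a: AgentMeta): string {
  const parts = [`${formatTokenCount(a.tokenCount)} tok`, `${a.toolCount ?? 0} tool`]
  if (a.model) parts.unshift(a.model)
  return parts.join(' · ')
}
```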
--- a/src/workflow/backends/claudeCodeBackend.ts
+++ b/src/workflow/backends/claudeCodeBackend.ts
@@ -15,8 +15,16 @@ import {
  type BuiltInAgentDefinition,
 } from '@claude-code-best/builtin-tools/tools/AgentTool/loadAgentsDir.js'
 import { createUserMessage, extractTextContent } from '../../utils/messages.js'
+import { getTokenCountFromUsage } from '../../utils/tokens.js'
+import { createHash } from 'node:crypto'
 import { createAgentId } from '../../utils/uuid.js'
 import { logForDebugging } from '../../utils/debug.js'
+import { runWithCwdOverride } from '../../utils/cwd.js'
+import {
+  createAgentWorktree,
+  hasWorktreeChanges,
+  removeAgentWorktree,
+} from '../../utils/worktree.js'
 import { logEvent } from '../../services/analytics/index.js'
 import type { ModelAlias } from '../../utils/model/aliases.js'
 import type { Message } from '../../types/message.js'
@@ -74,6 +82,57 @@ export function extractStructuredOutput(
  return null
 }

+type WorkflowWorktreeInfo = Awaited<ReturnType<typeof createAgentWorktree>>
+
+/**
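+ * Generates the slug for a workflow agent's worktree isolation: hex segments derived from
+ * sha256(runId:agentId), matching the cleanupStaleAgentWorktrees cleanup regex `^wf_[0-9a-f]{8}-[0-9a-f]{3}-\d+$`.
+ * taskId is `w`+base36 (not a UUID), so runId cannot be embedded directly in the hex segments; sha256 is a
+ * deterministic mapping, and agentId keeps slugs distinct across agents of the same runId (no shared counter, no thread-safety concerns).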
+ */
+function makeWorkflowWorktreeSlug(runId: string, agentId: string): string {
+  const h = createHash('sha256').update(`${runId}:${agentId}`).digest('hex')
+  return `wf_${h.slice(0, 8)}-${h.slice(8, 11)}-${parseInt(h.slice(11, 17), 16) % 100000}`
+}
+
+/**
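+ * Cleans up the worktree after the agent finishes: hook-based worktrees are kept (VCS changes cannot be
+ * detected); otherwise hasWorktreeChanges (fail-closed) decides: no changes → auto-remove, changes or a failed
+ * check → keep the worktree and log its path (v1 logs rather than extending AgentRunResult, to avoid touching journal serialization).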
+ */
+async function cleanupWorkflowWorktree(
+  info: WorkflowWorktreeInfo,
+  agentType: string,
+): Promise<void> {
+  if (info.hookBased || !info.headCommit) return
+  let changed = true
+  try {
+    changed = await hasWorktreeChanges(info.worktreePath, info.headCommit)
+  } catch (e) {
+    logForDebugging(
+      `workflow worktree change-detect failed (${agentType}): ${(e as Error).message}`,
+    )
+    changed = true
+  }
+  if (!changed) {
+    try {
+      await removeAgentWorktree(
+        info.worktreePath,
+        info.worktreeBranch,
+        info.gitRoot,
+      )
+    } catch (e) {
+      logForDebugging(
+        `workflow worktree remove failed (${agentType}): ${(e as Error).message}`,
+      )
+    }
+  } else {
+    logForDebugging(
+      `workflow worktree retained (has changes, ${agentType}): ${info.worktreePath}`,
+    )
+  }
+}
+
 /** Deeply integrated backend: resolves agent/model/tools from the live session and delegates to the core runAgent. */
 export const claudeCodeBackend: AgentAdapter = {
  id: 'claude-code',
@@ -89,6 +148,28 @@ export const claudeCodeBackend: AgentAdapter = {
    const model = mapWorkflowModel(params.model)
    const agentId = createAgentId()

+    // isolation:'worktree': run the agent in its own git worktree so concurrent writes do not conflict.
+    let worktreeInfo: WorkflowWorktreeInfo | null = null
+    if (params.isolation === 'worktree') {
+      try {
+        worktreeInfo = await createAgentWorktree(
+          makeWorkflowWorktreeSlug(ctx.runId, agentId),
+        )
+      } catch (e) {
+        // fail-closed: if isolation cannot be established, do not silently fall back to the shared cwd (concurrent writes would race)
+        logForDebugging(
+          `workflow worktree creation failed (${agentDef.agentType}): ${(e as Error).message}`,
+        )
+        return { kind: 'dead' }
+      }
+    }
+    // runWithCwdOverride makes tools inside the agent (Bash, Read, etc.) see the worktree path
+    // (AsyncLocalStorage persists across awaits); runAgent's worktreePath parameter only writes metadata.
+    const runInCwd = worktreeInfo
+      ? <T>(fn: () => T): T =>
+          runWithCwdOverride(worktreeInfo!.worktreePath, fn)
+      : <T>(fn: () => T): T => fn()
+
    const workerPermissionContext = {
      ...appState.toolPermissionContext,
      mode: agentDef.permissionMode ?? 'acceptEdits',
@@ -106,29 +187,54 @@ export const claudeCodeBackend: AgentAdapter = {
    const promptMessages = [createUserMessage({ content: promptText })]
    const messages: Message[] = []
    const startTime = Date.now()
+    // Running progress totals (onProgress push → agent_progress event → panel refreshes token/tool counts live).
+    let tokenCount = 0
+    let toolCount = 0

    try {
-      for await (const msg of runAgent({
-        agentDefinition: agentDef,
-        promptMessages,
-        toolUseContext,
-        canUseTool,
-        isAsync: true,
-        querySource: toolUseContext.options.querySource ?? 'workflow',
-        availableTools: workerTools,
-        override: { agentId },
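-        // runAgent's model is a top-level ModelAlias; the workflow model is an arbitrary alias string.
-        // The types are incompatible and the provider layer resolves it at runtime. Pass it through with a double assertion (preferable to as any/never).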
-        ...(model ? { model: model as unknown as ModelAlias } : {}),
-      })) {
-        messages.push(msg as Message)
-      }
+      await runInCwd(async () => {
+        for await (const msg of runAgent({
+          agentDefinition: agentDef,
+          promptMessages,
+          toolUseContext,
+          canUseTool,
+          isAsync: true,
+          querySource: toolUseContext.options.querySource ?? 'workflow',
+          availableTools: workerTools,
+          override: { agentId },
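+          // runAgent's model is a top-level ModelAlias; the workflow model is an arbitrary alias string.
+          // The types are incompatible and the provider layer resolves it at runtime. Pass it through with a double assertion (preferable to as any/never).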
+          ...(model ? { model: model as unknown as ModelAlias } : {}),
+          ...(worktreeInfo ? { worktreePath: worktreeInfo.worktreePath } : {}),
+        })) {
+          messages.push(msg as Message)
+          // Accumulate running progress: assistant messages carry usage (cumulative, so overwrite) and tool_use blocks in content (incremental, so add).
+          if (msg.type === 'assistant' && msg.message) {
+            const usage = msg.message.usage as
+              | Parameters<typeof getTokenCountFromUsage>[0]
+              | undefined
+            if (usage) tokenCount = getTokenCountFromUsage(usage)
+            const content = msg.message.content as
+              | Array<{ type: string }>
+              | undefined
+            if (content)
+              toolCount += content.filter(b => b.type === 'tool_use').length
+          }
+          ctx.onProgress?.({ tokenCount, toolCount })
+        }
+      })
    } catch (e) {
      logForDebugging(
        `workflow sub-agent error (${agentDef.agentType}): ${(e as Error).message}`,
      )
      logEvent('tengu_workflow_agent', { ok: 0 })
      return { kind: 'dead' }
+    } finally {
+      if (worktreeInfo) {
+        const info = worktreeInfo
+        worktreeInfo = null
+        await cleanupWorkflowWorktree(info, agentDef.agentType)
+      }
    }

    const finalized = finalizeAgentTool(messages, agentId, {
@@ -141,6 +247,10 @@ export const claudeCodeBackend: AgentAdapter = {
    })
    const outputTokens =
      finalized.usage?.output_tokens ?? finalized.totalTokens ?? 0
+    // For panel display: total context tokens at completion, tool call count, and the resolved model id.
+    const finalTokenCount = finalized.totalTokens ?? 0
+    const finalToolCount = finalized.totalToolUseCount ?? 0
+    const resolvedModel = model ?? toolUseContext.options.mainLoopModel
    logEvent('tengu_workflow_agent', { ok: 1, outputTokens })

    if (params.schema) {
@@ -150,9 +260,19 @@ export const claudeCodeBackend: AgentAdapter = {
        kind: 'ok',
        output: structured as object,
        usage: { outputTokens },
+        model: resolvedModel,
+        toolCount: finalToolCount,
+        tokenCount: finalTokenCount,
      }
    }
    const text = extractTextContent(finalized.content, '\n')
-    return { kind: 'ok', output: text, usage: { outputTokens } }
+    return {
+      kind: 'ok',
+      output: text,
+      usage: { outputTokens },
+      model: resolvedModel,
+      toolCount: finalToolCount,
+      tokenCount: finalTokenCount,
+    }
  },
 }
--- a/src/workflow/panel/AgentList.tsx
+++ b/src/workflow/panel/AgentList.tsx
@@ -1,36 +1,52 @@
 import React from 'react';
-import { Box, Text } from '@anthropic/ink';
+import { Box, Text, useAnimationFrame } from '@anthropic/ink';
 import type { Theme } from '@anthropic/ink';
 import type { AgentProgress } from '../progress/store.js';
-import { agentVisual } from './status.js';
+import { agentMetaText, agentVisual } from './status.js';

-const LABEL_WIDTH = 18;
+const SPINNER_FRAMES = ['·', '✢', '✱', '✶', '✻', '✽'];
+const FRAME_MS = 120;
+const LABEL_MAX = 18;

 /**
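 * Right-hand agent list (already filtered by the selected phase).
- * The cursor row gets an orange background; each row: mark + label + trailing status text (running/object/text/dead).
+ * Selected row: gets the selectionBg background only when this column is focused (focused=true); the fg is kept, not inverted.
+ * When focus is elsewhere, no background is drawn, to avoid a "false focus" cue.
+ * The status mark of a running agent is a spinner driven by useAnimationFrame (shared clock, globally in sync);
+ * the right-hand `model · Nk tok · N tool` text is refreshed live by agent_progress / agent_done.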
 */
 export function AgentList({
  agents,
  selectedIndex,
+  focused,
 }: {
  agents: AgentProgress[];
  selectedIndex: number;
+  focused: boolean;
 }): React.ReactNode {
+  // Subscribe to the animation frame once at the top level: all running agents share one frame (synchronized animation, no per-row hook).
+  const [ref, time] = useAnimationFrame(FRAME_MS);
+  const frame = SPINNER_FRAMES[Math.floor(time / FRAME_MS) % SPINNER_FRAMES.length];
+
  if (agents.length === 0) {
    return <Text color="subtle">(no agents in this phase)</Text>;
  }
  return (
-    <Box flexDirection="column">
+    <Box ref={ref} flexDirection="column">
      {agents.map((a, i) => {
        const v = agentVisual(a);
        const selected = i === selectedIndex;
-        const label = (a.label ?? `agent-${a.id}`).slice(0, LABEL_WIDTH).padEnd(LABEL_WIDTH);
+        const highlighted = selected && focused;
+        const running = a.status === 'running';
+        const mark = running ? frame : v.mark;
+        const label = (a.label ?? `agent-${a.id}`).slice(0, LABEL_MAX);
        return (
-          <Box key={a.id}>
-            <Text backgroundColor={selected ? 'claude' : undefined}>
-              <Text color={v.color as keyof Theme}>{v.mark}</Text> {label} <Text color="subtle">{v.suffix}</Text>
-            </Text>
+          <Box key={a.id} backgroundColor={highlighted ? 'selectionBg' : undefined} justifyContent="space-between">
+            <Box>
+              <Text color={v.color as keyof Theme}>{mark}</Text>
+              <Text> {label}</Text>
+            </Box>
+            <Text color="subtle">{agentMetaText(a)}</Text>
          </Box>
        );
      })}
--- a/src/workflow/panel/PhaseSidebar.tsx
+++ b/src/workflow/panel/PhaseSidebar.tsx
@@ -1,10 +1,13 @@
 import React from 'react';
-import { Box, Text } from '@anthropic/ink';
+import { Box, Text, useAnimationFrame } from '@anthropic/ink';
 import type { Theme } from '@anthropic/ink';
 import type { AgentProgress } from '../progress/store.js';
 import { PHASE_COLOR, PHASE_MARK, type PhaseStatus } from './status.js';
 import { ALL_PHASE, type MergedPhase } from './selectors.js';

+const SPINNER_FRAMES = ['·', '✢', '✱', '✶', '✻', '✽'];
+const FRAME_MS = 120;
+
 type PhaseRow = {
  title: string;
  status?: PhaseStatus;
@@ -14,32 +17,45 @@ type PhaseRow = {

 /**
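 * Left phase sidebar: the first row is All (aggregated done/total), followed by the merged phases (including pending ○).
- * The selected row gets an orange background (text color unchanged); selectedIndex=0 means All.
+ * Selected row: gets the selectionBg background (fg kept, not inverted) plus a `>` marker only when this column is focused (focused=true);
+ * when focus is elsewhere, no background is drawn, to avoid a "false focus" cue. The status mark of a running phase is a spinner driven by useAnimationFrame.
+ * Styled to match the reference mock: `> ✓ Scan  3/3`.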
 */
 export function PhaseSidebar({
  phases,
  agents,
  selectedIndex,
+  focused,
 }: {
  phases: MergedPhase[];
  agents: AgentProgress[];
  selectedIndex: number;
+  focused: boolean;
 }): React.ReactNode {
+  const [ref, time] = useAnimationFrame(FRAME_MS);
+  const frame = SPINNER_FRAMES[Math.floor(time / FRAME_MS) % SPINNER_FRAMES.length];
  const totalAgents = agents.length;
  const doneAgents = agents.filter(a => a.status === 'done').length;
  const rows: PhaseRow[] = [{ title: ALL_PHASE, done: doneAgents, total: totalAgents }, ...phases];

  return (
-    <Box flexDirection="column">
+    <Box ref={ref} flexDirection="column">
      {rows.map((row, i) => {
        const selected = i === selectedIndex;
-        const mark = row.status ? PHASE_MARK[row.status] : ' ';
-        const color = row.status ? (PHASE_COLOR[row.status] as keyof Theme) : undefined;
+        const highlighted = selected && focused;
+        const running = row.status === 'running';
+        const mark = running ? frame : row.status ? PHASE_MARK[row.status] : ' ';
+        const color = (row.status ? PHASE_COLOR[row.status] : 'subtle') as keyof Theme;
        return (
-          <Box key={row.title}>
-            <Text backgroundColor={selected ? 'claude' : undefined} color={color}>
-              {selected ? '▶' : ' '}
-              {mark} {row.title.padEnd(10)} {row.done}/{row.total}
+          <Box key={row.title} backgroundColor={highlighted ? 'selectionBg' : undefined} justifyContent="space-between">
+            <Box>
+              <Text color={selected ? 'claude' : undefined}>{highlighted ? '>' : ' '}</Text>
+              <Text> </Text>
+              <Text color={color}>{mark}</Text>
+              <Text> {row.title}</Text>
+            </Box>
+            <Text color="subtle">
+              {row.done}/{row.total}
            </Text>
          </Box>
        );
--- a/src/workflow/panel/WorkflowsPanel.tsx
+++ b/src/workflow/panel/WorkflowsPanel.tsx
@@ -1,13 +1,15 @@
 import React, { useEffect, useState, useSyncExternalStore } from 'react';
-import { Box, Text } from '@anthropic/ink';
+import { Box, Text, useAnimationFrame } from '@anthropic/ink';
+import type { Theme } from '@anthropic/ink';
 import type { LocalJSXCommandContext, LocalJSXCommandOnDone } from '../../types/command.js';
 import { getWorkflowService } from '../service.js';
 import type { RunProgress } from '../progress/store.js';
 import { AgentList } from './AgentList.js';
 import { PhaseSidebar } from './PhaseSidebar.js';
 import { TabsBar } from './TabsBar.js';
+import { RUN_STATUS_COLOR, RUN_STATUS_TEXT } from './status.js';
 import { type FocusColumn, type WorkflowKeyboardHandlers, useWorkflowKeyboard } from './useWorkflowKeyboard.js';
-import { ALL_PHASE, filterAgentsByPhase, mergePhases } from './selectors.js';
+import { ALL_PHASE, filterAgentsByPhase, formatDuration, mergePhases } from './selectors.js';

 /**
 * Clamp the selected index to the valid range (empty list → 0; out of range → last index; negative/NaN → 0).
@@ -124,33 +126,52 @@ export function WorkflowsPanel({
  const running = runs.filter(r => r.status === 'running').length;
  const done = runs.length - running;
  const phaseHeader = selectedPhaseTitle ?? ALL_PHASE;
+  const agentDone = focused ? focused.agents.filter(a => a.status === 'done').length : 0;
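+  // Refresh the header's elapsed time once per second (shared clock; subscribing triggers the re-render, and elapsed time is read from the wall clock).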
+  const [clockRef] = useAnimationFrame(1000);
+  const elapsed = focused ? Date.now() - focused.startedAt : 0;

  return (
-    <Box flexDirection="column" borderStyle="round" borderColor="claude" paddingX={1}>
+    <Box ref={clockRef} flexDirection="column" borderStyle="round" borderColor="claude" paddingX={1}>
      <Box justifyContent="space-between">
-        <Text bold>Workflows</Text>
-        <Text color="subtle">
-          {running} running · {done} done
-        </Text>
+        <Text bold>{focused?.workflowName ?? 'Workflows'}</Text>
+        {focused ? (
+          <Text color="subtle">
+            {agentDone}/{focused.agentCount} agents · {formatDuration(elapsed)} ·{' '}
+            <Text color={RUN_STATUS_COLOR[focused.status] as keyof Theme}>{RUN_STATUS_TEXT[focused.status]}</Text>
+          </Text>
+        ) : (
+          <Text color="subtle">
+            {running} running · {done} done
+          </Text>
+        )}
      </Box>
+      {focused?.description ? <Text color="subtle">{focused.description}</Text> : null}

-      <Box marginTop={1}>
-        <TabsBar runs={runs} activeRunId={activeRunId} />
-      </Box>
+      {runs.length > 1 ? (
+        <Box marginTop={1}>
+          <TabsBar runs={runs} activeRunId={activeRunId} />
+        </Box>
+      ) : null}

      <Box flexDirection="row" marginTop={1}>
        <Box width="25%" flexDirection="column">
          <Text color={focusColumn === 'phases' ? 'claude' : 'subtle'} bold>
-            PHASES
+            Phases
          </Text>
-          <PhaseSidebar phases={phases} agents={focused?.agents ?? []} selectedIndex={clampedPhase} />
+          <PhaseSidebar
+            phases={phases}
+            agents={focused?.agents ?? []}
+            selectedIndex={clampedPhase}
+            focused={focusColumn === 'phases'}
+          />
        </Box>
        <Text color="subtle">│</Text>
        <Box flexGrow={1} flexDirection="column">
          <Text color={focusColumn === 'agents' ? 'claude' : 'subtle'} bold>
-            AGENTS · {phaseHeader}
+            {phaseHeader} · {visibleAgents.length} agents
          </Text>
-          <AgentList agents={visibleAgents} selectedIndex={clampedAgent} />
+          <AgentList agents={visibleAgents} selectedIndex={clampedAgent} focused={focusColumn === 'agents'} />
        </Box>
      </Box>

--- a/src/workflow/panel/selectors.ts
+++ b/src/workflow/panel/selectors.ts
@@ -58,3 +58,14 @@ export function filterAgentsByPhase(
 export function tabLabel(workflowName: string, runId: string): string {
  return `${workflowName}#${runId.slice(-4)}`
 }
+
+/** Milliseconds → compact duration (<60s → `Ns`; <60m → `MmSSs`; otherwise `HhMMm`). Used by the panel header. */
+export function formatDuration(ms: number): string {
+  const s = Math.floor(ms / 1000)
+  if (s < 60) return `${s}s`
+  const m = Math.floor(s / 60)
+  const ss = s % 60
+  if (m < 60) return `${m}m${String(ss).padStart(2, '0')}s`
+  const h = Math.floor(m / 60)
+  return `${h}h${String(m % 60).padStart(2, '0')}m`
+}
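
Reviewer note: the three output formats of `formatDuration` are easiest to check at their boundaries. The sketch below is a standalone copy of the function from this hunk. The sample inputs are chosen for illustration and are not taken from the repo's test suite.

```typescript
// Standalone copy of formatDuration from the hunk above, for illustration only.
function formatDuration(ms: number): string {
  const s = Math.floor(ms / 1000)
  if (s < 60) return `${s}s`
  const m = Math.floor(s / 60)
  const ss = s % 60
  if (m < 60) return `${m}m${String(ss).padStart(2, '0')}s`
  const h = Math.floor(m / 60)
  return `${h}h${String(m % 60).padStart(2, '0')}m`
}

// Illustrative inputs around each format boundary (assumed, not from the repo).
console.log(formatDuration(999)) // "0s"    (sub-second values floor to 0)
console.log(formatDuration(59_999)) // "59s"
console.log(formatDuration(60_000)) // "1m00s" (seconds are zero-padded)
console.log(formatDuration(125_000)) // "2m05s"
console.log(formatDuration(3_600_000)) // "1h00m" (seconds are dropped past one hour)
console.log(formatDuration(3_725_000)) // "1h02m"
```

One consequence of the floor-based arithmetic is that a negative input (for example, if `startedAt` were ever later than `Date.now()`) would render as a negative number of seconds. Clamping at zero in the caller may be worth considering.
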
--- a/src/workflow/panel/status.ts
+++ b/src/workflow/panel/status.ts
@@ -16,6 +16,14 @@ export const RUN_STATUS_COLOR: Record<RunProgress['status'], string> = {
  killed: 'subtle',
 }

+/** Run status → display text (used by the header; matches the reference mock's done/running). */
+export const RUN_STATUS_TEXT: Record<RunProgress['status'], string> = {
+  running: 'running',
+  completed: 'done',
+  failed: 'failed',
+  killed: 'killed',
+}
+
 /** Merged phase status shown in the sidebar (includes pending: declared in meta but not yet started). */
 export type PhaseStatus = 'running' | 'done' | 'pending'

@@ -31,23 +39,35 @@ export const PHASE_COLOR: Record<PhaseStatus, string> = {
  pending: 'subtle',
 }

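-/** The visual triple for an agent row: mark character + color + trailing text suffix. */
-export type AgentVisual = { mark: string; color: string; suffix: string }
+/** Visuals for an agent row: mark character + color (for running agents the UI overrides the mark with a spinner). */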
+export type AgentVisual = { mark: string; color: string }

 /**
 * Agent status → visuals.
- * - running → ● warning
+ * - running → ● warning (the UI overrides the mark with a spinner)
 * - done·dead → ✗ error
- * - done·ok: outputShape='object' → object; otherwise text
+ * - done·ok → ✓ success
 */
 export function agentVisual(a: AgentProgress): AgentVisual {
-  if (a.status === 'running')
-    return { mark: '●', color: 'warning', suffix: 'running' }
-  if (a.resultKind === 'dead')
-    return { mark: '✗', color: 'error', suffix: 'dead' }
-  return {
-    mark: '✓',
-    color: 'success',
-    suffix: a.outputShape === 'object' ? 'object' : 'text',
-  }
+  if (a.status === 'running') return { mark: '●', color: 'warning' }
+  if (a.resultKind === 'dead') return { mark: '✗', color: 'error' }
+  return { mark: '✓', color: 'success' }
+}
+
+/** Token count → display string (<1000: raw value; otherwise one decimal place + k). */
+export function formatTokenCount(n: number | undefined): string {
+  if (!n) return '0'
+  return n >= 1000 ? `${(n / 1000).toFixed(1)}k` : String(n)
+}
+
+/**
+ * Right-hand stats text for an agent row: `model · Nk tok · N tool`.
+ * The leading segment is omitted when there is no model; while running, token/tool are refreshed live by agent_progress.
+ */
+export function agentMetaText(a: AgentProgress): string {
+  const parts: string[] = []
+  if (a.model) parts.push(a.model)
+  parts.push(`${formatTokenCount(a.tokenCount)} tok`)
+  parts.push(`${a.toolCount ?? 0} tool`)
+  return parts.join(' · ')
 }
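
Reviewer note: a minimal sketch of what `agentMetaText` renders for a few agent states. `AgentMeta` is a hypothetical stand-in for `AgentProgress` that keeps only the three fields the function reads, and `example-model` is a placeholder rather than a real model id.

```typescript
// Hypothetical minimal stand-in for AgentProgress (only the fields read here).
type AgentMeta = { model?: string; tokenCount?: number; toolCount?: number }

// Standalone copies of the two helpers from the hunk above.
function formatTokenCount(n: number | undefined): string {
  if (!n) return '0'
  return n >= 1000 ? `${(n / 1000).toFixed(1)}k` : String(n)
}

function agentMetaText(a: AgentMeta): string {
  const parts: string[] = []
  if (a.model) parts.push(a.model)
  parts.push(`${formatTokenCount(a.tokenCount)} tok`)
  parts.push(`${a.toolCount ?? 0} tool`)
  return parts.join(' · ')
}

// Freshly started agent: no model yet, no progress events received.
console.log(agentMetaText({})) // "0 tok · 0 tool"
// Running agent: counts come from agent_progress, model is still unset.
console.log(agentMetaText({ tokenCount: 999, toolCount: 2 })) // "999 tok · 2 tool"
// Finished agent: agent_done supplies the model and the final counts.
console.log(agentMetaText({ model: 'example-model', tokenCount: 12_345, toolCount: 7 }))
// "example-model · 12.3k tok · 7 tool"
```

Because the model only arrives with `agent_done`, the row grows a leading segment when the agent finishes.
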
--- a/src/workflow/progress/store.ts
+++ b/src/workflow/progress/store.ts
@@ -10,6 +10,12 @@ export type AgentProgress = {
  resultKind?: string
  /** Only meaningful for done·ok: 'object' when output is an object, otherwise 'text'. Absent for dead/skipped. */
  outputShape?: 'text' | 'object'
+  /** The actual resolved model id (set by agent_done; absent while running). */
+  model?: string
+  /** Total context tokens (live from agent_progress / final value set by agent_done). */
+  tokenCount?: number
+  /** Cumulative tool call count (live from agent_progress / final value set by agent_done). */
+  toolCount?: number
 }

 export type RunProgress = {
@@ -24,6 +30,10 @@ export type RunProgress = {
  agentCount: number
  returnValue?: unknown
  error?: string
+  /** run_started timestamp (used by the panel to compute elapsed time). */
+  startedAt: number
+  /** Workflow description (from run_started.meta.description). */
+  description?: string
  updatedAt: number
 }

@@ -59,6 +69,7 @@ export function createProgressStoreFromBus(bus: ProgressBus): ProgressStore {
        currentPhase: null,
        agents: [],
        agentCount: 0,
+        startedAt: Date.now(),
        updatedAt: Date.now(),
      }
      byId.set(runId, p)
@@ -80,6 +91,7 @@ export function createProgressStoreFromBus(bus: ProgressBus): ProgressStore {
        p.workflowName = event.workflowName
        p.status = 'running'
        p.declaredPhases = event.meta?.phases?.map(ph => ph.title) ?? []
+        p.description = event.meta?.description ?? undefined
        break
      case 'phase_started':
        if (!p.phases.some(ph => ph.title === event.phase)) {
@@ -110,6 +122,15 @@ export function createProgressStoreFromBus(bus: ProgressBus): ProgressStore {
        }
        break
      }
+      case 'agent_progress': {
+        // Live progress: only updates token/tool (high-frequency, but once per agent message, so the rate is bounded).
+        const ap = p.agents.find(x => x.id === event.agentId)
+        if (ap) {
+          ap.tokenCount = event.tokenCount
+          ap.toolCount = event.toolCount
+        }
+        break
+      }
      case 'agent_done': {
        let a = p.agents.find(x => x.id === event.agentId)
        if (!a) {
@@ -125,6 +146,9 @@ export function createProgressStoreFromBus(bus: ProgressBus): ProgressStore {
                    event.result.output !== null
                      ? ('object' as const)
                      : ('text' as const),
+                  tokenCount: event.result.tokenCount,
+                  toolCount: event.result.toolCount,
+                  model: event.result.model,
                }
              : {}),
          }
@@ -139,6 +163,9 @@ export function createProgressStoreFromBus(bus: ProgressBus): ProgressStore {
              event.result.output !== null
                ? 'object'
                : 'text'
+            a.tokenCount = event.result.tokenCount
+            a.toolCount = event.result.toolCount
+            a.model = event.result.model
          }
        }
        break
--- a/src/workflow/service.ts
+++ b/src/workflow/service.ts
@@ -1,6 +1,7 @@
 import {
  listNamedWorkflows,
  parseScript,
+  persistInlineScript,
  resolveNamedWorkflow,
  runWorkflow,
  WORKFLOW_DIR_NAME,
@@ -49,7 +50,7 @@ export type WorkflowService = {
    >,
    toolUseContext: ToolUseContext,
    canUseTool: CanUseToolFn,
-  ): Promise<{ runId: string }>
+  ): Promise<{ runId: string; scriptPath?: string }>
  kill(runId: string): void
  /**
   * Cleanup on process exit / config unload: kill every running run to avoid orphaned tasks.
@@ -86,6 +87,7 @@ export function getWorkflowService(): WorkflowService {
 export function makeService(
  ports: WorkflowPorts,
  store: ProgressStore,
+  cwdOverride?: string,
 ): WorkflowService {
  const buildHost = (
    toolUseContext: ToolUseContext,
@@ -94,7 +96,8 @@ export function makeService(
    handle: makeHostHandle(buildHostBundle(toolUseContext, canUseTool)),
    // Use projectRoot so the root matches the ports.ts hostFactory / journalStore;
    // this keeps named-workflow resolution and journal writes in sync when entering a worktree or subdirectory.
-    cwd: getProjectRoot(),
+    // cwdOverride exists only so tests can inject a temp directory (so inline persistence does not write into the real project directory).
+    cwd: cwdOverride ?? getProjectRoot(),
    budgetTotal: null, // turn-level budget injection point (to be read from settings in the future)
    toolUseId: toolUseContext.toolUseId,
  })
@@ -158,6 +161,23 @@ export function makeService(
        host.handle,
      )

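+      // The inline entry point persists the script to the run directory (symmetric with WorkflowTool) and returns a reusable path.
+      // A write failure degrades gracefully (logged) and does not block the run (the script is already in memory).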
+      let persistedScriptPath: string | undefined
+      if (!workflowFile && input.script) {
+        try {
+          persistedScriptPath = await persistInlineScript(
+            input.script,
+            runId,
+            host.cwd,
+          )
+        } catch (e) {
+          logForDebugging(
+            `workflow inline script persist failed: ${(e as Error).message}`,
+          )
+        }
+      }
+
      // detached: not awaited, so the caller gets the runId immediately; completion is routed to the registrar.
      void runWorkflow({
        script,
@@ -183,7 +203,10 @@ export function makeService(
        .catch(e => ports.taskRegistrar.fail(runId, (e as Error).message))

      logForDebugging(`workflow launched: ${runId} (${workflowName})`)
-      return { runId }
+      return {
+        runId,
+        ...(persistedScriptPath ? { scriptPath: persistedScriptPath } : {}),
+      }
    },

    kill(runId) {
@@ -193,8 +216,17 @@ export function makeService(
    shutdown() {
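      // Kill only running runs: for completed/failed runs the taskRegistrar has already reclaimed the binding, so kill is a no-op.
      // taskRegistrar.kill is a safe no-op for unknown runIds, so this is idempotent: repeated shutdowns do not throw again.
+      // Each kill has its own try/catch: kill goes through setAppState, which can trigger a React re-render during process exit
+      // and may throw (renderer already unmounted, etc.); one failure should not block cleanup of the other runs.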
      for (const run of store.list()) {
-        if (run.status === 'running') ports.taskRegistrar.kill(run.runId)
+        if (run.status !== 'running') continue
+        try {
+          ports.taskRegistrar.kill(run.runId)
+        } catch (e) {
+          logForDebugging(
+            `workflow shutdown: kill ${run.runId} failed: ${(e as Error).message}`,
+          )
+        }
      }
    },

--- a/src/workflow/wiring.ts
+++ b/src/workflow/wiring.ts
@@ -1,5 +1,7 @@
 import {
  createWorkflowTool,
+  workflowInputSchema,
+  WORKFLOW_TOOL_NAME,
  type WorkflowToolDescriptor,
 } from '@claude-code-best/workflow-engine'
 import { buildTool, type Tool } from '../Tool.js'
@@ -8,25 +10,37 @@ import { getWorkflowService } from './service.js'
 /**
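 * Adapts the engine's self-contained descriptor into a buildTool-compatible Tool.
 * The descriptor always goes through the service singleton (shared ports/registry/store).
+ *
+ * Port resolution is deferred until the first actual method call (lazy): tools.ts calls createWorkflowToolCore()
+ * during module loading (feature-gated). Resolving ports eagerly at that point would instantiate the service,
+ * which in turn runs module-level side effects such as getProjectRoot and may pick up the wrong path before bootstrap completes.
+ * The Tool object's own singleton is guaranteed by the cache in createWorkflowToolCore (PermissionRequest
+ * matches by reference); the ports singleton is guaranteed by getWorkflowService.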
 */
 function buildWorkflowTool(): Tool {
-  const { ports } = getWorkflowService()
-  const descriptor: WorkflowToolDescriptor = createWorkflowTool(ports)
+  let cachedDescriptor: WorkflowToolDescriptor | null = null
+  const descriptor = (): WorkflowToolDescriptor => {
+    if (!cachedDescriptor) {
+      const { ports } = getWorkflowService()
+      cachedDescriptor = createWorkflowTool(ports)
+    }
+    return cachedDescriptor
+  }
  return buildTool({
-    name: descriptor.name,
+    name: WORKFLOW_TOOL_NAME,
    maxResultSizeChars: 50_000,
-    inputSchema: descriptor.inputSchema,
-    isEnabled: () => descriptor.isEnabled(),
-    isReadOnly: input => descriptor.isReadOnly(input),
+    inputSchema: workflowInputSchema,
+    isEnabled: () => descriptor().isEnabled(),
+    isReadOnly: input => descriptor().isReadOnly(input),
    isConcurrencySafe: () => true,
    async description() {
-      return descriptor.description()
+      return descriptor().description()
    },
    async prompt() {
-      return descriptor.prompt()
+      return descriptor().prompt()
    },
    async call(input, context, canUseTool, parentMessage, onProgress) {
-      const result = await descriptor.call(
+      const result = await descriptor().call(
        input,
        context,
        canUseTool,
@@ -35,9 +49,9 @@ function buildWorkflowTool(): Tool {
      )
      return { data: result.data }
    },
-    renderToolUseMessage: input => descriptor.renderToolUseMessage(input),
+    renderToolUseMessage: input => descriptor().renderToolUseMessage(input),
    mapToolResultToToolResultBlockParam: (data, toolUseId) =>
-      descriptor.mapToolResultToToolResultBlockParam(data, toolUseId),
+      descriptor().mapToolResultToToolResultBlockParam(data, toolUseId),
  })
 }