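fix: normalizeMessagesForAPI no longer merges same-ID assistant messages across tool_result boundaries (CC-1215)

In ACP mode, when extended thinking and tool_use occur in the same turn, StreamingToolExecutor inserts a tool_result between two AssistantMessages that share the same message.id. The backward-traversal merge then crossed that boundary, producing a duplicate tool_use ID → an orphaned tool_result → consecutive user messages → a 400 response. The stop condition of the backward traversal is changed: it now stops at the first non-assistant message (including tool_result) instead of skipping over it.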
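
The stop condition described above can be sketched as follows. This is a minimal illustration, not the repository's code: the `Msg` type and `mergeSameIdAssistants` are hypothetical simplifications, and a tool_result is assumed to travel inside a user message.

```typescript
// Hypothetical, simplified message shapes (assumption: the real types are richer).
type Msg =
  | { type: 'assistant'; id: string; blocks: string[] }
  | { type: 'user'; blocks: string[] } // a tool_result is carried by a user message

function mergeSameIdAssistants(messages: Msg[]): Msg[] {
  const out: Msg[] = []
  for (const msg of messages) {
    if (msg.type === 'assistant') {
      let merged = false
      // Walk backwards over already-emitted messages.
      for (let i = out.length - 1; i >= 0; i--) {
        const prev = out[i]!
        // Stop at the first non-assistant message instead of skipping it,
        // so a merge can never cross a tool_result boundary.
        if (prev.type !== 'assistant') break
        if (prev.id === msg.id) {
          out[i] = { ...prev, blocks: [...prev.blocks, ...msg.blocks] }
          merged = true
          break
        }
      }
      if (merged) continue
    }
    out.push(msg)
  }
  return out
}

const turn: Msg[] = [
  { type: 'assistant', id: 'msg_1', blocks: ['thinking', 'tool_use:t1'] },
  { type: 'user', blocks: ['tool_result:t1'] },
  { type: 'assistant', id: 'msg_1', blocks: ['text'] },
]
console.log(mergeSameIdAssistants(turn).length) // 3: the tool_result blocks the merge
```

Adjacent assistant messages with the same ID are still merged in this sketch; only the merge across an intervening user message is prevented.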
fix: add markResourceTiming polyfill to performance shim for Node.js v22 undici compatibility
2026-06-15 12:55:51 +00:00 · 2026-06-04 15:41:41 +08:00 · 2026-06-04 14:30:34 +08:00 · 2026-06-03 21:38:23 +08:00 · 2026-06-02 09:30:13 +08:00 · 2026-06-01 00:23:43 +00:00
57 changed files with 2952 additions and 683 deletions
--- a/.github/workflows/publish-npm.yml
+++ b/.github/workflows/publish-npm.yml
@@ -3,11 +3,11 @@ name: Publish to npm
 on:
  push:
    tags:
-      - 'v*'
+      - "v*"
  workflow_dispatch:
    inputs:
      version:
-        description: '版本号 (例如: v1.9.0)'
+        description: "版本号 (例如: v1.9.0)"
        required: true
        type: string

@@ -24,6 +24,11 @@ jobs:
        with:
          ref: ${{ github.event.inputs.version || github.ref }}

+      - uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6, 2026-04-25
+        with:
+          node-version: "24"
+          registry-url: "https://registry.npmjs.org"
+
      - name: Setup Bun
        uses: oven-sh/setup-bun@0c5077e51419868618aeaa5fe8019c62421857d6 # v2, 2026-04-25
        with:
@@ -38,9 +43,9 @@ jobs:
        run: bun test

      - name: Publish to npm
-        run: bun publish --access public
+        run: npm publish --provenance --access public
        env:
-          BUN_CONFIG_TOKEN: ${{ secrets.NPM_TOKEN }}
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}

      - name: Generate changelog
        id: changelog
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -78,8 +78,9 @@ bun run docs:dev

 - **Runtime**: Bun (not Node.js). All imports, builds, and execution use Bun APIs.
 - **Build**: `build.ts` runs `Bun.build()` with `splitting: true`, entry point `src/entrypoints/cli.tsx`, output `dist/cli.js` + chunk files. The build enables 19 features by default (see the Feature Flag section below). After the build, `import.meta.require` is automatically replaced with a Node.js-compatible version (the output runs under both bun and node). The build also copies `vendor/audio-capture/` and `src/utils/vendor/ripgrep/` into `dist/vendor/`.
-- **Build (Vite)**: `vite.config.ts` + `scripts/post-build.ts`; chunks are written to `dist/chunks/`. post-build likewise copies vendor files to `dist/vendor/`.
-- **Vendor path resolution**: After the build, chunk files live under `dist/` or `dist/chunks/`, and vendor binaries under `dist/vendor/`. Both `src/utils/ripgrep.ts` and `packages/audio-capture-napi/src/index.ts` locate the dist root via `lastIndexOf('dist')` on the `import.meta.url` path and then append the `vendor/` subpath, so paths stay consistent across build output layouts.
+- **Build (Vite)**: `vite.config.ts` + `scripts/post-build.ts`, in code-splitting mode; chunks are written to `dist/chunks/`. post-build walks every `.js` file under `dist/` and `dist/chunks/` to patch the `globalThis.Bun` destructuring, and copies vendor files to `dist/vendor/`.
+- **Vendor path resolution**: After the build, chunk files live under `dist/` or `dist/chunks/`, and vendor binaries under `dist/vendor/`. `src/utils/distRoot.ts` provides a shared `distRoot` function that locates the root directory via `lastIndexOf('dist')` or `lastIndexOf('src')` on the `import.meta.url` path. `ripgrep.ts`, `computerUse/setup.ts`, `claudeInChrome/setup.ts`, and `updateCCB.ts` all use `distRoot` rather than inline `import.meta.url` path arithmetic. `packages/audio-capture-napi/src/index.ts` has its own `lastIndexOf('dist')` logic, which is functionally equivalent.
+- **Why Vite must use code splitting**: Bun/JSC processes a single large JS file in full (bytecode and JIT), so the 17MB single-file output drives RSS up to ~1GB (Node/V8, with lazy parsing, needs only ~220MB). With the code split into 600+ small chunks, Bun loads on demand: `--version` RSS drops from 966MB to 35MB, and a full load drops from 1GB+ to ~500MB.
 - **Dev mode**: `scripts/dev.ts` injects `MACRO.*` defines via Bun's `-d` flag and runs `src/entrypoints/cli.tsx`. All features are enabled by default.
 - **Module system**: ESM (`"type": "module"`), TSX with `react-jsx` transform.
 - **Monorepo**: Bun workspaces with 17 workspace packages plus several auxiliary directories in `packages/`, resolved via `workspace:*`.
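
The `lastIndexOf('dist')` / `lastIndexOf('src')` lookup mentioned in the vendor path note can be sketched roughly like this. The actual contents of `src/utils/distRoot.ts` are not shown in this diff, so `rootFromPath` is a hypothetical stand-in that takes a plain path string rather than `import.meta.url`, and its return value (the marker directory itself) is an assumption.

```typescript
// Hypothetical helper (assumption): returns the path up to and including the
// last occurrence of 'dist', falling back to 'src', or null if neither occurs.
function rootFromPath(path: string): string | null {
  for (const marker of ['dist', 'src']) {
    const idx = path.lastIndexOf(marker)
    if (idx !== -1) {
      return path.slice(0, idx + marker.length)
    }
  }
  return null
}

// A chunk nested one level deeper resolves to the same root as the entry file,
// so a vendor subpath can be appended uniformly.
const fromEntry = rootFromPath('/app/dist/cli.js')
const fromChunk = rootFromPath('/app/dist/chunks/a.js')
console.log(fromEntry === fromChunk) // true
console.log(`${fromChunk}/vendor`)
```

Because this is a plain substring search, a directory name that merely contains `dist` would also match; a real implementation may need to guard against that.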
--- a/README.md
+++ b/README.md
@@ -10,12 +10,11 @@

 > Which Claude do you like? The open source one is the best.

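-A source decompilation/reverse-engineering restoration of the official [Claude Code](https://docs.anthropic.com/en/docs/claude-code) CLI tool from 牢 A (Anthropic). The goal is to reproduce most of Claude Code's features and engineering capabilities (if anyone asks, 老佛爷 has already paid for it). Hard as it is to keep a straight face, it is called CCB (踩踩背)... We have also implemented features that otherwise require the enterprise edition or a logged-in Claude account, making the technology accessible to everyone
+A fully restored, engineering-grade project based on the official [Claude Code](https://docs.anthropic.com/en/docs/claude-code) from 牢 A (Anthropic). Hard as it is to keep a straight face, it is called CCB (踩踩背)... We have also implemented features that otherwise require the enterprise edition or a logged-in Claude account, and extended it with more fun features on top.

-> We will be normalizing lint across the entire repository during the May Day holiday. PRs submitted during that period may hit a lot of conflicts, so please try to submit large features before then.
-
-[Docs are here, PR contributions welcome](https://ccb.agent-aura.top/) | [Friends page is here](./Friends.md) | [Discord server](https://discord.gg/uApuzJWGKX)
+[Peri Code](https://github.com/KonghaYao/peri): a Claude Code compatible Rust agent, carefully built on years of LLM experience, tuned for Chinese LLMs (DeepSeek/GLM), with heavily optimized CPU and memory use, delivering a CC-like experience even on dev boards and the Raspberry Pi.

+[Docs are here](https://ccb.agent-aura.top/) | [Friends page is here](./Friends.md) | [Discord server, with the owner answering questions live](https://discord.gg/uApuzJWGKX)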

 | Feature                     | Description                                                                                                                  | Docs                                                                                                                                      |
 | --------------------------- | ---------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- |
@@ -150,7 +149,6 @@ bun run build

 Fields to fill in:

-
 | 📌 Field     | 📝 Description  | 💡 Example                   |
 | ------------ | --------------- | ---------------------------- |
 | Base URL     | API service URL | `https://api.example.com/v1` |
--- a/contributors.svg
+++ b/contributors.svg
--- a/docs/performance-reporter.md
+++ b/docs/performance-reporter.md
@@ -0,0 +1,54 @@
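+# Investigation report: 1 GB memory usage
+
+> Diagnosing session `a3593062`, whose RSS reached 1.09 GB, and locating the root cause of the Bun runtime memory bloat
+
+## Data collection
+
+- **Diagnostic data**: RSS 1,118 MB, V8 heap 84 MB, native memory gap 1,034 MB (92%)
+- **Build method**: `bun run build:vite` → Vite/Rollup single-file build, producing a 17MB `dist/cli.js`
+- **Vite config**: `codeSplitting: false` (`vite.config.ts:97`), all code inlined into a single file
+- **Node.js comparison**: for the same 17MB output, Node.js RSS is only 223 MB (`--version`) / 340 MB (full load)
+
+## Exploration and verification
+
+### Confirmed
+
+| Issue | Location | Notes |
+|------|------|------|
+| **Root cause: Vite single-file build + Bun's poor memory efficiency when parsing large files** | `vite.config.ts:97` | `codeSplitting: false` produces a 17MB single file; RSS surges to 966MB when Bun/JSC parses it |
+| Node.js needs only 223MB for the same 17MB file | Measured | V8 is far more memory-efficient than JSC at parsing large files |
+| Bun.build code splitting solves the problem | Measured | With `bun run build` (code splitting → 627 chunks), Bun RSS is only 30MB (`--version`) / 318MB (full load) |
+
+### Ruled out
+
+- Not the number of feature flags: with all 35 features enabled, memory is normal in the code-split build
+- Not a memory leak: `detachedContexts: 0`, `activeHandles: 0`
+- Not a native addon issue: vendor files total only 2.7MB
+- Not the size of the TypeScript source: `bun run dev` (loading TS directly) uses only 345MB on the full path
+
+## Conclusion
+
+**The root cause is the Vite build setting `codeSplitting: false`, which produces a 17MB single file; Bun/JSC is extremely memory-inefficient when parsing a single large JS file (966MB vs. 223MB for Node).**
+
+Measured comparison matrix:
+
+| Build method | Output structure | Bun RSS | Node RSS | Bun/Node |
+|----------|----------|---------|----------|----------|
+| `build:vite` | 17MB single file | **966 MB** | 223 MB | 4.3x |
+| `build:vite` pipe mode | same as above | **1,088 MB** | 340 MB | 3.2x |
+| `build` (Bun) | 627 chunks | 30 MB | 42 MB | 0.7x |
+| `build` (Bun) pipe mode | same as above | 318 MB | 253 MB | 1.3x |
+| `bun run dev` TS source | loaded dynamically | 42 MB | N/A | N/A |
+| `bun run dev` pipe mode | loaded dynamically | 345 MB | N/A | N/A |
+
+Key differences:
+- **Node/V8** needs only 223MB to parse the 17MB file: V8's lazy parsing compiles only what the entry point needs
+- **Bun/JSC** needs 966MB to parse the 17MB file: JSC compiles the single file in full, and bytecode + JIT take up a large amount of native memory
+- After code splitting (627 small chunks), Bun loads on demand and memory returns to normal levels
+
+## Recommendations
+
+1. **Enable Vite code splitting**: set `codeSplitting: true` in `vite.config.ts`, or use Rollup's `manualChunks` option. This is the most direct fix
+2. **Or switch to Bun.build**: `bun run build` already enables code splitting by default (`splitting: true`), with Bun RSS of only 30-318MB
+3. **If a single file is required**: consider running the Vite output under Node.js (`node dist/cli-node.js`), at the cost of losing Bun-specific APIs
+4. **Verify why `codeSplitting: false` exists**: the comment says "all dynamic imports inlined", possibly to simplify deployment. Evaluate whether a single file is really needed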
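
The derived figures in the report (the 92% native memory gap and the Bun/Node column) follow directly from the raw RSS numbers. A small sketch of that arithmetic, with hypothetical helper names `nativeGap` and `ratio`, assuming values are in MB as quoted in the report:

```typescript
// Hypothetical helpers (assumption: inputs are MB figures copied from the report).
function nativeGap(rssMb: number, heapMb: number): { gapMb: number; percent: number } {
  // Memory not accounted for by the JS heap.
  const gapMb = rssMb - heapMb
  return { gapMb, percent: Math.round((gapMb / rssMb) * 100) }
}

function ratio(bunMb: number, nodeMb: number): string {
  // Bun RSS relative to Node RSS, one decimal place as in the matrix.
  return `${(bunMb / nodeMb).toFixed(1)}x`
}

console.log(nativeGap(1118, 84)) // { gapMb: 1034, percent: 92 }
console.log(ratio(966, 223))     // 4.3x  (build:vite, --version)
console.log(ratio(1088, 340))    // 3.2x  (build:vite, pipe mode)
console.log(ratio(30, 42))       // 0.7x  (Bun.build, --version)
console.log(ratio(318, 253))     // 1.3x  (Bun.build, pipe mode)
```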
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
  "name": "claude-code-best",
-  "version": "2.4.5",
+  "version": "2.6.6",
  "description": "Reverse-engineered Anthropic Claude Code CLI — interactive AI coding assistant in the terminal",
  "type": "module",
  "author": "claude-code-best <claude-code-best@proton.me>",
@@ -53,7 +53,7 @@
    "format": "biome format --write .",
    "check": "biome check .",
    "check:fix": "biome check --fix .",
-    "prepare": "bunx husky",
+    "prepare": "husky",
    "test": "bun test",
    "test:production": "bun run scripts/production-test.ts",
    "test:production:offline": "bun run scripts/production-test.ts --offline",
--- a/packages/@ant/model-provider/src/shared/tests/openaiStreamAdapter.test.ts
+++ b/packages/@ant/model-provider/src/shared/tests/openaiStreamAdapter.test.ts
@@ -551,7 +551,8 @@ describe('prompt caching support', () => {

    const msgStart = events.find(e => e.type === 'message_start') as any
    expect(msgStart.message.usage.cache_read_input_tokens).toBe(800)
-    expect(msgStart.message.usage.input_tokens).toBe(1000)
+    // input_tokens = prompt_tokens - cached_tokens = 1000 - 800 = 200
+    expect(msgStart.message.usage.input_tokens).toBe(200)
  })

  test('defaults cache_read_input_tokens to 0 when no cached_tokens', async () => {
@@ -750,7 +751,8 @@ describe('prompt caching support', () => {

    // message_delta carries the real values from the trailing chunk
    const msgDelta = events.find(e => e.type === 'message_delta') as any
-    expect(msgDelta.usage.input_tokens).toBe(30011)
+    // input_tokens = prompt_tokens - cached_tokens = 30011 - 19904 = 10107
+    expect(msgDelta.usage.input_tokens).toBe(10107)
    expect(msgDelta.usage.output_tokens).toBe(190)
    expect(msgDelta.usage.cache_read_input_tokens).toBe(19904)
    expect(msgDelta.usage.cache_creation_input_tokens).toBe(0)
@@ -821,7 +823,34 @@ describe('prompt caching support', () => {

    const msgDelta = events.find(e => e.type === 'message_delta') as any
    expect(msgDelta.usage.cache_read_input_tokens).toBe(1500)
-    expect(msgDelta.usage.input_tokens).toBe(2000)
+    // input_tokens = prompt_tokens - cached_tokens = 2000 - 1500 = 500
+    expect(msgDelta.usage.input_tokens).toBe(500)
    expect(msgDelta.usage.output_tokens).toBe(100)
  })
+
+  test('subtracts cached_tokens from input_tokens to match Anthropic semantic', async () => {
+    // Anthropic's input_tokens = non-cached tokens only.
+    // OpenAI's prompt_tokens = total input including cached.
+    // The adapter must subtract: input_tokens = prompt_tokens - cached_tokens.
+    const events = await collectEvents([
+      makeChunk({
+        choices: [{ index: 0, delta: { content: 'hi' }, finish_reason: null }],
+      }),
+      makeChunk({
+        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
+        usage: {
+          prompt_tokens: 34097,
+          completion_tokens: 30,
+          total_tokens: 34127,
+          prompt_tokens_details: { cached_tokens: 34048 },
+        } as any,
+      }),
+    ])
+
+    const msgDelta = events.find(e => e.type === 'message_delta') as any
+    // input_tokens = 34097 - 34048 = 49 (non-cached input only)
+    expect(msgDelta.usage.input_tokens).toBe(49)
+    expect(msgDelta.usage.cache_read_input_tokens).toBe(34048)
+    expect(msgDelta.usage.output_tokens).toBe(30)
+  })
 })
--- a/packages/@ant/model-provider/src/shared/openaiStreamAdapter.ts
+++ b/packages/@ant/model-provider/src/shared/openaiStreamAdapter.ts
@@ -13,10 +13,10 @@ import { randomUUID } from 'crypto'
 *   finish_reason            → message_delta(stop_reason) + message_stop
 *
 * Usage field mapping (OpenAI → Anthropic):
- *   prompt_tokens                        → input_tokens
- *   completion_tokens                    → output_tokens
- *   prompt_tokens_details.cached_tokens  → cache_read_input_tokens
- *   (no OpenAI equivalent)               → cache_creation_input_tokens (always 0)
+ *   prompt_tokens - cached_tokens             → input_tokens (non-cached input only)
+ *   completion_tokens                         → output_tokens
+ *   prompt_tokens_details.cached_tokens       → cache_read_input_tokens
+ *   (no OpenAI equivalent)                    → cache_creation_input_tokens (always 0)
 *
 *   All four fields are emitted in the post-loop message_delta (not message_start)
 *   so that trailing usage chunks (sent after finish_reason by some
@@ -54,6 +54,9 @@ export async function* adaptOpenAIStreamToAnthropic(
  let textBlockOpen = false

  // Track usage — all four Anthropic fields, populated from OpenAI usage fields:
+  // rawInputTokens tracks the raw prompt_tokens (OpenAI total, including cached).
+  // inputTokens is the derived Anthropic value (non-cached only = rawInputTokens - cachedReadTokens).
+  let rawInputTokens = 0
  let inputTokens = 0
  let outputTokens = 0
  let cachedReadTokens = 0
@@ -71,12 +74,17 @@ export async function* adaptOpenAIStreamToAnthropic(

    // Extract usage from any chunk that carries it.
    if (chunk.usage) {
-      inputTokens = chunk.usage.prompt_tokens ?? inputTokens
+      rawInputTokens = chunk.usage.prompt_tokens ?? rawInputTokens
+      const rawCached =
+        ((chunk.usage as any).prompt_tokens_details?.cached_tokens as
+          | number
+          | undefined) ?? cachedReadTokens
+      // Anthropic's input_tokens = non-cached input only. OpenAI's prompt_tokens
+      // includes cached tokens, so subtract. Clamp to 0 in case cached > total
+      // due to a streaming race.
+      inputTokens = Math.max(0, rawInputTokens - rawCached)
      outputTokens = chunk.usage.completion_tokens ?? outputTokens
-      const details = (chunk.usage as any).prompt_tokens_details
-      if (details?.cached_tokens != null) {
-        cachedReadTokens = details.cached_tokens
-      }
+      cachedReadTokens = rawCached
    }

    // Emit message_start on first chunk
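
The usage mapping in this hunk can be illustrated in isolation. The sketch below is not the adapter itself: `toAnthropicUsage` and both interfaces are hypothetical names, and it converts a single usage object rather than tracking values across stream chunks.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface OpenAIUsage {
  prompt_tokens?: number
  completion_tokens?: number
  prompt_tokens_details?: { cached_tokens?: number }
}

interface AnthropicUsage {
  input_tokens: number
  output_tokens: number
  cache_read_input_tokens: number
  cache_creation_input_tokens: number
}

function toAnthropicUsage(usage: OpenAIUsage): AnthropicUsage {
  const promptTokens = usage.prompt_tokens ?? 0
  const cached = usage.prompt_tokens_details?.cached_tokens ?? 0
  return {
    // OpenAI's prompt_tokens includes cached tokens; Anthropic's input_tokens
    // does not, so subtract and clamp at 0 in case cached exceeds the total.
    input_tokens: Math.max(0, promptTokens - cached),
    output_tokens: usage.completion_tokens ?? 0,
    cache_read_input_tokens: cached,
    cache_creation_input_tokens: 0, // no OpenAI equivalent
  }
}

const converted = toAnthropicUsage({
  prompt_tokens: 34097,
  completion_tokens: 30,
  prompt_tokens_details: { cached_tokens: 34048 },
})
console.log(converted.input_tokens) // 49
```

The numbers in the usage example are the ones used by the new test case in `openaiStreamAdapter.test.ts`.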
--- a/packages/builtin-tools/src/tools/FileEditTool/FileEditTool.ts
+++ b/packages/builtin-tools/src/tools/FileEditTool/FileEditTool.ts
@@ -70,7 +70,6 @@ import {
  areFileEditsInputsEquivalent,
  findActualString,
  getPatchForEdit,
-  preserveQuoteStyle,
 } from './utils.js'

 // V8/Bun string length limit is ~2^30 characters (~1 billion). For typical
@@ -297,7 +296,7 @@ export const FileEditTool = buildTool({

    const file = fileContent

-    // Use findActualString to handle quote normalization
+    // Use findActualString to find exact match
    const actualOldString = findActualString(file, old_string)
    if (!actualOldString) {
      return {
@@ -452,23 +451,16 @@ export const FileEditTool = buildTool({
      }
    }

-    // 3. Use findActualString to handle quote normalization
+    // 3. Find the exact string in file content
    const actualOldString =
      findActualString(originalFileContents, old_string) || old_string

-    // Preserve curly quotes in new_string when the file uses them
-    const actualNewString = preserveQuoteStyle(
-      old_string,
-      actualOldString,
-      new_string,
-    )
-
    // 4. Generate patch
    const { patch, updatedFile } = getPatchForEdit({
      filePath: absoluteFilePath,
      fileContents: originalFileContents,
      oldString: actualOldString,
-      newString: actualNewString,
+      newString: new_string,
      replaceAll: replace_all,
    })

--- a/packages/builtin-tools/src/tools/FileEditTool/UI.tsx
+++ b/packages/builtin-tools/src/tools/FileEditTool/UI.tsx
@@ -20,7 +20,7 @@ import { readEditContext } from 'src/utils/readEditContext.js';
 import { firstLineOf } from 'src/utils/stringUtils.js';
 import type { ThemeName } from 'src/utils/theme.js';
 import type { FileEditOutput } from './types.js';
-import { findActualString, getPatchForEdit, preserveQuoteStyle } from './utils.js';
+import { findActualString, getPatchForEdit } from './utils.js';

 export function userFacingName(
  input:
@@ -265,12 +265,11 @@ async function loadRejectionDiff(
      return { patch, firstLine: null, fileContent: undefined };
    }
    const actualOld = findActualString(ctx.content, oldString) || oldString;
-    const actualNew = preserveQuoteStyle(oldString, actualOld, newString);
    const { patch } = getPatchForEdit({
      filePath,
      fileContents: ctx.content,
      oldString: actualOld,
-      newString: actualNew,
+      newString: newString,
      replaceAll,
    });
    return {
--- a/packages/builtin-tools/src/tools/FileEditTool/tests/utils.test.ts
+++ b/packages/builtin-tools/src/tools/FileEditTool/tests/utils.test.ts
@@ -4,45 +4,8 @@ import { logMock } from '../../../../../../tests/mocks/log'
 // Mock log.ts to cut the heavy dependency chain
 mock.module('src/utils/log.ts', logMock)

-const {
-  normalizeQuotes,
-  stripTrailingWhitespace,
-  findActualString,
-  preserveQuoteStyle,
-  applyEditToFile,
-  LEFT_SINGLE_CURLY_QUOTE,
-  RIGHT_SINGLE_CURLY_QUOTE,
-  LEFT_DOUBLE_CURLY_QUOTE,
-  RIGHT_DOUBLE_CURLY_QUOTE,
-} = await import('../utils')
-
-// ─── normalizeQuotes ────────────────────────────────────────────────────
-
-describe('normalizeQuotes', () => {
-  test('converts left single curly to straight', () => {
-    expect(normalizeQuotes(`${LEFT_SINGLE_CURLY_QUOTE}hello`)).toBe("'hello")
-  })
-
-  test('converts right single curly to straight', () => {
-    expect(normalizeQuotes(`hello${RIGHT_SINGLE_CURLY_QUOTE}`)).toBe("hello'")
-  })
-
-  test('converts left double curly to straight', () => {
-    expect(normalizeQuotes(`${LEFT_DOUBLE_CURLY_QUOTE}hello`)).toBe('"hello')
-  })
-
-  test('converts right double curly to straight', () => {
-    expect(normalizeQuotes(`hello${RIGHT_DOUBLE_CURLY_QUOTE}`)).toBe('hello"')
-  })
-
-  test('leaves straight quotes unchanged', () => {
-    expect(normalizeQuotes('\'hello\' "world"')).toBe('\'hello\' "world"')
-  })
-
-  test('handles empty string', () => {
-    expect(normalizeQuotes('')).toBe('')
-  })
-})
+const { stripTrailingWhitespace, findActualString, applyEditToFile } =
+  await import('../utils')

 // ─── stripTrailingWhitespace ────────────────────────────────────────────

@@ -91,12 +54,6 @@ describe('findActualString', () => {
    expect(findActualString('hello world', 'hello')).toBe('hello')
  })

-  test('finds match with curly quotes normalized', () => {
-    const fileContent = `${LEFT_DOUBLE_CURLY_QUOTE}hello${RIGHT_DOUBLE_CURLY_QUOTE}`
-    const result = findActualString(fileContent, '"hello"')
-    expect(result).not.toBeNull()
-  })
-
  test('returns null when not found', () => {
    expect(findActualString('hello world', 'xyz')).toBeNull()
  })
@@ -107,124 +64,13 @@ describe('findActualString', () => {
    expect(result).toBe('')
  })

-  // ── Tab/space normalization (Bug #2 reproduction) ──
-
-  test('finds match when search uses spaces but file uses tabs', () => {
-    // File content uses Tab indentation
-    const fileContent = '\tif (x) {\n\t\treturn 1;\n\t}'
-    // User copies from Read output which renders tabs as spaces
-    const searchWithSpaces = '    if (x) {\n        return 1;\n    }'
-    const result = findActualString(fileContent, searchWithSpaces)
-    expect(result).not.toBeNull()
-    expect(result).toBe(fileContent)
-  })
-
-  test('finds match when search mixes tabs and spaces inconsistently', () => {
-    const fileContent = '\tconst x = 1; // comment'
-    const searchMixed = '    const x = 1; // comment'
-    const result = findActualString(fileContent, searchMixed)
-    expect(result).not.toBeNull()
-  })
-
-  test('finds match for single-line tab-to-space mismatch', () => {
-    const fileContent = '\t\torder_price = NormalizeDouble(ask, digits);'
-    const searchSpaces = '        order_price = NormalizeDouble(ask, digits);'
-    const result = findActualString(fileContent, searchSpaces)
-    expect(result).not.toBeNull()
-  })
-
-  // ── CJK / UTF-8 characters (Bug #1 reproduction) ──
+  // ── CJK / UTF-8 characters ──

  test('finds match with CJK characters in content', () => {
    const fileContent = 'input int x = 620; // 止盈点数(点) — 32个pip=320点'
    const result = findActualString(fileContent, fileContent)
    expect(result).toBe(fileContent)
  })
-
-  test('finds match with CJK characters when tab/space differs', () => {
-    const fileContent = '\t// 向上突破 → Sell Limit (逆方向做空)'
-    const searchSpaces = '    // 向上突破 → Sell Limit (逆方向做空)'
-    const result = findActualString(fileContent, searchSpaces)
-    expect(result).not.toBeNull()
-    expect(result).toBe(fileContent)
-  })
-
-  // ── Multiline with tabs + CJK (combined Bug #1 + #2) ──
-
-  test('finds multiline match with tabs and CJK characters', () => {
-    const fileContent =
-      '\tif(effective_dir == BREAKOUT_UP)\n\t\t{\n\t\t\t// 向上突破\n\t\t}'
-    const searchSpaces =
-      '    if(effective_dir == BREAKOUT_UP)\n        {\n            // 向上突破\n        }'
-    const result = findActualString(fileContent, searchSpaces)
-    expect(result).not.toBeNull()
-    expect(result).toBe(fileContent)
-  })
-
-  // ── Returned string must be a valid substring of fileContent ──
-
-  test('returned string from tab match is a real substring of fileContent', () => {
-    const fileContent = 'prefix\n\t\tindented code\nsuffix'
-    const searchSpaces = 'prefix\n        indented code\nsuffix'
-    const result = findActualString(fileContent, searchSpaces)
-    expect(result).not.toBeNull()
-    expect(fileContent.includes(result!)).toBe(true)
-  })
-
-  test('returned string from partial tab match is a real substring', () => {
-    const fileContent = 'line1\n\tif (x) {\n\t\tdoStuff();\n\t}\nline5'
-    const searchSpaces = '    if (x) {\n        doStuff();\n    }'
-    const result = findActualString(fileContent, searchSpaces)
-    expect(result).not.toBeNull()
-    expect(fileContent.includes(result!)).toBe(true)
-  })
-
-  test('tab match with mixed indentation levels', () => {
-    const fileContent =
-      'class Foo {\n\t\tmethod1() {\n\t\t\treturn 42;\n\t\t}\n}'
-    const searchSpaces =
-      'class Foo {\n        method1() {\n            return 42;\n        }\n}'
-    const result = findActualString(fileContent, searchSpaces)
-    expect(result).not.toBeNull()
-    expect(fileContent.includes(result!)).toBe(true)
-  })
-})
-
-// ─── preserveQuoteStyle ─────────────────────────────────────────────────
-
-describe('preserveQuoteStyle', () => {
-  test('returns newString unchanged when no normalization happened', () => {
-    expect(preserveQuoteStyle('hello', 'hello', 'world')).toBe('world')
-  })
-
-  test('converts straight double quotes to curly in replacement', () => {
-    const oldString = '"hello"'
-    const actualOldString = `${LEFT_DOUBLE_CURLY_QUOTE}hello${RIGHT_DOUBLE_CURLY_QUOTE}`
-    const newString = '"world"'
-    const result = preserveQuoteStyle(oldString, actualOldString, newString)
-    expect(result).toContain(LEFT_DOUBLE_CURLY_QUOTE)
-    expect(result).toContain(RIGHT_DOUBLE_CURLY_QUOTE)
-  })
-
-  test('converts straight single quotes to curly in replacement', () => {
-    const oldString = "'hello'"
-    const actualOldString = `${LEFT_SINGLE_CURLY_QUOTE}hello${RIGHT_SINGLE_CURLY_QUOTE}`
-    const newString = "'world'"
-    const result = preserveQuoteStyle(oldString, actualOldString, newString)
-    expect(result).toContain(LEFT_SINGLE_CURLY_QUOTE)
-    expect(result).toContain(RIGHT_SINGLE_CURLY_QUOTE)
-  })
-
-  test('treats apostrophe in contraction as right curly quote', () => {
-    const oldString = "'it's a test'"
-    const actualOldString = `${LEFT_SINGLE_CURLY_QUOTE}it${RIGHT_SINGLE_CURLY_QUOTE}s a test${RIGHT_SINGLE_CURLY_QUOTE}`
-    const newString = "'don't worry'"
-    const result = preserveQuoteStyle(oldString, actualOldString, newString)
-    // The leading ' at position 0 should be LEFT_SINGLE_CURLY_QUOTE
-    expect(result[0]).toBe(LEFT_SINGLE_CURLY_QUOTE)
-    // The apostrophe in "don't" (between n and t) should be RIGHT_SINGLE_CURLY_QUOTE
-    expect(result).toContain(RIGHT_SINGLE_CURLY_QUOTE)
-  })
 })

 // ─── applyEditToFile ────────────────────────────────────────────────────
--- a/packages/builtin-tools/src/tools/FileEditTool/utils.ts
+++ b/packages/builtin-tools/src/tools/FileEditTool/utils.ts
@@ -15,27 +15,6 @@ import {
 } from 'src/utils/file.js'
 import type { EditInput, FileEdit } from './types.js'

-// Claude can't output curly quotes, so we define them as constants here for Claude to use
-// in the code. We do this because we normalize curly quotes to straight quotes
-// when applying edits.
-export const LEFT_SINGLE_CURLY_QUOTE = '‘'
-export const RIGHT_SINGLE_CURLY_QUOTE = '’'
-export const LEFT_DOUBLE_CURLY_QUOTE = '“'
-export const RIGHT_DOUBLE_CURLY_QUOTE = '”'
-
-/**
- * Normalizes quotes in a string by converting curly quotes to straight quotes
- * @param str The string to normalize
- * @returns The string with all curly quotes replaced by straight quotes
- */
-export function normalizeQuotes(str: string): string {
-  return str
-    .replaceAll(LEFT_SINGLE_CURLY_QUOTE, "'")
-    .replaceAll(RIGHT_SINGLE_CURLY_QUOTE, "'")
-    .replaceAll(LEFT_DOUBLE_CURLY_QUOTE, '"')
-    .replaceAll(RIGHT_DOUBLE_CURLY_QUOTE, '"')
-}
-
 /**
 * Strips trailing whitespace from each line in a string while preserving line endings
 * @param str The string to process
@@ -64,261 +43,22 @@ export function stripTrailingWhitespace(str: string): string {
 }

 /**
- * Normalizes whitespace for fuzzy matching by converting tabs to spaces
- * and collapsing leading whitespace on each line to a canonical form.
- * This handles the case where Read tool output renders tabs as spaces,
- * so users copy spaces from the output but the file actually has tabs.
- */
-function normalizeWhitespace(str: string): string {
-  return str.replace(/\t/g, '    ')
-}
-
-/**
- * Finds the actual string in the file content that matches the search string,
- * accounting for quote normalization and tab/space differences.
- *
- * Matching cascade:
- * 1. Exact match
- * 2. Quote normalization (curly → straight quotes)
- * 3. Tab/space normalization (tabs ↔ spaces in leading whitespace)
- * 4. Quote + tab/space normalization combined
+ * Finds the exact string in the file content.
 *
 * @param fileContent The file content to search in
 * @param searchString The string to search for
- * @returns The actual string found in the file, or null if not found
+ * @returns The search string if found, or null if not found
 */
 export function findActualString(
  fileContent: string,
  searchString: string,
 ): string | null {
-  // First try exact match
  if (fileContent.includes(searchString)) {
    return searchString
  }
-
-  // Try with normalized quotes
-  const normalizedSearch = normalizeQuotes(searchString)
-  const normalizedFile = normalizeQuotes(fileContent)
-
-  const searchIndex = normalizedFile.indexOf(normalizedSearch)
-  if (searchIndex !== -1) {
-    // Find the actual string in the file that matches
-    return fileContent.substring(searchIndex, searchIndex + searchString.length)
-  }
-
-  // Try with tab/space normalization — handles the case where Read output
-  // renders tabs as spaces and the user copies the rendered version
-  const wsNormalizedFile = normalizeWhitespace(fileContent)
-  const wsNormalizedSearch = normalizeWhitespace(searchString)
-
-  const wsSearchIndex = wsNormalizedFile.indexOf(wsNormalizedSearch)
-  if (wsSearchIndex !== -1) {
-    // Map the match position back to the original file content.
-    // We need to find the corresponding range in the original string.
-    return mapNormalizedMatchBackToFile(
-      fileContent,
-      wsNormalizedFile,
-      wsSearchIndex,
-      wsNormalizedSearch.length,
-    )
-  }
-
-  // Try combined: quote normalization + tab/space normalization
-  const combinedFile = normalizeWhitespace(normalizedFile)
-  const combinedSearch = normalizeWhitespace(normalizedSearch)
-
-  const combinedIndex = combinedFile.indexOf(combinedSearch)
-  if (combinedIndex !== -1) {
-    return mapNormalizedMatchBackToFile(
-      fileContent,
-      combinedFile,
-      combinedIndex,
-      combinedSearch.length,
-    )
-  }
-
  return null
 }

-/**
- * Given a match found in a normalized version of fileContent, map the match
- * position back to the original fileContent and extract the corresponding
- * substring.
- *
- * Strategy: walk through both strings character by character, building a
- * mapping from normalized offset to original offset. When a tab is expanded
- * to 4 spaces in the normalized version, the normalized offset advances by 4
- * while the original offset advances by 1.
- */
-function mapNormalizedMatchBackToFile(
-  fileContent: string,
-  normalizedFile: string,
-  normalizedStart: number,
-  normalizedLength: number,
-): string {
-  // Build a sparse mapping from normalized position → original position.
-  // We only need to map the range [normalizedStart, normalizedStart + normalizedLength].
-  let normPos = 0
-  let origPos = 0
-  let origStart = -1
-  let origEnd = -1
-
-  while (
-    origPos < fileContent.length &&
-    normPos <= normalizedStart + normalizedLength
-  ) {
-    if (normPos === normalizedStart) {
-      origStart = origPos
-    }
-    if (normPos === normalizedStart + normalizedLength) {
-      origEnd = origPos
-      break
-    }
-
-    const origChar = fileContent[origPos]!
-    if (origChar === '\t') {
-      // Tab expands to 4 spaces in normalized version
-      const nextNormPos = normPos + 4
-      // If normalizedStart falls within this expanded tab, snap to origPos
-      if (
-        normPos < normalizedStart &&
-        nextNormPos > normalizedStart &&
-        origStart === -1
-      ) {
-        origStart = origPos
-      }
-      if (
-        normPos < normalizedStart + normalizedLength &&
-        nextNormPos > normalizedStart + normalizedLength &&
-        origEnd === -1
-      ) {
-        origEnd = origPos + 1
-      }
-      normPos = nextNormPos
-      origPos++
-    } else {
-      normPos++
-      origPos++
-    }
-  }
-
-  // Fallback: if we couldn't map precisely, use character-count heuristic
-  if (origStart === -1) origStart = 0
-  if (origEnd === -1) {
-    // Approximate: use the ratio of original to normalized length
-    const ratio = fileContent.length / normalizedFile.length
-    origEnd = Math.round(origStart + normalizedLength * ratio)
-  }
-
-  return fileContent.substring(origStart, origEnd)
-}
-
-/**
- * When old_string matched via quote normalization (curly quotes in file,
- * straight quotes from model), apply the same curly quote style to new_string
- * so the edit preserves the file's typography.
- *
- * Uses a simple open/close heuristic: a quote character preceded by whitespace,
- * start of string, or opening punctuation is treated as an opening quote;
- * otherwise it's a closing quote.
- */
-export function preserveQuoteStyle(
-  oldString: string,
-  actualOldString: string,
-  newString: string,
-): string {
-  // If they're the same, no normalization happened
-  if (oldString === actualOldString) {
-    return newString
-  }
-
-  // Detect which curly quote types were in the file
-  const hasDoubleQuotes =
-    actualOldString.includes(LEFT_DOUBLE_CURLY_QUOTE) ||
-    actualOldString.includes(RIGHT_DOUBLE_CURLY_QUOTE)
-  const hasSingleQuotes =
-    actualOldString.includes(LEFT_SINGLE_CURLY_QUOTE) ||
-    actualOldString.includes(RIGHT_SINGLE_CURLY_QUOTE)
-
-  if (!hasDoubleQuotes && !hasSingleQuotes) {
-    return newString
-  }
-
-  let result = newString
-
-  if (hasDoubleQuotes) {
-    result = applyCurlyDoubleQuotes(result)
-  }
-  if (hasSingleQuotes) {
-    result = applyCurlySingleQuotes(result)
-  }
-
-  return result
-}
-
-function isOpeningContext(chars: string[], index: number): boolean {
-  if (index === 0) {
-    return true
-  }
-  const prev = chars[index - 1]
-  return (
-    prev === ' ' ||
-    prev === '\t' ||
-    prev === '\n' ||
-    prev === '\r' ||
-    prev === '(' ||
-    prev === '[' ||
-    prev === '{' ||
-    prev === '\u2014' || // em dash
-    prev === '\u2013' // en dash
-  )
-}
-
-function applyCurlyDoubleQuotes(str: string): string {
-  const chars = [...str]
-  const result: string[] = []
-  for (let i = 0; i < chars.length; i++) {
-    if (chars[i] === '"') {
-      result.push(
-        isOpeningContext(chars, i)
-          ? LEFT_DOUBLE_CURLY_QUOTE
-          : RIGHT_DOUBLE_CURLY_QUOTE,
-      )
-    } else {
-      result.push(chars[i]!)
-    }
-  }
-  return result.join('')
-}
-
-function applyCurlySingleQuotes(str: string): string {
-  const chars = [...str]
-  const result: string[] = []
-  for (let i = 0; i < chars.length; i++) {
-    if (chars[i] === "'") {
-      // Don't convert apostrophes in contractions (e.g., "don't", "it's")
-      // An apostrophe between two letters is a contraction, not a quote
-      const prev = i > 0 ? chars[i - 1] : undefined
-      const next = i < chars.length - 1 ? chars[i + 1] : undefined
-      const prevIsLetter = prev !== undefined && /\p{L}/u.test(prev)
-      const nextIsLetter = next !== undefined && /\p{L}/u.test(next)
-      if (prevIsLetter && nextIsLetter) {
-        // Apostrophe in a contraction — use right single curly quote
-        result.push(RIGHT_SINGLE_CURLY_QUOTE)
-      } else {
-        result.push(
-          isOpeningContext(chars, i)
-            ? LEFT_SINGLE_CURLY_QUOTE
-            : RIGHT_SINGLE_CURLY_QUOTE,
-        )
-      }
-    } else {
-      result.push(chars[i]!)
-    }
-  }
-  return result.join('')
-}
-
 /**
 * Transform edits to ensure replace_all always has a boolean value
 * @param edits Array of edits with optional replace_all
--- a/scripts/post-build.ts
+++ b/scripts/post-build.ts
@@ -9,28 +9,52 @@
 import { readdir, readFile, writeFile, cp } from 'node:fs/promises'
 import { chmodSync } from 'node:fs'
 import { join } from 'node:path'
-import { execSync } from 'node:child_process'

 const outdir = 'dist'

 async function postBuild() {
-  // Step 1: Patch globalThis.Bun destructuring in the single bundled file
-  const cliPath = join(outdir, 'cli.js')
+  // Step 1: Patch globalThis.Bun destructuring in ALL output files
  const BUN_DESTRUCTURE = /var \{([^}]+)\} = globalThis\.Bun;?/g
  const BUN_DESTRUCTURE_SAFE =
    'var {$1} = typeof globalThis.Bun !== "undefined" ? globalThis.Bun : {};'

  let bunPatched = 0
-  {
-    const content = await readFile(cliPath, 'utf-8')
+  const files = await readdir(outdir)
+  const jsFiles = files.filter(f => f.endsWith('.js'))
+
+  for (const file of jsFiles) {
+    const filePath = join(outdir, file)
+    const content = await readFile(filePath, 'utf-8')
+    BUN_DESTRUCTURE.lastIndex = 0
    if (BUN_DESTRUCTURE.test(content)) {
      await writeFile(
-        cliPath,
+        filePath,
        content.replace(BUN_DESTRUCTURE, BUN_DESTRUCTURE_SAFE),
      )
      bunPatched++
    }
+  }
+
+  // Also patch chunk files in dist/chunks/
+  const chunksDir = join(outdir, 'chunks')
+  let chunkFiles: string[] = []
+  try {
+    chunkFiles = (await readdir(chunksDir)).filter(f => f.endsWith('.js'))
+  } catch {
+    // No chunks directory — single-file build fallback
+  }
+
+  for (const file of chunkFiles) {
+    const filePath = join(chunksDir, file)
+    const content = await readFile(filePath, 'utf-8')
    BUN_DESTRUCTURE.lastIndex = 0
+    if (BUN_DESTRUCTURE.test(content)) {
+      await writeFile(
+        filePath,
+        content.replace(BUN_DESTRUCTURE, BUN_DESTRUCTURE_SAFE),
+      )
+      bunPatched++
+    }
  }

  // Step 2: Copy native addon files
@@ -55,7 +79,7 @@ async function postBuild() {
  chmodSync(cliNode, 0o755)

  console.log(
-    `Post-build complete: patched ${bunPatched} Bun destructure, generated entry points`,
+    `Post-build complete: patched ${bunPatched} Bun destructure across ${jsFiles.length + chunkFiles.length} files, generated entry points`,
  )
 }

--- a/src/cli/print.ts
+++ b/src/cli/print.ts
@@ -4966,7 +4966,7 @@ function handleChannelEnable(
  // channel messages queue at priority 'next' and are seen by the model on
  // the turn after they arrive.
  connection.client.setNotificationHandler(
-    ChannelMessageNotificationSchema(),
+    ChannelMessageNotificationSchema() as any,
    async notification => {
      const { content, meta } = notification.params
      logMCPDebug(
@@ -5042,7 +5042,7 @@ function reregisterChannelHandlerAfterReconnect(
    'Channel notifications re-registered after reconnect',
  )
  connection.client.setNotificationHandler(
-    ChannelMessageNotificationSchema(),
+    ChannelMessageNotificationSchema() as any,
    async notification => {
      const { content, meta } = notification.params
      logMCPDebug(
--- a/src/cli/updateCCB.ts
+++ b/src/cli/updateCCB.ts
@@ -9,9 +9,9 @@ import chalk from 'chalk'
 import { execSync } from 'node:child_process'
 import { existsSync, readFileSync } from 'node:fs'
 import { homedir } from 'node:os'
-import { join, dirname } from 'node:path'
-import { fileURLToPath } from 'node:url'
+import { join } from 'node:path'
 import { logForDebugging } from '../utils/debug.js'
+import { distRoot } from '../utils/distRoot.js'
 import { execFileNoThrowWithCwd } from '../utils/execFileNoThrow.js'
 import { gracefulShutdown } from '../utils/gracefulShutdown.js'
 import { writeToStdout } from '../utils/process.js'
@@ -19,12 +19,9 @@ import { writeToStdout } from '../utils/process.js'
 const PACKAGE_NAME = 'claude-code-best'

 function getCurrentVersion(): string {
-  // Read version from the nearest package.json (walks up from this file)
+  // Read version from the package.json one level above the dist root
  try {
-    const __dirname = dirname(fileURLToPath(import.meta.url))
-    // In dev: src/cli/updateCCB.ts → ../../package.json
-    // In build: dist/chunks/xxx.js → ../../package.json (may not exist)
-    const pkgPath = join(__dirname, '..', '..', 'package.json')
+    const pkgPath = join(distRoot, '..', 'package.json')
    if (existsSync(pkgPath)) {
      const pkg = JSON.parse(readFileSync(pkgPath, 'utf-8'))
      if (pkg.version) return pkg.version
--- a/src/commands/autofix-pr/tests/extractAutofixResult.test.ts
+++ b/src/commands/autofix-pr/tests/extractAutofixResult.test.ts
@@ -0,0 +1,133 @@
+import { describe, expect, test } from 'bun:test'
+import type { SDKMessage } from '../../../entrypoints/agentSdkTypes.js'
+import {
+  AUTOFIX_RESULT_TAG,
+  extractAutofixResultFromLog,
+} from '../extractAutofixResult.js'
+
+function hookProgressMessage(stdout: string): SDKMessage {
+  return {
+    type: 'system',
+    subtype: 'hook_progress',
+    stdout,
+  } as unknown as SDKMessage
+}
+
+function assistantTextMessage(text: string): SDKMessage {
+  return {
+    type: 'assistant',
+    message: {
+      content: [{ type: 'text', text }],
+    },
+  } as unknown as SDKMessage
+}
+
+const sampleTag = (summary: string): string =>
+  `<${AUTOFIX_RESULT_TAG}>
+  <pr-number>42</pr-number>
+  <commits-pushed>
+    <commit sha="abc123">${summary}</commit>
+  </commits-pushed>
+  <ci-status>green</ci-status>
+  <summary>${summary}</summary>
+</${AUTOFIX_RESULT_TAG}>`
+
+describe('extractAutofixResultFromLog', () => {
+  test('returns null on empty log', () => {
+    expect(extractAutofixResultFromLog([])).toBeNull()
+  })
+
+  test('returns null when no tag present', () => {
+    const log = [
+      assistantTextMessage('just some normal text without the tag'),
+      hookProgressMessage('hook output without tag'),
+    ]
+    expect(extractAutofixResultFromLog(log)).toBeNull()
+  })
+
+  test('extracts from hook stdout', () => {
+    const tag = sampleTag('fixed lint error')
+    const log = [hookProgressMessage(`prefix\n${tag}\nsuffix`)]
+    const result = extractAutofixResultFromLog(log)
+    expect(result).toBe(tag)
+  })
+
+  test('extracts from assistant text', () => {
+    const tag = sampleTag('typecheck fixed')
+    const log = [assistantTextMessage(`Done!\n${tag}`)]
+    expect(extractAutofixResultFromLog(log)).toBe(tag)
+  })
+
+  test('extracts from hook_response subtype too', () => {
+    const tag = sampleTag('via hook_response')
+    const log = [
+      {
+        type: 'system',
+        subtype: 'hook_response',
+        stdout: tag,
+      } as unknown as SDKMessage,
+    ]
+    expect(extractAutofixResultFromLog(log)).toBe(tag)
+  })
+
+  test('returns the latest tag when multiple appear in different messages', () => {
+    const older = sampleTag('older attempt')
+    const newer = sampleTag('newer attempt')
+    const log = [
+      assistantTextMessage(`first try\n${older}`),
+      assistantTextMessage(`retry\n${newer}`),
+    ]
+    expect(extractAutofixResultFromLog(log)).toBe(newer)
+  })
+
+  test('returns null when open tag exists but close tag is missing (truncated)', () => {
+    const log = [
+      assistantTextMessage(
+        `<${AUTOFIX_RESULT_TAG}>\n<summary>got cut off mid-write...`,
+      ),
+    ]
+    expect(extractAutofixResultFromLog(log)).toBeNull()
+  })
+
+  test('returns earlier complete tag when latest open tag is truncated within the same block', () => {
+    // Retry scenario: a full result was emitted, then a second result tag
+    // started but got cut off. We should surface the earlier complete pair
+    // rather than dropping the whole block.
+    const complete = sampleTag('earlier complete result')
+    const truncated = `<${AUTOFIX_RESULT_TAG}>\n<summary>truncated retry...`
+    const log = [assistantTextMessage(`${complete}\n${truncated}`)]
+    expect(extractAutofixResultFromLog(log)).toBe(complete)
+  })
+
+  test('walks backwards so hook stdout from later in log wins over earlier assistant text', () => {
+    const earlier = sampleTag('via assistant first')
+    const later = sampleTag('via hook later')
+    const log = [
+      assistantTextMessage(`some output\n${earlier}`),
+      hookProgressMessage(later),
+    ]
+    expect(extractAutofixResultFromLog(log)).toBe(later)
+  })
+
+  test('ignores tag-shaped strings that span across messages (no concatenation)', () => {
+    // Open tag in one message, close tag in another — should NOT be stitched.
+    const log = [
+      assistantTextMessage(`<${AUTOFIX_RESULT_TAG}>\n<summary>part 1`),
+      assistantTextMessage(`part 2</summary>\n</${AUTOFIX_RESULT_TAG}>`),
+    ]
+    expect(extractAutofixResultFromLog(log)).toBeNull()
+  })
+
+  test('returns null when assistant content is a string (not block array)', () => {
+    // Some SDK paths emit assistant content as a raw string instead of
+    // a content-block array. Current implementation skips those — verify
+    // graceful no-op rather than crash.
+    const log = [
+      {
+        type: 'assistant',
+        message: { content: sampleTag('string content') },
+      } as unknown as SDKMessage,
+    ]
+    expect(extractAutofixResultFromLog(log)).toBeNull()
+  })
+})
--- a/src/commands/autofix-pr/tests/launchAutofixPr.test.ts
+++ b/src/commands/autofix-pr/tests/launchAutofixPr.test.ts
@@ -46,7 +46,7 @@ mock.module('src/utils/teleport.js', () => ({
 }))

 const registerMock = mock(() => ({
-  taskId: 'task-abc',
+  taskId: 'framework-task-id',
  sessionId: 'session-123',
  cleanup: () => {},
 }))
@@ -56,14 +56,41 @@ const checkEligibilityMock = mock(() =>
 const getSessionUrlMock = mock(
  (id: string) => `https://claude.ai/session/${id}`,
 )
+const registerCompletionHookMock = mock<
+  (taskType: string, hook: (taskId: string, metadata?: unknown) => void) => void
+>(() => {})
+const registerCompletionCheckerMock = mock<
+  (
+    taskType: string,
+    checker: (metadata?: unknown) => Promise<string | null>,
+  ) => void
+>(() => {})
+const registerContentExtractorMock = mock<
+  (taskType: string, extractor: (log: unknown[]) => string | null) => void
+>(() => {})

 mock.module('src/tasks/RemoteAgentTask/RemoteAgentTask.js', () => ({
  checkRemoteAgentEligibility: checkEligibilityMock,
  registerRemoteAgentTask: registerMock,
+  registerCompletionHook: registerCompletionHookMock,
+  registerCompletionChecker: registerCompletionCheckerMock,
+  registerContentExtractor: registerContentExtractorMock,
  getRemoteTaskSessionUrl: getSessionUrlMock,
  formatPreconditionError: (e: { type: string }) => e.type,
 }))

+const fetchPrHeadShaMock = mock<
+  (owner: string, repo: string, prNumber: number) => Promise<string | null>
+>(() => Promise.resolve('sha-baseline-abc123'))
+
+// Mock prFetch.ts (gh CLI spawn layer) — keeping the pure decision matrix
+// in prOutcomeCheck.ts unmocked so its tests are unaffected by this file's
+// process-global mock.module pollution.
+mock.module('src/commands/autofix-pr/prFetch.js', () => ({
+  fetchPrHeadSha: fetchPrHeadShaMock,
+  checkPrAutofixOutcome: mock(() => Promise.resolve({ completed: false })),
+}))
+
 const detectRepoMock = mock(() =>
  Promise.resolve({ host: 'github.com', owner: 'acme', name: 'myrepo' }),
 )
@@ -375,6 +402,326 @@ describe('callAutofixPr', () => {
  })
 })

+// Regression suite for the taskId-mismatch latent bug + completion hook wiring.
+// Before this fix, createAutofixTeammate generated a teammate UUID, that UUID
+// was used to acquire the singleton monitor lock, and registerRemoteAgentTask
+// generated a *different* framework taskId. When the framework eventually
+// called clearActiveMonitor(frameworkTaskId) on natural completion, the guard
+// failed (active.taskId !== frameworkTaskId) and the lock stayed acquired,
+// blocking any subsequent /autofix-pr invocations in the same process.
+describe('callAutofixPr · completion hook wiring (taskId mismatch regression)', () => {
+  test('updateActiveMonitor swaps lock taskId to framework-assigned id after register', async () => {
+    await callAutofixPr(onDone, makeContext(), '42')
+    const monitor = getActiveMonitor() as { taskId: string } | null
+    expect(monitor).not.toBeNull()
+    // registerMock returns 'framework-task-id'; before the fix this would be
+    // a teammate-generated random UUID instead.
+    expect(monitor?.taskId).toBe('framework-task-id')
+  })
+
+  test('framework hook → clearActiveMonitor releases lock on natural completion', async () => {
+    await callAutofixPr(onDone, makeContext(), '42')
+    expect(getActiveMonitor()).not.toBeNull()
+
+    // Find the hook the module registered at import time. We grab the last
+    // call so re-imports across tests don't break this — only the most recent
+    // registration is what the framework would invoke now.
+    const calls = registerCompletionHookMock.mock.calls
+    expect(calls.length).toBeGreaterThan(0)
+    const lastCall = calls[calls.length - 1]
+    expect(lastCall?.[0]).toBe('autofix-pr')
+    const hook = lastCall?.[1] as (id: string, metadata?: unknown) => void
+    expect(typeof hook).toBe('function')
+
+    // Simulate the framework invoking the hook with the framework taskId
+    // after a terminal transition. Before the fix this would no-op against
+    // a lock keyed by the teammate UUID.
+    hook('framework-task-id', { owner: 'acme', repo: 'myrepo', prNumber: 42 })
+    expect(getActiveMonitor()).toBeNull()
+  })
+
+  test('subsequent /autofix-pr succeeds after framework hook clears the lock', async () => {
+    await callAutofixPr(onDone, makeContext(), '42')
+    // Simulate natural completion via the registered hook
+    const calls = registerCompletionHookMock.mock.calls
+    const hook = calls[calls.length - 1]?.[1] as (
+      id: string,
+      metadata?: unknown,
+    ) => void
+    hook('framework-task-id', { owner: 'acme', repo: 'myrepo', prNumber: 42 })
+
+    onDone.mockClear()
+    await callAutofixPr(onDone, makeContext(), '99')
+    const firstArg = onDone.mock.calls[0]?.[0] as string
+    // Should be the success path, not "already monitoring"
+    expect(firstArg).not.toMatch(/already monitoring/i)
+    expect(firstArg).toMatch(/Autofix launched/)
+  })
+})
+
+// Phase 2: completionChecker wiring + initialHeadSha capture
+describe('callAutofixPr · Phase 2 completionChecker integration', () => {
+  test('completionChecker is registered at module load with autofix-pr type', () => {
+    // The registration happens during the beforeAll dynamic import; just
+    // verify the mock recorded a call. Filter by task type so any future
+    // additional registrations elsewhere don't break this assertion.
+    const calls = registerCompletionCheckerMock.mock.calls.filter(
+      c => c[0] === 'autofix-pr',
+    )
+    expect(calls.length).toBeGreaterThan(0)
+    const checker = calls[calls.length - 1]?.[1]
+    expect(typeof checker).toBe('function')
+  })
+
+  test('callAutofixPr captures initialHeadSha via fetchPrHeadSha', async () => {
+    fetchPrHeadShaMock.mockClear()
+    await callAutofixPr(onDone, makeContext(), '42')
+    expect(fetchPrHeadShaMock).toHaveBeenCalledWith('acme', 'myrepo', 42)
+  })
+
+  test('initialHeadSha is passed into remoteTaskMetadata on register', async () => {
+    fetchPrHeadShaMock.mockImplementationOnce(() =>
+      Promise.resolve('sha-from-launch'),
+    )
+    await callAutofixPr(onDone, makeContext(), '42')
+    expect(registerMock).toHaveBeenCalledWith(
+      expect.objectContaining({
+        remoteTaskMetadata: expect.objectContaining({
+          owner: 'acme',
+          repo: 'myrepo',
+          prNumber: 42,
+          initialHeadSha: 'sha-from-launch',
+        }),
+      }),
+    )
+  })
+
+  test('fetchPrHeadSha failure → metadata initialHeadSha undefined, launch still succeeds', async () => {
+    fetchPrHeadShaMock.mockImplementationOnce(() =>
+      Promise.reject(new Error('gh not installed')),
+    )
+    await callAutofixPr(onDone, makeContext(), '42')
+    expect(registerMock).toHaveBeenCalledWith(
+      expect.objectContaining({
+        remoteTaskMetadata: expect.objectContaining({
+          owner: 'acme',
+          repo: 'myrepo',
+          prNumber: 42,
+          initialHeadSha: undefined,
+        }),
+      }),
+    )
+    // Launch must NOT fail just because SHA capture failed
+    const firstArg = onDone.mock.calls[0]?.[0] as string
+    expect(firstArg).toMatch(/Autofix launched/)
+  })
+
+  test('fetchPrHeadSha returning null → metadata initialHeadSha undefined', async () => {
+    fetchPrHeadShaMock.mockImplementationOnce(() => Promise.resolve(null))
+    await callAutofixPr(onDone, makeContext(), '42')
+    expect(registerMock).toHaveBeenCalledWith(
+      expect.objectContaining({
+        remoteTaskMetadata: expect.objectContaining({
+          initialHeadSha: undefined,
+        }),
+      }),
+    )
+  })
+})
+
+// Phase 2 (cont.): exercise the registered completionChecker arrow body
+// directly. The earlier suite verifies it was registered but never invokes
+// the arrow itself, leaving the throttle / metadata-guard / gh-CLI dispatch
+// branches uncovered.
+describe('callAutofixPr · Phase 2 completionChecker arrow body', () => {
+  // Pull the most recent registered checker — beforeAll registers once at
+  // module load; nothing else re-registers across this file's tests.
+  function getChecker(): (metadata?: unknown) => Promise<string | null> {
+    const calls = registerCompletionCheckerMock.mock.calls.filter(
+      c => c[0] === 'autofix-pr',
+    )
+    const fn = calls[calls.length - 1]?.[1]
+    if (typeof fn !== 'function') {
+      throw new Error('completionChecker not registered')
+    }
+    return fn
+  }
+
+  test('returns null when metadata is undefined (early guard)', async () => {
+    const checker = getChecker()
+    expect(await checker(undefined)).toBeNull()
+  })
+
+  test('returns null when checkPrAutofixOutcome reports not completed', async () => {
+    const { checkPrAutofixOutcome } = await import('../prFetch.js')
+    ;(checkPrAutofixOutcome as ReturnType<typeof mock>).mockImplementationOnce(
+      () => Promise.resolve({ completed: false }),
+    )
+    const checker = getChecker()
+    // Distinct PR number to dodge the in-process throttle map carried over
+    // from earlier tests.
+    const result = await checker({
+      owner: 'acme',
+      repo: 'myrepo',
+      prNumber: 1001,
+    })
+    expect(result).toBeNull()
+  })
+
+  test('returns the summary string when checkPrAutofixOutcome reports completed', async () => {
+    const { checkPrAutofixOutcome } = await import('../prFetch.js')
+    ;(checkPrAutofixOutcome as ReturnType<typeof mock>).mockImplementationOnce(
+      () =>
+        Promise.resolve({
+          completed: true,
+          summary: 'acme/myrepo#1002 merged. Autofix monitoring complete.',
+        }),
+    )
+    const checker = getChecker()
+    const result = await checker({
+      owner: 'acme',
+      repo: 'myrepo',
+      prNumber: 1002,
+    })
+    expect(result).toBe('acme/myrepo#1002 merged. Autofix monitoring complete.')
+  })
+
+  test('passes initialHeadSha through to checkPrAutofixOutcome', async () => {
+    const { checkPrAutofixOutcome } = await import('../prFetch.js')
+    const checkMock = checkPrAutofixOutcome as ReturnType<typeof mock>
+    checkMock.mockClear()
+    checkMock.mockImplementationOnce(() =>
+      Promise.resolve({ completed: false }),
+    )
+    const checker = getChecker()
+    await checker({
+      owner: 'acme',
+      repo: 'myrepo',
+      prNumber: 1003,
+      initialHeadSha: 'sha-baseline-xyz',
+    })
+    expect(checkMock).toHaveBeenCalledWith({
+      owner: 'acme',
+      repo: 'myrepo',
+      prNumber: 1003,
+      initialHeadSha: 'sha-baseline-xyz',
+    })
+  })
+
+  test('throttles back-to-back calls for the same PR within CHECK_INTERVAL_MS', async () => {
+    const { checkPrAutofixOutcome } = await import('../prFetch.js')
+    const checkMock = checkPrAutofixOutcome as ReturnType<typeof mock>
+    checkMock.mockClear()
+    checkMock.mockImplementation(() => Promise.resolve({ completed: false }))
+    const checker = getChecker()
+    const meta = { owner: 'acme', repo: 'myrepo', prNumber: 1004 }
+    await checker(meta)
+    // Second call within the 5s throttle window must short-circuit to null
+    // without invoking the gh CLI layer again.
+    const callCountAfterFirst = checkMock.mock.calls.length
+    const result = await checker(meta)
+    expect(result).toBeNull()
+    expect(checkMock.mock.calls.length).toBe(callCountAfterFirst)
+  })
+
+  test('completionHook with metadata clears the throttle entry (re-launch can re-check immediately)', async () => {
+    const { checkPrAutofixOutcome } = await import('../prFetch.js')
+    const checkMock = checkPrAutofixOutcome as ReturnType<typeof mock>
+    checkMock.mockClear()
+    checkMock.mockImplementation(() => Promise.resolve({ completed: false }))
+    const checker = getChecker()
+    const meta = { owner: 'acme', repo: 'myrepo', prNumber: 1005 }
+    await checker(meta) // populate throttle map
+
+    // Invoke the registered completion hook with the same metadata so the
+    // throttle entry is wiped, then verify the next checker call dispatches
+    // gh CLI again instead of short-circuiting.
+    const hookCalls = registerCompletionHookMock.mock.calls.filter(
+      c => c[0] === 'autofix-pr',
+    )
+    const hook = hookCalls[hookCalls.length - 1]?.[1] as (
+      id: string,
+      metadata?: unknown,
+    ) => void
+    hook('any-task-id', meta)
+
+    const callCountBefore = checkMock.mock.calls.length
+    await checker(meta)
+    expect(checkMock.mock.calls.length).toBe(callCountBefore + 1)
+  })
+
+  test('completionHook without metadata still clears the active monitor lock', async () => {
+    // Lock is set via callAutofixPr; hook then invoked with undefined metadata
+    // to exercise the `if (meta)` short-circuit branch (the lock-clear half
+    // still has to run regardless of metadata presence).
+    await callAutofixPr(onDone, makeContext(), '42')
+    expect(getActiveMonitor()).not.toBeNull()
+    const hookCalls = registerCompletionHookMock.mock.calls.filter(
+      c => c[0] === 'autofix-pr',
+    )
+    const hook = hookCalls[hookCalls.length - 1]?.[1] as (
+      id: string,
+      metadata?: unknown,
+    ) => void
+    hook('framework-task-id', undefined)
+    expect(getActiveMonitor()).toBeNull()
+  })
+})
+
+// Phase 3: content extractor wiring + initialMessage tag instruction
+describe('callAutofixPr · Phase 3 content extractor integration', () => {
+  test('registerContentExtractor is called at module load with autofix-pr type', () => {
+    const calls = registerContentExtractorMock.mock.calls.filter(
+      c => c[0] === 'autofix-pr',
+    )
+    expect(calls.length).toBeGreaterThan(0)
+    const extractor = calls[calls.length - 1]?.[1]
+    expect(typeof extractor).toBe('function')
+  })
+
+  test('initialMessage instructs the remote agent to emit an <autofix-result> tag', async () => {
+    await callAutofixPr(onDone, makeContext(), '42')
+    // teleportMock's typed signature has no args, so calls[0] is a
+    // zero-length tuple. We know teleportToRemote is invoked with one
+    // options object, so double-cast through unknown to read the args.
+    const calls = teleportMock.mock.calls as unknown as Array<
+      [{ initialMessage?: string }]
+    >
+    const teleportArgs = calls[0]?.[0]
+    expect(teleportArgs?.initialMessage).toContain('<autofix-result>')
+    expect(teleportArgs?.initialMessage).toContain('</autofix-result>')
+    expect(teleportArgs?.initialMessage).toContain('<ci-status>')
+    expect(teleportArgs?.initialMessage).toContain('<summary>')
+  })
+
+  test('registered extractor returns string for valid log and null for empty', () => {
+    const calls = registerContentExtractorMock.mock.calls.filter(
+      c => c[0] === 'autofix-pr',
+    )
+    const extractor = calls[calls.length - 1]?.[1] as
+      | ((log: unknown[]) => string | null)
+      | undefined
+    expect(extractor).toBeDefined()
+    // Empty log → null
+    expect(extractor?.([])).toBeNull()
+    // Log with assistant text containing tag → returns it
+    const logWithTag = [
+      {
+        type: 'assistant',
+        message: {
+          content: [
+            {
+              type: 'text',
+              text: 'done\n<autofix-result><summary>x</summary></autofix-result>',
+            },
+          ],
+        },
+      },
+    ]
+    expect(extractor?.(logWithTag)).toContain('<autofix-result>')
+  })
+})
+
 // Cover ../index.ts load() — placed in this test file so all the heavy mocks
 // (teleport / detectRepository / RemoteAgentTask / bootstrap-state / analytics /
 // skillDetect) are already registered when load() dynamically imports
--- a/src/commands/autofix-pr/tests/monitorState.test.ts
+++ b/src/commands/autofix-pr/tests/monitorState.test.ts
@@ -5,6 +5,7 @@ import {
  isMonitoring,
  setActiveMonitor,
  trySetActiveMonitor,
+  updateActiveMonitor,
 } from '../monitorState.js'

 function makeState(
@@ -76,4 +77,41 @@ describe('monitorState', () => {
    // First state remains
    expect(getActiveMonitor()?.prNumber).toBe(1)
  })
+
+  test('updateActiveMonitor returns false when no active monitor', () => {
+    expect(updateActiveMonitor({ taskId: 'task-x' })).toBe(false)
+    expect(getActiveMonitor()).toBeNull()
+  })
+
+  test('updateActiveMonitor merges partial fields into the active monitor', () => {
+    setActiveMonitor(makeState({ taskId: 'tentative-uuid' }))
+    expect(updateActiveMonitor({ taskId: 'framework-task-id' })).toBe(true)
+    const after = getActiveMonitor()
+    expect(after?.taskId).toBe('framework-task-id')
+    // Other fields untouched
+    expect(after?.owner).toBe('acme')
+    expect(after?.repo).toBe('myrepo')
+    expect(after?.prNumber).toBe(42)
+  })
+
+  test('updateActiveMonitor with new taskId makes clearActiveMonitor recognise framework taskId', () => {
+    // Reproduce the latent bug scenario: lock acquired with one taskId,
+    // framework assigns a different one. Before the fix, the framework's
+    // clearActiveMonitor(frameworkTaskId) would no-op because guard fails.
+    setActiveMonitor(makeState({ taskId: 'teammate-uuid' }))
+    // Framework cleanup using its own taskId — would fail guard before the fix
+    clearActiveMonitor('framework-uuid')
+    expect(getActiveMonitor()).not.toBeNull()
+    // After updateActiveMonitor swaps the taskId, framework cleanup works
+    updateActiveMonitor({ taskId: 'framework-uuid' })
+    clearActiveMonitor('framework-uuid')
+    expect(getActiveMonitor()).toBeNull()
+  })
+
+  test('updateActiveMonitor does not change abortController identity', () => {
+    const ac = new AbortController()
+    setActiveMonitor(makeState({ abortController: ac, taskId: 'tentative' }))
+    updateActiveMonitor({ taskId: 'updated' })
+    expect(getActiveMonitor()?.abortController).toBe(ac)
+  })
 })
--- a/src/commands/autofix-pr/tests/prOutcomeCheck.test.ts
+++ b/src/commands/autofix-pr/tests/prOutcomeCheck.test.ts
@@ -0,0 +1,193 @@
+import { describe, expect, test } from 'bun:test'
+import {
+  type PrViewPayload,
+  summariseAutofixOutcome,
+} from '../prOutcomeCheck.js'
+
+function basePayload(overrides: Partial<PrViewPayload> = {}): PrViewPayload {
+  return {
+    headRefOid: 'sha-baseline',
+    state: 'OPEN',
+    statusCheckRollup: [],
+    ...overrides,
+  }
+}
+
+const identity = (overrides: Partial<{ initialHeadSha: string }> = {}) => ({
+  owner: 'acme',
+  repo: 'myrepo',
+  prNumber: 42,
+  initialHeadSha: 'sha-baseline',
+  ...overrides,
+})
+
+describe('summariseAutofixOutcome · terminal PR states', () => {
+  test('MERGED → completed regardless of head SHA / CI', () => {
+    const result = summariseAutofixOutcome(
+      basePayload({ state: 'MERGED', headRefOid: 'sha-baseline' }),
+      identity(),
+    )
+    expect(result).toEqual({
+      completed: true,
+      summary: 'acme/myrepo#42 merged. Autofix monitoring complete.',
+    })
+  })
+
+  test('CLOSED → completed regardless of head SHA / CI', () => {
+    const result = summariseAutofixOutcome(
+      basePayload({ state: 'CLOSED' }),
+      identity(),
+    )
+    expect(result).toEqual({
+      completed: true,
+      summary:
+        'acme/myrepo#42 closed without merge. Autofix monitoring complete.',
+    })
+  })
+})
+
+describe('summariseAutofixOutcome · OPEN PR without push', () => {
+  test('no initialHeadSha baseline → not completed (cannot detect push)', () => {
+    const result = summariseAutofixOutcome(
+      basePayload({ state: 'OPEN' }),
+      identity({ initialHeadSha: undefined as unknown as string }),
+    )
+    expect(result).toEqual({ completed: false })
+  })
+
+  test('headRefOid unchanged → not completed (autofix has not pushed yet)', () => {
+    const result = summariseAutofixOutcome(
+      basePayload({ state: 'OPEN', headRefOid: 'sha-baseline' }),
+      identity(),
+    )
+    expect(result).toEqual({ completed: false })
+  })
+})
+
+describe('summariseAutofixOutcome · OPEN PR with push, CI variations', () => {
+  test('push detected + no checks configured → completed (success)', () => {
+    const result = summariseAutofixOutcome(
+      basePayload({
+        state: 'OPEN',
+        headRefOid: 'sha-new',
+        statusCheckRollup: [],
+      }),
+      identity(),
+    )
+    expect(result).toEqual({
+      completed: true,
+      summary: 'Autofix pushed commits to acme/myrepo#42, CI green.',
+    })
+  })
+
+  test('push detected + CI pending → not completed (wait for CI)', () => {
+    const result = summariseAutofixOutcome(
+      basePayload({
+        state: 'OPEN',
+        headRefOid: 'sha-new',
+        statusCheckRollup: [
+          { status: 'IN_PROGRESS', conclusion: null, name: 'ci' },
+          { status: 'COMPLETED', conclusion: 'SUCCESS', name: 'lint' },
+        ],
+      }),
+      identity(),
+    )
+    expect(result).toEqual({ completed: false })
+  })
+
+  test('push detected + CI all green → completed (success summary)', () => {
+    const result = summariseAutofixOutcome(
+      basePayload({
+        state: 'OPEN',
+        headRefOid: 'sha-new',
+        statusCheckRollup: [
+          { status: 'COMPLETED', conclusion: 'SUCCESS', name: 'ci' },
+          { status: 'COMPLETED', conclusion: 'SUCCESS', name: 'lint' },
+        ],
+      }),
+      identity(),
+    )
+    expect(result.completed).toBe(true)
+    if (result.completed) {
+      expect(result.summary).toContain('CI green')
+      expect(result.summary).toContain('acme/myrepo#42')
+    }
+  })
+
+  test('push detected + CI red → completed (failure summary surfaces the red)', () => {
+    const result = summariseAutofixOutcome(
+      basePayload({
+        state: 'OPEN',
+        headRefOid: 'sha-new',
+        statusCheckRollup: [
+          { status: 'COMPLETED', conclusion: 'FAILURE', name: 'ci' },
+          { status: 'COMPLETED', conclusion: 'SUCCESS', name: 'lint' },
+        ],
+      }),
+      identity(),
+    )
+    expect(result.completed).toBe(true)
+    if (result.completed) {
+      expect(result.summary).toContain('CI is failing')
+      expect(result.summary).toContain('1/2 checks failing')
+    }
+  })
+
+  test('statusCheckRollup undefined → treated as no checks configured (success)', () => {
+    // Distinct from empty-array: GitHub omits the field entirely on PRs
+    // without any configured checks. The !rollup branch covers undefined.
+    const result = summariseAutofixOutcome(
+      basePayload({
+        state: 'OPEN',
+        headRefOid: 'sha-new',
+        statusCheckRollup: undefined,
+      }),
+      identity(),
+    )
+    expect(result.completed).toBe(true)
+    if (result.completed) {
+      expect(result.summary).toContain('CI green')
+    }
+  })
+
+  test('check with COMPLETED status but empty conclusion → counted as pending', () => {
+    // Edge case: GitHub sometimes reports a check as COMPLETED with a null/
+    // missing conclusion (in-flight result mid-write). The defensive branch
+    // treats an empty conclusion on a COMPLETED check as pending.
+    const result = summariseAutofixOutcome(
+      basePayload({
+        state: 'OPEN',
+        headRefOid: 'sha-new',
+        statusCheckRollup: [
+          { status: 'COMPLETED', conclusion: null, name: 'ci-in-flight' },
+          { status: 'COMPLETED', conclusion: 'SUCCESS', name: 'lint' },
+        ],
+      }),
+      identity(),
+    )
+    expect(result).toEqual({ completed: false })
+  })
+
+  test('neutral / skipped conclusions count as success (not failure)', () => {
+    const result = summariseAutofixOutcome(
+      basePayload({
+        state: 'OPEN',
+        headRefOid: 'sha-new',
+        statusCheckRollup: [
+          {
+            status: 'COMPLETED',
+            conclusion: 'NEUTRAL',
+            name: 'optional-check',
+          },
+          { status: 'COMPLETED', conclusion: 'SKIPPED', name: 'docs-check' },
+          { status: 'COMPLETED', conclusion: 'SUCCESS', name: 'ci' },
+        ],
+      }),
+      identity(),
+    )
+    expect(result.completed).toBe(true)
+    if (result.completed) {
+      expect(result.summary).toContain('CI green')
+    }
+  })
+})
--- a/src/commands/autofix-pr/extractAutofixResult.ts
+++ b/src/commands/autofix-pr/extractAutofixResult.ts
@@ -0,0 +1,92 @@
+// Extract the <autofix-result> tag from a remote autofix-pr session log.
+//
+// The remote agent emits a structured XML block as its final message
+// (initialMessage in launchAutofixPr.ts instructs it to). The tag carries
+// PR-specific outcome data — commits pushed, files changed, CI status,
+// summary — that the framework's generic "task completed" notification
+// can't convey. We surface it to the local model by injecting the tag
+// verbatim into the message queue (analogous to <remote-review> handling).
+//
+// Resilient to two production realities:
+//   1. The tag may appear in either an assistant text block or a hook
+//      stdout (some autofix skills wrap the final report in a hook).
+//   2. The tag may not appear at all (older agents, truncated runs) —
+//      caller falls back to generic completion notification.
+
+import type {
+  SDKAssistantMessage,
+  SDKMessage,
+} from '../../entrypoints/agentSdkTypes.js'
+
+export const AUTOFIX_RESULT_TAG = 'autofix-result'
+
+const TAG_OPEN = `<${AUTOFIX_RESULT_TAG}>`
+const TAG_CLOSE = `</${AUTOFIX_RESULT_TAG}>`
+
+/**
+ * Walk the session log for an <autofix-result> tag. Returns the full tag
+ * (including delimiters) so the caller can inject it as-is into the
+ * notification; returns null if no tag is present.
+ *
+ * The log is scanned once, newest message first. Each message is checked as:
+ *   1. hook_progress / hook_response stdout (autofix skills that use hooks
+ *      to format the report write the tag here).
+ *   2. Assistant text blocks (agents that don't use hooks write the tag
+ *      inline in their final message).
+ *
+ * The most recent message containing a complete tag wins, whichever source
+ * it came from, so retries within a session don't surface stale results.
+ */
+export function extractAutofixResultFromLog(log: SDKMessage[]): string | null {
+  // Walk backwards so we hit the most recent tag first.
+  for (let i = log.length - 1; i >= 0; i--) {
+    const msg = log[i]
+    if (!msg) continue
+
+    // Hook stdout (system messages of subtype hook_progress / hook_response).
+    if (
+      msg.type === 'system' &&
+      (msg.subtype === 'hook_progress' || msg.subtype === 'hook_response')
+    ) {
+      const stdout = (msg as { stdout?: unknown }).stdout
+      if (typeof stdout === 'string') {
+        const extracted = extractBetween(stdout, TAG_OPEN, TAG_CLOSE)
+        if (extracted) return extracted
+      }
+      continue
+    }
+
+    // Assistant text blocks.
+    if (msg.type === 'assistant') {
+      const content = (msg as SDKAssistantMessage).message?.content
+      if (!content || typeof content === 'string') continue
+      for (const block of content as Array<{ type: string; text?: string }>) {
+        if (block.type !== 'text' || typeof block.text !== 'string') continue
+        if (!block.text.includes(TAG_OPEN)) continue
+        const extracted = extractBetween(block.text, TAG_OPEN, TAG_CLOSE)
+        if (extracted) return extracted
+      }
+    }
+  }
+  return null
+}
+
+// Walks open tags from latest to earliest, returning the first complete
+// open/close pair. Guards against a truncated final tag shadowing an
+// earlier complete pair within the same text block (e.g., a retry wrote a
+// full result, then the model started a second tag that got cut off).
+function extractBetween(
+  text: string,
+  open: string,
+  close: string,
+): string | null {
+  let searchFrom = text.length
+  while (searchFrom >= 0) {
+    const start = text.lastIndexOf(open, searchFrom)
+    if (start === -1) return null
+    const end = text.indexOf(close, start + open.length)
+    if (end !== -1) return text.slice(start, end + close.length)
+    searchFrom = start - 1
+  }
+  return null
+}
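The truncated-tag guard described above `extractBetween` can be illustrated with a standalone sketch. The helper in the diff is module-private, so the function below is a hypothetical re-implementation for illustration only; the name `lastCompletePair` and the sample strings are assumptions, not part of the change.

```typescript
// Hypothetical standalone copy of the module-private helper, for illustration.
function lastCompletePair(
  text: string,
  open: string,
  close: string,
): string | null {
  let searchFrom = text.length
  while (searchFrom >= 0) {
    // Find the latest open tag at or before searchFrom.
    const start = text.lastIndexOf(open, searchFrom)
    if (start === -1) return null
    // Accept it only if a close tag follows it.
    const end = text.indexOf(close, start + open.length)
    if (end !== -1) return text.slice(start, end + close.length)
    // Otherwise step back to an earlier open tag.
    searchFrom = start - 1
  }
  return null
}

const OPEN = '<autofix-result>'
const CLOSE = '</autofix-result>'

// A complete result followed by a second tag that was cut off mid-write.
const complete = `${OPEN}first${CLOSE}`
const truncatedRetry = `${complete} retrying... ${OPEN}second (cut off`
// Two complete results: the later one should win.
const twoComplete = `${OPEN}old${CLOSE} ${OPEN}new${CLOSE}`

const fromTruncated = lastCompletePair(truncatedRetry, OPEN, CLOSE)
const fromTwo = lastCompletePair(twoComplete, OPEN, CLOSE)
const fromNone = lastCompletePair('no tag here', OPEN, CLOSE)
```

In this sketch the truncated second tag is skipped and the earlier complete pair is returned, while two complete pairs resolve to the later one.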
--- a/src/commands/autofix-pr/launchAutofixPr.ts
+++ b/src/commands/autofix-pr/launchAutofixPr.ts
@@ -13,7 +13,11 @@ import {
  checkRemoteAgentEligibility,
  formatPreconditionError,
  getRemoteTaskSessionUrl,
+  registerCompletionChecker,
+  registerCompletionHook,
+  registerContentExtractor,
  registerRemoteAgentTask,
+  type AutofixPrRemoteTaskMetadata,
  type BackgroundRemoteSessionPrecondition,
 } from '../../tasks/RemoteAgentTask/RemoteAgentTask.js'
 import type { LocalJSXCommandCall } from '../../types/command.js'
@@ -26,10 +30,66 @@ import {
  getActiveMonitor,
  isMonitoring,
  trySetActiveMonitor,
+  updateActiveMonitor,
 } from './monitorState.js'
+import { extractAutofixResultFromLog } from './extractAutofixResult.js'
 import { parseAutofixArgs } from './parseArgs.js'
+import { checkPrAutofixOutcome, fetchPrHeadSha } from './prFetch.js'
 import { detectAutofixSkills, formatSkillsHint } from './skillDetect.js'

+// Throttle map for the completionChecker: gh CLI is called at most once per
+// PR per CHECK_INTERVAL_MS, regardless of the framework's 1s poll cadence.
+// Key is `${owner}/${repo}#${prNumber}`. Cleared when the completion hook
+// fires so a re-launched monitor starts with a fresh budget.
+const lastCheckAt = new Map<string, number>()
+const CHECK_INTERVAL_MS = 5_000
+
+function throttleKey(meta: AutofixPrRemoteTaskMetadata): string {
+  return `${meta.owner}/${meta.repo}#${meta.prNumber}`
+}
+
+// Register the completionChecker once at module load. The framework calls it
+// on every poll tick for tasks with remoteTaskType==='autofix-pr'; throttle
+// inside so we don't fire gh CLI 60×/min. Returns the summary string on
+// completion (becomes the task-notification body) or null to keep polling.
+registerCompletionChecker('autofix-pr', async metadata => {
+  const meta = metadata as AutofixPrRemoteTaskMetadata | undefined
+  if (!meta) return null
+
+  const key = throttleKey(meta)
+  const now = Date.now()
+  if (now - (lastCheckAt.get(key) ?? 0) < CHECK_INTERVAL_MS) return null
+  lastCheckAt.set(key, now)
+
+  const result = await checkPrAutofixOutcome({
+    owner: meta.owner,
+    repo: meta.repo,
+    prNumber: meta.prNumber,
+    initialHeadSha: meta.initialHeadSha,
+  })
+  return result.completed ? result.summary : null
+})
+
+// Release the singleton monitor lock when the framework transitions the
+// autofix task to a terminal state. Without this, the lock — keyed by the
+// framework-assigned taskId (after callAutofixPr's updateActiveMonitor swap)
+// — would dangle past natural completion, blocking subsequent /autofix-pr
+// invocations until the process restarts. Registered at module load; the
+// framework's runCompletionHook invokes it once per terminal transition.
+// Also clear the per-PR throttle entry so a re-launch starts fresh.
+registerCompletionHook('autofix-pr', (taskId, metadata) => {
+  clearActiveMonitor(taskId)
+  const meta = metadata as AutofixPrRemoteTaskMetadata | undefined
+  if (meta) lastCheckAt.delete(throttleKey(meta))
+})
+
+// Phase 3 content return: extract the <autofix-result> tag from the session
+// log so the local model sees the agent's structured outcome (commits
+// pushed, files changed, CI status) inline in the completion task-
+// notification — instead of just a file-path pointer. The framework falls
+// back to the generic notification if extraction returns null.
+registerContentExtractor('autofix-pr', log => extractAutofixResultFromLog(log))
+
 function makeErrorText(message: string, code: string): string {
  logEvent('tengu_autofix_pr_result', {
    result:
@@ -198,7 +258,23 @@ export const callAutofixPr: LocalJSXCommandCall = async (
    // 4.5 compose message
    const target = `${owner}/${repo}#${prNumber}`
    const branchName = `refs/pull/${prNumber}/head`
-    const initialMessage = `Auto-fix failing CI checks on PR #${prNumber} in ${owner}/${repo}.${skillsHint}`
+    const initialMessage = `Auto-fix failing CI checks on PR #${prNumber} in ${owner}/${repo}.${skillsHint}
+
+When you finish (or hit a blocker you can't recover from), output the following XML tag as your final message so the local user gets a structured summary:
+
+<autofix-result>
+  <pr-number>${prNumber}</pr-number>
+  <commits-pushed>
+    <commit sha="...">commit message</commit>
+  </commits-pushed>
+  <files-changed>
+    <file path="...">N changes</file>
+  </files-changed>
+  <ci-status>green | red | pending | unknown</ci-status>
+  <summary>One-sentence summary of what was fixed or why it could not be fixed.</summary>
+</autofix-result>
+
+If no fix was needed, omit <commits-pushed> and <files-changed> and explain in <summary>. If you only attempted partial work, list the commits you did push and explain the remainder in <summary>.`

    // 4.6 in-process teammate
    const teammate = createAutofixTeammate(initialMessage, target)
@@ -274,18 +350,35 @@ export const callAutofixPr: LocalJSXCommandCall = async (
      return null
    }

+    // 4.8b capture PR head SHA before registering so the completionChecker
+    // can detect when the agent has pushed new commits. Best-effort — if gh
+    // is unavailable or the call fails, leave initialHeadSha undefined and
+    // the checker falls back to terminal-state-only completion (closed /
+    // merged). Never fail the launch on this; teleport succeeded already.
+    const initialHeadSha =
+      (await fetchPrHeadSha(owner, repo, prNumber).catch(() => null)) ??
+      undefined
+
    // 4.9 register task. If this throws, release the lock so the user can
    // retry — the remote CCR session is already created so we surface a
    // dedicated error code.
+    //
+    // After registration succeeds, swap the lock's taskId from the tentative
+    // teammate UUID (used to acquire the lock atomically before teleport) to
+    // the framework-assigned taskId. Without this swap, the framework's own
+    // cleanup path (clearActiveMonitor(frameworkTaskId) on natural completion)
+    // would no-op against a lock keyed by teammate.taskId, leaving the
+    // singleton lock dangling and blocking future /autofix-pr invocations.
    try {
-      registerRemoteAgentTask({
+      const { taskId: frameworkTaskId } = registerRemoteAgentTask({
        remoteTaskType: 'autofix-pr',
        session,
        command: `/autofix-pr ${prNumber}`,
        context,
        isLongRunning: true,
-        remoteTaskMetadata: { owner, repo, prNumber },
+        remoteTaskMetadata: { owner, repo, prNumber, initialHeadSha },
      })
+      updateActiveMonitor({ taskId: frameworkTaskId })
    } catch (regErr: unknown) {
      clearActiveMonitor(teammate.taskId)
      const regMsg = regErr instanceof Error ? regErr.message : String(regErr)
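The per-PR throttle that the completionChecker in this file applies can be sketched in isolation. This is a minimal sketch under assumptions: `makeThrottle`, `shouldRun`, and `reset` are hypothetical names, and the clock is passed in as a parameter (the diff reads `Date.now()` directly) so the behaviour is deterministic.

```typescript
// Hypothetical per-key throttle; the caller injects `now` instead of Date.now().
function makeThrottle(intervalMs: number) {
  const lastAt = new Map<string, number>()
  return {
    // True when the key has not run within the last intervalMs.
    shouldRun(key: string, now: number): boolean {
      const last = lastAt.get(key)
      if (last !== undefined && now - last < intervalMs) return false
      lastAt.set(key, now)
      return true
    },
    // Mirrors the completion hook clearing the entry so a re-launch starts fresh.
    reset(key: string): void {
      lastAt.delete(key)
    },
  }
}

const throttle = makeThrottle(5_000)
const key = 'acme/myrepo#42'

const decisions = [
  throttle.shouldRun(key, 10_000), // first tick runs
  throttle.shouldRun(key, 11_000), // 1s later: inside the window, skipped
  throttle.shouldRun(key, 15_000), // 5s after the last run: runs again
]

throttle.reset(key)
const afterReset = throttle.shouldRun(key, 15_500) // fresh budget after reset
```

With a 1s poll cadence and a 5s interval, this shape lets roughly one tick in five reach the expensive call, and the reset gives a re-launched monitor an immediate first check.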
--- a/src/commands/autofix-pr/monitorState.ts
+++ b/src/commands/autofix-pr/monitorState.ts
@@ -46,6 +46,20 @@ export function clearActiveMonitor(taskId?: string): void {
  active = null
 }

+/**
+ * Atomically merges partial updates into the active monitor. Returns true if
+ * applied, false if no active monitor. Used when the caller needs to swap the
+ * lock's taskId after the framework assigns a different one than the
+ * tentative one used to acquire the lock — without this the framework's
+ * cleanup (clearActiveMonitor with the framework taskId) would no-op against
+ * a lock keyed by the caller's tentative id.
+ */
+export function updateActiveMonitor(partial: Partial<MonitorState>): boolean {
+  if (!active) return false
+  active = { ...active, ...partial }
+  return true
+}
+
 export function isMonitoring(
  owner: string,
  repo: string,
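The taskId swap that `updateActiveMonitor` enables can be illustrated with a small model of the singleton lock. This is a sketch under assumptions: the diff shows only part of monitorState.ts, so the `MonitorState` fields, the body of `trySetActiveMonitor`, the id-mismatch check in `clearActiveMonitor`, and the `isHeld` helper are all hypothetical stand-ins chosen to match the behaviour the comments describe.

```typescript
// Hypothetical model of the singleton monitor lock; field names are assumed.
interface MonitorState {
  taskId: string
  owner: string
  repo: string
  prNumber: number
}

let active: MonitorState | null = null

function trySetActiveMonitor(state: MonitorState): boolean {
  if (active) return false
  active = state
  return true
}

// Assumed behaviour: clearing with a non-matching taskId is a no-op.
function clearActiveMonitor(taskId?: string): void {
  if (taskId !== undefined && active?.taskId !== taskId) return
  active = null
}

function updateActiveMonitor(partial: Partial<MonitorState>): boolean {
  if (!active) return false
  active = { ...active, ...partial }
  return true
}

function isHeld(): boolean {
  return active !== null
}

// Lock acquired with a tentative id before the framework assigns its own.
trySetActiveMonitor({
  taskId: 'tentative-uuid',
  owner: 'acme',
  repo: 'myrepo',
  prNumber: 42,
})

// Framework cleanup with its own id does nothing while the ids differ.
clearActiveMonitor('framework-id')
const heldBeforeSwap = isHeld()

// After the swap, the same cleanup call releases the lock.
const swapped = updateActiveMonitor({ taskId: 'framework-id' })
clearActiveMonitor('framework-id')
const heldAfterSwap = isHeld()
```

Under these assumptions, the lock stays held when cleanup uses an id that does not match, which is the dangling-lock case the swap is meant to prevent.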
--- a/src/commands/autofix-pr/prFetch.ts
+++ b/src/commands/autofix-pr/prFetch.ts
@@ -0,0 +1,155 @@
+// gh CLI integration for autofix-pr: fetches PR snapshots and feeds them
+// through the pure decision matrix in prOutcomeCheck.ts. Kept separate so
+// tests of the decision matrix never have to mock node:child_process — and
+// tests of callAutofixPr can mock this module without polluting the pure
+// decision matrix module (Bun mock.module is process-global).
+
+import { spawn } from 'node:child_process'
+import {
+  type AutofixOutcomeProbeResult,
+  type PrViewPayload,
+  summariseAutofixOutcome,
+} from './prOutcomeCheck.js'
+
+export interface AutofixOutcomeProbeInput {
+  owner: string
+  repo: string
+  prNumber: number
+  /**
+   * Head commit SHA captured at /autofix-pr launch. When this differs from
+   * the current head, autofix has pushed at least one commit.
+   */
+  initialHeadSha?: string
+  /**
+   * Timeout for the gh CLI invocation. The caller is the framework's
+   * per-tick poller, so each call's duration must be bounded: a hung gh
+   * process would stall the entire poll loop.
+   */
+  timeoutMs?: number
+}
+
+const DEFAULT_TIMEOUT_MS = 5_000
+
+/**
+ * Fetch the PR's current head SHA, state, and CI rollup, and decide whether
+ * autofix has finished. Returns `{ completed: true, summary }` if so;
+ * otherwise `{ completed: false }`. Never throws.
+ */
+export async function checkPrAutofixOutcome(
+  input: AutofixOutcomeProbeInput,
+): Promise<AutofixOutcomeProbeResult> {
+  const { owner, repo, prNumber, initialHeadSha, timeoutMs } = input
+
+  let payload: PrViewPayload
+  try {
+    payload = await runGhPrView(
+      owner,
+      repo,
+      prNumber,
+      timeoutMs ?? DEFAULT_TIMEOUT_MS,
+    )
+  } catch {
+    return { completed: false }
+  }
+
+  return summariseAutofixOutcome(payload, {
+    owner,
+    repo,
+    prNumber,
+    initialHeadSha,
+  })
+}
+
+/**
+ * Resolve the PR's current head commit SHA. Used at /autofix-pr launch to
+ * capture a baseline; later compared against the live SHA to detect pushes.
+ * Returns null on any failure (network, missing gh, permissions) — the
+ * caller treats null as "no baseline" and falls back to terminal-state-only
+ * completion detection.
+ */
+export async function fetchPrHeadSha(
+  owner: string,
+  repo: string,
+  prNumber: number,
+  timeoutMs = DEFAULT_TIMEOUT_MS,
+): Promise<string | null> {
+  try {
+    const payload = await runGhPrView(owner, repo, prNumber, timeoutMs)
+    return payload.headRefOid || null
+  } catch {
+    return null
+  }
+}
+
+interface SpawnError extends Error {
+  code?: string
+}
+
+/**
+ * Spawn `gh pr view {n} --repo {owner}/{repo} --json ...` and parse the
+ * result. Rejects on spawn error, non-zero exit, timeout, or parse failure.
+ */
+function runGhPrView(
+  owner: string,
+  repo: string,
+  prNumber: number,
+  timeoutMs: number,
+): Promise<PrViewPayload> {
+  return new Promise((resolve, reject) => {
+    const proc = spawn(
+      'gh',
+      [
+        'pr',
+        'view',
+        String(prNumber),
+        '--repo',
+        `${owner}/${repo}`,
+        '--json',
+        'headRefOid,state,statusCheckRollup',
+      ],
+      { stdio: ['ignore', 'pipe', 'pipe'] },
+    )
+    const stdoutChunks: Buffer[] = []
+    const stderrChunks: Buffer[] = []
+    let settled = false
+
+    const timer = setTimeout(() => {
+      if (settled) return
+      settled = true
+      proc.kill('SIGKILL')
+      reject(new Error(`gh pr view timed out after ${timeoutMs}ms`))
+    }, timeoutMs)
+
+    proc.stdout.on('data', chunk => stdoutChunks.push(chunk as Buffer))
+    proc.stderr.on('data', chunk => stderrChunks.push(chunk as Buffer))
+
+    proc.on('error', (err: SpawnError) => {
+      if (settled) return
+      settled = true
+      clearTimeout(timer)
+      reject(err)
+    })
+
+    proc.on('close', code => {
+      if (settled) return
+      settled = true
+      clearTimeout(timer)
+      if (code !== 0) {
+        const stderr = Buffer.concat(stderrChunks).toString('utf8').trim()
+        reject(
+          new Error(`gh pr view exited ${code}: ${stderr || '<no stderr>'}`),
+        )
+        return
+      }
+      const stdout = Buffer.concat(stdoutChunks).toString('utf8').trim()
+      try {
+        const parsed = JSON.parse(stdout) as PrViewPayload
+        resolve(parsed)
+      } catch (e) {
+        reject(
+          new Error(`gh pr view JSON parse failed: ${(e as Error).message}`),
+        )
+      }
+    })
+  })
+}
--- a/src/commands/autofix-pr/prOutcomeCheck.ts
+++ b/src/commands/autofix-pr/prOutcomeCheck.ts
@@ -0,0 +1,123 @@
+// Pure decision matrix for autofix-pr completion detection.
+//
+// Given a snapshot of the PR (state, head SHA, CI rollup) and a baseline
+// head SHA captured at /autofix-pr launch, decide whether autofix has
+// finished. No side effects — extracted from the gh CLI invocation in
+// prFetch.ts so unit tests can exercise every branch without spawning
+// subprocesses.
+
+export type AutofixOutcomeProbeResult =
+  | { completed: true; summary: string }
+  | { completed: false }
+
+export interface PrViewPayload {
+  headRefOid: string
+  state: 'OPEN' | 'CLOSED' | 'MERGED'
+  statusCheckRollup?: Array<{
+    conclusion?: string | null
+    status?: string | null
+    name?: string
+  }>
+}
+
+export interface AutofixOutcomeIdentity {
+  owner: string
+  repo: string
+  prNumber: number
+  /**
+   * Head commit SHA captured at /autofix-pr launch. When this differs from
+   * the current head, autofix has pushed at least one commit. Optional —
+   * absence means we can only finish on terminal PR states (merged/closed).
+   */
+  initialHeadSha?: string
+}
+
+/**
+ * Pure judgement of whether autofix has finished, given a PR snapshot and
+ * the baseline head SHA. Decision matrix:
+ *   - MERGED                         → done (merged)
+ *   - CLOSED (not merged)            → done (closed without fix)
+ *   - OPEN, no baseline              → keep polling
+ *   - OPEN, head unchanged           → keep polling (agent hasn't pushed)
+ *   - OPEN, head changed, CI pending → keep polling (wait for CI)
+ *   - OPEN, head changed, CI failure → done (surface red so user can retry)
+ *   - OPEN, head changed, CI success → done (clean fix)
+ */
+export function summariseAutofixOutcome(
+  payload: PrViewPayload,
+  identity: AutofixOutcomeIdentity,
+): AutofixOutcomeProbeResult {
+  const { owner, repo, prNumber, initialHeadSha } = identity
+
+  if (payload.state === 'MERGED') {
+    return {
+      completed: true,
+      summary: `${owner}/${repo}#${prNumber} merged. Autofix monitoring complete.`,
+    }
+  }
+  if (payload.state === 'CLOSED') {
+    return {
+      completed: true,
+      summary: `${owner}/${repo}#${prNumber} closed without merge. Autofix monitoring complete.`,
+    }
+  }
+
+  if (!initialHeadSha) return { completed: false }
+  if (payload.headRefOid === initialHeadSha) return { completed: false }
+
+  const ciState = summariseCiRollup(payload.statusCheckRollup)
+  if (ciState.state === 'pending') return { completed: false }
+  if (ciState.state === 'failure') {
+    return {
+      completed: true,
+      summary: `Autofix pushed commits to ${owner}/${repo}#${prNumber} but CI is failing (${ciState.detail}).`,
+    }
+  }
+  return {
+    completed: true,
+    summary: `Autofix pushed commits to ${owner}/${repo}#${prNumber}, CI green.`,
+  }
+}
+
+interface CiSummary {
+  state: 'success' | 'pending' | 'failure'
+  detail: string
+}
+
+function summariseCiRollup(
+  rollup: PrViewPayload['statusCheckRollup'],
+): CiSummary {
+  if (!rollup || rollup.length === 0) {
+    // No checks configured on this repo — treat as success so completion
+    // can fire on push alone. PRs without CI are perfectly valid.
+    return { state: 'success', detail: 'no checks configured' }
+  }
+  let pending = 0
+  let failed = 0
+  const total = rollup.length
+  for (const check of rollup) {
+    const status = (check.status ?? '').toUpperCase()
+    const conclusion = (check.conclusion ?? '').toUpperCase()
+    if (status && status !== 'COMPLETED') {
+      pending++
+      continue
+    }
+    if (
+      conclusion === 'SUCCESS' ||
+      conclusion === 'NEUTRAL' ||
+      conclusion === 'SKIPPED'
+    ) {
+      continue
+    }
+    if (conclusion === '') {
+      pending++
+      continue
+    }
+    failed++
+  }
+  if (pending > 0)
+    return { state: 'pending', detail: `${pending}/${total} checks pending` }
+  if (failed > 0)
+    return { state: 'failure', detail: `${failed}/${total} checks failing` }
+  return { state: 'success', detail: `${total}/${total} checks passing` }
+}
--- a/src/commands/effort/effort.tsx
+++ b/src/commands/effort/effort.tsx
@@ -155,7 +155,7 @@ export async function call(onDone: LocalJSXCommandOnDone, _context: unknown, arg

  if (COMMON_HELP_ARGS.includes(args)) {
    onDone(
-      'Usage: /effort [low|medium|high|xhigh|max|auto]\n\nEffort levels:\n- low: Quick, straightforward implementation\n- medium: Balanced approach with standard testing\n- high: Comprehensive implementation with extensive testing\n- xhigh: Extra high reasoning for supported models, including ChatGPT Codex models\n- max: Maximum capability with deepest reasoning where supported (Opus 4.6/4.7, DeepSeek V4 Pro); maps to xhigh for ChatGPT Codex models\n- auto: Use the default effort level for your model',
+      'Usage: /effort [low|medium|high|xhigh|max|auto]\n\nEffort levels:\n- low: Quick, straightforward implementation\n- medium: Balanced approach with standard testing\n- high: Comprehensive implementation with extensive testing\n- xhigh: Extended reasoning beyond high, short of max, for supported models including ChatGPT Codex models\n- max: Maximum capability with deepest reasoning; maps to xhigh for ChatGPT Codex models\n- auto: Use the default effort level for your model',
    );
    return;
  }
--- a/src/components/BypassPermissionsModeDialog.tsx
+++ b/src/components/BypassPermissionsModeDialog.tsx
@@ -11,6 +11,18 @@ type Props = {
 };

 export function BypassPermissionsModeDialog({ onAccept }: Props): React.ReactNode {
+  const [pendingExitCode, setPendingExitCode] = React.useState<number | null>(null);
+
+  // Clear screen before shutdown so residual dialog content doesn't leak
+  // to the terminal. Deferred to next tick so Ink flushes the null render.
+  React.useEffect(() => {
+    if (pendingExitCode !== null) {
+      const code = pendingExitCode;
+      const timer = setTimeout(() => gracefulShutdownSync(code));
+      return () => clearTimeout(timer);
+    }
+  }, [pendingExitCode]);
+
  React.useEffect(() => {
    logEvent('tengu_bypass_permissions_mode_dialog_shown', {});
  }, []);
@@ -27,16 +39,20 @@ export function BypassPermissionsModeDialog({ onAccept }: Props): React.ReactNod
        break;
      }
      case 'decline': {
-        gracefulShutdownSync(1);
+        setPendingExitCode(1);
        break;
      }
    }
  }

  const handleEscape = useCallback(() => {
-    gracefulShutdownSync(0);
+    setPendingExitCode(0);
  }, []);

+  if (pendingExitCode !== null) {
+    return null;
+  }
+
  return (
    <Dialog title="WARNING: Claude Code running in Bypass Permissions mode" color="error" onCancel={handleEscape}>
      <Box flexDirection="column" gap={1}>
--- a/src/components/DevChannelsDialog.tsx
+++ b/src/components/DevChannelsDialog.tsx
@@ -10,21 +10,37 @@ type Props = {
 };

 export function DevChannelsDialog({ channels, onAccept }: Props): React.ReactNode {
+  const [pendingExitCode, setPendingExitCode] = React.useState<number | null>(null);
+
+  // Clear screen before shutdown so residual dialog content doesn't leak
+  // to the terminal. Deferred to next tick so Ink flushes the null render.
+  React.useEffect(() => {
+    if (pendingExitCode !== null) {
+      const code = pendingExitCode;
+      const timer = setTimeout(() => gracefulShutdownSync(code));
+      return () => clearTimeout(timer);
+    }
+  }, [pendingExitCode]);
+
  function onChange(value: 'accept' | 'exit') {
    switch (value) {
      case 'accept':
        onAccept();
        break;
      case 'exit':
-        gracefulShutdownSync(1);
+        setPendingExitCode(1);
        break;
    }
  }

  const handleEscape = useCallback(() => {
-    gracefulShutdownSync(0);
+    setPendingExitCode(0);
  }, []);

+  if (pendingExitCode !== null) {
+    return null;
+  }
+
  return (
    <Dialog title="WARNING: Loading development channels" color="error" onCancel={handleEscape}>
      <Box flexDirection="column" gap={1}>
--- a/src/components/FileEditToolDiff.tsx
+++ b/src/components/FileEditToolDiff.tsx
@@ -4,7 +4,7 @@ import { Suspense, use, useState } from 'react';
 import { useTerminalSize } from '../hooks/useTerminalSize.js';
 import { Box, Text } from '@anthropic/ink';
 import type { FileEdit } from '@claude-code-best/builtin-tools/tools/FileEditTool/types.js';
-import { findActualString, preserveQuoteStyle } from '@claude-code-best/builtin-tools/tools/FileEditTool/utils.js';
+import { findActualString } from '@claude-code-best/builtin-tools/tools/FileEditTool/utils.js';
 import { adjustHunkLineNumbers, CONTEXT_LINES, getPatchForDisplay } from '../utils/diff.js';
 import { logError } from '../utils/log.js';
 import { CHUNK_SIZE, openForScan, readCapped, scanForContext } from '../utils/readEditContext.js';
@@ -135,6 +135,5 @@ function diffToolInputsOnly(filePath: string, edits: FileEdit[]): DiffData {

 function normalizeEdit(fileContent: string, edit: FileEdit): FileEdit {
  const actualOld = findActualString(fileContent, edit.old_string) || edit.old_string;
-  const actualNew = preserveQuoteStyle(edit.old_string, actualOld, edit.new_string);
-  return { ...edit, old_string: actualOld, new_string: actualNew };
+  return { ...edit, old_string: actualOld };
 }
--- a/src/components/Messages.tsx
+++ b/src/components/Messages.tsx
@@ -798,9 +798,7 @@ const MessagesImpl = ({

    // Collapse diffs for messages beyond the latest N messages.
    // verbose (ctrl+o) overrides and always shows full diffs.
-    // 0 was too aggressive — tool results are never the last message (assistant
-    // text follows), so diffs were always collapsed.  3 keeps recent edits visible.
-    const DIFF_COLLAPSE_DISTANCE = 3;
+    const DIFF_COLLAPSE_DISTANCE = 0;
    const shouldCollapseDiffs = renderableMessages.length - 1 - index > DIFF_COLLAPSE_DISTANCE;

    const k = messageKey(msg);
--- a/src/components/TrustDialog/TrustDialog.tsx
+++ b/src/components/TrustDialog/TrustDialog.tsx
@@ -80,6 +80,21 @@ export function TrustDialog({ onDone, commands }: Props): React.ReactNode {
  const hasAnyBashExecution = bashSettingSources.length > 0 || hasSlashCommandBash || hasSkillsBash;

  const hasTrustDialogAccepted = checkHasTrustDialogAccepted();
+  const [pendingExitCode, setPendingExitCode] = React.useState<number | null>(null);
+
+  // When a non-null exit code is set, render null (clear screen) first,
+  // then trigger shutdown in the next tick so Ink has time to flush
+  // the empty frame before cleanupTerminalModes() unmounts and exits
+  // the alt screen. Without this deferral, gracefulShutdownSync starts
+  // async cleanup immediately after React commit, racing the reconciler
+  // and leaving residual TrustDialog output on the terminal.
+  React.useEffect(() => {
+    if (pendingExitCode !== null) {
+      const code = pendingExitCode;
+      const timer = setTimeout(() => gracefulShutdownSync(code));
+      return () => clearTimeout(timer);
+    }
+  }, [pendingExitCode]);

  React.useEffect(() => {
    const isHomeDir = homedir() === getCwd();
@@ -107,7 +122,12 @@ export function TrustDialog({ onDone, commands }: Props): React.ReactNode {

  function onChange(value: 'enable_all' | 'exit') {
    if (value === 'exit') {
-      gracefulShutdownSync(1);
+      // Set pendingExitCode to clear the screen before triggering shutdown.
+      // The useEffect above defers gracefulShutdownSync to the next tick
+      // so Ink can flush the empty frame first — otherwise
+      // cleanupTerminalModes races React's re-render and leaves
+      // residual TrustDialog content on the terminal.
+      setPendingExitCode(1);
      return;
    }

@@ -151,17 +171,23 @@ export function TrustDialog({ onDone, commands }: Props): React.ReactNode {
  // so the default would hang the await forever. With keybinding
  // customization enabled, the chokidar watcher (persistent: true) keeps the
  // event loop alive and the process freezes. Explicitly exit 1 like "No".
-  const exitState = useExitOnCtrlCDWithKeybindings(() => gracefulShutdownSync(1));
+  const exitState = useExitOnCtrlCDWithKeybindings(() => setPendingExitCode(1));

  // Use configurable keybinding for ESC to cancel/exit
  useKeybinding(
    'confirm:no',
    () => {
-      gracefulShutdownSync(0);
+      setPendingExitCode(0);
    },
    { context: 'Confirmation' },
  );

+  // When pendingExitCode is set, render nothing so the screen is cleared
+  // before shutdown cleans up the alt screen.  See the useEffect above.
+  if (pendingExitCode !== null) {
+    return null;
+  }
+
  // Automatically resolve the trust dialog if there is nothing to be shown.
  if (hasTrustDialogAccepted) {
    setTimeout(onDone);
--- a/src/constants/prompts.ts
+++ b/src/constants/prompts.ts
@@ -82,10 +82,11 @@ const BRIEF_PROACTIVE_SECTION: string | null =
        require('@claude-code-best/builtin-tools/tools/BriefTool/prompt.js') as typeof import('@claude-code-best/builtin-tools/tools/BriefTool/prompt.js')
      ).BRIEF_PROACTIVE_SECTION
    : null
-const briefToolModule =
-  feature('KAIROS') || feature('KAIROS_BRIEF')
+function getBriefToolModule() {
+  return feature('KAIROS') || feature('KAIROS_BRIEF')
    ? (require('@claude-code-best/builtin-tools/tools/BriefTool/BriefTool.js') as typeof import('@claude-code-best/builtin-tools/tools/BriefTool/BriefTool.js'))
    : null
+}
 const DISCOVER_SKILLS_TOOL_NAME: string | null = feature(
  'EXPERIMENTAL_SKILL_SEARCH',
 )
@@ -800,7 +801,7 @@ function getBriefSection(): string | null {
  // Whenever the tool is available, the model is told to use it. The
  // /brief toggle and --brief flag now only control the isBriefOnly
  // display filter — they no longer gate model-facing behavior.
-  if (!briefToolModule?.isBriefEnabled()) return null
+  if (!getBriefToolModule()?.isBriefEnabled()) return null
  // When proactive is active, getProactiveSection() already appends the
  // section inline. Skip here to avoid duplicating it in the system prompt.
  if (
@@ -864,5 +865,5 @@ Do not narrate each step, list every file you read, or explain routine actions.

 The user context may include a \`terminalFocus\` field indicating whether the user's terminal is focused or unfocused. Use this to calibrate how autonomous you are:
 - **Unfocused**: The user is away. Lean heavily into autonomous action — make decisions, explore, commit, push. Only pause for genuinely irreversible or high-risk actions.
- **Focused**: The user is watching. Be more collaborative — surface choices, ask before committing to large changes, and keep your output concise so it's easy to follow in real time.${BRIEF_PROACTIVE_SECTION && briefToolModule?.isBriefEnabled() ? `\n\n${BRIEF_PROACTIVE_SECTION}` : ''}`
+- **Focused**: The user is watching. Be more collaborative — surface choices, ask before committing to large changes, and keep your output concise so it's easy to follow in real time.${BRIEF_PROACTIVE_SECTION && getBriefToolModule()?.isBriefEnabled() ? `\n\n${BRIEF_PROACTIVE_SECTION}` : ''}`
 }
--- a/src/context/voice.tsx
+++ b/src/context/voice.tsx
@@ -19,7 +19,7 @@ const DEFAULT_STATE: VoiceState = {

 type VoiceStore = Store<VoiceState>;

-export const VoiceContext = createContext<VoiceStore | null>(null);
+const VoiceContext = createContext<VoiceStore | null>(null);

 type Props = {
  children: React.ReactNode;
--- a/src/entrypoints/cli.tsx
+++ b/src/entrypoints/cli.tsx
@@ -146,7 +146,7 @@ async function main(): Promise<void> {
        shutdown1PEventLogging,
        logForDebugging,
        registerPermissionHandler(server, handler) {
-          server.setNotificationHandler(ChannelPermissionRequestNotificationSchema(), async notification =>
+          server.setNotificationHandler(ChannelPermissionRequestNotificationSchema() as any, async notification =>
            handler(notification.params),
          );
        },
--- a/src/hooks/useIdeAtMentioned.ts
+++ b/src/hooks/useIdeAtMentioned.ts
@@ -47,7 +47,7 @@ export function useIdeAtMentioned(
    // If we found a connected IDE client, register our handler
    if (ideClient) {
      ideClient.client.setNotificationHandler(
-        AtMentionedSchema(),
+        AtMentionedSchema() as any,
        notification => {
          if (ideClientRef.current !== ideClient) {
            return
--- a/src/hooks/useIdeLogging.ts
+++ b/src/hooks/useIdeLogging.ts
@@ -27,7 +27,7 @@ export function useIdeLogging(mcpClients: MCPServerConnection[]): void {
    if (ideClient) {
      // Register the log event handler
      ideClient.client.setNotificationHandler(
-        LogEventSchema(),
+        LogEventSchema() as any,
        notification => {
          const { eventName, eventData } = notification.params
          logEvent(
--- a/src/hooks/useIdeSelection.ts
+++ b/src/hooks/useIdeSelection.ts
@@ -110,7 +110,7 @@ export function useIdeSelection(

    // Register notification handler for selection_changed events
    ideClient.client.setNotificationHandler(
-      SelectionChangedSchema(),
+      SelectionChangedSchema() as any,
      notification => {
        if (currentIDERef.current !== ideClient) {
          return
--- a/src/hooks/usePromptsFromClaudeInChrome.tsx
+++ b/src/hooks/usePromptsFromClaudeInChrome.tsx
@@ -48,7 +48,7 @@ export function usePromptsFromClaudeInChrome(
    }

    if (mcpClient) {
-      mcpClient.client.setNotificationHandler(ClaudeInChromePromptNotificationSchema(), notification => {
+      mcpClient.client.setNotificationHandler(ClaudeInChromePromptNotificationSchema() as any, notification => {
        if (mcpClientRef.current !== mcpClient) {
          return;
        }
--- a/src/services/mcp/useManageMCPConnections.ts
+++ b/src/services/mcp/useManageMCPConnections.ts
@@ -504,7 +504,7 @@ export function useManageMCPConnections(
            case 'register':
              logMCPDebug(client.name, 'Channel notifications registered')
              client.client.setNotificationHandler(
-                ChannelMessageNotificationSchema(),
+                ChannelMessageNotificationSchema() as any,
                async notification => {
                  const { content, meta } = notification.params
                  logMCPDebug(
@@ -539,7 +539,7 @@ export function useManageMCPConnections(
                client.capabilities?.experimental?.['claude/channel/permission']
              ) {
                client.client.setNotificationHandler(
-                  ChannelPermissionNotificationSchema(),
+                  ChannelPermissionNotificationSchema() as any,
                  async notification => {
                    const { request_id, behavior } = notification.params
                    const resolved =
--- a/src/services/mcp/vscodeSdkMcp.ts
+++ b/src/services/mcp/vscodeSdkMcp.ts
@@ -69,7 +69,7 @@ export function setupVscodeSdkMcp(sdkClients: MCPServerConnection[]): void {
    vscodeMcpClient = client

    client.client.setNotificationHandler(
-      LogEventNotificationSchema(),
+      LogEventNotificationSchema() as any,
      async notification => {
        const { eventName, eventData } = notification.params
        logEvent(
--- a/src/services/skillSearch/localSearch.ts
+++ b/src/services/skillSearch/localSearch.ts
@@ -385,7 +385,7 @@ export function searchSkills(
  index: SkillIndexEntry[],
  limit = 5,
 ): SearchResult[] {
-  if (index.length === 0 || !query.trim()) return []
+  if (index.length === 0 || !query?.trim()) return []

  const queryTokens = tokenizeAndStem(query)
  if (queryTokens.length === 0) return []
@@ -397,7 +397,7 @@ export function searchSkills(
  for (const v of freq.values()) if (v > max) max = v
  for (const [term, count] of freq) queryTf.set(term, count / max)

-  const idf = cachedIdf ?? computeIdf(index)
+  const idf = cachedIndex === index && cachedIdf ? cachedIdf : computeIdf(index)
  const queryTfIdf = new Map<string, number>()
  for (const [term, tf] of queryTf) {
    queryTfIdf.set(term, tf * (idf.get(term) ?? 0))
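The `cachedIdf` change above guards against reusing an IDF table that was computed for a different index array. A minimal sketch of that identity check follows. The `Entry` shape, `buildCache`, `idfFor`, and this `computeIdf` body are assumptions made for illustration and are not taken from `localSearch.ts`.

```typescript
// Hypothetical, simplified index entry: just a bag of tokens.
type Entry = { tokens: string[] }

// Module-level cache, mirroring the cachedIndex / cachedIdf pair in the diff.
let cachedIndex: Entry[] | null = null
let cachedIdf: Map<string, number> | null = null

// Assumed IDF formula for the sketch: log(N / documentFrequency).
function computeIdf(index: Entry[]): Map<string, number> {
  const df = new Map<string, number>()
  for (const entry of index) {
    for (const term of new Set(entry.tokens)) {
      df.set(term, (df.get(term) ?? 0) + 1)
    }
  }
  const idf = new Map<string, number>()
  for (const [term, count] of df) {
    idf.set(term, Math.log(index.length / count))
  }
  return idf
}

// Hypothetical helper that fills the cache for one specific index array.
function buildCache(index: Entry[]): void {
  cachedIndex = index
  cachedIdf = computeIdf(index)
}

// Reuse the cached table only when it was built for this exact array;
// otherwise recompute instead of returning stale weights.
function idfFor(index: Entry[]): Map<string, number> {
  return cachedIndex === index && cachedIdf ? cachedIdf : computeIdf(index)
}
```

With a plain `cachedIdf ?? computeIdf(index)`, a caller passing a different index would silently receive weights computed for the cached one. The identity comparison is cheap and avoids that.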
--- a/src/tasks/RemoteAgentTask/RemoteAgentTask.tsx
+++ b/src/tasks/RemoteAgentTask/RemoteAgentTask.tsx
@@ -91,6 +91,14 @@ export type AutofixPrRemoteTaskMetadata = {
  owner: string;
  repo: string;
  prNumber: number;
+  /**
+   * PR head commit SHA captured at /autofix-pr launch. The completionChecker
+   * compares this against the live head to detect when the agent has pushed
+   * new commits. Optional because gh CLI may be unavailable at launch — in
+   * that case the checker falls back to terminal-state-only completion.
+   * Survives --resume via the session sidecar.
+   */
+  initialHeadSha?: string;
 };

 export type RemoteTaskMetadata = AutofixPrRemoteTaskMetadata;
@@ -114,6 +122,71 @@ export function registerCompletionChecker(remoteTaskType: RemoteTaskType, checke
  completionCheckers.set(remoteTaskType, checker);
 }

+/**
+ * Called after the task transitions to a terminal state and the notification
+ * has been enqueued. Used by command modules to release singleton locks,
+ * clear cached state, or perform other cleanup the framework cannot see.
+ * Hooks must be synchronous and best-effort — errors are logged but never
+ * propagate.
+ */
+export type RemoteTaskCompletionHook = (taskId: string, remoteTaskMetadata: RemoteTaskMetadata | undefined) => void;
+
+const completionHooks = new Map<RemoteTaskType, RemoteTaskCompletionHook>();
+
+/**
+ * Inspect a completed remote task's accumulated log and return an XML fragment
+ * to inject inline into the completion task-notification. Returning null falls
+ * back to the framework's generic "task completed" notification (file-path
+ * pointer only). Used by command modules whose remote agents emit structured
+ * outcome tags the local model should read directly.
+ */
+export type RemoteTaskContentExtractor = (log: SDKMessage[]) => string | null;
+
+const contentExtractors = new Map<RemoteTaskType, RemoteTaskContentExtractor>();
+
+/**
+ * Register a content extractor for a remote task type. The extractor runs once
+ * per completion in the generic branches (archived, completionChecker,
+ * result-driven). isRemoteReview tasks have a bespoke path and skip extractors
+ * entirely. A throwing extractor is caught by the framework, which logs the
+ * error and falls back to the generic notification.
+ */
+export function registerContentExtractor(remoteTaskType: RemoteTaskType, extractor: RemoteTaskContentExtractor): void {
+  contentExtractors.set(remoteTaskType, extractor);
+}
+
+function tryExtractRichContent(task: RemoteAgentTaskState, log: SDKMessage[]): string | null {
+  const extractor = contentExtractors.get(task.remoteTaskType);
+  if (!extractor) return null;
+  try {
+    return extractor(log);
+  } catch (e) {
+    logError(e);
+    return null;
+  }
+}
+
+/**
+ * Register a completion hook for a remote task type. Invoked once after the
+ * task reaches a terminal state in any of the framework's completion branches
+ * (archived session, completionChecker, stableIdle, result). Use this to
+ * release command-module state (e.g. singleton locks) without forcing the
+ * framework to reverse-import from the command package.
+ */
+export function registerCompletionHook(remoteTaskType: RemoteTaskType, hook: RemoteTaskCompletionHook): void {
+  completionHooks.set(remoteTaskType, hook);
+}
+
+function runCompletionHook(taskId: string, task: RemoteAgentTaskState): void {
+  const hook = completionHooks.get(task.remoteTaskType);
+  if (!hook) return;
+  try {
+    hook(taskId, task.remoteTaskMetadata);
+  } catch (e) {
+    logError(e);
+  }
+}
+
 /**
 * Persist a remote-agent metadata entry to the session sidecar.
 * Fire-and-forget — persistence failures must not block task registration.
@@ -213,6 +286,41 @@ function enqueueRemoteNotification(
  enqueuePendingNotification({ value: message, mode: 'task-notification' });
 }

+/**
+ * Same as enqueueRemoteNotification but inlines a structured XML fragment
+ * (returned by a registered RemoteTaskContentExtractor) so the local model
+ * reads the remote agent's outcome directly instead of having to follow a
+ * file-path pointer. Mode is still 'task-notification' — the framing XML is
+ * the same, only the body differs.
+ */
+function enqueueRichRemoteNotification(
+  taskId: string,
+  title: string,
+  status: 'completed' | 'failed' | 'killed',
+  richContent: string,
+  setAppState: SetAppState,
+  toolUseId?: string,
+): void {
+  if (!markTaskNotified(taskId, setAppState)) return;
+
+  const statusText = status === 'completed' ? 'completed successfully' : status === 'failed' ? 'failed' : 'was stopped';
+  const toolUseIdLine = toolUseId ? `\n<${TOOL_USE_ID_TAG}>${toolUseId}</${TOOL_USE_ID_TAG}>` : '';
+  const outputPath = getTaskOutputPath(taskId);
+
+  const message = `<${TASK_NOTIFICATION_TAG}>
+<${TASK_ID_TAG}>${taskId}</${TASK_ID_TAG}>${toolUseIdLine}
+<${TASK_TYPE_TAG}>remote_agent</${TASK_TYPE_TAG}>
+<${OUTPUT_FILE_TAG}>${outputPath}</${OUTPUT_FILE_TAG}>
+<${STATUS_TAG}>${status}</${STATUS_TAG}>
+<${SUMMARY_TAG}>Remote task "${title}" ${statusText}</${SUMMARY_TAG}>
+</${TASK_NOTIFICATION_TAG}>
+The remote agent produced the following structured outcome. Summarize the key changes for the user:
+
+${richContent}`;
+
+  enqueuePendingNotification({ value: message, mode: 'task-notification' });
+}
+
 /**
 * Atomically mark a task as notified. Returns true if this call flipped the
 * flag (caller should enqueue), false if already notified (caller should skip).
@@ -678,9 +786,22 @@ function startRemoteSessionPolling(taskId: string, context: TaskContext): () =>
        updateTaskState<RemoteAgentTaskState>(taskId, context.setAppState, t =>
          t.status === 'running' ? { ...t, status: 'completed', endTime: Date.now() } : t,
        );
-        enqueueRemoteNotification(taskId, task.title, 'completed', context.setAppState, task.toolUseId);
+        const richContent = tryExtractRichContent(task, accumulatedLog);
+        if (richContent) {
+          enqueueRichRemoteNotification(
+            taskId,
+            task.title,
+            'completed',
+            richContent,
+            context.setAppState,
+            task.toolUseId,
+          );
+        } else {
+          enqueueRemoteNotification(taskId, task.title, 'completed', context.setAppState, task.toolUseId);
+        }
        void evictTaskOutput(taskId);
        void removeRemoteAgentMetadata(taskId);
+        runCompletionHook(taskId, task);
        return;
      }

@@ -691,9 +812,22 @@ function startRemoteSessionPolling(taskId: string, context: TaskContext): () =>
          updateTaskState<RemoteAgentTaskState>(taskId, context.setAppState, t =>
            t.status === 'running' ? { ...t, status: 'completed', endTime: Date.now() } : t,
          );
-          enqueueRemoteNotification(taskId, completionResult, 'completed', context.setAppState, task.toolUseId);
+          const richContent = tryExtractRichContent(task, accumulatedLog);
+          if (richContent) {
+            enqueueRichRemoteNotification(
+              taskId,
+              completionResult,
+              'completed',
+              richContent,
+              context.setAppState,
+              task.toolUseId,
+            );
+          } else {
+            enqueueRemoteNotification(taskId, completionResult, 'completed', context.setAppState, task.toolUseId);
+          }
          void evictTaskOutput(taskId);
          void removeRemoteAgentMetadata(taskId);
+          runCompletionHook(taskId, task);
          return;
        }
      }
@@ -853,6 +987,7 @@ function startRemoteSessionPolling(taskId: string, context: TaskContext): () =>
            enqueueRemoteReviewNotification(taskId, reviewContent, context.setAppState);
            void evictTaskOutput(taskId);
            void removeRemoteAgentMetadata(taskId);
+            runCompletionHook(taskId, task);
            return; // Stop polling
          }

@@ -870,12 +1005,28 @@ function startRemoteSessionPolling(taskId: string, context: TaskContext): () =>
          enqueueRemoteReviewFailureNotification(taskId, reason, context.setAppState);
          void evictTaskOutput(taskId);
          void removeRemoteAgentMetadata(taskId);
+          runCompletionHook(taskId, task);
          return; // Stop polling
        }

-        enqueueRemoteNotification(taskId, task.title, finalStatus, context.setAppState, task.toolUseId);
+        // finalStatus is 'completed' | 'failed' on this path — kill is a
+        // separate code path (RemoteAgentTask.kill) and never reaches here.
+        const richContent = tryExtractRichContent(task, accumulatedLog);
+        if (richContent) {
+          enqueueRichRemoteNotification(
+            taskId,
+            task.title,
+            finalStatus,
+            richContent,
+            context.setAppState,
+            task.toolUseId,
+          );
+        } else {
+          enqueueRemoteNotification(taskId, task.title, finalStatus, context.setAppState, task.toolUseId);
+        }
        void evictTaskOutput(taskId);
        void removeRemoteAgentMetadata(taskId);
+        runCompletionHook(taskId, task);
        return; // Stop polling
      }
    } catch (error) {
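The completion branches above all follow the same pattern: try the registered extractor, and fall back to the generic notification when there is no extractor, when it returns null, or when it throws. A condensed sketch of that decision is shown below. The names `pickNotificationBody`, `registerExtractor`, and `loggedErrors`, and the use of plain strings for the log, are hypothetical and stand in for the real framework types.

```typescript
// Hypothetical stand-ins for RemoteTaskType and SDKMessage[].
type TaskType = string
type Extractor = (log: string[]) => string | null

const extractors = new Map<TaskType, Extractor>()
// Stand-in for logError: collects errors so the sketch stays self-contained.
const loggedErrors: unknown[] = []

function registerExtractor(type: TaskType, extractor: Extractor): void {
  extractors.set(type, extractor)
}

// Returns the rich body when an extractor yields one, otherwise a generic
// marker. A throwing extractor is logged and treated like "no rich content".
function pickNotificationBody(type: TaskType, log: string[]): string {
  const extractor = extractors.get(type)
  if (!extractor) return 'generic'
  try {
    const rich = extractor(log)
    return rich ? `rich:${rich}` : 'generic'
  } catch (e) {
    loggedErrors.push(e)
    return 'generic'
  }
}
```

Keeping the try/catch inside the framework means a faulty command module can degrade the notification but cannot stop the polling loop from reaching its cleanup steps.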
--- a/src/utils/tests/effort.test.ts
+++ b/src/utils/tests/effort.test.ts
@@ -224,6 +224,22 @@ describe('getEffortLevelDescription', () => {
    const desc = getEffortLevelDescription('max')
    expect(desc).toContain('Maximum')
  })
+
+  test('max description does not contain model names', () => {
+    const desc = getEffortLevelDescription('max')
+    expect(desc).not.toContain('Opus')
+    expect(desc).not.toContain('DeepSeek')
+  })
+
+  test("returns description for 'xhigh'", () => {
+    const desc = getEffortLevelDescription('xhigh')
+    expect(desc).toContain('Extended reasoning')
+  })
+
+  test('xhigh description does not contain model names', () => {
+    const desc = getEffortLevelDescription('xhigh')
+    expect(desc).not.toContain('Opus')
+  })
 })

 // ─── resolvePickerEffortPersistence ────────────────────────────────────
@@ -274,3 +290,61 @@ describe('resolvePickerEffortPersistence', () => {
    expect(result).toBeUndefined()
  })
 })
+
+// ─── modelSupportsMaxEffort ────────────────────────────────────────────
+
+describe('modelSupportsMaxEffort', () => {
+  test('returns true for opus-4-7', async () => {
+    const { modelSupportsMaxEffort } = await import('src/utils/effort.js')
+    expect(modelSupportsMaxEffort('claude-opus-4-7-20250918')).toBe(true)
+  })
+
+  test('returns true for opus-4-6', async () => {
+    const { modelSupportsMaxEffort } = await import('src/utils/effort.js')
+    expect(modelSupportsMaxEffort('claude-opus-4-6-20250514')).toBe(true)
+  })
+
+  test('returns true for sonnet models', async () => {
+    const { modelSupportsMaxEffort } = await import('src/utils/effort.js')
+    expect(modelSupportsMaxEffort('claude-sonnet-4-6-20250514')).toBe(true)
+  })
+
+  test('returns true for haiku models', async () => {
+    const { modelSupportsMaxEffort } = await import('src/utils/effort.js')
+    expect(modelSupportsMaxEffort('claude-haiku-4-5-20251001')).toBe(true)
+  })
+
+  test('returns true for deepseek models', async () => {
+    const { modelSupportsMaxEffort } = await import('src/utils/effort.js')
+    expect(modelSupportsMaxEffort('deepseek-v4-pro')).toBe(true)
+  })
+
+  test('returns true for unknown models', async () => {
+    const { modelSupportsMaxEffort } = await import('src/utils/effort.js')
+    expect(modelSupportsMaxEffort('some-random-model')).toBe(true)
+  })
+})
+
+// ─── modelSupportsXhighEffort ──────────────────────────────────────────
+
+describe('modelSupportsXhighEffort', () => {
+  test('returns true for opus-4-7', async () => {
+    const { modelSupportsXhighEffort } = await import('src/utils/effort.js')
+    expect(modelSupportsXhighEffort('claude-opus-4-7-20250918')).toBe(true)
+  })
+
+  test('returns true for sonnet models', async () => {
+    const { modelSupportsXhighEffort } = await import('src/utils/effort.js')
+    expect(modelSupportsXhighEffort('claude-sonnet-4-6-20250514')).toBe(true)
+  })
+
+  test('returns true for haiku models', async () => {
+    const { modelSupportsXhighEffort } = await import('src/utils/effort.js')
+    expect(modelSupportsXhighEffort('claude-haiku-4-5-20251001')).toBe(true)
+  })
+
+  test('returns true for unknown models', async () => {
+    const { modelSupportsXhighEffort } = await import('src/utils/effort.js')
+    expect(modelSupportsXhighEffort('some-random-model')).toBe(true)
+  })
+})
--- a/src/utils/tests/messages.test.ts
+++ b/src/utils/tests/messages.test.ts
@@ -27,6 +27,7 @@ import {
  AUTO_REJECT_MESSAGE,
  DONT_ASK_REJECT_MESSAGE,
  SYNTHETIC_MODEL,
+  ensureToolResultPairing,
 } from '../messages'
 import type {
  Message,
@@ -516,3 +517,272 @@ describe('normalizeMessagesForAPI', () => {
    expect(block._geminiThoughtSignature).toBe('sig-123')
  })
 })
+
+describe('ensureToolResultPairing', () => {
+  test('does not produce consecutive user messages when orphaned tool_result is stripped after an existing user message (CC-1215)', () => {
+    // Reproduce the scenario from the bug report:
+    // Streaming yields assistant[thinking] and assistant[tool_use] separately.
+    // normalizeMessagesForAPI normally merges them. If the merge fails (e.g. an
+    // intervening user message breaks the backward walk), ensureToolResultPairing
+    // sees a duplicate tool_use ID and strips it, leaving the next user message
+    // empty, so it becomes NO_CONTENT_MESSAGE. If the previous result entry is
+    // already a user message, this must NOT create consecutive user messages.
+    const toolUseId = 'toolu_test_dup_001'
+
+    const messages: (UserMessage | AssistantMessage)[] = [
+      // Previous turn: user with tool_result
+      createUserMessage({
+        content: [
+          {
+            type: 'tool_result',
+            tool_use_id: toolUseId,
+            content: 'previous result',
+          },
+        ],
+      }),
+      // Current turn: assistant with thinking only (tool_use was deduped away)
+      makeAssistantMsg([{ type: 'thinking', thinking: 'let me think...' }]),
+      // Current turn: assistant with tool_use (second streaming yield, same ID)
+      makeAssistantMsg([
+        {
+          type: 'tool_use',
+          id: toolUseId,
+          name: 'Bash',
+          input: { command: 'pwd' },
+        },
+      ]),
+      // Tool result for the tool_use
+      createUserMessage({
+        content: [
+          {
+            type: 'tool_result',
+            tool_use_id: toolUseId,
+            content: '/home/user',
+          },
+        ],
+      }),
+    ]
+
+    const result = ensureToolResultPairing(messages)
+
+    // Verify no consecutive user messages
+    for (let i = 1; i < result.length; i++) {
+      if (result[i - 1]!.type === 'user') {
+        expect(result[i]!.type).not.toBe('user')
+      }
+    }
+  })
+
+  test('inserts NO_CONTENT_MESSAGE when previous result entry is assistant', () => {
+    // When the orphan strip empties a user message and the previous entry is
+    // assistant, the placeholder should still be inserted to maintain alternation.
+    const toolUseId = 'toolu_test_orphan_001'
+
+    const messages: (UserMessage | AssistantMessage)[] = [
+      makeAssistantMsg([{ type: 'text', text: 'hello' }]),
+      // This assistant has a tool_use with an ID that won't match any result
+      makeAssistantMsg([
+        {
+          type: 'tool_use',
+          id: toolUseId,
+          name: 'Bash',
+          input: { command: 'ls' },
+        },
+      ]),
+      // User message with ONLY a tool_result for a non-existent tool_use
+      // After orphan stripping, content becomes empty
+      createUserMessage({
+        content: [
+          {
+            type: 'tool_result',
+            tool_use_id: 'nonexistent_id',
+            content: 'orphan',
+          },
+        ],
+      }),
+    ]
+
+    const result = ensureToolResultPairing(messages)
+
+    // Should have assistant, [possibly modified assistant], user placeholder
+    // The key assertion: last message should be a user placeholder
+    const lastMsg = result[result.length - 1]!
+    expect(lastMsg.type).toBe('user')
+  })
+})
+
+// ─── CC-1215: normalizeMessagesForAPI must not merge assistants across tool_results ──
+
+describe('normalizeMessagesForAPI – thinking + tool_use same turn (CC-1215)', () => {
+  test('does not merge same-id assistants across a tool_result boundary', () => {
+    // Simulate the streaming sequence when extended thinking + tool_use appear
+    // in the same turn, and StreamingToolExecutor inserts a tool_result
+    // between the two assistant content-block messages.
+    const sharedMessageId = 'msg_shared_001'
+    const toolUseId = 'toolu_cc1215'
+
+    // assistant[thinking] — first content_block_stop yield
+    const thinkingMsg = createAssistantMessage({
+      content: [
+        { type: 'thinking', thinking: 'Let me think...', signature: 'sig1' },
+      ],
+    })
+    thinkingMsg.message.id = sharedMessageId
+
+    // user[tool_result] — from StreamingToolExecutor completing fast
+    const toolResultMsg = createUserMessage({
+      content: [
+        {
+          type: 'tool_result',
+          tool_use_id: toolUseId,
+          content: '/home/user',
+        },
+      ],
+    })
+
+    // assistant[tool_use] — second content_block_stop yield
+    const toolUseMsg = createAssistantMessage({
+      content: [
+        {
+          type: 'tool_use',
+          id: toolUseId,
+          name: 'Bash',
+          input: { command: 'pwd' },
+        },
+      ],
+    })
+    toolUseMsg.message.id = sharedMessageId
+
+    const messages: Message[] = [
+      makeUserMsg('Run pwd'),
+      thinkingMsg,
+      toolResultMsg,
+      toolUseMsg,
+    ]
+
+    const result = normalizeMessagesForAPI(messages)
+
+    // Before the fix, the backward walk would skip the tool_result and merge
+    // thinking + tool_use into one assistant. This produced duplicate tool_use
+    // IDs after ensureToolResultPairing ran, leading to orphaned tool_results
+    // and consecutive user messages → API 400.
+    //
+    // After the fix, the backward walk stops at the tool_result, so the two
+    // assistants remain separate. The result should have 4 messages:
+    //   user, assistant[thinking], user[tool_result], assistant[tool_use]
+    expect(result).toHaveLength(4)
+    expect(result[0]!.type).toBe('user')
+    expect(result[1]!.type).toBe('assistant')
+    expect(result[2]!.type).toBe('user')
+    expect(result[3]!.type).toBe('assistant')
+
+    // The thinking assistant should NOT have been merged with the tool_use one
+    const thinkingAssistant = result[1] as AssistantMessage
+    const thinkingContent = thinkingAssistant.message.content as Array<{
+      type: string
+    }>
+    expect(thinkingContent.some(b => b.type === 'tool_use')).toBe(false)
+
+    const toolUseAssistant = result[3] as AssistantMessage
+    const toolUseContent = toolUseAssistant.message.content as Array<{
+      type: string
+    }>
+    expect(toolUseContent.some(b => b.type === 'tool_use')).toBe(true)
+  })
+
+  test('still merges consecutive same-id assistants without intervening tool_result', () => {
+    const sharedMessageId = 'msg_shared_002'
+
+    const thinkingMsg = createAssistantMessage({
+      content: [{ type: 'thinking', thinking: 'Hmm', signature: 'sig2' }],
+    })
+    thinkingMsg.message.id = sharedMessageId
+
+    const toolUseMsg = createAssistantMessage({
+      content: [
+        {
+          type: 'tool_use',
+          id: 'toolu_merge',
+          name: 'Bash',
+          input: { command: 'ls' },
+        },
+      ],
+    })
+    toolUseMsg.message.id = sharedMessageId
+
+    // No tool_result between them — they should still be merged
+    const messages: Message[] = [
+      makeUserMsg('List files'),
+      thinkingMsg,
+      toolUseMsg,
+    ]
+
+    const result = normalizeMessagesForAPI(messages)
+
+    // Should be: user, assistant[thinking + tool_use]
+    expect(result).toHaveLength(2)
+    expect(result[0]!.type).toBe('user')
+
+    const merged = result[1] as AssistantMessage
+    const content = merged.message.content as Array<{ type: string }>
+    expect(content.some(b => b.type === 'thinking')).toBe(true)
+    expect(content.some(b => b.type === 'tool_use')).toBe(true)
+  })
+
+  test('full pipeline: normalize + ensureToolResultPairing produces valid role alternation', () => {
+    const sharedMessageId = 'msg_shared_003'
+    const toolUseId = 'toolu_pipeline'
+
+    const thinkingMsg = createAssistantMessage({
+      content: [
+        { type: 'thinking', thinking: 'Planning...', signature: 'sig3' },
+      ],
+    })
+    thinkingMsg.message.id = sharedMessageId
+
+    const toolResultMsg = createUserMessage({
+      content: [
+        {
+          type: 'tool_result',
+          tool_use_id: toolUseId,
+          content: 'file.txt',
+        },
+      ],
+    })
+
+    const toolUseMsg = createAssistantMessage({
+      content: [
+        {
+          type: 'tool_use',
+          id: toolUseId,
+          name: 'Bash',
+          input: { command: 'ls' },
+        },
+      ],
+    })
+    toolUseMsg.message.id = sharedMessageId
+
+    // Full pipeline: normalize → ensureToolResultPairing
+    const normalized = normalizeMessagesForAPI([
+      makeUserMsg('Run ls'),
+      thinkingMsg,
+      toolResultMsg,
+      toolUseMsg,
+    ])
+    const result = ensureToolResultPairing(normalized)
+
+    // Verify strict role alternation: user → assistant → user → assistant → ...
+    for (let i = 1; i < result.length; i++) {
+      const prev = result[i - 1]!
+      const curr = result[i]!
+      if (prev.type === 'user' && curr.type === 'user') {
+        expect.unreachable(`Consecutive user messages at index ${i - 1}-${i}`)
+      }
+      if (prev.type === 'assistant' && curr.type === 'assistant') {
+        expect.unreachable(
+          `Consecutive assistant messages at index ${i - 1}-${i}`,
+        )
+      }
+    }
+  })
+})
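The tests above pin down the new stop condition for the backward walk: merging same-ID assistant messages is allowed only within an unbroken run of assistant messages, and any non-assistant message (including a tool_result) ends the walk. The sketch below illustrates that rule on a simplified message shape. The `Msg` type and `mergeSameIdAssistants` are hypothetical and much smaller than the real `normalizeMessagesForAPI`.

```typescript
// Hypothetical, simplified message shape for illustration only.
type Msg =
  | { role: 'user'; text: string }
  | { role: 'assistant'; id: string; blocks: string[] }

function mergeSameIdAssistants(messages: Msg[]): Msg[] {
  const out: Msg[] = []
  for (const msg of messages) {
    if (msg.role === 'assistant') {
      let merged = false
      // Walk backward over the trailing run of assistant messages only.
      for (let i = out.length - 1; i >= 0; i--) {
        const prev = out[i]!
        // Boundary: a user message (e.g. a tool_result) stops the walk.
        if (prev.role !== 'assistant') break
        if (prev.id === msg.id) {
          out[i] = { ...prev, blocks: [...prev.blocks, ...msg.blocks] }
          merged = true
          break
        }
      }
      if (merged) continue
    }
    out.push(msg)
  }
  return out
}
```

Under this rule the sequence user, assistant[thinking], user[tool_result], assistant[tool_use] stays at four messages even when both assistants share an ID, while two adjacent same-ID assistants still collapse into one.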
--- a/src/utils/apiPreconnect.ts
+++ b/src/utils/apiPreconnect.ts
@@ -25,6 +25,7 @@

 import { getOauthConfig } from '../constants/oauth.js'
 import { isEnvTruthy } from './envUtils.js'
+import { isEssentialTrafficOnly } from './privacyLevel.js'

 let fired = false

@@ -32,6 +33,10 @@ export function preconnectAnthropicApi(): void {
  if (fired) return
  fired = true

+  // Also skip when non-essential traffic is disabled via
+  // CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC / DISABLE_TELEMETRY / proxy env.
+  if (isEssentialTrafficOnly()) return
+
  // Skip if using a cloud provider — different endpoint + auth
  if (
    isEnvTruthy(process.env.CLAUDE_CODE_USE_BEDROCK) ||
--- a/src/utils/claudeInChrome/setup.ts
+++ b/src/utils/claudeInChrome/setup.ts
@@ -2,7 +2,6 @@ import { BROWSER_TOOLS } from '@ant/claude-for-chrome-mcp'
 import { chmod, mkdir, readFile, writeFile } from 'fs/promises'
 import { homedir } from 'os'
 import { join } from 'path'
-import { fileURLToPath } from 'url'
 import {
  getIsInteractive,
  getIsNonInteractiveSession,
@@ -11,6 +10,7 @@ import {
 import { getFeatureValue_CACHED_MAY_BE_STALE } from '../../services/analytics/growthbook.js'
 import type { ScopedMcpServerConfig } from '../../services/mcp/types.js'
 import { isInBundledMode } from '../bundledMode.js'
+import { distRoot } from '../distRoot.js'
 import { getGlobalConfig, saveGlobalConfig } from '../config.js'
 import { logForDebugging } from '../debug.js'
 import {
@@ -135,9 +135,7 @@ export function setupClaudeInChrome(): {
      systemPrompt: getChromeSystemPrompt(),
    }
  } else {
-    const __filename = fileURLToPath(import.meta.url)
-    const __dirname = join(__filename, '..')
-    const cliPath = join(__dirname, 'cli.js')
+    const cliPath = join(distRoot, 'cli.js')

    void createWrapperScript(
      `"${process.execPath}" "${cliPath}" --chrome-native-host`,
--- a/src/utils/computerUse/setup.ts
+++ b/src/utils/computerUse/setup.ts
@@ -1,10 +1,10 @@
 import { buildComputerUseTools } from '@ant/computer-use-mcp'
 import { join } from 'path'
-import { fileURLToPath } from 'url'
 import { buildMcpToolName } from '../../services/mcp/mcpStringUtils.js'
 import type { ScopedMcpServerConfig } from '../../services/mcp/types.js'

 import { isInBundledMode } from '../bundledMode.js'
+import { distRoot } from '../distRoot.js'
 import { CLI_CU_CAPABILITIES, COMPUTER_USE_MCP_SERVER_NAME } from './common.js'
 import { getChicagoCoordinateMode } from './gates.js'

@@ -34,10 +34,7 @@ export function setupComputerUseMCP(): {
  // type 'stdio' to hit the right branch. Mirrors Chrome's setup.
  const args = isInBundledMode()
    ? ['--computer-use-mcp']
-    : [
-        join(fileURLToPath(import.meta.url), '..', 'cli.js'),
-        '--computer-use-mcp',
-      ]
+    : [join(distRoot, 'cli.js'), '--computer-use-mcp']

  return {
    mcpConfig: {
--- a/src/utils/distRoot.ts
+++ b/src/utils/distRoot.ts
@@ -0,0 +1,29 @@
+import { fileURLToPath } from 'url'
+import * as path from 'path'
+
+/**
+ * Resolve the dist root directory from the current module's location.
+ *
+ * Works across all build layouts:
+ * - Single-file: dist/cli.js → dist/
+ * - Code-split:  dist/chunks/chunk-xxx.js → dist/
+ * - Dev mode:    src/utils/distRoot.ts → <project_root>/
+ */
+const __filename = fileURLToPath(import.meta.url)
+const __dirname = path.dirname(__filename)
+
+const distRoot = (() => {
+  const parts = __dirname.split(path.sep)
+  const distIdx = parts.lastIndexOf('dist')
+  if (distIdx !== -1) {
+    return parts.slice(0, distIdx + 1).join(path.sep)
+  }
+  // Dev mode: from src/utils/ → project root
+  const srcIdx = parts.lastIndexOf('src')
+  if (srcIdx !== -1) {
+    return parts.slice(0, srcIdx).join(path.sep)
+  }
+  return __dirname
+})()
+
+export { distRoot }
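
Review note: the lookup in `distRoot.ts` can be sketched as a pure function, which makes the three layouts from the doc comment easy to check. `resolveRoot` and its `sep` parameter are hypothetical names used only for illustration; the real module computes the value once from `import.meta.url` and uses `path.sep`.

```typescript
// Hypothetical pure version of the distRoot lookup (illustration only).
function resolveRoot(dir: string, sep: string = '/'): string {
  const parts = dir.split(sep)
  // Built layouts: cut the path at the last "dist" segment, keeping it.
  const distIdx = parts.lastIndexOf('dist')
  if (distIdx !== -1) {
    return parts.slice(0, distIdx + 1).join(sep)
  }
  // Dev layout: cut the path just before the last "src" segment.
  const srcIdx = parts.lastIndexOf('src')
  if (srcIdx !== -1) {
    return parts.slice(0, srcIdx).join(sep)
  }
  // Neither segment found: fall back to the directory itself.
  return dir
}

console.log(resolveRoot('/app/dist'))        // /app/dist
console.log(resolveRoot('/app/dist/chunks')) // /app/dist
console.log(resolveRoot('/app/src/utils'))   // /app
```

Because the `dist` check runs first, a dev checkout that lives under a directory literally named `dist` (for example `/home/me/dist/proj/src/utils`) would resolve to `/home/me/dist` rather than the project root. That follows from the logic as written and may be worth a guard if such layouts matter.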
--- a/src/utils/effort.ts
+++ b/src/utils/effort.ts
@@ -67,51 +67,22 @@ export function modelSupportsEffort(model: string): boolean {
  return getAPIProvider() === 'firstParty'
 }

-// @[MODEL LAUNCH]: Add the new model to the allowlist if it supports 'max' effort.
-// Per API docs, 'max' is Opus 4.6/4.7 only for public models — other models return an error.
-// However, DeepSeek V4 Pro also supports max effort when using Anthropic-compatible API.
-export function modelSupportsMaxEffort(model: string): boolean {
-  const supported3P = get3PModelCapabilityOverride(model, 'max_effort')
+// Max/xhigh effort restrictions removed: unless a 3P capability override says
+// otherwise, any model may use these levels. Resulting API errors are the user's to handle.
+export function modelSupportsMaxEffort(_model: string): boolean {
+  const supported3P = get3PModelCapabilityOverride(_model, 'max_effort')
  if (supported3P !== undefined) {
    return supported3P
  }
-  // Support DeepSeek V4 Pro specifically (Anthropic-compatible API)
-  if (model.toLowerCase().includes('deepseek-v4-pro')) {
-    return true
-  }
-  if (
-    model.toLowerCase().includes('opus-4-7') ||
-    model.toLowerCase().includes('opus-4-6')
-  ) {
-    return true
-  }
-  if (process.env.USER_TYPE === 'ant' && resolveAntModel(model)) {
-    return true
-  }
-  return false
+  return true
 }

-// @[MODEL LAUNCH]: Add the new model to the allowlist if it supports 'xhigh' effort.
-// 'xhigh' was introduced with Opus 4.7 as a level between 'high' and 'max'.
-export function modelSupportsXhighEffort(model: string): boolean {
-  const supported3P = get3PModelCapabilityOverride(model, 'xhigh_effort')
+export function modelSupportsXhighEffort(_model: string): boolean {
+  const supported3P = get3PModelCapabilityOverride(_model, 'xhigh_effort')
  if (supported3P !== undefined) {
    return supported3P
  }
-  if (
-    getAPIProvider() === 'openai' &&
-    isChatGPTAuthMode() &&
-    isChatGPTCodexReasoningModel(model)
-  ) {
-    return true
-  }
-  if (model.toLowerCase().includes('opus-4-7')) {
-    return true
-  }
-  if (process.env.USER_TYPE === 'ant' && resolveAntModel(model)) {
-    return true
-  }
-  return false
+  return true
 }

 export function isEffortLevel(value: string): value is EffortLevel {
@@ -214,10 +185,6 @@ export function resolveAppliedEffort(
  }
  const resolved =
    envOverride ?? appStateEffortValue ?? getDefaultEffortForModel(model)
-  // API rejects 'xhigh' on pre-Opus-4.7 models — downgrade to 'high'.
-  if (resolved === 'xhigh' && !modelSupportsXhighEffort(model)) {
-    return 'high'
-  }
  // OpenAI Responses uses xhigh as its highest public reasoning effort.
  // Keep /effort max usable as a familiar alias in ChatGPT subscription mode.
  if (
@@ -228,10 +195,6 @@ export function resolveAppliedEffort(
  ) {
    return 'xhigh'
  }
-  // API rejects 'max' on non-Opus-4.6 models — downgrade to 'high'.
-  if (resolved === 'max' && !modelSupportsMaxEffort(model)) {
-    return 'high'
-  }
  return resolved
 }

@@ -299,9 +262,9 @@ export function getEffortLevelDescription(level: EffortLevel): string {
    case 'high':
      return 'Comprehensive implementation with extensive testing and documentation'
    case 'xhigh':
-      return 'Extended reasoning beyond high, short of max (Opus 4.7 only)'
+      return 'Extended reasoning beyond high, short of max'
    case 'max':
-      return 'Maximum capability with deepest reasoning (Opus 4.6/4.7/DeepSeek V4 Pro)'
+      return 'Maximum capability with deepest reasoning'
  }
 }

--- a/src/utils/messages.ts
+++ b/src/utils/messages.ts
@@ -2541,21 +2541,26 @@ export function normalizeMessagesForAPI(
          }

          // Find a previous assistant message with the same message ID and merge.
-          // Walk backwards, skipping tool results and different-ID assistants,
-          // since concurrent agents (teammates) can interleave streaming content
-          // blocks from multiple API responses with different message IDs.
+          // Walk backwards, skipping different-ID assistants, since concurrent
+          // agents (teammates) can interleave streaming content blocks from
+          // multiple API responses with different message IDs.
+          //
+          // Do NOT skip tool_result messages — when claude.ts yields separate
+          // AssistantMessages for thinking and tool_use blocks (same message.id),
+          // a StreamingToolExecutor tool_result can land between them. Merging
+          // across that boundary produces duplicate tool_use IDs that downstream
+          // ensureToolResultPairing strips, leaving orphaned tool_results and
+          // ultimately consecutive user messages → API 400 (CC-1215).
          for (let i = result.length - 1; i >= 0; i--) {
            const msg = result[i]!

-            if (msg.type !== 'assistant' && !isToolResultMessage(msg)) {
+            if (msg.type !== 'assistant') {
              break
            }

-            if (msg.type === 'assistant') {
-              if (msg.message.id === normalizedMessage.message.id) {
-                result[i] = mergeAssistantMessages(msg, normalizedMessage)
-                return
-              }
+            if (msg.message.id === normalizedMessage.message.id) {
+              result[i] = mergeAssistantMessages(msg, normalizedMessage)
+              return
            }
          }

@@ -5829,11 +5834,15 @@ export function ensureToolResultPairing(
        )
      } else {
        // Content is empty after stripping orphaned tool_results. We still
-        // need a user message here to maintain role alternation — otherwise
-        // the assistant placeholder we just pushed would be immediately
-        // followed by the NEXT assistant message, which the API rejects with
-        // a role-alternation 400 (not the duplicate-id 400 we handle).
+        // need a user message here to maintain role alternation — unless the
+        // previous result entry is already a user message, in which case
+        // inserting another user placeholder creates consecutive-user messages
+        // that Anthropic rejects with a misleading "tool_use without
+        // tool_result" 400 (CC-1215).
        i++
+        if (result.at(-1)?.type === 'user') {
+          continue
+        }
        result.push(
          createUserMessage({
            content: NO_CONTENT_MESSAGE,
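
Review note: the new stop condition in `normalizeMessagesForAPI` can be illustrated with a reduced model. The message shapes and `appendAssistant` below are hypothetical simplifications (the real types and `mergeAssistantMessages` carry far more), and the final `push` is an assumed fallback that the hunk does not show.

```typescript
// Simplified, hypothetical message shapes for illustration.
type AssistantMsg = { type: 'assistant'; id: string; blocks: string[] }
type UserMsg = { type: 'user'; blocks: string[] }
type Msg = AssistantMsg | UserMsg

// Merge `incoming` into an earlier assistant message with the same id, but only
// while walking back over a contiguous run of assistant messages.
function appendAssistant(result: Msg[], incoming: AssistantMsg): void {
  for (let i = result.length - 1; i >= 0; i--) {
    const msg = result[i]!
    if (msg.type !== 'assistant') break // a tool_result (user) is a hard boundary
    if (msg.id === incoming.id) {
      result[i] = { ...msg, blocks: [...msg.blocks, ...incoming.blocks] }
      return
    }
  }
  result.push(incoming) // assumed fallback: keep it as a separate message
}

// Same id on both sides of a tool_result: no merge, the order is preserved.
const acrossBoundary: Msg[] = [
  { type: 'assistant', id: 'm1', blocks: ['thinking'] },
  { type: 'user', blocks: ['tool_result'] },
]
appendAssistant(acrossBoundary, { type: 'assistant', id: 'm1', blocks: ['tool_use'] })
console.log(acrossBoundary.length) // 3

// Interleaved assistants with different ids are still skipped and merged.
const interleaved: Msg[] = [
  { type: 'assistant', id: 'm1', blocks: ['thinking'] },
  { type: 'assistant', id: 'm2', blocks: ['text'] },
]
appendAssistant(interleaved, { type: 'assistant', id: 'm1', blocks: ['tool_use'] })
console.log(interleaved.length) // 2
```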
--- a/src/utils/model/model.ts
+++ b/src/utils/model/model.ts
@@ -120,6 +120,19 @@ export function getBestModel(): ModelName {
  return getDefaultOpusModel()
 }

+/**
+ * Resolve the provider's primary model from its env var (e.g. OPENAI_MODEL).
+ * Returns undefined for providers that don't have a primary-model env var
+ * (Bedrock, Vertex, Foundry, firstParty).
+ */
+function getProviderPrimaryModel(): ModelName | undefined {
+  const provider = getAPIProvider()
+  if (provider === 'openai') return process.env.OPENAI_MODEL
+  if (provider === 'gemini') return process.env.GEMINI_MODEL
+  if (provider === 'grok') return process.env.GROK_MODEL
+  return undefined
+}
+
 // @[MODEL LAUNCH]: Update the default Opus model (3P providers may lag so keep defaults unchanged).
 export function getDefaultOpusModel(): ModelName {
  const provider = getAPIProvider()
@@ -138,10 +151,12 @@ export function getDefaultOpusModel(): ModelName {
  if (process.env.ANTHROPIC_DEFAULT_OPUS_MODEL) {
    return process.env.ANTHROPIC_DEFAULT_OPUS_MODEL
  }
-  // 3P providers (Bedrock, Vertex, Foundry) all publish Opus 4.7 in sync
-  // with firstParty as of 2026-04-17 (AWS Bedrock, Google Vertex AI, and
-  // Microsoft Foundry announcements and model catalogs all confirm). The
-  // branch is kept as a structural hook in case a future launch lags on 3P.
+  // 3P providers: if user set a primary model (e.g. OPENAI_MODEL=glm-5.1),
+  // fall back to it instead of a hardcoded Anthropic model. This prevents
+  // sideQuery / background tasks from sending requests to Anthropic's API
+  // when the user configured a third-party provider.
+  const primaryModel = getProviderPrimaryModel()
+  if (primaryModel) return primaryModel
  if (provider !== 'firstParty') {
    return getModelStrings().opus47
  }
@@ -166,7 +181,11 @@ export function getDefaultSonnetModel(): ModelName {
  if (process.env.ANTHROPIC_DEFAULT_SONNET_MODEL) {
    return process.env.ANTHROPIC_DEFAULT_SONNET_MODEL
  }
-  // Default to Sonnet 4.5 for 3P since they may not have 4.6 yet
+  // 3P providers: fall back to user's primary model instead of a hardcoded
+  // Anthropic model name. Prevents background API calls from being routed to
+  // Anthropic when the user configured a third-party endpoint.
+  const primaryModel = getProviderPrimaryModel()
+  if (primaryModel) return primaryModel
  if (provider !== 'firstParty') {
    return getModelStrings().sonnet45
  }
@@ -191,6 +210,10 @@ export function getDefaultHaikuModel(): ModelName {
  if (process.env.ANTHROPIC_DEFAULT_HAIKU_MODEL) {
    return process.env.ANTHROPIC_DEFAULT_HAIKU_MODEL
  }
+  // 3P providers: fall back to user's primary model instead of a hardcoded
+  // Anthropic model name.
+  const primaryModel = getProviderPrimaryModel()
+  if (primaryModel) return primaryModel

  // Haiku 4.5 is available on all platforms (first-party, Foundry, Bedrock, Vertex)
  return getModelStrings().haiku45
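
Review note: the fallback order this change introduces in the `getDefault*Model` helpers appears to be "explicit `ANTHROPIC_DEFAULT_*` override, then the provider's primary model, then the hardcoded default". A minimal sketch under that assumption follows. `pickDefaultModel`, the `Env` type, and the placeholder model name are hypothetical, and the real functions contain earlier branches that the hunks do not show.

```typescript
type Provider = 'firstParty' | 'openai' | 'gemini' | 'grok' | 'bedrock'
type Env = Record<string, string | undefined>

// Mirrors getProviderPrimaryModel, but reads from an injected env for testability.
function providerPrimaryModel(provider: Provider, env: Env): string | undefined {
  if (provider === 'openai') return env['OPENAI_MODEL']
  if (provider === 'gemini') return env['GEMINI_MODEL']
  if (provider === 'grok') return env['GROK_MODEL']
  return undefined
}

// Hypothetical helper: override env var, then primary model, then hardcoded name.
function pickDefaultModel(
  provider: Provider,
  env: Env,
  overrideVar: string,
  hardcoded: string,
): string {
  return env[overrideVar] || providerPrimaryModel(provider, env) || hardcoded
}

const demoEnv: Env = { OPENAI_MODEL: 'glm-5.1' }
console.log(
  pickDefaultModel('openai', demoEnv, 'ANTHROPIC_DEFAULT_OPUS_MODEL', 'placeholder-opus'),
) // glm-5.1
console.log(
  pickDefaultModel('bedrock', demoEnv, 'ANTHROPIC_DEFAULT_OPUS_MODEL', 'placeholder-opus'),
) // placeholder-opus
```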
--- a/src/utils/performanceShim.ts
+++ b/src/utils/performanceShim.ts
@@ -135,6 +135,9 @@ const shim = {
  clearResourceTimings: (() => {}) as typeof performance.clearResourceTimings,
  setResourceTimingBufferSize:
    (() => {}) as typeof performance.setResourceTimingBufferSize,
+  // Node.js v22's internal undici calls this after every fetch; it must exist to
+  // avoid "TypeError: markResourceTiming is not a function".
+  markResourceTiming: (() => {}) as any,
  // Delegate read-only properties to the original
  get timeOrigin() {
    return original.timeOrigin
@@ -148,7 +151,7 @@ const shim = {
  toJSON() {
    return original.toJSON()
  },
-} as typeof performance
+} as unknown as typeof performance

 /**
 * Install the shim onto globalThis.performance. Safe to call multiple times.
--- a/src/utils/ripgrep.ts
+++ b/src/utils/ripgrep.ts
@@ -4,9 +4,9 @@ import memoize from 'lodash-es/memoize.js'
 import { homedir } from 'os'
 import * as path from 'path'
 import { logEvent } from 'src/services/analytics/index.js'
-import { fileURLToPath } from 'url'
 import { isInBundledMode } from './bundledMode.js'
 import { logForDebugging } from './debug.js'
+import { distRoot } from './distRoot.js'
 import { isEnvDefinedFalsy } from './envUtils.js'
 import { execFileNoThrow } from './execFileNoThrow.js'
 import { findExecutable } from './findExecutable.js'
@@ -14,25 +14,9 @@ import { logError } from './log.js'
 import { getPlatform } from './platform.js'
 import { countCharInString } from './stringUtils.js'

-const __filename = fileURLToPath(import.meta.url)
-// we use node:path.join instead of node:url.resolve because the former doesn't encode spaces
-// In dev mode: __filename = <root>/src/utils/ripgrep.ts → __dirname = <root>/src/utils/
-// In built mode (bun): __filename = <root>/dist/chunk-xxx.js → need <root>/dist/
-// In built mode (vite): __filename = <root>/dist/chunks/chunk-xxx.js → need <root>/dist/
-// Both built modes: the dist root is at <root>/dist/ where dist/vendor/ripgrep/ lives.
 const __dirname = (() => {
-  const dir = path.dirname(__filename)
-  // Test mode: from src/utils/ → project root
-  if (process.env.NODE_ENV === 'test') return path.resolve(dir, '../../../')
-  // Check if we're inside a dist directory at any depth
-  // (dist/ or dist/chunks/) — vendor lives at <dist-root>/vendor/ripgrep/
-  const parts = dir.split(path.sep)
-  const distIdx = parts.lastIndexOf('dist')
-  if (distIdx !== -1) {
-    return parts.slice(0, distIdx + 1).join(path.sep)
-  }
-  // Dev mode: from src/utils/ → src/utils/
-  return dir
+  if (process.env.NODE_ENV === 'test') return path.resolve(distRoot)
+  return distRoot
 })()

 type RipgrepConfig = {
--- a/src/utils/sideQuery.ts
+++ b/src/utils/sideQuery.ts
@@ -33,6 +33,19 @@ import { errorMessage } from './errors.js'
 import { computeFingerprint } from './fingerprint.js'
 import { getAPIProvider } from './model/providers.js'
 import { normalizeModelStringForAPI } from './model/model.js'
+import { getOpenAIClient } from '../services/api/openai/client.js'
+import { getGrokClient } from '../services/api/grok/client.js'
+import {
+  anthropicMessagesToOpenAI,
+  resolveOpenAIModel,
+  anthropicToolsToOpenAI,
+  anthropicToolChoiceToOpenAI,
+  resolveGrokModel,
+  resolveGeminiModel,
+  anthropicToolsToGemini,
+  anthropicToolChoiceToGemini,
+} from '@ant/model-provider'
+import type { SystemPrompt } from './systemPromptType.js'

 type MessageParam = Anthropic.MessageParam
 type TextBlockParam = Anthropic.TextBlockParam
@@ -99,6 +112,46 @@ function extractFirstUserMessageText(messages: MessageParam[]): string {
  return textBlock?.type === 'text' ? textBlock.text : ''
 }

+/**
+ * Extract system prompt text from the `system` option.
+ */
+function extractSystemText(system?: string | TextBlockParam[]): string {
+  if (!system) return ''
+  if (typeof system === 'string') return system
+  return system
+    .filter((b): b is { type: 'text'; text: string } => 'text' in b && !!b.text)
+    .map(b => b.text)
+    .join('\n\n')
+}
+
+/**
+ * Convert Anthropic MessageParam[] to a list of {role, content} objects
+ * suitable for OpenAI-compatible chat.completions APIs.
+ */
+function messageParamsToOpenAIRoleContent(
+  messages: MessageParam[],
+): Array<{ role: 'user' | 'assistant'; content: string }> {
+  const result: Array<{ role: 'user' | 'assistant'; content: string }> = []
+  for (const m of messages) {
+    if (m.role !== 'user' && m.role !== 'assistant') continue
+    const text =
+      typeof m.content === 'string'
+        ? m.content
+        : Array.isArray(m.content)
+          ? m.content
+              .filter(
+                (b): b is { type: 'text'; text: string } => b.type === 'text',
+              )
+              .map(b => b.text)
+              .join('\n')
+          : ''
+    if (text) {
+      result.push({ role: m.role as 'user' | 'assistant', content: text })
+    }
+  }
+  return result
+}
+
 /**
 * Lightweight API wrapper for "side queries" outside the main conversation loop.
 *
@@ -112,6 +165,7 @@ function extractFirstUserMessageText(messages: MessageParam[]): string {
 * - Proper betas for the model
 * - API metadata
 * - Model string normalization (strips [1m] suffix for API)
+ * - Third-party provider routing (OpenAI, Grok, Gemini)
 *
 * @example
 * // Permission explainer
@@ -142,6 +196,14 @@ export async function sideQuery(opts: SideQueryOptions): Promise<BetaMessage> {
    stop_sequences,
  } = opts

+  const provider = getAPIProvider()
+  if (provider === 'openai' || provider === 'grok') {
+    return sideQueryViaOpenAICompatible(opts)
+  }
+  if (provider === 'gemini') {
+    return sideQueryViaGemini(opts)
+  }
+
  const client = await getAnthropicClient({
    maxRetries,
    model,
@@ -198,7 +260,6 @@ export async function sideQuery(opts: SideQueryOptions): Promise<BetaMessage> {
  }

  const normalizedModel = normalizeModelStringForAPI(model)
-  const provider = getAPIProvider()
  const start = Date.now()
  const traceName = `side-query:${opts.querySource}`

@@ -328,3 +389,352 @@ export async function sideQuery(opts: SideQueryOptions): Promise<BetaMessage> {

  return response
 }
+
+/**
+ * OpenAI-compatible side query for OpenAI and Grok providers.
+ * Both use the OpenAI SDK with different base URLs.
+ *
+ * Converts Anthropic-format params to OpenAI Chat Completions, sends a
+ * non-streaming request, and wraps the response back into a BetaMessage
+ * shape so callers remain provider-agnostic.
+ *
+ * Supports tools and tool_choice for structured output (e.g. yoloClassifier,
+ * permissionExplainer).
+ */
+async function sideQueryViaOpenAICompatible(
+  opts: SideQueryOptions,
+): Promise<BetaMessage> {
+  const {
+    model,
+    system,
+    messages,
+    tools,
+    tool_choice,
+    max_tokens = 1024,
+    temperature,
+    signal,
+  } = opts
+
+  const provider = getAPIProvider()
+  const normalizedModel = normalizeModelStringForAPI(model)
+
+  // Resolve model name and client per provider
+  let openaiModel: string
+  // eslint-disable-next-line @typescript-eslint/no-redundant-type-constituents
+  let client: import('openai').default
+  if (provider === 'grok') {
+    openaiModel = resolveGrokModel(normalizedModel)
+    client = getGrokClient({ maxRetries: opts.maxRetries ?? 2 })
+  } else {
+    openaiModel = resolveOpenAIModel(normalizedModel)
+    client = getOpenAIClient({ maxRetries: opts.maxRetries ?? 2 })
+  }
+
+  // Build system prompt text
+  const systemText = extractSystemText(system)
+
+  // Build OpenAI messages: system first, then user/assistant
+  const openaiMessages: Array<{
+    role: 'system' | 'user' | 'assistant'
+    content: string
+  }> = []
+  if (systemText) {
+    openaiMessages.push({ role: 'system', content: systemText })
+  }
+  openaiMessages.push(...messageParamsToOpenAIRoleContent(messages))
+
+  // Convert tools and tool_choice if provided
+  const openaiTools =
+    tools && tools.length > 0
+      ? anthropicToolsToOpenAI(tools as BetaToolUnion[])
+      : undefined
+  const openaiToolChoice = tool_choice
+    ? anthropicToolChoiceToOpenAI(tool_choice)
+    : undefined
+
+  const start = Date.now()
+
+  const requestParams: Record<string, unknown> = {
+    model: openaiModel,
+    messages: openaiMessages,
+    max_tokens,
+  }
+  if (temperature !== undefined) requestParams.temperature = temperature
+  if (openaiTools && openaiTools.length > 0) {
+    requestParams.tools = openaiTools
+    if (openaiToolChoice) requestParams.tool_choice = openaiToolChoice
+  }
+
+  const response = await client.chat.completions.create(
+    requestParams as unknown as import('openai/resources/chat/completions/completions.mjs').ChatCompletionCreateParamsNonStreaming,
+    { signal },
+  )
+
+  const choice = response.choices[0]
+  const message = choice?.message
+
+  // Build content blocks for BetaMessage
+  const contentBlocks: Array<
+    | { type: 'text'; text: string }
+    | { type: 'tool_use'; id: string; name: string; input: unknown }
+  > = []
+
+  if (message?.content) {
+    contentBlocks.push({ type: 'text', text: message.content })
+  }
+
+  if (message?.tool_calls) {
+    for (const tc of message.tool_calls) {
+      // ChatCompletionMessageToolCall is a union — only function-type has .function
+      if (tc.type === 'function' && 'function' in tc) {
+        const fn = (tc as { function: { name: string; arguments: string } })
+          .function
+        contentBlocks.push({
+          type: 'tool_use',
+          id: tc.id ?? `toolu_${Date.now()}`,
+          name: fn.name,
+          input: JSON.parse(fn.arguments || '{}'),
+        })
+      }
+    }
+  }
+
+  const now = Date.now()
+  const requestId = response.id
+  const lastCompletion = getLastApiCompletionTimestamp()
+  logEvent('tengu_api_success', {
+    requestId:
+      requestId as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
+    querySource:
+      opts.querySource as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
+    model:
+      openaiModel as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
+    inputTokens: response.usage?.prompt_tokens ?? 0,
+    outputTokens: response.usage?.completion_tokens ?? 0,
+    cachedInputTokens: 0,
+    uncachedInputTokens: response.usage?.prompt_tokens ?? 0,
+    durationMsIncludingRetries: now - start,
+    timeSinceLastApiCallMs:
+      lastCompletion !== null ? now - lastCompletion : undefined,
+  })
+  setLastApiCompletionTimestamp(now)
+
+  const stopReason =
+    choice?.finish_reason === 'tool_calls'
+      ? 'tool_use'
+      : choice?.finish_reason === 'length'
+        ? 'max_tokens'
+        : 'end_turn'
+
+  return {
+    id: response.id,
+    type: 'message',
+    role: 'assistant',
+    content: contentBlocks as BetaMessage['content'],
+    model: openaiModel,
+    stop_reason: stopReason as BetaMessage['stop_reason'],
+    stop_sequence: null,
+    usage: {
+      input_tokens: response.usage?.prompt_tokens ?? 0,
+      output_tokens: response.usage?.completion_tokens ?? 0,
+    },
+  } as BetaMessage
+}
+
+/**
+ * Gemini side query. Converts Anthropic-format params to Gemini
+ * generateContent format, sends a non-streaming request via fetch,
+ * and wraps the response back into a BetaMessage shape.
+ */
+async function sideQueryViaGemini(
+  opts: SideQueryOptions,
+): Promise<BetaMessage> {
+  const {
+    model,
+    system,
+    messages,
+    tools,
+    tool_choice,
+    max_tokens = 1024,
+    temperature,
+    signal,
+  } = opts
+
+  const normalizedModel = normalizeModelStringForAPI(model)
+  const geminiModel = resolveGeminiModel(normalizedModel)
+
+  // Build Gemini contents from Anthropic MessageParam[]
+  const contents: Array<{
+    role: 'user' | 'model'
+    parts: Array<{ text: string }>
+  }> = []
+  for (const m of messages) {
+    if (m.role !== 'user' && m.role !== 'assistant') continue
+    const text =
+      typeof m.content === 'string'
+        ? m.content
+        : Array.isArray(m.content)
+          ? m.content
+              .filter(
+                (b): b is { type: 'text'; text: string } => b.type === 'text',
+              )
+              .map(b => b.text)
+              .join('\n')
+          : ''
+    if (text) {
+      contents.push({
+        role: m.role === 'assistant' ? 'model' : 'user',
+        parts: [{ text }],
+      })
+    }
+  }
+
+  // Build system instruction
+  const systemText = extractSystemText(system)
+  const systemInstruction = systemText
+    ? { parts: [{ text: systemText }] }
+    : undefined
+
+  // Convert tools and tool_choice
+  const geminiTools =
+    tools && tools.length > 0
+      ? anthropicToolsToGemini(tools as BetaToolUnion[])
+      : undefined
+  const geminiToolConfig = tool_choice
+    ? anthropicToolChoiceToGemini(tool_choice)
+    : undefined
+
+  const baseUrl = (
+    process.env.GEMINI_BASE_URL ||
+    'https://generativelanguage.googleapis.com/v1beta'
+  ).replace(/\/+$/, '')
+  const modelPath = geminiModel.startsWith('models/')
+    ? geminiModel
+    : `models/${geminiModel}`
+  const url = `${baseUrl}/${modelPath}:generateContent`
+
+  const body: Record<string, unknown> = {
+    contents,
+    ...(systemInstruction && { systemInstruction }),
+    ...(geminiTools && geminiTools.length > 0 && { tools: geminiTools }),
+    ...(geminiToolConfig && {
+      toolConfig: { functionCallingConfig: geminiToolConfig },
+    }),
+    ...(temperature !== undefined && {
+      generationConfig: { temperature },
+    }),
+    ...(max_tokens !== undefined && {
+      generationConfig: {
+        ...(temperature !== undefined && { temperature }),
+        maxOutputTokens: max_tokens,
+      },
+    }),
+  }
+
+  // Merge generationConfig if both temperature and max_tokens are set
+  if (temperature !== undefined && max_tokens !== undefined) {
+    body.generationConfig = { temperature, maxOutputTokens: max_tokens }
+  }
+
+  const start = Date.now()
+
+  const res = await fetch(url, {
+    method: 'POST',
+    headers: {
+      'Content-Type': 'application/json',
+      'x-goog-api-key': process.env.GEMINI_API_KEY || '',
+    },
+    body: JSON.stringify(body),
+    signal,
+  })
+
+  if (!res.ok) {
+    const errorBody = await res.text()
+    throw new Error(
+      `Gemini API request failed (${res.status} ${res.statusText}): ${errorBody || 'empty response body'}`,
+    )
+  }
+
+  const geminiResponse = (await res.json()) as {
+    candidates?: Array<{
+      content?: {
+        role?: string
+        parts?: Array<{
+          text?: string
+          functionCall?: { name?: string; args?: Record<string, unknown> }
+        }>
+      }
+      finishReason?: string
+    }>
+    usageMetadata?: {
+      promptTokenCount?: number
+      candidatesTokenCount?: number
+      totalTokenCount?: number
+    }
+    id?: string
+  }
+
+  // Build content blocks from Gemini response
+  const contentBlocks: Array<
+    | { type: 'text'; text: string }
+    | { type: 'tool_use'; id: string; name: string; input: unknown }
+  > = []
+
+  const candidate = geminiResponse.candidates?.[0]
+  const parts = candidate?.content?.parts
+  if (parts) {
+    for (const part of parts) {
+      if (part.text) {
+        contentBlocks.push({ type: 'text', text: part.text })
+      }
+      if (part.functionCall) {
+        contentBlocks.push({
+          type: 'tool_use',
+          id: `toolu_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`,
+          name: part.functionCall.name ?? '',
+          input: part.functionCall.args ?? {},
+        })
+      }
+    }
+  }
+
+  const now = Date.now()
+  const lastCompletion = getLastApiCompletionTimestamp()
+  logEvent('tengu_api_success', {
+    requestId: (geminiResponse.id ??
+      '') as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
+    querySource:
+      opts.querySource as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
+    model:
+      geminiModel as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
+    inputTokens: geminiResponse.usageMetadata?.promptTokenCount ?? 0,
+    outputTokens: geminiResponse.usageMetadata?.candidatesTokenCount ?? 0,
+    cachedInputTokens: 0,
+    uncachedInputTokens: geminiResponse.usageMetadata?.promptTokenCount ?? 0,
+    durationMsIncludingRetries: now - start,
+    timeSinceLastApiCallMs:
+      lastCompletion !== null ? now - lastCompletion : undefined,
+  })
+  setLastApiCompletionTimestamp(now)
+
+  const stopReason =
+    candidate?.finishReason === 'STOP'
+      ? 'end_turn'
+      : candidate?.finishReason === 'MAX_TOKENS'
+        ? 'max_tokens'
+        : 'end_turn'
+
+  return {
+    id: geminiResponse.id ?? `gemini_${Date.now()}`,
+    type: 'message',
+    role: 'assistant',
+    content: contentBlocks as BetaMessage['content'],
+    model: geminiModel,
+    stop_reason: stopReason as BetaMessage['stop_reason'],
+    stop_sequence: null,
+    usage: {
+      input_tokens: geminiResponse.usageMetadata?.promptTokenCount ?? 0,
+      output_tokens: geminiResponse.usageMetadata?.candidatesTokenCount ?? 0,
+    },
+  } as BetaMessage
+}
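
Review note: both new side-query paths flatten Anthropic-style content to plain text and drop messages that end up empty, so `tool_use` and `tool_result` blocks in the input history are not forwarded. A reduced sketch of that behavior follows. The `Block` and `Param` types and the `flattenText` / `toRoleContent` names are hypothetical stand-ins for the SDK types and for `messageParamsToOpenAIRoleContent`.

```typescript
// Hypothetical, reduced block and message shapes.
type TextBlock = { type: 'text'; text: string }
type ToolUseBlock = { type: 'tool_use'; id: string; name: string; input: unknown }
type Block = TextBlock | ToolUseBlock
type Param = { role: 'user' | 'assistant'; content: string | Block[] }

// Keep only text blocks and join them with newlines.
function flattenText(content: string | Block[]): string {
  if (typeof content === 'string') return content
  return content
    .filter((b): b is TextBlock => b.type === 'text')
    .map(b => b.text)
    .join('\n')
}

// Convert to {role, content} pairs, skipping messages with no text left.
function toRoleContent(
  messages: Param[],
): Array<{ role: 'user' | 'assistant'; content: string }> {
  const out: Array<{ role: 'user' | 'assistant'; content: string }> = []
  for (const m of messages) {
    const text = flattenText(m.content)
    if (text) out.push({ role: m.role, content: text })
  }
  return out
}

const converted = toRoleContent([
  { role: 'user', content: 'hello' },
  {
    role: 'assistant',
    content: [
      { type: 'text', text: 'a' },
      { type: 'tool_use', id: 't1', name: 'lookup', input: {} },
      { type: 'text', text: 'b' },
    ],
  },
  { role: 'assistant', content: [{ type: 'tool_use', id: 't2', name: 'lookup', input: {} }] },
])
console.log(converted.length) // 2
```

Since the same flattening loop appears in both `sideQueryViaOpenAICompatible` (via the helper) and `sideQueryViaGemini` (inline), sharing one text-extraction helper between them might be a reasonable follow-up.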
--- a/src/utils/swarm/backends/WindowsTerminalBackend.ts
+++ b/src/utils/swarm/backends/WindowsTerminalBackend.ts
@@ -1,5 +1,5 @@
 import { randomUUID } from 'crypto'
-import { readFile } from 'fs/promises'
+import { readFile, unlink } from 'fs/promises'
 import { join } from 'path'
 import { tmpdir } from 'os'
 import type { AgentColorName } from '@claude-code-best/builtin-tools/tools/AgentTool/agentColorManager.js'
@@ -13,10 +13,15 @@ import type { CreatePaneResult, PaneBackend, PaneId } from './types.js'
 type CommandResult = { stdout: string; stderr: string; code: number }
 type CommandRunner = (command: string, args: string[]) => Promise<CommandResult>

+type PaneStatus = 'registered' | 'spawning' | 'ready' | 'killing' | 'dead'
+
 type WindowsTerminalPane = {
  title: string
  mode: 'pane' | 'window'
  pidFile: string
+  status: PaneStatus
+  pid?: number
+  spawnPromise?: Promise<void>
 }

 function quotePowerShellString(value: string): string {
@@ -39,8 +44,42 @@ function wrapPowerShellCommand(command: string, pidFile: string): string {
  ].join('; ')
 }

-function makePidFile(paneId: string): string {
-  return join(tmpdir(), `${paneId.replace(/[^a-zA-Z0-9_-]/g, '-')}.pid`)
+const WT_PANE_TIMEOUT_DEFAULT_MS = 8000
+const WT_PANE_POLL_INTERVAL_MS = 200
+
+function getWtPaneTimeoutMs(): number {
+  const raw = process.env.CLAUDE_WT_PANE_TIMEOUT_MS
+  if (!raw) return WT_PANE_TIMEOUT_DEFAULT_MS
+  const parsed = Number.parseInt(raw, 10)
+  return Number.isFinite(parsed) && parsed > 0
+    ? parsed
+    : WT_PANE_TIMEOUT_DEFAULT_MS
+}
+
+async function waitForPidFile(
+  pidFile: string,
+  timeoutMs: number,
+): Promise<number> {
+  const deadline = Date.now() + timeoutMs
+  let lastErr: unknown
+  while (Date.now() < deadline) {
+    try {
+      const content = (await readFile(pidFile, 'utf-8')).trim()
+      if (!/^\d+$/.test(content)) {
+        lastErr = new Error(
+          `pidFile content not a valid pid: ${JSON.stringify(content)}`,
+        )
+      } else {
+        const pid = Number.parseInt(content, 10)
+        if (Number.isFinite(pid) && pid > 0) return pid
+        lastErr = new Error(`pidFile content parsed to invalid pid: ${pid}`)
+      }
+    } catch (err) {
+      lastErr = err
+    }
+    await new Promise(r => setTimeout(r, WT_PANE_POLL_INTERVAL_MS))
+  }
+  throw lastErr ?? new Error('pidFile never appeared')
 }

 /**
@@ -58,10 +97,40 @@ export class WindowsTerminalBackend implements PaneBackend {

  private panes = new Map<PaneId, WindowsTerminalPane>()

+  private readonly runCommand: CommandRunner
+  private readonly getPlatformValue: () => Platform
+  private readonly pidFileDir: string
+
  constructor(
-    private readonly runCommand: CommandRunner = execFileNoThrow,
-    private readonly getPlatformValue: () => Platform = getPlatform,
-  ) {}
+    runCommandOrOptions?:
+      | CommandRunner
+      | {
+          runCommand?: CommandRunner
+          getPlatform?: () => Platform
+          pidFileDir?: string
+        },
+    getPlatformValue?: () => Platform,
+  ) {
+    if (
+      typeof runCommandOrOptions === 'function' ||
+      runCommandOrOptions === undefined
+    ) {
+      this.runCommand = runCommandOrOptions ?? execFileNoThrow
+      this.getPlatformValue = getPlatformValue ?? getPlatform
+      this.pidFileDir = tmpdir()
+    } else {
+      this.runCommand = runCommandOrOptions.runCommand ?? execFileNoThrow
+      this.getPlatformValue = runCommandOrOptions.getPlatform ?? getPlatform
+      this.pidFileDir = runCommandOrOptions.pidFileDir ?? tmpdir()
+    }
+  }
+
+  private makePidFile(paneId: string): string {
+    return join(
+      this.pidFileDir,
+      `${paneId.replace(/[^a-zA-Z0-9_-]/g, '-')}.pid`,
+    )
+  }

  async isAvailable(): Promise<boolean> {
    if (this.getPlatformValue() !== 'windows') {
@@ -92,7 +161,8 @@ export class WindowsTerminalBackend implements PaneBackend {
    this.panes.set(paneId, {
      title: name,
      mode: 'pane',
-      pidFile: makePidFile(paneId),
+      pidFile: this.makePidFile(paneId),
+      status: 'registered',
    })
    return { paneId, isFirstTeammate }
  }
@@ -106,7 +176,8 @@ export class WindowsTerminalBackend implements PaneBackend {
    this.panes.set(paneId, {
      title: name,
      mode: 'window',
-      pidFile: makePidFile(paneId),
+      pidFile: this.makePidFile(paneId),
+      status: 'registered',
    })
    return { paneId, isFirstTeammate: false, windowName }
  }
@@ -121,32 +192,95 @@ export class WindowsTerminalBackend implements PaneBackend {
      throw new Error(`Unknown Windows Terminal pane id: ${paneId}`)
    }

-    const launcher = wrapPowerShellCommand(command, pane.pidFile)
-    // wt.exe treats ';' as its own command separator, which breaks
-    // multi-statement PowerShell commands passed via -Command. Encode the
-    // entire script as Base64 UTF-16LE and use -EncodedCommand instead.
-    const encoded = Buffer.from(launcher, 'utf16le').toString('base64')
-    const args =
-      pane.mode === 'window'
-        ? ['-w', '-1', 'new-tab', '--title', pane.title]
-        : ['-w', '0', 'split-pane', '--vertical', '--title', pane.title]
-
-    const result = await this.runCommand('wt.exe', [
-      ...args,
-      'powershell.exe',
-      '-NoLogo',
-      '-NoProfile',
-      '-ExecutionPolicy',
-      'Bypass',
-      '-EncodedCommand',
-      encoded,
-    ])
-
-    if (result.code !== 0) {
+    // Reject re-spawn on a ready pane (avoids two processes racing on the same pidFile)
+    if (pane.status === 'ready' || pane.status === 'killing') {
      throw new Error(
-        `Failed to launch Windows Terminal teammate ${paneId}: ${result.stderr}`,
+        `Pane ${paneId} already spawned (status=${pane.status}); create a new pane to re-launch`,
      )
    }
+    if (pane.status === 'spawning') {
+      throw new Error(
+        `Pane ${paneId} is currently spawning; wait for the in-flight launch to complete`,
+      )
+    }
+    if (pane.status === 'dead') {
+      throw new Error(`Pane ${paneId} is dead; create a new pane`)
+    }
+    // pane.status === 'registered' → proceed
+
+    // Assign spawnPromise before any await (wrapped in an inner Promise).
+    // Attach a no-op .catch() immediately to prevent unhandled rejection warnings
+    // in case killPane never awaits spawnPromise (e.g. sendCommandToPane fails
+    // before killPane is called).
+    let resolveSpawn!: () => void
+    let rejectSpawn!: (err: unknown) => void
+    const spawnPromise = new Promise<void>((res, rej) => {
+      resolveSpawn = res
+      rejectSpawn = rej
+    })
+    // Silence unhandled-rejection: killPane may .catch() this later, but if
+    // the pane dies before any kill is attempted, the rejection must not leak.
+    spawnPromise.catch(() => {})
+    pane.status = 'spawning'
+    pane.spawnPromise = spawnPromise
+
+    try {
+      const launcher = wrapPowerShellCommand(command, pane.pidFile)
+      // wt.exe treats ';' as its own command separator, which breaks
+      // multi-statement PowerShell commands passed via -Command. Encode the
+      // entire script as Base64 UTF-16LE and use -EncodedCommand instead.
+      const encoded = Buffer.from(launcher, 'utf16le').toString('base64')
+      const args =
+        pane.mode === 'window'
+          ? ['-w', '-1', 'new-tab', '--title', pane.title]
+          : ['-w', '0', 'split-pane', '--vertical', '--title', pane.title]
+
+      await unlink(pane.pidFile).catch(() => {})
+
+      const result = await this.runCommand('wt.exe', [
+        ...args,
+        'powershell.exe',
+        '-NoLogo',
+        '-NoProfile',
+        '-ExecutionPolicy',
+        'Bypass',
+        '-EncodedCommand',
+        encoded,
+      ])
+
+      if (result.code !== 0) {
+        throw new Error(
+          `Failed to launch Windows Terminal teammate ${paneId}: ${result.stderr}`,
+        )
+      }
+
+      const timeoutMs = getWtPaneTimeoutMs()
+      let pid: number
+      try {
+        pid = await waitForPidFile(pane.pidFile, timeoutMs)
+      } catch (err) {
+        throw new Error(
+          `Windows Terminal pane failed to launch within ${timeoutMs}ms\n` +
+            `  paneId: ${paneId}\n` +
+            `  pidFile: ${pane.pidFile}\n` +
+            `  wt.exe stdout: ${result.stdout || '(empty)'}\n` +
+            `  wt.exe stderr: ${result.stderr || '(empty)'}\n` +
+            `  underlying: ${err instanceof Error ? err.message : String(err)}\n` +
+            `  override timeout via env CLAUDE_WT_PANE_TIMEOUT_MS`,
+        )
+      }
+
+      pane.pid = pid
+      pane.status = 'ready'
+      resolveSpawn()
+    } catch (err) {
+      pane.status = 'dead'
+      pane.pid = undefined
+      rejectSpawn(err)
+      throw err
+    } finally {
+      pane.spawnPromise = undefined
+    }
  }

  async setPaneBorderColor(
@@ -189,26 +323,69 @@ export class WindowsTerminalBackend implements PaneBackend {
      return false
    }

-    let pid: number
-    try {
-      pid = Number.parseInt((await readFile(pane.pidFile, 'utf-8')).trim(), 10)
-    } catch {
+    // 1. Resolve the kill-while-spawn race: await the spawn (whether it succeeds or fails)
+    if (pane.status === 'spawning' && pane.spawnPromise) {
+      await pane.spawnPromise.catch(() => {})
+    }
+
+    // 2. TOCTOU fix: re-read status/pid
+    if (pane.status === 'dead') {
      this.panes.delete(paneId)
      return false
    }
-
-    if (!Number.isFinite(pid)) {
-      this.panes.delete(paneId)
+    if (pane.status !== 'ready') {
+      // Still in some other non-terminal state (unreachable in theory; kept as a safeguard)
      return false
    }

+    pane.status = 'killing'
+
+    // 3. Prefer the cached pid
+    let pid: number | undefined = pane.pid
+
+    // 4. Fallback: if no pid is cached, read it from disk (keeps the 3×500ms retry)
+    if (pid === undefined) {
+      let pidContent: string | null = null
+      for (let attempt = 0; attempt < 3; attempt++) {
+        try {
+          pidContent = (await readFile(pane.pidFile, 'utf-8')).trim()
+          break
+        } catch {
+          if (attempt === 2) {
+            pane.status = 'dead'
+            this.panes.delete(paneId)
+            return false
+          }
+          await new Promise(r => setTimeout(r, 500))
+        }
+      }
+      if (!pidContent || !/^\d+$/.test(pidContent)) {
+        pane.status = 'dead'
+        this.panes.delete(paneId)
+        return false
+      }
+      const parsed = Number.parseInt(pidContent, 10)
+      if (!Number.isFinite(parsed) || parsed <= 0) {
+        pane.status = 'dead'
+        this.panes.delete(paneId)
+        return false
+      }
+      pid = parsed
+    }
+
+    // 5. Run Stop-Process
    const result = await this.runCommand('powershell.exe', [
      '-NoLogo',
      '-NoProfile',
      '-Command',
      `Stop-Process -Id ${pid} -Force -ErrorAction Stop`,
    ])
+
+    // 6. Whether or not it succeeds: clear the cached pid, mark dead, and remove from the map (guards against killing a reused PID)
+    pane.pid = undefined
+    pane.status = 'dead'
    this.panes.delete(paneId)
+
    logForDebugging(
      `[WindowsTerminalBackend] killPane ${paneId} pid=${pid} code=${result.code}`,
    )
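
The fallback path in `killPane` above validates the pidFile content in three steps (digits only, finite, greater than zero). A minimal sketch of that check as a standalone function follows; `parsePid` is a hypothetical name used for illustration and does not appear in the diff.

```typescript
// Hypothetical helper (not part of the diff): strict pid parsing that mirrors
// the three checks in killPane's fallback path.
function parsePid(raw: string): number | undefined {
  const text = raw.trim()
  // Digits only: rejects "123abc", "-5", "1.5", and the empty string.
  if (!/^\d+$/.test(text)) return undefined
  const pid = Number.parseInt(text, 10)
  // Reject 0 and any non-finite result.
  if (!Number.isFinite(pid) || pid <= 0) return undefined
  return pid
}

console.log(parsePid('54321\r\n')) // trailing newline is trimmed
console.log(parsePid('123abc')) // rejected by the digits-only check
```

The regex check matters because `Number.parseInt('123abc', 10)` on its own returns `123`, so a partially written or corrupted pidFile could otherwise be adopted as a valid pid.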
--- a/src/utils/swarm/backends/tests/WindowsTerminalBackend.test.ts
+++ b/src/utils/swarm/backends/tests/WindowsTerminalBackend.test.ts
@@ -14,20 +14,43 @@ beforeEach(async () => {
    `windows-terminal-backend-${Date.now()}-${Math.random().toString(16).slice(2)}`,
  )
  await mkdir(tempDir, { recursive: true })
+  process.env.CLAUDE_WT_PANE_TIMEOUT_MS = '2000'
 })

 afterEach(async () => {
  await rm(tempDir, { recursive: true, force: true })
+  delete process.env.CLAUDE_WT_PANE_TIMEOUT_MS
 })

-function createBackend(calls: Call[]): WindowsTerminalBackend {
-  return new WindowsTerminalBackend(
-    async (command, args) => {
+function createBackend(
+  calls: Call[],
+  opts: { simulatePidWrite?: boolean | number } = {},
+): WindowsTerminalBackend {
+  const simulate = opts.simulatePidWrite !== false
+  const delayMs =
+    typeof opts.simulatePidWrite === 'number' ? opts.simulatePidWrite : 30
+  return new WindowsTerminalBackend({
+    runCommand: async (command, args) => {
      calls.push({ command, args })
+      if (simulate && command === 'wt.exe') {
+        const encIdx = args.indexOf('-EncodedCommand')
+        if (encIdx >= 0) {
+          const decoded = Buffer.from(args[encIdx + 1]!, 'base64').toString(
+            'utf16le',
+          )
+          const match = decoded.match(/Set-Content -LiteralPath '([^']+)'/)
+          if (match) {
+            setTimeout(() => {
+              writeFile(match[1]!, '54321', 'utf-8').catch(() => {})
+            }, delayMs)
+          }
+        }
+      }
      return { stdout: 'ok', stderr: '', code: 0 }
    },
-    () => 'windows',
-  )
+    getPlatform: () => 'windows',
+    pidFileDir: tempDir,
+  })
 }

 function decodeEncodedCommand(call: Call): {
@@ -78,25 +101,236 @@ describe('WindowsTerminalBackend', () => {
    expect(args.join(' ')).toContain('-w -1 new-tab --title')
  })

-  test('force kills the recorded teammate shell pid when available', async () => {
+  test('force kills the cached pid from sendCommandToPane without reading pidFile', async () => {
    const calls: Call[] = []
    const backend = createBackend(calls)
    const pane = await backend.createTeammatePaneInSwarmView('killer', 'red')

+    // sendCommandToPane resolves — simulate writes '54321' to pidFile, which
+    // becomes pane.pid. killPane should use the cached pid, not re-read the file.
    await backend.sendCommandToPane(pane.paneId, "Write-Output 'running'")
-    const { decodedLauncher } = decodeEncodedCommand(calls[0]!)
-    const pidFile = decodedLauncher.match(
-      /Set-Content -LiteralPath '([^']+)'/,
-    )?.[1]
-    expect(pidFile).toBeString()
-    await writeFile(pidFile!, '12345', 'utf-8')

    const killed = await backend.killPane(pane.paneId)

    expect(killed).toBe(true)
    expect(calls[calls.length - 1]!.command).toBe('powershell.exe')
    expect(calls[calls.length - 1]!.args.join(' ')).toContain(
-      'Stop-Process -Id 12345',
+      'Stop-Process -Id 54321',
    )
  })
+
+  test('throws a diagnostic error when pidFile never appears within timeout', async () => {
+    process.env.CLAUDE_WT_PANE_TIMEOUT_MS = '300'
+    const calls: Call[] = []
+    const backend = createBackend(calls, { simulatePidWrite: false })
+    const pane = await backend.createTeammatePaneInSwarmView('slowpane', 'blue')
+    let caught: unknown
+    try {
+      await backend.sendCommandToPane(pane.paneId, "Write-Output 'x'")
+    } catch (err) {
+      caught = err
+    }
+    expect(caught).toBeInstanceOf(Error)
+    expect((caught as Error).message).toMatch(
+      /Windows Terminal pane failed to launch within 300ms/,
+    )
+  })
+
+  test('error message includes paneId pidFile and override hint', async () => {
+    process.env.CLAUDE_WT_PANE_TIMEOUT_MS = '250'
+    const calls: Call[] = []
+    const backend = createBackend(calls, { simulatePidWrite: false })
+    const pane = await backend.createTeammatePaneInSwarmView(
+      'diagpane',
+      'green',
+    )
+    let caught: unknown
+    try {
+      await backend.sendCommandToPane(pane.paneId, "Write-Output 'x'")
+    } catch (err) {
+      caught = err
+    }
+    expect(caught).toBeInstanceOf(Error)
+    const msg = (caught as Error).message
+    expect(msg).toContain(pane.paneId)
+    expect(msg).toContain('CLAUDE_WT_PANE_TIMEOUT_MS')
+  })
+
+  test('unlinks stale pidFile so a stale pid is not adopted', async () => {
+    const calls: Call[] = []
+    const backend = createBackend(calls, { simulatePidWrite: 30 })
+    const pane = await backend.createTeammatePaneInSwarmView('stale', 'pink')
+    // pidFile path is deterministic: <tempDir>/<sanitized paneId>.pid
+    const stalePidFile = join(
+      tempDir,
+      `${pane.paneId.replace(/[^a-zA-Z0-9_-]/g, '-')}.pid`,
+    )
+    // Pre-seed stale content. If sendCommandToPane did NOT unlink, waitForPidFile
+    // would immediately accept '99999' and cache it as pane.pid. With unlink,
+    // simulate's '54321' is the value killPane sees.
+    await writeFile(stalePidFile, '99999', 'utf-8')
+
+    await backend.sendCommandToPane(pane.paneId, "Write-Output 'x'")
+    const killed = await backend.killPane(pane.paneId)
+    expect(killed).toBe(true)
+    expect(calls[calls.length - 1]!.args.join(' ')).toContain(
+      'Stop-Process -Id 54321',
+    )
+  })
+
+  test('rejects re-spawn on a ready pane', async () => {
+    const calls: Call[] = []
+    const backend = createBackend(calls)
+    const pane = await backend.createTeammatePaneInSwarmView('reentry', 'cyan')
+    await backend.sendCommandToPane(pane.paneId, "Write-Output 'first'")
+    // pane.status === 'ready' now. Second sendCommandToPane must throw.
+    let caught: unknown
+    try {
+      await backend.sendCommandToPane(pane.paneId, "Write-Output 'second'")
+    } catch (err) {
+      caught = err
+    }
+    expect(caught).toBeInstanceOf(Error)
+    expect((caught as Error).message).toMatch(/already spawned/)
+  })
+
+  test('throws on unknown paneId in sendCommandToPane', async () => {
+    const calls: Call[] = []
+    const backend = createBackend(calls)
+    let caught: unknown
+    try {
+      await backend.sendCommandToPane('wt-nonexistent', "Write-Output 'x'")
+    } catch (err) {
+      caught = err
+    }
+    expect(caught).toBeInstanceOf(Error)
+    expect((caught as Error).message).toContain('Unknown Windows Terminal pane')
+  })
+
+  test('rejects corrupted pidFile content ("123abc") and times out', async () => {
+    process.env.CLAUDE_WT_PANE_TIMEOUT_MS = '400'
+    const calls: Call[] = []
+    // Custom runner writes invalid pid content (not all digits).
+    const backend = new WindowsTerminalBackend({
+      runCommand: async (command, args) => {
+        calls.push({ command, args })
+        if (command === 'wt.exe') {
+          const encIdx = args.indexOf('-EncodedCommand')
+          if (encIdx >= 0) {
+            const decoded = Buffer.from(args[encIdx + 1]!, 'base64').toString(
+              'utf16le',
+            )
+            const match = decoded.match(/Set-Content -LiteralPath '([^']+)'/)
+            if (match) {
+              setTimeout(() => {
+                writeFile(match[1]!, '123abc', 'utf-8').catch(() => {})
+              }, 30)
+            }
+          }
+        }
+        return { stdout: 'ok', stderr: '', code: 0 }
+      },
+      getPlatform: () => 'windows',
+      pidFileDir: tempDir,
+    })
+    const pane = await backend.createTeammatePaneInSwarmView('corrupt', 'red')
+    let caught: unknown
+    try {
+      await backend.sendCommandToPane(pane.paneId, "Write-Output 'x'")
+    } catch (err) {
+      caught = err
+    }
+    expect(caught).toBeInstanceOf(Error)
+    // Inner error from waitForPidFile must reach the wrapped diagnostic message.
+    const msg = (caught as Error).message
+    expect(msg).toMatch(/failed to launch within 400ms/)
+    expect(msg).toMatch(/not a valid pid|invalid pid|123abc/)
+  })
+
+  test('killPane awaits in-flight spawn before killing (kill-while-spawn race)', async () => {
+    // simulatePidWrite: 800ms — sendCommandToPane stays in waitForPidFile for ~800ms.
+    process.env.CLAUDE_WT_PANE_TIMEOUT_MS = '3000'
+    const calls: Call[] = []
+    const backend = createBackend(calls, { simulatePidWrite: 800 })
+    const pane = await backend.createTeammatePaneInSwarmView('racy', 'blue')
+
+    // Start spawn but don't await it yet.
+    const spawnP = backend.sendCommandToPane(pane.paneId, "Write-Output 'x'")
+    // 50ms later, call killPane — pane is still 'spawning', killPane must
+    // await spawnPromise (which resolves at ~800ms when simulate writes pid 54321),
+    // then kill using the cached pid.
+    await new Promise(r => setTimeout(r, 50))
+    const killP = backend.killPane(pane.paneId)
+
+    // Both must resolve cleanly.
+    await spawnP
+    const killed = await killP
+    expect(killed).toBe(true)
+    // The kill must target the freshly-spawned pid (54321), not have used a
+    // stale-or-missing fallback path.
+    const killCall = calls[calls.length - 1]!
+    expect(killCall.command).toBe('powershell.exe')
+    expect(killCall.args.join(' ')).toContain('Stop-Process -Id 54321')
+  })
+
+  test('Stop-Process failure clears cached pid and marks pane dead', async () => {
+    const calls: Call[] = []
+    // Runner returns code 1 only for powershell.exe (kill); wt.exe succeeds.
+    const backend = new WindowsTerminalBackend({
+      runCommand: async (command, args) => {
+        calls.push({ command, args })
+        if (command === 'wt.exe') {
+          const encIdx = args.indexOf('-EncodedCommand')
+          if (encIdx >= 0) {
+            const decoded = Buffer.from(args[encIdx + 1]!, 'base64').toString(
+              'utf16le',
+            )
+            const match = decoded.match(/Set-Content -LiteralPath '([^']+)'/)
+            if (match) {
+              setTimeout(() => {
+                writeFile(match[1]!, '54321', 'utf-8').catch(() => {})
+              }, 30)
+            }
+          }
+          return { stdout: 'ok', stderr: '', code: 0 }
+        }
+        // powershell Stop-Process fails
+        return { stdout: '', stderr: 'access denied', code: 1 }
+      },
+      getPlatform: () => 'windows',
+      pidFileDir: tempDir,
+    })
+    const pane = await backend.createTeammatePaneInSwarmView('dier', 'orange')
+    await backend.sendCommandToPane(pane.paneId, "Write-Output 'x'")
+
+    const killed = await backend.killPane(pane.paneId)
+    expect(killed).toBe(false) // Stop-Process exit 1 → false
+
+    // After kill failure, pane is removed from map: second killPane → false (not retry).
+    const killedAgain = await backend.killPane(pane.paneId)
+    expect(killedAgain).toBe(false)
+    // Critically: only ONE powershell call happened — the second killPane returned
+    // false from "pane not in map", not from another Stop-Process attempt.
+    const psCalls = calls.filter(c => c.command === 'powershell.exe')
+    expect(psCalls.length).toBe(1)
+  })
+
+  test('killPane uses cached pid and returns false when pane is unknown', async () => {
+    const calls: Call[] = []
+    const backend = createBackend(calls, { simulatePidWrite: 30 })
+    const pane = await backend.createTeammatePaneInSwarmView('cached', 'yellow')
+    await backend.sendCommandToPane(pane.paneId, "Write-Output 'x'")
+
+    // After sendCommandToPane, pane.pid = 54321 (from simulate). killPane must
+    // use this cached pid without reading the pidFile at all.
+    const killed = await backend.killPane(pane.paneId)
+    expect(killed).toBe(true)
+    expect(calls[calls.length - 1]!.args.join(' ')).toContain(
+      'Stop-Process -Id 54321',
+    )
+
+    // After kill, pane is removed — a second killPane must return false.
+    const killedAgain = await backend.killPane(pane.paneId)
+    expect(killedAgain).toBe(false)
+  })
 })
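
The test helpers above recover the pidFile path by decoding the `-EncodedCommand` argument. A small round-trip sketch of that encoding follows. The launcher string here is a made-up example, since the real output of `wrapPowerShellCommand` is not shown in this part of the diff, and `encodePowerShell` / `decodePowerShell` are hypothetical names.

```typescript
// Encode a PowerShell script the way the diff does for -EncodedCommand:
// UTF-16LE bytes, then Base64.
function encodePowerShell(script: string): string {
  return Buffer.from(script, 'utf16le').toString('base64')
}

// Inverse of encodePowerShell, as used by the test helpers.
function decodePowerShell(encoded: string): string {
  return Buffer.from(encoded, 'base64').toString('utf16le')
}

// Hypothetical launcher text, for illustration only.
const launcher =
  "Set-Content -LiteralPath 'C:\\Temp\\wt-1.pid' -Value $PID; Write-Output 'running'"

const encoded = encodePowerShell(launcher)
// Same regex the tests use to locate the pidFile path.
const pidFile = decodePowerShell(encoded).match(
  /Set-Content -LiteralPath '([^']+)'/,
)?.[1]

console.log(encoded.includes(';'), pidFile)
```

Because the Base64 alphabet contains no `;`, the encoded script passes through `wt.exe` without being split at its command separator, which is the reason the source gives for using `-EncodedCommand`.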
--- a/tsconfig.json
+++ b/tsconfig.json
@@ -32,5 +32,5 @@
    "packages/**/*.ts",
    "packages/**/*.tsx"
  ],
-  "exclude": ["node_modules"]
+  "exclude": ["node_modules", "packages/remote-control-server/web"]
 }
--- a/vite.config.ts
+++ b/vite.config.ts
@@ -93,9 +93,12 @@ export default defineConfig({

      output: {
        format: 'es',
-        // Single-file build: no code splitting, all dynamic imports inlined
-        codeSplitting: false,
+        // Code splitting: Bun/JSC parses the entire single-file bundle eagerly,
+        // consuming ~1 GB RSS for a 17 MB output (vs ~220 MB on Node/V8 which
+        // lazy-parses). Splitting into chunks allows Bun to load modules on demand,
+        // bringing RSS down to ~300 MB.
        entryFileNames: 'cli.js',
+        chunkFileNames: 'chunks/[name]-[hash].js',
      },

      plugins: [
Author	SHA1	Message	Date
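claude-code-best	b62b384e36	fix: normalizeMessagesForAPI no longer merges same-ID assistant messages across a tool_result boundary (CC-1215) In ACP mode, when extended thinking and tool_use occur in the same turn, StreamingToolExecutor inserts a tool_result between two AssistantMessages that share a message.id. The backward-walking merge then crossed that boundary, producing duplicate tool_use IDs → orphaned tool_result → consecutive user messages → 400. The stop condition of the backward walk is changed: it now stops at any non-assistant message (including tool_result) instead of skipping it.	2026-06-04 15:41:41 +08:00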
claude-code-best	d7001b870f	fix: add markResourceTiming polyfill to performance shim for Node.js v22 undici compatibility Node.js v22 undici internal calls performance.markResourceTiming() after every fetch. The performance shim was missing this method, causing TypeError crashes in ACP mode when running with Node.js.	2026-06-04 14:30:34 +08:00
claude-code-best	18437c20d2	fix: prevent crash when DiscoverSkills receives undefined query via ExecuteExtraTool searchSkills() called .trim() on query without null-guard. When DiscoverSkills is invoked through ExecuteExtraTool with missing description, query is undefined, causing 'Cannot read properties of undefined (reading trim)'. Fixed with optional chaining: !query.trim() → !query?.trim() Co-Authored-By: deepseek-v4-pro <deepseek-ai@claude-code-best.win>	2026-06-03 21:38:23 +08:00
James F	02298cb199	security: close telemetry leak in preconnectAnthropicApi startup path (#1253 ) 🔒 Security Discovery: Un-gated outbound connection bypasses privacy controls Summary ------- preconnectAnthropicApi() unconditionally sends a TCP+TLS handshake to api.anthropic.com on every ccb startup — even when the user has explicitly disabled all non-essential traffic via CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 or DISABLE_TELEMETRY=1. This is the LAST un-gated outbound connection in the entire startup path. Every other telemetry sink (Sentry, Langfuse, OpenTelemetry, GrowthBook, 1P Event Logger, Datadog, BigQuery, etc.) already respects the privacyLevel module's isEssentialTrafficOnly() gate. This one did not. Impact ------ While the preconnect is a HEAD request with no payload, the connection itself leaks the client's IP address and session timing to Anthropic's infrastructure. For privacy-conscious users and enterprise deployments that have disabled telemetry, this constitutes an unexpected data leak. Fix --- Add isEssentialTrafficOnly() check at the function entry, consistent with every other privacy-gated code path in the codebase. The privacyLevel module is already imported by init.ts and 12+ other modules — no new dependencies. Verification ------------ Reproduced and verified via strace on Linux (aarch64): # Before fix $ strace -f -e connect ccb -p <<< 'hello' connect(16, sin_addr=inet_addr("160.79.104.10"), sin_port=htons(443)) = 0 # ↑ connector to api.anthropic.com despite CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 # After fix $ strace -f -e connect ccb -p <<< 'hello' # ↑ zero remote TCP connections — all traffic to localhost only Changes: 1 file, +5 lines (import + gate)	2026-06-02 09:30:13 +08:00
claude-code-best	b2b1981da3	docs: update contributors	2026-06-01 00:23:43 +00:00
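claude-code-best	33c52578a6	docs: update README	2026-05-31 22:11:29 +08:00
claude-code-best	e33b17bde7	feat: sideQuery supports third-party provider routing (OpenAI/Grok/Gemini) - Add getProviderPrimaryModel() to resolve the provider's primary model from environment variables - getDefaultOpus/Sonnet/HaikuModel fall back to the user-configured primary model under a third-party provider - sideQuery dispatches to the matching API adapter based on provider type - Add the sideQueryViaOpenAICompatible (OpenAI + Grok) and sideQueryViaGemini adapter functions - Prevent sideQuery background tasks from still calling the Anthropic API when a third-party endpoint is configured	2026-05-31 14:08:30 +08:00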
claude-code-best	797424115d	chore: 2.6.6	2026-05-29 17:52:25 +08:00
claude-code-best	efc218d8a9	fix: searchSkills verifies that the index reference is consistent before using the cached IDF, fixing intermittent test failures	2026-05-28 22:24:29 +08:00
claude-code-best	a91653a0dd	fix: remove legacy handling from the edit tool; it is no longer needed now that models are capable enough (#1251) * refactor: remove tab/quote normalization from FileEditTool * fix: resolve pre-existing typecheck errors (zod v4 compat + RCS web exclude)	2026-05-28 21:52:31 +08:00
claude-code-best	c982104476	docs: update contributors	2026-05-25 00:22:36 +00:00
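claude-code-best	6dd378bf15	fix: a line of content was left in the terminal after exiting a startup dialog gracefulShutdownSync starts an asynchronous shutdown and returns synchronously, so React re-renders the component immediately and races with the Ink unmount in cleanupTerminalModes(), leaving dialog content in the terminal after exit. Fix: introduce a pendingExitCode state; the exit path first clears the screen (renders null), then a useEffect defers the gracefulShutdownSync call to the next tick, so Ink has flushed the empty frame before terminal cleanup. Affects three startup dialogs: TrustDialog, BypassPermissionsModeDialog, and DevChannelsDialog. Co-Authored-By: glm-5.1 <zai-org@claude-code-best.win>	2026-05-22 22:25:51 +08:00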
claude-code-best	ed61932748	fix: subtract cached_tokens from input_tokens in OpenAI stream adapter OpenAI's prompt_tokens includes cached tokens, but Anthropic's input_tokens semantic excludes them. The adapter was mapping prompt_tokens → input_tokens verbatim, causing downstream code (cache hit rate, cost, autocompact) to double-count. Real-world impact: DeepSeek returns prompt_tokens=34097 with cached_tokens=34048, displayed as 50% hit rate instead of 99.86%. Co-Authored-By: glm-5.1 <zai-org@claude-code-best.win>	2026-05-22 21:58:33 +08:00
claude-code-best	b1c4f40f90	fix: in ACP mode, extended thinking + tool_use triggered consecutive user messages, causing a 400 (CC-1215)	2026-05-22 21:58:33 +08:00
Dosion	f91060836f	fix(swarm): WindowsTerminalBackend pidFile health check + 5-state lifecycle (#1237) * fix(swarm): WindowsTerminalBackend pidFile health check + 5-state lifecycle Fixes several problems caused by wt.exe split-pane being fire-and-forget: teammates that hang, TeamDelete getting stuck, the kill-while-spawn race, and others. - Add waitForPidFile(): after wt.exe returns, wait until powershell.exe has actually started and written the pidFile. Default timeout 8s, overridable via env CLAUDE_WT_PANE_TIMEOUT_MS; on timeout it throws with full diagnostics - Add a 5-state lifecycle (registered/spawning/ready/killing/dead); sendCommandToPane wraps spawnPromise in an inner Promise, and a re-spawn on a ready pane throws immediately - killPane TOCTOU fix: re-read status after awaiting spawnPromise; prefer the cached pane.pid to avoid a disk read; even when Stop-Process fails, clear the cache and mark the pane dead to avoid killing a reused PID - Stricter pid parsing: /^\d+$/ + Number.isFinite + >0; remove the dead try/catch - The constructor accepts an options object that injects pidFileDir (the original positional arguments still work) - Clear any stale pidFile before launch; killPane falls back to a 3×500ms retry * test(swarm): 12 tests covering WindowsTerminalBackend lifecycle, race, pid validation Adds 12 tests for WindowsTerminalBackend covering all new v2 behavior: 5 v1-compatibility tests plus 7 new v2 scenarios. Together with the constructor options object, the tests use pidFileDir: tempDir for isolation so nothing leaks into the real OS tmpdir. New scenarios covered: - unlinks stale pidFile so a stale pid is not adopted - rejects re-spawn on a ready pane - throws on unknown paneId in sendCommandToPane - rejects corrupted pidFile content ("123abc") and times out - killPane awaits in-flight spawn before killing (kill-while-spawn race) - Stop-Process failure clears cached pid and marks pane dead - killPane uses cached pid and returns false when pane is unknown The createBackend helper now takes the options object plus simulatePidWrite to simulate powershell writing the pidFile, injects tempDir as pidFileDir, and sets env CLAUDE_WT_PANE_TIMEOUT_MS in beforeEach and clears it in afterEach. --------- Co-authored-by: unraid <local@unraid.local>	2026-05-22 21:06:47 +08:00
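Dosion	9d17597e58	feat(autofix-pr): full completion feedback loop (latent bug fix + completionChecker + content feedback) (#1240) * fix(autofix-pr): fix dangling monitor lock caused by mismatched taskId Problem: createAutofixTeammate generates a teammate UUID as the monitor lock key, but the framework taskId generated inside registerRemoteAgentTask is a different UUID. When the CCR session completes naturally, the framework calls clearActiveMonitor(frameworkTaskId), the guard fails, and the lock is never released, so later /autofix-pr runs report "already monitoring". Fix (Phase 1 of remote-agent completion loop): - monitorState adds updateActiveMonitor(partial) for atomic updates - callAutofixPr swaps the lock's taskId to the framework-assigned id after register - RemoteAgentTask introduces a registration-style registerCompletionHook API (modeled on the existing registerCompletionChecker pattern) and calls runCompletionHook on the 5 completion paths - The autofix-pr command module registers its own cleanup hook, so the framework does not depend back on the command module Tests: - monitorState adds 4 tests (updateActiveMonitor behavior + bug reproduction/fix) - launchAutofixPr adds 3 end-to-end regression tests (taskId swap + hook firing + subsequent launch does not report already monitoring) For the full analysis and the Phase 2/3 plan, see docs/features/remote-agent-completion-analysis.md. * feat(autofix-pr): register a completionChecker that uses the gh CLI to detect PR completion Phase 2 of remote-agent completion loop. Phase 1 fixed the dangling monitor lock, but the completion signal still had to wait for the CCR session to archive naturally (the timing is unpredictable, and it says nothing about whether the PR was actually fixed). Phase 2 adds active completion detection. Implementation: - Add prOutcomeCheck.ts (pure decision matrix): summariseAutofixOutcome takes a PR snapshot + baseline SHA and returns completed/summary. Unit tests for 8 decision branches. - Add prFetch.ts (spawn layer): runGhPrView calls the gh CLI, fetchPrHeadSha captures the baseline SHA at launch, and checkPrAutofixOutcome combines the two. - AutofixPrRemoteTaskMetadata gains an initialHeadSha?: string field that survives --resume. - launchAutofixPr.ts calls registerCompletionChecker('autofix-pr', ...) at module top level, with a 5s throttle to keep gh CLI calls from piling up. callAutofixPr calls fetchPrHeadSha at startup and passes the result into metadata. Decision matrix: MERGED → done (merged) CLOSED without merge → done (closed without fix) OPEN, no baseline → keep polling OPEN, head unchanged → keep polling (agent has not pushed yet) OPEN, head changed + CI pending → keep polling OPEN, head changed + CI failure → done (surface red, user decides whether to retry) OPEN, head changed + CI success → done (clean fix) Design: - gh CLI rather than Octokit: reuses the user's existing auth and adds no token management - Decision and spawn live in separate files: prOutcomeCheck is a pure function and easy to test, and prFetch is mocked separately to avoid process-wide pollution from Bun mock.module (explained in a comment in launchAutofixPr.test) - 5s throttle: the framework polls every 1s, and a gh CLI subprocess is too heavy to keep up - Failure fallback: fetchPrHeadSha/checkPrAutofixOutcome never throw on failure; they return null/false and the framework continues on its original path Tests: - 9 prOutcomeCheck unit tests cover the decision matrix - 5 new launchAutofixPr tests: checker registration / fetchPrHeadSha call / initialHeadSha passed to metadata / launch still works when the SHA fetch fails / SHA null handling For the full plan, see docs/features/remote-agent-completion-analysis.md. * feat(autofix-pr): content feedback lets the local model read the PR fix result Phase 3 of remote-agent completion loop. Phase 2 registered a completionChecker so the framework can actively complete the task when the PR is merged, closed, or has a push with green CI, but the task-notification still carried only generic text ("${owner}/${repo}#42 merged"). Phase 3 lets the local model read the structured result produced by the remote agent itself (commit list, file list, CI status, human-readable summary). Implementation: - Add extractAutofixResultFromLog (src/commands/autofix-pr/extractAutofixResult.ts): scans SDKMessage[] for the <autofix-result> tag, preferring hook stdout and falling back to assistant text, latest-wins. 10 unit tests. - RemoteAgentTask adds a registration-style registerContentExtractor API + a private enqueueRichRemoteNotification (modeled on enqueueRemoteReviewNotification). The 3 generic completion paths (archived / completionChecker / result-driven) first try tryExtractRichContent; if there is content they use the rich variant, otherwise the generic one. The isRemoteReview path is unchanged (it uses its own enqueueRemoteReviewNotification). - launchAutofixPr.ts calls registerContentExtractor('autofix-pr', extractAutofixResultFromLog) at module top level. initialMessage gains an <autofix-result> output instruction (pr-number / commits-pushed / files-changed / ci-status / summary). Design: - Registration-style API (same as the Phase 1 hook + Phase 2 checker): the framework does not depend back on the command module, and all PR-specific logic lives in autofix-pr/ - latest-wins: when the agent retries, only the latest tag is used, so older tags cannot pollute the result - truncated tag → null: an open tag with no matching close tag is treated as incomplete and takes the generic fallback - No stitching across messages: an open tag and a close tag in different messages are treated as incomplete (avoids joining unrelated strings) - String content is not parsed: the rare path where assistant.message.content is a string (not a block array) is skipped without crashing Tests: - 10 extractAutofixResultFromLog unit tests (empty log / no tag / hook stdout / assistant text / hook_response subtype / multiple tags latest-wins / truncation / priority when the hook comes after the assistant / no stitching across messages / graceful handling of string content) - 3 new launchAutofixPr tests (extractor registration / initialMessage contains the tag schema / real extractor behavior) For the full plan, see section 5.3 of docs/features/remote-agent-completion-analysis.md. * fix(autofix-pr): extractBetween falls back to an earlier complete pair when the latest tag is truncated If the remote agent, on retry, wrote a complete <autofix-result> and then opened a second tag that was truncated, the old implementation only looked at lastIndexOf(open), found no close, and returned null, discarding the earlier complete result. It now walks all open tags from last to first and returns the first open/close pair that matches. Also: - docs/features/remote-agent-completion-analysis.md: add language tags (text/http) to 9 bare fenced blocks, fixing markdownlint MD040 warnings - Same file: two occurrences of "三选项" → "三个选项", following Chinese measure-word usage * test(autofix-pr): fill coverage gaps for completionChecker / edge-case CI checks Addresses the codecov patch coverage gap by covering three code paths that were previously not exercised: prOutcomeCheck.ts (previously 96.92%, 2 lines missing): - the statusCheckRollup === undefined path (distinct from the empty-array branch; GitHub omits the field entirely on PRs with no checks configured) - in-flight checks with status COMPLETED but a null/empty conclusion are classified as pending launchAutofixPr.ts (previously 58.33%, 15 lines missing): - registerCompletionChecker arrow body: early return when metadata is missing / returns null inside the throttle window / returns null when completed=false / returns summary when completed=true / initialHeadSha passed through to checkPrAutofixOutcome - both sides of the if(meta) short-circuit in registerCompletionHook: with metadata the throttle entry is cleared, without metadata the active monitor lock is still released All new tests follow the existing pattern of mock.module plus pulling the registered callback from registerXxxMock.mock.calls, with no new dependencies. prOutcomeCheck passes 11/11 locally. * style: biome check --fix reformats the new section of launchAutofixPr.test --------- Co-authored-by: unraid <local@unraid.local> Co-authored-by: Claude <noreply@anthropic.com>	2026-05-22 21:06:26 +08:00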
claude-code-best	f2b751f659	chore: 2.6.5	2026-05-22 21:05:06 +08:00
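claude-code-best	d4a601475f	fix: BriefTool circular dependency left isBriefEnabled undefined Replace the module-level require() with a lazy-loading function getBriefToolModule() that defers loading until the first actual call, so the module is not accessed before it finishes initializing under a circular dependency.	2026-05-22 21:04:17 +08:00
claude-code-best	897c186f28	docs: remove model-name restrictions from the effort level descriptions	2026-05-22 20:11:12 +08:00
claude-code-best	03598d3f84	refactor: remove the max/xhigh downgrade branch from resolveAppliedEffort	2026-05-22 20:09:53 +08:00
claude-code-best	7b52054ff5	feat: lift the model allowlist restriction on the max/xhigh effort levels	2026-05-22 20:09:10 +08:00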
claude-code-best	66c892521b	chore: 2.6.0	2026-05-21 16:38:25 +08:00
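claude-code-best	dab04af7c9	perf: enable code splitting in the Vite build; Bun RSS drops from 966MB to 35MB Bun/JSC eagerly parses the whole single-file bundle into bytecode and JIT-compiles it, so the 17MB output pushed RSS up to ~1GB (Node/V8, which parses lazily, needs only ~220MB). With code splitting enabled, Bun loads chunks on demand: RSS is 35MB for --version and ~500MB when fully loaded. Changes: - vite.config.ts: remove codeSplitting:false, add chunkFileNames - post-build.ts: walk every file in dist/ + dist/chunks/ and apply the Bun patch - Add distRoot.ts, a shared utility that unifies path-resolution logic - ripgrep.ts/computerUse/setup.ts/claudeInChrome/setup.ts/updateCCB.ts: replace inline import.meta.url path derivation with distRoot	2026-05-21 16:36:27 +08:00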
claude-code-best	5b5fbb2f47	chore: 2.5.0	2026-05-20 10:47:52 +08:00
claude-code-best	9bfa868e61	chore: restore the original package.json	2026-05-20 10:14:40 +08:00
claude-code-best	f6dcf63902	Revert "chore: switch to bun publish, fix husky path issue, adjust diff fold distance, export VoiceContext" This reverts commit `c80a6d062b`.	2026-05-20 10:11:21 +08:00
claude-code-best	5957e26d9b	Revert "chore: fix publish issue" This reverts commit `58c3feb56a`.	2026-05-20 10:11:09 +08:00
claude-code-best	58c3feb56a	chore: fix publish issue	2026-05-20 10:06:44 +08:00
claude-code-best	e2f4d558e1	Revert "fix: bun publish configures registry authentication via ~/.npmrc" This reverts commit `9afcb398ca`.	2026-05-20 10:05:38 +08:00
claude-code-best	9afcb398ca	fix: bun publish configures registry authentication via ~/.npmrc	2026-05-20 09:34:59 +08:00
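
The `cached_tokens` fix in the commit table above (ed61932748) comes down to one subtraction. The sketch below reproduces the arithmetic from that commit message. The interfaces, the function names `toAnthropicUsage` and `cacheHitRate`, and the hit-rate formula are assumptions made for illustration, not the adapter's actual code; the field `prompt_tokens_details.cached_tokens` is assumed to be where the provider reports cached tokens.

```typescript
// Assumed shapes for illustration; only the fields needed here are modeled.
interface OpenAIUsage {
  prompt_tokens: number
  completion_tokens: number
  prompt_tokens_details?: { cached_tokens?: number }
}

interface AnthropicUsage {
  input_tokens: number
  output_tokens: number
  cache_read_input_tokens: number
}

// prompt_tokens includes cached tokens, while input_tokens excludes them,
// so the cached portion is subtracted before mapping.
function toAnthropicUsage(u: OpenAIUsage): AnthropicUsage {
  const cached = u.prompt_tokens_details?.cached_tokens ?? 0
  return {
    input_tokens: Math.max(0, u.prompt_tokens - cached),
    output_tokens: u.completion_tokens,
    cache_read_input_tokens: cached,
  }
}

// Assumed downstream formula: cached reads over all prompt-side tokens.
function cacheHitRate(u: AnthropicUsage): number {
  const total = u.input_tokens + u.cache_read_input_tokens
  return total === 0 ? 0 : u.cache_read_input_tokens / total
}

// Numbers from the commit message: prompt_tokens=34097, cached_tokens=34048.
const usage = toAnthropicUsage({
  prompt_tokens: 34097,
  completion_tokens: 0,
  prompt_tokens_details: { cached_tokens: 34048 },
})
const rate = (cacheHitRate(usage) * 100).toFixed(2)
console.log(usage.input_tokens, rate) // 49 and "99.86"
```

Without the subtraction, `input_tokens` would stay at 34097 and the same formula would give 34048 / (34097 + 34048) ≈ 50%, which matches the incorrect figure the commit message reports.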