mirror of
https://github.com/claude-code-best/claude-code.git
synced 2026-06-15 12:55:51 +00:00
Compare commits: v2.0.0...memory-lea (19 commits)
| Author | SHA1 | Date |
|---|---|---|
| | 5e215bb061 | |
| | b28de717dd | |
| | 5c1be19511 | |
| | 2545dcabfd | |
| | 40fbc4afc4 | |
| | d3eebfed15 | |
| | 6becb8b2d4 | |
| | 3a2b6dde7c | |
| | 4ca7a4895a | |
| | ba74e0976c | |
| | 86df024e75 | |
| | c3af45023d | |
| | 2847cab787 | |
| | 198c09b263 | |
| | 4cbf406c70 | |
| | f72b867aa6 | |
| | 0290fe3227 | |
| | 1b10ea391a | |
| | f724300079 | |
12
CLAUDE.md
@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) and other AI coding

## Project Overview

This is a **reverse-engineered / decompiled** version of Anthropic's official Claude Code CLI tool. The goal is to restore core functionality while trimming secondary capabilities. Many modules are stubbed or feature-flagged off. TypeScript strict mode is enforced: **`bunx tsc --noEmit` must pass with zero errors**.
This is a **reverse-engineered / decompiled** version of Anthropic's official Claude Code CLI tool. The goal is to restore core functionality while trimming secondary capabilities. Many modules are stubbed or feature-flagged off. TypeScript strict mode is enforced: **`bun run precheck` must pass with zero errors** (it covers typecheck + lint fix + test).

## Git Commit Message Convention

@@ -47,7 +47,7 @@ bun test # run all tests
bun test src/utils/__tests__/hash.test.ts # run single file
bun test --coverage # with coverage report

# Lint & Format (Biome)
# Lint & Format (Biome): for day-to-day development, use precheck instead of calling these individually
bun run lint # lint check (whole project)
bun run lint:fix # auto-fix lint issues
bun run format # format all (whole project)
@@ -60,7 +60,7 @@ bun run health
# Check unused exports
bun run check:unused

# Full check (typecheck + lint fix + test): run after completing any task
# Full check (typecheck + lint fix + test): must be run after completing any task
bun run precheck

# Remote Control Server
@@ -311,7 +311,7 @@ mock.module("src/utils/debug.ts", debugMock);
The project uses TypeScript strict mode, and **tsc must report zero errors**. Run the following after every change:

```bash
bun run typecheck
bun run precheck
```

**Type conventions**:
@@ -324,14 +324,14 @@ bun run typecheck
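
## Working with This Codebase

- **tsc must pass**: `bun run typecheck` must report zero errors; no change may introduce new type errors.
- **precheck must pass**: `bun run precheck` (typecheck + lint fix + test) must report zero errors; no change may introduce new type, lint, or test errors.
- **Feature flags**: all off by default (`feature()` returns `false`). Dev and build each have their own default-enabled list. Do not redefine the `feature` function in `cli.tsx`.
- **React Compiler output**: Components have decompiled memoization boilerplate (`const $ = _c(N)`). This is normal.
- **`bun:bundle` import**: `import { feature } from 'bun:bundle'` is a Bun built-in module resolved by the runtime/bundler. Do not replace it with a custom function. **`feature()` may only be used directly in the condition position of an `if` statement or a ternary expression** (a Bun compiler restriction); it cannot be assigned to a variable, placed in an arrow function body, or used as part of an `&&` chain. Correct: `if (feature('X')) {}` or `feature('X') ? a : b`.
- **`src/` path alias**: tsconfig maps `src/*` to `./src/*`. Imports like `import { ... } from 'src/utils/...'` are valid.
- **MACRO defines**: managed centrally in `scripts/defines.ts`. Dev mode injects them via `bun -d`; build injects them via `Bun.build({ define })`. To change constants such as the version number, edit only this file.
- **Build output is Node.js compatible**: `build.ts` automatically post-processes `import.meta.require`, so the output can be run directly with `node dist/cli.js`.
- **Biome configuration**: 42 lint rules are disabled because of the decompiled code; only the `recommended` baseline is kept. Formatting covers the whole project (`src/`, `scripts/`, `packages/`, including `packages/@ant/`). `.tsx` files use a 120-column line width with mandatory semicolons; other files use 80 columns with semicolons as needed. JSON formatting is enabled. `.editorconfig` is aligned with the Biome configuration (2-space indentation). After modifying any code, run `bun run check` to confirm there are no lint/format problems; the pre-commit hook automatically blocks non-conforming commits.
- **Biome configuration**: 42 lint rules are disabled because of the decompiled code; only the `recommended` baseline is kept. Formatting covers the whole project (`src/`, `scripts/`, `packages/`, including `packages/@ant/`). `.tsx` files use a 120-column line width with mandatory semicolons; other files use 80 columns with semicolons as needed. JSON formatting is enabled. `.editorconfig` is aligned with the Biome configuration (2-space indentation). After modifying any code, run `bun run precheck` to confirm there are no type/lint/format/test problems; the pre-commit hook automatically blocks non-conforming commits.
- **tsc vs. Biome conflicts**: when tsc requires a property declaration (the property is assigned) but Biome reports `noUnusedPrivateClassMembers` (written but never read), suppress the lint warning with `// biome-ignore lint/correctness/noUnusedPrivateClassMembers: <reason>` and keep the type declaration. `biome ci` must report zero warnings.
- **`@ts-expect-error` maintenance**: keep `@ts-expect-error` only when the code below it really has a type error. If a type-system update leaves the directive unused (TS2578), remove the comment. Always-false comparisons produced by MACRO replacement (e.g. `'production' === 'development'`) still need `@ts-expect-error`.
- **The Ink framework lives in `packages/@ant/ink/`**: not `src/ink/` (that directory does not exist). Ink-related components, hooks, and keybindings are all in packages.
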
File diff suppressed because one or more lines are too long
|
Before Width: | Height: | Size: 1.7 MiB After Width: | Height: | Size: 2.2 MiB |
@@ -1,200 +1,103 @@
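# Memory and Performance Peak Analysis Report (final version: 4 rounds of iteration complete)
# Memory and Performance Peak Analysis Report

> Process bun, physical memory peak **700 MB+**, worst case up to **1.8 GB**
> Date: 2026-05-02 | Status: **investigation complete** | Scope: memory peaks + CPU hotspots + React render loops
> Process bun, RSS baseline **682 MB**, worst case **1.8 GB** | 2026-05-02 | **Investigation complete** (12 rounds of iteration)
> Fix commits: `ef10ad28` + `ab0bbbc4` (reduces 100-300 MB) | Architectural limit: Bun mimalloc/JSC does not return memory pages (~150-250 MB permanently held)
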
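## Data Collection
## Fixed (10 items)

- Typical-scenario RSS is 682 MB; baseline JSC heap is 300-400 MB
- Bun mimalloc does not return memory pages, and JSC page management only grows, never shrinks (architecture-level limitation)
- A once-per-second `Bun.gc()` timer already exists (`cli/print.ts:554-558`), in non-forced mode
- 10 items fixed (commits `ef10ad28` + `ab0bbbc4`), reducing usage by roughly 100-300 MB
- Round 3 confirmed: AWS SDK/Google Auth/Azure Identity are all dynamically imported (lazy) and do not contribute to the baseline
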
## Fixed issues (commit ef10ad28 + ab0bbbc4)

| Issue | Original peak | Fix method | Location |
|------|--------|----------|------|
| Issue | Original peak | Fix | Location |
|------|--------|------|------|
| Streaming string concatenation O(n²) | 2-20 MB | `+=` → array accumulation | `claude.ts:1834,2271` |
| Messages.tsx multiple traversals | 100-270 MB | merged into a single pass | `Messages.tsx:417-418` |
| ColorFile uncached | 50-100 MB | LRU cache, 50 entries | `HighlightedCode.tsx:14-61` |
| Ink StylePool unbounded | 10-50+ MB | 1000-entry cap | `@ant/ink/screen.ts:122` |
| ColorFile uncached | 50-100 MB | LRU-50 | `HighlightedCode.tsx:14-61` |
| Ink StylePool unbounded | 10-50+ MB | 1000 cap | `@ant/ink/screen.ts:122` |
| CompanionSprite high-frequency ticks | CPU | TICK_MS→1000ms | `CompanionSprite.tsx:15` |
| MCP stderr buffer | 1-640 MB | 64→8MB/server | `mcp-client/connection.ts:117` |
| BashTool output buffer | 30-330 MB | 32→2MB | `stringUtils.ts:88` |
| Transcript write queue | 5-50 MB | 1000-entry cap | `sessionStorage.ts:613-619` |
| Transcript write queue | 5-50 MB | 1000 cap | `sessionStorage.ts:613-619` |
| contentReplacementState | grows continuously | cleared on compact | `compact/compact.ts` |
| SSE buffer | unbounded | 1MB cap | SSE handling code |
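
The `+=` → array accumulation fix in the first row can be illustrated with a minimal sketch. `DeltaAccumulator` and its members are hypothetical names chosen for this example; they are not taken from `claude.ts`, and the actual fix may be structured differently.

```typescript
// Hypothetical helper (not from claude.ts): collects streamed text deltas
// in an array and joins them once at the end, instead of growing a string
// with `+=` on every delta.
class DeltaAccumulator {
  private parts: string[] = [];
  private length = 0;

  // Record one streamed fragment; no string is rebuilt here.
  append(delta: string): void {
    this.parts.push(delta);
    this.length += delta.length;
  }

  // Total number of UTF-16 code units buffered so far.
  get size(): number {
    return this.length;
  }

  // Build the final string once and release the buffered fragments.
  finish(): string {
    const text = this.parts.join('');
    this.parts = [];
    this.length = 0;
    return text;
  }
}

const acc = new DeltaAccumulator();
for (const delta of ['Hel', 'lo, ', 'world']) {
  acc.append(delta);
}
const accumulatedSize = acc.size;
const accumulatedText = acc.finish();
```

Tracking `size` separately also gives a natural place to enforce a cap such as the 1MB SSE limit listed in the table, should one be needed.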
|
||||
|
||||
## Remaining issues: memory (sorted by peak impact)
## P0: Core bottlenecks (6 items)

### P0: Message array copied 7-8x (120-320 MB)
| # | Issue | Peak | Location | Recommendation |
|---|------|------|------|------|
| 1 | Message array copied 7-8x via spread (3-4 copies resident simultaneously at the end of a turn) | 120-320 MB | `query.ts`, 7 sites (:477,:491,:897,:1135,:1745,:1857,:1878) | remove spread / pass by reference / switch to push |
| 2 | AutoCompact timing flaw (the check runs before the API call, the growth happens after it) | API limit exceeded | `query.ts:575` | add a predictive threshold check |
| 3 | reactiveCompact is an empty stub (no emergency compaction on API 413) | no fallback | `reactiveCompact.ts` (entire file) | implement real logic |
| 4 | buildMessageLookups rebuilds 8 Maps/Sets (triggered by every streaming delta) | GC STW 100-173ms | `Messages.tsx:519` | incremental updates / split the useMemo chain |
| 5 | useDeferredValue double buffering | 100-200 MB | `REPL.tsx:1569` | inherent to React scheduling; limited room for optimization |
| 6 | Compact peak window (preCompactReadFileState + summary + attachments) | 20-80 MB | `compact.ts:524-644` | release preCompactReadFileState/summaryResponse early |

Copies produced by `src/query.ts` on each turn (Round 3 added a 7th entry):
## P1: Important bottlenecks (14 items)

| Location | Operation | Necessary? | Optimization |
|------|------|----------|----------|
| `:477` | `[...getMessagesAfterCompactBoundary(messages)]` | doubly wasteful | remove spread |
| `:491` | `applyToolResultBudget → map()` | as needed | return the original array when nothing exceeds the budget |
| `:897` | `clonedContent ??= [...contentArr]` | conditionally necessary | keep |
| `:1135` | `[...messagesForQuery, ...assistant]` | avoidable | pass by reference |
| `:1745` | `.concat(assistant, toolResults)` | avoidable | pass as multiple arguments |
| `:1857` | `[...messagesForQuery, ...assistant, ...toolResults]` forkContextMessages | **new in Round 3**: the task summary is discarded after use | pass by reference |
| `:1878` | `[...messagesForQuery, ...assistant, ...toolResults]` | necessary | switch to push |
| # | Issue | Peak | Location | Recommendation |
|---|------|------|------|------|
| 7 | OpenAI/Gemini/Grok compatibility layers use O(n²) concatenation | 25-75 MB | 9 sites in 3 files (`openai/index.ts:386`, `gemini/index.ts:148`, `grok/index.ts:163`) | switch to array accumulation (same pattern as claude.ts) |
| 8 | messages.ts O(n²) concatenation | 10-25 MB | `messages.ts:3252,3268` | switch to array accumulation |
| 9 | highlight.js loads all 192 languages (only 26 needed) | 8-12 MB | `color-diff-napi/index.ts:21` | custom build |
| 10 | hlLineCache module-level singleton, 2048 entries | ~4 MB | `color-diff-napi/index.ts:508` | switch to LRU + size cap |
| 11 | colorFileCache stores the code 3x | 2-5 MB | `HighlightedCode.tsx:14` | remove the code field from the cached value |
| 12 | Virtual scroll keeps 200 components mounted | 50 MB | `useVirtualScroll.ts` | lower OVERSCAN_ROWS / MAX_MOUNTED_ITEMS |
| 13 | FileReadTool large files (output capped at 100K characters, but the file is fully loaded while reading) | several MB, transient | `FileReadTool.ts:342` | check the size before reading; truncate while streaming |
| 14 | Session restore loads everything (disk→JSON→REPL, three stages) | 200-300 MB | `sessionStorage.ts:3482` | streaming JSONL / incremental restore |
| 15 | Session writes accumulate up to 100MB | ~100 MB | `sessionStorage.ts:652` | streaming writes |
| 16 | Forked Agent clones the full FileStateCache | 50N MB | `forkedAgent.ts:382` | shared/layered cache (10MB for agents) |
| 17 | GC threshold 350MB < baseline (pointless forced GC every second) | wasted CPU | `cli/print.ts:554` | raise to 800MB+ |
| 18 | PDF processing, 100 pages | ~100 MB | `apiLimits.ts:54` | paged streaming processing |
| 19 | Single-image processing (base64→decode→resize) | ~16 MB/image | `apiLimits.ts:22` | streaming resize |
| 20 | Token estimation error of ±25-50% amplifies the timing issue | inaccurate threshold | `tokenEstimation.ts:215` | content-type-aware estimation |

At peak, 3-4 full copies of the message array are resident at once (477 + 1745 + 1857 + 1878 execute sequentially at the end of the same turn).
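
To make the "switch to push" recommendation concrete, here is a minimal sketch contrasting the two append styles. The `Message` type and both function names are hypothetical and only loosely modelled on the expressions quoted from `query.ts`; the real call sites carry more state.

```typescript
// Hypothetical message shape, for illustration only.
type Message = { role: 'user' | 'assistant' | 'tool'; content: string };

// Copying variant: allocates a new array that holds every prior message,
// so the old and new arrays are both alive until the old one is collected.
function appendByCopy(
  history: Message[],
  assistant: Message[],
  toolResults: Message[],
): Message[] {
  return [...history, ...assistant, ...toolResults];
}

// In-place variant: reuses the existing array and adds only the new entries.
// Element-wise push avoids the argument-count limit of push(...bigArray).
function appendInPlace(
  history: Message[],
  assistant: Message[],
  toolResults: Message[],
): Message[] {
  for (const m of assistant) history.push(m);
  for (const m of toolResults) history.push(m);
  return history;
}

const history: Message[] = [{ role: 'user', content: 'question' }];
const assistant: Message[] = [{ role: 'assistant', content: 'calling a tool' }];
const toolResults: Message[] = [{ role: 'tool', content: 'tool output' }];

const copied = appendByCopy(history, assistant, toolResults);
const sameArray = appendInPlace(history, assistant, toolResults);
```

The in-place form is only safe where no other holder relies on the previous array as an immutable snapshot (React state, for example), which is presumably why the table marks some copies as necessary.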
|
||||
|
||||
### P0: Compact peak (20-80 MB)

Peak timeline (`compact.ts:524-644`):
```
Before: messages(200K) + mutableMessages(200K) = 400K tokens
During: + preCompactReadFileState(25MB) + summary + attachments ≈ 500K+ tokens
After: splice → 50K tokens
```

Can be released early: `preCompactReadFileState` (25MB), `summaryResponse`, and the original `messages` parameter.
|
||||
|
||||
### P1: Virtual scroll components (~50 MB), new in Round 3

`src/hooks/useVirtualScroll.ts` + React Ink render pipeline:
- MAX_MOUNTED_ITEMS = 300, OVERSCAN_ROWS = 80
- About 200 MessageRows are actually mounted (viewport + overscan)
- Each MessageRow ≈ 250KB RSS (React fiber + Yoga node + child component tree)
- **About 50 MB resident in total** (largest mount window in the current session)

Room for optimization: lower MAX_MOUNTED_ITEMS or OVERSCAN_ROWS; evaluate memoization inside the MessageRow component.
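
The ~50 MB figure follows directly from the two numbers quoted above. The short calculation below reproduces it; the function name is made up for this example, and the halved row count is a hypothetical scenario, not a measured result.

```typescript
// Estimate resident memory for mounted rows, in decimal megabytes.
// Both inputs come from the report: about 200 mounted MessageRows at
// roughly 250 KB RSS each.
function residentMegabytes(mountedRows: number, kilobytesPerRow: number): number {
  return (mountedRows * kilobytesPerRow) / 1000;
}

// Figures quoted in the report.
const currentEstimateMB = residentMegabytes(200, 250);

// Hypothetical: what the same per-row cost would imply if overscan were
// tuned so that only about 100 rows stayed mounted.
const halvedEstimateMB = residentMegabytes(100, 250);
```

Under the same per-row assumption, the estimate scales linearly with the number of mounted rows, so any saving from lowering OVERSCAN_ROWS depends on how many rows actually remain mounted.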
|
||||
|
||||
### P1: Streaming contentBlocks accumulation, new in Round 3

`src/services/api/claude.ts:1932`:
- The `contentBlocks` array accumulates every content block during a streaming response
- Long thinking responses can reach tens of thousands of tokens, and the thinking text is kept in full in contentBlock.thinking
- The `streamingDeltas` Map (already fixed to use array accumulation) is `join('')`ed and assigned to the contentBlock on `content_block_stop`
- Thinking blocks still retain the full thinking text after normalize

### P1: Other confirmed memory issues
## P2: Minor issues (10 items)

| # | Issue | Peak | Location |
|---|------|------|------|
| 1 | MCP Tool Schema stored twice | ~40 MB | `manager.ts:73` + `AppStateStore.ts:175` |
| 2 | lastAPIRequestMessages kept resident | 30-50 MB | `bootstrap/state.ts:118` |
| 3 | Session restore loads everything (small and medium files) | 50-200 MB | `sessionStorage.ts:3475-3582` |
| 4 | HybridTransport 100K queue | 1-10 MB | `HybridTransport.ts:86` |
| 5 | React messagesRef double reference | transient | `REPL.tsx:1437-1477` |
| 6 | AppState immutable-update churn | 5-50 MB | `store.ts:20-26` |
| 7 | Tool result seenIds/replacements | 0.5-2 MB | `toolResultStorage.ts:390-397` |
| 8 | bootstrap/state.ts unbounded caches | 0.1-1 MB | planSlugCache etc. |
| 9 | QueryEngine unbounded collections | 0.1-1 MB | discoveredSkillNames etc. |
| 21 | lastAPIRequestMessages kept resident | 30-50 MB | `bootstrap/state.ts:118` |
| 22 | MCP Tool Schema stored twice | ~40 MB | `manager.ts:73` + `AppStateStore.ts:175` |
| 23 | ContentReplacementState grows monotonically | 0.5-2 MB | `toolResultStorage.ts:390` |
| 24 | Perfetto 100K events | ~30 MB | `perfettoTracing.ts:106` |
| 25 | StreamingMarkdown double rendering | transient | `Markdown.tsx:185` |
| 26 | MarkdownTable 3 traversals | CPU peak | `MarkdownTable.tsx:99` |
| 27 | Search index WeakMap | 5-10 MB | `transcriptSearch.ts:17` |
| 28 | ACP FileStateCache per session | 50 MB | `acp/agent.ts:554` |
| 29 | Agent initialMessages shallow copy | 1-5 MB/agent | `runAgent.ts:382` |
| 30 | Hook result accumulation | ~1 MB+ | `toolExecution.ts:1474` |

### P2: Low priority (unverified)

| # | Issue | Peak | Location |
|---|------|------|------|
| 1 | Multiple OpenTelemetry versions | ~30 MB | dependency tree |
| 2 | Perfetto tracing 100K events | ~30 MB | `perfettoTracing.ts:99` |
| 3 | Prompt Cache normalization | 5-15 MB | `claude.ts:3180-3329` |
| 4 | GrepTool full stat+sort | ~10 MB | `GrepTool.ts:523-557` |

## Remaining issues: CPU and rendering hotspots

### Confirmed
## CPU / rendering hotspots

| # | Issue | Impact | Location |
|---|------|------|------|
| C2 | **Ink triggers a Yoga layout on every React commit** (React ConcurrentRoot automatically batches setState: 5 setState calls → 1 commit → 1 layout) | ~1-3ms per commit | `reconciler.ts:279` → `ink.tsx:323` |
| C3 | **MessageRow mount cost ~1.5ms** (Markdown parsing is only 1-7%; the main cost is ~1.3ms of React/Yoga/Ink pipeline overhead) | already rate-limited by SLIDE_STEP=25 + useDeferredValue | `useVirtualScroll.ts` + `Markdown.tsx` |
| C4 | **Layout shifts trigger full-screen damage** | O(rows×cols) full diff | `ink.tsx:655-661` |
| C7 | **CompanionSprite TICK_MS timer** (500ms, already fixed to 1000ms) | high-frequency setState triggers rendering | `buddy/CompanionSprite.tsx:15,136` |
| C9 | Synchronous fs operations | blocks the main thread | `projectOnboardingState.ts:20` etc. |
| C2 | Ink triggers a Yoga layout on every React commit | ~1-3ms/commit | `reconciler.ts:279` → `ink.tsx:323` |
| C3 | MessageRow mount ~1.5ms (React/Yoga/Ink pipeline overhead) | ~290ms stall on batch mount | `useVirtualScroll.ts` |
| C4 | Layout shifts trigger full-screen damage | O(rows×cols) | `ink.tsx:655-661` |
| C9 | Synchronous fs operations block the main thread | intermittent stalls | `projectOnboardingState.ts:20` etc. |

### Ruled out
Existing mitigations: React ConcurrentRoot batching, 16ms frame-rate limit, virtual scroll overscan 80 + SLIDE_STEP=25 + useDeferredValue, Markdown tokenCache LRU-500 + hasMarkdownSyntax fast path, Yoga incremental caching.

- **C1 useInboxPoller state loop**: verification confirmed that the useEffect converges (remove message → count decreases → stable); poll reads via `store.getState()` and does not trigger React dependencies; 1-second polling is a normal I/O pattern with no loop
- **Markdown is a CPU hotspot**: marked.lexer takes only 0.01-0.1ms for a typical message; a tokenCache LRU-500 already exists (cache hit 0.0003ms, a 99.6% reduction) plus the hasMarkdownSyntax fast path (skips 30-40% of messages)
- **Yoga has no incremental layout**: measurements show incremental updates are efficient (changing 1 leaf in a 1000-node tree → only 2 measure calls; the rest hit the cache)
- **Ink Yoga 2^depth problem**: a 100-node deep chain measured 11.7x visits (linear growth, not exponential)
## Ruled out (summary of 12 rounds)

### Existing optimizations

- React ConcurrentRoot automatically batches setState (multiple setState calls → 1 commit)
- Ink 16ms frame-rate limit (the throttle applies only to terminal output; Yoga layout is not throttled but is protected by React batching)
- Virtual scroll overscan 80 + MAX_MOUNTED_ITEMS 300 + SLIDE_STEP=25 + useDeferredValue
- Markdown tokenCache LRU-500 + hasMarkdownSyntax fast path + StreamingMarkdown incremental parsing
- Yoga incremental caching (dirty propagation + cached measure results)
- Double buffering + damage tracking + character pool reuse
- Pools reset on a 5-minute cycle

## Ruled out (memory, summary of 4 rounds)

- VSZ 516 GB is virtual mapping, not physical memory | Zod Schema ~650KB | Markdown LRU-500 already optimized
- useSkillsChange/useSettingsChange: correct cleanup | useInboxPoller: convergent design
- React Compiler `_c(N)`: unused | File watchers: only ~5KB | React reconciler: WeakMap + freeRecursive
- Ink screen buffer ~86KB | CharPool/HyperlinkPool ~1-5MB with a 5-minute reset | StylePool cache capped at 1000
- Dependency tree: AWS/Google/Azure SDKs are all dynamically imported and do not contribute to the baseline | Sentry is a no-op implementation
- Ink has no scrollback buffer | Markdown tokenCache LRU-500 is bounded
VSZ 516 GB is virtual mapping | Zod ~650KB | Markdown LRU-500 already optimized | useSkillsChange/useSettingsChange clean up correctly | useInboxPoller is a convergent design (not a loop) | React Compiler `_c(N)` unused | File watchers ~5KB | React reconciler uses WeakMap + freeRecursive | Ink screen buffer ~86KB | CharPool/HyperlinkPool ~1-5MB, reset every 5 minutes | AWS/Google/Azure SDKs are all lazy-loaded | Sentry is a no-op implementation | useCallback closures are avoided via messagesRef (no leak) | MCP stderrHandler has a 64MB cap + cleanup | useRef values are cleared by clearConversation/compact | apiMetricsRef is reset at the end of each turn | useEffect hooks have cleanup functions | lodash-es is tree-shakable | AppState useSyncExternalStore updates only the relevant slices | SDK has no global retry queue | Ink unmount performs cleanup

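## Conclusion

**Memory root causes** (confirmed over 4 rounds of iteration): 3-4 message-array copies resident simultaneously at the end of a turn + compact peak window + ~50MB resident for 200 virtual-scroll components + Bun/JSC not returning memory pages.
**Memory root causes, ranked**:
1. Message array copied 7-8x via spread (120-320 MB): the core bottleneck
2. useDeferredValue double buffering + full recomputation of the React useMemo chain (100-200 MB + GC STW)
3. Session restore/write peaks (200-300 MB)
4. AutoCompact timing flaw + reactiveCompact empty stub (risk of exceeding the API limit)
5. Forked Agent FileStateCache clone (50N MB)
6. ~50MB resident for 200 virtual-scroll components
7. Bun/JSC not returning memory pages (architecture-level)

**CPU root cause**: useInboxPoller's once-per-second polling triggers the full pipeline of React commit → full Yoga layout → full-screen Ink diff. Markdown rendering (~1.5ms/row) causes ~290ms stalls when new messages are mounted in batches. The periodic commits caused by polling and the CPU-intensive message mounting amplify each other.

**Round 4 final verification**: recursive agent spread and attachment accumulation are both variants of the known P0 (message array copies); no new root cause. Snipping runs before streaming and has no concurrency issue. Arrays such as consumedCommandUuids are reset every turn and do not accumulate.
**CPU root cause**: useInboxPoller's once-per-second polling → React commit → Yoga layout → full-screen Ink diff, the complete pipeline. Markdown rendering causes ~290ms stalls on batch mount.

**Estimated optimization headroom**:

| Priority | Measure | Estimated reduction |
|--------|------|----------|
| P0 | Message array copy optimization, 7 sites | 100-200 MB |
| P0 | Compact peak management, 3 items | 20-80 MB |
| P1 | Virtual scroll optimization | 20-30 MB |
| P1 | Buffer and cache cleanup, 5 items | 30-80 MB |
| P2 | Other, 3 items | 10-50 MB |
| **Total** | **18 actionable recommendations** | **180-440 MB** |
| Priority | Number of measures | Estimated reduction |
|--------|--------|----------|
| P0 | 6 | 240-600 MB |
| P1 | 14 | 300-600 MB |
| P2 | 10 | 80-200 MB |
| **Total** | **30 items** | **620-1400 MB** |

In theory, usage can drop from the current 400-700 MB to **200-350 MB**.
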
## Recommendations (by priority)

### P0: Message array copies (estimated reduction 100-200 MB)

1. `query.ts:477`: remove the spread
2. `query.ts:1878`: append with push
3. `query.ts:1135`: pass by reference
4. `query.ts:1745`: pass as multiple arguments
5. `query.ts:1857`: pass by reference (forkContextMessages)
6. `query.ts:491`: return the original array when nothing exceeds the budget

### P0: Compact peak (estimated reduction 20-80 MB)

7. After `compact.ts:543`, set `preCompactReadFileState = undefined`
8. After `compact.ts:651`, set `summaryResponse = undefined`
9. Defer non-critical attachment generation

### P1: Rendering and caches (estimated reduction 50-110 MB)

10. Virtual scroll: lower OVERSCAN_ROWS or MAX_MOUNTED_ITEMS
11. `lastAPIRequestMessages`: clear when not in debug mode
12. MCP Tool Schema: remove the manager-layer toolsCache
13. `HybridTransport`: maxQueueSize 100K→10K
14. `bootstrap/state.ts`: add LRU eviction to the unbounded Maps
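
Several items in this report (LRU-50 for ColorFile, LRU + size cap for hlLineCache, LRU for the unbounded Maps) rely on a bounded least-recently-used cache. The sketch below shows one common way to build one on top of `Map` insertion order. `LruCache` is a hypothetical class written for this example; it is not the cache used in the codebase.

```typescript
// Hypothetical bounded cache: a Map iterates in insertion order, so the
// first key is always the least recently used one if every read and write
// re-inserts the key it touches.
class LruCache<K, V> {
  private map = new Map<K, V>();
  private capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new RangeError('capacity must be at least 1');
    this.capacity = capacity;
  }

  get(key: K): V | undefined {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key) as V;
    // Re-insert to mark the entry as most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // Evict the oldest entry (first key in insertion order).
      const oldest = this.map.keys().next().value as K;
      this.map.delete(oldest);
    }
  }

  get size(): number {
    return this.map.size;
  }
}

const cache = new LruCache<string, number>(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');    // 'a' becomes the most recently used entry
cache.set('c', 3); // capacity exceeded, so 'b' is evicted
```

An entry-count cap bounds the number of keys but not their total byte size; for caches holding large strings, a byte budget on top of the count may be the more relevant limit.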

### P2: Other (estimated reduction 10-50 MB)

15. `toolResultStorage.ts`: periodically clean up seenIds/replacements
16. Streaming JSONL for session restore | incremental AppState updates
17. Thinking text truncation strategy (keep the first N + last N characters)
18. Trigger `Bun.gc(true)` under low-memory conditions
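
The "keep the first N + last N characters" strategy from item 17 could look like the following sketch. The function name, the marker text, and the choice to leave short inputs untouched are all assumptions made for illustration; the report does not specify them.

```typescript
// Hypothetical helper: keeps the first and last `keep` characters of a long
// text and replaces the middle with a marker. Inputs short enough to fit
// within both windows are returned unchanged.
function truncateMiddle(
  text: string,
  keep: number,
  marker: string = '\n[... truncated ...]\n',
): string {
  const n = Math.max(0, Math.floor(keep));
  if (text.length <= n * 2) return text;
  // slice(text.length - n) is used instead of slice(-n) so that n = 0
  // yields an empty tail rather than the whole string.
  return text.slice(0, n) + marker + text.slice(text.length - n);
}

const shortened = truncateMiddle('abcdefghij', 3, '[...]');
const untouched = truncateMiddle('abcdef', 3, '[...]');
const markerOnly = truncateMiddle('abcdefg', 0, '[...]');
```

Note that `slice` counts UTF-16 code units, so a cut can land inside a surrogate pair; a production version would need to decide how to handle that.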
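
### P2: Ink rendering layer (reduce CPU overhead)

19. `ink.tsx:655-661`: on a layout shift, try incremental damage instead of full-screen `{x:0,y:0,width:full,height:full}`

## Appendix

- Merged from: `docs/performance-reporter.md` (7 rounds of investigation, including detailed verification of CPU/rendering hotspots)
- Fix commits: `ab0bbbc4` (compact cleanup), `ef10ad28` (peak optimization, -100-300MB)
- Round 2 new findings: HybridTransport buffering, React messagesRef double reference, toolResultStorage unbounded growth
- Round 3 new findings: ~50MB resident for virtual scroll, the 7th-8th spread (query.ts:1857), thinking accumulation in streaming contentBlocks, dependency tree already lazy-loaded
- Round 4 final verification: no new root cause (agent spread and attachment accumulation are known variants); investigation closed
In theory, usage can drop from 400-700 MB to **200-350 MB** (subject to the mimalloc/JSC architectural limits).
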
@@ -12,12 +12,12 @@ Claude Code 将文件操作拆分为三个独立工具——这不是功能划

| Tool | Permission level | Core method | Key property |
|------|---------|---------|---------|
| **Read** | read-only (no approval needed) | `isReadOnly() → true` | `maxResultSizeChars: Infinity` |
| **Read** | read-only (no approval needed) | `isReadOnly() → true` | `maxResultSizeChars: 100,000` |
| **Edit** | write (confirmation required) | `checkWritePermissionForTool()` | `maxResultSizeChars: 100,000` |
| **Write** | write (confirmation required) | `checkWritePermissionForTool()` | `maxResultSizeChars: 100,000` |

<Tip>
Read's `maxResultSizeChars` is `Infinity`, but that does not mean unlimited output: the real truncation happens in `validateContentTokens()` as a dynamic decision based on the token budget, not as a hard character limit.
Read's `maxResultSizeChars` is 100,000 (100KB). Results exceeding this threshold are persisted to disk, reducing memory pressure in long sessions. The actual token-level truncation is controlled dynamically by `validateContentTokens()`.
</Tip>

## FileRead: a multimodal file-reading engine

@@ -1,6 +1,6 @@
|
||||
{
|
||||
"name": "claude-code-best",
|
||||
"version": "2.0.0",
|
||||
"version": "2.0.4",
|
||||
"description": "Reverse-engineered Anthropic Claude Code CLI — interactive AI coding assistant in the terminal",
|
||||
"type": "module",
|
||||
"author": "claude-code-best <claude-code-best@proton.me>",
|
||||
|
||||
@@ -468,7 +468,11 @@ describe('DeepSeek thinking mode (enableThinking)', () => {
|
||||
expect(assistant.reasoning_content).toBe('First thought.\nSecond thought.')
|
||||
})
|
||||
|
||||
test('skips empty thinking blocks', () => {
|
||||
test('preserves empty thinking blocks as reasoning_content: "" (DeepSeek v4 thinking mode)', () => {
|
||||
// DeepSeek v4 thinking mode sometimes returns reasoning_content: ""
|
||||
// when the model answers directly without reasoning. The empty value
|
||||
// must be echoed back in the next request — otherwise DeepSeek returns
|
||||
// 400 ("reasoning_content ... must be passed back"). See issue #399.
|
||||
const result = anthropicMessagesToOpenAI(
|
||||
[
|
||||
makeUserMsg('question'),
|
||||
@@ -481,7 +485,23 @@ describe('DeepSeek thinking mode (enableThinking)', () => {
|
||||
{ enableThinking: true },
|
||||
)
|
||||
const assistant = result.filter(m => m.role === 'assistant')[0] as any
|
||||
expect(assistant.reasoning_content).toBe('')
|
||||
expect(assistant.content).toBe('Answer.')
|
||||
})
|
||||
|
||||
test('omits reasoning_content when no thinking block is present', () => {
|
||||
// No thinking block at all → no reasoning_content field on the
|
||||
// OpenAI-format assistant message (relevant for non-thinking models).
|
||||
const result = anthropicMessagesToOpenAI(
|
||||
[
|
||||
makeUserMsg('question'),
|
||||
makeAssistantMsg([{ type: 'text', text: 'Answer.' }]),
|
||||
],
|
||||
[] as any,
|
||||
)
|
||||
const assistant = result.filter(m => m.role === 'assistant')[0] as any
|
||||
expect(assistant.reasoning_content).toBeUndefined()
|
||||
expect(assistant.content).toBe('Answer.')
|
||||
})
|
||||
|
||||
// ── fix: reorder tool and user messages for OpenAI API compatibility (#168) ──
|
||||
|
||||
@@ -439,6 +439,54 @@ describe('thinking support (reasoning_content)', () => {
|
||||
expect(blockStarts[1].content_block.type).toBe('tool_use')
|
||||
})
|
||||
|
||||
test('opens thinking block on empty reasoning_content (DeepSeek v4 direct-answer)', async () => {
|
||||
// DeepSeek v4 thinking mode sometimes streams reasoning_content: ""
|
||||
// before answering directly. We must still open a thinking block so the
|
||||
// resulting assistant message carries an (empty) thinking block — that
|
||||
// round-trips back as reasoning_content: "" in the next request,
|
||||
// satisfying DeepSeek's requirement (see issue #399).
|
||||
const events = await collectEvents([
|
||||
makeChunk({
|
||||
choices: [
|
||||
{
|
||||
index: 0,
|
||||
delta: { reasoning_content: '' },
|
||||
finish_reason: null,
|
||||
},
|
||||
],
|
||||
}),
|
||||
makeChunk({
|
||||
choices: [
|
||||
{
|
||||
index: 0,
|
||||
delta: { content: 'Direct answer.' },
|
||||
finish_reason: null,
|
||||
},
|
||||
],
|
||||
}),
|
||||
makeChunk({
|
||||
choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
|
||||
}),
|
||||
])
|
||||
|
||||
// A thinking block was opened (and closed before the text block starts)
|
||||
const blockStarts = events.filter(
|
||||
e => e.type === 'content_block_start',
|
||||
) as any[]
|
||||
expect(blockStarts.length).toBe(2)
|
||||
expect(blockStarts[0].content_block.type).toBe('thinking')
|
||||
expect(blockStarts[0].content_block.thinking).toBe('')
|
||||
expect(blockStarts[1].content_block.type).toBe('text')
|
||||
|
||||
// No empty thinking_delta should be emitted — the empty string is
|
||||
// already conveyed by the thinking block's initial value.
|
||||
const thinkingDeltas = events.filter(
|
||||
e =>
|
||||
e.type === 'content_block_delta' && e.delta.type === 'thinking_delta',
|
||||
)
|
||||
expect(thinkingDeltas.length).toBe(0)
|
||||
})
|
||||
|
||||
test('thinking block index is 0, text block index is 1', async () => {
|
||||
const events = await collectEvents([
|
||||
makeChunk({
|
||||
|
||||
@@ -206,12 +206,14 @@ function convertInternalAssistantMessage(
      },
    })
  } else if (block.type === 'thinking') {
    // DeepSeek thinking mode: always preserve reasoning_content.
    // DeepSeek requires reasoning_content to be passed back in subsequent requests,
    // especially when tool calls are involved (returns 400 if missing).
    // DeepSeek thinking mode: always preserve reasoning_content,
    // including the empty-string case. DeepSeek v4 may return
    // reasoning_content: "" when the model answers directly, and the
    // empty value must be echoed back in the next request; otherwise
    // DeepSeek returns 400 ("reasoning_content ... must be passed back").
    const thinkingText = (block as unknown as Record<string, unknown>)
      .thinking
    if (typeof thinkingText === 'string' && thinkingText) {
    if (typeof thinkingText === 'string') {
      reasoningParts.push(thinkingText)
    }
  }
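
The behavioural change in this hunk can be summarised with a small standalone sketch: an empty thinking block yields `reasoning_content: ""`, while the absence of any thinking block yields no field at all. `Block` and `toReasoningContent` are hypothetical names for this example, and joining multiple thinking blocks with `'\n'` is an assumption inferred from the test expectation `'First thought.\nSecond thought.'`, not taken from the converter source.

```typescript
// Hypothetical, simplified content-block shape for illustration.
type Block =
  | { type: 'text'; text: string }
  | { type: 'thinking'; thinking: string };

// Returns the value to send back as reasoning_content, or undefined when
// the assistant message contained no thinking block at all.
function toReasoningContent(blocks: Block[]): string | undefined {
  const parts: string[] = [];
  for (const block of blocks) {
    // Empty strings are kept on purpose: only the type is checked,
    // not truthiness.
    if (block.type === 'thinking' && typeof block.thinking === 'string') {
      parts.push(block.thinking);
    }
  }
  // Assumption: multiple thinking blocks are joined with a newline.
  return parts.length > 0 ? parts.join('\n') : undefined;
}

const emptyThinking = toReasoningContent([
  { type: 'thinking', thinking: '' },
  { type: 'text', text: 'Answer.' },
]);
const noThinking = toReasoningContent([{ type: 'text', text: 'Answer.' }]);
const twoThoughts = toReasoningContent([
  { type: 'thinking', thinking: 'First thought.' },
  { type: 'thinking', thinking: 'Second thought.' },
]);
```

Distinguishing `''` from `undefined` is the whole point of the fix: a truthiness check collapses the two cases and drops the empty value that DeepSeek expects to be echoed back.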
|
||||
|
||||
@@ -106,9 +106,13 @@ export async function* adaptOpenAIStreamToAnthropic(
    // Skip chunks that carry only usage data (no delta content)
    if (!delta) continue

    // Handle reasoning_content → Anthropic thinking block
    // Handle reasoning_content → Anthropic thinking block.
    // Empty string is a valid signal: DeepSeek v4 thinking mode sometimes
    // returns reasoning_content: "" when the model answers directly. The
    // empty thinking block must round-trip back to the API in subsequent
    // requests, otherwise DeepSeek rejects with 400.
    const reasoningContent = (delta as any).reasoning_content
    if (reasoningContent != null && reasoningContent !== '') {
    if (reasoningContent != null) {
      if (!thinkingBlockOpen) {
        currentContentIndex++
        thinkingBlockOpen = true
@@ -125,14 +129,16 @@ export async function* adaptOpenAIStreamToAnthropic(
        } as BetaRawMessageStreamEvent
      }

      yield {
        type: 'content_block_delta',
        index: currentContentIndex,
        delta: {
          type: 'thinking_delta',
          thinking: reasoningContent,
        },
      } as BetaRawMessageStreamEvent
      if (reasoningContent !== '') {
        yield {
          type: 'content_block_delta',
          index: currentContentIndex,
          delta: {
            type: 'thinking_delta',
            thinking: reasoningContent,
          },
        } as BetaRawMessageStreamEvent
      }
    }

    // Handle text content

@@ -148,6 +148,12 @@ const baseInputSchema = lazySchema(() =>
|
||||
.boolean()
|
||||
.optional()
|
||||
.describe('Set to true to run this agent in the background. You will be notified when it completes.'),
|
||||
fork: z
|
||||
.boolean()
|
||||
.optional()
|
||||
.describe(
|
||||
'Set to true to fork from the parent conversation context. The child inherits full history, system prompt, and model. Requires FORK_SUBAGENT feature flag.',
|
||||
),
|
||||
}),
|
||||
);
|
||||
|
||||
@@ -191,24 +197,23 @@ const fullInputSchema = lazySchema(() => {
// type, but call() destructures via the explicit AgentToolInput type below
// which always includes all optional fields.
export const inputSchema = lazySchema(() => {
  const schema = feature('KAIROS') ? fullInputSchema() : fullInputSchema().omit({ cwd: true });

  // GrowthBook-in-lazySchema is acceptable here (unlike subagent_type, which
  // was removed in 906da6c723): the divergence window is one-session-per-
  // gate-flip via _CACHED_MAY_BE_STALE disk read, and worst case is either
  // "schema shows a no-op param" (gate flips on mid-session: param ignored
  // by forceAsync) or "schema hides a param that would've worked" (gate
  // flips off mid-session: everything still runs async via memoized
  // forceAsync). No Zod rejection, no crash, unlike required→optional.
  return isBackgroundTasksDisabled || isForkSubagentEnabled() ? schema.omit({ run_in_background: true }) : schema;
  const base = feature('KAIROS') ? fullInputSchema() : fullInputSchema().omit({ cwd: true });
  return isBackgroundTasksDisabled
    ? !isForkSubagentEnabled()
      ? base.omit({ run_in_background: true, fork: true })
      : base.omit({ run_in_background: true })
    : !isForkSubagentEnabled()
      ? base.omit({ fork: true })
      : base;
});
type InputSchema = ReturnType<typeof inputSchema>;

// Explicit type widens the schema inference to always include all optional
// fields even when .omit() strips them for gating (cwd, run_in_background).
// subagent_type is optional; call() defaults it to general-purpose when the
// fork gate is off, or routes to the fork path when the gate is on.
// subagent_type is optional; call() defaults it to general-purpose.
// fork is gated by FORK_SUBAGENT flag; when omitted or flag is off, no fork.
type AgentToolInput = z.infer<ReturnType<typeof baseInputSchema>> & {
  fork?: boolean;
  name?: string;
  team_name?: string;
  mode?: z.infer<ReturnType<typeof permissionModeSchema>>;
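
The nested ternary in the new `inputSchema` decides which optional parameters are hidden from the schema. The sketch below restates that decision as a plain function over three booleans so the four cases are easier to check. `omittedKeys` and its option names are hypothetical; in the real code the KAIROS check has to stay a direct `feature('KAIROS')` call in condition position, so this is an explanatory model rather than a drop-in replacement.

```typescript
// Hypothetical model of the gating logic: returns the schema keys that
// would be omitted for a given combination of gates.
function omittedKeys(gates: {
  kairos: boolean;
  backgroundDisabled: boolean;
  forkEnabled: boolean;
}): string[] {
  const keys: string[] = [];
  // cwd is only exposed when the KAIROS feature is on.
  if (!gates.kairos) keys.push('cwd');
  // run_in_background is hidden when background tasks are disabled.
  if (gates.backgroundDisabled) keys.push('run_in_background');
  // fork is hidden unless the fork-subagent gate is enabled.
  if (!gates.forkEnabled) keys.push('fork');
  return keys;
}

const allGatesOpen = omittedKeys({
  kairos: true,
  backgroundDisabled: false,
  forkEnabled: true,
});
const allGatesClosed = omittedKeys({
  kairos: false,
  backgroundDisabled: true,
  forkEnabled: false,
});
const forkOnly = omittedKeys({
  kairos: true,
  backgroundDisabled: true,
  forkEnabled: true,
});
```

Compared with the previous version of the hunk, the fork gate no longer hides `run_in_background`; each gate now controls only its own parameter.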
|
||||
@@ -322,6 +327,7 @@ export const AgentTool = buildTool({
|
||||
{
|
||||
prompt,
|
||||
subagent_type,
|
||||
fork,
|
||||
description,
|
||||
model: modelParam,
|
||||
run_in_background,
|
||||
@@ -406,12 +412,11 @@ export const AgentTool = buildTool({
|
||||
return { data: spawnResult } as unknown as { data: Output };
|
||||
}
|
||||
|
||||
// Fork subagent experiment routing:
|
||||
// - subagent_type set: use it (explicit wins)
|
||||
// - subagent_type omitted, gate on: fork path (undefined)
|
||||
// - subagent_type omitted, gate off: default general-purpose
|
||||
const effectiveType = subagent_type ?? (isForkSubagentEnabled() ? undefined : GENERAL_PURPOSE_AGENT.agentType);
|
||||
const isForkPath = effectiveType === undefined;
|
||||
// Fork routing: explicit `fork: true` parameter triggers the fork path
|
||||
// (inherits parent context and model). Requires FORK_SUBAGENT flag.
|
||||
// subagent_type is ignored when fork takes effect.
|
||||
const isForkPath = fork === true && isForkSubagentEnabled();
|
||||
const effectiveType = subagent_type ?? GENERAL_PURPOSE_AGENT.agentType;
|
||||
|
||||
let selectedAgent: AgentDefinition;
|
||||
if (isForkPath) {
|
||||
@@ -692,10 +697,6 @@ export const AgentTool = buildTool({
|
||||
// dependency issues during test module loading.
|
||||
const isCoordinator = feature('COORDINATOR_MODE') ? isEnvTruthy(process.env.CLAUDE_CODE_COORDINATOR_MODE) : false;
|
||||
|
||||
// Fork subagent experiment: force ALL spawns async for a unified
|
||||
// <task-notification> interaction model (not just fork spawns — all of them).
|
||||
const forceAsync = isForkSubagentEnabled();
|
||||
|
||||
// Assistant mode: force all agents async. Synchronous subagents hold the
|
||||
// main loop's turn open until they complete — the daemon's inputQueue
|
||||
// backs up, and the first overdue cron catch-up on spawn becomes N
|
||||
@@ -709,7 +710,6 @@ export const AgentTool = buildTool({
|
||||
(run_in_background === true ||
|
||||
selectedAgent.background === true ||
|
||||
isCoordinator ||
|
||||
forceAsync ||
|
||||
assistantForceAsync ||
|
||||
(proactiveModule?.isProactiveActive() ?? false)) &&
|
||||
!isBackgroundTasksDisabled;
|
||||
@@ -889,7 +889,7 @@ export const AgentTool = buildTool({
|
||||
toolUseContext,
|
||||
rootSetAppState,
|
||||
agentIdForCleanup: asyncAgentId,
|
||||
-enableSummarization: isCoordinator || isForkSubagentEnabled() || getSdkAgentProgressSummariesEnabled(),
+enableSummarization: isCoordinator || isForkPath || getSdkAgentProgressSummariesEnabled(),
|
||||
getWorktreeResult: cleanupWorktreeIfNeeded,
|
||||
}),
|
||||
),
|
||||
|
||||
@@ -0,0 +1,69 @@
|
||||
import { describe, expect, test } from 'bun:test'
|
||||
import { readFileSync } from 'fs'
|
||||
import { join, dirname } from 'path'
|
||||
import { fileURLToPath } from 'url'
|
||||
|
||||
const __dirname = dirname(fileURLToPath(import.meta.url))
|
||||
const promptSource = readFileSync(join(__dirname, '..', 'prompt.ts'), 'utf-8')
|
||||
|
||||
describe('prompt.ts fork-related text verification', () => {
|
||||
test('does not contain "omit `subagent_type`" guidance', () => {
|
||||
expect(promptSource).not.toMatch(/omit.*subagent_type/)
|
||||
})
|
||||
|
||||
test('contains `fork: true` in at least 3 locations (shared + whenToFork + forkExamples)', () => {
|
||||
const matches = promptSource.match(/fork: true/g)
|
||||
expect(matches).not.toBeNull()
|
||||
expect(matches!.length).toBeGreaterThanOrEqual(3)
|
||||
})
|
||||
|
||||
test('all forkEnabled references are ternary conditions, not negated', () => {
|
||||
const lines = promptSource.split('\n')
|
||||
for (const line of lines) {
|
||||
if (
|
||||
line.includes('forkEnabled') &&
|
||||
!line.includes('const forkEnabled') &&
|
||||
!line.includes('forkEnabled =')
|
||||
) {
|
||||
expect(line).not.toContain('!forkEnabled')
|
||||
}
|
||||
}
|
||||
})
|
||||
|
||||
test('uses "non-fork" terminology instead of "fresh agent"', () => {
|
||||
expect(promptSource).toContain('non-fork')
|
||||
// "fresh agent" should not appear in fork-aware conditional text
|
||||
const freshAgentMatches = promptSource.match(/fresh agent/g)
|
||||
if (freshAgentMatches) {
|
||||
// Only allowed in comments explaining behavior, not in prompt text
|
||||
const linesWithFreshAgent = promptSource
|
||||
.split('\n')
|
||||
.filter(line => line.includes('fresh agent'))
|
||||
.map(line => line.trim())
|
||||
for (const line of linesWithFreshAgent) {
|
||||
// "fresh agent" in the context of "starts fresh" (not fork-aware) is ok
|
||||
// but "fresh agent" in forkEnabled conditional should not appear
|
||||
expect(line).not.toMatch(/fresh agent.*subagent_type/)
|
||||
}
|
||||
}
|
||||
})
|
||||
|
||||
test('background task condition does not include !forkEnabled', () => {
|
||||
// The condition for showing background task instructions should not exclude fork
|
||||
const bgCondition = promptSource.match(
|
||||
/!isEnvTruthy.*isInProcessTeammate[\s\S]*?run_in_background/,
|
||||
)
|
||||
if (bgCondition) {
|
||||
expect(bgCondition[0]).not.toContain('!forkEnabled')
|
||||
}
|
||||
})
|
||||
|
||||
test('fork example includes fork: true parameter', () => {
|
||||
// The first fork example should have fork: true
|
||||
const forkExampleBlock = promptSource.match(
|
||||
/name: "ship-audit"[\s\S]*?Under 200 words/,
|
||||
)
|
||||
expect(forkExampleBlock).not.toBeNull()
|
||||
expect(forkExampleBlock![0]).toContain('fork: true')
|
||||
})
|
||||
})
|
||||
@@ -82,11 +82,7 @@ export async function getPrompt(
|
||||
|
||||
## When to fork
|
||||
|
||||
-Fork yourself (omit \`subagent_type\`) when the intermediate tool output isn't worth keeping in your context. The criterion is qualitative \u2014 "will I need this output again" \u2014 not task size.
-- **Research**: fork open-ended questions. If research can be broken into independent questions, launch parallel forks in one message. A fork beats a fresh subagent for this \u2014 it inherits context and shares your cache.
-- **Implementation**: prefer to fork implementation work that requires more than a couple of edits. Do research before jumping to implementation.
-
-Forks are cheap because they share your prompt cache. Don't set \`model\` on a fork \u2014 a different model can't reuse the parent's cache. Pass a short \`name\` (one or two words, lowercase) so the user can see the fork in the teams panel and steer it mid-run.
+When you need to delegate work that benefits from full conversation context (e.g., continuing a multi-file refactor where the child needs the same system prompt and history), use \`fork: true\`. For most tasks, prefer specialized agent types (Explore, Plan, general-purpose).
|
||||
|
||||
**Don't peek.** The tool result includes an \`output_file\` path — do not Read or tail it unless the user explicitly asks for a progress check. You get a completion notification; trust it. Reading the transcript mid-flight pulls the fork's tool noise into your context, which defeats the point of forking.
|
||||
|
||||
@@ -100,14 +96,14 @@ Forks are cheap because they share your prompt cache. Don't set \`model\` on a f
|
||||
|
||||
## Writing the prompt
|
||||
|
||||
${forkEnabled ? 'When spawning a fresh agent (with a `subagent_type`), it starts with zero context. ' : ''}Brief the agent like a smart colleague who just walked into the room — it hasn't seen this conversation, doesn't know what you've tried, doesn't understand why this task matters.
|
||||
${forkEnabled ? 'When spawning an agent without `fork: true`, it starts with zero context. ' : ''}Brief the agent like a smart colleague who just walked into the room — it hasn't seen this conversation, doesn't know what you've tried, doesn't understand why this task matters.
|
||||
- Explain what you're trying to accomplish and why.
|
||||
- Describe what you've already learned or ruled out.
|
||||
- Give enough context about the surrounding problem that the agent can make judgment calls rather than just following a narrow instruction.
|
||||
- If you need a short response, say so ("report in under 200 words").
|
||||
- Lookups: hand over the exact command. Investigations: hand over the question — prescribed steps become dead weight when the premise is wrong.
|
||||
|
||||
${forkEnabled ? 'For fresh agents, terse' : 'Terse'} command-style prompts produce shallow, generic work.
|
||||
${forkEnabled ? 'For non-fork agents, terse' : 'Terse'} command-style prompts produce shallow, generic work.
|
||||
|
||||
**Never delegate understanding.** Don't write "based on your findings, fix the bug" or "based on the research, implement it." Those phrases push synthesis onto the agent instead of doing it yourself. Write prompts that prove you understood: include file paths, line numbers, what specifically to change.
|
||||
`
|
||||
@@ -120,6 +116,7 @@ assistant: <thinking>Forking this \u2014 it's a survey question. I want the punc
|
||||
${AGENT_TOOL_NAME}({
|
||||
name: "ship-audit",
|
||||
description: "Branch ship-readiness audit",
|
||||
+fork: true,
|
||||
prompt: "Audit what's left before this branch can ship. Check: uncommitted changes, commits ahead of main, whether tests exist, whether the GrowthBook gate is wired up, whether CI-relevant files changed. Report a punch list \u2014 done vs. missing. Under 200 words."
|
||||
})
|
||||
assistant: Ship-readiness audit running.
|
||||
@@ -205,11 +202,7 @@ The ${AGENT_TOOL_NAME} tool launches specialized agents (subprocesses) that auto
|
||||
|
||||
${agentListSection}
|
||||
|
||||
${
|
||||
forkEnabled
|
||||
? `When using the ${AGENT_TOOL_NAME} tool, specify a subagent_type to use a specialized agent, or omit it to fork yourself — a fork inherits your full conversation context.`
|
||||
: `When using the ${AGENT_TOOL_NAME} tool, specify a subagent_type parameter to select which agent type to use. If omitted, the general-purpose agent is used.`
|
||||
}`
|
||||
When using the ${AGENT_TOOL_NAME} tool, specify a subagent_type parameter to select which agent type to use. If omitted, the general-purpose agent is used.${forkEnabled ? ` Set \`fork: true\` to fork from the parent conversation context, inheriting full history and model.` : ''}`
|
||||
|
||||
// Coordinator mode gets the slim prompt -- the coordinator system prompt
|
||||
// already covers usage notes, examples, and when-not-to-use guidance.
|
||||
@@ -257,14 +250,13 @@ Usage notes:
|
||||
- When the agent is done, it will return a single message back to you. The result returned by the agent is not visible to the user. To show the user the result, you should send a text message back to the user with a concise summary of the result.${
|
||||
// eslint-disable-next-line custom-rules/no-process-env-top-level
|
||||
!isEnvTruthy(process.env.CLAUDE_CODE_DISABLE_BACKGROUND_TASKS) &&
|
||||
!isInProcessTeammate() &&
|
||||
!forkEnabled
|
||||
!isInProcessTeammate()
|
||||
? `
|
||||
- You can optionally run agents in the background using the run_in_background parameter. When an agent runs in the background, you will be automatically notified when it completes — do NOT sleep, poll, or proactively check on its progress. Continue with other work or respond to the user instead.
|
||||
- **Foreground vs background**: Use foreground (default) when you need the agent's results before you can proceed — e.g., research agents whose findings inform your next steps. Use background when you have genuinely independent work to do in parallel.`
|
||||
: ''
|
||||
}
|
||||
- To continue a previously spawned agent, use ${SEND_MESSAGE_TOOL_NAME} with the agent's ID or name as the \`to\` field. The agent resumes with its full context preserved. ${forkEnabled ? 'Each fresh Agent invocation with a subagent_type starts without context — provide a complete task description.' : 'Each Agent invocation starts fresh — provide a complete task description.'}
|
||||
- To continue a previously spawned agent, use ${SEND_MESSAGE_TOOL_NAME} with the agent's ID or name as the \`to\` field. The agent resumes with its full context preserved. ${forkEnabled ? 'Each non-fork Agent invocation starts without context — provide a complete task description.' : 'Each Agent invocation starts fresh — provide a complete task description.'}
|
||||
- The agent's outputs should generally be trusted
|
||||
- Clearly tell the agent whether you expect it to write code or just to do research (search, file reads, web fetches, etc.)${forkEnabled ? '' : ", since it is not aware of the user's intent"}
|
||||
- If the agent description mentions that it should be used proactively, then you should try your best to use it without the user having to ask for it first. Use your judgement.
|
||||
|
||||
@@ -337,9 +337,10 @@ export type Output = z.infer<OutputSchema>
|
||||
export const FileReadTool = buildTool({
|
||||
name: FILE_READ_TOOL_NAME,
|
||||
searchHint: 'read files, images, PDFs, notebooks',
|
||||
// Output is bounded by maxTokens (validateContentTokens). Persisting to a
|
||||
// file the model reads back with Read is circular — never persist.
|
||||
maxResultSizeChars: Infinity,
|
||||
// Output is bounded by maxTokens (validateContentTokens). Results exceeding
|
||||
// 100KB are persisted to disk (reducing memory pressure in long sessions)
|
||||
// rather than kept in the message array indefinitely.
|
||||
maxResultSizeChars: 100_000,
|
||||
strict: true,
|
||||
async description() {
|
||||
return DESCRIPTION
|
||||
@@ -760,6 +761,16 @@ async function validateContentTokens(
|
||||
const effectiveMaxTokens =
|
||||
maxTokens ?? getDefaultFileReadingLimits().maxTokens
|
||||
|
||||
// Fast rejection: if raw byte count exceeds 4x the token limit,
|
||||
// no encoding can possibly fit (worst case is ~4 bytes/token).
|
||||
const byteLength = Buffer.byteLength(content)
|
||||
if (byteLength > effectiveMaxTokens * 4) {
|
||||
throw new MaxFileReadTokenExceededError(
|
||||
Math.ceil(byteLength / 4),
|
||||
effectiveMaxTokens,
|
||||
)
|
||||
}
|
||||
|
||||
const tokenEstimate = roughTokenCountEstimationForFileType(content, ext)
|
||||
if (!tokenEstimate || tokenEstimate <= effectiveMaxTokens / 4) return
|
||||
|
||||
|
||||
15
progress.md
Normal file
@@ -0,0 +1,15 @@
|
||||
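# Code Review Progress

## 2026-05-03: Round 1 code review of the CRUD business-logic layer

### Review scope

Reviewed four core CRUD modules: task management (tasks.ts), settings management (settings.ts), plugin management (installedPluginsManager.ts), and the teammate mailbox (teammateMailbox.ts).

### Changes

1. **Added `src/utils/__tests__/tasks.test.ts`**: 37 tests covering the full set of CRUD operations: creating, reading, updating, and deleting tasks; the high-water mark that prevents ID reuse; file-lock concurrency safety; the bidirectional blockTask relationship; claimTask race protection (including the agent_busy check); resetTaskList; the notification signal mechanism; and unique-ID verification under concurrent creation.

### Code review findings

- tasks.ts is soundly architected; the file lock plus the high-water mark guarantee concurrency safety
- settings.ts has an overly deep dependency chain (MDM / remote management / file system); its 63 existing tests give good coverage
- installedPluginsManager.ts has clear V1→V2 migration logic and a well-designed separation of in-memory and on-disk state
- teammateMailbox.ts has 25 existing tests covering its pure functions; the protocol-message detection functions are complete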
|
||||
@@ -49,10 +49,10 @@ export const DEFAULT_BUILD_FEATURES = [
|
||||
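'DAEMON', // Daemon mode: a long-lived supervisor manages background workers (not a main cause of the GB-scale leak)
'ACP', // ACP agent protocol; lets external agents connect
'WORKFLOW_SCRIPTS', // Workflow scripts (YAML/MD under .claude/workflows/)
-'HISTORY_SNIP', // History message snipping; compresses the context window
-'CONTEXT_COLLAPSE', // Context collapse; automatically compresses old messages
+// 'HISTORY_SNIP', // History message snipping; compresses the context window
+// 'CONTEXT_COLLAPSE', // Disabled: the implementation is an empty stub; enabling it suppresses auto compact, so context management stops working entirely
'MONITOR_TOOL', // Monitor tool; streams output from background processes
-// 'FORK_SUBAGENT', // Disabled: when enabled, the prompt steers the model to fork (inheriting the parent model) instead of using Explore (haiku), so exploration tasks run on a same-tier model
+// 'FORK_SUBAGENT', // Disabled: an explicit `fork: true` parameter triggers the fork path (inherits the parent's context and model); does not affect forceAsync or model selection for exploration tasks
// 'UDS_INBOX', // The inbox array only grows and never shrinks (not a main cause of the GB-scale leak)
'KAIROS', // Core of the Kairos scheduled-task system
// 'COORDINATOR_MODE', // Disabled: AgentSummary's 30s fork loop, the main cause of the GB-scale leak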
|
||||
|
||||
@@ -14,20 +14,18 @@ import { execSync } from 'node:child_process'
|
||||
const outdir = 'dist'
|
||||
|
||||
async function postBuild() {
|
||||
-// Step 1: Patch globalThis.Bun destructuring from third-party deps
-const files = await readdir(outdir, { recursive: true })
+// Step 1: Patch globalThis.Bun destructuring in the single bundled file
+const cliPath = join(outdir, 'cli.js')
|
||||
const BUN_DESTRUCTURE = /var \{([^}]+)\} = globalThis\.Bun;?/g
|
||||
const BUN_DESTRUCTURE_SAFE =
|
||||
'var {$1} = typeof globalThis.Bun !== "undefined" ? globalThis.Bun : {};'
|
||||
|
||||
let bunPatched = 0
|
||||
-for (const file of files) {
-const filePath = join(outdir, file)
-if (typeof file !== 'string' || !file.endsWith('.js')) continue
-const content = await readFile(filePath, 'utf-8')
+{
+const content = await readFile(cliPath, 'utf-8')
|
||||
if (BUN_DESTRUCTURE.test(content)) {
|
||||
await writeFile(
|
||||
-filePath,
+cliPath,
|
||||
content.replace(BUN_DESTRUCTURE, BUN_DESTRUCTURE_SAFE),
|
||||
)
|
||||
bunPatched++
|
||||
|
||||
132
spec/feature_20260502_F001_fork-agent-redesign/spec-design.md
Normal file
@@ -0,0 +1,132 @@
|
||||
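# Feature: 20260502_F001 - fork-agent-redesign

## Background

The current `FORK_SUBAGENT` feature flag is an all-or-nothing switch. When enabled, it forces three things at once:

1. Every agent call that omits `subagent_type` implicitly takes the fork path (inheriting the parent's full context and model)
2. Every agent spawn is forced to run asynchronously (`forceAsync` is bound to `isForkSubagentEnabled()`)
3. The prompt steers the model toward omitting `subagent_type`, so most agents run on a model of the same tier as the parent (expensive)

As a result, exploration tasks are forced onto the parent's model instead of haiku, and token consumption rises sharply. For that reason the flag is commented out in `defines.ts`.

## Goals

- Change fork from implicit behavior to an **explicitly triggered parameter** (`fork: true`)
- The FORK_SUBAGENT flag controls only whether the fork capability is available and **no longer affects other behavior such as `forceAsync`**
- The model is always inherited from the parent (existing behavior is kept)
- **Fully backward compatible**: when the `fork` parameter is not passed, behavior matches the current behavior with the flag off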
|
||||
|
||||
## Design

### Schema changes

The Agent tool gains a new `fork?: boolean` parameter that is visible only when the `FORK_SUBAGENT` flag is enabled (the schema is trimmed dynamically, reusing the existing schema memo pattern).

```ts
// Added to inputSchema
fork: z.boolean().optional().describe(
  'Set to true to fork from the parent conversation context. ' +
  'The child inherits full history, system prompt, and model. ' +
  'Requires FORK_SUBAGENT feature flag.'
)
```

When the flag is off, the schema drops the field via `.omit({ fork: true })` (the same way `run_in_background` is trimmed today).
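
The flag-dependent trimming can be sketched without zod. The snippet below is a minimal, hypothetical model: `FieldSpec`, `baseSchema`, `omitFields`, and `buildInputSchema` are illustrative names that do not appear in the source, and plain records stand in for the real zod object schema. It only mirrors the effect of `.omit({ fork: true })` and is not the actual implementation.

```ts
// Hypothetical stand-in for the zod input schema: a record of field specs.
type FieldSpec = { type: 'string' | 'boolean'; optional: boolean };
type Schema = Record<string, FieldSpec>;

const baseSchema: Schema = {
  prompt: { type: 'string', optional: false },
  subagent_type: { type: 'string', optional: true },
  run_in_background: { type: 'boolean', optional: true },
  fork: { type: 'boolean', optional: true },
};

// Returns a copy of the schema without the named fields (mimics `.omit`).
function omitFields(schema: Schema, names: string[]): Schema {
  return Object.fromEntries(
    Object.entries(schema).filter(([name]) => !names.includes(name)),
  );
}

// Each field is trimmed independently of the other.
function buildInputSchema(opts: {
  forkEnabled: boolean;
  backgroundTasksDisabled: boolean;
}): Schema {
  let schema = baseSchema;
  if (opts.backgroundTasksDisabled) {
    schema = omitFields(schema, ['run_in_background']);
  }
  if (!opts.forkEnabled) {
    schema = omitFields(schema, ['fork']);
  }
  return schema;
}
```

With `forkEnabled: false`, the returned schema has no `fork` key at all, so a caller never sees the parameter; `run_in_background` is unaffected unless background tasks are disabled.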
|
||||
|
||||
### Routing logic refactor

The routing in `call()` in `AgentTool.tsx` changes from the current implicit check:

```ts
// Old behavior: omitting subagent_type → fork (when the flag is on)
const effectiveType = subagent_type ?? (isForkSubagentEnabled() ? undefined : GENERAL_PURPOSE_AGENT.agentType);
const isForkPath = effectiveType === undefined;
```

to an explicit parameter trigger:

```ts
// New behavior: triggered by the explicit fork parameter; fork takes precedence over subagent_type
const isForkPath = input.fork === true && isForkSubagentEnabled();
const effectiveType = subagent_type ?? GENERAL_PURPOSE_AGENT.agentType;
```

#### Decision table

| `fork` | `subagent_type` | Flag on | Result |
|--------|----------------|---------|------|
| `true` | set | yes | fork path, **subagent_type is ignored** |
| `true` | omitted | yes | fork path (inherits context) |
| `true` | * | no | fork is ignored; uses subagent_type or general-purpose |
| `false`/omitted | set | * | uses the specified agent type (existing behavior) |
| `false`/omitted | omitted | * | uses general-purpose (existing behavior) |

Core principle: **`fork: true` has the highest priority** (when the flag is on), but when the flag is off it degrades silently and does not affect existing behavior.
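
The decision table can be expressed as a small pure function. This is a hedged sketch, not the code in `AgentTool.tsx`: `Route`, `routeAgentCall`, and the `forkFlagEnabled` argument are hypothetical names, and the literal `'general-purpose'` is assumed to be the value of `GENERAL_PURPOSE_AGENT.agentType`.

```ts
// Hypothetical standalone model of the routing decision.
type Route =
  | { kind: 'fork' }
  | { kind: 'agent'; agentType: string };

// Assumed value of GENERAL_PURPOSE_AGENT.agentType.
const GENERAL_PURPOSE = 'general-purpose';

function routeAgentCall(
  input: { fork?: boolean; subagent_type?: string },
  forkFlagEnabled: boolean,
): Route {
  // Fork wins only when it is explicitly requested AND the flag is on;
  // subagent_type is ignored in that case.
  if (input.fork === true && forkFlagEnabled) {
    return { kind: 'fork' };
  }
  // Otherwise use the requested type, falling back to general-purpose.
  return { kind: 'agent', agentType: input.subagent_type ?? GENERAL_PURPOSE };
}

// Row 1 of the table: fork requested, type set, flag on.
const forked = routeAgentCall({ fork: true, subagent_type: 'Explore' }, true);
// Row 3 of the table: fork requested, but the flag is off.
const degraded = routeAgentCall({ fork: true, subagent_type: 'Explore' }, false);
```

Here `forked` is `{ kind: 'fork' }`, while `degraded` falls back to `{ kind: 'agent', agentType: 'Explore' }`, which is the silent degradation the core principle describes.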
|
||||
|
||||
### Background execution is decided by a parameter

Whether a fork agent runs in the background is decided by the `run_in_background` parameter, the same as for a regular agent. `forceAsync` is no longer bound to `isForkSubagentEnabled()`:

```ts
// forceAsync is no longer affected by isForkSubagentEnabled()
const forceAsync = /* other conditions (coordinator, assistant mode, etc.) */;
```

Fork agents and regular agents share the same `run_in_background` logic:

- `run_in_background: true` → runs asynchronously in the background
- `run_in_background: false` / omitted → runs synchronously and blocks
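
The async decision after decoupling can be sketched as a pure function. The field names in `AsyncInputs` are hypothetical; they loosely mirror the conditions visible in the `shouldRunAsync` expression of the diff, and the point of the sketch is that no fork-related input takes part in the decision.

```ts
// Hypothetical flattened view of the inputs to the async decision.
interface AsyncInputs {
  runInBackground?: boolean; // the run_in_background parameter
  agentBackground?: boolean; // the selected agent's own `background` setting
  isCoordinator: boolean;
  assistantForceAsync: boolean;
  proactiveActive: boolean;
  backgroundTasksDisabled: boolean;
}

function shouldRunAsync(i: AsyncInputs): boolean {
  // Any of these conditions requests async execution; fork state is absent.
  const wantsAsync =
    i.runInBackground === true ||
    i.agentBackground === true ||
    i.isCoordinator ||
    i.assistantForceAsync ||
    i.proactiveActive;
  // Disabling background tasks always overrides the request.
  return wantsAsync && !i.backgroundTasksDisabled;
}
```

Because the fork flag is not an input, a fork call with `run_in_background` omitted runs synchronously under this model unless one of the other conditions applies.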
|
||||
|
||||
### Prompt adjustments

Remove the prompt text that steers the model to "omit subagent_type to trigger a fork". Replace it with a description of when `fork: true` applies:

> When you need to delegate work that benefits from full conversation context (e.g., continuing a multi-file refactor where the child needs the same system prompt and history), use `fork: true`. For most tasks, prefer specialized agent types (Explore, Plan, general-purpose).

### Streamlining isForkSubagentEnabled()

The function's signature and behavior stay the same, but its meaning for callers changes: from an "implicit routing check" to a "parameter validation gate".

```ts
export function isForkSubagentEnabled(): boolean {
  if (!feature('FORK_SUBAGENT')) return false;
  if (isCoordinatorMode()) return false;
  if (getIsNonInteractiveSession()) return false;
  return true;
}
```
|
||||
|
||||
### What stays the same

The following remain unchanged and need no modification:

- `buildForkedMessages()`: fork message construction logic
- `isInForkChild()`: recursive-fork guard
- `FORK_AGENT`: fork agent definition (model: 'inherit', permissionMode: 'bubble')
- `buildChildMessage()`: instruction template for the fork child agent
- `buildWorktreeNotice()`: worktree isolation notice

## Implementation notes

1. **Dynamic schema trimming**: the `inputSchema` memo decides whether to `.omit({ fork: true })` based on `isForkSubagentEnabled()`; when the flag is off, the field does not exist in the schema
2. **Omitting `subagent_type` restores the original behavior**: it no longer implicitly takes the fork path and goes back to `GENERAL_PURPOSE_AGENT`
3. **`defines.ts` comment update**: `FORK_SUBAGENT` stays commented out, but its description is updated to the new behavior (explicit parameter trigger; no effect on model selection for exploration tasks)
4. **Recursive-fork guard**: keep the existing double check of `isInForkChild()` + `querySource`

### Files involved

| File | Change |
|------|------|
| `packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx` | Parse the new `fork` parameter, refactor the routing logic, decouple forceAsync |
| `packages/builtin-tools/src/tools/AgentTool/prompt.ts` | Remove the implicit fork guidance, add usage guidance for `fork: true` |
| `scripts/defines.ts` | Update the `FORK_SUBAGENT` comment |
|
||||
|
||||
## Acceptance criteria

- [ ] `fork: true` + `FORK_SUBAGENT` enabled → takes the fork path, inheriting the parent's context and model
- [ ] `fork: true` + `subagent_type` set + flag on → fork path; subagent_type is ignored
- [ ] `fork: true` + `FORK_SUBAGENT` disabled → fork is ignored; takes the regular agent path
- [ ] `fork` not passed → behavior is identical to the current flag-off behavior (general-purpose or the specified subagent_type)
- [ ] `forceAsync` no longer takes effect globally because of `isForkSubagentEnabled()`
- [ ] Whether a fork child agent runs in the background or synchronously is controlled by `run_in_background`, the same as for a regular agent
- [ ] `bun run precheck` passes with zero errors
|
||||
@@ -0,0 +1,170 @@
|
||||
# Fork Agent Explicit-Parameter Trigger Refactor: Manual Acceptance Checklist

**Generated:** 2026-05-02
**Related plan:** spec/feature_20260502_F001_fork-agent-redesign/spec-plan.md
**Related design:** spec/feature_20260502_F001_fork-agent-redesign/spec-design.md

---

## Pre-acceptance preparation

### Environment requirements
- [ ] [AUTO] Check the Bun version: `bun --version`
- [ ] [AUTO] Install dependencies: `bun install`
|
||||
|
||||
---
|
||||
|
||||
## Acceptance items

### Scenario 1: Schema and type changes

#### - [x] 1.1 The fork field has been added to baseInputSchema
- **Source:** spec-plan.md Task 1 / spec-design.md §Schema changes
- **Purpose:** Confirm that the fork parameter is declared in the base schema
- **Steps:**
  1. [A] `grep -nE 'fork\??:' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx | head -5` → expected to contain: `fork: z` (schema definition) and `fork?: boolean` (type declaration)

#### - [x] 1.2 The fork field is trimmed from the schema when the flag is off
- **Source:** spec-plan.md Task 1 / spec-design.md §Schema changes
- **Purpose:** Confirm that the fork field is not visible when FORK_SUBAGENT is off
- **Steps:**
  1. [A] `grep -n 'omit.*fork' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx` → expected to contain: `schema.omit({ fork: true })`

#### - [x] 1.3 The AgentToolInput type includes the fork field
- **Source:** spec-plan.md Task 1
- **Purpose:** Confirm that the type declaration matches the schema
- **Steps:**
  1. [A] `grep -n 'fork' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx | grep 'AgentToolInput\|fork?:'` → expected to contain: `fork?: boolean`
|
||||
|
||||
---
|
||||
|
||||
### Scenario 2: Routing logic refactor

#### - [x] 2.1 isForkPath is decided by the explicit fork parameter
- **Source:** spec-plan.md Task 1 / spec-design.md §Routing logic refactor
- **Purpose:** Confirm that the fork path is triggered explicitly by fork=true
- **Steps:**
  1. [A] `grep -n 'isForkPath' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx` → expected to contain: `fork === true && isForkSubagentEnabled()`

#### - [x] 2.2 forceAsync has been removed entirely
- **Source:** spec-plan.md Task 1 / spec-design.md §Background execution is decided by a parameter
- **Purpose:** Confirm that forceAsync is no longer bound to isForkSubagentEnabled()
- **Steps:**
  1. [A] `grep -c 'forceAsync' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx` → expected exactly: `0`

#### - [x] 2.3 isForkSubagentEnabled() is used only for schema trimming and routing
- **Source:** spec-plan.md Task 1
- **Purpose:** Confirm that isForkSubagentEnabled() no longer affects forceAsync/shouldRunAsync
- **Steps:**
  1. [A] `grep -n 'isForkSubagentEnabled' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx` → expected to contain: occurrences only in the inputSchema trimming and the isForkPath routing check

#### - [x] 2.4 shouldRunAsync is controlled by run_in_background
- **Source:** spec-plan.md Task 1 / spec-design.md §Background execution is decided by a parameter
- **Purpose:** Confirm that async behavior matches that of a regular agent
- **Steps:**
  1. [A] `grep -n 'run_in_background' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx | head -5` → expected to contain: a `shouldRunAsync` computation that includes `run_in_background === true` and no `forceAsync`

#### - [x] 2.5 enableSummarization uses isForkPath instead of isForkSubagentEnabled()
- **Source:** spec-plan.md Task 1
- **Purpose:** Confirm that summarization is enabled only when the current call actually takes the fork path
- **Steps:**
  1. [A] `grep -n 'enableSummarization' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx` → expected to contain: `isForkPath`, and not `isForkSubagentEnabled()`
|
||||
|
||||
---
|
||||
|
||||
### Scenario 3: Prompt text updates

#### - [x] 3.1 The "omit subagent_type" guidance text is gone
- **Source:** spec-plan.md Task 2 / spec-design.md §Prompt adjustments
- **Purpose:** Confirm that the guidance for implicitly triggering a fork has been removed
- **Steps:**
  1. [A] `grep -c 'omit.*subagent_type' packages/builtin-tools/src/tools/AgentTool/prompt.ts` → expected exactly: `0`

#### - [x] 3.2 The explicit "fork: true" parameter is documented
- **Source:** spec-plan.md Task 2 / spec-design.md §Prompt adjustments
- **Purpose:** Confirm that the new explicit fork usage guidance has been written
- **Steps:**
  1. [A] `grep -c 'fork: true' packages/builtin-tools/src/tools/AgentTool/prompt.ts` → expected to contain: >= 3 (shared section + whenToForkSection + forkExamples)

#### - [x] 3.3 The background-task condition no longer includes !forkEnabled
- **Source:** spec-plan.md Task 2
- **Purpose:** Confirm that, after decoupling, the background-task instructions are also shown when fork is enabled
- **Steps:**
  1. [A] `grep -n 'forkEnabled' packages/builtin-tools/src/tools/AgentTool/prompt.ts` → expected to contain: every matching line in the form `forkEnabled ?`, with no `!forkEnabled`

#### - [x] 3.4 Terminology updated from "fresh agent" to "non-fork"
- **Source:** spec-plan.md Task 2
- **Purpose:** Confirm that the prompt terminology matches the new explicit fork logic
- **Steps:**
  1. [A] `grep -c 'non-fork' packages/builtin-tools/src/tools/AgentTool/prompt.ts` → expected to contain: >= 2
|
||||
|
||||
---
|
||||
|
||||
### Scenario 4: Edge cases and regression (decision-table verification)

#### - [x] 4.1 fork=true + subagent_type + flag on → fork path, subagent_type ignored
- **Source:** spec-design.md §Decision table + spec-plan.md Task 3
- **Purpose:** Confirm that fork takes precedence over subagent_type
- **Steps:**
  1. [A] `grep -A2 'isForkPath = fork === true' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx` → expected to contain: `effectiveType = subagent_type ?? GENERAL_PURPOSE_AGENT.agentType` (when fork takes effect, isForkPath overrides effectiveType, so subagent_type does not affect routing)

#### - [x] 4.2 fork=true + flag off → fork ignored, regular agent path
- **Source:** spec-design.md §Decision table
- **Purpose:** Confirm that fork degrades silently when the flag is off
- **Steps:**
  1. [A] `grep 'isForkPath = fork === true && isForkSubagentEnabled' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx` → expected to contain: `&& isForkSubagentEnabled()` (the double condition ensures isForkPath is false when the flag is off)

#### - [x] 4.3 fork omitted → general-purpose or the specified subagent_type
- **Source:** spec-design.md §Decision table
- **Purpose:** Confirm backward compatibility
- **Steps:**
  1. [A] `grep 'effectiveType = subagent_type ??' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx` → expected to contain: `GENERAL_PURPOSE_AGENT.agentType`
|
||||
|
||||
---
|
||||
|
||||
### Scenario 5: defines.ts comment and build verification

#### - [x] 5.1 The FORK_SUBAGENT comment describes the new behavior
- **Source:** spec-plan.md Task 1 / spec-design.md §Implementation notes
- **Purpose:** Confirm that the comment reflects the explicit-parameter trigger design
- **Steps:**
  1. [A] `grep 'FORK_SUBAGENT' scripts/defines.ts` → expected to contain: the updated comment text describing the explicit `fork: true` trigger (`显式 \`fork: true\` 参数触发` in the source)

#### - [x] 5.2 All unit tests pass
- **Source:** spec-plan.md Task 1 + Task 2
- **Purpose:** Confirm that the routing logic and prompt text tests pass
- **Steps:**
  1. [A] `bun test packages/builtin-tools/src/tools/AgentTool/__tests__/ 2>&1 | tail -10` → expected to contain: `0 fail`

#### - [x] 5.3 precheck passes with zero errors
- **Source:** spec-plan.md Task 3 / spec-design.md §Acceptance criteria
- **Purpose:** Confirm that typecheck + lint + test show no regressions
- **Steps:**
  1. [A] `bun run precheck` → expected to contain: exit with zero errors
|
||||
|
||||
---
|
||||
|
||||
## Acceptance results summary

| Scenario | No. | Item | [A] | [H] | Result |
|------|------|--------|-----|-----|------|
| Scenario 1 | 1.1 | fork field added to baseInputSchema | 1 | 0 | ✅ |
| Scenario 1 | 1.2 | fork field trimmed from the schema when the flag is off | 1 | 0 | ✅ |
| Scenario 1 | 1.3 | AgentToolInput type includes the fork field | 1 | 0 | ✅ |
| Scenario 2 | 2.1 | isForkPath decided by the explicit fork parameter | 1 | 0 | ✅ |
| Scenario 2 | 2.2 | forceAsync removed entirely | 1 | 0 | ✅ |
| Scenario 2 | 2.3 | isForkSubagentEnabled() used only for schema trimming and routing | 1 | 0 | ✅ |
| Scenario 2 | 2.4 | shouldRunAsync controlled by run_in_background | 1 | 0 | ✅ |
| Scenario 2 | 2.5 | enableSummarization uses isForkPath | 1 | 0 | ✅ |
| Scenario 3 | 3.1 | "omit subagent_type" guidance text is gone | 1 | 0 | ✅ |
| Scenario 3 | 3.2 | explicit "fork: true" parameter is documented | 1 | 0 | ✅ |
| Scenario 3 | 3.3 | background-task condition no longer includes !forkEnabled | 1 | 0 | ✅ |
| Scenario 3 | 3.4 | terminology updated to "non-fork" | 1 | 0 | ✅ |
| Scenario 4 | 4.1 | fork=true + subagent_type + flag on → fork path | 1 | 0 | ✅ |
| Scenario 4 | 4.2 | fork=true + flag off → fork ignored | 1 | 0 | ✅ |
| Scenario 4 | 4.3 | fork omitted → general-purpose (backward compatible) | 1 | 0 | ✅ |
| Scenario 5 | 5.1 | FORK_SUBAGENT comment updated | 1 | 0 | ✅ |
| Scenario 5 | 5.2 | all unit tests pass | 1 | 0 | ✅ |
| Scenario 5 | 5.3 | precheck passes with zero errors | 1 | 0 | ✅ |

**Acceptance result:** ✅ All passed / ⬜ Issues found
|
||||
317
spec/feature_20260502_F001_fork-agent-redesign/spec-plan.md
Normal file
@@ -0,0 +1,317 @@
|
||||
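# Fork Agent Explicit-Parameter Trigger Refactor: Execution Plan

**Goal:** Change FORK_SUBAGENT from implicit behavior to an explicit `fork: true` parameter trigger, decouple forceAsync, and stay backward compatible

**Tech stack:** TypeScript, Zod schema, Bun test, React/Ink (prompt UI)

**Design doc:** spec/feature_20260502_F001_fork-agent-redesign/spec-design.md

## Change overview

- This change modifies 3 files: `AgentTool.tsx` (schema + routing + forceAsync decoupling), `prompt.ts` (guidance text), and `defines.ts` (comment update). It adds 1 new test file, `prompt.test.ts`.
- Task 1 is a prerequisite for Task 2: only after Task 1 completes the schema change and routing refactor can Task 2 safely adjust the prompt text (the behavior the prompt describes must match what the code actually does).
- Key design decision: the fork parameter is added to `baseInputSchema` rather than `fullInputSchema`, because fork is a basic agent capability, not one specific to multi-agent use.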
|
||||
|
||||
---
|
||||
|
||||
### Task 0: Environment setup

**Background:**
Make sure the build and test toolchain works in the current development environment so that later tasks are not blocked by environment problems.

**Steps:**
- [x] Verify the build tool is available
  - `bun --version`
  - Confirm that it prints the Bun version number
- [x] Verify the test tool is available
  - `bun test --help 2>&1 | head -3`
  - Confirm that the output contains test-related help text

**Checks:**
- [x] The build command succeeds
  - `bun run build 2>&1 | tail -5`
  - Expected: the build succeeds and the output includes dist/cli.js
- [x] Existing tests pass
  - `bun test packages/builtin-tools/src/tools/AgentTool/__tests__/ 2>&1 | tail -10`
  - Expected: all existing tests pass with no failures
|
||||
|
||||
---
|
||||
|
||||
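### Task 1: Core routing refactor

**Background:**
[Business context] When the `FORK_SUBAGENT` flag is enabled today, every agent call that omits `subagent_type` implicitly takes the fork path, which forces exploration tasks onto a model of the parent's tier and sharply increases token consumption. This refactor changes fork from implicit behavior to an explicit `fork: true` parameter trigger.
[Reason for change] The routing logic in `AgentTool.tsx` (`effectiveType` / `isForkPath`) decides the fork path by whether `subagent_type` is omitted; it needs to be triggered explicitly by the `fork` boolean parameter instead. In addition, the `forceAsync` variable is bound to `isForkSubagentEnabled()`, so every agent is forced asynchronous when the fork flag is on; this needs to be decoupled.
[Upstream/downstream impact] This task's output (the `fork` parameter and the new routing logic) is a dependency of Task 2 (prompt text adjustments). This task has no prerequisites.

**Files involved:**
- Modify: `packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx`
- Modify: `scripts/defines.ts`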
|
||||
|
||||
**Steps:**
- [x] Add the `fork` field to baseInputSchema
  - Location: `packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx:baseInputSchema()` (~L136-152), after the `run_in_background` field
  - After the closing `),` of the `run_in_background` field and before the closing `})`, add:
    ```ts
    fork: z
      .boolean()
      .optional()
      .describe(
        'Set to true to fork from the parent conversation context. The child inherits full history, system prompt, and model. Requires FORK_SUBAGENT feature flag.',
      ),
    ```
  - Reason: the fork parameter must be declared in the base schema, at the same level as `subagent_type` and `run_in_background`, because it is an optional parameter of every agent call and is not limited to multi-agent scenarios.

- [x] Refactor the trimming logic in the inputSchema memo
  - Location: `packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx:inputSchema()` (~L193-204)
  - Replace L194-203 with:
    ```ts
    let schema = feature('KAIROS') ? fullInputSchema() : fullInputSchema().omit({ cwd: true });
    if (isBackgroundTasksDisabled) {
      schema = schema.omit({ run_in_background: true });
    }
    if (!isForkSubagentEnabled()) {
      schema = schema.omit({ fork: true });
    }
    return schema;
    ```
  - Also delete the GrowthBook comment block at L196-202 (it describes the old `forceAsync` behavior and no longer applies).
  - Reason: the fork field is visible only when the `FORK_SUBAGENT` flag is enabled; `run_in_background` is no longer affected by `isForkSubagentEnabled()`, and the two are trimmed independently.
|
||||
|
||||
- [x] Update the AgentToolInput type declaration
  - Location: `packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx` (~L211-217), the `AgentToolInput` type definition
  - On the line after `z.infer<ReturnType<typeof baseInputSchema>> & {` (before `name?: string;`), add `fork?: boolean;`
  - Reason: the type declaration must include the `fork` field so that destructuring in `call()` gets correct type inference.

- [x] Update the fork gate comment near inputSchema
  - Location: `packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx` (~L207-210), the comment above `AgentToolInput`
  - Replace the comment at L209-210:
    ```ts
    // subagent_type is optional; call() defaults it to general-purpose when the
    // fork gate is off, or routes to the fork path when the gate is on.
    ```
  - with:
    ```ts
    // subagent_type is optional; call() defaults it to general-purpose.
    // fork is gated by FORK_SUBAGENT flag; when omitted or flag is off, no fork.
    ```
  - Reason: the old description is inconsistent with the new explicit fork trigger logic and needs updating.

- [x] Add the `fork` parameter to the destructuring in call()
  - Location: `packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx:call()` (~L322-333), parameter destructuring
  - After `subagent_type,` (L324), add `fork,`
  - Reason: `call()` needs to extract the `fork` value from the input for the routing decision.
|
||||
|
||||
- [x] Refactor the routing logic to an explicit fork trigger
- Location: `packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx:call()` (~L409-414)
- Replace L409-414 with:
```ts
// Fork routing: explicit `fork: true` parameter triggers the fork path
// (inherits parent context and model). Requires FORK_SUBAGENT flag.
// subagent_type is ignored when fork takes effect.
const isForkPath = fork === true && isForkSubagentEnabled();
const effectiveType = subagent_type ?? GENERAL_PURPOSE_AGENT.agentType;
```
- Reason: implicit routing (omitting `subagent_type` triggers a fork) becomes explicit parameter routing (`fork: true`), while omitting `subagent_type` still routes to general-purpose as before.
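The routing rule above can be captured as a small pure function, which also makes it easy to unit test. This is a sketch under assumptions: `resolveRoute`, `AgentInput`, `Route`, and the `GENERAL_PURPOSE` constant are hypothetical names standing in for the logic inside `call()` and for `GENERAL_PURPOSE_AGENT.agentType`.

```ts
// Hypothetical constant standing in for GENERAL_PURPOSE_AGENT.agentType.
const GENERAL_PURPOSE = 'general-purpose';

interface AgentInput {
  subagent_type?: string;
  fork?: boolean;
}

interface Route {
  isForkPath: boolean;
  effectiveType: string;
}

function resolveRoute(input: AgentInput, forkFlagEnabled: boolean): Route {
  // Fork only when the caller asks for it explicitly AND the flag is on.
  const isForkPath = input.fork === true && forkFlagEnabled;
  // Omitting subagent_type falls back to general-purpose, never to fork.
  const effectiveType = input.subagent_type ?? GENERAL_PURPOSE;
  return { isForkPath, effectiveType };
}
```

For instance, `resolveRoute({}, true)` does not take the fork path even though the flag is on, which is the behavioral change relative to the old implicit routing.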
|
||||
|
||||
- [x] Delete the forceAsync variable and its comment
- Location: `packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx:call()` (~L695-697)
- Delete L695-697 (the comment plus `const forceAsync = isForkSubagentEnabled();`)
- Reason: `forceAsync` is no longer tied to `isForkSubagentEnabled()`; whether a fork agent runs asynchronously is controlled by the `run_in_background` parameter, the same as for ordinary agents.

- [x] Remove the forceAsync condition from shouldRunAsync
- Location: `packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx:call()` (~L708-715)
- Remove `forceAsync ||` from the `shouldRunAsync` computation at L708-715:
```ts
const shouldRunAsync =
  (run_in_background === true ||
    selectedAgent.background === true ||
    isCoordinator ||
    assistantForceAsync ||
    (proactiveModule?.isProactiveActive() ?? false)) &&
  !isBackgroundTasksDisabled;
```
- Reason: the `forceAsync` variable has been deleted, so fork agents are no longer globally forced to run asynchronously.
|
||||
|
||||
- [x] Update enableSummarization to use isForkPath instead of isForkSubagentEnabled()
- Location: `packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx:call()` (~L892)
- Replace:
```ts
enableSummarization: isCoordinator || isForkSubagentEnabled() || getSdkAgentProgressSummariesEnabled(),
```
- with:
```ts
enableSummarization: isCoordinator || isForkPath || getSdkAgentProgressSummariesEnabled(),
```
- Reason: `enableSummarization` should be enabled only when the current call actually takes the fork path, not whenever the flag is enabled globally. `isForkPath` is the runtime decision for the current call.

- [x] Update the FORK_SUBAGENT comment in defines.ts
- Location: `scripts/defines.ts` (~L55)
- Replace:
```ts
// 'FORK_SUBAGENT', // 已禁用:启用后 prompt 引导模型用 fork(继承父模型)替代 Explore(haiku),导致探索任务使用同等级模型
```
- with:
```ts
// 'FORK_SUBAGENT', // 已禁用:显式 `fork: true` 参数触发 fork 路径(继承父级上下文和模型),不影响 forceAsync 和探索任务模型选择
```
- Reason: the old comment describes a problem with the implicit fork behavior; the new comment describes the current explicit-parameter design.

- [x] Write unit tests for the routing logic refactor
- Test file: `packages/builtin-tools/src/tools/AgentTool/__tests__/agentToolUtils.test.ts`
- Test scenarios (by exporting a routing helper function or by verifying the inputSchema trimming behavior):
- When `isForkSubagentEnabled()` returns false: `inputSchema()` does not contain the `fork` field (trimmed via `.omit({ fork: true })`)
- When `isBackgroundTasksDisabled` is true: `inputSchema()` does not contain the `run_in_background` field but still contains the `fork` field (with the fork flag enabled)
- When both conditions hold: `inputSchema()` omits both `run_in_background` and `fork`
- Run: `bun test packages/builtin-tools/src/tools/AgentTool/__tests__/agentToolUtils.test.ts`
- Expected: all tests pass
|
||||
|
||||
**Verification steps:**

- [x] Verify that the `fork` field was added to baseInputSchema
- `grep -nE 'fork\??:' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx | head -5`
- Expected: the output contains at least one `fork:` line from the schema definition and one `fork?:` line from the type

- [x] Verify that forceAsync is fully removed
- `grep -n 'forceAsync' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx`
- Expected: no output (grep exits with a non-zero status)

- [x] Verify that within call(), isForkSubagentEnabled() is used only for the routing decision
- `grep -n 'isForkSubagentEnabled' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx`
- Expected: it appears only in the `!isForkSubagentEnabled()` trimming condition of `inputSchema()` and in the routing expression `fork === true && isForkSubagentEnabled()`, not in shouldRunAsync or enableSummarization

- [x] Verify that the defines.ts comment was updated
- `grep 'FORK_SUBAGENT' scripts/defines.ts`
- Expected: the output line contains "显式 `fork: true` 参数触发" (the updated comment text)

- [x] Run precheck to confirm there are no type, lint, or test errors
- `bun run precheck`
- Expected: passes with zero errors
|
||||
|
||||
---
|
||||
|
||||
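### Task 2: Prompt text adjustments

**Background:**

[Business context] Task 1 changed fork from an implicit behavior (triggered by omitting `subagent_type`) to an explicit parameter (`fork: true`). The guidance text in prompt.ts must be updated to match; otherwise the model will keep trying to trigger a fork the old way.

[Reason for change] prompt.ts currently tells the model to omit `subagent_type` to trigger a fork (~L85, `omit \`subagent_type\``), and forkExamples leaves out `subagent_type` (implicit trigger). Both contradict the new routing logic from Task 1. In addition, the `!forkEnabled` display condition on the background-task instructions is no longer correct: Task 1 decoupled forceAsync, fork agents are no longer forced to run asynchronously, and the background-task instructions should also be shown when fork is enabled.

[Upstream and downstream impact] This task depends on Task 1 being complete (Task 1 refactored the routing logic; this task updates the matching prompt text). It changes prompt text only and does not affect runtime logic.

**Files involved:**

- Modify: `packages/builtin-tools/src/tools/AgentTool/prompt.ts`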
|
||||
|
||||
**Execution steps:**

- [x] Replace the fork trigger instructions in `whenToForkSection`
- Location: `packages/builtin-tools/src/tools/AgentTool/prompt.ts`, the `whenToForkSection` template literal inside `getPrompt()` (~L80-97)
- Replace the first paragraph under the `## When to fork` heading (from "Fork yourself (omit..." to "...Do research before jumping to implementation.") with:
```
When you need to delegate work that benefits from full conversation context (e.g., continuing a multi-file refactor where the child needs the same system prompt and history), use `fork: true`. For most tasks, prefer specialized agent types (Explore, Plan, general-purpose).
```
- The "Don't peek.", "Don't race.", and "Writing a fork prompt." paragraphs stay unchanged
- Reason: removes the "omit subagent_type" guidance and describes when `fork: true` applies instead

- [x] Update the terminology in `writingThePromptSection`
- Location: `packages/builtin-tools/src/tools/AgentTool/prompt.ts`, the `writingThePromptSection` template literal inside `getPrompt()` (~L99-113)
- Change the conditional text at ~L103 from `'When spawning a fresh agent (with a `subagent_type`), it starts with zero context. '` to `'When spawning an agent without `fork: true`, it starts with zero context. '`
- Change the conditional text at ~L110 from `'For fresh agents, terse'` to `'For non-fork agents, terse'`
- Reason: fork is now triggered explicitly via `fork: true`, so contrasting "fresh agent" with "fork" is no longer accurate; the text uses "non-fork agents" instead

- [x] Replace the fork usage instructions in the `shared` section
- Location: `packages/builtin-tools/src/tools/AgentTool/prompt.ts`, the `shared` template literal inside `getPrompt()` (~L208-212)
- Replace the whole conditional branch (`forkEnabled ? ... : ...`) with a single unified text:
```
When using the ${AGENT_TOOL_NAME} tool, specify a subagent_type parameter to select which agent type to use. If omitted, the general-purpose agent is used.${forkEnabled ? ` Set \`fork: true\` to fork from the parent conversation context, inheriting full history and model.` : ''}
```
- Reason: omitting `subagent_type` now always routes to general-purpose, so the two branches collapse into one base text plus an appended fork note
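The "base text plus appended fork note" structure can be sketched as a small function. This is an illustration only: `sharedUsageText` is a hypothetical helper, and the value assigned to `AGENT_TOOL_NAME` here is an assumption, not the value used in `prompt.ts`.

```ts
// Assumed value, for illustration only; the real constant lives elsewhere.
const AGENT_TOOL_NAME = 'Agent';

function sharedUsageText(forkEnabled: boolean): string {
  // Base text is identical in both cases: omitting subagent_type means general-purpose.
  const base =
    `When using the ${AGENT_TOOL_NAME} tool, specify a subagent_type parameter ` +
    'to select which agent type to use. If omitted, the general-purpose agent is used.';
  // The fork note is appended only when the feature flag is on.
  const forkNote = forkEnabled
    ? ' Set `fork: true` to fork from the parent conversation context, inheriting full history and model.'
    : '';
  return base + forkNote;
}
```

Keeping the base sentence outside the conditional means the general-purpose fallback is documented whether or not the flag is enabled.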
|
||||
|
||||
- [x] Remove the `!forkEnabled` condition from the background-task instructions
- Location: `packages/builtin-tools/src/tools/AgentTool/prompt.ts`, the condition guarding the background-task instructions inside `getPrompt()` (~L259-261)
- Change the condition from `!isEnvTruthy(process.env.CLAUDE_CODE_DISABLE_BACKGROUND_TASKS) && !isInProcessTeammate() && !forkEnabled` to `!isEnvTruthy(process.env.CLAUDE_CODE_DISABLE_BACKGROUND_TASKS) && !isInProcessTeammate()`
- Reason: Task 1 decoupled forceAsync, so fork agents are no longer forced to run asynchronously and the background-task instructions should also be shown when fork is enabled

- [x] Update the terminology in the continue-agent note
- Location: `packages/builtin-tools/src/tools/AgentTool/prompt.ts`, the continue-agent note inside `getPrompt()` (~L267)
- Change the conditional text from `'Each fresh Agent invocation with a subagent_type starts without context — provide a complete task description.'` to `'Each non-fork Agent invocation starts without context — provide a complete task description.'`
- Reason: keeps the terminology consistent with writingThePromptSection

- [x] Update the first example call in `forkExamples` to add the `fork: true` parameter
- Location: `packages/builtin-tools/src/tools/AgentTool/prompt.ts`, the `forkExamples` template literal inside `getPrompt()` (~L120-124)
- In the `Agent({...})` call, add a `fork: true,` line after the `description:` line
- The second example (~L133-139) is a "mid-wait" scenario with no tool call and stays unchanged; the third example (~L141-154) uses `subagent_type: "code-reviewer"`, a fresh-agent scenario, and also stays unchanged
- Reason: the first example demonstrates fork usage and must pass `fork: true` explicitly

- [x] Write unit tests for the fork-related text changes in prompt.ts
- Test file: `packages/builtin-tools/src/tools/AgentTool/__tests__/prompt.test.ts`
- Test scenarios:
- With `forkEnabled = true`: the prompt does not contain the "omit `subagent_type`" text and does contain "`fork: true`"
- With `forkEnabled = true`: the prompt contains the "non-fork" term (replacing "fresh agent")
- With `forkEnabled = true`: the prompt contains the "Set `fork: true` to fork from the parent" note
- With `forkEnabled = true`: the prompt contains the background-task instructions (`run_in_background`)
- With `forkEnabled = false`: the prompt does not contain "`fork: true`" and has no "When to fork" section
- With `forkEnabled = false`: the prompt contains the "general-purpose agent" fallback note
- Mock list: `isForkSubagentEnabled` (returns true/false), `getFeatureValue_CACHED_MAY_BE_STALE` (returns false), `shouldInjectAgentListInMessages` (returns false), `isInProcessTeammate` (returns false), `isTeammate` (returns false), `getSubscriptionType` (returns 'pro'), `hasEmbeddedSearchTools` (returns false), environment variable `CLAUDE_CODE_DISABLE_BACKGROUND_TASKS` undefined
- Run: `bun test packages/builtin-tools/src/tools/AgentTool/__tests__/prompt.test.ts`
- Expected: all tests pass
|
||||
|
||||
**Verification steps:**

- [x] Verify that the prompt no longer contains the "omit `subagent_type`" guidance
- `grep -n "omit" packages/builtin-tools/src/tools/AgentTool/prompt.ts`
- Expected: no output

- [x] Verify that the prompt contains the "`fork: true`" text
- `grep -c "fork: true" packages/builtin-tools/src/tools/AgentTool/prompt.ts`
- Expected: output >= 3 (shared section + whenToForkSection + forkExamples)

- [x] Verify that the background-task condition no longer contains `!forkEnabled`
- `grep -n "forkEnabled" packages/builtin-tools/src/tools/AgentTool/prompt.ts`
- Expected: every matching line uses `forkEnabled` as a ternary condition (`forkEnabled ?`); none contains `!forkEnabled`

- [x] Run the prompt unit tests
- `bun test packages/builtin-tools/src/tools/AgentTool/__tests__/prompt.test.ts`
- Expected: all tests pass

- [x] Run precheck to make sure there are no regressions
- `bun run precheck`
- Expected: passes with zero errors (typecheck + lint + test)
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Acceptance for explicit-parameter triggering of Fork Agent

**Prerequisites:**

- Start command: `bun run dev` (development mode)
- Environment variable: `FEATURE_FORK_SUBAGENT=1` enables the fork feature

**End-to-end verification:**

1. Run the full test suite to make sure there are no regressions
- `bun run precheck`
- Expected: typecheck + lint + test all pass with zero errors
- If it fails: check the changes from Task 1 (AgentTool.tsx routing logic) and Task 2 (prompt.ts text)

2. Verify that `fork: true` with the flag enabled takes the fork path
- `grep -n 'isForkPath = fork === true' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx`
- Expected: the routing line is found and shows the `fork === true && isForkSubagentEnabled()` condition
- If it fails: check the routing logic step in Task 1

3. Verify that the `fork` parameter is absent from the schema when the flag is off
- `grep -n 'omit.*fork' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx`
- Expected: the `schema.omit({ fork: true })` line is found
- If it fails: check the inputSchema trimming logic in Task 1

4. Verify that `forceAsync` is fully removed and no longer tied to `isForkSubagentEnabled()`
- `grep -c 'forceAsync' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx`
- Expected: 0 (no matches)
- If it fails: check the forceAsync deletion step in Task 1

5. Verify that the prompt no longer instructs the model to omit subagent_type to trigger a fork
- `grep -c 'omit.*subagent_type' packages/builtin-tools/src/tools/AgentTool/prompt.ts`
- Expected: 0 (no matches)
- `grep -c 'fork: true' packages/builtin-tools/src/tools/AgentTool/prompt.ts`
- Expected: >= 3 (shared section + whenToForkSection + forkExamples)
- If it fails: check the prompt text replacement steps in Task 2

6. Verify that background versus synchronous execution is controlled by the `run_in_background` parameter
- `grep -n 'run_in_background' packages/builtin-tools/src/tools/AgentTool/AgentTool.tsx | head -5`
- Expected: the `shouldRunAsync` computation includes the `run_in_background === true` condition and has no `forceAsync` condition
- If it fails: check the shouldRunAsync modification step in Task 1
|
||||
@@ -1003,6 +1003,15 @@ export class QueryEngine {
|
||||
uuid: msg.uuid,
|
||||
}
|
||||
}
|
||||
// Proactive truncation: prevent unbounded growth when API doesn't
|
||||
// return compact_boundary (e.g. third-party compat layers).
|
||||
if (feature('HISTORY_SNIP') && snipModule) {
|
||||
const truncated = snipModule.proactiveTruncate(this.mutableMessages)
|
||||
if (truncated !== this.mutableMessages) {
|
||||
this.mutableMessages.length = 0
|
||||
this.mutableMessages.push(...truncated)
|
||||
}
|
||||
}
|
||||
// Don't yield other system messages in headless mode
|
||||
break
|
||||
}
|
||||
|
||||
@@ -129,7 +129,7 @@ export async function updateCCB(): Promise<void> {
|
||||
|
||||
try {
|
||||
if (pkgManager === 'bun') {
|
||||
execSync(`bun update -g ${PACKAGE_NAME}`, {
|
||||
execSync(`bun install -g ${PACKAGE_NAME}@latest`, {
|
||||
stdio: 'inherit',
|
||||
cwd: homedir(),
|
||||
timeout: 120_000,
|
||||
@@ -153,7 +153,9 @@ export async function updateCCB(): Promise<void> {
|
||||
process.stderr.write('\n')
|
||||
process.stderr.write('Try manually updating with:\n')
|
||||
if (pkgManager === 'bun') {
|
||||
process.stderr.write(chalk.bold(` bun update -g ${PACKAGE_NAME}`) + '\n')
|
||||
process.stderr.write(
|
||||
chalk.bold(` bun install -g ${PACKAGE_NAME}@latest`) + '\n',
|
||||
)
|
||||
} else {
|
||||
process.stderr.write(
|
||||
chalk.bold(` npm install -g ${PACKAGE_NAME}@latest`) + '\n',
|
||||
|
||||
@@ -65,20 +65,40 @@ function wrapText(text: string, width: number, options?: { hard?: boolean }): st
|
||||
* 2. Distributing available space proportionally
|
||||
* 3. Wrapping text within cells (no truncation)
|
||||
* 4. Properly aligning multi-line rows with borders
|
||||
*
|
||||
* Performance: uses per-render caches (formatCache, plainTextCache, wrapCache)
|
||||
* to avoid redundant formatCell/wrapText calls across the multiple passes
|
||||
* (width calculation, row line counting, rendering). Wrapped in React.memo
|
||||
* to skip re-renders when props are unchanged.
|
||||
*/
|
||||
export function MarkdownTable({ token, highlight, forceWidth }: Props): React.ReactNode {
|
||||
export const MarkdownTable = React.memo(function MarkdownTable({
|
||||
token,
|
||||
highlight,
|
||||
forceWidth,
|
||||
}: Props): React.ReactNode {
|
||||
const [theme] = useTheme();
|
||||
const { columns: actualTerminalWidth } = useTerminalSize();
|
||||
const terminalWidth = forceWidth ?? actualTerminalWidth;
|
||||
|
||||
// Format cell content to ANSI string
|
||||
// Per-render caches — Token[] references are stable within a single token
|
||||
// prop (from LRU cache in Markdown.tsx), so reference equality is sufficient.
|
||||
const formatCache = new Map<Token[] | undefined, string>();
|
||||
const plainTextCache = new Map<Token[] | undefined, string>();
|
||||
|
||||
function formatCell(tokens: Token[] | undefined): string {
|
||||
return tokens?.map(_ => formatToken(_, theme, 0, null, null, highlight)).join('') ?? '';
|
||||
const cached = formatCache.get(tokens);
|
||||
if (cached !== undefined) return cached;
|
||||
const result = tokens?.map(_ => formatToken(_, theme, 0, null, null, highlight)).join('') ?? '';
|
||||
formatCache.set(tokens, result);
|
||||
return result;
|
||||
}
|
||||
|
||||
// Get plain text (stripped of ANSI codes)
|
||||
function getPlainText(tokens: Token[] | undefined): string {
|
||||
return stripAnsi(formatCell(tokens));
|
||||
const cached = plainTextCache.get(tokens);
|
||||
if (cached !== undefined) return cached;
|
||||
const result = stripAnsi(formatCell(tokens));
|
||||
plainTextCache.set(tokens, result);
|
||||
return result;
|
||||
}
|
||||
|
||||
// Get the longest word width in a cell (minimum width to avoid breaking words)
|
||||
@@ -149,43 +169,39 @@ export function MarkdownTable({ token, highlight, forceWidth }: Props): React.Re
|
||||
columnWidths = minWidths.map(w => Math.max(Math.floor(w * scaleFactor), MIN_COLUMN_WIDTH));
|
||||
}
|
||||
|
||||
// Step 4: Calculate max row lines to determine if vertical format is needed
|
||||
function calculateMaxRowLines(): number {
|
||||
let maxLines = 1;
|
||||
// Check header
|
||||
for (let i = 0; i < token.header.length; i++) {
|
||||
const content = formatCell(token.header[i]!.tokens);
|
||||
const wrapped = wrapText(content, columnWidths[i]!, {
|
||||
hard: needsHardWrap,
|
||||
});
|
||||
maxLines = Math.max(maxLines, wrapped.length);
|
||||
}
|
||||
// Check rows
|
||||
for (const row of token.rows) {
|
||||
for (let i = 0; i < row.length; i++) {
|
||||
const content = formatCell(row[i]?.tokens);
|
||||
const wrapped = wrapText(content, columnWidths[i]!, {
|
||||
hard: needsHardWrap,
|
||||
});
|
||||
maxLines = Math.max(maxLines, wrapped.length);
|
||||
}
|
||||
}
|
||||
return maxLines;
|
||||
// Step 4: Single-pass cell preparation — wraps each cell once, caches results
|
||||
// for reuse by both row-line counting and rendering.
|
||||
const wrapCache = new Map<Token[] | undefined, string[]>();
|
||||
|
||||
function getWrappedLines(tokens: Token[] | undefined, colIndex: number): string[] {
|
||||
const cached = wrapCache.get(tokens);
|
||||
if (cached !== undefined) return cached;
|
||||
const formatted = formatCell(tokens);
|
||||
const lines = wrapText(formatted, columnWidths[colIndex]!, {
|
||||
hard: needsHardWrap,
|
||||
});
|
||||
wrapCache.set(tokens, lines);
|
||||
return lines;
|
||||
}
|
||||
|
||||
// Step 5: Calculate max row lines using cached wrapped results
|
||||
let maxRowLines = 1;
|
||||
for (let i = 0; i < token.header.length; i++) {
|
||||
maxRowLines = Math.max(maxRowLines, getWrappedLines(token.header[i]!.tokens, i).length);
|
||||
}
|
||||
for (const row of token.rows) {
|
||||
for (let i = 0; i < row.length; i++) {
|
||||
maxRowLines = Math.max(maxRowLines, getWrappedLines(row[i]?.tokens, i).length);
|
||||
}
|
||||
}
|
||||
|
||||
// Use vertical format if wrapping would make rows too tall
|
||||
const maxRowLines = calculateMaxRowLines();
|
||||
const useVerticalFormat = maxRowLines > MAX_ROW_LINES;
|
||||
|
||||
// Render a single row with potential multi-line cells
|
||||
// Returns an array of strings, one per line of the row
|
||||
function renderRowLines(cells: Array<{ tokens?: Token[] }>, isHeader: boolean): string[] {
|
||||
// Get wrapped lines for each cell (preserving ANSI formatting)
|
||||
const cellLines = cells.map((cell, colIndex) => {
|
||||
const formattedText = formatCell(cell.tokens);
|
||||
const width = columnWidths[colIndex]!;
|
||||
return wrapText(formattedText, width, { hard: needsHardWrap });
|
||||
});
|
||||
// Reuse cached wrapped lines — no redundant formatCell/wrapText
|
||||
const cellLines = cells.map((cell, colIndex) => getWrappedLines(cell.tokens, colIndex));
|
||||
|
||||
// Find max number of lines in this row
|
||||
const maxLines = Math.max(...cellLines.map(lines => lines.length), 1);
|
||||
@@ -231,6 +247,7 @@ export function MarkdownTable({ token, highlight, forceWidth }: Props): React.Re
|
||||
}
|
||||
|
||||
// Render vertical format (key-value pairs) for extra-narrow terminals
|
||||
// Uses formatCell cache; wrapping uses terminal-width params (not column widths)
|
||||
function renderVerticalFormat(): string {
|
||||
const lines: string[] = [];
|
||||
const headers = token.header.map(h => getPlainText(h.tokens));
|
||||
@@ -318,4 +335,4 @@ export function MarkdownTable({ token, highlight, forceWidth }: Props): React.Re
|
||||
|
||||
// Render as a single Ansi block to prevent Ink from wrapping mid-row
|
||||
return <Ansi>{tableLines.join('\n')}</Ansi>;
|
||||
}
|
||||
});
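The per-render caches in the hunks above are plain `Map`s keyed by the `Token[]` reference. As a minimal sketch of the same idea, the helper below memoizes a per-cell computation by reference. It is not the code from the diff: `makeCellCache` and the reduced `Token` shape are hypothetical, and unlike the `wrapCache` shown above, this variant also keys on the column index, on the assumption that the same array (or `undefined`) could otherwise be wrapped at two different widths.

```ts
// Hypothetical minimal token shape, for illustration only.
type Token = { raw: string };

function makeCellCache<V>(compute: (tokens: Token[] | undefined, col: number) => V) {
  // Outer key: array identity (reference equality). Inner key: column index.
  const byTokens = new Map<Token[] | undefined, Map<number, V>>();
  let computeCalls = 0;

  function get(tokens: Token[] | undefined, col: number): V {
    let byCol = byTokens.get(tokens);
    if (byCol === undefined) {
      byCol = new Map<number, V>();
      byTokens.set(tokens, byCol);
    }
    if (byCol.has(col)) return byCol.get(col) as V;
    computeCalls++;
    const value = compute(tokens, col);
    byCol.set(col, value);
    return value;
  }

  // `calls` exposes how often `compute` actually ran, which is handy in tests.
  return { get, calls: () => computeCalls };
}
```

Creating the cache inside the render function, as the diff does, keeps it scoped to a single render so stale entries cannot leak across prop changes.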
|
||||
|
||||
@@ -18,6 +18,7 @@ import type { Tools } from '../Tool.js';
|
||||
import { findToolByName } from '../Tool.js';
|
||||
import type { AgentDefinitionsResult } from '@claude-code-best/builtin-tools/tools/AgentTool/loadAgentsDir.js';
|
||||
import type {
|
||||
AssistantMessage,
|
||||
Message as MessageType,
|
||||
NormalizedMessage,
|
||||
ProgressMessage as ProgressMessageType,
|
||||
@@ -34,6 +35,9 @@ import { isFullscreenEnvEnabled } from '../utils/fullscreen.js';
|
||||
import { applyGrouping } from '../utils/groupToolUses.js';
|
||||
import {
|
||||
buildMessageLookups,
|
||||
computeMessageStructureKey,
|
||||
type MessageLookups,
|
||||
updateMessageLookupsIncremental,
|
||||
createAssistantMessage,
|
||||
deriveUUID,
|
||||
getMessagesAfterCompactBoundary,
|
||||
@@ -510,6 +514,18 @@ const MessagesImpl = ({
|
||||
// comment above for why this replaced count-based slicing.
|
||||
const sliceAnchorRef = useRef<SliceAnchor>(null);
|
||||
|
||||
// Cache for buildMessageLookups: avoids rebuilding 8 Maps/Sets when only
|
||||
// message content changed during streaming (text/thinking deltas). The key
|
||||
// captures only structural info (types, IDs), so content-only deltas skip
|
||||
// the rebuild entirely.
|
||||
const lookupsCacheRef = useRef<{
|
||||
key: string;
|
||||
lookups: MessageLookups;
|
||||
normalizedCount: number;
|
||||
messageCount: number;
|
||||
lastAssistantMsgId: string | undefined;
|
||||
} | null>(null);
|
||||
|
||||
// Expensive message transforms — filter, reorder, group, collapse, lookups.
|
||||
// All O(n) over 27k messages. Split from the renderRange slice so scrolling
|
||||
// (which only changes renderRange) doesn't re-run these. Previously this
|
||||
@@ -578,7 +594,59 @@ const MessagesImpl = ({
|
||||
verbose,
|
||||
);
|
||||
|
||||
const lookups = buildMessageLookups(normalizedMessages, messagesToShow as MessageType[]);
|
||||
const lookupsKey = computeMessageStructureKey(normalizedMessages, messagesToShow as MessageType[]);
|
||||
const currentLastAssistantMsgId = (() => {
|
||||
const lastMsg = (messagesToShow as MessageType[]).at(-1);
|
||||
return lastMsg?.type === 'assistant' ? (lastMsg as AssistantMessage).message?.id : undefined;
|
||||
})();
|
||||
let lookups: MessageLookups;
|
||||
if (lookupsCacheRef.current && lookupsCacheRef.current.key === lookupsKey) {
|
||||
lookups = lookupsCacheRef.current.lookups;
|
||||
} else if (
|
||||
lookupsCacheRef.current &&
|
||||
normalizedMessages.length >= lookupsCacheRef.current.normalizedCount &&
|
||||
(messagesToShow as MessageType[]).length >= lookupsCacheRef.current.messageCount &&
|
||||
// If lastAssistantMsgId changed, previous "in-progress" assistant may
|
||||
// now be orphaned — force a full rebuild to pick up the new status.
|
||||
lookupsCacheRef.current.lastAssistantMsgId === currentLastAssistantMsgId
|
||||
) {
|
||||
// Try incremental update when only new messages were appended
|
||||
const updated = updateMessageLookupsIncremental(
|
||||
lookupsCacheRef.current.lookups,
|
||||
lookupsCacheRef.current.normalizedCount,
|
||||
lookupsCacheRef.current.messageCount,
|
||||
normalizedMessages,
|
||||
messagesToShow as MessageType[],
|
||||
);
|
||||
if (updated) {
|
||||
lookups = updated;
|
||||
lookupsCacheRef.current = {
|
||||
key: lookupsKey,
|
||||
lookups,
|
||||
normalizedCount: normalizedMessages.length,
|
||||
messageCount: (messagesToShow as MessageType[]).length,
|
||||
lastAssistantMsgId: currentLastAssistantMsgId,
|
||||
};
|
||||
} else {
|
||||
lookups = buildMessageLookups(normalizedMessages, messagesToShow as MessageType[]);
|
||||
lookupsCacheRef.current = {
|
||||
key: lookupsKey,
|
||||
lookups,
|
||||
normalizedCount: normalizedMessages.length,
|
||||
messageCount: (messagesToShow as MessageType[]).length,
|
||||
lastAssistantMsgId: currentLastAssistantMsgId,
|
||||
};
|
||||
}
|
||||
} else {
|
||||
lookups = buildMessageLookups(normalizedMessages, messagesToShow as MessageType[]);
|
||||
lookupsCacheRef.current = {
|
||||
key: lookupsKey,
|
||||
lookups,
|
||||
normalizedCount: normalizedMessages.length,
|
||||
messageCount: (messagesToShow as MessageType[]).length,
|
||||
lastAssistantMsgId: currentLastAssistantMsgId,
|
||||
};
|
||||
}
|
||||
|
||||
const hiddenMessageCount = messagesToShowNotTruncated.length - MAX_MESSAGES_TO_SHOW_IN_TRANSCRIPT_MODE;
|
||||
|
||||
|
||||
@@ -320,6 +320,16 @@ async function doInitializeTelemetry(): Promise<void> {
|
||||
return
|
||||
}
|
||||
|
||||
// Skip entire OTel initialization when telemetry is not enabled.
|
||||
// Prevents PerformanceMeasure accumulation in long-running sessions.
|
||||
if (!isEnvTruthy(process.env.CLAUDE_CODE_ENABLE_TELEMETRY)) {
|
||||
telemetryInitialized = true
|
||||
logForDebugging(
|
||||
'[3P telemetry] Skipped — CLAUDE_CODE_ENABLE_TELEMETRY not set',
|
||||
)
|
||||
return
|
||||
}
|
||||
|
||||
// Set flag before init to prevent double initialization
|
||||
telemetryInitialized = true
|
||||
try {
|
||||
|
||||
70
src/query.ts
@@ -7,6 +7,9 @@ import type { CanUseToolFn } from './hooks/useCanUseTool.js'
|
||||
import { FallbackTriggeredError } from './services/api/withRetry.js'
|
||||
import {
|
||||
calculateTokenWarningState,
|
||||
estimateMaxTurnGrowth,
|
||||
getAutoCompactThreshold,
|
||||
getEffectiveContextWindowSize,
|
||||
isAutoCompactEnabled,
|
||||
type AutoCompactTrackingState,
|
||||
} from './services/compact/autoCompact.js'
|
||||
@@ -474,7 +477,7 @@ async function* queryLoop(
|
||||
queryTracking,
|
||||
}
|
||||
|
||||
let messagesForQuery = [...getMessagesAfterCompactBoundary(messages)]
|
||||
let messagesForQuery = getMessagesAfterCompactBoundary(messages)
|
||||
|
||||
let tracking = autoCompactTracking
|
||||
|
||||
@@ -529,6 +532,16 @@ async function* queryLoop(
|
||||
querySource,
|
||||
)
|
||||
messagesForQuery = microcompactResult.messages
|
||||
// Release original strings from contentReplacementState.replacements for
|
||||
// tool results whose content was replaced with the cleared message.
|
||||
if (microcompactResult.clearedToolUseIds?.length) {
|
||||
const replacements = toolUseContext?.contentReplacementState?.replacements
|
||||
if (replacements) {
|
||||
for (const id of microcompactResult.clearedToolUseIds) {
|
||||
replacements.delete(id)
|
||||
}
|
||||
}
|
||||
}
|
||||
// For cached microcompact (cache editing), defer boundary message until after
|
||||
// the API response so we can use actual cache_deleted_input_tokens.
|
||||
// Gated behind feature() so the string is eliminated from external builds.
|
||||
@@ -759,6 +772,48 @@ async function* queryLoop(
|
||||
}
|
||||
}
|
||||
|
||||
// Predictive autocompact: estimate if this turn's growth will push
|
||||
// us past the context window. Uses effectiveContextWindow directly
|
||||
// (without the autocompact buffer) to avoid double-reserving with
|
||||
// getAutoCompactThreshold which already subtracts buffer.
|
||||
if (!compactionResult && isAutoCompactEnabled()) {
|
||||
const model = toolUseContext.options.mainLoopModel
|
||||
const currentTokens =
|
||||
tokenCountWithEstimation(messagesForQuery) - snipTokensFreed
|
||||
const estimatedGrowth = estimateMaxTurnGrowth(model)
|
||||
const predictiveThreshold =
|
||||
getEffectiveContextWindowSize(model) - estimatedGrowth
|
||||
if (currentTokens > predictiveThreshold) {
|
||||
const predictiveResult = await deps.autocompact(
|
||||
messagesForQuery,
|
||||
toolUseContext,
|
||||
{
|
||||
systemPrompt,
|
||||
userContext,
|
||||
systemContext,
|
||||
toolUseContext,
|
||||
forkContextMessages: messagesForQuery,
|
||||
},
|
||||
querySource,
|
||||
tracking,
|
||||
snipTokensFreed,
|
||||
)
|
||||
if (predictiveResult.compactionResult) {
|
||||
messagesForQuery = buildPostCompactMessages(
|
||||
predictiveResult.compactionResult,
|
||||
)
|
||||
snipTokensFreed = 0
|
||||
tracking = tracking
|
||||
? {
|
||||
...tracking,
|
||||
compacted: true,
|
||||
consecutiveFailures: predictiveResult.consecutiveFailures ?? 0,
|
||||
}
|
||||
: tracking
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
let attemptWithFallback = true
|
||||
|
||||
queryCheckpoint('query_api_loop_start')
|
||||
@@ -1132,7 +1187,7 @@ async function* queryLoop(
|
||||
// Execute post-sampling hooks after model response is complete
|
||||
if (assistantMessages.length > 0) {
|
||||
void executePostSamplingHooks(
|
||||
[...messagesForQuery, ...assistantMessages],
|
||||
messagesForQuery.concat(assistantMessages),
|
||||
systemPrompt,
|
||||
userContext,
|
||||
systemContext,
|
||||
@@ -1854,11 +1909,10 @@ async function* queryLoop(
|
||||
userContext,
|
||||
systemContext,
|
||||
toolUseContext,
|
||||
forkContextMessages: [
|
||||
...messagesForQuery,
|
||||
...assistantMessages,
|
||||
...toolResults,
|
||||
],
|
||||
forkContextMessages: messagesForQuery.concat(
|
||||
assistantMessages,
|
||||
toolResults,
|
||||
),
|
||||
})
|
||||
}
|
||||
}
|
||||
@@ -1875,7 +1929,7 @@ async function* queryLoop(
|
||||
|
||||
queryCheckpoint('query_recursive_call')
|
||||
const next: State = {
|
||||
messages: [...messagesForQuery, ...assistantMessages, ...toolResults],
|
||||
messages: messagesForQuery.concat(assistantMessages, toolResults),
|
||||
toolUseContext: toolUseContextWithQueryTracking,
|
||||
autoCompactTracking: tracking,
|
||||
turnCount: nextTurnCount,
|
||||
|
||||
@@ -1566,7 +1566,15 @@ export function REPL({
|
||||
// Deferred messages for the Messages component — renders at transition
|
||||
// priority so the reconciler yields every 5ms, keeping input responsive
|
||||
// while the expensive message processing pipeline runs.
|
||||
const deferredMessages = useDeferredValue(messages);
|
||||
// Cap at 500 messages to limit memory double-buffering. The bypass
|
||||
// at display-time uses sync messages during streaming and non-loading,
|
||||
// so this cap only affects reduced-motion scenarios.
|
||||
const DEFERRED_CAP = 500;
|
||||
const cappedMessages = React.useMemo(
|
||||
() => (messages.length > DEFERRED_CAP ? messages.slice(-DEFERRED_CAP) : messages),
|
||||
[messages],
|
||||
);
|
||||
const deferredMessages = useDeferredValue(cappedMessages);
|
||||
const deferredBehind = messages.length - deferredMessages.length;
|
||||
if (deferredBehind > 0) {
|
||||
logForDebugging(
|
||||
|
||||
@@ -64,6 +64,35 @@ export const WARNING_THRESHOLD_BUFFER_TOKENS = 20_000
|
||||
export const ERROR_THRESHOLD_BUFFER_TOKENS = 20_000
|
||||
export const MANUAL_COMPACT_BUFFER_TOKENS = 3_000
|
||||
|
||||
// Conservative estimate for tool result growth per turn.
|
||||
// Typical tool results (file reads, grep, bash) average ~5-10K tokens;
|
||||
// occasional large reads can spike to 20K+.
|
||||
const TOOL_RESULT_GROWTH_ESTIMATE = 15_000
|
||||
|
||||
/**
|
||||
* Context-aware autocompact buffer. Larger context windows need more
|
||||
* headroom because a single turn can produce proportionally more tokens
|
||||
* (longer model outputs + larger tool results).
|
||||
*/
|
||||
export function getAutocompactBufferTokens(model: string): number {
|
||||
const effectiveWindow = getEffectiveContextWindowSize(model)
|
||||
if (effectiveWindow >= 800_000) return 50_000
|
||||
if (effectiveWindow >= 400_000) return 30_000
|
||||
return AUTOCOMPACT_BUFFER_TOKENS
|
||||
}
|
||||
|
||||
/**
|
||||
* Estimate the maximum token growth a single turn can produce.
|
||||
* Used for predictive autocompact checks before the API call.
|
||||
*/
|
||||
export function estimateMaxTurnGrowth(model: string): number {
|
||||
const maxOutput = Math.min(
|
||||
getMaxOutputTokensForModel(model),
|
||||
MAX_OUTPUT_TOKENS_FOR_SUMMARY,
|
||||
)
|
||||
return maxOutput + TOOL_RESULT_GROWTH_ESTIMATE
|
||||
}
|
||||
|
||||
// Stop trying autocompact after this many consecutive failures.
|
||||
// BQ 2026-03-10: 1,279 sessions had 50+ consecutive failures (up to 3,272)
|
||||
// in a single session, wasting ~250K API calls/day globally.
|
||||
@@ -73,7 +102,7 @@ export function getAutoCompactThreshold(model: string): number {
|
||||
const effectiveContextWindow = getEffectiveContextWindowSize(model)
|
||||
|
||||
const autocompactThreshold =
|
||||
effectiveContextWindow - AUTOCOMPACT_BUFFER_TOKENS
|
||||
effectiveContextWindow - getAutocompactBufferTokens(model)
|
||||
|
||||
// Override for easier testing of autocompact
|
||||
const envPercent = process.env.CLAUDE_AUTOCOMPACT_PCT_OVERRIDE
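The buffer tiers and the predictive check from this diff reduce to simple arithmetic. The sketch below works through one case under stated assumptions: the function names are hypothetical, `defaultBuffer` stands in for `AUTOCOMPACT_BUFFER_TOKENS` (whose value is not shown here), and the 200K window, 32K output limit, and 20K summary cap are made-up inputs rather than values from the codebase.

```ts
const TOOL_RESULT_GROWTH_ESTIMATE = 15_000; // value taken from the diff

// Buffer tiers as in the diff; `defaultBuffer` is a placeholder parameter.
function autocompactBuffer(effectiveWindow: number, defaultBuffer: number): number {
  if (effectiveWindow >= 800_000) return 50_000;
  if (effectiveWindow >= 400_000) return 30_000;
  return defaultBuffer;
}

// Worst-case growth for one turn: capped model output plus a tool-result estimate.
function maxTurnGrowth(modelMaxOutput: number, summaryOutputCap: number): number {
  return Math.min(modelMaxOutput, summaryOutputCap) + TOOL_RESULT_GROWTH_ESTIMATE;
}

// Compact ahead of time when the current usage already exceeds window minus growth.
function needsPredictiveCompact(currentTokens: number, effectiveWindow: number, growth: number): boolean {
  return currentTokens > effectiveWindow - growth;
}

// Hypothetical inputs: 200K window, 32K model output limit, 20K summary cap.
const growth = maxTurnGrowth(32_000, 20_000); // min(32K, 20K) + 15K
const threshold = 200_000 - growth;
```

With these assumed inputs, a conversation at 170K tokens would trigger predictive compaction while one at 160K would not.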
|
||||
|
||||
@@ -334,13 +334,12 @@ export type RecompactionInfo = {
|
||||
* Order: boundaryMarker, summaryMessages, messagesToKeep, attachments, hookResults
|
||||
*/
|
||||
export function buildPostCompactMessages(result: CompactionResult): Message[] {
|
||||
return [
|
||||
result.boundaryMarker,
|
||||
...result.summaryMessages,
|
||||
...(result.messagesToKeep ?? []),
|
||||
...result.attachments,
|
||||
...result.hookResults,
|
||||
]
|
||||
return ([result.boundaryMarker] as Message[]).concat(
|
||||
result.summaryMessages,
|
||||
result.messagesToKeep ?? [],
|
||||
result.attachments,
|
||||
result.hookResults,
|
||||
)
|
||||
}
|
||||
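The two forms in this hunk should produce the same ordering. The sketch below checks that on a simplified message type; `M`, `viaSpread`, and `viaConcat` are hypothetical names, and the diff does not state why the spread form was replaced.

```typescript
type M = { id: string }

// Original form: array literal with spreads.
function viaSpread(b: M, s: M[], k: M[] | undefined, a: M[], h: M[]): M[] {
  return [b, ...s, ...(k ?? []), ...a, ...h]
}

// Replacement form: a single concat call over the same parts.
function viaConcat(b: M, s: M[], k: M[] | undefined, a: M[], h: M[]): M[] {
  return ([b] as M[]).concat(s, k ?? [], a, h)
}

const boundary: M = { id: 'boundary' }
const parts: [M[], M[] | undefined, M[], M[]] = [
  [{ id: 's1' }],
  undefined,
  [{ id: 'a1' }],
  [{ id: 'h1' }],
]
console.log(viaConcat(boundary, ...parts).map(m => m.id).join(','))
```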
|
||||
/**
|
||||
|
||||
@@ -217,6 +217,10 @@ export type MicrocompactResult = {
|
||||
compactionInfo?: {
|
||||
pendingCacheEdits?: PendingCacheEdits
|
||||
}
|
||||
// Tool use IDs whose content was replaced with the cleared message.
|
||||
// Callers should remove these from contentReplacementState.replacements
|
||||
// to release the original strings from memory.
|
||||
clearedToolUseIds?: string[]
|
||||
}
|
||||
|
||||
/**
|
||||
@@ -528,5 +532,5 @@ function maybeTimeBasedMicrocompact(
|
||||
notifyCacheDeletion(querySource)
|
||||
}
|
||||
|
||||
return { messages: result }
|
||||
return { messages: result, clearedToolUseIds: [...clearSet] }
|
||||
}
|
||||
|
||||
@@ -1,25 +1,97 @@
|
||||
// Auto-generated stub — replace with real implementation
|
||||
export {}
|
||||
|
||||
import type { Message } from 'src/types/message'
|
||||
import type { CompactionResult } from './compact.js'
|
||||
import { isEnvTruthy } from '../../utils/envUtils.js'
|
||||
import {
|
||||
isMediaSizeErrorMessage,
|
||||
isPromptTooLongMessage,
|
||||
} from '../api/errors.js'
|
||||
import type { AssistantMessage, Message } from '../../types/message.js'
|
||||
import { type CompactionResult, compactConversation } from './compact.js'
|
||||
import { logError } from '../../utils/log.js'
|
||||
import { logForDebugging } from '../../utils/debug.js'
|
||||
import type { CacheSafeParams } from '../../utils/forkedAgent.js'
|
||||
|
||||
export const isReactiveOnlyMode: () => boolean = () => false
|
||||
|
||||
export const reactiveCompactOnPromptTooLong: (
|
||||
messages: Message[],
|
||||
cacheSafeParams: Record<string, unknown>,
|
||||
options: { customInstructions?: string; trigger?: string },
|
||||
) => Promise<{ ok: boolean; reason?: string; result?: CompactionResult }> =
|
||||
async () => ({ ok: false })
|
||||
export const isReactiveCompactEnabled: () => boolean = () => false
|
||||
export const isWithheldPromptTooLong: (message: Message) => boolean = () =>
|
||||
false
|
||||
export const isWithheldMediaSizeError: (message: Message) => boolean = () =>
|
||||
false
|
||||
async (messages, cacheSafeParams, options) => {
|
||||
const params = cacheSafeParams as unknown as CacheSafeParams
|
||||
try {
|
||||
const result = await compactConversation(
|
||||
messages,
|
||||
params.toolUseContext,
|
||||
params,
|
||||
true,
|
||||
options.customInstructions,
|
||||
true,
|
||||
{
|
||||
isRecompactionInChain: false,
|
||||
turnsSincePreviousCompact: 0,
|
||||
autoCompactThreshold: 0,
|
||||
querySource: 'compact',
|
||||
},
|
||||
)
|
||||
return { ok: true, result }
|
||||
} catch (error) {
|
||||
logError(error)
|
||||
return { ok: false, reason: String(error) }
|
||||
}
|
||||
}
|
||||
|
||||
export const isReactiveCompactEnabled: () => boolean = () => {
|
||||
if (isEnvTruthy(process.env.DISABLE_COMPACT)) return false
|
||||
return true
|
||||
}
|
||||
|
||||
export const isWithheldPromptTooLong: (message: Message) => boolean =
|
||||
message => {
|
||||
if (message.type !== 'assistant' || !message.isApiErrorMessage) return false
|
||||
return isPromptTooLongMessage(message as AssistantMessage)
|
||||
}
|
||||
|
||||
export const isWithheldMediaSizeError: (message: Message) => boolean =
|
||||
message => {
|
||||
if (message.type !== 'assistant' || !message.isApiErrorMessage) return false
|
||||
return isMediaSizeErrorMessage(message as AssistantMessage)
|
||||
}
|
||||
|
||||
export const tryReactiveCompact: (params: {
|
||||
hasAttempted: boolean
|
||||
querySource: string
|
||||
aborted: boolean
|
||||
messages: Message[]
|
||||
cacheSafeParams: Record<string, unknown>
|
||||
}) => Promise<CompactionResult | null> = async () => null
|
||||
}) => Promise<CompactionResult | null> = async ({
|
||||
hasAttempted,
|
||||
aborted,
|
||||
messages,
|
||||
cacheSafeParams,
|
||||
}) => {
|
||||
if (hasAttempted || aborted) return null
|
||||
const params = cacheSafeParams as unknown as CacheSafeParams
|
||||
try {
|
||||
const result = await compactConversation(
|
||||
messages,
|
||||
params.toolUseContext,
|
||||
params,
|
||||
true,
|
||||
undefined,
|
||||
true,
|
||||
{
|
||||
isRecompactionInChain: false,
|
||||
turnsSincePreviousCompact: 0,
|
||||
autoCompactThreshold: 0,
|
||||
},
|
||||
)
|
||||
return result
|
||||
} catch (error) {
|
||||
logForDebugging(
|
||||
`reactiveCompact: emergency compaction failed — ${String(error)}`,
|
||||
{ level: 'warn' },
|
||||
)
|
||||
logError(error)
|
||||
return null
|
||||
}
|
||||
}
|
||||
|
||||
@@ -163,3 +163,77 @@ export function isSnipRuntimeEnabled(): boolean {
|
||||
export function shouldNudgeForSnips(messages: Message[]): boolean {
|
||||
return messages.length >= SNIP_NUDGE_THRESHOLD
|
||||
}
|
||||
|
||||
/**
|
||||
* Maximum total character length of message content before proactive
|
||||
* truncation kicks in. 150 million characters corresponds to roughly
|
||||
* 37.5M tokens at 4 chars/token, nearly 190x the default 200k-token
|
||||
* context window and well beyond what any model can use in a single request.
|
||||
*/
|
||||
const PROACTIVE_TRUNCATE_CHARS = 150_000_000
|
||||
|
||||
/**
|
||||
* Minimum number of messages to keep when falling back to tail-only
|
||||
* retention (i.e. when no compact_boundary exists in the array).
|
||||
*/
|
||||
const PROACTIVE_TRUNCATE_MIN_TAIL = 50
|
||||
|
||||
/**
|
||||
* Proactively truncate old messages when the in-memory store grows too
|
||||
* large. Unlike `snipCompactIfNeeded` (which waits for a snip_boundary
|
||||
* from the API), this runs client-side after every push — ensuring
|
||||
* unbounded growth cannot happen even when the API never returns a
|
||||
* compact_boundary (e.g. third-party compat layers).
|
||||
*
|
||||
* Strategy:
|
||||
* 1. If a `compact_boundary` exists, keep it and everything after it.
|
||||
* 2. Otherwise, keep only the last `PROACTIVE_TRUNCATE_MIN_TAIL` messages.
|
||||
*
|
||||
* Returns the same array reference when no truncation is needed.
|
||||
*/
|
||||
export function proactiveTruncate(messages: Message[]): Message[] {
|
||||
if (messages.length < PROACTIVE_TRUNCATE_MIN_TAIL) return messages
|
||||
|
||||
let totalChars = 0
|
||||
for (const msg of messages) {
|
||||
const content = msg.message?.content
|
||||
if (typeof content === 'string') {
|
||||
totalChars += content.length
|
||||
} else if (Array.isArray(content)) {
|
||||
for (const block of content) {
|
||||
if (typeof block === 'string') {
|
||||
totalChars += (block as string).length
|
||||
} else if (block && typeof block === 'object') {
|
||||
const obj = block as unknown as Record<string, unknown>
|
||||
const text = obj.text ?? obj.content
|
||||
if (typeof text === 'string') {
|
||||
totalChars += text.length
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (totalChars < PROACTIVE_TRUNCATE_CHARS) return messages
|
||||
|
||||
// Find last compact_boundary — the standard anchor point
|
||||
let boundaryIdx = -1
|
||||
for (let i = messages.length - 1; i >= 0; i--) {
|
||||
const msg = messages[i]!
|
||||
if (
|
||||
msg.type === 'system' &&
|
||||
(msg as Record<string, unknown>).subtype === 'compact_boundary'
|
||||
) {
|
||||
boundaryIdx = i
|
||||
break
|
||||
}
|
||||
}
|
||||
|
||||
const keepFrom =
|
||||
boundaryIdx >= 0
|
||||
? boundaryIdx
|
||||
: Math.max(0, messages.length - PROACTIVE_TRUNCATE_MIN_TAIL)
|
||||
if (keepFrom === 0) return messages
|
||||
|
||||
return messages.slice(keepFrom)
|
||||
}
|
||||
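The keep-from rule in the strategy above can be sketched on a simplified message shape. `Msg` and `keepFromIndex` are hypothetical names for illustration; the real function also applies the character-count gate before truncating.

```typescript
type Msg = { type: string; subtype?: string; text: string }

// Anchor at the last compact_boundary if one exists, else keep only the tail.
function keepFromIndex(messages: Msg[], minTail: number): number {
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i]!
    if (m.type === 'system' && m.subtype === 'compact_boundary') return i
  }
  return Math.max(0, messages.length - minTail)
}

const history: Msg[] = [
  { type: 'user', text: 'a' },
  { type: 'system', subtype: 'compact_boundary', text: '' },
  { type: 'assistant', text: 'b' },
]
// The boundary sits at index 1, so the boundary and everything after it survive.
console.log(history.slice(keepFromIndex(history, 2)).length)
```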
|
||||
@@ -110,12 +110,15 @@ export async function connectDoubaoStream(
|
||||
let doubaoAsr: typeof import('doubaoime-asr')
|
||||
try {
|
||||
doubaoAsr = await import('doubaoime-asr')
|
||||
} catch {
|
||||
logError(new Error('[doubao-asr] Failed to import doubaoime-asr package'))
|
||||
callbacks.onError(
|
||||
'doubaoime-asr package is not installed. Install it with: bun add doubaoime-asr',
|
||||
{ fatal: true },
|
||||
} catch (err) {
|
||||
logError(
|
||||
new Error(
|
||||
`[doubao-asr] Failed to import doubaoime-asr package: ${String(err)}`,
|
||||
),
|
||||
)
|
||||
callbacks.onError(`doubaoime-asr package import failed: ${String(err)}`, {
|
||||
fatal: true,
|
||||
})
|
||||
return null
|
||||
}
|
||||
|
||||
|
||||
646
src/utils/__tests__/tasks.test.ts
Normal file
@@ -0,0 +1,646 @@
|
||||
import { mkdir, rm } from 'fs/promises'
|
||||
import { join } from 'path'
|
||||
import { tmpdir } from 'os'
|
||||
import { beforeEach, afterEach, describe, expect, mock, test } from 'bun:test'
|
||||
|
||||
import { logMock } from '../../../tests/mocks/log'
|
||||
import { debugMock } from '../../../tests/mocks/debug'
|
||||
|
||||
// Mock dependencies before importing the module under test
|
||||
mock.module('src/utils/log.ts', logMock)
|
||||
mock.module('src/utils/debug.ts', debugMock)
|
||||
mock.module('bun:bundle', () => ({
|
||||
feature: () => false,
|
||||
}))
|
||||
mock.module('src/bootstrap/state.ts', () => ({
|
||||
getSessionId: () => 'test-session-123',
|
||||
getIsNonInteractiveSession: () => false,
|
||||
}))
|
||||
mock.module('src/utils/teammate.ts', () => ({
|
||||
getTeamName: () => undefined,
|
||||
}))
|
||||
mock.module('src/utils/teammateContext.ts', () => ({
|
||||
getTeammateContext: () => undefined,
|
||||
}))
|
||||
mock.module('src/utils/slowOperations.ts', () => ({
|
||||
jsonParse: (s: string) => JSON.parse(s),
|
||||
jsonStringify: (
|
||||
v: unknown,
|
||||
...args: Parameters<typeof JSON.stringify>[1][]
|
||||
) => JSON.stringify(v, ...args),
|
||||
}))
|
||||
|
||||
import {
|
||||
createTask,
|
||||
getTask,
|
||||
updateTask,
|
||||
deleteTask,
|
||||
listTasks,
|
||||
blockTask,
|
||||
claimTask,
|
||||
resetTaskList,
|
||||
sanitizePathComponent,
|
||||
getTasksDir,
|
||||
notifyTasksUpdated,
|
||||
onTasksUpdated,
|
||||
setLeaderTeamName,
|
||||
clearLeaderTeamName,
|
||||
isTodoV2Enabled,
|
||||
type Task,
|
||||
} from '../tasks'
|
||||
|
||||
// Use a temp dir as CLAUDE_CONFIG_DIR for isolation
|
||||
let configDir: string
|
||||
const ORIGINAL_CONFIG_DIR = process.env.CLAUDE_CONFIG_DIR
|
||||
|
||||
beforeEach(async () => {
|
||||
configDir = join(
|
||||
tmpdir(),
|
||||
`claude-test-tasks-${Date.now()}-${Math.random().toString(36).slice(2)}`,
|
||||
)
|
||||
process.env.CLAUDE_CONFIG_DIR = configDir
|
||||
// Reset memoize cache by changing env
|
||||
const { getClaudeConfigHomeDir } = await import('src/utils/envUtils')
|
||||
getClaudeConfigHomeDir.cache.clear?.()
|
||||
})
|
||||
|
||||
afterEach(async () => {
|
||||
if (ORIGINAL_CONFIG_DIR !== undefined) {
|
||||
process.env.CLAUDE_CONFIG_DIR = ORIGINAL_CONFIG_DIR
|
||||
} else {
|
||||
delete process.env.CLAUDE_CONFIG_DIR
|
||||
}
|
||||
const { getClaudeConfigHomeDir } = await import('src/utils/envUtils')
|
||||
getClaudeConfigHomeDir.cache.clear?.()
|
||||
await rm(configDir, { recursive: true, force: true }).catch(() => {})
|
||||
})
|
||||
|
||||
const TASK_LIST_ID = 'test-list'
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// sanitizePathComponent
|
||||
// ---------------------------------------------------------------------------
|
||||
describe('sanitizePathComponent', () => {
|
||||
test('replaces non-alphanumeric characters with hyphens', () => {
|
||||
expect(sanitizePathComponent('hello world')).toBe('hello-world')
|
||||
})
|
||||
|
||||
test('preserves alphanumeric, hyphens and underscores', () => {
|
||||
expect(sanitizePathComponent('abc-123_XYZ')).toBe('abc-123_XYZ')
|
||||
})
|
||||
|
||||
test('handles path traversal attempts', () => {
|
||||
expect(sanitizePathComponent('../../../etc/passwd')).toBe(
|
||||
'---------etc-passwd',
|
||||
)
|
||||
})
|
||||
|
||||
test('handles empty string', () => {
|
||||
expect(sanitizePathComponent('')).toBe('')
|
||||
})
|
||||
})
|
||||
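The behaviour these tests pin down is consistent with a single regex replace. The sketch below is an assumption inferred from the assertions, not the implementation in `../tasks`; `sanitizeSketch` is a hypothetical name.

```typescript
// Replace every character outside [A-Za-z0-9_-] with a hyphen.
function sanitizeSketch(input: string): string {
  return input.replace(/[^a-zA-Z0-9_-]/g, '-')
}

console.log(sanitizeSketch('../../../etc/passwd'))
```

Because each `.` and `/` becomes its own hyphen, a traversal prefix collapses into a run of hyphens and cannot escape the tasks directory.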
|
||||
// ---------------------------------------------------------------------------
|
||||
// getTasksDir
|
||||
// ---------------------------------------------------------------------------
|
||||
describe('getTasksDir', () => {
|
||||
test('returns correct path under config home', () => {
|
||||
const dir = getTasksDir('my-list')
|
||||
expect(dir).toBe(join(configDir, 'tasks', 'my-list'))
|
||||
})
|
||||
|
||||
test('sanitizes task list ID', () => {
|
||||
const dir = getTasksDir('../evil')
|
||||
expect(dir).toBe(join(configDir, 'tasks', '---evil'))
|
||||
})
|
||||
})
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// createTask
|
||||
// ---------------------------------------------------------------------------
|
||||
describe('createTask', () => {
|
||||
test('creates a task with sequential ID starting at 1', async () => {
|
||||
const id = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Test task',
|
||||
description: 'A test task description',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
expect(id).toBe('1')
|
||||
|
||||
const task = await getTask(TASK_LIST_ID, id)
|
||||
expect(task).not.toBeNull()
|
||||
expect(task!.subject).toBe('Test task')
|
||||
expect(task!.status).toBe('pending')
|
||||
})
|
||||
|
||||
test('creates tasks with incrementing IDs', async () => {
|
||||
const id1 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'First',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const id2 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Second',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
expect(id1).toBe('1')
|
||||
expect(id2).toBe('2')
|
||||
})
|
||||
|
||||
test('preserves optional fields', async () => {
|
||||
const id = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Task with options',
|
||||
description: 'Has owner and activeForm',
|
||||
status: 'in_progress',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
owner: 'agent-1',
|
||||
activeForm: 'Working on task',
|
||||
metadata: { priority: 'high' },
|
||||
})
|
||||
const task = await getTask(TASK_LIST_ID, id)
|
||||
expect(task!.owner).toBe('agent-1')
|
||||
expect(task!.activeForm).toBe('Working on task')
|
||||
expect(task!.metadata).toEqual({ priority: 'high' })
|
||||
})
|
||||
|
||||
test('does not reuse IDs after deletion (high water mark)', async () => {
|
||||
const id1 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'To delete',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
await deleteTask(TASK_LIST_ID, id1)
|
||||
const id2 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'After delete',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
expect(id1).toBe('1')
|
||||
expect(id2).toBe('2')
|
||||
})
|
||||
})
|
||||
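The "high water mark" behaviour exercised above can be illustrated with an in-memory allocator. This is a sketch under assumptions: `IdAllocator` is a hypothetical class, and the real module persists tasks as files rather than holding them in memory.

```typescript
class IdAllocator {
  // Highest ID ever handed out; never decreases, even after deletes or resets.
  private highWaterMark = 0
  private live = new Set<string>()

  create(): string {
    const id = String(++this.highWaterMark)
    this.live.add(id)
    return id
  }

  // Returns false when the ID is not live, mirroring deleteTask's boolean result.
  delete(id: string): boolean {
    return this.live.delete(id)
  }

  // Drops all live IDs but keeps the high water mark.
  reset(): void {
    this.live.clear()
  }
}

const alloc = new IdAllocator()
const first = alloc.create()
alloc.delete(first)
console.log(first, alloc.create())
```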
|
||||
// ---------------------------------------------------------------------------
|
||||
// getTask
|
||||
// ---------------------------------------------------------------------------
|
||||
describe('getTask', () => {
|
||||
test('returns null for non-existent task', async () => {
|
||||
const task = await getTask(TASK_LIST_ID, '999')
|
||||
expect(task).toBeNull()
|
||||
})
|
||||
|
||||
test('returns task by ID', async () => {
|
||||
const id = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Find me',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const task = await getTask(TASK_LIST_ID, id)
|
||||
expect(task).not.toBeNull()
|
||||
expect(task!.id).toBe(id)
|
||||
expect(task!.subject).toBe('Find me')
|
||||
})
|
||||
|
||||
test('returns null for invalid JSON in task file', async () => {
|
||||
const { writeFile } = await import('fs/promises')
|
||||
const dir = getTasksDir(TASK_LIST_ID)
|
||||
await mkdir(dir, { recursive: true })
|
||||
await writeFile(join(dir, 'bad.json'), 'not valid json{{{')
|
||||
const task = await getTask(TASK_LIST_ID, 'bad')
|
||||
expect(task).toBeNull()
|
||||
})
|
||||
})
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// updateTask
|
||||
// ---------------------------------------------------------------------------
|
||||
describe('updateTask', () => {
|
||||
test('updates task fields', async () => {
|
||||
const id = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Original',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const updated = await updateTask(TASK_LIST_ID, id, {
|
||||
subject: 'Updated',
|
||||
status: 'in_progress',
|
||||
owner: 'agent-2',
|
||||
})
|
||||
expect(updated).not.toBeNull()
|
||||
expect(updated!.subject).toBe('Updated')
|
||||
expect(updated!.status).toBe('in_progress')
|
||||
expect(updated!.owner).toBe('agent-2')
|
||||
expect(updated!.id).toBe(id)
|
||||
})
|
||||
|
||||
test('returns null for non-existent task', async () => {
|
||||
const result = await updateTask(TASK_LIST_ID, '999', { subject: 'Nope' })
|
||||
expect(result).toBeNull()
|
||||
})
|
||||
|
||||
test('preserves unmodified fields', async () => {
|
||||
const id = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Keep this',
|
||||
description: 'Keep desc',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const updated = await updateTask(TASK_LIST_ID, id, { status: 'completed' })
|
||||
expect(updated!.subject).toBe('Keep this')
|
||||
expect(updated!.description).toBe('Keep desc')
|
||||
expect(updated!.status).toBe('completed')
|
||||
})
|
||||
})
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// deleteTask
|
||||
// ---------------------------------------------------------------------------
|
||||
describe('deleteTask', () => {
|
||||
test('deletes an existing task', async () => {
|
||||
const id = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Delete me',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const result = await deleteTask(TASK_LIST_ID, id)
|
||||
expect(result).toBe(true)
|
||||
const task = await getTask(TASK_LIST_ID, id)
|
||||
expect(task).toBeNull()
|
||||
})
|
||||
|
||||
test('returns false for non-existent task', async () => {
|
||||
const result = await deleteTask(TASK_LIST_ID, '999')
|
||||
expect(result).toBe(false)
|
||||
})
|
||||
|
||||
test('removes references from other tasks on delete', async () => {
|
||||
const id1 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Blocker',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const id2 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Blocked',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
// Set up block relationship
|
||||
await blockTask(TASK_LIST_ID, id1, id2)
|
||||
|
||||
// Delete the blocker
|
||||
await deleteTask(TASK_LIST_ID, id1)
|
||||
|
||||
// The blocked task should no longer reference the deleted task
|
||||
const remaining = await getTask(TASK_LIST_ID, id2)
|
||||
expect(remaining).not.toBeNull()
|
||||
expect(remaining!.blockedBy).not.toContain(id1)
|
||||
})
|
||||
})
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// listTasks
|
||||
// ---------------------------------------------------------------------------
|
||||
describe('listTasks', () => {
|
||||
test('returns empty array for empty list', async () => {
|
||||
const tasks = await listTasks(TASK_LIST_ID)
|
||||
expect(tasks).toEqual([])
|
||||
})
|
||||
|
||||
test('returns all tasks', async () => {
|
||||
await createTask(TASK_LIST_ID, {
|
||||
subject: 'A',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
await createTask(TASK_LIST_ID, {
|
||||
subject: 'B',
|
||||
description: '',
|
||||
status: 'completed',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const tasks = await listTasks(TASK_LIST_ID)
|
||||
expect(tasks).toHaveLength(2)
|
||||
const subjects = tasks.map(t => t.subject).sort()
|
||||
expect(subjects).toEqual(['A', 'B'])
|
||||
})
|
||||
})
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// blockTask
|
||||
// ---------------------------------------------------------------------------
|
||||
describe('blockTask', () => {
|
||||
test('creates bidirectional block relationship', async () => {
|
||||
const id1 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Blocker',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const id2 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Blocked',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const result = await blockTask(TASK_LIST_ID, id1, id2)
|
||||
expect(result).toBe(true)
|
||||
|
||||
const t1 = await getTask(TASK_LIST_ID, id1)
|
||||
const t2 = await getTask(TASK_LIST_ID, id2)
|
||||
expect(t1!.blocks).toContain(id2)
|
||||
expect(t2!.blockedBy).toContain(id1)
|
||||
})
|
||||
|
||||
test('returns false for non-existent task', async () => {
|
||||
const result = await blockTask(TASK_LIST_ID, '999', '998')
|
||||
expect(result).toBe(false)
|
||||
})
|
||||
|
||||
test('does not add duplicate block entries', async () => {
|
||||
const id1 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'A',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const id2 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'B',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
await blockTask(TASK_LIST_ID, id1, id2)
|
||||
await blockTask(TASK_LIST_ID, id1, id2)
|
||||
|
||||
const t1 = await getTask(TASK_LIST_ID, id1)
|
||||
expect(t1!.blocks.filter(id => id === id2)).toHaveLength(1)
|
||||
})
|
||||
})
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// claimTask
|
||||
// ---------------------------------------------------------------------------
|
||||
describe('claimTask', () => {
|
||||
test('claims an unowned task', async () => {
|
||||
const id = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Claimable',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const result = await claimTask(TASK_LIST_ID, id, 'agent-1')
|
||||
expect(result.success).toBe(true)
|
||||
expect(result.task!.owner).toBe('agent-1')
|
||||
})
|
||||
|
||||
test('allows same agent to re-claim', async () => {
|
||||
const id = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Reclaim',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
await claimTask(TASK_LIST_ID, id, 'agent-1')
|
||||
const result = await claimTask(TASK_LIST_ID, id, 'agent-1')
|
||||
expect(result.success).toBe(true)
|
||||
})
|
||||
|
||||
test('rejects claim by different agent if already owned', async () => {
|
||||
const id = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Owned',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
await claimTask(TASK_LIST_ID, id, 'agent-1')
|
||||
const result = await claimTask(TASK_LIST_ID, id, 'agent-2')
|
||||
expect(result.success).toBe(false)
|
||||
expect(result.reason).toBe('already_claimed')
|
||||
})
|
||||
|
||||
test('rejects claim on completed task', async () => {
|
||||
const id = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Done',
|
||||
description: '',
|
||||
status: 'completed',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const result = await claimTask(TASK_LIST_ID, id, 'agent-1')
|
||||
expect(result.success).toBe(false)
|
||||
expect(result.reason).toBe('already_resolved')
|
||||
})
|
||||
|
||||
test('rejects claim on blocked task', async () => {
|
||||
const id1 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Blocker',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const id2 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Blocked',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
await blockTask(TASK_LIST_ID, id1, id2)
|
||||
|
||||
const result = await claimTask(TASK_LIST_ID, id2, 'agent-1')
|
||||
expect(result.success).toBe(false)
|
||||
expect(result.reason).toBe('blocked')
|
||||
expect(result.blockedByTasks).toContain(id1)
|
||||
})
|
||||
|
||||
test('returns task_not_found for missing task', async () => {
|
||||
const result = await claimTask(TASK_LIST_ID, '999', 'agent-1')
|
||||
expect(result.success).toBe(false)
|
||||
expect(result.reason).toBe('task_not_found')
|
||||
})
|
||||
|
||||
test('rejects claim when agent is busy with checkAgentBusy', async () => {
|
||||
const id1 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'Owned task',
|
||||
description: '',
|
||||
status: 'in_progress',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
owner: 'agent-1',
|
||||
})
|
||||
// Write the task with owner directly via file
|
||||
const { writeFile } = await import('fs/promises')
|
||||
const dir = getTasksDir(TASK_LIST_ID)
|
||||
await mkdir(dir, { recursive: true })
|
||||
const taskData: Task = {
|
||||
id: id1,
|
||||
subject: 'Owned task',
|
||||
description: '',
|
||||
status: 'in_progress',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
owner: 'agent-1',
|
||||
}
|
||||
await writeFile(join(dir, `${id1}.json`), JSON.stringify(taskData))
|
||||
|
||||
const id2 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'New task',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const result = await claimTask(TASK_LIST_ID, id2, 'agent-1', {
|
||||
checkAgentBusy: true,
|
||||
})
|
||||
expect(result.success).toBe(false)
|
||||
expect(result.reason).toBe('agent_busy')
|
||||
expect(result.busyWithTasks).toContain(id1)
|
||||
})
|
||||
})
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// resetTaskList
|
||||
// ---------------------------------------------------------------------------
|
||||
describe('resetTaskList', () => {
|
||||
test('deletes all tasks and preserves high water mark', async () => {
|
||||
const id1 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'A',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
const id2 = await createTask(TASK_LIST_ID, {
|
||||
subject: 'B',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
await resetTaskList(TASK_LIST_ID)
|
||||
|
||||
const tasks = await listTasks(TASK_LIST_ID)
|
||||
expect(tasks).toHaveLength(0)
|
||||
|
||||
// Next ID should be higher than previous max
|
||||
const nextId = await createTask(TASK_LIST_ID, {
|
||||
subject: 'After reset',
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
expect(Number(nextId)).toBeGreaterThan(Number(id2))
|
||||
})
|
||||
})
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Notification signals
|
||||
// ---------------------------------------------------------------------------
|
||||
describe('task notifications', () => {
|
||||
test('notifyTasksUpdated fires subscriber', () => {
|
||||
let called = false
|
||||
const unsub = onTasksUpdated(() => {
|
||||
called = true
|
||||
})
|
||||
notifyTasksUpdated()
|
||||
expect(called).toBe(true)
|
||||
unsub()
|
||||
})
|
||||
|
||||
test('setLeaderTeamName triggers notification', () => {
|
||||
let callCount = 0
|
||||
const unsub = onTasksUpdated(() => {
|
||||
callCount++
|
||||
})
|
||||
setLeaderTeamName('team-alpha')
|
||||
expect(callCount).toBe(1)
|
||||
// Setting same name again should not fire
|
||||
setLeaderTeamName('team-alpha')
|
||||
expect(callCount).toBe(1)
|
||||
unsub()
|
||||
clearLeaderTeamName()
|
||||
})
|
||||
})
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// isTodoV2Enabled
|
||||
// ---------------------------------------------------------------------------
|
||||
describe('isTodoV2Enabled', () => {
|
||||
test('returns true when CLAUDE_CODE_ENABLE_TASKS is set', () => {
|
||||
process.env.CLAUDE_CODE_ENABLE_TASKS = '1'
|
||||
try {
|
||||
expect(isTodoV2Enabled()).toBe(true)
|
||||
} finally {
|
||||
delete process.env.CLAUDE_CODE_ENABLE_TASKS
|
||||
}
|
||||
})
|
||||
|
||||
test('returns true in interactive sessions by default', () => {
|
||||
delete process.env.CLAUDE_CODE_ENABLE_TASKS
|
||||
// getIsNonInteractiveSession is mocked to return false
|
||||
expect(isTodoV2Enabled()).toBe(true)
|
||||
})
|
||||
})
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Concurrent access (integration)
|
||||
// ---------------------------------------------------------------------------
|
||||
describe('concurrent task creation', () => {
|
||||
test('creates unique IDs under rapid sequential writes', async () => {
|
||||
// proper-lockfile advisory locks may not serialize same-process async
|
||||
// operations in Bun, so we use sequential writes to verify ID monotonicity.
|
||||
const ids: string[] = []
|
||||
for (let i = 0; i < 10; i++) {
|
||||
const id = await createTask(TASK_LIST_ID, {
|
||||
subject: `Rapid ${i}`,
|
||||
description: '',
|
||||
status: 'pending',
|
||||
blocks: [],
|
||||
blockedBy: [],
|
||||
})
|
||||
ids.push(id)
|
||||
}
|
||||
const uniqueIds = new Set(ids)
|
||||
expect(uniqueIds.size).toBe(10)
|
||||
// Verify IDs are monotonically increasing
|
||||
for (let i = 1; i < ids.length; i++) {
|
||||
expect(Number(ids[i])).toBeGreaterThan(Number(ids[i - 1]))
|
||||
}
|
||||
})
|
||||
})
|
||||
@@ -1397,6 +1397,220 @@ export function buildMessageLookups(
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Incrementally update lookups by processing only newly appended messages.
|
||||
* Returns the same lookups object (mutated in place) if update succeeds,
|
||||
* or null if a full rebuild is needed (e.g., messages were removed).
|
||||
*/
|
||||
export function updateMessageLookupsIncremental(
|
||||
existing: MessageLookups,
|
||||
previousNormalizedCount: number,
|
||||
previousMessageCount: number,
|
||||
normalizedMessages: NormalizedMessage[],
|
||||
messages: Message[],
|
||||
): MessageLookups | null {
|
||||
// Safety check: only handle append-only case
|
||||
if (
|
||||
normalizedMessages.length < previousNormalizedCount ||
|
||||
messages.length < previousMessageCount
|
||||
) {
|
||||
return null
|
||||
}
|
||||
|
||||
// No new messages — nothing to do
|
||||
if (
|
||||
normalizedMessages.length === previousNormalizedCount &&
|
||||
messages.length === previousMessageCount
|
||||
) {
|
||||
return existing
|
||||
}
|
||||
|
||||
// Process newly appended messages entries (pass 1: assistant tool_use blocks)
|
||||
const newMessageStart = previousMessageCount
|
||||
for (let i = newMessageStart; i < messages.length; i++) {
|
||||
const msg = messages[i]!
|
||||
if (msg.type === 'assistant') {
|
||||
const aMsg = msg as AssistantMessage
|
||||
const id = aMsg.message.id!
|
||||
if (Array.isArray(aMsg.message.content)) {
|
||||
const newToolUseIDs: string[] = []
|
||||
for (const content of aMsg.message.content) {
|
||||
if (typeof content !== 'string' && content.type === 'tool_use') {
|
||||
const toolUseContent = content as ToolUseBlock
|
||||
newToolUseIDs.push(toolUseContent.id)
|
||||
existing.toolUseByToolUseID.set(
|
||||
toolUseContent.id,
|
||||
content as ToolUseBlockParam,
|
||||
)
|
||||
}
|
||||
}
|
||||
// Update sibling lookup: all tool_use IDs in this message share siblings
|
||||
const allSiblings = new Set(newToolUseIDs)
|
||||
for (const toolUseID of newToolUseIDs) {
|
||||
existing.siblingToolUseIDs.set(toolUseID, allSiblings)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Process new normalizedMessages entries (pass 2: progress, hooks, tool results)
|
||||
const newNormalizedStart = previousNormalizedCount
|
||||
for (let i = newNormalizedStart; i < normalizedMessages.length; i++) {
|
||||
const msg = normalizedMessages[i]!
|
||||
|
||||
if (msg.type === 'progress') {
|
||||
const toolUseID = msg.parentToolUseID as string
|
||||
const existing2 = existing.progressMessagesByToolUseID.get(toolUseID)
|
||||
if (existing2) {
|
||||
existing2.push(msg as ProgressMessage)
|
||||
} else {
|
||||
existing.progressMessagesByToolUseID.set(toolUseID, [
|
||||
msg as ProgressMessage,
|
||||
])
|
||||
}
|
||||
|
||||
const progressData = msg.data as { type: string; hookEvent: HookEvent }
|
||||
if (progressData.type === 'hook_progress') {
|
||||
const hookEvent = progressData.hookEvent
|
||||
let byHookEvent = existing.inProgressHookCounts.get(toolUseID)
|
||||
if (!byHookEvent) {
|
||||
byHookEvent = new Map()
|
||||
existing.inProgressHookCounts.set(toolUseID, byHookEvent)
|
||||
}
|
||||
byHookEvent.set(hookEvent, (byHookEvent.get(hookEvent) ?? 0) + 1)
|
||||
}
|
||||
}
|
||||
|
||||
if (msg.type === 'user' && Array.isArray(msg.message?.content)) {
|
||||
for (const content of msg.message?.content ?? []) {
|
||||
if (typeof content !== 'string' && content.type === 'tool_result') {
|
||||
const tr = content as ToolResultBlockParam
|
||||
existing.toolResultByToolUseID.set(tr.tool_use_id, msg)
|
||||
existing.resolvedToolUseIDs.add(tr.tool_use_id)
|
||||
if (tr.is_error) {
|
||||
existing.erroredToolUseIDs.add(tr.tool_use_id)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (msg.type === 'assistant' && Array.isArray(msg.message?.content)) {
|
||||
for (const content of msg.message?.content ?? []) {
|
||||
if (typeof content === 'string') continue
|
||||
if (
|
||||
'tool_use_id' in content &&
|
||||
typeof (content as { tool_use_id: string }).tool_use_id === 'string'
|
||||
) {
|
||||
existing.resolvedToolUseIDs.add(
|
||||
(content as { tool_use_id: string }).tool_use_id,
|
||||
)
|
||||
}
|
||||
if ((content.type as string) === 'advisor_tool_result') {
|
||||
const result = content as {
|
||||
tool_use_id: string
|
||||
content: { type: string }
|
||||
}
|
||||
if (result.content.type === 'advisor_tool_result_error') {
|
||||
existing.erroredToolUseIDs.add(result.tool_use_id)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (isHookAttachmentMessage(msg)) {
|
||||
const toolUseID = msg.attachment.toolUseID
|
||||
const hookEvent = msg.attachment.hookEvent
|
||||
const hookName = (msg.attachment as HookAttachmentWithName).hookName
|
||||
if (hookName !== undefined) {
|
||||
let byHookEvent = existing.resolvedHookCounts.get(toolUseID)
|
||||
if (!byHookEvent) {
|
||||
byHookEvent = new Map()
|
||||
existing.resolvedHookCounts.set(toolUseID, byHookEvent)
|
||||
}
|
||||
byHookEvent.set(hookEvent, (byHookEvent.get(hookEvent) ?? 0) + 1)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
existing.normalizedMessageCount = normalizedMessages.length
|
||||
|
||||
// Mark orphaned server_tool_use / mcp_tool_use blocks as errored.
|
||||
// Only scan the new normalizedMessages since the previous count —
|
||||
// existing entries were already checked by a prior full build.
|
||||
const lastMsg = messages.at(-1)
|
||||
const lastAssistantMsgId =
|
||||
lastMsg?.type === 'assistant' ? lastMsg.message?.id : undefined
|
||||
for (let i = newNormalizedStart; i < normalizedMessages.length; i++) {
|
||||
const msg = normalizedMessages[i]!
|
||||
if (msg.type !== 'assistant') continue
|
||||
const aMsg = msg as AssistantMessage
|
||||
if (aMsg.message.id === lastAssistantMsgId) continue
|
||||
if (!Array.isArray(aMsg.message.content)) continue
|
||||
for (const content of aMsg.message.content) {
|
||||
if (
|
||||
typeof content !== 'string' &&
|
||||
((content.type as string) === 'server_tool_use' ||
|
||||
(content.type as string) === 'mcp_tool_use') &&
|
||||
!existing.resolvedToolUseIDs.has((content as { id: string }).id)
|
||||
) {
|
||||
const id = (content as { id: string }).id
|
||||
existing.resolvedToolUseIDs.add(id)
|
||||
existing.erroredToolUseIDs.add(id)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return existing
|
||||
}
|
||||
|
||||
/**
|
||||
* Compute a lightweight structural fingerprint for buildMessageLookups caching.
|
||||
* Only captures information that affects lookup results (types, IDs, counts),
|
||||
* not content. Returns an empty string when the arrays are structurally empty.
|
||||
*
|
||||
* O(n) but allocates only a string — much cheaper than the 8 Maps/Sets that
|
||||
* buildMessageLookups creates on every call.
|
||||
*/
|
||||
export function computeMessageStructureKey(
|
||||
normalizedMessages: NormalizedMessage[],
|
||||
messages: Message[],
|
||||
): string {
|
||||
const parts: string[] = [
|
||||
String(normalizedMessages.length),
|
||||
'|',
|
||||
String(messages.length),
|
||||
]
|
||||
for (const msg of messages) {
|
||||
parts.push(msg.type[0])
|
||||
if (msg.type === 'assistant') {
|
||||
const aMsg = msg as AssistantMessage
|
||||
const content = aMsg.message?.content
|
||||
if (Array.isArray(content)) {
|
||||
for (const block of content) {
|
||||
if (typeof block !== 'string' && block.type === 'tool_use') {
|
||||
parts.push('t', (block as ToolUseBlock).id)
|
||||
}
|
||||
}
|
||||
}
|
||||
} else if (msg.type === 'user') {
|
||||
const content = (msg as UserMessage).message?.content
|
||||
if (Array.isArray(content)) {
|
||||
for (const block of content) {
|
||||
if (typeof block !== 'string' && block.type === 'tool_result') {
|
||||
parts.push('r', (block as ToolResultBlockParam).tool_use_id)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
for (const msg of normalizedMessages) {
|
||||
if (msg.type === 'progress') {
|
||||
parts.push('p', (msg as ProgressMessage).parentToolUseID as string)
|
||||
}
|
||||
}
|
||||
return parts.join(',')
|
||||
}
|
||||
|
||||
/** Empty lookups for static rendering contexts that don't need real lookups. */
|
||||
export const EMPTY_LOOKUPS: MessageLookups = {
|
||||
siblingToolUseIDs: new Map(),
|
||||
|
||||
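The two helpers added in this hunk appear intended to be used together: a structural key to detect "nothing changed", and an append-only update that returns null when a rebuild is needed. The following is a minimal sketch of how a caller might combine them, under stated assumptions. `Msg`, `Lookups`, `Cache`, `structureKey`, `updateIncremental`, `buildFull`, and `refresh` are hypothetical, simplified stand-ins and not the names or shapes used in the repository.

```typescript
// Hypothetical, simplified stand-ins for the real Message / MessageLookups types.
type Msg =
  | { type: 'assistant'; toolUseIds: string[] }
  | { type: 'user'; toolResultIds: string[] }

type Lookups = {
  seenToolUseIDs: Set<string>
  resolvedToolUseIDs: Set<string>
  messageCount: number
}

type Cache = { key: string; lookups: Lookups }

// Structural fingerprint: counts, type initials and IDs only, never content.
function structureKey(messages: Msg[]): string {
  const parts: string[] = [String(messages.length)]
  for (const msg of messages) {
    if (msg.type === 'assistant') parts.push('a', ...msg.toolUseIds)
    else parts.push('u', ...msg.toolResultIds)
  }
  return parts.join(',')
}

// Append-only update. Returns null when messages were removed.
function updateIncremental(existing: Lookups, messages: Msg[]): Lookups | null {
  if (messages.length < existing.messageCount) return null
  for (const msg of messages.slice(existing.messageCount)) {
    if (msg.type === 'assistant') {
      for (const id of msg.toolUseIds) existing.seenToolUseIDs.add(id)
    } else {
      for (const id of msg.toolResultIds) existing.resolvedToolUseIDs.add(id)
    }
  }
  existing.messageCount = messages.length
  return existing
}

// Full rebuild: start from empty lookups and process every message.
function buildFull(messages: Msg[]): Lookups {
  const empty: Lookups = {
    seenToolUseIDs: new Set(),
    resolvedToolUseIDs: new Set(),
    messageCount: 0,
  }
  return updateIncremental(empty, messages) ?? empty
}

// Cheapest path first: same key -> reuse, appended -> update, otherwise rebuild.
function refresh(cache: Cache | undefined, messages: Msg[]): Cache {
  const key = structureKey(messages)
  if (cache && cache.key === key) return cache
  const updated = cache ? updateIncremental(cache.lookups, messages) : null
  return { key, lookups: updated ?? buildFull(messages) }
}
```

As in the hunk above, the append-only check in this sketch looks only at lengths, so a same-length replacement of earlier messages would not trigger a rebuild. A caller that can see such edits would need an additional guard.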
@@ -24,7 +24,7 @@ import { getModelStrings, resolveOverriddenModel } from './modelStrings.js'
|
||||
import { formatModelPricing, getOpus46CostTier } from '../modelCost.js'
|
||||
import { getSettings_DEPRECATED } from '../settings/settings.js'
|
||||
import type { PermissionMode } from '../permissions/PermissionMode.js'
|
||||
-import { getAPIProvider } from './providers.js'
|
||||
+import { getAPIProvider, isFirstPartyAnthropicBaseUrl } from './providers.js'
|
||||
import { LIGHTNING_BOLT } from '../../constants/figures.js'
|
||||
import { isModelAllowed } from './modelAllowlist.js'
|
||||
import { type ModelAlias, isModelAlias } from './aliases.js'
|
||||
@@ -360,7 +360,8 @@ export function isOpus1mMergeEnabled(): boolean {
|
||||
if (
|
||||
is1mContextDisabled() ||
|
||||
isProSubscriber() ||
|
||||
-getAPIProvider() !== 'firstParty'
|
||||
+getAPIProvider() !== 'firstParty' ||
|
||||
+!isFirstPartyAnthropicBaseUrl()
|
||||
) {
|
||||
return false
|
||||
}
|
||||
|
||||
@@ -101,6 +101,20 @@ export async function readFileInRange(
|
||||
throw new FileTooLargeError(stats.size, maxBytes)
|
||||
}
|
||||
|
||||
// For targeted reads of moderately large files, prefer streaming to
|
||||
// avoid loading the full file into memory when only a slice is needed.
|
||||
const isTargetedRead = offset > 0 || maxLines !== undefined
|
||||
if (isTargetedRead && stats.size > FAST_PATH_MAX_SIZE / 4) {
|
||||
return readFileInRangeStreaming(
|
||||
filePath,
|
||||
offset,
|
||||
maxLines,
|
||||
maxBytes,
|
||||
truncateOnByteLimit,
|
||||
signal,
|
||||
)
|
||||
}
|
||||
|
||||
const text = await readFile(filePath, { encoding: 'utf8', signal })
|
||||
return readFileInRangeFast(
|
||||
text,
|
||||
|
||||
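The branch added in this hunk routes targeted reads of moderately large files to the streaming reader. A minimal sketch of that decision in isolation, assuming a placeholder value for `FAST_PATH_MAX_SIZE` (its real value is not shown in this hunk) and a hypothetical `chooseReadStrategy` helper that does not exist in the repository:

```typescript
// Hypothetical value; the real FAST_PATH_MAX_SIZE is not shown in this hunk.
const FAST_PATH_MAX_SIZE = 10 * 1024 * 1024

type ReadStrategy = 'fast' | 'streaming'

// Mirrors only the new branch: a read is "targeted" when it skips lines
// (offset > 0) or caps the number of lines (maxLines is set).
function chooseReadStrategy(
  sizeBytes: number,
  offset: number,
  maxLines?: number,
): ReadStrategy {
  const isTargetedRead = offset > 0 || maxLines !== undefined
  if (isTargetedRead && sizeBytes > FAST_PATH_MAX_SIZE / 4) {
    return 'streaming'
  }
  return 'fast'
}
```

With the assumed 10 MiB limit, the cutoff for targeted reads is 2.5 MiB. Whole-file reads below the limit still take the in-memory path.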
@@ -206,10 +206,49 @@ async function getOtlpReaders() {
|
||||
|
||||
return exporters.map(exporter => {
|
||||
if ('export' in exporter) {
|
||||
-return new PeriodicExportingMetricReader({
|
||||
+const reader = new PeriodicExportingMetricReader({
|
||||
exporter,
|
||||
exportIntervalMillis: exportInterval,
|
||||
})
|
||||
// Wrap the export callback to auto-shutdown the reader on auth
|
||||
// failures (401/403). Without this the PeriodicExportingMetricReader's
|
||||
// internal setInterval keeps retrying forever, leaking handles.
|
||||
const originalExport = (
|
||||
exporter as unknown as {
|
||||
export: (
|
||||
metrics: unknown,
|
||||
callback: (result: { error?: Error }) => void,
|
||||
) => unknown
|
||||
}
|
||||
).export.bind(exporter)
|
||||
;(
|
||||
exporter as unknown as {
|
||||
export: (
|
||||
metrics: unknown,
|
||||
callback: (result: { error?: Error }) => void,
|
||||
) => unknown
|
||||
}
|
||||
).export = (metrics, callback) => {
|
||||
return originalExport(metrics, result => {
|
||||
if (result.error) {
|
||||
const msg = result.error.message || ''
|
||||
if (
|
||||
msg.includes('401') ||
|
||||
msg.includes('403') ||
|
||||
msg.includes('Unauthorized') ||
|
||||
msg.includes('authentication')
|
||||
) {
|
||||
logForDebugging(
|
||||
`[3P telemetry] Auth error detected, shutting down metric reader`,
|
||||
{ level: 'error' },
|
||||
)
|
||||
void reader.shutdown()
|
||||
}
|
||||
}
|
||||
callback(result)
|
||||
})
|
||||
}
|
||||
return reader
|
||||
}
|
||||
return exporter
|
||||
})
|
||||
|
||||
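The wrapping technique in this hunk can be shown without the OpenTelemetry types. The sketch below uses hypothetical `Exporter` and `Reader` interfaces and a hypothetical `shutdownOnAuthFailure` helper. It assumes, as the hunk does, that auth failures are recognizable from the error message text.

```typescript
type ExportResult = { error?: Error }
type ExportCallback = (result: ExportResult) => void

// Hypothetical minimal shapes, not the OpenTelemetry SDK interfaces.
interface Exporter {
  export(metrics: unknown, callback: ExportCallback): void
}
interface Reader {
  shutdown(): Promise<void>
}

// Same substrings the hunk checks for.
const AUTH_MARKERS = ['401', '403', 'Unauthorized', 'authentication']

function isAuthError(error: Error): boolean {
  const msg = error.message || ''
  return AUTH_MARKERS.some(marker => msg.includes(marker))
}

// Replace exporter.export with a wrapper that inspects each result and
// shuts the reader down on an auth failure, then forwards the result.
function shutdownOnAuthFailure(exporter: Exporter, reader: Reader): void {
  const originalExport = exporter.export.bind(exporter)
  exporter.export = (metrics, callback) =>
    originalExport(metrics, result => {
      if (result.error && isAuthError(result.error)) {
        void reader.shutdown()
      }
      callback(result)
    })
}
```

Substring matching is a heuristic: a message that merely contains "403" for another reason would also stop the reader. Matching on a status code field, where the exporter provides one, would likely be more precise.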
@@ -56,9 +56,9 @@ export function getPersistenceThreshold(
|
||||
toolName: string,
|
||||
declaredMaxResultSizeChars: number,
|
||||
): number {
|
||||
-// Infinity = hard opt-out. Read self-bounds via maxTokens; persisting its
|
||||
-// output to a file the model reads back with Read is circular. Checked
|
||||
-// before the GB override so tengu_satin_quoll can't force it back on.
|
||||
+// Infinity = hard opt-out (reserved for tools that self-bound via other
|
||||
+// mechanisms). Checked before the GB override so tengu_satin_quoll can't
|
||||
+// force it back on.
|
||||
if (!Number.isFinite(declaredMaxResultSizeChars)) {
|
||||
return declaredMaxResultSizeChars
|
||||
}
|
||||
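The ordering described in the comment, where the Infinity opt-out is checked before any override, can be sketched as follows. `thresholdFor`, `DEFAULT_MAX_CHARS`, and the `override` parameter are hypothetical. The rest of the real `getPersistenceThreshold` body is not shown in this hunk, so everything after the first check is an assumption.

```typescript
// Hypothetical default cap; not taken from the repository.
const DEFAULT_MAX_CHARS = 50_000

function thresholdFor(declaredMax: number, override?: number): number {
  // Hard opt-out first: a non-finite declared size is returned as-is,
  // so no override can re-enable persistence for that tool.
  if (!Number.isFinite(declaredMax)) {
    return declaredMax
  }
  // Assumed: a remote override, when present, wins over the declared size.
  if (override !== undefined) {
    return override
  }
  // Assumed: otherwise the declared size is clamped to a default cap.
  return Math.min(declaredMax, DEFAULT_MAX_CHARS)
}
```

Checking `Number.isFinite` before consulting the override is what makes the opt-out "hard". Reversing the two checks would let the override apply to opted-out tools as well.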
@@ -813,11 +813,12 @@ export async function enforceToolResultBudget(
|
||||
continue
|
||||
}
|
||||
|
||||
-// Tools with maxResultSizeChars: Infinity (Read) — never persist.
|
||||
-// Mark as seen (frozen) so the decision sticks across turns. They don't
|
||||
-// count toward freshSize; if that lets the group slip under budget and
|
||||
-// the wire message is still large, that's the contract — Read's own
|
||||
-// maxTokens is the bound, not this wrapper.
|
||||
+// Tools with maxResultSizeChars: Infinity — never persist (reserved for
|
||||
+// tools that self-bound via other mechanisms). Mark as seen (frozen) so
|
||||
+// the decision sticks across turns. They don't count toward freshSize; if
|
||||
+// that lets the group slip under budget and the wire message is still
|
||||
+// large, that's the contract — the tool's own maxTokens is the bound, not
|
||||
+// this wrapper.
|
||||
const skipped = fresh.filter(c => shouldSkip(c.toolUseId))
|
||||
skipped.forEach(c => state.seenIds.add(c.toolUseId))
|
||||
const eligible = fresh.filter(c => !shouldSkip(c.toolUseId))
|
||||
|
||||
@@ -70,6 +70,11 @@ export default defineConfig({
|
||||
ssr: {
|
||||
target: 'node',
|
||||
noExternal: true,
|
||||
// Packages with runtime require.resolve() or WASM binaries can't be
|
||||
// inlined into the bundle — they must be resolved from node_modules
|
||||
// at runtime. doubaoime-asr uses opus-encdec which does
|
||||
// require.resolve('opus-encdec/dist/libopus-encoder.wasm.js').
|
||||
external: ['doubaoime-asr', 'opus-encdec'],
|
||||
},
|
||||
|
||||
build: {
|
||||
@@ -78,7 +83,7 @@ export default defineConfig({
|
||||
target: 'es2020',
|
||||
copyPublicDir: false,
|
||||
sourcemap: false,
|
||||
-minify: false,
|
||||
+minify: true,
|
||||
|
||||
// SSR build mode — uses Rollup with Node.js target
|
||||
ssr: true,
|
||||
@@ -88,9 +93,9 @@ export default defineConfig({
|
||||
|
||||
output: {
|
||||
format: 'es',
|
||||
dir: 'dist',
|
||||
// Single-file build: no code splitting, all dynamic imports inlined
|
||||
codeSplitting: false,
|
||||
entryFileNames: 'cli.js',
|
||||
chunkFileNames: 'chunks/[name]-[hash].js',
|
||||
},
|
||||
|
||||
plugins: [
|
||||
|
||||