Compare commits

...

5 Commits

Author SHA1 Message Date
claude-code-best
c3af45023d chore: v2.0.2 2026-05-02 20:37:46 +08:00
claude-code-best
2847cab787 docs: condense the memory analysis report (720 → 120 lines, all actionable information retained)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-02 20:37:14 +08:00
claude-code-best
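198c09b263 fix: memory optimization: predictive compact threshold, orphaned-block fix for incremental lookups, deferred slice reference optimization
- P0: REPL.tsx wraps the deferred messages slice in useMemo, so each render no longer creates a new array reference and triggers unnecessary background re-renders
- P1: the predictive compact threshold now uses effectiveContextWindow - growth, removing the double reservation with the autocompact buffer; TOOL_RESULT_GROWTH_ESTIMATE lowered from 20K to 15K
- P2: incremental lookups gain a lastAssistantMsgId consistency check and a scan for orphaned server_tool_use/mcp_tool_use blocks, preventing a permanent loading state in the UI
- P3: the reactiveCompact type assertion is replaced by the 'compact' literal used directly
- docs: CLAUDE.md consistently uses precheck in place of the separate typecheck/lint/test commands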

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-02 20:32:00 +08:00
claude-code-best
4cbf406c70 Merge pull request #403 from ymonster/fix/deepseek-empty-reasoning-content
fix: preserve empty reasoning_content for DeepSeek v4 thinking mode (#399)
2026-05-02 16:02:06 +08:00
ymonster
1b10ea391a fix: preserve empty reasoning_content for DeepSeek v4 thinking mode (#399)
DeepSeek v4 in thinking mode sometimes returns reasoning_content: ""
when the model answers directly without internal reasoning. Two places
were filtering the empty string out, which dropped the thinking block
from the assistant turn entirely. The next request then omitted
reasoning_content for that prior turn, and DeepSeek rejected with
400 "reasoning_content ... must be passed back to the API".

Fix:
- openaiStreamAdapter: open a thinking block whenever reasoning_content
  is present (including ""); skip the empty thinking_delta event since
  the empty value is already conveyed by the block's initial state.
- openaiConvertMessages: preserve empty thinking blocks as
  reasoning_content: "" when serializing assistant messages back to
  the OpenAI/DeepSeek format.

Tests:
- New: empty reasoning_content opens a thinking block (adapter).
- Updated: empty thinking blocks now round-trip as reasoning_content: ""
  instead of being dropped.
- New: assistant messages with no thinking block still omit
  reasoning_content (regression guard for non-thinking models).
2026-05-02 14:58:29 +08:00
16 changed files with 594 additions and 314 deletions

@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) and other AI coding
## Project Overview
This is a **reverse-engineered / decompiled** version of Anthropic's official Claude Code CLI tool. The goal is to restore core functionality while trimming secondary capabilities. Many modules are stubbed or feature-flagged off. TypeScript strict mode is enforced — **`bunx tsc --noEmit` must pass with zero errors**.
This is a **reverse-engineered / decompiled** version of Anthropic's official Claude Code CLI tool. The goal is to restore core functionality while trimming secondary capabilities. Many modules are stubbed or feature-flagged off. TypeScript strict mode is enforced: **`bun run precheck` must pass with zero errors** (covers typecheck + lint fix + test).
## Git Commit Message Convention
@@ -47,7 +47,7 @@ bun test # run all tests
bun test src/utils/__tests__/hash.test.ts # run single file
bun test --coverage # with coverage report
# Lint & Format (Biome)
# Lint & Format (Biome): for day-to-day development, use precheck instead of calling these individually
bun run lint # lint check (whole project)
bun run lint:fix # auto-fix lint issues
bun run format # format all (whole project)
@@ -60,7 +60,7 @@ bun run health
# Check unused exports
bun run check:unused
# Full check (typecheck + lint fix + test) — run after completing any task
# Full check (typecheck + lint fix + test): must be run after completing any task
bun run precheck
# Remote Control Server
@@ -311,7 +311,7 @@ mock.module("src/utils/debug.ts", debugMock);
The project uses TypeScript strict mode; **tsc must pass with zero errors**. After every change, run:
```bash
bun run typecheck
bun run precheck
```
**Type conventions**
@@ -324,14 +324,14 @@ bun run typecheck
## Working with This Codebase
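- **tsc must pass**: `bun run typecheck` must pass with zero errors; no change may introduce new type errors.
- **precheck must pass**: `bun run precheck` (typecheck + lint fix + test) must pass with zero errors; no change may introduce new type, lint, or test errors.
- **Feature flags**: all off by default (`feature()` returns `false`). Dev and build each have their own default-enabled list. Do not redefine the `feature` function in `cli.tsx`.
- **React Compiler output**: Components have decompiled memoization boilerplate (`const $ = _c(N)`). This is normal.
- **`bun:bundle` import**: `import { feature } from 'bun:bundle'` is a Bun built-in module, resolved by the runtime/bundler. Do not replace it with a custom function. **`feature()` may only be used directly in the condition position of an `if` statement or a ternary expression** (a Bun compiler restriction); it cannot be assigned to a variable, placed in an arrow function body, or used as part of an `&&` chain. Correct: `if (feature('X')) {}`, `feature('X') ? a : b`.
- **`src/` path alias**: tsconfig maps `src/*` to `./src/*`. Imports like `import { ... } from 'src/utils/...'` are valid.
- **MACRO defines**: managed centrally in `scripts/defines.ts`. Dev mode injects them via `bun -d`; builds inject them via `Bun.build({ define })`. To change constants such as the version number, edit only this file.
- **Build output is Node.js compatible**: `build.ts` automatically post-processes `import.meta.require`, so the output can be run directly with `node dist/cli.js`.
- **Biome configuration**: 42 lint rules are disabled because of the decompiled code; only the `recommended` baseline is kept. Formatting covers the whole project (`src/`, `scripts/`, `packages/`, including `packages/@ant/`). `.tsx` files use a 120-column line width with mandatory semicolons; other files use 80 columns with as-needed semicolons. JSON formatting is enabled. `.editorconfig` is aligned with the Biome configuration (2-space indentation). After modifying any code, run `bun run check` to confirm there are no lint/formatting issues; the pre-commit hook automatically blocks non-conforming commits.
- **Biome configuration**: 42 lint rules are disabled because of the decompiled code; only the `recommended` baseline is kept. Formatting covers the whole project (`src/`, `scripts/`, `packages/`, including `packages/@ant/`). `.tsx` files use a 120-column line width with mandatory semicolons; other files use 80 columns with as-needed semicolons. JSON formatting is enabled. `.editorconfig` is aligned with the Biome configuration (2-space indentation). After modifying any code, run `bun run precheck` to confirm there are no type/lint/formatting/test issues; the pre-commit hook automatically blocks non-conforming commits.
- **Resolving tsc vs. Biome conflicts**: when tsc requires a property to be declared (it is assigned) but Biome reports `noUnusedPrivateClassMembers` (written but never read), suppress the lint warning with `// biome-ignore lint/correctness/noUnusedPrivateClassMembers: <reason>` and keep the type declaration. `biome ci` must report zero warnings.
- **Maintaining `@ts-expect-error`**: keep `@ts-expect-error` only when the code below it really has a type error. If a type-system update leaves the directive unused (TS2578), remove the comment. Always-false comparisons produced by MACRO substitution (such as `'production' === 'development'`) still need `@ts-expect-error`.
- **The Ink framework lives in `packages/@ant/ink/`**, not `src/ink/` (that directory does not exist). Ink components, hooks, and keybindings are all in packages.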

@@ -1,294 +1,103 @@
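# Memory and Performance Peak Analysis Report (final version, 5 rounds of iteration completed)
# Memory and Performance Peak Analysis Report
> Process: bun; physical memory peak **700 MB+**, worst case up to **1.8 GB**
> Date: 2026-05-02 | Status: **investigation complete** | Scope: memory peaks + CPU hotspots + React render loop
> Round 5 incremental verification: message rendering pipeline (buildMessageLookups rebuilding 8 Maps/Sets, useDeferredValue double buffering, FileReadTool with no upper bound, interaction between compaction and React state)
> Process: bun; RSS baseline **682 MB**, worst case **1.8 GB** | 2026-05-02 | **investigation complete** (12 rounds of iteration)
> Fix commits: `ef10ad28` + `ab0bbbc4` (saving 100-300 MB) | Architectural limit: Bun mimalloc/JSC does not return memory pages (~150-250 MB permanently held)
## Data collection
## Fixed (10 items)
- Typical-scenario RSS 682 MB; baseline JSC heap 300-400 MB
- Bun mimalloc does not return memory pages; JSC page management only grows, never shrinks (architecture-level limit)
- A per-second `Bun.gc()` timer already exists (`cli/print.ts:554-558`), in non-forced mode
- 10 items already fixed (commits `ef10ad28` + `ab0bbbc4`), saving roughly 100-300MB
- Round 3 confirmed: AWS SDK/Google Auth/Azure Identity are all dynamically imported (lazy) and do not contribute to the baseline
## Fixed issues (commits ef10ad28 + ab0bbbc4)
| Issue | Original peak | Fix | Location |
|------|--------|----------|------|
| Issue | Original peak | Fix | Location |
|------|--------|------|------|
| O(n²) streaming string concatenation | 2-20 MB | `+=` → array accumulation | `claude.ts:1834,2271` |
| Multiple passes in Messages.tsx | 100-270 MB | merged into a single pass | `Messages.tsx:417-418` |
| ColorFile uncached | 50-100 MB | LRU cache, 50 entries | `HighlightedCode.tsx:14-61` |
| Ink StylePool unbounded | 10-50+ MB | 1000-entry cap | `@ant/ink/screen.ts:122` |
| ColorFile uncached | 50-100 MB | LRU-50 | `HighlightedCode.tsx:14-61` |
| Ink StylePool unbounded | 10-50+ MB | cap of 1000 | `@ant/ink/screen.ts:122` |
| CompanionSprite high frequency | CPU | TICK_MS→1000ms | `CompanionSprite.tsx:15` |
| MCP stderr buffer | 1-640 MB | 64→8MB/server | `mcp-client/connection.ts:117` |
| BashTool output buffer | 30-330 MB | 32→2MB | `stringUtils.ts:88` |
| Transcript write queue | 5-50 MB | 1000-entry cap | `sessionStorage.ts:613-619` |
| Transcript write queue | 5-50 MB | cap of 1000 | `sessionStorage.ts:613-619` |
| contentReplacementState | continuous growth | cleared on compact | `compact/compact.ts` |
| SSE buffer | unbounded | 1MB cap | SSE handling code |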
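The `+=` → array accumulation fix listed in the table above can be illustrated with a minimal sketch. `DeltaAccumulator` and its method names are hypothetical and are not taken from `claude.ts`; the sketch only shows the general pattern of buffering streamed deltas per content block and joining them once when the block ends.

```typescript
// Hypothetical helper: buffers streamed text deltas per content block and
// joins them once, instead of growing a string with `+=` on every delta.
class DeltaAccumulator {
  private readonly parts = new Map<number, string[]>()

  // Record one streamed delta for the given content block index.
  append(blockIndex: number, delta: string): void {
    const bucket = this.parts.get(blockIndex)
    if (bucket) {
      bucket.push(delta)
    } else {
      this.parts.set(blockIndex, [delta])
    }
  }

  // Called when the block stops: join once and drop the buffer so the
  // individual delta strings can be garbage collected.
  finish(blockIndex: number): string {
    const bucket = this.parts.get(blockIndex) ?? []
    this.parts.delete(blockIndex)
    return bucket.join('')
  }
}
```

Deleting the bucket in `finish` is a design choice in this sketch: it keeps the accumulator bounded by the number of blocks currently open rather than by the length of the whole response.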
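## Remaining issues: memory (ordered by peak impact)
## P0: core bottlenecks (6 items)
### P0: message array copied 7-8x (120-320 MB)
| # | Issue | Peak | Location | Recommendation |
|---|------|------|------|------|
| 1 | Message array spread-copied 7-8x (3-4 copies resident at once at the end of a turn) | 120-320 MB | `query.ts`, 7 sites (:477, :491, :897, :1135, :1745, :1857, :1878) | remove the spread / pass by reference / use push |
| 2 | AutoCompact timing flaw (the check runs before the API call, growth happens after it) | API limit exceeded | `query.ts:575` | add a predictive threshold check |
| 3 | reactiveCompact is an empty stub (no emergency compaction on API 413) | no fallback | `reactiveCompact.ts` (entire file) | implement real logic |
| 4 | buildMessageLookups rebuilds 8 Maps/Sets (triggered by every streaming delta) | GC STW 100-173ms | `Messages.tsx:519` | incremental updates / split the useMemo chain |
| 5 | useDeferredValue double buffering | 100-200 MB | `REPL.tsx:1569` | inherent to React scheduling; limited room for optimization |
| 6 | Compact peak window (preCompactReadFileState + summary + attachments) | 20-80 MB | `compact.ts:524-644` | release preCompactReadFileState/summaryResponse early |
Copies produced per turn in `src/query.ts` (Round 3 added a 7th entry):
## P1: significant bottlenecks (14 items)
| Location | Operation | Necessary? | Optimization |
|------|------|----------|----------|
| `:477` | `[...getMessagesAfterCompactBoundary(messages)]` | doubly wasteful | remove the spread |
| `:491` | `applyToolResultBudget → map()` | as needed | return the original array when nothing exceeds the budget |
| `:897` | `clonedContent ??= [...contentArr]` | conditionally necessary | keep |
| `:1135` | `[...messagesForQuery, ...assistant]` | avoidable | pass by reference |
| `:1745` | `.concat(assistant, toolResults)` | avoidable | pass multiple arguments |
| `:1857` | `[...messagesForQuery, ...assistant, ...toolResults]` forkContextMessages | **new in Round 3**: the task summary is discarded after use | pass by reference |
| `:1878` | `[...messagesForQuery, ...assistant, ...toolResults]` | necessary | use push |
| # | Issue | Peak | Location | Recommendation |
|---|------|------|------|------|
| 7 | O(n²) concatenation in the OpenAI/Gemini/Grok compatibility layers | 25-75 MB | 9 sites in 3 files (`openai/index.ts:386`, `gemini/index.ts:148`, `grok/index.ts:163`) | switch to array accumulation (same pattern as claude.ts) |
| 8 | O(n²) concatenation in messages.ts | 10-25 MB | `messages.ts:3252,3268` | switch to array accumulation |
| 9 | highlight.js loads all 192 languages (only 26 needed) | 8-12 MB | `color-diff-napi/index.ts:21` | custom build |
| 10 | hlLineCache module-level singleton, 2048 entries | ~4 MB | `color-diff-napi/index.ts:508` | switch to LRU + size cap |
| 11 | colorFileCache stores the code 3x | 2-5 MB | `HighlightedCode.tsx:14` | drop the code field from the value |
| 12 | Virtual scroll keeps 200 components mounted | 50 MB | `useVirtualScroll.ts` | lower OVERSCAN_ROWS / MAX_MOUNTED_ITEMS |
| 13 | FileReadTool large files (output capped at 100K characters, but the file is fully loaded while reading) | a few MB, transient | `FileReadTool.ts:342` | check the size before reading; truncate while streaming |
| 14 | Session restore loads everything (disk → JSON → REPL, three stages) | 200-300 MB | `sessionStorage.ts:3482` | streaming JSONL / incremental restore |
| 15 | Session write accumulates 100MB | ~100 MB | `sessionStorage.ts:652` | streaming writes |
| 16 | Forked Agent clones the full FileStateCache | 50N MB | `forkedAgent.ts:382` | shared/layered cache (10MB per agent) |
| 17 | GC threshold 350MB < baseline (pointless forced GC every second) | wasted CPU | `cli/print.ts:554` | raise to 800MB+ |
| 18 | PDF processing, 100 pages | ~100 MB | `apiLimits.ts:54` | paged streaming |
| 19 | Single-image processing (base64 → decode → resize) | ~16 MB/image | `apiLimits.ts:22` | streaming resize |
| 20 | Token estimation error of ±25-50% amplifies the timing issue | inaccurate thresholds | `tokenEstimation.ts:215` | content-type-aware estimation |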
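Several rows above recommend replacing unbounded or fixed-size caches with an LRU plus a size cap. The following is a minimal sketch of that idea, relying on the insertion-order guarantee of `Map`; `LruCache` is a hypothetical name and is not the cache implementation used in `HighlightedCode.tsx` or `color-diff-napi`.

```typescript
// Hypothetical bounded LRU cache built on Map's insertion order:
// the first key in iteration order is always the least recently used.
class LruCache<K, V> {
  private readonly entries = new Map<K, V>()
  private readonly maxEntries: number

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined
    const value = this.entries.get(key) as V
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) this.entries.delete(key)
    this.entries.set(key, value)
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry (first in iteration order).
      const oldest = this.entries.keys().next().value as K
      this.entries.delete(oldest)
    }
  }

  get size(): number {
    return this.entries.size
  }
}
```

This sketch caps the entry count only; a byte-size cap, as suggested for `hlLineCache`, would additionally need a per-entry size estimate.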
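At peak, 3-4 full message arrays are resident at once (477 + 1745 + 1857 + 1878 execute sequentially at the end of the same turn).
### P0: repeated computation in the React message pipeline (Round 5 analysis)
**buildMessageLookups creates 8 Maps/Sets on every useMemo recomputation** (`messages.ts:1215-1398`):
| Data structure | Shape | Notes |
|----------|------|------|
| `toolUseIDsByMessageID` | Map\<string, Set\> | one Set per assistant message |
| `toolUseIDToMessageID` | Map\<string, string\> | all tool_use IDs |
| `toolUseByToolUseID` | Map\<string, ToolUseBlockParam\> | **keeps the full tool_use block** |
| `siblingToolUseIDs` | Map\<string, Set\> | sibling tool_use index |
| `progressMessagesByToolUseID` | Map\<string, ProgressMessage[]\> | progress message arrays |
| `toolResultByToolUseID` | Map\<string, NormalizedMessage\> | **keeps a reference to the full tool_result message** |
| `resolvedToolUseIDs` / `erroredToolUseIDs` | Set\<string\> | completed/errored IDs |
This useMemo (`Messages.tsx:519`) depends on normalizedMessages, so any message change (including streaming deltas) triggers a rebuild. renderRange has already been split out so that scrolling does not trigger it (a code comment records: 50ms alloc per scroll → GC → 100-173ms STW on 1GB heap).
**useDeferredValue double buffering** (`REPL.tsx:1569`): during streaming, `messages` and `deferredMessages` hold two full arrays at the same time until React schedules the update. In the 27k-message scenario this is an extra ~100-200MB of transient usage.
**FileReadTool has no size limit** (`FileReadTool.ts:342`): `maxResultSizeChars: Infinity`; a single 10MB file read is kept in full in the message array. BashTool (30KB) and GrepTool (20KB) have reasonable caps.
### P0: interaction between compaction and React state (Round 5 analysis)
**Non-fullscreen mode** (`REPL.tsx:3074-3075`): after compact, `setMessages(() => [newMessage])` correctly replaces the whole set of old messages, and memory is released immediately.
**Fullscreen mode** (`REPL.tsx:3056-3072`): keeps a scrollback of up to 500 messages. A code comment records that each message costs ~250KB RSS in the Ink fiber tree; without a cap, 13k+ messages → 1GB+ heap was observed.
**Limits of microcompact** (`microCompact.ts:472-494`): it uses spread to create new message objects whose content is replaced by `[Old tool result content cleared]`. However, the `ContentReplacementState.replacements` Map (`toolResultStorage.ts:392`) still holds the original replacement strings until compact clears them. Microcompact therefore reduces the token count, but actual memory release depends on a later compact.
### P0: compact peak (20-80 MB)
Peak timeline (`compact.ts:524-644`):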
```
Before: messages(200K) + mutableMessages(200K) = 400K tokens
During: + preCompactReadFileState(25MB) + summary + attachments ≈ 500K+ tokens
After: splice → 50K tokens
```
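Can be released early: `preCompactReadFileState` (25MB), `summaryResponse`, and the original `messages` argument.
### P0: React hook closures and the useMemo chain (Round 5 deep dive)
**useCallback closure rebuilds** (`REPL.tsx`):
| Callback | Dependency count | Location | Impact |
|------|----------|------|------|
| `getToolUseContext` | 20 | `:2789-2949` | on rebuild, references held by the old closure block GC |
| `onQueryImpl` | 14 | `:3188-3469` | contains getToolUseContext + several layers of nested closures |
| `onQuery` | wraps onQueryImpl again | `:3471-3697` | one more closure layer |
| `onSubmit` | ~10 | `:3822-4298` | closure chain nested 3 levels deep |
Every `messages` change triggers `setMessages` → React re-render → every useCallback/useMemo that depends on messages is rebuilt. However, `getToolUseContext` and `onQueryImpl` **do not put `messages` in their dependency arrays** (they sidestep this by passing `messagesRef.current` as a parameter), so these closures are not rebuilt when messages change. **This is in fact the correct design**: the ref avoids the closure-capture problem.
**The real hooks problem** is the useMemo chain (`Messages.tsx`):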
```
messages → normalizedMessages (O(n))
→ compactAwareMessages (O(n) filter)
→ messagesToShow (O(n) filter + reorder)
→ groupedMessages (O(n))
→ collapsed (O(n))
→ lookups (8 Map/Set, O(n))
```
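During streaming, every delta changes `messages` → the whole chain is recomputed in full. A code comment records: 50ms alloc per scroll → GC → 100-173ms STW on 1GB heap (`Messages.tsx:516-518`).
**Unbounded useRef** (`REPL.tsx`):
| Ref | How it grows | Cleanup | Impact |
|-----|----------|------|------|
| `bashTools` | `.add()` per bash command | cleared on `clearConversation` | Set\<string\>, usually <100 |
| `discoveredSkillNamesRef` | `.add()` per discovered skill | cleared on `clearConversation` | Set\<string\>, usually <50 |
| `apiMetricsRef` | `.push()` per request | `= []` at end of turn | transient, accumulates within a turn |
| `responseLengthRef` | running sum | reset to 0 on compact | a single number |
| `loadedNestedMemoryPathsRef` | `.add()` per CLAUDE.md | cleared on compact/clear | Set\<string\> |
Conclusion: **all of these refs have a cleanup mechanism** and are not the main problem. The core problem remains the full recomputation of the useMemo chain during streaming.
### P1: virtual scroll components (~50 MB), new in Round 3
`src/hooks/useVirtualScroll.ts` + the React Ink render pipeline:
- MAX_MOUNTED_ITEMS = 300, OVERSCAN_ROWS = 80
- About 200 MessageRows are actually mounted (viewport + overscan)
- Each MessageRow ≈ 250KB RSS (React fiber + Yoga node + child component tree)
- **About 50 MB resident in total** (largest mounted window in the current session)
Room for optimization: lower MAX_MOUNTED_ITEMS or OVERSCAN_ROWS; evaluate memoization inside the MessageRow component.
### P1: streaming contentBlocks accumulation, new in Round 3
`src/services/api/claude.ts:1932`:
- The `contentBlocks` array accumulates every content block during a streaming response
- Long thinking responses can reach tens of thousands of tokens; the thinking text is kept in full in contentBlock.thinking
- The `streamingDeltas` Map (already fixed to accumulate into arrays) is joined with `join('')` and assigned to the contentBlock on `content_block_stop`
- Thinking blocks still keep the full thinking text after normalization
### P1: other confirmed memory issues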
## P2: minor issues (10 items)
| # | Issue | Peak | Location |
|---|------|------|------|
| 1 | MCP Tool Schema stored twice | ~40 MB | `manager.ts:73` + `AppStateStore.ts:175` |
| 2 | lastAPIRequestMessages kept resident | 30-50 MB | `bootstrap/state.ts:118` |
| 3 | Session restore loads everything (small and medium files) | 50-200 MB | `sessionStorage.ts:3475-3582` |
| 4 | HybridTransport 100K queue | 1-10 MB | `HybridTransport.ts:86` |
| 5 | React messagesRef double reference | transient | `REPL.tsx:1437-1477` |
| 6 | AppState immutable-update churn | 5-50 MB | `store.ts:20-26` |
| 7 | Tool result seenIds/replacements | 0.5-2 MB | `toolResultStorage.ts:390-397` |
| 8 | Unbounded caches in bootstrap/state.ts | 0.1-1 MB | planSlugCache, etc. |
| 9 | Unbounded collections in QueryEngine | 0.1-1 MB | discoveredSkillNames, etc. |
| 10 | expandedKeys Set never cleaned (Round 5) | <0.5 MB | `Messages.tsx:644`: stale keys are not removed after compact |
| 11 | OpenAI/Gemini/Grok collectedMessages (Round 5) | transient | accumulates assistant messages during streaming for Langfuse telemetry; released when the stream ends |
| 21 | lastAPIRequestMessages kept resident | 30-50 MB | `bootstrap/state.ts:118` |
| 22 | MCP Tool Schema stored twice | ~40 MB | `manager.ts:73` + `AppStateStore.ts:175` |
| 23 | ContentReplacementState grows monotonically | 0.5-2 MB | `toolResultStorage.ts:390` |
| 24 | Perfetto 100K events | ~30 MB | `perfettoTracing.ts:106` |
| 25 | StreamingMarkdown double render | transient | `Markdown.tsx:185` |
| 26 | MarkdownTable makes 3 passes | CPU peak | `MarkdownTable.tsx:99` |
| 27 | Search index WeakMap | 5-10 MB | `transcriptSearch.ts:17` |
| 28 | ACP FileStateCache per session | 50 MB | `acp/agent.ts:554` |
| 29 | Agent initialMessages shallow copy | 1-5 MB/agent | `runAgent.ts:382` |
| 30 | Hook result accumulation | ~1 MB+ | `toolExecution.ts:1474` |
### P2: low priority (unverified)
| # | Issue | Peak | Location |
|---|------|------|------|
| 1 | Multiple OpenTelemetry versions | ~30 MB | dependency tree |
| 2 | Perfetto tracing 100K events | ~30 MB | `perfettoTracing.ts:99` |
| 3 | Prompt Cache normalization | 5-15 MB | `claude.ts:3180-3329` |
| 4 | GrepTool full stat+sort | ~10 MB | `GrepTool.ts:523-557` |
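## Remaining issues: CPU and rendering hotspots
### Confirmed
## CPU / rendering hotspots
| # | Issue | Impact | Location |
|---|------|------|------|
| C2 | **Ink runs a Yoga layout on every React commit** (React ConcurrentRoot batches setState automatically: 5 setState calls → 1 commit → 1 layout) | ~1-3ms/commit | `reconciler.ts:279`, `ink.tsx:323` |
| C3 | **MessageRow mount cost ~1.5ms** (Markdown parsing is only 1-7%; the main cost is ~1.3ms of React/Yoga/Ink pipeline overhead) | already rate-limited by SLIDE_STEP=25 + useDeferredValue | `useVirtualScroll.ts` + `Markdown.tsx` |
| C4 | **Layout shift triggers full-screen damage** | O(rows×cols) full diff | `ink.tsx:655-661` |
| C7 | **CompanionSprite TICK_MS timer** (500ms → fixed to 1000ms) | high-frequency setState triggers rendering | `buddy/CompanionSprite.tsx:15,136` |
| C9 | Synchronous fs operations | blocks the main thread | `projectOnboardingState.ts:20`, etc. |
| C2 | Ink runs a Yoga layout on every React commit | ~1-3ms/commit | `reconciler.ts:279`, `ink.tsx:323` |
| C3 | MessageRow mount ~1.5ms (React/Yoga/Ink pipeline overhead) | ~290ms stall on batch mount | `useVirtualScroll.ts` |
| C4 | Layout shift triggers full-screen damage | O(rows×cols) | `ink.tsx:655-661` |
| C9 | Synchronous fs operations block the main thread | intermittent stalls | `projectOnboardingState.ts:20` |
### Refuted
Existing mitigations: React ConcurrentRoot batching, 16ms frame-rate limit, virtual scroll overscan 80 + SLIDE_STEP=25 + useDeferredValue, Markdown tokenCache LRU-500 + hasMarkdownSyntax fast path, Yoga incremental caching.
- **C1 useInboxPoller state loop**: verification confirmed that the useEffect converges (message removed → count decreases → stable); poll reads through `store.getState()` and does not create a React dependency; the 1-second poll is a normal I/O pattern with no loop
- **Markdown as a CPU hotspot**: marked.lexer takes only 0.01-0.1ms for a typical message; there is already a tokenCache LRU-500 (0.0003ms on a cache hit, a 99.6% reduction) + a hasMarkdownSyntax fast path (skips 30-40% of messages)
- **Yoga has no incremental layout**: measurements show incremental updates are efficient (changing 1 leaf in a 1000-node tree → only 2 measure calls, the rest served from cache)
- **Ink Yoga 2^depth problem**: measured 11.7x visits for a 100-node deep chain (linear growth, not exponential)
## Refuted (summary of 12 rounds)
### Optimization measures
- React ConcurrentRoot batches setState automatically (multiple setState calls → 1 commit)
- Ink 16ms frame-rate limit (the throttle applies only to terminal output; Yoga layout is not throttled but is protected by React batching)
- Virtual scroll overscan 80 + MAX_MOUNTED_ITEMS 300 + SLIDE_STEP=25 + useDeferredValue
- Markdown tokenCache LRU-500 + hasMarkdownSyntax fast path + StreamingMarkdown incremental parsing
- Yoga incremental caching (dirty propagation + cached measure results)
- Double buffering + damage tracking + character pool reuse
- Pools reset on a 5-minute cycle
## Refuted: memory (summary of 5 rounds)
- VSZ of 516 GB is virtual mapping, not physical | Zod Schema ~650KB | Markdown LRU-500 already optimized
- useSkillsChange/useSettingsChange: correct cleanup | useInboxPoller: convergent design
- React Compiler `_c(N)`: unused | File watchers: only ~5KB | React reconciler: WeakMap + freeRecursive
- Ink screen buffer ~86KB | CharPool/HyperlinkPool ~1-5MB, reset every 5 min | StylePool cache capped at 1000
- Dependency tree: AWS/Google/Azure SDKs are all dynamically imported and do not contribute to the baseline | Sentry is a no-op implementation
- Ink has no scrollback buffer | Markdown tokenCache LRU-500 is bounded
- **Refuted in Round 5**: useCallback closures capturing messages; in practice this is avoided by passing messagesRef as a parameter, so there is no closure problem
- **Refuted in Round 5**: MCP stderrHandler leak; there is already a 64MB cap + release on success + listener removal on cleanup
- **Refuted in Round 5**: unbounded useRef growth; bashTools/discoveredSkillNamesRef/loadedNestedMemoryPathsRef are all cleared on clearConversation or compact
- **Refuted in Round 5**: unbounded apiMetricsRef; reset with `= []` at end of turn
- **Refuted in Round 5**: useEffect missing cleanup; all 12 useEffect hooks checked return a cleanup function
VSZ of 516 GB is virtual mapping | Zod ~650KB | Markdown LRU-500 already optimized | useSkillsChange/useSettingsChange clean up correctly | useInboxPoller is convergent by design (not a loop) | React Compiler `_c(N)` unused | File watchers ~5KB | React reconciler WeakMap + freeRecursive | Ink screen buffer ~86KB | CharPool/HyperlinkPool ~1-5MB, reset every 5 min | AWS/Google/Azure SDKs are all lazy-loaded | Sentry is a no-op implementation | useCallback closures avoided via messagesRef (no leak) | MCP stderrHandler has a 64MB cap + cleanup | useRef cleared on clearConversation/compact | apiMetricsRef reset at end of turn | useEffect hooks have cleanup functions | lodash-es is tree-shakable | AppState useSyncExternalStore updates only the relevant slices | SDK has no global retry queue | Ink unmount cleans up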
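## Conclusion
**Memory root causes** (confirmed over 5 rounds of iteration):
1. **3-4 message-array spreads resident at once at the end of a turn** (120-320 MB): the core bottleneck
2. **buildMessageLookups in the React message pipeline rebuilds 8 Maps/Sets** (50ms each time in the 27k-message scenario): a source of GC pressure
3. **useDeferredValue double buffering** (an extra ~100-200 MB, transient, during streaming)
4. **FileReadTool has no size cap** (a single 10MB file stays resident permanently)
5. **Compact peak window** (20-80 MB) + microcompact only truly frees memory after a later compact
6. **Virtual scroll keeps 200 components (~50MB) resident**
7. **Bun/JSC does not return memory pages** (architecture-level limit)
**Memory root causes, ranked**:
1. Message array spread-copied 7-8x (120-320 MB): the core bottleneck
2. useDeferredValue double buffering + full recomputation of the React useMemo chain (100-200 MB + GC STW)
3. Session restore/write peaks (200-300 MB)
4. AutoCompact timing flaw + reactiveCompact empty stub (risk of exceeding the API limit)
5. Forked Agent FileStateCache clone (50N MB)
6. Virtual scroll keeps 200 components (~50MB) resident
7. Bun/JSC does not return memory pages (architecture-level)
**CPU root cause**: useInboxPoller polls every second and triggers a React commit → full Yoga layout → full-screen Ink diff, i.e. the complete pipeline. Markdown rendering (~1.5ms/row) causes a ~290ms stall when new messages are mounted in batches. The periodic commits caused by polling and the CPU-intensive message mounting amplify each other.
**Round 4 final verification**: agent recursive spread and attachment accumulation are both variants of the known P0 (message array copies); no new root cause. Snipping runs before streaming, so there is no concurrency issue. Arrays such as consumedCommandUuids are reset every turn and do not accumulate.
**Round 5 incremental verification**:
- The rebuild cost of the 8 Maps/Sets in buildMessageLookups has been mitigated by splitting out renderRange, but it remains the main source of GC pressure when messages change
- useDeferredValue double buffering is inherent to React scheduling; room for optimization is limited
- The missing cap in FileReadTool is the only entry point where "a single operation can inject 10MB+ of data"
- Microcompact reduces tokens but does not free memory immediately (the content is held indirectly by the ContentReplacementState.replacements Map)
**CPU root cause**: useInboxPoller polls every second → React commit → Yoga layout → full-screen Ink diff, the complete pipeline. Markdown rendering stalls for ~290ms on batch mount.
**Estimated optimization headroom**:
| Priority | Measure | Estimated reduction |
|--------|------|----------|
| P0 | Message array copy optimization, 7 sites | 100-200 MB |
| P0 | Compact peak management, 3 items | 20-80 MB |
| P1 | Virtual scroll optimization | 20-30 MB |
| P1 | Buffer and cache cleanup, 5 items | 30-80 MB |
| P2 | Other, 3 items | 10-50 MB |
| **Total** | **21 actionable recommendations** | **210-500 MB** |
| Priority | Items | Estimated reduction |
|--------|--------|----------|
| P0 | 6 | 240-600 MB |
| P1 | 14 | 300-600 MB |
| P2 | 10 | 80-200 MB |
| **Total** | **30 items** | **620-1400 MB** |
In theory, usage could drop from the current 400-700 MB to **200-350 MB**.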
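## Recommendations (by priority)
### P0: message array copies (estimated saving 100-200 MB)
1. `query.ts:477`: remove the spread
2. `query.ts:1878`: append with push
3. `query.ts:1135`: pass by reference
4. `query.ts:1745`: pass multiple arguments
5. `query.ts:1857`: pass by reference (forkContextMessages)
6. `query.ts:491`: return the original array when nothing exceeds the budget
### P0: message rendering pipeline (added in Round 5, estimated saving 30-60 MB)
7. `FileReadTool.ts:342`: `maxResultSizeChars: Infinity` → set a reasonable cap (e.g. 100KB)
8. `toolResultStorage.ts:392`: after microcompact, also clear the corresponding entries in the `replacements` Map
9. `Messages.tsx:519`: consider incremental updates for buildMessageLookups instead of full rebuilds
### P0: compact peak (estimated saving 20-80 MB)
10. `compact.ts:543`: `preCompactReadFileState = undefined`
11. `compact.ts:651`: `summaryResponse = undefined`
12. Defer generation of non-critical attachments
### P1: rendering and caches (estimated saving 50-110 MB)
13. Virtual scroll: lower OVERSCAN_ROWS or MAX_MOUNTED_ITEMS
14. `lastAPIRequestMessages`: clear when not in debug
15. MCP Tool Schema: remove the manager-layer toolsCache
16. `HybridTransport`: maxQueueSize 100K→10K
17. `bootstrap/state.ts`: add LRU to unbounded Maps
### P2: other (estimated saving 10-50 MB)
18. `toolResultStorage.ts`: periodically clear seenIds/replacements
19. Streaming JSONL for session restore | incremental AppState updates
20. Thinking text truncation strategy (keep the first N + last N characters)
21. Trigger `Bun.gc(true)` on low memory
### P2: Ink rendering layer (reduce CPU overhead)
22. `ink.tsx:655-661`: on layout shift, try incremental damage instead of full-screen `{x:0,y:0,width:full,height:full}`
## Appendix
- Merged from: `docs/performance-reporter.md` (7 rounds of investigation, including detailed verification of CPU/rendering hotspots)
- Fix commits: `ab0bbbc4` (compact cleanup), `ef10ad28` (peak optimization, -100-300MB)
- New in Round 2: HybridTransport buffering, React messagesRef double reference, unbounded toolResultStorage growth
- New in Round 3: virtual scroll ~50MB resident, the 7th-8th spread (query.ts:1857), streaming contentBlocks thinking accumulation, dependency tree already lazy-loaded
- Round 4 final verification: no new root cause (agent spread and attachment accumulation are known variants); investigation closed
- Round 5 incremental verification: rebuild cost of the 8 Maps/Sets in buildMessageLookups, useDeferredValue double buffering, FileReadTool without a cap, delayed memory release after microcompact, details of the interaction between compaction and React state
In theory, usage could drop from 400-700 MB to **200-350 MB** (subject to the mimalloc/JSC architectural limit)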

@@ -1,6 +1,6 @@
{
"name": "claude-code-best",
"version": "2.0.1",
"version": "2.0.2",
"description": "Reverse-engineered Anthropic Claude Code CLI — interactive AI coding assistant in the terminal",
"type": "module",
"author": "claude-code-best <claude-code-best@proton.me>",

@@ -468,7 +468,11 @@ describe('DeepSeek thinking mode (enableThinking)', () => {
expect(assistant.reasoning_content).toBe('First thought.\nSecond thought.')
})
test('skips empty thinking blocks', () => {
test('preserves empty thinking blocks as reasoning_content: "" (DeepSeek v4 thinking mode)', () => {
// DeepSeek v4 thinking mode sometimes returns reasoning_content: ""
// when the model answers directly without reasoning. The empty value
// must be echoed back in the next request — otherwise DeepSeek returns
// 400 ("reasoning_content ... must be passed back"). See issue #399.
const result = anthropicMessagesToOpenAI(
[
makeUserMsg('question'),
@@ -481,7 +485,23 @@ describe('DeepSeek thinking mode (enableThinking)', () => {
{ enableThinking: true },
)
const assistant = result.filter(m => m.role === 'assistant')[0] as any
expect(assistant.reasoning_content).toBe('')
expect(assistant.content).toBe('Answer.')
})
test('omits reasoning_content when no thinking block is present', () => {
// No thinking block at all → no reasoning_content field on the
// OpenAI-format assistant message (relevant for non-thinking models).
const result = anthropicMessagesToOpenAI(
[
makeUserMsg('question'),
makeAssistantMsg([{ type: 'text', text: 'Answer.' }]),
],
[] as any,
)
const assistant = result.filter(m => m.role === 'assistant')[0] as any
expect(assistant.reasoning_content).toBeUndefined()
expect(assistant.content).toBe('Answer.')
})
// ── fix: reorder tool and user messages for OpenAI API compatibility (#168) ──

@@ -439,6 +439,54 @@ describe('thinking support (reasoning_content)', () => {
expect(blockStarts[1].content_block.type).toBe('tool_use')
})
test('opens thinking block on empty reasoning_content (DeepSeek v4 direct-answer)', async () => {
// DeepSeek v4 thinking mode sometimes streams reasoning_content: ""
// before answering directly. We must still open a thinking block so the
// resulting assistant message carries an (empty) thinking block — that
// round-trips back as reasoning_content: "" in the next request,
// satisfying DeepSeek's requirement (see issue #399).
const events = await collectEvents([
makeChunk({
choices: [
{
index: 0,
delta: { reasoning_content: '' },
finish_reason: null,
},
],
}),
makeChunk({
choices: [
{
index: 0,
delta: { content: 'Direct answer.' },
finish_reason: null,
},
],
}),
makeChunk({
choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
}),
])
// A thinking block was opened (and closed before the text block starts)
const blockStarts = events.filter(
e => e.type === 'content_block_start',
) as any[]
expect(blockStarts.length).toBe(2)
expect(blockStarts[0].content_block.type).toBe('thinking')
expect(blockStarts[0].content_block.thinking).toBe('')
expect(blockStarts[1].content_block.type).toBe('text')
// No empty thinking_delta should be emitted — the empty string is
// already conveyed by the thinking block's initial value.
const thinkingDeltas = events.filter(
e =>
e.type === 'content_block_delta' && e.delta.type === 'thinking_delta',
)
expect(thinkingDeltas.length).toBe(0)
})
test('thinking block index is 0, text block index is 1', async () => {
const events = await collectEvents([
makeChunk({

@@ -206,12 +206,14 @@ function convertInternalAssistantMessage(
},
})
} else if (block.type === 'thinking') {
// DeepSeek thinking mode: always preserve reasoning_content.
// DeepSeek requires reasoning_content to be passed back in subsequent requests,
// especially when tool calls are involved (returns 400 if missing).
// DeepSeek thinking mode: always preserve reasoning_content,
// including the empty-string case. DeepSeek v4 may return
// reasoning_content: "" when the model answers directly, and the
// empty value must be echoed back in the next request — otherwise
// DeepSeek returns 400 ("reasoning_content ... must be passed back").
const thinkingText = (block as unknown as Record<string, unknown>)
.thinking
if (typeof thinkingText === 'string' && thinkingText) {
if (typeof thinkingText === 'string') {
reasoningParts.push(thinkingText)
}
}
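
The effect of dropping the truthiness check in the hunk above can be seen in a reduced, self-contained form. The sketch below is not the project's `convertInternalAssistantMessage`; `AssistantBlock`, `OpenAIAssistantMessage`, and `toOpenAIAssistant` are hypothetical stand-ins, and joining parts with `'\n'` is an assumption based on the expected value in the existing test.

```typescript
// Hypothetical, reduced message shapes for illustration only.
type AssistantBlock =
  | { type: 'text'; text: string }
  | { type: 'thinking'; thinking: string }

interface OpenAIAssistantMessage {
  role: 'assistant'
  content: string
  reasoning_content?: string
}

function toOpenAIAssistant(blocks: AssistantBlock[]): OpenAIAssistantMessage {
  const textParts: string[] = []
  const reasoningParts: string[] = []
  for (const block of blocks) {
    if (block.type === 'text') {
      textParts.push(block.text)
    } else {
      // Keep the empty string too: an empty thinking block must still be
      // echoed back as reasoning_content: "".
      reasoningParts.push(block.thinking)
    }
  }
  const message: OpenAIAssistantMessage = {
    role: 'assistant',
    content: textParts.join('\n'),
  }
  // The field is set only when a thinking block existed at all, so messages
  // from non-thinking models still omit it.
  if (reasoningParts.length > 0) {
    message.reasoning_content = reasoningParts.join('\n')
  }
  return message
}
```

The key distinction is between "no thinking block" (field omitted) and "empty thinking block" (field present with `""`); a truthiness check collapses the two cases.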

@@ -106,9 +106,13 @@ export async function* adaptOpenAIStreamToAnthropic(
// Skip chunks that carry only usage data (no delta content)
if (!delta) continue
// Handle reasoning_content → Anthropic thinking block
// Handle reasoning_content → Anthropic thinking block.
// Empty string is a valid signal: DeepSeek v4 thinking mode sometimes
// returns reasoning_content: "" when the model answers directly. The
// empty thinking block must round-trip back to the API in subsequent
// requests, otherwise DeepSeek rejects with 400.
const reasoningContent = (delta as any).reasoning_content
if (reasoningContent != null && reasoningContent !== '') {
if (reasoningContent != null) {
if (!thinkingBlockOpen) {
currentContentIndex++
thinkingBlockOpen = true
@@ -125,14 +129,16 @@ export async function* adaptOpenAIStreamToAnthropic(
} as BetaRawMessageStreamEvent
}
yield {
type: 'content_block_delta',
index: currentContentIndex,
delta: {
type: 'thinking_delta',
thinking: reasoningContent,
},
} as BetaRawMessageStreamEvent
if (reasoningContent !== '') {
yield {
type: 'content_block_delta',
index: currentContentIndex,
delta: {
type: 'thinking_delta',
thinking: reasoningContent,
},
} as BetaRawMessageStreamEvent
}
}
// Handle text content
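
The adapter change above (open a thinking block on any non-null `reasoning_content`, but skip the delta when it is empty) can be sketched with a simplified synchronous generator. The event shapes here are deliberately reduced and hypothetical; they are not the Anthropic `BetaRawMessageStreamEvent` types, and `adaptDeltas` is not the project's `adaptOpenAIStreamToAnthropic`.

```typescript
type BlockKind = 'thinking' | 'text'

// Hypothetical, reduced chunk delta.
interface ChunkDelta {
  reasoning_content?: string | null
  content?: string | null
}

// Hypothetical, reduced event shapes.
type StreamEvent =
  | { type: 'content_block_start'; index: number; kind: BlockKind }
  | { type: 'content_block_delta'; index: number; kind: BlockKind; text: string }
  | { type: 'content_block_stop'; index: number }

function* adaptDeltas(deltas: ChunkDelta[]): Generator<StreamEvent> {
  let index = -1
  let open: BlockKind | null = null

  for (const delta of deltas) {
    const reasoning = delta.reasoning_content
    // `!= null` lets the empty string through: "" still opens a thinking block.
    if (reasoning != null) {
      if (open !== 'thinking') {
        if (open !== null) yield { type: 'content_block_stop', index }
        index++
        open = 'thinking'
        yield { type: 'content_block_start', index, kind: 'thinking' }
      }
      // An empty delta carries no information beyond the block start.
      if (reasoning !== '') {
        yield { type: 'content_block_delta', index, kind: 'thinking', text: reasoning }
      }
    }

    const text = delta.content
    if (text) {
      if (open !== 'text') {
        if (open !== null) yield { type: 'content_block_stop', index }
        index++
        open = 'text'
        yield { type: 'content_block_start', index, kind: 'text' }
      }
      yield { type: 'content_block_delta', index, kind: 'text', text }
    }
  }
  if (open !== null) yield { type: 'content_block_stop', index }
}
```

For the direct-answer case (`reasoning_content: ""` followed by text), this yields a thinking block start and stop with no thinking delta, then the text block, which mirrors what the new adapter test asserts.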

@@ -761,6 +761,16 @@ async function validateContentTokens(
const effectiveMaxTokens =
maxTokens ?? getDefaultFileReadingLimits().maxTokens
// Fast rejection: if raw byte count exceeds 4x the token limit,
// no encoding can possibly fit (worst case is ~4 bytes/token).
const byteLength = Buffer.byteLength(content)
if (byteLength > effectiveMaxTokens * 4) {
throw new MaxFileReadTokenExceededError(
Math.ceil(byteLength / 4),
effectiveMaxTokens,
)
}
const tokenEstimate = roughTokenCountEstimationForFileType(content, ext)
if (!tokenEstimate || tokenEstimate <= effectiveMaxTokens / 4) return
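
The fast-rejection step in the hunk above can be tried in isolation. This is a sketch under stated assumptions: the 4-bytes-per-token bound is taken from the diff comment as given, `TokenLimitExceededError` and `rejectIfObviouslyTooLarge` are hypothetical stand-ins for `MaxFileReadTokenExceededError` and the inline check, and `TextEncoder` replaces `Buffer.byteLength` only to keep the example free of Node-specific types.

```typescript
// Assumption copied from the diff comment: treat 4 bytes per token as the
// bound beyond which content cannot fit within the token limit.
const BYTES_PER_TOKEN_BOUND = 4

// Hypothetical stand-in for MaxFileReadTokenExceededError.
class TokenLimitExceededError extends Error {
  readonly estimatedTokens: number
  readonly maxTokens: number

  constructor(estimatedTokens: number, maxTokens: number) {
    super(`Estimated ${estimatedTokens} tokens exceeds the limit of ${maxTokens}`)
    this.name = 'TokenLimitExceededError'
    this.estimatedTokens = estimatedTokens
    this.maxTokens = maxTokens
  }
}

// Cheap pre-check that runs before any tokenizer-based estimate.
function rejectIfObviouslyTooLarge(content: string, maxTokens: number): void {
  // UTF-8 byte length, not string length: multi-byte characters count fully.
  const byteLength = new TextEncoder().encode(content).length
  if (byteLength > maxTokens * BYTES_PER_TOKEN_BOUND) {
    throw new TokenLimitExceededError(
      Math.ceil(byteLength / BYTES_PER_TOKEN_BOUND),
      maxTokens,
    )
  }
}
```

Content at or below the byte bound falls through to the more precise estimate, so the pre-check only ever rejects earlier; it never accepts anything the later check would reject.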

View File

@@ -18,6 +18,7 @@ import type { Tools } from '../Tool.js';
import { findToolByName } from '../Tool.js';
import type { AgentDefinitionsResult } from '@claude-code-best/builtin-tools/tools/AgentTool/loadAgentsDir.js';
import type {
AssistantMessage,
Message as MessageType,
NormalizedMessage,
ProgressMessage as ProgressMessageType,
@@ -36,6 +37,7 @@ import {
buildMessageLookups,
computeMessageStructureKey,
type MessageLookups,
updateMessageLookupsIncremental,
createAssistantMessage,
deriveUUID,
getMessagesAfterCompactBoundary,
@@ -516,7 +518,13 @@ const MessagesImpl = ({
// message content changed during streaming (text/thinking deltas). The key
// captures only structural info (types, IDs), so content-only deltas skip
// the rebuild entirely.
const lookupsCacheRef = useRef<{ key: string; lookups: MessageLookups } | null>(null);
const lookupsCacheRef = useRef<{
key: string;
lookups: MessageLookups;
normalizedCount: number;
messageCount: number;
lastAssistantMsgId: string | undefined;
} | null>(null);
// Expensive message transforms — filter, reorder, group, collapse, lookups.
// All O(n) over 27k messages. Split from the renderRange slice so scrolling
@@ -587,12 +595,57 @@ const MessagesImpl = ({
);
const lookupsKey = computeMessageStructureKey(normalizedMessages, messagesToShow as MessageType[]);
const currentLastAssistantMsgId = (() => {
const lastMsg = (messagesToShow as MessageType[]).at(-1);
return lastMsg?.type === 'assistant' ? (lastMsg as AssistantMessage).message?.id : undefined;
})();
let lookups: MessageLookups;
if (lookupsCacheRef.current && lookupsCacheRef.current.key === lookupsKey) {
lookups = lookupsCacheRef.current.lookups;
} else if (
lookupsCacheRef.current &&
normalizedMessages.length >= lookupsCacheRef.current.normalizedCount &&
(messagesToShow as MessageType[]).length >= lookupsCacheRef.current.messageCount &&
// If lastAssistantMsgId changed, previous "in-progress" assistant may
// now be orphaned — force a full rebuild to pick up the new status.
lookupsCacheRef.current.lastAssistantMsgId === currentLastAssistantMsgId
) {
// Try incremental update when only new messages were appended
const updated = updateMessageLookupsIncremental(
lookupsCacheRef.current.lookups,
lookupsCacheRef.current.normalizedCount,
lookupsCacheRef.current.messageCount,
normalizedMessages,
messagesToShow as MessageType[],
);
if (updated) {
lookups = updated;
lookupsCacheRef.current = {
key: lookupsKey,
lookups,
normalizedCount: normalizedMessages.length,
messageCount: (messagesToShow as MessageType[]).length,
lastAssistantMsgId: currentLastAssistantMsgId,
};
} else {
lookups = buildMessageLookups(normalizedMessages, messagesToShow as MessageType[]);
lookupsCacheRef.current = {
key: lookupsKey,
lookups,
normalizedCount: normalizedMessages.length,
messageCount: (messagesToShow as MessageType[]).length,
lastAssistantMsgId: currentLastAssistantMsgId,
};
}
} else {
lookups = buildMessageLookups(normalizedMessages, messagesToShow as MessageType[]);
-lookupsCacheRef.current = { key: lookupsKey, lookups };
+lookupsCacheRef.current = {
+  key: lookupsKey,
+  lookups,
+  normalizedCount: normalizedMessages.length,
+  messageCount: (messagesToShow as MessageType[]).length,
+  lastAssistantMsgId: currentLastAssistantMsgId,
+};
}
const hiddenMessageCount = messagesToShowNotTruncated.length - MAX_MESSAGES_TO_SHOW_IN_TRANSCRIPT_MODE;
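
The cache logic in the hunk above chooses between three paths: reuse the cached lookups, update them incrementally, or rebuild from scratch. A minimal sketch of that decision, assuming simplified stand-in types (`CacheEntry`, `Strategy`, and `chooseStrategy` are hypothetical names, not part of this change):

```typescript
// Simplified stand-in for the cache entry kept in lookupsCacheRef.
type CacheEntry = {
  key: string
  normalizedCount: number
  messageCount: number
  lastAssistantMsgId: string | undefined
}

type Strategy = 'hit' | 'incremental' | 'rebuild'

// Decide how to obtain lookups for the current render.
function chooseStrategy(
  cache: CacheEntry | null,
  key: string,
  normalizedCount: number,
  messageCount: number,
  lastAssistantMsgId: string | undefined,
): Strategy {
  if (cache === null) return 'rebuild'
  // Same structural key: content-only deltas, reuse as-is.
  if (cache.key === key) return 'hit'
  // Incremental is only attempted when nothing was removed and the
  // trailing assistant message is still the same one.
  const appendOnly =
    normalizedCount >= cache.normalizedCount &&
    messageCount >= cache.messageCount
  const sameTail = cache.lastAssistantMsgId === lastAssistantMsgId
  return appendOnly && sameTail ? 'incremental' : 'rebuild'
}
```

In the diff itself the incremental path can still fall back to a full rebuild when `updateMessageLookupsIncremental` returns `null`; the sketch leaves that second fallback out.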


@@ -7,6 +7,9 @@ import type { CanUseToolFn } from './hooks/useCanUseTool.js'
import { FallbackTriggeredError } from './services/api/withRetry.js'
import {
calculateTokenWarningState,
estimateMaxTurnGrowth,
getAutoCompactThreshold,
getEffectiveContextWindowSize,
isAutoCompactEnabled,
type AutoCompactTrackingState,
} from './services/compact/autoCompact.js'
@@ -474,7 +477,7 @@ async function* queryLoop(
queryTracking,
}
-let messagesForQuery = [...getMessagesAfterCompactBoundary(messages)]
+let messagesForQuery = getMessagesAfterCompactBoundary(messages)
let tracking = autoCompactTracking
@@ -769,6 +772,48 @@ async function* queryLoop(
}
}
// Predictive autocompact: estimate whether this turn's growth would push
// us past the context window. The threshold is the effective context window
// minus the estimated growth, without also subtracting the autocompact
// buffer: getAutoCompactThreshold already reserves that buffer, so applying
// it here as well would reserve the same headroom twice.
if (!compactionResult && isAutoCompactEnabled()) {
const model = toolUseContext.options.mainLoopModel
const currentTokens =
tokenCountWithEstimation(messagesForQuery) - snipTokensFreed
const estimatedGrowth = estimateMaxTurnGrowth(model)
const predictiveThreshold =
getEffectiveContextWindowSize(model) - estimatedGrowth
if (currentTokens > predictiveThreshold) {
const predictiveResult = await deps.autocompact(
messagesForQuery,
toolUseContext,
{
systemPrompt,
userContext,
systemContext,
toolUseContext,
forkContextMessages: messagesForQuery,
},
querySource,
tracking,
snipTokensFreed,
)
if (predictiveResult.compactionResult) {
messagesForQuery = buildPostCompactMessages(
predictiveResult.compactionResult,
)
snipTokensFreed = 0
tracking = tracking
? {
...tracking,
compacted: true,
consecutiveFailures: predictiveResult.consecutiveFailures ?? 0,
}
: tracking
}
}
}
let attemptWithFallback = true
queryCheckpoint('query_api_loop_start')
@@ -1142,7 +1187,7 @@ async function* queryLoop(
// Execute post-sampling hooks after model response is complete
if (assistantMessages.length > 0) {
void executePostSamplingHooks(
-[...messagesForQuery, ...assistantMessages],
+messagesForQuery.concat(assistantMessages),
systemPrompt,
userContext,
systemContext,
@@ -1864,11 +1909,10 @@ async function* queryLoop(
userContext,
systemContext,
toolUseContext,
-forkContextMessages: [
-  ...messagesForQuery,
-  ...assistantMessages,
-  ...toolResults,
-],
+forkContextMessages: messagesForQuery.concat(
+  assistantMessages,
+  toolResults,
+),
})
}
}
@@ -1885,7 +1929,7 @@ async function* queryLoop(
queryCheckpoint('query_recursive_call')
const next: State = {
-messages: [...messagesForQuery, ...assistantMessages, ...toolResults],
+messages: messagesForQuery.concat(assistantMessages, toolResults),
toolUseContext: toolUseContextWithQueryTracking,
autoCompactTracking: tracking,
turnCount: nextTurnCount,
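
The predictive check added to `queryLoop` above reduces to one comparison. A minimal sketch, assuming illustrative token counts (the function names and all numbers here are hypothetical, chosen only to show the arithmetic):

```typescript
// Threshold used by the predictive check: window minus expected growth.
function predictiveThreshold(
  effectiveContextWindow: number,
  estimatedTurnGrowth: number,
): number {
  return effectiveContextWindow - estimatedTurnGrowth
}

// Compact before the API call if the current usage already exceeds
// the point where one more turn could overflow the window.
function shouldCompactPredictively(
  currentTokens: number,
  effectiveContextWindow: number,
  estimatedTurnGrowth: number,
): boolean {
  return (
    currentTokens >
    predictiveThreshold(effectiveContextWindow, estimatedTurnGrowth)
  )
}

// What the code comment warns against: subtracting the autocompact
// buffer as well, which reserves the same headroom twice.
function doubleReservedThreshold(
  effectiveContextWindow: number,
  autocompactBuffer: number,
  estimatedTurnGrowth: number,
): number {
  return effectiveContextWindow - autocompactBuffer - estimatedTurnGrowth
}
```

With a 180,000-token window and 35,000 tokens of estimated growth, the predictive threshold is 145,000; subtracting a further 13,000-token buffer would pull it down to 132,000 and trigger compaction earlier than intended.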


@@ -1566,7 +1566,15 @@ export function REPL({
// Deferred messages for the Messages component — renders at transition
// priority so the reconciler yields every 5ms, keeping input responsive
// while the expensive message processing pipeline runs.
-const deferredMessages = useDeferredValue(messages);
+// Cap at 500 messages to limit memory double-buffering. The bypass
+// at display-time uses sync messages during streaming and non-loading,
+// so this cap only affects reduced-motion scenarios.
+const DEFERRED_CAP = 500;
+const cappedMessages = React.useMemo(
+  () => (messages.length > DEFERRED_CAP ? messages.slice(-DEFERRED_CAP) : messages),
+  [messages],
+);
+const deferredMessages = useDeferredValue(cappedMessages);
const deferredBehind = messages.length - deferredMessages.length;
if (deferredBehind > 0) {
logForDebugging(


@@ -64,6 +64,35 @@ export const WARNING_THRESHOLD_BUFFER_TOKENS = 20_000
export const ERROR_THRESHOLD_BUFFER_TOKENS = 20_000
export const MANUAL_COMPACT_BUFFER_TOKENS = 3_000
// Conservative estimate for tool result growth per turn.
// Typical tool results (file reads, grep, bash) average ~5-10K tokens;
// occasional large reads can spike to 20K+.
const TOOL_RESULT_GROWTH_ESTIMATE = 15_000
/**
* Context-aware autocompact buffer. Larger context windows need more
* headroom because a single turn can produce proportionally more tokens
* (longer model outputs + larger tool results).
*/
export function getAutocompactBufferTokens(model: string): number {
const effectiveWindow = getEffectiveContextWindowSize(model)
if (effectiveWindow >= 800_000) return 50_000
if (effectiveWindow >= 400_000) return 30_000
return AUTOCOMPACT_BUFFER_TOKENS
}
/**
* Estimate the maximum token growth a single turn can produce.
* Used for predictive autocompact checks before the API call.
*/
export function estimateMaxTurnGrowth(model: string): number {
const maxOutput = Math.min(
getMaxOutputTokensForModel(model),
MAX_OUTPUT_TOKENS_FOR_SUMMARY,
)
return maxOutput + TOOL_RESULT_GROWTH_ESTIMATE
}
// Stop trying autocompact after this many consecutive failures.
// BQ 2026-03-10: 1,279 sessions had 50+ consecutive failures (up to 3,272)
// in a single session, wasting ~250K API calls/day globally.
@@ -73,7 +102,7 @@ export function getAutoCompactThreshold(model: string): number {
const effectiveContextWindow = getEffectiveContextWindowSize(model)
const autocompactThreshold =
-  effectiveContextWindow - AUTOCOMPACT_BUFFER_TOKENS
+  effectiveContextWindow - getAutocompactBufferTokens(model)
// Override for easier testing of autocompact
const envPercent = process.env.CLAUDE_AUTOCOMPACT_PCT_OVERRIDE
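
To make the tiering above concrete, here is a minimal sketch of the buffer selection and the per-turn growth estimate as pure functions. The tier boundaries and the 15,000-token tool-result estimate come from the diff; `BASE_BUFFER` and the output limits passed in the usage note are assumed placeholders, since the values of `AUTOCOMPACT_BUFFER_TOKENS` and `MAX_OUTPUT_TOKENS_FOR_SUMMARY` are not shown here.

```typescript
// Placeholder for AUTOCOMPACT_BUFFER_TOKENS (actual value not shown in the diff).
const BASE_BUFFER = 13_000
const TOOL_RESULT_GROWTH = 15_000

// Larger windows get a larger reserved buffer.
function bufferFor(effectiveWindow: number): number {
  if (effectiveWindow >= 800_000) return 50_000
  if (effectiveWindow >= 400_000) return 30_000
  return BASE_BUFFER
}

// Autocompact threshold: window minus the tiered buffer.
function thresholdFor(effectiveWindow: number): number {
  return effectiveWindow - bufferFor(effectiveWindow)
}

// Worst-case growth of one turn: capped model output plus tool results.
function maxTurnGrowth(modelMaxOutput: number, summaryCap: number): number {
  return Math.min(modelMaxOutput, summaryCap) + TOOL_RESULT_GROWTH
}
```

For example, `thresholdFor(1_000_000)` evaluates to 950,000, and `maxTurnGrowth(32_000, 20_000)` to 35,000 under these assumed inputs.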


@@ -334,13 +334,12 @@ export type RecompactionInfo = {
* Order: boundaryMarker, summaryMessages, messagesToKeep, attachments, hookResults
*/
export function buildPostCompactMessages(result: CompactionResult): Message[] {
-  return [
-    result.boundaryMarker,
-    ...result.summaryMessages,
-    ...(result.messagesToKeep ?? []),
-    ...result.attachments,
-    ...result.hookResults,
-  ]
+  return ([result.boundaryMarker] as Message[]).concat(
+    result.summaryMessages,
+    result.messagesToKeep ?? [],
+    result.attachments,
+    result.hookResults,
+  )
}
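
The spread-to-`concat` rewrite above is intended to be behavior-preserving. A small sketch, assuming a simplified `Msg` type (hypothetical, standing in for `Message`), shows that both forms yield the same elements in the same order and leave their inputs untouched:

```typescript
type Msg = { id: string }

// Original form: array literal with spreads.
function buildWithSpread(
  marker: Msg,
  summaries: Msg[],
  kept: Msg[] | undefined,
  attachments: Msg[],
): Msg[] {
  return [marker, ...summaries, ...(kept ?? []), ...attachments]
}

// Rewritten form: a one-element array followed by concat.
function buildWithConcat(
  marker: Msg,
  summaries: Msg[],
  kept: Msg[] | undefined,
  attachments: Msg[],
): Msg[] {
  return [marker].concat(summaries, kept ?? [], attachments)
}
```

Because `Array.prototype.concat` always returns a new array, callers that previously relied on the spread producing a fresh copy keep that guarantee.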
/**


@@ -1,25 +1,97 @@
-// Auto-generated stub — replace with real implementation
-export {}
-import type { Message } from 'src/types/message'
-import type { CompactionResult } from './compact.js'
+import { isEnvTruthy } from '../../utils/envUtils.js'
+import {
+  isMediaSizeErrorMessage,
+  isPromptTooLongMessage,
+} from '../api/errors.js'
+import type { AssistantMessage, Message } from '../../types/message.js'
+import { type CompactionResult, compactConversation } from './compact.js'
+import { logError } from '../../utils/log.js'
+import { logForDebugging } from '../../utils/debug.js'
+import type { CacheSafeParams } from '../../utils/forkedAgent.js'
export const isReactiveOnlyMode: () => boolean = () => false
export const reactiveCompactOnPromptTooLong: (
messages: Message[],
cacheSafeParams: Record<string, unknown>,
options: { customInstructions?: string; trigger?: string },
) => Promise<{ ok: boolean; reason?: string; result?: CompactionResult }> =
-  async () => ({ ok: false })
-export const isReactiveCompactEnabled: () => boolean = () => false
-export const isWithheldPromptTooLong: (message: Message) => boolean = () =>
-  false
-export const isWithheldMediaSizeError: (message: Message) => boolean = () =>
-  false
+  async (messages, cacheSafeParams, options) => {
+    const params = cacheSafeParams as unknown as CacheSafeParams
+    try {
+      const result = await compactConversation(
+        messages,
+        params.toolUseContext,
+        params,
+        true,
+        options.customInstructions,
+        true,
+        {
+          isRecompactionInChain: false,
+          turnsSincePreviousCompact: 0,
+          autoCompactThreshold: 0,
+          querySource: 'compact',
+        },
+      )
+      return { ok: true, result }
+    } catch (error) {
+      logError(error)
+      return { ok: false, reason: String(error) }
+    }
+  }
+export const isReactiveCompactEnabled: () => boolean = () => {
+  if (isEnvTruthy(process.env.DISABLE_COMPACT)) return false
+  return true
+}
+export const isWithheldPromptTooLong: (message: Message) => boolean =
+  message => {
+    if (message.type !== 'assistant' || !message.isApiErrorMessage) return false
+    return isPromptTooLongMessage(message as AssistantMessage)
+  }
+export const isWithheldMediaSizeError: (message: Message) => boolean =
+  message => {
+    if (message.type !== 'assistant' || !message.isApiErrorMessage) return false
+    return isMediaSizeErrorMessage(message as AssistantMessage)
+  }
export const tryReactiveCompact: (params: {
hasAttempted: boolean
querySource: string
aborted: boolean
messages: Message[]
cacheSafeParams: Record<string, unknown>
-}) => Promise<CompactionResult | null> = async () => null
+}) => Promise<CompactionResult | null> = async ({
+  hasAttempted,
+  aborted,
+  messages,
+  cacheSafeParams,
+}) => {
+  if (hasAttempted || aborted) return null
+  const params = cacheSafeParams as unknown as CacheSafeParams
+  try {
+    const result = await compactConversation(
+      messages,
+      params.toolUseContext,
+      params,
+      true,
+      undefined,
+      true,
+      {
+        isRecompactionInChain: false,
+        turnsSincePreviousCompact: 0,
+        autoCompactThreshold: 0,
+      },
+    )
+    return result
+  } catch (error) {
+    logForDebugging(
+      `reactiveCompact: emergency compaction failed — ${String(error)}`,
+      { level: 'warn' },
+    )
+    logError(error)
+    return null
+  }
+}


@@ -1397,6 +1397,172 @@ export function buildMessageLookups(
}
}
/**
* Incrementally update lookups by processing only newly appended messages.
* Returns the same lookups object (mutated in place) if update succeeds,
* or null if a full rebuild is needed (e.g., messages were removed).
*/
export function updateMessageLookupsIncremental(
existing: MessageLookups,
previousNormalizedCount: number,
previousMessageCount: number,
normalizedMessages: NormalizedMessage[],
messages: Message[],
): MessageLookups | null {
// Safety check: only handle append-only case
if (
normalizedMessages.length < previousNormalizedCount ||
messages.length < previousMessageCount
) {
return null
}
// No new messages — nothing to do
if (
normalizedMessages.length === previousNormalizedCount &&
messages.length === previousMessageCount
) {
return existing
}
// Process new messages entries (pass 1: assistant tool_use blocks)
const newMessageStart = previousMessageCount
for (let i = newMessageStart; i < messages.length; i++) {
const msg = messages[i]!
if (msg.type === 'assistant') {
const aMsg = msg as AssistantMessage
if (Array.isArray(aMsg.message.content)) {
const newToolUseIDs: string[] = []
for (const content of aMsg.message.content) {
if (typeof content !== 'string' && content.type === 'tool_use') {
const toolUseContent = content as ToolUseBlock
newToolUseIDs.push(toolUseContent.id)
existing.toolUseByToolUseID.set(
toolUseContent.id,
content as ToolUseBlockParam,
)
}
}
// Update sibling lookup: all tool_use IDs in this message share siblings
const allSiblings = new Set(newToolUseIDs)
for (const toolUseID of newToolUseIDs) {
existing.siblingToolUseIDs.set(toolUseID, allSiblings)
}
}
}
}
// Process new normalizedMessages entries (pass 2: progress, hooks, tool results)
const newNormalizedStart = previousNormalizedCount
for (let i = newNormalizedStart; i < normalizedMessages.length; i++) {
const msg = normalizedMessages[i]!
if (msg.type === 'progress') {
const toolUseID = msg.parentToolUseID as string
const progressForToolUse = existing.progressMessagesByToolUseID.get(toolUseID)
if (progressForToolUse) {
progressForToolUse.push(msg as ProgressMessage)
} else {
existing.progressMessagesByToolUseID.set(toolUseID, [
msg as ProgressMessage,
])
}
const progressData = msg.data as { type: string; hookEvent: HookEvent }
if (progressData.type === 'hook_progress') {
const hookEvent = progressData.hookEvent
let byHookEvent = existing.inProgressHookCounts.get(toolUseID)
if (!byHookEvent) {
byHookEvent = new Map()
existing.inProgressHookCounts.set(toolUseID, byHookEvent)
}
byHookEvent.set(hookEvent, (byHookEvent.get(hookEvent) ?? 0) + 1)
}
}
if (msg.type === 'user' && Array.isArray(msg.message?.content)) {
for (const content of msg.message?.content ?? []) {
if (typeof content !== 'string' && content.type === 'tool_result') {
const tr = content as ToolResultBlockParam
existing.toolResultByToolUseID.set(tr.tool_use_id, msg)
existing.resolvedToolUseIDs.add(tr.tool_use_id)
if (tr.is_error) {
existing.erroredToolUseIDs.add(tr.tool_use_id)
}
}
}
}
if (msg.type === 'assistant' && Array.isArray(msg.message?.content)) {
for (const content of msg.message?.content ?? []) {
if (typeof content === 'string') continue
if (
'tool_use_id' in content &&
typeof (content as { tool_use_id: string }).tool_use_id === 'string'
) {
existing.resolvedToolUseIDs.add(
(content as { tool_use_id: string }).tool_use_id,
)
}
if ((content.type as string) === 'advisor_tool_result') {
const result = content as {
tool_use_id: string
content: { type: string }
}
if (result.content.type === 'advisor_tool_result_error') {
existing.erroredToolUseIDs.add(result.tool_use_id)
}
}
}
}
if (isHookAttachmentMessage(msg)) {
const toolUseID = msg.attachment.toolUseID
const hookEvent = msg.attachment.hookEvent
const hookName = (msg.attachment as HookAttachmentWithName).hookName
if (hookName !== undefined) {
let byHookEvent = existing.resolvedHookCounts.get(toolUseID)
if (!byHookEvent) {
byHookEvent = new Map()
existing.resolvedHookCounts.set(toolUseID, byHookEvent)
}
byHookEvent.set(hookEvent, (byHookEvent.get(hookEvent) ?? 0) + 1)
}
}
}
existing.normalizedMessageCount = normalizedMessages.length
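// Mark orphaned server_tool_use / mcp_tool_use blocks as errored.
// Only the newly appended normalizedMessages are scanned; earlier entries
// were checked by a previous build. Blocks in the current last assistant
// message are skipped because that message may still be in progress; the
// caller forces a full rebuild once lastAssistantMsgId changes.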
const lastMsg = messages.at(-1)
const lastAssistantMsgId =
lastMsg?.type === 'assistant' ? lastMsg.message?.id : undefined
for (let i = newNormalizedStart; i < normalizedMessages.length; i++) {
const msg = normalizedMessages[i]!
if (msg.type !== 'assistant') continue
const aMsg = msg as AssistantMessage
if (aMsg.message.id === lastAssistantMsgId) continue
if (!Array.isArray(aMsg.message.content)) continue
for (const content of aMsg.message.content) {
if (
typeof content !== 'string' &&
((content.type as string) === 'server_tool_use' ||
(content.type as string) === 'mcp_tool_use') &&
!existing.resolvedToolUseIDs.has((content as { id: string }).id)
) {
const id = (content as { id: string }).id
existing.resolvedToolUseIDs.add(id)
existing.erroredToolUseIDs.add(id)
}
}
}
return existing
}
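
The append-only contract of `updateMessageLookupsIncremental` can be illustrated with a much smaller index. This is a sketch under simplified assumptions: `Msg`, `Index`, `buildIndex`, and `updateIndex` are hypothetical names, and only tool-use resolution is tracked.

```typescript
type ToolUse = { type: 'tool_use'; id: string }
type ToolResult = { type: 'tool_result'; tool_use_id: string; is_error?: boolean }
type Msg = { content: Array<ToolUse | ToolResult> }

type Index = {
  seen: Set<string> // every tool_use id encountered
  resolved: Set<string> // tool_use ids that have a result
  errored: Set<string> // resolved ids whose result is an error
  count: number // how many messages have been processed
}

// Process only messages[index.count..]; mutate and return the same index.
// Returns null when messages were removed, signalling a full rebuild.
function updateIndex(index: Index, messages: Msg[]): Index | null {
  if (messages.length < index.count) return null
  for (let i = index.count; i < messages.length; i++) {
    for (const block of messages[i]!.content) {
      if (block.type === 'tool_use') {
        index.seen.add(block.id)
      } else {
        index.resolved.add(block.tool_use_id)
        if (block.is_error) index.errored.add(block.tool_use_id)
      }
    }
  }
  index.count = messages.length
  return index
}

// A full build is an incremental update starting from an empty index.
function buildIndex(messages: Msg[]): Index {
  const empty: Index = {
    seen: new Set(),
    resolved: new Set(),
    errored: new Set(),
    count: 0,
  }
  return updateIndex(empty, messages)!
}
```

Mutating in place keeps the update proportional to the number of new messages, at the cost that callers must not hold on to an older snapshot of the index.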
/**
* Compute a lightweight structural fingerprint for buildMessageLookups caching.
* Only captures information that affects lookup results (types, IDs, counts),


@@ -101,6 +101,20 @@ export async function readFileInRange(
throw new FileTooLargeError(stats.size, maxBytes)
}
// For targeted reads of moderately large files, prefer streaming to
// avoid loading the full file into memory when only a slice is needed.
const isTargetedRead = offset > 0 || maxLines !== undefined
if (isTargetedRead && stats.size > FAST_PATH_MAX_SIZE / 4) {
return readFileInRangeStreaming(
filePath,
offset,
maxLines,
maxBytes,
truncateOnByteLimit,
signal,
)
}
const text = await readFile(filePath, { encoding: 'utf8', signal })
return readFileInRangeFast(
text,