Compare commits

7 Commits

Author SHA1 Message Date
claude-code-best
7cc1785fc0 chore:1.10.5 2026-04-27 19:54:26 +08:00
claude-code-best
c80e593212 feature: langfuse thinking and text-edit fixes (#371); omit diffs to reduce peak memory (#376)
* feat: record thinking parameters in langfuse tracing

Add the thinking config (type/budgetTokens) to recordLLMObservation.
All providers (claude/gemini/openai) and the tokenEstimation and sideQuery
call sites now pass the thinking info too, so thinking usage is visible in the Langfuse dashboard.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: make langfuse tracing accept the snake_case budget_tokens format

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

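* fix: pass the full thinking config everywhere instead of only thinkingType

Langfuse tracing now passes the whole thinking object (including type and budget_tokens),
analytics logs gain a thinkingBudgetTokens field, and logAPIQuery now
takes a parameter of type ThinkingConfig.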

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

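* feat: omit code diff display for older messages, keeping the full diff only for the latest message

* fix: add Tab/space-normalized matching to the Edit tool, fixing failed edits of files with Chinese text or indentation

The Read tool renders tabs as spaces in its output, so text the user copies from it cannot be matched by the Edit tool.
Add a tab→space normalization fallback match to findActualString and map the match precisely back to its position in the original file.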

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: add a troubleshooting note for install/update failures to the README

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-27 17:06:33 +08:00
Dosion
b47731a3f3 test: keep Codecov coverage on real agent communication paths (#374)
* test: keep Codecov coverage on real agent communication paths

PR #369 was merged before the final Codecov coverage fix landed, so this follow-up carries only the incremental real-path tests needed on top of main. The tests exercise AgentSummary lifecycle branches, mailbox fail-closed behavior, UDS client connection failure through a real capability file, and UDS response-reader framing without mock.module, warning suppression, feature fallback, or production-code churn.

Constraint: PR #369 is already merged; this branch must contain only the incremental Codecov repair on top of latest main

Rejected: Reopen or keep pushing the merged PR branch | merged PR refs do not update and would leave Codecov stale

Rejected: Mock bun:bundle or hide warnings | would reintroduce cross-test pollution and pseudo coverage

Rejected: Keep unrelated SendMessageTool production diff | it created avoidable patch-coverage debt without improving the runtime path

Confidence: high

Scope-risk: narrow

Directive: Keep these coverage tests on real paths; do not replace them with output suppression or feature-flag mocks

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun test src\utils\__tests__\teammateMailbox.test.ts

Tested: bun test src\services\AgentSummary\__tests__\agentSummary.test.ts src\services\AgentSummary\__tests__\summaryContext.test.ts src\utils\__tests__\teammateMailbox.test.ts src\utils\__tests__\udsMessaging.test.ts src\utils\__tests__\udsResponseReader.test.ts packages\builtin-tools\src\tools\SendMessageTool\__tests__\udsRecipientSanitization.test.ts

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Tested: bun audit

Tested: git diff --check

Tested: Claude simplify review GO (.omx/artifacts/claude-simplify-codecov-20260427-1521.md)

Tested: Claude security review GO (.omx/artifacts/claude-security-codecov-20260427-1522.md)

Not-tested: GitHub-hosted Codecov upload after this amended commit until PR checks rerun

* test: keep review assertions tied to real failure paths

CodeRabbit flagged three non-blocking but valid review gaps: platform-specific mailbox errno checks, brittle UDS connection-failure message assertions, and missing AgentSummary reschedule proof after fork errors. This keeps the fixes narrow by tightening the affected assertions and adding a structured UDS connection error for tests to assert behavior instead of prose.

Constraint: PR #374 is a review follow-up and must not hide warnings, skip tests, or merge the PR.

Rejected: Matching the UDS failure message literal | preserves the brittle coupling CodeRabbit flagged.

Rejected: Asserting only that mailbox writes throw | would allow unrelated pre-path failures to pass.

Confidence: high

Scope-risk: narrow

Directive: Keep UDS connection-failure tests on structured error data, not display wording.

Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/teammateMailbox.test.ts src/utils/__tests__/udsMessaging.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted CodeRabbit refresh until pushed.

* test: remove brittle review follow-up assumptions

CodeRabbit's second pass found two valid brittleness issues and one suggested callback-reference assertion that would not match production behavior. This keeps the production behavior unchanged: timers still schedule the summarizer closure, tests now assert timer-handle identity, and UDS connection errors use native Error.cause instead of shadowing it.

Constraint: Do not manufacture behavior just to satisfy a review hint; assertions must match the real AgentSummary scheduling contract.

Rejected: Assert a fresh scheduled callback function | scheduleNext intentionally passes the same runSummary closure each time.

Rejected: Store a custom cause field on UdsPeerConnectionError | native Error.cause is available under ESNext/Bun.

Confidence: high

Scope-risk: narrow

Directive: Timer tests should assert returned handle identity for ownership, not incidental numeric values.

Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/udsMessaging.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted CodeRabbit refresh until pushed.

* test: enforce structured UDS timeout failures

CodeRabbit's follow-up surfaced a real consistency gap: UDS send socket errors used UdsPeerConnectionError while response timeouts still rejected a generic Error. Timeouts now use the same structured peer failure contract, and the test exercises that path through a short explicit timeout instead of waiting for the production default.

The AgentSummary unchanged-fingerprint test now also asserts that the second unchanged tick does not log errors, preserving the existing behavior checks without changing production scheduling semantics.

Constraint: Keep the production timeout default at 5000ms while allowing tests to exercise the timeout path quickly.

Rejected: Leave timeout failures as generic Error | callers would need separate handling for the same peer connection failure class.

Confidence: high

Scope-risk: narrow

Directive: Keep UDS send timeout and socket-error branches on the same structured error contract.

Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/udsMessaging.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted CodeRabbit refresh until pushed.
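
The structured-failure contract described in these follow-ups can be sketched roughly as below. This is an illustration under assumptions, not the PR's code: the class name `PeerConnectionError`, the `kind` field, and both factory functions are hypothetical stand-ins. Only the use of native `Error.cause` and the 5000ms production default are taken from the commit message.

```typescript
// Hypothetical sketch: one error class for both socket and timeout failures.
type PeerFailureKind = "socket" | "timeout";

class PeerConnectionError extends Error {
  readonly kind: PeerFailureKind;

  constructor(kind: PeerFailureKind, message: string, cause?: unknown) {
    // Pass the cause through the native options bag instead of
    // declaring a custom `cause` field that would shadow it.
    super(message, cause === undefined ? undefined : { cause });
    this.name = "PeerConnectionError";
    this.kind = kind;
  }
}

// Assumed default, mirroring the 5000ms value named in the commit message.
const DEFAULT_TIMEOUT_MS = 5000;

function socketFailure(cause: unknown): PeerConnectionError {
  return new PeerConnectionError("socket", "peer connection failed", cause);
}

function timeoutFailure(timeoutMs: number = DEFAULT_TIMEOUT_MS): PeerConnectionError {
  return new PeerConnectionError("timeout", `no response within ${timeoutMs}ms`);
}
```

With a shape like this, a test can assert on `kind` (structured data) rather than on the wording of `message`, which is the coupling the review flagged as brittle.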

---------

Co-authored-by: unraid <local@unraid.local>
2026-04-27 16:22:13 +08:00
claude-code-best
a65df4a102 docs: update contributors 2026-04-27 07:57:43 +00:00
Dosion
52b61c2c06 fix: bound agent communication memory growth (#369)
* fix: bound agent communication memory growth

UDS messaging now uses private local capabilities instead of exposing auth tokens through SDK metadata, environment variables, session registry, peer listing, or tool output. The receive path bounds NDJSON frames, response buffers, active clients, and pending inbox bytes, and strips auth metadata before messages enter the prompt queue.

Teammate mailboxes now validate file and message sizes, fail closed on corrupt mutation inputs, compact by count and retained bytes, and use stable message identity for in-process acknowledgements. Agent summaries now fork only a bounded recent context using lazy size estimation and content fingerprints instead of retaining or serializing unbounded histories.

Constraint: PR #361 was already merged; this branch is based on upstream/main@c2ac9a74.
Rejected: Default-disabling COORDINATOR_MODE/TEAMMEM only | explicit feature enablement still hit unbounded paths.
Rejected: Persisting UDS auth in SDK/env/session registry | bridge/remote metadata can leak local capability secrets.
Rejected: Inline uds #token addresses | observable/tool/classifier paths can reflect raw addresses outside the UDS request frame.
Rejected: Positional mailbox marking after compaction | compaction can shift indices across the lock boundary.
Confidence: high
Scope-risk: moderate
Directive: Do not expose UDS capability tokens through SDK messages, environment variables, session registry, peer-list output, or SendMessage result/classifier surfaces.
Directive: Do not reintroduce positional mailbox acknowledgements unless compaction is removed or read+mark is atomic under one lock.
Tested: bun test src/utils/__tests__/ndjsonFramer.test.ts src/utils/__tests__/udsMessaging.test.ts packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts
Tested: bunx tsc --noEmit --pretty false
Tested: bun run lint
Tested: bunx biome lint modified src/package files
Tested: bun run test:all (3704 pass, 0 fail, 6734 expects)
Tested: bun audit (No vulnerabilities found)
Tested: bun run build
Tested: bun run build:vite
Tested: git diff --check
Not-tested: End-to-end external UDS client driving a full production headless model turn.
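
To make the "bounded NDJSON frames" idea concrete, here is a minimal sketch of a framer that refuses oversized frames instead of buffering them. The class name, the `overflow` flag, and the limit value are assumptions for illustration; the real ndjsonFramer may differ, and this sketch counts UTF-16 code units rather than bytes for simplicity.

```typescript
// Hypothetical sketch of a size-bounded NDJSON framer.
type FrameResult = { frames: string[]; overflow: boolean };

class BoundedNdjsonFramer {
  private buffer = "";
  private readonly maxFrameLength: number;

  constructor(maxFrameLength: number) {
    this.maxFrameLength = maxFrameLength;
  }

  // Feed one chunk; returns every complete line found so far.
  push(chunk: string): FrameResult {
    this.buffer += chunk;
    const frames: string[] = [];
    let newline = this.buffer.indexOf("\n");
    while (newline !== -1) {
      const line = this.buffer.slice(0, newline);
      this.buffer = this.buffer.slice(newline + 1);
      if (line.length > this.maxFrameLength) {
        this.buffer = "";
        return { frames, overflow: true };
      }
      if (line.trim() !== "") frames.push(line);
      newline = this.buffer.indexOf("\n");
    }
    // A partial frame that has not seen its newline yet is bounded too,
    // so a peer cannot grow the buffer forever by never sending "\n".
    if (this.buffer.length > this.maxFrameLength) {
      this.buffer = "";
      return { frames, overflow: true };
    }
    return { frames, overflow: false };
  }
}
```

A caller would typically close the connection when `overflow` is true, which lines up with the later note about not leaving malformed frames open until timeout.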

* fix: harden bounded agent communication review fixes

CodeRabbit and Codecov surfaced real gaps in UDS framing, peer discovery, mailbox retention, and summary context coverage. This tightens those paths without suppressing review or coverage signals.

Constraint: PR #369 must address CodeRabbit and Codecov findings without warning suppression or fake fallbacks

Rejected: Suppress Codecov or CodeRabbit warnings | leaves real receive-path and test-isolation gaps

Rejected: Add unreachable feature-gated tests | bun:bundle keeps those branches compile-time gated in local tests

Confidence: high

Scope-risk: moderate

Directive: Keep UDS auth-token rejection outside feature flags; do not reintroduce inline token fallbacks

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage; bun run test:all; bun run lint; bun run build; bun run build:vite; bun audit; git diff --cached --check

Not-tested: Remote Codecov/CodeRabbit refreshed reports until pushed

* fix: prevent agent communication bounds from hiding CI regressions

Tighten the UDS auth, framing, and response-reader boundaries while keeping the AgentSummary lifecycle covered so Codecov and CI fail on real regressions instead of missing coverage. The poorMode settings mock mirrors unrelated real settings defaults to avoid Bun mock retention changing later permission tests.

Constraint: PR #369 must fix Codecov/CI precisely without warning suppression, fallback masking, or mock pollution

Rejected: Delete AgentSummary lifecycle coverage | would hide Codecov loss and stale-summary behavior

Rejected: Store inline UDS rejection in a hidden input sentinel | cloned observable inputs can drop it and bypass rejection

Rejected: Ignore malformed UDS frames until timeout | leaves client slots and SendMessage calls open to exhaustion

Confidence: high

Scope-risk: moderate

Directive: Keep empty #token= markers rejected; do not require a non-empty token value in hasInlineUdsToken

Tested: bun test packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts src/utils/__tests__/udsMessaging.test.ts src/utils/__tests__/udsResponseReader.test.ts src/utils/__tests__/ndjsonFramer.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run test:all

Tested: bun audit

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted Codecov upload until pushed PR checks rerun

---------

Co-authored-by: unraid <local@unraid.local>
2026-04-27 14:47:18 +08:00
claude-code-best
3cb4828de6 chore: 1.10.4 2026-04-26 21:33:00 +08:00
claude-code-best
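f5c3ee5b5d fix: fix memory leaks in long-running sessions
On /clear, release the large data held in STATE (API requests, classifier requests, model stats);
add a 500-message cap in fullscreen mode to prevent unbounded growth; and fix the progress-message dedup logic
so that interleaved messages no longer accumulate duplicates (observed: 13k+ entries, 1GB+ heap).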

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-26 21:14:00 +08:00
25 changed files with 475 additions and 76 deletions

View File

@@ -55,6 +55,8 @@ ccb update # update to the latest version
CLAUDE_BRIDGE_BASE_URL=https://remote-control.claude-code-best.win/ CLAUDE_BRIDGE_OAUTH_TOKEN=test-my-key ccb --remote-control # we run a self-hosted remote control
```
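> **Install or update failing?** First run `npm rm -g claude-code-best` to remove the old version, then `npm i -g claude-code-best@latest`. If it still fails, pin a version: `npm i -g claude-code-best@<version>`
## ⚡ Quick start (from source)
### ⚙️ Requirements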

File diff suppressed because one or more lines are too long

Image size: 1.6 MiB before, 1.7 MiB after

View File

@@ -1,6 +1,6 @@
{
"name": "claude-code-best",
- "version": "1.10.2",
+ "version": "1.10.5",
"description": "Reverse-engineered Anthropic Claude Code CLI — interactive AI coding assistant in the terminal",
"type": "module",
"author": "claude-code-best <claude-code-best@proton.me>",

View File

@@ -106,6 +106,84 @@ describe("findActualString", () => {
const result = findActualString("hello", "");
expect(result).toBe("");
});
// ── Tab/space normalization (Bug #2 reproduction) ──
test("finds match when search uses spaces but file uses tabs", () => {
// File content uses Tab indentation
const fileContent = "\tif (x) {\n\t\treturn 1;\n\t}";
// User copies from Read output which renders tabs as spaces
const searchWithSpaces = "    if (x) {\n        return 1;\n    }";
const result = findActualString(fileContent, searchWithSpaces);
expect(result).not.toBeNull();
expect(result).toBe(fileContent);
});
test("finds match when search mixes tabs and spaces inconsistently", () => {
const fileContent = "\tconst x = 1; // comment";
const searchMixed = " const x = 1; // comment";
const result = findActualString(fileContent, searchMixed);
expect(result).not.toBeNull();
});
test("finds match for single-line tab-to-space mismatch", () => {
const fileContent = "\t\torder_price = NormalizeDouble(ask, digits);";
const searchSpaces = "        order_price = NormalizeDouble(ask, digits);";
const result = findActualString(fileContent, searchSpaces);
expect(result).not.toBeNull();
});
// ── CJK / UTF-8 characters (Bug #1 reproduction) ──
test("finds match with CJK characters in content", () => {
const fileContent = "input int x = 620; // 止盈点数(点) — 32个pip=320点";
const result = findActualString(fileContent, fileContent);
expect(result).toBe(fileContent);
});
test("finds match with CJK characters when tab/space differs", () => {
const fileContent = "\t// 向上突破 → Sell Limit (逆方向做空)";
const searchSpaces = "    // 向上突破 → Sell Limit (逆方向做空)";
const result = findActualString(fileContent, searchSpaces);
expect(result).not.toBeNull();
expect(result).toBe(fileContent);
});
// ── Multiline with tabs + CJK (combined Bug #1 + #2) ──
test("finds multiline match with tabs and CJK characters", () => {
const fileContent = "\tif(effective_dir == BREAKOUT_UP)\n\t\t{\n\t\t\t// 向上突破\n\t\t}";
const searchSpaces = "    if(effective_dir == BREAKOUT_UP)\n        {\n            // 向上突破\n        }";
const result = findActualString(fileContent, searchSpaces);
expect(result).not.toBeNull();
expect(result).toBe(fileContent);
});
// ── Returned string must be a valid substring of fileContent ──
test("returned string from tab match is a real substring of fileContent", () => {
const fileContent = "prefix\n\t\tindented code\nsuffix";
const searchSpaces = "prefix\n        indented code\nsuffix";
const result = findActualString(fileContent, searchSpaces);
expect(result).not.toBeNull();
expect(fileContent.includes(result!)).toBe(true);
});
test("returned string from partial tab match is a real substring", () => {
const fileContent = "line1\n\tif (x) {\n\t\tdoStuff();\n\t}\nline5";
const searchSpaces = "    if (x) {\n        doStuff();\n    }";
const result = findActualString(fileContent, searchSpaces);
expect(result).not.toBeNull();
expect(fileContent.includes(result!)).toBe(true);
});
test("tab match with mixed indentation levels", () => {
const fileContent = "class Foo {\n\t\tmethod1() {\n\t\t\treturn 42;\n\t\t}\n}";
const searchSpaces = "class Foo {\n        method1() {\n            return 42;\n        }\n}";
const result = findActualString(fileContent, searchSpaces);
expect(result).not.toBeNull();
expect(fileContent.includes(result!)).toBe(true);
});
});
// ─── preserveQuoteStyle ─────────────────────────────────────────────────

View File

@@ -63,9 +63,26 @@ export function stripTrailingWhitespace(str: string): string {
return result
}
/**
* Normalizes whitespace for fuzzy matching by converting tabs to spaces
* and collapsing leading whitespace on each line to a canonical form.
* This handles the case where Read tool output renders tabs as spaces,
* so users copy spaces from the output but the file actually has tabs.
*/
function normalizeWhitespace(str: string): string {
return str.replace(/\t/g, '    ')
}
/**
* Finds the actual string in the file content that matches the search string,
- * accounting for quote normalization
+ * accounting for quote normalization and tab/space differences.
*
* Matching cascade:
* 1. Exact match
* 2. Quote normalization (curly → straight quotes)
* 3. Tab/space normalization (tabs ↔ spaces in leading whitespace)
* 4. Quote + tab/space normalization combined
*
* @param fileContent The file content to search in
* @param searchString The string to search for
* @returns The actual string found in the file, or null if not found
@@ -89,9 +106,92 @@ export function findActualString(
return fileContent.substring(searchIndex, searchIndex + searchString.length)
}
// Try with tab/space normalization — handles the case where Read output
// renders tabs as spaces and the user copies the rendered version
const wsNormalizedFile = normalizeWhitespace(fileContent)
const wsNormalizedSearch = normalizeWhitespace(searchString)
const wsSearchIndex = wsNormalizedFile.indexOf(wsNormalizedSearch)
if (wsSearchIndex !== -1) {
// Map the match position back to the original file content.
// We need to find the corresponding range in the original string.
return mapNormalizedMatchBackToFile(fileContent, wsNormalizedFile, wsSearchIndex, wsNormalizedSearch.length)
}
// Try combined: quote normalization + tab/space normalization
const combinedFile = normalizeWhitespace(normalizedFile)
const combinedSearch = normalizeWhitespace(normalizedSearch)
const combinedIndex = combinedFile.indexOf(combinedSearch)
if (combinedIndex !== -1) {
return mapNormalizedMatchBackToFile(fileContent, combinedFile, combinedIndex, combinedSearch.length)
}
return null
}
/**
* Given a match found in a normalized version of fileContent, map the match
* position back to the original fileContent and extract the corresponding
* substring.
*
* Strategy: walk through both strings character by character, building a
* mapping from normalized offset to original offset. When a tab is expanded
* to 4 spaces in the normalized version, the normalized offset advances by 4
* while the original offset advances by 1.
*/
function mapNormalizedMatchBackToFile(
fileContent: string,
normalizedFile: string,
normalizedStart: number,
normalizedLength: number,
): string {
// Build a sparse mapping from normalized position → original position.
// We only need to map the range [normalizedStart, normalizedStart + normalizedLength].
let normPos = 0
let origPos = 0
let origStart = -1
let origEnd = -1
while (origPos < fileContent.length && normPos <= normalizedStart + normalizedLength) {
if (normPos === normalizedStart) {
origStart = origPos
}
if (normPos === normalizedStart + normalizedLength) {
origEnd = origPos
break
}
const origChar = fileContent[origPos]!
if (origChar === '\t') {
// Tab expands to 4 spaces in normalized version
const nextNormPos = normPos + 4
// If normalizedStart falls within this expanded tab, snap to origPos
if (normPos < normalizedStart && nextNormPos > normalizedStart && origStart === -1) {
origStart = origPos
}
if (normPos < normalizedStart + normalizedLength && nextNormPos > normalizedStart + normalizedLength && origEnd === -1) {
origEnd = origPos + 1
}
normPos = nextNormPos
origPos++
} else {
normPos++
origPos++
}
}
// Fallback: if we couldn't map precisely, use character-count heuristic
if (origStart === -1) origStart = 0
if (origEnd === -1) {
// Approximate: use the ratio of original to normalized length
const ratio = fileContent.length / normalizedFile.length
origEnd = Math.round(origStart + normalizedLength * ratio)
}
return fileContent.substring(origStart, origEnd)
}
/**
* When old_string matched via quote normalization (curly quotes in file,
* straight quotes from model), apply the same curly quote style to new_string
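
One way to avoid an approximate fallback when mapping a normalized match back to the file is to record, while expanding tabs, which original offset produced each normalized character. The sketch below shows that idea under the same 4-space tab assumption as the hunk above. It is an alternative illustration, not the PR's implementation, and the names `expandTabsWithOffsets` and `findWithTabNormalization` are hypothetical.

```typescript
const TAB_WIDTH = 4; // assumed to match the 4-space expansion in the diff

// Expand tabs and remember the original index behind every output character.
function expandTabsWithOffsets(text: string): { normalized: string; toOriginal: number[] } {
  let normalized = "";
  const toOriginal: number[] = [];
  for (let i = 0; i < text.length; i++) {
    if (text[i] === "\t") {
      for (let k = 0; k < TAB_WIDTH; k++) {
        normalized += " ";
        toOriginal.push(i);
      }
    } else {
      normalized += text[i];
      toOriginal.push(i);
    }
  }
  toOriginal.push(text.length); // sentinel so an end offset always resolves
  return { normalized, toOriginal };
}

function findWithTabNormalization(fileContent: string, search: string): string | null {
  const file = expandTabsWithOffsets(fileContent);
  const needle = expandTabsWithOffsets(search).normalized;
  if (needle.length === 0) return null;
  const start = file.normalized.indexOf(needle);
  if (start === -1) return null;
  const end = start + needle.length;
  // A match starting inside an expanded tab maps to that tab's index,
  // so the whole tab is included in the result.
  const origStart = file.toOriginal[start];
  let origEnd = file.toOriginal[end];
  // If the match ends inside an expanded tab, include the whole tab.
  if (file.toOriginal[end - 1] === origEnd) origEnd += 1;
  return fileContent.substring(origStart, origEnd);
}
```

The offset table costs one number per normalized character, so it trades memory for exactness. For very large files a sparse walk like the one in the diff may still be preferable.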

View File

@@ -616,10 +616,7 @@ export const SendMessageTool: Tool<InputSchema, SendMessageToolOutput> =
case 'shutdown_response':
return `shutdown_response ${input.message.approve ? 'approve' : 'reject'} ${input.message.request_id}`
case 'plan_approval_response':
- const planApprovalDecision = input.message.approve
- ? 'approve'
- : 'reject'
- return `plan_approval ${planApprovalDecision} to ${recipient}`
+ return `plan_approval ${input.message.approve ? 'approve' : 'reject'} to ${recipient}`
}
},
@@ -837,10 +834,10 @@ export const SendMessageTool: Tool<InputSchema, SendMessageToolOutput> =
const { postInterClaudeMessage } =
require('src/bridge/peerSessions.js') as typeof import('src/bridge/peerSessions.js')
/* eslint-enable @typescript-eslint/no-require-imports */
- const result = await postInterClaudeMessage(
+ const result = (await postInterClaudeMessage(
  addr.target,
  input.message,
- ) as { ok: boolean; error?: string }
+ )) as { ok: boolean; error?: string }
const preview = input.summary || truncate(input.message, 50)
return {
data: {
@@ -852,6 +849,7 @@ export const SendMessageTool: Tool<InputSchema, SendMessageToolOutput> =
}
}
if (addr.scheme === 'uds') {
const recipient = recipientForDisplay(input.to)
/* eslint-disable @typescript-eslint/no-require-imports */
const { sendToUdsSocket } =
require('src/utils/udsClient.js') as typeof import('src/utils/udsClient.js')
@@ -862,14 +860,14 @@ export const SendMessageTool: Tool<InputSchema, SendMessageToolOutput> =
return {
data: {
success: true,
- message: `${preview}” → ${input.to}`,
+ message: `${preview}” → ${recipient}`,
},
}
} catch (e) {
return {
data: {
success: false,
- message: `Failed to send to ${input.to}: ${errorMessage(e)}`,
+ message: `Failed to send to ${recipient}: ${errorMessage(e)}`,
},
}
}
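
The switch from `input.to` to a display-safe recipient keeps capability tokens out of tool output. The diff shows only the call site, so the sketch below is a guess at the shape of such helpers: the address format `uds:<path>#token=<value>` is an assumption, and `hasInlineToken` and `displayRecipient` are hypothetical stand-ins for the real `hasInlineUdsToken` and `recipientForDisplay`.

```typescript
// Assumed marker; the real address grammar is not shown in this diff.
const TOKEN_MARKER = "#token=";

// True for any inline token marker, including an empty "#token=",
// following the directive that empty markers must still be rejected.
function hasInlineToken(address: string): boolean {
  return address.indexOf(TOKEN_MARKER) !== -1;
}

// Drop the marker and everything after it before the address is echoed
// into a result message or any other observable surface.
function displayRecipient(address: string): string {
  const markerIndex = address.indexOf(TOKEN_MARKER);
  return markerIndex === -1 ? address : address.slice(0, markerIndex);
}
```

Checking for the marker rather than for a non-empty value means a malformed address such as `uds:/tmp/a.sock#token=` is rejected instead of slipping past validation.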

View File

@@ -10,6 +10,10 @@ import {
getOriginalCwd,
getSessionId,
regenerateSessionId,
resetCostState,
setLastAPIRequest,
setLastAPIRequestMessages,
setLastClassifierRequests,
} from '../../bootstrap/state.js'
import type { SDKStatusMessage } from '../../entrypoints/sdk/coreTypes.js'
import {
@@ -144,6 +148,14 @@ export async function clearConversation({
// tracking) is retained so those agents keep functioning.
clearSessionCaches(preservedAgentIds)
// Clear large STATE-held data that outlives the message array.
// lastAPIRequestMessages can hold the full post-compaction conversation
// (hundreds of KB to MB) for /share; resetCostState clears modelUsage.
setLastAPIRequest(null)
setLastAPIRequestMessages(null)
setLastClassifierRequests(null)
resetCostState()
setCwd(getOriginalCwd())
readFileState.clear()
discoveredSkillNames?.clear()

View File

@@ -77,6 +77,8 @@ export type Props = {
lastThinkingBlockId?: string | null
/** UUID of the latest user bash output message (for auto-expanding) */
latestBashOutputUUID?: string | null
/** Whether to collapse diff display for this message */
shouldCollapseDiffs?: boolean
}
function MessageImpl({
@@ -99,6 +101,7 @@ function MessageImpl({
isUserContinuation = false,
lastThinkingBlockId,
latestBashOutputUUID,
shouldCollapseDiffs,
}: Props): React.ReactNode {
switch (message.type) {
case 'attachment':
@@ -181,6 +184,7 @@ function MessageImpl({
isUserContinuation={isUserContinuation}
lookups={lookups}
isTranscriptMode={isTranscriptMode}
shouldCollapseDiffs={shouldCollapseDiffs}
/>
))}
</Box>
@@ -293,6 +297,7 @@ function UserMessage({
isUserContinuation,
lookups,
isTranscriptMode,
shouldCollapseDiffs,
}: {
message: NormalizedUserMessage
addMargin: boolean
@@ -309,6 +314,7 @@ function UserMessage({
isUserContinuation: boolean
lookups: ReturnType<typeof buildMessageLookups>
isTranscriptMode: boolean
shouldCollapseDiffs?: boolean
}): React.ReactNode {
const { columns } = useTerminalSize()
switch (param.type) {
@@ -344,6 +350,7 @@ function UserMessage({
verbose={verbose}
width={columns - 5}
isTranscriptMode={isTranscriptMode}
shouldCollapseDiffs={shouldCollapseDiffs}
/>
)
default:

View File

@@ -55,6 +55,7 @@ export type Props = {
columns: number
isLoading: boolean
lookups: ReturnType<typeof buildMessageLookups>
shouldCollapseDiffs?: boolean
}
/**
@@ -141,6 +142,7 @@ function MessageRowImpl({
columns,
isLoading,
lookups,
shouldCollapseDiffs,
}: Props): React.ReactNode {
const isTranscriptMode = screen === 'transcript'
const isGrouped = msg.type === 'grouped_tool_use'
@@ -221,6 +223,7 @@ function MessageRowImpl({
isUserContinuation={isUserContinuation}
lastThinkingBlockId={lastThinkingBlockId}
latestBashOutputUUID={latestBashOutputUUID}
shouldCollapseDiffs={shouldCollapseDiffs}
/>
)
// OffscreenFreeze: the outer React.memo already bails for static messages,

View File

@@ -814,6 +814,12 @@ const MessagesImpl = ({
streamingToolUseIDs,
))
// Collapse diffs for messages beyond the latest N messages.
// verbose (ctrl+o) overrides and always shows full diffs.
const DIFF_COLLAPSE_DISTANCE = 0
const shouldCollapseDiffs =
renderableMessages.length - 1 - index > DIFF_COLLAPSE_DISTANCE
const k = messageKey(msg)
const row = (
<MessageRow
@@ -838,6 +844,7 @@ const MessagesImpl = ({
columns={columns}
isLoading={isLoading}
lookups={lookups}
shouldCollapseDiffs={shouldCollapseDiffs}
/>
)
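
The collapse rule in this hunk, combined with the verbose override applied further down the component tree, can be read as two small pure functions. This is a sketch for illustration only: the function names are hypothetical and the `style` type is assumed to be a plain optional string.

```typescript
// With a distance of 0, only the last renderable message keeps full diffs.
function shouldCollapseDiffsAt(index: number, total: number, collapseDistance: number): boolean {
  return total - 1 - index > collapseDistance;
}

// Verbose mode (ctrl+o in the source comments) always wins over collapsing.
function resolveStyle(
  shouldCollapseDiffs: boolean | undefined,
  verbose: boolean,
  style: string | undefined,
): string | undefined {
  return shouldCollapseDiffs && !verbose ? "condensed" : style;
}
```

For three messages and a distance of 0, indices 0 and 1 collapse and index 2 does not. Raising the distance to 1 would also keep index 1 expanded.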

View File

@@ -27,6 +27,7 @@ type Props = {
verbose: boolean
width: number | string
isTranscriptMode?: boolean
shouldCollapseDiffs?: boolean
}
export function UserToolResultMessage({
@@ -39,6 +40,7 @@ export function UserToolResultMessage({
verbose,
width,
isTranscriptMode,
shouldCollapseDiffs,
}: Props): React.ReactNode {
const toolUse = useGetToolFromMessages(param.tool_use_id, tools, lookups)
if (!toolUse) {
@@ -96,6 +98,7 @@ export function UserToolResultMessage({
verbose={verbose}
width={width}
isTranscriptMode={isTranscriptMode}
shouldCollapseDiffs={shouldCollapseDiffs}
/>
)
}

View File

@@ -33,6 +33,7 @@ type Props = {
verbose: boolean
width: number | string
isTranscriptMode?: boolean
shouldCollapseDiffs?: boolean
}
export function UserToolSuccessMessage({
@@ -46,6 +47,7 @@ export function UserToolSuccessMessage({
verbose,
width,
isTranscriptMode,
shouldCollapseDiffs,
}: Props): React.ReactNode {
const [theme] = useTheme()
// Hook stays inside feature() ternary so external builds don't pay a
@@ -83,12 +85,16 @@ export function UserToolSuccessMessage({
}
const toolResult = parsedOutput?.data ?? message.toolUseResult
// Collapse diff display for old messages (verbose/ctrl+o overrides)
const effectiveStyle =
shouldCollapseDiffs && !verbose ? 'condensed' : style
const renderedMessage =
tool.renderToolResultMessage?.(
toolResult as never,
filterToolProgressMessages(progressMessagesForMessage),
{
- style,
+ style: effectiveStyle,
theme,
tools,
verbose,

View File

@@ -6907,6 +6907,9 @@ async function logTenguInit({
allowDangerouslySkipPermissionsPassed,
thinkingType:
thinkingConfig.type as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
...(thinkingConfig.type === "enabled" && {
thinkingBudgetTokens: thinkingConfig.budgetTokens,
}),
...(systemPromptFlag && {
systemPromptFlag:
systemPromptFlag as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
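
The conditional-spread pattern used here for `thinkingBudgetTokens`, together with the snake_case compatibility mentioned in the commit message, might look like the sketch below. The `ThinkingLike` type and the function name are assumptions made for this example, since the real `ThinkingConfig` type is not shown in the diff.

```typescript
// Assumed loose shape covering both camelCase and snake_case budgets.
type ThinkingLike = {
  type: string;
  budgetTokens?: number;
  budget_tokens?: number;
};

function thinkingAnalyticsFields(
  thinking: ThinkingLike,
): { thinkingType: string; thinkingBudgetTokens?: number } {
  // Accept either spelling of the budget field.
  const budget = thinking.budgetTokens ?? thinking.budget_tokens;
  return {
    thinkingType: thinking.type,
    // Spreading `false` adds nothing, so the key is absent (not undefined)
    // unless thinking is enabled and a budget is present.
    ...(thinking.type === "enabled" &&
      budget !== undefined && { thinkingBudgetTokens: budget }),
  };
}
```

Leaving the key out entirely, rather than setting it to `undefined`, keeps the logged payload free of empty fields for disabled thinking.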

View File

@@ -3051,12 +3051,22 @@ export function REPL({
// are O(n) per render, so drop everything before the previous
// boundary to keep n bounded across multi-day sessions.
if (isFullscreenEnvEnabled()) {
- setMessages(old => [
- ...getMessagesAfterCompactBoundary(old, {
+ setMessages(old => {
+ const postBoundary = getMessagesAfterCompactBoundary(old, {
  includeSnipped: true,
- }),
- newMessage,
- ]);
+ })
+ // Hard cap: keep at most 500 messages in fullscreen scrollback
+ // to prevent unbounded memory growth in multi-day sessions.
+ // normalizeMessages/applyGrouping are O(n), and Ink fiber
+ // trees cost ~250KB RSS per message. Without this cap,
+ // scrollback after several compactions can reach thousands
+ // of messages (observed: 13k+, 1GB+ heap).
+ const MAX_FULLSCREEN_SCROLLBACK = 500
+ const kept = postBoundary.length > MAX_FULLSCREEN_SCROLLBACK
+ ? postBoundary.slice(-MAX_FULLSCREEN_SCROLLBACK)
+ : postBoundary
+ return [...kept, newMessage]
+ });
} else {
setMessages(() => [newMessage]);
}
@@ -3082,17 +3092,23 @@ export function REPL({
// history). Replacing those leaves the AgentTool UI stuck at
// "Initializing…" because it renders the full progress trail.
setMessages(oldMessages => {
- const last = oldMessages.at(-1);
- const lastData = last?.data as Record<string, unknown> | undefined;
  const newData = newMessage.data as Record<string, unknown>;
- if (
- last?.type === 'progress' &&
- last.parentToolUseID === newMessage.parentToolUseID &&
- lastData?.type === newData.type
- ) {
- const copy = oldMessages.slice();
- copy[copy.length - 1] = newMessage;
- return copy;
+ // Scan backwards to find the last ephemeral progress with matching
+ // parentToolUseID and type. Previously only checked the last message,
+ // so interleaved non-ephemeral messages caused duplicate progress
+ // entries to accumulate (observed 13k+ entries in sleep-heavy sessions).
+ for (let i = oldMessages.length - 1; i >= 0; i--) {
+ const m = oldMessages[i]!
+ if (m.type !== 'progress') break
+ const mData = m.data as Record<string, unknown> | undefined
+ if (
+ m.parentToolUseID === newMessage.parentToolUseID &&
+ mData?.type === newData.type
+ ) {
+ const copy = oldMessages.slice();
+ copy[i] = newMessage;
+ return copy;
+ }
}
return [...oldMessages, newMessage];
});
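
The two bounds introduced in this file (replace-in-place for repeated progress messages, and a hard cap on scrollback) can be isolated as pure functions for reasoning or testing. This is a simplified sketch: the `Msg` type is a reduced assumption about the message shape, and both function names are hypothetical.

```typescript
// Reduced, assumed message shape for illustration.
type Msg = { type: string; parentToolUseID?: string; data?: { type?: string } };

// Replace the matching entry in the trailing run of progress messages,
// or append when no match is found before the first non-progress message.
function upsertProgress(messages: Msg[], incoming: Msg): Msg[] {
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i];
    if (m.type !== "progress") break;
    if (
      m.parentToolUseID === incoming.parentToolUseID &&
      m.data?.type === incoming.data?.type
    ) {
      const copy = messages.slice();
      copy[i] = incoming;
      return copy;
    }
  }
  return [...messages, incoming];
}

// Keep only the newest `max` entries; `max` is assumed to be positive.
function capScrollback<T>(messages: T[], max: number): T[] {
  return messages.length > max ? messages.slice(-max) : messages;
}
```

Note that the scan stops at the first non-progress message, so it deduplicates progress entries from different tools that interleave with each other at the tail, and it still appends once any other message type has been added after them.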

View File

@@ -33,6 +33,8 @@ describe('startAgentSummarization', () => {
let debugLogs: string[]
let loggedErrors: Error[]
let clearedHandles: unknown[]
let scheduledCount: number
let lastTimerHandle: unknown
function startTestSummarization(
dependencies: AgentSummaryDependencies = {},
@@ -81,8 +83,10 @@ describe('startAgentSummarization', () => {
if (typeof callback !== 'function') {
throw new Error('Expected timer callback')
}
scheduledCount += 1
scheduled = callback as () => void | Promise<void>
- return 1 as unknown as ReturnType<typeof setTimeout>
+ lastTimerHandle = { id: scheduledCount }
+ return lastTimerHandle as ReturnType<typeof setTimeout>
}) as unknown as typeof setTimeout,
updateAgentSummary: (taskId: string, summary: string) => {
updateCalls.push({ taskId, summary })
@@ -101,6 +105,8 @@ describe('startAgentSummarization', () => {
debugLogs = []
loggedErrors = []
clearedHandles = []
scheduledCount = 0
lastTimerHandle = undefined
})
test('summarizes bounded transcript once and skips unchanged fingerprints', async () => {
@@ -128,6 +134,7 @@ describe('startAgentSummarization', () => {
expect(forkCalls).toHaveLength(1)
expect(updateCalls).toHaveLength(1)
expect(loggedErrors).toEqual([])
})
test('skips summarization when filtering leaves too little bounded context', async () => {
@@ -175,6 +182,8 @@ describe('startAgentSummarization', () => {
})
expect(typeof scheduled).toBe('function')
const initialScheduledCount = scheduledCount
const initialTimerHandle = lastTimerHandle
await scheduled!()
expect(forkCalls).toEqual([])
@@ -182,9 +191,11 @@ describe('startAgentSummarization', () => {
expect(debugLogs).toContain(
'[AgentSummary] Skipping summary — poor mode active',
)
expect(scheduledCount).toBe(initialScheduledCount + 1)
expect(lastTimerHandle).not.toBe(initialTimerHandle)
})
-test('logs summary errors and keeps the next timer owned by the summarizer', async () => {
+test('logs summary errors and schedules the next timer', async () => {
const error = new Error('fork failed')
handle = startTestSummarization({
runForkedAgent: async () => {
@@ -193,20 +204,25 @@ describe('startAgentSummarization', () => {
})
expect(typeof scheduled).toBe('function')
const initialScheduledCount = scheduledCount
const initialTimerHandle = lastTimerHandle
await scheduled!()
expect(loggedErrors).toEqual([error])
expect(updateCalls).toEqual([])
expect(scheduledCount).toBe(initialScheduledCount + 1)
expect(lastTimerHandle).not.toBe(initialTimerHandle)
})
test('stop clears the pending summary timer', () => {
handle = startTestSummarization()
const pendingHandle = lastTimerHandle
handle.stop()
expect(debugLogs).toContain(
'[AgentSummary] Stopping summarization for task-1',
)
-expect(clearedHandles).toEqual([1])
+expect(clearedHandles).toEqual([pendingHandle])
})
})
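The test changes above replace a constant timer handle (`1`) with a fresh object per scheduling call, so a test can tell by identity whether a new timer was scheduled. A minimal sketch of that fake-timer pattern, assuming hypothetical names (`createFakeTimers`, `schedule`, `clear`, `runPending`) that are not part of the test file:

```typescript
// Handle shape modeled on the `{ id: scheduledCount }` object in the test.
type TimerHandle = { id: number }

function createFakeTimers() {
  let scheduledCount = 0
  let lastHandle: TimerHandle | undefined
  let pending: (() => void) | undefined
  const cleared: TimerHandle[] = []

  return {
    // Records the callback instead of waiting, and returns a fresh object
    // so every scheduled timer is distinguishable by identity.
    schedule(callback: () => void): TimerHandle {
      scheduledCount += 1
      pending = callback
      lastHandle = { id: scheduledCount }
      return lastHandle
    },
    // Remembers which handles were cleared so a test can assert on them.
    clear(handle: TimerHandle): void {
      cleared.push(handle)
    },
    // Runs the most recently scheduled callback once, if any.
    runPending(): void {
      const callback = pending
      pending = undefined
      callback?.()
    },
    getScheduledCount: () => scheduledCount,
    getLastHandle: () => lastHandle,
    cleared,
  }
}
```

With a constant handle, an assertion such as "the handle changed after the callback ran" could never fail, which is presumably why the tests moved to per-call objects.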

@@ -1776,6 +1776,10 @@ async function* queryModel(
// captures only primitives instead of paramsFromContext's full closure scope
// (messagesForAPI, system, allTools, betas — the entire request-building
// context), which would otherwise be pinned until the promise resolves.
// Also capture thinking params for Langfuse observability.
// Pass the entire thinking config object so all fields (type, budget_tokens,
// and any future additions) flow through without cherry-picking.
let langfuseThinking: BetaMessageStreamParams['thinking'] | undefined
{
const queryParams = paramsFromContext({
model: options.model,
@@ -1783,8 +1787,10 @@ async function* queryModel(
})
const logMessagesLength = queryParams.messages.length
const logBetas = useBetas ? (queryParams.betas ?? []) : []
-const logThinkingType = queryParams.thinking?.type ?? 'disabled'
 const logEffortValue = queryParams.output_config?.effort
+if (queryParams.thinking && queryParams.thinking.type !== 'disabled') {
+langfuseThinking = queryParams.thinking
+}
void options.getToolPermissionContext().then(permissionContext => {
logAPIQuery({
model: options.model,
@@ -1794,7 +1800,7 @@ async function* queryModel(
permissionMode: permissionContext.mode,
querySource: options.querySource,
queryTracking: options.queryTracking,
-thinkingType: logThinkingType,
+thinkingConfig,
effortValue: logEffortValue,
fastMode: isFastMode,
previousRequestId,
@@ -2545,6 +2551,9 @@ async function* queryModel(
maxOutputTokens,
thinkingType:
thinkingConfig.type as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
...(thinkingConfig.type === 'enabled' && {
thinkingBudgetTokens: thinkingConfig.budgetTokens,
}),
fallback_disabled: true,
request_id: (streamRequestId ??
'unknown') as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
@@ -2577,6 +2586,9 @@ async function* queryModel(
maxOutputTokens,
thinkingType:
thinkingConfig.type as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
...(thinkingConfig.type === 'enabled' && {
thinkingBudgetTokens: thinkingConfig.budgetTokens,
}),
fallback_disabled: false,
request_id: (streamRequestId ??
'unknown') as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
@@ -2693,6 +2705,9 @@ async function* queryModel(
maxOutputTokens,
thinkingType:
thinkingConfig.type as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
...(thinkingConfig.type === 'enabled' && {
thinkingBudgetTokens: thinkingConfig.budgetTokens,
}),
request_id:
failedRequestId as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
fallback_cause:
@@ -2925,6 +2940,7 @@ async function* queryModel(
endTime: new Date(),
completionStartTime: ttftMs > 0 ? new Date(start + ttftMs) : undefined,
tools: convertToolsToLangfuse(toolSchemas as unknown[]),
thinking: langfuseThinking,
})
void options.getToolPermissionContext().then(permissionContext => {

@@ -193,6 +193,15 @@ export async function* queryModelGemini(
endTime: new Date(),
completionStartTime: ttftMs > 0 ? new Date(start + ttftMs) : undefined,
tools: convertToolsToLangfuse(toolSchemas as unknown[]),
thinking:
thinkingConfig.type !== 'disabled'
? {
type: thinkingConfig.type,
...(thinkingConfig.type === 'enabled' && {
budgetTokens: thinkingConfig.budgetTokens,
}),
}
: undefined,
})
} catch (error) {
const errorMessage = error instanceof Error ? error.message : String(error)
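The `thinking` value passed to the Langfuse observation in this hunk can be expressed as a small pure function. This is a sketch under stated assumptions: the `ThinkingConfig` union below is inferred from the fields the diff reads and may not match the real type in `src/utils/thinking.js`, and `toLangfuseThinking` is a hypothetical name.

```typescript
// Assumed shape, inferred from the fields accessed in the diff.
type ThinkingConfig =
  | { type: 'adaptive' }
  | { type: 'enabled'; budgetTokens: number }
  | { type: 'disabled' }

type LangfuseThinking = { type: string; budgetTokens?: number }

// Disabled thinking is reported as `undefined` so the key can be omitted;
// the budget is attached only when thinking is explicitly enabled.
function toLangfuseThinking(
  config: ThinkingConfig,
): LangfuseThinking | undefined {
  if (config.type === 'disabled') return undefined
  return {
    type: config.type,
    ...(config.type === 'enabled' && { budgetTokens: config.budgetTokens }),
  }
}
```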

@@ -23,6 +23,7 @@ import { getAPIProviderForStatsig } from 'src/utils/model/providers.js'
import type { PermissionMode } from 'src/utils/permissions/PermissionMode.js'
import { jsonStringify } from 'src/utils/slowOperations.js'
import { logOTelEvent } from 'src/utils/telemetry/events.js'
import type { ThinkingConfig } from 'src/utils/thinking.js'
import {
endLLMRequestSpan,
isBetaTracingEnabled,
@@ -176,7 +177,7 @@ export function logAPIQuery({
permissionMode,
querySource,
queryTracking,
-thinkingType,
+thinkingConfig,
effortValue,
fastMode,
previousRequestId,
@@ -188,11 +189,13 @@ export function logAPIQuery({
permissionMode?: PermissionMode
querySource: string
queryTracking?: QueryChainTracking
-thinkingType?: 'adaptive' | 'enabled' | 'disabled'
+thinkingConfig?: ThinkingConfig
effortValue?: EffortLevel | null
fastMode?: boolean
previousRequestId?: string | null
}): void {
const thinkingType = thinkingConfig?.type ?? 'disabled'
const thinkingBudgetTokens = thinkingConfig?.type === 'enabled' ? thinkingConfig.budgetTokens : undefined
logEvent('tengu_api_query', {
model: model as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
messagesLength,
@@ -219,6 +222,9 @@ export function logAPIQuery({
: {}),
thinkingType:
thinkingType as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
...(thinkingBudgetTokens !== undefined && {
thinkingBudgetTokens,
}),
effortValue:
effortValue as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
fastMode,
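The derivation added to `logAPIQuery` above (default the type to `'disabled'`, emit the budget only for `'enabled'`) can be sketched on its own. The `ThinkingConfig` union is an assumption inferred from this diff, and `thinkingAnalyticsFields` is a hypothetical helper name, not a function in the codebase.

```typescript
// Assumed shape; the real ThinkingConfig is imported from src/utils/thinking.js.
type ThinkingConfig =
  | { type: 'adaptive' }
  | { type: 'enabled'; budgetTokens: number }
  | { type: 'disabled' }

type ThinkingAnalyticsFields = {
  thinkingType: ThinkingConfig['type']
  thinkingBudgetTokens?: number
}

function thinkingAnalyticsFields(
  thinkingConfig?: ThinkingConfig,
): ThinkingAnalyticsFields {
  // A missing config is logged the same way as an explicit 'disabled'.
  const thinkingType = thinkingConfig?.type ?? 'disabled'
  const thinkingBudgetTokens =
    thinkingConfig?.type === 'enabled' ? thinkingConfig.budgetTokens : undefined
  return {
    thinkingType,
    // Conditional spread: the key is absent, not undefined, when no budget applies.
    ...(thinkingBudgetTokens !== undefined && { thinkingBudgetTokens }),
  }
}
```

Omitting the key entirely, instead of logging `undefined`, keeps events for adaptive and disabled requests free of a meaningless budget column.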

@@ -418,6 +418,7 @@ export async function* queryModelOpenAI(
endTime: new Date(),
completionStartTime: ttftMs > 0 ? new Date(start + ttftMs) : undefined,
tools: convertToolsToLangfuse(toolSchemas as unknown[]),
...(enableThinking && { thinking: { type: 'enabled' } }),
})
// Safety: if stream ended without message_stop, assemble and yield whatever we have

@@ -78,6 +78,16 @@ export function recordLLMObservation(
endTime?: Date
completionStartTime?: Date
tools?: unknown
/** Thinking depth configuration used for this request.
* Accepts the full API thinking config object. Fields:
* - type: thinking mode ("enabled", "adaptive", "disabled")
* - budget_tokens (snake_case, from Anthropic API) or budgetTokens (camelCase)
*/
thinking?: {
type: string
budget_tokens?: number
budgetTokens?: number
}
},
): void {
if (!rootSpan || !isLangfuseEnabled()) return
@@ -97,6 +107,7 @@ export function recordLLMObservation(
metadata: {
provider: params.provider,
model: params.model,
...(params.thinking && { thinking: params.thinking }),
},
...(params.completionStartTime && { completionStartTime: params.completionStartTime }),
},
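Because the new `thinking` parameter accepts either `budget_tokens` or `budgetTokens`, anything that reads the recorded metadata has to tolerate both spellings. A minimal sketch of that, assuming hypothetical helper names (`observedBudget`, `buildMetadata`) and placeholder provider and model strings:

```typescript
// Mirrors the optional `thinking` parameter added in this hunk.
type ObservedThinking = {
  type: string
  budget_tokens?: number
  budgetTokens?: number
}

// Hypothetical helper: read the budget regardless of which casing the caller
// used. camelCase wins if both are present (an arbitrary choice in this sketch).
function observedBudget(thinking?: ObservedThinking): number | undefined {
  return thinking?.budgetTokens ?? thinking?.budget_tokens
}

// Builds metadata the way the hunk does: the `thinking` key is omitted
// entirely when no thinking config was supplied.
function buildMetadata(
  provider: string,
  model: string,
  thinking?: ObservedThinking,
) {
  return { provider, model, ...(thinking && { thinking }) }
}
```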

@@ -354,6 +354,7 @@ export async function countTokensViaHaikuFallback(
},
startTime: new Date(apiStart),
endTime: new Date(),
...(containsThinking && { thinking: { type: 'enabled', budgetTokens: TOKEN_COUNT_THINKING_BUDGET } }),
})
endTrace(langfuseTrace)

@@ -1,9 +1,10 @@
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
-import { mkdir, readFile, rm, writeFile } from 'node:fs/promises'
+import { mkdir, readFile, rm, stat, writeFile } from 'node:fs/promises'
import { mkdtempSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { dirname, join } from 'node:path'
import type { Message } from 'src/types/message.js'
import { getErrnoCode } from 'src/utils/errors.js'
import {
compactMailboxMessages,
getLastPeerDmSummary,
@@ -346,17 +347,26 @@ describe('teammate mailbox retention', () => {
const inboxPath = getInboxPath('worker', 'alpha')
await mkdir(inboxPath, { recursive: true })
-await expect(
-writeToMailbox(
-'worker',
-{
-from: 'team-lead',
-text: 'new',
-timestamp: new Date(5).toISOString(),
-},
-'alpha',
-),
-).rejects.toThrow()
+const error = await writeToMailbox(
+'worker',
+{
+from: 'team-lead',
+text: 'new',
+timestamp: new Date(5).toISOString(),
+},
+'alpha',
+).then(
+() => undefined,
+err => err,
+)
+const code = getErrnoCode(error)
+expect(code).toBeDefined()
+if (code === undefined) {
+throw new Error('Expected filesystem errno code')
+}
+expect(['EISDIR', 'EPERM', 'EACCES']).toContain(code)
+expect((await stat(inboxPath)).isDirectory()).toBe(true)
})
test('readMailbox fails closed on corrupt mailbox content', async () => {
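The mailbox test above narrows the captured rejection to a filesystem errno code and accepts any of three codes, since writing to a path that is a directory can fail differently across platforms. The diff does not show `getErrnoCode`, so the following is only a sketch of what such a check might look like; `errnoCodeOf` and `isDirectoryWriteFailure` are hypothetical names, and the real helper in `src/utils/errors.js` may behave differently.

```typescript
// Hypothetical stand-in for getErrnoCode: returns the string `code` property
// of an error-like value, or undefined if there is none.
function errnoCodeOf(error: unknown): string | undefined {
  if (typeof error !== 'object' || error === null) return undefined
  const code = (error as { code?: unknown }).code
  return typeof code === 'string' ? code : undefined
}

// The codes the test accepts when the inbox path is a directory.
const DIRECTORY_WRITE_CODES = ['EISDIR', 'EPERM', 'EACCES']

function isDirectoryWriteFailure(error: unknown): boolean {
  const code = errnoCodeOf(error)
  return code !== undefined && DIRECTORY_WRITE_CODES.indexOf(code) !== -1
}
```

Checking the code, rather than only asserting that the call rejects, ties the test to the intended failure mode, so an unrelated exception would not pass it.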

@@ -11,7 +11,7 @@ import {
writeFile,
} from 'node:fs/promises'
import { createHash } from 'node:crypto'
-import { createConnection, createServer } from 'node:net'
+import { createConnection, createServer, type Socket } from 'node:net'
import { dirname, join } from 'node:path'
import { tmpdir } from 'node:os'
import {
@@ -227,11 +227,78 @@ describe('UDS inbox retention', () => {
JSON.stringify({ socketPath: path, authToken: 'test-token' }),
'utf-8',
)
-const { sendToUdsSocket } = await import('../udsClient.js')
-await expect(sendToUdsSocket(path, 'hello')).rejects.toThrow(
-'Failed to connect to peer',
+const { sendToUdsSocket, UdsPeerConnectionError } = await import(
+'../udsClient.js'
)
const error = await sendToUdsSocket(path, 'hello').then(
() => undefined,
err => err,
)
expect(error).toBeInstanceOf(UdsPeerConnectionError)
if (!(error instanceof UdsPeerConnectionError)) {
throw new Error('Expected UDS peer connection error')
}
expect(error.socketPath).toBe(path)
expect(error.message).not.toContain('test-token')
})
test('udsClient send reports response timeouts as peer connection errors', async () => {
const path = socketPath('uds-client-timeout')
const capabilityDir = join(tempConfigDir, 'messaging-capabilities')
const capabilityName = `${createHash('sha256').update(path).digest('hex')}.json`
await mkdir(capabilityDir, { recursive: true, mode: 0o700 })
await writeFile(
join(capabilityDir, capabilityName),
JSON.stringify({ socketPath: path, authToken: 'test-token' }),
'utf-8',
)
if (process.platform !== 'win32') {
await mkdir(dirname(path), { recursive: true })
}
const sockets = new Set<Socket>()
const receiver = createServer(socket => {
sockets.add(socket)
socket.on('close', () => {
sockets.delete(socket)
})
socket.on('data', () => undefined)
})
await new Promise<void>((resolve, reject) => {
receiver.on('error', reject)
receiver.listen(path, () => resolve())
})
try {
const { sendToUdsSocket, UdsPeerConnectionError } = await import(
'../udsClient.js'
)
const error = await sendToUdsSocket(path, 'hello', 50).then(
() => undefined,
err => err,
)
expect(error).toBeInstanceOf(UdsPeerConnectionError)
if (!(error instanceof UdsPeerConnectionError)) {
throw new Error('Expected UDS peer connection timeout error')
}
expect(error.socketPath).toBe(path)
expect(error.cause).toBeInstanceOf(Error)
if (!(error.cause instanceof Error)) {
throw new Error('Expected timeout cause')
}
expect(error.cause.message).toBe('Connection timed out')
expect(error.message).not.toContain('test-token')
} finally {
for (const socket of sockets) {
socket.destroy()
}
await closeServer(receiver)
if (process.platform !== 'win32') {
await unlink(path).catch(() => undefined)
}
}
})
test('sendUdsMessage fails closed before connecting without an auth token', async () => {

@@ -294,6 +294,12 @@ export async function sideQuery(opts: SideQueryOptions): Promise<BetaMessage> {
startTime: new Date(start),
endTime: new Date(),
...(tools && { tools: convertToolsToLangfuse(tools as unknown[]) }),
...(thinkingConfig && thinkingConfig.type !== 'disabled' && {
thinking: {
type: thinkingConfig.type,
...(thinkingConfig.type === 'enabled' && { budgetTokens: thinkingConfig.budget_tokens }),
},
}),
})
endTrace(langfuseTrace)

@@ -36,6 +36,19 @@ export type PeerSession = {
alive: boolean
}
export class UdsPeerConnectionError extends Error {
readonly socketPath: string
constructor(socketPath: string, cause: unknown) {
super(
`Failed to connect to peer at ${socketPath}: ${errorMessage(cause)}`,
{ cause },
)
this.name = 'UdsPeerConnectionError'
this.socketPath = socketPath
}
}
// ---------------------------------------------------------------------------
// Session directory
// ---------------------------------------------------------------------------
@@ -193,6 +206,7 @@ export async function isPeerAlive(
export async function sendToUdsSocket(
targetSocketPath: string,
message: string | Record<string, unknown>,
timeoutMs = 5000,
): Promise<void> {
const { parseUdsTarget } = await import('./udsMessaging.js')
const target = parseUdsTarget(targetSocketPath)
@@ -237,12 +251,15 @@ export async function sendToUdsSocket(
maxFrameBytes: MAX_UDS_FRAME_BYTES,
onSettled: finish,
formatSocketError: err =>
-new Error(
-`Failed to connect to peer at ${target.socketPath}: ${errorMessage(err)}`,
-),
+new UdsPeerConnectionError(target.socketPath, err),
 })
-conn.setTimeout(5000, () => {
-finish(new Error('Connection timed out'))
+conn.setTimeout(timeoutMs, () => {
+finish(
+new UdsPeerConnectionError(
+target.socketPath,
+new Error('Connection timed out'),
+),
+)
})
})
}
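The change above routes both socket errors and timeouts through one typed error, so callers can branch on `instanceof` and read `socketPath` instead of matching message text. A minimal sketch of the pattern follows. It is not the code from this diff: `PeerConnectionError`, `describeError`, and `timeoutError` are hypothetical names, and the sketch assigns `cause` as a field rather than passing `{ cause }` to `super()` as the diff does, so that it does not depend on ES2022 `Error` options.

```typescript
// Hypothetical stand-in for the errorMessage helper used in the diff.
function describeError(value: unknown): string {
  return value instanceof Error ? value.message : String(value)
}

class PeerConnectionError extends Error {
  readonly socketPath: string
  // Assigned directly here; the diff uses the `{ cause }` constructor option.
  readonly cause: unknown

  constructor(socketPath: string, cause: unknown) {
    super(`Failed to connect to peer at ${socketPath}: ${describeError(cause)}`)
    this.name = 'PeerConnectionError'
    this.socketPath = socketPath
    this.cause = cause
  }
}

// Wraps a timeout the same way socket errors are wrapped, so callers handle
// one error type for both failure modes and can inspect `cause` to tell them apart.
function timeoutError(socketPath: string): PeerConnectionError {
  return new PeerConnectionError(socketPath, new Error('Connection timed out'))
}
```

The message is built only from the socket path and the underlying error text, which matches the tests' expectation that the auth token never appears in it.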