mirror of
https://github.com/claude-code-best/claude-code.git
synced 2026-06-18 06:15:51 +00:00
test: keep Codecov coverage on real agent communication paths (#374)
* test: keep Codecov coverage on real agent communication paths PR #369 was merged before the final Codecov coverage fix landed, so this follow-up carries only the incremental real-path tests needed on top of main. The tests exercise AgentSummary lifecycle branches, mailbox fail-closed behavior, UDS client connection failure through a real capability file, and UDS response-reader framing without mock.module, warning suppression, feature fallback, or production-code churn. Constraint: PR #369 is already merged; this branch must contain only the incremental Codecov repair on top of latest main Rejected: Reopen or keep pushing the merged PR branch | merged PR refs do not update and would leave Codecov stale Rejected: Mock bun:bundle or hide warnings | would reintroduce cross-test pollution and pseudo coverage Rejected: Keep unrelated SendMessageTool production diff | it created avoidable patch-coverage debt without improving the runtime path Confidence: high Scope-risk: narrow Directive: Keep these coverage tests on real paths; do not replace them with output suppression or feature-flag mocks Tested: bunx tsc --noEmit --pretty false Tested: bun run lint Tested: bun test src\utils\__tests__\teammateMailbox.test.ts Tested: bun test src\services\AgentSummary\__tests__\agentSummary.test.ts src\services\AgentSummary\__tests__\summaryContext.test.ts src\utils\__tests__\teammateMailbox.test.ts src\utils\__tests__\udsMessaging.test.ts src\utils\__tests__\udsResponseReader.test.ts packages\builtin-tools\src\tools\SendMessageTool\__tests__\udsRecipientSanitization.test.ts Tested: bun run test:all Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage Tested: bun run build Tested: bun run build:vite Tested: bun audit Tested: git diff --check Tested: Claude simplify review GO (.omx/artifacts/claude-simplify-codecov-20260427-1521.md) Tested: Claude security review GO (.omx/artifacts/claude-security-codecov-20260427-1522.md) Not-tested: GitHub-hosted Codecov upload after this amended commit until PR checks rerun * test: keep review assertions tied to real failure paths CodeRabbit flagged three non-blocking but valid review gaps: platform-specific mailbox errno checks, brittle UDS connection-failure message assertions, and missing AgentSummary reschedule proof after fork errors. This keeps the fixes narrow by tightening the affected assertions and adding a structured UDS connection error for tests to assert behavior instead of prose. Constraint: PR #374 is a review follow-up and must not hide warnings, skip tests, or merge the PR. Rejected: Matching the UDS failure message literal | preserves the brittle coupling CodeRabbit flagged. Rejected: Asserting only that mailbox writes throw | would allow unrelated pre-path failures to pass. Confidence: high Scope-risk: narrow Directive: Keep UDS connection-failure tests on structured error data, not display wording. Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/teammateMailbox.test.ts src/utils/__tests__/udsMessaging.test.ts Tested: bunx tsc --noEmit --pretty false Tested: bun run lint Tested: bun run test:all Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage Tested: bun run build Tested: bun run build:vite Not-tested: GitHub-hosted CodeRabbit refresh until pushed. * test: remove brittle review follow-up assumptions CodeRabbit's second pass found two valid brittleness issues and one suggested callback-reference assertion that would not match production behavior. This keeps the production behavior unchanged: timers still schedule the summarizer closure, tests now assert timer-handle identity, and UDS connection errors use native Error.cause instead of shadowing it. Constraint: Do not manufacture behavior just to satisfy a review hint; assertions must match the real AgentSummary scheduling contract. Rejected: Assert a fresh scheduled callback function | scheduleNext intentionally passes the same runSummary closure each time. Rejected: Store a custom cause field on UdsPeerConnectionError | native Error.cause is available under ESNext/Bun. Confidence: high Scope-risk: narrow Directive: Timer tests should assert returned handle identity for ownership, not incidental numeric values. Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/udsMessaging.test.ts Tested: bunx tsc --noEmit --pretty false Tested: bun run lint Tested: bun run test:all Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage Tested: bun run build Tested: bun run build:vite Not-tested: GitHub-hosted CodeRabbit refresh until pushed. * test: enforce structured UDS timeout failures CodeRabbit's follow-up surfaced a real consistency gap: UDS send socket errors used UdsPeerConnectionError while response timeouts still rejected a generic Error. Timeouts now use the same structured peer failure contract, and the test exercises that path through a short explicit timeout instead of waiting for the production default. The AgentSummary unchanged-fingerprint test now also asserts that the second unchanged tick does not log errors, preserving the existing behavior checks without changing production scheduling semantics. Constraint: Keep the production timeout default at 5000ms while allowing tests to exercise the timeout path quickly. Rejected: Leave timeout failures as generic Error | callers would need separate handling for the same peer connection failure class. Confidence: high Scope-risk: narrow Directive: Keep UDS send timeout and socket-error branches on the same structured error contract. Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/udsMessaging.test.ts Tested: bunx tsc --noEmit --pretty false Tested: bun run lint Tested: bun run test:all Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage Tested: bun run build Tested: bun run build:vite Not-tested: GitHub-hosted CodeRabbit refresh until pushed. --------- Co-authored-by: unraid <local@unraid.local>
This commit is contained in:
@@ -5,7 +5,10 @@ import type {
|
||||
CacheSafeParams,
|
||||
ForkedAgentResult,
|
||||
} from '../../../utils/forkedAgent.js'
|
||||
import { startAgentSummarization } from '../agentSummary.js'
|
||||
import {
|
||||
type AgentSummaryDependencies,
|
||||
startAgentSummarization,
|
||||
} from '../agentSummary.js'
|
||||
|
||||
const transcriptMessages = [
|
||||
{ type: 'user', message: { content: 'start' }, uuid: 'u1' },
|
||||
@@ -27,17 +30,16 @@ describe('startAgentSummarization', () => {
|
||||
let forkCalls: ForkCall[]
|
||||
let updateCalls: Array<{ taskId: string; summary: string }>
|
||||
let transcriptMessagesForTest: Message[]
|
||||
let debugLogs: string[]
|
||||
let loggedErrors: Error[]
|
||||
let clearedHandles: unknown[]
|
||||
let scheduledCount: number
|
||||
let lastTimerHandle: unknown
|
||||
|
||||
beforeEach(() => {
|
||||
forkCalls = []
|
||||
updateCalls = []
|
||||
scheduled = undefined
|
||||
handle = undefined
|
||||
transcriptMessagesForTest = transcriptMessages
|
||||
})
|
||||
|
||||
test('summarizes bounded transcript once and skips unchanged fingerprints', async () => {
|
||||
handle = startAgentSummarization(
|
||||
function startTestSummarization(
|
||||
dependencies: AgentSummaryDependencies = {},
|
||||
): { stop: () => void } {
|
||||
return startAgentSummarization(
|
||||
'task-1',
|
||||
asAgentId('a0000000000000000'),
|
||||
{
|
||||
@@ -48,14 +50,22 @@ describe('startAgentSummarization', () => {
|
||||
} as unknown as CacheSafeParams,
|
||||
() => undefined,
|
||||
{
|
||||
clearTimeout: () => undefined,
|
||||
clearTimeout: ((timeoutId: unknown) => {
|
||||
clearedHandles.push(timeoutId)
|
||||
}) as typeof clearTimeout,
|
||||
getAgentTranscript: async () => ({
|
||||
messages: transcriptMessagesForTest,
|
||||
contentReplacements: [],
|
||||
}),
|
||||
isPoorModeActive: () => false,
|
||||
logError: () => undefined,
|
||||
logForDebugging: () => undefined,
|
||||
logError: error => {
|
||||
loggedErrors.push(
|
||||
error instanceof Error ? error : new Error(String(error)),
|
||||
)
|
||||
},
|
||||
logForDebugging: message => {
|
||||
debugLogs.push(message)
|
||||
},
|
||||
runForkedAgent: async (args: ForkCall) => {
|
||||
forkCalls.push(args)
|
||||
return {
|
||||
@@ -73,14 +83,34 @@ describe('startAgentSummarization', () => {
|
||||
if (typeof callback !== 'function') {
|
||||
throw new Error('Expected timer callback')
|
||||
}
|
||||
scheduledCount += 1
|
||||
scheduled = callback as () => void | Promise<void>
|
||||
return 1 as unknown as ReturnType<typeof setTimeout>
|
||||
lastTimerHandle = { id: scheduledCount }
|
||||
return lastTimerHandle as ReturnType<typeof setTimeout>
|
||||
}) as unknown as typeof setTimeout,
|
||||
updateAgentSummary: (taskId: string, summary: string) => {
|
||||
updateCalls.push({ taskId, summary })
|
||||
},
|
||||
...dependencies,
|
||||
},
|
||||
)
|
||||
}
|
||||
|
||||
beforeEach(() => {
|
||||
forkCalls = []
|
||||
updateCalls = []
|
||||
scheduled = undefined
|
||||
handle = undefined
|
||||
transcriptMessagesForTest = transcriptMessages
|
||||
debugLogs = []
|
||||
loggedErrors = []
|
||||
clearedHandles = []
|
||||
scheduledCount = 0
|
||||
lastTimerHandle = undefined
|
||||
})
|
||||
|
||||
test('summarizes bounded transcript once and skips unchanged fingerprints', async () => {
|
||||
handle = startTestSummarization()
|
||||
|
||||
expect(typeof scheduled).toBe('function')
|
||||
await scheduled!()
|
||||
@@ -104,49 +134,95 @@ describe('startAgentSummarization', () => {
|
||||
|
||||
expect(forkCalls).toHaveLength(1)
|
||||
expect(updateCalls).toHaveLength(1)
|
||||
expect(loggedErrors).toEqual([])
|
||||
})
|
||||
|
||||
test('skips summarization when bounded context is too small', async () => {
|
||||
transcriptMessagesForTest = transcriptMessages.slice(0, 2)
|
||||
|
||||
handle = startAgentSummarization(
|
||||
'task-1',
|
||||
asAgentId('a0000000000000000'),
|
||||
test('skips summarization when filtering leaves too little bounded context', async () => {
|
||||
transcriptMessagesForTest = [
|
||||
{ type: 'user', message: { content: 'start' }, uuid: 'u1' },
|
||||
{
|
||||
forkContextMessages: transcriptMessages,
|
||||
model: 'claude-test',
|
||||
} as unknown as CacheSafeParams,
|
||||
() => undefined,
|
||||
{
|
||||
clearTimeout: () => undefined,
|
||||
getAgentTranscript: async () => ({
|
||||
messages: transcriptMessagesForTest,
|
||||
contentReplacements: [],
|
||||
}),
|
||||
isPoorModeActive: () => false,
|
||||
logError: () => undefined,
|
||||
logForDebugging: () => undefined,
|
||||
runForkedAgent: async (args: ForkCall) => {
|
||||
forkCalls.push(args)
|
||||
return { messages: [] } as unknown as ForkedAgentResult
|
||||
},
|
||||
setTimeout: ((callback: TimerHandler) => {
|
||||
if (typeof callback !== 'function') {
|
||||
throw new Error('Expected timer callback')
|
||||
}
|
||||
scheduled = callback as () => void | Promise<void>
|
||||
return 1 as unknown as ReturnType<typeof setTimeout>
|
||||
}) as unknown as typeof setTimeout,
|
||||
updateAgentSummary: (taskId: string, summary: string) => {
|
||||
updateCalls.push({ taskId, summary })
|
||||
type: 'assistant',
|
||||
uuid: 'a1',
|
||||
message: {
|
||||
content: [{ type: 'tool_use', id: 'missing', name: 'Read' }],
|
||||
},
|
||||
},
|
||||
)
|
||||
{ type: 'user', message: { content: 'continue' }, uuid: 'u2' },
|
||||
] as unknown as Message[]
|
||||
|
||||
handle = startTestSummarization()
|
||||
|
||||
expect(typeof scheduled).toBe('function')
|
||||
await scheduled!()
|
||||
|
||||
expect(forkCalls).toEqual([])
|
||||
expect(updateCalls).toEqual([])
|
||||
expect(debugLogs).toContain(
|
||||
'[AgentSummary] Skipping summary for task-1: no bounded context available',
|
||||
)
|
||||
})
|
||||
|
||||
test('skips summarization before building context when transcript is too short', async () => {
|
||||
transcriptMessagesForTest = transcriptMessages.slice(0, 2)
|
||||
handle = startTestSummarization()
|
||||
|
||||
expect(typeof scheduled).toBe('function')
|
||||
await scheduled!()
|
||||
|
||||
expect(forkCalls).toEqual([])
|
||||
expect(updateCalls).toEqual([])
|
||||
expect(debugLogs).toContain(
|
||||
'[AgentSummary] Skipping summary for task-1: not enough messages (2)',
|
||||
)
|
||||
})
|
||||
|
||||
test('skips and reschedules while poor mode is active', async () => {
|
||||
handle = startTestSummarization({
|
||||
isPoorModeActive: () => true,
|
||||
})
|
||||
|
||||
expect(typeof scheduled).toBe('function')
|
||||
const initialScheduledCount = scheduledCount
|
||||
const initialTimerHandle = lastTimerHandle
|
||||
await scheduled!()
|
||||
|
||||
expect(forkCalls).toEqual([])
|
||||
expect(updateCalls).toEqual([])
|
||||
expect(debugLogs).toContain(
|
||||
'[AgentSummary] Skipping summary — poor mode active',
|
||||
)
|
||||
expect(scheduledCount).toBe(initialScheduledCount + 1)
|
||||
expect(lastTimerHandle).not.toBe(initialTimerHandle)
|
||||
})
|
||||
|
||||
test('logs summary errors and schedules the next timer', async () => {
|
||||
const error = new Error('fork failed')
|
||||
handle = startTestSummarization({
|
||||
runForkedAgent: async () => {
|
||||
throw error
|
||||
},
|
||||
})
|
||||
|
||||
expect(typeof scheduled).toBe('function')
|
||||
const initialScheduledCount = scheduledCount
|
||||
const initialTimerHandle = lastTimerHandle
|
||||
await scheduled!()
|
||||
|
||||
expect(loggedErrors).toEqual([error])
|
||||
expect(updateCalls).toEqual([])
|
||||
expect(scheduledCount).toBe(initialScheduledCount + 1)
|
||||
expect(lastTimerHandle).not.toBe(initialTimerHandle)
|
||||
})
|
||||
|
||||
test('stop clears the pending summary timer', () => {
|
||||
handle = startTestSummarization()
|
||||
const pendingHandle = lastTimerHandle
|
||||
|
||||
handle.stop()
|
||||
|
||||
expect(debugLogs).toContain(
|
||||
'[AgentSummary] Stopping summarization for task-1',
|
||||
)
|
||||
expect(clearedHandles).toEqual([pendingHandle])
|
||||
})
|
||||
})
|
||||
|
||||
@@ -141,6 +141,13 @@ describe('getSummaryContextFingerprint', () => {
|
||||
expect(estimateMessageChars(message)).toBeGreaterThan(0)
|
||||
})
|
||||
|
||||
test('treats unsupported top-level primitives as zero-size estimates', () => {
|
||||
expect(
|
||||
estimateMessageChars((() => undefined) as unknown as Message),
|
||||
).toBe(0)
|
||||
expect(estimateMessageChars(1n as unknown as Message)).toBe(0)
|
||||
})
|
||||
|
||||
test('returns null for an empty transcript', () => {
|
||||
expect(getSummaryContextFingerprint([])).toBeNull()
|
||||
})
|
||||
|
||||
Reference in New Issue
Block a user