test: keep Codecov coverage on real agent communication paths (#374)

* test: keep Codecov coverage on real agent communication paths

PR #369 was merged before the final Codecov coverage fix landed, so this follow-up carries only the incremental real-path tests needed on top of main. The tests exercise AgentSummary lifecycle branches, mailbox fail-closed behavior, UDS client connection failure through a real capability file, and UDS response-reader framing without mock.module, warning suppression, feature fallback, or production-code churn.

Constraint: PR #369 is already merged; this branch must contain only the incremental Codecov repair on top of latest main

Rejected: Reopen or keep pushing the merged PR branch | merged PR refs do not update and would leave Codecov stale

Rejected: Mock bun:bundle or hide warnings | would reintroduce cross-test pollution and pseudo coverage

Rejected: Keep unrelated SendMessageTool production diff | it created avoidable patch-coverage debt without improving the runtime path

Confidence: high

Scope-risk: narrow

Directive: Keep these coverage tests on real paths; do not replace them with output suppression or feature-flag mocks

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun test src\utils\__tests__\teammateMailbox.test.ts

Tested: bun test src\services\AgentSummary\__tests__\agentSummary.test.ts src\services\AgentSummary\__tests__\summaryContext.test.ts src\utils\__tests__\teammateMailbox.test.ts src\utils\__tests__\udsMessaging.test.ts src\utils\__tests__\udsResponseReader.test.ts packages\builtin-tools\src\tools\SendMessageTool\__tests__\udsRecipientSanitization.test.ts

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Tested: bun audit

Tested: git diff --check

Tested: Claude simplify review GO (.omx/artifacts/claude-simplify-codecov-20260427-1521.md)

Tested: Claude security review GO (.omx/artifacts/claude-security-codecov-20260427-1522.md)

Not-tested: GitHub-hosted Codecov upload after this amended commit until PR checks rerun

* test: keep review assertions tied to real failure paths

CodeRabbit flagged three non-blocking but valid review gaps: platform-specific mailbox errno checks, brittle UDS connection-failure message assertions, and missing AgentSummary reschedule proof after fork errors. This keeps the fixes narrow by tightening the affected assertions and adding a structured UDS connection error for tests to assert behavior instead of prose.

Constraint: PR #374 is a review follow-up and must not hide warnings, skip tests, or merge the PR.

Rejected: Matching the UDS failure message literal | preserves the brittle coupling CodeRabbit flagged.

Rejected: Asserting only that mailbox writes throw | would allow unrelated pre-path failures to pass.

Confidence: high

Scope-risk: narrow

Directive: Keep UDS connection-failure tests on structured error data, not display wording.

Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/teammateMailbox.test.ts src/utils/__tests__/udsMessaging.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted CodeRabbit refresh until pushed.

* test: remove brittle review follow-up assumptions

CodeRabbit's second pass found two valid brittleness issues and one suggested callback-reference assertion that would not match production behavior. This keeps the production behavior unchanged: timers still schedule the summarizer closure, tests now assert timer-handle identity, and UDS connection errors use native Error.cause instead of shadowing it.

Constraint: Do not manufacture behavior just to satisfy a review hint; assertions must match the real AgentSummary scheduling contract.

Rejected: Assert a fresh scheduled callback function | scheduleNext intentionally passes the same runSummary closure each time.

Rejected: Store a custom cause field on UdsPeerConnectionError | native Error.cause is available under ESNext/Bun.

Confidence: high

Scope-risk: narrow

Directive: Timer tests should assert returned handle identity for ownership, not incidental numeric values.

Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/udsMessaging.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted CodeRabbit refresh until pushed.

* test: enforce structured UDS timeout failures

CodeRabbit's follow-up surfaced a real consistency gap: UDS send socket errors used UdsPeerConnectionError while response timeouts still rejected a generic Error. Timeouts now use the same structured peer failure contract, and the test exercises that path through a short explicit timeout instead of waiting for the production default.

The AgentSummary unchanged-fingerprint test now also asserts that the second unchanged tick does not log errors, preserving the existing behavior checks without changing production scheduling semantics.

Constraint: Keep the production timeout default at 5000ms while allowing tests to exercise the timeout path quickly.

Rejected: Leave timeout failures as generic Error | callers would need separate handling for the same peer connection failure class.

Confidence: high

Scope-risk: narrow

Directive: Keep UDS send timeout and socket-error branches on the same structured error contract.

Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/udsMessaging.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted CodeRabbit refresh until pushed.

---------

Co-authored-by: unraid <local@unraid.local>
This commit is contained in:
Dosion
2026-04-27 16:22:13 +08:00
committed by GitHub
parent a65df4a102
commit b47731a3f3
6 changed files with 324 additions and 55 deletions

View File

@@ -5,7 +5,10 @@ import type {
CacheSafeParams,
ForkedAgentResult,
} from '../../../utils/forkedAgent.js'
import { startAgentSummarization } from '../agentSummary.js'
import {
type AgentSummaryDependencies,
startAgentSummarization,
} from '../agentSummary.js'
const transcriptMessages = [
{ type: 'user', message: { content: 'start' }, uuid: 'u1' },
@@ -27,17 +30,16 @@ describe('startAgentSummarization', () => {
let forkCalls: ForkCall[]
let updateCalls: Array<{ taskId: string; summary: string }>
let transcriptMessagesForTest: Message[]
let debugLogs: string[]
let loggedErrors: Error[]
let clearedHandles: unknown[]
let scheduledCount: number
let lastTimerHandle: unknown
beforeEach(() => {
forkCalls = []
updateCalls = []
scheduled = undefined
handle = undefined
transcriptMessagesForTest = transcriptMessages
})
test('summarizes bounded transcript once and skips unchanged fingerprints', async () => {
handle = startAgentSummarization(
function startTestSummarization(
dependencies: AgentSummaryDependencies = {},
): { stop: () => void } {
return startAgentSummarization(
'task-1',
asAgentId('a0000000000000000'),
{
@@ -48,14 +50,22 @@ describe('startAgentSummarization', () => {
} as unknown as CacheSafeParams,
() => undefined,
{
clearTimeout: () => undefined,
clearTimeout: ((timeoutId: unknown) => {
clearedHandles.push(timeoutId)
}) as typeof clearTimeout,
getAgentTranscript: async () => ({
messages: transcriptMessagesForTest,
contentReplacements: [],
}),
isPoorModeActive: () => false,
logError: () => undefined,
logForDebugging: () => undefined,
logError: error => {
loggedErrors.push(
error instanceof Error ? error : new Error(String(error)),
)
},
logForDebugging: message => {
debugLogs.push(message)
},
runForkedAgent: async (args: ForkCall) => {
forkCalls.push(args)
return {
@@ -73,14 +83,34 @@ describe('startAgentSummarization', () => {
if (typeof callback !== 'function') {
throw new Error('Expected timer callback')
}
scheduledCount += 1
scheduled = callback as () => void | Promise<void>
return 1 as unknown as ReturnType<typeof setTimeout>
lastTimerHandle = { id: scheduledCount }
return lastTimerHandle as ReturnType<typeof setTimeout>
}) as unknown as typeof setTimeout,
updateAgentSummary: (taskId: string, summary: string) => {
updateCalls.push({ taskId, summary })
},
...dependencies,
},
)
}
beforeEach(() => {
forkCalls = []
updateCalls = []
scheduled = undefined
handle = undefined
transcriptMessagesForTest = transcriptMessages
debugLogs = []
loggedErrors = []
clearedHandles = []
scheduledCount = 0
lastTimerHandle = undefined
})
test('summarizes bounded transcript once and skips unchanged fingerprints', async () => {
handle = startTestSummarization()
expect(typeof scheduled).toBe('function')
await scheduled!()
@@ -104,49 +134,95 @@ describe('startAgentSummarization', () => {
expect(forkCalls).toHaveLength(1)
expect(updateCalls).toHaveLength(1)
expect(loggedErrors).toEqual([])
})
test('skips summarization when bounded context is too small', async () => {
transcriptMessagesForTest = transcriptMessages.slice(0, 2)
handle = startAgentSummarization(
'task-1',
asAgentId('a0000000000000000'),
test('skips summarization when filtering leaves too little bounded context', async () => {
transcriptMessagesForTest = [
{ type: 'user', message: { content: 'start' }, uuid: 'u1' },
{
forkContextMessages: transcriptMessages,
model: 'claude-test',
} as unknown as CacheSafeParams,
() => undefined,
{
clearTimeout: () => undefined,
getAgentTranscript: async () => ({
messages: transcriptMessagesForTest,
contentReplacements: [],
}),
isPoorModeActive: () => false,
logError: () => undefined,
logForDebugging: () => undefined,
runForkedAgent: async (args: ForkCall) => {
forkCalls.push(args)
return { messages: [] } as unknown as ForkedAgentResult
},
setTimeout: ((callback: TimerHandler) => {
if (typeof callback !== 'function') {
throw new Error('Expected timer callback')
}
scheduled = callback as () => void | Promise<void>
return 1 as unknown as ReturnType<typeof setTimeout>
}) as unknown as typeof setTimeout,
updateAgentSummary: (taskId: string, summary: string) => {
updateCalls.push({ taskId, summary })
type: 'assistant',
uuid: 'a1',
message: {
content: [{ type: 'tool_use', id: 'missing', name: 'Read' }],
},
},
)
{ type: 'user', message: { content: 'continue' }, uuid: 'u2' },
] as unknown as Message[]
handle = startTestSummarization()
expect(typeof scheduled).toBe('function')
await scheduled!()
expect(forkCalls).toEqual([])
expect(updateCalls).toEqual([])
expect(debugLogs).toContain(
'[AgentSummary] Skipping summary for task-1: no bounded context available',
)
})
test('skips summarization before building context when transcript is too short', async () => {
transcriptMessagesForTest = transcriptMessages.slice(0, 2)
handle = startTestSummarization()
expect(typeof scheduled).toBe('function')
await scheduled!()
expect(forkCalls).toEqual([])
expect(updateCalls).toEqual([])
expect(debugLogs).toContain(
'[AgentSummary] Skipping summary for task-1: not enough messages (2)',
)
})
test('skips and reschedules while poor mode is active', async () => {
handle = startTestSummarization({
isPoorModeActive: () => true,
})
expect(typeof scheduled).toBe('function')
const initialScheduledCount = scheduledCount
const initialTimerHandle = lastTimerHandle
await scheduled!()
expect(forkCalls).toEqual([])
expect(updateCalls).toEqual([])
expect(debugLogs).toContain(
'[AgentSummary] Skipping summary — poor mode active',
)
expect(scheduledCount).toBe(initialScheduledCount + 1)
expect(lastTimerHandle).not.toBe(initialTimerHandle)
})
test('logs summary errors and schedules the next timer', async () => {
const error = new Error('fork failed')
handle = startTestSummarization({
runForkedAgent: async () => {
throw error
},
})
expect(typeof scheduled).toBe('function')
const initialScheduledCount = scheduledCount
const initialTimerHandle = lastTimerHandle
await scheduled!()
expect(loggedErrors).toEqual([error])
expect(updateCalls).toEqual([])
expect(scheduledCount).toBe(initialScheduledCount + 1)
expect(lastTimerHandle).not.toBe(initialTimerHandle)
})
test('stop clears the pending summary timer', () => {
handle = startTestSummarization()
const pendingHandle = lastTimerHandle
handle.stop()
expect(debugLogs).toContain(
'[AgentSummary] Stopping summarization for task-1',
)
expect(clearedHandles).toEqual([pendingHandle])
})
})

View File

@@ -141,6 +141,13 @@ describe('getSummaryContextFingerprint', () => {
expect(estimateMessageChars(message)).toBeGreaterThan(0)
})
test('treats unsupported top-level primitives as zero-size estimates', () => {
expect(
estimateMessageChars((() => undefined) as unknown as Message),
).toBe(0)
expect(estimateMessageChars(1n as unknown as Message)).toBe(0)
})
test('returns null for an empty transcript', () => {
expect(getSummaryContextFingerprint([])).toBeNull()
})