test: keep Codecov coverage on real agent communication paths (#374)

* test: keep Codecov coverage on real agent communication paths PR #369 was merged before the final Codecov coverage fix landed, so this follow-up carries only the incremental real-path tests needed on top of main. The tests exercise AgentSummary lifecycle branches, mailbox fail-closed behavior, UDS client connection failure through a real capability file, and UDS response-reader framing without mock.module, warning suppression, feature fallback, or production-code churn. Constraint: PR #369 is already merged; this branch must contain only the incremental Codecov repair on top of latest main Rejected: Reopen or keep pushing the merged PR branch | merged PR refs do not update and would leave Codecov stale Rejected: Mock bun:bundle or hide warnings | would reintroduce cross-test pollution and pseudo coverage Rejected: Keep unrelated SendMessageTool production diff | it created avoidable patch-coverage debt without improving the runtime path Confidence: high Scope-risk: narrow Directive: Keep these coverage tests on real paths; do not replace them with output suppression or feature-flag mocks Tested: bunx tsc --noEmit --pretty false Tested: bun run lint Tested: bun test src\utils\__tests__\teammateMailbox.test.ts Tested: bun test src\services\AgentSummary\__tests__\agentSummary.test.ts src\services\AgentSummary\__tests__\summaryContext.test.ts src\utils\__tests__\teammateMailbox.test.ts src\utils\__tests__\udsMessaging.test.ts src\utils\__tests__\udsResponseReader.test.ts packages\builtin-tools\src\tools\SendMessageTool\__tests__\udsRecipientSanitization.test.ts Tested: bun run test:all Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage Tested: bun run build Tested: bun run build:vite Tested: bun audit Tested: git diff --check Tested: Claude simplify review GO (.omx/artifacts/claude-simplify-codecov-20260427-1521.md) Tested: Claude security review GO (.omx/artifacts/claude-security-codecov-20260427-1522.md) Not-tested: GitHub-hosted Codecov upload after this amended commit until PR checks rerun * test: keep review assertions tied to real failure paths CodeRabbit flagged three non-blocking but valid review gaps: platform-specific mailbox errno checks, brittle UDS connection-failure message assertions, and missing AgentSummary reschedule proof after fork errors. This keeps the fixes narrow by tightening the affected assertions and adding a structured UDS connection error for tests to assert behavior instead of prose. Constraint: PR #374 is a review follow-up and must not hide warnings, skip tests, or merge the PR. Rejected: Matching the UDS failure message literal | preserves the brittle coupling CodeRabbit flagged. Rejected: Asserting only that mailbox writes throw | would allow unrelated pre-path failures to pass. Confidence: high Scope-risk: narrow Directive: Keep UDS connection-failure tests on structured error data, not display wording. Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/teammateMailbox.test.ts src/utils/__tests__/udsMessaging.test.ts Tested: bunx tsc --noEmit --pretty false Tested: bun run lint Tested: bun run test:all Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage Tested: bun run build Tested: bun run build:vite Not-tested: GitHub-hosted CodeRabbit refresh until pushed. * test: remove brittle review follow-up assumptions CodeRabbit's second pass found two valid brittleness issues and one suggested callback-reference assertion that would not match production behavior. This keeps the production behavior unchanged: timers still schedule the summarizer closure, tests now assert timer-handle identity, and UDS connection errors use native Error.cause instead of shadowing it. Constraint: Do not manufacture behavior just to satisfy a review hint; assertions must match the real AgentSummary scheduling contract. Rejected: Assert a fresh scheduled callback function | scheduleNext intentionally passes the same runSummary closure each time. Rejected: Store a custom cause field on UdsPeerConnectionError | native Error.cause is available under ESNext/Bun. Confidence: high Scope-risk: narrow Directive: Timer tests should assert returned handle identity for ownership, not incidental numeric values. Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/udsMessaging.test.ts Tested: bunx tsc --noEmit --pretty false Tested: bun run lint Tested: bun run test:all Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage Tested: bun run build Tested: bun run build:vite Not-tested: GitHub-hosted CodeRabbit refresh until pushed. * test: enforce structured UDS timeout failures CodeRabbit's follow-up surfaced a real consistency gap: UDS send socket errors used UdsPeerConnectionError while response timeouts still rejected a generic Error. Timeouts now use the same structured peer failure contract, and the test exercises that path through a short explicit timeout instead of waiting for the production default. The AgentSummary unchanged-fingerprint test now also asserts that the second unchanged tick does not log errors, preserving the existing behavior checks without changing production scheduling semantics. Constraint: Keep the production timeout default at 5000ms while allowing tests to exercise the timeout path quickly. Rejected: Leave timeout failures as generic Error | callers would need separate handling for the same peer connection failure class. Confidence: high Scope-risk: narrow Directive: Keep UDS send timeout and socket-error branches on the same structured error contract. Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/udsMessaging.test.ts Tested: bunx tsc --noEmit --pretty false Tested: bun run lint Tested: bun run test:all Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage Tested: bun run build Tested: bun run build:vite Not-tested: GitHub-hosted CodeRabbit refresh until pushed. --------- Co-authored-by: unraid <local@unraid.local>
2026-06-18 06:15:51 +00:00 · 2026-04-27 16:22:13 +08:00
parent a65df4a102
commit b47731a3f3
6 changed files with 324 additions and 55 deletions
--- a/src/services/AgentSummary/tests/agentSummary.test.ts
+++ b/src/services/AgentSummary/tests/agentSummary.test.ts
@@ -5,7 +5,10 @@ import type {
  CacheSafeParams,
  ForkedAgentResult,
 } from '../../../utils/forkedAgent.js'
-import { startAgentSummarization } from '../agentSummary.js'
+import {
+  type AgentSummaryDependencies,
+  startAgentSummarization,
+} from '../agentSummary.js'

 const transcriptMessages = [
  { type: 'user', message: { content: 'start' }, uuid: 'u1' },
@@ -27,17 +30,16 @@ describe('startAgentSummarization', () => {
  let forkCalls: ForkCall[]
  let updateCalls: Array<{ taskId: string; summary: string }>
  let transcriptMessagesForTest: Message[]
+  let debugLogs: string[]
+  let loggedErrors: Error[]
+  let clearedHandles: unknown[]
+  let scheduledCount: number
+  let lastTimerHandle: unknown

-  beforeEach(() => {
-    forkCalls = []
-    updateCalls = []
-    scheduled = undefined
-    handle = undefined
-    transcriptMessagesForTest = transcriptMessages
-  })
-
-  test('summarizes bounded transcript once and skips unchanged fingerprints', async () => {
-    handle = startAgentSummarization(
+  function startTestSummarization(
+    dependencies: AgentSummaryDependencies = {},
+  ): { stop: () => void } {
+    return startAgentSummarization(
      'task-1',
      asAgentId('a0000000000000000'),
      {
@@ -48,14 +50,22 @@ describe('startAgentSummarization', () => {
      } as unknown as CacheSafeParams,
      () => undefined,
      {
-        clearTimeout: () => undefined,
+        clearTimeout: ((timeoutId: unknown) => {
+          clearedHandles.push(timeoutId)
+        }) as typeof clearTimeout,
        getAgentTranscript: async () => ({
          messages: transcriptMessagesForTest,
          contentReplacements: [],
        }),
        isPoorModeActive: () => false,
-        logError: () => undefined,
-        logForDebugging: () => undefined,
+        logError: error => {
+          loggedErrors.push(
+            error instanceof Error ? error : new Error(String(error)),
+          )
+        },
+        logForDebugging: message => {
+          debugLogs.push(message)
+        },
        runForkedAgent: async (args: ForkCall) => {
          forkCalls.push(args)
          return {
@@ -73,14 +83,34 @@ describe('startAgentSummarization', () => {
          if (typeof callback !== 'function') {
            throw new Error('Expected timer callback')
          }
+          scheduledCount += 1
          scheduled = callback as () => void | Promise<void>
-          return 1 as unknown as ReturnType<typeof setTimeout>
+          lastTimerHandle = { id: scheduledCount }
+          return lastTimerHandle as ReturnType<typeof setTimeout>
        }) as unknown as typeof setTimeout,
        updateAgentSummary: (taskId: string, summary: string) => {
          updateCalls.push({ taskId, summary })
        },
+        ...dependencies,
      },
    )
+  }
+
+  beforeEach(() => {
+    forkCalls = []
+    updateCalls = []
+    scheduled = undefined
+    handle = undefined
+    transcriptMessagesForTest = transcriptMessages
+    debugLogs = []
+    loggedErrors = []
+    clearedHandles = []
+    scheduledCount = 0
+    lastTimerHandle = undefined
+  })
+
+  test('summarizes bounded transcript once and skips unchanged fingerprints', async () => {
+    handle = startTestSummarization()

    expect(typeof scheduled).toBe('function')
    await scheduled!()
@@ -104,49 +134,95 @@ describe('startAgentSummarization', () => {

    expect(forkCalls).toHaveLength(1)
    expect(updateCalls).toHaveLength(1)
+    expect(loggedErrors).toEqual([])
  })

-  test('skips summarization when bounded context is too small', async () => {
-    transcriptMessagesForTest = transcriptMessages.slice(0, 2)
-
-    handle = startAgentSummarization(
-      'task-1',
-      asAgentId('a0000000000000000'),
+  test('skips summarization when filtering leaves too little bounded context', async () => {
+    transcriptMessagesForTest = [
+      { type: 'user', message: { content: 'start' }, uuid: 'u1' },
      {
-        forkContextMessages: transcriptMessages,
-        model: 'claude-test',
-      } as unknown as CacheSafeParams,
-      () => undefined,
-      {
-        clearTimeout: () => undefined,
-        getAgentTranscript: async () => ({
-          messages: transcriptMessagesForTest,
-          contentReplacements: [],
-        }),
-        isPoorModeActive: () => false,
-        logError: () => undefined,
-        logForDebugging: () => undefined,
-        runForkedAgent: async (args: ForkCall) => {
-          forkCalls.push(args)
-          return { messages: [] } as unknown as ForkedAgentResult
-        },
-        setTimeout: ((callback: TimerHandler) => {
-          if (typeof callback !== 'function') {
-            throw new Error('Expected timer callback')
-          }
-          scheduled = callback as () => void | Promise<void>
-          return 1 as unknown as ReturnType<typeof setTimeout>
-        }) as unknown as typeof setTimeout,
-        updateAgentSummary: (taskId: string, summary: string) => {
-          updateCalls.push({ taskId, summary })
+        type: 'assistant',
+        uuid: 'a1',
+        message: {
+          content: [{ type: 'tool_use', id: 'missing', name: 'Read' }],
        },
      },
-    )
+      { type: 'user', message: { content: 'continue' }, uuid: 'u2' },
+    ] as unknown as Message[]
+
+    handle = startTestSummarization()

    expect(typeof scheduled).toBe('function')
    await scheduled!()

    expect(forkCalls).toEqual([])
    expect(updateCalls).toEqual([])
+    expect(debugLogs).toContain(
+      '[AgentSummary] Skipping summary for task-1: no bounded context available',
+    )
+  })
+
+  test('skips summarization before building context when transcript is too short', async () => {
+    transcriptMessagesForTest = transcriptMessages.slice(0, 2)
+    handle = startTestSummarization()
+
+    expect(typeof scheduled).toBe('function')
+    await scheduled!()
+
+    expect(forkCalls).toEqual([])
+    expect(updateCalls).toEqual([])
+    expect(debugLogs).toContain(
+      '[AgentSummary] Skipping summary for task-1: not enough messages (2)',
+    )
+  })
+
+  test('skips and reschedules while poor mode is active', async () => {
+    handle = startTestSummarization({
+      isPoorModeActive: () => true,
+    })
+
+    expect(typeof scheduled).toBe('function')
+    const initialScheduledCount = scheduledCount
+    const initialTimerHandle = lastTimerHandle
+    await scheduled!()
+
+    expect(forkCalls).toEqual([])
+    expect(updateCalls).toEqual([])
+    expect(debugLogs).toContain(
+      '[AgentSummary] Skipping summary — poor mode active',
+    )
+    expect(scheduledCount).toBe(initialScheduledCount + 1)
+    expect(lastTimerHandle).not.toBe(initialTimerHandle)
+  })
+
+  test('logs summary errors and schedules the next timer', async () => {
+    const error = new Error('fork failed')
+    handle = startTestSummarization({
+      runForkedAgent: async () => {
+        throw error
+      },
+    })
+
+    expect(typeof scheduled).toBe('function')
+    const initialScheduledCount = scheduledCount
+    const initialTimerHandle = lastTimerHandle
+    await scheduled!()
+
+    expect(loggedErrors).toEqual([error])
+    expect(updateCalls).toEqual([])
+    expect(scheduledCount).toBe(initialScheduledCount + 1)
+    expect(lastTimerHandle).not.toBe(initialTimerHandle)
+  })
+
+  test('stop clears the pending summary timer', () => {
+    handle = startTestSummarization()
+    const pendingHandle = lastTimerHandle
+
+    handle.stop()
+
+    expect(debugLogs).toContain(
+      '[AgentSummary] Stopping summarization for task-1',
+    )
+    expect(clearedHandles).toEqual([pendingHandle])
  })
 })
--- a/src/services/AgentSummary/tests/summaryContext.test.ts
+++ b/src/services/AgentSummary/tests/summaryContext.test.ts
@@ -141,6 +141,13 @@ describe('getSummaryContextFingerprint', () => {
    expect(estimateMessageChars(message)).toBeGreaterThan(0)
  })

+  test('treats unsupported top-level primitives as zero-size estimates', () => {
+    expect(
+      estimateMessageChars((() => undefined) as unknown as Message),
+    ).toBe(0)
+    expect(estimateMessageChars(1n as unknown as Message)).toBe(0)
+  })
+
  test('returns null for an empty transcript', () => {
    expect(getSummaryContextFingerprint([])).toBeNull()
  })