mirror of
https://github.com/claude-code-best/claude-code.git
synced 2026-06-15 21:05:51 +00:00
* fix: keep UDS peer failures structured

CodeRabbit and Claude cross-review identified that timeout and raw peer connection failures should share one observable error contract. UDS peer failures now use UdsPeerConnectionError consistently, and connectToPeer hands the socket lifecycle back to the caller after a successful connection instead of retaining an internal timeout or error listener. The tests cover the real socket paths with capability files, timeout behavior, connection failure structure, post-connect listener handoff, AgentSummary rescheduling observations, and platform-specific mailbox directory errno handling.

Constraint: Preserve the 5000ms production timeout default while allowing tests to exercise timeout paths quickly.
Rejected: Suppress CodeRabbit warnings in tests | would hide the real timeout/error contract gap.
Rejected: Keep connectToPeer post-connect error listener | it would silently swallow caller-owned socket errors.
Confidence: high
Scope-risk: narrow
Directive: Keep UDS send/connect timeout and socket-error paths on the same structured peer error contract.
Tested: bun test src/utils/__tests__/udsMessaging.test.ts src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/teammateMailbox.test.ts
Tested: bunx tsc --noEmit --pretty false
Tested: bun run lint
Tested: bun run test:all
Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage
Tested: bun run build
Tested: bun run build:vite
Tested: omx ask claude simplify review artifact .omx/artifacts/claude-review-only-cross-check-for-pr-374-on-branch-codex-codecov-r-2026-04-27T08-17-47-309Z.md
Tested: omx ask claude security review artifact .omx/artifacts/claude-security-review-cross-check-for-pr-374-current-working-tree--2026-04-27T08-26-54-079Z.md
Not-tested: GitHub-hosted CodeRabbit refresh until pushed.

* docs: clarify UDS peer socket ownership

CodeRabbit's #375 pass found that connectToPeer now correctly hands socket errors to the caller, but the JSDoc needed to spell out that contract. The lifecycle test also uses a less brittle post-connect timeout so slow CI does not turn the ownership check into a connection-speed race.

Constraint: The raw socket API intentionally detaches its internal listener after successful connect so caller-owned errors are not swallowed.
Rejected: Keep the test timeout at 50ms | it tests scheduler speed instead of socket lifecycle ownership.
Confidence: high
Scope-risk: narrow
Directive: connectToPeer callers must attach their own error listener immediately after awaiting the socket.
Tested: bun test src/utils/__tests__/udsMessaging.test.ts
Tested: bunx tsc --noEmit --pretty false
Tested: bun run lint
Tested: git diff --check
Tested: bun run test:all
Not-tested: GitHub-hosted CodeRabbit refresh until pushed.

* fix: close peer socket listener handoff window

CodeRabbit and Claude review found that documenting caller-owned raw socket errors still left a Promise handoff window and a stale timeout-listener risk. The peer connection API now requires a caller error handler and installs it before resolving, while cleanup removes internal error and timeout listeners on every path.

Constraint: Keep the fix precise to PR #375 review feedback and avoid warning suppression or fallback behavior.
Rejected: Leave the behavior documented only | still permits an unhandled socket error window between resolve and caller listener attachment.
Rejected: Keep a no-op internal error listener | would silently swallow caller-owned socket errors.
Confidence: high
Scope-risk: narrow
Directive: Do not add raw connectToPeer callers without providing a real onSocketError handler and capability handshake.
Tested: bun test src/utils/__tests__/udsMessaging.test.ts src/services/AgentSummary/__tests__/agentSummary.test.ts
Tested: bunx tsc --noEmit --pretty false
Tested: bun run lint
Tested: bun run test:all
Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage
Tested: bun run build
Tested: bun run build:vite
Tested: bun audit
Not-tested: Manual external ACP peer runtime beyond repository tests.

* fix: use a deadline timer for peer connects

The raw socket handoff no longer needs Socket#setTimeout; an ordinary connection deadline keeps the timeout behavior while avoiding an internal socket timeout listener that has no reliable UDS integration path to exercise.

Constraint: Keep Codecov coverage honest without adding ignore pragmas, mocks, or fallback suppression.
Rejected: c8 ignore on the timeout listener | hides the uncovered branch instead of simplifying the lifecycle.
Rejected: keep Socket#setTimeout listener | leaves a socket listener lifecycle to manage for a connect-only deadline.
Confidence: high
Scope-risk: narrow
Directive: Keep connectToPeer errors caller-owned via onSocketError and reject pre-connect failures with UdsPeerConnectionError.
Tested: bun test src/utils/__tests__/udsMessaging.test.ts src/services/AgentSummary/__tests__/agentSummary.test.ts
Tested: bunx tsc --noEmit --pretty false
Tested: bun run lint
Tested: bun test src/utils/__tests__/udsMessaging.test.ts --coverage --coverage-reporter lcov --coverage-dir coverage-uds
Tested: bun run test:all
Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage
Tested: bun run build
Tested: bun run build:vite
Tested: bun audit
Not-tested: Manual external ACP peer runtime beyond repository tests.

---------

Co-authored-by: unraid <local@unraid.local>
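The deadline-timer handoff described in the last commit can be sketched roughly as follows. This is an illustrative sketch, not the repository code: `connectWithDeadline` and `demoHandoff` are hypothetical names, a plain `Error` with a `cause` property stands in for `UdsPeerConnectionError`, and the demo assumes a POSIX host where a Unix domain socket can be created under the temp directory.

```typescript
import { createConnection, createServer, type Socket } from 'node:net'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// Hypothetical stand-in for connectToPeer: a plain timer bounds the connect
// phase, and the caller's error handler is installed before resolving.
function connectWithDeadline(
  socketPath: string,
  onSocketError: (error: Error) => void,
  timeoutMs = 5000,
): Promise<Socket> {
  return new Promise<Socket>((resolve, reject) => {
    const conn = createConnection(socketPath)
    let settled = false

    const fail = (cause: unknown): void => {
      if (settled) return
      settled = true
      clearTimeout(deadline)
      conn.off('error', fail)
      conn.destroy()
      // Assumption: a generic Error replaces the structured peer error here.
      reject(Object.assign(new Error(`connect failed: ${String(cause)}`), { cause }))
    }

    // Ordinary deadline timer instead of Socket#setTimeout.
    const deadline = setTimeout(
      () => fail(new Error('Connection timed out')),
      timeoutMs,
    )

    conn.on('error', fail)
    conn.once('connect', () => {
      if (settled) return
      settled = true
      clearTimeout(deadline)
      conn.off('error', fail)
      // The caller's handler is attached before the Promise resolves, so
      // there is no window in which a socket error has no listener.
      conn.on('error', onSocketError)
      resolve(conn)
    })
  })
}

// Exercises both paths against a throwaway local server.
async function demoHandoff(): Promise<{ handedOff: boolean; missingRejected: boolean }> {
  const socketPath = join(tmpdir(), `uds-sketch-${process.pid}-${Date.now()}.sock`)
  const server = createServer(peer => peer.end())
  await new Promise<void>((resolve, reject) => {
    server.once('error', reject)
    server.listen(socketPath, () => resolve())
  })

  const onSocketError = (): void => {}
  const socket = await connectWithDeadline(socketPath, onSocketError, 1000)
  const handedOff = socket.listeners('error').includes(onSocketError)
  socket.destroy()
  await new Promise<void>(resolve => server.close(() => resolve()))

  // Nothing listens on this path, so the pre-connect failure should reject.
  const missingRejected = await connectWithDeadline(
    `${socketPath}.missing`,
    onSocketError,
    1000,
  ).then(
    () => false,
    () => true,
  )
  return { handedOff, missingRejected }
}
```

Because the timer is cleared and the internal listener removed on both the success and failure paths, the only listener left on a resolved socket is the one the caller supplied.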
322 lines
9.1 KiB
TypeScript
/**
 * UDS Client: connect to peer Claude Code sessions via Unix Domain Sockets.
 *
 * Peers are discovered by reading the PID-file registry in ~/.claude/sessions/
 * (written by concurrentSessions.ts) and checking each entry's
 * `messagingSocketPath` field. A peer is "alive" if its PID is running and
 * its socket accepts a ping/pong round-trip.
 */

import { createConnection, type Socket } from 'net'
import { readdir, readFile } from 'fs/promises'
import { join } from 'path'
import { getClaudeConfigHomeDir } from './envUtils.js'
import { logForDebugging } from './debug.js'
import { errorMessage, isFsInaccessible } from './errors.js'
import { isProcessRunning } from './genericProcessUtils.js'
import { jsonParse, jsonStringify } from './slowOperations.js'
import type { SessionKind } from './concurrentSessions.js'
import { MAX_UDS_FRAME_BYTES, type UdsMessage } from './udsMessaging.js'
import { attachUdsResponseReader, getChunkBytes } from './udsResponseReader.js'

// ---------------------------------------------------------------------------
// Types
// ---------------------------------------------------------------------------

export type PeerSession = {
  pid: number
  sessionId?: string
  cwd?: string
  startedAt?: number
  kind?: SessionKind
  name?: string
  messagingSocketPath?: string
  entrypoint?: string
  bridgeSessionId?: string | null
  alive: boolean
}

export class UdsPeerConnectionError extends Error {
  readonly socketPath: string

  constructor(socketPath: string, cause: unknown) {
    super(
      `Failed to connect to peer at ${socketPath}: ${errorMessage(cause)}`,
      { cause },
    )
    this.name = 'UdsPeerConnectionError'
    this.socketPath = socketPath
  }
}

// ---------------------------------------------------------------------------
// Session directory
// ---------------------------------------------------------------------------

function getSessionsDir(): string {
  return join(getClaudeConfigHomeDir(), 'sessions')
}

// ---------------------------------------------------------------------------
// Discovery
// ---------------------------------------------------------------------------

/**
 * List all sessions from the PID registry whose process is still running.
 * Sessions whose PID is no longer running are skipped; their stale files are
 * left for concurrentSessions to clean up. This function does not probe the
 * UDS socket, so `alive` reflects PID liveness only. Use isPeerAlive for a
 * ping/pong check.
 */
export async function listAllLiveSessions(): Promise<PeerSession[]> {
  const dir = getSessionsDir()
  let files: string[]
  try {
    files = await readdir(dir)
  } catch (e) {
    if (!isFsInaccessible(e)) {
      logForDebugging(`[udsClient] readdir failed: ${errorMessage(e)}`)
    }
    return []
  }

  const results: PeerSession[] = []

  for (const file of files) {
    // Registry entries are named <pid>.json; ignore anything else.
    if (!/^\d+\.json$/.test(file)) continue
    const pid = parseInt(file.slice(0, -5), 10)

    if (!isProcessRunning(pid)) {
      // Stale: skip (concurrentSessions handles cleanup)
      continue
    }

    try {
      const raw = await readFile(join(dir, file), 'utf8')
      const data = jsonParse(raw) as Record<string, unknown>
      results.push({
        pid,
        sessionId: data.sessionId as string | undefined,
        cwd: data.cwd as string | undefined,
        startedAt: data.startedAt as number | undefined,
        kind: data.kind as SessionKind | undefined,
        name: data.name as string | undefined,
        messagingSocketPath: data.messagingSocketPath as string | undefined,
        entrypoint: data.entrypoint as string | undefined,
        bridgeSessionId: data.bridgeSessionId as string | null | undefined,
        alive: true,
      })
    } catch {
      // Unreadable or corrupted file: skip
    }
  }

  return results
}

/**
 * List peer sessions that have a UDS messaging socket (i.e. can receive
 * messages). Excludes the current process.
 */
export async function listPeers(): Promise<PeerSession[]> {
  const all = await listAllLiveSessions()
  return all.filter(s => s.pid !== process.pid && s.messagingSocketPath != null)
}

async function findAuthTokenForSocketPath(
  socketPath: string,
): Promise<string | undefined> {
  const { readUdsCapabilityToken } = await import('./udsMessaging.js')
  return readUdsCapabilityToken(socketPath)
}

// ---------------------------------------------------------------------------
// Connection helpers
// ---------------------------------------------------------------------------

/**
 * Probe a UDS socket to check if a server is listening (ping/pong).
 * Returns true if the peer responds within the timeout.
 */
export async function isPeerAlive(
  socketPath: string,
  timeoutMs = 3000,
  authToken?: string,
): Promise<boolean> {
  const token = authToken ?? (await findAuthTokenForSocketPath(socketPath))
  if (!token) return false

  return new Promise<boolean>(resolve => {
    const conn = createConnection(socketPath, () => {
      const ping: UdsMessage = {
        type: 'ping',
        ts: new Date().toISOString(),
        meta: { authToken: token },
      }
      conn.write(jsonStringify(ping) + '\n')
    })

    let resolved = false

    const timer = setTimeout(() => {
      if (!resolved) {
        resolved = true
        conn.destroy()
        resolve(false)
      }
    }, timeoutMs)

    let buffer = ''
    conn.on('data', chunk => {
      if (
        Buffer.byteLength(buffer, 'utf8') + getChunkBytes(chunk) >
        MAX_UDS_FRAME_BYTES
      ) {
        if (!resolved) {
          resolved = true
          clearTimeout(timer)
          conn.destroy()
          resolve(false)
        }
        return
      }
      buffer += chunk.toString()
      if (buffer.includes('"pong"')) {
        if (!resolved) {
          resolved = true
          clearTimeout(timer)
          conn.end()
          resolve(true)
        }
      }
    })

    conn.on('error', () => {
      if (!resolved) {
        resolved = true
        clearTimeout(timer)
        resolve(false)
      }
    })
  })
}

/**
 * Send a text message to a peer's UDS socket. This is the high-level helper
 * used by SendMessageTool for `uds:<path>` addresses.
 */
export async function sendToUdsSocket(
  targetSocketPath: string,
  message: string | Record<string, unknown>,
  timeoutMs = 5000,
): Promise<void> {
  const { parseUdsTarget } = await import('./udsMessaging.js')
  const target = parseUdsTarget(targetSocketPath)
  const authToken = await findAuthTokenForSocketPath(target.socketPath)
  if (!authToken) {
    throw new Error(`No auth token found for peer at ${target.socketPath}`)
  }

  const data = typeof message === 'string' ? message : jsonStringify(message)
  const udsMsg: UdsMessage = {
    type: 'text',
    data,
    ts: new Date().toISOString(),
  }

  // Lazily import to avoid circular dep at module-load time
  const { getUdsMessagingSocketPath } = await import('./udsMessaging.js')
  udsMsg.from = getUdsMessagingSocketPath()

  return new Promise<void>((resolve, reject) => {
    let settled = false
    let conn: ReturnType<typeof createConnection>
    const finish = (error?: Error): void => {
      if (settled) return
      settled = true
      if (error) {
        conn.destroy(error)
        reject(error)
      } else {
        conn.end()
        resolve()
      }
    }

    conn = createConnection(target.socketPath, () => {
      udsMsg.meta = { ...udsMsg.meta, authToken }
      conn.write(jsonStringify(udsMsg) + '\n', err => {
        if (err) finish(err)
      })
    })
    attachUdsResponseReader(conn, {
      maxFrameBytes: MAX_UDS_FRAME_BYTES,
      onSettled: finish,
      formatSocketError: err =>
        new UdsPeerConnectionError(target.socketPath, err),
    })
    conn.setTimeout(timeoutMs, () => {
      finish(
        new UdsPeerConnectionError(
          target.socketPath,
          new Error('Connection timed out'),
        ),
      )
    })
  })
}

/**
 * Connect to a peer and return the raw socket for bidirectional communication.
 * The caller owns the post-connect lifecycle through onSocketError, which is
 * attached before the Promise resolves so peer socket errors cannot be
 * swallowed or surface through a listener handoff window.
 * Pre-connect failures reject with UdsPeerConnectionError.
 * This only opens the transport; callers still own any capability handshake.
 */
export function connectToPeer(
  socketPath: string,
  onSocketError: (error: Error) => void,
  timeoutMs = 5000,
): Promise<Socket> {
  return new Promise<Socket>((resolve, reject) => {
    const conn = createConnection(socketPath)
    let settled = false
    const timeout = setTimeout(
      fail,
      timeoutMs,
      new Error('Connection timed out'),
    )
    function cleanupListeners(): void {
      clearTimeout(timeout)
      conn.off('error', fail)
    }
    function fail(cause: unknown): void {
      if (settled) {
        return
      }
      settled = true
      cleanupListeners()
      conn.destroy()
      reject(new UdsPeerConnectionError(socketPath, cause))
    }
    conn.once('connect', () => {
      if (settled) {
        return
      }
      settled = true
      cleanupListeners()
      conn.on('error', onSocketError)
      resolve(conn)
    })
    conn.on('error', fail)
  })
}

/**
 * Disconnect a previously connected peer socket.
 */
export function disconnectPeer(socket: Socket): void {
  if (!socket.destroyed) {
    socket.end()
  }
}
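
The size-guarded pong detection in isPeerAlive can be sketched as a pure function, which makes the buffering rules easy to test without a socket. This is a sketch under stated assumptions, not the repository code: `feedProbeChunk`, `isPongFrame`, and `ProbeState` are hypothetical names, chunks are assumed to arrive as already-decoded strings, and the reply is assumed to be a newline-delimited JSON object with `type: 'pong'`. Unlike the substring check above, this variant parses complete frames, so a text message that merely contains the word "pong" does not count.

```typescript
import { Buffer } from 'node:buffer'

type ProbeState = { pending: string }
type ProbeResult = 'pending' | 'pong' | 'overflow'

// Assumption: a pong reply is a JSON object whose `type` field is 'pong'.
function isPongFrame(line: string): boolean {
  try {
    const parsed: unknown = JSON.parse(line)
    return (
      typeof parsed === 'object' &&
      parsed !== null &&
      (parsed as { type?: unknown }).type === 'pong'
    )
  } catch {
    // Not valid JSON: treat as a non-pong frame.
    return false
  }
}

// Feed one decoded chunk into the probe state and report what was seen.
function feedProbeChunk(
  state: ProbeState,
  chunk: string,
  maxFrameBytes: number,
): ProbeResult {
  // Refuse to buffer more than the frame limit, mirroring the guard above.
  const buffered = Buffer.byteLength(state.pending, 'utf8')
  if (buffered + Buffer.byteLength(chunk, 'utf8') > maxFrameBytes) {
    return 'overflow'
  }
  state.pending += chunk

  // Consume every complete newline-terminated frame currently buffered.
  let newline = state.pending.indexOf('\n')
  while (newline !== -1) {
    const line = state.pending.slice(0, newline)
    state.pending = state.pending.slice(newline + 1)
    if (isPongFrame(line)) return 'pong'
    newline = state.pending.indexOf('\n')
  }
  return 'pending'
}
```

A caller would keep one `ProbeState` per connection, feed it from the socket's `data` handler, and resolve the probe as soon as the result is anything other than `'pending'`.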