feat: harden autonomy lifecycle, OOM bounds, and provider-boundary finalization

This PR consolidates a coordinated batch of fixes around autonomy run/flow lifecycle, scheduled task deduplication, provider-boundary state finalization, and matching memory-bound treatments for adjacent long-running subsystems (REPL fullscreen scrollback, skill-search/skill-learning runtime activation). All changes were developed and reviewed together because they touched the same lifecycle invariants and were uncovered by the same long-running session reproductions.

## Lifecycle correctness

- Queued autonomy prompts are not injected unless the persisted run was successfully claimed; queued run claiming is now terminal-safe so a once-consumed/cancelled/failed run can not slip back into `queued`.
- Autonomy run/flow finalization happens on completion, provider error, generator close, and cancellation — not just the happy path. New `src/__tests__/queryAutonomyProviderBoundary.test.ts` covers these provider-boundary transitions.
- `requestManagedAutonomyFlowCancel` and `resumeManagedAutonomyFlowPrompt` carry `rootDir` and `currentDir` explicitly across detached async boundaries (proactive-tick, cron, daemon restart) instead of inferring from process state.
- Active runs/flows are protected from janitor pruning so a running step can not be garbage-collected mid-flight (`src/utils/autonomyAuthority.ts`).
- Heartbeat parser now ignores fenced code blocks; the two-phase commit window for autonomy state transitions is documented in `docs/internals/autonomy-jira.md`.

## Ownership and dedup

- `src/utils/autonomyRuns.ts`: ownership stamping (run id + rootDir carried end-to-end), source-based dedup against active runs.
- `src/hooks/useScheduledTasks.ts`: scheduled ticks deduplicate against runs already active on the same source label.
- `src/utils/processUserInput/processSlashCommand.tsx`: forked slash commands now thread the autonomy `runId` so completion finalizers can find the originating run for deferred completion.
- New `src/utils/autonomyQueueLifecycle.ts` and tests collect the queue-side lifecycle invariants in one place.

## Memory bounds (related, same review pass)

- `src/screens/REPL.tsx`: caps fullscreen scrollback after the compact boundary and updates trailing progress rows in place. Long-running fullscreen sessions could otherwise retain thousands of post-compaction messages and duplicate progress rows, keeping Ink trees alive long after their useful context had moved on.
- `src/services/skillSearch/*` and `src/services/skillLearning/*`: runtime activation is strictly opt-in via existing env toggles; session caches are capped so long-running processes can not grow them forever. Build presence is preserved so operators can still discover and opt into the slash commands.

## CI / test contract

- `tests/integration/dependency-overrides.test.ts`: smoke test no longer drives Mermaid's browser renderer; it validates the package-resolution contract directly so CI does not regress on unrelated browser timing.
- New `tests/integration/autonomy-lifecycle-user-flow.test.ts`: end-to-end CLI subprocess flow exercising `status --deep`, `flows`, `flow <id>`, `flow resume`, `flow cancel` against persisted state.
- `src/entrypoints/cli.tsx`: `claude autonomy …` routes through an entrypoint fast path that reuses the slash-command formatter without booting the full interactive CLI. Stdout is flushed before forced exit so coverage subprocesses do not terminate with empty stdout.
- `packages/builtin-tools/src/tools/RemoteTriggerTool/__tests__/RemoteTriggerTool.test.ts`: stabilized to prevent audit flake under coverage.

## Tests added

- `src/__tests__/queryAutonomyProviderBoundary.test.ts`
- `src/hooks/__tests__/useScheduledTasks.test.ts`
- `src/utils/__tests__/autonomyAuthority.test.ts`
- `src/utils/__tests__/autonomyFlows.test.ts` (extended)
- `src/utils/__tests__/autonomyPersistence.test.ts` (extended)
- `src/utils/__tests__/autonomyQueueLifecycle.test.ts`
- `src/utils/__tests__/autonomyRuns.test.ts` (extended)
- `src/utils/processUserInput/__tests__/processSlashCommand.test.ts`
- `tests/integration/autonomy-lifecycle-user-flow.test.ts`

## Docs

- `docs/agent/sur-loop-scheduled-oom.md`: System Understanding Report covering the scheduled/loop OOM problem, the call graphs investigated, and the lifecycle invariants this PR establishes.
- `docs/agent/sur-skill-overflow-bugs.md`: SUR for the related skill-overflow context.
- `docs/internals/autonomy-jira.md`: documents the two-phase commit window and ownership stamping invariants.
- `docs/memory-leak-audit.md`: audit notes covering the REPL/scrollback and skill-search bounds.

## Invariants this PR establishes

1. Queued autonomy prompts are not injected unless the persisted run was successfully claimed.
2. Terminal run/flow states are terminal — completion, failure, and cancellation all finalize state regardless of which provider/error path triggered them.
3. Autonomy run/flow `rootDir` is carried explicitly across detached async boundaries instead of inferred from a shared singleton.
4. State-only CLI subcommands (`autonomy status|runs|flows|flow …`) bypass full interactive bootstrap so they do not hold unrelated handles open.
5. REPL fullscreen scrollback and skill-search/skill-learning session caches are explicitly bounded.

## Validation

```bash
bun run typecheck
CI=true GITHUB_ACTIONS=true bun test            # 3996 pass / 0 fail across 305 files
bun test src/__tests__/queryAutonomyProviderBoundary.test.ts \
         src/hooks/__tests__/useScheduledTasks.test.ts \
         src/utils/__tests__/autonomy{Runs,Flows,Authority,QueueLifecycle,Persistence}.test.ts \
         src/utils/processUserInput/__tests__/processSlashCommand.test.ts \
         tests/integration/autonomy-lifecycle-user-flow.test.ts
```

## Origin

This PR is the consolidated, upstream-targeted version of two fork-side review PRs (fix/loop-scheduled-autonomy-oom and fix/autonomy-lifecycle). The fork-side review history is preserved at https://github.com/amDosion/claude-code-bast/pull/7 . The fork's own internal `chore: keep fork current with upstream` sync commits and the `docs: update contributors` automation are intentionally not included in this PR.

The autonomy CLI handler `rootDir` threading that the fork added (78f64d8a, 98d04ddb) is intentionally omitted here because upstream `a2cfaf91` (fix: 修复 RemoteTriggerTool 和 autonomy 测试的全量运行失败) already performed the equivalent change with an additional `currentDir` option. Keeping the upstream version avoids regressing that improvement.
This commit is contained in:
unraid
2026-04-29 14:04:27 +08:00
parent 4f1649e249
commit f2e9af4927
51 changed files with 4885 additions and 971 deletions

View File

@@ -178,6 +178,19 @@ export type ToolUseContext = {
querySource?: QuerySource
/** Optional callback to get the latest tools (e.g., after MCP servers connect mid-query) */
refreshTools?: () => Tools
/**
* @internal TEST-ONLY ESCAPE HATCH. MUST remain undefined in production.
*
* Allows non-bundled unit-test harnesses to exercise the background
* forked slash command path that production assistant mode gates behind
* `feature('KAIROS')`. Still requires `AppState.kairosEnabled`. This
* field is constructed in-process by trusted application code only;
* no external surface (MCP, plugin, slash command, network) writes to
* `ToolUseContext.options`. Setting this true outside a test bypasses
* the KAIROS feature flag; `processSlashCommand` rejects this flag
* outside `NODE_ENV=test`.
*/
allowBackgroundForkedSlashCommands?: boolean
}
abortController: AbortController
readFileState: FileStateCache

View File

@@ -1,8 +1,18 @@
import { beforeEach, describe, expect, mock, test } from 'bun:test'
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
import { createAbortController } from '../utils/abortController'
import { QueryGuard } from '../utils/QueryGuard'
import { handlePromptSubmit } from '../utils/handlePromptSubmit'
import { getCommandQueue, resetCommandQueue } from '../utils/messageQueueManager'
import {
getCommandQueue,
resetCommandQueue,
} from '../utils/messageQueueManager'
import { cleanupTempDir, createTempDir } from '../../tests/mocks/file-system'
import {
createAutonomyQueuedPrompt,
markAutonomyRunCancelled,
} from '../utils/autonomyRuns'
let tempDirs: string[] = []
function createBaseParams() {
const queryGuard = new QueryGuard()
@@ -28,9 +38,7 @@ function createBaseParams() {
commands: [],
setUserInputOnProcessing: mock((_prompt?: string) => {}),
setAbortController: mock((_abortController: AbortController | null) => {}),
onQuery: mock(
async () => undefined,
) as unknown as (
onQuery: mock(async () => undefined) as unknown as (
...args: unknown[]
) => Promise<void>,
setAppState: mock((_updater: unknown) => {}),
@@ -40,6 +48,13 @@ function createBaseParams() {
describe('handlePromptSubmit', () => {
beforeEach(() => {
resetCommandQueue()
tempDirs = []
})
afterEach(async () => {
for (const tempDir of tempDirs) {
await cleanupTempDir(tempDir)
}
})
test('aborts the current turn when only cancel-interrupt tools are running', async () => {
@@ -118,4 +133,34 @@ describe('handlePromptSubmit', () => {
bridgeOrigin: true,
})
})
test('skips stale autonomy commands in the idle queued path', async () => {
const params = createBaseParams()
const abortController = createAbortController()
const tempDir = await createTempDir('handle-prompt-autonomy-')
tempDirs.push(tempDir)
const command = await createAutonomyQueuedPrompt({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
})
expect(command).not.toBeNull()
await markAutonomyRunCancelled(command!.autonomy!.runId, tempDir)
await handlePromptSubmit({
...params,
input: '',
mode: 'prompt',
pastedContents: {},
abortController,
streamMode: 'normal' as any,
hasInterruptibleToolInProgress: false,
isExternalLoading: false,
queuedCommands: [command!],
})
expect(params.getToolUseContext).not.toHaveBeenCalled()
expect(params.onQuery).not.toHaveBeenCalled()
})
})

View File

@@ -0,0 +1,337 @@
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
import { randomUUID } from 'crypto'
import {
resetStateForTests,
setCwdState,
setOriginalCwd,
setProjectRoot,
} from '../bootstrap/state'
import { query } from '../query'
import { getEmptyToolPermissionContext } from '../Tool'
import type { AssistantMessage } from '../types/message'
import { asSystemPrompt } from '../utils/systemPromptType'
import {
createAssistantAPIErrorMessage,
createUserMessage,
} from '../utils/messages'
import { cleanupTempDir, createTempDir } from '../../tests/mocks/file-system'
import {
enqueue,
getCommandsByMaxPriority,
resetCommandQueue,
} from '../utils/messageQueueManager'
import { getAutonomyFlowById, listAutonomyFlows } from '../utils/autonomyFlows'
import {
getAutonomyRunById,
startManagedAutonomyFlowFromHeartbeatTask,
} from '../utils/autonomyRuns'
let tempDir = ''
let originalProcessCwd = ''
beforeEach(async () => {
originalProcessCwd = process.cwd()
tempDir = await createTempDir('query-autonomy-provider-boundary-')
resetStateForTests()
resetCommandQueue()
setOriginalCwd(tempDir)
setCwdState(tempDir)
setProjectRoot(tempDir)
})
afterEach(async () => {
resetStateForTests()
resetCommandQueue()
if (originalProcessCwd) {
process.chdir(originalProcessCwd)
}
if (tempDir) {
let lastError: unknown
for (let attempt = 0; attempt < 20; attempt++) {
try {
await cleanupTempDir(tempDir)
lastError = undefined
break
} catch (error) {
lastError = error
await new Promise(resolve => setTimeout(resolve, 100))
}
}
if (lastError) {
throw lastError
}
}
})
function createToolUseAssistantMessage(): AssistantMessage {
return {
type: 'assistant',
uuid: randomUUID(),
timestamp: new Date().toISOString(),
requestId: undefined,
message: {
id: 'msg_tool_use',
type: 'message',
role: 'assistant',
model: 'test-model',
stop_reason: 'tool_use',
stop_sequence: null,
usage: {
input_tokens: 1,
output_tokens: 1,
cache_creation_input_tokens: 0,
cache_read_input_tokens: 0,
},
content: [
{
type: 'tool_use',
id: 'toolu_provider_boundary',
name: 'MissingBoundaryTool',
input: {},
},
],
},
} as unknown as AssistantMessage
}
function createToolUseContext(): any {
let inProgressToolUseIds = new Set<string>()
let responseLength = 0
let appState = {
toolPermissionContext: getEmptyToolPermissionContext(),
fastMode: false,
mcp: {
tools: [],
clients: [],
},
effortValue: undefined,
advisorModel: undefined,
sessionHooks: new Map(),
}
return {
options: {
commands: [],
debug: false,
mainLoopModel: 'claude-sonnet-4-5-20250929',
tools: [],
verbose: false,
thinkingConfig: { type: 'disabled' },
mcpClients: [],
mcpResources: {},
isNonInteractiveSession: true,
agentDefinitions: {
activeAgents: [],
allowedAgentTypes: [],
},
},
abortController: new AbortController(),
readFileState: new Map(),
getAppState: () => appState,
setAppState: (updater: (state: any) => any) => {
appState = updater(appState as never)
},
setInProgressToolUseIDs: (updater: (state: Set<string>) => Set<string>) => {
inProgressToolUseIds = updater(inProgressToolUseIds)
},
setResponseLength: (updater: (state: number) => number) => {
responseLength = updater(responseLength)
},
updateFileHistoryState: () => {},
updateAttributionState: () => {},
messages: [],
} as any
}
describe('query autonomy/provider boundary', () => {
test('provider api-error messages fail a consumed autonomy run instead of advancing the flow', async () => {
const previousDisableAttachments =
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = '1'
try {
const command = await startManagedAutonomyFlowFromHeartbeatTask({
task: {
name: 'provider-boundary',
interval: '1h',
prompt: 'Exercise provider boundary',
steps: [
{ name: 'first', prompt: 'First provider-boundary step' },
{ name: 'second', prompt: 'Second provider-boundary step' },
],
},
rootDir: tempDir,
currentDir: tempDir,
priority: 'next',
})
expect(command).not.toBeNull()
enqueue(command!)
const toolUseContext = createToolUseContext()
let callCount = 0
const deps = {
uuid: () => 'query-chain-id',
microcompact: async (messages: unknown[]) => ({ messages }),
autocompact: async () => ({
compactionResult: undefined,
consecutiveFailures: 0,
}),
callModel: async function* () {
callCount += 1
if (callCount === 1) {
yield createToolUseAssistantMessage()
return
}
yield createAssistantAPIErrorMessage({
content: 'API Error: provider unavailable',
apiError: 'api_error',
error: new Error('provider unavailable') as never,
})
},
}
const emitted: any[] = []
const generator = query({
messages: [
createUserMessage({
content: 'start provider-boundary test',
}),
],
systemPrompt: asSystemPrompt([]),
userContext: {},
systemContext: {},
canUseTool: async (_tool, input) => ({
behavior: 'allow',
updatedInput: input,
}),
toolUseContext,
querySource: 'sdk',
maxTurns: 3,
deps: deps as never,
})
let next = await generator.next()
while (!next.done) {
emitted.push(next.value)
next = await generator.next()
}
const [flow] = await listAutonomyFlows(tempDir)
const finalFlow = await getAutonomyFlowById(flow!.flowId, tempDir)
const run = await getAutonomyRunById(command!.autonomy!.runId, tempDir)
expect(next.value.reason).toBe('model_error')
expect(callCount).toBe(2)
expect(
emitted.some(
message =>
message.type === 'attachment' &&
message.attachment.type === 'queued_command',
),
).toBe(true)
expect(run!.status).toBe('failed')
expect(run!.error).toBe('provider api_error')
expect(finalFlow!.status).toBe('failed')
expect(finalFlow!.stateJson!.steps.map(step => step.status)).toEqual([
'failed',
'pending',
])
expect(getCommandsByMaxPriority('later')).toHaveLength(0)
} finally {
if (previousDisableAttachments === undefined) {
delete process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
} else {
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = previousDisableAttachments
}
}
})
test('generator return cancels a consumed autonomy run instead of leaving it running', async () => {
const previousDisableAttachments =
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = '1'
try {
const command = await startManagedAutonomyFlowFromHeartbeatTask({
task: {
name: 'return-boundary',
interval: '1h',
prompt: 'Exercise generator return boundary',
steps: [
{ name: 'first', prompt: 'First return-boundary step' },
{ name: 'second', prompt: 'Second return-boundary step' },
],
},
rootDir: tempDir,
currentDir: tempDir,
priority: 'next',
})
expect(command).not.toBeNull()
enqueue(command!)
const toolUseContext = createToolUseContext()
const deps = {
uuid: () => 'query-chain-id',
microcompact: async (messages: unknown[]) => ({ messages }),
autocompact: async () => ({
compactionResult: undefined,
consecutiveFailures: 0,
}),
callModel: async function* () {
yield createToolUseAssistantMessage()
},
}
const generator = query({
messages: [
createUserMessage({
content: 'start return-boundary test',
}),
],
systemPrompt: asSystemPrompt([]),
userContext: {},
systemContext: {},
canUseTool: async (_tool, input) => ({
behavior: 'allow',
updatedInput: input,
}),
toolUseContext,
querySource: 'sdk',
maxTurns: 3,
deps: deps as never,
})
let sawQueuedAttachment = false
let next = await generator.next()
while (!next.done) {
const message = next.value as any
if (
message.type === 'attachment' &&
message.attachment.type === 'queued_command'
) {
sawQueuedAttachment = true
await generator.return(undefined as never)
break
}
next = await generator.next()
}
const [flow] = await listAutonomyFlows(tempDir)
const finalFlow = await getAutonomyFlowById(flow!.flowId, tempDir)
const run = await getAutonomyRunById(command!.autonomy!.runId, tempDir)
expect(sawQueuedAttachment).toBe(true)
expect(run!.status).toBe('cancelled')
expect(finalFlow!.status).toBe('cancelled')
expect(finalFlow!.stateJson!.steps.map(step => step.status)).toEqual([
'cancelled',
'cancelled',
])
expect(getCommandsByMaxPriority('later')).toHaveLength(0)
} finally {
if (previousDisableAttachments === undefined) {
delete process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
} else {
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = previousDisableAttachments
}
}
})
})

View File

@@ -323,13 +323,16 @@ import { asSessionId } from 'src/types/ids.js'
import {
commitAutonomyQueuedPrompt,
createAutonomyQueuedPrompt,
createAutonomyQueuedPromptIfNoActiveSource,
createProactiveAutonomyCommands,
finalizeAutonomyRunCompleted,
finalizeAutonomyRunFailed,
markAutonomyRunCompleted,
markAutonomyRunCancelled,
markAutonomyRunFailed,
markAutonomyRunRunning,
} from 'src/utils/autonomyRuns.js'
import {
cancelQueuedAutonomyCommands,
claimConsumableQueuedAutonomyCommands,
finalizeAutonomyCommandsForTurn,
} from 'src/utils/autonomyQueueLifecycle.js'
import { prepareAutonomyTurnPrompt } from 'src/utils/autonomyAuthority.js'
import { jsonStringify } from '../utils/slowOperations.js'
import { skillChangeDetector } from '../utils/skills/skillChangeDetector.js'
@@ -1865,17 +1868,26 @@ function runHeadlessStreaming(
currentDir: cwd(),
shouldCreate: () => !inputClosed,
})
if (inputClosed) {
await cancelQueuedAutonomyCommands({ commands })
return
}
for (const command of commands) {
if (inputClosed) {
return
}
enqueue({
...command,
uuid: randomUUID(),
})
}
void run()
})()
})().catch(error => {
logError(error)
logForDebugging(
`[Proactive] failed to create headless tick: ${error}`,
{
level: 'error',
},
)
})
}, 0)
}
: undefined
@@ -1971,17 +1983,24 @@ function runHeadlessStreaming(
// Non-prompt commands (task-notification, orphaned-permission) carry
// side effects or orphanedPermission state, so they process singly.
// Prompt commands greedily collect followers with matching workload.
const batch: QueuedCommand[] = [command]
let batch: QueuedCommand[] = [command]
if (command.mode === 'prompt') {
while (canBatchWith(command, peek(isMainThread))) {
batch.push(dequeue(isMainThread)!)
}
if (batch.length > 1) {
command = {
...command,
value: joinPromptValues(batch.map(c => c.value)),
uuid: batch.findLast(c => c.uuid)?.uuid ?? command.uuid,
}
}
const queuedAutonomyClaim =
await claimConsumableQueuedAutonomyCommands(batch)
batch = queuedAutonomyClaim.attachmentCommands
if (batch.length === 0) {
continue
}
command = batch[0]!
if (command.mode === 'prompt' && batch.length > 1) {
command = {
...command,
value: joinPromptValues(batch.map(c => c.value)),
uuid: batch.findLast(c => c.uuid)?.uuid ?? command.uuid,
}
}
const batchUuids = batch.map(c => c.uuid).filter(u => u !== undefined)
@@ -2120,9 +2139,7 @@ function runHeadlessStreaming(
}
const input = command.value
const autonomyRunIds = batch
.map(item => item.autonomy?.runId)
.filter((runId): runId is string => Boolean(runId))
const claimedAutonomyCommands = queuedAutonomyClaim.claimedCommands
if (structuredIO instanceof RemoteIO && command.mode === 'prompt') {
logEvent('tengu_bridge_message_received', {
@@ -2172,9 +2189,6 @@ function runHeadlessStreaming(
// const-capture: TS loses `while ((command = dequeue()))` narrowing
// inside the closure.
const cmd = command
for (const runId of autonomyRunIds) {
await markAutonomyRunRunning(runId)
}
let lastResultIsError = false
try {
await runWithWorkload(
@@ -2286,35 +2300,39 @@ function runHeadlessStreaming(
},
) // end runWithWorkload
if (lastResultIsError) {
for (const runId of autonomyRunIds) {
await finalizeAutonomyRunFailed({
runId,
error: 'ask() returned an error result',
})
}
await finalizeAutonomyCommandsForTurn({
commands: claimedAutonomyCommands,
outcome: {
type: 'failed',
message: 'ask() returned an error result',
},
currentDir: cwd(),
priority: 'later',
workload: cmd.workload ?? options.workload,
})
} else {
for (const runId of autonomyRunIds) {
const nextCommands = await finalizeAutonomyRunCompleted({
runId,
currentDir: cwd(),
priority: 'later',
workload: cmd.workload ?? options.workload,
const nextCommands = await finalizeAutonomyCommandsForTurn({
commands: claimedAutonomyCommands,
outcome: { type: 'completed' },
currentDir: cwd(),
priority: 'later',
workload: cmd.workload ?? options.workload,
})
for (const nextCommand of nextCommands) {
enqueue({
...nextCommand,
uuid: randomUUID(),
})
for (const nextCommand of nextCommands) {
enqueue({
...nextCommand,
uuid: randomUUID(),
})
}
}
}
} catch (error) {
for (const runId of autonomyRunIds) {
await finalizeAutonomyRunFailed({
runId,
error: String(error),
})
}
await finalizeAutonomyCommandsForTurn({
commands: claimedAutonomyCommands,
outcome: { type: 'failed', error },
currentDir: cwd(),
priority: 'later',
workload: cmd.workload ?? options.workload,
})
throw error
}
@@ -2820,57 +2838,87 @@ function runHeadlessStreaming(
currentDir: cwd(),
workload: WORKLOAD_CRON,
})
if (inputClosed) return
if (inputClosed) {
await markAutonomyRunCancelled(
command.autonomy!.runId,
command.autonomy!.rootDir,
)
return
}
enqueue({
...command,
uuid: randomUUID(),
})
void run()
})()
})().catch(error => {
logError(error)
logForDebugging(
`[ScheduledTasks] failed to enqueue headless task: ${error}`,
{
level: 'error',
},
)
})
},
onFireTask: task => {
if (inputClosed) return
void (async () => {
if (task.agentId) {
const prepared = await prepareAutonomyTurnPrompt({
const command = await createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: task.prompt,
trigger: 'scheduled-task',
currentDir: cwd(),
})
if (inputClosed) return
const command = await commitAutonomyQueuedPrompt({
prepared,
currentDir: cwd(),
sourceId: task.id,
sourceLabel: task.prompt,
workload: WORKLOAD_CRON,
shouldCreate: () => !inputClosed,
})
if (!command) return
if (inputClosed) {
await markAutonomyRunCancelled(
command.autonomy!.runId,
command.autonomy!.rootDir,
)
return
}
await markAutonomyRunFailed(
command.autonomy!.runId,
`No teammate runtime available for scheduled task owner ${task.agentId} in headless mode.`,
command.autonomy!.rootDir,
)
return
}
const prepared = await prepareAutonomyTurnPrompt({
const command = await createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: task.prompt,
trigger: 'scheduled-task',
currentDir: cwd(),
})
if (inputClosed) return
const command = await commitAutonomyQueuedPrompt({
prepared,
currentDir: cwd(),
sourceId: task.id,
sourceLabel: task.prompt,
workload: WORKLOAD_CRON,
shouldCreate: () => !inputClosed,
})
if (inputClosed) return
if (!command) return
if (inputClosed) {
await markAutonomyRunCancelled(
command.autonomy!.runId,
command.autonomy!.rootDir,
)
return
}
enqueue({
...command,
uuid: randomUUID(),
})
void run()
})()
})().catch(error => {
logError(error)
logForDebugging(
`[ScheduledTasks] failed to enqueue headless task ${task.id}: ${error}`,
{
level: 'error',
},
)
})
},
isLoading: () => running || inputClosed,
getJitterConfig: cronJitterConfigModule?.getCronJitterConfig,

View File

@@ -1,5 +1,5 @@
import type { Command } from '../../commands.js'
import { isSkillLearningEnabled } from '../../services/skillLearning/featureCheck.js'
import { isSkillLearningCompiledIn } from '../../services/skillLearning/featureCheck.js'
const skillLearning = {
type: 'local-jsx',
@@ -7,7 +7,10 @@ const skillLearning = {
description: 'Manage skill learning (observe, analyze, evolve)',
argumentHint:
'[start|stop|about|status|ingest|evolve|export|import|prune|promote|projects]',
isEnabled: () => isSkillLearningEnabled(),
// The slash command is visible whenever the subsystem is compiled in.
// Whether the runtime feature is actually doing work is a separate
// concern controlled by `/skill-learning start` (see featureCheck.ts).
isEnabled: () => isSkillLearningCompiledIn(),
isHidden: false,
load: () => import('./skillPanel.js'),
} satisfies Command

View File

@@ -1,10 +1,14 @@
import type { Command } from '../../commands.js'
import { isSkillSearchCompiledIn } from '../../services/skillSearch/featureCheck.js'
const skillSearch = {
type: 'local-jsx',
name: 'skill-search',
description: 'Control automatic skill matching during conversations',
argumentHint: '[start|stop|about|status]',
// Visible whenever the subsystem is compiled in (build flag); runtime
// activation is separate and operator-controlled via /skill-search start.
isEnabled: () => isSkillSearchCompiledIn(),
isHidden: false,
load: () => import('./skillSearchPanel.js'),
} satisfies Command

View File

@@ -30,6 +30,7 @@ interface WorkerState {
failureCount: number
parked: boolean
lastStartTime: number
restartTimer: ReturnType<typeof setTimeout> | null
}
/**
@@ -241,6 +242,7 @@ async function runSupervisor(args: string[]): Promise<void> {
failureCount: 0,
parked: false,
lastStartTime: 0,
restartTimer: null,
},
]
@@ -261,6 +263,10 @@ async function runSupervisor(args: string[]): Promise<void> {
controller.abort()
removeDaemonState()
for (const w of workers) {
if (w.restartTimer) {
clearTimeout(w.restartTimer)
w.restartTimer = null
}
if (w.process && !w.process.killed) {
w.process.kill('SIGTERM')
}
@@ -288,22 +294,30 @@ async function runSupervisor(args: string[]): Promise<void> {
// Wait for all workers to exit
await Promise.all(
workers
.filter(w => w.process && !w.process.killed)
.filter(w => w.process && w.process.exitCode === null)
.map(
w =>
new Promise<void>(resolve => {
if (!w.process) {
if (!w.process || w.process.exitCode !== null) {
resolve()
return
}
w.process.on('exit', () => resolve())
let killTimer: ReturnType<typeof setTimeout> | null = null
w.process.on('exit', () => {
if (killTimer) {
clearTimeout(killTimer)
killTimer = null
}
resolve()
})
// Force kill after grace period
setTimeout(() => {
if (w.process && !w.process.killed) {
killTimer = setTimeout(() => {
if (w.process && w.process.exitCode === null) {
w.process.kill('SIGKILL')
}
resolve()
}, 30_000)
killTimer.unref?.()
}),
),
)
@@ -398,11 +412,13 @@ function spawnWorker(
`[daemon] worker '${worker.kind}' exited (code=${code}, signal=${sig}), restarting in ${worker.backoffMs}ms`,
)
setTimeout(() => {
worker.restartTimer = setTimeout(() => {
worker.restartTimer = null
if (!signal.aborted && !worker.parked) {
spawnWorker(worker, dir, config, signal)
}
}, worker.backoffMs)
worker.restartTimer.unref?.()
// Exponential backoff
worker.backoffMs = Math.min(

View File

@@ -255,6 +255,29 @@ async function main(): Promise<void> {
return
}
// Fast-path for `claude autonomy ...`: state inspection/management commands
// do not need the full interactive CLI bootstrap. The full Commander path
// imports main.tsx and runs root preAction initialization before the autonomy
// action; under coverage/CI that leaves unrelated handles around simple
// state-only subprocess calls.
if (args[0] === 'autonomy') {
profileCheckpoint('cli_autonomy_path')
const { getAutonomyCommandText } = await import(
'../cli/handlers/autonomy.js'
)
const text = await getAutonomyCommandText(args.slice(1).join(' '))
await new Promise<void>((resolve, reject) => {
process.stdout.write(`${text}\n`, error => {
if (error) {
reject(error)
return
}
resolve()
})
})
process.exit(0)
}
// Fast-path for `--bg`/`--background` shortcut → daemon bg.
if (
feature('BG_SESSIONS') &&
@@ -398,4 +421,4 @@ async function main(): Promise<void> {
}
// eslint-disable-next-line custom-rules/no-top-level-side-effects
void main()
await main()

View File

@@ -0,0 +1,80 @@
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
import {
resetStateForTests,
setCwdState,
setOriginalCwd,
setProjectRoot,
} from '../../bootstrap/state'
import { createScheduledTaskQueuedCommand } from '../useScheduledTasks'
import {
listAutonomyRuns,
markAutonomyRunCompleted,
} from '../../utils/autonomyRuns'
import { resetAutonomyAuthorityForTests } from '../../utils/autonomyAuthority'
import { cleanupTempDir, createTempDir } from '../../../tests/mocks/file-system'
let tempDir = ''
beforeEach(async () => {
tempDir = await createTempDir('scheduled-tasks-')
resetStateForTests()
resetAutonomyAuthorityForTests()
setOriginalCwd(tempDir)
setProjectRoot(tempDir)
setCwdState(tempDir)
})
afterEach(async () => {
resetStateForTests()
resetAutonomyAuthorityForTests()
if (tempDir) {
await cleanupTempDir(tempDir)
}
})
describe('createScheduledTaskQueuedCommand', () => {
function createCommandForTest(task: { id: string; prompt: string }) {
return createScheduledTaskQueuedCommand(task, {
rootDir: tempDir,
currentDir: tempDir,
})
}
test('skips a scheduled task when the same source already has an active run', async () => {
const task = {
id: 'cron-1',
prompt: '/loop review the repository',
}
const first = await createCommandForTest(task)
const second = await createCommandForTest(task)
const runs = await listAutonomyRuns(tempDir)
expect(first).not.toBeNull()
expect(second).toBeNull()
expect(runs).toHaveLength(1)
expect(runs[0]).toMatchObject({
trigger: 'scheduled-task',
status: 'queued',
sourceId: 'cron-1',
})
})
test('allows a scheduled task after the previous same-source run completes', async () => {
const task = {
id: 'cron-1',
prompt: '/loop review the repository',
}
const first = await createCommandForTest(task)
expect(first?.autonomy?.runId).toBeDefined()
await markAutonomyRunCompleted(first!.autonomy!.runId, tempDir, 100)
const second = await createCommandForTest(task)
const runs = await listAutonomyRuns(tempDir)
expect(second).not.toBeNull()
expect(runs).toHaveLength(2)
expect(runs.map(run => run.status).sort()).toEqual(['completed', 'queued'])
})
})

View File

@@ -189,12 +189,6 @@ export function useReplBridge(
}
let cancelled = false
// Map of pending bridge permission response handlers, keyed by request_id.
// Defined at useEffect scope so the cleanup function can clear it on unmount.
const pendingPermissionHandlers = new Map<
string,
(response: BridgePermissionResponse) => void
>()
// Capture messages.length now so we don't re-send initial messages
// through writeMessages after the bridge connects.
const initialMessageCount = messages.length
@@ -467,6 +461,13 @@ export function useReplBridge(
}
}
// Map of pending bridge permission response handlers, keyed by request_id.
// Each entry is an onResponse handler waiting for CCR to reply.
const pendingPermissionHandlers = new Map<
string,
(response: BridgePermissionResponse) => void
>()
// Dispatch incoming control_response messages to registered handlers
function handlePermissionResponse(msg: SDKControlResponse): void {
const requestId = msg.response?.request_id
@@ -817,10 +818,6 @@ export function useReplBridge(
return () => {
cancelled = true
// Release all pending permission handlers so their closures (which
// may capture React state/setters) can be GC'd immediately rather
// than waiting for the entire useEffect closure to become unreachable.
pendingPermissionHandlers.clear()
clearTimeout(failureTimeoutRef.current)
failureTimeoutRef.current = undefined
if (handleRef.current) {

View File

@@ -10,13 +10,18 @@ import type { Message } from '../types/message.js'
import { getCwd } from '../utils/cwd.js'
import { getCronJitterConfig } from '../utils/cronJitterConfig.js'
import { createCronScheduler } from '../utils/cronScheduler.js'
import { removeCronTasks } from '../utils/cronTasks.js'
import { createAutonomyQueuedPrompt } from '../utils/autonomyRuns.js'
import { markAutonomyRunFailed } from '../utils/autonomyRuns.js'
import { removeCronTasks, type CronTask } from '../utils/cronTasks.js'
import {
createAutonomyQueuedPrompt,
createAutonomyQueuedPromptIfNoActiveSource,
markAutonomyRunCancelled,
markAutonomyRunFailed,
} from '../utils/autonomyRuns.js'
import { logForDebugging } from '../utils/debug.js'
import { enqueuePendingNotification } from '../utils/messageQueueManager.js'
import { createScheduledTaskFireMessage } from '../utils/messages.js'
import { WORKLOAD_CRON } from '../utils/workloadContext.js'
import type { QueuedCommand } from '../types/textInputTypes.js'
type Props = {
isLoading: boolean
@@ -32,6 +37,32 @@ type Props = {
setMessages: React.Dispatch<React.SetStateAction<Message[]>>
}
export async function createScheduledTaskQueuedCommand(
task: Pick<CronTask, 'id' | 'prompt'>,
options?: {
rootDir?: string
currentDir?: string
shouldCreate?: () => boolean
},
): Promise<QueuedCommand | null> {
const command = await createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: task.prompt,
trigger: 'scheduled-task',
rootDir: options?.rootDir,
currentDir: options?.currentDir ?? getCwd(),
sourceId: task.id,
sourceLabel: task.prompt,
workload: WORKLOAD_CRON,
shouldCreate: options?.shouldCreate,
})
if (!command) {
logForDebugging(
`[ScheduledTasks] skipping ${task.id}: previous run still queued or running`,
)
}
return command
}
/**
* REPL wrapper for the cron scheduler. Mounts the scheduler once and tears
* it down on unmount. Fired prompts go into the command queue as 'later'
@@ -71,16 +102,25 @@ export function useScheduledTasks({
// forward isMeta, so their messages remain visible in the
// transcript. This is acceptable since normal mode is not the
// primary use case for scheduled tasks.
let disposed = false
const enqueueForLead = async (prompt: string) => {
const command = await createAutonomyQueuedPrompt({
basePrompt: prompt,
trigger: 'scheduled-task',
currentDir: getCwd(),
workload: WORKLOAD_CRON,
shouldCreate: () => !disposed,
})
if (!command) {
return
}
if (disposed) {
await markAutonomyRunCancelled(
command.autonomy!.runId,
command.autonomy!.rootDir,
)
return
}
enqueuePendingNotification(command)
}
@@ -90,7 +130,12 @@ export function useScheduledTasks({
// which is populated from disk at scheduler startup — this path only
// handles team-lead durable crons.
onFire: prompt => {
void enqueueForLead(prompt)
void enqueueForLead(prompt).catch(error =>
logForDebugging(
`[ScheduledTasks] failed to enqueue missed task prompt: ${error}`,
{ level: 'error' },
),
)
},
// Normal fires receive the full CronTask so we can route by agentId.
onFireTask: task => {
@@ -101,22 +146,26 @@ export function useScheduledTasks({
store.getState().tasks,
)
if (teammate && !isTerminalTaskStatus(teammate.status)) {
const command = await createAutonomyQueuedPrompt({
basePrompt: task.prompt,
trigger: 'scheduled-task',
currentDir: getCwd(),
sourceId: task.id,
sourceLabel: task.prompt,
workload: WORKLOAD_CRON,
})
const command = await createScheduledTaskQueuedCommand(
task,
{ shouldCreate: () => !disposed },
)
if (!command) {
return
}
if (disposed) {
await markAutonomyRunCancelled(
command.autonomy!.runId,
command.autonomy!.rootDir,
)
return
}
const injected = injectUserMessageToTeammate(
teammate.id,
command.value as string,
{
autonomyRunId: command.autonomy?.runId,
autonomyRootDir: command.autonomy?.rootDir,
origin: command.origin,
},
setAppState,
@@ -125,6 +174,7 @@ export function useScheduledTasks({
await markAutonomyRunFailed(
command.autonomy.runId,
`Teammate ${task.agentId} exited before the scheduled message could be delivered.`,
command.autonomy.rootDir,
)
}
return
@@ -139,24 +189,32 @@ export function useScheduledTasks({
return
}
const command = await createAutonomyQueuedPrompt({
basePrompt: task.prompt,
trigger: 'scheduled-task',
currentDir: getCwd(),
sourceId: task.id,
sourceLabel: task.prompt,
workload: WORKLOAD_CRON,
})
const command = await createScheduledTaskQueuedCommand(
task,
{ shouldCreate: () => !disposed },
)
if (!command) {
return
}
if (disposed) {
await markAutonomyRunCancelled(
command.autonomy!.runId,
command.autonomy!.rootDir,
)
return
}
const msg = createScheduledTaskFireMessage(
`Running scheduled task (${formatCronFireTime(new Date())})`,
)
setMessages(prev => [...prev, msg])
enqueuePendingNotification(command)
})()
})().catch(error =>
logForDebugging(
`[ScheduledTasks] failed to enqueue task ${task.id}: ${error}`,
{ level: 'error' },
),
)
},
isLoading: () => isLoadingRef.current,
assistantMode,
@@ -164,7 +222,10 @@ export function useScheduledTasks({
isKilled: () => !isKairosCronEnabled(),
})
scheduler.start()
return () => scheduler.stop()
return () => {
disposed = true
scheduler.stop()
}
// assistantMode is stable for the session lifetime; store/setAppState are
// stable refs from useSyncExternalStore; setMessages is a stable useCallback.
// eslint-disable-next-line react-hooks/exhaustive-deps

View File

@@ -9,7 +9,9 @@ import { useEffect, useRef } from 'react'
import type { QueuedCommand } from '../types/textInputTypes.js'
import { TICK_TAG } from '../constants/xml.js'
import { getCwd } from '../utils/cwd.js'
import { cancelQueuedAutonomyCommands } from '../utils/autonomyQueueLifecycle.js'
import { createProactiveAutonomyCommands } from '../utils/autonomyRuns.js'
import { logForDebugging } from '../utils/debug.js'
import {
isProactiveActive,
isProactivePaused,
@@ -38,6 +40,8 @@ export function useProactive(opts: UseProactiveOpts): void {
if (!isProactiveActive()) return
let timer: ReturnType<typeof setTimeout> | null = null
let disposed = false
let generating = false
function scheduleTick(): void {
const nextTs = Date.now() + TICK_INTERVAL_MS
@@ -66,25 +70,51 @@ export function useProactive(opts: UseProactiveOpts): void {
isLoading ||
isInPlanMode ||
hasActiveLocalJsxUI ||
queuedCommandsLength > 0
queuedCommandsLength > 0 ||
generating
) {
scheduleTick()
return
}
generating = true
void (async () => {
const commands = await createProactiveAutonomyCommands({
basePrompt: `<${TICK_TAG}>${new Date().toLocaleTimeString()}</${TICK_TAG}>`,
currentDir: getCwd(),
shouldCreate: () => !disposed,
})
for (const command of commands) {
// Always queue proactive turns. This avoids races where the prompt
// is built asynchronously, a user turn starts meanwhile, and a
// direct-submit path would silently drop the autonomy turn after
// consuming its heartbeat due-state.
optsRef.current.onQueueTick(command)
if (disposed) {
await cancelQueuedAutonomyCommands({ commands })
return
}
const queuedCommands: QueuedCommand[] = []
try {
for (const command of commands) {
// Always queue proactive turns. This avoids races where the prompt
// is built asynchronously, a user turn starts meanwhile, and a
// direct-submit path would silently drop the autonomy turn after
// consuming its heartbeat due-state.
optsRef.current.onQueueTick(command)
queuedCommands.push(command)
}
} catch (error) {
await cancelQueuedAutonomyCommands({
commands: commands.filter(
command => !queuedCommands.includes(command),
),
})
throw error
}
})()
.catch(error =>
logForDebugging(`[Proactive] failed to create tick: ${error}`, {
level: 'error',
}),
)
.finally(() => {
generating = false
})
// Schedule next tick
scheduleTick()
@@ -94,6 +124,7 @@ export function useProactive(opts: UseProactiveOpts): void {
scheduleTick()
return () => {
disposed = true
if (timer !== null) {
clearTimeout(timer)
timer = null

View File

@@ -71,10 +71,16 @@ const jobClassifier = feature('TEMPLATES')
: null
/* eslint-enable @typescript-eslint/no-require-imports */
import {
enqueue,
remove as removeFromQueue,
getCommandsByMaxPriority,
isSlashCommand,
} from './utils/messageQueueManager.js'
import {
type AutonomyTurnOutcome,
claimConsumableQueuedAutonomyCommands,
finalizeAutonomyCommandsForTurn,
} from './utils/autonomyQueueLifecycle.js'
import { notifyCommandLifecycle } from './utils/commandLifecycle.js'
import { headlessProfilerCheckpoint } from './utils/headlessProfiler.js'
import {
@@ -92,6 +98,7 @@ import { SLEEP_TOOL_NAME } from '@claude-code-best/builtin-tools/tools/SleepTool
import { executePostSamplingHooks } from './utils/hooks/postSamplingHooks.js'
import { executeStopFailureHooks } from './utils/hooks.js'
import type { QuerySource } from './constants/querySource.js'
import type { QueuedCommand } from './types/textInputTypes.js'
import { createDumpPromptsFetch } from './services/api/dumpPrompts.js'
import { StreamingToolExecutor } from './services/tools/StreamingToolExecutor.js'
import { queryCheckpoint } from './utils/queryProfiler.js'
@@ -111,7 +118,11 @@ import {
} from './bootstrap/state.js'
import { createBudgetTracker, checkTokenBudget } from './query/tokenBudget.js'
import { count } from './utils/array.js'
import { createTrace, endTrace, isLangfuseEnabled } from './services/langfuse/index.js'
import {
createTrace,
endTrace,
isLangfuseEnabled,
} from './services/langfuse/index.js'
import { getAPIProvider } from './utils/model/providers.js'
/* eslint-disable @typescript-eslint/no-require-imports */
@@ -129,7 +140,11 @@ function* yieldMissingToolResultBlocks(
) {
for (const assistantMessage of assistantMessages) {
// Extract all tool use blocks from this assistant message
const toolUseBlocks = (Array.isArray(assistantMessage.message?.content) ? assistantMessage.message.content : []).filter(
const toolUseBlocks = (
Array.isArray(assistantMessage.message?.content)
? assistantMessage.message.content
: []
).filter(
(content: { type: string }) => content.type === 'tool_use',
) as ToolUseBlock[]
@@ -181,6 +196,33 @@ function isWithheldMaxOutputTokens(
return msg?.type === 'assistant' && msg.apiError === 'max_output_tokens'
}
function getAutonomyTurnOutcome(params: {
terminal?: Terminal
thrownError?: unknown
}): AutonomyTurnOutcome {
if (params.thrownError !== undefined) {
return { type: 'failed', error: params.thrownError }
}
const terminal = params.terminal
const reason = terminal?.reason
switch (reason) {
case 'completed':
return { type: 'completed' }
case undefined:
case 'aborted_streaming':
case 'aborted_tools':
return { type: 'cancelled' }
case 'model_error':
return { type: 'failed', error: terminal.error }
default:
return {
type: 'failed',
message: `query ended without successful completion: ${reason}`,
}
}
}
export type QueryParams = {
messages: Message[]
systemPrompt: SystemPrompt
@@ -230,6 +272,7 @@ export async function* query(
Terminal
> {
const consumedCommandUuids: string[] = []
const consumedAutonomyCommands: QueuedCommand[] = []
// Create Langfuse trace for this query turn (no-op if not configured).
// When called as a sub-agent, langfuseTrace is already set by runAgent()
@@ -238,8 +281,9 @@ export async function* query(
logForDebugging(
`[query] ownsTrace=${ownsTrace} incoming langfuseTrace=${params.toolUseContext.langfuseTrace ? 'present' : 'null/undefined'} isLangfuseEnabled=${isLangfuseEnabled()}`,
)
const langfuseTrace = params.toolUseContext.langfuseTrace
?? (isLangfuseEnabled()
const langfuseTrace =
params.toolUseContext.langfuseTrace ??
(isLangfuseEnabled()
? createTrace({
sessionId: getSessionId(),
model: params.toolUseContext.options.mainLoopModel,
@@ -258,9 +302,34 @@ export async function* query(
: params
let terminal: Terminal | undefined
let didThrow = false
let thrownError: unknown
try {
terminal = yield* queryLoop(paramsWithTrace, consumedCommandUuids)
terminal = yield* queryLoop(
paramsWithTrace,
consumedCommandUuids,
consumedAutonomyCommands,
)
} catch (error) {
didThrow = true
thrownError = error
throw error
} finally {
await finalizeAutonomyCommandsForTurn({
commands: consumedAutonomyCommands,
outcome: getAutonomyTurnOutcome({
terminal,
...(didThrow ? { thrownError } : {}),
}),
priority: 'later',
})
.then(nextCommands => {
for (const command of nextCommands) {
enqueue(command)
}
})
.catch(logError)
// Only end the trace if we created it — sub-agents own their traces
if (ownsTrace) {
const isAborted =
@@ -283,6 +352,7 @@ export async function* query(
async function* queryLoop(
params: QueryParams,
consumedCommandUuids: string[],
consumedAutonomyCommands: QueuedCommand[],
): AsyncGenerator<
| StreamEvent
| RequestStartEvent
@@ -790,7 +860,14 @@ async function* queryLoop(
let yieldMessage: typeof message = message
if (message.type === 'assistant') {
const assistantMsg = message as AssistantMessage
const contentArr = Array.isArray(assistantMsg.message?.content) ? assistantMsg.message.content as unknown as Array<{ type: string; input?: unknown; name?: string; [key: string]: unknown }> : []
const contentArr = Array.isArray(assistantMsg.message?.content)
? (assistantMsg.message.content as unknown as Array<{
type: string
input?: unknown
name?: string
[key: string]: unknown
}>)
: []
let clonedContent: typeof contentArr | undefined
for (let i = 0; i < contentArr.length; i++) {
const block = contentArr[i]!
@@ -826,7 +903,10 @@ async function* queryLoop(
if (clonedContent) {
yieldMessage = {
...message,
message: { ...(assistantMsg.message ?? {}), content: clonedContent },
message: {
...(assistantMsg.message ?? {}),
content: clonedContent,
},
} as typeof message
}
}
@@ -872,7 +952,11 @@ async function* queryLoop(
const assistantMessage = message as AssistantMessage
assistantMessages.push(assistantMessage)
const msgToolUseBlocks = (Array.isArray(assistantMessage.message?.content) ? assistantMessage.message.content : []).filter(
const msgToolUseBlocks = (
Array.isArray(assistantMessage.message?.content)
? assistantMessage.message.content
: []
).filter(
(content: { type: string }) => content.type === 'tool_use',
) as ToolUseBlock[]
if (msgToolUseBlocks.length > 0) {
@@ -1005,7 +1089,10 @@ async function* queryLoop(
logEvent('tengu_query_error', {
assistantMessages: assistantMessages.length,
toolUses: assistantMessages.flatMap(_ =>
(Array.isArray(_.message?.content) ? _.message.content as Array<{ type: string }> : []).filter(content => content.type === 'tool_use'),
(Array.isArray(_.message?.content)
? (_.message.content as Array<{ type: string }>)
: []
).filter(content => content.type === 'tool_use'),
).length,
queryChainId: queryChainIdForAnalytics,
@@ -1307,7 +1394,10 @@ async function* queryLoop(
// error → hook blocking → retry → error → …
if (lastMessage?.isApiErrorMessage) {
void executeStopFailureHooks(lastMessage, toolUseContext)
return { reason: 'completed' }
return {
reason: 'model_error',
error: lastMessage.error ?? lastMessage.apiError ?? 'api_error',
}
}
const stopHookResult = yield* handleStopHooks(
@@ -1408,7 +1498,6 @@ async function* queryLoop(
queryCheckpoint('query_tool_execution_start')
if (streamingToolExecutor) {
logEvent('tengu_streaming_tool_execution_used', {
tool_count: toolUseBlocks.length,
@@ -1468,9 +1557,14 @@ async function* queryLoop(
const lastAssistantMessage = assistantMessages.at(-1)
let lastAssistantText: string | undefined
if (lastAssistantMessage) {
const textBlocks = (Array.isArray(lastAssistantMessage.message?.content) ? lastAssistantMessage.message.content as Array<{ type: string; text?: string }> : []).filter(
block => block.type === 'text',
)
const textBlocks = (
Array.isArray(lastAssistantMessage.message?.content)
? (lastAssistantMessage.message.content as Array<{
type: string
text?: string
}>)
: []
).filter(block => block.type === 'text')
if (textBlocks.length > 0) {
const lastTextBlock = textBlocks.at(-1)
if (lastTextBlock && 'text' in lastTextBlock) {
@@ -1622,12 +1716,32 @@ async function* queryLoop(
// user prompts, even if someone stamps an agentId on one.
return cmd.mode === 'task-notification' && cmd.agentId === currentAgentId
})
const queuedAutonomyClaim = await claimConsumableQueuedAutonomyCommands(
queuedCommandsSnapshot,
)
if (queuedAutonomyClaim.staleCommands.length > 0) {
removeFromQueue(queuedAutonomyClaim.staleCommands)
}
const claimedConsumedCommands = queuedAutonomyClaim.claimedCommands.filter(
cmd => cmd.mode === 'prompt' || cmd.mode === 'task-notification',
)
if (claimedConsumedCommands.length > 0) {
consumedAutonomyCommands.push(...claimedConsumedCommands)
for (const cmd of claimedConsumedCommands) {
if (cmd.uuid) {
consumedCommandUuids.push(cmd.uuid)
notifyCommandLifecycle(cmd.uuid, 'started')
}
}
removeFromQueue(claimedConsumedCommands)
}
for await (const attachment of getAttachmentMessages(
null,
updatedToolUseContext,
null,
queuedCommandsSnapshot,
queuedAutonomyClaim.attachmentCommands,
[...messagesForQuery, ...assistantMessages, ...toolResults],
querySource,
)) {
@@ -1659,7 +1773,6 @@ async function* queryLoop(
pendingMemoryPrefetch.consumedOnIteration = turnCount - 1
}
// Inject prefetched skill discovery. collectSkillDiscoveryPrefetch emits
// hidden_by_main_turn — true when the prefetch resolved before this point
// (should be >98% at AKI@250ms / Haiku@573ms vs turn durations of 2-30s).
@@ -1675,8 +1788,11 @@ async function* queryLoop(
// Remove only commands that were actually consumed as attachments.
// Prompt and task-notification commands are converted to attachments above.
const consumedCommands = queuedCommandsSnapshot.filter(
cmd => cmd.mode === 'prompt' || cmd.mode === 'task-notification',
const claimedCommandSet = new Set(claimedConsumedCommands)
const consumedCommands = queuedAutonomyClaim.attachmentCommands.filter(
cmd =>
(cmd.mode === 'prompt' || cmd.mode === 'task-notification') &&
!claimedCommandSet.has(cmd),
)
if (consumedCommands.length > 0) {
for (const cmd of consumedCommands) {

View File

@@ -1,3 +1,20 @@
// Auto-generated stub — replace with real implementation
export type Terminal = any;
export type Continue = any;
export type Terminal =
| { reason: 'completed' }
| { reason: 'blocking_limit' }
| { reason: 'image_error' }
| { reason: 'model_error'; error?: unknown }
| { reason: 'aborted_streaming' }
| { reason: 'aborted_tools' }
| { reason: 'prompt_too_long' }
| { reason: 'stop_hook_prevented' }
| { reason: 'hook_stopped' }
| { reason: 'max_turns'; turnCount: number }
export type Continue =
| { reason: 'collapse_drain_retry'; committed: number }
| { reason: 'reactive_compact_retry' }
| { reason: 'max_output_tokens_escalate' }
| { reason: 'max_output_tokens_recovery'; attempt: number }
| { reason: 'stop_hook_blocking' }
| { reason: 'token_budget_continuation' }
| { reason: 'next_turn' }

View File

@@ -79,10 +79,9 @@ import { isEnvTruthy } from '../utils/envUtils.js';
import { formatTokens, truncateToWidth } from '../utils/format.js';
import { consumeEarlyInput } from '../utils/earlyInput.js';
import {
finalizeAutonomyRunCompleted,
finalizeAutonomyRunFailed,
markAutonomyRunRunning,
} from '../utils/autonomyRuns.js';
claimConsumableQueuedAutonomyCommands,
finalizeAutonomyCommandsForTurn,
} from '../utils/autonomyQueueLifecycle.js';
import { setMemberActive } from '../utils/swarm/teamHelpers.js';
import {
@@ -3054,18 +3053,19 @@ export function REPL({
setMessages(old => {
const postBoundary = getMessagesAfterCompactBoundary(old, {
includeSnipped: true,
})
});
// Hard cap: keep at most 500 messages in fullscreen scrollback
// to prevent unbounded memory growth in multi-day sessions.
// normalizeMessages/applyGrouping are O(n), and Ink fiber
// trees cost ~250KB RSS per message. Without this cap,
// scrollback after several compactions can reach thousands
// of messages (observed: 13k+, 1GB+ heap).
const MAX_FULLSCREEN_SCROLLBACK = 500
const kept = postBoundary.length > MAX_FULLSCREEN_SCROLLBACK
? postBoundary.slice(-MAX_FULLSCREEN_SCROLLBACK)
: postBoundary
return [...kept, newMessage]
const MAX_FULLSCREEN_SCROLLBACK = 500;
const kept =
postBoundary.length > MAX_FULLSCREEN_SCROLLBACK
? postBoundary.slice(-MAX_FULLSCREEN_SCROLLBACK)
: postBoundary;
return [...kept, newMessage];
});
} else {
setMessages(() => [newMessage]);
@@ -3098,13 +3098,10 @@ export function REPL({
// so interleaved non-ephemeral messages caused duplicate progress
// entries to accumulate (observed 13k+ entries in sleep-heavy sessions).
for (let i = oldMessages.length - 1; i >= 0; i--) {
const m = oldMessages[i]!
if (m.type !== 'progress') break
const mData = m.data as Record<string, unknown> | undefined
if (
m.parentToolUseID === newMessage.parentToolUseID &&
mData?.type === newData.type
) {
const m = oldMessages[i]!;
if (m.type !== 'progress') break;
const mData = m.data as Record<string, unknown> | undefined;
if (m.parentToolUseID === newMessage.parentToolUseID && mData?.type === newData.type) {
const copy = oldMessages.slice();
copy[i] = newMessage;
return copy;
@@ -4844,44 +4841,44 @@ export function REPL({
} satisfies QueuedCommand)
: input;
const newAbortController = createAbortController();
setAbortController(newAbortController);
void (async () => {
const claim = await claimConsumableQueuedAutonomyCommands([queuedCommand]);
const command = claim.attachmentCommands[0];
if (!command) return;
// Create a user message with the formatted content (includes XML wrapper)
const userMessage = createUserMessage({
content: queuedCommand.value as string,
isMeta: queuedCommand.isMeta ? true : undefined,
origin: queuedCommand.origin,
});
const newAbortController = createAbortController();
setAbortController(newAbortController);
const autonomyRunId = queuedCommand.autonomy?.runId;
if (autonomyRunId) {
void markAutonomyRunRunning(autonomyRunId);
}
void onQuery([userMessage], newAbortController, true, [], mainLoopModel)
.then(() => {
if (autonomyRunId) {
void finalizeAutonomyRunCompleted({
runId: autonomyRunId,
currentDir: getCwd(),
priority: 'later',
}).then(nextCommands => {
for (const command of nextCommands) {
enqueue(command);
}
});
}
})
.catch((error: unknown) => {
if (autonomyRunId) {
void finalizeAutonomyRunFailed({
runId: autonomyRunId,
error: String(error),
});
}
logError(toError(error));
// Create a user message with the formatted content (includes XML wrapper)
const userMessage = createUserMessage({
content: command.value as string,
isMeta: command.isMeta ? true : undefined,
origin: command.origin,
});
try {
await onQuery([userMessage], newAbortController, true, [], mainLoopModel);
const nextCommands = await finalizeAutonomyCommandsForTurn({
commands: claim.claimedCommands,
outcome: { type: 'completed' },
currentDir: getCwd(),
priority: 'later',
});
for (const nextCommand of nextCommands) {
enqueue(nextCommand);
}
} catch (error: unknown) {
await finalizeAutonomyCommandsForTurn({
commands: claim.claimedCommands,
outcome: { type: 'failed', error },
currentDir: getCwd(),
priority: 'later',
});
logError(toError(error));
}
})().catch((error: unknown) => {
logError(toError(error));
});
return true;
},
[onQuery, mainLoopModel, store],

View File

@@ -7,7 +7,6 @@ import { clearClassifierApprovals } from '../../utils/classifierApprovals.js'
import { resetGetMemoryFilesCache } from '../../utils/claudemd.js'
import { clearSessionMessagesCache } from '../../utils/sessionStorage.js'
import { clearBetaTracingState } from '../../utils/telemetry/betaSessionTracing.js'
import { getLspServerManager } from '../../services/lsp/manager.js'
import { resetMicrocompactState } from './microCompact.js'
/**
@@ -29,7 +28,7 @@ import { resetMicrocompactState } from './microCompact.js'
* pass querySource — undefined is only safe for callers that are
* genuinely main-thread-only (/compact, /clear).
*/
export async function runPostCompactCleanup(querySource?: QuerySource): Promise<void> {
export function runPostCompactCleanup(querySource?: QuerySource): void {
// Subagents (agent:*) run in the same process and share module-level
// state with the main thread. Only reset main-thread module-level state
// (context-collapse, memory file cache) for main-thread compacts.
@@ -75,15 +74,4 @@ export async function runPostCompactCleanup(querySource?: QuerySource): Promise<
)
}
clearSessionMessagesCache()
// Close all LSP-tracked files so servers release state for files no longer
// in the active context after compaction. Best-effort — LSP may not be
// initialized, and closeAllFiles catches per-file errors internally.
try {
const lspManager = getLspServerManager()
if (lspManager) {
await lspManager.closeAllFiles()
}
} catch {
// LSP module may not be available in all environments
}
}

View File

@@ -1,12 +1,36 @@
import { feature } from 'bun:bundle'
export function isSkillLearningEnabled(): boolean {
if (process.env.SKILL_LEARNING_ENABLED === '0') return false
if (process.env.SKILL_LEARNING_ENABLED === '1') return true
if (process.env.FEATURE_SKILL_LEARNING === '0') return false
if (process.env.FEATURE_SKILL_LEARNING === '1') return true
if (feature('SKILL_LEARNING')) {
return true
}
/**
* Build-time presence check: is the `/skill-learning` slash command
* compiled into this build? Used by the command registry's `isEnabled` so
* the command appears in the menu whenever it is buildable. Operators
* activate the subsystem itself via `/skill-learning start`, which flips
* `SKILL_LEARNING_ENABLED=1` and turns the runtime observers on (see
* `isSkillLearningEnabled`).
*/
export function isSkillLearningCompiledIn(): boolean {
if (feature('SKILL_LEARNING')) return true
return false
}
/**
* Runtime activation check: is the skill-learning subsystem actively
* running (toolEvent, runtime, session observers attached, persisting
* observations to disk)? Off by default — the operator must run
* `/skill-learning start` (which sets `SKILL_LEARNING_ENABLED=1`).
*
* Legacy `FEATURE_SKILL_LEARNING=1` is also accepted for backward
* compatibility with operators who set it before the slash-command UX
* landed.
*
* Build-flag gating is intentionally NOT performed here: the command
* registry already gates command compilation on the build flag, and this
* function is only reached from code paths that the build flag has
* already let through. Decoupling keeps the test surface clean (tests
* exercise the env-var contract without needing to mock `bun:bundle`).
*/
export function isSkillLearningEnabled(): boolean {
if (process.env.SKILL_LEARNING_ENABLED === '1') return true
if (process.env.FEATURE_SKILL_LEARNING === '1') return true
return false
}

View File

@@ -45,15 +45,44 @@ export function getProjectContextPath(projectId: string): string {
// in the tool.call hot path (one wrapper invocation per tool) that cost would
// accumulate into the hundreds-of-ms range per session. Cache keyed by the
// exact cwd string so different worktrees still get independent entries.
//
// Bounded with LRU eviction: long-lived processes that traverse many
// worktrees (e.g. multi-repo build orchestrators) would otherwise grow the
// cache without limit. Each entry holds a SkillLearningProjectContext
// (instinct + skill lists), so the cap ensures bounded memory regardless
// of cwd diversity. `defines.ts` originally flagged this as
// "无淘汰机制(非 GB 级主因)" — this fix closes that gap.
const PROJECT_CONTEXT_CACHE_MAX = 32
const PROJECT_CONTEXT_CACHE_TRIM_TO = 24
const contextCache = new Map<string, SkillLearningProjectContext>()
const PERSIST_INTERVAL_MS = 5 * 60 * 1000
let lastPersistAt = 0
function setProjectContextCache(
cwd: string,
ctx: SkillLearningProjectContext,
): void {
if (contextCache.has(cwd)) contextCache.delete(cwd)
contextCache.set(cwd, ctx)
if (contextCache.size > PROJECT_CONTEXT_CACHE_MAX) {
const toDrop = contextCache.size - PROJECT_CONTEXT_CACHE_TRIM_TO
const iter = contextCache.keys()
for (let i = 0; i < toDrop; i++) {
const next = iter.next()
if (next.done) break
contextCache.delete(next.value)
}
}
}
export function resolveProjectContext(
cwd = process.cwd(),
): SkillLearningProjectContext {
const cached = contextCache.get(cwd)
if (cached) {
// Refresh insertion order so frequently-accessed cwds survive eviction.
contextCache.delete(cwd)
contextCache.set(cwd, cached)
// Still touch the registry so long-lived processes keep `lastSeenAt`
// reasonably fresh, but throttle the write so it doesn't fire on every
// tool call.
@@ -65,7 +94,7 @@ export function resolveProjectContext(
return cached
}
const resolved = resolveContext(cwd)
contextCache.set(cwd, resolved)
setProjectContextCache(cwd, resolved)
persistProjectContext(resolved)
lastPersistAt = Date.now()
return resolved

View File

@@ -23,8 +23,30 @@ export type PromotionOptions = {
minConfidence?: number
}
/**
* Set bounded with FIFO eviction. # promotions per session is small in
* practice (single digits), but a long-lived sandbox/daemon could push
* this if it never restarts. The cap is defensive and the degraded
* behaviour — re-promote if we exceed N then forget the oldest — is
* benign because promotion is idempotent at the lifecycle layer.
*/
const SESSION_PROMOTED_IDS_MAX = 256
const SESSION_PROMOTED_IDS_TRIM_TO = 192
const sessionPromotedIds = new Set<string>()
function recordSessionPromoted(id: string): void {
sessionPromotedIds.add(id)
if (sessionPromotedIds.size > SESSION_PROMOTED_IDS_MAX) {
const toDrop = sessionPromotedIds.size - SESSION_PROMOTED_IDS_TRIM_TO
const iter = sessionPromotedIds.values()
for (let i = 0; i < toDrop; i++) {
const next = iter.next()
if (next.done) break
sessionPromotedIds.delete(next.value)
}
}
}
export function resetPromotionBookkeeping(): void {
sessionPromotedIds.clear()
}
@@ -103,7 +125,7 @@ export async function checkPromotion(
}
await saveInstinct(globalInstinct, globalOptions)
sessionPromotedIds.add(candidate.instinctId)
recordSessionPromoted(candidate.instinctId)
promoted.push(candidate)
}

View File

@@ -1,10 +1,30 @@
import { feature } from 'bun:bundle'
export function isSkillSearchEnabled(): boolean {
if (process.env.SKILL_SEARCH_ENABLED === '0') return false
if (process.env.SKILL_SEARCH_ENABLED === '1') return true
if (feature('EXPERIMENTAL_SKILL_SEARCH')) {
return true
}
/**
* Build-time presence check: is the `/skill-search` slash command compiled
* into this build? Used by the command registry's `isEnabled` so the
* command appears in the menu whenever it is buildable. Operators activate
* the subsystem itself via `/skill-search start`, which flips
* `SKILL_SEARCH_ENABLED=1` and turns the runtime hot paths on (see
* `isSkillSearchEnabled`).
*/
export function isSkillSearchCompiledIn(): boolean {
if (feature('EXPERIMENTAL_SKILL_SEARCH')) return true
return false
}
/**
* Runtime activation check: is the skill-search subsystem currently doing
* work (intentNormalize Haiku calls, prefetch hot path, telemetry)? Off by
* default — the operator must run `/skill-search start` (which sets
* `SKILL_SEARCH_ENABLED=1`). See docs/agent/sur-skill-overflow-bugs.md §5.
*
* Build-flag gating is intentionally NOT performed here: the command
* registry already gates command compilation on the build flag, and this
* function is only reached from code paths that the build flag has
* already let through. Decoupling keeps the test surface clean (tests
* exercise the env-var contract without needing to mock `bun:bundle`).
*/
export function isSkillSearchEnabled(): boolean {
return process.env.SKILL_SEARCH_ENABLED === '1'
}

View File

@@ -47,10 +47,35 @@ Output ONLY keywords. Nothing else.`
const DEFAULT_TIMEOUT_MS = 6_000
const MAX_QUERY_CHARS = 500
const MAX_KEYWORDS_CHARS = 120
/**
* Bound on the process-level query→keywords cache. Insertion-order LRU —
* Map iteration order is insertion order, so we evict from the front when
* size exceeds the cap. ~200 entries × ~600 bytes (query + keywords) ≈
* 120 KB worst case. Without this cap the cache grew monotonically with
* the diversity of Chinese queries in a long session.
*/
const CACHE_MAX_ENTRIES = 200
const CACHE_TRIM_TO = 150
/** Process-level cache. Keyed by the original (trimmed) query. */
const cache = new Map<string, string>()
function setCachedQueryIntent(key: string, value: string): void {
// Refresh insertion order on hit-then-write so frequently-used keys
// stay alive (delete + set is the canonical Map-LRU idiom).
if (cache.has(key)) cache.delete(key)
cache.set(key, value)
if (cache.size > CACHE_MAX_ENTRIES) {
const toDrop = cache.size - CACHE_TRIM_TO
const iter = cache.keys()
for (let i = 0; i < toDrop; i++) {
const next = iter.next()
if (next.done) break
cache.delete(next.value)
}
}
}
export function isIntentNormalizeEnabled(): boolean {
return process.env.SKILL_SEARCH_INTENT_ENABLED === '1'
}
@@ -74,12 +99,17 @@ export async function normalizeQueryIntent(query: string): Promise<string> {
if (!/[\u4e00-\u9fff]/.test(trimmed)) return trimmed
const cached = cache.get(trimmed)
if (cached !== undefined) return cached
if (cached !== undefined) {
// Refresh LRU position so frequently-queried strings survive eviction.
cache.delete(trimmed)
cache.set(trimmed, cached)
return cached
}
const capped = trimmed.slice(0, MAX_QUERY_CHARS)
const keywords = await callHaiku(capped)
const result = keywords ? `${trimmed} ${keywords}` : trimmed
cache.set(trimmed, result)
setCachedQueryIntent(trimmed, result)
logForDebugging(
`[skill-search] intent normalized: "${trimmed.slice(0, 40)}" -> "${keywords}"`,
)

View File

@@ -14,9 +14,35 @@ import { readFile } from 'node:fs/promises'
import { join } from 'node:path'
import { parseFrontmatter } from '../../utils/frontmatterParser.js'
/**
* Per-session memoization to avoid re-emitting the same skill discovery /
* gap signal twice. Each Set is bounded to keep long-running sessions from
* monotonically accumulating skill names and signal keys forever (which
* was the original session-scoped-but-unbounded design).
*
* FIFO eviction by insertion order — once the cap is hit, the oldest
* entries roll off and may be re-recorded if rediscovered, which is the
* correct degraded behaviour: at worst we re-emit a duplicate signal,
* never silently drop a real one.
*/
const SESSION_TRACKING_MAX = 1000
const SESSION_TRACKING_TRIM_TO = 750
const discoveredThisSession = new Set<string>()
const recordedGapSignals = new Set<string>()
function addBoundedSessionEntry(set: Set<string>, value: string): void {
set.add(value)
if (set.size > SESSION_TRACKING_MAX) {
const toDrop = set.size - SESSION_TRACKING_TRIM_TO
const iter = set.values()
for (let i = 0; i < toDrop; i++) {
const next = iter.next()
if (next.done) break
set.delete(next.value)
}
}
}
const AUTO_LOAD_MIN_SCORE = Number(
process.env.SKILL_SEARCH_AUTOLOAD_MIN_SCORE ?? '0.30',
)
@@ -185,7 +211,7 @@ async function maybeRecordSkillGap(
const gapSignalKey = `${trigger}:${queryText.trim().toLowerCase()}`
if (recordedGapSignals.has(gapSignalKey)) return undefined
recordedGapSignals.add(gapSignalKey)
addBoundedSessionEntry(recordedGapSignals, gapSignalKey)
try {
const [{ isSkillLearningEnabled }, { recordSkillGap }] = await Promise.all([
@@ -241,7 +267,7 @@ export async function startSkillDiscoveryPrefetch(
const newResults = results.filter(r => !discoveredThisSession.has(r.name))
if (newResults.length === 0) return []
for (const r of newResults) discoveredThisSession.add(r.name)
for (const r of newResults) addBoundedSessionEntry(discoveredThisSession, r.name)
const signal: DiscoverySignal = {
trigger: 'assistant_turn',
@@ -305,7 +331,7 @@ export async function getTurnZeroSkillDiscovery(
if (results.length === 0 && !gap) return null
for (const r of results) discoveredThisSession.add(r.name)
for (const r of results) addBoundedSessionEntry(discoveredThisSession, r.name)
const signal: DiscoverySignal = {
trigger: 'user_input',

View File

@@ -73,6 +73,7 @@ export function injectUserMessageToTeammate(
options:
| {
autonomyRunId?: string;
autonomyRootDir?: string;
origin?: MessageOrigin;
}
| undefined,
@@ -93,6 +94,9 @@ export function injectUserMessageToTeammate(
if (options?.autonomyRunId !== undefined) {
pendingMessage.autonomyRunId = options.autonomyRunId;
}
if (options?.autonomyRootDir !== undefined) {
pendingMessage.autonomyRootDir = options.autonomyRootDir;
}
if (options?.origin !== undefined) {
pendingMessage.origin = options.origin;
}

View File

@@ -22,6 +22,7 @@ export type TeammateIdentity = {
export type PendingTeammateUserMessage = {
message: string
autonomyRunId?: string
autonomyRootDir?: string
origin?: MessageOrigin
}

View File

@@ -361,6 +361,7 @@ export type QueuedCommand = {
*/
autonomy?: {
runId: string
rootDir?: string
trigger: 'scheduled-task' | 'proactive-tick' | 'managed-flow-step'
sourceId?: string
sourceLabel?: string

View File

@@ -5,6 +5,7 @@ import {
AUTONOMY_DIR,
buildAutonomyTurnPrompt,
loadAutonomyAuthority,
parseHeartbeatAuthorityTasks,
resetAutonomyAuthorityForTests,
} from '../autonomyAuthority'
import {
@@ -238,4 +239,79 @@ describe('autonomyAuthority', () => {
expect(prompt).not.toContain('- weekly-report (7d): Ship the weekly report')
expect(prompt).not.toContain('- gather (')
})
test('parseHeartbeatAuthorityTasks ignores tasks: literals inside markdown code fences', () => {
const content = [
'# HEARTBEAT.md',
'',
'```yaml',
'tasks:',
' - name: not-a-real-task',
' interval: 1m',
' prompt: "would-be-shadowed"',
'```',
'',
'tasks:',
' - name: real-task',
' interval: 30m',
' prompt: "Real prompt"',
].join('\n')
const parsed = parseHeartbeatAuthorityTasks(content)
expect(parsed).toHaveLength(1)
expect(parsed[0]).toMatchObject({
name: 'real-task',
interval: '30m',
prompt: 'Real prompt',
})
})
test('parseHeartbeatAuthorityTasks ignores tasks: literals inside tilde markdown code fences', () => {
const content = [
'# HEARTBEAT.md',
'',
'~~~yaml',
'tasks:',
' - name: not-a-real-task',
' interval: 1m',
' prompt: "would-be-shadowed"',
'~~~',
'',
'tasks:',
' - name: real-task',
' interval: 30m',
' prompt: "Real prompt"',
].join('\n')
const parsed = parseHeartbeatAuthorityTasks(content)
expect(parsed).toHaveLength(1)
expect(parsed[0]).toMatchObject({
name: 'real-task',
interval: '30m',
prompt: 'Real prompt',
})
})
test('parseHeartbeatAuthorityTasks parses real tasks even when documentation precedes them', () => {
const content = [
'# Heartbeat docs',
'',
'See `tasks:` below — the parser keys on the literal at column 0.',
'',
'tasks:',
' - name: weekly',
' interval: 7d',
' prompt: "Ship report"',
].join('\n')
const parsed = parseHeartbeatAuthorityTasks(content)
// Inline `tasks:` mention does NOT collide because it's not at column 0
// on its own line — the existing line.trim() === 'tasks:' guard handles
// that case. This test pins the behaviour.
expect(parsed).toHaveLength(1)
expect(parsed[0]?.name).toBe('weekly')
})
})

View File

@@ -126,6 +126,14 @@ describe('listAutonomyFlows', () => {
runCount: 0,
ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
currentDir: tempDir,
boundary: [
' src/utils/** ',
'/absolute/not-allowed',
'src\\windows',
'../outside',
'src/utils/**',
'docs/*.md',
],
stateJson: {
currentStepIndex: 0,
steps: [
@@ -147,6 +155,7 @@ describe('listAutonomyFlows', () => {
expect(flows).toHaveLength(1)
expect(flows[0]?.flowId).toBe('flow-1')
expect(flows[0]?.syncMode).toBe('managed')
expect(flows[0]?.boundary).toEqual(['src/utils/**', 'docs/*.md'])
expect(flows[0]?.stateJson?.steps).toHaveLength(1)
})
@@ -191,6 +200,64 @@ describe('listAutonomyFlows', () => {
const flows = await listAutonomyFlows(tempDir)
expect(flows).toEqual([])
})
test('persistence pruning keeps active flows ahead of recent terminal history', async () => {
const flows: AutonomyFlowRecord[] = [
{
flowId: 'old-active',
flowKey: 'managed:scheduled-task:old-active',
syncMode: 'managed',
ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
revision: 1,
trigger: 'scheduled-task',
status: 'queued',
goal: 'old active',
rootDir: tempDir,
currentDir: tempDir,
runCount: 0,
createdAt: 1,
updatedAt: 1,
},
...Array.from({ length: 100 }, (_, index) => ({
flowId: `history-${index}`,
flowKey: `managed:scheduled-task:history-${index}`,
syncMode: 'managed' as const,
ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
revision: 1,
trigger: 'scheduled-task' as const,
status: 'succeeded' as const,
goal: `history ${index}`,
rootDir: tempDir,
currentDir: tempDir,
runCount: 1,
createdAt: 1_000 + index,
updatedAt: 1_000 + index,
endedAt: 2_000 + index,
})),
]
const flowsPath = resolveAutonomyFlowsPath(tempDir)
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
await writeFile(
flowsPath,
`${JSON.stringify({ flows }, null, 2)}\n`,
'utf-8',
)
await startManagedAutonomyFlow({
trigger: 'scheduled-task',
goal: 'fresh active',
steps: TWO_STEPS,
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'fresh-active',
nowMs: 9_999,
})
const persisted = await listAutonomyFlows(tempDir)
expect(persisted).toHaveLength(100)
expect(persisted.some(flow => flow.flowId === 'old-active')).toBe(true)
expect(persisted.some(flow => flow.flowId === 'history-0')).toBe(false)
})
})
describe('startManagedAutonomyFlow', () => {
@@ -225,6 +292,49 @@ describe('startManagedAutonomyFlow', () => {
expect(result!.nextStep!.step.name).toBe('gather')
})
test('normalizes and preserves boundary across completed flow restarts', async () => {
const first = await startManagedAutonomyFlow({
trigger: 'scheduled-task',
goal: 'Scoped flow',
steps: [{ name: 'only', prompt: 'Do it' }],
rootDir: tempDir,
sourceId: 'scoped-src',
boundary: [' src/utils/** ', 'src\\bad', '/absolute', 'docs/*.md'],
nowMs: 1000,
})
const flowId = first!.flow.flowId
expect(first!.flow.boundary).toEqual(['src/utils/**', 'docs/*.md'])
await queueManagedAutonomyFlowStepRun({
flowId,
stepId: first!.nextStep!.step.stepId,
stepIndex: 0,
runId: 'run-1',
rootDir: tempDir,
nowMs: 2000,
})
await markManagedAutonomyFlowStepCompleted({
flowId,
runId: 'run-1',
rootDir: tempDir,
nowMs: 3000,
})
const restarted = await startManagedAutonomyFlow({
trigger: 'scheduled-task',
goal: 'Scoped flow',
steps: [{ name: 'only', prompt: 'Do it again' }],
rootDir: tempDir,
sourceId: 'scoped-src',
nowMs: 4000,
})
expect(restarted!.started).toBe(true)
expect(restarted!.flow.flowId).toBe(flowId)
expect(restarted!.flow.boundary).toEqual(['src/utils/**', 'docs/*.md'])
})
test('sets status=waiting when first step has waitFor', async () => {
const result = await startManagedAutonomyFlow({
trigger: 'scheduled-task',

View File

@@ -54,6 +54,25 @@ describe('withAutonomyPersistenceLock', () => {
).rejects.toThrow('inner failure')
})
test('releases same-root lock bookkeeping after success and failure', async () => {
const {
getAutonomyPersistenceLockCountForTests,
withAutonomyPersistenceLock,
} = await import('../autonomyPersistence')
expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
await withAutonomyPersistenceLock(tempDir, async () => 'ok')
expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
await expect(
withAutonomyPersistenceLock(tempDir, async () => {
throw new Error('inner failure')
}),
).rejects.toThrow('inner failure')
expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
})
test('serializes concurrent calls on the same rootDir', async () => {
const { withAutonomyPersistenceLock } = await import(
'../autonomyPersistence'

View File

@@ -0,0 +1,279 @@
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
import { createTempDir, cleanupTempDir } from '../../../tests/mocks/file-system'
import { getAttachmentMessages } from '../attachments'
import {
createAutonomyQueuedPrompt,
createProactiveAutonomyCommands,
getAutonomyRunById,
markAutonomyRunCancelled,
startManagedAutonomyFlowFromHeartbeatTask,
} from '../autonomyRuns'
import { getAutonomyFlowById, listAutonomyFlows } from '../autonomyFlows'
import {
cancelQueuedAutonomyCommands,
claimConsumableQueuedAutonomyCommands,
finalizeAutonomyCommandsForTurn,
partitionConsumableQueuedAutonomyCommands,
} from '../autonomyQueueLifecycle'
import {
enqueue,
getCommandsByMaxPriority,
remove as removeFromQueue,
resetCommandQueue,
} from '../messageQueueManager'
let tempDir = ''
let extraTempDirs: string[] = []
beforeEach(async () => {
tempDir = await createTempDir('autonomy-queue-lifecycle-')
extraTempDirs = []
resetCommandQueue()
})
afterEach(async () => {
resetCommandQueue()
if (tempDir) {
await cleanupTempDir(tempDir)
}
for (const extraTempDir of extraTempDirs) {
await cleanupTempDir(extraTempDir)
}
})
describe('autonomyQueueLifecycle', () => {
async function consumeQueuedAutonomyAttachmentTurn() {
const previousDisableAttachments =
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = '1'
try {
const snapshot = getCommandsByMaxPriority('later')
const claim = await claimConsumableQueuedAutonomyCommands(
snapshot,
tempDir,
)
removeFromQueue(claim.staleCommands)
removeFromQueue(claim.claimedCommands)
const attachments = []
for await (const attachment of getAttachmentMessages(
null,
{} as never,
null,
claim.attachmentCommands,
[],
)) {
attachments.push(attachment)
}
const consumedCommands = claim.attachmentCommands.filter(
command =>
(command.mode === 'prompt' || command.mode === 'task-notification') &&
!claim.claimedCommands.includes(command),
)
removeFromQueue(consumedCommands)
const nextCommands = await finalizeAutonomyCommandsForTurn({
commands: claim.claimedCommands,
outcome: { type: 'completed' },
currentDir: tempDir,
priority: 'later',
})
for (const command of nextCommands) {
enqueue(command)
}
return { attachments, runningRunIds: claim.claimedRunIds, nextCommands }
} finally {
if (previousDisableAttachments === undefined) {
delete process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
} else {
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = previousDisableAttachments
}
}
}
test('filters stale autonomy commands before mid-turn attachment consumption', async () => {
const command = await createAutonomyQueuedPrompt({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
})
expect(command).not.toBeNull()
const initial = await partitionConsumableQueuedAutonomyCommands(
[command!],
tempDir,
)
expect(initial.attachmentCommands).toHaveLength(1)
expect(initial.staleCommands).toHaveLength(0)
await markAutonomyRunCancelled(command!.autonomy!.runId, tempDir)
const afterCancel = await partitionConsumableQueuedAutonomyCommands(
[command!],
tempDir,
)
expect(afterCancel.attachmentCommands).toHaveLength(0)
expect(afterCancel.staleCommands).toHaveLength(1)
})
test('cancels proactive commands that are created but dropped before enqueue', async () => {
const commands = await createProactiveAutonomyCommands({
basePrompt: '<tick>12:00:00</tick>',
rootDir: tempDir,
currentDir: tempDir,
})
expect(commands).toHaveLength(1)
const queuedRun = await getAutonomyRunById(
commands[0]!.autonomy!.runId,
tempDir,
)
expect(queuedRun!.status).toBe('queued')
await cancelQueuedAutonomyCommands({ commands, rootDir: tempDir })
const cancelledRun = await getAutonomyRunById(
commands[0]!.autonomy!.runId,
tempDir,
)
expect(cancelledRun!.status).toBe('cancelled')
})
test('uses command rootDir when claiming after project context changes', async () => {
const otherProjectDir = await createTempDir('autonomy-other-project-')
extraTempDirs.push(otherProjectDir)
const command = await createAutonomyQueuedPrompt({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
})
expect(command).not.toBeNull()
expect(command!.autonomy?.rootDir).toBe(tempDir)
const claim = await claimConsumableQueuedAutonomyCommands(
[command!],
otherProjectDir,
)
const originalRun = await getAutonomyRunById(
command!.autonomy!.runId,
tempDir,
)
const wrongProjectRun = await getAutonomyRunById(
command!.autonomy!.runId,
otherProjectDir,
)
expect(claim.claimedRunIds).toEqual([command!.autonomy!.runId])
expect(claim.attachmentCommands).toHaveLength(1)
expect(originalRun!.status).toBe('running')
expect(wrongProjectRun).toBeNull()
})
test('advances a managed flow consumed as a queued attachment', async () => {
const command = await startManagedAutonomyFlowFromHeartbeatTask({
task: {
name: 'weekly-report',
interval: '7d',
prompt: 'Ship the weekly report',
steps: [
{ name: 'gather', prompt: 'Gather weekly inputs' },
{ name: 'draft', prompt: 'Draft weekly report' },
],
},
rootDir: tempDir,
currentDir: tempDir,
})
expect(command).not.toBeNull()
const claim = await claimConsumableQueuedAutonomyCommands(
[command!],
tempDir,
)
const runningRunIds = claim.claimedRunIds
expect(runningRunIds).toEqual([command!.autonomy!.runId])
const nextCommands = await finalizeAutonomyCommandsForTurn({
commands: claim.claimedCommands,
outcome: { type: 'completed' },
currentDir: tempDir,
priority: 'later',
})
const [flow] = await listAutonomyFlows(tempDir)
const detail = await getAutonomyFlowById(flow!.flowId, tempDir)
const run = await getAutonomyRunById(command!.autonomy!.runId, tempDir)
expect(run!.status).toBe('completed')
expect(nextCommands).toHaveLength(1)
expect(nextCommands[0]!.autonomy?.flowId).toBe(flow!.flowId)
expect(detail!.stateJson!.steps.map(step => step.status)).toEqual([
'completed',
'queued',
])
})
test('keeps managed autonomy flow coherent across queued attachment turns', async () => {
const firstCommand = await startManagedAutonomyFlowFromHeartbeatTask({
task: {
name: 'weekly-report',
interval: '7d',
prompt: 'Ship the weekly report',
steps: [
{ name: 'gather', prompt: 'Gather weekly inputs' },
{ name: 'draft', prompt: 'Draft weekly report' },
],
},
rootDir: tempDir,
currentDir: tempDir,
})
expect(firstCommand).not.toBeNull()
enqueue(firstCommand!)
const firstTurn = await consumeQueuedAutonomyAttachmentTurn()
const queuedAfterFirstTurn = getCommandsByMaxPriority('later')
const [flowAfterFirstTurn] = await listAutonomyFlows(tempDir)
const firstRun = await getAutonomyRunById(
firstCommand!.autonomy!.runId,
tempDir,
)
expect(firstTurn.attachments).toHaveLength(1)
expect(firstTurn.attachments[0]!.attachment?.type).toBe('queued_command')
expect(firstTurn.runningRunIds).toEqual([firstCommand!.autonomy!.runId])
expect(firstTurn.nextCommands).toHaveLength(1)
expect(queuedAfterFirstTurn).toHaveLength(1)
expect(queuedAfterFirstTurn[0]!.autonomy?.flowId).toBe(
flowAfterFirstTurn!.flowId,
)
expect(firstRun!.status).toBe('completed')
expect(
flowAfterFirstTurn!.stateJson!.steps.map(step => step.status),
).toEqual(['completed', 'queued'])
const secondCommand = queuedAfterFirstTurn[0]!
const secondTurn = await consumeQueuedAutonomyAttachmentTurn()
const queuedAfterSecondTurn = getCommandsByMaxPriority('later')
const finalFlow = await getAutonomyFlowById(
flowAfterFirstTurn!.flowId,
tempDir,
)
const secondRun = await getAutonomyRunById(
secondCommand.autonomy!.runId,
tempDir,
)
expect(secondTurn.attachments).toHaveLength(1)
expect(secondTurn.runningRunIds).toEqual([secondCommand.autonomy!.runId])
expect(secondTurn.nextCommands).toHaveLength(0)
expect(queuedAfterSecondTurn).toHaveLength(0)
expect(secondRun!.status).toBe('completed')
expect(finalFlow!.status).toBe('succeeded')
expect(finalFlow!.stateJson!.steps.map(step => step.status)).toEqual([
'completed',
'completed',
])
})
})

View File

@@ -1,6 +1,7 @@
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
import { existsSync, readFileSync } from 'fs'
import { mkdir, writeFile } from 'fs/promises'
import { join } from 'path'
import { join, resolve as resolvePath } from 'path'
import {
resetStateForTests,
setCwdState,
@@ -8,17 +9,23 @@ import {
setProjectRoot,
} from '../../bootstrap/state'
import {
createAutonomyRun,
formatAutonomyRunsList,
formatAutonomyRunsStatus,
listAutonomyRuns,
createAutonomyQueuedPrompt,
createAutonomyQueuedPromptIfNoActiveSource,
createProactiveAutonomyCommands,
finalizeAutonomyRunCompleted,
getAutonomyRunById,
hasActiveAutonomyRunForSource,
markAutonomyRunCompleted,
markAutonomyRunCancelled,
markAutonomyRunFailed,
markAutonomyRunRunning,
recoverManagedAutonomyFlowPrompt,
resolveAutonomyRunsPath,
STALE_ACTIVE_RUN_ERROR_PREFIX,
startManagedAutonomyFlowFromHeartbeatTask,
} from '../autonomyRuns'
import {
@@ -95,7 +102,9 @@ describe('autonomyRuns', () => {
ownerKey: 'main-thread',
sourceId: 'cron-1',
sourceLabel: 'nightly-report',
ownerProcessId: process.pid,
})
expect(runs[0]?.ownerSessionId).toBeString()
expect(flows).toHaveLength(0)
expect(resolveAutonomyRunsPath(tempDir)).toContain('.claude')
})
@@ -118,7 +127,7 @@ describe('autonomyRuns', () => {
expect(command!.value).toContain('nested authority')
})
test('markAutonomyRunRunning/completed/failed update persisted lifecycle state for plain runs', async () => {
test('markAutonomyRunRunning/completed update persisted lifecycle state for plain runs', async () => {
const command = await createAutonomyQueuedPrompt({
basePrompt: '<tick>12:00:00</tick>',
trigger: 'proactive-tick',
@@ -134,7 +143,9 @@ describe('autonomyRuns', () => {
runId,
status: 'running',
startedAt: 100,
ownerProcessId: process.pid,
})
expect(runs[0]?.ownerSessionId).toBeString()
await markAutonomyRunCompleted(runId, tempDir, 200)
runs = await listAutonomyRuns(tempDir)
@@ -143,9 +154,22 @@ describe('autonomyRuns', () => {
status: 'completed',
endedAt: 200,
})
})
test('markAutonomyRunFailed updates a non-terminal run', async () => {
const command = await createAutonomyQueuedPrompt({
basePrompt: '<tick>12:00:00</tick>',
trigger: 'proactive-tick',
rootDir: tempDir,
currentDir: tempDir,
})
expect(command).not.toBeNull()
const runId = command!.autonomy!.runId
await markAutonomyRunRunning(runId, tempDir, 100)
await markAutonomyRunFailed(runId, 'boom', tempDir, 300)
runs = await listAutonomyRuns(tempDir)
const runs = await listAutonomyRuns(tempDir)
expect(runs[0]).toMatchObject({
runId,
status: 'failed',
@@ -154,6 +178,348 @@ describe('autonomyRuns', () => {
})
})
test('terminal runs are not revived by stale lifecycle updates', async () => {
const command = await createAutonomyQueuedPrompt({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
})
expect(command).not.toBeNull()
const runId = command!.autonomy!.runId
await markAutonomyRunCancelled(runId, tempDir, 100)
const revived = await markAutonomyRunRunning(runId, tempDir, 200)
const completed = await markAutonomyRunCompleted(runId, tempDir, 300)
const failed = await markAutonomyRunFailed(
runId,
'late failure',
tempDir,
400,
)
const persisted = await getAutonomyRunById(runId, tempDir)
expect(revived).toBeNull()
expect(completed).toBeNull()
expect(failed).toBeNull()
expect(persisted).toMatchObject({
status: 'cancelled',
endedAt: 100,
})
expect(persisted!.error).toBeUndefined()
})
test('hasActiveAutonomyRunForSource only treats queued and running scheduled runs as active', async () => {
const command = await createAutonomyQueuedPrompt({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
sourceLabel: 'nightly',
})
expect(command).not.toBeNull()
const runId = command!.autonomy!.runId
await expect(
hasActiveAutonomyRunForSource({
trigger: 'scheduled-task',
sourceId: 'cron-1',
rootDir: tempDir,
}),
).resolves.toBe(true)
await markAutonomyRunRunning(runId, tempDir, 100)
await expect(
hasActiveAutonomyRunForSource({
trigger: 'scheduled-task',
sourceId: 'cron-1',
rootDir: tempDir,
}),
).resolves.toBe(true)
await expect(
hasActiveAutonomyRunForSource({
trigger: 'scheduled-task',
sourceId: 'cron-2',
rootDir: tempDir,
}),
).resolves.toBe(false)
await markAutonomyRunCompleted(runId, tempDir, 200)
await expect(
hasActiveAutonomyRunForSource({
trigger: 'scheduled-task',
sourceId: 'cron-1',
rootDir: tempDir,
}),
).resolves.toBe(false)
const failedCommand = await createAutonomyQueuedPrompt({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
})
expect(failedCommand).not.toBeNull()
await markAutonomyRunFailed(
failedCommand!.autonomy!.runId,
'boom',
tempDir,
300,
)
await expect(
hasActiveAutonomyRunForSource({
trigger: 'scheduled-task',
sourceId: 'cron-1',
rootDir: tempDir,
}),
).resolves.toBe(false)
})
test('createAutonomyQueuedPromptIfNoActiveSource atomically skips duplicate active scheduled sources', async () => {
const [first, second] = await Promise.all([
createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
}),
createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
}),
])
const created = [first, second].filter(command => command !== null)
const runs = await listAutonomyRuns(tempDir)
expect(created).toHaveLength(1)
expect(runs).toHaveLength(1)
expect(runs[0]).toMatchObject({
trigger: 'scheduled-task',
status: 'queued',
sourceId: 'cron-1',
})
})
test('createAutonomyQueuedPromptIfNoActiveSource scopes dedup by ownerKey', async () => {
const first = await createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
ownerKey: 'owner-a',
})
const second = await createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
ownerKey: 'owner-b',
})
const runs = await listAutonomyRuns(tempDir)
expect(first).not.toBeNull()
expect(second).not.toBeNull()
expect(runs).toHaveLength(2)
expect(new Set(runs.map(run => run.ownerKey))).toEqual(
new Set(['owner-a', 'owner-b']),
)
})
test('createAutonomyQueuedPromptIfNoActiveSource does not advance heartbeat last-run state on dedup skip (two-phase commit invariant)', async () => {
await writeTempFile(
tempDir,
HEARTBEAT_REL,
[
'tasks:',
' - name: inbox',
' interval: 30m',
' prompt: "Check inbox"',
].join('\n'),
)
// Seed an active queued run for cron-1 so the next dedup attempt skips.
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
await writeFile(
resolveAutonomyRunsPath(tempDir),
`${JSON.stringify(
{
runs: [
{
runId: 'preexisting-active',
runtime: 'automatic',
trigger: 'scheduled-task',
status: 'queued',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
promptPreview: 'still queued',
createdAt: 100,
ownerProcessId: process.pid,
ownerSessionId: 'self',
},
],
},
null,
2,
)}\n`,
'utf-8',
)
const skipped = await createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
})
expect(skipped).toBeNull()
// If the dedup skip wrongly advanced heartbeat state, the next
// proactive-tick prompt would NOT include the inbox task. Verify it
// still does.
const followUp = await createAutonomyQueuedPrompt({
basePrompt: '<tick>12:00:00</tick>',
trigger: 'proactive-tick',
rootDir: tempDir,
currentDir: tempDir,
})
expect(followUp).not.toBeNull()
expect(followUp!.value).toContain('Due HEARTBEAT.md tasks:')
expect(followUp!.value).toContain('- inbox (30m): Check inbox')
})
test('createAutonomyQueuedPromptIfNoActiveSource recovers stale active runs from dead owner processes', async () => {
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
await writeFile(
resolveAutonomyRunsPath(tempDir),
`${JSON.stringify(
{
runs: [
{
runId: 'stale-run',
runtime: 'automatic',
trigger: 'scheduled-task',
status: 'running',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
sourceLabel: 'nightly',
promptPreview: 'stale scheduled prompt',
createdAt: 100,
startedAt: 100,
ownerProcessId: 2_147_483_647,
ownerSessionId: 'dead-session',
},
],
},
null,
2,
)}\n`,
'utf-8',
)
await expect(
hasActiveAutonomyRunForSource({
trigger: 'scheduled-task',
sourceId: 'cron-1',
rootDir: tempDir,
}),
).resolves.toBe(false)
const command = await createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
})
const runs = await listAutonomyRuns(tempDir)
expect(command).not.toBeNull()
expect(runs).toHaveLength(2)
expect(runs[0]).toMatchObject({
trigger: 'scheduled-task',
status: 'queued',
sourceId: 'cron-1',
ownerProcessId: process.pid,
})
expect(runs[1]).toMatchObject({
runId: 'stale-run',
status: 'failed',
endedAt: runs[0]?.createdAt,
error: expect.stringContaining('owner process 2147483647'),
})
})
test('stale managed-flow run recovery also marks the flow step failed', async () => {
const command = await startManagedAutonomyFlowFromHeartbeatTask({
task: {
name: 'weekly-report',
interval: '7d',
prompt: 'Ship the weekly report',
steps: [
{
name: 'gather',
prompt: 'Gather weekly inputs',
},
],
},
rootDir: tempDir,
currentDir: tempDir,
})
expect(command).not.toBeNull()
const runId = command!.autonomy!.runId
await markAutonomyRunRunning(runId, tempDir, 100)
const runsPath = resolveAutonomyRunsPath(tempDir)
const file = JSON.parse(readFileSync(runsPath, 'utf-8')) as {
runs: Array<Record<string, unknown>>
}
file.runs = file.runs.map(run =>
run.runId === runId
? { ...run, ownerProcessId: 2_147_483_647 }
: run,
)
await writeFile(runsPath, `${JSON.stringify(file, null, 2)}\n`, 'utf-8')
const replacement = await createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: 'replacement prompt',
trigger: 'managed-flow-step',
rootDir: tempDir,
currentDir: tempDir,
sourceId: command!.autonomy!.sourceId!,
ownerKey: 'main-thread',
})
const [flow] = await listAutonomyFlows(tempDir)
const runs = await listAutonomyRuns(tempDir)
expect(replacement).not.toBeNull()
expect(runs.find(run => run.runId === runId)).toMatchObject({
status: 'failed',
error: expect.stringContaining(STALE_ACTIVE_RUN_ERROR_PREFIX),
})
expect(flow).toMatchObject({
status: 'failed',
blockedRunId: runId,
})
expect(flow?.stateJson?.steps[0]).toMatchObject({
status: 'failed',
runId,
error: expect.stringContaining(STALE_ACTIVE_RUN_ERROR_PREFIX),
})
})
test('formatters produce readable status and run listings', async () => {
const first = await createAutonomyQueuedPrompt({
basePrompt: 'scheduled prompt',
@@ -223,6 +589,53 @@ describe('autonomyRuns', () => {
)
})
test('persistence pruning keeps active runs ahead of recent completed history', async () => {
const runs = [
{
runId: 'old-active',
runtime: 'automatic',
trigger: 'scheduled-task',
status: 'queued',
rootDir: tempDir,
currentDir: tempDir,
ownerKey: 'main-thread',
promptPreview: 'old active',
createdAt: 1,
},
...Array.from({ length: 200 }, (_, index) => ({
runId: `history-${index}`,
runtime: 'automatic',
trigger: 'scheduled-task',
status: 'completed',
rootDir: tempDir,
currentDir: tempDir,
ownerKey: 'main-thread',
promptPreview: `history ${index}`,
createdAt: 1_000 + index,
endedAt: 2_000 + index,
})),
]
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
await writeFile(
resolveAutonomyRunsPath(tempDir),
`${JSON.stringify({ runs }, null, 2)}\n`,
'utf-8',
)
await createAutonomyRun({
trigger: 'scheduled-task',
prompt: 'fresh active',
rootDir: tempDir,
currentDir: tempDir,
nowMs: 9_999,
})
const persisted = await listAutonomyRuns(tempDir)
expect(persisted).toHaveLength(200)
expect(persisted.some(run => run.runId === 'old-active')).toBe(true)
expect(persisted.some(run => run.runId === 'history-0')).toBe(false)
})
test('listAutonomyRuns keeps older persisted records by normalizing missing runtime and owner metadata', async () => {
const runsPath = resolveAutonomyRunsPath(tempDir)
await mkdir(join(tempDir, '.claude', 'autonomy'), { recursive: true })
@@ -418,4 +831,27 @@ describe('autonomyRuns', () => {
expect(recovered!.autonomy?.runId).toBe(command!.autonomy?.runId)
expect(recovered!.autonomy?.flowId).toBe(flow!.flowId)
})
test('STALE_ACTIVE_RUN_ERROR_PREFIX stays in sync with HEARTBEAT.md stale-recovery-health task', () => {
// The HEARTBEAT.md stale-recovery-health task prompt embeds this prefix
// as a literal string. Changing the constant without updating the
// heartbeat prompt would silently break the monitor — this test fails
// first to force the simultaneous update.
const heartbeatPath = resolvePath(
import.meta.dir,
'..',
'..',
'..',
'.claude',
'autonomy',
'HEARTBEAT.md',
)
if (!existsSync(heartbeatPath)) {
// .claude/ may be absent in some checkout layouts (e.g., shallow clone
// for npm pack). Skip rather than fail in that case.
return
}
const content = readFileSync(heartbeatPath, 'utf8')
expect(content).toContain(STALE_ACTIVE_RUN_ERROR_PREFIX)
})
})

View File

@@ -133,11 +133,41 @@ function mergeAgentsAuthority(files: AutonomyAuthorityFile[]): string | null {
.join('\n\n')
}
/**
* Replaces fenced code-block content (and the ``` / ~~~ fence delimiters
* themselves) with empty strings while preserving the index of every
* other line. Used by the heartbeat parser so that `tasks:` literals
* appearing inside Markdown code samples in HEARTBEAT.md docs do not
* collide with the real config block.
*/
function maskCodeFencedLines(lines: string[]): string[] {
const masked = lines.slice()
let activeFenceChar: '`' | '~' | null = null
for (let i = 0; i < masked.length; i++) {
const trimmed = masked[i]!.trim()
const fenceMatch = trimmed.match(/^(```+|~~~+)/)
if (fenceMatch) {
const fenceChar = fenceMatch[1]![0] as '`' | '~'
if (activeFenceChar === null) {
activeFenceChar = fenceChar
} else if (activeFenceChar === fenceChar) {
activeFenceChar = null
}
masked[i] = ''
continue
}
if (activeFenceChar !== null) {
masked[i] = ''
}
}
return masked
}
export function parseHeartbeatAuthorityTasks(
content: string,
): HeartbeatAuthorityTask[] {
const tasks: HeartbeatAuthorityTask[] = []
const lines = content.split('\n')
const lines = maskCodeFencedLines(content.split('\n'))
const getIndent = (line: string): number =>
line.length - line.trimStart().length
const parseScalar = (line: string, key: string): string =>

View File

@@ -83,6 +83,20 @@ export type AutonomyFlowRecord = {
waitJson?: AutonomyFlowWaitState
cancelRequestedAt?: number
lastError?: string
/**
* Repo-relative POSIX glob patterns describing which paths this flow's
* `report`-step approval covers. The pre-tool-use hook
* `require-plan-for-risky-edit.mjs` consults this list to permit edits
* only when the target file matches at least one entry. Absent or empty
* means "no boundary declared" — during the pilot window the hook
* treats this as broad approval (v1 behaviour). Once all production
* flows declare boundaries, the hook will deny absent-boundary flows.
*
* Supported syntax: `*` matches one path segment, `**` matches any
* number including zero. Examples: `src/utils/autonomy*`,
* `src/services/api/**`, `src/Tool.ts`.
*/
boundary?: string[]
}
type AutonomyFlowsFile = {
@@ -138,6 +152,7 @@ function cloneWaitState(
function cloneFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
return {
...flow,
...(flow.boundary ? { boundary: [...flow.boundary] } : {}),
...(flow.stateJson ? { stateJson: cloneManagedState(flow.stateJson) } : {}),
...(flow.waitJson ? { waitJson: cloneWaitState(flow.waitJson) } : {}),
}
@@ -152,6 +167,25 @@ function isManagedFlowStatusActive(status: AutonomyFlowStatus): boolean {
)
}
function selectPersistedAutonomyFlows(
flows: AutonomyFlowRecord[],
): AutonomyFlowRecord[] {
const retained = flows
.slice()
.map(cloneFlowRecord)
.sort((left, right) => {
const leftActive = isManagedFlowStatusActive(left.status)
const rightActive = isManagedFlowStatusActive(right.status)
if (leftActive !== rightActive) {
return leftActive ? -1 : 1
}
return right.updatedAt - left.updatedAt
})
.slice(0, AUTONOMY_FLOWS_MAX)
return retained.sort((left, right) => right.updatedAt - left.updatedAt)
}
function defaultFlowSource(params: {
trigger: AutonomyTriggerKind
sourceId?: string
@@ -237,6 +271,35 @@ function normalizeWaitState(value: unknown): AutonomyFlowWaitState | undefined {
}
}
function isPosixBoundaryGlob(value: string): boolean {
if (!value || value.startsWith('/') || value.includes('\\')) {
return false
}
if (value.includes('\0')) {
return false
}
return !value.split('/').some(segment => segment === '..')
}
function normalizeBoundary(value: unknown): string[] | undefined {
if (!Array.isArray(value)) {
return undefined
}
const seen = new Set<string>()
const boundary = value
.filter((entry): entry is string => typeof entry === 'string')
.map(entry => entry.trim())
.filter(isPosixBoundaryGlob)
.filter(entry => {
if (seen.has(entry)) {
return false
}
seen.add(entry)
return true
})
return boundary.length > 0 ? boundary : undefined
}
function normalizeFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
const source = defaultFlowSource(flow)
return {
@@ -247,6 +310,7 @@ function normalizeFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
goal: flow.goal || flow.sourceLabel || flow.sourceId || flow.flowKey,
currentDir: flow.currentDir || flow.rootDir,
runCount: Math.max(flow.runCount ?? 0, 0),
boundary: normalizeBoundary(flow.boundary),
stateJson: normalizeManagedState(flow.stateJson),
waitJson: normalizeWaitState(flow.waitJson),
...(flow.sourceId
@@ -369,11 +433,7 @@ async function writeAutonomyFlows(
path,
`${JSON.stringify(
{
flows: flows
.slice()
.map(cloneFlowRecord)
.sort((left, right) => right.updatedAt - left.updatedAt)
.slice(0, AUTONOMY_FLOWS_MAX),
flows: selectPersistedAutonomyFlows(flows),
} satisfies AutonomyFlowsFile,
null,
2,
@@ -420,6 +480,7 @@ export async function startManagedAutonomyFlow(params: {
ownerKey?: string
sourceId?: string
sourceLabel?: string
boundary?: string[]
nowMs?: number
}): Promise<ManagedAutonomyFlowStartResult | null> {
if (params.steps.length === 0) {
@@ -450,6 +511,8 @@ export async function startManagedAutonomyFlow(params: {
const stateJson = buildManagedState(params.steps)
const firstStep = stateJson.steps[0]!
const boundary =
normalizeBoundary(params.boundary) ?? normalizeBoundary(current?.boundary)
const waiting =
firstStep.waitFor != null
? {
@@ -474,6 +537,7 @@ export async function startManagedAutonomyFlow(params: {
currentDir,
...(source.sourceId ? { sourceId: source.sourceId } : {}),
...(source.sourceLabel ? { sourceLabel: source.sourceLabel } : {}),
...(boundary ? { boundary } : {}),
latestRunId: undefined,
runCount: current?.runCount ?? 0,
createdAt: current?.createdAt ?? nowMs,

View File

@@ -4,6 +4,15 @@ import { lock } from './lockfile.js'
const persistenceLocks = new Map<string, Promise<void>>()
export function getAutonomyPersistenceLockCountForTests(): number {
if (process.env.NODE_ENV !== 'test') {
throw new Error(
'getAutonomyPersistenceLockCountForTests can only be called in tests',
)
}
return persistenceLocks.size
}
export async function withAutonomyPersistenceLock<T>(
rootDir: string,
fn: () => Promise<T>,
@@ -16,10 +25,8 @@ export async function withAutonomyPersistenceLock<T>(
const current = new Promise<void>(resolve => {
release = resolve
})
persistenceLocks.set(
key,
previous.then(() => current),
)
const chained = previous.then(() => current)
persistenceLocks.set(key, chained)
await previous
try {
@@ -41,7 +48,7 @@ export async function withAutonomyPersistenceLock<T>(
}
} finally {
release()
if (persistenceLocks.get(key) === current) {
if (persistenceLocks.get(key) === chained) {
persistenceLocks.delete(key)
}
}

View File

@@ -0,0 +1,261 @@
import type { QueuedCommand } from '../types/textInputTypes.js'
import {
finalizeAutonomyRunCompleted,
finalizeAutonomyRunFailed,
listAutonomyRuns,
markAutonomyRunCancelled,
markAutonomyRunRunning,
} from './autonomyRuns.js'
export type AutonomyQueuePartition = {
attachmentCommands: QueuedCommand[]
staleCommands: QueuedCommand[]
}
export type AutonomyQueueClaim = AutonomyQueuePartition & {
claimedRunIds: string[]
claimedCommands: QueuedCommand[]
}
export type AutonomyTurnOutcome =
| { type: 'completed' }
| { type: 'cancelled' }
| { type: 'failed'; error?: unknown; message?: string }
type AutonomyRunRef = {
runId: string
rootDir?: string
}
function getCommandRootDir(
command: QueuedCommand,
fallbackRootDir?: string,
): string | undefined {
return command.autonomy?.rootDir ?? fallbackRootDir
}
function refKey(ref: AutonomyRunRef): string {
return `${ref.rootDir ?? ''}\0${ref.runId}`
}
function getAutonomyRunRefs(
commands: QueuedCommand[],
fallbackRootDir?: string,
): AutonomyRunRef[] {
const refs = new Map<string, AutonomyRunRef>()
for (const command of commands) {
const runId = command.autonomy?.runId
if (!runId) {
continue
}
const ref = {
runId,
rootDir: getCommandRootDir(command, fallbackRootDir),
}
refs.set(refKey(ref), ref)
}
return [...refs.values()]
}
function isInlineQueuedCommand(command: QueuedCommand): boolean {
return command.mode === 'prompt' || command.mode === 'task-notification'
}
function groupRefsByRootDir(
refs: AutonomyRunRef[],
): Map<string, AutonomyRunRef[]> {
const grouped = new Map<string, AutonomyRunRef[]>()
for (const ref of refs) {
const key = ref.rootDir ?? ''
const group = grouped.get(key)
if (group) {
group.push(ref)
} else {
grouped.set(key, [ref])
}
}
return grouped
}
/**
* Exclude queued autonomy commands whose persisted run is no longer queued.
* This prevents stale in-memory commands from reviving flows after cancellation
* or after another path has already consumed the run.
*/
export async function partitionConsumableQueuedAutonomyCommands(
commands: QueuedCommand[],
rootDir?: string,
): Promise<AutonomyQueuePartition> {
const attachmentCommands: QueuedCommand[] = []
const staleCommands: QueuedCommand[] = []
const refs = getAutonomyRunRefs(commands, rootDir)
const runsByRef = new Map<
string,
Awaited<ReturnType<typeof listAutonomyRuns>>[number]
>()
for (const [rootKey, group] of groupRefsByRootDir(refs)) {
const runs = await listAutonomyRuns(rootKey || undefined)
const wanted = new Set(group.map(ref => ref.runId))
for (const run of runs) {
if (wanted.has(run.runId)) {
runsByRef.set(
refKey({ runId: run.runId, rootDir: rootKey || undefined }),
run,
)
}
}
}
for (const command of commands) {
const runId = command.autonomy?.runId
if (!runId) {
attachmentCommands.push(command)
continue
}
const commandRootDir = getCommandRootDir(command, rootDir)
const run = runsByRef.get(refKey({ runId, rootDir: commandRootDir }))
if (run?.status === 'queued' && !run.startedAt && !run.endedAt) {
attachmentCommands.push(command)
} else {
staleCommands.push(command)
}
}
return { attachmentCommands, staleCommands }
}
export async function claimConsumableQueuedAutonomyCommands(
commands: QueuedCommand[],
rootDir?: string,
): Promise<AutonomyQueueClaim> {
const partition = await partitionConsumableQueuedAutonomyCommands(
commands,
rootDir,
)
const claimedRunIds: string[] = []
const claimedRunKeys: string[] = []
const staleRunKeys = new Set<string>()
const candidateRefs = getAutonomyRunRefs(
partition.attachmentCommands.filter(isInlineQueuedCommand),
rootDir,
)
for (const ref of candidateRefs) {
const updated = await markAutonomyRunRunning(ref.runId, ref.rootDir)
if (updated?.status === 'running') {
claimedRunIds.push(ref.runId)
claimedRunKeys.push(refKey(ref))
} else {
staleRunKeys.add(refKey(ref))
}
}
const claimedRunKeySet = new Set(claimedRunKeys)
const attachmentCommands: QueuedCommand[] = []
const claimedCommands: QueuedCommand[] = []
const staleCommands = [...partition.staleCommands]
for (const command of partition.attachmentCommands) {
const runId = command.autonomy?.runId
if (!runId) {
attachmentCommands.push(command)
continue
}
const key = refKey({
runId,
rootDir: getCommandRootDir(command, rootDir),
})
if (claimedRunKeySet.has(key)) {
attachmentCommands.push(command)
claimedCommands.push(command)
} else if (staleRunKeys.has(key)) {
staleCommands.push(command)
}
}
return {
attachmentCommands,
staleCommands,
claimedRunIds,
claimedCommands,
}
}
export async function cancelQueuedAutonomyCommands(params: {
commands: QueuedCommand[]
rootDir?: string
}): Promise<void> {
for (const ref of getAutonomyRunRefs(params.commands, params.rootDir)) {
await markAutonomyRunCancelled(ref.runId, ref.rootDir)
}
}
function stringifyAutonomyError(error: unknown): string {
if (typeof error === 'string') {
return error
}
if (error instanceof Error) {
return error.message
}
return String(error)
}
export function sanitizeAutonomyFailureForPersistence(
error: unknown,
fallback = 'query failed',
): string {
const message = stringifyAutonomyError(error)
const lower = message.toLowerCase()
if (
lower.includes('api_error') ||
lower.includes('provider') ||
lower.includes('openai') ||
lower.includes('gemini') ||
lower.includes('grok') ||
lower.includes('anthropic') ||
lower.includes('bedrock') ||
lower.includes('vertex')
) {
return 'provider api_error'
}
return fallback
}
export async function finalizeAutonomyCommandsForTurn(params: {
commands: QueuedCommand[]
outcome: AutonomyTurnOutcome
currentDir?: string
priority?: 'now' | 'next' | 'later'
workload?: string
}): Promise<QueuedCommand[]> {
const nextCommands: QueuedCommand[] = []
for (const command of params.commands) {
const autonomy = command.autonomy
if (!autonomy?.runId) {
continue
}
if (params.outcome.type === 'completed') {
nextCommands.push(
...(await finalizeAutonomyRunCompleted({
runId: autonomy.runId,
rootDir: autonomy.rootDir,
currentDir: params.currentDir,
priority: params.priority,
workload: command.workload ?? params.workload,
})),
)
} else if (params.outcome.type === 'cancelled') {
await markAutonomyRunCancelled(autonomy.runId, autonomy.rootDir)
} else {
await finalizeAutonomyRunFailed({
runId: autonomy.runId,
rootDir: autonomy.rootDir,
error:
params.outcome.message ??
sanitizeAutonomyFailureForPersistence(params.outcome.error),
})
}
}
return nextCommands
}

View File

@@ -1,7 +1,7 @@
import { randomUUID } from 'crypto'
import { mkdir, writeFile } from 'fs/promises'
import { dirname, join, resolve } from 'path'
import { getProjectRoot } from '../bootstrap/state.js'
import { getProjectRoot, getSessionId } from '../bootstrap/state.js'
import type { MessageOrigin } from '../types/message.js'
import type { QueuedCommand } from '../types/textInputTypes.js'
import {
@@ -29,9 +29,22 @@ import {
} from './autonomyFlows.js'
import { withAutonomyPersistenceLock } from './autonomyPersistence.js'
import { getFsImplementation } from './fsOperations.js'
import { isProcessRunning } from './genericProcessUtils.js'
import { logError } from './log.js'
const AUTONOMY_RUNS_MAX = 200
const AUTONOMY_RUNS_RELATIVE_PATH = join(AUTONOMY_DIR, 'runs.json')
// Sentinel string surfaced to operators via runs.json error fields and
// referenced literally by the HEARTBEAT.md `stale-recovery-health` task.
// A unit test asserts the HEARTBEAT.md file contains this exact prefix —
// changing the value will fail the test, forcing the heartbeat prompt
// to be updated in the same change.
export const STALE_ACTIVE_RUN_ERROR_PREFIX =
'Recovered stale active autonomy run'
// Guards the legacy-block warning so it fires once per (process, runId) instead
// of every dedup tick while a no-owner record sits there.
const warnedLegacyBlockRunIds = new Set<string>()
export type AutonomyRunStatus =
| 'queued'
@@ -59,6 +72,8 @@ export type AutonomyRunRecord = {
flowStepName?: string
promptPreview: string
createdAt: number
ownerProcessId?: number
ownerSessionId?: string
startedAt?: number
endedAt?: number
error?: string
@@ -77,6 +92,19 @@ type AutonomyRunFlowRef = {
stepName: string
}
type CreateAutonomyRunParams = {
trigger: AutonomyTriggerKind
prompt: string
rootDir?: string
currentDir?: string
sourceId?: string
sourceLabel?: string
runtime?: AutonomyRunRuntime
ownerKey?: string
flow?: AutonomyRunFlowRef
nowMs?: number
}
function truncatePromptPreview(prompt: string): string {
const singleLine = prompt.replace(/\s+/g, ' ').trim()
return singleLine.length <= 240
@@ -95,6 +123,29 @@ function cloneRunRecord(run: AutonomyRunRecord): AutonomyRunRecord {
return { ...run }
}
function isAutonomyRunActive(run: AutonomyRunRecord): boolean {
return run.status === 'queued' || run.status === 'running'
}
function selectPersistedAutonomyRuns(
runs: AutonomyRunRecord[],
): AutonomyRunRecord[] {
const retained = runs
.slice()
.map(cloneRunRecord)
.sort((left, right) => {
const leftActive = isAutonomyRunActive(left)
const rightActive = isAutonomyRunActive(right)
if (leftActive !== rightActive) {
return leftActive ? -1 : 1
}
return right.createdAt - left.createdAt
})
.slice(0, AUTONOMY_RUNS_MAX)
return retained.sort((left, right) => right.createdAt - left.createdAt)
}
function normalizePersistedRunRecord(
run: PersistedAutonomyRunRecord,
): AutonomyRunRecord {
@@ -157,11 +208,7 @@ async function writeAutonomyRuns(
path,
`${JSON.stringify(
{
runs: runs
.slice()
.map(cloneRunRecord)
.sort((left, right) => right.createdAt - left.createdAt)
.slice(0, AUTONOMY_RUNS_MAX),
runs: selectPersistedAutonomyRuns(runs),
} satisfies AutonomyRunsFile,
null,
2,
@@ -172,7 +219,7 @@ async function writeAutonomyRuns(
async function updateAutonomyRun(
runId: string,
updater: (current: AutonomyRunRecord) => AutonomyRunRecord,
updater: (current: AutonomyRunRecord) => AutonomyRunRecord | null,
rootDir: string = getProjectRoot(),
): Promise<AutonomyRunRecord | null> {
return withAutonomyPersistenceLock(rootDir, async () => {
@@ -181,7 +228,11 @@ async function updateAutonomyRun(
if (index === -1) {
return null
}
const updated = cloneRunRecord(updater(cloneRunRecord(runs[index]!)))
const next = updater(cloneRunRecord(runs[index]!))
if (!next) {
return null
}
const updated = cloneRunRecord(next)
runs[index] = updated
await writeAutonomyRuns(runs, rootDir)
return updated
@@ -196,21 +247,112 @@ export async function getAutonomyRunById(
return runs.find(run => run.runId === runId) ?? null
}
export async function createAutonomyRun(params: {
function isActiveAutonomyRunStatus(status: AutonomyRunStatus): boolean {
return status === 'queued' || status === 'running'
}
function isValidOwnerProcessId(pid: number | undefined): pid is number {
// Reject non-numeric, negative, zero (Linux: send-to-process-group), and
// non-integer values. A forged record with pid=0 or pid<0 used to be
// treated as live and could permanently block dedup; treating them as
// stale closes that availability hole.
return (
typeof pid === 'number' &&
Number.isInteger(pid) &&
pid > 0 &&
pid < 4_194_304
)
}
function isStaleActiveAutonomyRun(run: AutonomyRunRecord): boolean {
if (!isActiveAutonomyRunStatus(run.status)) {
return false
}
if (run.ownerProcessId === undefined) {
return false
}
if (!isValidOwnerProcessId(run.ownerProcessId)) {
return true
}
return !isProcessRunning(run.ownerProcessId)
}
function staleActiveRunError(run: AutonomyRunRecord): string {
return `${STALE_ACTIVE_RUN_ERROR_PREFIX}: owner process ${run.ownerProcessId} is no longer running.`
}
function failAutonomyRunRecord(
run: AutonomyRunRecord,
error: string,
nowMs: number,
): AutonomyRunRecord {
return {
...run,
status: 'failed',
endedAt: nowMs,
error,
}
}
function recoverStaleActiveAutonomyRun(
run: AutonomyRunRecord,
nowMs: number,
): AutonomyRunRecord {
return failAutonomyRunRecord(run, staleActiveRunError(run), nowMs)
}
async function syncFailedManagedFlowForRun(
run: AutonomyRunRecord,
rootDir: string,
): Promise<void> {
if (run.parentFlowId && run.parentFlowSyncMode === 'managed') {
await markManagedAutonomyFlowStepFailed({
flowId: run.parentFlowId,
runId: run.runId,
error: run.error ?? 'Autonomy run failed.',
rootDir,
nowMs: run.endedAt,
})
}
}
function matchesActiveAutonomyRunSource(
run: AutonomyRunRecord,
params: {
trigger: AutonomyTriggerKind
sourceId: string
ownerKey?: string
},
): boolean {
return (
run.trigger === params.trigger &&
run.sourceId === params.sourceId &&
(params.ownerKey === undefined || run.ownerKey === params.ownerKey) &&
isActiveAutonomyRunStatus(run.status)
)
}
export async function hasActiveAutonomyRunForSource(params: {
trigger: AutonomyTriggerKind
prompt: string
sourceId: string
rootDir?: string
currentDir?: string
sourceId?: string
sourceLabel?: string
runtime?: AutonomyRunRuntime
ownerKey?: string
flow?: AutonomyRunFlowRef
nowMs?: number
}): Promise<AutonomyRunRecord> {
const rootDir = resolve(params.rootDir ?? getProjectRoot())
const currentDir = resolve(params.currentDir ?? rootDir)
const record: AutonomyRunRecord = {
}): Promise<boolean> {
const runs = await listAutonomyRuns(params.rootDir)
return runs.some(
run =>
matchesActiveAutonomyRunSource(run, params) &&
!isStaleActiveAutonomyRun(run),
)
}
function buildAutonomyRunRecord(
params: CreateAutonomyRunParams,
rootDir: string,
currentDir: string,
): AutonomyRunRecord {
const createdAt = params.nowMs ?? Date.now()
return {
runId: randomUUID(),
runtime: params.runtime ?? (params.flow ? 'flow_step' : 'automatic'),
trigger: params.trigger,
@@ -231,13 +373,80 @@ export async function createAutonomyRun(params: {
}
: {}),
promptPreview: truncatePromptPreview(params.prompt),
createdAt: params.nowMs ?? Date.now(),
createdAt,
ownerProcessId: process.pid,
ownerSessionId: getSessionId(),
}
}
async function persistAutonomyRunRecord(
record: AutonomyRunRecord,
rootDir: string,
skipWhenActiveSource: boolean,
): Promise<{
created: boolean
recoveredStaleRuns: AutonomyRunRecord[]
}> {
let created = false
const recoveredStaleRuns: AutonomyRunRecord[] = []
await withAutonomyPersistenceLock(rootDir, async () => {
const runs = await listAutonomyRuns(rootDir)
const sourceId = record.sourceId
if (skipWhenActiveSource && sourceId) {
let hasBlockingActiveRun = false
let staleRecoveriesApplied = false
for (let i = 0; i < runs.length; i++) {
const run = runs[i]!
if (
!matchesActiveAutonomyRunSource(run, {
trigger: record.trigger,
sourceId,
ownerKey: record.ownerKey,
})
) {
continue
}
if (isStaleActiveAutonomyRun(run)) {
const recovered = recoverStaleActiveAutonomyRun(
run,
record.createdAt,
)
runs[i] = recovered
recoveredStaleRuns.push(recovered)
staleRecoveriesApplied = true
continue
}
if (
run.ownerProcessId === undefined &&
!warnedLegacyBlockRunIds.has(run.runId)
) {
warnedLegacyBlockRunIds.add(run.runId)
logError(
new Error(
`[autonomyRuns] blocked by legacy un-owned active run ${run.runId} (createdAt=${run.createdAt}); cancel manually if this is a stale upgrade artifact`,
),
)
}
hasBlockingActiveRun = true
}
if (hasBlockingActiveRun) {
if (staleRecoveriesApplied) {
await writeAutonomyRuns(runs, rootDir)
}
return
}
}
runs.unshift(record)
await writeAutonomyRuns(runs, rootDir)
created = true
})
return { created, recoveredStaleRuns }
}
async function queueManagedFlowStepRunForRecord(
record: AutonomyRunRecord,
rootDir: string,
): Promise<void> {
if (
record.parentFlowId &&
record.flowStepId &&
@@ -258,9 +467,47 @@ export async function createAutonomyRun(params: {
nowMs: record.createdAt,
})
}
}
async function createAutonomyRunCore(
params: CreateAutonomyRunParams,
skipIfActiveSource: boolean,
): Promise<AutonomyRunRecord | null> {
const rootDir = resolve(params.rootDir ?? getProjectRoot())
const currentDir = resolve(params.currentDir ?? rootDir)
const record = buildAutonomyRunRecord(params, rootDir, currentDir)
const { created, recoveredStaleRuns } = await persistAutonomyRunRecord(
record,
rootDir,
skipIfActiveSource,
)
for (const recovered of recoveredStaleRuns) {
await syncFailedManagedFlowForRun(recovered, rootDir)
}
if (!created) {
return null
}
await queueManagedFlowStepRunForRecord(record, rootDir)
return record
}
export async function createAutonomyRun(
params: CreateAutonomyRunParams,
): Promise<AutonomyRunRecord> {
const record = await createAutonomyRunCore(params, false)
if (!record) {
throw new Error('Autonomy run was unexpectedly skipped.')
}
return record
}
export async function createAutonomyRunIfNoActiveSource(
params: CreateAutonomyRunParams & { sourceId: string },
): Promise<AutonomyRunRecord | null> {
return createAutonomyRunCore(params, true)
}
function buildManagedFlowStepPrompt(
flow: AutonomyFlowRecord,
stepIndex: number,
@@ -336,6 +583,7 @@ async function createOrRecoverManagedFlowStepCommand(params: {
workload: params.workload,
autonomy: {
runId: run.runId,
rootDir: run.rootDir,
trigger: 'managed-flow-step',
sourceId: run.sourceId,
sourceLabel: run.sourceLabel,
@@ -426,11 +674,16 @@ export async function markAutonomyRunRunning(
): Promise<AutonomyRunRecord | null> {
const updated = await updateAutonomyRun(
runId,
current => ({
...current,
status: 'running',
startedAt: nowMs ?? Date.now(),
}),
current =>
current.status === 'queued'
? {
...current,
status: 'running',
startedAt: nowMs ?? Date.now(),
ownerProcessId: process.pid,
ownerSessionId: getSessionId(),
}
: null,
rootDir,
)
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
@@ -451,12 +704,15 @@ export async function markAutonomyRunCompleted(
): Promise<AutonomyRunRecord | null> {
const updated = await updateAutonomyRun(
runId,
current => ({
...current,
status: 'completed',
endedAt: nowMs ?? Date.now(),
error: undefined,
}),
current =>
current.status === 'queued' || current.status === 'running'
? {
...current,
status: 'completed',
endedAt: nowMs ?? Date.now(),
error: undefined,
}
: null,
rootDir,
)
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
@@ -476,24 +732,17 @@ export async function markAutonomyRunFailed(
rootDir?: string,
nowMs?: number,
): Promise<AutonomyRunRecord | null> {
const endedAt = nowMs ?? Date.now()
const updated = await updateAutonomyRun(
runId,
current => ({
...current,
status: 'failed',
endedAt: nowMs ?? Date.now(),
error,
}),
current =>
isActiveAutonomyRunStatus(current.status)
? failAutonomyRunRecord(current, error, endedAt)
: null,
rootDir,
)
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
await markManagedAutonomyFlowStepFailed({
flowId: updated.parentFlowId,
runId: updated.runId,
error,
rootDir,
nowMs: updated.endedAt,
})
if (updated) {
await syncFailedManagedFlowForRun(updated, rootDir ?? updated.rootDir)
}
return updated
}
@@ -505,12 +754,15 @@ export async function markAutonomyRunCancelled(
): Promise<AutonomyRunRecord | null> {
const updated = await updateAutonomyRun(
runId,
current => ({
...current,
status: 'cancelled',
endedAt: nowMs ?? Date.now(),
error: undefined,
}),
current =>
current.status === 'queued' || current.status === 'running'
? {
...current,
status: 'cancelled',
endedAt: nowMs ?? Date.now(),
error: undefined,
}
: null,
rootDir,
)
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
@@ -612,6 +864,7 @@ export async function createAutonomyQueuedPrompt(params: {
currentDir?: string
sourceId?: string
sourceLabel?: string
ownerKey?: string
workload?: string
priority?: 'now' | 'next' | 'later'
shouldCreate?: () => boolean
@@ -634,39 +887,130 @@ export async function createAutonomyQueuedPrompt(params: {
currentDir,
sourceId: params.sourceId,
sourceLabel: params.sourceLabel,
ownerKey: params.ownerKey,
workload: params.workload,
priority: params.priority,
flow: params.flow,
})
}
export async function createAutonomyQueuedPromptIfNoActiveSource(params: {
trigger: AutonomyTriggerKind
basePrompt: string
rootDir?: string
currentDir?: string
sourceId: string
sourceLabel?: string
ownerKey?: string
workload?: string
priority?: 'now' | 'next' | 'later'
shouldCreate?: () => boolean
}): Promise<QueuedCommand | null> {
const rootDir = resolve(params.rootDir ?? getProjectRoot())
const currentDir = resolve(params.currentDir ?? getCwd())
// Cheap optimistic pre-check: skip the AGENTS.md / HEARTBEAT.md disk
// reads + prompt assembly when an active run for this source already
// blocks dedup. The lock-side check inside persistAutonomyRunRecord
// remains authoritative; this only fast-paths the common storm case.
if (
await hasActiveAutonomyRunForSource({
trigger: params.trigger,
sourceId: params.sourceId,
rootDir,
ownerKey: params.ownerKey,
})
) {
return null
}
const prepared = await prepareAutonomyTurnPrompt({
basePrompt: params.basePrompt,
trigger: params.trigger,
rootDir,
currentDir,
})
if (params.shouldCreate && !params.shouldCreate()) {
return null
}
return commitAutonomyQueuedPromptIfNoActiveSource({
prepared,
rootDir,
currentDir,
sourceId: params.sourceId,
sourceLabel: params.sourceLabel,
ownerKey: params.ownerKey,
workload: params.workload,
priority: params.priority,
})
}
export async function commitAutonomyQueuedPrompt(params: {
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
rootDir?: string
currentDir?: string
sourceId?: string
sourceLabel?: string
ownerKey?: string
workload?: string
priority?: 'now' | 'next' | 'later'
flow?: AutonomyRunFlowRef
}): Promise<QueuedCommand> {
const command = await commitAutonomyQueuedPromptInternal(params, false)
if (!command) {
throw new Error('Autonomy queued prompt was unexpectedly skipped.')
}
return command
}
async function commitAutonomyQueuedPromptIfNoActiveSource(params: {
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
rootDir?: string
currentDir?: string
sourceId: string
sourceLabel?: string
ownerKey?: string
workload?: string
priority?: 'now' | 'next' | 'later'
}): Promise<QueuedCommand | null> {
return commitAutonomyQueuedPromptInternal(params, true)
}
async function commitAutonomyQueuedPromptInternal(
params: {
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
rootDir?: string
currentDir?: string
sourceId?: string
sourceLabel?: string
ownerKey?: string
workload?: string
priority?: 'now' | 'next' | 'later'
flow?: AutonomyRunFlowRef
},
skipWhenActiveSource: boolean,
): Promise<QueuedCommand | null> {
const rootDir = resolve(
params.rootDir ?? params.prepared.rootDir ?? getProjectRoot(),
)
const currentDir = resolve(
params.currentDir ?? params.prepared.currentDir ?? getCwd(),
)
commitPreparedAutonomyTurn(params.prepared)
const value = params.prepared.prompt
const run = await createAutonomyRun({
const runParams: CreateAutonomyRunParams = {
trigger: params.prepared.trigger,
prompt: value,
rootDir,
currentDir,
sourceId: params.sourceId,
sourceLabel: params.sourceLabel,
ownerKey: params.ownerKey,
flow: params.flow,
})
}
const useDedup = skipWhenActiveSource && Boolean(params.sourceId)
const run = await createAutonomyRunCore(runParams, useDedup)
if (!run) {
return null
}
commitPreparedAutonomyTurn(params.prepared)
const origin = {
kind: 'autonomy',
trigger: params.prepared.trigger,
@@ -683,6 +1027,7 @@ export async function commitAutonomyQueuedPrompt(params: {
workload: params.workload,
autonomy: {
runId: run.runId,
rootDir: run.rootDir,
trigger: params.prepared.trigger,
sourceId: params.sourceId,
sourceLabel: params.sourceLabel,

View File

@@ -19,6 +19,7 @@ import {
} from '../types/textInputTypes.js'
import { createAbortController } from './abortController.js'
import type { PastedContent } from './config.js'
import { getCwd } from './cwd.js'
import { logForDebugging } from './debug.js'
import type { EffortValue } from './effort.js'
import type { FileHistoryState } from './fileHistory.js'
@@ -27,11 +28,9 @@ import { gracefulShutdownSync } from './gracefulShutdown.js'
import { enqueue } from './messageQueueManager.js'
import { resolveSkillModelOverride } from './model/model.js'
import {
finalizeAutonomyRunCompleted,
finalizeAutonomyRunFailed,
markAutonomyRunFailed,
markAutonomyRunRunning,
} from './autonomyRuns.js'
claimConsumableQueuedAutonomyCommands,
finalizeAutonomyCommandsForTurn,
} from './autonomyQueueLifecycle.js'
import type { ProcessUserInputContext } from './processUserInput/processUserInput.js'
import { processUserInput } from './processUserInput/processUserInput.js'
import type { QueryGuard } from './QueryGuard.js'
@@ -459,7 +458,14 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
// Iterate all commands uniformly. First command gets attachments +
// ideSelection + pastedContents, rest skip attachments to avoid
// duplicating turn-level context (IDE selection, todos, diffs).
const commands = queuedCommands ?? []
let commands = queuedCommands ?? []
const queuedAutonomyClaim =
await claimConsumableQueuedAutonomyCommands(commands)
commands = queuedAutonomyClaim.attachmentCommands
const claimedAutonomyCommands = queuedAutonomyClaim.claimedCommands
if (commands.length === 0) {
return
}
// Compute the workload tag for this turn. queueProcessor can batch a
// cron prompt with a same-tick human prompt; only tag when EVERY
@@ -471,7 +477,7 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
commands.every(c => c.workload === firstWorkload)
? firstWorkload
: undefined
let autonomyRunIds: string[] | undefined
const deferredAutonomyRunIds = new Set<string>()
// Wrap the entire turn (processUserInput loop + onQuery) in an
// AsyncLocalStorage context. This is the ONLY way to correctly
@@ -486,10 +492,7 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
for (let i = 0; i < commands.length; i++) {
const cmd = commands[i]!
const isFirst = i === 0
if (cmd.autonomy?.runId) {
;(autonomyRunIds ??= []).push(cmd.autonomy.runId)
await markAutonomyRunRunning(cmd.autonomy.runId)
}
const runId = cmd.autonomy?.runId
const result = await processUserInput({
input: cmd.value,
preExpansionInput: cmd.preExpansionValue,
@@ -510,7 +513,11 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
bridgeOrigin: cmd.bridgeOrigin,
isMeta: cmd.isMeta,
skipAttachments: !isFirst,
autonomy: cmd.autonomy,
})
if (runId && result.deferAutonomyCompletion) {
deferredAutonomyRunIds.add(runId)
}
// Stamp origin here rather than threading another arg through
// processUserInput → processUserInputBase → processTextPrompt → createUserMessage.
// Derive origin from mode for task-notifications — mirrors the origin
@@ -611,26 +618,35 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
}
}
}) // end runWithWorkload — ALS context naturally scoped, no finally needed
if (autonomyRunIds?.length) {
for (const runId of autonomyRunIds) {
const nextCommands = await finalizeAutonomyRunCompleted({
runId,
priority: 'later',
workload: turnWorkload,
})
for (const nextCommand of nextCommands) {
enqueue(nextCommand)
}
if (claimedAutonomyCommands.length) {
const finalizableCommands = claimedAutonomyCommands.filter(command => {
const runId = command.autonomy?.runId
return !runId || !deferredAutonomyRunIds.has(runId)
})
const nextCommands = await finalizeAutonomyCommandsForTurn({
commands: finalizableCommands,
outcome: { type: 'completed' },
currentDir: getCwd(),
priority: 'later',
workload: turnWorkload,
})
for (const nextCommand of nextCommands) {
enqueue(nextCommand)
}
}
} catch (error) {
if (autonomyRunIds?.length) {
for (const runId of autonomyRunIds) {
await finalizeAutonomyRunFailed({
runId,
error: String(error),
})
}
if (claimedAutonomyCommands.length) {
const finalizableCommands = claimedAutonomyCommands.filter(command => {
const runId = command.autonomy?.runId
return !runId || !deferredAutonomyRunIds.has(runId)
})
await finalizeAutonomyCommandsForTurn({
commands: finalizableCommands,
outcome: { type: 'failed', error },
currentDir: getCwd(),
priority: 'later',
workload: turnWorkload,
})
}
throw error
}

View File

@@ -1,173 +1,162 @@
import { describe, expect, test, beforeEach, afterEach } from "bun:test";
import { mock } from "bun:test";
import { describe, expect, test, beforeEach, afterEach } from 'bun:test'
let mockedModelType: "gemini" | undefined;
const { getAPIProvider, isFirstPartyAnthropicBaseUrl } = await import(
'../providers'
)
mock.module("../../settings/settings.js", () => ({
getInitialSettings: () =>
mockedModelType ? { modelType: mockedModelType } : {},
}));
const { getAPIProvider, isFirstPartyAnthropicBaseUrl } =
await import("../providers");
describe("getAPIProvider", () => {
describe('getAPIProvider', () => {
const envKeys = [
"CLAUDE_CODE_USE_GEMINI",
"CLAUDE_CODE_USE_BEDROCK",
"CLAUDE_CODE_USE_VERTEX",
"CLAUDE_CODE_USE_FOUNDRY",
"CLAUDE_CODE_USE_OPENAI",
] as const;
const savedEnv: Record<string, string | undefined> = {};
'CLAUDE_CODE_USE_GEMINI',
'CLAUDE_CODE_USE_BEDROCK',
'CLAUDE_CODE_USE_VERTEX',
'CLAUDE_CODE_USE_FOUNDRY',
'CLAUDE_CODE_USE_OPENAI',
'CLAUDE_CODE_USE_GROK',
] as const
const savedEnv: Record<string, string | undefined> = {}
beforeEach(() => {
// Save and clear environment variables
mockedModelType = undefined;
for (const key of envKeys) {
savedEnv[key] = process.env[key];
delete process.env[key];
savedEnv[key] = process.env[key]
delete process.env[key]
}
});
})
afterEach(() => {
// Restore environment variables
mockedModelType = undefined;
for (const key of envKeys) {
if (savedEnv[key] !== undefined) {
process.env[key] = savedEnv[key];
process.env[key] = savedEnv[key]
} else {
delete process.env[key];
delete process.env[key]
}
}
});
})
test('returns "firstParty" by default', () => {
expect(getAPIProvider()).toBe("firstParty");
});
expect(getAPIProvider({})).toBe('firstParty')
})
test('returns "gemini" when modelType is gemini', () => {
mockedModelType = "gemini";
expect(getAPIProvider()).toBe("gemini");
});
expect(getAPIProvider({ modelType: 'gemini' })).toBe('gemini')
})
test("modelType takes precedence over environment variables", () => {
mockedModelType = "gemini";
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
expect(getAPIProvider()).toBe("gemini");
});
test('modelType takes precedence over environment variables', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
expect(getAPIProvider({ modelType: 'gemini' })).toBe('gemini')
})
test('returns "gemini" when CLAUDE_CODE_USE_GEMINI is set', () => {
process.env.CLAUDE_CODE_USE_GEMINI = "1";
expect(getAPIProvider()).toBe("gemini");
});
process.env.CLAUDE_CODE_USE_GEMINI = '1'
expect(getAPIProvider({})).toBe('gemini')
})
test('returns "bedrock" when CLAUDE_CODE_USE_BEDROCK is set', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
expect(getAPIProvider()).toBe("bedrock");
});
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
expect(getAPIProvider({})).toBe('bedrock')
})
test('returns "vertex" when CLAUDE_CODE_USE_VERTEX is set', () => {
process.env.CLAUDE_CODE_USE_VERTEX = "1";
expect(getAPIProvider()).toBe("vertex");
});
process.env.CLAUDE_CODE_USE_VERTEX = '1'
expect(getAPIProvider({})).toBe('vertex')
})
test('returns "foundry" when CLAUDE_CODE_USE_FOUNDRY is set', () => {
process.env.CLAUDE_CODE_USE_FOUNDRY = "1";
expect(getAPIProvider()).toBe("foundry");
});
process.env.CLAUDE_CODE_USE_FOUNDRY = '1'
expect(getAPIProvider({})).toBe('foundry')
})
test("bedrock takes precedence over gemini", () => {
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
process.env.CLAUDE_CODE_USE_GEMINI = "1";
expect(getAPIProvider()).toBe("bedrock");
});
test('bedrock takes precedence over gemini', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
process.env.CLAUDE_CODE_USE_GEMINI = '1'
expect(getAPIProvider({})).toBe('bedrock')
})
test("bedrock takes precedence over vertex", () => {
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
process.env.CLAUDE_CODE_USE_VERTEX = "1";
expect(getAPIProvider()).toBe("bedrock");
});
test('bedrock takes precedence over vertex', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
process.env.CLAUDE_CODE_USE_VERTEX = '1'
expect(getAPIProvider({})).toBe('bedrock')
})
test("bedrock wins when all three env vars are set", () => {
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
process.env.CLAUDE_CODE_USE_VERTEX = "1";
process.env.CLAUDE_CODE_USE_FOUNDRY = "1";
expect(getAPIProvider()).toBe("bedrock");
});
test('bedrock wins when all three env vars are set', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
process.env.CLAUDE_CODE_USE_VERTEX = '1'
process.env.CLAUDE_CODE_USE_FOUNDRY = '1'
expect(getAPIProvider({})).toBe('bedrock')
})
test('"true" is truthy', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = "true";
expect(getAPIProvider()).toBe("bedrock");
});
process.env.CLAUDE_CODE_USE_BEDROCK = 'true'
expect(getAPIProvider({})).toBe('bedrock')
})
test('"0" is not truthy', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = "0";
expect(getAPIProvider()).toBe("firstParty");
});
process.env.CLAUDE_CODE_USE_BEDROCK = '0'
expect(getAPIProvider({})).toBe('firstParty')
})
test('empty string is not truthy', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = "";
expect(getAPIProvider()).toBe("firstParty");
});
});
process.env.CLAUDE_CODE_USE_BEDROCK = ''
expect(getAPIProvider({})).toBe('firstParty')
})
})
describe("isFirstPartyAnthropicBaseUrl", () => {
const originalBaseUrl = process.env.ANTHROPIC_BASE_URL;
const originalUserType = process.env.USER_TYPE;
describe('isFirstPartyAnthropicBaseUrl', () => {
const originalBaseUrl = process.env.ANTHROPIC_BASE_URL
const originalUserType = process.env.USER_TYPE
afterEach(() => {
if (originalBaseUrl !== undefined) {
process.env.ANTHROPIC_BASE_URL = originalBaseUrl;
process.env.ANTHROPIC_BASE_URL = originalBaseUrl
} else {
delete process.env.ANTHROPIC_BASE_URL;
delete process.env.ANTHROPIC_BASE_URL
}
if (originalUserType !== undefined) {
process.env.USER_TYPE = originalUserType;
process.env.USER_TYPE = originalUserType
} else {
delete process.env.USER_TYPE;
delete process.env.USER_TYPE
}
});
})
test("returns true when ANTHROPIC_BASE_URL is not set", () => {
delete process.env.ANTHROPIC_BASE_URL;
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
});
test('returns true when ANTHROPIC_BASE_URL is not set', () => {
delete process.env.ANTHROPIC_BASE_URL
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
})
test("returns true for api.anthropic.com", () => {
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com";
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
});
test('returns true for api.anthropic.com', () => {
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com'
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
})
test("returns false for custom URL", () => {
process.env.ANTHROPIC_BASE_URL = "https://my-proxy.com";
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
});
test('returns false for custom URL', () => {
process.env.ANTHROPIC_BASE_URL = 'https://my-proxy.com'
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
})
test("returns false for invalid URL", () => {
process.env.ANTHROPIC_BASE_URL = "not-a-url";
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
});
test('returns false for invalid URL', () => {
process.env.ANTHROPIC_BASE_URL = 'not-a-url'
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
})
test("returns true for staging URL when USER_TYPE is ant", () => {
process.env.ANTHROPIC_BASE_URL = "https://api-staging.anthropic.com";
process.env.USER_TYPE = "ant";
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
});
test('returns true for staging URL when USER_TYPE is ant', () => {
process.env.ANTHROPIC_BASE_URL = 'https://api-staging.anthropic.com'
process.env.USER_TYPE = 'ant'
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
})
test("returns true for URL with path", () => {
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com/v1";
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
});
test('returns true for URL with path', () => {
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com/v1'
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
})
test("returns true for trailing slash", () => {
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com/";
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
});
test('returns true for trailing slash', () => {
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com/'
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
})
test("returns false for subdomain attack", () => {
process.env.ANTHROPIC_BASE_URL = "https://evil-api.anthropic.com";
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
});
});
test('returns false for subdomain attack', () => {
process.env.ANTHROPIC_BASE_URL = 'https://evil-api.anthropic.com'
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
})
})

View File

@@ -1,5 +1,6 @@
import type { AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS } from '../../services/analytics/index.js'
import { getInitialSettings } from '../settings/settings.js'
import type { SettingsJson } from '../settings/types.js'
import { isEnvTruthy } from '../envUtils.js'
export type APIProvider =
@@ -11,8 +12,10 @@ export type APIProvider =
| 'gemini'
| 'grok'
export function getAPIProvider(): APIProvider {
const modelType = getInitialSettings().modelType
export function getAPIProvider(
settings: Pick<SettingsJson, 'modelType'> = getInitialSettings(),
): APIProvider {
const modelType = settings.modelType
if (modelType === 'openai') return 'openai'
if (modelType === 'gemini') return 'gemini'
if (modelType === 'grok') return 'grok'

View File

@@ -0,0 +1,375 @@
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
import type { QueuedCommand } from '../../../types/textInputTypes'
import {
resetStateForTests,
setCwdState,
setOriginalCwd,
setProjectRoot,
} from '../../../bootstrap/state'
import {
createAutonomyQueuedPrompt,
getAutonomyRunById,
listAutonomyRuns,
markAutonomyRunRunning,
} from '../../autonomyRuns'
import { resetAutonomyAuthorityForTests } from '../../autonomyAuthority'
import { createScheduledTaskQueuedCommand } from '../../../hooks/useScheduledTasks'
import {
cleanupTempDir,
createTempDir,
} from '../../../../tests/mocks/file-system'
let runAgentBlocker: Promise<void> | null = null
let releaseRunAgentBlocker: (() => void) | null = null
let runAgentStartCount = 0
let originalNodeEnv: string | undefined
let originalAnthropicApiKey: string | undefined
const commandQueue: QueuedCommand[] = []
function enqueue(command: QueuedCommand): void {
commandQueue.push({ ...command, priority: command.priority ?? 'next' })
}
function enqueuePendingNotification(command: QueuedCommand): void {
commandQueue.push({ ...command, priority: command.priority ?? 'later' })
}
function getCommandQueue(): QueuedCommand[] {
return [...commandQueue]
}
function hasCommandsInQueue(): boolean {
return commandQueue.length > 0
}
function resetCommandQueue(): void {
commandQueue.length = 0
}
function createMessageQueueManagerMock() {
return {
enqueue,
enqueuePendingNotification,
getCommandQueue,
hasCommandsInQueue,
resetCommandQueue,
}
}
function holdRunAgent(): void {
runAgentBlocker = new Promise(resolve => {
releaseRunAgentBlocker = resolve
})
}
function releaseRunAgent(): void {
releaseRunAgentBlocker?.()
runAgentBlocker = null
releaseRunAgentBlocker = null
}
mock.module('bun:bundle', () => ({
feature: (name: string) => name === 'KAIROS',
}))
mock.module(
'@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
() => ({
runAgent: async function* () {
runAgentStartCount += 1
if (runAgentBlocker) {
await runAgentBlocker
}
yield {
type: 'assistant',
uuid: 'assistant-1',
timestamp: new Date().toISOString(),
message: {
id: 'msg_1',
type: 'message',
role: 'assistant',
model: 'test-model',
content: [{ type: 'text', text: 'forked command done' }],
stop_reason: 'end_turn',
stop_sequence: null,
usage: {
input_tokens: 0,
output_tokens: 0,
},
},
}
},
}),
)
mock.module('@claude-code-best/builtin-tools/tools/AgentTool/UI.js', () => ({
AgentPromptDisplay: () => null,
AgentResponseDisplay: () => null,
extractLastToolInfo: () => null,
renderGroupedAgentToolUse: () => null,
renderToolResultMessage: () => null,
renderToolUseErrorMessage: () => null,
renderToolUseMessage: () => null,
renderToolUseProgressMessage: () => null,
renderToolUseRejectedMessage: () => null,
renderToolUseTag: () => null,
userFacingName: () => 'Agent',
userFacingNameBackgroundColor: () => 'gray',
}))
mock.module('../../messageQueueManager', createMessageQueueManagerMock)
mock.module('../../messageQueueManager.js', createMessageQueueManagerMock)
const { processSlashCommand } = await import('../processSlashCommand')
let tempDir = ''
function createScheduledTaskQueuedCommandForTest(task: {
id: string
prompt: string
}) {
return createScheduledTaskQueuedCommand(task, {
rootDir: tempDir,
currentDir: tempDir,
})
}
async function waitForRunStatus(
runId: string,
status: 'queued' | 'running' | 'completed' | 'failed' | 'cancelled',
): Promise<void> {
for (let i = 0; i < 200; i++) {
const run = await getAutonomyRunById(runId, tempDir)
if (run?.status === status) {
return
}
await new Promise(resolve => setTimeout(resolve, 10))
}
const run = await getAutonomyRunById(runId, tempDir)
throw new Error(`Expected ${runId} to be ${status}, got ${run?.status}`)
}
async function waitForRunAgentStarts(expected: number): Promise<void> {
for (let i = 0; i < 200; i++) {
if (runAgentStartCount >= expected) {
return
}
await new Promise(resolve => setTimeout(resolve, 10))
}
throw new Error(
`Expected runAgent to start ${expected} time(s), got ${runAgentStartCount}`,
)
}
async function waitForCommandQueueLength(expected: number): Promise<void> {
for (let i = 0; i < 200; i++) {
if (getCommandQueue().length === expected) {
return
}
await new Promise(resolve => setTimeout(resolve, 10))
}
throw new Error(
`Expected command queue length ${expected}, got ${getCommandQueue().length}`,
)
}
beforeEach(async () => {
tempDir = await createTempDir('process-slash-command-')
originalNodeEnv = process.env.NODE_ENV
originalAnthropicApiKey = process.env.ANTHROPIC_API_KEY
process.env.NODE_ENV = 'test'
process.env.ANTHROPIC_API_KEY = 'test-key'
runAgentBlocker = null
releaseRunAgentBlocker = null
runAgentStartCount = 0
resetStateForTests()
resetAutonomyAuthorityForTests()
resetCommandQueue()
setOriginalCwd(tempDir)
setProjectRoot(tempDir)
setCwdState(tempDir)
})
afterEach(async () => {
releaseRunAgent()
if (originalNodeEnv === undefined) {
delete process.env.NODE_ENV
} else {
process.env.NODE_ENV = originalNodeEnv
}
if (originalAnthropicApiKey === undefined) {
delete process.env.ANTHROPIC_API_KEY
} else {
process.env.ANTHROPIC_API_KEY = originalAnthropicApiKey
}
resetStateForTests()
resetAutonomyAuthorityForTests()
resetCommandQueue()
if (tempDir) {
await cleanupTempDir(tempDir)
}
mock.restore()
})
describe('processSlashCommand', () => {
const forkedCommand = {
type: 'prompt',
name: 'forked',
description: 'test forked command',
progressMessage: 'forking',
contentLength: 0,
source: 'builtin',
context: 'fork',
getPromptForCommand: async () => [
{ type: 'text', text: 'review from fork' },
],
} as const
function createContext() {
return {
getAppState: () => ({
kairosEnabled: true,
mcp: { clients: [] },
toolPermissionContext: {
mode: 'default',
alwaysAllowRules: {},
},
}),
options: {
commands: [forkedCommand],
allowBackgroundForkedSlashCommands: true,
tools: [],
refreshTools: () => [],
agentDefinitions: {
activeAgents: [{ agentType: 'general-purpose' }],
},
},
setResponseLength: mock((_updater: (length: number) => number) => {}),
} as any
}
test('defers autonomy completion until a KAIROS background forked command completes', async () => {
const queued = await createAutonomyQueuedPrompt({
basePrompt: '/forked review',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
})
expect(queued).not.toBeNull()
const runId = queued!.autonomy!.runId
await markAutonomyRunRunning(runId, tempDir, 100)
const result = await processSlashCommand(
'/forked review',
[],
[],
[],
createContext(),
mock(() => {}),
undefined,
false,
async () => ({ behavior: 'allow', updatedInput: {} }) as any,
queued!.autonomy,
)
expect(result).toMatchObject({
messages: [],
shouldQuery: false,
deferAutonomyCompletion: true,
})
await waitForRunStatus(runId, 'completed')
await waitForCommandQueueLength(1)
expect(getCommandQueue()).toEqual([
expect.objectContaining({
mode: 'prompt',
isMeta: true,
skipSlashCommands: true,
value: expect.stringContaining(
'<scheduled-task-result command="/forked">',
),
}),
])
})
test('keeps repeated /loop scheduled fires bounded while a background fork is running', async () => {
const task = {
id: 'cron-loop',
prompt: '/forked review',
}
const first = await createScheduledTaskQueuedCommandForTest(task)
expect(first?.autonomy?.runId).toBeDefined()
const runId = first!.autonomy!.runId
await markAutonomyRunRunning(runId, tempDir, 100)
holdRunAgent()
const result = await processSlashCommand(
'/forked review',
[],
[],
[],
createContext(),
mock(() => {}),
undefined,
false,
async () => ({ behavior: 'allow', updatedInput: {} }) as any,
first!.autonomy,
)
expect(result.deferAutonomyCompletion).toBe(true)
await waitForRunAgentStarts(1)
const repeatedFires = await Promise.all(
Array.from({ length: 200 }, () =>
createScheduledTaskQueuedCommandForTest(task),
),
)
expect(repeatedFires.every(command => command === null)).toBe(true)
expect(
(await listAutonomyRuns(tempDir)).filter(
run => run.sourceId === 'cron-loop',
),
).toHaveLength(1)
expect(getCommandQueue()).toHaveLength(0)
releaseRunAgent()
await waitForRunStatus(runId, 'completed')
await waitForCommandQueueLength(1)
expect(getCommandQueue()).toHaveLength(1)
const next = await createScheduledTaskQueuedCommandForTest(task)
expect(next?.autonomy?.runId).toBeDefined()
expect(
(await listAutonomyRuns(tempDir)).filter(
run => run.sourceId === 'cron-loop',
),
).toHaveLength(2)
})
test('rejects the background fork test override outside test runtime', async () => {
process.env.NODE_ENV = 'production'
const result = await processSlashCommand(
'/forked review',
[],
[],
[],
createContext(),
mock(() => {}),
undefined,
false,
async () => ({ behavior: 'allow', updatedInput: {} }) as any,
)
expect(result.shouldQuery).toBe(false)
expect(
result.messages.some(message =>
JSON.stringify(message).includes(
'allowBackgroundForkedSlashCommands is test-only',
),
),
).toBe(true)
expect(runAgentStartCount).toBe(0)
})
})

File diff suppressed because it is too large Load Diff

View File

@@ -28,6 +28,7 @@ import type {
import type { PermissionMode } from '../../types/permissions.js'
import {
isValidImagePaste,
type QueuedCommand,
type PromptInputMode,
} from '../../types/textInputTypes.js'
import {
@@ -80,6 +81,9 @@ export type ProcessUserInputBaseResult = {
// Used by /discover to chain into the selected feature's command
nextInput?: string
submitNextInput?: boolean
// When true, the command started detached work that will finalize its
// autonomy run after the background work completes.
deferAutonomyCompletion?: boolean
}
export async function processUserInput({
@@ -100,6 +104,7 @@ export async function processUserInput({
bridgeOrigin,
isMeta,
skipAttachments,
autonomy,
}: {
input: string | Array<ContentBlockParam>
/**
@@ -137,6 +142,7 @@ export async function processUserInput({
*/
isMeta?: boolean
skipAttachments?: boolean
autonomy?: QueuedCommand['autonomy']
}): Promise<ProcessUserInputBaseResult> {
const inputString = typeof input === 'string' ? input : null
// Immediately show the user input prompt while we are still processing the input.
@@ -168,6 +174,7 @@ export async function processUserInput({
isMeta,
skipAttachments,
preExpansionInput,
autonomy,
)
queryCheckpoint('query_process_user_input_base_end')
@@ -296,6 +303,7 @@ async function processUserInputBase(
isMeta?: boolean,
skipAttachments?: boolean,
preExpansionInput?: string,
autonomy?: QueuedCommand['autonomy'],
): Promise<ProcessUserInputBaseResult> {
let inputString: string | null = null
let precedingInputBlocks: ContentBlockParam[] = []
@@ -491,6 +499,7 @@ async function processUserInputBase(
uuid,
isAlreadyProcessing,
canUseTool,
autonomy,
)
return addImageMetadataMessage(slashResult, imageMetadataTexts)
}
@@ -549,6 +558,7 @@ async function processUserInputBase(
uuid,
isAlreadyProcessing,
canUseTool,
autonomy,
)
return addImageMetadataMessage(slashResult, imageMetadataTexts)
}

View File

@@ -424,8 +424,7 @@ function createInProcessCanUseTool(
feedback: parsed.error,
})
}
cleanup()
return
return // Callback already resolves the promise
}
}
}
@@ -675,6 +674,7 @@ type WaitResult =
type: 'new_message'
message: string
autonomyRunId?: string
autonomyRootDir?: string
from: string
color?: string
summary?: string
@@ -739,12 +739,16 @@ async function waitForNextPromptOrShutdown(
`[inProcessRunner] ${identity.agentName} found pending user message (poll #${pollCount})`,
)
if (pending.autonomyRunId) {
await markAutonomyRunRunning(pending.autonomyRunId)
await markAutonomyRunRunning(
pending.autonomyRunId,
pending.autonomyRootDir,
)
}
return {
type: 'new_message',
message: pending.message,
autonomyRunId: pending.autonomyRunId,
autonomyRootDir: pending.autonomyRootDir,
from: 'user',
}
}
@@ -1022,6 +1026,7 @@ export async function runInProcessTeammate(
)
let currentPrompt = wrappedInitialPrompt
let currentAutonomyRunId: string | undefined
let currentAutonomyRootDir: string | undefined
let shouldExit = false
// Try to claim an available task immediately so the UI can show activity
@@ -1319,12 +1324,21 @@ export async function runInProcessTeammate(
setAppState,
)
if (currentAutonomyRunId) {
await markAutonomyRunFailed(currentAutonomyRunId, ERROR_MESSAGE_USER_ABORT)
await markAutonomyRunFailed(
currentAutonomyRunId,
ERROR_MESSAGE_USER_ABORT,
currentAutonomyRootDir,
)
currentAutonomyRunId = undefined
currentAutonomyRootDir = undefined
}
} else if (currentAutonomyRunId) {
await markAutonomyRunCompleted(currentAutonomyRunId)
await markAutonomyRunCompleted(
currentAutonomyRunId,
currentAutonomyRootDir,
)
currentAutonomyRunId = undefined
currentAutonomyRootDir = undefined
}
// Check if already idle before updating (to skip duplicate notification)
@@ -1398,6 +1412,7 @@ export async function runInProcessTeammate(
setAppState,
)
currentAutonomyRunId = undefined
currentAutonomyRootDir = undefined
break
case 'new_message':
@@ -1410,6 +1425,7 @@ export async function runInProcessTeammate(
if (waitResult.from === 'user') {
currentPrompt = waitResult.message
currentAutonomyRunId = waitResult.autonomyRunId
currentAutonomyRootDir = waitResult.autonomyRootDir
} else {
currentPrompt = formatAsTeammateMessage(
waitResult.from,
@@ -1426,6 +1442,7 @@ export async function runInProcessTeammate(
setAppState,
)
currentAutonomyRunId = undefined
currentAutonomyRootDir = undefined
}
break
@@ -1533,7 +1550,11 @@ export async function runInProcessTeammate(
})
}
if (currentAutonomyRunId) {
await markAutonomyRunFailed(currentAutonomyRunId, errorMessage)
await markAutonomyRunFailed(
currentAutonomyRunId,
errorMessage,
currentAutonomyRootDir,
)
}
// Send idle notification with failure via file-based mailbox

View File

@@ -234,7 +234,7 @@ export function killInProcessTeammate(
let agentId: string | null = null
let toolUseId: string | undefined
let description: string | undefined
let pendingAutonomyRunIds: string[] = []
let pendingAutonomyRuns: Array<{ runId: string; rootDir?: string }> = []
setAppState((prev: AppState) => {
const task = prev.tasks[taskId]
@@ -255,9 +255,18 @@ export function killInProcessTeammate(
description = teammateTask.description
// Capture pending autonomy run IDs before clearing them
pendingAutonomyRunIds = teammateTask.pendingUserMessages
.map(message => message.autonomyRunId)
.filter((runId): runId is string => runId !== undefined)
pendingAutonomyRuns = teammateTask.pendingUserMessages.flatMap(message =>
message.autonomyRunId
? [
{
runId: message.autonomyRunId,
...(message.autonomyRootDir
? { rootDir: message.autonomyRootDir }
: {}),
},
]
: [],
)
// Abort the controller to stop execution
teammateTask.abortController?.abort()
@@ -311,10 +320,11 @@ export function killInProcessTeammate(
}
if (killed) {
for (const runId of pendingAutonomyRunIds) {
for (const run of pendingAutonomyRuns) {
void markAutonomyRunFailed(
runId,
run.runId,
`Teammate ${agentId ?? taskId} was stopped before it could consume the queued autonomy prompt.`,
run.rootDir,
)
}
void evictTaskOutput(taskId)