feat: harden autonomy lifecycle, OOM bounds, and provider-boundary finalization

This PR consolidates a coordinated batch of fixes around autonomy run/flow lifecycle, scheduled task deduplication, provider-boundary state finalization, and matching memory-bound treatments for adjacent long-running subsystems (REPL fullscreen scrollback, skill-search/skill-learning runtime activation). All changes were developed and reviewed together because they touched the same lifecycle invariants and were uncovered by the same long-running session reproductions.

## Lifecycle correctness

- Queued autonomy prompts are not injected unless the persisted run was successfully claimed; queued run claiming is now terminal-safe so a once-consumed/cancelled/failed run can not slip back into `queued`.
- Autonomy run/flow finalization happens on completion, provider error, generator close, and cancellation — not just the happy path. New `src/__tests__/queryAutonomyProviderBoundary.test.ts` covers these provider-boundary transitions.
- `requestManagedAutonomyFlowCancel` and `resumeManagedAutonomyFlowPrompt` carry `rootDir` and `currentDir` explicitly across detached async boundaries (proactive-tick, cron, daemon restart) instead of inferring from process state.
- Active runs/flows are protected from janitor pruning so a running step can not be garbage-collected mid-flight (`src/utils/autonomyAuthority.ts`).
- Heartbeat parser now ignores fenced code blocks; the two-phase commit window for autonomy state transitions is documented in `docs/internals/autonomy-jira.md`.

## Ownership and dedup

- `src/utils/autonomyRuns.ts`: ownership stamping (run id + rootDir carried end-to-end), source-based dedup against active runs.
- `src/hooks/useScheduledTasks.ts`: scheduled ticks deduplicate against runs already active on the same source label.
- `src/utils/processUserInput/processSlashCommand.tsx`: forked slash commands now thread the autonomy `runId` so completion finalizers can find the originating run for deferred completion.
- New `src/utils/autonomyQueueLifecycle.ts` and tests collect the queue-side lifecycle invariants in one place.

## Memory bounds (related, same review pass)

- `src/screens/REPL.tsx`: caps fullscreen scrollback after the compact boundary and updates trailing progress rows in place. Long-running fullscreen sessions could otherwise retain thousands of post-compaction messages and duplicate progress rows, keeping Ink trees alive long after their useful context had moved on.
- `src/services/skillSearch/*` and `src/services/skillLearning/*`: runtime activation is strictly opt-in via existing env toggles; session caches are capped so long-running processes can not grow them forever. Build presence is preserved so operators can still discover and opt into the slash commands.

## CI / test contract

- `tests/integration/dependency-overrides.test.ts`: smoke test no longer drives Mermaid's browser renderer; it validates the package-resolution contract directly so CI does not regress on unrelated browser timing.
- New `tests/integration/autonomy-lifecycle-user-flow.test.ts`: end-to-end CLI subprocess flow exercising `status --deep`, `flows`, `flow <id>`, `flow resume`, `flow cancel` against persisted state.
- `src/entrypoints/cli.tsx`: `claude autonomy …` routes through an entrypoint fast path that reuses the slash-command formatter without booting the full interactive CLI. Stdout is flushed before forced exit so coverage subprocesses do not terminate with empty stdout.
- `packages/builtin-tools/src/tools/RemoteTriggerTool/__tests__/RemoteTriggerTool.test.ts`: stabilized to prevent audit flake under coverage.

## Tests added

- `src/__tests__/queryAutonomyProviderBoundary.test.ts`
- `src/hooks/__tests__/useScheduledTasks.test.ts`
- `src/utils/__tests__/autonomyAuthority.test.ts`
- `src/utils/__tests__/autonomyFlows.test.ts` (extended)
- `src/utils/__tests__/autonomyPersistence.test.ts` (extended)
- `src/utils/__tests__/autonomyQueueLifecycle.test.ts`
- `src/utils/__tests__/autonomyRuns.test.ts` (extended)
- `src/utils/processUserInput/__tests__/processSlashCommand.test.ts`
- `tests/integration/autonomy-lifecycle-user-flow.test.ts`

## Docs

- `docs/agent/sur-loop-scheduled-oom.md`: System Understanding Report covering the scheduled/loop OOM problem, the call graphs investigated, and the lifecycle invariants this PR establishes.
- `docs/agent/sur-skill-overflow-bugs.md`: SUR for the related skill-overflow context.
- `docs/internals/autonomy-jira.md`: documents the two-phase commit window and ownership stamping invariants.
- `docs/memory-leak-audit.md`: audit notes covering the REPL/scrollback and skill-search bounds.

## Invariants this PR establishes

1. Queued autonomy prompts are not injected unless the persisted run was successfully claimed.
2. Terminal run/flow states are terminal — completion, failure, and cancellation all finalize state regardless of which provider/error path triggered them.
3. Autonomy run/flow `rootDir` is carried explicitly across detached async boundaries instead of inferred from a shared singleton.
4. State-only CLI subcommands (`autonomy status|runs|flows|flow …`) bypass full interactive bootstrap so they do not hold unrelated handles open.
5. REPL fullscreen scrollback and skill-search/skill-learning session caches are explicitly bounded.

## Validation

```bash
bun run typecheck
CI=true GITHUB_ACTIONS=true bun test            # 3996 pass / 0 fail across 305 files
bun test src/__tests__/queryAutonomyProviderBoundary.test.ts \
         src/hooks/__tests__/useScheduledTasks.test.ts \
         src/utils/__tests__/autonomy{Runs,Flows,Authority,QueueLifecycle,Persistence}.test.ts \
         src/utils/processUserInput/__tests__/processSlashCommand.test.ts \
         tests/integration/autonomy-lifecycle-user-flow.test.ts
```

## Origin

This PR is the consolidated, upstream-targeted version of two fork-side review PRs (fix/loop-scheduled-autonomy-oom and fix/autonomy-lifecycle). The fork-side review history is preserved at https://github.com/amDosion/claude-code-bast/pull/7 . The fork's own internal `chore: keep fork current with upstream` sync commits and the `docs: update contributors` automation are intentionally not included in this PR.

The autonomy CLI handler `rootDir` threading that the fork added (78f64d8a, 98d04ddb) is intentionally omitted here because upstream `a2cfaf91` (fix: 修复 RemoteTriggerTool 和 autonomy 测试的全量运行失败) already performed the equivalent change with an additional `currentDir` option. Keeping the upstream version avoids regressing that improvement.
This commit is contained in:
unraid
2026-04-29 14:04:27 +08:00
parent 4f1649e249
commit f2e9af4927
51 changed files with 4885 additions and 971 deletions

View File

@@ -5,6 +5,7 @@ import {
AUTONOMY_DIR,
buildAutonomyTurnPrompt,
loadAutonomyAuthority,
parseHeartbeatAuthorityTasks,
resetAutonomyAuthorityForTests,
} from '../autonomyAuthority'
import {
@@ -238,4 +239,79 @@ describe('autonomyAuthority', () => {
expect(prompt).not.toContain('- weekly-report (7d): Ship the weekly report')
expect(prompt).not.toContain('- gather (')
})
test('parseHeartbeatAuthorityTasks ignores tasks: literals inside markdown code fences', () => {
const content = [
'# HEARTBEAT.md',
'',
'```yaml',
'tasks:',
' - name: not-a-real-task',
' interval: 1m',
' prompt: "would-be-shadowed"',
'```',
'',
'tasks:',
' - name: real-task',
' interval: 30m',
' prompt: "Real prompt"',
].join('\n')
const parsed = parseHeartbeatAuthorityTasks(content)
expect(parsed).toHaveLength(1)
expect(parsed[0]).toMatchObject({
name: 'real-task',
interval: '30m',
prompt: 'Real prompt',
})
})
test('parseHeartbeatAuthorityTasks ignores tasks: literals inside tilde markdown code fences', () => {
const content = [
'# HEARTBEAT.md',
'',
'~~~yaml',
'tasks:',
' - name: not-a-real-task',
' interval: 1m',
' prompt: "would-be-shadowed"',
'~~~',
'',
'tasks:',
' - name: real-task',
' interval: 30m',
' prompt: "Real prompt"',
].join('\n')
const parsed = parseHeartbeatAuthorityTasks(content)
expect(parsed).toHaveLength(1)
expect(parsed[0]).toMatchObject({
name: 'real-task',
interval: '30m',
prompt: 'Real prompt',
})
})
test('parseHeartbeatAuthorityTasks parses real tasks even when documentation precedes them', () => {
const content = [
'# Heartbeat docs',
'',
'See `tasks:` below — the parser keys on the literal at column 0.',
'',
'tasks:',
' - name: weekly',
' interval: 7d',
' prompt: "Ship report"',
].join('\n')
const parsed = parseHeartbeatAuthorityTasks(content)
// Inline `tasks:` mention does NOT collide because it's not at column 0
// on its own line — the existing line.trim() === 'tasks:' guard handles
// that case. This test pins the behaviour.
expect(parsed).toHaveLength(1)
expect(parsed[0]?.name).toBe('weekly')
})
})

View File

@@ -126,6 +126,14 @@ describe('listAutonomyFlows', () => {
runCount: 0,
ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
currentDir: tempDir,
boundary: [
' src/utils/** ',
'/absolute/not-allowed',
'src\\windows',
'../outside',
'src/utils/**',
'docs/*.md',
],
stateJson: {
currentStepIndex: 0,
steps: [
@@ -147,6 +155,7 @@ describe('listAutonomyFlows', () => {
expect(flows).toHaveLength(1)
expect(flows[0]?.flowId).toBe('flow-1')
expect(flows[0]?.syncMode).toBe('managed')
expect(flows[0]?.boundary).toEqual(['src/utils/**', 'docs/*.md'])
expect(flows[0]?.stateJson?.steps).toHaveLength(1)
})
@@ -191,6 +200,64 @@ describe('listAutonomyFlows', () => {
const flows = await listAutonomyFlows(tempDir)
expect(flows).toEqual([])
})
test('persistence pruning keeps active flows ahead of recent terminal history', async () => {
const flows: AutonomyFlowRecord[] = [
{
flowId: 'old-active',
flowKey: 'managed:scheduled-task:old-active',
syncMode: 'managed',
ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
revision: 1,
trigger: 'scheduled-task',
status: 'queued',
goal: 'old active',
rootDir: tempDir,
currentDir: tempDir,
runCount: 0,
createdAt: 1,
updatedAt: 1,
},
...Array.from({ length: 100 }, (_, index) => ({
flowId: `history-${index}`,
flowKey: `managed:scheduled-task:history-${index}`,
syncMode: 'managed' as const,
ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
revision: 1,
trigger: 'scheduled-task' as const,
status: 'succeeded' as const,
goal: `history ${index}`,
rootDir: tempDir,
currentDir: tempDir,
runCount: 1,
createdAt: 1_000 + index,
updatedAt: 1_000 + index,
endedAt: 2_000 + index,
})),
]
const flowsPath = resolveAutonomyFlowsPath(tempDir)
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
await writeFile(
flowsPath,
`${JSON.stringify({ flows }, null, 2)}\n`,
'utf-8',
)
await startManagedAutonomyFlow({
trigger: 'scheduled-task',
goal: 'fresh active',
steps: TWO_STEPS,
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'fresh-active',
nowMs: 9_999,
})
const persisted = await listAutonomyFlows(tempDir)
expect(persisted).toHaveLength(100)
expect(persisted.some(flow => flow.flowId === 'old-active')).toBe(true)
expect(persisted.some(flow => flow.flowId === 'history-0')).toBe(false)
})
})
describe('startManagedAutonomyFlow', () => {
@@ -225,6 +292,49 @@ describe('startManagedAutonomyFlow', () => {
expect(result!.nextStep!.step.name).toBe('gather')
})
test('normalizes and preserves boundary across completed flow restarts', async () => {
const first = await startManagedAutonomyFlow({
trigger: 'scheduled-task',
goal: 'Scoped flow',
steps: [{ name: 'only', prompt: 'Do it' }],
rootDir: tempDir,
sourceId: 'scoped-src',
boundary: [' src/utils/** ', 'src\\bad', '/absolute', 'docs/*.md'],
nowMs: 1000,
})
const flowId = first!.flow.flowId
expect(first!.flow.boundary).toEqual(['src/utils/**', 'docs/*.md'])
await queueManagedAutonomyFlowStepRun({
flowId,
stepId: first!.nextStep!.step.stepId,
stepIndex: 0,
runId: 'run-1',
rootDir: tempDir,
nowMs: 2000,
})
await markManagedAutonomyFlowStepCompleted({
flowId,
runId: 'run-1',
rootDir: tempDir,
nowMs: 3000,
})
const restarted = await startManagedAutonomyFlow({
trigger: 'scheduled-task',
goal: 'Scoped flow',
steps: [{ name: 'only', prompt: 'Do it again' }],
rootDir: tempDir,
sourceId: 'scoped-src',
nowMs: 4000,
})
expect(restarted!.started).toBe(true)
expect(restarted!.flow.flowId).toBe(flowId)
expect(restarted!.flow.boundary).toEqual(['src/utils/**', 'docs/*.md'])
})
test('sets status=waiting when first step has waitFor', async () => {
const result = await startManagedAutonomyFlow({
trigger: 'scheduled-task',

View File

@@ -54,6 +54,25 @@ describe('withAutonomyPersistenceLock', () => {
).rejects.toThrow('inner failure')
})
test('releases same-root lock bookkeeping after success and failure', async () => {
const {
getAutonomyPersistenceLockCountForTests,
withAutonomyPersistenceLock,
} = await import('../autonomyPersistence')
expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
await withAutonomyPersistenceLock(tempDir, async () => 'ok')
expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
await expect(
withAutonomyPersistenceLock(tempDir, async () => {
throw new Error('inner failure')
}),
).rejects.toThrow('inner failure')
expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
})
test('serializes concurrent calls on the same rootDir', async () => {
const { withAutonomyPersistenceLock } = await import(
'../autonomyPersistence'

View File

@@ -0,0 +1,279 @@
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
import { createTempDir, cleanupTempDir } from '../../../tests/mocks/file-system'
import { getAttachmentMessages } from '../attachments'
import {
createAutonomyQueuedPrompt,
createProactiveAutonomyCommands,
getAutonomyRunById,
markAutonomyRunCancelled,
startManagedAutonomyFlowFromHeartbeatTask,
} from '../autonomyRuns'
import { getAutonomyFlowById, listAutonomyFlows } from '../autonomyFlows'
import {
cancelQueuedAutonomyCommands,
claimConsumableQueuedAutonomyCommands,
finalizeAutonomyCommandsForTurn,
partitionConsumableQueuedAutonomyCommands,
} from '../autonomyQueueLifecycle'
import {
enqueue,
getCommandsByMaxPriority,
remove as removeFromQueue,
resetCommandQueue,
} from '../messageQueueManager'
let tempDir = ''
let extraTempDirs: string[] = []
beforeEach(async () => {
tempDir = await createTempDir('autonomy-queue-lifecycle-')
extraTempDirs = []
resetCommandQueue()
})
afterEach(async () => {
resetCommandQueue()
if (tempDir) {
await cleanupTempDir(tempDir)
}
for (const extraTempDir of extraTempDirs) {
await cleanupTempDir(extraTempDir)
}
})
describe('autonomyQueueLifecycle', () => {
async function consumeQueuedAutonomyAttachmentTurn() {
const previousDisableAttachments =
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = '1'
try {
const snapshot = getCommandsByMaxPriority('later')
const claim = await claimConsumableQueuedAutonomyCommands(
snapshot,
tempDir,
)
removeFromQueue(claim.staleCommands)
removeFromQueue(claim.claimedCommands)
const attachments = []
for await (const attachment of getAttachmentMessages(
null,
{} as never,
null,
claim.attachmentCommands,
[],
)) {
attachments.push(attachment)
}
const consumedCommands = claim.attachmentCommands.filter(
command =>
(command.mode === 'prompt' || command.mode === 'task-notification') &&
!claim.claimedCommands.includes(command),
)
removeFromQueue(consumedCommands)
const nextCommands = await finalizeAutonomyCommandsForTurn({
commands: claim.claimedCommands,
outcome: { type: 'completed' },
currentDir: tempDir,
priority: 'later',
})
for (const command of nextCommands) {
enqueue(command)
}
return { attachments, runningRunIds: claim.claimedRunIds, nextCommands }
} finally {
if (previousDisableAttachments === undefined) {
delete process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
} else {
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = previousDisableAttachments
}
}
}
test('filters stale autonomy commands before mid-turn attachment consumption', async () => {
const command = await createAutonomyQueuedPrompt({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
})
expect(command).not.toBeNull()
const initial = await partitionConsumableQueuedAutonomyCommands(
[command!],
tempDir,
)
expect(initial.attachmentCommands).toHaveLength(1)
expect(initial.staleCommands).toHaveLength(0)
await markAutonomyRunCancelled(command!.autonomy!.runId, tempDir)
const afterCancel = await partitionConsumableQueuedAutonomyCommands(
[command!],
tempDir,
)
expect(afterCancel.attachmentCommands).toHaveLength(0)
expect(afterCancel.staleCommands).toHaveLength(1)
})
test('cancels proactive commands that are created but dropped before enqueue', async () => {
const commands = await createProactiveAutonomyCommands({
basePrompt: '<tick>12:00:00</tick>',
rootDir: tempDir,
currentDir: tempDir,
})
expect(commands).toHaveLength(1)
const queuedRun = await getAutonomyRunById(
commands[0]!.autonomy!.runId,
tempDir,
)
expect(queuedRun!.status).toBe('queued')
await cancelQueuedAutonomyCommands({ commands, rootDir: tempDir })
const cancelledRun = await getAutonomyRunById(
commands[0]!.autonomy!.runId,
tempDir,
)
expect(cancelledRun!.status).toBe('cancelled')
})
test('uses command rootDir when claiming after project context changes', async () => {
const otherProjectDir = await createTempDir('autonomy-other-project-')
extraTempDirs.push(otherProjectDir)
const command = await createAutonomyQueuedPrompt({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
})
expect(command).not.toBeNull()
expect(command!.autonomy?.rootDir).toBe(tempDir)
const claim = await claimConsumableQueuedAutonomyCommands(
[command!],
otherProjectDir,
)
const originalRun = await getAutonomyRunById(
command!.autonomy!.runId,
tempDir,
)
const wrongProjectRun = await getAutonomyRunById(
command!.autonomy!.runId,
otherProjectDir,
)
expect(claim.claimedRunIds).toEqual([command!.autonomy!.runId])
expect(claim.attachmentCommands).toHaveLength(1)
expect(originalRun!.status).toBe('running')
expect(wrongProjectRun).toBeNull()
})
test('advances a managed flow consumed as a queued attachment', async () => {
const command = await startManagedAutonomyFlowFromHeartbeatTask({
task: {
name: 'weekly-report',
interval: '7d',
prompt: 'Ship the weekly report',
steps: [
{ name: 'gather', prompt: 'Gather weekly inputs' },
{ name: 'draft', prompt: 'Draft weekly report' },
],
},
rootDir: tempDir,
currentDir: tempDir,
})
expect(command).not.toBeNull()
const claim = await claimConsumableQueuedAutonomyCommands(
[command!],
tempDir,
)
const runningRunIds = claim.claimedRunIds
expect(runningRunIds).toEqual([command!.autonomy!.runId])
const nextCommands = await finalizeAutonomyCommandsForTurn({
commands: claim.claimedCommands,
outcome: { type: 'completed' },
currentDir: tempDir,
priority: 'later',
})
const [flow] = await listAutonomyFlows(tempDir)
const detail = await getAutonomyFlowById(flow!.flowId, tempDir)
const run = await getAutonomyRunById(command!.autonomy!.runId, tempDir)
expect(run!.status).toBe('completed')
expect(nextCommands).toHaveLength(1)
expect(nextCommands[0]!.autonomy?.flowId).toBe(flow!.flowId)
expect(detail!.stateJson!.steps.map(step => step.status)).toEqual([
'completed',
'queued',
])
})
test('keeps managed autonomy flow coherent across queued attachment turns', async () => {
const firstCommand = await startManagedAutonomyFlowFromHeartbeatTask({
task: {
name: 'weekly-report',
interval: '7d',
prompt: 'Ship the weekly report',
steps: [
{ name: 'gather', prompt: 'Gather weekly inputs' },
{ name: 'draft', prompt: 'Draft weekly report' },
],
},
rootDir: tempDir,
currentDir: tempDir,
})
expect(firstCommand).not.toBeNull()
enqueue(firstCommand!)
const firstTurn = await consumeQueuedAutonomyAttachmentTurn()
const queuedAfterFirstTurn = getCommandsByMaxPriority('later')
const [flowAfterFirstTurn] = await listAutonomyFlows(tempDir)
const firstRun = await getAutonomyRunById(
firstCommand!.autonomy!.runId,
tempDir,
)
expect(firstTurn.attachments).toHaveLength(1)
expect(firstTurn.attachments[0]!.attachment?.type).toBe('queued_command')
expect(firstTurn.runningRunIds).toEqual([firstCommand!.autonomy!.runId])
expect(firstTurn.nextCommands).toHaveLength(1)
expect(queuedAfterFirstTurn).toHaveLength(1)
expect(queuedAfterFirstTurn[0]!.autonomy?.flowId).toBe(
flowAfterFirstTurn!.flowId,
)
expect(firstRun!.status).toBe('completed')
expect(
flowAfterFirstTurn!.stateJson!.steps.map(step => step.status),
).toEqual(['completed', 'queued'])
const secondCommand = queuedAfterFirstTurn[0]!
const secondTurn = await consumeQueuedAutonomyAttachmentTurn()
const queuedAfterSecondTurn = getCommandsByMaxPriority('later')
const finalFlow = await getAutonomyFlowById(
flowAfterFirstTurn!.flowId,
tempDir,
)
const secondRun = await getAutonomyRunById(
secondCommand.autonomy!.runId,
tempDir,
)
expect(secondTurn.attachments).toHaveLength(1)
expect(secondTurn.runningRunIds).toEqual([secondCommand.autonomy!.runId])
expect(secondTurn.nextCommands).toHaveLength(0)
expect(queuedAfterSecondTurn).toHaveLength(0)
expect(secondRun!.status).toBe('completed')
expect(finalFlow!.status).toBe('succeeded')
expect(finalFlow!.stateJson!.steps.map(step => step.status)).toEqual([
'completed',
'completed',
])
})
})

View File

@@ -1,6 +1,7 @@
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
import { existsSync, readFileSync } from 'fs'
import { mkdir, writeFile } from 'fs/promises'
import { join } from 'path'
import { join, resolve as resolvePath } from 'path'
import {
resetStateForTests,
setCwdState,
@@ -8,17 +9,23 @@ import {
setProjectRoot,
} from '../../bootstrap/state'
import {
createAutonomyRun,
formatAutonomyRunsList,
formatAutonomyRunsStatus,
listAutonomyRuns,
createAutonomyQueuedPrompt,
createAutonomyQueuedPromptIfNoActiveSource,
createProactiveAutonomyCommands,
finalizeAutonomyRunCompleted,
getAutonomyRunById,
hasActiveAutonomyRunForSource,
markAutonomyRunCompleted,
markAutonomyRunCancelled,
markAutonomyRunFailed,
markAutonomyRunRunning,
recoverManagedAutonomyFlowPrompt,
resolveAutonomyRunsPath,
STALE_ACTIVE_RUN_ERROR_PREFIX,
startManagedAutonomyFlowFromHeartbeatTask,
} from '../autonomyRuns'
import {
@@ -95,7 +102,9 @@ describe('autonomyRuns', () => {
ownerKey: 'main-thread',
sourceId: 'cron-1',
sourceLabel: 'nightly-report',
ownerProcessId: process.pid,
})
expect(runs[0]?.ownerSessionId).toBeString()
expect(flows).toHaveLength(0)
expect(resolveAutonomyRunsPath(tempDir)).toContain('.claude')
})
@@ -118,7 +127,7 @@ describe('autonomyRuns', () => {
expect(command!.value).toContain('nested authority')
})
test('markAutonomyRunRunning/completed/failed update persisted lifecycle state for plain runs', async () => {
test('markAutonomyRunRunning/completed update persisted lifecycle state for plain runs', async () => {
const command = await createAutonomyQueuedPrompt({
basePrompt: '<tick>12:00:00</tick>',
trigger: 'proactive-tick',
@@ -134,7 +143,9 @@ describe('autonomyRuns', () => {
runId,
status: 'running',
startedAt: 100,
ownerProcessId: process.pid,
})
expect(runs[0]?.ownerSessionId).toBeString()
await markAutonomyRunCompleted(runId, tempDir, 200)
runs = await listAutonomyRuns(tempDir)
@@ -143,9 +154,22 @@ describe('autonomyRuns', () => {
status: 'completed',
endedAt: 200,
})
})
test('markAutonomyRunFailed updates a non-terminal run', async () => {
const command = await createAutonomyQueuedPrompt({
basePrompt: '<tick>12:00:00</tick>',
trigger: 'proactive-tick',
rootDir: tempDir,
currentDir: tempDir,
})
expect(command).not.toBeNull()
const runId = command!.autonomy!.runId
await markAutonomyRunRunning(runId, tempDir, 100)
await markAutonomyRunFailed(runId, 'boom', tempDir, 300)
runs = await listAutonomyRuns(tempDir)
const runs = await listAutonomyRuns(tempDir)
expect(runs[0]).toMatchObject({
runId,
status: 'failed',
@@ -154,6 +178,348 @@ describe('autonomyRuns', () => {
})
})
test('terminal runs are not revived by stale lifecycle updates', async () => {
const command = await createAutonomyQueuedPrompt({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
})
expect(command).not.toBeNull()
const runId = command!.autonomy!.runId
await markAutonomyRunCancelled(runId, tempDir, 100)
const revived = await markAutonomyRunRunning(runId, tempDir, 200)
const completed = await markAutonomyRunCompleted(runId, tempDir, 300)
const failed = await markAutonomyRunFailed(
runId,
'late failure',
tempDir,
400,
)
const persisted = await getAutonomyRunById(runId, tempDir)
expect(revived).toBeNull()
expect(completed).toBeNull()
expect(failed).toBeNull()
expect(persisted).toMatchObject({
status: 'cancelled',
endedAt: 100,
})
expect(persisted!.error).toBeUndefined()
})
test('hasActiveAutonomyRunForSource only treats queued and running scheduled runs as active', async () => {
const command = await createAutonomyQueuedPrompt({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
sourceLabel: 'nightly',
})
expect(command).not.toBeNull()
const runId = command!.autonomy!.runId
await expect(
hasActiveAutonomyRunForSource({
trigger: 'scheduled-task',
sourceId: 'cron-1',
rootDir: tempDir,
}),
).resolves.toBe(true)
await markAutonomyRunRunning(runId, tempDir, 100)
await expect(
hasActiveAutonomyRunForSource({
trigger: 'scheduled-task',
sourceId: 'cron-1',
rootDir: tempDir,
}),
).resolves.toBe(true)
await expect(
hasActiveAutonomyRunForSource({
trigger: 'scheduled-task',
sourceId: 'cron-2',
rootDir: tempDir,
}),
).resolves.toBe(false)
await markAutonomyRunCompleted(runId, tempDir, 200)
await expect(
hasActiveAutonomyRunForSource({
trigger: 'scheduled-task',
sourceId: 'cron-1',
rootDir: tempDir,
}),
).resolves.toBe(false)
const failedCommand = await createAutonomyQueuedPrompt({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
})
expect(failedCommand).not.toBeNull()
await markAutonomyRunFailed(
failedCommand!.autonomy!.runId,
'boom',
tempDir,
300,
)
await expect(
hasActiveAutonomyRunForSource({
trigger: 'scheduled-task',
sourceId: 'cron-1',
rootDir: tempDir,
}),
).resolves.toBe(false)
})
test('createAutonomyQueuedPromptIfNoActiveSource atomically skips duplicate active scheduled sources', async () => {
const [first, second] = await Promise.all([
createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
}),
createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
}),
])
const created = [first, second].filter(command => command !== null)
const runs = await listAutonomyRuns(tempDir)
expect(created).toHaveLength(1)
expect(runs).toHaveLength(1)
expect(runs[0]).toMatchObject({
trigger: 'scheduled-task',
status: 'queued',
sourceId: 'cron-1',
})
})
test('createAutonomyQueuedPromptIfNoActiveSource scopes dedup by ownerKey', async () => {
const first = await createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
ownerKey: 'owner-a',
})
const second = await createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
ownerKey: 'owner-b',
})
const runs = await listAutonomyRuns(tempDir)
expect(first).not.toBeNull()
expect(second).not.toBeNull()
expect(runs).toHaveLength(2)
expect(new Set(runs.map(run => run.ownerKey))).toEqual(
new Set(['owner-a', 'owner-b']),
)
})
test('createAutonomyQueuedPromptIfNoActiveSource does not advance heartbeat last-run state on dedup skip (two-phase commit invariant)', async () => {
await writeTempFile(
tempDir,
HEARTBEAT_REL,
[
'tasks:',
' - name: inbox',
' interval: 30m',
' prompt: "Check inbox"',
].join('\n'),
)
// Seed an active queued run for cron-1 so the next dedup attempt skips.
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
await writeFile(
resolveAutonomyRunsPath(tempDir),
`${JSON.stringify(
{
runs: [
{
runId: 'preexisting-active',
runtime: 'automatic',
trigger: 'scheduled-task',
status: 'queued',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
promptPreview: 'still queued',
createdAt: 100,
ownerProcessId: process.pid,
ownerSessionId: 'self',
},
],
},
null,
2,
)}\n`,
'utf-8',
)
const skipped = await createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
})
expect(skipped).toBeNull()
// If the dedup skip wrongly advanced heartbeat state, the next
// proactive-tick prompt would NOT include the inbox task. Verify it
// still does.
const followUp = await createAutonomyQueuedPrompt({
basePrompt: '<tick>12:00:00</tick>',
trigger: 'proactive-tick',
rootDir: tempDir,
currentDir: tempDir,
})
expect(followUp).not.toBeNull()
expect(followUp!.value).toContain('Due HEARTBEAT.md tasks:')
expect(followUp!.value).toContain('- inbox (30m): Check inbox')
})
test('createAutonomyQueuedPromptIfNoActiveSource recovers stale active runs from dead owner processes', async () => {
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
await writeFile(
resolveAutonomyRunsPath(tempDir),
`${JSON.stringify(
{
runs: [
{
runId: 'stale-run',
runtime: 'automatic',
trigger: 'scheduled-task',
status: 'running',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
sourceLabel: 'nightly',
promptPreview: 'stale scheduled prompt',
createdAt: 100,
startedAt: 100,
ownerProcessId: 2_147_483_647,
ownerSessionId: 'dead-session',
},
],
},
null,
2,
)}\n`,
'utf-8',
)
await expect(
hasActiveAutonomyRunForSource({
trigger: 'scheduled-task',
sourceId: 'cron-1',
rootDir: tempDir,
}),
).resolves.toBe(false)
const command = await createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: 'scheduled prompt',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
})
const runs = await listAutonomyRuns(tempDir)
expect(command).not.toBeNull()
expect(runs).toHaveLength(2)
expect(runs[0]).toMatchObject({
trigger: 'scheduled-task',
status: 'queued',
sourceId: 'cron-1',
ownerProcessId: process.pid,
})
expect(runs[1]).toMatchObject({
runId: 'stale-run',
status: 'failed',
endedAt: runs[0]?.createdAt,
error: expect.stringContaining('owner process 2147483647'),
})
})
test('stale managed-flow run recovery also marks the flow step failed', async () => {
const command = await startManagedAutonomyFlowFromHeartbeatTask({
task: {
name: 'weekly-report',
interval: '7d',
prompt: 'Ship the weekly report',
steps: [
{
name: 'gather',
prompt: 'Gather weekly inputs',
},
],
},
rootDir: tempDir,
currentDir: tempDir,
})
expect(command).not.toBeNull()
const runId = command!.autonomy!.runId
await markAutonomyRunRunning(runId, tempDir, 100)
const runsPath = resolveAutonomyRunsPath(tempDir)
const file = JSON.parse(readFileSync(runsPath, 'utf-8')) as {
runs: Array<Record<string, unknown>>
}
file.runs = file.runs.map(run =>
run.runId === runId
? { ...run, ownerProcessId: 2_147_483_647 }
: run,
)
await writeFile(runsPath, `${JSON.stringify(file, null, 2)}\n`, 'utf-8')
const replacement = await createAutonomyQueuedPromptIfNoActiveSource({
basePrompt: 'replacement prompt',
trigger: 'managed-flow-step',
rootDir: tempDir,
currentDir: tempDir,
sourceId: command!.autonomy!.sourceId!,
ownerKey: 'main-thread',
})
const [flow] = await listAutonomyFlows(tempDir)
const runs = await listAutonomyRuns(tempDir)
expect(replacement).not.toBeNull()
expect(runs.find(run => run.runId === runId)).toMatchObject({
status: 'failed',
error: expect.stringContaining(STALE_ACTIVE_RUN_ERROR_PREFIX),
})
expect(flow).toMatchObject({
status: 'failed',
blockedRunId: runId,
})
expect(flow?.stateJson?.steps[0]).toMatchObject({
status: 'failed',
runId,
error: expect.stringContaining(STALE_ACTIVE_RUN_ERROR_PREFIX),
})
})
test('formatters produce readable status and run listings', async () => {
const first = await createAutonomyQueuedPrompt({
basePrompt: 'scheduled prompt',
@@ -223,6 +589,53 @@ describe('autonomyRuns', () => {
)
})
test('persistence pruning keeps active runs ahead of recent completed history', async () => {
const runs = [
{
runId: 'old-active',
runtime: 'automatic',
trigger: 'scheduled-task',
status: 'queued',
rootDir: tempDir,
currentDir: tempDir,
ownerKey: 'main-thread',
promptPreview: 'old active',
createdAt: 1,
},
...Array.from({ length: 200 }, (_, index) => ({
runId: `history-${index}`,
runtime: 'automatic',
trigger: 'scheduled-task',
status: 'completed',
rootDir: tempDir,
currentDir: tempDir,
ownerKey: 'main-thread',
promptPreview: `history ${index}`,
createdAt: 1_000 + index,
endedAt: 2_000 + index,
})),
]
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
await writeFile(
resolveAutonomyRunsPath(tempDir),
`${JSON.stringify({ runs }, null, 2)}\n`,
'utf-8',
)
await createAutonomyRun({
trigger: 'scheduled-task',
prompt: 'fresh active',
rootDir: tempDir,
currentDir: tempDir,
nowMs: 9_999,
})
const persisted = await listAutonomyRuns(tempDir)
expect(persisted).toHaveLength(200)
expect(persisted.some(run => run.runId === 'old-active')).toBe(true)
expect(persisted.some(run => run.runId === 'history-0')).toBe(false)
})
test('listAutonomyRuns keeps older persisted records by normalizing missing runtime and owner metadata', async () => {
const runsPath = resolveAutonomyRunsPath(tempDir)
await mkdir(join(tempDir, '.claude', 'autonomy'), { recursive: true })
@@ -418,4 +831,27 @@ describe('autonomyRuns', () => {
expect(recovered!.autonomy?.runId).toBe(command!.autonomy?.runId)
expect(recovered!.autonomy?.flowId).toBe(flow!.flowId)
})
test('STALE_ACTIVE_RUN_ERROR_PREFIX stays in sync with HEARTBEAT.md stale-recovery-health task', () => {
// The HEARTBEAT.md stale-recovery-health task prompt embeds this prefix
// as a literal string. Changing the constant without updating the
// heartbeat prompt would silently break the monitor — this test fails
// first to force the simultaneous update.
const heartbeatPath = resolvePath(
import.meta.dir,
'..',
'..',
'..',
'.claude',
'autonomy',
'HEARTBEAT.md',
)
if (!existsSync(heartbeatPath)) {
// .claude/ may be absent in some checkout layouts (e.g., shallow clone
// for npm pack). Skip rather than fail in that case.
return
}
const content = readFileSync(heartbeatPath, 'utf8')
expect(content).toContain(STALE_ACTIVE_RUN_ERROR_PREFIX)
})
})

View File

@@ -133,11 +133,41 @@ function mergeAgentsAuthority(files: AutonomyAuthorityFile[]): string | null {
.join('\n\n')
}
/**
* Replaces fenced code-block content (and the ``` / ~~~ fence delimiters
* themselves) with empty strings while preserving the index of every
* other line. Used by the heartbeat parser so that `tasks:` literals
* appearing inside Markdown code samples in HEARTBEAT.md docs do not
* collide with the real config block.
*/
function maskCodeFencedLines(lines: string[]): string[] {
const masked = lines.slice()
let activeFenceChar: '`' | '~' | null = null
for (let i = 0; i < masked.length; i++) {
const trimmed = masked[i]!.trim()
const fenceMatch = trimmed.match(/^(```+|~~~+)/)
if (fenceMatch) {
const fenceChar = fenceMatch[1]![0] as '`' | '~'
if (activeFenceChar === null) {
activeFenceChar = fenceChar
} else if (activeFenceChar === fenceChar) {
activeFenceChar = null
}
masked[i] = ''
continue
}
if (activeFenceChar !== null) {
masked[i] = ''
}
}
return masked
}
export function parseHeartbeatAuthorityTasks(
content: string,
): HeartbeatAuthorityTask[] {
const tasks: HeartbeatAuthorityTask[] = []
const lines = content.split('\n')
const lines = maskCodeFencedLines(content.split('\n'))
const getIndent = (line: string): number =>
line.length - line.trimStart().length
const parseScalar = (line: string, key: string): string =>

View File

@@ -83,6 +83,20 @@ export type AutonomyFlowRecord = {
waitJson?: AutonomyFlowWaitState
cancelRequestedAt?: number
lastError?: string
/**
* Repo-relative POSIX glob patterns describing which paths this flow's
* `report`-step approval covers. The pre-tool-use hook
* `require-plan-for-risky-edit.mjs` consults this list to permit edits
* only when the target file matches at least one entry. Absent or empty
* means "no boundary declared" — during the pilot window the hook
* treats this as broad approval (v1 behaviour). Once all production
* flows declare boundaries, the hook will deny absent-boundary flows.
*
* Supported syntax: `*` matches one path segment, `**` matches any
* number including zero. Examples: `src/utils/autonomy*`,
* `src/services/api/**`, `src/Tool.ts`.
*/
boundary?: string[]
}
type AutonomyFlowsFile = {
@@ -138,6 +152,7 @@ function cloneWaitState(
function cloneFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
return {
...flow,
...(flow.boundary ? { boundary: [...flow.boundary] } : {}),
...(flow.stateJson ? { stateJson: cloneManagedState(flow.stateJson) } : {}),
...(flow.waitJson ? { waitJson: cloneWaitState(flow.waitJson) } : {}),
}
@@ -152,6 +167,25 @@ function isManagedFlowStatusActive(status: AutonomyFlowStatus): boolean {
)
}
function selectPersistedAutonomyFlows(
flows: AutonomyFlowRecord[],
): AutonomyFlowRecord[] {
const retained = flows
.slice()
.map(cloneFlowRecord)
.sort((left, right) => {
const leftActive = isManagedFlowStatusActive(left.status)
const rightActive = isManagedFlowStatusActive(right.status)
if (leftActive !== rightActive) {
return leftActive ? -1 : 1
}
return right.updatedAt - left.updatedAt
})
.slice(0, AUTONOMY_FLOWS_MAX)
return retained.sort((left, right) => right.updatedAt - left.updatedAt)
}
function defaultFlowSource(params: {
trigger: AutonomyTriggerKind
sourceId?: string
@@ -237,6 +271,35 @@ function normalizeWaitState(value: unknown): AutonomyFlowWaitState | undefined {
}
}
function isPosixBoundaryGlob(value: string): boolean {
if (!value || value.startsWith('/') || value.includes('\\')) {
return false
}
if (value.includes('\0')) {
return false
}
return !value.split('/').some(segment => segment === '..')
}
function normalizeBoundary(value: unknown): string[] | undefined {
if (!Array.isArray(value)) {
return undefined
}
const seen = new Set<string>()
const boundary = value
.filter((entry): entry is string => typeof entry === 'string')
.map(entry => entry.trim())
.filter(isPosixBoundaryGlob)
.filter(entry => {
if (seen.has(entry)) {
return false
}
seen.add(entry)
return true
})
return boundary.length > 0 ? boundary : undefined
}
function normalizeFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
const source = defaultFlowSource(flow)
return {
@@ -247,6 +310,7 @@ function normalizeFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
goal: flow.goal || flow.sourceLabel || flow.sourceId || flow.flowKey,
currentDir: flow.currentDir || flow.rootDir,
runCount: Math.max(flow.runCount ?? 0, 0),
boundary: normalizeBoundary(flow.boundary),
stateJson: normalizeManagedState(flow.stateJson),
waitJson: normalizeWaitState(flow.waitJson),
...(flow.sourceId
@@ -369,11 +433,7 @@ async function writeAutonomyFlows(
path,
`${JSON.stringify(
{
flows: flows
.slice()
.map(cloneFlowRecord)
.sort((left, right) => right.updatedAt - left.updatedAt)
.slice(0, AUTONOMY_FLOWS_MAX),
flows: selectPersistedAutonomyFlows(flows),
} satisfies AutonomyFlowsFile,
null,
2,
@@ -420,6 +480,7 @@ export async function startManagedAutonomyFlow(params: {
ownerKey?: string
sourceId?: string
sourceLabel?: string
boundary?: string[]
nowMs?: number
}): Promise<ManagedAutonomyFlowStartResult | null> {
if (params.steps.length === 0) {
@@ -450,6 +511,8 @@ export async function startManagedAutonomyFlow(params: {
const stateJson = buildManagedState(params.steps)
const firstStep = stateJson.steps[0]!
const boundary =
normalizeBoundary(params.boundary) ?? normalizeBoundary(current?.boundary)
const waiting =
firstStep.waitFor != null
? {
@@ -474,6 +537,7 @@ export async function startManagedAutonomyFlow(params: {
currentDir,
...(source.sourceId ? { sourceId: source.sourceId } : {}),
...(source.sourceLabel ? { sourceLabel: source.sourceLabel } : {}),
...(boundary ? { boundary } : {}),
latestRunId: undefined,
runCount: current?.runCount ?? 0,
createdAt: current?.createdAt ?? nowMs,

View File

@@ -4,6 +4,15 @@ import { lock } from './lockfile.js'
const persistenceLocks = new Map<string, Promise<void>>()
export function getAutonomyPersistenceLockCountForTests(): number {
if (process.env.NODE_ENV !== 'test') {
throw new Error(
'getAutonomyPersistenceLockCountForTests can only be called in tests',
)
}
return persistenceLocks.size
}
export async function withAutonomyPersistenceLock<T>(
rootDir: string,
fn: () => Promise<T>,
@@ -16,10 +25,8 @@ export async function withAutonomyPersistenceLock<T>(
const current = new Promise<void>(resolve => {
release = resolve
})
persistenceLocks.set(
key,
previous.then(() => current),
)
const chained = previous.then(() => current)
persistenceLocks.set(key, chained)
await previous
try {
@@ -41,7 +48,7 @@ export async function withAutonomyPersistenceLock<T>(
}
} finally {
release()
if (persistenceLocks.get(key) === current) {
if (persistenceLocks.get(key) === chained) {
persistenceLocks.delete(key)
}
}

View File

@@ -0,0 +1,261 @@
import type { QueuedCommand } from '../types/textInputTypes.js'
import {
finalizeAutonomyRunCompleted,
finalizeAutonomyRunFailed,
listAutonomyRuns,
markAutonomyRunCancelled,
markAutonomyRunRunning,
} from './autonomyRuns.js'
export type AutonomyQueuePartition = {
attachmentCommands: QueuedCommand[]
staleCommands: QueuedCommand[]
}
export type AutonomyQueueClaim = AutonomyQueuePartition & {
claimedRunIds: string[]
claimedCommands: QueuedCommand[]
}
export type AutonomyTurnOutcome =
| { type: 'completed' }
| { type: 'cancelled' }
| { type: 'failed'; error?: unknown; message?: string }
type AutonomyRunRef = {
runId: string
rootDir?: string
}
function getCommandRootDir(
command: QueuedCommand,
fallbackRootDir?: string,
): string | undefined {
return command.autonomy?.rootDir ?? fallbackRootDir
}
function refKey(ref: AutonomyRunRef): string {
return `${ref.rootDir ?? ''}\0${ref.runId}`
}
function getAutonomyRunRefs(
commands: QueuedCommand[],
fallbackRootDir?: string,
): AutonomyRunRef[] {
const refs = new Map<string, AutonomyRunRef>()
for (const command of commands) {
const runId = command.autonomy?.runId
if (!runId) {
continue
}
const ref = {
runId,
rootDir: getCommandRootDir(command, fallbackRootDir),
}
refs.set(refKey(ref), ref)
}
return [...refs.values()]
}
function isInlineQueuedCommand(command: QueuedCommand): boolean {
return command.mode === 'prompt' || command.mode === 'task-notification'
}
function groupRefsByRootDir(
refs: AutonomyRunRef[],
): Map<string, AutonomyRunRef[]> {
const grouped = new Map<string, AutonomyRunRef[]>()
for (const ref of refs) {
const key = ref.rootDir ?? ''
const group = grouped.get(key)
if (group) {
group.push(ref)
} else {
grouped.set(key, [ref])
}
}
return grouped
}
/**
* Exclude queued autonomy commands whose persisted run is no longer queued.
* This prevents stale in-memory commands from reviving flows after cancellation
* or after another path has already consumed the run.
*/
export async function partitionConsumableQueuedAutonomyCommands(
commands: QueuedCommand[],
rootDir?: string,
): Promise<AutonomyQueuePartition> {
const attachmentCommands: QueuedCommand[] = []
const staleCommands: QueuedCommand[] = []
const refs = getAutonomyRunRefs(commands, rootDir)
const runsByRef = new Map<
string,
Awaited<ReturnType<typeof listAutonomyRuns>>[number]
>()
for (const [rootKey, group] of groupRefsByRootDir(refs)) {
const runs = await listAutonomyRuns(rootKey || undefined)
const wanted = new Set(group.map(ref => ref.runId))
for (const run of runs) {
if (wanted.has(run.runId)) {
runsByRef.set(
refKey({ runId: run.runId, rootDir: rootKey || undefined }),
run,
)
}
}
}
for (const command of commands) {
const runId = command.autonomy?.runId
if (!runId) {
attachmentCommands.push(command)
continue
}
const commandRootDir = getCommandRootDir(command, rootDir)
const run = runsByRef.get(refKey({ runId, rootDir: commandRootDir }))
if (run?.status === 'queued' && !run.startedAt && !run.endedAt) {
attachmentCommands.push(command)
} else {
staleCommands.push(command)
}
}
return { attachmentCommands, staleCommands }
}
export async function claimConsumableQueuedAutonomyCommands(
commands: QueuedCommand[],
rootDir?: string,
): Promise<AutonomyQueueClaim> {
const partition = await partitionConsumableQueuedAutonomyCommands(
commands,
rootDir,
)
const claimedRunIds: string[] = []
const claimedRunKeys: string[] = []
const staleRunKeys = new Set<string>()
const candidateRefs = getAutonomyRunRefs(
partition.attachmentCommands.filter(isInlineQueuedCommand),
rootDir,
)
for (const ref of candidateRefs) {
const updated = await markAutonomyRunRunning(ref.runId, ref.rootDir)
if (updated?.status === 'running') {
claimedRunIds.push(ref.runId)
claimedRunKeys.push(refKey(ref))
} else {
staleRunKeys.add(refKey(ref))
}
}
const claimedRunKeySet = new Set(claimedRunKeys)
const attachmentCommands: QueuedCommand[] = []
const claimedCommands: QueuedCommand[] = []
const staleCommands = [...partition.staleCommands]
for (const command of partition.attachmentCommands) {
const runId = command.autonomy?.runId
if (!runId) {
attachmentCommands.push(command)
continue
}
const key = refKey({
runId,
rootDir: getCommandRootDir(command, rootDir),
})
if (claimedRunKeySet.has(key)) {
attachmentCommands.push(command)
claimedCommands.push(command)
} else if (staleRunKeys.has(key)) {
staleCommands.push(command)
}
}
return {
attachmentCommands,
staleCommands,
claimedRunIds,
claimedCommands,
}
}
export async function cancelQueuedAutonomyCommands(params: {
commands: QueuedCommand[]
rootDir?: string
}): Promise<void> {
for (const ref of getAutonomyRunRefs(params.commands, params.rootDir)) {
await markAutonomyRunCancelled(ref.runId, ref.rootDir)
}
}
function stringifyAutonomyError(error: unknown): string {
if (typeof error === 'string') {
return error
}
if (error instanceof Error) {
return error.message
}
return String(error)
}
export function sanitizeAutonomyFailureForPersistence(
error: unknown,
fallback = 'query failed',
): string {
const message = stringifyAutonomyError(error)
const lower = message.toLowerCase()
if (
lower.includes('api_error') ||
lower.includes('provider') ||
lower.includes('openai') ||
lower.includes('gemini') ||
lower.includes('grok') ||
lower.includes('anthropic') ||
lower.includes('bedrock') ||
lower.includes('vertex')
) {
return 'provider api_error'
}
return fallback
}
export async function finalizeAutonomyCommandsForTurn(params: {
commands: QueuedCommand[]
outcome: AutonomyTurnOutcome
currentDir?: string
priority?: 'now' | 'next' | 'later'
workload?: string
}): Promise<QueuedCommand[]> {
const nextCommands: QueuedCommand[] = []
for (const command of params.commands) {
const autonomy = command.autonomy
if (!autonomy?.runId) {
continue
}
if (params.outcome.type === 'completed') {
nextCommands.push(
...(await finalizeAutonomyRunCompleted({
runId: autonomy.runId,
rootDir: autonomy.rootDir,
currentDir: params.currentDir,
priority: params.priority,
workload: command.workload ?? params.workload,
})),
)
} else if (params.outcome.type === 'cancelled') {
await markAutonomyRunCancelled(autonomy.runId, autonomy.rootDir)
} else {
await finalizeAutonomyRunFailed({
runId: autonomy.runId,
rootDir: autonomy.rootDir,
error:
params.outcome.message ??
sanitizeAutonomyFailureForPersistence(params.outcome.error),
})
}
}
return nextCommands
}

View File

@@ -1,7 +1,7 @@
import { randomUUID } from 'crypto'
import { mkdir, writeFile } from 'fs/promises'
import { dirname, join, resolve } from 'path'
import { getProjectRoot } from '../bootstrap/state.js'
import { getProjectRoot, getSessionId } from '../bootstrap/state.js'
import type { MessageOrigin } from '../types/message.js'
import type { QueuedCommand } from '../types/textInputTypes.js'
import {
@@ -29,9 +29,22 @@ import {
} from './autonomyFlows.js'
import { withAutonomyPersistenceLock } from './autonomyPersistence.js'
import { getFsImplementation } from './fsOperations.js'
import { isProcessRunning } from './genericProcessUtils.js'
import { logError } from './log.js'
const AUTONOMY_RUNS_MAX = 200
const AUTONOMY_RUNS_RELATIVE_PATH = join(AUTONOMY_DIR, 'runs.json')
// Sentinel string surfaced to operators via runs.json error fields and
// referenced literally by the HEARTBEAT.md `stale-recovery-health` task.
// A unit test asserts the HEARTBEAT.md file contains this exact prefix —
// changing the value will fail the test, forcing the heartbeat prompt
// to be updated in the same change.
export const STALE_ACTIVE_RUN_ERROR_PREFIX =
'Recovered stale active autonomy run'
// Guards the legacy-block warning so it fires once per (process, runId) instead
// of every dedup tick while a no-owner record sits there.
const warnedLegacyBlockRunIds = new Set<string>()
export type AutonomyRunStatus =
| 'queued'
@@ -59,6 +72,8 @@ export type AutonomyRunRecord = {
flowStepName?: string
promptPreview: string
createdAt: number
ownerProcessId?: number
ownerSessionId?: string
startedAt?: number
endedAt?: number
error?: string
@@ -77,6 +92,19 @@ type AutonomyRunFlowRef = {
stepName: string
}
type CreateAutonomyRunParams = {
trigger: AutonomyTriggerKind
prompt: string
rootDir?: string
currentDir?: string
sourceId?: string
sourceLabel?: string
runtime?: AutonomyRunRuntime
ownerKey?: string
flow?: AutonomyRunFlowRef
nowMs?: number
}
function truncatePromptPreview(prompt: string): string {
const singleLine = prompt.replace(/\s+/g, ' ').trim()
return singleLine.length <= 240
@@ -95,6 +123,29 @@ function cloneRunRecord(run: AutonomyRunRecord): AutonomyRunRecord {
return { ...run }
}
function isAutonomyRunActive(run: AutonomyRunRecord): boolean {
return run.status === 'queued' || run.status === 'running'
}
function selectPersistedAutonomyRuns(
runs: AutonomyRunRecord[],
): AutonomyRunRecord[] {
const retained = runs
.slice()
.map(cloneRunRecord)
.sort((left, right) => {
const leftActive = isAutonomyRunActive(left)
const rightActive = isAutonomyRunActive(right)
if (leftActive !== rightActive) {
return leftActive ? -1 : 1
}
return right.createdAt - left.createdAt
})
.slice(0, AUTONOMY_RUNS_MAX)
return retained.sort((left, right) => right.createdAt - left.createdAt)
}
function normalizePersistedRunRecord(
run: PersistedAutonomyRunRecord,
): AutonomyRunRecord {
@@ -157,11 +208,7 @@ async function writeAutonomyRuns(
path,
`${JSON.stringify(
{
runs: runs
.slice()
.map(cloneRunRecord)
.sort((left, right) => right.createdAt - left.createdAt)
.slice(0, AUTONOMY_RUNS_MAX),
runs: selectPersistedAutonomyRuns(runs),
} satisfies AutonomyRunsFile,
null,
2,
@@ -172,7 +219,7 @@ async function writeAutonomyRuns(
async function updateAutonomyRun(
runId: string,
updater: (current: AutonomyRunRecord) => AutonomyRunRecord,
updater: (current: AutonomyRunRecord) => AutonomyRunRecord | null,
rootDir: string = getProjectRoot(),
): Promise<AutonomyRunRecord | null> {
return withAutonomyPersistenceLock(rootDir, async () => {
@@ -181,7 +228,11 @@ async function updateAutonomyRun(
if (index === -1) {
return null
}
const updated = cloneRunRecord(updater(cloneRunRecord(runs[index]!)))
const next = updater(cloneRunRecord(runs[index]!))
if (!next) {
return null
}
const updated = cloneRunRecord(next)
runs[index] = updated
await writeAutonomyRuns(runs, rootDir)
return updated
@@ -196,21 +247,112 @@ export async function getAutonomyRunById(
return runs.find(run => run.runId === runId) ?? null
}
export async function createAutonomyRun(params: {
function isActiveAutonomyRunStatus(status: AutonomyRunStatus): boolean {
return status === 'queued' || status === 'running'
}
function isValidOwnerProcessId(pid: number | undefined): pid is number {
// Reject non-numeric, negative, zero (Linux: send-to-process-group), and
// non-integer values. A forged record with pid=0 or pid<0 used to be
// treated as live and could permanently block dedup; treating them as
// stale closes that availability hole.
return (
typeof pid === 'number' &&
Number.isInteger(pid) &&
pid > 0 &&
pid < 4_194_304
)
}
function isStaleActiveAutonomyRun(run: AutonomyRunRecord): boolean {
if (!isActiveAutonomyRunStatus(run.status)) {
return false
}
if (run.ownerProcessId === undefined) {
return false
}
if (!isValidOwnerProcessId(run.ownerProcessId)) {
return true
}
return !isProcessRunning(run.ownerProcessId)
}
function staleActiveRunError(run: AutonomyRunRecord): string {
return `${STALE_ACTIVE_RUN_ERROR_PREFIX}: owner process ${run.ownerProcessId} is no longer running.`
}
function failAutonomyRunRecord(
run: AutonomyRunRecord,
error: string,
nowMs: number,
): AutonomyRunRecord {
return {
...run,
status: 'failed',
endedAt: nowMs,
error,
}
}
function recoverStaleActiveAutonomyRun(
run: AutonomyRunRecord,
nowMs: number,
): AutonomyRunRecord {
return failAutonomyRunRecord(run, staleActiveRunError(run), nowMs)
}
async function syncFailedManagedFlowForRun(
run: AutonomyRunRecord,
rootDir: string,
): Promise<void> {
if (run.parentFlowId && run.parentFlowSyncMode === 'managed') {
await markManagedAutonomyFlowStepFailed({
flowId: run.parentFlowId,
runId: run.runId,
error: run.error ?? 'Autonomy run failed.',
rootDir,
nowMs: run.endedAt,
})
}
}
function matchesActiveAutonomyRunSource(
run: AutonomyRunRecord,
params: {
trigger: AutonomyTriggerKind
sourceId: string
ownerKey?: string
},
): boolean {
return (
run.trigger === params.trigger &&
run.sourceId === params.sourceId &&
(params.ownerKey === undefined || run.ownerKey === params.ownerKey) &&
isActiveAutonomyRunStatus(run.status)
)
}
export async function hasActiveAutonomyRunForSource(params: {
trigger: AutonomyTriggerKind
prompt: string
sourceId: string
rootDir?: string
currentDir?: string
sourceId?: string
sourceLabel?: string
runtime?: AutonomyRunRuntime
ownerKey?: string
flow?: AutonomyRunFlowRef
nowMs?: number
}): Promise<AutonomyRunRecord> {
const rootDir = resolve(params.rootDir ?? getProjectRoot())
const currentDir = resolve(params.currentDir ?? rootDir)
const record: AutonomyRunRecord = {
}): Promise<boolean> {
const runs = await listAutonomyRuns(params.rootDir)
return runs.some(
run =>
matchesActiveAutonomyRunSource(run, params) &&
!isStaleActiveAutonomyRun(run),
)
}
function buildAutonomyRunRecord(
params: CreateAutonomyRunParams,
rootDir: string,
currentDir: string,
): AutonomyRunRecord {
const createdAt = params.nowMs ?? Date.now()
return {
runId: randomUUID(),
runtime: params.runtime ?? (params.flow ? 'flow_step' : 'automatic'),
trigger: params.trigger,
@@ -231,13 +373,80 @@ export async function createAutonomyRun(params: {
}
: {}),
promptPreview: truncatePromptPreview(params.prompt),
createdAt: params.nowMs ?? Date.now(),
createdAt,
ownerProcessId: process.pid,
ownerSessionId: getSessionId(),
}
}
async function persistAutonomyRunRecord(
record: AutonomyRunRecord,
rootDir: string,
skipWhenActiveSource: boolean,
): Promise<{
created: boolean
recoveredStaleRuns: AutonomyRunRecord[]
}> {
let created = false
const recoveredStaleRuns: AutonomyRunRecord[] = []
await withAutonomyPersistenceLock(rootDir, async () => {
const runs = await listAutonomyRuns(rootDir)
const sourceId = record.sourceId
if (skipWhenActiveSource && sourceId) {
let hasBlockingActiveRun = false
let staleRecoveriesApplied = false
for (let i = 0; i < runs.length; i++) {
const run = runs[i]!
if (
!matchesActiveAutonomyRunSource(run, {
trigger: record.trigger,
sourceId,
ownerKey: record.ownerKey,
})
) {
continue
}
if (isStaleActiveAutonomyRun(run)) {
const recovered = recoverStaleActiveAutonomyRun(
run,
record.createdAt,
)
runs[i] = recovered
recoveredStaleRuns.push(recovered)
staleRecoveriesApplied = true
continue
}
if (
run.ownerProcessId === undefined &&
!warnedLegacyBlockRunIds.has(run.runId)
) {
warnedLegacyBlockRunIds.add(run.runId)
logError(
new Error(
`[autonomyRuns] blocked by legacy un-owned active run ${run.runId} (createdAt=${run.createdAt}); cancel manually if this is a stale upgrade artifact`,
),
)
}
hasBlockingActiveRun = true
}
if (hasBlockingActiveRun) {
if (staleRecoveriesApplied) {
await writeAutonomyRuns(runs, rootDir)
}
return
}
}
runs.unshift(record)
await writeAutonomyRuns(runs, rootDir)
created = true
})
return { created, recoveredStaleRuns }
}
async function queueManagedFlowStepRunForRecord(
record: AutonomyRunRecord,
rootDir: string,
): Promise<void> {
if (
record.parentFlowId &&
record.flowStepId &&
@@ -258,9 +467,47 @@ export async function createAutonomyRun(params: {
nowMs: record.createdAt,
})
}
}
async function createAutonomyRunCore(
params: CreateAutonomyRunParams,
skipIfActiveSource: boolean,
): Promise<AutonomyRunRecord | null> {
const rootDir = resolve(params.rootDir ?? getProjectRoot())
const currentDir = resolve(params.currentDir ?? rootDir)
const record = buildAutonomyRunRecord(params, rootDir, currentDir)
const { created, recoveredStaleRuns } = await persistAutonomyRunRecord(
record,
rootDir,
skipIfActiveSource,
)
for (const recovered of recoveredStaleRuns) {
await syncFailedManagedFlowForRun(recovered, rootDir)
}
if (!created) {
return null
}
await queueManagedFlowStepRunForRecord(record, rootDir)
return record
}
export async function createAutonomyRun(
params: CreateAutonomyRunParams,
): Promise<AutonomyRunRecord> {
const record = await createAutonomyRunCore(params, false)
if (!record) {
throw new Error('Autonomy run was unexpectedly skipped.')
}
return record
}
export async function createAutonomyRunIfNoActiveSource(
params: CreateAutonomyRunParams & { sourceId: string },
): Promise<AutonomyRunRecord | null> {
return createAutonomyRunCore(params, true)
}
function buildManagedFlowStepPrompt(
flow: AutonomyFlowRecord,
stepIndex: number,
@@ -336,6 +583,7 @@ async function createOrRecoverManagedFlowStepCommand(params: {
workload: params.workload,
autonomy: {
runId: run.runId,
rootDir: run.rootDir,
trigger: 'managed-flow-step',
sourceId: run.sourceId,
sourceLabel: run.sourceLabel,
@@ -426,11 +674,16 @@ export async function markAutonomyRunRunning(
): Promise<AutonomyRunRecord | null> {
const updated = await updateAutonomyRun(
runId,
current => ({
...current,
status: 'running',
startedAt: nowMs ?? Date.now(),
}),
current =>
current.status === 'queued'
? {
...current,
status: 'running',
startedAt: nowMs ?? Date.now(),
ownerProcessId: process.pid,
ownerSessionId: getSessionId(),
}
: null,
rootDir,
)
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
@@ -451,12 +704,15 @@ export async function markAutonomyRunCompleted(
): Promise<AutonomyRunRecord | null> {
const updated = await updateAutonomyRun(
runId,
current => ({
...current,
status: 'completed',
endedAt: nowMs ?? Date.now(),
error: undefined,
}),
current =>
current.status === 'queued' || current.status === 'running'
? {
...current,
status: 'completed',
endedAt: nowMs ?? Date.now(),
error: undefined,
}
: null,
rootDir,
)
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
@@ -476,24 +732,17 @@ export async function markAutonomyRunFailed(
rootDir?: string,
nowMs?: number,
): Promise<AutonomyRunRecord | null> {
const endedAt = nowMs ?? Date.now()
const updated = await updateAutonomyRun(
runId,
current => ({
...current,
status: 'failed',
endedAt: nowMs ?? Date.now(),
error,
}),
current =>
isActiveAutonomyRunStatus(current.status)
? failAutonomyRunRecord(current, error, endedAt)
: null,
rootDir,
)
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
await markManagedAutonomyFlowStepFailed({
flowId: updated.parentFlowId,
runId: updated.runId,
error,
rootDir,
nowMs: updated.endedAt,
})
if (updated) {
await syncFailedManagedFlowForRun(updated, rootDir ?? updated.rootDir)
}
return updated
}
@@ -505,12 +754,15 @@ export async function markAutonomyRunCancelled(
): Promise<AutonomyRunRecord | null> {
const updated = await updateAutonomyRun(
runId,
current => ({
...current,
status: 'cancelled',
endedAt: nowMs ?? Date.now(),
error: undefined,
}),
current =>
current.status === 'queued' || current.status === 'running'
? {
...current,
status: 'cancelled',
endedAt: nowMs ?? Date.now(),
error: undefined,
}
: null,
rootDir,
)
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
@@ -612,6 +864,7 @@ export async function createAutonomyQueuedPrompt(params: {
currentDir?: string
sourceId?: string
sourceLabel?: string
ownerKey?: string
workload?: string
priority?: 'now' | 'next' | 'later'
shouldCreate?: () => boolean
@@ -634,39 +887,130 @@ export async function createAutonomyQueuedPrompt(params: {
currentDir,
sourceId: params.sourceId,
sourceLabel: params.sourceLabel,
ownerKey: params.ownerKey,
workload: params.workload,
priority: params.priority,
flow: params.flow,
})
}
export async function createAutonomyQueuedPromptIfNoActiveSource(params: {
trigger: AutonomyTriggerKind
basePrompt: string
rootDir?: string
currentDir?: string
sourceId: string
sourceLabel?: string
ownerKey?: string
workload?: string
priority?: 'now' | 'next' | 'later'
shouldCreate?: () => boolean
}): Promise<QueuedCommand | null> {
const rootDir = resolve(params.rootDir ?? getProjectRoot())
const currentDir = resolve(params.currentDir ?? getCwd())
// Cheap optimistic pre-check: skip the AGENTS.md / HEARTBEAT.md disk
// reads + prompt assembly when an active run for this source already
// blocks dedup. The lock-side check inside persistAutonomyRunRecord
// remains authoritative; this only fast-paths the common storm case.
if (
await hasActiveAutonomyRunForSource({
trigger: params.trigger,
sourceId: params.sourceId,
rootDir,
ownerKey: params.ownerKey,
})
) {
return null
}
const prepared = await prepareAutonomyTurnPrompt({
basePrompt: params.basePrompt,
trigger: params.trigger,
rootDir,
currentDir,
})
if (params.shouldCreate && !params.shouldCreate()) {
return null
}
return commitAutonomyQueuedPromptIfNoActiveSource({
prepared,
rootDir,
currentDir,
sourceId: params.sourceId,
sourceLabel: params.sourceLabel,
ownerKey: params.ownerKey,
workload: params.workload,
priority: params.priority,
})
}
export async function commitAutonomyQueuedPrompt(params: {
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
rootDir?: string
currentDir?: string
sourceId?: string
sourceLabel?: string
ownerKey?: string
workload?: string
priority?: 'now' | 'next' | 'later'
flow?: AutonomyRunFlowRef
}): Promise<QueuedCommand> {
const command = await commitAutonomyQueuedPromptInternal(params, false)
if (!command) {
throw new Error('Autonomy queued prompt was unexpectedly skipped.')
}
return command
}
async function commitAutonomyQueuedPromptIfNoActiveSource(params: {
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
rootDir?: string
currentDir?: string
sourceId: string
sourceLabel?: string
ownerKey?: string
workload?: string
priority?: 'now' | 'next' | 'later'
}): Promise<QueuedCommand | null> {
return commitAutonomyQueuedPromptInternal(params, true)
}
async function commitAutonomyQueuedPromptInternal(
params: {
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
rootDir?: string
currentDir?: string
sourceId?: string
sourceLabel?: string
ownerKey?: string
workload?: string
priority?: 'now' | 'next' | 'later'
flow?: AutonomyRunFlowRef
},
skipWhenActiveSource: boolean,
): Promise<QueuedCommand | null> {
const rootDir = resolve(
params.rootDir ?? params.prepared.rootDir ?? getProjectRoot(),
)
const currentDir = resolve(
params.currentDir ?? params.prepared.currentDir ?? getCwd(),
)
commitPreparedAutonomyTurn(params.prepared)
const value = params.prepared.prompt
const run = await createAutonomyRun({
const runParams: CreateAutonomyRunParams = {
trigger: params.prepared.trigger,
prompt: value,
rootDir,
currentDir,
sourceId: params.sourceId,
sourceLabel: params.sourceLabel,
ownerKey: params.ownerKey,
flow: params.flow,
})
}
const useDedup = skipWhenActiveSource && Boolean(params.sourceId)
const run = await createAutonomyRunCore(runParams, useDedup)
if (!run) {
return null
}
commitPreparedAutonomyTurn(params.prepared)
const origin = {
kind: 'autonomy',
trigger: params.prepared.trigger,
@@ -683,6 +1027,7 @@ export async function commitAutonomyQueuedPrompt(params: {
workload: params.workload,
autonomy: {
runId: run.runId,
rootDir: run.rootDir,
trigger: params.prepared.trigger,
sourceId: params.sourceId,
sourceLabel: params.sourceLabel,

View File

@@ -19,6 +19,7 @@ import {
} from '../types/textInputTypes.js'
import { createAbortController } from './abortController.js'
import type { PastedContent } from './config.js'
import { getCwd } from './cwd.js'
import { logForDebugging } from './debug.js'
import type { EffortValue } from './effort.js'
import type { FileHistoryState } from './fileHistory.js'
@@ -27,11 +28,9 @@ import { gracefulShutdownSync } from './gracefulShutdown.js'
import { enqueue } from './messageQueueManager.js'
import { resolveSkillModelOverride } from './model/model.js'
import {
finalizeAutonomyRunCompleted,
finalizeAutonomyRunFailed,
markAutonomyRunFailed,
markAutonomyRunRunning,
} from './autonomyRuns.js'
claimConsumableQueuedAutonomyCommands,
finalizeAutonomyCommandsForTurn,
} from './autonomyQueueLifecycle.js'
import type { ProcessUserInputContext } from './processUserInput/processUserInput.js'
import { processUserInput } from './processUserInput/processUserInput.js'
import type { QueryGuard } from './QueryGuard.js'
@@ -459,7 +458,14 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
// Iterate all commands uniformly. First command gets attachments +
// ideSelection + pastedContents, rest skip attachments to avoid
// duplicating turn-level context (IDE selection, todos, diffs).
const commands = queuedCommands ?? []
let commands = queuedCommands ?? []
const queuedAutonomyClaim =
await claimConsumableQueuedAutonomyCommands(commands)
commands = queuedAutonomyClaim.attachmentCommands
const claimedAutonomyCommands = queuedAutonomyClaim.claimedCommands
if (commands.length === 0) {
return
}
// Compute the workload tag for this turn. queueProcessor can batch a
// cron prompt with a same-tick human prompt; only tag when EVERY
@@ -471,7 +477,7 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
commands.every(c => c.workload === firstWorkload)
? firstWorkload
: undefined
let autonomyRunIds: string[] | undefined
const deferredAutonomyRunIds = new Set<string>()
// Wrap the entire turn (processUserInput loop + onQuery) in an
// AsyncLocalStorage context. This is the ONLY way to correctly
@@ -486,10 +492,7 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
for (let i = 0; i < commands.length; i++) {
const cmd = commands[i]!
const isFirst = i === 0
if (cmd.autonomy?.runId) {
;(autonomyRunIds ??= []).push(cmd.autonomy.runId)
await markAutonomyRunRunning(cmd.autonomy.runId)
}
const runId = cmd.autonomy?.runId
const result = await processUserInput({
input: cmd.value,
preExpansionInput: cmd.preExpansionValue,
@@ -510,7 +513,11 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
bridgeOrigin: cmd.bridgeOrigin,
isMeta: cmd.isMeta,
skipAttachments: !isFirst,
autonomy: cmd.autonomy,
})
if (runId && result.deferAutonomyCompletion) {
deferredAutonomyRunIds.add(runId)
}
// Stamp origin here rather than threading another arg through
// processUserInput → processUserInputBase → processTextPrompt → createUserMessage.
// Derive origin from mode for task-notifications — mirrors the origin
@@ -611,26 +618,35 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
}
}
}) // end runWithWorkload — ALS context naturally scoped, no finally needed
if (autonomyRunIds?.length) {
for (const runId of autonomyRunIds) {
const nextCommands = await finalizeAutonomyRunCompleted({
runId,
priority: 'later',
workload: turnWorkload,
})
for (const nextCommand of nextCommands) {
enqueue(nextCommand)
}
if (claimedAutonomyCommands.length) {
const finalizableCommands = claimedAutonomyCommands.filter(command => {
const runId = command.autonomy?.runId
return !runId || !deferredAutonomyRunIds.has(runId)
})
const nextCommands = await finalizeAutonomyCommandsForTurn({
commands: finalizableCommands,
outcome: { type: 'completed' },
currentDir: getCwd(),
priority: 'later',
workload: turnWorkload,
})
for (const nextCommand of nextCommands) {
enqueue(nextCommand)
}
}
} catch (error) {
if (autonomyRunIds?.length) {
for (const runId of autonomyRunIds) {
await finalizeAutonomyRunFailed({
runId,
error: String(error),
})
}
if (claimedAutonomyCommands.length) {
const finalizableCommands = claimedAutonomyCommands.filter(command => {
const runId = command.autonomy?.runId
return !runId || !deferredAutonomyRunIds.has(runId)
})
await finalizeAutonomyCommandsForTurn({
commands: finalizableCommands,
outcome: { type: 'failed', error },
currentDir: getCwd(),
priority: 'later',
workload: turnWorkload,
})
}
throw error
}

View File

@@ -1,173 +1,162 @@
import { describe, expect, test, beforeEach, afterEach } from "bun:test";
import { mock } from "bun:test";
import { describe, expect, test, beforeEach, afterEach } from 'bun:test'
let mockedModelType: "gemini" | undefined;
const { getAPIProvider, isFirstPartyAnthropicBaseUrl } = await import(
'../providers'
)
mock.module("../../settings/settings.js", () => ({
getInitialSettings: () =>
mockedModelType ? { modelType: mockedModelType } : {},
}));
const { getAPIProvider, isFirstPartyAnthropicBaseUrl } =
await import("../providers");
describe("getAPIProvider", () => {
describe('getAPIProvider', () => {
const envKeys = [
"CLAUDE_CODE_USE_GEMINI",
"CLAUDE_CODE_USE_BEDROCK",
"CLAUDE_CODE_USE_VERTEX",
"CLAUDE_CODE_USE_FOUNDRY",
"CLAUDE_CODE_USE_OPENAI",
] as const;
const savedEnv: Record<string, string | undefined> = {};
'CLAUDE_CODE_USE_GEMINI',
'CLAUDE_CODE_USE_BEDROCK',
'CLAUDE_CODE_USE_VERTEX',
'CLAUDE_CODE_USE_FOUNDRY',
'CLAUDE_CODE_USE_OPENAI',
'CLAUDE_CODE_USE_GROK',
] as const
const savedEnv: Record<string, string | undefined> = {}
beforeEach(() => {
// Save and clear environment variables
mockedModelType = undefined;
for (const key of envKeys) {
savedEnv[key] = process.env[key];
delete process.env[key];
savedEnv[key] = process.env[key]
delete process.env[key]
}
});
})
afterEach(() => {
// Restore environment variables
mockedModelType = undefined;
for (const key of envKeys) {
if (savedEnv[key] !== undefined) {
process.env[key] = savedEnv[key];
process.env[key] = savedEnv[key]
} else {
delete process.env[key];
delete process.env[key]
}
}
});
})
test('returns "firstParty" by default', () => {
expect(getAPIProvider()).toBe("firstParty");
});
expect(getAPIProvider({})).toBe('firstParty')
})
test('returns "gemini" when modelType is gemini', () => {
mockedModelType = "gemini";
expect(getAPIProvider()).toBe("gemini");
});
expect(getAPIProvider({ modelType: 'gemini' })).toBe('gemini')
})
test("modelType takes precedence over environment variables", () => {
mockedModelType = "gemini";
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
expect(getAPIProvider()).toBe("gemini");
});
test('modelType takes precedence over environment variables', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
expect(getAPIProvider({ modelType: 'gemini' })).toBe('gemini')
})
test('returns "gemini" when CLAUDE_CODE_USE_GEMINI is set', () => {
process.env.CLAUDE_CODE_USE_GEMINI = "1";
expect(getAPIProvider()).toBe("gemini");
});
process.env.CLAUDE_CODE_USE_GEMINI = '1'
expect(getAPIProvider({})).toBe('gemini')
})
test('returns "bedrock" when CLAUDE_CODE_USE_BEDROCK is set', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
expect(getAPIProvider()).toBe("bedrock");
});
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
expect(getAPIProvider({})).toBe('bedrock')
})
test('returns "vertex" when CLAUDE_CODE_USE_VERTEX is set', () => {
process.env.CLAUDE_CODE_USE_VERTEX = "1";
expect(getAPIProvider()).toBe("vertex");
});
process.env.CLAUDE_CODE_USE_VERTEX = '1'
expect(getAPIProvider({})).toBe('vertex')
})
test('returns "foundry" when CLAUDE_CODE_USE_FOUNDRY is set', () => {
process.env.CLAUDE_CODE_USE_FOUNDRY = "1";
expect(getAPIProvider()).toBe("foundry");
});
process.env.CLAUDE_CODE_USE_FOUNDRY = '1'
expect(getAPIProvider({})).toBe('foundry')
})
test("bedrock takes precedence over gemini", () => {
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
process.env.CLAUDE_CODE_USE_GEMINI = "1";
expect(getAPIProvider()).toBe("bedrock");
});
test('bedrock takes precedence over gemini', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
process.env.CLAUDE_CODE_USE_GEMINI = '1'
expect(getAPIProvider({})).toBe('bedrock')
})
test("bedrock takes precedence over vertex", () => {
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
process.env.CLAUDE_CODE_USE_VERTEX = "1";
expect(getAPIProvider()).toBe("bedrock");
});
test('bedrock takes precedence over vertex', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
process.env.CLAUDE_CODE_USE_VERTEX = '1'
expect(getAPIProvider({})).toBe('bedrock')
})
test("bedrock wins when all three env vars are set", () => {
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
process.env.CLAUDE_CODE_USE_VERTEX = "1";
process.env.CLAUDE_CODE_USE_FOUNDRY = "1";
expect(getAPIProvider()).toBe("bedrock");
});
test('bedrock wins when all three env vars are set', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
process.env.CLAUDE_CODE_USE_VERTEX = '1'
process.env.CLAUDE_CODE_USE_FOUNDRY = '1'
expect(getAPIProvider({})).toBe('bedrock')
})
test('"true" is truthy', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = "true";
expect(getAPIProvider()).toBe("bedrock");
});
process.env.CLAUDE_CODE_USE_BEDROCK = 'true'
expect(getAPIProvider({})).toBe('bedrock')
})
test('"0" is not truthy', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = "0";
expect(getAPIProvider()).toBe("firstParty");
});
process.env.CLAUDE_CODE_USE_BEDROCK = '0'
expect(getAPIProvider({})).toBe('firstParty')
})
test('empty string is not truthy', () => {
process.env.CLAUDE_CODE_USE_BEDROCK = "";
expect(getAPIProvider()).toBe("firstParty");
});
});
process.env.CLAUDE_CODE_USE_BEDROCK = ''
expect(getAPIProvider({})).toBe('firstParty')
})
})
describe("isFirstPartyAnthropicBaseUrl", () => {
const originalBaseUrl = process.env.ANTHROPIC_BASE_URL;
const originalUserType = process.env.USER_TYPE;
describe('isFirstPartyAnthropicBaseUrl', () => {
const originalBaseUrl = process.env.ANTHROPIC_BASE_URL
const originalUserType = process.env.USER_TYPE
afterEach(() => {
if (originalBaseUrl !== undefined) {
process.env.ANTHROPIC_BASE_URL = originalBaseUrl;
process.env.ANTHROPIC_BASE_URL = originalBaseUrl
} else {
delete process.env.ANTHROPIC_BASE_URL;
delete process.env.ANTHROPIC_BASE_URL
}
if (originalUserType !== undefined) {
process.env.USER_TYPE = originalUserType;
process.env.USER_TYPE = originalUserType
} else {
delete process.env.USER_TYPE;
delete process.env.USER_TYPE
}
});
})
test("returns true when ANTHROPIC_BASE_URL is not set", () => {
delete process.env.ANTHROPIC_BASE_URL;
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
});
test('returns true when ANTHROPIC_BASE_URL is not set', () => {
delete process.env.ANTHROPIC_BASE_URL
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
})
test("returns true for api.anthropic.com", () => {
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com";
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
});
test('returns true for api.anthropic.com', () => {
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com'
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
})
test("returns false for custom URL", () => {
process.env.ANTHROPIC_BASE_URL = "https://my-proxy.com";
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
});
test('returns false for custom URL', () => {
process.env.ANTHROPIC_BASE_URL = 'https://my-proxy.com'
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
})
test("returns false for invalid URL", () => {
process.env.ANTHROPIC_BASE_URL = "not-a-url";
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
});
test('returns false for invalid URL', () => {
process.env.ANTHROPIC_BASE_URL = 'not-a-url'
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
})
test("returns true for staging URL when USER_TYPE is ant", () => {
process.env.ANTHROPIC_BASE_URL = "https://api-staging.anthropic.com";
process.env.USER_TYPE = "ant";
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
});
test('returns true for staging URL when USER_TYPE is ant', () => {
process.env.ANTHROPIC_BASE_URL = 'https://api-staging.anthropic.com'
process.env.USER_TYPE = 'ant'
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
})
test("returns true for URL with path", () => {
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com/v1";
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
});
test('returns true for URL with path', () => {
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com/v1'
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
})
test("returns true for trailing slash", () => {
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com/";
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
});
test('returns true for trailing slash', () => {
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com/'
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
})
test("returns false for subdomain attack", () => {
process.env.ANTHROPIC_BASE_URL = "https://evil-api.anthropic.com";
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
});
});
test('returns false for subdomain attack', () => {
process.env.ANTHROPIC_BASE_URL = 'https://evil-api.anthropic.com'
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
})
})

View File

@@ -1,5 +1,6 @@
import type { AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS } from '../../services/analytics/index.js'
import { getInitialSettings } from '../settings/settings.js'
import type { SettingsJson } from '../settings/types.js'
import { isEnvTruthy } from '../envUtils.js'
export type APIProvider =
@@ -11,8 +12,10 @@ export type APIProvider =
| 'gemini'
| 'grok'
export function getAPIProvider(): APIProvider {
const modelType = getInitialSettings().modelType
export function getAPIProvider(
settings: Pick<SettingsJson, 'modelType'> = getInitialSettings(),
): APIProvider {
const modelType = settings.modelType
if (modelType === 'openai') return 'openai'
if (modelType === 'gemini') return 'gemini'
if (modelType === 'grok') return 'grok'

View File

@@ -0,0 +1,375 @@
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
import type { QueuedCommand } from '../../../types/textInputTypes'
import {
resetStateForTests,
setCwdState,
setOriginalCwd,
setProjectRoot,
} from '../../../bootstrap/state'
import {
createAutonomyQueuedPrompt,
getAutonomyRunById,
listAutonomyRuns,
markAutonomyRunRunning,
} from '../../autonomyRuns'
import { resetAutonomyAuthorityForTests } from '../../autonomyAuthority'
import { createScheduledTaskQueuedCommand } from '../../../hooks/useScheduledTasks'
import {
cleanupTempDir,
createTempDir,
} from '../../../../tests/mocks/file-system'
let runAgentBlocker: Promise<void> | null = null
let releaseRunAgentBlocker: (() => void) | null = null
let runAgentStartCount = 0
let originalNodeEnv: string | undefined
let originalAnthropicApiKey: string | undefined
const commandQueue: QueuedCommand[] = []
function enqueue(command: QueuedCommand): void {
commandQueue.push({ ...command, priority: command.priority ?? 'next' })
}
function enqueuePendingNotification(command: QueuedCommand): void {
commandQueue.push({ ...command, priority: command.priority ?? 'later' })
}
function getCommandQueue(): QueuedCommand[] {
return [...commandQueue]
}
function hasCommandsInQueue(): boolean {
return commandQueue.length > 0
}
function resetCommandQueue(): void {
commandQueue.length = 0
}
function createMessageQueueManagerMock() {
return {
enqueue,
enqueuePendingNotification,
getCommandQueue,
hasCommandsInQueue,
resetCommandQueue,
}
}
function holdRunAgent(): void {
runAgentBlocker = new Promise(resolve => {
releaseRunAgentBlocker = resolve
})
}
function releaseRunAgent(): void {
releaseRunAgentBlocker?.()
runAgentBlocker = null
releaseRunAgentBlocker = null
}
mock.module('bun:bundle', () => ({
feature: (name: string) => name === 'KAIROS',
}))
mock.module(
'@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
() => ({
runAgent: async function* () {
runAgentStartCount += 1
if (runAgentBlocker) {
await runAgentBlocker
}
yield {
type: 'assistant',
uuid: 'assistant-1',
timestamp: new Date().toISOString(),
message: {
id: 'msg_1',
type: 'message',
role: 'assistant',
model: 'test-model',
content: [{ type: 'text', text: 'forked command done' }],
stop_reason: 'end_turn',
stop_sequence: null,
usage: {
input_tokens: 0,
output_tokens: 0,
},
},
}
},
}),
)
mock.module('@claude-code-best/builtin-tools/tools/AgentTool/UI.js', () => ({
AgentPromptDisplay: () => null,
AgentResponseDisplay: () => null,
extractLastToolInfo: () => null,
renderGroupedAgentToolUse: () => null,
renderToolResultMessage: () => null,
renderToolUseErrorMessage: () => null,
renderToolUseMessage: () => null,
renderToolUseProgressMessage: () => null,
renderToolUseRejectedMessage: () => null,
renderToolUseTag: () => null,
userFacingName: () => 'Agent',
userFacingNameBackgroundColor: () => 'gray',
}))
mock.module('../../messageQueueManager', createMessageQueueManagerMock)
mock.module('../../messageQueueManager.js', createMessageQueueManagerMock)
const { processSlashCommand } = await import('../processSlashCommand')
let tempDir = ''
function createScheduledTaskQueuedCommandForTest(task: {
id: string
prompt: string
}) {
return createScheduledTaskQueuedCommand(task, {
rootDir: tempDir,
currentDir: tempDir,
})
}
async function waitForRunStatus(
runId: string,
status: 'queued' | 'running' | 'completed' | 'failed' | 'cancelled',
): Promise<void> {
for (let i = 0; i < 200; i++) {
const run = await getAutonomyRunById(runId, tempDir)
if (run?.status === status) {
return
}
await new Promise(resolve => setTimeout(resolve, 10))
}
const run = await getAutonomyRunById(runId, tempDir)
throw new Error(`Expected ${runId} to be ${status}, got ${run?.status}`)
}
async function waitForRunAgentStarts(expected: number): Promise<void> {
for (let i = 0; i < 200; i++) {
if (runAgentStartCount >= expected) {
return
}
await new Promise(resolve => setTimeout(resolve, 10))
}
throw new Error(
`Expected runAgent to start ${expected} time(s), got ${runAgentStartCount}`,
)
}
async function waitForCommandQueueLength(expected: number): Promise<void> {
for (let i = 0; i < 200; i++) {
if (getCommandQueue().length === expected) {
return
}
await new Promise(resolve => setTimeout(resolve, 10))
}
throw new Error(
`Expected command queue length ${expected}, got ${getCommandQueue().length}`,
)
}
beforeEach(async () => {
tempDir = await createTempDir('process-slash-command-')
originalNodeEnv = process.env.NODE_ENV
originalAnthropicApiKey = process.env.ANTHROPIC_API_KEY
process.env.NODE_ENV = 'test'
process.env.ANTHROPIC_API_KEY = 'test-key'
runAgentBlocker = null
releaseRunAgentBlocker = null
runAgentStartCount = 0
resetStateForTests()
resetAutonomyAuthorityForTests()
resetCommandQueue()
setOriginalCwd(tempDir)
setProjectRoot(tempDir)
setCwdState(tempDir)
})
afterEach(async () => {
releaseRunAgent()
if (originalNodeEnv === undefined) {
delete process.env.NODE_ENV
} else {
process.env.NODE_ENV = originalNodeEnv
}
if (originalAnthropicApiKey === undefined) {
delete process.env.ANTHROPIC_API_KEY
} else {
process.env.ANTHROPIC_API_KEY = originalAnthropicApiKey
}
resetStateForTests()
resetAutonomyAuthorityForTests()
resetCommandQueue()
if (tempDir) {
await cleanupTempDir(tempDir)
}
mock.restore()
})
describe('processSlashCommand', () => {
const forkedCommand = {
type: 'prompt',
name: 'forked',
description: 'test forked command',
progressMessage: 'forking',
contentLength: 0,
source: 'builtin',
context: 'fork',
getPromptForCommand: async () => [
{ type: 'text', text: 'review from fork' },
],
} as const
function createContext() {
return {
getAppState: () => ({
kairosEnabled: true,
mcp: { clients: [] },
toolPermissionContext: {
mode: 'default',
alwaysAllowRules: {},
},
}),
options: {
commands: [forkedCommand],
allowBackgroundForkedSlashCommands: true,
tools: [],
refreshTools: () => [],
agentDefinitions: {
activeAgents: [{ agentType: 'general-purpose' }],
},
},
setResponseLength: mock((_updater: (length: number) => number) => {}),
} as any
}
test('defers autonomy completion until a KAIROS background forked command completes', async () => {
const queued = await createAutonomyQueuedPrompt({
basePrompt: '/forked review',
trigger: 'scheduled-task',
rootDir: tempDir,
currentDir: tempDir,
sourceId: 'cron-1',
})
expect(queued).not.toBeNull()
const runId = queued!.autonomy!.runId
await markAutonomyRunRunning(runId, tempDir, 100)
const result = await processSlashCommand(
'/forked review',
[],
[],
[],
createContext(),
mock(() => {}),
undefined,
false,
async () => ({ behavior: 'allow', updatedInput: {} }) as any,
queued!.autonomy,
)
expect(result).toMatchObject({
messages: [],
shouldQuery: false,
deferAutonomyCompletion: true,
})
await waitForRunStatus(runId, 'completed')
await waitForCommandQueueLength(1)
expect(getCommandQueue()).toEqual([
expect.objectContaining({
mode: 'prompt',
isMeta: true,
skipSlashCommands: true,
value: expect.stringContaining(
'<scheduled-task-result command="/forked">',
),
}),
])
})
test('keeps repeated /loop scheduled fires bounded while a background fork is running', async () => {
const task = {
id: 'cron-loop',
prompt: '/forked review',
}
const first = await createScheduledTaskQueuedCommandForTest(task)
expect(first?.autonomy?.runId).toBeDefined()
const runId = first!.autonomy!.runId
await markAutonomyRunRunning(runId, tempDir, 100)
holdRunAgent()
const result = await processSlashCommand(
'/forked review',
[],
[],
[],
createContext(),
mock(() => {}),
undefined,
false,
async () => ({ behavior: 'allow', updatedInput: {} }) as any,
first!.autonomy,
)
expect(result.deferAutonomyCompletion).toBe(true)
await waitForRunAgentStarts(1)
const repeatedFires = await Promise.all(
Array.from({ length: 200 }, () =>
createScheduledTaskQueuedCommandForTest(task),
),
)
expect(repeatedFires.every(command => command === null)).toBe(true)
expect(
(await listAutonomyRuns(tempDir)).filter(
run => run.sourceId === 'cron-loop',
),
).toHaveLength(1)
expect(getCommandQueue()).toHaveLength(0)
releaseRunAgent()
await waitForRunStatus(runId, 'completed')
await waitForCommandQueueLength(1)
expect(getCommandQueue()).toHaveLength(1)
const next = await createScheduledTaskQueuedCommandForTest(task)
expect(next?.autonomy?.runId).toBeDefined()
expect(
(await listAutonomyRuns(tempDir)).filter(
run => run.sourceId === 'cron-loop',
),
).toHaveLength(2)
})
test('rejects the background fork test override outside test runtime', async () => {
process.env.NODE_ENV = 'production'
const result = await processSlashCommand(
'/forked review',
[],
[],
[],
createContext(),
mock(() => {}),
undefined,
false,
async () => ({ behavior: 'allow', updatedInput: {} }) as any,
)
expect(result.shouldQuery).toBe(false)
expect(
result.messages.some(message =>
JSON.stringify(message).includes(
'allowBackgroundForkedSlashCommands is test-only',
),
),
).toBe(true)
expect(runAgentStartCount).toBe(0)
})
})

File diff suppressed because it is too large Load Diff

View File

@@ -28,6 +28,7 @@ import type {
import type { PermissionMode } from '../../types/permissions.js'
import {
isValidImagePaste,
type QueuedCommand,
type PromptInputMode,
} from '../../types/textInputTypes.js'
import {
@@ -80,6 +81,9 @@ export type ProcessUserInputBaseResult = {
// Used by /discover to chain into the selected feature's command
nextInput?: string
submitNextInput?: boolean
// When true, the command started detached work that will finalize its
// autonomy run after the background work completes.
deferAutonomyCompletion?: boolean
}
export async function processUserInput({
@@ -100,6 +104,7 @@ export async function processUserInput({
bridgeOrigin,
isMeta,
skipAttachments,
autonomy,
}: {
input: string | Array<ContentBlockParam>
/**
@@ -137,6 +142,7 @@ export async function processUserInput({
*/
isMeta?: boolean
skipAttachments?: boolean
autonomy?: QueuedCommand['autonomy']
}): Promise<ProcessUserInputBaseResult> {
const inputString = typeof input === 'string' ? input : null
// Immediately show the user input prompt while we are still processing the input.
@@ -168,6 +174,7 @@ export async function processUserInput({
isMeta,
skipAttachments,
preExpansionInput,
autonomy,
)
queryCheckpoint('query_process_user_input_base_end')
@@ -296,6 +303,7 @@ async function processUserInputBase(
isMeta?: boolean,
skipAttachments?: boolean,
preExpansionInput?: string,
autonomy?: QueuedCommand['autonomy'],
): Promise<ProcessUserInputBaseResult> {
let inputString: string | null = null
let precedingInputBlocks: ContentBlockParam[] = []
@@ -491,6 +499,7 @@ async function processUserInputBase(
uuid,
isAlreadyProcessing,
canUseTool,
autonomy,
)
return addImageMetadataMessage(slashResult, imageMetadataTexts)
}
@@ -549,6 +558,7 @@ async function processUserInputBase(
uuid,
isAlreadyProcessing,
canUseTool,
autonomy,
)
return addImageMetadataMessage(slashResult, imageMetadataTexts)
}

View File

@@ -424,8 +424,7 @@ function createInProcessCanUseTool(
feedback: parsed.error,
})
}
cleanup()
return
return // Callback already resolves the promise
}
}
}
@@ -675,6 +674,7 @@ type WaitResult =
type: 'new_message'
message: string
autonomyRunId?: string
autonomyRootDir?: string
from: string
color?: string
summary?: string
@@ -739,12 +739,16 @@ async function waitForNextPromptOrShutdown(
`[inProcessRunner] ${identity.agentName} found pending user message (poll #${pollCount})`,
)
if (pending.autonomyRunId) {
await markAutonomyRunRunning(pending.autonomyRunId)
await markAutonomyRunRunning(
pending.autonomyRunId,
pending.autonomyRootDir,
)
}
return {
type: 'new_message',
message: pending.message,
autonomyRunId: pending.autonomyRunId,
autonomyRootDir: pending.autonomyRootDir,
from: 'user',
}
}
@@ -1022,6 +1026,7 @@ export async function runInProcessTeammate(
)
let currentPrompt = wrappedInitialPrompt
let currentAutonomyRunId: string | undefined
let currentAutonomyRootDir: string | undefined
let shouldExit = false
// Try to claim an available task immediately so the UI can show activity
@@ -1319,12 +1324,21 @@ export async function runInProcessTeammate(
setAppState,
)
if (currentAutonomyRunId) {
await markAutonomyRunFailed(currentAutonomyRunId, ERROR_MESSAGE_USER_ABORT)
await markAutonomyRunFailed(
currentAutonomyRunId,
ERROR_MESSAGE_USER_ABORT,
currentAutonomyRootDir,
)
currentAutonomyRunId = undefined
currentAutonomyRootDir = undefined
}
} else if (currentAutonomyRunId) {
await markAutonomyRunCompleted(currentAutonomyRunId)
await markAutonomyRunCompleted(
currentAutonomyRunId,
currentAutonomyRootDir,
)
currentAutonomyRunId = undefined
currentAutonomyRootDir = undefined
}
// Check if already idle before updating (to skip duplicate notification)
@@ -1398,6 +1412,7 @@ export async function runInProcessTeammate(
setAppState,
)
currentAutonomyRunId = undefined
currentAutonomyRootDir = undefined
break
case 'new_message':
@@ -1410,6 +1425,7 @@ export async function runInProcessTeammate(
if (waitResult.from === 'user') {
currentPrompt = waitResult.message
currentAutonomyRunId = waitResult.autonomyRunId
currentAutonomyRootDir = waitResult.autonomyRootDir
} else {
currentPrompt = formatAsTeammateMessage(
waitResult.from,
@@ -1426,6 +1442,7 @@ export async function runInProcessTeammate(
setAppState,
)
currentAutonomyRunId = undefined
currentAutonomyRootDir = undefined
}
break
@@ -1533,7 +1550,11 @@ export async function runInProcessTeammate(
})
}
if (currentAutonomyRunId) {
await markAutonomyRunFailed(currentAutonomyRunId, errorMessage)
await markAutonomyRunFailed(
currentAutonomyRunId,
errorMessage,
currentAutonomyRootDir,
)
}
// Send idle notification with failure via file-based mailbox

View File

@@ -234,7 +234,7 @@ export function killInProcessTeammate(
let agentId: string | null = null
let toolUseId: string | undefined
let description: string | undefined
let pendingAutonomyRunIds: string[] = []
let pendingAutonomyRuns: Array<{ runId: string; rootDir?: string }> = []
setAppState((prev: AppState) => {
const task = prev.tasks[taskId]
@@ -255,9 +255,18 @@ export function killInProcessTeammate(
description = teammateTask.description
// Capture pending autonomy run IDs before clearing them
pendingAutonomyRunIds = teammateTask.pendingUserMessages
.map(message => message.autonomyRunId)
.filter((runId): runId is string => runId !== undefined)
pendingAutonomyRuns = teammateTask.pendingUserMessages.flatMap(message =>
message.autonomyRunId
? [
{
runId: message.autonomyRunId,
...(message.autonomyRootDir
? { rootDir: message.autonomyRootDir }
: {}),
},
]
: [],
)
// Abort the controller to stop execution
teammateTask.abortController?.abort()
@@ -311,10 +320,11 @@ export function killInProcessTeammate(
}
if (killed) {
for (const runId of pendingAutonomyRunIds) {
for (const run of pendingAutonomyRuns) {
void markAutonomyRunFailed(
runId,
run.runId,
`Teammate ${agentId ?? taskId} was stopped before it could consume the queued autonomy prompt.`,
run.rootDir,
)
}
void evictTaskOutput(taskId)