mirror of
https://github.com/claude-code-best/claude-code.git
synced 2026-06-17 13:55:50 +00:00
feat: harden autonomy lifecycle, OOM bounds, and provider-boundary finalization
This PR consolidates a coordinated batch of fixes around autonomy run/flow lifecycle, scheduled task deduplication, provider-boundary state finalization, and matching memory-bound treatments for adjacent long-running subsystems (REPL fullscreen scrollback, skill-search/skill-learning runtime activation). All changes were developed and reviewed together because they touched the same lifecycle invariants and were uncovered by the same long-running session reproductions.
## Lifecycle correctness
- Queued autonomy prompts are not injected unless the persisted run was successfully claimed; queued run claiming is now terminal-safe so a once-consumed/cancelled/failed run can not slip back into `queued`.
- Autonomy run/flow finalization happens on completion, provider error, generator close, and cancellation — not just the happy path. New `src/__tests__/queryAutonomyProviderBoundary.test.ts` covers these provider-boundary transitions.
- `requestManagedAutonomyFlowCancel` and `resumeManagedAutonomyFlowPrompt` carry `rootDir` and `currentDir` explicitly across detached async boundaries (proactive-tick, cron, daemon restart) instead of inferring from process state.
- Active runs/flows are protected from janitor pruning so a running step can not be garbage-collected mid-flight (`src/utils/autonomyAuthority.ts`).
- Heartbeat parser now ignores fenced code blocks; the two-phase commit window for autonomy state transitions is documented in `docs/internals/autonomy-jira.md`.
## Ownership and dedup
- `src/utils/autonomyRuns.ts`: ownership stamping (run id + rootDir carried end-to-end), source-based dedup against active runs.
- `src/hooks/useScheduledTasks.ts`: scheduled ticks deduplicate against runs already active on the same source label.
- `src/utils/processUserInput/processSlashCommand.tsx`: forked slash commands now thread the autonomy `runId` so completion finalizers can find the originating run for deferred completion.
- New `src/utils/autonomyQueueLifecycle.ts` and tests collect the queue-side lifecycle invariants in one place.
## Memory bounds (related, same review pass)
- `src/screens/REPL.tsx`: caps fullscreen scrollback after the compact boundary and updates trailing progress rows in place. Long-running fullscreen sessions could otherwise retain thousands of post-compaction messages and duplicate progress rows, keeping Ink trees alive long after their useful context had moved on.
- `src/services/skillSearch/*` and `src/services/skillLearning/*`: runtime activation is strictly opt-in via existing env toggles; session caches are capped so long-running processes can not grow them forever. Build presence is preserved so operators can still discover and opt into the slash commands.
## CI / test contract
- `tests/integration/dependency-overrides.test.ts`: smoke test no longer drives Mermaid's browser renderer; it validates the package-resolution contract directly so CI does not regress on unrelated browser timing.
- New `tests/integration/autonomy-lifecycle-user-flow.test.ts`: end-to-end CLI subprocess flow exercising `status --deep`, `flows`, `flow <id>`, `flow resume`, `flow cancel` against persisted state.
- `src/entrypoints/cli.tsx`: `claude autonomy …` routes through an entrypoint fast path that reuses the slash-command formatter without booting the full interactive CLI. Stdout is flushed before forced exit so coverage subprocesses do not terminate with empty stdout.
- `packages/builtin-tools/src/tools/RemoteTriggerTool/__tests__/RemoteTriggerTool.test.ts`: stabilized to prevent audit flake under coverage.
## Tests added
- `src/__tests__/queryAutonomyProviderBoundary.test.ts`
- `src/hooks/__tests__/useScheduledTasks.test.ts`
- `src/utils/__tests__/autonomyAuthority.test.ts`
- `src/utils/__tests__/autonomyFlows.test.ts` (extended)
- `src/utils/__tests__/autonomyPersistence.test.ts` (extended)
- `src/utils/__tests__/autonomyQueueLifecycle.test.ts`
- `src/utils/__tests__/autonomyRuns.test.ts` (extended)
- `src/utils/processUserInput/__tests__/processSlashCommand.test.ts`
- `tests/integration/autonomy-lifecycle-user-flow.test.ts`
## Docs
- `docs/agent/sur-loop-scheduled-oom.md`: System Understanding Report covering the scheduled/loop OOM problem, the call graphs investigated, and the lifecycle invariants this PR establishes.
- `docs/agent/sur-skill-overflow-bugs.md`: SUR for the related skill-overflow context.
- `docs/internals/autonomy-jira.md`: documents the two-phase commit window and ownership stamping invariants.
- `docs/memory-leak-audit.md`: audit notes covering the REPL/scrollback and skill-search bounds.
## Invariants this PR establishes
1. Queued autonomy prompts are not injected unless the persisted run was successfully claimed.
2. Terminal run/flow states are terminal — completion, failure, and cancellation all finalize state regardless of which provider/error path triggered them.
3. Autonomy run/flow `rootDir` is carried explicitly across detached async boundaries instead of inferred from a shared singleton.
4. State-only CLI subcommands (`autonomy status|runs|flows|flow …`) bypass full interactive bootstrap so they do not hold unrelated handles open.
5. REPL fullscreen scrollback and skill-search/skill-learning session caches are explicitly bounded.
## Validation
```bash
bun run typecheck
CI=true GITHUB_ACTIONS=true bun test # 3996 pass / 0 fail across 305 files
bun test src/__tests__/queryAutonomyProviderBoundary.test.ts \
src/hooks/__tests__/useScheduledTasks.test.ts \
src/utils/__tests__/autonomy{Runs,Flows,Authority,QueueLifecycle,Persistence}.test.ts \
src/utils/processUserInput/__tests__/processSlashCommand.test.ts \
tests/integration/autonomy-lifecycle-user-flow.test.ts
```
## Origin
This PR is the consolidated, upstream-targeted version of two fork-side review PRs (fix/loop-scheduled-autonomy-oom and fix/autonomy-lifecycle). The fork-side review history is preserved at https://github.com/amDosion/claude-code-bast/pull/7 . The fork's own internal `chore: keep fork current with upstream` sync commits and the `docs: update contributors` automation are intentionally not included in this PR.
The autonomy CLI handler `rootDir` threading that the fork added (78f64d8a, 98d04ddb) is intentionally omitted here because upstream `a2cfaf91` (fix: 修复 RemoteTriggerTool 和 autonomy 测试的全量运行失败) already performed the equivalent change with an additional `currentDir` option. Keeping the upstream version avoids regressing that improvement.
This commit is contained in:
@@ -5,6 +5,7 @@ import {
|
||||
AUTONOMY_DIR,
|
||||
buildAutonomyTurnPrompt,
|
||||
loadAutonomyAuthority,
|
||||
parseHeartbeatAuthorityTasks,
|
||||
resetAutonomyAuthorityForTests,
|
||||
} from '../autonomyAuthority'
|
||||
import {
|
||||
@@ -238,4 +239,79 @@ describe('autonomyAuthority', () => {
|
||||
expect(prompt).not.toContain('- weekly-report (7d): Ship the weekly report')
|
||||
expect(prompt).not.toContain('- gather (')
|
||||
})
|
||||
|
||||
test('parseHeartbeatAuthorityTasks ignores tasks: literals inside markdown code fences', () => {
|
||||
const content = [
|
||||
'# HEARTBEAT.md',
|
||||
'',
|
||||
'```yaml',
|
||||
'tasks:',
|
||||
' - name: not-a-real-task',
|
||||
' interval: 1m',
|
||||
' prompt: "would-be-shadowed"',
|
||||
'```',
|
||||
'',
|
||||
'tasks:',
|
||||
' - name: real-task',
|
||||
' interval: 30m',
|
||||
' prompt: "Real prompt"',
|
||||
].join('\n')
|
||||
|
||||
const parsed = parseHeartbeatAuthorityTasks(content)
|
||||
|
||||
expect(parsed).toHaveLength(1)
|
||||
expect(parsed[0]).toMatchObject({
|
||||
name: 'real-task',
|
||||
interval: '30m',
|
||||
prompt: 'Real prompt',
|
||||
})
|
||||
})
|
||||
|
||||
test('parseHeartbeatAuthorityTasks ignores tasks: literals inside tilde markdown code fences', () => {
|
||||
const content = [
|
||||
'# HEARTBEAT.md',
|
||||
'',
|
||||
'~~~yaml',
|
||||
'tasks:',
|
||||
' - name: not-a-real-task',
|
||||
' interval: 1m',
|
||||
' prompt: "would-be-shadowed"',
|
||||
'~~~',
|
||||
'',
|
||||
'tasks:',
|
||||
' - name: real-task',
|
||||
' interval: 30m',
|
||||
' prompt: "Real prompt"',
|
||||
].join('\n')
|
||||
|
||||
const parsed = parseHeartbeatAuthorityTasks(content)
|
||||
|
||||
expect(parsed).toHaveLength(1)
|
||||
expect(parsed[0]).toMatchObject({
|
||||
name: 'real-task',
|
||||
interval: '30m',
|
||||
prompt: 'Real prompt',
|
||||
})
|
||||
})
|
||||
|
||||
test('parseHeartbeatAuthorityTasks parses real tasks even when documentation precedes them', () => {
|
||||
const content = [
|
||||
'# Heartbeat docs',
|
||||
'',
|
||||
'See `tasks:` below — the parser keys on the literal at column 0.',
|
||||
'',
|
||||
'tasks:',
|
||||
' - name: weekly',
|
||||
' interval: 7d',
|
||||
' prompt: "Ship report"',
|
||||
].join('\n')
|
||||
|
||||
const parsed = parseHeartbeatAuthorityTasks(content)
|
||||
|
||||
// Inline `tasks:` mention does NOT collide because it's not at column 0
|
||||
// on its own line — the existing line.trim() === 'tasks:' guard handles
|
||||
// that case. This test pins the behaviour.
|
||||
expect(parsed).toHaveLength(1)
|
||||
expect(parsed[0]?.name).toBe('weekly')
|
||||
})
|
||||
})
|
||||
|
||||
@@ -126,6 +126,14 @@ describe('listAutonomyFlows', () => {
|
||||
runCount: 0,
|
||||
ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
|
||||
currentDir: tempDir,
|
||||
boundary: [
|
||||
' src/utils/** ',
|
||||
'/absolute/not-allowed',
|
||||
'src\\windows',
|
||||
'../outside',
|
||||
'src/utils/**',
|
||||
'docs/*.md',
|
||||
],
|
||||
stateJson: {
|
||||
currentStepIndex: 0,
|
||||
steps: [
|
||||
@@ -147,6 +155,7 @@ describe('listAutonomyFlows', () => {
|
||||
expect(flows).toHaveLength(1)
|
||||
expect(flows[0]?.flowId).toBe('flow-1')
|
||||
expect(flows[0]?.syncMode).toBe('managed')
|
||||
expect(flows[0]?.boundary).toEqual(['src/utils/**', 'docs/*.md'])
|
||||
expect(flows[0]?.stateJson?.steps).toHaveLength(1)
|
||||
})
|
||||
|
||||
@@ -191,6 +200,64 @@ describe('listAutonomyFlows', () => {
|
||||
const flows = await listAutonomyFlows(tempDir)
|
||||
expect(flows).toEqual([])
|
||||
})
|
||||
|
||||
test('persistence pruning keeps active flows ahead of recent terminal history', async () => {
|
||||
const flows: AutonomyFlowRecord[] = [
|
||||
{
|
||||
flowId: 'old-active',
|
||||
flowKey: 'managed:scheduled-task:old-active',
|
||||
syncMode: 'managed',
|
||||
ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
|
||||
revision: 1,
|
||||
trigger: 'scheduled-task',
|
||||
status: 'queued',
|
||||
goal: 'old active',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
runCount: 0,
|
||||
createdAt: 1,
|
||||
updatedAt: 1,
|
||||
},
|
||||
...Array.from({ length: 100 }, (_, index) => ({
|
||||
flowId: `history-${index}`,
|
||||
flowKey: `managed:scheduled-task:history-${index}`,
|
||||
syncMode: 'managed' as const,
|
||||
ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
|
||||
revision: 1,
|
||||
trigger: 'scheduled-task' as const,
|
||||
status: 'succeeded' as const,
|
||||
goal: `history ${index}`,
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
runCount: 1,
|
||||
createdAt: 1_000 + index,
|
||||
updatedAt: 1_000 + index,
|
||||
endedAt: 2_000 + index,
|
||||
})),
|
||||
]
|
||||
const flowsPath = resolveAutonomyFlowsPath(tempDir)
|
||||
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
|
||||
await writeFile(
|
||||
flowsPath,
|
||||
`${JSON.stringify({ flows }, null, 2)}\n`,
|
||||
'utf-8',
|
||||
)
|
||||
|
||||
await startManagedAutonomyFlow({
|
||||
trigger: 'scheduled-task',
|
||||
goal: 'fresh active',
|
||||
steps: TWO_STEPS,
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'fresh-active',
|
||||
nowMs: 9_999,
|
||||
})
|
||||
|
||||
const persisted = await listAutonomyFlows(tempDir)
|
||||
expect(persisted).toHaveLength(100)
|
||||
expect(persisted.some(flow => flow.flowId === 'old-active')).toBe(true)
|
||||
expect(persisted.some(flow => flow.flowId === 'history-0')).toBe(false)
|
||||
})
|
||||
})
|
||||
|
||||
describe('startManagedAutonomyFlow', () => {
|
||||
@@ -225,6 +292,49 @@ describe('startManagedAutonomyFlow', () => {
|
||||
expect(result!.nextStep!.step.name).toBe('gather')
|
||||
})
|
||||
|
||||
test('normalizes and preserves boundary across completed flow restarts', async () => {
|
||||
const first = await startManagedAutonomyFlow({
|
||||
trigger: 'scheduled-task',
|
||||
goal: 'Scoped flow',
|
||||
steps: [{ name: 'only', prompt: 'Do it' }],
|
||||
rootDir: tempDir,
|
||||
sourceId: 'scoped-src',
|
||||
boundary: [' src/utils/** ', 'src\\bad', '/absolute', 'docs/*.md'],
|
||||
nowMs: 1000,
|
||||
})
|
||||
const flowId = first!.flow.flowId
|
||||
|
||||
expect(first!.flow.boundary).toEqual(['src/utils/**', 'docs/*.md'])
|
||||
|
||||
await queueManagedAutonomyFlowStepRun({
|
||||
flowId,
|
||||
stepId: first!.nextStep!.step.stepId,
|
||||
stepIndex: 0,
|
||||
runId: 'run-1',
|
||||
rootDir: tempDir,
|
||||
nowMs: 2000,
|
||||
})
|
||||
await markManagedAutonomyFlowStepCompleted({
|
||||
flowId,
|
||||
runId: 'run-1',
|
||||
rootDir: tempDir,
|
||||
nowMs: 3000,
|
||||
})
|
||||
|
||||
const restarted = await startManagedAutonomyFlow({
|
||||
trigger: 'scheduled-task',
|
||||
goal: 'Scoped flow',
|
||||
steps: [{ name: 'only', prompt: 'Do it again' }],
|
||||
rootDir: tempDir,
|
||||
sourceId: 'scoped-src',
|
||||
nowMs: 4000,
|
||||
})
|
||||
|
||||
expect(restarted!.started).toBe(true)
|
||||
expect(restarted!.flow.flowId).toBe(flowId)
|
||||
expect(restarted!.flow.boundary).toEqual(['src/utils/**', 'docs/*.md'])
|
||||
})
|
||||
|
||||
test('sets status=waiting when first step has waitFor', async () => {
|
||||
const result = await startManagedAutonomyFlow({
|
||||
trigger: 'scheduled-task',
|
||||
|
||||
@@ -54,6 +54,25 @@ describe('withAutonomyPersistenceLock', () => {
|
||||
).rejects.toThrow('inner failure')
|
||||
})
|
||||
|
||||
test('releases same-root lock bookkeeping after success and failure', async () => {
|
||||
const {
|
||||
getAutonomyPersistenceLockCountForTests,
|
||||
withAutonomyPersistenceLock,
|
||||
} = await import('../autonomyPersistence')
|
||||
|
||||
expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
|
||||
|
||||
await withAutonomyPersistenceLock(tempDir, async () => 'ok')
|
||||
expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
|
||||
|
||||
await expect(
|
||||
withAutonomyPersistenceLock(tempDir, async () => {
|
||||
throw new Error('inner failure')
|
||||
}),
|
||||
).rejects.toThrow('inner failure')
|
||||
expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
|
||||
})
|
||||
|
||||
test('serializes concurrent calls on the same rootDir', async () => {
|
||||
const { withAutonomyPersistenceLock } = await import(
|
||||
'../autonomyPersistence'
|
||||
|
||||
279
src/utils/__tests__/autonomyQueueLifecycle.test.ts
Normal file
279
src/utils/__tests__/autonomyQueueLifecycle.test.ts
Normal file
@@ -0,0 +1,279 @@
|
||||
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
|
||||
import { createTempDir, cleanupTempDir } from '../../../tests/mocks/file-system'
|
||||
import { getAttachmentMessages } from '../attachments'
|
||||
import {
|
||||
createAutonomyQueuedPrompt,
|
||||
createProactiveAutonomyCommands,
|
||||
getAutonomyRunById,
|
||||
markAutonomyRunCancelled,
|
||||
startManagedAutonomyFlowFromHeartbeatTask,
|
||||
} from '../autonomyRuns'
|
||||
import { getAutonomyFlowById, listAutonomyFlows } from '../autonomyFlows'
|
||||
import {
|
||||
cancelQueuedAutonomyCommands,
|
||||
claimConsumableQueuedAutonomyCommands,
|
||||
finalizeAutonomyCommandsForTurn,
|
||||
partitionConsumableQueuedAutonomyCommands,
|
||||
} from '../autonomyQueueLifecycle'
|
||||
import {
|
||||
enqueue,
|
||||
getCommandsByMaxPriority,
|
||||
remove as removeFromQueue,
|
||||
resetCommandQueue,
|
||||
} from '../messageQueueManager'
|
||||
|
||||
let tempDir = ''
|
||||
let extraTempDirs: string[] = []
|
||||
|
||||
beforeEach(async () => {
|
||||
tempDir = await createTempDir('autonomy-queue-lifecycle-')
|
||||
extraTempDirs = []
|
||||
resetCommandQueue()
|
||||
})
|
||||
|
||||
afterEach(async () => {
|
||||
resetCommandQueue()
|
||||
if (tempDir) {
|
||||
await cleanupTempDir(tempDir)
|
||||
}
|
||||
for (const extraTempDir of extraTempDirs) {
|
||||
await cleanupTempDir(extraTempDir)
|
||||
}
|
||||
})
|
||||
|
||||
describe('autonomyQueueLifecycle', () => {
|
||||
async function consumeQueuedAutonomyAttachmentTurn() {
|
||||
const previousDisableAttachments =
|
||||
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
|
||||
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = '1'
|
||||
try {
|
||||
const snapshot = getCommandsByMaxPriority('later')
|
||||
const claim = await claimConsumableQueuedAutonomyCommands(
|
||||
snapshot,
|
||||
tempDir,
|
||||
)
|
||||
removeFromQueue(claim.staleCommands)
|
||||
removeFromQueue(claim.claimedCommands)
|
||||
|
||||
const attachments = []
|
||||
for await (const attachment of getAttachmentMessages(
|
||||
null,
|
||||
{} as never,
|
||||
null,
|
||||
claim.attachmentCommands,
|
||||
[],
|
||||
)) {
|
||||
attachments.push(attachment)
|
||||
}
|
||||
|
||||
const consumedCommands = claim.attachmentCommands.filter(
|
||||
command =>
|
||||
(command.mode === 'prompt' || command.mode === 'task-notification') &&
|
||||
!claim.claimedCommands.includes(command),
|
||||
)
|
||||
removeFromQueue(consumedCommands)
|
||||
const nextCommands = await finalizeAutonomyCommandsForTurn({
|
||||
commands: claim.claimedCommands,
|
||||
outcome: { type: 'completed' },
|
||||
currentDir: tempDir,
|
||||
priority: 'later',
|
||||
})
|
||||
for (const command of nextCommands) {
|
||||
enqueue(command)
|
||||
}
|
||||
|
||||
return { attachments, runningRunIds: claim.claimedRunIds, nextCommands }
|
||||
} finally {
|
||||
if (previousDisableAttachments === undefined) {
|
||||
delete process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
|
||||
} else {
|
||||
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = previousDisableAttachments
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
test('filters stale autonomy commands before mid-turn attachment consumption', async () => {
|
||||
const command = await createAutonomyQueuedPrompt({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(command).not.toBeNull()
|
||||
|
||||
const initial = await partitionConsumableQueuedAutonomyCommands(
|
||||
[command!],
|
||||
tempDir,
|
||||
)
|
||||
expect(initial.attachmentCommands).toHaveLength(1)
|
||||
expect(initial.staleCommands).toHaveLength(0)
|
||||
|
||||
await markAutonomyRunCancelled(command!.autonomy!.runId, tempDir)
|
||||
|
||||
const afterCancel = await partitionConsumableQueuedAutonomyCommands(
|
||||
[command!],
|
||||
tempDir,
|
||||
)
|
||||
expect(afterCancel.attachmentCommands).toHaveLength(0)
|
||||
expect(afterCancel.staleCommands).toHaveLength(1)
|
||||
})
|
||||
|
||||
test('cancels proactive commands that are created but dropped before enqueue', async () => {
|
||||
const commands = await createProactiveAutonomyCommands({
|
||||
basePrompt: '<tick>12:00:00</tick>',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(commands).toHaveLength(1)
|
||||
|
||||
const queuedRun = await getAutonomyRunById(
|
||||
commands[0]!.autonomy!.runId,
|
||||
tempDir,
|
||||
)
|
||||
expect(queuedRun!.status).toBe('queued')
|
||||
|
||||
await cancelQueuedAutonomyCommands({ commands, rootDir: tempDir })
|
||||
|
||||
const cancelledRun = await getAutonomyRunById(
|
||||
commands[0]!.autonomy!.runId,
|
||||
tempDir,
|
||||
)
|
||||
expect(cancelledRun!.status).toBe('cancelled')
|
||||
})
|
||||
|
||||
test('uses command rootDir when claiming after project context changes', async () => {
|
||||
const otherProjectDir = await createTempDir('autonomy-other-project-')
|
||||
extraTempDirs.push(otherProjectDir)
|
||||
const command = await createAutonomyQueuedPrompt({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(command).not.toBeNull()
|
||||
expect(command!.autonomy?.rootDir).toBe(tempDir)
|
||||
|
||||
const claim = await claimConsumableQueuedAutonomyCommands(
|
||||
[command!],
|
||||
otherProjectDir,
|
||||
)
|
||||
|
||||
const originalRun = await getAutonomyRunById(
|
||||
command!.autonomy!.runId,
|
||||
tempDir,
|
||||
)
|
||||
const wrongProjectRun = await getAutonomyRunById(
|
||||
command!.autonomy!.runId,
|
||||
otherProjectDir,
|
||||
)
|
||||
|
||||
expect(claim.claimedRunIds).toEqual([command!.autonomy!.runId])
|
||||
expect(claim.attachmentCommands).toHaveLength(1)
|
||||
expect(originalRun!.status).toBe('running')
|
||||
expect(wrongProjectRun).toBeNull()
|
||||
})
|
||||
|
||||
test('advances a managed flow consumed as a queued attachment', async () => {
|
||||
const command = await startManagedAutonomyFlowFromHeartbeatTask({
|
||||
task: {
|
||||
name: 'weekly-report',
|
||||
interval: '7d',
|
||||
prompt: 'Ship the weekly report',
|
||||
steps: [
|
||||
{ name: 'gather', prompt: 'Gather weekly inputs' },
|
||||
{ name: 'draft', prompt: 'Draft weekly report' },
|
||||
],
|
||||
},
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(command).not.toBeNull()
|
||||
|
||||
const claim = await claimConsumableQueuedAutonomyCommands(
|
||||
[command!],
|
||||
tempDir,
|
||||
)
|
||||
const runningRunIds = claim.claimedRunIds
|
||||
expect(runningRunIds).toEqual([command!.autonomy!.runId])
|
||||
|
||||
const nextCommands = await finalizeAutonomyCommandsForTurn({
|
||||
commands: claim.claimedCommands,
|
||||
outcome: { type: 'completed' },
|
||||
currentDir: tempDir,
|
||||
priority: 'later',
|
||||
})
|
||||
const [flow] = await listAutonomyFlows(tempDir)
|
||||
const detail = await getAutonomyFlowById(flow!.flowId, tempDir)
|
||||
const run = await getAutonomyRunById(command!.autonomy!.runId, tempDir)
|
||||
|
||||
expect(run!.status).toBe('completed')
|
||||
expect(nextCommands).toHaveLength(1)
|
||||
expect(nextCommands[0]!.autonomy?.flowId).toBe(flow!.flowId)
|
||||
expect(detail!.stateJson!.steps.map(step => step.status)).toEqual([
|
||||
'completed',
|
||||
'queued',
|
||||
])
|
||||
})
|
||||
|
||||
test('keeps managed autonomy flow coherent across queued attachment turns', async () => {
|
||||
const firstCommand = await startManagedAutonomyFlowFromHeartbeatTask({
|
||||
task: {
|
||||
name: 'weekly-report',
|
||||
interval: '7d',
|
||||
prompt: 'Ship the weekly report',
|
||||
steps: [
|
||||
{ name: 'gather', prompt: 'Gather weekly inputs' },
|
||||
{ name: 'draft', prompt: 'Draft weekly report' },
|
||||
],
|
||||
},
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(firstCommand).not.toBeNull()
|
||||
enqueue(firstCommand!)
|
||||
|
||||
const firstTurn = await consumeQueuedAutonomyAttachmentTurn()
|
||||
const queuedAfterFirstTurn = getCommandsByMaxPriority('later')
|
||||
const [flowAfterFirstTurn] = await listAutonomyFlows(tempDir)
|
||||
const firstRun = await getAutonomyRunById(
|
||||
firstCommand!.autonomy!.runId,
|
||||
tempDir,
|
||||
)
|
||||
|
||||
expect(firstTurn.attachments).toHaveLength(1)
|
||||
expect(firstTurn.attachments[0]!.attachment?.type).toBe('queued_command')
|
||||
expect(firstTurn.runningRunIds).toEqual([firstCommand!.autonomy!.runId])
|
||||
expect(firstTurn.nextCommands).toHaveLength(1)
|
||||
expect(queuedAfterFirstTurn).toHaveLength(1)
|
||||
expect(queuedAfterFirstTurn[0]!.autonomy?.flowId).toBe(
|
||||
flowAfterFirstTurn!.flowId,
|
||||
)
|
||||
expect(firstRun!.status).toBe('completed')
|
||||
expect(
|
||||
flowAfterFirstTurn!.stateJson!.steps.map(step => step.status),
|
||||
).toEqual(['completed', 'queued'])
|
||||
|
||||
const secondCommand = queuedAfterFirstTurn[0]!
|
||||
const secondTurn = await consumeQueuedAutonomyAttachmentTurn()
|
||||
const queuedAfterSecondTurn = getCommandsByMaxPriority('later')
|
||||
const finalFlow = await getAutonomyFlowById(
|
||||
flowAfterFirstTurn!.flowId,
|
||||
tempDir,
|
||||
)
|
||||
const secondRun = await getAutonomyRunById(
|
||||
secondCommand.autonomy!.runId,
|
||||
tempDir,
|
||||
)
|
||||
|
||||
expect(secondTurn.attachments).toHaveLength(1)
|
||||
expect(secondTurn.runningRunIds).toEqual([secondCommand.autonomy!.runId])
|
||||
expect(secondTurn.nextCommands).toHaveLength(0)
|
||||
expect(queuedAfterSecondTurn).toHaveLength(0)
|
||||
expect(secondRun!.status).toBe('completed')
|
||||
expect(finalFlow!.status).toBe('succeeded')
|
||||
expect(finalFlow!.stateJson!.steps.map(step => step.status)).toEqual([
|
||||
'completed',
|
||||
'completed',
|
||||
])
|
||||
})
|
||||
})
|
||||
@@ -1,6 +1,7 @@
|
||||
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
|
||||
import { existsSync, readFileSync } from 'fs'
|
||||
import { mkdir, writeFile } from 'fs/promises'
|
||||
import { join } from 'path'
|
||||
import { join, resolve as resolvePath } from 'path'
|
||||
import {
|
||||
resetStateForTests,
|
||||
setCwdState,
|
||||
@@ -8,17 +9,23 @@ import {
|
||||
setProjectRoot,
|
||||
} from '../../bootstrap/state'
|
||||
import {
|
||||
createAutonomyRun,
|
||||
formatAutonomyRunsList,
|
||||
formatAutonomyRunsStatus,
|
||||
listAutonomyRuns,
|
||||
createAutonomyQueuedPrompt,
|
||||
createAutonomyQueuedPromptIfNoActiveSource,
|
||||
createProactiveAutonomyCommands,
|
||||
finalizeAutonomyRunCompleted,
|
||||
getAutonomyRunById,
|
||||
hasActiveAutonomyRunForSource,
|
||||
markAutonomyRunCompleted,
|
||||
markAutonomyRunCancelled,
|
||||
markAutonomyRunFailed,
|
||||
markAutonomyRunRunning,
|
||||
recoverManagedAutonomyFlowPrompt,
|
||||
resolveAutonomyRunsPath,
|
||||
STALE_ACTIVE_RUN_ERROR_PREFIX,
|
||||
startManagedAutonomyFlowFromHeartbeatTask,
|
||||
} from '../autonomyRuns'
|
||||
import {
|
||||
@@ -95,7 +102,9 @@ describe('autonomyRuns', () => {
|
||||
ownerKey: 'main-thread',
|
||||
sourceId: 'cron-1',
|
||||
sourceLabel: 'nightly-report',
|
||||
ownerProcessId: process.pid,
|
||||
})
|
||||
expect(runs[0]?.ownerSessionId).toBeString()
|
||||
expect(flows).toHaveLength(0)
|
||||
expect(resolveAutonomyRunsPath(tempDir)).toContain('.claude')
|
||||
})
|
||||
@@ -118,7 +127,7 @@ describe('autonomyRuns', () => {
|
||||
expect(command!.value).toContain('nested authority')
|
||||
})
|
||||
|
||||
test('markAutonomyRunRunning/completed/failed update persisted lifecycle state for plain runs', async () => {
|
||||
test('markAutonomyRunRunning/completed update persisted lifecycle state for plain runs', async () => {
|
||||
const command = await createAutonomyQueuedPrompt({
|
||||
basePrompt: '<tick>12:00:00</tick>',
|
||||
trigger: 'proactive-tick',
|
||||
@@ -134,7 +143,9 @@ describe('autonomyRuns', () => {
|
||||
runId,
|
||||
status: 'running',
|
||||
startedAt: 100,
|
||||
ownerProcessId: process.pid,
|
||||
})
|
||||
expect(runs[0]?.ownerSessionId).toBeString()
|
||||
|
||||
await markAutonomyRunCompleted(runId, tempDir, 200)
|
||||
runs = await listAutonomyRuns(tempDir)
|
||||
@@ -143,9 +154,22 @@ describe('autonomyRuns', () => {
|
||||
status: 'completed',
|
||||
endedAt: 200,
|
||||
})
|
||||
})
|
||||
|
||||
test('markAutonomyRunFailed updates a non-terminal run', async () => {
|
||||
const command = await createAutonomyQueuedPrompt({
|
||||
basePrompt: '<tick>12:00:00</tick>',
|
||||
trigger: 'proactive-tick',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(command).not.toBeNull()
|
||||
const runId = command!.autonomy!.runId
|
||||
|
||||
await markAutonomyRunRunning(runId, tempDir, 100)
|
||||
await markAutonomyRunFailed(runId, 'boom', tempDir, 300)
|
||||
runs = await listAutonomyRuns(tempDir)
|
||||
const runs = await listAutonomyRuns(tempDir)
|
||||
|
||||
expect(runs[0]).toMatchObject({
|
||||
runId,
|
||||
status: 'failed',
|
||||
@@ -154,6 +178,348 @@ describe('autonomyRuns', () => {
|
||||
})
|
||||
})
|
||||
|
||||
test('terminal runs are not revived by stale lifecycle updates', async () => {
|
||||
const command = await createAutonomyQueuedPrompt({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(command).not.toBeNull()
|
||||
const runId = command!.autonomy!.runId
|
||||
|
||||
await markAutonomyRunCancelled(runId, tempDir, 100)
|
||||
const revived = await markAutonomyRunRunning(runId, tempDir, 200)
|
||||
const completed = await markAutonomyRunCompleted(runId, tempDir, 300)
|
||||
const failed = await markAutonomyRunFailed(
|
||||
runId,
|
||||
'late failure',
|
||||
tempDir,
|
||||
400,
|
||||
)
|
||||
const persisted = await getAutonomyRunById(runId, tempDir)
|
||||
|
||||
expect(revived).toBeNull()
|
||||
expect(completed).toBeNull()
|
||||
expect(failed).toBeNull()
|
||||
expect(persisted).toMatchObject({
|
||||
status: 'cancelled',
|
||||
endedAt: 100,
|
||||
})
|
||||
expect(persisted!.error).toBeUndefined()
|
||||
})
|
||||
|
||||
test('hasActiveAutonomyRunForSource only treats queued and running scheduled runs as active', async () => {
|
||||
const command = await createAutonomyQueuedPrompt({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
sourceLabel: 'nightly',
|
||||
})
|
||||
expect(command).not.toBeNull()
|
||||
const runId = command!.autonomy!.runId
|
||||
|
||||
await expect(
|
||||
hasActiveAutonomyRunForSource({
|
||||
trigger: 'scheduled-task',
|
||||
sourceId: 'cron-1',
|
||||
rootDir: tempDir,
|
||||
}),
|
||||
).resolves.toBe(true)
|
||||
|
||||
await markAutonomyRunRunning(runId, tempDir, 100)
|
||||
await expect(
|
||||
hasActiveAutonomyRunForSource({
|
||||
trigger: 'scheduled-task',
|
||||
sourceId: 'cron-1',
|
||||
rootDir: tempDir,
|
||||
}),
|
||||
).resolves.toBe(true)
|
||||
|
||||
await expect(
|
||||
hasActiveAutonomyRunForSource({
|
||||
trigger: 'scheduled-task',
|
||||
sourceId: 'cron-2',
|
||||
rootDir: tempDir,
|
||||
}),
|
||||
).resolves.toBe(false)
|
||||
|
||||
await markAutonomyRunCompleted(runId, tempDir, 200)
|
||||
await expect(
|
||||
hasActiveAutonomyRunForSource({
|
||||
trigger: 'scheduled-task',
|
||||
sourceId: 'cron-1',
|
||||
rootDir: tempDir,
|
||||
}),
|
||||
).resolves.toBe(false)
|
||||
|
||||
const failedCommand = await createAutonomyQueuedPrompt({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
})
|
||||
expect(failedCommand).not.toBeNull()
|
||||
await markAutonomyRunFailed(
|
||||
failedCommand!.autonomy!.runId,
|
||||
'boom',
|
||||
tempDir,
|
||||
300,
|
||||
)
|
||||
await expect(
|
||||
hasActiveAutonomyRunForSource({
|
||||
trigger: 'scheduled-task',
|
||||
sourceId: 'cron-1',
|
||||
rootDir: tempDir,
|
||||
}),
|
||||
).resolves.toBe(false)
|
||||
})
|
||||
|
||||
test('createAutonomyQueuedPromptIfNoActiveSource atomically skips duplicate active scheduled sources', async () => {
|
||||
const [first, second] = await Promise.all([
|
||||
createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
}),
|
||||
createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
}),
|
||||
])
|
||||
|
||||
const created = [first, second].filter(command => command !== null)
|
||||
const runs = await listAutonomyRuns(tempDir)
|
||||
|
||||
expect(created).toHaveLength(1)
|
||||
expect(runs).toHaveLength(1)
|
||||
expect(runs[0]).toMatchObject({
|
||||
trigger: 'scheduled-task',
|
||||
status: 'queued',
|
||||
sourceId: 'cron-1',
|
||||
})
|
||||
})
|
||||
|
||||
test('createAutonomyQueuedPromptIfNoActiveSource scopes dedup by ownerKey', async () => {
|
||||
const first = await createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
ownerKey: 'owner-a',
|
||||
})
|
||||
const second = await createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
ownerKey: 'owner-b',
|
||||
})
|
||||
|
||||
const runs = await listAutonomyRuns(tempDir)
|
||||
|
||||
expect(first).not.toBeNull()
|
||||
expect(second).not.toBeNull()
|
||||
expect(runs).toHaveLength(2)
|
||||
expect(new Set(runs.map(run => run.ownerKey))).toEqual(
|
||||
new Set(['owner-a', 'owner-b']),
|
||||
)
|
||||
})
|
||||
|
||||
test('createAutonomyQueuedPromptIfNoActiveSource does not advance heartbeat last-run state on dedup skip (two-phase commit invariant)', async () => {
|
||||
await writeTempFile(
|
||||
tempDir,
|
||||
HEARTBEAT_REL,
|
||||
[
|
||||
'tasks:',
|
||||
' - name: inbox',
|
||||
' interval: 30m',
|
||||
' prompt: "Check inbox"',
|
||||
].join('\n'),
|
||||
)
|
||||
|
||||
// Seed an active queued run for cron-1 so the next dedup attempt skips.
|
||||
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
|
||||
await writeFile(
|
||||
resolveAutonomyRunsPath(tempDir),
|
||||
`${JSON.stringify(
|
||||
{
|
||||
runs: [
|
||||
{
|
||||
runId: 'preexisting-active',
|
||||
runtime: 'automatic',
|
||||
trigger: 'scheduled-task',
|
||||
status: 'queued',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
promptPreview: 'still queued',
|
||||
createdAt: 100,
|
||||
ownerProcessId: process.pid,
|
||||
ownerSessionId: 'self',
|
||||
},
|
||||
],
|
||||
},
|
||||
null,
|
||||
2,
|
||||
)}\n`,
|
||||
'utf-8',
|
||||
)
|
||||
|
||||
const skipped = await createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
})
|
||||
expect(skipped).toBeNull()
|
||||
|
||||
// If the dedup skip wrongly advanced heartbeat state, the next
|
||||
// proactive-tick prompt would NOT include the inbox task. Verify it
|
||||
// still does.
|
||||
const followUp = await createAutonomyQueuedPrompt({
|
||||
basePrompt: '<tick>12:00:00</tick>',
|
||||
trigger: 'proactive-tick',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(followUp).not.toBeNull()
|
||||
expect(followUp!.value).toContain('Due HEARTBEAT.md tasks:')
|
||||
expect(followUp!.value).toContain('- inbox (30m): Check inbox')
|
||||
})
|
||||
|
||||
test('createAutonomyQueuedPromptIfNoActiveSource recovers stale active runs from dead owner processes', async () => {
|
||||
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
|
||||
await writeFile(
|
||||
resolveAutonomyRunsPath(tempDir),
|
||||
`${JSON.stringify(
|
||||
{
|
||||
runs: [
|
||||
{
|
||||
runId: 'stale-run',
|
||||
runtime: 'automatic',
|
||||
trigger: 'scheduled-task',
|
||||
status: 'running',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
sourceLabel: 'nightly',
|
||||
promptPreview: 'stale scheduled prompt',
|
||||
createdAt: 100,
|
||||
startedAt: 100,
|
||||
ownerProcessId: 2_147_483_647,
|
||||
ownerSessionId: 'dead-session',
|
||||
},
|
||||
],
|
||||
},
|
||||
null,
|
||||
2,
|
||||
)}\n`,
|
||||
'utf-8',
|
||||
)
|
||||
|
||||
await expect(
|
||||
hasActiveAutonomyRunForSource({
|
||||
trigger: 'scheduled-task',
|
||||
sourceId: 'cron-1',
|
||||
rootDir: tempDir,
|
||||
}),
|
||||
).resolves.toBe(false)
|
||||
|
||||
const command = await createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
})
|
||||
const runs = await listAutonomyRuns(tempDir)
|
||||
|
||||
expect(command).not.toBeNull()
|
||||
expect(runs).toHaveLength(2)
|
||||
expect(runs[0]).toMatchObject({
|
||||
trigger: 'scheduled-task',
|
||||
status: 'queued',
|
||||
sourceId: 'cron-1',
|
||||
ownerProcessId: process.pid,
|
||||
})
|
||||
expect(runs[1]).toMatchObject({
|
||||
runId: 'stale-run',
|
||||
status: 'failed',
|
||||
endedAt: runs[0]?.createdAt,
|
||||
error: expect.stringContaining('owner process 2147483647'),
|
||||
})
|
||||
})
|
||||
|
||||
test('stale managed-flow run recovery also marks the flow step failed', async () => {
|
||||
const command = await startManagedAutonomyFlowFromHeartbeatTask({
|
||||
task: {
|
||||
name: 'weekly-report',
|
||||
interval: '7d',
|
||||
prompt: 'Ship the weekly report',
|
||||
steps: [
|
||||
{
|
||||
name: 'gather',
|
||||
prompt: 'Gather weekly inputs',
|
||||
},
|
||||
],
|
||||
},
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(command).not.toBeNull()
|
||||
const runId = command!.autonomy!.runId
|
||||
await markAutonomyRunRunning(runId, tempDir, 100)
|
||||
|
||||
const runsPath = resolveAutonomyRunsPath(tempDir)
|
||||
const file = JSON.parse(readFileSync(runsPath, 'utf-8')) as {
|
||||
runs: Array<Record<string, unknown>>
|
||||
}
|
||||
file.runs = file.runs.map(run =>
|
||||
run.runId === runId
|
||||
? { ...run, ownerProcessId: 2_147_483_647 }
|
||||
: run,
|
||||
)
|
||||
await writeFile(runsPath, `${JSON.stringify(file, null, 2)}\n`, 'utf-8')
|
||||
|
||||
const replacement = await createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: 'replacement prompt',
|
||||
trigger: 'managed-flow-step',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: command!.autonomy!.sourceId!,
|
||||
ownerKey: 'main-thread',
|
||||
})
|
||||
const [flow] = await listAutonomyFlows(tempDir)
|
||||
const runs = await listAutonomyRuns(tempDir)
|
||||
|
||||
expect(replacement).not.toBeNull()
|
||||
expect(runs.find(run => run.runId === runId)).toMatchObject({
|
||||
status: 'failed',
|
||||
error: expect.stringContaining(STALE_ACTIVE_RUN_ERROR_PREFIX),
|
||||
})
|
||||
expect(flow).toMatchObject({
|
||||
status: 'failed',
|
||||
blockedRunId: runId,
|
||||
})
|
||||
expect(flow?.stateJson?.steps[0]).toMatchObject({
|
||||
status: 'failed',
|
||||
runId,
|
||||
error: expect.stringContaining(STALE_ACTIVE_RUN_ERROR_PREFIX),
|
||||
})
|
||||
})
|
||||
|
||||
test('formatters produce readable status and run listings', async () => {
|
||||
const first = await createAutonomyQueuedPrompt({
|
||||
basePrompt: 'scheduled prompt',
|
||||
@@ -223,6 +589,53 @@ describe('autonomyRuns', () => {
|
||||
)
|
||||
})
|
||||
|
||||
test('persistence pruning keeps active runs ahead of recent completed history', async () => {
|
||||
const runs = [
|
||||
{
|
||||
runId: 'old-active',
|
||||
runtime: 'automatic',
|
||||
trigger: 'scheduled-task',
|
||||
status: 'queued',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
ownerKey: 'main-thread',
|
||||
promptPreview: 'old active',
|
||||
createdAt: 1,
|
||||
},
|
||||
...Array.from({ length: 200 }, (_, index) => ({
|
||||
runId: `history-${index}`,
|
||||
runtime: 'automatic',
|
||||
trigger: 'scheduled-task',
|
||||
status: 'completed',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
ownerKey: 'main-thread',
|
||||
promptPreview: `history ${index}`,
|
||||
createdAt: 1_000 + index,
|
||||
endedAt: 2_000 + index,
|
||||
})),
|
||||
]
|
||||
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
|
||||
await writeFile(
|
||||
resolveAutonomyRunsPath(tempDir),
|
||||
`${JSON.stringify({ runs }, null, 2)}\n`,
|
||||
'utf-8',
|
||||
)
|
||||
|
||||
await createAutonomyRun({
|
||||
trigger: 'scheduled-task',
|
||||
prompt: 'fresh active',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
nowMs: 9_999,
|
||||
})
|
||||
|
||||
const persisted = await listAutonomyRuns(tempDir)
|
||||
expect(persisted).toHaveLength(200)
|
||||
expect(persisted.some(run => run.runId === 'old-active')).toBe(true)
|
||||
expect(persisted.some(run => run.runId === 'history-0')).toBe(false)
|
||||
})
|
||||
|
||||
test('listAutonomyRuns keeps older persisted records by normalizing missing runtime and owner metadata', async () => {
|
||||
const runsPath = resolveAutonomyRunsPath(tempDir)
|
||||
await mkdir(join(tempDir, '.claude', 'autonomy'), { recursive: true })
|
||||
@@ -418,4 +831,27 @@ describe('autonomyRuns', () => {
|
||||
expect(recovered!.autonomy?.runId).toBe(command!.autonomy?.runId)
|
||||
expect(recovered!.autonomy?.flowId).toBe(flow!.flowId)
|
||||
})
|
||||
|
||||
test('STALE_ACTIVE_RUN_ERROR_PREFIX stays in sync with HEARTBEAT.md stale-recovery-health task', () => {
|
||||
// The HEARTBEAT.md stale-recovery-health task prompt embeds this prefix
|
||||
// as a literal string. Changing the constant without updating the
|
||||
// heartbeat prompt would silently break the monitor — this test fails
|
||||
// first to force the simultaneous update.
|
||||
const heartbeatPath = resolvePath(
|
||||
import.meta.dir,
|
||||
'..',
|
||||
'..',
|
||||
'..',
|
||||
'.claude',
|
||||
'autonomy',
|
||||
'HEARTBEAT.md',
|
||||
)
|
||||
if (!existsSync(heartbeatPath)) {
|
||||
// .claude/ may be absent in some checkout layouts (e.g., shallow clone
|
||||
// for npm pack). Skip rather than fail in that case.
|
||||
return
|
||||
}
|
||||
const content = readFileSync(heartbeatPath, 'utf8')
|
||||
expect(content).toContain(STALE_ACTIVE_RUN_ERROR_PREFIX)
|
||||
})
|
||||
})
|
||||
|
||||
@@ -133,11 +133,41 @@ function mergeAgentsAuthority(files: AutonomyAuthorityFile[]): string | null {
|
||||
.join('\n\n')
|
||||
}
|
||||
|
||||
/**
|
||||
* Replaces fenced code-block content (and the ``` / ~~~ fence delimiters
|
||||
* themselves) with empty strings while preserving the index of every
|
||||
* other line. Used by the heartbeat parser so that `tasks:` literals
|
||||
* appearing inside Markdown code samples in HEARTBEAT.md docs do not
|
||||
* collide with the real config block.
|
||||
*/
|
||||
function maskCodeFencedLines(lines: string[]): string[] {
|
||||
const masked = lines.slice()
|
||||
let activeFenceChar: '`' | '~' | null = null
|
||||
for (let i = 0; i < masked.length; i++) {
|
||||
const trimmed = masked[i]!.trim()
|
||||
const fenceMatch = trimmed.match(/^(```+|~~~+)/)
|
||||
if (fenceMatch) {
|
||||
const fenceChar = fenceMatch[1]![0] as '`' | '~'
|
||||
if (activeFenceChar === null) {
|
||||
activeFenceChar = fenceChar
|
||||
} else if (activeFenceChar === fenceChar) {
|
||||
activeFenceChar = null
|
||||
}
|
||||
masked[i] = ''
|
||||
continue
|
||||
}
|
||||
if (activeFenceChar !== null) {
|
||||
masked[i] = ''
|
||||
}
|
||||
}
|
||||
return masked
|
||||
}
|
||||
|
||||
export function parseHeartbeatAuthorityTasks(
|
||||
content: string,
|
||||
): HeartbeatAuthorityTask[] {
|
||||
const tasks: HeartbeatAuthorityTask[] = []
|
||||
const lines = content.split('\n')
|
||||
const lines = maskCodeFencedLines(content.split('\n'))
|
||||
const getIndent = (line: string): number =>
|
||||
line.length - line.trimStart().length
|
||||
const parseScalar = (line: string, key: string): string =>
|
||||
|
||||
@@ -83,6 +83,20 @@ export type AutonomyFlowRecord = {
|
||||
waitJson?: AutonomyFlowWaitState
|
||||
cancelRequestedAt?: number
|
||||
lastError?: string
|
||||
/**
|
||||
* Repo-relative POSIX glob patterns describing which paths this flow's
|
||||
* `report`-step approval covers. The pre-tool-use hook
|
||||
* `require-plan-for-risky-edit.mjs` consults this list to permit edits
|
||||
* only when the target file matches at least one entry. Absent or empty
|
||||
* means "no boundary declared" — during the pilot window the hook
|
||||
* treats this as broad approval (v1 behaviour). Once all production
|
||||
* flows declare boundaries, the hook will deny absent-boundary flows.
|
||||
*
|
||||
* Supported syntax: `*` matches one path segment, `**` matches any
|
||||
* number including zero. Examples: `src/utils/autonomy*`,
|
||||
* `src/services/api/**`, `src/Tool.ts`.
|
||||
*/
|
||||
boundary?: string[]
|
||||
}
|
||||
|
||||
type AutonomyFlowsFile = {
|
||||
@@ -138,6 +152,7 @@ function cloneWaitState(
|
||||
function cloneFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
|
||||
return {
|
||||
...flow,
|
||||
...(flow.boundary ? { boundary: [...flow.boundary] } : {}),
|
||||
...(flow.stateJson ? { stateJson: cloneManagedState(flow.stateJson) } : {}),
|
||||
...(flow.waitJson ? { waitJson: cloneWaitState(flow.waitJson) } : {}),
|
||||
}
|
||||
@@ -152,6 +167,25 @@ function isManagedFlowStatusActive(status: AutonomyFlowStatus): boolean {
|
||||
)
|
||||
}
|
||||
|
||||
function selectPersistedAutonomyFlows(
|
||||
flows: AutonomyFlowRecord[],
|
||||
): AutonomyFlowRecord[] {
|
||||
const retained = flows
|
||||
.slice()
|
||||
.map(cloneFlowRecord)
|
||||
.sort((left, right) => {
|
||||
const leftActive = isManagedFlowStatusActive(left.status)
|
||||
const rightActive = isManagedFlowStatusActive(right.status)
|
||||
if (leftActive !== rightActive) {
|
||||
return leftActive ? -1 : 1
|
||||
}
|
||||
return right.updatedAt - left.updatedAt
|
||||
})
|
||||
.slice(0, AUTONOMY_FLOWS_MAX)
|
||||
|
||||
return retained.sort((left, right) => right.updatedAt - left.updatedAt)
|
||||
}
|
||||
|
||||
function defaultFlowSource(params: {
|
||||
trigger: AutonomyTriggerKind
|
||||
sourceId?: string
|
||||
@@ -237,6 +271,35 @@ function normalizeWaitState(value: unknown): AutonomyFlowWaitState | undefined {
|
||||
}
|
||||
}
|
||||
|
||||
function isPosixBoundaryGlob(value: string): boolean {
|
||||
if (!value || value.startsWith('/') || value.includes('\\')) {
|
||||
return false
|
||||
}
|
||||
if (value.includes('\0')) {
|
||||
return false
|
||||
}
|
||||
return !value.split('/').some(segment => segment === '..')
|
||||
}
|
||||
|
||||
function normalizeBoundary(value: unknown): string[] | undefined {
|
||||
if (!Array.isArray(value)) {
|
||||
return undefined
|
||||
}
|
||||
const seen = new Set<string>()
|
||||
const boundary = value
|
||||
.filter((entry): entry is string => typeof entry === 'string')
|
||||
.map(entry => entry.trim())
|
||||
.filter(isPosixBoundaryGlob)
|
||||
.filter(entry => {
|
||||
if (seen.has(entry)) {
|
||||
return false
|
||||
}
|
||||
seen.add(entry)
|
||||
return true
|
||||
})
|
||||
return boundary.length > 0 ? boundary : undefined
|
||||
}
|
||||
|
||||
function normalizeFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
|
||||
const source = defaultFlowSource(flow)
|
||||
return {
|
||||
@@ -247,6 +310,7 @@ function normalizeFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
|
||||
goal: flow.goal || flow.sourceLabel || flow.sourceId || flow.flowKey,
|
||||
currentDir: flow.currentDir || flow.rootDir,
|
||||
runCount: Math.max(flow.runCount ?? 0, 0),
|
||||
boundary: normalizeBoundary(flow.boundary),
|
||||
stateJson: normalizeManagedState(flow.stateJson),
|
||||
waitJson: normalizeWaitState(flow.waitJson),
|
||||
...(flow.sourceId
|
||||
@@ -369,11 +433,7 @@ async function writeAutonomyFlows(
|
||||
path,
|
||||
`${JSON.stringify(
|
||||
{
|
||||
flows: flows
|
||||
.slice()
|
||||
.map(cloneFlowRecord)
|
||||
.sort((left, right) => right.updatedAt - left.updatedAt)
|
||||
.slice(0, AUTONOMY_FLOWS_MAX),
|
||||
flows: selectPersistedAutonomyFlows(flows),
|
||||
} satisfies AutonomyFlowsFile,
|
||||
null,
|
||||
2,
|
||||
@@ -420,6 +480,7 @@ export async function startManagedAutonomyFlow(params: {
|
||||
ownerKey?: string
|
||||
sourceId?: string
|
||||
sourceLabel?: string
|
||||
boundary?: string[]
|
||||
nowMs?: number
|
||||
}): Promise<ManagedAutonomyFlowStartResult | null> {
|
||||
if (params.steps.length === 0) {
|
||||
@@ -450,6 +511,8 @@ export async function startManagedAutonomyFlow(params: {
|
||||
|
||||
const stateJson = buildManagedState(params.steps)
|
||||
const firstStep = stateJson.steps[0]!
|
||||
const boundary =
|
||||
normalizeBoundary(params.boundary) ?? normalizeBoundary(current?.boundary)
|
||||
const waiting =
|
||||
firstStep.waitFor != null
|
||||
? {
|
||||
@@ -474,6 +537,7 @@ export async function startManagedAutonomyFlow(params: {
|
||||
currentDir,
|
||||
...(source.sourceId ? { sourceId: source.sourceId } : {}),
|
||||
...(source.sourceLabel ? { sourceLabel: source.sourceLabel } : {}),
|
||||
...(boundary ? { boundary } : {}),
|
||||
latestRunId: undefined,
|
||||
runCount: current?.runCount ?? 0,
|
||||
createdAt: current?.createdAt ?? nowMs,
|
||||
|
||||
@@ -4,6 +4,15 @@ import { lock } from './lockfile.js'
|
||||
|
||||
const persistenceLocks = new Map<string, Promise<void>>()
|
||||
|
||||
export function getAutonomyPersistenceLockCountForTests(): number {
|
||||
if (process.env.NODE_ENV !== 'test') {
|
||||
throw new Error(
|
||||
'getAutonomyPersistenceLockCountForTests can only be called in tests',
|
||||
)
|
||||
}
|
||||
return persistenceLocks.size
|
||||
}
|
||||
|
||||
export async function withAutonomyPersistenceLock<T>(
|
||||
rootDir: string,
|
||||
fn: () => Promise<T>,
|
||||
@@ -16,10 +25,8 @@ export async function withAutonomyPersistenceLock<T>(
|
||||
const current = new Promise<void>(resolve => {
|
||||
release = resolve
|
||||
})
|
||||
persistenceLocks.set(
|
||||
key,
|
||||
previous.then(() => current),
|
||||
)
|
||||
const chained = previous.then(() => current)
|
||||
persistenceLocks.set(key, chained)
|
||||
|
||||
await previous
|
||||
try {
|
||||
@@ -41,7 +48,7 @@ export async function withAutonomyPersistenceLock<T>(
|
||||
}
|
||||
} finally {
|
||||
release()
|
||||
if (persistenceLocks.get(key) === current) {
|
||||
if (persistenceLocks.get(key) === chained) {
|
||||
persistenceLocks.delete(key)
|
||||
}
|
||||
}
|
||||
|
||||
261
src/utils/autonomyQueueLifecycle.ts
Normal file
261
src/utils/autonomyQueueLifecycle.ts
Normal file
@@ -0,0 +1,261 @@
|
||||
import type { QueuedCommand } from '../types/textInputTypes.js'
|
||||
import {
|
||||
finalizeAutonomyRunCompleted,
|
||||
finalizeAutonomyRunFailed,
|
||||
listAutonomyRuns,
|
||||
markAutonomyRunCancelled,
|
||||
markAutonomyRunRunning,
|
||||
} from './autonomyRuns.js'
|
||||
|
||||
export type AutonomyQueuePartition = {
|
||||
attachmentCommands: QueuedCommand[]
|
||||
staleCommands: QueuedCommand[]
|
||||
}
|
||||
|
||||
export type AutonomyQueueClaim = AutonomyQueuePartition & {
|
||||
claimedRunIds: string[]
|
||||
claimedCommands: QueuedCommand[]
|
||||
}
|
||||
|
||||
export type AutonomyTurnOutcome =
|
||||
| { type: 'completed' }
|
||||
| { type: 'cancelled' }
|
||||
| { type: 'failed'; error?: unknown; message?: string }
|
||||
|
||||
type AutonomyRunRef = {
|
||||
runId: string
|
||||
rootDir?: string
|
||||
}
|
||||
|
||||
function getCommandRootDir(
|
||||
command: QueuedCommand,
|
||||
fallbackRootDir?: string,
|
||||
): string | undefined {
|
||||
return command.autonomy?.rootDir ?? fallbackRootDir
|
||||
}
|
||||
|
||||
function refKey(ref: AutonomyRunRef): string {
|
||||
return `${ref.rootDir ?? ''}\0${ref.runId}`
|
||||
}
|
||||
|
||||
function getAutonomyRunRefs(
|
||||
commands: QueuedCommand[],
|
||||
fallbackRootDir?: string,
|
||||
): AutonomyRunRef[] {
|
||||
const refs = new Map<string, AutonomyRunRef>()
|
||||
for (const command of commands) {
|
||||
const runId = command.autonomy?.runId
|
||||
if (!runId) {
|
||||
continue
|
||||
}
|
||||
const ref = {
|
||||
runId,
|
||||
rootDir: getCommandRootDir(command, fallbackRootDir),
|
||||
}
|
||||
refs.set(refKey(ref), ref)
|
||||
}
|
||||
return [...refs.values()]
|
||||
}
|
||||
|
||||
function isInlineQueuedCommand(command: QueuedCommand): boolean {
|
||||
return command.mode === 'prompt' || command.mode === 'task-notification'
|
||||
}
|
||||
|
||||
function groupRefsByRootDir(
|
||||
refs: AutonomyRunRef[],
|
||||
): Map<string, AutonomyRunRef[]> {
|
||||
const grouped = new Map<string, AutonomyRunRef[]>()
|
||||
for (const ref of refs) {
|
||||
const key = ref.rootDir ?? ''
|
||||
const group = grouped.get(key)
|
||||
if (group) {
|
||||
group.push(ref)
|
||||
} else {
|
||||
grouped.set(key, [ref])
|
||||
}
|
||||
}
|
||||
return grouped
|
||||
}
|
||||
|
||||
/**
|
||||
* Exclude queued autonomy commands whose persisted run is no longer queued.
|
||||
* This prevents stale in-memory commands from reviving flows after cancellation
|
||||
* or after another path has already consumed the run.
|
||||
*/
|
||||
export async function partitionConsumableQueuedAutonomyCommands(
|
||||
commands: QueuedCommand[],
|
||||
rootDir?: string,
|
||||
): Promise<AutonomyQueuePartition> {
|
||||
const attachmentCommands: QueuedCommand[] = []
|
||||
const staleCommands: QueuedCommand[] = []
|
||||
const refs = getAutonomyRunRefs(commands, rootDir)
|
||||
const runsByRef = new Map<
|
||||
string,
|
||||
Awaited<ReturnType<typeof listAutonomyRuns>>[number]
|
||||
>()
|
||||
for (const [rootKey, group] of groupRefsByRootDir(refs)) {
|
||||
const runs = await listAutonomyRuns(rootKey || undefined)
|
||||
const wanted = new Set(group.map(ref => ref.runId))
|
||||
for (const run of runs) {
|
||||
if (wanted.has(run.runId)) {
|
||||
runsByRef.set(
|
||||
refKey({ runId: run.runId, rootDir: rootKey || undefined }),
|
||||
run,
|
||||
)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
for (const command of commands) {
|
||||
const runId = command.autonomy?.runId
|
||||
if (!runId) {
|
||||
attachmentCommands.push(command)
|
||||
continue
|
||||
}
|
||||
|
||||
const commandRootDir = getCommandRootDir(command, rootDir)
|
||||
const run = runsByRef.get(refKey({ runId, rootDir: commandRootDir }))
|
||||
if (run?.status === 'queued' && !run.startedAt && !run.endedAt) {
|
||||
attachmentCommands.push(command)
|
||||
} else {
|
||||
staleCommands.push(command)
|
||||
}
|
||||
}
|
||||
|
||||
return { attachmentCommands, staleCommands }
|
||||
}
|
||||
|
||||
export async function claimConsumableQueuedAutonomyCommands(
|
||||
commands: QueuedCommand[],
|
||||
rootDir?: string,
|
||||
): Promise<AutonomyQueueClaim> {
|
||||
const partition = await partitionConsumableQueuedAutonomyCommands(
|
||||
commands,
|
||||
rootDir,
|
||||
)
|
||||
const claimedRunIds: string[] = []
|
||||
const claimedRunKeys: string[] = []
|
||||
const staleRunKeys = new Set<string>()
|
||||
const candidateRefs = getAutonomyRunRefs(
|
||||
partition.attachmentCommands.filter(isInlineQueuedCommand),
|
||||
rootDir,
|
||||
)
|
||||
|
||||
for (const ref of candidateRefs) {
|
||||
const updated = await markAutonomyRunRunning(ref.runId, ref.rootDir)
|
||||
if (updated?.status === 'running') {
|
||||
claimedRunIds.push(ref.runId)
|
||||
claimedRunKeys.push(refKey(ref))
|
||||
} else {
|
||||
staleRunKeys.add(refKey(ref))
|
||||
}
|
||||
}
|
||||
|
||||
const claimedRunKeySet = new Set(claimedRunKeys)
|
||||
const attachmentCommands: QueuedCommand[] = []
|
||||
const claimedCommands: QueuedCommand[] = []
|
||||
const staleCommands = [...partition.staleCommands]
|
||||
|
||||
for (const command of partition.attachmentCommands) {
|
||||
const runId = command.autonomy?.runId
|
||||
if (!runId) {
|
||||
attachmentCommands.push(command)
|
||||
continue
|
||||
}
|
||||
const key = refKey({
|
||||
runId,
|
||||
rootDir: getCommandRootDir(command, rootDir),
|
||||
})
|
||||
if (claimedRunKeySet.has(key)) {
|
||||
attachmentCommands.push(command)
|
||||
claimedCommands.push(command)
|
||||
} else if (staleRunKeys.has(key)) {
|
||||
staleCommands.push(command)
|
||||
}
|
||||
}
|
||||
|
||||
return {
|
||||
attachmentCommands,
|
||||
staleCommands,
|
||||
claimedRunIds,
|
||||
claimedCommands,
|
||||
}
|
||||
}
|
||||
|
||||
export async function cancelQueuedAutonomyCommands(params: {
|
||||
commands: QueuedCommand[]
|
||||
rootDir?: string
|
||||
}): Promise<void> {
|
||||
for (const ref of getAutonomyRunRefs(params.commands, params.rootDir)) {
|
||||
await markAutonomyRunCancelled(ref.runId, ref.rootDir)
|
||||
}
|
||||
}
|
||||
|
||||
function stringifyAutonomyError(error: unknown): string {
|
||||
if (typeof error === 'string') {
|
||||
return error
|
||||
}
|
||||
if (error instanceof Error) {
|
||||
return error.message
|
||||
}
|
||||
return String(error)
|
||||
}
|
||||
|
||||
export function sanitizeAutonomyFailureForPersistence(
|
||||
error: unknown,
|
||||
fallback = 'query failed',
|
||||
): string {
|
||||
const message = stringifyAutonomyError(error)
|
||||
const lower = message.toLowerCase()
|
||||
if (
|
||||
lower.includes('api_error') ||
|
||||
lower.includes('provider') ||
|
||||
lower.includes('openai') ||
|
||||
lower.includes('gemini') ||
|
||||
lower.includes('grok') ||
|
||||
lower.includes('anthropic') ||
|
||||
lower.includes('bedrock') ||
|
||||
lower.includes('vertex')
|
||||
) {
|
||||
return 'provider api_error'
|
||||
}
|
||||
return fallback
|
||||
}
|
||||
|
||||
export async function finalizeAutonomyCommandsForTurn(params: {
|
||||
commands: QueuedCommand[]
|
||||
outcome: AutonomyTurnOutcome
|
||||
currentDir?: string
|
||||
priority?: 'now' | 'next' | 'later'
|
||||
workload?: string
|
||||
}): Promise<QueuedCommand[]> {
|
||||
const nextCommands: QueuedCommand[] = []
|
||||
for (const command of params.commands) {
|
||||
const autonomy = command.autonomy
|
||||
if (!autonomy?.runId) {
|
||||
continue
|
||||
}
|
||||
if (params.outcome.type === 'completed') {
|
||||
nextCommands.push(
|
||||
...(await finalizeAutonomyRunCompleted({
|
||||
runId: autonomy.runId,
|
||||
rootDir: autonomy.rootDir,
|
||||
currentDir: params.currentDir,
|
||||
priority: params.priority,
|
||||
workload: command.workload ?? params.workload,
|
||||
})),
|
||||
)
|
||||
} else if (params.outcome.type === 'cancelled') {
|
||||
await markAutonomyRunCancelled(autonomy.runId, autonomy.rootDir)
|
||||
} else {
|
||||
await finalizeAutonomyRunFailed({
|
||||
runId: autonomy.runId,
|
||||
rootDir: autonomy.rootDir,
|
||||
error:
|
||||
params.outcome.message ??
|
||||
sanitizeAutonomyFailureForPersistence(params.outcome.error),
|
||||
})
|
||||
}
|
||||
}
|
||||
return nextCommands
|
||||
}
|
||||
@@ -1,7 +1,7 @@
|
||||
import { randomUUID } from 'crypto'
|
||||
import { mkdir, writeFile } from 'fs/promises'
|
||||
import { dirname, join, resolve } from 'path'
|
||||
import { getProjectRoot } from '../bootstrap/state.js'
|
||||
import { getProjectRoot, getSessionId } from '../bootstrap/state.js'
|
||||
import type { MessageOrigin } from '../types/message.js'
|
||||
import type { QueuedCommand } from '../types/textInputTypes.js'
|
||||
import {
|
||||
@@ -29,9 +29,22 @@ import {
|
||||
} from './autonomyFlows.js'
|
||||
import { withAutonomyPersistenceLock } from './autonomyPersistence.js'
|
||||
import { getFsImplementation } from './fsOperations.js'
|
||||
import { isProcessRunning } from './genericProcessUtils.js'
|
||||
import { logError } from './log.js'
|
||||
|
||||
const AUTONOMY_RUNS_MAX = 200
|
||||
const AUTONOMY_RUNS_RELATIVE_PATH = join(AUTONOMY_DIR, 'runs.json')
|
||||
// Sentinel string surfaced to operators via runs.json error fields and
|
||||
// referenced literally by the HEARTBEAT.md `stale-recovery-health` task.
|
||||
// A unit test asserts the HEARTBEAT.md file contains this exact prefix —
|
||||
// changing the value will fail the test, forcing the heartbeat prompt
|
||||
// to be updated in the same change.
|
||||
export const STALE_ACTIVE_RUN_ERROR_PREFIX =
|
||||
'Recovered stale active autonomy run'
|
||||
|
||||
// Guards the legacy-block warning so it fires once per (process, runId) instead
|
||||
// of every dedup tick while a no-owner record sits there.
|
||||
const warnedLegacyBlockRunIds = new Set<string>()
|
||||
|
||||
export type AutonomyRunStatus =
|
||||
| 'queued'
|
||||
@@ -59,6 +72,8 @@ export type AutonomyRunRecord = {
|
||||
flowStepName?: string
|
||||
promptPreview: string
|
||||
createdAt: number
|
||||
ownerProcessId?: number
|
||||
ownerSessionId?: string
|
||||
startedAt?: number
|
||||
endedAt?: number
|
||||
error?: string
|
||||
@@ -77,6 +92,19 @@ type AutonomyRunFlowRef = {
|
||||
stepName: string
|
||||
}
|
||||
|
||||
type CreateAutonomyRunParams = {
|
||||
trigger: AutonomyTriggerKind
|
||||
prompt: string
|
||||
rootDir?: string
|
||||
currentDir?: string
|
||||
sourceId?: string
|
||||
sourceLabel?: string
|
||||
runtime?: AutonomyRunRuntime
|
||||
ownerKey?: string
|
||||
flow?: AutonomyRunFlowRef
|
||||
nowMs?: number
|
||||
}
|
||||
|
||||
function truncatePromptPreview(prompt: string): string {
|
||||
const singleLine = prompt.replace(/\s+/g, ' ').trim()
|
||||
return singleLine.length <= 240
|
||||
@@ -95,6 +123,29 @@ function cloneRunRecord(run: AutonomyRunRecord): AutonomyRunRecord {
|
||||
return { ...run }
|
||||
}
|
||||
|
||||
function isAutonomyRunActive(run: AutonomyRunRecord): boolean {
|
||||
return run.status === 'queued' || run.status === 'running'
|
||||
}
|
||||
|
||||
function selectPersistedAutonomyRuns(
|
||||
runs: AutonomyRunRecord[],
|
||||
): AutonomyRunRecord[] {
|
||||
const retained = runs
|
||||
.slice()
|
||||
.map(cloneRunRecord)
|
||||
.sort((left, right) => {
|
||||
const leftActive = isAutonomyRunActive(left)
|
||||
const rightActive = isAutonomyRunActive(right)
|
||||
if (leftActive !== rightActive) {
|
||||
return leftActive ? -1 : 1
|
||||
}
|
||||
return right.createdAt - left.createdAt
|
||||
})
|
||||
.slice(0, AUTONOMY_RUNS_MAX)
|
||||
|
||||
return retained.sort((left, right) => right.createdAt - left.createdAt)
|
||||
}
|
||||
|
||||
function normalizePersistedRunRecord(
|
||||
run: PersistedAutonomyRunRecord,
|
||||
): AutonomyRunRecord {
|
||||
@@ -157,11 +208,7 @@ async function writeAutonomyRuns(
|
||||
path,
|
||||
`${JSON.stringify(
|
||||
{
|
||||
runs: runs
|
||||
.slice()
|
||||
.map(cloneRunRecord)
|
||||
.sort((left, right) => right.createdAt - left.createdAt)
|
||||
.slice(0, AUTONOMY_RUNS_MAX),
|
||||
runs: selectPersistedAutonomyRuns(runs),
|
||||
} satisfies AutonomyRunsFile,
|
||||
null,
|
||||
2,
|
||||
@@ -172,7 +219,7 @@ async function writeAutonomyRuns(
|
||||
|
||||
async function updateAutonomyRun(
|
||||
runId: string,
|
||||
updater: (current: AutonomyRunRecord) => AutonomyRunRecord,
|
||||
updater: (current: AutonomyRunRecord) => AutonomyRunRecord | null,
|
||||
rootDir: string = getProjectRoot(),
|
||||
): Promise<AutonomyRunRecord | null> {
|
||||
return withAutonomyPersistenceLock(rootDir, async () => {
|
||||
@@ -181,7 +228,11 @@ async function updateAutonomyRun(
|
||||
if (index === -1) {
|
||||
return null
|
||||
}
|
||||
const updated = cloneRunRecord(updater(cloneRunRecord(runs[index]!)))
|
||||
const next = updater(cloneRunRecord(runs[index]!))
|
||||
if (!next) {
|
||||
return null
|
||||
}
|
||||
const updated = cloneRunRecord(next)
|
||||
runs[index] = updated
|
||||
await writeAutonomyRuns(runs, rootDir)
|
||||
return updated
|
||||
@@ -196,21 +247,112 @@ export async function getAutonomyRunById(
|
||||
return runs.find(run => run.runId === runId) ?? null
|
||||
}
|
||||
|
||||
export async function createAutonomyRun(params: {
|
||||
function isActiveAutonomyRunStatus(status: AutonomyRunStatus): boolean {
|
||||
return status === 'queued' || status === 'running'
|
||||
}
|
||||
|
||||
function isValidOwnerProcessId(pid: number | undefined): pid is number {
|
||||
// Reject non-numeric, negative, zero (Linux: send-to-process-group), and
|
||||
// non-integer values. A forged record with pid=0 or pid<0 used to be
|
||||
// treated as live and could permanently block dedup; treating them as
|
||||
// stale closes that availability hole.
|
||||
return (
|
||||
typeof pid === 'number' &&
|
||||
Number.isInteger(pid) &&
|
||||
pid > 0 &&
|
||||
pid < 4_194_304
|
||||
)
|
||||
}
|
||||
|
||||
function isStaleActiveAutonomyRun(run: AutonomyRunRecord): boolean {
|
||||
if (!isActiveAutonomyRunStatus(run.status)) {
|
||||
return false
|
||||
}
|
||||
if (run.ownerProcessId === undefined) {
|
||||
return false
|
||||
}
|
||||
if (!isValidOwnerProcessId(run.ownerProcessId)) {
|
||||
return true
|
||||
}
|
||||
return !isProcessRunning(run.ownerProcessId)
|
||||
}
|
||||
|
||||
function staleActiveRunError(run: AutonomyRunRecord): string {
|
||||
return `${STALE_ACTIVE_RUN_ERROR_PREFIX}: owner process ${run.ownerProcessId} is no longer running.`
|
||||
}
|
||||
|
||||
function failAutonomyRunRecord(
|
||||
run: AutonomyRunRecord,
|
||||
error: string,
|
||||
nowMs: number,
|
||||
): AutonomyRunRecord {
|
||||
return {
|
||||
...run,
|
||||
status: 'failed',
|
||||
endedAt: nowMs,
|
||||
error,
|
||||
}
|
||||
}
|
||||
|
||||
function recoverStaleActiveAutonomyRun(
|
||||
run: AutonomyRunRecord,
|
||||
nowMs: number,
|
||||
): AutonomyRunRecord {
|
||||
return failAutonomyRunRecord(run, staleActiveRunError(run), nowMs)
|
||||
}
|
||||
|
||||
async function syncFailedManagedFlowForRun(
|
||||
run: AutonomyRunRecord,
|
||||
rootDir: string,
|
||||
): Promise<void> {
|
||||
if (run.parentFlowId && run.parentFlowSyncMode === 'managed') {
|
||||
await markManagedAutonomyFlowStepFailed({
|
||||
flowId: run.parentFlowId,
|
||||
runId: run.runId,
|
||||
error: run.error ?? 'Autonomy run failed.',
|
||||
rootDir,
|
||||
nowMs: run.endedAt,
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
function matchesActiveAutonomyRunSource(
|
||||
run: AutonomyRunRecord,
|
||||
params: {
|
||||
trigger: AutonomyTriggerKind
|
||||
sourceId: string
|
||||
ownerKey?: string
|
||||
},
|
||||
): boolean {
|
||||
return (
|
||||
run.trigger === params.trigger &&
|
||||
run.sourceId === params.sourceId &&
|
||||
(params.ownerKey === undefined || run.ownerKey === params.ownerKey) &&
|
||||
isActiveAutonomyRunStatus(run.status)
|
||||
)
|
||||
}
|
||||
|
||||
export async function hasActiveAutonomyRunForSource(params: {
|
||||
trigger: AutonomyTriggerKind
|
||||
prompt: string
|
||||
sourceId: string
|
||||
rootDir?: string
|
||||
currentDir?: string
|
||||
sourceId?: string
|
||||
sourceLabel?: string
|
||||
runtime?: AutonomyRunRuntime
|
||||
ownerKey?: string
|
||||
flow?: AutonomyRunFlowRef
|
||||
nowMs?: number
|
||||
}): Promise<AutonomyRunRecord> {
|
||||
const rootDir = resolve(params.rootDir ?? getProjectRoot())
|
||||
const currentDir = resolve(params.currentDir ?? rootDir)
|
||||
const record: AutonomyRunRecord = {
|
||||
}): Promise<boolean> {
|
||||
const runs = await listAutonomyRuns(params.rootDir)
|
||||
return runs.some(
|
||||
run =>
|
||||
matchesActiveAutonomyRunSource(run, params) &&
|
||||
!isStaleActiveAutonomyRun(run),
|
||||
)
|
||||
}
|
||||
|
||||
function buildAutonomyRunRecord(
|
||||
params: CreateAutonomyRunParams,
|
||||
rootDir: string,
|
||||
currentDir: string,
|
||||
): AutonomyRunRecord {
|
||||
const createdAt = params.nowMs ?? Date.now()
|
||||
return {
|
||||
runId: randomUUID(),
|
||||
runtime: params.runtime ?? (params.flow ? 'flow_step' : 'automatic'),
|
||||
trigger: params.trigger,
|
||||
@@ -231,13 +373,80 @@ export async function createAutonomyRun(params: {
|
||||
}
|
||||
: {}),
|
||||
promptPreview: truncatePromptPreview(params.prompt),
|
||||
createdAt: params.nowMs ?? Date.now(),
|
||||
createdAt,
|
||||
ownerProcessId: process.pid,
|
||||
ownerSessionId: getSessionId(),
|
||||
}
|
||||
}
|
||||
|
||||
async function persistAutonomyRunRecord(
|
||||
record: AutonomyRunRecord,
|
||||
rootDir: string,
|
||||
skipWhenActiveSource: boolean,
|
||||
): Promise<{
|
||||
created: boolean
|
||||
recoveredStaleRuns: AutonomyRunRecord[]
|
||||
}> {
|
||||
let created = false
|
||||
const recoveredStaleRuns: AutonomyRunRecord[] = []
|
||||
await withAutonomyPersistenceLock(rootDir, async () => {
|
||||
const runs = await listAutonomyRuns(rootDir)
|
||||
const sourceId = record.sourceId
|
||||
if (skipWhenActiveSource && sourceId) {
|
||||
let hasBlockingActiveRun = false
|
||||
let staleRecoveriesApplied = false
|
||||
for (let i = 0; i < runs.length; i++) {
|
||||
const run = runs[i]!
|
||||
if (
|
||||
!matchesActiveAutonomyRunSource(run, {
|
||||
trigger: record.trigger,
|
||||
sourceId,
|
||||
ownerKey: record.ownerKey,
|
||||
})
|
||||
) {
|
||||
continue
|
||||
}
|
||||
if (isStaleActiveAutonomyRun(run)) {
|
||||
const recovered = recoverStaleActiveAutonomyRun(
|
||||
run,
|
||||
record.createdAt,
|
||||
)
|
||||
runs[i] = recovered
|
||||
recoveredStaleRuns.push(recovered)
|
||||
staleRecoveriesApplied = true
|
||||
continue
|
||||
}
|
||||
if (
|
||||
run.ownerProcessId === undefined &&
|
||||
!warnedLegacyBlockRunIds.has(run.runId)
|
||||
) {
|
||||
warnedLegacyBlockRunIds.add(run.runId)
|
||||
logError(
|
||||
new Error(
|
||||
`[autonomyRuns] blocked by legacy un-owned active run ${run.runId} (createdAt=${run.createdAt}); cancel manually if this is a stale upgrade artifact`,
|
||||
),
|
||||
)
|
||||
}
|
||||
hasBlockingActiveRun = true
|
||||
}
|
||||
if (hasBlockingActiveRun) {
|
||||
if (staleRecoveriesApplied) {
|
||||
await writeAutonomyRuns(runs, rootDir)
|
||||
}
|
||||
return
|
||||
}
|
||||
}
|
||||
runs.unshift(record)
|
||||
await writeAutonomyRuns(runs, rootDir)
|
||||
created = true
|
||||
})
|
||||
return { created, recoveredStaleRuns }
|
||||
}
|
||||
|
||||
async function queueManagedFlowStepRunForRecord(
|
||||
record: AutonomyRunRecord,
|
||||
rootDir: string,
|
||||
): Promise<void> {
|
||||
if (
|
||||
record.parentFlowId &&
|
||||
record.flowStepId &&
|
||||
@@ -258,9 +467,47 @@ export async function createAutonomyRun(params: {
|
||||
nowMs: record.createdAt,
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
async function createAutonomyRunCore(
|
||||
params: CreateAutonomyRunParams,
|
||||
skipIfActiveSource: boolean,
|
||||
): Promise<AutonomyRunRecord | null> {
|
||||
const rootDir = resolve(params.rootDir ?? getProjectRoot())
|
||||
const currentDir = resolve(params.currentDir ?? rootDir)
|
||||
const record = buildAutonomyRunRecord(params, rootDir, currentDir)
|
||||
|
||||
const { created, recoveredStaleRuns } = await persistAutonomyRunRecord(
|
||||
record,
|
||||
rootDir,
|
||||
skipIfActiveSource,
|
||||
)
|
||||
for (const recovered of recoveredStaleRuns) {
|
||||
await syncFailedManagedFlowForRun(recovered, rootDir)
|
||||
}
|
||||
if (!created) {
|
||||
return null
|
||||
}
|
||||
await queueManagedFlowStepRunForRecord(record, rootDir)
|
||||
return record
|
||||
}
|
||||
|
||||
export async function createAutonomyRun(
|
||||
params: CreateAutonomyRunParams,
|
||||
): Promise<AutonomyRunRecord> {
|
||||
const record = await createAutonomyRunCore(params, false)
|
||||
if (!record) {
|
||||
throw new Error('Autonomy run was unexpectedly skipped.')
|
||||
}
|
||||
return record
|
||||
}
|
||||
|
||||
export async function createAutonomyRunIfNoActiveSource(
|
||||
params: CreateAutonomyRunParams & { sourceId: string },
|
||||
): Promise<AutonomyRunRecord | null> {
|
||||
return createAutonomyRunCore(params, true)
|
||||
}
|
||||
|
||||
function buildManagedFlowStepPrompt(
|
||||
flow: AutonomyFlowRecord,
|
||||
stepIndex: number,
|
||||
@@ -336,6 +583,7 @@ async function createOrRecoverManagedFlowStepCommand(params: {
|
||||
workload: params.workload,
|
||||
autonomy: {
|
||||
runId: run.runId,
|
||||
rootDir: run.rootDir,
|
||||
trigger: 'managed-flow-step',
|
||||
sourceId: run.sourceId,
|
||||
sourceLabel: run.sourceLabel,
|
||||
@@ -426,11 +674,16 @@ export async function markAutonomyRunRunning(
|
||||
): Promise<AutonomyRunRecord | null> {
|
||||
const updated = await updateAutonomyRun(
|
||||
runId,
|
||||
current => ({
|
||||
...current,
|
||||
status: 'running',
|
||||
startedAt: nowMs ?? Date.now(),
|
||||
}),
|
||||
current =>
|
||||
current.status === 'queued'
|
||||
? {
|
||||
...current,
|
||||
status: 'running',
|
||||
startedAt: nowMs ?? Date.now(),
|
||||
ownerProcessId: process.pid,
|
||||
ownerSessionId: getSessionId(),
|
||||
}
|
||||
: null,
|
||||
rootDir,
|
||||
)
|
||||
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
|
||||
@@ -451,12 +704,15 @@ export async function markAutonomyRunCompleted(
|
||||
): Promise<AutonomyRunRecord | null> {
|
||||
const updated = await updateAutonomyRun(
|
||||
runId,
|
||||
current => ({
|
||||
...current,
|
||||
status: 'completed',
|
||||
endedAt: nowMs ?? Date.now(),
|
||||
error: undefined,
|
||||
}),
|
||||
current =>
|
||||
current.status === 'queued' || current.status === 'running'
|
||||
? {
|
||||
...current,
|
||||
status: 'completed',
|
||||
endedAt: nowMs ?? Date.now(),
|
||||
error: undefined,
|
||||
}
|
||||
: null,
|
||||
rootDir,
|
||||
)
|
||||
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
|
||||
@@ -476,24 +732,17 @@ export async function markAutonomyRunFailed(
|
||||
rootDir?: string,
|
||||
nowMs?: number,
|
||||
): Promise<AutonomyRunRecord | null> {
|
||||
const endedAt = nowMs ?? Date.now()
|
||||
const updated = await updateAutonomyRun(
|
||||
runId,
|
||||
current => ({
|
||||
...current,
|
||||
status: 'failed',
|
||||
endedAt: nowMs ?? Date.now(),
|
||||
error,
|
||||
}),
|
||||
current =>
|
||||
isActiveAutonomyRunStatus(current.status)
|
||||
? failAutonomyRunRecord(current, error, endedAt)
|
||||
: null,
|
||||
rootDir,
|
||||
)
|
||||
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
|
||||
await markManagedAutonomyFlowStepFailed({
|
||||
flowId: updated.parentFlowId,
|
||||
runId: updated.runId,
|
||||
error,
|
||||
rootDir,
|
||||
nowMs: updated.endedAt,
|
||||
})
|
||||
if (updated) {
|
||||
await syncFailedManagedFlowForRun(updated, rootDir ?? updated.rootDir)
|
||||
}
|
||||
return updated
|
||||
}
|
||||
@@ -505,12 +754,15 @@ export async function markAutonomyRunCancelled(
|
||||
): Promise<AutonomyRunRecord | null> {
|
||||
const updated = await updateAutonomyRun(
|
||||
runId,
|
||||
current => ({
|
||||
...current,
|
||||
status: 'cancelled',
|
||||
endedAt: nowMs ?? Date.now(),
|
||||
error: undefined,
|
||||
}),
|
||||
current =>
|
||||
current.status === 'queued' || current.status === 'running'
|
||||
? {
|
||||
...current,
|
||||
status: 'cancelled',
|
||||
endedAt: nowMs ?? Date.now(),
|
||||
error: undefined,
|
||||
}
|
||||
: null,
|
||||
rootDir,
|
||||
)
|
||||
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
|
||||
@@ -612,6 +864,7 @@ export async function createAutonomyQueuedPrompt(params: {
|
||||
currentDir?: string
|
||||
sourceId?: string
|
||||
sourceLabel?: string
|
||||
ownerKey?: string
|
||||
workload?: string
|
||||
priority?: 'now' | 'next' | 'later'
|
||||
shouldCreate?: () => boolean
|
||||
@@ -634,39 +887,130 @@ export async function createAutonomyQueuedPrompt(params: {
|
||||
currentDir,
|
||||
sourceId: params.sourceId,
|
||||
sourceLabel: params.sourceLabel,
|
||||
ownerKey: params.ownerKey,
|
||||
workload: params.workload,
|
||||
priority: params.priority,
|
||||
flow: params.flow,
|
||||
})
|
||||
}
|
||||
|
||||
export async function createAutonomyQueuedPromptIfNoActiveSource(params: {
|
||||
trigger: AutonomyTriggerKind
|
||||
basePrompt: string
|
||||
rootDir?: string
|
||||
currentDir?: string
|
||||
sourceId: string
|
||||
sourceLabel?: string
|
||||
ownerKey?: string
|
||||
workload?: string
|
||||
priority?: 'now' | 'next' | 'later'
|
||||
shouldCreate?: () => boolean
|
||||
}): Promise<QueuedCommand | null> {
|
||||
const rootDir = resolve(params.rootDir ?? getProjectRoot())
|
||||
const currentDir = resolve(params.currentDir ?? getCwd())
|
||||
// Cheap optimistic pre-check: skip the AGENTS.md / HEARTBEAT.md disk
|
||||
// reads + prompt assembly when an active run for this source already
|
||||
// blocks dedup. The lock-side check inside persistAutonomyRunRecord
|
||||
// remains authoritative; this only fast-paths the common storm case.
|
||||
if (
|
||||
await hasActiveAutonomyRunForSource({
|
||||
trigger: params.trigger,
|
||||
sourceId: params.sourceId,
|
||||
rootDir,
|
||||
ownerKey: params.ownerKey,
|
||||
})
|
||||
) {
|
||||
return null
|
||||
}
|
||||
const prepared = await prepareAutonomyTurnPrompt({
|
||||
basePrompt: params.basePrompt,
|
||||
trigger: params.trigger,
|
||||
rootDir,
|
||||
currentDir,
|
||||
})
|
||||
if (params.shouldCreate && !params.shouldCreate()) {
|
||||
return null
|
||||
}
|
||||
return commitAutonomyQueuedPromptIfNoActiveSource({
|
||||
prepared,
|
||||
rootDir,
|
||||
currentDir,
|
||||
sourceId: params.sourceId,
|
||||
sourceLabel: params.sourceLabel,
|
||||
ownerKey: params.ownerKey,
|
||||
workload: params.workload,
|
||||
priority: params.priority,
|
||||
})
|
||||
}
|
||||
|
||||
export async function commitAutonomyQueuedPrompt(params: {
|
||||
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
|
||||
rootDir?: string
|
||||
currentDir?: string
|
||||
sourceId?: string
|
||||
sourceLabel?: string
|
||||
ownerKey?: string
|
||||
workload?: string
|
||||
priority?: 'now' | 'next' | 'later'
|
||||
flow?: AutonomyRunFlowRef
|
||||
}): Promise<QueuedCommand> {
|
||||
const command = await commitAutonomyQueuedPromptInternal(params, false)
|
||||
if (!command) {
|
||||
throw new Error('Autonomy queued prompt was unexpectedly skipped.')
|
||||
}
|
||||
return command
|
||||
}
|
||||
|
||||
async function commitAutonomyQueuedPromptIfNoActiveSource(params: {
|
||||
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
|
||||
rootDir?: string
|
||||
currentDir?: string
|
||||
sourceId: string
|
||||
sourceLabel?: string
|
||||
ownerKey?: string
|
||||
workload?: string
|
||||
priority?: 'now' | 'next' | 'later'
|
||||
}): Promise<QueuedCommand | null> {
|
||||
return commitAutonomyQueuedPromptInternal(params, true)
|
||||
}
|
||||
|
||||
async function commitAutonomyQueuedPromptInternal(
|
||||
params: {
|
||||
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
|
||||
rootDir?: string
|
||||
currentDir?: string
|
||||
sourceId?: string
|
||||
sourceLabel?: string
|
||||
ownerKey?: string
|
||||
workload?: string
|
||||
priority?: 'now' | 'next' | 'later'
|
||||
flow?: AutonomyRunFlowRef
|
||||
},
|
||||
skipWhenActiveSource: boolean,
|
||||
): Promise<QueuedCommand | null> {
|
||||
const rootDir = resolve(
|
||||
params.rootDir ?? params.prepared.rootDir ?? getProjectRoot(),
|
||||
)
|
||||
const currentDir = resolve(
|
||||
params.currentDir ?? params.prepared.currentDir ?? getCwd(),
|
||||
)
|
||||
commitPreparedAutonomyTurn(params.prepared)
|
||||
const value = params.prepared.prompt
|
||||
const run = await createAutonomyRun({
|
||||
const runParams: CreateAutonomyRunParams = {
|
||||
trigger: params.prepared.trigger,
|
||||
prompt: value,
|
||||
rootDir,
|
||||
currentDir,
|
||||
sourceId: params.sourceId,
|
||||
sourceLabel: params.sourceLabel,
|
||||
ownerKey: params.ownerKey,
|
||||
flow: params.flow,
|
||||
})
|
||||
}
|
||||
const useDedup = skipWhenActiveSource && Boolean(params.sourceId)
|
||||
const run = await createAutonomyRunCore(runParams, useDedup)
|
||||
if (!run) {
|
||||
return null
|
||||
}
|
||||
commitPreparedAutonomyTurn(params.prepared)
|
||||
const origin = {
|
||||
kind: 'autonomy',
|
||||
trigger: params.prepared.trigger,
|
||||
@@ -683,6 +1027,7 @@ export async function commitAutonomyQueuedPrompt(params: {
|
||||
workload: params.workload,
|
||||
autonomy: {
|
||||
runId: run.runId,
|
||||
rootDir: run.rootDir,
|
||||
trigger: params.prepared.trigger,
|
||||
sourceId: params.sourceId,
|
||||
sourceLabel: params.sourceLabel,
|
||||
|
||||
@@ -19,6 +19,7 @@ import {
|
||||
} from '../types/textInputTypes.js'
|
||||
import { createAbortController } from './abortController.js'
|
||||
import type { PastedContent } from './config.js'
|
||||
import { getCwd } from './cwd.js'
|
||||
import { logForDebugging } from './debug.js'
|
||||
import type { EffortValue } from './effort.js'
|
||||
import type { FileHistoryState } from './fileHistory.js'
|
||||
@@ -27,11 +28,9 @@ import { gracefulShutdownSync } from './gracefulShutdown.js'
|
||||
import { enqueue } from './messageQueueManager.js'
|
||||
import { resolveSkillModelOverride } from './model/model.js'
|
||||
import {
|
||||
finalizeAutonomyRunCompleted,
|
||||
finalizeAutonomyRunFailed,
|
||||
markAutonomyRunFailed,
|
||||
markAutonomyRunRunning,
|
||||
} from './autonomyRuns.js'
|
||||
claimConsumableQueuedAutonomyCommands,
|
||||
finalizeAutonomyCommandsForTurn,
|
||||
} from './autonomyQueueLifecycle.js'
|
||||
import type { ProcessUserInputContext } from './processUserInput/processUserInput.js'
|
||||
import { processUserInput } from './processUserInput/processUserInput.js'
|
||||
import type { QueryGuard } from './QueryGuard.js'
|
||||
@@ -459,7 +458,14 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
||||
// Iterate all commands uniformly. First command gets attachments +
|
||||
// ideSelection + pastedContents, rest skip attachments to avoid
|
||||
// duplicating turn-level context (IDE selection, todos, diffs).
|
||||
const commands = queuedCommands ?? []
|
||||
let commands = queuedCommands ?? []
|
||||
const queuedAutonomyClaim =
|
||||
await claimConsumableQueuedAutonomyCommands(commands)
|
||||
commands = queuedAutonomyClaim.attachmentCommands
|
||||
const claimedAutonomyCommands = queuedAutonomyClaim.claimedCommands
|
||||
if (commands.length === 0) {
|
||||
return
|
||||
}
|
||||
|
||||
// Compute the workload tag for this turn. queueProcessor can batch a
|
||||
// cron prompt with a same-tick human prompt; only tag when EVERY
|
||||
@@ -471,7 +477,7 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
||||
commands.every(c => c.workload === firstWorkload)
|
||||
? firstWorkload
|
||||
: undefined
|
||||
let autonomyRunIds: string[] | undefined
|
||||
const deferredAutonomyRunIds = new Set<string>()
|
||||
|
||||
// Wrap the entire turn (processUserInput loop + onQuery) in an
|
||||
// AsyncLocalStorage context. This is the ONLY way to correctly
|
||||
@@ -486,10 +492,7 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
||||
for (let i = 0; i < commands.length; i++) {
|
||||
const cmd = commands[i]!
|
||||
const isFirst = i === 0
|
||||
if (cmd.autonomy?.runId) {
|
||||
;(autonomyRunIds ??= []).push(cmd.autonomy.runId)
|
||||
await markAutonomyRunRunning(cmd.autonomy.runId)
|
||||
}
|
||||
const runId = cmd.autonomy?.runId
|
||||
const result = await processUserInput({
|
||||
input: cmd.value,
|
||||
preExpansionInput: cmd.preExpansionValue,
|
||||
@@ -510,7 +513,11 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
||||
bridgeOrigin: cmd.bridgeOrigin,
|
||||
isMeta: cmd.isMeta,
|
||||
skipAttachments: !isFirst,
|
||||
autonomy: cmd.autonomy,
|
||||
})
|
||||
if (runId && result.deferAutonomyCompletion) {
|
||||
deferredAutonomyRunIds.add(runId)
|
||||
}
|
||||
// Stamp origin here rather than threading another arg through
|
||||
// processUserInput → processUserInputBase → processTextPrompt → createUserMessage.
|
||||
// Derive origin from mode for task-notifications — mirrors the origin
|
||||
@@ -611,26 +618,35 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
||||
}
|
||||
}
|
||||
}) // end runWithWorkload — ALS context naturally scoped, no finally needed
|
||||
if (autonomyRunIds?.length) {
|
||||
for (const runId of autonomyRunIds) {
|
||||
const nextCommands = await finalizeAutonomyRunCompleted({
|
||||
runId,
|
||||
priority: 'later',
|
||||
workload: turnWorkload,
|
||||
})
|
||||
for (const nextCommand of nextCommands) {
|
||||
enqueue(nextCommand)
|
||||
}
|
||||
if (claimedAutonomyCommands.length) {
|
||||
const finalizableCommands = claimedAutonomyCommands.filter(command => {
|
||||
const runId = command.autonomy?.runId
|
||||
return !runId || !deferredAutonomyRunIds.has(runId)
|
||||
})
|
||||
const nextCommands = await finalizeAutonomyCommandsForTurn({
|
||||
commands: finalizableCommands,
|
||||
outcome: { type: 'completed' },
|
||||
currentDir: getCwd(),
|
||||
priority: 'later',
|
||||
workload: turnWorkload,
|
||||
})
|
||||
for (const nextCommand of nextCommands) {
|
||||
enqueue(nextCommand)
|
||||
}
|
||||
}
|
||||
} catch (error) {
|
||||
if (autonomyRunIds?.length) {
|
||||
for (const runId of autonomyRunIds) {
|
||||
await finalizeAutonomyRunFailed({
|
||||
runId,
|
||||
error: String(error),
|
||||
})
|
||||
}
|
||||
if (claimedAutonomyCommands.length) {
|
||||
const finalizableCommands = claimedAutonomyCommands.filter(command => {
|
||||
const runId = command.autonomy?.runId
|
||||
return !runId || !deferredAutonomyRunIds.has(runId)
|
||||
})
|
||||
await finalizeAutonomyCommandsForTurn({
|
||||
commands: finalizableCommands,
|
||||
outcome: { type: 'failed', error },
|
||||
currentDir: getCwd(),
|
||||
priority: 'later',
|
||||
workload: turnWorkload,
|
||||
})
|
||||
}
|
||||
throw error
|
||||
}
|
||||
|
||||
@@ -1,173 +1,162 @@
|
||||
import { describe, expect, test, beforeEach, afterEach } from "bun:test";
|
||||
import { mock } from "bun:test";
|
||||
import { describe, expect, test, beforeEach, afterEach } from 'bun:test'
|
||||
|
||||
let mockedModelType: "gemini" | undefined;
|
||||
const { getAPIProvider, isFirstPartyAnthropicBaseUrl } = await import(
|
||||
'../providers'
|
||||
)
|
||||
|
||||
mock.module("../../settings/settings.js", () => ({
|
||||
getInitialSettings: () =>
|
||||
mockedModelType ? { modelType: mockedModelType } : {},
|
||||
}));
|
||||
|
||||
const { getAPIProvider, isFirstPartyAnthropicBaseUrl } =
|
||||
await import("../providers");
|
||||
|
||||
describe("getAPIProvider", () => {
|
||||
describe('getAPIProvider', () => {
|
||||
const envKeys = [
|
||||
"CLAUDE_CODE_USE_GEMINI",
|
||||
"CLAUDE_CODE_USE_BEDROCK",
|
||||
"CLAUDE_CODE_USE_VERTEX",
|
||||
"CLAUDE_CODE_USE_FOUNDRY",
|
||||
"CLAUDE_CODE_USE_OPENAI",
|
||||
] as const;
|
||||
const savedEnv: Record<string, string | undefined> = {};
|
||||
|
||||
'CLAUDE_CODE_USE_GEMINI',
|
||||
'CLAUDE_CODE_USE_BEDROCK',
|
||||
'CLAUDE_CODE_USE_VERTEX',
|
||||
'CLAUDE_CODE_USE_FOUNDRY',
|
||||
'CLAUDE_CODE_USE_OPENAI',
|
||||
'CLAUDE_CODE_USE_GROK',
|
||||
] as const
|
||||
const savedEnv: Record<string, string | undefined> = {}
|
||||
|
||||
beforeEach(() => {
|
||||
// Save and clear environment variables
|
||||
mockedModelType = undefined;
|
||||
for (const key of envKeys) {
|
||||
savedEnv[key] = process.env[key];
|
||||
delete process.env[key];
|
||||
savedEnv[key] = process.env[key]
|
||||
delete process.env[key]
|
||||
}
|
||||
});
|
||||
})
|
||||
|
||||
afterEach(() => {
|
||||
// Restore environment variables
|
||||
mockedModelType = undefined;
|
||||
for (const key of envKeys) {
|
||||
if (savedEnv[key] !== undefined) {
|
||||
process.env[key] = savedEnv[key];
|
||||
process.env[key] = savedEnv[key]
|
||||
} else {
|
||||
delete process.env[key];
|
||||
delete process.env[key]
|
||||
}
|
||||
}
|
||||
});
|
||||
})
|
||||
|
||||
test('returns "firstParty" by default', () => {
|
||||
expect(getAPIProvider()).toBe("firstParty");
|
||||
});
|
||||
expect(getAPIProvider({})).toBe('firstParty')
|
||||
})
|
||||
|
||||
test('returns "gemini" when modelType is gemini', () => {
|
||||
mockedModelType = "gemini";
|
||||
expect(getAPIProvider()).toBe("gemini");
|
||||
});
|
||||
expect(getAPIProvider({ modelType: 'gemini' })).toBe('gemini')
|
||||
})
|
||||
|
||||
test("modelType takes precedence over environment variables", () => {
|
||||
mockedModelType = "gemini";
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||
expect(getAPIProvider()).toBe("gemini");
|
||||
});
|
||||
test('modelType takes precedence over environment variables', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
||||
expect(getAPIProvider({ modelType: 'gemini' })).toBe('gemini')
|
||||
})
|
||||
|
||||
test('returns "gemini" when CLAUDE_CODE_USE_GEMINI is set', () => {
|
||||
process.env.CLAUDE_CODE_USE_GEMINI = "1";
|
||||
expect(getAPIProvider()).toBe("gemini");
|
||||
});
|
||||
process.env.CLAUDE_CODE_USE_GEMINI = '1'
|
||||
expect(getAPIProvider({})).toBe('gemini')
|
||||
})
|
||||
|
||||
test('returns "bedrock" when CLAUDE_CODE_USE_BEDROCK is set', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||
expect(getAPIProvider()).toBe("bedrock");
|
||||
});
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
||||
expect(getAPIProvider({})).toBe('bedrock')
|
||||
})
|
||||
|
||||
test('returns "vertex" when CLAUDE_CODE_USE_VERTEX is set', () => {
|
||||
process.env.CLAUDE_CODE_USE_VERTEX = "1";
|
||||
expect(getAPIProvider()).toBe("vertex");
|
||||
});
|
||||
process.env.CLAUDE_CODE_USE_VERTEX = '1'
|
||||
expect(getAPIProvider({})).toBe('vertex')
|
||||
})
|
||||
|
||||
test('returns "foundry" when CLAUDE_CODE_USE_FOUNDRY is set', () => {
|
||||
process.env.CLAUDE_CODE_USE_FOUNDRY = "1";
|
||||
expect(getAPIProvider()).toBe("foundry");
|
||||
});
|
||||
process.env.CLAUDE_CODE_USE_FOUNDRY = '1'
|
||||
expect(getAPIProvider({})).toBe('foundry')
|
||||
})
|
||||
|
||||
test("bedrock takes precedence over gemini", () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||
process.env.CLAUDE_CODE_USE_GEMINI = "1";
|
||||
expect(getAPIProvider()).toBe("bedrock");
|
||||
});
|
||||
test('bedrock takes precedence over gemini', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
||||
process.env.CLAUDE_CODE_USE_GEMINI = '1'
|
||||
expect(getAPIProvider({})).toBe('bedrock')
|
||||
})
|
||||
|
||||
test("bedrock takes precedence over vertex", () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||
process.env.CLAUDE_CODE_USE_VERTEX = "1";
|
||||
expect(getAPIProvider()).toBe("bedrock");
|
||||
});
|
||||
test('bedrock takes precedence over vertex', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
||||
process.env.CLAUDE_CODE_USE_VERTEX = '1'
|
||||
expect(getAPIProvider({})).toBe('bedrock')
|
||||
})
|
||||
|
||||
test("bedrock wins when all three env vars are set", () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||
process.env.CLAUDE_CODE_USE_VERTEX = "1";
|
||||
process.env.CLAUDE_CODE_USE_FOUNDRY = "1";
|
||||
expect(getAPIProvider()).toBe("bedrock");
|
||||
});
|
||||
test('bedrock wins when all three env vars are set', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
||||
process.env.CLAUDE_CODE_USE_VERTEX = '1'
|
||||
process.env.CLAUDE_CODE_USE_FOUNDRY = '1'
|
||||
expect(getAPIProvider({})).toBe('bedrock')
|
||||
})
|
||||
|
||||
test('"true" is truthy', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "true";
|
||||
expect(getAPIProvider()).toBe("bedrock");
|
||||
});
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = 'true'
|
||||
expect(getAPIProvider({})).toBe('bedrock')
|
||||
})
|
||||
|
||||
test('"0" is not truthy', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "0";
|
||||
expect(getAPIProvider()).toBe("firstParty");
|
||||
});
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = '0'
|
||||
expect(getAPIProvider({})).toBe('firstParty')
|
||||
})
|
||||
|
||||
test('empty string is not truthy', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "";
|
||||
expect(getAPIProvider()).toBe("firstParty");
|
||||
});
|
||||
});
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = ''
|
||||
expect(getAPIProvider({})).toBe('firstParty')
|
||||
})
|
||||
})
|
||||
|
||||
describe("isFirstPartyAnthropicBaseUrl", () => {
|
||||
const originalBaseUrl = process.env.ANTHROPIC_BASE_URL;
|
||||
const originalUserType = process.env.USER_TYPE;
|
||||
describe('isFirstPartyAnthropicBaseUrl', () => {
|
||||
const originalBaseUrl = process.env.ANTHROPIC_BASE_URL
|
||||
const originalUserType = process.env.USER_TYPE
|
||||
|
||||
afterEach(() => {
|
||||
if (originalBaseUrl !== undefined) {
|
||||
process.env.ANTHROPIC_BASE_URL = originalBaseUrl;
|
||||
process.env.ANTHROPIC_BASE_URL = originalBaseUrl
|
||||
} else {
|
||||
delete process.env.ANTHROPIC_BASE_URL;
|
||||
delete process.env.ANTHROPIC_BASE_URL
|
||||
}
|
||||
if (originalUserType !== undefined) {
|
||||
process.env.USER_TYPE = originalUserType;
|
||||
process.env.USER_TYPE = originalUserType
|
||||
} else {
|
||||
delete process.env.USER_TYPE;
|
||||
delete process.env.USER_TYPE
|
||||
}
|
||||
});
|
||||
})
|
||||
|
||||
test("returns true when ANTHROPIC_BASE_URL is not set", () => {
|
||||
delete process.env.ANTHROPIC_BASE_URL;
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||
});
|
||||
test('returns true when ANTHROPIC_BASE_URL is not set', () => {
|
||||
delete process.env.ANTHROPIC_BASE_URL
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
||||
})
|
||||
|
||||
test("returns true for api.anthropic.com", () => {
|
||||
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com";
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||
});
|
||||
test('returns true for api.anthropic.com', () => {
|
||||
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com'
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
||||
})
|
||||
|
||||
test("returns false for custom URL", () => {
|
||||
process.env.ANTHROPIC_BASE_URL = "https://my-proxy.com";
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
|
||||
});
|
||||
test('returns false for custom URL', () => {
|
||||
process.env.ANTHROPIC_BASE_URL = 'https://my-proxy.com'
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
|
||||
})
|
||||
|
||||
test("returns false for invalid URL", () => {
|
||||
process.env.ANTHROPIC_BASE_URL = "not-a-url";
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
|
||||
});
|
||||
test('returns false for invalid URL', () => {
|
||||
process.env.ANTHROPIC_BASE_URL = 'not-a-url'
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
|
||||
})
|
||||
|
||||
test("returns true for staging URL when USER_TYPE is ant", () => {
|
||||
process.env.ANTHROPIC_BASE_URL = "https://api-staging.anthropic.com";
|
||||
process.env.USER_TYPE = "ant";
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||
});
|
||||
test('returns true for staging URL when USER_TYPE is ant', () => {
|
||||
process.env.ANTHROPIC_BASE_URL = 'https://api-staging.anthropic.com'
|
||||
process.env.USER_TYPE = 'ant'
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
||||
})
|
||||
|
||||
test("returns true for URL with path", () => {
|
||||
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com/v1";
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||
});
|
||||
test('returns true for URL with path', () => {
|
||||
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com/v1'
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
||||
})
|
||||
|
||||
test("returns true for trailing slash", () => {
|
||||
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com/";
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||
});
|
||||
test('returns true for trailing slash', () => {
|
||||
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com/'
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
||||
})
|
||||
|
||||
test("returns false for subdomain attack", () => {
|
||||
process.env.ANTHROPIC_BASE_URL = "https://evil-api.anthropic.com";
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
|
||||
});
|
||||
});
|
||||
test('returns false for subdomain attack', () => {
|
||||
process.env.ANTHROPIC_BASE_URL = 'https://evil-api.anthropic.com'
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
|
||||
})
|
||||
})
|
||||
|
||||
@@ -1,5 +1,6 @@
|
||||
import type { AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS } from '../../services/analytics/index.js'
|
||||
import { getInitialSettings } from '../settings/settings.js'
|
||||
import type { SettingsJson } from '../settings/types.js'
|
||||
import { isEnvTruthy } from '../envUtils.js'
|
||||
|
||||
export type APIProvider =
|
||||
@@ -11,8 +12,10 @@ export type APIProvider =
|
||||
| 'gemini'
|
||||
| 'grok'
|
||||
|
||||
export function getAPIProvider(): APIProvider {
|
||||
const modelType = getInitialSettings().modelType
|
||||
export function getAPIProvider(
|
||||
settings: Pick<SettingsJson, 'modelType'> = getInitialSettings(),
|
||||
): APIProvider {
|
||||
const modelType = settings.modelType
|
||||
if (modelType === 'openai') return 'openai'
|
||||
if (modelType === 'gemini') return 'gemini'
|
||||
if (modelType === 'grok') return 'grok'
|
||||
|
||||
375
src/utils/processUserInput/__tests__/processSlashCommand.test.ts
Normal file
375
src/utils/processUserInput/__tests__/processSlashCommand.test.ts
Normal file
@@ -0,0 +1,375 @@
|
||||
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
|
||||
import type { QueuedCommand } from '../../../types/textInputTypes'
|
||||
import {
|
||||
resetStateForTests,
|
||||
setCwdState,
|
||||
setOriginalCwd,
|
||||
setProjectRoot,
|
||||
} from '../../../bootstrap/state'
|
||||
import {
|
||||
createAutonomyQueuedPrompt,
|
||||
getAutonomyRunById,
|
||||
listAutonomyRuns,
|
||||
markAutonomyRunRunning,
|
||||
} from '../../autonomyRuns'
|
||||
import { resetAutonomyAuthorityForTests } from '../../autonomyAuthority'
|
||||
import { createScheduledTaskQueuedCommand } from '../../../hooks/useScheduledTasks'
|
||||
import {
|
||||
cleanupTempDir,
|
||||
createTempDir,
|
||||
} from '../../../../tests/mocks/file-system'
|
||||
|
||||
let runAgentBlocker: Promise<void> | null = null
|
||||
let releaseRunAgentBlocker: (() => void) | null = null
|
||||
let runAgentStartCount = 0
|
||||
let originalNodeEnv: string | undefined
|
||||
let originalAnthropicApiKey: string | undefined
|
||||
const commandQueue: QueuedCommand[] = []
|
||||
|
||||
function enqueue(command: QueuedCommand): void {
|
||||
commandQueue.push({ ...command, priority: command.priority ?? 'next' })
|
||||
}
|
||||
|
||||
function enqueuePendingNotification(command: QueuedCommand): void {
|
||||
commandQueue.push({ ...command, priority: command.priority ?? 'later' })
|
||||
}
|
||||
|
||||
function getCommandQueue(): QueuedCommand[] {
|
||||
return [...commandQueue]
|
||||
}
|
||||
|
||||
function hasCommandsInQueue(): boolean {
|
||||
return commandQueue.length > 0
|
||||
}
|
||||
|
||||
function resetCommandQueue(): void {
|
||||
commandQueue.length = 0
|
||||
}
|
||||
|
||||
function createMessageQueueManagerMock() {
|
||||
return {
|
||||
enqueue,
|
||||
enqueuePendingNotification,
|
||||
getCommandQueue,
|
||||
hasCommandsInQueue,
|
||||
resetCommandQueue,
|
||||
}
|
||||
}
|
||||
|
||||
function holdRunAgent(): void {
|
||||
runAgentBlocker = new Promise(resolve => {
|
||||
releaseRunAgentBlocker = resolve
|
||||
})
|
||||
}
|
||||
|
||||
function releaseRunAgent(): void {
|
||||
releaseRunAgentBlocker?.()
|
||||
runAgentBlocker = null
|
||||
releaseRunAgentBlocker = null
|
||||
}
|
||||
|
||||
mock.module('bun:bundle', () => ({
|
||||
feature: (name: string) => name === 'KAIROS',
|
||||
}))
|
||||
|
||||
mock.module(
|
||||
'@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
|
||||
() => ({
|
||||
runAgent: async function* () {
|
||||
runAgentStartCount += 1
|
||||
if (runAgentBlocker) {
|
||||
await runAgentBlocker
|
||||
}
|
||||
yield {
|
||||
type: 'assistant',
|
||||
uuid: 'assistant-1',
|
||||
timestamp: new Date().toISOString(),
|
||||
message: {
|
||||
id: 'msg_1',
|
||||
type: 'message',
|
||||
role: 'assistant',
|
||||
model: 'test-model',
|
||||
content: [{ type: 'text', text: 'forked command done' }],
|
||||
stop_reason: 'end_turn',
|
||||
stop_sequence: null,
|
||||
usage: {
|
||||
input_tokens: 0,
|
||||
output_tokens: 0,
|
||||
},
|
||||
},
|
||||
}
|
||||
},
|
||||
}),
|
||||
)
|
||||
|
||||
mock.module('@claude-code-best/builtin-tools/tools/AgentTool/UI.js', () => ({
|
||||
AgentPromptDisplay: () => null,
|
||||
AgentResponseDisplay: () => null,
|
||||
extractLastToolInfo: () => null,
|
||||
renderGroupedAgentToolUse: () => null,
|
||||
renderToolResultMessage: () => null,
|
||||
renderToolUseErrorMessage: () => null,
|
||||
renderToolUseMessage: () => null,
|
||||
renderToolUseProgressMessage: () => null,
|
||||
renderToolUseRejectedMessage: () => null,
|
||||
renderToolUseTag: () => null,
|
||||
userFacingName: () => 'Agent',
|
||||
userFacingNameBackgroundColor: () => 'gray',
|
||||
}))
|
||||
|
||||
mock.module('../../messageQueueManager', createMessageQueueManagerMock)
|
||||
mock.module('../../messageQueueManager.js', createMessageQueueManagerMock)
|
||||
|
||||
const { processSlashCommand } = await import('../processSlashCommand')
|
||||
|
||||
let tempDir = ''
|
||||
|
||||
function createScheduledTaskQueuedCommandForTest(task: {
|
||||
id: string
|
||||
prompt: string
|
||||
}) {
|
||||
return createScheduledTaskQueuedCommand(task, {
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
}
|
||||
|
||||
async function waitForRunStatus(
|
||||
runId: string,
|
||||
status: 'queued' | 'running' | 'completed' | 'failed' | 'cancelled',
|
||||
): Promise<void> {
|
||||
for (let i = 0; i < 200; i++) {
|
||||
const run = await getAutonomyRunById(runId, tempDir)
|
||||
if (run?.status === status) {
|
||||
return
|
||||
}
|
||||
await new Promise(resolve => setTimeout(resolve, 10))
|
||||
}
|
||||
const run = await getAutonomyRunById(runId, tempDir)
|
||||
throw new Error(`Expected ${runId} to be ${status}, got ${run?.status}`)
|
||||
}
|
||||
|
||||
async function waitForRunAgentStarts(expected: number): Promise<void> {
|
||||
for (let i = 0; i < 200; i++) {
|
||||
if (runAgentStartCount >= expected) {
|
||||
return
|
||||
}
|
||||
await new Promise(resolve => setTimeout(resolve, 10))
|
||||
}
|
||||
throw new Error(
|
||||
`Expected runAgent to start ${expected} time(s), got ${runAgentStartCount}`,
|
||||
)
|
||||
}
|
||||
|
||||
async function waitForCommandQueueLength(expected: number): Promise<void> {
|
||||
for (let i = 0; i < 200; i++) {
|
||||
if (getCommandQueue().length === expected) {
|
||||
return
|
||||
}
|
||||
await new Promise(resolve => setTimeout(resolve, 10))
|
||||
}
|
||||
throw new Error(
|
||||
`Expected command queue length ${expected}, got ${getCommandQueue().length}`,
|
||||
)
|
||||
}
|
||||
|
||||
beforeEach(async () => {
|
||||
tempDir = await createTempDir('process-slash-command-')
|
||||
originalNodeEnv = process.env.NODE_ENV
|
||||
originalAnthropicApiKey = process.env.ANTHROPIC_API_KEY
|
||||
process.env.NODE_ENV = 'test'
|
||||
process.env.ANTHROPIC_API_KEY = 'test-key'
|
||||
runAgentBlocker = null
|
||||
releaseRunAgentBlocker = null
|
||||
runAgentStartCount = 0
|
||||
resetStateForTests()
|
||||
resetAutonomyAuthorityForTests()
|
||||
resetCommandQueue()
|
||||
setOriginalCwd(tempDir)
|
||||
setProjectRoot(tempDir)
|
||||
setCwdState(tempDir)
|
||||
})
|
||||
|
||||
afterEach(async () => {
|
||||
releaseRunAgent()
|
||||
if (originalNodeEnv === undefined) {
|
||||
delete process.env.NODE_ENV
|
||||
} else {
|
||||
process.env.NODE_ENV = originalNodeEnv
|
||||
}
|
||||
if (originalAnthropicApiKey === undefined) {
|
||||
delete process.env.ANTHROPIC_API_KEY
|
||||
} else {
|
||||
process.env.ANTHROPIC_API_KEY = originalAnthropicApiKey
|
||||
}
|
||||
resetStateForTests()
|
||||
resetAutonomyAuthorityForTests()
|
||||
resetCommandQueue()
|
||||
if (tempDir) {
|
||||
await cleanupTempDir(tempDir)
|
||||
}
|
||||
mock.restore()
|
||||
})
|
||||
|
||||
describe('processSlashCommand', () => {
|
||||
const forkedCommand = {
|
||||
type: 'prompt',
|
||||
name: 'forked',
|
||||
description: 'test forked command',
|
||||
progressMessage: 'forking',
|
||||
contentLength: 0,
|
||||
source: 'builtin',
|
||||
context: 'fork',
|
||||
getPromptForCommand: async () => [
|
||||
{ type: 'text', text: 'review from fork' },
|
||||
],
|
||||
} as const
|
||||
|
||||
function createContext() {
|
||||
return {
|
||||
getAppState: () => ({
|
||||
kairosEnabled: true,
|
||||
mcp: { clients: [] },
|
||||
toolPermissionContext: {
|
||||
mode: 'default',
|
||||
alwaysAllowRules: {},
|
||||
},
|
||||
}),
|
||||
options: {
|
||||
commands: [forkedCommand],
|
||||
allowBackgroundForkedSlashCommands: true,
|
||||
tools: [],
|
||||
refreshTools: () => [],
|
||||
agentDefinitions: {
|
||||
activeAgents: [{ agentType: 'general-purpose' }],
|
||||
},
|
||||
},
|
||||
setResponseLength: mock((_updater: (length: number) => number) => {}),
|
||||
} as any
|
||||
}
|
||||
|
||||
test('defers autonomy completion until a KAIROS background forked command completes', async () => {
|
||||
const queued = await createAutonomyQueuedPrompt({
|
||||
basePrompt: '/forked review',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
})
|
||||
expect(queued).not.toBeNull()
|
||||
const runId = queued!.autonomy!.runId
|
||||
await markAutonomyRunRunning(runId, tempDir, 100)
|
||||
|
||||
const result = await processSlashCommand(
|
||||
'/forked review',
|
||||
[],
|
||||
[],
|
||||
[],
|
||||
createContext(),
|
||||
mock(() => {}),
|
||||
undefined,
|
||||
false,
|
||||
async () => ({ behavior: 'allow', updatedInput: {} }) as any,
|
||||
queued!.autonomy,
|
||||
)
|
||||
|
||||
expect(result).toMatchObject({
|
||||
messages: [],
|
||||
shouldQuery: false,
|
||||
deferAutonomyCompletion: true,
|
||||
})
|
||||
|
||||
await waitForRunStatus(runId, 'completed')
|
||||
await waitForCommandQueueLength(1)
|
||||
expect(getCommandQueue()).toEqual([
|
||||
expect.objectContaining({
|
||||
mode: 'prompt',
|
||||
isMeta: true,
|
||||
skipSlashCommands: true,
|
||||
value: expect.stringContaining(
|
||||
'<scheduled-task-result command="/forked">',
|
||||
),
|
||||
}),
|
||||
])
|
||||
})
|
||||
|
||||
test('keeps repeated /loop scheduled fires bounded while a background fork is running', async () => {
|
||||
const task = {
|
||||
id: 'cron-loop',
|
||||
prompt: '/forked review',
|
||||
}
|
||||
const first = await createScheduledTaskQueuedCommandForTest(task)
|
||||
expect(first?.autonomy?.runId).toBeDefined()
|
||||
const runId = first!.autonomy!.runId
|
||||
await markAutonomyRunRunning(runId, tempDir, 100)
|
||||
|
||||
holdRunAgent()
|
||||
const result = await processSlashCommand(
|
||||
'/forked review',
|
||||
[],
|
||||
[],
|
||||
[],
|
||||
createContext(),
|
||||
mock(() => {}),
|
||||
undefined,
|
||||
false,
|
||||
async () => ({ behavior: 'allow', updatedInput: {} }) as any,
|
||||
first!.autonomy,
|
||||
)
|
||||
|
||||
expect(result.deferAutonomyCompletion).toBe(true)
|
||||
await waitForRunAgentStarts(1)
|
||||
|
||||
const repeatedFires = await Promise.all(
|
||||
Array.from({ length: 200 }, () =>
|
||||
createScheduledTaskQueuedCommandForTest(task),
|
||||
),
|
||||
)
|
||||
expect(repeatedFires.every(command => command === null)).toBe(true)
|
||||
expect(
|
||||
(await listAutonomyRuns(tempDir)).filter(
|
||||
run => run.sourceId === 'cron-loop',
|
||||
),
|
||||
).toHaveLength(1)
|
||||
expect(getCommandQueue()).toHaveLength(0)
|
||||
|
||||
releaseRunAgent()
|
||||
await waitForRunStatus(runId, 'completed')
|
||||
await waitForCommandQueueLength(1)
|
||||
expect(getCommandQueue()).toHaveLength(1)
|
||||
|
||||
const next = await createScheduledTaskQueuedCommandForTest(task)
|
||||
expect(next?.autonomy?.runId).toBeDefined()
|
||||
expect(
|
||||
(await listAutonomyRuns(tempDir)).filter(
|
||||
run => run.sourceId === 'cron-loop',
|
||||
),
|
||||
).toHaveLength(2)
|
||||
})
|
||||
|
||||
test('rejects the background fork test override outside test runtime', async () => {
|
||||
process.env.NODE_ENV = 'production'
|
||||
|
||||
const result = await processSlashCommand(
|
||||
'/forked review',
|
||||
[],
|
||||
[],
|
||||
[],
|
||||
createContext(),
|
||||
mock(() => {}),
|
||||
undefined,
|
||||
false,
|
||||
async () => ({ behavior: 'allow', updatedInput: {} }) as any,
|
||||
)
|
||||
|
||||
expect(result.shouldQuery).toBe(false)
|
||||
expect(
|
||||
result.messages.some(message =>
|
||||
JSON.stringify(message).includes(
|
||||
'allowBackgroundForkedSlashCommands is test-only',
|
||||
),
|
||||
),
|
||||
).toBe(true)
|
||||
expect(runAgentStartCount).toBe(0)
|
||||
})
|
||||
})
|
||||
File diff suppressed because it is too large
Load Diff
@@ -28,6 +28,7 @@ import type {
|
||||
import type { PermissionMode } from '../../types/permissions.js'
|
||||
import {
|
||||
isValidImagePaste,
|
||||
type QueuedCommand,
|
||||
type PromptInputMode,
|
||||
} from '../../types/textInputTypes.js'
|
||||
import {
|
||||
@@ -80,6 +81,9 @@ export type ProcessUserInputBaseResult = {
|
||||
// Used by /discover to chain into the selected feature's command
|
||||
nextInput?: string
|
||||
submitNextInput?: boolean
|
||||
// When true, the command started detached work that will finalize its
|
||||
// autonomy run after the background work completes.
|
||||
deferAutonomyCompletion?: boolean
|
||||
}
|
||||
|
||||
export async function processUserInput({
|
||||
@@ -100,6 +104,7 @@ export async function processUserInput({
|
||||
bridgeOrigin,
|
||||
isMeta,
|
||||
skipAttachments,
|
||||
autonomy,
|
||||
}: {
|
||||
input: string | Array<ContentBlockParam>
|
||||
/**
|
||||
@@ -137,6 +142,7 @@ export async function processUserInput({
|
||||
*/
|
||||
isMeta?: boolean
|
||||
skipAttachments?: boolean
|
||||
autonomy?: QueuedCommand['autonomy']
|
||||
}): Promise<ProcessUserInputBaseResult> {
|
||||
const inputString = typeof input === 'string' ? input : null
|
||||
// Immediately show the user input prompt while we are still processing the input.
|
||||
@@ -168,6 +174,7 @@ export async function processUserInput({
|
||||
isMeta,
|
||||
skipAttachments,
|
||||
preExpansionInput,
|
||||
autonomy,
|
||||
)
|
||||
queryCheckpoint('query_process_user_input_base_end')
|
||||
|
||||
@@ -296,6 +303,7 @@ async function processUserInputBase(
|
||||
isMeta?: boolean,
|
||||
skipAttachments?: boolean,
|
||||
preExpansionInput?: string,
|
||||
autonomy?: QueuedCommand['autonomy'],
|
||||
): Promise<ProcessUserInputBaseResult> {
|
||||
let inputString: string | null = null
|
||||
let precedingInputBlocks: ContentBlockParam[] = []
|
||||
@@ -491,6 +499,7 @@ async function processUserInputBase(
|
||||
uuid,
|
||||
isAlreadyProcessing,
|
||||
canUseTool,
|
||||
autonomy,
|
||||
)
|
||||
return addImageMetadataMessage(slashResult, imageMetadataTexts)
|
||||
}
|
||||
@@ -549,6 +558,7 @@ async function processUserInputBase(
|
||||
uuid,
|
||||
isAlreadyProcessing,
|
||||
canUseTool,
|
||||
autonomy,
|
||||
)
|
||||
return addImageMetadataMessage(slashResult, imageMetadataTexts)
|
||||
}
|
||||
|
||||
@@ -424,8 +424,7 @@ function createInProcessCanUseTool(
|
||||
feedback: parsed.error,
|
||||
})
|
||||
}
|
||||
cleanup()
|
||||
return
|
||||
return // Callback already resolves the promise
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -675,6 +674,7 @@ type WaitResult =
|
||||
type: 'new_message'
|
||||
message: string
|
||||
autonomyRunId?: string
|
||||
autonomyRootDir?: string
|
||||
from: string
|
||||
color?: string
|
||||
summary?: string
|
||||
@@ -739,12 +739,16 @@ async function waitForNextPromptOrShutdown(
|
||||
`[inProcessRunner] ${identity.agentName} found pending user message (poll #${pollCount})`,
|
||||
)
|
||||
if (pending.autonomyRunId) {
|
||||
await markAutonomyRunRunning(pending.autonomyRunId)
|
||||
await markAutonomyRunRunning(
|
||||
pending.autonomyRunId,
|
||||
pending.autonomyRootDir,
|
||||
)
|
||||
}
|
||||
return {
|
||||
type: 'new_message',
|
||||
message: pending.message,
|
||||
autonomyRunId: pending.autonomyRunId,
|
||||
autonomyRootDir: pending.autonomyRootDir,
|
||||
from: 'user',
|
||||
}
|
||||
}
|
||||
@@ -1022,6 +1026,7 @@ export async function runInProcessTeammate(
|
||||
)
|
||||
let currentPrompt = wrappedInitialPrompt
|
||||
let currentAutonomyRunId: string | undefined
|
||||
let currentAutonomyRootDir: string | undefined
|
||||
let shouldExit = false
|
||||
|
||||
// Try to claim an available task immediately so the UI can show activity
|
||||
@@ -1319,12 +1324,21 @@ export async function runInProcessTeammate(
|
||||
setAppState,
|
||||
)
|
||||
if (currentAutonomyRunId) {
|
||||
await markAutonomyRunFailed(currentAutonomyRunId, ERROR_MESSAGE_USER_ABORT)
|
||||
await markAutonomyRunFailed(
|
||||
currentAutonomyRunId,
|
||||
ERROR_MESSAGE_USER_ABORT,
|
||||
currentAutonomyRootDir,
|
||||
)
|
||||
currentAutonomyRunId = undefined
|
||||
currentAutonomyRootDir = undefined
|
||||
}
|
||||
} else if (currentAutonomyRunId) {
|
||||
await markAutonomyRunCompleted(currentAutonomyRunId)
|
||||
await markAutonomyRunCompleted(
|
||||
currentAutonomyRunId,
|
||||
currentAutonomyRootDir,
|
||||
)
|
||||
currentAutonomyRunId = undefined
|
||||
currentAutonomyRootDir = undefined
|
||||
}
|
||||
|
||||
// Check if already idle before updating (to skip duplicate notification)
|
||||
@@ -1398,6 +1412,7 @@ export async function runInProcessTeammate(
|
||||
setAppState,
|
||||
)
|
||||
currentAutonomyRunId = undefined
|
||||
currentAutonomyRootDir = undefined
|
||||
break
|
||||
|
||||
case 'new_message':
|
||||
@@ -1410,6 +1425,7 @@ export async function runInProcessTeammate(
|
||||
if (waitResult.from === 'user') {
|
||||
currentPrompt = waitResult.message
|
||||
currentAutonomyRunId = waitResult.autonomyRunId
|
||||
currentAutonomyRootDir = waitResult.autonomyRootDir
|
||||
} else {
|
||||
currentPrompt = formatAsTeammateMessage(
|
||||
waitResult.from,
|
||||
@@ -1426,6 +1442,7 @@ export async function runInProcessTeammate(
|
||||
setAppState,
|
||||
)
|
||||
currentAutonomyRunId = undefined
|
||||
currentAutonomyRootDir = undefined
|
||||
}
|
||||
break
|
||||
|
||||
@@ -1533,7 +1550,11 @@ export async function runInProcessTeammate(
|
||||
})
|
||||
}
|
||||
if (currentAutonomyRunId) {
|
||||
await markAutonomyRunFailed(currentAutonomyRunId, errorMessage)
|
||||
await markAutonomyRunFailed(
|
||||
currentAutonomyRunId,
|
||||
errorMessage,
|
||||
currentAutonomyRootDir,
|
||||
)
|
||||
}
|
||||
|
||||
// Send idle notification with failure via file-based mailbox
|
||||
|
||||
@@ -234,7 +234,7 @@ export function killInProcessTeammate(
|
||||
let agentId: string | null = null
|
||||
let toolUseId: string | undefined
|
||||
let description: string | undefined
|
||||
let pendingAutonomyRunIds: string[] = []
|
||||
let pendingAutonomyRuns: Array<{ runId: string; rootDir?: string }> = []
|
||||
|
||||
setAppState((prev: AppState) => {
|
||||
const task = prev.tasks[taskId]
|
||||
@@ -255,9 +255,18 @@ export function killInProcessTeammate(
|
||||
description = teammateTask.description
|
||||
|
||||
// Capture pending autonomy run IDs before clearing them
|
||||
pendingAutonomyRunIds = teammateTask.pendingUserMessages
|
||||
.map(message => message.autonomyRunId)
|
||||
.filter((runId): runId is string => runId !== undefined)
|
||||
pendingAutonomyRuns = teammateTask.pendingUserMessages.flatMap(message =>
|
||||
message.autonomyRunId
|
||||
? [
|
||||
{
|
||||
runId: message.autonomyRunId,
|
||||
...(message.autonomyRootDir
|
||||
? { rootDir: message.autonomyRootDir }
|
||||
: {}),
|
||||
},
|
||||
]
|
||||
: [],
|
||||
)
|
||||
|
||||
// Abort the controller to stop execution
|
||||
teammateTask.abortController?.abort()
|
||||
@@ -311,10 +320,11 @@ export function killInProcessTeammate(
|
||||
}
|
||||
|
||||
if (killed) {
|
||||
for (const runId of pendingAutonomyRunIds) {
|
||||
for (const run of pendingAutonomyRuns) {
|
||||
void markAutonomyRunFailed(
|
||||
runId,
|
||||
run.runId,
|
||||
`Teammate ${agentId ?? taskId} was stopped before it could consume the queued autonomy prompt.`,
|
||||
run.rootDir,
|
||||
)
|
||||
}
|
||||
void evictTaskOutput(taskId)
|
||||
|
||||
Reference in New Issue
Block a user