mirror of
https://github.com/claude-code-best/claude-code.git
synced 2026-06-17 13:55:50 +00:00
feat: harden autonomy lifecycle, OOM bounds, and provider-boundary finalization
This PR consolidates a coordinated batch of fixes around autonomy run/flow lifecycle, scheduled task deduplication, provider-boundary state finalization, and matching memory-bound treatments for adjacent long-running subsystems (REPL fullscreen scrollback, skill-search/skill-learning runtime activation). All changes were developed and reviewed together because they touched the same lifecycle invariants and were uncovered by the same long-running session reproductions.
## Lifecycle correctness
- Queued autonomy prompts are not injected unless the persisted run was successfully claimed. Claiming a queued run is now terminal-safe, so a run that has already been consumed, cancelled, or failed cannot slip back into `queued`.
- Autonomy run/flow finalization happens on completion, provider error, generator close, and cancellation, not just on the happy path. New `src/__tests__/queryAutonomyProviderBoundary.test.ts` covers these provider-boundary transitions.
- `requestManagedAutonomyFlowCancel` and `resumeManagedAutonomyFlowPrompt` carry `rootDir` and `currentDir` explicitly across detached async boundaries (proactive-tick, cron, daemon restart) instead of inferring from process state.
- Active runs/flows are protected from janitor pruning so a running step cannot be garbage-collected mid-flight (`src/utils/autonomyAuthority.ts`).
- Heartbeat parser now ignores fenced code blocks; the two-phase commit window for autonomy state transitions is documented in `docs/internals/autonomy-jira.md`.
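
The claim rule in the first bullet can be sketched as a guarded state transition. This is a minimal in-memory illustration, not the code in `src/utils/autonomyRuns.ts`; `RunRecord`, `claimQueuedRun`, `finalizeRun`, and `promptsToInject` are hypothetical names, and the real implementation works against persisted state.

```typescript
// Hypothetical types and names for illustration; not the project's actual API.
type RunStatus = 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'

interface RunRecord {
  runId: string
  status: RunStatus
}

const TERMINAL: ReadonlySet<RunStatus> = new Set<RunStatus>([
  'completed',
  'failed',
  'cancelled',
])

// Attempt the queued -> running transition. Returns true only when this
// caller won the claim; every other starting state is left untouched.
function claimQueuedRun(store: Map<string, RunRecord>, runId: string): boolean {
  const run = store.get(runId)
  if (!run || run.status !== 'queued') return false
  store.set(runId, { ...run, status: 'running' })
  return true
}

// Terminal states are sticky: a finalized run never changes status again.
function finalizeRun(
  store: Map<string, RunRecord>,
  runId: string,
  status: 'completed' | 'failed' | 'cancelled',
): boolean {
  const run = store.get(runId)
  if (!run || TERMINAL.has(run.status)) return false
  store.set(runId, { ...run, status })
  return true
}

// Only prompts whose run was successfully claimed are injected.
function promptsToInject(
  store: Map<string, RunRecord>,
  queued: { runId: string; prompt: string }[],
): string[] {
  return queued.filter(c => claimQueuedRun(store, c.runId)).map(c => c.prompt)
}
```

A stale command (one whose run was cancelled before it was dequeued) fails the claim and is dropped, which is the behavior the `skips stale autonomy commands in the idle queued path` test in this commit checks for.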
## Ownership and dedup
- `src/utils/autonomyRuns.ts`: ownership stamping (run id + rootDir carried end-to-end), source-based dedup against active runs.
- `src/hooks/useScheduledTasks.ts`: scheduled ticks deduplicate against runs already active on the same source label.
- `src/utils/processUserInput/processSlashCommand.tsx`: forked slash commands now thread the autonomy `runId` so completion finalizers can find the originating run for deferred completion.
- New `src/utils/autonomyQueueLifecycle.ts` and tests collect the queue-side lifecycle invariants in one place.
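
As a rough sketch of source-based dedup, assuming an in-memory list of runs: the function name below is modeled loosely on `createAutonomyQueuedPromptIfNoActiveSource` from the diff, but the `ActiveRun` shape, the `nextId` parameter, and the matching rule (same `rootDir` and same `sourceId`) are assumptions made for illustration.

```typescript
// Hypothetical shapes; the real records live in src/utils/autonomyRuns.ts.
interface ActiveRun {
  runId: string
  rootDir: string
  sourceId: string
  status: 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'
}

const isActive = (run: ActiveRun): boolean =>
  run.status === 'queued' || run.status === 'running'

// Create a run for `sourceId` unless one is already active for the same
// rootDir + source. Returns null when the scheduled tick should be skipped.
function createRunIfNoActiveSource(
  runs: ActiveRun[],
  rootDir: string,
  sourceId: string,
  nextId: () => string,
): ActiveRun | null {
  const duplicate = runs.some(
    run => isActive(run) && run.rootDir === rootDir && run.sourceId === sourceId,
  )
  if (duplicate) return null
  const created: ActiveRun = { runId: nextId(), rootDir, sourceId, status: 'queued' }
  runs.push(created)
  return created
}
```

With a rule like this, a cron task that fires again while its previous run is still queued or running produces no second run, so slow turns cannot pile up an unbounded backlog of identical prompts.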
## Memory bounds (related, same review pass)
- `src/screens/REPL.tsx`: caps fullscreen scrollback after the compact boundary and updates trailing progress rows in place. Long-running fullscreen sessions could otherwise retain thousands of post-compaction messages and duplicate progress rows, keeping Ink trees alive long after their useful context had moved on.
- `src/services/skillSearch/*` and `src/services/skillLearning/*`: runtime activation is strictly opt-in via existing env toggles; session caches are capped so long-running processes cannot grow them without bound. Build presence is preserved so operators can still discover and opt into the slash commands.
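
The two bounds above can be illustrated with a small sketch. The PR text does not show the actual caps, eviction order, or trimming policy, so everything here is assumed: `BoundedCache` evicts the oldest insertion first, and `capScrollback` keeps messages up to the compact boundary plus a fixed-size tail.

```typescript
// Hypothetical helper for illustration; the caps and cache shapes used by
// the skill-search/skill-learning services are not shown in this PR text.
class BoundedCache<K, V> {
  private readonly entries = new Map<K, V>()
  private readonly maxEntries: number

  constructor(maxEntries: number) {
    if (!Number.isInteger(maxEntries) || maxEntries < 1) {
      throw new RangeError('maxEntries must be a positive integer')
    }
    this.maxEntries = maxEntries
  }

  get size(): number {
    return this.entries.size
  }

  get(key: K): V | undefined {
    return this.entries.get(key)
  }

  set(key: K, value: V): void {
    // Re-inserting moves the key to the newest position in Map order.
    this.entries.delete(key)
    this.entries.set(key, value)
    // Evict the oldest insertions until the cap holds again.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next()
      if (oldest.done) break
      this.entries.delete(oldest.value)
    }
  }
}

// Keep everything up to and including the compact boundary, plus at most
// `maxTail` of the most recent messages after it (assumed policy).
function capScrollback<T>(
  messages: readonly T[],
  boundaryIndex: number,
  maxTail: number,
): T[] {
  const head = messages.slice(0, boundaryIndex + 1)
  const tail = messages.slice(boundaryIndex + 1)
  return head.concat(tail.slice(Math.max(0, tail.length - maxTail)))
}
```

Either bound trades completeness for a fixed memory ceiling, which is the point for sessions that run for hours.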
## CI / test contract
- `tests/integration/dependency-overrides.test.ts`: smoke test no longer drives Mermaid's browser renderer; it validates the package-resolution contract directly so CI does not regress on unrelated browser timing.
- New `tests/integration/autonomy-lifecycle-user-flow.test.ts`: end-to-end CLI subprocess flow exercising `status --deep`, `flows`, `flow <id>`, `flow resume`, `flow cancel` against persisted state.
- `src/entrypoints/cli.tsx`: `claude autonomy …` routes through an entrypoint fast path that reuses the slash-command formatter without booting the full interactive CLI. Stdout is flushed before forced exit so coverage subprocesses do not terminate with empty stdout.
- `packages/builtin-tools/src/tools/RemoteTriggerTool/__tests__/RemoteTriggerTool.test.ts`: stabilized to prevent audit flake under coverage.
## Tests added
- `src/__tests__/queryAutonomyProviderBoundary.test.ts`
- `src/hooks/__tests__/useScheduledTasks.test.ts`
- `src/utils/__tests__/autonomyAuthority.test.ts`
- `src/utils/__tests__/autonomyFlows.test.ts` (extended)
- `src/utils/__tests__/autonomyPersistence.test.ts` (extended)
- `src/utils/__tests__/autonomyQueueLifecycle.test.ts`
- `src/utils/__tests__/autonomyRuns.test.ts` (extended)
- `src/utils/processUserInput/__tests__/processSlashCommand.test.ts`
- `tests/integration/autonomy-lifecycle-user-flow.test.ts`
## Docs
- `docs/agent/sur-loop-scheduled-oom.md`: System Understanding Report covering the scheduled/loop OOM problem, the call graphs investigated, and the lifecycle invariants this PR establishes.
- `docs/agent/sur-skill-overflow-bugs.md`: SUR for the related skill-overflow context.
- `docs/internals/autonomy-jira.md`: documents the two-phase commit window and ownership stamping invariants.
- `docs/memory-leak-audit.md`: audit notes covering the REPL/scrollback and skill-search bounds.
## Invariants this PR establishes
1. Queued autonomy prompts are not injected unless the persisted run was successfully claimed.
2. Terminal run/flow states are terminal — completion, failure, and cancellation all finalize state regardless of which provider/error path triggered them.
3. Autonomy run/flow `rootDir` is carried explicitly across detached async boundaries instead of inferred from a shared singleton.
4. State-only CLI subcommands (`autonomy status|runs|flows|flow …`) bypass full interactive bootstrap so they do not hold unrelated handles open.
5. REPL fullscreen scrollback and skill-search/skill-learning session caches are explicitly bounded.
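
Invariant 2 maps naturally onto `try`/`finally` inside a generator, because `finally` runs on normal completion, on a thrown error, and when the consumer calls `return()` early. The sketch below uses a synchronous generator to stay short; the `query` generator in this codebase is asynchronous, and `withFinalization`, `Outcome`, and `finalize` are hypothetical names rather than the project's API.

```typescript
type Outcome = 'completed' | 'failed' | 'cancelled'

// Hypothetical wrapper: `steps` stands in for the provider stream and
// `finalize` for the persisted run/flow finalizer.
function* withFinalization<T>(
  steps: Iterable<T>,
  finalize: (outcome: Outcome) => void,
): Generator<T, void, undefined> {
  // Assume cancellation until the stream proves otherwise.
  let outcome: Outcome = 'cancelled'
  try {
    for (const step of steps) yield step
    outcome = 'completed'
  } catch (error) {
    outcome = 'failed'
    throw error
  } finally {
    // Runs on normal completion, on a thrown error, and when the consumer
    // calls generator.return() before the stream is exhausted.
    finalize(outcome)
  }
}
```

This mirrors the two provider-boundary tests in the diff: a provider error leaves the run `failed`, and an early `generator.return()` leaves it `cancelled` rather than `running`.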
## Validation
```bash
bun run typecheck
CI=true GITHUB_ACTIONS=true bun test # 3996 pass / 0 fail across 305 files
bun test src/__tests__/queryAutonomyProviderBoundary.test.ts \
src/hooks/__tests__/useScheduledTasks.test.ts \
src/utils/__tests__/autonomy{Runs,Flows,Authority,QueueLifecycle,Persistence}.test.ts \
src/utils/processUserInput/__tests__/processSlashCommand.test.ts \
tests/integration/autonomy-lifecycle-user-flow.test.ts
```
## Origin
This PR is the consolidated, upstream-targeted version of two fork-side review PRs (fix/loop-scheduled-autonomy-oom and fix/autonomy-lifecycle). The fork-side review history is preserved at https://github.com/amDosion/claude-code-bast/pull/7 . The fork's own internal `chore: keep fork current with upstream` sync commits and the `docs: update contributors` automation are intentionally not included in this PR.
The autonomy CLI handler `rootDir` threading that the fork added (78f64d8a, 98d04ddb) is intentionally omitted here because upstream `a2cfaf91` (fix: 修复 RemoteTriggerTool 和 autonomy 测试的全量运行失败, in English: "fix full-suite run failures in the RemoteTriggerTool and autonomy tests") already performed the equivalent change with an additional `currentDir` option. Keeping the upstream version avoids regressing that improvement.
13
src/Tool.ts
@@ -178,6 +178,19 @@ export type ToolUseContext = {
     querySource?: QuerySource
     /** Optional callback to get the latest tools (e.g., after MCP servers connect mid-query) */
     refreshTools?: () => Tools
+    /**
+     * @internal TEST-ONLY ESCAPE HATCH. MUST remain undefined in production.
+     *
+     * Allows non-bundled unit-test harnesses to exercise the background
+     * forked slash command path that production assistant mode gates behind
+     * `feature('KAIROS')`. Still requires `AppState.kairosEnabled`. This
+     * field is constructed in-process by trusted application code only;
+     * no external surface (MCP, plugin, slash command, network) writes to
+     * `ToolUseContext.options`. Setting this true outside a test bypasses
+     * the KAIROS feature flag; `processSlashCommand` rejects this flag
+     * outside `NODE_ENV=test`.
+     */
+    allowBackgroundForkedSlashCommands?: boolean
   }
   abortController: AbortController
   readFileState: FileStateCache
@@ -1,8 +1,18 @@
-import { beforeEach, describe, expect, mock, test } from 'bun:test'
+import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
 import { createAbortController } from '../utils/abortController'
 import { QueryGuard } from '../utils/QueryGuard'
 import { handlePromptSubmit } from '../utils/handlePromptSubmit'
-import { getCommandQueue, resetCommandQueue } from '../utils/messageQueueManager'
+import {
+  getCommandQueue,
+  resetCommandQueue,
+} from '../utils/messageQueueManager'
+import { cleanupTempDir, createTempDir } from '../../tests/mocks/file-system'
+import {
+  createAutonomyQueuedPrompt,
+  markAutonomyRunCancelled,
+} from '../utils/autonomyRuns'
 
+let tempDirs: string[] = []
+
 function createBaseParams() {
   const queryGuard = new QueryGuard()
@@ -28,9 +38,7 @@ function createBaseParams() {
     commands: [],
     setUserInputOnProcessing: mock((_prompt?: string) => {}),
     setAbortController: mock((_abortController: AbortController | null) => {}),
-    onQuery: mock(
-      async () => undefined,
-    ) as unknown as (
+    onQuery: mock(async () => undefined) as unknown as (
       ...args: unknown[]
     ) => Promise<void>,
     setAppState: mock((_updater: unknown) => {}),
@@ -40,6 +48,13 @@ function createBaseParams() {
 describe('handlePromptSubmit', () => {
   beforeEach(() => {
     resetCommandQueue()
+    tempDirs = []
   })
 
+  afterEach(async () => {
+    for (const tempDir of tempDirs) {
+      await cleanupTempDir(tempDir)
+    }
+  })
+
   test('aborts the current turn when only cancel-interrupt tools are running', async () => {
@@ -118,4 +133,34 @@ describe('handlePromptSubmit', () => {
       bridgeOrigin: true,
     })
   })
+
+  test('skips stale autonomy commands in the idle queued path', async () => {
+    const params = createBaseParams()
+    const abortController = createAbortController()
+    const tempDir = await createTempDir('handle-prompt-autonomy-')
+    tempDirs.push(tempDir)
+    const command = await createAutonomyQueuedPrompt({
+      basePrompt: 'scheduled prompt',
+      trigger: 'scheduled-task',
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+    expect(command).not.toBeNull()
+    await markAutonomyRunCancelled(command!.autonomy!.runId, tempDir)
+
+    await handlePromptSubmit({
+      ...params,
+      input: '',
+      mode: 'prompt',
+      pastedContents: {},
+      abortController,
+      streamMode: 'normal' as any,
+      hasInterruptibleToolInProgress: false,
+      isExternalLoading: false,
+      queuedCommands: [command!],
+    })
+
+    expect(params.getToolUseContext).not.toHaveBeenCalled()
+    expect(params.onQuery).not.toHaveBeenCalled()
+  })
 })
337
src/__tests__/queryAutonomyProviderBoundary.test.ts
Normal file
@@ -0,0 +1,337 @@
+import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
+import { randomUUID } from 'crypto'
+import {
+  resetStateForTests,
+  setCwdState,
+  setOriginalCwd,
+  setProjectRoot,
+} from '../bootstrap/state'
+import { query } from '../query'
+import { getEmptyToolPermissionContext } from '../Tool'
+import type { AssistantMessage } from '../types/message'
+import { asSystemPrompt } from '../utils/systemPromptType'
+import {
+  createAssistantAPIErrorMessage,
+  createUserMessage,
+} from '../utils/messages'
+import { cleanupTempDir, createTempDir } from '../../tests/mocks/file-system'
+import {
+  enqueue,
+  getCommandsByMaxPriority,
+  resetCommandQueue,
+} from '../utils/messageQueueManager'
+import { getAutonomyFlowById, listAutonomyFlows } from '../utils/autonomyFlows'
+import {
+  getAutonomyRunById,
+  startManagedAutonomyFlowFromHeartbeatTask,
+} from '../utils/autonomyRuns'
+
+let tempDir = ''
+let originalProcessCwd = ''
+
+beforeEach(async () => {
+  originalProcessCwd = process.cwd()
+  tempDir = await createTempDir('query-autonomy-provider-boundary-')
+  resetStateForTests()
+  resetCommandQueue()
+  setOriginalCwd(tempDir)
+  setCwdState(tempDir)
+  setProjectRoot(tempDir)
+})
+
+afterEach(async () => {
+  resetStateForTests()
+  resetCommandQueue()
+  if (originalProcessCwd) {
+    process.chdir(originalProcessCwd)
+  }
+  if (tempDir) {
+    let lastError: unknown
+    for (let attempt = 0; attempt < 20; attempt++) {
+      try {
+        await cleanupTempDir(tempDir)
+        lastError = undefined
+        break
+      } catch (error) {
+        lastError = error
+        await new Promise(resolve => setTimeout(resolve, 100))
+      }
+    }
+    if (lastError) {
+      throw lastError
+    }
+  }
+})
+
+function createToolUseAssistantMessage(): AssistantMessage {
+  return {
+    type: 'assistant',
+    uuid: randomUUID(),
+    timestamp: new Date().toISOString(),
+    requestId: undefined,
+    message: {
+      id: 'msg_tool_use',
+      type: 'message',
+      role: 'assistant',
+      model: 'test-model',
+      stop_reason: 'tool_use',
+      stop_sequence: null,
+      usage: {
+        input_tokens: 1,
+        output_tokens: 1,
+        cache_creation_input_tokens: 0,
+        cache_read_input_tokens: 0,
+      },
+      content: [
+        {
+          type: 'tool_use',
+          id: 'toolu_provider_boundary',
+          name: 'MissingBoundaryTool',
+          input: {},
+        },
+      ],
+    },
+  } as unknown as AssistantMessage
+}
+
+function createToolUseContext(): any {
+  let inProgressToolUseIds = new Set<string>()
+  let responseLength = 0
+  let appState = {
+    toolPermissionContext: getEmptyToolPermissionContext(),
+    fastMode: false,
+    mcp: {
+      tools: [],
+      clients: [],
+    },
+    effortValue: undefined,
+    advisorModel: undefined,
+    sessionHooks: new Map(),
+  }
+
+  return {
+    options: {
+      commands: [],
+      debug: false,
+      mainLoopModel: 'claude-sonnet-4-5-20250929',
+      tools: [],
+      verbose: false,
+      thinkingConfig: { type: 'disabled' },
+      mcpClients: [],
+      mcpResources: {},
+      isNonInteractiveSession: true,
+      agentDefinitions: {
+        activeAgents: [],
+        allowedAgentTypes: [],
+      },
+    },
+    abortController: new AbortController(),
+    readFileState: new Map(),
+    getAppState: () => appState,
+    setAppState: (updater: (state: any) => any) => {
+      appState = updater(appState as never)
+    },
+    setInProgressToolUseIDs: (updater: (state: Set<string>) => Set<string>) => {
+      inProgressToolUseIds = updater(inProgressToolUseIds)
+    },
+    setResponseLength: (updater: (state: number) => number) => {
+      responseLength = updater(responseLength)
+    },
+    updateFileHistoryState: () => {},
+    updateAttributionState: () => {},
+    messages: [],
+  } as any
+}
+
+describe('query autonomy/provider boundary', () => {
+  test('provider api-error messages fail a consumed autonomy run instead of advancing the flow', async () => {
+    const previousDisableAttachments =
+      process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
+    process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = '1'
+    try {
+      const command = await startManagedAutonomyFlowFromHeartbeatTask({
+        task: {
+          name: 'provider-boundary',
+          interval: '1h',
+          prompt: 'Exercise provider boundary',
+          steps: [
+            { name: 'first', prompt: 'First provider-boundary step' },
+            { name: 'second', prompt: 'Second provider-boundary step' },
+          ],
+        },
+        rootDir: tempDir,
+        currentDir: tempDir,
+        priority: 'next',
+      })
+      expect(command).not.toBeNull()
+      enqueue(command!)
+
+      const toolUseContext = createToolUseContext()
+
+      let callCount = 0
+      const deps = {
+        uuid: () => 'query-chain-id',
+        microcompact: async (messages: unknown[]) => ({ messages }),
+        autocompact: async () => ({
+          compactionResult: undefined,
+          consecutiveFailures: 0,
+        }),
+        callModel: async function* () {
+          callCount += 1
+          if (callCount === 1) {
+            yield createToolUseAssistantMessage()
+            return
+          }
+          yield createAssistantAPIErrorMessage({
+            content: 'API Error: provider unavailable',
+            apiError: 'api_error',
+            error: new Error('provider unavailable') as never,
+          })
+        },
+      }
+
+      const emitted: any[] = []
+      const generator = query({
+        messages: [
+          createUserMessage({
+            content: 'start provider-boundary test',
+          }),
+        ],
+        systemPrompt: asSystemPrompt([]),
+        userContext: {},
+        systemContext: {},
+        canUseTool: async (_tool, input) => ({
+          behavior: 'allow',
+          updatedInput: input,
+        }),
+        toolUseContext,
+        querySource: 'sdk',
+        maxTurns: 3,
+        deps: deps as never,
+      })
+      let next = await generator.next()
+      while (!next.done) {
+        emitted.push(next.value)
+        next = await generator.next()
+      }
+
+      const [flow] = await listAutonomyFlows(tempDir)
+      const finalFlow = await getAutonomyFlowById(flow!.flowId, tempDir)
+      const run = await getAutonomyRunById(command!.autonomy!.runId, tempDir)
+
+      expect(next.value.reason).toBe('model_error')
+      expect(callCount).toBe(2)
+      expect(
+        emitted.some(
+          message =>
+            message.type === 'attachment' &&
+            message.attachment.type === 'queued_command',
+        ),
+      ).toBe(true)
+      expect(run!.status).toBe('failed')
+      expect(run!.error).toBe('provider api_error')
+      expect(finalFlow!.status).toBe('failed')
+      expect(finalFlow!.stateJson!.steps.map(step => step.status)).toEqual([
+        'failed',
+        'pending',
+      ])
+      expect(getCommandsByMaxPriority('later')).toHaveLength(0)
+    } finally {
+      if (previousDisableAttachments === undefined) {
+        delete process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
+      } else {
+        process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = previousDisableAttachments
+      }
+    }
+  })
+
+  test('generator return cancels a consumed autonomy run instead of leaving it running', async () => {
+    const previousDisableAttachments =
+      process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
+    process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = '1'
+    try {
+      const command = await startManagedAutonomyFlowFromHeartbeatTask({
+        task: {
+          name: 'return-boundary',
+          interval: '1h',
+          prompt: 'Exercise generator return boundary',
+          steps: [
+            { name: 'first', prompt: 'First return-boundary step' },
+            { name: 'second', prompt: 'Second return-boundary step' },
+          ],
+        },
+        rootDir: tempDir,
+        currentDir: tempDir,
+        priority: 'next',
+      })
+      expect(command).not.toBeNull()
+      enqueue(command!)
+
+      const toolUseContext = createToolUseContext()
+      const deps = {
+        uuid: () => 'query-chain-id',
+        microcompact: async (messages: unknown[]) => ({ messages }),
+        autocompact: async () => ({
+          compactionResult: undefined,
+          consecutiveFailures: 0,
+        }),
+        callModel: async function* () {
+          yield createToolUseAssistantMessage()
+        },
+      }
+
+      const generator = query({
+        messages: [
+          createUserMessage({
+            content: 'start return-boundary test',
+          }),
+        ],
+        systemPrompt: asSystemPrompt([]),
+        userContext: {},
+        systemContext: {},
+        canUseTool: async (_tool, input) => ({
+          behavior: 'allow',
+          updatedInput: input,
+        }),
+        toolUseContext,
+        querySource: 'sdk',
+        maxTurns: 3,
+        deps: deps as never,
+      })
+
+      let sawQueuedAttachment = false
+      let next = await generator.next()
+      while (!next.done) {
+        const message = next.value as any
+        if (
+          message.type === 'attachment' &&
+          message.attachment.type === 'queued_command'
+        ) {
+          sawQueuedAttachment = true
+          await generator.return(undefined as never)
+          break
+        }
+        next = await generator.next()
+      }
+
+      const [flow] = await listAutonomyFlows(tempDir)
+      const finalFlow = await getAutonomyFlowById(flow!.flowId, tempDir)
+      const run = await getAutonomyRunById(command!.autonomy!.runId, tempDir)
+
+      expect(sawQueuedAttachment).toBe(true)
+      expect(run!.status).toBe('cancelled')
+      expect(finalFlow!.status).toBe('cancelled')
+      expect(finalFlow!.stateJson!.steps.map(step => step.status)).toEqual([
+        'cancelled',
+        'cancelled',
+      ])
+      expect(getCommandsByMaxPriority('later')).toHaveLength(0)
+    } finally {
+      if (previousDisableAttachments === undefined) {
+        delete process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
+      } else {
+        process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = previousDisableAttachments
+      }
+    }
+  })
+})
170
src/cli/print.ts
@@ -323,13 +323,16 @@ import { asSessionId } from 'src/types/ids.js'
 import {
   commitAutonomyQueuedPrompt,
   createAutonomyQueuedPrompt,
+  createAutonomyQueuedPromptIfNoActiveSource,
   createProactiveAutonomyCommands,
-  finalizeAutonomyRunCompleted,
-  finalizeAutonomyRunFailed,
-  markAutonomyRunCompleted,
+  markAutonomyRunCancelled,
   markAutonomyRunFailed,
-  markAutonomyRunRunning,
 } from 'src/utils/autonomyRuns.js'
+import {
+  cancelQueuedAutonomyCommands,
+  claimConsumableQueuedAutonomyCommands,
+  finalizeAutonomyCommandsForTurn,
+} from 'src/utils/autonomyQueueLifecycle.js'
 import { prepareAutonomyTurnPrompt } from 'src/utils/autonomyAuthority.js'
 import { jsonStringify } from '../utils/slowOperations.js'
 import { skillChangeDetector } from '../utils/skills/skillChangeDetector.js'
@@ -1865,17 +1868,26 @@ function runHeadlessStreaming(
               currentDir: cwd(),
               shouldCreate: () => !inputClosed,
             })
+            if (inputClosed) {
+              await cancelQueuedAutonomyCommands({ commands })
+              return
+            }
             for (const command of commands) {
-              if (inputClosed) {
-                return
-              }
               enqueue({
                 ...command,
                 uuid: randomUUID(),
               })
             }
             void run()
-          })()
+          })().catch(error => {
+            logError(error)
+            logForDebugging(
+              `[Proactive] failed to create headless tick: ${error}`,
+              {
+                level: 'error',
+              },
+            )
+          })
         }, 0)
       }
     : undefined
@@ -1971,17 +1983,24 @@ function runHeadlessStreaming(
         // Non-prompt commands (task-notification, orphaned-permission) carry
         // side effects or orphanedPermission state, so they process singly.
         // Prompt commands greedily collect followers with matching workload.
-        const batch: QueuedCommand[] = [command]
+        let batch: QueuedCommand[] = [command]
         if (command.mode === 'prompt') {
           while (canBatchWith(command, peek(isMainThread))) {
             batch.push(dequeue(isMainThread)!)
           }
-          if (batch.length > 1) {
-            command = {
-              ...command,
-              value: joinPromptValues(batch.map(c => c.value)),
-              uuid: batch.findLast(c => c.uuid)?.uuid ?? command.uuid,
-            }
+        }
+        const queuedAutonomyClaim =
+          await claimConsumableQueuedAutonomyCommands(batch)
+        batch = queuedAutonomyClaim.attachmentCommands
+        if (batch.length === 0) {
+          continue
+        }
+        command = batch[0]!
+        if (command.mode === 'prompt' && batch.length > 1) {
+          command = {
+            ...command,
+            value: joinPromptValues(batch.map(c => c.value)),
+            uuid: batch.findLast(c => c.uuid)?.uuid ?? command.uuid,
           }
         }
         const batchUuids = batch.map(c => c.uuid).filter(u => u !== undefined)
@@ -2120,9 +2139,7 @@ function runHeadlessStreaming(
         }
 
         const input = command.value
-        const autonomyRunIds = batch
-          .map(item => item.autonomy?.runId)
-          .filter((runId): runId is string => Boolean(runId))
+        const claimedAutonomyCommands = queuedAutonomyClaim.claimedCommands
 
         if (structuredIO instanceof RemoteIO && command.mode === 'prompt') {
           logEvent('tengu_bridge_message_received', {
@@ -2172,9 +2189,6 @@ function runHeadlessStreaming(
         // const-capture: TS loses `while ((command = dequeue()))` narrowing
         // inside the closure.
         const cmd = command
-        for (const runId of autonomyRunIds) {
-          await markAutonomyRunRunning(runId)
-        }
         let lastResultIsError = false
         try {
           await runWithWorkload(
@@ -2286,35 +2300,39 @@ function runHeadlessStreaming(
             },
           ) // end runWithWorkload
           if (lastResultIsError) {
-            for (const runId of autonomyRunIds) {
-              await finalizeAutonomyRunFailed({
-                runId,
-                error: 'ask() returned an error result',
-              })
-            }
+            await finalizeAutonomyCommandsForTurn({
+              commands: claimedAutonomyCommands,
+              outcome: {
+                type: 'failed',
+                message: 'ask() returned an error result',
+              },
+              currentDir: cwd(),
+              priority: 'later',
+              workload: cmd.workload ?? options.workload,
+            })
           } else {
-            for (const runId of autonomyRunIds) {
-              const nextCommands = await finalizeAutonomyRunCompleted({
-                runId,
-                currentDir: cwd(),
-                priority: 'later',
-                workload: cmd.workload ?? options.workload,
+            const nextCommands = await finalizeAutonomyCommandsForTurn({
+              commands: claimedAutonomyCommands,
+              outcome: { type: 'completed' },
+              currentDir: cwd(),
+              priority: 'later',
+              workload: cmd.workload ?? options.workload,
+            })
+            for (const nextCommand of nextCommands) {
+              enqueue({
+                ...nextCommand,
+                uuid: randomUUID(),
               })
-              for (const nextCommand of nextCommands) {
-                enqueue({
-                  ...nextCommand,
-                  uuid: randomUUID(),
-                })
-              }
             }
           }
         } catch (error) {
-          for (const runId of autonomyRunIds) {
-            await finalizeAutonomyRunFailed({
-              runId,
-              error: String(error),
-            })
-          }
+          await finalizeAutonomyCommandsForTurn({
+            commands: claimedAutonomyCommands,
+            outcome: { type: 'failed', error },
+            currentDir: cwd(),
+            priority: 'later',
+            workload: cmd.workload ?? options.workload,
+          })
           throw error
         }
 
@@ -2820,57 +2838,87 @@ function runHeadlessStreaming(
|
||||
currentDir: cwd(),
|
||||
workload: WORKLOAD_CRON,
|
||||
})
|
||||
if (inputClosed) return
|
||||
if (inputClosed) {
|
||||
await markAutonomyRunCancelled(
|
||||
command.autonomy!.runId,
|
||||
command.autonomy!.rootDir,
|
||||
)
|
||||
return
|
||||
}
|
||||
enqueue({
|
||||
...command,
|
||||
uuid: randomUUID(),
|
||||
})
|
||||
void run()
|
||||
})()
|
||||
})().catch(error => {
|
||||
logError(error)
|
||||
logForDebugging(
|
||||
`[ScheduledTasks] failed to enqueue headless task: ${error}`,
|
||||
{
|
||||
level: 'error',
|
||||
},
|
||||
)
|
||||
})
|
||||
},
|
||||
onFireTask: task => {
|
||||
if (inputClosed) return
|
||||
void (async () => {
|
||||
if (task.agentId) {
|
||||
const prepared = await prepareAutonomyTurnPrompt({
|
||||
const command = await createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: task.prompt,
|
||||
trigger: 'scheduled-task',
|
||||
currentDir: cwd(),
|
||||
})
|
||||
if (inputClosed) return
|
||||
const command = await commitAutonomyQueuedPrompt({
|
||||
prepared,
|
||||
currentDir: cwd(),
|
||||
sourceId: task.id,
|
||||
sourceLabel: task.prompt,
|
||||
workload: WORKLOAD_CRON,
|
||||
shouldCreate: () => !inputClosed,
|
||||
})
|
||||
if (!command) return
|
||||
if (inputClosed) {
|
||||
await markAutonomyRunCancelled(
|
||||
command.autonomy!.runId,
|
||||
command.autonomy!.rootDir,
|
||||
)
|
||||
return
|
||||
}
|
||||
await markAutonomyRunFailed(
|
||||
command.autonomy!.runId,
|
||||
`No teammate runtime available for scheduled task owner ${task.agentId} in headless mode.`,
|
||||
command.autonomy!.rootDir,
|
||||
)
|
||||
return
|
||||
}
|
||||
const prepared = await prepareAutonomyTurnPrompt({
|
||||
const command = await createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: task.prompt,
|
||||
trigger: 'scheduled-task',
|
||||
currentDir: cwd(),
|
||||
})
|
||||
if (inputClosed) return
|
||||
const command = await commitAutonomyQueuedPrompt({
|
||||
prepared,
|
||||
currentDir: cwd(),
|
||||
sourceId: task.id,
|
||||
sourceLabel: task.prompt,
|
||||
workload: WORKLOAD_CRON,
|
||||
shouldCreate: () => !inputClosed,
|
||||
})
|
||||
if (inputClosed) return
|
||||
if (!command) return
|
||||
if (inputClosed) {
|
||||
await markAutonomyRunCancelled(
|
||||
command.autonomy!.runId,
|
||||
command.autonomy!.rootDir,
|
||||
)
|
||||
return
|
||||
}
|
||||
enqueue({
|
||||
...command,
|
||||
uuid: randomUUID(),
|
||||
})
|
||||
void run()
|
||||
})()
|
||||
})().catch(error => {
|
||||
logError(error)
|
||||
logForDebugging(
|
||||
`[ScheduledTasks] failed to enqueue headless task ${task.id}: ${error}`,
|
||||
{
|
||||
level: 'error',
|
||||
},
|
||||
)
|
||||
})
|
||||
},
|
||||
isLoading: () => running || inputClosed,
|
||||
getJitterConfig: cronJitterConfigModule?.getCronJitterConfig,
|
||||
|
||||
@@ -1,5 +1,5 @@
 import type { Command } from '../../commands.js'
-import { isSkillLearningEnabled } from '../../services/skillLearning/featureCheck.js'
+import { isSkillLearningCompiledIn } from '../../services/skillLearning/featureCheck.js'
 
 const skillLearning = {
   type: 'local-jsx',
@@ -7,7 +7,10 @@ const skillLearning = {
   description: 'Manage skill learning (observe, analyze, evolve)',
   argumentHint:
     '[start|stop|about|status|ingest|evolve|export|import|prune|promote|projects]',
-  isEnabled: () => isSkillLearningEnabled(),
+  // The slash command is visible whenever the subsystem is compiled in.
+  // Whether the runtime feature is actually doing work is a separate
+  // concern controlled by `/skill-learning start` (see featureCheck.ts).
+  isEnabled: () => isSkillLearningCompiledIn(),
   isHidden: false,
   load: () => import('./skillPanel.js'),
 } satisfies Command
@@ -1,10 +1,14 @@
 import type { Command } from '../../commands.js'
+import { isSkillSearchCompiledIn } from '../../services/skillSearch/featureCheck.js'
 
 const skillSearch = {
   type: 'local-jsx',
   name: 'skill-search',
   description: 'Control automatic skill matching during conversations',
   argumentHint: '[start|stop|about|status]',
+  // Visible whenever the subsystem is compiled in (build flag); runtime
+  // activation is separate and operator-controlled via /skill-search start.
+  isEnabled: () => isSkillSearchCompiledIn(),
   isHidden: false,
   load: () => import('./skillSearchPanel.js'),
 } satisfies Command
@@ -30,6 +30,7 @@ interface WorkerState {
|
||||
failureCount: number
|
||||
parked: boolean
|
||||
lastStartTime: number
|
||||
restartTimer: ReturnType<typeof setTimeout> | null
|
||||
}
|
||||
|
||||
/**
|
||||
@@ -241,6 +242,7 @@ async function runSupervisor(args: string[]): Promise<void> {
|
||||
failureCount: 0,
|
||||
parked: false,
|
||||
lastStartTime: 0,
|
||||
restartTimer: null,
|
||||
},
|
||||
]
|
||||
|
||||
@@ -261,6 +263,10 @@ async function runSupervisor(args: string[]): Promise<void> {
|
||||
controller.abort()
|
||||
removeDaemonState()
|
||||
for (const w of workers) {
|
||||
if (w.restartTimer) {
|
||||
clearTimeout(w.restartTimer)
|
||||
w.restartTimer = null
|
||||
}
|
||||
if (w.process && !w.process.killed) {
|
||||
w.process.kill('SIGTERM')
|
||||
}
|
||||
@@ -288,22 +294,30 @@ async function runSupervisor(args: string[]): Promise<void> {
     // Wait for all workers to exit
     await Promise.all(
       workers
-        .filter(w => w.process && !w.process.killed)
+        .filter(w => w.process && w.process.exitCode === null)
         .map(
           w =>
             new Promise<void>(resolve => {
-              if (!w.process) {
+              if (!w.process || w.process.exitCode !== null) {
                 resolve()
                 return
               }
-              w.process.on('exit', () => resolve())
+              let killTimer: ReturnType<typeof setTimeout> | null = null
+              w.process.on('exit', () => {
+                if (killTimer) {
+                  clearTimeout(killTimer)
+                  killTimer = null
+                }
+                resolve()
+              })
               // Force kill after grace period
-              setTimeout(() => {
-                if (w.process && !w.process.killed) {
+              killTimer = setTimeout(() => {
+                if (w.process && w.process.exitCode === null) {
                   w.process.kill('SIGKILL')
                 }
                 resolve()
               }, 30_000)
+              killTimer.unref?.()
             }),
         ),
     )
@@ -398,11 +412,13 @@ function spawnWorker(
       `[daemon] worker '${worker.kind}' exited (code=${code}, signal=${sig}), restarting in ${worker.backoffMs}ms`,
     )

-    setTimeout(() => {
+    worker.restartTimer = setTimeout(() => {
+      worker.restartTimer = null
       if (!signal.aborted && !worker.parked) {
         spawnWorker(worker, dir, config, signal)
       }
     }, worker.backoffMs)
+    worker.restartTimer.unref?.()

     // Exponential backoff
     worker.backoffMs = Math.min(
@@ -255,6 +255,29 @@ async function main(): Promise<void> {
|
||||
return
|
||||
}
|
||||
|
||||
// Fast-path for `claude autonomy ...`: state inspection/management commands
|
||||
// do not need the full interactive CLI bootstrap. The full Commander path
|
||||
// imports main.tsx and runs root preAction initialization before the autonomy
|
||||
// action; under coverage/CI that leaves unrelated handles around simple
|
||||
// state-only subprocess calls.
|
||||
if (args[0] === 'autonomy') {
|
||||
profileCheckpoint('cli_autonomy_path')
|
||||
const { getAutonomyCommandText } = await import(
|
||||
'../cli/handlers/autonomy.js'
|
||||
)
|
||||
const text = await getAutonomyCommandText(args.slice(1).join(' '))
|
||||
await new Promise<void>((resolve, reject) => {
|
||||
process.stdout.write(`${text}\n`, error => {
|
||||
if (error) {
|
||||
reject(error)
|
||||
return
|
||||
}
|
||||
resolve()
|
||||
})
|
||||
})
|
||||
process.exit(0)
|
||||
}
|
||||
|
||||
// Fast-path for `--bg`/`--background` shortcut → daemon bg.
|
||||
if (
|
||||
feature('BG_SESSIONS') &&
|
||||
@@ -398,4 +421,4 @@ async function main(): Promise<void> {
 }

 // eslint-disable-next-line custom-rules/no-top-level-side-effects
-void main()
+await main()
80
src/hooks/__tests__/useScheduledTasks.test.ts
Normal file
@@ -0,0 +1,80 @@
|
||||
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
|
||||
import {
|
||||
resetStateForTests,
|
||||
setCwdState,
|
||||
setOriginalCwd,
|
||||
setProjectRoot,
|
||||
} from '../../bootstrap/state'
|
||||
import { createScheduledTaskQueuedCommand } from '../useScheduledTasks'
|
||||
import {
|
||||
listAutonomyRuns,
|
||||
markAutonomyRunCompleted,
|
||||
} from '../../utils/autonomyRuns'
|
||||
import { resetAutonomyAuthorityForTests } from '../../utils/autonomyAuthority'
|
||||
import { cleanupTempDir, createTempDir } from '../../../tests/mocks/file-system'
|
||||
|
||||
let tempDir = ''
|
||||
|
||||
beforeEach(async () => {
|
||||
tempDir = await createTempDir('scheduled-tasks-')
|
||||
resetStateForTests()
|
||||
resetAutonomyAuthorityForTests()
|
||||
setOriginalCwd(tempDir)
|
||||
setProjectRoot(tempDir)
|
||||
setCwdState(tempDir)
|
||||
})
|
||||
|
||||
afterEach(async () => {
|
||||
resetStateForTests()
|
||||
resetAutonomyAuthorityForTests()
|
||||
if (tempDir) {
|
||||
await cleanupTempDir(tempDir)
|
||||
}
|
||||
})
|
||||
|
||||
describe('createScheduledTaskQueuedCommand', () => {
|
||||
function createCommandForTest(task: { id: string; prompt: string }) {
|
||||
return createScheduledTaskQueuedCommand(task, {
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
}
|
||||
|
||||
test('skips a scheduled task when the same source already has an active run', async () => {
|
||||
const task = {
|
||||
id: 'cron-1',
|
||||
prompt: '/loop review the repository',
|
||||
}
|
||||
|
||||
const first = await createCommandForTest(task)
|
||||
const second = await createCommandForTest(task)
|
||||
const runs = await listAutonomyRuns(tempDir)
|
||||
|
||||
expect(first).not.toBeNull()
|
||||
expect(second).toBeNull()
|
||||
expect(runs).toHaveLength(1)
|
||||
expect(runs[0]).toMatchObject({
|
||||
trigger: 'scheduled-task',
|
||||
status: 'queued',
|
||||
sourceId: 'cron-1',
|
||||
})
|
||||
})
|
||||
|
||||
test('allows a scheduled task after the previous same-source run completes', async () => {
|
||||
const task = {
|
||||
id: 'cron-1',
|
||||
prompt: '/loop review the repository',
|
||||
}
|
||||
|
||||
const first = await createCommandForTest(task)
|
||||
expect(first?.autonomy?.runId).toBeDefined()
|
||||
|
||||
await markAutonomyRunCompleted(first!.autonomy!.runId, tempDir, 100)
|
||||
const second = await createCommandForTest(task)
|
||||
const runs = await listAutonomyRuns(tempDir)
|
||||
|
||||
expect(second).not.toBeNull()
|
||||
expect(runs).toHaveLength(2)
|
||||
expect(runs.map(run => run.status).sort()).toEqual(['completed', 'queued'])
|
||||
})
|
||||
})
|
||||
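The two tests above pin down the dedup rule: a source may have at most one active (`queued` or `running`) run, and a new run is allowed once the previous one reaches a terminal status. A minimal in-memory sketch of that rule follows. `RunStore`, `createIfNoActiveSource`, and `markCompleted` are hypothetical names for illustration only; the real `createAutonomyQueuedPromptIfNoActiveSource` in `src/utils/autonomyRuns.ts` persists runs to disk and is asynchronous.

```typescript
// Hypothetical in-memory model of source-based dedup; not the real autonomyRuns.ts.
type RunStatus = 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'

interface Run {
  runId: string
  sourceId: string
  status: RunStatus
}

// Statuses that block a new run for the same source.
const ACTIVE: ReadonlySet<RunStatus> = new Set<RunStatus>(['queued', 'running'])

class RunStore {
  private runs: Run[] = []
  private nextId = 1

  // Returns the new run, or null when the source already has an active run.
  createIfNoActiveSource(sourceId: string): Run | null {
    const busy = this.runs.some(
      r => r.sourceId === sourceId && ACTIVE.has(r.status),
    )
    if (busy) return null
    const run: Run = {
      runId: `run-${this.nextId++}`,
      sourceId,
      status: 'queued',
    }
    this.runs.push(run)
    return run
  }

  // Only active runs can move to 'completed'; terminal runs are left alone.
  markCompleted(runId: string): void {
    const run = this.runs.find(r => r.runId === runId)
    if (run && ACTIVE.has(run.status)) run.status = 'completed'
  }

  list(): readonly Run[] {
    return this.runs
  }
}
```

Returning `null` instead of throwing mirrors the shape the tests assert on (`expect(second).toBeNull()`), so callers can treat "skipped" as a normal outcome rather than an error.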
@@ -189,12 +189,6 @@ export function useReplBridge(
     }

     let cancelled = false
-    // Map of pending bridge permission response handlers, keyed by request_id.
-    // Defined at useEffect scope so the cleanup function can clear it on unmount.
-    const pendingPermissionHandlers = new Map<
-      string,
-      (response: BridgePermissionResponse) => void
-    >()
     // Capture messages.length now so we don't re-send initial messages
     // through writeMessages after the bridge connects.
     const initialMessageCount = messages.length
@@ -467,6 +461,13 @@ export function useReplBridge(
           }
         }

+        // Map of pending bridge permission response handlers, keyed by request_id.
+        // Each entry is an onResponse handler waiting for CCR to reply.
+        const pendingPermissionHandlers = new Map<
+          string,
+          (response: BridgePermissionResponse) => void
+        >()
+
         // Dispatch incoming control_response messages to registered handlers
         function handlePermissionResponse(msg: SDKControlResponse): void {
           const requestId = msg.response?.request_id
@@ -817,10 +818,6 @@ export function useReplBridge(

     return () => {
       cancelled = true
-      // Release all pending permission handlers so their closures (which
-      // may capture React state/setters) can be GC'd immediately rather
-      // than waiting for the entire useEffect closure to become unreachable.
-      pendingPermissionHandlers.clear()
       clearTimeout(failureTimeoutRef.current)
       failureTimeoutRef.current = undefined
       if (handleRef.current) {
@@ -10,13 +10,18 @@ import type { Message } from '../types/message.js'
|
||||
import { getCwd } from '../utils/cwd.js'
|
||||
import { getCronJitterConfig } from '../utils/cronJitterConfig.js'
|
||||
import { createCronScheduler } from '../utils/cronScheduler.js'
|
||||
import { removeCronTasks } from '../utils/cronTasks.js'
|
||||
import { createAutonomyQueuedPrompt } from '../utils/autonomyRuns.js'
|
||||
import { markAutonomyRunFailed } from '../utils/autonomyRuns.js'
|
||||
import { removeCronTasks, type CronTask } from '../utils/cronTasks.js'
|
||||
import {
|
||||
createAutonomyQueuedPrompt,
|
||||
createAutonomyQueuedPromptIfNoActiveSource,
|
||||
markAutonomyRunCancelled,
|
||||
markAutonomyRunFailed,
|
||||
} from '../utils/autonomyRuns.js'
|
||||
import { logForDebugging } from '../utils/debug.js'
|
||||
import { enqueuePendingNotification } from '../utils/messageQueueManager.js'
|
||||
import { createScheduledTaskFireMessage } from '../utils/messages.js'
|
||||
import { WORKLOAD_CRON } from '../utils/workloadContext.js'
|
||||
import type { QueuedCommand } from '../types/textInputTypes.js'
|
||||
|
||||
type Props = {
|
||||
isLoading: boolean
|
||||
@@ -32,6 +37,32 @@ type Props = {
|
||||
setMessages: React.Dispatch<React.SetStateAction<Message[]>>
|
||||
}
|
||||
|
||||
export async function createScheduledTaskQueuedCommand(
|
||||
task: Pick<CronTask, 'id' | 'prompt'>,
|
||||
options?: {
|
||||
rootDir?: string
|
||||
currentDir?: string
|
||||
shouldCreate?: () => boolean
|
||||
},
|
||||
): Promise<QueuedCommand | null> {
|
||||
const command = await createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: task.prompt,
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: options?.rootDir,
|
||||
currentDir: options?.currentDir ?? getCwd(),
|
||||
sourceId: task.id,
|
||||
sourceLabel: task.prompt,
|
||||
workload: WORKLOAD_CRON,
|
||||
shouldCreate: options?.shouldCreate,
|
||||
})
|
||||
if (!command) {
|
||||
logForDebugging(
|
||||
`[ScheduledTasks] skipping ${task.id}: previous run still queued or running`,
|
||||
)
|
||||
}
|
||||
return command
|
||||
}
|
||||
|
||||
/**
|
||||
* REPL wrapper for the cron scheduler. Mounts the scheduler once and tears
|
||||
* it down on unmount. Fired prompts go into the command queue as 'later'
|
||||
@@ -71,16 +102,25 @@ export function useScheduledTasks({
|
||||
// forward isMeta, so their messages remain visible in the
|
||||
// transcript. This is acceptable since normal mode is not the
|
||||
// primary use case for scheduled tasks.
|
||||
let disposed = false
|
||||
const enqueueForLead = async (prompt: string) => {
|
||||
const command = await createAutonomyQueuedPrompt({
|
||||
basePrompt: prompt,
|
||||
trigger: 'scheduled-task',
|
||||
currentDir: getCwd(),
|
||||
workload: WORKLOAD_CRON,
|
||||
shouldCreate: () => !disposed,
|
||||
})
|
||||
if (!command) {
|
||||
return
|
||||
}
|
||||
if (disposed) {
|
||||
await markAutonomyRunCancelled(
|
||||
command.autonomy!.runId,
|
||||
command.autonomy!.rootDir,
|
||||
)
|
||||
return
|
||||
}
|
||||
enqueuePendingNotification(command)
|
||||
}
|
||||
|
||||
@@ -90,7 +130,12 @@ export function useScheduledTasks({
|
||||
// which is populated from disk at scheduler startup — this path only
|
||||
// handles team-lead durable crons.
|
||||
onFire: prompt => {
|
||||
void enqueueForLead(prompt)
|
||||
void enqueueForLead(prompt).catch(error =>
|
||||
logForDebugging(
|
||||
`[ScheduledTasks] failed to enqueue missed task prompt: ${error}`,
|
||||
{ level: 'error' },
|
||||
),
|
||||
)
|
||||
},
|
||||
// Normal fires receive the full CronTask so we can route by agentId.
|
||||
onFireTask: task => {
|
||||
@@ -101,22 +146,26 @@ export function useScheduledTasks({
|
||||
store.getState().tasks,
|
||||
)
|
||||
if (teammate && !isTerminalTaskStatus(teammate.status)) {
|
||||
const command = await createAutonomyQueuedPrompt({
|
||||
basePrompt: task.prompt,
|
||||
trigger: 'scheduled-task',
|
||||
currentDir: getCwd(),
|
||||
sourceId: task.id,
|
||||
sourceLabel: task.prompt,
|
||||
workload: WORKLOAD_CRON,
|
||||
})
|
||||
const command = await createScheduledTaskQueuedCommand(
|
||||
task,
|
||||
{ shouldCreate: () => !disposed },
|
||||
)
|
||||
if (!command) {
|
||||
return
|
||||
}
|
||||
if (disposed) {
|
||||
await markAutonomyRunCancelled(
|
||||
command.autonomy!.runId,
|
||||
command.autonomy!.rootDir,
|
||||
)
|
||||
return
|
||||
}
|
||||
const injected = injectUserMessageToTeammate(
|
||||
teammate.id,
|
||||
command.value as string,
|
||||
{
|
||||
autonomyRunId: command.autonomy?.runId,
|
||||
autonomyRootDir: command.autonomy?.rootDir,
|
||||
origin: command.origin,
|
||||
},
|
||||
setAppState,
|
||||
@@ -125,6 +174,7 @@ export function useScheduledTasks({
|
||||
await markAutonomyRunFailed(
|
||||
command.autonomy.runId,
|
||||
`Teammate ${task.agentId} exited before the scheduled message could be delivered.`,
|
||||
command.autonomy.rootDir,
|
||||
)
|
||||
}
|
||||
return
|
||||
@@ -139,24 +189,32 @@ export function useScheduledTasks({
|
||||
return
|
||||
}
|
||||
|
||||
const command = await createAutonomyQueuedPrompt({
|
||||
basePrompt: task.prompt,
|
||||
trigger: 'scheduled-task',
|
||||
currentDir: getCwd(),
|
||||
sourceId: task.id,
|
||||
sourceLabel: task.prompt,
|
||||
workload: WORKLOAD_CRON,
|
||||
})
|
||||
const command = await createScheduledTaskQueuedCommand(
|
||||
task,
|
||||
{ shouldCreate: () => !disposed },
|
||||
)
|
||||
if (!command) {
|
||||
return
|
||||
}
|
||||
if (disposed) {
|
||||
await markAutonomyRunCancelled(
|
||||
command.autonomy!.runId,
|
||||
command.autonomy!.rootDir,
|
||||
)
|
||||
return
|
||||
}
|
||||
|
||||
const msg = createScheduledTaskFireMessage(
|
||||
`Running scheduled task (${formatCronFireTime(new Date())})`,
|
||||
)
|
||||
setMessages(prev => [...prev, msg])
|
||||
enqueuePendingNotification(command)
|
||||
})()
|
||||
})().catch(error =>
|
||||
logForDebugging(
|
||||
`[ScheduledTasks] failed to enqueue task ${task.id}: ${error}`,
|
||||
{ level: 'error' },
|
||||
),
|
||||
)
|
||||
},
|
||||
isLoading: () => isLoadingRef.current,
|
||||
assistantMode,
|
||||
@@ -164,7 +222,10 @@ export function useScheduledTasks({
|
||||
isKilled: () => !isKairosCronEnabled(),
|
||||
})
|
||||
scheduler.start()
|
||||
return () => scheduler.stop()
|
||||
return () => {
|
||||
disposed = true
|
||||
scheduler.stop()
|
||||
}
|
||||
// assistantMode is stable for the session lifetime; store/setAppState are
|
||||
// stable refs from useSyncExternalStore; setMessages is a stable useCallback.
|
||||
// eslint-disable-next-line react-hooks/exhaustive-deps
|
||||
|
||||
@@ -9,7 +9,9 @@ import { useEffect, useRef } from 'react'
|
||||
import type { QueuedCommand } from '../types/textInputTypes.js'
|
||||
import { TICK_TAG } from '../constants/xml.js'
|
||||
import { getCwd } from '../utils/cwd.js'
|
||||
import { cancelQueuedAutonomyCommands } from '../utils/autonomyQueueLifecycle.js'
|
||||
import { createProactiveAutonomyCommands } from '../utils/autonomyRuns.js'
|
||||
import { logForDebugging } from '../utils/debug.js'
|
||||
import {
|
||||
isProactiveActive,
|
||||
isProactivePaused,
|
||||
@@ -38,6 +40,8 @@ export function useProactive(opts: UseProactiveOpts): void {
|
||||
if (!isProactiveActive()) return
|
||||
|
||||
let timer: ReturnType<typeof setTimeout> | null = null
|
||||
let disposed = false
|
||||
let generating = false
|
||||
|
||||
function scheduleTick(): void {
|
||||
const nextTs = Date.now() + TICK_INTERVAL_MS
|
||||
@@ -66,25 +70,51 @@ export function useProactive(opts: UseProactiveOpts): void {
|
||||
isLoading ||
|
||||
isInPlanMode ||
|
||||
hasActiveLocalJsxUI ||
|
||||
queuedCommandsLength > 0
|
||||
queuedCommandsLength > 0 ||
|
||||
generating
|
||||
) {
|
||||
scheduleTick()
|
||||
return
|
||||
}
|
||||
|
||||
generating = true
|
||||
void (async () => {
|
||||
const commands = await createProactiveAutonomyCommands({
|
||||
basePrompt: `<${TICK_TAG}>${new Date().toLocaleTimeString()}</${TICK_TAG}>`,
|
||||
currentDir: getCwd(),
|
||||
shouldCreate: () => !disposed,
|
||||
})
|
||||
for (const command of commands) {
|
||||
// Always queue proactive turns. This avoids races where the prompt
|
||||
// is built asynchronously, a user turn starts meanwhile, and a
|
||||
// direct-submit path would silently drop the autonomy turn after
|
||||
// consuming its heartbeat due-state.
|
||||
optsRef.current.onQueueTick(command)
|
||||
if (disposed) {
|
||||
await cancelQueuedAutonomyCommands({ commands })
|
||||
return
|
||||
}
|
||||
const queuedCommands: QueuedCommand[] = []
|
||||
try {
|
||||
for (const command of commands) {
|
||||
// Always queue proactive turns. This avoids races where the prompt
|
||||
// is built asynchronously, a user turn starts meanwhile, and a
|
||||
// direct-submit path would silently drop the autonomy turn after
|
||||
// consuming its heartbeat due-state.
|
||||
optsRef.current.onQueueTick(command)
|
||||
queuedCommands.push(command)
|
||||
}
|
||||
} catch (error) {
|
||||
await cancelQueuedAutonomyCommands({
|
||||
commands: commands.filter(
|
||||
command => !queuedCommands.includes(command),
|
||||
),
|
||||
})
|
||||
throw error
|
||||
}
|
||||
})()
|
||||
.catch(error =>
|
||||
logForDebugging(`[Proactive] failed to create tick: ${error}`, {
|
||||
level: 'error',
|
||||
}),
|
||||
)
|
||||
.finally(() => {
|
||||
generating = false
|
||||
})
|
||||
|
||||
// Schedule next tick
|
||||
scheduleTick()
|
||||
@@ -94,6 +124,7 @@ export function useProactive(opts: UseProactiveOpts): void {
|
||||
scheduleTick()
|
||||
|
||||
return () => {
|
||||
disposed = true
|
||||
if (timer !== null) {
|
||||
clearTimeout(timer)
|
||||
timer = null
|
||||
|
||||
152
src/query.ts
@@ -71,10 +71,16 @@ const jobClassifier = feature('TEMPLATES')
|
||||
: null
|
||||
/* eslint-enable @typescript-eslint/no-require-imports */
|
||||
import {
|
||||
enqueue,
|
||||
remove as removeFromQueue,
|
||||
getCommandsByMaxPriority,
|
||||
isSlashCommand,
|
||||
} from './utils/messageQueueManager.js'
|
||||
import {
|
||||
type AutonomyTurnOutcome,
|
||||
claimConsumableQueuedAutonomyCommands,
|
||||
finalizeAutonomyCommandsForTurn,
|
||||
} from './utils/autonomyQueueLifecycle.js'
|
||||
import { notifyCommandLifecycle } from './utils/commandLifecycle.js'
|
||||
import { headlessProfilerCheckpoint } from './utils/headlessProfiler.js'
|
||||
import {
|
||||
@@ -92,6 +98,7 @@ import { SLEEP_TOOL_NAME } from '@claude-code-best/builtin-tools/tools/SleepTool
|
||||
import { executePostSamplingHooks } from './utils/hooks/postSamplingHooks.js'
|
||||
import { executeStopFailureHooks } from './utils/hooks.js'
|
||||
import type { QuerySource } from './constants/querySource.js'
|
||||
import type { QueuedCommand } from './types/textInputTypes.js'
|
||||
import { createDumpPromptsFetch } from './services/api/dumpPrompts.js'
|
||||
import { StreamingToolExecutor } from './services/tools/StreamingToolExecutor.js'
|
||||
import { queryCheckpoint } from './utils/queryProfiler.js'
|
||||
@@ -111,7 +118,11 @@ import {
|
||||
} from './bootstrap/state.js'
|
||||
import { createBudgetTracker, checkTokenBudget } from './query/tokenBudget.js'
|
||||
import { count } from './utils/array.js'
|
||||
import { createTrace, endTrace, isLangfuseEnabled } from './services/langfuse/index.js'
|
||||
import {
|
||||
createTrace,
|
||||
endTrace,
|
||||
isLangfuseEnabled,
|
||||
} from './services/langfuse/index.js'
|
||||
import { getAPIProvider } from './utils/model/providers.js'
|
||||
|
||||
/* eslint-disable @typescript-eslint/no-require-imports */
|
||||
@@ -129,7 +140,11 @@ function* yieldMissingToolResultBlocks(
|
||||
) {
|
||||
for (const assistantMessage of assistantMessages) {
|
||||
// Extract all tool use blocks from this assistant message
|
||||
const toolUseBlocks = (Array.isArray(assistantMessage.message?.content) ? assistantMessage.message.content : []).filter(
|
||||
const toolUseBlocks = (
|
||||
Array.isArray(assistantMessage.message?.content)
|
||||
? assistantMessage.message.content
|
||||
: []
|
||||
).filter(
|
||||
(content: { type: string }) => content.type === 'tool_use',
|
||||
) as ToolUseBlock[]
|
||||
|
||||
@@ -181,6 +196,33 @@ function isWithheldMaxOutputTokens(
|
||||
return msg?.type === 'assistant' && msg.apiError === 'max_output_tokens'
|
||||
}
|
||||
|
||||
function getAutonomyTurnOutcome(params: {
|
||||
terminal?: Terminal
|
||||
thrownError?: unknown
|
||||
}): AutonomyTurnOutcome {
|
||||
if (params.thrownError !== undefined) {
|
||||
return { type: 'failed', error: params.thrownError }
|
||||
}
|
||||
|
||||
const terminal = params.terminal
|
||||
const reason = terminal?.reason
|
||||
switch (reason) {
|
||||
case 'completed':
|
||||
return { type: 'completed' }
|
||||
case undefined:
|
||||
case 'aborted_streaming':
|
||||
case 'aborted_tools':
|
||||
return { type: 'cancelled' }
|
||||
case 'model_error':
|
||||
return { type: 'failed', error: terminal.error }
|
||||
default:
|
||||
return {
|
||||
type: 'failed',
|
||||
message: `query ended without successful completion: ${reason}`,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
export type QueryParams = {
|
||||
messages: Message[]
|
||||
systemPrompt: SystemPrompt
|
||||
@@ -230,6 +272,7 @@ export async function* query(
|
||||
Terminal
|
||||
> {
|
||||
const consumedCommandUuids: string[] = []
|
||||
const consumedAutonomyCommands: QueuedCommand[] = []
|
||||
|
||||
// Create Langfuse trace for this query turn (no-op if not configured).
|
||||
// When called as a sub-agent, langfuseTrace is already set by runAgent()
|
||||
@@ -238,8 +281,9 @@ export async function* query(
|
||||
logForDebugging(
|
||||
`[query] ownsTrace=${ownsTrace} incoming langfuseTrace=${params.toolUseContext.langfuseTrace ? 'present' : 'null/undefined'} isLangfuseEnabled=${isLangfuseEnabled()}`,
|
||||
)
|
||||
const langfuseTrace = params.toolUseContext.langfuseTrace
|
||||
?? (isLangfuseEnabled()
|
||||
const langfuseTrace =
|
||||
params.toolUseContext.langfuseTrace ??
|
||||
(isLangfuseEnabled()
|
||||
? createTrace({
|
||||
sessionId: getSessionId(),
|
||||
model: params.toolUseContext.options.mainLoopModel,
|
||||
@@ -258,9 +302,34 @@ export async function* query(
     : params

   let terminal: Terminal | undefined
+  let didThrow = false
+  let thrownError: unknown
   try {
-    terminal = yield* queryLoop(paramsWithTrace, consumedCommandUuids)
+    terminal = yield* queryLoop(
+      paramsWithTrace,
+      consumedCommandUuids,
+      consumedAutonomyCommands,
+    )
+  } catch (error) {
+    didThrow = true
+    thrownError = error
+    throw error
   } finally {
+    await finalizeAutonomyCommandsForTurn({
+      commands: consumedAutonomyCommands,
+      outcome: getAutonomyTurnOutcome({
+        terminal,
+        ...(didThrow ? { thrownError } : {}),
+      }),
+      priority: 'later',
+    })
+      .then(nextCommands => {
+        for (const command of nextCommands) {
+          enqueue(command)
+        }
+      })
+      .catch(logError)
+
     // Only end the trace if we created it — sub-agents own their traces
     if (ownsTrace) {
       const isAborted =
@@ -283,6 +352,7 @@ export async function* query(
|
||||
async function* queryLoop(
|
||||
params: QueryParams,
|
||||
consumedCommandUuids: string[],
|
||||
consumedAutonomyCommands: QueuedCommand[],
|
||||
): AsyncGenerator<
|
||||
| StreamEvent
|
||||
| RequestStartEvent
|
||||
@@ -790,7 +860,14 @@ async function* queryLoop(
|
||||
let yieldMessage: typeof message = message
|
||||
if (message.type === 'assistant') {
|
||||
const assistantMsg = message as AssistantMessage
|
||||
const contentArr = Array.isArray(assistantMsg.message?.content) ? assistantMsg.message.content as unknown as Array<{ type: string; input?: unknown; name?: string; [key: string]: unknown }> : []
|
||||
const contentArr = Array.isArray(assistantMsg.message?.content)
|
||||
? (assistantMsg.message.content as unknown as Array<{
|
||||
type: string
|
||||
input?: unknown
|
||||
name?: string
|
||||
[key: string]: unknown
|
||||
}>)
|
||||
: []
|
||||
let clonedContent: typeof contentArr | undefined
|
||||
for (let i = 0; i < contentArr.length; i++) {
|
||||
const block = contentArr[i]!
|
||||
@@ -826,7 +903,10 @@ async function* queryLoop(
|
||||
if (clonedContent) {
|
||||
yieldMessage = {
|
||||
...message,
|
||||
message: { ...(assistantMsg.message ?? {}), content: clonedContent },
|
||||
message: {
|
||||
...(assistantMsg.message ?? {}),
|
||||
content: clonedContent,
|
||||
},
|
||||
} as typeof message
|
||||
}
|
||||
}
|
||||
@@ -872,7 +952,11 @@ async function* queryLoop(
|
||||
const assistantMessage = message as AssistantMessage
|
||||
assistantMessages.push(assistantMessage)
|
||||
|
||||
const msgToolUseBlocks = (Array.isArray(assistantMessage.message?.content) ? assistantMessage.message.content : []).filter(
|
||||
const msgToolUseBlocks = (
|
||||
Array.isArray(assistantMessage.message?.content)
|
||||
? assistantMessage.message.content
|
||||
: []
|
||||
).filter(
|
||||
(content: { type: string }) => content.type === 'tool_use',
|
||||
) as ToolUseBlock[]
|
||||
if (msgToolUseBlocks.length > 0) {
|
||||
@@ -1005,7 +1089,10 @@ async function* queryLoop(
|
||||
logEvent('tengu_query_error', {
|
||||
assistantMessages: assistantMessages.length,
|
||||
toolUses: assistantMessages.flatMap(_ =>
|
||||
(Array.isArray(_.message?.content) ? _.message.content as Array<{ type: string }> : []).filter(content => content.type === 'tool_use'),
|
||||
(Array.isArray(_.message?.content)
|
||||
? (_.message.content as Array<{ type: string }>)
|
||||
: []
|
||||
).filter(content => content.type === 'tool_use'),
|
||||
).length,
|
||||
|
||||
queryChainId: queryChainIdForAnalytics,
|
||||
@@ -1307,7 +1394,10 @@ async function* queryLoop(
|
||||
// error → hook blocking → retry → error → …
|
||||
if (lastMessage?.isApiErrorMessage) {
|
||||
void executeStopFailureHooks(lastMessage, toolUseContext)
|
||||
return { reason: 'completed' }
|
||||
return {
|
||||
reason: 'model_error',
|
||||
error: lastMessage.error ?? lastMessage.apiError ?? 'api_error',
|
||||
}
|
||||
}
|
||||
|
||||
const stopHookResult = yield* handleStopHooks(
|
||||
@@ -1408,7 +1498,6 @@ async function* queryLoop(
|
||||
|
||||
queryCheckpoint('query_tool_execution_start')
|
||||
|
||||
|
||||
if (streamingToolExecutor) {
|
||||
logEvent('tengu_streaming_tool_execution_used', {
|
||||
tool_count: toolUseBlocks.length,
|
||||
@@ -1468,9 +1557,14 @@ async function* queryLoop(
|
||||
const lastAssistantMessage = assistantMessages.at(-1)
|
||||
let lastAssistantText: string | undefined
|
||||
if (lastAssistantMessage) {
|
||||
const textBlocks = (Array.isArray(lastAssistantMessage.message?.content) ? lastAssistantMessage.message.content as Array<{ type: string; text?: string }> : []).filter(
|
||||
block => block.type === 'text',
|
||||
)
|
||||
const textBlocks = (
|
||||
Array.isArray(lastAssistantMessage.message?.content)
|
||||
? (lastAssistantMessage.message.content as Array<{
|
||||
type: string
|
||||
text?: string
|
||||
}>)
|
||||
: []
|
||||
).filter(block => block.type === 'text')
|
||||
if (textBlocks.length > 0) {
|
||||
const lastTextBlock = textBlocks.at(-1)
|
||||
if (lastTextBlock && 'text' in lastTextBlock) {
|
||||
@@ -1622,12 +1716,32 @@ async function* queryLoop(
|
||||
// user prompts, even if someone stamps an agentId on one.
|
||||
return cmd.mode === 'task-notification' && cmd.agentId === currentAgentId
|
||||
})
|
||||
const queuedAutonomyClaim = await claimConsumableQueuedAutonomyCommands(
|
||||
queuedCommandsSnapshot,
|
||||
)
|
||||
if (queuedAutonomyClaim.staleCommands.length > 0) {
|
||||
removeFromQueue(queuedAutonomyClaim.staleCommands)
|
||||
}
|
||||
|
||||
const claimedConsumedCommands = queuedAutonomyClaim.claimedCommands.filter(
|
||||
cmd => cmd.mode === 'prompt' || cmd.mode === 'task-notification',
|
||||
)
|
||||
if (claimedConsumedCommands.length > 0) {
|
||||
consumedAutonomyCommands.push(...claimedConsumedCommands)
|
||||
for (const cmd of claimedConsumedCommands) {
|
||||
if (cmd.uuid) {
|
||||
consumedCommandUuids.push(cmd.uuid)
|
||||
notifyCommandLifecycle(cmd.uuid, 'started')
|
||||
}
|
||||
}
|
||||
removeFromQueue(claimedConsumedCommands)
|
||||
}
|
||||
|
||||
for await (const attachment of getAttachmentMessages(
|
||||
null,
|
||||
updatedToolUseContext,
|
||||
null,
|
||||
queuedCommandsSnapshot,
|
||||
queuedAutonomyClaim.attachmentCommands,
|
||||
[...messagesForQuery, ...assistantMessages, ...toolResults],
|
||||
querySource,
|
||||
)) {
|
||||
@@ -1659,7 +1773,6 @@ async function* queryLoop(
|
||||
pendingMemoryPrefetch.consumedOnIteration = turnCount - 1
|
||||
}
|
||||
|
||||
|
||||
// Inject prefetched skill discovery. collectSkillDiscoveryPrefetch emits
|
||||
// hidden_by_main_turn — true when the prefetch resolved before this point
|
||||
// (should be >98% at AKI@250ms / Haiku@573ms vs turn durations of 2-30s).
|
||||
@@ -1675,8 +1788,11 @@ async function* queryLoop(
|
||||
|
||||
// Remove only commands that were actually consumed as attachments.
|
||||
// Prompt and task-notification commands are converted to attachments above.
|
||||
const consumedCommands = queuedCommandsSnapshot.filter(
|
||||
cmd => cmd.mode === 'prompt' || cmd.mode === 'task-notification',
|
||||
const claimedCommandSet = new Set(claimedConsumedCommands)
|
||||
const consumedCommands = queuedAutonomyClaim.attachmentCommands.filter(
|
||||
cmd =>
|
||||
(cmd.mode === 'prompt' || cmd.mode === 'task-notification') &&
|
||||
!claimedCommandSet.has(cmd),
|
||||
)
|
||||
if (consumedCommands.length > 0) {
|
||||
for (const cmd of consumedCommands) {
|
||||
|
||||
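The hunks above consume three lists from `claimConsumableQueuedAutonomyCommands`: `claimedCommands`, `staleCommands`, and `attachmentCommands`. Its implementation is not part of this excerpt, so the following is only a guess at the partitioning it performs, modelled synchronously against an in-memory status map. `Command`, `ClaimResult`, and `claimCommands` are hypothetical names, and the assumption that non-autonomy commands pass straight through to `attachmentCommands` is inferred from how the call site uses the result.

```typescript
// Hypothetical, synchronous model of terminal-safe claiming.
type RunStatus = 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'

interface Command {
  value: string
  runId?: string // present only for autonomy-backed commands (assumption)
}

interface ClaimResult {
  claimedCommands: Command[] // autonomy commands whose run moved queued -> running
  staleCommands: Command[] // autonomy commands whose run could not be claimed
  attachmentCommands: Command[] // everything that may be injected this turn
}

function claimCommands(
  snapshot: readonly Command[],
  runs: Map<string, RunStatus>,
): ClaimResult {
  const result: ClaimResult = {
    claimedCommands: [],
    staleCommands: [],
    attachmentCommands: [],
  }
  for (const cmd of snapshot) {
    if (cmd.runId === undefined) {
      // Plain commands need no claim.
      result.attachmentCommands.push(cmd)
      continue
    }
    if (runs.get(cmd.runId) === 'queued') {
      // The only legal claim transition: queued -> running.
      runs.set(cmd.runId, 'running')
      result.claimedCommands.push(cmd)
      result.attachmentCommands.push(cmd)
    } else {
      // Unknown, already running, or terminal: never re-queue, never inject.
      result.staleCommands.push(cmd)
    }
  }
  return result
}
```

Because the claim only succeeds from `queued`, a run that was already consumed, cancelled, or failed ends up in `staleCommands` and is dropped from the queue instead of being injected a second time.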
@@ -1,3 +1,20 @@
|
||||
// Auto-generated stub — replace with real implementation
|
||||
export type Terminal = any;
|
||||
export type Continue = any;
|
||||
export type Terminal =
|
||||
| { reason: 'completed' }
|
||||
| { reason: 'blocking_limit' }
|
||||
| { reason: 'image_error' }
|
||||
| { reason: 'model_error'; error?: unknown }
|
||||
| { reason: 'aborted_streaming' }
|
||||
| { reason: 'aborted_tools' }
|
||||
| { reason: 'prompt_too_long' }
|
||||
| { reason: 'stop_hook_prevented' }
|
||||
| { reason: 'hook_stopped' }
|
||||
| { reason: 'max_turns'; turnCount: number }
|
||||
|
||||
export type Continue =
|
||||
| { reason: 'collapse_drain_retry'; committed: number }
|
||||
| { reason: 'reactive_compact_retry' }
|
||||
| { reason: 'max_output_tokens_escalate' }
|
||||
| { reason: 'max_output_tokens_recovery'; attempt: number }
|
||||
| { reason: 'stop_hook_blocking' }
|
||||
| { reason: 'token_budget_continuation' }
|
||||
| { reason: 'next_turn' }
|
||||
|
||||
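Replacing `Terminal = any` with a discriminated union means the compiler can check any mapping from terminal reasons to autonomy outcomes. The sketch below restates the mapping used by `getAutonomyTurnOutcome` in `src/query.ts`, but lists every failing reason explicitly and adds a `never` check, so a newly added reason becomes a compile error rather than silently taking the `default` branch. `Outcome` and `toOutcome` are illustrative names; the real `AutonomyTurnOutcome` type is not shown in this excerpt.

```typescript
type Terminal =
  | { reason: 'completed' }
  | { reason: 'blocking_limit' }
  | { reason: 'image_error' }
  | { reason: 'model_error'; error?: unknown }
  | { reason: 'aborted_streaming' }
  | { reason: 'aborted_tools' }
  | { reason: 'prompt_too_long' }
  | { reason: 'stop_hook_prevented' }
  | { reason: 'hook_stopped' }
  | { reason: 'max_turns'; turnCount: number }

// Hypothetical stand-in for AutonomyTurnOutcome.
type Outcome =
  | { type: 'completed' }
  | { type: 'cancelled' }
  | { type: 'failed'; error?: unknown; message?: string }

function toOutcome(terminal: Terminal | undefined): Outcome {
  // No terminal value means the generator was closed before finishing.
  if (terminal === undefined) return { type: 'cancelled' }
  switch (terminal.reason) {
    case 'completed':
      return { type: 'completed' }
    case 'aborted_streaming':
    case 'aborted_tools':
      return { type: 'cancelled' }
    case 'model_error':
      return { type: 'failed', error: terminal.error }
    case 'blocking_limit':
    case 'image_error':
    case 'prompt_too_long':
    case 'stop_hook_prevented':
    case 'hook_stopped':
    case 'max_turns':
      return {
        type: 'failed',
        message: `query ended without successful completion: ${terminal.reason}`,
      }
    default: {
      // Compile-time exhaustiveness check: unreachable if all reasons are handled.
      const unreachable: never = terminal
      return unreachable
    }
  }
}
```

The trade-off is verbosity: the `default` branch in the PR is shorter and fails safe at runtime, while the explicit list forces a decision whenever a new reason is added.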
@@ -79,10 +79,9 @@ import { isEnvTruthy } from '../utils/envUtils.js';
|
||||
import { formatTokens, truncateToWidth } from '../utils/format.js';
|
||||
import { consumeEarlyInput } from '../utils/earlyInput.js';
|
||||
import {
|
||||
finalizeAutonomyRunCompleted,
|
||||
finalizeAutonomyRunFailed,
|
||||
markAutonomyRunRunning,
|
||||
} from '../utils/autonomyRuns.js';
|
||||
claimConsumableQueuedAutonomyCommands,
|
||||
finalizeAutonomyCommandsForTurn,
|
||||
} from '../utils/autonomyQueueLifecycle.js';
|
||||
|
||||
import { setMemberActive } from '../utils/swarm/teamHelpers.js';
|
||||
import {
|
||||
@@ -3054,18 +3053,19 @@ export function REPL({
|
||||
setMessages(old => {
|
||||
const postBoundary = getMessagesAfterCompactBoundary(old, {
|
||||
includeSnipped: true,
|
||||
})
|
||||
});
|
||||
// Hard cap: keep at most 500 messages in fullscreen scrollback
|
||||
// to prevent unbounded memory growth in multi-day sessions.
|
||||
// normalizeMessages/applyGrouping are O(n), and Ink fiber
|
||||
// trees cost ~250KB RSS per message. Without this cap,
|
||||
// scrollback after several compactions can reach thousands
|
||||
// of messages (observed: 13k+, 1GB+ heap).
|
||||
const MAX_FULLSCREEN_SCROLLBACK = 500
|
||||
const kept = postBoundary.length > MAX_FULLSCREEN_SCROLLBACK
|
||||
? postBoundary.slice(-MAX_FULLSCREEN_SCROLLBACK)
|
||||
: postBoundary
|
||||
return [...kept, newMessage]
|
||||
const MAX_FULLSCREEN_SCROLLBACK = 500;
|
||||
const kept =
|
||||
postBoundary.length > MAX_FULLSCREEN_SCROLLBACK
|
||||
? postBoundary.slice(-MAX_FULLSCREEN_SCROLLBACK)
|
||||
: postBoundary;
|
||||
return [...kept, newMessage];
|
||||
});
|
||||
} else {
|
||||
setMessages(() => [newMessage]);
|
||||
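The scrollback cap in this hunk can be read in isolation as the small helper below. This is a minimal sketch: `appendWithCap` is a hypothetical name (the diff inlines the logic inside the `setMessages` updater), and only the constant and the `slice(-N)` step come from the change itself.

```typescript
// Hypothetical helper; the diff performs this inline in a setMessages updater.
const MAX_FULLSCREEN_SCROLLBACK = 500;

function appendWithCap<T>(
  history: readonly T[],
  next: T,
  cap: number = MAX_FULLSCREEN_SCROLLBACK,
): T[] {
  // Keep only the newest `cap` entries, then append the new message.
  const kept = history.length > cap ? history.slice(-cap) : history;
  return [...kept, next];
}
```

Because the cap is applied before the append, the resulting array holds at most `cap + 1` entries, which matches the behaviour of the hunk above.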
@@ -3098,13 +3098,10 @@ export function REPL({
|
||||
// so interleaved non-ephemeral messages caused duplicate progress
|
||||
// entries to accumulate (observed 13k+ entries in sleep-heavy sessions).
|
||||
for (let i = oldMessages.length - 1; i >= 0; i--) {
|
||||
const m = oldMessages[i]!
|
||||
if (m.type !== 'progress') break
|
||||
const mData = m.data as Record<string, unknown> | undefined
|
||||
if (
|
||||
m.parentToolUseID === newMessage.parentToolUseID &&
|
||||
mData?.type === newData.type
|
||||
) {
|
||||
const m = oldMessages[i]!;
|
||||
if (m.type !== 'progress') break;
|
||||
const mData = m.data as Record<string, unknown> | undefined;
|
||||
if (m.parentToolUseID === newMessage.parentToolUseID && mData?.type === newData.type) {
|
||||
const copy = oldMessages.slice();
|
||||
copy[i] = newMessage;
|
||||
return copy;
|
||||
@@ -4844,44 +4841,44 @@ export function REPL({
|
||||
} satisfies QueuedCommand)
|
||||
: input;
|
||||
|
||||
const newAbortController = createAbortController();
|
||||
setAbortController(newAbortController);
|
||||
void (async () => {
|
||||
const claim = await claimConsumableQueuedAutonomyCommands([queuedCommand]);
|
||||
const command = claim.attachmentCommands[0];
|
||||
if (!command) return;
|
||||
|
||||
// Create a user message with the formatted content (includes XML wrapper)
|
||||
const userMessage = createUserMessage({
|
||||
content: queuedCommand.value as string,
|
||||
isMeta: queuedCommand.isMeta ? true : undefined,
|
||||
origin: queuedCommand.origin,
|
||||
});
|
||||
const newAbortController = createAbortController();
|
||||
setAbortController(newAbortController);
|
||||
|
||||
const autonomyRunId = queuedCommand.autonomy?.runId;
|
||||
if (autonomyRunId) {
|
||||
void markAutonomyRunRunning(autonomyRunId);
|
||||
}
|
||||
|
||||
void onQuery([userMessage], newAbortController, true, [], mainLoopModel)
|
||||
.then(() => {
|
||||
if (autonomyRunId) {
|
||||
void finalizeAutonomyRunCompleted({
|
||||
runId: autonomyRunId,
|
||||
currentDir: getCwd(),
|
||||
priority: 'later',
|
||||
}).then(nextCommands => {
|
||||
for (const command of nextCommands) {
|
||||
enqueue(command);
|
||||
}
|
||||
});
|
||||
}
|
||||
})
|
||||
.catch((error: unknown) => {
|
||||
if (autonomyRunId) {
|
||||
void finalizeAutonomyRunFailed({
|
||||
runId: autonomyRunId,
|
||||
error: String(error),
|
||||
});
|
||||
}
|
||||
logError(toError(error));
|
||||
// Create a user message with the formatted content (includes XML wrapper)
|
||||
const userMessage = createUserMessage({
|
||||
content: command.value as string,
|
||||
isMeta: command.isMeta ? true : undefined,
|
||||
origin: command.origin,
|
||||
});
|
||||
|
||||
try {
|
||||
await onQuery([userMessage], newAbortController, true, [], mainLoopModel);
|
||||
const nextCommands = await finalizeAutonomyCommandsForTurn({
|
||||
commands: claim.claimedCommands,
|
||||
outcome: { type: 'completed' },
|
||||
currentDir: getCwd(),
|
||||
priority: 'later',
|
||||
});
|
||||
for (const nextCommand of nextCommands) {
|
||||
enqueue(nextCommand);
|
||||
}
|
||||
} catch (error: unknown) {
|
||||
await finalizeAutonomyCommandsForTurn({
|
||||
commands: claim.claimedCommands,
|
||||
outcome: { type: 'failed', error },
|
||||
currentDir: getCwd(),
|
||||
priority: 'later',
|
||||
});
|
||||
logError(toError(error));
|
||||
}
|
||||
})().catch((error: unknown) => {
|
||||
logError(toError(error));
|
||||
});
|
||||
return true;
|
||||
},
|
||||
[onQuery, mainLoopModel, store],
|
||||
|
||||
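The new control flow in this hunk is claim first, inject only if the claim produced a command, then finalize on both the success and the failure path. The sketch below restates that shape with injected dependencies. `runClaimedTurn`, `Claim`, `Outcome`, and the `deps` object are illustrative names, not the signatures of `claimConsumableQueuedAutonomyCommands` or `finalizeAutonomyCommandsForTurn`.

```typescript
// Illustrative types; the real claim/finalize signatures are not shown in full in this diff.
type Outcome = { type: 'completed' } | { type: 'failed'; error: unknown };

interface Claim<C> {
  attachmentCommands: C[];
  claimedCommands: C[];
}

async function runClaimedTurn<C>(
  queued: C,
  deps: {
    claim: (commands: C[]) => Promise<Claim<C>>;
    query: (command: C) => Promise<void>;
    finalize: (commands: C[], outcome: Outcome) => Promise<C[]>;
    enqueue: (command: C) => void;
  },
): Promise<'skipped' | 'completed' | 'failed'> {
  const claim = await deps.claim([queued]);
  const command = claim.attachmentCommands[0];
  // Claim lost (run already consumed, cancelled, or failed): inject nothing.
  if (command === undefined) return 'skipped';
  try {
    await deps.query(command);
    const next = await deps.finalize(claim.claimedCommands, { type: 'completed' });
    for (const followUp of next) deps.enqueue(followUp);
    return 'completed';
  } catch (error: unknown) {
    // Provider error or rejected query: the run is still finalized.
    await deps.finalize(claim.claimedCommands, { type: 'failed', error });
    return 'failed';
  }
}
```

Passing the dependencies in keeps the sketch testable without the REPL; the hunk above closes over the real helpers instead.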
@@ -7,7 +7,6 @@ import { clearClassifierApprovals } from '../../utils/classifierApprovals.js'
|
||||
import { resetGetMemoryFilesCache } from '../../utils/claudemd.js'
|
||||
import { clearSessionMessagesCache } from '../../utils/sessionStorage.js'
|
||||
import { clearBetaTracingState } from '../../utils/telemetry/betaSessionTracing.js'
|
||||
import { getLspServerManager } from '../../services/lsp/manager.js'
|
||||
import { resetMicrocompactState } from './microCompact.js'
|
||||
|
||||
/**
|
||||
@@ -29,7 +28,7 @@ import { resetMicrocompactState } from './microCompact.js'
|
||||
* pass querySource — undefined is only safe for callers that are
|
||||
* genuinely main-thread-only (/compact, /clear).
|
||||
*/
|
||||
export async function runPostCompactCleanup(querySource?: QuerySource): Promise<void> {
|
||||
export function runPostCompactCleanup(querySource?: QuerySource): void {
|
||||
// Subagents (agent:*) run in the same process and share module-level
|
||||
// state with the main thread. Only reset main-thread module-level state
|
||||
// (context-collapse, memory file cache) for main-thread compacts.
|
||||
@@ -75,15 +74,4 @@ export async function runPostCompactCleanup(querySource?: QuerySource): Promise<
|
||||
)
|
||||
}
|
||||
clearSessionMessagesCache()
|
||||
// Close all LSP-tracked files so servers release state for files no longer
|
||||
// in the active context after compaction. Best-effort — LSP may not be
|
||||
// initialized, and closeAllFiles catches per-file errors internally.
|
||||
try {
|
||||
const lspManager = getLspServerManager()
|
||||
if (lspManager) {
|
||||
await lspManager.closeAllFiles()
|
||||
}
|
||||
} catch {
|
||||
// LSP module may not be available in all environments
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1,12 +1,36 @@
|
||||
import { feature } from 'bun:bundle'
|
||||
|
||||
export function isSkillLearningEnabled(): boolean {
|
||||
if (process.env.SKILL_LEARNING_ENABLED === '0') return false
|
||||
if (process.env.SKILL_LEARNING_ENABLED === '1') return true
|
||||
if (process.env.FEATURE_SKILL_LEARNING === '0') return false
|
||||
if (process.env.FEATURE_SKILL_LEARNING === '1') return true
|
||||
if (feature('SKILL_LEARNING')) {
|
||||
return true
|
||||
}
|
||||
/**
|
||||
* Build-time presence check: is the `/skill-learning` slash command
|
||||
* compiled into this build? Used by the command registry's `isEnabled` so
|
||||
* the command appears in the menu whenever it is buildable. Operators
|
||||
* activate the subsystem itself via `/skill-learning start`, which flips
|
||||
* `SKILL_LEARNING_ENABLED=1` and turns the runtime observers on (see
|
||||
* `isSkillLearningEnabled`).
|
||||
*/
|
||||
export function isSkillLearningCompiledIn(): boolean {
|
||||
if (feature('SKILL_LEARNING')) return true
|
||||
return false
|
||||
}
|
||||
|
||||
/**
|
||||
* Runtime activation check: is the skill-learning subsystem actively
|
||||
* running (toolEvent, runtime, session observers attached, persisting
|
||||
* observations to disk)? Off by default — the operator must run
|
||||
* `/skill-learning start` (which sets `SKILL_LEARNING_ENABLED=1`).
|
||||
*
|
||||
* Legacy `FEATURE_SKILL_LEARNING=1` is also accepted for backward
|
||||
* compatibility with operators who set it before the slash-command UX
|
||||
* landed.
|
||||
*
|
||||
* Build-flag gating is intentionally NOT performed here: the command
|
||||
* registry already gates command compilation on the build flag, and this
|
||||
* function is only reached from code paths that the build flag has
|
||||
* already let through. Decoupling keeps the test surface clean (tests
|
||||
* exercise the env-var contract without needing to mock `bun:bundle`).
|
||||
*/
|
||||
export function isSkillLearningEnabled(): boolean {
|
||||
if (process.env.SKILL_LEARNING_ENABLED === '1') return true
|
||||
if (process.env.FEATURE_SKILL_LEARNING === '1') return true
|
||||
return false
|
||||
}
|
||||
|
||||
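The doc comments above describe a runtime gate that depends only on environment variables and is off unless a variable is exactly `'1'`. A minimal sketch of that contract follows, assuming the environment is passed in as a plain record so it can be exercised without touching `process.env`. `Env`, `isSearchActive`, and `isLearningActive` are hypothetical names, not the exported functions.

```typescript
// Hypothetical stand-ins for the env-var contract described in the doc comments.
type Env = Readonly<Record<string, string | undefined>>;

// Active only when the variable is exactly '1'; unset or any other value is off.
function isSearchActive(env: Env): boolean {
  return env['SKILL_SEARCH_ENABLED'] === '1';
}

// Same contract, plus the legacy flag accepted for backward compatibility.
function isLearningActive(env: Env): boolean {
  return env['SKILL_LEARNING_ENABLED'] === '1' || env['FEATURE_SKILL_LEARNING'] === '1';
}
```

Keeping the build-flag check out of these functions is what lets tests cover the contract with plain objects.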
@@ -45,15 +45,44 @@ export function getProjectContextPath(projectId: string): string {
|
||||
// in the tool.call hot path (one wrapper invocation per tool) that cost would
|
||||
// accumulate into the hundreds-of-ms range per session. Cache keyed by the
|
||||
// exact cwd string so different worktrees still get independent entries.
|
||||
//
|
||||
// Bounded with LRU eviction: long-lived processes that traverse many
|
||||
// worktrees (e.g. multi-repo build orchestrators) would otherwise grow the
|
||||
// cache without limit. Each entry holds a SkillLearningProjectContext
|
||||
// (instinct + skill lists), so the cap ensures bounded memory regardless
|
||||
// of cwd diversity. `defines.ts` originally flagged this as
|
||||
// "no eviction mechanism (not the main cause of GB-scale growth)" (translated from the Chinese note); this fix closes that gap.
|
||||
const PROJECT_CONTEXT_CACHE_MAX = 32
|
||||
const PROJECT_CONTEXT_CACHE_TRIM_TO = 24
|
||||
const contextCache = new Map<string, SkillLearningProjectContext>()
|
||||
const PERSIST_INTERVAL_MS = 5 * 60 * 1000
|
||||
let lastPersistAt = 0
|
||||
|
||||
function setProjectContextCache(
|
||||
cwd: string,
|
||||
ctx: SkillLearningProjectContext,
|
||||
): void {
|
||||
if (contextCache.has(cwd)) contextCache.delete(cwd)
|
||||
contextCache.set(cwd, ctx)
|
||||
if (contextCache.size > PROJECT_CONTEXT_CACHE_MAX) {
|
||||
const toDrop = contextCache.size - PROJECT_CONTEXT_CACHE_TRIM_TO
|
||||
const iter = contextCache.keys()
|
||||
for (let i = 0; i < toDrop; i++) {
|
||||
const next = iter.next()
|
||||
if (next.done) break
|
||||
contextCache.delete(next.value)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
export function resolveProjectContext(
|
||||
cwd = process.cwd(),
|
||||
): SkillLearningProjectContext {
|
||||
const cached = contextCache.get(cwd)
|
||||
if (cached) {
|
||||
// Refresh insertion order so frequently-accessed cwds survive eviction.
|
||||
contextCache.delete(cwd)
|
||||
contextCache.set(cwd, cached)
|
||||
// Still touch the registry so long-lived processes keep `lastSeenAt`
|
||||
// reasonably fresh, but throttle the write so it doesn't fire on every
|
||||
// tool call.
|
||||
@@ -65,7 +94,7 @@ export function resolveProjectContext(
|
||||
return cached
|
||||
}
|
||||
const resolved = resolveContext(cwd)
|
||||
contextCache.set(cwd, resolved)
|
||||
setProjectContextCache(cwd, resolved)
|
||||
persistProjectContext(resolved)
|
||||
lastPersistAt = Date.now()
|
||||
return resolved
|
||||
|
||||
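The cache above relies on `Map` iterating in insertion order: delete-then-set moves a key to the back, and eviction walks keys from the front until the size falls to a lower watermark. The sketch below packages the same idea generically. `createBoundedLru` is a hypothetical helper, not something this PR adds, and the small limits in the usage note are only for illustration.

```typescript
// Hypothetical generic form of the Map-based LRU used for the project context cache.
function createBoundedLru<K, V>(max: number, trimTo: number) {
  const entries = new Map<K, V>();
  return {
    get(key: K): V | undefined {
      const value = entries.get(key);
      if (value === undefined) return undefined;
      // Refresh recency: re-inserting moves the key to the back of the order.
      entries.delete(key);
      entries.set(key, value);
      return value;
    },
    set(key: K, value: V): void {
      entries.delete(key);
      entries.set(key, value);
      if (entries.size <= max) return;
      // Over the cap: drop the oldest keys until only `trimTo` remain.
      const toDrop = entries.size - trimTo;
      const iter = entries.keys();
      for (let i = 0; i < toDrop; i++) {
        const next = iter.next();
        if (next.done) break;
        entries.delete(next.value);
      }
    },
    has: (key: K): boolean => entries.has(key),
    size: (): number => entries.size,
  };
}
```

Trimming to a lower watermark (24 of 32 in the hunk above) means eviction runs once per batch of inserts rather than on every insert past the cap.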
@@ -23,8 +23,30 @@ export type PromotionOptions = {
|
||||
minConfidence?: number
|
||||
}
|
||||
|
||||
/**
|
||||
* Set bounded with FIFO eviction. # promotions per session is small in
|
||||
* practice (single digits), but a long-lived sandbox/daemon could push
|
||||
* this if it never restarts. The cap is defensive and the degraded
|
||||
* behaviour — re-promote if we exceed N then forget the oldest — is
|
||||
* benign because promotion is idempotent at the lifecycle layer.
|
||||
*/
|
||||
const SESSION_PROMOTED_IDS_MAX = 256
|
||||
const SESSION_PROMOTED_IDS_TRIM_TO = 192
|
||||
const sessionPromotedIds = new Set<string>()
|
||||
|
||||
function recordSessionPromoted(id: string): void {
|
||||
sessionPromotedIds.add(id)
|
||||
if (sessionPromotedIds.size > SESSION_PROMOTED_IDS_MAX) {
|
||||
const toDrop = sessionPromotedIds.size - SESSION_PROMOTED_IDS_TRIM_TO
|
||||
const iter = sessionPromotedIds.values()
|
||||
for (let i = 0; i < toDrop; i++) {
|
||||
const next = iter.next()
|
||||
if (next.done) break
|
||||
sessionPromotedIds.delete(next.value)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
export function resetPromotionBookkeeping(): void {
|
||||
sessionPromotedIds.clear()
|
||||
}
|
||||
@@ -103,7 +125,7 @@ export async function checkPromotion(
|
||||
}
|
||||
await saveInstinct(globalInstinct, globalOptions)
|
||||
|
||||
sessionPromotedIds.add(candidate.instinctId)
|
||||
recordSessionPromoted(candidate.instinctId)
|
||||
promoted.push(candidate)
|
||||
}
|
||||
|
||||
|
||||
@@ -1,10 +1,30 @@
|
||||
import { feature } from 'bun:bundle'
|
||||
|
||||
export function isSkillSearchEnabled(): boolean {
|
||||
if (process.env.SKILL_SEARCH_ENABLED === '0') return false
|
||||
if (process.env.SKILL_SEARCH_ENABLED === '1') return true
|
||||
if (feature('EXPERIMENTAL_SKILL_SEARCH')) {
|
||||
return true
|
||||
}
|
||||
/**
|
||||
* Build-time presence check: is the `/skill-search` slash command compiled
|
||||
* into this build? Used by the command registry's `isEnabled` so the
|
||||
* command appears in the menu whenever it is buildable. Operators activate
|
||||
* the subsystem itself via `/skill-search start`, which flips
|
||||
* `SKILL_SEARCH_ENABLED=1` and turns the runtime hot paths on (see
|
||||
* `isSkillSearchEnabled`).
|
||||
*/
|
||||
export function isSkillSearchCompiledIn(): boolean {
|
||||
if (feature('EXPERIMENTAL_SKILL_SEARCH')) return true
|
||||
return false
|
||||
}
|
||||
|
||||
/**
|
||||
* Runtime activation check: is the skill-search subsystem currently doing
|
||||
* work (intentNormalize Haiku calls, prefetch hot path, telemetry)? Off by
|
||||
* default — the operator must run `/skill-search start` (which sets
|
||||
* `SKILL_SEARCH_ENABLED=1`). See docs/agent/sur-skill-overflow-bugs.md §5.
|
||||
*
|
||||
* Build-flag gating is intentionally NOT performed here: the command
|
||||
* registry already gates command compilation on the build flag, and this
|
||||
* function is only reached from code paths that the build flag has
|
||||
* already let through. Decoupling keeps the test surface clean (tests
|
||||
* exercise the env-var contract without needing to mock `bun:bundle`).
|
||||
*/
|
||||
export function isSkillSearchEnabled(): boolean {
|
||||
return process.env.SKILL_SEARCH_ENABLED === '1'
|
||||
}
|
||||
|
||||
@@ -47,10 +47,35 @@ Output ONLY keywords. Nothing else.`
|
||||
const DEFAULT_TIMEOUT_MS = 6_000
|
||||
const MAX_QUERY_CHARS = 500
|
||||
const MAX_KEYWORDS_CHARS = 120
|
||||
/**
|
||||
* Bound on the process-level query→keywords cache. Insertion-order LRU —
|
||||
* Map iteration order is insertion order, so we evict from the front when
|
||||
* size exceeds the cap. ~200 entries × ~600 bytes (query + keywords) ≈
|
||||
* 120 KB worst case. Without this cap the cache grew monotonically with
|
||||
* the diversity of Chinese queries in a long session.
|
||||
*/
|
||||
const CACHE_MAX_ENTRIES = 200
|
||||
const CACHE_TRIM_TO = 150
|
||||
|
||||
/** Process-level cache. Keyed by the original (trimmed) query. */
|
||||
const cache = new Map<string, string>()
|
||||
|
||||
function setCachedQueryIntent(key: string, value: string): void {
|
||||
// Refresh insertion order on hit-then-write so frequently-used keys
|
||||
// stay alive (delete + set is the canonical Map-LRU idiom).
|
||||
if (cache.has(key)) cache.delete(key)
|
||||
cache.set(key, value)
|
||||
if (cache.size > CACHE_MAX_ENTRIES) {
|
||||
const toDrop = cache.size - CACHE_TRIM_TO
|
||||
const iter = cache.keys()
|
||||
for (let i = 0; i < toDrop; i++) {
|
||||
const next = iter.next()
|
||||
if (next.done) break
|
||||
cache.delete(next.value)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
export function isIntentNormalizeEnabled(): boolean {
|
||||
return process.env.SKILL_SEARCH_INTENT_ENABLED === '1'
|
||||
}
|
||||
@@ -74,12 +99,17 @@ export async function normalizeQueryIntent(query: string): Promise<string> {
|
||||
if (!/[\u4e00-\u9fff]/.test(trimmed)) return trimmed
|
||||
|
||||
const cached = cache.get(trimmed)
|
||||
if (cached !== undefined) return cached
|
||||
if (cached !== undefined) {
|
||||
// Refresh LRU position so frequently-queried strings survive eviction.
|
||||
cache.delete(trimmed)
|
||||
cache.set(trimmed, cached)
|
||||
return cached
|
||||
}
|
||||
|
||||
const capped = trimmed.slice(0, MAX_QUERY_CHARS)
|
||||
const keywords = await callHaiku(capped)
|
||||
const result = keywords ? `${trimmed} ${keywords}` : trimmed
|
||||
cache.set(trimmed, result)
|
||||
setCachedQueryIntent(trimmed, result)
|
||||
logForDebugging(
|
||||
`[skill-search] intent normalized: "${trimmed.slice(0, 40)}" -> "${keywords}"`,
|
||||
)
|
||||
|
||||
@@ -14,9 +14,35 @@ import { readFile } from 'node:fs/promises'
|
||||
import { join } from 'node:path'
|
||||
import { parseFrontmatter } from '../../utils/frontmatterParser.js'
|
||||
|
||||
/**
|
||||
* Per-session memoization to avoid re-emitting the same skill discovery /
|
||||
* gap signal twice. Each Set is bounded to keep long-running sessions from
|
||||
* monotonically accumulating skill names and signal keys forever (which
|
||||
* was the original session-scoped-but-unbounded design).
|
||||
*
|
||||
* FIFO eviction by insertion order — once the cap is hit, the oldest
|
||||
* entries roll off and may be re-recorded if rediscovered, which is the
|
||||
* correct degraded behaviour: at worst we re-emit a duplicate signal,
|
||||
* never silently drop a real one.
|
||||
*/
|
||||
const SESSION_TRACKING_MAX = 1000
|
||||
const SESSION_TRACKING_TRIM_TO = 750
|
||||
const discoveredThisSession = new Set<string>()
|
||||
const recordedGapSignals = new Set<string>()
|
||||
|
||||
function addBoundedSessionEntry(set: Set<string>, value: string): void {
|
||||
set.add(value)
|
||||
if (set.size > SESSION_TRACKING_MAX) {
|
||||
const toDrop = set.size - SESSION_TRACKING_TRIM_TO
|
||||
const iter = set.values()
|
||||
for (let i = 0; i < toDrop; i++) {
|
||||
const next = iter.next()
|
||||
if (next.done) break
|
||||
set.delete(next.value)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
const AUTO_LOAD_MIN_SCORE = Number(
|
||||
process.env.SKILL_SEARCH_AUTOLOAD_MIN_SCORE ?? '0.30',
|
||||
)
|
||||
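Unlike the LRU caches, the session-tracking sets use plain FIFO eviction: re-adding an existing value does not refresh its position, and an evicted value simply counts as unseen again. A small sketch of that degraded behaviour, assuming a hypothetical `createSeenTracker` wrapper (the diff uses a bare `Set` plus `addBoundedSessionEntry`):

```typescript
// Hypothetical wrapper around a FIFO-bounded Set, for illustration only.
function createSeenTracker(max: number, trimTo: number) {
  const seen = new Set<string>();
  return {
    // Returns true when the value has not been seen (or was evicted since).
    firstTime(value: string): boolean {
      if (seen.has(value)) return false;
      seen.add(value);
      if (seen.size > max) {
        const toDrop = seen.size - trimTo;
        const iter = seen.values();
        for (let i = 0; i < toDrop; i++) {
          const next = iter.next();
          if (next.done) break;
          seen.delete(next.value);
        }
      }
      return true;
    },
    size: (): number => seen.size,
  };
}
```

The worst case after eviction is one duplicate signal for a rediscovered value, which is the trade-off the doc comment above accepts.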
@@ -185,7 +211,7 @@ async function maybeRecordSkillGap(
|
||||
|
||||
const gapSignalKey = `${trigger}:${queryText.trim().toLowerCase()}`
|
||||
if (recordedGapSignals.has(gapSignalKey)) return undefined
|
||||
recordedGapSignals.add(gapSignalKey)
|
||||
addBoundedSessionEntry(recordedGapSignals, gapSignalKey)
|
||||
|
||||
try {
|
||||
const [{ isSkillLearningEnabled }, { recordSkillGap }] = await Promise.all([
|
||||
@@ -241,7 +267,7 @@ export async function startSkillDiscoveryPrefetch(
|
||||
const newResults = results.filter(r => !discoveredThisSession.has(r.name))
|
||||
if (newResults.length === 0) return []
|
||||
|
||||
for (const r of newResults) discoveredThisSession.add(r.name)
|
||||
for (const r of newResults) addBoundedSessionEntry(discoveredThisSession, r.name)
|
||||
|
||||
const signal: DiscoverySignal = {
|
||||
trigger: 'assistant_turn',
|
||||
@@ -305,7 +331,7 @@ export async function getTurnZeroSkillDiscovery(
|
||||
|
||||
if (results.length === 0 && !gap) return null
|
||||
|
||||
for (const r of results) discoveredThisSession.add(r.name)
|
||||
for (const r of results) addBoundedSessionEntry(discoveredThisSession, r.name)
|
||||
|
||||
const signal: DiscoverySignal = {
|
||||
trigger: 'user_input',
|
||||
|
||||
@@ -73,6 +73,7 @@ export function injectUserMessageToTeammate(
|
||||
options:
|
||||
| {
|
||||
autonomyRunId?: string;
|
||||
autonomyRootDir?: string;
|
||||
origin?: MessageOrigin;
|
||||
}
|
||||
| undefined,
|
||||
@@ -93,6 +94,9 @@ export function injectUserMessageToTeammate(
|
||||
if (options?.autonomyRunId !== undefined) {
|
||||
pendingMessage.autonomyRunId = options.autonomyRunId;
|
||||
}
|
||||
if (options?.autonomyRootDir !== undefined) {
|
||||
pendingMessage.autonomyRootDir = options.autonomyRootDir;
|
||||
}
|
||||
if (options?.origin !== undefined) {
|
||||
pendingMessage.origin = options.origin;
|
||||
}
|
||||
|
||||
@@ -22,6 +22,7 @@ export type TeammateIdentity = {
|
||||
export type PendingTeammateUserMessage = {
|
||||
message: string
|
||||
autonomyRunId?: string
|
||||
autonomyRootDir?: string
|
||||
origin?: MessageOrigin
|
||||
}
|
||||
|
||||
|
||||
@@ -361,6 +361,7 @@ export type QueuedCommand = {
|
||||
*/
|
||||
autonomy?: {
|
||||
runId: string
|
||||
rootDir?: string
|
||||
trigger: 'scheduled-task' | 'proactive-tick' | 'managed-flow-step'
|
||||
sourceId?: string
|
||||
sourceLabel?: string
|
||||
|
||||
@@ -5,6 +5,7 @@ import {
|
||||
AUTONOMY_DIR,
|
||||
buildAutonomyTurnPrompt,
|
||||
loadAutonomyAuthority,
|
||||
parseHeartbeatAuthorityTasks,
|
||||
resetAutonomyAuthorityForTests,
|
||||
} from '../autonomyAuthority'
|
||||
import {
|
||||
@@ -238,4 +239,79 @@ describe('autonomyAuthority', () => {
|
||||
expect(prompt).not.toContain('- weekly-report (7d): Ship the weekly report')
|
||||
expect(prompt).not.toContain('- gather (')
|
||||
})
|
||||
|
||||
test('parseHeartbeatAuthorityTasks ignores tasks: literals inside markdown code fences', () => {
|
||||
const content = [
|
||||
'# HEARTBEAT.md',
|
||||
'',
|
||||
'```yaml',
|
||||
'tasks:',
|
||||
' - name: not-a-real-task',
|
||||
' interval: 1m',
|
||||
' prompt: "would-be-shadowed"',
|
||||
'```',
|
||||
'',
|
||||
'tasks:',
|
||||
' - name: real-task',
|
||||
' interval: 30m',
|
||||
' prompt: "Real prompt"',
|
||||
].join('\n')
|
||||
|
||||
const parsed = parseHeartbeatAuthorityTasks(content)
|
||||
|
||||
expect(parsed).toHaveLength(1)
|
||||
expect(parsed[0]).toMatchObject({
|
||||
name: 'real-task',
|
||||
interval: '30m',
|
||||
prompt: 'Real prompt',
|
||||
})
|
||||
})
|
||||
|
||||
test('parseHeartbeatAuthorityTasks ignores tasks: literals inside tilde markdown code fences', () => {
|
||||
const content = [
|
||||
'# HEARTBEAT.md',
|
||||
'',
|
||||
'~~~yaml',
|
||||
'tasks:',
|
||||
' - name: not-a-real-task',
|
||||
' interval: 1m',
|
||||
' prompt: "would-be-shadowed"',
|
||||
'~~~',
|
||||
'',
|
||||
'tasks:',
|
||||
' - name: real-task',
|
||||
' interval: 30m',
|
||||
' prompt: "Real prompt"',
|
||||
].join('\n')
|
||||
|
||||
const parsed = parseHeartbeatAuthorityTasks(content)
|
||||
|
||||
expect(parsed).toHaveLength(1)
|
||||
expect(parsed[0]).toMatchObject({
|
||||
name: 'real-task',
|
||||
interval: '30m',
|
||||
prompt: 'Real prompt',
|
||||
})
|
||||
})
|
||||
|
||||
test('parseHeartbeatAuthorityTasks parses real tasks even when documentation precedes them', () => {
|
||||
const content = [
|
||||
'# Heartbeat docs',
|
||||
'',
|
||||
'See `tasks:` below — the parser keys on the literal at column 0.',
|
||||
'',
|
||||
'tasks:',
|
||||
' - name: weekly',
|
||||
' interval: 7d',
|
||||
' prompt: "Ship report"',
|
||||
].join('\n')
|
||||
|
||||
const parsed = parseHeartbeatAuthorityTasks(content)
|
||||
|
||||
// Inline `tasks:` mention does NOT collide because it's not at column 0
|
||||
// on its own line — the existing line.trim() === 'tasks:' guard handles
|
||||
// that case. This test pins the behaviour.
|
||||
expect(parsed).toHaveLength(1)
|
||||
expect(parsed[0]?.name).toBe('weekly')
|
||||
})
|
||||
})
|
||||
|
||||
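The three tests above pin down one rule: a `tasks:` line only counts when it sits outside a fenced code block, for both backtick and tilde fences. The parser change itself is not part of this excerpt, so the following is only a sketch of one way to get that behaviour. `fenceMarker`, `linesOutsideFences`, and `countTaskHeaders` are hypothetical helpers.

```typescript
// Hypothetical helpers; the real parseHeartbeatAuthorityTasks is not shown here.

// Returns the fence marker (three or more backticks or tildes) opening this line, or null.
function fenceMarker(line: string): string | null {
  const trimmed = line.trimStart();
  if (line.length - trimmed.length > 3) return null; // deeper indentation is not a fence
  const char = trimmed.charAt(0);
  if (char !== '`' && char !== '~') return null;
  let length = 0;
  while (trimmed.charAt(length) === char) length++;
  return length >= 3 ? char.repeat(length) : null;
}

// Drops every line that is part of a fenced block, including the fence lines.
function linesOutsideFences(content: string): string[] {
  const kept: string[] = [];
  let open: string | null = null;
  for (const line of content.split('\n')) {
    const marker = fenceMarker(line);
    if (open === null) {
      if (marker !== null) open = marker;
      else kept.push(line);
    } else if (
      marker !== null &&
      marker.charAt(0) === open.charAt(0) &&
      marker.length >= open.length &&
      line.trim() === marker
    ) {
      open = null; // closing fence: same character, at least as long, nothing after it
    }
  }
  return kept;
}

// Counts standalone `tasks:` headers, using the same trim-and-compare guard the tests mention.
function countTaskHeaders(content: string): number {
  return linesOutsideFences(content).filter(line => line.trim() === 'tasks:').length;
}
```

An unclosed fence swallows the rest of the file in this sketch, which errs on the side of ignoring tasks rather than scheduling documentation examples.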
@@ -126,6 +126,14 @@ describe('listAutonomyFlows', () => {
|
||||
runCount: 0,
|
||||
ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
|
||||
currentDir: tempDir,
|
||||
boundary: [
|
||||
' src/utils/** ',
|
||||
'/absolute/not-allowed',
|
||||
'src\\windows',
|
||||
'../outside',
|
||||
'src/utils/**',
|
||||
'docs/*.md',
|
||||
],
|
||||
stateJson: {
|
||||
currentStepIndex: 0,
|
||||
steps: [
|
||||
@@ -147,6 +155,7 @@ describe('listAutonomyFlows', () => {
|
||||
expect(flows).toHaveLength(1)
|
||||
expect(flows[0]?.flowId).toBe('flow-1')
|
||||
expect(flows[0]?.syncMode).toBe('managed')
|
||||
expect(flows[0]?.boundary).toEqual(['src/utils/**', 'docs/*.md'])
|
||||
expect(flows[0]?.stateJson?.steps).toHaveLength(1)
|
||||
})
|
||||
|
||||
@@ -191,6 +200,64 @@ describe('listAutonomyFlows', () => {
|
||||
const flows = await listAutonomyFlows(tempDir)
|
||||
expect(flows).toEqual([])
|
||||
})
|
||||
|
||||
test('persistence pruning keeps active flows ahead of recent terminal history', async () => {
|
||||
const flows: AutonomyFlowRecord[] = [
|
||||
{
|
||||
flowId: 'old-active',
|
||||
flowKey: 'managed:scheduled-task:old-active',
|
||||
syncMode: 'managed',
|
||||
ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
|
||||
revision: 1,
|
||||
trigger: 'scheduled-task',
|
||||
status: 'queued',
|
||||
goal: 'old active',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
runCount: 0,
|
||||
createdAt: 1,
|
||||
updatedAt: 1,
|
||||
},
|
||||
...Array.from({ length: 100 }, (_, index) => ({
|
||||
flowId: `history-${index}`,
|
||||
flowKey: `managed:scheduled-task:history-${index}`,
|
||||
syncMode: 'managed' as const,
|
||||
ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
|
||||
revision: 1,
|
||||
trigger: 'scheduled-task' as const,
|
||||
status: 'succeeded' as const,
|
||||
goal: `history ${index}`,
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
runCount: 1,
|
||||
createdAt: 1_000 + index,
|
||||
updatedAt: 1_000 + index,
|
||||
endedAt: 2_000 + index,
|
||||
})),
|
||||
]
|
||||
const flowsPath = resolveAutonomyFlowsPath(tempDir)
|
||||
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
|
||||
await writeFile(
|
||||
flowsPath,
|
||||
`${JSON.stringify({ flows }, null, 2)}\n`,
|
||||
'utf-8',
|
||||
)
|
||||
|
||||
await startManagedAutonomyFlow({
|
||||
trigger: 'scheduled-task',
|
||||
goal: 'fresh active',
|
||||
steps: TWO_STEPS,
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'fresh-active',
|
||||
nowMs: 9_999,
|
||||
})
|
||||
|
||||
const persisted = await listAutonomyFlows(tempDir)
|
||||
expect(persisted).toHaveLength(100)
|
||||
expect(persisted.some(flow => flow.flowId === 'old-active')).toBe(true)
|
||||
expect(persisted.some(flow => flow.flowId === 'history-0')).toBe(false)
|
||||
})
|
||||
})
|
||||
|
||||
describe('startManagedAutonomyFlow', () => {
|
||||
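The pruning test above expects active flows to survive even when they are older than every terminal record, with the remaining slots filled by the most recent history. A sketch of a pruning rule that satisfies that expectation follows. `pruneFlows`, `FlowLike`, and the set of active statuses are assumptions made for illustration, and the cap of 100 is taken from the test's expected length.

```typescript
// Hypothetical pruning rule; the real implementation and status list are not shown in this excerpt.
type FlowLike = { flowId: string; status: string; updatedAt: number };

// Assumption: these statuses count as active.
const ACTIVE_STATUSES = ['queued', 'running', 'waiting'];

function isActiveFlow(flow: FlowLike): boolean {
  return ACTIVE_STATUSES.indexOf(flow.status) !== -1;
}

function pruneFlows<T extends FlowLike>(flows: readonly T[], cap: number): T[] {
  const active = flows.filter(isActiveFlow);
  // Terminal history competes only for the slots the active flows leave free.
  const terminal = flows
    .filter(flow => !isActiveFlow(flow))
    .sort((a, b) => b.updatedAt - a.updatedAt);
  const room = Math.max(0, cap - active.length);
  return [...active, ...terminal.slice(0, room)];
}
```

With one old queued flow, one fresh active flow, and 100 succeeded records, a cap of 100 keeps both active flows and the 98 newest terminal records, so `history-0` is dropped.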
@@ -225,6 +292,49 @@ describe('startManagedAutonomyFlow', () => {
|
||||
expect(result!.nextStep!.step.name).toBe('gather')
|
||||
})
|
||||
|
||||
test('normalizes and preserves boundary across completed flow restarts', async () => {
|
||||
const first = await startManagedAutonomyFlow({
|
||||
trigger: 'scheduled-task',
|
||||
goal: 'Scoped flow',
|
||||
steps: [{ name: 'only', prompt: 'Do it' }],
|
||||
rootDir: tempDir,
|
||||
sourceId: 'scoped-src',
|
||||
boundary: [' src/utils/** ', 'src\\bad', '/absolute', 'docs/*.md'],
|
||||
nowMs: 1000,
|
||||
})
|
||||
const flowId = first!.flow.flowId
|
||||
|
||||
expect(first!.flow.boundary).toEqual(['src/utils/**', 'docs/*.md'])
|
||||
|
||||
await queueManagedAutonomyFlowStepRun({
|
||||
flowId,
|
||||
stepId: first!.nextStep!.step.stepId,
|
||||
stepIndex: 0,
|
||||
runId: 'run-1',
|
||||
rootDir: tempDir,
|
||||
nowMs: 2000,
|
||||
})
|
||||
await markManagedAutonomyFlowStepCompleted({
|
||||
flowId,
|
||||
runId: 'run-1',
|
||||
rootDir: tempDir,
|
||||
nowMs: 3000,
|
||||
})
|
||||
|
||||
const restarted = await startManagedAutonomyFlow({
|
||||
trigger: 'scheduled-task',
|
||||
goal: 'Scoped flow',
|
||||
steps: [{ name: 'only', prompt: 'Do it again' }],
|
||||
rootDir: tempDir,
|
||||
sourceId: 'scoped-src',
|
||||
nowMs: 4000,
|
||||
})
|
||||
|
||||
expect(restarted!.started).toBe(true)
|
||||
expect(restarted!.flow.flowId).toBe(flowId)
|
||||
expect(restarted!.flow.boundary).toEqual(['src/utils/**', 'docs/*.md'])
|
||||
})
|
||||
|
||||
test('sets status=waiting when first step has waitFor', async () => {
|
||||
const result = await startManagedAutonomyFlow({
|
||||
trigger: 'scheduled-task',
|
||||
|
||||
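The boundary assertions in these tests imply a normalization step: trim each pattern, reject absolute paths, backslash paths, and parent-directory escapes, and drop duplicates while preserving order. The function that does this is not part of the excerpt, so `normalizeBoundary` below is a hypothetical reconstruction from the expected values only.

```typescript
// Hypothetical reconstruction from the test expectations; not the PR's implementation.
function normalizeBoundary(patterns: readonly string[]): string[] {
  const result: string[] = [];
  for (const raw of patterns) {
    const pattern = raw.trim();
    if (pattern === '') continue;
    if (pattern.startsWith('/')) continue; // absolute paths are not allowed
    if (pattern.includes('\\')) continue; // Windows-style separators are rejected
    if (pattern.split('/').indexOf('..') !== -1) continue; // no escaping the root
    if (result.indexOf(pattern) !== -1) continue; // de-duplicate, keep first occurrence
    result.push(pattern);
  }
  return result;
}
```

Applied to the six-entry input from the `listAutonomyFlows` test, this yields the two patterns that test expects.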
@@ -54,6 +54,25 @@ describe('withAutonomyPersistenceLock', () => {
|
||||
).rejects.toThrow('inner failure')
|
||||
})
|
||||
|
||||
test('releases same-root lock bookkeeping after success and failure', async () => {
|
||||
const {
|
||||
getAutonomyPersistenceLockCountForTests,
|
||||
withAutonomyPersistenceLock,
|
||||
} = await import('../autonomyPersistence')
|
||||
|
||||
expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
|
||||
|
||||
await withAutonomyPersistenceLock(tempDir, async () => 'ok')
|
||||
expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
|
||||
|
||||
await expect(
|
||||
withAutonomyPersistenceLock(tempDir, async () => {
|
||||
throw new Error('inner failure')
|
||||
}),
|
||||
).rejects.toThrow('inner failure')
|
||||
expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
|
||||
})
|
||||
|
||||
test('serializes concurrent calls on the same rootDir', async () => {
|
||||
const { withAutonomyPersistenceLock } = await import(
|
||||
'../autonomyPersistence'
|
||||
|
||||
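The lock tests above check two properties: calls on the same root run one at a time, and the lock bookkeeping returns to zero after both success and failure. One common way to get both is a per-key promise chain whose map entry is removed by the last holder. The sketch below shows that pattern. `withKeyLock` and `lockTails` are hypothetical names, and nothing here is taken from `autonomyPersistence` itself.

```typescript
// Hypothetical per-key async lock; the real withAutonomyPersistenceLock is not shown here.
const lockTails = new Map<string, Promise<void>>();

async function withKeyLock<T>(key: string, fn: () => Promise<T>): Promise<T> {
  const previous = lockTails.get(key) ?? Promise.resolve();
  let release!: () => void;
  const gate = new Promise<void>(resolve => {
    release = resolve;
  });
  // The new tail settles only after every earlier holder and this one are done.
  const tail = previous.then(() => gate);
  lockTails.set(key, tail);
  await previous;
  try {
    return await fn();
  } finally {
    release();
    // Only the most recent holder clears the entry, so later waiters are not orphaned.
    if (lockTails.get(key) === tail) lockTails.delete(key);
  }
}
```

Because the gate promises only ever resolve, a failure inside `fn` rejects that caller alone and never poisons the chain for the next waiter.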
279
src/utils/__tests__/autonomyQueueLifecycle.test.ts
Normal file
@@ -0,0 +1,279 @@
|
||||
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
|
||||
import { createTempDir, cleanupTempDir } from '../../../tests/mocks/file-system'
|
||||
import { getAttachmentMessages } from '../attachments'
|
||||
import {
|
||||
createAutonomyQueuedPrompt,
|
||||
createProactiveAutonomyCommands,
|
||||
getAutonomyRunById,
|
||||
markAutonomyRunCancelled,
|
||||
startManagedAutonomyFlowFromHeartbeatTask,
|
||||
} from '../autonomyRuns'
|
||||
import { getAutonomyFlowById, listAutonomyFlows } from '../autonomyFlows'
|
||||
import {
|
||||
cancelQueuedAutonomyCommands,
|
||||
claimConsumableQueuedAutonomyCommands,
|
||||
finalizeAutonomyCommandsForTurn,
|
||||
partitionConsumableQueuedAutonomyCommands,
|
||||
} from '../autonomyQueueLifecycle'
|
||||
import {
|
||||
enqueue,
|
||||
getCommandsByMaxPriority,
|
||||
remove as removeFromQueue,
|
||||
resetCommandQueue,
|
||||
} from '../messageQueueManager'
|
||||
|
||||
let tempDir = ''
|
||||
let extraTempDirs: string[] = []
|
||||
|
||||
beforeEach(async () => {
|
||||
tempDir = await createTempDir('autonomy-queue-lifecycle-')
|
||||
extraTempDirs = []
|
||||
resetCommandQueue()
|
||||
})
|
||||
|
||||
afterEach(async () => {
|
||||
resetCommandQueue()
|
||||
if (tempDir) {
|
||||
await cleanupTempDir(tempDir)
|
||||
}
|
||||
for (const extraTempDir of extraTempDirs) {
|
||||
await cleanupTempDir(extraTempDir)
|
||||
}
|
||||
})
|
||||
|
||||
describe('autonomyQueueLifecycle', () => {
|
||||
async function consumeQueuedAutonomyAttachmentTurn() {
|
||||
const previousDisableAttachments =
|
||||
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
|
||||
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = '1'
|
||||
try {
|
||||
const snapshot = getCommandsByMaxPriority('later')
|
||||
const claim = await claimConsumableQueuedAutonomyCommands(
|
||||
snapshot,
|
||||
tempDir,
|
||||
)
|
||||
removeFromQueue(claim.staleCommands)
|
||||
removeFromQueue(claim.claimedCommands)
|
||||
|
||||
const attachments = []
|
||||
for await (const attachment of getAttachmentMessages(
|
||||
null,
|
||||
{} as never,
|
||||
null,
|
||||
claim.attachmentCommands,
|
||||
[],
|
||||
)) {
|
||||
attachments.push(attachment)
|
||||
}
|
||||
|
||||
const consumedCommands = claim.attachmentCommands.filter(
|
||||
command =>
|
||||
(command.mode === 'prompt' || command.mode === 'task-notification') &&
|
||||
!claim.claimedCommands.includes(command),
|
||||
)
|
||||
removeFromQueue(consumedCommands)
|
||||
const nextCommands = await finalizeAutonomyCommandsForTurn({
|
||||
commands: claim.claimedCommands,
|
||||
outcome: { type: 'completed' },
|
||||
currentDir: tempDir,
|
||||
priority: 'later',
|
||||
})
|
||||
for (const command of nextCommands) {
|
||||
enqueue(command)
|
||||
}
|
||||
|
||||
return { attachments, runningRunIds: claim.claimedRunIds, nextCommands }
|
||||
} finally {
|
||||
if (previousDisableAttachments === undefined) {
|
||||
delete process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
|
||||
} else {
|
||||
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = previousDisableAttachments
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
test('filters stale autonomy commands before mid-turn attachment consumption', async () => {
|
||||
const command = await createAutonomyQueuedPrompt({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(command).not.toBeNull()
|
||||
|
||||
const initial = await partitionConsumableQueuedAutonomyCommands(
|
||||
[command!],
|
||||
tempDir,
|
||||
)
|
||||
expect(initial.attachmentCommands).toHaveLength(1)
|
||||
expect(initial.staleCommands).toHaveLength(0)
|
||||
|
||||
await markAutonomyRunCancelled(command!.autonomy!.runId, tempDir)
|
||||
|
||||
const afterCancel = await partitionConsumableQueuedAutonomyCommands(
|
||||
[command!],
|
||||
tempDir,
|
||||
)
|
||||
expect(afterCancel.attachmentCommands).toHaveLength(0)
|
||||
expect(afterCancel.staleCommands).toHaveLength(1)
|
||||
})
|
||||
|
||||
test('cancels proactive commands that are created but dropped before enqueue', async () => {
|
||||
const commands = await createProactiveAutonomyCommands({
|
||||
basePrompt: '<tick>12:00:00</tick>',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(commands).toHaveLength(1)
|
||||
|
||||
const queuedRun = await getAutonomyRunById(
|
||||
commands[0]!.autonomy!.runId,
|
||||
tempDir,
|
||||
)
|
||||
expect(queuedRun!.status).toBe('queued')
|
||||
|
||||
await cancelQueuedAutonomyCommands({ commands, rootDir: tempDir })
|
||||
|
||||
const cancelledRun = await getAutonomyRunById(
|
||||
commands[0]!.autonomy!.runId,
|
||||
tempDir,
|
||||
)
|
||||
expect(cancelledRun!.status).toBe('cancelled')
|
||||
})
|
||||
|
||||
test('uses command rootDir when claiming after project context changes', async () => {
|
||||
const otherProjectDir = await createTempDir('autonomy-other-project-')
|
||||
extraTempDirs.push(otherProjectDir)
|
||||
const command = await createAutonomyQueuedPrompt({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(command).not.toBeNull()
|
||||
expect(command!.autonomy?.rootDir).toBe(tempDir)
|
||||
|
||||
const claim = await claimConsumableQueuedAutonomyCommands(
|
||||
[command!],
|
||||
otherProjectDir,
|
||||
)
|
||||
|
||||
const originalRun = await getAutonomyRunById(
|
||||
command!.autonomy!.runId,
|
||||
tempDir,
|
||||
)
|
||||
const wrongProjectRun = await getAutonomyRunById(
|
||||
command!.autonomy!.runId,
|
||||
otherProjectDir,
|
||||
)
|
||||
|
||||
expect(claim.claimedRunIds).toEqual([command!.autonomy!.runId])
|
||||
expect(claim.attachmentCommands).toHaveLength(1)
|
||||
expect(originalRun!.status).toBe('running')
|
||||
expect(wrongProjectRun).toBeNull()
|
||||
})
|
||||
|
||||
test('advances a managed flow consumed as a queued attachment', async () => {
|
||||
const command = await startManagedAutonomyFlowFromHeartbeatTask({
|
||||
task: {
|
||||
name: 'weekly-report',
|
||||
interval: '7d',
|
||||
prompt: 'Ship the weekly report',
|
||||
steps: [
|
||||
{ name: 'gather', prompt: 'Gather weekly inputs' },
|
||||
{ name: 'draft', prompt: 'Draft weekly report' },
|
||||
],
|
||||
},
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(command).not.toBeNull()
|
||||
|
||||
const claim = await claimConsumableQueuedAutonomyCommands(
|
||||
[command!],
|
||||
tempDir,
|
||||
)
|
||||
const runningRunIds = claim.claimedRunIds
|
||||
expect(runningRunIds).toEqual([command!.autonomy!.runId])
|
||||
|
||||
const nextCommands = await finalizeAutonomyCommandsForTurn({
|
||||
commands: claim.claimedCommands,
|
||||
outcome: { type: 'completed' },
|
||||
currentDir: tempDir,
|
||||
priority: 'later',
|
||||
})
|
||||
const [flow] = await listAutonomyFlows(tempDir)
|
||||
const detail = await getAutonomyFlowById(flow!.flowId, tempDir)
|
||||
const run = await getAutonomyRunById(command!.autonomy!.runId, tempDir)
|
||||
|
||||
expect(run!.status).toBe('completed')
|
||||
expect(nextCommands).toHaveLength(1)
|
||||
expect(nextCommands[0]!.autonomy?.flowId).toBe(flow!.flowId)
|
||||
expect(detail!.stateJson!.steps.map(step => step.status)).toEqual([
|
||||
'completed',
|
||||
'queued',
|
||||
])
|
||||
})
|
||||
|
||||
test('keeps managed autonomy flow coherent across queued attachment turns', async () => {
|
||||
const firstCommand = await startManagedAutonomyFlowFromHeartbeatTask({
|
||||
task: {
|
||||
name: 'weekly-report',
|
||||
interval: '7d',
|
||||
prompt: 'Ship the weekly report',
|
||||
steps: [
|
||||
{ name: 'gather', prompt: 'Gather weekly inputs' },
|
||||
{ name: 'draft', prompt: 'Draft weekly report' },
|
||||
],
|
||||
},
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(firstCommand).not.toBeNull()
|
||||
enqueue(firstCommand!)
|
||||
|
||||
const firstTurn = await consumeQueuedAutonomyAttachmentTurn()
|
||||
const queuedAfterFirstTurn = getCommandsByMaxPriority('later')
|
||||
const [flowAfterFirstTurn] = await listAutonomyFlows(tempDir)
|
||||
const firstRun = await getAutonomyRunById(
|
||||
firstCommand!.autonomy!.runId,
|
||||
tempDir,
|
||||
)
|
||||
|
||||
expect(firstTurn.attachments).toHaveLength(1)
|
||||
expect(firstTurn.attachments[0]!.attachment?.type).toBe('queued_command')
|
||||
expect(firstTurn.runningRunIds).toEqual([firstCommand!.autonomy!.runId])
|
||||
expect(firstTurn.nextCommands).toHaveLength(1)
|
||||
expect(queuedAfterFirstTurn).toHaveLength(1)
|
||||
expect(queuedAfterFirstTurn[0]!.autonomy?.flowId).toBe(
|
||||
flowAfterFirstTurn!.flowId,
|
||||
)
|
||||
expect(firstRun!.status).toBe('completed')
|
||||
expect(
|
||||
flowAfterFirstTurn!.stateJson!.steps.map(step => step.status),
|
||||
).toEqual(['completed', 'queued'])
|
||||
|
||||
const secondCommand = queuedAfterFirstTurn[0]!
|
||||
const secondTurn = await consumeQueuedAutonomyAttachmentTurn()
|
||||
const queuedAfterSecondTurn = getCommandsByMaxPriority('later')
|
||||
const finalFlow = await getAutonomyFlowById(
|
||||
flowAfterFirstTurn!.flowId,
|
||||
tempDir,
|
||||
)
|
||||
const secondRun = await getAutonomyRunById(
|
||||
secondCommand.autonomy!.runId,
|
||||
tempDir,
|
||||
)
|
||||
|
||||
expect(secondTurn.attachments).toHaveLength(1)
|
||||
expect(secondTurn.runningRunIds).toEqual([secondCommand.autonomy!.runId])
|
||||
expect(secondTurn.nextCommands).toHaveLength(0)
|
||||
expect(queuedAfterSecondTurn).toHaveLength(0)
|
||||
expect(secondRun!.status).toBe('completed')
|
||||
expect(finalFlow!.status).toBe('succeeded')
|
||||
expect(finalFlow!.stateJson!.steps.map(step => step.status)).toEqual([
|
||||
'completed',
|
||||
'completed',
|
||||
])
|
||||
})
|
||||
})
|
||||
@@ -1,6 +1,7 @@
 import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
+import { existsSync, readFileSync } from 'fs'
 import { mkdir, writeFile } from 'fs/promises'
-import { join } from 'path'
+import { join, resolve as resolvePath } from 'path'
 import {
   resetStateForTests,
   setCwdState,
@@ -8,17 +9,23 @@ import {
|
||||
setProjectRoot,
|
||||
} from '../../bootstrap/state'
|
||||
import {
|
||||
createAutonomyRun,
|
||||
formatAutonomyRunsList,
|
||||
formatAutonomyRunsStatus,
|
||||
listAutonomyRuns,
|
||||
createAutonomyQueuedPrompt,
|
||||
createAutonomyQueuedPromptIfNoActiveSource,
|
||||
createProactiveAutonomyCommands,
|
||||
finalizeAutonomyRunCompleted,
|
||||
getAutonomyRunById,
|
||||
hasActiveAutonomyRunForSource,
|
||||
markAutonomyRunCompleted,
|
||||
markAutonomyRunCancelled,
|
||||
markAutonomyRunFailed,
|
||||
markAutonomyRunRunning,
|
||||
recoverManagedAutonomyFlowPrompt,
|
||||
resolveAutonomyRunsPath,
|
||||
STALE_ACTIVE_RUN_ERROR_PREFIX,
|
||||
startManagedAutonomyFlowFromHeartbeatTask,
|
||||
} from '../autonomyRuns'
|
||||
import {
|
||||
@@ -95,7 +102,9 @@ describe('autonomyRuns', () => {
       ownerKey: 'main-thread',
       sourceId: 'cron-1',
       sourceLabel: 'nightly-report',
+      ownerProcessId: process.pid,
     })
+    expect(runs[0]?.ownerSessionId).toBeString()
     expect(flows).toHaveLength(0)
     expect(resolveAutonomyRunsPath(tempDir)).toContain('.claude')
   })
@@ -118,7 +127,7 @@ describe('autonomyRuns', () => {
     expect(command!.value).toContain('nested authority')
   })

-  test('markAutonomyRunRunning/completed/failed update persisted lifecycle state for plain runs', async () => {
+  test('markAutonomyRunRunning/completed update persisted lifecycle state for plain runs', async () => {
     const command = await createAutonomyQueuedPrompt({
       basePrompt: '<tick>12:00:00</tick>',
       trigger: 'proactive-tick',
@@ -134,7 +143,9 @@ describe('autonomyRuns', () => {
       runId,
       status: 'running',
       startedAt: 100,
+      ownerProcessId: process.pid,
     })
+    expect(runs[0]?.ownerSessionId).toBeString()

     await markAutonomyRunCompleted(runId, tempDir, 200)
     runs = await listAutonomyRuns(tempDir)
@@ -143,9 +154,22 @@ describe('autonomyRuns', () => {
       status: 'completed',
       endedAt: 200,
     })
+  })
+
+  test('markAutonomyRunFailed updates a non-terminal run', async () => {
+    const command = await createAutonomyQueuedPrompt({
+      basePrompt: '<tick>12:00:00</tick>',
+      trigger: 'proactive-tick',
+      rootDir: tempDir,
+      currentDir: tempDir,
+    })
+    expect(command).not.toBeNull()
+    const runId = command!.autonomy!.runId
+
+    await markAutonomyRunRunning(runId, tempDir, 100)
     await markAutonomyRunFailed(runId, 'boom', tempDir, 300)
-    runs = await listAutonomyRuns(tempDir)
+    const runs = await listAutonomyRuns(tempDir)

     expect(runs[0]).toMatchObject({
       runId,
       status: 'failed',
@@ -154,6 +178,348 @@ describe('autonomyRuns', () => {
|
||||
})
|
||||
})
|
||||
|
||||
test('terminal runs are not revived by stale lifecycle updates', async () => {
|
||||
const command = await createAutonomyQueuedPrompt({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(command).not.toBeNull()
|
||||
const runId = command!.autonomy!.runId
|
||||
|
||||
await markAutonomyRunCancelled(runId, tempDir, 100)
|
||||
const revived = await markAutonomyRunRunning(runId, tempDir, 200)
|
||||
const completed = await markAutonomyRunCompleted(runId, tempDir, 300)
|
||||
const failed = await markAutonomyRunFailed(
|
||||
runId,
|
||||
'late failure',
|
||||
tempDir,
|
||||
400,
|
||||
)
|
||||
const persisted = await getAutonomyRunById(runId, tempDir)
|
||||
|
||||
expect(revived).toBeNull()
|
||||
expect(completed).toBeNull()
|
||||
expect(failed).toBeNull()
|
||||
expect(persisted).toMatchObject({
|
||||
status: 'cancelled',
|
||||
endedAt: 100,
|
||||
})
|
||||
expect(persisted!.error).toBeUndefined()
|
||||
})
|
||||
|
||||
test('hasActiveAutonomyRunForSource only treats queued and running scheduled runs as active', async () => {
|
||||
const command = await createAutonomyQueuedPrompt({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
sourceLabel: 'nightly',
|
||||
})
|
||||
expect(command).not.toBeNull()
|
||||
const runId = command!.autonomy!.runId
|
||||
|
||||
await expect(
|
||||
hasActiveAutonomyRunForSource({
|
||||
trigger: 'scheduled-task',
|
||||
sourceId: 'cron-1',
|
||||
rootDir: tempDir,
|
||||
}),
|
||||
).resolves.toBe(true)
|
||||
|
||||
await markAutonomyRunRunning(runId, tempDir, 100)
|
||||
await expect(
|
||||
hasActiveAutonomyRunForSource({
|
||||
trigger: 'scheduled-task',
|
||||
sourceId: 'cron-1',
|
||||
rootDir: tempDir,
|
||||
}),
|
||||
).resolves.toBe(true)
|
||||
|
||||
await expect(
|
||||
hasActiveAutonomyRunForSource({
|
||||
trigger: 'scheduled-task',
|
||||
sourceId: 'cron-2',
|
||||
rootDir: tempDir,
|
||||
}),
|
||||
).resolves.toBe(false)
|
||||
|
||||
await markAutonomyRunCompleted(runId, tempDir, 200)
|
||||
await expect(
|
||||
hasActiveAutonomyRunForSource({
|
||||
trigger: 'scheduled-task',
|
||||
sourceId: 'cron-1',
|
||||
rootDir: tempDir,
|
||||
}),
|
||||
).resolves.toBe(false)
|
||||
|
||||
const failedCommand = await createAutonomyQueuedPrompt({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
})
|
||||
expect(failedCommand).not.toBeNull()
|
||||
await markAutonomyRunFailed(
|
||||
failedCommand!.autonomy!.runId,
|
||||
'boom',
|
||||
tempDir,
|
||||
300,
|
||||
)
|
||||
await expect(
|
||||
hasActiveAutonomyRunForSource({
|
||||
trigger: 'scheduled-task',
|
||||
sourceId: 'cron-1',
|
||||
rootDir: tempDir,
|
||||
}),
|
||||
).resolves.toBe(false)
|
||||
})
|
||||
|
||||
test('createAutonomyQueuedPromptIfNoActiveSource atomically skips duplicate active scheduled sources', async () => {
|
||||
const [first, second] = await Promise.all([
|
||||
createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
}),
|
||||
createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
}),
|
||||
])
|
||||
|
||||
const created = [first, second].filter(command => command !== null)
|
||||
const runs = await listAutonomyRuns(tempDir)
|
||||
|
||||
expect(created).toHaveLength(1)
|
||||
expect(runs).toHaveLength(1)
|
||||
expect(runs[0]).toMatchObject({
|
||||
trigger: 'scheduled-task',
|
||||
status: 'queued',
|
||||
sourceId: 'cron-1',
|
||||
})
|
||||
})
|
||||
|
||||
test('createAutonomyQueuedPromptIfNoActiveSource scopes dedup by ownerKey', async () => {
|
||||
const first = await createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
ownerKey: 'owner-a',
|
||||
})
|
||||
const second = await createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
ownerKey: 'owner-b',
|
||||
})
|
||||
|
||||
const runs = await listAutonomyRuns(tempDir)
|
||||
|
||||
expect(first).not.toBeNull()
|
||||
expect(second).not.toBeNull()
|
||||
expect(runs).toHaveLength(2)
|
||||
expect(new Set(runs.map(run => run.ownerKey))).toEqual(
|
||||
new Set(['owner-a', 'owner-b']),
|
||||
)
|
||||
})
|
||||
|
||||
test('createAutonomyQueuedPromptIfNoActiveSource does not advance heartbeat last-run state on dedup skip (two-phase commit invariant)', async () => {
|
||||
await writeTempFile(
|
||||
tempDir,
|
||||
HEARTBEAT_REL,
|
||||
[
|
||||
'tasks:',
|
||||
' - name: inbox',
|
||||
' interval: 30m',
|
||||
' prompt: "Check inbox"',
|
||||
].join('\n'),
|
||||
)
|
||||
|
||||
// Seed an active queued run for cron-1 so the next dedup attempt skips.
|
||||
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
|
||||
await writeFile(
|
||||
resolveAutonomyRunsPath(tempDir),
|
||||
`${JSON.stringify(
|
||||
{
|
||||
runs: [
|
||||
{
|
||||
runId: 'preexisting-active',
|
||||
runtime: 'automatic',
|
||||
trigger: 'scheduled-task',
|
||||
status: 'queued',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
promptPreview: 'still queued',
|
||||
createdAt: 100,
|
||||
ownerProcessId: process.pid,
|
||||
ownerSessionId: 'self',
|
||||
},
|
||||
],
|
||||
},
|
||||
null,
|
||||
2,
|
||||
)}\n`,
|
||||
'utf-8',
|
||||
)
|
||||
|
||||
const skipped = await createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
})
|
||||
expect(skipped).toBeNull()
|
||||
|
||||
// If the dedup skip wrongly advanced heartbeat state, the next
|
||||
// proactive-tick prompt would NOT include the inbox task. Verify it
|
||||
// still does.
|
||||
const followUp = await createAutonomyQueuedPrompt({
|
||||
basePrompt: '<tick>12:00:00</tick>',
|
||||
trigger: 'proactive-tick',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(followUp).not.toBeNull()
|
||||
expect(followUp!.value).toContain('Due HEARTBEAT.md tasks:')
|
||||
expect(followUp!.value).toContain('- inbox (30m): Check inbox')
|
||||
})
|
||||
|
||||
test('createAutonomyQueuedPromptIfNoActiveSource recovers stale active runs from dead owner processes', async () => {
|
||||
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
|
||||
await writeFile(
|
||||
resolveAutonomyRunsPath(tempDir),
|
||||
`${JSON.stringify(
|
||||
{
|
||||
runs: [
|
||||
{
|
||||
runId: 'stale-run',
|
||||
runtime: 'automatic',
|
||||
trigger: 'scheduled-task',
|
||||
status: 'running',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
sourceLabel: 'nightly',
|
||||
promptPreview: 'stale scheduled prompt',
|
||||
createdAt: 100,
|
||||
startedAt: 100,
|
||||
ownerProcessId: 2_147_483_647,
|
||||
ownerSessionId: 'dead-session',
|
||||
},
|
||||
],
|
||||
},
|
||||
null,
|
||||
2,
|
||||
)}\n`,
|
||||
'utf-8',
|
||||
)
|
||||
|
||||
await expect(
|
||||
hasActiveAutonomyRunForSource({
|
||||
trigger: 'scheduled-task',
|
||||
sourceId: 'cron-1',
|
||||
rootDir: tempDir,
|
||||
}),
|
||||
).resolves.toBe(false)
|
||||
|
||||
const command = await createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: 'scheduled prompt',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
})
|
||||
const runs = await listAutonomyRuns(tempDir)
|
||||
|
||||
expect(command).not.toBeNull()
|
||||
expect(runs).toHaveLength(2)
|
||||
expect(runs[0]).toMatchObject({
|
||||
trigger: 'scheduled-task',
|
||||
status: 'queued',
|
||||
sourceId: 'cron-1',
|
||||
ownerProcessId: process.pid,
|
||||
})
|
||||
expect(runs[1]).toMatchObject({
|
||||
runId: 'stale-run',
|
||||
status: 'failed',
|
||||
endedAt: runs[0]?.createdAt,
|
||||
error: expect.stringContaining('owner process 2147483647'),
|
||||
})
|
||||
})
|
||||
|
||||
test('stale managed-flow run recovery also marks the flow step failed', async () => {
|
||||
const command = await startManagedAutonomyFlowFromHeartbeatTask({
|
||||
task: {
|
||||
name: 'weekly-report',
|
||||
interval: '7d',
|
||||
prompt: 'Ship the weekly report',
|
||||
steps: [
|
||||
{
|
||||
name: 'gather',
|
||||
prompt: 'Gather weekly inputs',
|
||||
},
|
||||
],
|
||||
},
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
expect(command).not.toBeNull()
|
||||
const runId = command!.autonomy!.runId
|
||||
await markAutonomyRunRunning(runId, tempDir, 100)
|
||||
|
||||
const runsPath = resolveAutonomyRunsPath(tempDir)
|
||||
const file = JSON.parse(readFileSync(runsPath, 'utf-8')) as {
|
||||
runs: Array<Record<string, unknown>>
|
||||
}
|
||||
file.runs = file.runs.map(run =>
|
||||
run.runId === runId
|
||||
? { ...run, ownerProcessId: 2_147_483_647 }
|
||||
: run,
|
||||
)
|
||||
await writeFile(runsPath, `${JSON.stringify(file, null, 2)}\n`, 'utf-8')
|
||||
|
||||
const replacement = await createAutonomyQueuedPromptIfNoActiveSource({
|
||||
basePrompt: 'replacement prompt',
|
||||
trigger: 'managed-flow-step',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: command!.autonomy!.sourceId!,
|
||||
ownerKey: 'main-thread',
|
||||
})
|
||||
const [flow] = await listAutonomyFlows(tempDir)
|
||||
const runs = await listAutonomyRuns(tempDir)
|
||||
|
||||
expect(replacement).not.toBeNull()
|
||||
expect(runs.find(run => run.runId === runId)).toMatchObject({
|
||||
status: 'failed',
|
||||
error: expect.stringContaining(STALE_ACTIVE_RUN_ERROR_PREFIX),
|
||||
})
|
||||
expect(flow).toMatchObject({
|
||||
status: 'failed',
|
||||
blockedRunId: runId,
|
||||
})
|
||||
expect(flow?.stateJson?.steps[0]).toMatchObject({
|
||||
status: 'failed',
|
||||
runId,
|
||||
error: expect.stringContaining(STALE_ACTIVE_RUN_ERROR_PREFIX),
|
||||
})
|
||||
})
|
||||
|
||||
test('formatters produce readable status and run listings', async () => {
|
||||
const first = await createAutonomyQueuedPrompt({
|
||||
basePrompt: 'scheduled prompt',
|
||||
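The "terminal runs are not revived by stale lifecycle updates" test in the hunk above pins down a guard: once a run is `cancelled`, `completed`, or `failed`, later `mark*` calls return `null` and the persisted record is left untouched. A minimal in-memory sketch of such a guard follows. `RunRecord`, `RunStatus`, `TERMINAL`, and `applyTransition` are hypothetical simplifications introduced for illustration; they are not the code in `src/utils/autonomyRuns.ts`, which also persists to disk.

```typescript
// Hypothetical, simplified run record; the real persisted shape has more fields.
type RunStatus = 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'

type RunRecord = {
  runId: string
  status: RunStatus
  startedAt?: number
  endedAt?: number
  error?: string
}

// Assumption: these three statuses are the terminal ones, as in the test.
const TERMINAL: ReadonlySet<RunStatus> = new Set<RunStatus>([
  'completed',
  'failed',
  'cancelled',
])

// Returns the updated record, or null when the run is already terminal
// (mirroring the null results asserted in the test).
function applyTransition(
  run: RunRecord,
  next: Exclude<RunStatus, 'queued'>,
  nowMs: number,
  error?: string,
): RunRecord | null {
  if (TERMINAL.has(run.status)) {
    return null
  }
  if (next === 'running') {
    return { ...run, status: next, startedAt: nowMs }
  }
  return {
    ...run,
    status: next,
    endedAt: nowMs,
    ...(next === 'failed' && error ? { error } : {}),
  }
}

// Cancel a queued run, then attempt a late "running" update.
const queued: RunRecord = { runId: 'run-1', status: 'queued' }
const cancelled = applyTransition(queued, 'cancelled', 100)
const revived = cancelled ? applyTransition(cancelled, 'running', 200) : undefined
```

Returning `null` instead of throwing lets callers on detached paths (late finalizers, stale queue entries) treat "already terminal" as a no-op.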
@@ -223,6 +589,53 @@ describe('autonomyRuns', () => {
|
||||
)
|
||||
})
|
||||
|
||||
test('persistence pruning keeps active runs ahead of recent completed history', async () => {
|
||||
const runs = [
|
||||
{
|
||||
runId: 'old-active',
|
||||
runtime: 'automatic',
|
||||
trigger: 'scheduled-task',
|
||||
status: 'queued',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
ownerKey: 'main-thread',
|
||||
promptPreview: 'old active',
|
||||
createdAt: 1,
|
||||
},
|
||||
...Array.from({ length: 200 }, (_, index) => ({
|
||||
runId: `history-${index}`,
|
||||
runtime: 'automatic',
|
||||
trigger: 'scheduled-task',
|
||||
status: 'completed',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
ownerKey: 'main-thread',
|
||||
promptPreview: `history ${index}`,
|
||||
createdAt: 1_000 + index,
|
||||
endedAt: 2_000 + index,
|
||||
})),
|
||||
]
|
||||
await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
|
||||
await writeFile(
|
||||
resolveAutonomyRunsPath(tempDir),
|
||||
`${JSON.stringify({ runs }, null, 2)}\n`,
|
||||
'utf-8',
|
||||
)
|
||||
|
||||
await createAutonomyRun({
|
||||
trigger: 'scheduled-task',
|
||||
prompt: 'fresh active',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
nowMs: 9_999,
|
||||
})
|
||||
|
||||
const persisted = await listAutonomyRuns(tempDir)
|
||||
expect(persisted).toHaveLength(200)
|
||||
expect(persisted.some(run => run.runId === 'old-active')).toBe(true)
|
||||
expect(persisted.some(run => run.runId === 'history-0')).toBe(false)
|
||||
})
|
||||
|
||||
test('listAutonomyRuns keeps older persisted records by normalizing missing runtime and owner metadata', async () => {
|
||||
const runsPath = resolveAutonomyRunsPath(tempDir)
|
||||
await mkdir(join(tempDir, '.claude', 'autonomy'), { recursive: true })
|
||||
@@ -418,4 +831,27 @@ describe('autonomyRuns', () => {
|
||||
expect(recovered!.autonomy?.runId).toBe(command!.autonomy?.runId)
|
||||
expect(recovered!.autonomy?.flowId).toBe(flow!.flowId)
|
||||
})
|
||||
|
||||
test('STALE_ACTIVE_RUN_ERROR_PREFIX stays in sync with HEARTBEAT.md stale-recovery-health task', () => {
|
||||
// The HEARTBEAT.md stale-recovery-health task prompt embeds this prefix
|
||||
// as a literal string. Changing the constant without updating the
|
||||
// heartbeat prompt would silently break the monitor — this test fails
|
||||
// first to force the simultaneous update.
|
||||
const heartbeatPath = resolvePath(
|
||||
import.meta.dir,
|
||||
'..',
|
||||
'..',
|
||||
'..',
|
||||
'.claude',
|
||||
'autonomy',
|
||||
'HEARTBEAT.md',
|
||||
)
|
||||
if (!existsSync(heartbeatPath)) {
|
||||
// .claude/ may be absent in some checkout layouts (e.g., shallow clone
|
||||
// for npm pack). Skip rather than fail in that case.
|
||||
return
|
||||
}
|
||||
const content = readFileSync(heartbeatPath, 'utf8')
|
||||
expect(content).toContain(STALE_ACTIVE_RUN_ERROR_PREFIX)
|
||||
})
|
||||
})
|
||||
|
||||
@@ -133,11 +133,41 @@ function mergeAgentsAuthority(files: AutonomyAuthorityFile[]): string | null {
     .join('\n\n')
 }

+/**
+ * Replaces fenced code-block content (and the ``` / ~~~ fence delimiters
+ * themselves) with empty strings while preserving the index of every
+ * other line. Used by the heartbeat parser so that `tasks:` literals
+ * appearing inside Markdown code samples in HEARTBEAT.md docs do not
+ * collide with the real config block.
+ */
+function maskCodeFencedLines(lines: string[]): string[] {
+  const masked = lines.slice()
+  let activeFenceChar: '`' | '~' | null = null
+  for (let i = 0; i < masked.length; i++) {
+    const trimmed = masked[i]!.trim()
+    const fenceMatch = trimmed.match(/^(```+|~~~+)/)
+    if (fenceMatch) {
+      const fenceChar = fenceMatch[1]![0] as '`' | '~'
+      if (activeFenceChar === null) {
+        activeFenceChar = fenceChar
+      } else if (activeFenceChar === fenceChar) {
+        activeFenceChar = null
+      }
+      masked[i] = ''
+      continue
+    }
+    if (activeFenceChar !== null) {
+      masked[i] = ''
+    }
+  }
+  return masked
+}
+
 export function parseHeartbeatAuthorityTasks(
   content: string,
 ): HeartbeatAuthorityTask[] {
   const tasks: HeartbeatAuthorityTask[] = []
-  const lines = content.split('\n')
+  const lines = maskCodeFencedLines(content.split('\n'))
   const getIndent = (line: string): number =>
     line.length - line.trimStart().length
   const parseScalar = (line: string, key: string): string =>
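The masking step in the hunk above can be tried in isolation. The sketch below re-states the function so that it runs on its own and applies it to a hypothetical HEARTBEAT.md fragment. Two things differ from the diff and are assumptions made only for this snippet: the regular expression uses the equivalent `{3,}` quantifier, and the sample builds its fence marker with `repeat` so the snippet contains no literal fence lines.

```typescript
// Standalone restatement of the masking logic, for experimentation only.
function maskCodeFencedLines(lines: string[]): string[] {
  const masked = lines.slice()
  let activeFenceChar: '`' | '~' | null = null
  for (let i = 0; i < masked.length; i++) {
    const trimmed = masked[i]!.trim()
    // Three or more backticks or tildes at the start of the trimmed line.
    const fenceMatch = trimmed.match(/^(`{3,}|~{3,})/)
    if (fenceMatch) {
      const fenceChar = fenceMatch[1]![0] as '`' | '~'
      if (activeFenceChar === null) {
        activeFenceChar = fenceChar // opening fence
      } else if (activeFenceChar === fenceChar) {
        activeFenceChar = null // closing fence of the same kind
      }
      masked[i] = ''
      continue
    }
    if (activeFenceChar !== null) {
      masked[i] = '' // inside a fence: blank the line, keep its index
    }
  }
  return masked
}

// Hypothetical HEARTBEAT.md content: a documentation sample, then real config.
const FENCE = '`'.repeat(3)
const sample = [
  '# Heartbeat',
  FENCE + 'yaml',
  'tasks:',
  '  - name: example-only',
  FENCE,
  'tasks:',
  '  - name: inbox',
]
const maskedSample = maskCodeFencedLines(sample)
```

Because blanked lines keep their positions, any line-index bookkeeping in the parser still refers to the original file.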
@@ -83,6 +83,20 @@ export type AutonomyFlowRecord = {
   waitJson?: AutonomyFlowWaitState
   cancelRequestedAt?: number
   lastError?: string
+  /**
+   * Repo-relative POSIX glob patterns describing which paths this flow's
+   * `report`-step approval covers. The pre-tool-use hook
+   * `require-plan-for-risky-edit.mjs` consults this list to permit edits
+   * only when the target file matches at least one entry. Absent or empty
+   * means "no boundary declared" — during the pilot window the hook
+   * treats this as broad approval (v1 behaviour). Once all production
+   * flows declare boundaries, the hook will deny absent-boundary flows.
+   *
+   * Supported syntax: `*` matches one path segment, `**` matches any
+   * number including zero. Examples: `src/utils/autonomy*`,
+   * `src/services/api/**`, `src/Tool.ts`.
+   */
+  boundary?: string[]
 }

 type AutonomyFlowsFile = {
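The `boundary` doc comment above describes a small glob dialect. The hook `require-plan-for-risky-edit.mjs` is not part of this diff, so the matcher below is only a sketch of how such patterns could be evaluated. `segmentMatches`, `matchSegments`, and `matchesBoundary` are hypothetical names, and the reading of `*` as "any run of characters inside one path segment" is an assumption taken from the `src/utils/autonomy*` example.

```typescript
// Hypothetical matcher; not the hook's actual implementation.

// Match one pattern segment against one path segment; `*` never crosses `/`.
function segmentMatches(pattern: string, segment: string): boolean {
  // Escape regex metacharacters except `*`, then expand `*`.
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&')
  return new RegExp('^' + escaped.replace(/\*/g, '[^/]*') + '$').test(segment)
}

// Recursive segment-wise match; `**` consumes zero or more whole segments.
function matchSegments(pattern: string[], path: string[]): boolean {
  if (pattern.length === 0) {
    return path.length === 0
  }
  const [head, ...rest] = pattern
  if (head === '**') {
    for (let skip = 0; skip <= path.length; skip++) {
      if (matchSegments(rest, path.slice(skip))) {
        return true
      }
    }
    return false
  }
  if (path.length === 0) {
    return false
  }
  return segmentMatches(head!, path[0]!) && matchSegments(rest, path.slice(1))
}

// Absent or empty boundary is treated as broad approval, following the
// pilot-window behaviour described in the doc comment.
function matchesBoundary(boundary: string[] | undefined, filePath: string): boolean {
  if (!boundary || boundary.length === 0) {
    return true
  }
  const pathSegments = filePath.split('/')
  return boundary.some(glob => matchSegments(glob.split('/'), pathSegments))
}
```

For instance, under these assumptions `matchesBoundary(['src/services/api/**'], 'src/services/api/client/index.ts')` permits the edit, while `matchesBoundary(['src/Tool.ts'], 'src/Tool.tsx')` does not.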
@@ -138,6 +152,7 @@ function cloneWaitState(
 function cloneFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
   return {
     ...flow,
+    ...(flow.boundary ? { boundary: [...flow.boundary] } : {}),
     ...(flow.stateJson ? { stateJson: cloneManagedState(flow.stateJson) } : {}),
     ...(flow.waitJson ? { waitJson: cloneWaitState(flow.waitJson) } : {}),
   }
@@ -152,6 +167,25 @@ function isManagedFlowStatusActive(status: AutonomyFlowStatus): boolean {
   )
 }

+function selectPersistedAutonomyFlows(
+  flows: AutonomyFlowRecord[],
+): AutonomyFlowRecord[] {
+  const retained = flows
+    .slice()
+    .map(cloneFlowRecord)
+    .sort((left, right) => {
+      const leftActive = isManagedFlowStatusActive(left.status)
+      const rightActive = isManagedFlowStatusActive(right.status)
+      if (leftActive !== rightActive) {
+        return leftActive ? -1 : 1
+      }
+      return right.updatedAt - left.updatedAt
+    })
+    .slice(0, AUTONOMY_FLOWS_MAX)
+
+  return retained.sort((left, right) => right.updatedAt - left.updatedAt)
+}
+
 function defaultFlowSource(params: {
   trigger: AutonomyTriggerKind
   sourceId?: string
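`selectPersistedAutonomyFlows` in the hunk above sorts active flows to the front before truncating, then restores newest-first order for the retained set. The following sketch reproduces that two-pass selection on a trimmed-down record so the effect is easy to see. `FlowLike`, `ACTIVE_STATUSES`, `selectRetained`, and the explicit `max` parameter are hypothetical; in particular, the set of statuses treated as active here is an assumption, since `isManagedFlowStatusActive` is not shown in full.

```typescript
// Hypothetical, trimmed-down flow record for illustration only.
type FlowLike = { flowId: string; status: string; updatedAt: number }

// Assumption: which statuses count as active.
const ACTIVE_STATUSES = new Set(['queued', 'running', 'waiting'])

function selectRetained(flows: FlowLike[], max: number): FlowLike[] {
  const retained = flows
    .slice()
    .sort((left, right) => {
      const leftActive = ACTIVE_STATUSES.has(left.status)
      const rightActive = ACTIVE_STATUSES.has(right.status)
      if (leftActive !== rightActive) {
        return leftActive ? -1 : 1 // active flows win a slot first
      }
      return right.updatedAt - left.updatedAt // then newest first
    })
    .slice(0, max)
  // Restore plain newest-first order for the persisted file.
  return retained.sort((left, right) => right.updatedAt - left.updatedAt)
}

const sampleFlows: FlowLike[] = [
  { flowId: 'old-active', status: 'running', updatedAt: 1 },
  { flowId: 'done-a', status: 'succeeded', updatedAt: 10 },
  { flowId: 'done-b', status: 'succeeded', updatedAt: 20 },
  { flowId: 'done-c', status: 'succeeded', updatedAt: 30 },
]
const keptIds = selectRetained(sampleFlows, 3).map(flow => flow.flowId)
```

With a plain newest-first truncation, `old-active` would have been the record dropped; here the oldest finished flow, `done-a`, is dropped instead.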
@@ -237,6 +271,35 @@ function normalizeWaitState(value: unknown): AutonomyFlowWaitState | undefined {
   }
 }

+function isPosixBoundaryGlob(value: string): boolean {
+  if (!value || value.startsWith('/') || value.includes('\\')) {
+    return false
+  }
+  if (value.includes('\0')) {
+    return false
+  }
+  return !value.split('/').some(segment => segment === '..')
+}
+
+function normalizeBoundary(value: unknown): string[] | undefined {
+  if (!Array.isArray(value)) {
+    return undefined
+  }
+  const seen = new Set<string>()
+  const boundary = value
+    .filter((entry): entry is string => typeof entry === 'string')
+    .map(entry => entry.trim())
+    .filter(isPosixBoundaryGlob)
+    .filter(entry => {
+      if (seen.has(entry)) {
+        return false
+      }
+      seen.add(entry)
+      return true
+    })
+  return boundary.length > 0 ? boundary : undefined
+}
+
 function normalizeFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
   const source = defaultFlowSource(flow)
   return {
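To see what the two boundary helpers above accept and reject, the sketch below re-states them so the snippet runs on its own and feeds them a deliberately messy input. The input values (`rawBoundary`) are hypothetical and exist only for illustration.

```typescript
// Standalone restatement of the two helpers from the hunk above.
function isPosixBoundaryGlob(value: string): boolean {
  // Reject empty strings, absolute paths, and backslash separators.
  if (!value || value.startsWith('/') || value.includes('\\')) {
    return false
  }
  // Reject embedded NUL characters.
  if (value.includes('\0')) {
    return false
  }
  // Reject parent-directory traversal in any segment.
  return !value.split('/').some(segment => segment === '..')
}

function normalizeBoundary(value: unknown): string[] | undefined {
  if (!Array.isArray(value)) {
    return undefined
  }
  const seen = new Set<string>()
  const boundary = value
    .filter((entry): entry is string => typeof entry === 'string')
    .map(entry => entry.trim())
    .filter(isPosixBoundaryGlob)
    .filter(entry => {
      if (seen.has(entry)) {
        return false // drop duplicates, keep first occurrence
      }
      seen.add(entry)
      return true
    })
  return boundary.length > 0 ? boundary : undefined
}

// Hypothetical persisted value mixing valid, invalid, and duplicate entries.
const rawBoundary: unknown = [
  ' src/utils/** ',
  '/etc/passwd',
  '../secrets',
  'src\\windows\\style',
  'src/utils/**',
  42,
]
const cleanedBoundary = normalizeBoundary(rawBoundary)
```

Only the trimmed `src/utils/**` entry survives. Because an all-invalid or empty list normalizes to `undefined`, it is indistinguishable from a flow that never declared a boundary.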
@@ -247,6 +310,7 @@ function normalizeFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
     goal: flow.goal || flow.sourceLabel || flow.sourceId || flow.flowKey,
     currentDir: flow.currentDir || flow.rootDir,
     runCount: Math.max(flow.runCount ?? 0, 0),
+    boundary: normalizeBoundary(flow.boundary),
     stateJson: normalizeManagedState(flow.stateJson),
     waitJson: normalizeWaitState(flow.waitJson),
     ...(flow.sourceId
@@ -369,11 +433,7 @@ async function writeAutonomyFlows(
     path,
     `${JSON.stringify(
       {
-        flows: flows
-          .slice()
-          .map(cloneFlowRecord)
-          .sort((left, right) => right.updatedAt - left.updatedAt)
-          .slice(0, AUTONOMY_FLOWS_MAX),
+        flows: selectPersistedAutonomyFlows(flows),
       } satisfies AutonomyFlowsFile,
       null,
       2,
@@ -420,6 +480,7 @@ export async function startManagedAutonomyFlow(params: {
   ownerKey?: string
   sourceId?: string
   sourceLabel?: string
+  boundary?: string[]
   nowMs?: number
 }): Promise<ManagedAutonomyFlowStartResult | null> {
   if (params.steps.length === 0) {
@@ -450,6 +511,8 @@ export async function startManagedAutonomyFlow(params: {

   const stateJson = buildManagedState(params.steps)
   const firstStep = stateJson.steps[0]!
+  const boundary =
+    normalizeBoundary(params.boundary) ?? normalizeBoundary(current?.boundary)
   const waiting =
     firstStep.waitFor != null
       ? {
@@ -474,6 +537,7 @@ export async function startManagedAutonomyFlow(params: {
     currentDir,
     ...(source.sourceId ? { sourceId: source.sourceId } : {}),
     ...(source.sourceLabel ? { sourceLabel: source.sourceLabel } : {}),
+    ...(boundary ? { boundary } : {}),
     latestRunId: undefined,
     runCount: current?.runCount ?? 0,
     createdAt: current?.createdAt ?? nowMs,
@@ -4,6 +4,15 @@ import { lock } from './lockfile.js'

 const persistenceLocks = new Map<string, Promise<void>>()

+export function getAutonomyPersistenceLockCountForTests(): number {
+  if (process.env.NODE_ENV !== 'test') {
+    throw new Error(
+      'getAutonomyPersistenceLockCountForTests can only be called in tests',
+    )
+  }
+  return persistenceLocks.size
+}
+
 export async function withAutonomyPersistenceLock<T>(
   rootDir: string,
   fn: () => Promise<T>,
@@ -16,10 +25,8 @@ export async function withAutonomyPersistenceLock<T>(
   const current = new Promise<void>(resolve => {
     release = resolve
   })
-  persistenceLocks.set(
-    key,
-    previous.then(() => current),
-  )
+  const chained = previous.then(() => current)
+  persistenceLocks.set(key, chained)

   await previous
   try {
@@ -41,7 +48,7 @@ export async function withAutonomyPersistenceLock<T>(
     }
   } finally {
     release()
-    if (persistenceLocks.get(key) === current) {
+    if (persistenceLocks.get(key) === chained) {
       persistenceLocks.delete(key)
     }
   }
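The change above matters because `previous.then(() => current)` returns a new promise: the map never held `current` itself, so the old identity check could not succeed and entries stayed in `persistenceLocks`. The sketch below shows the same per-key promise chain in isolation. It is a simplified, in-memory-only illustration: `locks`, `withKeyedLock`, and the way `previous` is looked up are assumptions, and the file lock taken by the real helper is omitted.

```typescript
// In-memory per-key promise chain only; the real helper also takes a file lock.
const locks = new Map<string, Promise<void>>()

async function withKeyedLock<T>(key: string, fn: () => Promise<T>): Promise<T> {
  // Assumption: an absent entry means the key is currently unlocked.
  const previous = locks.get(key) ?? Promise.resolve()
  let release: () => void = () => {}
  const current = new Promise<void>(resolve => {
    release = resolve
  })
  // `.then` creates a distinct promise, so `chained !== current`. The map
  // stores `chained`, which is why cleanup must compare against `chained`.
  const chained = previous.then(() => current)
  locks.set(key, chained)

  await previous
  try {
    return await fn()
  } finally {
    release()
    // Only the most recent holder for this key removes the entry.
    if (locks.get(key) === chained) {
      locks.delete(key)
    }
  }
}

// Two callers on the same key run one after the other.
const order: string[] = []
const firstDone = withKeyedLock('root', async () => {
  order.push('first')
})
const secondDone = withKeyedLock('root', async () => {
  order.push('second')
})
const allDone = Promise.all([firstDone, secondDone])
```

Once `allDone` settles, `order` is `['first', 'second']` and `locks` is empty again, which is the property `getAutonomyPersistenceLockCountForTests` exists to check in the real module.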
261
src/utils/autonomyQueueLifecycle.ts
Normal file
@@ -0,0 +1,261 @@
+import type { QueuedCommand } from '../types/textInputTypes.js'
+import {
+  finalizeAutonomyRunCompleted,
+  finalizeAutonomyRunFailed,
+  listAutonomyRuns,
+  markAutonomyRunCancelled,
+  markAutonomyRunRunning,
+} from './autonomyRuns.js'
+
+export type AutonomyQueuePartition = {
+  attachmentCommands: QueuedCommand[]
+  staleCommands: QueuedCommand[]
+}
+
+export type AutonomyQueueClaim = AutonomyQueuePartition & {
+  claimedRunIds: string[]
+  claimedCommands: QueuedCommand[]
+}
+
+export type AutonomyTurnOutcome =
+  | { type: 'completed' }
+  | { type: 'cancelled' }
+  | { type: 'failed'; error?: unknown; message?: string }
+
+type AutonomyRunRef = {
+  runId: string
+  rootDir?: string
+}
+
function getCommandRootDir(
|
||||
command: QueuedCommand,
|
||||
fallbackRootDir?: string,
|
||||
): string | undefined {
|
||||
return command.autonomy?.rootDir ?? fallbackRootDir
|
||||
}
|
||||
|
||||
function refKey(ref: AutonomyRunRef): string {
|
||||
return `${ref.rootDir ?? ''}\0${ref.runId}`
|
||||
}
|
||||
|
||||
function getAutonomyRunRefs(
|
||||
commands: QueuedCommand[],
|
||||
fallbackRootDir?: string,
|
||||
): AutonomyRunRef[] {
|
||||
const refs = new Map<string, AutonomyRunRef>()
|
||||
for (const command of commands) {
|
||||
const runId = command.autonomy?.runId
|
||||
if (!runId) {
|
||||
continue
|
||||
}
|
||||
const ref = {
|
||||
runId,
|
||||
rootDir: getCommandRootDir(command, fallbackRootDir),
|
||||
}
|
||||
refs.set(refKey(ref), ref)
|
||||
}
|
||||
return [...refs.values()]
|
||||
}
|
||||
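Review note: `refKey` and `getAutonomyRunRefs` deduplicate on the pair (rootDir, runId) rather than on runId alone, so the same run id under two project roots stays distinct. The sketch below shows that behaviour in isolation; `RunRef` and `dedupeRefs` are hypothetical names. The NUL separator is presumably chosen because it cannot occur in a POSIX path, so the two halves of the key cannot collide.

```typescript
type RunRef = { runId: string; rootDir?: string }

// Composite key: rootDir and runId joined by NUL, mirroring refKey in the diff.
function refKey(ref: RunRef): string {
  return `${ref.rootDir ?? ''}\0${ref.runId}`
}

// Keep one ref per composite key; a later duplicate replaces an earlier one.
function dedupeRefs(refs: RunRef[]): RunRef[] {
  const byKey = new Map<string, RunRef>()
  for (const ref of refs) {
    byKey.set(refKey(ref), ref)
  }
  return Array.from(byKey.values())
}

const uniqueRefs = dedupeRefs([
  { runId: 'a', rootDir: '/x' },
  { runId: 'a', rootDir: '/y' }, // same id, different root: kept
  { runId: 'a', rootDir: '/x' }, // exact duplicate: collapsed
  { runId: 'a' },                // no root: keyed under the empty string
])
```

Here `uniqueRefs` has three entries.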
|
||||
function isInlineQueuedCommand(command: QueuedCommand): boolean {
|
||||
return command.mode === 'prompt' || command.mode === 'task-notification'
|
||||
}
|
||||
|
||||
function groupRefsByRootDir(
|
||||
refs: AutonomyRunRef[],
|
||||
): Map<string, AutonomyRunRef[]> {
|
||||
const grouped = new Map<string, AutonomyRunRef[]>()
|
||||
for (const ref of refs) {
|
||||
const key = ref.rootDir ?? ''
|
||||
const group = grouped.get(key)
|
||||
if (group) {
|
||||
group.push(ref)
|
||||
} else {
|
||||
grouped.set(key, [ref])
|
||||
}
|
||||
}
|
||||
return grouped
|
||||
}
|
||||
|
||||
/**
|
||||
* Exclude queued autonomy commands whose persisted run is no longer queued.
|
||||
* This prevents stale in-memory commands from reviving flows after cancellation
|
||||
* or after another path has already consumed the run.
|
||||
*/
|
||||
export async function partitionConsumableQueuedAutonomyCommands(
|
||||
commands: QueuedCommand[],
|
||||
rootDir?: string,
|
||||
): Promise<AutonomyQueuePartition> {
|
||||
const attachmentCommands: QueuedCommand[] = []
|
||||
const staleCommands: QueuedCommand[] = []
|
||||
const refs = getAutonomyRunRefs(commands, rootDir)
|
||||
const runsByRef = new Map<
|
||||
string,
|
||||
Awaited<ReturnType<typeof listAutonomyRuns>>[number]
|
||||
>()
|
||||
for (const [rootKey, group] of groupRefsByRootDir(refs)) {
|
||||
const runs = await listAutonomyRuns(rootKey || undefined)
|
||||
const wanted = new Set(group.map(ref => ref.runId))
|
||||
for (const run of runs) {
|
||||
if (wanted.has(run.runId)) {
|
||||
runsByRef.set(
|
||||
refKey({ runId: run.runId, rootDir: rootKey || undefined }),
|
||||
run,
|
||||
)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
for (const command of commands) {
|
||||
const runId = command.autonomy?.runId
|
||||
if (!runId) {
|
||||
attachmentCommands.push(command)
|
||||
continue
|
||||
}
|
||||
|
||||
const commandRootDir = getCommandRootDir(command, rootDir)
|
||||
const run = runsByRef.get(refKey({ runId, rootDir: commandRootDir }))
|
||||
if (run?.status === 'queued' && !run.startedAt && !run.endedAt) {
|
||||
attachmentCommands.push(command)
|
||||
} else {
|
||||
staleCommands.push(command)
|
||||
}
|
||||
}
|
||||
|
||||
return { attachmentCommands, staleCommands }
|
||||
}
|
||||
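Review note: the partition rule above treats a command as consumable only when its persisted run is still `queued` and has neither `startedAt` nor `endedAt`; a command whose run record is missing is treated as stale. The following is a simplified in-memory sketch of that rule. The `Command` and `RunRecord` shapes and the `runsById` map are assumptions made for illustration; the real code reads runs through `listAutonomyRuns` per rootDir.

```typescript
type RunStatus = 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'
type RunRecord = { runId: string; status: RunStatus; startedAt?: number; endedAt?: number }
type Command = { value: string; runId?: string }

function partitionConsumable(
  commands: Command[],
  runsById: Map<string, RunRecord>,
): { attachmentCommands: Command[]; staleCommands: Command[] } {
  const attachmentCommands: Command[] = []
  const staleCommands: Command[] = []
  for (const command of commands) {
    // Commands without an autonomy run are always passed through.
    if (!command.runId) {
      attachmentCommands.push(command)
      continue
    }
    const run = runsById.get(command.runId)
    const consumable =
      run !== undefined && run.status === 'queued' && !run.startedAt && !run.endedAt
    if (consumable) {
      attachmentCommands.push(command)
    } else {
      // Missing, running, or terminal runs must not be injected again.
      staleCommands.push(command)
    }
  }
  return { attachmentCommands, staleCommands }
}
```

Note that this check alone is advisory: the run can still change between the read and the injection, which is why the claim step that follows in the diff re-validates through `markAutonomyRunRunning`.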
|
||||
export async function claimConsumableQueuedAutonomyCommands(
|
||||
commands: QueuedCommand[],
|
||||
rootDir?: string,
|
||||
): Promise<AutonomyQueueClaim> {
|
||||
const partition = await partitionConsumableQueuedAutonomyCommands(
|
||||
commands,
|
||||
rootDir,
|
||||
)
|
||||
const claimedRunIds: string[] = []
|
||||
const claimedRunKeys: string[] = []
|
||||
const staleRunKeys = new Set<string>()
|
||||
const candidateRefs = getAutonomyRunRefs(
|
||||
partition.attachmentCommands.filter(isInlineQueuedCommand),
|
||||
rootDir,
|
||||
)
|
||||
|
||||
for (const ref of candidateRefs) {
|
||||
const updated = await markAutonomyRunRunning(ref.runId, ref.rootDir)
|
||||
if (updated?.status === 'running') {
|
||||
claimedRunIds.push(ref.runId)
|
||||
claimedRunKeys.push(refKey(ref))
|
||||
} else {
|
||||
staleRunKeys.add(refKey(ref))
|
||||
}
|
||||
}
|
||||
|
||||
const claimedRunKeySet = new Set(claimedRunKeys)
|
||||
const attachmentCommands: QueuedCommand[] = []
|
||||
const claimedCommands: QueuedCommand[] = []
|
||||
const staleCommands = [...partition.staleCommands]
|
||||
|
||||
for (const command of partition.attachmentCommands) {
|
||||
const runId = command.autonomy?.runId
|
||||
if (!runId) {
|
||||
attachmentCommands.push(command)
|
||||
continue
|
||||
}
|
||||
const key = refKey({
|
||||
runId,
|
||||
rootDir: getCommandRootDir(command, rootDir),
|
||||
})
|
||||
if (claimedRunKeySet.has(key)) {
|
||||
attachmentCommands.push(command)
|
||||
claimedCommands.push(command)
|
||||
} else if (staleRunKeys.has(key)) {
|
||||
staleCommands.push(command)
|
||||
}
|
||||
}
|
||||
|
||||
return {
|
||||
attachmentCommands,
|
||||
staleCommands,
|
||||
claimedRunIds,
|
||||
claimedCommands,
|
||||
}
|
||||
}
|
||||
|
||||
export async function cancelQueuedAutonomyCommands(params: {
|
||||
commands: QueuedCommand[]
|
||||
rootDir?: string
|
||||
}): Promise<void> {
|
||||
for (const ref of getAutonomyRunRefs(params.commands, params.rootDir)) {
|
||||
await markAutonomyRunCancelled(ref.runId, ref.rootDir)
|
||||
}
|
||||
}
|
||||
|
||||
function stringifyAutonomyError(error: unknown): string {
|
||||
if (typeof error === 'string') {
|
||||
return error
|
||||
}
|
||||
if (error instanceof Error) {
|
||||
return error.message
|
||||
}
|
||||
return String(error)
|
||||
}
|
||||
|
||||
export function sanitizeAutonomyFailureForPersistence(
|
||||
error: unknown,
|
||||
fallback = 'query failed',
|
||||
): string {
|
||||
const message = stringifyAutonomyError(error)
|
||||
const lower = message.toLowerCase()
|
||||
if (
|
||||
lower.includes('api_error') ||
|
||||
lower.includes('provider') ||
|
||||
lower.includes('openai') ||
|
||||
lower.includes('gemini') ||
|
||||
lower.includes('grok') ||
|
||||
lower.includes('anthropic') ||
|
||||
lower.includes('bedrock') ||
|
||||
lower.includes('vertex')
|
||||
) {
|
||||
return 'provider api_error'
|
||||
}
|
||||
return fallback
|
||||
}
|
||||
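Review note: `sanitizeAutonomyFailureForPersistence` never persists the original error text. It maps the message to one of two fixed strings, so provider responses, keys, and file paths do not end up in `runs.json`. The sketch below re-implements the same classification compactly to show the outcomes; `PROVIDER_MARKERS` and `sanitizeFailure` are hypothetical names, and the marker list is copied from the diff.

```typescript
const PROVIDER_MARKERS = [
  'api_error', 'provider', 'openai', 'gemini',
  'grok', 'anthropic', 'bedrock', 'vertex',
]

function sanitizeFailure(error: unknown, fallback = 'query failed'): string {
  const message =
    typeof error === 'string' ? error
    : error instanceof Error ? error.message
    : String(error)
  const lower = message.toLowerCase()
  // Any provider-looking message collapses to one fixed label.
  return PROVIDER_MARKERS.some(marker => lower.includes(marker))
    ? 'provider api_error'
    : fallback
}

const providerCase = sanitizeFailure(new Error('OpenAI 401: invalid key sk-abc123'))
// providerCase === 'provider api_error'
const localCase = sanitizeFailure('ENOENT: no such file /home/u/private.txt')
// localCase === 'query failed'
const objectCase = sanitizeFailure({ code: 42 })
// objectCase === 'query failed'
```

One thing worth confirming in review: in `finalizeAutonomyCommandsForTurn` an explicit `outcome.message` takes precedence and is persisted without passing through this sanitizer, so callers that set `message` are responsible for its content.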
|
||||
export async function finalizeAutonomyCommandsForTurn(params: {
|
||||
commands: QueuedCommand[]
|
||||
outcome: AutonomyTurnOutcome
|
||||
currentDir?: string
|
||||
priority?: 'now' | 'next' | 'later'
|
||||
workload?: string
|
||||
}): Promise<QueuedCommand[]> {
|
||||
const nextCommands: QueuedCommand[] = []
|
||||
for (const command of params.commands) {
|
||||
const autonomy = command.autonomy
|
||||
if (!autonomy?.runId) {
|
||||
continue
|
||||
}
|
||||
if (params.outcome.type === 'completed') {
|
||||
nextCommands.push(
|
||||
...(await finalizeAutonomyRunCompleted({
|
||||
runId: autonomy.runId,
|
||||
rootDir: autonomy.rootDir,
|
||||
currentDir: params.currentDir,
|
||||
priority: params.priority,
|
||||
workload: command.workload ?? params.workload,
|
||||
})),
|
||||
)
|
||||
} else if (params.outcome.type === 'cancelled') {
|
||||
await markAutonomyRunCancelled(autonomy.runId, autonomy.rootDir)
|
||||
} else {
|
||||
await finalizeAutonomyRunFailed({
|
||||
runId: autonomy.runId,
|
||||
rootDir: autonomy.rootDir,
|
||||
error:
|
||||
params.outcome.message ??
|
||||
sanitizeAutonomyFailureForPersistence(params.outcome.error),
|
||||
})
|
||||
}
|
||||
}
|
||||
return nextCommands
|
||||
}
|
||||
@@ -1,7 +1,7 @@
|
||||
import { randomUUID } from 'crypto'
|
||||
import { mkdir, writeFile } from 'fs/promises'
|
||||
import { dirname, join, resolve } from 'path'
|
||||
- import { getProjectRoot } from '../bootstrap/state.js'
+ import { getProjectRoot, getSessionId } from '../bootstrap/state.js'
|
||||
import type { MessageOrigin } from '../types/message.js'
|
||||
import type { QueuedCommand } from '../types/textInputTypes.js'
|
||||
import {
|
||||
@@ -29,9 +29,22 @@ import {
|
||||
} from './autonomyFlows.js'
|
||||
import { withAutonomyPersistenceLock } from './autonomyPersistence.js'
|
||||
import { getFsImplementation } from './fsOperations.js'
|
||||
import { isProcessRunning } from './genericProcessUtils.js'
|
||||
import { logError } from './log.js'
|
||||
|
||||
const AUTONOMY_RUNS_MAX = 200
|
||||
const AUTONOMY_RUNS_RELATIVE_PATH = join(AUTONOMY_DIR, 'runs.json')
|
||||
// Sentinel string surfaced to operators via runs.json error fields and
|
||||
// referenced literally by the HEARTBEAT.md `stale-recovery-health` task.
|
||||
// A unit test asserts the HEARTBEAT.md file contains this exact prefix —
|
||||
// changing the value will fail the test, forcing the heartbeat prompt
|
||||
// to be updated in the same change.
|
||||
export const STALE_ACTIVE_RUN_ERROR_PREFIX =
|
||||
'Recovered stale active autonomy run'
|
||||
|
||||
// Guards the legacy-block warning so it fires once per (process, runId) instead
|
||||
// of every dedup tick while a no-owner record sits there.
|
||||
const warnedLegacyBlockRunIds = new Set<string>()
|
||||
|
||||
export type AutonomyRunStatus =
|
||||
| 'queued'
|
||||
@@ -59,6 +72,8 @@ export type AutonomyRunRecord = {
|
||||
flowStepName?: string
|
||||
promptPreview: string
|
||||
createdAt: number
|
||||
ownerProcessId?: number
|
||||
ownerSessionId?: string
|
||||
startedAt?: number
|
||||
endedAt?: number
|
||||
error?: string
|
||||
@@ -77,6 +92,19 @@ type AutonomyRunFlowRef = {
|
||||
stepName: string
|
||||
}
|
||||
|
||||
type CreateAutonomyRunParams = {
|
||||
trigger: AutonomyTriggerKind
|
||||
prompt: string
|
||||
rootDir?: string
|
||||
currentDir?: string
|
||||
sourceId?: string
|
||||
sourceLabel?: string
|
||||
runtime?: AutonomyRunRuntime
|
||||
ownerKey?: string
|
||||
flow?: AutonomyRunFlowRef
|
||||
nowMs?: number
|
||||
}
|
||||
|
||||
function truncatePromptPreview(prompt: string): string {
|
||||
const singleLine = prompt.replace(/\s+/g, ' ').trim()
|
||||
return singleLine.length <= 240
|
||||
@@ -95,6 +123,29 @@ function cloneRunRecord(run: AutonomyRunRecord): AutonomyRunRecord {
|
||||
return { ...run }
|
||||
}
|
||||
|
||||
function isAutonomyRunActive(run: AutonomyRunRecord): boolean {
|
||||
return run.status === 'queued' || run.status === 'running'
|
||||
}
|
||||
|
||||
function selectPersistedAutonomyRuns(
|
||||
runs: AutonomyRunRecord[],
|
||||
): AutonomyRunRecord[] {
|
||||
const retained = runs
|
||||
.slice()
|
||||
.map(cloneRunRecord)
|
||||
.sort((left, right) => {
|
||||
const leftActive = isAutonomyRunActive(left)
|
||||
const rightActive = isAutonomyRunActive(right)
|
||||
if (leftActive !== rightActive) {
|
||||
return leftActive ? -1 : 1
|
||||
}
|
||||
return right.createdAt - left.createdAt
|
||||
})
|
||||
.slice(0, AUTONOMY_RUNS_MAX)
|
||||
|
||||
return retained.sort((left, right) => right.createdAt - left.createdAt)
|
||||
}
|
||||
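Review note: `selectPersistedAutonomyRuns` changes the retention rule from "newest N" to "active first, then newest, capped at N", and then restores newest-first order for the file. The sketch below shows the difference on a small input; `Run` and `selectRetained` are hypothetical names and the cap is passed as a parameter instead of `AUTONOMY_RUNS_MAX`.

```typescript
type Status = 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'
type Run = { runId: string; status: Status; createdAt: number }

function selectRetained(runs: Run[], max: number): Run[] {
  const isActive = (run: Run) => run.status === 'queued' || run.status === 'running'
  return runs
    .slice() // do not mutate the caller's array
    .sort((left, right) => {
      const leftActive = isActive(left)
      const rightActive = isActive(right)
      if (leftActive !== rightActive) {
        return leftActive ? -1 : 1 // active runs win a slot first
      }
      return right.createdAt - left.createdAt
    })
    .slice(0, max)
    .sort((left, right) => right.createdAt - left.createdAt) // file order: newest first
}

const retained = selectRetained(
  [
    { runId: 'r1', status: 'running', createdAt: 1 }, // oldest, but active
    { runId: 'c2', status: 'completed', createdAt: 2 },
    { runId: 'c3', status: 'completed', createdAt: 3 },
    { runId: 'c4', status: 'completed', createdAt: 4 },
  ],
  2,
)
// retained ids: c4, r1. A plain "newest 2" rule would have kept c4, c3.
```

The protection is not absolute: if the number of active runs alone exceeds the cap, the oldest active ones are still dropped by the `slice`.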
|
||||
function normalizePersistedRunRecord(
|
||||
run: PersistedAutonomyRunRecord,
|
||||
): AutonomyRunRecord {
|
||||
@@ -157,11 +208,7 @@ async function writeAutonomyRuns(
|
||||
path,
|
||||
`${JSON.stringify(
|
||||
{
|
||||
- runs: runs
-   .slice()
-   .map(cloneRunRecord)
-   .sort((left, right) => right.createdAt - left.createdAt)
-   .slice(0, AUTONOMY_RUNS_MAX),
+ runs: selectPersistedAutonomyRuns(runs),
|
||||
} satisfies AutonomyRunsFile,
|
||||
null,
|
||||
2,
|
||||
@@ -172,7 +219,7 @@ async function writeAutonomyRuns(
|
||||
|
||||
async function updateAutonomyRun(
|
||||
runId: string,
|
||||
- updater: (current: AutonomyRunRecord) => AutonomyRunRecord,
+ updater: (current: AutonomyRunRecord) => AutonomyRunRecord | null,
|
||||
rootDir: string = getProjectRoot(),
|
||||
): Promise<AutonomyRunRecord | null> {
|
||||
return withAutonomyPersistenceLock(rootDir, async () => {
|
||||
@@ -181,7 +228,11 @@ async function updateAutonomyRun(
|
||||
if (index === -1) {
|
||||
return null
|
||||
}
|
||||
- const updated = cloneRunRecord(updater(cloneRunRecord(runs[index]!)))
+ const next = updater(cloneRunRecord(runs[index]!))
+ if (!next) {
+   return null
+ }
+ const updated = cloneRunRecord(next)
|
||||
runs[index] = updated
|
||||
await writeAutonomyRuns(runs, rootDir)
|
||||
return updated
|
||||
@@ -196,21 +247,112 @@ export async function getAutonomyRunById(
|
||||
return runs.find(run => run.runId === runId) ?? null
|
||||
}
|
||||
|
||||
- export async function createAutonomyRun(params: {
|
||||
function isActiveAutonomyRunStatus(status: AutonomyRunStatus): boolean {
|
||||
return status === 'queued' || status === 'running'
|
||||
}
|
||||
|
||||
function isValidOwnerProcessId(pid: number | undefined): pid is number {
|
||||
// Reject non-numeric, negative, zero (Linux: send-to-process-group), and
|
||||
// non-integer values. A forged record with pid=0 or pid<0 used to be
|
||||
// treated as live and could permanently block dedup; treating them as
|
||||
// stale closes that availability hole.
|
||||
return (
|
||||
typeof pid === 'number' &&
|
||||
Number.isInteger(pid) &&
|
||||
pid > 0 &&
|
||||
pid < 4_194_304
|
||||
)
|
||||
}
|
||||
|
||||
function isStaleActiveAutonomyRun(run: AutonomyRunRecord): boolean {
|
||||
if (!isActiveAutonomyRunStatus(run.status)) {
|
||||
return false
|
||||
}
|
||||
if (run.ownerProcessId === undefined) {
|
||||
return false
|
||||
}
|
||||
if (!isValidOwnerProcessId(run.ownerProcessId)) {
|
||||
return true
|
||||
}
|
||||
return !isProcessRunning(run.ownerProcessId)
|
||||
}
|
||||
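Review note: the staleness rule above has four outcomes, and the order of the checks matters. The sketch below restates it with the liveness probe injected as a parameter so it can be exercised without real processes. `isStale`, `isValidOwnerPid`, and the `isAlive` parameter are hypothetical; the diff calls `isProcessRunning` from `./genericProcessUtils.js` instead. The upper bound 4_194_304 is 2^22, which matches the maximum `pid_max` on 64-bit Linux.

```typescript
type Status = 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'
type Run = { status: Status; ownerProcessId?: number }

function isValidOwnerPid(pid: number | undefined): pid is number {
  return typeof pid === 'number' && Number.isInteger(pid) && pid > 0 && pid < 4_194_304
}

function isStale(run: Run, isAlive: (pid: number) => boolean): boolean {
  // 1. Terminal runs are never "stale active" runs.
  if (run.status !== 'queued' && run.status !== 'running') return false
  // 2. Legacy records without an owner cannot be proven dead, so they keep blocking.
  if (run.ownerProcessId === undefined) return false
  // 3. Forged or corrupt pids (0, negative, fractional, huge) count as dead.
  if (!isValidOwnerPid(run.ownerProcessId)) return true
  // 4. Otherwise defer to the liveness probe.
  return !isAlive(run.ownerProcessId)
}
```

Case 2 is the reason the diff logs a one-time warning for un-owned active runs: they block dedup until someone cancels them manually.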
|
||||
function staleActiveRunError(run: AutonomyRunRecord): string {
|
||||
return `${STALE_ACTIVE_RUN_ERROR_PREFIX}: owner process ${run.ownerProcessId} is no longer running.`
|
||||
}
|
||||
|
||||
function failAutonomyRunRecord(
|
||||
run: AutonomyRunRecord,
|
||||
error: string,
|
||||
nowMs: number,
|
||||
): AutonomyRunRecord {
|
||||
return {
|
||||
...run,
|
||||
status: 'failed',
|
||||
endedAt: nowMs,
|
||||
error,
|
||||
}
|
||||
}
|
||||
|
||||
function recoverStaleActiveAutonomyRun(
|
||||
run: AutonomyRunRecord,
|
||||
nowMs: number,
|
||||
): AutonomyRunRecord {
|
||||
return failAutonomyRunRecord(run, staleActiveRunError(run), nowMs)
|
||||
}
|
||||
|
||||
async function syncFailedManagedFlowForRun(
|
||||
run: AutonomyRunRecord,
|
||||
rootDir: string,
|
||||
): Promise<void> {
|
||||
if (run.parentFlowId && run.parentFlowSyncMode === 'managed') {
|
||||
await markManagedAutonomyFlowStepFailed({
|
||||
flowId: run.parentFlowId,
|
||||
runId: run.runId,
|
||||
error: run.error ?? 'Autonomy run failed.',
|
||||
rootDir,
|
||||
nowMs: run.endedAt,
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
function matchesActiveAutonomyRunSource(
|
||||
run: AutonomyRunRecord,
|
||||
params: {
|
||||
trigger: AutonomyTriggerKind
|
||||
sourceId: string
|
||||
ownerKey?: string
|
||||
},
|
||||
): boolean {
|
||||
return (
|
||||
run.trigger === params.trigger &&
|
||||
run.sourceId === params.sourceId &&
|
||||
(params.ownerKey === undefined || run.ownerKey === params.ownerKey) &&
|
||||
isActiveAutonomyRunStatus(run.status)
|
||||
)
|
||||
}
|
||||
|
||||
+ export async function hasActiveAutonomyRunForSource(params: {
    trigger: AutonomyTriggerKind
-   prompt: string
+   sourceId: string
    rootDir?: string
-   currentDir?: string
-   sourceId?: string
-   sourceLabel?: string
-   runtime?: AutonomyRunRuntime
    ownerKey?: string
-   flow?: AutonomyRunFlowRef
-   nowMs?: number
- }): Promise<AutonomyRunRecord> {
-   const rootDir = resolve(params.rootDir ?? getProjectRoot())
-   const currentDir = resolve(params.currentDir ?? rootDir)
-   const record: AutonomyRunRecord = {
+ }): Promise<boolean> {
|
||||
const runs = await listAutonomyRuns(params.rootDir)
|
||||
return runs.some(
|
||||
run =>
|
||||
matchesActiveAutonomyRunSource(run, params) &&
|
||||
!isStaleActiveAutonomyRun(run),
|
||||
)
|
||||
}
|
||||
|
||||
function buildAutonomyRunRecord(
|
||||
params: CreateAutonomyRunParams,
|
||||
rootDir: string,
|
||||
currentDir: string,
|
||||
): AutonomyRunRecord {
|
||||
const createdAt = params.nowMs ?? Date.now()
|
||||
return {
|
||||
runId: randomUUID(),
|
||||
runtime: params.runtime ?? (params.flow ? 'flow_step' : 'automatic'),
|
||||
trigger: params.trigger,
|
||||
@@ -231,13 +373,80 @@ export async function createAutonomyRun(params: {
|
||||
}
|
||||
: {}),
|
||||
promptPreview: truncatePromptPreview(params.prompt),
|
||||
- createdAt: params.nowMs ?? Date.now(),
+ createdAt,
+ ownerProcessId: process.pid,
+ ownerSessionId: getSessionId(),
|
||||
}
|
||||
}
|
||||
|
||||
async function persistAutonomyRunRecord(
|
||||
record: AutonomyRunRecord,
|
||||
rootDir: string,
|
||||
skipWhenActiveSource: boolean,
|
||||
): Promise<{
|
||||
created: boolean
|
||||
recoveredStaleRuns: AutonomyRunRecord[]
|
||||
}> {
|
||||
let created = false
|
||||
const recoveredStaleRuns: AutonomyRunRecord[] = []
|
||||
await withAutonomyPersistenceLock(rootDir, async () => {
|
||||
const runs = await listAutonomyRuns(rootDir)
|
||||
const sourceId = record.sourceId
|
||||
if (skipWhenActiveSource && sourceId) {
|
||||
let hasBlockingActiveRun = false
|
||||
let staleRecoveriesApplied = false
|
||||
for (let i = 0; i < runs.length; i++) {
|
||||
const run = runs[i]!
|
||||
if (
|
||||
!matchesActiveAutonomyRunSource(run, {
|
||||
trigger: record.trigger,
|
||||
sourceId,
|
||||
ownerKey: record.ownerKey,
|
||||
})
|
||||
) {
|
||||
continue
|
||||
}
|
||||
if (isStaleActiveAutonomyRun(run)) {
|
||||
const recovered = recoverStaleActiveAutonomyRun(
|
||||
run,
|
||||
record.createdAt,
|
||||
)
|
||||
runs[i] = recovered
|
||||
recoveredStaleRuns.push(recovered)
|
||||
staleRecoveriesApplied = true
|
||||
continue
|
||||
}
|
||||
if (
|
||||
run.ownerProcessId === undefined &&
|
||||
!warnedLegacyBlockRunIds.has(run.runId)
|
||||
) {
|
||||
warnedLegacyBlockRunIds.add(run.runId)
|
||||
logError(
|
||||
new Error(
|
||||
`[autonomyRuns] blocked by legacy un-owned active run ${run.runId} (createdAt=${run.createdAt}); cancel manually if this is a stale upgrade artifact`,
|
||||
),
|
||||
)
|
||||
}
|
||||
hasBlockingActiveRun = true
|
||||
}
|
||||
if (hasBlockingActiveRun) {
|
||||
if (staleRecoveriesApplied) {
|
||||
await writeAutonomyRuns(runs, rootDir)
|
||||
}
|
||||
return
|
||||
}
|
||||
}
|
||||
runs.unshift(record)
|
||||
await writeAutonomyRuns(runs, rootDir)
|
||||
created = true
|
||||
})
|
||||
return { created, recoveredStaleRuns }
|
||||
}
|
||||
|
||||
async function queueManagedFlowStepRunForRecord(
|
||||
record: AutonomyRunRecord,
|
||||
rootDir: string,
|
||||
): Promise<void> {
|
||||
if (
|
||||
record.parentFlowId &&
|
||||
record.flowStepId &&
|
||||
@@ -258,9 +467,47 @@ export async function createAutonomyRun(params: {
|
||||
nowMs: record.createdAt,
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
async function createAutonomyRunCore(
|
||||
params: CreateAutonomyRunParams,
|
||||
skipIfActiveSource: boolean,
|
||||
): Promise<AutonomyRunRecord | null> {
|
||||
const rootDir = resolve(params.rootDir ?? getProjectRoot())
|
||||
const currentDir = resolve(params.currentDir ?? rootDir)
|
||||
const record = buildAutonomyRunRecord(params, rootDir, currentDir)
|
||||
|
||||
const { created, recoveredStaleRuns } = await persistAutonomyRunRecord(
|
||||
record,
|
||||
rootDir,
|
||||
skipIfActiveSource,
|
||||
)
|
||||
for (const recovered of recoveredStaleRuns) {
|
||||
await syncFailedManagedFlowForRun(recovered, rootDir)
|
||||
}
|
||||
if (!created) {
|
||||
return null
|
||||
}
|
||||
await queueManagedFlowStepRunForRecord(record, rootDir)
|
||||
return record
|
||||
}
|
||||
|
||||
export async function createAutonomyRun(
|
||||
params: CreateAutonomyRunParams,
|
||||
): Promise<AutonomyRunRecord> {
|
||||
const record = await createAutonomyRunCore(params, false)
|
||||
if (!record) {
|
||||
throw new Error('Autonomy run was unexpectedly skipped.')
|
||||
}
|
||||
return record
|
||||
}
|
||||
|
||||
export async function createAutonomyRunIfNoActiveSource(
|
||||
params: CreateAutonomyRunParams & { sourceId: string },
|
||||
): Promise<AutonomyRunRecord | null> {
|
||||
return createAutonomyRunCore(params, true)
|
||||
}
|
||||
|
||||
function buildManagedFlowStepPrompt(
|
||||
flow: AutonomyFlowRecord,
|
||||
stepIndex: number,
|
||||
@@ -336,6 +583,7 @@ async function createOrRecoverManagedFlowStepCommand(params: {
|
||||
workload: params.workload,
|
||||
autonomy: {
|
||||
runId: run.runId,
|
||||
rootDir: run.rootDir,
|
||||
trigger: 'managed-flow-step',
|
||||
sourceId: run.sourceId,
|
||||
sourceLabel: run.sourceLabel,
|
||||
@@ -426,11 +674,16 @@ export async function markAutonomyRunRunning(
|
||||
): Promise<AutonomyRunRecord | null> {
|
||||
const updated = await updateAutonomyRun(
|
||||
runId,
|
||||
- current => ({
-   ...current,
-   status: 'running',
-   startedAt: nowMs ?? Date.now(),
- }),
+ current =>
+   current.status === 'queued'
+     ? {
+         ...current,
+         status: 'running',
+         startedAt: nowMs ?? Date.now(),
+         ownerProcessId: process.pid,
+         ownerSessionId: getSessionId(),
+       }
+     : null,
|
||||
rootDir,
|
||||
)
|
||||
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
|
||||
@@ -451,12 +704,15 @@ export async function markAutonomyRunCompleted(
|
||||
): Promise<AutonomyRunRecord | null> {
|
||||
const updated = await updateAutonomyRun(
|
||||
runId,
|
||||
- current => ({
-   ...current,
-   status: 'completed',
-   endedAt: nowMs ?? Date.now(),
-   error: undefined,
- }),
+ current =>
+   current.status === 'queued' || current.status === 'running'
+     ? {
+         ...current,
+         status: 'completed',
+         endedAt: nowMs ?? Date.now(),
+         error: undefined,
+       }
+     : null,
|
||||
rootDir,
|
||||
)
|
||||
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
|
||||
@@ -476,24 +732,17 @@ export async function markAutonomyRunFailed(
|
||||
rootDir?: string,
|
||||
nowMs?: number,
|
||||
): Promise<AutonomyRunRecord | null> {
|
||||
+ const endedAt = nowMs ?? Date.now()
  const updated = await updateAutonomyRun(
    runId,
-   current => ({
-     ...current,
-     status: 'failed',
-     endedAt: nowMs ?? Date.now(),
-     error,
-   }),
+   current =>
+     isActiveAutonomyRunStatus(current.status)
+       ? failAutonomyRunRecord(current, error, endedAt)
+       : null,
    rootDir,
  )
- if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
-   await markManagedAutonomyFlowStepFailed({
-     flowId: updated.parentFlowId,
-     runId: updated.runId,
-     error,
-     rootDir,
-     nowMs: updated.endedAt,
-   })
+ if (updated) {
+   await syncFailedManagedFlowForRun(updated, rootDir ?? updated.rootDir)
|
||||
}
|
||||
return updated
|
||||
}
|
||||
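Review note: the `mark*` hunks all follow the same shape after this change: the updater returns `null` when the current status does not permit the transition, and `updateAutonomyRun` then writes nothing and returns `null`. That makes each transition a compare-and-set, so a cancelled or finished run cannot be moved back to `running`. A simplified in-memory sketch of the idea follows; `store`, `update`, `markRunning`, and `markCancelled` are hypothetical stand-ins, and the real code performs the update under `withAutonomyPersistenceLock`.

```typescript
type Status = 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'
type Run = { runId: string; status: Status; startedAt?: number; endedAt?: number }

const store = new Map<string, Run>() // stand-in for runs.json

function update(runId: string, updater: (run: Run) => Run | null): Run | null {
  const current = store.get(runId)
  if (!current) return null
  const next = updater({ ...current })
  if (!next) return null // transition refused: nothing is written
  store.set(runId, next)
  return next
}

// Only a queued run can be claimed.
function markRunning(runId: string, nowMs: number): Run | null {
  return update(runId, run =>
    run.status === 'queued' ? { ...run, status: 'running', startedAt: nowMs } : null,
  )
}

// Only an active run can be cancelled.
function markCancelled(runId: string, nowMs: number): Run | null {
  return update(runId, run =>
    run.status === 'queued' || run.status === 'running'
      ? { ...run, status: 'cancelled', endedAt: nowMs }
      : null,
  )
}
```

A caller treats a `null` result as "someone else got there first", which is how `claimConsumableQueuedAutonomyCommands` decides that a command is stale.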
@@ -505,12 +754,15 @@ export async function markAutonomyRunCancelled(
|
||||
): Promise<AutonomyRunRecord | null> {
|
||||
const updated = await updateAutonomyRun(
|
||||
runId,
|
||||
- current => ({
-   ...current,
-   status: 'cancelled',
-   endedAt: nowMs ?? Date.now(),
-   error: undefined,
- }),
+ current =>
+   current.status === 'queued' || current.status === 'running'
+     ? {
+         ...current,
+         status: 'cancelled',
+         endedAt: nowMs ?? Date.now(),
+         error: undefined,
+       }
+     : null,
|
||||
rootDir,
|
||||
)
|
||||
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
|
||||
@@ -612,6 +864,7 @@ export async function createAutonomyQueuedPrompt(params: {
|
||||
currentDir?: string
|
||||
sourceId?: string
|
||||
sourceLabel?: string
|
||||
ownerKey?: string
|
||||
workload?: string
|
||||
priority?: 'now' | 'next' | 'later'
|
||||
shouldCreate?: () => boolean
|
||||
@@ -634,39 +887,130 @@ export async function createAutonomyQueuedPrompt(params: {
|
||||
currentDir,
|
||||
sourceId: params.sourceId,
|
||||
sourceLabel: params.sourceLabel,
|
||||
ownerKey: params.ownerKey,
|
||||
workload: params.workload,
|
||||
priority: params.priority,
|
||||
flow: params.flow,
|
||||
})
|
||||
}
|
||||
|
||||
export async function createAutonomyQueuedPromptIfNoActiveSource(params: {
|
||||
trigger: AutonomyTriggerKind
|
||||
basePrompt: string
|
||||
rootDir?: string
|
||||
currentDir?: string
|
||||
sourceId: string
|
||||
sourceLabel?: string
|
||||
ownerKey?: string
|
||||
workload?: string
|
||||
priority?: 'now' | 'next' | 'later'
|
||||
shouldCreate?: () => boolean
|
||||
}): Promise<QueuedCommand | null> {
|
||||
const rootDir = resolve(params.rootDir ?? getProjectRoot())
|
||||
const currentDir = resolve(params.currentDir ?? getCwd())
|
||||
// Cheap optimistic pre-check: skip the AGENTS.md / HEARTBEAT.md disk
|
||||
// reads + prompt assembly when an active run for this source already
|
||||
// blocks dedup. The lock-side check inside persistAutonomyRunRecord
|
||||
// remains authoritative; this only fast-paths the common storm case.
|
||||
if (
|
||||
await hasActiveAutonomyRunForSource({
|
||||
trigger: params.trigger,
|
||||
sourceId: params.sourceId,
|
||||
rootDir,
|
||||
ownerKey: params.ownerKey,
|
||||
})
|
||||
) {
|
||||
return null
|
||||
}
|
||||
const prepared = await prepareAutonomyTurnPrompt({
|
||||
basePrompt: params.basePrompt,
|
||||
trigger: params.trigger,
|
||||
rootDir,
|
||||
currentDir,
|
||||
})
|
||||
if (params.shouldCreate && !params.shouldCreate()) {
|
||||
return null
|
||||
}
|
||||
return commitAutonomyQueuedPromptIfNoActiveSource({
|
||||
prepared,
|
||||
rootDir,
|
||||
currentDir,
|
||||
sourceId: params.sourceId,
|
||||
sourceLabel: params.sourceLabel,
|
||||
ownerKey: params.ownerKey,
|
||||
workload: params.workload,
|
||||
priority: params.priority,
|
||||
})
|
||||
}
|
||||
|
||||
export async function commitAutonomyQueuedPrompt(params: {
|
||||
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
|
||||
rootDir?: string
|
||||
currentDir?: string
|
||||
sourceId?: string
|
||||
sourceLabel?: string
|
||||
ownerKey?: string
|
||||
workload?: string
|
||||
priority?: 'now' | 'next' | 'later'
|
||||
flow?: AutonomyRunFlowRef
|
||||
}): Promise<QueuedCommand> {
|
||||
const command = await commitAutonomyQueuedPromptInternal(params, false)
|
||||
if (!command) {
|
||||
throw new Error('Autonomy queued prompt was unexpectedly skipped.')
|
||||
}
|
||||
return command
|
||||
}
|
||||
|
||||
async function commitAutonomyQueuedPromptIfNoActiveSource(params: {
|
||||
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
|
||||
rootDir?: string
|
||||
currentDir?: string
|
||||
sourceId: string
|
||||
sourceLabel?: string
|
||||
ownerKey?: string
|
||||
workload?: string
|
||||
priority?: 'now' | 'next' | 'later'
|
||||
}): Promise<QueuedCommand | null> {
|
||||
return commitAutonomyQueuedPromptInternal(params, true)
|
||||
}
|
||||
|
||||
async function commitAutonomyQueuedPromptInternal(
|
||||
params: {
|
||||
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
|
||||
rootDir?: string
|
||||
currentDir?: string
|
||||
sourceId?: string
|
||||
sourceLabel?: string
|
||||
ownerKey?: string
|
||||
workload?: string
|
||||
priority?: 'now' | 'next' | 'later'
|
||||
flow?: AutonomyRunFlowRef
|
||||
},
|
||||
skipWhenActiveSource: boolean,
|
||||
): Promise<QueuedCommand | null> {
|
||||
const rootDir = resolve(
|
||||
params.rootDir ?? params.prepared.rootDir ?? getProjectRoot(),
|
||||
)
|
||||
const currentDir = resolve(
|
||||
params.currentDir ?? params.prepared.currentDir ?? getCwd(),
|
||||
)
|
||||
- commitPreparedAutonomyTurn(params.prepared)
  const value = params.prepared.prompt
- const run = await createAutonomyRun({
+ const runParams: CreateAutonomyRunParams = {
    trigger: params.prepared.trigger,
    prompt: value,
    rootDir,
    currentDir,
    sourceId: params.sourceId,
    sourceLabel: params.sourceLabel,
+   ownerKey: params.ownerKey,
    flow: params.flow,
- })
+ }
+ const useDedup = skipWhenActiveSource && Boolean(params.sourceId)
+ const run = await createAutonomyRunCore(runParams, useDedup)
+ if (!run) {
+   return null
+ }
+ commitPreparedAutonomyTurn(params.prepared)
|
||||
const origin = {
|
||||
kind: 'autonomy',
|
||||
trigger: params.prepared.trigger,
|
||||
@@ -683,6 +1027,7 @@ export async function commitAutonomyQueuedPrompt(params: {
|
||||
workload: params.workload,
|
||||
autonomy: {
|
||||
runId: run.runId,
|
||||
rootDir: run.rootDir,
|
||||
trigger: params.prepared.trigger,
|
||||
sourceId: params.sourceId,
|
||||
sourceLabel: params.sourceLabel,
|
||||
|
||||
@@ -19,6 +19,7 @@ import {
|
||||
} from '../types/textInputTypes.js'
|
||||
import { createAbortController } from './abortController.js'
|
||||
import type { PastedContent } from './config.js'
|
||||
import { getCwd } from './cwd.js'
|
||||
import { logForDebugging } from './debug.js'
|
||||
import type { EffortValue } from './effort.js'
|
||||
import type { FileHistoryState } from './fileHistory.js'
|
||||
@@ -27,11 +28,9 @@ import { gracefulShutdownSync } from './gracefulShutdown.js'
|
||||
import { enqueue } from './messageQueueManager.js'
|
||||
import { resolveSkillModelOverride } from './model/model.js'
|
||||
import {
|
||||
-   finalizeAutonomyRunCompleted,
-   finalizeAutonomyRunFailed,
-   markAutonomyRunFailed,
-   markAutonomyRunRunning,
- } from './autonomyRuns.js'
+   claimConsumableQueuedAutonomyCommands,
+   finalizeAutonomyCommandsForTurn,
+ } from './autonomyQueueLifecycle.js'
|
||||
import type { ProcessUserInputContext } from './processUserInput/processUserInput.js'
|
||||
import { processUserInput } from './processUserInput/processUserInput.js'
|
||||
import type { QueryGuard } from './QueryGuard.js'
|
||||
@@ -459,7 +458,14 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
||||
// Iterate all commands uniformly. First command gets attachments +
|
||||
// ideSelection + pastedContents, rest skip attachments to avoid
|
||||
// duplicating turn-level context (IDE selection, todos, diffs).
|
||||
- const commands = queuedCommands ?? []
+ let commands = queuedCommands ?? []
+ const queuedAutonomyClaim =
+   await claimConsumableQueuedAutonomyCommands(commands)
+ commands = queuedAutonomyClaim.attachmentCommands
+ const claimedAutonomyCommands = queuedAutonomyClaim.claimedCommands
+ if (commands.length === 0) {
+   return
+ }
|
||||
|
||||
// Compute the workload tag for this turn. queueProcessor can batch a
|
||||
// cron prompt with a same-tick human prompt; only tag when EVERY
|
||||
@@ -471,7 +477,7 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
||||
commands.every(c => c.workload === firstWorkload)
|
||||
? firstWorkload
|
||||
: undefined
|
||||
- let autonomyRunIds: string[] | undefined
+ const deferredAutonomyRunIds = new Set<string>()
|
||||
|
||||
// Wrap the entire turn (processUserInput loop + onQuery) in an
|
||||
// AsyncLocalStorage context. This is the ONLY way to correctly
|
||||
@@ -486,10 +492,7 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
||||
for (let i = 0; i < commands.length; i++) {
|
||||
const cmd = commands[i]!
|
||||
const isFirst = i === 0
|
||||
- if (cmd.autonomy?.runId) {
-   ;(autonomyRunIds ??= []).push(cmd.autonomy.runId)
-   await markAutonomyRunRunning(cmd.autonomy.runId)
- }
+ const runId = cmd.autonomy?.runId
|
||||
const result = await processUserInput({
|
||||
input: cmd.value,
|
||||
preExpansionInput: cmd.preExpansionValue,
|
||||
@@ -510,7 +513,11 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
||||
bridgeOrigin: cmd.bridgeOrigin,
|
||||
isMeta: cmd.isMeta,
|
||||
skipAttachments: !isFirst,
|
||||
+ autonomy: cmd.autonomy,
  })
+ if (runId && result.deferAutonomyCompletion) {
+   deferredAutonomyRunIds.add(runId)
+ }
|
||||
// Stamp origin here rather than threading another arg through
|
||||
// processUserInput → processUserInputBase → processTextPrompt → createUserMessage.
|
||||
// Derive origin from mode for task-notifications — mirrors the origin
|
||||
@@ -611,26 +618,35 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
||||
}
|
||||
}
|
||||
}) // end runWithWorkload — ALS context naturally scoped, no finally needed
|
||||
if (autonomyRunIds?.length) {
|
||||
for (const runId of autonomyRunIds) {
|
||||
const nextCommands = await finalizeAutonomyRunCompleted({
|
||||
runId,
|
||||
priority: 'later',
|
||||
workload: turnWorkload,
|
||||
})
|
||||
for (const nextCommand of nextCommands) {
|
||||
enqueue(nextCommand)
|
||||
}
|
||||
if (claimedAutonomyCommands.length) {
|
||||
const finalizableCommands = claimedAutonomyCommands.filter(command => {
|
||||
const runId = command.autonomy?.runId
|
||||
return !runId || !deferredAutonomyRunIds.has(runId)
|
||||
})
|
||||
const nextCommands = await finalizeAutonomyCommandsForTurn({
|
||||
commands: finalizableCommands,
|
||||
outcome: { type: 'completed' },
|
||||
currentDir: getCwd(),
|
||||
priority: 'later',
|
||||
workload: turnWorkload,
|
||||
})
|
||||
for (const nextCommand of nextCommands) {
|
||||
enqueue(nextCommand)
|
||||
}
|
||||
}
|
||||
} catch (error) {
|
||||
if (autonomyRunIds?.length) {
|
||||
for (const runId of autonomyRunIds) {
|
||||
await finalizeAutonomyRunFailed({
|
||||
runId,
|
||||
error: String(error),
|
||||
})
|
||||
}
|
||||
if (claimedAutonomyCommands.length) {
|
||||
const finalizableCommands = claimedAutonomyCommands.filter(command => {
|
||||
const runId = command.autonomy?.runId
|
||||
return !runId || !deferredAutonomyRunIds.has(runId)
|
||||
})
|
||||
await finalizeAutonomyCommandsForTurn({
|
||||
commands: finalizableCommands,
|
||||
outcome: { type: 'failed', error },
|
||||
currentDir: getCwd(),
|
||||
priority: 'later',
|
||||
workload: turnWorkload,
|
||||
})
|
||||
}
|
||||
throw error
|
||||
}
|
||||
|
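The finalization filter in the hunk above can be isolated as a small pure function. The sketch below is a simplified illustration, not the repository's code: `QueuedCommandSketch` and `selectFinalizableCommands` are hypothetical names, and the real `QueuedCommand` type carries many more fields. It shows why a run that was handed off to detached background work (its id is in the deferred set) is skipped by the turn-level finalizer, while commands without a run id still pass through.

```typescript
// Hypothetical, reduced shape of a queued command (assumption for illustration).
type QueuedCommandSketch = {
  value: string
  autonomy?: { runId: string }
}

// Keep the claimed commands that the current turn should finalize itself:
// commands with no run id, and runs that were NOT deferred to background work.
function selectFinalizableCommands(
  claimed: QueuedCommandSketch[],
  deferredRunIds: ReadonlySet<string>,
): QueuedCommandSketch[] {
  return claimed.filter(command => {
    const runId = command.autonomy?.runId
    return !runId || !deferredRunIds.has(runId)
  })
}

const claimedSketch: QueuedCommandSketch[] = [
  { value: '/forked review', autonomy: { runId: 'run-a' } },
  { value: 'summarize logs', autonomy: { runId: 'run-b' } },
  { value: 'plain prompt' },
]
// 'run-a' started detached work, so its own completion path finalizes it later.
const deferredSketch = new Set<string>(['run-a'])
const finalizableSketch = selectFinalizableCommands(claimedSketch, deferredSketch)
console.log(finalizableSketch.map(command => command.value))
```

Both the success path and the `catch` path in the diff apply the same filter, so a deferred run is neither completed nor failed twice.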
||||
@@ -1,173 +1,162 @@
|
||||
import { describe, expect, test, beforeEach, afterEach } from "bun:test";
|
||||
import { mock } from "bun:test";
|
||||
import { describe, expect, test, beforeEach, afterEach } from 'bun:test'
|
||||
|
||||
let mockedModelType: "gemini" | undefined;
|
||||
const { getAPIProvider, isFirstPartyAnthropicBaseUrl } = await import(
|
||||
'../providers'
|
||||
)
|
||||
|
||||
mock.module("../../settings/settings.js", () => ({
|
||||
getInitialSettings: () =>
|
||||
mockedModelType ? { modelType: mockedModelType } : {},
|
||||
}));
|
||||
|
||||
const { getAPIProvider, isFirstPartyAnthropicBaseUrl } =
|
||||
await import("../providers");
|
||||
|
||||
describe("getAPIProvider", () => {
|
||||
describe('getAPIProvider', () => {
|
||||
const envKeys = [
|
||||
"CLAUDE_CODE_USE_GEMINI",
|
||||
"CLAUDE_CODE_USE_BEDROCK",
|
||||
"CLAUDE_CODE_USE_VERTEX",
|
||||
"CLAUDE_CODE_USE_FOUNDRY",
|
||||
"CLAUDE_CODE_USE_OPENAI",
|
||||
] as const;
|
||||
const savedEnv: Record<string, string | undefined> = {};
|
||||
|
||||
'CLAUDE_CODE_USE_GEMINI',
|
||||
'CLAUDE_CODE_USE_BEDROCK',
|
||||
'CLAUDE_CODE_USE_VERTEX',
|
||||
'CLAUDE_CODE_USE_FOUNDRY',
|
||||
'CLAUDE_CODE_USE_OPENAI',
|
||||
'CLAUDE_CODE_USE_GROK',
|
||||
] as const
|
||||
const savedEnv: Record<string, string | undefined> = {}
|
||||
|
||||
beforeEach(() => {
|
||||
// Save and clear environment variables
|
||||
mockedModelType = undefined;
|
||||
for (const key of envKeys) {
|
||||
savedEnv[key] = process.env[key];
|
||||
delete process.env[key];
|
||||
savedEnv[key] = process.env[key]
|
||||
delete process.env[key]
|
||||
}
|
||||
});
|
||||
})
|
||||
|
||||
afterEach(() => {
|
||||
// Restore environment variables
|
||||
mockedModelType = undefined;
|
||||
for (const key of envKeys) {
|
||||
if (savedEnv[key] !== undefined) {
|
||||
process.env[key] = savedEnv[key];
|
||||
process.env[key] = savedEnv[key]
|
||||
} else {
|
||||
delete process.env[key];
|
||||
delete process.env[key]
|
||||
}
|
||||
}
|
||||
});
|
||||
})
|
||||
|
||||
test('returns "firstParty" by default', () => {
|
||||
expect(getAPIProvider()).toBe("firstParty");
|
||||
});
|
||||
expect(getAPIProvider({})).toBe('firstParty')
|
||||
})
|
||||
|
||||
test('returns "gemini" when modelType is gemini', () => {
|
||||
mockedModelType = "gemini";
|
||||
expect(getAPIProvider()).toBe("gemini");
|
||||
});
|
||||
expect(getAPIProvider({ modelType: 'gemini' })).toBe('gemini')
|
||||
})
|
||||
|
||||
test("modelType takes precedence over environment variables", () => {
|
||||
mockedModelType = "gemini";
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||
expect(getAPIProvider()).toBe("gemini");
|
||||
});
|
||||
test('modelType takes precedence over environment variables', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
||||
expect(getAPIProvider({ modelType: 'gemini' })).toBe('gemini')
|
||||
})
|
||||
|
||||
test('returns "gemini" when CLAUDE_CODE_USE_GEMINI is set', () => {
|
||||
process.env.CLAUDE_CODE_USE_GEMINI = "1";
|
||||
expect(getAPIProvider()).toBe("gemini");
|
||||
});
|
||||
process.env.CLAUDE_CODE_USE_GEMINI = '1'
|
||||
expect(getAPIProvider({})).toBe('gemini')
|
||||
})
|
||||
|
||||
test('returns "bedrock" when CLAUDE_CODE_USE_BEDROCK is set', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||
expect(getAPIProvider()).toBe("bedrock");
|
||||
});
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
||||
expect(getAPIProvider({})).toBe('bedrock')
|
||||
})
|
||||
|
||||
test('returns "vertex" when CLAUDE_CODE_USE_VERTEX is set', () => {
|
||||
process.env.CLAUDE_CODE_USE_VERTEX = "1";
|
||||
expect(getAPIProvider()).toBe("vertex");
|
||||
});
|
||||
process.env.CLAUDE_CODE_USE_VERTEX = '1'
|
||||
expect(getAPIProvider({})).toBe('vertex')
|
||||
})
|
||||
|
||||
test('returns "foundry" when CLAUDE_CODE_USE_FOUNDRY is set', () => {
|
||||
process.env.CLAUDE_CODE_USE_FOUNDRY = "1";
|
||||
expect(getAPIProvider()).toBe("foundry");
|
||||
});
|
||||
process.env.CLAUDE_CODE_USE_FOUNDRY = '1'
|
||||
expect(getAPIProvider({})).toBe('foundry')
|
||||
})
|
||||
|
||||
test("bedrock takes precedence over gemini", () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||
process.env.CLAUDE_CODE_USE_GEMINI = "1";
|
||||
expect(getAPIProvider()).toBe("bedrock");
|
||||
});
|
||||
test('bedrock takes precedence over gemini', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
||||
process.env.CLAUDE_CODE_USE_GEMINI = '1'
|
||||
expect(getAPIProvider({})).toBe('bedrock')
|
||||
})
|
||||
|
||||
test("bedrock takes precedence over vertex", () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||
process.env.CLAUDE_CODE_USE_VERTEX = "1";
|
||||
expect(getAPIProvider()).toBe("bedrock");
|
||||
});
|
||||
test('bedrock takes precedence over vertex', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
||||
process.env.CLAUDE_CODE_USE_VERTEX = '1'
|
||||
expect(getAPIProvider({})).toBe('bedrock')
|
||||
})
|
||||
|
||||
test("bedrock wins when all three env vars are set", () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||
process.env.CLAUDE_CODE_USE_VERTEX = "1";
|
||||
process.env.CLAUDE_CODE_USE_FOUNDRY = "1";
|
||||
expect(getAPIProvider()).toBe("bedrock");
|
||||
});
|
||||
test('bedrock wins when all three env vars are set', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
||||
process.env.CLAUDE_CODE_USE_VERTEX = '1'
|
||||
process.env.CLAUDE_CODE_USE_FOUNDRY = '1'
|
||||
expect(getAPIProvider({})).toBe('bedrock')
|
||||
})
|
||||
|
||||
test('"true" is truthy', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "true";
|
||||
expect(getAPIProvider()).toBe("bedrock");
|
||||
});
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = 'true'
|
||||
expect(getAPIProvider({})).toBe('bedrock')
|
||||
})
|
||||
|
||||
test('"0" is not truthy', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "0";
|
||||
expect(getAPIProvider()).toBe("firstParty");
|
||||
});
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = '0'
|
||||
expect(getAPIProvider({})).toBe('firstParty')
|
||||
})
|
||||
|
||||
test('empty string is not truthy', () => {
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = "";
|
||||
expect(getAPIProvider()).toBe("firstParty");
|
||||
});
|
||||
});
|
||||
process.env.CLAUDE_CODE_USE_BEDROCK = ''
|
||||
expect(getAPIProvider({})).toBe('firstParty')
|
||||
})
|
||||
})
|
||||
|
||||
describe("isFirstPartyAnthropicBaseUrl", () => {
|
||||
const originalBaseUrl = process.env.ANTHROPIC_BASE_URL;
|
||||
const originalUserType = process.env.USER_TYPE;
|
||||
describe('isFirstPartyAnthropicBaseUrl', () => {
|
||||
const originalBaseUrl = process.env.ANTHROPIC_BASE_URL
|
||||
const originalUserType = process.env.USER_TYPE
|
||||
|
||||
afterEach(() => {
|
||||
if (originalBaseUrl !== undefined) {
|
||||
process.env.ANTHROPIC_BASE_URL = originalBaseUrl;
|
||||
process.env.ANTHROPIC_BASE_URL = originalBaseUrl
|
||||
} else {
|
||||
delete process.env.ANTHROPIC_BASE_URL;
|
||||
delete process.env.ANTHROPIC_BASE_URL
|
||||
}
|
||||
if (originalUserType !== undefined) {
|
||||
process.env.USER_TYPE = originalUserType;
|
||||
process.env.USER_TYPE = originalUserType
|
||||
} else {
|
||||
delete process.env.USER_TYPE;
|
||||
delete process.env.USER_TYPE
|
||||
}
|
||||
});
|
||||
})
|
||||
|
||||
test("returns true when ANTHROPIC_BASE_URL is not set", () => {
|
||||
delete process.env.ANTHROPIC_BASE_URL;
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||
});
|
||||
test('returns true when ANTHROPIC_BASE_URL is not set', () => {
|
||||
delete process.env.ANTHROPIC_BASE_URL
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
||||
})
|
||||
|
||||
test("returns true for api.anthropic.com", () => {
|
||||
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com";
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||
});
|
||||
test('returns true for api.anthropic.com', () => {
|
||||
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com'
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
||||
})
|
||||
|
||||
test("returns false for custom URL", () => {
|
||||
process.env.ANTHROPIC_BASE_URL = "https://my-proxy.com";
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
|
||||
});
|
||||
test('returns false for custom URL', () => {
|
||||
process.env.ANTHROPIC_BASE_URL = 'https://my-proxy.com'
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
|
||||
})
|
||||
|
||||
test("returns false for invalid URL", () => {
|
||||
process.env.ANTHROPIC_BASE_URL = "not-a-url";
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
|
||||
});
|
||||
test('returns false for invalid URL', () => {
|
||||
process.env.ANTHROPIC_BASE_URL = 'not-a-url'
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
|
||||
})
|
||||
|
||||
test("returns true for staging URL when USER_TYPE is ant", () => {
|
||||
process.env.ANTHROPIC_BASE_URL = "https://api-staging.anthropic.com";
|
||||
process.env.USER_TYPE = "ant";
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||
});
|
||||
test('returns true for staging URL when USER_TYPE is ant', () => {
|
||||
process.env.ANTHROPIC_BASE_URL = 'https://api-staging.anthropic.com'
|
||||
process.env.USER_TYPE = 'ant'
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
||||
})
|
||||
|
||||
test("returns true for URL with path", () => {
|
||||
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com/v1";
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||
});
|
||||
test('returns true for URL with path', () => {
|
||||
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com/v1'
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
||||
})
|
||||
|
||||
test("returns true for trailing slash", () => {
|
||||
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com/";
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||
});
|
||||
test('returns true for trailing slash', () => {
|
||||
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com/'
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
||||
})
|
||||
|
||||
test("returns false for subdomain attack", () => {
|
||||
process.env.ANTHROPIC_BASE_URL = "https://evil-api.anthropic.com";
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
|
||||
});
|
||||
});
|
||||
test('returns false for subdomain attack', () => {
|
||||
process.env.ANTHROPIC_BASE_URL = 'https://evil-api.anthropic.com'
|
||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
|
||||
})
|
||||
})
|
||||
|
||||
@@ -1,5 +1,6 @@
|
||||
import type { AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS } from '../../services/analytics/index.js'
|
||||
import { getInitialSettings } from '../settings/settings.js'
|
||||
import type { SettingsJson } from '../settings/types.js'
|
||||
import { isEnvTruthy } from '../envUtils.js'
|
||||
|
||||
export type APIProvider =
|
||||
@@ -11,8 +12,10 @@ export type APIProvider =
|
||||
| 'gemini'
|
||||
| 'grok'
|
||||
|
||||
export function getAPIProvider(): APIProvider {
|
||||
const modelType = getInitialSettings().modelType
|
||||
export function getAPIProvider(
|
||||
settings: Pick<SettingsJson, 'modelType'> = getInitialSettings(),
|
||||
): APIProvider {
|
||||
const modelType = settings.modelType
|
||||
if (modelType === 'openai') return 'openai'
|
||||
if (modelType === 'gemini') return 'gemini'
|
||||
if (modelType === 'grok') return 'grok'
|
||||
|
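The signature change above replaces a module mock with a defaulted parameter, which is what lets the rewritten tests call `getAPIProvider({})` or `getAPIProvider({ modelType: 'gemini' })` directly. A minimal sketch of that pattern follows. Everything in it is an assumption for illustration: the `*Sketch` names are hypothetical, the environment is passed in as a plain object rather than read from `process.env`, the truthy-value list is a guess, and the order of the environment checks after Bedrock is inferred only from the precedence tests in this diff.

```typescript
type APIProviderSketch = 'firstParty' | 'bedrock' | 'vertex' | 'foundry' | 'gemini'
type SettingsSketch = { modelType?: 'gemini' }
type EnvSketch = Record<string, string | undefined>

// Hypothetical stand-in for isEnvTruthy; the accepted values are an assumption.
function isEnvTruthySketch(value: string | undefined): boolean {
  if (!value) return false
  return ['1', 'true', 'yes', 'on'].includes(value.trim().toLowerCase())
}

// Hypothetical stand-in for getInitialSettings(): only used when no argument is passed.
function loadSettingsSketch(): SettingsSketch {
  return {}
}

// Settings default to the loader, so production callers pass nothing and
// tests pass an explicit object instead of mocking the settings module.
function getAPIProviderSketch(
  settings: SettingsSketch = loadSettingsSketch(),
  env: EnvSketch = {},
): APIProviderSketch {
  if (settings.modelType === 'gemini') return 'gemini'
  if (isEnvTruthySketch(env['CLAUDE_CODE_USE_BEDROCK'])) return 'bedrock'
  if (isEnvTruthySketch(env['CLAUDE_CODE_USE_VERTEX'])) return 'vertex'
  if (isEnvTruthySketch(env['CLAUDE_CODE_USE_FOUNDRY'])) return 'foundry'
  if (isEnvTruthySketch(env['CLAUDE_CODE_USE_GEMINI'])) return 'gemini'
  return 'firstParty'
}

console.log(getAPIProviderSketch({ modelType: 'gemini' }, { CLAUDE_CODE_USE_BEDROCK: '1' }))
```

Because the default is evaluated only when the argument is omitted, passing `{}` in a test bypasses the settings loader entirely.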
||||
375
src/utils/processUserInput/__tests__/processSlashCommand.test.ts
Normal file
@@ -0,0 +1,375 @@
|
||||
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
|
||||
import type { QueuedCommand } from '../../../types/textInputTypes'
|
||||
import {
|
||||
resetStateForTests,
|
||||
setCwdState,
|
||||
setOriginalCwd,
|
||||
setProjectRoot,
|
||||
} from '../../../bootstrap/state'
|
||||
import {
|
||||
createAutonomyQueuedPrompt,
|
||||
getAutonomyRunById,
|
||||
listAutonomyRuns,
|
||||
markAutonomyRunRunning,
|
||||
} from '../../autonomyRuns'
|
||||
import { resetAutonomyAuthorityForTests } from '../../autonomyAuthority'
|
||||
import { createScheduledTaskQueuedCommand } from '../../../hooks/useScheduledTasks'
|
||||
import {
|
||||
cleanupTempDir,
|
||||
createTempDir,
|
||||
} from '../../../../tests/mocks/file-system'
|
||||
|
||||
let runAgentBlocker: Promise<void> | null = null
|
||||
let releaseRunAgentBlocker: (() => void) | null = null
|
||||
let runAgentStartCount = 0
|
||||
let originalNodeEnv: string | undefined
|
||||
let originalAnthropicApiKey: string | undefined
|
||||
const commandQueue: QueuedCommand[] = []
|
||||
|
||||
function enqueue(command: QueuedCommand): void {
|
||||
commandQueue.push({ ...command, priority: command.priority ?? 'next' })
|
||||
}
|
||||
|
||||
function enqueuePendingNotification(command: QueuedCommand): void {
|
||||
commandQueue.push({ ...command, priority: command.priority ?? 'later' })
|
||||
}
|
||||
|
||||
function getCommandQueue(): QueuedCommand[] {
|
||||
return [...commandQueue]
|
||||
}
|
||||
|
||||
function hasCommandsInQueue(): boolean {
|
||||
return commandQueue.length > 0
|
||||
}
|
||||
|
||||
function resetCommandQueue(): void {
|
||||
commandQueue.length = 0
|
||||
}
|
||||
|
||||
function createMessageQueueManagerMock() {
|
||||
return {
|
||||
enqueue,
|
||||
enqueuePendingNotification,
|
||||
getCommandQueue,
|
||||
hasCommandsInQueue,
|
||||
resetCommandQueue,
|
||||
}
|
||||
}
|
||||
|
||||
function holdRunAgent(): void {
|
||||
runAgentBlocker = new Promise(resolve => {
|
||||
releaseRunAgentBlocker = resolve
|
||||
})
|
||||
}
|
||||
|
||||
function releaseRunAgent(): void {
|
||||
releaseRunAgentBlocker?.()
|
||||
runAgentBlocker = null
|
||||
releaseRunAgentBlocker = null
|
||||
}
|
||||
|
||||
mock.module('bun:bundle', () => ({
|
||||
feature: (name: string) => name === 'KAIROS',
|
||||
}))
|
||||
|
||||
mock.module(
|
||||
'@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
|
||||
() => ({
|
||||
runAgent: async function* () {
|
||||
runAgentStartCount += 1
|
||||
if (runAgentBlocker) {
|
||||
await runAgentBlocker
|
||||
}
|
||||
yield {
|
||||
type: 'assistant',
|
||||
uuid: 'assistant-1',
|
||||
timestamp: new Date().toISOString(),
|
||||
message: {
|
||||
id: 'msg_1',
|
||||
type: 'message',
|
||||
role: 'assistant',
|
||||
model: 'test-model',
|
||||
content: [{ type: 'text', text: 'forked command done' }],
|
||||
stop_reason: 'end_turn',
|
||||
stop_sequence: null,
|
||||
usage: {
|
||||
input_tokens: 0,
|
||||
output_tokens: 0,
|
||||
},
|
||||
},
|
||||
}
|
||||
},
|
||||
}),
|
||||
)
|
||||
|
||||
mock.module('@claude-code-best/builtin-tools/tools/AgentTool/UI.js', () => ({
|
||||
AgentPromptDisplay: () => null,
|
||||
AgentResponseDisplay: () => null,
|
||||
extractLastToolInfo: () => null,
|
||||
renderGroupedAgentToolUse: () => null,
|
||||
renderToolResultMessage: () => null,
|
||||
renderToolUseErrorMessage: () => null,
|
||||
renderToolUseMessage: () => null,
|
||||
renderToolUseProgressMessage: () => null,
|
||||
renderToolUseRejectedMessage: () => null,
|
||||
renderToolUseTag: () => null,
|
||||
userFacingName: () => 'Agent',
|
||||
userFacingNameBackgroundColor: () => 'gray',
|
||||
}))
|
||||
|
||||
mock.module('../../messageQueueManager', createMessageQueueManagerMock)
|
||||
mock.module('../../messageQueueManager.js', createMessageQueueManagerMock)
|
||||
|
||||
const { processSlashCommand } = await import('../processSlashCommand')
|
||||
|
||||
let tempDir = ''
|
||||
|
||||
function createScheduledTaskQueuedCommandForTest(task: {
|
||||
id: string
|
||||
prompt: string
|
||||
}) {
|
||||
return createScheduledTaskQueuedCommand(task, {
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
})
|
||||
}
|
||||
|
||||
async function waitForRunStatus(
|
||||
runId: string,
|
||||
status: 'queued' | 'running' | 'completed' | 'failed' | 'cancelled',
|
||||
): Promise<void> {
|
||||
for (let i = 0; i < 200; i++) {
|
||||
const run = await getAutonomyRunById(runId, tempDir)
|
||||
if (run?.status === status) {
|
||||
return
|
||||
}
|
||||
await new Promise(resolve => setTimeout(resolve, 10))
|
||||
}
|
||||
const run = await getAutonomyRunById(runId, tempDir)
|
||||
throw new Error(`Expected ${runId} to be ${status}, got ${run?.status}`)
|
||||
}
|
||||
|
||||
async function waitForRunAgentStarts(expected: number): Promise<void> {
|
||||
for (let i = 0; i < 200; i++) {
|
||||
if (runAgentStartCount >= expected) {
|
||||
return
|
||||
}
|
||||
await new Promise(resolve => setTimeout(resolve, 10))
|
||||
}
|
||||
throw new Error(
|
||||
`Expected runAgent to start ${expected} time(s), got ${runAgentStartCount}`,
|
||||
)
|
||||
}
|
||||
|
||||
async function waitForCommandQueueLength(expected: number): Promise<void> {
|
||||
for (let i = 0; i < 200; i++) {
|
||||
if (getCommandQueue().length === expected) {
|
||||
return
|
||||
}
|
||||
await new Promise(resolve => setTimeout(resolve, 10))
|
||||
}
|
||||
throw new Error(
|
||||
`Expected command queue length ${expected}, got ${getCommandQueue().length}`,
|
||||
)
|
||||
}
|
||||
|
||||
beforeEach(async () => {
|
||||
tempDir = await createTempDir('process-slash-command-')
|
||||
originalNodeEnv = process.env.NODE_ENV
|
||||
originalAnthropicApiKey = process.env.ANTHROPIC_API_KEY
|
||||
process.env.NODE_ENV = 'test'
|
||||
process.env.ANTHROPIC_API_KEY = 'test-key'
|
||||
runAgentBlocker = null
|
||||
releaseRunAgentBlocker = null
|
||||
runAgentStartCount = 0
|
||||
resetStateForTests()
|
||||
resetAutonomyAuthorityForTests()
|
||||
resetCommandQueue()
|
||||
setOriginalCwd(tempDir)
|
||||
setProjectRoot(tempDir)
|
||||
setCwdState(tempDir)
|
||||
})
|
||||
|
||||
afterEach(async () => {
|
||||
releaseRunAgent()
|
||||
if (originalNodeEnv === undefined) {
|
||||
delete process.env.NODE_ENV
|
||||
} else {
|
||||
process.env.NODE_ENV = originalNodeEnv
|
||||
}
|
||||
if (originalAnthropicApiKey === undefined) {
|
||||
delete process.env.ANTHROPIC_API_KEY
|
||||
} else {
|
||||
process.env.ANTHROPIC_API_KEY = originalAnthropicApiKey
|
||||
}
|
||||
resetStateForTests()
|
||||
resetAutonomyAuthorityForTests()
|
||||
resetCommandQueue()
|
||||
if (tempDir) {
|
||||
await cleanupTempDir(tempDir)
|
||||
}
|
||||
mock.restore()
|
||||
})
|
||||
|
||||
describe('processSlashCommand', () => {
|
||||
const forkedCommand = {
|
||||
type: 'prompt',
|
||||
name: 'forked',
|
||||
description: 'test forked command',
|
||||
progressMessage: 'forking',
|
||||
contentLength: 0,
|
||||
source: 'builtin',
|
||||
context: 'fork',
|
||||
getPromptForCommand: async () => [
|
||||
{ type: 'text', text: 'review from fork' },
|
||||
],
|
||||
} as const
|
||||
|
||||
function createContext() {
|
||||
return {
|
||||
getAppState: () => ({
|
||||
kairosEnabled: true,
|
||||
mcp: { clients: [] },
|
||||
toolPermissionContext: {
|
||||
mode: 'default',
|
||||
alwaysAllowRules: {},
|
||||
},
|
||||
}),
|
||||
options: {
|
||||
commands: [forkedCommand],
|
||||
allowBackgroundForkedSlashCommands: true,
|
||||
tools: [],
|
||||
refreshTools: () => [],
|
||||
agentDefinitions: {
|
||||
activeAgents: [{ agentType: 'general-purpose' }],
|
||||
},
|
||||
},
|
||||
setResponseLength: mock((_updater: (length: number) => number) => {}),
|
||||
} as any
|
||||
}
|
||||
|
||||
test('defers autonomy completion until a KAIROS background forked command completes', async () => {
|
||||
const queued = await createAutonomyQueuedPrompt({
|
||||
basePrompt: '/forked review',
|
||||
trigger: 'scheduled-task',
|
||||
rootDir: tempDir,
|
||||
currentDir: tempDir,
|
||||
sourceId: 'cron-1',
|
||||
})
|
||||
expect(queued).not.toBeNull()
|
||||
const runId = queued!.autonomy!.runId
|
||||
await markAutonomyRunRunning(runId, tempDir, 100)
|
||||
|
||||
const result = await processSlashCommand(
|
||||
'/forked review',
|
||||
[],
|
||||
[],
|
||||
[],
|
||||
createContext(),
|
||||
mock(() => {}),
|
||||
undefined,
|
||||
false,
|
||||
async () => ({ behavior: 'allow', updatedInput: {} }) as any,
|
||||
queued!.autonomy,
|
||||
)
|
||||
|
||||
expect(result).toMatchObject({
|
||||
messages: [],
|
||||
shouldQuery: false,
|
||||
deferAutonomyCompletion: true,
|
||||
})
|
||||
|
||||
await waitForRunStatus(runId, 'completed')
|
||||
await waitForCommandQueueLength(1)
|
||||
expect(getCommandQueue()).toEqual([
|
||||
expect.objectContaining({
|
||||
mode: 'prompt',
|
||||
isMeta: true,
|
||||
skipSlashCommands: true,
|
||||
value: expect.stringContaining(
|
||||
'<scheduled-task-result command="/forked">',
|
||||
),
|
||||
}),
|
||||
])
|
||||
})
|
||||
|
||||
test('keeps repeated /loop scheduled fires bounded while a background fork is running', async () => {
|
||||
const task = {
|
||||
id: 'cron-loop',
|
||||
prompt: '/forked review',
|
||||
}
|
||||
const first = await createScheduledTaskQueuedCommandForTest(task)
|
||||
expect(first?.autonomy?.runId).toBeDefined()
|
||||
const runId = first!.autonomy!.runId
|
||||
await markAutonomyRunRunning(runId, tempDir, 100)
|
||||
|
||||
holdRunAgent()
|
||||
const result = await processSlashCommand(
|
||||
'/forked review',
|
||||
[],
|
||||
[],
|
||||
[],
|
||||
createContext(),
|
||||
mock(() => {}),
|
||||
undefined,
|
||||
false,
|
||||
async () => ({ behavior: 'allow', updatedInput: {} }) as any,
|
||||
first!.autonomy,
|
||||
)
|
||||
|
||||
expect(result.deferAutonomyCompletion).toBe(true)
|
||||
await waitForRunAgentStarts(1)
|
||||
|
||||
const repeatedFires = await Promise.all(
|
||||
Array.from({ length: 200 }, () =>
|
||||
createScheduledTaskQueuedCommandForTest(task),
|
||||
),
|
||||
)
|
||||
expect(repeatedFires.every(command => command === null)).toBe(true)
|
||||
expect(
|
||||
(await listAutonomyRuns(tempDir)).filter(
|
||||
run => run.sourceId === 'cron-loop',
|
||||
),
|
||||
).toHaveLength(1)
|
||||
expect(getCommandQueue()).toHaveLength(0)
|
||||
|
||||
releaseRunAgent()
|
||||
await waitForRunStatus(runId, 'completed')
|
||||
await waitForCommandQueueLength(1)
|
||||
expect(getCommandQueue()).toHaveLength(1)
|
||||
|
||||
const next = await createScheduledTaskQueuedCommandForTest(task)
|
||||
expect(next?.autonomy?.runId).toBeDefined()
|
||||
expect(
|
||||
(await listAutonomyRuns(tempDir)).filter(
|
||||
run => run.sourceId === 'cron-loop',
|
||||
),
|
||||
).toHaveLength(2)
|
||||
})
|
||||
|
||||
test('rejects the background fork test override outside test runtime', async () => {
|
||||
process.env.NODE_ENV = 'production'
|
||||
|
||||
const result = await processSlashCommand(
|
||||
'/forked review',
|
||||
[],
|
||||
[],
|
||||
[],
|
||||
createContext(),
|
||||
mock(() => {}),
|
||||
undefined,
|
||||
false,
|
||||
async () => ({ behavior: 'allow', updatedInput: {} }) as any,
|
||||
)
|
||||
|
||||
expect(result.shouldQuery).toBe(false)
|
||||
expect(
|
||||
result.messages.some(message =>
|
||||
JSON.stringify(message).includes(
|
||||
'allowBackgroundForkedSlashCommands is test-only',
|
||||
),
|
||||
),
|
||||
).toBe(true)
|
||||
expect(runAgentStartCount).toBe(0)
|
||||
})
|
||||
})
|
||||
File diff suppressed because it is too large
@@ -28,6 +28,7 @@ import type {
|
||||
import type { PermissionMode } from '../../types/permissions.js'
|
||||
import {
|
||||
isValidImagePaste,
|
||||
type QueuedCommand,
|
||||
type PromptInputMode,
|
||||
} from '../../types/textInputTypes.js'
|
||||
import {
|
||||
@@ -80,6 +81,9 @@ export type ProcessUserInputBaseResult = {
|
||||
// Used by /discover to chain into the selected feature's command
|
||||
nextInput?: string
|
||||
submitNextInput?: boolean
|
||||
// When true, the command started detached work that will finalize its
|
||||
// autonomy run after the background work completes.
|
||||
deferAutonomyCompletion?: boolean
|
||||
}
|
||||
|
||||
export async function processUserInput({
|
||||
@@ -100,6 +104,7 @@ export async function processUserInput({
|
||||
bridgeOrigin,
|
||||
isMeta,
|
||||
skipAttachments,
|
||||
autonomy,
|
||||
}: {
|
||||
input: string | Array<ContentBlockParam>
|
||||
/**
|
||||
@@ -137,6 +142,7 @@ export async function processUserInput({
|
||||
*/
|
||||
isMeta?: boolean
|
||||
skipAttachments?: boolean
|
||||
autonomy?: QueuedCommand['autonomy']
|
||||
}): Promise<ProcessUserInputBaseResult> {
|
||||
const inputString = typeof input === 'string' ? input : null
|
||||
// Immediately show the user input prompt while we are still processing the input.
|
||||
@@ -168,6 +174,7 @@ export async function processUserInput({
|
||||
isMeta,
|
||||
skipAttachments,
|
||||
preExpansionInput,
|
||||
autonomy,
|
||||
)
|
||||
queryCheckpoint('query_process_user_input_base_end')
|
||||
|
||||
@@ -296,6 +303,7 @@ async function processUserInputBase(
|
||||
isMeta?: boolean,
|
||||
skipAttachments?: boolean,
|
||||
preExpansionInput?: string,
|
||||
autonomy?: QueuedCommand['autonomy'],
|
||||
): Promise<ProcessUserInputBaseResult> {
|
||||
let inputString: string | null = null
|
||||
let precedingInputBlocks: ContentBlockParam[] = []
|
||||
@@ -491,6 +499,7 @@ async function processUserInputBase(
|
||||
uuid,
|
||||
isAlreadyProcessing,
|
||||
canUseTool,
|
||||
autonomy,
|
||||
)
|
||||
return addImageMetadataMessage(slashResult, imageMetadataTexts)
|
||||
}
|
||||
@@ -549,6 +558,7 @@ async function processUserInputBase(
|
||||
uuid,
|
||||
isAlreadyProcessing,
|
||||
canUseTool,
|
||||
autonomy,
|
||||
)
|
||||
return addImageMetadataMessage(slashResult, imageMetadataTexts)
|
||||
}
|
||||
|
||||
@@ -424,8 +424,7 @@ function createInProcessCanUseTool(
|
||||
feedback: parsed.error,
|
||||
})
|
||||
}
|
||||
cleanup()
|
||||
return
|
||||
return // Callback already resolves the promise
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -675,6 +674,7 @@ type WaitResult =
|
||||
type: 'new_message'
|
||||
message: string
|
||||
autonomyRunId?: string
|
||||
autonomyRootDir?: string
|
||||
from: string
|
||||
color?: string
|
||||
summary?: string
|
||||
@@ -739,12 +739,16 @@ async function waitForNextPromptOrShutdown(
|
||||
`[inProcessRunner] ${identity.agentName} found pending user message (poll #${pollCount})`,
|
||||
)
|
||||
if (pending.autonomyRunId) {
|
||||
await markAutonomyRunRunning(pending.autonomyRunId)
|
||||
await markAutonomyRunRunning(
|
||||
pending.autonomyRunId,
|
||||
pending.autonomyRootDir,
|
||||
)
|
||||
}
|
||||
return {
|
||||
type: 'new_message',
|
||||
message: pending.message,
|
||||
autonomyRunId: pending.autonomyRunId,
|
||||
autonomyRootDir: pending.autonomyRootDir,
|
||||
from: 'user',
|
||||
}
|
||||
}
|
||||
@@ -1022,6 +1026,7 @@ export async function runInProcessTeammate(
|
||||
)
|
||||
let currentPrompt = wrappedInitialPrompt
|
||||
let currentAutonomyRunId: string | undefined
|
||||
let currentAutonomyRootDir: string | undefined
|
||||
let shouldExit = false
|
||||
|
||||
// Try to claim an available task immediately so the UI can show activity
|
||||
@@ -1319,12 +1324,21 @@ export async function runInProcessTeammate(
|
||||
setAppState,
|
||||
)
|
||||
if (currentAutonomyRunId) {
|
||||
await markAutonomyRunFailed(currentAutonomyRunId, ERROR_MESSAGE_USER_ABORT)
|
||||
await markAutonomyRunFailed(
|
||||
currentAutonomyRunId,
|
||||
ERROR_MESSAGE_USER_ABORT,
|
||||
currentAutonomyRootDir,
|
||||
)
|
||||
currentAutonomyRunId = undefined
|
||||
currentAutonomyRootDir = undefined
|
||||
}
|
||||
} else if (currentAutonomyRunId) {
|
||||
await markAutonomyRunCompleted(currentAutonomyRunId)
|
||||
await markAutonomyRunCompleted(
|
||||
currentAutonomyRunId,
|
||||
currentAutonomyRootDir,
|
||||
)
|
||||
currentAutonomyRunId = undefined
|
||||
currentAutonomyRootDir = undefined
|
||||
}
|
||||
|
||||
// Check if already idle before updating (to skip duplicate notification)
|
||||
@@ -1398,6 +1412,7 @@ export async function runInProcessTeammate(
|
||||
setAppState,
|
||||
)
|
||||
currentAutonomyRunId = undefined
|
||||
currentAutonomyRootDir = undefined
|
||||
break
|
||||
|
||||
case 'new_message':
|
||||
@@ -1410,6 +1425,7 @@ export async function runInProcessTeammate(
|
||||
if (waitResult.from === 'user') {
|
||||
currentPrompt = waitResult.message
|
||||
currentAutonomyRunId = waitResult.autonomyRunId
|
||||
currentAutonomyRootDir = waitResult.autonomyRootDir
|
||||
} else {
|
||||
currentPrompt = formatAsTeammateMessage(
|
||||
waitResult.from,
|
||||
@@ -1426,6 +1442,7 @@ export async function runInProcessTeammate(
|
||||
setAppState,
|
||||
)
|
||||
currentAutonomyRunId = undefined
|
||||
currentAutonomyRootDir = undefined
|
||||
}
|
||||
break
|
||||
|
||||
@@ -1533,7 +1550,11 @@ export async function runInProcessTeammate(
|
||||
})
|
||||
}
|
||||
if (currentAutonomyRunId) {
|
||||
await markAutonomyRunFailed(currentAutonomyRunId, errorMessage)
|
||||
await markAutonomyRunFailed(
|
||||
currentAutonomyRunId,
|
||||
errorMessage,
|
||||
currentAutonomyRootDir,
|
||||
)
|
||||
}
|
||||
|
||||
// Send idle notification with failure via file-based mailbox
|
||||
|
||||
@@ -234,7 +234,7 @@ export function killInProcessTeammate(
|
||||
let agentId: string | null = null
|
||||
let toolUseId: string | undefined
|
||||
let description: string | undefined
|
||||
let pendingAutonomyRunIds: string[] = []
|
||||
let pendingAutonomyRuns: Array<{ runId: string; rootDir?: string }> = []
|
||||
|
||||
setAppState((prev: AppState) => {
|
||||
const task = prev.tasks[taskId]
|
||||
@@ -255,9 +255,18 @@ export function killInProcessTeammate(
|
||||
description = teammateTask.description
|
||||
|
||||
// Capture pending autonomy run IDs before clearing them
|
||||
pendingAutonomyRunIds = teammateTask.pendingUserMessages
|
||||
.map(message => message.autonomyRunId)
|
||||
.filter((runId): runId is string => runId !== undefined)
|
||||
pendingAutonomyRuns = teammateTask.pendingUserMessages.flatMap(message =>
|
||||
message.autonomyRunId
|
||||
? [
|
||||
{
|
||||
runId: message.autonomyRunId,
|
||||
...(message.autonomyRootDir
|
||||
? { rootDir: message.autonomyRootDir }
|
||||
: {}),
|
||||
},
|
||||
]
|
||||
: [],
|
||||
)
|
||||
|
||||
// Abort the controller to stop execution
|
||||
teammateTask.abortController?.abort()
|
||||
@@ -311,10 +320,11 @@ export function killInProcessTeammate(
|
||||
}
|
||||
|
||||
if (killed) {
|
||||
for (const runId of pendingAutonomyRunIds) {
|
||||
for (const run of pendingAutonomyRuns) {
|
||||
void markAutonomyRunFailed(
|
||||
runId,
|
||||
run.runId,
|
||||
`Teammate ${agentId ?? taskId} was stopped before it could consume the queued autonomy prompt.`,
|
||||
run.rootDir,
|
||||
)
|
||||
}
|
||||
void evictTaskOutput(taskId)
|
||||
|
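The `killInProcessTeammate` hunk above switches from collecting bare run ids to collecting `{ runId, rootDir? }` pairs, so that the later `markAutonomyRunFailed` call can be pointed at the right root directory. A minimal sketch of that collection step, assuming a reduced message shape (`PendingMessageSketch`, `PendingRunSketch`, and `collectPendingAutonomyRuns` are hypothetical names, not the repository's):

```typescript
// Hypothetical, reduced shape of a pending user message (assumption for illustration).
type PendingMessageSketch = {
  message: string
  autonomyRunId?: string
  autonomyRootDir?: string
}

type PendingRunSketch = { runId: string; rootDir?: string }

// flatMap filters and maps in one pass: messages without a run id contribute
// an empty array, and rootDir is only set as a key when it is actually present.
function collectPendingAutonomyRuns(
  messages: PendingMessageSketch[],
): PendingRunSketch[] {
  return messages.flatMap(message =>
    message.autonomyRunId
      ? [
          {
            runId: message.autonomyRunId,
            ...(message.autonomyRootDir
              ? { rootDir: message.autonomyRootDir }
              : {}),
          },
        ]
      : [],
  )
}

const pendingRunsSketch = collectPendingAutonomyRuns([
  { message: 'tick', autonomyRunId: 'run-1', autonomyRootDir: '/tmp/project' },
  { message: 'hello from a human' },
  { message: 'tick', autonomyRunId: 'run-2' },
])
console.log(pendingRunsSketch)
```

Omitting the `rootDir` key, rather than setting it to `undefined`, keeps the captured objects minimal and leaves the fallback to whatever the callee does when no root directory is supplied.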
||||