mirror of
https://github.com/claude-code-best/claude-code.git
synced 2026-06-18 14:25:51 +00:00
# System Understanding Report: Loop / Scheduled Autonomy OOM

- **Flow id**: `recurring-bug-loop-oom` (pilot flow for autonomy ↔ deep-debug binding)
- **Branch**: `fix/loop-scheduled-autonomy-oom`
- **Worktree**: `E:\Source_code\Claude-code-bast-loop-scheduled-oom-fix`
- **Author**: back-filled from existing working-tree diff (no commits ahead of `main`)
- **Status**: `report` (this document), pending human approval before `regression-test` advances

---
## 1. Problem

### Symptom

Long-running sessions with active scheduled tasks (cron) and/or HEARTBEAT-driven proactive ticks accumulated memory steadily and eventually OOM'd the Bun process. The visible signature was:

- `runs.json` under `.claude/autonomy/` growing toward the 200-record cap with most entries stuck at `queued` or `running`
- The internal command queue in REPL / headless mode draining more slowly than scheduled fires arrived
- Each new fire calling `prepareAutonomyTurnPrompt`, which loads `AGENTS.md` + `HEARTBEAT.md` text and merges due-task lists into a fresh string, so every pending command held additional closure state
### Expected behaviour

When a scheduled task fires while its prior run is still queued or running, the new fire should be **skipped** rather than enqueued behind it. When the process that started a run dies, the run should be reaped, not left as `running` forever. Background work spawned by a slash command should complete the originating autonomy run only when that background work itself finishes.

### Actual behaviour (before fix)

1. `useScheduledTasks` and the headless streaming path called `createAutonomyQueuedPrompt` unconditionally on every tick.
2. `commitAutonomyQueuedPrompt` called `commitPreparedAutonomyTurn` *before* the run record was persisted, so even a duplicate fire that should have been dropped had already mutated heartbeat-task last-run state.
3. `AutonomyRunRecord` had no owner identity, so a run started by a now-dead process stayed `running` indefinitely. Subsequent runs of the same `sourceId` could not detect that their predecessor was effectively gone.
4. Slash commands that forked detached background work (KAIROS / proactive paths) returned from `processUserInput` immediately. The harness in `handlePromptSubmit` then called `finalizeAutonomyRunCompleted`, marking the run `succeeded` while the actual work continued in the background. The next scheduled tick of the same source could then race against that detached work, and any error in the detached work had no autonomy run to be attributed to.
### Reproduction shape

There is no single deterministic repro; the failure is load-induced. Rough recipe:

- Configure two `HEARTBEAT.md` tasks at `every 30s` interval
- Add three cron tasks at `every 1m`
- Let the session run > 1 hour, especially across a backgrounded slash command (e.g. KAIROS `/sleep`-style detached fork)
- Watch `.claude/autonomy/runs.json` active-status entry count and Bun heap RSS

### User impact

Sessions with long-lived autonomy/cron use cases were unsafe. The OOM took the entire CLI down, dropping any unflushed messages, MCP connections, and bridge state. Because `.claude/autonomy/` persists, a restart did not heal the session: stale `running` records from the dead PID kept blocking dedup logic on the next start.

---
## 2. System boundary

### In scope

- Autonomy run lifecycle: create → running → succeeded / failed / cancelled (`src/utils/autonomyRuns.ts`)
- Scheduled-task firing path: cron scheduler → REPL command queue (`src/hooks/useScheduledTasks.ts`)
- Headless streaming variant of the same path (`src/cli/print.ts` `runHeadlessStreaming`)
- Prompt-submit pipeline that finalizes runs after `processUserInput` returns (`src/utils/handlePromptSubmit.ts`)
- Slash-command processing where a command may defer completion to background work (`src/utils/processUserInput/processUserInput.ts`, `processSlashCommand.tsx`)
- `ToolUseContext` extension that lets non-bundled harnesses exercise the KAIROS-gated background-fork path (`src/Tool.ts`)
### Out of scope

- The cron scheduler itself (`src/utils/cronScheduler.ts`): its tick semantics are not changing
- `autonomyFlows.ts` flow state machine: separate from per-run tracking
- HEARTBEAT.md scheduling semantics: unchanged. `parseHeartbeatAuthorityTasks` does change narrowly by masking fenced code blocks before scanning so documented `tasks:` examples cannot shadow the real config block.
- `prepareAutonomyTurnPrompt` content shape: only its call ordering relative to run creation changes
- Any provider-level behaviour (`services/api/**`): not touched
### Assumptions

- `process.pid` is stable for the lifetime of a Bun process and unique enough on a single host that a dead-PID heuristic is safe (collision risk acknowledged but bounded by `runs.json` retention).
- `isProcessRunning(pid)` (from `genericProcessUtils.js`) returns `false` only when the process is actually gone; transient permission errors return `true`, which fails safe by leaving the run in place. Verified in step 6.
- `getSessionId()` is initialized before any autonomy run creates records, since autonomy runs only originate after the REPL or headless main loop has booted.

---
## 3. Entry points

| Surface | Entry | Notes |
|---|---|---|
| REPL | `useScheduledTasks` cron tick | Calls `createScheduledTaskQueuedCommand` (new helper) instead of raw `createAutonomyQueuedPrompt` |
| REPL | Slash command pipeline | `processUserInput → processUserInputBase → processSlashCommand` now threads `autonomy` context so commands can defer completion |
| Headless | `runHeadlessStreaming` cron path | Same migration to `createAutonomyQueuedPromptIfNoActiveSource`, plus `shouldCreate` callback honouring `inputClosed` |
| Tool harness | `ToolUseContext.options.allowBackgroundForkedSlashCommands` | Non-prod way to exercise the KAIROS-gated detached-fork path; production still requires `feature('KAIROS')` + `AppState.kairosEnabled` |
| Persistence | `.claude/autonomy/runs.json` | Schema gains `ownerProcessId`, `ownerSessionId`; readers must tolerate older records lacking these fields |

---
## 4. Key files

| File | Lines changed | Why it matters |
|---|---|---|
| `src/utils/autonomyRuns.ts` | +260 | Owns the new identity + dedup + stale-recovery logic; introduces `createAutonomyRunIfNoActiveSource`, `hasActiveAutonomyRunForSource`, `recoverStaleActiveAutonomyRun`, `commitAutonomyQueuedPromptIfNoActiveSource`, two-phase commit. The structural heart of the fix. |
| `src/utils/processUserInput/processSlashCommand.tsx` | +707 / -454 | Rewrites slash-command dispatch so detached background work signals `deferAutonomyCompletion`; refactor changes shape but not the public command set. |
| `src/hooks/useScheduledTasks.ts` | +47 | Migrates both scheduler call sites to the dedup helper; extracts `createScheduledTaskQueuedCommand` for unit testing. |
| `src/cli/print.ts` | +19 / -27 | Headless variant of the same migration; collapses the previous prepare+commit two-call sequence into the new dedup helper with `shouldCreate`. |
| `src/utils/handlePromptSubmit.ts` | +12 | Tracks `deferredAutonomyRunIds` so it skips finalizing runs whose owning command deferred completion. |
| `src/utils/processUserInput/processUserInput.ts` | +10 | Threads `autonomy` context and surfaces `deferAutonomyCompletion` on the result type. |
| `src/Tool.ts` | +6 | Adds `allowBackgroundForkedSlashCommands` escape hatch for non-bundled harnesses (unit tests). |
| `src/utils/__tests__/autonomyRuns.test.ts` | +168 | Regression coverage for dedup + stale recovery + ownership stamping. |
| `src/hooks/__tests__/useScheduledTasks.test.ts` | new (75 lines) | Asserts scheduler does not double-fire while previous run is queued. |
| `src/utils/processUserInput/__tests__/processSlashCommand.test.ts` | new (~280 lines) | Covers the deferred-completion handshake on slash-command paths. |

---
## 5. Call flow (post-fix)

```text
cron tick (useScheduledTasks)
  └─> createScheduledTaskQueuedCommand(task)
        └─> createAutonomyQueuedPromptIfNoActiveSource
              ├─> prepareAutonomyTurnPrompt (loads AGENTS.md + HEARTBEAT.md)
              ├─> shouldCreate? ──► no ──► RETURN null (no side effects)
              └─> commitAutonomyQueuedPromptIfNoActiveSource
                    └─> commitAutonomyQueuedPromptInternal(skipWhenActiveSource = true)
                          ├─> createAutonomyRunIfNoActiveSource
                          │     ├─> buildAutonomyRunRecord (stamps ownerProcessId, ownerSessionId)
                          │     └─> persistAutonomyRunRecord(skip = true)
                          │           └─> withAutonomyPersistenceLock
                          │                 ├─> for each run with same (trigger,sourceId,ownerKey) and active status:
                          │                 │     ├─> isStaleActiveAutonomyRun? ──► recoverStaleActiveAutonomyRun (mark failed)
                          │                 │     └─> else ──► hasBlockingActiveRun = true
                          │                 ├─> if blocking ──► RETURN created=false (no enqueue)
                          │                 └─> else ──► unshift record, write file, return true
                          ├─> if run is null ──► RETURN null (caller drops the tick)
                          └─> else ──► commitPreparedAutonomyTurn(prepared) (heartbeat last-run state ONLY now mutates)
                                └─> assemble QueuedCommand and return
```

Two structural moves: (a) preparing the prompt no longer commits heartbeat state; only successful run insertion commits it. (b) A blocking active run of the same source short-circuits the tick before the queue is touched.
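The two structural moves can be sketched in a few lines of TypeScript. This is a minimal illustration under simplifying assumptions, not the code in `autonomyRuns.ts`: the record shape is trimmed, `insertIfNoActiveSource` and `commitTick` are hypothetical names, and the persistence lock and stale recovery are left out.

```ts
// Simplified, hypothetical shapes; the real types live in src/utils/autonomyRuns.ts.
type RunStatus = 'queued' | 'running' | 'succeeded' | 'failed' | 'cancelled'
type RunRecord = {
  runId: string
  status: RunStatus
  trigger: string
  sourceId?: string
  ownerKey?: string
}

const isActive = (run: RunRecord): boolean =>
  run.status === 'queued' || run.status === 'running'

// Phase 1: insert the run unless an active run for the same source exists.
function insertIfNoActiveSource(runs: RunRecord[], candidate: RunRecord): boolean {
  const blocked = runs.some(
    run =>
      isActive(run) &&
      run.trigger === candidate.trigger &&
      run.sourceId === candidate.sourceId &&
      run.ownerKey === candidate.ownerKey,
  )
  if (blocked) return false
  runs.unshift(candidate) // newest record first
  return true
}

// Phase 2: advance heartbeat last-run state only after phase 1 succeeded.
function commitTick(
  runs: RunRecord[],
  candidate: RunRecord,
  lastRunByTask: Map<string, number>,
  taskKey: string,
  now: number,
): RunRecord | null {
  if (!insertIfNoActiveSource(runs, candidate)) return null // tick dropped, nothing mutated
  lastRunByTask.set(taskKey, now)
  return candidate
}
```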
For slash commands:

```text
processUserInput → processUserInputBase
  └─> processSlashCommand(..., autonomy = cmd.autonomy)
        └─> command implementation
              ├─> runs synchronously ──► returns normal result
              └─> spawns detached/background work ──► returns result with deferAutonomyCompletion = true
                    + handles its own finalize* call when work ends

handlePromptSubmit (caller of processUserInput):
  ├─> records cmd.autonomy.runId in autonomyRunIds
  ├─> on result with deferAutonomyCompletion=true: adds runId to deferredAutonomyRunIds
  └─> finalize loop: skips deferred ids in BOTH success and error branches
```
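The harness side of this handshake reduces to a set difference. The sketch below is illustrative only: `CommandOutcome` and `runIdsToFinalize` are hypothetical names, and the real `handlePromptSubmit` tracks these ids while it processes commands rather than over a pre-built array.

```ts
// Hypothetical per-command outcome; field names mirror the flow above.
type CommandOutcome = {
  runId?: string // present only for autonomy-originated commands
  deferAutonomyCompletion?: boolean
}

// Returns the run ids the harness should finalize itself.
function runIdsToFinalize(outcomes: CommandOutcome[]): string[] {
  const autonomyRunIds = new Set<string>()
  const deferredAutonomyRunIds = new Set<string>()
  for (const outcome of outcomes) {
    if (outcome.runId === undefined) continue
    autonomyRunIds.add(outcome.runId)
    if (outcome.deferAutonomyCompletion === true) {
      deferredAutonomyRunIds.add(outcome.runId)
    }
  }
  // Deferred runs are finalized later by the command that owns the background work.
  return Array.from(autonomyRunIds).filter(id => !deferredAutonomyRunIds.has(id))
}
```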
---

## 6. Data flow

### `runs.json` record schema (delta)

```ts
type AutonomyRunRecord = {
  // existing
  runId: string
  status: 'queued' | 'running' | 'succeeded' | 'failed' | 'cancelled'
  trigger: AutonomyTriggerKind
  sourceId?: string
  ownerKey?: string
  // new
  ownerProcessId?: number // process.pid at create time and at markRunning time
  ownerSessionId?: string // getSessionId() at the same points
  // ...
}
```

Backward compatibility: older records with both fields absent are treated as "owner unknown". They never satisfy `isStaleActiveAutonomyRun` (which requires `typeof ownerProcessId === 'number'`), so they remain blocking until they are completed normally or manually cancelled. This is intentional: we cannot prove they are stale.
### Stale-recovery rule

```text
isStaleActiveAutonomyRun(run) ⇔
    run.status ∈ {queued, running}
  ∧ typeof run.ownerProcessId === 'number'
  ∧ !isProcessRunning(run.ownerProcessId)
```

Recovery mutates the in-memory list inside the persistence lock and writes it back, marking the stale run `failed` with error prefix `"Recovered stale active autonomy run"`.
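Written out as code, the rule and the recovery step might look like the sketch below. It is a hedged illustration, not the implementation: the liveness probe is injected instead of importing `isProcessRunning`, `recoverStaleRuns` is a hypothetical name, the `error` field name is assumed, and only the message prefix comes from this report (the suffix is made up).

```ts
type StaleCheckRun = {
  runId: string
  status: 'queued' | 'running' | 'succeeded' | 'failed' | 'cancelled'
  ownerProcessId?: number
  error?: string // field name assumed for illustration
}

// Liveness probe is injected so the rule can be exercised without real PIDs.
type IsProcessRunning = (pid: number) => boolean

function isStaleActiveRun(run: StaleCheckRun, isProcessRunning: IsProcessRunning): boolean {
  const active = run.status === 'queued' || run.status === 'running'
  return (
    active &&
    typeof run.ownerProcessId === 'number' &&
    !isProcessRunning(run.ownerProcessId)
  )
}

// Marks every stale active run as failed and returns the recovered run ids.
function recoverStaleRuns(runs: StaleCheckRun[], isProcessRunning: IsProcessRunning): string[] {
  const recovered: string[] = []
  for (const run of runs) {
    if (!isStaleActiveRun(run, isProcessRunning)) continue
    run.status = 'failed'
    run.error = `Recovered stale active autonomy run (owner pid ${run.ownerProcessId} not running)`
    recovered.push(run.runId)
  }
  return recovered
}
```

Records without `ownerProcessId` fall through the `typeof` check, which is why legacy records stay blocking instead of being reaped.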
### Heartbeat last-run state mutation point

Before fix: `commitAutonomyQueuedPrompt` called `commitPreparedAutonomyTurn(prepared)` *first*, then created the run. A skipped duplicate already advanced heartbeat last-run timestamps.

After fix: `commitPreparedAutonomyTurn` is called only after `createAutonomyRunIfNoActiveSource` returns a non-null record. Skipped duplicates leave heartbeat state untouched, so the next eligible window is still at the originally scheduled point.

---
## 7. State model

### Run status lifecycle (unchanged at edges, tightened in the middle)

```text
queued ──► running ──► succeeded
  │           │
  │           └────► failed
  ├──────────────────► cancelled
  └──► failed (stale recovery, new path)
```
### New invariants

1. **Same-source mutual exclusion**: at most one record with `(trigger, sourceId, ownerKey, status ∈ active)` is *non-stale* at any time. Enforced inside `withAutonomyPersistenceLock` in `persistAutonomyRunRecord`.

2. **Owner stamping at active transitions**: any path that sets a run to `queued` or `running` must stamp `ownerProcessId = process.pid` and `ownerSessionId = getSessionId()`. `markAutonomyRunRunning` was updated to do this for the running transition (creation already did it).

3. **Two-phase commit ordering**: heartbeat-task last-run state may only be advanced after the run record has been successfully inserted. Equivalent to "prompt commit ⇒ run row exists".

4. **Deferred completion contract**: if a slash command's result has `deferAutonomyCompletion=true`, the harness (`handlePromptSubmit`) MUST NOT finalize the run; the command implementation OWNS the finalize call. Tracked via a `deferredAutonomyRunIds` set scoped to a single `executeUserInput` invocation.
### Concurrency / retry risks

- Two processes sharing the same project root can race on `runs.json`. Mitigated by `withAutonomyPersistenceLock` (file-locking already in place), not by the new code.
- Two ticks of the same scheduled task within a single process serialize on the same lock; only the first wins, and the rest see the active record and return `null`.
- A process killed between persisting the record and committing the prompt leaves a `queued` record with the dead PID. Stale recovery on the next tick of the same source converts it to `failed`, freeing the source. This is the new safety net.
### Two-phase commit crash window (acknowledged limitation)

Within `commitAutonomyQueuedPromptInternal` the order is:

1. `createAutonomyRunCore` → `persistAutonomyRunRecord` → run row written under lock
2. `commitPreparedAutonomyTurn(prepared)` → in-memory `heartbeatTaskLastRunByKey` Map advanced

These two steps are NOT atomic. If the process is killed between (1) and (2):

- `runs.json` has a fresh `queued` record stamped with the now-dead PID.
- `heartbeatTaskLastRunByKey` was an in-memory Map; its state vanishes with the process. On restart the Map is empty.
- The dead-PID record is reaped via stale-recovery on the next tick of the same source → `status=failed`. A new record can then be created.
- Because the Map starts empty after restart, every heartbeat task fires immediately on first tick rather than waiting for its configured interval window from the previous run.

**Severity**: low. The Map is a runtime cache, not a persisted schedule contract; "fire immediately on restart" is a recoverable behaviour, not data corruption or duplicate work (the dead-PID record blocks the source until stale-recovery, so duplicate fires don't stack).

**Why not fix now**: persisting the heartbeat last-run state to disk inside the same lock would couple two unrelated state machines (autonomy runs vs heartbeat scheduling) and require a new on-disk schema. The cost outweighs the rare edge case (process death in the brief window between the locked record write and the in-memory Map update). Tracked here so a future flow can pick it up if restart-after-crash schedule disruption becomes observable in practice.

---
## 8. Existing tests

### Pre-fix

- `src/utils/__tests__/autonomyRuns.test.ts` covered create / list / mark transitions for the basic happy path.
- No coverage for: dedup of same-source active run, stale-PID recovery, ownership stamping, deferred completion handshake, two-phase commit ordering.
- `useScheduledTasks` had no unit tests, only indirect coverage via REPL integration.
- `processSlashCommand` had no autonomy-context coverage.

### Added in this branch

- `src/utils/__tests__/autonomyRuns.test.ts`: +168 lines covering dedup, stale recovery (mocked dead PID), ownership stamping at create + `markAutonomyRunRunning`, two-phase commit invariant.
- `src/hooks/__tests__/useScheduledTasks.test.ts`: new file, 75 lines. Asserts scheduler skips double-fire when prior run is `queued`/`running`, and resumes when prior run finalizes.
- `src/utils/processUserInput/__tests__/processSlashCommand.test.ts`: new file, ~280 lines. Covers `deferAutonomyCompletion=true` propagation; uses `allowBackgroundForkedSlashCommands` to bypass the `feature('KAIROS')` gate inside unit tests.

### Not yet covered (proposed for `regression-test` step)

- Cross-process race against the persistence lock: currently relies on file-lock correctness; consider a focused integration test that spawns two children and verifies only one wins.
- Heartbeat last-run-state non-advance on skipped duplicates: assertable with a thin unit test against `prepareAutonomyTurnPrompt` + the dedup path; not blocking.

---
## 9. Competing root-cause hypotheses

### H1: "Prompt size is the OOM source"

**Claim**: each scheduled tick rebuilds a long prompt string (AGENTS.md + HEARTBEAT.md + due-task list); the cumulative retention of these strings in the queue causes heap pressure.

**Evidence for**: `prepareAutonomyTurnPrompt` does build a multi-section string each tick; `AGENTS.md` in this repo is now 220 lines.

**Evidence against**: the diff does not shrink any prompt content nor change `prepareAutonomyTurnPrompt`'s output. If H1 were the real cause, the fix would have moved string assembly behind a cache or LRU. The fix instead targets the *number* of in-flight runs.

**Verdict**: contributing factor at most. Rejected as primary root cause.
### H2: "Background-forked slash commands leak runs"

**Claim**: KAIROS-style slash commands that fork detached work return immediately from `processUserInput`; the harness in `handlePromptSubmit` then finalizes the run as `succeeded`. Any error in the background work is unattributable, and (more importantly) the *next* scheduled fire of the same source finds no active run, so multiple background workers stack up behind the same source.

**Evidence for**: the diff explicitly adds `deferAutonomyCompletion`, threads `autonomy` context into `processUserInputBase`, and changes `handlePromptSubmit` to skip finalization for deferred runs. New test file `processSlashCommand.test.ts` is dedicated to this exact handshake.

**Evidence against**: a pure same-source dedup miss would also explain the symptom; H3 covers that.

**Verdict**: real and load-bearing. Confirmed by the targeted code added.
### H3: "Scheduled-task tick has no dedup against prior run"

**Claim**: cron tick / heartbeat tick fires unconditionally; if the previous tick's run is still `queued`/`running`, the queue grows by one each interval. Compounded across multiple sources, the queue and the active subset of `runs.json` never shrink.

**Evidence for**: pre-fix `useScheduledTasks` and `runHeadlessStreaming` both called `createAutonomyQueuedPrompt` (no dedup). Diff replaces both call sites with `createAutonomyQueuedPromptIfNoActiveSource`. Persistence-side dedup added in the same change.

**Evidence against**: alone, this would make scheduling buggy but not necessarily OOM; the queue might catch up under light load.

**Verdict**: real and load-bearing. Confirmed by the targeted code added.
### H4: "Dead-process runs poison dedup forever"

**Claim**: even with H3 fixed, a process killed mid-run leaves a `running` record on disk with no owner liveness check; the next process loading `runs.json` would treat it as blocking and never schedule that source again.

**Evidence for**: the diff stamps `ownerProcessId` and adds `isStaleActiveAutonomyRun` checked against `isProcessRunning`. Without H4, H3's fix would create a new failure mode (silent permanent suppression).

**Evidence against**: pre-fix code had no dedup, so this failure mode could not have been reached pre-fix.

**Verdict**: real, but secondary. It exists because H3's fix introduces it. Required to ship together.

---
## 10. Chosen root cause

**Combined H2 + H3 + H4**: the unbounded growth of active autonomy runs is the product of three independently insufficient gaps that line up under load:

1. Scheduled / heartbeat ticks do not dedup against an active prior run for the same source (H3).
2. Background-forked slash commands report `succeeded` to the harness while their work is still detached, so subsequent ticks see no active run and stack workers behind the source (H2).
3. Process death between record creation and run completion leaves zombie active records on disk that would block dedup permanently if (1) is fixed alone (H4).

Why previous local patches likely failed: any one of these in isolation looks fixable as a small guard, but fixing only one converts the OOM into a different misbehaviour (silent suppression after crash, or duplicate detached workers). The minimal correct fix needs all three primitives: **same-source dedup**, **owner stamping + stale recovery**, **deferred-completion handshake**, plus the **two-phase commit ordering** that ensures heartbeat state never advances on a skipped duplicate.

---
## 11. Fix plan

### Minimal fix surface

| Module | Change | Reason |
|---|---|---|
| `autonomyRuns.ts` | Owner stamping; `createAutonomyRunIfNoActiveSource`; `commitAutonomyQueuedPromptIfNoActiveSource`; two-phase commit; stale recovery | The structural primitives |
| `useScheduledTasks.ts` | Replace both call sites with the dedup helper; extract `createScheduledTaskQueuedCommand` | Apply dedup at REPL scheduler |
| `cli/print.ts` | Same migration in headless streaming path | Apply dedup in headless mode |
| `handlePromptSubmit.ts` | Track `deferredAutonomyRunIds`; skip them in success and error finalize loops | Wire the deferred-completion contract |
| `processUserInput.ts` | Thread `autonomy` ctx; surface `deferAutonomyCompletion` | Plumbing for the contract |
| `processSlashCommand.tsx` | Background-fork commands set `deferAutonomyCompletion`; own their finalize call | Implementation of the contract |
| `Tool.ts` | `allowBackgroundForkedSlashCommands` flag on `ToolUseContext.options` | Make the path testable from non-bundled harnesses |
### Tests added

- `autonomyRuns.test.ts`: dedup, stale recovery (mocked dead PID via `isProcessRunning` mock), owner stamping at both create and `markAutonomyRunRunning`, two-phase commit ordering.
- `useScheduledTasks.test.ts`: scheduler skips double-fire, resumes after finalize.
- `processSlashCommand.test.ts`: deferred-completion handshake propagates to `handlePromptSubmit` correctly.
### Compatibility / migration risk

- Older `runs.json` records lacking `ownerProcessId` are tolerated: they are never identified as stale, so they keep their blocking semantics. Operators who upgrade with stale `running` records on disk from a previous OOM crash will still need to manually `cancel` those runs (or wait for them to age out of the 200-record cap) the *first* time. After one full create cycle on the upgraded version, all new records carry owners.
- **Observability gap on legacy blocking (added by reviewer 2026-04-28)**: when a no-owner active record blocks dedup, the current code path is silent, so operators see "scheduled tasks stop firing" with no diagnostic. The `implement` step MUST add a one-line warn log inside `persistAutonomyRunRecord`'s blocking branch: when `hasBlockingActiveRun = true` AND the blocking run has `ownerProcessId === undefined`, emit `[autonomyRuns] blocked by legacy un-owned active run <runId> (createdAt=<ts>); cancel manually if this is a stale upgrade artifact`. This is ≤ 10 lines of code and converts a silent hang into a diagnosable signal. Do **not** change behavior; this is observability only.
- `ToolUseContext.options.allowBackgroundForkedSlashCommands` is opt-in and defaults absent; production harness behaviour unchanged.
- No on-disk schema version bump required.
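The warn-log requirement amounts to one guarded message. A minimal sketch, assuming the blocking record exposes `runId` and `createdAt` as the message template suggests; `legacyBlockWarning` is a hypothetical helper, and where the string gets logged is left to the caller:

```ts
// Hypothetical subset of the run record used by the blocking branch.
type BlockingRun = {
  runId: string
  createdAt: number // assumed field, inferred from the message template
  ownerProcessId?: number
}

// Returns the diagnostic for a legacy un-owned blocker, or null for owned runs.
function legacyBlockWarning(blocking: BlockingRun): string | null {
  if (blocking.ownerProcessId !== undefined) return null
  return (
    `[autonomyRuns] blocked by legacy un-owned active run ${blocking.runId} ` +
    `(createdAt=${blocking.createdAt}); cancel manually if this is a stale upgrade artifact`
  )
}
```

Returning the string instead of logging directly keeps the sketch easy to assert on; the blocking decision itself is untouched.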
### Rollback plan

- Revert the working tree to `main`'s versions of all 8 files. The `runs.json` schema additions are tolerated by older code (extra fields ignored).
- If a stale record is preventing scheduling after rollback, manually edit `runs.json` (status → `cancelled`) or run `/autonomy flow cancel` for affected flows.
- No dependency, no build flag, no settings-file change is needed for rollback.

### Out of scope (intentionally)

- Capping `prepareAutonomyTurnPrompt` output size (H1): addressable later if needed; not load-bearing for the OOM.
- Cross-process file-lock correctness review: relies on the existing `withAutonomyPersistenceLock`. Out of scope for this flow.
- A migration utility to clean stale records on startup: discussed and rejected as avoidable, since the 200-record cap rolls them off naturally.

---
## 12. Verification

### Commands (binding per `.claude/autonomy/AGENTS.md` §4)

```bash
bun run typecheck
bun test src/utils/__tests__/autonomyRuns.test.ts
bun test src/hooks/__tests__/useScheduledTasks.test.ts
bun test src/utils/processUserInput/__tests__/processSlashCommand.test.ts
bun test # full unit suite
bun run lint
bun run build
```
### Manual checks (proposed for `implement` step)

- Run a session with two `HEARTBEAT.md` 30s tasks for ≥ 30 minutes; observe that the `runs.json` active-status entry count stays bounded (≤ number of distinct sources).
- Force-kill the Bun process while a record is `running`. Restart. Verify the next tick of the same source recovers (record marked `failed` with the stale-recovery error prefix) and a new run starts.
- Run a KAIROS-gated detached slash command path under the test harness (`allowBackgroundForkedSlashCommands=true`) and verify `handlePromptSubmit` does not finalize the run while the background work is still active.
### Observability checks

- `[ScheduledTasks] skipping <id>: previous run still queued or running` debug log appears when dedup fires (added in `useScheduledTasks.ts`). Use it to confirm dedup is reached in real sessions.
- `runs.json` records with status `failed` and an error starting with `"Recovered stale active autonomy run"` indicate stale-recovery actually fired.
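The `runs.json` side of these checks can be done mechanically once the records are loaded: the bounded active count from the manual checks, and the stale-recovery marker. A rough sketch, assuming the records have already been parsed into an array (the file's exact layout is not described here) and that the error text lives in an `error` field; `summarizeRuns` is a hypothetical helper:

```ts
// Hypothetical subset of the run record needed for the checks.
type ObservedRun = {
  status: string
  trigger: string
  sourceId?: string
  ownerKey?: string
  error?: string // field name assumed for illustration
}

const STALE_PREFIX = 'Recovered stale active autonomy run'

function summarizeRuns(records: ObservedRun[]) {
  const activeSources = new Set<string>()
  let activeCount = 0
  let staleRecovered = 0
  for (const record of records) {
    if (record.status === 'queued' || record.status === 'running') {
      activeCount += 1
      activeSources.add([record.trigger, record.sourceId ?? '', record.ownerKey ?? ''].join('|'))
    }
    if (record.status === 'failed' && (record.error ?? '').indexOf(STALE_PREFIX) === 0) {
      staleRecovered += 1
    }
  }
  // With dedup in place, each source should contribute at most one active record.
  return { activeCount, staleRecovered, bounded: activeCount <= activeSources.size }
}
```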
---

## 13. Open questions

1. ~~Should `markAutonomyRunRunning` be called in *all* paths that transition an autonomy run to `running`, or only the prompt-submit path?~~ **Closed (verified 2026-04-28).**
   `markAutonomyRunRunning` (`autonomyRuns.ts:554-579`) is the **only** function that transitions `AutonomyRunRecord.status → 'running'`. It stamps `ownerProcessId = process.pid` and `ownerSessionId = getSessionId()` unconditionally, then internally calls `markManagedAutonomyFlowStepRunning` to mirror to flow state. `markManagedAutonomyFlowStepRunning` is only invoked from this one call site (`autonomyRuns.ts:571`); no caller bypasses the stamp. All four real callers (`cli/print.ts:2177`, `screens/REPL.tsx:4859`, `utils/handlePromptSubmit.ts:492`, `utils/swarm/inProcessRunner.ts:741`) go through the stamping path. Flow records intentionally do not carry owner fields; the run record is the source of truth and flow steps mirror it via `latestRunId`. Stale-recovery operates on runs, so flow-step runs are covered.
2. ~~`getSessionId()` import was added to `autonomyRuns.ts`. Confirm no circular import is introduced...~~ **Closed (verified 2026-04-28).**
   No risk on three counts: (a) `autonomyRuns.ts:4` already imported `getProjectRoot` from `bootstrap/state.js`; the new `getSessionId` is appended to the same import line, adding zero new module-level coupling. (b) The reverse direction is empty: `grep -rn 'autonomy*' src/bootstrap/` yields no results, so the dependency stays one-way. (c) `getSessionId()` (`bootstrap/state.ts:425-427`) returns `STATE.sessionId`, which is initialized at module load with `randomUUID()` and re-randomized by `resetStateForTests()` per test, so it is never `undefined` and never throws. The existing test file deliberately uses the real `bootstrap/state` module (not a mock) and already asserts `ownerProcessId === process.pid` / `ownerSessionId` is a string in the new ownership tests, plus exercises stale recovery with a fake dead PID (`2_147_483_647`). No mock updates needed.
3. Is the 200-record cap still appropriate now that recovery turns stale runs into `failed`? Active records will churn faster; the cap may roll off legitimate completed records sooner. Not a correctness issue, but worth noting.

---
## 14. Approval gate
|
|
||||||
|
|
||||||
This SUR satisfies `AGENTS.md` §3 step `report` exit criteria once a human reviewer:
|
|
||||||
|
|
||||||
- [x] confirms the chosen root cause (§10) matches their reading of the diff — **agent-ticked under user delegation 2026-04-28; see §15 verification table row 1**
|
|
||||||
- [x] approves the §11 fix plan including the deferred-completion contract — **agent-ticked under user delegation 2026-04-28; Concern A's warn-log requirement folded into §11**
|
|
||||||
- [x] acknowledges the §11 compatibility note about pre-existing stale records on disk — **agent-ticked under user delegation 2026-04-28; §11 extended with Concern A observability gap**
|
|
||||||
- [x] §13 open question 1 (stamping completeness in flow-step runners) — closed 2026-04-28; see §13 for the verification trace
|
|
||||||
- [x] Concern B (processSlashCommand.tsx >50% diff) — **resolved 2026-04-28 by commit-split rule, see §15**
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## 15. Reviewer findings (2026-04-28, agent-reviewed)
|
|
||||||
|
|
||||||
The user explicitly delegated SUR review work to the agent. The §14 checkboxes remain the user's decision; this section records the agent's verification work and recommendations to make that decision faster and more auditable.
|
|
||||||
|
|
||||||
### Verification work performed
|
|
||||||
|
|
||||||
| Claim | Cross-check | Result |
|---|---|---|
| §10 H2/H3/H4 interlock | Walked each "fix only one" counterfactual | ✅ Real interlock: fixing only one converts the OOM into a different bug (silent suppression / persistent stacking) |
| §11 fix surface covers all 8 modified files | Compared against `git diff --stat` | ✅ Each file has a row in the table |
| §11 "extra fields ignored" rollback claim | JSON parse semantics | ✅ Correct |
| §11 compatibility claim "tolerated" | Re-read `isStaleActiveAutonomyRun` (`autonomyRuns.ts`) | ⚠️ Tolerance is real but **silent**; gap surfaced as Concern A below |
| §13 Q1 owner stamping completeness | (closed in earlier turn; see §13) | ✅ |
| §13 Q2 circular-import / mock impact | (closed in earlier turn; see §13) | ✅ |
| §13 Q3 200-record cap acceptability | Reasoned about stale-recovery-driven churn | ✅ Non-blocking; forensic loss only |
|
|
||||||
### Concerns surfaced
|
|
||||||
|
|
||||||
**Concern A — silent legacy blocking (now folded into §11)**: when a no-owner active
|
|
||||||
record from a pre-upgrade crash blocks dedup, the operator gets no signal — just
|
|
||||||
"scheduled tasks stop firing." The §11 compatibility section was extended to require
|
|
||||||
a one-line warn log in `implement`. This is an observability fix, not a behavior
|
|
||||||
change.
|
|
||||||
|
|
||||||
**Concern B — `processSlashCommand.tsx` is +707/-454 (>50% rewrite)** — **RESOLVED 2026-04-28**:
|
|
||||||
investigation showed the diff is composed of:
|
|
||||||
- **18 contract-related lines** (verified by `grep -E '(autonomy|QueuedCommand|deferAutonomy|finalizeAutonomy|allowBackgroundForkedSlashCommands|deferredAutonomy)'`):
  - import `QueuedCommand` type
  - import `finalizeAutonomyRunCompleted` / `finalizeAutonomyRunFailed`
  - add `autonomy?: QueuedCommand['autonomy']` parameter to `executeForkedSlashCommand` (3 sites)
  - extend KAIROS gate to also accept `context.options.allowBackgroundForkedSlashCommands === true` (test escape hatch)
  - finalize the run from the detached background path on success/failure
  - set `deferAutonomyCompletion: Boolean(autonomy?.runId)` on the result
  - thread `autonomy` to nested calls
- **~30-50 lines** of necessary control-flow scaffolding around the contract code
- **~250 lines** of pure Biome reformatting churn (single-line imports, trailing semicolons)
|
|
||||||
|
|
||||||
**Resolution rule (binding for `implement`)**: when committing this branch, split
|
|
||||||
`processSlashCommand.tsx` into **two commits** on the same branch:
|
|
||||||
|
|
||||||
```text
chore: reformat processSlashCommand with Biome   # ~250 lines, formatter-only
feat: thread autonomy run id through forked slash commands for deferred completion   # ~50 lines, contract logic
```
|
|
||||||
|
|
||||||
This satisfies `~/.claude/rules/deep-debug/core.md` §2 ("a bug fix must not mix in ... formatting") in spirit by making the contract commit reviewable in isolation, without requiring a fragile manual revert of formatter output (which Biome would re-apply on the next save). The other 7 modified files in the OOM fix are not expected to require commit splitting; verify by sampling their diffs at `implement` time.
|
|
||||||
|
|
||||||
**Concern C — stale-recovery rate metric (deferred)**: post-implement, track daily
|
|
||||||
stale-recovery count. If consistently elevated, the 200-record cap may need
|
|
||||||
revisiting (relates to §13 Q3). Not a blocker; suggested for follow-up flow.
|
|
||||||
|
|
||||||
### Agent recommendations on the §14 checkboxes
|
|
||||||
|
|
||||||
| §14 box | Agent recommendation | Rationale |
|---|---|---|
| §10 chosen root cause | Approve | H2/H3/H4 interlock verified; diff supports each branch |
| §11 fix plan (with §15 Concern A folded in) | Approve | Minimal, complete, regression-tested |
| §11 compatibility note | Acknowledge as-extended (§11 now includes the warn-log requirement from Concern A) | Silent legacy blocking would surprise users; the added log makes it diagnosable |
| Concern B `processSlashCommand.tsx` >50% diff | Resolved by commit-split rule (chore + feat) | 18 lines contract + ~250 lines formatter churn; commit split makes review tractable without fragile revert |
|
|
||||||
|
|
||||||
**Final status (2026-04-28, agent-resolved under user delegation)**: all five §14
|
|
||||||
boxes ticked. Flow `recurring-bug-loop-oom` may advance from `report` to
|
|
||||||
`regression-test`. Implement-time obligations folded in:
|
|
||||||
|
|
||||||
1. Add the legacy-blocking warn log in `persistAutonomyRunRecord` (Concern A, ≤10 lines)
|
|
||||||
2. Commit-split `processSlashCommand.tsx` into chore + feat (Concern B)
|
|
||||||
3. Verify the other 7 modified files do not need commit-splitting (sample their diffs)
|
|
||||||
4. Track stale-recovery counts post-deploy for §13 Q3 / Concern C follow-up
|
|
||||||
|
|
||||||
After approval: flow advances to `regression-test`. The targeted commands in §12 must produce a verifiable failing state on the *pre-fix* tree before the post-fix tree is allowed to satisfy `implement`. Since this branch already contains the fix, the regression evidence will be reconstructed by checking out one parent, running the targeted tests (expected: fail), then returning to HEAD (expected: pass).
|
|
||||||
@@ -1,91 +0,0 @@
|
|||||||
# System Understanding Report — Skill Search / Skill Learning Overflow Bugs
|
|
||||||
|
|
||||||
- **Flow id**: `recurring-bug-skill-overflow` (sibling pilot to `recurring-bug-loop-oom`)
|
|
||||||
- **Branch**: `fix/loop-scheduled-autonomy-oom` (folded into the OOM PR — same audit-and-cap pattern)
|
|
||||||
- **Trigger**: post-merge review of the autonomy OOM fix surfaced unbounded module-level state in the adjacent `EXPERIMENTAL_SKILL_SEARCH` and `SKILL_LEARNING` subsystems. The user explicitly asked for an audit, on the premise that "there must be overflows of the same kind here too".
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## 1. Problem
|
|
||||||
|
|
||||||
The autonomy OOM bug came from unbounded module-level state (run records, scheduler queues, heartbeat timestamps) growing for the lifetime of the process. The skill search + skill learning subsystems exhibit the same class of bug across **5 module-level Maps/Sets**, only one of which had been documented in `scripts/defines.ts` (comment, translated: "projectContext cache has no eviction mechanism (not the main cause of GB-scale growth)").
|
|
||||||
|
|
||||||
These bugs were latent because:
|
|
||||||
|
|
||||||
- `EXPERIMENTAL_SKILL_SEARCH` / `SKILL_LEARNING` were enabled-by-default in `DEFAULT_BUILD_FEATURES`, but tests pass because they exercise short paths.
|
|
||||||
- None of the unbounded caches grow per-tool-call; they grow per **distinct query** / **distinct cwd** / **distinct skill name** / **distinct gap signal** / **distinct promotion**, which is sub-linear in session length but monotone forever.
|
|
||||||
- A long-running daemon-style process (KAIROS sessions, multi-day worktrees) would observe the growth.
|
|
||||||
|
|
||||||
## 2. Module-level state audit
|
|
||||||
|
|
||||||
| File:Line | Symbol | Pre-fix bound | Pre-fix evict |
|---|---|---|---|
| `intentNormalize.ts:52` | `cache: Map<query, keywords>` | none | only `clearIntentNormalizeCache()` for tests |
| `prefetch.ts:17` | `discoveredThisSession: Set<skillName>` | none | none |
| `prefetch.ts:18` | `recordedGapSignals: Set<gapKey>` | none | none |
| `projectContext.ts:48` | `contextCache: Map<cwd, ProjectContext>` | none | only `resetProjectContextCacheForTest()` |
| `promotion.ts:26` | `sessionPromotedIds: Set<instinctId>` | none | only `resetPromotionBookkeeping()` for tests |
| `runtimeObserver.ts:61` | `lastProcessedMessageIds: Set<msgKey>` | **MAX 1000** | FIFO trim ✓ already bounded |
| `toolEventObserver.ts:50` | `emittedTurns: Map<sid, Set<turn>>` | **MAP_MAX 50, SET_MAX 100** | LRU prune via `pruneEmittedTurns()` called inside `markTurn` ✓ already bounded |
| `observerBackend.ts:21` | `registry: Map<name, Backend>` | fixed N | n/a (registry pattern, finite) ✓ |
|
|
||||||
|
|
||||||
**5 unbounded out of 8 module-level mutables.** All 5 are addressed in this PR.
|
|
||||||
|
|
||||||
## 3. Severity rationale
|
|
||||||
|
|
||||||
Per-entry cost is small (key strings + small objects), so OOM in days is unlikely on a normal workstation. The canary scenarios are:
|
|
||||||
|
|
||||||
- **`intentNormalize.cache`**: every distinct Chinese query → Haiku call → cached. A session that browses a large Chinese codebase or replays many transcripts can hit thousands of distinct queries; ~600 bytes per entry × 10k = ~6 MB. Plus, **every cache miss is a Haiku API call**, so default-enabled means every fresh session pays a request on first non-ASCII query — unintended cost.
|
|
||||||
- **`projectContext.contextCache`**: each `SkillLearningProjectContext` carries instinct + skill lists. Multi-worktree orchestrators (this very repo!) blow past the typical "1 cwd per session" assumption.
|
|
||||||
- **`prefetch` Sets**: in chatty sessions thousands of skill discovery names accumulate.
|
|
||||||
- **`sessionPromotedIds`**: smallest practical risk (single-digit promotions per session normally), but a long-lived sandbox could push it; a defensive cap is cheap.
|
|
||||||
|
|
||||||
The fix bounds all 5 with FIFO/LRU eviction at sensible sizes (32–1000 entries, per the caps listed in §4). No data-corruption risk: degraded behaviour on cap overflow is benign (re-emit a duplicate signal, re-send a query to Haiku, re-resolve a cwd context). Same risk profile as the autonomy stale-recovery design.
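
The cap-and-trim pattern with an LRU touch on hit can be sketched as below. This is an illustration under stated assumptions rather than the code in `intentNormalize.ts` or `projectContext.ts`: `BoundedLruMap`, `maxEntries`, and `trimTo` are hypothetical names. It relies on the fact that a JavaScript `Map` iterates in insertion order, so deleting and re-inserting a key moves it to the most-recently-used end.

```typescript
// Hypothetical bounded cache: grows to `maxEntries`, then evicts the
// least-recently-used entries until only `trimTo` remain.
class BoundedLruMap<K, V> {
  private readonly entries = new Map<K, V>()
  private readonly maxEntries: number
  private readonly trimTo: number

  constructor(maxEntries: number, trimTo: number) {
    if (trimTo < 0 || trimTo > maxEntries) {
      throw new Error('expected 0 <= trimTo <= maxEntries')
    }
    this.maxEntries = maxEntries
    this.trimTo = trimTo
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined
    const value = this.entries.get(key) as V
    // LRU touch: re-insert so the key moves to the newest position.
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  set(key: K, value: V): void {
    this.entries.delete(key)
    this.entries.set(key, value)
    if (this.entries.size <= this.maxEntries) return
    // Over the cap: drop the oldest entries until the trim target is reached.
    for (const oldest of this.entries.keys()) {
      if (this.entries.size <= this.trimTo) break
      this.entries.delete(oldest)
    }
  }

  has(key: K): boolean {
    return this.entries.has(key)
  }

  get size(): number {
    return this.entries.size
  }
}
```

Trimming below the cap (for example 200 down to 150) means eviction runs once per batch of inserts instead of on every insert after the cap is reached.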
|
|
||||||
|
|
||||||
## 4. Fix surface
|
|
||||||
|
|
||||||
| File | Change |
|---|---|
| `src/services/skillSearch/intentNormalize.ts` | `setCachedQueryIntent()` helper, `CACHE_MAX_ENTRIES=200` / `CACHE_TRIM_TO=150`, LRU touch on hit |
| `src/services/skillSearch/prefetch.ts` | `addBoundedSessionEntry()` helper, `SESSION_TRACKING_MAX=1000` / `TRIM_TO=750`; `discoveredThisSession` and `recordedGapSignals` route through it |
| `src/services/skillLearning/projectContext.ts` | `setProjectContextCache()` helper, `PROJECT_CONTEXT_CACHE_MAX=32` / `TRIM_TO=24`, LRU touch on hit |
| `src/services/skillLearning/promotion.ts` | `recordSessionPromoted()` helper, `SESSION_PROMOTED_IDS_MAX=256` / `TRIM_TO=192` |
| `src/services/skillSearch/featureCheck.ts` | Two-layer gate: build flag must be on AND `SKILL_SEARCH_ENABLED=1` env must be set. Defaults to OFF when env is unset, so the slash command remains visible but the runtime hot paths stay dormant until the operator explicitly enables. |
| `src/services/skillLearning/featureCheck.ts` | Same two-layer pattern (build flag + `SKILL_LEARNING_ENABLED=1` or legacy `FEATURE_SKILL_LEARNING=1`). |
| `scripts/defines.ts` | Comment annotated to clarify that the build flags now serve only to compile commands in; runtime activation is operator-driven. |
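
For the session-tracking Sets, a FIFO variant of the same idea might look like the sketch below. The real signature of `addBoundedSessionEntry()` is not shown in this report, so the parameters and the `SESSION_TRACKING_TRIM_TO` constant name here are assumptions; the numeric values reuse the 1000 / 750 figures from the table.

```typescript
const SESSION_TRACKING_MAX = 1000
const SESSION_TRACKING_TRIM_TO = 750 // hypothetical name for the table's TRIM_TO

// Assumed signature. Returns true when the value was not already tracked,
// so callers can keep using the result as a "first time seen" signal.
function addBoundedSessionEntry(
  set: Set<string>,
  value: string,
  max: number = SESSION_TRACKING_MAX,
  trimTo: number = SESSION_TRACKING_TRIM_TO,
): boolean {
  if (set.has(value)) return false
  set.add(value)
  if (set.size > max) {
    // A Set iterates in insertion order, so the first values are the oldest.
    for (const oldest of set) {
      if (set.size <= trimTo) break
      set.delete(oldest)
    }
  }
  return true
}
```

An evicted value is reported as new the next time it appears, which is the benign duplicate-signal degradation described in §3.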
|
|
||||||
|
|
||||||
## 5. Why default-off (without removing from build)?
|
|
||||||
|
|
||||||
Three reasons aside from the unbounded-cache concern:
|
|
||||||
|
|
||||||
1. **Implicit cost**: `intentNormalize` calls Haiku on cache miss. Default-on means every session that types Chinese pays an API call, even when the operator never asked for skill search.
|
|
||||||
2. **Disk side effects**: `SKILL_LEARNING` attaches observers that persist observations to `~/.claude` storage. Storage volume should be opt-in, not background.
|
|
||||||
3. **Experimental status**: the flag is literally named `EXPERIMENTAL_*`. Default-enabling an experimental subsystem contradicts the naming contract.
|
|
||||||
|
|
||||||
**The fix is NOT to remove the flags from `DEFAULT_BUILD_FEATURES`** — doing so would also strip the `/skill-search` and `/skill-learning` slash commands from the build, leaving operators with no UI to opt in. Instead the activation logic in `featureCheck.ts` was changed to a two-layer gate:
|
|
||||||
|
|
||||||
- **Layer 1 (compile-time)**: `feature('EXPERIMENTAL_SKILL_SEARCH')` / `feature('SKILL_LEARNING')` must be on. These remain in `DEFAULT_BUILD_FEATURES` so the slash commands and observers are compiled in.
|
|
||||||
- **Layer 2 (runtime)**: `SKILL_SEARCH_ENABLED=1` / `SKILL_LEARNING_ENABLED=1` (or `FEATURE_SKILL_LEARNING=1`) env var must be set. Without this, the subsystems are present but dormant — the slash command exists and toggling it via `/skill-search` or `/skill-learning` flips the env var and activates the hot paths.
|
|
||||||
|
|
||||||
Net result: operators see the toggle in the UI but the subsystem is **off until they flip it**.
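
The two-layer gate can be sketched as follows. This is not the contents of `featureCheck.ts`: `buildFlagOn` is a stand-in for the compile-time `feature(...)` check, and the `Env` type and function names are hypothetical. Only the env var names come from this report.

```typescript
type Env = Record<string, string | undefined>

// Layer 1 is the build flag (passed in here as a boolean);
// layer 2 is the operator's explicit runtime opt-in.
function isSkillSearchEnabled(buildFlagOn: boolean, env: Env): boolean {
  if (!buildFlagOn) return false // not compiled in
  return env['SKILL_SEARCH_ENABLED'] === '1' // dormant unless opted in
}

function isSkillLearningEnabled(buildFlagOn: boolean, env: Env): boolean {
  if (!buildFlagOn) return false
  return (
    env['SKILL_LEARNING_ENABLED'] === '1' ||
    env['FEATURE_SKILL_LEARNING'] === '1' // legacy variable
  )
}
```

With an unset environment both functions return `false` even when the build flag is on, which is the default-off behaviour described above.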
|
|
||||||
|
|
||||||
## 6. Out of scope (filed for follow-up)
|
|
||||||
|
|
||||||
- **Test failures on CI** (`prefetch.test.ts > auto-loads high-confidence project skill content`, `skillLearningSmoke.test.ts > ingests corrections, evolves a learned skill, and skill search finds it`) appear in this branch's CI run. Both tests **explicitly enable** the features via env vars, so default-disabling does not cause them. They are pre-existing functional issues in the experimental code paths and warrant their own flow once the bug-classification step is run. Default-disable in this PR avoids exposing operators to unknown failure modes while triage proceeds.
|
|
||||||
- **Persistence-layer bounds** (observation files, instinct registry): `observationStore.ts` already has 30-day purge and 1MB archive thresholds; `skillGapStore.ts` uses a finite-state lifecycle. Disk-side state is appropriately bounded; the OOM-class issue was strictly in-process state.
|
|
||||||
|
|
||||||
## 7. Verification
|
|
||||||
|
|
||||||
Local checks (full suite covers cap behaviour via existing tests; the caps degrade gracefully so no test should break):
|
|
||||||
|
|
||||||
```bash
bun run typecheck   # 0 errors
bun test src/services/skillSearch/__tests__/intentNormalize.test.ts
bun test src/services/skillSearch/__tests__/prefetch.extractQuery.test.ts
bun test src/services/skillLearning/__tests__/projectContext.test.ts
bun test src/services/skillLearning/__tests__/promotion.test.ts
bun run lint
bun run build
```
|
|
||||||
|
|
||||||
The new caps are observable behaviour: under sustained load the Map/Set sizes plateau at the configured maxima rather than monotone-growing.
|
|
||||||
@@ -1,314 +0,0 @@
|
|||||||
# Autonomy Reliability Jira Drafts
|
|
||||||
|
|
||||||
These tickets are based on the call-chain audit of `/autonomy`, proactive ticks, HEARTBEAT managed flows, cron scheduling, command queue consumption, and daemon process supervision.
|
|
||||||
|
|
||||||
## AUT-001: Preserve autonomy lifecycle when queued commands are consumed mid-turn
|
|
||||||
|
|
||||||
Type: Bug
|
|
||||||
Priority: P0
|
|
||||||
Status: Draft
|
|
||||||
Patch status: Implemented in `fix/autonomy-lifecycle`.
|
|
||||||
|
|
||||||
Problem:
|
|
||||||
`query.ts` can drain queued prompt/task-notification commands as attachments
|
|
||||||
during an active turn. Autonomy prompts consumed this way were removed from the
|
|
||||||
in-memory queue without marking the persisted run as running/completed/failed,
|
|
||||||
so managed flows could stay stuck in `queued` and never advance.
|
|
||||||
|
|
||||||
Evidence:
|
|
||||||
- `src/query.ts` drains queued commands via `getCommandsByMaxPriority()`.
|
|
||||||
- `src/query.ts` removes consumed commands from the queue.
|
|
||||||
- Lifecycle updates existed only in the normal queued-submit path
|
|
||||||
`src/utils/handlePromptSubmit.ts` and headless `src/cli/print.ts`.
|
|
||||||
|
|
||||||
Acceptance criteria:
|
|
||||||
- Mid-turn consumed autonomy commands mark runs `running`.
|
|
||||||
- Normal query completion finalizes consumed runs and queues next managed-flow
|
|
||||||
steps.
|
|
||||||
- Query errors or abort terminal reasons mark consumed runs failed.
|
|
||||||
- Stale/cancelled autonomy commands are removed from the in-memory queue
|
|
||||||
without being sent to the model.
|
|
||||||
- Regression tests cover stale command filtering and managed-flow advancement.
|
|
||||||
|
|
||||||
## AUT-002: Make autonomy run lifecycle transitions terminal-safe
|
|
||||||
|
|
||||||
Type: Bug
|
|
||||||
Priority: P0
|
|
||||||
Status: Draft
|
|
||||||
Patch status: Implemented in `fix/autonomy-lifecycle`.
|
|
||||||
|
|
||||||
Problem:
|
|
||||||
Run lifecycle helpers rewrote status unconditionally. A stale in-memory command
|
|
||||||
could mark a cancelled/completed/failed run back to `running`, causing a
|
|
||||||
cancelled flow to execute or a terminal flow to be rewritten.
|
|
||||||
|
|
||||||
Evidence:
|
|
||||||
- `markAutonomyRunRunning`, `markAutonomyRunCompleted`,
|
|
||||||
`markAutonomyRunFailed`, and `markAutonomyRunCancelled` updated records
|
|
||||||
without checking current status.
|
|
||||||
- External CLI cancel cannot remove queued commands living inside another
|
|
||||||
process, so stale commands are a realistic input.
|
|
||||||
|
|
||||||
Acceptance criteria:
|
|
||||||
- `queued -> running/completed/failed/cancelled` remains allowed.
|
|
||||||
- `running -> completed/failed/cancelled` remains allowed.
|
|
||||||
- Any terminal status rejects later lifecycle updates.
|
|
||||||
- Rejected transitions do not update managed-flow step state.
|
|
||||||
- Regression tests cover stale lifecycle calls after cancellation.
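
The acceptance criteria above amount to a small transition table. A minimal sketch, assuming a simplified `Run` shape and a hypothetical `transitionRun` helper (the real lifecycle helpers are the `markAutonomyRun*` functions, whose internals are not shown here):

```typescript
type RunStatus = 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'

interface Run {
  runId: string
  status: RunStatus
}

// Allowed targets per current status; terminal statuses allow nothing.
const ALLOWED_TRANSITIONS: Record<RunStatus, readonly RunStatus[]> = {
  queued: ['running', 'completed', 'failed', 'cancelled'],
  running: ['completed', 'failed', 'cancelled'],
  completed: [],
  failed: [],
  cancelled: [],
}

// Returns the updated record, or null when the transition is rejected.
// On null the caller should also skip the managed-flow step update.
function transitionRun(run: Run, next: RunStatus): Run | null {
  if (!ALLOWED_TRANSITIONS[run.status].includes(next)) return null
  return { ...run, status: next }
}
```

Returning `null` instead of throwing lets a stale in-memory command be dropped quietly after an external cancel.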
|
|
||||||
|
|
||||||
## AUT-003: Prevent proactive and scheduled-task async fire failures from becoming invisible
|
|
||||||
|
|
||||||
Type: Bug
|
|
||||||
Priority: P1
|
|
||||||
Status: Draft
|
|
||||||
Patch status: Implemented in `fix/autonomy-lifecycle`.
|
|
||||||
|
|
||||||
Problem:
|
|
||||||
Proactive tick and cron fire callbacks launch detached async work. Failures in
|
|
||||||
prompt preparation or queue insertion could surface as unhandled rejections or
|
|
||||||
be lost from diagnostics. In one-shot cron paths, the scheduler has already
|
|
||||||
decided the task fired.
|
|
||||||
|
|
||||||
Evidence:
|
|
||||||
- `src/proactive/useProactive.ts` used a detached async IIFE without catch.
|
|
||||||
- `src/cli/print.ts` proactive and cron paths also detached async work.
|
|
||||||
- `src/hooks/useScheduledTasks.ts` cron callbacks detached async work.
|
|
||||||
|
|
||||||
Acceptance criteria:
|
|
||||||
- Detached proactive/cron fire work has explicit error logging.
|
|
||||||
- REPL proactive tick generation is non-reentrant.
|
|
||||||
- Tick generation stops queueing after hook unmount.
|
|
||||||
|
|
||||||
## AUT-004: Bound long-running daemon restart timers during shutdown
|
|
||||||
|
|
||||||
Type: Bug
|
|
||||||
Priority: P1
|
|
||||||
Status: Draft
|
|
||||||
Patch status: Implemented in `fix/autonomy-lifecycle`.
|
|
||||||
|
|
||||||
Problem:
|
|
||||||
The daemon supervisor scheduled worker restarts with `setTimeout()` but did
|
|
||||||
not store, clear, or `unref()` the timer. Shutdown during backoff could keep
|
|
||||||
the supervisor alive until the timer fired, forcing the stop path toward
|
|
||||||
SIGKILL.
|
|
||||||
|
|
||||||
Evidence:
|
|
||||||
- `src/daemon/main.ts` scheduled restart timers directly in the worker exit
|
|
||||||
handler.
|
|
||||||
- Shutdown only signaled child processes and did not clear restart timers.
|
|
||||||
|
|
||||||
Acceptance criteria:
|
|
||||||
- Worker restart timers are tracked per worker.
|
|
||||||
- Shutdown clears any pending restart timers.
|
|
||||||
- Restart and force-kill grace timers do not keep the supervisor alive alone.
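
A sketch of per-worker restart timer tracking, under the assumption of a hypothetical `RestartScheduler` class (the actual structure of `src/daemon/main.ts` is not shown in this ticket). It uses only the standard `setTimeout` / `clearTimeout` functions plus `unref()`, which Node and Bun timer objects provide.

```typescript
type TimerHandle = ReturnType<typeof setTimeout>

// In Node and Bun a timer is an object with `unref()`; in browsers it is a number.
function unrefIfSupported(timer: TimerHandle): void {
  const candidate = timer as unknown as { unref?: () => void } | number
  if (
    typeof candidate === 'object' &&
    candidate !== null &&
    typeof candidate.unref === 'function'
  ) {
    candidate.unref()
  }
}

class RestartScheduler {
  private readonly timers = new Map<string, TimerHandle>()
  private shuttingDown = false

  scheduleRestart(workerId: string, delayMs: number, restart: () => void): void {
    if (this.shuttingDown) return
    this.cancel(workerId) // at most one pending restart per worker
    const timer = setTimeout(() => {
      this.timers.delete(workerId)
      if (!this.shuttingDown) restart()
    }, delayMs)
    unrefIfSupported(timer) // a pending backoff alone should not keep the process alive
    this.timers.set(workerId, timer)
  }

  cancel(workerId: string): void {
    const timer = this.timers.get(workerId)
    if (timer !== undefined) {
      clearTimeout(timer)
      this.timers.delete(workerId)
    }
  }

  shutdown(): void {
    this.shuttingDown = true
    for (const timer of this.timers.values()) clearTimeout(timer)
    this.timers.clear()
  }

  get pendingCount(): number {
    return this.timers.size
  }
}
```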
|
|
||||||
|
|
||||||
## AUT-005: Release autonomy persistence lock bookkeeping after each chain
|
|
||||||
|
|
||||||
Type: Bug
|
|
||||||
Priority: P1
|
|
||||||
Status: Draft
|
|
||||||
Patch status: Implemented in `fix/autonomy-lifecycle`.
|
|
||||||
|
|
||||||
Problem:
|
|
||||||
`withAutonomyPersistenceLock` stored a chained promise in its map but compared the map value against the raw current promise during cleanup. That condition never matched, so root-level lock bookkeeping could accumulate in long-lived processes that touch many workspaces.
|
|
||||||
|
|
||||||
Evidence:
|
|
||||||
- `src/utils/autonomyPersistence.ts` stored `previous.then(() => current)`.
|
|
||||||
- Cleanup compared `persistenceLocks.get(key) === current`.
|
|
||||||
|
|
||||||
Acceptance criteria:
|
|
||||||
- The stored chained promise is the value used for cleanup comparison.
|
|
||||||
- Existing serialization behavior for same-root calls remains unchanged.
|
|
||||||
- Tests directly assert same-root lock bookkeeping returns to zero after both
|
|
||||||
success and failure.
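
The cleanup comparison can be illustrated with a small keyed promise-chain lock. This is a sketch of the corrected pattern, not the code in `src/utils/autonomyPersistence.ts`: `withPersistenceLock` and the local `persistenceLocks` map are simplified stand-ins.

```typescript
const persistenceLocks = new Map<string, Promise<void>>()

async function withPersistenceLock<T>(
  key: string,
  fn: () => Promise<T>,
): Promise<T> {
  const previous = persistenceLocks.get(key) ?? Promise.resolve()
  let release: () => void = () => {}
  const current = new Promise<void>(resolve => {
    release = resolve
  })
  // Store the chained promise and keep a reference to that exact object.
  const chained = previous.then(() => current)
  persistenceLocks.set(key, chained)
  await previous // wait for earlier holders of the same key
  try {
    return await fn()
  } finally {
    release()
    // Compare against the stored value (`chained`), not the raw `current`.
    // Comparing with `current` can never match, so the entry would never be removed.
    if (persistenceLocks.get(key) === chained) {
      persistenceLocks.delete(key)
    }
  }
}
```

If a later caller has already replaced the map entry, the comparison fails and the entry is left for that caller to clean up, so serialization for the same key is preserved.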
|
|
||||||
|
|
||||||
## AUT-006: Add active-record protection before persistence truncation
|
|
||||||
|
|
||||||
Type: Reliability
|
|
||||||
Priority: P2
|
|
||||||
Status: Draft
|
|
||||||
Patch status: Implemented in `fix/autonomy-lifecycle`.
|
|
||||||
|
|
||||||
Problem:
|
|
||||||
Autonomy runs and flows are capped by latest-created/updated order only.
|
|
||||||
Under high churn, active `queued` or `running` records can be truncated before
|
|
||||||
completion, which removes recovery evidence and can break managed-flow
|
|
||||||
advancement.
|
|
||||||
|
|
||||||
Evidence:
|
|
||||||
- `src/utils/autonomyRuns.ts` keeps the latest 200 runs by `createdAt`.
|
|
||||||
- `src/utils/autonomyFlows.ts` keeps the latest 100 flows by `updatedAt`.
|
|
||||||
|
|
||||||
Acceptance criteria:
|
|
||||||
- Active records are retained before completed historical records are trimmed.
|
|
||||||
- Tests cover trimming with more than the configured cap and active records
|
|
||||||
near the tail.
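
One way to retain active records ahead of finished history is sketched below. The `Run` shape, `trimRuns`, and the choice to let active records exceed the cap when there are more of them than `cap` are all assumptions for illustration; the real truncation lives in `src/utils/autonomyRuns.ts` and `src/utils/autonomyFlows.ts`.

```typescript
type RunStatus = 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'

interface Run {
  runId: string
  status: RunStatus
  createdAt: number
}

function isActive(run: Run): boolean {
  return run.status === 'queued' || run.status === 'running'
}

// Keep every active record, then fill the remaining slots with the newest
// finished records. Assumption: active records are never dropped, even if
// they alone exceed `cap`.
function trimRuns(runs: Run[], cap: number): Run[] {
  const newestFirst = [...runs].sort((a, b) => b.createdAt - a.createdAt)
  const active = newestFirst.filter(isActive)
  const finished = newestFirst.filter(run => !isActive(run))
  const remainingSlots = Math.max(0, cap - active.length)
  const kept = [...active, ...finished.slice(0, remainingSlots)]
  return kept.sort((a, b) => b.createdAt - a.createdAt)
}
```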
|
|
||||||
|
|
||||||
## AUT-007: Treat provider API-error responses as failed autonomy turns
|
|
||||||
|
|
||||||
Type: Bug
|
|
||||||
Priority: P0
|
|
||||||
Status: Draft
|
|
||||||
Patch status: Implemented in `fix/autonomy-lifecycle`.
|
|
||||||
|
|
||||||
Problem:
|
|
||||||
Third-party provider adapters can convert provider failures into synthetic
|
|
||||||
assistant API-error messages instead of throwing. `query.ts` treated
|
|
||||||
`isApiErrorMessage` terminal responses as `completed`, so an autonomy command
|
|
||||||
that had already been consumed as a queued attachment could be marked
|
|
||||||
completed and advance its managed flow even though the provider call failed.
|
|
||||||
|
|
||||||
Evidence:
|
|
||||||
- `src/services/api/openai/index.ts`, `src/services/api/gemini/index.ts`, and
|
|
||||||
`src/services/api/grok/index.ts` yield `createAssistantAPIErrorMessage()` on
|
|
||||||
adapter errors.
|
|
||||||
- `src/query.ts` skipped stop hooks for API-error assistant messages but
|
|
||||||
returned `reason: 'completed'`.
|
|
||||||
- Top-level autonomy finalization used terminal completion to decide whether
|
|
||||||
to mark consumed runs completed or failed.
|
|
||||||
|
|
||||||
Acceptance criteria:
|
|
||||||
- Provider API-error assistant messages terminate the query with
|
|
||||||
`reason: 'model_error'`.
|
|
||||||
- Any consumed autonomy run is marked failed rather than completed.
|
|
||||||
- Managed flows do not advance to the next step after provider API errors.
|
|
||||||
- A regression test simulates provider error after a queued autonomy attachment
|
|
||||||
has been consumed.
|
|
||||||
|
|
||||||
## AUT-008: Finalize consumed autonomy runs on async-generator close
|
|
||||||
|
|
||||||
Type: Bug
|
|
||||||
Priority: P0
|
|
||||||
Status: Draft
|
|
||||||
Patch status: Implemented in `fix/autonomy-lifecycle`.
|
|
||||||
|
|
||||||
Problem:
|
|
||||||
`query()` is an async generator. When its consumer calls `.return()` or breaks out of iteration, JavaScript executes `finally` blocks and skips code after the `try/finally`. The previous autonomy finalization ran after the `finally`, so queued autonomy commands that had already been claimed as `running` could stay persisted as `running` forever if the REPL/SDK consumer closed the generator.
|
|
||||||
|
|
||||||
Evidence:
|
|
||||||
- Claimed run IDs were collected during queued attachment injection.
|
|
||||||
- Completion/failure finalization happened only after `yield* queryLoop(...)`
|
|
||||||
returned normally or threw.
|
|
||||||
- Claude cross-validation flagged this as a durable run/flow leak.
|
|
||||||
|
|
||||||
Acceptance criteria:
|
|
||||||
- Consumed autonomy runs are finalized from a `finally` path.
|
|
||||||
- Normal completion marks consumed runs completed and enqueues next managed
|
|
||||||
flow steps.
|
|
||||||
- Provider/model errors mark consumed runs failed.
|
|
||||||
- Generator close and user abort terminals mark consumed runs cancelled.
|
|
||||||
- A regression test closes the generator after a queued autonomy attachment and
|
|
||||||
verifies the run/flow are cancelled, not left running.
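
The generator-close behaviour behind this ticket can be reproduced in a few lines. The sketch uses a synchronous generator for brevity; an async generator suspended at a `yield` follows the same rule when the consumer calls `.return()`. `queryLike` and the log strings are hypothetical and stand in for `query()` and the real finalization calls.

```typescript
type Outcome = 'completed' | 'cancelled'

function* queryLike(log: string[]): Generator<string, void, void> {
  // Assume cancelled until the loop finishes normally.
  let outcome: Outcome = 'cancelled'
  try {
    yield 'attachment'
    yield 'assistant'
    outcome = 'completed'
  } finally {
    // Runs on normal completion, on throw, and when the consumer closes early.
    log.push('finalize:' + outcome)
  }
  // Skipped when the consumer closes the generator early.
  log.push('after-finally')
}
```

Draining the generator records both `finalize:completed` and `after-finally`; calling `.return()` after the first `yield` records only `finalize:cancelled`, which is why finalization has to live inside the `finally` path.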
|
|
||||||
|
|
||||||
## AUT-009: Claim queued autonomy runs before attachment injection
|
|
||||||
|
|
||||||
Type: Bug
|
|
||||||
Priority: P0
|
|
||||||
Status: Draft
|
|
||||||
Patch status: Implemented in `fix/autonomy-lifecycle`.
|
|
||||||
|
|
||||||
Problem:
|
|
||||||
The query loop filtered stale queued autonomy commands before attachment
|
|
||||||
generation, but it did not claim runs as `running` until after attachments were
|
|
||||||
already yielded. A concurrent cancellation between those steps could still send
|
|
||||||
a cancelled prompt into the model context.
|
|
||||||
|
|
||||||
Evidence:
|
|
||||||
- `partitionConsumableQueuedAutonomyCommands()` only checked persisted status.
|
|
||||||
- `markAutonomyRunRunning()` previously ran after `getAttachmentMessages()`.
|
|
||||||
- Reviewer cross-validation identified the check-then-act race.
|
|
||||||
|
|
||||||
Acceptance criteria:
|
|
||||||
- Query claims queued autonomy runs before passing commands to attachment
|
|
||||||
generation.
|
|
||||||
- Only successfully claimed commands are injected as queued-command
|
|
||||||
attachments.
|
|
||||||
- Failed claims are treated as stale and removed from the in-memory queue.
|
|
||||||
- Claiming reads persisted run state once per turn rather than once per
|
|
||||||
command.
|
|
||||||
|
|
||||||
## AUT-010: Cancel proactive and cron runs dropped before enqueue
|
|
||||||
|
|
||||||
Type: Bug
|
|
||||||
Priority: P1
|
|
||||||
Status: Draft
|
|
||||||
Patch status: Implemented in `fix/autonomy-lifecycle`.
|
|
||||||
|
|
||||||
Problem:
|
|
||||||
`/proactive` and scheduled-task producers persist autonomy runs before
|
|
||||||
returning queue commands. If the component is disposed or headless input closes
|
|
||||||
after persistence but before enqueue, the queued run is left on disk with no
|
|
||||||
in-memory command to consume it.
|
|
||||||
|
|
||||||
Evidence:
|
|
||||||
- `createProactiveAutonomyCommands()` commits runs before returning commands.
|
|
||||||
- `commitAutonomyQueuedPrompt()` persists scheduled-task runs before callers
|
|
||||||
enqueue them.
|
|
||||||
- Callers checked `disposed` / `inputClosed` after command creation and could
|
|
||||||
return without terminalizing the run.
|
|
||||||
|
|
||||||
Acceptance criteria:
|
|
||||||
- Proactive hook cancellation checks run both before commit and after command
|
|
||||||
creation.
|
|
||||||
- Headless proactive and cron paths cancel any already-created command that is
|
|
||||||
dropped due to input close.
|
|
||||||
- REPL scheduled-task cleanup cancels already-created commands when unmounted.
|
|
||||||
- A regression test verifies a proactive command created but dropped before
|
|
||||||
enqueue is marked cancelled.
|
|
||||||
|
|
||||||
## AUT-011: Replace query transition `any` stubs with typed contracts
|
|
||||||
|
|
||||||
Type: Test/Type Safety
|
|
||||||
Priority: P2
|
|
||||||
Status: Draft
|
|
||||||
Patch status: Implemented in `fix/autonomy-lifecycle`.
|
|
||||||
|
|
||||||
Problem:
|
|
||||||
`src/query/transitions.ts` defined both `Terminal` and `Continue` as `any`.
|
|
||||||
That allowed new terminal reasons such as `model_error` and continuation
|
|
||||||
reasons such as `collapse_drain_retry` to drift without compiler checks.
|
|
||||||
|
|
||||||
Evidence:
|
|
||||||
- Claude cross-validation flagged the `Terminal = any` contract as a remaining
|
|
||||||
issue.
|
|
||||||
- Tightening the type immediately caught that
|
|
||||||
`collapse_drain_retry.committed` is a `number`, not a `boolean`.
|
|
||||||
|
|
||||||
Acceptance criteria:
|
|
||||||
- `Terminal` is a concrete union of query terminal reasons.
|
|
||||||
- `Continue` is a concrete union of continuation reasons and payloads.
|
|
||||||
- `bun run typecheck` validates all query return sites against that contract.
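
A discriminated-union sketch of what the tightened contract could look like. Only `completed`, `model_error`, and `collapse_drain_retry` (with a numeric `committed`) are named in these tickets; the `aborted` and `next_turn` members and both helper functions are hypothetical placeholders, not the contents of `src/query/transitions.ts`.

```typescript
type Terminal =
  | { reason: 'completed' }
  | { reason: 'model_error' }
  | { reason: 'aborted' } // hypothetical member

type Continue =
  | { reason: 'collapse_drain_retry'; committed: number }
  | { reason: 'next_turn' } // hypothetical member

// Maps a terminal reason to the run outcome described in AUT-007 / AUT-008.
function runOutcomeFor(terminal: Terminal): 'completed' | 'failed' | 'cancelled' {
  switch (terminal.reason) {
    case 'completed':
      return 'completed'
    case 'model_error':
      return 'failed'
    case 'aborted':
      return 'cancelled'
    default: {
      // Adding a new Terminal member without a case fails to compile here.
      const unreachable: never = terminal
      return unreachable
    }
  }
}

function committedCount(next: Continue): number {
  return next.reason === 'collapse_drain_retry' ? next.committed : 0
}
```

The `never` assignment in the `default` branch is what turns a forgotten terminal reason into a typecheck error rather than silent drift.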
|
|
||||||
|
|
||||||
## AUT-012: Avoid provider test settings-module mock pollution
|
|
||||||
|
|
||||||
Type: Test Reliability
|
|
||||||
Priority: P2
|
|
||||||
Status: Draft
|
|
||||||
Patch status: Implemented in `fix/autonomy-lifecycle`.
|
|
||||||
|
|
||||||
Problem:
|
|
||||||
The provider tests previously mocked `settings.js`. A minimal mock broke other
|
|
||||||
tests that imported additional settings exports in the same Bun process; the
|
|
||||||
expanded mock avoided the failure but over-coupled the provider test to
|
|
||||||
unrelated settings internals.
|
|
||||||
|
|
||||||
Evidence:
|
|
||||||
- Full test runs observed cross-file settings mock pollution.
|
|
||||||
- `src/utils/model/providers.ts` only needs the real `getInitialSettings()`
|
|
||||||
behavior.
|
|
||||||
|
|
||||||
Acceptance criteria:
|
|
||||||
- Provider tests do not mock `settings.js`.
|
|
||||||
- `modelType` precedence is exercised through an injected settings snapshot,
|
|
||||||
leaving global bootstrap state untouched.
|
|
||||||
- Provider tests pass when run alongside permissions tests and the provider
|
|
||||||
matrix.
|
|
||||||
@@ -1,6 +1,6 @@
|
|||||||
{
|
{
|
||||||
"name": "claude-code-best",
|
"name": "claude-code-best",
|
||||||
"version": "1.11.0",
|
"version": "1.10.11",
|
||||||
"description": "Reverse-engineered Anthropic Claude Code CLI — interactive AI coding assistant in the terminal",
|
"description": "Reverse-engineered Anthropic Claude Code CLI — interactive AI coding assistant in the terminal",
|
||||||
"type": "module",
|
"type": "module",
|
||||||
"author": "claude-code-best <claude-code-best@proton.me>",
|
"author": "claude-code-best <claude-code-best@proton.me>",
|
||||||
|
|||||||
@@ -1,8 +1,19 @@
|
|||||||
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
|
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
|
||||||
import { authMock } from '../../../../../../tests/mocks/auth'
|
import { mkdir, readFile, rm } from 'fs/promises'
|
||||||
|
import { tmpdir } from 'os'
|
||||||
|
import { join } from 'path'
|
||||||
|
import {
|
||||||
|
resetStateForTests,
|
||||||
|
setOriginalCwd,
|
||||||
|
setProjectRoot,
|
||||||
|
} from 'src/bootstrap/state.js'
|
||||||
|
import { logMock } from '../../../../../../tests/mocks/log'
|
||||||
|
import { debugMock } from '../../../../../../tests/mocks/debug'
|
||||||
|
|
||||||
let requestStatus = 200
|
let requestStatus = 200
|
||||||
const auditRecords: Record<string, unknown>[] = []
|
|
||||||
|
mock.module('src/utils/log.ts', logMock)
|
||||||
|
mock.module('src/utils/debug.ts', debugMock)
|
||||||
|
|
||||||
mock.module('axios', () => ({
|
mock.module('axios', () => ({
|
||||||
default: {
|
default: {
|
||||||
@@ -13,12 +24,20 @@ mock.module('axios', () => ({
|
|||||||
},
|
},
|
||||||
}))
|
}))
|
||||||
|
|
||||||
mock.module('src/utils/auth.js', authMock)
|
mock.module('src/utils/auth.js', () => ({
|
||||||
|
checkAndRefreshOAuthTokenIfNeeded: async () => {},
|
||||||
|
getClaudeAIOAuthTokens: () => ({ accessToken: 'token' }),
|
||||||
|
}))
|
||||||
|
|
||||||
mock.module('src/services/oauth/client.js', () => ({
|
mock.module('src/services/oauth/client.js', () => ({
|
||||||
getOrganizationUUID: async () => 'org',
|
getOrganizationUUID: async () => 'org',
|
||||||
}))
|
}))
|
||||||
|
|
||||||
|
mock.module('src/constants/oauth.js', () => ({
|
||||||
|
getOauthConfig: () => ({ BASE_API_URL: 'https://example.test' }),
|
||||||
|
fileSuffixForOauthConfig: () => '',
|
||||||
|
}))
|
||||||
|
|
||||||
mock.module('src/services/analytics/growthbook.js', () => ({
|
mock.module('src/services/analytics/growthbook.js', () => ({
|
||||||
getFeatureValue_CACHED_MAY_BE_STALE: () => true,
|
getFeatureValue_CACHED_MAY_BE_STALE: () => true,
|
||||||
}))
|
}))
|
||||||
@@ -27,41 +46,40 @@ mock.module('src/services/policyLimits/index.js', () => ({
|
|||||||
isPolicyAllowed: () => true,
|
isPolicyAllowed: () => true,
|
||||||
}))
|
}))
|
||||||
|
|
||||||
// Narrow mock for the side-effectful entries in `src/constants/oauth.js`.
|
mock.module('bun:bundle', () => ({
|
||||||
// Pure data exports (ALL_OAUTH_SCOPES, CLAUDE_AI_*_SCOPE, etc.) come from
|
feature: () => false,
|
||||||
// the real module and are not mocked, per the test policy that constants
|
|
||||||
// modules without side effects should not be replaced wholesale.
|
|
||||||
mock.module('src/constants/oauth.js', () => {
|
|
||||||
const actual = require('../../../../../../src/constants/oauth.js')
|
|
||||||
return {
|
|
||||||
...actual,
|
|
||||||
fileSuffixForOauthConfig: () => '',
|
|
||||||
getOauthConfig: () => ({ BASE_API_URL: 'https://example.test' }),
|
|
||||||
MCP_CLIENT_METADATA_URL: 'https://example.test/oauth/metadata',
|
|
||||||
}
|
|
||||||
})
|
|
||||||
|
|
||||||
mock.module('src/utils/remoteTriggerAudit.js', () => ({
|
|
||||||
appendRemoteTriggerAuditRecord: async (
|
|
||||||
record: Record<string, unknown>,
|
|
||||||
) => {
|
|
||||||
const fullRecord = {
|
|
||||||
auditId: `audit-${auditRecords.length + 1}`,
|
|
||||||
createdAt: Date.now(),
|
|
||||||
...record,
|
|
||||||
}
|
|
||||||
auditRecords.push(fullRecord)
|
|
||||||
return fullRecord
|
|
||||||
},
|
|
||||||
}))
|
}))
|
||||||
|
|
||||||
beforeEach(() => {
|
let cwd = ''
|
||||||
|
let previousCwd = ''
|
||||||
|
let auditRecords: Array<Record<string, unknown>> = []
|
||||||
|
|
||||||
|
mock.module('src/utils/remoteTriggerAudit.js', () => ({
|
||||||
|
appendRemoteTriggerAuditRecord: async (record: Record<string, unknown>) => {
|
||||||
|
const full = { ...record, auditId: record.auditId ?? 'test-audit-id', createdAt: Date.now() }
|
||||||
|
auditRecords.push(full)
|
||||||
|
return full
|
||||||
|
},
|
||||||
|
resolveRemoteTriggerAuditPath: () => join(cwd, '.claude', 'remote-trigger-audit.jsonl'),
|
||||||
|
}))
|
||||||
|
|
||||||
|
beforeEach(async () => {
|
||||||
requestStatus = 200
|
requestStatus = 200
|
||||||
auditRecords.length = 0
|
auditRecords = []
|
||||||
|
previousCwd = process.cwd()
|
||||||
|
cwd = join(tmpdir(), `remote-trigger-tool-${Date.now()}-${Math.random().toString(16).slice(2)}`)
|
||||||
|
await mkdir(cwd, { recursive: true })
|
||||||
|
await mkdir(join(cwd, '.claude'), { recursive: true })
|
||||||
|
process.chdir(cwd)
|
||||||
|
resetStateForTests()
|
||||||
|
setOriginalCwd(cwd)
|
||||||
|
setProjectRoot(cwd)
|
||||||
})
|
})
|
||||||
|
|
||||||
afterEach(() => {
|
afterEach(async () => {
|
||||||
auditRecords.length = 0
|
resetStateForTests()
|
||||||
|
process.chdir(previousCwd)
|
||||||
|
await rm(cwd, { recursive: true, force: true })
|
||||||
})
|
})
|
||||||
|
|
||||||
describe('RemoteTriggerTool audit', () => {
|
describe('RemoteTriggerTool audit', () => {
|
||||||
@@ -73,14 +91,10 @@ describe('RemoteTriggerTool audit', () => {
|
|||||||
)
|
)
|
||||||
|
|
||||||
expect(result.data.audit_id).toBeString()
|
expect(result.data.audit_id).toBeString()
|
||||||
expect(result.data.audit_id).toBe('audit-1')
|
|
||||||
expect(auditRecords).toHaveLength(1)
|
expect(auditRecords).toHaveLength(1)
|
||||||
expect(auditRecords[0]).toMatchObject({
|
expect(auditRecords[0].action).toBe('run')
|
||||||
action: 'run',
|
expect(auditRecords[0].triggerId).toBe('trigger-1')
|
||||||
triggerId: 'trigger-1',
|
expect(auditRecords[0].ok).toBe(true)
|
||||||
ok: true,
|
|
||||||
status: 200,
|
|
||||||
})
|
|
||||||
})
|
})
|
||||||
|
|
||||||
test('writes an audit record before rethrowing validation failures', async () => {
|
test('writes an audit record before rethrowing validation failures', async () => {
|
||||||
@@ -94,10 +108,8 @@ describe('RemoteTriggerTool audit', () => {
|
|||||||
).rejects.toThrow('run requires trigger_id')
|
).rejects.toThrow('run requires trigger_id')
|
||||||
|
|
||||||
expect(auditRecords).toHaveLength(1)
|
expect(auditRecords).toHaveLength(1)
|
||||||
expect(auditRecords[0]).toMatchObject({
|
expect(auditRecords[0].action).toBe('run')
|
||||||
action: 'run',
|
expect(auditRecords[0].ok).toBe(false)
|
||||||
ok: false,
|
expect(auditRecords[0].error).toBe('run requires trigger_id')
|
||||||
error: 'run requires trigger_id',
|
|
||||||
})
|
|
||||||
})
|
})
|
||||||
})
|
})
|
||||||
|
|||||||
@@ -18,19 +18,76 @@
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
import { diffArrays } from 'diff'
|
import { diffArrays } from 'diff'
|
||||||
import hljs from 'highlight.js'
|
// Import the minimal highlight.js core (no languages) instead of the full
|
||||||
|
// bundle that loads 190+ grammars (~5-15MB). Individual languages are
|
||||||
|
// imported statically below and registered on the core instance. Static
|
||||||
|
// imports work in Bun --compile mode (only createRequire fails).
|
||||||
|
import hljs from 'highlight.js/lib/core'
|
||||||
import { basename, extname } from 'path'
|
import { basename, extname } from 'path'
|
||||||
|
|
||||||
// Static import — createRequire(import.meta.url) fails in Bun --compile mode
|
// --- Register commonly-used languages (~25 instead of 190+) ---
|
||||||
// because the resolved path points to the internal bunfs binary path where
|
import langBash from 'highlight.js/lib/languages/bash'
|
||||||
// node_modules cannot be found. A top-level import ensures the module is
|
import langC from 'highlight.js/lib/languages/c'
|
||||||
// bundled and accessible at runtime.
|
import langCmake from 'highlight.js/lib/languages/cmake'
|
||||||
|
import langCpp from 'highlight.js/lib/languages/cpp'
|
||||||
|
import langCsharp from 'highlight.js/lib/languages/csharp'
|
||||||
|
import langCss from 'highlight.js/lib/languages/css'
|
||||||
|
import langDiff from 'highlight.js/lib/languages/diff'
|
||||||
|
import langDockerfile from 'highlight.js/lib/languages/dockerfile'
|
||||||
|
import langGo from 'highlight.js/lib/languages/go'
|
||||||
|
import langGraphQL from 'highlight.js/lib/languages/graphql'
|
||||||
|
import langJava from 'highlight.js/lib/languages/java'
|
||||||
|
import langJavaScript from 'highlight.js/lib/languages/javascript'
|
||||||
|
import langJson from 'highlight.js/lib/languages/json'
|
||||||
|
import langKotlin from 'highlight.js/lib/languages/kotlin'
|
||||||
|
import langMakefile from 'highlight.js/lib/languages/makefile'
|
||||||
|
import langMarkdown from 'highlight.js/lib/languages/markdown'
|
||||||
|
import langPerl from 'highlight.js/lib/languages/perl'
|
||||||
|
import langPhp from 'highlight.js/lib/languages/php'
|
||||||
|
import langPython from 'highlight.js/lib/languages/python'
|
||||||
|
import langRuby from 'highlight.js/lib/languages/ruby'
|
||||||
|
import langRust from 'highlight.js/lib/languages/rust'
|
||||||
|
import langShell from 'highlight.js/lib/languages/shell'
|
||||||
|
import langSql from 'highlight.js/lib/languages/sql'
|
||||||
|
import langTypeScript from 'highlight.js/lib/languages/typescript'
|
||||||
|
import langXml from 'highlight.js/lib/languages/xml'
|
||||||
|
import langYaml from 'highlight.js/lib/languages/yaml'
|
||||||
|
|
||||||
|
hljs.registerLanguage('bash', langBash)
|
||||||
|
hljs.registerLanguage('c', langC)
|
||||||
|
hljs.registerLanguage('cmake', langCmake)
|
||||||
|
hljs.registerLanguage('cpp', langCpp)
|
||||||
|
hljs.registerLanguage('csharp', langCsharp)
|
||||||
|
hljs.registerLanguage('css', langCss)
|
||||||
|
hljs.registerLanguage('diff', langDiff)
|
||||||
|
hljs.registerLanguage('dockerfile', langDockerfile)
|
||||||
|
hljs.registerLanguage('go', langGo)
|
||||||
|
hljs.registerLanguage('graphql', langGraphQL)
|
||||||
|
hljs.registerLanguage('java', langJava)
|
||||||
|
hljs.registerLanguage('javascript', langJavaScript)
|
||||||
|
hljs.registerLanguage('json', langJson)
|
||||||
|
hljs.registerLanguage('kotlin', langKotlin)
|
||||||
|
hljs.registerLanguage('makefile', langMakefile)
|
||||||
|
hljs.registerLanguage('markdown', langMarkdown)
|
||||||
|
hljs.registerLanguage('perl', langPerl)
|
||||||
|
hljs.registerLanguage('php', langPhp)
|
||||||
|
hljs.registerLanguage('python', langPython)
|
||||||
|
hljs.registerLanguage('ruby', langRuby)
|
||||||
|
hljs.registerLanguage('rust', langRust)
|
||||||
|
hljs.registerLanguage('shell', langShell)
|
||||||
|
hljs.registerLanguage('sql', langSql)
|
||||||
|
hljs.registerLanguage('typescript', langTypeScript)
|
||||||
|
hljs.registerLanguage('xml', langXml)
|
||||||
|
hljs.registerLanguage('yaml', langYaml)
|
||||||
|
// JavaScript grammar also handles .mjs/.cjs extensions
|
||||||
|
// TypeScript grammar also handles .tsx via auto-detection
|
||||||
|
|
||||||
type HLJSApi = typeof hljs
|
type HLJSApi = typeof hljs
|
||||||
let cachedHljs: HLJSApi | null = null
|
let cachedHljs: HLJSApi | null = null
|
||||||
function hljsApi(): HLJSApi {
|
function hljsApi(): HLJSApi {
|
||||||
if (cachedHljs) return cachedHljs
|
if (cachedHljs) return cachedHljs
|
||||||
// highlight.js uses `export =` (CJS). Under bun/ESM the interop wraps it
|
// highlight.js/lib/core uses `export =` (CJS). Under bun/ESM the interop
|
||||||
// in .default; under node CJS the module IS the API. Check at runtime.
|
// wraps it in .default; under node CJS the module IS the API. Check at runtime.
|
||||||
const mod = hljs as HLJSApi & { default?: HLJSApi }
|
const mod = hljs as HLJSApi & { default?: HLJSApi }
|
||||||
cachedHljs = 'default' in mod && mod.default ? mod.default : mod
|
cachedHljs = 'default' in mod && mod.default ? mod.default : mod
|
||||||
return cachedHljs!
|
return cachedHljs!
|
||||||
@@ -502,50 +559,6 @@ function hasRootNode(emitter: unknown): emitter is { rootNode: HljsNode } {
|
|||||||
|
|
||||||
let loggedEmitterShapeError = false
|
let loggedEmitterShapeError = false
|
||||||
|
|
||||||
// Per-line hljs AST cache — ColorFile.render re-highlights every line on
|
|
||||||
// width change (terminal resize). The AST is theme-independent; flattenHljs
|
|
||||||
// applies theme colors separately. Capped at 2048 entries (~1 MB typical).
|
|
||||||
const HL_LINE_CACHE_MAX = 2048
|
|
||||||
const hlLineCache = new Map<string, HljsNode | null>()
|
|
||||||
function cachedHljsAst(
|
|
||||||
lang: string,
|
|
||||||
code: string,
|
|
||||||
): HljsNode | null {
|
|
||||||
const key = lang + '\0' + code
|
|
||||||
const hit = hlLineCache.get(key)
|
|
||||||
if (hit !== undefined) return hit
|
|
||||||
let result
|
|
||||||
try {
|
|
||||||
result = hljsApi().highlight(code, {
|
|
||||||
language: lang,
|
|
||||||
ignoreIllegals: true,
|
|
||||||
})
|
|
||||||
} catch {
|
|
||||||
hlLineCache.set(key, null)
|
|
||||||
return null
|
|
||||||
}
|
|
||||||
const emitter = result._emitter || {}
|
|
||||||
if (!hasRootNode(emitter)) {
|
|
||||||
if (!loggedEmitterShapeError) {
|
|
||||||
loggedEmitterShapeError = true
|
|
||||||
logError(
|
|
||||||
new Error(
|
|
||||||
`color-diff: hljs emitter shape mismatch (keys: ${Object.keys(emitter).join(',')}). Syntax highlighting disabled.`,
|
|
||||||
),
|
|
||||||
)
|
|
||||||
}
|
|
||||||
hlLineCache.set(key, null)
|
|
||||||
return null
|
|
||||||
}
|
|
||||||
const node = emitter.rootNode
|
|
||||||
if (hlLineCache.size >= HL_LINE_CACHE_MAX) {
|
|
||||||
const first = hlLineCache.keys().next().value
|
|
||||||
if (first !== undefined) hlLineCache.delete(first)
|
|
||||||
}
|
|
||||||
hlLineCache.set(key, node)
|
|
||||||
return node
|
|
||||||
}
|
|
||||||
|
|
||||||
function highlightLine(
|
function highlightLine(
|
||||||
state: { lang: string | null; stack: unknown },
|
state: { lang: string | null; stack: unknown },
|
||||||
line: string,
|
line: string,
|
||||||
@@ -556,12 +569,30 @@ function highlightLine(
|
|||||||
if (!state.lang) {
|
if (!state.lang) {
|
||||||
return [[defaultStyle(theme), code]]
|
return [[defaultStyle(theme), code]]
|
||||||
}
|
}
|
||||||
const rootNode = cachedHljsAst(state.lang, code)
|
let result
|
||||||
if (!rootNode) {
|
try {
|
||||||
|
result = hljsApi().highlight(code, {
|
||||||
|
language: state.lang,
|
||||||
|
ignoreIllegals: true,
|
||||||
|
})
|
||||||
|
} catch {
|
||||||
|
// hljs throws on unknown language despite ignoreIllegals
|
||||||
|
return [[defaultStyle(theme), code]]
|
||||||
|
}
|
||||||
|
const emitter = result._emitter || {};
|
||||||
|
if (!hasRootNode(emitter)) {
|
||||||
|
if (!loggedEmitterShapeError) {
|
||||||
|
loggedEmitterShapeError = true
|
||||||
|
logError(
|
||||||
|
new Error(
|
||||||
|
`color-diff: hljs emitter shape mismatch (keys: ${Object.keys(emitter).join(',')}). Syntax highlighting disabled.`,
|
||||||
|
),
|
||||||
|
)
|
||||||
|
}
|
||||||
return [[defaultStyle(theme), code]]
|
return [[defaultStyle(theme), code]]
|
||||||
}
|
}
|
||||||
const blocks: Block[] = []
|
const blocks: Block[] = []
|
||||||
flattenHljs(rootNode, theme, undefined, blocks)
|
flattenHljs(emitter.rootNode, theme, undefined, blocks)
|
||||||
return blocks
|
return blocks
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -66,16 +66,9 @@ export const DEFAULT_BUILD_FEATURES = [
|
|||||||
'COMMIT_ATTRIBUTION', // Git commit attribution tracking (records AI-assisted contributions)
|
'COMMIT_ATTRIBUTION', // Git commit attribution tracking (records AI-assisted contributions)
|
||||||
// Server mode (claude server / claude open)
|
// Server mode (claude server / claude open)
|
||||||
'DIRECT_CONNECT', // Direct-connect mode (claude server / claude open)
|
'DIRECT_CONNECT', // Direct-connect mode (claude server / claude open)
|
||||||
// Skill search & learning — feature flags compiled in (so the slash
|
// Skill search & learning
|
||||||
// commands /skill-* etc. exist), but the runtime "enabled" toggle
|
'EXPERIMENTAL_SKILL_SEARCH', // Experimental skill search (DiscoverSkills)
|
||||||
// defaults to OFF (see featureCheck.ts). Operators turn on via the
|
// 'SKILL_LEARNING', // projectContext cache has no eviction mechanism (not the main GB-scale cause)
|
||||||
// slash-command toggle or env vars (SKILL_SEARCH_ENABLED=1,
|
|
||||||
// SKILL_LEARNING_ENABLED=1). Rationale: bounded caches added on
|
|
||||||
// this branch (see docs/agent/sur-skill-overflow-bugs.md) close the
|
|
||||||
// overflow risk, but Haiku-on-first-Chinese-query and disk-side
|
|
||||||
// observation accumulation remain operator-discretion concerns.
|
|
||||||
'EXPERIMENTAL_SKILL_SEARCH',
|
|
||||||
'SKILL_LEARNING',
|
|
||||||
// P3: poor mode
|
// P3: poor mode
|
||||||
'POOR', // Poor mode: skips extract_memories/prompt_suggestion to reduce consumption
|
'POOR', // Poor mode: skips extract_memories/prompt_suggestion to reduce consumption
|
||||||
// Team Memory
|
// Team Memory
|
||||||
|
|||||||
13
src/Tool.ts
@@ -178,19 +178,6 @@ export type ToolUseContext = {
|
|||||||
querySource?: QuerySource
|
querySource?: QuerySource
|
||||||
/** Optional callback to get the latest tools (e.g., after MCP servers connect mid-query) */
|
/** Optional callback to get the latest tools (e.g., after MCP servers connect mid-query) */
|
||||||
refreshTools?: () => Tools
|
refreshTools?: () => Tools
|
||||||
/**
|
|
||||||
* @internal TEST-ONLY ESCAPE HATCH. MUST remain undefined in production.
|
|
||||||
*
|
|
||||||
* Allows non-bundled unit-test harnesses to exercise the background
|
|
||||||
* forked slash command path that production assistant mode gates behind
|
|
||||||
* `feature('KAIROS')`. Still requires `AppState.kairosEnabled`. This
|
|
||||||
* field is constructed in-process by trusted application code only;
|
|
||||||
* no external surface (MCP, plugin, slash command, network) writes to
|
|
||||||
* `ToolUseContext.options`. Setting this true outside a test bypasses
|
|
||||||
* the KAIROS feature flag; `processSlashCommand` rejects this flag
|
|
||||||
* outside `NODE_ENV=test`.
|
|
||||||
*/
|
|
||||||
allowBackgroundForkedSlashCommands?: boolean
|
|
||||||
}
|
}
|
||||||
abortController: AbortController
|
abortController: AbortController
|
||||||
readFileState: FileStateCache
|
readFileState: FileStateCache
|
||||||
|
|||||||
@@ -1,18 +1,8 @@
|
|||||||
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
|
import { beforeEach, describe, expect, mock, test } from 'bun:test'
|
||||||
import { createAbortController } from '../utils/abortController'
|
import { createAbortController } from '../utils/abortController'
|
||||||
import { QueryGuard } from '../utils/QueryGuard'
|
import { QueryGuard } from '../utils/QueryGuard'
|
||||||
import { handlePromptSubmit } from '../utils/handlePromptSubmit'
|
import { handlePromptSubmit } from '../utils/handlePromptSubmit'
|
||||||
import {
|
import { getCommandQueue, resetCommandQueue } from '../utils/messageQueueManager'
|
||||||
getCommandQueue,
|
|
||||||
resetCommandQueue,
|
|
||||||
} from '../utils/messageQueueManager'
|
|
||||||
import { cleanupTempDir, createTempDir } from '../../tests/mocks/file-system'
|
|
||||||
import {
|
|
||||||
createAutonomyQueuedPrompt,
|
|
||||||
markAutonomyRunCancelled,
|
|
||||||
} from '../utils/autonomyRuns'
|
|
||||||
|
|
||||||
let tempDirs: string[] = []
|
|
||||||
|
|
||||||
function createBaseParams() {
|
function createBaseParams() {
|
||||||
const queryGuard = new QueryGuard()
|
const queryGuard = new QueryGuard()
|
||||||
@@ -38,9 +28,11 @@ function createBaseParams() {
|
|||||||
commands: [],
|
commands: [],
|
||||||
setUserInputOnProcessing: mock((_prompt?: string) => {}),
|
setUserInputOnProcessing: mock((_prompt?: string) => {}),
|
||||||
setAbortController: mock((_abortController: AbortController | null) => {}),
|
setAbortController: mock((_abortController: AbortController | null) => {}),
|
||||||
onQuery: mock(async () => true) as unknown as (
|
onQuery: mock(
|
||||||
|
async () => undefined,
|
||||||
|
) as unknown as (
|
||||||
...args: unknown[]
|
...args: unknown[]
|
||||||
) => Promise<boolean>,
|
) => Promise<void>,
|
||||||
setAppState: mock((_updater: unknown) => {}),
|
setAppState: mock((_updater: unknown) => {}),
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -48,13 +40,6 @@ function createBaseParams() {
|
|||||||
describe('handlePromptSubmit', () => {
|
describe('handlePromptSubmit', () => {
|
||||||
beforeEach(() => {
|
beforeEach(() => {
|
||||||
resetCommandQueue()
|
resetCommandQueue()
|
||||||
tempDirs = []
|
|
||||||
})
|
|
||||||
|
|
||||||
afterEach(async () => {
|
|
||||||
for (const tempDir of tempDirs) {
|
|
||||||
await cleanupTempDir(tempDir)
|
|
||||||
}
|
|
||||||
})
|
})
|
||||||
|
|
||||||
test('aborts the current turn when only cancel-interrupt tools are running', async () => {
|
test('aborts the current turn when only cancel-interrupt tools are running', async () => {
|
||||||
@@ -133,34 +118,4 @@ describe('handlePromptSubmit', () => {
|
|||||||
bridgeOrigin: true,
|
bridgeOrigin: true,
|
||||||
})
|
})
|
||||||
})
|
})
|
||||||
|
|
||||||
test('skips stale autonomy commands in the idle queued path', async () => {
|
|
||||||
const params = createBaseParams()
|
|
||||||
const abortController = createAbortController()
|
|
||||||
const tempDir = await createTempDir('handle-prompt-autonomy-')
|
|
||||||
tempDirs.push(tempDir)
|
|
||||||
const command = await createAutonomyQueuedPrompt({
|
|
||||||
basePrompt: 'scheduled prompt',
|
|
||||||
trigger: 'scheduled-task',
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
})
|
|
||||||
expect(command).not.toBeNull()
|
|
||||||
await markAutonomyRunCancelled(command!.autonomy!.runId, tempDir)
|
|
||||||
|
|
||||||
await handlePromptSubmit({
|
|
||||||
...params,
|
|
||||||
input: '',
|
|
||||||
mode: 'prompt',
|
|
||||||
pastedContents: {},
|
|
||||||
abortController,
|
|
||||||
streamMode: 'normal' as any,
|
|
||||||
hasInterruptibleToolInProgress: false,
|
|
||||||
isExternalLoading: false,
|
|
||||||
queuedCommands: [command!],
|
|
||||||
})
|
|
||||||
|
|
||||||
expect(params.getToolUseContext).not.toHaveBeenCalled()
|
|
||||||
expect(params.onQuery).not.toHaveBeenCalled()
|
|
||||||
})
|
|
||||||
})
|
})
|
||||||
|
|||||||
@@ -1,337 +0,0 @@
|
|||||||
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
|
|
||||||
import { randomUUID } from 'crypto'
|
|
||||||
import {
|
|
||||||
resetStateForTests,
|
|
||||||
setCwdState,
|
|
||||||
setOriginalCwd,
|
|
||||||
setProjectRoot,
|
|
||||||
} from '../bootstrap/state'
|
|
||||||
import { query } from '../query'
|
|
||||||
import { getEmptyToolPermissionContext } from '../Tool'
|
|
||||||
import type { AssistantMessage } from '../types/message'
|
|
||||||
import { asSystemPrompt } from '../utils/systemPromptType'
|
|
||||||
import {
|
|
||||||
createAssistantAPIErrorMessage,
|
|
||||||
createUserMessage,
|
|
||||||
} from '../utils/messages'
|
|
||||||
import { cleanupTempDir, createTempDir } from '../../tests/mocks/file-system'
|
|
||||||
import {
|
|
||||||
enqueue,
|
|
||||||
getCommandsByMaxPriority,
|
|
||||||
resetCommandQueue,
|
|
||||||
} from '../utils/messageQueueManager'
|
|
||||||
import { getAutonomyFlowById, listAutonomyFlows } from '../utils/autonomyFlows'
|
|
||||||
import {
|
|
||||||
getAutonomyRunById,
|
|
||||||
startManagedAutonomyFlowFromHeartbeatTask,
|
|
||||||
} from '../utils/autonomyRuns'
|
|
||||||
|
|
||||||
let tempDir = ''
|
|
||||||
let originalProcessCwd = ''
|
|
||||||
|
|
||||||
beforeEach(async () => {
|
|
||||||
originalProcessCwd = process.cwd()
|
|
||||||
tempDir = await createTempDir('query-autonomy-provider-boundary-')
|
|
||||||
resetStateForTests()
|
|
||||||
resetCommandQueue()
|
|
||||||
setOriginalCwd(tempDir)
|
|
||||||
setCwdState(tempDir)
|
|
||||||
setProjectRoot(tempDir)
|
|
||||||
})
|
|
||||||
|
|
||||||
afterEach(async () => {
|
|
||||||
resetStateForTests()
|
|
||||||
resetCommandQueue()
|
|
||||||
if (originalProcessCwd) {
|
|
||||||
process.chdir(originalProcessCwd)
|
|
||||||
}
|
|
||||||
if (tempDir) {
|
|
||||||
let lastError: unknown
|
|
||||||
for (let attempt = 0; attempt < 20; attempt++) {
|
|
||||||
try {
|
|
||||||
await cleanupTempDir(tempDir)
|
|
||||||
lastError = undefined
|
|
||||||
break
|
|
||||||
} catch (error) {
|
|
||||||
lastError = error
|
|
||||||
await new Promise(resolve => setTimeout(resolve, 100))
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if (lastError) {
|
|
||||||
throw lastError
|
|
||||||
}
|
|
||||||
}
|
|
||||||
})
|
|
||||||
|
|
||||||
function createToolUseAssistantMessage(): AssistantMessage {
|
|
||||||
return {
|
|
||||||
type: 'assistant',
|
|
||||||
uuid: randomUUID(),
|
|
||||||
timestamp: new Date().toISOString(),
|
|
||||||
requestId: undefined,
|
|
||||||
message: {
|
|
||||||
id: 'msg_tool_use',
|
|
||||||
type: 'message',
|
|
||||||
role: 'assistant',
|
|
||||||
model: 'test-model',
|
|
||||||
stop_reason: 'tool_use',
|
|
||||||
stop_sequence: null,
|
|
||||||
usage: {
|
|
||||||
input_tokens: 1,
|
|
||||||
output_tokens: 1,
|
|
||||||
cache_creation_input_tokens: 0,
|
|
||||||
cache_read_input_tokens: 0,
|
|
||||||
},
|
|
||||||
content: [
|
|
||||||
{
|
|
||||||
type: 'tool_use',
|
|
||||||
id: 'toolu_provider_boundary',
|
|
||||||
name: 'MissingBoundaryTool',
|
|
||||||
input: {},
|
|
||||||
},
|
|
||||||
],
|
|
||||||
},
|
|
||||||
} as unknown as AssistantMessage
|
|
||||||
}
|
|
||||||
|
|
||||||
function createToolUseContext(): any {
|
|
||||||
let inProgressToolUseIds = new Set<string>()
|
|
||||||
let responseLength = 0
|
|
||||||
let appState = {
|
|
||||||
toolPermissionContext: getEmptyToolPermissionContext(),
|
|
||||||
fastMode: false,
|
|
||||||
mcp: {
|
|
||||||
tools: [],
|
|
||||||
clients: [],
|
|
||||||
},
|
|
||||||
effortValue: undefined,
|
|
||||||
advisorModel: undefined,
|
|
||||||
sessionHooks: new Map(),
|
|
||||||
}
|
|
||||||
|
|
||||||
return {
|
|
||||||
options: {
|
|
||||||
commands: [],
|
|
||||||
debug: false,
|
|
||||||
mainLoopModel: 'claude-sonnet-4-5-20250929',
|
|
||||||
tools: [],
|
|
||||||
verbose: false,
|
|
||||||
thinkingConfig: { type: 'disabled' },
|
|
||||||
mcpClients: [],
|
|
||||||
mcpResources: {},
|
|
||||||
isNonInteractiveSession: true,
|
|
||||||
agentDefinitions: {
|
|
||||||
activeAgents: [],
|
|
||||||
allowedAgentTypes: [],
|
|
||||||
},
|
|
||||||
},
|
|
||||||
abortController: new AbortController(),
|
|
||||||
readFileState: new Map(),
|
|
||||||
getAppState: () => appState,
|
|
||||||
setAppState: (updater: (state: any) => any) => {
|
|
||||||
appState = updater(appState as never)
|
|
||||||
},
|
|
||||||
setInProgressToolUseIDs: (updater: (state: Set<string>) => Set<string>) => {
|
|
||||||
inProgressToolUseIds = updater(inProgressToolUseIds)
|
|
||||||
},
|
|
||||||
setResponseLength: (updater: (state: number) => number) => {
|
|
||||||
responseLength = updater(responseLength)
|
|
||||||
},
|
|
||||||
updateFileHistoryState: () => {},
|
|
||||||
updateAttributionState: () => {},
|
|
||||||
messages: [],
|
|
||||||
} as any
|
|
||||||
}
|
|
||||||
|
|
||||||
describe('query autonomy/provider boundary', () => {
|
|
||||||
test('provider api-error messages fail a consumed autonomy run instead of advancing the flow', async () => {
|
|
||||||
const previousDisableAttachments =
|
|
||||||
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
|
|
||||||
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = '1'
|
|
||||||
try {
|
|
||||||
const command = await startManagedAutonomyFlowFromHeartbeatTask({
|
|
||||||
task: {
|
|
||||||
name: 'provider-boundary',
|
|
||||||
interval: '1h',
|
|
||||||
prompt: 'Exercise provider boundary',
|
|
||||||
steps: [
|
|
||||||
{ name: 'first', prompt: 'First provider-boundary step' },
|
|
||||||
{ name: 'second', prompt: 'Second provider-boundary step' },
|
|
||||||
],
|
|
||||||
},
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
priority: 'next',
|
|
||||||
})
|
|
||||||
expect(command).not.toBeNull()
|
|
||||||
enqueue(command!)
|
|
||||||
|
|
||||||
const toolUseContext = createToolUseContext()
|
|
||||||
|
|
||||||
let callCount = 0
|
|
||||||
const deps = {
|
|
||||||
uuid: () => 'query-chain-id',
|
|
||||||
microcompact: async (messages: unknown[]) => ({ messages }),
|
|
||||||
autocompact: async () => ({
|
|
||||||
compactionResult: undefined,
|
|
||||||
consecutiveFailures: 0,
|
|
||||||
}),
|
|
||||||
callModel: async function* () {
|
|
||||||
callCount += 1
|
|
||||||
if (callCount === 1) {
|
|
||||||
yield createToolUseAssistantMessage()
|
|
||||||
return
|
|
||||||
}
|
|
||||||
yield createAssistantAPIErrorMessage({
|
|
||||||
content: 'API Error: provider unavailable',
|
|
||||||
apiError: 'api_error',
|
|
||||||
error: new Error('provider unavailable') as never,
|
|
||||||
})
|
|
||||||
},
|
|
||||||
}
|
|
||||||
|
|
||||||
const emitted: any[] = []
|
|
||||||
const generator = query({
|
|
||||||
messages: [
|
|
||||||
createUserMessage({
|
|
||||||
content: 'start provider-boundary test',
|
|
||||||
}),
|
|
||||||
],
|
|
||||||
systemPrompt: asSystemPrompt([]),
|
|
||||||
userContext: {},
|
|
||||||
systemContext: {},
|
|
||||||
canUseTool: async (_tool, input) => ({
|
|
||||||
behavior: 'allow',
|
|
||||||
updatedInput: input,
|
|
||||||
}),
|
|
||||||
toolUseContext,
|
|
||||||
querySource: 'sdk',
|
|
||||||
maxTurns: 3,
|
|
||||||
deps: deps as never,
|
|
||||||
})
|
|
||||||
let next = await generator.next()
|
|
||||||
while (!next.done) {
|
|
||||||
emitted.push(next.value)
|
|
||||||
next = await generator.next()
|
|
||||||
}
|
|
||||||
|
|
||||||
const [flow] = await listAutonomyFlows(tempDir)
|
|
||||||
const finalFlow = await getAutonomyFlowById(flow!.flowId, tempDir)
|
|
||||||
const run = await getAutonomyRunById(command!.autonomy!.runId, tempDir)
|
|
||||||
|
|
||||||
expect(next.value.reason).toBe('model_error')
|
|
||||||
expect(callCount).toBe(2)
|
|
||||||
expect(
|
|
||||||
emitted.some(
|
|
||||||
message =>
|
|
||||||
message.type === 'attachment' &&
|
|
||||||
message.attachment.type === 'queued_command',
|
|
||||||
),
|
|
||||||
).toBe(true)
|
|
||||||
expect(run!.status).toBe('failed')
|
|
||||||
expect(run!.error).toBe('provider api_error')
|
|
||||||
expect(finalFlow!.status).toBe('failed')
|
|
||||||
expect(finalFlow!.stateJson!.steps.map(step => step.status)).toEqual([
|
|
||||||
'failed',
|
|
||||||
'pending',
|
|
||||||
])
|
|
||||||
expect(getCommandsByMaxPriority('later')).toHaveLength(0)
|
|
||||||
} finally {
|
|
||||||
if (previousDisableAttachments === undefined) {
|
|
||||||
delete process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
|
|
||||||
} else {
|
|
||||||
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = previousDisableAttachments
|
|
||||||
}
|
|
||||||
}
|
|
||||||
})
|
|
||||||
|
|
||||||
test('generator return cancels a consumed autonomy run instead of leaving it running', async () => {
|
|
||||||
const previousDisableAttachments =
|
|
||||||
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
|
|
||||||
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = '1'
|
|
||||||
try {
|
|
||||||
const command = await startManagedAutonomyFlowFromHeartbeatTask({
|
|
||||||
task: {
|
|
||||||
name: 'return-boundary',
|
|
||||||
interval: '1h',
|
|
||||||
prompt: 'Exercise generator return boundary',
|
|
||||||
steps: [
|
|
||||||
{ name: 'first', prompt: 'First return-boundary step' },
|
|
||||||
{ name: 'second', prompt: 'Second return-boundary step' },
|
|
||||||
],
|
|
||||||
},
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
priority: 'next',
|
|
||||||
})
|
|
||||||
expect(command).not.toBeNull()
|
|
||||||
enqueue(command!)
|
|
||||||
|
|
||||||
const toolUseContext = createToolUseContext()
|
|
||||||
const deps = {
|
|
||||||
uuid: () => 'query-chain-id',
|
|
||||||
microcompact: async (messages: unknown[]) => ({ messages }),
|
|
||||||
autocompact: async () => ({
|
|
||||||
compactionResult: undefined,
|
|
||||||
consecutiveFailures: 0,
|
|
||||||
}),
|
|
||||||
callModel: async function* () {
|
|
||||||
yield createToolUseAssistantMessage()
|
|
||||||
},
|
|
||||||
}
|
|
||||||
|
|
||||||
const generator = query({
|
|
||||||
messages: [
|
|
||||||
createUserMessage({
|
|
||||||
content: 'start return-boundary test',
|
|
||||||
}),
|
|
||||||
],
|
|
||||||
systemPrompt: asSystemPrompt([]),
|
|
||||||
userContext: {},
|
|
||||||
systemContext: {},
|
|
||||||
canUseTool: async (_tool, input) => ({
|
|
||||||
behavior: 'allow',
|
|
||||||
updatedInput: input,
|
|
||||||
}),
|
|
||||||
toolUseContext,
|
|
||||||
querySource: 'sdk',
|
|
||||||
maxTurns: 3,
|
|
||||||
deps: deps as never,
|
|
||||||
})
|
|
||||||
|
|
||||||
let sawQueuedAttachment = false
|
|
||||||
let next = await generator.next()
|
|
||||||
while (!next.done) {
|
|
||||||
const message = next.value as any
|
|
||||||
if (
|
|
||||||
message.type === 'attachment' &&
|
|
||||||
message.attachment.type === 'queued_command'
|
|
||||||
) {
|
|
||||||
sawQueuedAttachment = true
|
|
||||||
await generator.return(undefined as never)
|
|
||||||
break
|
|
||||||
}
|
|
||||||
next = await generator.next()
|
|
||||||
}
|
|
||||||
|
|
||||||
const [flow] = await listAutonomyFlows(tempDir)
|
|
||||||
const finalFlow = await getAutonomyFlowById(flow!.flowId, tempDir)
|
|
||||||
const run = await getAutonomyRunById(command!.autonomy!.runId, tempDir)
|
|
||||||
|
|
||||||
expect(sawQueuedAttachment).toBe(true)
|
|
||||||
expect(run!.status).toBe('cancelled')
|
|
||||||
expect(finalFlow!.status).toBe('cancelled')
|
|
||||||
expect(finalFlow!.stateJson!.steps.map(step => step.status)).toEqual([
|
|
||||||
'cancelled',
|
|
||||||
'cancelled',
|
|
||||||
])
|
|
||||||
expect(getCommandsByMaxPriority('later')).toHaveLength(0)
|
|
||||||
} finally {
|
|
||||||
if (previousDisableAttachments === undefined) {
|
|
||||||
delete process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
|
|
||||||
} else {
|
|
||||||
process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = previousDisableAttachments
|
|
||||||
}
|
|
||||||
}
|
|
||||||
})
|
|
||||||
})
|
|
||||||
246
src/cli/print.ts
@@ -321,15 +321,16 @@ import {
|
|||||||
} from 'src/utils/queryProfiler.js'
|
} from 'src/utils/queryProfiler.js'
|
||||||
import { asSessionId } from 'src/types/ids.js'
|
import { asSessionId } from 'src/types/ids.js'
|
||||||
import {
|
import {
|
||||||
createAutonomyQueuedPromptIfNoActiveSource,
|
commitAutonomyQueuedPrompt,
|
||||||
|
createAutonomyQueuedPrompt,
|
||||||
createProactiveAutonomyCommands,
|
createProactiveAutonomyCommands,
|
||||||
|
finalizeAutonomyRunCompleted,
|
||||||
|
finalizeAutonomyRunFailed,
|
||||||
|
markAutonomyRunCompleted,
|
||||||
markAutonomyRunFailed,
|
markAutonomyRunFailed,
|
||||||
|
markAutonomyRunRunning,
|
||||||
} from 'src/utils/autonomyRuns.js'
|
} from 'src/utils/autonomyRuns.js'
|
||||||
import {
|
import { prepareAutonomyTurnPrompt } from 'src/utils/autonomyAuthority.js'
|
||||||
cancelQueuedAutonomyCommands,
|
|
||||||
claimConsumableQueuedAutonomyCommands,
|
|
||||||
finalizeAutonomyCommandsForTurn,
|
|
||||||
} from 'src/utils/autonomyQueueLifecycle.js'
|
|
||||||
import { jsonStringify } from '../utils/slowOperations.js'
|
import { jsonStringify } from '../utils/slowOperations.js'
|
||||||
import { skillChangeDetector } from '../utils/skills/skillChangeDetector.js'
|
import { skillChangeDetector } from '../utils/skills/skillChangeDetector.js'
|
||||||
import { getCommands, clearCommandsCache } from '../commands.js'
|
import { getCommands, clearCommandsCache } from '../commands.js'
|
||||||
@@ -1864,26 +1865,17 @@ function runHeadlessStreaming(
|
|||||||
currentDir: cwd(),
|
currentDir: cwd(),
|
||||||
shouldCreate: () => !inputClosed,
|
shouldCreate: () => !inputClosed,
|
||||||
})
|
})
|
||||||
if (inputClosed) {
|
|
||||||
await cancelQueuedAutonomyCommands({ commands })
|
|
||||||
return
|
|
||||||
}
|
|
||||||
for (const command of commands) {
|
for (const command of commands) {
|
||||||
|
if (inputClosed) {
|
||||||
|
return
|
||||||
|
}
|
||||||
enqueue({
|
enqueue({
|
||||||
...command,
|
...command,
|
||||||
uuid: randomUUID(),
|
uuid: randomUUID(),
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
void run()
|
void run()
|
||||||
})().catch(error => {
|
})()
|
||||||
logError(error)
|
|
||||||
logForDebugging(
|
|
||||||
`[Proactive] failed to create headless tick: ${error}`,
|
|
||||||
{
|
|
||||||
level: 'error',
|
|
||||||
},
|
|
||||||
)
|
|
||||||
})
|
|
||||||
}, 0)
|
}, 0)
|
||||||
}
|
}
|
||||||
: undefined
|
: undefined
|
||||||
@@ -1979,24 +1971,17 @@ function runHeadlessStreaming(
|
|||||||
// Non-prompt commands (task-notification, orphaned-permission) carry
|
// Non-prompt commands (task-notification, orphaned-permission) carry
|
||||||
// side effects or orphanedPermission state, so they process singly.
|
// side effects or orphanedPermission state, so they process singly.
|
||||||
// Prompt commands greedily collect followers with matching workload.
|
// Prompt commands greedily collect followers with matching workload.
|
||||||
let batch: QueuedCommand[] = [command]
|
const batch: QueuedCommand[] = [command]
|
||||||
if (command.mode === 'prompt') {
|
if (command.mode === 'prompt') {
|
||||||
while (canBatchWith(command, peek(isMainThread))) {
|
while (canBatchWith(command, peek(isMainThread))) {
|
||||||
batch.push(dequeue(isMainThread)!)
|
batch.push(dequeue(isMainThread)!)
|
||||||
}
|
}
|
||||||
}
|
if (batch.length > 1) {
|
||||||
const queuedAutonomyClaim =
|
command = {
|
||||||
await claimConsumableQueuedAutonomyCommands(batch)
|
...command,
|
||||||
batch = queuedAutonomyClaim.attachmentCommands
|
value: joinPromptValues(batch.map(c => c.value)),
|
||||||
if (batch.length === 0) {
|
uuid: batch.findLast(c => c.uuid)?.uuid ?? command.uuid,
|
||||||
continue
|
}
|
||||||
}
|
|
||||||
command = batch[0]!
|
|
||||||
if (command.mode === 'prompt' && batch.length > 1) {
|
|
||||||
command = {
|
|
||||||
...command,
|
|
||||||
value: joinPromptValues(batch.map(c => c.value)),
|
|
||||||
uuid: batch.findLast(c => c.uuid)?.uuid ?? command.uuid,
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
const batchUuids = batch.map(c => c.uuid).filter(u => u !== undefined)
|
const batchUuids = batch.map(c => c.uuid).filter(u => u !== undefined)
|
||||||
@@ -2135,7 +2120,9 @@ function runHeadlessStreaming(
|
|||||||
}
|
}
|
||||||
|
|
||||||
const input = command.value
|
const input = command.value
|
||||||
const claimedAutonomyCommands = queuedAutonomyClaim.claimedCommands
|
const autonomyRunIds = batch
|
||||||
|
.map(item => item.autonomy?.runId)
|
||||||
|
.filter((runId): runId is string => Boolean(runId))
|
||||||
|
|
||||||
if (structuredIO instanceof RemoteIO && command.mode === 'prompt') {
|
if (structuredIO instanceof RemoteIO && command.mode === 'prompt') {
|
||||||
logEvent('tengu_bridge_message_received', {
|
logEvent('tengu_bridge_message_received', {
|
||||||
@@ -2185,6 +2172,9 @@ function runHeadlessStreaming(
|
|||||||
// const-capture: TS loses `while ((command = dequeue()))` narrowing
|
// const-capture: TS loses `while ((command = dequeue()))` narrowing
|
||||||
// inside the closure.
|
// inside the closure.
|
||||||
const cmd = command
|
const cmd = command
|
||||||
|
for (const runId of autonomyRunIds) {
|
||||||
|
await markAutonomyRunRunning(runId)
|
||||||
|
}
|
||||||
let lastResultIsError = false
|
let lastResultIsError = false
|
||||||
try {
|
try {
|
||||||
await runWithWorkload(
|
await runWithWorkload(
|
||||||
@@ -2296,39 +2286,35 @@ function runHeadlessStreaming(
|
|||||||
},
|
},
|
||||||
) // end runWithWorkload
|
) // end runWithWorkload
|
||||||
if (lastResultIsError) {
|
if (lastResultIsError) {
|
||||||
await finalizeAutonomyCommandsForTurn({
|
for (const runId of autonomyRunIds) {
|
||||||
commands: claimedAutonomyCommands,
|
await finalizeAutonomyRunFailed({
|
||||||
outcome: {
|
runId,
|
||||||
type: 'failed',
|
error: 'ask() returned an error result',
|
||||||
message: 'ask() returned an error result',
|
|
||||||
},
|
|
||||||
currentDir: cwd(),
|
|
||||||
priority: 'later',
|
|
||||||
workload: cmd.workload ?? options.workload,
|
|
||||||
})
|
|
||||||
} else {
|
|
||||||
const nextCommands = await finalizeAutonomyCommandsForTurn({
|
|
||||||
commands: claimedAutonomyCommands,
|
|
||||||
outcome: { type: 'completed' },
|
|
||||||
currentDir: cwd(),
|
|
||||||
priority: 'later',
|
|
||||||
workload: cmd.workload ?? options.workload,
|
|
||||||
})
|
|
||||||
for (const nextCommand of nextCommands) {
|
|
||||||
enqueue({
|
|
||||||
...nextCommand,
|
|
||||||
uuid: randomUUID(),
|
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
|
} else {
|
||||||
|
for (const runId of autonomyRunIds) {
|
||||||
|
const nextCommands = await finalizeAutonomyRunCompleted({
|
||||||
|
runId,
|
||||||
|
currentDir: cwd(),
|
||||||
|
priority: 'later',
|
||||||
|
workload: cmd.workload ?? options.workload,
|
||||||
|
})
|
||||||
|
for (const nextCommand of nextCommands) {
|
||||||
|
enqueue({
|
||||||
|
...nextCommand,
|
||||||
|
uuid: randomUUID(),
|
||||||
|
})
|
||||||
|
}
|
||||||
|
}
|
||||||
}
|
}
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
await finalizeAutonomyCommandsForTurn({
|
for (const runId of autonomyRunIds) {
|
||||||
commands: claimedAutonomyCommands,
|
await finalizeAutonomyRunFailed({
|
||||||
outcome: { type: 'failed', error },
|
runId,
|
||||||
currentDir: cwd(),
|
error: String(error),
|
||||||
priority: 'later',
|
})
|
||||||
workload: cmd.workload ?? options.workload,
|
}
|
||||||
})
|
|
||||||
throw error
|
throw error
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -2819,90 +2805,72 @@ function runHeadlessStreaming(
|
|||||||
let cronScheduler: import('../utils/cronScheduler.js').CronScheduler | null =
|
let cronScheduler: import('../utils/cronScheduler.js').CronScheduler | null =
|
||||||
null
|
null
|
||||||
if (cronGate.isKairosCronEnabled()) {
|
if (cronGate.isKairosCronEnabled()) {
|
||||||
// Shared dedup-claim → input-close-recheck → onSuccess pipeline for the
|
|
||||||
// three cron entry points (legacy onFire, onFireTask agent, onFireTask
|
|
||||||
// non-agent). Centralizing the cancel-on-late-shutdown contract here keeps
|
|
||||||
// the three branches from drifting on what happens between claim and
|
|
||||||
// dispatch. onSuccess receives the claimed QueuedCommand and decides
|
|
||||||
// whether to enqueue it (normal path) or mark the run failed (agent path).
|
|
||||||
const dispatchHeadlessCronCommand = (params: {
|
|
||||||
basePrompt: string
|
|
||||||
sourceId: string
|
|
||||||
sourceLabel: string
|
|
||||||
logSuffix: string
|
|
||||||
onSuccess: (command: QueuedCommand) => void | Promise<void>
|
|
||||||
}): void => {
|
|
||||||
if (inputClosed) return
|
|
||||||
void (async () => {
|
|
||||||
const command = await createAutonomyQueuedPromptIfNoActiveSource({
|
|
||||||
basePrompt: params.basePrompt,
|
|
||||||
trigger: 'scheduled-task',
|
|
||||||
currentDir: cwd(),
|
|
||||||
sourceId: params.sourceId,
|
|
||||||
sourceLabel: params.sourceLabel,
|
|
||||||
workload: WORKLOAD_CRON,
|
|
||||||
shouldCreate: () => !inputClosed,
|
|
||||||
})
|
|
||||||
if (!command) return
|
|
||||||
if (inputClosed) {
|
|
||||||
await cancelQueuedAutonomyCommands({ commands: [command] })
|
|
||||||
return
|
|
||||||
}
|
|
||||||
await params.onSuccess(command)
|
|
||||||
})().catch(error => {
|
|
||||||
logError(error)
|
|
||||||
logForDebugging(
|
|
||||||
`[ScheduledTasks] failed to enqueue headless task${params.logSuffix}: ${error}`,
|
|
||||||
{ level: 'error' },
|
|
||||||
)
|
|
||||||
})
|
|
||||||
}
|
|
||||||
|
|
||||||
const enqueueAndRun = (command: QueuedCommand): void => {
|
|
||||||
enqueue({
|
|
||||||
...command,
|
|
||||||
uuid: randomUUID(),
|
|
||||||
})
|
|
||||||
void run()
|
|
||||||
}
|
|
||||||
|
|
||||||
cronScheduler = cronSchedulerModule.createCronScheduler({
|
cronScheduler = cronSchedulerModule.createCronScheduler({
|
||||||
onFire: prompt => {
|
onFire: prompt => {
|
||||||
// Legacy KAIROS-style entries: the prompt text is what uniquely
|
if (inputClosed) return
|
||||||
// identifies the cron entry, so it doubles as both source id and
|
void (async () => {
|
||||||
// source label for dedup.
|
const prepared = await prepareAutonomyTurnPrompt({
|
||||||
dispatchHeadlessCronCommand({
|
basePrompt: prompt,
|
||||||
basePrompt: prompt,
|
trigger: 'scheduled-task',
|
||||||
sourceId: prompt,
|
currentDir: cwd(),
|
||||||
sourceLabel: prompt,
|
})
|
||||||
logSuffix: '',
|
if (inputClosed) return
|
||||||
onSuccess: enqueueAndRun,
|
const command = await commitAutonomyQueuedPrompt({
|
||||||
})
|
prepared,
|
||||||
|
currentDir: cwd(),
|
||||||
|
workload: WORKLOAD_CRON,
|
||||||
|
})
|
||||||
|
if (inputClosed) return
|
||||||
|
enqueue({
|
||||||
|
...command,
|
||||||
|
uuid: randomUUID(),
|
||||||
|
})
|
||||||
|
void run()
|
||||||
|
})()
|
||||||
},
|
},
|
||||||
onFireTask: task => {
|
onFireTask: task => {
|
||||||
if (task.agentId) {
|
if (inputClosed) return
|
||||||
dispatchHeadlessCronCommand({
|
void (async () => {
|
||||||
|
if (task.agentId) {
|
||||||
|
const prepared = await prepareAutonomyTurnPrompt({
|
||||||
|
basePrompt: task.prompt,
|
||||||
|
trigger: 'scheduled-task',
|
||||||
|
currentDir: cwd(),
|
||||||
|
})
|
||||||
|
if (inputClosed) return
|
||||||
|
const command = await commitAutonomyQueuedPrompt({
|
||||||
|
prepared,
|
||||||
|
currentDir: cwd(),
|
||||||
|
sourceId: task.id,
|
||||||
|
sourceLabel: task.prompt,
|
||||||
|
workload: WORKLOAD_CRON,
|
||||||
|
})
|
||||||
|
await markAutonomyRunFailed(
|
||||||
|
command.autonomy!.runId,
|
||||||
|
`No teammate runtime available for scheduled task owner ${task.agentId} in headless mode.`,
|
||||||
|
)
|
||||||
|
return
|
||||||
|
}
|
||||||
|
const prepared = await prepareAutonomyTurnPrompt({
|
||||||
basePrompt: task.prompt,
|
basePrompt: task.prompt,
|
||||||
|
trigger: 'scheduled-task',
|
||||||
|
currentDir: cwd(),
|
||||||
|
})
|
||||||
|
if (inputClosed) return
|
||||||
|
const command = await commitAutonomyQueuedPrompt({
|
||||||
|
prepared,
|
||||||
|
currentDir: cwd(),
|
||||||
sourceId: task.id,
|
sourceId: task.id,
|
||||||
sourceLabel: task.prompt,
|
sourceLabel: task.prompt,
|
||||||
logSuffix: ` ${task.id}`,
|
workload: WORKLOAD_CRON,
|
||||||
onSuccess: async command => {
|
|
||||||
await markAutonomyRunFailed(
|
|
||||||
command.autonomy!.runId,
|
|
||||||
`No teammate runtime available for scheduled task owner ${task.agentId} in headless mode.`,
|
|
||||||
command.autonomy!.rootDir,
|
|
||||||
)
|
|
||||||
},
|
|
||||||
})
|
})
|
||||||
return
|
if (inputClosed) return
|
||||||
}
|
enqueue({
|
||||||
dispatchHeadlessCronCommand({
|
...command,
|
||||||
basePrompt: task.prompt,
|
uuid: randomUUID(),
|
||||||
sourceId: task.id,
|
})
|
||||||
sourceLabel: task.prompt,
|
void run()
|
||||||
logSuffix: ` ${task.id}`,
|
})()
|
||||||
onSuccess: enqueueAndRun,
|
|
||||||
})
|
|
||||||
},
|
},
|
||||||
isLoading: () => running || inputClosed,
|
isLoading: () => running || inputClosed,
|
||||||
getJitterConfig: cronJitterConfigModule?.getCronJitterConfig,
|
getJitterConfig: cronJitterConfigModule?.getCronJitterConfig,
|
||||||
|
|||||||
@@ -1,5 +1,5 @@
|
|||||||
import type { Command } from '../../commands.js'
|
import type { Command } from '../../commands.js'
|
||||||
import { isSkillLearningCompiledIn } from '../../services/skillLearning/featureCheck.js'
|
import { isSkillLearningEnabled } from '../../services/skillLearning/featureCheck.js'
|
||||||
|
|
||||||
const skillLearning = {
|
const skillLearning = {
|
||||||
type: 'local-jsx',
|
type: 'local-jsx',
|
||||||
@@ -7,10 +7,7 @@ const skillLearning = {
|
|||||||
description: 'Manage skill learning (observe, analyze, evolve)',
|
description: 'Manage skill learning (observe, analyze, evolve)',
|
||||||
argumentHint:
|
argumentHint:
|
||||||
'[start|stop|about|status|ingest|evolve|export|import|prune|promote|projects]',
|
'[start|stop|about|status|ingest|evolve|export|import|prune|promote|projects]',
|
||||||
// The slash command is visible whenever the subsystem is compiled in.
|
isEnabled: () => isSkillLearningEnabled(),
|
||||||
// Whether the runtime feature is actually doing work is a separate
|
|
||||||
// concern controlled by `/skill-learning start` (see featureCheck.ts).
|
|
||||||
isEnabled: () => isSkillLearningCompiledIn(),
|
|
||||||
isHidden: false,
|
isHidden: false,
|
||||||
load: () => import('./skillPanel.js'),
|
load: () => import('./skillPanel.js'),
|
||||||
} satisfies Command
|
} satisfies Command
|
||||||
|
|||||||
@@ -1,14 +1,10 @@
|
|||||||
import type { Command } from '../../commands.js'
|
import type { Command } from '../../commands.js'
|
||||||
import { isSkillSearchCompiledIn } from '../../services/skillSearch/featureCheck.js'
|
|
||||||
|
|
||||||
const skillSearch = {
|
const skillSearch = {
|
||||||
type: 'local-jsx',
|
type: 'local-jsx',
|
||||||
name: 'skill-search',
|
name: 'skill-search',
|
||||||
description: 'Control automatic skill matching during conversations',
|
description: 'Control automatic skill matching during conversations',
|
||||||
argumentHint: '[start|stop|about|status]',
|
argumentHint: '[start|stop|about|status]',
|
||||||
// Visible whenever the subsystem is compiled in (build flag); runtime
|
|
||||||
// activation is separate and operator-controlled via /skill-search start.
|
|
||||||
isEnabled: () => isSkillSearchCompiledIn(),
|
|
||||||
isHidden: false,
|
isHidden: false,
|
||||||
load: () => import('./skillSearchPanel.js'),
|
load: () => import('./skillSearchPanel.js'),
|
||||||
} satisfies Command
|
} satisfies Command
|
||||||
|
|||||||
@@ -1,7 +1,6 @@
|
|||||||
import { extname } from 'path'
|
import { extname } from 'path'
|
||||||
import React, { Suspense, use, useMemo } from 'react'
|
import React, { Suspense, use, useMemo } from 'react'
|
||||||
import { Ansi, Text } from '@anthropic/ink'
|
import { Ansi, Text } from '@anthropic/ink'
|
||||||
import { LRUCache } from 'lru-cache'
|
|
||||||
import { getCliHighlightPromise } from '../../utils/cliHighlight.js'
|
import { getCliHighlightPromise } from '../../utils/cliHighlight.js'
|
||||||
import { logForDebugging } from '../../utils/debug.js'
|
import { logForDebugging } from '../../utils/debug.js'
|
||||||
import { convertLeadingTabsToSpaces } from '../../utils/file.js'
|
import { convertLeadingTabsToSpaces } from '../../utils/file.js'
|
||||||
@@ -17,7 +16,8 @@ type Props = {
|
|||||||
// Module-level highlight cache — hl.highlight() is the hot cost on virtual-
|
// Module-level highlight cache — hl.highlight() is the hot cost on virtual-
|
||||||
// scroll remounts. useMemo doesn't survive unmount→remount. Keyed by hash
|
// scroll remounts. useMemo doesn't survive unmount→remount. Keyed by hash
|
||||||
// of code+language to avoid retaining full source strings (#24180 RSS fix).
|
// of code+language to avoid retaining full source strings (#24180 RSS fix).
|
||||||
const hlCache = new LRUCache<string, string>({ max: 500 })
|
const HL_CACHE_MAX = 500
|
||||||
|
const hlCache = new Map<string, string>()
|
||||||
function cachedHighlight(
|
function cachedHighlight(
|
||||||
hl: NonNullable<Awaited<ReturnType<typeof getCliHighlightPromise>>>,
|
hl: NonNullable<Awaited<ReturnType<typeof getCliHighlightPromise>>>,
|
||||||
code: string,
|
code: string,
|
||||||
@@ -25,8 +25,16 @@ function cachedHighlight(
|
|||||||
): string {
|
): string {
|
||||||
const key = hashPair(language, code)
|
const key = hashPair(language, code)
|
||||||
const hit = hlCache.get(key)
|
const hit = hlCache.get(key)
|
||||||
if (hit !== undefined) return hit
|
if (hit !== undefined) {
|
||||||
|
hlCache.delete(key)
|
||||||
|
hlCache.set(key, hit)
|
||||||
|
return hit
|
||||||
|
}
|
||||||
const out = hl.highlight(code, { language })
|
const out = hl.highlight(code, { language })
|
||||||
|
if (hlCache.size >= HL_CACHE_MAX) {
|
||||||
|
const first = hlCache.keys().next().value
|
||||||
|
if (first !== undefined) hlCache.delete(first)
|
||||||
|
}
|
||||||
hlCache.set(key, out)
|
hlCache.set(key, out)
|
||||||
return out
|
return out
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -1,6 +1,5 @@
|
|||||||
import { marked, type Token, type Tokens } from 'marked'
|
import { marked, type Token, type Tokens } from 'marked'
|
||||||
import React, { Suspense, use, useMemo, useRef } from 'react'
|
import React, { Suspense, use, useMemo, useRef } from 'react'
|
||||||
import { LRUCache } from 'lru-cache'
|
|
||||||
import { useSettings } from '../hooks/useSettings.js'
|
import { useSettings } from '../hooks/useSettings.js'
|
||||||
import { Ansi, Box, useTheme } from '@anthropic/ink'
|
import { Ansi, Box, useTheme } from '@anthropic/ink'
|
||||||
import {
|
import {
|
||||||
@@ -23,7 +22,8 @@ type Props = {
|
|||||||
// scrolling back to a previously-visible message re-parses. Messages are
|
// scrolling back to a previously-visible message re-parses. Messages are
|
||||||
// immutable in history; same content → same tokens. Keyed by hash to avoid
|
// immutable in history; same content → same tokens. Keyed by hash to avoid
|
||||||
// retaining full content strings (turn50→turn99 RSS regression, #24180).
|
// retaining full content strings (turn50→turn99 RSS regression, #24180).
|
||||||
const tokenCache = new LRUCache<string, Token[]>({ max: 500 })
|
const TOKEN_CACHE_MAX = 500
|
||||||
|
const tokenCache = new Map<string, Token[]>()
|
||||||
|
|
||||||
// Characters that indicate markdown syntax. If none are present, skip the
|
// Characters that indicate markdown syntax. If none are present, skip the
|
||||||
// ~3ms marked.lexer call entirely — render as a single paragraph. Covers
|
// ~3ms marked.lexer call entirely — render as a single paragraph. Covers
|
||||||
@@ -55,8 +55,19 @@ function cachedLexer(content: string): Token[] {
|
|||||||
}
|
}
|
||||||
const key = hashContent(content)
|
const key = hashContent(content)
|
||||||
const hit = tokenCache.get(key)
|
const hit = tokenCache.get(key)
|
||||||
if (hit) return hit
|
if (hit) {
|
||||||
|
// Promote to MRU — without this the eviction is FIFO (scrolling back to
|
||||||
|
// an early message evicts the very item you're looking at).
|
||||||
|
tokenCache.delete(key)
|
||||||
|
tokenCache.set(key, hit)
|
||||||
|
return hit
|
||||||
|
}
|
||||||
const tokens = marked.lexer(content)
|
const tokens = marked.lexer(content)
|
||||||
|
if (tokenCache.size >= TOKEN_CACHE_MAX) {
|
||||||
|
// LRU-ish: drop oldest. Map preserves insertion order.
|
||||||
|
const first = tokenCache.keys().next().value
|
||||||
|
if (first !== undefined) tokenCache.delete(first)
|
||||||
|
}
|
||||||
tokenCache.set(key, tokens)
|
tokenCache.set(key, tokens)
|
||||||
return tokens
|
return tokens
|
||||||
}
|
}
|
||||||
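The two cache hunks above replace `lru-cache` with a plain `Map` that is promoted on hit and trimmed at a fixed size. A minimal standalone sketch of that technique follows. The class name `MapLruCache` and its methods are hypothetical and do not appear in the diff; the sketch only assumes that `Map` iterates keys in insertion order, which the ECMAScript specification guarantees.

```typescript
// Hypothetical helper (not part of the diff): a bounded LRU cache built on Map.
// Map iterates in insertion order, so the first key is always the oldest entry.
class MapLruCache<K, V> {
  private readonly entries = new Map<K, V>()
  private readonly max: number

  constructor(max: number) {
    if (max < 1) throw new RangeError('max must be at least 1')
    this.max = max
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined
    const value = this.entries.get(key) as V
    // Promote to most-recently-used: delete + set moves the key to the end.
    // Without this step the eviction order would be FIFO, not LRU.
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) {
      // Overwriting an existing key also counts as a use.
      this.entries.delete(key)
    } else if (this.entries.size >= this.max) {
      // Evict the least-recently-used entry (the first key in the Map).
      const oldest = this.entries.keys().next().value
      if (oldest !== undefined) this.entries.delete(oldest)
    }
    this.entries.set(key, value)
  }

  has(key: K): boolean {
    return this.entries.has(key)
  }

  get size(): number {
    return this.entries.size
  }
}
```

With a capacity of 2, inserting `a` and `b`, reading `a`, then inserting `c` evicts `b` rather than `a`, because the read promoted `a` to most-recently-used.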
|
|||||||
@@ -279,7 +279,6 @@ export function ModelPicker({
|
|||||||
<Text color="subtle">
|
<Text color="subtle">
|
||||||
<EffortLevelIndicator effort={undefined} /> 1M context off
|
<EffortLevelIndicator effort={undefined} /> 1M context off
|
||||||
{focusedModelName ? ` for ${focusedModelName}` : ''}
|
{focusedModelName ? ` for ${focusedModelName}` : ''}
|
||||||
<Text color="subtle"> · Space to toggle</Text>
|
|
||||||
</Text>
|
</Text>
|
||||||
)}
|
)}
|
||||||
</Box>
|
</Box>
|
||||||
|
|||||||
@@ -30,7 +30,6 @@ interface WorkerState {
|
|||||||
failureCount: number
|
failureCount: number
|
||||||
parked: boolean
|
parked: boolean
|
||||||
lastStartTime: number
|
lastStartTime: number
|
||||||
restartTimer: ReturnType<typeof setTimeout> | null
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
@@ -242,7 +241,6 @@ async function runSupervisor(args: string[]): Promise<void> {
|
|||||||
failureCount: 0,
|
failureCount: 0,
|
||||||
parked: false,
|
parked: false,
|
||||||
lastStartTime: 0,
|
lastStartTime: 0,
|
||||||
restartTimer: null,
|
|
||||||
},
|
},
|
||||||
]
|
]
|
||||||
|
|
||||||
@@ -263,10 +261,6 @@ async function runSupervisor(args: string[]): Promise<void> {
|
|||||||
controller.abort()
|
controller.abort()
|
||||||
removeDaemonState()
|
removeDaemonState()
|
||||||
for (const w of workers) {
|
for (const w of workers) {
|
||||||
if (w.restartTimer) {
|
|
||||||
clearTimeout(w.restartTimer)
|
|
||||||
w.restartTimer = null
|
|
||||||
}
|
|
||||||
if (w.process && !w.process.killed) {
|
if (w.process && !w.process.killed) {
|
||||||
w.process.kill('SIGTERM')
|
w.process.kill('SIGTERM')
|
||||||
}
|
}
|
||||||
@@ -294,30 +288,22 @@ async function runSupervisor(args: string[]): Promise<void> {
|
|||||||
// Wait for all workers to exit
|
// Wait for all workers to exit
|
||||||
await Promise.all(
|
await Promise.all(
|
||||||
workers
|
workers
|
||||||
.filter(w => w.process && w.process.exitCode === null)
|
.filter(w => w.process && !w.process.killed)
|
||||||
.map(
|
.map(
|
||||||
w =>
|
w =>
|
||||||
new Promise<void>(resolve => {
|
new Promise<void>(resolve => {
|
||||||
if (!w.process || w.process.exitCode !== null) {
|
if (!w.process) {
|
||||||
resolve()
|
resolve()
|
||||||
return
|
return
|
||||||
}
|
}
|
||||||
let killTimer: ReturnType<typeof setTimeout> | null = null
|
w.process.on('exit', () => resolve())
|
||||||
w.process.on('exit', () => {
|
|
||||||
if (killTimer) {
|
|
||||||
clearTimeout(killTimer)
|
|
||||||
killTimer = null
|
|
||||||
}
|
|
||||||
resolve()
|
|
||||||
})
|
|
||||||
// Force kill after grace period
|
// Force kill after grace period
|
||||||
killTimer = setTimeout(() => {
|
setTimeout(() => {
|
||||||
if (w.process && w.process.exitCode === null) {
|
if (w.process && !w.process.killed) {
|
||||||
w.process.kill('SIGKILL')
|
w.process.kill('SIGKILL')
|
||||||
}
|
}
|
||||||
resolve()
|
resolve()
|
||||||
}, 30_000)
|
}, 30_000)
|
||||||
killTimer.unref?.()
|
|
||||||
}),
|
}),
|
||||||
),
|
),
|
||||||
)
|
)
|
||||||
@@ -412,13 +398,11 @@ function spawnWorker(
|
|||||||
`[daemon] worker '${worker.kind}' exited (code=${code}, signal=${sig}), restarting in ${worker.backoffMs}ms`,
|
`[daemon] worker '${worker.kind}' exited (code=${code}, signal=${sig}), restarting in ${worker.backoffMs}ms`,
|
||||||
)
|
)
|
||||||
|
|
||||||
worker.restartTimer = setTimeout(() => {
|
setTimeout(() => {
|
||||||
worker.restartTimer = null
|
|
||||||
if (!signal.aborted && !worker.parked) {
|
if (!signal.aborted && !worker.parked) {
|
||||||
spawnWorker(worker, dir, config, signal)
|
spawnWorker(worker, dir, config, signal)
|
||||||
}
|
}
|
||||||
}, worker.backoffMs)
|
}, worker.backoffMs)
|
||||||
worker.restartTimer.unref?.()
|
|
||||||
|
|
||||||
// Exponential backoff
|
// Exponential backoff
|
||||||
worker.backoffMs = Math.min(
|
worker.backoffMs = Math.min(
|
||||||
|
|||||||
@@ -255,29 +255,6 @@ async function main(): Promise<void> {
|
|||||||
return
|
return
|
||||||
}
|
}
|
||||||
|
|
||||||
// Fast-path for `claude autonomy ...`: state inspection/management commands
|
|
||||||
// do not need the full interactive CLI bootstrap. The full Commander path
|
|
||||||
// imports main.tsx and runs root preAction initialization before the autonomy
|
|
||||||
// action; under coverage/CI that leaves unrelated handles around simple
|
|
||||||
// state-only subprocess calls.
|
|
||||||
if (args[0] === 'autonomy') {
|
|
||||||
profileCheckpoint('cli_autonomy_path')
|
|
||||||
const { getAutonomyCommandText } = await import(
|
|
||||||
'../cli/handlers/autonomy.js'
|
|
||||||
)
|
|
||||||
const text = await getAutonomyCommandText(args.slice(1).join(' '))
|
|
||||||
await new Promise<void>((resolve, reject) => {
|
|
||||||
process.stdout.write(`${text}\n`, error => {
|
|
||||||
if (error) {
|
|
||||||
reject(error)
|
|
||||||
return
|
|
||||||
}
|
|
||||||
resolve()
|
|
||||||
})
|
|
||||||
})
|
|
||||||
process.exit(0)
|
|
||||||
}
|
|
||||||
|
|
||||||
// Fast-path for `--bg`/`--background` shortcut → daemon bg.
|
// Fast-path for `--bg`/`--background` shortcut → daemon bg.
|
||||||
if (
|
if (
|
||||||
feature('BG_SESSIONS') &&
|
feature('BG_SESSIONS') &&
|
||||||
@@ -421,4 +398,4 @@ async function main(): Promise<void> {
|
|||||||
}
|
}
|
||||||
|
|
||||||
// eslint-disable-next-line custom-rules/no-top-level-side-effects
|
// eslint-disable-next-line custom-rules/no-top-level-side-effects
|
||||||
await main()
|
void main()
|
||||||
|
|||||||
@@ -1,80 +0,0 @@
|
|||||||
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
|
|
||||||
import {
|
|
||||||
resetStateForTests,
|
|
||||||
setCwdState,
|
|
||||||
setOriginalCwd,
|
|
||||||
setProjectRoot,
|
|
||||||
} from '../../bootstrap/state'
|
|
||||||
import { createScheduledTaskQueuedCommand } from '../useScheduledTasks'
|
|
||||||
import {
|
|
||||||
listAutonomyRuns,
|
|
||||||
markAutonomyRunCompleted,
|
|
||||||
} from '../../utils/autonomyRuns'
|
|
||||||
import { resetAutonomyAuthorityForTests } from '../../utils/autonomyAuthority'
|
|
||||||
import { cleanupTempDir, createTempDir } from '../../../tests/mocks/file-system'
|
|
||||||
|
|
||||||
let tempDir = ''
|
|
||||||
|
|
||||||
beforeEach(async () => {
|
|
||||||
tempDir = await createTempDir('scheduled-tasks-')
|
|
||||||
resetStateForTests()
|
|
||||||
resetAutonomyAuthorityForTests()
|
|
||||||
setOriginalCwd(tempDir)
|
|
||||||
setProjectRoot(tempDir)
|
|
||||||
setCwdState(tempDir)
|
|
||||||
})
|
|
||||||
|
|
||||||
afterEach(async () => {
|
|
||||||
resetStateForTests()
|
|
||||||
resetAutonomyAuthorityForTests()
|
|
||||||
if (tempDir) {
|
|
||||||
await cleanupTempDir(tempDir)
|
|
||||||
}
|
|
||||||
})
|
|
||||||
|
|
||||||
describe('createScheduledTaskQueuedCommand', () => {
|
|
||||||
function createCommandForTest(task: { id: string; prompt: string }) {
|
|
||||||
return createScheduledTaskQueuedCommand(task, {
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
})
|
|
||||||
}
|
|
||||||
|
|
||||||
test('skips a scheduled task when the same source already has an active run', async () => {
|
|
||||||
const task = {
|
|
||||||
id: 'cron-1',
|
|
||||||
prompt: '/loop review the repository',
|
|
||||||
}
|
|
||||||
|
|
||||||
const first = await createCommandForTest(task)
|
|
||||||
const second = await createCommandForTest(task)
|
|
||||||
const runs = await listAutonomyRuns(tempDir)
|
|
||||||
|
|
||||||
expect(first).not.toBeNull()
|
|
||||||
expect(second).toBeNull()
|
|
||||||
expect(runs).toHaveLength(1)
|
|
||||||
expect(runs[0]).toMatchObject({
|
|
||||||
trigger: 'scheduled-task',
|
|
||||||
status: 'queued',
|
|
||||||
sourceId: 'cron-1',
|
|
||||||
})
|
|
||||||
})
|
|
||||||
|
|
||||||
test('allows a scheduled task after the previous same-source run completes', async () => {
|
|
||||||
const task = {
|
|
||||||
id: 'cron-1',
|
|
||||||
prompt: '/loop review the repository',
|
|
||||||
}
|
|
||||||
|
|
||||||
const first = await createCommandForTest(task)
|
|
||||||
expect(first?.autonomy?.runId).toBeDefined()
|
|
||||||
|
|
||||||
await markAutonomyRunCompleted(first!.autonomy!.runId, tempDir, 100)
|
|
||||||
const second = await createCommandForTest(task)
|
|
||||||
const runs = await listAutonomyRuns(tempDir)
|
|
||||||
|
|
||||||
expect(second).not.toBeNull()
|
|
||||||
expect(runs).toHaveLength(2)
|
|
||||||
expect(runs.map(run => run.status).sort()).toEqual(['completed', 'queued'])
|
|
||||||
})
|
|
||||||
})
|
|
||||||
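The two tests above pin down the dedup contract: a second fire for the same source is skipped while an earlier run is still `queued` or `running`, and is allowed again once that run completes. The sketch below illustrates that contract with an in-memory store. It is an illustration only: `InMemoryRunStore`, `createIfNoActiveSource`, and `markCompleted` are hypothetical names, and the real code persists runs to disk rather than keeping them in an array.

```typescript
// Hypothetical in-memory model of the "skip if the source already has an
// active run" rule. Not the repository's implementation.
type RunStatus = 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'

interface RunRecord {
  runId: string
  sourceId: string
  status: RunStatus
}

class InMemoryRunStore {
  private readonly runs: RunRecord[] = []
  private nextId = 1

  // Returns a new queued run, or null when the same source is still active.
  createIfNoActiveSource(sourceId: string): RunRecord | null {
    const hasActive = this.runs.some(
      run =>
        run.sourceId === sourceId &&
        (run.status === 'queued' || run.status === 'running'),
    )
    if (hasActive) return null
    const run: RunRecord = {
      runId: `run-${this.nextId++}`,
      sourceId,
      status: 'queued',
    }
    this.runs.push(run)
    return run
  }

  // Marks a run as finished so the next fire for its source is accepted.
  markCompleted(runId: string): void {
    const run = this.runs.find(candidate => candidate.runId === runId)
    if (run) run.status = 'completed'
  }

  list(): readonly RunRecord[] {
    return this.runs
  }
}
```

Because a skipped fire returns `null` instead of a queued command, the number of pending records per source stays at most one, which is the property the tests assert with `toHaveLength(1)`.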
@@ -189,6 +189,12 @@ export function useReplBridge(
|
|||||||
}
|
}
|
||||||
|
|
||||||
let cancelled = false
|
let cancelled = false
|
||||||
|
// Map of pending bridge permission response handlers, keyed by request_id.
|
||||||
|
// Defined at useEffect scope so the cleanup function can clear it on unmount.
|
||||||
|
const pendingPermissionHandlers = new Map<
|
||||||
|
string,
|
||||||
|
(response: BridgePermissionResponse) => void
|
||||||
|
>()
|
||||||
// Capture messages.length now so we don't re-send initial messages
|
// Capture messages.length now so we don't re-send initial messages
|
||||||
// through writeMessages after the bridge connects.
|
// through writeMessages after the bridge connects.
|
||||||
const initialMessageCount = messages.length
|
const initialMessageCount = messages.length
|
||||||
@@ -461,13 +467,6 @@ export function useReplBridge(
           }
         }
 
-        // Map of pending bridge permission response handlers, keyed by request_id.
-        // Each entry is an onResponse handler waiting for CCR to reply.
-        const pendingPermissionHandlers = new Map<
-          string,
-          (response: BridgePermissionResponse) => void
-        >()
-
         // Dispatch incoming control_response messages to registered handlers
         function handlePermissionResponse(msg: SDKControlResponse): void {
           const requestId = msg.response?.request_id
@@ -818,6 +817,10 @@ export function useReplBridge(
 
     return () => {
       cancelled = true
+      // Release all pending permission handlers so their closures (which
+      // may capture React state/setters) can be GC'd immediately rather
+      // than waiting for the entire useEffect closure to become unreachable.
+      pendingPermissionHandlers.clear()
       clearTimeout(failureTimeoutRef.current)
       failureTimeoutRef.current = undefined
       if (handleRef.current) {
@@ -10,18 +10,13 @@ import type { Message } from '../types/message.js'
 import { getCwd } from '../utils/cwd.js'
 import { getCronJitterConfig } from '../utils/cronJitterConfig.js'
 import { createCronScheduler } from '../utils/cronScheduler.js'
-import { removeCronTasks, type CronTask } from '../utils/cronTasks.js'
-import {
-  createAutonomyQueuedPrompt,
-  createAutonomyQueuedPromptIfNoActiveSource,
-  markAutonomyRunCancelled,
-  markAutonomyRunFailed,
-} from '../utils/autonomyRuns.js'
+import { removeCronTasks } from '../utils/cronTasks.js'
+import { createAutonomyQueuedPrompt } from '../utils/autonomyRuns.js'
+import { markAutonomyRunFailed } from '../utils/autonomyRuns.js'
 import { logForDebugging } from '../utils/debug.js'
 import { enqueuePendingNotification } from '../utils/messageQueueManager.js'
 import { createScheduledTaskFireMessage } from '../utils/messages.js'
 import { WORKLOAD_CRON } from '../utils/workloadContext.js'
-import type { QueuedCommand } from '../types/textInputTypes.js'
 
 type Props = {
   isLoading: boolean
@@ -37,32 +32,6 @@ type Props = {
   setMessages: React.Dispatch<React.SetStateAction<Message[]>>
 }
 
-export async function createScheduledTaskQueuedCommand(
-  task: Pick<CronTask, 'id' | 'prompt'>,
-  options?: {
-    rootDir?: string
-    currentDir?: string
-    shouldCreate?: () => boolean
-  },
-): Promise<QueuedCommand | null> {
-  const command = await createAutonomyQueuedPromptIfNoActiveSource({
-    basePrompt: task.prompt,
-    trigger: 'scheduled-task',
-    rootDir: options?.rootDir,
-    currentDir: options?.currentDir ?? getCwd(),
-    sourceId: task.id,
-    sourceLabel: task.prompt,
-    workload: WORKLOAD_CRON,
-    shouldCreate: options?.shouldCreate,
-  })
-  if (!command) {
-    logForDebugging(
-      `[ScheduledTasks] skipping ${task.id}: previous run still queued or running`,
-    )
-  }
-  return command
-}
-
 /**
  * REPL wrapper for the cron scheduler. Mounts the scheduler once and tears
  * it down on unmount. Fired prompts go into the command queue as 'later'
@@ -102,25 +71,16 @@ export function useScheduledTasks({
     // forward isMeta, so their messages remain visible in the
     // transcript. This is acceptable since normal mode is not the
     // primary use case for scheduled tasks.
-    let disposed = false
     const enqueueForLead = async (prompt: string) => {
       const command = await createAutonomyQueuedPrompt({
         basePrompt: prompt,
         trigger: 'scheduled-task',
         currentDir: getCwd(),
         workload: WORKLOAD_CRON,
-        shouldCreate: () => !disposed,
       })
       if (!command) {
         return
       }
-      if (disposed) {
-        await markAutonomyRunCancelled(
-          command.autonomy!.runId,
-          command.autonomy!.rootDir,
-        )
-        return
-      }
       enqueuePendingNotification(command)
     }
 
@@ -130,12 +90,7 @@ export function useScheduledTasks({
       // which is populated from disk at scheduler startup — this path only
       // handles team-lead durable crons.
       onFire: prompt => {
-        void enqueueForLead(prompt).catch(error =>
-          logForDebugging(
-            `[ScheduledTasks] failed to enqueue missed task prompt: ${error}`,
-            { level: 'error' },
-          ),
-        )
+        void enqueueForLead(prompt)
       },
       // Normal fires receive the full CronTask so we can route by agentId.
       onFireTask: task => {
@@ -146,26 +101,22 @@ export function useScheduledTasks({
               store.getState().tasks,
             )
             if (teammate && !isTerminalTaskStatus(teammate.status)) {
-              const command = await createScheduledTaskQueuedCommand(
-                task,
-                { shouldCreate: () => !disposed },
-              )
+              const command = await createAutonomyQueuedPrompt({
+                basePrompt: task.prompt,
+                trigger: 'scheduled-task',
+                currentDir: getCwd(),
+                sourceId: task.id,
+                sourceLabel: task.prompt,
+                workload: WORKLOAD_CRON,
+              })
               if (!command) {
                 return
               }
-              if (disposed) {
-                await markAutonomyRunCancelled(
-                  command.autonomy!.runId,
-                  command.autonomy!.rootDir,
-                )
-                return
-              }
               const injected = injectUserMessageToTeammate(
                 teammate.id,
                 command.value as string,
                 {
                   autonomyRunId: command.autonomy?.runId,
-                  autonomyRootDir: command.autonomy?.rootDir,
                   origin: command.origin,
                 },
                 setAppState,
@@ -174,7 +125,6 @@ export function useScheduledTasks({
                 await markAutonomyRunFailed(
                   command.autonomy.runId,
                   `Teammate ${task.agentId} exited before the scheduled message could be delivered.`,
-                  command.autonomy.rootDir,
                 )
               }
               return
@@ -189,32 +139,24 @@ export function useScheduledTasks({
             return
           }
 
-          const command = await createScheduledTaskQueuedCommand(
-            task,
-            { shouldCreate: () => !disposed },
-          )
+          const command = await createAutonomyQueuedPrompt({
+            basePrompt: task.prompt,
+            trigger: 'scheduled-task',
+            currentDir: getCwd(),
+            sourceId: task.id,
+            sourceLabel: task.prompt,
+            workload: WORKLOAD_CRON,
+          })
           if (!command) {
             return
           }
-          if (disposed) {
-            await markAutonomyRunCancelled(
-              command.autonomy!.runId,
-              command.autonomy!.rootDir,
-            )
-            return
-          }
 
           const msg = createScheduledTaskFireMessage(
             `Running scheduled task (${formatCronFireTime(new Date())})`,
           )
           setMessages(prev => [...prev, msg])
           enqueuePendingNotification(command)
-        })().catch(error =>
-          logForDebugging(
-            `[ScheduledTasks] failed to enqueue task ${task.id}: ${error}`,
-            { level: 'error' },
-          ),
-        )
+        })()
       },
       isLoading: () => isLoadingRef.current,
       assistantMode,
@@ -222,10 +164,7 @@ export function useScheduledTasks({
       isKilled: () => !isKairosCronEnabled(),
     })
     scheduler.start()
-    return () => {
-      disposed = true
-      scheduler.stop()
-    }
+    return () => scheduler.stop()
     // assistantMode is stable for the session lifetime; store/setAppState are
     // stable refs from useSyncExternalStore; setMessages is a stable useCallback.
     // eslint-disable-next-line react-hooks/exhaustive-deps
@@ -9,9 +9,7 @@ import { useEffect, useRef } from 'react'
 import type { QueuedCommand } from '../types/textInputTypes.js'
 import { TICK_TAG } from '../constants/xml.js'
 import { getCwd } from '../utils/cwd.js'
-import { cancelQueuedAutonomyCommands } from '../utils/autonomyQueueLifecycle.js'
 import { createProactiveAutonomyCommands } from '../utils/autonomyRuns.js'
-import { logForDebugging } from '../utils/debug.js'
 import {
   isProactiveActive,
   isProactivePaused,
@@ -40,8 +38,6 @@ export function useProactive(opts: UseProactiveOpts): void {
     if (!isProactiveActive()) return
 
     let timer: ReturnType<typeof setTimeout> | null = null
-    let disposed = false
-    let generating = false
 
     function scheduleTick(): void {
       const nextTs = Date.now() + TICK_INTERVAL_MS
@@ -70,51 +66,25 @@ export function useProactive(opts: UseProactiveOpts): void {
           isLoading ||
           isInPlanMode ||
           hasActiveLocalJsxUI ||
-          queuedCommandsLength > 0 ||
-          generating
+          queuedCommandsLength > 0
         ) {
           scheduleTick()
           return
         }
 
-        generating = true
         void (async () => {
           const commands = await createProactiveAutonomyCommands({
             basePrompt: `<${TICK_TAG}>${new Date().toLocaleTimeString()}</${TICK_TAG}>`,
             currentDir: getCwd(),
-            shouldCreate: () => !disposed,
           })
-          if (disposed) {
-            await cancelQueuedAutonomyCommands({ commands })
-            return
-          }
-          const queuedCommands: QueuedCommand[] = []
-          try {
-            for (const command of commands) {
-              // Always queue proactive turns. This avoids races where the prompt
-              // is built asynchronously, a user turn starts meanwhile, and a
-              // direct-submit path would silently drop the autonomy turn after
-              // consuming its heartbeat due-state.
-              optsRef.current.onQueueTick(command)
-              queuedCommands.push(command)
-            }
-          } catch (error) {
-            await cancelQueuedAutonomyCommands({
-              commands: commands.filter(
-                command => !queuedCommands.includes(command),
-              ),
-            })
-            throw error
+          for (const command of commands) {
+            // Always queue proactive turns. This avoids races where the prompt
+            // is built asynchronously, a user turn starts meanwhile, and a
+            // direct-submit path would silently drop the autonomy turn after
+            // consuming its heartbeat due-state.
+            optsRef.current.onQueueTick(command)
           }
         })()
-          .catch(error =>
-            logForDebugging(`[Proactive] failed to create tick: ${error}`, {
-              level: 'error',
-            }),
-          )
-          .finally(() => {
-            generating = false
-          })
 
         // Schedule next tick
         scheduleTick()
@@ -124,7 +94,6 @@ export function useProactive(opts: UseProactiveOpts): void {
     scheduleTick()
 
     return () => {
-      disposed = true
       if (timer !== null) {
         clearTimeout(timer)
         timer = null
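Note on the `useProactive` hunks above: the removed side guards the async tick with two flags, `generating` (skip a tick while an earlier one is still building its commands) and `disposed` (drop results that arrive after cleanup). The following is a minimal standalone sketch of that pattern, not the repository's code. `createGuardedTicker`, `buildCommands`, and `onQueue` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical helper: illustrates the in-flight + disposed guard only.
type Ticker = { tick: () => Promise<boolean>; dispose: () => void }

function createGuardedTicker(
  buildCommands: () => Promise<string[]>, // assumed async producer
  onQueue: (command: string) => void, // assumed sink for built commands
): Ticker {
  let generating = false // true while a tick is still building commands
  let disposed = false // true once the owner has cleaned up

  return {
    // Resolves to false when the tick was skipped, true when it ran.
    async tick(): Promise<boolean> {
      if (generating || disposed) return false
      generating = true
      try {
        const commands = await buildCommands()
        // Cleanup may have happened during the await; drop late results.
        if (disposed) return true
        for (const command of commands) onQueue(command)
        return true
      } finally {
        // Always release the guard, even if buildCommands rejects.
        generating = false
      }
    },
    dispose(): void {
      disposed = true
    },
  }
}
```

Without the `generating` flag, a slow producer lets ticks overlap and pile up pending work; without the `disposed` check, commands built after unmount would still be queued.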
152
src/query.ts
@@ -71,16 +71,10 @@ const jobClassifier = feature('TEMPLATES')
   : null
 /* eslint-enable @typescript-eslint/no-require-imports */
 import {
-  enqueue,
   remove as removeFromQueue,
   getCommandsByMaxPriority,
   isSlashCommand,
 } from './utils/messageQueueManager.js'
-import {
-  type AutonomyTurnOutcome,
-  claimConsumableQueuedAutonomyCommands,
-  finalizeAutonomyCommandsForTurn,
-} from './utils/autonomyQueueLifecycle.js'
 import { notifyCommandLifecycle } from './utils/commandLifecycle.js'
 import { headlessProfilerCheckpoint } from './utils/headlessProfiler.js'
 import {
@@ -98,7 +92,6 @@ import { SLEEP_TOOL_NAME } from '@claude-code-best/builtin-tools/tools/SleepTool
 import { executePostSamplingHooks } from './utils/hooks/postSamplingHooks.js'
 import { executeStopFailureHooks } from './utils/hooks.js'
 import type { QuerySource } from './constants/querySource.js'
-import type { QueuedCommand } from './types/textInputTypes.js'
 import { createDumpPromptsFetch } from './services/api/dumpPrompts.js'
 import { StreamingToolExecutor } from './services/tools/StreamingToolExecutor.js'
 import { queryCheckpoint } from './utils/queryProfiler.js'
@@ -118,11 +111,7 @@ import {
 } from './bootstrap/state.js'
 import { createBudgetTracker, checkTokenBudget } from './query/tokenBudget.js'
 import { count } from './utils/array.js'
-import {
-  createTrace,
-  endTrace,
-  isLangfuseEnabled,
-} from './services/langfuse/index.js'
+import { createTrace, endTrace, isLangfuseEnabled } from './services/langfuse/index.js'
 import { getAPIProvider } from './utils/model/providers.js'
 
 /* eslint-disable @typescript-eslint/no-require-imports */
@@ -140,11 +129,7 @@ function* yieldMissingToolResultBlocks(
 ) {
   for (const assistantMessage of assistantMessages) {
     // Extract all tool use blocks from this assistant message
-    const toolUseBlocks = (
-      Array.isArray(assistantMessage.message?.content)
-        ? assistantMessage.message.content
-        : []
-    ).filter(
+    const toolUseBlocks = (Array.isArray(assistantMessage.message?.content) ? assistantMessage.message.content : []).filter(
       (content: { type: string }) => content.type === 'tool_use',
     ) as ToolUseBlock[]
 
@@ -196,33 +181,6 @@ function isWithheldMaxOutputTokens(
   return msg?.type === 'assistant' && msg.apiError === 'max_output_tokens'
 }
 
-function getAutonomyTurnOutcome(params: {
-  terminal?: Terminal
-  thrownError?: unknown
-}): AutonomyTurnOutcome {
-  if (params.thrownError !== undefined) {
-    return { type: 'failed', error: params.thrownError }
-  }
-
-  const terminal = params.terminal
-  const reason = terminal?.reason
-  switch (reason) {
-    case 'completed':
-      return { type: 'completed' }
-    case undefined:
-    case 'aborted_streaming':
-    case 'aborted_tools':
-      return { type: 'cancelled' }
-    case 'model_error':
-      return { type: 'failed', error: terminal.error }
-    default:
-      return {
-        type: 'failed',
-        message: `query ended without successful completion: ${reason}`,
-      }
-  }
-}
-
 export type QueryParams = {
   messages: Message[]
   systemPrompt: SystemPrompt
@@ -272,7 +230,6 @@ export async function* query(
   Terminal
 > {
   const consumedCommandUuids: string[] = []
-  const consumedAutonomyCommands: QueuedCommand[] = []
 
   // Create Langfuse trace for this query turn (no-op if not configured).
   // When called as a sub-agent, langfuseTrace is already set by runAgent()
@@ -281,9 +238,8 @@ export async function* query(
   logForDebugging(
     `[query] ownsTrace=${ownsTrace} incoming langfuseTrace=${params.toolUseContext.langfuseTrace ? 'present' : 'null/undefined'} isLangfuseEnabled=${isLangfuseEnabled()}`,
   )
-  const langfuseTrace =
-    params.toolUseContext.langfuseTrace ??
-    (isLangfuseEnabled()
+  const langfuseTrace = params.toolUseContext.langfuseTrace
+    ?? (isLangfuseEnabled()
       ? createTrace({
           sessionId: getSessionId(),
           model: params.toolUseContext.options.mainLoopModel,
@@ -302,34 +258,9 @@ export async function* query(
     : params
 
   let terminal: Terminal | undefined
-  let didThrow = false
-  let thrownError: unknown
   try {
-    terminal = yield* queryLoop(
-      paramsWithTrace,
-      consumedCommandUuids,
-      consumedAutonomyCommands,
-    )
-  } catch (error) {
-    didThrow = true
-    thrownError = error
-    throw error
+    terminal = yield* queryLoop(paramsWithTrace, consumedCommandUuids)
   } finally {
-    await finalizeAutonomyCommandsForTurn({
-      commands: consumedAutonomyCommands,
-      outcome: getAutonomyTurnOutcome({
-        terminal,
-        ...(didThrow ? { thrownError } : {}),
-      }),
-      priority: 'later',
-    })
-      .then(nextCommands => {
-        for (const command of nextCommands) {
-          enqueue(command)
-        }
-      })
-      .catch(logError)
-
     // Only end the trace if we created it — sub-agents own their traces
     if (ownsTrace) {
       const isAborted =
@@ -352,7 +283,6 @@ export async function* query(
 async function* queryLoop(
   params: QueryParams,
   consumedCommandUuids: string[],
-  consumedAutonomyCommands: QueuedCommand[],
 ): AsyncGenerator<
   | StreamEvent
   | RequestStartEvent
@@ -860,14 +790,7 @@ async function* queryLoop(
         let yieldMessage: typeof message = message
         if (message.type === 'assistant') {
           const assistantMsg = message as AssistantMessage
-          const contentArr = Array.isArray(assistantMsg.message?.content)
-            ? (assistantMsg.message.content as unknown as Array<{
-                type: string
-                input?: unknown
-                name?: string
-                [key: string]: unknown
-              }>)
-            : []
+          const contentArr = Array.isArray(assistantMsg.message?.content) ? assistantMsg.message.content as unknown as Array<{ type: string; input?: unknown; name?: string; [key: string]: unknown }> : []
           let clonedContent: typeof contentArr | undefined
           for (let i = 0; i < contentArr.length; i++) {
             const block = contentArr[i]!
@@ -903,10 +826,7 @@ async function* queryLoop(
           if (clonedContent) {
             yieldMessage = {
               ...message,
-              message: {
-                ...(assistantMsg.message ?? {}),
-                content: clonedContent,
-              },
+              message: { ...(assistantMsg.message ?? {}), content: clonedContent },
             } as typeof message
           }
         }
@@ -952,11 +872,7 @@ async function* queryLoop(
           const assistantMessage = message as AssistantMessage
           assistantMessages.push(assistantMessage)
 
-          const msgToolUseBlocks = (
-            Array.isArray(assistantMessage.message?.content)
-              ? assistantMessage.message.content
-              : []
-          ).filter(
+          const msgToolUseBlocks = (Array.isArray(assistantMessage.message?.content) ? assistantMessage.message.content : []).filter(
             (content: { type: string }) => content.type === 'tool_use',
           ) as ToolUseBlock[]
           if (msgToolUseBlocks.length > 0) {
@@ -1089,10 +1005,7 @@ async function* queryLoop(
     logEvent('tengu_query_error', {
       assistantMessages: assistantMessages.length,
       toolUses: assistantMessages.flatMap(_ =>
-        (Array.isArray(_.message?.content)
-          ? (_.message.content as Array<{ type: string }>)
-          : []
-        ).filter(content => content.type === 'tool_use'),
+        (Array.isArray(_.message?.content) ? _.message.content as Array<{ type: string }> : []).filter(content => content.type === 'tool_use'),
       ).length,
 
       queryChainId: queryChainIdForAnalytics,
@@ -1394,10 +1307,7 @@ async function* queryLoop(
       // error → hook blocking → retry → error → …
       if (lastMessage?.isApiErrorMessage) {
         void executeStopFailureHooks(lastMessage, toolUseContext)
-        return {
-          reason: 'model_error',
-          error: lastMessage.error ?? lastMessage.apiError ?? 'api_error',
-        }
+        return { reason: 'completed' }
       }
 
       const stopHookResult = yield* handleStopHooks(
@@ -1498,6 +1408,7 @@ async function* queryLoop(
 
     queryCheckpoint('query_tool_execution_start')
 
+
     if (streamingToolExecutor) {
       logEvent('tengu_streaming_tool_execution_used', {
         tool_count: toolUseBlocks.length,
@@ -1557,14 +1468,9 @@ async function* queryLoop(
       const lastAssistantMessage = assistantMessages.at(-1)
       let lastAssistantText: string | undefined
       if (lastAssistantMessage) {
-        const textBlocks = (
-          Array.isArray(lastAssistantMessage.message?.content)
-            ? (lastAssistantMessage.message.content as Array<{
-                type: string
-                text?: string
-              }>)
-            : []
-        ).filter(block => block.type === 'text')
+        const textBlocks = (Array.isArray(lastAssistantMessage.message?.content) ? lastAssistantMessage.message.content as Array<{ type: string; text?: string }> : []).filter(
+          block => block.type === 'text',
+        )
         if (textBlocks.length > 0) {
           const lastTextBlock = textBlocks.at(-1)
           if (lastTextBlock && 'text' in lastTextBlock) {
@@ -1716,32 +1622,12 @@ async function* queryLoop(
         // user prompts, even if someone stamps an agentId on one.
         return cmd.mode === 'task-notification' && cmd.agentId === currentAgentId
       })
-    const queuedAutonomyClaim = await claimConsumableQueuedAutonomyCommands(
-      queuedCommandsSnapshot,
-    )
-    if (queuedAutonomyClaim.staleCommands.length > 0) {
-      removeFromQueue(queuedAutonomyClaim.staleCommands)
-    }
-
-    const claimedConsumedCommands = queuedAutonomyClaim.claimedCommands.filter(
-      cmd => cmd.mode === 'prompt' || cmd.mode === 'task-notification',
-    )
-    if (claimedConsumedCommands.length > 0) {
-      consumedAutonomyCommands.push(...claimedConsumedCommands)
-      for (const cmd of claimedConsumedCommands) {
-        if (cmd.uuid) {
-          consumedCommandUuids.push(cmd.uuid)
-          notifyCommandLifecycle(cmd.uuid, 'started')
-        }
-      }
-      removeFromQueue(claimedConsumedCommands)
-    }
 
     for await (const attachment of getAttachmentMessages(
       null,
       updatedToolUseContext,
       null,
-      queuedAutonomyClaim.attachmentCommands,
+      queuedCommandsSnapshot,
       [...messagesForQuery, ...assistantMessages, ...toolResults],
       querySource,
     )) {
@@ -1773,6 +1659,7 @@ async function* queryLoop(
       pendingMemoryPrefetch.consumedOnIteration = turnCount - 1
     }
 
+
     // Inject prefetched skill discovery. collectSkillDiscoveryPrefetch emits
     // hidden_by_main_turn — true when the prefetch resolved before this point
     // (should be >98% at AKI@250ms / Haiku@573ms vs turn durations of 2-30s).
@@ -1788,11 +1675,8 @@ async function* queryLoop(
 
     // Remove only commands that were actually consumed as attachments.
     // Prompt and task-notification commands are converted to attachments above.
-    const claimedCommandSet = new Set(claimedConsumedCommands)
-    const consumedCommands = queuedAutonomyClaim.attachmentCommands.filter(
-      cmd =>
-        (cmd.mode === 'prompt' || cmd.mode === 'task-notification') &&
-        !claimedCommandSet.has(cmd),
+    const consumedCommands = queuedCommandsSnapshot.filter(
+      cmd => cmd.mode === 'prompt' || cmd.mode === 'task-notification',
     )
     if (consumedCommands.length > 0) {
       for (const cmd of consumedCommands) {
@@ -1,20 +1,3 @@
|
|||||||
export type Terminal =
|
// Auto-generated stub — replace with real implementation
|
||||||
| { reason: 'completed' }
|
export type Terminal = any;
|
||||||
| { reason: 'blocking_limit' }
|
export type Continue = any;
|
||||||
| { reason: 'image_error' }
|
|
||||||
| { reason: 'model_error'; error?: unknown }
|
|
||||||
| { reason: 'aborted_streaming' }
|
|
||||||
| { reason: 'aborted_tools' }
|
|
||||||
| { reason: 'prompt_too_long' }
|
|
||||||
| { reason: 'stop_hook_prevented' }
|
|
||||||
| { reason: 'hook_stopped' }
|
|
||||||
| { reason: 'max_turns'; turnCount: number }
|
|
||||||
|
|
||||||
export type Continue =
|
|
||||||
| { reason: 'collapse_drain_retry'; committed: number }
|
|
||||||
| { reason: 'reactive_compact_retry' }
|
|
||||||
| { reason: 'max_output_tokens_escalate' }
|
|
||||||
| { reason: 'max_output_tokens_recovery'; attempt: number }
|
|
||||||
| { reason: 'stop_hook_blocking' }
|
|
||||||
| { reason: 'token_budget_continuation' }
|
|
||||||
| { reason: 'next_turn' }
|
|
||||||
|
|||||||
@@ -79,9 +79,10 @@ import { isEnvTruthy } from '../utils/envUtils.js';
|
|||||||
import { formatTokens, truncateToWidth } from '../utils/format.js';
|
import { formatTokens, truncateToWidth } from '../utils/format.js';
|
||||||
import { consumeEarlyInput } from '../utils/earlyInput.js';
|
import { consumeEarlyInput } from '../utils/earlyInput.js';
|
||||||
import {
|
import {
|
||||||
claimConsumableQueuedAutonomyCommands,
|
finalizeAutonomyRunCompleted,
|
||||||
finalizeAutonomyCommandsForTurn,
|
finalizeAutonomyRunFailed,
|
||||||
} from '../utils/autonomyQueueLifecycle.js';
|
markAutonomyRunRunning,
|
||||||
|
} from '../utils/autonomyRuns.js';
|
||||||
|
|
||||||
import { setMemberActive } from '../utils/swarm/teamHelpers.js';
|
import { setMemberActive } from '../utils/swarm/teamHelpers.js';
|
||||||
import {
|
import {
|
||||||
@@ -3053,19 +3054,18 @@ export function REPL({
|
|||||||
setMessages(old => {
|
setMessages(old => {
|
||||||
const postBoundary = getMessagesAfterCompactBoundary(old, {
|
const postBoundary = getMessagesAfterCompactBoundary(old, {
|
||||||
includeSnipped: true,
|
includeSnipped: true,
|
||||||
});
|
})
|
||||||
// Hard cap: keep at most 500 messages in fullscreen scrollback
|
// Hard cap: keep at most 500 messages in fullscreen scrollback
|
||||||
// to prevent unbounded memory growth in multi-day sessions.
|
// to prevent unbounded memory growth in multi-day sessions.
|
||||||
// normalizeMessages/applyGrouping are O(n), and Ink fiber
|
// normalizeMessages/applyGrouping are O(n), and Ink fiber
|
||||||
// trees cost ~250KB RSS per message. Without this cap,
|
// trees cost ~250KB RSS per message. Without this cap,
|
||||||
// scrollback after several compactions can reach thousands
|
// scrollback after several compactions can reach thousands
|
||||||
// of messages (observed: 13k+, 1GB+ heap).
|
// of messages (observed: 13k+, 1GB+ heap).
|
||||||
const MAX_FULLSCREEN_SCROLLBACK = 500;
|
const MAX_FULLSCREEN_SCROLLBACK = 500
|
||||||
const kept =
|
const kept = postBoundary.length > MAX_FULLSCREEN_SCROLLBACK
|
||||||
postBoundary.length > MAX_FULLSCREEN_SCROLLBACK
|
? postBoundary.slice(-MAX_FULLSCREEN_SCROLLBACK)
|
||||||
? postBoundary.slice(-MAX_FULLSCREEN_SCROLLBACK)
|
: postBoundary
|
||||||
: postBoundary;
|
return [...kept, newMessage]
|
||||||
return [...kept, newMessage];
|
|
||||||
});
|
});
|
||||||
} else {
|
} else {
|
||||||
setMessages(() => [newMessage]);
|
setMessages(() => [newMessage]);
|
||||||
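The scrollback cap in the hunk above can be sketched in isolation. This is a minimal illustration, not the REPL's actual code: `appendWithCap` and its `cap` parameter are hypothetical names, and the guard against a non-positive cap is an added assumption (`slice(-0)` would otherwise return the whole array).

```typescript
// Hypothetical helper illustrating the capped-append pattern from the hunk.
// Keeps at most `cap` prior items, then appends the new one, so the result
// can hold up to cap + 1 entries (the same shape as `[...kept, newMessage]`).
function appendWithCap<T>(history: readonly T[], next: T, cap: number): T[] {
  if (!Number.isInteger(cap) || cap < 1) {
    // Assumption: a cap below 1 is a caller error; slice(-0) would keep everything.
    throw new RangeError('cap must be a positive integer')
  }
  const kept = history.length > cap ? history.slice(-cap) : history
  return [...kept, next]
}

const trimmed = appendWithCap([1, 2, 3, 4, 5], 6, 3)
console.log(trimmed) // the last three prior items followed by the new one
```

Because the trim happens before the append, memory stays bounded by `cap + 1` entries per update regardless of how long the session runs.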
@@ -3098,10 +3098,13 @@ export function REPL({
|
|||||||
// so interleaved non-ephemeral messages caused duplicate progress
|
// so interleaved non-ephemeral messages caused duplicate progress
|
||||||
// entries to accumulate (observed 13k+ entries in sleep-heavy sessions).
|
// entries to accumulate (observed 13k+ entries in sleep-heavy sessions).
|
||||||
for (let i = oldMessages.length - 1; i >= 0; i--) {
|
for (let i = oldMessages.length - 1; i >= 0; i--) {
|
||||||
const m = oldMessages[i]!;
|
const m = oldMessages[i]!
|
||||||
if (m.type !== 'progress') break;
|
if (m.type !== 'progress') break
|
||||||
const mData = m.data as Record<string, unknown> | undefined;
|
const mData = m.data as Record<string, unknown> | undefined
|
||||||
if (m.parentToolUseID === newMessage.parentToolUseID && mData?.type === newData.type) {
|
if (
|
||||||
|
m.parentToolUseID === newMessage.parentToolUseID &&
|
||||||
|
mData?.type === newData.type
|
||||||
|
) {
|
||||||
const copy = oldMessages.slice();
|
const copy = oldMessages.slice();
|
||||||
copy[i] = newMessage;
|
copy[i] = newMessage;
|
||||||
return copy;
|
return copy;
|
||||||
@@ -3474,7 +3477,7 @@ export function REPL({
|
|||||||
onBeforeQueryCallback?: (input: string, newMessages: MessageType[]) => Promise<boolean>,
|
onBeforeQueryCallback?: (input: string, newMessages: MessageType[]) => Promise<boolean>,
|
||||||
input?: string,
|
input?: string,
|
||||||
effort?: EffortValue,
|
effort?: EffortValue,
|
||||||
): Promise<boolean> => {
|
): Promise<void> => {
|
||||||
// If this is a teammate, mark them as active when starting a turn
|
// If this is a teammate, mark them as active when starting a turn
|
||||||
if (isAgentSwarmsEnabled()) {
|
if (isAgentSwarmsEnabled()) {
|
||||||
const teamName = getTeamName();
|
const teamName = getTeamName();
|
||||||
@@ -3505,7 +3508,7 @@ export function REPL({
|
|||||||
logEvent('tengu_concurrent_onquery_enqueued', {});
|
logEvent('tengu_concurrent_onquery_enqueued', {});
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
return false;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
try {
|
try {
|
||||||
@@ -3538,7 +3541,7 @@ export function REPL({
|
|||||||
if (onBeforeQueryCallback && input) {
|
if (onBeforeQueryCallback && input) {
|
||||||
const shouldProceed = await onBeforeQueryCallback(input, latestMessages);
|
const shouldProceed = await onBeforeQueryCallback(input, latestMessages);
|
||||||
if (!shouldProceed) {
|
if (!shouldProceed) {
|
||||||
return true;
|
return;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -3687,7 +3690,6 @@ export function REPL({
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
return true;
|
|
||||||
},
|
},
|
||||||
[onQueryImpl, setAppState, resetLoadingState, queryGuard, mrOnBeforeQuery, mrOnTurnComplete],
|
[onQueryImpl, setAppState, resetLoadingState, queryGuard, mrOnBeforeQuery, mrOnTurnComplete],
|
||||||
);
|
);
|
||||||
@@ -4842,62 +4844,44 @@ export function REPL({
|
|||||||
} satisfies QueuedCommand)
|
} satisfies QueuedCommand)
|
||||||
: input;
|
: input;
|
||||||
|
|
||||||
void (async () => {
|
const newAbortController = createAbortController();
|
||||||
const claim = await claimConsumableQueuedAutonomyCommands([queuedCommand]);
|
setAbortController(newAbortController);
|
||||||
const command = claim.attachmentCommands[0];
|
|
||||||
if (!command) return;
|
|
||||||
|
|
||||||
const newAbortController = createAbortController();
|
// Create a user message with the formatted content (includes XML wrapper)
|
||||||
setAbortController(newAbortController);
|
const userMessage = createUserMessage({
|
||||||
|
content: queuedCommand.value as string,
|
||||||
|
isMeta: queuedCommand.isMeta ? true : undefined,
|
||||||
|
origin: queuedCommand.origin,
|
||||||
|
});
|
||||||
|
|
||||||
// Create a user message with the formatted content (includes XML wrapper)
|
const autonomyRunId = queuedCommand.autonomy?.runId;
|
||||||
const userMessage = createUserMessage({
|
if (autonomyRunId) {
|
||||||
content: command.value,
|
void markAutonomyRunRunning(autonomyRunId);
|
||||||
isMeta: command.isMeta ? true : undefined,
|
}
|
||||||
origin: command.origin,
|
|
||||||
});
|
|
||||||
|
|
||||||
let executed = false;
|
void onQuery([userMessage], newAbortController, true, [], mainLoopModel)
|
||||||
try {
|
.then(() => {
|
||||||
executed = (await onQuery([userMessage], newAbortController, true, [], mainLoopModel)) !== false;
|
if (autonomyRunId) {
|
||||||
} catch (error: unknown) {
|
void finalizeAutonomyRunCompleted({
|
||||||
try {
|
runId: autonomyRunId,
|
||||||
await finalizeAutonomyCommandsForTurn({
|
|
||||||
commands: claim.claimedCommands,
|
|
||||||
outcome: { type: 'failed', error },
|
|
||||||
currentDir: getCwd(),
|
currentDir: getCwd(),
|
||||||
priority: 'later',
|
priority: 'later',
|
||||||
|
}).then(nextCommands => {
|
||||||
|
for (const command of nextCommands) {
|
||||||
|
enqueue(command);
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
})
|
||||||
|
.catch((error: unknown) => {
|
||||||
|
if (autonomyRunId) {
|
||||||
|
void finalizeAutonomyRunFailed({
|
||||||
|
runId: autonomyRunId,
|
||||||
|
error: String(error),
|
||||||
});
|
});
|
||||||
} catch (finalizeError: unknown) {
|
|
||||||
logError(toError(finalizeError));
|
|
||||||
}
|
}
|
||||||
logError(toError(error));
|
logError(toError(error));
|
||||||
return;
|
});
|
||||||
}
|
|
||||||
|
|
||||||
// Only finalize as completed when onQuery actually executed the turn
|
|
||||||
// (it returns false from the concurrent-guard path without running).
|
|
||||||
// Keep this finalize in its own try/catch so a failure here does not
|
|
||||||
// trigger a second finalize as `failed` for the same commands.
|
|
||||||
if (!executed) {
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
try {
|
|
||||||
const nextCommands = await finalizeAutonomyCommandsForTurn({
|
|
||||||
commands: claim.claimedCommands,
|
|
||||||
outcome: { type: 'completed' },
|
|
||||||
currentDir: getCwd(),
|
|
||||||
priority: 'later',
|
|
||||||
});
|
|
||||||
for (const nextCommand of nextCommands) {
|
|
||||||
enqueue(nextCommand);
|
|
||||||
}
|
|
||||||
} catch (finalizeError: unknown) {
|
|
||||||
logError(toError(finalizeError));
|
|
||||||
}
|
|
||||||
})().catch((error: unknown) => {
|
|
||||||
logError(toError(error));
|
|
||||||
});
|
|
||||||
return true;
|
return true;
|
||||||
},
|
},
|
||||||
[onQuery, mainLoopModel, store],
|
[onQuery, mainLoopModel, store],
|
||||||
|
|||||||
@@ -5,9 +5,9 @@ import { getUserContext } from '../../context.js'
|
|||||||
import { clearSpeculativeChecks } from '@claude-code-best/builtin-tools/tools/BashTool/bashPermissions.js'
|
import { clearSpeculativeChecks } from '@claude-code-best/builtin-tools/tools/BashTool/bashPermissions.js'
|
||||||
import { clearClassifierApprovals } from '../../utils/classifierApprovals.js'
|
import { clearClassifierApprovals } from '../../utils/classifierApprovals.js'
|
||||||
import { resetGetMemoryFilesCache } from '../../utils/claudemd.js'
|
import { resetGetMemoryFilesCache } from '../../utils/claudemd.js'
|
||||||
import { logError } from '../../utils/log.js'
|
|
||||||
import { clearSessionMessagesCache } from '../../utils/sessionStorage.js'
|
import { clearSessionMessagesCache } from '../../utils/sessionStorage.js'
|
||||||
import { clearBetaTracingState } from '../../utils/telemetry/betaSessionTracing.js'
|
import { clearBetaTracingState } from '../../utils/telemetry/betaSessionTracing.js'
|
||||||
|
import { getLspServerManager } from '../../services/lsp/manager.js'
|
||||||
import { resetMicrocompactState } from './microCompact.js'
|
import { resetMicrocompactState } from './microCompact.js'
|
||||||
|
|
||||||
/**
|
/**
|
||||||
@@ -29,7 +29,7 @@ import { resetMicrocompactState } from './microCompact.js'
|
|||||||
* pass querySource — undefined is only safe for callers that are
|
* pass querySource — undefined is only safe for callers that are
|
||||||
* genuinely main-thread-only (/compact, /clear).
|
* genuinely main-thread-only (/compact, /clear).
|
||||||
*/
|
*/
|
||||||
export function runPostCompactCleanup(querySource?: QuerySource): void {
|
export async function runPostCompactCleanup(querySource?: QuerySource): Promise<void> {
|
||||||
// Subagents (agent:*) run in the same process and share module-level
|
// Subagents (agent:*) run in the same process and share module-level
|
||||||
// state with the main thread. Only reset main-thread module-level state
|
// state with the main thread. Only reset main-thread module-level state
|
||||||
// (context-collapse, memory file cache) for main-thread compacts.
|
// (context-collapse, memory file cache) for main-thread compacts.
|
||||||
@@ -70,22 +70,20 @@ export function runPostCompactCleanup(querySource?: QuerySource): void {
|
|||||||
// cacheUtils resets. See compactConversation() for full rationale.
|
// cacheUtils resets. See compactConversation() for full rationale.
|
||||||
clearBetaTracingState()
|
clearBetaTracingState()
|
||||||
if (feature('COMMIT_ATTRIBUTION')) {
|
if (feature('COMMIT_ATTRIBUTION')) {
|
||||||
// Intentionally fire-and-forget: the file-content cache sweep is a
|
void import('../../utils/attributionHooks.js').then(m =>
|
||||||
// best-effort memory release whose completion no caller depends on.
|
m.sweepFileContentCache(),
|
||||||
// Keeping `runPostCompactCleanup` synchronous lets compaction call sites
|
)
|
||||||
// (REPL post-compact handler, /compact command, autoCompact) finish their
|
|
||||||
// own state transitions without an extra microtask round-trip — the sweep
|
|
||||||
// catches up on the next event-loop tick.
|
|
||||||
//
|
|
||||||
// The .catch is required even though the current attributionHooks.ts is a
|
|
||||||
// no-op stub: without it, a future restored sweepFileContentCache that
|
|
||||||
// throws would surface as an unhandled promise rejection from a function
|
|
||||||
// whose synchronous signature gives callers no way to observe it.
|
|
||||||
void import('../../utils/attributionHooks.js')
|
|
||||||
.then(m => m.sweepFileContentCache())
|
|
||||||
.catch(error => {
|
|
||||||
logError(error)
|
|
||||||
})
|
|
||||||
}
|
}
|
||||||
clearSessionMessagesCache()
|
clearSessionMessagesCache()
|
||||||
|
// Close all LSP-tracked files so servers release state for files no longer
|
||||||
|
// in the active context after compaction. Best-effort — LSP may not be
|
||||||
|
// initialized, and closeAllFiles catches per-file errors internally.
|
||||||
|
try {
|
||||||
|
const lspManager = getLspServerManager()
|
||||||
|
if (lspManager) {
|
||||||
|
await lspManager.closeAllFiles()
|
||||||
|
}
|
||||||
|
} catch {
|
||||||
|
// LSP module may not be available in all environments
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -1,36 +1,12 @@
|
|||||||
import { feature } from 'bun:bundle'
|
import { feature } from 'bun:bundle'
|
||||||
|
|
||||||
/**
|
|
||||||
* Build-time presence check: is the `/skill-learning` slash command
|
|
||||||
* compiled into this build? Used by the command registry's `isEnabled` so
|
|
||||||
* the command appears in the menu whenever it is buildable. Operators
|
|
||||||
* activate the subsystem itself via `/skill-learning start`, which flips
|
|
||||||
* `SKILL_LEARNING_ENABLED=1` and turns the runtime observers on (see
|
|
||||||
* `isSkillLearningEnabled`).
|
|
||||||
*/
|
|
||||||
export function isSkillLearningCompiledIn(): boolean {
|
|
||||||
if (feature('SKILL_LEARNING')) return true
|
|
||||||
return false
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Runtime activation check: is the skill-learning subsystem actively
|
|
||||||
* running (toolEvent, runtime, session observers attached, persisting
|
|
||||||
* observations to disk)? Off by default — the operator must run
|
|
||||||
* `/skill-learning start` (which sets `SKILL_LEARNING_ENABLED=1`).
|
|
||||||
*
|
|
||||||
* Legacy `FEATURE_SKILL_LEARNING=1` is also accepted for backward
|
|
||||||
* compatibility with operators who set it before the slash-command UX
|
|
||||||
* landed.
|
|
||||||
*
|
|
||||||
* Build-flag gating is intentionally NOT performed here: the command
|
|
||||||
* registry already gates command compilation on the build flag, and this
|
|
||||||
* function is only reached from code paths that the build flag has
|
|
||||||
* already let through. Decoupling keeps the test surface clean (tests
|
|
||||||
* exercise the env-var contract without needing to mock `bun:bundle`).
|
|
||||||
*/
|
|
||||||
export function isSkillLearningEnabled(): boolean {
|
export function isSkillLearningEnabled(): boolean {
|
||||||
|
if (process.env.SKILL_LEARNING_ENABLED === '0') return false
|
||||||
if (process.env.SKILL_LEARNING_ENABLED === '1') return true
|
if (process.env.SKILL_LEARNING_ENABLED === '1') return true
|
||||||
|
if (process.env.FEATURE_SKILL_LEARNING === '0') return false
|
||||||
if (process.env.FEATURE_SKILL_LEARNING === '1') return true
|
if (process.env.FEATURE_SKILL_LEARNING === '1') return true
|
||||||
|
if (feature('SKILL_LEARNING')) {
|
||||||
|
return true
|
||||||
|
}
|
||||||
return false
|
return false
|
||||||
}
|
}
|
||||||
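One side of this diff resolves the flag by checking an explicit `'0'`/`'1'` on each environment variable in order and only then falling back to the build flag. A minimal sketch of that precedence follows, assuming the environment and the build-flag result are passed in as arguments so the logic can be exercised without `bun:bundle`. `resolveFlag` and `FlagEnv` are hypothetical names, not part of the repository.

```typescript
// Hypothetical sketch of env-var precedence: the first variable holding an
// explicit '0' or '1' wins; any other value is ignored; the build-time
// default applies only when no variable decides.
type FlagEnv = Record<string, string | undefined>

function resolveFlag(
  env: FlagEnv,
  names: readonly string[],
  compiledIn: boolean,
): boolean {
  for (const name of names) {
    if (env[name] === '0') return false
    if (env[name] === '1') return true
  }
  return compiledIn
}

const names = ['SKILL_LEARNING_ENABLED', 'FEATURE_SKILL_LEARNING']
// The newer variable overrides the legacy one, even when the build flag is on.
console.log(
  resolveFlag({ SKILL_LEARNING_ENABLED: '0', FEATURE_SKILL_LEARNING: '1' }, names, true),
)
```

Injecting `env` and `compiledIn` keeps the function pure, which is the same testability concern the removed doc comment raises about mocking `bun:bundle`.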
|
|||||||
@@ -45,44 +45,15 @@ export function getProjectContextPath(projectId: string): string {
|
|||||||
// in the tool.call hot path (one wrapper invocation per tool) that cost would
|
// in the tool.call hot path (one wrapper invocation per tool) that cost would
|
||||||
// accumulate into the hundreds-of-ms range per session. Cache keyed by the
|
// accumulate into the hundreds-of-ms range per session. Cache keyed by the
|
||||||
// exact cwd string so different worktrees still get independent entries.
|
// exact cwd string so different worktrees still get independent entries.
|
||||||
//
|
|
||||||
// Bounded with LRU eviction: long-lived processes that traverse many
|
|
||||||
// worktrees (e.g. multi-repo build orchestrators) would otherwise grow the
|
|
||||||
// cache without limit. Each entry holds a SkillLearningProjectContext
|
|
||||||
// (instinct + skill lists), so the cap ensures bounded memory regardless
|
|
||||||
// of cwd diversity. `defines.ts` originally flagged this as
|
|
||||||
// "no eviction mechanism (not the main cause of GB-level growth)"; this fix closes that gap.
|
|
||||||
const PROJECT_CONTEXT_CACHE_MAX = 32
|
|
||||||
const PROJECT_CONTEXT_CACHE_TRIM_TO = 24
|
|
||||||
const contextCache = new Map<string, SkillLearningProjectContext>()
|
const contextCache = new Map<string, SkillLearningProjectContext>()
|
||||||
const PERSIST_INTERVAL_MS = 5 * 60 * 1000
|
const PERSIST_INTERVAL_MS = 5 * 60 * 1000
|
||||||
let lastPersistAt = 0
|
let lastPersistAt = 0
|
||||||
|
|
||||||
function setProjectContextCache(
|
|
||||||
cwd: string,
|
|
||||||
ctx: SkillLearningProjectContext,
|
|
||||||
): void {
|
|
||||||
if (contextCache.has(cwd)) contextCache.delete(cwd)
|
|
||||||
contextCache.set(cwd, ctx)
|
|
||||||
if (contextCache.size > PROJECT_CONTEXT_CACHE_MAX) {
|
|
||||||
const toDrop = contextCache.size - PROJECT_CONTEXT_CACHE_TRIM_TO
|
|
||||||
const iter = contextCache.keys()
|
|
||||||
for (let i = 0; i < toDrop; i++) {
|
|
||||||
const next = iter.next()
|
|
||||||
if (next.done) break
|
|
||||||
contextCache.delete(next.value)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
export function resolveProjectContext(
|
export function resolveProjectContext(
|
||||||
cwd = process.cwd(),
|
cwd = process.cwd(),
|
||||||
): SkillLearningProjectContext {
|
): SkillLearningProjectContext {
|
||||||
const cached = contextCache.get(cwd)
|
const cached = contextCache.get(cwd)
|
||||||
if (cached) {
|
if (cached) {
|
||||||
// Refresh insertion order so frequently-accessed cwds survive eviction.
|
|
||||||
contextCache.delete(cwd)
|
|
||||||
contextCache.set(cwd, cached)
|
|
||||||
// Still touch the registry so long-lived processes keep `lastSeenAt`
|
// Still touch the registry so long-lived processes keep `lastSeenAt`
|
||||||
// reasonably fresh, but throttle the write so it doesn't fire on every
|
// reasonably fresh, but throttle the write so it doesn't fire on every
|
||||||
// tool call.
|
// tool call.
|
||||||
@@ -94,7 +65,7 @@ export function resolveProjectContext(
|
|||||||
return cached
|
return cached
|
||||||
}
|
}
|
||||||
const resolved = resolveContext(cwd)
|
const resolved = resolveContext(cwd)
|
||||||
setProjectContextCache(cwd, resolved)
|
contextCache.set(cwd, resolved)
|
||||||
persistProjectContext(resolved)
|
persistProjectContext(resolved)
|
||||||
lastPersistAt = Date.now()
|
lastPersistAt = Date.now()
|
||||||
return resolved
|
return resolved
|
||||||
|
|||||||
@@ -23,30 +23,8 @@ export type PromotionOptions = {
|
|||||||
minConfidence?: number
|
minConfidence?: number
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
|
||||||
* Set bounded with FIFO eviction. # promotions per session is small in
|
|
||||||
* practice (single digits), but a long-lived sandbox/daemon could push
|
|
||||||
* this if it never restarts. The cap is defensive and the degraded
|
|
||||||
* behaviour — re-promote if we exceed N then forget the oldest — is
|
|
||||||
* benign because promotion is idempotent at the lifecycle layer.
|
|
||||||
*/
|
|
||||||
const SESSION_PROMOTED_IDS_MAX = 256
|
|
||||||
const SESSION_PROMOTED_IDS_TRIM_TO = 192
|
|
||||||
const sessionPromotedIds = new Set<string>()
|
const sessionPromotedIds = new Set<string>()
|
||||||
|
|
||||||
function recordSessionPromoted(id: string): void {
|
|
||||||
sessionPromotedIds.add(id)
|
|
||||||
if (sessionPromotedIds.size > SESSION_PROMOTED_IDS_MAX) {
|
|
||||||
const toDrop = sessionPromotedIds.size - SESSION_PROMOTED_IDS_TRIM_TO
|
|
||||||
const iter = sessionPromotedIds.values()
|
|
||||||
for (let i = 0; i < toDrop; i++) {
|
|
||||||
const next = iter.next()
|
|
||||||
if (next.done) break
|
|
||||||
sessionPromotedIds.delete(next.value)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
export function resetPromotionBookkeeping(): void {
|
export function resetPromotionBookkeeping(): void {
|
||||||
sessionPromotedIds.clear()
|
sessionPromotedIds.clear()
|
||||||
}
|
}
|
||||||
@@ -125,7 +103,7 @@ export async function checkPromotion(
|
|||||||
}
|
}
|
||||||
await saveInstinct(globalInstinct, globalOptions)
|
await saveInstinct(globalInstinct, globalOptions)
|
||||||
|
|
||||||
recordSessionPromoted(candidate.instinctId)
|
sessionPromotedIds.add(candidate.instinctId)
|
||||||
promoted.push(candidate)
|
promoted.push(candidate)
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -1,30 +1,10 @@
|
|||||||
import { feature } from 'bun:bundle'
|
import { feature } from 'bun:bundle'
|
||||||
|
|
||||||
/**
|
export function isSkillSearchEnabled(): boolean {
|
||||||
* Build-time presence check: is the `/skill-search` slash command compiled
|
if (process.env.SKILL_SEARCH_ENABLED === '0') return false
|
||||||
* into this build? Used by the command registry's `isEnabled` so the
|
if (process.env.SKILL_SEARCH_ENABLED === '1') return true
|
||||||
* command appears in the menu whenever it is buildable. Operators activate
|
if (feature('EXPERIMENTAL_SKILL_SEARCH')) {
|
||||||
* the subsystem itself via `/skill-search start`, which flips
|
return true
|
||||||
* `SKILL_SEARCH_ENABLED=1` and turns the runtime hot paths on (see
|
}
|
||||||
* `isSkillSearchEnabled`).
|
|
||||||
*/
|
|
||||||
export function isSkillSearchCompiledIn(): boolean {
|
|
||||||
if (feature('EXPERIMENTAL_SKILL_SEARCH')) return true
|
|
||||||
return false
|
return false
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
|
||||||
* Runtime activation check: is the skill-search subsystem currently doing
|
|
||||||
* work (intentNormalize Haiku calls, prefetch hot path, telemetry)? Off by
|
|
||||||
* default — the operator must run `/skill-search start` (which sets
|
|
||||||
* `SKILL_SEARCH_ENABLED=1`). See docs/agent/sur-skill-overflow-bugs.md §5.
|
|
||||||
*
|
|
||||||
* Build-flag gating is intentionally NOT performed here: the command
|
|
||||||
* registry already gates command compilation on the build flag, and this
|
|
||||||
* function is only reached from code paths that the build flag has
|
|
||||||
* already let through. Decoupling keeps the test surface clean (tests
|
|
||||||
* exercise the env-var contract without needing to mock `bun:bundle`).
|
|
||||||
*/
|
|
||||||
export function isSkillSearchEnabled(): boolean {
|
|
||||||
return process.env.SKILL_SEARCH_ENABLED === '1'
|
|
||||||
}
|
|
||||||
|
|||||||
@@ -47,35 +47,10 @@ Output ONLY keywords. Nothing else.`
|
|||||||
const DEFAULT_TIMEOUT_MS = 6_000
|
const DEFAULT_TIMEOUT_MS = 6_000
|
||||||
const MAX_QUERY_CHARS = 500
|
const MAX_QUERY_CHARS = 500
|
||||||
const MAX_KEYWORDS_CHARS = 120
|
const MAX_KEYWORDS_CHARS = 120
|
||||||
/**
|
|
||||||
* Bound on the process-level query→keywords cache. Insertion-order LRU —
|
|
||||||
* Map iteration order is insertion order, so we evict from the front when
|
|
||||||
* size exceeds the cap. ~200 entries × ~600 bytes (query + keywords) ≈
|
|
||||||
* 120 KB worst case. Without this cap the cache grew monotonically with
|
|
||||||
* the diversity of Chinese queries in a long session.
|
|
||||||
*/
|
|
||||||
const CACHE_MAX_ENTRIES = 200
|
|
||||||
const CACHE_TRIM_TO = 150
|
|
||||||
|
|
||||||
/** Process-level cache. Keyed by the original (trimmed) query. */
|
/** Process-level cache. Keyed by the original (trimmed) query. */
|
||||||
const cache = new Map<string, string>()
|
const cache = new Map<string, string>()
|
||||||
|
|
||||||
function setCachedQueryIntent(key: string, value: string): void {
|
|
||||||
// Refresh insertion order on hit-then-write so frequently-used keys
|
|
||||||
// stay alive (delete + set is the canonical Map-LRU idiom).
|
|
||||||
if (cache.has(key)) cache.delete(key)
|
|
||||||
cache.set(key, value)
|
|
||||||
if (cache.size > CACHE_MAX_ENTRIES) {
|
|
||||||
const toDrop = cache.size - CACHE_TRIM_TO
|
|
||||||
const iter = cache.keys()
|
|
||||||
for (let i = 0; i < toDrop; i++) {
|
|
||||||
const next = iter.next()
|
|
||||||
if (next.done) break
|
|
||||||
cache.delete(next.value)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
export function isIntentNormalizeEnabled(): boolean {
|
export function isIntentNormalizeEnabled(): boolean {
|
||||||
return process.env.SKILL_SEARCH_INTENT_ENABLED === '1'
|
return process.env.SKILL_SEARCH_INTENT_ENABLED === '1'
|
||||||
}
|
}
|
||||||
@@ -99,17 +74,12 @@ export async function normalizeQueryIntent(query: string): Promise<string> {
|
|||||||
if (!/[\u4e00-\u9fff]/.test(trimmed)) return trimmed
|
if (!/[\u4e00-\u9fff]/.test(trimmed)) return trimmed
|
||||||
|
|
||||||
const cached = cache.get(trimmed)
|
const cached = cache.get(trimmed)
|
||||||
if (cached !== undefined) {
|
if (cached !== undefined) return cached
|
||||||
// Refresh LRU position so frequently-queried strings survive eviction.
|
|
||||||
cache.delete(trimmed)
|
|
||||||
cache.set(trimmed, cached)
|
|
||||||
return cached
|
|
||||||
}
|
|
||||||
|
|
||||||
const capped = trimmed.slice(0, MAX_QUERY_CHARS)
|
const capped = trimmed.slice(0, MAX_QUERY_CHARS)
|
||||||
const keywords = await callHaiku(capped)
|
const keywords = await callHaiku(capped)
|
||||||
const result = keywords ? `${trimmed} ${keywords}` : trimmed
|
const result = keywords ? `${trimmed} ${keywords}` : trimmed
|
||||||
setCachedQueryIntent(trimmed, result)
|
cache.set(trimmed, result)
|
||||||
logForDebugging(
|
logForDebugging(
|
||||||
`[skill-search] intent normalized: "${trimmed.slice(0, 40)}" -> "${keywords}"`,
|
`[skill-search] intent normalized: "${trimmed.slice(0, 40)}" -> "${keywords}"`,
|
||||||
)
|
)
|
||||||
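The bounded-cache idiom that appears on one side of this diff (`setCachedQueryIntent`, and similarly `setProjectContextCache`) relies on `Map` preserving insertion order: delete then set refreshes a key's position, and eviction walks keys from the front. The class below is a self-contained sketch of that idea; `BoundedLruMap` and its method names are hypothetical and do not exist in the repository.

```typescript
// Hypothetical sketch: a Map with a hard cap and a lower trim target.
// When size exceeds maxEntries, the least recently used keys are dropped in
// one batch until only trimTo entries remain.
class BoundedLruMap<K, V> {
  private readonly entries = new Map<K, V>()
  private readonly maxEntries: number
  private readonly trimTo: number

  constructor(maxEntries: number, trimTo: number) {
    if (!(trimTo >= 0 && trimTo <= maxEntries)) {
      throw new RangeError('expected 0 <= trimTo <= maxEntries')
    }
    this.maxEntries = maxEntries
    this.trimTo = trimTo
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined
    const value = this.entries.get(key) as V
    // Delete + set moves the key to the end of iteration order (most recent).
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  set(key: K, value: V): void {
    this.entries.delete(key)
    this.entries.set(key, value)
    if (this.entries.size <= this.maxEntries) return
    const toDrop = this.entries.size - this.trimTo
    // Snapshot the oldest keys first so deletion does not disturb iteration.
    const victims = Array.from(this.entries.keys()).slice(0, toDrop)
    for (const victim of victims) this.entries.delete(victim)
  }

  get size(): number {
    return this.entries.size
  }

  keys(): K[] {
    return Array.from(this.entries.keys())
  }
}

const lru = new BoundedLruMap<string, number>(4, 2)
for (const k of ['a', 'b', 'c', 'd']) lru.set(k, 1)
lru.get('a') // refresh 'a' so it survives the next trim
lru.set('e', 1) // size 5 exceeds the cap of 4, so trim down to 2
console.log(lru.keys())
```

Trimming to a target below the cap, rather than evicting one entry at a time, means the eviction loop runs only occasionally instead of on every insert once the cache is full.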
|
|||||||
@@ -14,35 +14,9 @@ import { readFile } from 'node:fs/promises'
|
|||||||
import { join } from 'node:path'
|
import { join } from 'node:path'
|
||||||
import { parseFrontmatter } from '../../utils/frontmatterParser.js'
|
import { parseFrontmatter } from '../../utils/frontmatterParser.js'
|
||||||
|
|
||||||
/**
|
|
||||||
* Per-session memoization to avoid re-emitting the same skill discovery /
|
|
||||||
* gap signal twice. Each Set is bounded to keep long-running sessions from
|
|
||||||
* monotonically accumulating skill names and signal keys forever (which
|
|
||||||
* was the original session-scoped-but-unbounded design).
|
|
||||||
*
|
|
||||||
* FIFO eviction by insertion order — once the cap is hit, the oldest
|
|
||||||
* entries roll off and may be re-recorded if rediscovered, which is the
|
|
||||||
* correct degraded behaviour: at worst we re-emit a duplicate signal,
|
|
||||||
* never silently drop a real one.
|
|
||||||
*/
|
|
||||||
const SESSION_TRACKING_MAX = 1000
|
|
||||||
const SESSION_TRACKING_TRIM_TO = 750
|
|
||||||
const discoveredThisSession = new Set<string>()
|
const discoveredThisSession = new Set<string>()
|
||||||
const recordedGapSignals = new Set<string>()
|
const recordedGapSignals = new Set<string>()
|
||||||
|
|
||||||
function addBoundedSessionEntry(set: Set<string>, value: string): void {
|
|
||||||
set.add(value)
|
|
||||||
if (set.size > SESSION_TRACKING_MAX) {
|
|
||||||
const toDrop = set.size - SESSION_TRACKING_TRIM_TO
|
|
||||||
const iter = set.values()
|
|
||||||
for (let i = 0; i < toDrop; i++) {
|
|
||||||
const next = iter.next()
|
|
||||||
if (next.done) break
|
|
||||||
set.delete(next.value)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
const AUTO_LOAD_MIN_SCORE = Number(
|
const AUTO_LOAD_MIN_SCORE = Number(
|
||||||
process.env.SKILL_SEARCH_AUTOLOAD_MIN_SCORE ?? '0.30',
|
process.env.SKILL_SEARCH_AUTOLOAD_MIN_SCORE ?? '0.30',
|
||||||
)
|
)
|
||||||
@@ -211,7 +185,7 @@ async function maybeRecordSkillGap(
|
|||||||
|
|
||||||
const gapSignalKey = `${trigger}:${queryText.trim().toLowerCase()}`
|
const gapSignalKey = `${trigger}:${queryText.trim().toLowerCase()}`
|
||||||
if (recordedGapSignals.has(gapSignalKey)) return undefined
|
if (recordedGapSignals.has(gapSignalKey)) return undefined
|
||||||
addBoundedSessionEntry(recordedGapSignals, gapSignalKey)
|
recordedGapSignals.add(gapSignalKey)
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const [{ isSkillLearningEnabled }, { recordSkillGap }] = await Promise.all([
|
const [{ isSkillLearningEnabled }, { recordSkillGap }] = await Promise.all([
|
||||||
@@ -267,7 +241,7 @@ export async function startSkillDiscoveryPrefetch(
|
|||||||
const newResults = results.filter(r => !discoveredThisSession.has(r.name))
|
const newResults = results.filter(r => !discoveredThisSession.has(r.name))
|
||||||
if (newResults.length === 0) return []
|
if (newResults.length === 0) return []
|
||||||
|
|
||||||
for (const r of newResults) addBoundedSessionEntry(discoveredThisSession, r.name)
|
for (const r of newResults) discoveredThisSession.add(r.name)
|
||||||
|
|
||||||
const signal: DiscoverySignal = {
|
const signal: DiscoverySignal = {
|
||||||
trigger: 'assistant_turn',
|
trigger: 'assistant_turn',
|
||||||
@@ -331,7 +305,7 @@ export async function getTurnZeroSkillDiscovery(
|
|||||||
|
|
||||||
if (results.length === 0 && !gap) return null
|
if (results.length === 0 && !gap) return null
|
||||||
|
|
||||||
for (const r of results) addBoundedSessionEntry(discoveredThisSession, r.name)
|
for (const r of results) discoveredThisSession.add(r.name)
|
||||||
|
|
||||||
const signal: DiscoverySignal = {
|
const signal: DiscoverySignal = {
|
||||||
trigger: 'user_input',
|
trigger: 'user_input',
|
||||||
|
|||||||
@@ -73,7 +73,6 @@ export function injectUserMessageToTeammate(
|
|||||||
options:
|
options:
|
||||||
| {
|
| {
|
||||||
autonomyRunId?: string;
|
autonomyRunId?: string;
|
||||||
autonomyRootDir?: string;
|
|
||||||
origin?: MessageOrigin;
|
origin?: MessageOrigin;
|
||||||
}
|
}
|
||||||
| undefined,
|
| undefined,
|
||||||
@@ -94,9 +93,6 @@ export function injectUserMessageToTeammate(
   if (options?.autonomyRunId !== undefined) {
     pendingMessage.autonomyRunId = options.autonomyRunId;
   }
-  if (options?.autonomyRootDir !== undefined) {
-    pendingMessage.autonomyRootDir = options.autonomyRootDir;
-  }
   if (options?.origin !== undefined) {
     pendingMessage.origin = options.origin;
   }
|
|||||||
@@ -22,7 +22,6 @@ export type TeammateIdentity = {
 export type PendingTeammateUserMessage = {
   message: string
   autonomyRunId?: string
-  autonomyRootDir?: string
   origin?: MessageOrigin
 }

|
|||||||
@@ -361,7 +361,6 @@ export type QueuedCommand = {
    */
   autonomy?: {
     runId: string
-    rootDir?: string
     trigger: 'scheduled-task' | 'proactive-tick' | 'managed-flow-step'
     sourceId?: string
     sourceLabel?: string
|
|||||||
@@ -5,7 +5,6 @@ import {
   AUTONOMY_DIR,
   buildAutonomyTurnPrompt,
   loadAutonomyAuthority,
-  parseHeartbeatAuthorityTasks,
   resetAutonomyAuthorityForTests,
 } from '../autonomyAuthority'
 import {
@@ -239,79 +238,4 @@ describe('autonomyAuthority', () => {
     expect(prompt).not.toContain('- weekly-report (7d): Ship the weekly report')
     expect(prompt).not.toContain('- gather (')
   })
-
-  test('parseHeartbeatAuthorityTasks ignores tasks: literals inside markdown code fences', () => {
-    const content = [
-      '# HEARTBEAT.md',
-      '',
-      '```yaml',
-      'tasks:',
-      ' - name: not-a-real-task',
-      ' interval: 1m',
-      ' prompt: "would-be-shadowed"',
-      '```',
-      '',
-      'tasks:',
-      ' - name: real-task',
-      ' interval: 30m',
-      ' prompt: "Real prompt"',
-    ].join('\n')
-
-    const parsed = parseHeartbeatAuthorityTasks(content)
-
-    expect(parsed).toHaveLength(1)
-    expect(parsed[0]).toMatchObject({
-      name: 'real-task',
-      interval: '30m',
-      prompt: 'Real prompt',
-    })
-  })
-
-  test('parseHeartbeatAuthorityTasks ignores tasks: literals inside tilde markdown code fences', () => {
-    const content = [
-      '# HEARTBEAT.md',
-      '',
-      '~~~yaml',
-      'tasks:',
-      ' - name: not-a-real-task',
-      ' interval: 1m',
-      ' prompt: "would-be-shadowed"',
-      '~~~',
-      '',
-      'tasks:',
-      ' - name: real-task',
-      ' interval: 30m',
-      ' prompt: "Real prompt"',
-    ].join('\n')
-
-    const parsed = parseHeartbeatAuthorityTasks(content)
-
-    expect(parsed).toHaveLength(1)
-    expect(parsed[0]).toMatchObject({
-      name: 'real-task',
-      interval: '30m',
-      prompt: 'Real prompt',
-    })
-  })
-
-  test('parseHeartbeatAuthorityTasks parses real tasks even when documentation precedes them', () => {
-    const content = [
-      '# Heartbeat docs',
-      '',
-      'See `tasks:` below — the parser keys on the literal at column 0.',
-      '',
-      'tasks:',
-      ' - name: weekly',
-      ' interval: 7d',
-      ' prompt: "Ship report"',
-    ].join('\n')
-
-    const parsed = parseHeartbeatAuthorityTasks(content)
-
-    // Inline `tasks:` mention does NOT collide because it's not at column 0
-    // on its own line — the existing line.trim() === 'tasks:' guard handles
-    // that case. This test pins the behaviour.
-    expect(parsed).toHaveLength(1)
-    expect(parsed[0]?.name).toBe('weekly')
-  })
 })
|
|||||||
@@ -126,14 +126,6 @@ describe('listAutonomyFlows', () => {
           runCount: 0,
           ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
           currentDir: tempDir,
-          boundary: [
-            ' src/utils/** ',
-            '/absolute/not-allowed',
-            'src\\windows',
-            '../outside',
-            'src/utils/**',
-            'docs/*.md',
-          ],
           stateJson: {
             currentStepIndex: 0,
             steps: [
@@ -155,7 +147,6 @@ describe('listAutonomyFlows', () => {
     expect(flows).toHaveLength(1)
     expect(flows[0]?.flowId).toBe('flow-1')
     expect(flows[0]?.syncMode).toBe('managed')
-    expect(flows[0]?.boundary).toEqual(['src/utils/**', 'docs/*.md'])
     expect(flows[0]?.stateJson?.steps).toHaveLength(1)
   })

@@ -200,64 +191,6 @@ describe('listAutonomyFlows', () => {
     const flows = await listAutonomyFlows(tempDir)
     expect(flows).toEqual([])
   })
-
-  test('persistence pruning keeps active flows ahead of recent terminal history', async () => {
-    const flows: AutonomyFlowRecord[] = [
-      {
-        flowId: 'old-active',
-        flowKey: 'managed:scheduled-task:old-active',
-        syncMode: 'managed',
-        ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
-        revision: 1,
-        trigger: 'scheduled-task',
-        status: 'queued',
-        goal: 'old active',
-        rootDir: tempDir,
-        currentDir: tempDir,
-        runCount: 0,
-        createdAt: 1,
-        updatedAt: 1,
-      },
-      ...Array.from({ length: 100 }, (_, index) => ({
-        flowId: `history-${index}`,
-        flowKey: `managed:scheduled-task:history-${index}`,
-        syncMode: 'managed' as const,
-        ownerKey: DEFAULT_AUTONOMY_OWNER_KEY,
-        revision: 1,
-        trigger: 'scheduled-task' as const,
-        status: 'succeeded' as const,
-        goal: `history ${index}`,
-        rootDir: tempDir,
-        currentDir: tempDir,
-        runCount: 1,
-        createdAt: 1_000 + index,
-        updatedAt: 1_000 + index,
-        endedAt: 2_000 + index,
-      })),
-    ]
-    const flowsPath = resolveAutonomyFlowsPath(tempDir)
-    await mkdir(join(tempDir, AUTONOMY_DIR), { recursive: true })
-    await writeFile(
-      flowsPath,
-      `${JSON.stringify({ flows }, null, 2)}\n`,
-      'utf-8',
-    )
-
-    await startManagedAutonomyFlow({
-      trigger: 'scheduled-task',
-      goal: 'fresh active',
-      steps: TWO_STEPS,
-      rootDir: tempDir,
-      currentDir: tempDir,
-      sourceId: 'fresh-active',
-      nowMs: 9_999,
-    })
-
-    const persisted = await listAutonomyFlows(tempDir)
-    expect(persisted).toHaveLength(100)
-    expect(persisted.some(flow => flow.flowId === 'old-active')).toBe(true)
-    expect(persisted.some(flow => flow.flowId === 'history-0')).toBe(false)
-  })
 })

 describe('startManagedAutonomyFlow', () => {
@@ -292,49 +225,6 @@ describe('startManagedAutonomyFlow', () => {
     expect(result!.nextStep!.step.name).toBe('gather')
   })

-  test('normalizes and preserves boundary across completed flow restarts', async () => {
-    const first = await startManagedAutonomyFlow({
-      trigger: 'scheduled-task',
-      goal: 'Scoped flow',
-      steps: [{ name: 'only', prompt: 'Do it' }],
-      rootDir: tempDir,
-      sourceId: 'scoped-src',
-      boundary: [' src/utils/** ', 'src\\bad', '/absolute', 'docs/*.md'],
-      nowMs: 1000,
-    })
-    const flowId = first!.flow.flowId
-
-    expect(first!.flow.boundary).toEqual(['src/utils/**', 'docs/*.md'])
-
-    await queueManagedAutonomyFlowStepRun({
-      flowId,
-      stepId: first!.nextStep!.step.stepId,
-      stepIndex: 0,
-      runId: 'run-1',
-      rootDir: tempDir,
-      nowMs: 2000,
-    })
-    await markManagedAutonomyFlowStepCompleted({
-      flowId,
-      runId: 'run-1',
-      rootDir: tempDir,
-      nowMs: 3000,
-    })
-
-    const restarted = await startManagedAutonomyFlow({
-      trigger: 'scheduled-task',
-      goal: 'Scoped flow',
-      steps: [{ name: 'only', prompt: 'Do it again' }],
-      rootDir: tempDir,
-      sourceId: 'scoped-src',
-      nowMs: 4000,
-    })
-
-    expect(restarted!.started).toBe(true)
-    expect(restarted!.flow.flowId).toBe(flowId)
-    expect(restarted!.flow.boundary).toEqual(['src/utils/**', 'docs/*.md'])
-  })
-
   test('sets status=waiting when first step has waitFor', async () => {
     const result = await startManagedAutonomyFlow({
       trigger: 'scheduled-task',
|
|||||||
@@ -54,25 +54,6 @@ describe('withAutonomyPersistenceLock', () => {
     ).rejects.toThrow('inner failure')
   })

-  test('releases same-root lock bookkeeping after success and failure', async () => {
-    const {
-      getAutonomyPersistenceLockCountForTests,
-      withAutonomyPersistenceLock,
-    } = await import('../autonomyPersistence')
-
-    expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
-
-    await withAutonomyPersistenceLock(tempDir, async () => 'ok')
-    expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
-
-    await expect(
-      withAutonomyPersistenceLock(tempDir, async () => {
-        throw new Error('inner failure')
-      }),
-    ).rejects.toThrow('inner failure')
-    expect(getAutonomyPersistenceLockCountForTests()).toBe(0)
-  })
-
   test('serializes concurrent calls on the same rootDir', async () => {
     const { withAutonomyPersistenceLock } = await import(
       '../autonomyPersistence'
|
|||||||
@@ -1,279 +0,0 @@
-import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
-import { createTempDir, cleanupTempDir } from '../../../tests/mocks/file-system'
-import { getAttachmentMessages } from '../attachments'
-import {
-  createAutonomyQueuedPrompt,
-  createProactiveAutonomyCommands,
-  getAutonomyRunById,
-  markAutonomyRunCancelled,
-  startManagedAutonomyFlowFromHeartbeatTask,
-} from '../autonomyRuns'
-import { getAutonomyFlowById, listAutonomyFlows } from '../autonomyFlows'
-import {
-  cancelQueuedAutonomyCommands,
-  claimConsumableQueuedAutonomyCommands,
-  finalizeAutonomyCommandsForTurn,
-  partitionConsumableQueuedAutonomyCommands,
-} from '../autonomyQueueLifecycle'
-import {
-  enqueue,
-  getCommandsByMaxPriority,
-  remove as removeFromQueue,
-  resetCommandQueue,
-} from '../messageQueueManager'
-
-let tempDir = ''
-let extraTempDirs: string[] = []
-
-beforeEach(async () => {
-  tempDir = await createTempDir('autonomy-queue-lifecycle-')
-  extraTempDirs = []
-  resetCommandQueue()
-})
-
-afterEach(async () => {
-  resetCommandQueue()
-  if (tempDir) {
-    await cleanupTempDir(tempDir)
-  }
-  for (const extraTempDir of extraTempDirs) {
-    await cleanupTempDir(extraTempDir)
-  }
-})
-
-describe('autonomyQueueLifecycle', () => {
-  async function consumeQueuedAutonomyAttachmentTurn() {
-    const previousDisableAttachments =
-      process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
-    process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = '1'
-    try {
-      const snapshot = getCommandsByMaxPriority('later')
-      const claim = await claimConsumableQueuedAutonomyCommands(
-        snapshot,
-        tempDir,
-      )
-      removeFromQueue(claim.staleCommands)
-      removeFromQueue(claim.claimedCommands)
-
-      const attachments = []
-      for await (const attachment of getAttachmentMessages(
-        null,
-        {} as never,
-        null,
-        claim.attachmentCommands,
-        [],
-      )) {
-        attachments.push(attachment)
-      }
-
-      const consumedCommands = claim.attachmentCommands.filter(
-        command =>
-          (command.mode === 'prompt' || command.mode === 'task-notification') &&
-          !claim.claimedCommands.includes(command),
-      )
-      removeFromQueue(consumedCommands)
-      const nextCommands = await finalizeAutonomyCommandsForTurn({
-        commands: claim.claimedCommands,
-        outcome: { type: 'completed' },
-        currentDir: tempDir,
-        priority: 'later',
-      })
-      for (const command of nextCommands) {
-        enqueue(command)
-      }
-
-      return { attachments, runningRunIds: claim.claimedRunIds, nextCommands }
-    } finally {
-      if (previousDisableAttachments === undefined) {
-        delete process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS
-      } else {
-        process.env.CLAUDE_CODE_DISABLE_ATTACHMENTS = previousDisableAttachments
-      }
-    }
-  }
-
-  test('filters stale autonomy commands before mid-turn attachment consumption', async () => {
-    const command = await createAutonomyQueuedPrompt({
-      basePrompt: 'scheduled prompt',
-      trigger: 'scheduled-task',
-      rootDir: tempDir,
-      currentDir: tempDir,
-    })
-    expect(command).not.toBeNull()
-
-    const initial = await partitionConsumableQueuedAutonomyCommands(
-      [command!],
-      tempDir,
-    )
-    expect(initial.attachmentCommands).toHaveLength(1)
-    expect(initial.staleCommands).toHaveLength(0)
-
-    await markAutonomyRunCancelled(command!.autonomy!.runId, tempDir)
-
-    const afterCancel = await partitionConsumableQueuedAutonomyCommands(
-      [command!],
-      tempDir,
-    )
-    expect(afterCancel.attachmentCommands).toHaveLength(0)
-    expect(afterCancel.staleCommands).toHaveLength(1)
-  })
-
-  test('cancels proactive commands that are created but dropped before enqueue', async () => {
-    const commands = await createProactiveAutonomyCommands({
-      basePrompt: '<tick>12:00:00</tick>',
-      rootDir: tempDir,
-      currentDir: tempDir,
-    })
-    expect(commands).toHaveLength(1)
-
-    const queuedRun = await getAutonomyRunById(
-      commands[0]!.autonomy!.runId,
-      tempDir,
-    )
-    expect(queuedRun!.status).toBe('queued')
-
-    await cancelQueuedAutonomyCommands({ commands, rootDir: tempDir })
-
-    const cancelledRun = await getAutonomyRunById(
-      commands[0]!.autonomy!.runId,
-      tempDir,
-    )
-    expect(cancelledRun!.status).toBe('cancelled')
-  })
-
-  test('uses command rootDir when claiming after project context changes', async () => {
-    const otherProjectDir = await createTempDir('autonomy-other-project-')
-    extraTempDirs.push(otherProjectDir)
-    const command = await createAutonomyQueuedPrompt({
-      basePrompt: 'scheduled prompt',
-      trigger: 'scheduled-task',
-      rootDir: tempDir,
-      currentDir: tempDir,
-    })
-    expect(command).not.toBeNull()
-    expect(command!.autonomy?.rootDir).toBe(tempDir)
-
-    const claim = await claimConsumableQueuedAutonomyCommands(
-      [command!],
-      otherProjectDir,
-    )
-
-    const originalRun = await getAutonomyRunById(
-      command!.autonomy!.runId,
-      tempDir,
-    )
-    const wrongProjectRun = await getAutonomyRunById(
-      command!.autonomy!.runId,
-      otherProjectDir,
-    )
-
-    expect(claim.claimedRunIds).toEqual([command!.autonomy!.runId])
-    expect(claim.attachmentCommands).toHaveLength(1)
-    expect(originalRun!.status).toBe('running')
-    expect(wrongProjectRun).toBeNull()
-  })
-
-  test('advances a managed flow consumed as a queued attachment', async () => {
-    const command = await startManagedAutonomyFlowFromHeartbeatTask({
-      task: {
-        name: 'weekly-report',
-        interval: '7d',
-        prompt: 'Ship the weekly report',
-        steps: [
-          { name: 'gather', prompt: 'Gather weekly inputs' },
-          { name: 'draft', prompt: 'Draft weekly report' },
-        ],
-      },
-      rootDir: tempDir,
-      currentDir: tempDir,
-    })
-    expect(command).not.toBeNull()
-
-    const claim = await claimConsumableQueuedAutonomyCommands(
-      [command!],
-      tempDir,
-    )
-    const runningRunIds = claim.claimedRunIds
-    expect(runningRunIds).toEqual([command!.autonomy!.runId])
-
-    const nextCommands = await finalizeAutonomyCommandsForTurn({
-      commands: claim.claimedCommands,
-      outcome: { type: 'completed' },
-      currentDir: tempDir,
-      priority: 'later',
-    })
-    const [flow] = await listAutonomyFlows(tempDir)
-    const detail = await getAutonomyFlowById(flow!.flowId, tempDir)
-    const run = await getAutonomyRunById(command!.autonomy!.runId, tempDir)
-
-    expect(run!.status).toBe('completed')
-    expect(nextCommands).toHaveLength(1)
-    expect(nextCommands[0]!.autonomy?.flowId).toBe(flow!.flowId)
-    expect(detail!.stateJson!.steps.map(step => step.status)).toEqual([
-      'completed',
-      'queued',
-    ])
-  })
-
-  test('keeps managed autonomy flow coherent across queued attachment turns', async () => {
-    const firstCommand = await startManagedAutonomyFlowFromHeartbeatTask({
-      task: {
-        name: 'weekly-report',
-        interval: '7d',
-        prompt: 'Ship the weekly report',
-        steps: [
-          { name: 'gather', prompt: 'Gather weekly inputs' },
-          { name: 'draft', prompt: 'Draft weekly report' },
-        ],
-      },
-      rootDir: tempDir,
-      currentDir: tempDir,
-    })
-    expect(firstCommand).not.toBeNull()
-    enqueue(firstCommand!)
-
-    const firstTurn = await consumeQueuedAutonomyAttachmentTurn()
-    const queuedAfterFirstTurn = getCommandsByMaxPriority('later')
-    const [flowAfterFirstTurn] = await listAutonomyFlows(tempDir)
-    const firstRun = await getAutonomyRunById(
-      firstCommand!.autonomy!.runId,
-      tempDir,
-    )
-
-    expect(firstTurn.attachments).toHaveLength(1)
-    expect(firstTurn.attachments[0]!.attachment?.type).toBe('queued_command')
-    expect(firstTurn.runningRunIds).toEqual([firstCommand!.autonomy!.runId])
-    expect(firstTurn.nextCommands).toHaveLength(1)
-    expect(queuedAfterFirstTurn).toHaveLength(1)
-    expect(queuedAfterFirstTurn[0]!.autonomy?.flowId).toBe(
-      flowAfterFirstTurn!.flowId,
-    )
-    expect(firstRun!.status).toBe('completed')
-    expect(
-      flowAfterFirstTurn!.stateJson!.steps.map(step => step.status),
-    ).toEqual(['completed', 'queued'])
-
-    const secondCommand = queuedAfterFirstTurn[0]!
-    const secondTurn = await consumeQueuedAutonomyAttachmentTurn()
-    const queuedAfterSecondTurn = getCommandsByMaxPriority('later')
-    const finalFlow = await getAutonomyFlowById(
-      flowAfterFirstTurn!.flowId,
-      tempDir,
-    )
-    const secondRun = await getAutonomyRunById(
-      secondCommand.autonomy!.runId,
-      tempDir,
-    )
-
-    expect(secondTurn.attachments).toHaveLength(1)
-    expect(secondTurn.runningRunIds).toEqual([secondCommand.autonomy!.runId])
-    expect(secondTurn.nextCommands).toHaveLength(0)
-    expect(queuedAfterSecondTurn).toHaveLength(0)
-    expect(secondRun!.status).toBe('completed')
-    expect(finalFlow!.status).toBe('succeeded')
-    expect(finalFlow!.stateJson!.steps.map(step => step.status)).toEqual([
-      'completed',
-      'completed',
-    ])
-  })
-})
@@ -1,5 +1,6 @@
 import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
-import { join, resolve as resolvePath } from 'node:path'
+import { mkdir, writeFile } from 'fs/promises'
+import { join } from 'path'
 import {
   resetStateForTests,
   setCwdState,
@@ -7,23 +8,17 @@ import {
   setProjectRoot,
 } from '../../bootstrap/state'
 import {
-  createAutonomyRun,
   formatAutonomyRunsList,
   formatAutonomyRunsStatus,
   listAutonomyRuns,
   createAutonomyQueuedPrompt,
-  createAutonomyQueuedPromptIfNoActiveSource,
   createProactiveAutonomyCommands,
   finalizeAutonomyRunCompleted,
-  getAutonomyRunById,
-  hasActiveAutonomyRunForSource,
   markAutonomyRunCompleted,
-  markAutonomyRunCancelled,
   markAutonomyRunFailed,
   markAutonomyRunRunning,
   recoverManagedAutonomyFlowPrompt,
   resolveAutonomyRunsPath,
-  STALE_ACTIVE_RUN_ERROR_PREFIX,
   startManagedAutonomyFlowFromHeartbeatTask,
 } from '../autonomyRuns'
 import {
@@ -40,14 +35,11 @@ import {
   cleanupTempDir,
   createTempDir,
   createTempSubdir,
-  readTempFile,
-  tempPathExists,
   writeTempFile,
 } from '../../../tests/mocks/file-system'

 const AGENTS_REL = join(AUTONOMY_DIR, 'AGENTS.md')
 const HEARTBEAT_REL = join(AUTONOMY_DIR, 'HEARTBEAT.md')
-const RUNS_REL = join(AUTONOMY_DIR, 'runs.json')

 let tempDir = ''

@@ -103,9 +95,7 @@ describe('autonomyRuns', () => {
       ownerKey: 'main-thread',
       sourceId: 'cron-1',
       sourceLabel: 'nightly-report',
-      ownerProcessId: process.pid,
     })
-    expect(runs[0]?.ownerSessionId).toBeString()
     expect(flows).toHaveLength(0)
     expect(resolveAutonomyRunsPath(tempDir)).toContain('.claude')
   })
@@ -128,7 +118,7 @@ describe('autonomyRuns', () => {
     expect(command!.value).toContain('nested authority')
   })

-  test('markAutonomyRunRunning/completed update persisted lifecycle state for plain runs', async () => {
+  test('markAutonomyRunRunning/completed/failed update persisted lifecycle state for plain runs', async () => {
     const command = await createAutonomyQueuedPrompt({
       basePrompt: '<tick>12:00:00</tick>',
       trigger: 'proactive-tick',
@@ -144,9 +134,7 @@ describe('autonomyRuns', () => {
       runId,
       status: 'running',
       startedAt: 100,
-      ownerProcessId: process.pid,
     })
-    expect(runs[0]?.ownerSessionId).toBeString()

     await markAutonomyRunCompleted(runId, tempDir, 200)
     runs = await listAutonomyRuns(tempDir)
@@ -155,22 +143,9 @@ describe('autonomyRuns', () => {
       status: 'completed',
       endedAt: 200,
     })
-  })
-
-  test('markAutonomyRunFailed updates a non-terminal run', async () => {
-    const command = await createAutonomyQueuedPrompt({
-      basePrompt: '<tick>12:00:00</tick>',
-      trigger: 'proactive-tick',
-      rootDir: tempDir,
-      currentDir: tempDir,
-    })
-    expect(command).not.toBeNull()
-    const runId = command!.autonomy!.runId

-    await markAutonomyRunRunning(runId, tempDir, 100)
     await markAutonomyRunFailed(runId, 'boom', tempDir, 300)
-    const runs = await listAutonomyRuns(tempDir)
-
+    runs = await listAutonomyRuns(tempDir)
     expect(runs[0]).toMatchObject({
       runId,
       status: 'failed',
@@ -179,346 +154,6 @@ describe('autonomyRuns', () => {
     })
   })

-  test('terminal runs are not revived by stale lifecycle updates', async () => {
-    const command = await createAutonomyQueuedPrompt({
-      basePrompt: 'scheduled prompt',
-      trigger: 'scheduled-task',
-      rootDir: tempDir,
-      currentDir: tempDir,
-    })
-    expect(command).not.toBeNull()
-    const runId = command!.autonomy!.runId
-
-    await markAutonomyRunCancelled(runId, tempDir, 100)
-    const revived = await markAutonomyRunRunning(runId, tempDir, 200)
-    const completed = await markAutonomyRunCompleted(runId, tempDir, 300)
-    const failed = await markAutonomyRunFailed(
-      runId,
-      'late failure',
-      tempDir,
-      400,
-    )
-    const persisted = await getAutonomyRunById(runId, tempDir)
-
-    expect(revived).toBeNull()
-    expect(completed).toBeNull()
-    expect(failed).toBeNull()
-    expect(persisted).toMatchObject({
-      status: 'cancelled',
-      endedAt: 100,
-    })
-    expect(persisted!.error).toBeUndefined()
-  })
-
-  test('hasActiveAutonomyRunForSource only treats queued and running scheduled runs as active', async () => {
-    const command = await createAutonomyQueuedPrompt({
-      basePrompt: 'scheduled prompt',
-      trigger: 'scheduled-task',
-      rootDir: tempDir,
-      currentDir: tempDir,
-      sourceId: 'cron-1',
-      sourceLabel: 'nightly',
-    })
-    expect(command).not.toBeNull()
-    const runId = command!.autonomy!.runId
-
-    await expect(
-      hasActiveAutonomyRunForSource({
-        trigger: 'scheduled-task',
-        sourceId: 'cron-1',
-        rootDir: tempDir,
-      }),
-    ).resolves.toBe(true)
-
-    await markAutonomyRunRunning(runId, tempDir, 100)
-    await expect(
-      hasActiveAutonomyRunForSource({
-        trigger: 'scheduled-task',
-        sourceId: 'cron-1',
-        rootDir: tempDir,
-      }),
-    ).resolves.toBe(true)
-
-    await expect(
-      hasActiveAutonomyRunForSource({
-        trigger: 'scheduled-task',
-        sourceId: 'cron-2',
-        rootDir: tempDir,
-      }),
-    ).resolves.toBe(false)
-
-    await markAutonomyRunCompleted(runId, tempDir, 200)
-    await expect(
-      hasActiveAutonomyRunForSource({
-        trigger: 'scheduled-task',
-        sourceId: 'cron-1',
-        rootDir: tempDir,
-      }),
-    ).resolves.toBe(false)
-
-    const failedCommand = await createAutonomyQueuedPrompt({
-      basePrompt: 'scheduled prompt',
-      trigger: 'scheduled-task',
-      rootDir: tempDir,
-      currentDir: tempDir,
-      sourceId: 'cron-1',
-    })
-    expect(failedCommand).not.toBeNull()
-    await markAutonomyRunFailed(
-      failedCommand!.autonomy!.runId,
-      'boom',
-      tempDir,
-      300,
-    )
-    await expect(
-      hasActiveAutonomyRunForSource({
-        trigger: 'scheduled-task',
-        sourceId: 'cron-1',
-        rootDir: tempDir,
-      }),
-    ).resolves.toBe(false)
-  })
-
-  test('createAutonomyQueuedPromptIfNoActiveSource atomically skips duplicate active scheduled sources', async () => {
-    const [first, second] = await Promise.all([
-      createAutonomyQueuedPromptIfNoActiveSource({
-        basePrompt: 'scheduled prompt',
-        trigger: 'scheduled-task',
-        rootDir: tempDir,
-        currentDir: tempDir,
-        sourceId: 'cron-1',
-      }),
-      createAutonomyQueuedPromptIfNoActiveSource({
-        basePrompt: 'scheduled prompt',
-        trigger: 'scheduled-task',
-        rootDir: tempDir,
-        currentDir: tempDir,
-        sourceId: 'cron-1',
-      }),
-    ])
-
-    const created = [first, second].filter(command => command !== null)
-    const runs = await listAutonomyRuns(tempDir)
-
-    expect(created).toHaveLength(1)
-    expect(runs).toHaveLength(1)
-    expect(runs[0]).toMatchObject({
-      trigger: 'scheduled-task',
-      status: 'queued',
-      sourceId: 'cron-1',
-    })
-  })
-
-  test('createAutonomyQueuedPromptIfNoActiveSource scopes dedup by ownerKey', async () => {
-    const first = await createAutonomyQueuedPromptIfNoActiveSource({
-      basePrompt: 'scheduled prompt',
-      trigger: 'scheduled-task',
-      rootDir: tempDir,
-      currentDir: tempDir,
-      sourceId: 'cron-1',
-      ownerKey: 'owner-a',
-    })
-    const second = await createAutonomyQueuedPromptIfNoActiveSource({
-      basePrompt: 'scheduled prompt',
-      trigger: 'scheduled-task',
-      rootDir: tempDir,
-      currentDir: tempDir,
-      sourceId: 'cron-1',
-      ownerKey: 'owner-b',
-    })
-
-    const runs = await listAutonomyRuns(tempDir)
-
-    expect(first).not.toBeNull()
-    expect(second).not.toBeNull()
-    expect(runs).toHaveLength(2)
-    expect(new Set(runs.map(run => run.ownerKey))).toEqual(
-      new Set(['owner-a', 'owner-b']),
-    )
-  })
-
-  test('createAutonomyQueuedPromptIfNoActiveSource does not advance heartbeat last-run state on dedup skip (two-phase commit invariant)', async () => {
-    await writeTempFile(
-      tempDir,
-      HEARTBEAT_REL,
-      [
-        'tasks:',
-        ' - name: inbox',
-        ' interval: 30m',
-        ' prompt: "Check inbox"',
-      ].join('\n'),
-    )
-
-    // Seed an active queued run for cron-1 so the next dedup attempt skips.
-    await writeTempFile(
-      tempDir,
-      RUNS_REL,
-      `${JSON.stringify(
-        {
-          runs: [
|
|
||||||
{
|
|
||||||
runId: 'preexisting-active',
|
|
||||||
runtime: 'automatic',
|
|
||||||
trigger: 'scheduled-task',
|
|
||||||
status: 'queued',
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
sourceId: 'cron-1',
|
|
||||||
promptPreview: 'still queued',
|
|
||||||
createdAt: 100,
|
|
||||||
ownerProcessId: process.pid,
|
|
||||||
ownerSessionId: 'self',
|
|
||||||
},
|
|
||||||
],
|
|
||||||
},
|
|
||||||
null,
|
|
||||||
2,
|
|
||||||
)}\n`,
|
|
||||||
)
|
|
||||||
|
|
||||||
const skipped = await createAutonomyQueuedPromptIfNoActiveSource({
|
|
||||||
basePrompt: 'scheduled prompt',
|
|
||||||
trigger: 'scheduled-task',
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
sourceId: 'cron-1',
|
|
||||||
})
|
|
||||||
expect(skipped).toBeNull()
|
|
||||||
|
|
||||||
// If the dedup skip wrongly advanced heartbeat state, the next
|
|
||||||
// proactive-tick prompt would NOT include the inbox task. Verify it
|
|
||||||
// still does.
|
|
||||||
const followUp = await createAutonomyQueuedPrompt({
|
|
||||||
basePrompt: '<tick>12:00:00</tick>',
|
|
||||||
trigger: 'proactive-tick',
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
})
|
|
||||||
expect(followUp).not.toBeNull()
|
|
||||||
expect(followUp!.value).toContain('Due HEARTBEAT.md tasks:')
|
|
||||||
expect(followUp!.value).toContain('- inbox (30m): Check inbox')
|
|
||||||
})
|
|
||||||
|
|
||||||
test('createAutonomyQueuedPromptIfNoActiveSource recovers stale active runs from dead owner processes', async () => {
|
|
||||||
await writeTempFile(
|
|
||||||
tempDir,
|
|
||||||
RUNS_REL,
|
|
||||||
`${JSON.stringify(
|
|
||||||
{
|
|
||||||
runs: [
|
|
||||||
{
|
|
||||||
runId: 'stale-run',
|
|
||||||
runtime: 'automatic',
|
|
||||||
trigger: 'scheduled-task',
|
|
||||||
status: 'running',
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
sourceId: 'cron-1',
|
|
||||||
sourceLabel: 'nightly',
|
|
||||||
promptPreview: 'stale scheduled prompt',
|
|
||||||
createdAt: 100,
|
|
||||||
startedAt: 100,
|
|
||||||
ownerProcessId: 2_147_483_647,
|
|
||||||
ownerSessionId: 'dead-session',
|
|
||||||
},
|
|
||||||
],
|
|
||||||
},
|
|
||||||
null,
|
|
||||||
2,
|
|
||||||
)}\n`,
|
|
||||||
)
|
|
||||||
|
|
||||||
await expect(
|
|
||||||
hasActiveAutonomyRunForSource({
|
|
||||||
trigger: 'scheduled-task',
|
|
||||||
sourceId: 'cron-1',
|
|
||||||
rootDir: tempDir,
|
|
||||||
}),
|
|
||||||
).resolves.toBe(false)
|
|
||||||
|
|
||||||
const command = await createAutonomyQueuedPromptIfNoActiveSource({
|
|
||||||
basePrompt: 'scheduled prompt',
|
|
||||||
trigger: 'scheduled-task',
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
sourceId: 'cron-1',
|
|
||||||
})
|
|
||||||
const runs = await listAutonomyRuns(tempDir)
|
|
||||||
|
|
||||||
expect(command).not.toBeNull()
|
|
||||||
expect(runs).toHaveLength(2)
|
|
||||||
expect(runs[0]).toMatchObject({
|
|
||||||
trigger: 'scheduled-task',
|
|
||||||
status: 'queued',
|
|
||||||
sourceId: 'cron-1',
|
|
||||||
ownerProcessId: process.pid,
|
|
||||||
})
|
|
||||||
expect(runs[1]).toMatchObject({
|
|
||||||
runId: 'stale-run',
|
|
||||||
status: 'failed',
|
|
||||||
endedAt: runs[0]?.createdAt,
|
|
||||||
error: expect.stringContaining('owner process 2147483647'),
|
|
||||||
})
|
|
||||||
})
|
|
||||||
|
|
||||||
test('stale managed-flow run recovery also marks the flow step failed', async () => {
|
|
||||||
const command = await startManagedAutonomyFlowFromHeartbeatTask({
|
|
||||||
task: {
|
|
||||||
name: 'weekly-report',
|
|
||||||
interval: '7d',
|
|
||||||
prompt: 'Ship the weekly report',
|
|
||||||
steps: [
|
|
||||||
{
|
|
||||||
name: 'gather',
|
|
||||||
prompt: 'Gather weekly inputs',
|
|
||||||
},
|
|
||||||
],
|
|
||||||
},
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
})
|
|
||||||
expect(command).not.toBeNull()
|
|
||||||
const runId = command!.autonomy!.runId
|
|
||||||
await markAutonomyRunRunning(runId, tempDir, 100)
|
|
||||||
|
|
||||||
const runsPath = resolveAutonomyRunsPath(tempDir)
|
|
||||||
const file = JSON.parse(await readTempFile(runsPath)) as {
|
|
||||||
runs: Array<Record<string, unknown>>
|
|
||||||
}
|
|
||||||
file.runs = file.runs.map(run =>
|
|
||||||
run.runId === runId
|
|
||||||
? { ...run, ownerProcessId: 2_147_483_647 }
|
|
||||||
: run,
|
|
||||||
)
|
|
||||||
await writeTempFile(tempDir, RUNS_REL, `${JSON.stringify(file, null, 2)}\n`)
|
|
||||||
|
|
||||||
const replacement = await createAutonomyQueuedPromptIfNoActiveSource({
|
|
||||||
basePrompt: 'replacement prompt',
|
|
||||||
trigger: 'managed-flow-step',
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
sourceId: command!.autonomy!.sourceId!,
|
|
||||||
ownerKey: 'main-thread',
|
|
||||||
})
|
|
||||||
const [flow] = await listAutonomyFlows(tempDir)
|
|
||||||
const runs = await listAutonomyRuns(tempDir)
|
|
||||||
|
|
||||||
expect(replacement).not.toBeNull()
|
|
||||||
expect(runs.find(run => run.runId === runId)).toMatchObject({
|
|
||||||
status: 'failed',
|
|
||||||
error: expect.stringContaining(STALE_ACTIVE_RUN_ERROR_PREFIX),
|
|
||||||
})
|
|
||||||
expect(flow).toMatchObject({
|
|
||||||
status: 'failed',
|
|
||||||
blockedRunId: runId,
|
|
||||||
})
|
|
||||||
expect(flow?.stateJson?.steps[0]).toMatchObject({
|
|
||||||
status: 'failed',
|
|
||||||
runId,
|
|
||||||
error: expect.stringContaining(STALE_ACTIVE_RUN_ERROR_PREFIX),
|
|
||||||
})
|
|
||||||
})
|
|
||||||
|
|
||||||
test('formatters produce readable status and run listings', async () => {
|
test('formatters produce readable status and run listings', async () => {
|
||||||
const first = await createAutonomyQueuedPrompt({
|
const first = await createAutonomyQueuedPrompt({
|
||||||
basePrompt: 'scheduled prompt',
|
basePrompt: 'scheduled prompt',
|
||||||
@@ -588,56 +223,11 @@ describe('autonomyRuns', () => {
|
|||||||
)
|
)
|
||||||
})
|
})
|
||||||
|
|
||||||
test('persistence pruning keeps active runs ahead of recent completed history', async () => {
|
|
||||||
const runs = [
|
|
||||||
{
|
|
||||||
runId: 'old-active',
|
|
||||||
runtime: 'automatic',
|
|
||||||
trigger: 'scheduled-task',
|
|
||||||
status: 'queued',
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
ownerKey: 'main-thread',
|
|
||||||
promptPreview: 'old active',
|
|
||||||
createdAt: 1,
|
|
||||||
},
|
|
||||||
...Array.from({ length: 200 }, (_, index) => ({
|
|
||||||
runId: `history-${index}`,
|
|
||||||
runtime: 'automatic',
|
|
||||||
trigger: 'scheduled-task',
|
|
||||||
status: 'completed',
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
ownerKey: 'main-thread',
|
|
||||||
promptPreview: `history ${index}`,
|
|
||||||
createdAt: 1_000 + index,
|
|
||||||
endedAt: 2_000 + index,
|
|
||||||
})),
|
|
||||||
]
|
|
||||||
await writeTempFile(
|
|
||||||
tempDir,
|
|
||||||
RUNS_REL,
|
|
||||||
`${JSON.stringify({ runs }, null, 2)}\n`,
|
|
||||||
)
|
|
||||||
|
|
||||||
await createAutonomyRun({
|
|
||||||
trigger: 'scheduled-task',
|
|
||||||
prompt: 'fresh active',
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
nowMs: 9_999,
|
|
||||||
})
|
|
||||||
|
|
||||||
const persisted = await listAutonomyRuns(tempDir)
|
|
||||||
expect(persisted).toHaveLength(200)
|
|
||||||
expect(persisted.some(run => run.runId === 'old-active')).toBe(true)
|
|
||||||
expect(persisted.some(run => run.runId === 'history-0')).toBe(false)
|
|
||||||
})
|
|
||||||
|
|
||||||
test('listAutonomyRuns keeps older persisted records by normalizing missing runtime and owner metadata', async () => {
|
test('listAutonomyRuns keeps older persisted records by normalizing missing runtime and owner metadata', async () => {
|
||||||
await writeTempFile(
|
const runsPath = resolveAutonomyRunsPath(tempDir)
|
||||||
tempDir,
|
await mkdir(join(tempDir, '.claude', 'autonomy'), { recursive: true })
|
||||||
RUNS_REL,
|
await writeFile(
|
||||||
|
runsPath,
|
||||||
`${JSON.stringify(
|
`${JSON.stringify(
|
||||||
{
|
{
|
||||||
runs: [
|
runs: [
|
||||||
@@ -654,6 +244,7 @@ describe('autonomyRuns', () => {
|
|||||||
null,
|
null,
|
||||||
2,
|
2,
|
||||||
)}\n`,
|
)}\n`,
|
||||||
|
'utf-8',
|
||||||
)
|
)
|
||||||
|
|
||||||
const [legacy] = await listAutonomyRuns(tempDir)
|
const [legacy] = await listAutonomyRuns(tempDir)
|
||||||
@@ -827,27 +418,4 @@ describe('autonomyRuns', () => {
|
|||||||
expect(recovered!.autonomy?.runId).toBe(command!.autonomy?.runId)
|
expect(recovered!.autonomy?.runId).toBe(command!.autonomy?.runId)
|
||||||
expect(recovered!.autonomy?.flowId).toBe(flow!.flowId)
|
expect(recovered!.autonomy?.flowId).toBe(flow!.flowId)
|
||||||
})
|
})
|
||||||
|
|
||||||
test('STALE_ACTIVE_RUN_ERROR_PREFIX stays in sync with HEARTBEAT.md stale-recovery-health task', async () => {
|
|
||||||
// The HEARTBEAT.md stale-recovery-health task prompt embeds this prefix
|
|
||||||
// as a literal string. Changing the constant without updating the
|
|
||||||
// heartbeat prompt would silently break the monitor — this test fails
|
|
||||||
// first to force the simultaneous update.
|
|
||||||
const heartbeatPath = resolvePath(
|
|
||||||
import.meta.dir,
|
|
||||||
'..',
|
|
||||||
'..',
|
|
||||||
'..',
|
|
||||||
'.claude',
|
|
||||||
'autonomy',
|
|
||||||
'HEARTBEAT.md',
|
|
||||||
)
|
|
||||||
if (!(await tempPathExists(heartbeatPath))) {
|
|
||||||
// .claude/ may be absent in some checkout layouts (e.g., shallow clone
|
|
||||||
// for npm pack). Skip rather than fail in that case.
|
|
||||||
return
|
|
||||||
}
|
|
||||||
const content = await readTempFile(heartbeatPath)
|
|
||||||
expect(content).toContain(STALE_ACTIVE_RUN_ERROR_PREFIX)
|
|
||||||
})
|
|
||||||
})
|
})
|
||||||
|
|||||||
@@ -133,50 +133,11 @@ function mergeAgentsAuthority(files: AutonomyAuthorityFile[]): string | null {
|
|||||||
.join('\n\n')
|
.join('\n\n')
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
|
||||||
* Replaces fenced code-block content (and the ``` / ~~~ fence delimiters
|
|
||||||
* themselves) with empty strings while preserving the index of every
|
|
||||||
* other line. Used by the heartbeat parser so that `tasks:` literals
|
|
||||||
* appearing inside Markdown code samples in HEARTBEAT.md docs do not
|
|
||||||
* collide with the real config block.
|
|
||||||
*/
|
|
||||||
function maskCodeFencedLines(lines: string[]): string[] {
|
|
||||||
const masked = lines.slice()
|
|
||||||
let activeFenceChar: '`' | '~' | null = null
|
|
||||||
let activeFenceLen = 0
|
|
||||||
for (let i = 0; i < masked.length; i++) {
|
|
||||||
const trimmed = masked[i]!.trim()
|
|
||||||
const fenceMatch = trimmed.match(/^([`~])\1{2,}/)
|
|
||||||
if (fenceMatch) {
|
|
||||||
const fenceChar = fenceMatch[1]! as '`' | '~'
|
|
||||||
const fenceLen = fenceMatch[0]!.length
|
|
||||||
const trailing = trimmed.slice(fenceLen)
|
|
||||||
if (activeFenceChar === null) {
|
|
||||||
activeFenceChar = fenceChar
|
|
||||||
activeFenceLen = fenceLen
|
|
||||||
} else if (
|
|
||||||
activeFenceChar === fenceChar &&
|
|
||||||
fenceLen >= activeFenceLen &&
|
|
||||||
trailing.trim() === ''
|
|
||||||
) {
|
|
||||||
activeFenceChar = null
|
|
||||||
activeFenceLen = 0
|
|
||||||
}
|
|
||||||
masked[i] = ''
|
|
||||||
continue
|
|
||||||
}
|
|
||||||
if (activeFenceChar !== null) {
|
|
||||||
masked[i] = ''
|
|
||||||
}
|
|
||||||
}
|
|
||||||
return masked
|
|
||||||
}
|
|
||||||
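The masking behaviour described in the doc comment above can be illustrated with a small standalone sketch. The function body is repeated from the diff so the snippet runs on its own; the sample HEARTBEAT.md lines and the names `fence`, `sample`, and `maskedSample` are made up for illustration and do not come from the repository.

```typescript
// Copy of maskCodeFencedLines from the diff above, repeated so this sketch is self-contained.
function maskCodeFencedLines(lines: string[]): string[] {
  const masked = lines.slice()
  let activeFenceChar: '`' | '~' | null = null
  let activeFenceLen = 0
  for (let i = 0; i < masked.length; i++) {
    const trimmed = masked[i]!.trim()
    const fenceMatch = trimmed.match(/^([`~])\1{2,}/)
    if (fenceMatch) {
      const fenceChar = fenceMatch[1]! as '`' | '~'
      const fenceLen = fenceMatch[0]!.length
      const trailing = trimmed.slice(fenceLen)
      if (activeFenceChar === null) {
        // Opening fence: remember its character and length.
        activeFenceChar = fenceChar
        activeFenceLen = fenceLen
      } else if (
        activeFenceChar === fenceChar &&
        fenceLen >= activeFenceLen &&
        trailing.trim() === ''
      ) {
        // Closing fence: same character, at least as long, nothing after it.
        activeFenceChar = null
        activeFenceLen = 0
      }
      masked[i] = ''
      continue
    }
    if (activeFenceChar !== null) {
      masked[i] = ''
    }
  }
  return masked
}

// Hypothetical HEARTBEAT.md content: a documented sample followed by the real config.
const fence = '`'.repeat(3)
const sample = [
  '# Docs',
  `${fence}yaml`,
  'tasks:',
  '  - name: example',
  fence,
  'tasks:',
  '  - name: inbox',
]

// Fenced lines become '', every other line keeps its index and text.
const maskedSample = maskCodeFencedLines(sample)
console.log(maskedSample)
```

Because masked lines are blanked rather than removed, any line index the parser computes afterwards still refers to the same line of the original file.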
|
|
||||||
export function parseHeartbeatAuthorityTasks(
|
export function parseHeartbeatAuthorityTasks(
|
||||||
content: string,
|
content: string,
|
||||||
): HeartbeatAuthorityTask[] {
|
): HeartbeatAuthorityTask[] {
|
||||||
const tasks: HeartbeatAuthorityTask[] = []
|
const tasks: HeartbeatAuthorityTask[] = []
|
||||||
const lines = maskCodeFencedLines(content.split('\n'))
|
const lines = content.split('\n')
|
||||||
const getIndent = (line: string): number =>
|
const getIndent = (line: string): number =>
|
||||||
line.length - line.trimStart().length
|
line.length - line.trimStart().length
|
||||||
const parseScalar = (line: string, key: string): string =>
|
const parseScalar = (line: string, key: string): string =>
|
||||||
|
|||||||
@@ -3,10 +3,7 @@ import { mkdir, writeFile } from 'fs/promises'
|
|||||||
import { dirname, join, resolve } from 'path'
|
import { dirname, join, resolve } from 'path'
|
||||||
import { getProjectRoot } from '../bootstrap/state.js'
|
import { getProjectRoot } from '../bootstrap/state.js'
|
||||||
import { AUTONOMY_DIR, type AutonomyTriggerKind } from './autonomyAuthority.js'
|
import { AUTONOMY_DIR, type AutonomyTriggerKind } from './autonomyAuthority.js'
|
||||||
import {
|
import { withAutonomyPersistenceLock } from './autonomyPersistence.js'
|
||||||
retainActiveFirst,
|
|
||||||
withAutonomyPersistenceLock,
|
|
||||||
} from './autonomyPersistence.js'
|
|
||||||
import { getFsImplementation } from './fsOperations.js'
|
import { getFsImplementation } from './fsOperations.js'
|
||||||
|
|
||||||
const AUTONOMY_FLOWS_MAX = 100
|
const AUTONOMY_FLOWS_MAX = 100
|
||||||
@@ -86,20 +83,6 @@ export type AutonomyFlowRecord = {
|
|||||||
waitJson?: AutonomyFlowWaitState
|
waitJson?: AutonomyFlowWaitState
|
||||||
cancelRequestedAt?: number
|
cancelRequestedAt?: number
|
||||||
lastError?: string
|
lastError?: string
|
||||||
/**
|
|
||||||
* Repo-relative POSIX glob patterns describing which paths this flow's
|
|
||||||
* `report`-step approval covers. The pre-tool-use hook
|
|
||||||
* `require-plan-for-risky-edit.mjs` consults this list to permit edits
|
|
||||||
* only when the target file matches at least one entry. Absent or empty
|
|
||||||
* means "no boundary declared" — during the pilot window the hook
|
|
||||||
* treats this as broad approval (v1 behaviour). Once all production
|
|
||||||
* flows declare boundaries, the hook will deny absent-boundary flows.
|
|
||||||
*
|
|
||||||
* Supported syntax: `*` matches one path segment, `**` matches any
|
|
||||||
* number including zero. Examples: `src/utils/autonomy*`,
|
|
||||||
* `src/services/api/**`, `src/Tool.ts`.
|
|
||||||
*/
|
|
||||||
boundary?: string[]
|
|
||||||
}
|
}
|
||||||
|
|
||||||
type AutonomyFlowsFile = {
|
type AutonomyFlowsFile = {
|
||||||
@@ -155,7 +138,6 @@ function cloneWaitState(
|
|||||||
function cloneFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
|
function cloneFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
|
||||||
return {
|
return {
|
||||||
...flow,
|
...flow,
|
||||||
...(flow.boundary ? { boundary: [...flow.boundary] } : {}),
|
|
||||||
...(flow.stateJson ? { stateJson: cloneManagedState(flow.stateJson) } : {}),
|
...(flow.stateJson ? { stateJson: cloneManagedState(flow.stateJson) } : {}),
|
||||||
...(flow.waitJson ? { waitJson: cloneWaitState(flow.waitJson) } : {}),
|
...(flow.waitJson ? { waitJson: cloneWaitState(flow.waitJson) } : {}),
|
||||||
}
|
}
|
||||||
@@ -170,17 +152,6 @@ function isManagedFlowStatusActive(status: AutonomyFlowStatus): boolean {
|
|||||||
)
|
)
|
||||||
}
|
}
|
||||||
|
|
||||||
function selectPersistedAutonomyFlows(
|
|
||||||
flows: AutonomyFlowRecord[],
|
|
||||||
): AutonomyFlowRecord[] {
|
|
||||||
return retainActiveFirst(
|
|
||||||
flows.map(cloneFlowRecord),
|
|
||||||
flow => isManagedFlowStatusActive(flow.status),
|
|
||||||
flow => flow.updatedAt,
|
|
||||||
AUTONOMY_FLOWS_MAX,
|
|
||||||
)
|
|
||||||
}
|
|
||||||
|
|
||||||
function defaultFlowSource(params: {
|
function defaultFlowSource(params: {
|
||||||
trigger: AutonomyTriggerKind
|
trigger: AutonomyTriggerKind
|
||||||
sourceId?: string
|
sourceId?: string
|
||||||
@@ -266,35 +237,6 @@ function normalizeWaitState(value: unknown): AutonomyFlowWaitState | undefined {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
function isPosixBoundaryGlob(value: string): boolean {
|
|
||||||
if (!value || value.startsWith('/') || value.includes('\\')) {
|
|
||||||
return false
|
|
||||||
}
|
|
||||||
if (value.includes('\0')) {
|
|
||||||
return false
|
|
||||||
}
|
|
||||||
return !value.split('/').some(segment => segment === '..')
|
|
||||||
}
|
|
||||||
|
|
||||||
function normalizeBoundary(value: unknown): string[] | undefined {
|
|
||||||
if (!Array.isArray(value)) {
|
|
||||||
return undefined
|
|
||||||
}
|
|
||||||
const seen = new Set<string>()
|
|
||||||
const boundary = value
|
|
||||||
.filter((entry): entry is string => typeof entry === 'string')
|
|
||||||
.map(entry => entry.trim())
|
|
||||||
.filter(isPosixBoundaryGlob)
|
|
||||||
.filter(entry => {
|
|
||||||
if (seen.has(entry)) {
|
|
||||||
return false
|
|
||||||
}
|
|
||||||
seen.add(entry)
|
|
||||||
return true
|
|
||||||
})
|
|
||||||
return boundary.length > 0 ? boundary : undefined
|
|
||||||
}
|
|
||||||
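As a rough illustration of what the two helpers above accept and reject, the sketch below repeats them verbatim and runs them on a made-up input. The names `rawBoundary` and `normalizedBoundary` and the sample entries are assumptions for the example only. How the pre-tool-use hook later matches paths against these patterns is not shown in this diff and is not modelled here.

```typescript
// Copies of the helpers from the diff above, repeated so this sketch is self-contained.
function isPosixBoundaryGlob(value: string): boolean {
  if (!value || value.startsWith('/') || value.includes('\\')) {
    return false
  }
  if (value.includes('\0')) {
    return false
  }
  return !value.split('/').some(segment => segment === '..')
}

function normalizeBoundary(value: unknown): string[] | undefined {
  if (!Array.isArray(value)) {
    return undefined
  }
  const seen = new Set<string>()
  const boundary = value
    .filter((entry): entry is string => typeof entry === 'string')
    .map(entry => entry.trim())
    .filter(isPosixBoundaryGlob)
    .filter(entry => {
      if (seen.has(entry)) {
        return false
      }
      seen.add(entry)
      return true
    })
  return boundary.length > 0 ? boundary : undefined
}

// Hypothetical persisted value mixing valid and invalid entries.
const rawBoundary: unknown = [
  'src/utils/autonomy*', // kept: repo-relative POSIX glob
  ' src/Tool.ts ', // kept after trimming
  '/abs/path', // dropped: absolute path
  'src\\windows\\path', // dropped: contains a backslash
  '../escape', // dropped: contains a '..' segment
  'src/Tool.ts', // dropped: duplicate of the trimmed entry above
  42, // dropped: not a string
]

const normalizedBoundary = normalizeBoundary(rawBoundary)
console.log(normalizedBoundary)

// Non-array input and arrays with no valid entry both normalize to undefined.
console.log(normalizeBoundary('src/**'), normalizeBoundary(['/only/absolute']))
```

Returning `undefined` rather than an empty array keeps the "no boundary declared" case in a single representation, which matches the "absent or empty" wording of the `boundary` doc comment.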
|
|
||||||
function normalizeFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
|
function normalizeFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
|
||||||
const source = defaultFlowSource(flow)
|
const source = defaultFlowSource(flow)
|
||||||
return {
|
return {
|
||||||
@@ -305,7 +247,6 @@ function normalizeFlowRecord(flow: AutonomyFlowRecord): AutonomyFlowRecord {
|
|||||||
goal: flow.goal || flow.sourceLabel || flow.sourceId || flow.flowKey,
|
goal: flow.goal || flow.sourceLabel || flow.sourceId || flow.flowKey,
|
||||||
currentDir: flow.currentDir || flow.rootDir,
|
currentDir: flow.currentDir || flow.rootDir,
|
||||||
runCount: Math.max(flow.runCount ?? 0, 0),
|
runCount: Math.max(flow.runCount ?? 0, 0),
|
||||||
boundary: normalizeBoundary(flow.boundary),
|
|
||||||
stateJson: normalizeManagedState(flow.stateJson),
|
stateJson: normalizeManagedState(flow.stateJson),
|
||||||
waitJson: normalizeWaitState(flow.waitJson),
|
waitJson: normalizeWaitState(flow.waitJson),
|
||||||
...(flow.sourceId
|
...(flow.sourceId
|
||||||
@@ -428,7 +369,11 @@ async function writeAutonomyFlows(
|
|||||||
path,
|
path,
|
||||||
`${JSON.stringify(
|
`${JSON.stringify(
|
||||||
{
|
{
|
||||||
flows: selectPersistedAutonomyFlows(flows),
|
flows: flows
|
||||||
|
.slice()
|
||||||
|
.map(cloneFlowRecord)
|
||||||
|
.sort((left, right) => right.updatedAt - left.updatedAt)
|
||||||
|
.slice(0, AUTONOMY_FLOWS_MAX),
|
||||||
} satisfies AutonomyFlowsFile,
|
} satisfies AutonomyFlowsFile,
|
||||||
null,
|
null,
|
||||||
2,
|
2,
|
||||||
@@ -475,7 +420,6 @@ export async function startManagedAutonomyFlow(params: {
|
|||||||
ownerKey?: string
|
ownerKey?: string
|
||||||
sourceId?: string
|
sourceId?: string
|
||||||
sourceLabel?: string
|
sourceLabel?: string
|
||||||
boundary?: string[]
|
|
||||||
nowMs?: number
|
nowMs?: number
|
||||||
}): Promise<ManagedAutonomyFlowStartResult | null> {
|
}): Promise<ManagedAutonomyFlowStartResult | null> {
|
||||||
if (params.steps.length === 0) {
|
if (params.steps.length === 0) {
|
||||||
@@ -506,8 +450,6 @@ export async function startManagedAutonomyFlow(params: {
|
|||||||
|
|
||||||
const stateJson = buildManagedState(params.steps)
|
const stateJson = buildManagedState(params.steps)
|
||||||
const firstStep = stateJson.steps[0]!
|
const firstStep = stateJson.steps[0]!
|
||||||
const boundary =
|
|
||||||
normalizeBoundary(params.boundary) ?? normalizeBoundary(current?.boundary)
|
|
||||||
const waiting =
|
const waiting =
|
||||||
firstStep.waitFor != null
|
firstStep.waitFor != null
|
||||||
? {
|
? {
|
||||||
@@ -532,7 +474,6 @@ export async function startManagedAutonomyFlow(params: {
|
|||||||
currentDir,
|
currentDir,
|
||||||
...(source.sourceId ? { sourceId: source.sourceId } : {}),
|
...(source.sourceId ? { sourceId: source.sourceId } : {}),
|
||||||
...(source.sourceLabel ? { sourceLabel: source.sourceLabel } : {}),
|
...(source.sourceLabel ? { sourceLabel: source.sourceLabel } : {}),
|
||||||
...(boundary ? { boundary } : {}),
|
|
||||||
latestRunId: undefined,
|
latestRunId: undefined,
|
||||||
runCount: current?.runCount ?? 0,
|
runCount: current?.runCount ?? 0,
|
||||||
createdAt: current?.createdAt ?? nowMs,
|
createdAt: current?.createdAt ?? nowMs,
|
||||||
|
|||||||
@@ -4,42 +4,6 @@ import { lock } from './lockfile.js'
|
|||||||
|
|
||||||
const persistenceLocks = new Map<string, Promise<void>>()
|
const persistenceLocks = new Map<string, Promise<void>>()
|
||||||
|
|
||||||
/**
|
|
||||||
* Two-phase persistence retention. Active records (queued/running, etc.) are
|
|
||||||
* always kept — capping them risks evicting in-flight work; that responsibility
|
|
||||||
* lives in caller-side leak detection. Inactive (terminal) records are ranked
|
|
||||||
* by `getTimestamp` desc and capped to fill the remaining budget below `max`.
|
|
||||||
*
|
|
||||||
* Returned list is sorted by `getTimestamp` desc regardless of activity, so
|
|
||||||
* the persisted file is plain reverse-chronological order — listings/UI can
|
|
||||||
* consume it directly without re-sorting.
|
|
||||||
*/
|
|
||||||
export function retainActiveFirst<T>(
|
|
||||||
records: readonly T[],
|
|
||||||
isActive: (record: T) => boolean,
|
|
||||||
getTimestamp: (record: T) => number,
|
|
||||||
max: number,
|
|
||||||
): T[] {
|
|
||||||
const sortDesc = (left: T, right: T) =>
|
|
||||||
getTimestamp(right) - getTimestamp(left)
|
|
||||||
const active = records.filter(isActive).slice().sort(sortDesc)
|
|
||||||
const history = records
|
|
||||||
.filter(record => !isActive(record))
|
|
||||||
.slice()
|
|
||||||
.sort(sortDesc)
|
|
||||||
.slice(0, Math.max(0, max - active.length))
|
|
||||||
return [...active, ...history].sort(sortDesc)
|
|
||||||
}
|
|
||||||
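The retention rule in the doc comment above can be checked with a small standalone sketch. `retainActiveFirst` is repeated from the diff so the snippet runs on its own; the `DemoRecord` type, the sample records, and the cap of 3 are hypothetical and chosen only to keep the example short.

```typescript
// Copy of retainActiveFirst from the diff above, repeated so this sketch is self-contained.
function retainActiveFirst<T>(
  records: readonly T[],
  isActive: (record: T) => boolean,
  getTimestamp: (record: T) => number,
  max: number,
): T[] {
  const sortDesc = (left: T, right: T) =>
    getTimestamp(right) - getTimestamp(left)
  const active = records.filter(isActive).slice().sort(sortDesc)
  const history = records
    .filter(record => !isActive(record))
    .slice()
    .sort(sortDesc)
    .slice(0, Math.max(0, max - active.length))
  return [...active, ...history].sort(sortDesc)
}

// Hypothetical record shape for the demo only.
type DemoRecord = { id: string; active: boolean; at: number }

const demoRecords: DemoRecord[] = [
  { id: 'old-active', active: true, at: 1 },
  { id: 'h1', active: false, at: 10 },
  { id: 'h2', active: false, at: 20 },
  { id: 'h3', active: false, at: 30 },
]

// With max = 3, the single active record takes one slot and the two most
// recent terminal records fill the rest; the oldest terminal record is dropped.
const kept = retainActiveFirst(
  demoRecords,
  record => record.active,
  record => record.at,
  3,
)
const keptIds = kept.map(record => record.id)
console.log(keptIds)
```

A plain "sort by timestamp, keep the newest `max`" policy would have evicted `old-active` here, which is the situation the 'persistence pruning keeps active runs ahead of recent completed history' test earlier in this diff guards against. Note that when more than `max` records are active, all of them are returned, so the result can exceed `max`.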
|
|
||||||
export function getAutonomyPersistenceLockCountForTests(): number {
|
|
||||||
if (process.env.NODE_ENV !== 'test') {
|
|
||||||
throw new Error(
|
|
||||||
'getAutonomyPersistenceLockCountForTests can only be called in tests',
|
|
||||||
)
|
|
||||||
}
|
|
||||||
return persistenceLocks.size
|
|
||||||
}
|
|
||||||
|
|
||||||
export async function withAutonomyPersistenceLock<T>(
|
export async function withAutonomyPersistenceLock<T>(
|
||||||
rootDir: string,
|
rootDir: string,
|
||||||
fn: () => Promise<T>,
|
fn: () => Promise<T>,
|
||||||
@@ -52,8 +16,10 @@ export async function withAutonomyPersistenceLock<T>(
|
|||||||
const current = new Promise<void>(resolve => {
|
const current = new Promise<void>(resolve => {
|
||||||
release = resolve
|
release = resolve
|
||||||
})
|
})
|
||||||
const chained = previous.then(() => current)
|
persistenceLocks.set(
|
||||||
persistenceLocks.set(key, chained)
|
key,
|
||||||
|
previous.then(() => current),
|
||||||
|
)
|
||||||
|
|
||||||
await previous
|
await previous
|
||||||
try {
|
try {
|
||||||
@@ -75,7 +41,7 @@ export async function withAutonomyPersistenceLock<T>(
|
|||||||
}
|
}
|
||||||
} finally {
|
} finally {
|
||||||
release()
|
release()
|
||||||
if (persistenceLocks.get(key) === chained) {
|
if (persistenceLocks.get(key) === current) {
|
||||||
persistenceLocks.delete(key)
|
persistenceLocks.delete(key)
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -1,261 +0,0 @@
|
|||||||
import type { QueuedCommand } from '../types/textInputTypes.js'
|
|
||||||
import {
|
|
||||||
finalizeAutonomyRunCompleted,
|
|
||||||
finalizeAutonomyRunFailed,
|
|
||||||
listAutonomyRuns,
|
|
||||||
markAutonomyRunCancelled,
|
|
||||||
markAutonomyRunRunning,
|
|
||||||
} from './autonomyRuns.js'
|
|
||||||
|
|
||||||
export type AutonomyQueuePartition = {
|
|
||||||
attachmentCommands: QueuedCommand[]
|
|
||||||
staleCommands: QueuedCommand[]
|
|
||||||
}
|
|
||||||
|
|
||||||
export type AutonomyQueueClaim = AutonomyQueuePartition & {
|
|
||||||
claimedRunIds: string[]
|
|
||||||
claimedCommands: QueuedCommand[]
|
|
||||||
}
|
|
||||||
|
|
||||||
export type AutonomyTurnOutcome =
|
|
||||||
| { type: 'completed' }
|
|
||||||
| { type: 'cancelled' }
|
|
||||||
| { type: 'failed'; error?: unknown; message?: string }
|
|
||||||
|
|
||||||
type AutonomyRunRef = {
|
|
||||||
runId: string
|
|
||||||
rootDir?: string
|
|
||||||
}
|
|
||||||
|
|
||||||
function getCommandRootDir(
|
|
||||||
command: QueuedCommand,
|
|
||||||
fallbackRootDir?: string,
|
|
||||||
): string | undefined {
|
|
||||||
return command.autonomy?.rootDir ?? fallbackRootDir
|
|
||||||
}
|
|
||||||
|
|
||||||
function refKey(ref: AutonomyRunRef): string {
|
|
||||||
return `${ref.rootDir ?? ''}\0${ref.runId}`
|
|
||||||
}
|
|
||||||
|
|
||||||
function getAutonomyRunRefs(
|
|
||||||
commands: QueuedCommand[],
|
|
||||||
fallbackRootDir?: string,
|
|
||||||
): AutonomyRunRef[] {
|
|
||||||
const refs = new Map<string, AutonomyRunRef>()
|
|
||||||
for (const command of commands) {
|
|
||||||
const runId = command.autonomy?.runId
|
|
||||||
if (!runId) {
|
|
||||||
continue
|
|
||||||
}
|
|
||||||
const ref = {
|
|
||||||
runId,
|
|
||||||
rootDir: getCommandRootDir(command, fallbackRootDir),
|
|
||||||
}
|
|
||||||
refs.set(refKey(ref), ref)
|
|
||||||
}
|
|
||||||
return [...refs.values()]
|
|
||||||
}
|
|
||||||
|
|
||||||
function isInlineQueuedCommand(command: QueuedCommand): boolean {
|
|
||||||
return command.mode === 'prompt' || command.mode === 'task-notification'
|
|
||||||
}
|
|
||||||
|
|
||||||
function groupRefsByRootDir(
|
|
||||||
refs: AutonomyRunRef[],
|
|
||||||
): Map<string, AutonomyRunRef[]> {
|
|
||||||
const grouped = new Map<string, AutonomyRunRef[]>()
|
|
||||||
for (const ref of refs) {
|
|
||||||
const key = ref.rootDir ?? ''
|
|
||||||
const group = grouped.get(key)
|
|
||||||
if (group) {
|
|
||||||
group.push(ref)
|
|
||||||
} else {
|
|
||||||
grouped.set(key, [ref])
|
|
||||||
}
|
|
||||||
}
|
|
||||||
return grouped
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Exclude queued autonomy commands whose persisted run is no longer queued.
|
|
||||||
* This prevents stale in-memory commands from reviving flows after cancellation
|
|
||||||
* or after another path has already consumed the run.
|
|
||||||
*/
|
|
||||||
export async function partitionConsumableQueuedAutonomyCommands(
|
|
||||||
commands: QueuedCommand[],
|
|
||||||
rootDir?: string,
|
|
||||||
): Promise<AutonomyQueuePartition> {
|
|
||||||
const attachmentCommands: QueuedCommand[] = []
|
|
||||||
const staleCommands: QueuedCommand[] = []
|
|
||||||
const refs = getAutonomyRunRefs(commands, rootDir)
|
|
||||||
const runsByRef = new Map<
|
|
||||||
string,
|
|
||||||
Awaited<ReturnType<typeof listAutonomyRuns>>[number]
|
|
||||||
>()
|
|
||||||
for (const [rootKey, group] of groupRefsByRootDir(refs)) {
|
|
||||||
const runs = await listAutonomyRuns(rootKey || undefined)
|
|
||||||
const wanted = new Set(group.map(ref => ref.runId))
|
|
||||||
for (const run of runs) {
|
|
||||||
if (wanted.has(run.runId)) {
|
|
||||||
runsByRef.set(
|
|
||||||
refKey({ runId: run.runId, rootDir: rootKey || undefined }),
|
|
||||||
run,
|
|
||||||
)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
for (const command of commands) {
|
|
||||||
const runId = command.autonomy?.runId
|
|
||||||
if (!runId) {
|
|
||||||
attachmentCommands.push(command)
|
|
||||||
continue
|
|
||||||
}
|
|
||||||
|
|
||||||
const commandRootDir = getCommandRootDir(command, rootDir)
|
|
||||||
const run = runsByRef.get(refKey({ runId, rootDir: commandRootDir }))
|
|
||||||
if (run?.status === 'queued' && !run.startedAt && !run.endedAt) {
|
|
||||||
attachmentCommands.push(command)
|
|
||||||
} else {
|
|
||||||
staleCommands.push(command)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
return { attachmentCommands, staleCommands }
|
|
||||||
}
|
|
||||||
|
|
||||||
export async function claimConsumableQueuedAutonomyCommands(
|
|
||||||
commands: QueuedCommand[],
|
|
||||||
rootDir?: string,
|
|
||||||
): Promise<AutonomyQueueClaim> {
|
|
||||||
const partition = await partitionConsumableQueuedAutonomyCommands(
|
|
||||||
commands,
|
|
||||||
rootDir,
|
|
||||||
)
|
|
||||||
const claimedRunIds: string[] = []
|
|
||||||
const claimedRunKeys: string[] = []
|
|
||||||
const staleRunKeys = new Set<string>()
|
|
||||||
const candidateRefs = getAutonomyRunRefs(
|
|
||||||
partition.attachmentCommands.filter(isInlineQueuedCommand),
|
|
||||||
rootDir,
|
|
||||||
)
|
|
||||||
|
|
||||||
for (const ref of candidateRefs) {
|
|
||||||
const updated = await markAutonomyRunRunning(ref.runId, ref.rootDir)
|
|
||||||
if (updated?.status === 'running') {
|
|
||||||
claimedRunIds.push(ref.runId)
|
|
||||||
claimedRunKeys.push(refKey(ref))
|
|
||||||
} else {
|
|
||||||
staleRunKeys.add(refKey(ref))
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
const claimedRunKeySet = new Set(claimedRunKeys)
|
|
||||||
const attachmentCommands: QueuedCommand[] = []
|
|
||||||
const claimedCommands: QueuedCommand[] = []
|
|
||||||
const staleCommands = [...partition.staleCommands]
|
|
||||||
|
|
||||||
for (const command of partition.attachmentCommands) {
|
|
||||||
const runId = command.autonomy?.runId
|
|
||||||
if (!runId) {
|
|
||||||
attachmentCommands.push(command)
|
|
||||||
continue
|
|
||||||
}
|
|
||||||
const key = refKey({
|
|
||||||
runId,
|
|
||||||
rootDir: getCommandRootDir(command, rootDir),
|
|
||||||
})
|
|
||||||
if (claimedRunKeySet.has(key)) {
|
|
||||||
attachmentCommands.push(command)
|
|
||||||
claimedCommands.push(command)
|
|
||||||
} else if (staleRunKeys.has(key)) {
|
|
||||||
staleCommands.push(command)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
return {
|
|
||||||
attachmentCommands,
|
|
||||||
staleCommands,
|
|
||||||
claimedRunIds,
|
|
||||||
claimedCommands,
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
export async function cancelQueuedAutonomyCommands(params: {
|
|
||||||
commands: QueuedCommand[]
|
|
||||||
rootDir?: string
|
|
||||||
}): Promise<void> {
|
|
||||||
for (const ref of getAutonomyRunRefs(params.commands, params.rootDir)) {
|
|
||||||
await markAutonomyRunCancelled(ref.runId, ref.rootDir)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
function stringifyAutonomyError(error: unknown): string {
|
|
||||||
if (typeof error === 'string') {
|
|
||||||
return error
|
|
||||||
}
|
|
||||||
if (error instanceof Error) {
|
|
||||||
return error.message
|
|
||||||
}
|
|
||||||
return String(error)
|
|
||||||
}
|
|
||||||
|
|
||||||
export function sanitizeAutonomyFailureForPersistence(
|
|
||||||
error: unknown,
|
|
||||||
fallback = 'query failed',
|
|
||||||
): string {
|
|
||||||
const message = stringifyAutonomyError(error)
|
|
||||||
const lower = message.toLowerCase()
|
|
||||||
if (
|
|
||||||
lower.includes('api_error') ||
|
|
||||||
lower.includes('provider') ||
|
|
||||||
lower.includes('openai') ||
|
|
||||||
lower.includes('gemini') ||
|
|
||||||
lower.includes('grok') ||
|
|
||||||
lower.includes('anthropic') ||
|
|
||||||
lower.includes('bedrock') ||
|
|
||||||
lower.includes('vertex')
|
|
||||||
) {
|
|
||||||
return 'provider api_error'
|
|
||||||
}
|
|
||||||
return fallback
|
|
||||||
}
|
|
||||||
|
|
||||||
export async function finalizeAutonomyCommandsForTurn(params: {
|
|
||||||
commands: QueuedCommand[]
|
|
||||||
outcome: AutonomyTurnOutcome
|
|
||||||
currentDir?: string
|
|
||||||
priority?: 'now' | 'next' | 'later'
|
|
||||||
workload?: string
|
|
||||||
}): Promise<QueuedCommand[]> {
|
|
||||||
const nextCommands: QueuedCommand[] = []
|
|
||||||
for (const command of params.commands) {
|
|
||||||
const autonomy = command.autonomy
|
|
||||||
if (!autonomy?.runId) {
|
|
||||||
continue
|
|
||||||
}
|
|
||||||
if (params.outcome.type === 'completed') {
|
|
||||||
nextCommands.push(
|
|
||||||
...(await finalizeAutonomyRunCompleted({
|
|
||||||
runId: autonomy.runId,
|
|
||||||
rootDir: autonomy.rootDir,
|
|
||||||
currentDir: params.currentDir,
|
|
||||||
priority: params.priority,
|
|
||||||
workload: command.workload ?? params.workload,
|
|
||||||
})),
|
|
||||||
)
|
|
||||||
} else if (params.outcome.type === 'cancelled') {
|
|
||||||
await markAutonomyRunCancelled(autonomy.runId, autonomy.rootDir)
|
|
||||||
} else {
|
|
||||||
await finalizeAutonomyRunFailed({
|
|
||||||
runId: autonomy.runId,
|
|
||||||
rootDir: autonomy.rootDir,
|
|
||||||
error:
|
|
||||||
params.outcome.message ??
|
|
||||||
sanitizeAutonomyFailureForPersistence(params.outcome.error),
|
|
||||||
})
|
|
||||||
}
|
|
||||||
}
|
|
||||||
return nextCommands
|
|
||||||
}
|
|
||||||
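The failure path above persists either `outcome.message` verbatim or, when that is absent, the output of `sanitizeAutonomyFailureForPersistence`. A minimal sketch of what that sanitizer lets through, assuming a condensed restatement of the function (the names `PROVIDER_KEYWORDS` and `sanitizeFailureSketch` are illustrative and do not appear in the diff):

```typescript
// Condensed restatement of the sanitizer in this diff, for illustration only.
const PROVIDER_KEYWORDS = [
  'api_error',
  'provider',
  'openai',
  'gemini',
  'grok',
  'anthropic',
  'bedrock',
  'vertex',
]

function sanitizeFailureSketch(
  error: unknown,
  fallback = 'query failed',
): string {
  const message =
    typeof error === 'string'
      ? error
      : error instanceof Error
        ? error.message
        : String(error)
  const lower = message.toLowerCase()
  // Any provider-looking text collapses to one fixed label.
  if (PROVIDER_KEYWORDS.some(keyword => lower.includes(keyword))) {
    return 'provider api_error'
  }
  // Everything else collapses to the fallback, so the raw message text
  // is never returned.
  return fallback
}

const sanitizedProvider = sanitizeFailureSketch(
  new Error('OpenAI: 429 rate limit'),
)
const sanitizedOther = sanitizeFailureSketch(
  'ENOENT: no such file /home/me/secret.txt',
)
const sanitizedObject = sanitizeFailureSketch({ code: 500 })
```

Only two outcomes are possible: the fixed provider label or the fallback. Paths, tokens, or other details in the original error text cannot reach `runs.json` through this branch.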
@@ -1,7 +1,7 @@
|
|||||||
import { randomUUID } from 'crypto'
|
import { randomUUID } from 'crypto'
|
||||||
import { mkdir, writeFile } from 'fs/promises'
|
import { mkdir, writeFile } from 'fs/promises'
|
||||||
import { dirname, join, resolve } from 'path'
|
import { dirname, join, resolve } from 'path'
|
||||||
import { getProjectRoot, getSessionId } from '../bootstrap/state.js'
|
import { getProjectRoot } from '../bootstrap/state.js'
|
||||||
import type { MessageOrigin } from '../types/message.js'
|
import type { MessageOrigin } from '../types/message.js'
|
||||||
import type { QueuedCommand } from '../types/textInputTypes.js'
|
import type { QueuedCommand } from '../types/textInputTypes.js'
|
||||||
import {
|
import {
|
||||||
@@ -27,34 +27,11 @@ import {
|
|||||||
type AutonomyFlowSyncMode,
|
type AutonomyFlowSyncMode,
|
||||||
type ManagedAutonomyFlowStepDefinition,
|
type ManagedAutonomyFlowStepDefinition,
|
||||||
} from './autonomyFlows.js'
|
} from './autonomyFlows.js'
|
||||||
import {
|
import { withAutonomyPersistenceLock } from './autonomyPersistence.js'
|
||||||
retainActiveFirst,
|
|
||||||
withAutonomyPersistenceLock,
|
|
||||||
} from './autonomyPersistence.js'
|
|
||||||
import { getFsImplementation } from './fsOperations.js'
|
import { getFsImplementation } from './fsOperations.js'
|
||||||
import { isProcessRunning } from './genericProcessUtils.js'
|
|
||||||
import { logError } from './log.js'
|
|
||||||
|
|
||||||
const AUTONOMY_RUNS_MAX = 200
|
const AUTONOMY_RUNS_MAX = 200
|
||||||
// Diagnostic threshold for active (queued/running) runs. Active records are
|
|
||||||
// deliberately exempt from AUTONOMY_RUNS_MAX so a leak in finalization cannot
|
|
||||||
// silently evict in-flight work; that exemption only makes sense if a leak is
|
|
||||||
// loud when it appears. Crossing this threshold warns once per process so
|
|
||||||
// operators see the divergence in logs before runs.json grows pathologically.
|
|
||||||
const AUTONOMY_ACTIVE_RUNS_WARN_THRESHOLD = 100
|
|
||||||
let warnedActiveRunsThresholdCrossed = false
|
|
||||||
const AUTONOMY_RUNS_RELATIVE_PATH = join(AUTONOMY_DIR, 'runs.json')
|
const AUTONOMY_RUNS_RELATIVE_PATH = join(AUTONOMY_DIR, 'runs.json')
|
||||||
// Sentinel string surfaced to operators via runs.json error fields and
|
|
||||||
// referenced literally by the HEARTBEAT.md `stale-recovery-health` task.
|
|
||||||
// A unit test asserts the HEARTBEAT.md file contains this exact prefix —
|
|
||||||
// changing the value will fail the test, forcing the heartbeat prompt
|
|
||||||
// to be updated in the same change.
|
|
||||||
export const STALE_ACTIVE_RUN_ERROR_PREFIX =
|
|
||||||
'Recovered stale active autonomy run'
|
|
||||||
|
|
||||||
// Guards the legacy-block warning so it fires once per (process, runId) instead
|
|
||||||
// of every dedup tick while a no-owner record sits there.
|
|
||||||
const warnedLegacyBlockRunIds = new Set<string>()
|
|
||||||
|
|
||||||
export type AutonomyRunStatus =
|
export type AutonomyRunStatus =
|
||||||
| 'queued'
|
| 'queued'
|
||||||
@@ -82,8 +59,6 @@ export type AutonomyRunRecord = {
|
|||||||
flowStepName?: string
|
flowStepName?: string
|
||||||
promptPreview: string
|
promptPreview: string
|
||||||
createdAt: number
|
createdAt: number
|
||||||
ownerProcessId?: number
|
|
||||||
ownerSessionId?: string
|
|
||||||
startedAt?: number
|
startedAt?: number
|
||||||
endedAt?: number
|
endedAt?: number
|
||||||
error?: string
|
error?: string
|
||||||
@@ -102,19 +77,6 @@ type AutonomyRunFlowRef = {
|
|||||||
stepName: string
|
stepName: string
|
||||||
}
|
}
|
||||||
|
|
||||||
type CreateAutonomyRunParams = {
|
|
||||||
trigger: AutonomyTriggerKind
|
|
||||||
prompt: string
|
|
||||||
rootDir?: string
|
|
||||||
currentDir?: string
|
|
||||||
sourceId?: string
|
|
||||||
sourceLabel?: string
|
|
||||||
runtime?: AutonomyRunRuntime
|
|
||||||
ownerKey?: string
|
|
||||||
flow?: AutonomyRunFlowRef
|
|
||||||
nowMs?: number
|
|
||||||
}
|
|
||||||
|
|
||||||
function truncatePromptPreview(prompt: string): string {
|
function truncatePromptPreview(prompt: string): string {
|
||||||
const singleLine = prompt.replace(/\s+/g, ' ').trim()
|
const singleLine = prompt.replace(/\s+/g, ' ').trim()
|
||||||
return singleLine.length <= 240
|
return singleLine.length <= 240
|
||||||
@@ -133,34 +95,6 @@ function cloneRunRecord(run: AutonomyRunRecord): AutonomyRunRecord {
|
|||||||
return { ...run }
|
return { ...run }
|
||||||
}
|
}
|
||||||
|
|
||||||
function isAutonomyRunActive(run: AutonomyRunRecord): boolean {
|
|
||||||
return run.status === 'queued' || run.status === 'running'
|
|
||||||
}
|
|
||||||
|
|
||||||
function selectPersistedAutonomyRuns(
|
|
||||||
runs: AutonomyRunRecord[],
|
|
||||||
): AutonomyRunRecord[] {
|
|
||||||
const cloned = runs.map(cloneRunRecord)
|
|
||||||
const activeCount = cloned.filter(isAutonomyRunActive).length
|
|
||||||
if (
|
|
||||||
!warnedActiveRunsThresholdCrossed &&
|
|
||||||
activeCount >= AUTONOMY_ACTIVE_RUNS_WARN_THRESHOLD
|
|
||||||
) {
|
|
||||||
warnedActiveRunsThresholdCrossed = true
|
|
||||||
logError(
|
|
||||||
new Error(
|
|
||||||
`autonomy: ${activeCount} active runs exceed warn threshold ${AUTONOMY_ACTIVE_RUNS_WARN_THRESHOLD}; check for finalize leaks`,
|
|
||||||
),
|
|
||||||
)
|
|
||||||
}
|
|
||||||
return retainActiveFirst(
|
|
||||||
cloned,
|
|
||||||
isAutonomyRunActive,
|
|
||||||
run => run.createdAt,
|
|
||||||
AUTONOMY_RUNS_MAX,
|
|
||||||
)
|
|
||||||
}
|
|
||||||
|
|
||||||
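`selectPersistedAutonomyRuns` delegates the actual trimming to `retainActiveFirst`, whose implementation lives in `autonomyPersistence.js` and is not shown in this diff. The sketch below is a hypothetical stand-in that only mirrors the call shape used above (items, activity predicate, timestamp getter, cap). It assumes the behaviour the comments describe: active records are exempt from the cap, and finished records fill whatever budget is left.

```typescript
// Hypothetical stand-in for retainActiveFirst; the real helper is not part
// of this diff and may differ in ordering or tie-breaking.
function retainActiveFirstSketch<T>(
  items: T[],
  isActive: (item: T) => boolean,
  getCreatedAt: (item: T) => number,
  max: number,
): T[] {
  const newestFirst = (a: T, b: T) => getCreatedAt(b) - getCreatedAt(a)
  const active = items.filter(isActive)
  const inactive = items.filter(item => !isActive(item)).sort(newestFirst)
  // Active records are always kept; finished ones only fill what is left.
  const budget = Math.max(0, max - active.length)
  return [...active, ...inactive.slice(0, budget)].sort(newestFirst)
}

type RetentionRun = {
  runId: string
  runStatus: 'queued' | 'running' | 'completed'
  createdAt: number
}

const retentionInput: RetentionRun[] = [
  { runId: 'old-running', runStatus: 'running', createdAt: 1 },
  { runId: 'done-a', runStatus: 'completed', createdAt: 2 },
  { runId: 'done-b', runStatus: 'completed', createdAt: 3 },
]

const retained = retainActiveFirstSketch(
  retentionInput,
  run => run.runStatus !== 'completed',
  run => run.createdAt,
  2,
).map(run => run.runId)
```

With a cap of 2, a plain "sort newest first, then slice" (the other side of this diff) would keep `done-b` and `done-a` and drop the run that is still in flight. The active-first variant keeps `old-running` and evicts the older finished record instead.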
function normalizePersistedRunRecord(
|
function normalizePersistedRunRecord(
|
||||||
run: PersistedAutonomyRunRecord,
|
run: PersistedAutonomyRunRecord,
|
||||||
): AutonomyRunRecord {
|
): AutonomyRunRecord {
|
||||||
@@ -223,7 +157,11 @@ async function writeAutonomyRuns(
|
|||||||
path,
|
path,
|
||||||
`${JSON.stringify(
|
`${JSON.stringify(
|
||||||
{
|
{
|
||||||
runs: selectPersistedAutonomyRuns(runs),
|
runs: runs
|
||||||
|
.slice()
|
||||||
|
.map(cloneRunRecord)
|
||||||
|
.sort((left, right) => right.createdAt - left.createdAt)
|
||||||
|
.slice(0, AUTONOMY_RUNS_MAX),
|
||||||
} satisfies AutonomyRunsFile,
|
} satisfies AutonomyRunsFile,
|
||||||
null,
|
null,
|
||||||
2,
|
2,
|
||||||
@@ -234,7 +172,7 @@ async function writeAutonomyRuns(
|
|||||||
|
|
||||||
async function updateAutonomyRun(
|
async function updateAutonomyRun(
|
||||||
runId: string,
|
runId: string,
|
||||||
updater: (current: AutonomyRunRecord) => AutonomyRunRecord | null,
|
updater: (current: AutonomyRunRecord) => AutonomyRunRecord,
|
||||||
rootDir: string = getProjectRoot(),
|
rootDir: string = getProjectRoot(),
|
||||||
): Promise<AutonomyRunRecord | null> {
|
): Promise<AutonomyRunRecord | null> {
|
||||||
return withAutonomyPersistenceLock(rootDir, async () => {
|
return withAutonomyPersistenceLock(rootDir, async () => {
|
||||||
@@ -243,11 +181,7 @@ async function updateAutonomyRun(
|
|||||||
if (index === -1) {
|
if (index === -1) {
|
||||||
return null
|
return null
|
||||||
}
|
}
|
||||||
const next = updater(cloneRunRecord(runs[index]!))
|
const updated = cloneRunRecord(updater(cloneRunRecord(runs[index]!)))
|
||||||
if (!next) {
|
|
||||||
return null
|
|
||||||
}
|
|
||||||
const updated = cloneRunRecord(next)
|
|
||||||
runs[index] = updated
|
runs[index] = updated
|
||||||
await writeAutonomyRuns(runs, rootDir)
|
await writeAutonomyRuns(runs, rootDir)
|
||||||
return updated
|
return updated
|
||||||
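One side of this hunk lets the updater return `null` to reject a transition, which is what makes the claim loop's `updated?.status === 'running'` check meaningful. A minimal in-memory sketch of that pattern, assuming no lock and no file I/O (`applyGuardedUpdate`, `claimRun`, and the `GuardedRun` type are illustrative names, not part of the codebase):

```typescript
type GuardedStatus = 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'
type GuardedRun = { runId: string; runStatus: GuardedStatus; startedAt?: number }

// In-memory stand-in for updateAutonomyRun: no persistence lock, no disk.
function applyGuardedUpdate(
  runs: GuardedRun[],
  runId: string,
  updater: (current: GuardedRun) => GuardedRun | null,
): GuardedRun | null {
  const index = runs.findIndex(run => run.runId === runId)
  if (index === -1) {
    return null
  }
  const next = updater({ ...runs[index]! })
  if (!next) {
    // Transition rejected: leave the stored record untouched.
    return null
  }
  const updated = { ...next }
  runs[index] = updated
  return updated
}

// Only a queued run may be claimed; anything else is reported as "not mine".
function claimRun(
  runs: GuardedRun[],
  runId: string,
  nowMs: number,
): GuardedRun | null {
  return applyGuardedUpdate(runs, runId, current =>
    current.runStatus === 'queued'
      ? { ...current, runStatus: 'running', startedAt: nowMs }
      : null,
  )
}

const guardedStore: GuardedRun[] = [{ runId: 'r1', runStatus: 'queued' }]
const firstClaim = claimRun(guardedStore, 'r1', 1000)
const secondClaim = claimRun(guardedStore, 'r1', 2000)
```

The second claim returns `null` and leaves `startedAt` at its first value. With the unconditional updater on the other side of the hunk, the second call would overwrite `startedAt` and report success, so a caller could not tell a fresh claim from a duplicate.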
@@ -262,112 +196,21 @@ export async function getAutonomyRunById(
|
|||||||
return runs.find(run => run.runId === runId) ?? null
|
return runs.find(run => run.runId === runId) ?? null
|
||||||
}
|
}
|
||||||
|
|
||||||
function isActiveAutonomyRunStatus(status: AutonomyRunStatus): boolean {
|
export async function createAutonomyRun(params: {
|
||||||
return status === 'queued' || status === 'running'
|
|
||||||
}
|
|
||||||
|
|
||||||
function isValidOwnerProcessId(pid: number | undefined): pid is number {
|
|
||||||
// Reject non-numeric, negative, zero (Linux: send-to-process-group), and
|
|
||||||
// non-integer values. A forged record with pid=0 or pid<0 used to be
|
|
||||||
// treated as live and could permanently block dedup; treating them as
|
|
||||||
// stale closes that availability hole.
|
|
||||||
return (
|
|
||||||
typeof pid === 'number' &&
|
|
||||||
Number.isInteger(pid) &&
|
|
||||||
pid > 0 &&
|
|
||||||
pid <= 4_194_304
|
|
||||||
)
|
|
||||||
}
|
|
||||||
|
|
||||||
function isStaleActiveAutonomyRun(run: AutonomyRunRecord): boolean {
|
|
||||||
if (!isActiveAutonomyRunStatus(run.status)) {
|
|
||||||
return false
|
|
||||||
}
|
|
||||||
if (run.ownerProcessId === undefined) {
|
|
||||||
return false
|
|
||||||
}
|
|
||||||
if (!isValidOwnerProcessId(run.ownerProcessId)) {
|
|
||||||
return true
|
|
||||||
}
|
|
||||||
return !isProcessRunning(run.ownerProcessId)
|
|
||||||
}
|
|
||||||
|
|
||||||
function staleActiveRunError(run: AutonomyRunRecord): string {
|
|
||||||
return `${STALE_ACTIVE_RUN_ERROR_PREFIX}: owner process ${run.ownerProcessId} is no longer running.`
|
|
||||||
}
|
|
||||||
|
|
||||||
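The staleness rules above can be exercised without touching real processes by injecting the liveness probe. The sketch below assumes a hypothetical `isProcessRunning` callback in place of the helper imported from `genericProcessUtils.js`, whose implementation is not shown in this diff. The upper bound 4194304 is 2^22, which matches the largest `pid_max` Linux allows on 64-bit systems; that rationale is an inference, not something the diff states.

```typescript
type OwnedRun = {
  runId: string
  runStatus: 'queued' | 'running' | 'completed' | 'failed' | 'cancelled'
  ownerProcessId?: number
}

function isPlausiblePid(pid: number | undefined): pid is number {
  return (
    typeof pid === 'number' &&
    Number.isInteger(pid) &&
    pid > 0 &&
    pid <= 4194304
  )
}

// isProcessRunning is injected here (an assumption for testability); the
// code in the diff imports a concrete helper instead.
function isStaleSketch(
  run: OwnedRun,
  isProcessRunning: (pid: number) => boolean,
): boolean {
  if (run.runStatus !== 'queued' && run.runStatus !== 'running') {
    return false // finished runs are never stale
  }
  if (run.ownerProcessId === undefined) {
    return false // legacy record without an owner: never auto-reaped
  }
  if (!isPlausiblePid(run.ownerProcessId)) {
    return true // forged or corrupt owner: treat as dead
  }
  return !isProcessRunning(run.ownerProcessId)
}

const livePids = new Set<number>([4242])
const probeLiveness = (pid: number) => livePids.has(pid)

const staleChecks = {
  liveOwner: isStaleSketch({ runId: 'a', runStatus: 'running', ownerProcessId: 4242 }, probeLiveness),
  deadOwner: isStaleSketch({ runId: 'b', runStatus: 'running', ownerProcessId: 9999 }, probeLiveness),
  zeroPid: isStaleSketch({ runId: 'c', runStatus: 'queued', ownerProcessId: 0 }, probeLiveness),
  noOwner: isStaleSketch({ runId: 'd', runStatus: 'queued' }, probeLiveness),
  finished: isStaleSketch({ runId: 'e', runStatus: 'completed', ownerProcessId: 9999 }, probeLiveness),
}
```

Note the asymmetry: a missing owner is treated as "cannot tell, keep blocking", while an implausible owner is treated as stale. That is why the diff pairs the un-owned case with a one-time warning rather than silent recovery.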
function failAutonomyRunRecord(
|
|
||||||
run: AutonomyRunRecord,
|
|
||||||
error: string,
|
|
||||||
nowMs: number,
|
|
||||||
): AutonomyRunRecord {
|
|
||||||
return {
|
|
||||||
...run,
|
|
||||||
status: 'failed',
|
|
||||||
endedAt: nowMs,
|
|
||||||
error,
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
function recoverStaleActiveAutonomyRun(
|
|
||||||
run: AutonomyRunRecord,
|
|
||||||
nowMs: number,
|
|
||||||
): AutonomyRunRecord {
|
|
||||||
return failAutonomyRunRecord(run, staleActiveRunError(run), nowMs)
|
|
||||||
}
|
|
||||||
|
|
||||||
async function syncFailedManagedFlowForRun(
|
|
||||||
run: AutonomyRunRecord,
|
|
||||||
rootDir: string,
|
|
||||||
): Promise<void> {
|
|
||||||
if (run.parentFlowId && run.parentFlowSyncMode === 'managed') {
|
|
||||||
await markManagedAutonomyFlowStepFailed({
|
|
||||||
flowId: run.parentFlowId,
|
|
||||||
runId: run.runId,
|
|
||||||
error: run.error ?? 'Autonomy run failed.',
|
|
||||||
rootDir,
|
|
||||||
nowMs: run.endedAt,
|
|
||||||
})
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
function matchesActiveAutonomyRunSource(
|
|
||||||
run: AutonomyRunRecord,
|
|
||||||
params: {
|
|
||||||
trigger: AutonomyTriggerKind
|
|
||||||
sourceId: string
|
|
||||||
ownerKey?: string
|
|
||||||
},
|
|
||||||
): boolean {
|
|
||||||
return (
|
|
||||||
run.trigger === params.trigger &&
|
|
||||||
run.sourceId === params.sourceId &&
|
|
||||||
(params.ownerKey === undefined || run.ownerKey === params.ownerKey) &&
|
|
||||||
isActiveAutonomyRunStatus(run.status)
|
|
||||||
)
|
|
||||||
}
|
|
||||||
|
|
||||||
export async function hasActiveAutonomyRunForSource(params: {
|
|
||||||
trigger: AutonomyTriggerKind
|
trigger: AutonomyTriggerKind
|
||||||
sourceId: string
|
prompt: string
|
||||||
rootDir?: string
|
rootDir?: string
|
||||||
|
currentDir?: string
|
||||||
|
sourceId?: string
|
||||||
|
sourceLabel?: string
|
||||||
|
runtime?: AutonomyRunRuntime
|
||||||
ownerKey?: string
|
ownerKey?: string
|
||||||
}): Promise<boolean> {
|
flow?: AutonomyRunFlowRef
|
||||||
const runs = await listAutonomyRuns(params.rootDir)
|
nowMs?: number
|
||||||
return runs.some(
|
}): Promise<AutonomyRunRecord> {
|
||||||
run =>
|
const rootDir = resolve(params.rootDir ?? getProjectRoot())
|
||||||
matchesActiveAutonomyRunSource(run, params) &&
|
const currentDir = resolve(params.currentDir ?? rootDir)
|
||||||
!isStaleActiveAutonomyRun(run),
|
const record: AutonomyRunRecord = {
|
||||||
)
|
|
||||||
}
|
|
||||||
|
|
||||||
function buildAutonomyRunRecord(
|
|
||||||
params: CreateAutonomyRunParams,
|
|
||||||
rootDir: string,
|
|
||||||
currentDir: string,
|
|
||||||
): AutonomyRunRecord {
|
|
||||||
const createdAt = params.nowMs ?? Date.now()
|
|
||||||
return {
|
|
||||||
runId: randomUUID(),
|
runId: randomUUID(),
|
||||||
runtime: params.runtime ?? (params.flow ? 'flow_step' : 'automatic'),
|
runtime: params.runtime ?? (params.flow ? 'flow_step' : 'automatic'),
|
||||||
trigger: params.trigger,
|
trigger: params.trigger,
|
||||||
@@ -388,77 +231,13 @@ function buildAutonomyRunRecord(
|
|||||||
}
|
}
|
||||||
: {}),
|
: {}),
|
||||||
promptPreview: truncatePromptPreview(params.prompt),
|
promptPreview: truncatePromptPreview(params.prompt),
|
||||||
createdAt,
|
createdAt: params.nowMs ?? Date.now(),
|
||||||
ownerProcessId: process.pid,
|
|
||||||
ownerSessionId: getSessionId(),
|
|
||||||
}
|
}
|
||||||
}
|
|
||||||
|
|
||||||
async function persistAutonomyRunRecord(
|
|
||||||
record: AutonomyRunRecord,
|
|
||||||
rootDir: string,
|
|
||||||
skipWhenActiveSource: boolean,
|
|
||||||
): Promise<{
|
|
||||||
created: boolean
|
|
||||||
recoveredStaleRuns: AutonomyRunRecord[]
|
|
||||||
}> {
|
|
||||||
let created = false
|
|
||||||
const recoveredStaleRuns: AutonomyRunRecord[] = []
|
|
||||||
await withAutonomyPersistenceLock(rootDir, async () => {
|
await withAutonomyPersistenceLock(rootDir, async () => {
|
||||||
const runs = await listAutonomyRuns(rootDir)
|
const runs = await listAutonomyRuns(rootDir)
|
||||||
const sourceId = record.sourceId
|
|
||||||
if (skipWhenActiveSource && sourceId) {
|
|
||||||
let hasBlockingActiveRun = false
|
|
||||||
let staleRecoveriesApplied = false
|
|
||||||
for (let i = 0; i < runs.length; i++) {
|
|
||||||
const run = runs[i]!
|
|
||||||
if (
|
|
||||||
!matchesActiveAutonomyRunSource(run, {
|
|
||||||
trigger: record.trigger,
|
|
||||||
sourceId,
|
|
||||||
ownerKey: record.ownerKey,
|
|
||||||
})
|
|
||||||
) {
|
|
||||||
continue
|
|
||||||
}
|
|
||||||
if (isStaleActiveAutonomyRun(run)) {
|
|
||||||
const recovered = recoverStaleActiveAutonomyRun(run, record.createdAt)
|
|
||||||
runs[i] = recovered
|
|
||||||
recoveredStaleRuns.push(recovered)
|
|
||||||
staleRecoveriesApplied = true
|
|
||||||
continue
|
|
||||||
}
|
|
||||||
if (
|
|
||||||
run.ownerProcessId === undefined &&
|
|
||||||
!warnedLegacyBlockRunIds.has(run.runId)
|
|
||||||
) {
|
|
||||||
warnedLegacyBlockRunIds.add(run.runId)
|
|
||||||
logError(
|
|
||||||
new Error(
|
|
||||||
`[autonomyRuns] blocked by legacy un-owned active run ${run.runId} (createdAt=${run.createdAt}); cancel manually if this is a stale upgrade artifact`,
|
|
||||||
),
|
|
||||||
)
|
|
||||||
}
|
|
||||||
hasBlockingActiveRun = true
|
|
||||||
}
|
|
||||||
if (hasBlockingActiveRun) {
|
|
||||||
if (staleRecoveriesApplied) {
|
|
||||||
await writeAutonomyRuns(runs, rootDir)
|
|
||||||
}
|
|
||||||
return
|
|
||||||
}
|
|
||||||
}
|
|
||||||
runs.unshift(record)
|
runs.unshift(record)
|
||||||
await writeAutonomyRuns(runs, rootDir)
|
await writeAutonomyRuns(runs, rootDir)
|
||||||
created = true
|
|
||||||
})
|
})
|
||||||
return { created, recoveredStaleRuns }
|
|
||||||
}
|
|
||||||
|
|
||||||
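The dedup branch of `persistAutonomyRunRecord` does two things in one pass over `runs`: it fails stale records for the same source in place, and it skips the new fire if any live (or un-owned) active record remains. A simplified in-memory sketch of that pass follows. It omits the persistence lock, the `ownerKey` match, the pid plausibility check, and the legacy warning; `tryEnqueueSketch` and `SourceRun` are illustrative names.

```typescript
type SourceRun = {
  runId: string
  trigger: string
  sourceId: string
  runStatus: 'queued' | 'running' | 'failed'
  ownerProcessId?: number
  error?: string
}

// Returns true when the candidate was stored, false when the fire was skipped.
function tryEnqueueSketch(
  runs: SourceRun[],
  candidate: SourceRun,
  isOwnerAlive: (pid: number) => boolean,
): boolean {
  let blocked = false
  for (let i = 0; i < runs.length; i++) {
    const run = runs[i]!
    const sameSource =
      run.trigger === candidate.trigger && run.sourceId === candidate.sourceId
    if (!sameSource || run.runStatus === 'failed') {
      continue
    }
    if (run.ownerProcessId !== undefined && !isOwnerAlive(run.ownerProcessId)) {
      // Dead owner: fail the old record in place and keep scanning.
      runs[i] = {
        ...run,
        runStatus: 'failed',
        error: 'owner process is no longer running',
      }
      continue
    }
    blocked = true // live owner, or a legacy record without an owner
  }
  if (blocked) {
    return false
  }
  runs.unshift(candidate)
  return true
}

const sourceStore: SourceRun[] = [
  { runId: 'dead', trigger: 'cron', sourceId: 'nightly', runStatus: 'running', ownerProcessId: 111 },
]
const ownerAlive = (pid: number) => pid === 222

const firstFire = tryEnqueueSketch(
  sourceStore,
  { runId: 'a', trigger: 'cron', sourceId: 'nightly', runStatus: 'queued', ownerProcessId: 222 },
  ownerAlive,
)
const secondFire = tryEnqueueSketch(
  sourceStore,
  { runId: 'b', trigger: 'cron', sourceId: 'nightly', runStatus: 'queued', ownerProcessId: 222 },
  ownerAlive,
)
```

The first fire reaps the dead run and is stored; the second is skipped because run `a` is still queued under a live owner. The store therefore stays at two records however many further fires arrive, which is the bounded-growth property the dedup path is after.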
async function queueManagedFlowStepRunForRecord(
|
|
||||||
record: AutonomyRunRecord,
|
|
||||||
rootDir: string,
|
|
||||||
): Promise<void> {
|
|
||||||
if (
|
if (
|
||||||
record.parentFlowId &&
|
record.parentFlowId &&
|
||||||
record.flowStepId &&
|
record.flowStepId &&
|
||||||
@@ -479,47 +258,9 @@ async function queueManagedFlowStepRunForRecord(
|
|||||||
nowMs: record.createdAt,
|
nowMs: record.createdAt,
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
}
|
|
||||||
|
|
||||||
async function createAutonomyRunCore(
|
|
||||||
params: CreateAutonomyRunParams,
|
|
||||||
skipIfActiveSource: boolean,
|
|
||||||
): Promise<AutonomyRunRecord | null> {
|
|
||||||
const rootDir = resolve(params.rootDir ?? getProjectRoot())
|
|
||||||
const currentDir = resolve(params.currentDir ?? rootDir)
|
|
||||||
const record = buildAutonomyRunRecord(params, rootDir, currentDir)
|
|
||||||
|
|
||||||
const { created, recoveredStaleRuns } = await persistAutonomyRunRecord(
|
|
||||||
record,
|
|
||||||
rootDir,
|
|
||||||
skipIfActiveSource,
|
|
||||||
)
|
|
||||||
for (const recovered of recoveredStaleRuns) {
|
|
||||||
await syncFailedManagedFlowForRun(recovered, rootDir)
|
|
||||||
}
|
|
||||||
if (!created) {
|
|
||||||
return null
|
|
||||||
}
|
|
||||||
await queueManagedFlowStepRunForRecord(record, rootDir)
|
|
||||||
return record
|
return record
|
||||||
}
|
}
|
||||||
|
|
||||||
export async function createAutonomyRun(
|
|
||||||
params: CreateAutonomyRunParams,
|
|
||||||
): Promise<AutonomyRunRecord> {
|
|
||||||
const record = await createAutonomyRunCore(params, false)
|
|
||||||
if (!record) {
|
|
||||||
throw new Error('Autonomy run was unexpectedly skipped.')
|
|
||||||
}
|
|
||||||
return record
|
|
||||||
}
|
|
||||||
|
|
||||||
export async function createAutonomyRunIfNoActiveSource(
|
|
||||||
params: CreateAutonomyRunParams & { sourceId: string },
|
|
||||||
): Promise<AutonomyRunRecord | null> {
|
|
||||||
return createAutonomyRunCore(params, true)
|
|
||||||
}
|
|
||||||
|
|
||||||
function buildManagedFlowStepPrompt(
|
function buildManagedFlowStepPrompt(
|
||||||
flow: AutonomyFlowRecord,
|
flow: AutonomyFlowRecord,
|
||||||
stepIndex: number,
|
stepIndex: number,
|
||||||
@@ -595,7 +336,6 @@ async function createOrRecoverManagedFlowStepCommand(params: {
|
|||||||
workload: params.workload,
|
workload: params.workload,
|
||||||
autonomy: {
|
autonomy: {
|
||||||
runId: run.runId,
|
runId: run.runId,
|
||||||
rootDir: run.rootDir,
|
|
||||||
trigger: 'managed-flow-step',
|
trigger: 'managed-flow-step',
|
||||||
sourceId: run.sourceId,
|
sourceId: run.sourceId,
|
||||||
sourceLabel: run.sourceLabel,
|
sourceLabel: run.sourceLabel,
|
||||||
@@ -686,16 +426,11 @@ export async function markAutonomyRunRunning(
|
|||||||
): Promise<AutonomyRunRecord | null> {
|
): Promise<AutonomyRunRecord | null> {
|
||||||
const updated = await updateAutonomyRun(
|
const updated = await updateAutonomyRun(
|
||||||
runId,
|
runId,
|
||||||
current =>
|
current => ({
|
||||||
current.status === 'queued'
|
...current,
|
||||||
? {
|
status: 'running',
|
||||||
...current,
|
startedAt: nowMs ?? Date.now(),
|
||||||
status: 'running',
|
}),
|
||||||
startedAt: nowMs ?? Date.now(),
|
|
||||||
ownerProcessId: process.pid,
|
|
||||||
ownerSessionId: getSessionId(),
|
|
||||||
}
|
|
||||||
: null,
|
|
||||||
rootDir,
|
rootDir,
|
||||||
)
|
)
|
||||||
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
|
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
|
||||||
@@ -716,15 +451,12 @@ export async function markAutonomyRunCompleted(
|
|||||||
): Promise<AutonomyRunRecord | null> {
|
): Promise<AutonomyRunRecord | null> {
|
||||||
const updated = await updateAutonomyRun(
|
const updated = await updateAutonomyRun(
|
||||||
runId,
|
runId,
|
||||||
current =>
|
current => ({
|
||||||
current.status === 'queued' || current.status === 'running'
|
...current,
|
||||||
? {
|
status: 'completed',
|
||||||
...current,
|
endedAt: nowMs ?? Date.now(),
|
||||||
status: 'completed',
|
error: undefined,
|
||||||
endedAt: nowMs ?? Date.now(),
|
}),
|
||||||
error: undefined,
|
|
||||||
}
|
|
||||||
: null,
|
|
||||||
rootDir,
|
rootDir,
|
||||||
)
|
)
|
||||||
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
|
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
|
||||||
@@ -744,17 +476,24 @@ export async function markAutonomyRunFailed(
|
|||||||
rootDir?: string,
|
rootDir?: string,
|
||||||
nowMs?: number,
|
nowMs?: number,
|
||||||
): Promise<AutonomyRunRecord | null> {
|
): Promise<AutonomyRunRecord | null> {
|
||||||
const endedAt = nowMs ?? Date.now()
|
|
||||||
const updated = await updateAutonomyRun(
|
const updated = await updateAutonomyRun(
|
||||||
runId,
|
runId,
|
||||||
current =>
|
current => ({
|
||||||
isActiveAutonomyRunStatus(current.status)
|
...current,
|
||||||
? failAutonomyRunRecord(current, error, endedAt)
|
status: 'failed',
|
||||||
: null,
|
endedAt: nowMs ?? Date.now(),
|
||||||
|
error,
|
||||||
|
}),
|
||||||
rootDir,
|
rootDir,
|
||||||
)
|
)
|
||||||
if (updated) {
|
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
|
||||||
await syncFailedManagedFlowForRun(updated, rootDir ?? updated.rootDir)
|
await markManagedAutonomyFlowStepFailed({
|
||||||
|
flowId: updated.parentFlowId,
|
||||||
|
runId: updated.runId,
|
||||||
|
error,
|
||||||
|
rootDir,
|
||||||
|
nowMs: updated.endedAt,
|
||||||
|
})
|
||||||
}
|
}
|
||||||
return updated
|
return updated
|
||||||
}
|
}
|
||||||
@@ -766,15 +505,12 @@ export async function markAutonomyRunCancelled(
|
|||||||
): Promise<AutonomyRunRecord | null> {
|
): Promise<AutonomyRunRecord | null> {
|
||||||
const updated = await updateAutonomyRun(
|
const updated = await updateAutonomyRun(
|
||||||
runId,
|
runId,
|
||||||
current =>
|
current => ({
|
||||||
current.status === 'queued' || current.status === 'running'
|
...current,
|
||||||
? {
|
status: 'cancelled',
|
||||||
...current,
|
endedAt: nowMs ?? Date.now(),
|
||||||
status: 'cancelled',
|
error: undefined,
|
||||||
endedAt: nowMs ?? Date.now(),
|
}),
|
||||||
error: undefined,
|
|
||||||
}
|
|
||||||
: null,
|
|
||||||
rootDir,
|
rootDir,
|
||||||
)
|
)
|
||||||
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
|
if (updated?.parentFlowId && updated.parentFlowSyncMode === 'managed') {
|
||||||
@@ -876,7 +612,6 @@ export async function createAutonomyQueuedPrompt(params: {
|
|||||||
currentDir?: string
|
currentDir?: string
|
||||||
sourceId?: string
|
sourceId?: string
|
||||||
sourceLabel?: string
|
sourceLabel?: string
|
||||||
ownerKey?: string
|
|
||||||
workload?: string
|
workload?: string
|
||||||
priority?: 'now' | 'next' | 'later'
|
priority?: 'now' | 'next' | 'later'
|
||||||
shouldCreate?: () => boolean
|
shouldCreate?: () => boolean
|
||||||
@@ -899,130 +634,39 @@ export async function createAutonomyQueuedPrompt(params: {
|
|||||||
currentDir,
|
currentDir,
|
||||||
sourceId: params.sourceId,
|
sourceId: params.sourceId,
|
||||||
sourceLabel: params.sourceLabel,
|
sourceLabel: params.sourceLabel,
|
||||||
ownerKey: params.ownerKey,
|
|
||||||
workload: params.workload,
|
workload: params.workload,
|
||||||
priority: params.priority,
|
priority: params.priority,
|
||||||
flow: params.flow,
|
flow: params.flow,
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
|
|
||||||
export async function createAutonomyQueuedPromptIfNoActiveSource(params: {
|
|
||||||
trigger: AutonomyTriggerKind
|
|
||||||
basePrompt: string
|
|
||||||
rootDir?: string
|
|
||||||
currentDir?: string
|
|
||||||
sourceId: string
|
|
||||||
sourceLabel?: string
|
|
||||||
ownerKey?: string
|
|
||||||
workload?: string
|
|
||||||
priority?: 'now' | 'next' | 'later'
|
|
||||||
shouldCreate?: () => boolean
|
|
||||||
}): Promise<QueuedCommand | null> {
|
|
||||||
const rootDir = resolve(params.rootDir ?? getProjectRoot())
|
|
||||||
const currentDir = resolve(params.currentDir ?? getCwd())
|
|
||||||
// Cheap optimistic pre-check: skip the AGENTS.md / HEARTBEAT.md disk
|
|
||||||
// reads + prompt assembly when an active run for this source already
|
|
||||||
// blocks dedup. The lock-side check inside persistAutonomyRunRecord
|
|
||||||
// remains authoritative; this only fast-paths the common storm case.
|
|
||||||
if (
|
|
||||||
await hasActiveAutonomyRunForSource({
|
|
||||||
trigger: params.trigger,
|
|
||||||
sourceId: params.sourceId,
|
|
||||||
rootDir,
|
|
||||||
ownerKey: params.ownerKey,
|
|
||||||
})
|
|
||||||
) {
|
|
||||||
return null
|
|
||||||
}
|
|
||||||
const prepared = await prepareAutonomyTurnPrompt({
|
|
||||||
basePrompt: params.basePrompt,
|
|
||||||
trigger: params.trigger,
|
|
||||||
rootDir,
|
|
||||||
currentDir,
|
|
||||||
})
|
|
||||||
if (params.shouldCreate && !params.shouldCreate()) {
|
|
||||||
return null
|
|
||||||
}
|
|
||||||
return commitAutonomyQueuedPromptIfNoActiveSource({
|
|
||||||
prepared,
|
|
||||||
rootDir,
|
|
||||||
currentDir,
|
|
||||||
sourceId: params.sourceId,
|
|
||||||
sourceLabel: params.sourceLabel,
|
|
||||||
ownerKey: params.ownerKey,
|
|
||||||
workload: params.workload,
|
|
||||||
priority: params.priority,
|
|
||||||
})
|
|
||||||
}
|
|
||||||
|
|
||||||
export async function commitAutonomyQueuedPrompt(params: {
|
export async function commitAutonomyQueuedPrompt(params: {
|
||||||
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
|
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
|
||||||
rootDir?: string
|
rootDir?: string
|
||||||
currentDir?: string
|
currentDir?: string
|
||||||
sourceId?: string
|
sourceId?: string
|
||||||
sourceLabel?: string
|
sourceLabel?: string
|
||||||
ownerKey?: string
|
|
||||||
workload?: string
|
workload?: string
|
||||||
priority?: 'now' | 'next' | 'later'
|
priority?: 'now' | 'next' | 'later'
|
||||||
flow?: AutonomyRunFlowRef
|
flow?: AutonomyRunFlowRef
|
||||||
}): Promise<QueuedCommand> {
|
}): Promise<QueuedCommand> {
|
||||||
const command = await commitAutonomyQueuedPromptInternal(params, false)
|
|
||||||
if (!command) {
|
|
||||||
throw new Error('Autonomy queued prompt was unexpectedly skipped.')
|
|
||||||
}
|
|
||||||
return command
|
|
||||||
}
|
|
||||||
|
|
||||||
async function commitAutonomyQueuedPromptIfNoActiveSource(params: {
|
|
||||||
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
|
|
||||||
rootDir?: string
|
|
||||||
currentDir?: string
|
|
||||||
sourceId: string
|
|
||||||
sourceLabel?: string
|
|
||||||
ownerKey?: string
|
|
||||||
workload?: string
|
|
||||||
priority?: 'now' | 'next' | 'later'
|
|
||||||
}): Promise<QueuedCommand | null> {
|
|
||||||
return commitAutonomyQueuedPromptInternal(params, true)
|
|
||||||
}
|
|
||||||
|
|
||||||
async function commitAutonomyQueuedPromptInternal(
|
|
||||||
params: {
|
|
||||||
prepared: Awaited<ReturnType<typeof prepareAutonomyTurnPrompt>>
|
|
||||||
rootDir?: string
|
|
||||||
currentDir?: string
|
|
||||||
sourceId?: string
|
|
||||||
sourceLabel?: string
|
|
||||||
ownerKey?: string
|
|
||||||
workload?: string
|
|
||||||
priority?: 'now' | 'next' | 'later'
|
|
||||||
flow?: AutonomyRunFlowRef
|
|
||||||
},
|
|
||||||
skipWhenActiveSource: boolean,
|
|
||||||
): Promise<QueuedCommand | null> {
|
|
||||||
const rootDir = resolve(
|
const rootDir = resolve(
|
||||||
params.rootDir ?? params.prepared.rootDir ?? getProjectRoot(),
|
params.rootDir ?? params.prepared.rootDir ?? getProjectRoot(),
|
||||||
)
|
)
|
||||||
const currentDir = resolve(
|
const currentDir = resolve(
|
||||||
params.currentDir ?? params.prepared.currentDir ?? getCwd(),
|
params.currentDir ?? params.prepared.currentDir ?? getCwd(),
|
||||||
)
|
)
|
||||||
|
commitPreparedAutonomyTurn(params.prepared)
|
||||||
const value = params.prepared.prompt
|
const value = params.prepared.prompt
|
||||||
const runParams: CreateAutonomyRunParams = {
|
const run = await createAutonomyRun({
|
||||||
trigger: params.prepared.trigger,
|
trigger: params.prepared.trigger,
|
||||||
prompt: value,
|
prompt: value,
|
||||||
rootDir,
|
rootDir,
|
||||||
currentDir,
|
currentDir,
|
||||||
sourceId: params.sourceId,
|
sourceId: params.sourceId,
|
||||||
sourceLabel: params.sourceLabel,
|
sourceLabel: params.sourceLabel,
|
||||||
ownerKey: params.ownerKey,
|
|
||||||
flow: params.flow,
|
flow: params.flow,
|
||||||
}
|
})
|
||||||
const useDedup = skipWhenActiveSource && Boolean(params.sourceId)
|
|
||||||
const run = await createAutonomyRunCore(runParams, useDedup)
|
|
||||||
if (!run) {
|
|
||||||
return null
|
|
||||||
}
|
|
||||||
commitPreparedAutonomyTurn(params.prepared)
|
|
||||||
const origin = {
|
const origin = {
|
||||||
kind: 'autonomy',
|
kind: 'autonomy',
|
||||||
trigger: params.prepared.trigger,
|
trigger: params.prepared.trigger,
|
||||||
@@ -1039,7 +683,6 @@ async function commitAutonomyQueuedPromptInternal(
|
|||||||
workload: params.workload,
|
workload: params.workload,
|
||||||
autonomy: {
|
autonomy: {
|
||||||
runId: run.runId,
|
runId: run.runId,
|
||||||
rootDir: run.rootDir,
|
|
||||||
trigger: params.prepared.trigger,
|
trigger: params.prepared.trigger,
|
||||||
sourceId: params.sourceId,
|
sourceId: params.sourceId,
|
||||||
sourceLabel: params.sourceLabel,
|
sourceLabel: params.sourceLabel,
|
||||||
|
|||||||
@@ -19,20 +19,19 @@ import {
|
|||||||
} from '../types/textInputTypes.js'
|
} from '../types/textInputTypes.js'
|
||||||
import { createAbortController } from './abortController.js'
|
import { createAbortController } from './abortController.js'
|
||||||
import type { PastedContent } from './config.js'
|
import type { PastedContent } from './config.js'
|
||||||
import { getCwd } from './cwd.js'
|
|
||||||
import { logForDebugging } from './debug.js'
|
import { logForDebugging } from './debug.js'
|
||||||
import type { EffortValue } from './effort.js'
|
import type { EffortValue } from './effort.js'
|
||||||
import type { FileHistoryState } from './fileHistory.js'
|
import type { FileHistoryState } from './fileHistory.js'
|
||||||
import { fileHistoryEnabled, fileHistoryMakeSnapshot } from './fileHistory.js'
|
import { fileHistoryEnabled, fileHistoryMakeSnapshot } from './fileHistory.js'
|
||||||
import { gracefulShutdownSync } from './gracefulShutdown.js'
|
import { gracefulShutdownSync } from './gracefulShutdown.js'
|
||||||
import { toError } from './errors.js'
|
|
||||||
import { logError } from './log.js'
|
|
||||||
import { enqueue } from './messageQueueManager.js'
|
import { enqueue } from './messageQueueManager.js'
|
||||||
import { resolveSkillModelOverride } from './model/model.js'
|
import { resolveSkillModelOverride } from './model/model.js'
|
||||||
import {
|
import {
|
||||||
claimConsumableQueuedAutonomyCommands,
|
finalizeAutonomyRunCompleted,
|
||||||
finalizeAutonomyCommandsForTurn,
|
finalizeAutonomyRunFailed,
|
||||||
} from './autonomyQueueLifecycle.js'
|
markAutonomyRunFailed,
|
||||||
|
markAutonomyRunRunning,
|
||||||
|
} from './autonomyRuns.js'
|
||||||
import type { ProcessUserInputContext } from './processUserInput/processUserInput.js'
|
import type { ProcessUserInputContext } from './processUserInput/processUserInput.js'
|
||||||
import { processUserInput } from './processUserInput/processUserInput.js'
|
import { processUserInput } from './processUserInput/processUserInput.js'
|
||||||
import type { QueryGuard } from './QueryGuard.js'
|
import type { QueryGuard } from './QueryGuard.js'
|
||||||
@@ -76,7 +75,7 @@ type BaseExecutionParams = {
|
|||||||
onBeforeQuery?: (input: string, newMessages: Message[]) => Promise<boolean>,
|
onBeforeQuery?: (input: string, newMessages: Message[]) => Promise<boolean>,
|
||||||
input?: string,
|
input?: string,
|
||||||
effort?: EffortValue,
|
effort?: EffortValue,
|
||||||
) => Promise<boolean>
|
) => Promise<void>
|
||||||
setAppState: (updater: (prev: AppState) => AppState) => void
|
setAppState: (updater: (prev: AppState) => AppState) => void
|
||||||
onBeforeQuery?: (input: string, newMessages: Message[]) => Promise<boolean>
|
onBeforeQuery?: (input: string, newMessages: Message[]) => Promise<boolean>
|
||||||
canUseTool?: CanUseToolFn
|
canUseTool?: CanUseToolFn
|
||||||
@@ -460,18 +459,7 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
|||||||
// Iterate all commands uniformly. First command gets attachments +
|
// Iterate all commands uniformly. First command gets attachments +
|
||||||
// ideSelection + pastedContents, rest skip attachments to avoid
|
// ideSelection + pastedContents, rest skip attachments to avoid
|
||||||
// duplicating turn-level context (IDE selection, todos, diffs).
|
// duplicating turn-level context (IDE selection, todos, diffs).
|
||||||
let commands = queuedCommands ?? []
|
const commands = queuedCommands ?? []
|
||||||
const queuedAutonomyClaim =
|
|
||||||
await claimConsumableQueuedAutonomyCommands(commands)
|
|
||||||
commands = queuedAutonomyClaim.attachmentCommands
|
|
||||||
const claimedAutonomyCommands = queuedAutonomyClaim.claimedCommands
|
|
||||||
if (commands.length === 0) {
|
|
||||||
// Clear the abort controller published a few lines above so this turn's
|
|
||||||
// stale controller does not leak into the next turn when every claimed
|
|
||||||
// autonomy command was skipped as non-consumable.
|
|
||||||
setAbortController(null)
|
|
||||||
return
|
|
||||||
}
|
|
||||||
|
|
||||||
// Compute the workload tag for this turn. queueProcessor can batch a
|
// Compute the workload tag for this turn. queueProcessor can batch a
|
||||||
// cron prompt with a same-tick human prompt; only tag when EVERY
|
// cron prompt with a same-tick human prompt; only tag when EVERY
|
||||||
@@ -483,7 +471,7 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
|||||||
commands.every(c => c.workload === firstWorkload)
|
commands.every(c => c.workload === firstWorkload)
|
||||||
? firstWorkload
|
? firstWorkload
|
||||||
: undefined
|
: undefined
|
||||||
const deferredAutonomyRunIds = new Set<string>()
|
let autonomyRunIds: string[] | undefined
|
||||||
|
|
||||||
// Wrap the entire turn (processUserInput loop + onQuery) in an
|
// Wrap the entire turn (processUserInput loop + onQuery) in an
|
||||||
// AsyncLocalStorage context. This is the ONLY way to correctly
|
// AsyncLocalStorage context. This is the ONLY way to correctly
|
||||||
@@ -493,13 +481,15 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
|||||||
// context — isolated from the parent's continuation. A process-global
|
// context — isolated from the parent's continuation. A process-global
|
||||||
// mutable slot would be clobbered at the detached closure's first
|
// mutable slot would be clobbered at the detached closure's first
|
||||||
// await by this function's synchronous return path. See state.ts.
|
// await by this function's synchronous return path. See state.ts.
|
||||||
let turnError: unknown
|
|
||||||
try {
|
try {
|
||||||
await runWithWorkload(turnWorkload, async () => {
|
await runWithWorkload(turnWorkload, async () => {
|
||||||
for (let i = 0; i < commands.length; i++) {
|
for (let i = 0; i < commands.length; i++) {
|
||||||
const cmd = commands[i]!
|
const cmd = commands[i]!
|
||||||
const isFirst = i === 0
|
const isFirst = i === 0
|
||||||
const runId = cmd.autonomy?.runId
|
if (cmd.autonomy?.runId) {
|
||||||
|
;(autonomyRunIds ??= []).push(cmd.autonomy.runId)
|
||||||
|
await markAutonomyRunRunning(cmd.autonomy.runId)
|
||||||
|
}
|
||||||
const result = await processUserInput({
|
const result = await processUserInput({
|
||||||
input: cmd.value,
|
input: cmd.value,
|
||||||
preExpansionInput: cmd.preExpansionValue,
|
preExpansionInput: cmd.preExpansionValue,
|
||||||
@@ -520,11 +510,7 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
|||||||
bridgeOrigin: cmd.bridgeOrigin,
|
bridgeOrigin: cmd.bridgeOrigin,
|
||||||
isMeta: cmd.isMeta,
|
isMeta: cmd.isMeta,
|
||||||
skipAttachments: !isFirst,
|
skipAttachments: !isFirst,
|
||||||
autonomy: cmd.autonomy,
|
|
||||||
})
|
})
|
||||||
if (runId && result.deferAutonomyCompletion) {
|
|
||||||
deferredAutonomyRunIds.add(runId)
|
|
||||||
}
|
|
||||||
// Stamp origin here rather than threading another arg through
|
// Stamp origin here rather than threading another arg through
|
||||||
// processUserInput → processUserInputBase → processTextPrompt → createUserMessage.
|
// processUserInput → processUserInputBase → processTextPrompt → createUserMessage.
|
||||||
// Derive origin from mode for task-notifications — mirrors the origin
|
// Derive origin from mode for task-notifications — mirrors the origin
|
||||||
@@ -625,52 +611,28 @@ async function executeUserInput(params: ExecuteUserInputParams): Promise<void> {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
}) // end runWithWorkload — ALS context naturally scoped, no finally needed
|
}) // end runWithWorkload — ALS context naturally scoped, no finally needed
|
||||||
} catch (error) {
|
if (autonomyRunIds?.length) {
|
||||||
turnError = error
|
for (const runId of autonomyRunIds) {
|
||||||
}
|
const nextCommands = await finalizeAutonomyRunCompleted({
|
||||||
|
runId,
|
||||||
// Finalize claimed autonomy commands as `completed` only if the turn
|
|
||||||
// body itself succeeded. Run the finalize call in its own try/catch so a
|
|
||||||
// failure there does not double-finalize the same commands as `failed`
|
|
||||||
// (which previously cancelled follow-up queue state after a successful
|
|
||||||
// turn).
|
|
||||||
if (claimedAutonomyCommands.length) {
|
|
||||||
const finalizableCommands = claimedAutonomyCommands.filter(command => {
|
|
||||||
const runId = command.autonomy?.runId
|
|
||||||
return !runId || !deferredAutonomyRunIds.has(runId)
|
|
||||||
})
|
|
||||||
if (turnError) {
|
|
||||||
try {
|
|
||||||
await finalizeAutonomyCommandsForTurn({
|
|
||||||
commands: finalizableCommands,
|
|
||||||
outcome: { type: 'failed', error: turnError },
|
|
||||||
currentDir: getCwd(),
|
|
||||||
priority: 'later',
|
|
||||||
workload: turnWorkload,
|
|
||||||
})
|
|
||||||
} catch (finalizeError) {
|
|
||||||
logError(toError(finalizeError))
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
try {
|
|
||||||
const nextCommands = await finalizeAutonomyCommandsForTurn({
|
|
||||||
commands: finalizableCommands,
|
|
||||||
outcome: { type: 'completed' },
|
|
||||||
currentDir: getCwd(),
|
|
||||||
priority: 'later',
|
priority: 'later',
|
||||||
workload: turnWorkload,
|
workload: turnWorkload,
|
||||||
})
|
})
|
||||||
for (const nextCommand of nextCommands) {
|
for (const nextCommand of nextCommands) {
|
||||||
enqueue(nextCommand)
|
enqueue(nextCommand)
|
||||||
}
|
}
|
||||||
} catch (finalizeError) {
|
|
||||||
logError(toError(finalizeError))
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
} catch (error) {
|
||||||
|
if (autonomyRunIds?.length) {
|
||||||
if (turnError) {
|
for (const runId of autonomyRunIds) {
|
||||||
throw turnError
|
await finalizeAutonomyRunFailed({
|
||||||
|
runId,
|
||||||
|
error: String(error),
|
||||||
|
})
|
||||||
|
}
|
||||||
|
}
|
||||||
|
throw error
|
||||||
}
|
}
|
||||||
} finally {
|
} finally {
|
||||||
// Safety net: release the guard reservation if processUserInput threw
|
// Safety net: release the guard reservation if processUserInput threw
|
||||||
|
|||||||
@@ -1,162 +1,173 @@
|
|||||||
import { describe, expect, test, beforeEach, afterEach } from 'bun:test'
|
import { describe, expect, test, beforeEach, afterEach } from "bun:test";
|
||||||
|
import { mock } from "bun:test";
|
||||||
|
|
||||||
const { getAPIProvider, isFirstPartyAnthropicBaseUrl } = await import(
|
let mockedModelType: "gemini" | undefined;
|
||||||
'../providers'
|
|
||||||
)
|
|
||||||
|
|
||||||
describe('getAPIProvider', () => {
|
mock.module("../../settings/settings.js", () => ({
|
||||||
|
getInitialSettings: () =>
|
||||||
|
mockedModelType ? { modelType: mockedModelType } : {},
|
||||||
|
}));
|
||||||
|
|
||||||
|
const { getAPIProvider, isFirstPartyAnthropicBaseUrl } =
|
||||||
|
await import("../providers");
|
||||||
|
|
||||||
|
describe("getAPIProvider", () => {
|
||||||
const envKeys = [
|
const envKeys = [
|
||||||
'CLAUDE_CODE_USE_GEMINI',
|
"CLAUDE_CODE_USE_GEMINI",
|
||||||
'CLAUDE_CODE_USE_BEDROCK',
|
"CLAUDE_CODE_USE_BEDROCK",
|
||||||
'CLAUDE_CODE_USE_VERTEX',
|
"CLAUDE_CODE_USE_VERTEX",
|
||||||
'CLAUDE_CODE_USE_FOUNDRY',
|
"CLAUDE_CODE_USE_FOUNDRY",
|
||||||
'CLAUDE_CODE_USE_OPENAI',
|
"CLAUDE_CODE_USE_OPENAI",
|
||||||
'CLAUDE_CODE_USE_GROK',
|
] as const;
|
||||||
] as const
|
const savedEnv: Record<string, string | undefined> = {};
|
||||||
const savedEnv: Record<string, string | undefined> = {}
|
|
||||||
|
|
||||||
beforeEach(() => {
|
beforeEach(() => {
|
||||||
// Save and clear environment variables
|
// Save and clear environment variables
|
||||||
|
mockedModelType = undefined;
|
||||||
for (const key of envKeys) {
|
for (const key of envKeys) {
|
||||||
savedEnv[key] = process.env[key]
|
savedEnv[key] = process.env[key];
|
||||||
delete process.env[key]
|
delete process.env[key];
|
||||||
}
|
}
|
||||||
})
|
});
|
||||||
|
|
||||||
afterEach(() => {
|
afterEach(() => {
|
||||||
// Restore environment variables
|
// Restore environment variables
|
||||||
|
mockedModelType = undefined;
|
||||||
for (const key of envKeys) {
|
for (const key of envKeys) {
|
||||||
if (savedEnv[key] !== undefined) {
|
if (savedEnv[key] !== undefined) {
|
||||||
process.env[key] = savedEnv[key]
|
process.env[key] = savedEnv[key];
|
||||||
} else {
|
} else {
|
||||||
delete process.env[key]
|
delete process.env[key];
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
})
|
});
|
||||||
|
|
||||||
test('returns "firstParty" by default', () => {
|
test('returns "firstParty" by default', () => {
|
||||||
expect(getAPIProvider({})).toBe('firstParty')
|
expect(getAPIProvider()).toBe("firstParty");
|
||||||
})
|
});
|
||||||
|
|
||||||
test('returns "gemini" when modelType is gemini', () => {
|
test('returns "gemini" when modelType is gemini', () => {
|
||||||
expect(getAPIProvider({ modelType: 'gemini' })).toBe('gemini')
|
mockedModelType = "gemini";
|
||||||
})
|
expect(getAPIProvider()).toBe("gemini");
|
||||||
|
});
|
||||||
|
|
||||||
test('modelType takes precedence over environment variables', () => {
|
test("modelType takes precedence over environment variables", () => {
|
||||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
mockedModelType = "gemini";
|
||||||
expect(getAPIProvider({ modelType: 'gemini' })).toBe('gemini')
|
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||||
})
|
expect(getAPIProvider()).toBe("gemini");
|
||||||
|
});
|
||||||
|
|
||||||
test('returns "gemini" when CLAUDE_CODE_USE_GEMINI is set', () => {
|
test('returns "gemini" when CLAUDE_CODE_USE_GEMINI is set', () => {
|
||||||
process.env.CLAUDE_CODE_USE_GEMINI = '1'
|
process.env.CLAUDE_CODE_USE_GEMINI = "1";
|
||||||
expect(getAPIProvider({})).toBe('gemini')
|
expect(getAPIProvider()).toBe("gemini");
|
||||||
})
|
});
|
||||||
|
|
||||||
test('returns "bedrock" when CLAUDE_CODE_USE_BEDROCK is set', () => {
|
test('returns "bedrock" when CLAUDE_CODE_USE_BEDROCK is set', () => {
|
||||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||||
expect(getAPIProvider({})).toBe('bedrock')
|
expect(getAPIProvider()).toBe("bedrock");
|
||||||
})
|
});
|
||||||
|
|
||||||
test('returns "vertex" when CLAUDE_CODE_USE_VERTEX is set', () => {
|
test('returns "vertex" when CLAUDE_CODE_USE_VERTEX is set', () => {
|
||||||
process.env.CLAUDE_CODE_USE_VERTEX = '1'
|
process.env.CLAUDE_CODE_USE_VERTEX = "1";
|
||||||
expect(getAPIProvider({})).toBe('vertex')
|
expect(getAPIProvider()).toBe("vertex");
|
||||||
})
|
});
|
||||||
|
|
||||||
test('returns "foundry" when CLAUDE_CODE_USE_FOUNDRY is set', () => {
|
test('returns "foundry" when CLAUDE_CODE_USE_FOUNDRY is set', () => {
|
||||||
process.env.CLAUDE_CODE_USE_FOUNDRY = '1'
|
process.env.CLAUDE_CODE_USE_FOUNDRY = "1";
|
||||||
expect(getAPIProvider({})).toBe('foundry')
|
expect(getAPIProvider()).toBe("foundry");
|
||||||
})
|
});
|
||||||
|
|
||||||
test('bedrock takes precedence over gemini', () => {
|
test("bedrock takes precedence over gemini", () => {
|
||||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||||
process.env.CLAUDE_CODE_USE_GEMINI = '1'
|
process.env.CLAUDE_CODE_USE_GEMINI = "1";
|
||||||
expect(getAPIProvider({})).toBe('bedrock')
|
expect(getAPIProvider()).toBe("bedrock");
|
||||||
})
|
});
|
||||||
|
|
||||||
test('bedrock takes precedence over vertex', () => {
|
test("bedrock takes precedence over vertex", () => {
|
||||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||||
process.env.CLAUDE_CODE_USE_VERTEX = '1'
|
process.env.CLAUDE_CODE_USE_VERTEX = "1";
|
||||||
expect(getAPIProvider({})).toBe('bedrock')
|
expect(getAPIProvider()).toBe("bedrock");
|
||||||
})
|
});
|
||||||
|
|
||||||
test('bedrock wins when all three env vars are set', () => {
|
test("bedrock wins when all three env vars are set", () => {
|
||||||
process.env.CLAUDE_CODE_USE_BEDROCK = '1'
|
process.env.CLAUDE_CODE_USE_BEDROCK = "1";
|
||||||
process.env.CLAUDE_CODE_USE_VERTEX = '1'
|
process.env.CLAUDE_CODE_USE_VERTEX = "1";
|
||||||
process.env.CLAUDE_CODE_USE_FOUNDRY = '1'
|
process.env.CLAUDE_CODE_USE_FOUNDRY = "1";
|
||||||
expect(getAPIProvider({})).toBe('bedrock')
|
expect(getAPIProvider()).toBe("bedrock");
|
||||||
})
|
});
|
||||||
|
|
||||||
test('"true" is truthy', () => {
|
test('"true" is truthy', () => {
|
||||||
process.env.CLAUDE_CODE_USE_BEDROCK = 'true'
|
process.env.CLAUDE_CODE_USE_BEDROCK = "true";
|
||||||
expect(getAPIProvider({})).toBe('bedrock')
|
expect(getAPIProvider()).toBe("bedrock");
|
||||||
})
|
});
|
||||||
|
|
||||||
test('"0" is not truthy', () => {
|
test('"0" is not truthy', () => {
|
||||||
process.env.CLAUDE_CODE_USE_BEDROCK = '0'
|
process.env.CLAUDE_CODE_USE_BEDROCK = "0";
|
||||||
expect(getAPIProvider({})).toBe('firstParty')
|
expect(getAPIProvider()).toBe("firstParty");
|
||||||
})
|
});
|
||||||
|
|
||||||
test('empty string is not truthy', () => {
|
test('empty string is not truthy', () => {
|
||||||
process.env.CLAUDE_CODE_USE_BEDROCK = ''
|
process.env.CLAUDE_CODE_USE_BEDROCK = "";
|
||||||
expect(getAPIProvider({})).toBe('firstParty')
|
expect(getAPIProvider()).toBe("firstParty");
|
||||||
})
|
});
|
||||||
})
|
});
|
||||||
|
|
||||||
describe('isFirstPartyAnthropicBaseUrl', () => {
|
describe("isFirstPartyAnthropicBaseUrl", () => {
|
||||||
const originalBaseUrl = process.env.ANTHROPIC_BASE_URL
|
const originalBaseUrl = process.env.ANTHROPIC_BASE_URL;
|
||||||
const originalUserType = process.env.USER_TYPE
|
const originalUserType = process.env.USER_TYPE;
|
||||||
|
|
||||||
afterEach(() => {
|
afterEach(() => {
|
||||||
if (originalBaseUrl !== undefined) {
|
if (originalBaseUrl !== undefined) {
|
||||||
process.env.ANTHROPIC_BASE_URL = originalBaseUrl
|
process.env.ANTHROPIC_BASE_URL = originalBaseUrl;
|
||||||
} else {
|
} else {
|
||||||
delete process.env.ANTHROPIC_BASE_URL
|
delete process.env.ANTHROPIC_BASE_URL;
|
||||||
}
|
}
|
||||||
if (originalUserType !== undefined) {
|
if (originalUserType !== undefined) {
|
||||||
process.env.USER_TYPE = originalUserType
|
process.env.USER_TYPE = originalUserType;
|
||||||
} else {
|
} else {
|
||||||
delete process.env.USER_TYPE
|
delete process.env.USER_TYPE;
|
||||||
}
|
}
|
||||||
})
|
});
|
||||||
|
|
||||||
test('returns true when ANTHROPIC_BASE_URL is not set', () => {
|
test("returns true when ANTHROPIC_BASE_URL is not set", () => {
|
||||||
delete process.env.ANTHROPIC_BASE_URL
|
delete process.env.ANTHROPIC_BASE_URL;
|
||||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||||
})
|
});
|
||||||
|
|
||||||
test('returns true for api.anthropic.com', () => {
|
test("returns true for api.anthropic.com", () => {
|
||||||
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com'
|
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com";
|
||||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||||
})
|
});
|
||||||
|
|
||||||
test('returns false for custom URL', () => {
|
test("returns false for custom URL", () => {
|
||||||
process.env.ANTHROPIC_BASE_URL = 'https://my-proxy.com'
|
process.env.ANTHROPIC_BASE_URL = "https://my-proxy.com";
|
||||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
|
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
|
||||||
})
|
});
|
||||||
|
|
||||||
test('returns false for invalid URL', () => {
|
test("returns false for invalid URL", () => {
|
||||||
process.env.ANTHROPIC_BASE_URL = 'not-a-url'
|
process.env.ANTHROPIC_BASE_URL = "not-a-url";
|
||||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
|
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
|
||||||
})
|
});
|
||||||
|
|
||||||
test('returns true for staging URL when USER_TYPE is ant', () => {
|
test("returns true for staging URL when USER_TYPE is ant", () => {
|
||||||
process.env.ANTHROPIC_BASE_URL = 'https://api-staging.anthropic.com'
|
process.env.ANTHROPIC_BASE_URL = "https://api-staging.anthropic.com";
|
||||||
process.env.USER_TYPE = 'ant'
|
process.env.USER_TYPE = "ant";
|
||||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||||
})
|
});
|
||||||
|
|
||||||
test('returns true for URL with path', () => {
|
test("returns true for URL with path", () => {
|
||||||
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com/v1'
|
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com/v1";
|
||||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||||
})
|
});
|
||||||
|
|
||||||
test('returns true for trailing slash', () => {
|
test("returns true for trailing slash", () => {
|
||||||
process.env.ANTHROPIC_BASE_URL = 'https://api.anthropic.com/'
|
process.env.ANTHROPIC_BASE_URL = "https://api.anthropic.com/";
|
||||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(true)
|
expect(isFirstPartyAnthropicBaseUrl()).toBe(true);
|
||||||
})
|
});
|
||||||
|
|
||||||
test('returns false for subdomain attack', () => {
|
test("returns false for subdomain attack", () => {
|
||||||
process.env.ANTHROPIC_BASE_URL = 'https://evil-api.anthropic.com'
|
process.env.ANTHROPIC_BASE_URL = "https://evil-api.anthropic.com";
|
||||||
expect(isFirstPartyAnthropicBaseUrl()).toBe(false)
|
expect(isFirstPartyAnthropicBaseUrl()).toBe(false);
|
||||||
})
|
});
|
||||||
})
|
});
|
||||||
|
|||||||
@@ -1,6 +1,5 @@
|
|||||||
import type { AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS } from '../../services/analytics/index.js'
|
import type { AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS } from '../../services/analytics/index.js'
|
||||||
import { getInitialSettings } from '../settings/settings.js'
|
import { getInitialSettings } from '../settings/settings.js'
|
||||||
import type { SettingsJson } from '../settings/types.js'
|
|
||||||
import { isEnvTruthy } from '../envUtils.js'
|
import { isEnvTruthy } from '../envUtils.js'
|
||||||
|
|
||||||
export type APIProvider =
|
export type APIProvider =
|
||||||
@@ -12,10 +11,8 @@ export type APIProvider =
|
|||||||
| 'gemini'
|
| 'gemini'
|
||||||
| 'grok'
|
| 'grok'
|
||||||
|
|
||||||
export function getAPIProvider(
|
export function getAPIProvider(): APIProvider {
|
||||||
settings: Pick<SettingsJson, 'modelType'> = getInitialSettings(),
|
const modelType = getInitialSettings().modelType
|
||||||
): APIProvider {
|
|
||||||
const modelType = settings.modelType
|
|
||||||
if (modelType === 'openai') return 'openai'
|
if (modelType === 'openai') return 'openai'
|
||||||
if (modelType === 'gemini') return 'gemini'
|
if (modelType === 'gemini') return 'gemini'
|
||||||
if (modelType === 'grok') return 'grok'
|
if (modelType === 'grok') return 'grok'
|
||||||
|
|||||||
@@ -1,375 +0,0 @@
|
|||||||
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
|
|
||||||
import type { QueuedCommand } from '../../../types/textInputTypes'
|
|
||||||
import {
|
|
||||||
resetStateForTests,
|
|
||||||
setCwdState,
|
|
||||||
setOriginalCwd,
|
|
||||||
setProjectRoot,
|
|
||||||
} from '../../../bootstrap/state'
|
|
||||||
import {
|
|
||||||
createAutonomyQueuedPrompt,
|
|
||||||
getAutonomyRunById,
|
|
||||||
listAutonomyRuns,
|
|
||||||
markAutonomyRunRunning,
|
|
||||||
} from '../../autonomyRuns'
|
|
||||||
import { resetAutonomyAuthorityForTests } from '../../autonomyAuthority'
|
|
||||||
import { createScheduledTaskQueuedCommand } from '../../../hooks/useScheduledTasks'
|
|
||||||
import {
|
|
||||||
cleanupTempDir,
|
|
||||||
createTempDir,
|
|
||||||
} from '../../../../tests/mocks/file-system'
|
|
||||||
|
|
||||||
let runAgentBlocker: Promise<void> | null = null
|
|
||||||
let releaseRunAgentBlocker: (() => void) | null = null
|
|
||||||
let runAgentStartCount = 0
|
|
||||||
let originalNodeEnv: string | undefined
|
|
||||||
let originalAnthropicApiKey: string | undefined
|
|
||||||
const commandQueue: QueuedCommand[] = []
|
|
||||||
|
|
||||||
function enqueue(command: QueuedCommand): void {
|
|
||||||
commandQueue.push({ ...command, priority: command.priority ?? 'next' })
|
|
||||||
}
|
|
||||||
|
|
||||||
function enqueuePendingNotification(command: QueuedCommand): void {
|
|
||||||
commandQueue.push({ ...command, priority: command.priority ?? 'later' })
|
|
||||||
}
|
|
||||||
|
|
||||||
function getCommandQueue(): QueuedCommand[] {
|
|
||||||
return [...commandQueue]
|
|
||||||
}
|
|
||||||
|
|
||||||
function hasCommandsInQueue(): boolean {
|
|
||||||
return commandQueue.length > 0
|
|
||||||
}
|
|
||||||
|
|
||||||
function resetCommandQueue(): void {
|
|
||||||
commandQueue.length = 0
|
|
||||||
}
|
|
||||||
|
|
||||||
function createMessageQueueManagerMock() {
|
|
||||||
return {
|
|
||||||
enqueue,
|
|
||||||
enqueuePendingNotification,
|
|
||||||
getCommandQueue,
|
|
||||||
hasCommandsInQueue,
|
|
||||||
resetCommandQueue,
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
function holdRunAgent(): void {
|
|
||||||
runAgentBlocker = new Promise(resolve => {
|
|
||||||
releaseRunAgentBlocker = resolve
|
|
||||||
})
|
|
||||||
}
|
|
||||||
|
|
||||||
function releaseRunAgent(): void {
|
|
||||||
releaseRunAgentBlocker?.()
|
|
||||||
runAgentBlocker = null
|
|
||||||
releaseRunAgentBlocker = null
|
|
||||||
}
|
|
||||||
|
|
||||||
mock.module('bun:bundle', () => ({
|
|
||||||
feature: (name: string) => name === 'KAIROS',
|
|
||||||
}))
|
|
||||||
|
|
||||||
mock.module(
|
|
||||||
'@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js',
|
|
||||||
() => ({
|
|
||||||
runAgent: async function* () {
|
|
||||||
runAgentStartCount += 1
|
|
||||||
if (runAgentBlocker) {
|
|
||||||
await runAgentBlocker
|
|
||||||
}
|
|
||||||
yield {
|
|
||||||
type: 'assistant',
|
|
||||||
uuid: 'assistant-1',
|
|
||||||
timestamp: new Date().toISOString(),
|
|
||||||
message: {
|
|
||||||
id: 'msg_1',
|
|
||||||
type: 'message',
|
|
||||||
role: 'assistant',
|
|
||||||
model: 'test-model',
|
|
||||||
content: [{ type: 'text', text: 'forked command done' }],
|
|
||||||
stop_reason: 'end_turn',
|
|
||||||
stop_sequence: null,
|
|
||||||
usage: {
|
|
||||||
input_tokens: 0,
|
|
||||||
output_tokens: 0,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
}
|
|
||||||
},
|
|
||||||
}),
|
|
||||||
)
|
|
||||||
|
|
||||||
mock.module('@claude-code-best/builtin-tools/tools/AgentTool/UI.js', () => ({
|
|
||||||
AgentPromptDisplay: () => null,
|
|
||||||
AgentResponseDisplay: () => null,
|
|
||||||
extractLastToolInfo: () => null,
|
|
||||||
renderGroupedAgentToolUse: () => null,
|
|
||||||
renderToolResultMessage: () => null,
|
|
||||||
renderToolUseErrorMessage: () => null,
|
|
||||||
renderToolUseMessage: () => null,
|
|
||||||
renderToolUseProgressMessage: () => null,
|
|
||||||
renderToolUseRejectedMessage: () => null,
|
|
||||||
renderToolUseTag: () => null,
|
|
||||||
userFacingName: () => 'Agent',
|
|
||||||
userFacingNameBackgroundColor: () => 'gray',
|
|
||||||
}))
|
|
||||||
|
|
||||||
mock.module('../../messageQueueManager', createMessageQueueManagerMock)
|
|
||||||
mock.module('../../messageQueueManager.js', createMessageQueueManagerMock)
|
|
||||||
|
|
||||||
const { processSlashCommand } = await import('../processSlashCommand')
|
|
||||||
|
|
||||||
let tempDir = ''
|
|
||||||
|
|
||||||
function createScheduledTaskQueuedCommandForTest(task: {
|
|
||||||
id: string
|
|
||||||
prompt: string
|
|
||||||
}) {
|
|
||||||
return createScheduledTaskQueuedCommand(task, {
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
})
|
|
||||||
}
|
|
||||||
|
|
||||||
async function waitForRunStatus(
|
|
||||||
runId: string,
|
|
||||||
status: 'queued' | 'running' | 'completed' | 'failed' | 'cancelled',
|
|
||||||
): Promise<void> {
|
|
||||||
for (let i = 0; i < 200; i++) {
|
|
||||||
const run = await getAutonomyRunById(runId, tempDir)
|
|
||||||
if (run?.status === status) {
|
|
||||||
return
|
|
||||||
}
|
|
||||||
await new Promise(resolve => setTimeout(resolve, 10))
|
|
||||||
}
|
|
||||||
const run = await getAutonomyRunById(runId, tempDir)
|
|
||||||
throw new Error(`Expected ${runId} to be ${status}, got ${run?.status}`)
|
|
||||||
}
|
|
||||||
|
|
||||||
async function waitForRunAgentStarts(expected: number): Promise<void> {
|
|
||||||
for (let i = 0; i < 200; i++) {
|
|
||||||
if (runAgentStartCount >= expected) {
|
|
||||||
return
|
|
||||||
}
|
|
||||||
await new Promise(resolve => setTimeout(resolve, 10))
|
|
||||||
}
|
|
||||||
throw new Error(
|
|
||||||
`Expected runAgent to start ${expected} time(s), got ${runAgentStartCount}`,
|
|
||||||
)
|
|
||||||
}
|
|
||||||
|
|
||||||
async function waitForCommandQueueLength(expected: number): Promise<void> {
|
|
||||||
for (let i = 0; i < 200; i++) {
|
|
||||||
if (getCommandQueue().length === expected) {
|
|
||||||
return
|
|
||||||
}
|
|
||||||
await new Promise(resolve => setTimeout(resolve, 10))
|
|
||||||
}
|
|
||||||
throw new Error(
|
|
||||||
`Expected command queue length ${expected}, got ${getCommandQueue().length}`,
|
|
||||||
)
|
|
||||||
}
|
|
||||||
|
|
||||||
beforeEach(async () => {
|
|
||||||
tempDir = await createTempDir('process-slash-command-')
|
|
||||||
originalNodeEnv = process.env.NODE_ENV
|
|
||||||
originalAnthropicApiKey = process.env.ANTHROPIC_API_KEY
|
|
||||||
process.env.NODE_ENV = 'test'
|
|
||||||
process.env.ANTHROPIC_API_KEY = 'test-key'
|
|
||||||
runAgentBlocker = null
|
|
||||||
releaseRunAgentBlocker = null
|
|
||||||
runAgentStartCount = 0
|
|
||||||
resetStateForTests()
|
|
||||||
resetAutonomyAuthorityForTests()
|
|
||||||
resetCommandQueue()
|
|
||||||
setOriginalCwd(tempDir)
|
|
||||||
setProjectRoot(tempDir)
|
|
||||||
setCwdState(tempDir)
|
|
||||||
})
|
|
||||||
|
|
||||||
afterEach(async () => {
|
|
||||||
releaseRunAgent()
|
|
||||||
if (originalNodeEnv === undefined) {
|
|
||||||
delete process.env.NODE_ENV
|
|
||||||
} else {
|
|
||||||
process.env.NODE_ENV = originalNodeEnv
|
|
||||||
}
|
|
||||||
if (originalAnthropicApiKey === undefined) {
|
|
||||||
delete process.env.ANTHROPIC_API_KEY
|
|
||||||
} else {
|
|
||||||
process.env.ANTHROPIC_API_KEY = originalAnthropicApiKey
|
|
||||||
}
|
|
||||||
resetStateForTests()
|
|
||||||
resetAutonomyAuthorityForTests()
|
|
||||||
resetCommandQueue()
|
|
||||||
if (tempDir) {
|
|
||||||
await cleanupTempDir(tempDir)
|
|
||||||
}
|
|
||||||
mock.restore()
|
|
||||||
})
|
|
||||||
|
|
||||||
describe('processSlashCommand', () => {
|
|
||||||
const forkedCommand = {
|
|
||||||
type: 'prompt',
|
|
||||||
name: 'forked',
|
|
||||||
description: 'test forked command',
|
|
||||||
progressMessage: 'forking',
|
|
||||||
contentLength: 0,
|
|
||||||
source: 'builtin',
|
|
||||||
context: 'fork',
|
|
||||||
getPromptForCommand: async () => [
|
|
||||||
{ type: 'text', text: 'review from fork' },
|
|
||||||
],
|
|
||||||
} as const
|
|
||||||
|
|
||||||
function createContext() {
|
|
||||||
return {
|
|
||||||
getAppState: () => ({
|
|
||||||
kairosEnabled: true,
|
|
||||||
mcp: { clients: [] },
|
|
||||||
toolPermissionContext: {
|
|
||||||
mode: 'default',
|
|
||||||
alwaysAllowRules: {},
|
|
||||||
},
|
|
||||||
}),
|
|
||||||
options: {
|
|
||||||
commands: [forkedCommand],
|
|
||||||
allowBackgroundForkedSlashCommands: true,
|
|
||||||
tools: [],
|
|
||||||
refreshTools: () => [],
|
|
||||||
agentDefinitions: {
|
|
||||||
activeAgents: [{ agentType: 'general-purpose' }],
|
|
||||||
},
|
|
||||||
},
|
|
||||||
setResponseLength: mock((_updater: (length: number) => number) => {}),
|
|
||||||
} as any
|
|
||||||
}
|
|
||||||
|
|
||||||
test('defers autonomy completion until a KAIROS background forked command completes', async () => {
|
|
||||||
const queued = await createAutonomyQueuedPrompt({
|
|
||||||
basePrompt: '/forked review',
|
|
||||||
trigger: 'scheduled-task',
|
|
||||||
rootDir: tempDir,
|
|
||||||
currentDir: tempDir,
|
|
||||||
sourceId: 'cron-1',
|
|
||||||
})
|
|
||||||
expect(queued).not.toBeNull()
|
|
||||||
const runId = queued!.autonomy!.runId
|
|
||||||
await markAutonomyRunRunning(runId, tempDir, 100)
|
|
||||||
|
|
||||||
const result = await processSlashCommand(
|
|
||||||
'/forked review',
|
|
||||||
[],
|
|
||||||
[],
|
|
||||||
[],
|
|
||||||
createContext(),
|
|
||||||
mock(() => {}),
|
|
||||||
undefined,
|
|
||||||
false,
|
|
||||||
async () => ({ behavior: 'allow', updatedInput: {} }) as any,
|
|
||||||
queued!.autonomy,
|
|
||||||
)
|
|
||||||
|
|
||||||
expect(result).toMatchObject({
|
|
||||||
messages: [],
|
|
||||||
shouldQuery: false,
|
|
||||||
deferAutonomyCompletion: true,
|
|
||||||
})
|
|
||||||
|
|
||||||
await waitForRunStatus(runId, 'completed')
|
|
||||||
await waitForCommandQueueLength(1)
|
|
||||||
expect(getCommandQueue()).toEqual([
|
|
||||||
expect.objectContaining({
|
|
||||||
mode: 'prompt',
|
|
||||||
isMeta: true,
|
|
||||||
skipSlashCommands: true,
|
|
||||||
value: expect.stringContaining(
|
|
||||||
'<scheduled-task-result command="/forked">',
|
|
||||||
),
|
|
||||||
}),
|
|
||||||
])
|
|
||||||
})
|
|
||||||
|
|
||||||
-  test('keeps repeated /loop scheduled fires bounded while a background fork is running', async () => {
-    const task = {
-      id: 'cron-loop',
-      prompt: '/forked review',
-    }
-    const first = await createScheduledTaskQueuedCommandForTest(task)
-    expect(first?.autonomy?.runId).toBeDefined()
-    const runId = first!.autonomy!.runId
-    await markAutonomyRunRunning(runId, tempDir, 100)
-
-    holdRunAgent()
-    const result = await processSlashCommand(
-      '/forked review',
-      [],
-      [],
-      [],
-      createContext(),
-      mock(() => {}),
-      undefined,
-      false,
-      async () => ({ behavior: 'allow', updatedInput: {} }) as any,
-      first!.autonomy,
-    )
-
-    expect(result.deferAutonomyCompletion).toBe(true)
-    await waitForRunAgentStarts(1)
-
-    const repeatedFires = await Promise.all(
-      Array.from({ length: 200 }, () =>
-        createScheduledTaskQueuedCommandForTest(task),
-      ),
-    )
-    expect(repeatedFires.every(command => command === null)).toBe(true)
-    expect(
-      (await listAutonomyRuns(tempDir)).filter(
-        run => run.sourceId === 'cron-loop',
-      ),
-    ).toHaveLength(1)
-    expect(getCommandQueue()).toHaveLength(0)
-
-    releaseRunAgent()
-    await waitForRunStatus(runId, 'completed')
-    await waitForCommandQueueLength(1)
-    expect(getCommandQueue()).toHaveLength(1)
-
-    const next = await createScheduledTaskQueuedCommandForTest(task)
-    expect(next?.autonomy?.runId).toBeDefined()
-    expect(
-      (await listAutonomyRuns(tempDir)).filter(
-        run => run.sourceId === 'cron-loop',
-      ),
-    ).toHaveLength(2)
-  })
-
-  test('rejects the background fork test override outside test runtime', async () => {
-    process.env.NODE_ENV = 'production'
-
-    const result = await processSlashCommand(
-      '/forked review',
-      [],
-      [],
-      [],
-      createContext(),
-      mock(() => {}),
-      undefined,
-      false,
-      async () => ({ behavior: 'allow', updatedInput: {} }) as any,
-    )
-
-    expect(result.shouldQuery).toBe(false)
-    expect(
-      result.messages.some(message =>
-        JSON.stringify(message).includes(
-          'allowBackgroundForkedSlashCommands is test-only',
-        ),
-      ),
-    ).toBe(true)
-    expect(runAgentStartCount).toBe(0)
-  })
-})
File diff suppressed because it is too large
@@ -28,7 +28,6 @@ import type {
 import type { PermissionMode } from '../../types/permissions.js'
 import {
   isValidImagePaste,
-  type QueuedCommand,
   type PromptInputMode,
 } from '../../types/textInputTypes.js'
 import {
@@ -81,9 +80,6 @@ export type ProcessUserInputBaseResult = {
   // Used by /discover to chain into the selected feature's command
   nextInput?: string
   submitNextInput?: boolean
-  // When true, the command started detached work that will finalize its
-  // autonomy run after the background work completes.
-  deferAutonomyCompletion?: boolean
 }
 
 export async function processUserInput({
@@ -104,7 +100,6 @@ export async function processUserInput({
   bridgeOrigin,
   isMeta,
   skipAttachments,
-  autonomy,
 }: {
   input: string | Array<ContentBlockParam>
   /**
@@ -142,7 +137,6 @@ export async function processUserInput({
    */
   isMeta?: boolean
   skipAttachments?: boolean
-  autonomy?: QueuedCommand['autonomy']
 }): Promise<ProcessUserInputBaseResult> {
   const inputString = typeof input === 'string' ? input : null
   // Immediately show the user input prompt while we are still processing the input.
@@ -174,7 +168,6 @@ export async function processUserInput({
     isMeta,
     skipAttachments,
     preExpansionInput,
-    autonomy,
   )
   queryCheckpoint('query_process_user_input_base_end')
 
@@ -303,7 +296,6 @@ async function processUserInputBase(
   isMeta?: boolean,
   skipAttachments?: boolean,
   preExpansionInput?: string,
-  autonomy?: QueuedCommand['autonomy'],
 ): Promise<ProcessUserInputBaseResult> {
   let inputString: string | null = null
   let precedingInputBlocks: ContentBlockParam[] = []
@@ -499,7 +491,6 @@ async function processUserInputBase(
         uuid,
         isAlreadyProcessing,
         canUseTool,
-        autonomy,
       )
       return addImageMetadataMessage(slashResult, imageMetadataTexts)
     }
@@ -558,7 +549,6 @@ async function processUserInputBase(
         uuid,
         isAlreadyProcessing,
         canUseTool,
-        autonomy,
       )
       return addImageMetadataMessage(slashResult, imageMetadataTexts)
     }
@@ -424,7 +424,8 @@ function createInProcessCanUseTool(
           feedback: parsed.error,
         })
       }
-      return // Callback already resolves the promise
+      cleanup()
+      return
     }
   }
 }
@@ -674,7 +675,6 @@ type WaitResult =
       type: 'new_message'
       message: string
       autonomyRunId?: string
-      autonomyRootDir?: string
       from: string
       color?: string
       summary?: string
@@ -739,16 +739,12 @@ async function waitForNextPromptOrShutdown(
           `[inProcessRunner] ${identity.agentName} found pending user message (poll #${pollCount})`,
         )
         if (pending.autonomyRunId) {
-          await markAutonomyRunRunning(
-            pending.autonomyRunId,
-            pending.autonomyRootDir,
-          )
+          await markAutonomyRunRunning(pending.autonomyRunId)
         }
         return {
           type: 'new_message',
           message: pending.message,
           autonomyRunId: pending.autonomyRunId,
-          autonomyRootDir: pending.autonomyRootDir,
           from: 'user',
         }
       }
@@ -1026,7 +1022,6 @@ export async function runInProcessTeammate(
   )
   let currentPrompt = wrappedInitialPrompt
   let currentAutonomyRunId: string | undefined
-  let currentAutonomyRootDir: string | undefined
   let shouldExit = false
 
   // Try to claim an available task immediately so the UI can show activity
@@ -1324,21 +1319,12 @@ export async function runInProcessTeammate(
           setAppState,
         )
         if (currentAutonomyRunId) {
-          await markAutonomyRunFailed(
-            currentAutonomyRunId,
-            ERROR_MESSAGE_USER_ABORT,
-            currentAutonomyRootDir,
-          )
+          await markAutonomyRunFailed(currentAutonomyRunId, ERROR_MESSAGE_USER_ABORT)
           currentAutonomyRunId = undefined
-          currentAutonomyRootDir = undefined
         }
       } else if (currentAutonomyRunId) {
-        await markAutonomyRunCompleted(
-          currentAutonomyRunId,
-          currentAutonomyRootDir,
-        )
+        await markAutonomyRunCompleted(currentAutonomyRunId)
         currentAutonomyRunId = undefined
-        currentAutonomyRootDir = undefined
       }
 
       // Check if already idle before updating (to skip duplicate notification)
@@ -1412,7 +1398,6 @@ export async function runInProcessTeammate(
             setAppState,
           )
           currentAutonomyRunId = undefined
-          currentAutonomyRootDir = undefined
           break
 
         case 'new_message':
@@ -1425,7 +1410,6 @@ export async function runInProcessTeammate(
           if (waitResult.from === 'user') {
             currentPrompt = waitResult.message
             currentAutonomyRunId = waitResult.autonomyRunId
-            currentAutonomyRootDir = waitResult.autonomyRootDir
           } else {
             currentPrompt = formatAsTeammateMessage(
               waitResult.from,
@@ -1442,7 +1426,6 @@ export async function runInProcessTeammate(
               setAppState,
             )
             currentAutonomyRunId = undefined
-            currentAutonomyRootDir = undefined
           }
           break
 
@@ -1550,11 +1533,7 @@ export async function runInProcessTeammate(
       })
     }
     if (currentAutonomyRunId) {
-      await markAutonomyRunFailed(
-        currentAutonomyRunId,
-        errorMessage,
-        currentAutonomyRootDir,
-      )
+      await markAutonomyRunFailed(currentAutonomyRunId, errorMessage)
     }
 
     // Send idle notification with failure via file-based mailbox
@@ -234,7 +234,7 @@ export function killInProcessTeammate(
   let agentId: string | null = null
   let toolUseId: string | undefined
   let description: string | undefined
-  let pendingAutonomyRuns: Array<{ runId: string; rootDir?: string }> = []
+  let pendingAutonomyRunIds: string[] = []
 
   setAppState((prev: AppState) => {
     const task = prev.tasks[taskId]
@@ -255,18 +255,9 @@ export function killInProcessTeammate(
     description = teammateTask.description
 
     // Capture pending autonomy run IDs before clearing them
-    pendingAutonomyRuns = teammateTask.pendingUserMessages.flatMap(message =>
-      message.autonomyRunId
-        ? [
-            {
-              runId: message.autonomyRunId,
-              ...(message.autonomyRootDir
-                ? { rootDir: message.autonomyRootDir }
-                : {}),
-            },
-          ]
-        : [],
-    )
+    pendingAutonomyRunIds = teammateTask.pendingUserMessages
+      .map(message => message.autonomyRunId)
+      .filter((runId): runId is string => runId !== undefined)
 
     // Abort the controller to stop execution
     teammateTask.abortController?.abort()
@@ -320,11 +311,10 @@ export function killInProcessTeammate(
   }
 
   if (killed) {
-    for (const run of pendingAutonomyRuns) {
+    for (const runId of pendingAutonomyRunIds) {
       void markAutonomyRunFailed(
-        run.runId,
+        runId,
         `Teammate ${agentId ?? taskId} was stopped before it could consume the queued autonomy prompt.`,
-        run.rootDir,
       )
     }
     void evictTaskOutput(taskId)
@@ -1,148 +0,0 @@
-import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
-import { existsSync, mkdtempSync, rmSync } from 'node:fs'
-import { tmpdir } from 'node:os'
-import { join, resolve } from 'node:path'
-import {
-  resetStateForTests,
-  setOriginalCwd,
-  setProjectRoot,
-} from '../../src/bootstrap/state'
-import {
-  listAutonomyRuns,
-  startManagedAutonomyFlowFromHeartbeatTask,
-} from '../../src/utils/autonomyRuns'
-import { listAutonomyFlows } from '../../src/utils/autonomyFlows'
-
-const CLI_ENTRYPOINT = resolve(import.meta.dir, '../../src/entrypoints/cli.tsx')
-
-let tempDir = ''
-let configDir = ''
-let previousConfigDir: string | undefined
-
-async function runAutonomyCli(args: string[]): Promise<string> {
-  const proc = Bun.spawn({
-    cmd: [process.execPath, CLI_ENTRYPOINT, 'autonomy', ...args],
-    cwd: tempDir,
-    env: {
-      ...process.env,
-      CLAUDE_CONFIG_DIR: configDir,
-      CI: 'true',
-      GITHUB_ACTIONS: 'true',
-      NODE_ENV: 'development',
-      NO_COLOR: '1',
-    },
-    stdin: 'ignore',
-    stdout: 'pipe',
-    stderr: 'pipe',
-  })
-
-  const [stdout, stderr, exitCode] = await Promise.all([
-    new Response(proc.stdout).text(),
-    new Response(proc.stderr).text(),
-    proc.exited,
-  ])
-
-  expect(stderr, `unexpected stderr output:\n${stderr}`).toBe('')
-  expect(exitCode, `non-zero exit ${exitCode}; stderr:\n${stderr}`).toBe(0)
-  return stdout
-}
-
-beforeEach(() => {
-  tempDir = mkdtempSync(join(tmpdir(), 'autonomy-user-flow-'))
-  configDir = join(tempDir, 'config')
-  previousConfigDir = process.env.CLAUDE_CONFIG_DIR
-  process.env.CLAUDE_CONFIG_DIR = configDir
-  resetStateForTests()
-  setOriginalCwd(tempDir)
-  setProjectRoot(tempDir)
-})
-
-afterEach(() => {
-  resetStateForTests()
-  if (previousConfigDir === undefined) {
-    delete process.env.CLAUDE_CONFIG_DIR
-  } else {
-    process.env.CLAUDE_CONFIG_DIR = previousConfigDir
-  }
-  if (tempDir) {
-    rmSync(tempDir, { recursive: true, force: true })
-  }
-})
-
-describe('autonomy lifecycle user-equivalent CLI flow', () => {
-  test('status --deep works from a clean project without creating autonomy state', async () => {
-    const output = await runAutonomyCli(['status', '--deep'])
-
-    expect(output).toContain('# Autonomy Deep Status')
-    expect(output).toContain('Autonomy runs: 0')
-    expect(output).toContain('Autonomy flows: 0')
-    expect(existsSync(join(tempDir, '.claude', 'autonomy', 'runs.json'))).toBe(
-      false,
-    )
-    expect(existsSync(join(tempDir, '.claude', 'autonomy', 'flows.json'))).toBe(
-      false,
-    )
-  })
-
-  test('real CLI can inspect, resume, and cancel a persisted managed flow', async () => {
-    await startManagedAutonomyFlowFromHeartbeatTask({
-      rootDir: tempDir,
-      currentDir: tempDir,
-      task: {
-        name: 'manual-user-flow',
-        interval: '1h',
-        prompt: 'Manual lifecycle acceptance',
-        steps: [
-          {
-            name: 'approve',
-            prompt: 'Wait for manual approval',
-            waitFor: 'manual',
-          },
-          {
-            name: 'execute',
-            prompt: 'Execute approved work',
-          },
-        ],
-      },
-    })
-    const [waitingFlow] = await listAutonomyFlows(tempDir)
-    expect(waitingFlow?.status).toBe('waiting')
-
-    const status = await runAutonomyCli(['status', '--deep'])
-    expect(status).toContain('Autonomy flows: 1')
-    expect(status).toContain('Waiting: 1')
-
-    const flows = await runAutonomyCli(['flows', '5'])
-    expect(flows).toContain(waitingFlow!.flowId)
-    expect(flows).toContain('waiting')
-
-    const detailBefore = await runAutonomyCli(['flow', waitingFlow!.flowId])
-    expect(detailBefore).toContain('Status: waiting')
-    expect(detailBefore).toContain('Current step: approve')
-
-    const resume = await runAutonomyCli(['flow', 'resume', waitingFlow!.flowId])
-    expect(resume).toContain('Prepared the next managed step')
-    expect(resume).toContain('Prompt:')
-
-    const detailAfterResume = await runAutonomyCli([
-      'flow',
-      waitingFlow!.flowId,
-    ])
-    expect(detailAfterResume).toContain('Status: queued')
-    expect(detailAfterResume).toContain('Latest run:')
-
-    const cancel = await runAutonomyCli(['flow', 'cancel', waitingFlow!.flowId])
-    expect(cancel).toContain('Cancelled flow')
-
-    const [cancelledRun] = await listAutonomyRuns(tempDir)
-    const [cancelledFlow] = await listAutonomyFlows(tempDir)
-    expect(cancelledRun?.status).toBe('cancelled')
-    expect(cancelledFlow?.status).toBe('cancelled')
-
-    const detailAfterCancel = await runAutonomyCli([
-      'flow',
-      waitingFlow!.flowId,
-    ])
-    expect(detailAfterCancel).toContain('Status: cancelled')
-  }, 30000)
-})
@@ -2,42 +2,13 @@ import { describe, expect, test } from 'bun:test'
 import { mkdtempSync, rmSync, writeFileSync } from 'node:fs'
 import { createRequire } from 'node:module'
 import { tmpdir } from 'node:os'
-import { dirname, join, resolve } from 'node:path'
+import { join, resolve } from 'node:path'
 import { pathToFileURL } from 'node:url'
 
 const repoRoot = resolve(import.meta.dir, '..', '..')
 const uuidV4Pattern =
   /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/
 
-async function findPackageJson(
-  startPath: string,
-  expectedName: string,
-): Promise<string> {
-  let current = dirname(startPath)
-  for (let depth = 0; depth < 10; depth++) {
-    const candidate = join(current, 'package.json')
-    const file = Bun.file(candidate)
-    if (await file.exists()) {
-      try {
-        const parsed = JSON.parse(await file.text()) as { name?: unknown }
-        if (parsed.name === expectedName) {
-          return candidate
-        }
-      } catch {
-        // ignore parse errors and keep walking up
-      }
-    }
-    const parent = dirname(current)
-    if (parent === current) {
-      break
-    }
-    current = parent
-  }
-  throw new Error(
-    `package.json with name "${expectedName}" not found above ${startPath}`,
-  )
-}
-
 describe('dependency security overrides', () => {
   test('mcpb can load patched inquirer prompts from its package context', async () => {
     const mcpbRequire = createRequire(import.meta.resolve('@anthropic-ai/mcpb'))
@@ -57,7 +28,10 @@ describe('dependency security overrides', () => {
     )
     const gaxios = vertexRequire('gaxios') as {
       request(options: {
-        adapter(options: { headers: Headers; url: string }): Promise<{
+        adapter(options: {
+          headers: Headers
+          url: string
+        }): Promise<{
           config: unknown
           data: string
           headers: Record<string, string>
@@ -65,7 +39,7 @@ describe('dependency security overrides', () => {
           status: number
           statusText: string
         }>
         multipart: Array<{ body: string; headers: Record<string, string> }>
         url: string
       }): Promise<{ status: number }>
     }
@@ -73,10 +47,8 @@ describe('dependency security overrides', () => {
 
     const response = await gaxios.request({
       url: 'https://example.com/upload',
-      multipart: [
-        { body: 'payload', headers: { 'Content-Type': 'text/plain' } },
-      ],
-      adapter: async options => {
+      multipart: [{ body: 'payload', headers: { 'Content-Type': 'text/plain' } }],
+      adapter: async (options) => {
         contentType = options.headers.get('content-type') ?? undefined
         return {
           config: options,
@@ -90,14 +62,14 @@ describe('dependency security overrides', () => {
     })
 
     expect(response.status).toBe(200)
-    expect(contentType).toMatch(/^multipart\/related; boundary=[0-9a-f-]{36}$/)
+    expect(contentType).toMatch(
+      /^multipart\/related; boundary=[0-9a-f-]{36}$/,
+    )
     expect(contentType?.split('boundary=')[1]).toMatch(uuidV4Pattern)
   })
 
   test('azure identity msal guid generation works through its package context', () => {
-    const identityRequire = createRequire(
-      import.meta.resolve('@azure/identity'),
-    )
+    const identityRequire = createRequire(import.meta.resolve('@azure/identity'))
     const msal = identityRequire('@azure/msal-node') as {
       CryptoProvider: new () => { createNewGuid(): string }
     }
@@ -106,7 +78,7 @@ describe('dependency security overrides', () => {
     expect(cryptoProvider.createNewGuid()).toMatch(uuidV4Pattern)
   })
 
-  test('remote control markdown renderer resolves streamdown and mermaid', async () => {
+  test('remote control markdown renderer loads streamdown and mermaid', async () => {
     const rcsRequire = createRequire(
       join(repoRoot, 'packages/remote-control-server/package.json'),
     )
@@ -118,26 +90,13 @@ describe('dependency security overrides', () => {
     const uuid = (await import(
       pathToFileURL(streamdownRequire.resolve('uuid')).href
     )) as { v4(): string }
-    const mermaidPath = streamdownRequire.resolve('mermaid')
-    // mermaid does not export ./package.json in its exports map, so resolving
-    // 'mermaid/package.json' throws ERR_PACKAGE_PATH_NOT_EXPORTED in runtimes
-    // that honor exports semantics. Walk up from the resolved entry until a
-    // package.json with name === 'mermaid' is found.
-    const mermaidPackagePath = await findPackageJson(mermaidPath, 'mermaid')
-    const mermaidPackage = JSON.parse(
-      await Bun.file(mermaidPackagePath).text(),
-    ) as {
-      name?: unknown
-      exports?: { '.'?: { import?: unknown } }
-    }
+    const mermaid = (await import(
+      pathToFileURL(streamdownRequire.resolve('mermaid')).href
+    )) as { default?: { initialize?: unknown } }
 
     expect(streamdown.Streamdown).toBeDefined()
     expect(uuid.v4()).toMatch(uuidV4Pattern)
-    expect(mermaidPackage.name).toBe('mermaid')
-    expect(mermaidPath).toContain('mermaid.core.mjs')
-    expect(mermaidPackage.exports?.['.']?.import).toBe(
-      './dist/mermaid.core.mjs',
-    )
+    expect(typeof mermaid.default?.initialize).toBe('function')
   })
 
   test('grpc proto-loader keeps its protobuf 7 parser path working', () => {
@@ -1,31 +0,0 @@
-/**
- * Shared mock for `src/utils/auth.js`. Use it via:
- *
- *   import { authMock } from '../../tests/mocks/auth'
- *   mock.module('src/utils/auth.js', authMock)
- *
- * Tests that need different return values can override the helper used by
- * the suite (e.g. by extending this object and re-registering with mock.module).
- * Always extend here rather than inlining a different shape per test, so the
- * surface stays consistent when `auth.ts` exports change.
- */
-export const authMock = () => ({
-  // Mirrors the production contract: src/utils/auth.ts returns
-  // Promise<boolean> ("did the access token change") and a token object that
-  // carries scopes, subscriptionType, expiresAt, etc. Tests that branch on
-  // these values must see the full shape so they can not silently drift away
-  // from production.
-  checkAndRefreshOAuthTokenIfNeeded: async () => false,
-  getClaudeAIOAuthTokens: () => ({
-    accessToken: 'token',
-    refreshToken: null,
-    expiresAt: null,
-    scopes: ['user:inference'],
-    subscriptionType: null,
-    rateLimitTier: null,
-  }),
-  isClaudeAISubscriber: () => true,
-  isProSubscriber: () => false,
-  isMaxSubscriber: () => false,
-  isTeamSubscriber: () => false,
-})
@@ -30,21 +30,3 @@ export async function createTempSubdir(
   await mkdir(path, { recursive: true })
   return path
 }
-
-/**
- * Read a file under the test temp dir as utf-8 text. Mirrors the node:fs
- * `readFileSync(path, 'utf-8')` ergonomics but uses Bun's native file API so
- * tests stay on the Bun-only runtime contract.
- */
-export async function readTempFile(path: string): Promise<string> {
-  return Bun.file(path).text()
-}
-
-/**
- * Best-effort existence check for a path under the test temp dir. Uses Bun's
- * native file API (works for files; directories return true via Bun.file().exists()
- * iff the path resolves — reads directly from the filesystem).
- */
-export async function tempPathExists(path: string): Promise<boolean> {
-  return Bun.file(path).exists()
-}