When ANTHROPIC_BASE_URL points to a non-Anthropic endpoint (e.g.
DeepSeek), the JSON-formatted user_id containing {, ", : characters
fails validation against ^[a-zA-Z0-9_-]+$. Send only the hex device_id
for third-party providers.
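
A minimal sketch of the rule above, assuming a hypothetical `buildUserId` helper and assuming that "first-party" means a host under `anthropic.com`. Only the validation regex and the "send the bare device_id" behavior come from this message; every other name is illustrative.

```typescript
// Hypothetical helper names; only the regex and the device_id rule come from the text above.
const THIRD_PARTY_USER_ID = /^[a-zA-Z0-9_-]+$/;

// Assumption: "first-party" means the host is anthropic.com or a subdomain of it.
function isAnthropicEndpoint(baseUrl: string | undefined): boolean {
  if (!baseUrl) return true; // unset: default Anthropic endpoint
  try {
    const host = new URL(baseUrl).hostname;
    return host === "anthropic.com" || host.endsWith(".anthropic.com");
  } catch {
    return false; // unparseable URL: treat as third-party
  }
}

function buildUserId(
  baseUrl: string | undefined,
  deviceId: string,
  extraFields: Record<string, string>,
): string {
  if (isAnthropicEndpoint(baseUrl)) {
    // The JSON form contains {, " and : so it only suits the first-party API.
    return JSON.stringify({ device_id: deviceId, ...extraFields });
  }
  // Third-party providers get the bare hex device_id, which matches the regex.
  return deviceId;
}
```

With `ANTHROPIC_BASE_URL` set to a DeepSeek-style endpoint, this sketch returns the device_id unchanged, so the provider's pattern check passes.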
Add ImageLimits type and plumb optional limits through the chain:
callMCPTool/callMCPToolWithUrlElicitationRetry -> processMCPResult ->
transformMCPResult -> transformResultContent -> maybeResizeAndDownsampleImageBuffer.
When provided, limits override the module-level defaults
(IMAGE_TARGET_RAW_SIZE, IMAGE_MAX_WIDTH, IMAGE_MAX_HEIGHT,
API_IMAGE_MAX_BASE64_SIZE) inside maybeResizeAndDownsampleImageBuffer.
When undefined, behavior is unchanged for current callers.
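
As a rough illustration of the override rule, the sketch below resolves optional limits against module-level defaults. The field names on `ImageLimits`, the `resolveImageLimits` helper, the per-field fallback, and the numeric values are all assumptions made for the example; only the constant names come from this message.

```typescript
// Placeholder values, not the project's real defaults.
const IMAGE_TARGET_RAW_SIZE = 3_000_000;
const IMAGE_MAX_WIDTH = 2000;
const IMAGE_MAX_HEIGHT = 2000;
const API_IMAGE_MAX_BASE64_SIZE = 5_000_000;

// Hypothetical field names; the message only says an ImageLimits type exists.
interface ImageLimits {
  targetRawSize?: number;
  maxWidth?: number;
  maxHeight?: number;
  maxBase64Size?: number;
}

// When limits is undefined every field falls back to the module default,
// so callers that pass nothing see unchanged behavior.
function resolveImageLimits(limits?: ImageLimits): Required<ImageLimits> {
  return {
    targetRawSize: limits?.targetRawSize ?? IMAGE_TARGET_RAW_SIZE,
    maxWidth: limits?.maxWidth ?? IMAGE_MAX_WIDTH,
    maxHeight: limits?.maxHeight ?? IMAGE_MAX_HEIGHT,
    maxBase64Size: limits?.maxBase64Size ?? API_IMAGE_MAX_BASE64_SIZE,
  };
}
```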
Add _meta preservation in the text-block case of transformResultContent
(only when the caller opts in via includeMeta=true). transformMCPResult
passes includeMeta=true on the tool-result path; the prompt-handler call
site keeps the default false, preserving prior behavior.
Add skipLargeOutput early-return in processMCPResult after the IDE check:
when the caller passes skipLargeOutput=true and the content has no images,
the function returns content directly without large-output handling.
Add unwrap-to-text in processMCPResult for the persisted-content path:
when the large-string format gate is enabled
(MCP_TRUNCATION_PROMPT_OVERRIDE env var, or
tengu_mcp_subagent_prompt Statsig gate), and the content is a single
bare text block (no annotations, no _meta), unwrap to raw text and
switch the format description to 'Plain text'. Default-off; gate-off
behavior is unchanged.
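
The unwrap condition can be sketched as a small predicate. The block types and the `unwrapBareText` name below are hypothetical; the conditions (gate on, exactly one text block, no annotations, no _meta) follow the description above.

```typescript
// Hypothetical block shapes, reduced to the fields the condition inspects.
type TextBlock = { type: "text"; text: string; annotations?: unknown; _meta?: unknown };
type ImageBlock = { type: "image"; data: string; mimeType: string };
type ContentBlock = TextBlock | ImageBlock;

// Returns the raw text when the content qualifies for unwrapping, else null.
function unwrapBareText(content: ContentBlock[], gateEnabled: boolean): string | null {
  if (!gateEnabled) return null; // default-off: gate-off behavior is unchanged
  if (content.length !== 1) return null; // must be a single block
  const block = content[0];
  if (!block || block.type !== "text") return null;
  // "Bare" means no annotations and no _meta on the block.
  if (block.annotations !== undefined || block._meta !== undefined) return null;
  return block.text;
}
```

A caller would switch the format description to 'Plain text' only when this returns a non-null string.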
Verified structurally against the 2.1.128 binary: function signatures,
the IDE check, gate logic, _meta-unwrap pattern, and imageLimits
plumbing match this implementation.
The buildDiffableContent and buildPrevDiffableContent fields were closures
that captured the full system prompt and tool schema arrays (~300KB each).
With 10 map entries × 2 closures, this kept ~6MB of memory reachable, so the
GC could not reclaim it.
Since recordPromptState already serializes the same data for hashing,
pre-computing the diffable content string has negligible marginal cost.
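
A simplified sketch of the idea, not the actual recordPromptState code: serialize once, then store the resulting string instead of a closure over the source arrays. The `PromptStateEntry` shape, the `recordPromptStateSketch` name, and the FNV-1a hash are assumptions chosen to keep the example dependency-free.

```typescript
// Hypothetical entry shape: a plain string replaces the two closures.
interface PromptStateEntry {
  hash: string;
  diffableContent: string; // pre-computed, so the source arrays are not retained
}

// Small non-cryptographic hash (FNV-1a), used here only for illustration.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

function recordPromptStateSketch(
  systemPrompt: string[],
  toolSchemas: unknown[],
): PromptStateEntry {
  // Serialize once; the same string feeds the hash and the stored content.
  const serialized = JSON.stringify({ systemPrompt, toolSchemas });
  return { hash: fnv1a(serialized), diffableContent: serialized };
}
```

Because the entry holds only strings, dropping the caller's arrays lets them be collected even while the map entry stays alive.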
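- Add a 3-step Getting started guide to the Help General page, replacing the single-paragraph description
- Permission dialog footer: "Esc to cancel" → "Esc to reject", "Tab to amend" → "Tab to add feedback"
- Shorten the .claude/ folder permission option label from 60 to 49 characters to avoid truncation in narrow terminals
- Add 10 tests covering the permission prompt copy and the Help page guide content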
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
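bun update -g only updates to the newest version within the package.json
version range, so it cannot upgrade across versions. Switch to
bun install -g @latest, matching the npm-side behavior, to force-pull the
latest published release.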
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Root causes:
1. ThemeProvider was imported but never used in App.tsx and showSetupDialog
2. setThemeConfigCallbacks was never called to inject persistence callbacks
3. Preview/save/cancel theme lifecycle had no provider to coordinate
Changes:
- Export setThemeConfigCallbacks from @anthropic/ink
- Wrap App.tsx children with ThemeProvider (initialState from config, onThemeSave persists)
- Wrap showSetupDialog with ThemeProvider for onboarding/trust dialogs
- Call setThemeConfigCallbacks in init.ts to register load/save callbacks
- Update SnapshotUpdateDialog test to account for new ThemeProvider wrapper
Fixes #theme-switching
Four inline + one outside-diff actionable comment from the second CodeRabbit
review on claude-code-best/claude-code#386:
- tests/mocks/auth.ts: align mock return contracts with src/utils/auth.ts.
checkAndRefreshOAuthTokenIfNeeded resolves to a Promise<boolean> and
getClaudeAIOAuthTokens returns the full token shape (refreshToken, expiresAt,
scopes, subscriptionType, rateLimitTier) so tests that branch on these
values cannot silently drift away from production.
- src/utils/handlePromptSubmit.ts (461-468): clear the freshly-published
abortController before the early return when every claimed autonomy command
was skipped as non-consumable, so this turn's stale controller does not leak
into the next turn.
- src/utils/handlePromptSubmit.ts (621-649): separate execution failure from
finalizer failure. The turn body now writes to a `turnError` slot; a single
pass after the inner try decides whether to finalize claimed commands as
`completed` or `failed`, with each finalize call wrapped in its own
try/catch so a failure inside finalize does not flip a successful turn into
`failed` and double-finalize the same commands. The outer catch only
rethrows the original turn error.
- src/utils/processUserInput/processSlashCommand.tsx (228-276): wrap the
post-success `finalizeDeferredAutonomyRunCompleted()` call in its own
try/catch so a finalize failure no longer falls into the worker-failure
catch path and emits a contradictory `<scheduled-task-result status="failed">`
for a slash command that actually succeeded.
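
The separation of execution failure from finalizer failure can be sketched as follows. This is a synchronous simplification with hypothetical names (`runTurn`, `Outcome`); the real code paths are async, but the control flow is the same: record the turn error in a slot, finalize each command once in its own try/catch, then rethrow only the original error.

```typescript
type Outcome = "completed" | "failed";

function runTurn(
  commands: string[],
  body: () => void,
  finalize: (command: string, outcome: Outcome) => void,
): void {
  // Slot for the turn's own error; wrapping it also handles `throw undefined`.
  let turnError: { error: unknown } | undefined;
  try {
    body();
  } catch (error) {
    turnError = { error };
  }

  // Single pass: the outcome depends only on the turn body, never on finalize.
  const outcome: Outcome = turnError ? "failed" : "completed";
  for (const command of commands) {
    try {
      finalize(command, outcome);
    } catch (finalizeError) {
      // A finalize failure is reported but does not change the outcome
      // and does not trigger a second finalize for the same command.
      console.warn(`finalize failed for ${command}`, finalizeError);
    }
  }

  // Only the original turn error propagates.
  if (turnError) throw turnError.error;
}
```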
Outside scope (not changed) — the CodeRabbit suggestion to add a `.ts`
extension to the shared `tests/mocks/auth` import contradicts the project's
existing convention: every other test imports the shared mocks without the
extension (e.g. `tests/mocks/log`, `tests/mocks/debug`,
`tests/mocks/file-system`), and the project's tsconfig does not enable
`allowImportingTsExtensions`, so adding the extension fails typecheck. The
import is kept extension-less to match the rest of the suite.
Validation:
- bun run typecheck (clean).
- bun test → 3996 pass / 0 fail across 305 test files.
Twelve actionable items (7 Major + 5 Minor) from the CodeRabbit review on
claude-code-best/claude-code#386:
- docs/internals/autonomy-jira.md: typo "due input close" → "due to input close".
- src/utils/autonomyRuns.ts:
- selectPersistedAutonomyRuns no longer evicts active (queued/running) runs
when the combined list exceeds AUTONOMY_RUNS_MAX. Active runs are kept in
full and the inactive history is capped to the remaining budget so
persisted ownership for live work survives.
- isValidOwnerProcessId now allows pid <= 4_194_304 so a live run owned by
the maximum Linux PID is not treated as stale.
- src/utils/autonomyAuthority.ts: maskCodeFencedLines tracks the active fence
length and only closes the fence when a same-character run of equal-or-
greater length appears with no trailing content, so a nested ```yaml inside
an outer ```` block no longer leaks fake `tasks:` entries into the parser.
- src/cli/print.ts: late-shutdown branches in the cron and scheduled-task
paths now call cancelQueuedAutonomyCommands({ commands: [command] }) instead
of markAutonomyRunCancelled(...). Updating run state alone left the
queue-side record orphaned for resume/recovery.
- src/utils/processUserInput/processSlashCommand.tsx: scheduled-task-result
notification is enqueued before finalizeAutonomyRunCompleted (which queues
follow-up autonomy commands) so both at priority: 'later' land in order and
the next autonomy step cannot run before the worker's output is observed.
- src/screens/REPL.tsx + src/utils/handlePromptSubmit.ts:
- onQuery now returns Promise<boolean>: false from the concurrent-guard
skip path, true otherwise. Other call sites use `void onQuery(...)` and
are unaffected. handlePromptSubmit's onQuery prop type matches.
- The autonomy-prompt callsite captures the executed flag, finalizes
claim.claimedCommands as { type: 'completed' } only when onQuery actually
ran, and runs the completed-finalize in its own try/catch so a failure
there does not propagate into the outer catch and trigger a second
finalize as { type: 'failed' } for the same commands.
- Removed the unsafe `command.value as string` cast; createUserMessage
already accepts `string | ContentBlockParam[]`.
- createUserMessage mock in src/__tests__/handlePromptSubmit.test.ts now
matches the new Promise<boolean> shape.
- packages/builtin-tools/src/tools/RemoteTriggerTool/__tests__/
RemoteTriggerTool.test.ts:
- Inline auth mock replaced with the shared tests/mocks/auth (added).
- The full mock of src/constants/oauth.js is replaced by a narrow
side-effect-only mock that overrides the env-reading helpers
(getOauthConfig, fileSuffixForOauthConfig, MCP_CLIENT_METADATA_URL) and
delegates pure data exports to the real module.
- tests/integration/dependency-overrides.test.ts:
- mermaid does not export `./package.json` in its exports map, so
require.resolve('mermaid/package.json') throws
ERR_PACKAGE_PATH_NOT_EXPORTED in runtimes that honor exports semantics.
The test now resolves the package entry and walks up to the package
root via a small findPackageJson helper.
- readFileSync from node:fs is replaced with `await Bun.file(...).text()`
to match the project's Bun-API requirement.
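
To illustrate the selectPersistedAutonomyRuns change listed above, here is a minimal sketch of "keep every active run, cap only the inactive history". The record shape, the `selectPersistedRunsSketch` name, and the newest-first ordering of history are assumptions; the real function may order or trim differently.

```typescript
type RunStatus = "queued" | "running" | "completed" | "failed" | "cancelled";

// Hypothetical record shape, reduced to what the selection needs.
interface RunRecord {
  id: string;
  status: RunStatus;
  updatedAt: number;
}

const isActive = (run: RunRecord): boolean =>
  run.status === "queued" || run.status === "running";

function selectPersistedRunsSketch(runs: RunRecord[], maxRuns: number): RunRecord[] {
  // Active runs are always kept in full, even if they alone exceed the cap.
  const active = runs.filter(isActive);
  // Inactive history only gets whatever budget is left, newest first.
  const inactive = runs
    .filter((run) => !isActive(run))
    .sort((a, b) => b.updatedAt - a.updatedAt);
  const budget = Math.max(0, maxRuns - active.length);
  return [...active, ...inactive.slice(0, budget)];
}
```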
Validation:
- bun run typecheck (clean).
- bun test → 3996 pass / 0 fail across 305 test files.
Targets PRs:
- amDosion/claude-code-bast#8 (fork-internal review)
- claude-code-best/claude-code#386 (upstream review, same head branch)
This PR consolidates a coordinated batch of fixes around autonomy run/flow lifecycle, scheduled task deduplication, provider-boundary state finalization, and matching memory-bound treatments for adjacent long-running subsystems (REPL fullscreen scrollback, skill-search/skill-learning runtime activation). All changes were developed and reviewed together because they touched the same lifecycle invariants and were uncovered by the same long-running session reproductions.
## Lifecycle correctness
- Queued autonomy prompts are not injected unless the persisted run was successfully claimed; queued run claiming is now terminal-safe so a once-consumed/cancelled/failed run cannot slip back into `queued`.
- Autonomy run/flow finalization happens on completion, provider error, generator close, and cancellation — not just the happy path. New `src/__tests__/queryAutonomyProviderBoundary.test.ts` covers these provider-boundary transitions.
- `requestManagedAutonomyFlowCancel` and `resumeManagedAutonomyFlowPrompt` carry `rootDir` and `currentDir` explicitly across detached async boundaries (proactive-tick, cron, daemon restart) instead of inferring from process state.
- Active runs/flows are protected from janitor pruning so a running step cannot be garbage-collected mid-flight (`src/utils/autonomyAuthority.ts`).
- Heartbeat parser now ignores fenced code blocks; the two-phase commit window for autonomy state transitions is documented in `docs/internals/autonomy-jira.md`.
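
A sketch of fence-aware masking for the heartbeat parser, under the closing rule this review batch describes for maskCodeFencedLines: a fence closes only on a run of the same character, at least as long as the opener, with no trailing content. The function name `maskFencedLinesSketch` and the choice to blank masked lines (rather than drop them) are assumptions.

```typescript
function maskFencedLinesSketch(text: string): string {
  let fenceChar = "";
  let fenceLength = 0; // 0 means "not inside a fence"
  const out: string[] = [];

  for (const line of text.split("\n")) {
    // A fence line: up to 3 spaces of indent, then 3+ backticks or tildes.
    const match = /^\s{0,3}(`{3,}|~{3,})(.*)$/.exec(line);
    const run = match?.[1] ?? "";
    const rest = match?.[2] ?? "";

    if (fenceLength === 0) {
      if (run !== "") {
        fenceChar = run.charAt(0);
        fenceLength = run.length;
        out.push(""); // mask the opening fence line
      } else {
        out.push(line);
      }
      continue;
    }

    // Inside a fence: close only on same char, equal-or-greater length, nothing after.
    const closes =
      run !== "" &&
      run.charAt(0) === fenceChar &&
      run.length >= fenceLength &&
      rest.trim() === "";
    if (closes) {
      fenceChar = "";
      fenceLength = 0;
    }
    out.push(""); // mask everything inside the fence, including the closer
  }
  return out.join("\n");
}
```

Blanking instead of deleting keeps line numbers stable, which helps if the parser reports positions.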
## Ownership and dedup
- `src/utils/autonomyRuns.ts`: ownership stamping (run id + rootDir carried end-to-end), source-based dedup against active runs.
- `src/hooks/useScheduledTasks.ts`: scheduled ticks deduplicate against runs already active on the same source label.
- `src/utils/processUserInput/processSlashCommand.tsx`: forked slash commands now thread the autonomy `runId` so completion finalizers can find the originating run for deferred completion.
- New `src/utils/autonomyQueueLifecycle.ts` and tests collect the queue-side lifecycle invariants in one place.
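
Source-based dedup can be pictured roughly like this. The record shape and the `enqueueScheduledTick` helper are hypothetical; the rule illustrated is that a scheduled tick is skipped while a run with the same source label is still queued or running.

```typescript
type RunStatus = "queued" | "running" | "completed" | "failed" | "cancelled";

// Hypothetical record shape; `sourceLabel` stands in for the run's source.
interface RunRecord {
  id: string;
  sourceLabel: string;
  status: RunStatus;
}

function hasActiveRunForSource(runs: RunRecord[], sourceLabel: string): boolean {
  return runs.some(
    (run) =>
      run.sourceLabel === sourceLabel &&
      (run.status === "queued" || run.status === "running"),
  );
}

// Returns the new run, or null when the tick is a duplicate of live work.
function enqueueScheduledTick(
  runs: RunRecord[],
  sourceLabel: string,
  nextId: string,
): RunRecord | null {
  if (hasActiveRunForSource(runs, sourceLabel)) return null;
  const run: RunRecord = { id: nextId, sourceLabel, status: "queued" };
  runs.push(run);
  return run;
}
```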
## Memory bounds (related, same review pass)
- `src/screens/REPL.tsx`: caps fullscreen scrollback after the compact boundary and updates trailing progress rows in place. Long-running fullscreen sessions could otherwise retain thousands of post-compaction messages and duplicate progress rows, keeping Ink trees alive long after their useful context had moved on.
- `src/services/skillSearch/*` and `src/services/skillLearning/*`: runtime activation is strictly opt-in via existing env toggles; session caches are capped so long-running processes cannot grow them forever. Build presence is preserved so operators can still discover and opt into the slash commands.
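
One common way to cap a session cache is a small insertion-ordered map that evicts its oldest entry once a limit is reached. The `BoundedCache` class below is a generic sketch under that assumption; the PR does not say which eviction policy the skill-search caches actually use.

```typescript
// Generic bounded cache; Map preserves insertion order, so the first key is the oldest.
class BoundedCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  get(key: K): V | undefined {
    return this.entries.get(key);
  }

  set(key: K, value: V): void {
    // Re-inserting moves the key to the newest position.
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    while (this.entries.size > this.maxEntries) {
      for (const oldest of this.entries.keys()) {
        this.entries.delete(oldest);
        break; // evict exactly one entry per pass
      }
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```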
## CI / test contract
- `tests/integration/dependency-overrides.test.ts`: smoke test no longer drives Mermaid's browser renderer; it validates the package-resolution contract directly so CI does not regress on unrelated browser timing.
- New `tests/integration/autonomy-lifecycle-user-flow.test.ts`: end-to-end CLI subprocess flow exercising `status --deep`, `flows`, `flow <id>`, `flow resume`, `flow cancel` against persisted state.
- `src/entrypoints/cli.tsx`: `claude autonomy …` routes through an entrypoint fast path that reuses the slash-command formatter without booting the full interactive CLI. Stdout is flushed before forced exit so coverage subprocesses do not terminate with empty stdout.
- `packages/builtin-tools/src/tools/RemoteTriggerTool/__tests__/RemoteTriggerTool.test.ts`: stabilized to prevent audit flake under coverage.
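
The package-resolution check can avoid `mermaid/package.json` by resolving the package entry and walking up to the nearest `package.json`. The sketch below is one possible shape for such a helper, not the test's actual code: the signature and the injectable `exists` parameter (added so the walk can be exercised without touching disk) are assumptions.

```typescript
import { existsSync } from "node:fs";
import { dirname, join } from "node:path";

// Walks up from a resolved entry file until a package.json is found.
// Returns null if the filesystem root is reached without a match.
function findPackageJson(
  entryPath: string,
  exists: (path: string) => boolean = existsSync,
): string | null {
  let dir = dirname(entryPath);
  while (true) {
    const candidate = join(dir, "package.json");
    if (exists(candidate)) return candidate;
    const parent = dirname(dir);
    if (parent === dir) return null; // reached the root
    dir = parent;
  }
}
```

Note that this returns the nearest `package.json`, so a package that ships a nested one (for example inside `dist/`) would need an extra check on the `name` field.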
## Tests added
- `src/__tests__/queryAutonomyProviderBoundary.test.ts`
- `src/hooks/__tests__/useScheduledTasks.test.ts`
- `src/utils/__tests__/autonomyAuthority.test.ts`
- `src/utils/__tests__/autonomyFlows.test.ts` (extended)
- `src/utils/__tests__/autonomyPersistence.test.ts` (extended)
- `src/utils/__tests__/autonomyQueueLifecycle.test.ts`
- `src/utils/__tests__/autonomyRuns.test.ts` (extended)
- `src/utils/processUserInput/__tests__/processSlashCommand.test.ts`
- `tests/integration/autonomy-lifecycle-user-flow.test.ts`
## Docs
- `docs/agent/sur-loop-scheduled-oom.md`: System Understanding Report covering the scheduled/loop OOM problem, the call graphs investigated, and the lifecycle invariants this PR establishes.
- `docs/agent/sur-skill-overflow-bugs.md`: SUR for the related skill-overflow context.
- `docs/internals/autonomy-jira.md`: documents the two-phase commit window and ownership stamping invariants.
- `docs/memory-leak-audit.md`: audit notes covering the REPL/scrollback and skill-search bounds.
## Invariants this PR establishes
1. Queued autonomy prompts are not injected unless the persisted run was successfully claimed.
2. Terminal run/flow states are terminal — completion, failure, and cancellation all finalize state regardless of which provider/error path triggered them.
3. Autonomy run/flow `rootDir` is carried explicitly across detached async boundaries instead of inferred from a shared singleton.
4. State-only CLI subcommands (`autonomy status|runs|flows|flow …`) bypass full interactive bootstrap so they do not hold unrelated handles open.
5. REPL fullscreen scrollback and skill-search/skill-learning session caches are explicitly bounded.
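
Invariants 1 and 2 can be illustrated with a tiny state machine. The in-memory `RunRecord`, `claimQueuedRun`, and `finalizeRun` below are hypothetical stand-ins for the persisted-run logic; the point is that a claim succeeds only from `queued` and a terminal state never changes again.

```typescript
type RunStatus = "queued" | "running" | "completed" | "failed" | "cancelled";
type TerminalStatus = "completed" | "failed" | "cancelled";

interface RunRecord {
  id: string;
  status: RunStatus;
}

const isTerminal = (status: RunStatus): boolean =>
  status === "completed" || status === "failed" || status === "cancelled";

// Claim succeeds only from `queued`; callers inject the prompt only on true.
function claimQueuedRun(run: RunRecord): boolean {
  if (run.status !== "queued") return false;
  run.status = "running";
  return true;
}

// Terminal states are sticky: a second finalize is a no-op.
function finalizeRun(run: RunRecord, outcome: TerminalStatus): boolean {
  if (isTerminal(run.status)) return false;
  run.status = outcome;
  return true;
}
```

A caller would guard prompt injection with `if (claimQueuedRun(run)) { ... }`, so a run that was already consumed, cancelled, or failed is never injected a second time.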
## Validation
```bash
bun run typecheck
CI=true GITHUB_ACTIONS=true bun test # 3996 pass / 0 fail across 305 files
bun test src/__tests__/queryAutonomyProviderBoundary.test.ts \
src/hooks/__tests__/useScheduledTasks.test.ts \
src/utils/__tests__/autonomy{Runs,Flows,Authority,QueueLifecycle,Persistence}.test.ts \
src/utils/processUserInput/__tests__/processSlashCommand.test.ts \
tests/integration/autonomy-lifecycle-user-flow.test.ts
```
## Origin
This PR is the consolidated, upstream-targeted version of two fork-side review PRs (fix/loop-scheduled-autonomy-oom and fix/autonomy-lifecycle). The fork-side review history is preserved at https://github.com/amDosion/claude-code-bast/pull/7 . The fork's own internal `chore: keep fork current with upstream` sync commits and the `docs: update contributors` automation are intentionally not included in this PR.
The autonomy CLI handler `rootDir` threading that the fork added (78f64d8a, 98d04ddb) is intentionally omitted here because upstream `a2cfaf91` (fix: 修复 RemoteTriggerTool 和 autonomy 测试的全量运行失败) already performed the equivalent change with an additional `currentDir` option. Keeping the upstream version avoids regressing that improvement.