* feat: 新增 cloud-artifacts 包(Cloudflare Worker HTML artifact 托管)
POST /upload 鉴权上传 HTML 到 R2 返回 hash URL,GET /<7d|30d>/<id>.html
由 Worker 代理读取并直出 text/html。R2 lifecycle rule 自动 7/30 天删除。
独立服务,不被主 CLI 引用(类似 packages/remote-control-server/ 定位)。
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* docs: 完善 cloud-artifacts 文档并统一出口域名
- CLAUDE.md 加 cloud-artifacts 到 Workspace Packages 表和新增 HTML Artifact Hosting 段落
- docs.json 注册 cloud-artifacts 到运行模式 group
- README 加 Quickstart、架构图(含 Deno Deploy 代理层)、Security Considerations、Troubleshooting
- 统一出口域名为 https://cloud-artifacts.claude-code-best.win(wrangler.toml PUBLIC_URL、test.sh 默认 WORKER_URL、所有文档示例)
- test.sh expect() 加 [via body] fallback:经 Deno Deploy 代理(status 抹平为 200)时按 body 的 error 字段断言
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* docs: 修正 CLAUDE.md cloud-artifacts 引用死链
之前指向不存在的 docs/features/cloud-artifacts.md(用户未注册到 docs.json),
改为指向已存在的 packages/cloud-artifacts/README.md,并补充生产出口域名
与 Deno Deploy status 抹平副作用的说明。
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* chore: 同步 cloud-artifacts 测试默认 TOKEN 到新值
用户已通过 wrangler secret put 把生产 TOKEN 改为 claude-code-best,
test.sh 的默认值(之前用旧 token 作 fallback)和注释示例同步更新。
现在直接 bash scripts/test.sh 即可跑通(无需显式传 TOKEN)。
src/index.ts 不依赖具体 token 值(只读 env.TOKEN 做比较),
wrangler.toml 不含 secret,README/.dev.vars.example 用 <your-token>
占位符故无需改。
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* docs: add artifacts feature implementation plan
* feat(artifact): add cloud-artifacts config with token/URL defaults
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* feat(artifact): add HTTP client with body-error parsing
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* feat(artifact): add tool name, description, and prompt
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* feat(artifact): add buildTool definition with file validation
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* test(artifact): add end-to-end tool tests for upload/error paths
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* feat(artifact): export ArtifactTool from builtin-tools barrel
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* feat(artifact): register ArtifactTool in tools list
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* feat(artifact): add /use-artifacts bundled skill
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* feat(artifact): add extractArtifacts message scanner
Scans Message[] for artifact tool_use/tool_result pairs, parses URL/id/expires
from the upload response string, and returns ArtifactInfo[] newest-first.
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* fix(artifact): scanner type narrowing and url regex
- Use double assertion (`as unknown as Record<string, unknown>`) at lines 30
and 90 to fix TS2352 per project convention
- Tighten URL_REGEX to avoid capturing trailing punctuation (parens,
quotes, commas) when URL is embedded in text
- Add test case for array-form tool_result content path
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* feat(artifact): add ArtifactsMenu Ink component
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* feat(artifact): add /artifacts slash command entry
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* feat(artifact): register /artifacts command
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* fix(artifact): use setClipboard instead of pbcopy for cross-platform support
* fix(artifact): drop userFacingName override so display matches /artifacts
* fix(rcs): add resJson helper to resolve strict mode type errors in tests
Hono Response.json() returns Promise<unknown> under strict TypeScript,
causing 121 TS errors across middleware and routes test files.
Co-Authored-By: glm-5-turbo <zai-org@claude-code-best.win>
* fix(cloud-artifacts): add type stubs so tsc passes without worker-configuration.d.ts
The wrangler-generated worker-configuration.d.ts is gitignored, causing CI to
fail with missing ExportedHandler/Env/R2Bucket types. This file provides minimal
stubs for all Cloudflare Workers types used by the artifact upload Worker.
Co-Authored-By: glm-5-turbo <zai-org@claude-code-best.win>
* fix(query): shallow-copy messages before stripping toolUseResult
Previously the per-query cleanup mutated messagesForQuery entries in
place via `delete msg.toolUseResult`. Those entries are references
shared with mutableMessages (UI state), so the delete stripped the
field from the live message object. The next query can start within
milliseconds of tool_result creation — before the React UI commit
lands — so UserToolSuccessMessage's `!message.toolUseResult` check
returned null and tool.renderToolResultMessage was never called,
leaving tool-result rows blank.
Map to a stripped copy instead so mutableMessages keeps the original
for the UI. Downstream API transformations (applyToolResultBudget,
snip, microcompact) already build new arrays via .map(), so they
compose cleanly with this copy.
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* feat(artifact): show uploaded URL inline below ExecuteExtraTool
Deferred tools (shouldDefer: true) are invoked via SearchExtraTools →
ExecuteExtraTool, so their tool_result rows used to render blank —
the UI looked up ExecuteExtraTool, which had no renderToolResultMessage,
and returned null. Add a generic delegation in ExecuteTool that forwards
renderToolResultMessage to the inner tool when it defines one, unwrapping
the { result, tool_name } envelope and the params from the input shape.
All 28 deferred tools can now render their own UI by defining
renderToolResultMessage.
For ArtifactTool specifically, render the uploaded URL as an OSC 8
hyperlink (Link component) in warning color so it's visually prominent,
with the expiry timestamp on a second line and a separate error branch.
Also add `error: z.string().optional()` to outputSchema — zod's default
strip mode was dropping the field, so error states never reached the UI.
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
* fix(cloud-artifacts): make Env stubs actually take effect in CI
The previous stub file (2e29e362) wrapped `interface Env` in
`declare global { ... }`, but the file has no top-level import/export so
it's a script, not a module. TS2669 forbids `declare global` in scripts,
and in .d.ts files that error is silently swallowed — so the Env stubs
were never merged into the global scope. Locally typecheck passed only
because worker-configuration.d.ts (gitignored) provided Env separately;
in CI / fresh clones, `BUCKET`, `MAX_BYTES`, `DEFAULT_TTL_DAYS`,
`PUBLIC_URL` were all missing on Env.
Drop the wrapper. Top-level `interface Env` in a script .d.ts is already
global ambient and merges with worker-configuration.d.ts via interface
declaration merging, so both environments typecheck cleanly.
Co-Authored-By: glm-5.2 <zai-org@claude-code-best.win>
---------
Co-authored-by: glm-5.2 <zai-org@claude-code-best.win>
Twelve actionable items (7 Major + 5 Minor) from the CodeRabbit review on
claude-code-best/claude-code#386:
- docs/internals/autonomy-jira.md: typo "due input close" → "due to input close".
- src/utils/autonomyRuns.ts:
- selectPersistedAutonomyRuns no longer evicts active (queued/running) runs
when the combined list exceeds AUTONOMY_RUNS_MAX. Active runs are kept in
full and the inactive history is capped to the remaining budget so
persisted ownership for live work survives.
- isValidOwnerProcessId now allows pid <= 4_194_304 so a live run owned by
the maximum Linux PID is not treated as stale.
- src/utils/autonomyAuthority.ts: maskCodeFencedLines tracks the active fence
length and only closes the fence when a same-character run of equal-or-
greater length appears with no trailing content, so a nested ```yaml inside
an outer ```` block no longer leaks fake `tasks:` entries into the parser.
- src/cli/print.ts: late-shutdown branches in the cron and scheduled-task
paths now call cancelQueuedAutonomyCommands({ commands: [command] }) instead
of markAutonomyRunCancelled(...). Updating run state alone left the
queue-side record orphaned for resume/recovery.
- src/utils/processUserInput/processSlashCommand.tsx: scheduled-task-result
notification is enqueued before finalizeAutonomyRunCompleted (which queues
follow-up autonomy commands) so both at priority: 'later' land in order and
the next autonomy step can not run before the worker's output is observed.
- src/screens/REPL.tsx + src/utils/handlePromptSubmit.ts:
- onQuery now returns Promise<boolean>: false from the concurrent-guard
skip path, true otherwise. Other call sites use `void onQuery(...)` and
are unaffected. handlePromptSubmit's onQuery prop type matches.
- The autonomy-prompt callsite captures the executed flag, finalizes
claim.claimedCommands as { type: 'completed' } only when onQuery actually
ran, and runs the completed-finalize in its own try/catch so a failure
there does not propagate into the outer catch and trigger a second
finalize as { type: 'failed' } for the same commands.
- Removed the unsafe `command.value as string` cast; createUserMessage
already accepts `string | ContentBlockParam[]`.
- createUserMessage mock in src/__tests__/handlePromptSubmit.test.ts now
matches the new Promise<boolean> shape.
- packages/builtin-tools/src/tools/RemoteTriggerTool/__tests__/
RemoteTriggerTool.test.ts:
- Inline auth mock replaced with the shared tests/mocks/auth (added).
- The full mock of src/constants/oauth.js is replaced by a narrow
side-effect-only mock that overrides the env-reading helpers
(getOauthConfig, fileSuffixForOauthConfig, MCP_CLIENT_METADATA_URL) and
delegates pure data exports to the real module.
- tests/integration/dependency-overrides.test.ts:
- mermaid does not export `./package.json` in its exports map, so
require.resolve('mermaid/package.json') throws
ERR_PACKAGE_PATH_NOT_EXPORTED in runtimes that honor exports semantics.
The test now resolves the package entry and walks up to the package
root via a small findPackageJson helper.
- readFileSync from node:fs is replaced with `await Bun.file(...).text()`
to match the project's Bun-API requirement.
Validation:
- bun run typecheck (clean).
- bun test → 3996 pass / 0 fail across 305 test files.
Targets PRs:
- amDosion/claude-code-bast#8 (fork-internal review)
- claude-code-best/claude-code#386 (upstream review, same head branch)
This PR consolidates a coordinated batch of fixes around autonomy run/flow lifecycle, scheduled task deduplication, provider-boundary state finalization, and matching memory-bound treatments for adjacent long-running subsystems (REPL fullscreen scrollback, skill-search/skill-learning runtime activation). All changes were developed and reviewed together because they touched the same lifecycle invariants and were uncovered by the same long-running session reproductions.
## Lifecycle correctness
- Queued autonomy prompts are not injected unless the persisted run was successfully claimed; queued run claiming is now terminal-safe so a once-consumed/cancelled/failed run can not slip back into `queued`.
- Autonomy run/flow finalization happens on completion, provider error, generator close, and cancellation — not just the happy path. New `src/__tests__/queryAutonomyProviderBoundary.test.ts` covers these provider-boundary transitions.
- `requestManagedAutonomyFlowCancel` and `resumeManagedAutonomyFlowPrompt` carry `rootDir` and `currentDir` explicitly across detached async boundaries (proactive-tick, cron, daemon restart) instead of inferring from process state.
- Active runs/flows are protected from janitor pruning so a running step can not be garbage-collected mid-flight (`src/utils/autonomyAuthority.ts`).
- Heartbeat parser now ignores fenced code blocks; the two-phase commit window for autonomy state transitions is documented in `docs/internals/autonomy-jira.md`.
## Ownership and dedup
- `src/utils/autonomyRuns.ts`: ownership stamping (run id + rootDir carried end-to-end), source-based dedup against active runs.
- `src/hooks/useScheduledTasks.ts`: scheduled ticks deduplicate against runs already active on the same source label.
- `src/utils/processUserInput/processSlashCommand.tsx`: forked slash commands now thread the autonomy `runId` so completion finalizers can find the originating run for deferred completion.
- New `src/utils/autonomyQueueLifecycle.ts` and tests collect the queue-side lifecycle invariants in one place.
## Memory bounds (related, same review pass)
- `src/screens/REPL.tsx`: caps fullscreen scrollback after the compact boundary and updates trailing progress rows in place. Long-running fullscreen sessions could otherwise retain thousands of post-compaction messages and duplicate progress rows, keeping Ink trees alive long after their useful context had moved on.
- `src/services/skillSearch/*` and `src/services/skillLearning/*`: runtime activation is strictly opt-in via existing env toggles; session caches are capped so long-running processes can not grow them forever. Build presence is preserved so operators can still discover and opt into the slash commands.
## CI / test contract
- `tests/integration/dependency-overrides.test.ts`: smoke test no longer drives Mermaid's browser renderer; it validates the package-resolution contract directly so CI does not regress on unrelated browser timing.
- New `tests/integration/autonomy-lifecycle-user-flow.test.ts`: end-to-end CLI subprocess flow exercising `status --deep`, `flows`, `flow <id>`, `flow resume`, `flow cancel` against persisted state.
- `src/entrypoints/cli.tsx`: `claude autonomy …` routes through an entrypoint fast path that reuses the slash-command formatter without booting the full interactive CLI. Stdout is flushed before forced exit so coverage subprocesses do not terminate with empty stdout.
- `packages/builtin-tools/src/tools/RemoteTriggerTool/__tests__/RemoteTriggerTool.test.ts`: stabilized to prevent audit flake under coverage.
## Tests added
- `src/__tests__/queryAutonomyProviderBoundary.test.ts`
- `src/hooks/__tests__/useScheduledTasks.test.ts`
- `src/utils/__tests__/autonomyAuthority.test.ts`
- `src/utils/__tests__/autonomyFlows.test.ts` (extended)
- `src/utils/__tests__/autonomyPersistence.test.ts` (extended)
- `src/utils/__tests__/autonomyQueueLifecycle.test.ts`
- `src/utils/__tests__/autonomyRuns.test.ts` (extended)
- `src/utils/processUserInput/__tests__/processSlashCommand.test.ts`
- `tests/integration/autonomy-lifecycle-user-flow.test.ts`
## Docs
- `docs/agent/sur-loop-scheduled-oom.md`: System Understanding Report covering the scheduled/loop OOM problem, the call graphs investigated, and the lifecycle invariants this PR establishes.
- `docs/agent/sur-skill-overflow-bugs.md`: SUR for the related skill-overflow context.
- `docs/internals/autonomy-jira.md`: documents the two-phase commit window and ownership stamping invariants.
- `docs/memory-leak-audit.md`: audit notes covering the REPL/scrollback and skill-search bounds.
## Invariants this PR establishes
1. Queued autonomy prompts are not injected unless the persisted run was successfully claimed.
2. Terminal run/flow states are terminal — completion, failure, and cancellation all finalize state regardless of which provider/error path triggered them.
3. Autonomy run/flow `rootDir` is carried explicitly across detached async boundaries instead of inferred from a shared singleton.
4. State-only CLI subcommands (`autonomy status|runs|flows|flow …`) bypass full interactive bootstrap so they do not hold unrelated handles open.
5. REPL fullscreen scrollback and skill-search/skill-learning session caches are explicitly bounded.
## Validation
```bash
bun run typecheck
CI=true GITHUB_ACTIONS=true bun test # 3996 pass / 0 fail across 305 files
bun test src/__tests__/queryAutonomyProviderBoundary.test.ts \
src/hooks/__tests__/useScheduledTasks.test.ts \
src/utils/__tests__/autonomy{Runs,Flows,Authority,QueueLifecycle,Persistence}.test.ts \
src/utils/processUserInput/__tests__/processSlashCommand.test.ts \
tests/integration/autonomy-lifecycle-user-flow.test.ts
```
## Origin
This PR is the consolidated, upstream-targeted version of two fork-side review PRs (fix/loop-scheduled-autonomy-oom and fix/autonomy-lifecycle). The fork-side review history is preserved at https://github.com/amDosion/claude-code-bast/pull/7 . The fork's own internal `chore: keep fork current with upstream` sync commits and the `docs: update contributors` automation are intentionally not included in this PR.
The autonomy CLI handler `rootDir` threading that the fork added (78f64d8a, 98d04ddb) is intentionally omitted here because upstream `a2cfaf91` (fix: 修复 RemoteTriggerTool 和 autonomy 测试的全量运行失败) already performed the equivalent change with an additional `currentDir` option. Keeping the upstream version avoids regressing that improvement.
* fix: bound agent communication memory growth
UDS messaging now uses private local capabilities instead of exposing auth tokens through SDK metadata, environment variables, session registry, peer listing, or tool output. The receive path bounds NDJSON frames, response buffers, active clients, and pending inbox bytes, and strips auth metadata before messages enter the prompt queue.
Teammate mailboxes now validate file and message sizes, fail closed on corrupt mutation inputs, compact by count and retained bytes, and use stable message identity for in-process acknowledgements. Agent summaries now fork only a bounded recent context using lazy size estimation and content fingerprints instead of retaining or serializing unbounded histories.
Constraint: PR #361 was already merged; this branch is based on upstream/main@c2ac9a74.
Rejected: Default-disabling COORDINATOR_MODE/TEAMMEM only | explicit feature enablement still hit unbounded paths.
Rejected: Persisting UDS auth in SDK/env/session registry | bridge/remote metadata can leak local capability secrets.
Rejected: Inline uds #token addresses | observable/tool/classifier paths can reflect raw addresses outside the UDS request frame.
Rejected: Positional mailbox marking after compaction | compaction can shift indices across the lock boundary.
Confidence: high
Scope-risk: moderate
Directive: Do not expose UDS capability tokens through SDK messages, environment variables, session registry, peer-list output, or SendMessage result/classifier surfaces.
Directive: Do not reintroduce positional mailbox acknowledgements unless compaction is removed or read+mark is atomic under one lock.
Tested: bun test src/utils/__tests__/ndjsonFramer.test.ts src/utils/__tests__/udsMessaging.test.ts packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts
Tested: bunx tsc --noEmit --pretty false
Tested: bun run lint
Tested: bunx biome lint modified src/package files
Tested: bun run test:all (3704 pass, 0 fail, 6734 expects)
Tested: bun audit (No vulnerabilities found)
Tested: bun run build
Tested: bun run build:vite
Tested: git diff --check
Not-tested: End-to-end external UDS client driving a full production headless model turn.
* fix: harden bounded agent communication review fixes
CodeRabbit and Codecov surfaced real gaps in UDS framing, peer discovery, mailbox retention, and summary context coverage. This tightens those paths without suppressing review or coverage signals.
Constraint: PR #369 must address CodeRabbit and Codecov findings without warning suppression or fake fallbacks
Rejected: Suppress Codecov or CodeRabbit warnings | leaves real receive-path and test-isolation gaps
Rejected: Add unreachable feature-gated tests | bun:bundle keeps those branches compile-time gated in local tests
Confidence: high
Scope-risk: moderate
Directive: Keep UDS auth-token rejection outside feature flags; do not reintroduce inline token fallbacks
Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage; bun run test:all; bun run lint; bun run build; bun run build:vite; bun audit; git diff --cached --check
Not-tested: Remote Codecov/CodeRabbit refreshed reports until pushed
* fix: prevent agent communication bounds from hiding CI regressions
Tighten the UDS auth, framing, and response-reader boundaries while keeping the AgentSummary lifecycle covered so Codecov and CI fail on real regressions instead of missing coverage. The poorMode settings mock mirrors unrelated real settings defaults to avoid Bun mock retention changing later permission tests.
Constraint: PR #369 must fix Codecov/CI precisely without warning suppression, fallback masking, or mock pollution
Rejected: Delete AgentSummary lifecycle coverage | would hide Codecov loss and stale-summary behavior
Rejected: Store inline UDS rejection in a hidden input sentinel | cloned observable inputs can drop it and bypass rejection
Rejected: Ignore malformed UDS frames until timeout | leaves client slots and SendMessage calls open to exhaustion
Confidence: high
Scope-risk: moderate
Directive: Keep empty #token= markers rejected; do not require a non-empty token value in hasInlineUdsToken
Tested: bun test packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts src/utils/__tests__/udsMessaging.test.ts src/utils/__tests__/udsResponseReader.test.ts src/utils/__tests__/ndjsonFramer.test.ts
Tested: bunx tsc --noEmit --pretty false
Tested: bun run lint
Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage
Tested: bun run test:all
Tested: bun audit
Tested: bun run build
Tested: bun run build:vite
Not-tested: GitHub-hosted Codecov upload until pushed PR checks rerun
---------
Co-authored-by: unraid <local@unraid.local>
* fix: harden ACP communication boundaries
Harden ACP communication boundaries
Remote ACP sessions now cannot widen permission mode through untrusted
metadata or client payloads. WebSocket ACP ingress measures payloads by bytes
before binary decode, and prompt queue handoff keeps exactly one prompt active
while queued prompts are drained FIFO.
Constraint: ACP remote clients must not be able to open bypassPermissions without local launch intent
Constraint: WebSocket payload limits must be byte-based and checked before binary decode
Rejected: Keep promptToQueryContent wrapper | no production consumers remained after prompt conversion single-sourcing
Confidence: high
Scope-risk: moderate
Directive: Do not re-enable remote bypassPermissions from _meta unless a local launch gate is verified in both acp-link and agent
Tested: targeted ACP/RCS/acp-link prompt queue, bridge, permission, payload, and prompt conversion tests; bun run typecheck; bun run build
Not-tested: Manual live ACP/RCS session against an external client
* fix: restore repository verification gates
Keep the full repository test, typecheck, build, and Biome lint gates usable
after the ACP fix pass. This commit is intentionally separate from the ACP
behavior change: it fixes Windows-safe Langfuse home redaction, removes stale
lint suppressions, resolves Biome warning/info diagnostics, and keeps env
expansion tests explicit without template-placeholder lint noise.
Constraint: The project completion contract requires full typecheck, lint, test, and build evidence
Rejected: Leave warning/info diagnostics as historical noise | they obscure future gate regressions and weaken flow-impact claims
Confidence: high
Scope-risk: narrow
Directive: Keep repository gate cleanup separate from feature fixes when it is not part of the same runtime path
Tested: bunx biome lint src/; bunx tsc --noEmit; bun test src/services/mcp/__tests__/envExpansion.test.ts src/utils/__tests__/sliceAnsi.test.ts src/utils/__tests__/stringUtils.test.ts; bun test; bun run build
Not-tested: Manual Langfuse export against a real external Langfuse service
* fix: harden ACP failure boundaries after review
Deep review found several paths that made ACP communication failures look normal: prompt errors could finish as end_turn, permission pipeline exceptions could fall through to client approval, tool rawInput was deep-copied with JSON, and acp-link accepted unbounded or unvalidated WebSocket payloads. This keeps the behavior fail-closed, validates WS payloads before dispatch, caps payload size before JSON parse, and preserves cancellation intent with a generation counter.
Constraint: User explicitly rejected pseudo-fixes, fallback behavior, and unbounded payload handling
Rejected: Keep JSON stringify/parse rawInput copy | duplicates large payloads and silently drops non-JSON inputs
Rejected: Delegate permission pipeline errors to client approval | allows a broken local permission check to be bypassed
Confidence: high
Scope-risk: moderate
Directive: Do not convert ACP errors into normal end_turn responses without a protocol-level reason and regression tests
Tested: bun test src/services/acp/__tests__/agent.test.ts src/services/acp/__tests__/bridge.test.ts src/services/acp/__tests__/permissions.test.ts
Tested: bun test packages/acp-link/src/__tests__/server.test.ts
Tested: bunx tsc --noEmit
Tested: bunx biome lint src/ packages/acp-link/src/
Tested: bun run test:all
Tested: bun run build
Not-tested: Manual end-to-end ACP client session over a real editor WebSocket
* fix: prevent ACP coverage runs from seeing partial mocks
GitHub Actions failed under bun test --coverage because permissions.test.ts replaced ../bridge.js with a partial mock that omitted forwardSessionUpdates. Coverage worker ordering on Linux let sibling tests observe that incomplete module.
This isolates ACP test mocks by snapshotting real exports, overriding only requested symbols, and restoring mocks in LIFO order. The shared helper also keeps the same behavior in agent.test.ts without duplicating mock infrastructure.
Constraint: bun:test mock.module is process-global inside a worker.
Rejected: Add fallback exports or production guards | the bridge export exists; the failure was test mock pollution.
Rejected: Keep per-file helper copies | duplication would let restore semantics drift again.
Confidence: high
Scope-risk: narrow
Directive: Prefer safeMockModule for partial mocks of real modules in ACP tests; plain mock.module is only appropriate for fully synthetic modules or isolated tests.
Tested: bun test src/services/acp/__tests__/agent.test.ts src/services/acp/__tests__/bridge.test.ts src/services/acp/__tests__/permissions.test.ts
Tested: bun test --coverage --coverage-reporter=lcov
Tested: bunx tsc --noEmit
Tested: bun run lint
Tested: git diff --check
Not-tested: Linux runner directly before push
* fix: normalize ACP bypass requests without warning noise
The previous CI repair removed the failing partial bridge mock, but it also added a shared safeMockModule helper and left the acp-link bypass normalization warning in the real new_session path.
This tightens the fix: acp-link now treats an unauthorized client bypass request as normal permission-mode normalization without emitting a warning, and the ACP permission test explicitly preserves the real bridge and permission exports instead of using a shared helper. The agent test keeps its local mock preservation but names it by behavior and restores mocks in LIFO order.
Constraint: CI output should not contain expected warning noise for covered policy branches.
Rejected: Silence the test only | the normal new_session path would still warn for an expected normalization branch.
Rejected: Keep the shared safeMockModule helper | the failing module was specific and should be fixed by preserving real exports at the mocking site.
Confidence: high
Scope-risk: narrow
Directive: Treat client-requested bypassPermissions as data to normalize unless the local default explicitly enables bypass.
Tested: bun test packages/acp-link/src/__tests__/server.test.ts
Tested: bun test src/services/acp/__tests__/agent.test.ts src/services/acp/__tests__/bridge.test.ts src/services/acp/__tests__/permissions.test.ts
Tested: bun test --coverage --coverage-reporter=lcov with UPPER_WARN_COUNT=0
Tested: bun run test:all
Tested: bun run lint
Tested: bunx tsc --noEmit
Tested: git diff --check
* fix: harden ACP bypass and CI warning gates
ACP clients must not be able to enter bypassPermissions unless the local ACP gate and process environment both allow it. The same gate now controls session creation, explicit mode changes, and the ExitPlanMode option list, while session setup restores process.cwd so coverage and later work do not inherit ACP session state.
Constraint: CI must stay warning-clean without hiding real ACP permission failures
Rejected: Logging rejected bypass requests on the normal new_session path | it preserves audit text but reintroduces warning noise the runtime should not emit
Rejected: Broad CI=true postinstall skip | it hides explicit Chrome MCP setup checks outside the install path
Confidence: high
Scope-risk: moderate
Directive: Keep bypassPermissions gated through one ACP availability decision before exposing it to clients
Tested: bun test src/services/acp/__tests__/permissions.test.ts src/services/acp/__tests__/agent.test.ts packages/acp-link/src/__tests__/server.test.ts
Tested: bun run test:all
Tested: bun run lint
Tested: bun run build:vite with zero warning matches
Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage produced non-empty lcov with SF records and zero filtered warning matches
Not-tested: GitHub Actions result after this push
* fix: remove remaining CI warning noise
The CI log still had three non-failing warnings after the ACP hardening commit: git init default-branch advice from checkout, a Node 20 action-runtime deprecation, and one additional known Vite dynamic-import diagnostic that only surfaced on Linux. The workflow now provides explicit git config and opts actions into Node 24, while Vite keeps a narrow allowlist for acknowledged optimizer diagnostics.
Constraint: Do not use shell log filtering to hide warnings after they happen
Rejected: Grep warning lines out of CI output | it would make future diagnostics harder to find
Confidence: high
Scope-risk: narrow
Directive: Add new Vite warning allowlist entries only after checking that they are existing optimizer diagnostics, not new application defects
Tested: bunx tsc --noEmit --pretty false
Tested: bunx biome lint .github/workflows/ci.yml vite.config.ts
Tested: bun run build:vite with zero warning matches
Not-tested: GitHub Actions result after this push
* fix: reject unauthorized ACP bypass and harden CI actions
ACP clients now fail closed when permissionMode is malformed, unknown, or requests bypass without a local bypass opt-in. acp-link validates new_session input before forwarding to the agent and returns client error frames for expected unauthorized requests without logging create-failed noise. The direct AcpAgent path independently rejects invalid _meta.permissionMode and unauthorized bypass instead of falling back to settings.
CI workflows and generated GitHub App templates now use Node 24-compatible actions pinned to immutable commit SHAs, and acp-link startup output no longer prints the auth token.
Constraint: Must not hide warnings with test isolation or log filtering
Rejected: Silent fallback to local permission mode | accepts invalid client intent and masks boundary behavior
Rejected: Broad dependency churn from bun update | audit remained failing while package and lockfile churn expanded scope
Confidence: high
Scope-risk: moderate
Directive: Client-provided permissionMode must stay fail-closed before reaching AcpAgent; only local settings.defaultMode may fall back to default on invalid local config
Tested: bun test packages/acp-link/src/__tests__/server.test.ts src/services/acp/__tests__/agent.test.ts src/services/acp/__tests__/permissions.test.ts src/services/skillLearning/__tests__/skillLifecycle.test.ts src/utils/settings/__tests__/config.test.ts
Tested: bunx tsc -p packages/acp-link/tsconfig.json --noEmit --pretty false
Tested: bunx tsc --noEmit --pretty false
Tested: bun run lint
Tested: bun run test:all
Tested: local CI equivalent install/typecheck/coverage/build with warning_scan=0
Not-tested: Pre-existing bun audit vulnerabilities require a separate dependency-hardening PR
* fix: resolve dependency audit findings precisely
Use dependency-native upgrades and lockfile resolution to close the audit findings without suppressions. Keep the chrome MCP setup aligned with the new dependency graph and add real integration coverage so the override behavior stays verified.
Constraint: no audit ignores or warning suppression
Rejected: broad google-auth/protobuf overrides | replaced with upstream-compatible resolution
Confidence: high
Scope-risk: moderate
Directive: keep dependency fixes upstream-compatible; do not reintroduce blanket overrides unless the audit surface changes materially
Tested: bun audit; bun audit --json; bun install --frozen-lockfile with CLAUDE_CODE_SKIP_CHROME_MCP_SETUP=1; bunx tsc --noEmit --pretty false; bun run lint; targeted tests; bun run test:all; bun test --coverage --coverage-reporter lcov --coverage-dir coverage; bun run build:vite
Not-tested: unrelated pre-existing ACP/CORS/token fallback residual risks
* fix: keep ACP auth tokens out of URLs
Replace the ad hoc URL-token flow with crypto UUID-backed transport identifiers so the bearer token stays in structured request data instead of query strings. Keep the server, web client, and transport helpers aligned so the ACP/RCS handshake remains compatible after the API shape change.
Constraint: token must not be embedded in the URL
Rejected: token-as-uuid query fallback | leaked bearer tokens in URLs
Confidence: high
Scope-risk: moderate
Directive: preserve the structured auth path; do not reintroduce query-token fallback when adjusting ACP transport code
Tested: targeted ACP/RCS transport tests
Not-tested: unrelated pre-existing ACP/CORS/token fallback residual risks
* fix: normalize WebFetch request headers
Normalize WebFetch headers before dispatch so canonicalization preserves auth semantics and duplicate forms do not slip through. Keep the behavior locked with a focused header test instead of broadening the request pipeline.
Constraint: preserve header semantics without widening the fetch surface
Rejected: ad hoc caller-side normalization | too easy to bypass in future call sites
Confidence: high
Scope-risk: narrow
Directive: keep header normalization close to the WebFetch utility so future callers inherit the same behavior automatically
Tested: targeted WebFetch header tests
Not-tested: unrelated fetch backend behavior beyond header normalization
* fix: harden ACP remote auth surfaces
Tighten the remaining Claude security artifact items by requiring API keys on ACP global reads and relay upgrades, moving WebSocket tokens out of URLs, and replacing open web CORS with an explicit allowlist.
Constraint: Browser WebSocket clients cannot set arbitrary Authorization headers, so the token is carried in a selected subprotocol instead of a query string.
Rejected: Keep UUID auth for ACP channel groups | any caller can mint a UUID and read global ACP data.
Rejected: Preserve ?token= compatibility | secrets leak into logs, history, referrers, and intermediaries.
Confidence: high
Scope-risk: moderate
Directive: Do not reintroduce query-string bearer tokens; use Authorization or rcs.auth.<base64url-token>.
Tested: bunx tsc --noEmit --pretty false
Tested: bun run typecheck in packages/remote-control-server
Tested: bun run build in packages/acp-link
Tested: bun run lint
Tested: bun audit
Tested: focused RCS/acp-link/web tests, 160 pass
Tested: Edge headless browser WebSocket subprotocol handshake
Tested: bun run test:all, 3669 pass
Tested: bun run build:vite
Tested: bun run build
Not-tested: Manual end-to-end relay with a live external ACP agent
* fix: resolve CI dependency override lookup
The CI runner does not expose @grpc/proto-loader as a root-resolvable package, and the test was relying on local hoisting rather than the real dependency owner. Resolve proto-loader through @opentelemetry/exporter-trace-otlp-grpc and @grpc/grpc-js so the smoke test follows the package graph it is validating.
Constraint: Do not add a new root dependency for a transitive smoke test.
Rejected: Skip or weaken the test | the test protects the protobuf 7 override path and should keep exercising loadSync.
Rejected: Add @grpc/proto-loader directly to root package.json | that hides the owning-package resolution issue and broadens dependency surface.
Confidence: high
Scope-risk: narrow
Directive: Dependency override smoke tests should resolve from the package that actually owns the dependency, not from incidental root hoisting.
Tested: bun test tests/integration/dependency-overrides.test.ts; bunx tsc --noEmit --pretty false; bun run lint; bun audit; bun run test:all; git diff --check
---------
Co-authored-by: unraid <local@unraid.local>