mirror of
https://github.com/claude-code-best/claude-code.git
synced 2026-06-18 22:35:51 +00:00
feat: integrate 5 feature branches, upstream fixes, and MIME detection fix
Squashed 5 commits:

Features (from 5 feature branches):
- MCP fix, pipe mute, stub recovery
- KAIROS activation, openclaw autonomy
- Daemon/job command hierarchy + cross-platform bg engine

Upstream fixes:
- fix: Bun.hash compatibility
- chore: chrome dependency update
- docs: browser support guide

MIME detection fix:
- Screenshot detectMimeFromBase64(): decode raw bytes from base64 instead of broken charCodeAt comparison
- Fixes API 400 on Windows (JPEG) and macOS (PNG) screenshots
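The MIME detection fix can be illustrated with a minimal sketch. It is not the repository's `detectMimeFromBase64()`; the names `decodeLeadingBytes`, `sniffImageMime`, and `SIGNATURES` are hypothetical, and the input is assumed to be raw base64 without a `data:` prefix. The idea is that file signatures are defined on raw bytes, so the base64 text has to be decoded before comparing. Reading `charCodeAt` on the encoded string only yields ASCII codes of base64 characters, which presumably is why the earlier comparison never matched.

```typescript
// Hypothetical sketch: sniff an image MIME type from the first decoded bytes.
type ImageMime = "image/png" | "image/jpeg" | "image/gif";

const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Well-known file signatures (magic numbers), expressed as raw bytes.
const SIGNATURES: Array<{ mime: ImageMime; bytes: number[] }> = [
  { mime: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47] },
  { mime: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
  { mime: "image/gif", bytes: [0x47, 0x49, 0x46, 0x38] },
];

// Decode at most `maxBytes` bytes from the start of a base64 string.
// Stops at padding or at any character outside the standard alphabet.
function decodeLeadingBytes(base64: string, maxBytes: number): number[] {
  const bytes: number[] = [];
  let buffer = 0;
  let bitCount = 0;
  for (let i = 0; i < base64.length && bytes.length < maxBytes; i++) {
    const value = BASE64_ALPHABET.indexOf(base64.charAt(i));
    if (value < 0) break;
    // Keep only the low 12 bits; at most 12 bits are pending at any time.
    buffer = ((buffer << 6) | value) & 0xfff;
    bitCount += 6;
    if (bitCount >= 8) {
      bitCount -= 8;
      bytes.push((buffer >> bitCount) & 0xff);
    }
  }
  return bytes;
}

// Returns the matching MIME type, or null when no signature matches.
function sniffImageMime(base64: string): ImageMime | null {
  const head = decodeLeadingBytes(base64, 8);
  for (const { mime, bytes } of SIGNATURES) {
    if (bytes.every((b, i) => head[i] === b)) return mime;
  }
  return null;
}
```

The hand-written decoder only keeps the sketch free of runtime dependencies; in a Bun or Node environment, `atob` or `Buffer.from(data, "base64")` would normally do the decoding.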
88
docs/test-plans/openclaw-autonomy-baseline.md
Normal file
@@ -0,0 +1,88 @@
# OpenClaw Autonomy Baseline Test Spec

## Purpose

This test spec locks the current behavior of the existing trigger and context layers before any formal autonomy-subsystem implementation begins.

At this stage, production code is read-only. Only test files, fixtures, and planning documents may change.

## Goal

Establish a stable baseline around the parts of `Claude-code-bast` that later autonomy work is most likely to touch:

- proactive state handling
- cron task storage semantics
- cron scheduler helper semantics
- user-context cache and `CLAUDE.md` injection behavior

## Out of Scope for This Baseline Round

- New authority behavior (`AGENTS.md` / `HEARTBEAT.md`)
- New detached-run ledger behavior
- New flow behavior
- UI redesign

## Files Under Baseline Protection

- `src/proactive/index.ts`
- `src/utils/cronTasks.ts`
- `src/utils/cronScheduler.ts`
- `src/context.ts`

## Test Files Added In This Round

- `src/proactive/__tests__/state.baseline.test.ts`
- `src/commands/__tests__/proactive.baseline.test.ts`
- `src/utils/__tests__/cronTasks.baseline.test.ts`
- `src/utils/__tests__/cronScheduler.baseline.test.ts`
- `src/__tests__/context.baseline.test.ts`

## Baseline Assertions

### Proactive state

1. Activating proactive mode sets active state and activation source.
2. Pausing proactive mode suppresses `shouldTick()` and clears `nextTickAt`.
3. Blocking context suppresses `shouldTick()` and clears `nextTickAt`.
4. Subscribers are notified on state transitions.
5. The `/proactive` command enables proactive mode and emits the expected hidden reminder.
6. The `/proactive` command disables proactive mode on the second invocation.
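As a rough illustration of the state rules in assertions 1 to 4, the following sketch models a proactive store with subscribers. It is not the code in `src/proactive/index.ts`; `createProactiveStore`, the field names other than `nextTickAt`, and the method names other than `shouldTick` are assumptions made for the example.

```typescript
// Hypothetical model of the proactive state rules; not the production module.
interface ProactiveState {
  active: boolean;
  paused: boolean;
  contextBlocked: boolean;
  activationSource: string | null;
  nextTickAt: number | null;
}

function createProactiveStore() {
  let state: ProactiveState = {
    active: false,
    paused: false,
    contextBlocked: false,
    activationSource: null,
    nextTickAt: null,
  };
  const listeners = new Set<() => void>();

  // Apply a partial update and notify every subscriber.
  const update = (patch: Partial<ProactiveState>): void => {
    state = { ...state, ...patch };
    listeners.forEach((listener) => listener());
  };

  return {
    getState: (): ProactiveState => state,
    subscribe(listener: () => void): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
    activate: (source: string): void =>
      update({ active: true, activationSource: source }),
    setNextTickAt: (at: number | null): void => update({ nextTickAt: at }),
    // Pausing clears the pending tick so nothing fires while paused.
    pause: (): void => update({ paused: true, nextTickAt: null }),
    resume: (): void => update({ paused: false }),
    // Blocking context clears the pending tick as well.
    setContextBlocked: (blocked: boolean): void =>
      update(
        blocked
          ? { contextBlocked: true, nextTickAt: null }
          : { contextBlocked: false },
      ),
    shouldTick: (): boolean =>
      state.active && !state.paused && !state.contextBlocked,
  };
}
```

A baseline test against the real module would follow the same shape: drive one transition, then assert on `shouldTick()`, `nextTickAt`, and the subscriber call count.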

### Cron task storage

1. Session-only cron tasks remain in memory only.
2. Durable cron tasks are persisted to `.claude/scheduled_tasks.json`.
3. Daemon-style `dir`-scoped reads exclude session-only cron tasks.
4. `removeCronTasks()` without `dir` can remove session-only tasks.
5. `removeCronTasks()` with `dir` does not mutate session-only task storage.
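The scoping rules above can be sketched with a small in-memory model. This is an assumption-laden illustration, not the logic in `src/utils/cronTasks.ts`: `createCronTaskStore` and its methods are hypothetical, and a `Map` keyed by directory stands in for the per-project `.claude/scheduled_tasks.json` file so the example runs without touching the filesystem.

```typescript
// Hypothetical model of session-only vs durable cron task scoping.
interface CronTask {
  id: string;
  cron: string;
  prompt: string;
  durable: boolean;
}

function createCronTaskStore(defaultDir: string) {
  // Session-only tasks live in memory for the current process only.
  let sessionTasks: CronTask[] = [];
  // Stand-in for one scheduled_tasks.json file per project directory.
  const durableByDir = new Map<string, CronTask[]>();

  return {
    add(task: CronTask): void {
      if (task.durable) {
        const existing = durableByDir.get(defaultDir) ?? [];
        durableByDir.set(defaultDir, [...existing, task]);
      } else {
        sessionTasks = [...sessionTasks, task];
      }
    },
    // With an explicit dir (daemon-style read), only durable tasks are visible.
    list(dir?: string): CronTask[] {
      const durable = durableByDir.get(dir ?? defaultDir) ?? [];
      return dir === undefined ? [...durable, ...sessionTasks] : [...durable];
    },
    // With an explicit dir, session-only storage is left untouched.
    remove(ids: string[], dir?: string): void {
      const key = dir ?? defaultDir;
      const durable = durableByDir.get(key) ?? [];
      durableByDir.set(key, durable.filter((t) => !ids.includes(t.id)));
      if (dir === undefined) {
        sessionTasks = sessionTasks.filter((t) => !ids.includes(t.id));
      }
    },
  };
}
```

The point of the model is the asymmetry: a call that passes `dir` behaves like an external daemon that has no view of the current session, while a call without `dir` sees and can mutate both stores.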

### Cron scheduler helpers

1. `isRecurringTaskAged()` preserves current aging semantics.
2. `buildMissedTaskNotification()` preserves the current AskUserQuestion safety wording.
3. `buildMissedTaskNotification()` preserves code-fence hardening for prompt bodies that contain backticks.
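For readers unfamiliar with code-fence hardening, one common approach is to open the fence with more backticks than the longest backtick run inside the body, so the body cannot close the fence early. The sketch below shows that technique under the assumption that it resembles what `buildMissedTaskNotification()` does; the source does not confirm the exact mechanism, and `fenceFor` and `wrapPrompt` are hypothetical helpers.

```typescript
// Hypothetical sketch of fence hardening for untrusted prompt bodies.

// Pick a fence that is longer than any backtick run in the body
// and never shorter than the usual three backticks.
function fenceFor(body: string): string {
  const runs = body.match(/`+/g) ?? [];
  const longest = runs.reduce((max, run) => Math.max(max, run.length), 0);
  return "`".repeat(Math.max(3, longest + 1));
}

// Wrap the body so that embedded backtick runs stay inside the block.
function wrapPrompt(body: string): string {
  const fence = fenceFor(body);
  return [fence, body, fence].join("\n");
}
```

Under CommonMark rules a closing fence must be at least as long as the opening one, so a body containing a run of three backticks gets a four-backtick fence and stays a single block.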

### User context caching

1. `getUserContext()` includes `currentDate`.
2. `getUserContext()` includes mocked `claudeMd` content when memory loading is enabled.
3. `CLAUDE_CODE_DISABLE_CLAUDE_MDS` suppresses `claudeMd`.
4. `setSystemPromptInjection()` clears the memoized user-context cache.
5. `getSystemContext()` reflects the injection after cache invalidation.
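The caching behavior in these assertions can be pictured as two memoized getters that share one invalidation point. The sketch below is a simplified, synchronous model and not the code in `src/context.ts`: `createContextCache`, the injected `loadClaudeMd` loader, the `env` parameter, the `systemPromptInjection` field, and the truthiness check on the environment variable are all assumptions for illustration.

```typescript
// Hypothetical model of memoized context getters with shared invalidation.
interface UserContext {
  currentDate: string;
  claudeMd?: string;
}

interface SystemContext {
  systemPromptInjection?: string;
}

function createContextCache(
  loadClaudeMd: () => string | null,
  env: Record<string, string | undefined>,
) {
  let injection: string | null = null;
  let cachedUser: UserContext | null = null;
  let cachedSystem: SystemContext | null = null;

  return {
    getUserContext(): UserContext {
      if (cachedUser) return cachedUser;
      const context: UserContext = {
        currentDate: new Date().toISOString().slice(0, 10),
      };
      // Assumption: any non-empty value of the variable disables loading.
      if (!env["CLAUDE_CODE_DISABLE_CLAUDE_MDS"]) {
        const claudeMd = loadClaudeMd();
        if (claudeMd) context.claudeMd = claudeMd;
      }
      cachedUser = context;
      return context;
    },
    getSystemContext(): SystemContext {
      if (cachedSystem) return cachedSystem;
      cachedSystem = injection ? { systemPromptInjection: injection } : {};
      return cachedSystem;
    },
    // Changing the injection drops both memoized values.
    setSystemPromptInjection(value: string | null): void {
      injection = value;
      cachedUser = null;
      cachedSystem = null;
    },
  };
}
```

Passing the loader and environment in as parameters mirrors how the baseline tests mock `claudeMd` content: the test controls what is loaded and can count how often the loader runs to confirm that memoization and invalidation behave as expected.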

## Remaining Baseline Gaps

The following areas are intentionally deferred because they require higher-cost harnessing and should still avoid production-code changes:

1. `useScheduledTasks.ts` hook-level runtime behavior
2. `src/cli/print.ts` full headless scheduler loop behavior
3. `useProactive.ts` hook timer behavior
4. end-to-end queue interaction between proactive ticks and `SleepTool`

## Acceptance

This baseline round is complete when:

1. The five new test files pass.
2. No production source files are modified.
3. The tests are stable enough to serve as a pre-implementation guardrail.