# OpenClaw Autonomy Baseline Test Spec

## Purpose

This test spec locks the current behavior of the existing trigger and context layers before any formal autonomy-subsystem implementation begins.

At this stage, production code is read-only. Only test files, fixtures, and planning documents may change.

## Goal

Establish a stable baseline around the parts of `claude-code-best` that later autonomy work is most likely to touch:

- proactive state handling
- cron task storage semantics
- cron scheduler helper semantics
- user-context cache and `CLAUDE.md` injection behavior

## Out of Scope for This Baseline Round

- New authority behavior (`AGENTS.md` / `HEARTBEAT.md`)
- New detached-run ledger behavior
- New flow behavior
- UI redesign

## Files Under Baseline Protection

- `src/proactive/index.ts`
- `src/utils/cronTasks.ts`
- `src/utils/cronScheduler.ts`
- `src/context.ts`

## Test Files Added In This Round

- `src/proactive/__tests__/state.baseline.test.ts`
- `src/commands/__tests__/proactive.baseline.test.ts`
- `src/utils/__tests__/cronTasks.baseline.test.ts`
- `src/utils/__tests__/cronScheduler.baseline.test.ts`
- `src/__tests__/context.baseline.test.ts`

## Baseline Assertions

### Proactive state

1. Activating proactive mode sets active state and activation source.
2. Pausing proactive mode suppresses `shouldTick()` and clears `nextTickAt`.
3. Blocking context suppresses `shouldTick()` and clears `nextTickAt`.
4. Subscribers are notified on state transitions.
5. The `/proactive` command enables proactive mode and emits the expected hidden reminder.
6. The `/proactive` command disables proactive mode on the second invocation.

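
The state semantics in items 1 to 4 above can be sketched with a small in-memory model. This is an illustration only: apart from `shouldTick()` and `nextTickAt`, every name here (`createProactiveModel`, `activate`, `pause`, `setContextBlocked`, `setNextTickAt`, `subscribe`) is a hypothetical stand-in, and the real `src/proactive/index.ts` API may differ.

```ts
// Hypothetical stand-in for the proactive state module. Only `shouldTick`
// and `nextTickAt` are names taken from the spec; the rest is assumed.
type ProactiveState = {
  active: boolean;
  paused: boolean;
  contextBlocked: boolean;
  activationSource: string | null;
  nextTickAt: number | null;
};

function createProactiveModel() {
  const state: ProactiveState = {
    active: false,
    paused: false,
    contextBlocked: false,
    activationSource: null,
    nextTickAt: null,
  };
  const listeners = new Set<() => void>();
  const notify = (): void => {
    listeners.forEach((listener) => listener());
  };

  return {
    // Returns a copy so callers cannot mutate the model directly.
    getState: (): ProactiveState => ({ ...state }),
    // Registers a listener and returns an unsubscribe function.
    subscribe(listener: () => void): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
    activate(source: string): void {
      state.active = true;
      state.activationSource = source;
      notify();
    },
    setNextTickAt(timestamp: number): void {
      state.nextTickAt = timestamp;
    },
    // Pausing clears the pending tick.
    pause(): void {
      state.paused = true;
      state.nextTickAt = null;
      notify();
    },
    // A blocking context clears the pending tick as well.
    setContextBlocked(blocked: boolean): void {
      state.contextBlocked = blocked;
      if (blocked) state.nextTickAt = null;
      notify();
    },
    shouldTick: (): boolean =>
      state.active && !state.paused && !state.contextBlocked,
  };
}
```

A baseline test would drive the real module through the same transitions and assert only on observable state, so that no production code has to change.
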
### Cron task storage

1. Session-only cron tasks remain in memory only.
2. Durable cron tasks are persisted to `.claude/scheduled_tasks.json`.
3. Daemon-style `dir`-scoped reads exclude session-only cron tasks.
4. `removeCronTasks()` without `dir` can remove session-only tasks.
5. `removeCronTasks()` with `dir` does not mutate session-only task storage.

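
The split between session-only and durable storage described above can be modeled as follows. This is a sketch under stated assumptions: the disk is replaced by an in-memory `Map`, `PROJECT_ROOT` is an invented default directory, and `addTaskModel`, `listTasksModel`, and `removeTasksModel` are hypothetical names; the real signatures in `src/utils/cronTasks.ts` are not shown in this spec and may differ.

```ts
// Hypothetical model of the storage split. The real module's signatures
// are not shown in this spec and may differ.
type CronTask = { id: string; cron: string; prompt: string; durable: boolean };

const PROJECT_ROOT = "/project"; // assumed default directory
const fakeDisk = new Map<string, string>(); // stand-in for the file system
const sessionTasks: CronTask[] = []; // session-only tasks never reach the disk

const storePath = (dir: string): string =>
  `${dir}/.claude/scheduled_tasks.json`;

function readDurable(dir: string): CronTask[] {
  const raw = fakeDisk.get(storePath(dir));
  return raw === undefined ? [] : (JSON.parse(raw) as CronTask[]);
}

function addTaskModel(task: CronTask, dir: string = PROJECT_ROOT): void {
  if (task.durable) {
    fakeDisk.set(storePath(dir), JSON.stringify([...readDurable(dir), task]));
  } else {
    sessionTasks.push(task);
  }
}

// With `dir` (daemon-style read): durable tasks only.
// Without `dir`: durable tasks plus this session's in-memory tasks.
function listTasksModel(dir?: string): CronTask[] {
  if (dir !== undefined) return readDurable(dir);
  return [...readDurable(PROJECT_ROOT), ...sessionTasks];
}

function removeTasksModel(ids: string[], dir?: string): void {
  const target = dir ?? PROJECT_ROOT;
  const keptDurable = readDurable(target).filter(
    (task) => ids.indexOf(task.id) === -1,
  );
  fakeDisk.set(storePath(target), JSON.stringify(keptDurable));
  // Only a call without `dir` is allowed to touch session-only storage.
  if (dir === undefined) {
    const keptSession = sessionTasks.filter(
      (task) => ids.indexOf(task.id) === -1,
    );
    sessionTasks.length = 0;
    sessionTasks.push(...keptSession);
  }
}
```

Keying the behavior on whether `dir` is passed keeps the daemon path from ever observing or mutating per-session state, which is the property items 3 to 5 pin down.
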
### Cron scheduler helpers

1. `isRecurringTaskAged()` preserves current aging semantics.
2. `buildMissedTaskNotification()` preserves the current AskUserQuestion safety wording.
3. `buildMissedTaskNotification()` preserves code-fence hardening for prompt bodies that contain backticks.

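
One common way to harden a code fence against bodies that contain backticks is to make the fence longer than the longest backtick run in the body. The sketch below illustrates that general technique; it is an assumption that `buildMissedTaskNotification()` works along these lines, and `longestBacktickRun` and `wrapInHardenedFence` are hypothetical names.

```ts
// Illustrative fence hardening. Not the actual implementation of
// buildMissedTaskNotification(); the helper names are hypothetical.
function longestBacktickRun(text: string): number {
  const runs: string[] = text.match(/`+/g) ?? [];
  let longest = 0;
  for (const run of runs) {
    longest = Math.max(longest, run.length);
  }
  return longest;
}

// Wraps `body` in a fence that is at least three backticks long and always
// one backtick longer than any run inside the body, so the body cannot
// close the fence early.
function wrapInHardenedFence(body: string): string {
  const fenceLength = Math.max(3, longestBacktickRun(body) + 1);
  const fence = "`".repeat(fenceLength);
  return `${fence}\n${body}\n${fence}`;
}
```

A baseline test for item 3 can then feed a prompt body containing a three-backtick run and assert that the opening and closing fences of the notification are longer than that run.
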
### User context caching

1. `getUserContext()` includes `currentDate`.
2. `getUserContext()` includes mocked `claudeMd` content when memory loading is enabled.
3. `CLAUDE_CODE_DISABLE_CLAUDE_MDS` suppresses `claudeMd`.
4. `setSystemPromptInjection()` clears the memoized user-context cache.
5. `getSystemContext()` reflects the injection after cache invalidation.

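
The memoize-then-invalidate pattern these assertions describe can be sketched as below. The `*Model` functions, the `cache` object, and `getLoadCount` are hypothetical; the environment is passed in as a plain object instead of being read from the process; and treating any non-empty value of `CLAUDE_CODE_DISABLE_CLAUDE_MDS` as "disabled" is an assumption about how the flag is parsed.

```ts
// Hypothetical model of the user-context cache in `src/context.ts`.
type UserContext = { currentDate: string; claudeMd?: string };
type Env = Record<string, string | undefined>;

const cache: {
  userContext: UserContext | null;
  injection: string | null;
  loads: number;
} = { userContext: null, injection: null, loads: 0 };

function getUserContextModel(
  env: Env,
  loadClaudeMd: () => string,
): UserContext {
  // Memoized: the expensive path runs only when the cache is empty.
  if (cache.userContext !== null) return cache.userContext;
  cache.loads += 1;
  const context: UserContext = {
    currentDate: new Date().toISOString().slice(0, 10),
  };
  // Assumption: any non-empty flag value disables CLAUDE.md loading.
  if (!env["CLAUDE_CODE_DISABLE_CLAUDE_MDS"]) {
    context.claudeMd = loadClaudeMd();
  }
  cache.userContext = context;
  return context;
}

function setSystemPromptInjectionModel(value: string | null): void {
  cache.injection = value;
  cache.userContext = null; // invalidate the memoized user context
}

function getSystemContextModel(): { injection: string | null } {
  return { injection: cache.injection };
}

// Exposes how often the uncached path ran, for cache-hit assertions.
function getLoadCount(): number {
  return cache.loads;
}
```

Counting loads gives the test a direct signal for cache hits and invalidation, without relying on timing or on inspecting the cache itself.
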
## Remaining Baseline Gaps

The following areas are intentionally deferred because they require higher-cost harnessing and should still avoid production-code changes:

1. `useScheduledTasks.ts` hook-level runtime behavior
2. `src/cli/print.ts` full headless scheduler loop behavior
3. `useProactive.ts` hook timer behavior
4. end-to-end queue interaction between proactive ticks and `SleepTool`

## Acceptance

This baseline round is complete when:

1. The five new test files listed above pass.
2. No production source files are modified.
3. The tests are stable enough to serve as a pre-implementation guardrail.