Files
claude-code/docs/internals/simplify-findings-2026-04-17.md
unraid 95fece4b51 feat: 整合功能恢复与技能学习闭环(含 ECC v2.1 parity + Opus 4.7 接入 + prompt 工程优化)
主要变更:
- Skill Learning 闭环系统 (9/9 AC)
- Opus 4.7 模型层接入 + adaptive thinking
- Prompt 工程优化 (64 审计测试)
- Agent Teams 简化门控 (默认启用)
- Windows Terminal 后端修复 (EncodedCommand/WT_SESSION)
- TF-IDF 技能搜索精准化 (字段加权/CJK 优化)
- Autonomy 系统 (/autonomy 命令)
- ACP 协议完整实现
- mock.module 泄漏修复 (CI 全绿)
- 152+ lint/type 修复
2026-04-22 16:07:42 +08:00

17 KiB
Raw Blame History

Simplify Review Findings — 2026-04-17

Base commit: 5b9943b3 on chore/lint-cleanup Three parallel review agents (reuse / quality / efficiency) audited the skill-learning sprint's new or heavily-changed files. 30 findings total.

Fix attempt in the same session was reverted by an unidentified post-write mechanism (git status remained clean after every Edit call). This document preserves the findings so a future session can apply them when the revert source is identified.

Files reviewed

  • src/services/skillLearning/ — runtimeObserver, toolEventObserver, llmObserverBackend, observerBackend, instinctStore, skillGapStore, skillLifecycle, evolution, skillGenerator, commandGenerator, agentGenerator, learningPolicy, promotion, observationStore, sessionObserver, instinctParser, projectContext, featureCheck
  • src/services/skillSearch/prefetch.ts, localSearch.ts
  • src/commands/skill-learning/skill-learning.ts
  • src/services/tools/toolExecution.ts (AC1 wire only)
  • scripts/verify-skill-learning-e2e.ts

Section A — Reuse findings (8)

A1 · Duplicate of extractTextContent

runtimeObserver.ts:301-312 has textFromContent(content: unknown) that maps + filters over ContentBlock[] to join text. The project already exports extractTextContent / getContentText from src/utils/messages.ts:3011-3031. The new helper only exists because it takes unknown; a narrow as ContentBlockParam[] at the callsite lets the utility handle it.

A2 · extractWords copied between command and agent generators

commandGenerator.ts:139-167 is byte-identical to agentGenerator.ts:137-164 except for a two-entry difference in the stop-word set. Both share 80% of the loop body with learningPolicy.buildLearnedSkillName (learningPolicy.ts:38-47). Extract a extractInstinctWords(instincts, { stopWords }) helper, ideally placed next to the existing policy exports.

A3 · averageConfidence computed inline in four places

commandGenerator.ts:132-137, agentGenerator.ts:130-135, skillGenerator.ts:36-38, plus the same reduce shape inside learningPolicy.shouldGenerateSkillFromInstincts (lines 29-32). Expose a single averageInstinctConfidence(instincts) helper.

A4 · Frontmatter template triplicated across generators

skillGenerator.ts:171-179, commandGenerator.ts:104-111, agentGenerator.ts:102-109 all emit the same 7-line frontmatter (name / description / origin / confidence / evolved_from). A future schema change has to touch three files. Extract buildLearnedArtifactFrontmatter({ name, description, confidence, sourceIds }).

A5 · Inline createHash() instead of src/utils/hash.ts

instinctParser.ts:69-72, observationStore.ts:434-435, projectContext.ts:234, skillGapStore.ts:466-468 all hand-roll createHash('sha1'|'sha256').update(x).digest('hex'). hashContent in src/utils/hash.ts:19-46 already does this with Bun's faster non-cryptographic hash; the four call sites are dedup-style uses where cryptographic strength isn't required. Note: verify semantic equivalence before swapping — Bun.hash output differs from SHA-256, so any persisted IDs need a one-shot migration or a cutover version bump.

A6 · Defensive createObservationId fallback is dead code

observationStore.ts:427-432 feature-detects crypto.randomUUID, but Bun + Node ≥18 always have it. Other files in the same directory (toolEventObserver.ts:72, runtimeObserver.ts:253/265/279/288) call it directly. Internal inconsistency.

A7 · projectContext.ts re-implements src/utils/git.ts

projectContext.ts:72-99 + 199-210 + 221-231 has its own execFileSync git wrapper, normalizeGitRemote, and projectNameFromRemote. Already exists: findGitRoot (src/utils/git.ts:97), getRemoteUrl (src/utils/git.ts:269), parseGitRemote (src/utils/detectRepository.ts:87). The blocker is that projectContext is sync (execFileSync) while getRemoteUrl is async. findGitRoot is sync and can be reused immediately.

A8 · isSkillLearningEnabled vs isSkillSearchEnabled duplicated

featureCheck.ts in skillLearning and skillSearch are 1:1 templates differing only in env-var names and flag names. Wrap with createFeatureGate(envName, flagName) in src/utils/.

Section B — Quality findings (12)

B1 · emittedTurns redundant with timestamp watermark · HIGH

toolEventObserver.ts:39-56 maintains emittedTurns: Map<string, Set<number>> plus markTurn and hasToolHookObservationsForTurn. After the AC1 fix in runtimeObserver.ts:146-161 switched to a timestamp watermark, the turn-Set is now just an "are there any tool-hook observations at all" gate, which is already answered by readObservations(...) returning an empty array. Module-level mutable state duplicating information already in the observation store.

Fix: delete emittedTurns, markTurn, hasToolHookObservationsForTurn, resetToolHookBookkeeping. Drop the if (hasToolHookObservationsForTurn(...)) guard in runtimeObserver.ts and always run the watermark filter. Update __tests__/toolEventObserver.test.ts to remove those imports; add a test asserting turn is persisted on observations instead.

B2 · Dead _turn parameter in observationsFromMessages · LOW

runtimeObserver.ts:232-236 signature carries _turn: number, never used in the body. AC1 rewrite artefact.

Fix: drop the parameter and the call-site third argument.

B3 · Process-artefact comments leaking to source · MEDIUM

Multiple files contain // codex review QN / // Codex second-pass audit ACn / // AC9 compliance (codex review Q6) comments. These explain "why the previous implementation was wrong", not the current invariant. Reviewer references are not addressable from the codebase.

Locations:

  • runtimeObserver.ts:49-54, 77-79, 106-120, 132-134, 145
  • toolEventObserver.ts:22-28 @todo JSDoc, 81, 93-146
  • instinctStore.ts:74-79, 152-153
  • skillGapStore.ts:43, 169, 60-63 TODO block
  • skillLifecycle.ts:193-199
  • observationStore.ts:38-41
  • __tests__/skillGapStore.test.ts:173-175

Fix: keep the WHY (what invariant is guarded), delete the reviewer reference and the "what was wrong before" narrative. Collapse multi- line history notes to a single invariant statement.

B4 · Three dynamic imports in tool wrapper · MEDIUM

toolEventObserver.ts:101-105: runToolCallWithSkillLearningHooks does await import('./projectContext.js'), await import('./featureCheck.js'), await import('./runtimeObserver.js') on every invocation. Only the runtimeObserver import has a cycle concern; the other two can be static top-of-file imports.

Fix: convert resolveProjectContext and isSkillLearningEnabled to static imports. Keep runtimeObserver dynamic or restructure RUNTIME_SESSION_ID + getRuntimeTurn into a shared constant file.

B5 · try/catch swallow triplicated · LOW

toolEventObserver.ts:122, 128-134, 137-143: three near-identical try { await recordX(...) } catch { /* swallow */ } blocks.

Fix: extract safeRecord(fn: () => Promise<unknown>): Promise<void> and call it at the three sites.

B6 · recordToolError redundant with recordToolComplete · LOW

toolEventObserver.ts:180-194 builds the same observation shape as recordToolComplete with outcome: 'failure'. recordToolError can simply delegate: return recordToolComplete(ctx, toolName, error, 'failure').

B7 · TODO comments in production · LOW

skillGapStore.ts:60-63 carries a "P0-2 hook" multi-line TODO. toolEventObserver.ts:22-28 JSDoc @todo describes the pending wire into src/Tool.ts. Both are planning notes, not code constraints.

Fix: move to issue tracker; leave at most a one-line // TODO(skill-learning): wire into Tool.ts dispatch.

B8 · VALID_DOMAINS double source of truth · MEDIUM

llmObserverBackend.ts:33-41 maintains a readonly InstinctDomain[] array separately from the InstinctDomain union in types.ts:14-22. Adding a domain requires editing both, and domainField uses includes(value as InstinctDomain) which bypasses type safety.

Fix: declare export const INSTINCT_DOMAINS = [...] as const in types.ts and derive the union as typeof INSTINCT_DOMAINS[number]. Import the const in llmObserverBackend.ts and validate with (INSTINCT_DOMAINS as readonly string[]).includes(value).

B9 · makeTimeoutSignal dead fallback · LOW

llmObserverBackend.ts:284-293 feature-detects AbortSignal.timeout and falls back to AbortController + setTimeout.unref?.(). Project targets Bun + Node ≥18 where AbortSignal.timeout is always present.

Fix: return AbortSignal.timeout(ms) directly.

B10 · recordSkillGap rewrites all 14 fields by hand · LOW

skillGapStore.ts:95-113 literally lists every field when constructing the updated gap, mixing carry-over and new values. Adding a field forces an edit here. Contrast with recordDraftHit (L173-178) which uses spread.

Fix: const gap: SkillGapRecord = { ...(existing ?? defaults), count: ..., updatedAt: now, recommendations: ..., sessionId: ..., cwd: ... }.

B11 · buildGapAction uses unlabelled regex chain · LOW

skillGapStore.ts:318-331 dispatches by regex, with stub appearing in two different branches. Order-dependent. The sibling inferDomain (L333-341) is cleanly layered.

Fix: define const ACTION_RULES: Array<{ pattern: RegExp; action: string }> at top-of-file, loop in priority order.

B12 · Watermark is in-memory + module-scoped · MEDIUM

runtimeObserver.ts:54 lastConsumedToolHookTimestamp lives in module state, reset on test helper, lost on process restart. After restart the next post-sampling pass re-reads everything above epoch-0. Also means a test must know to reset the module to avoid cross-test leak.

Fix: persist the watermark next to the observations file, or mark each consumed observation with consumed: true at read time.

Section C — Efficiency findings (10)

C1 · resolveProjectContext is uncached per tool.call · CRITICAL

projectContext.ts:43-49 (+persistProjectContext) does on EVERY call:

  1. execFileSync('git', ['remote', 'get-url', 'origin'])
  2. execFileSync('git', ['rev-parse', '--show-toplevel'])
  3. Two realpathSync.native calls
  4. readProjectsRegistry + two writeFileSync operations (registry + project.json)

runToolCallWithSkillLearningHooks calls this per tool.call. At ~100 tool calls per session, that is 200 git process forks plus 400 synchronous disk writes. Highest-impact finding in the entire sprint.

Fix:

const contextCache = new Map<string, SkillLearningProjectContext>()
const PERSIST_INTERVAL_MS = 5 * 60 * 1000
let lastPersistAt = 0

export function resolveProjectContext(cwd = process.cwd()) {
  const cached = contextCache.get(cwd)
  if (cached) {
    if (Date.now() - lastPersistAt > PERSIST_INTERVAL_MS) {
      lastPersistAt = Date.now()
      persistProjectContext(cached)
    }
    return cached
  }
  const resolved = resolveContext(cwd)
  contextCache.set(cwd, resolved)
  persistProjectContext(resolved)
  lastPersistAt = Date.now()
  return resolved
}

Also export resetProjectContextCacheForTest().

C2 · Wrapper pays 3× dynamic import cost even when feature off · HIGH

toolEventObserver.ts:101-108: the isSkillLearningEnabled() check is INSIDE the try block that runs after all three await import calls. Feature-off path pays the cost.

Fix: static-import isSkillLearningEnabled; at the top of runToolCallWithSkillLearningHooks do if (!isSkillLearningEnabled()) return invoke() immediately. Only then do dynamic imports for runtimeObserver (if still needed).

C3 · emittedTurns unbounded + allocation churn · MEDIUM

toolEventObserver.ts:42: const seen = emittedTurns.get(sessionId) ?? new Set<number>() — every call allocates a fresh Set and then emittedTurns.set() replaces, even when an entry already existed. Unbounded growth over a long daemon session.

Fix: subsumed by B1 (delete the bookkeeping entirely).

C4 · Per-turn full-file read of observations.jsonl · MEDIUM

runtimeObserver.ts:147: readObservations(options) reads and JSON.parses the entire jsonl each post-sampling pass just to filter for source === 'tool-hook' && timestamp > watermark. At 0.9 MB (below archive threshold) that is ~1050 ms main-thread blocking per turn.

Fix: keep the last N tool-hook records in a ring buffer in toolEventObserver.ts, returned directly from a drainPendingToolHookObservations() helper. Disk is for durability only.

C5 · purgeOldObservations always does full read + rewrite · LOW

observationStore.ts:211-246 reads full file, parses, writes back — unconditional. Runs on startup via runStartupMaintenance. On a long-lived file near threshold, this is the slowest startup path.

Fix: short-circuit if the first observation line's timestamp is already newer than the cutoff; also skip if file size < some floor.

C6 · decayInstinctConfidence writes instincts serially · LOW

instinctStore.ts:136-168: for-await on saveInstinct makes N sequential writeFile calls. N is typically small, but for 50+ instincts this is still noticeable.

Fix: await Promise.all(toDecay.map(saveInstinct)). Safe because each writes an independent file.

C7 · upsertInstinct reloads full instinct dir per candidate · MEDIUM

instinctStore.ts:73: every call re-does readdir + readFile × N. Post-sampling may upsert 3+ candidates in a row. O(candidates × total instincts) filesystem reads.

Fix: add a bulkUpsertInstincts(candidates, options) helper that loads once and diff/merges in memory.

C8 · Startup maintenance duplicates loadInstincts twice · LOW

runtimeObserver.ts:86-90: decayInstinctConfidence and prunePendingInstincts each internally loadInstincts — two full directory reads back-to-back.

Fix: load once in runStartupMaintenance, pass the array to both. Or throttle maintenance to "once per 24h" via a persisted timestamp.

C9 · recordedGapSignals + discoveredThisSession unbounded · MEDIUM

prefetch.ts:22-23: both module-level Sets monotonically grow. In a long REPL or daemon session, memory leak accumulates.

Fix: LRU-cap at ~500 entries, or register a sessionEnd reset.

C10 · checkPromotion loads every project serially · LOW

promotion.ts:113-140: for (const entry of entries) { await loadInstincts(entry) }. For N projects, N sequential disk scans. Runs at the end of each post-sampling pass.

Fix: Promise.all(entries.map(loadInstincts)). Or invalidate- based: only call checkPromotion when at least one project's instinct file changed this turn.

Priority ranking (for the fix sprint)

Tier Finding Effort Impact
Critical C1 resolveProjectContext cache S Huge (per tool.call)
High B1/C3 delete emittedTurns bookkeeping S Real redundancy
High C2/B4 wrapper static imports + early short-circuit S Per tool.call
High B3 clean codex review comments S Code hygiene, user policy
Medium B2 drop dead _turn param XS Trivial
Medium B8 unify VALID_DOMAINS via INSTINCT_DOMAINS const S Type safety
Medium B9 drop AbortSignal fallback XS Dead code
Medium B12/C4 watermark persistence or in-memory tool-hook buffer M Tail latency
Medium A2/A4 extract shared frontmatter + word helpers M Dedup 3 generators
Medium C7 bulkUpsertInstincts S Per post-sampling
Low C9/C5/C6/C8/C10 various batch/throttle optimisations S each Incremental
Low A5/A7 replace hand-rolled git / hash with existing utils M Refactor, careful
Low A6/A8 internal consistency + featureCheck factor S Polish
Low B5/B6/B10/B11/B7 cosmetic quality cleanups S each Polish

Action recommendation

Apply in three independent commits (avoids batch revert risk):

  1. commit 1 (critical): C1 project context cache + C2/B4 wrapper short-circuit + static imports.
  2. commit 2 (state cleanup): B1/C3 delete emittedTurns, B2 drop _turn, B12 persist or replace watermark.
  3. commit 3 (hygiene): B3 comment cleanup + B8/B9 domain/timeout cleanups + A2/A3/A4 generator helper extraction.

After each commit, run bunx tsc --noEmit and bun test src/services/skillLearning/__tests__/ src/services/skillSearch/__tests__/ src/commands/skill-learning/__tests__/ before moving on.

Environment note

During the 2026-04-17 simplify pass the fixes above were attempted as direct Edit calls. git status --short was empty after the Edit batch, indicating a PostToolUse / linter / format hook silently reverted every write. All three agents returned valid diagnoses but the code base stayed on 5b9943b3 unmodified. A future attempt should first run git status between two Edit calls to confirm write persistence, or disable the suspect hook and retry.