主要变更: - Skill Learning 闭环系统 (9/9 AC) - Opus 4.7 模型层接入 + adaptive thinking - Prompt 工程优化 (64 审计测试) - Agent Teams 简化门控 (默认启用) - Windows Terminal 后端修复 (EncodedCommand/WT_SESSION) - TF-IDF 技能搜索精准化 (字段加权/CJK 优化) - Autonomy 系统 (/autonomy 命令) - ACP 协议完整实现 - mock.module 泄漏修复 (CI 全绿) - 152+ lint/type 修复
17 KiB
Simplify Review Findings — 2026-04-17
Base commit:
5b9943b3onchore/lint-cleanupThree parallel review agents (reuse / quality / efficiency) audited the skill-learning sprint's new or heavily-changed files. 30 findings total.Fix attempt in the same session was reverted by an unidentified post-write mechanism (git status remained clean after every Edit call). This document preserves the findings so a future session can apply them when the revert source is identified.
Files reviewed
src/services/skillLearning/— runtimeObserver, toolEventObserver, llmObserverBackend, observerBackend, instinctStore, skillGapStore, skillLifecycle, evolution, skillGenerator, commandGenerator, agentGenerator, learningPolicy, promotion, observationStore, sessionObserver, instinctParser, projectContext, featureChecksrc/services/skillSearch/prefetch.ts,localSearch.tssrc/commands/skill-learning/skill-learning.tssrc/services/tools/toolExecution.ts(AC1 wire only)scripts/verify-skill-learning-e2e.ts
Section A — Reuse findings (8)
A1 · Duplicate of extractTextContent
runtimeObserver.ts:301-312 has textFromContent(content: unknown)
that maps + filters over ContentBlock[] to join text. The project
already exports extractTextContent / getContentText from
src/utils/messages.ts:3011-3031. The new helper only exists because
it takes unknown; a narrow as ContentBlockParam[] at the callsite
lets the utility handle it.
A2 · extractWords copied between command and agent generators
commandGenerator.ts:139-167 is byte-identical to
agentGenerator.ts:137-164 except for a two-entry difference in the
stop-word set. Both share 80% of the loop body with
learningPolicy.buildLearnedSkillName (learningPolicy.ts:38-47).
Extract a extractInstinctWords(instincts, { stopWords }) helper,
ideally placed next to the existing policy exports.
A3 · averageConfidence computed inline in four places
commandGenerator.ts:132-137, agentGenerator.ts:130-135,
skillGenerator.ts:36-38, plus the same reduce shape inside
learningPolicy.shouldGenerateSkillFromInstincts (lines 29-32). Expose
a single averageInstinctConfidence(instincts) helper.
A4 · Frontmatter template triplicated across generators
skillGenerator.ts:171-179, commandGenerator.ts:104-111,
agentGenerator.ts:102-109 all emit the same 7-line frontmatter
(name / description / origin / confidence / evolved_from). A future
schema change has to touch three files. Extract
buildLearnedArtifactFrontmatter({ name, description, confidence, sourceIds }).
A5 · Inline createHash() instead of src/utils/hash.ts
instinctParser.ts:69-72, observationStore.ts:434-435,
projectContext.ts:234, skillGapStore.ts:466-468 all hand-roll
createHash('sha1'|'sha256').update(x).digest('hex'). hashContent in
src/utils/hash.ts:19-46 already does this with Bun's faster
non-cryptographic hash; the four call sites are dedup-style uses where
cryptographic strength isn't required. Note: verify semantic
equivalence before swapping — Bun.hash output differs from SHA-256, so
any persisted IDs need a one-shot migration or a cutover version bump.
A6 · Defensive createObservationId fallback is dead code
observationStore.ts:427-432 feature-detects crypto.randomUUID, but
Bun + Node ≥18 always have it. Other files in the same directory
(toolEventObserver.ts:72, runtimeObserver.ts:253/265/279/288) call
it directly. Internal inconsistency.
A7 · projectContext.ts re-implements src/utils/git.ts
projectContext.ts:72-99 + 199-210 + 221-231 has its own execFileSync
git wrapper, normalizeGitRemote, and projectNameFromRemote. Already
exists: findGitRoot (src/utils/git.ts:97), getRemoteUrl
(src/utils/git.ts:269), parseGitRemote
(src/utils/detectRepository.ts:87). The blocker is that
projectContext is sync (execFileSync) while getRemoteUrl is async.
findGitRoot is sync and can be reused immediately.
A8 · isSkillLearningEnabled vs isSkillSearchEnabled duplicated
featureCheck.ts in skillLearning and skillSearch are 1:1 templates
differing only in env-var names and flag names. Wrap with
createFeatureGate(envName, flagName) in src/utils/.
Section B — Quality findings (12)
B1 · emittedTurns redundant with timestamp watermark · HIGH
toolEventObserver.ts:39-56 maintains emittedTurns: Map<string, Set<number>>
plus markTurn and hasToolHookObservationsForTurn. After the AC1 fix
in runtimeObserver.ts:146-161 switched to a timestamp watermark, the
turn-Set is now just an "are there any tool-hook observations at all"
gate, which is already answered by readObservations(...) returning
an empty array. Module-level mutable state duplicating information
already in the observation store.
Fix: delete emittedTurns, markTurn,
hasToolHookObservationsForTurn, resetToolHookBookkeeping. Drop the
if (hasToolHookObservationsForTurn(...)) guard in runtimeObserver.ts
and always run the watermark filter. Update
__tests__/toolEventObserver.test.ts to remove those imports; add a
test asserting turn is persisted on observations instead.
B2 · Dead _turn parameter in observationsFromMessages · LOW
runtimeObserver.ts:232-236 signature carries _turn: number, never
used in the body. AC1 rewrite artefact.
Fix: drop the parameter and the call-site third argument.
B3 · Process-artefact comments leaking to source · MEDIUM
Multiple files contain // codex review QN / // Codex second-pass audit ACn / // AC9 compliance (codex review Q6) comments. These
explain "why the previous implementation was wrong", not the current
invariant. Reviewer references are not addressable from the codebase.
Locations:
runtimeObserver.ts:49-54, 77-79, 106-120, 132-134, 145toolEventObserver.ts:22-28 @todo JSDoc, 81, 93-146instinctStore.ts:74-79, 152-153skillGapStore.ts:43, 169, 60-63 TODO blockskillLifecycle.ts:193-199observationStore.ts:38-41__tests__/skillGapStore.test.ts:173-175
Fix: keep the WHY (what invariant is guarded), delete the reviewer reference and the "what was wrong before" narrative. Collapse multi- line history notes to a single invariant statement.
B4 · Three dynamic imports in tool wrapper · MEDIUM
toolEventObserver.ts:101-105: runToolCallWithSkillLearningHooks
does await import('./projectContext.js'), await import('./featureCheck.js'), await import('./runtimeObserver.js') on every invocation. Only the
runtimeObserver import has a cycle concern; the other two can be
static top-of-file imports.
Fix: convert resolveProjectContext and isSkillLearningEnabled
to static imports. Keep runtimeObserver dynamic or restructure
RUNTIME_SESSION_ID + getRuntimeTurn into a shared constant file.
B5 · try/catch swallow triplicated · LOW
toolEventObserver.ts:122, 128-134, 137-143: three near-identical
try { await recordX(...) } catch { /* swallow */ } blocks.
Fix: extract safeRecord(fn: () => Promise<unknown>): Promise<void>
and call it at the three sites.
B6 · recordToolError redundant with recordToolComplete · LOW
toolEventObserver.ts:180-194 builds the same observation shape as
recordToolComplete with outcome: 'failure'. recordToolError can
simply delegate: return recordToolComplete(ctx, toolName, error, 'failure').
B7 · TODO comments in production · LOW
skillGapStore.ts:60-63 carries a "P0-2 hook" multi-line TODO.
toolEventObserver.ts:22-28 JSDoc @todo describes the pending wire
into src/Tool.ts. Both are planning notes, not code constraints.
Fix: move to issue tracker; leave at most a one-line
// TODO(skill-learning): wire into Tool.ts dispatch.
B8 · VALID_DOMAINS double source of truth · MEDIUM
llmObserverBackend.ts:33-41 maintains a readonly InstinctDomain[]
array separately from the InstinctDomain union in types.ts:14-22.
Adding a domain requires editing both, and domainField uses
includes(value as InstinctDomain) which bypasses type safety.
Fix: declare export const INSTINCT_DOMAINS = [...] as const in
types.ts and derive the union as typeof INSTINCT_DOMAINS[number].
Import the const in llmObserverBackend.ts and validate with
(INSTINCT_DOMAINS as readonly string[]).includes(value).
B9 · makeTimeoutSignal dead fallback · LOW
llmObserverBackend.ts:284-293 feature-detects AbortSignal.timeout
and falls back to AbortController + setTimeout.unref?.(). Project
targets Bun + Node ≥18 where AbortSignal.timeout is always present.
Fix: return AbortSignal.timeout(ms) directly.
B10 · recordSkillGap rewrites all 14 fields by hand · LOW
skillGapStore.ts:95-113 literally lists every field when
constructing the updated gap, mixing carry-over and new values. Adding
a field forces an edit here. Contrast with recordDraftHit (L173-178)
which uses spread.
Fix: const gap: SkillGapRecord = { ...(existing ?? defaults), count: ..., updatedAt: now, recommendations: ..., sessionId: ..., cwd: ... }.
B11 · buildGapAction uses unlabelled regex chain · LOW
skillGapStore.ts:318-331 dispatches by regex, with stub appearing
in two different branches. Order-dependent. The sibling inferDomain
(L333-341) is cleanly layered.
Fix: define const ACTION_RULES: Array<{ pattern: RegExp; action: string }> at top-of-file, loop in priority order.
B12 · Watermark is in-memory + module-scoped · MEDIUM
runtimeObserver.ts:54 lastConsumedToolHookTimestamp lives in module
state, reset on test helper, lost on process restart. After restart
the next post-sampling pass re-reads everything above epoch-0. Also
means a test must know to reset the module to avoid cross-test leak.
Fix: persist the watermark next to the observations file, or mark
each consumed observation with consumed: true at read time.
Section C — Efficiency findings (10)
C1 · resolveProjectContext is uncached per tool.call · CRITICAL
projectContext.ts:43-49 (+persistProjectContext) does on EVERY
call:
execFileSync('git', ['remote', 'get-url', 'origin'])execFileSync('git', ['rev-parse', '--show-toplevel'])- Two
realpathSync.nativecalls readProjectsRegistry+ twowriteFileSyncoperations (registry + project.json)
runToolCallWithSkillLearningHooks calls this per tool.call. At
~100 tool calls per session, that is 200 git process forks plus 400
synchronous disk writes. Highest-impact finding in the entire
sprint.
Fix:
const contextCache = new Map<string, SkillLearningProjectContext>()
const PERSIST_INTERVAL_MS = 5 * 60 * 1000
let lastPersistAt = 0
export function resolveProjectContext(cwd = process.cwd()) {
const cached = contextCache.get(cwd)
if (cached) {
if (Date.now() - lastPersistAt > PERSIST_INTERVAL_MS) {
lastPersistAt = Date.now()
persistProjectContext(cached)
}
return cached
}
const resolved = resolveContext(cwd)
contextCache.set(cwd, resolved)
persistProjectContext(resolved)
lastPersistAt = Date.now()
return resolved
}
Also export resetProjectContextCacheForTest().
C2 · Wrapper pays 3× dynamic import cost even when feature off · HIGH
toolEventObserver.ts:101-108: the isSkillLearningEnabled() check is
INSIDE the try block that runs after all three await import calls.
Feature-off path pays the cost.
Fix: static-import isSkillLearningEnabled; at the top of
runToolCallWithSkillLearningHooks do if (!isSkillLearningEnabled()) return invoke() immediately. Only then do dynamic imports for
runtimeObserver (if still needed).
C3 · emittedTurns unbounded + allocation churn · MEDIUM
toolEventObserver.ts:42: const seen = emittedTurns.get(sessionId) ?? new Set<number>() — every call allocates a fresh Set and then
emittedTurns.set() replaces, even when an entry already existed.
Unbounded growth over a long daemon session.
Fix: subsumed by B1 (delete the bookkeeping entirely).
C4 · Per-turn full-file read of observations.jsonl · MEDIUM
runtimeObserver.ts:147: readObservations(options) reads and
JSON.parses the entire jsonl each post-sampling pass just to filter
for source === 'tool-hook' && timestamp > watermark. At 0.9 MB
(below archive threshold) that is ~10–50 ms main-thread blocking per
turn.
Fix: keep the last N tool-hook records in a ring buffer in
toolEventObserver.ts, returned directly from a
drainPendingToolHookObservations() helper. Disk is for durability
only.
C5 · purgeOldObservations always does full read + rewrite · LOW
observationStore.ts:211-246 reads full file, parses, writes back —
unconditional. Runs on startup via runStartupMaintenance. On a
long-lived file near threshold, this is the slowest startup path.
Fix: short-circuit if the first observation line's timestamp is already newer than the cutoff; also skip if file size < some floor.
C6 · decayInstinctConfidence writes instincts serially · LOW
instinctStore.ts:136-168: for-await on saveInstinct makes N
sequential writeFile calls. N is typically small, but for 50+
instincts this is still noticeable.
Fix: await Promise.all(toDecay.map(saveInstinct)). Safe because
each writes an independent file.
C7 · upsertInstinct reloads full instinct dir per candidate · MEDIUM
instinctStore.ts:73: every call re-does readdir + readFile × N.
Post-sampling may upsert 3+ candidates in a row. O(candidates × total
instincts) filesystem reads.
Fix: add a bulkUpsertInstincts(candidates, options) helper that
loads once and diff/merges in memory.
C8 · Startup maintenance duplicates loadInstincts twice · LOW
runtimeObserver.ts:86-90: decayInstinctConfidence and
prunePendingInstincts each internally loadInstincts — two full
directory reads back-to-back.
Fix: load once in runStartupMaintenance, pass the array to both.
Or throttle maintenance to "once per 24h" via a persisted timestamp.
C9 · recordedGapSignals + discoveredThisSession unbounded · MEDIUM
prefetch.ts:22-23: both module-level Sets monotonically grow. In a
long REPL or daemon session, memory leak accumulates.
Fix: LRU-cap at ~500 entries, or register a sessionEnd reset.
C10 · checkPromotion loads every project serially · LOW
promotion.ts:113-140: for (const entry of entries) { await loadInstincts(entry) }. For N projects, N sequential disk scans. Runs
at the end of each post-sampling pass.
Fix: Promise.all(entries.map(loadInstincts)). Or invalidate-
based: only call checkPromotion when at least one project's instinct
file changed this turn.
Priority ranking (for the fix sprint)
| Tier | Finding | Effort | Impact |
|---|---|---|---|
| Critical | C1 resolveProjectContext cache |
S | Huge (per tool.call) |
| High | B1/C3 delete emittedTurns bookkeeping |
S | Real redundancy |
| High | C2/B4 wrapper static imports + early short-circuit | S | Per tool.call |
| High | B3 clean codex review comments | S | Code hygiene, user policy |
| Medium | B2 drop dead _turn param |
XS | Trivial |
| Medium | B8 unify VALID_DOMAINS via INSTINCT_DOMAINS const |
S | Type safety |
| Medium | B9 drop AbortSignal fallback | XS | Dead code |
| Medium | B12/C4 watermark persistence or in-memory tool-hook buffer | M | Tail latency |
| Medium | A2/A4 extract shared frontmatter + word helpers | M | Dedup 3 generators |
| Medium | C7 bulkUpsertInstincts | S | Per post-sampling |
| Low | C9/C5/C6/C8/C10 various batch/throttle optimisations | S each | Incremental |
| Low | A5/A7 replace hand-rolled git / hash with existing utils | M | Refactor, careful |
| Low | A6/A8 internal consistency + featureCheck factor | S | Polish |
| Low | B5/B6/B10/B11/B7 cosmetic quality cleanups | S each | Polish |
Action recommendation
Apply in three independent commits (avoids batch revert risk):
- commit 1 (critical): C1 project context cache + C2/B4 wrapper short-circuit + static imports.
- commit 2 (state cleanup): B1/C3 delete
emittedTurns, B2 drop_turn, B12 persist or replace watermark. - commit 3 (hygiene): B3 comment cleanup + B8/B9 domain/timeout cleanups + A2/A3/A4 generator helper extraction.
After each commit, run bunx tsc --noEmit and
bun test src/services/skillLearning/__tests__/ src/services/skillSearch/__tests__/ src/commands/skill-learning/__tests__/
before moving on.
Environment note
During the 2026-04-17 simplify pass the fixes above were attempted as
direct Edit calls. git status --short was empty after the Edit
batch, indicating a PostToolUse / linter / format hook silently
reverted every write. All three agents returned valid diagnoses but
the code base stayed on 5b9943b3 unmodified. A future attempt should
first run git status between two Edit calls to confirm write
persistence, or disable the suspect hook and retry.