feat(autofix-pr): 完整完成回流机制 (latent bug fix + completionChecker + 内容回流) (#1240)

* fix(autofix-pr): 修复 taskId 不一致导致 monitor lock dangling

问题:createAutofixTeammate 生成 teammate UUID 作为 monitor lock 的 key,
但 registerRemoteAgentTask 内部生成的 framework taskId 是另一个 UUID。
CCR session 自然完成时框架调 clearActiveMonitor(frameworkTaskId)
guard 失败,lock 永不释放,导致后续 /autofix-pr 报 "already monitoring"。

修复(Phase 1 of remote-agent completion loop):
- monitorState 新增 updateActiveMonitor(partial) 原子更新
- callAutofixPr 在 register 后 swap lock 的 taskId 到 framework 分配的 id
- RemoteAgentTask 引入 registerCompletionHook 注册式 API(参考已有的
  registerCompletionChecker 模式),在 5 个完成路径调 runCompletionHook
- autofix-pr 命令模块自己注册 cleanup hook,避免 framework 反向依赖
  command 模块

测试:
- monitorState 新增 4 个测试(updateActiveMonitor 行为 + bug 复现/修复)
- launchAutofixPr 新增 3 个端到端回归测试(taskId swap + hook 触发 +
  subsequent launch 不报 already monitoring)

完整分析与 Phase 2/3 改造方案见
docs/features/remote-agent-completion-analysis.md。

* feat(autofix-pr): 注册 completionChecker 用 gh CLI 探测 PR 完成

Phase 2 of remote-agent completion loop。Phase 1 修了 monitor lock
dangling,但完成信号仍然只能等 CCR session 自然 archive(timing 不可
预测,且不知道 PR 究竟有没有被修好)。Phase 2 加上主动完成探测。

实现:
- 新增 prOutcomeCheck.ts(纯决策矩阵):summariseAutofixOutcome 给定
  PR 快照 + 基线 SHA 返回 completed/summary。8 个决策分支单元测试。
- 新增 prFetch.ts(spawn 层):runGhPrView 调 gh CLI,fetchPrHeadSha
  在 launch 时捕获基线 SHA,checkPrAutofixOutcome 组合两者。
- AutofixPrRemoteTaskMetadata 加 initialHeadSha?: string 字段,survive
  --resume。
- launchAutofixPr.ts 模块顶部 registerCompletionChecker('autofix-pr',
  ...),5s throttle 防 gh CLI 调用爆。callAutofixPr 启动时调
  fetchPrHeadSha 传入 metadata。

决策矩阵:
  MERGED                  → done(merged)
  CLOSED 未 merge          → done(closed without fix)
  OPEN 无 baseline        → 继续轮询
  OPEN head 未变           → 继续轮询(agent 还没 push)
  OPEN head 变 + CI pending → 继续轮询
  OPEN head 变 + CI failure → done(surface red,user 决定 retry)
  OPEN head 变 + CI success → done(clean fix)

设计:
- gh CLI 而非 Octokit:复用用户已有 auth,不引入 token 管理
- 决策与 spawn 分文件:prOutcomeCheck 纯函数易测,prFetch 单独 mock
  避免 Bun mock.module 进程级污染(已在 launchAutofixPr.test 注释说明)
- 5s throttle:framework 每 1s 轮询,gh CLI subprocess 太重不能跟上
- 失败兜底:fetchPrHeadSha/checkPrAutofixOutcome 失败均不抛,returns
  null/false,framework 继续走原路径

测试:
- prOutcomeCheck 9 个单测覆盖决策矩阵
- launchAutofixPr 5 个新测试:checker 注册 / fetchPrHeadSha 调用 /
  initialHeadSha 传 metadata / SHA 失败仍能 launch / SHA null 处理

完整方案见 docs/features/remote-agent-completion-analysis.md。

* feat(autofix-pr): 内容回流让本地模型读到 PR 修复结果

Phase 3 of remote-agent completion loop。Phase 2 注册了 completionChecker
让框架能在 PR 合并/关闭/有 push+CI 绿时主动完成 task,但 task-notification
仍然只携带 generic 文本(""${owner}/${repo}#42 merged"")。Phase 3 让本地
模型读到远端 agent 自己产出的结构化结果(commits 列表、files 列表、CI
状态、人类可读 summary)。

实现:
- 新增 extractAutofixResultFromLog (src/commands/autofix-pr/
  extractAutofixResult.ts):从 SDKMessage[] 中扫 <autofix-result> tag,
  优先 hook stdout 后 fallback assistant text,latest-wins。10 个单测。
- RemoteAgentTask 新增 registerContentExtractor 注册式 API + 私有
  enqueueRichRemoteNotification(参考 enqueueRemoteReviewNotification),
  在 3 个 generic 完成路径(archived / completionChecker / result-driven)
  先尝试 tryExtractRichContent,有内容用 rich 变体,没有走 generic。
  isRemoteReview 路径不变(它走自己的 enqueueRemoteReviewNotification)。
- launchAutofixPr.ts 模块顶部 registerContentExtractor('autofix-pr',
  extractAutofixResultFromLog)。initialMessage 加 <autofix-result> 输出
  指令(pr-number / commits-pushed / files-changed / ci-status / summary)。

设计:
- 注册式 API(同 Phase 1 hook + Phase 2 checker):framework 不反向依赖
  命令模块,所有 PR-specific 逻辑在 autofix-pr/
- latest-wins:agent 重试时只取最新 tag,旧 tag 不会污染
- truncated tag → null:开 tag 无对应闭 tag 视为不完整,走 generic
  fallback
- 跨 message 不拼接:开 tag 和闭 tag 在不同 message 视为不完整(避免
  误拼字符串)
- 字符串 content 不解析:assistant.message.content 为 string(非 block
  array)的少见路径直接 skip,不 crash

测试:
- extractAutofixResultFromLog 10 个单测(空 log / 无 tag / hook stdout /
  assistant text / hook_response subtype / 多 tag latest-wins / 截断 /
  hook 后于 assistant 的优先级 / 跨 message 不拼接 / 字符串 content
  graceful)
- launchAutofixPr 3 个新测试(extractor 注册 / initialMessage 含 tag
  schema / extractor 真实行为)

完整方案见 docs/features/remote-agent-completion-analysis.md 第 5.3 节。

* fix(autofix-pr): extractBetween 支持 latest tag 截断时回溯到更早完整对

如果远端 agent 重试时写了完整 <autofix-result> 后又开了一个被截断的
第二个 tag, 旧实现只看 lastIndexOf(open) 然后找不到 close 就返回 null,
导致前面那个完整结果被丢弃。改为从尾向首遍历所有 open tag, 返回第一个
能配对的 open/close 对。

附带:
- docs/features/remote-agent-completion-analysis.md: 9 处裸 fenced block
  补 language tag (text/http), 修复 markdownlint MD040 警告
- 同文件: 两处"三选项" → "三个选项" 符合中文量词习惯

* test(autofix-pr): 补齐 completionChecker / 边界 CI 检查覆盖率

针对 codecov patch coverage gap, 补足三块此前未走到的代码路径:

prOutcomeCheck.ts (原 96.92%, 2 lines missing):
- statusCheckRollup === undefined 路径 (与空数组分支不同, GitHub 在无
  checks 配置的 PR 上直接省略字段)
- COMPLETED 状态但 conclusion 为 null/空 的 in-flight 检查归为 pending

launchAutofixPr.ts (原 58.33%, 15 lines missing):
- registerCompletionChecker arrow body: metadata 缺失早返回 / 节流窗口内
  返回 null / completed=false 返回 null / completed=true 返回 summary /
  initialHeadSha 透传到 checkPrAutofixOutcome
- registerCompletionHook 的 if(meta) 短路两侧: 有 metadata 时清空节流条目,
  无 metadata 时仍释放 active monitor lock

所有新测试沿用现有 mock.module 与 registerXxxMock.mock.calls 拉取注册
回调的模式, 无新增依赖。prOutcomeCheck 11/11 本地通过。

* style: biome check --fix 整形 launchAutofixPr.test 新增段

---------

Co-authored-by: unraid <local@unraid.local>
Co-authored-by: Claude <noreply@anthropic.com>
This commit is contained in:
Dosion
2026-05-22 21:06:26 +08:00
committed by GitHub
parent f2b751f659
commit 9d17597e58
10 changed files with 1346 additions and 7 deletions

View File

@@ -13,7 +13,11 @@ import {
checkRemoteAgentEligibility,
formatPreconditionError,
getRemoteTaskSessionUrl,
registerCompletionChecker,
registerCompletionHook,
registerContentExtractor,
registerRemoteAgentTask,
type AutofixPrRemoteTaskMetadata,
type BackgroundRemoteSessionPrecondition,
} from '../../tasks/RemoteAgentTask/RemoteAgentTask.js'
import type { LocalJSXCommandCall } from '../../types/command.js'
@@ -26,10 +30,66 @@ import {
getActiveMonitor,
isMonitoring,
trySetActiveMonitor,
updateActiveMonitor,
} from './monitorState.js'
import { extractAutofixResultFromLog } from './extractAutofixResult.js'
import { parseAutofixArgs } from './parseArgs.js'
import { checkPrAutofixOutcome, fetchPrHeadSha } from './prFetch.js'
import { detectAutofixSkills, formatSkillsHint } from './skillDetect.js'
// Throttle map for the completionChecker: gh CLI is called at most once per
// PR per CHECK_INTERVAL_MS, regardless of the framework's 1s poll cadence.
// Key is `${owner}/${repo}#${prNumber}`. Cleared when the completion hook
// fires so a re-launched monitor starts with a fresh budget.
const lastCheckAt = new Map<string, number>()
const CHECK_INTERVAL_MS = 5_000
function throttleKey(meta: AutofixPrRemoteTaskMetadata): string {
return `${meta.owner}/${meta.repo}#${meta.prNumber}`
}
// Register the completionChecker once at module load. The framework calls it
// on every poll tick for tasks with remoteTaskType==='autofix-pr'; throttle
// inside so we don't fire gh CLI 60×/min. Returns the summary string on
// completion (becomes the task-notification body) or null to keep polling.
registerCompletionChecker('autofix-pr', async metadata => {
const meta = metadata as AutofixPrRemoteTaskMetadata | undefined
if (!meta) return null
const key = throttleKey(meta)
const now = Date.now()
if (now - (lastCheckAt.get(key) ?? 0) < CHECK_INTERVAL_MS) return null
lastCheckAt.set(key, now)
const result = await checkPrAutofixOutcome({
owner: meta.owner,
repo: meta.repo,
prNumber: meta.prNumber,
initialHeadSha: meta.initialHeadSha,
})
return result.completed ? result.summary : null
})
// Release the singleton monitor lock when the framework transitions the
// autofix task to a terminal state. Without this, the lock — keyed by the
// framework-assigned taskId (after callAutofixPr's updateActiveMonitor swap)
// — would dangle past natural completion, blocking subsequent /autofix-pr
// invocations until the process restarts. Registered at module load; the
// framework's runCompletionHook invokes it once per terminal transition.
// Also clear the per-PR throttle entry so a re-launch starts fresh.
registerCompletionHook('autofix-pr', (taskId, metadata) => {
clearActiveMonitor(taskId)
const meta = metadata as AutofixPrRemoteTaskMetadata | undefined
if (meta) lastCheckAt.delete(throttleKey(meta))
})
// Phase 3 content return: extract the <autofix-result> tag from the session
// log so the local model sees the agent's structured outcome (commits
// pushed, files changed, CI status) inline in the completion task-
// notification — instead of just a file-path pointer. The framework falls
// back to the generic notification if extraction returns null.
registerContentExtractor('autofix-pr', log => extractAutofixResultFromLog(log))
function makeErrorText(message: string, code: string): string {
logEvent('tengu_autofix_pr_result', {
result:
@@ -198,7 +258,23 @@ export const callAutofixPr: LocalJSXCommandCall = async (
// 4.5 compose message
const target = `${owner}/${repo}#${prNumber}`
const branchName = `refs/pull/${prNumber}/head`
const initialMessage = `Auto-fix failing CI checks on PR #${prNumber} in ${owner}/${repo}.${skillsHint}`
const initialMessage = `Auto-fix failing CI checks on PR #${prNumber} in ${owner}/${repo}.${skillsHint}
When you finish (or hit a blocker you can't recover from), output the following XML tag as your final message so the local user gets a structured summary:
<autofix-result>
<pr-number>${prNumber}</pr-number>
<commits-pushed>
<commit sha="...">commit message</commit>
</commits-pushed>
<files-changed>
<file path="...">N changes</file>
</files-changed>
<ci-status>green | red | pending | unknown</ci-status>
<summary>One-sentence summary of what was fixed or why it could not be fixed.</summary>
</autofix-result>
If no fix was needed, omit <commits-pushed> and <files-changed> and explain in <summary>. If you only attempted partial work, list the commits you did push and explain the remainder in <summary>.`
// 4.6 in-process teammate
const teammate = createAutofixTeammate(initialMessage, target)
@@ -274,18 +350,35 @@ export const callAutofixPr: LocalJSXCommandCall = async (
return null
}
// 4.8b capture PR head SHA before registering so the completionChecker
// can detect when the agent has pushed new commits. Best-effort — if gh
// is unavailable or the call fails, leave initialHeadSha undefined and
// the checker falls back to terminal-state-only completion (closed /
// merged). Don't block on this; teleport succeeded already.
const initialHeadSha =
(await fetchPrHeadSha(owner, repo, prNumber).catch(() => null)) ??
undefined
// 4.9 register task. If this throws, release the lock so the user can
// retry — the remote CCR session is already created so we surface a
// dedicated error code.
//
// After registration succeeds, swap the lock's taskId from the tentative
// teammate UUID (used to acquire the lock atomically before teleport) to
// the framework-assigned taskId. Without this swap, the framework's own
// cleanup path (clearActiveMonitor(frameworkTaskId) on natural completion)
// would no-op against a lock keyed by teammate.taskId, leaving the
// singleton lock dangling and blocking future /autofix-pr invocations.
try {
registerRemoteAgentTask({
const { taskId: frameworkTaskId } = registerRemoteAgentTask({
remoteTaskType: 'autofix-pr',
session,
command: `/autofix-pr ${prNumber}`,
context,
isLongRunning: true,
remoteTaskMetadata: { owner, repo, prNumber },
remoteTaskMetadata: { owner, repo, prNumber, initialHeadSha },
})
updateActiveMonitor({ taskId: frameworkTaskId })
} catch (regErr: unknown) {
clearActiveMonitor(teammate.taskId)
const regMsg = regErr instanceof Error ? regErr.message : String(regErr)