claude-code/src/utils/swarm
Dosion f91060836f fix(swarm): WindowsTerminalBackend pidFile health check + 5-state lifecycle (#1237)
* fix(swarm): WindowsTerminalBackend pidFile health check + 5-state lifecycle

Fixes several problems caused by the fire-and-forget behavior of wt.exe split-pane: teammates that hang while appearing alive, TeamDelete getting stuck, a kill-while-spawn race, and others.

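- Add waitForPidFile(): after wt.exe returns, wait until powershell.exe has actually started and written the pidFile.
  Default timeout is 8s, overridable via env CLAUDE_WT_PANE_TIMEOUT_MS; on timeout it throws with full diagnostics.
- Add a 5-state lifecycle (registered/spawning/ready/killing/dead). sendCommandToPane
  wraps spawnPromise in an inner Promise; re-spawning a pane in the ready state throws immediately.
- Fix the killPane TOCTOU: re-read status after awaiting spawnPromise; prefer the cached pane.pid
  to avoid a disk read; even if Stop-Process fails, clear the cache and mark the pane dead so that PID reuse cannot cause a wrong kill.
- Stricter pid parsing: /^\d+$/ + Number.isFinite + >0; remove a dead try/catch.
- Constructor takes an options object to inject pidFileDir (still compatible with the original positional parameters).
- Remove any stale pidFile before launch; killPane fallback retries 3×500ms as a last resort.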
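
The pid validation and pidFile wait described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the names `parsePid` and `resolveTimeoutMs`, the signature given to `waitForPidFile`, the 100 ms poll interval, and the whitespace trimming are all assumptions. Only the `/^\d+$/` + `Number.isFinite` + `>0` checks, the 8s default, and the `CLAUDE_WT_PANE_TIMEOUT_MS` override come from the commit message.

```typescript
import { readFile } from "node:fs/promises";

// Hypothetical helper: accept only decimal digits that map to a finite, positive number.
// Trimming surrounding whitespace is an assumption, not something the commit states.
function parsePid(raw: string): number | null {
  const text = raw.trim();
  if (!/^\d+$/.test(text)) return null;
  const pid = Number(text);
  return Number.isFinite(pid) && pid > 0 ? pid : null;
}

// Hypothetical helper: read the timeout override from the environment, else use 8s.
function resolveTimeoutMs(
  env: Record<string, string | undefined>,
  fallbackMs = 8000,
): number {
  const raw = env.CLAUDE_WT_PANE_TIMEOUT_MS;
  const parsed = raw === undefined ? NaN : Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallbackMs;
}

// Poll the pidFile until it holds a valid pid or the deadline passes.
// Signature and poll interval are assumed for illustration.
async function waitForPidFile(
  pidFilePath: string,
  timeoutMs: number,
  pollMs = 100,
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  let lastSeen = "<missing>";
  while (Date.now() < deadline) {
    try {
      const content = await readFile(pidFilePath, "utf8");
      lastSeen = content;
      const pid = parsePid(content);
      if (pid !== null) return pid;
    } catch {
      // The file does not exist yet; keep polling.
    }
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
  throw new Error(
    `Timed out after ${timeoutMs}ms waiting for ${pidFilePath}; ` +
      `last content: ${JSON.stringify(lastSeen)}`,
  );
}
```

With this shape, corrupted content such as `"123abc"` never satisfies the parser, so the wait ends in a timeout rather than adopting a bad pid, which matches the test scenario listed in the commit.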

* test(swarm): 12 tests covering WindowsTerminalBackend lifecycle, race, pid validation

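Adds 12 tests for WindowsTerminalBackend covering all new v2 behavior: 5 for v1 compatibility and 7
new v2 scenarios. Using the constructor options object, the tests pass pidFileDir: tempDir for isolation so that nothing leaks into
the real OS tmpdir.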

New scenarios covered:
- unlinks stale pidFile so a stale pid is not adopted
- rejects re-spawn on a ready pane
- throws on unknown paneId in sendCommandToPane
- rejects corrupted pidFile content ("123abc") and times out
- killPane awaits in-flight spawn before killing (kill-while-spawn race)
- Stop-Process failure clears cached pid and marks pane dead
- killPane uses cached pid and returns false when pane is unknown
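
The lifecycle and kill-while-spawn scenarios listed above might be modeled along these lines. This is a simplified sketch under stated assumptions, not WindowsTerminalBackend itself: `PaneTracker`, `spawnFn`, and `stopFn` are hypothetical names, rejecting spawn from every state other than `registered` is an assumption (the commit only says a `ready` pane rejects re-spawn), and returning `false` when the stop call fails is also assumed.

```typescript
type PaneStatus = "registered" | "spawning" | "ready" | "killing" | "dead";

interface Pane {
  status: PaneStatus;
  pid: number | undefined;
  spawnPromise: Promise<number> | undefined;
}

// Hypothetical tracker; spawnFn and stopFn stand in for the wt.exe launch
// and the Stop-Process call.
class PaneTracker {
  private panes = new Map<string, Pane>();
  private spawnFn: (paneId: string) => Promise<number>;
  private stopFn: (pid: number) => Promise<void>;

  constructor(
    spawnFn: (paneId: string) => Promise<number>,
    stopFn: (pid: number) => Promise<void>,
  ) {
    this.spawnFn = spawnFn;
    this.stopFn = stopFn;
  }

  register(paneId: string): void {
    this.panes.set(paneId, { status: "registered", pid: undefined, spawnPromise: undefined });
  }

  statusOf(paneId: string): PaneStatus | undefined {
    return this.panes.get(paneId)?.status;
  }

  async spawn(paneId: string): Promise<number> {
    const pane = this.panes.get(paneId);
    if (!pane) throw new Error(`Unknown pane: ${paneId}`);
    if (pane.status !== "registered") {
      throw new Error(`Cannot spawn pane ${paneId} in state ${pane.status}`);
    }
    pane.status = "spawning";
    // The stored promise settles only after the state has been updated,
    // so anyone awaiting it observes a consistent status.
    pane.spawnPromise = this.spawnFn(paneId).then(
      (pid) => {
        pane.pid = pid;
        pane.status = "ready";
        return pid;
      },
      (err) => {
        pane.status = "dead";
        throw err;
      },
    );
    return pane.spawnPromise;
  }

  async kill(paneId: string): Promise<boolean> {
    const pane = this.panes.get(paneId);
    if (!pane) return false;
    // Wait for any in-flight spawn first, then re-read status (TOCTOU guard).
    const inFlight = pane.spawnPromise;
    if (inFlight) await inFlight.catch(() => undefined);
    if (pane.status === "dead" || pane.status === "killing") return false;
    pane.status = "killing";
    const pid = pane.pid; // cached pid, no disk read
    try {
      if (pid === undefined) return false;
      await this.stopFn(pid);
      return true;
    } catch {
      return false;
    } finally {
      // Clear the cached pid even on failure so a reused PID is never targeted later.
      pane.pid = undefined;
      pane.status = "dead";
    }
  }
}
```

Storing the wrapped promise rather than the raw one means `kill` does not depend on callback ordering: by the time its await resumes, the pane is already `ready` or `dead`.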

The createBackend helper now uses the options object, and simulatePidWrite simulates powershell writing the
pidFile. pidFileDir is injected as tempDir; env CLAUDE_WT_PANE_TIMEOUT_MS is set in beforeEach
and cleaned up in afterEach.

---------

Co-authored-by: unraid <local@unraid.local>
2026-05-22 21:06:47 +08:00