Commit Graph

2 Commits

Author SHA1 Message Date
unraid
c17edcb12e feat: Computer Use — Windows 跨平台支持 + GUI 无障碍增强 + Python Bridge
三平台 Computer Use (macOS + Windows + Linux),Windows 专项增强。

- MCP server: toolCalls/tools/executor/mcpServer 等 12 文件完整实现
- 平台抽象层: platforms/{win32,darwin,linux}.ts
- 跨平台 executor: executorCrossPlatform.ts
- CHICAGO_MCP + VOICE_MODE feature flags 启用

- windowMessage.ts: SendMessageW (WM_CHAR Unicode + 剪贴板粘贴)
- windowBorder.ts: 4 叠加窗口边框 (30fps 跟踪)
- uiAutomation.ts: UI Automation 元素树/点击/写值
- accessibilitySnapshot.ts: 无障碍快照 → 模型感知 GUI
- bridge.py + bridgeClient.ts: Python 长驻进程 (替代 per-call PS)

- window_management: min/max/restore/close/focus (Win32 API)
- click_element / type_into_element: 按名称操作 (无需坐标)
- 截图自动附带 Accessibility Snapshot

- 17 种方法, stdin/stdout JSON 通信
- 窗口枚举 1.5ms vs PS 500ms, 截图 360ms vs PS 800ms
- 依赖: mss + Pillow + pywinauto
2026-04-05 15:47:20 +08:00
unraid
3707c3c0ba feat: Windows Computer Use enhancement — PrintWindow, UI Automation, OCR
New Windows-native capabilities:
- windowCapture.ts: PrintWindow API for per-window screenshot (works on
  occluded/background windows)
- windowEnum.ts: EnumWindows for precise window enumeration with HWND
- uiAutomation.ts: IUIAutomation for UI tree reading, element clicking,
  text input, and coordinate-based element identification
- ocr.ts: Windows.Media.Ocr for screen text recognition (en-US + zh-CN)

Updated win32.ts backend to use EnumWindows for listRunning() and added
captureWindowTarget() for window-specific screenshots.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 00:00:02 +08:00