feat: restructure the provider layer (#286)

* refactor: create the @anthropic-ai/model-provider package skeleton and type definitions
  - Add a new workspace package at packages/@anthropic-ai/model-provider
  - Define the ModelProviderHooks interface (dependency injection: analytics, cost, logging, etc.)
  - Define the ClientFactories interface (Anthropic/OpenAI/Gemini/Grok client factories)
  - Move in the core types: the Message type family, NonNullableUsage, EMPTY_USAGE, SystemPrompt, error constants
  - Turn src/types/message.ts and similar files in the main project into re-exports to preserve backward compatibility
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: lift the OpenAI converters and model mapping into the model-provider package
  - Move in OpenAI message conversion (convertMessages), tool conversion (convertTools), and stream adaptation (streamAdapter)
  - Move in the OpenAI and Grok model mappings (resolveOpenAIModel, resolveGrokModel)
  - Turn the main project files into thin re-export proxies
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: move the Gemini compatibility layer into the model-provider package
  - Move in the Gemini type definitions, message conversion, tool conversion, stream adaptation, and model mapping
  - Turn the files under the main project's gemini/ directory into thin re-export proxies
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: move in errorUtils and migrate consumer imports to model-provider
  - Move in formatAPIError, extractConnectionErrorDetails, and the other errorUtils helpers
  - Migrate 10 consumer files to import directly from @anthropic-ai/model-provider
  - Update emptyUsage, sdkUtilityTypes, and systemPromptType to re-export proxies
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: downgrade the compact model by one tier, "-1" mode (Opus→Sonnet, Sonnet→Haiku)
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: add agent-loop diagram
* Revert "feat: downgrade the compact model by one tier, "-1" mode (Opus→Sonnet, Sonnet→Haiku)"
  This reverts commit e458d6391d.
* docs: add simplified agent loop
* fix: fix the n shortcut key closing the app
* fix: fix ws not being bundled under node
* docs: fix links
* test: add test support
* fix: fix type issues (#267) (#271)
* fix: fix the Bun polyfill issue
* fix: type fixes complete
* feat: unify the type files of all packages
* fix: fix build issues
* test: fix type checking (#279)
* fix: fix the Bun polyfill issue
* fix: type fixes complete
* feat: unify the type files of all packages
* fix: fix build issues
* fix(remote-control): harden self-hosted session flows (#278)
  Co-authored-by: chengzifeng <chengzifeng@meituan.com>
* docs: update contributors
* build: add a vite build pipeline
* feat: add environment variable support for overriding the max_tokens setting
* feat(langfuse): record tool definitions in LLM generations
  Convert Anthropic-format tool definitions into the Langfuse-compatible OpenAI format and pass them in the generation's input as a { messages, tools } structure, so that the complete tool definitions can be viewed in the Langfuse UI.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add support for the ACP protocol (#284)
* feat: adapt to the zed acp protocol
* docs: improve the acp documentation
* chore: 1.4.0
* conflict: resolve conflicts
* feat: add test coverage reporting
* style: rename and move folders
* refactor: move test cases and implementations
* test: finish fixing test cases

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Cheng Zi Feng <1154238323@qq.com>
Co-authored-by: chengzifeng <chengzifeng@meituan.com>
Co-authored-by: claude-code-best <272536312+claude-code-best@users.noreply.github.com>
2026-06-15 12:55:51 +00:00 · 2026-04-17 09:33:14 +08:00
parent c8d08d235b
commit bddd146f25
86 changed files with 1661 additions and 1766 deletions
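The commit message describes ModelProviderHooks as a dependency-injection interface for analytics, cost, and logging. The sketch below illustrates that general pattern only, since this diff excerpt does not show the real interface. `ProviderHooks`, `createUsageRecorder`, `Usage`, and `usdPerToken` are hypothetical names chosen for illustration, not the package's API.

```typescript
// Hypothetical sketch of hook-based dependency injection.
// None of these names come from the model-provider package itself.
interface ProviderHooks {
  logEvent(name: string, data: Record<string, unknown>): void
  addCost(usd: number): void
}

interface Usage {
  input_tokens: number
  output_tokens: number
}

// The provider-side code depends only on the hooks interface, so the host
// application decides how logging and cost tracking are actually performed.
function createUsageRecorder(hooks: ProviderHooks, usdPerToken: number) {
  return (usage: Usage): number => {
    const totalTokens = usage.input_tokens + usage.output_tokens
    const cost = totalTokens * usdPerToken
    hooks.addCost(cost)
    hooks.logEvent('usage', { totalTokens })
    return cost
  }
}

// Host side: inject simple in-memory implementations of the hooks.
const events: string[] = []
let spent = 0
const record = createUsageRecorder(
  {
    logEvent: name => {
      events.push(name)
    },
    addCost: usd => {
      spent += usd
    },
  },
  0.5,
)
record({ input_tokens: 10, output_tokens: 5 })
```

Under this assumption, the package never imports the host's analytics or cost modules directly, which is one way to avoid the transitive import chains that the removed `emptyUsage.ts` comment warns about.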
--- a/src/services/api/emptyUsage.ts
+++ b/src/services/api/emptyUsage.ts
@@ -1,22 +1,4 @@
-import type { NonNullableUsage } from '../../entrypoints/sdk/sdkUtilityTypes.js'
-
-/**
- * Zero-initialized usage object. Extracted from logging.ts so that
- * bridge/replBridge.ts can import it without transitively pulling in
- * api/errors.ts → utils/messages.ts → BashTool.tsx → the world.
- */
-export const EMPTY_USAGE: Readonly<NonNullableUsage> = {
-  input_tokens: 0,
-  cache_creation_input_tokens: 0,
-  cache_read_input_tokens: 0,
-  output_tokens: 0,
-  server_tool_use: { web_search_requests: 0, web_fetch_requests: 0 },
-  service_tier: 'standard',
-  cache_creation: {
-    ephemeral_1h_input_tokens: 0,
-    ephemeral_5m_input_tokens: 0,
-  },
-  inference_geo: '',
-  iterations: [],
-  speed: 'standard',
-}
+// Re-export EMPTY_USAGE from @ant/model-provider
+// Kept here for backward compatibility — consumers import from this path.
+export { EMPTY_USAGE } from '@ant/model-provider'
+export type { NonNullableUsage } from '@ant/model-provider'
--- a/src/services/api/errorUtils.ts
+++ b/src/services/api/errorUtils.ts
@@ -1,260 +1,8 @@
-import type { APIError } from '@anthropic-ai/sdk'
-
-// SSL/TLS error codes from OpenSSL (used by both Node.js and Bun)
-// See: https://www.openssl.org/docs/man3.1/man3/X509_STORE_CTX_get_error.html
-const SSL_ERROR_CODES = new Set([
-  // Certificate verification errors
-  'UNABLE_TO_VERIFY_LEAF_SIGNATURE',
-  'UNABLE_TO_GET_ISSUER_CERT',
-  'UNABLE_TO_GET_ISSUER_CERT_LOCALLY',
-  'CERT_SIGNATURE_FAILURE',
-  'CERT_NOT_YET_VALID',
-  'CERT_HAS_EXPIRED',
-  'CERT_REVOKED',
-  'CERT_REJECTED',
-  'CERT_UNTRUSTED',
-  // Self-signed certificate errors
-  'DEPTH_ZERO_SELF_SIGNED_CERT',
-  'SELF_SIGNED_CERT_IN_CHAIN',
-  // Chain errors
-  'CERT_CHAIN_TOO_LONG',
-  'PATH_LENGTH_EXCEEDED',
-  // Hostname/altname errors
-  'ERR_TLS_CERT_ALTNAME_INVALID',
-  'HOSTNAME_MISMATCH',
-  // TLS handshake errors
-  'ERR_TLS_HANDSHAKE_TIMEOUT',
-  'ERR_SSL_WRONG_VERSION_NUMBER',
-  'ERR_SSL_DECRYPTION_FAILED_OR_BAD_RECORD_MAC',
-])
-
-export type ConnectionErrorDetails = {
-  code: string
-  message: string
-  isSSLError: boolean
-}
-
-/**
- * Extracts connection error details from the error cause chain.
- * The Anthropic SDK wraps underlying errors in the `cause` property.
- * This function walks the cause chain to find the root error code/message.
- */
-export function extractConnectionErrorDetails(
-  error: unknown,
-): ConnectionErrorDetails | null {
-  if (!error || typeof error !== 'object') {
-    return null
-  }
-
-  // Walk the cause chain to find the root error with a code
-  let current: unknown = error
-  const maxDepth = 5 // Prevent infinite loops
-  let depth = 0
-
-  while (current && depth < maxDepth) {
-    if (
-      current instanceof Error &&
-      'code' in current &&
-      typeof current.code === 'string'
-    ) {
-      const code = current.code
-      const isSSLError = SSL_ERROR_CODES.has(code)
-      return {
-        code,
-        message: current.message,
-        isSSLError,
-      }
-    }
-
-    // Move to the next cause in the chain
-    if (
-      current instanceof Error &&
-      'cause' in current &&
-      current.cause !== current
-    ) {
-      current = current.cause
-      depth++
-    } else {
-      break
-    }
-  }
-
-  return null
-}
-
-/**
- * Returns an actionable hint for SSL/TLS errors, intended for contexts outside
- * the main API client (OAuth token exchange, preflight connectivity checks)
- * where `formatAPIError` doesn't apply.
- *
- * Motivation: enterprise users behind TLS-intercepting proxies (Zscaler et al.)
- * see OAuth complete in-browser but the CLI's token exchange silently fails
- * with a raw SSL code. Surfacing the likely fix saves a support round-trip.
- */
-export function getSSLErrorHint(error: unknown): string | null {
-  const details = extractConnectionErrorDetails(error)
-  if (!details?.isSSLError) {
-    return null
-  }
-  return `SSL certificate error (${details.code}). If you are behind a corporate proxy or TLS-intercepting firewall, set NODE_EXTRA_CA_CERTS to your CA bundle path, or ask IT to allowlist *.anthropic.com. Run /doctor for details.`
-}
-
-/**
- * Strips HTML content (e.g., CloudFlare error pages) from a message string,
- * returning a user-friendly title or empty string if HTML is detected.
- * Returns the original message unchanged if no HTML is found.
- */
-function sanitizeMessageHTML(message: string): string {
-  if (message.includes('<!DOCTYPE html') || message.includes('<html')) {
-    const titleMatch = message.match(/<title>([^<]+)<\/title>/)
-    if (titleMatch && titleMatch[1]) {
-      return titleMatch[1].trim()
-    }
-    return ''
-  }
-  return message
-}
-
-/**
- * Detects if an error message contains HTML content (e.g., CloudFlare error pages)
- * and returns a user-friendly message instead
- */
-export function sanitizeAPIError(apiError: APIError): string {
-  const message = apiError.message
-  if (!message) {
-    // Sometimes message is undefined
-    // TODO: figure out why
-    return ''
-  }
-  return sanitizeMessageHTML(message)
-}
-
-/**
- * Shapes of deserialized API errors from session JSONL.
- *
- * After JSON round-tripping, the SDK's APIError loses its `.message` property.
- * The actual message lives at different nesting levels depending on the provider:
- *
- * - Bedrock/proxy: `{ error: { message: "..." } }`
- * - Standard Anthropic API: `{ error: { error: { message: "..." } } }`
- *   (the outer `.error` is the response body, the inner `.error` is the API error)
- *
- * See also: `getErrorMessage` in `logging.ts` which handles the same shapes.
- */
-type NestedAPIError = {
-  error?: {
-    message?: string
-    error?: { message?: string }
-  }
-}
-
-function hasNestedError(value: unknown): value is NestedAPIError {
-  return (
-    typeof value === 'object' &&
-    value !== null &&
-    'error' in value &&
-    typeof value.error === 'object' &&
-    value.error !== null
-  )
-}
-
-/**
- * Extract a human-readable message from a deserialized API error that lacks
- * a top-level `.message`.
- *
- * Checks two nesting levels (deeper first for specificity):
- * 1. `error.error.error.message` — standard Anthropic API shape
- * 2. `error.error.message` — Bedrock shape
- */
-function extractNestedErrorMessage(error: APIError): string | null {
-  if (!hasNestedError(error)) {
-    return null
-  }
-
-  // Access `.error` via the narrowed type so TypeScript sees the nested shape
-  // instead of the SDK's `Object | undefined`.
-  const narrowed: NestedAPIError = error
-  const nested = narrowed.error
-
-  // Standard Anthropic API shape: { error: { error: { message } } }
-  const deepMsg = nested?.error?.message
-  if (typeof deepMsg === 'string' && deepMsg.length > 0) {
-    const sanitized = sanitizeMessageHTML(deepMsg)
-    if (sanitized.length > 0) {
-      return sanitized
-    }
-  }
-
-  // Bedrock shape: { error: { message } }
-  const msg = nested?.message
-  if (typeof msg === 'string' && msg.length > 0) {
-    const sanitized = sanitizeMessageHTML(msg)
-    if (sanitized.length > 0) {
-      return sanitized
-    }
-  }
-
-  return null
-}
-
-export function formatAPIError(error: APIError): string {
-  // Extract connection error details from the cause chain
-  const connectionDetails = extractConnectionErrorDetails(error)
-
-  if (connectionDetails) {
-    const { code, isSSLError } = connectionDetails
-
-    // Handle timeout errors
-    if (code === 'ETIMEDOUT') {
-      return 'Request timed out. Check your internet connection and proxy settings'
-    }
-
-    // Handle SSL/TLS errors with specific messages
-    if (isSSLError) {
-      switch (code) {
-        case 'UNABLE_TO_VERIFY_LEAF_SIGNATURE':
-        case 'UNABLE_TO_GET_ISSUER_CERT':
-        case 'UNABLE_TO_GET_ISSUER_CERT_LOCALLY':
-          return 'Unable to connect to API: SSL certificate verification failed. Check your proxy or corporate SSL certificates'
-        case 'CERT_HAS_EXPIRED':
-          return 'Unable to connect to API: SSL certificate has expired'
-        case 'CERT_REVOKED':
-          return 'Unable to connect to API: SSL certificate has been revoked'
-        case 'DEPTH_ZERO_SELF_SIGNED_CERT':
-        case 'SELF_SIGNED_CERT_IN_CHAIN':
-          return 'Unable to connect to API: Self-signed certificate detected. Check your proxy or corporate SSL certificates'
-        case 'ERR_TLS_CERT_ALTNAME_INVALID':
-        case 'HOSTNAME_MISMATCH':
-          return 'Unable to connect to API: SSL certificate hostname mismatch'
-        case 'CERT_NOT_YET_VALID':
-          return 'Unable to connect to API: SSL certificate is not yet valid'
-        default:
-          return `Unable to connect to API: SSL error (${code})`
-      }
-    }
-  }
-
-  if (error.message === 'Connection error.') {
-    // If we have a code but it's not SSL, include it for debugging
-    if (connectionDetails?.code) {
-      return `Unable to connect to API (${connectionDetails.code})`
-    }
-    return 'Unable to connect to API. Check your internet connection'
-  }
-
-  // Guard: when deserialized from JSONL (e.g. --resume), the error object may
-  // be a plain object without a `.message` property.  Return a safe fallback
-  // instead of undefined, which would crash callers that access `.length`.
-  if (!error.message) {
-    return (
-      extractNestedErrorMessage(error) ??
-      `API error (status ${error.status ?? 'unknown'})`
-    )
-  }
-
-  const sanitizedMessage = sanitizeAPIError(error)
-  // Use sanitized message if it's different from the original (i.e., HTML was sanitized)
-  return sanitizedMessage !== error.message && sanitizedMessage.length > 0
-    ? sanitizedMessage
-    : error.message
-}
+// Re-export from @ant/model-provider
+export {
+  formatAPIError,
+  extractConnectionErrorDetails,
+  sanitizeAPIError,
+  getSSLErrorHint,
+  type ConnectionErrorDetails,
+} from '@ant/model-provider'
--- a/src/services/api/gemini/tests/convertMessages.test.ts
+++ b/src/services/api/gemini/tests/convertMessages.test.ts
@@ -1,267 +0,0 @@
-import { describe, expect, test } from 'bun:test'
-import type {
-  AssistantMessage,
-  UserMessage,
-} from '../../../../types/message.js'
-import { anthropicMessagesToGemini } from '../convertMessages.js'
-
-function makeUserMsg(content: string | any[]): UserMessage {
-  return {
-    type: 'user',
-    uuid: '00000000-0000-0000-0000-000000000000',
-    message: { role: 'user', content },
-  } as UserMessage
-}
-
-function makeAssistantMsg(content: string | any[]): AssistantMessage {
-  return {
-    type: 'assistant',
-    uuid: '00000000-0000-0000-0000-000000000001',
-    message: { role: 'assistant', content },
-  } as AssistantMessage
-}
-
-describe('anthropicMessagesToGemini', () => {
-  test('converts system prompt to systemInstruction', () => {
-    const result = anthropicMessagesToGemini(
-      [makeUserMsg('hello')],
-      ['You are helpful.'] as any,
-    )
-
-    expect(result.systemInstruction).toEqual({
-      parts: [{ text: 'You are helpful.' }],
-    })
-  })
-
-  test('converts assistant tool_use to functionCall', () => {
-    const result = anthropicMessagesToGemini(
-      [
-        makeAssistantMsg([
-          {
-            type: 'tool_use',
-            id: 'toolu_123',
-            name: 'bash',
-            input: { command: 'ls' },
-            _geminiThoughtSignature: 'sig-tool',
-          },
-        ]),
-      ],
-      [] as any,
-    )
-
-    expect(result.contents).toEqual([
-      {
-        role: 'model',
-        parts: [
-          {
-            functionCall: {
-              name: 'bash',
-              args: { command: 'ls' },
-            },
-            thoughtSignature: 'sig-tool',
-          },
-        ],
-      },
-    ])
-  })
-
-  test('converts tool_result to functionResponse using prior tool name', () => {
-    const result = anthropicMessagesToGemini(
-      [
-        makeAssistantMsg([
-          {
-            type: 'tool_use',
-            id: 'toolu_123',
-            name: 'bash',
-            input: { command: 'ls' },
-          },
-        ]),
-        makeUserMsg([
-          {
-            type: 'tool_result',
-            tool_use_id: 'toolu_123',
-            content: 'file.txt',
-          },
-        ]),
-      ],
-      [] as any,
-    )
-
-    expect(result.contents[1]).toEqual({
-      role: 'user',
-      parts: [
-        {
-          functionResponse: {
-            name: 'bash',
-            response: {
-              result: 'file.txt',
-            },
-          },
-        },
-      ],
-    })
-  })
-
-  test('converts thinking blocks with signatures', () => {
-    const result = anthropicMessagesToGemini(
-      [
-        makeAssistantMsg([
-          {
-            type: 'thinking',
-            thinking: 'internal reasoning',
-            signature: 'sig-thinking',
-          },
-          {
-            type: 'text',
-            text: 'visible answer',
-          },
-        ]),
-      ],
-      [] as any,
-    )
-
-    expect(result.contents[0]).toEqual({
-      role: 'model',
-      parts: [
-        {
-          text: 'internal reasoning',
-          thought: true,
-          thoughtSignature: 'sig-thinking',
-        },
-        {
-          text: 'visible answer',
-        },
-      ],
-    })
-  })
-
-  test('filters empty assistant text and signature-only thinking parts', () => {
-    const result = anthropicMessagesToGemini(
-      [
-        makeAssistantMsg([
-          {
-            type: 'text',
-            text: '',
-            _geminiThoughtSignature: 'sig-empty-text',
-          },
-          {
-            type: 'thinking',
-            thinking: '',
-            signature: 'sig-empty-thinking',
-          },
-          {
-            type: 'tool_use',
-            id: 'toolu_123',
-            name: 'bash',
-            input: { command: 'pwd' },
-          },
-        ]),
-      ],
-      [] as any,
-    )
-
-    expect(result.contents).toEqual([
-      {
-        role: 'model',
-        parts: [
-          {
-            functionCall: {
-              name: 'bash',
-              args: { command: 'pwd' },
-            },
-          },
-        ],
-      },
-    ])
-  })
-
-  test('filters empty user text blocks', () => {
-    const result = anthropicMessagesToGemini(
-      [
-        makeUserMsg([
-          {
-            type: 'text',
-            text: '',
-          },
-          {
-            type: 'text',
-            text: 'hello',
-          },
-        ]),
-      ],
-      [] as any,
-    )
-
-    expect(result.contents).toEqual([
-      {
-        role: 'user',
-        parts: [{ text: 'hello' }],
-      },
-    ])
-  })
-
-  test('converts base64 image to inlineData', () => {
-    const result = anthropicMessagesToGemini(
-      [makeUserMsg([
-        { type: 'text', text: 'describe this' },
-        {
-          type: 'image',
-          source: {
-            type: 'base64',
-            media_type: 'image/png',
-            data: 'iVBORw0KGgo=',
-          },
-        },
-      ])],
-      [] as any,
-    )
-    expect(result.contents).toEqual([
-      {
-        role: 'user',
-        parts: [
-          { text: 'describe this' },
-          { inlineData: { mimeType: 'image/png', data: 'iVBORw0KGgo=' } },
-        ],
-      },
-    ])
-  })
-
-  test('converts url image to text fallback', () => {
-    const result = anthropicMessagesToGemini(
-      [makeUserMsg([
-        {
-          type: 'image',
-          source: {
-            type: 'url',
-            url: 'https://example.com/img.png',
-          },
-        },
-      ])],
-      [] as any,
-    )
-    expect(result.contents).toEqual([
-      {
-        role: 'user',
-        parts: [{ text: '[image: https://example.com/img.png]' }],
-      },
-    ])
-  })
-
-  test('defaults to image/png when media_type is missing', () => {
-    const result = anthropicMessagesToGemini(
-      [makeUserMsg([
-        {
-          type: 'image',
-          source: {
-            type: 'base64',
-            data: 'ABC123',
-          },
-        },
-      ])],
-      [] as any,
-    )
-    expect(result.contents[0].parts[0]).toEqual({
-      inlineData: { mimeType: 'image/png', data: 'ABC123' },
-    })
-  })
-})
--- a/src/services/api/gemini/tests/convertTools.test.ts
+++ b/src/services/api/gemini/tests/convertTools.test.ts
@@ -1,130 +0,0 @@
-import { describe, expect, test } from 'bun:test'
-import {
-  anthropicToolChoiceToGemini,
-  anthropicToolsToGemini,
-} from '../convertTools.js'
-
-describe('anthropicToolsToGemini', () => {
-  test('converts basic tool to parametersJsonSchema', () => {
-    const tools = [
-      {
-        type: 'custom',
-        name: 'bash',
-        description: 'Run a bash command',
-        input_schema: {
-          type: 'object',
-          properties: { command: { type: 'string' } },
-          required: ['command'],
-        },
-      },
-    ]
-
-    expect(anthropicToolsToGemini(tools as any)).toEqual([
-      {
-        functionDeclarations: [
-          {
-            name: 'bash',
-            description: 'Run a bash command',
-            parametersJsonSchema: {
-              type: 'object',
-              properties: { command: { type: 'string' } },
-              propertyOrdering: ['command'],
-              required: ['command'],
-            },
-          },
-        ],
-      },
-    ])
-  })
-
-  test('sanitizes unsupported JSON Schema fields for Gemini', () => {
-    const tools = [
-      {
-        type: 'custom',
-        name: 'complex',
-        description: 'Complex schema',
-        input_schema: {
-          $schema: 'http://json-schema.org/draft-07/schema#',
-          type: 'object',
-          additionalProperties: false,
-          propertyNames: { pattern: '^[a-z]+$' },
-          properties: {
-            mode: { const: 'strict' },
-            retries: {
-              type: 'integer',
-              exclusiveMinimum: 0,
-            },
-            metadata: {
-              type: 'object',
-              additionalProperties: {
-                type: 'string',
-                propertyNames: { pattern: '^[a-z]+$' },
-              },
-            },
-          },
-          required: ['mode'],
-        },
-      },
-    ]
-
-    expect(anthropicToolsToGemini(tools as any)).toEqual([
-      {
-        functionDeclarations: [
-          {
-            name: 'complex',
-            description: 'Complex schema',
-            parametersJsonSchema: {
-              type: 'object',
-              additionalProperties: false,
-              properties: {
-                mode: {
-                  type: 'string',
-                  enum: ['strict'],
-                },
-                retries: {
-                  type: 'integer',
-                  minimum: 0,
-                },
-                metadata: {
-                  type: 'object',
-                  additionalProperties: {
-                    type: 'string',
-                  },
-                },
-              },
-              propertyOrdering: ['mode', 'retries', 'metadata'],
-              required: ['mode'],
-            },
-          },
-        ],
-      },
-    ])
-  })
-
-  test('returns empty array when no tools are provided', () => {
-    expect(anthropicToolsToGemini([])).toEqual([])
-  })
-})
-
-describe('anthropicToolChoiceToGemini', () => {
-  test('maps auto', () => {
-    expect(anthropicToolChoiceToGemini({ type: 'auto' })).toEqual({
-      mode: 'AUTO',
-    })
-  })
-
-  test('maps any', () => {
-    expect(anthropicToolChoiceToGemini({ type: 'any' })).toEqual({
-      mode: 'ANY',
-    })
-  })
-
-  test('maps explicit tool choice', () => {
-    expect(
-      anthropicToolChoiceToGemini({ type: 'tool', name: 'bash' }),
-    ).toEqual({
-      mode: 'ANY',
-      allowedFunctionNames: ['bash'],
-    })
-  })
-})
--- a/src/services/api/gemini/tests/modelMapping.test.ts
+++ b/src/services/api/gemini/tests/modelMapping.test.ts
@@ -1,100 +0,0 @@
-import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
-import { resolveGeminiModel } from '../modelMapping.js'
-
-describe('resolveGeminiModel', () => {
-  const originalEnv = {
-    GEMINI_MODEL: process.env.GEMINI_MODEL,
-    GEMINI_DEFAULT_HAIKU_MODEL: process.env.GEMINI_DEFAULT_HAIKU_MODEL,
-    GEMINI_DEFAULT_SONNET_MODEL: process.env.GEMINI_DEFAULT_SONNET_MODEL,
-    GEMINI_DEFAULT_OPUS_MODEL: process.env.GEMINI_DEFAULT_OPUS_MODEL,
-    ANTHROPIC_DEFAULT_HAIKU_MODEL: process.env.ANTHROPIC_DEFAULT_HAIKU_MODEL,
-    ANTHROPIC_DEFAULT_SONNET_MODEL: process.env.ANTHROPIC_DEFAULT_SONNET_MODEL,
-    ANTHROPIC_DEFAULT_OPUS_MODEL: process.env.ANTHROPIC_DEFAULT_OPUS_MODEL,
-  }
-
-  beforeEach(() => {
-    delete process.env.GEMINI_MODEL
-    delete process.env.GEMINI_DEFAULT_HAIKU_MODEL
-    delete process.env.GEMINI_DEFAULT_SONNET_MODEL
-    delete process.env.GEMINI_DEFAULT_OPUS_MODEL
-    delete process.env.ANTHROPIC_DEFAULT_HAIKU_MODEL
-    delete process.env.ANTHROPIC_DEFAULT_SONNET_MODEL
-    delete process.env.ANTHROPIC_DEFAULT_OPUS_MODEL
-  })
-
-  afterEach(() => {
-    Object.assign(process.env, originalEnv)
-  })
-
-  test('GEMINI_MODEL env var overrides family mappings', () => {
-    process.env.GEMINI_MODEL = 'gemini-2.5-pro'
-    process.env.ANTHROPIC_DEFAULT_SONNET_MODEL = 'gemini-2.5-flash'
-
-    expect(resolveGeminiModel('claude-sonnet-4-6')).toBe('gemini-2.5-pro')
-  })
-
-  test('GEMINI_DEFAULT_*_MODEL takes precedence over ANTHROPIC_DEFAULT_*', () => {
-    process.env.GEMINI_DEFAULT_SONNET_MODEL = 'gemini-2.5-flash-priority'
-    process.env.ANTHROPIC_DEFAULT_SONNET_MODEL = 'gemini-2.5-flash-fallback'
-
-    expect(resolveGeminiModel('claude-sonnet-4-6')).toBe(
-      'gemini-2.5-flash-priority',
-    )
-  })
-
-  test('resolves sonnet model from GEMINI_DEFAULT_SONNET_MODEL', () => {
-    process.env.GEMINI_DEFAULT_SONNET_MODEL = 'gemini-2.5-flash'
-    expect(resolveGeminiModel('claude-sonnet-4-6')).toBe('gemini-2.5-flash')
-  })
-
-  test('resolves haiku model from GEMINI_DEFAULT_HAIKU_MODEL', () => {
-    process.env.GEMINI_DEFAULT_HAIKU_MODEL = 'gemini-2.5-flash-lite'
-    expect(resolveGeminiModel('claude-haiku-4-5-20251001')).toBe(
-      'gemini-2.5-flash-lite',
-    )
-  })
-
-  test('resolves opus model from GEMINI_DEFAULT_OPUS_MODEL', () => {
-    process.env.GEMINI_DEFAULT_OPUS_MODEL = 'gemini-2.5-pro'
-    expect(resolveGeminiModel('claude-opus-4-6')).toBe('gemini-2.5-pro')
-  })
-
-  test('falls back to ANTHROPIC_DEFAULT_* when GEMINI_DEFAULT_* not set', () => {
-    process.env.ANTHROPIC_DEFAULT_SONNET_MODEL = 'gemini-2.5-flash'
-    expect(resolveGeminiModel('claude-sonnet-4-6')).toBe('gemini-2.5-flash')
-  })
-
-  test('resolves haiku from ANTHROPIC_DEFAULT_HAIKU_MODEL as fallback', () => {
-    process.env.ANTHROPIC_DEFAULT_HAIKU_MODEL = 'gemini-2.5-flash-lite'
-    expect(resolveGeminiModel('claude-haiku-4-5-20251001')).toBe(
-      'gemini-2.5-flash-lite',
-    )
-  })
-
-  test('resolves opus from ANTHROPIC_DEFAULT_OPUS_MODEL as fallback', () => {
-    process.env.ANTHROPIC_DEFAULT_OPUS_MODEL = 'gemini-2.5-pro'
-    expect(resolveGeminiModel('claude-opus-4-6')).toBe('gemini-2.5-pro')
-  })
-
-  test('uses backward compatible family override', () => {
-    process.env.ANTHROPIC_DEFAULT_SONNET_MODEL = 'legacy-gemini-sonnet'
-    expect(resolveGeminiModel('claude-sonnet-4-6')).toBe('legacy-gemini-sonnet')
-  })
-
-  test('strips [1m] suffix before resolving', () => {
-    process.env.GEMINI_DEFAULT_SONNET_MODEL = 'gemini-2.5-flash'
-    expect(resolveGeminiModel('claude-sonnet-4-6[1m]')).toBe('gemini-2.5-flash')
-  })
-
-  test('passes through explicit Gemini model names', () => {
-    expect(resolveGeminiModel('gemini-3.1-flash-lite-preview')).toBe(
-      'gemini-3.1-flash-lite-preview',
-    )
-  })
-
-  test('throws when no Gemini model configuration is available', () => {
-    expect(() => resolveGeminiModel('claude-sonnet-4-6')).toThrow(
-      'Gemini provider requires GEMINI_MODEL or GEMINI_DEFAULT_SONNET_MODEL (or ANTHROPIC_DEFAULT_SONNET_MODEL for backward compatibility) to be configured.',
-    )
-  })
-})
--- a/src/services/api/gemini/tests/streamAdapter.test.ts
+++ b/src/services/api/gemini/tests/streamAdapter.test.ts
@@ -1,175 +0,0 @@
-import { describe, expect, test } from 'bun:test'
-import { adaptGeminiStreamToAnthropic } from '../streamAdapter.js'
-import type { GeminiStreamChunk } from '../types.js'
-
-function mockStream(
-  chunks: GeminiStreamChunk[],
-): AsyncIterable<GeminiStreamChunk> {
-  return {
-    [Symbol.asyncIterator]() {
-      let index = 0
-      return {
-        async next() {
-          if (index >= chunks.length) {
-            return { done: true, value: undefined }
-          }
-          return { done: false, value: chunks[index++] }
-        },
-      }
-    },
-  }
-}
-
-async function collectEvents(chunks: GeminiStreamChunk[]) {
-  const events: any[] = []
-  for await (const event of adaptGeminiStreamToAnthropic(
-    mockStream(chunks),
-    'gemini-2.5-flash',
-  )) {
-    events.push(event)
-  }
-  return events
-}
-
-describe('adaptGeminiStreamToAnthropic', () => {
-  test('converts text chunks', async () => {
-    const events = await collectEvents([
-      {
-        candidates: [
-          {
-            content: {
-              parts: [{ text: 'Hello' }],
-            },
-          },
-        ],
-      },
-      {
-        candidates: [
-          {
-            content: {
-              parts: [{ text: ' world' }],
-            },
-            finishReason: 'STOP',
-          },
-        ],
-      },
-    ])
-
-    const textDeltas = events.filter(
-      event =>
-        event.type === 'content_block_delta' && event.delta.type === 'text_delta',
-    )
-
-    expect(events[0].type).toBe('message_start')
-    expect(textDeltas).toHaveLength(2)
-    expect(textDeltas[0].delta.text).toBe('Hello')
-    expect(textDeltas[1].delta.text).toBe(' world')
-
-    const messageDelta = events.find(event => event.type === 'message_delta')
-    expect(messageDelta.delta.stop_reason).toBe('end_turn')
-  })
-
-  test('converts thinking chunks and signatures', async () => {
-    const events = await collectEvents([
-      {
-        candidates: [
-          {
-            content: {
-              parts: [{ text: 'Think', thought: true }],
-            },
-          },
-        ],
-      },
-      {
-        candidates: [
-          {
-            content: {
-              parts: [{ thought: true, thoughtSignature: 'sig-123' }],
-            },
-            finishReason: 'STOP',
-          },
-        ],
-      },
-    ])
-
-    const blockStart = events.find(event => event.type === 'content_block_start')
-    expect(blockStart.content_block.type).toBe('thinking')
-
-    const signatureDelta = events.find(
-      event =>
-        event.type === 'content_block_delta' &&
-        event.delta.type === 'signature_delta',
-    )
-    expect(signatureDelta.delta.signature).toBe('sig-123')
-  })
-
-  test('converts function calls to tool_use blocks', async () => {
-    const events = await collectEvents([
-      {
-        candidates: [
-          {
-            content: {
-              parts: [
-                {
-                  functionCall: {
-                    name: 'bash',
-                    args: { command: 'ls' },
-                  },
-                  thoughtSignature: 'sig-tool',
-                },
-              ],
-            },
-            finishReason: 'STOP',
-          },
-        ],
-      },
-    ])
-
-    const blockStart = events.find(event => event.type === 'content_block_start')
-    expect(blockStart.content_block.type).toBe('tool_use')
-    expect(blockStart.content_block.name).toBe('bash')
-
-    const signatureDelta = events.find(
-      event =>
-        event.type === 'content_block_delta' &&
-        event.delta.type === 'signature_delta',
-    )
-    expect(signatureDelta.delta.signature).toBe('sig-tool')
-
-    const inputDelta = events.find(
-      event =>
-        event.type === 'content_block_delta' &&
-        event.delta.type === 'input_json_delta',
-    )
-    expect(inputDelta.delta.partial_json).toBe('{"command":"ls"}')
-
-    const messageDelta = events.find(event => event.type === 'message_delta')
-    expect(messageDelta.delta.stop_reason).toBe('tool_use')
-  })
-
-  test('maps usage metadata into output tokens', async () => {
-    const events = await collectEvents([
-      {
-        candidates: [
-          {
-            content: {
-              parts: [{ text: 'Hello' }],
-            },
-            finishReason: 'STOP',
-          },
-        ],
-        usageMetadata: {
-          promptTokenCount: 10,
-          candidatesTokenCount: 5,
-          thoughtsTokenCount: 2,
-        },
-      },
-    ])
-
-    const messageStart = events.find(event => event.type === 'message_start')
-    expect(messageStart.message.usage.input_tokens).toBe(10)
-
-    const messageDelta = events.find(event => event.type === 'message_delta')
-    expect(messageDelta.usage.output_tokens).toBe(7)
-  })
-})
--- a/src/services/api/gemini/client.ts
+++ b/src/services/api/gemini/client.ts
@@ -4,7 +4,7 @@ import { getProxyFetchOptions } from 'src/utils/proxy.js'
 import type {
  GeminiGenerateContentRequest,
  GeminiStreamChunk,
-} from './types.js'
+} from '@ant/model-provider'

 const DEFAULT_GEMINI_BASE_URL =
  'https://generativelanguage.googleapis.com/v1beta'
--- a/src/services/api/gemini/convertMessages.ts
+++ b/src/services/api/gemini/convertMessages.ts
@@ -1,298 +0,0 @@
-import type {
-  BetaToolResultBlockParam,
-  BetaToolUseBlock,
-} from '@anthropic-ai/sdk/resources/beta/messages/messages.mjs'
-import type { AssistantMessage, UserMessage } from '../../../types/message.js'
-import { safeParseJSON } from '../../../utils/json.js'
-import type { SystemPrompt } from '../../../utils/systemPromptType.js'
-import {
-  GEMINI_THOUGHT_SIGNATURE_FIELD,
-  type GeminiContent,
-  type GeminiGenerateContentRequest,
-  type GeminiPart,
-} from './types.js'
-
-export function anthropicMessagesToGemini(
-  messages: (UserMessage | AssistantMessage)[],
-  systemPrompt: SystemPrompt,
-): Pick<GeminiGenerateContentRequest, 'contents' | 'systemInstruction'> {
-  const contents: GeminiContent[] = []
-  const toolNamesById = new Map<string, string>()
-
-  for (const msg of messages) {
-    if (msg.type === 'assistant') {
-      const content = convertInternalAssistantMessage(msg)
-      if (content.parts.length > 0) {
-        contents.push(content)
-      }
-
-      const assistantContent = msg.message.content
-      if (Array.isArray(assistantContent)) {
-        for (const block of assistantContent) {
-          if (typeof block !== 'string' && block.type === 'tool_use') {
-            toolNamesById.set(block.id, block.name)
-          }
-        }
-      }
-      continue
-    }
-
-    if (msg.type === 'user') {
-      const content = convertInternalUserMessage(msg, toolNamesById)
-      if (content.parts.length > 0) {
-        contents.push(content)
-      }
-    }
-  }
-
-  const systemText = systemPromptToText(systemPrompt)
-
-  return {
-    contents,
-    ...(systemText
-      ? {
-          systemInstruction: {
-            parts: [{ text: systemText }],
-          },
-        }
-      : {}),
-  }
-}
-
-function systemPromptToText(systemPrompt: SystemPrompt): string {
-  if (!systemPrompt || systemPrompt.length === 0) return ''
-  return systemPrompt.filter(Boolean).join('\n\n')
-}
-
-function convertInternalUserMessage(
-  msg: UserMessage,
-  toolNamesById: ReadonlyMap<string, string>,
-): GeminiContent {
-  const content = msg.message.content
-
-  if (typeof content === 'string') {
-    return {
-      role: 'user',
-      parts: createTextGeminiParts(content),
-    }
-  }
-
-  if (!Array.isArray(content)) {
-    return { role: 'user', parts: [] }
-  }
-
-  return {
-    role: 'user',
-    parts: content.flatMap(block =>
-      convertUserContentBlockToGeminiParts(block as unknown as string | Record<string, unknown>, toolNamesById),
-    ),
-  }
-}
-
-function convertUserContentBlockToGeminiParts(
-  block: string | Record<string, unknown>,
-  toolNamesById: ReadonlyMap<string, string>,
-): GeminiPart[] {
-  if (typeof block === 'string') {
-    return createTextGeminiParts(block)
-  }
-
-  if (block.type === 'text') {
-    return createTextGeminiParts(block.text)
-  }
-
-  if (block.type === 'tool_result') {
-    const toolResult = block as unknown as BetaToolResultBlockParam
-    return [
-      {
-        functionResponse: {
-          name: toolNamesById.get(toolResult.tool_use_id) ?? toolResult.tool_use_id,
-          response: toolResultToResponseObject(toolResult),
-        },
-      },
-    ]
-  }
-
-  // Convert an Anthropic image block into Gemini inlineData
-  if (block.type === 'image') {
-    const source = block.source as Record<string, unknown> | undefined
-    if (source?.type === 'base64' && typeof source.data === 'string') {
-      const mediaType = (source.media_type as string) || 'image/png'
-      return [
-        {
-          inlineData: {
-            mimeType: mediaType,
-            data: source.data,
-          },
-        },
-      ]
-    }
-    // Gemini does not support URL-sourced images directly, so fall back to a text description
-    if (source?.type === 'url' && typeof source.url === 'string') {
-      return createTextGeminiParts(`[image: ${source.url}]`)
-    }
-  }
-
-  return []
-}
-
-function convertInternalAssistantMessage(msg: AssistantMessage): GeminiContent {
-  const content = msg.message.content
-
-  if (typeof content === 'string') {
-    return {
-      role: 'model',
-      parts: createTextGeminiParts(content),
-    }
-  }
-
-  if (!Array.isArray(content)) {
-    return { role: 'model', parts: [] }
-  }
-
-  const parts: GeminiPart[] = []
-  for (const block of content) {
-    if (typeof block === 'string') {
-      parts.push(...createTextGeminiParts(block))
-      continue
-    }
-
-    if (block.type === 'text') {
-      parts.push(
-        ...createTextGeminiParts(
-          block.text,
-          getGeminiThoughtSignature(block as unknown as Record<string, unknown>),
-        ),
-      )
-      continue
-    }
-
-    if (block.type === 'thinking') {
-      const thinkingPart = createThinkingGeminiPart(
-        block.thinking,
-        block.signature,
-      )
-      if (thinkingPart) {
-        parts.push(thinkingPart)
-      }
-      continue
-    }
-
-    if (block.type === 'tool_use') {
-      const toolUse = block as unknown as BetaToolUseBlock
-      parts.push({
-        functionCall: {
-          name: toolUse.name,
-          args: normalizeToolUseInput(toolUse.input),
-        },
-        ...(getGeminiThoughtSignature(block as unknown as Record<string, unknown>) && {
-          thoughtSignature: getGeminiThoughtSignature(block as unknown as Record<string, unknown>),
-        }),
-      })
-    }
-  }
-
-  return { role: 'model', parts }
-}
-
-function createTextGeminiParts(
-  value: unknown,
-  thoughtSignature?: string,
-): GeminiPart[] {
-  if (typeof value !== 'string' || value.length === 0) {
-    return []
-  }
-
-  return [
-    {
-      text: value,
-      ...(thoughtSignature && { thoughtSignature }),
-    },
-  ]
-}
-
-function createThinkingGeminiPart(
-  value: unknown,
-  thoughtSignature?: string,
-): GeminiPart | undefined {
-  if (typeof value !== 'string' || value.length === 0) {
-    return undefined
-  }
-
-  return {
-    text: value,
-    thought: true,
-    ...(thoughtSignature && { thoughtSignature }),
-  }
-}
-
-function normalizeToolUseInput(input: unknown): Record<string, unknown> {
-  if (typeof input === 'string') {
-    const parsed = safeParseJSON(input)
-    if (parsed && typeof parsed === 'object' && !Array.isArray(parsed)) {
-      return parsed as Record<string, unknown>
-    }
-    return parsed === null ? {} : { value: parsed }
-  }
-
-  if (input && typeof input === 'object' && !Array.isArray(input)) {
-    return input as Record<string, unknown>
-  }
-
-  return input === undefined ? {} : { value: input }
-}
-
-function toolResultToResponseObject(
-  block: BetaToolResultBlockParam,
-): Record<string, unknown> {
-  const result = normalizeToolResultContent(block.content)
-  if (
-    result &&
-    typeof result === 'object' &&
-    !Array.isArray(result)
-  ) {
-    return block.is_error ? { ...(result as Record<string, unknown>), is_error: true } : result as Record<string, unknown>
-  }
-
-  return {
-    result,
-    ...(block.is_error ? { is_error: true } : {}),
-  }
-}
-
-function normalizeToolResultContent(content: unknown): unknown {
-  if (typeof content === 'string') {
-    const parsed = safeParseJSON(content)
-    return parsed ?? content
-  }
-
-  if (Array.isArray(content)) {
-    const text = content
-      .map(part => {
-        if (typeof part === 'string') return part
-        if (
-          part &&
-          typeof part === 'object' &&
-          'text' in part &&
-          typeof part.text === 'string'
-        ) {
-          return part.text
-        }
-        return ''
-      })
-      .filter(Boolean)
-      .join('\n')
-
-    const parsed = safeParseJSON(text)
-    return parsed ?? text
-  }
-
-  return content ?? ''
-}
-
-function getGeminiThoughtSignature(block: Record<string, unknown>): string | undefined {
-  const signature = block[GEMINI_THOUGHT_SIGNATURE_FIELD]
-  return typeof signature === 'string' && signature.length > 0
-    ? signature
-    : undefined
-}
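Review note: the removed converter names each `functionResponse` by looking up the `tool_use` id recorded from earlier assistant turns, and falls back to the raw id when no call was seen. The following is a minimal sketch of that lookup only, assuming simplified block shapes. `ToolUse`, `ToolResult`, `indexToolNames`, and `toFunctionResponse` are hypothetical names, and the response payload is wrapped as `{ result }` without the JSON parsing the original performs.

```typescript
// Hypothetical, simplified block shapes for illustration only.
type ToolUse = { type: 'tool_use'; id: string; name: string }
type ToolResult = { type: 'tool_result'; tool_use_id: string; content: string }

// Record tool_use id -> tool name while scanning assistant blocks.
function indexToolNames(blocks: ToolUse[]): Map<string, string> {
  const byId = new Map<string, string>()
  for (const block of blocks) byId.set(block.id, block.name)
  return byId
}

// Build a Gemini-style functionResponse part; use the raw id if the call is unknown.
function toFunctionResponse(
  result: ToolResult,
  byId: ReadonlyMap<string, string>,
) {
  return {
    functionResponse: {
      name: byId.get(result.tool_use_id) ?? result.tool_use_id,
      response: { result: result.content },
    },
  }
}

const names = indexToolNames([{ type: 'tool_use', id: 'toolu_1', name: 'bash' }])
const known = toFunctionResponse(
  { type: 'tool_result', tool_use_id: 'toolu_1', content: 'ok' },
  names,
)
const unknown = toFunctionResponse(
  { type: 'tool_result', tool_use_id: 'toolu_9', content: 'ok' },
  names,
)
```

Here `known` is named after the tool (`bash`), while `unknown` keeps the id because no matching call was indexed.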
--- a/src/services/api/gemini/convertTools.ts
+++ b/src/services/api/gemini/convertTools.ts
@@ -1,285 +0,0 @@
-import type { BetaToolUnion } from '@anthropic-ai/sdk/resources/beta/messages/messages.mjs'
-import type {
-  GeminiFunctionCallingConfig,
-  GeminiTool,
-} from './types.js'
-
-const GEMINI_JSON_SCHEMA_TYPES = new Set([
-  'string',
-  'number',
-  'integer',
-  'boolean',
-  'object',
-  'array',
-  'null',
-])
-
-function normalizeGeminiJsonSchemaType(
-  value: unknown,
-): string | string[] | undefined {
-  if (typeof value === 'string') {
-    return GEMINI_JSON_SCHEMA_TYPES.has(value) ? value : undefined
-  }
-
-  if (Array.isArray(value)) {
-    const normalized = value.filter(
-      (item): item is string =>
-        typeof item === 'string' && GEMINI_JSON_SCHEMA_TYPES.has(item),
-    )
-    const unique = Array.from(new Set(normalized))
-    if (unique.length === 0) return undefined
-    return unique.length === 1 ? unique[0] : unique
-  }
-
-  return undefined
-}
-
-function inferGeminiJsonSchemaTypeFromValue(value: unknown): string | undefined {
-  if (value === null) return 'null'
-  if (Array.isArray(value)) return 'array'
-  if (typeof value === 'string') return 'string'
-  if (typeof value === 'boolean') return 'boolean'
-  if (typeof value === 'number') {
-    return Number.isInteger(value) ? 'integer' : 'number'
-  }
-  if (typeof value === 'object') return 'object'
-  return undefined
-}
-
-function inferGeminiJsonSchemaTypeFromEnum(
-  values: unknown[],
-): string | string[] | undefined {
-  const inferred = values
-    .map(inferGeminiJsonSchemaTypeFromValue)
-    .filter((value): value is string => value !== undefined)
-  const unique = Array.from(new Set(inferred))
-  if (unique.length === 0) return undefined
-  return unique.length === 1 ? unique[0] : unique
-}
-
-function addNullToGeminiJsonSchemaType(
-  value: string | string[] | undefined,
-): string | string[] | undefined {
-  if (value === undefined) return ['null']
-  if (Array.isArray(value)) {
-    return value.includes('null') ? value : [...value, 'null']
-  }
-  return value === 'null' ? value : [value, 'null']
-}
-
-function sanitizeGeminiJsonSchemaProperties(
-  value: unknown,
-): Record<string, Record<string, unknown>> | undefined {
-  if (!value || typeof value !== 'object' || Array.isArray(value)) {
-    return undefined
-  }
-
-  const sanitizedEntries = Object.entries(value as Record<string, unknown>)
-    .map(([key, schema]) => [key, sanitizeGeminiJsonSchema(schema)] as const)
-    .filter(([, schema]) => Object.keys(schema).length > 0)
-
-  if (sanitizedEntries.length === 0) {
-    return undefined
-  }
-
-  return Object.fromEntries(sanitizedEntries)
-}
-
-function sanitizeGeminiJsonSchemaArray(
-  value: unknown,
-): Record<string, unknown>[] | undefined {
-  if (!Array.isArray(value)) return undefined
-
-  const sanitized = value
-    .map(item => sanitizeGeminiJsonSchema(item))
-    .filter(item => Object.keys(item).length > 0)
-
-  return sanitized.length > 0 ? sanitized : undefined
-}
-
-function sanitizeGeminiJsonSchema(
-  schema: unknown,
-): Record<string, unknown> {
-  if (!schema || typeof schema !== 'object' || Array.isArray(schema)) {
-    return {}
-  }
-
-  const source = schema as Record<string, unknown>
-  const result: Record<string, unknown> = {}
-
-  let type = normalizeGeminiJsonSchemaType(source.type)
-
-  if (source.const !== undefined) {
-    result.enum = [source.const]
-    type = type ?? inferGeminiJsonSchemaTypeFromValue(source.const)
-  } else if (Array.isArray(source.enum) && source.enum.length > 0) {
-    result.enum = source.enum
-    type = type ?? inferGeminiJsonSchemaTypeFromEnum(source.enum)
-  }
-
-  if (!type) {
-    if (source.properties && typeof source.properties === 'object') {
-      type = 'object'
-    } else if (source.items !== undefined || source.prefixItems !== undefined) {
-      type = 'array'
-    }
-  }
-
-  if (source.nullable === true) {
-    type = addNullToGeminiJsonSchemaType(type)
-  }
-
-  if (type) {
-    result.type = type
-  }
-
-  if (typeof source.title === 'string') {
-    result.title = source.title
-  }
-  if (typeof source.description === 'string') {
-    result.description = source.description
-  }
-  if (typeof source.format === 'string') {
-    result.format = source.format
-  }
-  if (typeof source.pattern === 'string') {
-    result.pattern = source.pattern
-  }
-  if (typeof source.minimum === 'number') {
-    result.minimum = source.minimum
-  } else if (typeof source.exclusiveMinimum === 'number') {
-    result.minimum = source.exclusiveMinimum
-  }
-  if (typeof source.maximum === 'number') {
-    result.maximum = source.maximum
-  } else if (typeof source.exclusiveMaximum === 'number') {
-    result.maximum = source.exclusiveMaximum
-  }
-  if (typeof source.minItems === 'number') {
-    result.minItems = source.minItems
-  }
-  if (typeof source.maxItems === 'number') {
-    result.maxItems = source.maxItems
-  }
-  if (typeof source.minLength === 'number') {
-    result.minLength = source.minLength
-  }
-  if (typeof source.maxLength === 'number') {
-    result.maxLength = source.maxLength
-  }
-  if (typeof source.minProperties === 'number') {
-    result.minProperties = source.minProperties
-  }
-  if (typeof source.maxProperties === 'number') {
-    result.maxProperties = source.maxProperties
-  }
-
-  const properties = sanitizeGeminiJsonSchemaProperties(source.properties)
-  if (properties) {
-    result.properties = properties
-    result.propertyOrdering = Object.keys(properties)
-  }
-
-  if (Array.isArray(source.required)) {
-    const required = source.required.filter(
-      (item): item is string => typeof item === 'string',
-    )
-    if (required.length > 0) {
-      result.required = required
-    }
-  }
-
-  if (typeof source.additionalProperties === 'boolean') {
-    result.additionalProperties = source.additionalProperties
-  } else {
-    const additionalProperties = sanitizeGeminiJsonSchema(
-      source.additionalProperties,
-    )
-    if (Object.keys(additionalProperties).length > 0) {
-      result.additionalProperties = additionalProperties
-    }
-  }
-
-  const items = sanitizeGeminiJsonSchema(source.items)
-  if (Object.keys(items).length > 0) {
-    result.items = items
-  }
-
-  const prefixItems = sanitizeGeminiJsonSchemaArray(source.prefixItems)
-  if (prefixItems) {
-    result.prefixItems = prefixItems
-  }
-
-  const anyOf = sanitizeGeminiJsonSchemaArray(source.anyOf ?? source.oneOf)
-  if (anyOf) {
-    result.anyOf = anyOf
-  }
-
-  return result
-}
-
-function sanitizeGeminiFunctionParameters(
-  schema: unknown,
-): Record<string, unknown> {
-  const sanitized = sanitizeGeminiJsonSchema(schema)
-  if (Object.keys(sanitized).length > 0) {
-    return sanitized
-  }
-
-  return {
-    type: 'object',
-    properties: {},
-  }
-}
-
-export function anthropicToolsToGemini(tools: BetaToolUnion[]): GeminiTool[] {
-  const functionDeclarations = tools
-    .filter(tool => {
-      const toolType = (tool as unknown as { type?: string }).type
-      return tool.type === 'custom' || !('type' in tool) || toolType !== 'server'
-    })
-    .map(tool => {
-      const anyTool = tool as unknown as Record<string, unknown>
-      const name = (anyTool.name as string) || ''
-      const description = (anyTool.description as string) || ''
-      const inputSchema =
-        (anyTool.input_schema as Record<string, unknown> | undefined) ?? {
-          type: 'object',
-          properties: {},
-        }
-
-      return {
-        name,
-        description,
-        parametersJsonSchema: sanitizeGeminiFunctionParameters(inputSchema),
-      }
-    })
-
-  return functionDeclarations.length > 0
-    ? [{ functionDeclarations }]
-    : []
-}
-
-export function anthropicToolChoiceToGemini(
-  toolChoice: unknown,
-): GeminiFunctionCallingConfig | undefined {
-  if (!toolChoice || typeof toolChoice !== 'object') return undefined
-
-  const tc = toolChoice as Record<string, unknown>
-  const type = tc.type as string
-
-  switch (type) {
-    case 'auto':
-      return { mode: 'AUTO' }
-    case 'any':
-      return { mode: 'ANY' }
-    case 'tool':
-      return {
-        mode: 'ANY',
-        allowedFunctionNames:
-          typeof tc.name === 'string' ? [tc.name] : undefined,
-      }
-    default:
-      return undefined
-  }
-}
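Review note: two of the schema rewrites in the removed sanitizer are easy to miss. `const` becomes a single-value `enum` with an inferred type, and `nullable: true` appends `'null'` to the type. The sketch below covers only those two rules for a leaf schema, under the assumption that `type` is a plain string. `sanitizeLeaf`, `withNull`, and `inferType` are hypothetical names, not the package's API.

```typescript
type Schema = Record<string, unknown>

// Infer a JSON Schema type name from a literal value.
function inferType(value: unknown): string | undefined {
  if (value === null) return 'null'
  if (Array.isArray(value)) return 'array'
  if (typeof value === 'number') {
    return Number.isInteger(value) ? 'integer' : 'number'
  }
  if (typeof value === 'string') return 'string'
  if (typeof value === 'boolean') return 'boolean'
  if (typeof value === 'object') return 'object'
  return undefined
}

// Add 'null' to a type without duplicating it.
function withNull(type: string | string[] | undefined): string | string[] {
  if (type === undefined) return ['null']
  if (Array.isArray(type)) return type.includes('null') ? type : [...type, 'null']
  return type === 'null' ? type : [type, 'null']
}

// Apply only the const -> enum and nullable -> 'null' rules.
function sanitizeLeaf(source: Schema): Schema {
  const result: Schema = {}
  let type: string | string[] | undefined =
    typeof source.type === 'string' ? (source.type as string) : undefined

  if (source.const !== undefined) {
    result.enum = [source.const]
    type = type ?? inferType(source.const)
  }
  if (source.nullable === true) type = withNull(type)
  if (type) result.type = type
  return result
}

const nullableString = sanitizeLeaf({ type: 'string', nullable: true })
const constInteger = sanitizeLeaf({ const: 3 })
```

With these inputs, `nullableString.type` is the pair `['string', 'null']`, and `constInteger` carries `enum: [3]` with type `'integer'`.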
--- a/src/services/api/gemini/index.ts
+++ b/src/services/api/gemini/index.ts
@@ -19,14 +19,7 @@ import type { SystemPrompt } from '../../../utils/systemPromptType.js'
 import type { ThinkingConfig } from '../../../utils/thinking.js'
 import type { Options } from '../claude.js'
 import { streamGeminiGenerateContent } from './client.js'
-import { anthropicMessagesToGemini } from './convertMessages.js'
-import {
-  anthropicToolChoiceToGemini,
-  anthropicToolsToGemini,
-} from './convertTools.js'
-import { resolveGeminiModel } from './modelMapping.js'
-import { adaptGeminiStreamToAnthropic } from './streamAdapter.js'
-import { GEMINI_THOUGHT_SIGNATURE_FIELD } from './types.js'
+import { anthropicMessagesToGemini, resolveGeminiModel, adaptGeminiStreamToAnthropic, anthropicToolsToGemini, anthropicToolChoiceToGemini, GEMINI_THOUGHT_SIGNATURE_FIELD } from '@ant/model-provider'

 export async function* queryModelGemini(
  messages: Message[],
--- a/src/services/api/gemini/modelMapping.ts
+++ b/src/services/api/gemini/modelMapping.ts
@@ -1,37 +0,0 @@
-function getModelFamily(model: string): 'haiku' | 'sonnet' | 'opus' | null {
-  if (/haiku/i.test(model)) return 'haiku'
-  if (/opus/i.test(model)) return 'opus'
-  if (/sonnet/i.test(model)) return 'sonnet'
-  return null
-}
-
-export function resolveGeminiModel(anthropicModel: string): string {
-  if (process.env.GEMINI_MODEL) {
-    return process.env.GEMINI_MODEL
-  }
-
-  const cleanModel = anthropicModel.replace(/\[1m\]$/i, '')
-  const family = getModelFamily(cleanModel)
-
-  if (!family) {
-    return cleanModel
-  }
-
-  // First, try Gemini-specific DEFAULT variables (separated from Anthropic)
-  const geminiEnvVar = `GEMINI_DEFAULT_${family.toUpperCase()}_MODEL`
-  const geminiModel = process.env[geminiEnvVar]
-  if (geminiModel) {
-    return geminiModel
-  }
-
-  // Fallback to Anthropic DEFAULT variables for backward compatibility
-  const sharedEnvVar = `ANTHROPIC_DEFAULT_${family.toUpperCase()}_MODEL`
-  const resolvedModel = process.env[sharedEnvVar]
-  if (resolvedModel) {
-    return resolvedModel
-  }
-
-  throw new Error(
-    `Gemini provider requires GEMINI_MODEL or ${geminiEnvVar} (or ${sharedEnvVar} for backward compatibility) to be configured.`,
-  )
-}
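Review note: the resolution order in the removed `resolveGeminiModel` is global override, then Gemini per-family variable, then the shared Anthropic per-family variable, then an error. The sketch below restates that order as a pure function that takes the environment as a parameter so it can be exercised without touching `process.env`. `resolveModel`, `familyOf`, `Env`, and the `gemini-a`/`gemini-b`/`gemini-c` values are hypothetical placeholders.

```typescript
type Env = Record<string, string | undefined>

function familyOf(model: string): 'haiku' | 'sonnet' | 'opus' | null {
  if (/haiku/i.test(model)) return 'haiku'
  if (/opus/i.test(model)) return 'opus'
  if (/sonnet/i.test(model)) return 'sonnet'
  return null
}

function resolveModel(model: string, env: Env): string {
  // 1. A global override wins over everything else.
  if (env.GEMINI_MODEL) return env.GEMINI_MODEL

  // Strip a trailing "[1m]" marker before the family lookup.
  const clean = model.replace(/\[1m\]$/i, '')
  const family = familyOf(clean)
  if (!family) return clean // unknown names pass through unchanged

  // 2. Gemini-specific per-family variable.
  const specific = env[`GEMINI_DEFAULT_${family.toUpperCase()}_MODEL`]
  if (specific) return specific

  // 3. Shared Anthropic per-family variable, kept for backward compatibility.
  const shared = env[`ANTHROPIC_DEFAULT_${family.toUpperCase()}_MODEL`]
  if (shared) return shared

  // 4. Nothing configured for this family.
  throw new Error(`no Gemini model configured for family "${family}"`)
}

const viaFamily = resolveModel('claude-sonnet-4-6[1m]', {
  GEMINI_DEFAULT_SONNET_MODEL: 'gemini-a',
})
const viaOverride = resolveModel('claude-opus-4-6', {
  GEMINI_MODEL: 'gemini-b',
  GEMINI_DEFAULT_OPUS_MODEL: 'gemini-c',
})
const passthrough = resolveModel('custom-model', {})
```

Unlike the Grok mapping, there is no built-in default per family here, so an unconfigured Claude family name raises instead of silently picking a model.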
--- a/src/services/api/gemini/streamAdapter.ts
+++ b/src/services/api/gemini/streamAdapter.ts
@@ -1,243 +0,0 @@
-import type { BetaRawMessageStreamEvent } from '@anthropic-ai/sdk/resources/beta/messages/messages.mjs'
-import { randomUUID } from 'crypto'
-import type { GeminiPart, GeminiStreamChunk } from './types.js'
-
-export async function* adaptGeminiStreamToAnthropic(
-  stream: AsyncIterable<GeminiStreamChunk>,
-  model: string,
-): AsyncGenerator<BetaRawMessageStreamEvent, void> {
-  const messageId = `msg_${randomUUID().replace(/-/g, '').slice(0, 24)}`
-  let started = false
-  let stopped = false
-  let nextContentIndex = 0
-  let openTextLikeBlock:
-    | { index: number; type: 'text' | 'thinking' }
-    | null = null
-  let sawToolUse = false
-  let finishReason: string | undefined
-  let inputTokens = 0
-  let outputTokens = 0
-
-  for await (const chunk of stream) {
-    const usage = chunk.usageMetadata
-    if (usage) {
-      inputTokens = usage.promptTokenCount ?? inputTokens
-      outputTokens =
-        (usage.candidatesTokenCount ?? 0) + (usage.thoughtsTokenCount ?? 0)
-    }
-
-    if (!started) {
-      started = true
-      yield {
-        type: 'message_start',
-        message: {
-          id: messageId,
-          type: 'message',
-          role: 'assistant',
-          content: [],
-          model,
-          stop_reason: null,
-          stop_sequence: null,
-          usage: {
-            input_tokens: inputTokens,
-            output_tokens: 0,
-            cache_creation_input_tokens: 0,
-            cache_read_input_tokens: 0,
-          },
-        },
-      } as unknown as BetaRawMessageStreamEvent
-    }
-    const candidate = chunk.candidates?.[0]
-    const parts = candidate?.content?.parts ?? []
-
-    for (const part of parts) {
-      if (part.functionCall) {
-        if (openTextLikeBlock) {
-          yield {
-            type: 'content_block_stop',
-            index: openTextLikeBlock.index,
-          } as BetaRawMessageStreamEvent
-          openTextLikeBlock = null
-        }
-
-        sawToolUse = true
-        const toolIndex = nextContentIndex++
-        const toolId = `toolu_${randomUUID().replace(/-/g, '').slice(0, 24)}`
-        yield {
-          type: 'content_block_start',
-          index: toolIndex,
-          content_block: {
-            type: 'tool_use',
-            id: toolId,
-            name: part.functionCall.name || '',
-            input: {},
-          },
-        } as BetaRawMessageStreamEvent
-
-        if (part.thoughtSignature) {
-          yield {
-            type: 'content_block_delta',
-            index: toolIndex,
-            delta: {
-              type: 'signature_delta',
-              signature: part.thoughtSignature,
-            },
-          } as BetaRawMessageStreamEvent
-        }
-
-        if (part.functionCall.args && Object.keys(part.functionCall.args).length > 0) {
-          yield {
-            type: 'content_block_delta',
-            index: toolIndex,
-            delta: {
-              type: 'input_json_delta',
-              partial_json: JSON.stringify(part.functionCall.args),
-            },
-          } as BetaRawMessageStreamEvent
-        }
-
-        yield {
-          type: 'content_block_stop',
-          index: toolIndex,
-        } as BetaRawMessageStreamEvent
-        continue
-      }
-
-      const textLikeType = getTextLikeBlockType(part)
-      if (textLikeType) {
-        if (!openTextLikeBlock || openTextLikeBlock.type !== textLikeType) {
-          if (openTextLikeBlock) {
-            yield {
-              type: 'content_block_stop',
-              index: openTextLikeBlock.index,
-            } as BetaRawMessageStreamEvent
-          }
-
-          openTextLikeBlock = {
-            index: nextContentIndex++,
-            type: textLikeType,
-          }
-
-          yield {
-            type: 'content_block_start',
-            index: openTextLikeBlock.index,
-            content_block:
-              textLikeType === 'thinking'
-                ? {
-                    type: 'thinking',
-                    thinking: '',
-                    signature: '',
-                  }
-                : {
-                    type: 'text',
-                    text: '',
-                  },
-          } as BetaRawMessageStreamEvent
-        }
-
-        if (part.text) {
-          yield {
-            type: 'content_block_delta',
-            index: openTextLikeBlock.index,
-            delta:
-              textLikeType === 'thinking'
-                ? {
-                    type: 'thinking_delta',
-                    thinking: part.text,
-                  }
-                : {
-                    type: 'text_delta',
-                    text: part.text,
-                  },
-          } as BetaRawMessageStreamEvent
-        }
-
-        if (part.thoughtSignature) {
-          yield {
-            type: 'content_block_delta',
-            index: openTextLikeBlock.index,
-            delta: {
-              type: 'signature_delta',
-              signature: part.thoughtSignature,
-            },
-          } as BetaRawMessageStreamEvent
-        }
-
-        continue
-      }
-
-      if (part.thoughtSignature && openTextLikeBlock) {
-        yield {
-          type: 'content_block_delta',
-          index: openTextLikeBlock.index,
-          delta: {
-            type: 'signature_delta',
-            signature: part.thoughtSignature,
-          },
-        } as BetaRawMessageStreamEvent
-      }
-    }
-
-    if (candidate?.finishReason) {
-      finishReason = candidate.finishReason
-    }
-  }
-
-  if (!started) {
-    return
-  }
-
-  if (openTextLikeBlock) {
-    yield {
-      type: 'content_block_stop',
-      index: openTextLikeBlock.index,
-    } as BetaRawMessageStreamEvent
-  }
-
-  if (!stopped) {
-    yield {
-      type: 'message_delta',
-      delta: {
-        stop_reason: mapGeminiFinishReason(finishReason, sawToolUse),
-        stop_sequence: null,
-      },
-      usage: {
-        output_tokens: outputTokens,
-      },
-    } as BetaRawMessageStreamEvent
-
-    yield {
-      type: 'message_stop',
-    } as BetaRawMessageStreamEvent
-    stopped = true
-  }
-}
-
-function getTextLikeBlockType(
-  part: GeminiPart,
-): 'text' | 'thinking' | null {
-  if (typeof part.text !== 'string') {
-    return null
-  }
-  return part.thought ? 'thinking' : 'text'
-}
-
-function mapGeminiFinishReason(
-  reason: string | undefined,
-  sawToolUse: boolean,
-): string {
-  switch (reason) {
-    case 'MAX_TOKENS':
-      return 'max_tokens'
-    case 'STOP':
-    case 'FINISH_REASON_UNSPECIFIED':
-    case 'SAFETY':
-    case 'RECITATION':
-    case 'BLOCKLIST':
-    case 'PROHIBITED_CONTENT':
-    case 'SPII':
-    case 'MALFORMED_FUNCTION_CALL':
-    default:
-      return sawToolUse ? 'tool_use' : 'end_turn'
-  }
-}
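Review note: the removed adapter reports output tokens as `candidatesTokenCount + thoughtsTokenCount`, and maps every finish reason except `MAX_TOKENS` to `tool_use` or `end_turn` depending on whether a function call was seen. The sketch below isolates that bookkeeping from the event streaming. `summarize` and the trimmed `Chunk` type are hypothetical and assume the whole stream is already collected into an array.

```typescript
// Trimmed, hypothetical chunk shape: only the fields this sketch reads.
type Chunk = {
  candidates?: {
    finishReason?: string
    content?: { parts?: { text?: string; functionCall?: { name?: string } }[] }
  }[]
  usageMetadata?: {
    promptTokenCount?: number
    candidatesTokenCount?: number
    thoughtsTokenCount?: number
  }
}

function summarize(chunks: Chunk[]) {
  let inputTokens = 0
  let outputTokens = 0
  let finishReason: string | undefined
  let sawToolUse = false

  for (const chunk of chunks) {
    const usage = chunk.usageMetadata
    if (usage) {
      inputTokens = usage.promptTokenCount ?? inputTokens
      // Thinking tokens are billed as output, so they are added in.
      outputTokens =
        (usage.candidatesTokenCount ?? 0) + (usage.thoughtsTokenCount ?? 0)
    }
    const candidate = chunk.candidates?.[0]
    for (const part of candidate?.content?.parts ?? []) {
      if (part.functionCall) sawToolUse = true
    }
    if (candidate?.finishReason) finishReason = candidate.finishReason
  }

  const stopReason =
    finishReason === 'MAX_TOKENS'
      ? 'max_tokens'
      : sawToolUse
        ? 'tool_use'
        : 'end_turn'
  return { inputTokens, outputTokens, stopReason }
}

const textRun = summarize([
  {
    candidates: [{ content: { parts: [{ text: 'Hello' }] }, finishReason: 'STOP' }],
    usageMetadata: { promptTokenCount: 10, candidatesTokenCount: 5, thoughtsTokenCount: 2 },
  },
])
const toolRun = summarize([
  {
    candidates: [
      { content: { parts: [{ functionCall: { name: 'bash' } }] }, finishReason: 'STOP' },
    ],
  },
])
```

The inputs mirror the removed test cases: `textRun` yields 10 input and 7 output tokens with `end_turn`, and `toolRun` yields `tool_use` even though Gemini reported `STOP`.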
--- a/src/services/api/gemini/types.ts
+++ b/src/services/api/gemini/types.ts
@@ -1,86 +0,0 @@
-export const GEMINI_THOUGHT_SIGNATURE_FIELD = '_geminiThoughtSignature'
-
-export type GeminiFunctionCall = {
-  name?: string
-  args?: Record<string, unknown>
-}
-
-export type GeminiFunctionResponse = {
-  name?: string
-  response?: Record<string, unknown>
-}
-
-export type GeminiInlineData = {
-  mimeType: string
-  data: string
-}
-
-export type GeminiPart = {
-  text?: string
-  thought?: boolean
-  thoughtSignature?: string
-  functionCall?: GeminiFunctionCall
-  functionResponse?: GeminiFunctionResponse
-  inlineData?: GeminiInlineData
-}
-
-export type GeminiContent = {
-  role: 'user' | 'model'
-  parts: GeminiPart[]
-}
-
-export type GeminiFunctionDeclaration = {
-  name: string
-  description?: string
-  parameters?: Record<string, unknown>
-  parametersJsonSchema?: Record<string, unknown>
-}
-
-export type GeminiTool = {
-  functionDeclarations: GeminiFunctionDeclaration[]
-}
-
-export type GeminiFunctionCallingConfig = {
-  mode: 'AUTO' | 'ANY' | 'NONE'
-  allowedFunctionNames?: string[]
-}
-
-export type GeminiGenerateContentRequest = {
-  contents: GeminiContent[]
-  systemInstruction?: {
-    parts: Array<{ text: string }>
-  }
-  tools?: GeminiTool[]
-  toolConfig?: {
-    functionCallingConfig: GeminiFunctionCallingConfig
-  }
-  generationConfig?: {
-    temperature?: number
-    thinkingConfig?: {
-      includeThoughts?: boolean
-      thinkingBudget?: number
-    }
-  }
-}
-
-export type GeminiUsageMetadata = {
-  promptTokenCount?: number
-  candidatesTokenCount?: number
-  thoughtsTokenCount?: number
-  totalTokenCount?: number
-}
-
-export type GeminiCandidate = {
-  content?: {
-    role?: string
-    parts?: GeminiPart[]
-  }
-  finishReason?: string
-  index?: number
-}
-
-export type GeminiStreamChunk = {
-  candidates?: GeminiCandidate[]
-  usageMetadata?: GeminiUsageMetadata
-  modelVersion?: string
-}
--- a/src/services/api/grok/tests/client.test.ts
+++ b/src/services/api/grok/tests/client.test.ts
@@ -1,4 +1,10 @@
-import { describe, expect, test, beforeEach, afterEach } from 'bun:test'
+import { describe, expect, test, beforeEach, afterEach, mock } from 'bun:test'
+
+// Defensive: agent.test.ts can corrupt Bun's src/* path alias at runtime.
+mock.module('src/utils/proxy.js', () => ({
+  getProxyFetchOptions: () => ({} as any),
+}))
+
 import { getGrokClient, clearGrokClientCache } from '../client.js'

 describe('getGrokClient', () => {
--- a/src/services/api/grok/tests/modelMapping.test.ts
+++ b/src/services/api/grok/tests/modelMapping.test.ts
@@ -1,67 +0,0 @@
-import { describe, expect, test, beforeEach, afterEach } from 'bun:test'
-import { resolveGrokModel } from '../modelMapping.js'
-
-describe('resolveGrokModel', () => {
-  const originalEnv = { ...process.env }
-
-  beforeEach(() => {
-    delete process.env.GROK_MODEL
-    delete process.env.GROK_MODEL_MAP
-    delete process.env.GROK_DEFAULT_SONNET_MODEL
-    delete process.env.GROK_DEFAULT_OPUS_MODEL
-    delete process.env.GROK_DEFAULT_HAIKU_MODEL
-    delete process.env.ANTHROPIC_DEFAULT_SONNET_MODEL
-    delete process.env.ANTHROPIC_DEFAULT_OPUS_MODEL
-    delete process.env.ANTHROPIC_DEFAULT_HAIKU_MODEL
-  })
-
-  afterEach(() => {
-    process.env = { ...originalEnv }
-  })
-
-  test('GROK_MODEL env var takes highest priority', () => {
-    process.env.GROK_MODEL = 'grok-custom'
-    expect(resolveGrokModel('claude-sonnet-4-6')).toBe('grok-custom')
-  })
-
-  test('maps opus models to grok-4.20-reasoning', () => {
-    expect(resolveGrokModel('claude-opus-4-6')).toBe('grok-4.20-reasoning')
-  })
-
-  test('maps sonnet models to grok-3-mini-fast', () => {
-    expect(resolveGrokModel('claude-sonnet-4-6')).toBe('grok-3-mini-fast')
-  })
-
-  test('maps haiku models to grok-3-mini-fast', () => {
-    expect(resolveGrokModel('claude-haiku-4-5-20251001')).toBe('grok-3-mini-fast')
-  })
-
-  test('GROK_MODEL_MAP overrides family mapping', () => {
-    process.env.GROK_MODEL_MAP = '{"opus":"grok-4","sonnet":"grok-3","haiku":"grok-mini"}'
-    expect(resolveGrokModel('claude-opus-4-6')).toBe('grok-4')
-    expect(resolveGrokModel('claude-sonnet-4-6')).toBe('grok-3')
-    expect(resolveGrokModel('claude-haiku-4-5-20251001')).toBe('grok-mini')
-  })
-
-  test('GROK_MODEL_MAP ignores invalid JSON', () => {
-    process.env.GROK_MODEL_MAP = 'not-json'
-    expect(resolveGrokModel('claude-opus-4-6')).toBe('grok-4.20-reasoning')
-  })
-
-  test('GROK_DEFAULT_{FAMILY}_MODEL overrides default map', () => {
-    process.env.GROK_DEFAULT_OPUS_MODEL = 'grok-2-latest'
-    expect(resolveGrokModel('claude-opus-4-6')).toBe('grok-2-latest')
-  })
-
-  test('passes through unknown model names', () => {
-    expect(resolveGrokModel('some-unknown-model')).toBe('some-unknown-model')
-  })
-
-  test('strips [1m] suffix before lookup', () => {
-    expect(resolveGrokModel('claude-sonnet-4-6[1m]')).toBe('grok-3-mini-fast')
-  })
-
-  test('falls back to family default for unlisted model', () => {
-    expect(resolveGrokModel('claude-opus-99-20300101')).toBe('grok-4.20-reasoning')
-  })
-})
--- a/src/services/api/grok/index.ts
+++ b/src/services/api/grok/index.ts
@@ -7,10 +7,7 @@ import type {
  ChatCompletionCreateParamsStreaming,
 } from 'openai/resources/chat/completions/completions.mjs'
 import { getGrokClient } from './client.js'
-import { anthropicMessagesToOpenAI } from '../openai/convertMessages.js'
-import { anthropicToolsToOpenAI, anthropicToolChoiceToOpenAI } from '../openai/convertTools.js'
-import { adaptOpenAIStreamToAnthropic } from '../openai/streamAdapter.js'
-import { resolveGrokModel } from './modelMapping.js'
+import { anthropicMessagesToOpenAI, anthropicToolsToOpenAI, anthropicToolChoiceToOpenAI, adaptOpenAIStreamToAnthropic, resolveGrokModel } from '@ant/model-provider'
 import { normalizeMessagesForAPI } from '../../../utils/messages.js'
 import type { SDKAssistantMessageError } from '../../../entrypoints/agentSdkTypes.js'
 import { toolToAPISchema } from '../../../utils/api.js'
--- a/src/services/api/grok/modelMapping.ts
+++ b/src/services/api/grok/modelMapping.ts
@@ -1,107 +0,0 @@
-/**
- * Default mapping from Anthropic model names to Grok model names.
- *
- * Users can override per-family via GROK_DEFAULT_{FAMILY}_MODEL env vars,
- * or override the entire mapping via GROK_MODEL_MAP env var (JSON string):
- *   GROK_MODEL_MAP='{"opus":"grok-4","sonnet":"grok-3","haiku":"grok-3-mini-fast"}'
- */
-const DEFAULT_MODEL_MAP: Record<string, string> = {
-  'claude-sonnet-4-20250514': 'grok-3-mini-fast',
-  'claude-sonnet-4-5-20250929': 'grok-3-mini-fast',
-  'claude-sonnet-4-6': 'grok-3-mini-fast',
-  'claude-opus-4-20250514': 'grok-4.20-reasoning',
-  'claude-opus-4-1-20250805': 'grok-4.20-reasoning',
-  'claude-opus-4-5-20251101': 'grok-4.20-reasoning',
-  'claude-opus-4-6': 'grok-4.20-reasoning',
-  'claude-haiku-4-5-20251001': 'grok-3-mini-fast',
-  'claude-3-5-haiku-20241022': 'grok-3-mini-fast',
-  'claude-3-7-sonnet-20250219': 'grok-3-mini-fast',
-  'claude-3-5-sonnet-20241022': 'grok-3-mini-fast',
-}
-
-/**
- * Family-level defaults (same keys as GROK_MODEL_MAP), used as the last fallback before pass-through.
- */
-const DEFAULT_FAMILY_MAP: Record<string, string> = {
-  opus: 'grok-4.20-reasoning',
-  sonnet: 'grok-3-mini-fast',
-  haiku: 'grok-3-mini-fast',
-}
-
-function getModelFamily(model: string): 'haiku' | 'sonnet' | 'opus' | null {
-  if (/haiku/i.test(model)) return 'haiku'
-  if (/opus/i.test(model)) return 'opus'
-  if (/sonnet/i.test(model)) return 'sonnet'
-  return null
-}
-
-/**
- * Parse user-provided model map from GROK_MODEL_MAP env var.
- * Accepts JSON like: {"opus":"grok-4","sonnet":"grok-3","haiku":"grok-3-mini-fast"}
- */
-function getUserModelMap(): Record<string, string> | null {
-  const raw = process.env.GROK_MODEL_MAP
-  if (!raw) return null
-  try {
-    const parsed = JSON.parse(raw)
-    if (parsed && typeof parsed === 'object' && !Array.isArray(parsed)) {
-      return parsed as Record<string, string>
-    }
-  } catch {
-    // ignore invalid JSON
-  }
-  return null
-}
-
-/**
- * Resolve the Grok model name for a given Anthropic model.
- *
- * Priority:
- * 1. GROK_MODEL env var (override all)
- * 2. GROK_MODEL_MAP env var — JSON family map (e.g. {"opus":"grok-4"})
- * 3. GROK_DEFAULT_{FAMILY}_MODEL env var (e.g. GROK_DEFAULT_OPUS_MODEL)
- * 4. ANTHROPIC_DEFAULT_{FAMILY}_MODEL env var (backward compat)
- * 5. DEFAULT_MODEL_MAP lookup
- * 6. Family-level default
- * 7. Pass through original model name
- */
-export function resolveGrokModel(anthropicModel: string): string {
-  // 1. Global override
-  if (process.env.GROK_MODEL) {
-    return process.env.GROK_MODEL
-  }
-
-  const cleanModel = anthropicModel.replace(/\[1m\]$/, '')
-  const family = getModelFamily(cleanModel)
-
-  // 2. User-provided model map
-  const userMap = getUserModelMap()
-  if (userMap && family && userMap[family]) {
-    return userMap[family]
-  }
-
-  if (family) {
-    // 3. Grok-specific family override
-    const grokEnvVar = `GROK_DEFAULT_${family.toUpperCase()}_MODEL`
-    const grokOverride = process.env[grokEnvVar]
-    if (grokOverride) return grokOverride
-
-    // 4. Anthropic env var (backward compat)
-    const anthropicEnvVar = `ANTHROPIC_DEFAULT_${family.toUpperCase()}_MODEL`
-    const anthropicOverride = process.env[anthropicEnvVar]
-    if (anthropicOverride) return anthropicOverride
-  }
-
-  // 5. Exact model name lookup
-  if (DEFAULT_MODEL_MAP[cleanModel]) {
-    return DEFAULT_MODEL_MAP[cleanModel]
-  }
-
-  // 6. Family-level default
-  if (family && DEFAULT_FAMILY_MAP[family]) {
-    return DEFAULT_FAMILY_MAP[family]
-  }
-
-  // 7. Pass through
-  return cleanModel
-}
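Review note: the seven-step priority in the doc comment above is easier to check when the environment is passed in rather than read from `process.env`. The sketch below restates the chain under that assumption. `resolveWithEnv`, `EXACT`, and `FAMILY` are hypothetical names with trimmed-down tables. They are not part of the removed file or of the model-provider package.

```typescript
// Hypothetical restatement of the priority chain with an injected env.
type Env = Record<string, string | undefined>
type Family = 'haiku' | 'sonnet' | 'opus'

// Trimmed-down stand-ins for DEFAULT_MODEL_MAP and DEFAULT_FAMILY_MAP.
const EXACT: Record<string, string> = { 'claude-opus-4-6': 'grok-4.20-reasoning' }
const FAMILY: Record<Family, string> = {
  opus: 'grok-4.20-reasoning',
  sonnet: 'grok-3-mini-fast',
  haiku: 'grok-3-mini-fast',
}

function familyOf(model: string): Family | null {
  if (/haiku/i.test(model)) return 'haiku'
  if (/opus/i.test(model)) return 'opus'
  if (/sonnet/i.test(model)) return 'sonnet'
  return null
}

// Returns the parsed family map, or null for missing/invalid/non-object JSON.
function parseFamilyMap(raw: string | undefined): Record<string, string> | null {
  if (!raw) return null
  try {
    const parsed: unknown = JSON.parse(raw)
    if (parsed && typeof parsed === 'object' && !Array.isArray(parsed)) {
      return parsed as Record<string, string>
    }
  } catch {
    // invalid JSON is ignored, as in the original
  }
  return null
}

export function resolveWithEnv(model: string, env: Env): string {
  // 1. Global override
  if (env.GROK_MODEL) return env.GROK_MODEL

  const clean = model.replace(/\[1m\]$/, '')
  const family = familyOf(clean)
  const userMap = parseFamilyMap(env.GROK_MODEL_MAP)

  if (family) {
    // 2-4. User map, then Grok-specific override, then Anthropic override
    const candidates = [
      userMap?.[family],
      env[`GROK_DEFAULT_${family.toUpperCase()}_MODEL`],
      env[`ANTHROPIC_DEFAULT_${family.toUpperCase()}_MODEL`],
    ]
    for (const c of candidates) if (c) return c
  }

  // 5-7. Exact lookup, family default, pass-through
  return EXACT[clean] || (family ? FAMILY[family] : clean)
}
```

Injecting `env` would let tests avoid the save/delete/restore dance around `process.env`. Whether the relocated implementation takes that route is not visible in this hunk.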
--- a/src/services/api/openai/tests/convertMessages.test.ts
+++ b/src/services/api/openai/tests/convertMessages.test.ts
@@ -1,457 +0,0 @@
-import { describe, expect, test } from 'bun:test'
-import { anthropicMessagesToOpenAI } from '../convertMessages.js'
-import type { UserMessage, AssistantMessage } from '../../../../types/message.js'
-
-// Helpers to create internal-format messages
-function makeUserMsg(content: string | any[]): UserMessage {
-  return {
-    type: 'user',
-    uuid: '00000000-0000-0000-0000-000000000000',
-    message: { role: 'user', content },
-  } as UserMessage
-}
-
-function makeAssistantMsg(content: string | any[]): AssistantMessage {
-  return {
-    type: 'assistant',
-    uuid: '00000000-0000-0000-0000-000000000001',
-    message: { role: 'assistant', content },
-  } as AssistantMessage
-}
-
-describe('anthropicMessagesToOpenAI', () => {
-  test('converts system prompt to system message', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeUserMsg('hello')],
-      ['You are helpful.'] as any,
-    )
-    expect(result[0]).toEqual({ role: 'system', content: 'You are helpful.' })
-  })
-
-  test('joins multiple system prompt strings', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeUserMsg('hi')],
-      ['Part 1', 'Part 2'] as any,
-    )
-    expect(result[0]).toEqual({ role: 'system', content: 'Part 1\n\nPart 2' })
-  })
-
-  test('skips empty system prompt', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeUserMsg('hi')],
-      [] as any,
-    )
-    expect(result[0].role).toBe('user')
-  })
-
-  test('converts simple user text message', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeUserMsg('hello world')],
-      [] as any,
-    )
-    expect(result).toEqual([{ role: 'user', content: 'hello world' }])
-  })
-
-  test('converts user message with content array', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeUserMsg([
-        { type: 'text', text: 'line 1' },
-        { type: 'text', text: 'line 2' },
-      ])],
-      [] as any,
-    )
-    expect(result).toEqual([{ role: 'user', content: 'line 1\nline 2' }])
-  })
-
-  test('converts assistant message with text', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeAssistantMsg('response text')],
-      [] as any,
-    )
-    expect(result).toEqual([{ role: 'assistant', content: 'response text' }])
-  })
-
-  test('converts assistant message with tool_use', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeAssistantMsg([
-        { type: 'text', text: 'Let me help.' },
-        {
-          type: 'tool_use' as const,
-          id: 'toolu_123',
-          name: 'bash',
-          input: { command: 'ls' },
-        },
-      ])],
-      [] as any,
-    )
-    expect(result).toEqual([{
-      role: 'assistant',
-      content: 'Let me help.',
-      tool_calls: [{
-        id: 'toolu_123',
-        type: 'function',
-        function: { name: 'bash', arguments: '{"command":"ls"}' },
-      }],
-    }])
-  })
-
-  test('converts tool_result to tool message', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeUserMsg([
-        {
-          type: 'tool_result' as const,
-          tool_use_id: 'toolu_123',
-          content: 'file1.txt\nfile2.txt',
-        },
-      ])],
-      [] as any,
-    )
-    expect(result).toEqual([{
-      role: 'tool',
-      tool_call_id: 'toolu_123',
-      content: 'file1.txt\nfile2.txt',
-    }])
-  })
-
-  test('strips thinking blocks', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeAssistantMsg([
-        { type: 'thinking' as const, thinking: 'internal thoughts...' },
-        { type: 'text', text: 'visible response' },
-      ])],
-      [] as any,
-    )
-    expect(result).toEqual([{ role: 'assistant', content: 'visible response' }])
-  })
-
-  test('handles full conversation with tools', () => {
-    const result = anthropicMessagesToOpenAI(
-      [
-        makeUserMsg('list files'),
-        makeAssistantMsg([
-          {
-            type: 'tool_use' as const,
-            id: 'toolu_abc',
-            name: 'bash',
-            input: { command: 'ls' },
-          },
-        ]),
-        makeUserMsg([
-          {
-            type: 'tool_result' as const,
-            tool_use_id: 'toolu_abc',
-            content: 'file.txt',
-          },
-        ]),
-      ],
-      ['You are helpful.'] as any,
-    )
-
-    expect(result).toHaveLength(4)
-    expect(result[0].role).toBe('system')
-    expect(result[1].role).toBe('user')
-    expect(result[2].role).toBe('assistant')
-    expect((result[2] as any).tool_calls).toBeDefined()
-    expect(result[3].role).toBe('tool')
-  })
-
-  test('converts base64 image to image_url', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeUserMsg([
-        { type: 'text', text: 'what is this?' },
-        {
-          type: 'image' as const,
-          source: {
-            type: 'base64',
-            media_type: 'image/png',
-            data: 'iVBORw0KGgo=',
-          },
-        },
-      ])],
-      [] as any,
-    )
-    expect(result).toEqual([{
-      role: 'user',
-      content: [
-        { type: 'text', text: 'what is this?' },
-        {
-          type: 'image_url',
-          image_url: { url: 'data:image/png;base64,iVBORw0KGgo=' },
-        },
-      ],
-    }])
-  })
-
-  test('converts url image to image_url', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeUserMsg([
-        {
-          type: 'image' as const,
-          source: {
-            type: 'url',
-            url: 'https://example.com/img.png',
-          },
-        },
-      ])],
-      [] as any,
-    )
-    expect(result).toEqual([{
-      role: 'user',
-      content: [
-        {
-          type: 'image_url',
-          image_url: { url: 'https://example.com/img.png' },
-        },
-      ],
-    }])
-  })
-
-  test('converts image-only message without text', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeUserMsg([
-        {
-          type: 'image' as const,
-          source: {
-            type: 'base64',
-            media_type: 'image/jpeg',
-            data: '/9j/4AAQ',
-          },
-        },
-      ])],
-      [] as any,
-    )
-    expect(result).toEqual([{
-      role: 'user',
-      content: [
-        {
-          type: 'image_url',
-          image_url: { url: 'data:image/jpeg;base64,/9j/4AAQ' },
-        },
-      ],
-    }])
-  })
-
-  test('defaults to image/png when media_type is missing', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeUserMsg([
-        {
-          type: 'image' as const,
-          source: {
-            type: 'base64',
-            data: 'ABC123',
-          },
-        },
-      ])],
-      [] as any,
-    )
-    expect((result[0].content as any[])[0].image_url.url).toBe(
-      'data:image/png;base64,ABC123',
-    )
-  })
-})
-
-describe('DeepSeek thinking mode (enableThinking)', () => {
-  test('preserves thinking block as reasoning_content when enabled', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeUserMsg('question'), makeAssistantMsg([
-        { type: 'thinking' as const, thinking: 'Let me reason about this...' },
-        { type: 'text', text: 'The answer is 42.' },
-      ])],
-      [] as any,
-      { enableThinking: true },
-    )
-    // Should have: user, assistant with reasoning_content
-    expect(result).toHaveLength(2)
-    expect(result[0].role).toBe('user')
-    const assistant = result[1] as any
-    expect(assistant.role).toBe('assistant')
-    expect(assistant.content).toBe('The answer is 42.')
-    expect(assistant.reasoning_content).toBe('Let me reason about this...')
-  })
-
-  test('drops thinking block when enableThinking is false (default)', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeAssistantMsg([
-        { type: 'thinking' as const, thinking: 'internal thoughts...' },
-        { type: 'text', text: 'visible response' },
-      ])],
-      [] as any,
-    )
-    const assistant = result[0] as any
-    expect(assistant.content).toBe('visible response')
-    expect(assistant.reasoning_content).toBeUndefined()
-  })
-
-  test('preserves reasoning_content with tool_calls in same turn', () => {
-    const result = anthropicMessagesToOpenAI(
-      [
-        makeUserMsg('what is the weather?'),
-        makeAssistantMsg([
-          { type: 'thinking' as const, thinking: 'I need to call the weather tool.' },
-          { type: 'text', text: '' },
-          {
-            type: 'tool_use' as const,
-            id: 'toolu_001',
-            name: 'get_weather',
-            input: { location: 'Hangzhou' },
-          },
-        ]),
-        makeUserMsg([
-          {
-            type: 'tool_result' as const,
-            tool_use_id: 'toolu_001',
-            content: 'Cloudy 7~13°C',
-          },
-        ]),
-      ],
-      [] as any,
-      { enableThinking: true },
-    )
-
-    // Find the assistant message
-    const assistants = result.filter(m => m.role === 'assistant')
-    expect(assistants.length).toBe(1)
-    const assistant = assistants[0] as any
-    expect(assistant.reasoning_content).toBe('I need to call the weather tool.')
-    expect(assistant.tool_calls).toBeDefined()
-    expect(assistant.tool_calls[0].function.name).toBe('get_weather')
-  })
-
-  test('strips reasoning_content from previous turns', () => {
-    const result = anthropicMessagesToOpenAI(
-      [
-        // Turn 1: user → assistant (with thinking)
-        makeUserMsg('question 1'),
-        makeAssistantMsg([
-          { type: 'thinking' as const, thinking: 'Turn 1 reasoning...' },
-          { type: 'text', text: 'Turn 1 answer' },
-        ]),
-        // Turn 2: new user message → previous reasoning should be stripped
-        makeUserMsg('question 2'),
-        makeAssistantMsg([
-          { type: 'thinking' as const, thinking: 'Turn 2 reasoning...' },
-          { type: 'text', text: 'Turn 2 answer' },
-        ]),
-      ],
-      [] as any,
-      { enableThinking: true },
-    )
-
-    const assistants = result.filter(m => m.role === 'assistant')
-    // Turn 1 assistant: reasoning should be stripped (previous turn)
-    expect((assistants[0] as any).reasoning_content).toBeUndefined()
-    expect((assistants[0] as any).content).toBe('Turn 1 answer')
-    // Turn 2 assistant: reasoning should be preserved (current turn)
-    expect((assistants[1] as any).reasoning_content).toBe('Turn 2 reasoning...')
-    expect((assistants[1] as any).content).toBe('Turn 2 answer')
-  })
-
-  test('preserves reasoning_content in multi-iteration tool call within same turn', () => {
-    // Simulates a full DeepSeek tool call iteration:
-    // user → assistant(thinking+tool_call) → tool_result → assistant(thinking+tool_call) → tool_result → assistant(thinking+text)
-    const result = anthropicMessagesToOpenAI(
-      [
-        makeUserMsg("tomorrow's weather in Hangzhou"),
-        // Iteration 1: thinking + tool call
-        makeAssistantMsg([
-          { type: 'thinking' as const, thinking: 'I need the date first.' },
-          {
-            type: 'tool_use' as const,
-            id: 'toolu_001',
-            name: 'get_date',
-            input: {},
-          },
-        ]),
-        makeUserMsg([
-          {
-            type: 'tool_result' as const,
-            tool_use_id: 'toolu_001',
-            content: '2026-04-08',
-          },
-        ]),
-        // Iteration 2: thinking + tool call
-        makeAssistantMsg([
-          { type: 'thinking' as const, thinking: 'Now I can get the weather.' },
-          {
-            type: 'tool_use' as const,
-            id: 'toolu_002',
-            name: 'get_weather',
-            input: { location: 'Hangzhou', date: '2026-04-08' },
-          },
-        ]),
-        makeUserMsg([
-          {
-            type: 'tool_result' as const,
-            tool_use_id: 'toolu_002',
-            content: 'Cloudy 7~13°C',
-          },
-        ]),
-        // Iteration 3: thinking + final answer
-        makeAssistantMsg([
-          { type: 'thinking' as const, thinking: 'I have the info now.' },
-          { type: 'text', text: 'Tomorrow will be cloudy, 7-13°C.' },
-        ]),
-      ],
-      [] as any,
-      { enableThinking: true },
-    )
-
-    // tool_result messages are also UserMessage objects, but the assertions below
-    // expect that they do not start a new turn: the boundary is the last user-typed
-    // message (the weather question), so all three assistant messages belong to
-    // the current turn and should keep their reasoning_content.
-    const assistants = result.filter(m => m.role === 'assistant')
-    expect(assistants.length).toBe(3)
-    // All iterations within the same turn preserve reasoning
-    expect((assistants[0] as any).reasoning_content).toBe('I need the date first.')
-    expect((assistants[1] as any).reasoning_content).toBe('Now I can get the weather.')
-    expect((assistants[2] as any).reasoning_content).toBe('I have the info now.')
-  })
-
-  test('handles multiple thinking blocks in single assistant message', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeUserMsg('question'), makeAssistantMsg([
-        { type: 'thinking' as const, thinking: 'First thought.' },
-        { type: 'thinking' as const, thinking: 'Second thought.' },
-        { type: 'text', text: 'Final answer.' },
-      ])],
-      [] as any,
-      { enableThinking: true },
-    )
-    const assistant = result.filter(m => m.role === 'assistant')[0] as any
-    expect(assistant.reasoning_content).toBe('First thought.\nSecond thought.')
-  })
-
-  test('skips empty thinking blocks', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeUserMsg('question'), makeAssistantMsg([
-        { type: 'thinking' as const, thinking: '' },
-        { type: 'text', text: 'Answer.' },
-      ])],
-      [] as any,
-      { enableThinking: true },
-    )
-    const assistant = result.filter(m => m.role === 'assistant')[0] as any
-    expect(assistant.reasoning_content).toBeUndefined()
-  })
-
-  test('sets content to null when only thinking and tool_calls present', () => {
-    const result = anthropicMessagesToOpenAI(
-      [makeUserMsg('question'), makeAssistantMsg([
-        { type: 'thinking' as const, thinking: 'Reasoning only.' },
-        {
-          type: 'tool_use' as const,
-          id: 'toolu_001',
-          name: 'bash',
-          input: { command: 'ls' },
-        },
-      ])],
-      [] as any,
-      { enableThinking: true },
-    )
-    const assistant = result.filter(m => m.role === 'assistant')[0] as any
-    expect(assistant.content).toBeNull()
-    expect(assistant.reasoning_content).toBe('Reasoning only.')
-    expect(assistant.tool_calls).toHaveLength(1)
-  })
-})
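Review note: the enableThinking tests above imply a turn-boundary rule, but the converter itself is not part of this hunk. The sketch below is one rule that is consistent with those assertions, not the actual implementation. The `Msg` and `Block` types and both function names are hypothetical simplifications.

```typescript
// Hypothetical simplified shapes; the real message types are richer.
type Block =
  | { type: 'text'; text: string }
  | { type: 'thinking'; thinking: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown }
  | { type: 'tool_result'; tool_use_id: string; content: string }
type Msg = { role: 'user' | 'assistant'; content: string | Block[] }

// Assumption: a user message starts a new turn unless it only carries tool results.
function startsTurn(m: Msg): boolean {
  if (m.role !== 'user') return false
  if (typeof m.content === 'string') return true
  return m.content.some(b => b.type !== 'tool_result')
}

// For each message, the reasoning text to keep (undefined when dropped).
export function reasoningToKeep(messages: Msg[]): (string | undefined)[] {
  // Index of the last message that starts a turn; -1 when there is none.
  let boundary = -1
  messages.forEach((m, i) => {
    if (startsTurn(m)) boundary = i
  })

  return messages.map((m, i) => {
    // Only assistant messages after the boundary keep their reasoning.
    if (m.role !== 'assistant' || typeof m.content === 'string' || i < boundary) {
      return undefined
    }
    const parts: string[] = []
    for (const b of m.content) {
      // Empty thinking blocks are skipped.
      if (b.type === 'thinking' && b.thinking) parts.push(b.thinking)
    }
    return parts.length > 0 ? parts.join('\n') : undefined
  })
}
```

Under this rule a tool loop (assistant, tool_result, assistant, ...) stays inside one turn, which matches the multi-iteration test.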
--- a/src/services/api/openai/tests/convertTools.test.ts
+++ b/src/services/api/openai/tests/convertTools.test.ts
@@ -1,167 +0,0 @@
-import { describe, expect, test } from 'bun:test'
-import { anthropicToolsToOpenAI, anthropicToolChoiceToOpenAI } from '../convertTools.js'
-
-describe('anthropicToolsToOpenAI', () => {
-  test('converts basic tool', () => {
-    const tools = [
-      {
-        type: 'custom',
-        name: 'bash',
-        description: 'Run a bash command',
-        input_schema: {
-          type: 'object',
-          properties: { command: { type: 'string' } },
-          required: ['command'],
-        },
-      },
-    ]
-
-    const result = anthropicToolsToOpenAI(tools as any)
-
-    expect(result).toEqual([{
-      type: 'function',
-      function: {
-        name: 'bash',
-        description: 'Run a bash command',
-        parameters: {
-          type: 'object',
-          properties: { command: { type: 'string' } },
-          required: ['command'],
-        },
-      },
-    }])
-  })
-
-  test('uses empty schema when input_schema missing', () => {
-    const tools = [{ type: 'custom', name: 'noop', description: 'no-op' }]
-    const result = anthropicToolsToOpenAI(tools as any)
-
-    expect((result[0] as { function: { parameters: unknown } }).function.parameters).toEqual({ type: 'object', properties: {} })
-  })
-
-  test('strips Anthropic-specific fields', () => {
-    const tools = [
-      {
-        type: 'custom',
-        name: 'bash',
-        description: 'Run bash',
-        input_schema: { type: 'object', properties: {} },
-        cache_control: { type: 'ephemeral' },
-        defer_loading: true,
-      },
-    ]
-    const result = anthropicToolsToOpenAI(tools as any)
-
-    expect((result[0] as any).cache_control).toBeUndefined()
-    expect((result[0] as any).defer_loading).toBeUndefined()
-  })
-
-  test('handles empty tools array', () => {
-    expect(anthropicToolsToOpenAI([])).toEqual([])
-  })
-
-  test('sanitizes const to enum in tool schema', () => {
-    const tools = [
-      {
-        type: 'custom',
-        name: 'test',
-        description: 'test tool',
-        input_schema: {
-          type: 'object',
-          properties: {
-            mode: { const: 'read' },
-            name: { type: 'string' },
-          },
-        },
-      },
-    ]
-    const result = anthropicToolsToOpenAI(tools as any)
-    const props = (result[0] as { function: { parameters: any } }).function.parameters as any
-    expect(props.properties.mode).toEqual({ enum: ['read'] })
-    expect(props.properties.mode.const).toBeUndefined()
-    expect(props.properties.name).toEqual({ type: 'string' })
-  })
-
-  test('sanitizes const in deeply nested schemas', () => {
-    const tools = [
-      {
-        type: 'custom',
-        name: 'deep',
-        description: 'nested const',
-        input_schema: {
-          type: 'object',
-          properties: {
-            outer: {
-              type: 'object',
-              properties: {
-                inner: { const: 'fixed' },
-              },
-            },
-          },
-          definitions: {
-            MyType: {
-              type: 'object',
-              properties: {
-                field: { const: 42 },
-              },
-            },
-          },
-        },
-      },
-    ]
-    const result = anthropicToolsToOpenAI(tools as any)
-    const params = (result[0] as { function: { parameters: any } }).function.parameters as any
-    expect(params.properties.outer.properties.inner).toEqual({ enum: ['fixed'] })
-    expect(params.definitions.MyType.properties.field).toEqual({ enum: [42] })
-  })
-
-  test('sanitizes const in anyOf/oneOf/allOf', () => {
-    const tools = [
-      {
-        type: 'custom',
-        name: 'union',
-        description: 'union test',
-        input_schema: {
-          type: 'object',
-          properties: {
-            val: {
-              anyOf: [
-                { const: 'a' },
-                { const: 'b' },
-                { type: 'string' },
-              ],
-            },
-          },
-        },
-      },
-    ]
-    const result = anthropicToolsToOpenAI(tools as any)
-    const anyOf = ((result[0] as { function: { parameters: any } }).function.parameters as any).properties.val.anyOf
-    expect(anyOf[0]).toEqual({ enum: ['a'] })
-    expect(anyOf[1]).toEqual({ enum: ['b'] })
-    expect(anyOf[2]).toEqual({ type: 'string' })
-  })
-})
-
-describe('anthropicToolChoiceToOpenAI', () => {
-  test('maps auto', () => {
-    expect(anthropicToolChoiceToOpenAI({ type: 'auto' })).toBe('auto')
-  })
-
-  test('maps any to required', () => {
-    expect(anthropicToolChoiceToOpenAI({ type: 'any' })).toBe('required')
-  })
-
-  test('maps tool to function', () => {
-    const result = anthropicToolChoiceToOpenAI({ type: 'tool', name: 'bash' })
-    expect(result).toEqual({ type: 'function', function: { name: 'bash' } })
-  })
-
-  test('returns undefined for undefined input', () => {
-    expect(anthropicToolChoiceToOpenAI(undefined)).toBeUndefined()
-  })
-
-  test('returns undefined for unknown type', () => {
-    expect(anthropicToolChoiceToOpenAI({ type: 'unknown' })).toBeUndefined()
-  })
-})
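Review note: the three "sanitizes const" tests above pin down the observable behaviour (`{ const: x }` becomes `{ enum: [x] }` at any depth, including inside `definitions` and `anyOf`), but the sanitizer is not shown in this hunk. A minimal recursive sketch that satisfies those cases follows. `constToEnum` and the `Json` type are hypothetical names, not the package's API.

```typescript
// Hypothetical helper, written only to match the test expectations above.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json }

export function constToEnum(schema: Json): Json {
  // Arrays (anyOf/oneOf/allOf members, enum values, ...) are walked element-wise.
  if (Array.isArray(schema)) return schema.map(item => constToEnum(item))
  // Primitives and null are returned unchanged.
  if (schema === null || typeof schema !== 'object') return schema

  const out: { [key: string]: Json } = {}
  for (const [key, value] of Object.entries(schema)) {
    if (key === 'const') {
      // { const: x } becomes { enum: [x] }; the const key itself is dropped.
      out.enum = [value]
    } else {
      out[key] = constToEnum(value)
    }
  }
  return out
}
```

One caveat of this naive walk: a schema property that is literally named `const` (a key under `properties`) would also be rewritten. A production version would need to distinguish keywords from property names.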
--- a/src/services/api/openai/tests/modelMapping.test.ts
+++ b/src/services/api/openai/tests/modelMapping.test.ts
@@ -1,68 +0,0 @@
-import { describe, expect, test, beforeEach, afterEach } from 'bun:test'
-import { resolveOpenAIModel } from '../modelMapping.js'
-
-describe('resolveOpenAIModel', () => {
-  const originalEnv = {
-    OPENAI_MODEL: process.env.OPENAI_MODEL,
-    OPENAI_DEFAULT_HAIKU_MODEL: process.env.OPENAI_DEFAULT_HAIKU_MODEL,
-    OPENAI_DEFAULT_SONNET_MODEL: process.env.OPENAI_DEFAULT_SONNET_MODEL,
-    OPENAI_DEFAULT_OPUS_MODEL: process.env.OPENAI_DEFAULT_OPUS_MODEL,
-    ANTHROPIC_DEFAULT_HAIKU_MODEL: process.env.ANTHROPIC_DEFAULT_HAIKU_MODEL,
-    ANTHROPIC_DEFAULT_SONNET_MODEL: process.env.ANTHROPIC_DEFAULT_SONNET_MODEL,
-    ANTHROPIC_DEFAULT_OPUS_MODEL: process.env.ANTHROPIC_DEFAULT_OPUS_MODEL,
-  }
-
-  beforeEach(() => {
-    delete process.env.OPENAI_MODEL
-    delete process.env.OPENAI_DEFAULT_HAIKU_MODEL
-    delete process.env.OPENAI_DEFAULT_SONNET_MODEL
-    delete process.env.OPENAI_DEFAULT_OPUS_MODEL
-    delete process.env.ANTHROPIC_DEFAULT_HAIKU_MODEL
-    delete process.env.ANTHROPIC_DEFAULT_SONNET_MODEL
-    delete process.env.ANTHROPIC_DEFAULT_OPUS_MODEL
-  })
-
-  afterEach(() => {
-    Object.assign(process.env, originalEnv)
-  })
-
-  test('OPENAI_MODEL env var overrides all', () => {
-    process.env.OPENAI_MODEL = 'my-custom-model'
-    expect(resolveOpenAIModel('claude-sonnet-4-6')).toBe('my-custom-model')
-  })
-
-  test('ANTHROPIC_DEFAULT_SONNET_MODEL overrides default map', () => {
-    process.env.ANTHROPIC_DEFAULT_SONNET_MODEL = 'my-sonnet'
-    expect(resolveOpenAIModel('claude-sonnet-4-6')).toBe('my-sonnet')
-  })
-
-  test('ANTHROPIC_DEFAULT_HAIKU_MODEL overrides default map', () => {
-    process.env.ANTHROPIC_DEFAULT_HAIKU_MODEL = 'my-haiku'
-    expect(resolveOpenAIModel('claude-haiku-4-5-20251001')).toBe('my-haiku')
-  })
-
-  test('ANTHROPIC_DEFAULT_OPUS_MODEL overrides default map', () => {
-    process.env.ANTHROPIC_DEFAULT_OPUS_MODEL = 'my-opus'
-    expect(resolveOpenAIModel('claude-opus-4-6')).toBe('my-opus')
-  })
-
-  test('maps known Anthropic model via DEFAULT_MODEL_MAP', () => {
-    expect(resolveOpenAIModel('claude-sonnet-4-6')).toBe('gpt-4o')
-  })
-
-  test('maps haiku model', () => {
-    expect(resolveOpenAIModel('claude-haiku-4-5-20251001')).toBe('gpt-4o-mini')
-  })
-
-  test('maps opus model', () => {
-    expect(resolveOpenAIModel('claude-opus-4-6')).toBe('o3')
-  })
-
-  test('passes through unknown model name', () => {
-    expect(resolveOpenAIModel('some-random-model')).toBe('some-random-model')
-  })
-
-  test('strips [1m] suffix', () => {
-    expect(resolveOpenAIModel('claude-sonnet-4-6[1m]')).toBe('gpt-4o')
-  })
-})
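Review note: the `afterEach` in this removed test restores the environment with `Object.assign(process.env, originalEnv)`. In Node.js, assigning `undefined` to a `process.env` property stores the string `'undefined'`, so variables that were originally unset may come back as set. Whether Bun behaves identically was not verified here. The `runQueryModel` helper in `queryModelOpenAI.isolated.ts` already uses the safer delete-or-assign pattern. A sketch of that pattern as a reusable pair of helpers follows. `snapshotEnv` and `restoreEnv` are hypothetical names, and the env object is passed in so the sketch does not depend on Node typings.

```typescript
type Env = Record<string, string | undefined>

// Capture the current values of the given keys (undefined means "was not set").
export function snapshotEnv(env: Env, keys: string[]): Env {
  const saved: Env = {}
  for (const k of keys) saved[k] = env[k]
  return saved
}

// Put every captured key back: delete keys that were unset, reassign the rest.
export function restoreEnv(env: Env, saved: Env): void {
  for (const [k, v] of Object.entries(saved)) {
    if (v === undefined) delete env[k]
    else env[k] = v
  }
}
```

In a test file this would typically be called as `snapshotEnv(process.env, [...])` in `beforeEach` and `restoreEnv(process.env, saved)` in `afterEach`.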
--- a/src/services/api/openai/tests/queryModelOpenAI.isolated.ts
+++ b/src/services/api/openai/tests/queryModelOpenAI.isolated.ts
@@ -1,487 +0,0 @@
-/**
- * Tests for queryModelOpenAI in index.ts.
- *
- * Focused on the two bugs fixed:
- *  1. stop_reason was always null in the assembled AssistantMessage because
- *     partialMessage (from message_start) has stop_reason: null, and the
- *     stop_reason captured from message_delta was never applied.
- *  2. partialMessage was not reset to null after message_stop, so the safety
- *     fallback at the end of the loop would yield a second identical
- *     AssistantMessage (causing doubled content in the next API request).
- *
- * Strategy: mock getOpenAIClient + adaptOpenAIStreamToAnthropic so we can
- * feed pre-built Anthropic events directly into queryModelOpenAI and inspect
- * what it emits — without any real HTTP calls.
- */
-import { describe, expect, test, mock, beforeEach, afterEach } from 'bun:test'
-import type { BetaRawMessageStreamEvent } from '@anthropic-ai/sdk/resources/beta/messages/messages.mjs'
-import type { AssistantMessage, StreamEvent } from '../../../../types/message.js'
-
-// ─── helpers ─────────────────────────────────────────────────────────────────
-
-/** Build a minimal message_start event */
-function makeMessageStart(overrides: Record<string, any> = {}): BetaRawMessageStreamEvent {
-  return {
-    type: 'message_start',
-    message: {
-      id: 'msg_test',
-      type: 'message',
-      role: 'assistant',
-      content: [],
-      model: 'test-model',
-      stop_reason: null,
-      stop_sequence: null,
-      usage: { input_tokens: 0, output_tokens: 0, cache_creation_input_tokens: 0, cache_read_input_tokens: 0 },
-      ...overrides,
-    },
-  } as any
-}
-
-/** Build a content_block_start event for the given block type */
-function makeContentBlockStart(index: number, type: 'text' | 'tool_use' | 'thinking', extra: Record<string, any> = {}): BetaRawMessageStreamEvent {
-  const block =
-    type === 'text'
-      ? { type: 'text', text: '' }
-      : type === 'tool_use'
-        ? { type: 'tool_use', id: 'toolu_test', name: 'bash', input: {} }
-        : { type: 'thinking', thinking: '', signature: '' }
-  return { type: 'content_block_start', index, content_block: { ...block, ...extra } } as any
-}
-
-/** Build a text_delta content_block_delta event */
-function makeTextDelta(index: number, text: string): BetaRawMessageStreamEvent {
-  return { type: 'content_block_delta', index, delta: { type: 'text_delta', text } } as any
-}
-
-/** Build an input_json_delta content_block_delta event */
-function makeInputJsonDelta(index: number, json: string): BetaRawMessageStreamEvent {
-  return { type: 'content_block_delta', index, delta: { type: 'input_json_delta', partial_json: json } } as any
-}
-
-/** Build a thinking_delta content_block_delta event */
-function makeThinkingDelta(index: number, thinking: string): BetaRawMessageStreamEvent {
-  return { type: 'content_block_delta', index, delta: { type: 'thinking_delta', thinking } } as any
-}
-
-/** Build a content_block_stop event */
-function makeContentBlockStop(index: number): BetaRawMessageStreamEvent {
-  return { type: 'content_block_stop', index } as any
-}
-
-/** Build a message_delta event with stop_reason and output_tokens */
-function makeMessageDelta(stopReason: string, outputTokens: number): BetaRawMessageStreamEvent {
-  return {
-    type: 'message_delta',
-    delta: { stop_reason: stopReason, stop_sequence: null },
-    usage: { output_tokens: outputTokens },
-  } as any
-}
-
-/** Build a message_stop event */
-function makeMessageStop(): BetaRawMessageStreamEvent {
-  return { type: 'message_stop' } as any
-}
-
-/** Async generator from a fixed array of events */
-async function* eventStream(events: BetaRawMessageStreamEvent[]) {
-  for (const e of events) yield e
-}
-
-/** Collect all outputs from queryModelOpenAI into typed buckets */
-async function runQueryModel(
-  events: BetaRawMessageStreamEvent[],
-  envOverrides: Record<string, string | undefined> = {},
-) {
-  // Wire events into the mocked stream adapter
-  _nextEvents = events
-  // Save + apply env overrides
-  const saved: Record<string, string | undefined> = {}
-  for (const [k, v] of Object.entries(envOverrides)) {
-    saved[k] = process.env[k]
-    if (v === undefined) delete process.env[k]
-    else process.env[k] = v
-  }
-
-  try {
-    // The mock.module registrations live at module level (see "mock setup" below),
-    // so they are registered once for this test file. Only the import of index.js
-    // happens here, inside the try block, on each call.
-    const { queryModelOpenAI } = await import('../index.js')
-
-    const assistantMessages: AssistantMessage[] = []
-    const streamEvents: StreamEvent[] = []
-    const otherOutputs: any[] = []
-
-    const minimalOptions: any = {
-      model: 'test-model',
-      tools: [],
-      agents: [],
-      querySource: 'main_loop',
-      getToolPermissionContext: async () => ({
-        alwaysAllow: [],
-        alwaysDeny: [],
-        needsPermission: [],
-        mode: 'default',
-        isBypassingPermissions: false,
-      }),
-    }
-
-    for await (const item of queryModelOpenAI(
-      [],
-      { type: 'text', text: '' } as any,
-      [],
-      new AbortController().signal,
-      minimalOptions,
-    )) {
-      if (item.type === 'assistant') {
-        assistantMessages.push(item as AssistantMessage)
-      } else if (item.type === 'stream_event') {
-        streamEvents.push(item as StreamEvent)
-      } else {
-        otherOutputs.push(item)
-      }
-    }
-
-    return { assistantMessages, streamEvents, otherOutputs }
-  } finally {
-    // Restore env
-    for (const [k, v] of Object.entries(saved)) {
-      if (v === undefined) delete process.env[k]
-      else process.env[k] = v
-    }
-  }
-}
-
-// ─── mock setup ──────────────────────────────────────────────────────────────
-
-// We mock at module level. Bun's mock.module replaces the module for the
-// entire file, so we configure the stream per-test via a shared variable.
-let _nextEvents: BetaRawMessageStreamEvent[] = []
-
-/** Captured arguments from the last chat.completions.create() call */
-let _lastCreateArgs: Record<string, any> | null = null
-
-mock.module('../client.js', () => ({
-  getOpenAIClient: () => ({
-    chat: {
-      completions: {
-        create: async (args: Record<string, any>) => {
-          _lastCreateArgs = args
-          return { [Symbol.asyncIterator]: async function* () {} }
-        },
-      },
-    },
-  }),
-}))
-
-mock.module('../streamAdapter.js', () => ({
-  adaptOpenAIStreamToAnthropic: (_stream: any, _model: string) => eventStream(_nextEvents),
-}))
-
-mock.module('../modelMapping.js', () => ({
-  resolveOpenAIModel: (m: string) => m,
-}))
-
-mock.module('../convertMessages.js', () => ({
-  anthropicMessagesToOpenAI: () => [],
-}))
-
-mock.module('../convertTools.js', () => ({
-  anthropicToolsToOpenAI: () => [],
-  anthropicToolChoiceToOpenAI: () => undefined,
-}))
-
-mock.module('../../../../utils/context.js', () => ({
-  MODEL_CONTEXT_WINDOW_DEFAULT: 200_000,
-  COMPACT_MAX_OUTPUT_TOKENS: 20_000,
-  CAPPED_DEFAULT_MAX_TOKENS: 8_000,
-  ESCALATED_MAX_TOKENS: 64_000,
-  is1mContextDisabled: () => false,
-  has1mContext: () => false,
-  modelSupports1M: () => false,
-  getModelMaxOutputTokens: () => ({ upperLimit: 8192, default: 8192 }),
-  getContextWindowForModel: () => 200_000,
-  getSonnet1mExpTreatmentEnabled: () => false,
-  calculateContextPercentages: () => ({ usedPercent: 0, remainingPercent: 100 }),
-  getMaxThinkingTokensForModel: () => 0,
-}))
-
-mock.module('../../../../utils/messages.js', () => ({
-  normalizeMessagesForAPI: (msgs: any) => msgs,
-  normalizeContentFromAPI: (blocks: any[]) => blocks,
-  createAssistantAPIErrorMessage: (opts: any) => ({
-    type: 'assistant',
-    message: { content: [{ type: 'text', text: opts.content }], apiError: opts.apiError },
-    uuid: 'error-uuid',
-    timestamp: new Date().toISOString(),
-  }),
-}))
-
-mock.module('../../../../utils/api.js', () => ({
-  toolToAPISchema: async (t: any) => t,
-}))
-
-mock.module('../../../../utils/toolSearch.js', () => ({
-  isToolSearchEnabled: async () => false,
-  extractDiscoveredToolNames: () => new Set(),
-}))
-
-mock.module('../../../../tools/ToolSearchTool/prompt.js', () => ({
-  isDeferredTool: () => false,
-  TOOL_SEARCH_TOOL_NAME: '__tool_search__',
-}))
-
-mock.module('../../../../cost-tracker.js', () => ({
-  addToTotalSessionCost: () => {},
-}))
-
-mock.module('../../../../utils/modelCost.js', () => ({
-  COST_TIER_3_15: {},
-  COST_TIER_15_75: {},
-  COST_TIER_5_25: {},
-  COST_TIER_30_150: {},
-  COST_HAIKU_35: {},
-  COST_HAIKU_45: {},
-  getOpus46CostTier: () => ({}),
-  MODEL_COSTS: {},
-  getModelCosts: () => ({}),
-  calculateUSDCost: () => 0,
-  calculateCostFromTokens: () => 0,
-  formatModelPricing: () => '',
-  getModelPricingString: () => undefined,
-}))
-
-mock.module('../../../../utils/debug.js', () => ({
-  logForDebugging: () => {},
-  logAntError: () => {},
-  isDebugMode: () => false,
-  isDebugToStdErr: () => false,
-  getDebugFilePath: () => null,
-  getDebugLogPath: () => '',
-  getDebugFilter: () => null,
-  getMinDebugLogLevel: () => 'debug',
-  enableDebugLogging: () => false,
-  setHasFormattedOutput: () => {},
-  getHasFormattedOutput: () => false,
-  flushDebugLogs: async () => {},
-}))
-
-// ─── tests ───────────────────────────────────────────────────────────────────
-
-describe('queryModelOpenAI — stop_reason propagation', () => {
-  test('assembled AssistantMessage has stop_reason end_turn (not null)', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'Hello'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 10),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-    expect(assistantMessages[0]!.message.stop_reason).toBe('end_turn')
-  })
-
-  test('assembled AssistantMessage has stop_reason tool_use', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'tool_use'),
-      makeInputJsonDelta(0, '{"cmd":"ls"}'),
-      makeContentBlockStop(0),
-      makeMessageDelta('tool_use', 20),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-    expect(assistantMessages[0]!.message.stop_reason).toBe('tool_use')
-  })
-
-  test('assembled AssistantMessage has stop_reason max_tokens', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'truncated'),
-      makeContentBlockStop(0),
-      makeMessageDelta('max_tokens', 8192),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    // Two assistant-typed items: the content message + the max_output_tokens error signal.
-    // The error signal is emitted as a synthetic assistant message by createAssistantAPIErrorMessage.
-    expect(assistantMessages).toHaveLength(2)
-    const contentMsg = assistantMessages[0]!
-    expect(contentMsg.message.stop_reason).toBe('max_tokens')
-    // Second item is the error signal (has apiError set)
-    const errorMsg = assistantMessages[1]!.message as any
-    expect(errorMsg.apiError).toBe('max_output_tokens')
-  })
-
-  test('stop_reason is null when no message_delta was received (safety fallback path)', async () => {
-    // Stream ends without message_stop — triggers the safety fallback branch.
-    // stop_reason stays null since no message_delta was ever seen.
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'partial'),
-      makeContentBlockStop(0),
-      // No message_delta / message_stop
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    // Safety fallback should yield the partial content
-    expect(assistantMessages).toHaveLength(1)
-    expect(assistantMessages[0]!.message.stop_reason).toBeNull()
-  })
-})
-
-describe('queryModelOpenAI — usage accumulation', () => {
-  test('usage in assembled message reflects all four fields from message_delta', async () => {
-    // message_start has all fields=0 (trailing-chunk pattern: usage not yet available).
-    // message_delta carries the real values after stream ends.
-    // The spread in the message_delta handler must override all zeros from message_start,
-    // including cache_read_input_tokens which was previously missing from message_delta.
-    _nextEvents = [
-      makeMessageStart({ usage: { input_tokens: 0, output_tokens: 0, cache_creation_input_tokens: 0, cache_read_input_tokens: 0 } }),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'response'),
-      makeContentBlockStop(0),
-      // message_delta carries all four Anthropic usage fields (as emitted by the fixed streamAdapter)
-      {
-        type: 'message_delta',
-        delta: { stop_reason: 'end_turn', stop_sequence: null },
-        usage: { input_tokens: 30011, output_tokens: 190, cache_read_input_tokens: 19904, cache_creation_input_tokens: 0 },
-      } as any,
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-    const usage = assistantMessages[0]!.message.usage as any
-    expect(usage.input_tokens).toBe(30011)
-    expect(usage.output_tokens).toBe(190)
-    // cache_read_input_tokens from message_delta overrides the 0 from message_start
-    expect(usage.cache_read_input_tokens).toBe(19904)
-    expect(usage.cache_creation_input_tokens).toBe(0)
-  })
-
-  test('usage is zero when no usage events arrive (prevents false autocompact)', async () => {
-    // If usage stays 0, tokenCountWithEstimation will undercount, so this test only
-    // checks that the usage fields exist and are numeric (to detect regressions).
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'hi'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 0),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    const usage = assistantMessages[0]!.message.usage as any
-    expect(typeof usage.input_tokens).toBe('number')
-    expect(typeof usage.output_tokens).toBe('number')
-  })
-})
-
-describe('queryModelOpenAI — no duplicate AssistantMessage (partialMessage reset)', () => {
-  test('yields exactly one AssistantMessage per message_stop when content is present', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'only once'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 5),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    // Before the fix, partialMessage was not reset to null, so the safety
-    // fallback at the end of the loop would yield a second message with the
-    // same message.id — causing mergeAssistantMessages to concatenate content.
-    expect(assistantMessages).toHaveLength(1)
-  })
-
-  test('thinking + text response yields exactly one AssistantMessage', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'thinking'),
-      makeThinkingDelta(0, 'let me think'),
-      makeContentBlockStop(0),
-      makeContentBlockStart(1, 'text'),
-      makeTextDelta(1, 'answer'),
-      makeContentBlockStop(1),
-      makeMessageDelta('end_turn', 30),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-  })
-
-  test('safety fallback path still yields message when stream ends without message_stop', async () => {
-    // Simulates a stream that cuts off without the normal termination sequence.
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'abrupt end'),
-      // No content_block_stop, no message_delta, no message_stop
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-  })
-})
-
-describe('queryModelOpenAI — stream_events forwarded', () => {
-  test('every adapted event is also yielded as stream_event for real-time display', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'hello'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 5),
-      makeMessageStop(),
-    ]
-
-    const { streamEvents } = await runQueryModel(_nextEvents)
-
-    const eventTypes = streamEvents.map(e => (e as any).event?.type)
-    expect(eventTypes).toContain('message_start')
-    expect(eventTypes).toContain('content_block_start')
-    expect(eventTypes).toContain('content_block_delta')
-    expect(eventTypes).toContain('content_block_stop')
-    expect(eventTypes).toContain('message_delta')
-    expect(eventTypes).toContain('message_stop')
-  })
-})
-
-describe('queryModelOpenAI — max_tokens forwarded to request', () => {
-  test('buildOpenAIRequestBody includes max_tokens in the request payload', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'hi'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 5),
-      makeMessageStop(),
-    ]
-
-    await runQueryModel(_nextEvents)
-
-    expect(_lastCreateArgs).not.toBeNull()
-    expect(_lastCreateArgs!.max_tokens).toBe(8192)
-  })
-})
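The suites in the file above pin down two behaviours of the event loop: the `stop_reason` and usage carried by `message_delta` must be merged into the message opened by `message_start`, and the partial message must be cleared at `message_stop` so the end-of-stream fallback cannot emit it a second time. A minimal sketch of that logic follows, assuming heavily simplified event shapes. `assembleMessages`, `Ev`, `Msg`, and `Usage` are hypothetical names for illustration and are not the actual `queryModelOpenAI` implementation or the SDK types.

```typescript
// Hypothetical, simplified event and message shapes (not the SDK types).
type Usage = Record<string, number>
type Msg = { id: string; text: string; stop_reason: string | null; usage: Usage }
type Ev =
  | { type: 'message_start'; id: string; usage: Usage }
  | { type: 'text_delta'; text: string }
  | { type: 'message_delta'; stop_reason: string; usage: Usage }
  | { type: 'message_stop' }

function assembleMessages(events: Ev[]): Msg[] {
  const out: Msg[] = []
  let partial: Msg | null = null

  for (const ev of events) {
    switch (ev.type) {
      case 'message_start':
        // stop_reason is unknown at this point, so it starts as null.
        partial = { id: ev.id, text: '', stop_reason: null, usage: { ...ev.usage } }
        break
      case 'text_delta':
        if (partial) partial.text += ev.text
        break
      case 'message_delta':
        if (partial) {
          // Apply the real stop_reason; later usage values override earlier zeros.
          partial.stop_reason = ev.stop_reason
          partial.usage = { ...partial.usage, ...ev.usage }
        }
        break
      case 'message_stop':
        if (partial) out.push(partial)
        // Reset so the fallback below cannot emit a duplicate.
        partial = null
        break
    }
  }

  // Safety fallback: the stream ended without message_stop.
  if (partial) out.push(partial)
  return out
}
```

Without the reset in the `message_stop` branch, a complete stream would produce two entries for the same message id, which is the duplication the "partialMessage reset" suite guards against.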
--- a/src/services/api/openai/tests/queryModelOpenAI.test.ts
+++ b/src/services/api/openai/tests/queryModelOpenAI.test.ts
@@ -1,559 +0,0 @@
-/**
- * Tests for queryModelOpenAI in index.ts.
- *
- * Focused on the two bugs fixed:
- *  1. stop_reason was always null in the assembled AssistantMessage because
- *     partialMessage (from message_start) has stop_reason: null, and the
- *     stop_reason captured from message_delta was never applied.
- *  2. partialMessage was not reset to null after message_stop, so the safety
- *     fallback at the end of the loop would yield a second identical
- *     AssistantMessage (causing doubled content in the next API request).
- *
- * Strategy: mock getOpenAIClient + adaptOpenAIStreamToAnthropic so we can
- * feed pre-built Anthropic events directly into queryModelOpenAI and inspect
- * what it emits — without any real HTTP calls.
- */
-import { describe, expect, test, mock } from 'bun:test'
-import type { BetaRawMessageStreamEvent } from '@anthropic-ai/sdk/resources/beta/messages/messages.mjs'
-import type { AssistantMessage, StreamEvent } from '../../../../types/message.js'
-
-// ─── helpers ─────────────────────────────────────────────────────────────────
-
-/** Build a minimal message_start event */
-function makeMessageStart(overrides: Record<string, any> = {}): BetaRawMessageStreamEvent {
-  return {
-    type: 'message_start',
-    message: {
-      id: 'msg_test',
-      type: 'message',
-      role: 'assistant',
-      content: [],
-      model: 'test-model',
-      stop_reason: null,
-      stop_sequence: null,
-      usage: { input_tokens: 0, output_tokens: 0, cache_creation_input_tokens: 0, cache_read_input_tokens: 0 },
-      ...overrides,
-    },
-  } as any
-}
-
-/** Build a content_block_start event for the given block type */
-function makeContentBlockStart(index: number, type: 'text' | 'tool_use' | 'thinking', extra: Record<string, any> = {}): BetaRawMessageStreamEvent {
-  const block =
-    type === 'text'
-      ? { type: 'text', text: '' }
-      : type === 'tool_use'
-        ? { type: 'tool_use', id: 'toolu_test', name: 'bash', input: {} }
-        : { type: 'thinking', thinking: '', signature: '' }
-  return { type: 'content_block_start', index, content_block: { ...block, ...extra } } as any
-}
-
-/** Build a text_delta content_block_delta event */
-function makeTextDelta(index: number, text: string): BetaRawMessageStreamEvent {
-  return { type: 'content_block_delta', index, delta: { type: 'text_delta', text } } as any
-}
-
-/** Build an input_json_delta content_block_delta event */
-function makeInputJsonDelta(index: number, json: string): BetaRawMessageStreamEvent {
-  return { type: 'content_block_delta', index, delta: { type: 'input_json_delta', partial_json: json } } as any
-}
-
-/** Build a thinking_delta content_block_delta event */
-function makeThinkingDelta(index: number, thinking: string): BetaRawMessageStreamEvent {
-  return { type: 'content_block_delta', index, delta: { type: 'thinking_delta', thinking } } as any
-}
-
-/** Build a content_block_stop event */
-function makeContentBlockStop(index: number): BetaRawMessageStreamEvent {
-  return { type: 'content_block_stop', index } as any
-}
-
-/** Build a message_delta event with stop_reason and output_tokens */
-function makeMessageDelta(stopReason: string, outputTokens: number): BetaRawMessageStreamEvent {
-  return {
-    type: 'message_delta',
-    delta: { stop_reason: stopReason, stop_sequence: null },
-    usage: { output_tokens: outputTokens },
-  } as any
-}
-
-/** Build a message_stop event */
-function makeMessageStop(): BetaRawMessageStreamEvent {
-  return { type: 'message_stop' } as any
-}
-
-/** Async generator from a fixed array of events */
-async function* eventStream(events: BetaRawMessageStreamEvent[]) {
-  for (const e of events) yield e
-}
-
-/** Collect all outputs from queryModelOpenAI into typed buckets */
-async function runQueryModel(
-  events: BetaRawMessageStreamEvent[],
-  envOverrides: Record<string, string | undefined> = {},
-) {
-  // Wire events into the mocked stream adapter
-  _nextEvents = events
-  // Save + apply env overrides
-  const saved: Record<string, string | undefined> = {}
-  for (const [k, v] of Object.entries(envOverrides)) {
-    saved[k] = process.env[k]
-    if (v === undefined) delete process.env[k]
-    else process.env[k] = v
-  }
-
-  try {
-    // mock.module calls are registered once at module level (see "mock setup"
-    // below), not inside this try block. Importing the module under test here,
-    // after registration, ensures it resolves against the mocked dependencies.
-    const { queryModelOpenAI } = await import('../index.js')
-
-    const assistantMessages: AssistantMessage[] = []
-    const streamEvents: StreamEvent[] = []
-    const otherOutputs: any[] = []
-
-    const minimalOptions: any = {
-      model: 'test-model',
-      tools: [],
-      agents: [],
-      querySource: 'main_loop',
-      getToolPermissionContext: async () => ({
-        alwaysAllow: [],
-        alwaysDeny: [],
-        needsPermission: [],
-        mode: 'default',
-        isBypassingPermissions: false,
-      }),
-    }
-
-    for await (const item of queryModelOpenAI(
-      [],
-      { type: 'text', text: '' } as any,
-      [],
-      new AbortController().signal,
-      minimalOptions,
-    )) {
-      if (item.type === 'assistant') {
-        assistantMessages.push(item as AssistantMessage)
-      } else if (item.type === 'stream_event') {
-        streamEvents.push(item as StreamEvent)
-      } else {
-        otherOutputs.push(item)
-      }
-    }
-
-    return { assistantMessages, streamEvents, otherOutputs }
-  } finally {
-    // Restore env
-    for (const [k, v] of Object.entries(saved)) {
-      if (v === undefined) delete process.env[k]
-      else process.env[k] = v
-    }
-  }
-}
-
-// ─── mock setup ──────────────────────────────────────────────────────────────
-
-// We mock at module level. Bun's mock.module replaces the module for the
-// entire file, so we configure the stream per-test via a shared variable.
-let _nextEvents: BetaRawMessageStreamEvent[] = []
-
-/** Captured arguments from the last chat.completions.create() call */
-let _lastCreateArgs: Record<string, any> | null = null
-
-mock.module('../client.js', () => ({
-  getOpenAIClient: () => ({
-    chat: {
-      completions: {
-        create: async (args: Record<string, any>) => {
-          _lastCreateArgs = args
-          return { [Symbol.asyncIterator]: async function* () {} }
-        },
-      },
-    },
-  }),
-}))
-
-mock.module('../streamAdapter.js', () => ({
-  adaptOpenAIStreamToAnthropic: (_stream: any, _model: string) => eventStream(_nextEvents),
-}))
-
-mock.module('../modelMapping.js', () => ({
-  resolveOpenAIModel: (m: string) => m,
-}))
-
-mock.module('../convertMessages.js', () => ({
-  anthropicMessagesToOpenAI: () => [],
-}))
-
-mock.module('../convertTools.js', () => ({
-  anthropicToolsToOpenAI: () => [],
-  anthropicToolChoiceToOpenAI: () => undefined,
-}))
-
-mock.module('../../../../utils/context.js', () => ({
-  getModelMaxOutputTokens: () => ({ upperLimit: 8192, default: 8192 }),
-  getContextWindowForModel: () => 200_000,
-  modelSupports1M: () => false,
-  has1mContext: () => false,
-  is1mContextDisabled: () => false,
-  getSonnet1mExpTreatmentEnabled: () => false,
-  MODEL_CONTEXT_WINDOW_DEFAULT: 200_000,
-  COMPACT_MAX_OUTPUT_TOKENS: 20_000,
-  CAPPED_DEFAULT_MAX_TOKENS: 8_000,
-  ESCALATED_MAX_TOKENS: 64_000,
-  calculateContextPercentages: () => ({ used: null, remaining: null }),
-  getMaxThinkingTokensForModel: () => 8191,
-}))
-
-mock.module('../../../../utils/messages.js', () => ({
-  normalizeMessagesForAPI: (msgs: any) => msgs,
-  normalizeContentFromAPI: (blocks: any[]) => blocks,
-  createAssistantAPIErrorMessage: (opts: any) => ({
-    type: 'assistant',
-    message: { content: [{ type: 'text', text: opts.content }], apiError: opts.apiError },
-    uuid: 'error-uuid',
-    timestamp: new Date().toISOString(),
-  }),
-}))
-
-mock.module('../../../../utils/api.js', () => ({
-  toolToAPISchema: async (t: any) => t,
-}))
-
-mock.module('../../../../Tool.js', () => ({
-  getEmptyToolPermissionContext: () => ({
-    alwaysAllow: [],
-    alwaysDeny: [],
-    needsPermission: [],
-    mode: 'default',
-    isBypassingPermissions: false,
-  }),
-  toolMatchesName: () => false,
-}))
-
-mock.module('../../../../utils/envUtils.js', () => ({
-  isEnvTruthy: (v: string | undefined) => v === '1' || v === 'true',
-  isEnvDefinedFalsy: (v: string | undefined) => v === '0' || v === 'false' || v === 'no' || v === 'off',
-}))
-
-mock.module('../../../../utils/toolSearch.js', () => ({
-  isToolSearchEnabled: async () => false,
-  extractDiscoveredToolNames: () => new Set(),
-}))
-
-mock.module('../../../../tools/ToolSearchTool/prompt.js', () => ({
-  isDeferredTool: () => false,
-  TOOL_SEARCH_TOOL_NAME: '__tool_search__',
-}))
-
-mock.module('../../../../cost-tracker.js', () => ({
-  addToTotalSessionCost: () => {},
-}))
-
-mock.module('../../../../utils/modelCost.js', () => ({
-  calculateUSDCost: () => 0,
-}))
-
-mock.module('../../../../utils/debug.js', () => ({
-  logForDebugging: () => {},
-}))
-
-// ─── tests ───────────────────────────────────────────────────────────────────
-
-describe('queryModelOpenAI — stop_reason propagation', () => {
-  test('assembled AssistantMessage has stop_reason end_turn (not null)', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'Hello'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 10),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-    expect(assistantMessages[0]!.message.stop_reason).toBe('end_turn')
-  })
-
-  test('assembled AssistantMessage has stop_reason tool_use', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'tool_use'),
-      makeInputJsonDelta(0, '{"cmd":"ls"}'),
-      makeContentBlockStop(0),
-      makeMessageDelta('tool_use', 20),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-    expect(assistantMessages[0]!.message.stop_reason).toBe('tool_use')
-  })
-
-  test('assembled AssistantMessage has stop_reason max_tokens', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'truncated'),
-      makeContentBlockStop(0),
-      makeMessageDelta('max_tokens', 8192),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    // Two assistant-typed items: the content message + the max_output_tokens error signal.
-    // The error signal is emitted as a synthetic assistant message by createAssistantAPIErrorMessage.
-    expect(assistantMessages).toHaveLength(2)
-    const contentMsg = assistantMessages[0]!
-    expect(contentMsg.message.stop_reason).toBe('max_tokens')
-    // Second item is the error signal (has apiError set)
-    const errorMsg = assistantMessages[1]!.message as any
-    expect(errorMsg.apiError).toBe('max_output_tokens')
-  })
-
-  test('stop_reason is null when no message_delta was received (safety fallback path)', async () => {
-    // Stream ends without message_stop — triggers the safety fallback branch.
-    // stop_reason stays null since no message_delta was ever seen.
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'partial'),
-      makeContentBlockStop(0),
-      // No message_delta / message_stop
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    // Safety fallback should yield the partial content
-    expect(assistantMessages).toHaveLength(1)
-    expect(assistantMessages[0]!.message.stop_reason).toBeNull()
-  })
-})
-
-describe('queryModelOpenAI — usage accumulation', () => {
-  test('usage in assembled message reflects all four fields from message_delta', async () => {
-    // message_start has all fields=0 (trailing-chunk pattern: usage not yet available).
-    // message_delta carries the real values after stream ends.
-    // The spread in the message_delta handler must override all zeros from message_start,
-    // including cache_read_input_tokens which was previously missing from message_delta.
-    _nextEvents = [
-      makeMessageStart({ usage: { input_tokens: 0, output_tokens: 0, cache_creation_input_tokens: 0, cache_read_input_tokens: 0 } }),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'response'),
-      makeContentBlockStop(0),
-      // message_delta carries all four Anthropic usage fields (as emitted by the fixed streamAdapter)
-      {
-        type: 'message_delta',
-        delta: { stop_reason: 'end_turn', stop_sequence: null },
-        usage: { input_tokens: 30011, output_tokens: 190, cache_read_input_tokens: 19904, cache_creation_input_tokens: 0 },
-      } as any,
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-    const usage = assistantMessages[0]!.message.usage as any
-    expect(usage.input_tokens).toBe(30011)
-    expect(usage.output_tokens).toBe(190)
-    // cache_read_input_tokens from message_delta overrides the 0 from message_start
-    expect(usage.cache_read_input_tokens).toBe(19904)
-    expect(usage.cache_creation_input_tokens).toBe(0)
-  })
-
-  test('usage is zero when no usage events arrive (prevents false autocompact)', async () => {
-    // If usage stays 0, tokenCountWithEstimation will undercount, so this test only
-    // checks that the usage fields exist and are numeric (to detect regressions).
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'hi'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 0),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    const usage = assistantMessages[0]!.message.usage as any
-    expect(typeof usage.input_tokens).toBe('number')
-    expect(typeof usage.output_tokens).toBe('number')
-  })
-})
-
-describe('queryModelOpenAI — no duplicate AssistantMessage (partialMessage reset)', () => {
-  test('yields exactly one AssistantMessage per message_stop when content is present', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'only once'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 5),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    // Before the fix, partialMessage was not reset to null, so the safety
-    // fallback at the end of the loop would yield a second message with the
-    // same message.id — causing mergeAssistantMessages to concatenate content.
-    expect(assistantMessages).toHaveLength(1)
-  })
-
-  test('thinking + text response yields exactly one AssistantMessage', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'thinking'),
-      makeThinkingDelta(0, 'let me think'),
-      makeContentBlockStop(0),
-      makeContentBlockStart(1, 'text'),
-      makeTextDelta(1, 'answer'),
-      makeContentBlockStop(1),
-      makeMessageDelta('end_turn', 30),
-      makeMessageStop(),
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-  })
-
-  test('safety fallback path still yields message when stream ends without message_stop', async () => {
-    // Simulates a stream that cuts off without the normal termination sequence.
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'abrupt end'),
-      // No content_block_stop, no message_delta, no message_stop
-    ]
-
-    const { assistantMessages } = await runQueryModel(_nextEvents)
-
-    expect(assistantMessages).toHaveLength(1)
-  })
-})
-
-describe('queryModelOpenAI — stream_events forwarded', () => {
-  test('every adapted event is also yielded as stream_event for real-time display', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'hello'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 5),
-      makeMessageStop(),
-    ]
-
-    const { streamEvents } = await runQueryModel(_nextEvents)
-
-    const eventTypes = streamEvents.map(e => (e as any).event?.type)
-    expect(eventTypes).toContain('message_start')
-    expect(eventTypes).toContain('content_block_start')
-    expect(eventTypes).toContain('content_block_delta')
-    expect(eventTypes).toContain('content_block_stop')
-    expect(eventTypes).toContain('message_delta')
-    expect(eventTypes).toContain('message_stop')
-  })
-})
-
-describe('queryModelOpenAI — max_tokens forwarded to request', () => {
-  test('buildOpenAIRequestBody includes max_tokens in the request payload', async () => {
-    _nextEvents = [
-      makeMessageStart(),
-      makeContentBlockStart(0, 'text'),
-      makeTextDelta(0, 'hi'),
-      makeContentBlockStop(0),
-      makeMessageDelta('end_turn', 5),
-      makeMessageStop(),
-    ]
-
-    await runQueryModel(_nextEvents)
-
-    expect(_lastCreateArgs).not.toBeNull()
-    expect(_lastCreateArgs!.max_tokens).toBe(8192)
-  })
-
-  test('OPENAI_MAX_TOKENS env var overrides max_tokens', async () => {
-    const original = process.env.OPENAI_MAX_TOKENS
-    process.env.OPENAI_MAX_TOKENS = '4096'
-    try {
-      _nextEvents = [
-        makeMessageStart(),
-        makeContentBlockStart(0, 'text'),
-        makeTextDelta(0, 'hi'),
-        makeContentBlockStop(0),
-        makeMessageDelta('end_turn', 5),
-        makeMessageStop(),
-      ]
-
-      await runQueryModel(_nextEvents)
-
-      expect(_lastCreateArgs).not.toBeNull()
-      expect(_lastCreateArgs!.max_tokens).toBe(4096)
-    } finally {
-      if (original === undefined) {
-        delete process.env.OPENAI_MAX_TOKENS
-      } else {
-        process.env.OPENAI_MAX_TOKENS = original
-      }
-    }
-  })
-
-  test('CLAUDE_CODE_MAX_OUTPUT_TOKENS env var overrides max_tokens', async () => {
-    const original = process.env.CLAUDE_CODE_MAX_OUTPUT_TOKENS
-    process.env.CLAUDE_CODE_MAX_OUTPUT_TOKENS = '2048'
-    try {
-      _nextEvents = [
-        makeMessageStart(),
-        makeContentBlockStart(0, 'text'),
-        makeTextDelta(0, 'hi'),
-        makeContentBlockStop(0),
-        makeMessageDelta('end_turn', 5),
-        makeMessageStop(),
-      ]
-
-      await runQueryModel(_nextEvents)
-
-      expect(_lastCreateArgs).not.toBeNull()
-      expect(_lastCreateArgs!.max_tokens).toBe(2048)
-    } finally {
-      if (original === undefined) {
-        delete process.env.CLAUDE_CODE_MAX_OUTPUT_TOKENS
-      } else {
-        process.env.CLAUDE_CODE_MAX_OUTPUT_TOKENS = original
-      }
-    }
-  })
-
-  test('OPENAI_MAX_TOKENS takes priority over CLAUDE_CODE_MAX_OUTPUT_TOKENS', async () => {
-    const origOpenai = process.env.OPENAI_MAX_TOKENS
-    const origClaude = process.env.CLAUDE_CODE_MAX_OUTPUT_TOKENS
-    process.env.OPENAI_MAX_TOKENS = '4096'
-    process.env.CLAUDE_CODE_MAX_OUTPUT_TOKENS = '2048'
-    try {
-      _nextEvents = [
-        makeMessageStart(),
-        makeContentBlockStart(0, 'text'),
-        makeTextDelta(0, 'hi'),
-        makeContentBlockStop(0),
-        makeMessageDelta('end_turn', 5),
-        makeMessageStop(),
-      ]
-
-      await runQueryModel(_nextEvents)
-
-      expect(_lastCreateArgs).not.toBeNull()
-      expect(_lastCreateArgs!.max_tokens).toBe(4096)
-    } finally {
-      if (origOpenai === undefined) delete process.env.OPENAI_MAX_TOKENS
-      else process.env.OPENAI_MAX_TOKENS = origOpenai
-      if (origClaude === undefined) delete process.env.CLAUDE_CODE_MAX_OUTPUT_TOKENS
-      else process.env.CLAUDE_CODE_MAX_OUTPUT_TOKENS = origClaude
-    }
-  })
-})
--- a/src/services/api/openai/tests/streamAdapter.test.ts
+++ b/src/services/api/openai/tests/streamAdapter.test.ts
@@ -1,679 +0,0 @@
-import { describe, expect, test } from 'bun:test'
-import type { ChatCompletionChunk } from 'openai/resources/chat/completions/completions.mjs'
-import { join, dirname } from 'path'
-import { fileURLToPath } from 'url'
-import { readFileSync, writeFileSync, mkdirSync } from 'fs'
-import { tmpdir } from 'os'
-
-// Guard against mock pollution from queryModelOpenAI.test.ts which replaces
-// ../streamAdapter.js process-wide via mock.module (bun has no un-mock API).
-// We copy the source to a unique temp path so the import bypasses bun's
-// module mock cache completely.
-const _testDir = dirname(fileURLToPath(import.meta.url))
-const _realSource = readFileSync(join(_testDir, '..', 'streamAdapter.ts'), 'utf-8')
-const _tempDir = join(tmpdir(), `stream-adapter-test-${Date.now()}`)
-mkdirSync(_tempDir, { recursive: true })
-const _tempFile = join(_tempDir, 'streamAdapter.ts')
-writeFileSync(_tempFile, _realSource, 'utf-8')
-const { adaptOpenAIStreamToAnthropic } = await import(_tempFile)
-
-/** Helper to create a mock async iterable from chunk array */
-function mockStream(chunks: ChatCompletionChunk[]): AsyncIterable<ChatCompletionChunk> {
-  return {
-    [Symbol.asyncIterator]() {
-      let i = 0
-      return {
-        async next() {
-          if (i >= chunks.length) return { done: true, value: undefined }
-          return { done: false, value: chunks[i++] }
-        },
-      }
-    },
-  }
-}
-
-/** Create a minimal ChatCompletionChunk */
-function makeChunk(overrides: Partial<ChatCompletionChunk> & any = {}): ChatCompletionChunk {
-  return {
-    id: 'chatcmpl-test',
-    object: 'chat.completion.chunk',
-    created: 1234567890,
-    model: 'gpt-4o',
-    choices: [],
-    ...overrides,
-  } as ChatCompletionChunk
-}
-
-/** Collect all emitted Anthropic events from the stream adapter for assertion */
-async function collectEvents(chunks: ChatCompletionChunk[]) {
-  const realModuleUrl = new URL(
-    `../streamAdapter.js?real=${Date.now()}-${Math.random().toString(36).slice(2)}`,
-    import.meta.url,
-  ).href
-  const { adaptOpenAIStreamToAnthropic } = await import(realModuleUrl)
-  const events: any[] = []
-  for await (const event of adaptOpenAIStreamToAnthropic(mockStream(chunks), 'gpt-4o')) {
-    events.push(event)
-  }
-  return events
-}
-
-describe('adaptOpenAIStreamToAnthropic', () => {
-  test('emits message_start on first chunk', async () => {
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: { role: 'assistant', content: '' },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: { content: 'hello' },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: {},
-          finish_reason: 'stop',
-        }],
-        usage: { prompt_tokens: 10, completion_tokens: 5, total_tokens: 15 },
-      }),
-    ])
-
-    expect(events[0].type).toBe('message_start')
-    expect(events[0].message.role).toBe('assistant')
-    expect(events[0].message.model).toBe('gpt-4o')
-  })
-
-  test('converts text content stream', async () => {
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{ index: 0, delta: { content: 'Hello' }, finish_reason: null }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: { content: ' world' }, finish_reason: null }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-      }),
-    ])
-
-    const types = events.map(e => e.type)
-    expect(types).toContain('message_start')
-    expect(types).toContain('content_block_start')
-    expect(types.filter(t => t === 'content_block_delta').length).toBe(2)
-    expect(types).toContain('content_block_stop')
-    expect(types).toContain('message_delta')
-    expect(types).toContain('message_stop')
-
-    const textDeltas = events.filter(e => e.type === 'content_block_delta') as any[]
-    expect(textDeltas[0].delta.text).toBe('Hello')
-    expect(textDeltas[1].delta.text).toBe(' world')
-  })
-
-  test('converts tool_calls stream', async () => {
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: {
-            tool_calls: [{
-              index: 0,
-              id: 'call_abc',
-              type: 'function',
-              function: { name: 'bash', arguments: '' },
-            }],
-          },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: {
-            tool_calls: [{
-              index: 0,
-              function: { arguments: '{"comm' },
-            }],
-          },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: {
-            tool_calls: [{
-              index: 0,
-              function: { arguments: 'and":"ls"}' },
-            }],
-          },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'tool_calls' }],
-      }),
-    ])
-
-    const blockStart = events.find(e => e.type === 'content_block_start') as any
-    expect(blockStart.content_block.type).toBe('tool_use')
-    expect(blockStart.content_block.name).toBe('bash')
-
-    const jsonDeltas = events.filter(
-      e => e.type === 'content_block_delta' && e.delta.type === 'input_json_delta',
-    ) as any[]
-    const fullArgs = jsonDeltas.map(d => d.delta.partial_json).join('')
-    expect(fullArgs).toBe('{"command":"ls"}')
-  })
-
-  test('maps finish_reason stop to end_turn', async () => {
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{ index: 0, delta: { content: 'hi' }, finish_reason: null }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-      }),
-    ])
-
-    const msgDelta = events.find(e => e.type === 'message_delta') as any
-    expect(msgDelta.delta.stop_reason).toBe('end_turn')
-  })
-
-  test('forces tool_use stop_reason when tool_calls present but finish_reason is stop', async () => {
-    // Some backends (e.g., certain OpenAI-compatible endpoints) incorrectly
-    // return finish_reason "stop" when they actually made tool calls.
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: {
-            tool_calls: [{ index: 0, id: 'call_1', function: { name: 'bash', arguments: '{"cmd":"ls"}' } }],
-          },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-      }),
-    ])
-
-    const msgDelta = events.find(e => e.type === 'message_delta') as any
-    expect(msgDelta.delta.stop_reason).toBe('tool_use')
-  })
-
-  test('maps finish_reason tool_calls to tool_use', async () => {
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: {
-            tool_calls: [{ index: 0, id: 'call_1', function: { name: 'bash', arguments: '{}' } }],
-          },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'tool_calls' }],
-      }),
-    ])
-
-    const msgDelta = events.find(e => e.type === 'message_delta') as any
-    expect(msgDelta.delta.stop_reason).toBe('tool_use')
-  })
-
-  test('maps finish_reason length to max_tokens', async () => {
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{ index: 0, delta: { content: 'truncated' }, finish_reason: null }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'length' }],
-      }),
-    ])
-
-    const msgDelta = events.find(e => e.type === 'message_delta') as any
-    expect(msgDelta.delta.stop_reason).toBe('max_tokens')
-  })
-
-  test('handles mixed text and tool_calls', async () => {
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{ index: 0, delta: { content: 'Thinking...' }, finish_reason: null }],
-      }),
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: {
-            tool_calls: [{ index: 0, id: 'call_1', function: { name: 'grep', arguments: '{"p":"test"}' } }],
-          },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'tool_calls' }],
-      }),
-    ])
-
-    const blockStarts = events.filter(e => e.type === 'content_block_start') as any[]
-    expect(blockStarts.length).toBe(2)
-    expect(blockStarts[0].content_block.type).toBe('text')
-    expect(blockStarts[1].content_block.type).toBe('tool_use')
-  })
-})
-
-describe('thinking support (reasoning_content)', () => {
-  test('converts reasoning_content to thinking block', async () => {
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: { reasoning_content: 'Let me analyze this...' },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: { reasoning_content: ' step by step.' },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-      }),
-    ])
-
-    // Should have a thinking content block
-    const blockStart = events.find(e => e.type === 'content_block_start') as any
-    expect(blockStart.content_block.type).toBe('thinking')
-    expect(blockStart.content_block.signature).toBe('')
-
-    // Should have thinking_delta events
-    const thinkingDeltas = events.filter(
-      e => e.type === 'content_block_delta' && e.delta.type === 'thinking_delta',
-    ) as any[]
-    expect(thinkingDeltas.length).toBe(2)
-    expect(thinkingDeltas[0].delta.thinking).toBe('Let me analyze this...')
-    expect(thinkingDeltas[1].delta.thinking).toBe(' step by step.')
-  })
-
-  test('converts reasoning then content (DeepSeek-style)', async () => {
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: { reasoning_content: 'Thinking about the answer...' },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: { content: 'Here is my answer.' },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-      }),
-    ])
-
-    // Should have two content blocks: thinking + text
-    const blockStarts = events.filter(e => e.type === 'content_block_start') as any[]
-    expect(blockStarts.length).toBe(2)
-    expect(blockStarts[0].content_block.type).toBe('thinking')
-    expect(blockStarts[1].content_block.type).toBe('text')
-
-    // Thinking block should be closed before text block starts
-    const blockStops = events.filter(e => e.type === 'content_block_stop') as any[]
-    expect(blockStops[0].index).toBe(0) // thinking block closed at index 0
-    expect(blockStarts[1].index).toBe(1) // text block starts at index 1
-
-    // Verify text delta
-    const textDelta = events.find(
-      e => e.type === 'content_block_delta' && e.delta.type === 'text_delta',
-    ) as any
-    expect(textDelta.delta.text).toBe('Here is my answer.')
-  })
-
-  test('handles reasoning then tool_calls', async () => {
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: { reasoning_content: 'I need to run a command.' },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: {
-            tool_calls: [{ index: 0, id: 'call_1', function: { name: 'bash', arguments: '{"c":"ls"}' } }],
-          },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'tool_calls' }],
-      }),
-    ])
-
-    const blockStarts = events.filter(e => e.type === 'content_block_start') as any[]
-    expect(blockStarts.length).toBe(2)
-    expect(blockStarts[0].content_block.type).toBe('thinking')
-    expect(blockStarts[1].content_block.type).toBe('tool_use')
-  })
-
-  test('thinking block index is 0, text block index is 1', async () => {
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: { reasoning_content: 'reason' },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: { content: 'answer' },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-      }),
-    ])
-
-    const blockStarts = events.filter(e => e.type === 'content_block_start') as any[]
-    expect(blockStarts[0].index).toBe(0)
-    expect(blockStarts[1].index).toBe(1)
-  })
-})
-
-describe('prompt caching support', () => {
-  test('maps cached_tokens to cache_read_input_tokens', async () => {
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: { content: 'hi' },
-          finish_reason: null,
-        }],
-        usage: {
-          prompt_tokens: 1000,
-          completion_tokens: 0,
-          total_tokens: 1000,
-          prompt_tokens_details: { cached_tokens: 800 },
-        } as any,
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-        usage: {
-          prompt_tokens: 1000,
-          completion_tokens: 50,
-          total_tokens: 1050,
-          prompt_tokens_details: { cached_tokens: 800 },
-        } as any,
-      }),
-    ])
-
-    const msgStart = events.find(e => e.type === 'message_start') as any
-    expect(msgStart.message.usage.cache_read_input_tokens).toBe(800)
-    expect(msgStart.message.usage.input_tokens).toBe(1000)
-  })
-
-  test('defaults cache_read_input_tokens to 0 when no cached_tokens', async () => {
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{ index: 0, delta: { content: 'hi' }, finish_reason: null }],
-        usage: { prompt_tokens: 100, completion_tokens: 0, total_tokens: 100 },
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-      }),
-    ])
-
-    const msgStart = events.find(e => e.type === 'message_start') as any
-    expect(msgStart.message.usage.cache_read_input_tokens).toBe(0)
-    expect(msgStart.message.usage.cache_creation_input_tokens).toBe(0)
-  })
-
-  test('updates cached_tokens from later chunks', async () => {
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{ index: 0, delta: { content: 'hi' }, finish_reason: null }],
-        usage: {
-          prompt_tokens: 500,
-          completion_tokens: 0,
-          total_tokens: 500,
-        } as any,
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-        usage: {
-          prompt_tokens: 500,
-          completion_tokens: 10,
-          total_tokens: 510,
-          prompt_tokens_details: { cached_tokens: 300 },
-        } as any,
-      }),
-    ])
-
-    const msgStart = events.find(e => e.type === 'message_start') as any
-    // First chunk had no cached_tokens, so initially 0
-    // But the message_start usage reflects the first chunk's data
-    expect(msgStart.message.usage.cache_read_input_tokens).toBe(0)
-    expect(msgStart.message.usage.input_tokens).toBe(500)
-  })
-
-  test('captures output_tokens and input_tokens from trailing chunk sent after finish_reason', async () => {
-    // Many OpenAI-compatible endpoints (e.g. DeepSeek) send usage in a separate
-    // final chunk AFTER the finish_reason chunk, with choices: [].
-    // message_delta must carry both input_tokens and output_tokens so that
-    // queryModelOpenAI's spread can override the zeros from message_start — which is
-    // emitted before the trailing chunk and always has input_tokens=0.
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{ index: 0, delta: { content: 'hello' }, finish_reason: null }],
-      }),
-      // finish_reason chunk — usage not yet available
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-      }),
-      // trailing usage-only chunk (choices: [])
-      makeChunk({
-        choices: [],
-        usage: { prompt_tokens: 123, completion_tokens: 45, total_tokens: 168 },
-      }),
-    ])
-
-    // message_start emits on the first chunk before trailing usage arrives
-    const msgStart = events.find(e => e.type === 'message_start') as any
-    expect(msgStart.message.usage.input_tokens).toBe(0)
-
-    // message_delta is emitted after stream loop ends with final real values
-    const msgDelta = events.find(e => e.type === 'message_delta') as any
-    expect(msgDelta.usage.input_tokens).toBe(123)
-    expect(msgDelta.usage.output_tokens).toBe(45)
-    expect(msgDelta.delta.stop_reason).toBe('end_turn')
-  })
-
-  test('captures input_tokens from trailing chunk (used by tokenCountWithEstimation for autocompact)', async () => {
-    // input_tokens is the dominant term in tokenCountWithEstimation. Without it,
-    // getTokenCountFromUsage returns only output_tokens (~100-700), which is far below
-    // the autocompact threshold (~33k), so compaction never fires.
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{ index: 0, delta: { content: 'answer' }, finish_reason: null }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-      }),
-      makeChunk({
-        choices: [],
-        usage: { prompt_tokens: 800, completion_tokens: 200, total_tokens: 1000 },
-      }),
-    ])
-
-    const msgDelta = events.find(e => e.type === 'message_delta') as any
-    expect(msgDelta.usage.input_tokens).toBe(800)
-    expect(msgDelta.usage.output_tokens).toBe(200)
-  })
-
-  test('trailing usage chunk with tool_calls: stop_reason stays tool_use', async () => {
-    // Verifies that deferring message_delta does not break stop_reason mapping
-    // when the model made tool calls and usage arrives in a trailing chunk.
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{
-          index: 0,
-          delta: {
-            tool_calls: [{ index: 0, id: 'call_x', function: { name: 'bash', arguments: '{"cmd":"ls"}' } }],
-          },
-          finish_reason: null,
-        }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'tool_calls' }],
-      }),
-      // trailing usage-only chunk
-      makeChunk({
-        choices: [],
-        usage: { prompt_tokens: 500, completion_tokens: 30, total_tokens: 530 },
-      }),
-    ])
-
-    const msgDelta = events.find(e => e.type === 'message_delta') as any
-    expect(msgDelta.delta.stop_reason).toBe('tool_use')
-    expect(msgDelta.usage.output_tokens).toBe(30)
-  })
-
-  test('message_delta always comes before message_stop', async () => {
-    // Verifies event ordering is preserved after deferring to post-loop emission.
-    const events = await collectEvents([
-      makeChunk({ choices: [{ index: 0, delta: { content: 'x' }, finish_reason: null }] }),
-      makeChunk({ choices: [{ index: 0, delta: {}, finish_reason: 'stop' }] }),
-      makeChunk({ choices: [], usage: { prompt_tokens: 10, completion_tokens: 5, total_tokens: 15 } }),
-    ])
-
-    const types = events.map(e => e.type)
-    const deltaIdx = types.lastIndexOf('message_delta')
-    const stopIdx = types.lastIndexOf('message_stop')
-    expect(deltaIdx).toBeGreaterThanOrEqual(0)
-    expect(stopIdx).toBeGreaterThan(deltaIdx)
-  })
-
-  // ── cache_read_input_tokens in message_delta (the core bug fix) ──────────
-
-  test('message_delta carries cache_read_input_tokens from trailing usage chunk', async () => {
-    // Real-world case: DeepSeek-V3 returns cached_tokens=19904
-    // in a trailing chunk with choices:[]. Previously message_delta only carried
-    // input_tokens and output_tokens, so cache_read_input_tokens stayed 0 after
-    // queryModelOpenAI's spread — even though cachedTokens was captured internally.
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{ index: 0, delta: { content: 'answer' }, finish_reason: null }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-      }),
-      // trailing usage chunk matching the observed server response format
-      makeChunk({
-        choices: [],
-        usage: {
-          prompt_tokens: 30011,
-          completion_tokens: 190,
-          total_tokens: 30201,
-          prompt_tokens_details: { audio_tokens: 0, cached_tokens: 19904 },
-        } as any,
-      }),
-    ])
-
-    // message_start is emitted before trailing chunk — cache fields are 0
-    const msgStart = events.find(e => e.type === 'message_start') as any
-    expect(msgStart.message.usage.cache_read_input_tokens).toBe(0)
-
-    // message_delta carries the real values from the trailing chunk
-    const msgDelta = events.find(e => e.type === 'message_delta') as any
-    expect(msgDelta.usage.input_tokens).toBe(30011)
-    expect(msgDelta.usage.output_tokens).toBe(190)
-    expect(msgDelta.usage.cache_read_input_tokens).toBe(19904)
-    expect(msgDelta.usage.cache_creation_input_tokens).toBe(0)
-  })
-
-  test('cache_read_input_tokens=0 in message_delta when cached_tokens is absent', async () => {
-    // Non-caching requests should still have the field present and zero.
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{ index: 0, delta: { content: 'hi' }, finish_reason: null }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-      }),
-      makeChunk({
-        choices: [],
-        usage: { prompt_tokens: 100, completion_tokens: 20, total_tokens: 120 },
-      }),
-    ])
-
-    const msgDelta = events.find(e => e.type === 'message_delta') as any
-    expect(msgDelta.usage.cache_read_input_tokens).toBe(0)
-    expect(msgDelta.usage.cache_creation_input_tokens).toBe(0)
-  })
-
-  test('cache_read_input_tokens=0 in message_delta when cached_tokens is 0', async () => {
-    // Explicit cached_tokens:0 should not be treated differently from absent.
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{ index: 0, delta: { content: 'hi' }, finish_reason: null }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-      }),
-      makeChunk({
-        choices: [],
-        usage: {
-          prompt_tokens: 500,
-          completion_tokens: 50,
-          total_tokens: 550,
-          prompt_tokens_details: { cached_tokens: 0 },
-        } as any,
-      }),
-    ])
-
-    const msgDelta = events.find(e => e.type === 'message_delta') as any
-    expect(msgDelta.usage.cache_read_input_tokens).toBe(0)
-  })
-
-  test('cache_read_input_tokens updated when cached_tokens arrives in same chunk as finish_reason', async () => {
-    // Some endpoints send usage in the finish_reason chunk instead of a trailing chunk.
-    const events = await collectEvents([
-      makeChunk({
-        choices: [{ index: 0, delta: { content: 'result' }, finish_reason: null }],
-      }),
-      makeChunk({
-        choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
-        usage: {
-          prompt_tokens: 2000,
-          completion_tokens: 100,
-          total_tokens: 2100,
-          prompt_tokens_details: { cached_tokens: 1500 },
-        } as any,
-      }),
-    ])
-
-    const msgDelta = events.find(e => e.type === 'message_delta') as any
-    expect(msgDelta.usage.cache_read_input_tokens).toBe(1500)
-    expect(msgDelta.usage.input_tokens).toBe(2000)
-    expect(msgDelta.usage.output_tokens).toBe(100)
-  })
-})
--- a/src/services/api/openai/tests/thinking.test.ts
+++ b/src/services/api/openai/tests/thinking.test.ts
@@ -1,5 +1,5 @@
 import { describe, expect, test, beforeEach, afterEach } from 'bun:test'
-import { isOpenAIThinkingEnabled, buildOpenAIRequestBody } from '../index.js'
+import { isOpenAIThinkingEnabled, buildOpenAIRequestBody } from '../requestBody.js'

 describe('isOpenAIThinkingEnabled', () => {
  const originalEnv = {
--- a/src/services/api/openai/convertMessages.ts
+++ b/src/services/api/openai/convertMessages.ts
@@ -1,305 +0,0 @@
-import type {
-  BetaContentBlockParam,
-  BetaToolResultBlockParam,
-  BetaToolUseBlock,
-} from '@anthropic-ai/sdk/resources/beta/messages/messages.mjs'
-import type {
-  ChatCompletionAssistantMessageParam,
-  ChatCompletionMessageParam,
-  ChatCompletionSystemMessageParam,
-  ChatCompletionToolMessageParam,
-  ChatCompletionUserMessageParam,
-} from 'openai/resources/chat/completions/completions.mjs'
-import type { AssistantMessage, UserMessage } from '../../../types/message.js'
-import type { SystemPrompt } from '../../../utils/systemPromptType.js'
-
-export interface ConvertMessagesOptions {
-  /** When true, preserve thinking blocks as reasoning_content on assistant messages
-   *  (required for DeepSeek thinking mode with tool calls). */
-  enableThinking?: boolean
-}
-
-/**
- * Convert internal (UserMessage | AssistantMessage)[] to OpenAI-format messages.
- *
- * Key conversions:
- * - system prompt → role: "system" message prepended
- * - tool_use blocks → tool_calls[] on assistant message
- * - tool_result blocks → role: "tool" messages
- * - thinking blocks → silently dropped (or preserved as reasoning_content when enableThinking=true)
- * - cache_control → stripped
- */
-export function anthropicMessagesToOpenAI(
-  messages: (UserMessage | AssistantMessage)[],
-  systemPrompt: SystemPrompt,
-  options?: ConvertMessagesOptions,
-): ChatCompletionMessageParam[] {
-  const result: ChatCompletionMessageParam[] = []
-  const enableThinking = options?.enableThinking ?? false
-
-  // Prepend system prompt as system message
-  const systemText = systemPromptToText(systemPrompt)
-  if (systemText) {
-    result.push({
-      role: 'system',
-      content: systemText,
-    } satisfies ChatCompletionSystemMessageParam)
-  }
-
-  // When thinking mode is on, detect turn boundaries so that reasoning_content
-  // from *previous* user turns is stripped (saves bandwidth; DeepSeek ignores it).
-  // A "new turn" starts when a user text message appears after at least one assistant response.
-  const turnBoundaries = new Set<number>()
-  if (enableThinking) {
-    let hasSeenAssistant = false
-    for (let i = 0; i < messages.length; i++) {
-      const msg = messages[i]
-      if (msg.type === 'assistant') {
-        hasSeenAssistant = true
-      }
-      if (msg.type === 'user' && hasSeenAssistant) {
-        const content = msg.message.content
-        // A user message starts a new turn if it contains any non-tool_result content
-        // (text, image, or other media). Tool results alone do NOT start a new turn
-        // because they are continuations of the previous assistant tool call.
-        const startsNewUserTurn = typeof content === 'string'
-          ? content.length > 0
-          : Array.isArray(content) && content.some(
-              (b: any) =>
-                typeof b === 'string' ||
-                (b &&
-                  typeof b === 'object' &&
-                  'type' in b &&
-                  b.type !== 'tool_result'),
-            )
-        if (startsNewUserTurn) {
-          turnBoundaries.add(i)
-        }
-      }
-    }
-  }
-
-  for (let i = 0; i < messages.length; i++) {
-    const msg = messages[i]
-    switch (msg.type) {
-      case 'user':
-        result.push(...convertInternalUserMessage(msg))
-        break
-      case 'assistant':
-        // Preserve reasoning_content unless we're before a turn boundary
-        // (i.e., from a previous user Q&A round)
-        const preserveReasoning = enableThinking && !isBeforeAnyTurnBoundary(i, turnBoundaries)
-        result.push(...convertInternalAssistantMessage(msg, preserveReasoning))
-        break
-      default:
-        break
-    }
-  }
-
-  return result
-}
-
-function systemPromptToText(systemPrompt: SystemPrompt): string {
-  if (!systemPrompt || systemPrompt.length === 0) return ''
-  return systemPrompt
-    .filter(Boolean)
-    .join('\n\n')
-}
-
-/**
- * Check if index `i` falls before any turn boundary (i.e. it belongs to a previous turn).
- * A message at index i is "before" a boundary if there exists a boundary j where i < j.
- */
-function isBeforeAnyTurnBoundary(i: number, boundaries: Set<number>): boolean {
-  for (const b of boundaries) {
-    if (i < b) return true
-  }
-  return false
-}
-
-function convertInternalUserMessage(
-  msg: UserMessage,
-): ChatCompletionMessageParam[] {
-  const result: ChatCompletionMessageParam[] = []
-  const content = msg.message.content
-
-  if (typeof content === 'string') {
-    result.push({
-      role: 'user',
-      content,
-    } satisfies ChatCompletionUserMessageParam)
-  } else if (Array.isArray(content)) {
-    const textParts: string[] = []
-    const toolResults: BetaToolResultBlockParam[] = []
-    const imageParts: Array<{ type: 'image_url'; image_url: { url: string } }> = []
-
-    for (const block of content) {
-      if (typeof block === 'string') {
-        textParts.push(block)
-      } else if (block.type === 'text') {
-        textParts.push(block.text)
-      } else if (block.type === 'tool_result') {
-        toolResults.push(block as BetaToolResultBlockParam)
-      } else if (block.type === 'image') {
-        const imagePart = convertImageBlockToOpenAI(block as unknown as Record<string, unknown>)
-        if (imagePart) {
-          imageParts.push(imagePart)
-        }
-      }
-    }
-
-    // CRITICAL: tool messages must come BEFORE any user message in the result.
-    // OpenAI API requires that a tool message immediately follows the assistant
-    // message with tool_calls. If we emit a user message first, the API will
-    // reject the request with "insufficient tool messages following tool_calls".
-    // See: https://github.com/anthropics/claude-code/issues/xxx
-    for (const tr of toolResults) {
-      result.push(convertToolResult(tr))
-    }
-
-    // If there are images, build a multimodal content array
-    if (imageParts.length > 0) {
-      const multiContent: Array<{ type: 'text'; text: string } | { type: 'image_url'; image_url: { url: string } }> = []
-      if (textParts.length > 0) {
-        multiContent.push({ type: 'text', text: textParts.join('\n') })
-      }
-      multiContent.push(...imageParts)
-      result.push({
-        role: 'user',
-        content: multiContent,
-      } satisfies ChatCompletionUserMessageParam)
-    } else if (textParts.length > 0) {
-      result.push({
-        role: 'user',
-        content: textParts.join('\n'),
-      } satisfies ChatCompletionUserMessageParam)
-    }
-  }
-
-  return result
-}
-
-function convertToolResult(
-  block: BetaToolResultBlockParam,
-): ChatCompletionToolMessageParam {
-  let content: string
-  if (typeof block.content === 'string') {
-    content = block.content
-  } else if (Array.isArray(block.content)) {
-    content = block.content
-      .map(c => {
-        if (typeof c === 'string') return c
-        if ('text' in c) return c.text
-        return ''
-      })
-      .filter(Boolean)
-      .join('\n')
-  } else {
-    content = ''
-  }
-
-  return {
-    role: 'tool',
-    tool_call_id: block.tool_use_id,
-    content,
-  } satisfies ChatCompletionToolMessageParam
-}
-
-function convertInternalAssistantMessage(
-  msg: AssistantMessage,
-  preserveReasoning = false,
-): ChatCompletionMessageParam[] {
-  const content = msg.message.content
-
-  if (typeof content === 'string') {
-    return [
-      {
-        role: 'assistant',
-        content,
-      } satisfies ChatCompletionAssistantMessageParam,
-    ]
-  }
-
-  if (!Array.isArray(content)) {
-    return [
-      {
-        role: 'assistant',
-        content: '',
-      } satisfies ChatCompletionAssistantMessageParam,
-    ]
-  }
-
-  const textParts: string[] = []
-  const toolCalls: NonNullable<ChatCompletionAssistantMessageParam['tool_calls']> = []
-  const reasoningParts: string[] = []
-
-  for (const block of content) {
-    if (typeof block === 'string') {
-      textParts.push(block)
-    } else if (block.type === 'text') {
-      textParts.push(block.text)
-    } else if (block.type === 'tool_use') {
-      const tu = block as BetaToolUseBlock
-      toolCalls.push({
-        id: tu.id,
-        type: 'function',
-        function: {
-          name: tu.name,
-          arguments:
-            typeof tu.input === 'string' ? tu.input : JSON.stringify(tu.input),
-        },
-      })
-    } else if (block.type === 'thinking' && preserveReasoning) {
-      // DeepSeek thinking mode: preserve reasoning_content for tool call iterations
-      const thinkingText = (block as unknown as Record<string, unknown>).thinking
-      if (typeof thinkingText === 'string' && thinkingText) {
-        reasoningParts.push(thinkingText)
-      }
-    }
-    // Skip redacted_thinking, server_tool_use, etc.
-  }
-
-  const result: ChatCompletionAssistantMessageParam = {
-    role: 'assistant',
-    content: textParts.length > 0 ? textParts.join('\n') : null,
-    ...(toolCalls.length > 0 && { tool_calls: toolCalls }),
-    ...(reasoningParts.length > 0 && { reasoning_content: reasoningParts.join('\n') }),
-  }
-
-  return [result]
-}
-
-/**
- * Convert an Anthropic image block to the OpenAI image_url format.
- *
- * Anthropic format: { type: "image", source: { type: "base64", media_type: "image/png", data: "..." } }
- * OpenAI format: { type: "image_url", image_url: { url: "data:image/png;base64,..." } }
- */
-function convertImageBlockToOpenAI(
-  block: Record<string, unknown>,
-): { type: 'image_url'; image_url: { url: string } } | null {
-  const source = block.source as Record<string, unknown> | undefined
-  if (!source) return null
-
-  if (source.type === 'base64' && typeof source.data === 'string') {
-    const mediaType = (source.media_type as string) || 'image/png'
-    return {
-      type: 'image_url',
-      image_url: {
-        url: `data:${mediaType};base64,${source.data}`,
-      },
-    }
-  }
-
-  // Pass URL-type images through directly
-  if (source.type === 'url' && typeof source.url === 'string') {
-    return {
-      type: 'image_url',
-      image_url: {
-        url: source.url,
-      },
-    }
-  }
-
-  return null
-}
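Note on the removed converter: its key invariant is the ordering rule from the CRITICAL comment, namely that tool messages are emitted before any user message produced from the same Anthropic turn. The following is a minimal sketch of that rule only. The `UserBlock` and `OutMessage` shapes and the `convertUserBlocks` name are simplified assumptions made for illustration; the real code uses the Anthropic and OpenAI SDK types and also handles images.

```typescript
// Hypothetical, simplified shapes (not the SDK types).
type UserBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_result'; tool_use_id: string; content: string }

type OutMessage =
  | { role: 'tool'; tool_call_id: string; content: string }
  | { role: 'user'; content: string }

function convertUserBlocks(blocks: UserBlock[]): OutMessage[] {
  const toolMessages: OutMessage[] = []
  const textParts: string[] = []

  for (const block of blocks) {
    if (block.type === 'tool_result') {
      toolMessages.push({
        role: 'tool',
        tool_call_id: block.tool_use_id,
        content: block.content,
      })
    } else {
      textParts.push(block.text)
    }
  }

  // Tool messages go first so they directly follow the assistant's tool_calls.
  const result: OutMessage[] = [...toolMessages]
  if (textParts.length > 0) {
    result.push({ role: 'user', content: textParts.join('\n') })
  }
  return result
}

// Text appears before the tool result in the input, but not in the output.
const orderedRoles = convertUserBlocks([
  { type: 'text', text: 'Here is the output.' },
  { type: 'tool_result', tool_use_id: 'call_1', content: '42' },
]).map(m => m.role)
console.log(orderedRoles.join(' -> '))
```

The input order of the blocks does not matter: collecting tool results separately and flushing them first is what keeps the request valid for OpenAI-compatible endpoints.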
--- a/src/services/api/openai/convertTools.ts
+++ b/src/services/api/openai/convertTools.ts
@@ -1,123 +0,0 @@
-import type { BetaToolUnion } from '@anthropic-ai/sdk/resources/beta/messages/messages.mjs'
-import type { ChatCompletionTool } from 'openai/resources/chat/completions/completions.mjs'
-
-/**
- * Convert Anthropic tool schemas to OpenAI function calling format.
- *
- * Anthropic: { name, description, input_schema }
- * OpenAI:    { type: "function", function: { name, description, parameters } }
- *
- * Anthropic-specific fields (cache_control, defer_loading, etc.) are stripped.
- */
-export function anthropicToolsToOpenAI(
-  tools: BetaToolUnion[],
-): ChatCompletionTool[] {
-  return tools
-    .filter(tool => {
-      // Only convert standard tools (skip server tools like computer_use, etc.)
-      const toolType = (tool as unknown as { type?: string }).type
-      return tool.type === 'custom' || !('type' in tool) || toolType !== 'server'
-    })
-    .map(tool => {
-      // Handle the various tool shapes from Anthropic SDK
-      const anyTool = tool as unknown as Record<string, unknown>
-      const name = (anyTool.name as string) || ''
-      const description = (anyTool.description as string) || ''
-      const inputSchema = anyTool.input_schema as Record<string, unknown> | undefined
-
-      return {
-        type: 'function' as const,
-        function: {
-          name,
-          description,
-          parameters: sanitizeJsonSchema(inputSchema || { type: 'object', properties: {} }),
-        },
-      } satisfies ChatCompletionTool
-    })
-}
-
-/**
- * Recursively sanitize a JSON Schema for OpenAI-compatible providers.
- *
- * Many OpenAI-compatible endpoints (Ollama, DeepSeek, vLLM, etc.) do not
- * support the `const` keyword in JSON Schema. Convert it to `enum` with a
- * single-element array, which is semantically equivalent.
- */
-function sanitizeJsonSchema(schema: Record<string, unknown>): Record<string, unknown> {
-  if (!schema || typeof schema !== 'object') return schema
-
-  const result = { ...schema }
-
-  // Convert `const` → `enum: [value]`
-  if ('const' in result) {
-    result.enum = [result.const]
-    delete result.const
-  }
-
-  // Recursively process nested schemas
-  const objectKeys = ['properties', 'definitions', '$defs', 'patternProperties'] as const
-  for (const key of objectKeys) {
-    const nested = result[key]
-    if (nested && typeof nested === 'object') {
-      const sanitized: Record<string, unknown> = {}
-      for (const [k, v] of Object.entries(nested as Record<string, unknown>)) {
-        sanitized[k] = v && typeof v === 'object' ? sanitizeJsonSchema(v as Record<string, unknown>) : v
-      }
-      result[key] = sanitized
-    }
-  }
-
-  // Recursively process single-schema keys
-  const singleKeys = ['items', 'additionalProperties', 'not', 'if', 'then', 'else', 'contains', 'propertyNames'] as const
-  for (const key of singleKeys) {
-    const nested = result[key]
-    if (nested && typeof nested === 'object' && !Array.isArray(nested)) {
-      result[key] = sanitizeJsonSchema(nested as Record<string, unknown>)
-    }
-  }
-
-  // Recursively process array-of-schemas keys
-  const arrayKeys = ['anyOf', 'oneOf', 'allOf'] as const
-  for (const key of arrayKeys) {
-    const nested = result[key]
-    if (Array.isArray(nested)) {
-      result[key] = nested.map(item =>
-        item && typeof item === 'object' ? sanitizeJsonSchema(item as Record<string, unknown>) : item
-      )
-    }
-  }
-
-  return result
-}
-
-/**
- * Map Anthropic tool_choice to OpenAI tool_choice format.
- *
- * Anthropic → OpenAI:
- * - { type: "auto" } → "auto"
- * - { type: "any" }  → "required"
- * - { type: "tool", name } → { type: "function", function: { name } }
- * - undefined → undefined (use provider default)
- */
-export function anthropicToolChoiceToOpenAI(
-  toolChoice: unknown,
-): string | { type: 'function'; function: { name: string } } | undefined {
-  if (!toolChoice || typeof toolChoice !== 'object') return undefined
-
-  const tc = toolChoice as Record<string, unknown>
-  const type = tc.type as string
-
-  switch (type) {
-    case 'auto':
-      return 'auto'
-    case 'any':
-      return 'required'
-    case 'tool':
-      return {
-        type: 'function',
-        function: { name: tc.name as string },
-      }
-    default:
-      return undefined
-  }
-}
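Note on `sanitizeJsonSchema`: the `const` to `enum` rewrite can be illustrated with a reduced sketch. The `constToEnum` function below is a hypothetical, cut-down version written for this note; it recurses only into `properties`, whereas the function in the diff also walks `items`, `anyOf`, `$defs`, and the other keys listed there.

```typescript
type Schema = Record<string, unknown>

// Simplified sketch: handles only `const` and nested `properties`.
function constToEnum(schema: Schema): Schema {
  const result: Schema = { ...schema }

  // A single-element enum accepts exactly the same value as `const`.
  if ('const' in result) {
    result.enum = [result.const]
    delete result.const
  }

  const props = result.properties
  if (props && typeof props === 'object') {
    const sanitized: Schema = {}
    for (const [key, value] of Object.entries(props as Schema)) {
      sanitized[key] =
        value && typeof value === 'object' ? constToEnum(value as Schema) : value
    }
    result.properties = sanitized
  }
  return result
}

const sanitizedSchema = constToEnum({
  type: 'object',
  properties: { mode: { const: 'fast' }, count: { type: 'number' } },
})
console.log(JSON.stringify(sanitizedSchema))
```

The input object is copied rather than mutated, which matches the shallow-copy approach of the original and keeps the caller's Anthropic-format schema intact.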
--- a/src/services/api/openai/index.ts
+++ b/src/services/api/openai/index.ts
@@ -10,17 +10,10 @@ import type { AgentId } from '../../../types/ids.js'
 import type { Tools } from '../../../Tool.js'
 import type { Stream } from 'openai/streaming.mjs'
 import type {
-  ChatCompletionChunk,
  ChatCompletionCreateParamsStreaming,
 } from 'openai/resources/chat/completions/completions.mjs'
 import { getOpenAIClient } from './client.js'
-import { anthropicMessagesToOpenAI } from './convertMessages.js'
-import {
-  anthropicToolsToOpenAI,
-  anthropicToolChoiceToOpenAI,
-} from './convertTools.js'
-import { adaptOpenAIStreamToAnthropic } from './streamAdapter.js'
-import { resolveOpenAIModel } from './modelMapping.js'
+import { anthropicMessagesToOpenAI, resolveOpenAIModel, adaptOpenAIStreamToAnthropic, anthropicToolsToOpenAI, anthropicToolChoiceToOpenAI } from '@ant/model-provider'
 import { normalizeMessagesForAPI } from '../../../utils/messages.js'
 import { toolToAPISchema } from '../../../utils/api.js'
 import {
@@ -30,7 +23,8 @@ import {
 import { logForDebugging } from '../../../utils/debug.js'
 import { addToTotalSessionCost } from '../../../cost-tracker.js'
 import { calculateUSDCost } from '../../../utils/modelCost.js'
-import { isEnvTruthy, isEnvDefinedFalsy } from '../../../utils/envUtils.js'
+import { isOpenAIThinkingEnabled, resolveOpenAIMaxTokens, buildOpenAIRequestBody } from './requestBody.js'
+export { isOpenAIThinkingEnabled, resolveOpenAIMaxTokens, buildOpenAIRequestBody }
 import { getModelMaxOutputTokens } from '../../../utils/context.js'
 import type { Options } from '../claude.js'
 import { randomUUID } from 'crypto'
@@ -48,104 +42,6 @@ import {
  TOOL_SEARCH_TOOL_NAME,
 } from '@claude-code-best/builtin-tools/tools/ToolSearchTool/prompt.js'

-/**
- * Detect whether DeepSeek-style thinking mode should be enabled.
- *
- * Enabled when:
- * 1. OPENAI_ENABLE_THINKING=1 is set (explicit enable), OR
- * 2. Model name contains "deepseek-reasoner" OR "DeepSeek-V3.2" (auto-detect, case-insensitive)
- *
- * Disabled when:
- * - OPENAI_ENABLE_THINKING=0/false/no/off is explicitly set (overrides model detection)
- *
- * @param model - The resolved OpenAI model name
- * @internal Exported for testing purposes only
- */
-export function isOpenAIThinkingEnabled(model: string): boolean {
-  // Explicit disable takes priority (overrides model auto-detect)
-  if (isEnvDefinedFalsy(process.env.OPENAI_ENABLE_THINKING)) return false
-  // Explicit enable
-  if (isEnvTruthy(process.env.OPENAI_ENABLE_THINKING)) return true
-  // Auto-detect from model name (deepseek-reasoner and DeepSeek-V3.2 support thinking mode)
-  const modelLower = model.toLowerCase()
-  return modelLower.includes('deepseek-reasoner') || modelLower.includes('deepseek-v3.2')
-}
-
-/**
- * Resolve max output tokens for the OpenAI-compatible path.
- *
- * Override priority:
- * 1. maxOutputTokensOverride (programmatic, from query pipeline)
- * 2. OPENAI_MAX_TOKENS env var (OpenAI-specific, useful for local models
- *    with small context windows, e.g. RTX 3060 12GB running 65536-token models)
- * 3. CLAUDE_CODE_MAX_OUTPUT_TOKENS env var (generic override)
- * 4. upperLimit default (64000)
- *
- * @internal Exported for testing purposes only
- */
-export function resolveOpenAIMaxTokens(
-  upperLimit: number,
-  maxOutputTokensOverride?: number,
-): number {
-  return maxOutputTokensOverride
-    ?? (process.env.OPENAI_MAX_TOKENS ? parseInt(process.env.OPENAI_MAX_TOKENS, 10) || undefined : undefined)
-    ?? (process.env.CLAUDE_CODE_MAX_OUTPUT_TOKENS ? parseInt(process.env.CLAUDE_CODE_MAX_OUTPUT_TOKENS, 10) || undefined : undefined)
-    ?? upperLimit
-}
-
-/**
- * Build the request body for OpenAI chat.completions.create().
- * Extracted for testability — the thinking mode params are injected here.
- *
- * DeepSeek thinking mode: inject thinking params via request body.
- * Two formats are added simultaneously to support different deployments:
- * - Official DeepSeek API: `thinking: { type: 'enabled' }`
- * - Self-hosted DeepSeek-V3.2: `enable_thinking: true` + `chat_template_kwargs: { thinking: true }`
- * OpenAI SDK passes unknown keys through to the HTTP body.
- * Each endpoint will use the format it recognizes and ignore the others.
- * @internal Exported for testing purposes only
- */
-export function buildOpenAIRequestBody(params: {
-  model: string
-  messages: any[]
-  tools: any[]
-  toolChoice: any
-  enableThinking: boolean
-  maxTokens: number
-  temperatureOverride?: number
-}): ChatCompletionCreateParamsStreaming & {
-  thinking?: { type: string }
-  enable_thinking?: boolean
-  chat_template_kwargs?: { thinking: boolean }
-} {
-  const { model, messages, tools, toolChoice, enableThinking, maxTokens, temperatureOverride } = params
-  return {
-    model,
-    messages,
-    max_tokens: maxTokens,
-    ...(tools.length > 0 && {
-      tools,
-      ...(toolChoice && { tool_choice: toolChoice }),
-    }),
-    stream: true,
-    stream_options: { include_usage: true },
-    // DeepSeek thinking mode: enable chain-of-thought output.
-    // When active, temperature/top_p/presence_penalty/frequency_penalty are ignored by DeepSeek.
-    ...(enableThinking && {
-      // Official DeepSeek API format
-      thinking: { type: 'enabled' },
-      // Self-hosted DeepSeek-V3.2 format
-      enable_thinking: true,
-      chat_template_kwargs: { thinking: true },
-    }),
-    // Only send temperature when thinking mode is off (DeepSeek ignores it anyway,
-    // but other providers may respect it)
-    ...(!enableThinking && temperatureOverride !== undefined && {
-      temperature: temperatureOverride,
-    }),
-  }
-}
-
 /**
 * Assemble the final AssistantMessage (and optional max_tokens error) from
 * accumulated stream state. Extracted to avoid duplication between the
--- a/src/services/api/openai/modelMapping.ts
+++ b/src/services/api/openai/modelMapping.ts
@@ -1,63 +0,0 @@
-/**
- * Default mapping from Anthropic model names to OpenAI model names.
- * Used only when ANTHROPIC_DEFAULT_*_MODEL env vars are not set.
- */
-const DEFAULT_MODEL_MAP: Record<string, string> = {
-  'claude-sonnet-4-20250514': 'gpt-4o',
-  'claude-sonnet-4-5-20250929': 'gpt-4o',
-  'claude-sonnet-4-6': 'gpt-4o',
-  'claude-opus-4-20250514': 'o3',
-  'claude-opus-4-1-20250805': 'o3',
-  'claude-opus-4-5-20251101': 'o3',
-  'claude-opus-4-6': 'o3',
-  'claude-haiku-4-5-20251001': 'gpt-4o-mini',
-  'claude-3-5-haiku-20241022': 'gpt-4o-mini',
-  'claude-3-7-sonnet-20250219': 'gpt-4o',
-  'claude-3-5-sonnet-20241022': 'gpt-4o',
-}
-
-/**
- * Determine the model family (haiku / sonnet / opus) from an Anthropic model ID.
- */
-function getModelFamily(model: string): 'haiku' | 'sonnet' | 'opus' | null {
-  if (/haiku/i.test(model)) return 'haiku'
-  if (/opus/i.test(model)) return 'opus'
-  if (/sonnet/i.test(model)) return 'sonnet'
-  return null
-}
-
-/**
- * Resolve the OpenAI model name for a given Anthropic model.
- *
- * Priority:
- * 1. OPENAI_MODEL env var (override all)
- * 2. OPENAI_DEFAULT_{FAMILY}_MODEL env var (e.g. OPENAI_DEFAULT_SONNET_MODEL)
- * 3. ANTHROPIC_DEFAULT_{FAMILY}_MODEL env var (backward compatibility)
- * 4. DEFAULT_MODEL_MAP lookup
- * 5. Pass through original model name
- */
-export function resolveOpenAIModel(anthropicModel: string): string {
-  // Highest priority: explicit override
-  if (process.env.OPENAI_MODEL) {
-    return process.env.OPENAI_MODEL
-  }
-
-  // Strip [1m] suffix if present (Claude-specific modifier)
-  const cleanModel = anthropicModel.replace(/\[1m\]$/, '')
-
-  // Check family-specific overrides
-  const family = getModelFamily(cleanModel)
-  if (family) {
-    // OpenAI-specific family override (preferred for openai provider)
-    const openaiEnvVar = `OPENAI_DEFAULT_${family.toUpperCase()}_MODEL`
-    const openaiOverride = process.env[openaiEnvVar]
-    if (openaiOverride) return openaiOverride
-
-    // Anthropic env var (backward compatibility)
-    const anthropicEnvVar = `ANTHROPIC_DEFAULT_${family.toUpperCase()}_MODEL`
-    const anthropicOverride = process.env[anthropicEnvVar]
-    if (anthropicOverride) return anthropicOverride
-  }
-
-  return DEFAULT_MODEL_MAP[cleanModel] ?? cleanModel
-}
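Note on `resolveOpenAIModel`: the five-step priority in its doc comment can be exercised with a small sketch. To keep the example deterministic, the environment is passed in as a plain object instead of being read from `process.env`; that parameter, the `resolveModelSketch` name, and the one-entry `DEMO_MODEL_MAP` are assumptions made for illustration and are not part of the moved code.

```typescript
type ModelEnv = Record<string, string | undefined>

// Hypothetical one-entry default map, for illustration only.
const DEMO_MODEL_MAP: Record<string, string> = { 'claude-sonnet-4-6': 'gpt-4o' }

function resolveModelSketch(anthropicModel: string, env: ModelEnv): string {
  // 1. A global override wins over everything else.
  const globalOverride = env['OPENAI_MODEL']
  if (globalOverride) return globalOverride

  // Strip the Claude-specific [1m] suffix before any lookup.
  const cleanModel = anthropicModel.replace(/\[1m\]$/, '')

  const family = /haiku/i.test(cleanModel)
    ? 'HAIKU'
    : /opus/i.test(cleanModel)
      ? 'OPUS'
      : /sonnet/i.test(cleanModel)
        ? 'SONNET'
        : null

  if (family) {
    // 2. OpenAI-specific family override, then 3. the Anthropic variable.
    const familyOverride =
      env[`OPENAI_DEFAULT_${family}_MODEL`] || env[`ANTHROPIC_DEFAULT_${family}_MODEL`]
    if (familyOverride) return familyOverride
  }

  // 4. Default map lookup, then 5. pass the name through unchanged.
  return DEMO_MODEL_MAP[cleanModel] ?? cleanModel
}

const resolvedDefault = resolveModelSketch('claude-sonnet-4-6[1m]', {})
const resolvedFamily = resolveModelSketch('claude-sonnet-4-6', {
  OPENAI_DEFAULT_SONNET_MODEL: 'my-local-model',
  ANTHROPIC_DEFAULT_SONNET_MODEL: 'ignored',
})
const resolvedGlobal = resolveModelSketch('claude-opus-4-6', {
  OPENAI_MODEL: 'global-model',
  OPENAI_DEFAULT_OPUS_MODEL: 'ignored',
})
console.log(resolvedDefault, resolvedFamily, resolvedGlobal)
```

Injecting the environment this way is only a testing convenience for the sketch; the function in the diff reads `process.env` directly.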
--- a/src/services/api/openai/requestBody.ts
+++ b/src/services/api/openai/requestBody.ts
@@ -0,0 +1,103 @@
+/**
+ * Pure utility functions for building OpenAI request bodies and detecting
+ * thinking mode. Extracted from index.ts so tests can import them without
+ * triggering heavy module side-effects (OpenAI client, stream adapter, etc.).
+ */
+import type {
+  ChatCompletionCreateParamsStreaming,
+} from 'openai/resources/chat/completions/completions.mjs'
+import { isEnvTruthy, isEnvDefinedFalsy } from '../../../utils/envUtils.js'
+
+/**
+ * Detect whether DeepSeek-style thinking mode should be enabled.
+ *
+ * Enabled when:
+ * 1. OPENAI_ENABLE_THINKING=1 is set (explicit enable), OR
+ * 2. Model name contains "deepseek-reasoner" OR "DeepSeek-V3.2" (auto-detect, case-insensitive)
+ *
+ * Disabled when:
+ * - OPENAI_ENABLE_THINKING=0/false/no/off is explicitly set (overrides model detection)
+ *
+ * @param model - The resolved OpenAI model name
+ */
+export function isOpenAIThinkingEnabled(model: string): boolean {
+  // Explicit disable takes priority (overrides model auto-detect)
+  if (isEnvDefinedFalsy(process.env.OPENAI_ENABLE_THINKING)) return false
+  // Explicit enable
+  if (isEnvTruthy(process.env.OPENAI_ENABLE_THINKING)) return true
+  // Auto-detect from model name (deepseek-reasoner and DeepSeek-V3.2 support thinking mode)
+  const modelLower = model.toLowerCase()
+  return modelLower.includes('deepseek-reasoner') || modelLower.includes('deepseek-v3.2')
+}
+
+/**
+ * Resolve max output tokens for the OpenAI-compatible path.
+ *
+ * Override priority:
+ * 1. maxOutputTokensOverride (programmatic, from query pipeline)
+ * 2. OPENAI_MAX_TOKENS env var (OpenAI-specific, useful for local models
+ *    with small context windows, e.g. RTX 3060 12GB running 65536-token models)
+ * 3. CLAUDE_CODE_MAX_OUTPUT_TOKENS env var (generic override)
+ * 4. upperLimit default (64000)
+ */
+export function resolveOpenAIMaxTokens(
+  upperLimit: number,
+  maxOutputTokensOverride?: number,
+): number {
+  return maxOutputTokensOverride
+    ?? (process.env.OPENAI_MAX_TOKENS ? parseInt(process.env.OPENAI_MAX_TOKENS, 10) || undefined : undefined)
+    ?? (process.env.CLAUDE_CODE_MAX_OUTPUT_TOKENS ? parseInt(process.env.CLAUDE_CODE_MAX_OUTPUT_TOKENS, 10) || undefined : undefined)
+    ?? upperLimit
+}
+
+/**
+ * Build the request body for OpenAI chat.completions.create().
+ * Extracted for testability — the thinking mode params are injected here.
+ *
+ * DeepSeek thinking mode: inject thinking params via request body.
+ * Two formats are added simultaneously to support different deployments:
+ * - Official DeepSeek API: `thinking: { type: 'enabled' }`
+ * - Self-hosted DeepSeek-V3.2: `enable_thinking: true` + `chat_template_kwargs: { thinking: true }`
+ * OpenAI SDK passes unknown keys through to the HTTP body.
+ * Each endpoint will use the format it recognizes and ignore the others.
+ */
+export function buildOpenAIRequestBody(params: {
+  model: string
+  messages: any[]
+  tools: any[]
+  toolChoice: any
+  enableThinking: boolean
+  maxTokens: number
+  temperatureOverride?: number
+}): ChatCompletionCreateParamsStreaming & {
+  thinking?: { type: string }
+  enable_thinking?: boolean
+  chat_template_kwargs?: { thinking: boolean }
+} {
+  const { model, messages, tools, toolChoice, enableThinking, maxTokens, temperatureOverride } = params
+  return {
+    model,
+    messages,
+    max_tokens: maxTokens,
+    ...(tools.length > 0 && {
+      tools,
+      ...(toolChoice && { tool_choice: toolChoice }),
+    }),
+    stream: true,
+    stream_options: { include_usage: true },
+    // DeepSeek thinking mode: enable chain-of-thought output.
+    // When active, temperature/top_p/presence_penalty/frequency_penalty are ignored by DeepSeek.
+    ...(enableThinking && {
+      // Official DeepSeek API format
+      thinking: { type: 'enabled' },
+      // Self-hosted DeepSeek-V3.2 format
+      enable_thinking: true,
+      chat_template_kwargs: { thinking: true },
+    }),
+    // Only send temperature when thinking mode is off (DeepSeek ignores it anyway,
+    // but other providers may respect it)
+    ...(!enableThinking && temperatureOverride !== undefined && {
+      temperature: temperatureOverride,
+    }),
+  }
+}
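Note on `resolveOpenAIMaxTokens`: the `parseInt(...) || undefined` idiom means an unparseable or zero environment value is treated as unset and falls through to the next source. A minimal sketch of that chain follows. The environment is injected as a plain object so the example is deterministic; that parameter and the names `parseEnvInt` and `resolveMaxTokensSketch` are assumptions for illustration, not the signatures added in this commit.

```typescript
type TokenEnv = Record<string, string | undefined>

function parseEnvInt(raw: string | undefined): number | undefined {
  // parseInt returns NaN for non-numeric input; `|| undefined` maps NaN and 0 to "not set".
  return raw ? parseInt(raw, 10) || undefined : undefined
}

function resolveMaxTokensSketch(
  upperLimit: number,
  env: TokenEnv,
  maxOutputTokensOverride?: number,
): number {
  return maxOutputTokensOverride
    ?? parseEnvInt(env['OPENAI_MAX_TOKENS'])
    ?? parseEnvInt(env['CLAUDE_CODE_MAX_OUTPUT_TOKENS'])
    ?? upperLimit
}

const tokensDefault = resolveMaxTokensSketch(64000, {})
const tokensOpenAI = resolveMaxTokensSketch(64000, {
  OPENAI_MAX_TOKENS: '8192',
  CLAUDE_CODE_MAX_OUTPUT_TOKENS: '16000',
})
const tokensFallthrough = resolveMaxTokensSketch(64000, {
  OPENAI_MAX_TOKENS: 'not-a-number',
  CLAUDE_CODE_MAX_OUTPUT_TOKENS: '16000',
})
const tokensOverride = resolveMaxTokensSketch(64000, { OPENAI_MAX_TOKENS: '8192' }, 1024)
console.log(tokensDefault, tokensOpenAI, tokensFallthrough, tokensOverride)
```

One consequence of using `??` for the programmatic override is that an explicit override of `0` is kept rather than skipped, since `0` is not nullish; only the environment values go through the `|| undefined` filter.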
--- a/src/services/api/openai/streamAdapter.ts
+++ b/src/services/api/openai/streamAdapter.ts
@@ -1,375 +0,0 @@
-import type { BetaRawMessageStreamEvent } from '@anthropic-ai/sdk/resources/beta/messages/messages.mjs'
-import type { ChatCompletionChunk } from 'openai/resources/chat/completions/completions.mjs'
-import { randomUUID } from 'crypto'
-
-/**
- * Adapt an OpenAI streaming response into Anthropic BetaRawMessageStreamEvent.
- *
- * Mapping:
- *   First chunk              → message_start
- *   delta.reasoning_content  → content_block_start(thinking) + thinking_delta + content_block_stop
- *   delta.content            → content_block_start(text) + text_delta + content_block_stop
- *   delta.tool_calls         → content_block_start(tool_use) + input_json_delta + content_block_stop
- *   finish_reason            → message_delta(stop_reason) + message_stop
- *
- * Usage field mapping (OpenAI → Anthropic):
- *   prompt_tokens                        → input_tokens
- *   completion_tokens                    → output_tokens
- *   prompt_tokens_details.cached_tokens  → cache_read_input_tokens
- *   (no OpenAI equivalent)               → cache_creation_input_tokens (always 0)
- *
- *   All four fields are emitted in the post-loop message_delta (not message_start)
- *   so that trailing usage chunks (sent after finish_reason by some
- *   OpenAI-compatible endpoints) are fully captured before the final counts are reported.
- *
- * Thinking support:
- *   DeepSeek and compatible providers send `delta.reasoning_content` for chain-of-thought.
- *   This is mapped to Anthropic's `thinking` content blocks:
- *     content_block_start: { type: 'thinking', thinking: '', signature: '' }
- *     content_block_delta: { type: 'thinking_delta', thinking: '...' }
- *
- * Prompt caching:
- *   OpenAI reports cached tokens in usage.prompt_tokens_details.cached_tokens.
- *   This is mapped to Anthropic's cache_read_input_tokens.
- */
-export async function* adaptOpenAIStreamToAnthropic(
-  stream: AsyncIterable<ChatCompletionChunk>,
-  model: string,
-): AsyncGenerator<BetaRawMessageStreamEvent, void> {
-  const messageId = `msg_${randomUUID().replace(/-/g, '').slice(0, 24)}`
-
-  let started = false
-  let currentContentIndex = -1
-
-  // Track tool_use blocks: tool_calls index → { contentIndex, id, name, arguments }
-  const toolBlocks = new Map<number, { contentIndex: number; id: string; name: string; arguments: string }>()
-
-  // Track thinking block state
-  let thinkingBlockOpen = false
-
-  // Track text block state
-  let textBlockOpen = false
-
-  // Track usage — all four Anthropic fields, populated from OpenAI usage fields:
-  //   prompt_tokens                          → input_tokens
-  //   completion_tokens                      → output_tokens
-  //   prompt_tokens_details.cached_tokens    → cache_read_input_tokens
-  //   (no standard OpenAI equivalent)        → cache_creation_input_tokens (always 0)
-  let inputTokens = 0
-  let outputTokens = 0
-  let cachedReadTokens = 0
-
-  // Track all open content block indices (for cleanup)
-  const openBlockIndices = new Set<number>()
-
-  // Deferred finish state: populated when finish_reason is encountered so that
-  // message_delta / message_stop are emitted AFTER the stream loop ends.
-  // This ensures usage chunks that arrive after the finish_reason chunk are
-  // captured before we emit the final token counts.
-  let pendingFinishReason: string | null = null
-  let pendingHasToolCalls = false
-
-  for await (const chunk of stream) {
-    const choice = chunk.choices?.[0]
-    const delta = choice?.delta
-
-    // Extract usage from any chunk that carries it.
-    // Many OpenAI-compatible endpoints (e.g. DeepSeek) send usage in a separate
-    // final chunk that arrives AFTER the finish_reason chunk. Reading it here
-    // (before emitting message_delta) ensures the token counts are available
-    // when we later emit message_delta.
-    if (chunk.usage) {
-      inputTokens = chunk.usage.prompt_tokens ?? inputTokens
-      outputTokens = chunk.usage.completion_tokens ?? outputTokens
-      // OpenAI prompt caching: prompt_tokens_details.cached_tokens
-      // → Anthropic cache_read_input_tokens
-      // Note: OpenAI has no equivalent for cache_creation_input_tokens.
-      const details = (chunk.usage as any).prompt_tokens_details
-      if (details?.cached_tokens != null) {
-        cachedReadTokens = details.cached_tokens
-      }
-    }
-
-    // Emit message_start on first chunk
-    if (!started) {
-      started = true
-
-      yield {
-        type: 'message_start',
-        message: {
-          id: messageId,
-          type: 'message',
-          role: 'assistant',
-          content: [],
-          model,
-          stop_reason: null,
-          stop_sequence: null,
-          usage: {
-            input_tokens: inputTokens,
-            output_tokens: 0,
-            cache_creation_input_tokens: 0,
-            cache_read_input_tokens: cachedReadTokens,
-          },
-        },
-      } as unknown as BetaRawMessageStreamEvent
-    }
-
-    // Skip chunks that carry only usage data (no delta content)
-    if (!delta) continue
-
-    // Handle reasoning_content → Anthropic thinking block
-    // DeepSeek and compatible providers send delta.reasoning_content
-    const reasoningContent = (delta as any).reasoning_content
-    if (reasoningContent != null && reasoningContent !== '') {
-      if (!thinkingBlockOpen) {
-        currentContentIndex++
-        thinkingBlockOpen = true
-        openBlockIndices.add(currentContentIndex)
-
-        yield {
-          type: 'content_block_start',
-          index: currentContentIndex,
-          content_block: {
-            type: 'thinking',
-            thinking: '',
-            signature: '',
-          },
-        } as BetaRawMessageStreamEvent
-      }
-
-      yield {
-        type: 'content_block_delta',
-        index: currentContentIndex,
-        delta: {
-          type: 'thinking_delta',
-          thinking: reasoningContent,
-        },
-      } as BetaRawMessageStreamEvent
-    }
-
-    // Handle text content
-    if (delta.content != null && delta.content !== '') {
-      if (!textBlockOpen) {
-        // Close thinking block if still open (reasoning done, now generating answer)
-        if (thinkingBlockOpen) {
-          yield {
-            type: 'content_block_stop',
-            index: currentContentIndex,
-          } as BetaRawMessageStreamEvent
-          openBlockIndices.delete(currentContentIndex)
-          thinkingBlockOpen = false
-        }
-
-        currentContentIndex++
-        textBlockOpen = true
-        openBlockIndices.add(currentContentIndex)
-
-        yield {
-          type: 'content_block_start',
-          index: currentContentIndex,
-          content_block: {
-            type: 'text',
-            text: '',
-          },
-        } as BetaRawMessageStreamEvent
-      }
-
-      yield {
-        type: 'content_block_delta',
-        index: currentContentIndex,
-        delta: {
-          type: 'text_delta',
-          text: delta.content,
-        },
-      } as BetaRawMessageStreamEvent
-    }
-
-    // Handle tool calls
-    if (delta.tool_calls) {
-      for (const tc of delta.tool_calls) {
-        const tcIndex = tc.index
-
-        if (!toolBlocks.has(tcIndex)) {
-          // Close thinking block if open
-          if (thinkingBlockOpen) {
-            yield {
-              type: 'content_block_stop',
-              index: currentContentIndex,
-            } as BetaRawMessageStreamEvent
-            openBlockIndices.delete(currentContentIndex)
-            thinkingBlockOpen = false
-          }
-
-          // Close text block if open
-          if (textBlockOpen) {
-            yield {
-              type: 'content_block_stop',
-              index: currentContentIndex,
-            } as BetaRawMessageStreamEvent
-            openBlockIndices.delete(currentContentIndex)
-            textBlockOpen = false
-          }
-
-          // Start new tool_use block
-          currentContentIndex++
-          const toolId = tc.id || `toolu_${randomUUID().replace(/-/g, '').slice(0, 24)}`
-          const toolName = tc.function?.name || ''
-
-          toolBlocks.set(tcIndex, {
-            contentIndex: currentContentIndex,
-            id: toolId,
-            name: toolName,
-            arguments: '',
-          })
-          openBlockIndices.add(currentContentIndex)
-
-          yield {
-            type: 'content_block_start',
-            index: currentContentIndex,
-            content_block: {
-              type: 'tool_use',
-              id: toolId,
-              name: toolName,
-              input: {},
-            },
-          } as BetaRawMessageStreamEvent
-        }
-
-        // Stream argument fragments
-        const argFragment = tc.function?.arguments
-        if (argFragment) {
-          toolBlocks.get(tcIndex)!.arguments += argFragment
-          yield {
-            type: 'content_block_delta',
-            index: toolBlocks.get(tcIndex)!.contentIndex,
-            delta: {
-              type: 'input_json_delta',
-              partial_json: argFragment,
-            },
-          } as BetaRawMessageStreamEvent
-        }
-      }
-    }
-
-    // Handle finish: close all open content blocks and record the finish_reason.
-    // message_delta + message_stop are emitted AFTER the stream loop so that any
-    // trailing usage chunk (sent after the finish chunk by some endpoints)
-    // is captured first — ensuring token counts are non-zero.
-    if (choice?.finish_reason) {
-      // Close thinking block if still open
-      if (thinkingBlockOpen) {
-        yield {
-          type: 'content_block_stop',
-          index: currentContentIndex,
-        } as BetaRawMessageStreamEvent
-        openBlockIndices.delete(currentContentIndex)
-        thinkingBlockOpen = false
-      }
-
-      // Close text block if still open
-      if (textBlockOpen) {
-        yield {
-          type: 'content_block_stop',
-          index: currentContentIndex,
-        } as BetaRawMessageStreamEvent
-        openBlockIndices.delete(currentContentIndex)
-        textBlockOpen = false
-      }
-
-      // Close all tool blocks that haven't been closed yet
-      for (const [, block] of toolBlocks) {
-        if (openBlockIndices.has(block.contentIndex)) {
-          yield {
-            type: 'content_block_stop',
-            index: block.contentIndex,
-          } as BetaRawMessageStreamEvent
-          openBlockIndices.delete(block.contentIndex)
-        }
-      }
-
-      // Defer message_delta / message_stop until after the loop so that any
-      // trailing usage chunk is processed before we emit the final token counts.
-      pendingFinishReason = choice.finish_reason
-      pendingHasToolCalls = toolBlocks.size > 0
-    }
-  }
-
-  // Safety: close any remaining open blocks if stream ended without finish_reason
-  for (const idx of openBlockIndices) {
-    yield {
-      type: 'content_block_stop',
-      index: idx,
-    } as BetaRawMessageStreamEvent
-  }
-
-  // Emit message_delta + message_stop now that the stream is fully consumed.
-  // Usage values (inputTokens / outputTokens) reflect all chunks including any
-  // trailing usage-only chunk sent after the finish_reason chunk.
-  if (pendingFinishReason !== null) {
-    // Map finish_reason to Anthropic stop_reason.
-    // CRITICAL: When finish_reason is 'length' (token budget exhausted), always
-    // report 'max_tokens' regardless of whether partial tool calls were received.
-    // Otherwise the query loop would try to execute tool calls with incomplete
-    // JSON arguments instead of triggering the max_tokens retry/recovery path.
-    const stopReason =
-      pendingFinishReason === 'length'
-        ? 'max_tokens'
-        : pendingHasToolCalls
-          ? 'tool_use'
-          : mapFinishReason(pendingFinishReason)
-
-    yield {
-      type: 'message_delta',
-      delta: {
-        stop_reason: stopReason,
-        stop_sequence: null,
-      },
-      // Carry all four Anthropic usage fields so queryModelOpenAI's message_delta
-      // handler (which spreads this into the accumulated usage object) can override
-      // every field that message_start emitted as 0. For endpoints that send usage
-      // in a trailing chunk (e.g. DeepSeek), message_start is emitted on the first
-      // content chunk before the trailing usage chunk arrives, so all four fields
-      // start at 0. By the time we reach here (post-loop) the trailing chunk has
-      // been processed and all values reflect the real counts.
-      //
-      // OpenAI → Anthropic field mapping:
-      //   prompt_tokens                        → input_tokens
-      //   completion_tokens                    → output_tokens
-      //   prompt_tokens_details.cached_tokens  → cache_read_input_tokens
-      //   (no OpenAI equivalent)               → cache_creation_input_tokens (stays 0)
-      usage: {
-        input_tokens: inputTokens,
-        output_tokens: outputTokens,
-        cache_read_input_tokens: cachedReadTokens,
-        cache_creation_input_tokens: 0,
-      },
-    } as BetaRawMessageStreamEvent
-
-    yield {
-      type: 'message_stop',
-    } as BetaRawMessageStreamEvent
-  }
-}
-
-/**
- * Map OpenAI finish_reason to Anthropic stop_reason.
- *
- * stop           → end_turn
- * tool_calls     → tool_use
- * length         → max_tokens
- * content_filter → end_turn
- * (other)        → end_turn
- */
-function mapFinishReason(reason: string): string {
-  switch (reason) {
-    case 'stop':
-      return 'end_turn'
-    case 'tool_calls':
-      return 'tool_use'
-    case 'length':
-      return 'max_tokens'
-    case 'content_filter':
-      return 'end_turn'
-    default:
-      return 'end_turn'
-  }
-}
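
Review note: the stop_reason precedence in the removed adapter (`length` first, then received tool calls, then the plain mapping) can be sketched in isolation as below. This is a minimal sketch, not the code that moved into the model-provider package. `resolveStopReason` and the `StopReason` union are hypothetical names introduced for illustration. `mapFinishReason` mirrors the removed helper, with its redundant cases folded into `default`.

```typescript
// Hypothetical standalone sketch of the precedence used by the removed adapter.
type StopReason = 'end_turn' | 'tool_use' | 'max_tokens'

// Mirrors the removed mapFinishReason: 'stop', 'content_filter', and any
// unrecognized value all map to 'end_turn'.
function mapFinishReason(reason: string): StopReason {
  switch (reason) {
    case 'tool_calls':
      return 'tool_use'
    case 'length':
      return 'max_tokens'
    default:
      return 'end_turn'
  }
}

// Hypothetical helper: resolves the final stop_reason once the stream is
// fully consumed.
function resolveStopReason(
  finishReason: string,
  hasToolCalls: boolean,
): StopReason {
  // 'length' takes priority: partially streamed tool calls may carry
  // incomplete JSON arguments, so they must not be reported as 'tool_use'.
  if (finishReason === 'length') return 'max_tokens'
  // Any received tool call block makes the turn a tool_use turn, even when
  // the endpoint reported 'stop'.
  if (hasToolCalls) return 'tool_use'
  return mapFinishReason(finishReason)
}

console.log(resolveStopReason('length', true))
console.log(resolveStopReason('stop', true))
console.log(resolveStopReason('content_filter', false))
```

Checking `length` before `hasToolCalls` is what keeps a truncated response on the max_tokens recovery path instead of dispatching tool calls with partial arguments.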