Compare commits

...

7 Commits

Author SHA1 Message Date
claude-code-best
b62b384e36 fix: normalizeMessagesForAPI 不再跨 tool_result 边界合并同 ID assistant 消息 (CC-1215)
ACP 模式下 extended thinking + tool_use 同一 turn 时,StreamingToolExecutor
在两个同 message.id 的 AssistantMessage 之间插入 tool_result,导致向后遍历
合并跨越边界,产生重复 tool_use ID → 孤立 tool_result → 连续 user 消息 → 400。

修改向后遍历停止条件:遇到非 assistant 消息(含 tool_result)即停止,不再跳过。
2026-06-04 15:41:41 +08:00
claude-code-best
d7001b870f fix: add markResourceTiming polyfill to performance shim for Node.js v22 undici compatibility
Node.js v22 undici internal calls performance.markResourceTiming() after
every fetch. The performance shim was missing this method, causing
TypeError crashes in ACP mode when running with Node.js.
2026-06-04 14:30:34 +08:00
claude-code-best
18437c20d2 fix: prevent crash when DiscoverSkills receives undefined query via ExecuteExtraTool
searchSkills() called .trim() on query without null-guard. When
DiscoverSkills is invoked through ExecuteExtraTool with missing
description, query is undefined, causing 'Cannot read properties of
undefined (reading trim)'.

Fixed with optional chaining: !query.trim() → !query?.trim()

Co-Authored-By: deepseek-v4-pro <deepseek-ai@claude-code-best.win>
2026-06-03 21:38:23 +08:00
James F
02298cb199 security: close telemetry leak in preconnectAnthropicApi startup path (#1253)
🔒 Security Discovery: Un-gated outbound connection bypasses privacy controls

Summary
-------
preconnectAnthropicApi() unconditionally sends a TCP+TLS handshake to
api.anthropic.com on every ccb startup — even when the user has explicitly
disabled all non-essential traffic via CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
or DISABLE_TELEMETRY=1.

This is the LAST un-gated outbound connection in the entire startup path.
Every other telemetry sink (Sentry, Langfuse, OpenTelemetry, GrowthBook,
1P Event Logger, Datadog, BigQuery, etc.) already respects the
privacyLevel module's isEssentialTrafficOnly() gate. This one did not.

Impact
------
While the preconnect is a HEAD request with no payload, the connection
itself leaks the client's IP address and session timing to Anthropic's
infrastructure. For privacy-conscious users and enterprise deployments
that have disabled telemetry, this constitutes an unexpected data leak.

Fix
---
Add isEssentialTrafficOnly() check at the function entry, consistent
with every other privacy-gated code path in the codebase. The
privacyLevel module is already imported by init.ts and 12+ other
modules — no new dependencies.

Verification
------------
Reproduced and verified via strace on Linux (aarch64):

  # Before fix
  $ strace -f -e connect ccb -p <<< 'hello'
  connect(16, sin_addr=inet_addr("160.79.104.10"), sin_port=htons(443)) = 0
  # ↑ connector to api.anthropic.com despite CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1

  # After fix
  $ strace -f -e connect ccb -p <<< 'hello'
  # ↑ zero remote TCP connections — all traffic to localhost only

Changes: 1 file, +5 lines (import + gate)
2026-06-02 09:30:13 +08:00
claude-code-best
b2b1981da3 docs: update contributors 2026-06-01 00:23:43 +00:00
claude-code-best
33c52578a6 docs: 修改 README 2026-05-31 22:11:29 +08:00
claude-code-best
e33b17bde7 feat: sideQuery 支持第三方 provider 路由 (OpenAI/Grok/Gemini)
- 新增 getProviderPrimaryModel() 从环境变量解析 provider 主模型
- getDefaultOpus/Sonnet/HaikuModel 在第三方 provider 下回退到用户配置的主模型
- sideQuery 根据 provider 类型分发到对应的 API 适配器
- 新增 sideQueryViaOpenAICompatible (OpenAI + Grok) 和 sideQueryViaGemini 适配函数
- 避免 sideQuery 后台任务在配置第三方端点时仍请求 Anthropic API
2026-05-31 14:08:30 +08:00
9 changed files with 644 additions and 24 deletions

View File

@@ -10,12 +10,11 @@
> Which Claude do you like? The open source one is the best. > Which Claude do you like? The open source one is the best.
牢 A (Anthropic) 官方 [Claude Code](https://docs.anthropic.com/en/docs/claude-code) CLI 工具的源码反编译/逆向还原项目。目标是将 Claude Code 大部分功能及工程化能力复现 (问就是老佛爷已经付过钱了)。虽然很难绷, 但是它叫做 CCB(踩踩背)... 而且, 我们实现了企业版或者需要登陆 Claude 账号才能使用的特性, 实现技术普惠 牢 A (Anthropic) 官方 [Claude Code](https://docs.anthropic.com/en/docs/claude-code) 完整复原的工程化项目。虽然很难绷, 但是它叫做 CCB(踩踩背)... 而且, 我们实现了企业版或者需要登陆 Claude 账号才能使用的特性, 并在此基础上扩展了更多好玩的特性。
> 我们将会在五一期间进行整个代码仓库的 lint 规范化, 这个期间提交的 PR 可能会有非常多的冲突, 所以大的功能请尽量在这之前提交哈 [Peri Code](https://github.com/KonghaYao/peri)Claude Code 兼容的 Rust Agent多年大模型经验匠心制作国内大模型DeepSeek/GLM精调CPU/内存极致优化,在开发版/树莓派上也能跑 CC 一样的体验。
[文档在这里, 支持投稿 PR](https://ccb.agent-aura.top/) | [留影文档在这里](./Friends.md) | [Discord 群组](https://discord.gg/uApuzJWGKX)
[文档在这里](https://ccb.agent-aura.top/) | [留影文档在这里](./Friends.md) | [Discord 群组,群主在线答疑](https://discord.gg/uApuzJWGKX)
| 特性 | 说明 | 文档 | | 特性 | 说明 | 文档 |
| --------------------------- | ---------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | | --------------------------- | ---------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- |
@@ -150,7 +149,6 @@ bun run build
需要填写的字段: 需要填写的字段:
| 📌 字段 | 📝 说明 | 💡 示例 | | 📌 字段 | 📝 说明 | 💡 示例 |
| ------------ | ------------- | ---------------------------- | | ------------ | ------------- | ---------------------------- |
| Base URL | API 服务地址 | `https://api.example.com/v1` | | Base URL | API 服务地址 | `https://api.example.com/v1` |

File diff suppressed because one or more lines are too long

Before

Width:  |  Height:  |  Size: 2.3 MiB

After

Width:  |  Height:  |  Size: 2.5 MiB

View File

@@ -385,7 +385,7 @@ export function searchSkills(
index: SkillIndexEntry[], index: SkillIndexEntry[],
limit = 5, limit = 5,
): SearchResult[] { ): SearchResult[] {
if (index.length === 0 || !query.trim()) return [] if (index.length === 0 || !query?.trim()) return []
const queryTokens = tokenizeAndStem(query) const queryTokens = tokenizeAndStem(query)
if (queryTokens.length === 0) return [] if (queryTokens.length === 0) return []

View File

@@ -610,3 +610,179 @@ describe('ensureToolResultPairing', () => {
expect(lastMsg.type).toBe('user') expect(lastMsg.type).toBe('user')
}) })
}) })
// ─── CC-1215: normalizeMessagesForAPI must not merge assistants across tool_results ──
describe('normalizeMessagesForAPI thinking + tool_use same turn (CC-1215)', () => {
test('does not merge same-id assistants across a tool_result boundary', () => {
// Simulate the streaming sequence when extended thinking + tool_use appear
// in the same turn, and StreamingToolExecutor inserts a tool_result
// between the two assistant content-block messages.
const sharedMessageId = 'msg_shared_001'
const toolUseId = 'toolu_cc1215'
// assistant[thinking] — first content_block_stop yield
const thinkingMsg = createAssistantMessage({
content: [
{ type: 'thinking', thinking: 'Let me think...', signature: 'sig1' },
],
})
thinkingMsg.message.id = sharedMessageId
// user[tool_result] — from StreamingToolExecutor completing fast
const toolResultMsg = createUserMessage({
content: [
{
type: 'tool_result',
tool_use_id: toolUseId,
content: '/home/user',
},
],
})
// assistant[tool_use] — second content_block_stop yield
const toolUseMsg = createAssistantMessage({
content: [
{
type: 'tool_use',
id: toolUseId,
name: 'Bash',
input: { command: 'pwd' },
},
],
})
toolUseMsg.message.id = sharedMessageId
const messages: Message[] = [
makeUserMsg('Run pwd'),
thinkingMsg,
toolResultMsg,
toolUseMsg,
]
const result = normalizeMessagesForAPI(messages)
// Before the fix, the backward walk would skip the tool_result and merge
// thinking + tool_use into one assistant. This produced duplicate tool_use
// IDs after ensureToolResultPairing ran, leading to orphaned tool_results
// and consecutive user messages → API 400.
//
// After the fix, the backward walk stops at the tool_result, so the two
// assistants remain separate. The result should have 4 messages:
// user, assistant[thinking], user[tool_result], assistant[tool_use]
expect(result).toHaveLength(4)
expect(result[0]!.type).toBe('user')
expect(result[1]!.type).toBe('assistant')
expect(result[2]!.type).toBe('user')
expect(result[3]!.type).toBe('assistant')
// The thinking assistant should NOT have been merged with the tool_use one
const thinkingAssistant = result[1] as AssistantMessage
const thinkingContent = thinkingAssistant.message.content as Array<{
type: string
}>
expect(thinkingContent.some(b => b.type === 'tool_use')).toBe(false)
const toolUseAssistant = result[3] as AssistantMessage
const toolUseContent = toolUseAssistant.message.content as Array<{
type: string
}>
expect(toolUseContent.some(b => b.type === 'tool_use')).toBe(true)
})
test('still merges consecutive same-id assistants without intervening tool_result', () => {
const sharedMessageId = 'msg_shared_002'
const thinkingMsg = createAssistantMessage({
content: [{ type: 'thinking', thinking: 'Hmm', signature: 'sig2' }],
})
thinkingMsg.message.id = sharedMessageId
const toolUseMsg = createAssistantMessage({
content: [
{
type: 'tool_use',
id: 'toolu_merge',
name: 'Bash',
input: { command: 'ls' },
},
],
})
toolUseMsg.message.id = sharedMessageId
// No tool_result between them — they should still be merged
const messages: Message[] = [
makeUserMsg('List files'),
thinkingMsg,
toolUseMsg,
]
const result = normalizeMessagesForAPI(messages)
// Should be: user, assistant[thinking + tool_use]
expect(result).toHaveLength(2)
expect(result[0]!.type).toBe('user')
const merged = result[1] as AssistantMessage
const content = merged.message.content as Array<{ type: string }>
expect(content.some(b => b.type === 'thinking')).toBe(true)
expect(content.some(b => b.type === 'tool_use')).toBe(true)
})
test('full pipeline: normalize + ensureToolResultPairing produces valid role alternation', () => {
const sharedMessageId = 'msg_shared_003'
const toolUseId = 'toolu_pipeline'
const thinkingMsg = createAssistantMessage({
content: [
{ type: 'thinking', thinking: 'Planning...', signature: 'sig3' },
],
})
thinkingMsg.message.id = sharedMessageId
const toolResultMsg = createUserMessage({
content: [
{
type: 'tool_result',
tool_use_id: toolUseId,
content: 'file.txt',
},
],
})
const toolUseMsg = createAssistantMessage({
content: [
{
type: 'tool_use',
id: toolUseId,
name: 'Bash',
input: { command: 'ls' },
},
],
})
toolUseMsg.message.id = sharedMessageId
// Full pipeline: normalize → ensureToolResultPairing
const normalized = normalizeMessagesForAPI([
makeUserMsg('Run ls'),
thinkingMsg,
toolResultMsg,
toolUseMsg,
])
const result = ensureToolResultPairing(normalized)
// Verify strict role alternation: user → assistant → user → assistant → ...
for (let i = 1; i < result.length; i++) {
const prev = result[i - 1]!
const curr = result[i]!
if (prev.type === 'user' && curr.type === 'user') {
expect.unreachable(`Consecutive user messages at index ${i - 1}-${i}`)
}
if (prev.type === 'assistant' && curr.type === 'assistant') {
expect.unreachable(
`Consecutive assistant messages at index ${i - 1}-${i}`,
)
}
}
})
})

View File

@@ -25,6 +25,7 @@
import { getOauthConfig } from '../constants/oauth.js' import { getOauthConfig } from '../constants/oauth.js'
import { isEnvTruthy } from './envUtils.js' import { isEnvTruthy } from './envUtils.js'
import { isEssentialTrafficOnly } from './privacyLevel.js'
let fired = false let fired = false
@@ -32,6 +33,10 @@ export function preconnectAnthropicApi(): void {
if (fired) return if (fired) return
fired = true fired = true
// Also skip when non-essential traffic is disabled via
// CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC / DISABLE_TELEMETRY / proxy env.
if (isEssentialTrafficOnly()) return
// Skip if using a cloud provider — different endpoint + auth // Skip if using a cloud provider — different endpoint + auth
if ( if (
isEnvTruthy(process.env.CLAUDE_CODE_USE_BEDROCK) || isEnvTruthy(process.env.CLAUDE_CODE_USE_BEDROCK) ||

View File

@@ -2541,21 +2541,26 @@ export function normalizeMessagesForAPI(
} }
// Find a previous assistant message with the same message ID and merge. // Find a previous assistant message with the same message ID and merge.
// Walk backwards, skipping tool results and different-ID assistants, // Walk backwards, skipping different-ID assistants, since concurrent
// since concurrent agents (teammates) can interleave streaming content // agents (teammates) can interleave streaming content blocks from
// blocks from multiple API responses with different message IDs. // multiple API responses with different message IDs.
//
// Do NOT skip tool_result messages — when claude.ts yields separate
// AssistantMessages for thinking and tool_use blocks (same message.id),
// a StreamingToolExecutor tool_result can land between them. Merging
// across that boundary produces duplicate tool_use IDs that downstream
// ensureToolResultPairing strips, leaving orphaned tool_results and
// ultimately consecutive user messages → API 400 (CC-1215).
for (let i = result.length - 1; i >= 0; i--) { for (let i = result.length - 1; i >= 0; i--) {
const msg = result[i]! const msg = result[i]!
if (msg.type !== 'assistant' && !isToolResultMessage(msg)) { if (msg.type !== 'assistant') {
break break
} }
if (msg.type === 'assistant') { if (msg.message.id === normalizedMessage.message.id) {
if (msg.message.id === normalizedMessage.message.id) { result[i] = mergeAssistantMessages(msg, normalizedMessage)
result[i] = mergeAssistantMessages(msg, normalizedMessage) return
return
}
} }
} }

View File

@@ -120,6 +120,19 @@ export function getBestModel(): ModelName {
return getDefaultOpusModel() return getDefaultOpusModel()
} }
/**
* Resolve the provider's primary model from its env var (e.g. OPENAI_MODEL).
* Returns undefined for providers that don't have a primary-model env var
* (Bedrock, Vertex, Foundry, firstParty).
*/
function getProviderPrimaryModel(): ModelName | undefined {
const provider = getAPIProvider()
if (provider === 'openai') return process.env.OPENAI_MODEL
if (provider === 'gemini') return process.env.GEMINI_MODEL
if (provider === 'grok') return process.env.GROK_MODEL
return undefined
}
// @[MODEL LAUNCH]: Update the default Opus model (3P providers may lag so keep defaults unchanged). // @[MODEL LAUNCH]: Update the default Opus model (3P providers may lag so keep defaults unchanged).
export function getDefaultOpusModel(): ModelName { export function getDefaultOpusModel(): ModelName {
const provider = getAPIProvider() const provider = getAPIProvider()
@@ -138,10 +151,12 @@ export function getDefaultOpusModel(): ModelName {
if (process.env.ANTHROPIC_DEFAULT_OPUS_MODEL) { if (process.env.ANTHROPIC_DEFAULT_OPUS_MODEL) {
return process.env.ANTHROPIC_DEFAULT_OPUS_MODEL return process.env.ANTHROPIC_DEFAULT_OPUS_MODEL
} }
// 3P providers (Bedrock, Vertex, Foundry) all publish Opus 4.7 in sync // 3P providers: if user set a primary model (e.g. OPENAI_MODEL=glm-5.1),
// with firstParty as of 2026-04-17 (AWS Bedrock, Google Vertex AI, and // fall back to it instead of a hardcoded Anthropic model. This prevents
// Microsoft Foundry announcements and model catalogs all confirm). The // sideQuery / background tasks from sending requests to Anthropic's API
// branch is kept as a structural hook in case a future launch lags on 3P. // when the user configured a third-party provider.
const primaryModel = getProviderPrimaryModel()
if (primaryModel) return primaryModel
if (provider !== 'firstParty') { if (provider !== 'firstParty') {
return getModelStrings().opus47 return getModelStrings().opus47
} }
@@ -166,7 +181,11 @@ export function getDefaultSonnetModel(): ModelName {
if (process.env.ANTHROPIC_DEFAULT_SONNET_MODEL) { if (process.env.ANTHROPIC_DEFAULT_SONNET_MODEL) {
return process.env.ANTHROPIC_DEFAULT_SONNET_MODEL return process.env.ANTHROPIC_DEFAULT_SONNET_MODEL
} }
// Default to Sonnet 4.5 for 3P since they may not have 4.6 yet // 3P providers: fall back to user's primary model instead of a hardcoded
// Anthropic model name. Prevents background API calls from being routed to
// Anthropic when the user configured a third-party endpoint.
const primaryModel = getProviderPrimaryModel()
if (primaryModel) return primaryModel
if (provider !== 'firstParty') { if (provider !== 'firstParty') {
return getModelStrings().sonnet45 return getModelStrings().sonnet45
} }
@@ -191,6 +210,10 @@ export function getDefaultHaikuModel(): ModelName {
if (process.env.ANTHROPIC_DEFAULT_HAIKU_MODEL) { if (process.env.ANTHROPIC_DEFAULT_HAIKU_MODEL) {
return process.env.ANTHROPIC_DEFAULT_HAIKU_MODEL return process.env.ANTHROPIC_DEFAULT_HAIKU_MODEL
} }
// 3P providers: fall back to user's primary model instead of a hardcoded
// Anthropic model name.
const primaryModel = getProviderPrimaryModel()
if (primaryModel) return primaryModel
// Haiku 4.5 is available on all platforms (first-party, Foundry, Bedrock, Vertex) // Haiku 4.5 is available on all platforms (first-party, Foundry, Bedrock, Vertex)
return getModelStrings().haiku45 return getModelStrings().haiku45

View File

@@ -135,6 +135,9 @@ const shim = {
clearResourceTimings: (() => {}) as typeof performance.clearResourceTimings, clearResourceTimings: (() => {}) as typeof performance.clearResourceTimings,
setResourceTimingBufferSize: setResourceTimingBufferSize:
(() => {}) as typeof performance.setResourceTimingBufferSize, (() => {}) as typeof performance.setResourceTimingBufferSize,
// Node.js v22 undici internal calls this after every fetch — must exist to
// avoid TypeError: markResourceTiming is not a function
markResourceTiming: (() => {}) as any,
// Delegate read-only properties to the original // Delegate read-only properties to the original
get timeOrigin() { get timeOrigin() {
return original.timeOrigin return original.timeOrigin
@@ -148,7 +151,7 @@ const shim = {
toJSON() { toJSON() {
return original.toJSON() return original.toJSON()
}, },
} as typeof performance } as unknown as typeof performance
/** /**
* Install the shim onto globalThis.performance. Safe to call multiple times. * Install the shim onto globalThis.performance. Safe to call multiple times.

View File

@@ -33,6 +33,19 @@ import { errorMessage } from './errors.js'
import { computeFingerprint } from './fingerprint.js' import { computeFingerprint } from './fingerprint.js'
import { getAPIProvider } from './model/providers.js' import { getAPIProvider } from './model/providers.js'
import { normalizeModelStringForAPI } from './model/model.js' import { normalizeModelStringForAPI } from './model/model.js'
import { getOpenAIClient } from '../services/api/openai/client.js'
import { getGrokClient } from '../services/api/grok/client.js'
import {
anthropicMessagesToOpenAI,
resolveOpenAIModel,
anthropicToolsToOpenAI,
anthropicToolChoiceToOpenAI,
resolveGrokModel,
resolveGeminiModel,
anthropicToolsToGemini,
anthropicToolChoiceToGemini,
} from '@ant/model-provider'
import type { SystemPrompt } from './systemPromptType.js'
type MessageParam = Anthropic.MessageParam type MessageParam = Anthropic.MessageParam
type TextBlockParam = Anthropic.TextBlockParam type TextBlockParam = Anthropic.TextBlockParam
@@ -99,6 +112,46 @@ function extractFirstUserMessageText(messages: MessageParam[]): string {
return textBlock?.type === 'text' ? textBlock.text : '' return textBlock?.type === 'text' ? textBlock.text : ''
} }
/**
* Extract system prompt text from the `system` option.
*/
function extractSystemText(system?: string | TextBlockParam[]): string {
if (!system) return ''
if (typeof system === 'string') return system
return system
.filter((b): b is { type: 'text'; text: string } => 'text' in b && !!b.text)
.map(b => b.text)
.join('\n\n')
}
/**
* Convert Anthropic MessageParam[] to a list of {role, content} objects
* suitable for OpenAI-compatible chat.completions APIs.
*/
function messageParamsToOpenAIRoleContent(
messages: MessageParam[],
): Array<{ role: 'user' | 'assistant'; content: string }> {
const result: Array<{ role: 'user' | 'assistant'; content: string }> = []
for (const m of messages) {
if (m.role !== 'user' && m.role !== 'assistant') continue
const text =
typeof m.content === 'string'
? m.content
: Array.isArray(m.content)
? m.content
.filter(
(b): b is { type: 'text'; text: string } => b.type === 'text',
)
.map(b => b.text)
.join('\n')
: ''
if (text) {
result.push({ role: m.role as 'user' | 'assistant', content: text })
}
}
return result
}
/** /**
* Lightweight API wrapper for "side queries" outside the main conversation loop. * Lightweight API wrapper for "side queries" outside the main conversation loop.
* *
@@ -112,6 +165,7 @@ function extractFirstUserMessageText(messages: MessageParam[]): string {
* - Proper betas for the model * - Proper betas for the model
* - API metadata * - API metadata
* - Model string normalization (strips [1m] suffix for API) * - Model string normalization (strips [1m] suffix for API)
* - Third-party provider routing (OpenAI, Grok, Gemini)
* *
* @example * @example
* // Permission explainer * // Permission explainer
@@ -142,6 +196,14 @@ export async function sideQuery(opts: SideQueryOptions): Promise<BetaMessage> {
stop_sequences, stop_sequences,
} = opts } = opts
const provider = getAPIProvider()
if (provider === 'openai' || provider === 'grok') {
return sideQueryViaOpenAICompatible(opts)
}
if (provider === 'gemini') {
return sideQueryViaGemini(opts)
}
const client = await getAnthropicClient({ const client = await getAnthropicClient({
maxRetries, maxRetries,
model, model,
@@ -198,7 +260,6 @@ export async function sideQuery(opts: SideQueryOptions): Promise<BetaMessage> {
} }
const normalizedModel = normalizeModelStringForAPI(model) const normalizedModel = normalizeModelStringForAPI(model)
const provider = getAPIProvider()
const start = Date.now() const start = Date.now()
const traceName = `side-query:${opts.querySource}` const traceName = `side-query:${opts.querySource}`
@@ -328,3 +389,352 @@ export async function sideQuery(opts: SideQueryOptions): Promise<BetaMessage> {
return response return response
} }
/**
* OpenAI-compatible side query for OpenAI and Grok providers.
* Both use the OpenAI SDK with different base URLs.
*
* Converts Anthropic-format params to OpenAI Chat Completions, sends a
* non-streaming request, and wraps the response back into a BetaMessage
* shape so callers remain provider-agnostic.
*
* Supports tools and tool_choice for structured output (e.g. yoloClassifier,
* permissionExplainer).
*/
async function sideQueryViaOpenAICompatible(
opts: SideQueryOptions,
): Promise<BetaMessage> {
const {
model,
system,
messages,
tools,
tool_choice,
max_tokens = 1024,
temperature,
signal,
} = opts
const provider = getAPIProvider()
const normalizedModel = normalizeModelStringForAPI(model)
// Resolve model name and client per provider
let openaiModel: string
// eslint-disable-next-line @typescript-eslint/no-redundant-type-constituents
let client: import('openai').default
if (provider === 'grok') {
openaiModel = resolveGrokModel(normalizedModel)
client = getGrokClient({ maxRetries: opts.maxRetries ?? 2 })
} else {
openaiModel = resolveOpenAIModel(normalizedModel)
client = getOpenAIClient({ maxRetries: opts.maxRetries ?? 2 })
}
// Build system prompt text
const systemText = extractSystemText(system)
// Build OpenAI messages: system first, then user/assistant
const openaiMessages: Array<{
role: 'system' | 'user' | 'assistant'
content: string
}> = []
if (systemText) {
openaiMessages.push({ role: 'system', content: systemText })
}
openaiMessages.push(...messageParamsToOpenAIRoleContent(messages))
// Convert tools and tool_choice if provided
const openaiTools =
tools && tools.length > 0
? anthropicToolsToOpenAI(tools as BetaToolUnion[])
: undefined
const openaiToolChoice = tool_choice
? anthropicToolChoiceToOpenAI(tool_choice)
: undefined
const start = Date.now()
const requestParams: Record<string, unknown> = {
model: openaiModel,
messages: openaiMessages,
max_tokens,
}
if (temperature !== undefined) requestParams.temperature = temperature
if (openaiTools && openaiTools.length > 0) {
requestParams.tools = openaiTools
if (openaiToolChoice) requestParams.tool_choice = openaiToolChoice
}
const response = await client.chat.completions.create(
requestParams as unknown as import('openai/resources/chat/completions/completions.mjs').ChatCompletionCreateParamsNonStreaming,
{ signal },
)
const choice = response.choices[0]
const message = choice?.message
// Build content blocks for BetaMessage
const contentBlocks: Array<
| { type: 'text'; text: string }
| { type: 'tool_use'; id: string; name: string; input: unknown }
> = []
if (message?.content) {
contentBlocks.push({ type: 'text', text: message.content })
}
if (message?.tool_calls) {
for (const tc of message.tool_calls) {
// ChatCompletionMessageToolCall is a union — only function-type has .function
if (tc.type === 'function' && 'function' in tc) {
const fn = (tc as { function: { name: string; arguments: string } })
.function
contentBlocks.push({
type: 'tool_use',
id: tc.id ?? `toolu_${Date.now()}`,
name: fn.name,
input: JSON.parse(fn.arguments || '{}'),
})
}
}
}
const now = Date.now()
const requestId = response.id
const lastCompletion = getLastApiCompletionTimestamp()
logEvent('tengu_api_success', {
requestId:
requestId as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
querySource:
opts.querySource as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
model:
openaiModel as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
inputTokens: response.usage?.prompt_tokens ?? 0,
outputTokens: response.usage?.completion_tokens ?? 0,
cachedInputTokens: 0,
uncachedInputTokens: response.usage?.prompt_tokens ?? 0,
durationMsIncludingRetries: now - start,
timeSinceLastApiCallMs:
lastCompletion !== null ? now - lastCompletion : undefined,
})
setLastApiCompletionTimestamp(now)
const stopReason =
choice?.finish_reason === 'tool_calls'
? 'tool_use'
: choice?.finish_reason === 'length'
? 'max_tokens'
: 'end_turn'
return {
id: response.id,
type: 'message',
role: 'assistant',
content: contentBlocks as BetaMessage['content'],
model: openaiModel,
stop_reason: stopReason as BetaMessage['stop_reason'],
stop_sequence: null,
usage: {
input_tokens: response.usage?.prompt_tokens ?? 0,
output_tokens: response.usage?.completion_tokens ?? 0,
},
} as BetaMessage
}
/**
* Gemini side query. Converts Anthropic-format params to Gemini
* generateContent format, sends a non-streaming request via fetch,
* and wraps the response back into a BetaMessage shape.
*/
async function sideQueryViaGemini(
opts: SideQueryOptions,
): Promise<BetaMessage> {
const {
model,
system,
messages,
tools,
tool_choice,
max_tokens = 1024,
temperature,
signal,
} = opts
const normalizedModel = normalizeModelStringForAPI(model)
const geminiModel = resolveGeminiModel(normalizedModel)
// Build Gemini contents from Anthropic MessageParam[]
const contents: Array<{
role: 'user' | 'model'
parts: Array<{ text: string }>
}> = []
for (const m of messages) {
if (m.role !== 'user' && m.role !== 'assistant') continue
const text =
typeof m.content === 'string'
? m.content
: Array.isArray(m.content)
? m.content
.filter(
(b): b is { type: 'text'; text: string } => b.type === 'text',
)
.map(b => b.text)
.join('\n')
: ''
if (text) {
contents.push({
role: m.role === 'assistant' ? 'model' : 'user',
parts: [{ text }],
})
}
}
// Build system instruction
const systemText = extractSystemText(system)
const systemInstruction = systemText
? { parts: [{ text: systemText }] }
: undefined
// Convert tools and tool_choice
const geminiTools =
tools && tools.length > 0
? anthropicToolsToGemini(tools as BetaToolUnion[])
: undefined
const geminiToolConfig = tool_choice
? anthropicToolChoiceToGemini(tool_choice)
: undefined
const baseUrl = (
process.env.GEMINI_BASE_URL ||
'https://generativelanguage.googleapis.com/v1beta'
).replace(/\/+$/, '')
const modelPath = geminiModel.startsWith('models/')
? geminiModel
: `models/${geminiModel}`
const url = `${baseUrl}/${modelPath}:generateContent`
const body: Record<string, unknown> = {
contents,
...(systemInstruction && { systemInstruction }),
...(geminiTools && geminiTools.length > 0 && { tools: geminiTools }),
...(geminiToolConfig && {
toolConfig: { functionCallingConfig: geminiToolConfig },
}),
...(temperature !== undefined && {
generationConfig: { temperature },
}),
...(max_tokens !== undefined && {
generationConfig: {
...(temperature !== undefined && { temperature }),
maxOutputTokens: max_tokens,
},
}),
}
// Merge generationConfig if both temperature and max_tokens are set
if (temperature !== undefined && max_tokens !== undefined) {
body.generationConfig = { temperature, maxOutputTokens: max_tokens }
}
const start = Date.now()
const res = await fetch(url, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': process.env.GEMINI_API_KEY || '',
},
body: JSON.stringify(body),
signal,
})
if (!res.ok) {
const errorBody = await res.text()
throw new Error(
`Gemini API request failed (${res.status} ${res.statusText}): ${errorBody || 'empty response body'}`,
)
}
const geminiResponse = (await res.json()) as {
candidates?: Array<{
content?: {
role?: string
parts?: Array<{
text?: string
functionCall?: { name?: string; args?: Record<string, unknown> }
}>
}
finishReason?: string
}>
usageMetadata?: {
promptTokenCount?: number
candidatesTokenCount?: number
totalTokenCount?: number
}
id?: string
}
// Build content blocks from Gemini response
const contentBlocks: Array<
| { type: 'text'; text: string }
| { type: 'tool_use'; id: string; name: string; input: unknown }
> = []
const candidate = geminiResponse.candidates?.[0]
const parts = candidate?.content?.parts
if (parts) {
for (const part of parts) {
if (part.text) {
contentBlocks.push({ type: 'text', text: part.text })
}
if (part.functionCall) {
contentBlocks.push({
type: 'tool_use',
id: `toolu_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`,
name: part.functionCall.name ?? '',
input: part.functionCall.args ?? {},
})
}
}
}
const now = Date.now()
const lastCompletion = getLastApiCompletionTimestamp()
logEvent('tengu_api_success', {
requestId: (geminiResponse.id ??
'') as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
querySource:
opts.querySource as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
model:
geminiModel as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
inputTokens: geminiResponse.usageMetadata?.promptTokenCount ?? 0,
outputTokens: geminiResponse.usageMetadata?.candidatesTokenCount ?? 0,
cachedInputTokens: 0,
uncachedInputTokens: geminiResponse.usageMetadata?.promptTokenCount ?? 0,
durationMsIncludingRetries: now - start,
timeSinceLastApiCallMs:
lastCompletion !== null ? now - lastCompletion : undefined,
})
setLastApiCompletionTimestamp(now)
const stopReason =
candidate?.finishReason === 'STOP'
? 'end_turn'
: candidate?.finishReason === 'MAX_TOKENS'
? 'max_tokens'
: 'end_turn'
return {
id: geminiResponse.id ?? `gemini_${Date.now()}`,
type: 'message',
role: 'assistant',
content: contentBlocks as BetaMessage['content'],
model: geminiModel,
stop_reason: stopReason as BetaMessage['stop_reason'],
stop_sequence: null,
usage: {
input_tokens: geminiResponse.usageMetadata?.promptTokenCount ?? 0,
output_tokens: geminiResponse.usageMetadata?.candidatesTokenCount ?? 0,
},
} as BetaMessage
}