Moxn supports multiple LLM providers. This guide covers how to convert prompts to provider-specific formats and handle their responses.
Supported Providers
from moxn.types.content import Provider
Provider.ANTHROPIC # Claude (Anthropic)
Provider.OPENAI_CHAT # GPT models (OpenAI Chat Completions)
Provider.OPENAI_RESPONSES # OpenAI Responses API
Provider.GOOGLE_GEMINI # Gemini (Google AI Studio)
Provider.GOOGLE_VERTEX # Gemini (Vertex AI)
Anthropic (Claude)
Basic usage
from moxn import MoxnClient
from moxn.types.content import Provider
from anthropic import Anthropic
async with MoxnClient() as client:
    session = await client.create_prompt_session(
        prompt_id="...",
        session_data=your_input
    )

    anthropic = Anthropic()
    response = anthropic.messages.create(
        **session.to_anthropic_invocation()
    )

    # Log telemetry
    async with client.span(session) as span:
        await client.log_telemetry_event_from_response(
            session, response, Provider.ANTHROPIC
        )
What to_anthropic_invocation() returns
{
"system": "Your system message...", # Optional
"messages": [
{"role": "user", "content": "..."},
{"role": "assistant", "content": "..."}
],
"model": "claude-sonnet-4-20250514", # From completion_config
"max_tokens": 4096,
"temperature": 0.7,
# If tools configured:
"tools": [...],
"tool_choice": {"type": "auto"},
# If structured output configured:
"output_format": {...}
}
Extended thinking
For Claude models with extended thinking:
response = anthropic.messages.create(
**session.to_anthropic_invocation(
thinking={"type": "enabled", "budget_tokens": 10000}
),
extra_headers={"anthropic-beta": "interleaved-thinking-2025-05-14"}
)
Structured outputs
If your prompt has a structured output schema configured:
response = anthropic.messages.create(
**session.to_anthropic_invocation(),
extra_headers={"anthropic-beta": "structured-outputs-2025-11-13"}
)
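The structured output comes back as JSON text in the response. A minimal way to read it, assuming the model returns the JSON in the response's first (and only) text block:
import json

# Assumes a single text block holding the structured JSON output
data = json.loads(response.content[0].text)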
OpenAI (GPT)
Chat Completions API
from openai import OpenAI
openai = OpenAI()
response = openai.chat.completions.create(
**session.to_openai_chat_invocation()
)
# Log telemetry
async with client.span(session) as span:
    await client.log_telemetry_event_from_response(
        session, response, Provider.OPENAI_CHAT
    )
What to_openai_chat_invocation() returns
{
"messages": [
{"role": "system", "content": "..."},
{"role": "user", "content": "..."},
{"role": "assistant", "content": "..."}
],
"model": "gpt-4o",
"max_tokens": 4096,
"temperature": 0.7,
# If tools configured:
"tools": [...],
"tool_choice": "auto",
"parallel_tool_calls": True,
# If structured output configured:
"response_format": {...}
}
Responses API
For OpenAI’s newer Responses API:
response = openai.responses.create(
**session.to_openai_responses_invocation()
)
await client.log_telemetry_event_from_response(
session, response, Provider.OPENAI_RESPONSES
)
Reasoning models
For o1, o3, and other reasoning models:
response = openai.chat.completions.create(
**session.to_openai_chat_invocation(
thinking={"reasoning_effort": "high"}
)
)
Google (Gemini)
Google AI Studio
from google import genai
google_client = genai.Client()
response = google_client.models.generate_content(
**session.to_google_gemini_invocation()
)
# Log telemetry
async with client.span(session) as span:
    await client.log_telemetry_event_from_response(
        session, response, Provider.GOOGLE_GEMINI
    )
Vertex AI
from google import genai
vertex_client = genai.Client(vertexai=True)
response = vertex_client.models.generate_content(
**session.to_google_vertex_invocation()
)
await client.log_telemetry_event_from_response(
session, response, Provider.GOOGLE_VERTEX
)
What to_google_gemini_invocation() returns
{
"model": "gemini-2.5-flash",
"contents": [...], # Conversation content
"config": {
"system_instruction": "...",
"max_output_tokens": 4096,
"temperature": 0.7,
# If tools configured:
"tools": [{"function_declarations": [...]}],
"tool_config": {...},
# If structured output:
"response_schema": {...},
"response_mime_type": "application/json"
}
}
Thinking models
For Gemini thinking models:
response = google_client.models.generate_content(
**session.to_google_gemini_invocation(
thinking={"thinking_budget": 10000}
)
)
Generic Provider Method
Use to_invocation() for provider-agnostic code:
from moxn.types.content import Provider
# Use provider from prompt's completion_config
payload = session.to_invocation()
# Or specify explicitly
payload = session.to_invocation(provider=Provider.ANTHROPIC)
# With overrides
payload = session.to_invocation(
provider=Provider.OPENAI_CHAT,
model="gpt-4o",
max_tokens=8000,
temperature=0.5
)
Message-Only Methods
If you only need messages (without model config):
# Generic method (uses stored provider from completion_config)
messages = session.to_messages()
# Or override provider explicitly
messages = session.to_messages(provider=Provider.ANTHROPIC)
# Or use provider-specific methods
anthropic_payload = session.to_anthropic_messages() # {system, messages}
openai_payload = session.to_openai_chat_messages() # {messages}
google_payload = session.to_google_gemini_messages() # {system_instruction, content}
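Message-only payloads pair naturally with your own model settings instead of the stored completion_config. A small sketch with the Anthropic client from above (the model name and token limit here are illustrative):
payload = session.to_anthropic_messages()  # {system, messages}
response = anthropic.messages.create(
    model="claude-sonnet-4-20250514",  # your own choice, not taken from completion_config
    max_tokens=1024,
    **payload,
)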
Parsing Responses
Parse any provider’s response to a normalized format:
# Parse response (uses stored provider from completion_config)
parsed = session.parse_response(response)
# Or override provider explicitly
# parsed = session.parse_response(response, provider=Provider.ANTHROPIC)
# Access normalized data
parsed.candidates # list[Candidate] - response options
parsed.input_tokens # int | None
parsed.output_tokens # int | None
parsed.model # str | None
parsed.stop_reason # str | None
parsed.raw_response # dict - original response
parsed.provider # Provider
# Each candidate has content blocks
for candidate in parsed.candidates:
    for block in candidate.content:
        match block.block_type:
            case "text":
                print(block.text)
            case "tool_call":
                print(f"{block.tool_name}: {block.input}")
            case "thinking":
                print(f"Thinking: {block.text}")
Tool Use
If your prompt has tools configured, they’re automatically included in the invocation:
# Tools are included in the invocation
response = anthropic.messages.create(
**session.to_anthropic_invocation()
)
# Check for tool calls in response
parsed = session.parse_response(response)
for candidate in parsed.candidates:
    for block in candidate.content:
        if block.block_type == "tool_call":
            # Execute the tool
            result = execute_tool(block.tool_name, block.input)
            # Add tool result to the session (if doing multi-turn),
            # then call the LLM again (see the sketch below)
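One way to continue the turn is at the raw Anthropic level; this is a sketch against the Anthropic SDK's tool-use message format, not a Moxn session API:
# Append the assistant turn and the tool result, then call the model again
payload = session.to_anthropic_invocation()
tool_use = next(b for b in response.content if b.type == "tool_use")
payload["messages"].append({"role": "assistant", "content": response.content})
payload["messages"].append({
    "role": "user",
    "content": [{
        "type": "tool_result",
        "tool_use_id": tool_use.id,
        "content": str(result),
    }],
})
follow_up = anthropic.messages.create(**payload)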
The SDK translates tool_choice across providers:
| Moxn Setting | Anthropic | OpenAI | Google |
|---|---|---|---|
| "auto" | {"type": "auto"} | "auto" | {"mode": "AUTO"} |
| "required" | {"type": "any"} | "required" | {"mode": "ANY"} |
| "none" | Tools omitted | "none" | {"mode": "NONE"} |
| "tool_name" | {"type": "tool", "name": "..."} | {"type": "function", "function": {"name": "..."}} | {"mode": "ANY", ...} |
Multimodal Content
Images and PDFs are automatically converted to provider-specific formats:
# Anthropic: base64 with media_type
{
"type": "image",
"source": {
"type": "base64",
"media_type": "image/png",
"data": "..."
}
}
# OpenAI: data URI
{
"type": "image_url",
"image_url": {"url": "data:image/png;base64,..."}
}
# Google: Part with inline_data
Part(inline_data=Blob(mime_type="image/png", data=...))
The SDK handles signed URL refresh automatically for images and PDFs stored in cloud storage.
Error Handling
Handle provider-specific errors:
import asyncio

from anthropic import APIError as AnthropicError
from openai import APIError as OpenAIError

try:
    response = anthropic.messages.create(
        **session.to_anthropic_invocation()
    )
except AnthropicError as e:
    if "rate_limit" in str(e):
        # Handle rate limiting
        await asyncio.sleep(60)
    else:
        raise
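The same pattern carries over to the other providers, for example the OpenAI client with the OpenAIError alias imported above:
try:
    response = openai.chat.completions.create(
        **session.to_openai_chat_invocation()
    )
except OpenAIError as e:
    # Inspect e (status code, error type) and retry, fall back, or re-raise
    raise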
Provider Feature Matrix
| Feature | Anthropic | OpenAI Chat | OpenAI Responses | Google |
|---|---|---|---|---|
| System messages | Separate field | In messages | Instructions | Config |
| Tools | Yes | Yes | Yes | Yes |
| Structured output | Yes (beta) | Yes | Yes | Yes |
| Images | Yes | Yes | Yes | Yes |
| PDFs | Yes | Yes | Limited | Yes |
| Extended thinking | Yes | Yes (o1/o3) | Yes | Yes |
| Streaming | SDK level | SDK level | SDK level | SDK level |
Next Steps