This guide walks you through installing the SDK, fetching a prompt, and making your first LLM call with telemetry.

Prerequisites

  • Python 3.10+
  • A Moxn account and API key (get one at moxn.dev)
  • An LLM provider API key (Anthropic, OpenAI, or Google)

Installation

Install the Moxn SDK using pip:
pip install moxn
Set your API key as an environment variable:
export MOXN_API_KEY="your-api-key"
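You can verify the key is visible to your Python process before going further (assuming, per the export above, that MoxnClient reads MOXN_API_KEY from the environment):
import os

if not os.environ.get("MOXN_API_KEY"):
    raise RuntimeError("MOXN_API_KEY is not set")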

Your First Moxn Call

Here’s a complete example that fetches a prompt, creates a session, sends it to Claude, and logs the interaction:
import asyncio
from moxn import MoxnClient
from moxn.types.content import Provider
from anthropic import Anthropic

# YourPromptInput is a generated Pydantic model (see "Using Code Generation" below)
from models.your_task_models import YourPromptInput

async def main():
    # Initialize the Moxn client
    async with MoxnClient() as client:
        # Fetch a prompt from your task (optional here; create_prompt_session
        # takes the same identifiers). Use branch_name for development,
        # commit_id for production.
        prompt = await client.get_prompt(
            prompt_id="your-prompt-id",
            branch_name="main"
        )

        # Create a session with your input data
        session = await client.create_prompt_session(
            prompt_id="your-prompt-id",
            branch_name="main",
            session_data=YourPromptInput(
                query="How do I reset my password?",
                user_id="user_123"
            )
        )

        # Convert to Anthropic format and send
        anthropic = Anthropic()
        response = anthropic.messages.create(
            **session.to_anthropic_invocation()
        )

        print(response.content[0].text)

        # Log telemetry
        async with client.span(session) as span:
            await client.log_telemetry_event_from_response(
                session, response, Provider.ANTHROPIC
            )

asyncio.run(main())

From Template to Provider Call

The code above follows the core Moxn pattern (see Core Concepts):
  1. Fetch template: your prompt (messages, variables, config) is retrieved from Moxn
  2. Create session: template + your session_data = a session ready to render
  3. Build invocation: to_anthropic_invocation() returns a plain Python dict
  4. Call provider: you pass the dict to the provider SDK with **invocation
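Condensed, and using the same identifiers as the full example above, the four steps map to four calls:
prompt = await client.get_prompt(prompt_id="your-prompt-id", branch_name="main")  # 1. fetch
session = await client.create_prompt_session(                                    # 2. session
    prompt_id="your-prompt-id", branch_name="main", session_data=YourPromptInput(...)
)
invocation = session.to_anthropic_invocation()                                   # 3. build
response = anthropic.messages.create(**invocation)                               # 4. call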

You Control the Payload

The invocation is just a dictionary—you can inspect, modify, or extend it:
invocation = session.to_anthropic_invocation()

# Add streaming
invocation["stream"] = True

# Use a new provider feature
invocation["extra_headers"] = {"anthropic-beta": "citations-2025-01-01"}

# Override settings for A/B testing
invocation["model"] = "claude-sonnet-4-20250514"

response = anthropic.messages.create(**invocation)
This design is intentional: Moxn helps you build provider-specific payloads, but it never owns the integration. You always call the provider SDK directly, with no wrapper, middleware, or magic. If a provider releases a new feature tomorrow, you can use it immediately by adding the corresponding parameters to the dict.
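Because the invocation is a plain dict, you can also inspect it before sending (the exact keys depend on your prompt's configuration):
invocation = session.to_anthropic_invocation()
print(sorted(invocation))         # e.g. ['max_tokens', 'messages', 'model', ...]
print(invocation["messages"][0])  # the first rendered message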

Understanding the Code

Let’s break down what’s happening:

1. MoxnClient as Context Manager

async with MoxnClient() as client:
The client manages connections and telemetry batching. Always use it as an async context manager to ensure proper cleanup.

2. Fetching Prompts

prompt = await client.get_prompt(
    prompt_id="your-prompt-id",
    branch_name="main"  # or commit_id="abc123" for production
)
  • Branch access (branch_name): Gets the latest version, including uncommitted changes. Use for development.
  • Commit access (commit_id): Gets an immutable snapshot. Use for production.
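For production, the same call pins to an immutable snapshot:
prompt = await client.get_prompt(
    prompt_id="your-prompt-id",
    commit_id="abc123"
)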

3. Creating Sessions

prompt_session = await client.create_prompt_session(
    prompt_id="your-prompt-id",
    branch_name="main",
    session_data=YourInputModel(...)  # Pydantic model from codegen
)
A prompt session combines your prompt template with runtime data. The session_data is typically a Pydantic model generated by codegen. The session also holds the message history: you can append additional messages or context, or append an LLM response followed by further user messages, and the session maintains the conversation history throughout. Each telemetry logging call logs the session independently.
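As a sketch of what a multi-turn flow could look like (the append method names here are illustrative placeholders, not confirmed SDK API):
# NOTE: append_response / append_user_message are hypothetical names
session.append_response(response)
session.append_user_message("Does that also work for SSO accounts?")
followup = anthropic.messages.create(**session.to_anthropic_invocation())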

4. Converting to Provider Format

response = anthropic.messages.create(
    **prompt_session.to_anthropic_invocation()
)
The to_*_invocation() methods return complete payloads you can unpack directly into provider SDKs. They include:
  • Messages formatted for the provider
  • Model configuration from your prompt
  • Tools and structured output schemas (if configured)
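For example, assuming the OpenAI variant follows the same to_*_invocation() naming pattern (the method name here is an assumption, not confirmed):
from openai import OpenAI

openai_client = OpenAI()
response = openai_client.chat.completions.create(
    **session.to_openai_invocation()  # assumed name, mirroring to_anthropic_invocation()
)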

5. Logging Telemetry

async with client.span(session) as span:
    await client.log_telemetry_event_from_response(
        session, response, Provider.ANTHROPIC
    )
Spans create observable traces of your LLM calls. Every call within a span is linked for debugging and analysis.
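Because events logged within the same span are linked, a multi-step workflow can share one trace:
async with client.span(session) as span:
    first = anthropic.messages.create(**session.to_anthropic_invocation())
    await client.log_telemetry_event_from_response(session, first, Provider.ANTHROPIC)
    # later calls in the same workflow, logged here, land in the same trace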

Using Code Generation

For type-safe session data, generate Pydantic models from your prompts:
# Generate models for all prompts in a task
async with MoxnClient() as client:
    await client.generate_task_models(
        task_id="your-task-id",
        branch_name="main",
        output_dir="./models"
    )
This creates a Python file with Pydantic models matching your prompt’s input schema:
# models/your_task_models.py (generated)
from datetime import datetime
from typing import TypedDict
from pydantic import BaseModel
from moxn.types.base import RenderableModel


class DocumentRendered(TypedDict):
    """Flattened string representation for prompt template injection."""
    title: str
    created_at: str
    last_modified: str
    author: str
    content: str


class Document(BaseModel):
    title: str
    created_at: datetime
    last_modified: datetime | None = None
    author: str
    content: str

    def render(self, **kwargs) -> DocumentRendered:
        """Render to flattened dictionary for prompt variable substitution."""
        result: DocumentRendered = {
            "title": self.title,
            "created_at": self.created_at.isoformat(),
            "last_modified": self.last_modified.isoformat() if self.last_modified else "",
            "author": self.author,
            "content": self.content,
        }
        return result


class YourPromptInputRendered(TypedDict):
    """Flattened representation - keys match your prompt's variable blocks."""
    query: str
    user_id: str
    context: str


class YourPromptInput(RenderableModel):
    query: str
    user_id: str
    context: list[Document] | None = None

    def render(self, **kwargs) -> YourPromptInputRendered:
        # Render context as XML document collection
        if self.context:
            doc_xmls = []
            for doc in self.context:
                data = doc.render(**kwargs)  # Returns DocumentRendered
                attrs = f'title="{data["title"]}" author="{data["author"]}" created_at="{data["created_at"]}"'
                if data["last_modified"]:
                    attrs += f' last_modified="{data["last_modified"]}"'
                doc_xmls.append(f"<document {attrs}>\n{data['content']}\n</document>")
            context_str = "<context>\n" + "\n".join(doc_xmls) + "\n</context>"
        else:
            context_str = ""

        return {
            "query": self.query,
            "user_id": self.user_id,
            "context": context_str,
        }
The render() method transforms your typed data into strings for prompt injection. This example renders documents as XML—a format that works well for providing structured context to LLMs. Then use it in your code:
from models.your_task_models import YourPromptInput, Document
from datetime import datetime

session = await client.create_prompt_session(
    prompt_id="your-prompt-id",
    session_data=YourPromptInput(
        query="How do I reset my password?",
        user_id="user_123",
        context=[
            Document(
                title="Password Reset Guide",
                created_at=datetime(2024, 1, 15),
                author="Support Team",
                content="To reset your password, click 'Forgot Password' on the login page..."
            ),
            Document(
                title="Account Security FAQ",
                created_at=datetime(2024, 2, 1),
                last_modified=datetime(2024, 3, 10),
                author="Security Team",
                content="We recommend using a password manager and enabling 2FA..."
            )
        ]
    )
)
This renders into the prompt as:
<context>
<document title="Password Reset Guide" author="Support Team" created_at="2024-01-15T00:00:00">
To reset your password, click 'Forgot Password' on the login page...
</document>
<document title="Account Security FAQ" author="Security Team" created_at="2024-02-01T00:00:00" last_modified="2024-03-10T00:00:00">
We recommend using a password manager and enabling 2FA...
</document>
</context>

The Render Pipeline

When you create a prompt session, your typed data transforms through several stages:
┌─────────────────────────────────────────────────────────────────────────┐
│  YOUR CODE                                                              │
│  ─────────                                                              │
│  YourPromptInput(                    ← Pydantic BaseModel               │
│    query="How do I reset...",          (typed, validated)              │
│    user_id="user_123",                                                 │
│    context=[Document(...), ...]                                        │
│  )                                                                     │
└─────────────────────────────────────────────────────────────────────────┘

                                    │ .render(**kwargs)

┌─────────────────────────────────────────────────────────────────────────┐
│  RENDERED REPRESENTATION                                                │
│  ───────────────────────                                                │
│  YourPromptInputRendered = {       ← TypedDict (all string values)     │
│    "query": "How do I reset...",                                       │
│    "user_id": "user_123",                                              │
│    "context": "<context>...</context>",                                │
│  }                                                                     │
└─────────────────────────────────────────────────────────────────────────┘

                                    │ to_anthropic_invocation()
                                    │ (variables matched by name)

┌─────────────────────────────────────────────────────────────────────────┐
│  PROVIDER PAYLOAD                                                       │
│  ────────────────                                                       │
│  {                                                                     │
│    "model": "claude-sonnet-4-20250514",                                │
│    "messages": [                                                       │
│      {"role": "user", "content": "How do I reset... <context>..."}    │
│    ],                                                                  │
│    ...                                                                 │
│  }                                                                     │
└─────────────────────────────────────────────────────────────────────────┘

How It Works

When you create a prompt session with session_data, the SDK:
  1. Stores your typed model — The Pydantic instance you pass to create_prompt_session()
  2. Calls render() at invocation time — When you call to_anthropic_invocation() (or other providers)
  3. Substitutes variables by name — Each key in the rendered dict matches a {{variable}} block in your prompt
# Your code
session = await client.create_prompt_session(
    prompt_id="your-prompt-id",
    session_data=YourPromptInput(...)  # ← Stored as-is
)

# When you call this, render() is invoked automatically
invocation = session.to_anthropic_invocation()
# session_data.render() → YourPromptInputRendered → variable substitution
Telemetry captures both representations:
  • Session data: Your original typed model (enables re-rendering with different logic)
  • Rendered input: The flattened strings that were injected into the prompt

Why Two Representations?

The dual-representation pattern serves distinct purposes:
  • BaseModel (session data): a Pydantic model, a type-safe data structure with validation and rich types (datetime, nested objects, lists)
  • *Rendered (TypedDict): a dict[str, str] of flat string values ready for template injection
This separation enables:
  • Re-rendering: Change how data is formatted without changing the data itself
  • Telemetry: Log the structured input separately from the rendered output
  • Testing: Compare different rendering strategies using the same session data
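For example, using the generated models from above, you can call render() directly on session data to check a rendering strategy without touching a provider:
data = YourPromptInput(
    query="How do I reset my password?",
    user_id="user_123",
    context=[
        Document(
            title="Password Reset Guide",
            created_at=datetime(2024, 1, 15),
            author="Support Team",
            content="To reset your password, click 'Forgot Password'...",
        )
    ],
)
rendered = data.render()
assert set(rendered) == {"query", "user_id", "context"}
assert rendered["context"].startswith("<context>")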

Next Steps