Run a prompt with variable substitution and get the generated response. This endpoint handles template rendering, LLM API calls, and automatically logs a span.
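Before the field-by-field reference, here is a minimal sketch of calling this endpoint from TypeScript with fetch. The URL, headers, and body mirror the curl example further down; the prompt slug welcome-email and the API key are placeholders.

```typescript
// Minimal sketch: run a prompt with variables and read the generated text.
// URL, headers, and fields mirror the curl/JSON examples on this page.
async function runWelcomeEmail(): Promise<void> {
  const res = await fetch("https://app.tracia.io/api/v1/prompts/welcome-email/run", {
    method: "POST",
    headers: {
      Authorization: "Bearer tr_your_api_key", // placeholder key
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      variables: { name: "Alice", product: "Tracia" },
      tags: ["onboarding", "email"],
      userId: "user_123",
    }),
  });

  if (!res.ok) throw new Error(`Run failed with status ${res.status}`);

  const data = await res.json();
  console.log(data.text);   // generated text from the LLM
  console.log(data.spanId); // span automatically logged for this run
}

runWelcomeEmail().catch(console.error);
```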
Request
Bearer token with your API key: Bearer tr_your_api_key
Key-value pairs for template variables. Must include all variables required by the prompt.
Override the default model (e.g., gpt-4o, claude-sonnet-4-20250514)
Run a specific prompt version. If omitted, the latest version is used. Useful for pinning a known-good version in production.
Tags for filtering spans in the dashboard
End user identifier for tracking
Session identifier for grouping related spans
Group related spans together (session ID for multi-turn conversations)
Link to parent span (creates a chain). When provided without traceId, the trace ID is inherited from the parent span.
Full conversation messages for multi-turn tool calling. When provided, template rendering is skipped and these messages are sent directly to the LLM. Each message has role (system/developer/user/assistant/tool), content, and optionally toolCallId/toolName for tool result messages. A request-body sketch follows this parameter list.
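To keep the optional fields straight, the sketch below collects the parameters above into one TypeScript request type. Only variables, tags, userId, traceId, and messages are confirmed by the examples on this page; model, version, sessionId, and parentSpanId are assumed key names for parameters whose exact JSON keys are not shown here.

```typescript
// Sketch of the request body described above. Keys marked "assumed name"
// are guesses for parameters whose exact JSON field names are not shown on this page.
type RunPromptMessage = {
  role: "system" | "developer" | "user" | "assistant" | "tool";
  content: string;
  toolCallId?: string; // only for tool result messages
  toolName?: string;   // only for tool result messages
};

type RunPromptRequest = {
  variables?: Record<string, string>; // must include every variable the prompt requires
  model?: string;                     // assumed name: override the default model
  version?: number;                   // assumed name: pin a specific prompt version
  tags?: string[];                    // tags for filtering spans in the dashboard
  userId?: string;                    // end user identifier for tracking
  sessionId?: string;                 // assumed name: session identifier
  traceId?: string;                   // group related spans (multi-turn conversations)
  parentSpanId?: string;              // assumed name: link to a parent span; trace ID inherited if traceId omitted
  messages?: RunPromptMessage[];      // skips template rendering when provided
};
```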
Response
The generated text from the LLM
Unique identifier for this span
Session identifier (same as spanId if not part of session)
Version of the prompt that was used
Total request latency in milliseconds
Why the model stopped generating: stop, max_tokens, or tool_calls
Tool calls made by the model (when the prompt has tools configured)
Unique identifier for the tool call
Arguments passed to the tool
Parsed JSON object when the prompt has an output schema configured. The output conforms to the JSON schema defined in the prompt settings.
Full conversation messages (rendered input + assistant response) for multi-turn continuation. Pass these back in the next request's messages field to continue the conversation; a continuation sketch follows this list.
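Putting finishReason, toolCalls, and messages together, a multi-turn tool-calling loop might look like the sketch below. Only the fields documented above are relied on; the executeTool helper and the exact shape of a tool call beyond id and arguments are assumptions for illustration.

```typescript
// Hypothetical continuation loop: run the prompt, execute any requested tools,
// and send tool results back through the messages field until the model finishes.
type ToolCall = { id: string; arguments: unknown }; // only the documented fields

// Hypothetical tool executor: dispatch on the call and return a JSON-serializable result.
async function executeTool(call: ToolCall): Promise<unknown> {
  return { ok: true }; // placeholder
}

async function runWithTools(runUrl: string, apiKey: string, variables: Record<string, string>) {
  let body: Record<string, unknown> = { variables };

  for (;;) {
    const res = await fetch(runUrl, {
      method: "POST",
      headers: { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" },
      body: JSON.stringify(body),
    });
    const data = await res.json();

    if (data.finishReason !== "tool_calls") {
      // Plain text, or the parsed object when the prompt has an output schema.
      return data.structuredOutput ?? data.text;
    }

    // Append one tool result message per tool call, then continue the conversation.
    const toolMessages = await Promise.all(
      (data.toolCalls as ToolCall[]).map(async (call) => ({
        role: "tool",
        content: JSON.stringify(await executeTool(call)),
        toolCallId: call.id,
      }))
    );

    body = { messages: [...data.messages, ...toolMessages] };
  }
}
```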
```bash
curl -X POST https://app.tracia.io/api/v1/prompts/welcome-email/run \
  -H "Authorization: Bearer tr_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "variables": {
      "name": "Alice",
      "product": "Tracia"
    },
    "tags": ["onboarding", "email"],
    "userId": "user_123"
  }'
```
Example responses are shown for 200, 200 Structured Output, 400 Missing Variables, 400 No Provider Key, 404, and 500 Provider Error. The plain 200 response:
```json
{
  "text": "Dear Alice,\n\nWelcome to Tracia! We're thrilled to have you join our community...",
  "spanId": "sp_abc123xyz",
  "traceId": "tr_session789",
  "promptVersion": 3,
  "latencyMs": 1250,
  "usage": {
    "inputTokens": 45,
    "outputTokens": 120,
    "totalTokens": 165
  },
  "cost": 0.0049,
  "finishReason": "stop",
  "toolCalls": null,
  "structuredOutput": null
}
```
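The non-200 cases above translate directly into client-side error handling. A minimal sketch, assuming the error body is plain text (its exact shape is not documented on this page):

```typescript
// Hedged sketch of handling the documented status codes. Only the status codes
// themselves come from this page; the error body shape is an assumption.
async function readRunResponse(res: Response): Promise<unknown> {
  if (res.ok) {
    const data = await res.json();
    // Prefer the parsed object when the prompt has an output schema configured.
    return data.structuredOutput ?? data.text;
  }

  const detail = await res.text();
  switch (res.status) {
    case 400:
      // Missing template variables, or no provider key configured.
      throw new Error(`Bad request: ${detail}`);
    case 404:
      throw new Error(`Not found: ${detail}`);
    case 500:
      // Provider error: the upstream LLM call failed; a retry may be reasonable.
      throw new Error(`Provider error: ${detail}`);
    default:
      throw new Error(`Unexpected status ${res.status}: ${detail}`);
  }
}
```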