{ "text": "Dear Alice,\n\nWelcome to Tracia! We're thrilled to have you join our community...", "spanId": "sp_abc123xyz", "traceId": "tr_session789", "promptVersion": 3, "latencyMs": 1250, "usage": { "inputTokens": 45, "outputTokens": 120, "totalTokens": 165 }, "cost": 0.0049, "finishReason": "stop", "toolCalls": null, "structuredOutput": null}
Run a prompt with variable substitution and get the generated response. This endpoint handles template rendering, makes the LLM API call, and automatically logs a span.
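A minimal sketch of calling this endpoint from TypeScript, assuming a POST route such as `/v1/prompts/run`, bearer-token auth, and request fields named `prompt` and `variables`; those names are placeholders, while the response fields mirror the example response above.

```typescript
// Sketch only: the endpoint path, auth header, and request field names
// ("prompt", "variables") are assumptions, not confirmed by this reference.
interface RunPromptResponse {
  text: string;
  spanId: string;
  traceId: string;
  promptVersion: number;
  latencyMs: number;
  usage: { inputTokens: number; outputTokens: number; totalTokens: number };
  cost: number;
  finishReason: string;
  toolCalls: unknown;        // null in the example response above
  structuredOutput: unknown; // null in the example response above
}

async function runPrompt(
  prompt: string,
  variables: Record<string, string>,
): Promise<RunPromptResponse> {
  const res = await fetch("https://api.tracia.example/v1/prompts/run", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.TRACIA_API_KEY ?? ""}`,
    },
    // Variables are substituted into the prompt template server-side.
    body: JSON.stringify({ prompt, variables }),
  });
  if (!res.ok) throw new Error(`Run failed: ${res.status}`);
  return (await res.json()) as RunPromptResponse;
}

// Example: render a welcome-email prompt for Alice (prompt name is illustrative).
runPrompt("welcome-email", { name: "Alice" }).then((r) =>
  console.log(r.text, r.usage.totalTokens),
);
```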
Full conversation messages for multi-turn tool calling. When provided, template rendering is skipped and these messages are sent directly to the LLM. Each message has a role (system/developer/user/assistant/tool), content, and optionally a toolCallId/toolName for tool result messages.
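A sketch of such a messages array when returning a tool result: the role values and the toolCallId/toolName fields follow the description above, while the concrete tool, ids, and contents are made up for illustration.

```typescript
// Message shape as described above; the type names are illustrative.
type Role = "system" | "developer" | "user" | "assistant" | "tool";

interface Message {
  role: Role;
  content: string;
  toolCallId?: string; // only on tool result messages
  toolName?: string;   // only on tool result messages
}

// Multi-turn tool calling: the assistant asked for a tool, we append the
// tool's result, and send the whole history back. Template rendering is
// skipped because messages are provided.
const messages: Message[] = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "What's the weather in Paris?" },
  { role: "assistant", content: "Let me check the weather for you." },
  {
    role: "tool",
    content: JSON.stringify({ tempC: 18, sky: "cloudy" }),
    toolCallId: "call_123",   // id echoed from the assistant's tool call
    toolName: "get_weather",
  },
];
```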
Full conversation messages (rendered input + assistant response) for multi-turn continuation. Pass these back in the next request’s messages field to continue the conversation.
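A sketch of the continuation loop under the same assumptions as the earlier sketch (hypothetical endpoint path and auth), reusing the Message shape defined above and assuming the response carries the conversation in a `messages` field as described.

```typescript
// Continuation sketch: send the previously returned messages plus a new
// user turn, then keep the messages from the new response for the next turn.
async function continueConversation(
  previousMessages: Message[],
  userReply: string,
): Promise<Message[]> {
  const res = await fetch("https://api.tracia.example/v1/prompts/run", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.TRACIA_API_KEY ?? ""}`,
    },
    body: JSON.stringify({
      messages: [...previousMessages, { role: "user", content: userReply }],
    }),
  });
  if (!res.ok) throw new Error(`Run failed: ${res.status}`);
  const data = (await res.json()) as { messages: Message[] };
  return data.messages; // rendered input + assistant response, ready for the next call
}
```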