Overview
Send a POST /chat/completions request with your chosen model and conversation history.
Luntrex handles routing, guardrails, and telemetry while returning a unified payload.
Request payload
Submit your prompt along with any system messages. Optional fields include max_tokens,
evaluation tags, and additional routing hints.
{
  "model": "openai/gpt-4.1-mini",
  "messages": [
    { "role": "user", "content": "Write a short rhyming poem about the ocean." }
  ],
  "max_tokens": 512
}
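The request above can be built and sent from Python with only the standard library. This is a minimal sketch: the base URL and the bearer-token Authorization header are assumptions, not documented Luntrex values — substitute your real endpoint and key.

```python
import json
import urllib.request

# Assumed base URL — replace with your actual Luntrex endpoint.
BASE_URL = "https://api.luntrex.example"

def build_chat_request(api_key, model, messages, max_tokens=512):
    """Construct a POST /chat/completions request with a JSON payload."""
    payload = {"model": model, "messages": messages, "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Bearer auth is an assumption; check your account settings.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_chat_request(
    "YOUR_API_KEY",
    "openai/gpt-4.1-mini",
    [{"role": "user", "content": "Write a short rhyming poem about the ocean."}],
)
# To send it against a live endpoint:
#   with urllib.request.urlopen(req) as resp:
#       body = json.loads(resp.read())
```

The helper only constructs the request object, so you can inspect or log the payload before dispatching it.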
Response payload
Every response includes the generated answer, token accounting, and your remaining balance.
{
  "answer": "Foam-flecked verses drift and sway...",
  "usage": { "input_tokens": 12, "output_tokens": 84 },
  "balance_left": 49321
}
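Decoding the response is straightforward: the body is plain JSON, so total token usage can be summed from the usage object. A short sketch using the sample payload above:

```python
import json

# Sample response body, as shown above.
raw = """{
  "answer": "Foam-flecked verses drift and sway...",
  "usage": { "input_tokens": 12, "output_tokens": 84 },
  "balance_left": 49321
}"""

resp = json.loads(raw)

# Total tokens consumed by this call.
total_tokens = resp["usage"]["input_tokens"] + resp["usage"]["output_tokens"]

print(resp["answer"])
print(f"tokens used: {total_tokens}, balance left: {resp['balance_left']}")
```

Tracking balance_left after each call is a simple way to alert before your quota runs out.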