Create Chat Completion
Send a list of messages and get back an assistant reply. Supports streaming, tool calls, structured output, and multimodal content parts.
Sends a request for a model response for the given chat conversation.
Supports both streaming and non-streaming modes. The endpoint mirrors
OpenAI's chat.completions.create so any OpenAI SDK works out of the box -
just point baseURL at https://api.aivene.com/v1.
POST /v1/chat/completions
Authentication
Authorization (Bearer, required): API key as bearer token in the Authorization header. Create keys at
Manage API Keys.
Headers
Content-Type (string, required): Must be application/json.
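As a minimal sketch of how the endpoint, bearer token, and Content-Type header fit together, the helper below assembles a request without sending it. The names buildChatRequest and ChatRequest are hypothetical, and the API key used with it would be a placeholder for a real one.

```typescript
// Hypothetical helper: assembles the pieces of a chat completion request.
// Nothing is sent here; the result can be handed to any HTTP client.
const BASE_URL = "https://api.aivene.com/v1";

interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(
  apiKey: string,
  payload: { model: string; messages: { role: string; content: string }[] },
): ChatRequest {
  return {
    url: `${BASE_URL}/chat/completions`,
    method: "POST",
    headers: {
      // The API key travels in the Authorization header as a bearer token.
      Authorization: `Bearer ${apiKey}`,
      // The endpoint requires a JSON body.
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  };
}
```

In a runtime that provides fetch, the result could be sent with fetch(req.url, { method: req.method, headers: req.headers, body: req.body }).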
Body
model (string, required): Model id from GET /v1/models. Example: gpt-4o-mini,
claude-sonnet-4, gemini-2.5-pro.
messages (Message[], required): Conversation history. Must contain at least one message. See Message shape below for the discriminated union.
stream (boolean, optional, default false): Stream partial deltas as Server-Sent Events instead of returning a single JSON response.
stream_options (object, optional): Only allowed when stream: true. Supports { include_usage: boolean } to
receive a final chunk with token totals.
max_completion_tokens (integer, optional): Cap on output tokens. max_tokens is accepted as an alias and rewritten
server-side.
temperature (number, optional, default 1): Between 0 and 2. Higher = more random.
top_p (number, optional, default 1): Between 0 and 1. Nucleus sampling. Use temperature or top_p,
not both.
n (integer, optional, default 1): Number of completions to generate. Most models only support 1.
stop (string | string[], optional): Up to 4 stop sequences. Generation halts before the sequence is emitted.
presence_penalty (number, optional, default 0): Between -2 and 2. Positive values discourage repeating tokens.
frequency_penalty (number, optional, default 0): Between -2 and 2. Positive values reduce verbatim repetition.
seed (integer, optional): Best-effort determinism on supported providers.
response_format (object, optional): { type: 'json_object' | 'text' | 'json_schema', json_schema?: ... }.
Use json_schema for strict structured output.
tools (Tool[], optional): Function / tool schemas the model can call. See Tool calling.
tool_choice (string | object, optional): 'auto', 'none', 'required', or
{ type: 'function', function: { name } } to force a specific call.
parallel_tool_calls (boolean, optional, default true): Allow the model to emit multiple tool calls in one turn.
reasoning_effort ('low' | 'medium' | 'high', optional): Hint for reasoning models. Ignored by non-reasoning models.
modalities (string[], optional): Request additional modalities like ['text', 'audio'].
logprobs (boolean, optional, default false): Include log probabilities in the response.
top_logprobs (integer, optional): Number of top tokens (0-20) to return when logprobs is true.
logit_bias (Record<string, number>, optional): Bias map keyed by token id. Values between -100 and 100.
metadata (object, optional): Free-form key/value attached to the request for logging.
user (string, optional): End-user identifier for abuse tracking. safety_identifier is accepted
as an alias.
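Several of the body parameters carry numeric ranges or dependencies on other fields. The sketch below checks those documented constraints client-side before a request is sent. validateParams and SamplingParams are hypothetical names, and the gateway's own validation may be stricter or looser than this; in particular, flagging top_logprobs without logprobs is an assumption drawn from the wording above.

```typescript
// Hypothetical client-side check of the documented parameter constraints.
// Returns a list of problems; an empty list means nothing was flagged.
interface SamplingParams {
  stream?: boolean;
  stream_options?: { include_usage?: boolean };
  temperature?: number;
  top_p?: number;
  stop?: string | string[];
  presence_penalty?: number;
  frequency_penalty?: number;
  logprobs?: boolean;
  top_logprobs?: number;
  logit_bias?: Record<string, number>;
}

// An omitted value is always acceptable; a present one must lie in [min, max].
function inRange(value: number | undefined, min: number, max: number): boolean {
  return value === undefined || (value >= min && value <= max);
}

function validateParams(p: SamplingParams): string[] {
  const problems: string[] = [];
  if (p.stream_options !== undefined && p.stream !== true) {
    problems.push("stream_options requires stream: true");
  }
  if (!inRange(p.temperature, 0, 2)) problems.push("temperature must be between 0 and 2");
  if (!inRange(p.top_p, 0, 1)) problems.push("top_p must be between 0 and 1");
  if (!inRange(p.presence_penalty, -2, 2)) problems.push("presence_penalty must be between -2 and 2");
  if (!inRange(p.frequency_penalty, -2, 2)) problems.push("frequency_penalty must be between -2 and 2");
  if (Array.isArray(p.stop) && p.stop.length > 4) problems.push("stop allows at most 4 sequences");
  if (p.top_logprobs !== undefined) {
    // Assumption: top_logprobs is only meaningful when logprobs is true.
    if (p.logprobs !== true) problems.push("top_logprobs requires logprobs: true");
    if (!inRange(p.top_logprobs, 0, 20)) problems.push("top_logprobs must be between 0 and 20");
  }
  const biases = p.logit_bias ?? {};
  for (const tokenId of Object.keys(biases)) {
    if (!inRange(biases[tokenId], -100, 100)) {
      problems.push(`logit_bias[${tokenId}] must be between -100 and 100`);
    }
  }
  return problems;
}
```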
`developer` role is rewritten
Messages with role: 'developer' are normalised to role: 'system' before
reaching the provider, so prompts work uniformly across vendors.
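To illustrate the effect of this rewrite, the sketch below applies the same mapping to a message list. It is only a model of the documented behaviour, not the gateway's actual code; normaliseRoles and RoleMessage are hypothetical names, and content is simplified to a string.

```typescript
// Illustrative only: mirrors the documented behaviour that 'developer'
// messages reach the provider as 'system'.
type Role = "system" | "developer" | "user" | "assistant" | "tool";

interface RoleMessage {
  role: Role;
  content: string;
}

function normaliseRoles(messages: RoleMessage[]): RoleMessage[] {
  // Copy each developer message with its role swapped; leave the rest untouched.
  return messages.map((m): RoleMessage =>
    m.role === "developer" ? { ...m, role: "system" } : m,
  );
}
```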
Message shape
type Message =
| { role: 'system' | 'developer'; content: string | Part[]; name?: string }
| { role: 'user'; content: string | Part[]; name?: string }
| { role: 'assistant'; content?: string | Part[] | null; tool_calls?: ToolCall[] }
| { role: 'tool'; content: string | Part[]; tool_call_id: string };
Content parts cover text, images, video, audio, and documents:
[
{ "type": "text", "text": "What is in this image?" },
{ "type": "image_url", "image_url": { "url": "https://example.com/cat.png" } }
]
Content part types
| Type | Description | Example |
|---|---|---|
| text | Plain text | { "type": "text", "text": "Hello" } |
| image_url | Image by URL or base64 | { "type": "image_url", "image_url": { "url": "https://..." } } |
| input_video | Video by base64 (models with video capability) | { "type": "input_video", "input_video": { "data": "<base64>", "format": "mp4" } } |
| input_audio | Audio by base64 (models with audio capability) | { "type": "input_audio", "input_audio": { "data": "<base64>", "format": "mp3" } } |
| file | Document by URL, base64, or file_id | { "type": "file", "file": { "file_url": "https://..." } } |
audio and video capabilities
The input_audio and input_video content types are only supported by
models with the corresponding capability. Check GET /v1/models for
supported models. See Audio Understanding
and Video Understanding for usage examples.
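A multimodal user message can be assembled from the content part shapes listed above. The sketch below covers a subset of the part types; imageFromBase64 and userMessage are hypothetical helpers. Passing base64 images as a data URL in image_url.url follows the OpenAI convention and is an assumption here, since the table only says "URL or base64".

```typescript
// Content part shapes taken from the documented table (subset).
type Part =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } }
  | { type: "input_audio"; input_audio: { data: string; format: string } };

// Assumption: base64 images are wrapped in a data URL, as in the OpenAI convention.
function imageFromBase64(base64: string, mimeType: string): Part {
  return {
    type: "image_url",
    image_url: { url: `data:${mimeType};base64,${base64}` },
  };
}

// Builds a user message whose content is a text part followed by any attachments.
function userMessage(
  text: string,
  ...attachments: Part[]
): { role: "user"; content: Part[] } {
  return { role: "user", content: [{ type: "text", text }, ...attachments] };
}
```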
Response
id (string, optional): Unique completion id, prefixed ilbs_.
object ('chat.completion' | 'chat.completion.chunk', optional): chat.completion for non-streaming, chat.completion.chunk for SSE deltas.
created (integer, optional): Unix timestamp (seconds) when the completion was created.
model (string, optional): The model id actually used. May differ from the request if routing fell back to an alternate provider.
choices (Choice[], optional): One entry per n. Each contains index, message (or delta when
streaming), and a finish_reason of stop, length, tool_calls,
or content_filter.
usage (object, optional): { prompt_tokens, completion_tokens, total_tokens }. Present on the
final chunk when stream_options.include_usage is set.
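The response fields can be read as in the sketch below, which pulls the first choice's text and inspects finish_reason. The types are reduced to the documented fields, readReply is a hypothetical helper, and the reading of length as "output cap reached" follows the OpenAI convention rather than anything stated on this page.

```typescript
// Minimal response types based on the documented fields.
// The shape of message.tool_calls is left opaque here.
interface Choice {
  index: number;
  message: { role: string; content?: string | null; tool_calls?: unknown[] };
  finish_reason: "stop" | "length" | "tool_calls" | "content_filter";
}

interface ChatCompletion {
  id: string;
  model: string;
  choices: Choice[];
  usage?: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

// Hypothetical helper: returns the first choice's text plus two flags.
function readReply(
  resp: ChatCompletion,
): { text: string; truncated: boolean; wantsTools: boolean } {
  const first = resp.choices[0];
  if (first === undefined) throw new Error("response contained no choices");
  return {
    // content may be absent or null when the model returns tool calls instead.
    text: first.message.content ?? "",
    // Assumption (OpenAI convention): 'length' means the output-token cap was hit.
    truncated: first.finish_reason === "length",
    wantsTools: first.finish_reason === "tool_calls",
  };
}
```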
Streaming
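When stream: true, the response is delivered as Server-Sent Events with
Content-Type text/event-stream. Each event's data field carries a JSON chunk
(object: chat.completion.chunk) holding a delta; the stream terminates with a
literal data: [DONE] sentinel - close the reader when you see it.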
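The event format can be handled as in the sketch below, which takes the text of a complete stream, concatenates delta.content from each chunk, and stops at the sentinel. collectStreamText and StreamChunk are hypothetical names; the sketch assumes n = 1 and that each data: line holds one whole JSON chunk.

```typescript
// Hypothetical parser for the text of an SSE stream. It collects
// delta.content from each chunk and stops at the [DONE] sentinel.
interface StreamChunk {
  choices?: { index: number; delta: { content?: string | null } }[];
}

function collectStreamText(raw: string): { text: string; done: boolean } {
  let text = "";
  for (const line of raw.split(/\r?\n/)) {
    // Skip blank separator lines and any non-data fields.
    if (line.slice(0, 5) !== "data:") continue;
    const data = line.slice(5).trim();
    if (data === "[DONE]") return { text, done: true };
    const chunk = JSON.parse(data) as StreamChunk;
    // A usage-only final chunk may carry no choices; treat that as empty.
    for (const choice of chunk.choices ?? []) {
      text += choice.delta.content ?? "";
    }
  }
  // The sentinel was never seen, so the stream ended early or is incomplete.
  return { text, done: false };
}
```

A real reader would receive the stream in arbitrary network chunks, so it would need to buffer partial lines before handing complete ones to a parser like this.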
Tool calling
Provide a tools array of JSON-schema function definitions. When the model
chooses to call one, it returns tool_calls instead of content and the
finish_reason is tool_calls. Run the function locally, then send the
result back as a role: 'tool' message with the same tool_call_id. Call
the endpoint again to get the final assistant reply.
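The local half of that round trip might look like the sketch below: it takes the tool_calls from an assistant message, runs each against a registry of local functions, and produces the role: 'tool' messages to append before the next request. The ToolCall shape (with function.arguments as a JSON-encoded string) follows the OpenAI convention and is an assumption, since this page references ToolCall without defining it; runToolCalls and LocalTool are hypothetical names.

```typescript
// Assumption: ToolCall follows the OpenAI shape, with function.arguments
// delivered as a JSON-encoded string.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

interface ToolMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
}

type LocalTool = (args: Record<string, unknown>) => unknown;

// Runs each requested call against a local registry and builds the
// role: 'tool' messages to send back with the same tool_call_id.
function runToolCalls(
  calls: ToolCall[],
  registry: Record<string, LocalTool | undefined>,
): ToolMessage[] {
  return calls.map((call): ToolMessage => {
    const tool = registry[call.function.name];
    const result = tool
      ? tool(JSON.parse(call.function.arguments) as Record<string, unknown>)
      : { error: `unknown tool: ${call.function.name}` };
    // Serialise the result; undefined becomes null so content is always a string.
    return {
      role: "tool",
      tool_call_id: call.id,
      content: JSON.stringify(result ?? null),
    };
  });
}
```

Reporting an unknown tool back to the model as an error payload, rather than throwing, is a design choice in this sketch so the model can see the failure in the next turn.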
Errors
All errors follow the OpenAI structured shape:
{ "error": { "type": "invalid_request_error", "message": "..." } }
| Status | error.type | Meaning |
|---|---|---|
| 400 | invalid_request_error | Schema violation. The message names the offending field. |
| 401 | authentication_error | Missing or invalid API key. |
| 402 | billing_error | Account out of credit or hit a spend limit. |
| 403 | permission_error | Key lacks the required scope. |
| 429 | rate_limit_error | RPM or TPM exceeded. Respect Retry-After. |
| 500 | internal_server_error | Unexpected gateway failure. Safe to retry idempotently. |
| 502 / 503 | upstream_error | Downstream provider failure. |
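
One way to act on these statuses is a small retry policy, sketched below. retryDelayMs and backoff are hypothetical helpers, the backoff numbers are illustrative rather than documented, and retrying 502 / 503 is an assumption (only 500 is described as safe to retry). Retry-After is handled only in its seconds form.

```typescript
// Hypothetical retry policy built from the documented status table.
// Returns a delay in milliseconds, or null when the request should not be retried.
function retryDelayMs(
  status: number,
  attempt: number,
  retryAfter?: string | null,
): number | null {
  if (status === 429) {
    // Respect Retry-After when it is a non-negative number of seconds.
    const seconds = Number(retryAfter);
    if (retryAfter && isFinite(seconds) && seconds >= 0) return seconds * 1000;
    return backoff(attempt);
  }
  // 500 is documented as safe to retry; retrying 502/503 is an assumption.
  if (status === 500 || status === 502 || status === 503) return backoff(attempt);
  // 400, 401, 402, 403: retrying will not help until the request or account changes.
  return null;
}

function backoff(attempt: number): number {
  // Illustrative exponential backoff: 500 ms doubling per attempt, capped at 8 s.
  return Math.min(500 * 2 ** attempt, 8000);
}
```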