Create Chat Completion

Send a list of messages and get back an assistant reply. Supports streaming, tool calls, structured output, and multimodal content parts.

The endpoint mirrors OpenAI's chat.completions.create, so any OpenAI SDK works out of the box: point baseURL at https://api.aivene.com/v1. Both streaming and non-streaming modes are supported.

POST /v1/chat/completions

Authentication

Authorization (Bearer, required)

API key sent as a bearer token in the Authorization header. Create keys on the Manage API Keys page.

Headers

Content-Type (string, required)

Must be application/json.
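A minimal sketch of how the endpoint, the bearer token, and the Content-Type header fit together, using the two required body fields documented under Body. buildChatRequest and HttpRequest are hypothetical names introduced for illustration, not part of any SDK; the helper performs no network I/O and only describes the request you would hand to an HTTP client.

```typescript
// Hypothetical helper: assembles the pieces of a chat completion request.
// It sends nothing; pass the result to the HTTP client of your choice.
interface ChatMessage {
  role: "system" | "developer" | "user" | "assistant" | "tool";
  content: string;
}

interface HttpRequest {
  method: "POST";
  url: string;
  headers: { [header: string]: string };
  body: string;
}

const BASE_URL = "https://api.aivene.com/v1";

function buildChatRequest(apiKey: string, model: string, messages: ChatMessage[]): HttpRequest {
  if (messages.length === 0) {
    throw new Error("messages must contain at least one message");
  }
  return {
    method: "POST",
    url: BASE_URL + "/chat/completions",
    headers: {
      Authorization: "Bearer " + apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: model, messages: messages }),
  };
}

// "YOUR_API_KEY" is a placeholder, not a real key.
const chatRequest = buildChatRequest("YOUR_API_KEY", "gpt-4o-mini", [
  { role: "user", content: "Hello" },
]);
```

An OpenAI SDK configured with the base URL above should produce an equivalent request on your behalf.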

Body

model (string, required)

Model id from GET /v1/models. Examples: gpt-4o-mini, claude-sonnet-4, gemini-2.5-pro.

messages (Message[], required)

Conversation history. Must contain at least one message. See Message shape below for the discriminated union.

stream (boolean, optional, default: false)

Stream partial deltas as Server-Sent Events instead of returning a single JSON response.

stream_options (object, optional)

Only allowed when stream: true. Supports { include_usage: boolean } to receive a final chunk with token totals.

max_completion_tokens (integer, optional)

Cap on output tokens. max_tokens is accepted as an alias and rewritten server-side.

temperature (number, optional, default: 1)

Between 0 and 2. Higher values produce more random output.

top_p (number, optional, default: 1)

Nucleus sampling threshold between 0 and 1. Set either temperature or top_p, not both.

n (integer, optional, default: 1)

Number of completions to generate. Most models only support 1.

stop (string | string[], optional)

Up to 4 stop sequences. Generation halts before the sequence is emitted.

presence_penalty (number, optional, default: 0)

Between -2 and 2. Positive values penalise tokens that have already appeared at least once, which discourages repetition.

frequency_penalty (number, optional, default: 0)

Between -2 and 2. Positive values reduce verbatim repetition.

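seed (integer, optional)

Requests that reuse the same seed and parameters aim to return the same output. Determinism is best-effort and applies only on supported providers.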

response_format (object, optional)

{ type: 'json_object' | 'text' | 'json_schema', json_schema?: ... }. Use json_schema for strict structured output.

tools (Tool[], optional)

Function / tool schemas the model can call. See Tool calling.

tool_choice (string | object, optional)

'auto', 'none', 'required', or { type: 'function', function: { name } } to force a specific call.

Allowed values: auto, none, required

parallel_tool_calls (boolean, optional, default: true)

Allow the model to emit multiple tool calls in one turn.

reasoning_effort ('low' | 'medium' | 'high', optional)

Hint for reasoning models. Ignored by non-reasoning models.

modalities (string[], optional)

Request additional modalities like ['text', 'audio'].

logprobs (boolean, optional, default: false)

Include log probabilities in the response.

top_logprobs (integer, optional)

Number of most likely tokens (0-20) to return at each position when logprobs is true.

logit_bias (Record<string, number>, optional)

Bias map keyed by token id. Values between -100 and 100.

metadata (object, optional)

Free-form key/value pairs attached to the request for logging.

user (string, optional)

End-user identifier for abuse tracking. safety_identifier is accepted as an alias.
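The ranges and dependencies listed for the body parameters can be checked client-side before sending. The following is a sketch only: validateParams and SamplingParams are hypothetical names, the server remains the source of truth, and treating top_logprobs without logprobs: true as an error is an assumption drawn from the wording above.

```typescript
// Hypothetical pre-flight check for the documented ranges.
// It catches obvious mistakes early; it does not replace server-side validation.
interface SamplingParams {
  stream?: boolean;
  stream_options?: { include_usage?: boolean };
  temperature?: number;
  top_p?: number;
  stop?: string | string[];
  presence_penalty?: number;
  frequency_penalty?: number;
  logprobs?: boolean;
  top_logprobs?: number;
  logit_bias?: { [tokenId: string]: number };
}

// Unset values are always acceptable; set values must fall inside [min, max].
function inRange(value: number | undefined, min: number, max: number): boolean {
  return value === undefined || (value >= min && value <= max);
}

function validateParams(p: SamplingParams): string[] {
  const problems: string[] = [];
  if (!inRange(p.temperature, 0, 2)) problems.push("temperature must be between 0 and 2");
  if (!inRange(p.top_p, 0, 1)) problems.push("top_p must be between 0 and 1");
  if (!inRange(p.presence_penalty, -2, 2)) problems.push("presence_penalty must be between -2 and 2");
  if (!inRange(p.frequency_penalty, -2, 2)) problems.push("frequency_penalty must be between -2 and 2");
  if (Array.isArray(p.stop) && p.stop.length > 4) problems.push("stop accepts at most 4 sequences");
  if (p.stream_options !== undefined && p.stream !== true) {
    problems.push("stream_options is only allowed when stream is true");
  }
  if (p.top_logprobs !== undefined) {
    // Assumption: top_logprobs is meaningless unless logprobs is true.
    if (p.logprobs !== true) problems.push("top_logprobs requires logprobs: true");
    if (!inRange(p.top_logprobs, 0, 20)) problems.push("top_logprobs must be between 0 and 20");
  }
  const bias = p.logit_bias || {};
  for (const tokenId of Object.keys(bias)) {
    if (!inRange(bias[tokenId], -100, 100)) {
      problems.push("logit_bias[" + tokenId + "] must be between -100 and 100");
    }
  }
  return problems;
}

// Two problems are reported here: the temperature range and stream_options without stream.
const problems = validateParams({ temperature: 2.5, stream_options: { include_usage: true } });
```

Returning a list of messages rather than throwing on the first failure lets a caller report every issue at once. The sketch does not enforce the temperature-or-top_p recommendation, since the text above words it as advice rather than a hard rule.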

`developer` role is rewritten

Messages with role: 'developer' are normalised to role: 'system' before reaching the provider, so prompts work uniformly across vendors.
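To predict what a provider receives, the documented rewrites can be mimicked locally. This is an illustration, not the gateway's actual implementation: normaliseBody and its types are hypothetical, and the precedence between max_completion_tokens and its max_tokens alias when both are present is an assumption.

```typescript
// Illustrative only: applies the rewrites described in this section to a request body.
interface Msg {
  role: string;
  content: string;
}

interface ChatBody {
  model: string;
  messages: Msg[];
  max_tokens?: number;
  max_completion_tokens?: number;
}

function normaliseBody(body: ChatBody): ChatBody {
  const messages = body.messages.map((m: Msg): Msg => {
    // role: 'developer' is treated as role: 'system'.
    return m.role === "developer" ? { role: "system", content: m.content } : m;
  });
  const out: ChatBody = { model: body.model, messages: messages };
  // Assumption: an explicit max_completion_tokens wins over the max_tokens alias.
  const cap = body.max_completion_tokens !== undefined ? body.max_completion_tokens : body.max_tokens;
  if (cap !== undefined) {
    out.max_completion_tokens = cap;
  }
  return out;
}

const normalised = normaliseBody({
  model: "gpt-4o-mini",
  messages: [
    { role: "developer", content: "Answer in one sentence." },
    { role: "user", content: "What is SSE?" },
  ],
  max_tokens: 64,
});
// normalised.messages now carries roles "system" and "user",
// and the cap is expressed as max_completion_tokens: 64.
```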

Message shape

type Message =
  | { role: 'system' | 'developer'; content: string | Part[]; name?: string }
  | { role: 'user'; content: string | Part[]; name?: string }
  | { role: 'assistant'; content?: string | Part[] | null; tool_calls?: ToolCall[] }
  | { role: 'tool'; content: string | Part[]; tool_call_id: string };

Content parts cover text, images, audio, video, and documents:

[
  { "type": "text", "text": "What is in this image?" },
  { "type": "image_url", "image_url": { "url": "https://example.com/cat.png" } }
]
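A small sketch of building that content array in code. The Part type here covers only the text and image_url kinds, and imageQuestion is a hypothetical helper name.

```typescript
// Hypothetical helper types covering the two part kinds used in the example above.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };
type Part = TextPart | ImagePart;

interface UserMessage {
  role: "user";
  content: string | Part[];
}

// Builds a user message that pairs one question with any number of images.
function imageQuestion(question: string, imageUrls: string[]): UserMessage {
  const parts: Part[] = [{ type: "text", text: question }];
  for (const url of imageUrls) {
    parts.push({ type: "image_url", image_url: { url: url } });
  }
  return { role: "user", content: parts };
}

const userMessage = imageQuestion("What is in this image?", ["https://example.com/cat.png"]);
```

Serialising userMessage.content with JSON.stringify yields the same two-element array shown above.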

Content part types

| Type | Description | Example |
| --- | --- | --- |
| text | Plain text | { "type": "text", "text": "Hello" } |
| image_url | Image by URL or base64 | { "type": "image_url", "image_url": { "url": "https://..." } } |
| input_video | Video by base64 (models with video capability) | { "type": "input_video", "input_video": { "data": "<base64>", "format": "mp4" } } |
| input_audio | Audio by base64 (models with audio capability) | { "type": "input_audio", "input_audio": { "data": "<base64>", "format": "mp3" } } |
| file | Document by URL, base64, or file_id | { "type": "file", "file": { "file_url": "https://..." } } |

Audio and video capabilities

The input_audio and input_video content types are only supported by models with the corresponding capability. Check GET /v1/models for supported models. See Audio Understanding and Video Understanding for usage examples.
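Before picking a model, it can help to work out which capabilities a conversation needs. The sketch below only inspects the messages. requiredCapabilities is a hypothetical helper, and the labels "audio" and "video" are chosen for this example; how GET /v1/models reports capabilities is not specified in this section.

```typescript
// Hypothetical pre-flight helper: reports which media capabilities a conversation needs.
interface AnyPart {
  type: string;
  [key: string]: unknown; // part-specific payload such as input_audio or image_url
}

interface AnyMessage {
  role: string;
  content?: string | AnyPart[] | null;
}

function requiredCapabilities(messages: AnyMessage[]): string[] {
  let needsAudio = false;
  let needsVideo = false;
  for (const m of messages) {
    if (!Array.isArray(m.content)) continue; // plain strings and null carry no media
    for (const part of m.content) {
      if (part.type === "input_audio") needsAudio = true;
      if (part.type === "input_video") needsVideo = true;
    }
  }
  const needed: string[] = [];
  if (needsAudio) needed.push("audio");
  if (needsVideo) needed.push("video");
  return needed;
}

const caps = requiredCapabilities([
  {
    role: "user",
    content: [
      { type: "text", text: "Transcribe this clip." },
      { type: "input_audio", input_audio: { data: "<base64>", format: "mp3" } },
    ],
  },
]);
// caps is ["audio"]
```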

Response

id (string, optional)

Unique completion id, prefixed ilbs_.

object ('chat.completion' | 'chat.completion.chunk', optional)

chat.completion for non-streaming, chat.completion.chunk for SSE deltas.

created (integer, optional)

Unix timestamp (seconds) when the completion was created.

model (string, optional)

The model id actually used. May differ from the request if routing fell back to an alternate provider.

choices (Choice[], optional)

One entry per requested completion (n). Each contains index, message (or delta when streaming), and a finish_reason of stop, length, tool_calls, or content_filter.

usage (object, optional)

{ prompt_tokens, completion_tokens, total_tokens }. When streaming, it appears on the final chunk if stream_options.include_usage is set.
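A sketch of reading the fields above from a non-streaming response. Every field is treated as optional to match the listing. The sample payload is hand-written for illustration and is not captured output, and summarise is a hypothetical helper name.

```typescript
// Minimal response types based on the fields listed above; only what this sketch reads.
interface Choice {
  index: number;
  message?: { role: string; content?: string | null };
  finish_reason?: string | null;
}

interface ChatCompletion {
  id?: string;
  object?: string;
  created?: number;
  model?: string;
  choices?: Choice[];
  usage?: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

interface Summary {
  text: string;
  truncated: boolean;
  totalTokens: number | null;
}

function summarise(raw: string): Summary {
  const completion = JSON.parse(raw) as ChatCompletion;
  const choices = completion.choices || [];
  const first = choices.length > 0 ? choices[0] : undefined;
  const content = first && first.message ? first.message.content : null;
  return {
    text: typeof content === "string" ? content : "",
    // finish_reason "length" means the output hit the token cap.
    truncated: first !== undefined && first.finish_reason === "length",
    totalTokens: completion.usage ? completion.usage.total_tokens : null,
  };
}

// Hand-written sample payload, not a recorded response.
const sample = JSON.stringify({
  id: "ilbs_example",
  object: "chat.completion",
  created: 1700000000,
  model: "gpt-4o-mini",
  choices: [{ index: 0, message: { role: "assistant", content: "Hi there." }, finish_reason: "stop" }],
  usage: { prompt_tokens: 8, completion_tokens: 3, total_tokens: 11 },
});

const summary = summarise(sample);
// summary.text is "Hi there.", summary.truncated is false, summary.totalTokens is 11
```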

Streaming

When stream: true, the response is delivered as text/event-stream. Each event's data field carries a JSON delta chunk. The stream terminates with a literal data: [DONE] sentinel; close the reader when you see it.
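A sketch of the client-side loop, written as a pure function that is fed decoded text as it arrives from the network. The chunk shape read here (choices[0].delta.content) follows the OpenAI streaming format that this endpoint mirrors. The sketch assumes each event carries a single data: line. StreamState and feed are hypothetical names.

```typescript
// Incremental Server-Sent Events reader for chat completion deltas.
interface StreamState {
  buffer: string; // incomplete trailing line carried between reads
  text: string;   // assistant text accumulated so far
  done: boolean;  // true once data: [DONE] has been seen
}

function newStreamState(): StreamState {
  return { buffer: "", text: "", done: false };
}

function feed(state: StreamState, incoming: string): void {
  if (state.done) return;
  state.buffer += incoming;
  const lines = state.buffer.split("\n");
  // The last element is either "" or a partial line; keep it for the next read.
  state.buffer = lines.pop() || "";
  for (const rawLine of lines) {
    const line = rawLine.replace(/\r$/, "");
    if (line.indexOf("data:") !== 0) continue; // skip blank separators and comments
    const payload = line.slice(5).trim();
    if (payload === "[DONE]") {
      state.done = true;
      return;
    }
    const chunk = JSON.parse(payload);
    const choices = chunk.choices || [];
    // A usage-only final chunk may carry no choices, so guard the lookup.
    const delta = choices.length > 0 ? choices[0].delta : undefined;
    if (delta && typeof delta.content === "string") {
      state.text += delta.content;
    }
  }
}

const streamState = newStreamState();
// The second event is deliberately split across two reads to show buffering.
feed(streamState, 'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}\n\ndata: {"choices":[{"index":0,"de');
feed(streamState, 'lta":{"content":"lo"}}]}\n\ndata: [DONE]\n\n');
// streamState.text is "Hello" and streamState.done is true
```

Network reads can split an event at any byte, so the function keeps the unfinished tail in buffer and parses only complete lines.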

Tool calling

Provide a tools array of JSON-schema function definitions. When the model chooses to call one, it returns tool_calls instead of content and the finish_reason is tool_calls. Run the function locally, then send the result back as a role: 'tool' message with the same tool_call_id. Call the endpoint again to get the final assistant reply.
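The middle step of that round trip, running the requested functions and producing role: 'tool' messages, might look like the sketch below. The ToolCall shape, including arguments as a JSON-encoded string, follows the OpenAI format that this endpoint mirrors. runToolCalls and the add tool are hypothetical, and reporting an unknown tool as an error payload is one possible design rather than documented behaviour.

```typescript
// Shapes follow the OpenAI tool-call format mirrored by this endpoint.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON-encoded string
}

interface AssistantMessage {
  role: "assistant";
  content?: string | null;
  tool_calls?: ToolCall[];
}

interface ToolMessage {
  role: "tool";
  content: string;
  tool_call_id: string;
}

type LocalTool = (args: { [key: string]: unknown }) => unknown;

// Runs each requested tool locally and returns the role: 'tool' messages to append.
function runToolCalls(reply: AssistantMessage, tools: { [toolName: string]: LocalTool }): ToolMessage[] {
  const results: ToolMessage[] = [];
  for (const call of reply.tool_calls || []) {
    const toolName = call.function.name;
    // Own-property check so names like "toString" do not resolve to inherited methods.
    const tool = Object.prototype.hasOwnProperty.call(tools, toolName) ? tools[toolName] : undefined;
    let output: unknown;
    if (tool === undefined) {
      output = { error: "unknown tool: " + toolName };
    } else {
      output = tool(JSON.parse(call.function.arguments));
    }
    // Each result is linked to its call through the same tool_call_id.
    results.push({ role: "tool", content: JSON.stringify(output), tool_call_id: call.id });
  }
  return results;
}

const toolMessages = runToolCalls(
  {
    role: "assistant",
    content: null,
    tool_calls: [{ id: "call_1", type: "function", function: { name: "add", arguments: '{"a":2,"b":3}' } }],
  },
  { add: (args) => ({ sum: Number(args["a"]) + Number(args["b"]) }) }
);
// toolMessages holds one message: tool_call_id "call_1" with content '{"sum":5}'
```

For the follow-up request, append the assistant reply and then these tool messages to the conversation before calling the endpoint again.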

Errors

All errors follow the OpenAI structured shape:

{ "error": { "type": "invalid_request_error", "message": "..." } }

| Status | error.type | Meaning |
| --- | --- | --- |
| 400 | invalid_request_error | Schema violation. The message names the offending field. |
| 401 | authentication_error | Missing or invalid API key. |
| 402 | billing_error | Account out of credit or hit a spend limit. |
| 403 | permission_error | Key lacks the required scope. |
| 429 | rate_limit_error | RPM or TPM exceeded. Respect Retry-After. |
| 500 | internal_server_error | Unexpected gateway failure. Safe to retry. |
| 502 / 503 | upstream_error | Upstream provider failure. |
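One way to turn the status codes above into a retry policy is sketched below. decideRetry is a hypothetical helper. The exponential backoff with a 30-second cap, and the choice to retry 502 and 503, are assumptions rather than documented guidance. Only 429 (with Retry-After) and 500 are described above as retryable.

```typescript
// Hypothetical retry policy derived from the documented status codes.
interface RetryDecision {
  retry: boolean;
  delayMs: number;
}

function decideRetry(statusCode: number, retryAfterHeader: string | null, attempt: number): RetryDecision {
  // Assumed backoff: 1 s, 2 s, 4 s, ... capped at 30 s. attempt starts at 0.
  const backoffMs = Math.min(1000 * Math.pow(2, attempt), 30000);
  if (statusCode === 429) {
    // Retry-After is read as a number of seconds; a missing, empty, or HTTP-date value falls back to backoff.
    const hasValue = retryAfterHeader !== null && retryAfterHeader.trim() !== "";
    const seconds = hasValue ? Number(retryAfterHeader) : NaN;
    const delayMs = isFinite(seconds) && seconds >= 0 ? seconds * 1000 : backoffMs;
    return { retry: true, delayMs: delayMs };
  }
  if (statusCode === 500 || statusCode === 502 || statusCode === 503) {
    // Assumption: upstream failures (502 / 503) are treated as transient, like 500.
    return { retry: true, delayMs: backoffMs };
  }
  // 400, 401, 402, 403: the request, key, or account must change before a retry can succeed.
  return { retry: false, delayMs: 0 };
}

const rateLimited = decideRetry(429, "2", 0);
// rateLimited is { retry: true, delayMs: 2000 }
```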