Create Chat Completion

Send a list of messages and get back an assistant reply. Supports streaming, tool calls, structured output, and multimodal content parts.

The endpoint mirrors OpenAI's chat.completions.create and supports both streaming and non-streaming modes, so any OpenAI SDK works out of the box: point baseURL at https://api.aivene.com/v1.

POST /v1/chat/completions

Authentication

Authorization (Bearer, required)

API key as bearer token in the Authorization header. Create keys at Manage API Keys.

Headers

Content-Type (string, required)

Must be application/json.
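
As an illustration of how the method, path, and headers above fit together, here is a minimal sketch that assembles a request without sending it. The helper name prepareChatRequest and the placeholder key are hypothetical, and the URL assumes the base URL https://api.aivene.com/v1 mentioned earlier; the body fields used are described in the next section.

```typescript
// Hypothetical helper: builds the pieces of a POST /v1/chat/completions request.
interface ChatRequestBody {
  model: string;
  messages: { role: string; content: string }[];
  stream?: boolean;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function prepareChatRequest(apiKey: string, payload: ChatRequestBody): PreparedRequest {
  // The reference requires at least one message, so fail early on an empty list.
  if (payload.messages.length === 0) {
    throw new Error("messages must contain at least one message");
  }
  return {
    url: "https://api.aivene.com/v1/chat/completions",
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`, // API key as bearer token
      "Content-Type": "application/json", // required by the endpoint
    },
    body: JSON.stringify(payload),
  };
}

const prepared = prepareChatRequest("YOUR_API_KEY", {
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello" }],
});
```

In a fetch-capable runtime the result could be passed along as fetch(prepared.url, { method: prepared.method, headers: prepared.headers, body: prepared.body }).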

Body

model (string, required)

Model id from GET /v1/models. Examples: gpt-4o-mini, claude-sonnet-4, gemini-2.5-pro.

messages (Message[], required)

Conversation history. Must contain at least one message. See Message shape below for the discriminated union.

stream (boolean, optional, default: false)

Stream partial deltas as Server-Sent Events instead of returning a single JSON response.

stream_options (object, optional)

Only allowed when stream: true. Supports { include_usage: boolean } to receive a final chunk with token totals.

max_completion_tokens (integer, optional)

Cap on output tokens. max_tokens is accepted as an alias and rewritten server-side.

temperature (number, optional, default: 1)

Between 0 and 2. Higher values produce more random output.

top_p (number, optional, default: 1)

Nucleus sampling threshold, between 0 and 1. Set either temperature or top_p, not both.

n (integer, optional, default: 1)

Number of completions to generate. Most models only support 1.

stop (string | string[], optional)

Up to 4 stop sequences. Generation halts before the sequence is emitted.

presence_penalty (number, optional, default: 0)

Between -2 and 2. Positive values discourage repeating tokens.

frequency_penalty (number, optional, default: 0)

Between -2 and 2. Positive values reduce verbatim repetition.

seed (integer, optional)

Seed for best-effort deterministic sampling on supported providers.

response_format (object, optional)

{ type: 'json_object' | 'text' | 'json_schema', json_schema?: ... }. Use json_schema for strict structured output.

tools (Tool[], optional)

Function / tool schemas the model can call. See Tool calling.

tool_choice (string | object, optional)

'auto', 'none', 'required', or { type: 'function', function: { name } } to force a specific call.

Allowed values: auto, none, required

parallel_tool_calls (boolean, optional, default: true)

Allow the model to emit multiple tool calls in one turn.

reasoning_effort ('low' | 'medium' | 'high', optional)

Hint for reasoning models. Ignored by non-reasoning models.

modalities (string[], optional)

Request additional modalities like ['text', 'audio'].

logprobs (boolean, optional, default: false)

Include log probabilities in the response.

top_logprobs (integer, optional)

Number of top tokens (0-20) to return when logprobs is true.

logit_bias (Record<string, number>, optional)

Bias map keyed by token id. Values between -100 and 100.

metadata (object, optional)

Free-form key/value pairs attached to the request for logging.

user (string, optional)

End-user identifier for abuse tracking. safety_identifier is accepted as an alias.
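
The numeric ranges and constraints listed for the body parameters can be checked on the client before sending, which avoids a round trip that would end in a 400. The following is only a sketch: the validateParams helper and its messages are hypothetical, and the gateway's own validation may be stricter or worded differently.

```typescript
// Subset of the body parameters that have documented ranges or constraints.
interface SamplingParams {
  stream?: boolean;
  stream_options?: { include_usage?: boolean };
  temperature?: number;
  top_p?: number;
  stop?: string | string[];
  presence_penalty?: number;
  frequency_penalty?: number;
  top_logprobs?: number;
  logit_bias?: Record<string, number>;
}

// An omitted value is always acceptable; a present one must fall inside [min, max].
function inRange(value: number | undefined, min: number, max: number): boolean {
  return value === undefined || (value >= min && value <= max);
}

// Hypothetical client-side check; returns a list of human-readable problems.
function validateParams(p: SamplingParams): string[] {
  const problems: string[] = [];
  if (p.stream_options !== undefined && p.stream !== true) {
    problems.push("stream_options requires stream: true");
  }
  if (!inRange(p.temperature, 0, 2)) problems.push("temperature must be between 0 and 2");
  if (!inRange(p.top_p, 0, 1)) problems.push("top_p must be between 0 and 1");
  if (!inRange(p.presence_penalty, -2, 2)) problems.push("presence_penalty must be between -2 and 2");
  if (!inRange(p.frequency_penalty, -2, 2)) problems.push("frequency_penalty must be between -2 and 2");
  if (!inRange(p.top_logprobs, 0, 20)) problems.push("top_logprobs must be between 0 and 20");
  if (Array.isArray(p.stop) && p.stop.length > 4) problems.push("stop accepts at most 4 sequences");
  const biasMap = p.logit_bias ?? {};
  for (const tokenId of Object.keys(biasMap)) {
    if (!inRange(biasMap[tokenId], -100, 100)) {
      problems.push(`logit_bias[${tokenId}] must be between -100 and 100`);
    }
  }
  return problems;
}

const issues = validateParams({ temperature: 3, stream_options: { include_usage: true } });
// issues lists two problems: stream_options without stream, and temperature out of range
```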

`developer` role is rewritten

Messages with role: 'developer' are normalised to role: 'system' before reaching the provider, so prompts work uniformly across vendors.
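
To make the effect of this rewrite concrete, the sketch below applies the same mapping locally. It only illustrates the documented behaviour; the function name normaliseDeveloperRole is hypothetical, message content is simplified to a string, and this is not the gateway's actual code.

```typescript
type ChatRole = "system" | "developer" | "user" | "assistant" | "tool";

interface RoleMessage {
  role: ChatRole;
  content: string; // simplified: the real content may also be an array of parts
}

// Returns a new array in which every 'developer' message is relabelled 'system'.
// Other messages are passed through unchanged; the input array is not mutated.
function normaliseDeveloperRole(messages: RoleMessage[]): RoleMessage[] {
  return messages.map((m): RoleMessage =>
    m.role === "developer" ? { ...m, role: "system" } : m
  );
}

const original: RoleMessage[] = [
  { role: "developer", content: "Answer in one sentence." },
  { role: "user", content: "What is nucleus sampling?" },
];
const normalised = normaliseDeveloperRole(original);
// normalised[0].role is "system"; original[0].role is still "developer"
```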

Message shape

type Message =
  | { role: 'system' | 'developer'; content: string | Part[]; name?: string }
  | { role: 'user'; content: string | Part[]; name?: string }
  | { role: 'assistant'; content?: string | Part[] | null; tool_calls?: ToolCall[] }
  | { role: 'tool'; content: string | Part[]; tool_call_id: string };

Content parts cover text, images, audio, and documents:

[
  { "type": "text", "text": "What is in this image?" },
  { "type": "image_url", "image_url": { "url": "https://example.com/cat.png" } }
]

Content part types

Type	Description	Example
`text`	Plain text	`{ "type": "text", "text": "Hello" }`
`image_url`	Image by URL or base64	`{ "type": "image_url", "image_url": { "url": "https://..." } }`
`input_video`	Video by base64 (models with `video` capability)	`{ "type": "input_video", "input_video": { "data": "<base64>", "format": "mp4" } }`
`input_audio`	Audio by base64 (models with `audio` capability)	`{ "type": "input_audio", "input_audio": { "data": "<base64>", "format": "mp3" } }`
`file`	Document by URL, base64, or file_id	`{ "type": "file", "file": { "file_url": "https://..." } }`
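
The part types in the table can be modelled as a discriminated union on type, in the same style as the Message type above. This is a sketch under assumptions: the ContentPart name and the userMessage helper are hypothetical, format is typed loosely as string, and the file variant covers only the file_url form shown in the table because the field names for the base64 and file_id forms are not given here.

```typescript
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } }
  | { type: "input_video"; input_video: { data: string; format: string } }
  | { type: "input_audio"; input_audio: { data: string; format: string } }
  | { type: "file"; file: { file_url: string } };

// Hypothetical helper: builds a user message with one text part followed by
// one image_url part per supplied URL.
function userMessage(
  text: string,
  imageUrls: string[] = []
): { role: "user"; content: ContentPart[] } {
  const parts: ContentPart[] = [{ type: "text", text }];
  for (const url of imageUrls) {
    parts.push({ type: "image_url", image_url: { url } });
  }
  return { role: "user", content: parts };
}

const imageQuestion = userMessage("What is in this image?", [
  "https://example.com/cat.png",
]);
```

Because the union is keyed on type, a switch over part.type lets the compiler narrow each branch to the matching payload field.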

Audio and video capabilities

The input_audio and input_video content types are only supported by models with the corresponding capability. Check GET /v1/models for supported models. See Audio Understanding and Video Understanding for usage examples.

Response

id (string, optional)

Unique completion id, prefixed with avn_.

object ('chat.completion' | 'chat.completion.chunk', optional)

chat.completion for non-streaming, chat.completion.chunk for SSE deltas.

created (integer, optional)

Unix timestamp (seconds) when the completion was created.

model (string, optional)

The model id actually used. May differ from the request if routing fell back to an alternate provider.

choices (Choice[], optional)

One entry per requested completion (n). Each contains index, message (or delta when streaming), and a finish_reason of stop, length, tool_calls, or content_filter.

usage (object, optional)

{ prompt_tokens, completion_tokens, total_tokens }. Present on the final chunk when stream_options.include_usage is set.
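
As a sketch of how these fields might be consumed in non-streaming mode, the helper below pulls out the first choice's text, flags truncation, and reads the token total. The interface and function names are hypothetical, assistant content is assumed to be a plain string, and every value in the sample payload is invented for illustration.

```typescript
interface CompletionChoice {
  index: number;
  message: { role: string; content?: string | null };
  finish_reason: "stop" | "length" | "tool_calls" | "content_filter";
}

interface ChatCompletionResponse {
  id: string;
  object: "chat.completion";
  created: number;
  model: string;
  choices: CompletionChoice[];
  usage?: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

// Hypothetical helper: summarises the first choice of a non-streaming response.
function summarise(res: ChatCompletionResponse): {
  text: string;
  truncated: boolean;
  totalTokens: number | undefined;
} {
  const first = res.choices[0];
  return {
    text: first?.message.content ?? "",
    truncated: first?.finish_reason === "length", // output hit the token cap
    totalTokens: res.usage?.total_tokens,
  };
}

// Invented sample payload, shaped after the fields documented above.
const sampleResponse: ChatCompletionResponse = {
  id: "avn_example",
  object: "chat.completion",
  created: 1700000000,
  model: "gpt-4o-mini",
  choices: [
    { index: 0, message: { role: "assistant", content: "Hello!" }, finish_reason: "stop" },
  ],
  usage: { prompt_tokens: 5, completion_tokens: 2, total_tokens: 7 },
};
const summary = summarise(sampleResponse);
```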

Streaming

When stream: true, the response is served as text/event-stream. Each event carries a JSON delta in its data field; the stream ends with a literal data: [DONE] sentinel. Close the reader when you see it.
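
A minimal sketch of the parsing side is shown below. It assumes each chunk follows the OpenAI-style shape choices[].delta.content, that n is 1, and that the raw text passed in contains only whole lines; a real reader would need to buffer partial lines across network reads. The names StreamChunk and collectStreamText are hypothetical.

```typescript
interface StreamChunk {
  // Assumed OpenAI-style chunk shape; choices may be empty on a usage-only chunk.
  choices?: { index: number; delta: { content?: string | null } }[];
}

// Concatenates delta text from raw SSE lines and stops at the [DONE] sentinel.
function collectStreamText(raw: string): { text: string; done: boolean } {
  let text = "";
  for (const line of raw.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.indexOf("data:") !== 0) continue; // skip blank lines and other SSE fields
    const data = trimmed.slice("data:".length).trim();
    if (data === "[DONE]") return { text, done: true };
    const chunk = JSON.parse(data) as StreamChunk;
    for (const choice of chunk.choices ?? []) {
      text += choice.delta.content ?? "";
    }
  }
  return { text, done: false }; // sentinel not seen yet
}

// Invented sample stream for illustration.
const rawStream = [
  'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
  "",
  'data: {"choices":[{"index":0,"delta":{"content":"lo"}}]}',
  "",
  "data: [DONE]",
  "",
].join("\n");
const streamed = collectStreamText(rawStream);
// streamed.text is "Hello" and streamed.done is true
```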

Tool calling

Provide a tools array of JSON-schema function definitions. When the model chooses to call one, it returns tool_calls instead of content and the finish_reason is tool_calls. Run the function locally, then send the result back as a role: 'tool' message with the same tool_call_id. Call the endpoint again to get the final assistant reply.
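
The "run the function locally" step can be sketched as follows. The ToolCall shape is not spelled out in this reference, so the sketch assumes the OpenAI convention of { id, type: 'function', function: { name, arguments } } with arguments as a JSON string; runToolCalls, ToolCallSketch, and the get_weather tool are hypothetical names used only for illustration.

```typescript
interface ToolCallSketch {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments: JSON-encoded string (assumed)
}

interface ToolResultMessage {
  role: "tool";
  content: string;
  tool_call_id: string;
}

type ToolHandler = (args: Record<string, unknown>) => unknown;

// Runs each requested tool locally and wraps the result in a role: 'tool'
// message that carries the same tool_call_id as the call it answers.
function runToolCalls(
  calls: ToolCallSketch[],
  handlers: Record<string, ToolHandler>
): ToolResultMessage[] {
  return calls.map((call): ToolResultMessage => {
    const handler = handlers[call.function.name];
    let result: unknown;
    if (handler === undefined) {
      result = { error: `unknown tool: ${call.function.name}` };
    } else {
      result = handler(JSON.parse(call.function.arguments));
    }
    return { role: "tool", content: JSON.stringify(result), tool_call_id: call.id };
  });
}

const toolMessages = runToolCalls(
  [{ id: "call_1", type: "function", function: { name: "get_weather", arguments: '{"city":"Oslo"}' } }],
  { get_weather: (args) => ({ city: args["city"], forecast: "sunny" }) }
);
```

The returned messages would then be appended to the conversation and sent in a second request to obtain the final assistant reply.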

Errors

All errors follow the OpenAI structured shape:

{ "error": { "type": "invalid_request_error", "message": "..." } }

Status	`error.type`	Meaning
`400`	`invalid_request_error`	Schema violation. The message names the offending field.
`401`	`authentication_error`	Missing or invalid API key.
`402`	`billing_error`	Account out of credit or hit a spend limit.
`403`	`permission_error`	Key lacks the required scope.
`429`	`rate_limit_error`	RPM or TPM exceeded. Respect `Retry-After`.
`500`	`internal_server_error`	Unexpected gateway failure. Generally safe to retry.
`502` / `503`	`upstream_error`	Upstream provider failure.
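
One way to act on this table is a small retry policy keyed on the status code. The sketch below is an assumption-laden example rather than a recommended policy: the helper name retryDelayMs is hypothetical, treating 502 and 503 as retryable is an assumption the table does not state, the backoff values are arbitrary, and only the numeric-seconds form of Retry-After is parsed (an HTTP-date value falls back to backoff).

```typescript
// Returns the delay in milliseconds before the next attempt,
// or null when the error should not be retried.
function retryDelayMs(
  statusCode: number,
  retryAfter: string | null,
  attempt: number // zero-based count of retries already made
): number | null {
  const retryable = [429, 500, 502, 503]; // 502/503 retryable by assumption
  if (retryable.indexOf(statusCode) === -1) return null; // e.g. 400, 401, 402, 403

  // Respect Retry-After on rate limits when it is a plain number of seconds.
  if (statusCode === 429 && retryAfter !== null && retryAfter.trim() !== "") {
    const seconds = Number(retryAfter);
    if (isFinite(seconds) && seconds >= 0) return seconds * 1000;
  }

  // Exponential backoff: 500 ms, 1 s, 2 s, ... capped at 8 s (arbitrary values).
  return Math.min(500 * Math.pow(2, attempt), 8000);
}

const rateLimitedDelay = retryDelayMs(429, "2", 0); // 2000
const authFailureDelay = retryDelayMs(401, null, 0); // null
const serverErrorDelay = retryDelayMs(500, null, 1); // 1000
```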