Create Embeddings

Generate vector embeddings for search, retrieval, and clustering.

Turn text into a dense vector representation that you can store in a vector database and search by cosine similarity. The endpoint matches OpenAI's embeddings.create, so any OpenAI SDK works out of the box: point baseURL at https://api.aivene.com/v1.
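As a rough illustration of the "search by cosine similarity" step, the sketch below scores a query vector against a few candidate vectors in plain Python. The helper names `cosine_similarity` and `rank_by_similarity` are hypothetical and not part of this API; a vector database would normally do this for you.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], candidates: Sequence[Sequence[float]]
) -> List[int]:
    """Return candidate indices ordered from most to least similar to query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
```

Cosine similarity compares direction only, so vectors of different magnitude that point the same way score 1.0.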

POST /v1/embeddings

Authentication

Authorization (Bearer, required)

API key as bearer token in the Authorization header. Create keys on the Manage API Keys page.

Headers

Content-Type (string, required)

Must be application/json.
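A minimal sketch of how the two required headers might be attached, using only Python's standard library. The function name `build_embeddings_request` is hypothetical; the base URL is the one given in this document. The function only builds the request and does not send it.

```python
import json
import urllib.request

BASE_URL = "https://api.aivene.com/v1"  # base URL stated in this document


def build_embeddings_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/embeddings request."""
    return urllib.request.Request(
        url=f"{BASE_URL}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # API key passed as a bearer token
            "Authorization": f"Bearer {api_key}",
            # The body must be JSON
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen` and parse the response body with `json.load`.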

Body

model (string, required)

Embedding model ID, as listed by GET /v1/models. Examples: text-embedding-3-small, text-embedding-3-large.

input (string | string[], required)

One string, an array of strings, or an array of token-id arrays.

dimensions (integer, optional)

Truncate the output vector to this many dimensions. Supported only by models trained with Matryoshka representation learning (e.g. text-embedding-3-small, text-embedding-3-large).

encoding_format (string, optional, default: float)

Output format. float returns a JSON array of numbers; base64 returns a compact base64-encoded string.

Allowed values: float, base64

user (string, optional)

End-user identifier for abuse tracking. safety_identifier is accepted as an alias.
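The body fields above could be assembled as in the sketch below. `build_embeddings_payload` is a hypothetical helper, and the client-side checks are an assumption about what is worth validating before sending; the server remains the authority on what it accepts.

```python
from typing import List, Optional, Union

ALLOWED_ENCODING_FORMATS = ("float", "base64")  # allowed values listed above


def build_embeddings_payload(
    model: str,
    inputs: Union[str, List[str], List[List[int]]],
    dimensions: Optional[int] = None,
    encoding_format: Optional[str] = None,
    user: Optional[str] = None,
) -> dict:
    """Assemble a request body, omitting optional fields that were not set."""
    if not model:
        raise ValueError("model is required")
    if dimensions is not None and dimensions < 1:
        raise ValueError("dimensions must be a positive integer")
    if encoding_format is not None and encoding_format not in ALLOWED_ENCODING_FORMATS:
        raise ValueError(f"encoding_format must be one of {ALLOWED_ENCODING_FORMATS}")

    payload: dict = {"model": model, "input": inputs}
    if dimensions is not None:
        payload["dimensions"] = dimensions
    if encoding_format is not None:
        payload["encoding_format"] = encoding_format
    if user is not None:
        payload["user"] = user
    return payload
```

Leaving unset optional fields out of the body lets the server apply its own defaults, such as float for encoding_format.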

Batch when you can

Sending an array of strings in one request is dramatically cheaper than one request per string. Most models accept batches of up to 2,048 inputs.
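One way to follow this advice is to split a large corpus into request-sized chunks. The sketch below assumes the 2,048-input figure quoted above as the default; `batched` and `MAX_BATCH_SIZE` are hypothetical names, and the real limit should be checked per model.

```python
from typing import Iterator, List

MAX_BATCH_SIZE = 2048  # "most models" figure quoted above; verify per model


def batched(texts: List[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of texts, each holding at most size items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield texts[start:start + size]
```

Each yielded list can be sent as the input field of one request, so 5,000 strings become three requests instead of 5,000.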

Response

object ('list', optional)

Always list.

data (Embedding[], optional)

Array of embedding objects. Each contains index, object ('embedding'), and embedding (array of floats or base64 string).

model (string, optional)

The model id actually used.

usage (object, optional)

{ prompt_tokens, total_tokens }. Token counts for the input text.

Example

curl https://api.aivene.com/v1/embeddings \
  -H "Authorization: Bearer $AIVENE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": [
      "The quick brown fox",
      "jumps over the lazy dog"
    ]
  }'

Response:

{
  "object": "list",
  "data": [
    { "index": 0, "object": "embedding", "embedding": [0.0123, -0.0456, ...] },
    { "index": 1, "object": "embedding", "embedding": [0.0234, -0.0567, ...] }
  ],
  "model": "text-embedding-3-small",
  "usage": { "prompt_tokens": 8, "total_tokens": 8 }
}
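A short sketch of pairing each returned vector with the text that produced it. It assumes that index refers to the position of the text in the request's input array, as in the OpenAI API this endpoint mirrors; `pair_inputs_with_embeddings` is a hypothetical helper.

```python
from typing import List, Tuple


def pair_inputs_with_embeddings(
    inputs: List[str], response: dict
) -> List[Tuple[str, List[float]]]:
    """Return (text, vector) pairs, ordered by the index field of each row."""
    # Sort defensively rather than relying on the order of the data array.
    rows = sorted(response["data"], key=lambda row: row["index"])
    if len(rows) != len(inputs):
        raise ValueError("response does not contain one embedding per input")
    return [(text, row["embedding"]) for text, row in zip(inputs, rows)]
```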

Resizing dimensions

{
  "model": "text-embedding-3-large",
  "input": "hello world",
  "dimensions": 512
}

This returns a 512-dimensional vector instead of the model's native size (e.g. 3072 for text-embedding-3-large). Use it to cut storage cost when full resolution is overkill.
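To put the storage saving in numbers, the sketch below estimates raw vector storage, assuming each component is stored as a 4-byte float32 and ignoring index and metadata overhead. `storage_bytes` is a hypothetical helper.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as 32-bit floats


def storage_bytes(num_vectors: int, dimensions: int) -> int:
    """Raw bytes needed to store num_vectors vectors of the given size."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32


native = storage_bytes(1_000_000, 3072)   # 12,288,000,000 bytes
reduced = storage_bytes(1_000_000, 512)   # 2,048,000,000 bytes
```

Under these assumptions, one million vectors shrink from about 12.3 GB to about 2.0 GB, a sixfold reduction.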

Base64 encoding

Set encoding_format: 'base64' if you need a compact wire format. The client must decode the base64 string into a Float32Array before use.
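The decoding step might look like the following in Python. It assumes the payload is a packed sequence of little-endian 32-bit floats, which matches the Float32Array mentioned above on common platforms but is not confirmed by this document; `decode_base64_embedding` is a hypothetical helper.

```python
import base64
import struct
from typing import List


def decode_base64_embedding(encoded: str) -> List[float]:
    """Decode a base64 string of packed float32 values into a list of floats."""
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    # "<" selects little-endian byte order (assumed), "f" a 4-byte float
    return list(struct.unpack(f"<{count}f", raw))
```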

Choosing dimensions

| Use case | Suggested model | Dimensions |
| --- | --- | --- |
| General RAG | `text-embedding-3-small` | 1536 (native) |
| Tight storage | `text-embedding-3-small` | 512 to 768 |
| High-recall search | `text-embedding-3-large` | 3072 (native) |
| Multilingual | provider-specific | check `/v1/models` |

Errors

All errors follow the OpenAI structured shape:

{ "error": { "type": "invalid_request_error", "message": "..." } }

| Status | `error.type` | Meaning |
| --- | --- | --- |
| `400` | `invalid_request_error` | Schema violation. The message names the offending field. |
| `401` | `authentication_error` | Missing or invalid API key. |
| `402` | `billing_error` | Account out of credit or hit a spend limit. |
| `403` | `permission_error` | Key lacks the required scope. |
| `429` | `rate_limit_error` | RPM or TPM exceeded. Respect `Retry-After`. |
| `500` | `internal_server_error` | Unexpected gateway failure. Safe to retry. |
| `502` / `503` | `upstream_error` | Upstream provider failure. |
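A client-side retry policy based on the table above could be sketched as follows. Treating 502 and 503 as retryable, the exponential backoff schedule, and the 30-second cap are assumptions rather than documented behavior; `retry_delay` is a hypothetical helper, and it expects the header key spelled exactly `Retry-After`.

```python
from typing import Mapping, Optional

# 429 and 500 are described as retryable above; 502/503 are an assumption.
RETRYABLE_STATUSES = {429, 500, 502, 503}
MAX_DELAY_SECONDS = 30.0  # assumed cap on backoff


def retry_delay(status: int, headers: Mapping[str, str], attempt: int) -> Optional[float]:
    """Return seconds to wait before retrying, or None if retrying is pointless."""
    if status not in RETRYABLE_STATUSES:
        return None  # 400, 401, 402, 403: fix the request or account instead
    if status == 429:
        retry_after = headers.get("Retry-After")
        if retry_after is not None:
            try:
                return max(0.0, float(retry_after))
            except ValueError:
                pass  # value may be an HTTP date; fall back to backoff
    # Exponential backoff: 0.5 s, 1 s, 2 s, ... capped at MAX_DELAY_SECONDS
    return min(MAX_DELAY_SECONDS, 0.5 * (2 ** attempt))
```

A caller would loop over attempts, sleep for the returned delay, and stop as soon as the function returns None or an attempt limit is reached.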