Create Embeddings

Generate vector embeddings for search, retrieval, and clustering.

Turn text into a dense vector representation that you can store in a vector database and search by cosine similarity. The endpoint matches OpenAI's embeddings.create, so any OpenAI SDK works out of the box: just point baseURL at https://api.aivene.com/v1.
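As a rough illustration of the cosine-similarity search mentioned above, here is a minimal sketch in plain Python. The vectors are toy 3-dimensional stand-ins, not real model output, and the function name cosine_similarity is our own; a vector database would normally do this for you.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings (hypothetical data).
query = [1.0, 0.0, 0.0]
docs = {"same": [2.0, 0.0, 0.0], "orthogonal": [0.0, 1.0, 0.0]}

# Rank document names from most to least similar to the query.
ranked = sorted(docs, key=lambda name: cosine_similarity(query, docs[name]), reverse=True)
```

Cosine similarity ignores vector length, which is why the "same" document scores 1.0 even though it is twice as long as the query.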

POST /v1/embeddings

Authentication

Authorization (Bearer, required)

API key as bearer token in the Authorization header. Create keys at Manage API Keys.

Headers

Content-Type (string, required)

Must be application/json.

Body

model (string, required)

Embedding model id from GET /v1/models. Examples: text-embedding-3-small, text-embedding-3-large.

input (string | string[], required)

One string, an array of strings, or an array of token-id arrays.

dimensions (integer, optional)

Truncate the output vector to this many dimensions. Supported only by models trained with Matryoshka representation learning (e.g. text-embedding-3-small, text-embedding-3-large).

encoding_format (string, optional, default: float)

Output format. float returns a JSON array of numbers, base64 returns a compact base64-encoded string.

Allowed values: float, base64

user (string, optional)

End-user identifier for abuse tracking. safety_identifier is accepted as an alias.
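The headers and body fields above can be assembled as in the following sketch, which uses only the Python standard library. The helper name build_embeddings_request and the key "sk-example" are placeholders of ours, and the function only builds the request; it does not send it.

```python
import json
import urllib.request

BASE_URL = "https://api.aivene.com/v1"  # base URL given in the introduction

def build_embeddings_request(api_key, model, inputs, dimensions=None,
                             encoding_format=None, user=None):
    """Build (but do not send) a POST /v1/embeddings request."""
    if encoding_format not in (None, "float", "base64"):
        raise ValueError("encoding_format must be 'float' or 'base64'")
    body = {"model": model, "input": inputs}
    # Optional fields are left out unless set, so server defaults apply.
    if dimensions is not None:
        body["dimensions"] = dimensions
    if encoding_format is not None:
        body["encoding_format"] = encoding_format
    if user is not None:
        body["user"] = user
    return urllib.request.Request(
        BASE_URL + "/embeddings",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )

# "sk-example" is a placeholder, not a real key format.
req = build_embeddings_request(
    "sk-example", "text-embedding-3-small", ["The quick brown fox"], dimensions=512
)
# To send it: urllib.request.urlopen(req), then json.load() the response.
```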

Batch when you can

Sending an array of strings in one request is dramatically cheaper than one request per string. Most models accept batches of up to 2,048 inputs.
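One way to stay within the batch limit is to slice the input list before sending. The sketch below assumes the 2,048 figure quoted above applies to your model; the helper name batches is ours, and each yielded list would become the input array of one request.

```python
def batches(items, size=2048):
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Hypothetical corpus of 5,000 short documents.
texts = [f"document {i}" for i in range(5000)]

# Three requests instead of 5,000: two full batches and one remainder.
sizes = [len(batch) for batch in batches(texts)]
```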

Response

object ('list', optional)

Always list.

data (Embedding[], optional)

Array of embedding objects. Each contains index, object ('embedding'), and embedding (array of floats or base64 string).

model (string, optional)

The model id actually used.

usage (object, optional)

{ prompt_tokens, total_tokens }. Token counts for the input text.

Example

curl https://api.aivene.com/v1/embeddings \
  -H "Authorization: Bearer $AIVENE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": [
      "The quick brown fox",
      "jumps over the lazy dog"
    ]
  }'

Response:

{
  "object": "list",
  "data": [
    { "index": 0, "object": "embedding", "embedding": [0.0123, -0.0456, ...] },
    { "index": 1, "object": "embedding", "embedding": [0.0234, -0.0567, ...] }
  ],
  "model": "text-embedding-3-small",
  "usage": { "prompt_tokens": 8, "total_tokens": 8 }
}
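A response of this shape can be unpacked as in the sketch below. It sorts by index defensively, since this page does not state whether data is always returned in input order. The helper name vectors_in_input_order is ours, and the sample vectors are shortened, made-up values.

```python
import json

def vectors_in_input_order(response):
    """Return the embedding vectors, ordered by their `index` field."""
    if response.get("object") != "list":
        raise ValueError("unexpected response object: %r" % response.get("object"))
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]

# Made-up sample with two-element vectors, deliberately out of order.
sample = json.loads("""
{
  "object": "list",
  "data": [
    { "index": 1, "object": "embedding", "embedding": [0.0234, -0.0567] },
    { "index": 0, "object": "embedding", "embedding": [0.0123, -0.0456] }
  ],
  "model": "text-embedding-3-small",
  "usage": { "prompt_tokens": 8, "total_tokens": 8 }
}
""")

vectors = vectors_in_input_order(sample)
```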

Resizing dimensions

{
  "model": "text-embedding-3-large",
  "input": "hello world",
  "dimensions": 512
}

This returns a 512-dim vector instead of the model's native size (e.g. 3072 for text-embedding-3-large). Use it to cut storage cost when full resolution is overkill.
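To put a number on the storage saving, the arithmetic below assumes vectors are stored as 4-byte float32 values with no index or metadata overhead. That is an assumption about your storage layer, not something this API dictates, and the one-million-vector corpus is hypothetical.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors stored as raw float32

def storage_bytes(num_vectors, dimensions):
    """Raw vector storage in bytes, ignoring index and metadata overhead."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32

full = storage_bytes(1_000_000, 3072)    # native text-embedding-3-large size
reduced = storage_bytes(1_000_000, 512)  # with "dimensions": 512
ratio = full / reduced                   # how many times smaller the reduced set is
```

Under these assumptions, one million vectors take about 12.3 GB at 3072 dimensions and about 2.0 GB at 512, a sixfold reduction.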

Base64 encoding

Set encoding_format: 'base64' if you need a compact wire format. The client must decode the base64 string into a Float32Array before use.
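The decoding step might look like the following in Python. It assumes the payload is a packed sequence of little-endian 32-bit floats; the byte order is our assumption (it matches what a Float32Array sees on common hardware), so check it against a float response before relying on it. The helper name decode_embedding is ours.

```python
import base64
import struct

def decode_embedding(b64):
    """Decode a base64 string of packed float32 values into a list of floats."""
    raw = base64.b64decode(b64)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    # "<" = little-endian (assumed), "f" = 32-bit float
    return list(struct.unpack(f"<{count}f", raw))

# Round-trip check with values that float32 represents exactly.
encoded = base64.b64encode(struct.pack("<3f", 0.5, -0.25, 1.0)).decode("ascii")
decoded = decode_embedding(encoded)
```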

Choosing dimensions

Use case            Suggested model         Dimensions
General RAG         text-embedding-3-small  1536 (native)
Tight storage       text-embedding-3-small  512 - 768
High-recall search  text-embedding-3-large  3072 (native)
Multilingual        provider-specific       check /v1/models

Errors

All errors follow the OpenAI structured shape:

{ "error": { "type": "invalid_request_error", "message": "..." } }

Status     error.type             Meaning
400        invalid_request_error  Schema violation. The message names the offending field.
401        authentication_error   Missing or invalid API key.
402        billing_error          Account out of credit or hit a spend limit.
403        permission_error       Key lacks the required scope.
429        rate_limit_error       RPM or TPM exceeded. Respect Retry-After.
500        internal_server_error  Unexpected gateway failure. Safe to retry; the request is idempotent.
502 / 503  upstream_error         Upstream provider failure.
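A client-side retry policy for these statuses could be sketched as follows. Treating 502 and 503 as retryable, the exponential backoff, and the 30-second cap are our assumptions rather than documented behavior; the helper name retry_delay is hypothetical, and header lookup here is case-sensitive for simplicity.

```python
RETRYABLE = {429, 500, 502, 503}  # assumption: 502/503 are worth retrying

def retry_delay(status, headers, attempt, base=1.0, cap=30.0):
    """Seconds to wait before retrying, or None if the error is not retryable.

    `attempt` is the zero-based count of retries already made.
    """
    if status not in RETRYABLE:
        return None  # 400/401/402/403: fix the request or account instead
    retry_after = headers.get("Retry-After")
    if status == 429 and retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Retry-After may be an HTTP date; fall back to backoff
    # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`.
    return min(cap, base * 2 ** attempt)
```

A caller would sleep for the returned number of seconds and resend the same request, giving up after a fixed number of attempts.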