Create Embeddings
Generate vector embeddings for search, retrieval, and clustering.
Turn text into a dense vector representation that you can store in a vector
database and search by cosine similarity. The endpoint matches OpenAI's
embeddings.create, so any OpenAI SDK works out of the box: just point
baseURL at https://api.aivene.com/v1.
POST /v1/embeddings
Authentication
Authorization (Bearer, required): API key sent as a bearer token in the Authorization header. Create keys at
Manage API Keys.
Headers
Content-Type (string, required): Must be application/json.
Body
model (string, required): Embedding model id from GET /v1/models. Examples: text-embedding-3-small,
text-embedding-3-large.
input (string | string[], required): One string, an array of strings, or an array of token-id arrays.
dimensions (integer, optional): Truncate the output vector to this many dimensions. Only supported by
models with Matryoshka representation (e.g. text-embedding-3-small,
text-embedding-3-large).
encoding_format (string, optional, default float): Output format. float returns a JSON array of numbers; base64 returns
a compact base64-encoded string. Allowed values: float, base64.
user (string, optional): End-user identifier for abuse tracking. safety_identifier is accepted
as an alias.
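The body fields above map directly onto a JSON payload. As a minimal sketch using only the Python standard library, the request could be assembled like this. The helper names `build_embeddings_request` and `create_embeddings` are hypothetical, not part of any SDK, and the sketch assumes optional fields can simply be left out when unset.

```python
import json
import urllib.request

API_URL = "https://api.aivene.com/v1/embeddings"  # endpoint documented on this page


def build_embeddings_request(api_key, model, inputs, dimensions=None,
                             encoding_format=None, user=None):
    """Build (but do not send) a POST /v1/embeddings request.

    Hypothetical helper: optional fields are omitted when not set.
    """
    body = {"model": model, "input": inputs}
    if dimensions is not None:
        body["dimensions"] = dimensions
    if encoding_format is not None:
        body["encoding_format"] = encoding_format
    if user is not None:
        body["user"] = user
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_embeddings(api_key, model, inputs, **options):
    """Send the request and return the decoded JSON response."""
    request = build_embeddings_request(api_key, model, inputs, **options)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `create_embeddings(os.environ["AIVENE_API_KEY"], "text-embedding-3-small", ["hello world"])` would then perform the request. Keeping the builder separate from the sender makes the payload easy to inspect without touching the network.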
Batch when you can
Sending an array of strings in one request is dramatically cheaper than one request per string. Most models accept batches of up to 2,048 inputs.
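To stay within the batch limit, a large corpus can be split into consecutive chunks before sending. The sketch below uses a hypothetical helper, `chunk_inputs`, and takes 2,048 as its default only because that is the figure quoted above for most models; the real limit for a given model may differ.

```python
def chunk_inputs(texts, max_batch_size=2048):
    """Split texts into consecutive batches of at most max_batch_size items.

    The default of 2048 is the "most models" figure from this page,
    not a guaranteed limit for every model.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [
        texts[start:start + max_batch_size]
        for start in range(0, len(texts), max_batch_size)
    ]


# 5,000 placeholder strings become three requests instead of 5,000.
documents = [f"document {i}" for i in range(5000)]
batches = chunk_inputs(documents)
print([len(batch) for batch in batches])  # [2048, 2048, 904]
```

Each batch would then be sent as the `input` array of one request.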
Response
object ('list', optional): Always list.
data (Embedding[], optional): Array of embedding objects. Each contains index, object ('embedding'),
and embedding (array of floats or base64 string).
model (string, optional): The model id actually used.
usage (object, optional): { prompt_tokens, total_tokens }. Token counts for the input text.
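Given the response shape above, a client typically needs the vectors back in input order and a way to compare them. The following is a sketch under stated assumptions: the function names are hypothetical, the response uses the default float encoding, and sorting by `index` is a defensive choice rather than documented behaviour. The two-dimensional vectors are made up for illustration.

```python
import math


def vectors_in_input_order(response):
    """Return the embedding arrays sorted by their index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up response with tiny vectors, deliberately listed out of order.
sample_response = {
    "object": "list",
    "data": [
        {"index": 1, "object": "embedding", "embedding": [0.0, 1.0]},
        {"index": 0, "object": "embedding", "embedding": [1.0, 0.0]},
    ],
    "model": "text-embedding-3-small",
    "usage": {"prompt_tokens": 8, "total_tokens": 8},
}

first, second = vectors_in_input_order(sample_response)
print(cosine_similarity(first, second))  # 0.0 (orthogonal vectors)
```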
Example
curl https://api.aivene.com/v1/embeddings \
  -H "Authorization: Bearer $AIVENE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": [
      "The quick brown fox",
      "jumps over the lazy dog"
    ]
  }'
Response:
{
  "object": "list",
  "data": [
    { "index": 0, "object": "embedding", "embedding": [0.0123, -0.0456, ...] },
    { "index": 1, "object": "embedding", "embedding": [0.0234, -0.0567, ...] }
  ],
  "model": "text-embedding-3-small",
  "usage": { "prompt_tokens": 8, "total_tokens": 8 }
}
Resizing dimensions
{
  "model": "text-embedding-3-large",
  "input": "hello world",
  "dimensions": 512
}
This returns a 512-dimensional vector instead of the model's native size (e.g. 3072
for text-embedding-3-large). Use it to cut storage cost when full
resolution is overkill.
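To put a number on the storage saving, here is a rough back-of-the-envelope calculation. It assumes vectors are stored as 4-byte float32 values and ignores any index or metadata overhead a vector database adds; the corpus size of one million vectors is a hypothetical figure.

```python
BYTES_PER_FLOAT32 = 4  # assumption: each dimension is stored as a float32


def raw_storage_bytes(num_vectors, dimensions):
    """Raw bytes needed for the vectors alone, excluding index overhead."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32


num_vectors = 1_000_000  # hypothetical corpus size
full = raw_storage_bytes(num_vectors, 3072)
reduced = raw_storage_bytes(num_vectors, 512)

print(full)            # 12288000000 bytes, about 12.3 GB
print(reduced)         # 2048000000 bytes, about 2.0 GB
print(full / reduced)  # 6.0
```

Under these assumptions, requesting 512 dimensions stores one sixth of the raw vector data of the native 3072.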
Base64 encoding
Set encoding_format: 'base64' if you need a compact wire format. The
client must decode the base64 string into a Float32Array before use.
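In Python, the counterpart of decoding into a Float32Array is unpacking the bytes as 32-bit floats. This sketch assumes the payload is a packed sequence of little-endian float32 values, which this page does not state explicitly; `decode_embedding` is a hypothetical helper name.

```python
import base64
import struct


def decode_embedding(b64_string):
    """Decode a base64 embedding into a list of Python floats.

    Assumes packed little-endian float32 values (4 bytes each).
    """
    raw = base64.b64decode(b64_string)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round trip with values that float32 represents exactly.
sample = base64.b64encode(struct.pack("<3f", 0.5, -1.25, 2.0)).decode("ascii")
print(decode_embedding(sample))  # [0.5, -1.25, 2.0]
```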
Choosing dimensions
| Use case | Suggested model | Dimensions |
|---|---|---|
| General RAG | text-embedding-3-small | 1536 (native) |
| Tight storage | text-embedding-3-small | 512 - 768 |
| High-recall search | text-embedding-3-large | 3072 (native) |
| Multilingual | provider-specific | check /v1/models |
Errors
All errors follow the OpenAI structured shape:
{ "error": { "type": "invalid_request_error", "message": "..." } }

| Status | error.type | Meaning |
|---|---|---|
| 400 | invalid_request_error | Schema violation. The message names the offending field. |
| 401 | authentication_error | Missing or invalid API key. |
| 402 | billing_error | Account out of credit or hit a spend limit. |
| 403 | permission_error | Key lacks the required scope. |
| 429 | rate_limit_error | RPM or TPM exceeded. Respect Retry-After. |
| 500 | internal_server_error | Unexpected gateway failure. Safe to retry idempotently. |
| 502 / 503 | upstream_error | Downstream provider failure. |
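A client can use the table above to decide whether and when to retry. The sketch below is one possible policy, not documented behaviour. The table only calls 500 safe to retry and asks clients to respect Retry-After on 429, so treating 502 and 503 as retryable is an assumption, as are the exponential backoff fallback and its constants. Only the seconds form of Retry-After is handled, and `headers` is a plain dict with that exact key.

```python
import json

# Assumption: 502/503 are retried too; the table only confirms 429 and 500.
RETRYABLE_STATUSES = {429, 500, 502, 503}


def parse_error(body_text):
    """Return (error.type, error.message), or (None, body_text) if unparseable."""
    try:
        error = json.loads(body_text)["error"]
        return error["type"], error["message"]
    except (ValueError, KeyError, TypeError):
        return None, body_text


def retry_delay(status, headers, attempt, base=1.0, cap=30.0):
    """Seconds to wait before retrying, or None if the error is not retryable.

    attempt is zero-based; base and cap are illustrative backoff constants.
    """
    if status not in RETRYABLE_STATUSES:
        return None
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch
    return min(cap, base * 2 ** attempt)


print(retry_delay(429, {"Retry-After": "7"}, attempt=0))  # 7.0
print(retry_delay(401, {}, attempt=0))                    # None
```

Errors such as 400, 401, 402, and 403 return `None` here because retrying the same request would not change the outcome.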