
Using models

Before you can use a model from STACKIT AI Model Serving, you need an auth token.

  • You have a STACKIT AI Model Serving auth token. See Manage auth tokens to create one.

You can use all shared models via the API. STACKIT AI Model Serving provides an OpenAI-compatible API, which allows easy integration into existing tools and libraries. Please consult the OpenAI API documentation for additional parameters and detailed information.

Terminal window
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/chat/completions \
  -H "Authorization: Bearer [auth-token]" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "[model]",
    "messages": [
      {"role": "system", "content": "[system-prompt]"},
      {"role": "user", "content": "[user-message]"},
      {"role": "assistant", "content": "[assistant-message]"},
      {"role": "user", "content": "[user-message]"}
    ],
    "max_completion_tokens": [max-completion-tokens],
    "temperature": 0.1
  }'
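
Since the API follows the OpenAI wire format, the request body can also be assembled programmatically. The following is a minimal Python sketch (standard library only) that builds the same chat-completions payload; `build_chat_request` is a hypothetical helper, and the placeholder values mirror the template above rather than real model names or tokens:

```python
import json

def build_chat_request(model, messages, max_completion_tokens, temperature=0.1):
    """Assemble the JSON body for the /v1/chat/completions endpoint."""
    return {
        "model": model,
        "messages": messages,
        "max_completion_tokens": max_completion_tokens,
        "temperature": temperature,
    }

payload = build_chat_request(
    model="[model]",
    messages=[
        {"role": "system", "content": "[system-prompt]"},
        {"role": "user", "content": "[user-message]"},
    ],
    max_completion_tokens=256,
)
# Send this string as the POST body, with the headers shown in the curl example:
body = json.dumps(payload)
```

The resulting `body` is byte-for-byte equivalent in structure to the `-d` argument of the curl call above.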
Terminal window
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer [auth-token]" \
  -d '{
    "model": "[model]",
    "input": [
      "[document]"
    ]
  }'

Example:

Terminal window
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer eyNSksHSus78h2kshdfsd7878shjkdlkdc" \
  -d '{
    "model": "intfloat/e5-mistral-7b-instruct",
    "input": [
      "The API is fast and reliable.",
      "The system reacts just in time and is stable."
    ]
  }'

The model will respond with the embeddings:

{
  "id": "embd-96d405966aa14e8eb3d7e202a006e2cf",
  "object": "list",
  "created": 1262540,
  "model": "intfloat/e5-mistral-7b-instruct",
  "data": [
    {
      "index": 0,
      "object": "embedding",
      "embedding": [0.0167388916015625, 0.005096435546875, 0.01302337646484375, 0.006805419921875, 0.0089569091796875, -0.01406097412109375, ...]
    },
    {
      "index": 1,
      "object": "embedding",
      "embedding": [0.0167388916015625, 0.0050543545546875, 0.01302337646484375, 0.006805419921875, 0.0089568951796875, -0.01406097412109375, ...]
    }
  ],
  "usage": {
    "prompt_tokens": 3,
    "total_tokens": 3,
    "completion_tokens": 0
  }
}
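
The vectors in `data[...].embedding` can be compared directly; for example, the semantic closeness of the two example sentences shows up as a high cosine similarity. A small stdlib-only Python sketch (the short vectors below are illustrative stand-ins for the truncated model output, not real embeddings):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Illustrative stand-ins for data[0].embedding and data[1].embedding:
emb_0 = [0.0167388916015625, 0.005096435546875, 0.01302337646484375]
emb_1 = [0.0167388916015625, 0.0050543545546875, 0.01302337646484375]

similarity = cosine_similarity(emb_0, emb_1)  # close to 1.0 for similar inputs
```

Note that the e5 family of models produces normalized-comparable embeddings, so cosine similarity is the usual choice for ranking documents against a query.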

Multimodal embedding models can be used just like text-based embedding models. In addition, they can embed images: they extend the OpenAI specification by supporting chat messages on the embeddings endpoint.

Terminal window
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer [auth-token]" \
  -d '{
    "model": "[model]",
    "messages": [
      {"role": "system", "content": "[system-prompt]"},
      {"role": "user", "content": [
        {"type": "text", "text": "[user-message]"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,[image-encoded]"}}
      ]}
    ]
  }'

Example: load the image from a file (on macOS) and then embed it:

Terminal window
image_encoded=$(base64 -i image.jpg)
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer eyNSksHSus78h2kshdfsd7878shjkdlkdc" \
  -d '{
    "model": "Qwen/Qwen3-VL-Embedding-8B",
    "messages": [
      {"role": "system", "content": "Represent the user input."},
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "This image is part of the document x that explains feature y of the machine z."
          },
          {
            "type": "image_url",
            "image_url": {"url": "data:image/jpeg;base64,'"${image_encoded}"'"}
          }
        ]
      }
    ]
  }'
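
The `base64 -i image.jpg` step above is macOS-specific. A portable way to produce the same data URL is sketched below in Python; `image_to_data_url` is a hypothetical helper, and the JPEG bytes are a stand-in for a real file read:

```python
import base64

def image_to_data_url(image_bytes, mime_type="image/jpeg"):
    """Encode raw image bytes as a data URL for the image_url content part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

# In practice: image_bytes = open("image.jpg", "rb").read()
data_url = image_to_data_url(b"\xff\xd8\xff\xe0")  # JPEG magic bytes as a stand-in
```

The returned string drops into the `image_url.url` field exactly as the shell variable does in the curl example.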

The model responds with the corresponding embeddings:

{
  "id": "embd-1a434cef-62ee-9ca8-9588-154adcdd27e2",
  "object": "list",
  "created": 1769180408,
  "model": "Qwen/Qwen3-VL-Embedding-8B",
  "data": [
    {
      "index": 0,
      "object": "embedding",
      "embedding": [0.0167388916015625, 0.005096435546875, 0.01302337646484375, 0.006805419921875, 0.0089569091796875, -0.01406097412109375, ...]
    }
  ],
  "usage": {
    "prompt_tokens": 1295,
    "total_tokens": 1295,
    "completion_tokens": 0,
    "prompt_tokens_details": null
  }
}