Use the models
Prerequisites
Before you can use any model of STACKIT AI Model Serving, you need the following:
- A STACKIT AI Model Serving auth token. See Manage auth tokens to create one.
Use the models
You can use all of the Shared Models via the API. STACKIT AI Model Serving provides an OpenAI-compatible API, making it easy to integrate with existing tools and libraries. Consult the OpenAI API documentation for additional parameters and detailed information.
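Because the API follows the OpenAI wire format, any HTTP client can target it. As a library-free illustration, this sketch assembles (but does not send) a request object for the chat endpoint with Python's standard library; the token and model values are placeholders, and the `chat_request` helper is hypothetical, not part of any SDK:

```python
import json
import urllib.request

BASE_URL = "https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1"

def chat_request(auth_token, body):
    # Same endpoint, headers, and JSON body as the curl examples below;
    # send with urllib.request.urlopen(req) when ready.
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {auth_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = chat_request("[auth-token]", {"model": "[model]", "messages": []})
print(req.full_url)
```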
Use chat models
| Parameter | Meaning | Example |
|---|---|---|
| auth-token | The AI Model Serving auth token | BZasjkdasbu… |
| model | The model you want to use. | cortecs/Lla… |
| system-prompt | The instruction for the model prior to the chat | You are a h… |
| user-message | The message the user asks the model | Hey, please… |
| assistant-message | The message the chat model gave | Ok, thanks … |
| max-completion-tokens | The maximum length of the model’s answer in tokens | 250 |
| temperature | Defines the entropy of the model. A higher value means more creativity. | 0.1 |
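The `messages` array carries the conversation history: the system prompt first, then alternating user and assistant turns, ending with the newest user message. A minimal sketch of assembling such a history (the `build_messages` helper and all message contents are made-up illustrations, not part of the API):

```python
# Build the multi-turn "messages" array used by the chat completions endpoint.
# Role sequence: system prompt, then alternating user/assistant turns,
# ending with the latest user message.
def build_messages(system_prompt, turns, new_user_message):
    messages = [{"role": "system", "content": system_prompt}]
    for user_msg, assistant_msg in turns:
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": assistant_msg})
    messages.append({"role": "user", "content": new_user_message})
    return messages

history = build_messages(
    "You are a helpful assistant.",                            # system-prompt
    [("Hey, please explain tokens.", "Ok, tokens are ...")],   # prior turns
    "And what does temperature do?",                           # user-message
)
print([m["role"] for m in history])  # ['system', 'user', 'assistant', 'user']
```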
```shell
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/chat/completions \
  -H "Authorization: Bearer [auth-token]" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "[model]",
    "messages": [
      {"role": "system", "content": "[system-prompt]"},
      {"role": "user", "content": "[user-message]"},
      {"role": "assistant", "content": "[assistant-message]"},
      {"role": "user", "content": "[user-message]"}
    ],
    "max_completion_tokens": [max-completion-tokens],
    "temperature": 0.1
  }'
```
Use embedding models
| Parameter | Meaning | Example |
|---|---|---|
| auth-token | The AI Model Serving auth token | BZasjkdasbu… |
| document | A document, must be a string | The API is fast and reliable |
| model | The model you want to use. | intfloat/e5-mistral-7b-instruct |
```shell
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer [auth-token]" \
  -d '{
    "model": "[model]",
    "input": ["[document]"]
  }'
```
Example:
```shell
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer eyNSksHSus78h2kshdfsd7878shjkdlkdc" \
  -d '{
    "model": "intfloat/e5-mistral-7b-instruct",
    "input": [
      "The API is fast and reliable.",
      "The system reacts just in time and is stable."
    ]
  }'
```
The model will answer with the embeddings:
```json
{
  "id": "embd-96d405966aa14e8eb3d7e202a006e2cf",
  "object": "list",
  "created": 1262540,
  "model": "intfloat/e5-mistral-7b-instruct",
  "data": [
    {
      "index": 0,
      "object": "embedding",
      "embedding": [0.0167388916015625, 0.005096435546875, 0.01302337646484375, 0.006805419921875, 0.0089569091796875, -0.01406097412109375, ...]
    },
    {
      "index": 1,
      "object": "embedding",
      "embedding": [0.0167388916015625, 0.0050543545546875, 0.01302337646484375, 0.006805419921875, 0.0089568951796875, -0.01406097412109375, ...]
    }
  ],
  "usage": {
    "prompt_tokens": 3,
    "total_tokens": 3,
    "completion_tokens": 0
  }
}
```
Use multi-modal embedding models
Multi-modal embedding models can be used just like text-only embedding models, but additionally offer the capability to embed images by extending the OpenAI specification with chat message input support for the embeddings endpoint.
| Parameter | Meaning | Example |
|---|---|---|
| auth-token | The AI Model Serving auth token | BZasjkdasbu… |
| model | The model you want to use | Qwen/Qwen3-VL-Embed… |
| system-prompt | The instruction for the model | Represent the… |
| user-message | The message the user asks the model | Explain the image… |
| image-encoded | The image, base64-encoded | /9j/4AAQSkZJR… |
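The `image-encoded` value is the raw image file encoded as base64 and wrapped in a `data:` URL. A minimal Python sketch of that encoding (the `image_to_data_url` helper is an illustration, and any file name you pass it is a placeholder):

```python
import base64

def image_to_data_url(image_bytes, mime_type="image/jpeg"):
    # Base64-encode the raw bytes and wrap them in a data URL,
    # matching the "data:image/jpeg;base64,..." format used below.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

# With a real file you would use:
#   image_to_data_url(open("image.jpg", "rb").read())
print(image_to_data_url(b"\xff\xd8\xff\xe0")[:40])
```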
```shell
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer [auth-token]" \
  -d '{
    "model": "[model]",
    "messages": [
      {"role": "system", "content": "[system-prompt]"},
      {"role": "user", "content": [
        {"type": "text", "text": "[user-message]"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,[image-encoded]"}}
      ]}
    ]
  }'
```
Example - loading the image (on macOS) from file and embedding it:
```shell
image_encoded=$(base64 -i image.jpg)
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer eyNSksHSus78h2kshdfsd7878shjkdlkdc" \
  -d '{
    "model": "Qwen/Qwen3-VL-Embedding-8B",
    "messages": [
      {
        "role": "system",
        "content": "Represent the user input."
      },
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "This image is part of the document x that explains feature y of the machine z."
          },
          {
            "type": "image_url",
            "image_url": {"url": "data:image/jpeg;base64,'"${image_encoded}"'"}
          }
        ]
      }
    ]
  }'
```
The model will answer with the embeddings:
```json
{
  "id": "embd-1a434cef-62ee-9ca8-9588-154adcdd27e2",
  "object": "list",
  "created": 1769180408,
  "model": "Qwen/Qwen3-VL-Embedding-8B",
  "data": [
    {
      "index": 0,
      "object": "embedding",
      "embedding": [0.0167388916015625, 0.005096435546875, 0.01302337646484375, 0.006805419921875, 0.0089569091796875, -0.01406097412109375, ...]
    }
  ],
  "usage": {
    "prompt_tokens": 1295,
    "total_tokens": 1295,
    "completion_tokens": 0,
    "prompt_tokens_details": null
  }
}
```
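The vectors in `data[*].embedding` returned by either embeddings endpoint can be compared to one another, for example with cosine similarity, to rank documents or images against a query. A minimal standard-library sketch (the short vectors are made-up illustration values, not real model output):

```python
import math

def cosine_similarity(a, b):
    # cos(a, b) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Illustration values only; real embeddings have thousands of dimensions.
v1 = [0.0167, 0.0051, 0.0130]
v2 = [0.0167, 0.0051, 0.0130]
print(cosine_similarity(v1, v2))  # identical vectors give a similarity of ~1.0
```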