Use the models
Prerequisites
Before you can use any model of STACKIT AI Model Serving, you need the following:
- A STACKIT AI Model Serving auth token. See Manage auth tokens to create one.
Use the models
You can use all of the Shared Models via the API. STACKIT AI Model Serving provides an OpenAI-compatible API, making it easy to integrate with existing tools and libraries. Consult the OpenAI API documentation for additional parameters and detailed information.
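Because the API follows the OpenAI wire format, any HTTP client can target it. As a library-free illustration, this sketch assembles (but does not send) a request object for the chat endpoint with Python's standard library; the token and model values are placeholders, and the `chat_request` helper is hypothetical, not part of any SDK:

```python
import json
import urllib.request

BASE_URL = "https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1"

def chat_request(auth_token, body):
    # Same endpoint, headers, and JSON body as the curl examples below;
    # send with urllib.request.urlopen(req) when ready.
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {auth_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = chat_request("[auth-token]", {"model": "[model]", "messages": []})
print(req.full_url)
```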
Use chat models
| Parameter | Meaning | Example |
|---|---|---|
| auth-token | The AI Model Serving auth token | BZasjkdasbu… |
| model | The model you want to use. | cortecs/Lla… |
| system-prompt | The instruction for the model prior to the chat | You are a h… |
| user-message | The message the user asks the model | Hey, please… |
| assistant-message | The message the chat model gave | Ok, thanks … |
| max-completion-tokens | The maximum length of the model’s answer in tokens | 250 |
| temperature | Defines the entropy of the model. A higher value means more creativity. | 0.1 |
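The `messages` array carries the conversation history: the system prompt first, then alternating user and assistant turns, ending with the newest user message. A minimal sketch of assembling such a history (the `build_messages` helper and all message contents are made-up illustrations, not part of the API):

```python
# Build the multi-turn "messages" array used by the chat completions endpoint.
# Role sequence: system prompt, then alternating user/assistant turns,
# ending with the latest user message.
def build_messages(system_prompt, turns, new_user_message):
    messages = [{"role": "system", "content": system_prompt}]
    for user_msg, assistant_msg in turns:
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": assistant_msg})
    messages.append({"role": "user", "content": new_user_message})
    return messages

history = build_messages(
    "You are a helpful assistant.",                            # system-prompt
    [("Hey, please explain tokens.", "Ok, tokens are ...")],   # prior turns
    "And what does temperature do?",                           # user-message
)
print([m["role"] for m in history])  # ['system', 'user', 'assistant', 'user']
```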
```shell
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/chat/completions \
  -H "Authorization: Bearer [auth-token]" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "[model]",
    "messages": [
      {"role": "system", "content": "[system-prompt]"},
      {"role": "user", "content": "[user-message]"},
      {"role": "assistant", "content": "[assistant-message]"},
      {"role": "user", "content": "[user-message]"}
    ],
    "max_completion_tokens": [max-completion-tokens],
    "temperature": 0.1
  }'
```
Use embedding models
| Parameter | Meaning | Example |
|---|---|---|
| auth-token | The AI Model Serving auth token | BZasjkdasbu… |
| document | A document, must be a string | The API is fast and reliable |
| model | The model you want to use. | intfloat/e5-mistral-7b-instruct |
```shell
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer [auth-token]" \
  -d '{
    "model": "[model]",
    "input": ["[document]"]
  }'
```
Example:
```shell
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer eyNSksHSus78h2kshdfsd7878shjkdlkdc" \
  -d '{
    "model": "intfloat/e5-mistral-7b-instruct",
    "input": [
      "The API is fast and reliable.",
      "The system reacts just in time and is stable."
    ]
  }'
```
The model will answer with the embeddings:
```json
{
  "id": "embd-96d405966aa14e8eb3d7e202a006e2cf",
  "object": "list",
  "created": 1262540,
  "model": "intfloat/e5-mistral-7b-instruct",
  "data": [
    {
      "index": 0,
      "object": "embedding",
      "embedding": [0.0167388916015625, 0.005096435546875, 0.01302337646484375, 0.006805419921875, 0.0089569091796875, -0.01406097412109375, ...]
    },
    {
      "index": 1,
      "object": "embedding",
      "embedding": [0.0167388916015625, 0.0050543545546875, 0.01302337646484375, 0.006805419921875, 0.0089568951796875, -0.01406097412109375, ...]
    }
  ],
  "usage": {
    "prompt_tokens": 3,
    "total_tokens": 3,
    "completion_tokens": 0
  }
}
```
Use multi-modal embedding models
Multi-modal embedding models can be used just like text-only embedding models, but additionally offer the capability to embed images by extending the OpenAI specification with chat message input support for the embeddings endpoint.
| Parameter | Meaning | Example |
|---|---|---|
| auth-token | The AI Model Serving auth token | BZasjkdasbu… |
| model | The model you want to use | Qwen/Qwen3-VL-Embed… |
| system-prompt | The instruction for the model | Represent the… |
| user-message | The message the user asks the model | Explain the image… |
| image-encoded | The image, base64-encoded | /9j/4AAQSkZJR… |
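The `image-encoded` value is the raw image file encoded as base64 and wrapped in a `data:` URL. A minimal Python sketch of that encoding (the `image_to_data_url` helper is an illustration, and any file name you pass it is a placeholder):

```python
import base64

def image_to_data_url(image_bytes, mime_type="image/jpeg"):
    # Base64-encode the raw bytes and wrap them in a data URL,
    # matching the "data:image/jpeg;base64,..." format used below.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

# With a real file you would use:
#   image_to_data_url(open("image.jpg", "rb").read())
print(image_to_data_url(b"\xff\xd8\xff\xe0")[:40])
```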
```shell
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer [auth-token]" \
  -d '{
    "model": "[model]",
    "messages": [
      {"role": "system", "content": "[system-prompt]"},
      {"role": "user", "content": [
        {"type": "text", "text": "[user-message]"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,[image-encoded]"}}
      ]}
    ]
  }'
```
Example - loading the image (on macOS) from file and embedding it:
```shell
image_encoded=$(base64 -i image.jpg)
curl -X POST \
  https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer eyNSksHSus78h2kshdfsd7878shjkdlkdc" \
  -d '{
    "model": "Qwen/Qwen3-VL-Embedding-8B",
    "messages": [
      {
        "role": "system",
        "content": "Represent the user input."
      },
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "This image is part of the document x that explains feature y of the machine z."
          },
          {
            "type": "image_url",
            "image_url": {"url": "data:image/jpeg;base64,'"${image_encoded}"'"}
          }
        ]
      }
    ]
  }'
```
The model will answer with the embeddings:
```json
{
  "id": "embd-1a434cef-62ee-9ca8-9588-154adcdd27e2",
  "object": "list",
  "created": 1769180408,
  "model": "Qwen/Qwen3-VL-Embedding-8B",
  "data": [
    {
      "index": 0,
      "object": "embedding",
      "embedding": [0.0167388916015625, 0.005096435546875, 0.01302337646484375, 0.006805419921875, 0.0089569091796875, -0.01406097412109375, ...]
    }
  ],
  "usage": {
    "prompt_tokens": 1295,
    "total_tokens": 1295,
    "completion_tokens": 0,
    "prompt_tokens_details": null
  }
}
```
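The vectors in `data[*].embedding` returned by either embeddings endpoint can be compared to one another, for example with cosine similarity, to rank documents or images against a query. A minimal standard-library sketch (the short vectors are made-up illustration values, not real model output):

```python
import math

def cosine_similarity(a, b):
    # cos(a, b) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Illustration values only; real embeddings have thousands of dimensions.
v1 = [0.0167, 0.0051, 0.0130]
v2 = [0.0167, 0.0051, 0.0130]
print(cosine_similarity(v1, v2))  # identical vectors give a similarity of ~1.0
```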