Integrate STACKIT AI Model Serving with other applications
As STACKIT AI Model Serving exposes an OpenAI‑compatible API, it can be integrated into many popular applications and other software.
This tutorial shows how to configure sample applications to use STACKIT AI Model Serving.
In general, configure the following components:
- API base URL: https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1
- API key / authentication token / secret key: a STACKIT AI Model Serving Auth Token (see STACKIT Portal or Manage auth tokens to create an Auth Token in the STACKIT Portal)
STACKIT AI Model Serving offers different model types (for example, chat models and embedding models) that are not interchangeable. Chat applications should use chat models, not embedding models. Vector embeddings (for use in RAG applications or knowledge retrieval) can only be computed by embedding models.
Refer to Available shared models for model types and configure your application accordingly.
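Because the API is OpenAI-compatible, you can verify the connection details directly with the official OpenAI Python SDK before configuring any application. The following is a minimal sketch; the model names are placeholders to be replaced with entries from Available shared models.

```python
from openai import OpenAI

# Point the standard OpenAI client at STACKIT AI Model Serving.
client = OpenAI(
    base_url="https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1",
    api_key="<your STACKIT AI Model Serving Auth Token>",
)

# Chat requests require a chat model ...
chat = client.chat.completions.create(
    model="<chat model name>",  # placeholder, see Available shared models
    messages=[{"role": "user", "content": "Hello!"}],
)
print(chat.choices[0].message.content)

# ... while vector embeddings require an embedding model.
embedding = client.embeddings.create(
    model="<embedding model name>",  # placeholder, see Available shared models
    input="Text to embed for RAG or knowledge retrieval",
)
print(len(embedding.data[0].embedding))
```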
Integration examples
This guide focuses on integration only and assumes the application is installed and running.
The screenshots are from the latest versions of the applications at the time of writing. They may differ in other versions, but the general integration steps should remain the same.
Read instructions on integrating STACKIT AI Model Serving with the following third‑party tools:
Open WebUI is an open‑source, self‑hosted AI platform. See the Open WebUI official docs for details, or reach out via the STACKIT Help Center for custom deployments directly on STACKIT Cloud.
This guide focuses on integration only and assumes the following requirements are met:
- A running instance of Open WebUI with administrator access
- A STACKIT AI Model Serving Auth Token (visit STACKIT Portal or see Manage auth tokens to create an Auth Token in the STACKIT Portal)
Configuration steps
- Open the admin panel: click the account icon (top right corner) and select Admin panel from the drop-down.

- Select the Settings tab, then click the Connections entry.

- Ensure that the OpenAI API is enabled (toggle on the right side). Click the + button to add the STACKIT AI Model Serving API.
- In the dialog, enter the following connection details:
  - URL: https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1
  - Key: <your STACKIT AI Model Serving Auth Token>
- Verify the connection by clicking the button shown in the image. On success, save the new connection. On error, review your configuration and refer to the FAQ; you can also verify the connection from the command line, as sketched after these steps.

- In the Settings tab, select the Models entry and verify that only chat models are enabled. If needed, disable models of other types as shown in the image.

- Go back to the main page, select your model of choice from the drop-down list and start using it.

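If the connection test in Open WebUI fails, you can check the base URL and Auth Token outside the application. A minimal sketch using the OpenAI Python SDK that lists the models served by the endpoint:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1",
    api_key="<your STACKIT AI Model Serving Auth Token>",
)

# A successful listing confirms that the URL and Auth Token entered
# in Open WebUI are correct; the model IDs printed are the names to use.
for model in client.models.list():
    print(model.id)
```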
Configuration steps for setting up speech-to-text
- Open the admin panel: click the account icon (top right corner) and select Admin panel from the drop-down.

- Select the Settings tab, then click the Audio entry.
- Ensure that the OpenAI API is selected as Speech‑to‑Text Engine.
- In the dialog, enter the following connection details:
  - URL: https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1
  - Key: <your STACKIT AI Model Serving Auth Token>
- Select the STT Model.
- Click the Save button and start using it. You can also test the speech-to-text setup directly against the API, as sketched after these steps.

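To test speech-to-text independently of Open WebUI, you can send an audio file to the same endpoint. A minimal sketch, assuming the endpoint exposes the OpenAI-compatible transcription route; the model name and audio file are placeholders:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1",
    api_key="<your STACKIT AI Model Serving Auth Token>",
)

# Transcribe a local audio file with the STT model selected above.
with open("sample.wav", "rb") as audio_file:  # placeholder file
    transcript = client.audio.transcriptions.create(
        model="<your STT model name>",  # placeholder
        file=audio_file,
    )
print(transcript.text)
```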
AnythingLLM is a full-stack, open-source application that turns any document, resource, or piece of content into context that any LLM can use as a reference during chats. It lets you choose which LLM or vector database to use and supports multi-user management and permissions.
This guide focuses on integration only and assumes the following requirements are met:
- A running installation of AnythingLLM (get it from the AnythingLLM official website)
- A STACKIT AI Model Serving Auth Token (visit STACKIT Portal or see Manage auth tokens to create an auth token in the STACKIT Portal)
Configuration steps
- Initial LLM setup with STACKIT AI Model Serving: search for openai and select the Generic OpenAI provider.
- In the form, enter the following provider details:
  - Base URL: https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1
  - API key: <your STACKIT AI Model Serving Auth Token>
  - Chat Model name: choose any chat model from our shared instances and enter the model name (see Available Shared Models)
  - Token context window: must not exceed the context length of the selected chat model, as defined in Available Shared Models
  - Max tokens: the maximum number of tokens generated in a chat request, including prompt tokens from context; it must not exceed the context length of the selected chat model, as defined in Available Shared Models (see the sketch after these steps)
- Example of the chat provider setup for STACKIT AI Model Serving

- Finish the initial setup with the default values. At this point, chat should work. Here is an example:

- Set up the Embedder for STACKIT AI Model Serving by opening Settings → Embedder and clicking the currently active embedding provider.

- In the popup, again search for openai and select the Generic OpenAI provider.
- In the form, enter the following provider details:
  - Base URL: https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1
  - Embedding Model: choose any embedding model from our shared instances and enter the model name (see Available Shared Models)
  - Max embedding chunk length: must not exceed the maximum input tokens of the selected embedding model, as defined in Available Shared Models (see the sketch after these steps)
  - API Key: <your STACKIT AI Model Serving Auth Token>
- Example of the embedding provider setup for STACKIT AI Model Serving

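To illustrate how the Token context window and Max tokens settings relate, the following sketch sends a chat request with an explicit max_tokens value. The context length and model name are placeholders taken from Available Shared Models for your chosen chat model.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1",
    api_key="<your STACKIT AI Model Serving Auth Token>",
)

CONTEXT_LENGTH = 32768  # placeholder: context length of your chat model

response = client.chat.completions.create(
    model="<chat model name>",  # placeholder
    messages=[{"role": "user", "content": "Summarize the benefits of RAG."}],
    # Prompt tokens and generated tokens together must fit into the model's
    # context length, so max_tokens has to leave room for the prompt.
    max_tokens=1024,
)
print(response.usage)  # usage.total_tokens must stay below CONTEXT_LENGTH
```

The Max embedding chunk length matters because AnythingLLM splits documents into chunks before embedding them, and each chunk must fit into the embedding model's input. A minimal sketch that embeds naive character-based chunks; a real splitter would count tokens, and the chunk size and model name here are placeholders.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1",
    api_key="<your STACKIT AI Model Serving Auth Token>",
)

document = "Your document text goes here."  # placeholder

# Naive character-based splitting as a stand-in for a token-aware splitter;
# every chunk must stay under the model's maximum input tokens.
CHUNK_CHARS = 2000  # placeholder, derived from the model's token limit
chunks = [document[i:i + CHUNK_CHARS] for i in range(0, len(document), CHUNK_CHARS)]

vectors = client.embeddings.create(
    model="<embedding model name>",  # placeholder
    input=chunks,
)
print(f"Embedded {len(vectors.data)} chunks")
```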
Cline is a VS Code extension that assists developers directly in their IDE with an autonomous coding agent. This section shows how to configure Cline with STACKIT AI Model Serving as the LLM provider for a sovereign AI developer assistant setup.
This guide focuses on integration only and assumes the following requirements are met:
- The Cline VS Code extension is installed (follow the Cline official installation instructions)
- A STACKIT AI Model Serving Auth Token (visit STACKIT Portal or see Manage auth tokens to create an Auth Token in the STACKIT Portal)
Configuration steps
- Open the Cline extension.
- Click the currently selected model at the very bottom of the Cline window (beside the Plan/Act switch).
- Select the OpenAI Compatible API Provider from the drop-down.
- Fill in the following information:
  - Base URL: https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1
  - API Key: <your STACKIT AI Model Serving Auth Token>
  - Model ID: select the most capable model from our Available Shared Models and enter its name here.
- Expand the Model Configuration section and set the Context Window Size to the model-specific value (from this list) for the model chosen before.
- The final configuration should look similar to this screenshot:

- Note that the Cline extension may occasionally fail to process complex requests; this is due to inherent limitations of current LLM capabilities.