Integration of tool calling

This tutorial demonstrates how to use LLMs with tool calling in the LangChain framework. It covers a setup that includes several invocations of an LLM and the invocation of a tool, requested by the model, in order to complete a task. For a first introduction to the framework, the LangChain expression language (basic usage) tutorial is recommended.

The purpose of this tutorial is to demonstrate how to provide arbitrary user-defined tools to an LLM.

The rapid advancement of LLMs has revolutionized the field of natural language processing, enabling machines to understand and generate human-like text. To further expand their capabilities, tool calling - also known as function calling - has emerged as a powerful technique. By leveraging the strengths of LLMs and combining them with the capabilities of external tools and services, we can unlock a vast range of new possibilities and applications. Tool calling enables LLMs to tap into the vast resources of the internet, databases, and other specialized services, effectively bridging the gap between language understanding and real-world applications. With tool calling, LLMs can:

  • Perform web searches to gather information, verify facts, and stay up-to-date with the latest developments
  • Access databases and knowledge bases to retrieve specific information, such as definitions, statistics, or historical data
  • Execute tasks that are challenging for LLMs, such as complex arithmetic operations on large numbers, data compression, or encryption
  • Leverage specialized tools and services, like language translators, sentiment analyzers, or text summarizers

By integrating tool calling into LLMs, we can create more versatile and powerful language models that can assist humans in a wide range of tasks, from research and writing to decision-making and problem-solving.

Setting up configuration to access STACKIT AI Model Serving

After creating a STACKIT AI Model Serving Auth Token, as described in Getting started with the product API, you can provide it as model_serving_auth_token. From the Available Shared Models, you can choose which model to use and provide the model’s name along with the base URL. These values are used to initialize a ChatOpenAI client.

import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI

load_dotenv("../.env")

# Select a chat model from https://support.docs.stackit.cloud/stackit/en/models-licenses-319914532.html
model = os.environ["STACKIT_MODEL_SERVING_MODEL"]
base_url = os.environ["STACKIT_MODEL_SERVING_BASE_URL"]  # e.g. "https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1"
model_serving_auth_token = os.environ["STACKIT_MODEL_SERVING_AUTH_TOKEN"]  # e.g. "ey..."

llm = ChatOpenAI(
    model=model,
    base_url=base_url,
    api_key=model_serving_auth_token,
)
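
To confirm that the client is configured correctly before any tools are involved, a plain invocation serves as a quick smoke test; the prompt here is just an illustrative placeholder, not part of the later tool-calling flow:

# Optional smoke test: a plain chat completion without any tools involved.
print(llm.invoke("Say hello in one short sentence.").content)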

This code segment defines a new tool, get_random_fibonacci_number, which returns one of the first k Fibonacci numbers at random. The @tool decorator registers the function as a tool, and the llm object’s bind_tools method then returns a new llm_with_tools client with the tool attached.

import random
from langchain_core.tools import tool

@tool
def get_random_fibonacci_number(k: int = 20) -> int:
    """Provide randomly one of the first `k` Fibonacci numbers."""
    fib = [0, 1]
    while len(fib) < k:
        fib.append(fib[-2] + fib[-1])
    return random.choice(fib)

llm_with_tools = llm.bind_tools([get_random_fibonacci_number])

From now on, the llm_with_tools client advertises the tool to the model, and the model can request a call to it in its responses. Note that binding does not execute the tool automatically; the requested call still has to be carried out by the application, as demonstrated in the following sections.
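
To check what the model will actually see, the tool’s metadata can be inspected; LangChain derives it from the function’s signature and docstring:

print(get_random_fibonacci_number.name)         # "get_random_fibonacci_number"
print(get_random_fibonacci_number.description)  # the docstring of the function
print(get_random_fibonacci_number.args)         # JSON schema of the expected arguments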

Initializing conversation and invoking a tool

This code segment initializes a conversation with the model using a system message and a human message. The model is then invoked with these messages, and the response is stored in the tool_call_response variable. The tool_calls attribute of this response contains information about the tools that were called during the invocation.

from langchain_core.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage(
        "You are a helpful AI bot, provided with some tools. Use those if needed. "
        "Strike a balance between leveraging the provided tools to enhance your response "
        "and avoiding unnecessary tool usage. Use the tools when they can provide "
        "significant benefits, but prioritize a simple and direct response when possible."
    ),
    HumanMessage("Please provide me a random Fibonacci number. One of the first seventy would be suitable."),
]

tool_call_response = llm_with_tools.invoke(messages)
print(tool_call_response.tool_calls)
# Output
#> [{
#>     'name': 'get_random_fibonacci_number',
#>     'args': {'k': '70'},
#>     'id': 'chatcmpl-tool-62434994517042e095daeae2251d562e',
#>     'type': 'tool_call'
#> }]

The output shows how tool calls are structured. The name of the tool and the expected arguments are known to the model through the binding of the tools. The id allows the results of the tool calls to be mapped back to the requests, and the values of the arguments are derived from the actual user request. Such a list can easily contain multiple tool calls, depending on the known tools and the user’s request; a generic way to resolve them all is sketched below.
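
As a generalization of the single call handled in the next section, each entry in tool_calls can be dispatched by name against a registry of the bound tools. A minimal sketch, assuming only the tool defined above:

from langchain_core.messages import ToolMessage

# Sketch: resolve every requested tool call into a ToolMessage.
tools_by_name = {get_random_fibonacci_number.name: get_random_fibonacci_number}
tool_messages = [
    ToolMessage(
        str(tools_by_name[call["name"]].invoke(call["args"])),
        tool_call_id=call["id"],
    )
    for call in tool_call_response.tool_calls
]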

Use the tool call response to generate a final answer

This code segment appends the tool_call_response to the list of messages and creates a new ToolMessage that carries the result of invoking the get_random_fibonacci_number tool with the arguments from the previous tool call. The model is then invoked again with the updated list of messages, and the final_response of the model uses the tool’s result to answer the initial request.

from langchain_core.messages import ToolMessage

messages.append(tool_call_response)
messages.append(
    ToolMessage(
        # ToolMessage content must be a string, so the tool's integer result is converted.
        str(get_random_fibonacci_number.invoke(input=tool_call_response.tool_calls[0]["args"])),
        tool_call_id=tool_call_response.tool_calls[0]["id"],
    )
)

final_response = llm_with_tools.invoke(messages)
print(final_response.content)
# Output
#> 63245986 is an appropriately large Fibonacci number.
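
The two invocations above can be generalized into a loop that keeps resolving tool calls until the model answers directly, which is the core of most agent-style setups. A minimal sketch built on the objects defined earlier:

# Sketch: invoke the model repeatedly until it answers without tool calls.
tools_by_name = {get_random_fibonacci_number.name: get_random_fibonacci_number}
response = llm_with_tools.invoke(messages)
while response.tool_calls:
    messages.append(response)
    for call in response.tool_calls:
        result = tools_by_name[call["name"]].invoke(call["args"])
        messages.append(ToolMessage(str(result), tool_call_id=call["id"]))
    response = llm_with_tools.invoke(messages)
print(response.content)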

Tool calling is a powerful feature that unlocks the full potential of modern large language models. By allowing LLMs to access external tools and services, we can create more versatile and effective language models that can assist humans in a wide range of tasks. The possibilities are endless, from improving language translation and text summarization to enabling LLMs to perform complex tasks like data analysis and decision-making.

In the context of agents and agentic systems, tool calling plays a crucial role in enabling LLMs to interact with their environment, access knowledge and resources, and perform tasks that are beyond their built-in capabilities. As we continue to develop more advanced LLMs and agentic systems, tool calling will become an essential component of these architectures, enabling them to learn, adapt, and interact with the world in more sophisticated ways.