LangChain expression language (basic usage)
This tutorial shows how to use large language models (LLMs) with the LangChain framework. To begin, we examine simple setups.
Configure access to STACKIT LLM instances
After you create a STACKIT AI Model Serving auth token (see Manage auth tokens or visit the STACKIT Portal), provide it as model_serving_auth_token. See Available shared models to choose a model, then provide the model name and the base URL.
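The environment variables read in the next snippet are typically kept in a .env file one directory above the working directory (as loaded below). A minimal sketch with placeholder values, not real credentials:

```
STACKIT_MODEL_SERVING_MODEL="<model-name-from-the-shared-models-list>"
STACKIT_MODEL_SERVING_BASE_URL="https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1"
STACKIT_MODEL_SERVING_AUTH_TOKEN="ey..."
```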
```python
import os

from dotenv import load_dotenv

load_dotenv("../.env")

model = os.environ["STACKIT_MODEL_SERVING_MODEL"]  # Select a chat model from https://support.docs.stackit.cloud/stackit/en/models-licenses-319914532.html
base_url = os.environ["STACKIT_MODEL_SERVING_BASE_URL"]  # For example: "https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1"
model_serving_auth_token = os.environ["STACKIT_MODEL_SERVING_AUTH_TOKEN"]  # For example: "ey..."
```

LangChain Expression Language (LCEL)
One popular way to use LLMs is via LangChain chains. These typically consist of three components:
- Prompt: Contains the instruction describing the task to accomplish.
- Model: The client that connects to the LLM server, handles authorisation, and holds LLM‑specific configuration.
- Output parser: Fetches the LLM response and returns it in the indicated format, such as a plain string or JSON (see the sketch after this list).
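For instance, swapping in a JSON parser changes the chain's return type from text to a Python object. A sketch, not tied to this tutorial's chain; JsonOutputParser ships with langchain_core:

```python
from langchain_core.output_parsers import JsonOutputParser

# The parser only converts the model's raw text; the prompt must still
# instruct the model to answer in JSON.
# e.g. json_chain = prompt | llm | JsonOutputParser()
```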
All components of a chain are Runnables. They can be piped for sequential execution. This is how complex chains are constructed.
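Because everything in a chain shares the Runnable interface, even a plain Python callable can join a pipeline once wrapped. A sketch, assuming RunnableLambda from langchain_core:

```python
from langchain_core.runnables import RunnableLambda

# Wrap an ordinary function so it can be piped like any other component
shout = RunnableLambda(lambda text: text.upper())

# It could then be appended to a chain, e.g.:
# chain = prompt | llm | StrOutputParser() | shout
```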
Since we focus on simple chains in this tutorial, the chain contains only the basic components. The ChatOpenAI instance uses default configuration (for example, temperature or presence penalty are not set). See the LangChain docs for a comprehensive list of options. For most use cases, the defaults are a good starting point.
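Should the defaults need adjusting, options can be passed at construction time. A sketch with arbitrary example values; temperature and max_tokens are standard ChatOpenAI parameters:

```python
from langchain_openai import ChatOpenAI

llm_configured = ChatOpenAI(
    model=model,
    base_url=base_url,
    api_key=model_serving_auth_token,
    temperature=0.2,  # Lower values yield more deterministic answers
    max_tokens=512,   # Cap the length of each response
)
```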
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model=model,
    base_url=base_url,
    api_key=model_serving_auth_token,
)
```

The following is a very basic, general-purpose chat prompt. Its first line is a system message indicating the behaviour the LLM should follow. The user then begins an interaction, which influences the upcoming response. In the last line, {demand} is a placeholder that is replaced when the chain is invoked. Note that all messages in this template generate billable tokens, not only the request given as the value of demand.
```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate([
    ("system", "You are a helpful AI bot."),
    ("human", "Hello, how are you doing?"),
    ("ai", "I'm doing well, thanks!"),
    ("human", "{demand}"),
])
```

Finally, instantiate a basic output parser as the last runnable and concatenate the components via the pipe operator |. This indicates sequential execution, with each component's input being the previous component's output.
```python
from langchain_core.output_parsers.string import StrOutputParser

chain = prompt | llm | StrOutputParser()
```

Invoke the chain
With the invoke method, the chain is executed for one specific input, given as a dictionary. Each placeholder in the prompt must appear as a key in the input dictionary. Additional keys are ignored, allowing input for components added later in a chain. See LangChain expression language (advanced usage) to dive deeper into that topic.
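To illustrate, the hypothetical extra key in this sketch is simply discarded, because the prompt only consumes the demand placeholder:

```python
# "audience" is a made-up extra key; the template ignores it
chain.invoke({"demand": "Ask me a riddle.", "audience": "children"})
```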
```python
answer = chain.invoke({"demand": "Ask me a riddle."})
print(answer)
```
```
# Output
#> Here’s a riddle for you:
#>
#> I am always coming but never arrive, I have a head but never hair, I have a bed but never sleep, I have a mouth but never speak.
#> What am I?
#>
#> Think you can solve it?
```

Invoke batch processing on chain
Besides invoke, batch is useful: it triggers multiple executions of a chain on several inputs. Chain requests can also be streamed or executed asynchronously, which makes the interface versatile.
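A sketch of those variants, assuming the standard Runnable methods stream and ainvoke:

```python
import asyncio

# Stream the answer chunk by chunk as it is generated
for chunk in chain.stream({"demand": "Ask me a riddle."}):
    print(chunk, end="", flush=True)

# Run the same request asynchronously
answer = asyncio.run(chain.ainvoke({"demand": "Ask me a riddle."}))
```

The batch call below, in contrast, executes all inputs in one blocking round trip: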
```python
batch_answers = chain.batch([
    "Tell me a dad joke. Make sure it is not too funny.",
    "Ask me a riddle.",
    "Tell me a joke.",
])

for answer in batch_answers:
    print(answer)
```
```
# Output
#> Here’s a dad joke for you: Why did the coffee file a police report? Because it got mugged.
#>
#> Here’s one:
#> I am always coming but never arrive, I have a head but never hair, I have a bed but never sleep, I have a mouth but never speak.
#> What am I?
#> (Let me know if you need a hint or if you think you have the answer!)
#>
#> Here’s one: What do you call a fake noodle?
#> An impasta!
```

Code checking chain
In this example, a more sophisticated system prompt is used. The following code_prompt sets the frame for the upcoming tasks, in this case supporting Python development. The prompt has two placeholders: function and demand. The former represents the source code to which the latter demand refers. This prompt is versatile because the user may pose any demand.
```python
code_prompt = ChatPromptTemplate([
    ("system", "You are an experienced Python developer and provide support for all coding-related tasks."),
    ("human", "Familiarise yourself with the following function; it will be the subject of your task: {function}"),
    ("ai", "Very well, I am ready to support you."),
    ("human", "{demand}"),
])

code_chain = code_prompt | llm | StrOutputParser()
```

Following are the functions to examine. The source code does not adhere to common guidelines, including:
- Descriptive names for functions and variables
- Docstrings
- Type hints
- Expressive error messages
We use our code checking chain to improve this situation. Such helper functions are often found in a utils module. The function bar is a plausible example of a less documented I/O function that asserts properties of incoming data but does so in an opaque way.
```python
from pathlib import Path


def foo(n, k=1):
    if n == 0:
        return k
    return foo(n-1, k*n)


def bar(p):
    baz = ["Frodo", "Sam", "Aragorn", "Gimli", "Gandalf"]
    p = Path.cwd().parent / Path(p)
    if p.exists():
        if p.is_file():
            if p.suffix == ".txt":
                with p.open(mode="r") as t:
                    a = t.read()
                    b = [_ in a for _ in baz]
                    if any(b):
                        return a
    raise RuntimeError()
```

Provide a helper to process code-related tasks in a single-input, many-instructions way, using batch for blocking execution of all tasks.
```python
import inspect
from typing import Callable, List


def single_target_multiple_tasks(target_function: Callable, tasks: List[str]) -> None:
    """Batch `tasks` on `target_function` using the code_chain."""
    responses = code_chain.batch(
        [
            {
                "function": inspect.getsource(target_function),  # Source code of the function to examine
                "demand": task,  # The actual examination/task to perform
            }
            for task in tasks
        ]
    )
    for response in responses:
        print(response)
```

Now define the tasks for the LLM. These prompts address three common needs:
- What does this function do?
- Document this function.
- Provide a unit test.
Do not expect these needs to be met perfectly every time, especially with complex or large source code. In most cases, though, the result is sufficient or at least reduces boilerplate. Very complex needs can be addressed with carefully crafted prompts. See the Prompt engineering guide for more information.
```python
requests = [
    "Give a brief explanation of the function's actual behaviour. Do not provide a detailed breakdown.",
    "Provide an alternative, descriptive name for the function. Additionally, provide a NumPy-style docstring.",
    "Provide a unit test for this function.",
]

single_target_multiple_tasks(target_function=foo, tasks=requests)
single_target_multiple_tasks(target_function=bar, tasks=requests)
```
```
# Output
#> This function checks if a specified file exists in the parent directory of the current working directory, reads its content if it’s a .txt file, and returns the content if any of the names from a predefined list (containing “Frodo”, “Sam”, “Aragorn”, “Gimli”, “Gandalf”) are found in the file’s content. If not, it raises a RuntimeError.
#>
#>
#> Here is an alternative name for the function along with a docstring in the NumPy style:
#> ```
#> def read_matching_text_file(p):
#>     """
#>     Read a text file if it contains any of the specified Fellowship members.
#>
#>     Parameters
#>     ----------
#>     p : str or Path
#>         Path to the text file to be read.
#>
#>     Returns
#>     -------
#>     str or None
#>         Content of the text file if it exists, is a file, has a '.txt' suffix,
#>         and contains any of the Fellowship members. Otherwise, raises RuntimeError.
#>
#>     Raises
#>     ------
#>     RuntimeError
#>         If the file does not exist, is not a file, does not have a '.txt' suffix,
#>         or does not contain any of the Fellowship members.
#>     """
#>     baz = ["Frodo", "Sam", "Aragorn", "Gimli", "Gandalf"]
#>     p = Path.cwd().parent / Path(p)
#>     if p.exists():
#>         if p.is_file():
#>             if p.suffix == ".txt":
#>                 with p.open(mode="r") as t:
#>                     a = t.read()
#>                     b = [_ in a for _ in baz]
#>                     if any(b):
#>                         return a
#>     raise RuntimeError()
#> ```
#> Note that the name `read_matching_text_file` provides a clear indication of what the function does, making it easier for users to understand its purpose. The docstring provides more detailed information about the function’s behavior, parameters, return values, and raised exceptions, following the NumPy style.
#>
#>
#> Here’s an example of how you can write unit tests for the bar function using Python’s built-in unittest module:
#> ```
#> import unittest
#> from pathlib import Path
#> import tempfile
#> from your_module import bar  # Replace 'your_module' with the actual module name
#>
#> class TestBarFunction(unittest.TestCase):
#>
#>     def test_file_exists_and_contains_baz(self):
#>         with tempfile.TemporaryDirectory() as tmp_dir:
#>             file_path = Path(tmp_dir) / 'test_file.txt'
#>             with file_path.open('w') as file:
#>                 file.write('Frodo is going to Mordor.')
#>             result = bar(file_path.name)
#>             self.assertEqual(result, 'Frodo is going to Mordor.')
#>
#>     def test_file_exists_but_does_not_contain_baz(self):
#>         with tempfile.TemporaryDirectory() as tmp_dir:
#>             file_path = Path(tmp_dir) / 'test_file.txt'
#>             with file_path.open('w') as file:
#>                 file.write('Gollum is searching for the Ring.')
#>             with self.assertRaises(RuntimeError):
#>                 bar(file_path.name)
#>
#>     def test_file_exists_but_is_not_txt(self):
#>         with tempfile.TemporaryDirectory() as tmp_dir:
#>             file_path = Path(tmp_dir) / 'test_file.pdf'
#>             with file_path.open('w') as file:
#>                 file.write('Frodo is going to Mordor.')
#>             with self.assertRaises(RuntimeError):
#>                 bar(file_path.name)
#>
#>     def test_file_does_not_exist(self):
#>         with self.assertRaises(RuntimeError):
#>             bar('non_existent_file.txt')
#>
#>     def test_file_is_directory(self):
#>         with tempfile.TemporaryDirectory() as tmp_dir:
#>             with self.assertRaises(RuntimeError):
#>                 bar(tmp_dir)
#>
#> if __name__ == '__main__':
#>     unittest.main()
#> ```
#> In these tests, we create a temporary directory and a file within it to test different scenarios. We then call the `bar` function with the file path and assert that the expected result is returned or an exception is raised.
```

This concludes the basic tutorial on handling LangChain runnables via LCEL. Key takeaways:
- The most basic chain contains a prompt, a model, and an output parser.
- The prompt specifies the task to be completed.
- A prompt’s initial system message affects overall model behaviour.
- The chain’s model determines which LLM service is used.
- Chains can be executed (invoked, batched, etc.) synchronously and asynchronously.
See the other STACKIT tutorials for more advanced and specific goals achievable with LLMs.