Concepts
This page provides detailed information about key concepts and configurations for STACKIT Workflows.
DAGs repository
STACKIT Workflows automatically discovers and loads DAGs from your connected Git repository. The service continuously polls your repository for pushed changes and synchronizes DAG files to keep your workflows up to date. Airflow parses every Python file in your repository that contains both of the words “airflow” and “DAG” as a potential DAG.
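The file selection rule above can be sketched in a few lines of Python. This is an approximation for illustration only, not the service's actual code: the function name `might_contain_dag` and the case-insensitive comparison are assumptions made for this sketch.

```python
def might_contain_dag(source: str) -> bool:
    """Return True if the file content mentions both "airflow" and "dag".

    Assumption: the comparison is case-insensitive.
    """
    lowered = source.lower()
    return "airflow" in lowered and "dag" in lowered


# Two hypothetical file contents: an orchestration file and a pure helper module.
dag_file = "from airflow import DAG\n\nwith DAG('example'):\n    pass\n"
helper_file = "def clean_timestamps(rows):\n    return sorted(rows)\n"

print(might_contain_dag(dag_file))     # True
print(might_contain_dag(helper_file))  # False
```

A helper module that mentions both words, even only in a comment, would also be picked up for parsing, which is why business logic files should avoid them.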
The ideal Git repository structure varies by use-case. A typical structure might look like this:

    your-dags-repo/
    ├── dags/                         # Directory for DAG files
    │   ├── data_pipeline_dag.py
    │   ├── ml_training_dag.py
    │   ├── maintenance_dag.py
    │   └── tasks/                    # Optional module that contains business logic.
    │       ├── clean_timestamps.py   # For Python or Spark jobs, don't import anything from airflow here
    │       └── compute_stats.py      # (should not contain words "airflow" or "DAG")
    ├── include/
    │   └── sql
    │       └── kpi_revenue.sql
    ├── plugins/                      # For any custom or community Airflow plugins
    │   └── custom_operators.py
    └── README.md

As a best practice, we recommend separating orchestration logic (DAGs, Airflow tasks) from business logic (SQL running in the database, Python scripts that process data rather than define DAGs or tasks, Spark logic, etc.). This makes your code easier to test and maintain, because business logic functions can be tested independently of Airflow, for example in the STACKIT Notebooks service.
To separate business logic from orchestration logic, put the business logic into separate files (e.g. compute_stats.py or kpi_revenue.sql in the example above) that do not contain any references to Airflow or DAGs. These files can then be imported and used in your DAG files. Alternatively, business logic can reside in separate Git repositories, which can be configured on a per-task basis using the git_sync_* parameters of the STACKITPythonScriptOperator or the @stackit.workflows_python_kubernetes_task decorator.
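As a minimal sketch of such a business logic file, the following shows what compute_stats.py from the example layout might contain. The function name, its signature, and the returned fields are hypothetical. The point is only that the module uses the standard library alone and can be tested without any orchestration framework installed.

```python
# tasks/compute_stats.py (hypothetical content, standard library only)
from statistics import mean


def compute_stats(values):
    """Summarise a list of numbers; all fields except count are None for empty input."""
    if not values:
        return {"count": 0, "mean": None, "min": None, "max": None}
    return {
        "count": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }


print(compute_stats([2.0, 4.0, 9.0]))
# {'count': 3, 'mean': 5.0, 'min': 2.0, 'max': 9.0}
```

Because the function has no framework dependencies, the same code can be run in a notebook or a unit test and then imported from a DAG file.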
Available packages / operators
STACKIT Workflows comes with a wide range of pre-installed Python packages and Airflow operators to help you build your workflows. The environment includes the following standard Airflow providers:
- apache-airflow-providers-amazon
- apache-airflow-providers-celery
- apache-airflow-providers-cncf-kubernetes
- apache-airflow-providers-databricks
- apache-airflow-providers-elasticsearch
- apache-airflow-providers-fab
- apache-airflow-providers-ftp
- apache-airflow-providers-google
- apache-airflow-providers-grpc
- apache-airflow-providers-hashicorp
- apache-airflow-providers-http
- apache-airflow-providers-imap
- apache-airflow-providers-microsoft-mssql
- apache-airflow-providers-microsoft
- apache-airflow-providers-mongo
- apache-airflow-providers-mysql
- apache-airflow-providers-odbc
- apache-airflow-providers-openlineage
- apache-airflow-providers-postgres
- apache-airflow-providers-redis
- apache-airflow-providers-sftp
- apache-airflow-providers-slack
- apache-airflow-providers-smtp
- apache-airflow-providers-snowflake
- apache-airflow-providers-sqlite
- apache-airflow-providers-ssh
Additionally, the environment includes the stackit_workflows provider package, which simplifies running Python and Spark jobs from the Workflows instance:
- @stackit_python_kubernetes_task: Airflow task decorator that runs the decorated Python function on Kubernetes in the default STACKIT Python runtime image.
- STACKITPythonScriptOperator: Airflow operator that runs a Python script in the default STACKIT Python runtime image.
- @stackit_spark_kubernetes_task: Airflow task decorator that runs the decorated Python function on Kubernetes in the STACKIT Spark runtime image, where Spark and the stackit_spark packages are pre-installed.
- STACKITSparkScriptOperator: Airflow operator that runs a provided PySpark script in the STACKIT Spark runtime image.
Please check the Tutorials section for example usage.
All packages are kept up-to-date and maintained to ensure compatibility and security. For the complete list of versions, refer to the packages manifest in your environment. It can be obtained by starting the DAGs Development Environment (DDE) and running pip freeze in a Terminal. Please note that packages are updated with each new Workflows release.
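As an alternative to reading the full pip freeze output, the installed provider versions can be listed from Python. This is a small sketch using only the standard library. The helper name `installed_packages` and the prefix filter are illustrative choices, not part of the Workflows tooling.

```python
from importlib.metadata import distributions


def installed_packages(prefix: str = "apache-airflow-providers-") -> dict:
    """Map each installed distribution whose name starts with `prefix` to its version."""
    found = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name and name.lower().startswith(prefix):
            found[name] = dist.version
    return dict(sorted(found.items()))


# Print the result in the same "name==version" form that pip freeze uses.
for name, version in installed_packages().items():
    print(f"{name}=={version}")
```

Passing a different prefix, for example a prefix matching the stackit_workflows package name, narrows the listing to that package.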
Monitoring / Observability integration
When Monitoring is enabled for a Workflows instance, metrics from all tasks launched from DAGs as well as core Airflow metrics are forwarded to the selected STACKIT Observability instance. A pre-configured Grafana dashboard is available to visualize these metrics.
In addition to common container metrics such as CPU and memory usage per task, Airflow-specific aggregated metrics are available as well. For a detailed description, please check the Airflow Metrics Description documentation.
STACKIT Portal roles
The following roles are available in the STACKIT Portal for managing Workflows instances:
- Workflows Admin: Full administrative access to all STACKIT Workflows resources.
- Workflows Editor: Permissions to create, modify, and manage STACKIT Workflows sub-resources (DAGs repository, identity provider configuration, etc.).
- Workflows Reader: Read-only access to all STACKIT Workflows resources.
These permissions are managed through the STACKIT Portal and STACKIT API.
Airflow UI roles
The permissions inside the Airflow UI are separate from the access control in the STACKIT Portal. Access to the Airflow UI is managed through Role-Based Access Control (RBAC) using your connected Identity Provider (IdP).
When users log in, their roles are extracted from the authentication token provided by the IdP. These roles determine what users can access and perform within the Airflow interface. For detailed configuration instructions, see the Identity Provider section.
The following roles are available within Airflow:
- Admin: Complete access to all Airflow features, settings, and configurations.
- User: Can view and trigger DAGs, but cannot modify system settings or configurations.
- Viewer: Read-only access to view DAGs and their running statuses.
Pricing
STACKIT Workflows pricing consists of a fixed monthly fee for the Workflows instance, which includes all core Airflow components such as the scheduler, webserver, metadata database and log storage. In addition, each running task is launched in a dedicated Kubernetes pod, which incurs additional costs based on the resources allocated (CPU, memory, duration). For detailed pricing information, please refer to the ordering process in the STACKIT Portal.
Identity Provider
STACKIT Workflows integrates with external OIDC-capable Identity Providers (IdP) to manage user authentication and authorization. This allows organizations to leverage their existing identity management systems for secure access to Airflow.
Advanced Configuration
In addition to the basic IdP settings (client ID, secret, scopes, discovery URL), the following optional parameters are available to fine-tune the OAuth2 integration:
| Parameter | Type | Description |
|---|---|---|
| Resource | string | RFC 8707 Resource Indicator. Sent during token requests to indicate which resource the token is intended for. Useful when the IdP requires a resource parameter to scope tokens (e.g., some Entra ID configurations). |
| Roles Claim | string | Dot-separated path to extract roles from the JWT / userinfo claims (e.g., realm_access.roles or resource_access.my-client.roles). If not specified, provider-specific defaults are used (see table below). |
| API Audience | string[] | Expected audience value(s) for API token validation when using the session endpoint for token exchange. Tokens whose aud claim does not match any of the specified values will be rejected. Leave empty to disable audience validation. |
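To illustrate how a dot-separated Roles Claim path addresses a nested claim, here is a minimal sketch. The function name `extract_roles` and the handling of missing paths or single-string values are assumptions made for this example; the service's actual extraction logic may differ.

```python
def extract_roles(claims: dict, roles_claim: str) -> list:
    """Follow a dot-separated path through nested claims and return the roles found."""
    node = claims
    for key in roles_claim.split("."):
        if not isinstance(node, dict) or key not in node:
            return []  # assumption: a missing path yields no roles
        node = node[key]
    if isinstance(node, str):
        return [node]  # assumption: a single string is treated as one role
    if isinstance(node, (list, tuple)):
        return [str(role) for role in node]
    return []


# Hypothetical decoded token claims in a Keycloak-like shape.
claims = {
    "realm_access": {"roles": ["Viewer"]},
    "resource_access": {"my-client": {"roles": ["Admin", "Op"]}},
}

print(extract_roles(claims, "realm_access.roles"))               # ['Viewer']
print(extract_roles(claims, "resource_access.my-client.roles"))  # ['Admin', 'Op']
```

Note that this simple split on dots would not work for a client ID that itself contains a dot.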
Default Roles Claim per provider
When Roles Claim is not set, the following defaults are used to extract roles from the authentication token:
| Provider | Default Roles Claim | Example value in token |
|---|---|---|
| Entra ID (Azure) | roles | Top-level roles array (populated by App Roles) |
| Okta | roles | Top-level roles array |
| Keycloak | realm_access.roles | Nested under realm_access.roles. For new setups, we recommend using client roles and setting the Roles Claim to resource_access.<client-id>.roles to match the “Token Claim Name” of the client roles mapper in Keycloak. |
Entra ID (Azure)
To configure Entra ID (formerly Azure Active Directory) as your Identity Provider, follow these steps:
1. Create a new “App Registration”:
   - Name: Choose any; for this example we choose STACKIT Workflows (dev).
   - Supported account types: For typical deployments, select “Accounts in this organizational directory only” (single tenant).
   - Redirect URI: Leave empty for now, we will set it later. Click “Register” to create the application.
2. On the “Overview” page of the “App Registration”, note down the Application (client) ID. Also note the Directory (tenant) ID.
3. Navigate to “Manage” -> “Certificates & secrets” and create a new “Client secret”. Note down the generated “Value” (not the Secret-ID), as it will be shown only once.
4. In the STACKIT Portal, enter the following values in the Identity Provider configuration section:
   - Identity provider name: Azure
   - Client ID: The Application (client) ID from step 2
   - Client Secret: The “Value” of the client secret created in step 3
   - Scopes: openid email
   - Discovery URL: https://login.microsoftonline.com/<Directory (tenant) ID>/.well-known/openid-configuration (replace <Directory (tenant) ID> with the value from step 2)
After the Workflows instance is created, navigate to the Workflows instance in the STACKIT Portal and copy the “Redirect URL” from the Identity Provider section. Go back to the “App Registration” in Entra ID and navigate to “Manage” -> “Authentication”. Add a new platform “Web” and enter the copied “Redirect URL”. Save the changes.
Next, create the app roles in Entra ID so that selected users can access Airflow. Navigate to “Manage” -> “App roles” and create the following roles. We recommend creating at least the “Admin” role; other roles can be created as needed. The available permissions for each role are documented in the Airflow Documentation.
| Display name | Allowed member types | Value | Description |
|---|---|---|---|
| Admin | Users/Groups & Applications | Admin | Workflows Administrator |
| Viewer | Users/Groups & Applications | Viewer | Workflows Viewer |
| User | Users/Groups & Applications | User | Workflows User |
| Op | Users/Groups & Applications | Op | Workflows Op |
Finally, we need to assign the created roles to users or groups. Navigate to the “Overview” page of the “App Registration” and click on the name of the Enterprise Application next to the “Managed application in local directory” label. Go to “Manage” -> “Users and groups” and assign the created roles to users or groups as needed.
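Before saving the Discovery URL from step 4 in the STACKIT Portal, it can be useful to check that it resolves to a valid OpenID configuration document. The following sketch uses only the standard library. The helper names are hypothetical, and the fetch is left as a commented-out call because it needs network access and your real tenant ID.

```python
import json
from urllib.request import urlopen


def entra_discovery_url(tenant_id: str) -> str:
    """Build the Entra ID discovery URL for the given Directory (tenant) ID."""
    return (
        "https://login.microsoftonline.com/"
        f"{tenant_id}/.well-known/openid-configuration"
    )


def fetch_openid_configuration(discovery_url: str, timeout: float = 10.0) -> dict:
    """Download and parse the OpenID configuration document."""
    with urlopen(discovery_url, timeout=timeout) as response:
        return json.load(response)


# Example usage (requires network access and a real tenant ID):
# config = fetch_openid_configuration(entra_discovery_url("<Directory (tenant) ID>"))
# print(config["issuer"], config["token_endpoint"], config["jwks_uri"])
```

The issuer, token_endpoint and jwks_uri fields are standard entries of an OpenID Connect discovery document. The same fetch helper works for the Okta and Keycloak discovery URLs.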
Okta
To configure Okta as your Identity Provider, follow these steps:
1. In the “Applications” menu, select “Create App Integration”. Choose “OIDC - OpenID Connect” as the sign-in method and “Web Application” as the application type. Click “Next”.
2. Configure the following settings:
   - App integration name: Choose any; for this example we choose STACKIT Workflows (dev).
   - Sign-in redirect URIs: Leave empty for now, we will set it later.
   - Assignments: Choose “Assign to people only” or “Assign to groups only” based on your requirements. You can also choose “No assignment required” to allow all users in your Okta organization to access the application. Click “Save” to create the application.
3. In the “General” tab of the created application, note down the Client ID and Client Secret.
4. In the STACKIT Portal, enter the following values in the Identity Provider configuration section:
   - Identity provider name: Okta
   - Client ID: The Client ID from step 3
   - Client Secret: The Client Secret from step 3
   - Scopes: openid email
   - Discovery URL: https://<your-okta-domain>/.well-known/openid-configuration (replace <your-okta-domain> with your Okta domain)
After the Workflows instance is created, navigate to the Workflows instance in the STACKIT Portal and copy the “Redirect URL” from the Identity Provider section. Go back to the Okta Developer Console and edit the created application. Add the copied “Redirect URL” to the “Sign-in redirect URIs”. Save the changes.
Now, we need to assign roles in STACKIT Workflows to users or groups in Okta:
1. In the “Directory” menu, select “Profile Editor”. Select the application that was previously created.
2. Select “Add Attribute” and create a new attribute with the following settings:
   - Type: string
   - Display name: STACKIT Workflows Roles
   - Variable name: roles
   - Enum: yes (checked). Create the following enumerations with identical values and display names: Admin, User, Viewer, Op
   - Description: Roles for STACKIT Workflows
   - Attribute required: yes (checked)
   - Attribute type: As required; in this example we use “Group”
3. After saving the attribute, navigate to “Directory” -> “Groups” and select a group. Select the “Applications” tab and click “Assign Applications”. Select the previously created application and assign the desired role to the group. Repeat this step for all groups that need access to STACKIT Workflows.
Keycloak
To configure Keycloak as your Identity Provider, follow these steps:
1. In the Keycloak Admin Console, navigate to your realm and go to “Clients”. Click “Create client”.
2. Configure the client settings:
   - Client type: OpenID Connect
   - Client ID: Choose any; for this example we use stackit-workflows-dev
   - Name: Choose any; for this example we use STACKIT Workflows (dev)
   - Click “Next”
3. Configure capability settings:
   - Client authentication: On (to get a client secret). Workflows is a confidential / private client.
   - Authorization: Off
   - Authentication flow: Check “Standard flow” and “Direct access grants”
   - Click “Next”
4. Login settings:
   - Web origins: +
   - Leave other fields empty for now, we will set the redirect URI later.
   - Click “Save”
5. In the “Credentials” tab of the created client, note down the Client secret.
6. In the STACKIT Portal, enter the following values in the Identity Provider configuration section:
   - Identity provider name: keycloak
   - Client ID: The Client ID from step 2 (e.g., stackit-workflows-dev)
   - Client Secret: The Client secret from step 5
   - Scopes: openid email roles
   - Discovery URL: https://<your-keycloak-domain>/realms/<realm-name>/.well-known/openid-configuration (replace <your-keycloak-domain> and <realm-name> with your values)
After the Workflows instance is created, navigate to the Workflows instance in the STACKIT Portal and copy the “Redirect URL” from the Identity Provider section. Go back to Keycloak and edit the created client. Add the copied “Redirect URL” to the “Valid redirect URIs” field. Save the changes.
To assign roles to users in Keycloak:
1. Navigate to the previously created client, select the “Roles” tab and create the following roles:
   - Admin: Workflows Administrator
   - User: Workflows User
   - Viewer: Workflows Viewer
   - Op: Workflows Op
2. Navigate to “Users” and select a user. Go to the “Role mapping” tab and assign the appropriate client roles to the user.
3. For the roles to be included in tokens, go to “Client scopes” → “roles” → “Mappers” and open the “client roles” mapper. Ensure that both Add to userinfo and Add to access token are turned on. The userinfo mapping is required for the UI login flow, and the access token mapping is required for the API Authentication flow. Also note the “Token Claim Name”: this is the claim path where roles will appear in the token (e.g., resource_access.stackit-workflows-dev.roles). Set this value as the Roles Claim in the STACKIT Portal. Make sure to replace all variables in the string with their actual values before saving.
Alternatively, you can create groups and assign roles to groups, then assign users to groups for easier management.
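To check that the client roles mapper puts roles where the Roles Claim expects them, you can inspect the payload of a token issued by Keycloak. The sketch below decodes the payload segment of a JWT without verifying its signature, so it is suitable for debugging only. The helper names and the sample token are made up for illustration.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_segment = token.split(".")[1]
    # Restore the base64 padding that JWTs strip.
    padded = payload_segment + "=" * (-len(payload_segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(data: dict) -> str:
    """Encode a dict the way JWT segments are encoded (demo helper)."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Build an unsigned sample token so the example runs without a Keycloak server.
sample_claims = {"resource_access": {"stackit-workflows-dev": {"roles": ["Admin"]}}}
fake_token = ".".join([_b64url({"alg": "none"}), _b64url(sample_claims), ""])

claims = decode_jwt_payload(fake_token)
print(claims["resource_access"]["stackit-workflows-dev"]["roles"])  # ['Admin']
```

With a real access token, the printed path should match the value entered as the Roles Claim.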
API Authentication
STACKIT Workflows provides a session endpoint that allows programmatic access to the Airflow REST API. This enables automation, CI/CD pipelines, and service-to-service communication by exchanging an OIDC access token from your Identity Provider for an Airflow session cookie.
Prerequisites
- An OAuth2 Identity Provider must be configured for your Workflows instance.
- API Audience must be set in the IdP configuration (via the STACKIT Portal or API). The session endpoint is disabled when no API Audience is configured.
- A service account or application in your IdP that can obtain access tokens with the configured audience. This is typically a separate OAuth2 client (with its own Client ID and Client Secret) that uses the client_credentials grant to request tokens.
How it works
1. Obtain an OIDC access token from your Identity Provider (e.g., using the client_credentials grant).
2. Exchange it for an Airflow session by calling POST /api/stackit/v1/session with the token as a Bearer token.
3. Use the returned session cookie for subsequent Airflow REST API calls.
The session endpoint validates the token’s signature against the IdP’s JWKS, checks the iss (issuer) and aud (audience) claims, and then extracts user information and roles using the same logic as the UI login flow.
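The issuer and audience checks described above can be sketched as follows. This is an illustration of the claim checks only, under stated assumptions: signature verification against the JWKS is deliberately left out, the function name `claims_accepted` is hypothetical, and the real endpoint may apply further checks.

```python
def claims_accepted(claims: dict, expected_issuer: str, api_audience: list) -> bool:
    """Check the iss and aud claims of an already signature-verified token."""
    if not api_audience:
        # The session endpoint is disabled when no API Audience is configured.
        return False
    if claims.get("iss") != expected_issuer:
        return False
    aud = claims.get("aud", [])
    # Per the JWT specification, aud may be a single string or a list of strings.
    token_audiences = [aud] if isinstance(aud, str) else list(aud)
    return any(audience in api_audience for audience in token_audiences)


# Hypothetical issuer and audience values, for illustration only.
claims = {"iss": "https://idp.example/realms/demo", "aud": ["workflows-api", "account"]}

print(claims_accepted(claims, "https://idp.example/realms/demo", ["workflows-api"]))  # True
print(claims_accepted(claims, "https://idp.example/realms/demo", ["other-api"]))      # False
```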
Example

    import requests
    from requests.auth import HTTPBasicAuth

    # Step 1: Obtain an access token from your IdP
    token_response = requests.post(
        "https://<your-idp>/protocol/openid-connect/token",
        data={"grant_type": "client_credentials"},
        auth=HTTPBasicAuth("<your-client-id>", "<your-client-secret>"),
    )
    token_response.raise_for_status()
    access_token = token_response.json()["access_token"]

    # Step 2: Exchange the token for an Airflow session
    session = requests.Session()
    r = session.post(
        "https://<your-workflows-instance>/api/stackit/v1/session",
        headers={"Authorization": f"Bearer {access_token}"},
    )
    r.raise_for_status()
    # Session cookie is now stored in the session object

    # Step 3: Use the session for Airflow REST API calls
    dags = session.get("https://<your-workflows-instance>/api/v1/dags")
    dags.raise_for_status()
    print(dags.json())
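Once a session exists, other Airflow REST API calls follow the same pattern. As a hedged sketch, the helper below triggers a DAG run through the dagRuns endpoint of the Airflow stable REST API (v1). The function name `trigger_dag_run` is hypothetical, and whether your role is permitted to trigger runs depends on the roles assigned by your IdP.

```python
from typing import Optional


def trigger_dag_run(session, base_url: str, dag_id: str, conf: Optional[dict] = None) -> dict:
    """Trigger a DAG run using an authenticated session (e.g. a requests.Session)."""
    response = session.post(
        f"{base_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns",
        json={"conf": conf or {}},
    )
    response.raise_for_status()
    return response.json()


# Example usage with the session from the steps above (placeholder DAG ID):
# run = trigger_dag_run(session, "https://<your-workflows-instance>", "<your-dag-id>")
# print(run)
```

Taking the session as a parameter keeps the helper independent of how the session was obtained and makes it easy to test with a stub.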