
Managing clusters

After you have created your first STEC managed edge cluster, the next step is to use and manage it. This guide leads you through those steps by example.

When a new edge cluster is deployed, it exposes two API endpoints that you can use to interact with it:

  1. The Talos Linux gRPC API. You interact with it using a compatible version of talosctl to manage Talos Linux.
  2. The Kubernetes REST API. You interact with it using any compatible Kubernetes Client, e.g. kubectl, to manage Kubernetes.

You authenticate with Kubernetes using a kubeconfig file. Follow the next steps to get the file.

Prerequisites:

Steps:

  1. Navigate to the Cluster section. You’ll get to the Clusters overview. Click on the name of the cluster you want to get the Kubeconfig file for.

    Screenshot of the STACKIT Edge Cloud web interface, now showing the Clusters view with a single cluster created.

  2. Click on the Kubeconfig button to start the download of a valid Kubeconfig file for the selected cluster.

    A screenshot of the STACKIT Edge Cloud web interface, displaying the Cluster Details view for a cluster named cluster-01.

You may use any Kubernetes API compatible client to interact with Kubernetes. For this example we’ll use kubectl.

Make sure you use the latest version of kubectl that is supported for the Kubernetes version of the cluster you’re working with. The examples below use kubectl version 1.33.1.
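Kubernetes' version skew policy supports kubectl within one minor version (older or newer) of the API server. The following is a minimal sketch of that check. The helper names minor_of and skew_ok are hypothetical, and the version strings are example values only.

```shell
#!/bin/sh
# Extract the minor version number from strings such as "v1.33.1" or "1.30".
minor_of() {
  v=${1#v}                      # drop an optional leading "v"
  echo "$v" | cut -d. -f2
}

# Succeed when the client and server minor versions differ by at most one.
skew_ok() {
  c=$(minor_of "$1")
  s=$(minor_of "$2")
  d=$((c - s))
  [ "$d" -ge -1 ] && [ "$d" -le 1 ]
}

# Example values; replace them with your client and server versions.
if skew_ok "v1.33.1" "v1.33.0"; then
  echo "kubectl and API server versions are within the supported skew"
else
  echo "kubectl and API server differ by more than one minor version"
fi
```

Both versions can be read from the output of kubectl version once the kubeconfig is in place.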

Prerequisites:

  • You acquired a valid kubeconfig for the STEC managed Edge Cluster.
  • Tools: a generic Linux bash terminal, kubectl.

Steps:

Terminal window
> export KUBECONFIG=your-edge-cluster.kubeconfig.yaml
> kubectl get nodes
NAME STATUS ROLES AGE VERSION
talos-4ic-txr Ready control-plane 19m v1.30.2
> kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-8477467d67-vb27c 1/1 Running 0 19m
kube-system coredns-8477467d67-vmmwk 1/1 Running 0 19m
kube-system kube-apiserver-talos-4ic-txr 1/1 Running 0 19m
kube-system kube-controller-manager-talos-4ic-txr 1/1 Running 2 (20m ago) 18m
kube-system kube-flannel-6rzll 1/1 Running 0 19m
kube-system kube-proxy-7hf7w 1/1 Running 0 19m
kube-system kube-scheduler-talos-4ic-txr 1/1 Running 2 (20m ago) 18m

You authenticate with Talos using a talosconfig file. Follow the next steps to get the file.

Prerequisites:

Steps:

  1. Navigate to the Cluster section. You’ll get to the Clusters overview. Click on the name of the cluster you want to get the talosconfig file for.

    Screenshot of the STACKIT Edge Cloud web interface, now showing the Clusters view with a single cluster created.

  2. Click on the talosconfig button to start the download of a valid talosconfig file for the selected cluster.

    A screenshot of the STACKIT Edge Cloud web interface, showing the details page for a cluster.

You may use any gRPC compatible client to interact with Talos. For this example we’ll use talosctl.

Every Talos Linux node exposes an endpoint for the Talos gRPC API. When you use talosctl, it tries to connect to the gRPC endpoint specified in the talosconfig. This may fail if the endpoint is not reachable. In that case, you can connect through a different node of the cluster by passing its IP address or DNS name with the --endpoints CLI parameter of talosctl.

The --nodes parameter of talosctl, however, always has to be specified. It names the nodes that the talosctl command targets. If the --endpoints differ from the --nodes, the chosen endpoint proxies the command to all specified nodes. The talosctl CLI only opens a network connection to the --endpoints.
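The split between --endpoints and --nodes can be sketched as follows. The addresses and the helper name talos_via_endpoint are hypothetical. The helper prints the talosctl invocation instead of running it, so it can be inspected without a cluster.

```shell
#!/bin/sh
# Hypothetical addresses: one reachable node acts as the endpoint (proxy),
# two further nodes are the targets of the command.
ENDPOINT="192.168.4.142"
NODES="192.168.4.143,192.168.4.144"

# Print the talosctl command line for the given endpoint, node list and
# subcommand. Remove the "echo" to execute the command for real.
talos_via_endpoint() {
  endpoint=$1
  nodes=$2
  shift 2
  echo talosctl --endpoints "$endpoint" --nodes "$nodes" "$@"
}

talos_via_endpoint "$ENDPOINT" "$NODES" get members
```

In this sketch only the endpoint address has to be reachable from your workstation. The endpoint forwards the request to the listed nodes.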

Check the talosctl documentation to learn more about how to use talosctl.

While it’s possible to use talosctl to interact with a STACKIT Edge Cloud managed cluster, you should not use talosctl to directly change the configuration of your managed systems. If you want to change the configuration of your system, use the exposed STEC CRDs such as EdgeCluster, as explained in the documentation. Commands such as talosctl rollback, talosctl rotate-ca and talosctl reset can break the connection with the STACKIT Edge Cloud management plane and lead to unexpected behavior. As a best practice, only use commands that read information but don’t alter it.
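One way to follow this best practice is a small wrapper that refuses anything outside a read-only allow-list. This is a sketch only: the allow-list is an illustrative assumption, not an official or complete list, and the helper names is_read_only and safe_talosctl are hypothetical.

```shell
#!/bin/sh
# Illustrative allow-list of talosctl subcommands that only read state.
# It is an assumption of this sketch and not an official or complete list.
is_read_only() {
  case "$1" in
    get|logs|dmesg|version|health|containers|events) return 0 ;;
    *) return 1 ;;
  esac
}

# Run talosctl only when the subcommand (first argument) is on the allow-list.
# Flags such as --nodes must therefore be placed after the subcommand.
safe_talosctl() {
  if is_read_only "$1"; then
    talosctl "$@"
  else
    echo "refusing to run 'talosctl $1': not on the read-only allow-list" >&2
    return 1
  fi
}
```

With the wrapper defined, safe_talosctl get members --nodes "$TALOS_IP" is passed through to talosctl, while safe_talosctl reset is rejected before talosctl is invoked.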

Make sure you use the latest version of talosctl that is supported for the Talos version of the node you’re working with. The examples below use talosctl version 1.10.5.

Prerequisites:

  • You acquired a valid talosconfig for the STEC managed Edge Cluster.
  • Tools: a generic Linux bash terminal, talosctl, yq.

Steps:

Terminal window
> export TALOSCONFIG=your-edge-cluster.talosconfig.yaml
> TALOS_IP=$(yq '.contexts.[ keys | .[0]].endpoints[0] | split(":") | .[0]' "$TALOSCONFIG")
> talosctl --nodes $TALOS_IP get members
NODE NAMESPACE TYPE ID VERSION HOSTNAME MACHINE TYPE OS ADDRESSES
192.168.4.142 cluster Member talos-4ic-txr 1 talos-4ic-txr controlplane Talos (v1.10.5) ["192.168.4.142"]
> talosctl --nodes $TALOS_IP get svc
NODE NAMESPACE TYPE ID VERSION RUNNING HEALTHY HEALTH UNKNOWN
192.168.4.142 runtime Service apid 2 true true false
192.168.4.142 runtime Service auditd 2 true true false
192.168.4.142 runtime Service containerd 2 true true false
192.168.4.142 runtime Service cri 2 true true false
192.168.4.142 runtime Service dashboard 1 true false true
192.168.4.142 runtime Service etcd 2 true true false
192.168.4.142 runtime Service ext-edgehostlet 1 true false true
192.168.4.142 runtime Service kubelet 2 true true false
192.168.4.142 runtime Service machined 2 true true false
192.168.4.142 runtime Service syslogd 2 true true false
192.168.4.142 runtime Service trustd 2 true true false
192.168.4.142 runtime Service udevd 2 true true false
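To spot services that do not report as healthy, the table output can be filtered by column. The sketch below assumes the column layout shown above (ID in field 4, HEALTHY in field 7). The helper name unhealthy_services is hypothetical, and the sample rows are copied from the output above.

```shell
#!/bin/sh
# Print the ID of every service whose HEALTHY column is not "true".
# Assumes the column layout shown above: ID is field 4, HEALTHY is field 7.
unhealthy_services() {
  awk 'NR > 1 && $7 != "true" { print $4 }'
}

# Sample rows copied from the output above. With a live cluster, pipe
# `talosctl --nodes "$TALOS_IP" get svc` into the function instead.
unhealthy_services <<'EOF'
NODE NAMESPACE TYPE ID VERSION RUNNING HEALTHY HEALTH UNKNOWN
192.168.4.142 runtime Service apid 2 true true false
192.168.4.142 runtime Service dashboard 1 true false true
192.168.4.142 runtime Service ext-edgehostlet 1 true false true
EOF
```

In the output above, the services listed this way also show HEALTH UNKNOWN as true, so a hit here does not necessarily indicate a failure.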

The cloud proxy is a feature that, when enabled, allows you to connect to your STEC edge cluster from anywhere in the world using nothing but an internet connection. If your cluster is behind a restricted network or you are otherwise not able to connect to it directly, you may enable the STEC cloud proxy for that cluster.

How the cloud proxy works.

The cloud proxy is an optional feature that, when enabled, creates an HTTPS tunnel between the STEC control plane running on STACKIT cloud and a STEC edge cluster. The Kubernetes and/or Talos Linux API endpoints are then exposed to the internet and reachable by anyone. The connection is initiated from the STEC edge cluster, which means that the only requirement for the STEC cloud proxy to function is an outgoing connection from the edge cluster to the STACKIT cloud on TCP port 443 (HTTPS). There is no need to open an inbound port to use the cloud proxy feature. When the feature is enabled, the STEC control plane sends a command over its management interface that forces the edge cluster to initiate the tunnel connection. This process may take a few minutes, since new services need to be started and DNS records have to be created before the proxy can be used.

Diagram: Cluster Proxy Initialization Flow. A simple flow diagram showing the connection initialization for the Cluster Proxy.

Once the tunnel is established you can use the Talos gRPC and/or Kubernetes REST APIs using the provided API proxy endpoints on the STACKIT cloud. There is no need for your clients, e.g. kubectl or talosctl, to be able to reach the edge cluster directly.

Diagram: Cluster Proxy Command Flow. A flow diagram illustrating how commands are routed through the established Cluster Proxy tunnel.

The STEC cloud proxy is a simple TCP proxy. Enabling the cloud proxy feature effectively exposes the edge cluster’s Kubernetes API / Talos Linux API directly to the internet. This means it is the edge cluster administrator’s responsibility to harden the edge cluster to prevent unauthorized access to the system(s).

Prerequisites:

Steps:

  1. Navigate to the Cluster section. You’ll get to the Clusters overview. Click on the name of the cluster you want to enable the cloud proxy feature for.

    Screenshot of the STACKIT Edge Cloud web interface, now showing the Clusters view with a single cluster created.

  2. To enable the proxy for the Kubernetes API, enable the “Cluster Proxy” slider. To enable the proxy for the Talos Linux API, enable the “Talos Proxy” slider. You can enable or disable the proxy for each of the services independently of each other.

    A screenshot of the STACKIT Edge Cloud web interface, showing the details page for a cluster.

  3. Wait a few minutes for the proxy configuration to finish before you try to use the proxy. Note that in order to use the proxy you’ll have to change the endpoint in the kubeconfig and/or talosconfig as explained in the following tip box.

When you enable the Kubernetes cloud proxy or the Talos cloud proxy, you have to re-download or reconfigure your kubeconfig or talosconfig, respectively, to actually make use of the cloud proxy feature. This is because the kubeconfig and talosconfig have to use a different endpoint when using the cloud proxy. The following example configuration shows the fields that change: server in the kubeconfig and endpoints in the talosconfig.

# kubeconfig
apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority-data: xxx
    server: https://192.168.4.142:6443
  name: <cluster>
users:
- name: <cluster>-admin
  user:
    client-certificate-data: xxx
    client-key-data: xxx
contexts:
- context:
    cluster: <cluster>
    user: <cluster>-admin
  name: <cluster>-admin@<cluster>
current-context: <cluster>-admin@<cluster>

# talosconfig
context: <cluster>
contexts:
  <cluster>:
    endpoints:
    - 192.168.4.142:50000
    ca: xxx
    crt: xxx
    key: xxx
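If you prefer to reconfigure an existing kubeconfig instead of downloading it again, the endpoint change amounts to rewriting the server value. The following is a minimal sketch under stated assumptions: the helper name set_server is hypothetical, and PROXY_URL is a placeholder, not a real STEC proxy endpoint. Use the endpoint that STEC provides for your cluster.

```shell
#!/bin/sh
# Replace the value of the "server:" key in a kubeconfig read from stdin.
# $1 is the new endpoint URL; it must not contain "|" or "&" because it is
# used inside a sed replacement.
set_server() {
  sed -E "s|^([[:space:]]*server:[[:space:]]*).*|\1$1|"
}

# PROXY_URL is a hypothetical placeholder; use the endpoint STEC provides.
PROXY_URL="https://proxy.example.invalid:443"

# Shortened sample kubeconfig for demonstration purposes.
set_server "$PROXY_URL" <<'EOF'
clusters:
- cluster:
    server: https://192.168.4.142:6443
  name: my-cluster
EOF
```

The same change can be made with kubectl config set-cluster <cluster> --server=<url>, which edits the file referenced by KUBECONFIG in place.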

A STEC edge cluster is a vanilla Kubernetes cluster. This means workload management can be done with any Kubernetes API compatible tool. For the purpose of this example kubectl will be used. Other popular options are k9s or freelens.

When managing large fleets of Kubernetes clusters you should consider using a tool designed for this task such as Argo CD, Flux CD or similar.

Prerequisites:

  • Successfully authenticated with a STEC instance.
  • Kubeconfig for the edge cluster is exported in your terminal session.
  • Tools: a generic Linux bash terminal, kubectl, yq.

Steps:

  1. Deploy your workload with kubectl.

    Terminal window
    > export KUBECONFIG=my-edge-cluster.kubeconfig.yaml
    > kubectl create deployment hello-stec --image=docker.io/nginxdemos/nginx-hello:plain-text --dry-run=client -o yaml | yq '.spec.replicas=0' | kubectl apply -f -
    ### perform your changes as needed, e.g. add imagePullSecrets. Then scale the deployment to 1.
    > kubectl scale deployment hello-stec --replicas=1
    ### forward the port of the deployment to your localhost (this command blocks, so run curl from a second terminal)
    > kubectl port-forward deployment/hello-stec 8080:8080
    > curl localhost:8080
    Server address: 127.0.0.1:8080
    Server name: hello-stec-6b97f4567f-qmzwb
    Date: 01/Sep/2025:17:25:19 +0000
    URI: /
    Request ID: 4530af346a3857f129fb76131fab6d97
    ### cleanup after testing
    > kubectl delete deployment hello-stec
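If yq is not available, the zero-replica manifest from the first command can also be written out directly. This is a sketch, not the manifest kubectl generates: the helper name deployment_manifest, the app label and the container name are assumptions made for illustration.

```shell
#!/bin/sh
# Emit a Deployment manifest with zero replicas. $1 = name, $2 = image.
# The "app" label and the container name are choices made for this sketch.
deployment_manifest() {
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $1
  labels:
    app: $1
spec:
  replicas: 0
  selector:
    matchLabels:
      app: $1
  template:
    metadata:
      labels:
        app: $1
    spec:
      containers:
      - name: nginx-hello
        image: $2
EOF
}

deployment_manifest hello-stec docker.io/nginxdemos/nginx-hello:plain-text
```

Piping the output into kubectl apply -f - creates the deployment without starting any pods, so you can adjust it before scaling it up.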

You can update the Talos and Kubernetes versions of your edge cluster, but only one of them at a time. This is done by changing the version in the UI, or by editing the versions in the edgecluster resource in your Edge Cloud instance.
You can check the progress of the update by looking at the status of the edgecluster resource in your Edge Cloud instance.

During an update, the version parts of the spec are locked. You have to wait for the update to finish before you can start further updates.
Do not add or remove any nodes from the cluster during an update. This might brick your cluster.

You can upgrade the Kubernetes version independently of the Talos Linux operating system, as long as the chosen combination is supported according to the Talos Linux release notes.

We currently support the following updates:

  • kubernetes-upgrade-from-1-30-to-1-30-14
  • kubernetes-upgrade-from-1-30-14-to-1-31-0
  • kubernetes-upgrade-from-1-31-to-1-31-13
  • kubernetes-upgrade-from-1-31-13-to-1-32-0
  • kubernetes-upgrade-from-1-32-to-1-32-9
  • kubernetes-upgrade-from-1-32-9-to-1-33-0
  • kubernetes-upgrade-from-1-33-to-1-33-5
  • kubernetes-upgrade-from-1-33-5-to-1-34-0
  • kubernetes-upgrade-from-1-34-to-1-34-1
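The list above can be read as a chain of hops: first to the listed patch release of the current minor version, then to the .0 release of the next minor version. The sketch below encodes that reading. It assumes that an entry such as from-1-30-to-1-30-14 applies to any 1.30 patch release, and the helper name next_target is hypothetical.

```shell
#!/bin/sh
# Print the next upgrade target for a Kubernetes version according to the
# list above. Returns non-zero when no further upgrade is listed.
next_target() {
  ver=${1#v}                    # accept "v1.30.2" as well as "1.30.2"
  case "$ver" in
    1.30.14)     echo 1.31.0 ;;
    1.30|1.30.*) echo 1.30.14 ;;
    1.31.13)     echo 1.32.0 ;;
    1.31|1.31.*) echo 1.31.13 ;;
    1.32.9)      echo 1.33.0 ;;
    1.32|1.32.*) echo 1.32.9 ;;
    1.33.5)      echo 1.34.0 ;;
    1.33|1.33.*) echo 1.33.5 ;;
    1.34.1)      return 1 ;;
    1.34|1.34.*) echo 1.34.1 ;;
    *)           return 1 ;;
  esac
}

# Walk the chain of hops from an example starting version.
v="1.32.4"
while t=$(next_target "$v"); do
  echo "$v -> $t"
  v=$t
done
```

Each printed hop corresponds to one update in the UI or in the edgecluster resource, and each has to finish before the next one is started.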

The upgrade process will start immediately.

Prerequisites:

Steps:

  1. Navigate to Clusters in the UI.

    STACKIT Edge Cloud Dashboard: Clusters List (Single Multi-node Cluster). A screenshot of the STACKIT Edge Cloud web interface's Clusters overview page. A single cluster named doc-cluster is listed. It has 2 Control Plane and 1 Worker node. The cluster's Kubernetes version is v1.30.0, and its Status is Ready (indicated by a green dot). The search bar is empty, showing one result out of one.

  2. Click on the Cluster you want to update.

    STACKIT Edge Cloud Dashboard: Cluster Details (Multiple Control Planes). A screenshot of the STACKIT Edge Cloud web interface, showing the details page for a cluster named doc-cluster. The cluster is in the Ready state, running Kubernetes v1.33.0 and Talos v1.9.7-stackit.v0.24.0. The Control Plane Endpoint is 192.168.178.56:6443. Under Machines, three Control Plane nodes and one Worker node are listed, indicating a highly available Control Plane setup. The Cluster Proxy and Talos Proxy toggles are both disabled.

  3. Select a new valid version from the Kubernetes dropdown.

    STACKIT Edge Cloud Dashboard: Cluster Update - Kubernetes Version Selection. A screenshot of the STACKIT Edge Cloud cluster details page for doc-cluster, showing the dropdown menu for selecting the Kubernetes version. The available versions are v1.31.3, v1.32.1, and v1.33.0 (currently selected). A warning notes: "Changing the Kubernetes version will trigger an upgrade/downgrade of the cluster." This illustrates the process of initiating a Kubernetes version change.

  4. Click on the Update button at the bottom of the page.

    STACKIT Edge Cloud Dashboard: Cluster Details (Update Pending). A screenshot of the STACKIT Edge Cloud cluster details page for doc-cluster. The cluster is in a Ready state, but the Kubernetes version dropdown is currently set to v1.33.0, and the Talos version is set to v1.10.2-stackit.v0.25.0. The visible Update button suggests a configuration change has been made (like the Kubernetes or Talos version) and is pending application. The machine list shows three Control Plane and one Worker node.

You can upgrade the Talos Linux operating system independently of the Kubernetes version, as long as the chosen combination is supported according to the Talos Linux release notes. See the “get available images” section in the Creating Images page for a list of possible values.

These are the generally allowed upgrade paths for Talos:

  • talos-upgrade-stec-1-9-to-1-10
  • talos-upgrade-stec-1-10-to-1-10
  • talos-upgrade-stec-1-10-to-1-11
  • talos-upgrade-stec-1-11-to-1-11

The upgrade process will start immediately. A staged Talos upgrade is enforced, which means that the node reboots before the upgrade is applied.

Make sure to check the Talos changelog before upgrading the Talos version. STEC currently does not do anything special to work around possible issues when changing the Talos version.
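The allowed Talos upgrade paths listed above can be checked before you pick a version in the UI. The following sketch compares only the major.minor part of two version strings against that list. It does not verify patch ordering within the same minor version, and the helper names talos_minor and talos_upgrade_allowed are hypothetical.

```shell
#!/bin/sh
# Reduce a version such as "v1.10.5-stackit.v0.21.0" to "1.10".
talos_minor() {
  v=${1#v}                      # drop an optional leading "v"
  echo "$v" | cut -d. -f1,2
}

# Succeed when the minor-to-minor transition is one of the listed paths.
talos_upgrade_allowed() {
  case "$(talos_minor "$1")->$(talos_minor "$2")" in
    "1.9->1.10"|"1.10->1.10"|"1.10->1.11"|"1.11->1.11") return 0 ;;
    *) return 1 ;;
  esac
}

# Example values taken from the version strings used in this guide.
if talos_upgrade_allowed "v1.9.7-stackit.v0.24.0" "v1.10.2-stackit.v0.25.0"; then
  echo "upgrade path is listed"
else
  echo "upgrade path is not listed"
fi
```

A jump such as 1.9 to 1.11 is rejected by this check and would have to be done in two steps, with 1.10 in between.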

Prerequisites:

Steps:

  1. Navigate to Clusters in the UI.

    STACKIT Edge Cloud Dashboard: Clusters List (Single Multi-node Cluster). A screenshot of the STACKIT Edge Cloud web interface's Clusters overview page. A single cluster named doc-cluster is listed. It has 2 Control Plane and 1 Worker node. The cluster's Kubernetes version is v1.30.0, and its Status is Ready (indicated by a green dot). The search bar is empty, showing one result out of one.

  2. Click on the Cluster you want to update.

    STACKIT Edge Cloud Dashboard: Cluster Details (Multiple Control Planes). A screenshot of the STACKIT Edge Cloud web interface, showing the details page for a cluster named doc-cluster. The cluster is in the Ready state, running Kubernetes v1.33.0 and Talos v1.9.7-stackit.v0.24.0. The Control Plane Endpoint is 192.168.178.56:6443. Under Machines, three Control Plane nodes and one Worker node are listed, indicating a highly available Control Plane setup. The Cluster Proxy and Talos Proxy toggles are both disabled.

  3. Select a new valid version from the Talos dropdown.

    STACKIT Edge Cloud Dashboard: Cluster Update - Talos Version Selection. A screenshot of the STACKIT Edge Cloud cluster details page for doc-cluster, showing the extensive dropdown menu for selecting the Talos version (OS/host OS). A long list of available v1.10 versions is visible, with v1.10.2-stackit.v0.25.0 highlighted at the bottom of the visible list. This illustrates the multitude of Talos OS versions available for selection and potential upgrade/downgrade.

  4. Click on the Update button at the bottom of the page.

    STACKIT Edge Cloud Dashboard: Cluster Details (Update Pending). A screenshot of the STACKIT Edge Cloud cluster details page for doc-cluster. The cluster is in a Ready state, but the Kubernetes version dropdown is currently set to v1.33.0, and the Talos version is set to v1.10.2-stackit.v0.25.0. The visible Update button suggests a configuration change has been made (like the Kubernetes or Talos version) and is pending application. The machine list shows three Control Plane and one Worker node.

You can remove a host from a cluster to decommission or re-use it elsewhere.

When you remove the last host from a cluster this will result in the cluster being deleted.

When you remove a host from a cluster, its machine configuration is removed and the host is reset. This means it will be rebooted with the initial kernel parameters its image had when it was created. If that’s not what you want, you’ll have to edit the machine configuration of the host to overwrite the kernel parameters using machine.install.extraKernelArgs. The changes made there persist even when a machine is reset. In addition, any Kubernetes workload running on the host will be stopped, since the host is no longer part of a Kubernetes cluster. Depending on your cluster and workload configuration, the stopped workload might be restarted on one of the remaining hosts of the cluster.
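For orientation, the machine.install.extraKernelArgs setting is a list of kernel arguments nested under machine and install. The sketch below only prints such a fragment. The helper name extra_kernel_args_fragment is hypothetical, console=ttyS0 is an example value, and how the fragment is applied to a host depends on how you edit machine configurations in your STEC instance.

```shell
#!/bin/sh
# Print a machine configuration fragment that sets extra kernel arguments.
# Each argument passed to the function becomes one list entry.
extra_kernel_args_fragment() {
  echo "machine:"
  echo "  install:"
  echo "    extraKernelArgs:"
  for arg in "$@"; do
    echo "      - $arg"
  done
}

# "console=ttyS0" is only an example value.
extra_kernel_args_fragment "console=ttyS0"
```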

When a host is reset the data saved on it might still be accessible. It is your responsibility to delete remaining data as needed.

Prerequisites:

Steps:

  1. Navigate to the clusters view in the UI.

    STACKIT Edge Cloud Dashboard: Clusters List (Single Cluster). A screenshot of the STACKIT Edge Cloud web interface's Clusters overview page. A single cluster named cluster-01 is listed. It has 1 Control Plane and 0 Worker nodes. The cluster's Kubernetes version is v1.30.2, and its Status is Ready (indicated by a green dot). The table shows one result out of one.

  2. Click the cluster name of the cluster you want to remove a host from.

    Screenshot of the cluster detail view for cluster-01 in the STACKIT Edge Cloud console. This page shows key information: Name is cluster-01, namespace is default, Status is Ready. It displays the Kubernetes version as v1.30.2 and the Talos OS version as v1.10.5-stackit.v0.21.0. The Control Plane Endpoint IP is listed as 192.168.4.142:6443. At the bottom, under Machines, one machine is listed by a long ID (a8cab...71718c) with the Role Control Plane. This machine has a Remove button in the Action column. Below the machine list is an Update button.

  3. Scroll down to the list of machines that are part of this cluster. Identify the host that you want to remove from the cluster and click ‘remove’. When you remove a host from a cluster its machine configuration is removed and the host is reset.

    Screenshot of the cluster detail view for cluster-01, similar to the second image, but with a "Remove a8cab433-..." confirmation dialog box overlaid. This dialog appears after clicking the Remove action for the control plane machine. The warning text states, "Are you sure you want to remove host a8cab433-760d-412a-9ce7-9c72e771718c from cluster cluster-01? This is the last host in the cluster. If you remove it, the cluster will be destroyed." It presents two buttons: a primary Remove button (red) and a secondary Cancel button.

  4. If you remove the last host from a cluster this will result in the cluster being deleted.

    Screenshot of the Clusters overview page after the cluster has been successfully deleted. The table is now empty, showing "Showing 1-0 of 0" results, indicating no clusters are currently deployed in the environment. The side navigation panel shows 0 next to Hosts and Clusters.

When an edge cluster is no longer needed, you can delete it to free up resources.

When you delete a cluster, the machine configuration of all hosts that are part of this cluster is removed and the hosts are reset. This means those hosts will be rebooted with the initial kernel parameters their image had when it was created. Any Kubernetes workload running on those hosts will be stopped and deleted.

When a host is reset the data saved on it might still be accessible. It is your responsibility to delete remaining data as needed.

Prerequisites:

Steps:

  1. Navigate to the Clusters view in the UI.

    A screenshot of the STACKIT Edge Cloud web interface's Clusters overview page.

  2. Click the Action button of the cluster you want to delete and click on the ‘delete’ action.

    Screenshot of the STACKIT Edge Cloud web console, specifically the Clusters overview page.

  3. The cluster is deleted. All hosts that are part of it will be reset.

    Screenshot of the Clusters overview page with a "Delete cluster-01" confirmation dialog box overlaid.