STACKIT AI Model Serving: New Model Release GPT-OSS 20B (Replacement for Llama-8B and Nemo)
Abschnitt betitelt „STACKIT AI Model Serving: New Model Release GPT-OSS 20B (Replacement for Llama-8B and Nemo)“We are excited to announce that we are upgrading our model lineup by introducing openai/gpt-oss-20b, which will serve as the successor to our current Mistral-Nemo and Llama 3.1 8B offerings.
By leveraging 4-bit (MXFP4) quantization, this new 20-billion parameter model provides a significant boost in reasoning capabilities while maintaining the low-latency performance our customers expect. Applications such as real-time chatbots, retrieval-augmented generation (RAG), and agentic workflows will benefit from improved tool-calling and higher throughput.
Deprecation Notice
Section titled “Deprecation Notice”As part of this transition, we are officially deprecating the following models:
- neuralmagic/Mistral-Nemo-Instruct-2407-FP8
- neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8
We kindly ask all customers to migrate their workloads to the new model openai/gpt-oss-20b before 4 June 2026.
Explore our full model portfolio, and access detailed examples and tutorials in our documentation. Our Help Center is always at your disposal if you have any questions.