Welcome to flozi00 TechHub
Technical knowledge base for servers, AI, systems, and IT infrastructure.
📚 Browse the documentation
Navigate technical guides for servers and IT infrastructure.
🔍 Quick search
Find hardware solutions, configuration guides, and technical specifications instantly.
Latest pages
About Florian Zimmermeister
Learn about flozi00, who is passionate about speech recognition, accessibility, and AI and has expertise in server infrastructure and systems.
Qwen3-Next: A Deep Dive into Alibaba's Hybrid MoE Powerhouse
A technical breakdown of the Qwen3-Next-80B-A3B-Instruct model, exploring its Hybrid MoE architecture, FP8 precision, and how it achieves efficiency with 80B total parameters but only 3B active.
Selecting the Right GPU for Qwen3 Inference
A practical overview that matches NVIDIA RTX PRO 6000, H200, and DGX Station platforms to Qwen3 model sizes using the ops:byte method from the LLM inference math guide.
A Practical Guide to LLM Inference Math: From Theory to Hardware
Essential mathematical formulas and calculations for profiling LLM inference performance, determining compute vs memory bottlenecks, and selecting optimal GPU hardware.
Understanding LLM VRAM Requirements: A Mathematical Deep Dive
Comprehensive guide to calculating GPU memory requirements for Large Language Model inference and training with practical examples using Qwen3-VL-32B.