Topic Hub

AI & ML on Cloud Native

Running AI/ML workloads on Kubernetes and modern infrastructure.

GPU scheduling, model serving, Kubeflow, LLMs on Kubernetes, NVIDIA NVCF, and the rapidly evolving AI infrastructure landscape.

More on AI & ML on Cloud Native

nvidia · dgxspark · Jun 10, 2026

Day 4: Quantization Demystified. BF16, FP8, NVFP4, MXFP4, INT4, GGUF, and Why It All Matters

A practical, beginner-friendly guide to BF16, FP8, NVFP4, MXFP4, INT4, and GGUF Q4_K_M on NVIDIA DGX Spark. Bytes per parameter, quality vs size, and which format to pick when.

Saiyam Pathak · 28 min
nvidia · dgxspark · Jun 5, 2026

Day 3: The DGX Spark Unpacked. GB10, Unified Memory, sm_121, and the One Reason This Hardware Exists

A practical teardown of NVIDIA DGX Spark's GB10 Grace Blackwell Superchip, unified memory, sm_121, NVFP4 tensor cores, memory reporting, and decode limits.

Saiyam Pathak · 19 min
mlx · apple-silicon · May 29, 2026

mlxcel: A Rust-Native Inference Engine for Apple Silicon, Tested on My M1 Max

Day-one deep dive into mlxcel v0.1.0, a Rust-native MLX inference engine. Real M1 Max benchmarks vs mlx-lm and Ollama on Llama 3.2 3B and Qwen 2.5 7B, with architecture diagrams and an honest take on TurboQuant.

Saiyam Pathak · 28 min
nvidia · dgxspark · May 27, 2026

Day 2: Anatomy of an LLM Inference Request. From Prompt to Answer, Step by Step

A beginner-friendly walkthrough of tokenization, prefill, KV cache, decode, batching, TTFT, and why memory bandwidth shapes local LLM performance on NVIDIA DGX Spark.

Saiyam Pathak · 26 min
nvidia · dgxspark · May 25, 2026

Day 1: The Local LLM Revolution. Why Your Desk Just Became the New Datacenter

Why local LLMs are becoming practical in 2026, what changed across open weights, hardware, and inference software, and why DGX Spark makes the desk feel like a small AI lab.

Saiyam Pathak · 13 min
ai · docker · Apr 30, 2026

Day 6: Run an LLM on Your Laptop - With Docker

Pull AI models from Docker Hub, run them locally with GPU acceleration, and build an AI-powered app…

Saloni Narang · 10 min
typescript · llm · Apr 1, 2026

What Claude Code's Leaked Source Actually Teaches Us About Building AI Agents

512K lines of TypeScript, verified against actual source. The engineering patterns in Claude Code's leaked codebase that most coverage got wrong.

Saiyam Pathak · 8 min
nvidia · ai-agents · Mar 14, 2026

Here's What I Learned About Nemotron 3 Super - I Ran a 120B Parameter Model on NVIDIA DGX Spark

An overview of NVIDIA DGX Spark, with hands-on testing and benchmarks.

Saiyam Pathak · 14 min
ai · cloud · Aug 26, 2025

Ditch the Overheating Laptop: Supercharge Your Docker Workflow with Docker Offload

Running multiple Docker containers can slow down your laptop and drain your battery. In this blog, we explore Docker Offload, a game-changing feature…

Saloni Narang · 5 min
ai · docker · Jun 26, 2025

Docker MCP Catalog: Finding the Right AI Tools for Your Project

This blog describes what Docker MCP is and how it…

Saloni Narang · 6 min
ai · docker · Feb 21, 2025

Testing Docker AI's "Gordon" – How Smart Is It?

Saloni Narang · 9 min
machine-learning · kubernetes · Dec 27, 2022

Kubeflow Pipelines: Orchestrating Machine Learning Workflows - Part 3

Kubeflow Pipelines is a great way to build and deploy end-to-end scalable and portable Machine Learning workloads. In this article, we take a look at how…

Rishit Dagli · 17 min