<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Kubesimplify]]></title><description><![CDATA[On a Mission to simplify AI and Cloud Native for everyone!]]></description><link>https://blog.kubesimplify.com</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1649087678065/oZZJ9QpqX.png</url><title>Kubesimplify</title><link>https://blog.kubesimplify.com</link></image><generator>RSS for Node</generator><lastBuildDate>Sat, 18 Apr 2026 07:26:02 GMT</lastBuildDate><atom:link href="https://blog.kubesimplify.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[SSH Into Your DGX Spark From Anywhere in the World Using Tailscale
]]></title><description><![CDATA[I recently got my hands on an NVIDIA DGX Spark, and the first thing I wanted to figure out was: how do I access this thing from anywhere? Whether I'm at a coffee shop, at a conference, or on a differe]]></description><link>https://blog.kubesimplify.com/ssh-into-your-dgx-spark-from-anywhere-in-the-world-using-tailscale</link><guid isPermaLink="true">https://blog.kubesimplify.com/ssh-into-your-dgx-spark-from-anywhere-in-the-world-using-tailscale</guid><category><![CDATA[NVIDIA]]></category><category><![CDATA[tailscale]]></category><category><![CDATA[DGXSpark]]></category><category><![CDATA[ssh]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[Devops]]></category><dc:creator><![CDATA[Saiyam Pathak]]></dc:creator><pubDate>Tue, 07 Apr 2026 12:01:10 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/73a73de4-7383-44be-8853-78e3cf47b306.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<hr />
<p>I recently got my hands on an NVIDIA DGX Spark, and the first thing I wanted to figure out was: <strong>how do I access this thing from anywhere?</strong> Whether I'm at a coffee shop, at a conference, or on a different network entirely — I want to just <code>ssh</code> in and get to work.</p>
<p>The answer? <strong>Tailscale.</strong> It took me about 10 minutes to set up, and now I can SSH into my Spark from any device, on any network, anywhere in the world. I even set up a friend with access — simultaneously — without giving them my credentials. Here's exactly how I did it.</p>
<h2>Why Tailscale?</h2>
<p>Tailscale creates a private mesh network (called a "tailnet") between your devices. No port forwarding, no static IPs, no VPN server to maintain. You install it on your devices, log in with the same account, and they can talk to each other. It's built on WireGuard, so it's fast and encrypted.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/471fb562-c707-46de-a4e1-a157f818ca08.png" alt="" style="display:block;margin:0 auto" />

<p>For the DGX Spark, this means:</p>
<ul>
<li><p>No need to be on the same WiFi network</p>
</li>
<li><p>No need to mess with your router settings</p>
</li>
<li><p>Works behind NATs and firewalls</p>
</li>
<li><p>Encrypted end-to-end</p>
</li>
</ul>
<h2>Prerequisites</h2>
<p>Before starting, make sure your DGX Spark:</p>
<ul>
<li><p>Is running Ubuntu 24.04 or newer</p>
</li>
<li><p>Has internet connectivity</p>
</li>
<li><p>Has a user account with sudo access</p>
</li>
</ul>
<p>Here's what my system looked like:</p>
<pre><code class="language-bash">$ lsb_release -a
No LSB modules are available.
Distributor ID:    Ubuntu
Description:    Ubuntu 24.04.3 LTS
Release:    24.04
Codename:    noble
</code></pre>
<p>A quick ping to confirm internet:</p>
<pre><code class="language-bash">$ ping -c 3 google.com
64 bytes from tzdela-ba-in-x0e.1e100.net: icmp_seq=1 ttl=118 time=15.3 ms
64 bytes from tzdela-ba-in-x0e.1e100.net: icmp_seq=2 ttl=118 time=13.7 ms
64 bytes from tzdela-ba-in-x0e.1e100.net: icmp_seq=3 ttl=118 time=17.2 ms

--- google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss
</code></pre>
<p>And verify sudo access:</p>
<pre><code class="language-bash">$ sudo whoami
root
</code></pre>
<p>Good to go.</p>
<h2>Step 1: Install Tailscale on the DGX Spark</h2>
<p>SSH into your Spark (or use a directly connected keyboard/monitor) and run:</p>
<pre><code class="language-bash"># Update package list and install prerequisites
sudo apt update
sudo apt install -y curl gnupg

# Add Tailscale signing key
curl -fsSL https://pkgs.tailscale.com/stable/ubuntu/noble.noarmor.gpg | \
  sudo tee /usr/share/keyrings/tailscale-archive-keyring.gpg &gt; /dev/null

# Add Tailscale repository
curl -fsSL https://pkgs.tailscale.com/stable/ubuntu/noble.tailscale-keyring.list | \
  sudo tee /etc/apt/sources.list.d/tailscale.list

# Install Tailscale
sudo apt update
sudo apt install -y tailscale
</code></pre>
<p>You'll see the repository being added and the package installing:</p>
<pre><code class="language-plaintext"># Tailscale packages for ubuntu noble
deb [signed-by=/usr/share/keyrings/tailscale-archive-keyring.gpg] https://pkgs.tailscale.com/stable/ubuntu noble main
...
Setting up tailscale (1.94.2) ...
Created symlink /etc/systemd/system/multi-user.target.wants/tailscaled.service → /usr/lib/systemd/system/tailscaled.service.
</code></pre>
<p>Verify the installation:</p>
<pre><code class="language-bash">$ tailscale version
1.94.2
  tailscale commit: 0a29cf18b56e478b9cd33af07755fcae90d5171a
  long version: 1.94.2-t0a29cf18b-g3f044c9f6
  go version: go1.25.5
</code></pre>
<p>Check the service is running:</p>
<pre><code class="language-bash">saiyam@spark-5223:~$ sudo systemctl status tailscaled --no-pager
[sudo] password for saiyam: 
● tailscaled.service - Tailscale node agent
     Loaded: loaded (/usr/lib/systemd/system/tailscaled.service; enabled; preset: enabled)
     Active: active (running) since Tue 2026-04-07 11:13:14 UTC; 9min ago
       Docs: https://tailscale.com/docs/
   Main PID: 2410 (tailscaled)
     Status: "Connected; saiyam911@gmail.com; 100.120.233.78 fd7a:115c:a1e0::f83a:e94e"
      Tasks: 22 (limit: 153561)
     Memory: 45.4M (peak: 53.7M)
        CPU: 615ms
     CGroup: /system.slice/tailscaled.service
             └─2410 /usr/sbin/tailscaled --state=/var/lib/tailscale/tailscaled.…
</code></pre>
<p>If you haven't authenticated yet, the <code>Status</code> line will read "NeedsLogin" instead — that's expected. We'll authenticate next.</p>
<h2>Step 2: Connect the Spark to Your Tailnet</h2>
<p>This is the magic step:</p>
<pre><code class="language-bash">$ sudo tailscale up

To authenticate, visit:

    https://login.tailscale.com/a/1ff5e3e9017787
</code></pre>
<p>Open that URL in any browser, log in with your account (Google, GitHub, Microsoft — whatever your org uses), and you'll see:</p>
<blockquote>
<p><strong>Login successful. Your device spark-5223 is logged in</strong></p>
</blockquote>
<p>Back on the Spark terminal, you'll see:</p>
<pre><code class="language-plaintext">Success.
Some peers are advertising routes but --accept-routes is false
</code></pre>
<p>That's it on the Spark side. Your DGX Spark is now part of your private Tailscale network with the hostname <code>spark-5223</code>.</p>
<blockquote>
<p><strong>Note:</strong> The <code>--accept-routes</code> message is harmless for SSH access. You can ignore it. If you ever need subnet routing, run <code>sudo tailscale up --accept-routes</code>.</p>
</blockquote>
<h2>Step 3: Install Tailscale on Your Laptop</h2>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/61787831-2a1e-4577-94a2-cdf279c2db4c.png" alt="" style="display:block;margin:0 auto" />

<h3>macOS</h3>
<ul>
<li><p><strong>Option A:</strong> Download from the <a href="https://apps.apple.com/app/tailscale/id1475387142">Mac App Store</a> (search "Tailscale")</p>
</li>
<li><p><strong>Option B:</strong> Download the <code>.pkg</code> from <a href="https://tailscale.com/download">tailscale.com/download</a></p>
</li>
</ul>
<p>Open the app, click <strong>Log in</strong>, and sign in with the <strong>same account</strong> you used on the Spark.</p>
<h3>Windows</h3>
<ol>
<li><p>Download the installer from <a href="https://tailscale.com/download">tailscale.com/download</a></p>
</li>
<li><p>Run the <code>.msi</code> file</p>
</li>
<li><p>Launch Tailscale from the system tray</p>
</li>
<li><p>Log in with the same account</p>
</li>
</ol>
<h3>Linux</h3>
<p>Same commands as the Spark:</p>
<pre><code class="language-bash">sudo apt update
sudo apt install -y curl gnupg

curl -fsSL https://pkgs.tailscale.com/stable/ubuntu/noble.noarmor.gpg | \
  sudo tee /usr/share/keyrings/tailscale-archive-keyring.gpg &gt; /dev/null

curl -fsSL https://pkgs.tailscale.com/stable/ubuntu/noble.tailscale-keyring.list | \
  sudo tee /etc/apt/sources.list.d/tailscale.list

sudo apt update
sudo apt install -y tailscale
sudo tailscale up
</code></pre>
<h2>Step 4: SSH Into Your Spark From Anywhere</h2>
<p>First, confirm both devices see each other:</p>
<pre><code class="language-bash">$ tailscale status
100.104.142.22  spark-5223           saiyamxxx@  linux  -
100.108.115.75  saiyams-macbook-pro  saiyam9xxx@  macOS  -
</code></pre>
<p>You should see your Spark listed. Now, simply:</p>
<pre><code class="language-bash">ssh saiyam@spark-5223
</code></pre>
<p>That's it. Tailscale's <strong>MagicDNS</strong> resolves <code>spark-5223</code> to the right Tailscale IP automatically. No need to remember IP addresses.</p>
<p>If MagicDNS isn't working for some reason, use the Tailscale IP directly:</p>
<pre><code class="language-bash"># Find the IP
tailscale status
# Look for spark-5223 and note the 100.x.x.x address

ssh saiyam@100.104.142.22
</code></pre>
<h3>Setting Up SSH Key Authentication</h3>
<p>For passwordless SSH access, set up key-based authentication. If you already have an SSH key (check <code>~/.ssh/id_ed25519.pub</code> or <code>~/.ssh/id_rsa.pub</code>), add it to the Spark:</p>
<pre><code class="language-bash"># Copy your public key to the Spark (will ask for password once)
ssh-copy-id saiyam@spark-5223
</code></pre>
<p>Or manually add it on the Spark:</p>
<pre><code class="language-bash"># On the Spark — append the public key
echo "your-public-key-here" &gt;&gt; ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
chmod 700 ~/.ssh
</code></pre>
<p>After that, SSH works without a password prompt.</p>
<blockquote>
<p><strong>Note:</strong> Password authentication still works alongside SSH keys. You don't have to choose one or the other.</p>
</blockquote>
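<p>Optionally, add a host alias so the command gets even shorter. A minimal sketch, where the alias name <code>spark</code> is just an example:</p>
<pre><code class="language-bash"># On your laptop — append a host entry to ~/.ssh/config
cat &gt;&gt; ~/.ssh/config &lt;&lt;'EOF'
Host spark
    HostName spark-5223
    User saiyam
EOF

# Now this works:
ssh spark
</code></pre>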
<h2>What About My Second Laptop?</h2>
<p>This is the beauty of Tailscale — <strong>just install and log in</strong>:</p>
<ol>
<li><p>Install Tailscale on the second laptop (using the steps above for your OS)</p>
</li>
<li><p>Log in with the same account</p>
</li>
<li><p>Run <code>ssh saiyam@spark-5223</code></p>
</li>
</ol>
<p>No extra configuration on the Spark. Every device on your tailnet can reach every other device automatically.</p>
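<p>If you want to sanity-check connectivity from the new device before SSHing in:</p>
<pre><code class="language-bash"># From the second laptop
tailscale ping spark-5223   # expect a pong, direct or via DERP relay
ssh saiyam@spark-5223
</code></pre>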
<h2>Sharing Your Spark With a Friend</h2>
<p>What if a friend also needs SSH access to your Spark — simultaneously, from their own laptop? You don't need to create a new Tailscale account for them. Use a <strong>pre-auth key</strong> to add their device to your tailnet.</p>
<h3>Generate a Pre-Auth Key</h3>
<ol>
<li><p>Go to the <a href="https://login.tailscale.com/admin/settings/keys">Tailscale Admin Console</a></p>
</li>
<li><p>Click <strong>"Generate auth key..."</strong></p>
</li>
<li><p>Enable <strong>Reusable</strong> if you want it to work for multiple devices</p>
</li>
<li><p>Set an expiration as needed</p>
</li>
<li><p>Copy the key (starts with <code>tskey-auth-...</code>)</p>
</li>
</ol>
<h3>Your Friend's Setup (macOS)</h3>
<ol>
<li><p>Install Tailscale from the <a href="https://apps.apple.com/app/tailscale/id1475387142">Mac App Store</a></p>
</li>
<li><p><strong>Important:</strong> If they're already logged in to their own Tailscale account, they need to leave it first:</p>
<pre><code class="language-bash">sudo tailscale logout
</code></pre>
</li>
<li><p>Join your tailnet using the pre-auth key:</p>
<pre><code class="language-bash">sudo tailscale up --auth-key=tskey-auth-xxxxxxxxxxxx
</code></pre>
</li>
<li><p>That's it — their Mac is now on your tailnet. No login, no email needed.</p>
</li>
</ol>
<h3>Add Their SSH Key to the Spark</h3>
<p>Your friend should generate an SSH key on their Mac (if they don't have one):</p>
<pre><code class="language-bash">ssh-keygen -t ed25519
</code></pre>
<p>Then share their public key with you (the contents of <code>~/.ssh/id_ed25519.pub</code>). On the Spark, add it:</p>
<pre><code class="language-bash">echo "ssh-ed25519 AAAA...their-key-here... friend@hostname" &gt;&gt; ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
</code></pre>
<p>Now your friend can SSH in directly:</p>
<pre><code class="language-bash">ssh saiyam@spark-5223
</code></pre>
<p>No password prompt — the key handles authentication automatically. SSH automatically tries keys from the default location (<code>~/.ssh/id_ed25519</code>), so your friend does <strong>not</strong> need to use <code>ssh -i</code>.</p>
<p>Verify it all works:</p>
<pre><code class="language-bash">$ tailscale status
100.104.142.22  spark-5223           saiyamxxx@  linux  -
100.67.209.38   rohits-macbook-pro   saiyamxxx@  macOS  -
100.108.115.75  saiyams-macbook-pro  saiyamxxx@  macOS  -
</code></pre>
<p>Three devices, one tailnet, simultaneous SSH access.</p>
<blockquote>
<p><strong>Tip:</strong> You can manage access from the <a href="https://login.tailscale.com/admin/machines">Tailscale Admin Console</a>. To revoke someone's access, remove their device from the console and delete their key from <code>~/.ssh/authorized_keys</code> on the Spark.</p>
</blockquote>
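<p>For the key-removal part, one way to do it (assuming the key ends with the comment <code>friend@hostname</code>, which is a placeholder here):</p>
<pre><code class="language-bash"># On the Spark — drop the friend's key from authorized_keys
grep -v "friend@hostname" ~/.ssh/authorized_keys &gt; /tmp/authorized_keys.tmp
mv /tmp/authorized_keys.tmp ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
</code></pre>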
<h2>Troubleshooting</h2>
<h3>"No Matching Peer" Error</h3>
<p>If your friend gets a "no matching peer" error when trying to SSH, it means <strong>they're on a different tailnet</strong> — not yours.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/55475e51-a4e7-4e4f-816b-91c7fae9c474.png" alt="" style="display:block;margin:0 auto" />

<p>The <code>100.x.x.x</code> Tailscale IPs are only reachable between devices on the <strong>same tailnet</strong>. The fix:</p>
<pre><code class="language-bash"># Friend logs out of their own tailnet
sudo tailscale logout

# Friend joins YOUR tailnet with your pre-auth key
sudo tailscale up --auth-key=tskey-auth-xxxxxxxxxxxx
</code></pre>
<h3>SSH Connection Timeout</h3>
<p>If <code>tailscale ping</code> works but SSH times out:</p>
<pre><code class="language-bash"># On the Spark — check SSH is running
sudo systemctl status ssh

# Check firewall isn't blocking
sudo ufw status

# If SSH isn't running
sudo systemctl start ssh

# If firewall is active and blocking
sudo ufw allow 22/tcp
</code></pre>
<p>Also check SSH is listening on all interfaces:</p>
<pre><code class="language-bash">$ ss -tlnp | grep 22
LISTEN  0  4096  0.0.0.0:22  0.0.0.0:*  users:(("sshd",...))
LISTEN  0  4096     [::]:22     [::]:*  users:(("sshd",...))
</code></pre>
<p>If SSH is only listening on a specific IP, edit <code>/etc/ssh/sshd_config</code> to ensure <code>ListenAddress</code> is not restricted, then <code>sudo systemctl restart ssh</code>.</p>
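<p>A quick way to check: any uncommented <code>ListenAddress</code> line means sshd is bound to a specific address.</p>
<pre><code class="language-bash"># No output means sshd listens on all interfaces (the default)
grep -E '^[[:space:]]*ListenAddress' /etc/ssh/sshd_config

# After editing the config, restart sshd
sudo systemctl restart ssh
</code></pre>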
<h3>Permission Denied (publickey, password)</h3>
<p>This means SSH connected but authentication failed. Common causes:</p>
<ul>
<li><p>Your SSH key isn't in <code>~/.ssh/authorized_keys</code> on the Spark</p>
</li>
<li><p>You're using a non-default key path (use <code>ssh -i /path/to/key</code>)</p>
</li>
<li><p>Password authentication is disabled in sshd_config</p>
</li>
</ul>
<p>Check the authorized keys on the Spark:</p>
<pre><code class="language-bash">cat ~/.ssh/authorized_keys
</code></pre>
<p>Make sure your public key is listed there.</p>
<h2>Useful Commands Cheat Sheet</h2>
<table>
<thead>
<tr>
<th>Command</th>
<th>What it does</th>
</tr>
</thead>
<tbody><tr>
<td><code>tailscale status</code></td>
<td>List all devices on your tailnet</td>
</tr>
<tr>
<td><code>tailscale ping spark-5223</code></td>
<td>Test connectivity to a device</td>
</tr>
<tr>
<td><code>tailscale ip</code></td>
<td>Show your device's Tailscale IP</td>
</tr>
<tr>
<td><code>ssh saiyam@spark-5223</code></td>
<td>SSH using MagicDNS hostname</td>
</tr>
<tr>
<td><code>sudo tailscale up</code></td>
<td>Connect to tailnet</td>
</tr>
<tr>
<td><code>sudo tailscale down</code></td>
<td>Disconnect from tailnet</td>
</tr>
<tr>
<td><code>sudo tailscale logout</code></td>
<td>Leave the current tailnet entirely</td>
</tr>
<tr>
<td><code>ssh-copy-id saiyam@spark-5223</code></td>
<td>Copy your SSH key to the Spark</td>
</tr>
</tbody></table>
<h2>Pro Tips</h2>
<ol>
<li><p><strong>Tailscale starts on boot</strong> — the <code>tailscaled</code> service is enabled by default, so your Spark will rejoin the tailnet automatically after a reboot.</p>
</li>
<li><p><strong>Forward ports for Jupyter</strong> — if you run JupyterLab on your Spark:</p>
<pre><code class="language-bash">ssh -L 8888:localhost:8888 saiyam@spark-5223
</code></pre>
<p>Then open <code>http://localhost:8888</code> in your browser.</p>
</li>
<li><p><strong>File transfers work too:</strong></p>
<pre><code class="language-bash">scp model.bin saiyam@spark-5223:~/models/
</code></pre>
</li>
<li><p><strong>Check who's connected</strong> — on the Spark, see active SSH sessions:</p>
<pre><code class="language-bash">who
</code></pre>
</li>
<li><p><strong>Tailscale admin console</strong> — monitor all devices, manage keys, and remove devices at <a href="https://login.tailscale.com/admin">login.tailscale.com/admin</a>.</p>
</li>
</ol>
<h2>Cleanup (If Needed)</h2>
<p>If you ever want to remove Tailscale from your Spark:</p>
<pre><code class="language-bash">sudo tailscale down
sudo apt remove --purge tailscale
sudo rm /etc/apt/sources.list.d/tailscale.list
sudo rm /usr/share/keyrings/tailscale-archive-keyring.gpg
sudo apt update
</code></pre>
<p>To restore: re-run installation steps 1-2.</p>
<h2>Wrapping Up</h2>
<p>The whole setup took me about 10 minutes. Now I can SSH into my DGX Spark from my MacBook at home, my second laptop on the go, and even my friend can access it simultaneously from his MacBook — all without any port forwarding, static IPs, or VPN servers.</p>
<p>The key takeaways:</p>
<ul>
<li><p><strong>For yourself:</strong> Install Tailscale on both devices, log in with the same account, <code>ssh</code> in</p>
</li>
<li><p><strong>For friends:</strong> Generate a pre-auth key, have them join your tailnet, add their SSH public key to the Spark</p>
</li>
<li><p><strong>Troubleshooting:</strong> Make sure all devices are on the same tailnet, SSH is running, and keys are in <code>authorized_keys</code></p>
</li>
</ul>
<p>Just <code>ssh saiyam@spark-5223</code> — from anywhere in the world.</p>
<p>I also used this at 30,000 feet in the air!</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/1414da6d-8abe-4012-941e-081e310a16d4.png" alt="" style="display:block;margin:0 auto" />
<p><a class="embed-card" href="https://x.com/SaiyamPathak/status/2032098978213528037?s=20">https://x.com/SaiyamPathak/status/2032098978213528037?s=20</a></p>
]]></content:encoded></item><item><title><![CDATA[What Claude Code's Leaked Source Actually Teaches Us About Building AI Agents]]></title><description><![CDATA[Let me start with the honest version of what happened.
Yesterday, Anthropic accidentally published a 59.8 MB source map file inside version 2.1.88 of their @anthropic-ai/claude-code npm package. The b]]></description><link>https://blog.kubesimplify.com/claude-code-leak-what-the-source-actually-teaches</link><guid isPermaLink="true">https://blog.kubesimplify.com/claude-code-leak-what-the-source-actually-teaches</guid><category><![CDATA[ai agents]]></category><category><![CDATA[claude-code]]></category><category><![CDATA[TypeScript]]></category><category><![CDATA[AI Engineering]]></category><category><![CDATA[llm]]></category><dc:creator><![CDATA[Saiyam Pathak]]></dc:creator><pubDate>Wed, 01 Apr 2026 15:13:33 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/0582a05f-42f3-4b97-8512-9c2133603126.svg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Let me start with the honest version of what happened.</p>
<p>Yesterday, Anthropic accidentally published a 59.8 MB source map file inside version 2.1.88 of their <code>@anthropic-ai/claude-code</code> npm package. The build pipeline was configured to generate source maps, and the packaging step — whether it was a missing <code>.npmignore</code> rule or a misconfigured <code>files</code> field in <code>package.json</code> — failed to exclude them. One packaging oversight, and ~512,000 lines of TypeScript source were public. Anthropic's DMCA notice eventually took down over 8,100 GitHub repositories. A clean-room rewrite called <a href="https://github.com/instructkr/claw-code">claw-code</a> hit 50,000 stars in about two hours and now sits above 100,000.</p>
<p>Within the last 24 hours, the internet was flooded with hot takes. Architecture diagrams. The virtual pet system. Thread after thread of "I read the entire codebase and here's what I found." Sites like <a href="https://ccunpacked.dev">ccunpacked.dev</a> did genuinely good visual walkthroughs of the high-level structure.</p>
<p>But let's be real — nobody read 512,000 lines of TypeScript. I certainly didn't, and I'm skeptical of anyone who claims they did. What I did do was feed the source into Claude, systematically analyzed the key modules, cross-referenced what I found against public documentation and other analyses, and verified the claims I'm about to make against the actual code. If you've read other "I analyzed the leak" posts, they probably all used a similar workflow. The difference I'm going for here is honesty about the process.</p>
<hr />
<h2>The Core Loop Is a State Machine, and That's the Whole Point</h2>
<p>The agent loop lives in <code>query.ts</code>. It's exactly 1,729 lines (I checked), structured as an async generator function called <code>queryLoop</code> wrapping a <code>while(true)</code> loop. The code itself, in an internal comment, references "7 continue sites" — seven distinct points where the loop yields control and decides what to do next.</p>
<p>The actual function signature:</p>
<pre><code class="language-typescript">async function* queryLoop(
  params: QueryParams,
  consumedCommandUuids: string[],
): AsyncGenerator&lt;
  StreamEvent | RequestStartEvent | Message | TombstoneMessage | ToolUseSummaryMessage,
  Terminal
&gt;
</code></pre>
<p>Why does this matter? Because most agent frameworks treat the LLM call as the center of gravity. Send a prompt, get a response, run a tool, repeat. That works fine for demos. It falls apart the moment you need to pause a session, resume it later, serialize state, handle errors mid-turn, or compose multiple agents together.</p>
<p>The generator pattern makes every loop iteration an explicit state transition. You can yield control at each of the seven points without losing state. You can test individual stages. You can add compaction, permission checks, or budget tracking as stages rather than side effects bolted onto a callback chain.</p>
<p>If you're building an agent and your core loop is a simple <code>while</code> with <code>await model.chat()</code> in the middle, this is the pattern to study.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/87e9c4e9-9bfd-44ac-bdc0-f9fbf5c3d30b.png" alt="" style="display:block;margin:0 auto" />

<h2>Five Compaction Strategies (Not a Neat Stack)</h2>
<p>Every long-running agent eventually fills its context window. Most frameworks handle this by truncating old messages. Claude Code has five distinct strategies — though I want to be clear, these aren't a clean "Layer 1 through 5" hierarchy like some other posts have described. They're composable strategies that kick in under different conditions:</p>
<p><strong>Snip</strong> prunes older messages for quick headroom. Fast and lossy.</p>
<p><strong>Microcompact</strong> targets tool outputs specifically. A 5,000-line file read gets saved to disk; the model sees a summary with a reference. Two implementations handle this: <code>microCompact.ts</code> and <code>apiMicrocompact.ts</code>. This alone is a big deal — a single uncompressed tool output can eat half your context window.</p>
<p><strong>Context Collapse</strong> progressively compresses older conversation segments while keeping recent context sharp. It's still behind a <code>CONTEXT_COLLAPSE</code> feature flag, with dedicated persistence types (<code>ContextCollapseCommitEntry</code>, <code>ContextCollapseSnapshotEntry</code>) to survive session restarts. Not yet fully shipped.</p>
<p><strong>Autocompact</strong> is full-conversation summarization at configurable token thresholds. Replaces older history with a summary.</p>
<p><strong>Reactive Compact</strong> is the emergency brake — behind the <code>REACTIVE_COMPACT</code> feature flag. When the API returns a 413 (payload too large), this aggressively compacts everything so your session doesn't die. Without this, one bad tool output would brick the conversation.</p>
<p>Now, I've seen posts claiming "no other framework has this." That was arguably true in 2025, but it's not true now. Microsoft's <a href="https://learn.microsoft.com/en-us/agent-framework/agents/conversations/compaction">Agent Framework</a> has composable multi-strategy compaction pipelines. <a href="https://blog.langchain.com/context-management-for-deepagents/">LangChain Deep Agents</a> (shipped March 15, 2026) does filesystem offloading plus multi-frequency summarization. <a href="https://google.github.io/adk-docs/context/compaction/">Google ADK</a> has sliding window with summarization.</p>
<p>What sets Claude Code apart isn't that it has compaction — it's the granularity. Five strategies, two of them still being iterated on behind feature flags. That reflects the kind of edge cases you only discover at scale.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/17d6086b-cbcc-41f9-8fe6-23f9e71b817d.png" alt="" style="display:block;margin:0 auto" />

<h2>Deferred Tool Loading</h2>
<p>This is probably the most practical pattern in the codebase for anyone building agents.</p>
<p>When you connect MCP servers, you might have 200+ tools available. Sending all those schemas on every API call wastes thousands of tokens. Claude Code's solution: mark tools with <code>defer_loading: true</code>. The model doesn't see them. Instead, it has a single meta-tool called <code>ToolSearch</code> (the internal class is <code>ToolSearchTool</code>, but the model-facing name is <code>ToolSearch</code> — defined as <code>TOOL_SEARCH_TOOL_NAME = 'ToolSearch'</code> in the constants). When the model needs a capability, it calls <code>ToolSearch</code> with a query:</p>
<pre><code class="language-plaintext">User: "Deploy this to my Kubernetes cluster"

Model calls ToolSearch("kubernetes deploy")
  -&gt; System fuzzy-matches deferred tool descriptions
  -&gt; Injects matching schemas into the conversation
Model now has the tools it needs.
</code></pre>
<p>The model goes from ~20 core tools to access to hundreds, without the upfront token cost.</p>
<p>This pattern has spread. The <a href="https://developers.openai.com/api/docs/guides/tools-tool-search">OpenAI Agents SDK</a> now has <code>deferLoading: true</code> with tool search (requires GPT-5.4+). <a href="https://github.com/zeroclaw-labs/zeroclaw">ZeroClaw</a> implements nearly identical deferred loading. <a href="https://github.com/crewAIInc/crewAI/pull/4779">CrewAI 1.10.2a1</a> (March 2026) added dynamic tool injection via Anthropic's tool search API.</p>
<p>But there's still no framework-agnostic library for this. The core is straightforward — fuzzy matching over tool descriptions, schema injection on demand, MCP compatibility. If someone built this as a standalone package, it'd be useful immediately.</p>
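<p>To make the shape of the pattern concrete, here's a toy shell illustration. The on-disk manifest layout and <code>description</code> field are hypothetical; Claude Code does all of this in-process, not via files:</p>
<pre><code class="language-bash"># Each deferred tool ships a JSON manifest with a description field.
# "Search" = naive keyword match over descriptions; hits get their schemas injected.
query="kubernetes deploy"
pattern=$(echo "$query" | tr ' ' '|')

for f in tools/*.json; do
  if jq -r '.description // ""' "$f" | grep -qiE "$pattern"; then
    echo "inject schema from: $f"
  fi
done
</code></pre>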
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/0fde2eb1-f66e-48a9-acce-646add485209.png" alt="" style="display:block;margin:0 auto" />

<hr />
<h2>Default-Deny Permissions With a Graceful Fallback</h2>
<p>Claude Code's permission system is built on default-deny. Every tool has two permission-relevant properties defined in <code>Tool.ts</code>:</p>
<ul>
<li><p><code>isReadOnly</code> — defaults to <code>false</code> (assume the tool writes)</p>
</li>
<li><p><code>isDestructive</code> — defaults to <code>false</code></p>
</li>
</ul>
<p>Tools must explicitly declare their risk profile. The permission system then layers rule-based checks (<code>alwaysAllow</code>/<code>alwaysDeny</code> rules), pre-tool-use hooks (which can modify input, block execution, or log), and an auto-mode safety classifier.</p>
<p>The part I found most interesting is the denial tracking in <code>denialTracking.ts</code> — just 46 lines:</p>
<pre><code class="language-plaintext">3 consecutive denials → shouldFallbackToPrompting() returns true
20 total denials in a session → same result
</code></pre>
<p>If the user keeps saying "no," the system stops running in auto-mode and starts asking for explicit permission on every action. Most agent frameworks either keep retrying or hard-stop. Claude Code's approach gracefully degrades: "You're uncomfortable with what I'm doing, so I'll check before each step."</p>
<p>Small file, big principle.</p>
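<p>The heuristic is small enough to sketch in a few lines of shell. This mirrors the logic described above, not Anthropic's actual code:</p>
<pre><code class="language-bash"># Fall back to prompting after 3 consecutive or 20 total denials
consecutive=0
total=0

record_denial()   { consecutive=$((consecutive + 1)); total=$((total + 1)); }
record_approval() { consecutive=0; }

# Usage: if should_fallback_to_prompting; then ask_before_every_action; fi
should_fallback_to_prompting() {
  [ "$consecutive" -ge 3 ] || [ "$total" -ge 20 ]
}
</code></pre>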
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/69f00384-911e-4f22-b886-a52b934bc576.png" alt="" style="display:block;margin:0 auto" />

<hr />
<h2>The Cost Engineering</h2>
<p>These are the details that only matter at Anthropic's scale, but they're instructive:</p>
<p><strong>Sticky-on latches.</strong> When a feature flag activates during a session, it stays on for the rest of that session. Flipping it back would change the system prompt, which busts the prompt cache. The <code>promptCacheBreakDetection.ts</code> file tracks 14 distinct state fields that can invalidate the cache — system prompt hash, tool schema hashes, model changes, beta headers, effort values, and more. Sticky latches prevent unnecessary cache invalidation from mode toggles.</p>
<p><strong>Tool result persistence.</strong> Large outputs get written to disk; the model sees a preview. This isn't just context management — it keeps the cache prefix stable.</p>
<p><strong>Schema stability.</strong> Tool schemas assembled once at session start, held stable throughout. MCP tools can come and go, but the core schema block doesn't change.</p>
<p>At scale, these optimizations compound significantly.</p>
<hr />
<h2>What I Actually Took Away From This</h2>
<p>I'm not going to pretend the leak is a startup idea list or that I discovered things nobody else saw. But analyzing the code did crystallize a few things:</p>
<p><strong>Context management is harder than it looks.</strong> Five strategies, two still behind feature flags, a dedicated <code>promptCacheBreakDetection.ts</code> tracking 14 vectors. This is not a solved problem, even for Anthropic.</p>
<p><strong>Deferred tool loading is becoming table stakes.</strong> Claude Code, OpenAI, ZeroClaw, CrewAI — multiple teams independently arrived at the same pattern. If you're building an agent with more than ~20 tools and you're not doing this, you're wasting tokens.</p>
<p><strong>Permission design matters more than permission features.</strong> The denial tracking system is 46 lines. The principle it encodes — "degrade gracefully when the user loses trust" — is more important than any specific implementation detail.</p>
<p><strong>The real work is in the orchestration.</strong> The model call is one stage out of seven in the main loop. Everything else — state management, compaction, tool loading, permissions, cost optimization — is where the engineering actually lives.</p>
<hr />
<h2>Being Honest About the Process</h2>
<p>Every blog you've read about this leak was written with AI assistance. This one included. I used Claude to analyze the source modules, identify patterns, and draft the initial structure. I then fact-checked every claim — cross-referencing the actual source code, public documentation, news coverage, and other technical analyses. Where I found errors in my initial draft (and there were several), I corrected them.</p>
<p>The value isn't in pretending I manually read half a million lines of code. It's in doing the verification work to make sure what I'm telling you is actually true. In a sea of AI-generated analysis of AI-generated code, accuracy is the differentiator.</p>
<p>If you spot something wrong, tell me — I'd rather correct it than let it stand.</p>
<p><em>I write about cloud-native, AI engineering, and the infrastructure that makes modern software work. Find me on</em> <a href="https://twitter.com/SaiyamPathak"><em>Twitter</em></a> <em>or</em> <a href="https://linkedin.com/in/saiyampathak"><em>LinkedIn</em></a><em>.</em></p>
]]></content:encoded></item><item><title><![CDATA[The Ingress NGINX Migration Just Got Easier: 119 Annotations, 3 Targets, Impact Ratings]]></title><description><![CDATA[A few months ago, I built ing-switch and wrote about it on kubesimplify. The response was incredible -- people loved the annotation mapping and the visual dashboard.
Since then, ingress-nginx was offi]]></description><link>https://blog.kubesimplify.com/ing-switch-119-annotations-gateway-api-traefik-impact-ratings</link><guid isPermaLink="true">https://blog.kubesimplify.com/ing-switch-119-annotations-gateway-api-traefik-impact-ratings</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Gateway API]]></category><category><![CDATA[Traefik]]></category><category><![CDATA[ingress-nginx]]></category><category><![CDATA[cloud native]]></category><category><![CDATA[migration]]></category><dc:creator><![CDATA[Saiyam Pathak]]></dc:creator><pubDate>Mon, 30 Mar 2026 12:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/a6561232-6f6b-451c-86ca-bbf693fbb9a6.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A few months ago, I built <a href="https://github.com/saiyam1814/ing-switch">ing-switch</a> and <a href="https://blog.kubesimplify.com/ing-switch-migrate-from-ingress-nginx-to-traefik-or-gateway-api-in-minutes-not-days">wrote about it on kubesimplify</a>. The response was incredible -- people loved the annotation mapping and the visual dashboard.</p>
<p>Since then, <strong>ingress-nginx was officially archived</strong> (March 24, 2026). March 31 is end of life -- zero security patches after that date.</p>
<p>Based on community feedback from KubeCon, this is the biggest update yet: <strong>119 annotations</strong> (up from 50), <strong>Gateway API with Traefik as the provider</strong> (the #1 request), and <strong>impact ratings</strong> on every annotation so you know exactly what matters.</p>
<p>This post walks through a <strong>complete end-to-end migration</strong> on a <a href="https://github.com/loft-sh/vind">vind</a> cluster with actual command outputs.</p>
<h2>Why You Need to Migrate Now</h2>
<ul>
<li><p><strong>Nov 11, 2025:</strong> Kubernetes SIG Network announces ingress-nginx retirement</p>
</li>
<li><p><strong>Jan 29, 2026:</strong> Joint statement from Kubernetes Steering + Security Response Committees urging immediate migration</p>
</li>
<li><p><strong>Mar 24, 2026:</strong> GitHub repository archived (read-only)</p>
</li>
<li><p><strong>Mar 31, 2026:</strong> End of life -- zero support from this date</p>
</li>
</ul>
<p>Chainguard maintains a fork for CVE-level fixes only -- no features, no community PRs, no pre-built images. You're on your own.</p>
<h2>The Three Migration Paths</h2>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/eeadef24-e6cd-455a-847c-34fedd6cd96e.png" alt="" style="display:block;margin:0 auto" />

<table>
<thead>
<tr>
<th>Target</th>
<th>Best For</th>
<th>What Changes</th>
</tr>
</thead>
<tbody><tr>
<td><strong>Traefik v3</strong></td>
<td>Fastest migration, lowest friction</td>
<td>Keep Ingress API, swap annotations to Middleware CRDs</td>
</tr>
<tr>
<td><strong>Gateway API (Envoy)</strong></td>
<td>Future-proof standard</td>
<td>Replace Ingresses with HTTPRoutes, Envoy policies</td>
</tr>
<tr>
<td><strong>Gateway API (Traefik)</strong></td>
<td>Rancher / k3s users</td>
<td>Standard HTTPRoutes + Gateway resources, with Traefik as the controller implementation. Advanced features (rate limiting, auth, IP filtering) use Traefik Middleware CRDs as extension policies.</td>
</tr>
</tbody></table>
<h2>The Annotation Problem</h2>
<p>The real complexity isn't swapping controllers -- it's the <strong>annotations</strong>. A typical production Ingress has 10-15 NGINX annotations for SSL, auth, rate limiting, CORS, session affinity, and more.</p>
<p>ing-switch maps <strong>119 annotations</strong> with impact ratings:</p>
<table>
<thead>
<tr>
<th></th>
<th>Traefik</th>
<th>Gateway API</th>
</tr>
</thead>
<tbody><tr>
<td>Supported (direct equivalent)</td>
<td>35</td>
<td>39</td>
</tr>
<tr>
<td>Partial (needs minor adjustment)</td>
<td>48</td>
<td>25</td>
</tr>
<tr>
<td>Unsupported (with impact notes)</td>
<td>42</td>
<td>62</td>
</tr>
</tbody></table>
<p>Every unsupported annotation gets an <strong>impact rating</strong>: <code>NONE</code> (safe to ignore), <code>LOW</code> (better defaults), <code>MEDIUM</code> (needs workaround), or <code>VARIES</code> (review your snippets). Most teams discover <strong>70%+ of "unsupported" annotations are safe to ignore</strong>.</p>
<h2>End-to-End Demo: vCluster + ing-switch</h2>
<p><a href="https://asciinema.org/a/nOYDQukAC4bzdSVI"><img src="https://asciinema.org/a/nOYDQukAC4bzdSVI.svg" alt="asciicast" style="display:block;margin:0 auto" /></a></p>
<p>Let's walk through a complete migration on a real cluster. We'll use <a href="https://www.vcluster.com/">vCluster</a> to spin up a Kubernetes cluster in Docker, deploy 3 services with NGINX annotations, and migrate them to Gateway API with Traefik.</p>
<h3>Step 1: Create a Cluster</h3>
<pre><code class="language-bash">vcluster create demo --driver docker
</code></pre>
<p>Output:</p>
<pre><code class="language-text">info  Using vCluster driver 'docker' to create your virtual clusters
info  Ensuring environment for vCluster demo...
done  Created network vcluster.demo
info  Starting vCluster standalone demo
done  Successfully created virtual cluster demo
info  Waiting for vCluster to become ready...
done  vCluster is ready
done  Switched active kube context to vcluster-docker_demo
</code></pre>
<p>Verify:</p>
<pre><code class="language-bash">kubectl get namespaces
</code></pre>
<pre><code class="language-text">NAME                 STATUS   AGE
default              Active   16s
kube-flannel         Active   6s
kube-node-lease      Active   16s
kube-public          Active   16s
kube-system          Active   16s
local-path-storage   Active   6s
</code></pre>
<h3>Step 2: Install Ingress NGINX</h3>
<pre><code class="language-bash">helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --set controller.service.type=ClusterIP \
  --set controller.admissionWebhooks.enabled=false \
  --wait --timeout 120s
</code></pre>
<pre><code class="language-text">NAME: ingress-nginx
LAST DEPLOYED: Sun Mar 29 11:15:57 2026
NAMESPACE: ingress-nginx
STATUS: deployed
</code></pre>
<pre><code class="language-bash">kubectl get pods -n ingress-nginx
</code></pre>
<pre><code class="language-text">NAME                                        READY   STATUS    RESTARTS   AGE
ingress-nginx-controller-5486dbd97f-vc9wv   1/1     Running   0          54s
</code></pre>
<h3>Step 3: Deploy 3 Apps with NGINX Annotations</h3>
<p>We deploy three services, each with different annotation patterns:</p>
<p><strong>App 1 -- Basic web app</strong> (SSL redirect + timeouts):</p>
<pre><code class="language-yaml">apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-app
  namespace: demo
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "10"
spec:
  ingressClassName: nginx
  rules:
  - host: web.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-app
            port:
              number: 80
</code></pre>
<p><strong>App 2 -- API with CORS + rate limiting</strong> (10 annotations):</p>
<pre><code class="language-yaml">apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-cors
  namespace: demo
  annotations:
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
    nginx.ingress.kubernetes.io/enable-cors: "true"
    nginx.ingress.kubernetes.io/cors-allow-origin: "https://app.example.com,https://admin.example.com"
    nginx.ingress.kubernetes.io/cors-allow-methods: "GET, POST, PUT, DELETE, OPTIONS"
    nginx.ingress.kubernetes.io/cors-allow-headers: "Content-Type, Authorization, X-API-Key"
    nginx.ingress.kubernetes.io/cors-allow-credentials: "true"
    nginx.ingress.kubernetes.io/cors-max-age: "86400"
    nginx.ingress.kubernetes.io/limit-rps: "50"
    nginx.ingress.kubernetes.io/limit-burst-multiplier: "3"
    nginx.ingress.kubernetes.io/proxy-body-size: "5m"
spec:
  ingressClassName: nginx
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /v1
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
</code></pre>
<p><strong>App 3 -- Auth-protected dashboard</strong> (external auth + IP allowlist + session affinity):</p>
<pre><code class="language-yaml">apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dashboard
  namespace: demo
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/auth-url: "https://auth.example.com/verify"
    nginx.ingress.kubernetes.io/auth-response-headers: "X-User-ID,X-User-Email"
    nginx.ingress.kubernetes.io/whitelist-source-range: "10.0.0.0/8,172.16.0.0/12"
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "dashboard-session"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "3600"
spec:
  ingressClassName: nginx
  rules:
  - host: dashboard.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: dashboard
            port:
              number: 80
</code></pre>
<p>After applying all three:</p>
<pre><code class="language-bash">kubectl get ingress -n demo
</code></pre>
<pre><code class="language-text">NAME        CLASS   HOSTS                   ADDRESS   PORTS   AGE
api-cors    nginx   api.example.com                   80      5s
dashboard   nginx   dashboard.example.com             80      5s
web-app     nginx   web.example.com                   80      5s
</code></pre>
<pre><code class="language-bash">kubectl get pods -n demo
</code></pre>
<pre><code class="language-text">NAME                           READY   STATUS    RESTARTS   AGE
api-service-5f99b6d99d-x7vmn   1/1     Running   0          24s
dashboard-9ddbf867-7dbgf       1/1     Running   0          24s
web-app-969c76b7c-7wqw5        1/1     Running   0          24s
</code></pre>
<p>3 ingresses, 20 NGINX annotations, 3 services running. Now let's see what ing-switch makes of this.</p>
<h3>Step 4: Scan the Cluster</h3>
<pre><code class="language-bash">ing-switch scan
</code></pre>
<pre><code class="language-text">  ing-switch -- Cluster Scan Results
  Cluster: vcluster-docker_demo

  Ingress Controller Detected
  Type:      ingress-nginx
  Version:   unknown
  Namespace: ingress-nginx

  Found 3 Ingress resource(s)

  NAMESPACE   NAME        HOSTS                   ANNOTATIONS   TLS   COMPLEXITY
  ---------   ----        -----                   -----------   ---   ----------
  demo        api-cors    api.example.com         10            no    unsupported
  demo        dashboard   dashboard.example.com   7             no    complex
  demo        web-app     web.example.com         3             no    complex
</code></pre>
<p>ing-switch detected the NGINX controller and found all 3 ingresses with their annotation counts and complexity scores.</p>
<h3>Step 5: Analyze Compatibility</h3>
<p>Let's compare all three targets:</p>
<p><strong>Traefik v3:</strong></p>
<pre><code class="language-bash">ing-switch analyze --target traefik
</code></pre>
<pre><code class="language-text">  Summary
  -------
  Total ingresses:      3
  Fully compatible:     1
  Needs workarounds:    2
  Has unsupported:      0
</code></pre>
<p><strong>Gateway API (Envoy):</strong></p>
<pre><code class="language-bash">ing-switch analyze --target gateway-api
</code></pre>
<pre><code class="language-text">  Summary
  -------
  Total ingresses:      3
  Fully compatible:     0
  Needs workarounds:    3
  Has unsupported:      0
</code></pre>
<p><strong>Gateway API (Traefik):</strong></p>
<pre><code class="language-bash">ing-switch analyze --target gateway-api-traefik
</code></pre>
<pre><code class="language-text">  Summary
  -------
  Total ingresses:      3
  Fully compatible:     0
  Needs workarounds:    3
  Has unsupported:      0
</code></pre>
<p>Key insight: <strong>Traefik is the highest-compatibility target</strong> for this workload (1 fully compatible out of 3). The CORS annotations map directly to Traefik's Headers middleware. For Gateway API, CORS is now also fully supported thanks to the native CORS filter in Gateway API v1.5.</p>
<p>Here's the detailed annotation mapping for the API with CORS:</p>
<pre><code class="language-text">  demo/api-cors
  -------------
  ANNOTATION               STATUS        TARGET RESOURCE                    NOTES
  enable-cors              [supported]   HTTPRoute (CORS filter)            Native CORS filter (GA in Gateway API v1.5)
  cors-allow-origin        [supported]   HTTPRoute (CORS filter)            allowOrigins in CORS filter
  cors-allow-methods       [supported]   HTTPRoute (CORS filter)            allowMethods in CORS filter
  cors-allow-headers       [supported]   HTTPRoute (CORS filter)            allowHeaders in CORS filter
  cors-allow-credentials   [supported]   HTTPRoute (CORS filter)            allowCredentials in CORS filter
  cors-max-age             [supported]   HTTPRoute (CORS filter)            maxAge in CORS filter
  force-ssl-redirect       [supported]   HTTPRoute (RequestRedirect filter) 301 redirect to HTTPS
  limit-rps                [partial]     BackendTrafficPolicy (RateLimit)   Envoy Gateway BackendTrafficPolicy
  limit-burst-multiplier   [partial]     BackendTrafficPolicy (RateLimit)   Burst configurable but uses tokens
  proxy-body-size          [partial]     BackendTrafficPolicy               requestBuffer.limit
</code></pre>
<p>7 out of 10 annotations are fully supported. The 3 "partial" ones work -- they just use a slightly different API.</p>
<h3>Step 6: Generate Migration Files</h3>
<pre><code class="language-bash">ing-switch migrate --target gateway-api-traefik --output-dir ./migration
</code></pre>
<pre><code class="language-text">  ing-switch -- Generating Migration Files
  Target:     gateway-api-traefik
  Output dir: ./migration

  + 00-migration-report.md
  + 01-install-gateway-api-crds/install.sh
  + 02-install-traefik-gateway/helm-install.sh
  + 02-install-traefik-gateway/values.yaml
  + 03-gateway/gatewayclass.yaml
  + 03-gateway/gateway.yaml
  + 04-httproutes/demo-api-cors.yaml
  + 04-httproutes/demo-dashboard.yaml
  + 04-httproutes/demo-web-app.yaml
  + 05-policies/demo-api-cors-ratelimit.yaml
  + 05-policies/demo-dashboard-forwardauth.yaml
  + 05-policies/demo-dashboard-ipallowlist.yaml
  + 06-verify.sh
  + 07-cleanup/remove-nginx.sh
  Generated 13 files in ./migration/
</code></pre>
<h3>Step 7: Inspect the Generated YAML</h3>
<p><strong>GatewayClass -- points to Traefik, not Envoy:</strong></p>
<pre><code class="language-yaml">apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: traefik
spec:
  controllerName: traefik.io/gateway-controller
</code></pre>
<p><strong>HTTPRoute with native CORS filter</strong> (no more ResponseHeaderModifier hacks):</p>
<pre><code class="language-yaml">apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api-cors
  namespace: demo
spec:
  parentRefs:
  - name: ing-switch-gateway
    namespace: default
  hostnames:
  - "api.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: "/v1"
    filters:
    - type: CORS
      cors:
        allowOrigins:
        - type: Exact
          value: "https://app.example.com"
        - type: Exact
          value: "https://admin.example.com"
        allowMethods:
        - "GET"
        - "POST"
        - "PUT"
        - "DELETE"
        - "OPTIONS"
        allowHeaders:
        - "Content-Type"
        - "Authorization"
        - "X-API-Key"
        allowCredentials: true
        maxAge: "86400s"
    backendRefs:
    - name: api-service
      port: 80
</code></pre>
<p><strong>Traefik Middleware CRDs</strong> (not Envoy-specific policies):</p>
<pre><code class="language-yaml"># Rate Limiting
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: demo-api-cors-ratelimit
  namespace: demo
spec:
  rateLimit:
    average: 50
    burst: 3
</code></pre>
<pre><code class="language-yaml"># ForwardAuth (external authentication)
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: demo-dashboard-forwardauth
  namespace: demo
spec:
  forwardAuth:
    address: "https://auth.example.com/verify"
    authResponseHeaders:
      - "X-User-ID"
      - "X-User-Email"
</code></pre>
<pre><code class="language-yaml"># IP AllowList
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: demo-dashboard-ipallowlist
  namespace: demo
spec:
  ipAllowList:
    sourceRange:
    - "10.0.0.0/8"
    - "172.16.0.0/12"
</code></pre>
<h3>Step 8: Review the Migration Report</h3>
<p>The <code>migrate</code> command automatically generates <code>00-migration-report.md</code> in the output directory. Open it to see the full summary:</p>
<pre><code class="language-bash">cat ./migration/00-migration-report.md
</code></pre>
<pre><code class="language-markdown"># ing-switch Migration Report
**Target Controller:** gateway-api-traefik

## Summary
| Metric | Count |
|--------|-------|
| Total Ingresses | 3 |
| Fully Compatible | 0 |
| Needs Workarounds | 3 |
| Has Unsupported Annotations | 0 |

## demo/api-cors -- Needs workaround
| Annotation | Status | Target Resource | Notes |
|-----------|--------|-----------------|-------|
| enable-cors | OK | HTTPRoute (CORS filter) | Native CORS filter (GA in v1.5) |
| cors-allow-origin | OK | HTTPRoute (CORS filter) | allowOrigins in CORS filter |
| limit-rps | WARN | BackendTrafficPolicy | Envoy Gateway BackendTrafficPolicy |
...
</code></pre>
<h3>Step 9: Apply (Dry-Run First)</h3>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/b4c92aec-3c11-41e4-84da-bce1b3891573.png" alt="" style="display:block;margin:0 auto" />

<pre><code class="language-bash"># Install Gateway API CRDs
bash ./migration/01-install-gateway-api-crds/install.sh

# Install Traefik with Gateway API provider
bash ./migration/02-install-traefik-gateway/helm-install.sh

# Dry-run all resources first
kubectl apply -f ./migration/03-gateway/ --dry-run=server
kubectl apply -f ./migration/04-httproutes/ --dry-run=server

# If dry-run passes, apply for real
kubectl apply -f ./migration/03-gateway/
kubectl apply -f ./migration/04-httproutes/
kubectl apply -f ./migration/05-policies/
</code></pre>
<p>At this point, <strong>both NGINX and Traefik are running side by side</strong>. DNS still points to NGINX. Production traffic is untouched.</p>
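<p>Before touching DNS, you can smoke-test Traefik directly with the production <code>Host</code> header. The service name and port here (<code>svc/traefik</code>, port 80) are assumptions; adjust to wherever the Helm chart installed it:</p>
<pre><code class="language-bash"># Tunnel to Traefik and send a request as if it came from web.example.com
kubectl port-forward svc/traefik 8000:80 &amp;
sleep 2   # give the tunnel a moment to establish
curl -s -o /dev/null -w "%{http_code}\n" -H "Host: web.example.com" http://localhost:8000/
</code></pre>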
<h3>Step 10: Verify and Cutover</h3>
<pre><code class="language-bash"># Run the generated verification script
bash ./migration/06-verify.sh

# Once verified, update DNS to Traefik's IP
# Then clean up NGINX
bash ./migration/07-cleanup/remove-nginx.sh
</code></pre>
<h3>Step 11: Use the Web UI</h3>
<p>For teams that prefer a visual workflow:</p>
<pre><code class="language-bash">ing-switch ui
# Opens http://localhost:8080
</code></pre>
<p>The dashboard provides four pages:</p>
<p><strong>Detect</strong> -- Scan your cluster and see all ingresses with annotation counts and complexity:</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/609e59ce-ab2a-40ac-8aa9-a864ad9be8e6.png" alt="" style="display:block;margin:0 auto" />

<p><strong>Analyze</strong> -- Choose between 3 targets and see the full annotation compatibility matrix:</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/c6fa8d3a-1723-4e8f-bf54-f32d823ccf91.png" alt="" style="display:block;margin:0 auto" />

<p><strong>Migrate</strong> -- One-click generation with step-by-step checklist and dry-run buttons:</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/344c9635-82e2-430e-aee4-7d1595bf96a7.png" alt="" style="display:block;margin:0 auto" />

<p>View all generated files inline with syntax highlighting:</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/0b3a75b4-5b82-4d70-b340-7cf0ba784f63.png" alt="" style="display:block;margin:0 auto" />

<p>See migration gaps with impact ratings and fix instructions:</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/4b9e6053-3d6e-42ba-896a-44b3727e8349.png" alt="" style="display:block;margin:0 auto" />

<p><strong>Validate</strong> -- Run live cluster checks to confirm your migration phase:</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/f64e5599-45ac-490a-a6fd-757f9fda13ad.png" alt="" style="display:block;margin:0 auto" />

<h3>Cleanup</h3>
<pre><code class="language-bash">vcluster delete demo --driver docker
</code></pre>
<pre><code class="language-plaintext">done  Successfully deleted virtual cluster demo
</code></pre>
<h2>What Makes ing-switch Different</h2>
<table>
<thead>
<tr>
<th>Feature</th>
<th>ing-switch</th>
<th>ingress2gateway</th>
<th>Manual</th>
</tr>
</thead>
<tbody><tr>
<td>Annotation coverage</td>
<td>119</td>
<td>30+</td>
<td>You count</td>
</tr>
<tr>
<td>Traefik Ingress target</td>
<td>Yes</td>
<td>No</td>
<td>--</td>
</tr>
<tr>
<td>Gateway API (Traefik)</td>
<td>Yes</td>
<td>No</td>
<td>--</td>
</tr>
<tr>
<td>Gateway API (Envoy)</td>
<td>Yes</td>
<td>Yes</td>
<td>--</td>
</tr>
<tr>
<td>Impact ratings</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Web UI</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Install scripts</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Verification scripts</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>DNS migration guide</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Dry-run mode</td>
<td>Yes</td>
<td>No</td>
<td>--</td>
</tr>
</tbody></table>
<h2>The Ecosystem Is Ready</h2>
<ul>
<li><p><strong>Gateway API v1.5</strong> -- CORS filter, TLSRoute, BackendTLSPolicy all GA</p>
</li>
<li><p><strong>ingress2gateway v1.0</strong> -- Official tool with emitter architecture</p>
</li>
<li><p><strong>Traefik v3.7</strong> -- Native NGINX annotation provider (80+ annotations)</p>
</li>
<li><p><strong>Envoy Gateway v1.7</strong> -- XListenerSet, enhanced policies</p>
</li>
<li><p><strong>cert-manager v1.20</strong> -- Gateway API ListenerSet support</p>
</li>
<li><p><strong>Kubernetes 1.36</strong> -- Ships April 22, first release post-NGINX archival</p>
</li>
</ul>
<p>The tools exist. The standards are stable. The only thing left is to actually run the migration.</p>
<hr />
<p><strong>Star it, fork it, migrate today:</strong> <a href="https://github.com/saiyam1814/ing-switch">github.com/saiyam1814/ing-switch</a></p>
<p><em>ing-switch is open source under the MIT license. PRs welcome.</em></p>
]]></content:encoded></item><item><title><![CDATA[clawspark: Your Private OpenClaw AI Assistant That Never Phones Home]]></title><description><![CDATA[By Saiyam Pathak

OpenClaw has 314,000+ GitHub stars. It is the most popular open-source AI agent out there. It connects to WhatsApp and Telegram, does deep research, manages files, writes code, handl]]></description><link>https://blog.kubesimplify.com/clawspark-your-private-openclaw-ai-assistant-that-never-phones-home</link><guid isPermaLink="true">https://blog.kubesimplify.com/clawspark-your-private-openclaw-ai-assistant-that-never-phones-home</guid><category><![CDATA[ai-agent]]></category><category><![CDATA[ollama]]></category><category><![CDATA[openclaw]]></category><category><![CDATA[clawspark]]></category><category><![CDATA[AI Assistants ]]></category><dc:creator><![CDATA[Saiyam Pathak]]></dc:creator><pubDate>Sun, 15 Mar 2026 15:37:37 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/fc9a8e16-4bcb-4bc8-a01e-46daf8c3fb7c.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>By Saiyam Pathak</em></p>
<hr />
<p>OpenClaw has 314,000+ GitHub stars. It is the most popular open-source AI agent out there. It connects to WhatsApp and Telegram, does deep research, manages files, writes code, handles voice notes, and genuinely works as a personal assistant. The catch: setting it up with a local LLM on NVIDIA hardware, with security done properly, is a long and error-prone process.</p>
<p>I spent some time getting it right on a DGX Spark. Then I automated the entire thing into one command. Along the way I found nine bugs, wrote three source patches, fought Ubuntu's managed Python, debugged WhatsApp's device linking protocol, and integrated a hardware-aware model selection engine. This is the full story.</p>
<h2>The Problem</h2>
<p>NVIDIA's DGX Spark is a desktop AI supercomputer. GB10 Grace Blackwell chip, 128GB unified memory, 1 PFLOP of AI compute, 20-core ARM Cortex-A725 CPU. It sits on your desk, runs quietly, and has enough memory to load models that actually compete with cloud APIs. The hardware is not the bottleneck anymore.</p>
<p>The bottleneck is setup. If you want to run OpenClaw with a local model on DGX Spark, here is what you need to do manually:</p>
<ol>
<li><p>Install Node.js 22+</p>
</li>
<li><p>Install OpenClaw via npm</p>
</li>
<li><p>Install Ollama</p>
</li>
<li><p>Figure out which model fits your hardware (there are hundreds)</p>
</li>
<li><p>Pull the model (can be 20-80GB)</p>
</li>
<li><p>Configure OpenClaw to point at your local Ollama instance</p>
</li>
<li><p>Set the correct environment variables (OLLAMA_API_KEY, OLLAMA_BASE_URL)</p>
</li>
<li><p>Run onboard, which sets half your config to wrong defaults</p>
</li>
<li><p>Fix tools.profile from "messaging" to "full"</p>
</li>
<li><p>Start the gateway, then start a separate Node Host process</p>
</li>
<li><p>Pair the Node Host to the gateway</p>
</li>
<li><p>Link WhatsApp via QR code</p>
</li>
<li><p>Patch three different JavaScript files in OpenClaw's dist folder</p>
</li>
<li><p>Install skills</p>
</li>
<li><p>Harden security</p>
</li>
<li><p>Set up voice transcription</p>
</li>
</ol>
<p>Miss one step and things fail silently. The gateway starts, the model loads, but your agent can only send text messages because it has 5 tools instead of 15. Or WhatsApp linking fails because the browser identification string gets rejected. Or group messages never arrive because history sync is disabled.</p>
<p>NVIDIA's own docs at build.nvidia.com recommend gpt-oss-120b for DGX Spark and describe a manual multi-step process using Ollama or LM Studio. Their guide covers the inference setup but not WhatsApp integration, not voice transcription, not security hardening, and not the Node Host that the agent actually needs to do useful work. clawspark automates all of this, including the parts NVIDIA's guide does not cover.</p>
<h2>What clawspark Does</h2>
<p>One command:</p>
<pre><code class="language-bash">curl -fsSL https://clawspark.dev/install.sh | bash
</code></pre>
<p>That is it. The installer runs 14 steps. It detects your hardware, recommends a model, asks a few questions, then installs and configures everything. Here is the full flow:</p>
<p><strong>Step 1-2: Hardware detection.</strong> The script probes your GPU via nvidia-smi, reads DMI product name for DGX Spark identification, checks for Tegra signatures on Jetson, and measures total system memory. It classifies your hardware into one of four tiers: DGX Spark (128GB unified), Jetson AGX (64GB), RTX high-end (24GB+ VRAM), or RTX standard (8-24GB).</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/e77d09be-44ed-4a07-9856-f6a8798020e3.jpg" alt="" style="display:block;margin:0 auto" />
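<p>To make the tier logic concrete, here is a minimal sketch of what such a probe can look like. This is illustrative, not clawspark's actual code: the function name and tier labels are mine, but the probes (DMI product name, Tegra release file, nvidia-smi VRAM query) are the ones described above.</p>
<pre><code class="language-bash"># Sketch: classify hardware into the four tiers clawspark uses.
detect_tier() {
  local product vram_mb
  product=$(cat /sys/class/dmi/id/product_name 2&gt;/dev/null || echo unknown)
  if echo "$product" | grep -qi "DGX Spark"; then
    echo dgx-spark        # 128GB unified memory
  elif [ -f /etc/nv_tegra_release ]; then
    echo jetson           # Tegra signature present
  elif command -v nvidia-smi &gt;/dev/null 2&gt;&amp;1; then
    vram_mb=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -1)
    if [ "$vram_mb" -ge 24576 ]; then echo rtx-highend; else echo rtx-standard; fi
  else
    echo unknown
  fi
}
</code></pre>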

<p><strong>Step 3: Model selection.</strong> This is where it gets interesting. For DGX Spark, I curated a list of 5 models ranked by <a href="https://github.com/AlexsJones/llmfit">llmfit</a> score and verified on real hardware:</p>
<table>
<thead>
<tr>
<th>Model</th>
<th>Size</th>
<th>Estimated tok/s</th>
<th>llmfit Score</th>
<th>Use Case</th>
</tr>
</thead>
<tbody><tr>
<td>qwen3.5:35b-a3b (default)</td>
<td>18 GB</td>
<td>~59 (measured)</td>
<td>91.8</td>
<td>General purpose</td>
</tr>
<tr>
<td>qwen3.5:122b-a10b</td>
<td>33 GB</td>
<td>~45</td>
<td>95.5</td>
<td>Best quality MoE</td>
</tr>
<tr>
<td>qwen3-coder-next</td>
<td>52 GB</td>
<td>~109</td>
<td>93.6</td>
<td>Coding/agentic</td>
</tr>
<tr>
<td>qwen3-next</td>
<td>50 GB</td>
<td>~59</td>
<td>92.2</td>
<td>Chat/instruct</td>
</tr>
<tr>
<td>qwen3-coder:30b</td>
<td>19 GB</td>
<td>~58</td>
<td>94.1</td>
<td>Coding lightweight</td>
</tr>
</tbody></table>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/81931dba-337d-49ee-9b44-5548d4b2878b.jpg" alt="" style="display:block;margin:0 auto" />

<p>For non-DGX-Spark hardware (RTX, Jetson, anything else), the installer uses llmfit to analyze your specific hardware, score hundreds of models, map the results to Ollama model IDs, verify each candidate actually exists on the Ollama library, and present the top 5 that fit. No hardcoded lists. Your GPU, your recommendations.</p>
<p><strong>Step 4-5: Deployment and messaging.</strong> Choose local-only or hybrid (cloud fallback). Choose WhatsApp, Telegram, both, or skip messaging entirely. The web UI at <code>/__openclaw__/canvas/</code> always works regardless.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/a20d26e8-5507-47a6-ad86-57907be5f1c5.jpg" alt="" style="display:block;margin:0 auto" />

<p><strong>Step 6-14: The actual installation.</strong> Ollama install and model pull. Node.js 22 if needed. OpenClaw npm install. Config generation with correct Ollama endpoints. Onboard with overrides for all the wrong defaults. Three source patches (more on these below). Skills installation. Whisper voice setup. WhatsApp QR linking. Optional Tailscale for remote access. ClawMetry dashboard. Security hardening. Final verification.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/0b947b8a-fcac-424e-ba52-80b87f58a13d.jpg" alt="" style="display:block;margin:0 auto" />

<p>After installation, you get the <code>clawspark</code> CLI tool for day-to-day management: <code>clawspark status</code>, <code>clawspark benchmark</code>, <code>clawspark restart</code>, <code>clawspark skills sync</code>, <code>clawspark airgap on/off</code>, and more.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/283cd367-ff8f-4357-a604-88bc1c315ffd.jpg" alt="" style="display:block;margin:0 auto" />

<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/ea479d57-267c-4d78-ab36-cea2e3e9ebec.jpg" alt="" style="display:block;margin:0 auto" />

<h2>The Architecture</h2>
<p>Here is how the pieces fit together once everything is running:</p>
<pre><code class="language-plaintext">WhatsApp / Telegram / Web UI (Canvas)
              |
    OpenClaw Gateway (port 18789)
     |            |            |
   Agent      Node Host     Baileys
   (LLM)      (Tools)      (WhatsApp Web)
     |            |
   Ollama      15 tools:
  (port 11434)  exec, read, write, edit,
     |          web_fetch, message, canvas,
   Model        process, cron, nodes,
   (GPU)        sessions_spawn, vision,
                transcribe, memory_search,
                memory_store
</code></pre>
<p>The Gateway is the central process. It manages the agent, routes messages from WhatsApp/Telegram/Web, and coordinates tool calls. The Agent is the LLM reasoning loop that decides what to do. The Node Host is a separate process that provides the actual tool implementations -- reading files, fetching web pages, executing code. Without the Node Host, the agent has only 5 basic messaging tools instead of the full 15.</p>
<p>Baileys is the WhatsApp Web client library that OpenClaw uses under the hood. It connects to WhatsApp's servers using a linked device session, the same way WhatsApp Web works in your browser. Messages flow from WhatsApp through Baileys to the Gateway, which sends them to the Agent, which calls tools on the Node Host, and the response flows back the same way.</p>
<h2><strong>The Bugs I Found and Fixed</strong></h2>
<p>This section is the reason I wrote this post. These are all real issues I hit on real hardware, and some of them took hours to diagnose. If you are setting up OpenClaw manually, this list might save you a lot of time.</p>
<h3><strong>1. tools.profile does not default to "full"</strong></h3>
<p>When you run <code>openclaw onboard</code>, it does not set <code>tools.profile</code> to "full". In v2026.3.2 it defaulted to "messaging" (5 tools only). This was partially fixed in v2026.3.7, which changed the default to "coding" -- better, but still missing tools like exec, process, cron, and nodes. The agent looks like it is working, but it cannot do everything it should.</p>
<p>The fix: <code>openclaw config set tools.profile full</code> after onboard completes. clawspark does this automatically.</p>
<h3><strong>2. Node Host is required but not documented</strong></h3>
<p>The Gateway alone does not provide execution tools. You need a separate "Node Host" process (<code>openclaw node run</code>) that connects to the Gateway and provides filesystem, browser, and execution capabilities. Without it, even with <code>tools.profile full</code>, the agent has no tools to call. The Node Host also needs to be paired with the Gateway (device approval), which is another step that is easy to miss.</p>
<p>clawspark starts the Node Host, detects pending pairing requests, auto-approves them, and restarts the Node Host with the pairing token.</p>
<h3><strong>3. Baileys browser string rejected by WhatsApp</strong></h3>
<p>OpenClaw's WhatsApp integration uses Baileys, which identifies itself as <code>["openclaw", "cli", VERSION]</code> to WhatsApp's servers. WhatsApp rejects this during device linking. The QR code scan works, but the connection fails silently.</p>
<p>The fix is a source patch: replace the browser identification with <code>["Ubuntu", "Chrome", "22.0"]</code>, which WhatsApp accepts. This requires patching the compiled JavaScript in OpenClaw's dist folder. clawspark finds the relevant session files and applies the patch automatically.</p>
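<p>Mechanically, the patch can be a sed pass over the compiled files. A hedged sketch, assuming a global npm install (the exact minified pattern may differ in your build; clawspark locates the real session files for you):</p>
<pre><code class="language-bash"># Swap the Baileys browser identification in OpenClaw's compiled output.
DIST="$(npm root -g)/openclaw/dist"
grep -rl '"openclaw", "cli"' "$DIST" | while read -r f; do
  sed -i 's/\["openclaw", *"cli", *[^]]*\]/["Ubuntu", "Chrome", "22.0"]/' "$f"
done
</code></pre>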
<h3><strong>4. web_search requires a Brave API key</strong></h3>
<p>OpenClaw's built-in web_search tool requires a Brave Search API key. For a local setup, requiring an external API key defeats the purpose. clawspark works around this by configuring the agent's <a href="http://TOOLS.md">TOOLS.md</a> to use DuckDuckGo Lite via web_fetch instead:</p>
<pre><code class="language-plaintext">web_fetch with url="https://lite.duckduckgo.com/lite/?q=YOUR+QUERY"
</code></pre>
<p>This gives the agent web search capabilities without any API keys or external dependencies.</p>
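<p>You can sanity-check the endpoint from a shell before relying on it, since this is essentially the same plain HTTP fetch that web_fetch performs:</p>
<pre><code class="language-bash"># Confirm DuckDuckGo Lite answers without any API key.
curl -fsS "https://lite.duckduckgo.com/lite/?q=kubernetes+gateway+api" | head -n 20
</code></pre>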
<h3><strong>5. Agent narrates tool usage on WhatsApp</strong></h3>
<p>When you ask the agent a question on WhatsApp, it sends messages like "Let me search for that..." and "The search returned these results..." before giving you the actual answer. On WhatsApp, this means three or four notification buzzes for one question.</p>
<p>The fix is <a href="http://SOUL.md">SOUL.md</a> rules: explicit instructions to never narrate tool usage, use tools silently, and respond with one clean message.</p>
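<p>The rules are plain text in the agent's workspace. A minimal sketch of the kind of block clawspark appends (the path and the exact wording here are illustrative, not the generated file):</p>
<pre><code class="language-bash"># Add anti-narration rules to the agent's SOUL.md.
cat &gt;&gt; ~/.openclaw/workspace/SOUL.md &lt;&lt;'EOF'
## Messaging rules (absolute)
- Never narrate tool usage ("Let me search for that...", "Running the tool...").
- Use tools silently, then respond with exactly one clean message.
EOF
</code></pre>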
<h3><strong>6. syncFullHistory breaks group messages</strong></h3>
<p>OpenClaw defaults to <code>syncFullHistory: false</code> in its Baileys configuration. This means after a fresh WhatsApp link, Baileys never receives group sender keys. The result: groups are completely silent. No messages arrive, no errors are logged. It just looks like nobody is talking.</p>
<p>The fix: patch <code>syncFullHistory: false</code> to <code>syncFullHistory: true</code> in the compiled session files. clawspark finds and patches all relevant files automatically.</p>
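<p>This one is a one-line flip, using the same approach as the browser-string patch above (again a sketch; the literal may be formatted differently in a minified build):</p>
<pre><code class="language-bash"># Enable full history sync so group sender keys arrive after linking.
DIST="$(npm root -g)/openclaw/dist"
grep -rl 'syncFullHistory' "$DIST" \
  | xargs sed -i 's/syncFullHistory: *false/syncFullHistory: true/'
</code></pre>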
<h3><strong>7. Mention detection has an early return that blocks text @mentions</strong></h3>
<p>OpenClaw's mention detection in group chats has a <code>return false</code> early exit when JID (WhatsApp ID) mentions exist but do not match the bot's JID. The problem is that WhatsApp resolves @mentions to JIDs, and sometimes the resolved JID does not match the bot's linked phone JID. The early return prevents the text-pattern fallback from ever running, so typing @botname in a group never triggers the bot.</p>
<p>The fix: remove the <code>return false</code> line so the text-pattern fallback always has a chance to match. Another source patch to the compiled JavaScript.</p>
<h3><strong>8. Systemd service missing Ollama environment variables</strong></h3>
<p>When OpenClaw's gateway runs as a systemd service, it does not inherit the shell environment. The OLLAMA_API_KEY and OLLAMA_BASE_URL variables are missing, so the gateway cannot reach Ollama. The model appears to load, but every inference call fails.</p>
<p>clawspark writes the environment variables to a gateway.env file, adds them to the user's shell profile (.bashrc or .zshrc), and sources them before starting any OpenClaw process.</p>
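<p>Wiring this up by hand looks roughly like the following (file locations and the placeholder values are assumptions; use whatever your Ollama setup expects):</p>
<pre><code class="language-bash"># Persist the Ollama variables where both the shell and systemd can find them.
mkdir -p ~/.openclaw
cat &gt; ~/.openclaw/gateway.env &lt;&lt;'EOF'
OLLAMA_API_KEY=ollama-local
OLLAMA_BASE_URL=http://127.0.0.1:11434
EOF
echo 'set -a; . ~/.openclaw/gateway.env; set +a' &gt;&gt; ~/.bashrc  # or ~/.zshrc

# For a systemd user unit, point the service at the same file:
#   [Service]
#   EnvironmentFile=%h/.openclaw/gateway.env
</code></pre>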
<h3><strong>9. OpenClaw bindings schema changed between versions</strong></h3>
<p>This one cost me an entire evening. Earlier versions of OpenClaw supported a <code>bindings</code> config for routing different message sources to different agents (e.g., full tools in DMs, restricted tools in groups). Starting with v2026.3.2, the bindings schema changed and the old format causes a validation error at startup: <code>Invalid config: bindings.0: Invalid input</code>. This is not a bug per se -- the schema evolved -- but any guide or config from earlier versions will break silently.</p>
<p>The fix: remove bindings entirely and use a single agent with context-aware rules. <a href="http://SOUL.md">SOUL.md</a> and <a href="http://TOOLS.md">TOOLS.md</a> contain explicit sections for DM context (full tools) and group context (Q&amp;A only). The agent enforces the boundary at the prompt level. Groups also use <code>requireMention: true</code> and <code>groupPolicy: open</code> at the config level so the bot only responds when @mentioned.</p>
<h2>Security</h2>
<p>Running a local AI agent is not automatically secure. clawspark applies multiple layers:</p>
<p><strong>Gateway binding.</strong> The OpenClaw gateway binds to localhost only. It is not accessible from other machines on your network unless you explicitly set up Tailscale.</p>
<p><strong>Firewall rules.</strong> UFW is configured to deny all incoming connections except SSH. Outgoing traffic is allowed by default, or blocked entirely in air-gap mode.</p>
<p><strong>Token authentication.</strong> A random 256-bit token is generated during installation. Only clients with this token can talk to the gateway API.</p>
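<p>The firewall and token layers boil down to a handful of standard commands. A minimal sketch (the token file path is an assumption):</p>
<pre><code class="language-bash"># Deny all inbound traffic except SSH.
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow OpenSSH
sudo ufw --force enable

# Generate a random 256-bit token (32 bytes, hex-encoded).
mkdir -p ~/.openclaw
openssl rand -hex 32 &gt; ~/.openclaw/gateway.token
chmod 600 ~/.openclaw/gateway.token
</code></pre>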
<p><strong>Context-aware tool restrictions.</strong> In direct messages, the owner gets full access to all 15 tools. In group chats, the agent restricts itself to Q&amp;A only (message, web_fetch, memory). This is enforced at the prompt level via SOUL.md, which contains explicit rules for each context. Groups also require @mention to activate.</p>
<p><strong>SOUL.md and TOOLS.md.</strong> These workspace files contain the agent's identity, capabilities, and absolute rules. No credential disclosure (applies to all users, including the owner). No system information in groups. No self-modification. Both files are set to chmod 444 (read-only) so the agent cannot edit its own rules.</p>
<p><strong>Air-gap mode.</strong> For maximum isolation, <code>clawspark airgap on</code> blocks all outbound internet traffic via UFW. Only local network and loopback traffic is allowed. The model, the agent, and all tools run entirely offline.</p>
<p>One honest caveat: local models do not have the same safety filters that cloud providers build into their APIs. That is both a feature (no arbitrary refusals) and a responsibility. You should think carefully about who has access to message your bot.</p>
<h2>Real Performance Numbers</h2>
<p>All numbers below are from an actual DGX Spark running Linux 6.14.0-1015-nvidia (arm64), Ollama 0.17.7, Node.js v22.22.1, OpenClaw v2026.3.13, with Qwen 3.5 35B-A3B:</p>
<table>
<thead>
<tr>
<th>Metric</th>
<th>Value</th>
</tr>
</thead>
<tbody><tr>
<td>Cold model load (first query)</td>
<td>~41 seconds</td>
</tr>
<tr>
<td>Warm prompt evaluation</td>
<td>~265 tok/s</td>
</tr>
<tr>
<td>Warm text generation</td>
<td>~59 tok/s</td>
</tr>
<tr>
<td>End-to-end WhatsApp response</td>
<td>15-45 seconds</td>
</tr>
</tbody></table>
<p>The 59 tok/s generation speed is fast enough to feel responsive. You send a question on WhatsApp and the response arrives in 15-45 seconds depending on complexity. The cold load penalty only hits on the first query after a restart. After that, the model stays in memory.</p>
<p>To put this in perspective: 59 tok/s means the model generates roughly 45-50 words per second. A typical response of 200 words takes about 4 seconds of pure generation time. The rest of the 15-45 second latency comes from the WhatsApp message routing, tool calls (if the agent needs to search the web or read a file), and response formatting.</p>
<p>Is this as fast as GPT-4o or Claude via cloud API? No. Cloud inference on dedicated hardware with massive batching will always be faster for raw token throughput. But it is fast enough for practical use, and your data never leaves your desk. That is the tradeoff.</p>
<p>For the 122B-A10B MoE model (the highest-ranked by llmfit), expect roughly 45 tok/s. Slightly slower but you get access to the full 122B model's knowledge with only 10B active parameters. The DGX Spark's 128GB unified memory can comfortably hold this model (33GB) with plenty of room for the KV cache.</p>
<h2>Hardware-Aware Model Selection with llmfit</h2>
<p>One of the hardest problems with local AI is knowing which model to use. There are hundreds of models on Ollama, each with different sizes, quantizations, and performance characteristics. Picking the wrong one means either wasting memory (model too small for the hardware) or crashing on load (model too big).</p>
<p>I integrated <a href="https://github.com/AlexsJones/llmfit">llmfit</a> to solve this. llmfit is a Rust-based CLI tool that detects your hardware (GPU, VRAM, RAM, CPU), scores every model in its database for fit, speed, and quality, and tells you which ones will actually run well.</p>
<p>For DGX Spark, I ran llmfit on the real hardware and it correctly detected the NVIDIA GB10 with 119.7 GB unified memory and CUDA backend. I then cross-referenced its top recommendations against the Ollama library to verify each model is actually pullable. The result is the curated list of 5 models you see during installation.</p>
<p>For all other hardware, the installer runs llmfit live:</p>
<ol>
<li><p>Install llmfit (one curl command, Rust binary)</p>
</li>
<li><p>Run <code>llmfit recommend --json -n 20 --min-fit good</code></p>
</li>
<li><p>Map each HF-style model name to an Ollama model ID (40+ regex patterns)</p>
</li>
<li><p>Verify each candidate exists on Ollama's library (HTTP check against ollama.com)</p>
</li>
<li><p>Present the top 5 verified models with score, estimated tok/s, and fit level</p>
</li>
</ol>
<p>If llmfit is not available or fails, the installer falls back to curated lists per hardware tier. No hardcoded guessing. Your GPU, your recommendations.</p>
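<p>Step 4 of that flow is simpler than it sounds: the Ollama library serves one page per model, so existence is just an HTTP status check. A sketch (the model name is only an example):</p>
<pre><code class="language-bash"># A candidate is pullable if its library page exists.
check_ollama_model() {
  curl -fsI "https://ollama.com/library/$1" &gt;/dev/null \
    &amp;&amp; echo "$1: pullable" || echo "$1: not found"
}
check_ollama_model qwen3-coder
</code></pre>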
<h2>WhatsApp Integration: The Deep Cut</h2>
<p>Getting WhatsApp working reliably was the hardest part of this entire project. Here is why.</p>
<p>OpenClaw uses the Baileys library, which is an unofficial WhatsApp Web client. It works by emulating a linked device session, the same protocol that WhatsApp Web uses in your browser. The connection is end-to-end encrypted and goes through WhatsApp's servers. There is no official API involved.</p>
<p>This creates three categories of problems:</p>
<p><strong>Protocol issues.</strong> WhatsApp regularly changes its protocol, and Baileys has to keep up. The browser string rejection (bug #3) is an example. WhatsApp started rejecting non-standard browser identifiers at some point, and OpenClaw's default string got caught.</p>
<p><strong>Group message handling.</strong> WhatsApp groups use Signal's sender keys protocol. When you first link a device, it needs to receive sender keys from all group participants before it can decrypt group messages. Setting <code>syncFullHistory: false</code> (bug #6) prevents this initial key exchange, making groups completely silent.</p>
<p><strong>Mention routing.</strong> In WhatsApp groups, @mentions get resolved to JIDs (WhatsApp internal user IDs). The bot's JID might not match its linked phone number's JID in all cases. The early return in mention detection (bug #7) means the bot never sees @mentions in groups unless you patch the code.</p>
<p>clawspark applies all three patches automatically and re-applies them after updates (since <code>npm update</code> overwrites the dist files). Groups require @mention to activate, and the agent is restricted to Q&amp;A only in group context. Full tool access is reserved for direct messages with the owner.</p>
<p>Voice notes work through the local-whisper skill, which runs Whisper (OpenAI's open-source speech-to-text model) locally on the GPU. On DGX Spark, it uses the large-v3 model for maximum transcription accuracy. On Jetson, it drops to the small model. On RTX, it scales based on available VRAM. The audio never leaves your machine.</p>
<h2>Skills</h2>
<p>clawspark installs 10 skills by default, verified against the OpenClaw skill registry:</p>
<table>
<thead>
<tr>
<th>Category</th>
<th>Skills</th>
</tr>
</thead>
<tbody><tr>
<td>Core</td>
<td>local-whisper, self-improvement, memory-setup</td>
</tr>
<tr>
<td>Voice</td>
<td>whatsapp-voice-chat-integration-open-source</td>
</tr>
<tr>
<td>Productivity</td>
<td>deep-research-pro, agent-browser</td>
</tr>
<tr>
<td>Knowledge</td>
<td>second-brain, proactive-agent</td>
</tr>
<tr>
<td>Web Search</td>
<td>ddg-web-search, local-web-search-skill</td>
</tr>
</tbody></table>
<p>Web search works without any API keys. The agent uses DuckDuckGo Lite via web_fetch, fetches result URLs, and composes answers from the content. No Brave API key, no Google API key, no external dependencies.</p>
<p>You can add or remove skills with <code>clawspark skills add &lt;name&gt;</code> and <code>clawspark skills remove &lt;name&gt;</code>. The skills.yaml file in your config directory is the source of truth, and <code>clawspark skills sync</code> reads it and installs everything.</p>
<h2>Getting Started</h2>
<p><strong>Tested and verified on DGX Spark.</strong> Should also work on Mac (Apple Silicon M1/M2/M3/M4), RTX desktops, and Jetson. The installer has fallbacks for all platforms -- macOS uses Homebrew, Ollama runs natively on Apple Silicon, and llmfit handles model selection. These platforms have not been end-to-end tested yet. Community testing welcome -- open an issue if you try it on different hardware.</p>
<pre><code class="language-bash">curl -fsSL https://clawspark.dev/install.sh | bash
</code></pre>
<p>Or with specific options:</p>
<pre><code class="language-bash">bash install.sh --model=qwen3.5:122b-a10b --messaging=whatsapp
</code></pre>
<p>After installation:</p>
<pre><code class="language-bash">clawspark status      # Check all components
clawspark benchmark   # Run a performance benchmark
clawspark logs        # Tail the gateway logs
clawspark restart     # Restart all services
clawspark update      # Update OpenClaw and re-apply patches
clawspark airgap on   # Enable air-gap mode
</code></pre>
<p>The web UI is at <code>http://localhost:18789/__openclaw__/canvas/</code>. The metrics dashboard is at <code>http://localhost:8900</code>.</p>
<p>Source code and documentation: <a href="https://clawspark.dev">clawspark.dev</a></p>
<h2>What is Next</h2>
<p>A few things I am working on:</p>
<p><strong>Multi-model routing.</strong> Use the fast 35B model for simple queries and automatically route complex reasoning tasks to the 122B model. The hardware can handle both loaded simultaneously since 128GB is enough for both.</p>
<p><strong>Better metrics.</strong> ClawMetry currently shows basic gateway stats. I want per-query latency tracking, token usage by model, and cost-equivalent comparisons (how much this query would have cost on cloud APIs).</p>
<p><strong>More hardware testing.</strong> The Jetson and RTX paths in clawspark are written and should work (hardware detection, llmfit model selection, Ollama setup), but I have only done full end-to-end testing on DGX Spark. Jetson AGX Orin is next. For RTX desktops, open a terminal and run the install command directly -- no SSH needed. If you try it on your hardware, please open an issue with your results.</p>
<p><strong>Upstream patches.</strong> The three source patches I wrote are necessary because of bugs in OpenClaw's compiled code. Ideally these get fixed upstream so the patches become unnecessary. I plan to submit them.</p>
<h2>Closing Thoughts</h2>
<p>Two years ago, running a capable AI model locally meant a server rack, a cooling system, and deep knowledge of CUDA. Today it means a quiet box on your desk and a bash script.</p>
<p>The gap between local and cloud AI is closing fast. Not because local is getting as fast as cloud (dedicated data center hardware will always win on raw throughput), but because local is getting good enough. 59 tokens per second from a 35B-parameter MoE model on a desktop machine is good enough for a personal assistant. The tradeoff is straightforward: you get complete data privacy and zero ongoing cost in exchange for slightly higher latency.</p>
<p>clawspark is just the glue. It takes hardware that is already capable (DGX Spark), software that is already good (OpenClaw, Ollama), a model that is already smart (Qwen 3.5), and a tool selector that actually knows your hardware (llmfit), and removes the friction between them. The one-click part is not the innovation. The innovation is all the edge cases, patches, and defaults that make it actually work when you run it for the first time on real hardware.</p>
<p>If you have a DGX Spark, an RTX GPU, a Mac with Apple Silicon, or a Jetson, give it a try. One command, a few questions, and you have a private AI assistant that genuinely never phones home.</p>
<hr />
<p><em>GitHub:</em> <a href="https://github.com/saiyam1814/claw-spark"><em>github.com/saiyam1814/claw-spark</em></a> <em>Website:</em> <a href="https://clawspark.dev"><em>clawspark.dev</em></a></p>
]]></content:encoded></item><item><title><![CDATA[Here's What I Learned About Nemotron 3 Super -I Ran a 120B Parameter Model on Nvidia DGX Spark]]></title><description><![CDATA[There’s a moment when you’re watching a model load into memory. The progress bar is filling up to 87 gigabytes and it hits you. You’re about to talk to something that has 120 billion parameters. Not t]]></description><link>https://blog.kubesimplify.com/nemotron3-on-dgx-spark</link><guid isPermaLink="true">https://blog.kubesimplify.com/nemotron3-on-dgx-spark</guid><category><![CDATA[nemotron 3]]></category><category><![CDATA[nemotron]]></category><category><![CDATA[DGXSpark]]></category><category><![CDATA[NVIDIA]]></category><category><![CDATA[ai agents]]></category><dc:creator><![CDATA[Saiyam Pathak]]></dc:creator><pubDate>Sat, 14 Mar 2026 12:44:36 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/6f7096fe-54a1-4e2e-aacc-184cb109d071.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There’s a moment when you’re watching a model load into memory. The progress bar is filling up to 87 gigabytes and it hits you. You’re about to talk to something that has 120 billion parameters. Not through an API. Not in the cloud. On a box the size of a sandwich sitting next to your keyboard.</p>
<p>That’s what running NVIDIA’s Nemotron 3 Super on the DGX Spark feels like. After spending time with it, I think this model needs more attention than it’s getting. Not because of one benchmark number, but because of the engineering choices behind it. These choices show you exactly where AI inference is going.</p>
<p>Let me walk you through what I found.</p>
<h3><strong>The Headline Numbers (And Why They’re Misleading)</strong></h3>
<p>When NVIDIA drops a model, they lead with the big stats: 120 billion parameters, 1 million token context, 5x throughput. These numbers are real, but they hide the real story.</p>
<p>The number that actually matters is <strong>12.7 billion</strong>. That’s how many parameters fire per token. Out of 120.6 billion total, only about a tenth light up for any given input. The rest sit there, waiting until the right token needs their skill.</p>
<p>This roughly 10:1 ratio is the whole story. It’s why the model runs on desktop hardware. It’s why it’s fast. It’s why NVIDIA built it this way. Everything else follows from this one design choice.</p>
<p>The second thing that matters is the layer mix. The model has 88 layers total. Most of them are Mamba-2 layers. This is a completely different architecture from transformers and it doesn’t need to store growing key-value caches. Only a small number are traditional transformer attention layers. NVIDIA interleaves them in a repeating pattern: groups of Mamba-2 blocks paired with Latent MoE layers, with attention layers placed at key depths. We’ll come back to why this split is so important.</p>
<h3><strong>Three Architectures in a Trenchcoat</strong></h3>
<p>Nemotron 3 Super isn’t one architecture. It’s three, stacked together so each one does what it’s best at. Once you get this stack, you get why the model works the way it does.</p>
<img src="https://pbs.twimg.com/media/HDSdda-bcAA0LrZ?format=jpg&amp;name=large" alt="" style="display:block;margin:0 auto" />

<h3><strong>Mamba-2: The Workhorse</strong></h3>
<p>The majority of the 88 layers are Mamba-2 blocks. Mamba is a state-space model. Think of it like a recurrent architecture that keeps a compact, fixed-size state and updates it as each new token comes in.</p>
<p>The key thing: Mamba runs in <strong>linear time</strong>. Double the sequence length, you roughly double the compute. Compare that with transformer attention, where doubling the sequence quadruples the compute.</p>
<p>This is why Nemotron 3 Super can actually deliver a 1-million-token context window in practice. With most layers being Mamba, the bulk of the model doesn’t care how long your prompt is. Compute grows linearly, and memory doesn’t grow at all. Mamba’s state stays the same size no matter the sequence length.</p>
<h3><strong>Transformer Attention: The Precision Tool</strong></h3>
<p>A small number of layers in the stack are traditional transformer attention layers, using Grouped Query Attention with 32 query heads, 2 KV heads, and a head dimension of 128. These are confirmed specs from the technical report.</p>
<p>Why keep any attention at all? Because Mamba has a known gap. It struggles with precise associative recall, like connecting a specific detail from position 1,000 with something at position 500,000. The fixed-size state means some info gets compressed away over very long sequences.</p>
<p>NVIDIA’s fix: place attention layers at carefully chosen depths through the 88-layer stack. They act like precision tools, handling the long-range connections that Mamba would miss, while Mamba does everything else fast.</p>
<p>The result: the vast majority of the model’s compute happens in linear time. Quadratic attention is used only where it’s needed.</p>
<h3><strong>The KV Cache Payoff</strong></h3>
<p>This design has a huge side effect that matters a lot for hardware like the DGX Spark.</p>
<p>In a transformer, the KV cache grows with sequence length. Every attention layer stores key and value tensors for every token it has seen. For a model like Qwen 3.5-122B with 12 full attention layers, head dimension 256, and 2 KV heads, that adds up to about 22.9 GiB at 1 million tokens in BF16. (The math: 12 layers x 2 KV heads x 256 dim x 2 bytes x 2 for K+V x 1M tokens.)</p>
<img src="https://substackcdn.com/image/fetch/$s_!w9U0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac671316-de89-4a33-b0e6-c80cc8fdfebf_2048x1365.jpeg" alt="Image" style="display:block;margin:0 auto" />

<p>Nemotron 3 Super has far fewer attention layers, each with head dimension 128 and 2 KV heads. Because Mamba layers use a fixed-size state (no KV cache growth), only the attention layers add to the KV cache. The bottom line: the KV cache is roughly <strong>3x smaller</strong> than Qwen’s at the same context length. On the DGX Spark’s 128 GB of unified memory, you load the 87 GB model, add a relatively small KV cache even at very long contexts, and you still have plenty of room to spare.</p>
<p>For practical purposes, the KV cache almost doesn’t matter with this model.</p>
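<p>The arithmetic is easy to reproduce. Here is the Qwen figure quoted above checked in one line (BF16 means 2 bytes per element, and the extra factor of 2 covers keys plus values):</p>
<pre><code class="language-text"># 12 layers x 2 KV heads x 256 head dim x 2 bytes x 2 (K+V) x 1M tokens
python3 -c "print(12*2*256*2*2*1_000_000 / 2**30)"   # ≈ 22.9 GiB
</code></pre>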
<h2><strong>Latent MoE: Getting 4x More Experts for Free</strong></h2>
<p>The Mixture of Experts layer is where things get really clever.</p>
<img src="https://substackcdn.com/image/fetch/$s_!5Kdw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ffec61-5e59-42db-b9f3-d7943928265b_2048x1365.jpeg" alt="Image" style="display:block;margin:0 auto" />

<p>In a standard MoE, each token is routed to one or two “expert” sub-networks from a larger pool. The idea is simple: different experts specialize in different things, and the router learns which expert to call for each token.</p>
<p>The problem is cost. Routing happens at the model’s full hidden dimension, and each expert operates at that same dimension. If you want more experts (for better specialization), routing gets more expensive. If you want to activate more experts per token (for better accuracy), inference gets slower.</p>
<p>Latent MoE solves this with a compression trick. Before routing, token embeddings are projected from the full hidden dimension down to a smaller latent dimension. The router operates in this compressed space, which is much cheaper. Experts also operate on the compressed representations.</p>
<p>The compute you save on compression doesn’t disappear. It gets reinvested. NVIDIA uses it to increase both the total number of experts <strong>and</strong> the number of experts active per token by the same factor. The result: 4x more experts consulted per token, at approximately the same inference cost as a standard MoE with fewer experts.</p>
<p>Each token effectively gets a committee of 4 specialists deliberating instead of a single expert making a snap judgment. The accuracy improvement is significant, and you don’t pay for it in latency.</p>
<h2><strong>Multi-Token Prediction: Built-In Speculative Decoding</strong></h2>
<p>Standard language models predict one token at a time. Generate position N, feed it back in, generate position N+1, repeat. This sequential nature is the fundamental bottleneck in text generation speed.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/e115fb71-9ad5-4d6c-abbc-2a36fe6f6559.png" alt="" style="display:block;margin:0 auto" />

<p>Nemotron 3 Super predicts multiple future tokens from each position simultaneously. The model has shared-weight prediction heads that project from the same internal representation to predict not just the next token, but several tokens ahead.</p>
<p>This serves two purposes. During training, it forces the model to learn longer-range dependencies. You can’t predict three tokens ahead without understanding the broader context. This makes the model smarter.</p>
<p>During inference, it works like built-in speculative decoding. Instead of generating one token per forward pass, the model proposes multiple tokens, verifies them, and keeps the correct ones. For structured output like code and tool calls, where the next few tokens are often very predictable, NVIDIA reports up to 3x wall-clock speedup. General chat won’t see the full 3x, but code generation benefits a lot.</p>
<p>The nice thing: you don’t need a separate draft model. The speculation is built right into the model.</p>
<h2><strong>How It Was Trained: The Three-Phase Pipeline</strong></h2>
<p>This is where NVIDIA’s openness really shines. They didn’t just release weights. They published the complete methodology, and the numbers are big.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/8961cc5a-2e77-4316-a388-efb33cccadf9.png" alt="" style="display:block;margin:0 auto" />

<p><strong>Phase 1: Pretraining.</strong> 25 trillion tokens total (10 trillion unique), plus 10 billion additional tokens focused specifically on reasoning, plus 15 million coding problems. The majority of compute ran in NVFP4, which is NVIDIA’s native 4-bit floating point format. This is unusual and important: most models train in higher precision and quantize down later, losing accuracy. Nemotron 3 Super was born in FP4.</p>
<p><strong>Phase 2: Supervised Fine-Tuning.</strong> 7 million carefully selected samples from a corpus of 40 million. Coverage spans reasoning, instruction following, coding, safety, and critically, multi-step agent task completion. This phase is where the model learns to be useful, not just knowledgeable.</p>
<p><strong>Phase 3: Reinforcement Learning.</strong> 1.2 million environment rollouts across 21 different configurations, using NeMo Gym and NeMo RL frameworks with 37 datasets. This is where the model learns to reason through complex, multi-step problems. This is the kind of thinking that makes it useful as an autonomous agent.</p>
<p>NVIDIA released around 10 of the pretraining datasets publicly, 15 RL training environments, and about 10 of the 37 RL datasets, along with complete training recipes. The Artificial Analysis Openness Index scored this release at 83 out of 100. Only two research labs (Ai2 and MBZUAI) score higher, and their models aren’t anywhere near this performance level.</p>
<p>This kind of transparency is new for a model this good. With enough compute, you could follow their recipe and reproduce the training run yourself.</p>
<h2><strong>Why NVIDIA Built This for Agents</strong></h2>
<p>The model wasn’t designed for chatbots. Every architectural decision points toward one use case: long-running autonomous agents. The 1M context, the sparse activation, the MoE efficiency. All of it.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/b0f51fa2-0d6e-4ec0-b839-ba0927fb91e7.png" alt="" style="display:block;margin:0 auto" />

<p>When you run a multi-agent system, token consumption explodes. Each agent interaction requires sending the full conversation history, tool outputs, intermediate reasoning steps, and results from other agents. NVIDIA’s numbers suggest multi-agent workflows generate up to 15x more tokens than standard chat.</p>
<p>This creates two problems that kill most models:</p>
<p><strong>Context overflow.</strong> At 128K tokens, even large models run out of context in extended agent sessions. The agent either loses early context (and the original goal with it), or you implement complex summarization/RAG schemes that lose fidelity. Nemotron 3 Super’s million-token window means the agent can hold the entire workflow state. Every tool call, every intermediate result, every reasoning step stays in memory without ever truncating.</p>
<p><strong>The cost of thinking.</strong> Agents need to reason at every step. If each reasoning call costs as much as a full 120B forward pass, running thousands of agent subtasks gets very expensive very fast. With only 12B active parameters, each call through Nemotron 3 Super costs a fraction of what a dense 120B model would. You get the intelligence of a large model with the economics of a small one.</p>
<p>The benchmarks back this up. On PinchBench, which tests models as actual coding agents (not just answering coding questions), it scores 85.6%. That’s the best open model out there. On DeepResearch Bench, which tests multi-step research over large document sets, NVIDIA’s AI-Q multi-agent system took the number one position. AI-Q is built on top of a fine-tuned Nemotron 3 Super. Worth noting: AI-Q is a full multi-agent system with orchestrator, planner, and researcher sub-agents. It’s not just the base model running solo. But Nemotron 3 Super is the reasoning engine at its core.</p>
<h2><strong>Why Sparse MoE Is Perfect for the DGX Spark</strong></h2>
<p>Here’s something that sounds wrong: on the DGX Spark, a 120B MoE model runs way faster than a smaller 70B dense model. The bigger model is faster. How? Because generation on the Spark is bound by memory bandwidth, not compute. Every token requires streaming each active weight from memory once, so a dense 70B model reads all 70 billion parameters per token while Nemotron 3 Super reads only the 12.7 billion that are active. Over the same 273 GB/s memory bus, reading roughly a fifth of the bytes buys you several times the tokens per second.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/a316513a-2122-4da5-a635-cba89aa8e443.png" alt="" style="display:block;margin:0 auto" />

<h2><strong>Running It: Two Paths from Zero to Inference</strong></h2>
<p>I tested two ways to get Nemotron 3 Super running on the DGX Spark. Here they are, from simplest to most configurable.</p>
<p><strong>Path 1: Ollama (Two Commands)</strong></p>
<p>The fastest possible path. If Ollama is installed on your Spark (it comes pre-installed on DGX OS):</p>
<pre><code class="language-text">ollama pull nemotron-3-super

ollama run nemotron-3-super
</code></pre>
<pre><code class="language-text">saiyam@spark-5385:~$   ollama pull nemotron-3-super
pulling manifest 
pulling 0fc53cc990a2: 100% ▕███████████████████████████████████████████████████████████████████████████████████████▏  86 GB                         
pulling d02d998e5ae6: 100% ▕███████████████████████████████████████████████████████████████████████████████████████▏  23 KB                         
pulling 02897ca0d6a3: 100% ▕███████████████████████████████████████████████████████████████████████████████████████▏   31 B                         
pulling 9c35241878aa: 100% ▕███████████████████████████████████████████████████████████████████████████████████████▏  509 B                         
verifying sha256 digest 
writing manifest 
success 


saiyam@spark-5385:~$ ollama list

NAME                       ID              SIZE     MODIFIED               
nemotron-3-super:latest    95acc78b3ffd    86 GB    Less than a second ago    
qwen3.5:35b-a3b            3460ffeede54    23 GB    5 days ago                
saiyam@spark-5385:~$  ollama run nemotron-3-super --verbose "Explain the difference between Mamba and Transformer architectures like I'm a DevOps engineer who has never worked with ML."
</code></pre>
<pre><code class="language-text">ollama show nemotron-3-super 
  Model
    architecture        nemotron_h_moe    
    parameters          123.6B            
    context length      262144            
    embedding length    4096              
    quantization        Q4_K_M            
    requires            0.17.1            

  Capabilities
    completion    
    tools         
    thinking      

  Parameters
    temperature    1       
    top_p          0.95    

  License
    NVIDIA Software and Model Evaluation License                                            
    IMPORTANT NOTICE – PLEASE READ AND AGREE BEFORE USING THE NVIDIA LICENSED MATERIALS.    
    ...                                                                                     
</code></pre>
<p><strong>Real performance on DGX Spark:</strong></p>
<pre><code class="language-text">prompt eval rate:  3.51 tokens/s
eval rate: 19.50 tokens/s
eval count:  2504 tokens
total duration:  2m56s
</code></pre>
<p><strong>Path 2: llama.cpp from Source (Full Control)</strong></p>
<p>If you want more control over context sizes, quantization, and API serving, you can build llama.cpp from source and run the GGUF model directly. Unsloth has a detailed guide for this: <a href="https://docs.unsloth.ai/basics/nvidia-nemotron-3-super">Unsloth Nemotron 3 Super Guide</a></p>
<p>The key things to know for DGX Spark, with a build-and-serve sketch after the list:</p>
<ul>
<li><p>When building llama.cpp, use <code>-DCMAKE_CUDA_ARCHITECTURES="121"</code> for the GB10 chip. Without this you’ll fall back to CPU inference.</p>
</li>
<li><p>The GGUF files are at <a href="https://huggingface.co/unsloth/NVIDIA-Nemotron-3-Super-120B-A12B-GGUF">unsloth/NVIDIA-Nemotron-3-Super-120B-A12B-GGUF</a> on Hugging Face.</p>
</li>
<li><p>NVIDIA recommends temperature 1.0 for general chat, and 0.6 with top_p 0.95 for tool calling.</p>
</li>
<li><p>Set --ctx-size based on your available memory. On DGX Spark, 16384 to 262144 is practical. Setting it to 1M may trigger CUDA OOM.</p>
</li>
<li><p>llama-server gives you an OpenAI-compatible API, so VS Code Continue, LangChain, CrewAI, Open WebUI all just work.</p>
</li>
</ul>
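<p>Putting those tips together, a hedged build-and-serve sketch looks like this (the GGUF filename is an assumption; check the Hugging Face repo above for the exact split you downloaded):</p>
<pre><code class="language-text"># Build llama.cpp with CUDA for the GB10, then serve an OpenAI-compatible API.
git clone https://github.com/ggml-org/llama.cpp &amp;&amp; cd llama.cpp
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="121"
cmake --build build --config Release -j

./build/bin/llama-server \
  -m NVIDIA-Nemotron-3-Super-120B-A12B-Q4_K_M.gguf \
  --ctx-size 262144 \
  --temp 1.0 --top-p 0.95 \
  --port 8080
</code></pre>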
<h2><strong>The DGX Spark: Quick Hardware Context</strong></h2>
<p>For readers unfamiliar with the hardware, the DGX Spark is NVIDIA’s desktop AI computer. The relevant specs:</p>
<ul>
<li><p><strong>Chip:</strong> GB10 Grace Blackwell Superchip</p>
</li>
<li><p><strong>Memory:</strong> 128 GB unified LPDDR5x (shared CPU/GPU, 273 GB/s)</p>
</li>
<li><p><strong>GPU:</strong> 6,144 CUDA cores, 5th-gen Tensor Cores, 1 PFLOP FP4 sparse</p>
</li>
<li><p><strong>CPU:</strong> 20-core ARM (10x Cortex-X925 + 10x Cortex-A725)</p>
</li>
<li><p><strong>Size:</strong> 150mm x 150mm x 50mm, 1.2 kg</p>
</li>
<li><p><strong>Power:</strong> 240W</p>
</li>
</ul>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/d8e3fd38-2ef8-488a-b7f4-676d1edd7880.png" alt="" style="display:block;margin:0 auto" />

<p>The unified memory is the key. Unlike a discrete GPU where you’re limited by VRAM (24 GB on an RTX 4090, 32 GB on an RTX 5090), the DGX Spark’s 128 GB is coherently shared between CPU and GPU with no PCIe bottleneck. The full 87 GB model lives in one address space.</p>
<p>NVIDIA rates it for models up to 200 billion parameters on a single unit, or 405 billion on two connected Sparks.</p>
<h2><strong>Putting It in Context: How Nemotron 3 Super Compares</strong></h2>
<p>Here’s the honest comparison against peers, based on published third-party benchmarks:</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/354ed625-31cf-4636-950a-438ef99234d1.png" alt="" style="display:block;margin:0 auto" />

<h3><strong>What I’d Actually Use This For</strong></h3>
<p>After spending time with the model, here’s where I see genuine value in the Nemotron + DGX Spark combination:</p>
<p><strong>OpenClaw.</strong> I’m going to make this the model behind OpenClaw, which is already running on my machine.</p>
<p><strong>Private data analysis.</strong> If you work in healthcare, finance, legal, or defense, some data simply cannot leave your building. No cloud provider promise changes the rules. A local model that never touches a network is the only option for these workloads.</p>
<p><strong>Code review and analysis.</strong> 73.4% precision on Qodo’s code review benchmark means about three out of four issues it flags are real. That’s useful enough for a local code review helper, especially when you’re working on code you can’t send to an external API.</p>
<p><strong>Long-document reasoning.</strong> The million-token context with a tiny KV cache means you can load entire codebases, spec documents, or stacks of research papers and ask questions across everything. No chunking, no RAG pipeline needed. Just load it all and ask.</p>
<p>Where I wouldn’t use it: production serving at scale, real-time latency-critical applications, or model training. The DGX Spark is an inference machine, not a training rig.</p>
<h3><strong>The Bigger Picture</strong></h3>
<p>Nemotron 3 Super is interesting as a model, but it’s even more interesting as a strategy.</p>
<p>NVIDIA makes the chips (GB10, B200), the inference runtime (NIM, TensorRT-LLM), the training framework (NeMo), and now the models (Nemotron). They’ve released the model, the data, and the recipes. Everything except the hardware to train on.</p>
<p>That’s the play. The more developers build on Nemotron, the more they need NVIDIA hardware. The openness isn’t charity. It’s ecosystem building.</p>
<p>But for us as practitioners, the result is clearly good. We get a top-tier model with full training transparency, running on hardware we can put on a desk. The hybrid Mamba-Transformer architecture with Latent MoE and multi-token prediction isn’t just a research paper. It’s a practical solution for running large models on limited hardware.</p>
<p>NVIDIA has confirmed that Ultra, the bigger sibling at roughly 500 billion parameters, is coming. If Super at 120B is this capable, Ultra will be worth watching closely.</p>
<p>Sources:</p>
<ul>
<li><p><a href="https://research.nvidia.com/labs/nemotron/files/NVIDIA-Nemotron-3-Super-Technical-Report.pdf"><strong>NVIDIA Nemotron 3 Super Technical Report (PDF)</strong></a></p>
</li>
<li><p><a href="https://developer.nvidia.com/blog/introducing-nemotron-3-super-an-open-hybrid-mamba-transformer-moe-for-agentic-reasoning/"><strong>NVIDIA Technical Blog: Introducing Nemotron 3 Super</strong></a></p>
</li>
<li><p><a href="https://blogs.nvidia.com/blog/nemotron-3-super-agentic-ai/"><strong>NVIDIA Blog: 5x Higher Throughput for Agentic AI</strong></a></p>
</li>
<li><p><a href="https://artificialanalysis.ai/articles/nvidia-nemotron-3-super-the-new-leader-in-open-efficient-intelligence"><strong>Artificial Analysis: Nemotron 3 Super — The New Leader in Open Intelligence</strong></a></p>
</li>
<li><p><a href="https://www.qodo.ai/blog/nvidia-nemotron-3-super-is-closing-the-gap-for-open-source-models/"><strong>Qodo: Code Review Analysis</strong></a></p>
</li>
<li><p><a href="https://ollama.com/blog/dgx-spark"><strong>Ollama DGX Spark Benchmarks</strong></a></p>
</li>
<li><p><a href="https://lmsys.org/blog/2025-10-13-nvidia-dgx-spark/"><strong>DGX Spark Performance (LMSYS)</strong></a></p>
</li>
<li><p><a href="https://www.pinchbench.com/"><strong>PinchBench</strong></a></p>
</li>
<li><p>My own testing on DGX Spark — 19.5 tok/s (Q4_K_M, Ollama), prompt eval 3.51 tok/s</p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[ing-switch: Migrate from Ingress NGINX to Traefik or Gateway API in Minutes, Not Days]]></title><description><![CDATA[If you run Kubernetes, there's a deadline you can't ignore: Ingress NGINX is being deprecated in March 2026. Roughly half of all Kubernetes clusters depend on it. That's a lot of teams who need a migr]]></description><link>https://blog.kubesimplify.com/ing-switch-migrate-from-ingress-nginx-to-traefik-or-gateway-api-in-minutes-not-days</link><guid isPermaLink="true">https://blog.kubesimplify.com/ing-switch-migrate-from-ingress-nginx-to-traefik-or-gateway-api-in-minutes-not-days</guid><dc:creator><![CDATA[Saiyam Pathak]]></dc:creator><pubDate>Wed, 25 Feb 2026 17:24:43 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/5df7c855-f9ff-4859-9415-e2899869e70b.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you run Kubernetes, there's a deadline you can't ignore: <strong>Ingress NGINX is being deprecated in March 2026</strong>. Roughly half of all Kubernetes clusters depend on it. That's a lot of teams who need a migration plan — and most of them are going to discover that it's harder than it looks.</p>
<p>The core problem isn't moving from one controller to another. It's that Ingress NGINX is held together by annotations. Hundreds of <code>nginx.ingress.kubernetes.io/*</code> annotations that control everything from TLS redirects to rate limiting to sticky sessions to canary deployments. These annotations have no direct equivalent in a new controller. Some map cleanly. Some map partially, with caveats. Some have no equivalent at all. And the existing tooling (<code>ingress2gateway</code>) only handles basic routing — it doesn't tell you what you're losing or how to compensate.</p>
<p>That's why I built <a href="https://github.com/saiyam1814/ing-switch"><strong>ing-switch</strong></a> — an open-source CLI + visual UI that takes you through the full migration lifecycle: scan → analyze → generate → verify → cutover → cleanup.</p>
<hr />
<h2>What ing-switch does</h2>
<p>The tool has four commands that map to four stages of migration:</p>
<pre><code class="language-bash">ing-switch scan      # detect your controller + list all ingresses
ing-switch analyze   # map every annotation to the target controller
ing-switch migrate   # generate ready-to-apply manifests
ing-switch ui        # open the visual 4-page migration dashboard at :8080
</code></pre>
<p>And it supports two migration targets:</p>
<table style="min-width:50px"><colgroup><col style="min-width:25px"></col><col style="min-width:25px"></col></colgroup><tbody><tr><th><p>Target</p></th><th><p>What's generated</p></th></tr><tr><td><p><strong>Traefik v3</strong></p></td><td><p>Traefik Middleware CRDs + updated Ingress resources (stays on <code>kind: Ingress</code>, no new CRDs to learn)</p></td></tr><tr><td><p><strong>Gateway API (Envoy Gateway)</strong></p></td><td><p>GatewayClass + Gateway + HTTPRoutes + BackendTrafficPolicy + SecurityPolicy</p></td></tr></tbody></table>

<p><strong>Traefik</strong> is the lowest-friction path: you keep <code>kind: Ingress</code>, your team learns almost nothing new, and almost all annotations have a Traefik equivalent. <strong>Gateway API</strong> is the future-proof path: standardized, implementation-agnostic, and where the ecosystem is heading — but it requires more preparation for partial-support annotations.</p>
<hr />
<h2>Annotation coverage: the hard part</h2>
<p>This is the part most migration tools skip. <code>ing-switch</code> maps over 50 <code>nginx.ingress.kubernetes.io/*</code> annotations for both targets. Here's a sample:</p>
<table style="min-width:75px"><colgroup><col style="min-width:25px"></col><col style="min-width:25px"></col><col style="min-width:25px"></col></colgroup><tbody><tr><th><p>Annotation</p></th><th><p>Traefik</p></th><th><p>Gateway API</p></th></tr><tr><td><p><code>ssl-redirect</code></p></td><td><p>✅ RedirectScheme Middleware</p></td><td><p>✅ HTTPRoute RequestRedirect filter</p></td></tr><tr><td><p><code>force-ssl-redirect</code></p></td><td><p>✅ Permanent redirect</p></td><td><p>✅ 301 HTTPRoute redirect</p></td></tr><tr><td><p><code>enable-cors</code> (all 6 fields)</p></td><td><p>✅ Headers Middleware</p></td><td><p>⚠️ Manual ResponseHeaderModifier (no native CORS filter in v1)</p></td></tr><tr><td><p><code>auth-url</code> (ForwardAuth)</p></td><td><p>✅ ForwardAuth Middleware</p></td><td><p>⚠️ SecurityPolicy (Envoy ext-auth)</p></td></tr><tr><td><p><code>limit-rps</code> / <code>limit-rpm</code></p></td><td><p>✅ RateLimit Middleware</p></td><td><p>⚠️ BackendTrafficPolicy (Envoy extension)</p></td></tr><tr><td><p><code>whitelist-source-range</code></p></td><td><p>✅ IPAllowList Middleware</p></td><td><p>⚠️ HTTPRouteMatch source IP (limited)</p></td></tr><tr><td><p><code>affinity: cookie</code> (sticky sessions)</p></td><td><p>✅ Service sticky annotation</p></td><td><p>⚠️ BackendLBPolicy SessionPersistence (v1.1)</p></td></tr><tr><td><p><code>canary</code> + <code>canary-weight</code></p></td><td><p>✅ Weighted service split</p></td><td><p>✅ HTTPRoute weighted backendRefs</p></td></tr><tr><td><p><code>rewrite-target</code></p></td><td><p>✅ ReplacePath/AddPrefix</p></td><td><p>✅ URLRewrite filter</p></td></tr><tr><td><p><code>proxy-read-timeout</code></p></td><td><p>⚠️ ServersTransport CRD</p></td><td><p>⚠️ HTTPRoute spec.rules[].timeouts</p></td></tr><tr><td><p><code>configuration-snippet</code></p></td><td><p>❌ Not supported (security)</p></td><td><p>❌ Not supported</p></td></tr><tr><td><p><code>session-cookie-samesite</code></p></td><td><p>✅ Sticky cookie</p></td><td><p>❌ Not in BackendLBPolicy spec</p></td></tr></tbody></table>

<p>The tool shows you exactly which category each annotation falls into: fully supported (just apply the generated YAML), partial (the YAML is generated, but read the note about what's different), or unsupported (manual work required, with guidance on the best alternative).</p>
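<p>To make those categories concrete, here is a small demo Ingress of the kind the analyzer classifies, carrying one annotation from each bucket (names and host are illustrative):</p>
<pre><code class="language-bash"># Apply a demo Ingress with a supported, a partial, and an unsupported annotation.
kubectl apply -f - &lt;&lt;'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"     # supported on both targets
    nginx.ingress.kubernetes.io/limit-rps: "10"          # partial on Gateway API
    nginx.ingress.kubernetes.io/configuration-snippet: | # unsupported everywhere
      more_set_headers "X-Demo: 1";
spec:
  ingressClassName: nginx
  rules:
  - host: demo.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: demo
            port:
              number: 80
EOF
</code></pre>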
<hr />
<h2>Installation</h2>
<pre><code class="language-bash"># macOS arm64
curl -L https://github.com/saiyam1814/ing-switch/releases/latest/download/ing-switch-darwin-arm64 -o ing-switch
chmod +x ing-switch &amp;&amp; sudo mv ing-switch /usr/local/bin/

# Linux amd64
curl -L https://github.com/saiyam1814/ing-switch/releases/latest/download/ing-switch-linux-amd64 -o ing-switch
chmod +x ing-switch &amp;&amp; sudo mv ing-switch /usr/local/bin/
</code></pre>
<p>Or build from source:</p>
<pre><code class="language-bash">git clone https://github.com/saiyam1814/ing-switch.git
cd ing-switch
make build   # builds React UI then embeds it in the Go binary
./ing-switch --help
</code></pre>
<hr />
<h2>The Dashboard</h2>
<p>The easiest way to understand what <code>ing-switch</code> does is to open the UI:</p>
<pre><code class="language-bash">ing-switch ui   # opens at localhost:8080
</code></pre>
<h3>Page 1: Detect</h3>
<p>The first page scans your cluster and discovers every Ingress resource across all namespaces, identifies the controller type and version, and flags each ingress by complexity.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/9dfd53cb-7cd4-464d-a8e8-d76b06489710.png" alt="" style="display:block;margin:0 auto" />

<p><em>The tool shows a countdown banner — Ingress NGINX retires March 2026 — and lets you scope the scan to a specific namespace or scan everything.</em></p>
<p>After clicking <strong>Scan Cluster</strong>, the tool connects to your cluster via kubeconfig and enumerates every <code>networking.k8s.io/v1 Ingress</code> object. In my test cluster (a vcluster running 11 production-realistic ingresses), the sidebar updated immediately to show "11 ingresses · retiring · ingress-nginx":</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/adf1f056-5f49-4bbb-a693-4aba846ef71e.png" alt="" style="display:block;margin:0 auto" />

<p><em>The sidebar shows your cluster name, ingress count, and controller status. "Retiring" badge means the detected controller is Ingress NGINX.</em></p>
<h3>Page 2: Analyze</h3>
<p>The Analyze page is where you decide your migration target. You pick between <strong>Traefik v3</strong> (labeled "Lowest friction") or <strong>Gateway API via Envoy Gateway</strong> (labeled "Future-proof standard"), then click <strong>Analyze Compatibility</strong>.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/6cccda7e-dbb3-4d9f-8863-36076eb7b83b.png" alt="" style="display:block;margin:0 auto" />

<p>The engine runs through every annotation on every ingress and produces a per-ingress compatibility matrix. Each annotation gets a status badge:</p>
<ul>
<li><p><strong>Green (supported)</strong> — the generated YAML covers this fully; just apply it</p>
</li>
<li><p><strong>Yellow (partial)</strong> — the feature works but with limitations; the note explains exactly what's different</p>
</li>
<li><p><strong>Red (unsupported)</strong> — no direct equivalent exists; the tool explains the best available workaround</p>
</li>
</ul>
<p>This is the output that replaces weeks of reading changelogs and GitHub issues.</p>
<h3>Page 3: Migrate</h3>
<p>The Migrate page generates the complete output directory of Kubernetes manifests. Click <strong>Generate Migration Files</strong> and the tool writes every file needed for a zero-downtime cutover.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/b5656123-e68e-41d3-83e7-8a13885eb832.png" alt="" style="display:block;margin:0 auto" />

<p><em>The checklist walks you through every step in order, so you can't accidentally skip a dependency (like applying GatewayClass before HTTPRoutes).</em></p>
<p>After generation, each step gains <strong>View File</strong>, <strong>Dry-run</strong>, and <strong>Apply</strong> buttons:</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/e8f69606-1790-4d40-8a59-3bb3701a686f.png" alt="" style="display:block;margin:0 auto" />

<p><em>The tool generated 27 migration files for Traefik across 11 ingresses. Each checklist item links to the file it creates, and you can preview the YAML before applying.</em></p>
<p>The <strong>Gaps tab</strong> gives you the executive summary — which ingresses are fully compatible, which have auto-generated workarounds, and which need manual review:</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/2477c587-a5c8-463a-a2ca-2b0e91e29e99.png" alt="" style="display:block;margin:0 auto" />

<p><em>"Needs Review" ingresses are listed with the specific annotations that require manual attention, so you know exactly what work remains before cutover.</em></p>
<h3>Page 4: Validate</h3>
<p>After applying the generated manifests, the Validate page gives you a structured checklist of post-migration checks and lets you run live validation against the cluster.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ef48fe2877d056386648ab2/85281d50-0a67-4c7e-abcd-0c7b127af988.png" alt="" style="display:block;margin:0 auto" />

<p><em>The checklist covers TLS verification, auth testing, rate limit validation, canary routing verification, and a 24-hour monitoring window before removing NGINX.</em></p>
<hr />
<h2>The CLI: same power, no browser</h2>
<p>Everything the UI does is also available as CLI commands, which makes it scriptable and CI-friendly:</p>
<pre><code class="language-bash"># Scan your cluster
$ ing-switch scan
Cluster: production-cluster
Controller: ingress-nginx v1.9.4 (namespace: ingress-nginx)

NAMESPACE    NAME              HOSTS                     ANNOTATIONS  TLS
ecommerce    ecommerce-shop    shop.example.com          13           yes
security     payment-api       payments.example.com      6            yes
messaging    realtime-chat     chat.example.com          2            yes
...
11 ingresses found across 6 namespaces

# Analyze annotation compatibility for Gateway API
$ ing-switch analyze --target gateway-api
INGRESS                       SUPPORTED   PARTIAL   UNSUPPORTED
ecommerce/ecommerce-shop      6           5         2
security/payment-api          3           2         1
...

# Generate all migration manifests
$ ing-switch migrate --target traefik --output-dir ./migration
Generated 27 files across 11 ingresses

migration/
├── 00-migration-report.md      # Full annotation analysis
├── 01-install-traefik/         # Helm install script + values.yaml
├── 02-middlewares/             # Traefik Middleware CRDs
├── 03-ingresses/               # Updated Ingress resources
├── 04-verify.sh                # Test script per hostname
├── 05-dns-migration.md         # DNS cutover guide
└── 06-cleanup/                 # Remove NGINX after cutover
</code></pre>
<hr />
<h2>How it handles the tricky cases</h2>
<p>Three annotation scenarios trip up almost every migration, and <code>ing-switch</code> handles all of them:</p>
<h3>HTTP→HTTPS redirect</h3>
<p>The naive approach — putting both a RequestRedirect filter and a backend rule in the same HTTPRoute — causes a redirect loop where HTTPS traffic gets redirected back to HTTPS. <code>ing-switch</code> generates <strong>two separate HTTPRoutes</strong> using Gateway API <code>sectionName</code> to attach each route to the correct listener:</p>
<pre><code class="language-yaml"># &lt;name&gt;-redirect: attached to HTTP listener only (sectionName: http)
# Returns 301/302 for all incoming HTTP requests

# &lt;name&gt;: attached to HTTPS listener (sectionName: https-0)
# Routes HTTPS traffic to your backends
</code></pre>
<p>This is the correct pattern but requires knowing the Gateway API spec deeply enough to spot the constraint. The tool just does it right.</p>
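<p>For reference, here is a minimal hand-written sketch of that two-route pattern. The resource names, hostname, and the <code>https-0</code> listener name are illustrative, following the tool's output conventions shown above:</p>
<pre><code class="language-bash">kubectl apply --dry-run=client -f - &lt;&lt;'EOF'
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: shop-redirect
spec:
  parentRefs:
  - name: my-gateway
    sectionName: http          # attached to the HTTP listener only
  hostnames: ["shop.example.com"]
  rules:
  - filters:
    - type: RequestRedirect
      requestRedirect:
        scheme: https
        statusCode: 301
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: shop
spec:
  parentRefs:
  - name: my-gateway
    sectionName: https-0       # attached to the HTTPS listener only
  hostnames: ["shop.example.com"]
  rules:
  - backendRefs:
    - name: shop-svc
      port: 80
EOF
</code></pre>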
<h3>Regex paths</h3>
<p>NGINX annotations like <code>use-regex: "true"</code> and paths like <code>/app(/|$)(.*)</code> are common. Gateway API's <code>PathPrefix</code> type rejects regex characters. <code>ing-switch</code> auto-detects paths containing <code>(</code>, <code>)</code>, <code>|</code>, <code>[</code>, <code>]</code> and switches them to <code>PathMatch.RegularExpression</code> type — even when the <code>use-regex</code> annotation is absent.</p>
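<p>A converted rule ends up looking roughly like this (a sketch with placeholder names; note that <code>RegularExpression</code> matching has implementation-specific support in Gateway API, though Envoy Gateway supports it):</p>
<pre><code class="language-bash">kubectl apply --dry-run=client -f - &lt;&lt;'EOF'
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-regex
spec:
  parentRefs:
  - name: my-gateway
  rules:
  - matches:
    - path:
        type: RegularExpression   # PathPrefix would reject ( ) | [ ]
        value: "/app(/|$)(.*)"
    backendRefs:
    - name: app-svc
      port: 80
EOF
</code></pre>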
<h3>Timeout constraints</h3>
<p><code>proxy-read-timeout: 300</code> and <code>proxy-connect-timeout: 5</code> look straightforward to map. But Gateway API enforces <code>backendRequest ≤ request</code>, and mapping read→backendRequest (300s) and connect→request (5s) would violate that constraint. <code>ing-switch</code> maps only <code>proxy-read-timeout → backendRequest</code> and intentionally omits <code>proxy-connect-timeout</code>, with a note explaining why.</p>
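<p>The generated rule therefore carries a single timeout. A sketch with placeholder names:</p>
<pre><code class="language-bash">kubectl apply --dry-run=client -f - &lt;&lt;'EOF'
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: slow-api
spec:
  parentRefs:
  - name: my-gateway
  rules:
  - timeouts:
      backendRequest: 300s   # from proxy-read-timeout; proxy-connect-timeout intentionally dropped
    backendRefs:
    - name: api-svc
      port: 80
EOF
</code></pre>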
<hr />
<h2>Generated output: Gateway API example</h2>
<p>For a single ingress with SSL redirect, session affinity, and rate limiting, the Gateway API migration generates:</p>
<pre><code class="language-plaintext">migration/
├── 00-migration-report.md
├── 01-install-gateway-api-crds/
│   └── install.sh
├── 02-install-envoy-gateway/
│   ├── helm-install.sh
│   └── values.yaml
├── 03-gateway/
│   ├── gatewayclass.yaml      # GatewayClass (Envoy Gateway)
│   └── gateway.yaml           # Gateway: HTTP + HTTPS listeners
├── 04-httproutes/
│   ├── ecommerce-ecommerce-shop-redirect.yaml   # HTTP→HTTPS (sectionName: http)
│   └── ecommerce-ecommerce-shop.yaml            # HTTPS backend (sectionName: https-0)
├── 05-policies/
│   └── ecommerce-ecommerce-shop-btp.yaml        # BackendTrafficPolicy (rate limit)
├── 06-verify.sh
└── 07-cleanup/
    └── remove-nginx.sh
</code></pre>
<hr />
<h2>Example ingresses</h2>
<p>The repo includes 11 production-realistic NGINX Ingress configurations covering every major annotation category you're likely to have in a real cluster:</p>
<table style="min-width:50px"><colgroup><col style="min-width:25px"></col><col style="min-width:25px"></col></colgroup><tbody><tr><th><p>File</p></th><th><p>Covers</p></th></tr><tr><td><p><code>01-basic-routing.yaml</code></p></td><td><p>Path routing, TLS termination</p></td></tr><tr><td><p><code>02-ssl-tls.yaml</code></p></td><td><p>SSL redirect, HSTS, force-ssl</p></td></tr><tr><td><p><code>03-auth-external.yaml</code></p></td><td><p>External auth (auth-url, auth-response-headers)</p></td></tr><tr><td><p><code>04-session-affinity.yaml</code></p></td><td><p>Sticky cookies (all 8 session-cookie-* fields)</p></td></tr><tr><td><p><code>05-canary.yaml</code></p></td><td><p>Canary by weight, header, cookie</p></td></tr><tr><td><p><code>06-cors.yaml</code></p></td><td><p>Full CORS (all 6 cors-* annotations)</p></td></tr><tr><td><p><code>07-path-rewrite-regex.yaml</code></p></td><td><p>Regex routing, rewrite-target capture groups</p></td></tr><tr><td><p><code>08-rate-limit-ip.yaml</code></p></td><td><p>Rate limiting, IP allowlist/denylist</p></td></tr><tr><td><p><code>09-websocket.yaml</code></p></td><td><p>WebSocket upgrade</p></td></tr><tr><td><p><code>10-grpc.yaml</code></p></td><td><p>gRPC passthrough</p></td></tr><tr><td><p><code>11-full-featured.yaml</code></p></td><td><p>All of the above combined</p></td></tr></tbody></table>

<p>You can apply them to a test cluster and run the full migration against them:</p>
<pre><code class="language-bash">kubectl apply -f examples/
ing-switch migrate --target gateway-api --output-dir ./migration-examples
kubectl apply --dry-run=client -f ./migration-examples/03-gateway/
kubectl apply --dry-run=client -f ./migration-examples/04-httproutes/
</code></pre>
<hr />
<h2>Zero-downtime migration strategy</h2>
<p>The tool is designed around a zero-downtime approach:</p>
<ol>
<li><p><strong>Install the new controller alongside NGINX</strong> — both run simultaneously; DNS still points to NGINX</p>
</li>
<li><p><strong>Apply generated manifests</strong> — Middlewares, HTTPRoutes, Gateway</p>
</li>
<li><p><strong>Verify the new controller</strong> — run <code>06-verify.sh</code> to test each hostname against the new IP (or spot-check manually with <code>curl</code>; see the sketch after this list)</p>
</li>
<li><p><strong>Shift DNS</strong> — update your DNS records to point to the new controller's LoadBalancer IP</p>
</li>
<li><p><strong>Monitor for 24 hours</strong> — watch for 5xx errors, auth failures, session issues</p>
</li>
<li><p><strong>Remove NGINX</strong> — run <code>06-cleanup/remove-nginx.sh</code></p>
</li>
</ol>
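<p>If you want to spot-check a hostname by hand before touching DNS, <code>curl --resolve</code> can pin the production hostname to the new controller's IP for a single request. A minimal sketch; the hostname and IP are placeholders:</p>
<pre><code class="language-bash"># Send a request for the real hostname to the new LoadBalancer IP
NEW_IP=203.0.113.10
curl -sk --resolve shop.example.com:443:$NEW_IP \
  https://shop.example.com/ -o /dev/null -w 'HTTP %{http_code}\n'
</code></pre>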
<p>The Migrate page's step-by-step checklist mirrors this order and locks steps until their dependencies are checked off.</p>
<hr />
<h2>Try it</h2>
<pre><code class="language-bash"># Install
curl -L https://github.com/saiyam1814/ing-switch/releases/latest/download/ing-switch-darwin-arm64 -o ing-switch
chmod +x ing-switch &amp;&amp; sudo mv ing-switch /usr/local/bin/

# Point at your cluster
export KUBECONFIG=~/.kube/config

# Open the visual UI
ing-switch ui
</code></pre>
<p>The source, examples, and annotation mapping database are at <a href="https://github.com/saiyam1814/ing-switch"><strong>github.com/saiyam1814/ing-switch</strong></a>.</p>
<p>The annotation mapping database lives in <code>pkg/analyzer/compatibility.go</code> (status + target resource per annotation) and <code>pkg/server/guides.go</code> (human-readable what/fix/example per annotation). PRs for additional annotation mappings are welcome.</p>
<hr />
<p><em>March 2026 is closer than it looks. The tools are ready.</em></p>
]]></content:encoded></item><item><title><![CDATA[Exploiting Metasploitable2 Using msfconsole (Kali Linux Lab)]]></title><description><![CDATA[Exploiting Metasploitable2 Using msfconsole (Kali Linux Lab)
Introduction
msfconsole is the heart of the Metasploit Framework and one of the most powerful tools used by penetration testers to identify, exploit, and validate security vulnerabilities. ...]]></description><link>https://blog.kubesimplify.com/exploiting-metasploitable2-using-msfconsole-kali-linux-lab</link><guid isPermaLink="true">https://blog.kubesimplify.com/exploiting-metasploitable2-using-msfconsole-kali-linux-lab</guid><category><![CDATA[cybersecurity]]></category><category><![CDATA[Msfconsole]]></category><category><![CDATA[metasploitable2]]></category><category><![CDATA[Web Exploitation]]></category><dc:creator><![CDATA[Ankit Kumar]]></dc:creator><pubDate>Sat, 17 Jan 2026 16:15:22 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1768665637972/12f25f19-87fa-4976-a328-5182fbed203e.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-exploiting-metasploitable2-using-msfconsole-kali-linux-lab"><strong>Exploiting Metasploitable2 Using</strong> <code>msfconsole</code> (Kali Linux Lab)</h2>
<h2 id="heading-introduction"><strong>Introduction</strong></h2>
<p><code>msfconsole</code> is the heart of the <strong>Metasploit Framework</strong> and one of the most powerful tools used by penetration testers to <strong>identify, exploit, and validate security vulnerabilities</strong>. In real-world security assessments as well as Capture The Flag (CTF) challenges, <code>msfconsole</code> is often used to automate and streamline exploitation workflows.</p>
<p>In this blog, we will explore how to use <code>msfconsole</code> from <strong>Kali Linux</strong> to exploit an intentionally vulnerable machine, <strong>Metasploitable2</strong>, in a safe and controlled lab environment.</p>
<p>Both machines are hosted on <strong>Oracle VM VirtualBox</strong> and configured on the same internal network. This setup allows us to simulate real attack scenarios while maintaining proper ethical boundaries.</p>
<p>The goal of this blog is to:</p>
<ul>
<li><p>Understand what <code>msfconsole</code> is and why it is used</p>
</li>
<li><p>Learn how attackers interact with vulnerable services using Metasploit</p>
</li>
<li><p>Gain hands-on experience with a realistic exploitation lab</p>
</li>
</ul>
<blockquote>
<p><strong><em>Note:</em></strong> <em>All demonstrations in this blog are performed on machines owned by us or intentionally designed to be vulnerable. Never use these techniques on unauthorized systems.</em></p>
</blockquote>
<p>In the next section, we will briefly look at the lab architecture before launching <code>msfconsole</code> and beginning the exploitation process.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*F-1r_D3nLsU0NgJxadHHuA.png" alt /></p>
<h2 id="heading-setting-up-msfconsole-and-metasploitable2-step-by-step-lab-setup"><strong>Setting Up</strong> <code>msfconsole</code> and Metasploitable2 (Step-by-Step Lab Setup)</h2>
<p>Before launching any exploitation using <code>msfconsole</code>, we must ensure that <strong>both the attacker and the vulnerable target are properly set up and reachable</strong>. This section covers the <strong>complete setup process</strong> for <code>msfconsole</code> on Kali Linux and the Metasploitable2 vulnerable server.</p>
<h2 id="heading-1-setting-up-the-attacker-machine-kali-linux"><strong>1. Setting Up the Attacker Machine (Kali Linux)</strong></h2>
<h2 id="heading-why-kali-linux"><strong>Why Kali Linux?</strong></h2>
<p><strong>Kali Linux</strong> comes pre-installed with hundreds of penetration testing tools, including the <strong>Metasploit Framework</strong>.</p>
<h2 id="heading-verify-metasploit-installation"><strong>Verify Metasploit Installation</strong></h2>
<p>On Kali, Metasploit is installed by default. To verify:</p>
<pre><code class="lang-plaintext">msfconsole --version
</code></pre>
<p><img src="https://miro.medium.com/v2/resize:fit:1250/1*yYXvf1oFpRK4eLEfj9mdWA.png" alt /></p>
<p>If Metasploit is installed correctly, you will see version details.</p>
<hr />
<h2 id="heading-start-msfconsole"><strong>Start</strong> <code>msfconsole</code></h2>
<pre><code class="lang-plaintext">msfconsole
</code></pre>
<p>On first launch, Metasploit may:</p>
<ul>
<li><p>Initialize its database</p>
</li>
<li><p>Create required configuration files</p>
</li>
</ul>
<p>You should now see the familiar <code>msf6 &gt;</code> prompt.</p>
<p>This confirms that <code>msfconsole</code> <strong>is ready to use</strong>.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*2R285OLEWTQFtpGtItnnVQ.png" alt /></p>
<hr />
<h2 id="heading-2-setting-up-the-target-machine-metasploitable2"><strong>2. Setting Up the Target Machine (Metasploitable2)</strong></h2>
<h2 id="heading-what-is-metasploitable2"><strong>What is Metasploitable2?</strong></h2>
<p><strong>Metasploitable2</strong> is a deliberately vulnerable Linux machine created for practicing penetration testing techniques.</p>
<h2 id="heading-start-metasploitable2-vm"><strong>Start Metasploitable2 VM</strong></h2>
<ul>
<li><p>Launch Metasploitable2 in <strong>Oracle VM VirtualBox</strong></p>
</li>
<li><p>Wait until it boots to the login screen</p>
</li>
</ul>
<h2 id="heading-default-credentials"><strong>Default Credentials</strong></h2>
<pre><code class="lang-plaintext">Username: msfadmin
Password: msfadmin
</code></pre>
<p>Log in with these credentials to access the system.</p>
<h2 id="heading-check-ip-address-of-metasploitable2"><strong>Check IP Address of Metasploitable2</strong></h2>
<pre><code class="lang-plaintext">ifconfig
</code></pre>
<p>Example output:</p>
<pre><code class="lang-plaintext">inet addr:192.168.56.101
</code></pre>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*orC8wVFlmX8C9S_saVqHyQ.png" alt /></p>
<p><strong>Note this IP address</strong>, as it will be used as the target (<code>RHOSTS</code>) inside <code>msfconsole</code>.</p>
<hr />
<h2 id="heading-3-ensure-both-machines-are-on-the-same-network"><strong>3. Ensure Both Machines Are on the Same Network</strong></h2>
<p>Both VMs must be configured with:</p>
<ul>
<li><p><strong>Network Adapter:</strong> Host-only Adapter</p>
</li>
<li><p><strong>Name:</strong> VirtualBox Host-Only Ethernet Adapter</p>
</li>
</ul>
<p>This ensures:</p>
<ul>
<li><p>Kali ↔ Metasploitable2 communication</p>
</li>
<li><p>No internet exposure (safe lab)</p>
</li>
</ul>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*jw_6awLbGPXPpHGiSwUheg.png" alt /></p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*4YuIMsvhpt9eC4OBsvdfbg.png" alt /></p>
<hr />
<h2 id="heading-4-test-connectivity-very-important"><strong>4. Test Connectivity (Very Important)</strong></h2>
<h2 id="heading-from-kali-linux"><strong>From Kali Linux:</strong></h2>
<pre><code class="lang-plaintext">ping 192.168.56.101
</code></pre>
<p>If you receive replies, your lab network is working correctly.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*GfrW8rPEuFozFSKXF6kvbQ.png" alt /></p>
<hr />
<h2 id="heading-5-confirm-target-visibility-using-nmap"><strong>5. Confirm Target Visibility Using Nmap</strong></h2>
<p>Before using Metasploit, attackers always enumerate first.</p>
<pre><code class="lang-plaintext">nmap -sV 192.168.56.101
</code></pre>
<p>You should see multiple <strong>intentionally vulnerable services</strong>, such as:</p>
<ul>
<li><p>FTP (vsftpd 2.3.4)</p>
</li>
<li><p>SSH</p>
</li>
<li><p>Samba</p>
</li>
<li><p>Tomcat</p>
</li>
</ul>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*URDqkjKlFLdGB6D8vvClcg.png" alt /></p>
<p>This confirms that <strong>Metasploitable2 is ready for exploitation</strong>.</p>
<h2 id="heading-setup-checklist"><strong>Setup Checklist</strong></h2>
<p>✔ Kali Linux boots successfully<br />✔ <code>msfconsole</code> launches without errors<br />✔ Metasploitable2 is accessible<br />✔ Both machines are on the same subnet<br />✔ Ping &amp; Nmap scans work</p>
<p>Once all checks pass, your lab is <strong>fully prepared</strong>.</p>
<h2 id="heading-basic-msfconsole-commands-getting-comfortable-with-the-interface"><strong>Basic</strong> <code>msfconsole</code> Commands (Getting Comfortable with the Interface)</h2>
<p>Before jumping into exploitation, it’s important to understand the <strong>basic operating system–style commands</strong> and navigation used inside <code>msfconsole</code>. This section helps beginners feel confident while working in the Metasploit environment.</p>
<p>We are using <strong>Kali Linux</strong> with the <strong>Metasploit Framework</strong>.</p>
<h2 id="heading-starting-msfconsole"><strong>Starting</strong> <code>msfconsole</code></h2>
<p>Open a terminal in Kali Linux and run:</p>
<pre><code class="lang-plaintext">msfconsole
</code></pre>
<p>Once loaded, you will see:</p>
<pre><code class="lang-plaintext">msf6 &gt;
</code></pre>
<p>This prompt indicates that <code>msfconsole</code> is ready to accept commands.</p>
<h2 id="heading-getting-help-in-msfconsole"><strong>Getting Help in</strong> <code>msfconsole</code></h2>
<h2 id="heading-show-all-commands"><strong>Show All Commands</strong></h2>
<pre><code class="lang-plaintext">help
</code></pre>
<p>or simply:</p>
<pre><code class="lang-plaintext">?
</code></pre>
<p>This lists all available commands, similar to using <code>help</code> in an operating system shell.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*3Srbz0J8FUP1JeGKlzyaDA.png" alt /></p>
<h2 id="heading-navigation-commands-os-like-basics"><strong>Navigation Commands (OS-Like Basics)</strong></h2>
<table style="min-width:50px"><colgroup><col style="min-width:25px"></col><col style="min-width:25px"></col></colgroup><tbody><tr><th><p>Command</p></th><th><p>Description</p></th></tr><tr><td><p><code>pwd</code></p></td><td><p>Shows the current module path</p></td></tr><tr><td><p><code>cd</code></p></td><td><p>Change module directory</p></td></tr><tr><td><p><code>ls</code></p></td><td><p>List available modules</p></td></tr><tr><td><p><code>clear</code></p></td><td><p>Clear the screen</p></td></tr></tbody></table>
<p>Example:</p>
<pre><code class="lang-plaintext">pwd
ls
</code></pre>
<p>These commands work <strong>inside Metasploit</strong>, not the Linux filesystem.</p>
<h2 id="heading-searching-for-modules"><strong>Searching for Modules</strong></h2>
<p>One of the most used commands:</p>
<pre><code class="lang-plaintext">search &lt;keyword&gt;
</code></pre>
<p>Example:</p>
<pre><code class="lang-plaintext">search ftp
search samba
search vsftpd
</code></pre>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*_59RCXTD8OlOBxmtHTKWTw.png" alt /></p>
<p>This helps you quickly find:</p>
<ul>
<li><p>Exploits</p>
</li>
<li><p>Auxiliary scanners</p>
</li>
<li><p>Payloads</p>
</li>
</ul>
<h2 id="heading-understanding-module-types"><strong>Understanding Module Types</strong></h2>
<p>Metasploit is organized into different <strong>module types</strong>, each designed for a specific purpose in the penetration testing lifecycle.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*mup8HD7Ny-wJB_UHIm-wsA.png" alt /></p>
<p>You can list them using:</p>
<pre><code class="lang-plaintext">ls exploit
ls auxiliary
</code></pre>
<h2 id="heading-using-a-module"><strong>Using a Module</strong></h2>
<p>To select a module:</p>
<pre><code class="lang-plaintext">use exploit/unix/ftp/vsftpd_234_backdoor
</code></pre>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*BOa3f7W203Dorp7kN5ly0g.png" alt /></p>
<p>Once selected, the prompt changes to:</p>
<pre><code class="lang-plaintext">msf6 exploit(unix/ftp/vsftpd_234_backdoor) &gt;
</code></pre>
<p>This tells you <strong>which module is currently active</strong>.</p>
<h2 id="heading-viewing-amp-setting-options"><strong>Viewing &amp; Setting Options</strong></h2>
<h2 id="heading-show-required-options"><strong>Show Required Options</strong></h2>
<pre><code class="lang-plaintext">show options
</code></pre>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*YQCTkTtyTuH2KCRHknxBog.png" alt /></p>
<h2 id="heading-set-target-ip"><strong>Set Target IP</strong></h2>
<pre><code class="lang-plaintext">set RHOSTS 192.168.56.101
</code></pre>
<h2 id="heading-set-port-if-needed"><strong>Set Port (if needed)</strong></h2>
<pre><code class="lang-plaintext">set RPORT 21
</code></pre>
<p>To verify:</p>
<pre><code class="lang-plaintext">show options
</code></pre>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*XWEtzl5SBnYN4wkrQJPeXA.png" alt /></p>
<h2 id="heading-running-a-module"><strong>Running a Module</strong></h2>
<pre><code class="lang-plaintext">run
</code></pre>
<p>or</p>
<pre><code class="lang-plaintext">exploit
</code></pre>
<p>Both commands do the same thing.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*H9vWWSHvNAf6eQbcVLjrLQ.png" alt /></p>
<h2 id="heading-session-management-basics"><strong>Session Management Basics</strong></h2>
<p>After successful exploitation:</p>
<pre><code class="lang-plaintext">sessions
</code></pre>
<p>Interact with a session:</p>
<pre><code class="lang-plaintext">sessions -i 1
</code></pre>
<p>Exit session:</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*mYxSH7v_A7HnR1W6QTTkbw.png" alt /></p>
<pre><code class="lang-plaintext">exit
</code></pre>
<h2 id="heading-exiting-modules-amp-msfconsole"><strong>Exiting Modules &amp;</strong> <code>msfconsole</code></h2>
<p>In Metasploit, use <code>back</code> to leave the current module and return to the main console. Use <code>quit</code> or <code>exit</code> to close <code>msfconsole</code> completely.</p>
<h2 id="heading-key-takeaways"><strong>Key Takeaways</strong></h2>
<p>✔ <code>msfconsole</code> feels like a mini operating system<br />✔ <code>search</code>, <code>use</code>, and <code>show options</code> are core commands<br />✔ Always understand a module before running it<br />✔ Enumeration comes before exploitation</p>
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>In this blog, we explored how <code>msfconsole</code>, the core interface of the <strong>Metasploit Framework</strong>, can be used to exploit a vulnerable <strong>FTP service</strong> on <strong>Metasploitable2</strong> from an attacker machine running <strong>Kali Linux</strong>.</p>
<p>Starting from proper lab setup and network configuration, we moved through the essential stages of a penetration test: <strong>service enumeration, exploit selection, configuration, and execution</strong>. By exploiting the outdated <code>vsftpd 2.3.4</code> service, we demonstrated how a single vulnerable service can lead to <strong>full system compromise</strong> when basic security practices are ignored.</p>
<p>This exercise highlights several important lessons:</p>
<ul>
<li><p>Enumeration is more important than exploitation</p>
</li>
<li><p>Outdated services pose serious security risks</p>
</li>
<li><p>Automation tools like <code>msfconsole</code> must be used with understanding, not blindly</p>
</li>
<li><p>Ethical hacking is about <strong>learning and improving security</strong>, not breaking systems</p>
</li>
</ul>
<p>Practicing in a controlled environment like Metasploitable2 helps build a strong foundation for <strong>CTFs, real-world penetration testing, and defensive security awareness</strong>.</p>
<p>In future labs, this knowledge can be extended to:</p>
<ul>
<li><p>Exploiting other services such as Samba and Tomcat</p>
</li>
<li><p>Using Meterpreter for advanced post-exploitation</p>
</li>
<li><p>Understanding how blue teams detect and prevent such attacks</p>
</li>
</ul>
<blockquote>
<p><strong><em>Always remember:</em></strong> <em>with great power comes great responsibility. Use these skills only where you have explicit permission.</em></p>
</blockquote>
<p>Follow Kubesimplify on <a target="_blank" href="https://blog.kubesimplify.com/"><strong>Hashnode</strong></a>, <a target="_blank" href="https://twitter.com/kubesimplify"><strong>Twitter/X</strong></a> and <a target="_blank" href="https://www.linkedin.com/company/kubesimplify"><strong>LinkedIn</strong></a>. Join our <a target="_blank" href="https://kubesimplify.com/discord"><strong>Discord server</strong></a> to learn with us!</p>
]]></content:encoded></item><item><title><![CDATA[Kubernetes v1.35 – What’s New, What’s Changing, and What You Should Know]]></title><description><![CDATA[Release date: December 17, 2025
Kubernetes 1.35 is released and again the velocity for this project never makes you feel that there is no innovation left. The releases is focussed on more stability, AI workloads, introducing concept of workload and s...]]></description><link>https://blog.kubesimplify.com/kubernetes-v135-whats-new-whats-changing-and-what-you-should-know</link><guid isPermaLink="true">https://blog.kubesimplify.com/kubernetes-v135-whats-new-whats-changing-and-what-you-should-know</guid><category><![CDATA[Kuberntes 1.35]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[release updates]]></category><category><![CDATA[new]]></category><category><![CDATA[newrelease]]></category><dc:creator><![CDATA[Saloni Narang]]></dc:creator><pubDate>Fri, 19 Dec 2025 13:42:59 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1766148219752/976c8a3e-ea9a-4ecc-b054-267dac3d5645.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Release date: December 17, 2025</strong></p>
<p>Kubernetes 1.35 is released, and once again the velocity of this project leaves no doubt that there is plenty of innovation left. The release focuses on more stability, AI workloads, the introduction of the Workload concept, and so much more.</p>
<p>Huge thanks to all the contributors and the release team for making this happen. Let’s look at some of the cool features for this release.</p>
<p><img src="https://media.licdn.com/dms/image/v2/D5622AQELO5TpuYIYHg/feedshare-shrink_2048_1536/B56Zss9.NHIEAw-/0/1765986005355?e=1767830400&amp;v=beta&amp;t=qdPIggbl_P9iIE8_4qvq1ZpHUP8B9zNDhTgFnLARv8Y" alt="No alternative text description for this image" /></p>
<h2 id="heading-native-gang-scheduling-is-finally-here-and-its-a-big-deal">Native Gang Scheduling is finally here (and it’s a big deal)</h2>
<p>For years, Kubernetes scheduled pods one by one. That model works well for stateless services, but it breaks down badly for distributed workloads: think PyTorch training jobs, Spark, Ray, or MPI-style applications.</p>
<p>Kubernetes 1.35 introduces <strong>native Gang Scheduling</strong> through a new <strong>Workload API</strong>. This enables <em>all-or-nothing</em> scheduling: either all pods in a group get scheduled together, or none do. No more half-started jobs burning GPUs while waiting for peers. This is foundational work for AI and HPC workloads on Kubernetes.</p>
<p>First, define a <code>Workload</code> that represents the gang policy:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">scheduling.k8s.io/v1alpha1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Workload</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">demo-workload</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">gang-demo</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">podGroups:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">workers</span>
    <span class="hljs-attr">policy:</span>
      <span class="hljs-attr">gang:</span>
        <span class="hljs-attr">minCount:</span> <span class="hljs-number">3</span>
</code></pre>
<p>Then reference it from your Pod:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Pod</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">worker-0</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">gang-demo</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">gang-demo</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">workloadRef:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">demo-workload</span>
    <span class="hljs-attr">podGroup:</span> <span class="hljs-string">workers</span>
  <span class="hljs-attr">containers:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">demo</span>
      <span class="hljs-attr">image:</span> <span class="hljs-string">nginx:1.27</span>
      <span class="hljs-attr">resources:</span>
        <span class="hljs-attr">requests:</span>
          <span class="hljs-attr">cpu:</span> <span class="hljs-string">"200m"</span>
          <span class="hljs-attr">memory:</span> <span class="hljs-string">"128Mi"</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Pod</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">worker-1</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">gang-demo</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">gang-demo</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">workloadRef:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">demo-workload</span>
    <span class="hljs-attr">podGroup:</span> <span class="hljs-string">workers</span>
  <span class="hljs-attr">containers:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">demo</span>
      <span class="hljs-attr">image:</span> <span class="hljs-string">nginx:1.27</span>
      <span class="hljs-attr">resources:</span>
        <span class="hljs-attr">requests:</span>
          <span class="hljs-attr">cpu:</span> <span class="hljs-string">"200m"</span>
          <span class="hljs-attr">memory:</span> <span class="hljs-string">"128Mi"</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Pod</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">worker-2</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">gang-demo</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">gang-demo</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">workloadRef:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">demo-workload</span>
    <span class="hljs-attr">podGroup:</span> <span class="hljs-string">workers</span>
  <span class="hljs-attr">containers:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">demo</span>
      <span class="hljs-attr">image:</span> <span class="hljs-string">nginx:1.27</span>
      <span class="hljs-attr">resources:</span>
        <span class="hljs-attr">requests:</span>
          <span class="hljs-attr">cpu:</span> <span class="hljs-string">"200m"</span>
          <span class="hljs-attr">memory:</span> <span class="hljs-string">"128Mi"</span>
</code></pre>
<p>The scheduler now understands that these pods belong together: either all of them are scheduled, or none are.</p>
<p><strong>Saiyam Pathak already published a full hands-on video on the Kubesimplify YouTube channel</strong>, where he builds Kubernetes 1.35 from source, enables the Workload API, and demonstrates native Gang Scheduling end to end. If you’re working with AI or distributed computing, that walkthrough is worth your time.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/bD_eQU0GwOw?si=zEc3c5xBwBIoF5Pl">https://youtu.be/bD_eQU0GwOw?si=zEc3c5xBwBIoF5Pl</a></div>
<p> </p>
<h2 id="heading-in-place-pod-update-moves-to-stable"><strong>In place pod update moves to stable</strong></h2>
<p>In Kubernetes v1.35, In-place update of Pod resources graduated to GA (Stable), which means you can now change a running Pod’s CPU and memory requests/limits without recreating the Pod (and often without restarting containers). This is a big deal for stateful and batch workloads where a restart is costly: you can do smoother, less disruptive vertical scaling, and Kubernetes will reflect desired resources in <code>spec</code> while tracking what’s actually applied in <code>status</code> as the kubelet works through the resize. This graduation comes from KEP #1287 (SIG Node).</p>
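<p>If you want to try it, resizes go through the pod's <code>resize</code> subresource. A quick sketch with placeholder pod and container names:</p>
<pre><code class="lang-bash"># Bump CPU on a running pod without recreating it
kubectl patch pod my-app --subresource resize --patch \
  '{"spec":{"containers":[{"name":"app","resources":{"requests":{"cpu":"500m"},"limits":{"cpu":"1"}}}]}}'

# Desired values live in spec; what's actually applied shows up in status
kubectl get pod my-app -o jsonpath='{.status.containerStatuses[0].resources}'
</code></pre>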
<h2 id="heading-user-namespaces-in-pods">User namespaces in Pods</h2>
<p>Linux user namespaces in Kubernetes allow pods to run with different user and group IDs than the host while preserving strong isolation. With this feature, a process can run as root inside a container but map to an unprivileged user on the node, significantly reducing the blast radius of container breakouts and mitigating several high-severity CVEs. It adds a new <code>pod.spec.hostUsers</code> field to opt into user namespaces, integrates with CRI and idmapped mounts for safe volume handling, and aligns with Pod Security Standards to safely relax certain restrictions when enabled. Overall, this strengthens node-to-pod and pod-to-pod isolation without changing default behaviour, making Kubernetes workloads more secure by design.</p>
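<p>Opting in is a single field on the pod spec. A minimal sketch:</p>
<pre><code class="lang-bash">kubectl apply --dry-run=client -f - &lt;&lt;'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: userns-demo
spec:
  hostUsers: false   # give the pod its own user namespace; root inside maps to an unprivileged host UID
  containers:
  - name: app
    image: nginx:1.27
EOF
</code></pre>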
<h2 id="heading-kubernetes-is-finally-letting-go-of-old-baggage"><strong>Kubernetes is finally letting go of old baggage</strong></h2>
<p>Kubernetes 1.35 draws a hard line under several legacy technologies.</p>
<p>If your nodes still rely on <strong>cgroup v1</strong>, the kubelet will not start by default anymore. Most modern Linux distributions have already moved to cgroup v2, which offers better resource isolation and consistency. Kubernetes has waited long enough, and now it is taking action.</p>
<p>The same applies to <strong>containerd 1.x</strong>. Kubernetes 1.35 is the <em>last</em> release that supports it. If you plan to upgrade beyond this version, moving to containerd 2.x is no longer optional.</p>
<p><strong>IPVS mode in kube-proxy</strong> is also deprecated. While once popular for performance reasons, it became increasingly complex to maintain. Kubernetes is consolidating around nftables, reducing operational and maintenance burden.</p>
<p><strong>Ingress NGINX retirement</strong>: This came as a massive blow: due to maintainer debt, Ingress NGINX, which is used by thousands of organisations, is being retired, although Chainguard has stepped up to support and maintain a version, as per <a target="_blank" href="https://www.chainguard.dev/unchained/introducing-chainguard-emeritoss">this announcement</a>. IMO you should start your migration to Gateway API, and Kubesimplify already has a full end-to-end video where you get to see an entire demo of how to migrate from Ingress to Gateway API.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/Z-vKixowC9c?si=Rb3ud8-K5vSFaaSt">https://youtu.be/Z-vKixowC9c?si=Rb3ud8-K5vSFaaSt</a></div>
<p> </p>
<h2 id="heading-securing-cached-images-in-kubernetes-135"><strong>Securing Cached Images in Kubernetes 1.35</strong></h2>
<p>KEP-2535 addresses a security gap in Kubernetes image pulling. Historically, when <code>imagePullPolicy</code> was set to <code>IfNotPresent</code> or <code>Never</code>, a pod could start using a private image already cached on a node even if it did not have the credentials to pull that image itself. In multi-tenant clusters, this meant one workload could unintentionally benefit from another workload’s credentials. Kubernetes 1.35 introduces kubelet-level credential verification for cached images, ensuring that a pod is authorized to use an image before it can run, regardless of whether the image is already present. With this beta feature enabled by default and configurable via <code>imagePullCredentialsVerificationPolicy</code>, clusters can now enforce stricter image access without forcing <code>Always</code> pulls, significantly improving tenant isolation and supply-chain security.</p>
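<p>The policy is set in the kubelet configuration file. Here's a sketch of the relevant fragment (the field name comes from the release notes above; the value names follow KEP-2535 and should be verified against the kubelet configuration reference for your version):</p>
<pre><code class="lang-bash"># Fragment to merge into the kubelet config (commonly /var/lib/kubelet/config.yaml)
cat &lt;&lt;'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# AlwaysVerify is the strictest option; the beta default is a more permissive policy
imagePullCredentialsVerificationPolicy: AlwaysVerify
EOF
</code></pre>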
<h3 id="heading-overall-i-am-excited-about">Overall I am excited about</h3>
<ul>
<li><p>In-place Pod resource updates (now GA): being able to adjust CPU/memory without restarting pods is huge for stateful and long-running workloads.</p>
</li>
<li><p>Native gang scheduling and the new Workload API direction are a big deal for ML/HPC style workloads and any “all-or-nothing” batch pipelines.</p>
</li>
<li><p>Pod certificates (beta): a strong step toward native workload identity and simpler mTLS setups without always relying on extra controllers.</p>
</li>
<li><p>Node declared features (alpha): a practical way to reduce upgrade/version-skew surprises by letting nodes declare supported capabilities before scheduling decisions happen.</p>
</li>
</ul>
<p>Overall this release consists of 60 enhancements, including 17 stable, 19 beta, and 22 alpha features. Which feature are you most excited about in this Kubernetes release, and is there anything you would like to see a deep dive on at Kubesimplify?</p>
]]></content:encoded></item><item><title><![CDATA[Ditch the Overheating Laptop: Supercharge Your Docker Workflow with Docker Offload]]></title><description><![CDATA[We've all been there: you run a resource-intensive Docker build or a compute-heavy container, and your laptop's fans start screaming, the CPU maxes out, and your entire machine slows to a crawl. For developers working with large applications, complex...]]></description><link>https://blog.kubesimplify.com/ditch-the-overheating-laptop-supercharge-your-docker-workflow-with-docker-offload</link><guid isPermaLink="true">https://blog.kubesimplify.com/ditch-the-overheating-laptop-supercharge-your-docker-workflow-with-docker-offload</guid><category><![CDATA[AI]]></category><category><![CDATA[Docker]]></category><category><![CDATA[docker desktop]]></category><category><![CDATA[Cloud]]></category><dc:creator><![CDATA[Saloni Narang]]></dc:creator><pubDate>Tue, 26 Aug 2025 06:40:46 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1756220091194/365cc324-2eb9-4656-9515-fc69b74abb3e.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756189065352/1607355b-8235-4238-a3c2-51024e060f63.png" alt class="image--center mx-auto" /></p>
<p>We've all been there: you run a resource-intensive Docker build or a compute-heavy container, and your laptop's fans start screaming, the CPU maxes out, and your entire machine slows to a crawl. For developers working with large applications, complex multi-stage builds, or AI/ML workloads, this is a frustratingly common problem. But what if you could have the best of both worlds: the awesome Docker Desktop experience along with the power of a high-performance cloud machine?</p>
<p>Enter <strong>Docker Offload</strong>, a game-changing service that lets you offload your Docker builds and container runs to a secure, dedicated cloud environment. It's not about learning a new cloud platform; it's about seamlessly extending your existing Docker workflow to where the resources are.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755952155295/d21f69d1-a215-4470-9f44-3e015d991eec.png" alt class="image--center mx-auto" /></p>
<hr />
<h3 id="heading-what-exactly-is-docker-offload">What Exactly Is Docker Offload?</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756189192968/32a8ce98-059d-43f0-b696-9134e0264a05.png" alt class="image--center mx-auto" /></p>
<p>Docker Offload is a fully managed service that allows you to execute Docker commands on <strong>powerful, remote cloud infrastructure</strong> while maintaining the same user experience on your local machine. Think of it as a bridge that connects your Docker Desktop to a beefy, cloud-based Docker daemon. You still type <code>docker build</code> and <code>docker run</code> in your terminal, but the heavy lifting is done remotely, freeing up your local resources.</p>
<p>This service is particularly useful for tasks that are traditionally a pain on a local machine, such as:</p>
<ul>
<li><p><strong>Building large, complex images:</strong> Multi-stage builds with many dependencies or a large build context can take ages on a standard laptop.</p>
</li>
<li><p><strong>Running compute-intensive workloads:</strong> Tasks like machine learning model training, AI inferencing, data processing, and video transcoding often require more CPU, RAM, and most importantly, GPU power than a typical developer machine has.</p>
</li>
<li><p><strong>Standardizing development environments:</strong> It helps teams with a variety of hardware specs work on the same projects without performance bottlenecks, ensuring a consistent and fast experience for everyone.</p>
</li>
</ul>
<hr />
<h3 id="heading-key-benefits-of-offloading-to-the-cloud">Key Benefits of Offloading to the Cloud</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756189237048/ab1c410f-3c46-4c85-8477-21964b8e40f9.png" alt class="image--center mx-auto" /></p>
<ol>
<li><p><strong>Massive Performance Boost:</strong> By leveraging powerful cloud instances with access to <strong>NVIDIA L4 GPUs</strong>, you can slash build times and run containers that would otherwise overwhelm your local machine.</p>
</li>
<li><p><strong>No Changes to Your Workflow:</strong> This is the magic of Docker Offload. It's not a new tool or a different syntax. You use the same <code>docker</code> and <code>docker compose</code> commands you already know and love.</p>
</li>
<li><p><strong>Resource Optimization and Cost Efficiency:</strong> The service uses a pay-as-you-go model. Cloud environments are ephemeral, meaning they're provisioned for your session and then automatically shut down and cleaned up after a period of inactivity. This prevents you from paying for idle resources.</p>
</li>
<li><p><strong>Seamless Local-Cloud Integration:</strong> Even though your container is running in the cloud, exposed ports are still accessible via <a target="_blank" href="http://localhost"><code>localhost</code></a>. This allows you to interact with your running application as if it were local, making development, debugging, and testing a breeze.</p>
</li>
<li><p><strong>Shared Build Cache:</strong> Docker Offload uses a shared cache that can be reused by your team, further accelerating build times and ensuring consistency across different machines.</p>
</li>
</ol>
<hr />
<h3 id="heading-getting-started-a-quick-guide-to-implementation">Getting Started: A Quick Guide to Implementation</h3>
<p>Implementing Docker Offload is surprisingly simple. You'll need Docker Desktop version 4.43 or later.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756189330902/6df2345e-fd7c-4c2d-8246-7176a5627e98.png" alt class="image--center mx-auto" /></p>
<h4 id="heading-1-start-docker-offload">1. Start Docker Offload</h4>
<p>In your terminal, just run:</p>
<pre><code class="lang-bash">docker offload start
</code></pre>
<p>This command will prompt you to log in to your Docker account and will ask if you want to enable GPU support. If you're working on AI/ML projects, enabling the GPU is highly recommended. Once it's running, you'll see a cloud icon in your Docker Desktop dashboard, and your terminal's context will be set to the new cloud environment.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756189837358/da390ca3-e6ea-4437-8caa-ba45ebbeebe0.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756189820022/07bc74f1-6a2d-49c8-89e1-cf7589d4b383.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756189858072/2504b481-95e9-4eea-b9d2-3e3130b5e700.png" alt class="image--center mx-auto" /></p>
<h4 id="heading-2-gpu-test-with-docker-offload">2. GPU test with Docker offload</h4>
<p>With Docker Offload active, you can now use your regular commands. Let's try running a Minimal GPU smoke test with Compose + Offload.</p>
<p>Create a <code>compose.yaml</code> file:</p>
<pre><code class="lang-dockerfile">services:
  gpu-smoke:
    image: nvidia/cuda:<span class="hljs-number">12.4</span>.<span class="hljs-number">1</span>-base-ubuntu22.<span class="hljs-number">04</span>
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
</code></pre>
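<p>Then bring it up with the usual Compose command; no Offload-specific flags are needed, because the active Docker context already points at the cloud engine:</p>
<pre><code class="lang-bash">docker compose up
</code></pre>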
<p>This will run <code>nvidia-smi</code> in the Offload cloud GPU instance and print the GPU info. You should see an NVIDIA-SMI table in the logs that confirms the GPU is available in your Offload session.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756189962147/13f9e4ad-52a3-4879-9254-5cdd6afd936d.png" alt class="image--center mx-auto" /></p>
<p>Either way, you'll see the output as if the container ran locally, but it's all happening on the powerful cloud machine.</p>
<h4 id="heading-3-stop-docker-offload">3. Stop Docker Offload</h4>
<p>When you're finished, you can switch back to local execution by running:</p>
<pre><code class="lang-plaintext">docker offload stop
</code></pre>
<p>This will tear down the cloud environment, ensuring you don't incur any additional costs.</p>
<ol start="4">
<li><p><strong>Checking Docker Offload Run history</strong></p>
<p> You can also open the Offload section under your Docker account and see the run history to track usage. You can set usage limits to avoid any cost surprises.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756190024230/c1d5fc28-6f5f-4bab-8f86-eed0c9954331.png" alt class="image--center mx-auto" /></p>
</li>
</ol>
<hr />
<h3 id="heading-real-world-use-cases">Real-World Use Cases</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756189237048/ab1c410f-3c46-4c85-8477-21964b8e40f9.png" alt /></p>
<ul>
<li><p><strong>AI/ML Development:</strong> Easily train or run inference on large models without the need for an expensive local GPU.</p>
</li>
<li><p><strong>High-Velocity Development:</strong> Accelerate CI/CD pipelines and local development feedback loops by offloading builds to a powerful, consistent environment.</p>
</li>
<li><p><strong>On-Demand Compute:</strong> Quickly spin up a resource-heavy container for a short task (like data processing or video rendering) without impacting your local machine's performance.</p>
</li>
<li><p><strong>Developer Experience:</strong> Provide every developer on a team with the same powerful environment, regardless of their local hardware, eliminating "it works on my machine" issues.</p>
</li>
</ul>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Docker Offload is a powerful feature that bridges the gap between local convenience and cloud-scale resources. It's a testament to Docker's ongoing commitment to making containerization accessible and efficient for every developer. So next time your laptop's fans start spinning up, remember you have a new, more powerful option. What do you use for your development workflows when you need these big machines but want the same developer experience?</p>
]]></content:encoded></item><item><title><![CDATA[Docker MCP Catalog: Finding the Right AI Tools for Your Project]]></title><description><![CDATA[As large language models (LLMs) evolve from static text generators to dynamic agents capable of executing actions, there's a growing need for a standardized way to let them interact with external tooling securely. That’s where Model Context Protocol ...]]></description><link>https://blog.kubesimplify.com/docker-mcp-catalog</link><guid isPermaLink="true">https://blog.kubesimplify.com/docker-mcp-catalog</guid><category><![CDATA[mcp]]></category><category><![CDATA[Docker]]></category><category><![CDATA[docker images]]></category><category><![CDATA[docker desktop]]></category><category><![CDATA[AI]]></category><category><![CDATA[llm]]></category><dc:creator><![CDATA[Saloni Narang]]></dc:creator><pubDate>Thu, 26 Jun 2025 07:32:05 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1750923109185/dbdb9d02-71cb-42b5-b660-68290ac7d695.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As large language models (LLMs) evolve from static text generators to dynamic agents capable of executing actions, there's a growing need for a standardized way to let them interact with external tooling securely. That’s where <a target="_blank" href="https://modelcontextprotocol.io/introduction">Model Context Protocol</a> (MCP) steps in, a protocol designed to turn your existing APIs into AI-accessible tools. Think of MCP as the missing middleware between LLMs and the real-world functionality you’ve already built. Instead of doing the prompt hacks or building custom plugins for each model, MCP allows you to define your capabilities as structured tools that any compliant AI client can discover, invoke, and interact with safely and predictably. While the protocol is still maturing and the documentation can be opaque, the underlying value is clear: MCP turns your backend into a toolbox for AI agents. Whether you're integrating scraping APIs, financial services, or internal business logic, MCP offers a portable, reusable, and scalable pattern for AI integrations. In this blog, we’ll walk through Docker Desktop's latest MCP client-server feature and explore how you can install an MCP server and use that directly from your LLM tool.  </p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750069727593/8f4837fc-24e4-41f7-8d83-227da002418b.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-enter-docker">Enter Docker!</h3>
<ul>
<li><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750070288946/ec056820-0667-4c3f-9d46-775f615dea58.png" alt class="image--center mx-auto" /></li>
</ul>
<p><strong>Why is Docker a game-changer for AI, and specifically for MCP tools?</strong></p>
<p><a target="_blank" href="https://www.docker.com/">Docker</a> has already proven to be the de facto standard for creating and distributing containerized applications. Its user experience is the key reason why I and millions of other developers use Docker today. Over the years, Docker has evolved to cater to the needs of developers, and it entered the AI game too. With so many MCP servers having a set of configurations living on separate GitHub repositories and different installation methods, Docker has again changed the game on how we think and run these MCP servers and connect to MCP clients like Claude.</p>
<p>We already have MCP, which solves the problem of how agents talk to tools: in simple terms, your LLM models can connect to your tools and perform a wider set of actions. But how simple is it? There are hundreds of MCP servers out there, each with its own configuration and its own GitHub repository.</p>
<p>Docker has introduced the <a target="_blank" href="https://www.docker.com/products/mcp-catalog-and-toolkit/"><strong>Docker MCP Catalog and Toolkit</strong></a> (currently in Beta). This is a comprehensive solution designed to streamline the developer experience for building and using MCP-compatible tools.<br />We can simply go to the extensions and install the Docker MCP toolkit.  </p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750071825142/9643e08a-8d53-4f78-926b-c25aed281f7c.png" alt class="image--center mx-auto" /></p>
<p><strong>What is the Docker MCP Catalog?</strong></p>
<p>The Docker MCP Catalog is a centralized, trusted registry that offers a curated collection of MCP-compatible tools packaged as Docker images. Integrated with Docker Hub and available directly through Docker Desktop, it simplifies the discovery, sharing, and execution of over 100 verified MCP servers from partners like Stripe, Grafana, and others. By running each tool in an isolated container, the catalog addresses common issues such as environment conflicts, inconsistent platform behaviour, and complex setups, ensuring portability, security, and consistency across systems. Developers can instantly pull and run these tools using the Docker CLI or Docker Desktop, with built-in support for agent integration via the MCP Toolkit (see the example after the list below).</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750876075651/679dcf66-5fae-4e18-b998-c962d2703104.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p><strong>Centralized Discovery:</strong> A dedicated place (often under the <code>mcp/</code> namespace on Docker Hub) to find a growing list of MCP servers. Docker has mentioned collaborations with providers like Stripe, Elastic, and Neo4j, indicating a rich ecosystem of tools.</p>
</li>
<li><p><strong>Verified and Versioned Tools:</strong> The catalog aims to provide access to tools from verified publishers and ensures that tools are versioned, allowing developers to rely on specific, stable releases.</p>
</li>
<li><p><strong>Easy Distribution:</strong> Pull-based distribution using Docker’s existing infrastructure.</p>
</li>
</ul>
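<p>For example, pulling and running a server image from the <code>mcp/</code> namespace works like any other Docker image. The server name below is just an illustration; check the catalog for the images actually available:</p>
<pre><code class="lang-bash"># Pull an MCP server image from the mcp/ namespace on Docker Hub
docker pull mcp/duckduckgo

# MCP servers typically speak JSON-RPC over stdio, so run them interactively
docker run -i --rm mcp/duckduckgo
</code></pre>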
<p><strong>What is the MCP Toolkit?</strong></p>
<p>The <strong>Docker MCP Toolkit</strong> is a powerful companion to the MCP Catalog that streamlines the setup, management, and execution of containerized MCP servers and their integration with AI agents. It eliminates the need for manual configuration by offering one-click setup, secure defaults, and built-in compatibility with popular LLM-based clients like Docker Gordon, Claude Desktop, Cursor, and <a target="_blank" href="http://Continue.dev">Continue.dev</a>. Acting as both an aggregator for MCP servers and a gateway for connected clients, the toolkit enables developers to browse, configure, and launch tools directly from Docker Desktop. Security is a core focus: all MCP servers are signed and include SBOMs for transparency. The MCP Toolkit makes it easy to discover and run tools from the catalog, connect clients like Gordon or the Docker AI Agent, and build secure, agent-driven workflows with minimal overhead.</p>
<p>Key functionalities of the MCP Toolkit:</p>
<ul>
<li><p><strong>Simplified Installation:</strong> Easily pull and run MCP tools (servers) from the catalog, often with just a few clicks or simple commands.</p>
</li>
<li><p><strong>Secure Credential Management:</strong> One of the major pain points in tool integration is handling authentication securely. The MCP Toolkit often includes features like OAuth-based authentication and secure storage for credentials, preventing the risky practice of hardcoding secrets.</p>
</li>
<li><p><strong>Isolated and Secure Execution:</strong> Leverages Docker's containerization to run tools in isolated environments, enhancing security and stability.</p>
</li>
<li><p><strong>Client Integration:</strong> Facilitates connecting these MCP tools/servers to various AI agent clients (e.g., popular AI models or development environments like Claude, Cursor) without needing to rewrite code for each client.</p>
</li>
</ul>
<p>Together, the Docker MCP Catalog and Toolkit provide a foundational layer for developers, making external tools easier to find, safer to use, and more scalable to integrate into AI agent applications.  </p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750092298292/9991a34f-e936-48e2-a2a7-a3e8fd0e08b3.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-what-you-can-do-with-the-catalog">What You Can Do with the Catalog</h3>
<h3 id="heading-lets-see-it-in-action">Let’s see it in action:</h3>
<p>There are four MCP clients available in Docker Desktop:</p>
<ol>
<li><p>Gordon</p>
</li>
<li><p>Claude Desktop</p>
</li>
<li><p>Cursor</p>
</li>
<li><p>Continue.dev</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750761333177/f6216f26-39a7-4dbe-858d-5da1dbc11208.png" alt class="image--center mx-auto" /></p>
<p>You can connect to any of these, but we are connecting to Claude Desktop. After installing, click the Connect button in the Docker Desktop app. It automatically creates a JSON configuration file and puts it in the right place. When you restart Claude, you will see MCP_DOCKER with its tools; here, the tools are the MCP servers you have installed from the catalog.</p>
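<p>For reference, the generated entry in Claude’s configuration file looks roughly like the sketch below. This is an assumption based on how the toolkit currently wires clients through the MCP gateway; the exact contents depend on your Docker Desktop version:</p>
<pre><code class="lang-json">{
  "mcpServers": {
    "MCP_DOCKER": {
      "command": "docker",
      "args": ["mcp", "gateway", "run"]
    }
  }
}
</code></pre>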
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750911785768/2d3debe3-4c2c-439c-8da4-fb60c0ec1c3b.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750911835798/efeb24d4-f669-46de-912c-b54547b2f414.png" alt class="image--center mx-auto" /></p>
<p>Now you can install MCP servers; in this case, we have installed the Docker MCP server.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750911903335/d6d5c2f2-bbba-4624-8c73-9799cfd04f4c.png" alt class="image--center mx-auto" /></p>
<p>Once this is enabled, you can go to Claude and simply use natural language to interact with the tools. Here you can see one tool, and that tool will be listed in Claude Desktop too.</p>
<p>Then let’s perform a simple task of creating an nginx container with the help of the Claude desktop:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750762544097/52b0ee5c-bd0e-4b38-a478-0339fcda8296.png" alt class="image--center mx-auto" /></p>
<p>As soon as you run the command, a dialog box pops up asking for permission to use the external integration, which is Docker in this case.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750762315164/c3ede2d0-f4da-4b61-a6f2-1caba584b859.png" alt class="image--center mx-auto" /></p>
<p>So we can see below that the nginx container has been created with the ID 6d87626dcad5.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750762698324/b060ae2a-96cd-49de-9731-10ef8d1d2219.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Docker has always believed in streamlining the developer workflow: it has been a one-stop solution for packaging applications as images and then running them as containers with ease. It also lets you build and push WASM artifacts, and now, with the AI wave, it is simplifying the whole AI ecosystem while keeping the same developer experience and fluid UI. In a future post, we will build an MCP server and see how that works. Until then, let me know your favourite MCP server and what you think about the Docker MCP Catalog and Toolkit!</p>
]]></content:encoded></item><item><title><![CDATA[Kubernetes v1.33: Key Features, Updates, and What You Need to Know]]></title><description><![CDATA[The Kubernetes v1.33, codenamed "Octarine: The Color of Magic" introduces 64 advancements.This release features 18 graduating to stable, 20 entering beta, and 24 new alpha features 1, with a strong emphasis on improving security, enhancing usability,...]]></description><link>https://blog.kubesimplify.com/kubernetes-v133-key-features-updates-and-what-you-need-to-know</link><guid isPermaLink="true">https://blog.kubesimplify.com/kubernetes-v133-key-features-updates-and-what-you-need-to-know</guid><category><![CDATA[v1.33 ]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[#kubernetes #container ]]></category><category><![CDATA[newrelease]]></category><category><![CDATA[Devops]]></category><category><![CDATA[DevRel]]></category><dc:creator><![CDATA[Saiyam Pathak]]></dc:creator><pubDate>Tue, 10 Jun 2025 07:24:57 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1749539944448/c347f1e3-ea41-4ad4-bcb7-12dca6880134.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The Kubernetes v1.33, codenamed "Octarine: The Color of Magic" introduces 64 advancements.This release features 18 graduating to stable, 20 entering beta, and 24 new alpha features 1, with a strong emphasis on improving security, enhancing usability, improving scalability, and refining the overall developer experience.</p>
<p>In this blog, we’ll explore the top highlights of Kubernetes v1.33.</p>
<h2 id="heading-how-to-try-out-kubernetes-133"><strong>How to Try Out Kubernetes 1.33</strong></h2>
<p>One of the biggest questions people often have is how they can try out the new Kubernetes version as soon as it is released. Cloud providers take some time to update the Kubernetes version, and until then, you can use vCluster. vCluster allows you to create a virtual cluster running any version of vanilla Kubernetes, including version 1.33 (the latest at the time of writing this blog), with very simple steps.</p>
<p>Create a <code>vcluster.yaml</code> file:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">controlplane:</span>
  <span class="hljs-attr">distro:</span>
    <span class="hljs-attr">k8s:</span>
      <span class="hljs-string">version:v1.33.0Copy</span>
</code></pre>
<p>Then, create the virtual cluster:</p>
<pre><code class="lang-lua">vcluster create k8s133 -f vcluster.yamlCopy
</code></pre>
<p>Ensure your context is set to the virtual cluster and verify the nodes:</p>
<pre><code class="lang-rust">kubectl get nodes
NAME              STATUS   ROLES    AGE   VERSION
live-demo-e0is0   Ready    &lt;none&gt;   <span class="hljs-number">13</span>m   v1.<span class="hljs-number">33.0</span>
This setup enables you to test new features and plan upgrades accordingly.

  Warning  FailedScheduling        <span class="hljs-number">16</span>m                default-scheduler        <span class="hljs-number">0</span>/<span class="hljs-number">3</span> nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: <span class="hljs-number">0</span>/<span class="hljs-number">3</span> nodes are available: <span class="hljs-number">3</span> Preemption is not helpful <span class="hljs-keyword">for</span> scheduling.
  Normal   Scheduled               <span class="hljs-number">16</span>m                default-scheduler        Successfully assigned vcluster-demo4/demo4-<span class="hljs-number">0</span> to live-demo-e0is1
  Normal   SuccessfulAttachVolume  <span class="hljs-number">16</span>m                attachdetach-controller  AttachVolume.Attach succeeded <span class="hljs-keyword">for</span> volume <span class="hljs-string">"pvc-c615f428-70f6-4921-88a9-7ab26349ad00"</span>
  Normal   Pulled                  <span class="hljs-number">16</span>m                kubelet                  Container image <span class="hljs-string">"ghcr.io/loft-sh/vcluster-pro:0.24.0"</span> already present on machine
  Normal   Created                 <span class="hljs-number">16</span>m                kubelet                  Created container vcluster-copy
  Normal   Started                 <span class="hljs-number">16</span>m                kubelet                  Started container vcluster-copy
  Normal   Pulling                 <span class="hljs-number">16</span>m                kubelet                  Pulling image <span class="hljs-string">"registry.k8s.io/kube-controller-manager:v1.33.0"</span>
  Normal   Pulled                  <span class="hljs-number">16</span>m                kubelet                  Successfully pulled image <span class="hljs-string">"registry.k8s.io/kube-controller-manager:v1.33.0"</span> <span class="hljs-keyword">in</span> <span class="hljs-number">2.406</span>s (<span class="hljs-number">2.406</span>s including waiting). Image size: <span class="hljs-number">27635030</span> bytes.
  Normal   Created                 <span class="hljs-number">16</span>m                kubelet                  Created container kube-controller-manager
  Normal   Started                 <span class="hljs-number">16</span>m                kubelet                  Started container kube-controller-manager
  Normal   Pulling                 <span class="hljs-number">16</span>m                kubelet                  Pulling image <span class="hljs-string">"registry.k8s.io/kube-apiserver:v1.33.0"</span>
  Normal   Pulled                  <span class="hljs-number">16</span>m                kubelet                  Successfully pulled image <span class="hljs-string">"registry.k8s.io/kube-apiserver:v1.33.0"</span> <span class="hljs-keyword">in</span> <span class="hljs-number">2.267</span>s (<span class="hljs-number">2.267</span>s including waiting). Image size: <span class="hljs-number">30071307</span> bytes.
  Normal   Created                 <span class="hljs-number">16</span>m                kubelet                  Created container kube-apiserver
  Normal   Started                 <span class="hljs-number">16</span>m                kubelet                  Started container kube-apiserver
  Normal   Pulled                  <span class="hljs-number">15</span>m                kubelet                  Container image <span class="hljs-string">"ghcr.io/loft-sh/vcluster-pro:0.24.0"</span> already present on machine
  Normal   Created                 <span class="hljs-number">15</span>m                kubelet                  Created container syncer
  Normal   Started                 <span class="hljs-number">15</span>m                kubelet                  Started container syncerCopy
</code></pre>
<p><strong>Note:</strong> While vCluster allows you to experiment with the latest Kubernetes features, some functionalities, particularly those that interact directly with the underlying host's operating system, kernel, kubelet, or hardware, might have limited or no effect when the host cluster is running an older Kubernetes version. Let’s test out a couple of features from Kubernetes version 1.33 using vCluster.</p>
<p>First, let's go over some of the cool stuff in 1.33, and then we will try a couple of features on a vCluster running 1.33.</p>
<h3 id="heading-in-place-pod-vertical-scaling-beta"><strong>In-Place Pod Vertical Scaling (Beta)</strong></h3>
<p>In Kubernetes v1.33, the long‐awaited in-place Pod resize feature has graduated from alpha to beta and is now enabled by default. Instead of restarting Pods to adjust CPU or memory, you can simply patch the Pod’s resources via the new resize subresource and monitor its progress through two Pod conditions (PodResizePending and PodResizeInProgress). After its alpha debut in v1.27, resizing sidecar containers in place is now supported in beta. By reducing disruption and enabling more efficient resource use, especially for stateful or long-running workloads, this beta release lays the groundwork for future integration with the Vertical Pod Autoscaler and further production hardening based on community feedback.</p>
<pre><code class="lang-css"><span class="hljs-selector-tag">kubectl</span> <span class="hljs-selector-tag">edit</span> <span class="hljs-selector-tag">pod</span> &lt;<span class="hljs-selector-tag">pod-name</span>&gt; <span class="hljs-selector-tag">--subresource</span> <span class="hljs-selector-tag">resizeCopy</span>
</code></pre>
<p>Read more about this feature <a target="_blank" href="https://kubernetes.io/blog/2025/05/16/kubernetes-v1-33-in-place-pod-resize-beta/">here.</a></p>
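<p>As an alternative to <code>kubectl edit</code>, you can patch the resize subresource directly with a recent kubectl. A minimal sketch; the pod and container names below are placeholders:</p>
<pre><code class="lang-bash"># Resize the CPU of a running pod in place, without a restart
kubectl patch pod resize-demo --subresource resize --patch \
  '{"spec":{"containers":[{"name":"app","resources":{"requests":{"cpu":"800m"},"limits":{"cpu":"800m"}}}]}}'

# Watch progress via the new pod conditions
kubectl get pod resize-demo -o jsonpath='{.status.conditions}'
</code></pre>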
<h3 id="heading-user-namespaces-on-by-default"><strong>User namespaces on by default</strong></h3>
<p>Kubernetes v1.33 now enables user namespaces by default, a significant security feature that enhances isolation between containers and the host system. User namespaces work by mapping user and group IDs (UIDs/GIDs) within a container to different, unprivileged UIDs/GIDs on the host. This is crucial because it prevents lateral movement between containers and increases host isolation, meaning that even if a container is compromised and runs as root internally, it has no elevated privileges on the host. This default enablement in Kubernetes 1.33 allows pods to opt-in to this stronger security posture without needing to enable specific feature flags, provided the underlying stack requirements are met.</p>
<p>To enable user namespaces for a pod, set the <code>hostUsers</code> field to <code>false</code>.</p>
<p>Example:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-string">kind:Pod</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">userns</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">hostUsers:</span> <span class="hljs-literal">false</span>
  <span class="hljs-attr">containers:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">shell</span>
    <span class="hljs-attr">command:</span> [<span class="hljs-string">"sleep"</span>, <span class="hljs-string">"infinity"</span>]
    <span class="hljs-attr">image:</span> <span class="hljs-string">debianCopy</span>
</code></pre>
<p>User namespaces allow you to run as root inside the container, but not have privileges in the host.</p>
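<p>A quick way to convince yourself the mapping is in place is to inspect the UID map from inside the pod; with <code>hostUsers: false</code>, the container's root (UID 0) maps to a high, unprivileged UID range on the host (the exact values will vary by system):</p>
<pre><code class="lang-bash"># Inside the pod, UID 0 is mapped to an unprivileged host UID range
kubectl exec userns -- cat /proc/self/uid_map
#          0 3271264256      65536
</code></pre>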
<h3 id="heading-job-success-policy"><strong>Job Success Policy</strong></h3>
<p>In Kubernetes 1.33, the Job SuccessPolicy feature has reached General Availability, offering more flexible completion criteria for batch workloads. This feature is particularly important for scenarios like leader-follower patterns (e.g., MPI used in scientific simulations, AI/ML, and HPC) where the overall job can be considered successful even if not all individual pods or indexes complete successfully. Instead of requiring every pod to succeed, users can now define specific rules, such as a minimum number of successfully completed indexes or the success of a specific leader index, allowing for early exit and resource cleanup once the defined success criteria are met, thereby optimizing resource usage and accommodating more complex batch processing needs.</p>
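<p>As a sketch, a leader-follower style Indexed Job whose overall success depends only on the leader (index 0) could look like this; the image and command are placeholders:</p>
<pre><code class="lang-yaml">apiVersion: batch/v1
kind: Job
metadata:
  name: leader-followers
spec:
  completionMode: Indexed        # successPolicy requires an Indexed Job
  completions: 4
  parallelism: 4
  successPolicy:
    rules:
      - succeededIndexes: "0"    # the Job succeeds once index 0 (the leader) succeeds
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: busybox
          command: ["sh", "-c", "sleep 10"]
</code></pre>
<p>Once the rule is met, the remaining pods are terminated and the Job is marked successful, freeing resources early.</p>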
<h3 id="heading-horizontalpodautoscaler-configurable-tolerance"><strong>HorizontalPodAutoscaler Configurable Tolerance</strong></h3>
<p>This Alpha feature lets you control how quickly your applications scale up or down. Before, there was a fixed 10% buffer for all applications before they would scale. Now, you can set this buffer specifically for each application. This means you can make your application scale up very quickly if there's a sudden increase in traffic (by setting a low or zero buffer for scaling up) and scale down more slowly to avoid too many changes if traffic drops a little (by setting a higher buffer for scaling down). This gives you better control, especially for large applications, helping to keep them stable and avoid unnecessary changes when small things fluctuate. You'll need to turn on the "HPAConfigurableTolerance" as a <a target="_blank" href="https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/">feature</a> <a target="_blank" href="https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/">gate</a>.</p>
<p>Example - an HPA with a tolerance of 5% on scale-down, and 1% tolerance on scale-up, would look like the following:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">autoscaling/v2</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">HorizontalPodAutoscaler</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-string">...</span>
  <span class="hljs-attr">behavior:</span>
    <span class="hljs-attr">scaleUp:</span>
      <span class="hljs-attr">tolerance:</span> <span class="hljs-number">0.01</span>
    <span class="hljs-attr">scaleDown::</span>
      <span class="hljs-attr">tolerance:</span> <span class="hljs-number">0.</span><span class="hljs-string">05Copy</span>
</code></pre>
<h3 id="heading-new-configuration-option-for-kubectl-with-kuberc-for-user-preferences"><strong>New configuration option for kubectl with .kuberc for user preferences</strong></h3>
<p>In v1.33, kubectl introduces an alpha feature that lets you keep aliases, default flags (e.g. server-side apply), and other preferences in a separate <code>~/.kube/kuberc</code> file, rather than crowding your kubeconfig.</p>
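<p>A minimal sketch of what a <code>~/.kube/kuberc</code> file might look like, based on the alpha API; field names may change as the feature matures:</p>
<pre><code class="lang-yaml">apiVersion: kubectl.config.k8s.io/v1alpha1
kind: Preference
defaults:
  - command: apply
    options:
      - name: server-side      # default 'kubectl apply' to server-side apply
        default: "true"
aliases:
  - name: getn                 # 'kubectl getn' as a shortcut for 'kubectl get namespaces'
    command: get
    appendArgs:
      - namespaces
</code></pre>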
<h3 id="heading-new-features-in-dra"><strong>New features in DRA</strong></h3>
<p>Also in Kubernetes version 1.33, Dynamic Resource Allocation (DRA), a flexible way for applications to request specific hardware like GPUs, is getting better even though it's still in beta. A new beta update lets device drivers share more detailed status information. Several new early-test features have also been added:</p>
<ul>
<li><p>the ability to split single devices into smaller usable parts ("Partitionable Devices")</p>
</li>
<li><p>a way to mark some devices as unusable unless an application specifically allows it ("Device Taints and Tolerations")</p>
</li>
<li><p>an option for users to list their preferred devices in order ("Prioritized List")</p>
</li>
<li><p>improved security for administrative access to devices.</p>
</li>
</ul>
<p>These changes aim to make it easier and more efficient to use specialized hardware in Kubernetes, with the goal of making DRA fully available soon. DRA is supposed to go GA in Kubernetes 1.34.</p>
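<p>To give a feel for the API, here is a rough sketch of a ResourceClaim and a pod consuming it via the beta API group; the device class name is hypothetical and would come from whatever DRA driver is installed:</p>
<pre><code class="lang-yaml">apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: single-gpu
spec:
  devices:
    requests:
      - name: gpu
        deviceClassName: gpu.example.com   # provided by the DRA driver
---
apiVersion: v1
kind: Pod
metadata:
  name: dra-demo
spec:
  resourceClaims:
    - name: gpu
      resourceClaimName: single-gpu
  containers:
    - name: app
      image: ubuntu
      command: ["sleep", "infinity"]
      resources:
        claims:
          - name: gpu                      # reference the claim declared above
</code></pre>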
<h3 id="heading-sidecar-container-graduates"><strong>SideCar Container graduates</strong></h3>
<p>Launched in 1.28 as an alpha feature, sidecar finally graduated to stable in 1.33. These containers run alongside your primary application container within the same Pod. Kubernetes implements sidecars as a special type of init container configured with <code>restartPolicy: Always</code>. This ensures they start before your main application containers, run for the entire lifecycle of the Pod, and are automatically terminated after the main containers finish. This native support means you can rely on sidecars to use probes (startup, readiness, and liveness) to signal their health, and their memory (OOM) scores are adjusted like primary containers to prevent them from being terminated too early under memory pressure.</p>
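<p>Concretely, a sidecar is declared as an init container with <code>restartPolicy: Always</code>. A minimal sketch, with placeholder images and commands and a shared volume for the log file:</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  volumes:
    - name: logs
      emptyDir: {}
  initContainers:
    - name: log-shipper
      image: busybox
      restartPolicy: Always            # this is what makes it a sidecar
      command: ["sh", "-c", "touch /var/log/app.log; tail -F /var/log/app.log"]
      volumeMounts:
        - name: logs
          mountPath: /var/log
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "while true; do date &gt;&gt; /var/log/app.log; sleep 5; done"]
      volumeMounts:
        - name: logs
          mountPath: /var/log
</code></pre>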
<h2 id="heading-kubernetes-133-features-on-vcluster"><strong>Kubernetes 1.33 features on vCluster</strong></h2>
<p>Now we’ll try these two features using vCluster.</p>
<ul>
<li><p>Ordered Namespace Deletion</p>
</li>
<li><p>ClusterTrustBundle</p>
</li>
</ul>
<p>To enable these features we’ll need to set the following feature flags:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">controlPlane:</span>
  <span class="hljs-attr">distro:</span>
    <span class="hljs-attr">k8s:</span>
      <span class="hljs-attr">version:</span> <span class="hljs-string">v1.33.0</span>
      <span class="hljs-attr">apiServer:</span>
        <span class="hljs-attr">extraArgs:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-string">"--feature-gates=OrderedNamespaceDeletion=true"</span>
          <span class="hljs-bullet">-</span> <span class="hljs-string">"--feature-gates=ClusterTrustBundle=true
ClusterTrustBundleProjection=true"</span>
          <span class="hljs-bullet">-</span> <span class="hljs-string">"--runtime-config=certificates.k8s.io/v1beta1/clustertrustbundles=true"</span><span class="hljs-string">Copy</span>
</code></pre>
<h5 id="heading-command"><strong>Command</strong></h5>
<pre><code class="lang-lua">vcluster create k8s-1-33-dev --namespace vcluster-133-ns -f kube133.configCopy
</code></pre>
<h5 id="heading-output"><strong>Output</strong></h5>
<pre><code class="lang-sql">vcluster list
     NAME     |    NAMESPACE    | STATUS  | VERSION | CONNECTED |    AGE      
  <span class="hljs-comment">---------------+-----------------+---------+---------+-----------+-------------</span>
    k8s-1-33-dev | vcluster-133-ns | Running | 0.24.0  | True      | 5h47m46s   Copy
</code></pre>
<h3 id="heading-ordered-namespace-deletion"><strong>Ordered Namespace Deletion</strong></h3>
<p>Ordered Namespace Deletion is an alpha feature that introduces an opinionated deletion process for namespaces to ensure the secure deletion of the resources within them. The current deletion process is semi-random, which may lead to unintended behavior, such as Pods persisting after the deletion of their associated NetworkPolicies. When this feature is turned on, Pods are deleted before other resources with logical and security dependencies. This design enhances the security and reliability of Kubernetes by mitigating risks arising from the non-deterministic deletion order. To enable it, set the feature flag <code>"--feature-gates=OrderedNamespaceDeletion=true"</code>.</p>
<p>Let’s create the resources:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Namespace</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">demo-order</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Pod</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">demo-pod</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo-order</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">containers:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">pause</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">k8s.gcr.io/pause:3.6</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">networking.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">NetworkPolicy</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">deny-all</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo-order</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">podSelector:</span> {}
  <span class="hljs-attr">policyTypes:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-string">IngressCopy</span>
</code></pre>
<p>This will create a namespace, a Pod, and a NetworkPolicy. To observe the ordered deletion, use a terminal in split view (or two terminal windows) so you can watch the NetworkPolicy and the Pods at the same time.</p>
<p>Once you delete the namespace, you will observe that the Pod is deleted first, before the NetworkPolicy, with results similar to the image below.</p>
<p><img src="https://cdn.prod.website-files.com/65a5be30bf4809bb3a2e8aff/683475a3132a6b86655fac71_73592eb6-348b-43bd-b2ab-dbabe98a8fe2.png" alt /></p>
<h3 id="heading-clustertrustbundle"><strong>ClusterTrustBundle</strong></h3>
<p>The ClusterTrustBundle is a beta feature in Kubernetes 1.33, part of the <code>certificates.k8s.io/v1beta1</code> API group, used to manage cluster-scoped X.509 trust anchors. It allows you to publish CA certificates that in-cluster components (e.g., webhooks, image registries, or workloads) can use for certificate verification.</p>
<p>Let’s try a sample and run it on vCluster:</p>
<p>Download a demo CA</p>
<pre><code class="lang-bash">curl -s https://letsencrypt.org/certs/isrgrootx1.pem -o isrgrootx1.pemCopy
</code></pre>
<p>Verify that the certificate parses:</p>
<pre><code class="lang-bash">openssl x509 -in isrgrootx1.pem -noout -text
</code></pre>
<p>Use the certificate above to create a manifest as below:</p>
<pre><code class="lang-bash">apiVersion: certificates.k8s.io/v1beta1
kind: ClusterTrustBundle
metadata:
  name: letsencrypt-isrg-root-x1
spec:
  <span class="hljs-comment"># Global (signer-unlinked) bundle: visible to all workloads</span>
  trustBundle: |
    -----BEGIN CERTIFICATE-----
    MIIFazCCA1OgAwIBAgIRAIIQz7DSQONZRGPgu2OCiwAwDQYJKoZIhvcNAQELBQAw
    TzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh
    cmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwHhcNMTUwNjA0MTEwNDM4
    WhcNMzUwNjA0MTEwNDM4WjBPMQswCQYDVQQGEwJVUzEpMCcGA1UEChMgSW50ZXJu
    ZXQgU2VjdXJpdHkgUmVzZWFyY2ggR3JvdXAxFTATBgNVBAMTDElTUkcgUm9vdCBY
    MTCCAiIwDQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBAK3oJHP0FDfzm54rVygc
    h77ct984kIxuPOZXoHj3dcKi/vVqbvYATyjb3miGbESTtrFj/RQSa78f0uoxmyF+
    0TM8ukj13Xnfs7j/EvEhmkvBioZxaUpmZmyPfjxwv60pIgbz5MDmgK7iS4+3mX6U
    A5/TR5d8mUgjU+g4rk8Kb4Mu0UlXjIB0ttov0DiNewNwIRt18jA8+o+u3dpjq+sW
    T8KOEUt+zwvo/7V3LvSye0rgTBIlDHCNAymg4VMk7BPZ7hm/ELNKjD+Jo2FR3qyH
    B5T0Y3HsLuJvW5iB4YlcNHlsdu87kGJ55tukmi8mxdAQ4Q7e2RCOFvu396j3x+UC
    B5iPNgiV5+I3lg02dZ77DnKxHZu8A/lJBdiB3QW0KtZB6awBdpUKD9jf1b0SHzUv
    KBds0pjBqAlkd25HN7rOrFleaJ1/ctaJxQZBKT5ZPt0m9STJEadao0xAH0ahmbWn
    OlFuhjuefXKnEgV4We0+UXgVCwOPjdAvBbI+e0ocS3MFEvzG6uBQE3xDk3SzynTn
    jh8BCNAw1FtxNrQHusEwMFxIt4I7mKZ9YIqioymCzLq9gwQbooMDQaHWBfEbwrbw
    qHyGO0aoSCqI3Haadr8faqU9GY/rOPNk3sgrDQoo//fb4hVC1CLQJ13hef4Y53CI
    rU7m2Ys6xt0nUW7/vGT1M0NPAgMBAAGjQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNV
    HRMBAf8EBTADAQH/MB0GA1UdDgQWBBR5tFnme7bl5AFzgAiIyBpY9umbbjANBgkq
    hkiG9w0BAQsFAAOCAgEAVR9YqbyyqFDQDLHYGmkgJykIrGF1XIpu+ILlaS/V9lZL
    ubhzEFnTIZd+50xx+7LSYK05qAvqFyFWhfFQDlnrzuBZ6brJFe+GnY+EgPbk6ZGQ
    3BebYhtF8GaV0nxvwuo77x/Py9auJ/GpsMiu/X1+mvoiBOv/2X/qkSsisRcOj/KK
    NFtY2PwByVS5uCbMiogziUwthDyC3+6WVwW6LLv3xLfHTjuCvjHIInNzktHCgKQ5
    ORAzI4JMPJ+GslWYHb4phowim57iaztXOoJwTdwJx4nLCgdNbOhdjsnvzqvHu7Ur
    TkXWStAmzOVyyghqpZXjFaH3pO3JLF+l+/+sKAIuvtd7u+Nxe5AW0wdeRlN8NwdC
    jNPElpzVmbUq4JUagEiuTDkHzsxHpFKVK7q4+63SM1N95R1NbdWhscdCb+ZAJzVc
    oyi3B43njTOQ5yOf+1CceWxG1bQVs5ZufpsMljq4Ui0/1lvh+wjChP4kqKOJ2qxq
    4RgqsahDYVvTH9w7jXbyLeiNdd8XM2w9U/t7y0Ff/9yi0GE44Za4rF2LN9d11TPA
    mRGunUHBcnWEvgJBQl9nJEiU0Zsnvgc/ubhPgXRR4Xq37Z0j4r7g1SgEEzwxA57d
    emyPxgcYxn/eR44/KJ4EBs+lVDR3veyJm+kXQ99b21/+jh5Xos1AnX5iItreGCc=
    -----END CERTIFICATE-----
</code></pre>
<p>Apply the manifest</p>
<p>Command</p>
<pre><code class="lang-plaintext">kubectl apply -f bundle.yamlCopy
</code></pre>
<p>Output:</p>
<pre><code class="lang-bash">clustertrustbundle.certificates.k8s.io/letsencrypt-isrg-root-x1 createdCopy
</code></pre>
<pre><code class="lang-sql">kubectl get clustertrustbundle.certificates.k8s.io
NAME                       SIGNERNAME
letsencrypt-isrg-root-x1   &lt;none&gt;Copy
</code></pre>
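<p>To consume the bundle from a workload, you can project it into a pod as a file via the <code>clusterTrustBundle</code> projected volume source; this is what the ClusterTrustBundleProjection feature gate we enabled earlier is for. A minimal sketch:</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Pod
metadata:
  name: trust-demo
spec:
  containers:
    - name: app
      image: debian
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: trust-bundle
          mountPath: /etc/ssl/custom
          readOnly: true
  volumes:
    - name: trust-bundle
      projected:
        sources:
          - clusterTrustBundle:
              name: letsencrypt-isrg-root-x1   # the bundle created above
              path: ca-bundle.pem              # file name inside the mount
</code></pre>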
<p>The above examples let you test, at the API level, some of the new features in Kubernetes 1.33 and see how they work on vCluster.</p>
<h2 id="heading-deprecated-and-removed-features"><strong>Deprecated and Removed Features</strong></h2>
<p><strong>Endpoints API deprecation:</strong> The traditional Endpoints API is deprecated in favor of EndpointSlices, which offer better scalability and support for modern features.</p>
<p><strong>Removal of the gitRepo volume type:</strong> The deprecated gitRepo volume type has been removed. Users should transition to alternatives such as an init container that runs <code>git clone</code>.</p>
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>Kubernetes releases always come with many new features, and this time is no different. Let us know which features you are most excited about, and do read the <a target="_blank" href="https://kubernetes.io/blog/2025/04/23/kubernetes-v1-33-release/">official announcement post</a> from the release team. A huge thanks to all the members of the release team who helped with this amazing release. If you want to quickly try out Kubernetes 1.33, you can easily use vCluster.</p>
]]></content:encoded></item><item><title><![CDATA[Ephemeral Pull Request environment using Vcluster.]]></title><description><![CDATA[In a fast-paced development environment, having an isolated and ephemeral environment to test changes for every pull request (PR) is a game-changer. In this blog, I’ll walk you through setting up ephemeral PR environments using vCluster, enabling sea...]]></description><link>https://blog.kubesimplify.com/ephemeral-pull-request-environment-using-vcluster</link><guid isPermaLink="true">https://blog.kubesimplify.com/ephemeral-pull-request-environment-using-vcluster</guid><category><![CDATA[cluster]]></category><category><![CDATA[vcluster]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[Pull Requests]]></category><category><![CDATA[Ingress Controllers]]></category><category><![CDATA[demo]]></category><category><![CDATA[Applications]]></category><dc:creator><![CDATA[Saiyam Pathak]]></dc:creator><pubDate>Thu, 10 Apr 2025 16:52:03 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1744303738819/050203f4-e809-4605-ba5a-3242c968bf0c.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In a fast-paced development environment, having an isolated and ephemeral environment to test changes for every pull request (PR) is a game-changer. In this blog, I’ll walk you through setting up ephemeral PR environments using <strong>vCluster</strong>, enabling seamless testing of your application in a Kubernetes environment. We’ll also leverage GitHub Actions for automation, ensuring every labeled PR dynamically creates a vCluster, deploys the application, and cleans up upon merging or label removal.</p>
<p>Let’s dive into the <strong>step-by-step guide</strong>.</p>
<h1 id="heading-what-is-vcluster"><strong>What is vCluster?</strong></h1>
<p><a target="_blank" href="https://www.vcluster.com/?__hstc=107455133.24d76b7b89d28afebee5af7771225ac7.1741672016640.1742467093098.1744283839944.5&amp;__hssc=107455133.1.1744283839944&amp;__hsfp=3213767220">vCluster</a> is a technology that allows you to create lightweight, isolated Kubernetes clusters within a host cluster. These virtual clusters offer full Kubernetes functionality while being resource-efficient, making them ideal for scenarios like PR testing environments.</p>
<h1 id="heading-why-ephemeral-pr-environments"><strong>Why Ephemeral PR Environments?</strong></h1>
<p>Ephemeral environments allow:</p>
<ul>
<li><p>Testing pull request changes in an isolated environment</p>
</li>
<li><p>Quick validation without interfering with the main cluster</p>
</li>
<li><p>Automatic cleanup post-testing</p>
</li>
</ul>
<p>By leveraging <strong>vCluster</strong> and <strong>GitHub Actions</strong>, you can automate this workflow and ensure every PR gets its own dedicated environment.</p>
<h1 id="heading-prerequisites"><strong>Prerequisites:</strong></h1>
<h2 id="heading-kubernetes-cluster"><strong>Kubernetes cluster</strong></h2>
<p>You need a Kubernetes cluster. In this case, I am using a DigitalOcean Kubernetes cluster, but any should work. To keep the scenario close to production, I used a cluster that can create Services of type LoadBalancer.</p>
<p>Command:</p>
<p><code>kubectl get nodes</code></p>
<p>Output:</p>
<pre><code class="lang-plaintext">kubectl get nodes
NAME              STATUS   ROLES    AGE   VERSION
live-demo-e0is0   Ready    &lt;none&gt;   19d   v1.31.1
live-demo-e0is1   Ready    &lt;none&gt;   19d   v1.31.1
live-demo-e0isz   Ready    &lt;none&gt;   19d   v1.31.1
</code></pre>
<h2 id="heading-deploying-ingress-controller"><strong>Deploying Ingress controller</strong></h2>
<p>Command</p>
<pre><code class="lang-plaintext">kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.9.4/deploy/static/provider/cloud/deploy.yamlCopy
</code></pre>
<p>Output</p>
<pre><code class="lang-plaintext">kubectl get po,svc -n ingress-nginx
NAME                                           READY   STATUS      RESTARTS   AGE
pod/ingress-nginx-admission-create-lcb85       0/1     Completed   0          19d
pod/ingress-nginx-admission-patch-xl2fk        0/1     Completed   0          19d
pod/ingress-nginx-controller-79fcc99b4-7f7ls   1/1     Running     0          19d
</code></pre>
<p>Getting the LoadBalancer IP for the ingress controller:</p>
<p>Command:</p>
<p><code>kubectl get svc -n ingress-nginx</code></p>
<p>Output:</p>
<pre><code class="lang-plaintext">NAME                                         TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)                      AGE
service/ingress-nginx-controller             LoadBalancer   10.109.28.126   209.38.160.229   80:31228/TCP,443:30435/TCP   19d
service/ingress-nginx-controller-admission   ClusterIP      10.109.15.162   &lt;none&gt;           443/TCP                      19d
</code></pre>
<p>Domain mapping:</p>
<p>Our application needs dynamic ingress hosts for testing, so we added the LoadBalancer IP of the ingress controller as an A record on the domain.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*jdy54YTGLoCZ5z8sbTKDFg.png" alt /></p>
<h2 id="heading-connect-the-kubernetes-cluster-to-the-platform"><strong>Connect the Kubernetes cluster to the platform</strong></h2>
<p>We will enable vCluster Pro in order to use templates and create the clusters. For simplicity, I am using my <a target="_blank" href="https://vcluster.cloud/">vcluster.cloud</a> account and creating an access key to log in; this way, I don’t have to run any agent on the current cluster. You can either run <code>vcluster platform start</code> or sign up on <a target="_blank" href="https://vcluster.cloud/">vCluster Cloud</a>. Once you log in, go to <a target="_blank" href="https://www.vcluster.com/docs/platform/administer/users-permissions/access-keys?__hstc=107455133.24d76b7b89d28afebee5af7771225ac7.1741672016640.1742467093098.1744283839944.5&amp;__hssc=107455133.1.1744283839944&amp;__hsfp=3213767220">access keys</a> and create a short-lived access key for the demo (remember to delete the key after the demo for security reasons).</p>
<p>Command:</p>
<pre><code class="lang-plaintext">vcluster platform login https://saiyam.vcluster.cloud --access-key &lt;your-access-key&gt;
</code></pre>
<p>Output:</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*GvKM69qOfWnJyDg3xOSb9w.png" alt /></p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*7ot2eV9DPIK1D4rxyy9HHA.png" alt /></p>
<p>Create a template under vCluster templates in the vCluster cloud platform instance.</p>
<pre><code class="lang-plaintext">sync:
  fromHost:
    ingressClasses:
      enabled: true
  toHost:
    ingresses:
      enabled: true
external:
  platform:
    autoSleep:
      afterInactivity: 3600  # Automatically sleep after 1 hour of inactivity
</code></pre>
<p>So far, we have a Kubernetes cluster with an ingress controller installed and the public IP of the NGINX controller pointed to our domain.</p>
<p>We have also logged into the platform using the access key created on vcluster.cloud. Now let’s look at the demo application.</p>
<h1 id="heading-demo-application"><strong>Demo Application</strong></h1>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*gkGYH14pe7GyJMBnnPKTFA.png" alt /></p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*32RbefKu33wZSF8n_RAGbw.png" alt /></p>
<p>The scenario we are trying to achieve here involves a sample application deployed onto a Kubernetes cluster. Often, in organizations, new features or bug fixes need to be deployed and tested before being merged into the main branch. In this case, a developer raises a pull request and adds a label to test it. A GitHub Actions workflow builds the application, generates deployment, service, and ingress Kubernetes object files, and pushes them to a new branch. A virtual cluster is created and the new deployment file is applied, allowing the developer to test and verify the new application deployment.</p>
<p>Let’s see how this looks in practice.</p>
<p>GitHub repo — <a target="_blank" href="https://github.com/saiyam1814/vcluster-demo">https://github.com/saiyam1814/vcluster-demo</a></p>
<p>The application for this demo is a simple Go-based HTTP server:</p>
<pre><code class="lang-plaintext">package main
import (
    "fmt"
    "net/http"
)
func handler(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintln(w, "Hellooo, World for blog!!")
}
func main() {
    http.HandleFunc("/", handler)
    fmt.Println("Starting server on :8080")
    err := http.ListenAndServe(":8080", nil)
    if err != nil {
        panic(err)
    }
}
</code></pre>
<h1 id="heading-step-1-setting-up-the-deployment-template"><strong>Step 1: Setting Up the Deployment Template</strong></h1>
<p>The application is packaged as a Kubernetes deployment and exposed via a service and ingress. The deployment uses Jinja2 templating to inject dynamic values like the image tag and ingress host.</p>
<p><strong>tmpl/deploy.j2:</strong></p>
<pre><code class="lang-plaintext">apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
  labels:
    app: hello-world
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world
        image: {{ image_deploy_tag }}
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: hello-world
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: hello-world
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello-world
spec:
  ingressClassName: nginx
  rules:
  - host: {{ ingress_tag }}
    http:
      paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: hello-world
            port:
              number: 80
</code></pre>
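<p>If you want to preview the rendered manifest locally before wiring up the workflow, one option is the jinja2-cli tool; the tag values below are placeholders:</p>
<pre><code class="lang-bash"># Install the CLI (assumes Python/pip is available)
pip install jinja2-cli

# Render the template with sample values
jinja2 tmpl/deploy.j2 \
  -D image_deploy_tag=docker.io/saiyam911/vcluster-demo:sha-abc1234 \
  -D ingress_tag=pr14.vcluster.tech &gt; deploy/deployment.yaml
</code></pre>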
<h1 id="heading-step-2-automating-with-github-actions"><strong>Step 2: Automating with GitHub Actions</strong></h1>
<p>GitHub Actions handles the workflow from building the application to deploying it on a vCluster.</p>
<h2 id="heading-pr-workflow"><strong>PR Workflow</strong></h2>
<p><strong>File: .github/workflows/build-and-deploy.yml</strong> This workflow:</p>
<ol>
<li><p>Builds the application with the latest changes made by the developer using ko</p>
</li>
<li><p>Pushes the container image to a Docker Hub account (credentials for which should be set as Actions secrets)</p>
</li>
<li><p>Creates a deployment manifest using Jinja2: the action replaces the ingress host and deployment image variables in the Jinja template and pushes the result to a new feature branch.</p>
</li>
<li><p>Creates a vCluster.</p>
</li>
<li><p>Deploys the application to the vCluster.</p>
</li>
<li><p>Exposes it via ingress for testing.</p>
</li>
</ol>
<pre><code class="lang-plaintext">name: Build and Deploy with vCluster
</code></pre>
<pre><code class="lang-plaintext">on:
  pull_request:
    types: [labeled]jobs:
  build-and-deploy:
    if: ${{ github.event.label.name == 'test' }}
    runs-on: ubuntu-latest    steps:
      # Step 1: Checkout PR Code
      - name: Checkout PR Code
        uses: actions/checkout@v3
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      # Step 2: Set up Go
      - name: Set up Go
        uses: actions/setup-go@v4
        with:
          go-version: '1.22.5'
      # Step 3: Set up ko
      - name: Set up ko
        uses: ko-build/setup-ko@v0.6
        with:
          version: v0.14.1
      # Step 4: Log in to Docker Hub
      - name: Log in to Docker Hub
        env:
          KO_DOCKER_REPO: docker.io/saiyam911
        run: |
          echo "${{ secrets.DOCKER_PASSWORD }}" | ko login docker.io --username ${{ secrets.DOCKER_USERNAME }} --password-stdin
      # Step 5: Build and Push Image
      - name: Build and Push Image
        env:
          KO_DOCKER_REPO: docker.io/saiyam911/vcluster-demo
        run: |
          cd app
          export IMAGE_TAG=sha-$(git rev-parse --short HEAD)
          echo "image_deploy_tag=docker.io/saiyam911/vcluster-demo:$IMAGE_TAG" &gt;&gt; $GITHUB_ENV
          ko build --bare -t $IMAGE_TAG
      # Step 6: Generate Deployment Manifest
      - name: Generate Deployment Manifest
        uses: cuchi/jinja2-action@v1.1.0
        with:
          template: tmpl/deploy.j2
          output_file: deploy/deployment.yaml
          strict: true
          variables: |
            image_deploy_tag=${{ env.image_deploy_tag }}
            ingress_tag=pr${{ github.event.pull_request.number }}.vcluster.tech
      # Step 7: Install vCluster CLI
      - name: Install vCluster CLI
        uses: loft-sh/setup-vcluster@main
      # Step 8: Login to vCluster Platform
      - name: Login to vCluster Platform instance
        env:
          LOFT_URL: ${{ secrets.VCLUSTER_PLATFORM_URL }}
          ACCESS_KEY: ${{ secrets.VCLUSTER_ACCESS_KEY }}
        run: |
          vcluster platform login $LOFT_URL --access-key $ACCESS_KEY
      # Step 9: Create vCluster for the PR
      - name: Create A vCluster
        env:
          NAME: pr-${{ github.event.pull_request.number }}
        run: |
          vcluster platform create vcluster $NAME --project default --template my-template --link "Preview=http://pr${{ github.event.pull_request.number }}.vcluster.tech"
      # Step 10: Deploy to vCluster
      - name: Deploy Application to vCluster
        run: |
          kubectl apply -Rf deploy/
      # Step 11: Test Application with curl
      - name: Test Application
        run: |
          sleep 10
          curl --retry 5 --retry-delay 10 http://pr${{ github.event.pull_request.number }}.vcluster.tech
</code></pre>
<h1 id="heading-step-3-cleanup-workflow"><strong>Step 3: Cleanup Workflow</strong></h1>
<p>Once the PR is merged or the label is removed, the ephemeral vCluster is deleted.</p>
<p><strong>File: .github/workflows/cleanup.yml</strong></p>
<pre><code class="lang-plaintext">name: Clean Up vCluster
</code></pre>
<pre><code class="lang-plaintext">on:
  pull_request:
    types: [closed, unlabeled]jobs:
  cleanup:
    if: (github.event.action == 'closed' &amp;&amp; github.event.pull_request.merged == true) || github.event.label.name == 'test'
    runs-on: ubuntu-latest    steps:
      # Step 1: Install vCluster CLI
      - name: Install vCluster CLI
        uses: loft-sh/setup-vcluster@main
      # Step 2: Login to vCluster Platform
      - name: Login to vCluster Platform instance
        env:
          LOFT_URL: ${{ secrets.VCLUSTER_PLATFORM_URL }}
          ACCESS_KEY: ${{ secrets.VCLUSTER_ACCESS_KEY }}
        run: |
          vcluster platform login $LOFT_URL --access-key $ACCESS_KEY
      # Step 3: Delete vCluster
      - name: Delete vCluster
        env:
          NAME: pr-${{ github.event.pull_request.number }}
        run: |
          vcluster platform delete vcluster $NAME --project default
</code></pre>
<h1 id="heading-how-it-works"><strong>How It Works</strong></h1>
<p>A developer creates a PR with the feature changes.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*1uaKcD3j2BEFxAjFx94xpw.png" alt /></p>
<p>With a small change, the developer has raised a PR and now needs to add a <code>test</code> label.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*NJY6Yhvx9jWTpd8InHVfUA.png" alt /></p>
<p>As soon as the label is added, the GitHub Actions workflow kicks off.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*4xOScOS3C083EI49efveYw.png" alt /></p>
<p>In the vCluster Platform cloud instance, you will see the cluster being created, and the application will be deployed.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*fOeIDW66-oAs0N7zlH4iKw.png" alt /></p>
<p>The action completes, and <code>pr14.vcluster.tech</code> is created as part of the ingress.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*PSpkrDEWWsw_c0xh_PXKOQ.png" alt /></p>
<p>The application is accessible at http://pr&lt;PR_NUMBER&gt;.vcluster.tech.</p>
<p>As you can see the latest changes made by the developer are deployed.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*mC64loxZ-qY9NY--iahWrA.png" alt /></p>
<p><strong>Cleanup:</strong></p>
<p>Upon PR merge or label removal, the ephemeral vCluster is automatically deleted.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*rXX4Ktak7fGIYJL0XNLhGw.png" alt /></p>
<p>After merging, the cleanup action is triggered, which will clear the virtual cluster.</p>
<h1 id="heading-conclusion"><strong>Conclusion</strong></h1>
<p>Ephemeral PR environments using vCluster simplify testing, reduce resource usage, and provide a seamless developer experience. By combining vCluster with GitHub Actions, you can achieve an automated and efficient workflow for testing PRs.</p>
<p>Check out the <a target="_blank" href="https://github.com/saiyam1814/vcluster-demo">demo repository</a> and give it a try! 🚀</p>
<p>Let me know your thoughts or if you face any challenges while implementing this.</p>
<p><a target="_blank" href="https://saiyampathak.medium.com/?source=post_page---post_author_info--798e98fd46cd---------------------------------------">  
</a></p>
]]></content:encoded></item><item><title><![CDATA[Multi tenancy in 2025 and beyond]]></title><description><![CDATA[Multi-tenancy in Kubernetes has been an ongoing challenge for organizations looking to optimize their cloud-native infrastructure. Over the years, the approach to multi-tenancy has evolved from simple namespace isolation to virtual clusters and, more...]]></description><link>https://blog.kubesimplify.com/multi-tenancy-in-2025-and-beyond</link><guid isPermaLink="true">https://blog.kubesimplify.com/multi-tenancy-in-2025-and-beyond</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[#multitenancy]]></category><category><![CDATA[cluster]]></category><category><![CDATA[ArgoCD]]></category><dc:creator><![CDATA[Saiyam Pathak]]></dc:creator><pubDate>Wed, 12 Mar 2025 06:11:58 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1741759693296/d2ee68da-af2b-4dca-b0cb-1c2a128f9939.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Multi-tenancy in Kubernetes has been an ongoing challenge for organizations looking to optimize their cloud-native infrastructure. Over the years, the approach to multi-tenancy has evolved from simple namespace isolation to virtual clusters and, more recently, full-fledged internal Kubernetes platforms (IKPs) that enable shared platform stacks across teams.</p>
<p>With Kubernetes adoption continuing its upward trend, the real challenge for organizations today is not just adopting Kubernetes but managing it at scale. The widespread cluster sprawl, where companies create separate clusters for each team, environment, or workload, has led to escalating operational complexity and rising costs. According to the CNCF, over 70% of organizations report Kubernetes over-provisioning as a major source of cloud spend. This makes efficient multi-tenancy a necessity rather than a luxury.</p>
<p>Let’s explore how shared platform stacks and internal Kubernetes platforms are shaping the future of multi-tenancy in 2025.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*9x7xmmr7BmIka1VWp-nYUw.png" alt /></p>
<h1 id="heading-what-is-multi-tenancy-in-kubernetes"><strong>What is Multi-Tenancy in Kubernetes?</strong></h1>
<p>Multi-tenancy means dividing a Kubernetes cluster into multiple isolated environments so that different teams or applications can share infrastructure while maintaining security, autonomy, and fair resource usage.</p>
<p>To understand this better, let’s use an analogy: imagine you are looking for accommodation.</p>
<ul>
<li><p>Renting an entire house gives you full control but comes with high maintenance costs — similar to having a dedicated Kubernetes cluster per team or application.</p>
</li>
<li><p>Renting an apartment in a shared building gives you personal space with less overhead, as maintenance is handled collectively and you share facilities like elevators, the swimming pool, and the park. This is how multi-tenancy works in Kubernetes.</p>
</li>
</ul>
<p>Instead of spinning up an entirely new Kubernetes cluster for every team, organization, or workload, you partition a single cluster into multiple isolated environments.</p>
<p>The three key pillars of true multi-tenancy are:</p>
<ol>
<li><p>Isolation — Ensuring security boundaries between tenants.</p>
</li>
<li><p>Fair Resource Usage — Preventing noisy neighbor issues.</p>
</li>
<li><p>Tenant Autonomy — Allowing teams to self-manage workloads independently.</p>
</li>
</ol>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*6MGIiOJsO-oQdFNH6Tx8BQ.png" alt /></p>
<h1 id="heading-traditional-multi-tenancy-approaches-amp-their-limitations"><strong>Traditional Multi-Tenancy Approaches &amp; Their Limitations</strong></h1>
<p>Natively within Kubernetes, there is a concept of namespaces, which is useful as many resources can be scoped to a namespace to create some level of isolation.</p>
<ul>
<li>Workload isolation can be achieved to a certain extent by using pod security standards and preventing privileged access with custom policy engines like Kyverno or jsPolicy. Additionally, you can define a well-structured network policy to restrict traffic to and from pods. When different teams have only namespace-level isolation, you may want to prevent them from communicating with each other while still allowing them to interact with the Kubernetes API. An example of this scenario can be as follows:</li>
</ul>
<pre><code class="lang-plaintext">apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-policy
  namespace: tenant-1
spec:
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
            - 100.64.0.0/10
            - 127.0.0.0/8
            - 10.0.0.0/8
            - 172.16.0.0/12
            - 192.168.0.0/16
        - namespaceSelector:
            matchLabels:
              tenant: tenant-1
    - ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP
    - ports:
        - port: 443
        - port: 8443
      to:
        - ipBlock:
            cidr: ${KUBE_API}/32
</code></pre>
<ul>
<li>For managing resource usage, you can use Kubernetes objects like ResourceQuota to define the limit of resources that can be created within a namespace. You can also add LimitRange to set default CPU and memory limits, as in the sketch after this list.</li>
</ul>
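<p>A minimal sketch of what that could look like for a tenant namespace; the numbers are arbitrary examples:</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-1-quota
  namespace: tenant-1
spec:
  hard:
    requests.cpu: "4"          # total CPU requested across the namespace
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: tenant-1-limits
  namespace: tenant-1
spec:
  limits:
    - type: Container
      defaultRequest:          # applied when a container sets no requests
        cpu: 250m
        memory: 256Mi
      default:                 # applied when a container sets no limits
        cpu: 500m
        memory: 512Mi
</code></pre>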
<p>While these namespace-level resources help create some isolation, achieving true multi-tenancy is still challenging due to several factors:</p>
<ul>
<li><p>It becomes difficult as the number of tenants increases.</p>
</li>
<li><p>How do you distribute different kubeconfigs per team?</p>
</li>
<li><p>How is cluster-level resource access, such as CRDs, managed?</p>
</li>
<li><p>Is resource sharing still an issue?</p>
</li>
<li><p>How do you handle different cluster versions?</p>
</li>
<li><p>What about different versions of an application?</p>
</li>
<li><p>There is still a single control plane and a single state for the cluster.</p>
</li>
</ul>
<p>Yes, multi-tenancy is hard if we rely solely on native Kubernetes constructs. Even with these measures, automating the entire process instead of manually defining everything is a major challenge.</p>
<h1 id="heading-how-vcluster-enables-true-multi-tenancy"><strong>How vCluster Enables True Multi-Tenancy</strong></h1>
<p>vCluster is an open-source tool that helps you create virtual Kubernetes clusters, each with its own control plane components and cluster state, in an automated way.</p>
<p>When you create a virtual machine in your cloud account, you gain full access to that virtual machine, but it is actually a slice of physical hardware in a data center. Similarly, a virtual cluster is a slice of a Kubernetes cluster, you have full access to it and complete ownership, but ultimately, it is still a part of a larger Kubernetes cluster.</p>
<p>How does vCluster work?</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*TOkTbREcaM6HQR0-6l3t4w.png" alt /></p>
<p>Instead of managing multiple Kubernetes clusters, you can now have a single Kubernetes cluster and use the vCluster CLI to create virtual clusters. These virtual clusters can reuse the host cluster’s resources, such as Cert Manager, NGINX Ingress Controller, Vault, and more. Each virtual cluster will have its own independent kubeconfig file, allowing teams to deploy their workloads independently. This approach is more secure than namespace-based isolation because each virtual cluster has its own control plane and state (with options like SQLite, embedded etcd, or external etcd).</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*ISHQkPJcYg8MSd_raDDyEQ.png" alt /></p>
<p>With vCluster Enterprise, organizations can also gain features like multi-cluster tenancy, enhanced security policies, and automated tenancy provisioning.</p>
<h1 id="heading-the-evolution-of-multi-tenancy-shared-platform-stacks-amp-internal-kubernetes-platforms-ikps"><strong>The Evolution of Multi-Tenancy: Shared Platform Stacks &amp; Internal Kubernetes Platforms (IKPs)</strong></h1>
<h2 id="heading-1-shared-platform-stack-the-key-to-efficiency"><strong>1. Shared Platform Stack: The Key to Efficiency</strong></h2>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*s_p_j5GoX0rb77yOf_DG2A.png" alt /></p>
<p>Imagine three teams: A, B, and C, each needing their own Kubernetes cluster. As administrators, we create three separate Kubernetes clusters. By default, a newly created cluster only runs the essential components needed for Kubernetes itself, such as the control plane components and the cloud controller manager.</p>
<p>Now, if all three teams need to deploy applications with HTTPS support, the typical approach is to install an Ingress Controller and cert-manager. Each team then creates Deployments, Services, Ingress, and Certificate objects. However, since these components need to be installed on every cluster separately, this results in duplicate resources.</p>
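<p>To make the duplication concrete, the per-cluster setup usually looks something like the following, repeated once per cluster (the commands below use the common community Helm charts and are illustrative):</p>
<pre><code class="lang-bash"># Repeated on every single cluster: ingress controller + cert-manager.
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo add jetstack https://charts.jetstack.io
helm repo update

helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx --create-namespace

helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace \
  --set installCRDs=true
</code></pre>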
<p>This duplication problem also exists in multi-tenancy. One of the biggest challenges in Kubernetes multi-tenancy is the shared platform stack. Ideally, we should be able to reuse resources from the host cluster instead of installing cert-manager and an Ingress Controller in every new cluster.</p>
<p>The easiest way to solve this problem is by using virtual clusters. With vCluster, you can define in the cluster configuration file which resources should be synced from the host cluster, allowing multiple tenants to share platform resources. This optimizes resource utilization and eliminates unnecessary duplication.</p>
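<p>A minimal sketch of such a configuration (the exact keys vary between vCluster releases, so treat the field names below as illustrative rather than definitive):</p>
<pre><code class="lang-bash"># vcluster.yaml: sync selected resources between the host and
# the virtual cluster (field names are version-dependent).
cat &lt;&lt;EOF &gt; vcluster.yaml
sync:
  fromHost:
    ingressClasses:
      enabled: true     # reuse the host's ingress classes
  toHost:
    ingresses:
      enabled: true     # let the host's controller serve tenant ingresses
EOF

vcluster create team-a --namespace team-a -f vcluster.yaml
</code></pre>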
<p>This concept of a shared platform stack in a multi-tenant Kubernetes environment using virtual clusters helps organizations efficiently manage resources and is crucial when you are creating an internal Kubernetes platform.</p>
<h2 id="heading-2-internal-kubernetes-platforms-ikps"><strong>2. Internal Kubernetes Platforms (IKPs)</strong></h2>
<p>We believe that an Internal Developer Platform (IDP) is evolving, with Kubernetes becoming the de facto choice for these platforms. Kubernetes is a technology well-suited for building platforms, and if you are developing an IDP in 2025 and beyond, you will or should be leveraging Kubernetes.</p>
<p>This is why we believe the shift is towards an Internal Kubernetes Platform (IKP), where multi-tenancy will play a crucial role, and vCluster will be at the center.</p>
<p>With vCluster integrated alongside your other cloud-native tooling, you can efficiently provision and manage Kubernetes clusters for your teams, making Kubernetes more accessible while maintaining governance and control.<br />IKPs ensure that tenants don’t need to deal with raw Kubernetes; instead, they receive a pre-configured platform tailored to their needs.</p>
<p>We’d love to hear your thoughts on IKPs as well!</p>
<h1 id="heading-future-of-multi-tenancy-in-kubernetes"><strong>Future of Multi-Tenancy in Kubernetes</strong></h1>
<p>Organizations are moving towards:</p>
<p>1️⃣ Standardized Shared Platform Stacks — Providing pre-configured Kubernetes environments.<br />2️⃣ IKPs for Developer Self-Service — Offering Kubernetes as a managed service within organizations.<br />3️⃣ vCluster &amp; Virtualized Control Planes — Reducing cluster sprawl while maintaining autonomy.</p>
<p>Multi-tenancy is no longer just about namespaces or virtual clusters — it’s about creating an internal Kubernetes ecosystem that allows teams to be productive while keeping infrastructure efficient and manageable.</p>
<p>Throughout March, we’re hosting a Multi-Tenancy March series, featuring webinars, deep dives, and hands-on sessions to explore best practices for Kubernetes multi-tenancy. We will be conducting a hands-on workshop on March 6th, where we will demonstrate this in action, and you’ll have the opportunity to try it out alongside us.</p>
<p><a target="_blank" href="https://www.vcluster.com/event/seamless-kubernetes-multi-tenancy-with-vcluster-and-a-shared-platform-stack?__hstc=107455133.24d76b7b89d28afebee5af7771225ac7.1741672016640.1741672016640.1741672016640.1&amp;__hssc=107455133.2.1741672016640&amp;__hsfp=3213767220">Register for the webinar here</a></p>
<p>Join the <a target="_blank" href="https://slack.loft.sh/">vCluster Slack</a> to stay updated!</p>
<p><a target="_blank" href="https://saiyampathak.medium.com/?source=post_page---post_author_info--8bbed5ba5250---------------------------------------">  
</a></p>
]]></content:encoded></item><item><title><![CDATA[Testing Docker AI's "Gordon" – How Smart Is It?]]></title><description><![CDATA[Docker just launched "Ask Gordon", an AI-powered assistant inside Docker Desktop 4.38. It promises to help with troubleshooting, optimizing Dockerfiles, and even generating configurations automatically.
But does it really work? 🤔 In this blog, let’s...]]></description><link>https://blog.kubesimplify.com/testing-docker-ais-gordon-how-smart-is-it</link><guid isPermaLink="true">https://blog.kubesimplify.com/testing-docker-ais-gordon-how-smart-is-it</guid><category><![CDATA[Docker Gordon]]></category><category><![CDATA[AI]]></category><category><![CDATA[Docker]]></category><category><![CDATA[#ai-tools]]></category><category><![CDATA[docker desktop]]></category><category><![CDATA[DockerDesktop]]></category><dc:creator><![CDATA[Saloni Narang]]></dc:creator><pubDate>Fri, 21 Feb 2025 07:28:49 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740122884135/a19dad9a-b2ee-4c04-9549-1d251e98bbac.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Docker just launched <strong>"Ask Gordon"</strong>, an AI-powered assistant inside <strong>Docker Desktop 4.38</strong>. It promises to help with troubleshooting, optimizing Dockerfiles, and even generating configurations automatically.</p>
<p>But does it really work? 🤔 In this blog, let’s <strong>put Gordon to the test</strong> with real-world scenarios and see if it’s actually useful.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1739600885602/5787a956-6eeb-4b24-b1a8-d919a36f9501.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-what-is-ask-gordon">🔹 <strong>What is "Ask Gordon"?</strong></h2>
<p>Think of <strong>Gordon</strong> as a <strong>smart AI assistant</strong> built into Docker. Instead of searching the internet or reading documentation, you can ask Gordon questions directly in the <strong>Docker CLI</strong> or <strong>Docker Desktop</strong> <strong>UI</strong>.</p>
<p>For example, if a container crashes, instead of Googling for answers, you can simply ask:</p>
<pre><code class="lang-plaintext">docker ai "Why is my container crashing?"
</code></pre>
<p>And Gordon will analyze logs and suggest solutions. Pretty cool, right?</p>
<p>Now, let’s test out some scenarios and see if it <strong>actually works</strong>.</p>
<h2 id="heading-setting-up-gordon">🔧 <strong>Setting Up Gordon</strong></h2>
<p>Before we test, let’s quickly <strong>enable Gordon</strong>:</p>
<ol>
<li><p>You need to download Docker Desktop version 4.38 or later. You can download it from <a target="_blank" href="https://www.docker.com/products/docker-desktop/">here</a>.</p>
</li>
<li><p>This is how you can enable Ask Gordon:</p>
</li>
</ol>
<p>After signing in to your Docker Account, enable the Docker AI feature:</p>
<ul>
<li><p>Open the <strong>Settings</strong> view in Docker Desktop.</p>
</li>
<li><p>Navigate to <strong>Features in Development</strong>.</p>
</li>
<li><p>Check the <strong>Enable Docker AI</strong> checkbox. The Docker AI terms of service agreement are displayed. You must agree to the terms before you can enable the feature. Review the terms and select <strong>Accept and enable</strong> to continue.</p>
</li>
<li><p>Select <strong>Apply &amp; restart</strong>.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1740120211645/f1bd53df-9084-4462-b6b7-37b636e3065c.png" alt class="image--center mx-auto" /></p>
<p>You can now chat with Gordon via:</p>
<ul>
<li><p><strong>CLI</strong> → <code>docker ai "&lt;your question&gt;"</code></p>
</li>
<li><p><strong>Docker Desktop UI</strong> → Click the ✨ icon in various places</p>
</li>
</ul>
<p>Alright, now let’s run some real-world tests.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1739602616722/79810a2b-713b-4d4d-a7da-abe39dffb6b9.png" alt class="image--center mx-auto" /></p>
<p>Let’s say you click “How do I run Redis”; you will get something like the output below:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1739609259020/880bec84-8ddd-41ef-a337-96c035b142e7.png" alt class="image--center mx-auto" /></p>
<h1 id="heading-testing-gordons-capabilities">🚀 <strong>Testing Gordon’s Capabilities</strong></h1>
<p>If you are using the CLI, these are the commands you should know.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1739609765813/e8fc865a-5b7a-42b7-b02c-f9d750a4d3bc.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-test-1-dockerfile-issues"><strong>🛠 Test 1 : Dockerfile issues</strong></h3>
<p>I asked Gordon to check issues in my Dockerfile:</p>
<pre><code class="lang-dockerfile"><span class="hljs-comment">#Base image</span>
<span class="hljs-keyword">FROM</span> ubuntu:latest
<span class="hljs-comment">#Install dependencies</span>
<span class="hljs-keyword">RUN</span><span class="bash"> apt-get update &amp;&amp; apt-get install -y </span>
curl 
wget 
python3 
python3-pip 
vim
<span class="hljs-comment">#Copy application code</span>
<span class="hljs-keyword">COPY</span><span class="bash"> . /app</span>
<span class="hljs-comment">#Set working directory</span>
<span class="hljs-keyword">WORKDIR</span><span class="bash"> /app</span>
<span class="hljs-comment">#Install Python dependencies</span>
<span class="hljs-keyword">RUN</span><span class="bash"> pip3 install -r requirements.txt</span>
<span class="hljs-comment">#Expose application port</span>
<span class="hljs-keyword">EXPOSE</span> <span class="hljs-number">8080</span>
<span class="hljs-comment">#Start the application</span>
<span class="hljs-keyword">CMD</span><span class="bash"> [<span class="hljs-string">"python3"</span>, <span class="hljs-string">"app.py"</span>]</span>
</code></pre>
<p>Here comes the output:  </p>
<pre><code class="lang-plaintext">saloni@Salonis-MacBook-Air ~ % docker ai

&gt;&gt;&gt; check my dockerfile for issues                                              

    • Sending Dockerfile ✔️

 Your Dockerfile looks functional and should work for most basic use cases. 
However, there are a few potential issues and improvements you can consider:


  ### Issues and Suggestions
    1. Using latest tag for the base image

Using ubuntu:latest is not recommended for production environments because 
the latest tag can change over time, potentially introducing breaking changes. 
It's better to specify a specific version, e.g., ubuntu:20.04.
2. Combining RUN commands

  The RUN command for installing dependencies can be optimized to reduce the number 
of layers in the image. 
For example:


         RUN apt-get update &amp;&amp; apt-get install -y \
            curl \
            wget \
            python3 \
            python3-pip \
            vim &amp;&amp; \
            apt-get clean &amp;&amp; rm -rf /var/lib/apt/lists/*

  This ensures that temporary files created during the installation process are cleaned up, 
  reducing the image size.

  3. No explicit Python version

  The python3 and python3-pip packages will install the default Python version for the 
Ubuntu base image. If your application requires a specific Python version, you should 
explicitly install it.

 4. No non-root user

  The container runs as the root user by default, which can be a security risk. Consider 
adding a non-root user to run the application:
RUN useradd -m appuser
USER appuser

5. No health check

 Adding a HEALTHCHECK instruction can help monitor the health of the container. For example:
HEALTHCHECK --interval=30s CMD curl --fail http://localhost:8080 || exit 1

6. No .dockerignore file
Without a .dockerignore file, unnecessary files (e.g., .git, local environment files) might 
be copied into the image. Create a .dockerignore file to exclude such files:

 .git
  pycache
  *.pyc
   .env

  7. Hardcoded port

  The application exposes port 8080, but this might not be configurable. Consider making the 
   port configurable via an environment variable.

  8. No entrypoint script
  If your application requires environment setup or initialization, consider using an 
  entrypoint script instead of directly running the application in the CMD.

  ### Final Optimized Dockerfile
  Here’s an optimized version of your Dockerfile:

   # Base image
    FROM ubuntu:20.04

  # Install dependencies
    RUN apt-get update &amp;&amp; apt-get install -y \
        curl \
        wget \
        python3 \
        python3-pip \
        vim &amp;&amp; \
        apt-get clean &amp;&amp; rm -rf /var/lib/apt/lists/*

  # Add a non-root user
    RUN useradd -m appuser
    USER appuser

  # Copy application code
    COPY . /app
 # Set working directory
    WORKDIR /app
 # Install Python dependencies
    RUN pip3 install --no-cache-dir -r requirements.txt
 # Expose application port
    EXPOSE 8080
 # Add health check
    HEALTHCHECK --interval=30s CMD curl --fail http://localhost:8080 || exit 1
 # Start the application
    CMD ["python3", "app.py"]

Let me know if you need further clarification!
</code></pre>
<p>So Gordon pointed out the missing pieces in my Dockerfile and produced an optimised version. This is neat, as you can do all of it directly in the CLI from where you are already running your docker commands.</p>
<hr />
<h3 id="heading-test-2-fixing-a-broken-container"><strong>🛠 Test 2 : Fixing a Broken Container</strong></h3>
<h4 id="heading-scenario"><strong>Scenario:</strong></h4>
<p>I have a <strong>Node.js</strong> container, but when I start it, it crashes immediately.</p>
<pre><code class="lang-plaintext">echo 'console.log(require("express"))' &gt; app.js
cat &lt;&lt;EOF &gt; Dockerfile
FROM node:18
WORKDIR /app
COPY app.js .
CMD ["node", "app.js"]
EOF
docker build -t crash-node .
</code></pre>
<pre><code class="lang-plaintext">docker run --name crash-node -d crash-node
</code></pre>
<pre><code class="lang-plaintext">docker ps -a | grep crash
9ef75103d838   crash-node                            "docker-entrypoint.s…"   13 seconds ago   Exited (1) 12 seconds ago                               crash-node
</code></pre>
<h4 id="heading-now-lets-ask-gordon-on-why-my-container-is-failing"><strong>Now let’s ask Gordon on why my container is failing</strong></h4>
<pre><code class="lang-plaintext">docker ai "Why is my container crash-node crashing?"
</code></pre>
<h4 id="heading-response"><strong>Response:</strong></h4>
<pre><code class="lang-plaintext">To diagnose why your container named crash-node is crashing, you can follow these steps:

    1. Check Container Logs:
  Use the docker logs command to view the logs of the crashing container. This can provide insights into 
  any errors or issues that occur just before the crash.

          docker logs crash-node

    2. Inspect Container Status:
  Use docker ps -a to check the status of the container. This can help determine if the container is 
  exiting with a specific error code.

          docker ps -a

    3. Use Docker Compose Logs (if applicable):
  If you are using Docker Compose, you can use the following command to view logs for all services, which 
  might give you more context:

          docker compose logs crash-node

    4. Check Resource Constraints:
  Ensure that your container is not running out of memory or other resources. You can inspect resource 
  usage with:

          docker stats crash-node

    5. Review Docker Daemon Logs:
  If the container logs do not provide enough information, you can check the Docker daemon logs for any 
  system-level issues. On Linux, you can use:

          journalctl -u docker.service

    6. Inspect Dockerfile and Configuration:
  Review your Dockerfile and any configuration files (like docker-compose.yml) for potential 
  misconfigurations or errors.

    7. Check for Known Issues:
  Refer to the Docker known issues 
  (https://docs.docker.com/desktop/troubleshoot-and-support/troubleshoot/known-issues/) to see if there 
  are any related issues that might be affecting your container.

  By following these steps, you should be able to gather more information about why your crash-node 
  container is crashing. If you find specific error messages or logs, feel free to share them for more 
  targeted assistance.
</code></pre>
<p>Let’s check the logs</p>
<pre><code class="lang-plaintext">docker logs crash-node
node:internal/modules/cjs/loader:1143
  throw err;
  ^

Error: Cannot find module 'express'
Require stack:
- /app/app.js
    at Module._resolveFilename (node:internal/modules/cjs/loader:1140:15)
    at Module._load (node:internal/modules/cjs/loader:981:27)
    at Module.require (node:internal/modules/cjs/loader:1231:19)
    at require (node:internal/modules/helpers:177:18)
    at Object.&lt;anonymous&gt; (/app/app.js:1:13)
    at Module._compile (node:internal/modules/cjs/loader:1364:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1422:10)
    at Module.load (node:internal/modules/cjs/loader:1203:32)
    at Module._load (node:internal/modules/cjs/loader:1019:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:128:12) {
  code: 'MODULE_NOT_FOUND',
  requireStack: [ '/app/app.js' ]
}

Node.js v18.20.7
</code></pre>
<p>Now, let’s ask Gordon again</p>
<pre><code class="lang-plaintext">docker ai "docker logs crash-node
node:internal/modules/cjs/loader:1143
  throw err;
  ^

Error: Cannot find module 'express'
Require stack:
- /app/app.js
    at Module._resolveFilename (node:internal/modules/cjs/loader:1140:15)
    at Module._load (node:internal/modules/cjs/loader:981:27)
    at Module.require (node:internal/modules/cjs/loader:1231:19)
    at require (node:internal/modules/helpers:177:18)
    at Object.&lt;anonymous&gt; (/app/app.js:1:13)
    at Module._compile (node:internal/modules/cjs/loader:1364:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1422:10)
    at Module.load (node:internal/modules/cjs/loader:1203:32)
    at Module._load (node:internal/modules/cjs/loader:1019:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:128:12) {
  code: 'MODULE_NOT_FOUND',
  requireStack: [ '/app/app.js' ]
}"
</code></pre>
<pre><code class="lang-plaintext">The error indicates that the express module is missing in your Node.js application. This happens 
  because the express package is not installed in your Docker container
</code></pre>
<p>The main point here is that <code>express</code> is missing!</p>
<p>So we can just update the Dockerfile as below:</p>
<pre><code class="lang-plaintext">FROM node:18
WORKDIR /app
COPY app.js .
RUN npm install express
CMD ["node", "app.js"]
</code></pre>
<p>Now when you rebuild the image, run the container again, and check the logs, you will see the expected output.</p>
<pre><code class="lang-plaintext">docker logs 154d3d2759a8
[Function: createApplication] {
  application: {
    init: [Function: init],
    defaultConfiguration: [Function: defaultConfiguration],
    lazyrouter: [Function: lazyrouter],
    handle: [Function: handle],
    use: [Function: use],
    route: [Function: route],
    engine: [Function: engine],
    param: [Function: param],
    set: [Function: set],
    path: [Function: path],
    enabled: [Function: enabled],
    disabled: [Function: disabled],
    enable: [Function: enable],
    disable: [Function: disable],
    acl: [Function (anonymous)],
|
|
|
.......
</code></pre>
<p>All of this is done while staying in the same CLI, without going anywhere else, which is the coolest part IMO.</p>
<hr />
<h1 id="heading-conclusion-is-gordon-worth-using"><strong>Conclusion: Is Gordon Worth Using?</strong></h1>
<p>I have used Gordon for various purposes. In the end, it is an AI agent fine-tuned and trained on Docker documentation, which is a good thing, as it will have more up-to-date information than other LLMs out there. For simple errors, though, it should give shorter outputs with the fix first and explain only when the user asks. It is new and still in beta, so we can excuse the generic answers it gives at times, but if we want people to choose it over ChatGPT or similar LLMs, it should provide concise, to-the-point answers and, like GitHub Copilot, offer instant feedback while we are actually typing commands.</p>
<p>What do you think about Docker Gordon? Are you using it already?</p>
<p>If you use Docker Desktop, definitely give Gordon a shot. It won’t replace a seasoned engineer, but it can make life a bit easier, especially for debugging and automation.</p>
<p>💡 Try it out and let me know your thoughts! What’s the weirdest question you asked Gordon? 😆👇</p>
]]></content:encoded></item><item><title><![CDATA[Understanding Docker Desktop: All-in-One Platform for Containers]]></title><description><![CDATA[In modern application development, containers have revolutionized how developers build, ship, and run applications. Among the tools facilitating this revolution, Docker Desktop is an essential tool for developers looking to streamline containerized a...]]></description><link>https://blog.kubesimplify.com/understanding-docker-desktop-all-in-one-platform-for-containers</link><guid isPermaLink="true">https://blog.kubesimplify.com/understanding-docker-desktop-all-in-one-platform-for-containers</guid><category><![CDATA[Docker]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[docker desktop]]></category><category><![CDATA[DevOps Journey]]></category><category><![CDATA[DevRel]]></category><dc:creator><![CDATA[Saloni Narang]]></dc:creator><pubDate>Fri, 31 Jan 2025 10:56:37 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1738320523043/46181ade-54e5-4053-b65c-e5923961e584.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In modern application development, containers have revolutionized how developers build, ship, and run applications. Among the tools facilitating this revolution, <strong>Docker Desktop</strong> is an essential tool for developers looking to streamline containerized application workflows. Whether you are just starting your container journey or are a seasoned pro, Docker Desktop offers a user-friendly way to manage and work with containers.</p>
<h3 id="heading-what-is-docker-desktop">What is Docker Desktop?</h3>
<p>Docker Desktop is a cross-platform application available for Windows, macOS, and Linux that provides an easy-to-use interface to manage your Docker containers and images. It combines Docker Engine, Docker CLI, Docker Compose, and Kubernetes (optional) into a unified solution, enabling seamless development and testing of containerized applications on your local machine.</p>
<p>For installation of Docker Desktop, you can refer to my previous post.<br /><a target="_blank" href="https://blog.kubesimplify.com/docker-captain-journey">https://blog.kubesimplify.com/docker-captain-journey</a></p>
<h3 id="heading-key-features-of-docker-desktop">Key Features of Docker Desktop</h3>
<h4 id="heading-1-cross-platform-support">1. <strong>Cross-Platform Support</strong></h4>
<p>Docker Desktop works on Windows, macOS, and Linux, making it a versatile tool for developers across different operating systems. It automatically configures and integrates with the host system, eliminating the need for complex setup processes.</p>
<h4 id="heading-2-built-in-kubernetes-support">2. <strong>Built-in Kubernetes Support</strong></h4>
<p>Docker Desktop includes a lightweight, single-node Kubernetes cluster for developers working with Kubernetes. This allows users to deploy, test, and manage Kubernetes workloads locally without needing a separate cluster.</p>
<h4 id="heading-3-docker-compose">3. <strong>Docker Compose</strong></h4>
<p>With Docker Compose integrated, you can define multi-container applications in a simple YAML file and deploy them using a single command. This is particularly useful for microservices architectures.</p>
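<p>For example, a minimal two-service stack (the service names and images below are just placeholders) can be defined and started like this:</p>
<pre><code class="lang-bash"># compose.yaml: a web frontend plus a Redis cache.
cat &lt;&lt;EOF &gt; compose.yaml
services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"
  cache:
    image: redis:alpine
EOF

# One command brings the whole stack up in the background.
docker compose up -d
</code></pre>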
<h4 id="heading-4-resource-controls">4. <strong>Resource Controls</strong></h4>
<p>Docker Desktop provides an intuitive interface to allocate system resources like CPU, memory, and disk space for Docker containers, ensuring optimal performance without overloading your machine.</p>
<h4 id="heading-5-image-management">5. <strong>Image Management</strong></h4>
<p>The Image view in the Docker Desktop Dashboard makes it easy to manage container images. You can pull images from Docker Hub, inspect image details, and run images as containers. Additionally, the Dashboard helps you clean up unused images to free up disk space and provides a summary of image vulnerabilities, enabling proactive security management.</p>
<h4 id="heading-6-volume-management">6. <strong>Volume Management</strong></h4>
<p>The Volumes view offers a streamlined way to manage Docker volumes. You can create, delete, and inspect volumes, as well as view which containers are using them, all from a centralized interface.</p>
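<p>The same lifecycle operations are also available from the CLI if you prefer it (the volume name below is just an example):</p>
<pre><code class="lang-bash">docker volume create app-data     # create a named volume
docker volume ls                  # list all volumes
docker volume inspect app-data    # show details, including the mountpoint

# Attach the volume to a container.
docker run -d --name demo -v app-data:/data alpine sleep 3600

docker volume rm app-data         # fails while a container still uses it
</code></pre>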
<h4 id="heading-7-builds-view">7. <strong>Builds View</strong></h4>
<p>Docker Desktop’s Builds view lets you inspect your build history and manage builders. This includes details of ongoing and completed builds, making it easier to track your workflows.</p>
<h4 id="heading-8-notifications-and-learning-center">8. <strong>Notifications and Learning Center</strong></h4>
<p>Stay informed with Docker Desktop’s notification center, which provides updates about new releases, installation progress, and other alerts. The Learning Center offers in-app walkthroughs and resources to help you master Docker quickly.</p>
<h4 id="heading-9-quick-search">9. <strong>Quick Search</strong></h4>
<p>Docker Desktop includes a Quick Search feature in the Dashboard, allowing you to locate containers, Compose applications, images, volumes, or extensions with ease. For images, it provides options to pull, run, or view documentation, while for containers, you can perform actions like start, stop, or delete directly from the search results.</p>
<p><strong>10. Docker Scout</strong></p>
<p><strong>Docker Scout</strong> is a tool that provides deep insights into container images, helping developers analyze dependencies, identify vulnerabilities, and improve image quality. It integrates seamlessly with Docker workflows, offering actionable recommendations and enabling policy enforcement to ensure secure, efficient, and compliant containerized applications. Perfect for managing your software supply chain!</p>
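<p>From the CLI, a quick way to try Docker Scout on an image you have built (such as the <code>my-python-app</code> image built later in this post) looks like this:</p>
<pre><code class="lang-bash"># High-level summary of vulnerabilities and base-image recommendations.
docker scout quickview my-python-app

# Detailed list of CVEs found in the image.
docker scout cves my-python-app
</code></pre>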
<ol start="11">
<li><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738265198618/b0091f36-bf83-4d4c-b01b-6c6058f231a6.png" alt class="image--center mx-auto" /></p>
<p><strong>WebAssembly workloads</strong></p>
<p><strong>Wasm in Docker</strong> enables running lightweight, fast WebAssembly (Wasm) workloads alongside Linux containers. To use Wasm, enable the <strong>container image store</strong> and turn on the <strong>Enable Wasm</strong> feature in Docker Desktop settings. Docker Desktop installs various Wasm runtimes, like Wasmtime and WasmEdge, to support Wasm workloads.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738265118729/6de4ead9-faad-4869-bbc5-b6d137e0a2ed.png" alt class="image--center mx-auto" /></p>
</li>
<li><p><strong>Docker Extensions</strong></p>
<p><strong>Docker Desktop Extensions</strong> are add-ons that enhance Docker Desktop by integrating tools like security scanners, monitoring, and debugging directly into its UI. They simplify workflows, improve productivity, and allow developers to build and share custom extensions. You can explore and install them via the <strong>Docker Extensions Marketplace</strong>.</p>
</li>
<li><p><strong>Dev Environments</strong></p>
<p>A <strong>Dev Environment</strong> in Docker Desktop lets developers quickly set up and share reproducible environments with all tools, dependencies, and configurations. It ensures consistency, simplifies onboarding, and eliminates "it works on my machine" issues, enabling seamless collaboration.</p>
</li>
<li><p><strong>Ask Gordon</strong></p>
<p><strong>Ask Gordon</strong> is Docker’s AI-powered assistant, currently in Beta, designed to streamline workflows in Docker Desktop and the CLI. It provides contextual, actionable insights by understanding your local setup, including Dockerfiles, containers, and applications. With features like identifying vulnerabilities and optimizing Dockerfiles, Ask Gordon helps make Docker's ecosystem more intuitive and efficient.</p>
</li>
</ol>
<h3 id="heading-exploring-docker-desktop">Exploring Docker Desktop</h3>
<h4 id="heading-docker-desktop-dashboard">Docker Desktop Dashboard</h4>
<p>The Docker Desktop Dashboard serves as your command center. Here are the key views available:</p>
<ol>
<li><p><strong>Containers View</strong>: Provides a runtime view of all your containers and applications, allowing you to manage their lifecycle, inspect logs, and perform other common actions directly from your machine.</p>
</li>
<li><p><strong>Images View</strong>: Displays a list of local images, lets you pull images from Docker Hub, run images as containers, and clean up unused images. If you are logged in, you can also view images shared by your organization on Docker Hub.</p>
</li>
<li><p><strong>Volumes View</strong>: Displays Docker volumes, allowing you to easily manage their lifecycle.</p>
</li>
<li><p><strong>Builds View</strong>: Shows your build history, ongoing builds, and completed builds.</p>
</li>
</ol>
<h4 id="heading-docker-menu">Docker Menu</h4>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738265159514/2544292c-d999-4b73-8ba0-9bd0037b1020.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-doing-stuff-with-docker-desktop">Doing Stuff with Docker Desktop</h3>
<p>To do everything from within Docker Desktop, you can enable the built-in terminal in Docker Desktop to interact with the host machine.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738265636324/b9131634-222f-4779-b00c-7dee0280d2cf.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-1-building-custom-images">1. <strong>Building Custom Images</strong></h2>
<p>Let’s say you want to build an image for your Python Flask application.</p>
<p>Create a simple <a target="_blank" href="http://app.py"><code>app.py</code></a> as below:</p>
<pre><code class="lang-bash">from flask import Flask

app = Flask(__name__)

@app.route(<span class="hljs-string">'/'</span>)
def home():
    <span class="hljs-built_in">return</span> <span class="hljs-string">"Hello, Dockerized Python App!"</span>

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">'__main__'</span>:
    app.run(host=<span class="hljs-string">'0.0.0.0'</span>, port=5000)
</code></pre>
<h3 id="heading-requirements-requirementstxt">Requirements (<code>requirements.txt</code>)</h3>
<p>Since the <code>Dockerfile</code> will install the dependencies from <code>requirements.txt</code>, create a <code>requirements.txt</code> file with the following content:</p>
<pre><code class="lang-bash">flask
</code></pre>
<p>This will ensure that Flask is installed when building the Docker image.</p>
<p>Create a <code>Dockerfile</code> to define a custom image:</p>
<pre><code class="lang-plaintext">FROM python:3.9-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]
</code></pre>
<p>Build and run the image:</p>
<pre><code class="lang-plaintext">docker build -t my-python-app .
docker run -p 5001:5000 my-python-app
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738266179774/64cd66ef-1f3a-4bfb-9034-1c0f277bfbc7.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738266219541/efd934fe-f96a-4783-a5c7-3057c157ec9f.png" alt class="image--center mx-auto" /></p>
<p>Then, open your browser and visit <code>http://localhost:5001</code> to see the output.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738266279683/f61c78f4-079e-4d63-98a8-e64ce1f18c1a.png" alt class="image--center mx-auto" /></p>
<h4 id="heading-2-using-docker-desktop-with-kubernetes">2. <strong>Using Docker Desktop with Kubernetes:</strong></h4>
<p>Enable Kubernetes in the Docker Desktop settings, then verify that the cluster is running and that the context has switched to <code>docker-desktop</code>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738266352117/d7608ae1-9725-49a7-a4b6-6917a13f3ff6.png" alt class="image--center mx-auto" /></p>
<p>Then, create a Kubernetes <code>deployment.yaml</code>:</p>
<pre><code class="lang-plaintext">apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
</code></pre>
<p>Apply the configuration:</p>
<pre><code class="lang-plaintext">kubectl apply -f deployment.yaml
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738266422015/954c83e9-18f4-43ed-b563-86ab0d91e183.png" alt class="image--center mx-auto" /></p>
<p>Verify the deployment:</p>
<pre><code class="lang-plaintext">kubectl get deployment
kubectl get pods
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738266494582/fe21883f-3423-4ef6-b3cf-261ed9e62e1f.png" alt class="image--center mx-auto" /></p>
<p>You can see that you have a local Kubernetes experience that you can use to test out Kubernetes for your development purposes.</p>
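<p>To actually reach the nginx pods from your machine, one simple option is a port-forward (the local port 8080 below is an arbitrary choice):</p>
<pre><code class="lang-bash">kubectl port-forward deployment/nginx-deployment 8080:80

# In another terminal:
curl http://localhost:8080
</code></pre>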
<h4 id="heading-3-exploring-extensions">3. <strong>Exploring Extensions</strong></h4>
<p>Docker Desktop’s Extensions Marketplace allows you to add new functionalities. From CI/CD integrations to security scanners, you can install extensions with a single click and extend the capabilities of Docker Desktop. In the below example, I searched for the Lens extension to visualize Kubernetes.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738265492095/a2191cf0-3ec7-411a-b2c0-5860d9c56f16.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-conclusion">Conclusion</h3>
<p>Docker Desktop is an indispensable tool for developers and DevOps professionals. It simplifies the container lifecycle, bridges the gap between local and production environments, and offers powerful features like Kubernetes integration, resource management, and extensions. Whether building your first containerized app or managing complex microservices, Docker Desktop is your go-to solution for local development.</p>
<p>Start exploring Docker Desktop today and transform how you build, ship, and run applications!</p>
]]></content:encoded></item><item><title><![CDATA[Best DevOps Tools 2025]]></title><description><![CDATA[As we enter 2025, the DevOps landscape continues to evolve, with innovative tools addressing complex challenges in the cloud-native ecosystem. At Kubesimplify, we aim to make these technologies more accessible and provide clear recommendations to hel...]]></description><link>https://blog.kubesimplify.com/best-devops-tools-2025</link><guid isPermaLink="true">https://blog.kubesimplify.com/best-devops-tools-2025</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[Devops]]></category><category><![CDATA[2025]]></category><category><![CDATA[technology]]></category><category><![CDATA[Trending]]></category><dc:creator><![CDATA[Saloni Narang]]></dc:creator><pubDate>Mon, 13 Jan 2025 07:18:15 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1736752572860/02a99f29-d7d1-440c-8688-7dd794d74faa.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As we enter 2025, the DevOps landscape continues to evolve, with innovative tools addressing complex challenges in the cloud-native ecosystem. At <strong>Kubesimplify</strong>, we aim to make these technologies more accessible and provide clear recommendations to help you navigate this ever-changing space. Here’s our perspective on how DevOps tools will shape workflows in 2025.  </p>
<p>(This blog is based on the latest video on our YouTube channel, <strong>"DevOps Tools 2025"</strong>)</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=8dhwmKqfAa8&amp;t=31s">https://www.youtube.com/watch?v=8dhwmKqfAa8&amp;t=31s</a></div>
<p> </p>
<hr />
<h3 id="heading-1-building-secure-base-images"><strong>1. Building Secure Base Images</strong></h3>
<p>The journey begins with building secure and compliant base images. At Kubesimplify, we recommend using <strong>BuildSafe</strong>, a tool designed to simplify this process while ensuring compliance and a developer-friendly experience.</p>
<p><a target="_blank" href="https://github.com/buildsafedev">BuildSafe</a>, built on <strong>Nix</strong>, helps you build 0 CVE base image yourself, provide you with higher quality build time SBOM out of the box and also let you achieve higher SLSA levels and even go beyond.</p>
<hr />
<h3 id="heading-2-cicd-pipelines"><strong>2. CI/CD Pipelines</strong></h3>
<p>Automation lies at the heart of DevOps, and effective CI/CD pipelines are key to operational efficiency. We emphasize a combination of:</p>
<ul>
<li><p><strong>GitHub Actions and ArgoCD for modern CI/CD.</strong> You can check out this project where we have built and shown everything for a CI/CD pipeline:</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=kCWAwXFnYic&amp;t=46s">https://www.youtube.com/watch?v=kCWAwXFnYic&amp;t=46s</a></p>
</li>
</ul>
<ul>
<li><strong>Argo CD</strong> is a declarative GitOps continuous delivery tool for Kubernetes. It synchronizes Kubernetes resources with a desired state defined in a Git repository, enabling automated deployments and real-time drift detection. Argo CD supports multi-cluster environments and application rollbacks and integrates with Helm, Kustomize, and plain YAML manifests to manage Kubernetes workloads effectively. A minimal Application manifest is sketched just after this list.</li>
</ul>
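<p>As a flavor of what that looks like in practice, here is a minimal Argo CD Application pointing at a Git repository (the repo URL, path, and namespaces below are placeholders):</p>
<pre><code class="lang-bash">kubectl apply -f - &lt;&lt;EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: demo-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/demo-manifests
    targetRevision: HEAD
    path: k8s
  destination:
    server: https://kubernetes.default.svc
    namespace: demo
  syncPolicy:
    automated:
      prune: true      # delete resources removed from Git
      selfHeal: true   # revert manual drift back to the Git state
EOF
</code></pre>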
<p>A couple more tools highlighted in the video are:</p>
<ul>
<li><p><a target="_blank" href="https://dagger.io/"><strong>Dagger</strong></a> <strong>is</strong> a tool for writing pipelines in programming languages like Go, Python, and TypeScript. It offers flexibility with local and CI-based runs, making it a standout for modern workflows.</p>
</li>
<li><p><strong>Kargo:</strong> Created by the Argo team, it streamlines multi-stage application promotion using GitOps principles, removing the need for custom automation or CI pipelines. Kargo integrates seamlessly with Argo CD, automating progressive rollouts to improve efficiency, safety, and visibility across the application lifecycle.</p>
</li>
</ul>
<hr />
<h3 id="heading-3-infrastructure-as-code-iac"><strong>3. Infrastructure as Code (IaC)</strong></h3>
<p>Managing infrastructure has never been easier with tools that provide powerful abstractions and flexibility. Kubesimplify recommends:</p>
<ul>
<li><p><a target="_blank" href="https://www.crossplane.io/"><strong>Crossplane</strong></a>: <strong>Crossplane</strong> is an open-source framework that enables infrastructure and application management using Kubernetes-native declarative APIs. It extends Kubernetes to manage resources like cloud infrastructure, databases, and services through Custom Resource Definitions (CRDs). Crossplane supports multi-cloud environments and infrastructure-as-code practices and integrates seamlessly with existing Kubernetes workflows, providing a unified way to manage both infrastructure and application. The feature I love the most from crossplane is - crossplane composition. You can check out this video that I did with Dan on <a target="_blank" href="https://www.youtube.com/watch?v=78xR7ypzB4Q">Crossplane compostion deep dive</a>.</p>
</li>
<li><p><a target="_blank" href="https://www.pulumi.com/"><strong>Pulumi</strong></a>: <strong>Pulumi</strong> is an open-source infrastructure-as-code (IaC) tool that allows you to define, deploy, and manage cloud infrastructure using familiar programming languages like Python, TypeScript, Go, and C#. It supports a wide range of cloud providers and on-premises systems. Pulumi enables developers and DevOps teams to write infrastructure as real code, fostering better integration, testing, and reusability while simplifying infrastructure management across diverse environment.</p>
</li>
<li><p><a target="_blank" href="https://opentofu.org/"><strong>OpenTofu</strong></a>: <strong>OpenTofu</strong> is an open-source infrastructure such as a code (IaC) framework that helps define, provision, and manage cloud infrastructure using declarative configuration files. It supports multiple cloud providers and on-premises environments, enabling reproducible and automated infrastructure deployments. As a community-driven fork of Terraform, OpenTofu promotes openness and extensibility, empowering users with flexibility for modern infrastructure management.</p>
</li>
</ul>
<p>Alongside these, some teams still use <strong>Ansible</strong>, which remains a reliable option.</p>
<hr />
<h3 id="heading-4-kubernetes-package-management"><strong>4. Kubernetes Package Management</strong></h3>
<p>Managing additional tooling on Kubernetes clusters requires robust package management. While <strong>Helm</strong> remains a popular choice, <a target="_blank" href="https://glasskube.dev/"><strong>Glasskube</strong></a> is our recommended tool for 2025. It simplifies:</p>
<ul>
<li><p>Dependency management.</p>
</li>
<li><p>Automatic CRD updates.</p>
</li>
<li><p>Kubernetes version compatibility testing.</p>
</li>
</ul>
<p>Glasskube’s focus on lifecycle management and CLI-based intuitive updates makes it an excellent choice for managing Kubernetes packages.</p>
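<p>Getting started is CLI-driven. Assuming you have the <code>glasskube</code> CLI installed, the flow looks roughly like this (the package name is just an example, and command details may change as the project evolves):</p>
<pre><code class="lang-bash"># Install the Glasskube operator into the current cluster.
glasskube bootstrap

# Install a package; dependencies are resolved for you.
glasskube install cert-manager
</code></pre>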
<hr />
<h3 id="heading-5-observability"><strong>5. Observability</strong></h3>
<p>Observability is critical for maintaining operational excellence. While tools like <strong>Prometheus</strong>, <strong>Grafana</strong>, and <strong>Jaeger</strong> are well-established, Kubesimplify highlights:</p>
<ul>
<li><p><a target="_blank" href="https://signoz.io/">Signoz</a><strong>: SigNoz</strong> is an open-source observability platform for monitoring and troubleshooting applications. It provides metrics, logs, and traces in a single interface, enabling developers to quickly analyze application performance and detect issues. Built to focus on simplicity and scalability, SigNoz integrates with OpenTelemetry and supports popular storage backends like ClickHouse. It serves as a cost-effective alternative to proprietary observability solutions like Datadog or New Relic.</p>
</li>
<li><p><a target="_blank" href="https://openobserve.ai/"><strong>OpenObserve</strong></a>: <strong>OpenObserve</strong> is an open-source observability platform that provides a unified solution for logs, metrics, and traces. It is designed for high performance, scalability, and cost-efficiency, making it suitable for modern cloud-native applications. OpenObserve simplifies monitoring and troubleshooting by offering a single interface for observability data, helping teams ensure system reliability and performance.</p>
</li>
</ul>
<p>Both tools are great choices for monitoring, logging, and tracing in a Kubernetes environment.</p>
<hr />
<h3 id="heading-6-security-and-compliance"><strong>6. Security and Compliance</strong></h3>
<p>Security is a non-negotiable aspect of DevOps. Kubesimplify suggests:</p>
<ul>
<li><p><strong>BuildSafe</strong>: <strong>BuildSafe</strong> is a tool designed to secure the software supply chain by enabling organizations to create tamper-proof, 0-CVE (zero known vulnerabilities) artifacts compliant with government regulations. It focuses on generating high-quality SBOMs (Software Bills of Materials) and providing a path to secure builds that are developer-friendly and easy to integrate into existing workflows. BuildSafe emphasizes hermetic builds and compliance, helping organizations reduce the risks associated with supply-chain attacks.</p>
</li>
<li><p><a target="_blank" href="https://trivy.dev/latest/"><strong>Trivy</strong></a>: <strong>Trivy</strong> is a security scanner for identifying vulnerabilities, misconfigurations, and sensitive data in containers, Kubernetes, IaC templates, and repositories. It supports a wide range of formats, integrates seamlessly into CI/CD pipelines, and provides detailed reports to enhance application and infrastructure security. Trivy is widely used for its speed, simplicity, and effectiveness in ensuring secure deployments.</p>
</li>
<li><p><a target="_blank" href="https://www.cncf.io/projects/kubescape/"><strong>Kubescape</strong></a>: <strong>Kubescape</strong> is a Kubernetes security platform by ARMO, providing end-to-end protection across the development and runtime lifecycle. It features shift-left security, runtime threat detection, cluster scanning, YAML/Helm validation, compliance with frameworks like NSA-CISA and MITRE ATT&amp;CK, and multi-cloud support. A CNCF sandbox project, Kubescape ensures a robust security posture for Kubernetes environments.</p>
</li>
<li><p><a target="_blank" href="https://falco.org/community/falco-brand/"><strong>Falco</strong></a>: <strong>Falco</strong> is a runtime security tool for Kubernetes and cloud-native environments. It detects anomalous behavior, potential threats, and policy violations by monitoring system calls in real time. With predefined and customizable rules, Falco provides actionable alerts for suspicious activities, helping secure workloads and maintain compliance in dynamic, containerized environments. It is a CNCF graduated project.</p>
</li>
</ul>
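<p>As a taste of how lightweight these scanners are to adopt, Trivy can scan a container image or a directory of IaC files with a single command each (the image tag and directory below are examples):</p>
<pre><code class="lang-bash"># Scan a container image for known CVEs.
trivy image python:3.9-slim

# Scan IaC files (Kubernetes YAML, Terraform, Dockerfiles)
# for misconfigurations.
trivy config ./deploy/
</code></pre>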
<p>These tools collectively ensure a robust security posture throughout the DevOps lifecycle.</p>
<hr />
<h3 id="heading-7-cost-optimization"><strong>7. Cost Optimization</strong></h3>
<p>Kubernetes cost optimization is a growing concern, and tools like <a target="_blank" href="https://cast.ai/"><strong>Cast.AI</strong></a> offer intelligent auto-scaling to reduce expenses. Alongside cost reduction, <a target="_blank" href="https://www.vcluster.com/"><strong>vCluster</strong></a> is pivotal in enabling multi-tenancy, sustainability, and platform engineering goals.</p>
<hr />
<h3 id="heading-8-webassembly-and-ai-workloads"><strong>8. WebAssembly and AI Workloads</strong></h3>
<p>WebAssembly continues to make waves in the industry. Tools like <a target="_blank" href="https://www.spinkube.dev/"><strong>SpinKube</strong></a> and <a target="_blank" href="https://wasmcloud.com/"><strong>WasmCloud</strong></a> simplify deploying WebAssembly applications on Kubernetes.</p>
<p>For AI workloads, <a target="_blank" href="https://www.kubeflow.org/"><strong>Kubeflow</strong></a> remains a leading choice, offering a complete lifecycle solution from training to inference using components like KServe and pipelines. Kubernetes is rapidly becoming a preferred platform for running AI agents, and tools like <strong>Argo Workflows</strong> further enhance these capabilities.</p>
<hr />
<h3 id="heading-kubesimplifys-final-thoughts"><strong>Kubesimplify’s Final Thoughts</strong></h3>
<p>At Kubesimplify, our mission is to break down the complexities of cloud-native technologies and empower you to make informed decisions. These tools represent the forefront of innovation in 2025, simplifying workflows while addressing real-world challenges.</p>
<p>We’d love to hear from you! <strong>What tools are part of your DevOps journey in 2025?</strong> Share your thoughts, and let’s grow together as a community.</p>
<p>A big thanks to the CNCF and open-source contributors for driving these innovations. Stay tuned to <strong>Kubesimplify</strong> for more insights, tutorials, and updates from the cloud-native ecosystem.</p>
]]></content:encoded></item><item><title><![CDATA[Becoming a Docker Captain]]></title><description><![CDATA[Hey everyone! I’m thrilled to share that I’ve recently become a Docker Captain. I wanted to take a moment to reflect on my journey with Docker, starting in 2019, and share some learnings and insights along the way.
Discovering Docker in 2019: A Spark...]]></description><link>https://blog.kubesimplify.com/docker-captain-journey</link><guid isPermaLink="true">https://blog.kubesimplify.com/docker-captain-journey</guid><category><![CDATA[docker captain]]></category><category><![CDATA[Docker]]></category><category><![CDATA[docker desktop]]></category><category><![CDATA[Kubernetes]]></category><dc:creator><![CDATA[Saloni Narang]]></dc:creator><pubDate>Tue, 17 Dec 2024 06:22:46 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1734416444826/f7b4dbf4-3c58-4f37-8ccd-b4356cc442b6.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone! I’m thrilled to share that I’ve recently become a <a target="_blank" href="https://www.docker.com/community/captains/"><strong>Docker Captain</strong></a>. I wanted to take a moment to reflect on my journey with Docker, starting in 2019, and share some learnings and insights along the way.</p>
<h4 id="heading-discovering-docker-in-2019-a-spark-of-interest">Discovering Docker in 2019: A Spark of Interest</h4>
<p>It all began back in 2019, when I was working at <strong>SAP Labs in Bangalore</strong>. As someone passionate about exploring new technologies, I started attending local meetups to connect with like-minded professionals and learn from their experiences. It was during one of these meetups that I first came across the term <strong>Docker</strong>. The concept of containerization and its potential to transform software development immediately got my attention.</p>
<p>Eager to dive deeper, I started learning Docker through various online platforms. I explored tutorials, blogs, and documentation to understand its fundamentals and practical applications. This newfound knowledge not only enhanced my skills but also ignited a desire to share what I had learned with others.</p>
<h4 id="heading-hands-on-experience-working-on-a-docker-project-at-sap-labs">Hands-On Experience: Working on a Docker Project at SAP Labs</h4>
<p>While at SAP Labs, I got the opportunity to work on a <strong>Docker-based project</strong> in a <strong>production environment</strong>. This was a game-changer for me, as it allowed me to gain <strong>hands-on experience with Docker in real-world scenarios</strong>. Working on the project helped me understand the challenges and nuances of using containerization at scale, such as deploying containers in production, managing workloads, and ensuring high availability.</p>
<p>This experience not only strengthened my technical expertise but also gave me a practical perspective on Docker’s impact on enterprise environments.</p>
<h4 id="heading-taking-it-to-the-next-level-earning-the-docker-certified-associate-dca">Taking It to the Next Level: Earning the Docker Certified Associate (DCA)</h4>
<p>As my interest in Docker grew, I decided to deepen my understanding by pursuing the <strong>Docker Certified Associate (DCA)</strong> certification. Preparing for this certification was challenging but rewarding. It provided me with a strong foundation in containerisation concepts, Docker commands, orchestration, and real-world applications. This certification confirmed my expertise and inspired me to explore further possibilities with Docker. I earned the DCA around the time Docker’s enterprise business was acquired by Mirantis.</p>
<h4 id="heading-organising-indias-largest-docker-meetup-a-milestone-in-2020">Organising India's Largest Docker Meetup: A Milestone in 2020</h4>
<p>One of the most memorable milestones in my journey was organizing <strong>India's largest Docker meetup</strong> in January 2020. This was the first event of the year, and it was truly extraordinary. With over <strong>550 attendees</strong>, it became a platform for professionals and enthusiasts to gather, share knowledge, and discuss the future of containerization.</p>
<p>From coordinating with speakers to managing logistics, the experience of organising such a large-scale event was both challenging and exhilarating. Seeing the community’s enthusiasm and engagement reaffirmed my commitment to contributing to the Docker ecosystem.</p>
<h4 id="heading-keeping-the-community-alive-during-covid-19">Keeping the Community Alive During COVID-19</h4>
<p>When the world came to a standstill due to the COVID-19 pandemic, I saw an opportunity to keep the community spirit alive through virtual meetups. I hosted several online sessions on Docker, sharing insights, best practices, and use cases with developers worldwide. These sessions not only helped others learn but also deepened my own understanding as I prepared and answered questions from the community.</p>
<p>During this time, I also created a dedicated <a target="_blank" href="https://youtube.com/playlist?list=PL5uLNcv9SibBZj30yqG01a7A4_MXSyGK3&amp;si=pj3oQaTbZv4MARMj"><strong>Docker playlist</strong> on my YouTube channel</a>, Kubesimplify. This playlist became a go-to resource for developers to learn Docker step-by-step, from beginner concepts to advanced techniques. Additionally, I began writing blogs on <strong>Medium</strong> and <strong>Kubesimplify</strong>, covering practical Docker topics to reach an even wider audience.</p>
<h4 id="heading-applying-for-docker-captain-the-next-chapter">Applying for Docker Captain: The Next Chapter</h4>
<p>After years of learning, contributing, and engaging with the Docker community, I felt ready to take the next step: applying for the <strong>Docker Captain program</strong>. This recognition would not only validate my contributions but also provide me with a platform to reach more people and advocate for Docker in the global developer ecosystem.</p>
<h4 id="heading-reflections-on-the-journey">Reflections on the Journey</h4>
<p>Looking back, what began as curiosity at a meetup has grown into a passion for containerisation and community building. Docker has been more than just a technology for me, it has been a catalyst for personal growth, professional opportunities, and meaningful connections with developers worldwide.</p>
<p>As I continue this journey, I am excited to share more knowledge, build impactful communities, and explore the endless possibilities of containerisation. For anyone looking to start their Docker journey, my advice is simple: <strong>stay curious, contribute to the community, and never stop learning.</strong></p>
<h3 id="heading-whats-happening-at-docker-inc">What’s Happening at Docker Inc.?</h3>
<p>Before we talk about Docker Desktop, let me quickly highlight the key areas Docker Inc. is focusing on post the <strong>2019 Mirantis acquisition</strong> of their enterprise business. Docker Inc. has been refining its offerings, and here are its major products:</p>
<ol>
<li><p><strong>Docker Desktop</strong><br /> A local development environment enabling developers to efficiently build, share, and run containerized applications on their desktops. It integrates seamlessly with multiple developer tools and supports various programming languages. It also has WASM integration and you can build and run wasm OCI images too.</p>
</li>
<li><p><strong>Docker Hub</strong><br /> A cloud-based repository where developers can discover, share, and store container images. It acts as a central hub for managing container images and finding trusted content.</p>
</li>
<li><p><strong>Docker Scout</strong><br /> A tool designed to simplify the software supply chain by providing insights into container images, helping developers identify and address vulnerabilities. It makes finding CVEs in container images straightforward (a quick example follows this list).</p>
</li>
</ol>
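<p>To make these concrete, here’s a quick sketch of both in action. The target image <code>nginx:latest</code> is just an example, and the Wasm flags reflect the beta integration, which has changed across Docker Desktop releases, so treat them as illustrative:</p>
<pre><code class="language-bash"># Docker Scout: summarize an image and list its known CVEs
docker scout quickview nginx:latest
docker scout cves nginx:latest

# Wasm (beta): run a Wasm module packaged as an OCI image.
# Requires the containerd image store and the Wasm feature enabled in settings.
docker run --rm \
  --runtime=io.containerd.wasmedge.v1 \
  --platform=wasi/wasm \
  secondstate/rust-example-hello
</code></pre>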
<p>Now that we’ve got a quick overview, let’s zoom in on Docker Desktop and learn how to get started.</p>
<h3 id="heading-installing-docker-desktop">Installing Docker Desktop</h3>
<p>For this blog, I’ll focus on the installation process for macOS, as that’s what I use. If you’re on Windows or another OS, visit the <a target="_blank" href="https://www.docker.com/products/docker-desktop/">official website</a> and download the installer for your platform.</p>
<h4 id="heading-steps-to-instahttpsdocsdockercomdesktopsetupinstallwindows-installll-docker-desktop-on-macoshttpsdocsdockercomdesktopsetupinstallwindows-install"><a target="_blank" href="https://docs.docker.com/desktop/setup/install/windows-install/">Steps to Insta</a>ll Docke<a target="_blank" href="https://docs.docker.com/desktop/setup/install/windows-install/">r Desktop on macOS:</a></h4>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1734075071703/1dd54880-7214-4401-ba58-3e109812ba20.png" alt class="image--center mx-auto" /></p>
<ol>
<li>Visit the <a target="_blank" href="https://docs.docker.com/desktop/setup/install/mac-install/">Docker Desktop installation guide for macOS</a> and click Download for Mac.</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733915977129/6a283767-e57f-4a80-94f5-832bc54687f1.png" alt class="image--center mx-auto" /></p>
<ol start="2">
<li><p>Open the downloaded installer to begin the installation process.</p>
</li>
<li><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733916484654/023643c3-f43c-41a6-a2fc-1c9835ff66e6.png" alt /></p>
<p> Follow the on-screen instructions to complete the setup.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733916435831/1c2c9792-1229-4832-a545-8a7c08600afc.png" alt /></p>
</li>
<li><p>Once installed, open Docker Desktop, and you’re ready to start containerizing!</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1734075583412/da6e876f-359f-4a11-9307-1e363045fceb.png" alt class="image--center mx-auto" /></p>
<p>If you’re planning to use Docker Desktop for work, make sure to select the appropriate license during setup. Docker Desktop offers both personal and professional options, so choose based on your needs.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733916065649/16dee8b9-bdac-4c87-a28b-375e3351e33c.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733916343860/279e4429-727a-4d16-85c4-658dd39cf7bb.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-whats-next">What’s Next?</h3>
<p>In my upcoming blogs, I’ll dive deeper into Docker Desktop’s features, its integration with Kubernetes, and tips to optimise your containerised workflows. For now, I encourage you to explore Docker Desktop, try running your first container, and share your experiences in the comments below.</p>
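<p>If you’d like a quick smoke test in the meantime, the classic first steps look like this (the container name and port mapping are just examples):</p>
<pre><code class="language-bash"># Confirm the engine is up
docker version

# The canonical first container
docker run hello-world

# Something more visible: nginx on http://localhost:8080
docker run -d -p 8080:80 --name web nginx
docker ps
docker rm -f web   # clean up when done
</code></pre>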
]]></content:encoded></item><item><title><![CDATA[Mastering Kubernetes Costs: From Monitoring to Automation]]></title><description><![CDATA[Navigating the Kubernetes Cost Challenge
Kubernetes has revolutionized how we deploy and manage applications, offering unparalleled scalability and flexibility. However, with great power comes great complexity, and this complexity can often lead to e...]]></description><link>https://blog.kubesimplify.com/mastering-kubernetes-costs-from-monitoring-to-automation</link><guid isPermaLink="true">https://blog.kubesimplify.com/mastering-kubernetes-costs-from-monitoring-to-automation</guid><category><![CDATA[kubernetes costs]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[cost-optimisation]]></category><category><![CDATA[@CastAI]]></category><category><![CDATA[monitoring]]></category><dc:creator><![CDATA[Saloni Narang]]></dc:creator><pubDate>Fri, 13 Dec 2024 12:48:18 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1734093150713/641b00cb-3369-4416-ab49-fc4cfdce766a.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Navigating the Kubernetes Cost Challenge</strong></p>
<p>Kubernetes has revolutionized how we deploy and manage applications, offering unparalleled scalability and flexibility. However, with great power comes great complexity, and this complexity can often lead to excessive cloud costs if not managed effectively.</p>
<p>This blog delves into the journey from basic Kubernetes monitoring to advanced automation, highlighting how intelligent solutions like <a target="_blank" href="https://cast.ai/kubesimplify"><strong>CAST AI</strong></a> can help organizations unlock significant cost savings without compromising performance or stability.</p>
<p>We’ll start by understanding why monitoring is the foundation of any cost optimization strategy. Next, we’ll explore the common challenges teams face when transitioning to automation for resource management. Finally, we’ll examine how CAST AI addresses these challenges and empowers organizations with its innovative solutions and real-world success stories.</p>
<hr />
<h2 id="heading-kubernetes-monitoring-the-foundation-of-cost-optimization"><strong>Kubernetes Monitoring: The Foundation of Cost Optimization</strong></h2>
<p>Monitoring is the essential first step towards understanding and managing your Kubernetes environment. By providing real-time visibility into performance, security, and resource utilization, monitoring helps teams identify inefficiencies and opportunities for optimization.</p>
<h3 id="heading-key-areas-of-kubernetes-monitoring"><strong>Key Areas of Kubernetes Monitoring</strong></h3>
<ol>
<li><p><strong>Performance Monitoring</strong><br /> Metrics like CPU usage, memory consumption, network traffic, and pod restarts help identify bottlenecks. This information allows teams to:</p>
<ul>
<li><p>Optimize resource allocation.</p>
</li>
<li><p>Adjust application configurations for smoother operation.</p>
</li>
<li><p>Ensure that workloads are running efficiently.</p>
</li>
</ul>
</li>
<li><p><strong>Security Monitoring</strong><br /> Monitoring user activity, API calls, and network traffic patterns can reveal suspicious behaviors and potential security breaches. Prompt detection and resolution of security issues prevent costly data leaks and system downtime.</p>
</li>
<li><p><strong>Application Health Monitoring</strong><br /> Metrics such as response times, error rates, and request throughput provide insights into application health. Proactive monitoring allows teams to address performance degradation before it impacts end users, ensuring a seamless experience.</p>
</li>
</ol>
<p>While tools like <strong>Prometheus</strong> and <strong>Grafana</strong> are excellent at collecting and visualizing this data, many teams stop short of translating these insights into actionable measures, missing the opportunity to optimize their Kubernetes clusters effectively.</p>
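<p>Even before a full Prometheus and Grafana stack, a cluster with metrics-server installed lets you eyeball the gap between what workloads use and what they request straight from the CLI; a minimal sketch:</p>
<pre><code class="language-bash"># Live usage per node, and the hungriest pods (requires metrics-server)
kubectl top nodes
kubectl top pods -A --sort-by=cpu | head -n 15

# What workloads request, to compare against what they actually consume
kubectl get pods -A -o custom-columns=\
'NS:.metadata.namespace,POD:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu,MEM_REQ:.spec.containers[*].resources.requests.memory'
</code></pre>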
<hr />
<h2 id="heading-missed-opportunities-the-case-for-automation"><strong>Missed Opportunities: The Case for Automation</strong></h2>
<p>Although monitoring provides valuable insights, acting on them manually is time-consuming and error-prone. Automation is the next logical step for resource and cost optimization, yet many teams hesitate to adopt it for several reasons:</p>
<ol>
<li><p><strong>Fear of Disruption</strong><br /> Teams worry that automation might introduce instability, especially in complex Kubernetes environments.</p>
</li>
<li><p><strong>Lack of Expertise</strong><br /> The rapidly evolving Kubernetes landscape can make it challenging for teams to implement and manage sophisticated automation solutions.</p>
</li>
<li><p><strong>Resistance to Change</strong><br /> Traditional practices and a reluctance to embrace new technologies often slow down the adoption of automation.</p>
</li>
</ol>
<p>These hesitations lead to significant missed opportunities, such as the adoption of <strong>spot instances</strong>, a deeply discounted compute option offered by cloud providers. While spot instances offer substantial cost savings, their ephemeral nature requires robust automation to handle disruptions effectively.</p>
<hr />
<h2 id="heading-cast-ai-unlocking-the-power-of-automation"><strong>CAST AI: Unlocking the Power of Automation</strong></h2>
<p>CAST AI bridges the gap between monitoring and automation, providing a comprehensive platform that optimizes Kubernetes environments for cost, performance, and security.</p>
<h3 id="heading-key-features-of-cast-ai"><strong>Key Features of CAST AI</strong></h3>
<ol>
<li><p><strong>Cost Monitoring and Insights</strong><br /> CAST AI offers detailed dashboards that break down resource usage, trends, and forecasts. Unlike traditional tools, it goes a step further by providing actionable recommendations for cost savings.</p>
</li>
<li><p><strong>Rebalancing</strong><br /> CAST AI’s rebalancing engine minimizes resource fragmentation by intelligently placing pods across nodes. For example, instead of spreading workloads across three underutilized nodes, it consolidates them onto two nodes, reducing costs while maintaining high availability.</p>
</li>
<li><p><strong>Workload Right-Sizing</strong><br /> By analyzing historical data and real-time metrics, CAST AI dynamically adjusts CPU and memory allocations for workloads, ensuring optimal resource utilization without compromising performance (a generic sketch of what right-sizing touches follows this list).</p>
</li>
<li><p><strong>Spot Instance Management</strong><br /> CAST AI automates the use of spot instances with features like:</p>
<ul>
<li><p><strong>Fallback Mechanisms</strong>: Automatically shifts workloads to on-demand nodes when spot instances are reclaimed.</p>
</li>
<li><p><strong>Spot Diversity</strong>: Leverages a wide range of spot instance types to minimize disruption risk.</p>
</li>
</ul>
</li>
</ol>
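<p>Mechanically, right-sizing comes down to tuning the <code>requests</code> and <code>limits</code> that every Kubernetes workload declares; CAST AI automates this, but a hand-written (and entirely hypothetical) illustration of the knobs involved looks like this:</p>
<pre><code class="language-yaml"># Hypothetical workload: a right-sizing tool would derive these numbers
# from observed usage instead of guesses.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: example.com/api:1.0   # placeholder image
          resources:
            requests:
              cpu: 250m        # sized to observed usage, not an inflated guess
              memory: 256Mi
            limits:
              memory: 512Mi
</code></pre>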
<p>These features empower organizations to achieve cost reductions of up to <strong>50%-80%</strong>, depending on their workloads and strategies.</p>
<hr />
<h2 id="heading-real-world-success-stories"><strong>Real-World Success Stories</strong></h2>
<p>CAST AI’s solutions have delivered tangible results for organizations across industries:</p>
<ol>
<li><p><strong>Eliminating Manual Kubernetes Upgrades</strong><br /> A customer who spent two weeks annually upgrading Kubernetes versions across environments automated the process with CAST AI. This not only saved time but also boosted team morale, allowing engineers to focus on strategic initiatives.</p>
</li>
<li><p><strong>Spot Instance Management During Peak Traffic</strong><br /> Yotpo, a marketing software company, leveraged CAST AI’s spot instance automation during Black Friday. Despite high traffic, CAST AI ensured seamless scaling and uninterrupted operations, resulting in significant cost savings.</p>
</li>
</ol>
<hr />
<h2 id="heading-beyond-cost-optimization-cast-ais-expanding-capabilities"><strong>Beyond Cost Optimization: CAST AI’s Expanding Capabilities</strong></h2>
<p>CAST AI is continuously innovating to address broader Kubernetes challenges. Recent launches include:</p>
<ul>
<li><p><strong>AI Enabler</strong>: A solution for leveraging AI to enhance Kubernetes operations.</p>
</li>
<li><p><strong>Kubernetes Security</strong>: A new product designed to improve container and cluster security.</p>
</li>
</ul>
<p>These additions position CAST AI as a comprehensive platform for Kubernetes day-two operations, covering everything from cost optimization to security and AI-driven enhancements.</p>
<hr />
<h2 id="heading-embracing-automation-for-a-cost-efficient-future"><strong>Embracing Automation for a Cost-Efficient Future</strong></h2>
<p>Over-provisioning remains a rampant issue in Kubernetes environments, with studies showing that up to <strong>87% of resources are underutilized</strong>. By embracing automation, organizations can unlock significant cost savings, improve operational efficiency, and free up teams to focus on innovation.</p>
<p>As Giri from CAST AI aptly put it: <em>“Don’t do manual work for resource optimization. We’re in 2024 - it’s time to automate.”</em></p>
<hr />
<h2 id="heading-ready-to-get-started"><strong>Ready to Get Started?</strong></h2>
<p>CAST AI offers a free trial to help teams explore its capabilities. <a target="_blank" href="https://cast.ai/kubesimplify">Visit their website</a> to set up a demo cluster and see how monitoring and automation can transform your Kubernetes operations.<br />With intelligent solutions like CAST AI, Kubernetes cost optimization is no longer a daunting task but an achievable goal. Start your journey today and unlock the full potential of automation.</p>
<p>Check out the entire conversation on the Kubesimplify YouTube channel:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=HHETsDLAlt0&amp;t=1s">https://www.youtube.com/watch?v=HHETsDLAlt0&amp;t=1s</a></div>
]]></content:encoded></item><item><title><![CDATA[KubeCon + CloudNativeCon North America 2024 Recap: Themes, Innovations, and Community Spirit]]></title><description><![CDATA[Hello everyone! KubeCon + CloudNativeCon North America 2024 has just wrapped up, and I’m thrilled to share what I learned and experienced from this incredible event. I returned home two days ago and am still struggling with jet lag, but the excitemen...]]></description><link>https://blog.kubesimplify.com/kubecon-cloudnativecon-north-america-2024-recap-themes-innovations-and-community-spirit</link><guid isPermaLink="true">https://blog.kubesimplify.com/kubecon-cloudnativecon-north-america-2024-recap-themes-innovations-and-community-spirit</guid><category><![CDATA[Kubecon]]></category><category><![CDATA[cloud native]]></category><category><![CDATA[CNCF]]></category><category><![CDATA[Kubernetes]]></category><dc:creator><![CDATA[Saloni Narang]]></dc:creator><pubDate>Sat, 23 Nov 2024 10:41:18 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1732631958431/4563ab8c-7b9f-4959-a519-9178c19f017c.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello everyone! KubeCon + CloudNativeCon North America 2024 has just wrapped up, and I’m thrilled to share what I learned and experienced from this incredible event. I returned home two days ago and am still struggling with jet lag, but the excitement and insights from the conference keep me energized.<br />This year’s KubeCon was thoughtfully organized around themes, with each day focusing on a specific topic:</p>
<ul>
<li><p>Day 1: Artificial Intelligence (AI)</p>
</li>
<li><p>Day 2: Security</p>
</li>
<li><p>Day 3: Community</p>
</li>
</ul>
<p>For anyone new to the cloud-native ecosystem, KubeCon is the largest and most influential conference dedicated to cloud-native technologies. This year, the 2024 edition was hosted in Salt Lake City, Utah. Preceding the main event was Rejekts, a two-day conference held on November 10 and 11. Rejekts offers a unique platform for deeper discussions and features talks that didn’t make it into the main KubeCon lineup. I had the opportunity to attend Rejekts in person and was thrilled to see Kubesimplify as a proud community and media partner for the event. The energy, the insightful conversations, and the chance to connect with incredible individuals made the experience unforgettable.</p>
<p>I also had the honor of speaking at KubeCon, delivering a talk titled <strong>"</strong><a target="_blank" href="https://youtu.be/X-0zyyWRkiM?si=Gta_WKdnie5Dr12Q"><strong>Cloud Native Sustainability Speedrun: Tools from Infrastructure to Application Level</strong></a><strong>."</strong> It was an incredible experience to share insights into sustainability tools like Kepler, KubeGreen, and Cloud Carbon Footprint, and to demonstrate how cloud-native applications can incorporate sustainability at every level.</p>
<p><img src="https://pbs.twimg.com/media/GcZTj1JaAAAPR7L?format=jpg&amp;name=large" alt="Image" /></p>
<p><strong><em>Day 1 Highlights: AI in Cloud-Native and Tackling Patent Trolls</em></strong></p>
<p>The first day of KubeCon was a blend of excitement around Artificial Intelligence and a call to action against legal challenges facing the open-source community.</p>
<ol>
<li><strong>Scaling Kubernetes for Generative AI</strong></li>
</ol>
<p>Industry experts shared lessons learned while building Kubernetes clusters to support generative AI workloads. They tackled challenges like hardware failures, GPU scheduling optimization, and observability while leveraging CNCF projects to manage platforms for AI.</p>
<ol start="2">
<li><strong>NVIDIA’s Contributions to AI and CNCF Projects</strong><br />Chris Lamb from NVIDIA discussed how the company uses and contributes to CNCF projects, showcasing the synergy between open source and AI advancements.</li>
</ol>
<ol start="3">
<li><strong>Patent Trolls and the Cloud Native Heroes Challenge</strong><br />CNCF launched the Cloud Native Heroes Challenge, a patent troll bounty program in partnership with Unified Patents. This initiative aims to protect open-source projects from legal threats, allowing the community to contribute while earning rewards. Learn more <a target="_blank" href="https://cncf.io/heroes/">here</a>.</li>
</ol>
<p>On Day 1, the blend of technological advancements in AI and the community’s resilience in tackling challenges showcased the collaborative spirit of KubeCon.</p>
<p><strong><em>Day 2 Highlights: Security in the Cloud-Native Ecosystem</em></strong><br /> Day 2 focused on the critical topic of Security, reflecting the growing importance of safeguarding open-source ecosystems.</p>
<p>1. <strong>Envoy AI Gateway</strong><br />A significant announcement was the introduction of the Envoy AI Gateway, the first CNCF AI project. Built on Envoy Proxy, this gateway offers scalable solutions for LLM access, unified APIs, and upstream authorization.</p>
<p>2. <strong>CNCF End User Awards</strong><br /> Adobe received the CNCF End User Award for contributing to 46 CNCF projects, including Kubernetes, Prometheus, and OpenTelemetry.</p>
<p><img src="https://pbs.twimg.com/media/GcW49HRWAAEDl26?format=jpg&amp;name=medium" alt="Image" /></p>
<p>3. <strong>Announcing the 2024 Community Awards!</strong></p>
<p>This year’s awards include:</p>
<ul>
<li><p>Lifetime Achievement Award (new!)</p>
</li>
<li><p>Top Committer</p>
</li>
<li><p>Chop Wood Carry Water</p>
</li>
<li><p>CNCF Lorem Ipsum (previously Documentarian)</p>
</li>
<li><p>TAGGIE</p>
</li>
<li><p>Lift and Shift (special)</p>
</li>
</ul>
<p><img src="https://pbs.twimg.com/media/GcXGrQta8AAPb5w?format=jpg&amp;name=medium" alt="Image" /></p>
<ol start="4">
<li><strong>New Certifications</strong><br /> At <a target="_blank" href="https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/?__hstc=60185074.51db3727f50fc735b9cf22025d9e6d69.1694516457625.1730541543361.1732355999476.16&amp;__hssc=60185074.2.1732355999476&amp;__hsfp=3589139824">KubeCon + CloudNativeCon North America</a>, CNCF is also introducing three new project-specific certifications:</li>
</ol>
<ul>
<li><p><a target="_blank" href="https://training.linuxfoundation.org/certification/certified-backstage-associate-cba/?__hstc=60185074.51db3727f50fc735b9cf22025d9e6d69.1694516457625.1730541543361.1732355999476.16&amp;__hssc=60185074.2.1732355999476&amp;__hsfp=3589139824"><strong>Certified Backstage Associate (CBA)</strong></a></p>
</li>
<li><p><a target="_blank" href="https://training.linuxfoundation.org/certification/certified-backstage-associate-cba/?__hstc=60185074.51db3727f50fc735b9cf22025d9e6d69.1694516457625.1730541543361.1732355999476.16&amp;__hssc=60185074.2.1732355999476&amp;__hsfp=3589139824"><strong>Op</strong></a><a target="_blank" href="https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/?__hstc=60185074.51db3727f50fc735b9cf22025d9e6d69.1694516457625.1730541543361.1732355999476.16&amp;__hssc=60185074.2.1732355999476&amp;__hsfp=3589139824"><strong>enTelemetry Certified Associate (OTCA)</strong></a></p>
</li>
<li><p><a target="_blank" href="https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/?__hstc=60185074.51db3727f50fc735b9cf22025d9e6d69.1694516457625.1730541543361.1732355999476.16&amp;__hssc=60185074.2.1732355999476&amp;__hsfp=3589139824"><strong>Kyverno Certified Associate (KCA)</strong></a></p>
</li>
</ul>
<p><strong><em>Day 3 Highlights: Community and the Future of Cloud-Native</em></strong></p>
<p>The final day celebrated the Community, emphasizing collaboration and envisioning the future of cloud-native technologies.</p>
<ol>
<li><strong>Kubernetes’ Ten-Year Journey</strong></li>
</ol>
<p>Kelsey Hightower reflected on a decade of Kubernetes, sharing stories of its evolution and celebrating a community that has grown exponentially since the first KubeCon drew 600 attendees in 2016.</p>
<p><img src="https://pbs.twimg.com/media/GccFjtQXkAAyHpm?format=jpg&amp;name=medium" alt="Image" /></p>
<p>2. <strong>Training African Technologists</strong><br />CNCF announced a partnership with LF Education and Andela to train 20,000 African technologists in cloud-native basics, enabling them to earn certifications like KCNA and CKAD.</p>
<p>3. <strong>Graduated Projects</strong><br />Congratulations to the newest CNCF Graduated projects:</p>
<ul>
<li><p>Cert-Manager</p>
</li>
<li><p>Dapr</p>
</li>
<li><p>KubeEdge</p>
</li>
<li><p>Falco</p>
</li>
</ul>
<p><strong>Co-Located Events</strong></p>
<p>KubeCon + CloudNativeCon started with various co-located events on Monday, November 11, 2024, each providing a deep dive into specific topics within the cloud-native ecosystem. Here's a quick rundown of a few of them:</p>
<ul>
<li><p>WasmCon Day 1</p>
</li>
<li><p>ArgoCon Hosted by CNCF</p>
</li>
<li><p>BackstageCon, Hosted by CNCF</p>
</li>
<li><p>Cilium + eBPF Day Hosted by CNCF</p>
</li>
<li><p>Cloud Native + Kubernetes AI Day Hosted by CNCF</p>
</li>
<li><p>Observability Day Hosted by CNCF</p>
</li>
<li><p>Platform Engineering Day Hosted by CNCF</p>
</li>
<li><p>Cloud Native University Hosted by CNCF (Half Day)</p>
</li>
<li><p>Data on Kubernetes Day Hosted by CNCF (Half Day)</p>
</li>
<li><p>EnvoyCon Hosted by CNCF (Half Day)</p>
</li>
<li><p>OpenFeature Summit Hosted by CNCF (Half Day)</p>
</li>
<li><p>AppDeveloperCon Hosted by CNCF</p>
</li>
</ul>
<p>These events set the stage for the main conference by fostering meaningful conversations and offering hands-on workshops for participants to expand their expertise.</p>
<p><strong>Save the Dates for Future KubeCons</strong></p>
<p>Here are the upcoming KubeCon events to mark on your calendar:</p>
<ul>
<li><p>KubeCon + CloudNativeCon India 2024 | December 11–12 | Delhi, India</p>
</li>
<li><p>KubeCon + CloudNativeCon Europe 2025 | April 1–4 | London, England</p>
</li>
<li><p>KubeCon + CloudNativeCon China 2025 | June 10–11 | Hong Kong</p>
</li>
<li><p>KubeCon + CloudNativeCon Japan 2025 | June 16–17 | Tokyo, Japan</p>
</li>
<li><p>KubeCon + CloudNativeCon North America 2025 | November 10–13 | Atlanta, Georgia</p>
</li>
<li><p>KubeCon + CloudNativeCon India 2025 | August 6–7 | Hyderabad, India</p>
</li>
</ul>
<p><strong>Final Thoughts</strong></p>
<p>This KubeCon was particularly special for me as I recently became a <strong>CNCF Ambassador</strong>! With this new role, I had the privilege of attending the exclusive CNCF Ambassador breakfast, where I got to connect with other ambassadors and community leaders. It was an enriching experience to exchange ideas and learn more about contributing to the cloud-native ecosystem.</p>
<p><img src="https://pbs.twimg.com/media/GcR38YkbkAEnuzY?format=jpg&amp;name=medium" alt="Image" /></p>
<p>This time, I had my family with me: my husband, Saiyam, and our little one, Rushika. Believe it or not, this was Rushika’s third KubeCon! Thanks to the excellent <strong>daycare facility at KubeCon</strong>, balancing family and conference activities was seamless. Watching Rushika enjoy herself while I was immersed in the sessions and networking made this trip truly memorable.</p>
<p><img src="https://pbs.twimg.com/media/GcUdAHkboAA7_LV?format=jpg&amp;name=medium" alt="Image" /></p>
<p>Another highlight for me was finally being able to register for a <strong>professional headshot session</strong>. I’ve always missed this in previous events, but this time, I made it a point to grab the opportunity, and I’m thrilled with how it turned out!<br />Additionally, Saiyam and I teamed up with Shwetha Vohra to record an exclusive <strong>interview on platform engineering</strong> for our YouTube channel, <strong>Kubesimplify</strong>. The discussion was insightful, and I can’t wait to share it with you all soon!</p>
<p>I also love the <strong>job board at KubeCon</strong>: a thoughtful touch that gives attendees a chance to explore job opportunities at the event. It was great to see companies actively seeking talent and participants connecting with potential employers directly.</p>
<p><img src="https://pbs.twimg.com/media/GcW4NhfXYAE-PXi?format=jpg&amp;name=900x900" alt="Image" /></p>
<p>What were your favorite moments from KubeCon 2024? Let’s discuss this in the comments!</p>
]]></content:encoded></item><item><title><![CDATA[Optimizing Kubernetes Costs: Balancing Spot and On-Demand Instances with Topology Spread Constraints]]></title><description><![CDATA[In the fast-evolving world of cloud-native applications, cost optimization is a top priority for any development team. Kubernetes offers a flexible and scalable platform for deploying applications, but with that flexibility comes complexity, especial...]]></description><link>https://blog.kubesimplify.com/optimizing-kubernetes-costs-balancing-spot-and-on-demand-instances-with-topology-spread-constraints</link><guid isPermaLink="true">https://blog.kubesimplify.com/optimizing-kubernetes-costs-balancing-spot-and-on-demand-instances-with-topology-spread-constraints</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[cost-optimisation]]></category><category><![CDATA[cloudcostmanagement]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Devops articles]]></category><dc:creator><![CDATA[FacetsCloud]]></dc:creator><pubDate>Sat, 21 Sep 2024 08:41:43 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1726662654072/acf2f126-c490-43c3-8546-d1a16c771012.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the fast-evolving world of cloud-native applications, cost optimization is a top priority for any development team. Kubernetes offers a flexible and scalable platform for deploying applications, but with that flexibility comes complexity, especially when it comes to managing costs.</p>
<p>One of the most effective strategies to <a target="_blank" href="https://blog.facets.cloud/cloud-cost-optimization-efficiency-by-design/">reduce cloud spending</a> is to use a mix of spot and on-demand instances. Spot instances are significantly cheaper, but they come with the risk of being terminated at any time, while on-demand instances provide the stability needed to keep your applications running smoothly.</p>
<p>On paper, the solution seems simple: combine spot and on-demand instances to get the best of both worlds—cost savings and reliability. <strong>However, the reality of managing pod placement across these different instance types is far from straightforward.</strong> Let’s explore the problem and how we tackled it.</p>
<h2 id="heading-the-problem-managing-pod-placement-in-mixed-instance-environments">The Problem: Managing Pod Placement in Mixed Instance Environments</h2>
<p>As you begin to implement a mixed instance strategy in Kubernetes, you quickly run into challenges with pod placement. Kubernetes does provide tools for controlling where pods are deployed, but they’re often too simplistic or too rigid for the nuanced control you need.</p>
<p><strong>Node Selectors</strong> allow you to direct Kubernetes to place a pod on a specific type of instance, such as a spot instance. But this method is binary—it either places the pod on a spot instance, or it doesn’t. There’s no middle ground, no balancing between instance types.</p>
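<p>For illustration, here’s what that binary placement looks like; the <code>capacity-type</code> label is a placeholder, since the real label depends on how your nodes are provisioned (for example, <code>eks.amazonaws.com/capacityType: SPOT</code> on EKS managed node groups):</p>
<pre><code class="lang-yaml"># Binary placement: this pod can ONLY run on nodes carrying the label.
# "capacity-type: spot" is a placeholder for your provisioner's real label.
apiVersion: v1
kind: Pod
metadata:
  name: spot-only
spec:
  nodeSelector:
    capacity-type: spot
  containers:
    - name: app
      image: registry.k8s.io/pause:3.1
</code></pre>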
<p><strong>Affinity and Anti-Affinity Rules</strong> provide more control by allowing you to express preferences or requirements for pod placement. For example, you could set a rule that prefers spot instances but allows on-demand instances if no spot instances are available. However, as your cluster grows and your applications become more complex, these rules can become cumbersome to manage. The YAML configurations become lengthy and difficult to maintain, and the rules themselves can become conflicting or lead to unintended consequences.</p>
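<p>The soft-preference version of the same idea, again with a placeholder label, is a sketch like this:</p>
<pre><code class="lang-yaml"># Soft preference: try spot first, fall back to on-demand when no spot
# node fits. The label key and value are placeholders, as above.
apiVersion: v1
kind: Pod
metadata:
  name: prefers-spot
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: capacity-type
                operator: In
                values: ["spot"]
  containers:
    - name: app
      image: registry.k8s.io/pause:3.1
</code></pre>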
<p>Additionally, as clusters scale, maintaining even distribution of pods across instance types becomes a challenge. Without careful management, you could end up with too many pods on one type of instance, leading to inefficiencies or increased risk if those instances are interrupted.</p>
<p>This lack of fine-grained control and the complexity of managing pod placement in large, diverse clusters were the core problems we needed to solve.</p>
<h2 id="heading-the-solution-leveraging-kubernetes-topology-spread-constraints">The Solution: Leveraging Kubernetes Topology Spread Constraints</h2>
<p>To address these challenges, we turned to <a target="_blank" href="https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/">Kubernetes Topology Spread Constraints</a> (TSC), a feature that simplifies and streamlines the process of distributing pods across different topologies within a cluster. TSC allowed us to control pod placement with more nuance and flexibility, reducing the complexity of our configurations and improving our ability to manage mixed-instance environments.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726836664397/f03c7e5b-6500-44e7-9d2c-70bad392f464.png" alt class="image--center mx-auto" /></p>
<p><a target="_blank" href="https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/#spread-constraint-definition"><strong>MaxSkew</strong></a> is a critical component of this approach. By setting MaxSkew to 1, we ensured that pods are distributed as evenly as possible across both spot and on-demand instances. This prevents any single instance type from becoming overloaded, thereby improving the overall resilience and performance of our applications.</p>
<p>We also leveraged the <strong>Topology Key</strong> to distinguish between spot and on-demand instances. By using a node label such as <code>node.kubernetes.io/instance-type</code>, we were able to clearly differentiate between the two, allowing Kubernetes to make informed decisions about where to place pods.</p>
<p>Here's an example of how this would look in practice using a Kubernetes <code>Deployment</code>:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">mypod</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">foo:</span> <span class="hljs-string">bar</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">10</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">foo:</span> <span class="hljs-string">bar</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">foo:</span> <span class="hljs-string">bar</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">topologySpreadConstraints:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">maxSkew:</span> <span class="hljs-number">1</span>
          <span class="hljs-attr">topologyKey:</span> <span class="hljs-string">node.kubernetes.io/instance-types</span>
          <span class="hljs-attr">whenUnsatisfiable:</span> <span class="hljs-string">DoNotSchedule</span>
          <span class="hljs-attr">labelSelector:</span>
            <span class="hljs-attr">matchLabels:</span>
              <span class="hljs-attr">foo:</span> <span class="hljs-string">bar</span>
      <span class="hljs-attr">containers:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">pause</span>
          <span class="hljs-attr">image:</span> <span class="hljs-string">registry.k8s.io/pause:3.1</span>
</code></pre>
<p>In this example:</p>
<ul>
<li><p><code>maxSkew: 1</code> ensures even distribution of pods across the different instance types.</p>
</li>
<li><p>The <code>topologyKey</code> is set to <code>node.kubernetes.io/instance-type</code> to distinguish between spot and on-demand instances.</p>
</li>
<li><p>We use <code>whenUnsatisfiable: DoNotSchedule</code> to ensure strict adherence to the distribution rule, preventing pods from being scheduled if the constraints can’t be met.</p>
</li>
</ul>
<p><strong>whenUnsatisfiable</strong> became our fallback strategy, providing two options (a best-effort variant of the earlier manifest is sketched after this list):</p>
<ul>
<li><p><strong>DoNotSchedule</strong> enforces strict adherence to the constraints, ensuring that if the desired distribution cannot be achieved, the pod won’t be scheduled. This option is crucial for scenarios where balanced distribution is critical for application performance or reliability.</p>
</li>
<li><p><strong>ScheduleAnyway</strong> offers flexibility by allowing the scheduler to proceed with pod placement even if perfect distribution isn’t possible. This approach is particularly useful in situations requiring rapid scaling, such as during sudden traffic spikes, where ensuring availability is more important than maintaining an ideal distribution.</p>
</li>
</ul>
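<p>For completeness, the best-effort variant of the constraint from the earlier manifest changes a single field:</p>
<pre><code class="lang-yaml">topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: node.kubernetes.io/instance-type
    whenUnsatisfiable: ScheduleAnyway   # prefer balance, but never block scheduling
    labelSelector:
      matchLabels:
        foo: bar
</code></pre>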
<p>Topology Spread Constraints have been available for a few years now, but adoption has lagged, largely due to a lack of awareness.</p>
<h2 id="heading-real-world-challenges-limitations-and-considerations">Real-World Challenges: Limitations and Considerations</h2>
<p>While Topology Spread Constraints offer a powerful solution, they are not without limitations. One significant challenge is that <strong>Topology Spread Constraints do not rebalance pods at runtime</strong>. They only apply when pods are initially scheduled. If the distribution of instances changes—such as when a spot instance is terminated—Kubernetes does not automatically rebalance the pods across the remaining instances. This can lead to uneven distribution over time, potentially undermining the benefits of using TSC.</p>
<p>Another challenge arises during <strong>node failures</strong>. When a node fails, Kubernetes will reschedule the affected pods, but it may not respect the original Topology Spread Constraints. This could result in pods becoming concentrated on fewer nodes, reducing the effectiveness of your mixed-instance strategy.</p>
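<p>One mitigation worth knowing about: the Kubernetes <a target="_blank" href="https://github.com/kubernetes-sigs/descheduler">Descheduler</a> project ships a <code>RemovePodsViolatingTopologySpreadConstraint</code> strategy that evicts pods whose placement no longer satisfies their constraints, so the scheduler can place them afresh. A minimal policy sketch, assuming the v1alpha1 policy format (the schema varies across descheduler releases, so treat this as illustrative):</p>
<pre><code class="lang-yaml"># Evict pods that violate their topology spread constraints so the
# scheduler can rebalance them. Schema shown is the v1alpha1 format.
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "RemovePodsViolatingTopologySpreadConstraint":
    enabled: true
    params:
      includeSoftConstraints: false   # only enforce DoNotSchedule constraints
</code></pre>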
<h2 id="heading-conclusion-making-kubernetes-work-for-you">Conclusion: Making Kubernetes Work for You</h2>
<p>Optimizing costs in Kubernetes is a challenging but essential task for any development team. While the complexities of managing mixed-instance environments can be daunting, tools like Topology Spread Constraints offer a path forward. By embracing these features and understanding their limitations, you can achieve a balance between cost efficiency and reliability, making Kubernetes work for you rather than against you.</p>
<p>In the end, it’s about finding the right tools and strategies to meet your specific needs. Whether you’re managing a small cluster or a large, complex environment, the key is to remain flexible and adaptable, continually refining your approach as your requirements evolve. With the right mindset and the right tools, you can optimize your <a target="_blank" href="https://blog.facets.cloud/kubernetes-cicd-explained/">Kubernetes deployments</a> for both cost and performance, ensuring that your applications are always running at their best.</p>
]]></content:encoded></item></channel></rss>