KubeCon + CloudNativeCon Recap | NVIDIA DGX Spark
KubeCon is about community, collaboration, and the innovation shaping the cloud-native space.
This time was extra special as the CNCF celebrated its 10-year anniversary. The next decade will require even more hard work and collaboration from the community. On a bittersweet note, I also saw many maintainers struggling to stay motivated and supported, the recent Ingress NGINX retirement announcement being one example. So, while we celebrate, it’s also a reminder to build a model that motivates existing maintainers, encourages new ones to step up, and ensures their work is supported.
This KubeCon was special for me → packed with meetings and sessions right from Rejekts. Although my Rejekts talk wasn’t originally planned, I ended up presenting “Beyond the Default Scheduler: Navigating GPU Multitenancy in the AI Era” with Hrittik, where we discussed GPU architecture, GPU sharing, and tenancy models for AI. I then attended the Maintainers Summit, met some amazing maintainers, and joined a few TAG roundtables.
At the Kubernetes Colocated Events, I had the opportunity to give a 5-minute keynote as part of the vCluster team:
Nov 8: Cloud Native Rejekts
Nov 9: Maintainers Summit
Nov 10: AI-Ready Platforms: Scaling Teams Without Scaling Costs: I spoke about how virtual clusters and autoscaling are reshaping how AI teams scale infrastructure.
Nov 12: Open Source at the Edge: Hardware, Firmware & AI Stacks: together with Miley Fu, we showcased the EchoKit device and demonstrated how to create an AI-enabled voice assistant.
Nov 12: Building Resilient Cloud-Native Infrastructure in the Second Decade (TAG Operational Resilience): a panel with Rafael Brito (StormForge), Mario Fahlandt (Kubermatic), Carolina Valencia (KrolCloud), Nabarun Pal (Broadcom), and myself, where we discussed TAG Operational Resilience initiatives and subprojects like Green Reviews.
I also spent time at the vCluster booth and had a personal milestone → my first-ever book signing! :)
My Key Highlights from KubeCon:
Meaningful conversations with friends and new community members: I spoke to many about AI-ready platforms, observability, and security for LLMs and the broader AI ecosystem.
The Maintainers Summit discussions around technical debt and how we can make TAGs better hubs for innovation.
The energy at the booths, sessions, and sponsor showcases is always inspiring.
Big ecosystem updates: Ingress NGINX retirement, ESO GA release, and the new CNCF AI Conformance Program.
Continued progress in TAG Operational Resilience, with plans to get more contributors involved.
And, of course, my first-ever book signing, definitely a core memory!
NVIDIA DGX Spark
Well, well, well, I have some exciting news!
I finally bought the NVIDIA DGX Spark, and you’ll soon see some awesome content around it on Kubesimplify. I’ve been wanting my own GPU for so long and can’t wait to hook it up to my desktop, run some local RAG pipelines, and benchmark models.
If you’ve got any crazy ideas, send them my way; I’d love to try them out!
Awesome Reads
Ingress NGINX Retirement: What You Need to Know - This is very big news; what are your thoughts? Kubernetes SIG Network and the Security Response Committee announced the retirement of Ingress NGINX, with maintenance continuing only until March 2026. Users are advised to migrate to modern alternatives like the Gateway API, as no security fixes or updates will be provided after that date.
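For those planning the migration, the rough shape of the change is translating Ingress rules into Gateway API resources such as HTTPRoute. A minimal sketch (the gateway, hostname, and service names here are made up for illustration) might look like:

```yaml
# Hypothetical HTTPRoute replacing a simple path-based Ingress rule.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-route
spec:
  parentRefs:
    - name: example-gateway   # the Gateway this route attaches to
  hostnames:
    - "example.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /app
      backendRefs:
        - name: app-service
          port: 80
```

Unlike Ingress, routing attaches to an explicit Gateway resource, which is part of what makes the Gateway API more expressive and role-oriented.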
Streamline Complex AI Inference on Kubernetes with NVIDIA Grove - NVIDIA Grove, now part of NVIDIA Dynamo, is an open-source Kubernetes API designed to orchestrate and scale complex, multi-component AI inference systems—such as agentic and multimodal pipelines—across thousands of GPUs. It enables declarative system-level management with features like hierarchical gang scheduling, topology-aware placement, multilevel autoscaling, and role-aware orchestration for efficient and reliable distributed inference.
External Secrets Operator GA - The External Secrets Operator (ESO) has reached General Availability (GA) with the release of version v1.0.0, marking a stable milestone that follows semantic versioning for future updates. This release introduces key features like generic target types (allowing ESO to create resources beyond Secrets) and modular provider support, enabling customized builds with specific providers while maintaining compatibility and ease of upgrade from previous versions.
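For context on what ESO does: an ExternalSecret resource syncs a value from an external store into a Kubernetes Secret. A minimal sketch, assuming the v1 API and with a made-up store name and key path, might look like:

```yaml
# Hypothetical ExternalSecret syncing a DB password into a K8s Secret.
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: db-credentials
spec:
  refreshInterval: 1h          # how often to re-sync from the store
  secretStoreRef:
    name: example-store        # a SecretStore configured separately
    kind: SecretStore
  target:
    name: db-credentials       # the Kubernetes Secret to create
  data:
    - secretKey: password
      remoteRef:
        key: prod/db           # path in the external secret manager
        property: password
```

The GA release's generic target types extend this pattern so ESO can render resources beyond plain Secrets.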
Open Source KServe AI Inference Platform Becomes CNCF Project - KServe, an open-source AI inference platform originally part of Kubeflow, has been donated to the Cloud Native Computing Foundation (CNCF) as an incubating project to enhance collaboration across the cloud-native AI ecosystem. Now integrated with tools like vLLM and Red Hat OpenShift AI, KServe aims to simplify scalable AI model serving and enable Model-as-a-Service (MaaS) deployments across Kubernetes and edge environments.
ConfigHub: Why Your Internal Developer Platform Needs It - explains how ConfigHub, created by Alexis Richardson, Brian Grant, and Jesper Joergensen, introduces a new paradigm called Configuration as Data (CaD) to solve GitOps sprawl and complexity in managing Kubernetes configurations. Unlike traditional GitOps that mixes templates and code, ConfigHub stores fully rendered configurations in a versioned database, enabling a single pane of glass for managing, validating, and promoting configurations across environments with clarity, control, and automation.
CNCF Launches Certified Kubernetes AI Conformance Program to Standardize AI Workloads on Kubernetes - The Cloud Native Computing Foundation (CNCF) has launched the Certified Kubernetes AI Conformance Program to establish open, community-driven standards for running AI workloads reliably and consistently on Kubernetes. This initiative aims to ensure portability, interoperability, and stability across AI platforms, reducing fragmentation as major vendors like AWS, Google Cloud, Microsoft, Red Hat, and VMware adopt the certification to validate their AI-ready Kubernetes environments.
Awesome Resources/Repos
Deepnote - Deepnote is a drop-in replacement for Jupyter with an AI-first design, sleek UI, new blocks, and native data integrations. Use Python, R, and SQL locally in your favorite IDE, then scale to Deepnote cloud for real-time collaboration, Deepnote agent, and deployable data apps.
NVSentinel - NVSentinel is a cross-platform fault remediation service designed to rapidly resolve runtime node-level issues in GPU-accelerated computing environments.
Learn from X
https://x.com/satyanadella/status/1989755076353921404?s=20
https://x.com/steren/status/1987194178460381655?s=20
https://x.com/TetherIA_ai/status/1986598951290872254?s=20
Some Pictures from the event
It’s a free newsletter; sharing is caring and subscribing is showing support!









Your session on GPU multitenancy sounds timely given the explosion in AI workloads. The Ingress NGINX retirement is definitely a watershed moment for the ecosystem. Curious how the DGX Spark will handle local RAG pipelines; that hardware should bring some serious inference capabilities to your setup. Congrats on the book signing milestone too!