LLM observability: a trend on the rise?
KubeCon EU 2025 is over, and what an event!! I was out for Wasm I/O in Barcelona and then KubeCon. I wasn't able to make it to Cloud Native Rejekts (which is always a dope event). Let's get straight into the events!
Wasm I/O
Wasm I/O is where the builders are. Over the past three years, I've seen people advocating for the technology, showcasing its progress, and sharing production use cases to demonstrate how Wasm is being used in real-world scenarios. This year, the big question was: "Is Wasm ready for enterprise?"
There were a lot of great discussions, but one key takeaway for me was that the focus is now shifting towards “building the app that matters using Wasm,” rather than just pitching Wasm as a technology.
Although I expected the conference to be a bit larger this time, and noticed slightly less interest, I’m still backing this awesome piece of tech and will keep you posted on what happens next.
Playlist of the sessions - https://www.youtube.com/playlist?list=PLP3xGl7Eb-4OtFH1tcQm6u7_LRED7-3rg
KubeCon EU 2025 (technically KubeCon London)
This year KubeCon EU was one of the biggest KubeCons yet, with over 12,500 attendees, and that is huge!! I felt it throughout my time at the conference. I like to visit KubeCons because I get to meet people, see what's new in the cloud native space and what people are innovating, and catch up with old friends. I was able to do a bit of everything!
The major themes in this KubeCon's keynotes were LLMs and observability. The AI and LLM discussions signaled a major technological shift, driving demand for specialized infrastructure, networking, and observability solutions capable of handling these demanding workloads. This AI focus coexisted with a profound emphasis on security as a non-negotiable foundation. The security narrative has evolved, shifting strongly towards proactive software supply chain integrity (signing, provenance, verification) and Zero Trust principles, moving beyond reactive measures.
With all these advancements, platform engineering has solidified its role as the main strategy for managing this complexity, prioritizing developer productivity through abstraction and standardization. Observability is also maturing, moving towards unified, multi-modal insights with OpenTelemetry emerging as the standard, and its scope expanding to encompass proactive optimization, including cost management. Technologies like Wasm are finding their footing in production, offering secure and efficient extensibility.
A bit more on observability: at KubeCon, the opening keynotes were all about putting the right data into LLMs to get meaningful responses, and about observability for LLMs and how to achieve it. eBPF is on the rise, now not only for network tracing but for capturing metrics too.
Finally, the inclusion of discussions around digital sovereignty and geopolitics indicates an awareness that technology choices are increasingly influenced by broader global factors, with open source positioned as a strategic enabler. KubeCon EU 2025 showcased an ecosystem moving decisively beyond basic orchestration towards building intelligent, resilient, secure, and efficient platforms ready for the challenges and opportunities ahead.
Apart from this, I enjoyed my time at the vCluster booth, where I did three demos, and I also had a full-house talk with Paco.
More with AI?
So much has happened in the AI ecosystem that I don't know how long this newsletter should be.
Did you try the Ghibli style? The OpenAI image generation has gone to a crazy next level!
Gemini 2.5 with Deep Research is dope. I tried it, and the level of research it does, along with Google Search and sources, is impressive. It's like reading through 20-30 webpages to get the real deal! Loving the deep research. Over the past month I have been studying MCP, and the course is coming out on Kubesimplify this week. I also dived into AI agents with Solomon Hykes (who previously founded Docker) and Kyle from Dagger. Google announced a bunch of things at Google Cloud Next.
This, and so much more! OpenAI released a new GPT-4.1 model, which is more developer-centric and scored high in software benchmark rankings.
Below is how you can build AI agents using Dagger. The MCP course is dropping this week on Kubesimplify, and in April I plan to complete the CKS series as well. I am also working on a Kubernetes crash course that will soon be published on Kubesimplify. So follow for all the updates!
Before we get to Awesome reads I want to share my thoughts on managed Kubernetes:
The Next Wave for Managed Kubernetes: Seamless, Secure AI Workloads
As AI becomes the engine of innovation across industries, the demands on cloud infrastructure are rapidly evolving. Kubernetes is one of the de facto standards for running workloads, and AI is no exception; the past four KubeCons have shown how the CNCF and the cloud native community are moving to support AI workloads.
Managed Kubernetes services have made it easier than ever to deploy and scale applications, but running AI workloads, especially those requiring GPUs, still presents significant challenges for developers and infrastructure teams alike.
The real opportunity for cloud providers now is to deliver a Kubernetes experience where AI workloads run seamlessly, securely, and efficiently, with zero friction for developers.
What Does This Look Like in Practice?
- Seamless GPU Support: Developers should be able to request and use GPU resources as easily as they request CPU or memory, without wrestling with low-level configuration or hardware management.
- Workload Isolation and Data Privacy: With multiple teams and customers sharing the same GPU infrastructure, robust isolation is critical. Solutions like multi-tenancy, namespace-level security, and even confidential computing can ensure that every application's data remains private and protected.
- Maximized GPU Utilization: Idle GPUs are wasted potential. Advanced scheduling, Kueue, NVIDIA operator support (k8s-dra-driver-gpu), fractional GPU allocation, and intelligent workload placement can drive higher utilization, lower costs, and better ROI for both providers and customers.
- Self-Service and Automation: Infrastructure teams shouldn't have to build custom solutions for every new AI project. A truly managed experience means automated provisioning, monitoring, and scaling, freeing up engineers to focus on innovation, not infrastructure.
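To make the "GPU as easy as CPU" point concrete, here is a minimal sketch of what that developer experience looks like today: a pod spec requesting one GPU through the NVIDIA device plugin's extended resource. The pod name and image tag are illustrative placeholders; the cluster is assumed to have the device plugin (or DRA driver) installed.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-demo                 # hypothetical example pod
spec:
  restartPolicy: Never
  containers:
    - name: cuda-smoke-test
      image: nvidia/cuda:12.4.0-base-ubuntu22.04   # any CUDA-capable image
      command: ["nvidia-smi"]                      # print visible GPUs and exit
      resources:
        limits:
          nvidia.com/gpu: 1      # requested just like cpu/memory once the device plugin exposes it
```

The point is that `nvidia.com/gpu` sits in the same `resources` block as CPU and memory; the friction today is everything around it (drivers, node pools, sharing), which is exactly what the items above address.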
Why This Matters Now
Organizations are under pressure to deliver AI-powered products faster than ever. If cloud providers can offer a platform that eliminates the complexity of running AI workloads on Kubernetes, while guaranteeing security, performance, and developer happiness, they'll become the go-to choice for the next generation of digital leaders.
The Path Forward
The leaders in this space will be those who:
- Invest in next-gen GPU scheduling and isolation technologies
- Build developer-centric APIs and tools for AI/ML workflows
- Share real-world success stories and best practices to inspire confidence
This is the hour of need. The cloud providers who rise to this challenge, delivering a seamless, secure, and scalable AI experience on Kubernetes, will define the future of cloud-native AI.
What are your thoughts? Any providers already providing this experience?
Awesome Reads
Kubernetes v1.33 sneak peek - Kubernetes v1.33, releasing on April 23, 2025, will deprecate the original Endpoints API, remove kube-proxy version info from node status, and drop host network support for Windows pods, while introducing user namespaces for Linux pods as a default security feature and enabling in-place resource resizing for pods, among other improvements. The update also brings enhancements like ordered namespace deletion, better device status reporting for ResourceClaims, and more robust indexed job management.
Introducing kube-scheduler-simulator - kube-scheduler-simulator is a tool that lets users and developers examine and test the Kubernetes scheduler's behavior and custom configurations in simulated or production-like environments. This simulator provides insights into scheduling decisions and facilitates safer testing of new scheduler versions.
Components vs. Containers: Fight? - WebAssembly (Wasm) components offer a lightweight, portable, and efficient alternative to containers for running applications, especially in edge, multi-cloud, and resource-constrained environments. Rather than replacing containers, components when paired with orchestrators like wasmCloud can complement Kubernetes by enhancing application efficiency, flexibility, and scalability across diverse infrastructure.
GitHub App bootstrap with Flux Operator - This post shows how Flux Operator can be used to bootstrap Kubernetes clusters using the GitHub App authentication method introduced in Flux 2.5.0.
LLM Embeddings Explained: A Visual and Intuitive Guide - Embeddings are the core mechanism that transforms raw text into numerical vectors, enabling LLMs to understand and reason about language in a high-dimensional space. This article explains the evolution, implementation, and visualization of embeddings, with interactive examples and code using the DeepSeek-R1-Distill-Qwen-1.5B model, all designed for a focused, hands-on learning experience.
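To make the vector-space idea tangible, here is a toy sketch using hand-picked 3-D vectors (not real model outputs; a model like the one in the article emits hundreds or thousands of dimensions). It shows the core operation behind embedding comparisons: cosine similarity, where related words point in similar directions.

```python
import math

# Hand-picked toy "embeddings"; real models emit far higher-dimensional vectors.
embeddings = {
    "cat":   [0.9, 0.8, 0.1],
    "dog":   [0.8, 0.9, 0.2],
    "plane": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(round(cosine_similarity(embeddings["cat"], embeddings["dog"]), 2))    # high: ~0.99
print(round(cosine_similarity(embeddings["cat"], embeddings["plane"]), 2))  # low:  ~0.30
```

Nearest-neighbor search over exactly this kind of score is what powers semantic search and retrieval in LLM applications.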
DNS Explained: From Basics to Building My Own DNS Server - This blog is a nice primer to fundamentals of DNS and building your own DNS server.
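As a taste of what building your own DNS server involves, here is a minimal sketch of the RFC 1035 wire format: constructing a raw A-record query packet by hand. The hostname and query ID are arbitrary; a real resolver would send these bytes over UDP port 53 and parse the response the same way, field by field.

```python
import struct

def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a raw DNS A-record query packet per RFC 1035."""
    # Header (12 bytes): ID, flags (0x0100 = recursion desired),
    # QDCOUNT=1 question, and zero answer/authority/additional records.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each dot-separated label is length-prefixed, then a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN, the Internet class).
    question = qname + struct.pack(">HH", 1, 1)
    return header + question

packet = build_dns_query("example.com")
print(packet.hex())
```

The length-prefixed labels (no dots on the wire!) are usually the first surprise when people implement DNS themselves.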
Getting Started with Google A2A: A Hands-on Tutorial for the Agent2Agent Protocol - While the AI agent ecosystem is booming, it's also fragmented with various frameworks like LangGraph, CrewAI, and Google ADK. This tutorial explores how Google's Agent2Agent (A2A) protocol enables cross-framework collaboration, walking through a practical demo setup and explaining the core concepts behind seamless agent communication.
Introducing Docker Model Runner: A Better Way to Build and Run GenAI Models Locally - Docker has introduced Docker Model Runner, a new feature in Docker Desktop 4.40 that simplifies running and testing AI models locally, integrating seamlessly into existing development workflows. By supporting local inference with GPU acceleration, OCI-based model packaging, and partnerships with key AI and tooling providers, Model Runner enables faster, more efficient GenAI development without the typical complexity of AI toolchains.
Awesome X posts
https://x.com/mattpocockuk/status/1904208550127186104
https://x.com/SaiyamPathak/status/1910383275392328004
https://www.reddit.com/r/kubernetes/comments/1jwisw9/who_is_running_close_to_1k_pods_per_node/
https://x.com/SaiyamPathak/status/1912076346139853135
There was so much to pack into a single edition. I hope you liked it!