LLM Embeddings Explained Like I’m 5

Intro We often hear about RAG (Retrieval-Augmented Generation) and vector databases that store embeddings, but we tend to forget what exactly embeddings are used for and how they work. In this post, we’ll break down how embeddings work – in the simplest way possible (yes, like you’re 5 🧠📎). I. What is an Embedding? Embeddings …

vLLM production-stack: LLM inference for Enterprises (part1)

Intro If you’ve played with vLLM locally, you already know how fast it can crank out tokens. But the minute you try to serve real traffic with multiple models and thousands of chats, you hit the same pain points the community kept reporting: ⚠️ Pain point What you really want High GPU bill Smarter routing + …

Safety Detectives Interview with CloudThrill CEO Kosseila Hd

Our Founder & CEO, Kosseila Hd, recently sat down with SafetyDetectives to share CloudThrill’s vision on why the real AI revolution lies in infrastructure rather than just apps, and what that means for organizations building and owning their private, scalable AI.

Recursive `.gitignore`: When Ignoring Goes Too Far

Intro You might have heard of the “recursive .gitignore symptom”, or maybe you haven’t, but if you work with Git long enough, there’s a good chance you’ll run into it. It’s one of those sneaky issues that can cause unexpected behavior in your repositories, making files disappear from Git tracking when you least expect it. Imagine …

HashiCorp Vault for Dummies: K8s Auth setup in an External Vault (WSL)

Intro This is part three of our Vault for Dummies series. After Part 2, where we set up Vault with Transit Auto-unseal, it’s time to tackle Kubernetes authentication from outside the cluster. In this post, we’ll walk through setting up Kubernetes auth with an external Vault, so your K8s workloads can securely authenticate and pull …