HashiCorp Vault for Dummies: Transit Auto-Unseal Across 2 WSL Nodes

Intro This is part two of our Vault for Dummies series. After setting up a Vault server with Raft and TLS in Part 1, we’ll now configure it to auto-unseal at startup using another Vault server as a Transit engine. Perfect if you want to simulate a cluster across nodes in WSL. This guide walks you …

vLLM for beginners: Key Features & Performance Optimization (Part II)

Intro In Part 1 of our vLLM for beginners series, we covered the fundamentals: core concepts and terminology behind vLLM’s architecture. In Part 2, we go deeper into what makes vLLM excel at performance: features like PagedAttention, attention backends, prefill & decode management, and more. 💡 This series is about building a strong foundation in vLLM: understanding how …

Terraform Pipelines for Dummies Part 3: GitHub Actions Azure Deploy with OIDC

Intro Did you know that over 23 million secrets were publicly exposed on GitHub in 2024 alone, and that 70% of the secrets leaked in 2022 are still valid? This is further evidence that leaked secrets are still the number one threat to your business. The worst thing to do is make it easy …

vLLM for beginners: The Fundamentals

Intro Last year, I dove deep into Ollama inference, which led me to build and speak about Ollama Kubernetes deployments, along with rich documentation in my ollama_lab repo and a quantization article. This year’s Cloudthrill focus is vLLM inference, which is a next-level beast from a model-serving standpoint. Exploring multiple inference options is time-intensive …

Ollama deployment on Civo K8s Cluster with Terraform

Intro Tired of sharing your IP & sensitive data with OpenAI? What if you could run your own private AI chatbot powered by local inference & LLMs, with 100% data privacy, all inside a Kubernetes cluster? Today we’ll show you how to deploy an end-to-end LLM inference setup on a Civo Cloud Talos K8s cluster with …