vLLM Production Stack on GCP GKE with Terraform 🧑🏼‍🚀

Intro Welcome back to the Terraform vLLM Production Stack series! After covering AWS EKS and Azure AKS, today we’re deploying the vLLM production-stack on Google Cloud GKE with the same Terraform approach. This guide shows you how to deploy a production-ready LLM serving environment on Google Cloud, with GCP-specific optimizations including Dataplane V2 (Cilium eBPF), VPC-native …

vLLM Production Stack on Azure AKS with Terraform 🧑🏼‍🚀

Intro The vLLM Production Stack is designed to work across any cloud provider with Kubernetes. After covering AWS EKS, today we’re deploying the vLLM production-stack on Azure AKS with the same Terraform approach. This guide shows you how to deploy the same production-ready LLM serving environment on Azure, with Azure-specific optimizations. We’ll cover network architecture, certificate …

vLLM Production Stack on Amazon EKS with Terraform 🧑🏼‍🚀

Intro Deploying vLLM manually is fine for a lab, but running it in production means dealing with Kubernetes, autoscaling, GPU orchestration, and observability. That’s where the vLLM Production Stack comes in – a Terraform-based blueprint that delivers production-ready LLM serving with enterprise-grade foundations. In this post, we’ll deploy it on Amazon EKS, covering everything from …

Zero to Civo: Deploy Talos Kubernetes with Terraform (incl Grafana & Prometheus)

Intro If you’re looking to spin up a modern, secure Kubernetes cluster in Civo Cloud with full observability, this guide is for you. We’ll walk through deploying a Civo Talos K8s cluster using Terraform, and layer in Let’s Encrypt TLS certs, plus Prometheus and Grafana for monitoring. Whether you’re building a quick lab, testing a workload, or setting …

Ollama deployment on Civo K8s Cluster with Terraform

Intro Tired of sharing your IP & sensitive data with OpenAI? What if you could run your own private AI chatbot powered by local inference & LLMs, with 100% data privacy, all inside a Kubernetes cluster? Today we’ll show you how to deploy an end-to-end LLM inference setup on a Civo Cloud Talos K8s cluster with …