
Intro
If you’re looking to spin up a modern, secure Kubernetes cluster in Civo Cloud with full observability, this guide is for you. We’ll walk through deploying a Civo Talos K8s cluster using Terraform, then layer in Let's Encrypt TLS certificates plus Prometheus and Grafana for monitoring. Whether you’re building a quick lab, testing a workload, or setting up for production, this setup gives you a rock-solid foundation in minutes.
💡You can find the code under the terraform-provider-civo directory of the repo ➡️, along with many Terraform templates for other clouds (AWS, Azure, GCP, OCI, AliCloud).
What’s in the box?📦
While Civo provides a K3s Terraform template, I found it too basic for my needs, so I built a more complete Talos version and decided to share it 🚀.
The deployment includes 19 resources in total:
✅ Civo Kubernetes Cluster (Talos)
✅ Traefik Ingress Controller (via Helm)
✅ Cert-Manager for TLS encryption
✅ Self-Signed ClusterIssuer setup
✅ LetsEncrypt http01 ClusterIssuer setup
✅ Grafana with Letsencrypt TLS certificate
✅ Prometheus
✅ Simple setup leveraging helm charts— easy to customize
Prerequisites
Before you begin, ensure you have the following:
🛠 Terraform (>=1.5)
🛠 kubectl (>=1.25)
🛠 Helm (>=3.10)
☁ Civo Cloud Account with an API key
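A quick sanity check before diving in; the commands below only print the installed versions so you can confirm they meet the minimums above:
$ terraform version
$ kubectl version --client
$ helm version --short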
Getting started
Civo defaults to lightweight k3s clusters, which work great for many use cases. But for this project, I wanted something closer to a production-grade, on-prem setup. So I went with Talos Linux: minimal, secure, and purpose-built for K8s.

1️⃣Clone the repository
The Civo Talos deployment is located under the terraform-provider-civo/k8s/talos directory (see below):
🌍 Repo: https://github.com/brokedba/terraform-examples
This repo is also a one-stop shop for Terraform deployments on other clouds (AWS/OCI/Azure/Alibaba) 🚀
terraform-examples/terraform-provider-civo/
terraform-examples/terraform-provider-civo/k8s/talos
- Navigate to the talos directory and initialize Terraform:
$ git clone https://github.com/brokedba/terraform-examples.git
📂..
$ cd terraform-examples/terraform-provider-civo/k8s/talos
$ terraform init
2️⃣ Set Up Environment Variables
Use an env-vars file to export your Terraform variables (TF_VAR_*) and your Civo API key. Replace the placeholders with your own values:
export CIVO_TOKEN="YOUR_CIVO_API_KEY"
export TF_VAR_region="NYC1"
export TF_VAR_compute_type="standard"
export TF_VAR_cluster_node_count=2
export TF_VAR_cluster_name_prefix="cloudthrill"
export TF_VAR_kubernetes_version="talos-v1.5.0"
export TF_VAR_label="k8s-pool"
export TF_VAR_kubernetes_namespace="my-namespace"
### NETWORK ####
export TF_VAR_network_name="default"
export TF_VAR_network_cidr="10.20.0.0/16"
# Monitoring
export TF_VAR_grafana_enabled="true"
export TF_VAR_prometheus_enabled="true"
export TF_VAR_app_name="grafana"
export TF_VAR_metrics_server_enabled="true"
export TF_VAR_ingress_email_issuer="no-reply@example.cloud"
# cluster_node_size="g4s.kube.medium"
- Load the variables into your shell: before running Terraform, source the env-vars file:
$ source env-vars
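Optionally, confirm the variables made it into your environment before planning (the checks below are just an illustration and print nothing sensitive except variable values you set yourself):
$ env | grep TF_VAR_ | sort
$ [ -n "$CIVO_TOKEN" ] && echo "CIVO_TOKEN is set"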
3️⃣ Run Terraform deployment:
You can now safely run terraform plan and terraform apply. The deployment creates 19 resources in total, including a local kubeconfig file.
$ terraform plan
$ terraform apply
...
...
...
### Final output
Apply complete! Resources: 19 added, 0 changed, 0 destroyed.
Outputs:
cluster_installed_applications = tolist([])
grafana_admin_password = <sensitive>
grafana_url = "grafan.d402f4e6.nip.io" <---- using nip.io domain
ingress_controller_load_balancer_hostname = "100ca564--eb7c6995cbd3.lb.civo.com"
kubernetes_cluster_endpoint = "https://212.2.240.98:6443"
kubernetes_cluster_id = "fc83580a-a320-40ff-a33d-13351050dd67"
kubernetes_cluster_name = "cloudthrill2-cluster"
kubernetes_cluster_ready = true
kubernetes_cluster_status = "ACTIVE"
kubernetes_cluster_version = "talos-v1.5.0"
master_ip = "212.2.240.98"
network_id = "33161f60-ed86-4f45-903b-94b7959fc991"
Full Plan
Terraform will perform the following actions:
# data.civo_kubernetes_cluster.cluster will be read during apply
# (depends on a resource or a module with changes pending)
<= data "civo_kubernetes_cluster" "cluster" {
+ api_endpoint = (known after apply)
+ applications = (known after apply)
+ cni = (known after apply)
+ created_at = (known after apply)
+ dns_entry = (known after apply)
+ installed_applications = (known after apply)
+ kubeconfig = (known after apply)
+ kubernetes_version = (known after apply)
+ master_ip = (known after apply)
+ name = "cloudthrill2-cluster"
+ num_target_nodes = (known after apply)
+ pools = (known after apply)
+ ready = (known after apply)
+ status = (known after apply)
+ tags = (known after apply)
+ target_nodes_size = (known after apply)
}
# data.kubernetes_secret.grafana[0] will be read during apply
# (config refers to values not yet known)
<= data "kubernetes_secret" "grafana" {
+ data = (sensitive value)
+ id = (known after apply)
+ immutable = (known after apply)
+ type = (known after apply)
+ metadata {
+ generation = (known after apply)
+ name = "grafana"
+ namespace = (known after apply)
+ resource_version = (known after apply)
+ uid = (known after apply)
}
}
# data.kubernetes_service.traefik will be read during apply
# (depends on a resource or a module with changes pending)
<= data "kubernetes_service" "traefik" {
+ id = (known after apply)
+ spec = (known after apply)
+ status = (known after apply)
+ metadata {
+ generation = (known after apply)
+ name = "traefik"
+ namespace = "traefik"
+ resource_version = (known after apply)
+ uid = (known after apply)
}
}
# civo_firewall.firewall will be created
+ resource "civo_firewall" "firewall" {
+ create_default_rules = false
+ id = (known after apply)
+ name = "cloudthrill2-firewall"
+ network_id = "33161f60-ed86-4f45-903b-94b7959fc991"
+ region = (known after apply)
+ egress_rule {
+ action = "allow"
+ cidr = [
+ "0.0.0.0/0",
]
+ id = (known after apply)
+ label = "all"
+ port_range = "1-65535"
+ protocol = "tcp"
}
+ ingress_rule {
+ action = "allow"
+ cidr = [
+ "0.0.0.0/0",
]
+ id = (known after apply)
+ label = "kubernetes-api-server"
+ port_range = "6443"
+ protocol = "tcp"
}
}
# civo_firewall.firewall-ingress will be created
+ resource "civo_firewall" "firewall-ingress" {
+ create_default_rules = false
+ id = (known after apply)
+ name = "cloudthrill2-firewall-ingress"
+ network_id = "33161f60-ed86-4f45-903b-94b7959fc991"
+ region = (known after apply)
+ ingress_rule {
+ action = "allow"
+ cidr = [
+ "0.0.0.0/0",
]
+ id = (known after apply)
+ label = "web"
+ port_range = "80"
+ protocol = "tcp"
}
+ ingress_rule {
+ action = "allow"
+ cidr = [
+ "0.0.0.0/0",
]
+ id = (known after apply)
+ label = "websecure"
+ port_range = "443"
+ protocol = "tcp"
}
}
# civo_kubernetes_cluster.cluster will be created
+ resource "civo_kubernetes_cluster" "cluster" {
+ api_endpoint = (known after apply)
+ cluster_type = "talos"
+ cni = "flannel"
+ created_at = (known after apply)
+ dns_entry = (known after apply)
+ firewall_id = (known after apply)
+ id = (known after apply)
+ installed_applications = (known after apply)
+ kubeconfig = (sensitive value)
+ kubernetes_version = "talos-v1.5.0"
+ master_ip = (known after apply)
+ name = "cloudthrill2-cluster"
+ network_id = "33161f60-ed86-4f45-903b-94b7959fc991"
+ num_target_nodes = (known after apply)
+ ready = (known after apply)
+ region = "NYC1"
+ status = (known after apply)
+ target_nodes_size = (known after apply)
+ write_kubeconfig = true
+ pools {
+ instance_names = (known after apply)
+ label = "k8s-pool"
+ node_count = 2
+ public_ip_node_pool = (known after apply)
+ size = "g4s.kube.large"
}
+ timeouts {
+ create = "5m"
}
}
# helm_release.cert_manager will be created
+ resource "helm_release" "cert_manager" {
+ atomic = false
+ chart = "cert-manager"
+ cleanup_on_fail = false
+ create_namespace = true
+ dependency_update = false
+ disable_crd_hooks = false
+ disable_openapi_validation = false
+ disable_webhooks = false
+ force_update = false
+ id = (known after apply)
+ lint = false
+ manifest = (known after apply)
+ max_history = 0
+ metadata = (known after apply)
+ name = "cert-manager"
+ namespace = "cert-manager"
+ pass_credentials = false
+ recreate_pods = false
+ render_subchart_notes = true
+ replace = false
+ repository = "https://charts.jetstack.io"
+ reset_values = false
+ reuse_values = false
+ skip_crds = false
+ status = "deployed"
+ timeout = 300
+ verify = false
+ version = "v1.17.2"
+ wait = true
+ wait_for_jobs = false
+ set {
+ name = "global.leaderElection.namespace"
+ value = "cert-manager"
}
+ set {
+ name = "installCRDs"
+ value = "true"
}
+ set {
+ name = "webhook.timeoutSeconds"
+ value = "30"
}
}
# helm_release.grafana[0] will be created
+ resource "helm_release" "grafana" {
+ atomic = false
+ chart = "grafana"
+ cleanup_on_fail = false
+ create_namespace = false
+ dependency_update = false
+ disable_crd_hooks = false
+ disable_openapi_validation = false
+ disable_webhooks = false
+ force_update = false
+ id = (known after apply)
+ lint = false
+ manifest = (known after apply)
+ max_history = 0
+ metadata = (known after apply)
+ name = "grafana"
+ namespace = (known after apply)
+ pass_credentials = false
+ recreate_pods = false
+ render_subchart_notes = true
+ replace = false
+ repository = "https://grafana.github.io/helm-charts"
+ reset_values = false
+ reuse_values = false
+ skip_crds = false
+ status = "deployed"
+ timeout = 300
+ values = (known after apply)
+ verify = false
+ version = "8.4.8"
+ wait = false
+ wait_for_jobs = false
+ set {
+ name = "grafana\\.ini.server.root_url"
+ type = "string"
+ value = "%(protocol)s://%(domain)s:%(http_port)s/grafana"
}
}
# helm_release.metrics_server[0] will be created
+ resource "helm_release" "metrics_server" {
+ atomic = false
+ chart = "metrics-server"
+ cleanup_on_fail = false
+ create_namespace = false
+ dependency_update = false
+ disable_crd_hooks = false
+ disable_openapi_validation = false
+ disable_webhooks = false
+ force_update = false
+ id = (known after apply)
+ lint = false
+ manifest = (known after apply)
+ max_history = 0
+ metadata = (known after apply)
+ name = "metrics-server"
+ namespace = "kube-system"
+ pass_credentials = false
+ recreate_pods = false
+ render_subchart_notes = true
+ replace = false
+ repository = "https://kubernetes-sigs.github.io/metrics-server"
+ reset_values = false
+ reuse_values = false
+ skip_crds = false
+ status = "deployed"
+ timeout = 300
+ verify = false
+ version = "3.12.1"
+ wait = false
+ wait_for_jobs = false
+ set {
+ name = "args"
+ value = "{--kubelet-insecure-tls,--kubelet-preferred-address-types=InternalIP}"
}
}
# helm_release.prometheus[0] will be created
+ resource "helm_release" "prometheus" {
+ atomic = false
+ chart = "prometheus"
+ cleanup_on_fail = false
+ create_namespace = false
+ dependency_update = false
+ disable_crd_hooks = false
+ disable_openapi_validation = false
+ disable_webhooks = false
+ force_update = false
+ id = (known after apply)
+ lint = false
+ manifest = (known after apply)
+ max_history = 0
+ metadata = (known after apply)
+ name = "prometheus"
+ namespace = (known after apply)
+ pass_credentials = false
+ recreate_pods = false
+ render_subchart_notes = true
+ replace = false
+ repository = "https://prometheus-community.github.io/helm-charts"
+ reset_values = false
+ reuse_values = false
+ skip_crds = false
+ status = "deployed"
+ timeout = 300
+ values = [
+ <<-EOT
extraScrapeConfigs: |
- job_name: 'traefik'
metrics_path: /metrics
scrape_interval: 10s
static_configs:
- targets:
- traefik.traefik.svc.cluster.local:9100
nodeExporter:
hostRootfs: false
containerSecurityContext:
privileged: true
allowPrivilegeEscalation: true
runAsUser: 0
runAsGroup: 0
EOT,
]
+ verify = false
+ version = "27.11.0"
+ wait = false
+ wait_for_jobs = false
}
# helm_release.traefik_ingress will be created
+ resource "helm_release" "traefik_ingress" {
+ atomic = false
+ chart = "traefik"
+ cleanup_on_fail = false
+ create_namespace = true
+ dependency_update = false
+ disable_crd_hooks = false
+ disable_openapi_validation = false
+ disable_webhooks = false
+ force_update = false
+ id = (known after apply)
+ lint = false
+ manifest = (known after apply)
+ max_history = 0
+ metadata = (known after apply)
+ name = "traefik"
+ namespace = "traefik"
+ pass_credentials = false
+ recreate_pods = false
+ render_subchart_notes = true
+ replace = false
+ repository = "https://helm.traefik.io/traefik"
+ reset_values = false
+ reuse_values = false
+ skip_crds = false
+ status = "deployed"
+ timeout = 900
+ verify = false
+ version = "35.2.0"
+ wait = true
+ wait_for_jobs = false
+ set {
+ name = "metrics.prometheus.service.enabled"
+ value = "true"
}
+ set {
+ name = "ports.metrics.expose.default"
+ value = "true"
}
+ set {
+ name = "service.annotations.kubernetes\\.civo\\.com/firewall-id"
+ type = "string"
+ value = (known after apply)
}
}
# kubectl_manifest.app_certificate will be created
+ resource "kubectl_manifest" "app_certificate" {
+ api_version = (known after apply)
+ apply_only = false
+ force_conflicts = false
+ force_new = false
+ id = (known after apply)
+ kind = (known after apply)
+ live_manifest_incluster = (sensitive value)
+ live_uid = (known after apply)
+ name = (known after apply)
+ namespace = (known after apply)
+ server_side_apply = false
+ uid = (known after apply)
+ validate_schema = true
+ wait_for_rollout = true
+ yaml_body = (sensitive value)
+ yaml_body_parsed = (known after apply)
+ yaml_incluster = (sensitive value)
}
# kubectl_manifest.grafana_replace_path[0] will be created
+ resource "kubectl_manifest" "grafana_replace_path" {
+ api_version = (known after apply)
+ apply_only = false
+ force_conflicts = false
+ force_new = false
+ id = (known after apply)
+ kind = (known after apply)
+ live_manifest_incluster = (sensitive value)
+ live_uid = (known after apply)
+ name = (known after apply)
+ namespace = (known after apply)
+ server_side_apply = false
+ uid = (known after apply)
+ validate_schema = true
+ wait_for_rollout = true
+ yaml_body = (sensitive value)
+ yaml_body_parsed = (known after apply)
+ yaml_incluster = (sensitive value)
}
# kubectl_manifest.letsencrypt_prod_issuer will be created
+ resource "kubectl_manifest" "letsencrypt_prod_issuer" {
+ api_version = "cert-manager.io/v1"
+ apply_only = false
+ force_conflicts = false
+ force_new = false
+ id = (known after apply)
+ kind = "ClusterIssuer"
+ live_manifest_incluster = (sensitive value)
+ live_uid = (known after apply)
+ name = "letsencrypt-prod"
+ namespace = (known after apply)
+ server_side_apply = false
+ uid = (known after apply)
+ validate_schema = true
+ wait_for_rollout = true
+ yaml_body = (sensitive value)
+ yaml_body_parsed = <<-EOT
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
email: no-reply@example.cloud
privateKeySecretRef:
name: letsencrypt-prod
server: https://acme-v02.api.letsencrypt.org/directory
solvers:
- http01:
ingress:
class: traefik
EOT
+ yaml_incluster = (sensitive value)
}
# kubectl_manifest.self_signed_cluster_issuer will be created
+ resource "kubectl_manifest" "self_signed_cluster_issuer" {
+ api_version = "cert-manager.io/v1"
+ apply_only = false
+ force_conflicts = false
+ force_new = false
+ id = (known after apply)
+ kind = "ClusterIssuer"
+ live_manifest_incluster = (sensitive value)
+ live_uid = (known after apply)
+ name = "self-signed-cluster-issuer"
+ namespace = (known after apply)
+ server_side_apply = false
+ uid = (known after apply)
+ validate_schema = true
+ wait_for_rollout = true
+ yaml_body = (sensitive value)
+ yaml_body_parsed = <<-EOT
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: self-signed-cluster-issuer
spec:
selfSigned: {}
EOT
+ yaml_incluster = (sensitive value)
}
# kubernetes_deployment.nginx will be created
+ resource "kubernetes_deployment" "nginx" {
+ id = (known after apply)
+ wait_for_rollout = true
+ metadata {
+ generation = (known after apply)
+ name = "nginx"
+ namespace = "default"
+ resource_version = (known after apply)
+ uid = (known after apply)
}
+ spec {
+ min_ready_seconds = 0
+ paused = false
+ progress_deadline_seconds = 600
+ replicas = "1"
+ revision_history_limit = 10
+ selector {
+ match_labels = {
+ "nginx" = "nginx"
}
}
+ template {
+ metadata {
+ generation = (known after apply)
+ labels = {
+ "nginx" = "nginx"
}
+ name = (known after apply)
+ resource_version = (known after apply)
+ uid = (known after apply)
}
+ spec {
+ automount_service_account_token = true
+ dns_policy = "ClusterFirst"
+ enable_service_links = true
+ host_ipc = false
+ host_network = false
+ host_pid = false
+ hostname = (known after apply)
+ node_name = (known after apply)
+ restart_policy = "Always"
+ scheduler_name = (known after apply)
+ service_account_name = (known after apply)
+ share_process_namespace = false
+ termination_grace_period_seconds = 30
+ container {
+ image = "nginx:1.21.6"
+ image_pull_policy = (known after apply)
+ name = "nginx"
+ stdin = false
+ stdin_once = false
+ termination_message_path = "/dev/termination-log"
+ termination_message_policy = (known after apply)
+ tty = false
+ liveness_probe {
+ failure_threshold = 3
+ initial_delay_seconds = 3
+ period_seconds = 3
+ success_threshold = 1
+ timeout_seconds = 1
+ http_get {
+ path = "/"
+ port = "80"
+ scheme = "HTTP"
+ http_header {
+ name = "X-Custom-Header"
+ value = "Awesome"
}
}
}
+ resources {
+ limits = {
+ "cpu" = "0.5"
+ "memory" = "512Mi"
}
+ requests = {
+ "cpu" = "250m"
+ "memory" = "50Mi"
}
}
}
}
}
}
}
# kubernetes_ingress_v1.grafana[0] will be created
+ resource "kubernetes_ingress_v1" "grafana" {
+ id = (known after apply)
+ status = (known after apply)
+ wait_for_load_balancer = true
+ metadata {
+ annotations = (known after apply)
+ generation = (known after apply)
+ name = "grafana"
+ namespace = (known after apply)
+ resource_version = (known after apply)
+ uid = (known after apply)
}
+ spec {
+ ingress_class_name = "traefik"
+ tls {
+ hosts = (known after apply)
+ secret_name = "grafana-letsencrypt-prod-tls"
}
}
}
# kubernetes_namespace.cluster_tools[0] will be created
+ resource "kubernetes_namespace" "cluster_tools" {
+ id = (known after apply)
+ wait_for_default_service_account = false
+ metadata {
+ generation = (known after apply)
+ labels = {
+ "pod-security.kubernetes.io/enforce" = "privileged"
}
+ name = "cluster-tools"
+ resource_version = (known after apply)
+ uid = (known after apply)
}
}
# kubernetes_namespace.landing_ns will be created
+ resource "kubernetes_namespace" "landing_ns" {
+ id = (known after apply)
+ wait_for_default_service_account = false
+ metadata {
+ generation = (known after apply)
+ name = "my-namespace"
+ resource_version = (known after apply)
+ uid = (known after apply)
}
}
# kubernetes_namespace.telemtry_ns will be created
+ resource "kubernetes_namespace" "telemtry_ns" {
+ id = (known after apply)
+ wait_for_default_service_account = false
+ metadata {
+ generation = (known after apply)
+ name = "telemetry-ns"
+ resource_version = (known after apply)
+ uid = (known after apply)
}
}
# local_file.cluster-config will be created
+ resource "local_file" "cluster-config" {
+ content = (sensitive value)
+ content_base64sha256 = (known after apply)
+ content_base64sha512 = (known after apply)
+ content_md5 = (known after apply)
+ content_sha1 = (known after apply)
+ content_sha256 = (known after apply)
+ content_sha512 = (known after apply)
+ directory_permission = "0755"
+ file_permission = "0600"
+ filename = "./kubeconfig"
+ id = (known after apply)
}
# null_resource.wait_for_letsencrypt_prod_issuer_ready will be created
+ resource "null_resource" "wait_for_letsencrypt_prod_issuer_ready" {
+ id = (known after apply)
+ triggers = {
+ "issuer_id" = (known after apply)
}
}
Plan: 19 to add, 0 to change, 0 to destroy.
4️⃣Log into Grafana online
Upon deployment, the Grafana ingress URL is displayed in the output (grafana_url). Append /grafana to that URL to open the Grafana login page (see below), e.g. http://grafan.d402f4e6.nip.io/grafana


- Username: admin
- Password: retrieve it through kubectl or from the tfstate (see below)
cat terraform.tfstate | grep admin-password
"grafana_admin_password" {
"admin-password" : "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX "
Monitor the cluster
Kubernetes pod monitoring via Prometheus

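No ingress is created for Prometheus itself, so if you want to poke at it directly a port-forward works. The service name prometheus-server is the chart's default; the namespace placeholder should be whatever the first command returns:
$ kubectl get svc -A | grep prometheus-server
$ kubectl port-forward svc/prometheus-server -n <namespace-from-above> 9090:80
# then open http://localhost:9090/targets to confirm the traefik scrape job is up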
TLS (through K9s)🚀
The Let's Encrypt issuer sends a certificate request when the Grafana ingress is created, which is then validated through the ACME HTTP-01 challenge and approved.


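If you prefer plain kubectl over K9s, the same certificate flow can be followed through cert-manager's CRDs; a few read-only checks (namespace assumed to be cluster-tools, as in the ingress ID shown later):
$ kubectl get clusterissuers
$ kubectl get certificates,certificaterequests -A
$ kubectl describe certificate -n cluster-tools
# look for Ready=True and the grafana-letsencrypt-prod-tls secret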
5️⃣ Destroying the Infrastructure 🚧
To delete everything, just run the command below (note: sometimes you need to run it twice, as the load balancer can be slow to die):
terraform destroy -auto-approve
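If the first destroy errors out on the load balancer, a quick way to see what is left before re-running it:
$ terraform state list          # anything still listed was not destroyed
$ terraform destroy -auto-approve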
TLS/Prometheus troubleshooting:
Certificate failure (error 400) indicates the issuer or cert-manager was not ready when the Grafana ingress came up.
➡️Solution: redeploy the Grafana ingress resource using terraform apply -replace.
$ terraform apply -replace=kubernetes_ingress_v1.grafana[0]
.. kubernetes_ingress_v1.grafana[0]: Destroying... [id=cluster-tools/grafana]
.. kubernetes_ingress_v1.grafana[0]: Destruction complete after 0s
.. kubernetes_ingress_v1.grafana[0]: Creation complete after 0s
[id=cluster-tools/grafana]
Apply complete! Resources: 1 added, 0 changed, 1 destroyed.
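After the replace, the new certificate should go Ready within a minute or two; a quick way to confirm (again assuming the cluster-tools namespace):
$ kubectl get certificate -n cluster-tools
$ kubectl describe certificate -n cluster-tools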
Up to 10,000 certificates can be issued per registered domain (here we used nip.io) every 7 days.
➡️Solution: when this limit is reached, you need to wait (or test against the Let's Encrypt staging environment, sketched after the error below).
Failed to create Order: 429 urn:ietf:params:acme:error:rateLimited:
too many certificates (10000) already issued for "nip.io"
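One workaround, not part of the original template, is to test against the Let's Encrypt staging environment, which has far more generous rate limits (staging certificates are not browser-trusted). A hedged sketch, mirroring the production ClusterIssuer from the plan output:
$ kubectl apply -f - <<'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: no-reply@example.cloud
    privateKeySecretRef:
      name: letsencrypt-staging
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    solvers:
    - http01:
        ingress:
          class: traefik
EOF
While testing, point the ingress's cert-manager.io/cluster-issuer annotation at letsencrypt-staging, then switch back to letsencrypt-prod.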
Extra privileges are needed for the node-exporter pods to scrape hostNetwork/hostPID metrics.
➡️Solution: I had to label the cluster-tools namespace with the privileged Pod Security level (other options include Kyverno).
resource "kubernetes_namespace" "cluster_tools" {
metadata {
name = var.cluster_tools_namespace
labels = {
"pod-security.kubernetes.io/enforce" = "privileged"
# }
}
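You can verify the label (and that the node-exporter pods are now scheduled) with a couple of quick checks; the selector below is the standard label applied by the node-exporter subchart:
$ kubectl get ns cluster-tools --show-labels
$ kubectl get pods -A -l app.kubernetes.io/name=prometheus-node-exporter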
Next Steps 🚀
- Add External DNS integration
- Enable ArgoCD
- Use the kube-prometheus-stack Helm chart
🤗Feel free to contribute by sending suggestions & PRs to this repository.
Conclusion
I hope you found this guide useful! If your organization is exploring private Kubernetes deployments in any cloud, please reach out to the Cloudthrill team🤝, we’d love to help.
🙋🏻♀️If you like this content, please subscribe to our blog newsletter ❤️.