
Intro
Every once in a while, a new cloud platform shows up that doesn’t just offer “more compute”. It rethinks what the cloud should look like in an AI-first world. That’s what caught my attention with Nebius, a European-born cloud designed from the ground up for high-performance, AI-centric workloads. One that just closed a $17 billion GPU capacity deal with Microsoft.
In this post, I’ll take you through a hands-on tour of Nebius: their history, GPU catalog, pricing model, stack offering, and CLI setup. We’ll explore what makes Nebius worth evaluating while staying fair and unbiased, focused only on the technical aspects. By the end, you’ll know if this AI Cloud deserves a spot in your strategy.
From Russia’s Google to AI Hyperscaler

1. The Yandex Era (1997-2021): A $31B Tech Empire
Arkady Volozh founded Yandex in 1997, building Russia’s version of Google before Google even entered the market. By 2011, Yandex went public on Nasdaq with a $1.3 billion IPO, becoming a tech powerhouse with search, maps, ride-hailing, and cloud services across Russia. At its peak in late 2021, Yandex N.V. hit a $31 billion valuation.
2. War, Sanctions, Forced Evolution and the $5.4B Exit
Then everything changed. In February 2022, Nasdaq halted trading after Russia’s invasion of Ukraine triggered sanctions. In July 2024, Yandex N.V. (based in Amsterdam) sold its Russian assets to local investors for $5.4 billion, the largest corporate exit from Russia since the war began. What remained? A Finnish data center, $2 billion in cash, and thousands of battle-tested engineers who’d spent 20+ years building one of Europe’s largest tech ecosystems.
3. Rebirth as Nebius
In August 2024, the company rebranded as Nebius Group and resumed Nasdaq trading in October 2024. The twist? Nebius is one of the only public, debt-free AI infrastructure companies; most competitors are private or divisions of tech giants. The company raised $1 billion in September 2024 to fuel AI infrastructure expansion, and in December 2024 raised another $700 million from investors including Nvidia, which acquired a 0.5% stake. A year later, Nebius signed a $17.4 billion AI infrastructure deal with Microsoft.
Regions
Nebius operates a rapidly expanding global footprint with availability zones across Europe and North America:

In Nebius Cloud, a single Project-ID is uniquely tied to one specific region. Multi-region deployments require a separate project for each region.
Data Centers (GPU Capacity)
| Region/Location | Status | Capacity | Key Details |
|---|---|---|---|
| Finland (Mäntsälä) | Live | 75 MW | Flagship owned data center, tripling capacity to support up to 60,000 GPUs. |
| France (Paris-Equinix) | Live | N/A | Among first globally to deploy NVIDIA H200 GPUs. |
| Iceland (Keflavik) | Q1 2025 | 10 MW | 100% renewable (geothermal + hydro). |
| UK (Surrey-Ark) | Q4 2025 | N/A | 4,000 NVIDIA Blackwell Ultra GPUs. |
| US (Kansas City) | Live (Q1 2025) | 5 → 40 MW | Converted printing press, expandable to 35,000 GPUs, H200 + Blackwell. |
| US (New Jersey) | Summer 2025 | 300 MW | First major US-owned data center, built in phases. |
🏷️ Combined scale:
This is a combined 400+ MW of secured capacity with $1B+ annual revenue potential at full utilization, expanding to 1 GW by the end of 2026. For context, 1 million NVIDIA Blackwell GPUs require ~1-1.4 GW of power; Nebius is targeting 3x+ the scale of xAI’s Colossus 1 at full buildout.
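To put those power numbers in perspective, here is a quick back-of-the-envelope check using the article’s own ratio of ~1.4 GW per million Blackwell-class GPUs (the upper bound; the implied ~1.4 kW per GPU is an average derived from that ratio, not an official spec):

```shell
# Rough GPU count supportable by a given power budget, using the
# ~1.4 GW per 1,000,000 Blackwell GPUs figure quoted above.
awk 'BEGIN {
  mw_per_million = 1400    # MW needed per 1M GPUs (upper bound)
  secured_mw     = 400     # secured capacity today, MW
  target_mw      = 1000    # 1 GW target by end of 2026, MW
  printf "400 MW -> ~%d GPUs\n", secured_mw / mw_per_million * 1000000
  printf "1 GW   -> ~%d GPUs\n", target_mw / mw_per_million * 1000000
}'
```

So today’s secured footprint is on the order of a few hundred thousand Blackwell-class GPUs, and the 1 GW target lands around ~700K at the same power assumption.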
Free Tier & Credits
Unlike the hyperscalers, Nebius has no free trial. You must add a credit card and make a minimum $25 deposit to enable billing. However, startups can apply for their programs (from $5K to $150K in credits), though approval requires application review and may take time.
Nebius Offering

1. Nebius AI Studio
Managed inference-as-a-service with pre-configured environments, serverless one-click model deployment, and integrations for popular frameworks. Built for teams who want to deploy models fast without infrastructure overhead.
2. Nebius Cloud
Full IaaS with GPU instances, managed Kubernetes, and CLI/API access. For teams seeking control over the full stack.
3. 🔥 New: Token Factory
Token Factory is a brand-new serverless inference platform (like Groq, Fireworks, Together AI) with guaranteed uptime, zero-retention data flow, and usage-based pricing; no GPU management required.
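Serverless inference platforms in this category typically expose an OpenAI-compatible chat completions API. A minimal sketch of what a request would look like; the endpoint URL and model name below are illustrative assumptions, not confirmed Nebius values:

```shell
# Build an OpenAI-style chat completions request body.
# Model name and endpoint are hypothetical placeholders.
BODY='{"model":"meta-llama/Llama-3.3-70B-Instruct","messages":[{"role":"user","content":"Say hello in one word"}]}'
echo "$BODY"
# Actual call (requires an API key from the console):
# curl -s https://api.studio.nebius.ai/v1/chat/completions \
#   -H "Authorization: Bearer $NEBIUS_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d "$BODY"
```

The appeal is that this is the entire integration surface: no instance, driver, or autoscaling configuration, just usage-based billing per token.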

⚙️ Core Services
Beyond the usual cloud fundamentals (storage, networking, IAM, monitoring), Nebius includes:
- VM Compute: GPU/CPU virtual machines with customizable presets
- Kubernetes: Managed Kubernetes (MK8s) clusters with GPU support
- HPC: Slurm-managed clusters via Soperator (Slurm on Kubernetes)
- Databases: Managed PostgreSQL clusters
- Inference as a Service: Run open-source AI Models (Llama, Qwen, DeepSeek…) at production speed
- Fine-tuning service: Fine-tune and distill leading open-source models into specialized, domain-expert systems
- ML Tools:
  - Managed MLflow clusters for experiment tracking
  - Managed Ray clusters for distributed computing
⚡ Applications
What sets Nebius apart as an AI cloud: a rich marketplace of one-click, pre-configured AI/ML deployments. No YAML wrestling, no dependency hell.
1. Standalone VM Apps

Deploy complete AI stacks (inference servers, development environments, databases) as ready-to-use virtual machines with GPU support baked in.
📦 Popular picks include:
- JupyterLab
- vLLM, Ollama, SGLang, SkyPilot (to run, manage, and scale AI batch workloads across cloud platforms)
- ComfyUI, Open WebUI, Flowise, code-server (IDE)
- MariaDB, Qdrant, Milvus, and Apache Airflow
2. Kubernetes Applications

One-click install of AI frameworks & tools as Helm charts directly into your MK8s clusters, from model-serving platforms to MLOps tooling.
📦 Popular picks include:
- Core K8s Add-ons: NVIDIA GPU/Network/Device Operators, Prometheus, Grafana, Loki, Cert-manager, etc
- AI/ML Frameworks & Training: JupyterHub, MLflow, Ray Cluster, Kubeflow, BioNeMo Framework
- Model Serving: vLLM (including models), Ollama, Stable Diffusion WebUI, Ray Serve, ComfyUI
- Data Processing: Apache Spark, Apache Flink, CVAT, Flowise, Rasa
- MLOps & Orchestration: Argo CD, ClearML Agent, Apache Airflow, Volcano, Anyscale
- Data & Vector DBs: Qdrant, Milvus, ClickHouse, Weaviate
🚀 GPU Platforms & Presets

1. GPU Platforms
Nebius offers the following GPU platforms across regions:
| Platform | GPU | CPU | Regions | Use Case |
|---|---|---|---|---|
| gpu-b200-sxm | NVIDIA B200 NVLink | Intel Emerald Rapids | us-central1 | Frontier training/inference |
| gpu-h200-sxm | NVIDIA H200 NVLink | Intel Sapphire Rapids | eu-north1, eu-west1, us-central1 | Large-scale training |
| gpu-h100-sxm | NVIDIA H100 NVLink | Intel Sapphire Rapids | eu-north1 | High-performance training |
| gpu-l40s-a | NVIDIA L40S PCIe | Intel Ice Lake | eu-north1 | Inference workloads |
| gpu-l40s-d | NVIDIA L40S PCIe | AMD EPYC Genoa | eu-north1 | Inference workloads |
2. Available Presets
NVIDIA® B200 NVLink with Intel Emerald Rapids (gpu-b200-sxm), available in us-central1:
| Preset | GPUs | vCPUs | RAM (GiB) |
|---|---|---|---|
| 8gpu-160vcpu-1792gb | 8 | 160 | 1792 |
H100 NVLink (gpu-h100-sxm)
| Preset | GPUs | vCPUs | RAM (GB) |
|---|---|---|---|
| 1gpu-16vcpu-200gb | 1 | 16 | 200 |
| 8gpu-128vcpu-1600gb | 8 | 128 | 1600 |
L40S PCIe – Intel (gpu-l40s-a)
| Preset | GPUs | vCPUs | RAM (GB) |
|---|---|---|---|
| 1gpu-8vcpu-32gb | 1 | 8 | 32 |
| 1gpu-16vcpu-64gb | 1 | 16 | 64 |
| 1gpu-24vcpu-96gb | 1 | 24 | 96 |
| 1gpu-32vcpu-128gb | 1 | 32 | 128 |
| 1gpu-40vcpu-160gb | 1 | 40 | 160 |
L40S PCIe – AMD (gpu-l40s-d)
| Preset | GPUs | vCPUs | RAM (GB) |
|---|---|---|---|
| 1gpu-16vcpu-96gb | 1 | 16 | 96 |
| 1gpu-32vcpu-192gb | 1 | 32 | 192 |
| 1gpu-48vcpu-288gb | 1 | 48 | 288 |
| 2gpu-64vcpu-384gb | 2 | 64 | 384 |
| 2gpu-96vcpu-576gb | 2 | 96 | 576 |
| 4gpu-128vcpu-768gb | 4 | 128 | 768 |
| 4gpu-192vcpu-1152gb | 4 | 192 | 1152 |
GPU Cluster Presets (NVLink only)
| Platform | Preset | GPUs | vCPUs | RAM (GB) | Regions |
|---|---|---|---|---|---|
| gpu-b200-sxm | 8gpu-160vcpu-1792gb | 8 | 160 | 1792 | us-central1 |
| gpu-h200-sxm | 8gpu-128vcpu-1600gb | 8 | 128 | 1600 | eu-north1, eu-west1, us-central1 |
| gpu-h100-sxm | 8gpu-128vcpu-1600gb | 8 | 128 | 1600 | eu-north1 |
🏷️ Pricing & Value
Nebius uses simple hourly pricing across all GPU and CPU instances, with several services included for free.
GPU Instances (per hour)
| GPU | vCPUs | RAM (GB) | Price |
|---|---|---|---|
| NVIDIA GB200 NVL72 | * | 12800 | Contact sales |
| NVIDIA HGX B200 | 16 | 200 | $5.50 |
| NVIDIA HGX H200 | 16 | 200 | $3.50 |
| NVIDIA HGX H100 | 16 | 200 | $2.95 |
| NVIDIA L40S (AMD) | 16-192 | 96-1152 | from $1.82 |
| NVIDIA L40S (Intel) | 8-40 | 32-160 | from $1.55 |
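To get a feel for what these hourly rates mean at training scale, here is the monthly cost of a full 8-GPU H100 node at the listed $2.95 rate. This is a sketch that assumes the table’s price is per GPU-hour (the per-row vCPU/RAM figures match the 1-GPU preset, which suggests per-GPU pricing) and a 730-hour month; reserved or committed pricing would differ:

```shell
# Monthly on-demand cost of one 8x H100 node, assuming the
# listed $2.95 rate is per GPU-hour and a 730-hour month.
awk 'BEGIN {
  gpus = 8; rate = 2.95; hours = 730
  printf "8x H100: $%.2f/hr -> $%.2f/month\n", gpus * rate, gpus * rate * hours
}'
```

Under those assumptions a single 8x H100 node runs a bit over $17K/month on demand, which is the number to compare against committed-use quotes.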
CPU-Only Instances (per hour)
| CPU | vCPUs | RAM (GB) | Price |
|---|---|---|---|
| AMD EPYC Genoa | 4-64 | 16-256 | from $0.10 |
| Intel Ice Lake | 2-80 | 8-320 | from $0.05 |
Storage (per GiB/month unless noted)
| Type | Price |
|---|---|
| Shared Filesystem | $0.08 |
| WEKA Filesystem | $0.10 |
| Object Storage – Standard (volume) | $0.015 |
| Object Storage – Standard (egress) | $0.015/GiB |
| Object Storage – Enhanced (volume) | $0.11 |
| Object Storage – Enhanced (egress) | Free |
| Block Volumes (no replication) | $0.053 |
| Block Volumes (erasure coding) | $0.071 |
| Block Volumes (3x mirroring) | $0.118 |
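As a worked example of the storage rates above, here is the monthly bill for keeping a 5 TiB training dataset on the shared filesystem versus standard object storage (a sketch using only the table’s volume prices; object storage egress would be extra):

```shell
# Monthly cost of 5 TiB (5120 GiB) at the table's rates:
# shared filesystem $0.08/GiB vs. standard object storage $0.015/GiB.
awk 'BEGIN {
  gib = 5 * 1024
  printf "Shared Filesystem: $%.2f/month\n", gib * 0.08
  printf "Object Standard:   $%.2f/month\n", gib * 0.015
}'
```

The ~5x gap is the usual trade-off: keep hot training data on the filesystem, park checkpoints and datasets in object storage.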
Included Free
- ✅ Managed Kubernetes
- ✅ Managed Slurm (Soperator)
- ✅ Egress/Ingress traffic
- ✅ Public IP addresses
CLI Installation & Quick Start
1. Install the CLI
Ubuntu/Debian:
```shell
# Add Nebius repository
curl -fsSL https://storage.ai.nebius.cloud/nebius-public-keys/repository.gpg | sudo gpg --dearmor -o /usr/share/keyrings/nebius-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/nebius-archive-keyring.gpg] https://storage.ai.nebius.cloud/nebius-repo stable main" | sudo tee /etc/apt/sources.list.d/nebius-repo.list
# Install
sudo apt update && sudo apt install nebius-cli
```
Other OS: See the official installation guide for macOS, Windows, and other Linux distributions.
2. Auto completion
```shell
nebius completion bash > ~/.nebius/completion.bash.inc
echo 'if [ -f ~/.nebius/completion.bash.inc ]; then source ~/.nebius/completion.bash.inc; fi' >> ~/.bashrc
source ~/.bashrc
```
3. Configure Your Profile
```shell
# Interactive setup (recommended)
$ nebius profile create
profile name: my-profile
Set api endpoint: api.nebius.cloud
Set federation endpoint: auth.nebius.com
# Opens browser for authentication
✓ Profile "my-profile" configured and activated
```
Or configure manually:
```shell
export NB_PROFILE_NAME=<profile-name>
export NB_PROJECT_ID=<project-id>  # From console Project Settings
nebius profile create \
  --profile $NB_PROFILE_NAME --endpoint api.nebius.cloud \
  --federation-endpoint auth.nebius.com --parent-id $NB_PROJECT_ID
```
4. Verify Setup
```shell
# Check your identity
nebius iam whoami
```
Example output (truncated):
```yaml
user_profile:
  attributes:
    email:
    family_name: hd
    given_name: Clouddude
    locale: ""
    name: Clouddude
    phone_number: ""
    picture: https://avatars.githubusercontent.com/u
    preferred_username: ""
    sub: "xxxxx"
federation_info:
  federation_id: federation-e00github
  federation_user_account_id: "xxxxxx"
id: useraccount-xxx
tenants:
  - tenant_id: tenant-xx
    tenant_user_account_id: tenantuseraccount-xx
```
```shell
# List profiles
nebius profile list
# View config
cat ~/.nebius/config.yaml
```
Network
Having worked with 6+ cloud providers (including AliCloud and Civo), I still found Nebius networking the part that caught me off guard.
Default pool: Each project has a default-network-pool with a region-specific CIDR (e.g., 10.x.0.0/16).
The key differences:
- CIDR blocks are defined at the pool level, not the VPC
- Networks reference pools (no CIDR at network level)
- Subnets allocate their CIDRs from those pools, provided they don’t overlap and stay within the pool’s range
How it works:
```shell
# 1. Pool defines the CIDR range (e.g., 10.0.0.0/16)
nebius vpc pool create --cidr 10.0.0.0/16
# 2. Network references the pool (no CIDR at network level)
nebius vpc network create --name my-network --ipv4-private-pool-id <pool_id>
# 3. Subnets carve out non-overlapping blocks from the pool
nebius vpc subnet create --network-id <network_id> --cidr 10.0.1.0/24  # Must be within pool range
nebius vpc subnet create --network-id <network_id> --cidr 10.0.2.0/24  # Different block, no overlap
```
Quick Kubernetes Cluster Deploy
```shell
# Get subnet ID
export NB_SUBNET_ID=$(nebius vpc subnet list --format json | jq -r '.items[0].metadata.id')
# Create cluster
export NB_CLUSTER_ID=$(nebius mk8s cluster create \
  --name quickstart-cluster \
  --control-plane-subnet-id $NB_SUBNET_ID \
  '{"spec": {"control_plane": {"endpoints": {"public_endpoint": {}}}}}' \
  --format json | jq -r '.metadata.id')
# Create node group (2 nodes, cpu-e2 preset)
nebius mk8s node-group create \
  --name quickstart-nodes \
  --parent-id $NB_CLUSTER_ID \
  --fixed-node-count 2 \
  --template-resources-platform "cpu-e2" \
  --template-resources-preset "2vcpu-8gb" \
  --template-boot-disk-type network_ssd \
  --template-boot-disk-size-bytes 137438953472  # 128 GiB
# Get kubeconfig
nebius mk8s cluster get-credentials --id $NB_CLUSTER_ID --external
kubectl cluster-info
```
⚡ Custom IP management for Kubernetes:
You must ensure your pool has enough CIDR blocks for all cluster requirements (nodes, services, pods), or you’ll hit “Unable to allocate CIDR from the pool” errors. Check available blocks with:
```shell
nebius vpc subnet get --id <subnet_ID>
# Look for: spec.ipv4_private_pools.pools.cidrs
```
Coming Up Next
Now that we’ve explored what makes Nebius stand out, from its GPU-focused architecture to its rich developer experience, it’s time to take things one step further.
In Part 2, we’ll show how to deploy and configure GPU resources using Terraform, from authentication and instance provisioning to network setup. This step brings us closer to integrating Nebius into the official vLLM Production Stack, making open-source inference more portable and cloud-agnostic.
Stay tuned for Part 2 ⚡

Run AI Your Way – In Your Cloud
Want full control over your AI backend? The CloudThrill vLLM Private Inference POC is still open, but not forever.
📢 Secure your spot (only a few left), apply now!
Run AI assistants, RAG, or internal models on an AI backend privately in your cloud:
- ✅ No external APIs
- ✅ No vendor lock-in
- ✅ Total data control