
Intro
Every once in a while, a new cloud platform shows up that doesn’t just offer “more compute”. It rethinks what the cloud should look like in an AI-first world. That’s what caught my attention with Nebius, a European-born cloud designed from the ground up for high-performance, AI-centric workloads. One that just closed a $17 billion GPU capacity deal with Microsoft.
In this post, I’ll take you through a hands-on tour of Nebius: their history, GPU catalog, pricing model, stack offering, and CLI setup. We’ll explore what makes Nebius worth evaluating while staying fair and unbiased, focused only on the technical aspects. By the end, you’ll know if this AI Cloud deserves a spot in your strategy.
From Russia’s Google to AI Hyperscaler

1. The Yandex Era (1997-2021): A $31B Tech Empire
Arkady Volozh founded Yandex in 1997, building Russia’s version of Google before Google even entered the market. By 2011, Yandex went public on Nasdaq with a $1.3 billion IPO, becoming a tech powerhouse with search, maps, ride-hailing, and cloud services across Russia. At its peak in late 2021, Yandex N.V. hit a $31 billion valuation.
2. War, Sanctions, Forced Evolution and the $5.4B Exit
Then everything changed. In February 2022, Nasdaq halted trading after Russia’s invasion of Ukraine triggered sanctions. In July 2024, Yandex N.V. (based in Amsterdam) sold its Russian assets to local investors for $5.4 billion, the largest corporate exit from Russia since the war began. What remained? A Finnish data center, $2 billion in cash, and thousands of battle-tested engineers who’d spent 20+ years building one of Europe’s largest tech ecosystems.
3. Rebirth as Nebius
In August 2024, the company rebranded as Nebius Group and resumed Nasdaq trading in October 2024. The twist? Nebius is one of the only public, debt-free AI infrastructure companies; most competitors are private or divisions of tech giants. The company raised $1 billion in September 2024 to fuel AI infrastructure expansion, and in December 2024 raised another $700 million from investors including Nvidia, which acquired a 0.5% stake. A year later, Nebius signed a $17.4 billion AI infrastructure deal with Microsoft.
Regions
Nebius operates a rapidly expanding global footprint with availability zones across Europe and North America:

In Nebius Cloud, a single Project-ID is uniquely tied to one specific region. Multi-region deployments require a separate project for each region.
Data Centers (GPU Capacity)
| Region/Location | Status | Capacity | Key Details |
|---|---|---|---|
| Finland (Mäntsälä) | Live | 75 MW | Flagship owned data center, tripling capacity to support up to 60,000 GPUs. |
| France (Paris-Equinix) | Live | N/A | Among first globally to deploy NVIDIA H200 GPUs. |
| Iceland (Keflavik) | Q1 2025 | 10 MW | 100% renewable (geothermal + hydro). |
| UK (Surrey-Ark) | Q4 2025 | N/A | 4,000 NVIDIA Blackwell Ultra GPUs. |
| US (Kansas City) | Live (Q1 2025) | 5 → 40 MW | Converted printing press, expandable to 35,000 GPUs, H200 + Blackwell. |
| US (New Jersey) | Summer 2025 | 300 MW | First major US-owned data center, built in phases. |
🏷️ Combined scale:
This is a combined 400+ MW of secured capacity with $1B+ annual revenue potential at full utilization, expanding to 1 GW by the end of 2026. For context, 1 million NVIDIA Blackwell GPUs require ~1-1.4 GW of power; Nebius is targeting 3x+ the scale of xAI’s Colossus 1 at full buildout.
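To put those power numbers in perspective, here is a quick back-of-the-envelope check using the article’s own ratio of ~1.4 GW per million Blackwell-class GPUs (the upper bound; the implied ~1.4 kW per GPU is an average derived from that ratio, not an official spec):

```shell
# Rough GPU count supportable by a given power budget, using the
# ~1.4 GW per 1,000,000 Blackwell GPUs figure quoted above.
awk 'BEGIN {
  mw_per_million = 1400    # MW needed per 1M GPUs (upper bound)
  secured_mw     = 400     # secured capacity today, MW
  target_mw      = 1000    # 1 GW target by end of 2026, MW
  printf "400 MW -> ~%d GPUs\n", secured_mw / mw_per_million * 1000000
  printf "1 GW   -> ~%d GPUs\n", target_mw / mw_per_million * 1000000
}'
```

So today’s secured footprint is on the order of a few hundred thousand Blackwell-class GPUs, and the 1 GW target lands around ~700K at the same power assumption.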
Free Tier & Credits
Unlike the hyperscalers, Nebius has no free trial. You must add a credit card and make a minimum $25 deposit to enable billing. However, startups can apply for their programs (from $5K to $150K in credits), though approval requires application review and may take time.
Nebius Offering

1. Nebius AI Studio
Managed inference-as-a-service with pre-configured environments, serverless one-click model deployment, and integrations for popular frameworks. Built for teams who want to deploy models fast without infrastructure overhead.
2. Nebius Cloud
Full IaaS with GPU instances, managed Kubernetes, and CLI/API access. For teams seeking control over the full stack.
3. 🔥 New: Token Factory
Token Factory is a brand-new serverless inference platform (like Groq, Fireworks, Together AI) with guaranteed uptime, zero-retention data flow, and usage-based pricing; no GPU management required.
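Serverless inference platforms in this category typically expose an OpenAI-compatible chat completions API. A minimal sketch of what a request would look like; the endpoint URL and model name below are illustrative assumptions, not confirmed Nebius values:

```shell
# Build an OpenAI-style chat completions request body.
# Model name and endpoint are hypothetical placeholders.
BODY='{"model":"meta-llama/Llama-3.3-70B-Instruct","messages":[{"role":"user","content":"Say hello in one word"}]}'
echo "$BODY"
# Actual call (requires an API key from the console):
# curl -s https://api.studio.nebius.ai/v1/chat/completions \
#   -H "Authorization: Bearer $NEBIUS_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d "$BODY"
```

The appeal is that this is the entire integration surface: no instance, driver, or autoscaling configuration, just usage-based billing per token.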

⚙️ Core Services
Beyond the usual cloud fundamentals (storage, networking, IAM, monitoring), Nebius includes:
- VM Compute: GPU/CPU virtual machines with customizable presets
- Kubernetes: Managed Kubernetes (MK8s) clusters with GPU support
- HPC: Slurm-managed clusters via Soperator (Slurm on Kubernetes)
- Databases: Managed PostgreSQL clusters
- Inference as a Service: Run open-source AI Models (Llama, Qwen, DeepSeek…) at production speed
- Fine-tuning service: Fine-tune and distill leading open-source models into specialized, domain-expert systems
- ML Tools:
  - Managed MLflow clusters for experiment tracking
  - Managed Ray clusters for distributed computing
⚡ Applications
What sets Nebius apart as an AI cloud: a rich marketplace of one-click, pre-configured AI/ML deployments. No YAML wrestling, no dependency hell.
1. Standalone VM Apps

Deploy complete AI stacks (inference servers, development environments, databases) as ready-to-use virtual machines with GPU support baked in.
📦 Popular picks include:
- JupyterLab
- vLLM, Ollama, SGLang, SkyPilot (to run, manage, and scale AI batch workloads across cloud platforms)
- ComfyUI, Open WebUI, Flowise, code-server (IDE)
- MariaDB, Qdrant, Milvus, and Apache Airflow
2. Kubernetes Applications

One-click install of AI frameworks & tools as Helm charts directly into your MK8s clusters, from model-serving platforms to MLOps tooling.
📦 Popular picks include:
- Core K8s Add-ons: NVIDIA GPU/Network/Device Operators, Prometheus, Grafana, Loki, Cert-manager, etc
- AI/ML Frameworks & Training: JupyterHub, MLflow, Ray Cluster, Kubeflow, BioNeMo Framework
- Model Serving: vLLM (including models), Ollama, Stable Diffusion WebUI, Ray Serve, ComfyUI
- Data Processing: Apache Spark, Apache Flink, CVAT, Flowise, Rasa
- MLOps & Orchestration: Argo CD, ClearML Agent, Apache Airflow, Volcano, Anyscale
- Data & Vector DBs: Qdrant, Milvus, ClickHouse, Weaviate
🚀 GPU Platforms & Presets

1. GPU Platforms
Nebius offers the following GPU platforms across regions:
| Platform | GPU | CPU | Regions | Use Case |
|---|---|---|---|---|
| gpu-b200-sxm | NVIDIA B200 NVLink | Intel Emerald Rapids | us-central1 | Frontier training/inference |
| gpu-h200-sxm | NVIDIA H200 NVLink | Intel Sapphire Rapids | eu-north1, eu-west1, us-central1 | Large-scale training |
| gpu-h100-sxm | NVIDIA H100 NVLink | Intel Sapphire Rapids | eu-north1 | High-performance training |
| gpu-l40s-a | NVIDIA L40S PCIe | Intel Ice Lake | eu-north1 | Inference workloads |
| gpu-l40s-d | NVIDIA L40S PCIe | AMD EPYC Genoa | eu-north1 | Inference workloads |
2. Available Presets
NVIDIA® B200 NVLink with Intel Emerald Rapids (gpu-b200-sxm), available in us-central1:
| Preset | GPUs | vCPUs | RAM (GiB) |
|---|---|---|---|
| 8gpu-160vcpu-1792gb | 8 | 160 | 1792 |
H100 NVLink (gpu-h100-sxm)
| Preset | GPUs | vCPUs | RAM (GB) |
|---|---|---|---|
| 1gpu-16vcpu-200gb | 1 | 16 | 200 |
| 8gpu-128vcpu-1600gb | 8 | 128 | 1600 |
L40S PCIe – Intel (gpu-l40s-a)
| Preset | GPUs | vCPUs | RAM (GB) |
|---|---|---|---|
| 1gpu-8vcpu-32gb | 1 | 8 | 32 |
| 1gpu-16vcpu-64gb | 1 | 16 | 64 |
| 1gpu-24vcpu-96gb | 1 | 24 | 96 |
| 1gpu-32vcpu-128gb | 1 | 32 | 128 |
| 1gpu-40vcpu-160gb | 1 | 40 | 160 |
L40S PCIe – AMD (gpu-l40s-d)
| Preset | GPUs | vCPUs | RAM (GB) |
|---|---|---|---|
| 1gpu-16vcpu-96gb | 1 | 16 | 96 |
| 1gpu-32vcpu-192gb | 1 | 32 | 192 |
| 1gpu-48vcpu-288gb | 1 | 48 | 288 |
| 2gpu-64vcpu-384gb | 2 | 64 | 384 |
| 2gpu-96vcpu-576gb | 2 | 96 | 576 |
| 4gpu-128vcpu-768gb | 4 | 128 | 768 |
| 4gpu-192vcpu-1152gb | 4 | 192 | 1152 |
GPU Cluster Presets (NVLink only)
| Platform | Preset | GPUs | vCPUs | RAM (GB) | Regions |
|---|---|---|---|---|---|
| gpu-b200-sxm | 8gpu-160vcpu-1792gb | 8 | 160 | 1792 | us-central1 |
| gpu-h200-sxm | 8gpu-128vcpu-1600gb | 8 | 128 | 1600 | eu-north1, eu-west1, us-central1 |
| gpu-h100-sxm | 8gpu-128vcpu-1600gb | 8 | 128 | 1600 | eu-north1 |
🏷️ Pricing & Value
Nebius uses simple hourly pricing across all GPU and CPU instances, with several services included for free.
GPU Instances (per hour)
| GPU | vCPUs | RAM (GB) | Price |
|---|---|---|---|
| NVIDIA GB200 NVL72 | * | 12800 | Contact sales |
| NVIDIA HGX B200 | 16 | 200 | $5.50 |
| NVIDIA HGX H200 | 16 | 200 | $3.50 |
| NVIDIA HGX H100 | 16 | 200 | $2.95 |
| NVIDIA L40S (AMD) | 16-192 | 96-1152 | from $1.82 |
| NVIDIA L40S (Intel) | 8-40 | 32-160 | from $1.55 |
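To get a feel for what these hourly rates mean at training scale, here is the monthly cost of a full 8-GPU H100 node at the listed $2.95 rate. This is a sketch that assumes the table’s price is per GPU-hour (the per-row vCPU/RAM figures match the 1-GPU preset, which suggests per-GPU pricing) and a 730-hour month; reserved or committed pricing would differ:

```shell
# Monthly on-demand cost of one 8x H100 node, assuming the
# listed $2.95 rate is per GPU-hour and a 730-hour month.
awk 'BEGIN {
  gpus = 8; rate = 2.95; hours = 730
  printf "8x H100: $%.2f/hr -> $%.2f/month\n", gpus * rate, gpus * rate * hours
}'
```

Under those assumptions a single 8x H100 node runs a bit over $17K/month on demand, which is the number to compare against committed-use quotes.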
CPU-Only Instances (per hour)
| CPU | vCPUs | RAM (GB) | Price |
|---|---|---|---|
| AMD EPYC Genoa | 4-64 | 16-256 | from $0.10 |
| Intel Ice Lake | 2-80 | 8-320 | from $0.05 |
Storage (per GiB/month unless noted)
| Type | Price |
|---|---|
| Shared Filesystem | $0.08 |
| WEKA Filesystem | $0.10 |
| Object Storage – Standard (volume) | $0.015 |
| Object Storage – Standard (egress) | $0.015/GiB |
| Object Storage – Enhanced (volume) | $0.11 |
| Object Storage – Enhanced (egress) | Free |
| Block Volumes (no replication) | $0.053 |
| Block Volumes (erasure coding) | $0.071 |
| Block Volumes (3x mirroring) | $0.118 |
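As a worked example of the storage rates above, here is the monthly bill for keeping a 5 TiB training dataset on the shared filesystem versus standard object storage (a sketch using only the table’s volume prices; object storage egress would be extra):

```shell
# Monthly cost of 5 TiB (5120 GiB) at the table's rates:
# shared filesystem $0.08/GiB vs. standard object storage $0.015/GiB.
awk 'BEGIN {
  gib = 5 * 1024
  printf "Shared Filesystem: $%.2f/month\n", gib * 0.08
  printf "Object Standard:   $%.2f/month\n", gib * 0.015
}'
```

The ~5x gap is the usual trade-off: keep hot training data on the filesystem, park checkpoints and datasets in object storage.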
Included Free
- ✅ Managed Kubernetes
- ✅ Managed Slurm (Soperator)
- ✅ Egress/Ingress traffic
- ✅ Public IP addresses
CLI Installation & Quick Start
1. Install the CLI
Ubuntu/Debian:
```shell
# Add Nebius repository
curl -fsSL https://storage.ai.nebius.cloud/nebius-public-keys/repository.gpg | sudo gpg --dearmor -o /usr/share/keyrings/nebius-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/nebius-archive-keyring.gpg] https://storage.ai.nebius.cloud/nebius-repo stable main" | sudo tee /etc/apt/sources.list.d/nebius-repo.list
# Install
sudo apt update && sudo apt install nebius-cli
```
Other OS: See the official installation guide for macOS, Windows, and other Linux distributions.
2. Auto completion
```shell
nebius completion bash > ~/.nebius/completion.bash.inc
echo 'if [ -f ~/.nebius/completion.bash.inc ]; then source ~/.nebius/completion.bash.inc; fi' >> ~/.bashrc
source ~/.bashrc
```
3. Configure Your Profile
```shell
# Interactive setup (recommended)
$ nebius profile create
profile name: my-profile
Set api endpoint: api.nebius.cloud
Set federation endpoint: auth.nebius.com
# Opens browser for authentication
✓ Profile "my-profile" configured and activated
```
Or configure manually:
```shell
export NB_PROFILE_NAME=<profile-name>
export NB_PROJECT_ID=<project-id>  # From console Project Settings
nebius profile create \
  --profile $NB_PROFILE_NAME --endpoint api.nebius.cloud \
  --federation-endpoint auth.nebius.com --parent-id $NB_PROJECT_ID
```
4. Verify Setup
```shell
# Check your identity
nebius iam whoami
```
Example output (truncated):
```yaml
user_profile:
  attributes:
    email:
    family_name: hd
    given_name: Clouddude
    locale: ""
    name: Clouddude
    phone_number: ""
    picture: https://avatars.githubusercontent.com/u
    preferred_username: ""
    sub: "xxxxx"
federation_info:
  federation_id: federation-e00github
  federation_user_account_id: "xxxxxx"
id: useraccount-xxx
tenants:
  - tenant_id: tenant-xx
    tenant_user_account_id: tenantuseraccount-xx
```
```shell
# List profiles
nebius profile list
# View config
cat ~/.nebius/config.yaml
```
Network
Having worked with 6+ cloud providers (including AliCloud and Civo), I still found Nebius networking the part that caught me off guard.
Default pool: Each project has a default-network-pool with a region-specific CIDR (e.g., 10.x.0.0/16).
The key differences:
- CIDR blocks are defined at the pool level, not the VPC
- Networks reference pools (no CIDR at network level)
- Subnets allocate their CIDRs from those pools, provided they don’t overlap and stay within the pool’s range
How it works:
```shell
# 1. Pool defines the CIDR range (e.g., 10.0.0.0/16)
nebius vpc pool create --cidr 10.0.0.0/16
# 2. Network references the pool (no CIDR at network level)
nebius vpc network create --name my-network --ipv4-private-pool-id <pool_id>
# 3. Subnets carve out non-overlapping blocks from the pool
nebius vpc subnet create --network-id <network_id> --cidr 10.0.1.0/24  # Must be within pool range
nebius vpc subnet create --network-id <network_id> --cidr 10.0.2.0/24  # Different block, no overlap
```
Quick Kubernetes Cluster Deploy
```shell
# Get subnet ID
export NB_SUBNET_ID=$(nebius vpc subnet list --format json | jq -r '.items[0].metadata.id')
# Create cluster
export NB_CLUSTER_ID=$(nebius mk8s cluster create \
  --name quickstart-cluster \
  --control-plane-subnet-id $NB_SUBNET_ID \
  '{"spec": {"control_plane": {"endpoints": {"public_endpoint": {}}}}}' \
  --format json | jq -r '.metadata.id')
# Create node group (2 nodes, cpu-e2 preset)
nebius mk8s node-group create \
  --name quickstart-nodes \
  --parent-id $NB_CLUSTER_ID \
  --fixed-node-count 2 \
  --template-resources-platform "cpu-e2" \
  --template-resources-preset "2vcpu-8gb" \
  --template-boot-disk-type network_ssd \
  --template-boot-disk-size-bytes 137438953472  # 128 GiB
# Get kubeconfig
nebius mk8s cluster get-credentials --id $NB_CLUSTER_ID --external
kubectl cluster-info
```
⚡ Custom IP management for Kubernetes:
You must ensure your pool has enough CIDR blocks for all cluster requirements (nodes, services, pods), or you’ll hit “Unable to allocate CIDR from the pool” errors. Check available blocks with:
```shell
nebius vpc subnet get --id <subnet_ID>
# Look for: spec.ipv4_private_pools.pools.cidrs
```
Coming Up Next
Now that we’ve explored what makes Nebius stand out, from its GPU-focused architecture to its rich developer experience, it’s time to take things one step further.
In Part 2, we’ll show how to deploy and configure GPU resources using Terraform, from authentication and instance provisioning to network setup. This step brings us closer to integrating Nebius into the official vLLM Production Stack, making open-source inference more portable and cloud-agnostic.
Stay tuned for Part 2 ⚡

Run AI Your Way – In Your Cloud
Want full control over your AI backend? The CloudThrill vLLM Private Inference POC is still open, but not forever.
📢 Secure your spot (only a few left), apply now!
Run AI assistants, RAG, or internal models on an AI backend privately in your cloud:
- ✅ No external APIs
- ✅ No vendor lock-in
- ✅ Total data control