Meet Nebius: The Cloud Built for the AI Era

Intro

Every once in a while, a new cloud platform shows up that doesn't just offer "more compute". It rethinks what the cloud should look like in an AI-first world. That's what caught my attention with Nebius, a European-born cloud designed from the ground up for high-performance, AI-centric workloads, and one that just closed a $17 billion GPU capacity deal with Microsoft.

In this post, I’ll take you through a hands-on tour of Nebius: their history, GPU catalog, pricing model, stack offering, and CLI setup. We’ll explore what makes Nebius worth evaluating while staying fair and unbiased, focused only on the technical aspects. By the end, you’ll know if this AI Cloud deserves a spot in your strategy.

From Russia’s Google to AI Hyperscaler

1. The Yandex Era (1997–2021): A $31B Tech Empire

Arkady Volozh founded Yandex in 1997 as Russia's answer to Google, before Google even entered the market. By 2011, Yandex went public on Nasdaq with a $1.3 billion IPO, becoming a tech powerhouse with search, maps, ride-hailing, and cloud services across Russia. At its peak in late 2021, Yandex N.V. hit a $31 billion valuation.

2. War, Sanctions, Forced Evolution and the $5.4B Exit

Then everything changed. In February 2022, Nasdaq halted trading in Yandex shares after Russia's invasion of Ukraine triggered sanctions. In July 2024, Yandex N.V. (based in Amsterdam) sold its Russian assets to local investors for $5.4 billion, the largest corporate exit from Russia since the war began. What remained? A Finnish data center, $2 billion in cash, and thousands of battle-tested engineers who'd spent 20+ years building one of Europe's largest tech ecosystems.

3. Rebirth as Nebius

In August 2024, the company rebranded as Nebius Group and resumed Nasdaq trading in October 2024. The twist? Nebius is one of the only public, debt-free AI infrastructure companies; most competitors are private or divisions of tech giants. The company raised $1 billion in September 2024 to fuel AI infrastructure expansion, and in December 2024 raised another $700 million from investors including Nvidia, which acquired a 0.5% stake. A year later, Nebius signed a $17.4 billion AI infrastructure deal with Microsoft.

Nebius didn’t start from zero. It inherited proven infra DNA and immediately redirected it toward GPU-dense AI workloads.

Regions

Nebius operates a rapidly expanding global footprint with availability zones across Europe and North America.

Note:
In Nebius Cloud, a single Project-ID is uniquely tied to one specific region. Multi-region deployments therefore require a separate project for each region.
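Because of that one-project-per-region rule, a practical pattern is to keep one CLI profile per regional project. A minimal bash sketch (the project IDs are placeholders, and the flags mirror the `nebius profile create` command covered later in this post):

```shell
# Sketch: one CLI profile per regional project, since a Nebius project is bound
# to exactly one region. The project IDs below are placeholders -- substitute
# the real IDs from your console's Project Settings.
for entry in "eu-north1:project-e00aaaaaaaa" "us-central1:project-e00bbbbbbbb"; do
  region="${entry%%:*}"
  project="${entry#*:}"
  # Printed as a dry run; remove 'echo' to actually create each profile.
  echo nebius profile create \
    --profile "$region" \
    --endpoint api.nebius.cloud \
    --federation-endpoint auth.nebius.com \
    --parent-id "$project"
done
```

Switching regions then becomes a matter of activating the matching profile rather than reconfiguring a single one.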

Data Centers (GPU Capacity)

Region/Location Status Capacity Key Details
Finland (Mäntsälä) Live 75 MW Flagship owned data center, tripling capacity to support up to 60,000 GPUs.
France (Paris-Equinix) Live N/A Among the first globally to deploy NVIDIA H200 GPUs.
Iceland (Keflavik) Q1 2025 10 MW 100% renewable (geothermal + hydro).
UK (Surrey-Ark) Q4 2025 N/A 4,000 NVIDIA Blackwell Ultra GPUs.
US (Kansas City) Live (Q1 2025) 5 → 40 MW Converted printing press, expandable to 35,000 GPUs, H200 + Blackwell.
US (New Jersey) Summer 2025 300 MW First major US-owned data center, built in phases.

๐Ÿท๏ธCombined scale:

That's a combined 400+ MW of secured capacity, with $1B+ in annual revenue potential at full utilization, expanding to 1 GW by the end of 2026. For context, 1 million NVIDIA Blackwell GPUs require roughly 1–1.4 GW of power, so Nebius is targeting 3x+ the scale of xAI's Colossus 1 at full buildout.
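A quick back-of-the-envelope check on those numbers, assuming roughly 1.2 kW of all-in power per Blackwell GPU (the midpoint of the 1–1.4 GW per million GPUs figure above):

```shell
# Rough GPU headroom at a given power budget, assuming ~1.2 kW all-in per GPU
# (the midpoint of the 1-1.4 kW/GPU range implied above).
watts_per_gpu=1200

gpus_at_400mw=$(( 400000000 / watts_per_gpu ))    # secured capacity today
gpus_at_1gw=$(( 1000000000 / watts_per_gpu ))     # target by end of 2026

echo "~${gpus_at_400mw} GPUs at 400 MW"
echo "~${gpus_at_1gw} GPUs at 1 GW"
```

So the 1 GW target corresponds to high hundreds of thousands of Blackwell-class GPUs under this assumption, which is what makes the Colossus 1 comparison plausible.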

Free Tier & Credits

Unlike the hyperscalers, Nebius offers no free trial: you must add a credit card and make a minimum $25 deposit to enable billing. However, startups can apply for its credit programs (from $5K to $150K in credits), though approval requires an application review and may take time.

Nebius Offering

1. Nebius AI Studio

Managed inference-as-a-service with pre-configured environments, serverless one-click model deployment, and integrations for popular frameworks. Built for teams who want to deploy models fast without infrastructure overhead.

2. Nebius Cloud

Full IaaS with GPU instances, managed Kubernetes, and CLI/API access. For teams seeking control over the full stack.

Both share the same underlying GPU infrastructure; the difference is that one is fully managed while the other gives you full control.

3. 💥 New: Token Factory

Token Factory is a brand-new serverless inference platform (in the same space as Groq, Fireworks, and Together AI) with guaranteed uptime, zero-retention data flow, and usage-based pricing: no GPU management required.

Note: At the time of writing, it's not yet confirmed whether Token Factory completely replaces the AI Studio service or merges with it.

⚙️ Core Services

Beyond the usual cloud fundamentals (storage, networking, IAM, monitoring), Nebius includes:

⚡ Applications

What sets Nebius apart as an AI cloud: a rich marketplace of one-click, pre-configured AI/ML deployments. No YAML wrestling, no dependency hell.

1. Standalone VM Apps

Deploy complete AI stacks (inference servers, development environments, databases) as ready-to-use virtual machines with GPU support baked in.

📦 Popular picks include:
  • JupyterLab
  • vLLM, Ollama, SGLang, SkyPilot (to run, manage, and scale AI batch workloads across cloud platforms)
  • ComfyUI, Open WebUI, Flowise, code-server (IDE)
  • MariaDB, Qdrant, Milvus, and Apache Airflow

2. Kubernetes Applications

One-click install of AI frameworks and tools as Helm charts directly into your MK8s clusters, from model-serving platforms to MLOps tooling.

📦 Popular picks include:
  • Core K8s Add-ons: NVIDIA GPU/Network/Device Operators, Prometheus, Grafana, Loki, cert-manager, etc.
  • AI/ML Frameworks & Training: JupyterHub, MLflow, Ray Cluster, Kubeflow, BioNeMo Framework
  • Model Serving: vLLM (including models), Ollama, Stable Diffusion WebUI, Ray Serve, ComfyUI
  • Data Processing: Apache Spark, Apache Flink, CVAT, Flowise, Rasa
  • MLOps & Orchestration: Argo CD, ClearML Agent, Apache Airflow, Volcano, Anyscale
  • Data & Vector DBs: Qdrant, Milvus, ClickHouse, Weaviate
Note: The current vLLM K8s add-on doesn't seem to be an official Helm chart; however, we're working on a Terraform vLLM production-stack automation for Nebius (stay tuned).

🔋 GPU Platforms & Presets

1. GPU Platforms

Nebius offers the following GPU platforms across regions:

Platform GPU CPU Regions Use Case
gpu-b200-sxm NVIDIA B200 NVLink Intel Emerald Rapids us-central1 Frontier training/inference
gpu-h200-sxm NVIDIA H200 NVLink Intel Sapphire Rapids eu-north1, eu-west1, us-central1 Large-scale training
gpu-h100-sxm NVIDIA H100 NVLink Intel Sapphire Rapids eu-north1 High-performance training
gpu-l40s-a NVIDIA L40S PCIe Intel Ice Lake eu-north1 Inference workloads
gpu-l40s-d NVIDIA L40S PCIe AMD EPYC Genoa eu-north1 Inference workloads

2. Available Presets

NVIDIA® B200 NVLink with Intel Emerald Rapids (gpu-b200-sxm), available in us-central1:

Preset GPUs vCPUs RAM (GiB)
8gpu-160vcpu-1792gb 8 160 1792

H100 NVLink (gpu-h100-sxm)

Preset GPUs vCPUs RAM (GiB)
1gpu-16vcpu-200gb 1 16 200
8gpu-128vcpu-1600gb 8 128 1600

L40S PCIe – Intel (gpu-l40s-a)

Preset GPUs vCPUs RAM (GiB)
1gpu-8vcpu-32gb 1 8 32
1gpu-16vcpu-64gb 1 16 64
1gpu-24vcpu-96gb 1 24 96
1gpu-32vcpu-128gb 1 32 128
1gpu-40vcpu-160gb 1 40 160

L40S PCIe – AMD (gpu-l40s-d)

Preset GPUs vCPUs RAM (GiB)
1gpu-16vcpu-96gb 1 16 96
1gpu-32vcpu-192gb 1 32 192
1gpu-48vcpu-288gb 1 48 288
2gpu-64vcpu-384gb 2 64 384
2gpu-96vcpu-576gb 2 96 576
4gpu-128vcpu-768gb 4 128 768
4gpu-192vcpu-1152gb 4 192 1152

GPU Cluster Presets (NVLink only)

Platform Preset GPUs vCPUs RAM (GiB) Regions
gpu-b200-sxm 8gpu-160vcpu-1792gb 8 160 1792 us-central1
gpu-h200-sxm 8gpu-128vcpu-1600gb 8 128 1600 eu-north1, eu-west1, us-central1
gpu-h100-sxm 8gpu-128vcpu-1600gb 8 128 1600 eu-north1
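The preset names above follow a consistent <gpus>gpu-<vcpus>vcpu-<ram>gb pattern, which makes them easy to pick apart in automation scripts, for example when sizing a node group. A small bash sketch:

```shell
# Split a Nebius preset name (e.g. "8gpu-128vcpu-1600gb") into its components.
parse_preset() {
  local preset="$1" gpu vcpu ram
  IFS=- read -r gpu vcpu ram <<< "$preset"
  echo "${gpu%gpu} GPUs, ${vcpu%vcpu} vCPUs, ${ram%gb} GiB RAM"
}

parse_preset "8gpu-128vcpu-1600gb"   # -> 8 GPUs, 128 vCPUs, 1600 GiB RAM
```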

๐Ÿท๏ธPricing & Value

Nebius uses simple hourly pricing across all GPU and CPU instances, with several services included for free.

GPU Instances (per hour)

GPU vCPUs RAM (GB) Price
NVIDIA GB200 NVL72 * 12800 Contact sales
NVIDIA HGX B200 16 200 $5.50
NVIDIA HGX H200 16 200 $3.50
NVIDIA HGX H100 16 200 $2.95
NVIDIA L40S (AMD) 16-192 96-1152 from $1.82
NVIDIA L40S (Intel) 8-40 32-160 from $1.55
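Hourly rates add up fast at node scale. As a rough example using the H100 list price above ($2.95 per GPU-hour) for a full 8-GPU node running an average month (730 hours):

```shell
# Approximate monthly on-demand cost of one 8x H100 node at list price.
# 730 = average hours per month (24 * 365 / 12).
awk 'BEGIN {
  rate  = 2.95   # $/GPU-hour, H100 list price from the table above
  gpus  = 8
  hours = 730
  printf "8x H100 node: $%.2f/month\n", rate * gpus * hours
}'
```

Around $17K per month at list price, before any committed-use or reserved discounts Nebius may offer.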

CPU-Only Instances (per hour)

CPU vCPUs RAM (GB) Price
AMD EPYC Genoa 4-64 16-256 from $0.10
Intel Ice Lake 2-80 8-320 from $0.05

Storage (per GiB/month unless noted)

Type Price
Shared Filesystem $0.08
WEKA Filesystem $0.10
Object Storage – Standard (volume) $0.015
Object Storage – Standard (egress) $0.015/GiB
Object Storage – Enhanced (volume) $0.11
Object Storage – Enhanced (egress) Free
Block Volumes (no replication) $0.053
Block Volumes (erasure coding) $0.071
Block Volumes (3x mirroring) $0.118
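To make the storage tiers concrete, here's a rough monthly estimate for keeping 1 TiB in Standard object storage and reading the full dataset out once (Standard egress is billed at $0.015/GiB, while Enhanced egress is free):

```shell
# 1 TiB = 1024 GiB stored on Standard object storage for one month,
# plus one full read-out billed as egress at the Standard rate.
awk 'BEGIN {
  gib     = 1024
  storage = gib * 0.015   # $/GiB-month, Standard volume rate
  egress  = gib * 0.015   # $/GiB, Standard egress rate
  printf "storage: $%.2f  egress: $%.2f  total: $%.2f\n", storage, egress, storage + egress
}'
```

On the Enhanced tier the trade-off flips: the volume rate is roughly 7x higher, but egress is free, which can favor read-heavy workloads.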

Included Free

✅ Managed Kubernetes
✅ Managed Slurm (Soperator)
✅ Egress/Ingress traffic
✅ Public IP addresses

CLI Installation & Quick Start

1. Install the CLI

Ubuntu/Debian:

# Add Nebius repository
curl -fsSL https://storage.ai.nebius.cloud/nebius-public-keys/repository.gpg | sudo gpg --dearmor -o /usr/share/keyrings/nebius-archive-keyring.gpg

echo "deb [signed-by=/usr/share/keyrings/nebius-archive-keyring.gpg] https://storage.ai.nebius.cloud/nebius-repo stable main" | sudo tee /etc/apt/sources.list.d/nebius-repo.list

# Install
sudo apt update && sudo apt install nebius-cli

Other OS: See official installation guide for macOS, Windows, and other Linux distributions.

2. Auto completion

mkdir -p ~/.nebius  # ensure the directory exists before writing the completion script
nebius completion bash > ~/.nebius/completion.bash.inc
echo 'if [ -f ~/.nebius/completion.bash.inc ]; then source ~/.nebius/completion.bash.inc; fi' >> ~/.bashrc
source ~/.bashrc

3. Configure Your Profile

# Interactive setup (recommended)
$ nebius profile create

profile name: my-profile
Set api endpoint: api.nebius.cloud
Set federation endpoint: auth.nebius.com

# Opens browser for authentication
✔ Profile "my-profile" configured and activated

Or configure manually:

export NB_PROFILE_NAME=<profile-name>
export NB_PROJECT_ID=<project-id>  # From console Project Settings

nebius profile create \
    --profile $NB_PROFILE_NAME --endpoint api.nebius.cloud \
    --federation-endpoint auth.nebius.com --parent-id $NB_PROJECT_ID

4. Verify Setup

# Check your identity
nebius iam whoami
...
 user_profile:
  attributes:
    email:  
    family_name: hd
    given_name: Clouddude
    locale: ""
    name: Clouddude
    phone_number: ""
    picture: https://avatars.githubusercontent.com/u
    preferred_username: ""
    sub: "xxxxx"
  federation_info:
    federation_id: federation-e00github
    federation_user_account_id: "xxxxxx"
  id: useraccount-xxx
  tenants:
    - tenant_id: tenant-xx
      tenant_user_account_id: tenantuseraccount-xx

# List profiles
nebius profile list

# View config
cat ~/.nebius/config.yaml

Network

Having worked with 6+ cloud providers (including AliCloud and Civo), I've found that networking is where each platform surprises you, and Nebius is no exception.
Default pool: each project ships with a default-network-pool that has a region-specific CIDR (e.g., 10.x.0.0/16).

The key difference

  • CIDR blocks are defined at the pool level, not the VPC
  • Networks reference pools (no CIDR at network level)
  • Subnets allocate their CIDRs from those pools, provided they don't overlap and stay within the pool's range

How it works:

# 1. Pool defines the CIDR range (e.g., 10.0.0.0/16)
nebius vpc pool create --cidr 10.0.0.0/16

# 2. Network references the pool (no CIDR at network level)
nebius vpc network create --name my-network --ipv4-private-pool-id <pool_id>

# 3. Subnets carve out non-overlapping blocks from the pool
nebius vpc subnet create --network-id <network_id> --cidr 10.0.1.0/24  # Must be within pool range

nebius vpc subnet create --network-id <network_id> --cidr 10.0.2.0/24  # Different block, no overlap
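Since every subnet must carve a non-overlapping block out of the pool, it helps to know how many will fit. The count is 2^(subnet prefix - pool prefix); a quick sanity check for /24 subnets in a /16 pool:

```shell
# Number of non-overlapping /24 subnets that fit into a /16 pool: 2^(24-16) = 256.
pool_prefix=16
subnet_prefix=24
echo $(( 1 << (subnet_prefix - pool_prefix) ))   # -> 256
```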

Quick Kubernetes Cluster Deploy

# Get subnet ID
export NB_SUBNET_ID=$(nebius vpc subnet list --format json | jq -r '.items[0].metadata.id')

# Create cluster
export NB_CLUSTER_ID=$(nebius mk8s cluster create \
  --name quickstart-cluster \
  --control-plane-subnet-id $NB_SUBNET_ID \
  '{"spec": {"control_plane": {"endpoints": {"public_endpoint": {}}}}}' \
  --format json | jq -r '.metadata.id')

# Create node group (2 nodes, cpu-e2 preset)
nebius mk8s node-group create \
  --name quickstart-nodes \
  --parent-id $NB_CLUSTER_ID \
  --fixed-node-count 2 \
  --template-resources-platform "cpu-e2" \
  --template-resources-preset "2vcpu-8gb" \
  --template-boot-disk-type network_ssd \
  --template-boot-disk-size-bytes 137438953472

# Get kubeconfig
nebius mk8s cluster get-credentials --id $NB_CLUSTER_ID --external
kubectl cluster-info
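One non-obvious flag in the node-group command above: --template-boot-disk-size-bytes takes a raw byte count, so the 137438953472 value is simply 128 GiB. A one-line helper to compute such values:

```shell
# Convert GiB to the raw byte count expected by --template-boot-disk-size-bytes.
gib_to_bytes() { echo $(( $1 * 1024 * 1024 * 1024 )); }

gib_to_bytes 128   # -> 137438953472
```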

⚡ Custom IP management for Kubernetes:
You must ensure your pool has enough CIDR blocks for all cluster requirements (nodes, services, pods), or you’ll hit “Unable to allocate CIDR from the pool” errors. Check available blocks with:

nebius vpc subnet get --id <subnet_ID>
# Look for: spec.ipv4_private_pools.pools.cidrs

Coming Up Next

Now that we've explored what makes Nebius stand out, from its GPU-focused architecture to its rich developer experience, it's time to take things one step further.

In Part 2, we'll show how to deploy and configure GPU resources using Terraform, from authentication and instance provisioning to network setup. This step brings us closer to integrating Nebius into the official vLLM Production Stack, making open-source inference more portable and cloud-agnostic.

Stay tuned for Part 2 ⚡

Run AI Your Way, In Your Cloud


Run AI assistants, RAG, or internal models on an AI backend privately in your cloud:
✅ No external APIs
✅ No vendor lock-in
✅ Total data control

Your infra. Your models. Your rules…
