GPU Cloud

GPUs that are actually in stock, by the hour.

H200, H100, A100 and L40S accelerators with NVLink, InfiniBand and a JupyterLab quickstart. Pay per second, snapshot mid-training, never sign an annual contract.

  • No credit card to start
  • Free migrations
  • Cancel any time
gpu_h100_18a · sjc1
CUDA 12.4
GPU
H100 80GB SXM5
Mem BW
3.35 TB/s
FP16 TFLOPs
989
NVLink
900 GB/s
vCPU
32 · EPYC 9354
RAM
256 GB DDR5
# reserve an H100 node and start jupyter
$ hostengine gpu create --plan "h100-80gb-x1"
✓ provisioned gpu_h100_18a in 92s
✓ jupyter at https://18a.gpu.hostengine.dev
>>> torch.cuda.get_device_name(0)
'NVIDIA H100 80GB HBM3'
H200 / H100
Latest accelerators in stock
$1.89/hr
L40S starting on-demand
4.8 TB/s
HBM3e on H200 nodes
InfiniBand
3.2 Tbps fabric on multi-GPU
Accelerator menu

The chip you need, in the region you need it.

Live capacity across all 14 regions. We post the per-SKU stock count on every dashboard refresh — never guess what is in inventory.

NVIDIA H200 141 GB
$5.49/hr
Memory
141 GB HBM3e
Bandwidth
4.8 TB/s
Regions
sjc1 · fra1 · sgp1
NVIDIA H100 80 GB SXM
$3.69/hr
Memory
80 GB HBM3
Bandwidth
3.35 TB/s
Regions
sjc1 · ash1 · fra1 · sgp1
NVIDIA H100 80 GB PCIe
$3.19/hr
Memory
80 GB HBM3
Bandwidth
2 TB/s
Regions
8 regions
NVIDIA A100 80 GB
$1.99/hr
Memory
80 GB HBM2e
Bandwidth
2 TB/s
Regions
10 regions
NVIDIA L40S 48 GB
$1.89/hr
Memory
48 GB GDDR6
Bandwidth
864 GB/s
Regions
11 regions
RTX 6000 Ada 48 GB
$1.39/hr
Memory
48 GB GDDR6
Bandwidth
960 GB/s
Regions
9 regions
Productivity

One link, JupyterLab on a real GPU.

No wrangling CUDA versions, no compiling Triton, no chasing kernel headers. The image boots straight into a JupyterLab tab with PyTorch, JAX, vLLM and your repo already cloned.

  • PyTorch 2.5, JAX 0.5, TensorFlow 2.16 — kept current
  • vLLM, TRL, axolotl, sglang and Hugging Face hub pre-warmed
  • GitHub repo auto-clone with one-line bootstrap
Container image
PyTorch 2.5.1 · cu124
CUDA Toolkit 12.4.1
Triton 3.1.0
vLLM 0.6.4 · pinned
Flash-Attn 2.7.0 · prebuilt
Jupyter 4.3 · password-locked
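A first-cell sanity check of the kind you might run once the JupyterLab tab opens. This is an illustrative sketch against the stack listed above, not an official notebook, and exact versions on your node may differ.

# sanity-check the pre-baked stack from the first notebook cell (illustrative)
import torch, flash_attn, vllm, triton

print(torch.__version__, torch.version.cuda)     # expect 2.5.1 / 12.4
print(torch.cuda.get_device_name(0))             # 'NVIDIA H100 80GB HBM3'
print(flash_attn.__version__, vllm.__version__, triton.__version__)

# quick FP16 matmul on the tensor cores to confirm the GPU is really yours
x = torch.randn(8192, 8192, device="cuda", dtype=torch.float16)
torch.cuda.synchronize(); print((x @ x).norm())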
Networking

InfiniBand fabric for multi-GPU runs.

When you need eight H100s talking, you need 3.2 Tbps of NDR InfiniBand between them — not best-effort Ethernet. Our cluster nodes deliver it by default.

  • NDR ConnectX-7 · 400 Gb/s per link · GPUDirect RDMA
  • NCCL tuned for our exact fabric topology
  • Free 100 TB transfer from object storage to GPU node
8 × H100 80 GB · 3.2 Tbps NDR fabric · NVLink 900 GB/s
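To show what actually rides on that fabric, here is a minimal NCCL all-reduce sketch. The file name allreduce_check.py and the torchrun launch are illustrative, and this is the stock PyTorch distributed workflow rather than anything HostEngine-specific.

# allreduce_check.py - minimal NCCL all-reduce across the GPUs on one node
# launch: torchrun --nproc_per_node=8 allreduce_check.py
import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")                  # NCCL picks NVLink / InfiniBand transports
rank = dist.get_rank()
torch.cuda.set_device(rank % torch.cuda.device_count())
x = torch.ones(1 << 26, device="cuda")                   # ~256 MB of fp32 per rank
dist.all_reduce(x)                                       # sum across every rank
torch.cuda.synchronize()
if rank == 0:
    print(f"all_reduce across {dist.get_world_size()} GPUs ok, x[0] = {x[0].item()}")
dist.destroy_process_group()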
Economics

Per-second billing, weekly reservations.

Pay only for the seconds you train, or commit to a week and save 14%. No annual contracts, no upfront capex, no calls with sales unless you want one.

  • Per-second billing after first hour, on every SKU
  • Reservations from 1 week to 3 years, 14% – 47% off list
  • Spot pricing up to 65% off, with snapshot-on-preempt
H100 · pricing tiers
On-demand $3.69/hr
1 week reserved $3.18/hr
1 month reserved $2.79/hr
1 year reserved $2.34/hr
Spot (preempt) $1.29/hr
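To make the tiers concrete, here is a back-of-envelope sketch using the list prices above. The assumption that the first hour is billed in full and everything after is pro-rated per second is our reading of the plan copy, so treat the helper as illustrative.

# rough cost of a 9-hour single-H100 fine-tune under each tier (list prices above)
RATES_PER_HR = {"on-demand": 3.69, "1 week": 3.18, "1 month": 2.79, "1 year": 2.34, "spot": 1.29}

def run_cost(hours, rate):
    return max(hours, 1.0) * rate      # first hour billed in full, then per-second

for tier, rate in RATES_PER_HR.items():
    print(f"{tier:>10}: ${run_cost(9, rate):.2f}")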
Plans

Three lanes, all per-second.

Need a different shape — say A100 + 2 TB shared NVMe? Build a custom node in the dashboard or talk to us.

Inference

Mid-size models, batch generation, vector search.

$1.89 /hr · L40S
  • 1 × NVIDIA L40S 48 GB
  • 16 vCPU EPYC · 128 GB DDR5
  • 1.6 TB NVMe scratch
  • JupyterLab + Ollama pre-installed
  • Per-second billing after first hour
  • Free egress to HostEngine compute
Start with Inference
Training
Most popular

Single-node fine-tuning, quantized 70B inference.

$3.69 /hr · H100 80 GB
  • 1 × NVIDIA H100 80 GB SXM5
  • 32 vCPU EPYC Genoa · 256 GB DDR5
  • 3.84 TB NVMe Gen4 scratch
  • 200 Gbps network link to neighbouring nodes
  • PyTorch 2.5 + CUDA 12.4 image
  • Snapshot in-flight training state
Start with Training
Cluster

Pre-training, distributed RLHF, MoE models.

$28.80 /hr · 8 × H100
  • 8 × H100 80 GB SXM5 with NVLink
  • 192 vCPU · 2 TB DDR5
  • 30 TB local NVMe + 100 TB network
  • 3.2 Tbps NDR InfiniBand fabric
  • Reserved-week pricing −22%
  • Dedicated GPU SRE on Slack
Start with Cluster

All plans include CUDA 12.4, PyTorch 2.5, JupyterLab, snapshot-on-preempt and free egress to HostEngine object storage and CDN.
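Snapshot-on-preempt only helps if the training loop can pick up where it left off, so below is a minimal resumable-checkpoint sketch in plain PyTorch. The /scratch path and the CKPT environment variable are illustrative, not part of the platform API.

# resume-friendly checkpointing so a preempted spot node can restart cleanly (illustrative)
import os, torch

CKPT = os.environ.get("CKPT", "/scratch/ckpt.pt")

def save_state(model, optim, step):
    torch.save({"model": model.state_dict(), "optim": optim.state_dict(), "step": step}, CKPT)

def load_state(model, optim):
    if not os.path.exists(CKPT):
        return 0                                     # fresh run
    state = torch.load(CKPT, map_location="cuda")
    model.load_state_dict(state["model"])
    optim.load_state_dict(state["optim"])
    return state["step"]                             # resume from the saved step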

Who it's for

From notebooks to MoE pre-training.

ML researcher at a biotech

Fine-tunes a 13B protein model nightly

Spins an H100 node at 23:00, runs LoRA training, snapshots weights into object storage at 04:00 and tears it down, for a total cost of about $18 per run.

Generative-art platform

Serves 20k images/minute at peak

Auto-scales between 4 and 36 L40S replicas based on queue depth. Saves ~$18,000/month versus running 36 replicas 24/7 on a hyperscaler.

Indie LLM team

Pre-trains a 7B model from scratch

Reserves an 8×H100 node for 14 days at the weekly rate. Trains 12B tokens, ships the checkpoint to Hugging Face, releases the box.

Compare

The numbers that change a procurement deck.

Capability
HostEngine
Hyperscaler A
Legacy Host B
H100 80 GB on-demand · $3.69/hr · $8.50/hr · Reservation only
Per-second billing
InfiniBand fabric · included · +contract
Pre-baked PyTorch / CUDA image
Free egress to platform storage
Snapshot mid-training state
Reservations · from 1 week · 1 year minimum · 1 year minimum
Stack

The frameworks your team already loves.

Integrates with the stack you already use

  • PyTorch 2.5
  • CUDA 12.4
  • JAX
  • TensorFlow
  • Triton
  • vLLM
  • TRL
  • Hugging Face
  • Weights & Biases
  • JupyterLab
  • Slurm
  • Kubeflow
  • Ray
  • Determined
FAQ

Questions ML teams keep asking.

Which GPUs are available right now?
H200 141 GB, H100 80 GB SXM5, H100 80 GB PCIe, A100 80 GB, L40S 48 GB and RTX 6000 Ada 48 GB. Capacity for each SKU is listed live on the dashboard with ETA if a region is full.
Can I reserve a multi-GPU node?
Yes — reservations from one week up to three years. The minimum reservation discount is 14% (one-week) and the maximum is 47% (three-year, paid upfront).
What about networking between nodes?
Multi-node training runs over 3.2 Tbps NDR InfiniBand inside a single availability zone. Cross-zone goes through 400 GbE Ethernet with RoCE.
Do you support Kubernetes / Slurm?
Both. We expose the NVIDIA GPU Operator and its CRDs, and ship a managed Slurm option for traditional HPC teams. Kubeflow, Ray and Determined deploy with one Helm command.
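As a sketch of what those schedulers end up running, here is a one-GPU Ray task. It assumes you already have a Ray cluster to connect to and uses only stock Ray APIs, nothing HostEngine-specific.

# schedule a one-GPU task on the cluster via Ray (illustrative)
import ray

ray.init()                           # or ray.init(address="auto") from inside the cluster

@ray.remote(num_gpus=1)
def which_gpu():
    import torch
    return torch.cuda.get_device_name(0)

print(ray.get(which_gpu.remote()))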
How does scratch storage work?
Every GPU node has local NVMe scratch (1.6 – 30 TB). Persist artefacts to the platform object store (free egress) or to a managed parallel filesystem with 80 GB/s aggregate read.
Are GPUs MIG-able?
Yes. H100 and H200 nodes expose MIG, so you can split a single accelerator into up to seven isolated slices (for example 1g.10gb instances on an H100 80 GB). Useful for cheap inference replicas.
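If you want to see the slices from inside a node, a small nvidia-ml-py sketch like the one below lists them. It assumes MIG has already been enabled on device 0 and uses only stock NVML calls.

# list MIG slices on GPU 0 with nvidia-ml-py (assumes MIG already enabled)
import pynvml

pynvml.nvmlInit()
gpu = pynvml.nvmlDeviceGetHandleByIndex(0)
current, pending = pynvml.nvmlDeviceGetMigMode(gpu)
print("MIG enabled:", current == pynvml.NVML_DEVICE_MIG_ENABLE)

for i in range(pynvml.nvmlDeviceGetMaxMigDeviceCount(gpu)):
    try:
        mig = pynvml.nvmlDeviceGetMigDeviceHandleByIndex(gpu, i)
    except pynvml.NVMLError:
        continue                               # slot not populated
    print(pynvml.nvmlDeviceGetUUID(mig))       # MIG UUIDs you can pin via CUDA_VISIBLE_DEVICES

pynvml.nvmlShutdown()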

Used by 1,200+ ML teams and 38 frontier-model labs

Northwind
Cobalt Studio
Volcrest
Northbeam AI
Halcyon
Acme Cloud
Pinepoint
Verdant
Helix Labs
Riverstone
Iron Forge
Beacon
Ready when you are

Spin up an H100 by the time you finish coffee.

No reservation form, no quota request — pick a region, click deploy, get a JupyterLab link in 92 seconds.

  • No credit card to start
  • Free migration from any provider
  • 99.99% uptime SLA, in writing
Frankfurt · 3 nodes · healthy
38ms p99
# spin up a 4 vCPU / 8 GB cloud VPS in 55s
$ hostengine vps create --plan "performance-4x8" --region "fra1"
✓ provisioned vps_2x9k1q  (172.247.18.42)
✓ image debian-12 ready · ssh keys attached
✓ snapshot policy: hourly · backups: 30 days

$ hostengine domain attach "trading.acme.io" --ssl
✓ DNS verified · Let's Encrypt cert issued in 6.4s
55s
median provision
14
global regions
$200
welcome credit