↓44% · AWS H100 price cut since June 2025
↓20% · A100 market prices since March 2025
40–60% · Neocloud discount vs hyperscalers for H100
$1.87 · Lowest H100 spot price (Vast.ai, April 2026)

The market in one paragraph

When I first started looking at this seriously in 2024, H100s were $4–6/hr on the handful of providers that actually had them. In April 2026, you can get the same GPU for $1.87/hr on spot, $2.49/hr on-demand at Lambda Labs, or $1.99/hr on a one-year reserved at the same provider. AWS — once the ceiling everyone anchored to — cut its H100 rates 44% in mid-2025 and is still the most expensive option in the table below. What struck me from a markets perspective: this isn't a temporary promotional situation. The compute market has structurally moved from scarcity pricing to competition pricing, and teams who recognized this shift early have been capturing real cost advantages.

What actually happened to H100 pricing

The H100 shortage in 2023–2024 was legitimate. NVIDIA was supply-constrained, hyperscalers were absorbing allocation as fast as they could get it, and neoclouds were charging $4–6/hr because that's what the market would support. Coming from a financial services background, I recognized this immediately: it was a supply-squeeze pricing dynamic, and those don't last forever.

The supply situation broke in three phases. First, NVIDIA ramped H100 production significantly through 2024. Second, neocloud operators — CoreWeave, Lambda Labs, RunPod, Vast.ai — expanded their fleets aggressively and drove on-demand pricing down through direct price competition. Third, AWS made a strategic decision to cut H100 pricing by 44% in June 2025, almost certainly to defend ML workload market share against neoclouds that were winning purely on price.

The result: the cheapest on-demand H100 (Lambda at $2.49/hr) now costs 36% less than AWS ($3.90/hr), and spot pricing on Vast.ai sits below $2/hr. These aren't promotional rates — they're the real prices you'll pay when you spin up an instance today.

Bottom line on pricing: The H100 is no longer a scarce commodity. If your workloads can tolerate spot interruptions — and most training jobs can, if you checkpoint properly — you're looking at rates that would have seemed implausible 18 months ago. Spot rates are running 50–60% below hyperscaler on-demand right now, which in commodity market terms reads as a clear signal of supply exceeding demand. In GPU terms, it just means the market is working and buyers have leverage.
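Tolerating spot interruptions mostly comes down to a resume-from-checkpoint loop. Here's a minimal sketch in plain Python — the file path, state shape, and the stand-in training step are all illustrative, not any provider's API; a real job would checkpoint model weights to durable object storage:

```python
import json
import os

CKPT = "checkpoint.json"  # hypothetical path; real jobs write to durable object storage


def load_checkpoint():
    # Resume from the last saved epoch if a checkpoint exists, else start fresh.
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            return json.load(f)
    return {"epoch": 0, "loss": None}


def save_checkpoint(state):
    # Write-then-rename so a preemption mid-write can't corrupt the checkpoint.
    tmp = CKPT + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, CKPT)


def train(total_epochs=10):
    state = load_checkpoint()
    for epoch in range(state["epoch"], total_epochs):
        state["loss"] = 1.0 / (epoch + 1)  # stand-in for a real training step
        state["epoch"] = epoch + 1
        save_checkpoint(state)  # after every epoch; a spot kill loses at most one
    return state


print(train())
```

The point of the sketch: if the instance is killed at any point, rerunning the same script picks up from the last completed epoch instead of restarting from zero, which is what makes spot pricing usable for long training runs.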

Current H100 SXM5 pricing by provider (April 2026)

Provider | Type | On-Demand | Reserved / Spot | Egress
Vast.ai | Spot marketplace | $1.87/hr | N/A (spot only) | $0.01–0.03/GB
RunPod | Community cloud | $1.99/hr | ~$1.50/hr spot | $0.02/GB
Lambda Labs | Neocloud | $2.49/hr | $1.99/hr (1yr) | $0.02/GB
CoreWeave | Neocloud | $2.99/hr | Negotiated | $0.03/GB
Google Cloud | Hyperscaler | $3.00/hr | ~$2.10/hr (1yr committed) | $0.08–0.12/GB
AWS (p5.48xlarge) | Hyperscaler | $3.90/hr | ~$1.45/hr spot · $2.21/hr (3yr) | $0.09/GB

Per-GPU prices derived from official provider pricing. Instance prices divided by GPU count. Last verified April 2026.

The neocloud vs hyperscaler gap is structural, not temporary

The 40–60% price gap between neoclouds and hyperscalers for H100 compute is a structural feature of two completely different business models — not a temporary promotional play worth waiting out.

Hyperscalers (AWS, GCP, Azure) run massive multi-service platforms. GPU compute is one product line among thousands. When I look at hyperscaler pricing through a cost-structure lens — which is how I approach most vendor analysis — it makes sense: their pricing reflects infrastructure overhead, their support organization, global redundancy requirements, and their bundling strategy. AWS would prefer you also run S3, RDS, SageMaker, and VPC alongside your GPU instances. The H100 price implicitly carries a premium for that ecosystem access, whether you use it or not.

Neoclouds (Lambda, CoreWeave, RunPod) do one thing: rent GPUs. Their cost structure is fundamentally simpler and they compete purely on compute price. This isn't sustainable at the very low end of the market — hence consolidation in the spot marketplace — but for serious on-demand compute, the neocloud cost advantage looks durable as long as the business model does.

When hyperscalers win anyway: Compliance requirements, data residency, VPC integration with existing AWS infrastructure, and enterprise support SLAs are real reasons to pay the premium. But in my experience looking at how teams actually make this decision, the compliance argument is often assumed rather than genuinely required. Worth asking your legal and security teams directly: do you actually need this, or do you assume you do? The answer changes the math significantly.

A100 pricing: more nuanced than it looks

The A100 80GB has fallen roughly 20% in market price since early 2025 and starts around $1.89/hr at Lambda Labs. On the surface that looks like a clear 'cheaper option.' It's not quite that simple.

When I modeled this properly — accounting for the H100's 2–3× speed advantage on transformer training — the math often flips. At $2.49/hr for H100 vs $1.89/hr for A100, an H100 that's 2.5× faster costs 32% more per hour but finishes the same job in 40% of the time, so the run costs roughly $1.00 per A100-hour-equivalent instead of $1.89, making it about 47% cheaper per training run completed. That's a meaningful difference that gets lost when you only look at hourly rates. I go through the full scenario analysis in the H100 vs A100 deep dive.
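The per-run arithmetic is easy to get wrong when everything is quoted per hour, so here it is explicitly. Rates are from the table above; the 2.5× speedup is an assumed midpoint of the 2–3× range, and the 100-hour job length is illustrative (the percentages don't depend on it):

```python
a100_rate = 1.89  # $/hr, Lambda Labs A100 80GB
h100_rate = 2.49  # $/hr, Lambda Labs H100
speedup = 2.5     # assumed H100-vs-A100 speedup on transformer training

job_hours_a100 = 100.0                     # illustrative job length on A100
job_hours_h100 = job_hours_a100 / speedup  # same job finishes in 40% of the time

cost_a100 = a100_rate * job_hours_a100  # $189.00 per run
cost_h100 = h100_rate * job_hours_h100  # $99.60 per run

print(f"hourly premium: {h100_rate / a100_rate - 1:.0%}")  # ~32% more per hour
print(f"per-run saving: {1 - cost_h100 / cost_a100:.0%}")  # ~47% cheaper per run
```

The speedup factor is the whole game here: at 1.3× or below, the A100 wins on cost per run; by 2× the H100 is already clearly ahead.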

That said, the A100 is genuinely the right answer for inference workloads where you're paying by the hour indefinitely rather than by the run, for small models where the H100 speed advantage doesn't fully materialize, and for teams with existing A100 infrastructure where the switching cost is real.

Spot pricing: historically wide gap suggests continued oversupply

Spot and preemptible pricing on H100s is running 50–60% below hyperscaler on-demand right now. That spread is historically wide — and for anyone who's spent time reading commodity or credit markets, it reads clearly as supply exceeding demand. When demand is tight, spot premiums compress and interruptions are frequent. When supply exceeds demand, spot falls and interruptions become rare. The current spread strongly suggests the latter.

For workloads that can checkpoint and restart — which includes most training jobs, if you actually implement checkpointing — spot instances are the single biggest cost lever available. To put actual numbers on it: a team running a 30-day training job on 8× H100 at Lambda Labs on-demand ($2.49/hr) spends about $14,300. The same job on Vast.ai spot at $1.87/hr costs about $10,800, saving roughly $3,600 without changing anything about the underlying workload. If you're training models and not using spot at all, you're leaving real money on the table.
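The 30-day comparison, computed directly from the table's rates (8 GPUs and 30 days are the scenario's assumptions):

```python
gpus = 8
hours = 24 * 30  # a 30-day run: 720 hours, 5,760 GPU-hours total

on_demand = 2.49 * gpus * hours  # Lambda Labs on-demand
spot = 1.87 * gpus * hours       # Vast.ai spot

print(f"on-demand: ${on_demand:,.0f}")  # $14,342
print(f"spot:      ${spot:,.0f}")       # $10,771
print(f"saving:    ${on_demand - spot:,.0f} ({1 - spot / on_demand:.0%})")
```

Against a hyperscaler baseline the lever is larger still: the same 5,760 GPU-hours at AWS's $3.90/hr on-demand rate comes to about $22,500, which spot undercuts by more than half.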

What to expect for the rest of 2026

A few things I'm watching closely for the second half of the year.

NVIDIA's Blackwell ramp is the biggest variable. As B100 and B200 instances come through hyperscalers and then neoclouds, existing H100 fleet operators will face pressure to discount inventory to stay competitive. I'd expect H100 on-demand rates to fall toward $2/hr or below at the most competitive neoclouds by late 2026 — which means teams that locked into 1-year H100 reserved contracts in Q1 2026 at $1.99/hr may find their locked-in rate merely matching the open market by year end, rather than beating it.
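A toy break-even model makes the reserved-contract question concrete. The linear decline from $2.49 to $2.00 over twelve months is my assumption for illustration, not a forecast:

```python
reserved = 1.99    # $/hr, Lambda Labs 1-year reserved rate
start, end = 2.49, 2.00  # assumed linear on-demand decline over the contract year
months = 12

# On-demand rate in each month of the contract under the assumed decline.
monthly = [start + (end - start) * m / (months - 1) for m in range(months)]
avg_on_demand = sum(monthly) / months

print(f"average on-demand over the year: ${avg_on_demand:.3f}/hr")  # $2.245/hr
print(f"reserve saving vs average:       {1 - reserved / avg_on_demand:.1%}")
```

Even under this decline path the reserve still saves about 11% against the year's average on-demand rate; it only looks like a bad deal against the December price. The real risk is a faster-than-linear drop once Blackwell volume lands.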

Hyperscaler response is ongoing. AWS, Google, and Azure have all made pricing moves in the past 12 months and they're not going to cede ML workloads without further action. Expect continued rate cuts, better reserved pricing, and heavier ecosystem bundling (credits, integrations, managed services) as competitive tactics. This is generally good for buyers.

Enterprise demand is rising. As AI moves from experimental to production, teams with real compliance, SLA, and data governance requirements will increasingly pay the hyperscaler premium. CoreWeave is well-positioned to capture the middle of this market — more enterprise-ready than Lambda, significantly cheaper than AWS — and I think that positioning gets more valuable as the enterprise wave hits.

My read on the current market: If you're running training workloads without hard availability or compliance requirements, Lambda Labs on-demand or reserved pricing is the most defensible cost position right now. For inference or enterprise features, CoreWeave and GCP are the most credible options respectively. I find it hard to justify AWS on price alone — but if your data is already there, the egress cost of moving it is real and worth calculating before you make a decision.
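Since egress is the cost that quietly anchors teams to AWS, it's worth pricing the one-time exit explicitly. Egress rates are from the table; the 10 TB dataset size is illustrative:

```python
dataset_gb = 10_000  # illustrative: a 10 TB training corpus

aws_egress = 0.09 * dataset_gb  # AWS egress, $/GB from the table
neo_egress = 0.02 * dataset_gb  # Lambda Labs / RunPod egress

print(f"moving 10 TB out of AWS:   ${aws_egress:,.0f}")  # $900
print(f"same move from a neocloud: ${neo_egress:,.0f}")  # $200
```

A one-time $900 exit fee is trivial next to the training-run savings computed above, which is exactly why the "our data is already in AWS" argument deserves a calculation rather than a reflex — for most training datasets the egress bill pays for itself within days of switching.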

Methodology note

All prices in this piece are sourced from official provider pricing pages and verified in April 2026. Per-GPU prices for hyperscaler instances are derived by dividing total instance cost by GPU count — I note where I've done this so you can verify. Spot prices reflect current market rates and change frequently; treat them as indicative rather than guaranteed. Reserved and committed rates reflect publicly available 1-year terms unless otherwise noted. Full methodology →