RTX A5000
NVIDIA · dual-slot blower · Released April 2021
Dual-slot blower with 24 GB and ECC. The professional's 3090 — same VRAM, quieter, rack-ready.
Hardware spec sheet
| Vendor | NVIDIA |
|---|---|
| Category | GPU |
| VRAM / memory | 24 GB |
| Memory bandwidth | 768 GB/s |
| FP16 TFLOPS | 27.8 |
| INT8 TOPS | 222 |
| TDP | 230 W |
| Architecture | Ampere |
| Form factor | dual-slot blower |
| Release date | April 2021 |
| Street price (USD) | 800–1,100 (used) |
| 120V note | 230 W each; two A5000s on a 120V/15A circuit with a 1000 W PSU is the practical limit. |
The RTX A5000 is the dual-slot blower Ampere workstation card: 24 GB of ECC GDDR6, 768 GB/s of bandwidth, 230 W TDP. Launched alongside the A4000 in April 2021, it slots between the A4000 (16 GB, 140 W) and the A6000 (48 GB, 300 W). It uses the same GA102 silicon as the 3090, though with fewer CUDA cores enabled (8192 vs the 3090's 10496), and adds ECC memory, a blower cooler, and NVIDIA's professional driver stack.
Who it’s for: professionals, studios, and Hashcenter operators who want 3090-class performance in a rack-friendly package with ECC for workload reliability. Used prices run $800–1,100 as of 2026.
Models it runs comfortably: essentially the same parameter envelope as the RTX 3090 (Qwen 2.5 32B at Q5_K_M fits on-card; Llama 3 70B at Q4 needs partial offload or a second card), with slightly lower bandwidth (768 vs 936 GB/s), so expect roughly 20% lower tok/s on memory-bound workloads. For production use, ECC and the blower form factor usually justify the trade-off.
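The sizing and speed claims above can be sanity-checked with back-of-the-envelope arithmetic. This is a rough sketch, not a benchmark: it assumes weight storage dominates VRAM (KV cache and activations add overhead on top) and that decode speed is bounded by reading every weight once per token; the bits-per-weight figures for the quant formats are approximations.

```python
# Back-of-the-envelope sizing for quantized models on the A5000 (768 GB/s).
# Assumption: weights dominate VRAM; decode is memory-bandwidth-bound.

def model_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB for a quantized model."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

def decode_tok_s(weights_gb: float, bandwidth_gb_s: float = 768) -> float:
    """Theoretical ceiling on decode tok/s: bandwidth / bytes read per token."""
    return bandwidth_gb_s / weights_gb

qwen32_q5 = model_gb(32, 5.5)    # Q5_K_M is roughly 5.5 bits/weight -> ~22 GB
llama70_q4 = model_gb(70, 4.5)   # Q4_K_M is roughly 4.5 bits/weight -> ~39 GB

print(f"Qwen 2.5 32B Q5_K_M: ~{qwen32_q5:.0f} GB, "
      f"~{decode_tok_s(qwen32_q5):.0f} tok/s ceiling on one card")
print(f"Llama 3 70B Q4_K_M: ~{llama70_q4:.0f} GB (exceeds 24 GB)")
```

Real-world tok/s lands well below the ceiling once attention, KV-cache reads, and kernel overhead are counted, but the ratio against a 936 GB/s card holds: bandwidth sets the slope.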
Hashcenter notes: dual-slot blower — two A5000s fit in four slots and dissipate 460 W total through front-to-back airflow, ideal for a 4U chassis. Quieter than a 3090 blower under load because the lower clocks keep fan RPM down. 230 W each means two cards on 120V/15A with a 1000 W PSU. Credit NVIDIA’s Quadro/RTX Pro lineage for the engineering.
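The 120V claim above works out on paper. A quick budget sketch, with assumed (not measured) figures for system overhead and PSU efficiency, shows a dual-A5000 rig sitting comfortably inside the NEC 80% continuous-load rule for a 15 A circuit:

```python
# Rough wall-power budget for two A5000s on a 120V/15A circuit.
# SYSTEM_W and PSU_EFFICIENCY are assumptions for illustration;
# they vary by build.

GPU_TDP_W = 230
NUM_GPUS = 2
SYSTEM_W = 250          # assumed CPU, board, drives, fans
PSU_EFFICIENCY = 0.92   # assumed 80 Plus Gold-class at this load

dc_load = GPU_TDP_W * NUM_GPUS + SYSTEM_W   # 710 W at the PSU rails
wall_draw = dc_load / PSU_EFFICIENCY        # ~772 W from the outlet
circuit_limit = 120 * 15 * 0.8              # 1440 W continuous (NEC 80% rule)

print(f"DC load: {dc_load} W | wall draw: {wall_draw:.0f} W | "
      f"circuit headroom: {circuit_limit - wall_draw:.0f} W")
```

A 1000 W PSU covers the 710 W DC load with margin for transient spikes, and the wall draw stays hundreds of watts under the circuit's continuous rating.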
Further reading: This card is a core component of a pleb-grade AI Hashcenter. Pair it with the sovereignty argument in the Sovereign AI for Bitcoiners Manifesto, or look at how the same 120V envelope powers a Bitcoin space heater in our mining catalog. Running both workloads on one rig? See Heating Your Home With Inference.
Models that run on this hardware
Get it running
1. Install Ollama → Ten-minute local LLM runtime. One binary, zero cloud.
2. Give it a UI → Open-WebUI turns Ollama into a self-hosted ChatGPT.
3. Which runner? → LM Studio vs Ollama vs llama.cpp — pick the right runtime for your rig.
