RTX 3090
NVIDIA · triple-slot · Released September 2020
NVIDIA's 2020 flagship remains the pleb sweet spot: 24 GB of GDDR6X for $600–800 used, runs 32B models comfortably at Q4.
Hardware spec sheet
| Spec | Value |
|---|---|
| Vendor | NVIDIA |
| Category | GPU |
| VRAM / memory | 24 GB |
| Memory bandwidth | 936 GB/s |
| FP16 TFLOPS | 35.6 |
| INT8 TOPS | 284 |
| TDP | 350 W |
| Architecture | Ampere |
| Form factor | triple-slot |
| Release date | September 2020 |
| Street price (USD) | 600–800 (used) |
| 120V note | 350 W fits comfortably on 120V/15A with an 850 W+ PSU; two 3090s on one 120V circuit is marginal, so prefer 240V. |
The RTX 3090 is the used-market champion of local inference. Launched September 2020 as NVIDIA’s Ampere flagship, it packs 24 GB of GDDR6X on a 384-bit bus delivering 936 GB/s of memory bandwidth — the stat that actually matters for LLM inference. Ampere descends from Turing (RTX 20-series) and the tensor-core lineage goes back to Volta (V100).
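Why bandwidth is the headline stat: generating a token means streaming essentially the whole weight file through the memory bus, so bandwidth divided by model size puts a hard ceiling on single-stream decode speed. A back-of-envelope sketch (the bytes-per-weight figure is an approximation for Q4_K_M-style quants, and real throughput lands below the ceiling):

```python
# Decode-speed ceiling: each generated token streams the full weight file
# through the memory bus, so tokens/sec <= bandwidth / model size in bytes.
BANDWIDTH_GBS = 936.0  # RTX 3090 memory bandwidth, GB/s

def decode_ceiling_tps(params_b: float, bytes_per_weight: float) -> float:
    """Upper bound on single-stream tokens/sec for a dense model."""
    model_gb = params_b * bytes_per_weight
    return BANDWIDTH_GBS / model_gb

# A 32B model at Q4_K_M (~0.6 bytes/weight incl. quant overhead) is ~19 GB:
print(f"{decode_ceiling_tps(32, 0.6):.0f} tok/s ceiling")  # ~49 tok/s
```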
Who it’s for: hobbyists and prosumers who want to run 7B–32B models locally without taking a second mortgage. At roughly $600–800 on the used market in 2026, it is the reference recommendation for a first serious LLM rig.
Models it runs comfortably: Llama 3 8B at full FP16, Llama 3 70B distillations at Q4, Qwen 2.5 32B at Q5_K_M, Mistral Small at Q8. Anything up to roughly 40B parameters at Q4.
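The fit rule behind that list is plain arithmetic: weights at the quant's bytes-per-weight, plus a few GB for KV cache and runtime overhead, must land under 24 GB. A rough sketch (the 3 GB allowance is an assumed ballpark and varies by runner and context length):

```python
# Rough VRAM-fit check for a 24 GB card: weights dominate, the rest is a
# ballpark allowance for KV cache and runtime overhead.
VRAM_GB = 24.0

def headroom_gb(params_b: float, bytes_per_weight: float,
                kv_and_overhead_gb: float = 3.0) -> float:
    """VRAM left over after weights plus a KV-cache/runtime allowance."""
    return VRAM_GB - (params_b * bytes_per_weight + kv_and_overhead_gb)

print(headroom_gb(8, 2.0))   # Llama 3 8B at FP16: 5.0 GB to spare
print(headroom_gb(32, 0.6))  # 32B at Q4_K_M: ~1.8 GB, fits with modest context
print(headroom_gb(40, 0.5))  # ~40B at a lean 4-bit quant: ~1.0 GB, the squeeze
```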
Hashcenter notes: triple-slot cooler, 350 W TDP, 3× 8-pin power. Noise is middling — fine in a home office, loud in a rack. For a quiet Hashcenter, swap to blower variants (RTX A5000 if budget allows). 350 W fits comfortably on 120V/15A with an 850 W+ PSU; two 3090s on one 120V circuit is marginal, so prefer 240V. Credit to NVIDIA for building the card that plebs could actually afford on the used market.
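The circuit math behind that note, for anyone sizing a rig: a 15 A breaker at 120 V is 1800 W nominal, and the usual 80% continuous-load rule leaves 1440 W of sustained budget. The system-overhead and PSU-efficiency figures below are illustrative assumptions:

```python
# Why two 3090s on one 120V/15A circuit is marginal.
CONTINUOUS_BUDGET_W = 120 * 15 * 0.8  # 1440 W sustained on a 15 A breaker

def wall_draw_w(n_gpus: int, gpu_tdp_w: float = 350.0,
                system_w: float = 200.0, psu_efficiency: float = 0.9) -> float:
    """Estimated draw at the wall; GPU transients can spike well above TDP."""
    return (n_gpus * gpu_tdp_w + system_w) / psu_efficiency

print(wall_draw_w(1))  # ~611 W: comfortable on one circuit
print(wall_draw_w(2))  # ~1000 W: under 1440 W on paper, but transient spikes
                       # eat the margin; hence the 240V advice
```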
This card is a core component of a pleb-grade AI Hashcenter, and the same 120V envelope powers a Bitcoin space heater in our mining catalog.
Get it running
1. Install Ollama → Ten-minute local LLM runtime. One binary, zero cloud. Smoke-test it with the sketch after this list.
2. Give it a UI → Open-WebUI turns Ollama into a self-hosted ChatGPT.
3. Which runner? → LM Studio vs Ollama vs llama.cpp — pick the right runtime for your rig.
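Once Ollama is running it serves a local HTTP API on port 11434. A minimal smoke test using only the Python standard library (the model tag is an example; substitute whatever you pulled with `ollama pull`):

```python
# POST a prompt to Ollama's local /api/generate endpoint and print the reply.
import json
import urllib.request

payload = json.dumps({
    "model": "qwen2.5:32b",  # example tag; use any model you've pulled
    "prompt": "Explain GDDR6X in one sentence.",
    "stream": False,         # one JSON response instead of streamed chunks
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/generate",  # Ollama's default local endpoint
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```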
Further reading: Heating your home with inference for turning this card into a winter-heat source, and the Sovereign AI for Bitcoiners Manifesto for the bigger picture on owner-operated AI.
