RTX 3090
NVIDIA · triple-slot · Released September 2020
NVIDIA's 2020 flagship remains the pleb sweet spot: 24 GB of GDDR6X for $600–800 used, runs 32B models comfortably at Q4.
Hardware spec sheet
| Spec | Value |
|---|---|
| Vendor | NVIDIA |
| Category | GPU |
| VRAM / memory | 24 GB |
| Memory bandwidth | 936 GB/s |
| FP16 TFLOPS | 35.6 |
| INT8 TOPS | 284 |
| TDP | 350 W |
| Architecture | Ampere |
| Form factor | triple-slot |
| Release date | September 2020 |
| Street price (USD) | 600–800 (used) |
| 120V note | 350 W fits comfortably on 120V/15A with an 850 W+ PSU; two 3090s on one 120V circuit is marginal, so prefer 240V. |
The RTX 3090 is the used-market champion of local inference. Launched September 2020 as NVIDIA’s Ampere flagship, it packs 24 GB of GDDR6X on a 384-bit bus delivering 936 GB/s of memory bandwidth — the stat that actually matters for LLM inference. Ampere descends from Turing (RTX 20-series) and the tensor-core lineage goes back to Volta (V100).
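Why bandwidth is the headline stat: generating a token means streaming essentially the whole weight file through the memory bus, so bandwidth divided by model size puts a hard ceiling on single-stream decode speed. A back-of-envelope sketch (the bytes-per-weight figure is an approximation for Q4_K_M-style quants, and real throughput lands below the ceiling):

```python
# Decode-speed ceiling: each generated token streams the full weight file
# through the memory bus, so tokens/sec <= bandwidth / model size in bytes.
BANDWIDTH_GBS = 936.0  # RTX 3090 memory bandwidth, GB/s

def decode_ceiling_tps(params_b: float, bytes_per_weight: float) -> float:
    """Upper bound on single-stream tokens/sec for a dense model."""
    model_gb = params_b * bytes_per_weight
    return BANDWIDTH_GBS / model_gb

# A 32B model at Q4_K_M (~0.6 bytes/weight incl. quant overhead) is ~19 GB:
print(f"{decode_ceiling_tps(32, 0.6):.0f} tok/s ceiling")  # ~49 tok/s
```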
Who it’s for: hobbyists and prosumers who want to run 7B–32B models locally without taking a second mortgage. At roughly $600–800 on the used market in 2026, it is the reference recommendation for a first serious LLM rig.
Models it runs comfortably: Llama 3 8B at full FP16, Llama 3 70B distillations at Q4, Qwen 2.5 32B at Q5_K_M, Mistral Small at Q8. Anything up to roughly 40B parameters at Q4.
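The fit rule behind that list is plain arithmetic: weights at the quant's bytes-per-weight, plus a few GB for KV cache and runtime overhead, must land under 24 GB. A rough sketch (the 3 GB allowance is an assumed ballpark and varies by runner and context length):

```python
# Rough VRAM-fit check for a 24 GB card: weights dominate, the rest is a
# ballpark allowance for KV cache and runtime overhead.
VRAM_GB = 24.0

def headroom_gb(params_b: float, bytes_per_weight: float,
                kv_and_overhead_gb: float = 3.0) -> float:
    """VRAM left over after weights plus a KV-cache/runtime allowance."""
    return VRAM_GB - (params_b * bytes_per_weight + kv_and_overhead_gb)

print(headroom_gb(8, 2.0))   # Llama 3 8B at FP16: 5.0 GB to spare
print(headroom_gb(32, 0.6))  # 32B at Q4_K_M: ~1.8 GB, fits with modest context
print(headroom_gb(40, 0.5))  # ~40B at a lean 4-bit quant: ~1.0 GB, the squeeze
```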
Hashcenter notes: triple-slot cooler, 350 W TDP, 3× 8-pin power. Noise is middling — fine in a home office, loud in a rack. For a quiet Hashcenter, swap to blower variants (RTX A5000 if budget allows). 350 W fits comfortably on 120V/15A with an 850 W+ PSU; two 3090s on one 120V circuit is marginal, so prefer 240V. Credit to NVIDIA for building the card that plebs could actually afford on the used market.
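The circuit math behind that note, for anyone sizing a rig: a 15 A breaker at 120 V is 1800 W nominal, and the usual 80% continuous-load rule leaves 1440 W of sustained budget. The system-overhead and PSU-efficiency figures below are illustrative assumptions:

```python
# Why two 3090s on one 120V/15A circuit is marginal.
CONTINUOUS_BUDGET_W = 120 * 15 * 0.8  # 1440 W sustained on a 15 A breaker

def wall_draw_w(n_gpus: int, gpu_tdp_w: float = 350.0,
                system_w: float = 200.0, psu_efficiency: float = 0.9) -> float:
    """Estimated draw at the wall; GPU transients can spike well above TDP."""
    return (n_gpus * gpu_tdp_w + system_w) / psu_efficiency

print(wall_draw_w(1))  # ~611 W: comfortable on one circuit
print(wall_draw_w(2))  # ~1000 W: under 1440 W on paper, but transient spikes
                       # eat the margin; hence the 240V advice
```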
This card is a core component of a pleb-grade AI Hashcenter, and the same 120V envelope powers a Bitcoin space heater in our mining catalog.
Get it running
1. Install Ollama → Ten-minute local LLM runtime. One binary, zero cloud. Smoke-test it with the sketch after this list.
2. Give it a UI → Open-WebUI turns Ollama into a self-hosted ChatGPT.
3. Which runner? → LM Studio vs Ollama vs llama.cpp — pick the right runtime for your rig.
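Once Ollama is running it serves a local HTTP API on port 11434. A minimal smoke test using only the Python standard library (the model tag is an example; substitute whatever you pulled with `ollama pull`):

```python
# POST a prompt to Ollama's local /api/generate endpoint and print the reply.
import json
import urllib.request

payload = json.dumps({
    "model": "qwen2.5:32b",  # example tag; use any model you've pulled
    "prompt": "Explain GDDR6X in one sentence.",
    "stream": False,         # one JSON response instead of streamed chunks
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/generate",  # Ollama's default local endpoint
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```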
Further reading: Heating your home with inference for turning this card into a winter-heat source, and the Sovereign AI for Bitcoiners Manifesto for the bigger picture on owner-operated AI.
