RTX 4090

NVIDIA · triple-slot · Released October 2022

Ada Lovelace's consumer flagship: 24 GB, 1 TB/s bandwidth, 82.6 FP16 TFLOPS. Fastest single-card pleb option for inference.

Hardware spec sheet

Vendor: NVIDIA
Category: GPU
VRAM / memory: 24 GB
Memory bandwidth: 1008 GB/s
FP16 TFLOPS: 82.6
INT8 TOPS: 660
TDP: 450 W
Architecture: Ada Lovelace
Form factor: triple-slot
Release date: October 2022
Street price (USD): 1600-1900 (new/used)
120V note: 450 W on 120V/15A is the practical ceiling for a single card with a 1000 W PSU; two 4090s need 240V.

The RTX 4090 launched October 2022 on NVIDIA’s Ada Lovelace architecture — the direct successor to Ampere (RTX 3090) with a lithography jump to TSMC 4N. Same 24 GB VRAM as the 3090, but meaningfully faster: 1008 GB/s bandwidth, 82.6 FP16 TFLOPS, and roughly 2× tensor throughput for INT8/FP8. Ada Lovelace introduced 4th-gen tensor cores and FP8 support, both of which matter for quantized inference workloads.
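Why bandwidth is the headline number: at batch size 1, generating each token streams the full set of weights through the memory bus, so decode speed is bounded by bandwidth divided by the quantized model size. A back-of-envelope sketch (the model sizes and the ~4.5 effective bits/weight for Q4-class quants are rules of thumb, not benchmarks):

```python
# Batch-1 decode ceiling: every generated token streams all weights
# once, so tok/s is bounded by bandwidth / quantized model size.
# Illustrative rule-of-thumb numbers, not benchmarks.

BANDWIDTH_GBPS = 1008  # RTX 4090 memory bandwidth, GB/s

def decode_ceiling_tok_s(params_b: float, bits_per_weight: float) -> float:
    """Upper bound on tokens/s for batch-1 decode of a quantized model."""
    model_gb = params_b * bits_per_weight / 8  # e.g. 8B at 4.5 bpw ~ 4.5 GB
    return BANDWIDTH_GBPS / model_gb

for params_b in (8, 13, 34):
    ceiling = decode_ceiling_tok_s(params_b, 4.5)  # Q4-class ~4.5 bits effective
    print(f"{params_b:>2}B @ Q4: <= {ceiling:.0f} tok/s ceiling")
```

Real decode rates land below this ceiling once attention and KV-cache reads are counted, but it is a useful first filter when sizing models for the card.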

Who it’s for: prosumers who want the fastest single-card inference without jumping to workstation cards. Also the card of choice for serious Stable Diffusion / ComfyUI users.

Models it runs comfortably: same parameter envelope as the 3090 (up to ~40B at Q4), but roughly 1.7–2× faster tok/s. Llama 3 70B at Q4 weighs roughly 40 GB, so it does not fit in 24 GB; it needs partial CPU offload, or an aggressive ~2-bit quant to stay fully on-card.
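The fit math behind that envelope, as a rough sketch (effective bits/weight varies by quant flavor, and the ~3 GB reserve for KV cache and CUDA context is a rule of thumb, not a measurement):

```python
# Rough VRAM math for quantized models on a 24 GB card. Effective
# bits/weight for Q4-class quants runs ~4.0-4.5 depending on flavor;
# the ~3 GB reserve for KV cache and CUDA context is a rule of thumb.

VRAM_GB = 24.0
RESERVE_GB = 3.0  # KV cache + CUDA context allowance (assumption)

def weights_gb(params_b: float, bits_per_weight: float) -> float:
    return params_b * bits_per_weight / 8

for name, p, bpw in [("Llama 3 8B @ Q4", 8, 4.5),
                     ("34B-class @ Q4", 34, 4.5),
                     ("~40B @ Q4", 40, 4.0),
                     ("Llama 3 70B @ Q4", 70, 4.5)]:
    w = weights_gb(p, bpw)
    verdict = "fits" if w + RESERVE_GB <= VRAM_GB else "needs offload"
    print(f"{name:17s} {w:5.1f} GB weights -> {verdict}")
```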

Hashcenter notes: triple-slot, 450 W TDP, 16-pin 12VHPWR connector (check cable quality and seating; early cables had connector issues that NVIDIA and partners have since addressed). 450 W on 120V/15A is the practical ceiling for a single card with a 1000 W PSU; a second 4090 really needs 240V. Credit Ada Lovelace: the 2022 flagship that made big local models feel responsive on a home rig.
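The circuit math behind that 120V note, sketched below. The 80% continuous-load derating is the standard NEC rule of thumb; the transient-spike, system-draw, and PSU-efficiency figures are assumptions, not measurements:

```python
# Why one 4090 is the practical 120V/15A ceiling: a continuous load
# is derated to 80% of the breaker rating, and 4090s spike well above
# their 450 W TDP. Spike, system, and efficiency figures are assumptions.

BUDGET_W = 120 * 15 * 0.8  # 1440 W continuous budget on a 15A breaker
PSU_EFF  = 0.90            # ~80 Plus Gold at load (assumption)
SPIKE_W  = 600             # per-card ms-scale transient (assumption)
SYSTEM_W = 250             # CPU, board, fans, drives (assumption)

for cards in (1, 2):
    wall = (cards * SPIKE_W + SYSTEM_W) / PSU_EFF
    ok = "OK" if wall <= BUDGET_W else "over budget -> 240V"
    print(f"{cards} x 4090: ~{wall:.0f} W worst-case at the wall ({ok})")
```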

Further reading: This card is a core component of a pleb-grade AI Hashcenter. Pair it with the sovereignty argument in the Sovereign AI for Bitcoiners Manifesto, or look at how the same 120V envelope powers a Bitcoin space heater in our mining catalog. Running both workloads on one rig? See Heating Your Home With Inference.

Get it running

  1. Install Ollama →

    Ten-minute local LLM runtime. One binary, zero cloud (a quick API sketch follows this list).

  2. Give it a UI →

    Open-WebUI turns Ollama into a self-hosted ChatGPT.

  3. Which runner? →

    LM Studio vs Ollama vs llama.cpp: pick the right runtime for your rig.
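
Once step 1 is done, Ollama answers on a local HTTP API at port 11434. A minimal sketch of hitting it from Python; the llama3 model name is an assumption (use whatever model you pulled):

```python
# Minimal check that your local Ollama install answers over its HTTP
# API (default port 11434). Assumes you've already pulled a model,
# e.g. `ollama pull llama3` -- the model name here is an assumption.
import json
import urllib.request

payload = json.dumps({
    "model": "llama3",               # whatever model you pulled
    "prompt": "Say hi in five words.",
    "stream": False,                 # one JSON blob instead of a stream
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```

Open-WebUI in step 2 talks to this same local endpoint, so if this snippet answers, the UI should too.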
