
RTX A4000

NVIDIA · single-slot blower · Released April 2021

Single-slot Ampere workstation card with 16 GB of ECC GDDR6 and a blower cooler. The quiet-rack pleb's favourite for dense multi-GPU builds.

Hardware spec sheet

Vendor: NVIDIA
Category: GPU
VRAM / memory: 16 GB
Memory bandwidth: 448 GB/s
FP16 TFLOPS: 19.2
INT8 TOPS: 155
TDP: 140 W
Architecture: Ampere
Form factor: single-slot blower
Release date: April 2021
Street price (USD): 600–900 (used)
120V note: 140 W each; four A4000s on one 120V/15A circuit is comfortable with an 850 W PSU.
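
Two of those numbers, 448 GB/s and 16 GB, bound what the card can do for single-stream inference: decoding is memory-bandwidth bound, so every generated token has to stream the full weight set from VRAM. A back-of-envelope sketch in Python; the bandwidth figure is from the spec sheet above, while the model sizes are rough assumptions for common quantizations, not measurements:

```python
# Back-of-envelope decode throughput: each generated token streams the full
# weight set from VRAM, so tokens/s is capped at bandwidth / weight size.
BANDWIDTH_GBS = 448  # RTX A4000 memory bandwidth, from the spec sheet

# Approximate on-disk weight sizes in GB (assumed, not measured).
models = {
    "Llama 3 8B @ FP16": 16.0,
    "Llama 3 8B @ Q4": 4.7,
    "14B @ Q8": 14.9,
}

for name, size_gb in models.items():
    # Upper bound only; real throughput lands below this once KV-cache
    # reads, compute, and kernel-launch overhead are counted.
    print(f"{name}: <= {BANDWIDTH_GBS / size_gb:.0f} tok/s")
```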

The RTX A4000 is NVIDIA’s Ampere-generation workstation card optimised for density: single-slot blower cooler, 140 W TDP, 16 GB of ECC GDDR6. Launched April 2021, it shares Ampere’s compute lineage with the 3090 — same tensor-core generation, same FP16/INT8 throughput per CUDA core — but in a form factor that lets you cram 4–7 cards into a single workstation chassis.

Who it’s for: Hashcenter builders who need multi-GPU density without datacenter cards. Four A4000s in a Threadripper workstation = 64 GB VRAM in a quiet, thermally sane package. Also popular in rack deployments where blower airflow matters.

Models it runs comfortably: a single card handles Llama 3 8B at FP16, 14B at Q8, and 32B at Q4 (tight). Two cards split a 70B at Q4.
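
To pencil out fits like these yourself: weight bytes ≈ parameters × bytes-per-parameter for the quantization, plus some overhead for KV cache and CUDA context. A minimal sketch; the bytes-per-parameter values and the 1 GiB overhead are assumptions, and anything landing near or above the card's 16 GiB means a tight fit, spilled layers, or a second card:

```python
# Rough VRAM-fit estimate: params x bytes/param plus an assumed ~1 GiB of
# KV cache and CUDA context at modest context lengths. Bytes-per-parameter
# values are rough assumptions (incl. quant metadata), not measurements.
GIB = 1024**3
BYTES_PER_PARAM = {"FP16": 2.0, "Q8": 1.07, "Q4": 0.57}

def vram_gib(params_b: float, quant: str, overhead_gib: float = 1.0) -> float:
    """Estimated GiB needed to host the model fully in VRAM."""
    return params_b * 1e9 * BYTES_PER_PARAM[quant] / GIB + overhead_gib

for params_b, quant in [(8, "FP16"), (14, "Q8"), (32, "Q4")]:
    # Results at or above 16 GiB mean spilling layers to CPU or adding a
    # card, which is what the "tight" caveat above is flagging.
    print(f"{params_b:>3}B @ {quant:<4}: ~{vram_gib(params_b, quant):.1f} GiB")
```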

Hashcenter notes: ECC memory is a genuine advantage for long-running inference workloads where bit-flips are a silent failure mode. The blower cooler is noticeably quieter than consumer 3090 blowers because the 140 W TDP keeps fan RPM lower. Used prices are $600–900 as of 2026. At 140 W each, four cards sit very comfortably on a single 120V/15A circuit. Credit to NVIDIA’s workstation team for a genuinely pleb-friendly dense-compute card.
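
The circuit math, as a minimal sketch: the GPU TDP is from the spec sheet, while the system draw and PSU efficiency are assumed figures for illustration.

```python
# Continuous-load budget for a 120V/15A branch circuit (80% rule of thumb).
circuit_w = 120 * 15 * 0.8   # 1440 W continuous budget
gpu_w = 4 * 140              # four A4000s at full TDP (spec sheet)
system_w = 200               # assumed CPU + board + drives + fans
psu_eff = 0.92               # assumed ~Gold-rated PSU efficiency

dc_w = gpu_w + system_w      # what the PSU must deliver
wall_w = dc_w / psu_eff      # what the circuit actually sees

print(f"DC load ~{dc_w} W (fits an 850 W PSU)")
print(f"Wall draw ~{wall_w:.0f} W of {circuit_w:.0f} W budget "
      f"({wall_w / circuit_w:.0%})")
# ~826 W of 1440 W: plenty of headroom on one 15A circuit.
```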

Further reading: This card is a core component of a pleb-grade AI Hashcenter. Pair it with the sovereignty argument in the Sovereign AI for Bitcoiners Manifesto, or look at how the same 120V envelope powers a Bitcoin space heater in our mining catalog. Running both workloads on one rig? See Heating Your Home With Inference.

Get it running

  1. Install Ollama →

    Ten-minute local LLM runtime. One binary, zero cloud. A quick sanity-check script follows this list.

  2. Give it a UI →

    Open-WebUI turns Ollama into a self-hosted ChatGPT.

  3. Which runner? →

    LM Studio vs Ollama vs llama.cpp — pick the right runtime for your rig.
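
Once step 1 is done, here is a quick way to confirm the card is actually serving tokens: a minimal sketch against Ollama's default local HTTP API, assuming you have already pulled a model (llama3 here is a placeholder for whatever you pulled).

```python
# Sanity-check a local Ollama install over its default HTTP API.
# Assumes `ollama pull llama3` has been run; swap in your own model name.
import json
import urllib.request

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "llama3",
        "prompt": "Say hello in five words.",
        "stream": False,  # one JSON blob instead of a token stream
    }).encode(),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

print(body["response"])
# eval_count / eval_duration report tokens generated and time spent (ns),
# so this gives a rough decode-throughput reading for the card.
tok_s = body["eval_count"] / (body["eval_duration"] / 1e9)
print(f"~{tok_s:.1f} tok/s")
```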
