GPU Inference as Heat Reuse: The Hashcenter Heat Story

A GPU running AI inference is a 100%-efficient electric space heater. Every watt drawn from the wall becomes 3.412 BTU/hr of usable heat — identical to a baseboard heater drawing the same load — except the GPU is simultaneously serving AI workloads. At Hashcenter scale, this heat becomes a co-generation asset: the BTU output that would be exhausted outdoors can instead displace propane, natural gas, or purchased electricity for facility heating.

Bitcoin ASICs pioneered the proof-of-concept: any high-density compute facility is also a heating plant. GPUs running local AI inference, model fine-tuning, or distributed training obey the same thermodynamics — and open a parallel heat-reuse story that extends the D-Central heat reuse thesis beyond SHA-256 hashing into the sovereign AI compute era. This page covers the physics, the home-scale math, the seasonal economics, and what the heat story looks like at Hashcenter density.

The physics: why GPU inference = space heater

The first law has no exceptions

The first law of thermodynamics forbids the destruction of energy. In any electrical device that does no external mechanical or chemical work (a baseboard heater, an Antminer S21, or an NVIDIA RTX 4090 running a 70-billion-parameter language model), essentially all of the electrical energy drawn from the wall leaves as heat. A negligible fraction exits as sound and electromagnetic emissions, and most of that is absorbed by the room as heat anyway, so for sizing purposes the conversion can be treated as exact: 1 watt of electrical draw = 3.412 BTU/hr of thermal output (ISO 80000-5, derived from 1 BTU = 1,055.06 joules).
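The conversion factor follows directly from the definitions of the watt and the BTU. A short derivation, using the 1 BTU = 1,055.06 J figure quoted above:

```latex
\[
1\,\mathrm{W}
  = 1\,\frac{\mathrm{J}}{\mathrm{s}}
  \times 3600\,\frac{\mathrm{s}}{\mathrm{hr}}
  \times \frac{1\,\mathrm{BTU}}{1055.06\,\mathrm{J}}
  \approx 3.412\,\frac{\mathrm{BTU}}{\mathrm{hr}}
\]
```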

This is not a GPU-specific property. It is a fundamental constraint of physics. A GPU at full inference load draws up to roughly its rated TDP (Thermal Design Power) and converts that entire draw to heat. There is no “efficiency loss” compared to a dedicated heater: both are 100% efficient at converting electricity to heat. The GPU merely does additional computational work at the same time.

The heat-reuse framing: If you are already running GPU inference for AI workloads, you are already producing heat. Heat reuse asks a single question: are you capturing that heat and using it, or are you exhausting it outside and then paying separately to heat the same space?

How a GPU produces heat

A modern discrete GPU (Graphics Processing Unit) dissipates heat through three primary paths:

GPU die: The CUDA/stream processor cores, memory controllers, and shader units generate the majority of heat during inference. At full load the die operates at 70–90°C with active cooling.
VRAM: Video memory (HBM2e on data-centre cards, GDDR6X on consumer cards) generates substantial heat at sustained inference throughput. VRAM thermal limits are a common bottleneck in sustained inference runs.
VRM and PCB: Voltage regulation modules dissipate additional heat as they step down supply voltages to the GPU die.

All of these heat sources exit through the GPU’s cooling system — air-cooled blower or open-air fans, or a liquid-cooled loop — and ultimately enter the ambient environment. In a typical server room or home office, that heat is removed by HVAC and exhausted. In a heat-reuse deployment, it is captured and put to work.

Inference load vs. TDP: what fraction of rated wattage runs?

GPU TDP is the rated maximum sustained power draw. Real inference loads vary:

Large language model inference (70B+ parameters, full quantization): Typically 80–100% TDP utilization on the GPU(s) with the model fully loaded in VRAM. Memory-bandwidth-limited workloads tend to sustain high GPU power.
Medium model inference (7B–13B, INT4/INT8): Often 40–70% TDP, depending on batch size and model quantization. See D-Central AI quantization guide for how quantization affects compute load.
Idle between inference calls: GPU idles at 15–30 W; this idle heat is minimal and should not be included in sustained heating calculations.

For a conservative heat reuse estimate, use actual sustained-load power draw from monitoring tools (nvidia-smi, GPU-Z) rather than rated TDP. For a ceiling estimate, use TDP. This page’s calculator accepts a custom wattage input for that reason.
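One way to get a sustained-load figure is to sample nvidia-smi repeatedly while inference is running and average the readings. The sketch below assumes nvidia-smi is on the PATH; the function names, sample count, and interval are illustrative choices, not part of any D-Central tool. It reports GPU power only, so CPU, RAM, and fan draw still need a wall meter.

```python
import subprocess
import time
from statistics import mean


def parse_power_csv(text: str) -> list[float]:
    """Parse nvidia-smi power.draw output (one watt value per GPU, one per line)."""
    readings = []
    for line in text.splitlines():
        try:
            readings.append(float(line.strip()))
        except ValueError:
            continue  # skip blank lines and "[N/A]" entries
    return readings


def read_gpu_watts() -> float:
    """Return the summed instantaneous power draw of all GPUs, in watts."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return sum(parse_power_csv(result.stdout))


def sustained_watts(samples: int = 30, interval_s: float = 2.0) -> float:
    """Average several readings taken while an inference job is running."""
    readings = []
    for _ in range(samples):
        readings.append(read_gpu_watts())
        time.sleep(interval_s)
    return mean(readings)
```

Calling `sustained_watts()` during a representative inference run gives a number suitable for the conservative estimate; the rated TDP remains the ceiling.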

Home-scale math: single-GPU cost framing

The most accessible heat reuse scenario is the home AI builder running a local LLM on a consumer GPU — the exact hardware profiled in the D-Central local AI hardware guide and GPU for local LLM comparison. Here is how the heat math works at the single-GPU level.

Reference GPU thermal profiles (approximate — verify with manufacturer)

The following figures use NVIDIA published TDP specifications as of their respective product launches. Actual sustained inference draw is typically 40–90% of TDP, depending on model, batch size, and driver power limits. These are upper-bound estimates.

GPU	Rated TDP (W) †	Heat at TDP (BTU/hr)	Rough room coverage ‡	Primary use case
RTX 3060 (12 GB)	170 W	~580 BTU/hr	~58 sq ft	7B models, INT4
RTX 3090 (24 GB)	350 W	~1,194 BTU/hr	~119 sq ft	13B–30B models, INT4/INT8
RTX 4070 Ti Super (16 GB)	285 W	~972 BTU/hr	~97 sq ft	13B models, INT4/INT8
RTX 4090 (24 GB)	450 W	~1,535 BTU/hr	~154 sq ft	30B–70B models (INT4), largest consumer model
2× RTX 3090 NVLink (48 GB)	700 W (2 × 350 W, GPUs only)	~2,388 BTU/hr	~239 sq ft	70B models (INT4); see GPU comparison guide
A100 80 GB SXM (server)	400 W	~1,365 BTU/hr	~137 sq ft	70B+ FP16 inference, fine-tuning
H100 SXM (server)	700 W	~2,388 BTU/hr	~239 sq ft	Large-scale inference, MoE models
8× H100 SXM (DGX H100)	~10,200 W (system)	~34,800 BTU/hr	~3,480 sq ft (heating only; not home-scale)	Frontier-model serving, Hashcenter AI node

† TDP figures from NVIDIA published product specifications at time of product launch. Actual sustained inference draw varies by workload, power limit settings, and driver version. Verify with nvidia-smi for your specific deployment. ‡ Room coverage using the 10 BTU/hr per sq ft HVAC rule of thumb; consult a heating professional for your specific space.
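The heat and room-coverage columns are simple multiplications, so they can be regenerated for any wattage. A minimal sketch, using the 3.412 BTU/hr per watt factor and the 10 BTU/hr per sq ft rule of thumb from the footnote (function names are illustrative):

```python
BTU_HR_PER_WATT = 3.412      # 1 W = 3.412 BTU/hr
BTU_HR_PER_SQFT = 10.0       # HVAC rule of thumb from the footnote


def heat_btu_hr(watts: float) -> float:
    """Thermal output in BTU/hr for a given electrical draw."""
    return watts * BTU_HR_PER_WATT


def room_coverage_sqft(watts: float) -> float:
    """Rough floor area the heat could cover under the rule of thumb."""
    return heat_btu_hr(watts) / BTU_HR_PER_SQFT


# TDP ceilings taken from the table above.
for name, tdp_w in [("RTX 3060", 170), ("RTX 3090", 350), ("RTX 4090", 450)]:
    print(f"{name}: {heat_btu_hr(tdp_w):,.0f} BTU/hr, "
          f"~{room_coverage_sqft(tdp_w):.0f} sq ft")
```

Substituting a measured sustained wattage for the TDP gives the conservative version of the same row.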

The net-cost framing

The key question is not “does the GPU produce heat?” (it always does) but “what is the net cost of the electricity I am already spending, once I account for the heating I no longer need to buy?”

Consider a home inference server in Quebec (approximately 7.18 ¢/kWh Tier 1 residential, per the D-Central electricity rates dataset; verify at Hydro-Québec for your tariff class) running an RTX 4090 at 400 W sustained load (not TDP; accounting for system draw):

Electricity cost: 400 W × 24 hr × 30 days = 288 kWh/month × $0.0718/kWh ≈ $20.68 CAD/month (estimate; verify with your utility rate)
Heat produced: 288 kWh/month × 3,412 BTU/kWh ≈ 982,656 BTU/month of heat
Equivalent heating oil displaced: 982,656 BTU / (36,200 BTU/litre × 0.85 efficiency) ≈ 32 litres/month, taking heating oil at roughly 38.2 MJ (about 36,200 BTU, or 10.6 kWh) per litre. At approximately $1.30/litre (hedged; 2024/2025 benchmark; verify with your supplier), that is approximately $42 CAD/month in heating oil that does not need to be purchased (estimate only)
Net framing: The inference server costs ~$20.68/month to run on Quebec rates and displaces an estimated ~$42 of heating oil, meaning the GPU’s heat alone more than offsets the electricity cost for a Quebec user displacing oil heat. The AI work is, in this framing, “free” during the heating season.
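The same arithmetic as a short script, so the inputs can be swapped for your own. It assumes heating oil holds roughly 10.6 kWh per litre (about 38.2 MJ/L); that energy content, like the price and furnace efficiency, is an input to verify rather than a fixed constant.

```python
def oil_offset(watts: float, rate_per_kwh: float, oil_price_per_litre: float,
               furnace_efficiency: float = 0.85,
               oil_kwh_per_litre: float = 10.6,   # assumed energy content
               hours: float = 24 * 30) -> dict:
    """Monthly electricity cost versus the heating oil the same heat displaces."""
    kwh = watts * hours / 1000
    electricity_cost = kwh * rate_per_kwh
    # Each litre only delivers (energy content x furnace efficiency) as room heat.
    litres = kwh / (oil_kwh_per_litre * furnace_efficiency)
    return {
        "kwh": kwh,
        "electricity_cost": electricity_cost,
        "litres_displaced": litres,
        "oil_cost_avoided": litres * oil_price_per_litre,
    }


example = oil_offset(watts=400, rate_per_kwh=0.0718, oil_price_per_litre=1.30)
print({key: round(value, 2) for key, value in example.items()})
```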

Important caveats on this example

Quebec Tier 1 rate applies only to the first ~33 kWh/day of consumption; additional use enters Tier 2 (~10.95 ¢/kWh as of 2025 — verify with Hydro-Québec). Sustained high-wattage inference will push many households into Tier 2.
Heating oil price varies by region, supplier, and date. The $1.30/litre figure is an approximate 2024/2025 Canadian benchmark from the Canadian Petroleum Products Institute. Verify with your supplier before making any decisions.
The heating benefit only applies when you actually need heat — approximately November through April in most of Quebec and Ontario. In summer, this calculation inverts.
This is orientation only — not financial advice. Consult a professional before making infrastructure investment decisions.
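The tier caveat can be made concrete with a small sketch. It assumes a simplified per-day tier boundary (real bills aggregate over the billing period), uses the ~33 kWh/day threshold and the 7.18 and 10.95 ¢/kWh rates quoted on this page as defaults, and treats the household's other consumption (`baseline_kwh_per_day`) as a hypothetical input.

```python
def marginal_monthly_cost(baseline_kwh_per_day: float, extra_kwh_per_day: float,
                          tier1_limit: float = 33.0,
                          tier1_rate: float = 0.0718,
                          tier2_rate: float = 0.1095,
                          days: int = 30) -> float:
    """Extra monthly cost of adding a load on top of existing household use."""
    def daily_cost(kwh: float) -> float:
        tier1_kwh = min(kwh, tier1_limit)
        tier2_kwh = max(kwh - tier1_limit, 0.0)
        return tier1_kwh * tier1_rate + tier2_kwh * tier2_rate

    with_load = daily_cost(baseline_kwh_per_day + extra_kwh_per_day)
    without_load = daily_cost(baseline_kwh_per_day)
    return (with_load - without_load) * days


gpu_kwh_per_day = 0.4 * 24  # 400 W running continuously
for baseline in (20, 30, 40):
    cost = marginal_monthly_cost(baseline, gpu_kwh_per_day)
    print(f"baseline {baseline} kWh/day: +${cost:.2f}/month")
```

With a low baseline the whole GPU load stays in Tier 1; once the household already exceeds the threshold, every GPU kilowatt-hour is billed at the Tier 2 rate.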

Seasonal economics: heating season vs. summer

GPU inference heat is valuable in winter and a liability in summer. This asymmetry shapes how you should think about inference workload scheduling, hardware sizing, and cooling infrastructure. Canada’s heating-season economics are particularly favourable.

Canadian heating season: the sweet spot

Natural Resources Canada’s household energy data indicates that space heating alone accounts for approximately 60% of total residential energy use in Canada (roughly 80% once water heating is included), with the heating season spanning roughly 5–7 months depending on province (longer in Quebec, Manitoba, and northern regions; shorter in BC’s Lower Mainland and southern Ontario).

During the heating season:

Every watt of GPU inference heat that enters the living or working space directly offsets heating demand from your primary heat source (gas, oil, electric, propane).
If your inference server is in a room with a thermostat, the thermostat will run the primary heater less — the offset is automatic.
For homes heated by propane or oil (Atlantic Canada, rural Quebec, northern Ontario), the per-BTU cost of the displaced fuel can be substantially higher than the electricity cost to run the GPU — creating the strongest net-cost framing.
For homes heated by natural gas (most of Ontario and BC), the economics depend on the gas rate and the electricity rate. The case is strongest where cheap electricity displaces expensive oil (Quebec-level electricity rates against Atlantic-level oil prices). At Ontario electricity rates against Ontario gas, the math is tighter; use the calculator below for your specific inputs.

Summer: the liability side

In summer, GPU inference heat adds to the building’s cooling load. If you have air conditioning, every watt of GPU heat must be removed by the AC — which costs additional electricity, typically at a COP of 2.5–4.0 for a modern air-source unit. The net effect:

No AC: GPU heat raises the building temperature. You pay the GPU’s electricity cost and get no heating benefit. The “free heat” framing does not apply.
With AC (COP 3.0): The AC must remove the GPU’s heat output. At 400 W GPU load, the AC draws an additional ~133 W to remove that heat (400 W / COP 3.0). Total effective cost: 400 W inference + 133 W additional AC = ~533 W effective draw for the same inference work you do in winter at 400 W. Summer inference in a cooled space costs approximately 33% more than winter inference on a net-energy basis (rough estimate; actual depends on COP, infiltration, building envelope).
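The air-conditioning penalty is a one-line relationship: the extra draw is the GPU load divided by the AC's COP. A minimal sketch (function names are illustrative) that reproduces the 400 W example and shows the spread across the 2.5–4.0 COP range:

```python
def summer_effective_watts(gpu_watts: float, ac_cop: float) -> float:
    """GPU draw plus the extra AC draw needed to remove the GPU's heat."""
    ac_watts = gpu_watts / ac_cop
    return gpu_watts + ac_watts


def summer_penalty_pct(ac_cop: float) -> float:
    """Extra electricity for cooled-space inference, as a percent of GPU draw."""
    return 100.0 / ac_cop


print(round(summer_effective_watts(400, 3.0)))   # 400 W GPU behind a COP-3 AC
for cop in (2.5, 3.0, 4.0):
    print(f"COP {cop}: +{summer_penalty_pct(cop):.0f}%")
```

This ignores infiltration and envelope effects, which is why the page treats the percentage as a rough estimate.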

Scheduling insight: For workloads with flexibility (batch embedding runs, fine-tuning jobs, RAG index rebuilds), scheduling heavy inference tasks during the heating season and lighter workloads in summer is a practical way to capture the heat value without increasing summer cooling costs.

Province-by-province framing (approximate)

The economics vary significantly by province. Higher electricity rates reduce the advantage; lower rates expand it. The heat-offset benefit is largest where the displaced fuel (oil, propane) is expensive relative to electricity.

Province	Approx. electricity rate †	Primary residential heat fuel	Heat-reuse economics framing
Quebec	~7.18–10.95 ¢/kWh (Tier 1/2)	Electric, oil (rural)	Strong: low electricity rate, high oil displacement value in rural areas; winter heating season is long
Manitoba	~9.97 ¢/kWh	Natural gas, electric	Moderate-strong: low electricity rate, gas displacement economics depend on rate
British Columbia	~11.87 ¢/kWh (Tier 1)	Natural gas, electric	Moderate: reasonable electricity rate, coastal mild winters reduce heating season length
Ontario	~12–18 ¢/kWh all-in	Natural gas (dominant)	Moderate: depends on all-in rate (Global Adjustment adds significantly); gas is relatively cheap
Atlantic (NB, NL, NS, PE)	~14–19 ¢/kWh	Heating oil (large share), electric	Variable: high oil prices improve heat-offset value, but high electricity rates erode the base margin
Territories (YT, NT, NU)	28–75 ¢/kWh	Diesel, propane	Challenging: electricity costs overwhelm most heat offsets; fuel displacement value would need to be very high

† Residential Tier 1 rates from the D-Central Canadian electricity rates dataset, sourced from provincial utilities as of 2025/2026. Rates change; verify with your utility before making decisions.

GPU inference heat calculator

Enter your GPU’s actual sustained inference wattage (not TDP — use nvidia-smi to measure your real load), your province’s electricity rate, and the fuel you would otherwise use for heating. The calculator shows your monthly heat value, electricity cost, and net-cost framing for the heating season. This is an estimate tool, not financial advice — verify all inputs against your actual bills.

GPU inference heat value estimator

A. Your GPU inference load

Preset GPU (TDP ceiling): These are TDP ceiling values. Actual sustained inference draw is typically 40–90% of TDP. Measure with nvidia-smi for accuracy.
Actual sustained inference wattage (W): Include total system draw (GPU + CPU + RAM + fans) for a whole-system heat estimate. GPU-only is valid for the GPU-side heat only.
Inference hours per day (average): 8 = typical office-hours inference server. 24 = continuous serving (Hashcenter inference node).
Heating season length (months/year): Quebec/MB: 6–7 months; BC Lower Mainland: 4–5 months; Territories: 8–9 months. Approximation only.

B. Your electricity rate

Province: Tier 1 residential rates from D-Central electricity dataset. Verify with your utility.
Electricity rate (¢/kWh): Auto-filled from province. Edit to match your actual bill.

C. Displaced heating fuel (heating season only)

Fuel displaced: The fuel you would otherwise use for heating.
Fuel price ($/litre): ~$1.15–$1.50/litre heating oil (Canada, 2024/2025 approx; verify). Source: Canadian Petroleum Products Institute benchmarks.
Appliance efficiency (%): Standard oil furnace: 80–85%; condensing gas: 90–97%; electric: 100%; heat pump: 200–400% (COP). Source: NRCan.
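For readers who want to run the numbers offline, the estimator's inputs map onto a few lines of arithmetic. This is a sketch of one plausible formulation, not the page's actual calculator code: the function and key names are hypothetical, and the fuel energy contents (about 10.6 kWh per litre of heating oil, about 7.0 kWh per litre of propane) are approximate assumptions to verify.

```python
# Approximate energy content per purchased unit, in kWh (assumed values).
FUEL_KWH_PER_UNIT = {
    "heating_oil": 10.6,   # per litre
    "propane": 7.0,        # per litre
    "electric": 1.0,       # per kWh (baseboard, or heat pump via efficiency > 1)
}


def estimate(watts: float, hours_per_day: float, season_months: float,
             rate_cents_per_kwh: float, fuel: str, fuel_price: float,
             appliance_efficiency: float, days_per_month: int = 30) -> dict:
    """Monthly and seasonal net-cost framing for inference heat.

    appliance_efficiency is a fraction: 0.85 for an 85% furnace,
    3.0 for a heat pump with a COP of 3.
    """
    kwh = watts / 1000 * hours_per_day * days_per_month
    electricity_cost = kwh * rate_cents_per_kwh / 100
    useful_kwh_per_unit = FUEL_KWH_PER_UNIT[fuel] * appliance_efficiency
    fuel_cost_avoided = kwh / useful_kwh_per_unit * fuel_price
    net = electricity_cost - fuel_cost_avoided
    return {
        "monthly_kwh": kwh,
        "monthly_heat_btu": kwh * 3412,
        "monthly_electricity_cost": electricity_cost,
        "monthly_fuel_cost_avoided": fuel_cost_avoided,
        "monthly_net_cost": net,               # negative means heat outweighs power
        "seasonal_net_cost": net * season_months,
    }


oil_home = estimate(400, 24, 6, 7.18, "heating_oil", 1.30, 0.85)
heat_pump_home = estimate(400, 24, 6, 7.18, "electric", 0.0718, 3.0)
print(round(oil_home["monthly_net_cost"], 2),
      round(heat_pump_home["monthly_net_cost"], 2))
```

A negative net cost means the displaced fuel is worth more than the electricity; the heat pump case comes out positive because a COP-3 unit would have delivered the same heat for a third of the electricity.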

Scaling to a Hashcenter

Individual GPU heat is a room-warming story. At Hashcenter density — racks of GPUs running sustained AI inference, fine-tuning, or distributed training alongside Bitcoin ASICs — the BTU output becomes a co-generation asset that changes the facility economics. This is the extension of D-Central’s energy for compute thesis from a single device to a facility-level heat budget.

What Hashcenter AI compute density looks like

For reference (from published NVIDIA specifications; actual deployed configurations vary):

Configuration	Approx. GPU power draw	Thermal output (BTU/hr)	Equivalent to
4-GPU inference node (4× RTX 4090)	~1,800 W (GPU only)	~6,142 BTU/hr	~1.8 kW baseboard heater
1 DGX H100 (8× H100 SXM)	~10,200 W (system)	~34,800 BTU/hr	~3 residential Antminer S21 units
4-rack AI cluster (8 DGX H100 nodes)	~81,600 W (82 kW)	~278,400 BTU/hr	~23 Antminer S21 units (equivalent heat)
100-GPU inference pod (mixed RTX/H100)	~50,000 W (50 kW, estimated)	~170,600 BTU/hr	~14 Antminer S21 units (equivalent heat)

GPU system power figures from NVIDIA published DGX specifications and consumer GPU TDP. “GPU only” figures do not include networking, storage, or facility overhead. Real Hashcenter power consumption is typically 30–60% higher than GPU-only TDP due to infrastructure overhead. Use these as order-of-magnitude reference only.
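The table rows and the overhead note can be combined into one small helper. The sketch below assumes an Antminer S21 draws about 3,500 W (the figure implied by the table's equivalences, not a quoted specification) and treats infrastructure overhead as a simple multiplier on IT power.

```python
BTU_HR_PER_WATT = 3.412
S21_WATTS = 3500.0  # assumed per-unit draw used for the "equivalent to" column


def facility_heat(it_watts: float, overhead_fraction: float = 0.0) -> dict:
    """Heat budget for a compute load, optionally with infrastructure overhead."""
    total_watts = it_watts * (1.0 + overhead_fraction)
    return {
        "total_watts": total_watts,
        "btu_hr": total_watts * BTU_HR_PER_WATT,
        "s21_equivalents": total_watts / S21_WATTS,
    }


cluster = facility_heat(8 * 10_200)                      # 8 DGX H100 nodes
node_with_overhead = facility_heat(4 * 450, overhead_fraction=0.30)
print(f"{cluster['btu_hr']:,.0f} BTU/hr, "
      f"~{cluster['s21_equivalents']:.0f} S21 equivalents")
print(f"{node_with_overhead['total_watts']:,.0f} W with 30% overhead")
```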

The Hashcenter heat story: from exhaust to asset

A conventional hyperscale compute facility treats heat as a pure cost: it must be removed by chillers, cooling towers, or direct liquid cooling, consuming additional electricity in the process. A Hashcenter designed around heat reuse inverts this: the heat is a second product alongside the compute revenue.

The three practical pathways for Hashcenter heat reuse at AI-GPU density:

Direct air-to-space heating: GPU rack exhaust (typically 40–55°C) is ducted into adjacent spaces that require heat — an attached warehouse, workshop, grow facility, or office block. No heat exchanger required; the simplest integration. Effective in colder climates where the 40–55°C exhaust temperature is well above ambient. This approach mirrors what ASIC miners have implemented in hashcenters across Quebec, Nordic countries, and northern Ontario.
Liquid-cooled GPU loop to hydronic heat distribution: GPU nodes using direct liquid cooling (DLC) or rear-door heat exchangers transfer heat to a closed water loop. The heated water (typically 40–60°C output from GPU CDUs) circulates to radiant floor loops, baseboard radiators, or a district heat exchanger. This is the most infrastructure-intensive but most efficient approach — minimizing heat losses and enabling higher-temperature distribution. Applicable at multi-rack scale. See immersion vs. air cooling comparison for how liquid cooling affects heat capture quality.
Heat-as-a-service to adjacent facilities: A Hashcenter located adjacent to a food processor, greenhouse cluster, aquaculture facility, or commercial laundry can sell or supply heat under a service agreement. The compute facility receives revenue (or reduced rent / land costs) from the heat buyer; the buyer receives lower-cost thermal energy than from natural gas or propane. This model has operational precedents in Nordic countries, credited to early GPU and ASIC mining operations that demonstrated the co-generation model at commercial scale.

ASIC + GPU hybrid heat at Hashcenter scale

D-Central’s distributed compute thesis positions Bitcoin ASICs and GPU inference hardware as complementary, not competing, within a sovereign compute stack. From a heat perspective, this complementarity is particularly clean:

ASICs run 24/7: Bitcoin mining is a continuous workload — the miner runs whenever it is profitable, which is most of the time. This produces a steady baseline heat load throughout the heating season.
GPU inference is bursty: AI inference workloads are driven by demand — request volumes vary by time of day, seasonality, and workload type. During peak hours, GPU heat supplements ASIC heat. During off-peak, ASICs carry the base load.
Combined heat profile: A hybrid ASIC + GPU Hashcenter produces a steadier thermal output profile than GPU inference alone, because the ASIC base load sets a floor under the variable GPU load. That simplifies heat recovery system sizing.
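A toy illustration of the base-load effect: the hourly GPU profile and the single 3,500 W ASIC below are hypothetical numbers chosen only to show how a constant load shrinks the relative swing a heat recovery system has to handle.

```python
def peak_to_trough(profile: list[float]) -> float:
    """Ratio of the highest to the lowest hourly load."""
    return max(profile) / min(profile)


ASIC_WATTS = 3500.0  # assumed constant 24/7 draw

# Hypothetical GPU load by hour: overnight idle, daytime peak, evening taper.
gpu_profile = [150.0] * 8 + [1800.0] * 10 + [600.0] * 6
combined_profile = [ASIC_WATTS + gpu for gpu in gpu_profile]

print(f"GPU only: {peak_to_trough(gpu_profile):.1f}x swing")
print(f"ASIC + GPU: {peak_to_trough(combined_profile):.2f}x swing")
```

The absolute swing in watts is unchanged; what improves is the ratio between peak and minimum output, which is what drives heat exchanger and distribution sizing.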

For a worked example of the ASIC side of this heat story, see the ASIC heat reuse net-cost calculator and the heat reuse hub.

Scaling note: At Hashcenter scale, co-generation economics become highly site-specific: land costs, proximity to heat buyers, local utility tariff structure for commercial loads, and permitting all affect the business case. This page provides orientation only. Contact D-Central for a consultation on your specific deployment.

ASIC heat vs. GPU heat: complementary, not competing

A common question when D-Central’s heat reuse thesis is applied to both Bitcoin ASICs and GPU inference: which is “better” for heat reuse? The answer is that they serve different use cases and roles within a sovereign compute stack.

Dimension	Bitcoin ASIC heat	GPU inference heat
Revenue alongside heat	Bitcoin (sats/day, network-dependent)	AI workload value (inference serving, data processing, model training)
Workload profile	Continuous, steady (24/7 unless stopped)	Demand-driven, bursty (inference requests); can be sustained (training runs)
Wattage per unit	High (1,350–3,600 W for modern ASICs)	Variable (170 W consumer to 10,200 W DGX per node)
Heat temperature	Exhaust air 50–70°C (air-cooled); higher with immersion	Exhaust air 40–55°C (air-cooled); ~50–60°C water-side (DLC)
Noise	High (65–80 dB stock); can be reduced with quiet fans	Moderate (server-grade fans, 55–70 dB at full load)
Home deployment viability	Yes, with acoustic treatment (see BTU calculator)	Yes; server GPU fans are quieter than ASIC fans; home AI server viable
Sovereignty dimension	Decentralizes Bitcoin hash rate; sovereign money production	Decentralizes AI inference; data stays local; see Sovereign AI Canada

The Bitcoin ASIC side of this story is covered in depth at the heat reuse hub and ASIC heat reuse net-cost calculator. The GPU inference side is what this page covers. Used together — a hybrid ASIC/GPU Hashcenter or home sovereign compute stack — both heat streams contribute to the same physical space, and the sovereignty case is compounded: self-custodied bitcoin production and self-hosted AI inference, both producing useful heat.

Practical deployment notes

Capturing GPU heat effectively

Air-cooled GPUs exhaust heat differently depending on cooler design:

Blower-style coolers (reference designs, server GPUs) exhaust hot air out the rear bracket — well-suited for rack enclosures and directed ducting. All heat exits one direction, making capture straightforward.
Open-air coolers (most aftermarket consumer cards: triple-fan designs) exhaust downward and upward into the case, then out through chassis fans. Heat is more diffuse; capturing it requires enclosure-level ducting rather than card-level duct work.

For home inference heating, placing the GPU system in a utility room or enclosed cabinet with a duct to the living space captures exhaust air effectively. For Hashcenter rack deployments, rear-door heat exchangers or in-row cooling with heat recovery is the standard approach.

Inlet temperature management

GPU thermal limits are tighter than ASIC limits in one respect: VRAM thermal throttling. Modern data-centre GPUs (A100, H100) have VRAM operating limits of 85–90°C; consumer GPUs (RTX 40-series) have junction temperature limits of 95–105°C. If the GPU draws pre-warmed recirculated air as its cooling inlet, thermal headroom shrinks and throttling becomes more likely. Best practice: maintain a cool fresh-air inlet (outside air or unheated space) even when capturing exhaust heat for the room. This is the same discipline required for ASIC heat reuse deployments.

Power delivery for home AI inference servers

A consumer GPU server drawing 450–700 W (GPU alone) plus CPU, RAM, storage, and network draws 700–1,200 W total from the wall, equivalent to a small to mid-size space heater. This fits on a standard 15 A household circuit (1,800 W nominal at 120 V, or about 1,440 W under the usual 80% guideline for continuous loads), provided nothing else substantial shares the circuit. For multi-GPU servers (2× RTX 3090 or 4× RTX 4070 Ti), a dedicated 20 A or 240 V circuit is advisable. Consult a licensed electrician for any significant infrastructure change. See Home mining circuit planner for circuit sizing guidance (primarily covers ASICs, but the electrical principles apply equally to GPU servers).
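A quick headroom check before plugging in. The sketch assumes the common 80% derating for continuous loads used in North American practice; the function names are illustrative, and none of this replaces an electrician's assessment of what else shares the circuit.

```python
def continuous_capacity_watts(breaker_amps: float, volts: float = 120.0,
                              derate: float = 0.8) -> float:
    """Usable continuous power on a circuit under an assumed derating factor."""
    return breaker_amps * volts * derate


def fits_circuit(load_watts: float, breaker_amps: float,
                 volts: float = 120.0) -> bool:
    """True if a continuous load stays within the derated circuit capacity."""
    return load_watts <= continuous_capacity_watts(breaker_amps, volts)


for load in (700, 1200, 1600):
    print(f"{load} W on 15 A / 120 V: {fits_circuit(load, 15)}; "
          f"on 20 A / 120 V: {fits_circuit(load, 20)}")
```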

Monitoring and automation

Combining a GPU inference server with home heating is simplest when thermostat control is preserved. If the GPU runs in a thermostatically controlled room, the thermostat will automatically reduce boiler or furnace runtime as the GPU contributes heat — no special integration needed. For more active management, home automation systems (Home Assistant, for example) can monitor GPU wattage via MQTT/API and correlate it with thermostat demand data, allowing workload scheduling to align with heating need. This is an advanced integration; D-Central covers foundational approaches in the local LLM Canada guide and sovereign stack guide.

Frequently asked questions

Does running AI inference actually heat a room?

Yes. A GPU running AI inference is a resistive electrical load — it converts 100% of its power draw to heat, by the first law of thermodynamics. A 350 W RTX 3090 at full inference load produces approximately 1,194 BTU/hr of heat — equivalent to a small electric space heater. In a well-insulated home office, that is enough to meaningfully reduce thermostat demand from your primary heating system. The GPU does not “waste” heat compared to a dedicated heater; it is thermodynamically identical while also running your AI workloads.

How does GPU inference heat compare to Bitcoin ASIC heat?

The thermodynamics are identical: both are resistive electrical loads. The practical differences are wattage range (ASICs tend to draw 1,350–3,600 W per unit; GPUs draw roughly 170–450 W per consumer card and up to about 700 W per data-centre card), the revenue alongside the heat (Bitcoin for ASICs, AI workload value for GPUs), and noise profile (stock ASICs are louder than GPU servers at equivalent wattages). At a hybrid Hashcenter with both ASIC and GPU hardware, both heat streams combine in the same facility. Neither is “better”: they serve different compute roles and together produce a complementary heat profile. See the heat reuse hub for the ASIC side of this story.

When does GPU inference heat NOT save money?

Several conditions reduce or eliminate the heat-offset value: (1) You have a heat pump — a COP-3 heat pump delivers 3 kWh of heat per 1 kWh of electricity, making it 3× more efficient than a GPU-as-heater. The GPU’s heat is “free” relative to what you already spend on inference, but it competes poorly against a heat pump in per-BTU cost. (2) It is summer and you have air conditioning — GPU heat becomes a cooling load, not a heating asset, adding ~25–40% to the effective electricity cost of inference (rough estimate; depends on AC COP). (3) You are in a high-electricity-rate province (Nova Scotia, Territories) — electricity cost overwhelms the fuel offset value. (4) Your inference load factor is low — a GPU idling at 25 W between sparse requests produces minimal heat and minimal offset.

Is it worth buying a GPU just for heating with AI inference?

This is a capital-allocation question that D-Central does not answer with a blanket recommendation — the answer depends on your existing AI workload need, your electricity rate, your heating fuel, and your investment horizon. The heat-offset value adds to the ROI of hardware you were already planning to buy for AI inference; it is a bonus from workloads you run anyway, not a justification on its own to buy new hardware purely for heat. Use the calculator above and consult a financial professional before making capital decisions.

How many GPUs would I need to heat a whole house with AI inference?

Using the 10 BTU/hr per sq ft rule of thumb for a moderate climate: a 1,200 sq ft well-insulated home needs approximately 12,000 BTU/hr of continuous heat. At 1,535 BTU/hr per RTX 4090 at TDP (450 W), you would need approximately 8 RTX 4090 units running at full load continuously — drawing approximately 3,600 W in GPU power alone, plus system overhead. That is well beyond a single home circuit and approaches small-scale GPU cluster territory. In practice, GPU heat supplements rather than replaces a primary heating system at home scale; whole-home AI inference heat is a Hashcenter concept, not a residential one. At the home level, a 1–4 GPU inference server meaningfully reduces heating demand without replacing it.
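The sizing estimate generalizes to any floor area and card. A minimal sketch using the same rule of thumb (10 BTU/hr per sq ft) and the 3.412 BTU/hr per watt factor; the function name is illustrative:

```python
import math


def gpus_needed(floor_area_sqft: float, gpu_watts: float,
                btu_hr_per_sqft: float = 10.0) -> int:
    """Whole GPUs at a given sustained wattage needed to meet the heat demand."""
    demand_btu_hr = floor_area_sqft * btu_hr_per_sqft
    per_gpu_btu_hr = gpu_watts * 3.412
    return math.ceil(demand_btu_hr / per_gpu_btu_hr)


count = gpus_needed(1200, 450)          # RTX 4090 at TDP
print(count, "GPUs,", count * 450, "W of GPU power")
```

Colder climates or leakier buildings need a higher BTU/hr per sq ft figure, which raises the count proportionally.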

What is the sovereign argument for local AI inference + heat reuse?

Running local AI inference on your own hardware keeps your prompts, documents, and queries off third-party servers — data sovereignty, not a cloud provider’s terms of service. The heat the GPU produces is a byproduct of sovereignty-preserving compute. In a Canadian context — where federal and provincial data residency expectations are evolving (see AI regulation in Canada) — local inference is increasingly relevant beyond just the heat story. D-Central’s Sovereign AI Canada hub and distributed compute page cover the broader case; the heat reuse angle reinforces it: local inference costs less net energy during the heating season than cloud API calls plus a separate heater running simultaneously.

How do I measure how much heat my GPU inference server actually produces?

The most direct method: measure total system power draw from the wall using a plug-in power meter (e.g., Kill-A-Watt or a Canadian-compatible equivalent). Whatever the meter reads while inference is running is the heat output in watts (divide by 1,000 for kW; multiply by 3.412 for BTU/hr). GPU-side power specifically can be read from nvidia-smi (command: nvidia-smi --query-gpu=power.draw --format=csv) — but this captures GPU power only, not CPU, RAM, or fans. For heat sizing purposes, whole-system wall power draw is the correct input to the calculator above.

Standing on the shoulders of giants

The concept of GPU and ASIC compute heat as a building asset was not invented by D-Central. The Heat Punks community pioneered the cultural and engineering movement around Bitcoin ASIC heat reuse. Nordic compute facilities — particularly operators in Sweden, Finland, and Norway — were among the first to integrate GPU and ASIC mining heat into district heating networks at commercial scale, demonstrating the co-generation model that forms the basis of what we describe here. Braiins contributed public research on the “Hashrate Heated House” concept. The AI inference heat reuse framing builds on the open-source local AI ecosystem — llama.cpp (Georgi Gerganov), Ollama, and vLLM — whose work makes self-hosted inference on consumer GPU hardware practical. We credit all of these predecessors and stand on their shoulders, not above them.