Antminer T21 – Temperature Too High
Critical — Immediate action required
Symptoms
- Dashboard shows ERROR_TEMP_TOO_HIGH, often with `PCB temp 255 max 80` or `chip temp 255 max 95` (255 is the failed-read sentinel, not a literal reading)
- Miner shuts down automatically under load after 3-20 minutes of hashing, then reboots in a loop
- Hashboard PCB temps reported above 80 degC or chip temps above 95 degC on any of the three boards
- All four PWM fans ramp to 100% duty (~6000-6600 RPM) and stay there until shutdown
- kern.log shows `over max temp, pcb temp X chip temp Y` entries on chain 0, 1, or 2
- One hashboard runs markedly hotter than the other two: chain temps on the miner status page diverge by more than 5 degC
- Exhaust air noticeably hotter than expected (>15 degC intake-to-exhaust rise)
- Realized hashrate drops below 180 TH/s (nameplate is 190 TH/s) even between shutdowns
- Chip-to-chip temp delta inside one chain exceeds 8 degC in firmware logs
- Dust cake visible on intake grille, heatsink fins, or fan hubs during visual inspection
Step-by-Step Fix
Power off the miner at the breaker for a full 5 minutes — not a soft reboot. Let the chassis drop below 45 degC before any diagnostic work; kernel watchdog state can wedge after a thermal trip and only a full power cycle reliably clears it. Confirm the breaker actually killed power by checking that status LEDs are fully dark before touching anything.
Verify ambient temperature at the intake grille with a probe-type thermometer held in the incoming airstream. An IR thermometer reads the surface it is pointed at, so it reports the metal grille rather than the air, and the wall thermostat is not a substitute either. Target is at or below 30 degC for no-derate operation; absolute ceiling is 35 degC. If intake reads warmer than room ambient, you have recirculation: hot exhaust from this or an adjacent miner is being re-inhaled. Move the miner, add ducting, or separate intake/exhaust paths before retrying.
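The intake check above reduces to two comparisons. A minimal sketch, assuming you have one reading at the grille and one for the room; the function name, the labels, and the 1 degC noise margin are illustrative choices, not part of any Bitmain tooling:

```python
NO_DERATE_MAX_C = 30.0  # from the text: at or below 30 degC, no derate
ABSOLUTE_MAX_C = 35.0   # from the text: absolute ceiling


def assess_intake(intake_c, room_c, margin_c=1.0):
    """Return a list of findings for one intake reading.

    margin_c is an ASSUMED allowance for measurement noise; the text
    only says "warmer than ambient".
    """
    findings = []
    if intake_c > ABSOLUTE_MAX_C:
        findings.append("over_ceiling")
    elif intake_c > NO_DERATE_MAX_C:
        findings.append("derate_zone")
    else:
        findings.append("ok")
    # Intake warmer than the room means exhaust is being re-inhaled.
    if intake_c - room_c > margin_c:
        findings.append("recirculation_suspected")
    return findings
```

For example, `assess_intake(33, 27)` reports both the derate zone and suspected recirculation, which is the case where moving or ducting the miner comes before any other fix.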
Check for visible dust on the intake grille and heatsink fin packs. A 2 mm felt-like dust layer on fins can add 8-10 degC to chip temp at nameplate load and is the single most common Tier-1 fix on T21s older than 6 months. If dust is visible, schedule the Tier-2 compressed-air cleanout. If grille is clear but miner still trips, suspect airflow restriction further downstream — shrouds, ducting, rack obstructions.
Confirm all four fans are spinning during startup by watching through the exhaust grille. A dead or partially-dead fan reduces CFM by 25% per missing fan — enough to push a marginal T21 into thermal trip under sustained load. If any fan is visibly stopped, spinning slowly, or making grinding or ticking noises, proceed to the Tier-2 fan inspection and replacement step.
Reboot and watch the first 10 minutes of hashing with the miner status page open. Record exactly which chain (0, 1, or 2) reports the error first, and whether the failing value is plausible (95-105 degC) or the 255 sentinel. Plausible values point to thermal / paste / airflow issues. The 255 value points to a sensor or PIC or I2C fault — a very different repair path (Tier 3-4). This single observation determines your fix tree.
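If you are reading kern.log rather than the status page, the same triage can be scripted. A minimal sketch, assuming each entry carries a `chain N` prefix before the `over max temp, pcb temp X chip temp Y` text; that prefix and the returned labels are hypothetical, so adjust the pattern to whatever your firmware actually writes:

```python
import re

SENTINEL = 255  # failed-read marker described in the text

# ASSUMED line shape: "... chain 1 ... over max temp, pcb temp 83 chip temp 255"
LINE_RE = re.compile(
    r"chain\s*(?P<chain>[0-2]).*?"
    r"over max temp, pcb temp (?P<pcb>\d+) chip temp (?P<chip>\d+)"
)


def classify(line):
    """Return (chain, repair_path) for an over-temp line, else None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    pcb, chip = int(m["pcb"]), int(m["chip"])
    if SENTINEL in (pcb, chip):
        path = "sensor_pic_i2c"          # Tier 3-4 path in the text
    else:
        path = "thermal_paste_airflow"   # plausible reading
    return int(m["chain"]), path


def first_failing_chain(lines):
    """Return the classification of the first over-temp entry, or None."""
    for line in lines:
        result = classify(line)
        if result is not None:
            return result
    return None
```

Only the first matching entry matters here, since the step asks which chain reports the error first.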
Deep-clean the miner with compressed air. Power off, disconnect from mains, move to a ventilated area. Hold each fan stationary with a zip tie or finger so it cannot over-spin in the air stream — an over-spun fan generates damaging back-EMF that can destroy the fan controller. Blow intake-to-exhaust, then reverse. Focus on heatsink fin packs — the dust cake there is the #1 thermal offender on a 12-18-month-old T21.
Replace any underperforming fan. If a fan will not hit nameplate RPM (6000-6600 at PWM=100%) or is audibly grinding or ticking, swap with an original-spec 140 mm industrial PWM fan. T21 fans are 12 V 4-pin PWM — verify polarity before plugging in; reverse polarity instantly destroys the fan controller. Re-test: a new fan on a borderline-thermal T21 is frequently the sole fix needed.
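The fan decision can be sketched from four RPM readings taken at PWM=100%. This is an illustration only: the function names are made up, and the 25%-per-fan airflow figure is the rule of thumb quoted earlier in this guide, applied here to fully stopped fans only.

```python
NAMEPLATE_MIN_RPM = 6000     # lower bound of the 6000-6600 RPM range in the text
LOSS_PER_DEAD_FAN_PCT = 25   # the guide's rule of thumb for a four-fan T21


def fans_to_replace(rpm_at_full_duty):
    """Return indexes of fans that miss nameplate RPM at PWM=100%."""
    return [i for i, rpm in enumerate(rpm_at_full_duty)
            if rpm < NAMEPLATE_MIN_RPM]


def estimated_airflow_loss_pct(rpm_at_full_duty):
    """Count only fully stopped fans (0 RPM) against the 25% figure."""
    dead = sum(1 for rpm in rpm_at_full_duty if rpm == 0)
    return dead * LOSS_PER_DEAD_FAN_PCT
```

A slow fan also costs airflow, but the guide gives no figure for that case, so the sketch does not guess one. Grinding or ticking still needs your ears.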
Measure per-chain thermal balance under load. Let the miner run 5 minutes (it will likely trip within 15). Capture chain temps just before trip. Healthy T21: all three chains within 3 degC of each other. Imbalance >5 degC = that board's thermal interface has degraded. Label the three hashboards 0/1/2 with tape for reference in later steps and for any slot-swap isolation procedure.
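The balance check can be written down as follows. A sketch under one stated assumption: the guide does not say what the >5 degC imbalance is measured against, so each chain is compared with the mean of the other two. The "watch" label for the gap between 3 and 5 degC is also an invention for illustration.

```python
HEALTHY_SPREAD_C = 3.0   # from the text: all chains within 3 degC
DEGRADED_DELTA_C = 5.0   # from the text: imbalance above 5 degC


def chain_balance(temps_c):
    """temps_c maps chain id (0, 1, 2) to its temperature just before trip.

    Returns (status, suspect_chains).
    """
    spread = max(temps_c.values()) - min(temps_c.values())
    suspects = []
    for chain, temp in temps_c.items():
        others = [t for c, t in temps_c.items() if c != chain]
        # ASSUMPTION: imbalance = chain temp minus mean of the other chains.
        if temp - sum(others) / len(others) > DEGRADED_DELTA_C:
            suspects.append(chain)
    if suspects:
        return "degraded", sorted(suspects)
    if spread <= HEALTHY_SPREAD_C:
        return "healthy", []
    return "watch", []
```

Feed it the readings you captured just before the trip; the suspect chain is the board to label for a paste refresh or a slot-swap test.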
Verify PSU output voltage at the hashboard connector under load. Set a multimeter to DC volts and probe at the PSU-to-board connector while the miner is fully hashing. Expect >=13.8 V sustained on a T21. A sagging PSU forces the hashboard to draw more current to hit its power target, producing more I²R heat on-board and eventually a thermal trip that looks like a cooling problem but is actually electrical. Swap the PSU with a known-good unit if the rail sags.
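To see why a small sag matters, here is a rough model, assuming the board holds a constant power target and that resistive losses sit in a fixed on-board resistance. Both are simplifications, and the function names are illustrative:

```python
MIN_RAIL_V = 13.8  # from the text: expected sustained voltage under load


def rail_ok(measured_v):
    """True if the measured rail meets the guide's minimum."""
    return measured_v >= MIN_RAIL_V


def relative_resistive_heat(nominal_v, sagged_v):
    """Ratio of I^2*R heat at the sagged rail versus the nominal rail.

    At constant power P, current is I = P / V, so I^2*R scales with
    1 / V^2 and both P and R cancel out of the ratio.
    """
    return (nominal_v / sagged_v) ** 2
```

Under this model a rail that sags from 13.8 V to 13.0 V produces roughly 13% more resistive heat for the same hashing load.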
Reset to stock profile and revert any overclock or undervolt. If this T21 was running a custom profile, revert to stock nameplate (190 TH/s, 3610 W) via the UI or by flashing the stock firmware. Observe 30 minutes. If the trip does not recur at stock, you were outside the chip's thermal envelope — tune back up slowly in Tier 3 with per-chip visibility enabled, not stock firmware guesswork.
Refresh thermal paste on the hot hashboard. Pull the miner from service. Remove heatsink — typically 8-12 screws on the T21 heatsink clamp. Clean old paste from every BM1362 with IPA 99% and lint-free wipes (never paper towel — fibers contaminate paste). Apply Arctic MX-6 or Thermal Grizzly Kryonaut in a uniform thin layer using the spread-with-plastic-card technique (pea-drop is unreliable on a 66-chip board). Re-torque clamp evenly in a star pattern to avoid PCB flex. Expect chip temp to drop 6-12 degC from pre-refresh baseline.
Inspect and replace thermal pads on voltage-domain ICs. The BM1362 chips get paste; the PMICs and buck converters use thermal pads between themselves and a secondary heatsink plate. Pads dry and crumble on the same 12-18-month schedule as paste. Replace with 1 mm or 2 mm pads matching original thickness exactly — thicker pad creates a gap, thinner one loses contact. Gelid Extreme or Thermalright Odyssey are proven on ASIC hashboards.
Flash DCENT_OS for per-chip thermal visibility. DCENT_OS is D-Central's own open-source Antminer firmware (landing: d-central.tech/dcent-os, source: github.com/DCentralTech/DCENT_OS). It exposes per-chip temperature, per-chip HW%, and autotuning on the T21. Stock Bitmain firmware shows chain-level only; DCENT_OS shows which of the 198 BM1362 chips is running hot. This is the single largest diagnostic upgrade available on a T21. Alternatives: Braiins OS+, LuxOS, Vnish 1.2.x+.
Identify and disable a failing chip position if one chip is the statistical outlier. DCENT_OS and Braiins OS+ both support per-chip disable. Losing one BM1362 costs about 1 TH/s (190 TH/s spread across 198 chips), which is cheap compared to replacing a $550 hashboard. This is a legitimate buy-yourself-6-months fix while you queue a proper repair. Re-monitor after disabling; if a second chip starts trending hot within days, the board has systemic drift: stop masking and book Tier 4.
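One way to pick out the outlier from a per-chip temperature dump is to compare each chip against the chain median. This is a sketch, not how DCENT_OS or Braiins OS+ decide anything: the 8 degC threshold is borrowed from the chip-to-chip delta listed under Symptoms, and the action labels are made up.

```python
from statistics import median


def hot_outliers(chip_temps_c, threshold_c=8.0):
    """Return indexes of chips running more than threshold_c above the median.

    threshold_c is an ASSUMED cut-off reusing the guide's 8 degC figure.
    """
    mid = median(chip_temps_c)
    return [i for i, t in enumerate(chip_temps_c) if t - mid > threshold_c]


def next_action(outliers):
    """Map the outlier count to the guide's advice."""
    if not outliers:
        return "no_mask_needed"
    if len(outliers) == 1:
        return "disable_chip"
    return "book_tier_4"  # more than one hot chip suggests systemic drift
```

The median is used instead of the mean so that the hot chip itself does not drag the baseline upward.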
Underclock / undervolt to stock-minus. If the T21 ran hot even on fresh paste and custom firmware shows all chips healthy, you are seeing accumulated silicon stress. Run at -5% frequency and -2% voltage from stock for 48 hours. You lose approximately 10 TH/s; you drop chip temps 5-8 degC and the miner stops tripping. This is the home-miner longevity play: trade a little hashrate for years of life. Canadian garage operators and dual-purpose space-heater users benefit disproportionately from this pattern.
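The trade-off can be estimated before you commit to it. A back-of-envelope sketch, assuming hashrate scales linearly with frequency and dynamic power follows the first-order CMOS relation (proportional to frequency times voltage squared); real boards also have static and PSU losses that this ignores:

```python
def hashrate_after_underclock(nameplate_ths, freq_cut_pct):
    """Hashrate after a frequency cut, ASSUMING linear scaling."""
    return nameplate_ths * (1 - freq_cut_pct / 100)


def dynamic_power_scale(freq_cut_pct, volt_cut_pct):
    """Fraction of dynamic power remaining, using P ~ f * V^2."""
    f = 1 - freq_cut_pct / 100
    v = 1 - volt_cut_pct / 100
    return f * v * v
```

With the 190 TH/s nameplate, a 5% frequency cut gives 180.5 TH/s, a 9.5 TH/s loss in line with the figure above. The same cut with 2% less voltage leaves about 91% of dynamic power, which is where the lower chip temps come from.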
Stop DIY when the temperature sensor on a hashboard is confirmed dead (board reports 255 sentinel consistently, fault follows board across slot swaps), or when BM1362 chips need replacement, or when PMIC damage is visible, or when the PIC has crashed and will not recover after stock firmware reflash. These are test-fixture and soldering-bench jobs. Book a D-Central ASIC Repair slot at d-central.tech/services/asic-repair — turnaround 5-10 business days Canada / US / international.
D-Central bench process for a T21 thermal fault: test fixture with programmable load, per-chip isolation using Bitmain's T21 test binaries, sensor IC replacement where the sensor is the fault, BM1362 reflow or replacement where silicon is the fault, full thermal-paste-plus-pad refresh as standard, 24-hour burn-in at nameplate before ship-back. Component-level repair preserves the $550 hashboard instead of scrapping the whole board over a $3 sensor IC.
Ship hashboards safely if you are sending only boards (saves shipping cost vs full miner). Remove boards, place each in an anti-static bag, double-box with at least 5 cm of foam on every face. Do NOT ship with heatsink clamped — transit vibration can fracture BGA joints on BM1362 chips under clamp pressure. Include note with observed symptoms, kern.log excerpts, firmware version, your contact. Saves diagnostic time at the bench, saves you repair cost.
When to Seek Professional Repair
If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.
Still Having Issues?
Our team of Bitcoin Mining Hackers has been repairing ASIC miners since 2016. We have seen it all and fixed it all. Get a professional diagnosis.
