Avalon 1246 – Temperature Too High
Critical — Immediate action required
Symptoms
- Miner boots, hashes for 15-90 minutes, then hard-shuts down (not a graceful exit)
- Canaan web UI SYSTEMSTATU or MM_STATUS shows temperature warning or elevated MTavg/MTmax on one or more hashboards
- CGMiner API (TCP 4028) stats output shows PVT_T with one or more chips above 85 C or a board average above 68 C
- AUC3 controller red LED sustained (not blinking)
- Hashrate drops 20-40% gradually over 30 minutes before shutdown as firmware throttles frequency
- Fans ramp to 100% duty cycle and stay there; sound level climbs 8-12 dB above normal
- kern.log or miner log shows repeated MM_THERMAL_WARN, TEMP_OVER, or Toohot messages
- Rear exhaust air is noticeably hotter than normal - 'pull your hand back' hot at 15 cm
- Unit refuses to boot back up for several minutes after shutdown (thermal interlock holding)
- Per-chip PVT_T spread wider than 10 C left-to-right on the same hashboard
- One 1246 in a row of identical units is the only one tripping - points at unit-specific issue
- CODE_MMCRCFAILED IIC CRC events in the log correlated with phantom thermal readings
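The temperature thresholds in the symptom list can be checked mechanically. The following is a minimal Python sketch, not Canaan tooling. It assumes the stats reply contains one per-board field shaped like PVT_T0[ 60 62 61 ] for a single module, which may differ between firmware versions. The function names query_stats_text, parse_pvt_t, and check_board are hypothetical. The request itself uses the standard CGMiner JSON command form.

```python
import json
import re
import socket

CHIP_MAX_C = 85       # per-chip limit from the symptom list
BOARD_AVG_MAX_C = 68  # board-average limit from the symptom list
SPREAD_MAX_C = 10     # maximum chip-to-chip spread on one hashboard

# Assumed field layout: PVT_T<board>[ t0 t1 t2 ... ]
PVT_T_FIELD = re.compile(r"PVT_T(\d+)\[([^\]]*)\]")


def query_stats_text(host, port=4028, timeout=5.0):
    """Send the CGMiner 'stats' command and return the raw reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(json.dumps({"command": "stats"}).encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # CGMiner ends its reply with a NUL byte; strip it before decoding.
    return b"".join(chunks).rstrip(b"\x00").decode("utf-8", errors="replace")


def parse_pvt_t(stats_text):
    """Map board index to its list of per-chip temperatures (assumed layout)."""
    return {
        int(board): [int(t) for t in temps.split()]
        for board, temps in PVT_T_FIELD.findall(stats_text)
    }


def check_board(chip_temps):
    """Return the symptom flags raised by one hashboard's chip temperatures."""
    if not chip_temps:
        return ["no_readings"]
    flags = []
    if max(chip_temps) > CHIP_MAX_C:
        flags.append("chip_over_85C")
    if sum(chip_temps) / len(chip_temps) > BOARD_AVG_MAX_C:
        flags.append("board_avg_over_68C")
    if max(chip_temps) - min(chip_temps) > SPREAD_MAX_C:
        flags.append("spread_over_10C")
    return flags
```

Typical use would be to call parse_pvt_t(query_stats_text("192.168.1.50")) with your miner's address (the address here is a placeholder) and run check_board on each board's list. If the firmware reports PVT_T in a different layout, only the regular expression needs to change.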
Step-by-Step Fix
Kill power at the PDU or breaker for a full 10 minutes. Not a UI reboot - the AUC3 caches its last thermal fault state and a soft reboot can land straight back in the trip condition. A 10-minute cold-start clears cached state, lets the NTCs normalize, and gives dust a moment to settle. Resume power and watch the first 15 minutes carefully from the web UI or CGMiner API.
Confirm intake ambient with an IR thermometer at the front grille - not room-middle. Target less than or equal to 30 C with the miner running at full load. If you are above that, fix ambient first: open a window, relocate, or add a dedicated intake fan pulling cooler air. No amount of repair fixes a hot room.
Clear the front 30 cm. The 1246 is a stacked-fan unit that chokes on restricted intake. Shelves, curtains, another miner's exhaust, or a wall within 30 cm of the front grille strangles airflow. Pull the unit forward and verify nothing is pulling recirculated hot exhaust back into the intake. A zero-dollar fix that resolves a surprising share of 1246 thermal tickets.
Set DNS to 8.8.8.8 / 1.1.1.1 and verify NTP is syncing. Canaan 1246s often ship with DNS 114.114.114.114 (China-centric) baked in. DNS failures do not cause OVER_TEMP directly but can cause the firmware to hang in ways that mask thermal logs or delay throttle responses. You want accurate timestamps on your thermal logs.
Inspect the intake mesh for obstructions - paper, plastic wrap, pet hair, packing material from the last reshuffle. If you have added an aftermarket filter, vacuum it. For stock configurations just confirm nothing blocks the intake mesh. A 30-second visual.
Compressor-blow the fin stack and fans with a real compressor at 80-90 PSI - not a canister, not an air duster. Disconnect the miner from power first. Blow from the exhaust side back through the intake. Spin each fan blade by hand while blowing so the air hits the hub. Expect a visible dust cloud on the first pass; do a second pass after 30 seconds to confirm the stack is clean.
Remove and re-seat each hashboard. Power off, unplug, wait 5 minutes. Unscrew the top cover, label each slot 0/1/2 with painter's tape. Lift each board straight up - watch for the data ribbon and power connectors. Inspect connectors for dust, oxidation, bent pins. Reseat firmly and listen for the click. A data connector 0.5 mm proud of seated reports garbage thermal data and can trip OVER_TEMP from noise alone.
Swap hashboards between slots to isolate. If one board was running 8-15 C hotter than the others, physically swap it with a cold one. Boot, run 20 minutes, re-read thermal data. If the hot reading follows the board, it is a board-level issue (Tier 3). If the hot reading stays in the slot, it is a control-path / AUC3 / cable issue (Step 14, tuning the AUC3 IIC bus parameters).
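The swap test reduces to a small decision rule. A minimal Python sketch follows, assuming you have recorded a board-average temperature per slot before and after the swap. The function names and the 8 C outlier margin (the low end of the 8-15 C range above) are illustrative choices, not firmware behaviour.

```python
def hottest_slot(avg_by_slot, margin_c=8.0):
    """Return the slot at least margin_c hotter than the next hottest, else None."""
    ranked = sorted(avg_by_slot.items(), key=lambda kv: kv[1], reverse=True)
    (top_slot, top_temp), (_, next_temp) = ranked[0], ranked[1]
    return top_slot if top_temp - next_temp >= margin_c else None


def classify_swap(before, after, swapped_with):
    """Interpret a swap of the hot board into slot `swapped_with`.

    `before` and `after` map slot number to board-average temperature in C.
    """
    hot_before = hottest_slot(before)
    if hot_before is None:
        return "no clear outlier before swap"
    hot_after = hottest_slot(after)
    if hot_after == swapped_with:
        return "follows board: board-level issue"
    if hot_after == hot_before:
        return "stays in slot: control-path / AUC3 / cable issue"
    return "inconclusive: re-seat and repeat"
```

For example, if slot 1 was the outlier and its board moved to slot 2, a hot slot 2 after the swap points at the board, while a still-hot slot 1 points at the control path.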
Check fan RPM versus nameplate. Canaan 1246 stock fans run 6000-7000 RPM at full duty. If a fan is reporting under 5500 RPM at 100% duty, or audibly grinding, replace it immediately. Do not trust a marginal fan to 'probably last' - fans fail hard, not soft, and a dead fan on a summer afternoon tips you into thermal runaway in minutes.
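As a sketch of the RPM rule above, the check below flags any fan reporting under 5500 RPM while the miner is at 100% duty. The fan names and the way the readings are collected are assumptions; feed it whatever RPM values your web UI or API reports.

```python
MIN_RPM_AT_FULL_DUTY = 5500  # replacement threshold from the step above


def fans_to_replace(rpm_by_fan, duty_pct):
    """Return the fans below the RPM threshold; only meaningful at 100% duty."""
    if duty_pct < 100:
        return []  # the threshold is defined at full duty only
    return sorted(
        name for name, rpm in rpm_by_fan.items() if rpm < MIN_RPM_AT_FULL_DUTY
    )
```

The check cannot hear a grinding bearing, so a fan that passes on RPM but is audibly rough still needs replacing.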
Verify PSU output voltage under load. Multimeter on DC, probe at the PSU-to-AUC3 connector while the miner is hashing at full power. Expect roughly 12.0 V input rail and 1.20-1.32 V on the hashboard rail (Canaan spec: 1200-1320 mV). PSU sag pushes chips into a bad region of the voltage-frequency curve, which produces real extra heat. Below 11.7 V input, swap the PSU with a known-good unit.
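The voltage limits in this step can be written down as a quick pass/fail check for logged multimeter readings. This is a minimal sketch using only the figures quoted above (11.7 V minimum input, 1200-1320 mV hashboard rail); the function name and message wording are hypothetical.

```python
MIN_INPUT_V = 11.7          # below this, swap the PSU
HASHBOARD_MV_RANGE = (1200, 1320)  # Canaan spec quoted in the step above


def check_rails(input_v, hashboard_mv):
    """Return a list of findings for one pair of under-load readings."""
    findings = []
    if input_v < MIN_INPUT_V:
        findings.append("input rail below 11.7 V: swap PSU with a known-good unit")
    low, high = HASHBOARD_MV_RANGE
    if not low <= hashboard_mv <= high:
        findings.append("hashboard rail outside 1200-1320 mV")
    return findings
```

An empty list means both readings are inside the quoted limits; take the readings while the miner is hashing at full power, since an idle measurement hides sag.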
Refresh thermal paste on all hashboards. Remove each board's fin stack (Phillips or Torx), preserve or replace thermal pads on the PCH and voltage-domain ICs. Clean old paste with 99% IPA and lint-free wipes; leave zero residue on die or heatsink. Apply Arctic MX-6 or Thermal Grizzly Kryonaut - rice-grain blob per die, let mounting pressure spread it. Reassemble with even fastener torque. Highest-impact repair on an aged 1246; typically restores full thermal margin for another 18-24 months.
Replace the intake and exhaust fans. Spec replacement fans to match OEM static pressure and airflow - cheap 'equivalent' fans do not push enough air through a restricted fin stack. D-Central stocks compatible fans. Replace all fans at once, not one at a time - the survivors are almost as tired as the one that just died, and you do not want a repeat call in two months.
Inspect and replace marginal capacitors near the PMIC. Bulging electrolytics, cracked MLCCs, discoloration on solder mask near voltage-domain ICs all mean replace. Soldering-iron plus hot-air job. Use lead-free solder matching existing joints (most 1246 boards are RoHS). Match cap capacitance and voltage rating; do not substitute a lower-voltage cap even if it looks similar.
Tune AUC3 IIC bus parameters. Edit miner config to --avalon7-aucspeed 200000 (down from 400000 default) and leave --avalon7-aucxdelay 19200. A slower bus is more tolerant of marginal cables and shielding issues. Verify CODE_MMCRCFAILED events disappear from the log; if they persist, the AUC3 or the USB cable between AUC3 and control board is the root cause.
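To verify that the bus change actually removed the CRC events, count the relevant messages in a log capture taken before the change and another taken after. The sketch below is a minimal Python example. The message tokens come from this guide, while the log line layout and the function names are assumptions.

```python
from collections import Counter

# Message tokens named in this guide.
TOKENS = ("CODE_MMCRCFAILED", "MM_THERMAL_WARN", "TEMP_OVER", "Toohot")


def count_events(lines):
    """Count how many log lines contain each token."""
    counts = Counter({token: 0 for token in TOKENS})
    for line in lines:
        for token in TOKENS:
            if token in line:
                counts[token] += 1
    return counts


def crc_cleared(before_lines, after_lines):
    """True if CRC events were present before the change and absent after it."""
    before = count_events(before_lines)["CODE_MMCRCFAILED"]
    after = count_events(after_lines)["CODE_MMCRCFAILED"]
    return before > 0 and after == 0
```

Compare captures of similar length, for example 20 minutes each. If thermal warnings disappear together with the CRC events, that supports the phantom-reading explanation rather than a real overheat.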
Replace the AUC3 controller if Step 14 (tuning the IIC bus parameters) does not clear the IIC errors. A failed or marginal AUC3 reads thermal data incorrectly and trips phantom OVER_TEMP even when the miner is running cool. Replacement AUC3s are available from Canaan, D-Central, and the secondary market. Keep the old unit - useful as a bench test tool for diagnosing other miners.
Stop DIY when: a PMIC or voltage-domain IC shows visible heat damage or a measurable short; OVER_TEMP persists after the paste refresh, fan replacement, and AUC3 replacement; a thermal camera isolates a single-chip hot spot; or you smell a burnt component. At this point you need a test fixture, chip-level tools, and parts that are not in a home workshop. Book a D-Central ASIC Repair slot.
D-Central bench process on a 1246 OVER_TEMP case: test-fixture boot with programmable load, chip-by-chip thermal isolation via PVT_T array extraction at the API layer, NTC and PMIC verification with reference fixtures, chip replacement using A3206 stock from graded inventory, full reflow where the failure mode supports it, 24-hour burn-in at nameplate before the miner ships back. Canadian turnaround, Canadian labour, Canadian accountability.
Ship safely. Hashboards in anti-static bags, double-boxed with at least 5 cm foam on every side. Include a physical note inside the box: firmware version, observed symptoms with timestamps, which Tier-1 and Tier-2 steps you already tried, and your contact info. Every minute our bench techs spend reconstructing the fault history is a minute added to your repair bill.
When to Seek Professional Repair
If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.
Still Having Issues?
Our team of Bitcoin Mining Hackers has been repairing ASIC miners since 2016. We have seen it all and fixed it all. Get a professional diagnosis.
