Avalon 1366 – Temperature Too High
Informational — Monitor and address as needed
Symptoms
- Web dashboard shows Miner Status: Fault with OVER_TEMP, OT1, OT2, or OT3 in the status line
- kern.log or /var/log/cgminer.log contains over_temp, PS err 2, PS err 4, or PS err 8 entries
- Front-panel LED flips from solid green to sustained red, fans ramp to 100 percent briefly, then miner shuts down
- cgminer-api estats returns PVT_T[] array with at least one entry at 95 C or higher on the affected chain
- MM_STATUS field reads WORK_MODE FAULT and SYSTEMSTATU shows at least one chain in Error state
- Hashrate drops to 0 TH/s and stays there until you power-cycle or clear the fault from the dashboard
- Event recurs within minutes of a restart in warm ambient, or within hours during an overnight run
- Intake temperature at the front grille reads above 30 C (above 35 C is past Canaan's absolute maximum)
- Heatsinks are visibly dust-coated or you cannot see daylight through the fin pack
- At least one of the four HA1250H12SB-Z 120 mm fans reports below 3000 RPM at full tach in the dashboard
- Fault recurs at a predictable time of day (late afternoon, evening peak) suggesting rising ambient or circuit-shared appliance load
- BOOTBY[0x10.<addr>] overheat reboot code in kernel log after miner sat in Fault then cycled
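The PVT_T symptom above can be checked with a script instead of by eye. The following is a minimal Python sketch, assuming the estats text exposes one array per chain in the form `PVT_T0[ 71 72 ... ]` with whitespace-separated Celsius readings. That layout, the regex, and the `hot_chips` name are illustrative assumptions, not Canaan's documented format, so adjust the pattern to what your firmware build actually prints.

```python
import re

OT_THRESHOLD_C = 95  # threshold taken from the symptom list above

# Assumed layout: PVT_T<chain>[ v v v ... ]; adjust to match your firmware's output.
PVT_T_PATTERN = re.compile(r"PVT_T(\d+)\[([^\]]*)\]")


def hot_chips(estats_text, threshold=OT_THRESHOLD_C):
    """Return {chain: [(chip_index, temp_c), ...]} for readings at or above threshold."""
    result = {}
    for chain, body in PVT_T_PATTERN.findall(estats_text):
        # Pull out every integer reading, including negative ones.
        temps = [int(v) for v in re.findall(r"-?\d+", body)]
        hot = [(i, t) for i, t in enumerate(temps) if t >= threshold]
        if hot:
            result[int(chain)] = hot
    return result


# Made-up sample text for illustration only, not real miner output.
sample = "PVT_T0[ 71 72 96 70] PVT_T1[ 68 69 70 71] PVT_T2[ 95 80 81 82]"
print(hot_chips(sample))
```

The chip index in each result is the zero-based position in the array, which is enough to tell whether the same position keeps showing up across snapshots.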
Step-by-Step Fix
Pull the power cord at the PDU, wait a full 60 seconds for the bulk capacitors to drain, then restart. A cold boot clears any wedged MM3v2 firmware state around the OVER_TEMP fault flag. Canaan's build occasionally holds the flag across a soft reboot, so a dashboard restart is not a substitute. This step alone clears roughly one in five events on a 1366 that saw a transient thermal spike (neighbour's appliance on the same HVAC circuit, brief intake-air warm-up, afternoon sun on an uninsulated garage wall). If the miner boots clean and runs an hour without re-tripping, you caught a transient. If it re-trips within minutes, escalate to the next step.
Vacuum the front intake grille and rear exhaust vent with a shop-vac using the soft-brush attachment. Pay particular attention to the four HA1250H12SB-Z fan blades and the heatsink fin packs visible through the grille. Dust-packed fins can raise heatsink-to-ambient delta by 8-15 C, which is the entire margin between running fine and OT on a warm day. Use the soft-brush, not a hard plastic nozzle - you're moving dust, not polishing chassis steel. On a 1366 that has never been cleaned, expect a visible difference in exhaust-air temperature within 10 minutes.
Measure intake-air temperature with an IR thermometer or probe thermometer held 5 cm from the front grille during hashing. Target 30 C or below for a Canadian basement or garage install; 35 C is the Canaan absolute maximum per spec. If intake reads above 30 C, open a window, crack the garage door, or move the miner's intake to a cooler corner before touching the miner. You cannot fix an ambient-envelope problem at the miner.
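If you log intake readings over a day, the two thresholds in this step sort into three bands. The following is a small Python sketch of that classification. The function name and the band labels are hypothetical, and only the 30 C and 35 C figures come from the step above.

```python
TARGET_MAX_C = 30.0    # target ceiling from the step above
ABSOLUTE_MAX_C = 35.0  # Canaan absolute maximum quoted in the step above


def classify_intake(temp_c):
    """Sort one intake-air reading into ok / warm / over-spec."""
    if temp_c <= TARGET_MAX_C:
        return "ok"
    if temp_c <= ABSOLUTE_MAX_C:
        return "warm"       # fix the room before touching the miner
    return "over-spec"      # past the absolute maximum


# Hypothetical readings taken 5 cm from the front grille while hashing.
for reading in (24.5, 32.0, 36.5):
    print(reading, classify_intake(reading))
```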
Confirm physical clearance: at least 30 cm in front of the intake, at least 15 cm behind the exhaust. A 1366 on a shelf with no breathing room recirculates its own exhaust - which is exactly how a miner that ran fine last week starts tripping OT this week after someone stacked a box in front of it. Pull everything back from the miner. Clearance is free; use it. Restart and confirm.
If you run multiple miners in a shed, garage, or basement rack, reconfirm airflow direction. Miners aimed at each other's intakes cook each other. Correct 1366 layout is cold intake from the front on a dedicated cold-air plenum, exhaust to outside or into a heat-utilisation duct for home heating. Recirculating the miner's own exhaust into its intake is the fastest way to hit OT in any install, but particularly in a heated basement from November through April.
Open the chassis, remove each of the four HA1250H12SB-Z fans in turn, and spin them by hand. A healthy dual-ball-bearing fan spins freely for 2-3 seconds after a flick. A failing fan grinds, buzzes, or stops immediately. Replace any fan that fails this test with a drop-in HA1250H12SB-Z. The 1366 uses the same fan as the 1346, 1446, 1466, and 1566, so one spare SKU covers the whole A13/A14/A15 generation. Parts are available from Zeus Mining, from bit2miner (cross-reference DF1205012B2), or through a D-Central parts order.
Measure the 12 V fan rail at the PSU-to-control-board connector under load. Expect 11.8 V to 12.2 V sustained. Below 11.6 V means a tired PSU sagging the fan rail - the fans slow down, the chips get hot, the miner trips OT, and you chase a thermal fault that is actually a power fault. Swap to a known-good PSU from the 1346/1366/1446/1466/1566 family before assuming the miner is the problem. Note: 1366 PSUs are NOT cross-compatible with 1166 Pro or 1246 PSUs - use the correct generation.
Re-torque the heatsink mounting clips on each of the three hashboards. Canaan's 1366 heatsink clip loosens over 12-18 months of thermal cycling. A heatsink with even 0.1 mm of lift loses paste contact and the chip underneath cooks while its neighbours are fine. Remove the heatsink, wipe off the old paste with 99 percent IPA, apply a thin uniform layer of Arctic MX-6 or Thermal Grizzly Kryonaut, reseat with even clip pressure. Do one hashboard at a time so you do not mix clips or hardware between boards.
Install a foam pre-filter on the intake if your install does not already have one. D-Central strongly recommends a simple foam pre-filter on every 1366 install - it catches dust before it hits the heatsinks, and you can vacuum or wash it in seconds instead of opening the chassis. Without a pre-filter you rely on the heatsink fins themselves to catch debris, which is the failure mode the vacuuming step above fixes.
Verify line voltage at the outlet under load. On 240 V split-phase (the correct feed for a 1366's 3420 W draw) expect 235-245 V. On 208 V commercial expect 202-212 V. Low line voltage forces the PSU to draw more current, which heats the PSU internals, which can trip the PSU thermal sensor independently of the hashboard sensors. If line voltage is low, fix the feed before continuing - that is an electrician or panel problem before it is a miner problem.
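The link between low line voltage and higher current follows from I = P / V. The following is a rough Python sketch, assuming the 3420 W figure is the draw at the wall and ignoring power factor and any change in PSU efficiency with voltage. The function names are hypothetical, and the voltage windows are the ones quoted in the step above.

```python
RATED_DRAW_W = 3420  # wall draw quoted in the step above

# Expected under-load voltage windows from the step above, keyed by nominal feed.
EXPECTED_RANGE_V = {240: (235, 245), 208: (202, 212)}


def line_current_a(voltage_v, power_w=RATED_DRAW_W):
    """Approximate line current in amps for a constant-power load."""
    return power_w / voltage_v


def voltage_in_range(nominal_v, measured_v):
    """True if the measured voltage sits inside the expected window."""
    low, high = EXPECTED_RANGE_V[nominal_v]
    return low <= measured_v <= high


# Same load, falling voltage: the current rises as the voltage sags.
for v in (240, 225, 208):
    print(f"{v} V -> {line_current_a(v):.2f} A")
```

At a nominal 240 V the sketch gives 14.25 A. Every volt of sag below that pushes the figure up, which is the extra heating the step describes.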
Snapshot cgminer-api estats output to a text file before any further changes. Run `echo -n '{"command":"estats"}' | nc 127.0.0.1 4028` from the miner or a network-connected laptop. Save as 1366-pre-fix.txt. If the miner ends up shipping to D-Central, that pre-fix snapshot is exactly what the bench tech wants to see. It saves diagnostic time, which saves you repair dollars.
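If `nc` is not available on the laptop, the same snapshot can be taken from Python. The following is a minimal sketch, assuming the miner exposes the usual cgminer-style JSON API on TCP port 4028 and closes the connection after replying. The function names are hypothetical, and stripping a trailing NUL byte is based on how cgminer-style APIs commonly terminate replies, so verify against your own output.

```python
import json
import socket


def clean_reply(raw):
    """Strip any trailing NUL byte and decode the reply to text."""
    return raw.rstrip(b"\x00").decode("utf-8", errors="replace")


def fetch_estats(host="127.0.0.1", port=4028, timeout=10.0):
    """Send the estats command and return the raw reply as text."""
    request = json.dumps({"command": "estats"}).encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while True:
            data = sock.recv(4096)
            if not data:  # peer closed the connection: reply is complete
                break
            chunks.append(data)
    return clean_reply(b"".join(chunks))


def save_snapshot(text, path="1366-pre-fix.txt"):
    """Write the snapshot to disk under the filename used in the step above."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)
```

From a laptop, call `save_snapshot(fetch_estats("MINER_IP"))`, replacing the placeholder `MINER_IP` with the miner's address on your network.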
Remove a hashboard, strip the heatsinks, and reapply thermal paste on every A3206 ASIC on the chain that dominated your PVT_T outlier list. Use Arctic MX-6 or Thermal Grizzly Kryonaut in a thin uniform layer - the grain-of-rice heuristic is for CPUs, not a 120-chip hashboard; you want a uniform film across each chip top. Replace any thermal pad under the PCH or voltage-domain ICs if pads are crumbled, discoloured, or compressed. Budget one hour per hashboard the first time, 30 minutes once practised.
If one specific chip position is the outlier and paste refresh did not resolve it, reflow that single chip. Preheat the hashboard from below at 150 C for 3 minutes, hot-air from above at 310-330 C for approximately 30 seconds with flux around the chip periphery, natural cool-down. The A3206 BGA tolerates a single reflow cycle reliably. A second reflow on the same chip within 90 days rarely sticks - at that point the chip is dying and needs replacement.
Flash a known-good MM3v2 firmware build from Canaan's firmware portal at avalonminer.org/firmware-document/. The portal is login-gated and release notes are sparse, but it is the only official source for MM3v2 builds. Verify your hardware revision against the firmware compatibility table before flashing - the wrong MM3v2 for a late-rev 1366 can brick the control board. Flash via the dashboard's firmware page over a stable wired connection; never flash over wireless, never flash while hashing.
If the fault cleared after firmware update but returned within a week, downgrade to the previous stable build and note the revision numbers. A regression in Canaan's OT threshold logic between MM3v2 builds is a real failure mode that user communities have documented on bitcointalk and r/ASICminer but that Canaan has never publicly acknowledged. Document your working build and flash date in a text file taped to the chassis for future reference.
Reseat the control-board ribbon connectors on all three hashboards. The same IDC-style ribbons used on the 1346 MM3 show up on the 1366 MM3v2 and oxidize the same way in humid environments. Pull each ribbon fully, wipe contacts with 99 percent IPA on a lint-free wipe, reseat until the latch clicks. Oxidized ribbons can break the temperature telemetry path specifically, causing a chain to appear cool while running hot - tripping a different sensor and confusing diagnosis.
Stop DIY and ship to D-Central when any of the following is true: you have replaced every fan, cleaned every heatsink, and repasted every chip, and the miner still trips OT within 24 hours; you reflowed a chip once and OT returned within 30 days; PVT_T[] returns impossible values (negative, above 200 C, or identical across all 120 chips); the same chip position trips on two different 1366 units in your rig; or there is visible discoloration, a burnt smell, or capacitor damage anywhere on the hashboard or MM3v2. Book a D-Central ASIC Repair slot. Turnaround is 5-10 business days, with Canada-wide shipping; US and international customers are welcome.
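The "impossible values" criterion is easy to test mechanically once the PVT_T readings for a chain are in a list. The following is a minimal Python sketch of that one check. The function name is hypothetical, and returning False for an empty list is a choice made for this sketch rather than anything from Canaan.

```python
def pvt_readings_implausible(temps, max_plausible_c=200):
    """True if readings match any 'impossible values' criterion from the step above."""
    if not temps:
        return False  # nothing to judge; a choice made for this sketch
    # Negative readings or readings above 200 C cannot be real chip temperatures.
    if any(t < 0 or t > max_plausible_c for t in temps):
        return True
    # Identical values across every chip suggest a stuck telemetry path.
    return len(temps) > 1 and len(set(temps)) == 1


# Made-up readings for illustration only.
print(pvt_readings_implausible([70, 71, 72]))  # plausible spread
print(pvt_readings_implausible([65] * 120))    # every chip identical
```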
Pack for shipping: anti-static bags on each hashboard and the MM3v2 separately, double-box with at least 5 cm of foam on every side, include a printed diagnostic note with the kern.log excerpt (actual OT lines), MM3v2 firmware build, PSU model, measured line voltage, intake ambient, fan tach readings before failure, and every Tier 1-3 step you have already run. The bench tech starts from your notes - better notes, faster and cheaper repair. Never ship hashboards loose or in single-box packaging; A3206 BGAs do not survive careless handling.
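One way to make sure the printed note covers every item listed above is to fill in a fixed template. The following Python sketch is one possible layout. The `DiagnosticNote` class, its field names, and the output format are all hypothetical, and the example values are placeholders to replace with your own measurements.

```python
from dataclasses import dataclass, field


@dataclass
class DiagnosticNote:
    kern_log_excerpt: str      # the actual OT lines from kern.log
    firmware_build: str        # MM3v2 firmware build
    psu_model: str
    line_voltage_v: float      # measured at the outlet under load
    intake_ambient_c: float
    fan_tach_rpm: list = field(default_factory=list)  # readings before failure
    steps_run: list = field(default_factory=list)     # steps already tried

    def render(self):
        """Return the note as plain text, ready to print."""
        lines = [
            "Avalon 1366 OT diagnostic note",
            f"MM3v2 firmware build: {self.firmware_build}",
            f"PSU model: {self.psu_model}",
            f"Measured line voltage: {self.line_voltage_v} V",
            f"Intake ambient: {self.intake_ambient_c} C",
            "Fan tach before failure (RPM): "
            + ", ".join(str(r) for r in self.fan_tach_rpm),
            "Steps already run:",
        ]
        lines += [f"  - {step}" for step in self.steps_run]
        lines += ["kern.log excerpt:", self.kern_log_excerpt]
        return "\n".join(lines)


# Placeholder values only; substitute what you actually measured.
note = DiagnosticNote(
    kern_log_excerpt="PASTE OT LINES HERE",
    firmware_build="BUILD PLACEHOLDER",
    psu_model="PSU PLACEHOLDER",
    line_voltage_v=238.0,
    intake_ambient_c=27.5,
    fan_tach_rpm=[0, 0, 0, 0],
    steps_run=["cold boot", "vacuumed intake and exhaust"],
)
print(note.render())
```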
When to Seek Professional Repair
If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.
Still Having Issues?
Our team of Bitcoin Mining Hackers has been repairing ASIC miners since 2016. We have seen it all and fixed it all. Get a professional diagnosis.
