OVER_TEMP / OT1 / OT2 / OT3 (PS err bits 2/4/8) Info

Avalon 1466 – Temperature Too High

AvalonMiner A1466 trips protective thermal shutdown: MM3v2 firmware logs OVER_TEMP, PSU cuts hashboard rail, dashboard flips to Fault. Triggered when PVT_T exceeds chip or PSU thermal ceiling (approx 95-105 C chip junction, 75-85 C PSU) on an A3207-class A14-generation hashboard.

Informational — Monitor and address as needed

Affected Models: AvalonMiner A1466 - approx 150 TH/s nameplate, A3207-class ASIC family (A14 generation), MM3v2 control board, four DF1205012B2 (Martech) / cross-compatible HA1250H12SB-Z 120 mm fans, approx 3230-3420 W wall draw at stock. Shares fan, PSU family, and MM3v2 platform with Avalon 1346, 1366, 1446, and 1566 - many Tier 2/3 fixes apply across that generation.

Symptoms

  • Web dashboard shows Miner Status: Fault with OVER_TEMP, OT1, OT2, or OT3 in the status line
  • kern.log or /var/log/cgminer.log contains over_temp, PS err 2, PS err 4, or PS err 8 entries
  • Front-panel LED flips from solid green to sustained red, fans ramp to 100% briefly, then the miner drops compute
  • cgminer-api estats returns PVT_T[] with at least one entry at 95 C or higher on the affected chain
  • MM_STATUS field reads WORK_MODE FAULT and SYSTEMSTATU shows at least one chain in Error state
  • Hashrate drops to 0 TH/s until you cold-boot at the PDU or clear the fault from the dashboard
  • Event recurs within minutes of a restart when ambient is warm, or within hours during an overnight run
  • Intake temperature at the front grille reads above 30 C (above 35 C is past Canaan's absolute maximum)
  • Heatsinks visibly dust-coated or you cannot see daylight through the fin pack
  • At least one of the four DF1205012B2 / HA1250H12SB-Z 120 mm fans reports below 3000 RPM at full tach
  • BOOTBY[0x10] reboot-cause code shows up in the prior-boot log (overheat reboot on MM3/MM3v2)
  • Fault recurs at a predictable time of day (e.g. late afternoon) suggesting rising ambient or shared-circuit appliance load
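
The PVT_T check in the list above can be scripted against a saved estats capture. This is a minimal sketch, not Canaan tooling. It assumes the capture carries per-chain fields shaped like `PVT_T0[ 71 74 96 ]`; that field layout, the `hot_entries` helper, and the sample string are illustrative assumptions, so compare them against your own output before trusting the result.

```python
import re

# Assumption: per-chain temperature fields look like "PVT_T0[ 71 74 96 ]".
# Adjust the pattern if your firmware formats the field differently.
PVT_FIELD = re.compile(r"(PVT_T\d*)\[([^\]]*)\]")


def hot_entries(estats_text, limit_c=95):
    """Return {field_name: [(position, temp_c), ...]} for readings >= limit_c."""
    flagged = {}
    for name, body in PVT_FIELD.findall(estats_text):
        # Keep only integer tokens; anything else in the field is skipped.
        temps = [int(tok) for tok in body.split() if tok.lstrip("-").isdigit()]
        hot = [(pos, t) for pos, t in enumerate(temps) if t >= limit_c]
        if hot:
            flagged[name] = hot
    return flagged


# Made-up sample data, not a real capture.
sample = "PVT_T0[ 71 74 96 ] PVT_T1[ 70 69 72 ] PVT_T2[ 88 101 90 ]"
print(hot_entries(sample))
```

The 95 C default mirrors the symptom threshold quoted above. Positions are zero-based indexes into the field, not board silkscreen labels.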

Step-by-Step Fix

1

Pull the power cord at the PDU, wait 60 seconds for bulk capacitors to drain, then restart. A full cold-boot clears any wedged MM3v2 state around the OVER_TEMP trip flag - Canaan's firmware occasionally holds the fault across a soft reboot. Count to 60; do not just hit the dashboard restart. This alone clears roughly one in five A1466 events that saw a transient thermal spike (warm intake air for ten minutes, an appliance on a shared HVAC circuit, an afternoon heat soak). If the miner runs an hour without re-tripping, you caught a transient and you are done. If it re-trips within minutes, escalate.
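
To tell a one-off transient from a daily pattern such as afternoon heat soak, it can help to bucket the times of past faults by hour of day. The sketch below takes timestamps you have already copied out of the log. The `peak_hour` helper and its 50% cut-off are illustrative assumptions, not a documented rule.

```python
from collections import Counter
from datetime import datetime


def faults_by_hour(timestamps):
    """Count fault events per hour of day (0-23)."""
    return Counter(ts.hour for ts in timestamps)


def peak_hour(timestamps, min_share=0.5):
    """Return the hour holding at least min_share of all faults, else None."""
    counts = faults_by_hour(timestamps)
    if not counts:
        return None
    hour, n = counts.most_common(1)[0]
    return hour if n / len(timestamps) >= min_share else None


# Hypothetical fault times, entered by hand.
events = [
    datetime(2024, 7, 1, 16, 12),
    datetime(2024, 7, 2, 16, 40),
    datetime(2024, 7, 3, 3, 5),
    datetime(2024, 7, 4, 16, 55),
]
print(peak_hour(events))
```

A clear peak hour points at the room (rising ambient, shared-circuit load) rather than at the miner. No peak suggests looking at the hardware steps further down.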

2

Vacuum the front intake grille and rear exhaust vent with a shop-vac using the soft-brush attachment. Pay attention to the four fan blades and heatsink fin packs visible through the grille. Dust-packed fins can raise heatsink-to-ambient delta by 8-15 C, which is the entire margin between running fine and OT on a warm day. No hard nozzle; you are moving dust, not polishing chassis steel. Restart and run for one hour. Clean intake alone resolves a large fraction of A1466 OT events, particularly in basement or garage installs.

3

Measure intake-air temperature with an IR thermometer held 5 cm from the front grille during hashing. Target 30 C or below for a Canadian basement or garage; 35 C is Canaan's absolute maximum per spec. If intake reads above 30 C, open a window, crack the garage door, or move the miner's intake to a cooler corner before doing anything more invasive. You cannot fix an ambient-envelope problem at the miner - the miner is just reporting what the room is doing to it.
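
If you log intake readings over a day, the two thresholds in this step reduce to a small classifier. A minimal sketch; the function name and the "ok" / "warm" / "over-max" labels are assumptions chosen for illustration.

```python
def classify_intake(temp_c, target_c=30.0, max_c=35.0):
    """Classify an intake-air reading against the 30 C target and 35 C maximum."""
    if temp_c <= target_c:
        return "ok"
    if temp_c <= max_c:
        return "warm"      # above target: fix the room before touching the miner
    return "over-max"      # past the stated absolute maximum


# Hypothetical readings taken through one afternoon.
for reading in (27.5, 31.0, 36.2):
    print(reading, classify_intake(reading))
```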

4

Confirm physical clearance: at least 30 cm in front of the intake, at least 15 cm behind the exhaust. An A1466 piled on a shelf with no breathing room recirculates its own exhaust - which is how a miner that ran fine last week starts tripping OT this week after someone stacked a box in front of it. Pull everything back from the miner. Clearance is free. Restart and confirm the fault does not return within an hour.

5

In a multi-miner shed or garage, reconfirm airflow direction. Miners pointed at each other's intakes cook each other. Correct A1466 layout is cold intake from the front on a dedicated cold-air plenum, exhaust to outside or into a heat-utilisation duct for home heating. Recirculating the miner's own exhaust into its intake is the fastest way to hit OT in any install, particularly in a heated basement from November through April - and it wastes the dual-purpose heating play.

6

Open the chassis, remove each of the four A1466 fans in turn, and spin each by hand. A healthy dual-ball-bearing fan spins freely for 2-3 seconds after a flick. A failing fan grinds, buzzes, or stops immediately. Replace any fan that fails this test with a DF1205012B2 (current Martech revision used across A1326/A1346/A1366/A1446/A1466/A15) - cross-compatible with the older HA1250H12SB-Z on mounting and electrical. One spare SKU covers the entire A13/A14/A15 generation. Parts available from bit2miner, Zeus, or a D-Central parts order.
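
Tach readings taken before you open the chassis can tell you which fan to pull first. The sketch below applies the below-3000-RPM-at-full-speed figure from the Symptoms list. The function name and the 1-to-4 fan numbering are assumptions for illustration; match the numbering to however your dashboard orders the fans.

```python
def weak_fans(rpm_readings, min_rpm=3000):
    """Return 1-based positions of fans reporting below min_rpm at full speed."""
    return [pos for pos, rpm in enumerate(rpm_readings, start=1) if rpm < min_rpm]


# Hypothetical readings for the four fans, in dashboard order.
print(weak_fans([5400, 2800, 5300, 5350]))
```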

7

Measure the 12 V fan rail at the PSU-to-MM3v2 connector under load. Expect 11.8 V to 12.2 V sustained. Below 11.6 V means a tired PSU sagging the fan rail - the fans slow, chips heat, the miner trips OT, and you chase a thermal fault that is actually a power fault. Swap to a known-good PSU from the 1346/1366/1446/1466/1566 PSU family before assuming the miner is the problem. Do NOT swap in a 1166 Pro or 1246 PSU - not cross-compatible with A14 control signaling.
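
The rail figures in this step can be written as a quick check for multimeter readings. A minimal sketch: the 11.8-12.2 V band and the 11.6 V floor come from the text, while the "marginal" and "high" labels for readings outside those stated ranges are assumptions.

```python
def classify_fan_rail(volts):
    """Classify a 12 V fan-rail reading taken under load."""
    if volts < 11.6:
        return "sagging"    # the text treats this as a tired PSU
    if volts < 11.8:
        return "marginal"   # assumption: between the floor and the expected band
    if volts <= 12.2:
        return "ok"
    return "high"           # assumption: above the expected band, worth a re-measure


for reading in (11.5, 11.7, 12.0, 12.4):
    print(reading, classify_fan_rail(reading))
```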

8

Re-torque the heatsink mounting clips on each of the three hashboards. A1466 heatsink clips loosen over 12-18 months of thermal cycling. A heatsink with even 0.1 mm of lift loses its paste contact patch and the chip beneath cooks while neighbours are fine. Remove heatsink, wipe old paste with 99% IPA, apply a thin uniform layer of Arctic MX-6 or Thermal Grizzly Kryonaut, reseat with even clip pressure. One hashboard at a time - do not mix clips between boards.

9

Install a foam pre-filter on the intake if your install does not already have one. D-Central strongly recommends a simple foam pre-filter on every A1466 install - catches dust before it hits the heatsinks and can be vacuumed or washed in seconds instead of opening the chassis. Without a pre-filter you rely on the heatsink fins themselves to catch debris, which is the exact failure mode step 2 fixes - and you will be fixing it over and over.

10

Verify line voltage at the outlet under load. On 240 V split-phase (the correct feed for an A1466's 3230-3420 W draw) expect 235-245 V. On 208 V commercial expect 202-212 V. Low line voltage forces the PSU to draw more current, heats PSU internals, and can trip the PSU thermal sensor independently of hashboard sensors. Fix the feed before continuing - electrician or panel problem before miner problem. Never run an A1466 on 120 V.
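
The effect of low line voltage is simple arithmetic: for the same power, current scales as watts divided by volts. The sketch below checks a reading against the ranges in this step and estimates input current. It treats the quoted wall draw as real power at unity power factor, which is a simplifying assumption; the helper names are illustrative.

```python
# Expected under-load ranges from this step, keyed by nominal feed voltage.
FEEDS = {240: (235.0, 245.0), 208: (202.0, 212.0)}


def line_voltage_ok(nominal, measured):
    """True if the measured voltage sits inside the expected band for that feed."""
    low, high = FEEDS[nominal]
    return low <= measured <= high


def input_current(watts, volts):
    """Approximate input current in amps, assuming unity power factor."""
    return watts / volts


# Upper end of the quoted 3230-3420 W draw on each feed.
for volts in (240, 208):
    print(volts, "V:", round(input_current(3420, volts), 2), "A")
```

At 3420 W that works out to about 14.25 A on 240 V and about 16.44 A on 208 V, so the same miner pulls roughly 2 A more on the lower feed.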

11

Snapshot cgminer-api estats output to a text file before any further changes. On the miner itself, run `echo -n '{"command":"estats"}' | nc 127.0.0.1 4028 > a1466-pre-fix.txt`; from a network-connected laptop, replace 127.0.0.1 with the miner's IP address. If the miner ships to D-Central, that pre-fix snapshot is exactly what the bench tech needs to see - and it saves diagnostic time, which saves you repair dollars.
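
If `nc` is not available on the laptop, the same snapshot can be taken with a few lines of Python. This is a sketch under assumptions: it sends the same JSON request as the command in this step to port 4028 and reads until the miner closes the connection. The function names are hypothetical, and the host address in the comment is a placeholder for your miner's IP.

```python
import json
import socket


def build_request(command="estats"):
    """Encode the same JSON request used with nc in this step."""
    return json.dumps({"command": command}, separators=(",", ":")).encode("ascii")


def query_miner(host, port=4028, command="estats", timeout=5.0):
    """Send one API command and return the raw reply as text."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(command))
        while True:
            data = sock.recv(4096)
            if not data:          # miner closed the connection: reply complete
                break
            chunks.append(data)
    # cgminer-style replies usually end with a NUL byte; strip it if present.
    return b"".join(chunks).decode("utf-8", errors="replace").rstrip("\x00")


def save_snapshot(text, path="a1466-pre-fix.txt"):
    """Write the raw reply to disk unchanged, so the bench sees what you saw."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)
    return path


# Example call (placeholder address):
# save_snapshot(query_miner("192.168.1.50"))
```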

12

Remove a hashboard, strip the heatsinks, and reapply thermal paste on every ASIC on the chain that dominated your PVT_T outlier list. Arctic MX-6 or Thermal Grizzly Kryonaut in a thin uniform layer - the grain-of-rice heuristic is for CPUs, not a multi-chip hashboard; you want a uniform film across each chip top. Replace any thermal pad under the PCH or voltage-domain ICs if pads are crumbled, discoloured, or compressed. Budget one hour per hashboard the first time, 30 minutes once practised.

13

If one specific chip position is the outlier and paste refresh did not resolve it, reflow that single chip. Preheat the hashboard from below at 150 C for 3 minutes, hot-air from above at 310-330 C for approximately 30 seconds with flux around the chip periphery, natural cool-down. The A3207-class BGA tolerates a single reflow cycle reliably. A second reflow on the same chip within 90 days rarely sticks - at that point the chip is dying and needs graded-replacement from a bench with salvaged A14-generation inventory.

14

Flash a known-good MM3v2 firmware build for the A1466 from Canaan's login-gated firmware portal at avalonminer.org/firmware-document/. Release notes are sparse, but it is the only official source. Verify your hardware revision against the firmware compatibility table before flashing - the wrong MM3v2 build for a late-rev A1466 can brick the control board. Flash via the dashboard's firmware page over a wired connection; never flash over wireless, never flash while hashing, never flash while an OT fault is active.

15

If the fault cleared after firmware update but returned within a week, downgrade to the previous stable build and document revision numbers. Regressions in Canaan's OT threshold logic between MM3v2 builds are a real failure mode that user communities have documented on bitcointalk and r/ASICminer but that Canaan has never publicly acknowledged. Tape a text file to the chassis with your working build plus flash date for future reference.

16

Reseat the MM3v2 ribbon connectors on all three hashboards. The IDC-style ribbons used on the 1166 Pro and 1246 carry over to the A1466 and oxidize the same way in humid environments. Pull each ribbon fully, wipe contacts with 99% IPA on a lint-free wipe, reseat until the latch clicks. Oxidized ribbons can break the temperature telemetry path specifically, so a chain appears cool while running hot, a different sensor trips, and the diagnosis gets confusing.

17

Stop DIY and ship to D-Central when any of the following is true:

  • Every fan has been replaced, every heatsink cleaned, and every chip repasted, and the miner still trips OT within 24 hours
  • You reflowed a chip once and OT returned within 30 days
  • PVT_T[] returns impossible values (negative, above 200 C, or identical across all chip positions)
  • The same chip position trips on two different A1466 units in your rig
  • There is visible discoloration, a burnt smell, or capacitor damage anywhere on the hashboard, MM3v2, or near the PSU

Book a D-Central ASIC Repair slot. Turnaround is 5-10 business days, Canada-wide; US and international customers are welcome.
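
The "impossible values" test for PVT_T[] is easy to automate once the readings for one chain are in a list. A minimal sketch: the three conditions are the ones named in this step, and the function name and reason labels are assumptions.

```python
def implausible_pvt(temps, max_c=200):
    """Return the reasons a chain's PVT_T readings look like a telemetry fault."""
    reasons = []
    if any(t < 0 for t in temps):
        reasons.append("negative")
    if any(t > max_c for t in temps):
        reasons.append("above-max")
    # Identical values only mean something when there is more than one reading.
    if len(temps) > 1 and len(set(temps)) == 1:
        reasons.append("identical")
    return reasons


# Hypothetical chains: healthy, corrupt, and stuck.
print(implausible_pvt([70, 72, 71]))
print(implausible_pvt([70, -3, 250]))
print(implausible_pvt([85, 85, 85]))
```

An empty list means the readings are at least plausible. It does not mean the chain is healthy.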

18

Pack for shipping: anti-static bags on each hashboard and the MM3v2 separately, double-box with at least 5 cm of foam on every side, include a printed diagnostic note with the kern.log excerpt (actual OT lines), MM3v2 firmware build, PSU model, measured line voltage, intake ambient, fan tach readings before failure, and every Tier 1-3 step you have already run. The bench tech starts from your notes - better notes, faster and cheaper repair. Never ship hashboards loose or in single-box packaging; A3207-class BGAs do not survive careless handling.
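
The diagnostic note is easier to keep complete if it is generated from a fixed field list. The sketch below renders the items named in this step and marks anything you did not record instead of leaving it blank. The field labels, function name, and sample values are illustrative assumptions, not a D-Central form.

```python
# Items this step asks you to include with the shipment.
NOTE_FIELDS = [
    "kern.log excerpt",
    "MM3v2 firmware build",
    "PSU model",
    "Measured line voltage",
    "Intake ambient",
    "Fan tach readings before failure",
    "Steps already run",
]


def diagnostic_note(values):
    """Render a plain-text note; unrecorded fields are flagged, not guessed."""
    lines = []
    for field in NOTE_FIELDS:
        lines.append(f"{field}: {values.get(field) or 'NOT RECORDED'}")
    return "\n".join(lines)


# Hypothetical values; the PSU model was deliberately left out.
note = diagnostic_note({
    "kern.log excerpt": "over_temp on chain 1",
    "Measured line voltage": "238 V under load",
    "Intake ambient": "29 C",
    "Steps already run": "1-9",
})
print(note)
```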

When to Seek Professional Repair

If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.

Still Having Issues?

Our team of Bitcoin Mining Hackers has been repairing ASIC miners since 2016. We have seen it all and fixed it all. Get a professional diagnosis.