FAN_FAULT Warning

Avalon 1566 – Fan Speed Error

Fan speed error (fan RPM abnormal) on the Avalon 1566: `estats` reports `Fan*[0]`, the PS error bitmap has bit `2048` (`FAN_error`) set, or both. Applies to the 4-fan A15 flagship configuration.

Warning — Should be addressed soon

Affected Models: Avalon 1566 (A1566 185 TH/s nameplate, ~3,420 W wall draw, MM3v2-class control board, 4-fan configuration)

Symptoms

  • `estats` / `stats` API response shows `Fan1[0]`, `Fan2[0]`, `Fan3[0]` or `Fan4[0]`
  • `FanR[0%]` or `FanR[100%]` stuck in the Universal API reply: either every fan is silent, or the surviving fans are screaming at max duty
  • Web UI dashboard shows a red "Fan Speed Error" / "Fan Abnormal" banner
  • PS error bitmap shows bit `2048` set (`FAN_error`) per the Canaan `avalon10-docs` Universal API encoding
  • `HS_RCD*[...Fan*:0...]` record in the per-hashboard detailed status block
  • Remaining fans audibly ramp to 100% duty within 30-60 seconds as the thermal governor compensates
  • Per-chip PVT temperatures (`PVT_T0` / `PVT_T1` / `PVT_T2`) climbing 4-6 C per minute, not stabilizing
  • Hashrate drops 10-30% below the 185 TH/s nameplate as firmware throttles frequency to protect silicon
  • `BOOTBY[0x10.<addr>]` overheat reboot code in the kernel log after 3-8 minutes under load
  • One fan visibly motionless through the rear grille while the other three spin
  • `ascset|0,fan-spd,90` override does not change the stopped fan's state (healthy fans respond)
  • Fan hub audibly ticks or chirps briefly at boot then stops — classic stalled-bearing signature
  • Smell of dust-cooked heatsink grease near the rear chassis on re-power
  • `MM ERROR` entry in `log/miner.log` referencing fan sensor loss or tach timeout
  • One hashboard's `HS_RCD` temperature climbs 3-5 C hotter than the other two within minutes
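The fan fields and the bitmap check in the list above can be sketched in a few lines of Python. This is a minimal sketch, not Canaan's parser: the sample string is hypothetical, the exact layout of a real A1566 `estats` reply may differ, and `parse_fans`, `stopped_fans`, and `fan_error_set` are illustrative names.

```python
import re

# Field names (Fan1..Fan4, FanR) follow the symptom list; the exact layout
# of a real A1566 `estats` reply is an assumption.
FAN_RE = re.compile(r"\bFan(\d)\[(\d+)\]")
FANR_RE = re.compile(r"\bFanR\[(\d+)%\]")
FAN_ERROR_BIT = 2048  # FAN_error bit value quoted in the symptom list


def parse_fans(estats_text):
    """Return ({fan_number: rpm}, duty_percent or None)."""
    rpms = {int(n): int(rpm) for n, rpm in FAN_RE.findall(estats_text)}
    duty = FANR_RE.search(estats_text)
    return rpms, (int(duty.group(1)) if duty else None)


def stopped_fans(rpms):
    """Fan numbers whose tach reading is zero."""
    return sorted(n for n, rpm in rpms.items() if rpm == 0)


def fan_error_set(ps_error):
    """True if the FAN_error bit is set in the PS error bitmap value."""
    return bool(ps_error & FAN_ERROR_BIT)


# Hypothetical excerpt, for illustration only.
sample = "Fan1[6120] Fan2[0] Fan3[6090] Fan4[6150] FanR[100%]"
rpms, duty = parse_fans(sample)
print(stopped_fans(rpms), duty, fan_error_set(2048))  # [2] 100 True
```

Testing the bit with `&` rather than comparing for equality matters because other error bits can be set in the same bitmap at the same time.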

Step-by-Step Fix

1

Power down at the PDU and confirm the thermal state first. The A1566 will keep trying to hash with one fan down, so don't let it. Kill the breaker at the PDU, not through the web UI and not with the power button. Open the chassis within 2 minutes so residual heat dissipates before you start probing. Even 5 extra minutes of running a 185 TH/s miner on a dead fan can cook thermal pads and shorten hashboard life by months. This is the only truly non-skippable step. Because of its heatsink density, the A1566 has a tighter thermal margin in a dead-fan state than any previous A-series, so treat it that way.

2

Read the log, write down the fan number. On the miner's web UI or via `estats` over the API, record exactly which fan position reports `0`. Is it `Fan1`? All four? One per hashboard? Position dictates the repair — an intake fan failure and an exhaust fan failure are the same log line but different physical parts and different airflow consequences. Screenshot or copy/paste the `estats` response. This artifact is the difference between a 20-minute fix and a 2-hour scavenger hunt, and it's what D-Central's bench asks for first if the miner ends up shipping. Include the MM firmware version string — the A15 platform firmware cadence is fast-moving and the bench needs to know what build you were on.
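One way to capture that artifact is to pull the reply over the API and write it to a timestamped file. The sketch below assumes the firmware accepts the plain-text `estats` command on port 4028 and closes the connection after replying, as cgminer-style APIs generally do. `query_api` and `save_snapshot` are illustrative names, not part of any Canaan tool.

```python
import socket
from datetime import datetime, timezone
from pathlib import Path


def query_api(host, command, port=4028, timeout=5.0):
    """Send one plain-text API command and return the raw reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # peer closed the connection: reply is complete
                break
            chunks.append(data)
    # Replies may end with a NUL byte; strip it before decoding.
    return b"".join(chunks).rstrip(b"\x00").decode("utf-8", errors="replace")


def save_snapshot(reply, directory=".", now=None):
    """Write the reply to estats-<UTC timestamp>.txt and return the path."""
    now = now or datetime.now(timezone.utc)
    path = Path(directory) / f"estats-{now:%Y%m%dT%H%M%SZ}.txt"
    path.write_text(reply, encoding="utf-8")
    return path
```

Typical use would be `save_snapshot(query_api("<miner-ip>", "estats"))`, run once before any hardware is touched so the file reflects the original fault state.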

3

Blow the suspect fan out with canned air. Upright can, short bursts. Hold the blade still with a plastic probe while blasting — spinning the fan through induced airflow stresses a tired bearing further and has been known to finish off a marginal fan on the bench. Dust dams on the hub are the most common cause of a fan that *looks* fine but reads zero RPM. Basement shop, garage miner, or any cat/dog within 50 metres of the rig? This step fixes the fault about 15% of the time and costs nothing. Also blow out the intake and rear exhaust grilles while you're there — the A1566's fin pitch traps dust faster than older A-series.

4

Check ambient at the intake. Use an IR thermometer at the front grille, not in the middle of the room. The A1566's thermal governor is designed around `<=35 C` inlet. If your intake is already at 38 C because it's July in Ontario and the garage has no cross-ventilation, the fan controller may be running every fan at 100%, and you're hearing that, not a fault. Check the `Fan*` readings against this: if the other fans read real RPM but one is pegged at `0`, it's not ambient, it's the fan. If *all* fans are above 6800 RPM and the box is loud, it's ambient. Different problem, different fix. The A1566 is less forgiving of elevated intake temperatures than its A13 predecessors.
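The ambient-versus-fan decision in this step reduces to two checks, sketched below. The 35 C and 6800 RPM thresholds are the ones quoted in this step. The function names and return labels are illustrative, and the sketch assumes the RPM list is ordered Fan1 to Fan4.

```python
INLET_LIMIT_C = 35.0  # inlet design point quoted in this step
HIGH_RPM = 6800       # "all fans above 6800 RPM" threshold quoted in this step


def classify(rpms):
    """Return (verdict, stopped_positions) for a list of fan RPM readings."""
    stopped = [i + 1 for i, rpm in enumerate(rpms) if rpm == 0]
    if stopped:
        return "fan", stopped      # at least one tach reads zero
    if all(rpm > HIGH_RPM for rpm in rpms):
        return "ambient", []       # every fan pegged high: check the inlet
    return "ok", []


def inlet_over_limit(inlet_c):
    """True if the measured inlet temperature exceeds the design point."""
    return inlet_c > INLET_LIMIT_C


print(classify([6120, 0, 6090, 6150]), inlet_over_limit(38.0))
```

A zero reading takes priority over the high-RPM check because the surviving fans also run fast when one fan has stopped.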

5

Reseat the fan connector at the MM3v2-class control board. Power off at the PDU. Wait 60 seconds. Pop the lid. Locate the 4-pin fan header matching the log's fan position against the board silkscreen. Unplug, inspect the shell for bent pins or a crushed plastic latch, reseat firmly until you feel the click. This alone fixes roughly 60% of `Fan[0]` tickets on Canaan MM3-family boards. Before reassembling, put a tiny dab of dielectric grease on the pins — vibration is what backed the connector out the first time, and it'll do it again without help. On the A1566 specifically, verify the silkscreen matches your log's fan-position numbering — do not assume the A1346/A1366 mapping carries over.

6

Inspect the fan cable end-to-end. Look for chafe points where the cable runs past sheet-metal edges (especially at the corner of the rear fan housing), strain-relief damage at the fan-side terminal, discoloration from heat, or obvious crush damage from the lid being reinstalled over a pinched cable. Replace the harness if anything looks off. The convention on 4-pin PWM fans in this class is GND / +12 V / TACH / PWM — verify against the silkscreen on your specific MM3v2-class revision before assuming. Harnesses for A1566 fans are available from general ASIC parts suppliers; generic PC 4-pin fan extensions do not meet the 9 A current rating the HA1250H12SB-Z pulls.

7

Swap the suspect fan into a known-good position. There are four fan headers on the MM3v2-class board and four fans in the chassis. Move the suspect fan's connector to a header you know is working. Power on and re-read `estats`. If the suspect fan now spins and reads correctly in the new slot, you've isolated the fault to the *original slot*, a board-side issue that is Tier 3/4 territory. If the suspect fan still reads `0` in a known-good slot, the fan itself is dead and needs replacement. This single diagnostic saves you from ordering a new fan when the real fault is a blown SMD fuse on the board, or from chasing a board fault when the fan is simply dead.

8

Measure the +12 V rail at the dead fan header under load. Multimeter on DC, probe V+ to GND on the fan header while the miner is powered on and the fan controller is commanded to spin (use `ascset|0,fan-spd,90` via the API). Expect 11.8-12.2 V steady. Below 11 V or reading nothing = blown SMD fuse or supply-rail fault on the MM3v2-class board — Tier 3. Healthy 12 V rail with a still-dead fan = fan or harness, replace. This is the single most important measurement on a suspected board-side fan failure and it takes about 90 seconds once the lid is off.
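The voltage bands in this step can be written down as a small lookup, sketched below. The 11.8-12.2 V and 11 V figures come from this step. The "recheck" outcome for readings outside both bands is an assumption added here, since the step does not say how to treat them, and `classify_rail` is an illustrative name.

```python
def classify_rail(volts):
    """Map a +12 V fan-header reading (under load) to a next action."""
    if volts < 11.0:
        # Dead or sagging rail: blown SMD fuse or supply-rail fault (Tier 3).
        return "board"
    if 11.8 <= volts <= 12.2:
        # Rail is healthy, so a still-dead fan points at the fan or harness.
        return "fan_or_harness"
    # Assumption: anything else (11.0-11.8 V or above 12.2 V) is ambiguous,
    # so re-measure with the fan commanded on before deciding.
    return "recheck"


for reading in (0.0, 11.4, 12.05):
    print(reading, classify_rail(reading))
```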

9

Set manual fan speed via the API as a diagnostic. `echo '{"command":"ascset","parameter":"0,fan-spd,90"}' | nc <miner-ip> 4028` pushes the fan controller to 90% duty. Try 20, 50, 90. A healthy fan will respond within a second, and its tach value should climb in proportion to duty. A dead one won't. This is a diagnostic, not a fix: the setting doesn't persist across reboots on stock MM firmware, so don't use it to mask a real fault. Pairing this command with Step 8's voltage measurement tells you whether the controller is even driving the rail.
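The 20 / 50 / 90 sweep can be scored mechanically once the tach readings are written down. The sketch below builds the plain-text form of the command and checks that RPM rises with duty. The 0-100 range check is an assumption (the firmware's accepted range may be narrower), and both function names are illustrative.

```python
def fan_speed_command(duty):
    """Build the plain-text ascset command for a given duty percentage."""
    if not 0 <= duty <= 100:  # assumed range; firmware limits may differ
        raise ValueError(f"duty out of range: {duty}")
    return f"ascset|0,fan-spd,{duty}"


def responds_to_sweep(readings):
    """readings: (duty_percent, rpm) pairs recorded during the sweep.

    A healthy fan shows non-zero RPM at every step and a strictly
    higher RPM at each higher duty.
    """
    rpms = [rpm for _, rpm in sorted(readings)]
    rising = all(b > a for a, b in zip(rpms, rpms[1:]))
    return all(rpm > 0 for rpm in rpms) and rising


print([fan_speed_command(d) for d in (20, 50, 90)])
print(responds_to_sweep([(20, 1500), (50, 3600), (90, 6200)]))  # True
```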

10

Check the kernel log for `BOOTBY[0x10.*]` overheat reboots. If you find repeated `BOOTBY[0x10]` entries alongside the `Fan[0]` state, the fan failure has already caused one or more overheat events and the hashboards have been through thermal cycling they shouldn't have. Plan for thermal paste refresh at the next reasonable opportunity, and watch for elevated per-chip CRC errors over the following week. Pull the full kernel log to a file and keep it — if the miner throws an `ASICCRC` fault within 30 days, the bench will want to see both logs to triage. On an A1566, overheat events leave less margin for recovery than older A-series.
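Counting the overheat reboots in a saved log is a one-pattern job, sketched below. The pattern accepts both `BOOTBY[0x10]` and `BOOTBY[0x10.<addr>]`. The sample log lines are hypothetical, as the format of the rest of each line is not specified here.

```python
import re

# Matches BOOTBY[0x10] and BOOTBY[0x10.<anything>], but not e.g. BOOTBY[0x100].
OVERHEAT_RE = re.compile(r"BOOTBY\[0x10(?:\.[^\]]*)?\]")


def overheat_reboots(log_text):
    """Return the log lines that record an overheat reboot."""
    return [line for line in log_text.splitlines() if OVERHEAT_RE.search(line)]


# Hypothetical log excerpt, for illustration only.
sample_log = """\
boot 1 BOOTBY[0x01.00000000]
boot 2 BOOTBY[0x10.00000000]
boot 3 BOOTBY[0x10]
"""
hits = overheat_reboots(sample_log)
print(len(hits))  # 2
```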

11

Replace the fan with the correct part. Stock A1566: `HA1250H12SB-Z`, 12 V / 9 A, 120 x 120 x 50 mm, 4-pin PWM header. Per Zeus Mining's article, confirmed compatible across A1346 / A1366 / A1466 / A1566. Do NOT substitute a 120x120x25 mm or 120x120x38 mm fan — the A1566 chassis is designed around the 50 mm depth for static pressure through the dense heatsink fin pitch. Thinner fans cut airflow ~30% and on a 185 TH/s miner that margin is the difference between stable hashrate and thermal throttling. Confirm polarity against the silkscreen before power-on. Torque fan guard screws snug, not crushing — over-torque has been known to deform fan housings and cause blade ticking on neighboring units.

12

Replace the SMD fuse on the MM3v2-class fan rail. If your voltage measurement in Step 8 showed the +12 V rail dead on one fan header while others were healthy, a Canaan-side SMD fuse likely popped when a previous fan seized and drew locked-rotor current. The fuse is typically a 1206-package fast-blow in the 5 A-10 A range; identify by following the rail from the dead fan header back to the nearest SMD fuse component. Reflow the replacement with hot air at ~290 C and flux. Verify rail voltage returns to 12 V under a dummy load before reconnecting the real fan. If the replacement fuse blows again immediately, there's a downstream short and you're done with DIY. A1566-specific note: reference photography for the exact fuse designator on the A15-series MM3v2 revision is not yet widely published — ship to the bench if you're not confident.

13

Firmware check and rollback. If Steps 1-12 haven't resolved the fault and `estats` shows all four fans at `0` plus temperature sensors at `-273 C`, you've hit a firmware bug: the firmware has lost its handle on the I2C bus and the fan-tach polling died with it. Pull your current MM firmware version via `ascset|0,ver` or the web UI. Compare against the Avalon firmware portal for the latest A1566-targeted MM3v2 build. If you recently upgraded and the fault coincides with the upgrade, roll back to the previous known-good build for your hardware revision. Match the MM3v2 silkscreen rev to the firmware target before flashing. The A1566's MM3v2 revision is distinct from the A13-series MM3v2, and flashing the wrong image bricks the controller.
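The firmware-bug signature described in this step (every fan at zero and every temperature sensor at -273 C) can be told apart from a single dead fan with a short check. This is a sketch under the assumption that the RPM and temperature values have already been extracted from `estats`. `looks_like_bus_loss` is an illustrative name.

```python
ABSENT_TEMP_C = -273  # sentinel temperature quoted in this step


def looks_like_bus_loss(rpms, temps_c, fan_count=4):
    """True only if all fans read 0 AND all temperature sensors read -273 C."""
    all_fans_zero = len(rpms) == fan_count and all(rpm == 0 for rpm in rpms)
    all_temps_absent = len(temps_c) > 0 and all(
        t == ABSENT_TEMP_C for t in temps_c
    )
    return all_fans_zero and all_temps_absent


print(looks_like_bus_loss([0, 0, 0, 0], [-273, -273, -273]))   # True
print(looks_like_bus_loss([6100, 0, 6050, 6120], [62, 64, 71]))  # False
```

Requiring both conditions keeps a genuine single-fan failure, where temperatures still read plausibly, from being misread as a firmware problem.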

14

Replace the MM3v2-class control board. If the fan rail is dead and the SMD fuse swap doesn't restore it, damage is deeper — gate driver, voltage regulator, or board-level short. At that point, replacing the full MM3v2-class board is faster than board-level rework unless you have A1566 bench fixtures. D-Central's Avalon repair bench handles both; the board itself is a Canaan control-board item available through multiple parts channels, though A15-specific stock is thinner than A13 at the time of writing.

15

Refresh thermal paste on all hashboards if overheat events occurred. If the dead-fan state ran long enough to trigger any `BOOTBY[0x10]` overheats, the thermal interface material on the A1566's three hashboards took abuse. Pull each hashboard, replace the thermal paste on the ASIC-to-heatsink interface with Arctic MX-6 or Thermal Grizzly Kryonaut, and inspect the pads between the PMIC/PVT sensors and the chassis for crumbling or dry-out. This is a preventative step that pays off when you want to keep the miner alive another 2-3 years rather than trading it down the secondary market in six months. On the A1566 specifically, correct pad thickness and flatness matter more than on older A-series because of the tighter heatsink tolerance.

16

Stop DIY when any of these are true: +12 V rail is dead across multiple fan positions (not just one), you see visible heat damage or discoloration on the MM3v2-class board, two fans in the same rig have failed within 30 days (points to upstream PSU or ground issue), any fan failure coincided with a hashboard going dark or throwing `ASICCRC` errors, or your SMD fuse replacement blew immediately on re-power. That's no longer fan territory — that's D-Central Avalon repair bench territory.
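The stop conditions above work well as an explicit checklist. The sketch below encodes each one as a field. The `Findings` class and its field names are illustrative, while the thresholds (more than one dead rail position, two fan failures within 30 days) come from the list in this step.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    dead_rail_positions: int = 0        # fan headers with no +12 V
    board_heat_damage: bool = False     # visible damage or discoloration
    fan_failures_last_30_days: int = 0  # failures in this rig
    hashboard_fault_with_fan_failure: bool = False  # dark board or ASICCRC
    replacement_fuse_blew: bool = False  # new SMD fuse blew on re-power


def stop_diy_reasons(f):
    """Return the list of stop conditions that apply; empty means carry on."""
    reasons = []
    if f.dead_rail_positions > 1:
        reasons.append("+12 V rail dead on multiple fan positions")
    if f.board_heat_damage:
        reasons.append("visible heat damage on the control board")
    if f.fan_failures_last_30_days >= 2:
        reasons.append("two or more fan failures within 30 days")
    if f.hashboard_fault_with_fan_failure:
        reasons.append("fan failure coincided with a hashboard fault")
    if f.replacement_fuse_blew:
        reasons.append("replacement SMD fuse blew immediately")
    return reasons


print(stop_diy_reasons(Findings(dead_rail_positions=2)))
```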

17

What D-Central does at the Avalon bench for an A1566. Diagnosis against a reference A1566 rig, MM3v2-class component-level repair including SMD fuse and gate-driver replacement, fan harness remake with dielectric-greased connectors, full hashboard thermal service if overheat events occurred, and a 24-hour nameplate burn-in with all four fans monitored via continuous `estats` polling before ship-back. We're one of the few North American benches servicing the A15 generation as it ramps — Canaan's official channel routes to Malaysia or Shenzhen with multi-week turnarounds.

18

Ship the whole miner, not just the MM3v2-class board. The A1566's fan assembly, power-splitter, and hashboard-to-MM harnesses are awkward to pack safely separate from the chassis. Remove the hashboards, wrap each one in its own anti-static bag, and double-box them together with the rest of the unit. Include a note with the failing `estats` output, the MM firmware version string, and the observed symptoms. That saves diagnostic hours, which saves repair dollars.

When to Seek Professional Repair

If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.

Still Having Issues?

Our team of Bitcoin Mining Hackers has been repairing ASIC miners since 2016. We have seen it all and fixed it all. Get a professional diagnosis.