FAN_FAULT Critical

Avalon 1166 – Fan Speed Error

Fan speed error: the chassis fan tach has dropped to 0 RPM on an Avalon A1166 / A1166 Pro. PS[0] bit 11 (2048, FAN_error) is set in cgminer-api -o estats output, the red LED is sustained, and hashing has halted.

Critical — Immediate action required

Affected Models: Avalon 1166, Avalon 1166 Pro (cross-applicable diagnostic logic to A1066 and A1266; not applicable to Nano 3 / Q / A1346+)

Symptoms

  • Fan1 or Fan2 reads 0 RPM in Canaan web UI or cgminer-api -o estats output
  • Red chassis LED sustained - no green, no yellow, no flash pattern
  • Dashboard hashrate has dropped to 0 and the miner has stopped accepting pool work
  • PS[0] = 2048 (FAN_error bit 11) set in cgminer-api -o estats output
  • Kernel log or dmesg | tail -100 shows repeated ERROR_FAN_LOST or fan check failed entries
  • Miner boots to flashing-green LED, then drops to sustained red inside 30-90 seconds and halts hashing
  • One of the two 14038 fans is audibly silent while the peer spins at nameplate ~6000 RPM
  • PVT_T0, PVT_T1, or PVT_T2 maximums climbing past 80 C in the last telemetry window before shutdown
  • Chassis intake or exhaust air noticeably still on one side vs the other
  • Loud click, thump, or grinding noise preceded the fault (bearing seize or foreign-object strike)
  • Days or weeks of buzzing / rattling preceded the hard fault (early-warning signature ignored)
  • Fan rotor visibly seized - will not turn by hand-spin with PDU off
  • Miner restarts itself in a loop: boot, red LED, shutdown, boot, red LED, shutdown
  • Visible scorch marks, melted plastic, or ozone smell at control-board fan header
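
The PS[0] symptom in the list above comes down to a plain bit test. As a minimal Python sketch (the helper names are hypothetical, and the only bit meaning taken from this guide is bit 11 = FAN_error), a status word could be decoded like this:

```python
FAN_ERROR_BIT = 11  # from this guide: bit 11 (value 2048) is FAN_error


def set_bits(status: int) -> list:
    """Return the position of every bit set in a PS[0] status word."""
    return [bit for bit in range(status.bit_length()) if (status >> bit) & 1]


def has_fan_error(status: int) -> bool:
    """True when the FAN_error bit is set, whatever else is set alongside it."""
    return bool(status & (1 << FAN_ERROR_BIT))


print(set_bits(2048))       # [11]
print(has_fan_error(2048))  # True
```

Testing the bit rather than comparing against 2048 matters because other bits can be set at the same time, in which case PS[0] no longer equals 2048 exactly.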

Step-by-Step Fix

1

Tier 1 - Power cycle at the PDU for 60 seconds. Pull the AC plug completely; do not use the web-UI reboot, which leaves residual state on the control board. A full cold start clears any wedged cgminer driver state, drains the PSU caps, and lets the fans re-initialize from zero. Observe whether FAN_FAULT re-asserts immediately on boot or only after several minutes of full-load operation. Approximately 10-15% of tickets resolve here because the fan controller was stuck in a nonsensical commanded state. Record whether the fault returns at cold boot or only under warm load: the two signatures point to different root causes in the later steps.

2

Tier 1 - Pull the dashboard RPM and PS[0] baseline from cgminer-api -o estats. SSH into the controller and capture the full estats output to a file. You need the raw data before touching anything: which fan reads 0 RPM, what PS[0] bitmap value is set, what PVT_T0-2 values appeared in the last telemetry window before shutdown, and what SYSTEMSTATU reports. This data decides whether you spend the next 15 minutes on a connector or the next 3 hours on a PSU teardown, so do not skip it.
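
Once the estats output is saved, the baseline values can be pulled out with a short script. The sketch below assumes the output is a run of Key[value] tokens and that PS[0] is the first number inside PS[...]; the sample string is illustrative, not a capture from a real miner, and the function names are hypothetical.

```python
import re

# Illustrative sample only, not captured from a real miner. Real estats
# output carries many more fields; the Key[value] layout is an assumption.
SAMPLE = (
    "Fan1[0] Fan2[5980] PS[2048 1209 1254 20 2509 1254] "
    "PVT_T0[ 78 81 83] PVT_T1[ 74 76 79] PVT_T2[ 70 72 75] "
    "SYSTEMSTATU[Work: In Idle, Hash Board: 3]"
)


def parse_fields(text: str) -> dict:
    """Collect every Key[value] token into a dict of raw strings."""
    return dict(re.findall(r"(\w+)\[([^\]]*)\]", text))


def baseline(fields: dict) -> dict:
    """Pull out the values Step 2 asks you to record."""
    ps_words = fields.get("PS", "").split()
    temps = [
        int(t)
        for key in ("PVT_T0", "PVT_T1", "PVT_T2")
        for t in fields.get(key, "").split()
    ]
    return {
        "fans_at_zero": [f for f in ("Fan1", "Fan2")
                         if fields.get(f, "").strip() == "0"],
        "ps0": int(ps_words[0]) if ps_words else None,
        "max_pvt_temp": max(temps) if temps else None,
        "system_status": fields.get("SYSTEMSTATU"),
    }


print(baseline(parse_fields(SAMPLE)))
```

Keep the raw file as well: the summary is only a convenience, and field names may differ between firmware builds.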

3

Tier 1 - Verify ambient and intake clearance around the chassis. The A1166 wants inlet air at or below 35 C per Canaan spec. Keep at least 15 cm clear around the rear intake: no curtains, cardboard boxes, pet bedding, or neighboring-miner exhaust plumes obstructing airflow. A bottlenecked intake won't directly cause FAN_FAULT, but it accelerates bearing wear via heat soak and masks a marginal fan's decline. Canadian basement/garage deployments can also have cold-start issues: at -5 C, thickened bearing grease can keep a fan below the tach threshold for the first 60-90 seconds after power-on.

4

Tier 1 - Shop-vac the rear intake grill and chassis exterior before opening anything. Dust ingestion is the leading driver of 14038 bearing wear in home A1166 deployments. Vacuum the rear intake grill, side vents, and exposed duct surfaces. Do not point compressed air at the fans - blowback spins them past nameplate RPM, drives back-EMF into the PWM driver IC, and can create a new fault you did not start with. Vacuum pulls only.

5

Tier 2 - Reseat both 4-pin fan harnesses at the control board. PDU off for 60 seconds. Remove the top cover (6-8x M3 screws depending on revision). Locate each fan's 4-pin Molex PWM header. Unplug and replug each firmly until a positive seat is felt and heard. Inspect pin sockets for bent contacts, corrosion, or green verdigris. Tug gently to confirm retention. Reassemble and power on. This is the single highest-yield physical step on A1166 FAN_FAULT tickets: approximately 25% resolve here. Reseat both harnesses even if only one fan reports the fault.

6

Tier 2 - Hand-spin each 14038 fan with the PDU off. Remove the rear grill (8x M3x12 Torx T10). Grip each fan rotor and give it a brisk spin. A healthy fan coasts silently for 6-10 seconds. Bad bearing: gritty feel, stops in under 2 seconds. Tick once per revolution: a blade is hitting the grill/shroud, or the hub is cracked. Frozen rotor: dead, replace. Rank fans worst-to-best before touching a replacement part; the fan throwing FAN_FAULT may not be the only one on borrowed time.
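
If you time each spin with a stopwatch, the worst-to-best ranking can be written down mechanically. This is a rough sketch using the thresholds above; the "marginal" band between 2 and 6 seconds is an assumption (the guide does not name it), and the function names are hypothetical.

```python
def classify_coast(seconds: float) -> str:
    """Map a hand-spin coast time to a rough bearing verdict."""
    if seconds <= 0:
        return "seized"
    if seconds < 2:
        return "bad bearing"
    if seconds < 6:
        return "marginal"  # assumption: this band is not named in the guide
    return "healthy"


def rank_worst_first(coast_times: dict) -> list:
    """Sort fans by coast time, shortest (worst) first, with a verdict each."""
    ordered = sorted(coast_times.items(), key=lambda item: item[1])
    return [(name, classify_coast(seconds)) for name, seconds in ordered]


print(rank_worst_first({"Fan1": 1.2, "Fan2": 7.5}))
```

Coast time says nothing about ticks or grit, so note those by hand next to the ranking.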

7

Tier 2 - Inspect each blade and the fan housing under raking light. Remove the suspect fan (4x M4). Hold each blade edge against a bright flashlight at a shallow oblique angle. Look for chips, hairline cracks, missing fragments, embedded debris, dust cake, or witness marks on the hub. Check the cage for bent wires. Check the shroud for foreign objects - zip-tie tails, cable jacket fragments, insulation, packing peanuts. Correct any mechanical issue before blaming the fan.

8

Tier 2 - Continuity-check the fan harness end-to-end. PDU off. Unplug the fan from the control-board header. Set the multimeter to continuity/beep mode. Probe the TACH pin at both ends of the harness: it should beep or read less than 1 Ohm. Repeat for +12V and GND. A broken TACH wire is the classic cause of a fan that is clearly spinning but reports 0 RPM (approximately 12% of tickets). Any open wire: recrimp the connector or replace the harness. Also inspect for chafe-through at sharp chassis edges.

9

Tier 2 - Swap the fans between bays. Label both bays. Power off. Remove both fans (4x M4 each) and reinstall them in swapped positions. Power on, let the miner stabilize for 5-10 minutes, then re-read cgminer-api -o estats. If the fault moves with the fan (for example, Fan1 read 0 RPM before the swap and Fan2 reads 0 RPM after it), you've confirmed a dead fan: order a replacement. If the fault stays in the same bay with a known-good fan installed, the socket or PWM driver is at fault (Tier 4).
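
The swap test is a two-way decision, sketched below in Python. The function name is hypothetical, and the "fault cleared" branch is an assumption added for completeness; the guide only describes the other two outcomes.

```python
from typing import Optional


def interpret_swap(fault_before: str, fault_after: Optional[str]) -> str:
    """Interpret the bay-swap test.

    fault_before / fault_after name the channel reading 0 RPM ("Fan1" or
    "Fan2"); fault_after is None when no channel reads 0 RPM after the swap.
    """
    if fault_after is None:
        # Assumption: not covered by the guide. Handling the connectors
        # during the swap may have fixed a marginal seat.
        return "fault cleared: suspect a loose connector, keep monitoring"
    if fault_after == fault_before:
        return "fault stayed with the bay: socket or PWM driver (Tier 4)"
    return "fault moved with the fan: dead fan, order a replacement"


print(interpret_swap("Fan1", "Fan2"))
```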

10

Tier 2 - Measure socket voltage under load. Unplug the fan from the suspect socket and run the miner on the remaining good fan (FAN_FAULT will re-assert in about 60 s on one fan, so be quick). Set the multimeter to DC volts: red probe on the +12V pin, black on GND. Expect a steady 12.0-12.5 V. Then probe PWM to GND: expect a variable DC reading that changes when you issue a fan-speed change via the Canaan UI or ascset. If the socket reads 12.0-12.5 V and PWM responds, the socket is fine; replace the fan. If the socket reads 0 V, floats, or drifts, the PWM driver or header has failed (Tier 4).
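
The pass/fail rule for the socket measurement can be stated as a short Python check. This is only a sketch with a hypothetical function name; treating "rail in range but PWM unresponsive" as a Tier 4 case is an assumption, since the guide does not spell that combination out.

```python
RAIL_MIN_V = 12.0  # lower bound of the expected +12V reading, per this step
RAIL_MAX_V = 12.5  # upper bound of the expected +12V reading, per this step


def socket_verdict(rail_volts: float, pwm_responds: bool) -> str:
    """Decide between a fan replacement and a board-level fault."""
    rail_ok = RAIL_MIN_V <= rail_volts <= RAIL_MAX_V
    if rail_ok and pwm_responds:
        return "socket fine, replace fan"
    # Covers 0 V, floating or drifting rails, and (by assumption) a healthy
    # rail whose PWM pin does not react to fan-speed changes.
    return "PWM driver or header failure (Tier 4)"


print(socket_verdict(12.2, True))
print(socket_verdict(0.0, False))
```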

11

Tier 2 - Replace the fan with a matching 14038 PWM unit. Source a 140x140x38 mm, 12 V, 2.4-3.2 A, 4-wire PWM server fan with +12V / GND / PWM / TACH pinout matching the original. Nidec V35R14BS2M3-07 / T35R14BS2M3-07 and Sanyo Denki 9GV1412M401 are the common factory parts. Confirm pinout against your existing harness before ordering - a 4-wire PWM fan with non-standard pinout can damage the PWM driver on install. Mount with 4x M4 screws snug but not cranked, zip-tie the harness away from blade sweep, reassemble.

12

Tier 2 - Shop-vac and wipe the fan cages, grill, and duct while the chassis is open. Dust ingestion accelerates every failure mode on this list. 99% isopropyl + lint-free cloth on the grill and duct. Shop-vac (not compressed air) through all exposed surfaces. If one fan failed, the other is exposed to the same environment - preventative cleaning saves you a return trip in 2-3 months.

13

Tier 3 - Check the PSU internal fan if PS[0] = 2048 with chassis fans healthy. Power off, disconnect the PSU, open the case. Bleed the main DC-link caps with a bleed resistor before touching anything - 3400 W server PSUs store enough energy to kill you. The PSU fan is typically an 80x80x25 mm or 120x120x38 mm 12 V ball-bearing unit. Replace with matching spec. If you have not opened a 3400 W server PSU before, stop and ship the PSU to D-Central. Bench safety is not optional at this scale.

14

Tier 3 - Refresh thermal paste on the MM nearest the dead fan. If the miner ran in fault state for more than a few thermal cycles before the operator caught it, the MM on the dead-fan side likely saw Tj spikes that dried out or pumped out the paste. Remove the MM, clean off the existing paste with 99% IPA and lint-free wipes, and reapply Arctic MX-6 or Thermal Grizzly Kryonaut in a thin uniform layer, no globs. Replace any visibly dried thermal pads on voltage-regulation silicon. Cheap insurance once the fan is replaced.

15

Tier 3 - Reflow the PWM driver IC or fan header on the control board. If Step 10 confirmed a dead socket, inspection under raking light often shows cold solder joints, dull grey joints, or lifted pads at the 4-pin header or the adjacent PWM driver IC. Flux the joints, reflow with hot air at approximately 320 C, and clean with IPA afterward. Replace the driver IC if it's visibly damaged or if reflow alone doesn't restore socket voltage. This is Tier 3 bordering on Tier 4: if you don't have the fixtures and replacement silicon, skip to Tier 4.

16

Tier 3 - Roll firmware back one version if the fault began within 24-72 hours of a flash. Source older builds from avalonminer.org/firmware-document/. Canaan's signed-firmware model complicates rollback but does not prevent it for builds signed by the same key. Wired Ethernet only during flash - a mid-flash Wi-Fi drop will brick the controller on Canaan's signature check. Observe 24 hours after rollback. If the fault does not return, the newer build's fan-curve or tach-debounce was incompatible with your physical fans; stay on the older build.

17

Tier 3 - Replace the fan harness end-to-end. If Step 8's continuity check revealed a break inside the harness boot or at a crimp you cannot re-access cleanly, replace the whole harness rather than butchering it with splices. Factory harness is a 4-wire 22 AWG cable with locking 4-pin Molex on both ends. Same spec works across A1066/A1166/A1246 generations; D-Central carries replacements on the bench, and most ASIC parts vendors will ship one for under CAD $25.

18

Tier 4 - Stop DIY and book D-Central ASIC Repair when: (a) Step 10 proved the fan socket is dead after a known-good fan swap; (b) burnt-component odour, scorch marks, ozone smell, or visibly melted plastic near the fan header or PSU vents (power off immediately); (c) PS[0] bits 128/256/512/1024 co-set with 2048 (PSU channel-level faults needing programmable-load bench isolation); (d) the 3400 W PSU internal fan needs replacement and you have not bled server-scale DC-link caps before. D-Central bench runs A3205/A3206 test fixtures with programmable AUC3 load and 24-hour nameplate burn-in. Book at d-central.tech/services/asic-repair/. Turnaround 5-10 business days, Canada/US/international.
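
Criterion (c) can be checked directly against the PS[0] value you recorded. The Python sketch below uses only the bit values named in that criterion; the function name is hypothetical and the individual meaning of each PSU bit is not assumed.

```python
FAN_ERROR = 2048                         # FAN_error, bit 11
PSU_CHANNEL_BITS = (128, 256, 512, 1024)  # PSU channel-level fault bits


def psu_bits_co_set(status: int) -> list:
    """List the PSU channel-level bits set alongside FAN_error.

    Returns an empty list when FAN_error itself is not set, or when it is
    set on its own.
    """
    if not status & FAN_ERROR:
        return []
    return [bit for bit in PSU_CHANNEL_BITS if status & bit]


print(psu_bits_co_set(2048))               # []
print(psu_bits_co_set(2048 + 128 + 1024))  # [128, 1024]
```

A non-empty result matches criterion (c) and is a reason to stop and book a bench repair.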

When to Seek Professional Repair

If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.

Still Having Issues?

Our team of Bitcoin Mining Hackers has been repairing ASIC miners since 2016. We have seen it all and fixed it all. Get a professional diagnosis.