
CHAIN_FAIL Critical

Avalon 1566 – Hashboard Not Detected

The 1566 powers up and the AUC3 answers the network, but SYSTEMSTATU reports a Work count of 2 (or 1, or 0) instead of the expected 3. One or more A15 hashboards failed to enumerate over the IIC bus during boot.

Critical — Immediate action required

Affected Models: Avalon 1566 (standard ~185 TH/s bin), Avalon 1566 MaxSpeed variants, and mixed-bin 1566 fleets. Shares MM control-board / AUC3 architecture, PSU output topology, and DF1205012B2 fan family with the A13 / A14 family (1326, 1346, 1366, 1446, 1466) — cross-model diagnostic procedures apply with minor firmware-version offsets.

Symptoms

  • Dashboard hashrate sits near `two-thirds of nameplate` — a stock 1566 running `~123 TH/s` instead of `~185 TH/s` — for `30+ minutes` of continuous mining
  • Avalon Web UI **Device** tab shows `Hash Boards: 2` (or `1`) when all three are physically installed and the miner was previously hashing at nameplate
  • cgminer API `estats` response on port `4028` shows `SYSTEMSTATU[0] Work: 2` (or `1` / `0`) instead of `Work: 3`
  • `MW0` / `MW1` / `MW2` array for the missing chain is entirely zero, missing, or truncated from the API response
  • `ECHU[x x x]` reports non-zero bits on the affected chain position, or the slot's `ECHU` field is absent entirely
  • `PS[0]` bitmap non-zero — especially bits `512` (`OC_IOSA`), `1024` (`OC_IOSB`), or `2048` (`OC_IOSC`) — flags a PSU output-channel overcurrent trip upstream of the hashboard
  • Top-panel status LED sits **sustained red** (the overloaded Avalon red LED — one colour, seven possible faults)
  • `/tmp/log/log` (MM firmware log) shows `avalon: chain X init fail`, `iic: no ACK from addr 0x60`, `EEPROM read fail`, `MMCRCFAILED`, or `CODE_MMCRCFAILED`
  • `PVT_T0 / PVT_T1 / PVT_T2` arrays empty or zero on the missing chain — no per-chip temperature telemetry = dead board or dead enumeration
  • `PVT_V0 / PVT_V1 / PVT_V2` arrays report `0 mV` on the dark chain but nominal (`~1200 – 1350 mV` domain, `~290 – 360 mV` core) on the live chains
  • One of the three chassis fans spins at normal RPM but the corresponding hashboard shows **no thermal climb** 30-60 seconds after boot — dead board means dead heat
  • Pool side shows stable stratum connection and flat reject rate — this is a detection fault, not a mining-quality fault
  • Efficiency (`J/TH`) has climbed from the advertised `~18 J/TH` into the `~25 – 30 J/TH` zone — power bill unchanged, revenue down
  • Fault reproduces consistently across hard reboots (always-on = start at the board); or comes and goes (intermittent = start at PSU / harness / AUC3 bus)
  • Miner was recently moved, re-racked, had a hashboard swapped, had thermal paste refreshed, had a PSU swap, or had an MM firmware flash in the last `30 days`
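The two-thirds hashrate and the degraded efficiency figure in the list above follow from simple arithmetic. The sketch below is illustrative only. It assumes wall draw stays near the `~3400 W` quoted for the 1566 later in this guide even with one board dark, taking "power bill unchanged" at face value. The function name and constants are hypothetical.

```python
NAMEPLATE_THS = 185.0  # stock 1566 bin, from the text
WALL_WATTS = 3400.0    # assumption: draw does not drop when a board goes dark
BOARDS = 3

def expected(live_boards):
    """Return (hashrate in TH/s, efficiency in J/TH) for a given live-board count."""
    ths = NAMEPLATE_THS * live_boards / BOARDS
    j_per_th = WALL_WATTS / ths if ths else float("inf")
    return ths, j_per_th

for n in (3, 2):
    ths, jth = expected(n)
    print(f"{n}/3 boards: ~{ths:.0f} TH/s, ~{jth:.1f} J/TH")
```

With two of three boards live this lands at roughly 123 TH/s and inside the `~25 – 30 J/TH` zone described above.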

Step-by-Step Fix

1

Hard-power-cycle at the PDU for a full `60 seconds`. Not a Web UI reboot — a true AC disconnect at the breaker or PDU. Wait the full minute for PSU bulk caps to discharge and MM state to flush. Power back up, watch the dashboard for `3/3` to appear. This is the cheapest diagnostic on any Avalon chassis and recovers a non-trivial fraction of `CHAIN_FAIL` tickets caused by wedged MM state after a brownout, network hiccup, or power flicker. While waiting, note any red LEDs, unusual PSU fan behaviour, or burnt smell — those observations steer the rest of the triage and matter ten times more on a `~3400 W` 1566 than on a smaller 1246.

2

Pull `estats` from the cgminer API before you touch the hardware. The API is JSON over a raw TCP socket, not HTTP, so use netcat rather than a plain `curl` POST: `echo '{"command":"estats"}' | nc <miner-ip> 4028`. Read `SYSTEMSTATU`, `MW0/MW1/MW2`, `ECHU`, and especially `PS[0..2]`. If `PS[0]` has bits `512`, `1024`, or `2048` set, you have a PSU output-channel trip and the board is exonerated: chase the PSU, not the hashboard. If the dark chain's `MW` array is empty and `ECHU` is unreadable, you have a confirmed `CHAIN_FAIL` at the IIC handshake level. This five-second API query saves an hour of screwdriver work on almost every 1566 ticket and is the same API surface the A13 / A14 / A15 family shares.
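For fleet use, the same query can be scripted. This is a minimal sketch, not a reference client. It assumes the reply is valid JSON terminated by a NUL byte, that the per-module status text sits under keys beginning with `MM ID`, and that fields appear as `NAME[values]` with the trip bitmap as the first `PS` value. All helper names are hypothetical, and the bit names come from the symptom list above.

```python
import json
import re
import socket

# Overcurrent bit names as listed in the Symptoms section.
PS_OVERCURRENT_BITS = {512: "OC_IOSA", 1024: "OC_IOSB", 2048: "OC_IOSC"}

def query_cgminer(host, command="estats", port=4028, timeout=5.0):
    """Send one JSON command over a raw TCP socket and return the parsed reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(json.dumps({"command": command}).encode())
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the miner closes the connection after replying
                break
            chunks.append(data)
    raw = b"".join(chunks).rstrip(b"\x00").decode(errors="replace")
    return json.loads(raw)

def mm_strings(reply):
    """Yield status strings stored under keys starting with 'MM ID' (assumed layout)."""
    for entry in reply.get("STATS", []):
        for key, value in entry.items():
            if key.startswith("MM ID") and isinstance(value, str):
                yield value

def bracket_fields(mm_text):
    """Parse 'NAME[values]' pairs from a status string into a dict of token lists."""
    return {k: v.split() for k, v in re.findall(r"(\w+)\[([^\]]*)\]", mm_text)}

def ps_trips(fields):
    """Return the names of overcurrent bits set in the first PS value."""
    ps = fields.get("PS")
    if not ps:
        return []
    status = int(ps[0])
    return [name for bit, name in PS_OVERCURRENT_BITS.items() if status & bit]
```

A typical call would be `for text in mm_strings(query_cgminer("<miner-ip>")): print(ps_trips(bracket_fields(text)))`. If a firmware revision returns malformed JSON, fall back to reading the raw text.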

3

Check the MM firmware log. SSH in and grep `/tmp/log/log` for `chain`, `iic`, `EEPROM`, `PS`, `MMCRCFAILED`, `CODE_MMCRCFAILED`. The log tells you which of the six failure buckets is active before you open the chassis — IIC nack means dark board, EEPROM fail means corrupt identity, `PS` bitmap means PSU trip, `MMCRCFAILED` means marginal bus. Let the log route the triage; on a 1566, log-first diagnostics are free while opening the chassis burns a full service slot.
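The log-first routing can be sketched as a small classifier run against a copy of `/tmp/log/log` pulled off the miner. The patterns are the strings quoted in this guide; the bucket names and the function are hypothetical, and real log lines may be worded differently on your firmware revision.

```python
import re
from collections import Counter

# (pattern, bucket, likely meaning) using the log strings quoted in this guide.
BUCKETS = [
    (re.compile(r"iic: no ACK", re.I), "iic_nack", "dark board"),
    (re.compile(r"EEPROM read fail", re.I), "eeprom", "corrupt identity"),
    (re.compile(r"MMCRCFAILED"), "crc", "marginal bus"),
    (re.compile(r"chain \S+ init fail", re.I), "chain_init", "chain failed to initialise"),
]

def classify(lines):
    """Count log lines per failure bucket; each line lands in the first bucket it matches."""
    counts = Counter()
    for line in lines:
        for pattern, bucket, _meaning in BUCKETS:
            if pattern.search(line):
                counts[bucket] += 1
                break
    return counts
```

Usage: `classify(open("log_copy.txt", errors="replace"))`, where `log_copy.txt` is a local copy of the MM log. The bucket with the highest count is the one to chase first.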

4

Confirm recent service and firmware-change history. A `CHAIN_FAIL` three days after a paste refresh is almost always a reversed power-sequence (Tier 3 damage to `U1/U2/R8/R9`) or a loose ribbon (Tier 2 reseat). A `CHAIN_FAIL` right after a firmware flash is an MM image corruption. A `CHAIN_FAIL` out of the blue after months of stable mining on a 1566 is most likely a PSU output-channel event, a connector oxidation, or an age-related PMIC failure. Service history is the single highest-value context you can give a remote diagnostic or a bench tech.

5

Confirm intake ambient. Per Canaan's published A15 operating envelope, intake air should be at or below `~35°C`. Some 1566 MM firmware revisions silently mask detection failures as thermal foldback, so a hot intake can look like a dead board on the dashboard. Measure at the intake grille with an IR thermometer, not at room-middle. In a Canadian basement in winter this isn't usually the fault; in a summer garage, or in a hot aisle where adjacent 1566s feed their exhaust into each other's intakes, it absolutely can be.

6

Hard power-down and reseat the signal ribbon + `12V` lugs on the missing chain. Kill AC at the PDU. Wait `5 minutes` for electrolytics to drain (the 1566's bulk caps are larger than earlier A-series; give them the time). Open the top cover. Disconnect the signal ribbon on the dark chain; inspect both connector faces under bright light for oxidation, green-oxide, bent / recessed pins. IPA-swab both sides (`99%` IPA + lint-free). Reseat until you hear the click. Loosen, clean, and re-torque the `12V` DC copper lugs (community torque `1.2 – 1.5 Nm`; Canaan publishes no spec). Close up, power on, re-pull `estats` to confirm `3/3`.

7

DMM-verify `12V` at the hashboard input lug under boot load. Disconnect the `12V` lug from the PSU side. Probe DC with the hashboard disconnected (no load): expect `12.0 – 12.6V` open-circuit from the PSU. Reconnect firmly, power up, and re-probe at the lug during the boot window (first `30 seconds`): expect `≥11.8V` sustained into load. A 1566 pulls substantially more per-channel current than a 1246, so a PSU that reads `12.3V` open and collapses to `10.5V` under 1566 load is tired *for the 1566* even if it was fine for a smaller chassis last year.
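The two DMM readings reduce to a simple pass/fail rule. A minimal sketch using the thresholds from this step; the function name and result strings are hypothetical.

```python
OPEN_CIRCUIT_RANGE_V = (12.0, 12.6)  # expected unloaded PSU output
MIN_LOADED_V = 11.8                  # minimum sustained voltage during boot

def assess_12v(open_circuit_v, loaded_v):
    """Classify a PSU channel from its unloaded and boot-load DMM readings."""
    low, high = OPEN_CIRCUIT_RANGE_V
    if not low <= open_circuit_v <= high:
        return "open-circuit voltage out of range"
    if loaded_v < MIN_LOADED_V:
        return "sags under load"
    return "ok"

print(assess_12v(12.3, 10.5))  # the tired-PSU example from the text
```

The example call uses the `12.3V` open / `10.5V` loaded case from the step above and reports a sag.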

8

Swap to a known-good 1566-compatible Canaan PSU. Use a confirmed-working PSU from the `A1326 / A1346 / A1366 / A1446 / A1466 / A15xx` family (shared PSU part across A13/A14/A15). Power up, observe `3/3` on the dashboard within the first `90 seconds`. Do NOT cross-connect a Bitmain `APW`-series PSU; Canaan and Bitmain pinouts differ and cross-brand PSU is a documented failure mode. If `PS[0]` bits `512` / `1024` / `2048` clear on the swap, the old PSU had an output-channel trip and needs replacement.

9

Swap the suspect board into a known-good slot. Power down. Swap the dark hashboard with one of the known-good boards (swap slots `1` and `3`, keep orientation). Power up and read `estats`. Fault follows the board = the board itself is bad (ship to D-Central or bench repair). Fault stays with the slot = MM control board, AUC3 bridge, or PSU output channel for that slot is the fault. This swap isolates *where* the fault lives in under five minutes and is the highest-value Tier-2 diagnostic on any A-series chassis, including the 1566.
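The swap logic is easy to get backwards when you are tired, so here it is spelled out. This is an illustrative sketch; the function name and return strings are hypothetical, and slots are numbered `1` to `3` as in the step above.

```python
def locate_fault(dark_before, dark_after, swapped=(1, 3)):
    """Interpret a board swap between two slots.

    dark_before / dark_after: the slot reporting no chain before and after
    the swap, or None if all three chains enumerate.
    """
    a, b = swapped
    if dark_before not in swapped:
        return "inconclusive: the dark slot was not part of the swap"
    other = b if dark_before == a else a
    if dark_after == other:
        return "board"    # fault followed the hashboard to its new slot
    if dark_after == dark_before:
        return "slot"     # MM control board, AUC3 bridge, or PSU channel for that slot
    if dark_after is None:
        return "cleared"  # the swap reseated the connectors; watch for recurrence
    return "inconclusive"
```

For example, if slot `1` was dark and slot `3` is dark after swapping boards `1` and `3`, `locate_fault(1, 3)` returns `"board"`.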

10

Re-seat every ribbon, harness, and AUC3 USB connection in the chassis. Kill AC, disconnect every control-board-to-hashboard ribbon, every PSU output harness, and the AUC3 USB / ribbon connection. Inspect each connector pin-by-pin under bright light. IPA-clean dirty contacts. Re-seat firmly. Cable-tie the AUC3 ribbon and PSU harness to the chassis frame so chassis-fan vibration cannot work connectors loose — this matters more on 1566s than on earlier A-series because A15 fans spin at higher RPM and the vibration energy is higher.

11

Re-flash the current factory MM firmware image via the AUC3 Web UI. If the log scan in Step 3 showed `EEPROM read fail` without IIC nacks, or if the chassis had a failed firmware flash recently, re-flash the correct current MM image following the D-Central Avalon firmware upgrade guide. VERIFY the image is for A1566 — cross-flashing a 1466 or 1366 image bricks the control board because Canaan signature-checks per-model. Rollback is blocked. Do not interrupt the flash; a 1566 MM control board lost mid-flash becomes a bench job.

12

Capture the MM control-board serial log at boot via USB-TTL. Connect a USB-TTL serial adapter (`FT232` / `CH340` / `CP2102`) to the control board's UART header (location varies by MM revision — check silkscreen). Capture the full boot log at the documented baud rate. Hashboard-handshake attempts, IIC init messages, `MMCRCFAILED` errors, and per-chain enumeration outcomes all surface here. This is the highest-value diagnostic when the Web UI says `2/3` but the cgminer API and the MM log can't agree on which chain — the serial log is ground truth.

13

Tune `--avalon7-aucspeed` and `--avalon7-aucxdelay`. Defaults per cgminer source are `aucspeed 400000` and `aucxdelay 19200`. If serial logs show intermittent `MMCRCFAILED`, drop `aucspeed` to `300000` (or `200000` if severely marginal) and bump `aucxdelay` to `24000` (or `38400`). The `avalon7` flag prefix still applies to A15 silicon — Canaan never renamed the upstream cgminer flags. This tuning is undocumented in Canaan's own materials and lives only in the cgminer `ASIC-README` — pure community lore.
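One way to keep the tuning disciplined is to treat the values above as a three-rung ladder and step down only when `MMCRCFAILED` persists. The sketch below just builds the flag list. Pairing each speed with a specific delay is an assumption, and where the flags are actually applied on a 1566 depends on how your firmware launches cgminer.

```python
# (aucspeed, aucxdelay): defaults first, then the two fallbacks from the text.
AUC_LADDER = [(400000, 19200), (300000, 24000), (200000, 38400)]

def auc_flags(step):
    """Return cgminer arguments for the given rung (0 = defaults)."""
    speed, xdelay = AUC_LADDER[step]
    return ["--avalon7-aucspeed", str(speed), "--avalon7-aucxdelay", str(xdelay)]

print(" ".join(auc_flags(1)))
# --avalon7-aucspeed 300000 --avalon7-aucxdelay 24000
```

Change one rung at a time and re-check the serial log before going lower.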

14

Measure hashboard domain and core rails with DMM on the dark chain. Probe at the silkscreened test points during the boot window. Domain rail in the `~1200 – 1350 mV` band, core rail in the `~290 – 360 mV` band (confirm against the silkscreen on your specific 1566 revision). If `12V` is present at the input but the domain or core rail reads `0 mV`, the on-board PMIC / buck converter is dead or a fuse / bulk cap has failed short. This diagnosis sends the board to Tier 3 rework or D-Central ship.
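The rail readings map onto a short decision rule. A minimal sketch using the bands quoted in this step and the `11.8V` loaded floor from the earlier DMM step. The `10 mV` dead-rail threshold and the function name are hypothetical, and the bands should be confirmed against your board's silkscreen.

```python
DOMAIN_MV = (1200, 1350)  # domain rail band from the text
CORE_MV = (290, 360)      # core rail band from the text
MIN_INPUT_V = 11.8        # loaded 12V floor from the earlier DMM step
DEAD_MV = 10              # assumption: readings below this count as 0 mV

def diagnose_rails(input_v, domain_mv, core_mv):
    """Classify a dark chain from its 12V input and rail measurements."""
    if input_v < MIN_INPUT_V:
        return "12V input low: revisit PSU and lug checks first"
    if domain_mv < DEAD_MV or core_mv < DEAD_MV:
        return "rail dead with 12V present: suspect PMIC / buck, fuse, or bulk cap"
    domain_ok = DOMAIN_MV[0] <= domain_mv <= DOMAIN_MV[1]
    core_ok = CORE_MV[0] <= core_mv <= CORE_MV[1]
    if domain_ok and core_ok:
        return "rails in band"
    return "rail out of band: confirm against board silkscreen"
```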

15

Inspect `U1`, `U2`, `R8`, `R9` on the dark hashboard with DMM in diode-check mode. The Zeus Mining A11/A12 repair guide documents the topology that carries forward from A11/A12 boards to A13 / A14 / A15. `U1/U2` are the primary power-sequence MOSFETs; `R8/R9` are the companion sense / gate resistors. A burnt `U1` or `U2` from a reversed install sequence reads dead-short or open-circuit in diode mode. If identified, replace the exact silkscreened part — wrong `V_DS` or `R_DS(on)` substitutes fail immediately. Requires preheater + hot-air rework + SMD skills.

16

Swap donor AUC3 from a 1346 / 1366 / 1446 / 1466 / 1566 chassis. If fault stays with the slot across board-swap (Step 9) and re-seating doesn't help, the AUC3 bridge or MM control board is suspect. The A13 / A14 / A15 family shares AUC3 part identity. A parts-donor AUC3 from any of those chassis is a valid swap-out for isolation. Chain enumerates on donor AUC3 = your AUC3 is dead. Still dark = MM control board itself (Tier 4).

17

Scope the `12V` rail at the hashboard input during boot. A 50-MHz handheld oscilloscope captures the rail-up ramp during MM power sequence. Healthy: clean step from `0V` to `12V` in tens of milliseconds, flat at `12V` thereafter with `<200 mV` ripple under load. Damaged PSU or cable: oscillation, overshoot, or incomplete ramp. Damaged per-board power sequence: rail comes up and immediately collapses as `U1/U2` fail to latch. The scope capture distinguishes PSU faults from per-board faults more reliably than DMM averaging on a 1566.
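If your scope exports the capture as a list of voltage samples, the three signatures above can be separated with a few comparisons. This is a rough sketch, not a calibrated test. The 95% "ramp reached" and 50% "collapsed" thresholds are hypothetical choices, and only the `200 mV` ripple limit comes from the text.

```python
def classify_ramp(samples_v, nominal=12.0, ripple_limit_mv=200.0):
    """Classify a 12V rail capture given voltage samples in time order."""
    # Index of the first sample that reaches (almost) the nominal rail voltage.
    reached = [i for i, v in enumerate(samples_v) if v >= 0.95 * nominal]
    if not reached:
        return "incomplete ramp"
    settled = samples_v[reached[0]:]
    if min(settled) < 0.5 * nominal:
        return "rail collapsed after ramp"
    ripple_mv = (max(settled) - min(settled)) * 1000.0
    if ripple_mv >= ripple_limit_mv:
        return "excess ripple or overshoot"
    return "healthy"
```

A collapse after a successful ramp points at the per-board power sequence, while an incomplete ramp or heavy ripple points upstream at the PSU or cabling, matching the distinction drawn above.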

18

Re-flow suspect components on a visibly-damaged hashboard. If Step 15 identified a specific bad part and you have preheater + hot-air + SMD skills, replace `U1`, `U2`, `R8`, or `R9` on the affected board. Pre-heat bottom side to `~150°C`, top-side hot air at `~320°C`, desolder the failed part, clean pads with braid and flux, place replacement, reflow down. Re-check with DMM diode mode before re-installing. Use the exact silkscreen P/N: wrong parts fail on first power-on, and on a 1566 a failure that cascades into the A15 silicon is a much more expensive mistake than on a 1246.

19

Stop DIY when multiple hashboards in the same 1566 chassis show identical `U1/U2/R8/R9` damage or identical PMIC signatures. Two or three boards with the same burnt power-sequence parts means a feeder event (surge, brownout recovery, lightning-induced transient) simultaneously damaged all of them. That is a full-chassis bench rebuild: multi-board component-level repair, PSU replacement, control-board surge-input-MOSFET inspection, and a full nameplate burn-in at `~185 TH/s`. Shotgun-replacing at home on a 1566 rarely lands — D-Central's bench isolates each board on a programmable load, verifies each chip's `PVT_V` and `PVT_T` against a known-good A15 baseline, and only returns boards that pass burn-in.

20

Stop DIY when the MM control board / AUC3 is suspected dead. A control board that shows zero LED activity at power-up with a known-good PSU, or persistent `MMCRCFAILED` after AUC3 donor-swap, requires a bench fixture to localize. The `MM3v2`-lineage control board has a `12V → 3.3V` chain with multiple stages, any of which can fail on a surge event. D-Central carries parts-graded control boards and AUC3 daughterboards from the A13 / A14 / A15 family for swap-out and cross-references repair-queue logs across the 1346 / 1366 / 1446 / 1466 / 1566 fleet.

21

Ship the miner properly. Pack the chassis in its original Canaan box if you still have it, or double-box with `5cm+` foam on every side. Include the PSU — a tired PSU that mostly works on your bench may be the root cause of what looks like 'hashboard dead' on a 1566 specifically because A15 per-channel draw is higher, and we can only diagnose the fault if we can reproduce it. Include a note with: serial number, observed symptoms (screenshots of `2/3` dashboard, captured `PS[0..2]` / `ECMM` / `MMCRCFAILED` lines from the MM log, which slot is dark), service history (when opened, what was done, firmware flashes in last 30 days), and contact info. Saves bench diagnostic time and saves you money.

22

Discuss repair-vs-replace up front. A full 1566 bench rebuild (three-board component-level inspection + PSU + control board + burn-in at `~185 TH/s`) runs `CAD $900 – $1,800` depending on what we find. A used 1566 on the secondary market in late 2025 / early 2026 runs roughly `CAD $3,000 – $5,500+` depending on bin and condition. The repair math almost always favours repair on a 1566 right now, especially below `~$0.10/kWh` — the 1566 is current-gen A15 silicon and not yet amortized in most fleets. D-Central quotes honestly before committing bench hours — we'll tell you if the math doesn't work.

When to Seek Professional Repair

If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.

Still Having Issues?

Our team of Bitcoin Mining Hackers has been repairing ASIC miners since 2016. We have seen it all and fixed it all. Get a professional diagnosis.