Avalon 1566 – Dead Chip Count Exceeded
Warning — Should be addressed soon
Symptoms
- `cgminer-api` `stats` reports `ASIC count` below nameplate on at least one of the three hashboards (e.g. `MM count: 142/156, board disabled`)
- Realized hashrate has dropped roughly in proportion to the missing chips (each board normally contributes `~62 TH/s` of the rig's `185 TH/s`)
- Boot log shows non-zero `ECHU` on a specific board, or controller reports `ASIC_COUNT_LOW` and skips powering that hashboard
- Dashboard shows two boards green/active and one greyed out / marked `disabled`, even though the board is physically installed
- After a power cycle the missing-chip count is stable, not flickering — the chips are physically gone, not transiently glitching
- `MW0` array contains `0x0` entries clustered on adjacent chip indices (a contiguous cold zone in the chain)
- Thermal camera or IR thermometer shows cold spots at near-ambient on the running board while neighbours run `60-85 °C` (healthy A1566 hashboard temps measured `68-74 °C` at `28 °C` ambient)
- `PVT_T` per-group temperature reads abnormally low or returns `-` / `--` for the affected group
- Per-group voltage rail off-spec — one group's regulator output measures `0 V` or far from neighbouring groups on the same board
- Rest of the miner is healthy — no fan errors, no `PSU3500-01` faults, no over-temperature; the issue is isolated to chip count on a single board
- Chip count drift accelerated after a known event: overclock push, thermal excursion, `PSU3500-01` swap with bad pin contact on the new `PA45` connector, power surge, or a prior repaste/rework
- If the board was previously third-party repaired: pattern of losing `1` chip every `4-6` weeks after the repair (grey-market A15 chip provenance)
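The proportional-loss symptom above can be checked with simple arithmetic. The sketch below assumes hashrate scales linearly with live chips and borrows the `142/156` chip count from the example in the first symptom. The function name is hypothetical, and a board that the controller has disabled outright contributes nothing regardless of this estimate.

```python
def expected_board_ths(alive_chips: int, nameplate_chips: int,
                       board_ths: float = 62.0) -> float:
    """Estimate board hashrate, assuming output scales linearly with live chips."""
    if nameplate_chips <= 0 or not 0 <= alive_chips <= nameplate_chips:
        raise ValueError("chip counts out of range")
    return board_ths * alive_chips / nameplate_chips


# Example using the 142/156 figure quoted in the symptoms (illustrative only).
estimate = expected_board_ths(142, 156)
print(f"expected ~{estimate:.1f} TH/s, shortfall ~{62.0 - estimate:.1f} TH/s")
```

If the measured shortfall is much larger than this estimate, the loss is probably not explained by chip count alone.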
Step-by-Step Fix
Hard power-cycle the miner — `60 seconds` off at the breaker, not a soft reboot. Some MM firmware builds latch a board-disable state across soft reboots but clear it on a cold boot. If the board returns at nameplate count, you had a stuck firmware state, not a dead chip. Re-monitor `24 hours` to confirm it stays healthy before declaring victory.
Re-seat the AUC3 ribbon and the new `4-pin PA45/P14` power loom on the affected hashboard. Power off at the breaker, unplug, remove the chassis lid, then disconnect and reconnect the AUC3 ribbon between the control board and the suspect hashboard. Inspect for bent pins, oxidation, or blackened contacts. The `PA45` is new for the A15 generation and seats slightly differently than the older Avalon power plug; listen for the click on every connector and do not force it.
Read `ASIC count` directly from `cgminer-api` rather than trusting the web dashboard. From any LAN machine: `echo -n '{"command":"stats"}' | nc <miner-ip> 4028`. Parse the `MM ID0/1/2` blocks for `ASIC count`. This is the ground truth; the dashboard can lag, cache, or misreport during a board-disable transition.
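For repeated checks, the same query can be scripted. This is a minimal sketch rather than a reference client: it assumes the API answers a JSON `stats` command on port `4028` as in the `nc` one-liner, that the reply may end with a NUL byte, and that each `MM ID` entry is a string of `Key[value]` fields. The field layout and the helper names `query_stats` and `parse_mm_block` are assumptions to verify against your own miner's output.

```python
import json
import re
import socket


def query_stats(host: str, port: int = 4028, timeout: float = 5.0) -> dict:
    """Send {"command":"stats"} to the miner and return the decoded JSON reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(json.dumps({"command": "stats"}).encode())
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # miner closed the connection: reply is complete
                break
            chunks.append(data)
    # Strip a trailing NUL terminator, if present, before decoding.
    return json.loads(b"".join(chunks).rstrip(b"\x00").decode())


# Assumed layout: space-separated "Key[value]" fields inside one string.
FIELD = re.compile(r"([A-Za-z0-9_][A-Za-z0-9_ ]*?)\[([^\]]*)\]")


def parse_mm_block(block: str) -> dict:
    """Split an assumed 'Key[value] Key[value]' string into a dict of strings."""
    return {key.strip(): value for key, value in FIELD.findall(block)}
```

A typical use would be to call `query_stats("<miner-ip>")`, locate the `MM ID0/1/2` strings in the reply, and pass each one to `parse_mm_block` to read its `ASIC count` field.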
Verify intake ambient at the front grille with an IR thermometer. Target `≤ 30 °C` for normal A1566 operation; `35 °C` is the hard ceiling. Anything above that pushes hashboard temperatures past the healthy `68-74 °C` band that Hashrate Index measured and accelerates A15 silicon mortality across the entire fleet, not just on this one board. Cleaning the filters and correcting ambient temperature is the cheapest possible intervention.
Check the Canaan firmware portal (`avalonminer.org/firmware-document/`) for the current MM image for your A1566 hardware revision (control board `MM_V1_2_20230609` family). Some MM builds shipped with overly aggressive `0x0` nonce reporting that masquerades as chip mortality. Roll one MM version back or forward only after confirming the build matches your hardware revision; a mismatched MM image bricks the controller.
Multimeter the per-group voltage rails on the affected board under full hashing load. Probe each group's regulator output test point. Healthy: all groups within `±5%` of each other on the same board. If one group reads `0 V` or grossly off-spec while neighbours are clean, the PMIC for that group is dead, not the chips, and the dead-chip count likely returns to zero once the regulator is replaced. Also measure resistance to ground on the affected rail with the miner powered off; a dead short means a downstream cap or chip is shorted, and you stop here.
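Once the readings are written down, the `±5%` comparison can be done mechanically. The sketch below interprets "within `±5%` of each other" as "within 5% of the board's median reading", which is one reasonable reading of the rule and not a Canaan specification. The sample voltages are invented for illustration.

```python
from statistics import median


def off_spec_groups(rail_volts: list[float], tolerance: float = 0.05) -> list[int]:
    """Return indices of groups deviating more than `tolerance` from the board median."""
    reference = median(rail_volts)
    if reference <= 0:
        raise ValueError("median rail voltage must be positive")
    return [i for i, volts in enumerate(rail_volts)
            if abs(volts - reference) / reference > tolerance]


# Hypothetical readings: group 2 has collapsed to 0 V, the others agree.
print(off_spec_groups([1.20, 1.21, 0.0, 1.19]))
```

Using the median rather than the mean keeps a single dead group from dragging the reference value down and hiding itself.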
Thermal-camera the running board after `5 minutes` of hashing. Boot at stock frequency, lift the chassis lid, and image the hashboard from above with a `FLIR ONE Pro` or equivalent. Dead chips appear as cold spots at ambient while neighbours run at `60-85 °C`. Photograph the heatmap and overlay it on a board layout to identify the failed positions; this disambiguates chip death from thermal-pad failure in `30 seconds`.
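If you transcribe per-chip temperatures from the thermal image, the cold-spot rule can be applied as a quick filter. This is a sketch under assumptions: the `10 °C` margin above ambient is an arbitrary illustrative threshold, the `60 °C` floor comes from the `60-85 °C` range quoted above, and the per-chip list is something you build by hand from the heatmap.

```python
from statistics import median


def cold_spots(chip_temps_c: list[float], ambient_c: float,
               margin_c: float = 10.0, hot_floor_c: float = 60.0) -> list[int]:
    """Return indices of chips sitting near ambient on an otherwise hot board."""
    if median(chip_temps_c) < hot_floor_c:
        # Board as a whole is not at hashing temperature, so the image is inconclusive.
        return []
    return [i for i, temp in enumerate(chip_temps_c)
            if temp <= ambient_c + margin_c]


# Hypothetical readings at 28 °C ambient: positions 2 and 3 are cold.
print(cold_spots([70, 72, 29, 30, 71, 69], ambient_c=28))
```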
Re-paste the entire board. Power off, cool `30 minutes`, remove the dual front-and-back heatsinks (the A1566 uses bolted heatsinks on both faces — back out every fastener evenly, lift straight, do not pry). Clean every chip with `99% IPA` and lint-free wipes. Apply `Arctic MX-6` or `Thermal Grizzly Kryonaut` in a thin uniform layer to every chip on both faces. Replace any failed thermal pads on PMICs with `1.0 mm` `5-6 W/mK` pads. Reassemble with calibrated screw torque.
Verify `PSU3500-01` rail voltage under load at the hashboard input. Probe with a multimeter at the `4-pin PA45/P14` power harness pins on the affected board while the miner is hashing at full power. Expect output sustained in the `11.5-15 V DC` window per Canaan's published spec. Sag below the lower bound stresses the chips and is one of the most common causes of multi-board chip-count drift; swap in a known-good `PSU3500-01` and re-baseline.
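Because sag is a minimum-value problem, it helps to log several readings under load and check the extremes, not the average. A minimal sketch, using the `11.5-15 V DC` window quoted above; the sample readings and the function name are invented for illustration.

```python
def rail_in_window(samples_v: list[float], low: float = 11.5,
                   high: float = 15.0) -> tuple[bool, float, float]:
    """Return (ok, lowest, highest) for a series of rail readings under load."""
    lowest, highest = min(samples_v), max(samples_v)
    return (low <= lowest and highest <= high), lowest, highest


# Hypothetical readings taken a few seconds apart while hashing.
ok, lowest, highest = rail_in_window([12.1, 12.0, 11.2])
print("within spec" if ok else f"out of window: min {lowest} V, max {highest} V")
```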
Run for `24 hours` after Tier 1-2 interventions and re-query `cgminer-api` `stats`. Compare against your baseline. Stable count above the disable threshold = bleeding stopped. Still dropping = silicon mortality in progress; escalate to Tier 3 or 4.
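The baseline comparison can be written down as a small rule. In this sketch the disable threshold is a caller-supplied parameter because the text does not state its value. The board labels and counts are illustrative, and the third verdict covers a case the rule above leaves open.

```python
def assess_after_soak(baseline: dict[str, int], current: dict[str, int],
                      disable_threshold: int) -> dict[str, str]:
    """Compare per-board chip counts after the 24-hour run against the baseline."""
    verdicts = {}
    for board, before in baseline.items():
        now = current.get(board, 0)  # a board missing from the reply counts as zero
        if now < before:
            verdicts[board] = "still dropping"
        elif now > disable_threshold:
            verdicts[board] = "stable"
        else:
            verdicts[board] = "stable but not above disable threshold"
    return verdicts


# Hypothetical counts; 140 is a placeholder threshold, not a published figure.
print(assess_after_soak({"MM ID0": 156, "MM ID1": 150},
                        {"MM ID0": 156, "MM ID1": 147},
                        disable_threshold=140))
```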
Decode the `MW0`/`PVT_T` per-chip map for surgical targeting. Pull `MW0` via `cgminer-api`, identify indices reading `0x0`, cross-reference with board silkscreen positions. Confirm with the thermal photo from Step 7. You should now have a list of `1-12` specific chip positions to rework — mark them with a fine-tip silver Sharpie on the board edge, not on the chips themselves.
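Finding the `0x0` indices and grouping adjacent ones into cold zones is easy to script. The sketch below assumes the `MW0` values arrive as whitespace-separated tokens, either decimal or `0x`-prefixed hex; check that against your miner's actual output. The function names are hypothetical.

```python
def parse_mw0(text: str) -> list[int]:
    """Convert whitespace-separated MW0 tokens (decimal or 0x-hex) to integers."""
    return [int(tok, 16) if tok.lower().startswith("0x") else int(tok)
            for tok in text.split()]


def zero_runs(mw0: list[int]) -> list[tuple[int, int]]:
    """Return inclusive (start, end) index ranges of consecutive zero entries."""
    runs = []
    start = None
    for i, value in enumerate(mw0):
        if value == 0:
            if start is None:
                start = i  # a new run of dead positions begins here
        elif start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:  # run extends to the end of the chain
        runs.append((start, len(mw0) - 1))
    return runs


# Illustrative map: one contiguous cold zone plus two isolated dead positions.
print(zero_runs(parse_mw0("5 0x0 0x0 0x0 7 0 9 0")))
```

A long contiguous run points more toward a shared cause such as a group regulator, while isolated single positions are the ones worth marking for chip-level rework.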
Source replacement A15-generation chips of known provenance. Use new-old-stock or D-Central salvaged-grade chips from a healthy parent board — avoid grey-market lots, which historically show accelerated mortality. Order `1.5×` your dead-chip count to allow for rework loss. Note: as of mid-2026 the A15 replacement-chip ecosystem is materially less mature than the A3200CFA market for A1366; pricing and availability vary.
Pre-heat the board, remove the dead chip. Bottom-side preheat platform `~150 °C`. Apply flux around the target chip's BGA. Top-side hot air at `300-330 °C` for `~30-45 seconds`. Lift the chip with vacuum tweezers — it should release cleanly. If it fights, wait `5 more seconds`; do not force it. Bottom-side preheat is non-negotiable on A1566 boards — top-only hot air on a `185 TH/s`-class hashboard is how chips and pads get destroyed.
Clean the pad, re-tin, place the new chip. Wick residual solder with copper braid and clean the pads with `99% IPA`. Apply fresh solder paste or pre-tin the pads. Align the new A15 chip to the silkscreen orientation marker (the asymmetric corner; with the wrong orientation the chip is dead on first power-up). Lower it into place. Top-side hot air at `310-330 °C` for `30 s`. Steady the chip lightly with tweezers but do not press down, or you will squeeze solder out from under the BGA balls.
Cool naturally on an antistatic mat for `5+ minutes`; never blow cold air on a hot board (thermal shock cracks adjacent joints). Clean flux with IPA. Re-paste all chips on the board and refit the heatsinks (you've disturbed both faces anyway). Bench-test on a universal Avalon test fixture if available, or reinstall and re-query `cgminer-api` `stats`. Confirm `ASIC count` recovered to nameplate and previously-dead positions now report non-`0x0` `MW0` values. Run `24 hours` at stock before declaring success.
Stop DIY when: `>3` chips dead on one board; PMIC / voltage regulator suspected; you lack hot-air rework experience on `0.4 mm`-pitch BGA; a previous DIY rework left the board worse; capacitor bulging or board discoloration is visible; the `PA45/P14` connector itself shows heat damage. D-Central runs Tier 4 chip-level rework on Avalon hashboards — book a slot and bring photos of the `MW0` map and the thermal image (saves bench time, saves you money).
D-Central bench process for A1566: universal Avalon test-fixture bench-up, full per-chip enumeration via `cgminer` debug build, voltage-rail integrity across every group, chip-level rework with new-old-stock or graded A15-family chips of known provenance, full re-paste with high-end paste on both heatsink faces, `24-hour` post-repair burn-in at nameplate `~62 TH/s` per board / `185 TH/s` per miner. Boards with `>12` dead chips quoted as scrap-for-parts; boards with `1-3` dead return as-new.
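The chip-count thresholds from the last two steps can be collapsed into one triage rule. This sketch encodes only the dead-chip count. The other stop conditions (suspected PMIC, visible damage, lack of BGA rework experience) still override it, and the `4-12` band is labelled generically because the text does not describe its outcome.

```python
def triage(dead_chips: int) -> str:
    """Map a per-board dead-chip count to the repair path described above."""
    if dead_chips < 0:
        raise ValueError("dead_chips cannot be negative")
    if dead_chips == 0:
        return "no chip rework needed"
    if dead_chips <= 3:
        return "DIY candidate if no other stop condition applies"
    if dead_chips <= 12:
        return "professional chip-level rework"
    return "quoted as scrap-for-parts"


for count in (2, 7, 15):
    print(count, "->", triage(count))
```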
Ship safely to Quebec. Anti-static bag the affected hashboard(s) — leave the rest of the miner intact unless instructed. Double-box with `≥ 5 cm` foam every side. Include a printed note: observed symptoms, MM firmware version, `cgminer-api` `stats` output, your `MW0` map, the `PSU3500-01` serial if you suspect PSU contribution, contact info. Canada-wide / US / international accepted; return turnaround `5-10 business days` typical.
When to Seek Professional Repair
If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.
