A1466_DEAD_CHIPS Warning

Avalon 1466 – Dead Chip Count Exceeded

Hashboard reports more than the configurable threshold (typically 5-10%, i.e. 6-12 of ~120 chips) of dead A3207 ASIC chips. Firmware auto-disables the affected hashboard on the next boot to protect the surviving chips and the voltage rails.
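The threshold arithmetic above can be sketched in a few lines. This is a minimal illustration, assuming 120 chips per board and a threshold expressed as a fraction; the function names and the strict "more than" comparison are assumptions for illustration, not documented firmware behaviour.

```python
CHIPS_PER_BOARD = 120  # approximate nameplate count quoted in the text


def dead_chip_limit(threshold: float, chips: int = CHIPS_PER_BOARD) -> int:
    """Convert a fractional threshold (e.g. 0.05) into a dead-chip count."""
    return round(threshold * chips)


def board_would_disable(alive: int, threshold: float,
                        chips: int = CHIPS_PER_BOARD) -> bool:
    """True when dead chips exceed the limit (assumed strict comparison)."""
    dead = chips - alive
    return dead > dead_chip_limit(threshold, chips)
```

With these assumptions, a board reporting 110 of 120 chips is past a 5% threshold (limit 6) but still inside a 10% threshold (limit 12).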

Warning — Should be addressed soon

Affected Models: Avalon Made A1466 (~150 TH/s, ~3230 W, 3rd-gen A3207 silicon, 3 hashboards x ~120 chips)

Symptoms

  • `cgminer-api` `stats` reports `ASIC count` below nameplate on at least one of the three hashboards (e.g. `MM count: 110/120, board disabled`)
  • Realized hashrate has dropped roughly proportional to missing chips (each `A3207` ~`0.42 TH/s` of the board's `~50 TH/s` share)
  • Boot log shows non-zero `ECHU` on a specific board, or controller reports `ASIC_COUNT_LOW` and skips powering that hashboard
  • Dashboard shows two boards green/active and one greyed out / marked `disabled`, even though the board is physically installed
  • After a power cycle the missing-chip count is stable, not flickering — the chips are physically gone, not transiently glitching
  • `MW0` array contains `0x0` entries clustered on adjacent chip indices (`A1`-`A3`, or `A19`-`A21`, etc.)
  • Thermal camera or IR thermometer shows cold spots at near-ambient on the running board while neighbours run `60-85 °C`
  • `PVT_T` per-group temperature reads abnormally low or returns `-` / `--` for the affected group
  • Group voltage rail off-spec: `VDDIO` (nominal ~`1.8 V`) or core (nominal ~`0.65-0.75 V`) measures out of range, or one rail reads `0 V`
  • Rest of the miner is healthy — no fan errors, no PSU faults, no over-temperature; the issue is isolated to chip count on a single board
  • Chip count drift accelerated after a known event: overclock push, thermal excursion, power surge, or a prior repaste/rework
  • If the board was previously repaired by a third party: a consistent pattern of losing about `1` chip per month after the repair (points to grey-market `A3207` provenance)
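To check whether a hashrate drop is "roughly proportional to missing chips", a quick estimate helps. The sketch below assumes the approximate figures quoted above (about 50 TH/s and 120 chips per board); the function names and the 5% tolerance are hypothetical choices, not values from Canaan.

```python
BOARD_TH = 50.0        # approximate per-board hashrate from the text
CHIPS_PER_BOARD = 120  # approximate nameplate chip count from the text


def expected_board_hashrate(alive_chips: int,
                            board_th: float = BOARD_TH,
                            chips: int = CHIPS_PER_BOARD) -> float:
    """Hashrate expected if every surviving chip contributes an equal share."""
    return alive_chips * (board_th / chips)


def loss_is_proportional(measured_th: float, alive_chips: int,
                         rel_tol: float = 0.05) -> bool:
    """True when measured hashrate is within rel_tol of the per-chip estimate."""
    expected = expected_board_hashrate(alive_chips)
    return abs(measured_th - expected) <= rel_tol * expected
```

If the measured figure is far below the estimate for the reported chip count, something other than the missing chips (voltage, thermals, firmware) is probably also costing hashrate.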

Step-by-Step Fix

1

Hard power-cycle the miner — `60 seconds` off at the breaker, not a soft reboot. Some MM firmware builds latch a board-disable state across soft reboots but clear it on a cold boot. If the board returns at nameplate `ASIC count`, you had a stuck firmware state, not a dead chip. Re-monitor `24 hours` to confirm it stays healthy before declaring victory.

2

Re-seat the AUC3 ribbon and the power loom on the affected hashboard. Power off at the breaker, unplug, remove the chassis lid, disconnect/reconnect the AUC3 ribbon between the control board and the suspect hashboard. Inspect for bent pins, oxidation, or blackened contacts. Repeat for the power harness; the `A1466` power loom carries serious current, and a bad contact there manifests as group-level dropouts that look exactly like a dead-chip pattern.

3

Read `ASIC count` directly from `cgminer-api` rather than trusting the web dashboard. From any LAN machine: `echo -n '{"command":"stats"}' | nc <miner-ip> 4028`. Parse the `MM ID0/1/2` blocks for `ASIC count`. This is the ground truth; the dashboard can lag, cache, or misreport. Save the JSON output for the bench tech if it gets that far.
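If `nc` is not available, the same query can be sent from Python. This is a minimal sketch, assuming the miner answers on TCP port `4028` with a JSON reply that may end in a trailing NUL byte; the function names are illustrative, and the layout of the `MM ID` blocks inside the reply is deliberately not parsed here because it varies by firmware.

```python
import json
import socket


def decode_reply(raw: bytes) -> dict:
    """Strip a trailing NUL terminator (if present) and parse the JSON reply."""
    text = raw.rstrip(b"\x00").decode("utf-8", errors="replace")
    return json.loads(text)


def query_stats(host: str, port: int = 4028, timeout: float = 5.0) -> dict:
    """Send {"command":"stats"} and read until the miner closes the socket."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(json.dumps({"command": "stats"}).encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return decode_reply(b"".join(chunks))


def save_reply(reply: dict, path: str) -> None:
    """Write the reply to disk so it can be handed to a bench tech."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(reply, fh, indent=2)
```

Typical use would be `save_reply(query_stats("<miner-ip>"), "stats.json")`, run before and after each intervention so you have a baseline to compare against.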

4

Verify intake ambient at the front grille with an IR thermometer. Target `≤ 35 °C` at the front of the miner; anything above pushes `Tj` toward the `>100 °C` mortality zone for `A3207` and accelerates chip death across the fleet. Cleaning the filters and correcting ambient temperature is the cheapest possible intervention, and it protects boards you haven't lost yet.

5

Check the Canaan firmware portal (`avalonminer.org/firmware-document/`) for the current MM image for your A1466 hardware revision. Some MM builds shipped with overly aggressive `0x0` nonce reporting that masquerades as chip mortality. Roll one MM version back or forward only after confirming the build matches your hardware revision: the wrong MM image bricks the controller, and Canaan offers no recovery path for a bricked controller outside warranty.

6

Multimeter the group voltage rails on the affected board under full hashing load. Probe each group's `VDDIO` (~`1.8 V`) and core (~`0.65-0.75 V`) test points. Healthy: all groups within `±5%` of nominal. If one group reads `0 V` or grossly off-spec while neighbours are clean, the PMIC for that group is dead, not the chips — the dead chip count likely returns to zero once the regulator is replaced. Tag the group with a silver Sharpie.
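The `±5%` check in this step is easy to get wrong by eye when you are logging many groups. The sketch below applies it to a list of readings; it assumes the nominal values quoted above, treats anything under `0.05 V` as a dead rail (an arbitrary cutoff chosen for illustration), and uses hypothetical function names.

```python
def rail_status(measured_v: float, nominal_low_v: float,
                nominal_high_v: float, tolerance: float = 0.05) -> str:
    """Classify one reading against a nominal window widened by the tolerance."""
    if measured_v < 0.05:  # assumed cutoff for "reads 0 V"
        return "dead"
    lower = nominal_low_v * (1 - tolerance)
    upper = nominal_high_v * (1 + tolerance)
    return "ok" if lower <= measured_v <= upper else "off-spec"


def suspect_groups(core_readings_v: list[float]) -> list[int]:
    """Indices of groups whose core rail is not within the 0.65-0.75 V window."""
    return [i for i, v in enumerate(core_readings_v)
            if rail_status(v, 0.65, 0.75) != "ok"]
```

For `VDDIO`, pass `1.8` as both the low and high nominal. A single flagged group with clean neighbours matches the dead-regulator pattern described in this step.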

7

Thermal-camera the running board after `5 minutes` of hashing. Boot at stock frequency, lift the chassis lid, image the hashboard from above with a `FLIR ONE Pro` or equivalent. Dead chips appear as cold spots at ambient while neighbours run `60-85 °C`. Photograph the heatmap and overlay it on a board layout to identify failed positions. This disambiguates chip death from thermal-pad failure in `30 seconds`.
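If you transcribe per-chip spot temperatures from the thermal image, the cold-spot rule can be applied mechanically. This is a rough sketch, assuming you have one temperature per chip position in board order; the `10 °C` margin above ambient and the function name are assumptions, and the `60 °C` warm-up check comes from the neighbour range quoted above.

```python
from statistics import median


def find_cold_spots(chip_temps_c: list[float], ambient_c: float,
                    margin_c: float = 10.0, hot_min_c: float = 60.0) -> list[int]:
    """Return indices of chips sitting near ambient on an otherwise hot board."""
    if not chip_temps_c or median(chip_temps_c) < hot_min_c:
        # Board has not warmed up (or no data): cold spots are not meaningful yet.
        return []
    return [i for i, t in enumerate(chip_temps_c) if t <= ambient_c + margin_c]
```

Returning an empty list on a cool board is deliberate: imaging too early would otherwise flag every chip.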

8

Re-paste the entire board. Power off, cool `30 minutes`, remove heatsink, clean every chip with `99% IPA` and lint-free wipes, apply `Arctic MX-6` or `Thermal Grizzly Kryonaut` in a thin uniform layer. Replace any failed thermal pads on PMICs with `1.0 mm` `5-6 W/mK` pads. Reassemble with calibrated screw torque — uneven heatsink pressure is itself a cause of chip mortality, especially on `A3207`'s small package.

9

Verify PSU rail voltage under load at the hashboard input. Probe the power harness pins on the affected board with a multimeter while the miner is hashing at full power. Expect `≥ 13.8 V` sustained. PSU sag below that stresses the chips and is one of the most common causes of multi-board chip-count drift; if you see it, swap in a known-good PSU and re-baseline.
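"Sustained" means every sample, not the average. A small sketch of that check, assuming you log a series of multimeter readings by hand or from a logging meter (the function name is hypothetical):

```python
def psu_sag_events(samples_v: list[float], floor_v: float = 13.8) -> list[int]:
    """Return the indices of readings that dipped below the voltage floor."""
    return [i for i, v in enumerate(samples_v) if v < floor_v]
```

Any non-empty result is a reason to repeat the measurement with a known-good PSU before blaming the chips.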

10

Run for `24 hours` after the Tier 1-2 interventions (the steps above) and re-query `cgminer-api` `stats`. Compare against your baseline. Stable count above the disable threshold = bleeding stopped. Still dropping = silicon mortality in progress; escalate to Tier 3 (chip-level rework, Steps 11-15) or Tier 4 (professional bench repair, Step 16 onward). Do not push frequency on a degrading board to make up the lost hashrate; you'll finish the survivors.
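The comparison in this step reduces to three outcomes. The sketch below encodes them, assuming you have the alive-chip count from your baseline query and from the 24-hour query, plus the minimum count your firmware tolerates before disabling the board; the names and labels are illustrative.

```python
def classify_board(baseline_alive: int, current_alive: int, min_alive: int) -> str:
    """Compare a 24-hour chip count against the baseline and disable threshold."""
    if current_alive < baseline_alive:
        return "dropping"         # silicon mortality in progress: escalate
    if current_alive >= min_alive:
        return "stable"           # bleeding stopped
    return "below-threshold"      # not dropping, but the board will stay disabled
```

For example, with a 10% threshold on a 120-chip board, `min_alive` would be 108.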

11

Decode the `MW0`/`PVT_T` per-chip map for surgical targeting. Pull `MW0` via `cgminer-api`, identify indices reading `0x0`, cross-reference with board silkscreen positions. Confirm with the thermal photo from Step 7. You should now have a list of `1-12` specific chip positions to rework — mark them with a fine-tip silver Sharpie on the board edge (mark survives the reflow if it's on the edge).
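Once the `MW0` values are in a list, finding clusters of adjacent `0x0` entries is a simple scan. This sketch assumes you have already extracted one integer per chip in index order; how `MW0` is serialized in the `stats` reply depends on the firmware and is not assumed here. Indices are 0-based, so map them to your board's silkscreen labels yourself.

```python
def zero_runs(mw0: list[int]) -> list[tuple[int, int]]:
    """Return (first_index, last_index) for each run of consecutive zero entries."""
    runs = []
    start = None
    for i, value in enumerate(mw0):
        if value == 0:
            if start is None:
                start = i          # a new run of dead positions begins
        elif start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:          # run extends to the end of the array
        runs.append((start, len(mw0) - 1))
    return runs
```

A run spanning several adjacent chips is worth cross-checking against the rail measurements from Step 6 before you commit to replacing each chip in it.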

12

Source replacement chips: `A3207` is the `A1466` chip; supply is tight, and Canaan does not publish the part standalone. Use new-old-stock or D-Central salvaged-grade chips of known provenance — avoid grey-market lots, which historically show accelerated mortality. Order `1.5×` your dead-chip count to allow for rework loss. Do not substitute a non-`A3207` chip on an A1466 board.
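The `1.5×` rule needs rounding up, since chips come in whole units. A one-function sketch (the name is hypothetical):

```python
import math


def chips_to_order(dead_chips: int, spare_factor: float = 1.5) -> int:
    """Order quantity with rework allowance, rounded up to a whole chip."""
    return math.ceil(dead_chips * spare_factor)
```

So 3 dead chips means ordering 5, and 4 dead chips means ordering 6.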

13

Pre-heat the board, remove the dead chip. Bottom-side preheat platform `~150 °C`. Apply flux around the target chip's BGA. Top-side hot air at `300-330 °C` for `~30-45 seconds`. Lift the chip with vacuum tweezers — it should release cleanly. If it fights, wait `5 more seconds`; do not force it (you'll lift pads). Bottom-side preheat is non-negotiable on these boards.

14

Clean the pad, re-tin, place the new chip. Wick residual solder with copper braid, clean pads with `99% IPA`. Apply fresh solder paste or pre-tin pads. Align new `A3207` to the silkscreen orientation marker (asymmetric corner; get this wrong and the chip is dead on first power-up). Lower into place. Top-side hot air `310-330 °C` for `30 s`. Steady the chip lightly with tweezers if needed, but do not press down: you'll squeeze solder out of the BGA balls and create shorts.

15

Cool naturally on an antistatic mat for `5+ minutes` — never blow cold air on a hot board (thermal shock cracks adjacent joints). Clean flux with IPA. Re-paste all chips on the board (you've disturbed the heatsink anyway). Bench-test on a universal Avalon test fixture if available, or reinstall and re-query `cgminer-api` `stats`. Confirm `ASIC count` recovered to nameplate and the previously-dead positions now report non-`0x0` `MW0` values.

16

Stop DIY when: `>3` chips dead on one board; PMIC / voltage regulator suspected; you lack hot-air rework experience on `0.4 mm`-pitch BGA; a previous DIY rework left the board worse; capacitor bulging or board discoloration is visible. D-Central runs Tier 4 chip-level rework on Avalon hashboards — book a slot and bring photos of the `MW0` map and the thermal image (saves bench time, saves you money). The A1466 is new enough that bench-tech context matters more than usual.
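The stop conditions in this step can be written down as a checklist so nothing is skipped. This is only a sketch of the criteria listed above; the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BoardAssessment:
    dead_chips: int
    pmic_suspected: bool = False
    has_bga_rework_experience: bool = True
    prior_rework_made_it_worse: bool = False
    visible_damage: bool = False  # bulging capacitors or board discoloration


def stop_diy_reasons(a: BoardAssessment) -> list[str]:
    """Return every stop condition that applies; an empty list means DIY is viable."""
    reasons = []
    if a.dead_chips > 3:
        reasons.append("more than 3 dead chips on one board")
    if a.pmic_suspected:
        reasons.append("PMIC / voltage regulator suspected")
    if not a.has_bga_rework_experience:
        reasons.append("no hot-air rework experience on 0.4 mm-pitch BGA")
    if a.prior_rework_made_it_worse:
        reasons.append("previous DIY rework left the board worse")
    if a.visible_damage:
        reasons.append("capacitor bulging or board discoloration visible")
    return reasons
```

Any single reason is enough to stop; the list form just makes it easy to include in the note you send with the board.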

17

D-Central bench process for A1466: universal Avalon test-fixture bench-up, full per-chip enumeration via `cgminer` debug build, voltage-rail integrity across all groups, chip-level rework with new-old-stock or graded `A3207`, full re-paste with high-end paste, `24-hour` post-repair burn-in at nameplate `~50 TH/s` per board. Boards with `>12` dead chips quoted as scrap-for-parts; boards with `1-3` dead return as-new.

18

Ship safely to Quebec. Anti-static bag the affected hashboard(s) — leave the rest of the miner intact unless instructed. Double-box with `≥ 5 cm` foam every side. Include a printed note: observed symptoms, MM firmware version, `cgminer-api` `stats` output, your `MW0` map, contact info. Canada-wide / US / international accepted; return turnaround `5-10 business days` typical, longer when `A3207` chip supply tightens.

When to Seek Professional Repair

If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.

Still Having Issues?

Our team of Bitcoin Mining Hackers has been repairing ASIC miners since 2016. We have seen it all and fixed it all. Get a professional diagnosis.