A1366_DEAD_CHIPS Warning

Avalon 1366 – Dead Chip Count Exceeded

Hashboard reports more than the configurable threshold (typically 5-10%, i.e. 6-12 of 120 chips) of dead A3200CFA ASIC chips. Firmware auto-disables the affected hashboard on the next boot to protect the survivors and the voltage rails.

Warning — Should be addressed soon

Affected Models: Avalon Made A1366 (130 TH/s, 3250 W, 2nd-gen 7nm A3200CFA silicon, 3 hashboards x 120 chips)

Quick Answer

Hashboard reports more than the configurable threshold (typically 5-10%, i.e. 6-12 of 120 chips) of dead A3200CFA ASIC chips. First step: hard power-cycle the miner.

Symptoms

`cgminer-api` `stats` reports `ASIC count` below `120` on at least one of the three hashboards (e.g. `MM count: 110/120, board disabled`)
Realized hashrate has dropped roughly in proportion to the missing chips (each A3200CFA contributes about `0.36 TH/s` of the board's `~43 TH/s` share)
Boot log shows non-zero `ECHU` on a specific board, or controller reports `ASIC_COUNT_LOW` and skips powering that hashboard
Dashboard shows two boards green/active and one greyed out / marked `disabled`, even though the board is physically installed
After a power cycle the missing-chip count is stable, not flickering — the chips are physically gone, not transiently glitching
`MW0` array contains `0x0` entries clustered on adjacent chip indices (`A1`-`A3`, or `A19`-`A21`, etc.)
Thermal camera or IR thermometer shows cold spots at near-ambient on the running board while neighbours run `60-85 °C`
`PVT_T` per-group temperature reads abnormally low or returns `-` / `--` for the affected group
Group voltage rail off-spec — `1.8 V` (`VDDIO`) or `0.75 V` (`VTOP`) measures wrong, or one rail reads `0 V`
Rest of the miner is healthy — no fan errors, no PSU faults, no over-temperature; the issue is isolated to chip count on a single board
Chip count drift accelerated after a known event: overclock push, thermal excursion, power surge, or a prior repaste/rework
If the board was previously third-party repaired: pattern of losing `1` chip/month consistently after the repair (grey-market chip provenance)
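
As a rough sanity check on the proportional-loss symptom, the arithmetic can be sketched in Python. This is a minimal sketch that assumes hashrate scales linearly with the number of live chips and uses the approximate `~43 TH/s` per-board figure quoted above; the function names are illustrative and not part of any Avalon tooling.

```python
CHIPS_PER_BOARD = 120   # chips per A1366 hashboard, per this article
BOARD_TH = 43.0         # approximate per-board share in TH/s (assumed nominal)


def expected_board_hashrate(live_chips: int) -> float:
    """Scale the nominal board hashrate by the fraction of chips still alive."""
    if not 0 <= live_chips <= CHIPS_PER_BOARD:
        raise ValueError("live_chips must be between 0 and 120")
    return BOARD_TH * live_chips / CHIPS_PER_BOARD


def dead_fraction(live_chips: int) -> float:
    """Fraction of the board's chips that are missing."""
    return (CHIPS_PER_BOARD - live_chips) / CHIPS_PER_BOARD


if __name__ == "__main__":
    live = 110
    print(f"{live}/120 chips: ~{expected_board_hashrate(live):.1f} TH/s expected, "
          f"{dead_fraction(live):.1%} dead")
```

If the measured board hashrate sits well below this estimate, something other than missing chips (throttling, a bad rail, pool-side issues) is probably contributing.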

Step-by-Step Fix

Hard power-cycle the miner — `60 seconds` off at the breaker, not a soft reboot. Some MM firmware builds latch a board-disable state across soft reboots but clear it on a cold boot. If the board returns at `120/120`, you had a stuck firmware state, not a dead chip. Re-monitor `24 hours` to confirm it stays healthy before declaring victory.

Re-seat the AUC3 ribbon and the power loom on the affected hashboard. Power off at the breaker, unplug, remove the chassis lid, disconnect/reconnect the AUC3 ribbon between the control board and the suspect hashboard. Inspect for bent pins, oxidation, or blackened contacts. Repeat for the power harness; listen for the click on every connector.

Read `ASIC count` directly from `cgminer-api` rather than trusting the web dashboard. From any LAN machine: `echo -n '{"command":"stats"}' | nc <miner-ip> 4028`. Parse the `MM ID0/1/2` blocks for `ASIC count`. This is the ground truth; the dashboard can lag, cache, or misreport.
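
The same query can be scripted so the counts can be logged over time. The sketch below is hedged: it assumes the API replies with JSON containing a `STATS` list, that each `MM ID<n>` value is a string of `Key[value]` tokens, and that one of those tokens is named `ASIC count`. The exact field layout varies between MM firmware builds, so check a raw reply from your own miner before relying on the parser. The helper names are hypothetical.

```python
import json
import re
import socket

# Assumed layout: "Key[value] Key[value] ..." inside each "MM ID<n>" string.
FIELD_RE = re.compile(r"([^\[\]]+?)\[([^\]]*)\]")


def query_cgminer(host, command="stats", port=4028, timeout=5.0):
    """Send one JSON command to the cgminer API and return the decoded reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(json.dumps({"command": command}).encode())
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:          # miner closed the connection: reply complete
                break
            chunks.append(data)
    # The reply is usually terminated with a NUL byte; drop it before decoding.
    raw = b"".join(chunks).rstrip(b"\x00").decode("utf-8", errors="replace")
    return json.loads(raw)


def parse_mm_block(text):
    """Split an 'MM ID' string of Key[value] tokens into a dict of strings."""
    return {key.strip(): value for key, value in FIELD_RE.findall(text)}


def asic_counts(reply):
    """Return {'MM ID0': count, ...} for every board that reports 'ASIC count'."""
    counts = {}
    for entry in reply.get("STATS", []):
        for key, value in entry.items():
            if key.startswith("MM ID") and isinstance(value, str):
                fields = parse_mm_block(value)
                if "ASIC count" in fields:
                    counts[key] = int(fields["ASIC count"])
    return counts


if __name__ == "__main__":
    # Replace with your miner's LAN address.
    print(asic_counts(query_cgminer("192.168.1.50")))
```

Keeping the network call and the parsing in separate functions means the parser can be adjusted against a saved reply without touching the miner.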

Verify intake ambient at the front grille with an IR thermometer. Target `≤ 35 °C` at the front of the miner; anything above that pushes `Tj` toward the `>100 °C` mortality zone for `A3200CFA` and accelerates chip death across the fleet. Cleaning the filters and correcting ambient temperature is the cheapest intervention available.

Check the Canaan firmware portal (`avalonminer.org/firmware-document/`) for the current MM image for your A1366 hardware revision. Some MM builds shipped with overly aggressive `0x0` nonce reporting that masquerades as chip mortality. Roll one MM version back or forward only after confirming the build matches your hardware revision, because the wrong MM image can brick the controller.

Multimeter the group voltage rails on the affected board under full hashing load. Probe each group's `1.8 V` (`VDDIO`) and `0.75 V` (`VTOP`) test points. Healthy: all 40 groups within `±5%` of nominal. If one group reads `0 V` or grossly off-spec while neighbours are clean, the PMIC for that group is dead, not the chips — the dead chip count likely returns to zero once the regulator is replaced.
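
Once the readings are written down, the `±5%` check is easy to automate. The following is a minimal sketch, assuming you record one `VDDIO` and one `VTOP` reading per group in a dictionary keyed by group number; the data structure and function names are illustrative only.

```python
NOMINAL_V = {"VDDIO": 1.8, "VTOP": 0.75}   # nominal rail voltages from this step
TOLERANCE = 0.05                           # +/-5% of nominal


def rail_ok(rail: str, measured_v: float) -> bool:
    """True if the measured voltage is within +/-5% of the rail's nominal value."""
    nominal = NOMINAL_V[rail]
    return abs(measured_v - nominal) <= nominal * TOLERANCE


def suspect_groups(readings: dict) -> list:
    """Return the group numbers where any rail is out of tolerance."""
    return sorted(
        group
        for group, rails in readings.items()
        if not all(rail_ok(name, volts) for name, volts in rails.items())
    )


if __name__ == "__main__":
    readings = {
        1: {"VDDIO": 1.81, "VTOP": 0.75},
        2: {"VDDIO": 1.80, "VTOP": 0.00},   # dead rail: points at the regulator
        3: {"VDDIO": 1.79, "VTOP": 0.76},
    }
    print("Groups to inspect:", suspect_groups(readings))
```

A single flagged group surrounded by clean neighbours matches the dead-regulator pattern described above; many flagged groups point further upstream.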

Thermal-camera the running board after `5 minutes` of hashing. Boot at stock frequency, lift the chassis lid, image the hashboard from above with a `FLIR ONE Pro` or equivalent. Dead chips appear as cold spots at ambient while neighbours run `60-85 °C`. Photograph the heatmap and overlay on a board layout to identify failed positions — disambiguates chip death from thermal-pad failure in `30 seconds`.
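
If you transcribe spot temperatures from the thermal image, the cold-spot rule can be expressed as a short filter. This sketch assumes a `10 °C` margin above ambient counts as "near ambient" (that margin is my assumption, not a Canaan figure) and that chip positions are labelled `A1`-`A120` as on the silkscreen.

```python
from statistics import median


def cold_spots(temps_c: dict, ambient_c: float, margin_c: float = 10.0) -> list:
    """Return chip positions sitting near ambient while the board as a whole is hot.

    temps_c maps a position label such as 'A17' to its temperature in Celsius.
    """
    if not temps_c:
        return []
    cutoff = ambient_c + margin_c
    # If the typical chip is not clearly above ambient, the board has not
    # warmed up (or is not hashing), so there is no contrast to read.
    if median(temps_c.values()) <= cutoff:
        return []
    cold = [pos for pos, temp in temps_c.items() if temp <= cutoff]
    return sorted(cold, key=lambda pos: int(pos[1:]))   # numeric order: A2 before A19


if __name__ == "__main__":
    sample = {"A1": 72.0, "A2": 26.0, "A3": 27.5, "A4": 68.0, "A5": 75.0}
    print("Cold positions:", cold_spots(sample, ambient_c=25.0))
```

The median check guards against flagging every chip on a board that was imaged before it reached operating temperature.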

Re-paste the entire board. Power off, cool `30 minutes`, remove heatsink, clean every chip with `99% IPA` and lint-free wipes, apply `Arctic MX-6` or `Thermal Grizzly Kryonaut` in a thin uniform layer. Replace any failed thermal pads on PMICs with `1.0 mm` `5-6 W/mK` pads. Reassemble with calibrated screw torque — uneven heatsink pressure is itself a cause of chip mortality.

Verify PSU rail voltage under load at the hashboard input. Multimeter probes at the power harness pins on the affected board while the miner is hashing at full power. Expect `≥ 13.8 V` sustained. PSU sag below that drives chip stress and is one of the most common causes of multi-board chip-count drift; swap with a known-good PSU and re-baseline.

Run for `24 hours` after Tier 1-2 interventions and re-query `cgminer-api` `stats`. Compare against your baseline. Stable count above the disable threshold = bleeding stopped. Still dropping = silicon mortality in progress; escalate to Tier 3 or 4.
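
The baseline comparison can be reduced to a few lines. This is a sketch under stated assumptions: chip counts are recorded per board in plain dictionaries, and the disable threshold is taken as `5%` of 120 chips (the low end of the range quoted earlier; your firmware's configured value may differ).

```python
CHIPS_PER_BOARD = 120


def assess_boards(baseline: dict, current: dict, threshold_pct: float = 5.0) -> dict:
    """Classify each board by comparing its chip count now against the baseline."""
    # Number of dead chips tolerated before the firmware disables the board.
    limit = CHIPS_PER_BOARD * threshold_pct / 100
    verdicts = {}
    for board, before in baseline.items():
        now = current[board]
        dead = CHIPS_PER_BOARD - now
        if now < before:
            verdicts[board] = "still dropping"
        elif dead > limit:
            verdicts[board] = "stable, over threshold"
        else:
            verdicts[board] = "stable"
    return verdicts


if __name__ == "__main__":
    baseline = {"MM ID0": 120, "MM ID1": 116, "MM ID2": 110}
    current = {"MM ID0": 120, "MM ID1": 114, "MM ID2": 110}
    for board, verdict in assess_boards(baseline, current).items():
        print(board, "->", verdict)
```

A board that is "stable, over threshold" has stopped losing chips but will still be disabled by the firmware until the dead positions are reworked.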

Decode the `MW0`/`PVT_T` per-chip map for surgical targeting. Pull `MW0` via `cgminer-api`, identify indices reading `0x0`, cross-reference with board silkscreen positions `A1`-`A120`. Confirm with the thermal photo from Step 7. You should now have a list of `1-12` specific chip positions to rework — mark them with a fine-tip silver Sharpie on the board edge.
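
Turning the `MW0` dump into a rework list is mostly index bookkeeping. The sketch below assumes `MW0` arrives as whitespace-separated values with one entry per chip in silkscreen order, so entry 1 maps to `A1`; that ordering is an assumption and should be confirmed against the thermal image before you heat anything. Function names are hypothetical.

```python
def dead_positions(mw0: str) -> list:
    """Return 1-based chip indices whose MW0 entry reads zero ('0' or '0x0')."""
    return [
        index
        for index, token in enumerate(mw0.split(), start=1)
        if token.lower() in ("0", "0x0")
    ]


def cluster(positions: list) -> list:
    """Group consecutive indices into (first, last) runs, e.g. [1, 2, 3, 6] -> [(1, 3), (6, 6)]."""
    runs = []
    for pos in sorted(positions):
        if runs and pos == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], pos)    # extend the current run
        else:
            runs.append((pos, pos))          # start a new run
    return runs


def labels(runs: list) -> list:
    """Format runs as silkscreen labels such as 'A1-A3' or 'A6'."""
    return [f"A{a}" if a == b else f"A{a}-A{b}" for a, b in runs]


if __name__ == "__main__":
    sample = "0x0 0x0 0x0 41 39 0x0 40"
    print(labels(cluster(dead_positions(sample))))
```

Runs of adjacent indices are worth noting separately from isolated failures, since a clustered loss is more consistent with a shared rail or thermal fault than with independent chip deaths.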

Source replacement chips: `A3200CFA` primary, with `A3200CMA` and `A3200CMCV3` documented as drop-in compatible on A1346/A1366 boards. Use new-old-stock or D-Central salvaged-grade chips of known provenance — avoid grey-market lots, which historically show accelerated mortality. Order `1.5×` your dead-chip count to allow for rework loss.

Pre-heat the board, remove the dead chip. Bottom-side preheat platform `~150 °C`. Apply flux around the target chip's BGA. Top-side hot air at `300-330 °C` for `~30-45 seconds`. Lift the chip with vacuum tweezers — it should release cleanly. If it fights, wait `5 more seconds`; do not force it (you'll lift pads). Bottom-side preheat is non-negotiable on these boards.

Clean the pad, re-tin, place the new chip. Wick residual solder with copper braid, clean pads with `99% IPA`. Apply fresh solder paste or pre-tin pads. Align the new `A3200CFA` to the silkscreen orientation marker (asymmetric corner; get this wrong and the chip is dead on first power-up). Lower into place. Top-side hot air `310-330 °C` for `30 s`. Steady the chip lightly with tweezers if needed, but do not press it down.

Cool naturally on an antistatic mat for `5+ minutes` — never blow cold air on a hot board (thermal shock cracks adjacent joints). Clean flux with IPA. Re-paste all 120 chips (you've disturbed the heatsink anyway). Bench-test on a universal Avalon test fixture if available, or reinstall and re-query `cgminer-api` `stats`. Confirm `ASIC count` recovered to `120/120` and the previously-dead positions now report non-`0x0` `MW0` values.

Stop DIY when: `>3` chips dead on one board; PMIC / voltage regulator suspected; you lack hot-air rework experience on `0.4 mm`-pitch BGA; a previous DIY rework left the board worse; capacitor bulging or board discoloration is visible. D-Central runs Tier 4 chip-level rework on Avalon hashboards — book a slot and bring photos of the `MW0` map and the thermal image (saves bench time, saves you money).
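
The stop conditions above amount to a simple checklist, sketched here in Python so it can be dropped into a fleet script. The `BoardState` fields are hypothetical names for the five criteria in this step; nothing here is read from the miner automatically.

```python
from dataclasses import dataclass


@dataclass
class BoardState:
    dead_chips: int
    pmic_suspected: bool = False
    has_bga_rework_experience: bool = True
    prior_diy_made_it_worse: bool = False
    visible_damage: bool = False   # bulging capacitors or board discoloration


def stop_diy_reasons(state: BoardState) -> list:
    """Return every reason to stop DIY work; an empty list means none apply."""
    reasons = []
    if state.dead_chips > 3:
        reasons.append("more than 3 dead chips on one board")
    if state.pmic_suspected:
        reasons.append("PMIC / voltage regulator suspected")
    if not state.has_bga_rework_experience:
        reasons.append("no hot-air rework experience on 0.4 mm-pitch BGA")
    if state.prior_diy_made_it_worse:
        reasons.append("previous DIY rework left the board worse")
    if state.visible_damage:
        reasons.append("capacitor bulging or board discoloration visible")
    return reasons


if __name__ == "__main__":
    board = BoardState(dead_chips=5, pmic_suspected=True)
    for reason in stop_diy_reasons(board):
        print("STOP:", reason)
```

Any single reason is enough to stop; the list form simply makes it easy to include all of them in the note that ships with the board.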

D-Central bench process for A1366: universal Avalon test-fixture bench-up, full per-chip enumeration via `cgminer` debug build, voltage-rail integrity across all 40 groups, chip-level rework with new-old-stock or graded `A3200CFA`, full re-paste with high-end paste, `24-hour` post-repair burn-in at nameplate `~43 TH/s` per board. Boards with `>12` dead chips quoted as scrap-for-parts; boards with `1-3` dead return as-new.

Ship safely to Quebec. Anti-static bag the affected hashboard(s) — leave the rest of the miner intact unless instructed. Double-box with `≥ 5 cm` foam every side. Include a printed note: observed symptoms, MM firmware version, `cgminer-api` `stats` output, your `MW0` map, contact info. Canada-wide / US / international accepted; return turnaround `5-10 business days` typical.

When to Seek Professional Repair

If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.

Frequently Asked Questions

What does the A1366_DEAD_CHIPS error mean?

Hashboard reports more than the configurable threshold (typically 5-10%, i.e. 6-12 of 120 chips) of dead A3200CFA ASIC chips. Firmware auto-disables the affected hashboard on the next boot to protect the survivors and the voltage rails. Commonly reported on: Avalon Made A1366 (130 TH/s, 3250 W, 2nd-gen 7nm A3200CFA silicon, 3 hashboards x 120 chips).

Can I fix the A1366_DEAD_CHIPS error myself?

This is an advanced repair best handled by someone experienced with ASIC hardware. Start with a hard power-cycle: `60 seconds` off at the breaker, not a soft reboot, because some MM firmware builds latch a board-disable state across soft reboots. If you are not equipped for board-level work, D-Central can diagnose and repair it at our Laval bench.

How much does it cost to repair?

A DIY repair typically runs $63-$900 CAD depending on which part the fault traces to. D-Central can also diagnose and quote a mail-in bench repair.

What parts might I need to fix this?

Common replacement parts for this fault: Hashboard Thermal Paste (Cyan), Fluke Multimeter 15B+, STASIC Hashboard MultiTester Pro. The exact part depends on diagnosis - measure first.

Key Terms in This Fault

Jump to the full definition of the technical terms involved in this fault:

Hashboard, Voltage domain, Hashrate, PSU, Firmware, Autotuning, Overclocking, Heatsink

Still Having Issues?

Our team of Bitcoin Mining Hackers has been repairing ASIC miners since 2016. We have seen it all and fixed it all. Get a professional diagnosis.

Related Error Codes

Avalon 1166 - Low Hashrate

Avalon 1246 - Low Hashrate

Avalon 1166 - Hashboard Not Detected

Antminer S19 - Low Hashrate (Missing ASIC Chips)

Antminer S21 - ASIC Chip HW Errors

Antminer S19 - ASIC Chip Temperature Imbalance

Antminer - Hashboard Short Circuit

NerdQAxe - Low Hashrate on One Chip
