Avalon Error Codes & LED Status Reference (Canaan)

Canaan AvalonMiners report faults through two channels at once: a CGMiner-compatible JSON API on TCP port 4028 (the estats log, where the ECHU, BOOTBY, PS and MM_STATUS fields live) and two physical status lights (the chassis LED and the AUC controller LED). Canaan documents almost none of it in plain English. This page indexes the verified Avalon codes and LED patterns D-Central tracks on the repair bench, organized so you can decode a sick miner in minutes.

It complements D-Central’s ASIC Fault Finder (650 indexed codes) with a maker-level overview: roughly 99 Avalon/Canaan entries live on the site, grouped here by how the firmware surfaces each one. Where the public record is thin, this page says so and gives the diagnostic approach instead of inventing specifics.

How an Avalon signals a fault

Every Avalon from the A11 generation forward runs the same control stack: an AUC USB controller (AUC3 on A11/A12, AUC4 on A13+) hosts the MM (Mining Module) firmware and drives a 4 MHz IIC bus to three hashboards. Each board is a serial chain of Canaan A32xx ASICs (A3206/A3210 on A11/A12, later A3218/A3228-class silicon), so one dead chip position can take a whole board offline. That produces three fault surfaces:

The estats JSON API (port 4028): the real diagnostic. Send {"command":"estats"} over a raw TCP connection (nc or telnet work; the CGMiner-style API is a plain socket rather than HTTP, so an HTTP request from curl is not a reliable way to query it) and the reply dumps every field the firmware knows. The web UI shows less than half of it.
Two physical LEDs: the chassis LED (overall health) and the AUC controller LED (bus/comm health). Canaan compresses many faults into a single colour, so the LED tells you something is wrong, not what.
Web UI / log messages: plain-text strings such as `E: no asics!!`, `MMCRCFAILED`, and `asic_init_fail: chip N no ACK`, found in the miner log and on the control-board serial console.
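As a rough illustration of the port-4028 query, here is a minimal Python sketch. It assumes the usual CGMiner API behaviour (one JSON request per connection, a reply that is JSON text ending in a NUL byte, and the miner closing the socket afterwards); verify that against your MM build. The function names `query_miner` and `parse_reply` and the IP address in the usage comment are hypothetical.

```python
import json
import socket


def parse_reply(raw: bytes) -> dict:
    """Decode a CGMiner-style reply: JSON text, often NUL-terminated (assumption)."""
    text = raw.decode("utf-8", errors="replace").strip("\x00\r\n\t ")
    return json.loads(text)


def query_miner(host: str, command: str = "estats",
                port: int = 4028, timeout: float = 5.0) -> dict:
    """Send one JSON command over a raw TCP socket and return the parsed reply."""
    request = json.dumps({"command": command}).encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # empty read: the miner closed the connection
                break
            chunks.append(chunk)
    return parse_reply(b"".join(chunks))


# Usage (hypothetical address; needs a reachable miner):
# reply = query_miner("192.168.1.50")
# print(json.dumps(reply, indent=2))
```

Saving the full reply to a file gives you the same snapshot a screenshot would, in a form that can be searched and diffed later.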

Canaan publishes the field names in the Canaan-Creative/avalon10-docs repo and stops there — no bit meanings, no thresholds, no remediation. The decoded tables below combine that repo, Zeus Mining’s backstage-log writeups, and D-Central’s A11–A15 repair-queue observations. Treat the inferred meanings as directional and verify against your specific MM build.

The `estats` field map — read in order

Don’t start with hashrate; it’s a downstream symptom. Read top-down: are the boards there? → is the controller happy? → are the chains clean? → how well are they hashing?

Field	What it reports	Healthy value	Decoder page
`SYSTEMSTATU`	How many hashboards enumerated; work state	`Work: 3` on a 3-board rig	estats decoder
`ECMM`	Module-management (MM control layer) error	`0`	estats decoder
`ECHU[a b c]`	Per-chain chip status bitmap (one 32-bit value per chain)	`[0 0 0]`	ECHU bit decoder
`MM_STATUS`	Miner work mode	`WORK_MODE NORMAL` / `IDLE`	estats decoder
`PVT_T[]`	Per-chip junction temperature	≤ 75 °C; soft-shutdown > 85 °C (A11/A12)	ASIC temp abnormal
`PVT_V[]`	Per-chip domain voltage	Within ±20 mV across a board (A1246 chip range ~290–350 mV)	Voltage domain abnormal
`PS[0..2]` + bitmap	AUC-to-PSU telemetry and PSU fault flags	Non-zero telemetry, bitmap `0`	PSU not detected
`BOOTBY[0xNN]`	Why the miner last rebooted	`0x05`/`0x21` = intentional	BOOTBY reference
`GHSmm` vs `GHSavg`	Nameplate vs realized hashrate	Within 5–10% (A1246 ~90 TH/s, A1566 ~185 TH/s)	Low hashrate
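The top-down read order can be sketched in Python as follows. This assumes the estats text carries fields as `NAME[value]` tokens, which matches the `BOOTBY[0xNN...]` and `ECHU[a b c]` notation used on this page; the sample string is invented for illustration and the helper names are hypothetical.

```python
import re

# Assumption: each field appears as NAME[value] somewhere in the estats text.
FIELD_RE = re.compile(r"(\w+)\[([^\]]*)\]")

# Read order from this page: boards, controller, chains, mode, PSU, chips, hashrate.
READ_ORDER = ["SYSTEMSTATU", "ECMM", "ECHU", "MM_STATUS", "PS",
              "PVT_T", "PVT_V", "GHSmm", "GHSavg"]


def extract_fields(mm_text: str) -> dict:
    """Map each NAME[value] token to its raw string (first occurrence wins)."""
    fields = {}
    for name, value in FIELD_RE.findall(mm_text):
        fields.setdefault(name, value.strip())
    return fields


def in_read_order(fields: dict) -> list:
    """Return the (name, value) pairs that are present, in diagnostic order."""
    return [(name, fields[name]) for name in READ_ORDER if name in fields]


def hashrate_shortfall(fields: dict) -> float:
    """Fraction by which realized hashrate (GHSavg) trails nameplate (GHSmm)."""
    return 1.0 - float(fields["GHSavg"]) / float(fields["GHSmm"])


# Invented sample, not captured from a real miner.
SAMPLE = ("SYSTEMSTATU[Work: 3] ECMM[0] ECHU[513 0 0] "
          "BOOTBY[0x01.00000000] GHSmm[90000.00] GHSavg[85500.00]")

if __name__ == "__main__":
    parsed = extract_fields(SAMPLE)
    for name, value in in_read_order(parsed):
        print(f"{name:12s} {value}")
    print(f"shortfall    {hashrate_shortfall(parsed):.1%}")
```

Real firmware may index some fields per board or format `SYSTEMSTATU` differently, so treat the regex as a starting point rather than a complete parser.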

`BOOTBY` — reboot-cause codes

MM writes a one-byte cause code to non-volatile storage on every reboot and prints BOOTBY[0xNN.xxxxxxxx] at the next boot (ignore the trailing pointer). A single event is noise; a repeating cadence is the signal. Codes confirmed across the A11–A15 line (Zeus Mining mirrors Canaan’s partial list; 0x10 is community/D-Central-sourced and absent from Canaan’s English docs):

Code	Meaning	First action
`0x01`	Hard reboot / cold start / unclassified power event	Verify clean mains; single event = ignore
`0x02` (legacy `0A1E`)	Overheat soft-shutdown (`PVT_T` > ~85 °C)	Filter, airflow, thermal path
`0x03`	Stratum / DNS / pool-side fault	Swap pool; drop the `114.114.114.114` default DNS
`0x04`	MM re-initialised its own network stack	Reseat AUC USB, refresh MM firmware
`0x05`	API-initiated reboot (operator/controller)	Healthy — check your monitoring stack if unexpected
`0x08`	Reserved / watchdog (MM-build-dependent)	Treat as `0x10` if it persists
`0x10`	Unclassified watchdog (not in Canaan’s table)	Firmware refresh, PSU test, AUC reseat
`0x11`	No hashrate within 5 min of boot	Reseat ribbons, slot-swap, read `SYSTEMSTATU`
`0x12`	Hashrate dropped below 70% of nameplate	`PVT_T`/`PVT_V` check, chip-level diagnosis
`0x21`	Soft reboot after a config change	Healthy — no action

Model-deep companions: Avalon 1246 reboot BOOTBY and A1166 Pro 0x10 loop.
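A small Python sketch of the BOOTBY lookup, using only the codes in the table above. The regex assumes the `BOOTBY[0xNN.xxxxxxxx]` form described here; the function names and the `min_count` threshold for a "repeating cadence" are illustrative assumptions, not anything Canaan specifies.

```python
import re
from collections import Counter

BOOTBY_CODES = {
    0x01: "Hard reboot / cold start / unclassified power event",
    0x02: "Overheat soft-shutdown",
    0x03: "Stratum / DNS / pool-side fault",
    0x04: "MM re-initialised its own network stack",
    0x05: "API-initiated reboot",
    0x08: "Reserved / watchdog (MM-build-dependent)",
    0x10: "Unclassified watchdog",
    0x11: "No hashrate within 5 min of boot",
    0x12: "Hashrate dropped below 70% of nameplate",
    0x21: "Soft reboot after a config change",
}
INTENTIONAL = {0x05, 0x21}  # the two codes this page marks as healthy

# The cause byte comes first; the trailing pointer after the dot is ignored.
BOOTBY_RE = re.compile(r"BOOTBY\[0x([0-9A-Fa-f]+)(?:\.[0-9A-Fa-f]+)?\]")


def decode_bootby(text: str):
    """Return (code, meaning, is_intentional), or None if no BOOTBY token is found."""
    match = BOOTBY_RE.search(text)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return code, BOOTBY_CODES.get(code, "unknown code"), code in INTENTIONAL


def repeating_codes(codes, min_count: int = 2) -> list:
    """Codes seen at least min_count times in a reboot history (threshold is an assumption)."""
    return sorted(c for c, n in Counter(codes).items() if n >= min_count)


if __name__ == "__main__":
    print(decode_bootby("BOOTBY[0x10.00000000]"))
    print([hex(c) for c in repeating_codes([0x01, 0x10, 0x10, 0x05])])
```

Feeding `repeating_codes` the codes collected from several boots separates a one-off event from a pattern worth chasing.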

`PS` — PSU error bitmap

The PS field is a bitmap published as a header-file table in avalon10-docs (no remediation prose). Decompose the reported value into its set bits and work each one: PS error 2049 is bits 0 + 11 (1 + 2048), meaning mains undervoltage plus a stalled PSU fan, not “fault 2049.”

Bit / value	Canaan label	Meaning	First action
0 / `1`	`Input_UV`	AC input undervoltage	Measure wall voltage under load (≥ 195 V)
1 / `2`	`OT1`	PSU overtemp warning	Improve airflow, lower ambient
2 / `4`	`OT2`	PSU overtemp critical	Stop and inspect PSU
3 / `8`	`OT3`	Thermal shutdown imminent	Immediate shutdown
4 / `16`	`OC_Pri`	Primary-side overcurrent	Check circuit, PDU, line voltage
5 / `32`	`UV_out`	DC rail undervoltage	PSU tired or load too high
6 / `64`	`OC_out`	DC rail overcurrent	Suspect a hashboard short
7 / `128`	`CS_error`	Current-sensor fault	PSU service / replace
8–10 / `256/512/1024`	`OC_IOSA/B/C`	Output channel overcurrent	Isolate by unplugging one board at a time
11 / `2048`	`FAN_error`	PSU fan failure	Replace PSU fan / PSU
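The bit decomposition is easy to automate. The sketch below encodes the table above as a Python dict and splits a PS value into its labels; `decode_ps` is a hypothetical helper name, and `OC_IOSA/B/C` is expanded into three separate labels on the assumption that each letter names one output channel.

```python
# Bit position -> (Canaan label, meaning), copied from the table above.
PS_BITS = {
    0: ("Input_UV", "AC input undervoltage"),
    1: ("OT1", "PSU overtemp warning"),
    2: ("OT2", "PSU overtemp critical"),
    3: ("OT3", "Thermal shutdown imminent"),
    4: ("OC_Pri", "Primary-side overcurrent"),
    5: ("UV_out", "DC rail undervoltage"),
    6: ("OC_out", "DC rail overcurrent"),
    7: ("CS_error", "Current-sensor fault"),
    8: ("OC_IOSA", "Output channel overcurrent"),
    9: ("OC_IOSB", "Output channel overcurrent"),
    10: ("OC_IOSC", "Output channel overcurrent"),
    11: ("FAN_error", "PSU fan failure"),
}


def decode_ps(value: int) -> list:
    """Return the labels of every set bit, lowest bit first."""
    labels = []
    for bit in range(value.bit_length()):
        if value & (1 << bit):
            label, _meaning = PS_BITS.get(bit, (f"unknown_bit_{bit}", ""))
            labels.append(label)
    return labels


if __name__ == "__main__":
    print(decode_ps(2049))  # bits 0 and 11
```

Bits outside the table are reported as `unknown_bit_N` rather than dropped, since a newer PSU firmware could define more flags than the list here.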

`ECHU` — per-chip chain status (decompose to bits)

ECHU[a b c] carries one 32-bit value per chain. Canaan publishes the field but not the bit meanings. Always decompose: ECHU[513 0 0] is bits 0 + 9, not “fault 513.” The three most-observed bits:

Bit 0 (1) — chain communication trip. First fix: reseat AUC/ribbon, check mains sag. → Hashboard comm error, ASICCRC error.
Bit 7 (128) — chip-temperature outlier (the most-cited value; Zeus names it explicitly). Walk PVT_T; clean filter, refresh pads. → ASIC temp abnormal.
Bit 9 (512) — firmware disabled a chip on that chain. Walk PVT_V for a 0 V position; reflow or replace. → Voltage domain abnormal.

Pattern across the triplet: [N 0 0] = one chain (board/slot-localised); [N N N] = controller-side (AUC, MM, PSU rail, or environment — often a 120 V circuit sagging under load). Full table at the ECHU decoder.
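The same decomposition applies to ECHU, plus the triplet pattern check. This Python sketch names only the three bits listed above and interprets `[N N N]` as "every chain non-zero", which is an assumption about how strictly the pattern should match; the function names are hypothetical.

```python
# Only the three most-observed bits from this page; other bits are left unnamed.
ECHU_BITS = {
    0: "chain communication trip",
    7: "chip-temperature outlier",
    9: "firmware disabled a chip",
}


def set_bits(value: int) -> list:
    """Bit positions set in one chain's 32-bit ECHU value."""
    return [bit for bit in range(32) if (value >> bit) & 1]


def describe_chain(value: int) -> list:
    """Names for the set bits, with a placeholder for undocumented ones."""
    return [ECHU_BITS.get(bit, f"undocumented bit {bit}") for bit in set_bits(value)]


def classify_triplet(chains) -> str:
    """Apply the [N 0 0] vs [N N N] rule to the per-chain values."""
    faulted = [index for index, value in enumerate(chains) if value != 0]
    if not faulted:
        return "clean"
    if len(faulted) == 1:
        return f"single chain {faulted[0]} (board/slot-localised)"
    if len(faulted) == len(chains):
        return "all chains (controller-side: AUC, MM, PSU rail, or environment)"
    return "mixed (check each flagged chain individually)"


if __name__ == "__main__":
    echu = [513, 0, 0]
    print(classify_triplet(echu))
    print(describe_chain(echu[0]))
```

The "mixed" case (two of three chains flagged) is not covered by the rule above, so the sketch reports it separately instead of guessing.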

LED & status-light reference

An A1246-class chassis has two LEDs, and between them they encode at least seven fault classes in three or four colours. There is no blink-count code — the disambiguation lives in the log, not the light.

Chassis LED (MM-driven, on the lid)

White / blue rapid flash — bootloader active, OS not yet handed off. Persisting > ~60 s = controller failed to boot.
Yellow / amber sustained — initializing (fans, PSU handshake, MM enumeration). Should reach green within ~5 min on a cold boot.
Green sustained — all three MMs enumerated, pool connected, valid shares. Healthy.
Red sustained — one of seven faults, per Canaan’s own 721–841 PDF: Toohot, Loopback failed, PG (Power Good) failed, Core test failed, Voltage error, Temp sensor error, No fan. Pull estats to disambiguate.

AUC controller LED (on the USB-stub module)

Blue — initializing or idle.
Green — passing IIC traffic to the MMs normally.
Red sustained or flickering — comm issue or rejected shares. A red flicker synced to pool submissions is reject/comm, not hardware; a steady red with the chassis green points at the cable/IIC path. → AUC USB connection lost, AUC controller failure.

The seven red-LED causes, mapped to fields

Red-LED cause	Confirm in `estats`	Page
Toohot	`PVT_T` outlier > 85 °C	Temperature too high
Loopback failed	`E: no asics!!`, `SYSTEMSTATU` < 3	No ASICs
PG failed	`PS` bitmap non-zero	PSU not detected
Core test failed	`ECHU` non-zero, BIN reads 0	Chip BIN mismatch
Voltage error	`PVT_V` outside ~290–350 mV (chip)	Voltage domain abnormal
Temp sensor error	Single `PVT_T` at floor/ceiling vs neighbours	ASIC temp abnormal
No fan	`Fan1`/`Fan2` RPM = 0	Fan speed error
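The table can be turned into a rough candidate filter. The Python sketch below takes values you have already pulled from estats and returns which red-LED causes they are consistent with. The function and its parameter names are hypothetical, the thresholds are the ones quoted in the table, and two rows are simplified: the Core test check looks only at `ECHU` (the BIN reading is not checked) and the Temp sensor row is omitted because it needs a neighbour comparison.

```python
def red_led_candidates(pvt_t, pvt_v_mv, boards_enumerated,
                       ps_bitmap, echu, fan_rpm) -> list:
    """List the red-LED causes consistent with the supplied readings.

    pvt_t: per-chip temperatures in deg C
    pvt_v_mv: per-chip domain voltages in mV
    boards_enumerated: board count reported by SYSTEMSTATU
    ps_bitmap: PSU fault bitmap as an integer
    echu: per-chain ECHU values
    fan_rpm: fan speeds in RPM
    """
    causes = []
    if any(t > 85 for t in pvt_t):
        causes.append("Toohot")
    if boards_enumerated < 3:
        causes.append("Loopback failed")
    if ps_bitmap != 0:
        causes.append("PG failed")
    if any(value != 0 for value in echu):
        causes.append("Core test failed (confirm BIN)")
    if any(not (290 <= v <= 350) for v in pvt_v_mv):
        causes.append("Voltage error")
    if any(rpm == 0 for rpm in fan_rpm):
        causes.append("No fan")
    return causes


if __name__ == "__main__":
    # Invented readings for illustration.
    print(red_led_candidates(pvt_t=[70, 88], pvt_v_mv=[310, 320],
                             boards_enumerated=3, ps_bitmap=0,
                             echu=[0, 0, 0], fan_rpm=[4200, 4100]))
```

More than one cause can match at once; the output is a shortlist to confirm against the log, not a verdict.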

The Avalon diagnostic ladder

Cold-boot 60 s at the PDU — kill AC so bulk caps discharge and the MM state machine flushes. Many E: no asics!! and 0x10 events on healthy hardware clear here, with zero spend.
Pull estats on port 4028 and screenshot it — read SYSTEMSTATU → ECMM → ECHU → MM_STATUS → PS → PVT_T/V → GHS. This snapshot is what the bench team asks for if the unit ships.
Read both LEDs to corroborate which subsystem the firmware blames (chassis = MM/hashboard/thermal; AUC = bus/comm/pool).
30-minute slot-swap test. Move the flagged board to another slot. Fault follows the board = board-side (chip/paste/ribbon); fault stays in the slot = AUC/ribbon/MM path. This bisection halves diagnostic time on every Avalon ticket.
Check the environment before the silicon — mains under load (≥ 195 V; a dedicated 240 V circuit beats 120 V), intake ≤ 30 °C, DNS set to 1.1.1.1/8.8.8.8. Then confirm the MM firmware is the last-known-good build before flashing.
Escalate to ribbon reseat/clean (99% IPA), thermal-pad refresh, then chip-level reflow — only once the slot-swap proves the fault is on the board.
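The slot-swap step of the ladder is a simple bisection, sketched here in Python. The function name, the slot numbering, and the "inconclusive" and "did not reproduce" outcomes are illustrative assumptions layered on the two outcomes the ladder describes.

```python
def interpret_slot_swap(original_slot: int, new_slot: int, faulted_slot_after) -> str:
    """Interpret a slot-swap test.

    original_slot: slot where the fault was first seen
    new_slot: slot the flagged board was moved to
    faulted_slot_after: slot showing the fault after the swap, or None if none does
    """
    if faulted_slot_after is None:
        return "fault did not reproduce: retest before drawing a conclusion"
    if faulted_slot_after == new_slot:
        return "fault followed the board: board-side (chip/paste/ribbon)"
    if faulted_slot_after == original_slot:
        return "fault stayed in the slot: AUC/ribbon/MM path"
    return "inconclusive: fault appeared in an unrelated slot"


if __name__ == "__main__":
    # Board moved from slot 0 to slot 2; the fault now shows on slot 2.
    print(interpret_slot_swap(original_slot=0, new_slot=2, faulted_slot_after=2))
```

Recording the three slot numbers alongside the estats snapshot makes the result easy to hand over if the unit ends up on the bench.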

DIY vs. the bench

Most LED-only and log-only complaints are a free cold-boot, a reseat, a filter clean, a DNS fix, or a firmware rollback — fully DIY. Stop and book the bench when the fault is in the silicon or control hardware: ECHU bit 9 recurs at the same chip position within 30 days of a reflow; SYSTEMSTATU < 3 survives ribbon reseats and slot-swaps; BOOTBY persists across two firmware builds; PVT_T stays > 85 °C after a fresh pad refresh with clean ambient; or you see scorched traces, bulging electrolytics, or a burnt smell. Reversed PSU connect order (burned U1/U2/R8/R9 power-sequence parts) and cross-connecting a Bitmain APW PSU are bench jobs too. D-Central diagnoses and repairs these in-house — chip replacement, MM/AUC swaps from stocked A11–A13 inventory, 24-hour burn-in.

Book Avalon ASIC repair: 5–10 business-day turnaround · Laval, Quebec workshop · ships Canada / US / international. Searching a specific string? Use the ASIC Fault Finder. Model specs and tuning: AvalonMiner 1246, A1166 Pro, A1346, A1566.

Sources

Canaan-Creative/avalon10-docs — Canaan’s own API reference; defines estats, the PS bitmap, ECHU, BOOTBY (field names only, no meanings).
Zeus Mining — Avalon backstage log description and the A1246 kernel-log walkthrough — third-party mirror naming ECHU 128 and the LED states.
Canaan 721–841 series troubleshooting PDF — the canonical (and only) source for the seven red-LED states.
D-Central in-house Avalon repair queue (A11–A15), 2023–2026 — the basis for the 0x10 prevalence pattern, the ECHU[N N N] = electrical-environment rule, and the field-to-fix mapping above.

Last reviewed June 8, 2026.