Avalon Error Codes & LED Status Reference (Canaan)
Canaan AvalonMiners report faults through two channels at once: a CGMiner-compatible JSON API on TCP port 4028 (the estats log, where the ECHU, BOOTBY, PS and MM_STATUS fields live) and two physical status lights (the chassis LED and the AUC controller LED). Canaan documents almost none of it in plain English. This page indexes the verified Avalon codes and LED patterns D-Central tracks on the repair bench, organized so you can decode a sick miner in minutes.
It complements D-Central’s ASIC Fault Finder (650 indexed codes) with a maker-level overview: roughly 99 Avalon/Canaan entries live on the site, grouped here by how the firmware surfaces each one. Where the public record is thin, this page says so and gives the diagnostic approach instead of inventing specifics.
How an Avalon signals a fault
Every Avalon from the A11 generation forward runs the same control stack: an AUC USB controller (AUC3 on A11/A12, AUC4 on A13+) hosts the MM (Mining Module) firmware and drives a 4 MHz IIC bus to three hashboards. Each board is a serial chain of Canaan A32xx ASICs (A3206/A3210 on A11/A12, later A3218/A3228-class silicon), so one dead chip position can take a whole board offline. That produces three fault surfaces:
- The `estats` JSON API (port 4028): the real diagnostic. Send `{"command":"estats"}` (via `nc`, `telnet`, or `curl http://<ip>:4028 -d '{"command":"estats"}'`) and the reply dumps every field the firmware knows. The web UI shows less than half of it.
- Two physical LEDs: the chassis LED (overall health) and the AUC controller LED (bus/comm health). Canaan compresses many faults into a single colour, so the LED tells you something is wrong, not what.
- Web UI / log messages: plain-text strings such as `E: no asics!!`, `MMCRCFAILED`, and `asic_init_fail: chip N no ACK` in the miner log and the control-board serial console.
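For scripted collection, the `estats` request described above can be sketched in Python. This is a minimal sketch, not Canaan's tooling: the function names are hypothetical, and it assumes the miner behaves like a stock CGMiner API (a raw TCP socket that answers one JSON command, may pad the reply with trailing NUL bytes, and then closes the connection).

```python
import json
import socket


def build_request(command: str) -> bytes:
    """Encode a CGMiner-style JSON command such as "estats"."""
    return json.dumps({"command": command}).encode("ascii")


def decode_reply(raw: bytes) -> str:
    """Strip trailing NUL padding (assumed CGMiner behaviour) and decode."""
    return raw.rstrip(b"\x00").decode("utf-8", errors="replace")


def query_api(host: str, command: str = "estats", port: int = 4028,
              timeout: float = 5.0) -> str:
    """Send one command and read until the miner closes the socket."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(command))
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # peer closed the connection: reply is complete
                break
            chunks.append(chunk)
    return decode_reply(b"".join(chunks))
```

Calling `query_api("<ip>")` with the miner's address returns the raw reply text. If the firmware keeps the socket open instead of closing it, the call raises a timeout rather than returning.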
Canaan publishes the field names in the Canaan-Creative/avalon10-docs repo and stops there — no bit meanings, no thresholds, no remediation. The decoded tables below combine that repo, Zeus Mining’s backstage-log writeups, and D-Central’s A11–A15 repair-queue observations. Treat the inferred meanings as directional and verify against your specific MM build.
The estats field map — read in order
Don’t start with hashrate; it’s a downstream symptom. Read top-down: are the boards there? → is the controller happy? → are the chains clean? → how well are they hashing?
| Field | What it reports | Healthy value | Decoder page |
|---|---|---|---|
| `SYSTEMSTATU` | How many hashboards enumerated; work state | Work: 3 on a 3-board rig | estats decoder |
| `ECMM` | Module-management (MM control layer) error | 0 | estats decoder |
| `ECHU[a b c]` | Per-chain chip status bitmap (one 32-bit value per chain) | [0 0 0] | ECHU bit decoder |
| `MM_STATUS` | Miner work mode | WORK_MODE NORMAL / IDLE | estats decoder |
| `PVT_T[]` | Per-chip junction temperature | ≤ 75 °C; soft-shutdown > 85 °C (A11/A12) | ASIC temp abnormal |
| `PVT_V[]` | Per-chip domain voltage | Within ±20 mV across a board (A1246 chip range ~290–350 mV) | Voltage domain abnormal |
| `PS[0..2]` + bitmap | AUC-to-PSU telemetry and PSU fault flags | Non-zero telemetry, bitmap 0 | PSU not detected |
| `BOOTBY[0xNN]` | Why the miner last rebooted | 0x05/0x21 = intentional | BOOTBY reference |
| `GHSmm` vs `GHSavg` | Nameplate vs realized hashrate | Within 5–10% (A1246 ~90 TH/s, A1566 ~185 TH/s) | Low hashrate |
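The top-down reading order can be automated once the reply is split into fields. The sketch below assumes the `NAME[value]` layout used throughout this page. The sample string is hypothetical, not captured from a miner, and `parse_fields` and `in_reading_order` are illustrative names.

```python
import re

# Matches NAME[value] pairs such as ECHU[513 0 0].
FIELD_RE = re.compile(r"(\w+)\[([^\]]*)\]")

# Reading order from this page: boards, controller, chains, then hashrate.
READ_ORDER = ["SYSTEMSTATU", "ECMM", "ECHU", "MM_STATUS", "PS",
              "PVT_T", "PVT_V", "GHSmm", "GHSavg"]


def parse_fields(text):
    """Return {field name: raw value string} for every NAME[value] pair."""
    return {name: value.strip() for name, value in FIELD_RE.findall(text)}


def in_reading_order(fields):
    """List (name, value) pairs in triage order, skipping absent fields.

    A numeric suffix (for example PVT_T0) is treated as belonging to its
    base field; that suffix convention is an assumption.
    """
    ordered = []
    for base in READ_ORDER:
        for name, value in fields.items():
            suffix = name[len(base):]
            if name.startswith(base) and (suffix == "" or suffix.isdigit()):
                ordered.append((name, value))
    return ordered


# Hypothetical snapshot for illustration only.
sample = "BOOTBY[0x10.00000000] ECHU[513 0 0] ECMM[0] SYSTEMSTATU[Work: 3]"
for name, value in in_reading_order(parse_fields(sample)):
    print(f"{name}: {value}")
```

Fields outside the reading order (such as `BOOTBY`) are still parsed; they are left out of the ordered walk so the triage stays focused.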
BOOTBY — reboot-cause codes
MM writes a one-byte cause code to non-volatile storage on every reboot and prints BOOTBY[0xNN.xxxxxxxx] at the next boot (ignore the trailing pointer). A single event is noise; a repeating cadence is the signal. Codes confirmed across the A11–A15 line (Zeus Mining mirrors Canaan’s partial list; 0x10 is community/D-Central-sourced and absent from Canaan’s English docs):
| Code | Meaning | First action |
|---|---|---|
| `0x01` | Hard reboot / cold start / unclassified power event | Verify clean mains; single event = ignore |
| `0x02` (legacy `0A1E`) | Overheat soft-shutdown (`PVT_T` > ~85 °C) | Filter, airflow, thermal path |
| `0x03` | Stratum / DNS / pool-side fault | Swap pool; drop the 114.114.114.114 default DNS |
| `0x04` | MM re-initialised its own network stack | Reseat AUC USB, refresh MM firmware |
| `0x05` | API-initiated reboot (operator/controller) | Healthy; check your monitoring stack if unexpected |
| `0x08` | Reserved / watchdog (MM-build-dependent) | Treat as `0x10` if it persists |
| `0x10` | Unclassified watchdog (not in Canaan’s table) | Firmware refresh, PSU test, AUC reseat |
| `0x11` | No hashrate within 5 min of boot | Reseat ribbons, slot-swap, read `SYSTEMSTATU` |
| `0x12` | Hashrate dropped below 70% of nameplate | `PVT_T`/`PVT_V` check, chip-level diagnosis |
| `0x21` | Soft reboot after a config change | Healthy; no action |
Model-deep companions: Avalon 1246 reboot BOOTBY and A1166 Pro 0x10 loop.
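Decoding `BOOTBY` and spotting a repeating code can be sketched as a table lookup plus a counter. The meanings are copied from the table above. The `min_count=2` threshold for "repeating" is an assumption chosen for illustration, and the function names are hypothetical.

```python
import re
from collections import Counter

# Cause codes and meanings as listed in the BOOTBY table above.
BOOTBY_MEANINGS = {
    0x01: "Hard reboot / cold start / unclassified power event",
    0x02: "Overheat soft-shutdown",
    0x03: "Stratum / DNS / pool-side fault",
    0x04: "MM re-initialised its own network stack",
    0x05: "API-initiated reboot",
    0x08: "Reserved / watchdog (MM-build-dependent)",
    0x10: "Unclassified watchdog",
    0x11: "No hashrate within 5 min of boot",
    0x12: "Hashrate dropped below 70% of nameplate",
    0x21: "Soft reboot after a config change",
}

# Capture only the one-byte code; the trailing pointer is ignored.
BOOTBY_RE = re.compile(r"BOOTBY\[0x([0-9A-Fa-f]{2})")


def decode_bootby(text):
    """Return (code, meaning) for the first BOOTBY field, or None."""
    match = BOOTBY_RE.search(text)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return code, BOOTBY_MEANINGS.get(code, "not in the table above")


def repeating_codes(log_lines, min_count=2):
    """Count codes across boots and keep those seen min_count times or more."""
    counts = Counter()
    for line in log_lines:
        decoded = decode_bootby(line)
        if decoded is not None:
            counts[decoded[0]] += 1
    return {code: n for code, n in counts.items() if n >= min_count}
```

Feeding it one `BOOTBY[...]` line per boot separates the one-off `0x01` from a `0x10` that keeps coming back.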
PS — PSU error bitmap
The PS field is a bitmap published as a header-file table in avalon10-docs (no remediation prose). Decompose the value into its set bits and work each one: PS error 2049 is 1 + 2048, so bits 0 + 11 = mains undervoltage and a stalled PSU fan, not “fault 2049.”
| Bit / value | Canaan label | Meaning | First action |
|---|---|---|---|
| 0 / 1 | `Input_UV` | AC input undervoltage | Measure wall voltage under load (≥ 195 V) |
| 1 / 2 | `OT1` | PSU overtemp warning | Improve airflow, lower ambient |
| 2 / 4 | `OT2` | PSU overtemp critical | Stop and inspect PSU |
| 3 / 8 | `OT3` | Thermal shutdown imminent | Immediate shutdown |
| 4 / 16 | `OC_Pri` | Primary-side overcurrent | Check circuit, PDU, line voltage |
| 5 / 32 | `UV_out` | DC rail undervoltage | PSU tired or load too high |
| 6 / 64 | `OC_out` | DC rail overcurrent | Suspect a hashboard short |
| 7 / 128 | `CS_error` | Current-sensor fault | PSU service / replace |
| 8–10 / 256/512/1024 | `OC_IOSA/B/C` | Output channel overcurrent | Isolate by unplugging one board at a time |
| 11 / 2048 | `FAN_error` | PSU fan failure | Replace PSU fan / PSU |
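The bit decomposition is mechanical, so it is easy to script. A minimal sketch using the labels from the table above: `OC_IOSA/B/C` is expanded here into three separate names, which is an assumption about how the shorthand maps to bits 8, 9 and 10, and `decode_ps` is a hypothetical helper name.

```python
# Bit position -> label, from the PS table above.
PS_BITS = {
    0: "Input_UV", 1: "OT1", 2: "OT2", 3: "OT3",
    4: "OC_Pri", 5: "UV_out", 6: "OC_out", 7: "CS_error",
    8: "OC_IOSA", 9: "OC_IOSB", 10: "OC_IOSC", 11: "FAN_error",
}


def decode_ps(value):
    """Return the labels of every set bit, lowest bit first."""
    labels = []
    for bit in range(value.bit_length()):
        if (value >> bit) & 1:
            labels.append(PS_BITS.get(bit, f"unknown_bit_{bit}"))
    return labels


print(decode_ps(2049))  # ['Input_UV', 'FAN_error']
```

Bits above 11 are reported as `unknown_bit_N` instead of being dropped, so an unfamiliar flag still shows up in the output.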
ECHU — per-chip chain status (decompose to bits)
ECHU[a b c] carries one 32-bit value per chain. Canaan publishes the field but not the bit meanings. Always decompose: ECHU[513 0 0] is bits 0 + 9, not “fault 513.” The three most-observed bits:
- Bit 0 (`1`): chain communication trip. First fix: reseat AUC/ribbon, check mains sag. → Hashboard comm error, ASICCRC error.
- Bit 7 (`128`): chip-temperature outlier (the most-cited value; Zeus names it explicitly). Walk `PVT_T`; clean filter, refresh pads. → ASIC temp abnormal.
- Bit 9 (`512`): firmware disabled a chip on that chain. Walk `PVT_V` for a 0 V position; reflow or replace. → Voltage domain abnormal.
Pattern across the triplet: [N 0 0] = one chain (board/slot-localised); [N N N] = controller-side (AUC, MM, PSU rail, or environment — often a 120 V circuit sagging under load). Full table at the ECHU decoder.
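The decomposition and the triplet pattern can be combined in a few lines. This is a sketch under two assumptions: any triplet with all three chains non-zero is treated as the `[N N N]` case even if the values differ, and a two-chain pattern, which this page does not cover, is reported as unclassified. The function names are hypothetical.

```python
import re

ECHU_RE = re.compile(r"ECHU\[([^\]]*)\]")


def parse_echu(text):
    """Extract the per-chain integers from an ECHU[a b c] field."""
    match = ECHU_RE.search(text)
    if match is None:
        return []
    return [int(token) for token in match.group(1).split()]


def set_bits(value):
    """Return the set bit positions of a 32-bit chain value, lowest first."""
    return [bit for bit in range(32) if (value >> bit) & 1]


def classify_echu(chains):
    """Apply the triplet rule: one chain vs every chain."""
    faulty = [index for index, value in enumerate(chains) if value != 0]
    if not faulty:
        return "clean"
    if len(faulty) == 1:
        return "board/slot-localised"
    if len(faulty) == len(chains):
        return "controller-side"
    return "unclassified"  # e.g. two of three chains; not covered above


chains = parse_echu("ECHU[513 0 0]")
print([set_bits(value) for value in chains])  # [[0, 9], [], []]
print(classify_echu(chains))                  # board/slot-localised
```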
LED & status-light reference
An A1246-class chassis has two LEDs, and between them they encode at least seven fault classes in three or four colours. There is no blink-count code — the disambiguation lives in the log, not the light.
Chassis LED (MM-driven, on the lid)
- White / blue rapid flash — bootloader active, OS not yet handed off. Persisting > ~60 s = controller failed to boot.
- Yellow / amber sustained — initializing (fans, PSU handshake, MM enumeration). Should reach green within ~5 min on a cold boot.
- Green sustained — all three MMs enumerated, pool connected, valid shares. Healthy.
- Red sustained: one of seven faults listed in Canaan’s own 721–841 PDF (Toohot, Loopback failed, PG (Power Good) failed, Core test failed, Voltage error, Temp sensor error, No fan). Pull `estats` to disambiguate.
AUC controller LED (on the USB-stub module)
- Blue — initializing or idle.
- Green — passing IIC traffic to the MMs normally.
- Red sustained or flickering — comm issue or rejected shares. A red flicker synced to pool submissions is reject/comm, not hardware; a steady red with the chassis green points at the cable/IIC path. → AUC USB connection lost, AUC controller failure.
The seven red-LED causes, mapped to fields
| Red-LED cause | Confirm in `estats` | Page |
|---|---|---|
| Toohot | `PVT_T` outlier > 85 °C | Temperature too high |
| Loopback failed | `E: no asics!!`, `SYSTEMSTATU` < 3 | No ASICs |
| PG failed | `PS` bitmap non-zero | PSU not detected |
| Core test failed | `ECHU` non-zero, BIN reads 0 | Chip BIN mismatch |
| Voltage error | `PVT_V` outside ~290–350 mV (chip) | Voltage domain abnormal |
| Temp sensor error | Single `PVT_T` at floor/ceiling vs neighbours | ASIC temp abnormal |
| No fan | Fan1/Fan2 RPM = 0 | Fan speed error |
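Five of the seven rows reduce to simple threshold checks, which the sketch below applies to an already-parsed snapshot. The dictionary keys (`pvt_t_c`, `pvt_v_mv`, `fan_rpm`, `boards`, `ps_bitmap`) are hypothetical names for illustration, not firmware fields. The voltage window is the approximate A1246 range quoted above, and the "Core test failed" and "Temp sensor error" rows are left out because they need judgement a threshold cannot capture.

```python
def red_led_candidates(snapshot):
    """Return the red-LED causes whose confirming condition holds.

    snapshot is a dict of pre-parsed values; missing keys are skipped.
    """
    causes = []
    if any(temp > 85 for temp in snapshot.get("pvt_t_c", [])):
        causes.append("Toohot")
    boards = snapshot.get("boards")
    if boards is not None and boards < 3:
        causes.append("Loopback failed")
    if snapshot.get("ps_bitmap", 0) != 0:
        causes.append("PG failed")
    if any(mv < 290 or mv > 350 for mv in snapshot.get("pvt_v_mv", [])):
        causes.append("Voltage error")
    if any(rpm == 0 for rpm in snapshot.get("fan_rpm", [])):
        causes.append("No fan")
    return causes


# Hypothetical readings: one hot chip and a stopped second fan.
print(red_led_candidates({"pvt_t_c": [70, 91], "fan_rpm": [4200, 0]}))
# ['Toohot', 'No fan']
```

More than one cause can match at once, which fits the point that the red light alone does not tell you which fault you have.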
The Avalon diagnostic ladder
- Cold-boot 60 s at the PDU: kill AC so bulk caps discharge and the MM state machine flushes. Many `E: no asics!!` and `0x10` events on healthy hardware clear here, with zero spend.
- Pull `estats` on port 4028 and screenshot it: read `SYSTEMSTATU` → `ECMM` → `ECHU` → `MM_STATUS` → `PS` → `PVT_T/V` → `GHS`. This snapshot is what the bench team asks for if the unit ships.
- Read both LEDs to corroborate which subsystem the firmware blames (chassis = MM/hashboard/thermal; AUC = bus/comm/pool).
- 30-minute slot-swap test. Move the flagged board to another slot. Fault follows the board = board-side (chip/paste/ribbon); fault stays in the slot = AUC/ribbon/MM path. This bisection halves diagnostic time on every Avalon ticket.
- Check the environment before the silicon: mains under load (≥ 195 V; a dedicated 240 V circuit beats 120 V), intake ≤ 30 °C, DNS set to `1.1.1.1`/`8.8.8.8`. Then confirm the MM firmware is the last-known-good build before flashing.
- Escalate to ribbon reseat/clean (99% IPA), thermal-pad refresh, then chip-level reflow, only once the slot-swap proves the fault is on the board.
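The slot-swap step in the ladder is a two-way bisection, and writing it down makes the inconclusive case explicit. A minimal sketch with hypothetical names; slot numbers are whatever index your `ECHU` triplet or log uses.

```python
def slot_swap_verdict(old_slot, new_slot, fault_slot_after):
    """Interpret a slot-swap test.

    old_slot:         slot where the flagged board sat when the fault appeared
    new_slot:         slot the flagged board was moved to
    fault_slot_after: slot reporting the fault after the swap (None if gone)
    """
    if fault_slot_after == new_slot:
        return "board-side"   # fault followed the board: chip/paste/ribbon
    if fault_slot_after == old_slot:
        return "slot-side"    # fault stayed put: AUC/ribbon/MM path
    return "inconclusive"     # fault vanished or moved elsewhere: retest


print(slot_swap_verdict(old_slot=0, new_slot=2, fault_slot_after=2))  # board-side
```

An inconclusive result usually means the reseat itself changed something, so repeat the test before escalating to board work.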
DIY vs. the bench
Most LED-only and log-only complaints are a free cold-boot, a reseat, a filter clean, a DNS fix, or a firmware rollback — fully DIY. Stop and book the bench when the fault is in the silicon or control hardware: ECHU bit 9 recurs at the same chip position within 30 days of a reflow; SYSTEMSTATU < 3 survives ribbon reseats and slot-swaps; BOOTBY persists across two firmware builds; PVT_T stays > 85 °C after a fresh pad refresh with clean ambient; or you see scorched traces, bulging electrolytics, or a burnt smell. Reversed PSU connect order (burned U1/U2/R8/R9 power-sequence parts) and cross-connecting a Bitmain APW PSU are bench jobs too. D-Central diagnoses and repairs these in-house — chip replacement, MM/AUC swaps from stocked A11–A13 inventory, 24-hour burn-in.
Book Avalon ASIC repair → · 5–10 business-day turnaround · Laval, Quebec workshop · ships Canada / US / international. Searching a specific string? Use the ASIC Fault Finder. Model specs and tuning: AvalonMiner 1246, A1166 Pro, A1346, A1566.
Sources
- `Canaan-Creative/avalon10-docs`: Canaan’s own API reference; defines `estats`, the `PS` bitmap, `ECHU`, `BOOTBY` (field names only, no meanings).
- Zeus Mining, Avalon backstage log description and the A1246 kernel-log walkthrough: third-party mirror naming `ECHU 128` and the LED states.
- Canaan 721–841 series troubleshooting PDF: the canonical (and only) source for the seven red-LED states.
- D-Central in-house Avalon repair queue (A11–A15), 2023–2026: the basis for the `0x10` prevalence pattern, the `ECHU[N N N]` = electrical-environment rule, and the field-to-fix mapping above.
Last reviewed June 8, 2026.
