FW_ERR Warning

Antminer – Autotune Firmware Crash

Warning (escalates to Critical when the crash loop prevents the miner from staying up long enough to flash clean firmware)

Warning — Should be addressed soon

Affected Models: Antminer S17 · S17 Pro · S17+ · S17e · S19 · S19 Pro · S19j · S19j Pro · S19 XP · S19 XP Hydro · S19k Pro · S19 Pro+ Hydro · S21 · S21 Pro · S21 Hydro · T21 · L7 — any Antminer running a firmware with autotuning enabled (stock Bitmain post-2022-08, Braiins OS+, LuxOS, Vnish, or DCENT_OS).

Symptoms

  • Miner reboots unpredictably — intervals from 3 minutes to 3 hours — with no thermal or fan `ERROR_` code immediately preceding the reboot
  • `kern.log` / `bmminer.log` shows repeated `bmminer: segmentation fault`, `Killed by signal 11`, or `OOM: killed bmminer` lines followed by a respawn
  • Web UI status page flashes "Tuning in progress… x%" then the connection drops; page reload shows the tuning bar reset to 0%
  • Dashboard briefly shows per-chain hashrate ramping up, then goes to zero, then the miner reboots — same pattern every tuning cycle
  • After a stock Bitmain firmware update to a post-`2022-08` auto-tuning build, the miner started this behaviour; the same miner was stable on the previous build
  • On Braiins OS+ / LuxOS / Vnish / DCENT_OS: the dashboard's "Autotuning" indicator is stuck on "Running" for > 60 min, or repeatedly restarts the tuning pass
  • Per-chip HW% (visible on third-party firmware) spikes on a specific chip position at the exact moment the tuning engine touches that chain, then crashes
  • Control-board LED pattern is the normal green flash — hardware is healthy, software is thrashing. This distinguishes autotune crash from a real hardware fault.
  • One hashboard in the rig crashes tuning; the other two complete the pass. Swapping the bad board to a different slot moves the crash with the board.
  • `dmesg` shows `watchdog: BUG: soft lockup - CPU#0 stuck for 22s!` or similar SoC-level lockup lines in the minutes before the reboot
  • The crashes cluster in the first 20-40 minutes after every cold-boot, then (if you're lucky) the miner stabilizes at reduced hashrate for hours before the next crash
  • Pool side: the miner appears, submits a dozen shares, disappears, repeats. Pool dashboard shows a sawtooth online/offline pattern with no offline reason.

Step-by-Step Fix

1

Hard power-cycle at the breaker for 60 seconds, then boot and observe for `45 min` with a stopwatch. Note every reboot and whether the UI reached the "mining" state before it. This establishes a cadence baseline; without it, every subsequent step is guesswork. On S19-class hardware, if the miner never reaches the "mining" state before rebooting, you're watching a tuning-phase crash.
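
If you would rather not do the arithmetic by hand, the stopwatch notes can be summarized with a few lines of Python. This is only a sketch: `summarize_cadence` and its field names are made up for illustration, and the "never reached mining" rule simply restates the heuristic above.

```python
from datetime import datetime, timedelta
from statistics import median


def summarize_cadence(boot_time, observations):
    """Summarize hand-noted reboots.

    observations: list of (reboot_time, reached_mining) tuples.
    """
    ordered = sorted(observations, key=lambda o: o[0])
    events = [boot_time] + [t for t, _ in ordered]
    # Gap in minutes between each event and the one before it.
    gaps_min = [(b - a).total_seconds() / 60 for a, b in zip(events, events[1:])]
    return {
        "reboots": len(ordered),
        "median_gap_min": median(gaps_min) if gaps_min else None,
        # Never reaching "mining" before any reboot points at the tuning phase.
        "tuning_phase_suspected": bool(ordered) and not any(m for _, m in ordered),
    }


if __name__ == "__main__":
    boot = datetime(2024, 1, 1, 10, 0)
    notes = [
        (boot + timedelta(minutes=6), False),
        (boot + timedelta(minutes=13), False),
        (boot + timedelta(minutes=19), False),
    ]
    print(summarize_cadence(boot, notes))
```

Keep the summary with your notes; comparing the median gap before and after each later step shows whether that step changed anything.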

2

Disable autotune in the dashboard if the build supports it. Bitmain stock post-`2022-08` exposes an "Auto-tuning" toggle on the Miner Config page; Braiins OS+ / LuxOS / Vnish / DCENT_OS all expose autotune on/off in their tuning sections. Set to "Manual" and apply a known-good fixed-frequency profile. Watch for `30 min` — if it stabilizes, you've confirmed the crash is autotune-specific and can tune back up manually in Tier 2.

3

Factory reset the miner. On web UI: System → Reset to Factory Defaults. Or physical: hold the reset button `5 s` inside the 2-10 minute post-boot window per Bitmain's published procedure. Factory reset wipes the NVRAM tune table; a corrupted tune table is the fastest and cheapest cause to eliminate before you chase hardware. Wait `4 min` for the auto-restart. Re-observe cadence.

4

Clean intake filters and verify ambient. Dust plus warm ambient air pushes the tune-phase thermal burst further over spec, and the tuner then reads the chips as unstable and crashes. Shop-vac the filter, wipe the intake grille, and verify inlet air with an IR thermometer at the grille (not room-middle): `≤ 35 °C` standard, `≤ 40 °C` Hydro. Pay special attention to the first `15 cm` in front of the miner: furniture, curtains, and vertical stacking on a shelf all choke intake.
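
A small sketch of the inlet check, assuming you take several IR readings across the grille and judge by the hottest one. The limits are the ones quoted above; the function name and the "use the maximum" rule are illustrative choices, not a Bitmain specification.

```python
# Inlet limits quoted in this step, in degrees Celsius.
INLET_LIMIT_C = {"standard": 35.0, "hydro": 40.0}


def inlet_within_spec(readings_c, cooling="standard"):
    """Return True if the hottest grille reading is at or below the limit."""
    if not readings_c:
        raise ValueError("need at least one reading")
    return max(readings_c) <= INLET_LIMIT_C[cooling]


if __name__ == "__main__":
    print(inlet_within_spec([31.5, 34.0]))       # air-cooled unit
    print(inlet_within_spec([38.0], "hydro"))    # Hydro unit
```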

5

Roll one firmware version back or forward. Bitmain's Download Center — `support.bitmain.com/downloads` — lists builds per hardware model. If you're on a post-`2022-08` stock auto-tune build and the miner was fine on the previous build, roll back. If you're on an old build, roll forward by one version. A specific firmware regression is a real cause; don't skip this before opening the chassis.

6

SSH in and pull the kern.log and bmminer.log. `ssh root@<miner-ip>` (default password `root` on most stock Bitmain firmware post-2020; check the sticker if unsure). Run `dmesg | tail -100` and `tail -200 /var/log/bmminer.log`. Save them. Grep for `segfault`, `segmentation fault`, `Killed by signal`, `OOM`, and `soft lockup` (the kernel prints it as `watchdog: BUG: soft lockup`, so search for the shorter string). The log you pull here is the evidence you'll hand to D-Central if you end up shipping the miner; it saves diagnostic time, which saves you money.
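
Once the logs are saved locally, counting the crash signatures is easy to script. The sketch below assumes the log text has already been copied off the miner; the pattern set mirrors the strings named in this guide, and `count_signatures` is a hypothetical helper, not part of any firmware.

```python
import re
from collections import Counter

# Patterns taken from the strings quoted in the symptoms list and this step.
SIGNATURES = {
    "segfault": re.compile(r"segfault|segmentation fault", re.IGNORECASE),
    "killed_by_signal": re.compile(r"Killed by signal"),
    "oom": re.compile(r"\bOOM\b"),
    "soft_lockup": re.compile(r"soft lockup"),
}


def count_signatures(log_text):
    """Count log lines matching each crash signature."""
    counts = Counter()
    for line in log_text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "bmminer: segmentation fault\n"
        "watchdog: BUG: soft lockup - CPU#0 stuck for 22s!\n"
    )
    print(dict(count_signatures(sample)))
```

A log dominated by one signature is worth mentioning in the note you send along with the miner.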

7

Measure PSU output under load. Set a multimeter to DC and probe the PSU→board 6-pin connector while the miner is attempting to hash at full power (there may only be a brief window before the crash). Expect `≥ 13.8 V` sustained on an S19, `≥ 12.8 V` on an S9, `≥ 14.2 V` on an S19 XP. Sag below the target means the PSU is tired or the circuit is undersized. Swap the PSU with a known-good unit (borrow one from a healthy miner for 30 min to confirm).
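
As a rough sketch, the voltage floors above can be turned into a pass/fail check on the readings you jot down. The table holds only the three values quoted in this step; treating any single reading below the floor as sag is an assumption made here for simplicity.

```python
# Minimum sustained voltages quoted in this step, keyed by model.
MIN_VOLTS = {"S19": 13.8, "S9": 12.8, "S19 XP": 14.2}


def psu_sagging(model, readings_v):
    """Return True if any under-load reading falls below the model's floor."""
    if not readings_v:
        raise ValueError("need at least one reading")
    return min(readings_v) < MIN_VOLTS[model]


if __name__ == "__main__":
    print(psu_sagging("S19", [14.0, 13.9, 13.5]))
```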

8

Re-seat every hashboard cable. Power off at the breaker. Disconnect each hashboard's data ribbon and power connector, visually inspect for blackening, corrosion, or bent pins, then reconnect firmly. Listen for the click. Check both ends of each ribbon — the control board side is the one users forget. A flaky data connection during tune reads as garbage chip response, which the tuner handles poorly.

9

Swap hashboards between slots. Label the three slots `0/1/2` with tape. Move the suspect board to a known-good slot. If the crash follows the board, the board has the marginal chip or domain. If the crash stays in the slot, the control board / cable / PIC is suspect. This `10-minute` test is the difference between a Tier 3 board-level repair and a Tier 3 control-board job — don't skip it.
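
The decision rule in this step is small enough to write down as code, which can help when you are logging results for several miners. This is an illustrative sketch; the function and its return labels are invented here.

```python
def interpret_swap(original_slot, new_slot, crashing_slot_after):
    """Interpret a slot-swap test.

    original_slot / new_slot: where the suspect board sat before and after.
    crashing_slot_after: the slot that crashes after the swap, or None.
    """
    if original_slot == new_slot:
        raise ValueError("the suspect board must move to a different slot")
    if crashing_slot_after == new_slot:
        return "board"  # crash followed the board: marginal chip or domain
    if crashing_slot_after == original_slot:
        return "slot"   # crash stayed put: control board, cable, or PIC
    return "inconclusive"


if __name__ == "__main__":
    print(interpret_swap(original_slot=1, new_slot=0, crashing_slot_after=0))
```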

10

Apply a known-good fixed-frequency profile in place of autotune. On stock Bitmain: select the stock profile for your model (e.g. `S19-110T`, `S19 Pro-110T`, `S19j Pro-100T`). On Braiins OS+ / LuxOS / Vnish / DCENT_OS: load the default fixed-frequency profile for your model, not the autotuning one. Mine for `60 min` and observe. If it stays up, you've functionally fixed the problem while you decide whether to chase the autotune crash further.

11

Cross-flash to a firmware with per-chip diagnostics. Our pick: DCENT_OS, D-Central's own open-source Antminer firmware, built by the Mining Hackers and published under DCentralTech/DCENT_OS on GitHub. It offers the per-chip HW%, manual + autotune profiles, Stratum V2, and live tuning telemetry you'd get from commercial third-party firmware. Alternatives: Braiins OS+, LuxOS, Vnish `1.2.x+`. All four expose per-chip autotune state, which is the single most valuable diagnostic upgrade you can make on a crash-looping Antminer. Flash via the stock firmware's upload page or, on later S19 XP / S21 eMMC-only boards, via the vendor's recovery procedure (these boards do not support SD card recovery; see step 14). Let the miner stabilize `20 min` before inspecting the per-chip dashboard.

12

Disable the worst-performing chip position(s). On DCENT_OS, Braiins OS+, LuxOS, and Vnish you can disable a specific chip index from the dashboard (Braiins OS+: Advanced → Per-chip; DCENT_OS: Tuning → Chip Table). Disable the chip(s) the tune pass kept choking on. Rerun tune. You'll lose `~0.9 TH/s` per disabled chip on S19-class, `~0.6 TH/s` on S17-class — worth it to keep the rig hashing until you have time to reflow.
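
To see what disabling chips costs before you commit, here is a back-of-the-envelope estimate using the per-chip figures above. The nameplate value is whatever your model is rated for; the helper name and the flat per-chip loss are simplifying assumptions.

```python
# Approximate hashrate lost per disabled chip, from the figures in this step.
LOSS_PER_CHIP_THS = {"S19": 0.9, "S17": 0.6}


def estimated_hashrate(nameplate_ths, disabled_chips, family):
    """Estimate remaining TH/s after disabling a number of chips."""
    if disabled_chips < 0:
        raise ValueError("disabled_chips cannot be negative")
    return max(0.0, nameplate_ths - disabled_chips * LOSS_PER_CHIP_THS[family])


if __name__ == "__main__":
    # Example: a 110 TH/s S19-class unit with two chips disabled.
    print(estimated_hashrate(110, 2, "S19"))
```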

13

Reflow the outlier chip. If one or two chip positions dominate autotune crashes, reflow is the lowest-risk chip-level repair on BM1398/BM1362/BM1368. Remove the heatsink, flux the BGA, preheat the board bottom-side to `~150 °C`, then apply top-side hot air at `310-330 °C` for `~30 s`. Let it cool naturally, re-apply thermal paste (Arctic MX-6 or Thermal Grizzly Kryonaut), and reassemble. Results hold `6-18 months` most of the time; sometimes the fix is permanent.

14

SD card recovery on Amlogic control boards. S17/S19-class miners with Amlogic control boards support SD card firmware recovery: format a `≤ 16 GB` microSD as `FAT32`, copy the official firmware image (`support.bitmain.com/downloads`) to the root, insert with miner off, hold the IP/reset button while powering on, release after `~10 s`. Wait `4-6 min`. Use this when the autotune crash loop prevents the web UI from staying up long enough to re-flash normally. S19 XP and S21 eMMC-only boards don't support this path — they need the vendor's USB recovery tool.

15

UART serial recovery (terminal-only). If SD card recovery fails or the board is eMMC-only and the web UI is dead, open the chassis and attach a `3.3 V` USB-TTL serial adapter to the control board's UART header (pinout varies by control-board revision — BHB42 vs BHB56 vs BHB68, check the silkscreen). Baud typically `115200 8N1`. Catch `u-boot` on power-on, halt autoboot, and recover via TFTP. This is the last stop before Tier 4; if you're unsure about the pinout or the `u-boot` commands, stop and ship it — a wrong command here overwrites the bootloader.

16

Stop DIY when any of these are true: per-chip autotune crashes isolate the same chip *position* across two different hashboards (PCB-level fault, not chip-level); you see visible heat damage, capacitor bulging, or a burnt-component smell; a reflow has already failed once and the crash returned within `30 days`; UART recovery fails or the bootloader is suspected corrupt. You're now in test-fixture territory. [Book a D-Central ASIC Repair slot.](https://d-central.tech/services/asic-repair/)
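
The stop conditions above can also be kept as a simple checklist. The sketch below encodes the four conditions as boolean fields and lists the ones that apply; the class and field names are hypothetical, chosen only to mirror the wording of this step.

```python
from dataclasses import dataclass, fields


@dataclass
class Findings:
    same_chip_position_on_two_boards: bool = False
    visible_heat_damage_or_burnt_smell: bool = False
    reflow_failed_within_30_days: bool = False
    uart_recovery_failed_or_bootloader_suspect: bool = False


def stop_reasons(findings):
    """Return the names of every stop condition that is true."""
    return [f.name for f in fields(findings) if getattr(findings, f.name)]


if __name__ == "__main__":
    mine = Findings(reflow_failed_within_30_days=True)
    reasons = stop_reasons(mine)
    print("Stop DIY:" if reasons else "Keep going:", reasons)
```

Any non-empty list means it is time to stop and ship.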

17

What D-Central does at the bench. Test fixture with programmable load, per-chip isolation using official Bitmain test binaries, chip replacement with graded salvaged or new-old-stock BM1398/BM1362/BM1368, PMIC / voltage-domain IC replacement where needed, control-board eMMC re-image via JTAG for eMMC-only boards, full reflow and re-seal, post-repair `24-hour` burn-in at nameplate with autotune enabled to confirm the fix survives the exact workload that killed it.

18

Ship safely. Pack hashboards in anti-static bags, double-box with `≥ 5 cm` foam on every side. Include: (a) the kern.log and bmminer.log you pulled in step 6, (b) which firmware version was running when it crashed, (c) what you've already tried (Tiers 1-3), (d) your contact info. A note like "crashes on autotune pass `14/342`, chip index `47` on board 1 flagged red in DCENT_OS" saves us 2 hours of diagnostic time, and that's real money off your invoice.

When to Seek Professional Repair

If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.

Still Having Issues?

Our team of Bitcoin Mining Hackers has been repairing ASIC miners since 2016. We have seen it all and fixed it all. Get a professional diagnosis.