Antminer – CGMiner / BMMiner Crash
Warning — Should be addressed soon
Symptoms
- Fans continue spinning at normal or slightly-elevated RPM, but realized hashrate reads `0 GH/s` on dashboard and pool side
- Web UI `Miner Status` page loads but shows "cgminer not running" / "bmminer not started" / a blank hashrate chart
- Pool worker drops offline abruptly with no network error on the pool side and no reconnect attempts for > `60 s`
- `kern.log` / `bmminer.log` shows `bmminer: segmentation fault`, `cgminer: Killed by signal 11`, `OOM: killed bmminer`, or `bmminer exited with signal 9`
- Daemon restarts itself via watchdog but crashes again within `30 s – 5 min` — classic crash loop
- Crash behaviour started right after a firmware update, pool change, or OC profile change; same miner was stable before
- Control-board LED pattern is the normal green flash — hardware diagnostics look healthy, the crash is software-layer only
- Reboot at the breaker "fixes" it for minutes to hours, then it crashes again at the same place in the mining loop
- `dmesg` shows repeated `watchdog: BUG: soft lockup - CPU#0 stuck for 22s!` or `oom-killer invoked` lines before the daemon died
- On S19 Pro deployments of more than a dozen units: multiple miners crash or restart simultaneously, the "20-miner restart batch" pattern
- Multiple miners on the same breaker / PDU crash at the same time each evening, clustered `18:00 – 22:00` local — line-voltage sag
- Temperature, fan RPM, and PSU rail voltage all read normal in the last-good state captured before the crash
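Two of the symptoms above are timing patterns you can check from a list of crash timestamps: the crash loop (repeat crashes within `5 min`) and the evening cluster (`18:00 – 22:00` local). A minimal sketch, assuming you have already pulled crash times out of your logs or pool dashboard. The function names and the `min_repeats=3` threshold are illustrative assumptions, not something the firmware defines.

```python
from datetime import datetime, timedelta

def is_crash_loop(crash_times, max_gap=timedelta(minutes=5), min_repeats=3):
    """Return True when `min_repeats` consecutive gaps between crashes are <= `max_gap`."""
    times = sorted(crash_times)
    streak = 0
    for earlier, later in zip(times, times[1:]):
        if later - earlier <= max_gap:
            streak += 1
            if streak >= min_repeats:
                return True
        else:
            streak = 0  # a long quiet period breaks the loop
    return False

def evening_fraction(crash_times, start_hour=18, end_hour=22):
    """Fraction of crashes whose local hour falls in [start_hour, end_hour)."""
    if not crash_times:
        return 0.0
    hits = sum(1 for t in crash_times if start_hour <= t.hour < end_hour)
    return hits / len(crash_times)
```

A high evening fraction across several miners on the same breaker points at line-voltage sag rather than at any single unit.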
Step-by-Step Fix
Restart the mining daemon cleanly via the web UI: `Miner Status → Restart cgminer` / `Restart bmminer`. Wait `2 min` and re-check hashrate. If the restart button isn't exposed on your firmware, reboot the miner via `System → Reboot`. This is not a power cycle — that's the next step. A large fraction of one-off `bmminer` crashes don't come back after a single clean daemon restart; it costs you nothing to try first.
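If you'd rather confirm the restart from a workstation than refresh the web UI, the cgminer-style API can report hashrate. A rough sketch, assuming the API is enabled and reachable on TCP port `4028` and that the `summary` reply carries a `GHS 5s` or `GHS av` field. Both vary by firmware, so treat the field names and the helper names as assumptions to verify on your unit.

```python
import json
import socket

def query_summary(host, port=4028, timeout=5.0):
    """Send the `summary` command and return the raw reply bytes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(json.dumps({"command": "summary"}).encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the daemon closes the socket after replying
                break
            chunks.append(chunk)
    return b"".join(chunks)

def hashrate_ghs(raw):
    """Extract a GH/s figure from a summary reply (field names are assumed)."""
    text = raw.decode("utf-8", errors="replace").rstrip("\x00").strip()
    summary = json.loads(text)["SUMMARY"][0]
    for key in ("GHS 5s", "GHS av"):
        if key in summary:
            return float(summary[key])
    raise KeyError("no GHS field in summary reply")
```

A refused connection on the API port is itself a hint that the daemon is down. A reply with `0` GH/s two minutes after the restart means the restart didn't take.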
Hard power-cycle at the breaker for `30 s`. A soft reboot doesn't fully clear kernel driver state; pulling AC for 30 seconds does. Re-apply power, wait `4 min` for full boot, re-check hashrate and the web UI status page. If the original crash was a wedged kernel driver — rare but real on older S17/S19 control boards running post-update firmware — this clears it. If the crash returns within minutes of boot, skip ahead to the log capture in step 6.
Revert to a stock profile — no OC, no UV. In the Miner Config page, remove any custom tune / OC / UV settings and apply the default profile for your exact model (e.g. `S19-110T`, `S19 Pro-110T`, `S19j Pro-100T`). Apply, reboot, watch `30 min`. If the crash stops, your tuning was the cause — rebuild it much more slowly in Tier 2 (step 10). If it persists, tuning wasn't the cause; continue.
Verify intake airflow and ambient. IR thermometer at the front grille — not room-middle. Target `≤ 35 °C` standard, `≤ 40 °C` Hydro. Dust on the filters, furniture in front of the intake, or a closet stack with no exhaust can push the miner into thermal territory that destabilizes the daemon without tripping a hard thermal alarm. Shop-vac the filters, clear the first `15 cm` of intake space.
Check Bitmain's Download Center (`support.bitmain.com/downloads`) for a firmware update or a clean-reflash candidate. If you're on a known-buggy build — particularly the early post-`2022-08` stock auto-tune builds on S19 Pro or the initial S19 XP auto-tune builds — roll one version back or forward. Always verify the build matches your exact control-board revision (check the silkscreen: BHB42 / BHB56 / BHB68) before flashing; wrong firmware for a late-revision board bricks the control board.
SSH in and capture logs before they rotate. `ssh root@<miner-ip>` with the stock default password (`root` on most stock Bitmain firmware post-2020; confirm on the unit sticker). Run `dmesg | tail -200`, `tail -500 /var/log/bmminer.log`, and `grep -Ei 'segfault|killed by signal|oom|signal 11|signal 9|watchdog|soft lockup|abort|panic' /var/log/bmminer.log /var/log/kern.log`. Save the output to a text file on your workstation — this is the single most important evidence for every subsequent step, and the log rotates on every daemon respawn.
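Once the capture is saved on your workstation, a quick tally of which signature dominates helps decide where to go next. A small sketch that reuses the same patterns as the `grep` above. The bucket labels and the first-match-wins ordering are my own grouping, not anything the firmware reports.

```python
import re
from collections import Counter

# Buckets are checked in order; a line is counted once, under its first match.
SIGNATURES = {
    "segfault": r"segfault|segmentation fault|signal 11",
    "oom": r"oom",
    "killed": r"signal 9|killed by signal",
    "lockup": r"watchdog|soft lockup",
    "abort_panic": r"abort|panic",
}
COMPILED = {label: re.compile(p, re.IGNORECASE) for label, p in SIGNATURES.items()}

def classify(lines):
    """Count crash signatures in saved log lines."""
    counts = Counter()
    for line in lines:
        for label, pattern in COMPILED.items():
            if pattern.search(line):
                counts[label] += 1
                break
    return counts
```

Usage: `classify(open("capture.txt", errors="replace"))`, where `capture.txt` is whatever filename you saved the output under. As with the `grep`, the bare `oom` pattern can match unrelated words, so read the matching lines before drawing conclusions.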
Measure PSU output under load. Multimeter on DC, probes directly at the PSU-to-board 6-pin connector while the miner is attempting to hash at full power. For crash-looping units, use the `30 s – 5 min` window each cycle. Expect `≥ 13.8 V` sustained on an S19, `≥ 12.8 V` on an S9 / L3+, `≥ 14.2 V` on an S19 XP. Sag below target means PSU is tired or the circuit is undersized. Borrow a known-good PSU from a healthy miner for `30 min` to confirm before buying a replacement.
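If you log several multimeter readings during the hashing window, the comparison against the floors above is easy to script. A minimal sketch; the table simply restates the thresholds from this step, and the function name is illustrative.

```python
# Minimum sustained DC voltage under load, per the thresholds in this step.
MIN_RAIL_VOLTS = {
    "S19": 13.8,
    "S9": 12.8,
    "L3+": 12.8,
    "S19 XP": 14.2,
}

def psu_sagging(model, readings):
    """Return True if any reading taken under load dips below the model's floor."""
    if not readings:
        raise ValueError("need at least one reading taken under load")
    return min(readings) < MIN_RAIL_VOLTS[model]
```

Using the lowest reading rather than the average is deliberate: a brief dip is what kills the daemon, and an average hides it.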
Measure line voltage at the panel under load. Multimeter on AC, at the breaker or a receptacle on the same circuit with the miner pulling full power. `235 – 245 V` is healthy on `240 V` split-phase; below `230 V` under load means the circuit is overloaded or undersized, and the PSU has to draw more current to deliver the same power. On `208 V` commercial expect `202 – 212 V`. Low line voltage is dedicated-circuit or electrician territory; this is the single most-diagnosed "mystery crash" cause in D-Central's queue.
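The same ranges as a small lookup, useful if you're logging readings from several circuits. A sketch only: the ranges are the ones stated in this step, and anything the step doesn't classify (for example `230 – 235 V` on a `240 V` circuit) is returned as `"unclassified"` rather than guessed at.

```python
def classify_line_voltage(nominal, measured):
    """Classify an AC reading taken under load against this step's ranges."""
    if nominal == 240:
        if 235 <= measured <= 245:
            return "healthy"
        if measured < 230:
            return "low"
        return "unclassified"  # the step gives no verdict for this band
    if nominal == 208:
        if 202 <= measured <= 212:
            return "healthy"
        return "outside expected range"
    raise ValueError("only 240 V split-phase and 208 V commercial are covered")
```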
Re-seat every hashboard cable. Power off at the breaker first. Disconnect each hashboard's data ribbon and power connector, visually inspect for blackening, corrosion, or bent pins, then reconnect firmly. Listen for the click. Check both ends of each ribbon — the control-board side is the side users forget. A flaky data ribbon produces intermittent garbage on the chip-communication bus, which is a classic "daemon crashes with nothing obvious in the log" trigger.
Rebuild OC profile slowly from stock. Once the stock profile from step 3 is clean and stable, add frequency `+100 MHz` at a time. Let each step run `15 min` and confirm the daemon is alive and hashrate is stable before moving on. Stop one step *before* HW% crosses `2%` or before the daemon shows any instability. That's your specific miner's silicon-lottery ceiling: it's per-unit, and no two S19s of the same model share it. Write the working numbers down so you can re-apply them after future firmware changes.
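The stepping rule can be written as a short loop. This is a sketch of the procedure, not a tuning tool: `measure` is a hypothetical callback that you supply, standing in for "apply the frequency, wait 15 min, read HW% and check the daemon is still up". The `max_steps` cap is an added safety assumption.

```python
def find_ceiling(stock_mhz, measure, step_mhz=100, hw_limit_pct=2.0, max_steps=10):
    """Return the last frequency that stayed stable with HW% at or below the limit.

    measure(freq_mhz) must return (hw_pct, daemon_stable) after the soak period.
    """
    last_good = stock_mhz
    for i in range(1, max_steps + 1):
        candidate = stock_mhz + i * step_mhz
        hw_pct, daemon_stable = measure(candidate)
        if hw_pct > hw_limit_pct or not daemon_stable:
            break  # this step failed, so fall back to the previous one
        last_good = candidate
    return last_good
```

Returning the previous step rather than the failing one is the point of the rule: the ceiling is the highest setting that survived a full soak.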
Cross-flash to DCENT_OS (D-Central's own open-source Antminer firmware — Mining Hackers' pick) via the stock firmware's upload page. Per-chip HW%, tuning, autotuning, Stratum V2, live telemetry, full stack traces on crashes — the diagnostics Bitmain hides. Alternatives: Braiins OS+, LuxOS, Vnish `1.2.x+`. On later S19 XP / S21 eMMC-only control boards, flash via the vendor's SD or USB recovery procedure. Let the miner stabilize `20 min`, then inspect the per-chip dashboard and daemon uptime. Source on GitHub under `DCentralTech/DCENT_OS`.
Disable the worst-performing chip position(s). If per-chip diagnostics under DCENT_OS / Braiins OS+ / LuxOS / Vnish show one or two chip indices with severely elevated HW% or repeated communication failures, disable them from the tuning page. You'll lose `~0.9 TH/s` per disabled chip on S19-class or `~0.6 TH/s` on S17-class. That's worth it: the remaining chips keep hashing cleanly and the daemon is no longer poisoned by garbage from a dying chip.
Reflow the outlier chip. If per-chip diagnostics isolate the same chip position repeatedly, reflow is the lowest-risk chip-level repair on BM1398 / BM1362 / BM1368 BGA packages. Remove the heatsink, flux the BGA, preheat bottom-side to `~150 °C`, top-side hot air at `310 – 330 °C` for `~30 s`. Let cool naturally, re-apply thermal paste (Arctic MX-6 or Thermal Grizzly Kryonaut), reassemble. Results hold `6 – 18 months` typically; occasionally permanent. A second reflow on the same chip rarely helps — silicon is failing, not solder.
SD-card firmware recovery on Amlogic control boards. S17 / S19-class miners with Amlogic control boards support SD recovery: format a `≤ 16 GB` microSD card as `FAT32`, copy the official firmware image from `support.bitmain.com/downloads` to the root, insert with the miner off, hold the IP/reset button while powering on, release after `~10 s`. Wait `4 – 6 min`. Use this path when the daemon crash-loop prevents the web UI from staying up long enough for a normal flash. S19 XP and S21 eMMC-only boards need the vendor's USB recovery tool instead.
Roll firmware to the last-known-good version for your specific hardware revision. If a specific build introduced the crash, downgrade. Check your exact control-board revision (BHB42 / BHB56 / BHB68 on the silkscreen) against Bitmain's hardware table before flashing — wrong firmware for a late-revision board bricks the control board. Cryptographic signing on newer Bitmain firmware can also block downgrades without an SD-card flash path; if that happens, use SD recovery from step 14.
Stop DIY and book a D-Central repair slot when any of these is true: (a) per-chip diagnostics isolate the same failing chip position across two different hashboards (PCB-level fault); (b) you've reflowed the outlier chip once and the crash returned inside `30 days`; (c) `dmesg` shows `mmcblk0: I/O error` or `ext4-fs error` (storage failing); (d) bootloader (`u-boot`) is suspected corrupt or UART-level recovery failed; (e) visible heat damage, capacitor bulging, or burnt-component smell. Test-fixture territory. https://d-central.tech/services/asic-repair/
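Criterion (c) is the one you can check mechanically from the `dmesg` capture. A minimal sketch that searches for the two strings named above, case-insensitively since kernel messages often capitalize them differently. The function name is illustrative, and the exact wording of storage errors varies by kernel, so an empty result doesn't prove the eMMC is healthy.

```python
import re

# The two storage-failure strings named in criterion (c).
STORAGE_ERRORS = re.compile(r"mmcblk0: I/O error|ext4-fs error", re.IGNORECASE)

def storage_error_lines(dmesg_text):
    """Return the dmesg lines that match a storage-failure signature."""
    return [line for line in dmesg_text.splitlines() if STORAGE_ERRORS.search(line)]
```

Any hit is reason enough to stop flashing firmware: repeated writes to failing storage tend to make recovery harder.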
What D-Central does at the bench. Test fixture with programmable load, per-chip isolation using official Bitmain test binaries, chip replacement with graded salvaged or new-old-stock BM1398 / BM1362 / BM1368, PMIC and voltage-domain IC replacement where the crash traced to power-rail collapse, eMMC re-image via JTAG for eMMC-only boards, full reflow and re-seal, `24-hour` post-repair burn-in at nameplate to confirm the fix survives the exact workload that killed it. Turnaround `5 – 10 business days`, Canada-wide, US/international welcomed.
Ship safely with evidence. Pack hashboards and control boards in anti-static bags, double-box with `≥ 5 cm` of foam on every side. Include: (a) the `bmminer.log` / `kern.log` / `dmesg` captures from step 6, (b) which firmware version was running when it crashed, (c) the specific cause class you narrowed down, (d) what you've already tried across Tiers 1 – 3. A note like "crash loop, `OOM: killed bmminer` every `45 min`, confirmed not PSU/ambient, DCENT_OS flash didn't help, suspect eMMC wear" saves the bench two hours of diagnostics, which is real money off your invoice.
When to Seek Professional Repair
If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.
Still Having Issues?
Our team of Bitcoin Mining Hackers has been repairing ASIC miners since 2016. We have seen it all and fixed it all. Get a professional diagnosis.
