IceRiver Self-Check Fail at Boot / Stuck in Boot Loop
Warning — Should be addressed soon
Symptoms
- PSU fan spins, all four front-panel LEDs (`D1`-`D4`) cycle through their normal startup sequence at power-on
- Web UI at `http://<miner-ip>/` becomes briefly reachable (5-30 seconds) between reboot cycles
- Miner accepts a DHCP lease — pingable, `arp` shows the MAC, 'Detect IP' tool finds the unit
- Reboot cycle repeats every `60-180 seconds` indefinitely, never reaching the steady-state hashing screen
- Hashrate display shows `0 GH/s` or `0 TH/s` even during the brief web-UI windows
- Visible error code in the brief web-UI window in the `300`-`302`, `530`, `540`, `550`, `710`-`712`, or `800`-`802` range
- Log lines mentioning `chip count mismatch`, `chip id read failure`, `temp sensor abnormal`, `firmware signature mismatch`, or `hashboard not found`
- Behaviour started immediately after a firmware update (OTA or SD-card flash)
- Behaviour started after physical work — chassis open, hashboard reseat, fan replacement, PSU swap
- Behaviour started after a thermal event — summer ambient spike, fan stall, dust-blocked intake, heat-soaked rack
- Behaviour started after a power event — surge, brownout, dirty residential AC during a storm
- Chassis fans do not ramp past idle / barely audible — firmware never reaches the `enable hashboards` stage
- One specific hashboard sounds different (faint relay click, faint coil whine) for a moment before reboot
- Unit was running clean for months / years and entered the loop with no apparent trigger (silent hashboard chip drift)
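The reachability and cycle-timing symptoms above can be logged instead of eyeballed. The following is a minimal sketch, assuming you record your own `(seconds, reachable)` samples from a ping or HTTP probe; the function names and the `60`-`180` second defaults are illustrative and are not part of any IceRiver tooling.

```python
from typing import List, Tuple


def reboot_intervals(samples: List[Tuple[float, bool]]) -> List[float]:
    """Return the seconds between successive moments the unit became reachable.

    samples: (timestamp_seconds, reachable) pairs in chronological order.
    """
    starts = []
    previous = False  # treat the unit as unreachable before the first sample
    for timestamp, reachable in samples:
        if reachable and not previous:
            starts.append(timestamp)  # unreachable -> reachable transition
        previous = reachable
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


def looks_like_boot_loop(intervals: List[float],
                         low: float = 60.0,
                         high: float = 180.0,
                         min_cycles: int = 2) -> bool:
    """True if at least min_cycles intervals fall inside the low..high window."""
    in_window = [i for i in intervals if low <= i <= high]
    return len(in_window) >= min_cycles
```

Feeding it a few minutes of samples gives a cycle time you can quote in a repair note rather than an estimate.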
Step-by-Step Fix
Cold-boot at the breaker: power off at the wall or breaker, wait `60 seconds`, power back on. Hard cycles clear cached self-check state from the previous boot. About 15% of self-check loops are transient I²C glitches that one cold cycle clears. If the loop stops, soak-test for `2 hours` at full hashrate before declaring the unit fixed.
Capture every error code visible during the brief web-UI windows. Hit refresh on `http://<miner-ip>/` repeatedly during the first `3 minutes` after power-on. Screenshot any code, log line, or visible error string. Codes in the `300`-`302`, `350`-`352`, `530`-`550`, `710`-`712`, and `800`-`802` ranges each map to a specific fault and immediately narrow the diagnostic tree.
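Refreshing by hand is easy to mistime, so the capture can be scripted. This is a rough sketch under several assumptions: that the page body is served as plain HTML without a login or JavaScript rendering, that codes appear as standalone three-digit numbers, and that `530`-`550` is a contiguous range. None of these is confirmed by IceRiver documentation, and all function names are hypothetical.

```python
import http.client
import re
import time
import urllib.request

# Ranges as listed in the step above; treating 530-550 as contiguous is an assumption.
CODE_RANGES = [(300, 302), (350, 352), (530, 550), (710, 712), (800, 802)]


def extract_codes(page_text):
    """Return sorted standalone 3-digit numbers that fall inside a listed range."""
    found = set()
    for match in re.finditer(r"(?<!\d)(\d{3})(?!\d)", page_text):
        code = int(match.group(1))
        if any(low <= code <= high for low, high in CODE_RANGES):
            found.add(code)
    return sorted(found)


def fetch_page(url, timeout=2.0):
    """Return the page body as text, or None while the miner is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        return None  # refused, timed out, or dropped mid-reboot


def capture_codes(url, duration_s=180, interval_s=2.0):
    """Poll url for duration_s seconds and collect every matching code seen."""
    seen = set()
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        body = fetch_page(url)
        if body is not None:
            seen.update(extract_codes(body))
        time.sleep(interval_s)
    return sorted(seen)
```

Calling `capture_codes("http://192.0.2.10/")` right after power-on (substitute your miner's address) polls for three minutes. A bare number match can produce false positives, so check any hit against a screenshot.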
Move the unit to a clean, cool spot: pull it from any cramped rack, dusty corner, or hot attic. Bench it in `≤ 22 °C` ambient with `30 cm` clearance on every side, then repeat the cold boot. If self-check now passes, the original 'self-check fail' was a thermal-cliff false alarm and the real problem is intake restriction, dust, or overheating, not boot-stage integrity.
With the unit powered off, gently vacuum the intake fans and filters with a shop-vac, working from the intake side. Dust-clogged intakes raise the air temperature at the hashboards enough to trip sensor self-checks intermittently. This is the most common Tier 1 fix on KS units 18+ months into service in dusty residential environments.
Verify nothing else changed: did you flash firmware recently? Roll back to the previous version via the official `iceriver.io/downloads` bundle for your exact hardware variant. Did you swap a PSU or reseat anything? Restore the original. Did you move the unit? Inspect for partial-seating from transit vibration. The loop started for a reason — find what changed before reaching for tools.
Hashboard reseat: power off at the breaker. Open the chassis (back-panel Phillips screws). Each hashboard has a power connector and a data flex cable to the control board. Disconnect both, inspect contacts under good light for blackening / corrosion / bent pins, reseat firmly until each connector clicks. Re-run cold boot. About 10% of self-check loops on units that have shipped or moved recently are partial-seating from vibration.
Hashboard isolation test: with the chassis open and power off, disconnect all but one hashboard (`HB0`). Power on with only `HB0` connected and observe self-check behaviour. Repeat for `HB1`, then `HB2`. If one board fails alone, move a board that passed into that slot and retest: a board that fails wherever it sits is the faulty board, while a slot that fails regardless of which board is in it points at that slot's cabling or the control board. Document which is which before reassembly.
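The board-versus-slot reasoning in the isolation test can be written down as a small bookkeeping helper. This is only a sketch: it assumes you record each single-board boot as a `(board, slot, passed)` tuple, and the function name and labels are hypothetical.

```python
from collections import defaultdict


def locate_fault(results):
    """Summarise single-board boot results.

    results: iterable of (board, slot, passed) tuples, one per test boot.
    Returns (suspect_boards, suspect_slots): boards that failed in every
    boot they took part in, and slots in which every tested board failed.
    """
    by_board = defaultdict(list)
    by_slot = defaultdict(list)
    for board, slot, passed in results:
        by_board[board].append(passed)
        by_slot[slot].append(passed)
    suspect_boards = sorted(b for b, outcomes in by_board.items() if not any(outcomes))
    suspect_slots = sorted(s for s, outcomes in by_slot.items() if not any(outcomes))
    return suspect_boards, suspect_slots
```

If a run returns both a suspect board and a suspect slot, the evidence is still ambiguous. Boot a known-passing board in that slot, add the result, and run it again.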
Cable / flex inspection: with the chassis open, inspect every ribbon and flex cable between control board and hashboards for torn shielding, sharp bends with cracked traces, and connector oxidation (the greenish-white powder on copper). A torn data flex frequently presents as a Group-2 sensor fail because the I²C lines run alongside the data lines on the same flex.
Roll firmware to the previous version: visit `iceriver.io/downloads`, download the previous firmware bundle for your exact hardware variant, flash via official OTA or SD-card method. About 5% of self-check loops appear right after an OTA when a known-buggy build slipped through QA. Do not flash a different variant's firmware — KS3 vs KS3M vs KS3L are not interchangeable.
Factory reset hold: with the unit powered on and stuck in the loop, press and hold the reset button for `20 seconds` until the red LED flashes. Release. Wait `5 minutes` for the factory partition to copy across to the active partition. About 8% of self-check loops are config-corruption rather than hardware faults, and the factory reset clears them cleanly.
UART boot-log capture: wire a USB-to-TTL adapter (`CH340` / `CP2102` / `FTDI`, `3.3 V` logic) to the control board's debug header. `GND-GND`, board `TX → adapter RX`, board `RX → adapter TX`. Open `PuTTY` or `screen` at `115200 8N1`. Power on. The full boot log scrolls past — `U-Boot`, kernel init, `mining_daemon` init, then the self-check failure line. That line names your fault more precisely than anything in the web UI.
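Once the boot log is saved to a text file (PuTTY session logging or `screen -L` can both do this), the failure line is easier to find with a scan than by scrolling. A minimal sketch follows, assuming the log wording matches the phrases listed under Symptoms; real firmware strings may differ, and the function name is hypothetical.

```python
# Phrases taken from the symptom list; actual firmware wording may differ.
FAULT_PHRASES = (
    "chip count mismatch",
    "chip id read failure",
    "temp sensor abnormal",
    "firmware signature mismatch",
    "hashboard not found",
)


def find_fault_lines(log_text):
    """Return (line_number, phrase, line) for every log line containing a phrase.

    Matching is case-insensitive; line numbers start at 1.
    """
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        lowered = line.lower()
        for phrase in FAULT_PHRASES:
            if phrase in lowered:
                hits.append((number, phrase, line.strip()))
    return hits
```

Read the capture with `open(path, errors="replace").read()` so stray serial noise does not stop the scan, then pass the text to `find_fault_lines`.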
I²C bus continuity test: power off, hashboard disconnected from control board. Multimeter on resistance, probe between `SDA-GND` and `SCL-GND` on the hashboard's data connector. Expected: `2-10 kΩ` on each (the pull-up). Reading `< 100 Ω` = shorted line, often a cracked MLCC. Reading `> 1 MΩ` = missing pull-up, usually a cracked or knocked-off resistor. Visually inspect near the temp sensor IC and replace the failed component.
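The thresholds in the continuity test reduce to a small lookup. The sketch below encodes only the ranges stated in that step and labels anything between them as indeterminate, since the step gives no guidance for those readings; the function name and return labels are hypothetical.

```python
def classify_pullup(ohms: float) -> str:
    """Classify an SDA-GND or SCL-GND resistance reading, in ohms."""
    if ohms < 100:
        return "short"          # shorted line, often a cracked MLCC
    if 2_000 <= ohms <= 10_000:
        return "ok"             # expected pull-up range
    if ohms > 1_000_000:
        return "open"           # missing pull-up resistor
    return "indeterminate"      # outside every range the step defines
```

An over-range ("OL") meter reading can be entered as `float("inf")`, which classifies as `open`.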
Temperature sensor IC reflow / replace: if a specific temp sensor is non-responsive on I²C and pull-ups read correctly, the sensor IC itself is dead (typical part `TMP421` / `TMP432`; confirm the exact part number and package from the chip marking before ordering). Reflow with hot air at `300-320 °C` for `30 s`; if reflow fails, desolder, source a replacement (Digi-Key / Mouser carry single units), reinstall, and retest self-check.
Single-chip dropout repair: if chip-count mismatch is the fault and isolation pointed at a specific chip in the chain, the typical failure is a cold solder joint on a serial-data line. Reflow with preheat (`150 °C` bottom-side, hot air `310-320 °C` top-side, `30 s`). KS-series chips tolerate one reflow cycle. Two reflows on the same chip rarely help — the chip itself is failing at that point and needs replacement.
U-Boot eMMC inspection: from the UART U-Boot prompt (interrupt boot with a key during the U-Boot countdown), run `mmc info` and `mmc part`. If `mmc info` errors out, the eMMC chip itself is degraded — fix is eMMC replacement at the bench. If `mmc info` succeeds but `mmc read` returns garbage on specific partition offsets, the eMMC has bad blocks; a fresh reflash sometimes recovers via controller remap, sometimes doesn't.
Stop DIY when: hashboard isolation pointed at a specific board with no obvious fixable fault; UART showed `mmc info` failure; a reflowed chip's HW-error rate spikes within `24 h`; self-check fails identically across two different hashboards in the same chassis; or you see scorched / leaking components. Any of these = bench territory. Book a D-Central ASIC Repair slot.
D-Central bench process: test fixture with programmable load applied to the suspect hashboard alone (rules out chassis interaction); per-chip isolation using vendor or generic JTAG / serial-bus probes; chip replacement with new-old-stock or salvaged ICs; eMMC chip replacement on control-board faults; component-level work on the I²C bus (pull-up replacement, sensor IC swap, trace repair). Post-repair: 24-hour burn-in at nameplate hashrate before return.
Ship safely: hashboards in anti-static bags. Double-box the chassis with `≥ 5 cm` foam on every side; remove the PSU and ship it separately to avoid impact damage to the chassis. Include a note with: observed error codes, cycle timing, firmware version, ambient and PSU details, and what you've already tried. We bill diagnostic time, so the more you've narrowed it, the cheaper the repair.
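The note that travels with the unit can be generated from what you have already recorded. This is a small sketch with a hypothetical function name and field layout, covering only the items listed in the shipping step.

```python
def build_service_note(error_codes, cycle_timing, firmware, ambient, psu, tried):
    """Return a plain-text repair note covering the items the bench asks for.

    error_codes and tried are lists; the other fields are free-form strings.
    """
    codes = ", ".join(str(code) for code in error_codes) or "none seen"
    lines = [
        f"Observed error codes: {codes}",
        f"Reboot cycle timing: {cycle_timing}",
        f"Firmware version: {firmware}",
        f"Ambient: {ambient}",
        f"PSU: {psu}",
        "Already tried:",
    ]
    lines += [f"  - {step}" for step in tried]
    return "\n".join(lines)
```

Print the result and put it in the box with the chassis.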
When to Seek Professional Repair
If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.
Still Having Issues?
Our team of Bitcoin Mining Hackers has been repairing ASIC miners since 2016. We have seen it all and fixed it all. Get a professional diagnosis.
