AVALON_PIN_SHORT Warning

Avalon Series – A3203 ASIC Pin Short Diagnosis

A Canaan A32xx chip on an Avalon hashboard has a pin shorted to VSS (ground) or to its local domain rail. Field signature: external Canaan PSU trips into Over-Current Protection (OCP) within milliseconds of contactor-close, the chassis stays dark on the AUC side, and a thermal camera with a current-limited bench supply lands a single hot spot on one A32xx package within 5-10 seconds. Cold-resistance probe at the failed hashboard's main DC input pad to chassis ground reads single-digit ohms or sub-ohm, while the other two boards read tens-to-hundreds of ohms. cgminer JSON API on port 4028 (when the chassis briefly enumerates) returns PS[x] = 64 (OC_out), 128 (CS_error), or 256/512/1024 (OC_IOSA/B/C) per the avalon10-docs PS bitmap. Affects the A32xx silicon family (A3206 / A3210 / A3218 / A3228 / A3229) across A11/A12/A13/A14/A15 generations. Five root-cause categories: surge / lightning / feeder event, sustained over-voltage from wrong PSU, reflow or repair-shop induced damage, conductive contamination (flux residue, salt mist, condensation cycling), and latent silicon defect.

Warning — Should be addressed soon

Affected Models: Every Avalon hashboard built around the Canaan A32xx serial chip family: A3206 on A1126 Pro-S / A1146 Pro, A3210 on A1166 Pro / A1246 / A1266, A3218 on A1326 / A1346 / A1366, A3228 / A3229 on A1446 / A1466 / A1546 / A1566. Silicon shrinks and rail voltages drop generation to generation but the failure mode (single chip pin shorted to VSS or local domain rail) and the diagnostic procedure are identical across the family. Cost ranges, donor-chip availability, and reflow temperatures shift slightly per generation; the cold-resistance bisection, thermal-localization, and matched-bin replacement workflow do not.

Symptoms

  • Canaan PSU trips into OCP within milliseconds of contactor-close — fan blips, status LED flashes, chassis stays dark
  • cgminer JSON API on port 4028 (`echo -n '{"command":"estats"}' | nc <ip> 4028`; the API is a raw TCP socket, not HTTP, so a plain `curl http://...` request will not return a valid reply) returns `PS[x] = 64` (`OC_out`), `128` (`CS_error`), or `256` / `512` / `1024` (`OC_IOSA` / `OC_IOSB` / `OC_IOSC`) per the avalon10-docs PS bitmap
  • Cold-resistance probe at the hashboard input reads single-digit ohms or sub-ohm to `VSS`, while the other two boards read tens-to-hundreds of ohms
  • Thermal camera or IR thermometer with a current-limited bench supply lands a hot spot on a single `A32xx` package within `5-10` seconds of rail-up
  • Burnt-component smell from a specific board, or visible discolouration / scorching on or near a single chip footprint
  • Chassis was previously throwing `E: no asics!!` on the same chain for weeks before the hard-short event — slow degradation followed by a discrete failure
  • Recent power surge, brownout, lightning event, or feeder-side breaker trip in the install location
  • Recent thermal-paste refresh or repair workshop visit before the failure (mechanical insult to BGA, dropped board, paste contamination)
  • PSU was swapped to a higher-capacity unit and the failure appeared shortly after (over-rated PSU happily delivers fault current that an in-spec PSU would have OCP-tripped on)
  • Wrong PSU was installed by a previous tech — A11/A12 PSU on an A14/A15 chassis, or vice versa
  • Bench-test fixture on the suspect board displays `asic count: 0` and the bench-side PSU folds back the moment the board is connected
  • Multiple chips on a single domain warm together rather than one isolated chip — points at a domain regulator (LDO / DC-DC) failure rather than a single pin short
  • OCP trip pattern is reproducible to the millisecond — same delay every cycle, no 'sometimes it boots' behaviour
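
The PS values listed in the symptoms can be decoded mechanically. The following is a minimal sketch, assuming the PS word is a bitmask in which several fault bits may be set at once; only the five bit values named in this guide are mapped, and the function name `decode_ps` is illustrative rather than part of any Canaan or cgminer tooling.

```python
# Bit values as listed in this guide (attributed there to the avalon10-docs
# PS bitmap). Any other set bit is reported as unknown rather than guessed.
PS_FLAGS = {
    64: "OC_out",
    128: "CS_error",
    256: "OC_IOSA",
    512: "OC_IOSB",
    1024: "OC_IOSC",
}


def decode_ps(value: int) -> list[str]:
    """Return the names of the fault bits set in a PS status word."""
    names = [name for bit, name in PS_FLAGS.items() if value & bit]
    known_mask = sum(PS_FLAGS)  # sum of the dict keys, i.e. all mapped bits
    unknown = value & ~known_mask
    if unknown:
        names.append(f"unknown(0x{unknown:x})")
    return names


if __name__ == "__main__":
    for ps in (64, 128, 320):
        print(ps, decode_ps(ps))
```

A reading of `320`, for example, decodes to `OC_out` plus `OC_IOSA`, since 320 = 64 + 256.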

Step-by-Step Fix

1

Cut AC at the breaker for 60 seconds, then attempt one cold boot. If the PSU trips OCP again immediately, STOP. Do not keep cycling power. Each OCP trip pumps inrush energy into the fault path and can promote a single-chip verdict into a chip + power-sequence MOSFET + adjacent regulator verdict in three or four power cycles. Document the PSU LED behaviour, fan blip duration, and any audible click in your service log.

2

Pull cgminer estats output if the chassis briefly enumerates. The cgminer API is a raw TCP socket rather than HTTP, so from any laptop on the same LAN run: `echo -n '{"command":"estats"}' | nc <ip> 4028`. Save PS[0..2], ECMM, ECHU[0..2], MWx arrays, and MM_STATUS if printed. PS = 64 (OC_out) or any OC_IOSx bit set tells you the PSU saw a hard short downstream of that channel. Screenshot or save the response before any further action.
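
If `nc` is not available, the same capture can be scripted. This is a rough sketch, assuming the miner speaks the standard cgminer socket API (one JSON request per connection, reply terminated by a NUL byte) and returns well-formed JSON; the helper names and the example address are hypothetical.

```python
import json
import socket


def build_request(command: str) -> bytes:
    """Encode a cgminer-style JSON API request."""
    return json.dumps({"command": command}).encode("ascii")


def parse_response(raw: bytes) -> dict:
    """Decode a reply, dropping the trailing NUL byte cgminer appends."""
    text = raw.rstrip(b"\x00").decode("utf-8", errors="replace")
    return json.loads(text)


def query(host: str, command: str = "estats", port: int = 4028,
          timeout: float = 5.0) -> dict:
    """Send one command and read until the miner closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(command))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_response(b"".join(chunks))


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address; substitute the miner's IP.
    reply = query("192.0.2.10")
    with open("estats_capture.json", "w") as fh:
        json.dump(reply, fh, indent=2)
```

Writing the reply to a file gives you a record to attach to the service log alongside the screenshot. A short timeout matters here because the chassis may drop off the network mid-request.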

3

Multimeter cold-resistance probe at each hashboard input. Power off, AC disconnected, all boards still installed. Probe each hashboard's main DC input pad to chassis ground on the lowest resistance range. Hold each probe for 10-15 seconds (the meter's test current slowly charges the bulk capacitors, so the reading drifts before it settles). Healthy = tens to hundreds of ohms. Shorted = single digits or sub-ohm. Record all three readings. This is the single highest-yield diagnostic on this fault.
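
The three readings can be triaged with a simple threshold. The sketch below assumes a 10-ohm boundary between "single digits" and "tens of ohms"; that cut-off and the board labels are illustrative, not a Canaan specification.

```python
# Assumption: 10 ohms separates the "single digits or sub-ohm" band
# from the "tens to hundreds of ohms" band described in this step.
SHORT_THRESHOLD_OHMS = 10.0


def classify(ohms: float) -> str:
    """Label one settled cold-resistance reading."""
    return "shorted" if ohms < SHORT_THRESHOLD_OHMS else "healthy"


def triage(readings: dict[str, float]) -> dict[str, str]:
    """Label every board in a {board_name: ohms} mapping."""
    return {board: classify(ohms) for board, ohms in readings.items()}


if __name__ == "__main__":
    print(triage({"board0": 85.0, "board1": 0.4, "board2": 120.0}))
```

A reading close to the threshold deserves a second measurement after the probe has settled rather than a verdict from the script.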

4

Visual inspection under bright light. Look for discoloured silkscreen, scorched solder mask, bulged or burst capacitors, missing components, conductive contamination (white crusty residue, dark stains under chips), burnt-component smell. Photograph any visible damage. A burnt board's damage pattern often points directly at the failed chip or the failed domain regulator.

5

Verify PSU compatibility with the chassis generation. A1056 / A1066 / A1126 Pro-S / A1146 Pro / A1166 Pro / A1246 / A1266 use one Canaan PSU family per Canaan's official product page. A1326 and later use a different family with a different rail voltage and connector pinout. Installing the wrong PSU is a documented cause of pin shorts. Photograph the PSU label and confirm against Canaan's compatibility matrix before re-applying power.
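
The generation split described in this step can be captured as a quick sanity check before moving a PSU between chassis. This is a sketch only: the group labels `group-1` and `group-2` are placeholders, not Canaan part numbers, and a matching group is a necessary check, not a substitute for the compatibility matrix.

```python
import re

# Models this guide lists as sharing the earlier PSU family.
EARLY_MODELS = {1056, 1066, 1126, 1146, 1166, 1246, 1266}


def psu_group(model: str) -> str:
    """Map a chassis model string such as 'A1126 Pro-S' to a PSU group."""
    match = re.search(r"A(\d{4})", model.upper())
    if not match:
        raise ValueError(f"unrecognised model: {model!r}")
    number = int(match.group(1))
    if number in EARLY_MODELS:
        return "group-1"
    if number >= 1326:  # "A1326+" per this guide
        return "group-2"
    return "unknown"


def can_swap(donor_chassis: str, target_chassis: str) -> bool:
    """True only when both chassis fall in the same known PSU group."""
    donor, target = psu_group(donor_chassis), psu_group(target_chassis)
    return donor == target and donor != "unknown"


if __name__ == "__main__":
    print(can_swap("A1246", "A1466"))
```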

6

Pull the suspect hashboard. Power off, AC disconnected. Remove side cover. Disconnect the AUC ribbon and main DC harness from the suspect board. Photograph the connector pinout BEFORE disconnection so you can put it back exactly the way it came out. Slide the board out and place on an anti-static work surface.

7

Bench-test on a current-limited supply. Set the lab supply to the board's rated input voltage (12V on A11/A12/A13 generations, higher on A14/A15; check the chassis label) with the current limit set to 2-3A, NOT the full chassis rating. Connect the board. If the supply folds back into current limit immediately, the board has a hard short. If the supply rails up cleanly, the failure is intermittent or chassis-side; re-probe in the chassis.
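
The reason for the low current limit shows up in Ohm's-law arithmetic. The sketch below models the board as a plain resistor, which is a simplification that ignores bulk-capacitor inrush and regulator behaviour; the 0.5-ohm fault value is a hypothetical example of a sub-ohm reading.

```python
def foldback(v_set: float, i_limit: float, r_load: float) -> tuple[float, float, float]:
    """Return (volts, amps, watts) for an ideal current-limited supply
    driving a purely resistive load."""
    if r_load <= 0:
        raise ValueError("r_load must be positive")
    i_unlimited = v_set / r_load
    if i_unlimited <= i_limit:
        # Supply stays in constant-voltage mode.
        return v_set, i_unlimited, v_set * i_unlimited
    # Supply folds back: current is pinned and output voltage collapses.
    v_out = i_limit * r_load
    return v_out, i_limit, v_out * i_limit


if __name__ == "__main__":
    print(foldback(12.0, 3.0, 0.5))  # (1.5, 3.0, 4.5)
```

With a 12 V setpoint and a 3 A limit, a 0.5-ohm fault pulls the output down to 1.5 V and dissipates about 4.5 W, enough to warm one package without doing further damage. In the same idealised model, an unlimited 12 V source would push 24 A and 288 W into that fault.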

8

Thermal-camera or IR-thermometer hot-spot localization. With the bench supply folding back to current limit, leave the board powered for 5-10 seconds while sweeping the thermal camera across the chip array. The shorted chip warms visibly while the rest of the board stays cold. Record the chip position (board reference designator if printed, or row-column if not).

9

Re-probe cold resistance directly at the suspect chip. Identify the chip's supply pins (corner BGA pads or the nearest decoupling capacitor pads on the package's perimeter). Probe pin-to-VSS. A sub-ohm reading directly at the chip indicates a chip-side short. Tens of ohms at the chip but sub-ohm at the board input means the short is elsewhere on the board: walk the board to the next hot spot or suspect a regulator IC.
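
The two-reading comparison in this step reduces to a small decision rule. The sketch below assumes 1 ohm as the "sub-ohm" boundary and 10 ohms as the start of "tens of ohms"; readings between those bands are left as inconclusive, since the step does not say how to interpret them.

```python
def locate_short(chip_ohms: float, input_ohms: float) -> str:
    """Compare the reading at the chip's supply pins with the reading
    at the board's main DC input pad."""
    if chip_ohms < 1.0:
        return "chip-side short"
    if chip_ohms >= 10.0 and input_ohms < 1.0:
        return "short elsewhere on board"
    return "inconclusive"


if __name__ == "__main__":
    print(locate_short(chip_ohms=0.3, input_ohms=0.4))
    print(locate_short(chip_ohms=45.0, input_ohms=0.4))
```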

10

Stop here and ship if you've confirmed a single chip-side short. A32xx package replacement is not a Tier-2 repair. Continue to Tier-3 only if you have hot-air rework, bottom-side preheat, microscope, and matched-bin replacement chips. Otherwise book a D-Central ASIC Repair slot — your work to this point will save us 60-90 minutes of bench bisection, which directly reduces your repair invoice.

11

Document the failed chip's exact position and orientation. Photograph from above with a ruler in frame. Mark pin-1 orientation (corner dot or asymmetric pad layout). Record the chip's marking — A3206 / A3210 / A3218 / A3228 / A3229 plus the date code and lot. Replacement chips MUST come from a matched bin per the Zeus Mining A11/A12 repair guide. BIN mismatch on Avalon hashboards causes the firmware to refuse to enumerate the chain — a chip from a different bin will be cosmetically identical but will not boot.

12

Preheat the board from below to ~150 °C. Use a controlled-temperature preheater. Hot-air-from-above without bottom-side preheat will warp the PCB on an Avalon hashboard and lift adjacent pads. Allow the board to soak at 150 °C for 3-5 minutes before any top-side heat. The Avalon hashboard PCB is multi-layer copper-heavy and acts as a giant heatsink — without preheat, your hot-air gun is fighting the entire board.

13

Remove the failed chip with hot air at 310-330 °C for ~30 seconds. Watch for the first sign of solder reflow (the chip will visibly settle or shift slightly under gentle vacuum-pen pressure). Lift the chip with the vacuum pen, do not pry. Inspect the freed pads under magnification. If any pads lifted with the chip, the board is Tier-4 only — stop and ship.

14

Clean the freed pads. Inspect for solder-bridges, conductive residue, lifted vias. Wick excess solder with copper braid + flux. IPA-clean. Re-flux. If the freed pad area looks healthy, place the matched-bin replacement chip with pin-1 alignment confirmed against your photograph. Re-flow at 310-330 °C from above with bottom-side preheat held at 150 °C. Cool naturally. Inspect under magnification — every pad should show a visible solder fillet.

15

Re-probe cold resistance before re-installing. Pin-to-VSS should now read in the expected tens-to-hundreds-of-ohms range. If still sub-ohm, you either installed the chip mis-oriented, or there's a PCB-side short underneath that the chip-replacement didn't fix. Diagnose before re-install. If healthy, re-install the board, bring up the chassis on a current-limited supply if available, and burn in 24 hours under nameplate load.

16

Stop DIY and book a D-Central ASIC Repair slot when: two or more chips on the same board are diagnosed as shorted, suspected solder-bridge or PCB-side short under a chip, visible surge damage on the AUC or input filter, lifted pads from a previous repair attempt, cracked PCB, or no matched-bin donor chip available. Multi-chip rework on A32xx silicon is bench-only — yield drops sharply once you're chasing more than one position.

17

D-Central bench process: programmable load + bench supply with controllable inrush, full PS / MM_STATUS capture, thermal camera with chip-level resolution, chip removal under controlled hot-air with bottom-side preheat, matched-bin chip placement from a graded donor inventory, trace repair (jumper-wire or copper foil) where pads have lifted, full chassis re-burn-in under nameplate load for 24 hours before release. On older generations where board repair is uneconomic, we quote salvaged-grade replacement boards.

18

Ship the chassis with the PSU and your service log. We test your exact stack — the PSU is a contributing diagnostic, not just a power source. Include screenshots of the failing PS[0..2] / ECMM / MM_STATUS, your cold-resistance readings, photos of any visible damage, full service history (when, by whom, what was swapped), and contact info. Pack each hashboard in an anti-static bag, double-box with ≥5 cm foam on every side. Saves diagnostic time and saves you money.
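
One way to keep the service log consistent is to hold it as a structured record and export it as JSON. This is only a sketch: the field names are illustrative and do not reflect a format D-Central requires.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ServiceLog:
    chassis_model: str
    ps_values: list[int]                      # PS[0..2] from estats
    cold_resistance_ohms: dict[str, float]    # settled reading per board
    damage_photos: list[str] = field(default_factory=list)
    service_history: list[str] = field(default_factory=list)  # when, by whom, what
    contact: str = ""

    def to_json(self) -> str:
        """Serialise the log for printing or attaching to the shipment."""
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    log = ServiceLog(
        chassis_model="A1246",
        ps_values=[0, 64, 0],
        cold_resistance_ohms={"board0": 85.0, "board1": 0.4, "board2": 120.0},
        service_history=["PSU swapped by previous tech"],
    )
    print(log.to_json())
```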

When to Seek Professional Repair

If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.

Still Having Issues?

Our team of Bitcoin Mining Hackers has been repairing ASIC miners since 2016. We have seen it all and fixed it all. Get a professional diagnosis.