A1366_AUC_OT Warning

Avalon 1366 – AUC Controller Overheating

The Avalon 1366 AUC4 (Avalon USB Controller v4) board is overheating. This USB-PMBus interface board sits on or near the dedicated 12V PSU and inherits its thermal envelope. When local PCB temperature exceeds the thermal-protect threshold of the PMBus transceiver / USB controller silicon (typically 80-85 degrees C), the AUC4 enters thermal fold-back: USB endpoint enumeration goes flaky, PMBus handshake CRCs fail, and the controller declares 'PSU not detected' / 'PMBus handshake timeout'. Hashboards drop out for 30-90 seconds while MM firmware retries PSU init; then the AUC4 cools, comm restores, hashboards re-enable, and the cycle repeats.

What you see: the AUC4 status LED flips between green and red on the thermal cycle. The cgminer JSON API at port 4028 shows PS[0..2] flapping between healthy and 0, GHSavg sawtooths between nameplate and 0, and pool-side share submission shows the same sawtooth shape over 24 hours. This is distinct from a hard A1366_PSU_FAIL, where PS[0..2] is continuously 0.

Causes: high ambient at intake, dust loading, aged AUC4 thermal interface, chassis airflow obstruction, direct sun, hot-air recirculation in the rack, AUC4 silicon damaged by prior overtemp, an aged PSU heating the AUC4 by radiation, or an MM firmware regression that increases the AUC4 polling rate.

Impact: revenue runs at 60-90% of nameplate during flap cycles. Each cycle accumulates wear on AUC4 silicon and on hashboard ASICs, because every comm-drop is a hashboard power-down/re-init thermal cycle.

Warning — Should be addressed soon

Affected Models: Avalon 1366 (~130 TH/s / ~3300W chassis, A3206-class silicon, A12-series control architecture). It carries forward the AUC4 (Avalon USB Controller v4) + PMBus-based PSU handshake first introduced on the 1266. The AUC4 board lives on or near the dedicated 12V PSU and shares its thermal envelope. Local AUC4 PCB temperature climbs 15-25 degrees C above intake-air temperature under normal three-board hashing load; the thermal-protect threshold on the PMBus transceiver / USB controller silicon is approximately 80-85 degrees C. The 1366 design has roughly 25-30 degrees C of ambient headroom before AUC4 silicon hits derate; high ambient, dust loading, or an aged thermal interface erases that headroom and triggers thermal-flap. The nameplate hashrate / wattage and AUC4 sub-revision quoted here are approximate; verify them against current Canaan documentation.
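One way to arrive at a headroom figure in that range is simple subtraction. The sketch below assumes a 30 C intake (an assumption for illustration, not a number stated above) and the worst-case 25 C board rise; the function name is hypothetical.

```python
def auc4_headroom_c(intake_c: float, board_rise_c: float, protect_c: float) -> float:
    """Degrees C left before the AUC4 PCB reaches its thermal-protect threshold."""
    return protect_c - (intake_c + board_rise_c)

# Assumed 30 C intake, worst-case 25 C rise, against the 80-85 C threshold band.
low = auc4_headroom_c(30.0, 25.0, 80.0)
high = auc4_headroom_c(30.0, 25.0, 85.0)
print(f"headroom: {low:.0f}-{high:.0f} C")
```

Raising the intake argument to 35 C or more shows how quickly a hot afternoon eats into that margin.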

Symptoms

  • AUC4 status LED reads red, blinking red, or flapping between green and red on a thermal cycle (warms up = red, cools down = green)
  • Web UI status panel shows 'Power Supply Not Detected' or 'PMBus handshake timeout' appearing intermittently, often clearing then returning within minutes
  • cgminer JSON API at port 4028 (a raw TCP socket, not HTTP; for example echo '{"command":"estats"}' | nc <ip> 4028) returns PS[0..2] flapping: non-zero on one query, 0 on the next, back to non-zero
  • GHSavg cycles: ramps to nameplate ~130 TH/s, drops to 0 for 30-90s while MM retries PSU init, ramps back up — sawtooth pattern over 24 hours on pool dashboard
  • kern.log shows repeating 'PMBus init failed', 'psu_probe timeout', 'auc4_usb timeout', 'i2c-tools timeout on PSU address' interleaved with 'PMBus init OK' lines
  • Symptom intensifies during hottest part of the day; clears overnight when ambient drops below 25C; returns next afternoon — classic thermal-bound failure signature
  • IR thermometer on AUC4 board housing through chassis vent reads above ~70C; bench-direct on AUC4 PCB reads above ~80C
  • Chassis intake air at front grille is at or above 35C, or chassis is in direct sun, or rack is poorly ventilated, or room AC failed during a heat wave
  • Chassis filter is dust-loaded; visible dust on AUC4 board's heat-spreading surfaces; 90+ days since last deep blow-out
  • PSU fan runs normally, AC input is healthy, hashboards enumerate when comm is up — i.e. nothing else is wrong, the AUC4 just keeps dropping out
  • ascset|0,<cmd> write commands targeting the PSU (output enable, fan curve, voltage query) intermittently return STATUS=F then succeed on retry
  • Recent service event: AUC4 USB cable was re-routed and now sits across AUC4 vent slots, OR aftermarket fix (foam, tape, RTV) was applied near AUC4 board choking airflow
  • Miner is 4-6 years old, has never had AUC4 thermal-paste refresh, original thermal interface on AUC4 silicon is dried out and cracked

Step-by-Step Fix

1

Read the AUC4 LED and confirm thermal-flap pattern. Look at the AUC4 status LED through the chassis vent. Note whether it's solid green (healthy), solid red (continuous over-temp = different page, see A1366_PSU_FAIL), or flapping between green and red. Refresh the Web UI status panel several times over 5-10 minutes. The flap pattern — green and 'PSU detected' most of the time, brief red spells with 'PSU not detected' — is the signature for this page. The five-second observation rules out half the wrong-page possibilities and costs nothing. Document times of day when red appears most often; correlate with outdoor temperature.
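To make the time-of-day correlation concrete, the red spells you note down can be bucketed by hour. This is a minimal sketch that assumes each event is recorded as an ISO 8601 timestamp; the function name and the sample times are illustrative only.

```python
from collections import Counter
from datetime import datetime

def drops_by_hour(timestamps):
    """Count noted red-LED / 'PSU not detected' events per hour of day."""
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)

# Illustrative notes, not real measurements.
notes = ["2024-07-01T14:05:00", "2024-07-01T14:40:00", "2024-07-01T15:10:00"]
for hour, count in sorted(drops_by_hour(notes).items()):
    print(f"{hour:02d}:00  {count} event(s)")
```

A cluster in the afternoon hours that disappears overnight supports the thermal-bound reading of the fault.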

2

Log the cgminer API for 20 minutes minimum. Query port 4028 with {"command":"estats"} every 10 seconds from a laptop on the same subnet. The API is a raw TCP socket, so use nc or a short script rather than an HTTP client. Save the output. The flap signature is PS[0..2] cycling between healthy and 0, with GHSavg sawtoothing from nameplate to 0 and back. This log is the strongest evidence of the failure mode and saves D-Central an hour of bench reproduction time if you end up shipping the chassis. The Web UI alone goes stale and can mislead; the JSON API reports state in real time.
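The polling loop in this step could be scripted roughly as follows. This is a sketch under stated assumptions: that the estats reply embeds bracketed fields of the form PS[...] and GHSavg[...], and that "healthy" means at least one of the first three PS values is non-zero, as this page describes. Check both against a real reply from your unit. The function names and the flap_log.jsonl filename are illustrative.

```python
import json
import re
import socket
import time

def query_estats(host, port=4028, timeout=5.0):
    """Send {"command":"estats"} over a raw TCP socket and return the reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(json.dumps({"command": "estats"}).encode())
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Strip any trailing NUL terminator from the reply.
    return b"".join(chunks).decode(errors="replace").rstrip("\x00")

def parse_sample(raw):
    """Pull PS[...] and GHSavg[...] out of a reply (assumed field layout)."""
    ps = re.search(r"PS\[([^\]]*)\]", raw)
    ghs = re.search(r"GHSavg\[([\d.]+)\]", raw)
    ps_values = [int(v) for v in re.findall(r"-?\d+", ps.group(1))] if ps else []
    return {
        "ps": ps_values,
        "ps_healthy": any(v != 0 for v in ps_values[:3]),
        "ghsavg": float(ghs.group(1)) if ghs else 0.0,
    }

def count_drops(samples):
    """Count healthy -> dropped transitions across consecutive samples."""
    return sum(
        1 for prev, cur in zip(samples, samples[1:])
        if prev["ps_healthy"] and not cur["ps_healthy"]
    )

def log_flaps(host, minutes=20, interval_s=10, path="flap_log.jsonl"):
    """Poll the miner, append one JSON line per sample, and report the drop count."""
    samples = []
    with open(path, "a") as fh:
        for _ in range(minutes * 60 // interval_s):
            try:
                sample = parse_sample(query_estats(host))
            except OSError as exc:  # connection refused, timeout, and similar
                sample = {"ps": [], "ps_healthy": False, "ghsavg": 0.0, "error": str(exc)}
            sample["ts"] = time.strftime("%Y-%m-%dT%H:%M:%S")
            fh.write(json.dumps(sample) + "\n")
            fh.flush()
            samples.append(sample)
            time.sleep(interval_s)
    print(f"{count_drops(samples)} comm drops in {minutes} minutes")
```

Calling log_flaps("<ip>") with the miner's address runs the 20-minute capture. The resulting file is the flap log to include if the chassis is shipped.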

3

Check ambient temperature at the chassis intake. IR thermometer at the front grille — not the hallway, not the room middle. The 1366 / AUC4 thermal envelope demands intake <=35C. If your reading is over 35C on a hot afternoon, AUC4 has no thermal headroom. Improve room AC, open a window, run room exhaust fan, relocate rack, shade chassis from direct sun. Also check for rack neighbour exhaust dumping into intake, dryer vent nearby, server room hot aisle. Ambient improvement alone resolves a meaningful fraction of summer thermal-flap cases without ever opening the chassis.

4

Visual check for direct sun and dust loading. Is the chassis in direct sun for any part of the day? Sun-heated chassis steel can add 10-20C to AUC4 local temp on a sunny afternoon. Shade the chassis. Look at the front intake grille — is it visibly dust-loaded? When was the last deep blow-out? 90+ days without service = expect AUC4 thermal-flap as filters approach a wall of compressed dust. Schedule a 30-90 day blow-out cycle if you don't have one.

5

Note recent service or environmental events. Was the chassis opened, the AUC4 cable disturbed, or any aftermarket fix (foam, tape, RTV, re-routed harness) applied in the last 30 days? Did the room's AC fail or weaken? Did rack loading change (more miners added, layout shifted, hot aisle compromised)? Recent events are the single biggest predictor of which failure category you're in. Document the timeline; D-Central needs it if you ship.

6

Deep blow-out: chassis intake, AUC4 area, PSU vents, hashboards, rear exhaust. Kill AC at the PDU. Wait 60s. Move chassis to well-ventilated work area. Shop-vac the front intake grille. Open chassis. Compressed air through AUC4 area (short bursts — full-pressure can sandblast SMD components), then PSU vent path, hashboard intakes, rear exhaust. Wipe visible dust off AUC4 PCB with dry lint-free swab. Wipe chassis interior. Vacuum workspace before reassembling. This single service drops AUC4 PCB temp by 5-10C for a typical 90+ days-since-service chassis.

7

Inspect the AUC4 area for airflow obstructions. Lid still off. Look hard at the AUC4 board's vent path: are any vent slots blocked? Is any cable bundle, harness, foam, RTV, or aftermarket fix routed across the AUC4 area? Is the AUC4 USB cable re-routed across AUC4 vents from a previous service? Pull any obstruction. Re-route harnesses cleanly so they're far from AUC4 PCB. Zip-tie everything to the chassis frame. The AUC4 needs the same exhaust path the PSU uses; nothing else can be in the way.

8

IR thermometer the AUC4 surface and the intake. With the chassis running on the bench (guard the chassis-fan blades for safety), point the IR thermometer at the AUC4 PCB and at the warmest silicon (PMBus transceiver, USB controller). Compare to the intake air temperature at the front grille. Healthy after blow-out: AUC4 PCB <=60C with intake <=30C. Stress band: AUC4 PCB 60-80C. Thermal-protect: AUC4 PCB >80C. The delta between intake and AUC4 should be 15-25C on a healthy chassis at full nameplate; a delta of 30C or more means the AUC4 has internal heat problems, so escalate to the thermal interface refresh in Step 15.
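The bands in this step reduce to a small lookup. A minimal sketch, with a hypothetical function name, that takes the two IR readings and returns the band, the intake-to-PCB delta, and whether the delta calls for escalation:

```python
def classify_auc4(pcb_c, intake_c):
    """Map IR readings to the Step 8 bands and flag an abnormal intake-to-PCB delta."""
    if pcb_c > 80:
        band = "thermal-protect"
    elif pcb_c > 60:
        band = "stress"
    else:
        band = "healthy"
    delta = pcb_c - intake_c
    escalate = delta >= 30  # 30C+ delta points at an internal heat problem
    return band, delta, escalate

print(classify_auc4(72, 35))
```

Readings that land in the healthy band but still show a delta of 30C or more are worth a second look, since the band alone would hide the problem.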

9

Stage a 120mm external fan if ambient is hard to control. For installs where you cannot get intake <=35C reliably (summer garage without AC, container deploy, partial-outdoor install), a small 120mm USB-powered or 12V external fan blowing room air across the AUC4 area through a chassis vent buys 5-10C of AUC4 PCB headroom. Mount with foam pads against chassis so it doesn't transmit vibration. Cable-manage fan power away from chassis hot zones. This is a band-aid that buys the rest of the summer while planning a permanent ambient fix or AUC4 relocation.

10

Relocate the rack out of direct sun and away from neighbour exhaust. If chassis is in south-facing window, sun-baked garage, or downstream of another miner's hot exhaust, moving it 2-3m to a shaded or thermally-isolated spot drops effective intake temperature by 5-15C on a sunny afternoon. Build a cold/hot aisle if you have multiple miners — even a piece of plywood with sealed gaps between racks dramatically improves intake temperatures. Canadian plebs: a basement or garage shop with passive winter cooling is a free 15-25C of AUC4 thermal headroom from October through April.

11

Strain-relieve all internal harnesses while the chassis is open. Since the lid is off, dress every cable away from fan blades, hot MOSFETs on hashboards, sharp metal edges, AND the AUC4 board's vent path. Zip-tie cables to chassis rails so chassis-fan vibration cannot walk anything loose. Verify the AUC4 USB cable and AUC4-to-PSU PMBus ribbon are both strain-relieved. CAD $0.05 of zip-tie prevents a CAD $50+ future diagnostic visit.

12

Verify AUC4 USB cable is not the obstruction itself. A meaningful fraction of post-service AUC4 thermal-flap cases trace to a re-routed AUC4 USB cable that now sits across AUC4 vent slots. Re-route the cable so it runs around the AUC4 area, not across it. Zip-tie it to the chassis frame ~10-15 cm from the AUC4 connector so it can't drift back. Verify both AUC4 USB and PMBus ribbons are now clear of all vent paths before closing the chassis.

13

Capture the AUC4 boot log via USB-TTL during a cold boot AND during a thermal-flap event. Connect FT232 / CH340 / CP2102 USB-TTL adapter to the AUC4 UART header (location varies by board revision — check silkscreen). Capture during cold boot first (baseline). Then let chassis run until thermal-flap fires and capture during the flap. Search for 'PMBus init failed', 'psu_probe timeout', 'auc4_usb timeout', 'i2c-tools timeout', 'thermal protection' interleaved with successful 'PMBus init OK' / 'PSU probe OK' lines. The log distinguishes thermal flap from hard PSU comm failure.
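Once the serial capture is saved to a text file, the search can be automated. The sketch below uses only the strings listed in this step. The classification rule (a success line appearing after a failure line means flap) is a heuristic assumption, and the function name is illustrative.

```python
FAIL_MARKERS = (
    "PMBus init failed",
    "psu_probe timeout",
    "auc4_usb timeout",
    "i2c-tools timeout",
    "thermal protection",
)
OK_MARKERS = ("PMBus init OK", "PSU probe OK")

def classify_log(lines):
    """Return 'thermal-flap', 'hard-failure', or 'clean' for a captured serial log."""
    events = []
    for line in lines:
        if any(marker in line for marker in FAIL_MARKERS):
            events.append("fail")
        elif any(marker in line for marker in OK_MARKERS):
            events.append("ok")
    if "fail" not in events:
        return "clean"
    # A success after the first failure means comms recovered: the interleaved flap pattern.
    first_fail = events.index("fail")
    return "thermal-flap" if "ok" in events[first_fail + 1:] else "hard-failure"
```

Usage, assuming the capture was saved as auc4_flap.log (a hypothetical filename): open the file with errors="replace" and pass the file object straight to classify_log, since it iterates line by line.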

14

IR-thermometer the AUC4 silicon stage by stage during a flap event. Bench session, lid off, fans covered, chassis running. Note baseline temps when AUC4 LED is green. Wait for AUC4 LED to flip red. IR-thermometer each silicon: PMBus transceiver, USB controller, 3.3V LDO, MCU. Note which IC is hottest at the moment of failure. Compare to neighbours. The hottest IC at flap time is the silicon hitting its thermal-protect threshold first — likely PMBus transceiver or USB controller. That diagnosis tells you whether thermal-interface refresh on that specific IC will fix the fault, or whether silicon has been damaged and needs replacement.
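The comparison in this step amounts to finding the maximum reading at flap time and its rise over baseline. A small sketch; the function name is hypothetical and the temperatures are placeholder values, not measurements from a real board.

```python
def hottest_at_flap(baseline, at_flap):
    """Return (ic_name, flap_temp_c, rise_over_baseline_c) for the hottest IC at flap time."""
    name = max(at_flap, key=at_flap.get)
    return name, at_flap[name], at_flap[name] - baseline[name]

# Placeholder readings in degrees C, for illustration only.
baseline = {"PMBus transceiver": 62.0, "USB controller": 58.0, "3.3V LDO": 55.0, "MCU": 50.0}
at_flap = {"PMBus transceiver": 84.0, "USB controller": 79.0, "3.3V LDO": 66.0, "MCU": 61.0}
print(hottest_at_flap(baseline, at_flap))
```

Recording both the absolute temperature and the rise helps separate an IC that always runs warm from one that spikes at the moment of failure.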

15

Refresh AUC4 silicon thermal interface and apply adhesive heatsinks. Lid off, AC at PDU off, wait 60s for caps to discharge. Disconnect AUC4 board (label cables before unplugging). Pull the AUC4 if removable; otherwise work in-place. Identify hottest silicon from Step 14. Carefully clean any old paste / pad with 99% IPA and lint-free swab. Apply Arctic MX-6 or Thermal Grizzly Kryonaut in thin uniform layer between silicon and factory spreader (if present). For bare ICs, apply small adhesive heatsinks (8mm x 8mm or 10mm x 10mm aluminium with thermal adhesive — common Bitaxe / RPi / VRM kits) directly to IC top. Reassemble, re-cable carefully, retest. Expect 5-15C AUC4 surface drop.

16

Relocate the AUC4 outside the chassis with a USB extension cable. If AUC4 is mechanically separable on your specific 1366 revision (verify separability against current 1366 production silkscreen / mechanical revision before relying on this), mount it in a small 3D-printed ventilated enclosure outside the chassis. Use a 0.5-1.5m shielded USB extension — shielded matters because the USB run carries comm channel for the entire PSU control surface and unshielded extensions over ~0.5m start picking up chassis-fan PWM noise. Route AUC4-to-PSU PMBus ribbon cleanly. AUC4 PCB temp drops to room ambient +5-10C instead of room ambient +25-35C. Cleanest long-term fix when ambient is the constraint.

17

Add a small dedicated 40-60mm fan blowing across the AUC4 board (in-chassis option). If relocation isn't feasible, a small 5V or 12V 40-60mm fan mounted inside the chassis blowing directly across AUC4 area drops AUC4 PCB temp by 8-15C. Power off controller's 5V rail or via separate USB connection. Mount with foam pads to isolate vibration. Verify fan doesn't pull more current than rail can supply (typical 40mm = 0.05-0.15A at 12V). Strain-relieve fan power leads and signal wire. Verify exact controller 5V / 12V rail headroom against 1366 schematics before drawing fan power from internal rails; alternative is separate small wall-wart supply.
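The current check in this step can be written down as a one-line budget test. In this sketch the spare rail current is an input you must take from the 1366 schematics. The 80% derating margin and the function name are assumptions for illustration, not Canaan figures.

```python
def fan_fits_rail(fan_current_a, rail_spare_a, margin=0.8):
    """True if the fan draw stays within `margin` of the rail's spare current."""
    return fan_current_a <= rail_spare_a * margin

# Upper end of the typical 40mm range quoted above, against a hypothetical 0.5 A of spare rail.
print(fan_fits_rail(0.15, 0.5))
```

If the check fails, or the spare current is unknown, the separate wall-wart supply mentioned above avoids the question entirely.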

18

Roll MM firmware back one version (if the Step 5 timeline pointed to a recent flash). If AUC4 thermal-flap started immediately after a firmware flash, suspect a regression that increased the AUC4 polling duty cycle. Re-flash the prior MM firmware build for the 1366 specifically via the AUC4 Web UI. VERIFY the image is for the 1366: Canaan signature-checks per-model and a cross-model flash bricks the AUC4. See Avalon 1246 - Firmware Flash Failure for the procedure. If the prior build runs cool and stable, file an issue with Canaan describing the AUC4 self-heat regression on the latest build.

19

Refresh PSU thermal pads if PSU is also running hot. With the chassis open, IR-thermometer the PSU surface near the primary switching MOSFETs. If the PSU surface is 10-20C above its as-new operating temperature, the PSU's primary thermal pads are dried out and the PSU is dragging the AUC4 thermal envelope up with it. Swap the thermal pads on the primary stage. This is bench-level work that requires PSU disassembly and is much more fiddly than chassis-level cleaning. If you're uncomfortable with PSU rework, ship the PSU to D-Central for a bench refresh in the $80-180 range, which is cheaper than buying a fresh PSU at $180-420.

20

Stop DIY when AUC4 still thermal-flaps after relocation + interface refresh + heatsinks + clean airflow + clean ambient. The silicon is damaged from accumulated overtemp cycles — PMBus transceiver junction degraded, USB controller analog stage drifted, 3.3V regulator bias point shifted. The fault threshold is now at room temperature and no environmental fix recovers it. AUC4 board needs replacement. Ship the chassis to D-Central — bench replacement and re-test runs faster than buying a parts-donor unless you already have one with confirmed silicon health.

21

Stop DIY when you see visible damage on AUC4 silicon under 10x-20x magnification. Hairline package cracks, discoloured solder pads, scorched silkscreen, burnt-component smell on AUC4 PCB. This is silicon damage, not interface failure. Refreshing thermal paste or adding heatsinks will not fix damaged silicon. Ship to D-Central for AUC4 board replacement; while we're in the chassis we audit adjacent silicon (PSU PMBus MCU, controller-side USB stack, surge clamps) because heat damage to one named component often correlates with collateral wear on adjacent stages.

22

Stop DIY if you don't have preheat + hot-air SMD rework capability and the fault narrows to silicon-level replacement. Iron-only rework on PMBus transceiver ICs, USB controller ICs, or LDO silicon lifts pads, breaks vias, and turns a CAD $120-260 bench repair into a CAD $300-500 'we have to replace the whole AUC4 board because the original is now scrap' repair. Ship the chassis at the first hint of needing component-level work unless you genuinely have the toolchain and the practice. The Bitaxe is the right $150 practice rig for this skill.

23

Ship with full context. Pack the chassis WITH the PSU (AUC4 thermal envelope is partly inherited from the PSU; we need both to reproduce). Include: cgminer API flap log, USB-TTL serial log if captured, IR thermometer notes (intake / AUC4 PCB / per-IC), MM firmware build string, service history (when chassis was opened, what was swapped, when AUC4 LED first flipped red, environmental events), ascset|0 responses captured during diagnostic, photos of any visible damage. Match chassis serial to PSU serial on shipping note. Anti-static bags around AUC4, double-box with >=5cm foam every side. Canada-wide standard shipping; US / international welcomed. Turnaround 3-7 business days.

When to Seek Professional Repair

If the steps above do not resolve the issue, or if you are not comfortable performing these repairs yourself, professional service is recommended. Attempting advanced repairs without proper equipment can cause further damage.

Still Having Issues?

Our team of Bitcoin Mining Hackers has been repairing ASIC miners since 2016. We have seen it all and fixed it all. Get a professional diagnosis.