Antminer S17+ Maintenance & Repair Guide

Maintaining the Antminer S17+ is essential for ensuring optimal performance and longevity. This guide provides information on how to identify and troubleshoot common issues that may arise, as well as step-by-step instructions for performing routine maintenance tasks. It also includes a flowchart to help you quickly identify potential problems and determine the best course of action. With this guide, you can keep your Antminer S17+ running smoothly and efficiently.

Preparation and Maintenance Guidelines

Take the time to prepare and maintain components properly before, during, and after installation. This includes applying thermal gel for better heat transfer, forming air ducts for better airflow, connecting power supplies in the correct sequence, fixing chips to prevent overheating, and ensuring test fixtures meet production requirements. Clean components only with approved solvents such as isopropyl alcohol or distilled water, and store them away from extreme temperatures and humidity. Finally, carry out scheduled maintenance checks every few months, or at least annually, to confirm that all parts of the system function properly.

Preparation Requirements for Repair Platform, Tools, and Equipment

I. Platform Requirements

  • The repair platform requires a workbench covered with a rubber mat and properly grounded. An anti-static wrist strap, also grounded, is required to keep static electricity from damaging the boards being worked on.

II. Equipment Requirements

  • Constant temperature soldering iron (350-360℃) with a specific head for small patches such as chip resistors and capacitors
  • Heat gun and BGA rework station for disassembling and soldering chips and BGA components
  • Multimeter (Fluke 15b+ recommended), with steel pins soldered to the probes and covered in heat-shrink tubing for easy measurement
  • Oscilloscope, Agilent recommended
  • Hash board tester fixture
  • Low-temperature solder paste (Alpha OM550), flux, water, and anhydrous alcohol for cleaning panel and soldering residues
  • Thermal conductive paste for chips and heat sinks after maintenance (may vary by model)
  • Tinning stencil (tin-planting steel mesh), ball-planting steel mesh, solder wire, and solder balls (0.4mm diameter recommended) for chip replacement
  • Tin the chip pins and BSM surface before soldering the chip to the hash board
  • Common maintenance spare materials: 0402 resistors (0R, 33R, 1K, 4.7K), 0201 resistors (0R), 0402 capacitors (0.1uF, 1uF)

III. Test Tool Requirements

ARC Kit

  • ARC Antminer Hashboard Tester
  • Lab PSU 10-30V / 1-15A

Bitmain Kit

  • APW9+ power supply and power patch cord for hash board power supply
  • Use the test fixture of the V2.3 control board (test fixture material number ZJ0001000001).

IV. Maintenance Auxiliary Materials/Tools Requirements

  • Solder Paste 138°C, flux, Mechanic lead-free circuit board cleaner, and anhydrous alcohol.
  • Mechanic lead-free circuit board cleaner cleans up the flux residue after maintenance.
  • Thermally conductive gel is used to apply to the chip surface after repair.
  • Ball-planting steel mesh, desoldering wick, and solder balls (the recommended ball diameter is 0.4mm).
  • When replacing a new chip, it is necessary to tin the chip pins and then solder them to the hash board. Apply thermally conductive gel evenly on the chip’s surface, then lock the heatsink.
  • Serial port code scanner.
  • RS232-to-TTL serial port adapter board (3.3V).
  • Self-made short-circuit probe (solder wires to pins and cover them with heat-shrink sleeve to prevent a short circuit between the probe and the small heatsink).

V. Common Maintenance Spare Material Requirements

  • 0402 resistor (0R, 10K, 4.7K)
  • 0402 capacitor (0.1uF, 1uF)

Maintenance Requirements

  1. Maintenance personnel are required to possess a certain level of electronic knowledge, with at least one year of maintenance experience and expertise in soldering technologies such as BGA, QFN, and LGA packages.
  2. After repairing a hash board, it must undergo at least two tests to ensure that it is functioning correctly. If it fails either of these tests, it will be rejected.
  3. When replacing a chip, follow the correct operating method so that the PCB is not visibly deformed. Afterwards, check the replaced parts and their surroundings for open circuits, short circuits, and missing parts.
  4. Before starting any maintenance work, it is important to check the tools and test fixtures, ensuring they are working correctly. It is also necessary to determine the test software parameters for the maintenance station, version of test fixtures, and other relevant details.
  5. After the repair and replacement chip tests, it is essential to perform a full chip check before proceeding to the functional test. The functional test must ensure that the double-sided heat sinks are soldered correctly and that the cooling fan is operating at full speed. It is necessary to form air ducts by putting three hash boards together when using the chassis cooling function. Even for single-sided production tests, the formation of air ducts is crucial.
  6. When measuring signals, it is advisable to use two fans to dissipate heat and maintain full speed. Using a laser tachometer to test the fan speed is recommended.
  7. When measuring or working on the front and back of the hash board, note that the steel windshield is at 21V. Keep the maintenance table clean and insulated to prevent short circuits during maintenance.
  8. When replacing a chip, apply solder paste to the pins and the BSM surface so that the new chip is pre-tinned before it is soldered to the PCBA.
  9. Fixtures at the maintenance end should run in Repair_Mode, using configuration files set for non-scanning mode. After a board passes the test, the production end returns it to the production line starting from the test station, while the after-sale end installs it and ages it normally (installed at the same level). The test configuration file can be obtained from TE.

Overview of Antminer S17+ Components

S17+ Hashboard Structure

The S17+ hash board carries 65 BM1397 chips, divided into 13 groups of 5 ICs each. The BM1397 chip operates at 1.5V. The boost circuit U6 outputs 24.5V, which powers the LDO, and the LDO outputs 1.8V. The last third and third groups are powered by 24.5V DCDC to output 1.8V, while the other groups take their 1.8V through DCDC from the divided 21V supply. Each domain's 0.8V is derived from that domain's 1.8V via an LDO.
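
The group layout can be checked with simple arithmetic. The sketch below assumes that the 13 voltage domains sit in series across the 21V input and share it equally, and that chips are numbered consecutively within groups. Both are assumptions drawn from the "divided voltage" wording above, not stated specifications, and the names are illustrative.

```python
# Illustrative arithmetic only. Assumes 13 series-connected domains that
# divide the 21V input equally, and consecutive chip numbering per group
# (both are assumptions, not manufacturer specifications).
TOTAL_CHIPS = 65
GROUPS = 13
INPUT_VOLTAGE = 21.0

chips_per_group = TOTAL_CHIPS // GROUPS
domain_voltage = INPUT_VOLTAGE / GROUPS


def group_of_chip(chip_number: int) -> int:
    """Return the 1-based group (voltage domain) of a 1-based chip number."""
    if not 1 <= chip_number <= TOTAL_CHIPS:
        raise ValueError("chip number out of range")
    return (chip_number - 1) // chips_per_group + 1


print(chips_per_group)           # 5
print(round(domain_voltage, 2))  # 1.62
print(group_of_chip(32))         # 7
```

Under these assumptions each domain sees roughly 1.6V, which matches the domain voltage checked during troubleshooting.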

Boost Circuit of S17+ Hashboard

The boost circuit of the S17+ hash board is powered by the power supply, and boosts from 21V to 24.5V.

 

Signal Direction of S17+ Chip

  1. CLK (XIN) Signal Direction: The CLK signal is generated by the Y1 25MHz crystal oscillator and transmitted from chip 01 to chip 65. During operation, the voltage is 1.45-1.65V when measured with an oscilloscope and about 0.7-0.9V when measured with a multimeter.
  2. TX (CI, CO) Signal Direction: The TX signal is input from pin 7 (3.3V) of the IO port, transferred to IC U2 through level conversion, and then transmitted from chip 01 to chip 65. The voltage is 0V when the IO line is not inserted, and 1.8V during operation.
  3. RX (RI, RO) Signal Direction: The RX signal is transmitted from chip 65 to chip 01, returned to pin 8 of the signal cable terminal via U1, and then returned to the control board. The voltage is 0.3V when the IO line is not inserted, and 1.8V during operation.
  4. BO (BI, BO) Signal Direction: The BO signal is transmitted from chip 01 to chip 65, and the voltage measured using a multimeter is 0V.
  5. RST Signal Direction: The RST signal is input from pin 3 of the IO port, and then transmitted from chip 01 to chip 65. The voltage is 0V without IO signal or in standby, and 1.8V during operation.
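
The expected readings above can be kept in a small lookup table for use at the bench. This is a minimal sketch: the voltage values come from the list above, while the 0.1V tolerance, the state names, and the function names are hypothetical choices for illustration.

```python
# Expected multimeter readings in volts, taken from the list above.
# "idle" means the IO line is not inserted (for RST: no IO signal or standby).
EXPECTED_VOLTS = {
    "CLK": {"running": (0.7, 0.9)},  # 1.45-1.65V on an oscilloscope
    "TX": {"idle": (0.0, 0.0), "running": (1.8, 1.8)},
    "RX": {"idle": (0.3, 0.3), "running": (1.8, 1.8)},
    "BO": {"running": (0.0, 0.0)},
    "RST": {"idle": (0.0, 0.0), "running": (1.8, 1.8)},
}

# Signals passed from chip 01 to chip 65; RX travels the other way.
FORWARD_SIGNALS = {"CLK", "TX", "BO", "RST"}

TOLERANCE = 0.1  # volts; an assumed margin, not a documented value


def reading_ok(signal: str, volts: float, state: str = "running") -> bool:
    """Return True if a reading falls within the expected range plus tolerance."""
    low, high = EXPECTED_VOLTS[signal][state]
    return low - TOLERANCE <= volts <= high + TOLERANCE


def direction(signal: str) -> str:
    """Return the direction in which a signal travels along the chip chain."""
    if signal in FORWARD_SIGNALS:
        return "chip 01 -> chip 65"
    return "chip 65 -> chip 01"
```

For example, `reading_ok("RST", 0.0)` returns False, which points to a reset line stuck low during operation.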

 

Antminer S17+ Structure

The miner consists of three hash boards, one control board, APW9+ power supply, and four cooling fans.

Identifying Common Issues with Hashboards and Troubleshooting Procedures

Phenomenon 1: A single hash board test detects 0 chips

To troubleshoot this issue, follow the steps below:

Step One: Check the power output

  • Check the power output at the location specified.
  • Ensure that the power output is within the expected range.
  • If the power output is not within the expected range, further troubleshooting will be required.

 

Step Two: Check the voltage domain voltage output

  • With 21V supplied, check the voltage output of each voltage domain, which should be around 1.6V.
  • Measure the output at the power supply terminal of the hash board.
  • Determine whether the MOS is shorted by measuring the resistance between pins 1, 4, and 8.
  • If the 21V supply is present but there is no domain voltage, continue checking downstream.
  • Measure the voltage output of D5/D8, which should be between 23V and 24.5V.
  • If the voltage output is not within the expected range, further troubleshooting will be required.
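
The checks in this step can be sketched as one function that lists what is out of range. The nominal values (21V input, about 1.6V per domain, 23-24.5V at D5/D8) come from the text, while the tolerances and all names are assumptions made for illustration.

```python
INPUT_NOMINAL = 21.0       # volts, supply to the hash board
INPUT_TOLERANCE = 1.0      # assumed tolerance
DOMAIN_NOMINAL = 1.6       # volts, "around 1.6V" per domain
DOMAIN_TOLERANCE = 0.15    # assumed tolerance
BOOST_RANGE = (23.0, 24.5)  # volts, expected at D5/D8


def step_two_findings(input_v, domain_voltages, boost_v):
    """Return a list of out-of-range findings; an empty list means all checks passed."""
    findings = []
    if abs(input_v - INPUT_NOMINAL) > INPUT_TOLERANCE:
        findings.append("input supply out of range")
    for index, volts in enumerate(domain_voltages, start=1):
        if abs(volts - DOMAIN_NOMINAL) > DOMAIN_TOLERANCE:
            findings.append(f"domain {index} voltage abnormal: {volts:.2f}V")
    if not BOOST_RANGE[0] <= boost_v <= BOOST_RANGE[1]:
        findings.append("boost output (D5/D8) out of range")
    return findings


# Example: domain 3 has collapsed while everything else looks normal.
readings = [1.6] * 13
readings[2] = 0.0
print(step_two_findings(21.0, readings, 24.0))
```

Returning a list rather than stopping at the first failure keeps every abnormal domain visible in a single pass.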

 

Step Three: Check the PIC circuit

  • Measure whether pin 2 of U3 has an output, which should be approximately 3.2V.
  • If there is no output or the voltage is not within the expected range, check the connection between the fixture cable and the hash board.
  • If the connection is good, re-program the PIC.
  • If there is an output of approximately 3.2V, proceed to troubleshoot further.
  • Check the output of signal pins (CLK/CI/RI/BO/RST) and compare with the expected voltage values as per the signal direction.
  • If the voltage values do not match the expected range, compare them with the measured values of adjacent groups.
  • If the signal output is not within the expected range, further troubleshooting will be required.

Step Four: Check the boost circuit output

  • Test D5/D8 to measure the voltage output.
  • The voltage output should be between 23-24.5V.
  • If the voltage output is not within the expected range, further troubleshooting will be required.

 

Step Five: Check the LDO 1.8V or PLL 0.8V output of each group

  • Check the LDO 1.8V or PLL 0.8V output of each group.
  • Ensure that the voltage output is within the expected range.
  • If the voltage output is not within the expected range, further troubleshooting will be required.

 

Step Six: Check the chip signal output (CLK/CI/RI/BO/RST)

  • Refer to the voltage range values for each signal direction.
  • Measure the voltage output of each signal pin (CLK/CI/RI/BO/RST).
  • Ensure that the voltage output for each signal pin is within the expected range.
  • If there is a large deviation in the voltage value, compare it with the measured values of adjacent groups.
  • If the signal output is not within the expected range, further troubleshooting will be required.
  • If the signal output is normal and the chip is still not working (for example, only 64 chips are detected), troubleshoot by shorting the RO pull-up resistor R639.
  • If 64 chips are detected after shorting, chips 1-64 should be normal, and you can troubleshoot the 65th chip.
  • If 63 chips are detected after shorting, continue troubleshooting toward the front of the chain.
  • To narrow down the fault, use the dichotomy method: test from the middle of the chain, starting at the 32nd chip.
  • If the fault lies before the 32nd chip, divide chips 1-31 into two groups, 1-16 and 17-31, and test again.
  • If the fault is within chips 1-16, divide them into 1-8 and 9-16; if it is within chips 17-31, divide them into 17-24 and 25-31.
  • If the fault does not lie before the 32nd chip, apply the same halving to chips 33-64.
  • Keep halving the suspect range and retesting until the problematic chip is found.
  • Once the problematic chip is found, troubleshoot it as required.
  • After the repair, retest to confirm that the chip is working correctly.
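
The dichotomy method described above is a binary search over the chip chain. The sketch below assumes a single fault and a hypothetical callable, `chain_ok_through(n)`, which stands for whatever bench test tells you that chips 1 to n respond correctly. It is not part of any fixture software.

```python
from typing import Callable, Optional


def first_faulty_chip(
    chain_ok_through: Callable[[int], bool],
    total_chips: int = 65,
) -> Optional[int]:
    """Binary-search for the first chip at which the chain stops responding.

    chain_ok_through(n) is a hypothetical test returning True when chips
    1..n all respond. Assumes one fault, so the result flips from True to
    False exactly once along the chain.
    """
    if chain_ok_through(total_chips):
        return None  # the whole chain responds; nothing to locate
    low, high = 1, total_chips
    while low < high:
        mid = (low + high) // 2
        if chain_ok_through(mid):
            low = mid + 1   # fault lies after mid
        else:
            high = mid      # fault lies at or before mid
    return low


# Example with a simulated fault at chip 47.
print(first_faulty_chip(lambda n: n < 47))  # 47
```

On a 65-chip chain the first probe lands at chip 33, next to the starting point of chip 32 suggested above, and the fault is isolated in at most seven further tests instead of up to 65.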

 

In summary, if a measurement deviates widely from the voltage ranges given under signal direction, compare it with the readings of adjacent groups. If the signal pin voltages are normal but the chip count is still short (for example, 64 chips are detected), short the RO pull-up resistor R639 and work forward from the result, using the dichotomy method (testing from the middle of the chain, starting at the 32nd chip) to narrow down the faulty chip.

Lastly, please refer to the troubleshooting comparison table for a summary of the steps mentioned above.

Phenomenon 2: A single hash board test detects an incomplete chip count

To locate the source of the error, check the relevant signals (CLK/CI/RI/BO/RST) both before and after the position where the error occurred. The IC whose signals read abnormally marks the problem area. Repair it by referring to the signal direction and voltage ranges.

Phenomenon 3: Single-board pattern NG, that is, the nonce response data is incomplete (PT2 station)

To find where nonce data is insufficient, connect a computer to the serial port and read the test log. The log shows the test results, which identify the position of the chip where the problem occurs. Once the location is known, replace the corresponding chip.

Phenomenon 4: Test temperature reading is abnormal (PT2 station)

To address this issue, examine the temperature-sensing power supply VDD and check the connection between the temperature sensor and the chip (TEMP_P; TEMP_N). Also inspect the soldering quality of the chip connected to the temperature sensor.

Next, check the heatsinks attached to the chip on both the front and back. Make sure the heatsinks are soldered properly, as poor soldering affects the temperature difference and can cause abnormal readings during testing.

Troubleshooting Common Miner Failures

Preliminary test of the whole machine

According to the test process document, the problems that arise at this stage are generally assembly process problems and control board process problems. Common examples include failure to detect the IP, an abnormal number of detected fans, and an abnormal detected chain. Repair any issue that arises during the test according to the monitoring interface and the test log prompts. The maintenance methods for the initial test and the aging test of the whole machine are the same.

Aging Testing of the Miner

Repair faults found during the aging test according to what the monitoring interface reports. For instance:

  • Abnormal fan display: Check whether the fan works normally, whether its connection to the control board is normal, and whether the control board itself is abnormal.
  • Missing chain: One of the three hash boards is not detected. In most cases, the problem is the connection between the hash board and the control board, so check the cable for an open circuit. If the connection is OK, test the board at the PT2 station. If it passes, the fault is most likely in the control board. If it fails, repair it using the PT2 maintenance methods.
  • Abnormal temperature: Generally, the temperature is too high. The monitoring system limits the PCB temperature to 90℃; beyond that, the miner raises an alarm and does not work normally. The usual causes are a high ambient temperature or abnormal fan operation.
  • Insufficient number of chips: Refer to PT2 for testing and repair.
  • No hashrate after running for a period of time: If the connection to the mining pool is interrupted, check the network.

If the miner still loses hashrate, reduce the frequency and leave the other conditions unchanged. Let the miner mine to see whether it loses hashrate and whether the hash board shows X. If it still shows X while losing hashrate, remove the heat sink of the hash board, let it mine, and wait for the hashrate to drop. Then measure whether the domain voltage is normal; generally, the domain voltage will be abnormal in the problematic domain. Next, measure the RI signal to see whether it is broken. If the RI signal is missing, the chip is most likely short-circuited or was damaged after being tinned.
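
The aging-test symptoms above map naturally onto a lookup table of first checks. This is a minimal sketch: the check lists paraphrase the text, the 90℃ limit is the one quoted above, and the symptom keys and function names are hypothetical.

```python
MAX_PCB_TEMP_C = 90  # PCB temperature limit quoted above

# Symptom -> ordered first checks, paraphrased from the list above.
AGING_CHECKS = {
    "abnormal fan display": [
        "check that the fan works",
        "check the fan's connection to the control board",
        "check the control board",
    ],
    "missing chain": [
        "check the cable between hash board and control board",
        "test the hash board at the PT2 station",
        "if PT2 passes, suspect the control board",
    ],
    "abnormal temperature": [
        "check the ambient temperature",
        "check fan operation",
    ],
    "insufficient chips": ["test and repair at the PT2 station"],
    "no hashrate": ["check the network and pool connection"],
}


def checks_for(symptom: str) -> list:
    """Return the ordered checks for a symptom, or an empty list if unknown."""
    return AGING_CHECKS.get(symptom.strip().lower(), [])


def pcb_temp_ok(temp_c: float) -> bool:
    """Return True while the PCB temperature stays within the quoted limit."""
    return temp_c <= MAX_PCB_TEMP_C


print(checks_for("Missing chain")[0])
```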

Other Considerations and Maintenance Flow Chart

  • To begin the repair process of a hash board, a routine test must be performed. The first step of this test is to visually inspect the board for any deformations or burn marks on the printed circuit board (PCB). If any issues are found, they must be addressed before proceeding with further tests. Next, any parts with obvious damage, such as burn marks, collision offset or missing components, should be identified. If there are no visible problems, the impedance of each voltage domain can be tested to detect short or open circuits. Additionally, the voltage of each domain should be checked to ensure it is around 1.5V.
  • Once the routine test is completed and no problems are found, the hash board tester can be used to perform chip detection. The test results from the hash board tester can be used to determine the location of the faulty chip.
  • The voltages of chip test points, such as CO, NRST, RO, XIN, BI, VDD0V8, and VDD1V8, should be tested starting from the area around the faulty chip. The signal direction, such as the RX signal passing in the reverse direction (from chip 65 to 1), and several signals including CLK, CO, BO, and RST transmitting in the forward direction (from chip 1 to 65), can be used to locate the fault point through the power supply sequence.
  • After locating the faulty chip, re-solder it. Add flux around the chip, heat the solder joints of the chip pins until the solder is molten so that the pins and pads re-align, and remove the excess tin to achieve the effect of re-tinning. If the failure persists after re-soldering, replace the chip.
  • To determine whether the repaired hash board is a good product, it needs to pass the fixture tests twice. After replacing any faulty components, the board should be cooled down and tested. If it passes, it can be set aside to cool down further before being tested again. If the board passes the test twice, it can be considered a good product.
  • After repairing the board, relevant maintenance and analysis records should be prepared for feedback to the production, after-sales, and research and development departments. These records should include the date, SN, PCB version, tag number, bad cause, and bad liability attribution.
  • Finally, after the repaired hash board is installed, the entire miner should undergo conventional aging. Good products repaired at the production end should re-enter the production flow from the first station, which means at least an appearance inspection and then testing from the PT1/PT2 test station onward.
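
The record fields and the two-pass acceptance rule described above can be sketched as a small data structure. The field names follow the list in the text. The class itself, its method names, and the reading that the two most recent fixture tests must both pass are assumptions for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class RepairRecord:
    """One hash board repair, with the fields listed in the text above."""

    repair_date: date
    serial_number: str
    pcb_version: str
    tag_number: str
    bad_cause: str
    liability: str
    fixture_results: List[bool] = field(default_factory=list)

    def record_test(self, passed: bool) -> None:
        """Append the outcome of one fixture test."""
        self.fixture_results.append(passed)

    def is_good_product(self) -> bool:
        """Assumed rule: the two most recent fixture tests both passed."""
        recent = self.fixture_results[-2:]
        return len(recent) == 2 and all(recent)


# Example with placeholder values.
record = RepairRecord(date(2024, 1, 15), "SN-EXAMPLE", "V1.0", "TAG-1",
                      "chip 47 open solder joint", "production")
record.record_test(True)
print(record.is_good_product())  # False: only one pass so far
record.record_test(True)
print(record.is_good_product())  # True
```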
Disclaimer: The information provided on this blog is for informational purposes only and should not be taken as any form of advice.
