The S17, S17 Pro and T17 servers are Bitmain’s newest models in the 17 server series. All three use the APW9 power supply. The pictures here use the S17 mining server as an example:
Introduction to T17 Hashboard
There are a total of 30 BM1397 chips on the whole board. They are distributed under the front heat sink and connected to the heatsink through the tinned copper top of the chips.
One of the most common issues with the T17, as with the rest of the Bitmain 17 series, is that the copper chip tops oxidize and delaminate. This delamination causes the heatsinks to fall off, and the affected chips then lose signal. On some occasions, a loose heatsink shorts out the power supply and destroys the hashboard beyond repair.
When working on a Series 17 hashboard, it is very important to replace all delaminated chips, otherwise the repair will not last and could possibly cause more damage.
There are now heatsink upgrades which involve removing all the original heatsinks soldered to the hashboard and then replacing them with bulk heatsinks that will be screwed into the hashboard, avoiding the biggest problem of this generation.
Tools for T17 Repairs
It is a flammable clear solvent that effectively cleans ionic, polar and non-polar residues. It is safe on most painted surfaces, plastics, and elastomers. It leaves no residue, evaporates quickly relative to water, and does not cause corrosion.
The ARC Tester helps identify malfunctioning chips on the board and can flash the EEPROM of hashboards without disassembling the miner. Unlike Bitmain’s tester, there is no need to keep several testers and switch between them, to swap SD cards, or to wait for the tester to load. Just turn the selector knob.
ATTEN ST-862D 1000W 110V anti-static hot air soldering station with adjustable temperature. Its temperature stabilization feature helps protect against overheating. It offers menu-style operation and three preset keys for quick access to commonly used temperature and air volume settings.
Nidec Cooling fans
12V 1.65A 120*120*38mm cooling fan used for replacing damaged cooling fans.
Fluke 15b+ Multimeter
The Fluke 15B+ is the revamped version of the well-liked 15B DMM and is one of the cheaper devices from Fluke. It is a compact, reliable, easy-to-use multimeter, used for AC/DC voltage and current, capacitance, diode and continuity tests.
Flux aids in soldering and desoldering processes by removing oxide films which form on the surface of metals being soldered. It increases the wetting ability of the solder, causing it to flow more uniformly over surfaces without balling-up (dewetting).
Chipquick Solder Paste
Solder paste is a mixture of minute solder spheres held within a specialised form of solder flux. The fact that it is a paste means that it can be easily applied to the board during PCB assembly. The solder particles are a mixture of solder.
Electrostatic dissipative (ESD) tweezers have an electrostatic dissipative coating that helps protect delicate electronic components from static damage. Multi-purpose precision tweezers are also useful for beading and other fine crafts, laboratory work, cosmetic hair removal, and watch, clock and jewelry repair.
Longwei Variable PSU
Adjustable Regulated Switching Power Supply Digital with Leads Power Cord (30V 10A). You can easily tune it within 0-30V and 0-10A. The backlit 4-digit LED display provides a more accurate readout for the voltage and current value.
Weller Soldering Station
70 watt station with iron and heat-resistant silicone cable for safe handling. Reinforced safety rest for secure iron storage. Temperature stability and temperature lock protect tips and components, giving consistent, repeatable soldering results.
Before you start
Set the variable PSU between 17V and 21V. Have some new, pre-tinned chips ready for installation. Before powering the hashboard on, make sure the polarity is correct, otherwise the booster chip will burn. Connect to the IP address of your ARC Tester, which is shown when you push the left knob.
When connecting the PSU to the hashboard, make sure the PSU is powered off. Connect the negative pole first, and then the positive pole. When everything is verified, power on the power supply. You can start testing by using the play button in the tester’s web interface.
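The pre-power checks above can be written down as a small checklist helper. This is only a sketch: the `BenchSetup` type and the function name are hypothetical, and only the 17V to 21V window, the powered-off requirement and the polarity check come from the text.

```python
from dataclasses import dataclass

PSU_MIN_V = 17.0  # lower bound of the bench supply window given in the text
PSU_MAX_V = 21.0  # upper bound of the bench supply window given in the text


@dataclass
class BenchSetup:
    """Hypothetical record of the bench state before powering the hashboard."""
    psu_set_voltage: float     # voltage dialled in on the variable PSU
    psu_output_enabled: bool   # True if the PSU output is already live
    polarity_verified: bool    # True once the +/- leads have been checked


def pre_power_problems(setup: BenchSetup) -> list[str]:
    """Return the reasons it is not yet safe to power the hashboard."""
    problems = []
    if not PSU_MIN_V <= setup.psu_set_voltage <= PSU_MAX_V:
        problems.append("PSU voltage outside the 17-21 V window")
    if setup.psu_output_enabled:
        problems.append("PSU must be off while the leads are connected")
    if not setup.polarity_verified:
        problems.append("polarity not verified")
    return problems
```

An empty list means all three conditions from the text are met; the helper does not model the connection order (negative pole first), which remains a manual step.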
How to flash with micro SD
Antminers have a Micro SD card slot that can be used to flash the firmware on the Antminer. Antminers can be sensitive about the Micro SD card being used, where cards with larger memory (16 GB or above) may not work in all cases.
- Format a Micro SD card with the FAT32 file system.
- Extract the content of the ZIP file to the root of the SD card.
- Insert the Micro SD card into the card slot on the Antminer. Start the miner and wait a few minutes. When the two LED lights (green and red) blink at the same time, the process is complete.
- Power off the miner and remove the SD card.
- Power on the miner again. It will now have the firmware installed.
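The extraction step can also be scripted. The following is a minimal sketch using Python's standard `zipfile` module; the function name is hypothetical, and it assumes the card has already been formatted as FAT32 and mounted at a path you pass in.

```python
import zipfile
from pathlib import Path


def extract_firmware_to_sd(zip_path: str, sd_root: str) -> list[str]:
    """Extract every file in the firmware ZIP directly into the SD card root.

    Returns the names of the archive members that were written.
    """
    root = Path(sd_root)
    if not root.is_dir():
        # The card must already be formatted (FAT32) and mounted.
        raise FileNotFoundError(f"SD card mount point not found: {root}")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(root)  # keeps the archive layout under the root
        return archive.namelist()
```

Checking the returned names against the card's root directory is a quick way to confirm the files did not end up inside an extra subfolder.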
How to flash from web
Go to your miner interface and go to System > Upgrade. Click “Browse” to find the file and “Flash” to load the file. Keep the miner powered on and running while the firmware loads, and for 20 minutes after. It will show “System Upgrade Success” if the upgrade is successful.
Disable/Enable Auto-tuning with VNish
An easy way to get started and achieve better hashrate performance is to apply one of the predefined Mining Profiles. A mining profile defines the target hashrate level for the Antminer. Go to Miner Configuration -> Mining profiles. Use the Preset property to select a target hashrate and click the Save button to apply. After selecting a Mining Profile, the miner will restart itself a number of times over the next few hours. This is expected and part of the optimization process.
Antminer T17 hashboards can fail for several reasons. First, we need to look at the component structure of the hashboard and its signal transmission circuit.
Signal transmission channel: the CLK, RST, BO and CO signals are transmitted from the first chip to the second and so on up to the 30th, while the RI signal is transmitted in the reverse direction, from the 30th chip back to the first.
Signal transmission circuit
Signal communication circuit from IO socket to chip
Steps to hashboard maintenance
First perform visual inspection on the hashboard to be repaired, observe whether the PCB is deformed or burnt. If yes, it must be handled first; check whether there are any parts with obvious burn marks, collision offset or missing parts, etc.
If no problem is found through visual inspection, the impedance of each voltage domain can be tested first to detect whether there is a short circuit or an open circuit. If yes, it must be handled first.
Check whether the voltage of each domain is about 1.7V.
Once the routine tests pass, use the ARC Tester to perform chip detection, and locate the fault based on the tester results.
According to the chip map results from the ARC Tester, test the voltages of chip test points CLK, CO, RI, BO, RST, VDD0V8 and VDD1V8, etc. starting from the vicinity of the faulty chip.
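The 1.7V figure in the routine check is consistent with the 17V input and the 10 series voltage domains mentioned in the troubleshooting section: 17 / 10 = 1.7. A minimal sketch of that check follows; the tolerance is an assumption, since the text only says "about 1.7V".

```python
SUPPLY_V = 17.0      # board input voltage during testing (from the text)
DOMAIN_COUNT = 10    # number of series voltage domains (from the text)
TOLERANCE_V = 0.15   # ASSUMPTION: acceptable deviation, not given in the text


def expected_domain_voltage(supply_v: float = SUPPLY_V,
                            domains: int = DOMAIN_COUNT) -> float:
    """Voltage each domain should show when the supply divides evenly."""
    return supply_v / domains


def suspect_domains(measured_v: list[float],
                    tolerance_v: float = TOLERANCE_V) -> list[int]:
    """Return 1-based numbers of the domains whose voltage is off target."""
    target = expected_domain_voltage()
    return [i for i, v in enumerate(measured_v, start=1)
            if abs(v - target) > tolerance_v]
```

A domain reading near 0V or near twice the target would both be flagged; which component is responsible still has to be determined with the impedance tests described above.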
In order to repair the Antminer hashboard, we first need to determine the location of the damaged chip with the ARC Tester. In order to ensure accuracy, we still need to use a multimeter to test the hashboard signals.
The Antminer hashboard has 5 principal test points. They are: CLK, CO, RI, BO, RST, defined as follows:
CLK = Clock signal
CO = Signal transmission
RI = Signal reception
BO = Pulse signal
RST = Reset signal
The tester sends four signals (CLK, CO, BO, RST) forward through the chip chain, and one signal (RI) in the reverse direction.
Each set of test points corresponds to one chip. We can test their voltage and resistance to determine whether the chip is good.
The power supply must be disconnected when testing resistance.
Due to the operating characteristics of the diode, when testing resistance, touch the red probe of the multimeter to the GND of the hash board and the black probe to each of the 5 test points.
When testing voltage, touch the black probe of the multimeter to GND and the red probe to each of the 5 test points. Use the ARC Tester to start the hash board and measure while it is working.
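Because the signals travel in known directions, the first chip at which a signal disappears points to the fault. The sketch below models this; the function names are hypothetical, and the 30-chip count and signal directions are taken from the text.

```python
from typing import Optional

CHIP_COUNT = 30                                # BM1397 chips per T17 hashboard
FORWARD_SIGNALS = ("CLK", "CO", "BO", "RST")   # travel from chip 1 to chip 30
REVERSE_SIGNALS = ("RI",)                      # travels from chip 30 to chip 1


def propagation_order(signal: str, chip_count: int = CHIP_COUNT) -> list[int]:
    """Return chip numbers in the order the given signal passes through them."""
    if signal in FORWARD_SIGNALS:
        return list(range(1, chip_count + 1))
    if signal in REVERSE_SIGNALS:
        return list(range(chip_count, 0, -1))
    raise ValueError(f"unknown test point: {signal}")


def first_missing(signal: str, present: dict[int, bool]) -> Optional[int]:
    """Walk the chain in signal direction; return the first chip where the
    signal is absent, or None if it is present everywhere.

    Chips missing from `present` are treated as good (not yet measured).
    """
    for chip in propagation_order(signal):
        if not present.get(chip, True):
            return chip
    return None
```

With the same two dead test points, a forward signal such as CLK is first lost at the lower-numbered chip, while RI is first lost at the higher-numbered one.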
Main components of the hashboard
ASIC chip (BM1397)
There are a total of 30 BM1397 chips on the whole board. They are distributed under the front heatsink and connected to the heatsink through the thermosetting adhesive. When an ASIC chip is damaged, the miner will have no hashrate or low hashrate. In some cases, the low hashrate of a single chip will also lead to a low hashrate of the entire hashboard. In this case, the chips are tested one by one to locate the faulty one.
MOS chip (P34M4SS 1939)
There are 4 MOS chips on the whole board. A MOS chip is a field-effect transistor (MOSFET) used in the hashboard voltage regulator circuit. A damaged MOS chip causes no voltage or abnormal voltage on the whole board. Usually, when the tester reports 0 ASIC, check the voltage of the voltage domains first; if it is abnormal, the cause is most likely a MOS failure. This component can be replaced by other models; please check the chip datasheet.
PIC chip (dsPIC33EP16)
The PIC chip dsPIC33EP16 is located on the upper left side of the reverse side of the hashboard, and there is a total of 1 PIC chip on the whole board. This component is used to store the configuration data of the hashboard. After the chip is damaged, the hashboard will not operate, and the voltage and resistance of the whole board will be normal, but it will not start normally.
Temperature sensor chip (T451 71J)
There are 4 T451 temperature sensor chips on the whole board, located on the back of the hash board. They sit under the heatsink on the reverse side, so the heatsink must be removed to reach them. A damaged sensor chip causes abnormal temperature readings, and the hashboard will not start.
Solid capacitor (330 30V)
There are 6 solid capacitors (330 30V) on the whole board: 5 on the right side of the front of the hash board, and 1 on the left side. The main function of this component is to filter, rectify and store energy. The 5 capacitors at the power input interface of the T17 hash board are mainly used for filtering and rectification. If one of these 5 capacitors is damaged, it will cause abnormal voltage on the hash board and damage other components. The capacitor near the inductor is mainly used for rectification and energy storage. If it is damaged, it will affect the operation of the boost circuit and prevent the hash board from operating.
EEPROM memory (02DMCN)
This component is mainly used to store EEPROM data and is located on the upper left side of the back of the hashboard. If the chip is damaged, the entire miner will not run, because there are three hash boards in the Antminer T17 and the hash boards must maintain the same EEPROM data to be recognized by the control board.
Low dropout voltage regulator chip (MP2019)
The low dropout voltage regulator chip is located on the upper left side of the reverse side of the hash board. Its main functions are over-voltage, over-current and over-temperature protection, and it integrates an internal MOSFET. When damaged, it causes abnormal voltage, no voltage, under-voltage or other faults in the voltage domain, which prevent the miner from operating.
Boost converter chip (1517DR)
The boost converter chip is located on the upper left side of the front of the hash board. Its main function is to detect and control the output voltage of the hashboard boost circuit and adjust it to a suitable voltage in time; it integrates an internal FET voltage regulator module. When damaged, it causes no voltage in the section of the circuit after the MOS. When this chip fails, the MOS output tests normal, but there is no voltage in the voltage domain.
Inductor (100)
The inductor (marked 100) is located on the upper left side of the front of the hash board. Its main function is to stabilize the current of the hashboard and suppress electromagnetic interference on the circuit board. When it is damaged, the hash board is easily affected by electromagnetic interference from the adjacent hashboard, which disturbs the 25 MHz clock signal and results in unstable operation of the hashboard, for example low hashrate, inaccurate temperature monitoring, boot loops, and other faults.
Voltage monitoring chip (SXE1933)
The voltage monitoring chip is located on the upper left side of the front of the hashboard. The main function of the voltage monitoring chip is to monitor the real-time voltage in the voltage domain of the hashboard. This component is located at the back of the boost circuit. After damage, the front boost chip will not receive the returned data and stop working, causing the hashboard not to start.
SMD resistor (10R0)
There are 20 10R0 chip resistors on the whole board, distributed between the voltage domains on the back of the hashboard. Their main function is to link the voltage domains of the whole board through a limited conduction path, so that the voltage is balanced between domains. When damaged, they cause poor voltage in the voltage domain, resulting in unstable operation of the hashboard.
Chip capacitors (330e98)
There are 20 chip capacitors on the whole board, distributed between the voltage domains on the back of the hashboard. Their main function is to filter and rectify, to achieve voltage balance between the voltage domains. When damaged, they cause a voltage difference in the voltage domain, resulting in unstable operation of the hashboard.
Voltage domain steady current chip (LDO)
There are 26 voltage domain steady current chips on the whole board, distributed under the heat sink on the reverse side of the hashboard. The function of this component is to stabilize the voltage and provide a preset starting current to the voltage domain where it is located. After the chip is damaged, it will cause the voltage domain to fail. Because the voltage domain is in series, it will cause no voltage in the back-end voltage domain, and the voltage domain involved will stop operating.
Repair Cases for Hashboard Troubleshooting
Note that one particular problem affects all 17 series machines: whatever fault brings the machine in for service, it probably also has many delaminated chips that are about to make the heatsinks fail one after the other. Although replacing delaminated chips with tinned ones solves this problem, it is still likely that some undiscovered delaminated chips will drop their heatsinks in the near future. This is why upgrading to all-in-one heatsinks is highly recommended.
We will now explain the different problems you may encounter, and their most common solutions. This will save you time in diagnostics.
- Check whether the IO cable and hashboard are in good contact.
- Check that the T17 hash board has a voltage of 17V at J6-J7 during testing.
- During testing, measure whether there is voltage across each of the 10 voltage domains.
If there is no voltage in the voltage domains, check whether pin 4 of Q7, Q8, Q9 and Q11 is at its normal working level, which is low (0V). If it is high, check whether pin 1 of Q10 is high (3.3V). If Q10 does not have 3.3V, the PIC (U3) has lost its firmware or has no power supply.
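The no-voltage check above is a short decision tree, sketched here in code. The function name and the logic-level thresholds are assumptions; the text only gives the nominal levels (0V low, 3.3V high), and it does not say what to do in the branches marked as not covered.

```python
LOW_MAX_V = 0.3    # ASSUMPTION: anything at or below this counts as "low" (0 V)
HIGH_MIN_V = 3.0   # ASSUMPTION: anything at or above this counts as "high" (3.3 V)


def diagnose_no_domain_voltage(pin4_v: dict[str, float],
                               q10_pin1_v: float) -> str:
    """Apply the Q7/Q8/Q9/Q11 pin 4 and Q10 pin 1 checks from the text.

    `pin4_v` maps a MOS designator (e.g. "Q7") to its measured pin 4 voltage.
    """
    high = [name for name, v in pin4_v.items() if v > LOW_MAX_V]
    if not high:
        # Pin 4 is at its normal low level on every MOS.
        return "pin 4 levels are normal; cause not covered by this check"
    if q10_pin1_v < HIGH_MIN_V:
        return "U3 PIC has lost its firmware or has no power supply"
    return ("pin 4 is high on " + ", ".join(high)
            + " although Q10 pin 1 has 3.3 V; case not covered by the text")
```

The PIC conclusion is the only diagnosis the text actually states; the other two return values simply make explicit that the guide gives no further direction for those readings.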
If the power supply is normal and there is voltage in the voltage domains, measure the RI signal of the chips to see whether it has a voltage of 1.8V. Start from the test point of the last chip. If the last chip has voltage, measure whether the 20th chip has RI at 1.8V, and continue in the same manner. When you find the chip that has no RI output voltage, first measure the 1.8V power supply of this chip. If there is no 1.8V supply, check the 1.8V power supply circuit.
The 1.8V power supply circuit feeds pin 1 of the LDO through the voltage division of the voltage domain, and pin 5 of the LDO outputs 1.8V. (Each voltage domain has a 1.8V LDO to supply power to the chips.) If there is no output, this LDO is likely faulty.
If the 1.8V supply is normal, cut the power, measure the resistance to ground at the test point, and compare it with a known-good board to see whether the resistance is abnormal. If the resistance is normal and there is no soldering problem, the chip itself is likely faulty.
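The RI trace described above can be sketched as a walk from the last chip back toward the first, followed by the supply, resistance and soldering checks in the order the text gives them. The function names and the 1.6V threshold are assumptions; the text only states the nominal 1.8V level. `measure_ri` stands in for a manual multimeter reading at a given chip.

```python
from typing import Callable, Optional

CHIP_COUNT = 30     # BM1397 chips per T17 hashboard (from the text)
RI_MIN_V = 1.6      # ASSUMPTION: minimum reading accepted as a valid 1.8 V RI


def first_chip_without_ri(measure_ri: Callable[[int], float],
                          chip_count: int = CHIP_COUNT,
                          min_v: float = RI_MIN_V) -> Optional[int]:
    """Probe RI from the last chip toward chip 1 and return the first chip
    with no RI output, or None if every chip has it."""
    for chip in range(chip_count, 0, -1):
        if measure_ri(chip) < min_v:
            return chip
    return None


def next_step(vdd_1v8_ok: bool,
              resistance_matches_good_board: bool,
              soldering_ok: bool) -> str:
    """Follow the order of checks given in the text for the chip found above."""
    if not vdd_1v8_ok:
        return "check the 1.8 V LDO supply circuit"
    if not resistance_matches_good_board:
        return "investigate the abnormal resistance to ground"
    if not soldering_ok:
        return "rework the soldering"
    return "replace the chip"
```

This version probes every chip in turn; on the bench, the text suggests jumping in larger steps (for example from the last chip to the 20th) to narrow the range faster.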
“X ASIC Found”
When a hashboard test finds X chips, the RI signal can be judged to be normal. If the following chip cannot be found, directly measure the CLK, RST and CO voltages of the Xth chip to see whether they are normal. If CLK does not have a voltage of 0.8V, check the power supply circuit of CLK.
CLK circuit analysis: if CLK does not have 0.8V, first check whether the 0.8V power supply of the bad chip’s voltage domain is normal. The 0.8V supply is obtained by dividing the voltage domain, in the same way as the 1.8V supply, with the LDO outputting 0.8V on pin 5. For the maintenance method, refer to the 1.8V maintenance method.
Note that in each voltage domain of the T17, which holds 3 chips, there are two 0.8V LDOs, and each LDO supplies up to 2 chips.
If the 0.8V power supply circuit has no 0.8V output, check whether the 0.8V LDO has a supply voltage of about 3.2V. If it has, check whether the LDO is insufficiently soldered or short-circuited. If there is a 0.8V output, check the resistance to ground of the chip; if the resistance is correct, the chip itself is likely bad.
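The "X ASIC Found" procedure can be sketched in two parts: choosing where to probe, and working through the 0.8V rail checks. All names and tolerances here are assumptions. Probing chip X+1 as well as chip X is an extension of the text, which only names the Xth chip, and the "no input" branch is an inference the text does not spell out.

```python
OUT_TOL_V = 0.1   # ASSUMPTION: tolerance around the nominal 0.8 V LDO output
IN_TOL_V = 0.3    # ASSUMPTION: tolerance around the "about 3.2 V" LDO input


def chips_to_probe(found: int, chip_count: int = 30) -> list[int]:
    """Chips around the break: the last one found and the first one missing."""
    if not 0 <= found <= chip_count:
        raise ValueError("found must be between 0 and chip_count")
    if found == chip_count:
        return []  # every chip was detected, nothing to chase
    return [c for c in (found, found + 1) if 1 <= c <= chip_count]


def diagnose_0v8_rail(ldo_out_v: float, ldo_in_v: float,
                      ground_resistance_ok: bool) -> str:
    """Work through the 0.8 V supply checks in the order the text gives."""
    if abs(ldo_out_v - 0.8) > OUT_TOL_V:
        if abs(ldo_in_v - 3.2) <= IN_TOL_V:
            return "LDO is powered: check it for poor soldering or a short"
        return "LDO has no ~3.2 V input: trace its supply"  # inferred branch
    if ground_resistance_ok:
        return "rail and resistance are fine: suspect the chip itself"
    return "resistance to ground is abnormal: compare with a good board"
```

For a board that reports 17 chips, `chips_to_probe(17)` returns `[17, 18]`, the two positions on either side of the break.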
The chip signal pin output is abnormal (BO / RST / CO / RI / CLK). Determine the position of the fault according to the signal direction. With the power off, first measure the impedance of the chip to ground and compare it with a good board or an adjacent group.
The temperature reading is abnormal. On a single board, this fault appears as sensor NG (the fixture interface can display temp NG, and the log shows the corresponding test results); on a complete machine, it appears as a temperature reading of 0 °C or no temperature reading. For this kind of problem, check the temperature sensor and the soldering of the surrounding parts, as well as the soldering of the chip corresponding to the abnormal sensor.
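Comparing impedance readings against a good board can be sketched as a simple tolerance check per test point. The function name and the 10% relative tolerance are assumptions; the text does not give an acceptable deviation, and the readings themselves must come from your own measurements.

```python
def abnormal_points(suspect: dict[str, float],
                    reference: dict[str, float],
                    rel_tol: float = 0.10) -> list[str]:
    """Return the test points whose impedance to ground deviates from the
    known-good reference by more than `rel_tol` (ASSUMPTION: 10%).

    Both dicts map a test point name (CLK, CO, RI, BO, RST) to a reading in
    ohms. A point missing from `suspect` is reported as abnormal.
    """
    flagged = []
    for point, ref in reference.items():
        got = suspect.get(point)
        if got is None or abs(got - ref) > rel_tol * abs(ref):
            flagged.append(point)
    return flagged
```

The same function works when the reference is an adjacent chip group on the same board rather than a separate good board.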
“29 chips problem”
In this case, the tester finds 29 chips, but the signal readings at the 29th and 30th chips are good. The fault lies elsewhere, yet the signals you measure there also appear good. You will have to short the signal and see where the tester no longer reacts, proceeding by elimination until you find and replace the faulty chip.