On reliability of pogo pins
A bit over a year ago I designed and built a device for testing assembled printed circuit boards as they come off the assembly line. While I'm not new to electronic test fixtures, this was the first time I had used the bed-of-nails approach: the test jig has a number of spring-loaded pogo pins that make contact with various test pads on the device-under-test (DUT). This setup has now completed thousands of cycles and the device has proved capable of detecting a large variety of defects, without doubt preventing many expensive debugging sessions.
However, one problem that has troubled this setup since the beginning is its unreliability. Even after a lot of fussing around with various adjustments, the procedure still has an abysmal false error rate compared to the actual rate of manufacturing defects. In many cases, the operator must remove and re-seat the DUT and restart the test several times before the test signals a pass. Such test repetitions obviously cause a lot of frustration, decrease confidence in the testing procedure and significantly lengthen a test that would otherwise take only a few moments. All the evidence, like the fact that detected defect types appear completely random and that most test failures disappear when re-seating the DUT, points firmly towards the pogo pins as the cause.
I was surprised at this outcome, since I had never heard of bad contacts being such a problem with pogo pins. There are quite a few blog posts and basic tutorials around about pogo pin test jigs. Hacker Noon mentions that getting the fine mechanical details correct can be tricky. The Big Mess o' Wires blog says that their test board only worked reliably after three iterations of the design. Thom wrote that they didn't have many issues with contacts on their test jig. It seems that reliability is not a common problem people have with pogo pins, once initial mechanical problems have been ironed out.
My bed of nails setup is shown above. It uses P75-type pogo pins - a widely available, cheap variant of uncertain origin. For example, they are sold by Adafruit. The whole bed has 21 pins and uses a combination of needle heads (P75-B1) and cupped heads (P75-A1). There was not enough PCB space on the DUT for all the required test pads, so I used cupped head pins to mate with the underside of THT connector pins. P75 pogo pins seem to use exposed steel for the head and plunger (they are slightly magnetic) and only have gold plating on the bottom part of the body. I'm not using the mounting sleeves; the pin bodies are soldered directly to the test jig PCB.
The mechanical parts have been removed in the photograph above, but you can get an idea of how they look from the CAD render below. During the test, the DUT is securely fixed onto the pins using a clamp, centering pins and a frame. This setup is similar to the one described by Hacker Noon. The difference is that I'm using two parallel PCBs to position the pins instead of 3D printed parts. The setup was designed so that the pogo pins only compress to approximately half of their 100 mil travel. The mechanical frame carries most of the clamping force.
The boards I'm testing have a lead-free HASL finish and no solder paste is applied to the test pads. This means that the test pads might be sensitive to oxidation. However, that shouldn't be a problem since the test is run shortly after production. It's also worth mentioning that I'm testing an analog circuit. Compared to purely digital tests, analog tests are more sensitive to the resistance between the test fixture and the DUT.
Since I have a lot of data collected from the test device I thought statistical analysis might shed some light on the reliability problem. If not directly showing a way to improve the existing device, perhaps it would at least give me some idea what can be expected from pogo pins when designing future test fixtures.
The first thing I was interested in was the resistance between a pogo pin on the test fixture and its corresponding test pad on the DUT. The test procedure was not designed to measure this directly. Fortunately, I found a way to estimate the test point resistance for two specific pogo pins (out of 21) by calculating it from other measurements taken during the test procedure. Of course, this is not as good as a direct measurement: the estimate is still affected somewhat by variations in some components on the DUT, by the test device and by the resistances of other test points. A Monte Carlo simulation showed that these effects cause an error of less than 10 mΩ in the resistance estimate.
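To illustrate the kind of Monte Carlo error estimate mentioned above, here is a minimal sketch. The measurement model is purely hypothetical (a known test current through the contact in series with a reference resistor); the real test circuit is not described here, and all constants are assumptions chosen only for illustration.

```python
import random
import statistics

# Hypothetical measurement model, NOT the actual test circuit: a known test
# current flows through the contact in series with a reference resistor, and
# the contact resistance is estimated from the measured voltage drop.
I_TEST = 0.1           # A, assumed test current
R_REF_NOMINAL = 1.0    # ohm, assumed series resistor on the DUT
R_REF_TOL = 0.005      # assumed 0.5 % component tolerance
V_NOISE = 0.0001       # V, assumed voltmeter noise (1 sigma)
R_CONTACT_TRUE = 0.1   # ohm, contact resistance used in the simulation


def simulate_estimate(rng):
    """One Monte Carlo draw: return the estimated contact resistance."""
    # The real resistor deviates from nominal; the estimator assumes nominal.
    r_ref = R_REF_NOMINAL * (1 + rng.uniform(-R_REF_TOL, R_REF_TOL))
    v_meas = I_TEST * (r_ref + R_CONTACT_TRUE) + rng.gauss(0, V_NOISE)
    return v_meas / I_TEST - R_REF_NOMINAL


rng = random.Random(0)  # fixed seed so the run is repeatable
errors = [simulate_estimate(rng) - R_CONTACT_TRUE for _ in range(10000)]
print("std of estimate error: %.2f mOhm" % (1000 * statistics.pstdev(errors)))
print("max abs error:         %.2f mOhm" % (1000 * max(abs(e) for e in errors)))
```

The spread of the simulated errors then gives a bound on how much of the observed variance can be blamed on component tolerances rather than on the contact itself.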
As luck would have it, one of the pogo pins I was able to estimate was using the needle head while the second one was using the cupped head. This resulted in the following two histograms of resistances to two test points. They show how commonly each of the two test points exhibited a certain resistance over thousands of matings with the DUT:
Different colors show data from different DUT production batches. Overall, you can see that the connection most commonly resulted in a resistance of around 0.1 Ω and the majority of connections were below 0.5 Ω. This is pretty good, even if somewhat above the 50 mΩ rated contact resistance for this type of pin. The cupped head pin showed less variance than the needle head. Still, the values show much higher variance than the estimated 10 mΩ error, which gives some confidence that the variance is actually due to changing contact resistances of the pogo pins.
However, one thing that is not visible on these plots is that some connections resulted in estimates well over 1 Ω (approximately 10% for the needle head and 6% for the cupped head). I could also only produce this estimate when the test had progressed to the point where certain voltage measurements had been made (these depend on reasonably good contact over 4 pogo pins for the needle head pin and 2 pogo pins for the cupped head pin). Hence test runs where these measurements were not taken are not included in the histograms above.
So what about these failed attempts? One way to show them is the number of test repetitions that a DUT had to undergo before a test first passed. Using records of thousands of tests, the following histogram emerged:
Again, the colors show data from different production batches. Overall, approximately 60% of DUTs passed on the first test attempt. A bit above 20% passed on the second and around 10% on the third attempt. You can also see some differences between batches. For example, the batch shown in red was particularly bad: more of its DUTs needed a second attempt than passed on the first. The number of DUTs that failed the test 10 times or more is very small - mostly these are the DUTs that actually had a manufacturing defect and didn't fail due to a false reading on the test fixture.
The histogram shows a nicely exponential characteristic - exactly what you would expect if each test repetition was an independent random event that succeeds with probability Ppass. From the data I can estimate that Ppass ≈ 0.6, in line with the roughly 60% of DUTs that passed on the first attempt.
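As a rough illustration of how such an estimate can be obtained from a histogram of attempts, the sketch below uses the maximum-likelihood estimator for a geometric distribution (number of DUTs divided by total number of attempts). The counts are made up for the example and are not the measured data; the function name is likewise just an illustrative choice.

```python
# Hypothetical counts, NOT the real data: attempts_hist[k] is the number of
# DUTs that first passed on attempt k.
attempts_hist = {1: 600, 2: 240, 3: 96, 4: 38, 5: 15, 6: 6}


def estimate_p_pass(hist):
    """Maximum-likelihood estimate of p for a geometric distribution."""
    n_duts = sum(hist.values())
    n_attempts = sum(k * count for k, count in hist.items())
    # For a geometric distribution the MLE is 1 / (mean number of attempts).
    return n_duts / n_attempts


p_pass = estimate_p_pass(attempts_hist)
print("estimated P_pass = %.3f" % p_pass)
```

In practice, DUTs with genuine manufacturing defects (the long tail of 10 or more failures) would need to be excluded first, since they do not follow the same random process.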
If I further assume that a test succeeds only if all pogo pins make good contact, and that each of the 21 pogo pin contacts is an independent random event, I can calculate the probability Pfail-pin that a single pogo pin fails to make good contact. Since Ppass = (1 - Pfail-pin)^21, it follows that Pfail-pin = 1 - Ppass^(1/21) ≈ 2.4%.
Using this model, I can predict back the probability that a DUT first passes the test on the N-th attempt: P(N) = Ppass × (1 - Ppass)^(N-1).
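The arithmetic behind the per-pin figure can be checked with a few lines of Python. This is a sketch under the stated independence assumption; the value 0.6 for Ppass is a rounded assumption taken from the roughly 60% first-attempt pass rate.

```python
N_PINS = 21    # number of pogo pins in the bed of nails
P_PASS = 0.6   # assumed: rounded from the ~60 % first-attempt pass rate


def pin_failure_probability(p_pass, n_pins):
    """Solve P_pass = (1 - P_fail_pin) ** n_pins for P_fail_pin."""
    return 1.0 - p_pass ** (1.0 / n_pins)


p_fail_pin = pin_failure_probability(P_PASS, N_PINS)
print("P_fail_pin = %.1f %%" % (100 * p_fail_pin))  # prints "P_fail_pin = 2.4 %"
```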
This model fits the measured histogram almost perfectly, as you can see in the picture below. The predicted number of test repetitions before first pass (red) is laid over the histogram of measurements (gray).
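The predicted curve is simply a geometric distribution. A minimal sketch, again assuming Ppass = 0.6 (a rounded value, not a fitted one):

```python
P_PASS = 0.6   # assumed: rounded from the ~60 % first-attempt pass rate


def first_pass_probability(n, p_pass=P_PASS):
    """Probability that a DUT first passes on attempt n (geometric model)."""
    # n - 1 failed attempts followed by one successful attempt.
    return p_pass * (1.0 - p_pass) ** (n - 1)


for n in range(1, 6):
    print("attempt %d: %.1f %%" % (n, 100 * first_pass_probability(n)))
```

For the first three attempts this gives 60%, 24% and 9.6%, which matches the "approximately 60%", "a bit above 20%" and "around 10%" read off the measured histogram.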
The model also fits reasonably well with the number of cases where I've estimated test point resistances above 1 Ω. This might be a bit handwavy since it's hard to see how different failures would affect the results. For the needle-head test point I've seen approximately 10% of cases where the resistance was above 1 Ω. This fits well with the fact that 4 points needed to be well connected for the measurement to be accurate and with a 2.4% failure rate per connection: 1 - (1 - 0.024)^4 ≈ 9%.
Similarly for the cupped pin measurement, where I've seen 6% of measurements above 1 Ω and 2 points needed to be well connected: 1 - (1 - 0.024)^2 ≈ 5%.
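Both cross-checks use the same expression: the chance that at least one of n independent contacts is bad. A short sketch, assuming the rounded 2.4% per-pin failure rate and the contact counts given above:

```python
P_FAIL_PIN = 0.024   # per-pin failure probability from the model (rounded)


def any_contact_bad(n_contacts, p_fail_pin=P_FAIL_PIN):
    """Probability that at least one of n independent contacts fails."""
    return 1.0 - (1.0 - p_fail_pin) ** n_contacts


# The needle-head estimate depends on 4 contacts, the cupped-head one on 2.
print("needle head: %.1f %% predicted, ~10 %% observed" % (100 * any_contact_bad(4)))
print("cupped head: %.1f %% predicted, ~6 %% observed" % (100 * any_contact_bad(2)))
```

The predictions come out slightly below the observed fractions, which is what you would expect if a small share of the readings above 1 Ω have causes other than a single failed contact.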
In conclusion, my data shows that individual pogo pins seem to have an approximately 2.4% chance of not mating correctly with their test points. When they do contact correctly, they usually show a reasonably low resistance of approximately 100 mΩ between the pin and the test pad, with the majority of connections below 500 mΩ. It's not clear from the data what is causing such a high rate of unsuccessful connections. The failure rate varies from batch to batch, which suggests that at least part of it is related in some way to the production process (for example, oxide or flux residue on the test pads). On the other hand, it's also possible that the pins themselves are responsible for these failures. The bad contact might in fact be between the plunger and the pin body, not between the head and the test pad. In that case it might be worth experimenting with more expensive pogo pins that have gold-plated heads and plungers.