What's inside cheap SMA terminators

29.05.2020 14:16

I've recently ordered a bag of "YWBL-WH" 50 Ω SMA terminators off Amazon along with some other stuff. Considering they were about 3 EUR per piece and I was paying for shipment anyway, they seemed like a good deal. Unsurprisingly, they turned out to be less than stellar in practice.

50 Ω SMA terminators I bought off Amazon.

At the time I bought them, the seller's page listed the specifications below, claiming they are usable up to 6 GHz and can dissipate 2 W. There's no real brand listed and identical-looking terminators can be found from other sellers:

Specifications for 50 ohm SMA terminators.

Their DC resistances all measured very close to 51 Ω, which is good enough. However, when I tried using them for some RF measurements around 1 GHz, I got some unusual results. I thought the terminators could be to blame, even though I don't currently have equipment to measure their return loss. If I had bothered to scroll down on that Amazon page, I might have seen a review from Dominique saying that they have only 14 dB return loss at 750 MHz and are hence useless at higher frequencies.

I suspected what was going on because I've seen this before in cheap BNC terminators sold for old Ethernet networks, but I still took one apart.

Cheap SMA terminator taken apart.

Indeed, they simply have a standard through-hole axial resistor inside. The center SMA pin is soldered to one lead of the resistor, but the ground lead was just pressed against the inside of the case. According to the resistor's color bands it's rated at 51 Ω, 5% tolerance and 100 ppm/K. I suspect it's a metal film resistor based on the blue body and the low thermal coefficient (if that's what the fifth color band stands for). It might be rated for 2 W, although judging by the size it looks more like a 1/2 W part to me. In any case, this kind of resistor is useless at RF frequencies because of its helical structure, which acts like an inductor.
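
For a back-of-the-envelope illustration of how badly even a small series inductance hurts a 50 Ω termination, here is a short Python calculation. The 5 nH value is purely my guess for the combined lead and spiral inductance, not a measurement of these terminators, although it happens to land in the same ballpark as the 14 dB at 750 MHz figure from the review:

# Rough estimate of the return loss of a 51 ohm resistor with some
# series inductance from the leads and the spiral-cut film, used as a
# 50 ohm termination. The 5 nH inductance is an assumption, not a
# measured value, and the model ignores parasitic capacitance.
import math

R = 51.0     # measured DC resistance in ohms
L = 5e-9     # assumed series inductance in henries
Z0 = 50.0    # reference impedance in ohms

for f in (100e6, 750e6, 1e9, 6e9):
    Z = complex(R, 2 * math.pi * f * L)   # R in series with L
    gamma = (Z - Z0) / (Z + Z0)           # reflection coefficient
    rl_db = -20 * math.log10(abs(gamma))  # return loss in dB
    print("%4.0f MHz: %.1f dB return loss" % (f / 1e6, rl_db))

Even a few nanohenries in series ruin the match in the gigahertz range, which is consistent with the poor results I was seeing around 1 GHz.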

Again it turned out that cheaping out on lab tooling was just a waste of money.

Posted by Tomaž | Categories: Analog | Comments »

Simple method of measuring small capacitances

22.05.2020 18:17

I stumbled upon this article on Analog Devices' website while looking for something else. It looks like instructions for a student lab session. What I found interesting about it is that it describes a way of measuring small capacitances (around 1 pF) with only a sine-wave generator and an oscilloscope. I don't remember seeing this method before and it seems useful in other situations as well, so I thought I might write a short note about it. I tried it out and indeed it gives reasonable results.

Breadboard capacitance measurement schematic.

Image by Analog Devices, Inc.

I won't go into details - see the original article for a complete explanation and a step-by-step guide. In short, what you're doing is using a standard 10x oscilloscope probe and an unknown, small capacitance (C_row in the schematic above) as an AC voltage divider. From the attenuation of the divider and the estimated values of the other components it's possible to derive the unknown capacitance. Since the capacitance of the probe is usually only around 10 pF, this works reasonably well when the unknown is similarly small. The tricky part is calibrating the measurement by estimating stray capacitances of the wires and more accurately characterizing the resistance and capacitance of the probe. This is done by measuring both the gain of the divider and its 3 dB corner frequency.

Note that the article talks about using an instrument that has a network analyzer mode and can directly show a gain vs. frequency plot. This is not necessary and it's perfectly possible to do this measurement with a separate signal generator and a digital oscilloscope. For measuring capacitances of around 1 pF using a 10 pF/10 MΩ probe, a signal generator capable of producing a sine wave of about 100 kHz is sufficient. Determining when the amplitude of the signal displayed on the scope falls by 3 dB probably isn't very accurate, but for a rough measurement it seems to suffice.
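
To make the arithmetic concrete, here is a short Python sketch of the calculation as I understand it: the probe is modeled as R_p in parallel with C_p (which also lumps in any stray capacitance at the node) and the unknown capacitance C_x sits in series between the generator and the probe tip. The gain and corner frequency values below are just example numbers, not actual measurements:

# The divider formed by C_x in series with the probe (R_p parallel C_p)
# has the response:
#
#   gain(f) = j*2*pi*f*R_p*C_x / (1 + j*2*pi*f*R_p*(C_p + C_x))
#
# which gives a high-frequency plateau G_inf = C_x / (C_p + C_x) and a
# -3 dB corner at f_c = 1 / (2*pi*R_p*(C_p + C_x)). Measuring G_inf and
# f_c therefore yields both C_x and C_p.
import math

R_p = 10e6      # probe resistance to ground
G_inf = 0.095   # example plateau gain (V_out/V_in), not a real measurement
f_c = 1.4e3     # example -3 dB corner frequency in Hz, not a real measurement

C_total = 1 / (2 * math.pi * R_p * f_c)   # C_p + C_x
C_x = G_inf * C_total
C_p = C_total - C_x

print("C_p + C_x = %.2f pF" % (C_total / 1e-12))
print("C_x       = %.2f pF" % (C_x / 1e-12))
print("C_p       = %.2f pF" % (C_p / 1e-12))

With these example numbers the unknown comes out at around 1 pF and the probe at around 10 pF. A 100 kHz test frequency is far above the roughly 1.4 kHz corner, so the plateau gain is easy to read off there.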

The measurement depends on the probe having a DC resistance to ground as well as capacitance. I found that on my TDS 2002B scope the channel needs to be set to DC coupling, otherwise there is no DC path to ground from the probe tip. It seems obvious in retrospect, but it did confuse me for a moment why I wasn't getting good results.

I also found that my measured signal was being overwhelmed by 50 Hz mains noise. The solution was to use external synchronization on the oscilloscope and then use the averaging function. This cancels out the noise and gives much better measurements of the signal amplitude at the frequency the signal generator is set to. You just need to be careful with the attenuator setting so that the combined noise and signal amplitude still falls inside the scope's ADC range.
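
The averaging works because the mains hum is not phase-locked to the generator's sync output, so over many triggered acquisitions it averages toward zero while the synchronous test signal is preserved; averaging suppresses uncorrelated components by roughly the square root of the number of captures. A quick numpy simulation of the idea, with made-up amplitudes:

# Illustration of why averaging synchronized acquisitions suppresses
# mains hum: the 50 Hz component has a random phase with respect to the
# trigger, so it averages toward zero, while the test signal that is
# locked to the trigger stays. All amplitudes here are made up.
import numpy as np

rng = np.random.default_rng(0)

fs = 1e6                               # simulated sample rate in Hz
t = np.arange(int(20e-3 * fs)) / fs    # one 20 ms acquisition window
f_sig = 100e3                          # signal generator frequency

captures = []
for _ in range(128):                   # 128 triggered captures
    sig = 0.01 * np.sin(2 * np.pi * f_sig * t)
    hum = 0.10 * np.sin(2 * np.pi * 50 * t + rng.uniform(0, 2 * np.pi))
    captures.append(sig + hum)

avg = np.mean(captures, axis=0)

print("single capture peak-to-peak: %.3f" % np.ptp(captures[0]))
print("128x average peak-to-peak:   %.3f" % np.ptp(avg))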

Posted by Tomaž | Categories: Analog | Comments »

Another SD card postmortem

16.05.2020 11:28

I was recently restoring a Raspberry Pi at work that was running a Raspbian system off a SanDisk Ultra 8 GB micro SD card. It was powered on continuously and managed to survive almost exactly 6 months since I last set it up. I don't know when this SD card first started showing problems, but by the time the problem became apparent I couldn't log in and Linux didn't even boot anymore after a power cycle.

SanDisk Ultra 8 GB micro SD card.

I had a working backup of the system, however I was curious how well ddrescue would be able to recover the contents of the failed card. To my surprise, it did quite well, restoring 99.9% of the data after about 30 hours of run time. I only ran the copying and trimming phases (--no-scrape). Approximately 8 MB out of 8 GB of data remained unrecovered.

This was enough for fsck to repair the filesystem to a good enough state that it could be mounted. Another interesting thing in the recovered data was the write statistic that is kept in the ext4 superblock. The system only had one ext4 partition on the SD card:

$ dumpe2fs /dev/mapper/loop0p2 | grep Lifetime
dumpe2fs 1.43.4 (31-Jan-2017)
Lifetime writes:          823 GB

On one hand, 823 GB of writes after 6 months was more than I was expecting. The system was set up to avoid writes to the SD card as much as possible and had a network mount where most of the heavy work was supposed to be done. It did have a Munin master running though, and I suspect that's where most of these writes came from.

On the other hand, 823 GB on an 8 GB card is only about 100 write cycles per cell (823/8 ≈ 103), if the card is any good at doing wear leveling. That's awfully low.

In addition to the raw data file, ddrescue also creates a map file listing which parts of the device could not be read. Very likely the controller in the SD card itself is doing a lot of remapping, hence a logical address visible from Linux has little to do with where the bits are physically stored in silicon. So regardless of what the map says, it's impossible to tell whether the errors are related to one failed physical area on a flash chip, or if they are individual bit errors spread out over the entire device. Still, I think it's interesting to look at this visualization:

Visualization of the ddrescue map file.

This image shows the distribution of unreadable sectors reported by ddrescue over the address space of the SD card. The address space has been sliced into 4 MB chunks (8192 blocks of 512 bytes). These slices are stacked horizontally, hence address 0 is on the bottom left and increases up and right in a saw-tooth fashion. The highest address is on the top right. Color shows the percentage of unreadable blocks in that region.
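
Roughly, the per-slice percentages can be computed from the ddrescue map file like this. This is a simplified sketch, not the exact script that produced the image: the map file consists of comment lines, a current-position line and then pos/size/status triplets, where '+' marks successfully read areas and everything else is counted as unrecovered here:

# Compute the percentage of unreadable data in each 4 MB slice of the
# card's address space from a ddrescue map file. Simplified sketch,
# not the exact script used for the image above.
SLICE = 4 * 1024 * 1024          # slice size in bytes
DEVICE_SIZE = 7948206080         # card capacity in bytes

bad = [0] * ((DEVICE_SIZE + SLICE - 1) // SLICE)

with open("sdcard.map") as f:
    lines = [l for l in f if not l.startswith("#")]

# The first non-comment line gives the current position and status of
# the rescue run; the rest are "pos size status" triplets in hex.
for line in lines[1:]:
    pos, size, status = line.split()[:3]
    pos, size = int(pos, 0), int(size, 0)
    if status == "+":
        continue
    end = pos + size
    while pos < end:                 # an extent can span several slices
        i = pos // SLICE
        chunk = min(end, (i + 1) * SLICE) - pos
        bad[i] += chunk
        pos += chunk

for i, b in enumerate(bad):
    if b:
        print("slice %4d at %5.0f MB: %.2f%% unreadable"
              % (i, i * SLICE / 1e6, 100.0 * b / SLICE))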

You can see that small errors are more or less randomly distributed over the entire address space. Keep in mind that, summed up, unrecoverable blocks only cover 0.10% of the space, so this image exaggerates them. There are a few hot spots though, and one 4 MB slice in particular, at around 4.5 GB, contains a lot more errors than other regions. Some horizontal patterns can also be seen - the upper half of the image appears more error-free than the bottom part. I've chosen 4 MB slices exactly because of that. While the internal memory organization is a complete black box, it does appear that 4 MB blocks play some role in it.

Just for comparison, here is the same data plotted using a space-filling curve. The black area on the top-left is the part of the graph not covered by the SD card address space (the curve covers 2^24 = 16777216 blocks of 512 bytes while the card only stores 15523840 blocks or 7948206080 bytes). This visualization better shows grouping of errors, but hides the fact that 4 MB chunks seem to play some role:

Visualization of the ddrescue map file using a Hilbert curve.
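
Mapping a linear block address onto the curve is straightforward. Below is the standard iterative Hilbert distance-to-coordinates conversion in Python; with 2^24 blocks the image is a 4096 by 4096 grid. Again just a sketch, not the exact plotting script:

def hilbert_d2xy(n, d):
    # Convert distance d along a Hilbert curve into (x, y) coordinates
    # on an n-by-n grid, n being a power of two. Standard iterative
    # algorithm; for 2**24 blocks, n = 4096.
    x = y = 0
    s = 1
    t = d
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:              # rotate the quadrant when needed
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

# Example: where the first few 512-byte blocks land on a 4096x4096 image.
for d in range(4):
    print(d, hilbert_d2xy(4096, d))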

I also quickly looked into whether failures could be predicted by something like SMART. Even though it appears that some cards do support it, none of the cards I tried produced any useful data with smartctl. Interestingly, plugging the SanDisk Ultra into an external USB-connected reader on a laptop does make smartctl report that the device has SMART capability:

$ smartctl -d scsi -a /dev/sdb
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.9.0-12-amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               Generic
Product:              STORAGE DEVICE
Revision:             1206
Compliance:           SPC-4
User Capacity:        7 948 206 080 bytes [7,94 GB]
Logical block size:   512 bytes
scsiModePageOffset: response length too short, resp_len=4 offset=4 bd_len=0
Serial number:        000000001206
Device type:          disk
scsiModePageOffset: response length too short, resp_len=4 offset=4 bd_len=0
Local Time is:        Thu May 14 16:36:47 2020 CEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature:     0 C
Drive Trip Temperature:        0 C

Error Counter logging not supported

scsiModePageOffset: response length too short, resp_len=4 offset=4 bd_len=0
Device does not support Self Test logging

However, I suspect this response comes from the reader, not the SD card. Multiple cards I tried produced the same 1206 serial number. Both a new and a failed card showed the "Health Status: OK" line, so that's misleading as well.

This is the second time I've replaced the SD card in this Raspberry Pi. The first time it lasted around a year and a half. It further justifies my opinion that SD cards just aren't suitable for unattended systems or those running continuously. In fact, I suggest avoiding them if at all possible. For example, newer Raspberry Pis support booting from USB-attached storage.

Posted by Tomaž | Categories: Digital | Comments »

On missing IPv6 router advertisements

03.05.2020 16:58

I've been having problems with Internet connectivity for the past week or so. Connections would randomly time out and some things would work very slowly or not at all. In the end it turned out to be a problem with IPv6 routing. It seems my Internet service provider is having problems with sending out periodic Router Advertisements, so the default route on my router often times out. I've temporarily worked around it by manually adding a route.

I'm running a simple, dual-stack network setup. There's a router serving a LAN. The router is connected over an optical link to the ISP, which is doing Prefix Delegation. The problems appeared intermittently. A lot of software seems to gracefully fall back onto IPv4 if IPv6 stops working, but there's usually a more or less annoying delay before it does that. On the other hand, some programs don't, and seem to assume that there's global connectivity as long as the host has a globally-routable IPv6 address.

The most apparent and reproducible symptom was that IPv6 pings to hosts outside of the LAN often weren't working. At the same time, hosts on the LAN had valid, globally-routable IPv6 addresses, and pings inside the LAN worked fine:

$ ping -6 -n3 host-on-the-internet
connect: Network is unreachable
$ ping -6 -n3 host-on-the-LAN
PING ...(... (2a01:...)) 56 data bytes
64 bytes from ... (2a01:...): icmp_seq=1 ttl=64 time=0.404 ms
64 bytes from ... (2a01:...): icmp_seq=2 ttl=64 time=0.353 ms
64 bytes from ... (2a01:...): icmp_seq=3 ttl=64 time=0.355 ms

--- ... ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2026ms
rtt min/avg/max/mdev = 0.353/0.370/0.404/0.032 ms

Rebooting my router seemed to help for a while, but then the problem would reappear. After some debugging I found out that the immediate cause of the problems was that the default route on my router would disappear approximately 30 minutes after it had been rebooted. It would then randomly reappear and disappear a few times a day.

On my router, the following command would return nothing most of the time:

$ ip -6 route | grep default

But immediately after a reboot, or if I got lucky, I would get a route. I'm not sure why there are two nearly identical entries here - the only difference is the from field:

$ ip -6 route | grep default
default from 2a01::... via fe80::... dev eth0 proto static metric 512 pref medium
default from 2a01::... via fe80::... dev eth0 proto static metric 512 pref medium

The following graph shows the number of entries returned by the command above over time. You can see that for most of the day the router didn't have a default route:

Number of valid routes obtained from RA over time.
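
Data like this is simple to collect by periodically logging the output of that command. A minimal sketch of such a logging loop in Python, just to illustrate the idea:

# Periodically log the number of default IPv6 routes, to see when the
# route learned from an RA expires. Rough sketch for collecting data
# like the graph above.
import subprocess
import time

while True:
    out = subprocess.run(["ip", "-6", "route", "show", "default"],
                         capture_output=True, text=True).stdout
    n = len(out.splitlines())
    print("%s\t%d" % (time.strftime("%Y-%m-%d %H:%M:%S"), n), flush=True)
    time.sleep(60)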

The thing that was confusing me the most was the fact that the mechanism for getting the default IPv6 route is distinct from the way the prefix delegation is done. This means that every device on the LAN can get a perfectly valid, globally-routable IPv6 address, but at the same time there can be no configured route for packets going outside of the LAN.

The route is automatically configured via Router Advertisement (RA) packets, which are part of the Neighbor Discovery Protocol. When my router first connects to the ISP, it sends out a Router Solicitation (RS). In response to the RS, the ISP sends back an RA. The RA contains the link-local address to which traffic intended for the Internet should be directed, as well as a Router Lifetime, which sets the time interval for which this route is valid. This lifetime appears to be 30 minutes in my case, which is why rebooting the router seemed to fix the problems for a short while.

The trick is that the ISP should later periodically re-send the RA on its own, refreshing the information and the lifetime and hence pushing back the deadline at which the route times out. Normally, a new RA should arrive well before the lifetime of the previous one runs out. However, in my case it seemed that for some reason the ISP suddenly started sending out RAs only sporadically. Hence the route would time out most of the time, and my router wouldn't know where to send packets going outside of my LAN.

RA packets arriving at the router can be monitored with tcpdump. The filter below matches ICMPv6 packets whose type field - the first byte after the fixed 40-byte IPv6 header - is 134, the type assigned to Router Advertisements:

$ tcpdump -v -n -i eth0 "icmp6 && ip6[40] == 134"

This should show packets like the following arriving at intervals that are much shorter than the advertised router lifetime. On a different, correctly working network, I've seen packets arriving roughly once every 10 minutes with a lifetime of 30 minutes:

18:52:01.080280 IP6 (flowlabel 0xb42b9, hlim 255, next-header ICMPv6 (58) payload length: 176)
fe80::... > ff02::1: [icmp6 sum ok] ICMP6, router advertisement, length 176
	hop limit 64, Flags [managed, other stateful], pref medium, router lifetime 1800s, reachable time 0ms, retrans timer 0ms
	...
19:00:51.599538 IP6 (flowlabel 0xb42b9, hlim 255, next-header ICMPv6 (58) payload length: 176) 
fe80::... > ff02::1: [icmp6 sum ok] ICMP6, router advertisement, length 176
	hop limit 64, Flags [managed, other stateful], pref medium, router lifetime 1800s, reachable time 0ms, retrans timer 0ms
	...

However, in my case this wasn't happening. Similarly to what the graph above shows, these packets only arrived sporadically. As far as I know, this is an indication that something is wrong on the ISP side: sending an RA in response to an RS seems to work, but the periodic RA transmission doesn't. Strictly speaking, there's nothing that can be done to fix this on my end. My understanding of RFC 4861 is that a downstream host should only send out an RS once, after connecting to the link:

Once the host sends a Router Solicitation, and receives a valid Router Advertisement with a non-zero Router Lifetime, the host MUST desist from sending additional solicitations on that interface, until the next time one of the above events occurs.

Indeed, as far as I can see, Linux doesn't have any provision for re-sending an RS in case all routes from previously received RAs time out. This answer argues that it should, but I can find no references that would confirm this. On the other hand, this answer agrees with me that an RS should only be sent when connecting to a link. On that note, I've also found a discussion that mentions blocking multicast packets as a cause of similar problems. I don't believe that is the case here.

In the end I used an ugly workaround to keep things working. I manually added a permanent route that is identical to what is sporadically advertised in the RA packets:

$ ip -6 route add default via fe80::... dev eth0

Compared to entries originating from RAs, this manual entry in the routing table won't time out - at least not until my router gets rebooted. It also doesn't hurt anything if additional, identical routes occasionally get added via RA. Of course, it still goes completely against the IPv6 neighbor discovery mechanism. If anything changes on the ISP side, for example if the link-local address of their router changes, the entry won't get updated and the network will break again. However, it does seem to fix my issues at the moment. The fact that it's working also seems to confirm my suspicion that only the RA transmissions are broken on the ISP side, and that the actual routing on their end works correctly. I've reported my findings to the ISP and hopefully things will get fixed on their end, but in the meantime, this will have to do.

Posted by Tomaž | Categories: Code | Comments »