Monitoring HomePlug AV devices

23.05.2018 18:51

Some time ago I wanted to monitor the performance of a network of Devolo dLAN devices. These are power-line Ethernet adapters. Each device looks like a power brick with a standard Ethernet RJ45 port. You can plug several of these into wall sockets around a building and, in theory, together they act as an Ethernet bridge, linking all their ports as if they were connected to a single network. The power-line network in question seemed to be having intermittent problems, but without some kind of a log it was hard to pinpoint exactly what the problem was.

I have very little experience with power-line networks, and some quick web searches yielded conflicting information about how these things work and what protocols are at work behind the scenes. Purely from the user's perspective, the experience seems to be similar to wireless LANs. While individual devices have flashy numbers written on them, such as 500 Mbps, these are just theoretical "up to" throughputs. In practice, the bandwidth of individual links in the network seems to be dynamically adjusted based on signal quality and is commonly quite a bit lower than advertised.

Devolo Cockpit screenshot.

Image by devolo.com

Devolo provides an application called Cockpit that allows you to configure the devices and visualize the power-line network. The feature I was most interested in was the real-time display of the physical layer bitrate for each individual link in the network. While Cockpit is available for Linux, it is a user-friendly point-and-click graphical application, and the chances were small that I would be able to integrate it into some kind of an automated monitoring process. The prospect of decoding the underlying protocol seemed easier. So I did a packet capture with Wireshark while Cockpit was fetching the bitrate data:

HomePlug AV protocol decoded in Wireshark.

Wireshark immediately showed the captured packets as part of the HomePlug AV protocol and provided a nice decode. This finally gave me a good keyword I could base my further web searches on, which revealed a helpful white paper with some interesting technical background. The HomePlug AV physical layer apparently uses frequencies in the range of 2 to 28 MHz, with OFDM and an adaptive number of bits per modulation symbol. Network management is centralized, using a coordinator and a mix of CSMA/CA and TDMA access.

More importantly, the fact that the Wireshark decode showed the bitrate information in plain text gave me confidence that replicating the process of querying the network would be relatively straightforward. Note how the 113 Mbit/sec in the decode directly corresponds to hex 0x71 in the raw packet contents. It appeared that only two packets were involved, a Network Info Request and a Network Info Confirm:
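
To make that correspondence concrete, here is a minimal Python sketch of how such a request frame might be assembled and how a rate byte is read. The EtherType 0x88E1 and the message type 0xA038 are what Wireshark and Faifa report. The rest of the header layout (one version byte set to zero, a little-endian message type and the three-byte vendor OUI 00:B0:52) is an assumption that should be checked against your own capture, and the function names are made up for illustration.

```python
import struct

ETHERTYPE_HOMEPLUG_AV = 0x88E1         # EtherType used by HomePlug AV frames
MMTYPE_NETWORK_INFO_REQ = 0xA038       # Network Info Request (vendor-specific)
VENDOR_OUI = bytes.fromhex("00b052")   # assumption: vendor OUI after the type


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    raw = bytes.fromhex(mac.replace(":", ""))
    if len(raw) != 6:
        raise ValueError("not a MAC address: %r" % (mac,))
    return raw


def build_network_info_request(dst_mac, src_mac):
    """Assemble a Network Info Request frame (assumed layout, see text)."""
    header = mac_to_bytes(dst_mac) + mac_to_bytes(src_mac)
    header += struct.pack("!H", ETHERTYPE_HOMEPLUG_AV)
    # Assumed management header: version byte 0, then little-endian type.
    body = struct.pack("<BH", 0x00, MMTYPE_NETWORK_INFO_REQ) + VENDOR_OUI
    # Pad with zeros to the 60-byte Ethernet minimum (FCS not included).
    return (header + body).ljust(60, b"\x00")


def rate_byte_to_mbps(value):
    """The rate is a single unsigned byte holding the bitrate in Mbit/s."""
    if not 0 <= value <= 0xFF:
        raise ValueError("rate must fit in one byte")
    return value


print(rate_byte_to_mbps(0x71))  # the byte from the decode above
```

Actually sending the frame needs a raw socket and root privileges, which is left out here on purpose.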

HomePlug AV Network Info Confirmation packet decode.

However, before diving directly into writing code from scratch, I came across the Faifa command-line tool on GitHub. The repository seems to be a source code dump from a now-defunct dev.open-plc.org web site. There is very little in terms of documentation or clues to its provenance. The last commit was in 2016. However, a browse through its source code revealed that it is capable of sending the 0xa038 Network Info Request packet and receiving and decoding the corresponding 0xa039 Network Info Confirm reply. This was exactly what I was looking for.

Some tweaking and a compile later I was able to get the bitrate info from my terminal. Here I am querying one device in the power-line network (its Ethernet address is in the -a parameter). The queried device returns the current network coordinator and a list of stations it is currently connected to, together with the receive and transmit bitrates for each of those connections:

# faifa -i eth4 -a xx:xx:xx:xx:xx:xx -t a038
Faifa for HomePlug AV (GIT revision master-5563f5d)

Started receive thread
Frame: Network Info Request (Vendor-Specific) (0xA038)

Dump:
Frame: Network Info Confirm (Vendor-Specific) (A039), HomePlug-AV Version: 1.0
Network ID (NID): xx xx xx xx xx xx xx
Short Network ID (SNID): 0x05
STA TEI: 0x24
STA Role: Station
CCo MAC: 
	xx:xx:xx:xx:xx:xx
CCo TEI: 0xc2
Stations: 1
Station MAC       TEI  Bridge MAC        TX   RX  
----------------- ---- ----------------- ---- ----
xx:xx:xx:xx:xx:xx 0xc2 xx:xx:xx:xx:xx:xx 0x54 0x2b
Closing receive thread
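
For feeding this into a monitoring system, the station table needs to be turned into numbers. Below is a minimal Python sketch that parses the table rows from the text output shown above. It assumes the TX and RX columns use the same one-byte Mbit/s encoding as the Wireshark decode, and the function name parse_stations is hypothetical.

```python
import re

# One row of the station table: MAC, TEI, bridge MAC, TX, RX.
# MAC octets are matched loosely so anonymized "xx" samples parse too.
MAC = r"[0-9A-Za-z]{2}(?::[0-9A-Za-z]{2}){5}"
HEX = r"0x[0-9a-fA-F]+"
ROW_RE = re.compile(
    r"^(?P<mac>%s)\s+(?P<tei>%s)\s+(?P<bridge>%s)\s+(?P<tx>%s)\s+(?P<rx>%s)$"
    % (MAC, HEX, MAC, HEX, HEX)
)


def parse_stations(output):
    """Return one dict per station row found in Faifa's text output."""
    stations = []
    for line in output.splitlines():
        match = ROW_RE.match(line.strip())
        if match is None:
            continue  # header, separator or unrelated line
        stations.append({
            "mac": match.group("mac"),
            "tei": int(match.group("tei"), 16),
            "bridge_mac": match.group("bridge"),
            "tx_mbps": int(match.group("tx"), 16),
            "rx_mbps": int(match.group("rx"), 16),
        })
    return stations


SAMPLE = """\
Stations: 1
Station MAC       TEI  Bridge MAC        TX   RX  
----------------- ---- ----------------- ---- ----
xx:xx:xx:xx:xx:xx 0xc2 xx:xx:xx:xx:xx:xx 0x54 0x2b
Closing receive thread
"""

for station in parse_stations(SAMPLE):
    print(station["mac"], station["tx_mbps"], station["rx_mbps"])
```

With the sample above, the single station comes out with a transmit rate of 84 and a receive rate of 43.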

The original tool had some annoying problems that I needed to work around before deploying it to my monitoring system. Most of all, it operated by sending the query with the Ethernet broadcast address as the source. It then put the local network interface into promiscuous mode to listen for broadcast replies. This seemed like bad practice and created problems for me, the least of which was log spam from repeated kernel warnings about the interface entering and leaving promiscuous mode. It's possible that the use of broadcasts was a workaround for a hardware limitation on some devices, but the devices I tested (dLAN 200 and dLAN 550) seem to reply just fine to queries from non-broadcast addresses.

I also fixed a race condition that was in the original tool due to the way it received the replies. If multiple queries were running on the same network simultaneously, replies from different devices sometimes got mixed up. Finally, I fixed some rough edges in the libpcap usage that prevented the multi-threaded Faifa process from exiting cleanly once a reply was received. I added a -t command-line option for sending and receiving a single packet.
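
The sketch below is not the code from the fork. It only illustrates, in Python, one general way to keep replies apart when several queries are in flight: accept a frame only if its Ethernet source address is the device that was queried and its message type is 0xA039. The offsets assume a one-byte version field between the EtherType and the little-endian message type, and the function name is hypothetical.

```python
ETHERTYPE_HOMEPLUG_AV = 0x88E1
MMTYPE_NETWORK_INFO_CNF = 0xA039  # Network Info Confirm


def is_reply_from(frame, queried_mac):
    """Return True if `frame` is a Network Info Confirm sent by `queried_mac`.

    `frame` is a raw Ethernet frame and `queried_mac` is six raw bytes.
    """
    if len(frame) < 17:
        return False  # too short to hold the headers we look at
    source = frame[6:12]
    ethertype = int.from_bytes(frame[12:14], "big")
    # Assumption: byte 14 is a version field, bytes 15-16 the message type.
    mmtype = int.from_bytes(frame[15:17], "little")
    return (
        source == queried_mac
        and ethertype == ETHERTYPE_HOMEPLUG_AV
        and mmtype == MMTYPE_NETWORK_INFO_CNF
    )
```

A receive loop would simply skip every captured frame for which this returns False and stop at the first one that passes.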

As usual, the improved Faifa tool is available in my fork on GitHub:

$ git clone https://github.com/avian2/faifa.git

To conclude, here is an example of bitrate data I recorded using this approach. It shows transmit bitrates reported by one device in the network to two other devices (here numbered "station 1" and "station 2"). The data was recorded over the course of 9 days and the network was operating normally during this time:

Recorded PHY bitrate for two stations.

Even this graph shows some interesting things. Some devices (like the "station 1" here) seem to enter a power saving mode. Such devices don't appear in the network info reports, which is why data is missing for some periods of time. Even out of power saving mode, devices don't seem to update their reported bitrates if there is no data being transferred on that specific link. I think this is why the "station 2" here seems to have long periods where the reported bitrate remains constant.
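
Because stations can drop out of the reports, a log format has to cope with missing values. Here is a minimal Python sketch of one way to do that: each sample becomes a CSV row, and a station that was absent from a report gets an empty cell instead of a stale or made-up number. The station labels and function names are hypothetical and the values are example data, not my recordings.

```python
import csv
import io


def format_sample(timestamp, known_stations, tx_rates):
    """Build one row; stations missing from `tx_rates` get an empty cell."""
    return [timestamp] + [tx_rates.get(name, "") for name in known_stations]


def write_log(samples, known_stations, out):
    """Write a header and one row per (timestamp, {station: rate}) sample."""
    writer = csv.writer(out)
    writer.writerow(["timestamp"] + list(known_stations))
    for timestamp, tx_rates in samples:
        writer.writerow(format_sample(timestamp, known_stations, tx_rates))


stations = ["station 1", "station 2"]
samples = [
    (0, {"station 1": 84, "station 2": 43}),
    (60, {"station 2": 43}),  # station 1 absent, e.g. in power saving mode
]
buf = io.StringIO()
write_log(samples, stations, buf)
print(buf.getvalue())
```

Leaving the cell empty keeps the gap visible when plotting, which is how periods of missing data show up in a graph like the one above.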

Posted by Tomaž | Categories: Code
