The case of disappearing PulseAudio daemon

03.11.2018 19:39

Recently I was debugging an unusual problem with the PulseAudio daemon. PulseAudio handles high-level details of audio recording, playback and processing. These days it is used by default in many desktop Linux distributions. The problem I was investigating was on an embedded device using Linux kernel 4.9. I relatively quickly found a way to reproduce it. However, finding the actual cause was surprisingly difficult and led me into the kernel source and into learning about the real-time scheduler. I thought I might share this story in case someone else finds themselves in a similar situation.

The basic issue was that the PulseAudio daemon occasionally restarted. This reset some run-time configuration that should not be lost while the device was operating and also broke connections to clients that were talking to the daemon. The restarts seemed to be connected to the daemon or the CPU being under load. For example, the daemon would restart if many audio files were played through it simultaneously. I could reproduce the problem on PulseAudio 12.0 using a shell script similar to the following:

n=0
while [ $n -lt 100 ]; do
	pacmd play-file foo.wav 0
	n=$(($n+1))
done

This triggers 100 playbacks of the foo.wav file at almost the same instant and would reliably make the daemon restart on the device. However, I was sometimes seeing restarts with fewer than ten simultaneous plays. A similar script with 1000 plays would sometimes also cause a restart on my Intel-based laptop using the same PulseAudio version. This made it easier to investigate the problem, since I didn't need to do the debugging remotely on the device.

syslog held some clues about what was happening. The PulseAudio process was apparently being sent the SIGKILL signal. systemd detected that and restarted the service shortly afterwards:

systemd[1]: pulseaudio.service: Main process exited, code=killed, status=9/KILL
systemd[1]: pulseaudio.service: Unit entered failed state.
systemd[1]: pulseaudio.service: Failed with result 'signal'.

However, there were no other clues as to what sent the SIGKILL or why. The kernel log from dmesg had nothing whatsoever related to this. Increasing logging verbosity in systemd and PulseAudio showed nothing relevant. Attaching a debugger and strace to the PulseAudio process showed that the signal was received at seemingly random points in the execution of the daemon. This suggested that the problem was not directly related to any specific line of code, but otherwise led me nowhere.

When I searched the web, all suggestions seemed to point to the kernel killing the process due to an out-of-memory condition. The kernel being the source of the signal seemed reasonable, however an OOM condition is usually clearly logged. Another guess was that systemd itself was killing the process, after I learned about its resource control feature. This turned out to be a dead end as well.

The first really useful clue came when Gašper gave me a crash course on using perf. After setting up debugging symbols for the kernel and the PulseAudio binary, I used the following:

$ perf_4.9 record -g -a -e 'signal:*'
(trigger the restart here)
$ perf_4.9 script

This printed out a nice backtrace of the code that generated the problematic SIGKILL signal (among the noise about other, uninteresting signals being sent between processes on the system):

alsa-sink-ALC32 12510 [001] 24535.773432: signal:signal_generate: sig=9 errno=0 code=128 comm=alsa-sink-ALC32 pid=12510 grp=1 res=0
            7fffbae8982d __send_signal+0x80004520223d ([kernel.kallsyms])
            7fffbae8982d __send_signal+0x80004520223d ([kernel.kallsyms])
            7fffbaef0652 run_posix_cpu_timers+0x800045202522 ([kernel.kallsyms])
            7fffbaea7aa9 scheduler_tick+0x800045202079 ([kernel.kallsyms])
            7fffbaefaa80 tick_sched_timer+0x800045202000 ([kernel.kallsyms])
            7fffbaefa480 tick_sched_handle.isra.12+0x800045202020 ([kernel.kallsyms])
            7fffbaefaab8 tick_sched_timer+0x800045202038 ([kernel.kallsyms])
            7fffbaeebbfe __hrtimer_run_queues+0x8000452020de ([kernel.kallsyms])
            7fffbaeec2dc hrtimer_interrupt+0x80004520209c ([kernel.kallsyms])
            7fffbb41b1c7 smp_apic_timer_interrupt+0x800045202047 ([kernel.kallsyms])
            7fffbb419a66 __irqentry_text_start+0x800045202096 ([kernel.kallsyms])

alsa-sink-ALC32 is the thread in the PulseAudio daemon that is handling the interface between the daemon and the ALSA driver for the audio hardware. The stack trace shows that the signal was generated in the context of that thread, however the originating code was called from a timer interrupt, not a syscall. Specifically, the run_posix_cpu_timers function seemed to be the culprit. This was consistent with the random debugger results I saw before, since interrupts were not in sync with the code running in the thread.

Some digging later I found the following code in kernel/time/posix-cpu-timers.c, which is reached from run_posix_cpu_timers via some static functions. The intermediate static functions were probably inlined by the compiler and hence don't appear in the perf stack trace:

if (hard != RLIM_INFINITY &&
    tsk->rt.timeout > DIV_ROUND_UP(hard, USEC_PER_SEC/HZ)) {
	/*
	 * At the hard limit, we just die.
	 * No need to calculate anything else now.
	 */
	__group_send_sig_info(SIGKILL, SEND_SIG_PRIV, tsk);
	return;
}

Now things started to make sense. The Linux kernel implements a limit on how much CPU time a thread with real-time scheduling priority can use before it must cooperatively yield the CPU to other threads and processes (via a blocking syscall, for instance). If a thread hits this time limit, it is silently sent the SIGKILL signal by the kernel. Kernel resource limits are documented in the setrlimit man page (the relevant limit here is RLIMIT_RTTIME). The PulseAudio daemon was setting the ALSA thread to real-time priority and that thread was getting killed under load.

Using real-time scheduling seems to be the default in PulseAudio 12.0 and the time limit set for the process is 200 ms. The limit for a running daemon can be inspected from the shell using prlimit:

$ prlimit --pid PID | grep RTTIME
RTTIME     timeout for real-time tasks           200000    200000 microsecs
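
The limit of an already running daemon can also be changed with prlimit, which is handy for quick experiments, although the change is lost when the daemon restarts. The syntax below is my assumption for a reasonably recent util-linux that supports the --rttime option:

$ sudo prlimit --pid PID --rttime=400000:400000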

Details of real-time scheduling can be adjusted in /etc/pulse/daemon.conf, for instance with:

realtime-scheduling = yes
rlimit-rttime = 400000

Just to make sure that an RTTIME overrun indeed produces these symptoms, I wrote the following C program that intentionally triggers it. Running the program confirmed that this kind of SIGKILL isn't logged anywhere either and that it produces a similar perf backtrace:

#include <stdio.h>
#include <sched.h>
#include <sys/resource.h>

int main(int argc, char** argv)
{
	/* Switch the calling process to the round-robin real-time policy. */
	struct sched_param sched_param;
	if (sched_getparam(0, &sched_param) < 0) {
		printf("sched_getparam() failed\n");
		return 1;
	}
	sched_param.sched_priority = sched_get_priority_max(SCHED_RR);
	if (sched_setscheduler(0, SCHED_RR, &sched_param)) {
		printf("sched_setscheduler() failed\n");
		return 1;
	}
	printf("Scheduler set to Round Robin with priority %d...\n",
			sched_param.sched_priority);
	fflush(stdout);

	/* Limit the CPU time the process may use without blocking to 500 ms. */
	struct rlimit rlimit;
	rlimit.rlim_cur = 500000;
	rlimit.rlim_max = rlimit.rlim_cur;

	if (setrlimit(RLIMIT_RTTIME, &rlimit)) {
		printf("setrlimit() failed\n");
		return 1;
	}
	printf("RTTIME limit set to %ld us...\n",
			(long) rlimit.rlim_max);

	/* Busy-loop without any blocking syscalls until the kernel kills us. */
	printf("Hogging the CPU...\n");
	while(1);
}

It would be really nice if the kernel logged the reason for this signal somewhere, like it does for OOM kills. It might be that in this particular case the developers wanted to avoid calling possibly expensive logging functions from interrupt context. On the other hand, it seems that by default the kernel doesn't log any process kills due to resource limit overruns at all. Failed syscalls can be logged using auditd, but that wouldn't help here as no syscalls actually failed.

As far as this particular PulseAudio application was concerned, there weren't really any perfect solutions. This didn't look like a bug in PulseAudio but rather a fact of life in a constrained environment with real-time tasks. The PulseAudio man page discusses some trade-offs of real-time scheduling (which is nice in hindsight, but you first have to know where to look). In my specific case, there were more or less only three ways to proceed:

  1. Disable RTTIME limit and accept that PulseAudio might freeze other processes on the device for an arbitrary amount of time,
  2. disable real-time scheduling and accept occasional skips in the audio due to other processes taking the CPU from PulseAudio for too long, or
  3. accept the fact that PulseAudio will restart occasionally and make other software on the device recover from this case as best as possible.

After considering the implications for the functionality of the device, I went with the last option in the end. I also slightly increased the default RTTIME limit so that restarts would be less common, while still keeping an acceptable maximum response time for other processes.

Posted by Tomaž | Categories: Code | Comments »

Getting ALSA sound levels from a command-line

01.09.2018 14:33

Sometimes I need to get an idea of the signal levels coming from an ALSA capture device, like the line-in on a sound card, and I only have an ssh session available. For example, I've recently been exploring a case of radio-frequency interference causing audible noise in an audio circuit of an embedded device. It's straightforward to load up raw samples into a Jupyter notebook or Audacity and run some quick analysis (or simply listen to the recording on headphones). But what I've really been missing is just a simple command-line tool that would show me some RMS numbers. This way I wouldn't need to transfer potentially large wav files around quite so often.

I felt like a command-line VU meter was something that should have already existed, but finding anything like that on Google turned out to be elusive. Over time I've ended up with a bunch of custom half-finished Python scripts. However Python is slow, especially on low-powered ARM devices, so I was considering writing a more optimized version in C. Luckily I've recently come across this recipe for sox, which does exactly what I want. Sox is easily apt-gettable on all Debian flavors I commonly care about and even on slow CPUs the following doesn't take noticeably longer than it takes to record the data:

$ sox -q -t alsa -d -n stats trim 0 5
             Overall     Left      Right
DC offset  -0.000059 -0.000059 -0.000046
Min level  -0.333710 -0.268066 -0.333710
Max level   0.273834  0.271820  0.273834
Pk lev dB      -9.53    -11.31     -9.53
RMS lev dB    -25.87    -26.02    -25.73
RMS Pk dB     -20.77    -20.77    -21.09
RMS Tr dB     -32.44    -32.28    -32.44
Crest factor       -      5.44      6.45
Flat factor     0.00      0.00      0.00
Pk count           2         2         2
Bit-depth      15/16     15/16     15/16
Num samples     242k
Length s       5.035
Scale max   1.000000
Window s       0.050

This captures 5 seconds of audio (trim 0 5) from the default ALSA device (-t alsa -d), runs it through the stats filter and discards the samples without saving them to a file (-n). The stats filter calculates some useful statistics and dumps them to standard error. A different capture device can be selected through the AUDIODEV environment variable.
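For instance, to capture 10 seconds from a specific capture device with a longer 100 ms analysis window, something like the following should work (hw:1,0 is just a placeholder for the device name on your system):

$ AUDIODEV=hw:1,0 sox -q -t alsa -d -n stats -w 0.1 trim 0 10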

The displayed values are pretty self-explanatory: min and max levels show the extremes in the observed samples, scaled to ±1 (so they are independent of the number of bits in the ADC samples). Root-mean-square (RMS) statistics are scaled so that 0 dB is the full scale of the ADC. In addition to the overall mean RMS level you also get peak and trough values measured over a short sliding window (the length of this window is configurable and is shown on the last line of the output). This gives you some idea of the dynamics in the signal as well as the overall volume. Descriptions of the other fields can be found in the sox(1) man page.

Posted by Tomaž | Categories: Code | Comments »

Monitoring HomePlug AV devices

23.05.2018 18:51

Some time ago I wanted to monitor the performance of a network of Devolo dLAN devices. These are power-line Ethernet adapters. Each device looks like a power brick with a standard Ethernet RJ45 port. You can plug several of these into wall sockets around a building and theoretically they will together act as an Ethernet bridge, linking all their ports as if they were connected to a single network. The power-line network in question seemed to be having intermittent problems, but without some kind of a log it was hard to pinpoint exactly what the problem was.

I have very little experience with power-line networks and some quick web searches yielded conflicting information about how these things work and what protocols are at work behind the curtains. Purely from the user perspective, the experience seems to be similar to wireless LANs. While individual devices have flashy numbers written on them, such as 500 Mbps, these are just theoretical "up to" throughputs. In practice, the bandwidth of individual links in the network seems to be dynamically adjusted based on signal quality and is commonly quite a bit lower than advertised.

Devolo Cockpit screenshot.

Image by devolo.com

Devolo provides an application called Cockpit that allows you to configure the devices and visualize the power-line network. The feature I was most interested in was the real-time display of the physical layer bitrate for each individual link in the network. While the Cockpit is available for Linux, it is a user friendly point-and-click graphical application and chances were small that I would be able to integrate it into some kind of an automated monitoring process. The prospect of decoding the underlying protocol seemed easier. So I did a packet capture with Wireshark while the Cockpit was fetching the bitrate data:

HomePlug AV protocol decoded in Wireshark.

Wireshark immediately showed the captured packets as part of the HomePlug AV protocol and provided a nice decode. This finally gave me a good keyword I could base my further web searches on, which revealed a helpful white paper with some interesting background technical information. The HomePlug AV physical layer apparently uses frequencies in the range of 2-28 MHz, using OFDM with an adaptive number of bits per modulation symbol. The network management is centralized, using a coordinator and a mix of CSMA/CA and TDMA access.

More importantly, the fact that the Wireshark decode showed bitrate information in plain text gave me confidence that replicating the process of querying the network would be relatively straightforward. Note how the 113 Mbit/sec in the decode directly corresponds to hex 0x71 in the raw packet contents. It appeared that only two packets were involved, a Network Info Request and a Network Info Confirm:

HomePlug AV Network Info Confirmation packet decode.

However, before diving directly into writing code from scratch I came across the Faifa command-line tool on GitHub. The repository seems to be a source code dump from the now-defunct dev.open-plc.org web site. There is very little in terms of documentation or clues to its provenance. The last commit was in 2016. However, a browse through its source code revealed that it is capable of sending the 0xa038 Network Info Request packet and receiving and decoding the corresponding 0xa039 Network Info Confirm reply. This was exactly what I was looking for.

Some tweaking and a compile later I was able to get the bitrate info from my terminal. Here I am querying one device in the power-line network (its Ethernet address is in the -a parameter). The queried device returns the current network coordinator and a list of stations it is currently connected to, together with the receive and transmit bitrates for each of those connections:

# faifa -i eth4 -a xx:xx:xx:xx:xx:xx -t a038
Faifa for HomePlug AV (GIT revision master-5563f5d)

Started receive thread
Frame: Network Info Request (Vendor-Specific) (0xA038)

Dump:
Frame: Network Info Confirm (Vendor-Specific) (A039), HomePlug-AV Version: 1.0
Network ID (NID): xx xx xx xx xx xx xx
Short Network ID (SNID): 0x05
STA TEI: 0x24
STA Role: Station
CCo MAC: 
	xx:xx:xx:xx:xx:xx
CCo TEI: 0xc2
Stations: 1
Station MAC       TEI  Bridge MAC        TX   RX  
----------------- ---- ----------------- ---- ----
xx:xx:xx:xx:xx:xx 0xc2 xx:xx:xx:xx:xx:xx 0x54 0x2b
Closing receive thread

The original tool had some annoying problems that I needed to work around before deploying it to my monitoring system. Most of all, it operated by sending the query with the Ethernet broadcast address as the source. It then put the local network interface into promiscuous mode to listen for broadcasted replies. This seemed like bad practice and created problems for me, not least of which was log spam with repeated kernel warnings about promiscuous mode enters and exits. It's possible that the use of broadcasts was a workaround for a hardware limitation on some devices, but the devices I tested (dLAN 200 and dLAN 550) seem to reply just fine to queries from non-broadcast addresses.

I also fixed a race condition in the original tool that was due to the way it received the replies. If multiple queries were running on the same network simultaneously, replies from different devices sometimes got mixed up. Finally, I fixed some rough edges regarding libpcap usage that prevented the multi-threaded Faifa process from exiting cleanly once a reply was received, and I added a -t command-line option for sending and receiving a single packet.

As usual, the improved Faifa tool is available in my fork on GitHub:

$ git clone https://github.com/avian2/faifa.git
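
For periodic logging in a monitoring system, the tool can be wrapped in a simple loop along these lines (a sketch only; the interface name, station address and log path are placeholders):

#!/bin/bash
# Query one HomePlug AV device every 5 minutes and append the raw,
# timestamped faifa output to a log file for later plotting.
while true; do
	echo "=== $(date -Is)" >> /var/log/homeplug.log
	faifa -i eth4 -a xx:xx:xx:xx:xx:xx -t a038 >> /var/log/homeplug.log 2>&1
	sleep 300
done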

To conclude, here is an example of bitrate data I recorded using this approach. It shows transmit bitrates reported by one device in the network to two other devices (here numbered "station 1" and "station 2"). The data was recorded over the course of 9 days and the network was operating normally during this time:

Recorded PHY bitrate for two stations.

Even this graph shows some interesting things. Some devices (like the "station 1" here) seem to enter a power saving mode. Such devices don't appear in the network info reports, which is why data is missing for some periods of time. Even out of power saving mode, devices don't seem to update their reported bitrates if there is no data being transferred on that specific link. I think this is why the "station 2" here seems to have long periods where the reported bitrate remains constant.

Posted by Tomaž | Categories: Code | Comments »

Switching window scaling in GNOME

01.05.2018 13:22

A while back I got a new work laptop: a 13" Dell XPS 9360. I was pleasantly surprised that installing the latest Debian Stretch with GNOME went smoothly and no special tweaks were needed to get everything up and running. The laptop works great and the battery life in Linux is a significant step up from my old HP EliteBook. The only real problem I noticed after a few months of use is weird behavior of the headphone jack, which often doesn't work for some unknown reason.

In any case, this is my first computer with a high-DPI screen. The 13-inch LCD panel has a resolution of 3200x1800, which means that text on a normal X window screen is almost unreadable without some kind of scaling. Thankfully, the GNOME that ships with Stretch has relatively good support for window scaling. You can set a scaling factor in the GNOME Tweak Tool and all windows will have their content scaled by an integer factor, making text and icons legible again.

Window scaling setting in GNOME Tweak Tool.

This setting works fine for me, at least for terminal windows and Firefox, which is what I mostly use on this computer. I've only noticed some minor cosmetic issues when I change this at run-time. Some icons and buttons in GNOME Shell (like the bar on the top of the screen or the settings menu on the upper-right) will sometimes look weird until the next reboot.

A bigger annoyance was the fact that I often use this computer with a normal (non-high-DPI) external monitor. I had to open up the Tweak Tool each time I connected or disconnected the monitor. Navigating huge scaled UI on the low-DPI external monitor or tiny UI on the high-DPI laptop panel got old really quick. It was soon obvious that changing that setting should be a matter of a single key press.

Finding a way to set window scaling programmatically was surprisingly difficult (not unlike my older effort in switching audio output devices). I tried a few different approaches, like setting some dconf keys, but none worked reliably. I ended up digging into the Tweak Tool source. This revealed that the Tweak Tool is built around a nice Python library that exposes the necessary settings as functions you can call from your own scripts. The rest was simple.

I ended up with the following Python script:

#!/usr/bin/python2.7

from gtweak.utils import XSettingsOverrides

def main():
    xsettings = XSettingsOverrides()

    sf = xsettings.get_window_scaling_factor()

    if sf == 1:
        sf = 2
    else:
        sf = 1

    xsettings.set_window_scaling_factor(sf)

if __name__ == "__main__":
    main()

I have this script saved as toggle_hidpi and then a shortcut set in GNOME Keyboard Settings so that Super-F11 runs it. Note that using the laptop's built-in keyboard this means pressing the Windows logo, Fn, and F11 keys due to the weird modern practice of hiding normal function keys behind the Fn modifier. On an external USB keyboard, only Windows logo and F11 need to be pressed.

High DPI toggle shortcut in GNOME keyboard settings.
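
The shortcut can also be set from a script instead of clicking through the settings dialog. A possible way is via gsettings with a custom keybinding. Note that the schema and object path below are my assumptions for the GNOME version shipped with Stretch and the path to the script is just an example:

KEYS=org.gnome.settings-daemon.plugins.media-keys
P=/org/gnome/settings-daemon/plugins/media-keys/custom-keybindings/custom0/

# Warning: the first command overwrites any existing custom keybindings.
gsettings set $KEYS custom-keybindings "['$P']"
gsettings set $KEYS.custom-keybinding:$P name 'toggle_hidpi'
gsettings set $KEYS.custom-keybinding:$P command '/home/user/bin/toggle_hidpi'
gsettings set $KEYS.custom-keybinding:$P binding '<Super>F11'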

Posted by Tomaž | Categories: Code | Comments »

OpenCT on Debian Stretch

24.02.2018 10:03

I don't like replacing old technology that isn't broken, although I wonder sometimes whether that's just rationalizing the fear of change. I'm still using a bunch of old Schlumberger Cryptoflex (e-Gate) USB tokens for securely storing client-side SSL certificates. All of them are held together by black electrical tape at this point, since the plastic became brittle with age. However they still serve their purpose reasonably well, even if software support for them has been obsoleted a long time ago. So what follows is another installment of the series on keeping these hardware tokens working on the latest Debian Stable release.

Stretch upgrades the pcscd and libpcsclite1 packages (from the pcsc-lite project) to version 1.8.20. Unfortunately, this upgrade breaks the old openct driver, which is to my knowledge the only way to use these tokens on a modern system. This manifests itself as the following error when dumping the list of currently connected smart cards:

$ pkcs15-tool -D
Using reader with a card: Axalto/Schlumberger/Gemalo egate token 00 00
PKCS#15 binding failed: Unsupported card

Some trial and error and git bisect led me to commit 8eb9ea1 which apparently caused this issue. It was committed between releases 1.8.13 (which was shipped in Jessie) and 1.8.14. This commit introduces some subtle changes in the way buffers of data are exchanged between pcscd and its drivers, which break openct 0.6.20.

There are two ways around that: you can keep using pcscd and libpcsclite1 from Jessie (the 1.8.13 source package from Jessie builds fine on Stretch), or you can patch openct. I've decided on the second option.

The openct driver is no longer developed upstream and has been removed from Debian in Jessie (last official release was in 2010, although there has been some effort to modernize it). I keep my own git repository and Debian packages based on the last package shipped in Wheezy. My patched version 0.6.20 includes changes required for systemd support, and now also the patch required to support modern pcscd version on Stretch. The latter has been helpfully pointed out to me by Ludovic Rousseau on the pcsc-lite mailing list.

My openct packages for Stretch on amd64 can be found here (version 0.6.20-1.2tomaz2). The updated source is also in a git repository (with a layout compatible with git-buildpackage), should you want to build it yourself:

$ git clone http://www.tablix.org/~avian/git/openct.git

Other smart card-related packages work for me as shipped in Stretch (e.g. opensc and opensc-pkcs11 0.16.0-3). No changes were necessary in the Firefox configuration for it to be able to pull client-side certificates from the hardware tokens. It is still required, however, to insert the token only when no instances of Firefox are running.

Posted by Tomaž | Categories: Code | Comments »

News from the Z80 land

05.01.2018 19:30

Here are some overdue news regarding modern development tools for the vintage Z80 architecture. I've been wanting to give this a bit of exposure, but alas other things in life interfered. I'm happy that there is still this much interest in old computers and software archaeology, even if I wasn't able to dedicate much of my time to it in the last months.


Back in 2008, Stefano Bodrato added support for Galaksija to Z88DK, a software development toolkit for Z80-based computers. Z88DK consists of a C compiler and a standard C library (with functions like printf and scanf) that interfaces with Galaksija's built-in terminal emulation routines. His work was based on my annotated ROM disassembly and development tools. The Z88DK distribution also includes a few cross-platform examples that can be compiled for Galaksija, provided they fit into the limited amount of RAM. When compiling for Galaksija, the C compiler produces a WAV audio file that can be directly loaded over an audio connection, emulating an audio cassette recording.

This November, Stefano improved support for the Galaksija compile target. The target now more fully supports the graphics functions from the Z88DK library, including things like putsprite for Galaksija's 2x3 pseudographics. He also added joystick emulation to the Z88DK cross-platform joystick library. The library emulates two joysticks via the keyboard. The first one uses the arrow keys on Galaksija's keyboard. The second one uses the 5-6-7-8 keys in the top row that should be familiar to users of the ZX Spectrum. He also added a clock function that uses the accurate clock ticks provided by the ROM's video interrupt.

Z88DK clock example on Galaksija

(Click to watch Z88DK clock example on Galaksija video)

This made it possible to build more games and demos for Galaksija. I've tried the two-player snakes game and the TV clock demo and (after some tweaking) they both worked on my Galaksija replica. To try them yourself, follow the Z88DK installation instructions and then compile the demos under examples/graphics by adapting the zcc command-lines at the top of the source files.

For instance, to compile clock.c into clock.wav I used the following:

$ zcc +gal -create-app -llib3d -o clock clock.c

The second piece of Z80-related news I wanted to share relates to the z80dasm 1.1.5 release I made back in August. z80dasm is a disassembler for Z80 machine code. The latest release significantly improves the handling of undocumented instructions, a problem that was reported to me by Ast Moore. I've already written about that in more detail in a previous blog post.

Back in November I also updated the Debian packaging for z80dasm, so the new release is now also available as a Debian source or binary package that should cleanly build and install on Jessie, Stretch and Sid. Unfortunately the updated package has not yet been uploaded to the Debian archive. I have been unable to reach any of my previous sponsors (in the Debian project, a package upload must be sponsored by a Debian developer). If you're a Debian developer with an interest in vintage computing and would like to sponsor this upload, please contact me.

Update: z80dasm 1.1.5-1 package is now in Debian Unstable.

Posted by Tomaž | Categories: Code | Comments »

On piping Curl to apt-key

21.08.2017 16:52

Piping Curl to Bash is dangerous. Hopefully, this is well known at this point and many projects no longer advertise this installation method. Several popular posts by security researchers demonstrated all kinds of covert ways such instructions could be subverted by an attacker who managed to gain access to the server hosting the install script. Passing HTTP responses uninspected to other sensitive system commands is no safer, however.

This is how Docker documentation currently advertises adding their GPG key to the trusted keyring for Debian's APT:

Screenshot of instructions for adding Docker GPG key.

The APT keyring is used to authenticate packages that are installed by Debian's package manager. If an attacker can put their public key into this keyring they can for example provide compromised package updates to your system. Obviously you want to be extra sure no keys from questionable origins get added.

Docker documentation correctly prompts you to verify that the added key matches the correct fingerprint (in contrast to some other projects). However, the suggested method is flawed, since the fingerprint is only checked after an arbitrary number of keys has already been imported. Worse, the command shown specifically checks only the fingerprint of the Docker signing key, not the fingerprints of the key (or keys) that were actually imported.

Consider the case where download.docker.com starts serving an evil key file that contains both the official Docker signing key and the attacker's own key. In this case, the instructions above will not reveal anything suspicious. The displayed fingerprint matches the one shown in the documentation:

$ curl -fsSL https://www.tablix.org/~avian/docker/evil_gpg | sudo apt-key add -
OK
$ sudo apt-key fingerprint 0EBFCD88
pub   4096R/0EBFCD88 2017-02-22
      Key fingerprint = 9DC8 5822 9FC7 DD38 854A  E2D8 8D81 803C 0EBF CD88
uid                  Docker Release (CE deb) <docker@docker.com>
sub   4096R/F273FCD8 2017-02-22

However, importing our key file also silently added another key to the keyring. Considering that apt-key list outputs several screenfuls of text on a typical system, this is not easy to spot after the fact without specifically looking for it:

$ sudo apt-key list|grep -C1 evil
pub   1024R/6537017F 2017-08-21
uid                  Dr. Evil <evil@example.com>
sub   1024R/F2F8E843 2017-08-21

The solution is obviously not to pipe the key file directly into apt-key without inspecting it first. Interestingly, this is not straightforward. The pgpdump tool is recommended by the first answer that comes up when asking Google about inspecting PGP key files. The version in Debian Jessie fails to find anything suspicious about our evil file:

$ pgpdump -v
pgpdump version 0.28, Copyright (C) 1998-2013 Kazu Yamamoto
$ curl -so evil_gpg https://www.tablix.org/~avian/docker/evil_gpg
$ pgpdump evil_gpg|grep -i evil

Debian developers suggest importing the key into a personal GnuPG keyring, inspecting the integrity and then exporting it into apt-key. That is more of a hassle, and not that useful for Docker's key that doesn't use the web of trust. In our case, inspecting the file with GnuPG directly is enough to show that it in fact contains two keys with different fingerprints:

$ gpg --version|head -n1
gpg (GnuPG) 1.4.18
$ gpg --with-fingerprint evil_gpg
pub  4096R/0EBFCD88 2017-02-22 Docker Release (CE deb) <docker@docker.com>
      Key fingerprint = 9DC8 5822 9FC7 DD38 854A  E2D8 8D81 803C 0EBF CD88
sub  4096R/F273FCD8 2017-02-22
pub  1024R/6537017F 2017-08-21 Dr. Evil <evil@example.com>
      Key fingerprint = 3097 2749 79E6 C35F A009  E25E 6CD6 1308 6537 017F
sub  1024R/F2F8E843 2017-08-21

The key file used in the example above was created by simply concatenating the two ASCII-armored public key blocks. It looks pretty suspicious in a text editor because it contains two ASCII headers (which is probably why pgpdump stops processing it before the end). However, two public key blocks could easily have also been exported into a single ASCII-armor block.
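
To sum up, a slightly safer version of the original instructions is to first download the key into a file, inspect what the file actually contains and only then add it to the keyring, along these lines (KEY_URL here stands for the key URL from the vendor's installation instructions):

$ curl -fsSL "$KEY_URL" -o vendor.gpg
$ gpg --with-fingerprint vendor.gpg
(check that there is exactly one key and that its fingerprint matches the documentation)
$ sudo apt-key add vendor.gpg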

Posted by Tomaž | Categories: Code | Comments »

z80dasm 1.1.5

06.08.2017 12:26

This is becoming a summer of retro computing blog posts, isn't it? In any case, today I'm pleased to announce the release of z80dasm 1.1.5, the latest version of my disassembler for the Z80 CPU machine code.

Ast Moore recently noticed that z80dasm doesn't recognize several well-known undocumented instructions. When I created z80dasm on the basis of Jan Panteltje's dz80 several years ago, my goal was to make a disassembler that would correctly produce an assembly listing for Galaksija's ROM. While I did add support for the sli instruction back then, it seems that Galaksija didn't use any of the other undocumented instructions.

z80dasm 1.1.4 did correctly include unknown instructions in the disassembly as raw hex values with defb directives, however it guessed the length of some of them wrong, which desynced the instruction decoder and produced garbage output. The output would still assemble back into the same binary, but the point of a disassembler is to produce human-readable output, so that was not very useful.

Fixing this problem was a bit more involved than I expected, but not for the usual reasons. One of the guidelines I had with z80dasm was to keep z80dasm and z80asm a matched pair. Being able to reproduce back the original binary from disassembled source is very useful when hacking on old software as well as in testing z80dasm. Unfortunately, it turned out that z80asm is quite bad at supporting these undocumented instructions as well. For instance, inc and dec with ixl and ixh register operands are not recognized at all. Some other instructions, like add with these undocumented operands produce wrong opcodes.

I guess this is not surprising. In contrast to the official instruction set there is no authoritative documentation for these instructions, so even the names differ in different sources. For example, both sli a and sll a mnemonics are commonly used for the cb 37 opcode.

I tried to fix some of the problems with z80asm I found, but it turned out to be quite time consuming. Both z80asm and z80dasm are written in, by today's standards, quite an archaic C style, with lots of pointer arithmetic and packing of various unrelated things into a single int variable to save space. While I'm still fairly familiar with the z80dasm code, the modifications I did to z80asm took several sheets of paper to figure out.

$ hd example.bin
00000000  ed 70 dd 2c dd 23 23 3d  ed 71 dd 09              |.p.,.##=.q..|
0000000c
$ z80dasm example.bin
; z80dasm 1.1.5
; command line: z80dasm example.bin

	org	00100h

	defb 0edh,070h	;in f,(c)
	defb 0ddh,02ch	;inc ixl
	inc ix
	inc hl	
	dec a	
	defb 0edh,071h	;out (c),0
	add ix,bc

In the end, I gave up and solved the problem of z80asm compatibility by leaving z80dasm to decode undocumented instructions as defb directives. However, the instruction lengths should now be decoded correctly and the mnemonic is also included in a comment for better readability. There is now also a new --undoc option in 1.1.5. If it's specified on the command line, z80dasm will place the undocumented instructions directly into the disassembly using their mnemonics instead of the defb directives.
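
For example, to get the undocumented instructions decoded directly instead of the defb directives shown above:

$ z80dasm --undoc example.bin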

As before, you can get the source code for z80dasm in a release tarball or from the git repository. See the README file for install instructions and the included man page for explanations of all command-line options. z80dasm is also included in Debian, however 1.1.5 has not been uploaded yet.

Posted by Tomaž | Categories: Code | Comments »

IPv6 problems on TP-Link Archer C20

26.04.2017 12:44

Recently I replaced an old 2.4 GHz-only Linksys WRT54GL wireless router with a shiny new, dual band TP-Link Archer C20 (hardware version V1). Unfortunately, the new router brought some unusual problems. It turns out some devices are now unable to get a global IPv6 address when connected over Wi-Fi. For example, my Android 5.1 smartphone and my work laptop with Debian Jessie and Network Manager don't get IPv6 connectivity. They worked just fine when connected through the old router. At the same time, a different phone with Android 6.0 seems to have no problems with the new Archer C20 router.

First a brief note on the network setup: the Archer C20 is used here as a wireless access point only. Some other host on the network acts as a gateway to the Internet. That host also provides a DHCP service for IPv4 and runs the router advertisement daemon (radvd) for IPv6 SLAAC. The setup has been quite well tested and works flawlessly on the wired Ethernet. The old WRT54GL was also used in this way, which is why IPv6 connectivity on the Wi-Fi worked fine even though the old router's firmware had no explicit IPv6 support.

As the TP-Link FAQ entry explains, the WAN port on the C20 is unused, the network is connected to one of the LAN ports and the DHCP server on the C20 is disabled. IPv6 status tab in the configuration interface shows the following:

IPv6 status tab in Archer C20 configuration interface.

The IPv6 problem is somewhat frustrating to diagnose, since it only appears some time after the router has been restarted. For instance, I've usually seen that IPv6 stops working the next day after a reboot. Similarly, changing some unrelated settings, like the wireless SSID, also appears to temporarily fix the issue, only for it to reappear after a while.

Searching the web I can find some discussions about similar problems with TP-Link routers, with no clear conclusion. The firmware changelog does say in a vague way that the latest version fixes an IPv6 problem. However, I've tried the V1_160427 and V1_151120 firmwares and they both behave in the same (broken) way.

Modifications in Archer C20 firmware.

After much head scratching I found out that the root cause of my laptop not getting an IPv6 address over Wi-Fi is failing IPv6 duplicate address detection. This is apparent from the dadfailed flag on the link-local address:

$ ip addr show dev wlan0
3: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether a0:88:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 192.168.xx.xx/24 brd 192.168.xx.255 scope global dynamic wlan0
       valid_lft 484sec preferred_lft 484sec
    inet6 fe80::a288:xxxx:xxxx:xxxx/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever

Also, when you know what to look for, this error appears in the logs:

$ dmesg|grep duplicate
IPv6: wlan0: IPv6 duplicate address fe80::a288:xxxx:xxxx:xxxx detected!

So it seems that my laptop thinks that there is another device on the network with the same link-local (and hence Ethernet MAC) address. This is of course not true. In fact, if I disable the duplicate address detection, IPv6 starts working properly:

# sysctl net.ipv6.conf.wlan0.accept_dad=0

Investigating things a bit further, Wireshark shows the following curious packet capture immediately after the laptop connects to the wireless network:

Packet capture during IPv6 neighbor solicitation.

This appears to be the progress of a normal attempt at IPv6 autoconfiguration from the laptop's side. The laptop (with the MAC address a0:88:...) sends some packets to an IPv6 multicast address (33:33:...). However, all these packets seem to be immediately reflected back to the laptop by the Archer C20. The incoming packets highlighted in yellow are byte-by-byte identical to the preceding outgoing packets, with only the destination address in the Ethernet header changed from the multicast address to the laptop's MAC address. These incoming packets are not present when the laptop is connected to one of the wired LAN ports, or when using the old wireless router.

These reflected packets trigger the duplicate address detection in the laptop's network stack and the autoconfiguration is interrupted. It seems that at some point in the Archer C20 uptime, IPv6 multicast groups stop working correctly. In fact I don't understand why it even tries to do anything special with those. WRT54GL had no concept of IPv6 and it worked fine. I've experimented with other options that looked related to multicast (like IGMP settings), but with no success. So unfortunately at the moment I don't have a good network-side solution. Any suggestions would be most welcome. Changing the stock firmware might work, but support in OpenWRT for this hardware currently seems experimental at best.

The device-side work-around is to disable DAD like I show above, but this is somewhat ugly. There might be a way to disable it on a per-network basis with Network Manager (see dad-timeout), but I have not tried this yet. It is also still not clear why some devices appear to work. It might be that Android simply disabled duplicate address detection in 6.0.
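
If you settle for this work-around, the sysctl setting shown above can be made persistent across reboots with a drop-in file, for example:

# cat /etc/sysctl.d/99-disable-dad-wlan0.conf
net.ipv6.conf.wlan0.accept_dad = 0

This is only a sketch: if wlan0 only appears after the boot-time sysctl settings are applied, the per-interface key may not take effect and net.ipv6.conf.default.accept_dad might have to be used instead, at the cost of disabling DAD on all interfaces.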

Update: This Mikrotik manual suggests that such repeated multicast packets are expected on wireless access points that are aware of multicast addresses. So it still might be that the problem is somewhere on the client side. I did find out that enabling Client Isolation on the Archer C20 fixes this problem (with the obvious side effect that wireless clients can no longer talk to each other). dad-timeout NetworkManager option is not supported on Debian Jessie.

Update: Client Isolation doesn't actually help. After a few days I'm again getting the dadfailed flag.

Posted by Tomaž | Categories: Code | Comments »

GIMP onion layers plug-in

21.02.2017 20:48

Some time ago I was playing with animation making applications on a (non-pro) iPad. I found the whole ecosystem very closed and I had to jump through some hoops to get my drawings back onto a Linux computer. However the fact that you can draw directly on the screen does make some things easier compared to a standalone Wacom tablet, even if the accuracy is significantly worse.

One other thing in particular stood out compared to my old GIMP setup. These applications make it very easy to jump frame by frame through the animation. In one touch you can display the next frame and do some quick edits and then move back with another touch. You can browse up and down the stack as a quick way to preview the animation. They also do something they call onion layering which simply means that they overlay the next and previous frames with reduced opacity so that it's easier to see how things are moving around.

This is all obviously useful. I was doing similar things in GIMP, except that changing frames there took some more effort. GIMP as such doesn't have a concept of frames. Instead you use image layers (or layer groups) as frames. You have to click to select a layer and then a few more clicks to adjust the visibility and opacity for neighboring layers if you want to have the onion layer effect. This quickly amounts to a lot of clicking around if you work on more than a handful of frames.

GIMP does offer a Python plug-in interface however, so automating quick frame jumps is relatively simple. Relatively, because GIMP Python Documentation turns out to be somewhat rudimentary if you're not already familiar with GIMP internals. I found it best to learn from the Python-Fu samples and explore the interface using the built-in interactive console.

Screenshot of the GIMP onion layers plug-in

The end result of this exercise was the GIMP onion layers plug-in, which you can now find on GitHub together with installation and usage instructions. The plug-in doesn't have much in terms of a user interface - it merely registers a handful of python-fu-onion- actions for stepping to the previous or next frame, with or without the onion layer effect. The idea is that you then assign keyboard (or tablet button) shortcuts to these actions. You will have to define the shortcuts yourself though, since the plug-in can't define them for you. I like to use the dot and comma keys since they don't conflict with other GIMP shortcuts and match the typical frame step buttons on video players.

If you follow the layer structure suggested by the Export layers plug-in, this all works quite nicely, including handling of background layers. The only real problem I encountered was the fact that the layer visibility and opacity operations clutter the undo history. Unfortunately, that seems to be the limitation of the plug-in API. Other plug-ins work around this by doing operations on a duplicate of the image, but obviously I can't do that here.

I should note that I was using GIMP 2.8.14 from Debian Jessie, so the code might be somewhat outdated compared to latest GIMP 2.8.20. Feedback in that regard is welcome, as always.

Posted by Tomaž | Categories: Code | Comments »

Python applications in a .zip

08.02.2017 10:20

Poking around youtube-dl I found this interesting recipe on how to package a self-contained Python application. Youtube-dl ships as a single executable file you can run immediately, or put somewhere into your PATH. This makes it very convenient to use even when you don't want to do the usual pip install dance. Of course, it comes at the cost of not resolving any dependencies for you.

I was expecting the file to be a huge, monolithic Python source file, but in fact it's a ZIP with a prepended hash-bang and nicely structured Python package inside. Simplified a bit, here is the gist of the Makefile part that builds it:

hello:
	cd src && zip ../hello hello/*.py __main__.py
	echo '#!/usr/bin/python' > hello
	cat hello.zip >> hello
	rm hello.zip
	chmod a+x hello

Now, if src/__main__.py contains:

import hello
hello.greet()

And src/hello/__init__.py contains:

def greet():
	print("Hello, World!")

Building the executable and running hello from the command-line should result in the standard greeting:

$ make
cd src && zip ../hello hello/*.py __main__.py
  adding: hello/__init__.py (stored 0%)
  adding: __main__.py (stored 0%)
echo '#!/usr/bin/python' > hello
cat hello.zip >> hello
rm hello.zip
chmod a+x hello
$ ./hello
Hello, World!

How does this work? Apparently it's quite an old trick with some added refinement. Since version 2.3, Python has known how to import modules directly from ZIP files in the same way as from the usual directories. Python also allows executing modules, directories and ZIP files that contain a __main__.py directly from the command-line.

It sounds very much like Java JARs, doesn't it? The only missing part is the #!... line that makes the Linux kernel use the Python interpreter when executing the file. Since ZIP format ignores any junk that precedes the compressed data, the line can simply be prepended as if the whole file was a simple Bash script.
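
As a side note, Python 3.5 and later ship the zipapp module in the standard library, which automates exactly these steps. Assuming the same src layout as above, the whole Makefile rule could arguably be replaced with a single command:

$ python3 -m zipapp src -p '/usr/bin/python3' -o hello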

Posted by Tomaž | Categories: Code | Comments »

Moving Dovecot indexes and control files

23.12.2016 22:06

The Dovecot IMAP server can use a standard Maildir for storage of messages inside users' home directories. The default in that case is to store search indexes and control files in the same directory structure, alongside the mail files. That can be convenient, since no special setup is needed and everything is stored in the same place.

However, this doesn't work very well if you have disk quotas enabled on the filesystem that stores the Maildirs. If a user reaches their quota, Dovecot will not be able to write to its own files, which can lead to problems. Hence, the documentation recommends that you configure a separate location for Dovecot's files in that case. This is done with the INDEX and CONTROL options in the mail_location specification.

For example, after setting up appropriate directory structure and permissions under /var/lib/dovecot:

mail_location = maildir:%h/Maildir:INDEX=/var/lib/dovecot/index/%u:CONTROL=/var/lib/dovecot/control/%u

You can just set this up and leave the old index and control files in place. In that case, Dovecot will automatically regenerate them. However, this is not ideal. It can take a significant amount of time to regenerate the indexes if you have a lot of mail. You also lose some IMAP-related metadata, like message flags and unique IDs, which will confuse IMAP clients. It would be better to move the existing files to the new location; however, the documentation doesn't say how to do that.

I found that the following script works with Dovecot 2.2.13 on Debian Jessie. As always, be careful when dealing with other people's mail and double check that the script does what you want. I had my share of problems when coming up with this. Make backups.

#!/bin/bash

set -ue

# Run as "migrate_dovecot.sh user path-to-users-maildir" for each user.

# Make sure that Dovecot isn't running or that this specific IMAP user isn't
# connected (and can't connect) while this script runs!
USERNAME=$1
MAILDIR=$2

DOVECOTDIR=/var/lib/dovecot

# Correct after double-checking that this script does what you want.
MV="echo mv -i"

cd $MAILDIR

# Index files like dovecot.index, dovecot.index.cache, etc. go under the 
# INDEX directory. The directory structure should be preserved. For example,
# ~/Maildir/.Foo/dovecot.index should go to index/.Foo/dovecot.index.

# Exception are index files in the root of Maildir. Those should go under 
# .INBOX
b=$DOVECOTDIR/index/$USERNAME/.INBOX
mkdir -p "$b"
$MV *index* "$b"

find . -name "*index*"|while read a; do
	b=$DOVECOTDIR/index/$USERNAME/`dirname "$a"`
	mkdir -p "$b"
	$MV "$a" "$b"
done

# dovecot-uidlist and dovecot-keywords files should go under CONTROL, in a
# similar way to indexes. There is the same exception for .INBOX.
b=$DOVECOTDIR/control/$USERNAME/.INBOX
mkdir -p "$b"
$MV dovecot-uidlist dovecot-keywords "$b"

find . -name "*dovecot*"|while read a; do
	b=$DOVECOTDIR/control/$USERNAME/`dirname "$a"`
	mkdir -p "$b"
	$MV "$a" "$b"
done

# subscriptions file should go to the root of the control directory.

# Note that commands above also move some dovecot-* files into the root of
# the control directory. This seems to be fine.
$MV "subscriptions" "$DOVECOTDIR/control/$USERNAME"

Posted by Tomaž | Categories: Code | Comments »

Blacklisting users for inbound mail in Exim

18.09.2016 12:00

You can prevent existing local users from receiving mail by redirecting them to :fail: in /etc/aliases. For example, to make SMTP delivery to list@... fail with 550 Unrouteable address:

list: :fail:Unrouteable address

See special items section in the Redirect router documentation.

By default, Exim in Debian will attempt to deliver mail for all user accounts, even non-human system users. System users (like list above) typically don't have a traditional home directory set in /etc/passwd. This means that mail for them will get stuck in the queue as Exim tries and fails to write to their mailboxes. Because spam also gets sent to such addresses, the mail queue will grow and various things will start to complain. Traditionally, mail for system accounts is redirected to root in /etc/aliases, but some accounts just receive a ton of spam and it's better to simply reject mail sent to them.

Another thing worth pointing out is the Handling incoming mail for local accounts with low UID section in README.Debian.gz in case you want to reject mail sent to all system accounts.

This took way too much time to figure out. There are a ton of guides on how to blacklist individual users for outgoing mail, but I couldn't find any for cases like this. I was halfway into writing a custom router before I stumbled upon this feature.

Posted by Tomaž | Categories: Code | Comments »

Script for setting up multiple terminals

16.09.2016 19:04

Sometimes I'm working on software that requires running a lot of different inter-dependent processes (are microservices still a thing?). Using systemd or some other init system for starting up such systems is fine for production. While debugging something on my laptop, however, it's useful to have each process running in its own X terminal. This allows me to inspect any debug output and to occasionally restart something. I used to have scripts that would run commands in individual GNU screen sessions, but that had a number of annoying problems.

I've recently come up with the following:

#!/bin/bash

set -ue

if [ "$#" -ne 1 ]; then
	echo "USAGE: $0 path_to_file"
	echo "File contains one command per line to be started in a terminal window."
	exit 1
fi

cat "$1" | while read CMD; do
	if [ -z "$CMD" -o "${CMD:0:1}" = "#" ]; then
		continue
	fi

	RCFILE=`tempfile`

	cat > "$RCFILE" <<END
source ~/.bashrc
history -s $CMD
echo $CMD
$CMD
END

	gnome-terminal -e "/bin/bash --rcfile \"$RCFILE\""
	rm "$RCFILE"
done

This script reads a file that contains one command per line (an example file is shown after the list below). Empty lines and lines starting with a hash sign are ignored. For each line it opens a new gnome-terminal (adjust as needed - most terminal emulators support the -e argument) and runs the command in a way that:

  • The terminal doesn't immediately close after the command exits. Instead it drops back to bash. This allows you to inspect any output that got printed right before the process died.
  • The command is printed on top of the terminal before the command runs. This allows you to identify the terminal running a particular process in case that is not obvious from the command output. For some reason, gnome-terminal's --title doesn't work.
  • The command is appended to the top of the bash history list. This allows you to easily restart a process that died (or that you killed with Ctrl-C) by simply pressing the up cursor and enter keys.
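
A hypothetical commands file for a project with a few services might look like this:

# processes needed for local development
./run_database.sh
./run_backend.sh --debug
./run_frontend.sh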

Posted by Tomaž | Categories: Code | Comments »

The NumPy min and max trap

30.08.2016 20:21

There's an interesting trap that I've managed to fall into a few times when doing calculations with Python. NumPy provides several functions with the same name as functions built into Python. These replacements typically provide better integration with array types. Among them are min() and max(). In the vast majority of cases, the NumPy versions are a drop-in replacement for the built-ins. In a few, however, they can cause some very hard-to-spot bugs. Consider the following:

import numpy as np

print(max(-1, 0))
print(np.max(-1, 0))

This prints (at least in NumPy 1.11.1 and earlier):

0
-1

Where is the catch? The built-in max() can be used in two distinct ways: you can either pass it an iterable as the single argument (in which case the largest element of the iterable will be returned), or you can pass multiple arguments (in which case the largest argument will be returned). In NumPy, max() is an alias for amax() and that only supports the former convention. The second argument in the example above is interpreted as the array axis along which to compute the maximum. It appears that NumPy thinks axis zero is a reasonable choice for a zero-dimensional input and doesn't complain.

Yes, recent versions of NumPy will complain if you pass anything other than 0 or -1 in the axis argument. Having max(x, 0) in code is not that unusual though. I use it a lot as a shorthand when I need to clip negative values to 0. When moving code around between scripts that use NumPy, those that don't, and IPython Notebooks (which do "from numpy import *" by default), it's easy to mess things up.

I guess both sides are to blame here. I find that flexible functions that interpret arguments in multiple ways are usually bad practice and I try to leave them out of interfaces I design. Yes, they are convenient, but they also often lead to bugs. On the other hand, I would also expect NumPy to complain about the nonsensical axis argument. Axis -1 makes sense for a zero-dimensional input, axis 0 doesn't. The alias from max() to amax() is dangerous (and as far as I can see undocumented). A possible way to prevent such mistakes would be to support only the named version of the axis argument.
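
When the goal is just to clip negative values, one way to avoid the ambiguity altogether is to use a function that doesn't have this dual interface, for example the element-wise np.maximum():

import numpy as np

x = np.array([-1.5, 0.3, -0.2, 2.0])

# np.maximum() computes the element-wise maximum of its two arguments,
# so the second argument can never be mistaken for an axis.
print(np.maximum(x, 0))
# prints [ 0.   0.3  0.   2. ]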

Posted by Tomaž | Categories: Code | Comments »