Recent free software work

30.07.2016 9:25

I've done a bit of open source janitorial work recently. Here is a short recap.


jsonmerge is a Python module for merging a series of JSON documents using arbitrarily complicated rules inspired by JSON schema. I developed the gist of it with Sarah Bird during EuroPython 2014. Since then I've been maintaining it, but not doing any further development. The 1.2.1 release fixes a bug in the internal handling of JSON references, reported by chseeling. The bug caused a RefResolutionError to be raised when merging properties with slash or tilde characters in them.
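
To give a rough idea of how it's used, here is a minimal sketch. The documents and the schema, with its "append" merge strategy, are made up for this example:

from jsonmerge import Merger

# The schema tells jsonmerge to concatenate the "releases" lists
# instead of having the head value overwrite the base value.
schema = {
    "properties": {
        "releases": {"mergeStrategy": "append"}
    }
}

merger = Merger(schema)

base = {"name": "foo", "releases": ["1.0"]}
head = {"name": "foo", "releases": ["1.1"]}

# Prints {'name': 'foo', 'releases': ['1.0', '1.1']}
print merger.merge(base, head)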

I believe jsonmerge is usable in its current form. The only larger problem I know of is that automatic schema generation for merged documents would need to be rethought and probably refactored. This would address the incompatibility with jsonschema 2.5.0 and improve handling of some edge cases. However, get_schema() seems to be rarely used. I don't have any plans to work on this issue at the moment as I'm not using jsonmerge myself, but I would be happy to look into any pull requests, or work on it myself if anyone offered a suitable bounty.


aspell-sl is the Slovenian dictionary for GNU Aspell. Its Debian package was recently orphaned. As far as I know, this is currently the only Slovenian dictionary included in Debian. I took over as the maintainer of the Debian package and fixed several long-standing packaging bugs to prevent it from disappearing from the next Debian stable release. I haven't updated the dictionary itself, however. The word list, while usable, remains as it was at its last update, sometime in 2002.

The situation with this dictionary seems complicated. The original word list appears to have been prepared in 2001 or 2002 by a diverse group of people from JSI, LUGOS, University of Ljubljana and private companies. I'm guessing they were funded by the now-defunct Ministry of Information Society, which was financing localization of open source projects around that time. The upstream web page is long gone. In fact, Aspell itself doesn't seem to be doing that well, although I'm still a regular user. The only free and up-to-date Slovenian dictionary I've found on-line is in the Slovenian Dictionary Pack for LibreOffice. It seems the word list from there would require relatively little work to be adapted for GNU Aspell (Hunspell dictionaries use very similar syntax). However, the upstream source of the data in the pack is unclear to me and I hesitate to mess too much with things I know very little about.


z80dasm is a disassembler for the Zilog Z80 microprocessor. I forked the dz80 project by Jan Panteltje when it became apparent that no freely available disassembler was capable of correctly disassembling Galaksija's ROM. The 1.1.4 release adds options for better control of labels generated at the start and end of sections in the binaries. It also fixes a memory corruption bug that could sometimes lead to a wrong disassembly.

Actually, I committed these two changes to the public git repository three years ago. Unfortunately, it seems I forgot to package them into a new release at the time. Now I also took the opportunity to update and clean up the autotools setup. I'll work towards updating the z80dasm Debian package as well. z80dasm is pretty much feature complete at this point and, barring any further bug reports, I don't plan any further development.

Posted by Tomaž | Categories: Code | Comments »

On "ap_pass_brigade failed"

25.05.2016 20:37

Related to my recent rant regarding the broken Apache 2.4 in Debian Jessie, another curious thing was the appearance of the following in /var/log/apache2/error.log after the upgrade:

[fcgid:warn] [pid ...:tid ...] (32)Broken pipe: [client ...] mod_fcgid: ap_pass_brigade failed in handle_request_ipc function, referer: ...

Each such error also corresponds to a 500 Internal Server Error HTTP response logged in the access log.

There's a lot of misinformation floating around about this on the web. Contrary to popular opinion, this is not caused by wrong values of the various Fcgid... options or the PHP_FCGI_MAX_REQUESTS variable. Actually, I don't know much about PHP (which seems to be the primary use case for FCGI), but I do know how to read the mod_fcgid source code, and this error seems to have a very simple cause: clients that close the connection without waiting for the server to respond.

The error is generated on line 407 of fcgid_bridge.c (mod_fcgid 2.3.9):

/* Now pass any remaining response body data to output filters */
if ((rv = ap_pass_brigade(r->output_filters,
                          brigade_stdout)) != APR_SUCCESS) {
    if (!APR_STATUS_IS_ECONNABORTED(rv)) {
        ap_log_rerror(APLOG_MARK, APLOG_WARNING, rv, r,
                      "mod_fcgid: ap_pass_brigade failed in "
                      "handle_request_ipc function");
    }

    return HTTP_INTERNAL_SERVER_ERROR;
}

The comment at the top already suggests the cause of the error message: failure to send the response generated by the FCGI script. The condition is easy to reproduce with a short Python script that sends a request and immediately closes the socket:

import socket, ssl

HOST="..."
# path to some document generated by an FCGI script
PATH="..."

ctx = ssl.create_default_context()
conn = ctx.wrap_socket(socket.socket(socket.AF_INET), server_hostname=HOST)
conn.connect((HOST, 443))
# Python 2 string semantics - on Python 3 the request must be bytes.
conn.sendall("GET " + PATH + " HTTP/1.0\r\nHost: " + HOST + "\r\n\r\n")
conn.close()

Actually, you can do the same with a browser by mashing the refresh and stop buttons. Success somewhat depends on how long the script takes to generate the response - for very fast scripts it's hard to tear down the connection quickly enough.

Probably at some point ap_pass_brigade() returned ECONNABORTED when the client broke the connection, hence the if statement in the code above. It appears that EPIPE is now returned instead and mod_fcgid was not properly updated. I was testing this on apache2 2.4.10-10+deb8u4.

In any case, this error message is benign. Fiddling with FcgidOutputBufferSize might cause the response to be sent out earlier and reduce the chance that this is triggered by buggy crawlers and such, but in the end there is nothing you can do about it on the server side. The 500 response in the log is also clearly an artifact in this case: it's the client that caused the error, not the server, and no error page was actually delivered.

Posted by Tomaž | Categories: Code | Comments »

Jessie upgrade woes

23.05.2016 19:59

Debian 8 (Jessie) was officially released a bit over a year ago. The January before that I mentioned that I planned to upgrade my CubieTruck soon, which in this case turned out to mean 16 months. Doesn't time fly when you're not upgrading software? In any case, here are some assorted notes regarding the upgrade from Debian Wheezy. Most of them are not CubieTruck-specific, so I guess someone else might find them useful. Or entertaining.

Jessie armhf comes with kernel 3.16, which supports CubieTruck's Allwinner SoC and most of the peripherals I care about. However, it seems you can't use the built-in NAND flash for booting. It would be nice to get away from the sunxi 3.4 kernel and enjoy kernel updates through apt, but I don't want to get back to messing with SD cards. Daniel Andersen keeps the 3.4 branch reasonably up-to-date and Jessie doesn't seem to have problems with it, so I'll stick with that for the time being.

CubieTruck

The dreaded migration to systemd didn't cause any problems, apart from having to migrate a couple of custom init.d scripts. The most noticeable change is a significant increase in the number of mounted tmpfs filesystems, which makes df output somewhat unwieldy and, by consequence, Munin's disk usage graphs a mess.

SpeedyCGI was a way of making dynamic web pages back in the olden days. In the best Perl tradition it tweaked some low-level parts of the language in order to avoid restarting the interpreter for each HTTP request - like automagically persisting global state and making exit() not actually exit. From a standpoint of a lazy web developer it was an incredibly convenient way to increase performance of plain old CGI scripts. But alas, it remained unmaintained for many years and was finally removed in Jessie.

FCGI and Apache's mod_fcgid (not to be confused with mod_fastcgi, its non-free and slightly more broken cousin) seemed like natural replacements. While FCGI makes persistence explicit, the programming model is more or less the same, and hence the migration required only some minor changes to my scripts - and working around various cases of FCGI's brain damage. Like, for instance, its intentional ignorance of Perl's built-in Unicode support. Or the fact that gracefully stopping worker processes is more or less unsupported. In fact, FCGI's process management seems to be broken on multiple levels, as mod_fcgid has problems maintaining a stand-by pool of workers.

Perl despair

In any case, the new Apache 2.4 is a barrel of fun by itself. It changes the syntax for access control in such a way that config files need to be updated manually. It now also ignores all config files that don't end in .conf. Incidentally, Apache will serve files from /var/www/html if it has no VirtualHosts defined. This seems to be a hard-coded default, so you can't find out why it's doing that by grepping through /etc/apache2.

The default config in Jessie frequently warns about deadlocks in various places:

(35)Resource deadlock avoided: [client ...] mod_fcgid: can't lock process table in pid ...
(35)Resource deadlock avoided: AH00273: apr_proc_mutex_lock failed. Attempting to shutdown process gracefully.
(35)Resource deadlock avoided: AH01948: Failed to acquire OCSP stapling lock

I'm currently using the following in apache2.conf, which so far seems to work around this problem:

# was: Mutex file:${APACHE_LOCK_DIR} default
Mutex sem default

Apache 2.4 in Jessie breaks the HTTP ETag caching mechanism. If you're using mod_deflate (it's used by default to compress text-based content like HTML, CSS and RSS), browsers won't be getting 304 Not Modified responses, which means longer load times and higher bandwidth use. The workaround I'm using is the following in mods-available/deflate.conf (you also need to enable mod_headers):

Header edit "Etag" '^"(.*)-gzip"$' '"$1"'

This differs somewhat from the solution proposed in Apache's Bugzilla, but as far as I can see restores the old and tested behavior of Apache 2.2, even if it's not exactly up to HTTP specification.

I wonder whether this state of affairs means that everyone has moved on to nginx or these are just typical problems for a new major release. Anyway, to conclude on a more positive note, Apache now supports OCSP stapling, which is pretty simple to enable.

Finally, rsyslog is slightly broken in Jessie on headless machines that don't have an X server running. It spams the log with lines like:

rsyslogd-2007: action 'action 17' suspended, next retry is Sat May 21 18:12:53 2016 [try http://www.rsyslog.com/e/2007 ]

This can be worked around by commenting out the following lines in rsyslog.conf:

#daemon.*;mail.*;\
#       news.err;\
#       *.=debug;*.=info;\
#       *.=notice;*.=warn       |/dev/xconsole

Posted by Tomaž | Categories: Code | Comments »

Display resolutions in QEMU in Windows 10 guest

21.01.2016 18:08

A while back I posted a recipe on how to add support for non-standard display resolutions to QEMU's stdvga virtual graphics card. For instance, I use it to add a wide-screen 1600x900 mode for my laptop. That recipe still works with a Windows 7 guest and the latest QEMU release, 2.5.0.

On Windows 10, however, the only resolution you can set with that setup is 1024x768. Getting it to work requires another approach. I should warn though that performance seems quite bad. Specifically, opening a web page that has any kind of dynamic elements on it can slow down the guest so much that just closing the offending window can take a couple of minutes.

First of all, the problem with Windows 10 being stuck at 1024x768 can be solved by switching the VGA BIOS implementation from the Bochs BIOS, which is shipped with QEMU upstream by default, to SeaBIOS. Debian packages switched to SeaBIOS in 1.7.0+dfsg-2, so if you are using QEMU packages from Jessie, you should already be able to set resolutions other than 1024x768 in Windows' Advanced display settings dialog.

If you're compiling QEMU directly from upstream, switching the VGA BIOS is as simple as replacing the vgabios-stdvga.bin file. Install the seabios package and run the following:

$ rm $PREFIX/share/qemu/vgabios-stdvga.bin
$ ln -s /usr/share/seabios/vgabios-stdvga.bin $PREFIX/share/qemu

where $PREFIX is the prefix you used when installing QEMU. If you're recompiling QEMU often, the following patch for QEMU's top-level Makefile does this automatically on make install:

--- qemu-2.5.0.orig/Makefile
+++ qemu-2.5.0/Makefile
@@ -457,6 +457,8 @@ ifneq ($(BLOBS),)
 	set -e; for x in $(BLOBS); do \
 		$(INSTALL_DATA) $(SRC_PATH)/pc-bios/$$x "$(DESTDIR)$(qemu_datadir)"; \
 	done
+	rm -f "$(DESTDIR)$(qemu_datadir)/vgabios-stdvga.bin"
+	ln -s /usr/share/seabios/vgabios-stdvga.bin "$(DESTDIR)$(qemu_datadir)/vgabios-stdvga.bin"
 endif
 ifeq ($(CONFIG_GTK),y)
 	$(MAKE) -C po $@

Now you should be able to set resolutions other than 1024x768, but SeaBIOS still doesn't support non-standard resolutions like 1600x900 by default. For that, you need to amend the list of video modes and recompile SeaBIOS. Get the source (apt-get source seabios) and apply the following patch:

--- seabios-1.7.5.orig/vgasrc/bochsvga.c
+++ seabios-1.7.5/vgasrc/bochsvga.c
@@ -99,6 +99,9 @@ static struct bochsvga_mode
     { 0x190, { MM_DIRECT, 1920, 1080, 16, 8, 16, SEG_GRAPH } },
     { 0x191, { MM_DIRECT, 1920, 1080, 24, 8, 16, SEG_GRAPH } },
     { 0x192, { MM_DIRECT, 1920, 1080, 32, 8, 16, SEG_GRAPH } },
+    { 0x193, { MM_DIRECT, 1600, 900,  16, 8, 16, SEG_GRAPH } },
+    { 0x194, { MM_DIRECT, 1600, 900,  24, 8, 16, SEG_GRAPH } },
+    { 0x195, { MM_DIRECT, 1600, 900,  32, 8, 16, SEG_GRAPH } },
 };
 
 static int dispi_found VAR16 = 0;

You probably also want to increment the package version to keep it from being overwritten the next time you do apt-get upgrade. Finally, recompile the package with dpkg-buildpackage and install it.

Now when you boot the guest you should see your new mode appear in the list of resolutions. There is no need to recompile or reinstall QEMU again.

Posted by Tomaž | Categories: Code | Comments »

Conveniently sharing screenshots

20.09.2015 20:44

When collaborating remotely or troubleshooting something, it's often useful to quickly share screenshots in a chat like IRC or instant messaging. There are cloudy services that do image sharing and provide convenient applications for it. However, if you have your own web server, it seems wasteful to ship screenshots overseas just to be downloaded a moment later by a colleague a few blocks away. Here is a simple setup that instead uses SSH to copy a screenshot to a folder on a web server and integrates nicely with a GNOME desktop.

First, you need a script that pushes files to a public_html folder (or something equivalent) on your web server and reports back the URL:

#!/bin/bash

set -e

# Enter your server's details here...
SSH_DEST="www.example.com:public_html/boxdrop"
URL_BASE="https://www.example.com/~user/boxdrop"

for i in "$@"; do
	base=`basename "$i"`
	abs=`realpath "$i"`

	chmod o+r "$abs"
	scp -pC "$abs" "$SSH_DEST"

	URL="$URL_BASE/$base"
	zenity --title "Shared image" --info --text "<a href=\"$URL\">$URL</a>"
done

I use Zenity to display the URL in a nice dialog box (apt-get install zenity). Also, don't forget to set up passwordless SSH login to your web server using your public key.

Name this script share and put it in some convenient folder. A good place is ~/.local/bin, which will also put it in your PATH on most systems (so you can also conveniently share arbitrary files from your terminal). Don't forget to mark the script executable.

You now have to associate this script with some image MIME types. Put a file named share.desktop into ~/.local/share/applications with the following contents. Don't forget to correct the full path to your share script.

[Desktop Entry]
Version=1.0
Type=Application
Name=Share image
Comment=Share an image in a public HTTP folder
Exec=/home/user/.local/bin/share %U
Terminal=false
MimeType=image/png;image/jpeg;

To check if this worked, open a file manager like Nautilus and right-click on some PNG file. It should now offer to open the file with Share image in addition to other applications usually used with images. If it doesn't work, you might need to run update-desktop-database ~/.local/share/applications.

Finally, install xfce4-screenshooter and associate it with the Print screen button on the keyboard. In GNOME, this can be done in the keyboard preferences:

Associating xfce4-screenshooter with Print screen key in GNOME.

Now, when you press Print screen and make a screenshot, you should have a Share image option under the Open with drop-down menu.

xfce4-screenshooter with "Share image" selected.

When you select it, a window with the URL should appear. You can copy the URL to the chat window, or click it to open it in your browser.

Zenity dialog with the URL of the shared screenshot.

Of course, if the folder you use on your web server is open to the public, so will be the screenshots you share. You might want to have the folder use a secret URL or set up basic HTTP authentication. In any case, it's good to apply some common sense when using your newly found screenshot sharing powers. It's also useful to have a cronjob (systemd timer?) on the server that cleans up the folder with shared images once in a while. This both prevents the web server's disk from filling up and stale images from lingering on the web for too long. Implementing that is left as an exercise for the reader.

Posted by Tomaž | Categories: Code | Comments »

Using Metex M-3860D from Python

08.08.2015 15:56

Metex M-3860D is a fairly old multimeter with a computer interface. It uses an RS-232 serial line, but requires a somewhat unusual setup. It does work with modern USB-to-serial converters; however, a serial terminal application like cutecom won't work with it.

Here is a Python recipe using pyserial (replace /dev/ttyUSB0 with the path to your serial device):

import serial

# 1200 baud, 7 data bits, 2 stop bits, no parity, no flow control.
comm = serial.Serial("/dev/ttyUSB0", 1200, 7, "N", 2, timeout=1.)

# DTR line must be set to high, RTS line must be set to low.
comm.setDTR(1)
comm.setRTS(0)

# The instrument sends back currently displayed measurement each time
# you send 'D' over the line.
comm.write('D')

# This returns 14 characters that show type of measurement, value and
# unit - for example "DC  0.000   V\n".
print comm.readline()

The correct setting of the DTR and RTS modem lines is important. If you don't set them as shown here, you won't get any reply from the instrument. This is not explicitly mentioned in the manual. It's likely that the computer interface in the multimeter is in fact powered from these status lines.

Posted by Tomaž | Categories: Code | Comments »

Improving privacy with Iceweasel

06.08.2015 19:35

For several years now Debian has been shipping with a rebranded Firefox browser called Iceweasel. As far as I understand the matter, the reason is Mozilla's trademark policy, which says that anything bearing the Firefox brand must be approved by them. Instead of dealing with the review process for each Debian-specific patch, the Debian maintainers chose to strip the branding from the browser. This is something Mozilla's code actually makes provisions for.

The Iceweasel releases from the Debian Mozilla Team closely track the upstream. In contrast to other Firefox forks, changes to Iceweasel generally only improve integration with the Debian system. Packages released with the stable releases might also contain backported security fixes (although this might change soon).

A problem with Iceweasel is that it also identifies itself as such to all websites it loads, through its User-Agent header. Iceweasel is quite a rarity among browsers, which makes its users easy to track across web pages. Currently EFF's Panopticlick shows that Iceweasel 39's User-Agent provides around 16 bits of identifying information. In contrast, a vanilla Firefox on Linux gives trackers around 11 bits. EFF's numbers change quite a lot over time though - I've seen 22 bits reported for Iceweasel a few weeks ago. In any case, if you have friends that like to watch Apache logs for fun, it's quite obvious when you're hitting their blogs if you are using Iceweasel.

In order to improve this situation I've created a very simple add-on for Iceweasel that removes the reference to Iceweasel from the User-Agent header. It consists only of a few lines of Javascript that run a regexp replace on the built-in User-Agent string and set the general.useragent.override preference. The User-Agent set this way should be identical to the vanilla Firefox of the same version running on Debian.

The extension is not on addons.mozilla.org, so you will have to fetch it from GitHub and build it yourself (git clone followed by make should do the trick).

Thawed Weasel extension for Iceweasel

How well does it work? Panopticlick shows the expected reduction of around 5 to 6 bits of identifying information. Which might not matter if the site is actively trying to fingerprint you and can run Javascript - system fonts and browser plugins are still very unique for a typical Debian desktop. But at least you don't stick out from access.log like a sore thumb.

You might ask why not just use one of the myriad existing User-Agent overrider add-ons. The trick is that I have not found any that would allow you to apply a search-and-replace regexp on the built-in User-Agent string. Without that, you either have to manually keep it up-to-date with the actual browser version, or risk sporting a unique, out-dated User-Agent string once everyone else's browser has auto-updated. I don't want my computer to have another itch that needs regular scratching.

A related argument against this add-on would be that providing an accurate User-Agent string is good etiquette on the web. It helps administrators with browser usage statistics and with debugging browser-specific problems on their websites. Considering that the idea of Iceweasel is to have minimal changes against the upstream Firefox, I think it is still within the boundaries of good behavior for it to present itself as Firefox. Whether this argument is valid or not is up for debate though. At the time of writing, Iceweasel 39.0-1~bpo70+1 has 36 patches applied against the upstream Firefox source, touching around 1800 lines of code.

Finally, of course, you can just install Mozilla's Linux build of Firefox on Debian. I'm sticking with Iceweasel because I prefer software managed through the package manager instead of dumping tarballs into /usr/local. Adding another distribution's repositories into Debian's /etc/apt/sources.list is just wrong though.

Posted by Tomaž | Categories: Code | Comments »

From F-Spot to Shotwell in Jessie

14.07.2015 20:24

F-Spot photo manager was dropped in Debian Jessie in favor of Shotwell. While Shotwell supports automatically importing data from F-Spot out-of-the box, the import is far from perfect. Here are some assorted notes from migrating my library. Note that this applies specifically to the version in Jessie (0.20.1).

  • Shotwell does not support storing multiple versions of a photo (F-Spot created a new version of a photo for each edit in GIMP, for instance). On import, each version is stored as a separate photo in Shotwell.

  • The tag hierarchy is correctly imported from F-Spot. However sometimes a tag is imported twice: once in its correct position in the hierarchy and once at the top level. This is because F-Spot at some point started embedding tags (without hierarchy data) inside image files themselves in the Xmp.dc.subject field. Shotwell import treats the F-Spot hierarchy and the embedded tags separately, resulting in duplication.

    One suggestion is to remove the embedded tags before import. However, this further modifies the files (which shouldn't have been modified in the first place, but you can't do anything about that now). Upon removing the field, the exiv2 tool also seems to corrupt vendor-specific fields in JPEG files (I don't think the "truncating the entry" warnings are as benign as this thread suggests).

    A better way is to ignore the embedded tags on import. Apply this patch, recompile the Shotwell Debian package and use the modified version to import data from F-Spot:

    --- shotwell-0.20.1.orig/src/photos/PhotoMetadata.vala	2014-03-04 23:54:12.000000000 +0100
    +++ shotwell-0.20.1/src/photos/PhotoMetadata.vala	2015-07-12 13:11:34.021751079 +0200
    @@ -857,7 +857,6 @@
         }
         
         private static string[] KEYWORD_TAGS = {
    -        "Xmp.dc.subject",
             "Iptc.Application2.Keywords"
         };
    

    After importing, you can revert back to the original version, shipped with Debian. The embedded tags are only read once on import.

  • Shotwell doesn't support GIF images. Any GIFs are skipped on import (and noted in the import log). As crazy as that sounds, I had some photos in my library from an old feature phone that saved photographs as GIF files. I converted them to PNG and manually re-imported them.

  • Time and date information is not imported from F-Spot at all. Shotwell reads only the data from the EXIF JPEG header. If you adjusted creation time in F-Spot, those modifications will be lost.

    Unfortunately, F-Spot itself seems to have had problems with storing this data in its database. It looks like the timestamps have been stored in different formats over time, without proper migrations between versions. Looking at my F-Spot library, various photos have +1, 0 and -1 hour offsets compared to what they probably should be.

    I gave up on trying to correct the mess in the F-Spot library. I made a Python script that copied timestamps directly from the F-Spot SQLite database to the Shotwell database, but only when the difference was more than 1 hour. The script is quite hacky and probably too dangerous for general consumption, so only a sketch of the idea is shown below. If you have similar issues and are not afraid of SQL, send me a mail and I'll share the real thing.
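
    The sketch illustrates the general approach. The table and column names are my assumptions about the two schemas - double-check them against your databases and back up both files before running anything like this:

    import sqlite3

    # Paths to the F-Spot and Shotwell databases (locations will vary).
    fspot = sqlite3.connect("photos.db")
    shotwell = sqlite3.connect("photo.db")

    for filename, time in fspot.execute("SELECT filename, time FROM photos"):
        row = shotwell.execute(
            "SELECT id, exposure_time FROM PhotoTable "
            "WHERE filename LIKE ?", ("%" + filename,)).fetchone()

        # Copy the timestamp only when it differs by more than an hour.
        if row is not None and abs(row[1] - time) > 3600:
            shotwell.execute(
                "UPDATE PhotoTable SET exposure_time = ? WHERE id = ?",
                (time, row[0]))

    shotwell.commit()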

In the end, Shotwell feels much more minimalist than F-Spot. Its user interface has its own set of quirks and leaves a lot to be desired, particularly regarding tag management. On the other hand, I have yet to see it crash, which was a common occurrence with F-Spot.

Posted by Tomaž | Categories: Code | Comments »

Python PublicSuffix release 1.1.0

10.07.2015 16:46

The publicsuffix Python module is one of the most widely used pieces of open source software that resulted from my years at Zemanta. With around 1000 daily downloads from the package index, it's probably second only to my Python port of Unidecode. I've retained ownership of these packages even though I've moved on to other things. Being mature pieces of specialized code, they are pretty low-maintenance, with typically one minor release per year and an occasional email conversation.

The publicsuffix 1.1.0 release had some interesting reasons behind it, and I think it's worth discussing them a bit here.

Since the start, publicsuffix has shipped with a copy of Mozilla's Public Suffix List as part of the Python package. My initial intention was to make the module as easy to use as possible. As the example in the README showed, if you constructed the PublicSuffixList object without any arguments, it just worked, using the built-in data. The object supported loading arbitrary files, but this was only mentioned in the docstring. My reasoning at the time was that the Public Suffix List was relatively slow-moving. For quick hacks it wouldn't matter if you used slightly out-dated data, and serious users could pass in the exact version of the list they needed.

Later I started getting reports that other software was using publicsuffix under the assumption that the built-in data is always up-to-date. A recent example reported by Philippe Ombredanne was url-py. Some blame definitely lies with the authors of such software not reading the documentation. My README did have a warning about the default list being possibly out-dated, although that fact was probably not stressed enough. However, after reading complaints over time, I felt that perhaps publicsuffix could offer an interface that was less prone to errors like that.

I looked for ways to improve things by studying the quite numerous publicsuffix forks on GitHub and Public Suffix List implementations in other languages.

The publicsuffix package with a built-in copy could be updated each time Mozilla updates the List. This would make the package decidedly less low-maintenance for me. It would also require that every piece of software that uses publicsuffix regularly re-install its dependencies and restart to refresh the data used by PublicSuffixList. From my experience with deploying server-side Python software, this is quite an unreasonable requirement.

Another approach would be to fetch the list automatically from Mozilla's servers at run-time. However, I don't like doing HTTP requests without this being explicitly visible to the developer. Such things have led to unintentional storms of traffic to unsuspecting sites. Also, for some uses, the latest version of the list from Mozilla's servers might not even be what they need (consider someone processing historical data).

Finally, I could simply say it's the user's responsibility to supply the algorithm with data and not provide any default. While this requires a bit more code, it also hopefully gives the developer pause to think about which version of the List is relevant to them.

The last of these is the approach I went with in the end. I deprecated the argument-less PublicSuffixList constructor with a DeprecationWarning. I also added a convenience fetch() function that downloads the latest List from the web, the result of which can be passed directly to the constructor. This makes the request explicit (and explicit is better than implicit, after all, according to the Zen of Python) while being only a little less convenient. I opted against implementing any kind of caching in publicsuffix since I think that is something specific to each application.
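
The new interface looks roughly like this. Note that fetch() makes an HTTP request each time it is called - caching, if needed, is up to the application:

from publicsuffix import PublicSuffixList, fetch

# Explicitly download the latest version of the Public Suffix List.
psl_file = fetch()
psl = PublicSuffixList(psl_file)

# Prints 'example.com'
print psl.get_public_suffix('www.example.com')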

In conclusion: you can't force people to carefully read the documentation, and people will blame you for bugs in other people's software (incidentally, this release also works around a packaging problem due to a long-standing bug in Wheel). It's worth thinking about defaults and carefully picking the examples you show in the README.

Posted by Tomaž | Categories: Code | Comments »

Exponential backup rotation, part 2

27.06.2015 17:45

I've written previously about Chris Forno's log2rotate algorithm. After using my Python implementation for a while, one problem became apparent: the algorithm is quite bad at handling missing backups.

Missing backups happen in practice. For instance, a host might be down or without network connectivity at the time the daily backup Cron job is supposed to run. So every once in a while, a daily backup might be skipped. Unfortunately, log2rotate makes a strong assumption that new backups always come in a regular time series. The worst case happens if a backup that is selected by the algorithm for long-term storage is missing. This can double the age of restored files after a data loss.

Chris' original implementation does a simple check: if an expected backup is not there, it simply throws a fatal error. You can ignore the error with the --unsafe switch, but then it makes no promises as to what the resulting backup series will be.

My initial Python port of the algorithm basically mimicked this behavior (although with a somewhat smarter check that avoided some false alarms). Now I've added another step to the part of the algorithm that selects which backups to keep. I still calculate the pattern of retained backups using Chris' algorithm, but after that I search for a backup that actually exists within some limited time interval around each ideal date.
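
The extra step looks roughly like the following sketch. This is a simplification of the actual implementation, and the size of the search window here is arbitrary:

from datetime import date, timedelta

def fuzzy_select(wanted, existing, max_fuzz=timedelta(days=2)):
    # For each date the log2rotate pattern wants to keep, fall back to
    # the nearest backup that actually exists within the time window.
    selected = set()
    for want in wanted:
        candidates = [d for d in existing if abs(d - want) <= max_fuzz]
        if candidates:
            selected.add(min(candidates, key=lambda d: abs(d - want)))
    return selected

# Daily backups with day 32 missing: day 31 is retained in its place.
existing = [date(2016, 1, 1) + timedelta(n) for n in range(100) if n != 32]
wanted = [date(2016, 1, 1) + timedelta(n) for n in (0, 32, 64, 96)]
print fuzzy_select(wanted, existing)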

Most recent backup age versus time of data loss.

The graph above demonstrates this improvement. It shows the age of the most recent backup in the case where data loss occurred at some point in the past. On the horizontal axis, the present day is day 100 and day 0 is the day the oldest backup was made.

The black dashed line shows the ideal case - using the log2rotate algorithm with no missing backups. For example, if we wanted to restore a file that we know was lost on day 60 (40 days ago), the age of the most recent retained backup that still contains the file (created on day 32) is 28 days.

The blue line shows the worst case for a single missing backup using the original algorithm, but overriding errors with --unsafe. For restoring a file that was lost on day 60, the worst case is if backup on day 32 is missing. In that case, the next most recent backup is from day 0, which is 60 days old.

Finally, the red line shows the situation with the fuzzy matching algorithm. In this case, if backup on day 32 is missing, the algorithm retains the backup on day 31. So when restoring data lost on day 60, the most recent backup is only one day older compared to when no backups are missing.


The updated code is on GitHub. Documentation is somewhat lacking at the moment, but the algorithm is pretty well tested (backups are always a bit of a sensitive thing). If I see some interest, I might clean it up and push it to PyPI.

Posted by Tomaž | Categories: Code | Comments »

Identifying USB serial devices

19.06.2015 20:04

Many devices use a simulated serial line to talk over the USB port. This is by far the simplest way of adding USB connectivity to an embedded system - either by adding a dedicated RS-232-to-USB hardware converter or by implementing the USB CDC ACM protocol on a microcontroller with a USB peripheral. For instance, Arduino boards use one of these approaches depending on the model.

On Linux systems, such devices typically get device files like /dev/ttyUSBx or /dev/ttyACMx associated with them. For example, if you want to talk to a device using pySerial, you have to provide the path to such a device file to the library.

A common problem is how to match such a device file to a device that has been plugged into a USB port. My laptop, for instance, already has /dev/ttyACM0 through /dev/ttyACM2 on boot with nothing connected externally. USB devices conveniently identify themselves using bus addresses, product and vendor identifiers, serial numbers and so on. However, these back-end identifiers are not accessible in any way through the /dev/tty* device files, which are merely front ends to the Linux kernel terminal subsystem.

Surprisingly, this appears to be quite a complicated task to do programmatically. For instance, in Python, PyUSB is of no assistance in this matter as far as I can see. Browsing the web, you can find plenty of recipes that all appear overly convoluted (running a regexp over dmesg wins in this respect). You can crawl through sysfs, but this seems like a brittle solution.

The best (as in, simplest to implement and reasonably future-proof) solution I found so far is to ask udev. udev takes care of various tasks related to hot-plugging devices and hence knows about many aspects of how a physical device is registered in the kernel. It also seems to be pretty universal these days, at least on Linux. So I guess it's not a terribly bad dependency to add. pyudev (apt-get install python-pyudev on Debian) provides a nice Python interface. udevadm gets you much of the same information from a command line.

For instance, to list all terminal devices connected over the USB bus and pretty print their properties:

import pyudev
import pprint

context = pyudev.Context()

for device in context.list_devices(subsystem='tty', ID_BUS='usb'):
	pprint.pprint(dict(device))

The device file path is in the DEVNAME property, as you can see below. Other identifiers from the USB bus are available in other properties. You can use them to find the specific device you need (list_devices() arguments work as a filter):

{u'DEVLINKS': u'/dev/serial/by-id/usb-Olimex_Olimex_OpenOCD_JTAG-if01-port0 /dev/serial/by-path/pci-0000:00:1d.0-usb-0:1.1:1.1-port0',
 u'DEVNAME': u'/dev/ttyUSB1',
 u'DEVPATH': u'/devices/pci0000:00/0000:00:1d.0/usb4/4-1/4-1.1/4-1.1:1.1/ttyUSB1/tty/ttyUSB1',
 u'ID_BUS': u'usb',
 u'ID_MM_CANDIDATE': u'1',
 u'ID_MODEL': u'Olimex_OpenOCD_JTAG',
 u'ID_MODEL_ENC': u'Olimex\\x20OpenOCD\\x20JTAG',
 u'ID_MODEL_FROM_DATABASE': u'OpenOCD JTAG',
 u'ID_MODEL_ID': u'0003',
 u'ID_PATH': u'pci-0000:00:1d.0-usb-0:1.1:1.1',
 u'ID_PATH_TAG': u'pci-0000_00_1d_0-usb-0_1_1_1_1',
 u'ID_REVISION': u'0500',
 u'ID_SERIAL': u'Olimex_Olimex_OpenOCD_JTAG',
 u'ID_TYPE': u'generic',
 u'ID_USB_DRIVER': u'ftdi_sio',
 u'ID_USB_INTERFACES': u':ffffff:',
 u'ID_USB_INTERFACE_NUM': u'01',
 u'ID_VENDOR': u'Olimex',
 u'ID_VENDOR_ENC': u'Olimex',
 u'ID_VENDOR_FROM_DATABASE': u'Olimex Ltd.',
 u'ID_VENDOR_ID': u'15ba',
 u'MAJOR': u'188',
 u'MINOR': u'1',
 u'SUBSYSTEM': u'tty',
 u'UDEV_LOG': u'3',
 u'USEC_INITIALIZED': u'220805300741'}
{u'DEVLINKS': u'/dev/serial/by-id/usb-STMicroelectronics_VESNA_SpectrumWars_radio_32A155593935-if00 /dev/serial/by-path/pci-0000:00:1d.0-usb-0:1.2:1.0',
 u'DEVNAME': u'/dev/ttyACM3',
 u'DEVPATH': u'/devices/pci0000:00/0000:00:1d.0/usb4/4-1/4-1.2/4-1.2:1.0/tty/ttyACM3',
 u'ID_BUS': u'usb',
 u'ID_MM_CANDIDATE': u'1',
 u'ID_MODEL': u'VESNA_SpectrumWars_radio',
 u'ID_MODEL_ENC': u'VESNA\\x20SpectrumWars\\x20radio',
 u'ID_MODEL_ID': u'5740',
 u'ID_PATH': u'pci-0000:00:1d.0-usb-0:1.2:1.0',
 u'ID_PATH_TAG': u'pci-0000_00_1d_0-usb-0_1_2_1_0',
 u'ID_REVISION': u'0200',
 u'ID_SERIAL': u'STMicroelectronics_VESNA_SpectrumWars_radio_32A155593935',
 u'ID_SERIAL_SHORT': u'32A155593935',
 u'ID_TYPE': u'generic',
 u'ID_USB_DRIVER': u'cdc_acm',
 u'ID_USB_INTERFACES': u':020201:0a0000:',
 u'ID_USB_INTERFACE_NUM': u'00',
 u'ID_VENDOR': u'STMicroelectronics',
 u'ID_VENDOR_ENC': u'STMicroelectronics',
 u'ID_VENDOR_FROM_DATABASE': u'SGS Thomson Microelectronics',
 u'ID_VENDOR_ID': u'0483',
 u'MAJOR': u'166',
 u'MINOR': u'3',
 u'SUBSYSTEM': u'tty',
 u'UDEV_LOG': u'3',
 u'USEC_INITIALIZED': u'245185375947'}
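
For example, to talk to the second device from the listing above using pySerial, something like this should work. The serial number is the one from the listing and the baud rate is an arbitrary example - substitute your own:

import pyudev
import serial

context = pyudev.Context()

# Property names given as keyword arguments act as filters.
devices = list(context.list_devices(subsystem='tty', ID_BUS='usb',
                                    ID_SERIAL_SHORT='32A155593935'))

if devices:
    comm = serial.Serial(devices[0]['DEVNAME'], 115200, timeout=1.)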

If you intend to use a device on a certain host in a more permanent fashion, a better solution is to write a custom udev rule for it. A custom rule also allows you to set a custom name, permissions and other nice things. But the above seems to work quite nicely for software that requires a minimal amount of setup.

Posted by Tomaž | Categories: Code | Comments »

OpenCT on Debian Jessie

03.06.2015 17:12

I use a Schlumberger Cryptoflex (e-Gate) USB key to store some client-side SSL certificates. This is an obsolete device that has been increasingly painful to keep working with each new Debian release. I was especially worried when upgrading my desktop to Debian Jessie since the openct package was removed in this release cycle.

The old openct binary package from Wheezy is still installable in Jessie. In fact, on my system it was left installed (but marked as obsolete) by the upgrade process. The problem was that the USB key was very unreliable. In most cases, it errored out with some random-looking error:

$ pkcs15-tool -D
Using reader with a card: Axalto/Schlumberger/Gemalo egate token 00 00
Failed to connect to card: Generic reader error
$ pkcs15-tool -D
Using reader with a card: Axalto/Schlumberger/Gemalo egate token 00 00
Failed to connect to card: Unresponsive card (correctly inserted?) 

But on a few occasions it would actually work correctly:

$ pkcs15-tool -D
Using reader with a card: Axalto/Schlumberger/Gemalo egate token 00 00
PKCS#15 Card [OpenSC Card]:
	Version        : 0
	Serial number  : ...

After much digging through the internals of the smart card framework, I found out that the e-Gate is only supported by the pcscd daemon through the openct driver (which is just one of the many drivers you can install in Debian). OpenCT starts its own daemon, called ifdhandler, for each connected USB key. This daemon communicates over a socket with the rest of the framework and must run as long as the USB key is inserted. The main problem on Jessie is that this daemon gets killed at some random time by the new systemd-udevd. Somewhere along the pipeline, something reacts badly to the ifdhandler disappearing, hence the random error messages.

A simple workaround is to start the daemon by hand after inserting the USB key:

$ sudo /usr/sbin/ifdhandler -F -H -dddddd -p egate usb /dev/bus/usb/002/104

A better solution is to fix the udev configuration installed by the openct package so that the daemon doesn't get killed. Fortunately, the solution to this was already found by Mike Kazantsev, who has a nice write-up on the topic. Of course, I only found his blog post after spending several hours tracking down the root of the problem.

I've prepared an updated set of openct packages that can be installed on Jessie and have this problem fixed, at least for my specific USB key model. I didn't fix several other packaging problems reported by lintian. The binaries for amd64 can be found here. The updated source is also in a git repository (use git-buildpackage):

$ git clone http://gildedale.tablix.org/~avian/git/openct.git

Other smart card-related packages work for me as-shipped in Jessie (e.g. opensc 0.14.0-2 and pcscd 1.8.13-1). Also note that my new packages depend on systemd. It is possible that the old Wheezy openct package still works correctly if you prevent the upgrade from migrating your init to systemd.

Other than that, my old instructions on how to set things up on Wheezy still work. One peculiarity I noticed though is that you need to insert the USB key before starting Firefox/Iceweasel. Otherwise the browser will not pick up any client certificate from the key.

Posted by Tomaž | Categories: Code | Comments »

Notes on M&M clock recovery

12.03.2015 8:48

M&M (Mueller and Müller) clock recovery is the name of one of the signal processing blocks in GNU Radio. Its task is to recover samples from a signal with the same frequency and phase as those used by the transmitter. This is necessary, for instance, when you want to extract symbols from an asynchronous digital signal. It allows you to synchronize your receiver with the centers of the ones and zeros present in the signal.

Judging by what you find when searching the internet, the way this block works is mostly regarded as black magic by users of GNU Radio. There's very little concrete information on how it works and, more importantly, how to make it work in your particular use case, since the block takes several parameters. The best I could find were some guesses and suggestions on which values to try.

This is my attempt to clarify things a bit. I don't claim to completely understand the algorithm though. What's written below is my current understanding, which comes foremost from studying the source. The academic paper cited by the GNU Radio source points you to a trail of citations that leads all the way to the original 1976 paper describing the method. I have only read a small part of this literature. As it usually happens, it turned out to be a quite poor substitute for proper software documentation.

M&M Clock Recovery block from GNU Radio

The meat of the implementation is the following loop in the general_work() method in clock_recovery_mm_ff_impl.cc. Note that I'm talking about the floating point implementation. There is also a complex number implementation, which should work in a very similar fashion, but I haven't looked into it.

while(oo < noutput_items && ii < ni ) {
  // produce output sample
  out[oo] = d_interp->interpolate(&in[ii], d_mu);
  mm_val = slice(d_last_sample) * out[oo] - slice(out[oo]) * d_last_sample;
  d_last_sample = out[oo];

  d_omega = d_omega + d_gain_omega * mm_val;
  d_omega = d_omega_mid + gr::branchless_clip(d_omega-d_omega_mid, d_omega_lim);
  d_mu = d_mu + d_omega + d_gain_mu * mm_val;

  ii += (int)floor(d_mu);
  d_mu = d_mu - floor(d_mu);
  oo++;
}

First of all, this code immediately shows some general characteristics of the algorithm, as it is implemented in GNU Radio. All this might seem very obvious to some, but none of it is clearly stated anywhere in the documentation:

  • This is a decimation block. It will output one sample per one or more input samples. Specifically, it will output a sample that it thinks is the center of the symbol. This can be the first source of confusion, since nothing in the name of the block suggests decimation.

  • The decimation rate is not constant. It varies with what the block thinks the current symbol clock is. This is important in context of GNU Radio since many things start behaving weirdly with streams that do not have a constant sample rate (like for instance the Scope instrument with multiple inputs).

  • The signal must have zero mean. The algorithm works by observing sign changes on the signal (the slice() method is basically a signum function).

  • There is some sample interpolation involved. d_interp->interpolate() is a FIR interpolator with 8 taps. This means that output can contain values not strictly present in the input.

The code also gives some insights into the meanings of the somewhat cryptically named parameters:

  • Omega is symbol period in samples per symbol. In other words, it is the inverse of the symbol rate or clock frequency. The value given as the block parameter is the initial guess.

  • The limits within which the omega can change during run time depend on the initial omega and the relative limit parameter. For instance, with omega = 10 and omega_relative_limit = 0.1, the block will allow the omega to change between 9 and 11. Note that until version 3.7.5.1, relative limit was actually absolute due to a bug.

  • Mu is the detected phase shift in samples between the receiver and the transmitter. The initial value given as the block parameter doesn't matter in practice, since it's usually impossible to guess the phase in advance. Internally, the phase is tracked between 0 and omega (corresponding to 0 and 2π). However, because of the way the code is implemented, you can only set the fractional part of the initial value (i.e. up to 1.0).

  • Mu gain and omega gain are gains in two feedback loops that adjust frequency and phase during run time. Their effect varies with the amplitude of the input signal. Values that are optimal for a signal that goes from -1 to +1 will not work for an identical signal that goes from -10 to +10.

Based on the code above, this is how the M&M clock recovery block samples the input signal. You can imagine that the sine wave in the picture below represents a digital signal consisting of a series of binary ones (>0) and zeros (<0) that has been passed through a filter. Mu is the distance between the first sample and time zero. Omega is the distance between two consecutive samples:

Illustration of M&M clock recovery algorithm variables.

Finally, if you look at how the code adjusts the omega and mu during run time (through the mm_val value), you can see that:

  • The feedback loop doesn't work on square signals. The method depends on the fact that the center of the symbol is at the crest. With a square signal there is no difference between the edge of a symbol and its center, and hence the feedback loop has no means of positioning the sampling point.

  • The algorithm will not track phase on perfectly symmetrical input. Consider the sine input above: if the frequency (omega) is perfectly adjusted, the error term will always be zero, regardless of the sampling phase (mu). This is important for instance when constructing artificial test cases for debugging, but it might also cause problems for cases where you want to synchronize on a periodic header (e.g. many packet-based transmissions prefix data with such a synchronization header).

In conclusion, this block is tricky to get right. In fact, after a lot of experimenting, I have failed to make it consistently work better than a straightforward blind decimator. Note that with an accurate enough estimate of the symbol rate and a low-pass filter in front that matches the symbol rate, an ordinary non-synchronized decimator will do reasonably well. For instance, in short packet-based transmissions you might lose some packets if they arrive with a particularly unfavorable phase difference, but overall the results will look passable.
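
For reference, such a blind decimator might look something like the following sketch in GNU Radio 3.7's Python API. The file names, sample rate and symbol rate are placeholders:

from gnuradio import gr, blocks, filter

class blind_decimator(gr.top_block):
    def __init__(self, samp_rate=32000, symbol_rate=1000):
        gr.top_block.__init__(self)

        sps = int(samp_rate / symbol_rate)

        src = blocks.file_source(gr.sizeof_float, "baseband.dat")

        # Low-pass filter roughly matched to the symbol rate.
        taps = filter.firdes.low_pass(1.0, samp_rate,
                                      0.5 * symbol_rate,
                                      0.2 * symbol_rate)
        lpf = filter.fir_filter_fff(1, taps)

        # No synchronization: blindly keep every sps-th sample.
        dec = blocks.keep_one_in_n(gr.sizeof_float, sps)

        sink = blocks.file_sink(gr.sizeof_float, "symbols.dat")

        self.connect(src, lpf, dec, sink)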

I suspect many reports of successful usage of the M&M clock recovery block are due to this fact and not because the algorithm worked particularly well. In my experience, a heuristic approach to clock recovery works better (for instance, like the one implemented in the capture process in my am433 and ec3k receivers). While less scientific and more computationally complex, its performance is more consistent and hence easier to debug and write unit tests against.

Finally, I also suspect the algorithm in GNU Radio is plagued by some practical implementation problems. It looks very much like the interpolator doesn't work as it should (there has been no answer yet to my question on discuss-gnuradio regarding that). While I haven't investigated it thoroughly, I also think that the code might lose phase tracking between consecutive calls to the general_work() method (only the fractional part of the phase is retained between calls). This might happen more or less often depending on how the block is placed in the flow graph and might cause intermittent desynchronizations.

Posted by Tomaž | Categories: Code | Comments »

Comment spam and HTTPS

09.03.2015 21:15

Moving this site to HTTPS-only back in November had one unexpected side effect: the amount of attempted comment spam fell significantly.

The graph below shows the number of submitted comments on this blog that were caught by different spam filters over time. The majority of submissions are blocked by the simple textual captcha, shown by the orange line. The sharp drop on the graph corresponds exactly with the date I reconfigured the site to be served over HTTPS.

Blog comment submissions versus time.

Immediately after I noticed this change I had two theories: either most spam bots do not implement the HTTP 301 redirects that I send on the old HTTP URLs, or they in fact skip HTTPS sites. If the first theory were correct, I would expect the spam to quickly ramp up again as bots rediscover my blog at the new HTTPS URL through crawlers or whatever other process they use to find victims. However, as you can see, the spam rate shows no sign of increasing after four months.

I don't think lack of support for encryption could be the reason. I'm pretty sure all reasonably modern HTTP libraries transparently support HTTPS as well. My server is also set up in a relatively backwards-compatible fashion as far as SSL ciphers are concerned. It is probably more likely that secure sites are simply not considered a worthy target at the moment. Not that I'm complaining.

Anyway, if you run a website, this might be another reason you might want to switch to HTTPS-only.

Posted by Tomaž | Categories: Code | Comments »

Working around broken Boost.Thread

21.02.2015 12:42

Debian Wheezy ships with a broken boost/thread.hpp header. Specifically, there's a conflict between the TIME_UTC preprocessor macro that was introduced in C11 and an enum type defined by Boost 1.49 (bug 701377). This results in somewhat cryptic compile errors for all C++ files that happen to include this header:

/usr/include/boost/thread/xtime.hpp:23:5: error: expected identifier before numeric constant
/usr/include/boost/thread/xtime.hpp:23:5: error: expected ‘}’ before numeric constant
/usr/include/boost/thread/xtime.hpp:23:5: error: expected unqualified-id before numeric constant
/usr/include/boost/thread/xtime.hpp:46:14: error: expected type-specifier before ‘system_time’

I first encountered this when compiling GNU Radio from source, but the problem is not specific to that project. The obvious solution would be to upgrade Boost to a newer version, but this does not appear to be possible on Wheezy without upgrading many other packages. Some web searches reveal that the general consensus regarding the solution seems to be to open the /usr/include/.../xtime.hpp header in a text editor and change the offending enum.

This is bad advice. Straightforward editing of files that are managed by the package manager is never a good idea. Your fix will be silently overwritten by any minor Boost package update. Also, when debugging any kind of problem, nobody expects that files shipped in a package have been modified locally. Among other things, this can lead to hard-to-solve bugs that are not reproducible on other, seemingly identical systems.

The correct way to solve this problem (at least until Debian ships a fixed Boost library) is to work around it in the source you are compiling. For instance, including the following at the top of every C++ file that includes boost/thread.hpp removes the conflicting macro:

#include <time.h>
#undef TIME_UTC

The trick is to have this before any other #includes. You want it to be the first time time.h is included in the compilation unit. Since time.h is protected against multiple includes, TIME_UTC won't get redefined later on when it is included for the second time through Boost headers. Of course, then the TIME_UTC macro isn't available anymore, but so far I haven't seen any code that would use it.

I've also made a patch for GNU Radio 3.7.5 that applies this workaround in all necessary places.

Posted by Tomaž | Categories: Code | Comments »