Extending GIMP with Python talk

22.09.2018 18:47

This Thursday I presented a talk on writing GIMP plug-ins in Python at the monthly Python Meetup in Ljubljana. I gave a brief guide on how to get started and shared some tips that I would have found useful when I first started playing with GIMP. I didn't go into the details of the various functions but tried to keep it high-level, showing most of the things live in the Python REPL window and demoing some of the plug-ins I've written.

I wanted to give this talk for two reasons: the first was that I struggled to find such a high-level and up-to-date introduction to writing plug-ins myself. I thought it would be useful for anyone else diving into GIMP's Python interface. I tried to prepare the slides so that they can serve as a useful document on their own (you can find the slides here, and the example plug-in on GitHub). The slides also include some useful links for further reading.

Annotated "Hello, World!" GIMP plug-in.

The second reason was that I was growing unhappy with the topics presented at these meetups. It seemed to me that recently most of them revolved around the web and Python at the workplace. I can understand why this is so. The event needs sponsors and they want to show potential new hires how awesome it is to work with them. The web is eating the world, Python meetups or otherwise, and it all leads to a kind of monotonous series of events. So it's been a while since I've heard a fun subject discussed and I thought I might stir things up a bit.

To be honest, initially I was quite unsure whether this was a good venue for the talk. I discussed the idea with a few friends who are also regular meetup visitors and their comments encouraged me to go on with it. Afterwards I was immensely surprised at the positive response. I had prepared a short tour of GIMP for the start of the talk, thinking that most people would not be familiar with it. It turned out that almost everyone in the audience used GIMP, so I skipped it entirely. I was really happy to hear that people found the topic interesting and it reminded me what a pleasure it is to give a well-received talk.

Posted by Tomaž | Categories: Life | Comments »

Getting ALSA sound levels from a command-line

01.09.2018 14:33

Sometimes I need to get an idea of the signal levels coming from an ALSA capture device, like the line-in on a sound card, and I only have an ssh session available. For example, I've recently been exploring a case of radio-frequency interference causing audible noise in the audio circuit of an embedded device. It's straightforward to load raw samples into a Jupyter notebook or Audacity and run some quick analysis (or simply listen to the recording on headphones). But what I've really been missing is a simple command-line tool that would show me some RMS numbers. That way I wouldn't need to transfer potentially large wav files around quite so often.

I felt like a command-line VU meter was something that should already exist, but finding anything like that on Google proved elusive. Over time I've ended up with a bunch of custom half-finished Python scripts. However, Python is slow, especially on low-powered ARM devices, so I was considering writing a more optimized version in C. Luckily, I've recently come across this recipe for sox which does exactly what I want. Sox is easily apt-gettable on all Debian flavors I commonly care about and even on slow CPUs the following doesn't take noticeably longer than it takes to record the data:

$ sox -q -t alsa -d -n stats trim 0 5
             Overall     Left      Right
DC offset  -0.000059 -0.000059 -0.000046
Min level  -0.333710 -0.268066 -0.333710
Max level   0.273834  0.271820  0.273834
Pk lev dB      -9.53    -11.31     -9.53
RMS lev dB    -25.87    -26.02    -25.73
RMS Pk dB     -20.77    -20.77    -21.09
RMS Tr dB     -32.44    -32.28    -32.44
Crest factor       -      5.44      6.45
Flat factor     0.00      0.00      0.00
Pk count           2         2         2
Bit-depth      15/16     15/16     15/16
Num samples     242k
Length s       5.035
Scale max   1.000000
Window s       0.050

This captures 5 seconds of audio (trim 0 5) from the default ALSA device (-t alsa -d), runs it through the stats filter and discards the samples without saving them to a file (-n). The stats filter calculates some useful statistics and dumps them to standard error. A different capture device can be selected through the AUDIODEV environment variable.

The displayed values are pretty self-explanatory: min and max levels show the extremes of the observed samples, scaled to ±1 (so they are independent of the number of bits in the ADC samples). Root-mean-square (RMS) statistics are scaled so that 0 dB is the full scale of the ADC. In addition to the overall mean RMS level you also get peak and trough values measured over a short sliding window (the length of this window is configurable and is shown on the last line of the output). This gives you some idea of the dynamics in the signal as well as the overall volume. Descriptions of the other fields can be found in the sox(1) man page.
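If you want the numbers in a script rather than on a screen, the sox report is also easy to parse. The following is a rough sketch along the lines of those half-finished Python scripts mentioned above; the device name and duration are just example values:

import os
import re
import subprocess

def rms_level_db(device="hw:0", seconds=5):
    """Record from an ALSA device with sox and return the overall RMS level in dB."""
    env = dict(os.environ, AUDIODEV=device)
    # The stats filter writes its report to standard error.
    proc = subprocess.run(
        ["sox", "-q", "-t", "alsa", "-d", "-n", "stats",
         "trim", "0", str(seconds)],
        env=env, stderr=subprocess.PIPE, universal_newlines=True)
    match = re.search(r"RMS lev dB\s+(-?\d+\.\d+)", proc.stderr)
    return float(match.group(1)) if match else None

if __name__ == "__main__":
    print(rms_level_db())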

Posted by Tomaž | Categories: Code | Comments »

Accuracy of a cheap USB multimeter

12.08.2018 11:08

Some years ago I bought this cheap USB gadget off Amazon. You connect it in-line with a USB-A device and it shows the voltage and current on the power lines of the USB bus. There were quite a few slightly different models available under various brands. Apart from Muker, mine doesn't have any specific model name visible (a similar product is currently listed as Muker V21 on Amazon). I chose this particular model because it looked like it had a nicer display than the competition. It also shows the time integral of the current to give an estimate of the charge transferred.

Muker V21 USB multimeter.

Given that it cost less than 10 EUR when I bought it, I didn't have high hopes as to its accuracy. I found it useful for quickly checking for signs of life in a USB power supply, and for checking whether a smartphone switches to fast charge or not. However, I wouldn't want to use it for any kind of measurement where accuracy is important. Or would I? I kept wondering how accurate it really is.

For comparison I took a Voltcraft VC220 multimeter. True, this is not a particularly high-precision instrument either, but it does come with a list of measurement tolerances in its manual. At least I can check whether the Muker's readings fall within the error range of the VC220.

My first test measured open-circuit voltage. Muker's input port was connected to a lab power supply and the output port was left unconnected. I recorded the voltage displayed by the VC220 and the Muker for a range of set values (the VC220 was measuring the voltage on the input port). Below approximately 3.3 V the Muker's display turned off, so that seems to be the lowest voltage it can measure. On the graph below you can see the comparison. Error bars show the VC220's measurement tolerance for the 20 V DC range I used in the experiment.

Comparison of voltage measurements.

In the second test I connected an electronic DC load to Muker's output port and varied the load current. I recorded the current displayed by the VC220 (on the output port) and Muker. Again, error bars on the graph below show the maximum VC220 error as per its manual.

Comparison of current measurements.

Finally, I was interested in how much resistance the Muker adds to the supply line. I noticed that some devices will not fast charge when the Muker is connected in-line, but will do so when connected directly to the charger. The USB data lines seem to pass directly through the device, so USB DCP detection shouldn't be the cause of that.

I connected Muker with a set of USB-A test leads to a lab power supply and a DC load. The total cable length was approximately 90 cm. With 2.0 A of current flowing through the setup I measured the voltages on both sides of the leads. I did the measurement with the Muker connected in the middle of the leads and with just the two leads connected together:

                            With Muker   Leads only
Current [A]                        2.0          2.0
Voltage, supply side [V]          5.25         5.25
Voltage, load side [V]            3.61         4.15
Total resistance [mΩ]              820          550
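The total resistance row is simply Ohm's law applied to the voltage drop across the leads; a quick sanity check of the numbers:

i = 2.0                    # A, current set on the DC load
print((5.25 - 3.61) / i)   # with Muker in line: 0.82 ohm
print((5.25 - 4.15) / i)   # leads only:         0.55 ohm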

Muker V21 connected to USB test leads.

In summary, the current displayed by the Muker device seems to be reasonably correct. Measurements fell well within the tolerances of the VC220 multimeter, with the two instruments deviating by less than 20 mA over most of the range. Only at 1.8 A and above did the difference start to increase. Voltage readings seem much less accurate and the Muker's measurements appeared too high. However, they did fall within the measurement tolerances between 4.2 and 5.1 V.

Regarding the increased resistance of the power supply line, it seems that the Muker adds about 270 mΩ. I suspect a significant part of that is the contact resistance of the one additional USB-A connector pair in the line. The values I measured differed quite a lot if I wiggled the connectors.

Posted by Tomaž | Categories: Analog | Comments »

Investigating the timer on a Hoover washing machine

20.07.2018 17:47

One day around last Christmas I stepped into the bathroom and found myself standing in water. The 20-odd-year-old Candy Alise washing machine had finally broken a seal. For some time I had already suspected that it wasn't rinsing the detergent out of the clothes well enough, so it was an easy decision to just scrap it. I spent the holidays shopping for a new machine, taking apart and reassembling the bathroom furniture that was in the way, and fixing various collateral damage on the plumbing. A couple of weeks later, and thanks to copious amounts of help from my family, I was finally able to do laundry again without flooding the apartment in the process.

I bought a Hoover WDXOC4 465 AC combined washer-dryer. My choice was mostly based on the fact that a local shop had it in stock and that its smaller depth compared to the old Candy meant a bit more floor space in the small bathroom. Even though I was trying to avoid it, I ended up with a machine with capacitive touch buttons, an NFC interface and a smartphone app.

Front panel on a Hoover WDXOC4 465 AC washer.

After half a year the machine works reasonably well. I gave up on Hoover's Android app the moment it requested access to my phone's address book, but thankfully all the features I care about can be used through the front panel. Comments I saw on the web claiming that the dryer doesn't work have so far turned out to be unfounded. My only complaint is that on 3 or 4 occasions the washer didn't do a spin cycle when it should have. I suspect that the quality of the embedded software is less than stellar. Sometimes I hear the inlet valve or the drum motor momentarily stop and engage again, and I wonder if that's due to a bug or a controller reset, or if they are meant to work that way.

Anyway, like most similar machines, this one displays the remaining time until the end of the currently running program. There's a 7-segment display on the front panel that counts down hours and minutes. One of the weirder things I noticed is that the countdown seems to use Microsoft minutes. I would look at the display to see how many minutes were left, and when I looked again later it would seem to be showing more time remaining, not less. I was curious how that works and whether it was just my bad memory or whether the machine was really showing bogus values, so I decided to do an experiment.

I took a similar approach to my investigation of the water heater years ago. I loaded up the machine and set it up for a wash-only cycle (6 kg cottons program, 60°C, 1400 RPM spin cycle, stain level 2) and recorded the display on the front panel with a laptop camera. I then read out the timer values from the video with a simple Python program and compared them to the frame timestamps. The results are below: the real elapsed time is on the horizontal axis and the display readout is on the vertical axis. The dashed gray line shows what the timer should ideally display if the initial estimate were perfect:

Countdown display on a Hoover washer versus real time.

The timer first shows a reasonable estimate of 152 minutes at the start of the program. The filled-in area on the graph around 5 minutes in is when the "Kg Mode" is active. At that time the display shows a little animation as the washer supposedly weighs the laundry. My OCR program got confused by that, so the time values there are invalid. However, after the display returns to the countdown, the time displayed is significantly lower, at just below 2 hours.

"Kg Mode" description from the Hoover manual.

Image by Candy Hoover Group S.r.l.

That seems fine at first - I didn't use a full 6 kg load, so it's reasonable to believe that the machine adjusted the wash time accordingly. However, the minutes on the display then count down more slowly than real time. So much more slowly, in fact, that they more than make up for the difference and the machine ends the cycle after 156 minutes, 4 minutes later than the initial estimate.

You can also see that the display jumps up and down during the program, with the largest jump around 100 minutes in. Even when it seems to be running consistently, it will often count a minute down, then up and down again. You can see evidence of that on the graph below, which shows the differences in the timer value over time. I've verified this visually on the video and it's not an artifact of my OCR.

Timer display differences on a Hoover washer.

Why is it doing that? Is it just buggy software, an oscillator running slow, or maybe someone figured that people would be happier if the machine showed a lower number at first and then slowed down the passing minutes? Hard to say. Even if you don't tend to take videos of your washing machine, it's hard not to notice that the timer sits at the last 1 minute for around 7 minutes.

Incidentally, does anyone know how the weighing function works? I find it hard to believe that they actually measure the weight with a load cell. They must have some cheaper way, perhaps by measuring the time it takes for the water level to rise in the drum or something like that. The old Candy machine also advertised this feature, and considering that it was an electro-mechanical affair without sophisticated electronics, it must be a relatively simple trick.

Pixel sampling locations for the 7-segment display OCR.

In conclusion, a few words on how I did the display OCR. I started off with the seven-segment-ocr project by Suyash Kumar that I found on GitHub. Unfortunately it didn't work for me. It uses a very clever trick to do the OCR - it just looks at intensity profiles of two horizontal and one vertical line per character per frame. However, because my video had low resolution and was recorded at a slight angle, no amount of tweaking made it work. In the end I just hacked up a script that sampled 24 hand-picked single pixels from the video. It then compared them to a hand-picked threshold and decoded the display values from there. Still, Kumar's project came in useful since I could reuse all of his OpenCV boilerplate.
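The core of the hack looks roughly like the sketch below. The pixel coordinates, the threshold and the number of digits here are made up for illustration - the real values depend on the recording - but the idea of mapping a tuple of thresholded segment samples to a digit is the same:

import cv2

# (x, y) of one sample pixel per segment (a..g) for each digit on the display
DIGITS = [
    [(102, 50), (110, 58), (110, 72), (102, 80), (95, 72), (95, 58), (102, 65)],
    # ... one list of seven sample points per remaining digit ...
]
THRESHOLD = 90  # hand-picked intensity threshold

# thresholded segment pattern (a, b, c, d, e, f, g) -> displayed digit
PATTERNS = {
    (1, 1, 1, 1, 1, 1, 0): 0, (0, 1, 1, 0, 0, 0, 0): 1, (1, 1, 0, 1, 1, 0, 1): 2,
    (1, 1, 1, 1, 0, 0, 1): 3, (0, 1, 1, 0, 0, 1, 1): 4, (1, 0, 1, 1, 0, 1, 1): 5,
    (1, 0, 1, 1, 1, 1, 1): 6, (1, 1, 1, 0, 0, 0, 0): 7, (1, 1, 1, 1, 1, 1, 1): 8,
    (1, 1, 1, 1, 0, 1, 1): 9,
}

def read_display(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    digits = []
    for segments in DIGITS:
        pattern = tuple(int(gray[y, x] > THRESHOLD) for (x, y) in segments)
        digits.append(PATTERNS.get(pattern))  # None when the display shows an animation
    return digits

cap = cv2.VideoCapture("hoover.mp4")
while True:
    ok, frame = cap.read()
    if not ok:
        break
    timestamp = cap.get(cv2.CAP_PROP_POS_MSEC) / 1000.0  # frame time in seconds
    print(timestamp, read_display(frame))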

Recording of a digital clock synchronized to the DCF77 signal.

Since I like to be sure, I also double-checked that my method of getting real time from video timestamps is accurate enough. Using the same setup I recorded a DCF77-synchronized digital clock for two hours. I estimated the relative error between video timestamps and the clock to be -0.15%, which means that after two hours, my timings were at most 11 seconds off. The Microsoft-minute effects I found in Hoover are much larger than that.

Posted by Tomaž | Categories: Life | Comments »

Monitoring HomePlug AV devices

23.05.2018 18:51

Some time ago I wanted to monitor the performance of a network of Devolo dLAN devices. These are power-line Ethernet adapters. Each device looks like a power brick with a standard Ethernet RJ45 port. You can plug several of these into wall sockets around a building and theoretically they will together act as an Ethernet bridge, linking all their ports as if they were connected to a single network. The power-line network in question seemed to be having intermittent problems, but without some kind of a log it was hard to pin-point exactly what the problem was.

I have very little experience with power-line networks and some quick web searches yielded conflicting information about how these things work and what protocols are at work behind the scenes. Purely from the user perspective, the experience seems to be similar to wireless LANs. While individual devices have flashy numbers written on them, such as 500 Mbps, these are just theoretical "up to" throughputs. In practice, the bandwidth of individual links in the network seems to be dynamically adjusted based on signal quality and is commonly quite a bit lower than advertised.

Devolo Cockpit screenshot.

Image by devolo.com

Devolo provides an application called Cockpit that allows you to configure the devices and visualize the power-line network. The feature I was most interested in was the real-time display of the physical layer bitrate for each individual link in the network. While the Cockpit is available for Linux, it is a user friendly point-and-click graphical application and chances were small that I would be able to integrate it into some kind of an automated monitoring process. The prospect of decoding the underlying protocol seemed easier. So I did a packet capture with Wireshark while the Cockpit was fetching the bitrate data:

HomePlug AV protocol decoded in Wireshark.

Wireshark immediately showed the captured packets as part of the HomePlug AV protocol and provided a nice decode. This finally gave me a good keyword I could base my further web searches on, which revealed a helpful white paper with some interesting background technical information. The HomePlug AV physical layer apparently uses frequencies in the range of 2 - 28 MHz, using OFDM with an adaptive number of bits per modulation symbol. The network management is centralized, using a coordinator and a mix of CSMA/CA and TDMA access.

More importantly, the fact that the Wireshark decode showed the bitrate information in plain text gave me confidence that replicating the process of querying the network would be relatively straightforward. Note how the 113 Mbit/sec in the decode directly corresponds to hex 0x71 in the raw packet contents. It appeared that only two packets were involved, a Network Info Request and a Network Info Confirm:

HomePlug AV Network Info Confirmation packet decode.

However, before diving directly into writing code from scratch I came across the Faifa command-line tool on GitHub. The repository seems to be a source code dump from the now-defunct dev.open-plc.org web site. There is very little in terms of documentation or clues to its provenance. The last commit was in 2016. However, a look through its source code revealed that it is capable of sending the 0xa038 Network Info Request packet and receiving and decoding the corresponding 0xa039 Network Info Confirm reply. This was exactly what I was looking for.

Some tweaking and a compile later I was able to get the bitrate info from my terminal. Here I am querying one device in the power-line network (its Ethernet address is in the -a parameter). The queried device returns the current network coordinator and a list of stations it is currently connected to, together with the receive and transmit bitrates for each of those connections:

# faifa -i eth4 -a xx:xx:xx:xx:xx:xx -t a038
Faifa for HomePlug AV (GIT revision master-5563f5d)

Started receive thread
Frame: Network Info Request (Vendor-Specific) (0xA038)

Dump:
Frame: Network Info Confirm (Vendor-Specific) (A039), HomePlug-AV Version: 1.0
Network ID (NID): xx xx xx xx xx xx xx
Short Network ID (SNID): 0x05
STA TEI: 0x24
STA Role: Station
CCo MAC: 
	xx:xx:xx:xx:xx:xx
CCo TEI: 0xc2
Stations: 1
Station MAC       TEI  Bridge MAC        TX   RX  
----------------- ---- ----------------- ---- ----
xx:xx:xx:xx:xx:xx 0xc2 xx:xx:xx:xx:xx:xx 0x54 0x2b
Closing receive thread

The original tool had some annoying problems that I needed to work around before deploying it to my monitoring system. Most of all, it operated by sending the query with the Ethernet broadcast address as the source. It then put the local network interface into promiscuous mode to listen for the broadcast replies. This seemed like bad practice and created problems for me, not least of which was log spam with repeated kernel warnings about promiscuous mode enters and exits. It's possible that the use of broadcasts was a workaround for a hardware limitation on some devices, but the devices I tested (dLAN 200 and dLAN 550) seem to reply just fine to queries from non-broadcast addresses.

I also fixed a race condition in the original tool caused by the way it received replies. If multiple queries were running on the same network simultaneously, replies from different devices sometimes got mixed up. Finally, I fixed some rough edges in its libpcap usage that prevented the multi-threaded Faifa process from exiting cleanly once a reply was received. I added a -t command-line option for sending and receiving a single packet.

As usual, the improved Faifa tool is available in my fork on GitHub:

$ git clone https://github.com/avian2/faifa.git
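For the actual monitoring, a small wrapper that calls this tool periodically and logs the reported bitrates is then enough. Something along the lines of the sketch below works; the device address, interface and polling interval are placeholders, and the hexadecimal TX/RX columns are interpreted as Mbit/s as in the Wireshark decode earlier:

import re
import subprocess
import time

DEVICE = "xx:xx:xx:xx:xx:xx"   # address of the queried power-line device
IFACE = "eth4"

# One line per station in the Network Info Confirm table, e.g.
# "xx:xx:xx:xx:xx:xx 0xc2 xx:xx:xx:xx:xx:xx 0x54 0x2b"
ROW = re.compile(
    r"^([0-9a-f:]{17}) +0x\w+ +[0-9a-f:]{17} +0x(\w+) +0x(\w+)",
    re.MULTILINE | re.IGNORECASE)

while True:
    out = subprocess.check_output(
        ["faifa", "-i", IFACE, "-a", DEVICE, "-t", "a038"],
        stderr=subprocess.STDOUT, universal_newlines=True)
    now = int(time.time())
    for mac, tx, rx in ROW.findall(out):
        # TX/RX are hexadecimal PHY bitrates in Mbit/s
        print("%d %s tx=%d rx=%d" % (now, mac, int(tx, 16), int(rx, 16)))
    time.sleep(300)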

To conclude, here is an example of bitrate data I recorded using this approach. It shows transmit bitrates reported by one device in the network to two other devices (here numbered "station 1" and "station 2"). The data was recorded over the course of 9 days and the network was operating normally during this time:

Recorded PHY bitrate for two stations.

Even this graph shows some interesting things. Some devices (like the "station 1" here) seem to enter a power saving mode. Such devices don't appear in the network info reports, which is why data is missing for some periods of time. Even out of power saving mode, devices don't seem to update their reported bitrates if there is no data being transferred on that specific link. I think this is why the "station 2" here seems to have long periods where the reported bitrate remains constant.

Posted by Tomaž | Categories: Code | Comments »

On the output voltage of a real flyback converter

13.05.2018 13:18

I was recently investigating a small switch-mode power supply for a LED driver. When I was looking at the output voltage waveform on an oscilloscope it occurred to me that it looked very little like the typical waveforms that I remember from university lectures. So I thought it would be interesting to briefly explain here why it looks like that and where the different parts of the waveform come from.

The power supply I was looking at is based on the THX203H integrated circuit. It's powered from line voltage (230 V AC) and uses an isolated (off-line) flyback topology. The circuit is similar to the one shown for a typical application in the THX203H datasheet. The switching frequency is around 70 kHz. Below I modified the original schematic to remove the pi filter on the output, which wasn't present in the circuit I was working with:

Schematic of a flyback converter based on THX203H.

When this power supply was loaded with its rated current of 1 A, this is how the voltage on the output terminals looked on an oscilloscope:

Output voltage of a flyback converter on an oscilloscope.

If you recall the introduction to switch-mode voltage converters, they operate by charging a coil from the input voltage and discharging it into the output. A regulator keeps the duty cycle of the switch adjusted so that the mean inductor current is equal to the load current. For flyback converters, a typical figure you might recall for discontinuous operation is something like the following:

Idealized coil currents and output voltage in a flyback converter.

From top to bottom are the primary coil current, secondary coil current and output voltage, not drawn to scale on the vertical axis. First the ferrite core is charged by the increasing current in the primary coil. Then the controller turns off the primary current and the core discharges through the secondary coil into the output capacitor.

Note how the ripple on the oscilloscope looks more like the secondary current than the idealized output voltage waveform. The main realization here is that the ripple in this case is defined mostly by the output capacitor's equivalent series resistance (ESR) rather than its capacitance. The effect of ESR is ignored in the idealized graphs above.

In this power supply, the 470 μF 25 V aluminum electrolytic capacitor on the output has an ESR somewhere around 0.1 Ω. The peak capacitor charging current is around 3 A and hence the maximum voltage drop on the ESR is around 300 mV. On the other hand, the ripple due to the capacitor charging and discharging is an order of magnitude smaller, at around 30 mV peak-to-peak in this specific case.
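A back-of-the-envelope check of these two contributions, assuming the capacitor supplies the full 1 A load current for roughly one 70 kHz switching period while the primary is charging:

esr = 0.1       # ohm, estimated ESR of the output capacitor
i_peak = 3.0    # A, peak charging current from the secondary
i_load = 1.0    # A, rated load current
c = 470e-6      # F, output capacitance
f_sw = 70e3     # Hz, switching frequency

v_esr = esr * i_peak          # ripple due to ESR, ~0.3 V
v_cap = i_load / (c * f_sw)   # ripple due to charge/discharge, ~0.03 V
print(v_esr, v_cap)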

Adding the ESR drop to the capacitor voltage gives a much better approximation of the observed output voltage. The break in the slope marks the point where the coil has stopped discharging. Before that point the slope is defined by the decaying current in the coil and the capacitor ESR. After that point, the slope is defined by the discharging output capacitor.

ESR voltage drop, capacitor voltage and output voltage in a flyback converter.

The only feature still missing is the high-frequency noise we see on the oscilloscope. This is caused by the switching done by the THX203H. Abrupt changes in the current cause ringing in the resonant circuits formed by the leakage inductance of the transformer and the capacitances of the diodes. Since the output filter has been removed as a cost-cutting measure, the ringing can be seen unattenuated on the output terminals. The smaller oscillations are caused by the primary side switching on, while the larger oscillations that are obscuring the rising slope are caused by the THX203H switching off the primary coil.

Switching noise on the output voltage of a flyback converter.

Posted by Tomaž | Categories: Analog | Comments »

Switching window scaling in GNOME

01.05.2018 13:22

A while back I got a new work laptop: a 13" Dell XPS 9360. I was pleasantly surprised that installing the latest Debian Stretch with GNOME went smoothly and no special tweaks were needed to get everything up and running. The laptop works great and the battery life in Linux is a significant step up from my old HP EliteBook. The only real problem I noticed after a few months of use is weird behavior of the headphone jack, which often doesn't work for some unknown reason.

In any case, this is my first computer with a high-DPI screen. The 13-inch LCD panel has a resolution of 3200x1800, which means that text on a normal X window screen is almost unreadable without some kind of scaling. Thankfully, the GNOME that ships with Stretch has relatively good support for window scaling. You can set a scaling factor in the GNOME Tweak Tool and all windows will have their content scaled by an integer factor, making text and icons intelligible again.

Window scaling setting in GNOME Tweak Tool.

This setting works fine for me, at least for terminal windows and Firefox, which is what I mostly use on this computer. I've only noticed some minor cosmetic issues when I change this at run-time. Some icons and buttons in GNOME Shell (like the bar on the top of the screen or the settings menu on the upper-right) will sometimes look weird until the next reboot.

A bigger annoyance was the fact that I often use this computer with a normal (non-high-DPI) external monitor. I had to open up the Tweak Tool each time I connected or disconnected the monitor. Navigating the huge scaled UI on the low-DPI external monitor, or the tiny UI on the high-DPI laptop panel, got old really quickly. It soon became obvious that changing that setting should be a matter of a single key press.

Finding a way to set window scaling programmatically was surprisingly difficult (not unlike my older effort to switch audio output devices). I tried a few different approaches, like setting some dconf keys, but none worked reliably. I ended up digging into the Tweak Tool source. This revealed that the Tweak Tool is built around a nice Python library that exposes the necessary settings as functions you can call from your own scripts. The rest was simple.

I ended up with the following Python script:

#!/usr/bin/python2.7

from gtweak.utils import XSettingsOverrides

def main():
    xsettings = XSettingsOverrides()

    # Toggle the scaling factor between 1 (external monitor) and 2 (laptop panel).
    sf = xsettings.get_window_scaling_factor()

    if sf == 1:
        sf = 2
    else:
        sf = 1

    xsettings.set_window_scaling_factor(sf)

if __name__ == "__main__":
    main()

I have this script saved as toggle_hidpi and then a shortcut set in GNOME Keyboard Settings so that Super-F11 runs it. Note that using the laptop's built-in keyboard this means pressing the Windows logo, Fn, and F11 keys due to the weird modern practice of hiding normal function keys behind the Fn modifier. On an external USB keyboard, only Windows logo and F11 need to be pressed.

High DPI toggle shortcut in GNOME keyboard settings.

Posted by Tomaž | Categories: Code | Comments »

Faking adjustment layers with GIMP layer modes

12.03.2018 19:19

Drawing programs usually use a concept of layers. Each layer is like a glass plate you can draw on. The artwork appears in the program's window as if you were looking down through the stack of plates, with ink on upper layers obscuring that on lower layers.

Example layer stack in GIMP.

Adobe Photoshop extends this idea with adjustment layers. These do not add any content, but instead apply a color correction filter to the lower layers. Adjustment layers work in real time: the color correction is applied seamlessly as you draw on the layers below.

Support for adjustment layers in GIMP is a common question. GIMP does have a Color Levels tool for color correction. However, it can only be applied to one layer at a time. The Color Levels operation is also destructive: if you apply it to a layer and want to continue drawing on it, you also have to adjust the colors of your brushes to stay consistent. This is often hard to do. It mostly means that you need to leave the color correction for the end, when no further changes will be made to the drawing layers and you can apply the correction to a flattened version of the image.

Adjustment layers are currently on the roadmap for GIMP 3.2. However, considering that Debian still ships with GIMP 2.8, that seems a long way off. Is it possible to have something like that today? I found some tips on the web on how to fake adjustment layers using various layer modes, but these are very hand-wavy. Ideally what I would like to do is perfectly replicate the Color Levels dialog, not follow some vague instructions on what to do if I want the picture a bit lighter. That post did give me an idea though.

Layer modes allow you to perform some simple mathematical operations between pixel values on layers, like addition, multiplication and a handful of others. If you fill a layer with a constant color and set it to one of these layer modes, you are in fact transforming pixel values from layers below using a simple function with one constant parameter (the color on the layer). Color Levels operation similarly transforms pixel values by applying a function. So I wondered, would it be possible to combine constant color layers and layer modes in such a way as to approximate arbitrary settings in the color levels dialog?

Screenshot of the Color Levels dialog in GIMP.

If we look at the Color Levels dialog, there are three important settings: black point (b), white point (w) and gamma (g). These can be adjusted for the red, green and blue channels individually, but since the operations are identical and independent, it suffices to focus on a single channel. Also note that GIMP performs all operations in the range of 0 to 255. However, the calculations work out as if it were operating in the range from 0 to 1 (effectively, fixed-point arithmetic is used with a scaling factor of 255). Since it's somewhat simpler to demonstrate, I'll use the range from 0 to 1.

Let's first ignore gamma (leave it at 1) and look at b and w. Mathematically, the function applied by the Color Levels operation is:

y = \frac{x - b}{w - b}

where x is the input pixel value and y is the output. On a graph it looks like this (you can also get this graph from GIMP with the Edit these Settings as Curves button):

Plot of the Color Levels function without gamma correction.

Here, the input pixel values are on the horizontal axis and output pixel values are on the vertical. This function can be trivially split into two nested functions:

y = f_2(f_1(x))
where
f_1(x) = x - b
f_2(x) = \frac{x}{w-b}

f1 shifts the origin by b and f2 increases the slope. This can be replicated using two layers at the top of the layer stack. The GIMP documentation calls these masking layers:

  • Layer mode Subtract, filled with pixel value b,
  • Layer mode Divide, filled with pixel value w - b

Note that values outside of the range 0 - 1 are clipped. This happens on each layer, which makes things a bit tricky for more complicated layer stacks.
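In code, the effect of these two masking layers on a pixel value can be written as something like this minimal NumPy sketch, with the clipping applied after each layer as GIMP does:

import numpy as np

def fake_levels(x, b, w):
    """Black/white point adjustment built from two masking layers, x in range 0-1."""
    y = np.clip(x - b, 0.0, 1.0)         # Subtract layer, filled with value b
    y = np.clip(y / (w - b), 0.0, 1.0)   # Divide layer, filled with value w - b
    return y

print(fake_levels(np.linspace(0.0, 1.0, 5), b=0.2, w=0.8))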

The above took care of the black and white point settings. What about gamma adjustment? This is a non-linear operation that gives more emphasis to darker or lighter colors. Mathematically, it's a power function with a real exponent. It is applied on top of the previous linear scaling.

y = x^g

GIMP allows values of g between 0.1 and 10 in the Color Levels tool. Unfortunately, no layer mode applies a power function. However, the Soft light mode applies the following equation:

R = 1 - (1 - M)\cdot(1-x)
y = ((1-x)\cdot M + R)\cdot x

Here, M is the pixel value of the masking layer. If M is 0, this simplifies to:

y = x^2

So by stacking multiple such masking layers with the Soft light mode, we can get any exponent that is a power of 2. This is still not really useful though. We want to be able to approximate any real gamma value. Luckily, the layer opacity setting opens up some more options. Layer opacity p (again in the range 0 - 1) basically does a linear combination of the original pixel value and the masked value. Taking this into account we get:

y = (1-p)x + px^2

By stacking multiple masking layers with opacity, we can get a polynomial function:

y = a_1 x + a_2 x^2 + a_3 x^3 + \dots

By carefully choosing the opacities of the masking layers, we can manipulate the polynomial coefficients a_n. Polynomials of a sufficient degree can be a very good approximation for a power function with g > 1. For example, here is an approximation using the above method for g = 3.33, using 4 masking layers:
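The fit itself can be done numerically. The sketch below shows one way to do it with SciPy, assuming the masking layers are all filled with black (M = 0) so that each layer with opacity p maps a value t to (1 - p)·t + p·t²; the exact approach my plug-in uses may differ in detail:

import numpy as np
from scipy.optimize import least_squares

g = 3.33          # target gamma value
n_layers = 4      # number of Soft light masking layers
x = np.linspace(0.0, 1.0, 256)

def stack(p, x):
    # Apply the masking layers one after another, each with opacity p[i].
    y = x
    for pi in p:
        y = (1.0 - pi) * y + pi * y * y
    return y

def residual(p):
    return stack(p, x) - x ** g

fit = least_squares(residual, x0=np.full(n_layers, 0.5), bounds=(0.0, 1.0))
print(np.round(fit.x, 3))   # opacities to set on the four masking layers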

Polynomial approximation for the gamma adjustment function.

What about the g < 1 case? Unfortunately, polynomials don't give us a good approximation and there is no layer mode that involves square roots or any other usable function like that. However, we can apply the same principle of linear combination with opacity settings to multiple saturated divide operations. This effectively makes it possible to approximate the power function piecewise linearly. It's not as good as the polynomial, but with enough linear segments it can get very close. Here is one such approximation, for g = 0.33, again using 4 masking layers:

Piecewise linear approximation for the gamma adjustment function.

To test this all together in practice I've made a proof-of-concept GIMP plug-in that implements this idea for gray-scale images. You can get it on GitHub. Note that it needs relatively recent SciPy and NumPy versions, so if it doesn't work at first, try upgrading them from PyPI. This is how the adjustment layers look in the Layers window:

Fake color adjustment layer stack in GIMP.

Visually, the results are reasonably close to what you get through the ordinary Color Levels tool, although not perfectly identical. I believe some of the discrepancy is caused by rounding errors. The following comparisons show a black-to-white linear gradient with a gamma correction applied. The first stripe is the original gradient, the second has the gamma correction applied using the built-in Color Levels tool and the third one has the same correction applied using the fake adjustment layers created by my plug-in:

Fake adjustment layer compared to Color Levels for gamma = 0.33.

Fake adjustment layer compared to Color Levels for gamma = 3.33.

How useful is this in practice? It certainly has the non-destructive, real-time property of Photoshop's adjustment layers. The adjustment is immediately visible on any change in the drawing layers. Changing the adjustment settings, like the gamma value, is somewhat awkward, of course. You need to delete the layers created by the plug-in and create a new set (although an improved plug-in could assist in that). There is no live preview like in the Color Levels dialog, but you can use Color Levels for the preview and then copy-paste the values into the plug-in. The multiple layers also clutter the layer list. Unfortunately it's impossible to put them into a layer group, since layer mode operations don't work on layers outside of a group.

My current code only works on gray-scale images. The polynomial approximation uses the layer opacity setting, and layer opacity can't be set separately for the red, green and blue channels. This means that this way of applying the gamma adjustment can't be adapted for color. However, support for color should be possible with the piecewise linear method, since in that case you can control the divisor separately for each channel (it's defined by the color fill on the layer). The opacity still stays the same, but I think it should be possible to make it work. I haven't done the math for that though.

Posted by Tomaž | Categories: Life | Comments »

OpenCT on Debian Stretch

24.02.2018 10:03

I don't like replacing old technology that isn't broken, although I sometimes wonder whether that's just rationalizing a fear of change. I'm still using a bunch of old Schlumberger Cryptoflex (e-Gate) USB tokens for securely storing client-side SSL certificates. All of them are held together by black electrical tape at this point, since the plastic became brittle with age. However, they still serve their purpose reasonably well, even though software support for them was obsoleted a long time ago. So what follows is another installment in the series on keeping these hardware tokens working on the latest Debian Stable release.

Stretch upgrades the pcscd and libpcsclite1 packages (from the pcsc-lite project) to version 1.8.20. Unfortunately, this upgrade breaks the old openct driver, which is to my knowledge the only way to use these tokens on a modern system. This manifests itself as the following error when dumping the list of currently connected smart cards:

$ pkcs15-tool -D
Using reader with a card: Axalto/Schlumberger/Gemalo egate token 00 00
PKCS#15 binding failed: Unsupported card

Some trial and error and git bisect led me to commit 8eb9ea1, which apparently caused this issue. It was committed between releases 1.8.13 (which was shipped in Jessie) and 1.8.14. This commit introduces some subtle changes in the way buffers of data are exchanged between pcscd and its drivers, which break openct 0.6.20.

There are two ways around that: you can keep using pcscd and libpcsclite1 from Jessie (the 1.8.13 source package from Jessie builds fine on Stretch), or you can patch openct. I've decided on the second option.

The openct driver is no longer developed upstream and was removed from Debian in Jessie (the last official release was in 2010, although there has been some effort to modernize it). I keep my own git repository and Debian packages based on the last package shipped in Wheezy. My patched version 0.6.20 includes the changes required for systemd support, and now also the patch required to support the modern pcscd version on Stretch. The latter was helpfully pointed out to me by Ludovic Rousseau on the pcsc-lite mailing list.

My openct packages for Stretch on amd64 can be found here (version 0.6.20-1.2tomaz2). The updated source is also in a git repository (with a layout compatible with git-buildpackage), should you want to build it yourself:

$ git clone http://www.tablix.org/~avian/git/openct.git

Other smart card-related packages work for me as shipped in Stretch (e.g. opensc and opensc-pkcs11 0.16.0-3). No changes were necessary in the Firefox configuration for it to be able to pull client-side certificates from the hardware tokens. It is still required, however, to insert the token only when no instances of Firefox are running.

Posted by Tomaž | Categories: Code | Comments »

Notes on the general-purpose clock on BCM2835

17.02.2018 20:30

Raspberry Pi boards, or more specifically the Broadcom system-on-chips they are based upon, have the capability to generate a wide range of stable clock signals entirely in hardware. These are called general-purpose clock peripherals (GPCLK) and the clock signals they generate can be routed to some of the GPIO pins as alternate pin functions. I was recently interested in this particular detail of Raspberry Pi and noticed that there is little publicly accessible information about this functionality. I had to distill a coherent picture from various, sometimes conflicting, sources of information floating around various websites and forums. So, in the hope that it will be useful for someone else, I'm posting my notes on Raspberry Pi GPCLKs with links for anyone that needs to dig deeper. My research was focused on Raspberry Pi Compute Module 3, but it should mostly apply to all Raspberry Pi boards.

Here is an example of a clock setup that uses two GPCLK peripherals to produce two clocks that drive components external to the BCM2835 system-on-chip (in my case a 12.288 MHz clock for an external audio interface and a 24 MHz clock for a USB hub). This diagram is often called the clock tree and is a common sight in datasheets. Unfortunately, the publicly-accessible BCM2835 datasheet omits it, so I drew my own from what information I could gather on the web. Only the components relevant to clocking the GPCLKs are shown:

Rough sketch of a clock tree for BCM2835.

The root of the tree is an oscillator on the far left of the diagram. BCM2835 derives all other internal clocks from it by multiplying or dividing its frequency. On a Compute Module 3 the oscillator clock is a 19.2 MHz signal defined by the on-board crystal resonator. The oscillator frequency is fixed and cannot be changed in software.

19.2 MHz crystal resonator on a Compute Module 3.

The oscillator is routed to a number of phase-locked loop (PLL) devices. A PLL is a complex device that allows you to multiply the frequency of a clock by a configurable, rational factor. Practical PLLs necessarily add some amount of jitter into the clock. How much depends on their internal design and is largely independent of the multiplication factor. For BCM2835 some figures can be found in the Compute Module datasheet, under the section Electrical specification. You can see that routing the clock through a PLL increases the jitter by 28 ps.

GPCLK jitter characteristics from RPI CM datasheet.

Image by Raspberry Pi (Trading) Ltd.

The BCM2835 contains 5 independent PLLs - PLLA, PLLB, PLLC, PLLD and PLLH. The system uses most of these for its own purposes, such as clocking the ARM CPU, the VideoCore GPU, the HDMI interface, etc. The PLLs have some default configuration, which however cannot be relied upon too strongly. Some default frequencies are listed on this page. Note that it says that the PLLC settings depend on overclocking settings. My own experiments show that the PLLH settings change between firmware versions and depending on whether a monitor is attached to HDMI or not. On some Raspberry Pi boards, other PLLs are used to clock on-board peripherals like Ethernet or Wi-Fi - search for gp_clk in dt-blob.dts. On the Compute Module, PLLA appears to be turned off by default and free for general-purpose use. This kernel commit suggests that the PLLD settings are also stable.

For each PLL, the multiplication factors can be set independently in software using registers in I/O memory. To my knowledge these registers are not publicly documented, but the clk-bcm2835.c file in the Linux kernel offers some insight into what settings are available. The output of each PLL branches off into several channels. Each channel has a small integer divider that can be used to lower the frequency. It is best to leave the settings of the PLL and channel dividers to the firmware by using the vco@PLLA and chan@APER sections in dt-blob.bin. This is described in the Raspberry Pi documentation.

There are three available GPCLK peripherals: GPCLK0, GPCLK1 and GPCLK2. For each you can independently choose a clock source. 5 clock sources are available: the oscillator clock and 4 channels from 4 of the PLLs (PLLB isn't selectable). Furthermore, each GPCLK peripheral has an independent fractional divider. This divider can again divide the frequency of the selected clock source by an (almost) arbitrary rational number.

Things are somewhat better documented at this stage. The GPCLK clock source and fractional divider are controlled from I/O memory registers that are described in the BCM2835 ARM peripherals document. Note that there is an error in the equation for the average output frequency in Table 6-32. It should be:

f_{GPCLK} = \frac{f_{source}}{\mathrm{DIVI} + \frac{\mathrm{DIVF}}{4096}}
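As a worked example, with the PLLA and APER settings used later in this post (PLLA VCO at 1.92 GHz and the APER channel divider set to 4, giving a 480 MHz source), the divider values for the two clocks I needed work out roughly as follows:

f_source = 1920e6 / 4   # 480 MHz APER channel from PLLA

for f_target in (24e6, 12.288e6):
    div = f_source / f_target
    divi = int(div)
    divf = int(round((div - divi) * 4096))
    print(f_target, divi, divf)

# 24 MHz     -> DIVI = 20, DIVF = 0   (integer ratio, minimal jitter)
# 12.288 MHz -> DIVI = 39, DIVF = 256 (fractional divider, some added jitter)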

It is perfectly possible to set up a GPCLK by writing directly into the registers from Linux user space, for example by mmaping /dev/mem or using a command-line tool like busybox devmem. This way you can hand-tune the integer and fractional parts of the dividers or use the noise-shaping functions. You might want to do this if jitter is important. When experimenting, I find the easiest way to get the register base address is to search the kernel log. In the following case, the GPIO block is at 0x3f200000, which puts the peripheral base at 0x3f000000 and the CM_GP0CTL register at 0x3f101070:

# dmesg|grep gpiomem
gpiomem-bcm2835 3f200000.gpiomem: Initialised: Registers at 0x3f200000

A simpler way of taking care of the GPCLK settings is again through the dt-blob.bin file. By using the clock@GPCLK0 directives under clock_routing and clock_setup, the necessary register values are calculated and set automatically at boot by the firmware. As far as I can see, these only allow using PLLA and APER. Attempting to set or use other PLL sources has unpredictable results in my experience. I also recommend checking the actual clock output with an oscilloscope, since these automatic settings might not be optimal.

The settings shown on the clock tree diagram on the top were obtained with the following part compiled into the dt-blob.bin:

clock_routing {
	vco@PLLA {
		freq = <1920000000>;
	};
	chan@APER {
		div = <4>;
	};
	clock@GPCLK0 {
		pll = "PLLA";
		chan = "APER";
	};
	clock@GPCLK2 {
		pll = "PLLA";
		chan = "APER";
	};
};

clock_setup {
	clock@GPCLK0 {
		freq = <24000000>;
	};
	clock@GPCLK2 {
		freq = <12288000>;
	};
};

If the clocks can be set up automatically, why is all this background important? The rational dividers used by the GPCLKs work by switching the output between two frequencies. In contrast to the PLLs, the jitter they introduce depends largely on their settings and can be quite severe in some cases. For example, this is how the 12.288 MHz clock looked when set by the firmware from a first-attempt dt-blob.bin:

Clock signal with a lot of jitter produced by GPCLK function.

It's best to keep your clocks divided by integer ratios, because in that case the dividers introduce minimal jitter. If you can't do that, the jitter is minimized by maximizing the clock source frequency. In my case, I wanted to generate two clocks that didn't have an integer ratio, so I was forced to use the fractional part on at least one divider. I opted for a low-jitter, integer-divided clock for the 24 MHz USB clock and a higher-jitter (but still well within the acceptable range) clock for the 12.288 MHz audio clock.

This is how the 12.288 MHz clock looks with the settings shown above. It's a significant improvement over the first attempt:

Low-jitter clock signal produced by GPCLK function.

GPCLKs can be useful in lowering the cost of a system, since they can remove the need for separate, expensive crystal resonators. However, how easy they are to set up on the Raspberry Pi can sometimes be deceiving. dt-blob.bin offers a convenient way of enabling clocks on boot without any assistance from Linux userland. Unfortunately, the exact mechanism by which the firmware sets the registers is not public, hence it's worth double-checking its work by understanding what is happening behind the scenes, inspecting the registers and checking the actual clock signal with an oscilloscope.

Posted by Tomaž | Categories: Digital | Comments »