On "The Bullet Journal Method" book

17.01.2020 12:02

How can you tell if someone uses a Bullet Journal®? You don't have to; they will immediately tell you themselves.

Some time last year I saw this book in the window of a local bookstore. I was aware of the website, but I didn't know the author had also published a book about his method of organizing notebooks. I learned about the Bullet Journal back in 2014 and it motivated me to better organize my daily notes. About 3000 written pages later I'm still using some of the techniques I learned back then. I was curious whether the book held any new useful note-taking ideas, so I bought it on the spot.

The Bullet Journal Method by Ryder Carroll.

The Bullet Journal Method is a 2018 book by Ryder Carroll (by the way, the colophon says my copy was printed in Slovenia). The text is split into 4 parts: the first part gives the motivation for keeping a notebook. That is followed by a description of the actual note-taking methods. The third and longest part of the book, at around 100 pages, is called "The Practice". It's a kind of collection of essays giving life-philosophy advice on general topics such as meaning, gratitude and so on. The last part explores a few variations of the methods described in the book.

The methods described in the book differ a bit from what I remember. In fact the author does note in a few places that their advice has changed over time. The most surprising change to me was the switch from blank squares as the symbol for an unfinished task to simple dots. The squares were in my opinion one of the most useful things I took from the Bullet Journal, as they are a very clear visual cue. They really catch the eye among other notes and drawings when browsing for things left undone in a project.

In general, the contents of my notebooks are quite different from the journals the book talks about. I don't have such well-defined page formats (the book calls them "collections"), except perhaps monthly indexes. My notebooks more resemble lab notes and I also tend to write things in longer form than the really short bullet lists suggested in the book. The author spends a lot of time on migrations and reflection: rewriting things from an old, full notebook to a new one, moving notes between months and so on. I do very little of that and rely more on referencing and looking up things in old notebooks. I do see some value in it though, and after reading the book I'm starting to do more of it for some parts of my notes. I've experimented with a few other note-taking methods from the book as well; some seem to be working for me and I've dropped the others.

The Bullet Journal Method on Productivity.

I was surprised to see that a large portion of the book is dedicated to this very general motivational and lifestyle advice, including diagrams like the one you see above, much in the style of self-help books. It made me give up on the book half-way through for a few months. I generally have a dislike for this kind of text, but I don't think it's badly written. The section is intertwined with exercises that you can write down in your journal, like the "five whys" and so on. Some were interesting and others not so much. Reading about a suggestion to write your own obituary after a recent death in the family was off-putting, but I can hardly blame the book for that coincidence.

There is certainly some degree of Bullet Journal® brand building in this book. It feels like the author tries quite hard to sell their method in the first part of the book via thankful letters and stories from people who solved various tough life problems by following their advice. Again, this is something I think is commonly found in self-help books, and for me personally it usually has the opposite effect from the one intended. I do appreciate that the book doesn't really push the monetary side of it. The author's other businesses (branded notebooks and the mobile app) are each mentioned once towards the end of the book and not much more.

Another pleasant surprise was the tactful acknowledgment from the author that many journals shared on the web and social media don't resemble the real thing and can be very demotivating or misleading. I've noticed that myself. For example, if you search for "bullet journal" on YouTube you'll find plenty of people sharing their elaborately decorated notebooks that have been meticulously planned and sectioned for a year in advance. That's simply not how things work in my experience and, most of all, I strongly believe that writing the notebook with the intention of sharing it on social media defeats the whole purpose.

In conclusion, it's an interesting book and so far I've kept it handy on my desk to occasionally look up some example page layouts that are given throughout it. I do recommend it if you're interested in using physical notebooks or are frustrated with the multitude of digital productivity apps that never tend to quite work out. It's certainly a good starting point, but keep in mind that what's recommended in there might not be what actually works best for you. My advice would be only to keep writing and give it some time until you figure out the useful parts.

Posted by Tomaž | Categories: Life | Comments »

Jahresrückblick

03.01.2020 14:58

One of my more vivid childhood memories is from the center of Ljubljana, somewhere around the new year 1990. On the building of the Nama department store there was a new, bright red LED screen. It was scrolling a message that read "Welcome to the new decade". It was probably one of the first such displays I ever saw. I remember finding the message kind of weird and surprising. I think up to that point I had decades shelved among the terms that belonged only in books or movies.

It's now the end of another decade and, amusingly enough, I was again not really thinking about it in such terms. I was only reminded of the upcoming round number on the calendar when social media posts and articles summarizing the past 10 years started popping up. Anyway, I'm not even going to attempt to sum up the decade. A huge number of things happened in the past year alone, happy and sad and somewhere in between. I usually have problems summarizing even that in a few paragraphs, so here are only a few personal highlights.

PCB with an OLED screen and some analog circuits.

On the electronic side, I'm really happy with how one work-related project turned out. It's a small multi-purpose microcontroller board that includes an analog front-end for certain proprietary buses. I executed the whole project, from writing up a specification based on measurements and reverse engineering, through drawing up the schematic, to assembling prototypes and coding the firmware. It was an interesting exercise in optimization, both in keeping the BOM minimal and in low-level programming of the microcontroller's integrated peripherals.

Each year I mention the left-over pile of unfinished side projects. This year isn't any different and some projects stretch back worryingly deep into my stack of notebooks. Perhaps the closest to completion is a curve tracer that I designed while researching a curiosity in bipolar transistor behavior. I've also received the ERASynth Micro signal generator that I've helped crowdfund. It's supposed to become a part of an RF measurement system that I'm slowly piecing together. I feel bad for not posting a review of it, but as usual other things intervened.

"Failed sim" drawing

I've continued to spend many evenings drawing, either in a classical drawing class at the National Gallery or behind the digital setup I have at home. At the start of the year I was doing some more experiments with animation, trying out lessons I've learned and checking out how far I can get with Python scripts for compositing and lighting effects. I further developed my GIMP plug-in.

I played around with some story ideas, but I didn't end up doing any kind of a longer project like I did a year ago. I enjoyed trying out different drawing styles, experimenting with character design and doing random illustrations that came up in my mind. I've come up with some kind of an alternative space-race theme with animals, but in the end I realized that while I can draw the characters, I don't really have a story to tell about them.

Measuring the CPU temperature with an IR thermometer.

Speaking of telling stories, I've written more blog posts this year than the year before. I've also had my moment of fame when my rant about Google blocking messages from my mail server was posted on Hacker News and reached the top of the front page. My server received a year's worth of traffic in just a couple of days and the article got mentioned on sites like Bloomberg and BoingBoing. It was a fresh dose of motivation to keep writing amid the falling numbers of visitors and RSS subscribers that I've seen since around 2014.

As I always repeat in these posts, it's hard to sum up 365 days in a few paragraphs. I've tried to stick to the positive side of things above. In case the picture is too rosy, I must add that there were also sad times that were hard to wade through and plans that didn't turn out as they should have. The next year looks like it will bring some big challenges for me, so again I'll say that I won't make any plans on what kind of personal projects I'll do. If anything, I wish to finish some that I already started and try to refrain from beginning any new ones to add to the pile.

Posted by Tomaž | Categories: Life | Comments »

Food container damage in a microwave oven

12.12.2019 17:19

Some time ago I made a decision to start bringing my own lunch to work. The idea was that for a few days per week I would cook something simple at home the evening before and store it in a plastic container overnight in my fridge. I would then bring the container with me to the office the next day and heat it up in a microwave at lunch time. For various reasons I wasn't 100% consistent in following this plan, but for the past 3 months or so I did quite often make use of the oven in the office kitchenette. Back in mid-September I also bought a new set of food-grade plastic containers to use just for this purpose.

Around a week ago, just when I was about to fill one of the new containers, I noticed some white stains on its walls. Some increasingly vigorous scraping and rinsing later, the stains started looking less and less like dried-on food remains and more like some kind of corrosion of the plastic material. This had me worried, since the idea that I was eating dissolved polymer with my lunch didn't sound very inviting. On the other hand, I was curious. I've never seen plastic corroding in this way. In any case, I stopped using the containers and did some quick research.

Two types of clear plastic polypropylene food containers.

After carefully inspecting all the plastic containers in my kitchen, I found a few more instances of this exact same effect. All were on one of the two types of containers I've used for carrying lunch to work. The two types are shown in the photo above. The top blue one is a 470 ml Lock & Lock (this model is apparently now called "classic"). It's dated 2008, made in China. I have a stock of these that I've used for more than 10 years for freezing or refrigerating food, but until recently never for heating things up in a microwave. The bottom green one is a 1.1 L Curver "Smart fresh". I bought a few of these 3 months ago and have only used them for carrying and heating up lunches in a microwave.

Both of these types are marked microwave safe, food safe and dishwasher safe (I've been washing the containers in a dishwasher after use). They all have the number "5" resin identification code and the PP acronym, meaning they are supposed to be made out of polypropylene polymer. The following line of logos is embossed on the bottom of the Curver containers (on a side note, the capacity spelled out in Braille seems to say "7.6 L"). Lock & Lock has a similar line of logos, except they don't advertise "BPA FREE":

Markings on the Curver Smart Fresh food container.

The damage to the Curver container is visible on the photograph below. It looks like white dots and lines on the vertical wall of the container. At first glance it could be mistaken for dried-on food remains. On all damaged containers it most often appears in a line approximately at the horizontal level where the interface between the liquid and air would be. I tend to fill these to around the half-way mark and I've used the containers both for mostly solid food like rice or pasta and for liquids like sauces and soups. If I run a finger across the stains they feel rough compared to the mirror-finish plastic in the other parts. No amount of washing with water or a detergent will remove them. However, the stains are much less visible when wet.

Damaged walls of the Curver plastic food container.

Here is how these stains look under a microscope. The width of the area pictured here is approximately 10 mm. The microscope shows much better than the naked eye that what look like white stains on the surface are actually small pits and patches of the wall that have become corrugated. The damage does appear superficial and doesn't seem to penetrate beyond the immediate surface layer of the plastic.

Damage to the polypropylene surface under a microscope, 1.

Damage to the polypropylene surface under a microscope, 2.

I've only used the containers in this office microwave. It's a De'Longhi Perfecto MW 311 rated at 800 W (MAFF heating category "D"). I've always used the rotating plate, usually the highest power level and 2 to 3 minutes of heating time per container.

Power rating sign on the De'Longhi MW 311 microwave oven.

After some searching around the web, I found a MetaFilter post from 2010 that seems to describe exactly the same phenomenon: rough patches of plastic that look like corrosion appearing on Lock & Lock polypropylene containers. The only difference is that in Hakaisha's case the damage appears on the bottom of the container. The comments in that thread that seem plausible to me suggest physical damage from steam bubbles, chemical corrosion from tomato sauce or other acids in food, or some non-specific effect of microwaves on the plastic.

My experience suggests that heating and/or use in a microwave oven was required for this effect. If food contact alone were to blame, I'm sure I would have seen this on my old set of Lock & Lock containers sooner. Polypropylene is quite a chemically inert material (which is why it's generally considered food safe); however, its resistance to various chemicals does decrease at higher temperatures. For example, the chemical resistance table entry for oleic acid goes from no significant attack at 20°C to light attack at 60°C.

The comment about tomatoes is interesting. I've definitely seen that oily foods with a strong red color from tomatoes or red peppers will stain the polypropylene, even when stored in the refrigerator. In fact, leaflets that come with these food containers often warn that this is possible. In my experience, the red, transparent stain remains on the container for several cycles in the dishwasher, but does fade after some time. My Lock & Lock containers have been stained like that many times, but didn't develop the damaged surface before I started microwaving food in them.

Physical damage from steam bubbles seems unlikely to me. I guess something similar to cavitation might occur as a liquid-filled container moves through nodes and antinodes of the microwave oven's EM field, causing the water to boil and cool. However, it doesn't explain why this seems to mostly occur at the surface of the liquid. Direct damage from microwave radiation also doesn't make sense. It would occur all over the volume of the plastic, not only on the inner surface and in those specific spots. In any case, dielectric heating of the polypropylene itself should be negligible (it is, after all, used for low-loss capacitors exactly because of that property).

Another interesting source on this topic I found was a paper on deformation of packaging materials by Yoon et al. published in Korean Journal of Food Science and Technology in 2015. It discusses composite food pouches rather than monolithic polypropylene containers, however the inner layer of those pouches was in fact a polypropylene film. The authors investigated the causes of damage to that film after food in the pouches has been heated by microwaves. They show some microphotographs that look similar to what I've seen under my microscope.

Unfortunately, the paper is in Korean except for the abstract and figure captions. Google Translate isn't very helpful. My understanding is that they conclude that hot spots can occur in salty, high-viscosity mixtures that contain little water. My guess is the mixture must be salty to reduce the penetration depth due to increased conductivity and high-viscosity to lessen the effect of evaporative cooling.
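For a rough sense of the conductivity part, the usual textbook expression for the power penetration depth of microwaves in a lossy dielectric is

d_p = \frac{\lambda_0}{2 \pi \sqrt{2 \varepsilon_r' \left[ \sqrt{1 + (\varepsilon_r'' / \varepsilon_r')^2} - 1 \right]}}

where λ0 is the free-space wavelength (roughly 12 cm at 2.45 GHz), ε'r is the relative permittivity and ε''r the loss factor, which includes the contribution of ionic conductivity. Dissolved salt increases ε''r and so shrinks the penetration depth, concentrating the heating closer to the surface of the food.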

Most telling was the following graph that shows temperature measurements in various spots in a food pouch during microwave heating. Note how the highest temperatures are reached near the filling level (which I think means at the interface between the food in the pouch and the air). Below the filling level, the temperature never rises above the boiling point of water. Wikipedia has values between 130°C and 166°C for the melting point of polypropylene. Given the graph below it seems plausible that a partially dried-out food mixture stuck to the container above the liquid level might heat up enough to melt a spot on the container.

Figure 3 from Analysis of the Causes of Deformation by Yoon et al.

Image by Yoon et al.

In summary, I think spot melting of the plastic described in the Yoon paper seems the most plausible explanation for what I was seeing. Then again, I'm judging this based on my high-school knowledge of chemistry, so there are probably aspects of this question I didn't consider. It's also hard to find anything health- or food-related on the web that appears trustworthy. It would be interesting to try out some experiments to test some of these theories. Whatever the true cause of the damage might be, I thought it was prudent to buy some borosilicate glass containers to replace the polypropylene ones for the time being.

Posted by Tomaž | Categories: Life | Comments »

Experiences in product certification

24.10.2019 9:45

Yesterday I was invited to give a presentation at a seminar on procedures for product certification. My former colleagues at the Department of Communication Systems at the Jožef Stefan Institute invited three companies to share their experiences in getting electronics products to the European market. We discussed compliance with the CE mark, testing for safety and electromagnetic compatibility standards and various other approvals that are required before you can put mass-produced electronics on the shelves.

"Lessons learned" slide from the certification seminar.

In my presentation (slides are here) I discussed my view of the two certification cycles I've participated in at Klevio during roughly the last year and a half. I did a short intro about the company and products and then listed what we chose to certify and how. I didn't do any general intro into the certification procedures and individual measurements. This in the end turned out just fine, because others did it better than I could. I spent most of the time talking about purely practical lessons we learned about how best to prepare for the task. Looking back at it now, I feel like most of my advice boils down to having a good understanding of what is involved and not basing your expectations purely on a certification lab's sales pitches.

I wasn't sure how much time I would have and what the interest of the audience was. Because of that I've put the details on debugging specific compliance problems into a separate part of the presentation. I've ended up doing that part of the talk as well. I discussed a problem we had with an ESD test temporarily disabling an audio codec IC and a problem with radiated emissions that took 3 months to debug and led to some pretty significant changes to one of the DC-DC converters.

I wanted to talk more about some interesting home-brew methods of estimating radiated emissions. I spent a lot of time researching and experimenting with them when I was debugging Klevio's compliance problems. In the end I realized that I could spend a whole talk just on that topic. It also turned out that none of those methods were actually useful in finding a solution, so it didn't make much sense to do more than just list them out. For more info on that, here's an article I found useful on making magnetic loop probes. The idea for the common-mode current probe based on a ferrite ring is from the Application Note AN045 by Richtek. The SDR-based method was my own idea based on my previous research on spectrum sensing and I might eventually do a longer write up on that in the future.

In the end, it was interesting to compare notes and hear what others have learned solving similar problems. I found that even though our EMI problem took longer to solve than the others presented at the seminar, Klevio's experience didn't differ much in regard to timelines or certification lab practicalities. It seems almost everyone stumbles upon some problems during certification, and while the specific issues are unique, the biggest obstacle to finding a solution seems pretty universal: reproducing the problem outside of the certification lab.

Posted by Tomaž | Categories: Life | Comments »

Some more notes on Denon RCD-M41DAB

19.10.2019 16:13

Back in January I bought a new micro Hi-Fi after the transformer in my old one failed and I couldn't repair it. Since then I've been pretty happy using the Denon RCD-M41DAB for listening to music and radio. I previously did some measurements on its output amplifier and found that it meets its ratings in regard to power and distortion. In any case, that was more to satisfy my electrical engineering curiosity and I'm not particularly sensitive to reproduction quality. However, after extended use two problems did pop up. Since I see this model is still being sold, I thought I might do a short follow-up.

Denon RCD-M41DAB

The first thing is that when streaming audio to the Hi-Fi over Bluetooth there are about 2 seconds of lag. It looks as if there's some kind of an audio processing buffer inside the RCD-M41DAB, or it might simply be the choice of the codec used for wireless transfer. When listening to music the delay isn't that annoying. For example, when clicking "next track" or changing volume in the music player on the phone it takes about 2 seconds before you can actually hear the change. However, this lag does make a Bluetooth-connected RCD-M41DAB completely useless as an audio output when playing videos or anything interactive. The lag happens on both iOS and Android and I'm pretty sure it's not related to my smartphone, since there is no discernible lag when I'm using Bluetooth-connected headphones.

The other annoyance is that the CD player has problems reading some CDs in my collection. Apparently it reads the table of contents fine, because the number of tracks is displayed correctly. However it has problems seeking to tracks. For example, one disc won't play any tracks at all and I can just hear the head continuously seeking. Another disc won't start playing the first track, but if I skip it manually, the player will find and play the second one just fine. All such discs play normally in the CD drive in my desktop computer and other CD players I've tried. It still might be that the discs have some kind of an error in them or have bad reflectivity (all of the problematic ones are kind of niche releases I bought from Bandcamp), but other players apparently are able to work around it.

Posted by Tomaž | Categories: Life | Comments »

Double pendulum simulation

16.05.2019 21:05

One evening a few years ago I was sitting behind a desk with a new, fairly powerful computer at my fingertips. The kind of system where you run top and the list of CPUs doesn't fit in the default terminal window. Whatever serious business I was using it for at the time didn't parallelize very well and I felt most of its potential remained unused. I was curious how well the hardware would truly perform if I could throw at it some computing problem better suited for a massively parallel machine.

Somewhere around that time I also stumbled upon a gallery of some nice videos of chaotic pendulums. These were done in Mathematica and simulated a group of double-pendulums with slightly different initial conditions. I really liked the visualization. Each pendulum is given a different color. They first move in sync, but after a while their movements deviate and the line they trace falls apart into a rainbow.

Simulation of 42 double pendulums.

Image by aWakeOfBuzzards

The simulations published by aWakeOfBuzzards included only 42 pendulums. I guess it's a reference to the Hitchhiker's Guide, but I thought, why not do better than that? Would it be possible to eliminate the visual gaps between the traces? Since each pendulum simulation is independent, this looked like exactly the kind of nice, embarrassingly parallel problem I was looking for.

I didn't want to spend a lot of time writing code. This was just another crazy idea and I could only rationalize avoiding more important tasks for so long. Since I couldn't run Mathematica on that machine, I couldn't re-use aWakeOfBuzzards's code and rewriting it to Numpy seemed non-trivial. Nonetheless, I still managed to shamelessly copy most of the code from various other sources on the web. For a start, I found a usable piece of physics simulation code in a Matplotlib example.

aWakeOfBuzzards' simulations simply draw the pendulum traces opaquely on top of each other. It appears that the code draws the red trace last, since when all the pendulums move closely together, all other traces get covered and the trail appears red. I wanted to do better. I had CPU cycles to spare after all.

Instead of rendering animation frames in standard red-green-blue color planes, I worked with wavelengths of visible light. I assigned each pendulum a specific wavelength and added that emission line to the spectrum of each pixel it occupied. Only once I had a complete spectrum for each pixel did I convert it to an RGB tuple. This meant that when all the pendulums were on top of each other, they would be seen as white, since white light is a sum of all wavelengths. When they diverged, the white line would naturally break into a rainbow.
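Roughly, the per-pixel accumulation can be sketched like this (a minimal sketch, not the actual code: the real renderer drew anti-aliased Pillow lines rather than the boolean masks assumed here, and the piecewise-linear wavelength-to-RGB approximation and the brightness normalization are my own simplifications):

import numpy as np

def wavelength_to_rgb(nm):
    # crude piecewise-linear approximation of a spectral color (380-645 nm)
    if 380 <= nm < 440:
        return ((440 - nm) / 60.0, 0.0, 1.0)  # violet to blue
    if 440 <= nm < 490:
        return (0.0, (nm - 440) / 50.0, 1.0)  # blue to cyan
    if 490 <= nm < 510:
        return (0.0, 1.0, (510 - nm) / 20.0)  # cyan to green
    if 510 <= nm < 580:
        return ((nm - 510) / 70.0, 1.0, 0.0)  # green to yellow
    if 580 <= nm < 645:
        return (1.0, (645 - nm) / 65.0, 0.0)  # yellow to red
    return (1.0, 0.0, 0.0)                    # red

def render_frame(masks, wavelengths):
    # masks is a list of boolean (height, width) arrays, one per pendulum,
    # marking the pixels that pendulum occupies in this frame
    acc = np.zeros(masks[0].shape + (3,))
    for mask, nm in zip(masks, wavelengths):
        acc[mask] += wavelength_to_rgb(nm)
    # pixels where many pendulums overlap sum towards white; normalize to
    # the brightest pixel and quantize to 8 bits
    acc /= max(acc.max(), 1e-9)
    return (255 * acc).astype(np.uint8)

A pixel covered by a single pendulum keeps that pendulum's spectral color, while a pixel covered by many of them sums towards white, which is exactly the effect described above.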

Frames from an early attempt at the pendulum simulation.

For parallelization, I simply used a process pool from Python's multiprocessing package with N - 1 worker processes, where N was the number of processors in the system. The worker processes performed the Runge-Kutta integration and returned a list of vertex positions. The master process then rendered the pendulums and wavelength data to an RGB framebuffer by abusing ImageDraw.line from the Pillow library. Since drawing traces behind the pendulums meant that animation frames were no longer independent of each other, I dropped that idea and instead only rendered the pendulums themselves.
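The parallel structure was roughly the following (a sketch under my own assumptions rather than the original script: simulate() is only a placeholder for the actual double pendulum integration, and the initial-condition numbers are made up for illustration):

import multiprocessing
import numpy as np

def simulate(initial_state):
    # placeholder: the equations of motion and the Runge-Kutta stepping
    # are omitted here; the original reused code from a Matplotlib example
    theta1, omega1, theta2, omega2 = initial_state
    n_frames = 900  # 30 seconds at 30 frames per second
    return np.zeros((n_frames, 4))  # (x1, y1, x2, y2) for each frame

if __name__ == '__main__':
    n_pendulums = 1000  # the final video used 200.000 of these
    # initial conditions differ only in the angular velocity of the second
    # segment, drawn from a uniform distribution (the range here is made up)
    omega2 = np.random.uniform(-0.05, 0.05, n_pendulums)
    states = [(np.pi / 2, 0.0, np.pi / 2, w) for w in omega2]
    # leave one CPU for the master process that renders the frames
    with multiprocessing.Pool(multiprocessing.cpu_count() - 1) as pool:
        trajectories = pool.map(simulate, states)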

For 30 seconds of simulation this resulted in an approximately 10 GB binary .npy file with raw framebuffer data. I then used another, non-parallel step that used Pillow and FFmpeg to compress it to a more reasonably sized MPEG video file.
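That step could look something like the following (a sketch of the general approach rather than the code I used back then: it pipes the memory-mapped framebuffer straight into ffmpeg as raw video, while the original went through Pillow, and the file names, frame rate and CRF are only examples):

import subprocess
import numpy as np

# memory-map the raw framebuffer so the 10 GB file is never read at once;
# the shape is assumed to be (frames, height, width, 3) with dtype uint8
frames = np.load('framebuffer.npy', mmap_mode='r')
n_frames, height, width, _ = frames.shape

ffmpeg = subprocess.Popen([
    'ffmpeg', '-y',
    '-f', 'rawvideo', '-pix_fmt', 'rgb24',
    '-s', '%dx%d' % (width, height), '-r', '30',
    '-i', '-',  # raw RGB frames arrive on stdin
    '-c:v', 'libx264', '-crf', '18', '-pix_fmt', 'yuv420p',
    'pendulums.mp4',
], stdin=subprocess.PIPE)

for frame in frames:
    ffmpeg.stdin.write(frame.tobytes())
ffmpeg.stdin.close()
ffmpeg.wait()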

Double pendulum Monte Carlo

(Click to watch Double pendulum Monte Carlo video)

Of course, it took several attempts to fine-tune the various simulation parameters to get the nice-looking result you can find above. This final video is rendered from 200.000 individual pendulum simulations. The initial conditions only differed in the angular velocity of the second pendulum segment, which was chosen from a uniform distribution.

200.000 is not an insanely high number. It manages to blur most of the gaps between the pendulums, but you can still see the cloud occasionally fall apart into individual lines. Unfortunately I didn't note down at the time exactly which bottleneck kept me from going higher than that. Looking at the code now, it was most likely the non-parallel rendering of the final frames. I was also beyond the point of diminishing returns, and probably something like interpolation between the individual pendulum solutions would yield better results than just increasing their number.

I was recently reminded of this old hack I did and I thought I might share it. It was a reminder of a different time and a trip down memory lane to piece the story back together. The project that funded that machine is long since concluded and I now spend evenings behind a different desk. I guess using symmetric multiprocessing was going out of fashion even back then. I would like to imagine that these days someone else is sitting in that lab and wondering similar things next to a GPU cluster.

Posted by Tomaž | Categories: Life | Comments »

Google is eating our mail

25.04.2019 20:06

I've been running a small SMTP and IMAP mail server for many years, hosting a handful of individual mailboxes. It's hard to say when exactly I started. whois says I registered the tablix.org domain in 2005 and I remember hosting a mailing list for my colleagues at the university a bit before that, so I think it's safe to say it's been around 15 years.

Although I don't jump right away on every email-related novelty, I've tried to keep the server up to date with well-accepted standards over the years. Some of these came for free with Debian updates. Others needed some manual work. For example, I have SPF records and DKIM message signing set up on the domains I use. The server is hosted on commercial static IP space (with the very same IP it first went on-line) and I've made sure with the ISP that correct reverse DNS records are in place.

Homing pigeon

Image by Andreas Trepte CC BY-SA 2.5

From the beginning I've been worried that my server would be used for sending spam. So I always made sure I did not have an open relay and put in place throughput restrictions and monitoring that would alert me about unusual traffic. In any case, the amount of outgoing mail has stayed pretty minimal over the years. Since I'm hosting just a few personal accounts these days, there have been fewer than 1000 messages sent to remote servers over SMTP in the last 12 months. I gave up on hosting mailing lists many years ago.

All of this effort paid off and, as far as I'm aware, my server was never listed on any of the public spam blacklists.

So why am I writing all of this? Unfortunately, email is starting to become synonymous with Google's mail, and Google's machines have decided that mail from my server is simply not worth receiving. Being a good administrator and a well-behaved player on the network is no longer enough:

550-5.7.1 [...] Our system has detected that this
550-5.7.1 message is likely unsolicited mail. To reduce the amount of spam sent
550-5.7.1 to Gmail, this message has been blocked. Please visit
550-5.7.1  https://support.google.com/mail/?p=UnsolicitedMessageError
550 5.7.1  for more information. ... - gsmtp

Since mid-December last year, I've been regularly seeing SMTP errors like these. Sometimes the same message re-sent right away will not bounce again. Sometimes rephrasing the subject will fix it. Sometimes all mail from all accounts gets blocked for weeks on end until some lucky bit flips somewhere and mail mysteriously gets through again. Since many organizations use Gmail for mail hosting, this doesn't happen just for ...@gmail.com addresses. Now every time I write a mail I wonder whether Google's AI will let it through or not. Only when something like this happens do you realize just how impossible it is to talk to someone on the modern internet without having Google somewhere in the middle.

Of course, the 550 SMTP error helpfully links to a wholly unhelpful troubleshooting page. It vaguely refers to suspicious looking text and IP history. It points to Bulk Sender Guidelines, but I have trouble seeing myself as a bulk sender with 10 messages sent last week in total. It points to the Postmaster Tools which, after letting me jump through some hoops to authenticate, tells me I'm too small a fish and has no actual data to show.

Screenshot of Google Postmaster Tools.

So far Google has blocked personal messages to friends and family in multiple languages, as well as business mail. I've stopped guessing what text their algorithms deem suspicious. What kind of intelligence sees a reply, with the original message referenced in the In-Reply-To header and partly quoted, and considers it unsolicited? I don't discount the possibility that there is something misconfigured at my end, but since Google gives no hint and the various third-party tools I've tried don't report anything suspicious, I've run out of ideas where else to look.

My server isn't alone in having this problem. At work we use Google's mail hosting and I've seen this trigger-happy filter from the other end. Just recently I overlooked an important mail because it ended up in the spam folder. I guess it was pure luck it didn't get rejected at the SMTP layer. With my work email address I'm subscribed to several mailing lists of open source software projects, and Google will regularly decide to block this traffic. I know because Mailman sends me a notification that my address caused excessive bounces. What system decides, after months of watching me read these messages and not once seeing me mark one as spam, that I suddenly don't want to receive them ever again?

Screenshot of the mailing list probe message.

I wonder. Google as a company is famously focused on machine learning through automated analytics and a bare minimum of human contact. What kind of a signal can they possibly use to train these SMTP rejects? Mail gets rejected at the SMTP level without the recipient's knowledge. There is no way for a recipient to mark it as not-spam, since they don't know the message ever existed. In contrast to merely classifying mail into spam/non-spam folders, it's impossible for an unprivileged human to tell the machine it has made a mistake. Only the sender knows the mail got rejected and they don't have any way to report it either. One half of the feedback loop appears to be missing.

I'm sure there is no malicious intent behind this and that there are some very smart people working on spam prevention at Google. However, for a metric-driven company where the majority of messages only pass within the walled garden, I can see how there's little motivation to work well with mail coming from outside. If all the training data is people marking external mail as spam and there's much less data about false positives, I guess it's easy to arrive at a prior that all external mail is spam, even with the best intentions.

This is my second rant about Google in a short while. I'm mostly indifferent to their search index policies; this mail problem, however, is much more frustrating. I can switch search engines, but I can't tell other people to go off Gmail. Email used to work, from its 7-bit days onward. It was the one standard thing you could rely on in the ever-changing mess of messaging web apps and proprietary lock-ins. And now it's increasingly broken. I hope people realize that if they don't get a reply, perhaps it's because some machine somewhere decided for them that they don't need to know about it.

Posted by Tomaž | Categories: Life | Comments »

Wacom Cintiq 16 on Debian Stretch

15.02.2019 18:47

Wacom Cintiq 16 (DTK-1660/K0-BX) is a drawing tablet that was announced late last year. At around 600€ I believe it's now the cheapest tablet with a display you can buy from Wacom. For a while I've been curious about these things, but the professional Wacoms were just way too expensive and I wasn't convinced by the cheaper Chinese alternatives. I've been using a small Intuos tablet for several years now and Linux support has been flawless from the start. So when I recently saw the Cintiq 16 on sale at a local reseller and heard there was a good chance it would work similarly well on Linux, I couldn't resist buying one.

Wacom Cintiq 16 with GIMP running on it.

Even though it's a very recent device, the Linux kernel already includes a compatible driver. Regardless, I was prepared for some hacking before it was usable on Debian. I was surprised how little was actually required. As before, the Linux Wacom Project is doing an amazing job, even though Linux support is not officially acknowledged by Wacom as far as I know.

The tablet connects to the computer using two cables: HDMI and USB. The HDMI connection behaves just like a normal 1920x1080 60 Hz flat panel monitor and the display works even if support for everything else is missing on the computer side. The HDMI cable also carries an I2C connection that can be used to adjust settings that you would otherwise expect in a menu accessible by buttons on the side of a monitor (aside from a power button, the Cintiq itself doesn't have any buttons).

After loading the i2c-dev kernel module, ddcutil version 0.9.4 correctly recognized the display. The i2c-2 bus interface in the example below goes through the HDMI cable and was in my case provided by the radeon driver:

$ ddcutil detect
Display 1
   I2C bus:             /dev/i2c-2
   EDID synopsis:
      Mfg id:           WAC
      Model:            Cintiq 16
      Serial number:    ...
      Manufacture year: 2018
      EDID version:     1.3
   VCP version:         2.2
$ ddcutil --bus=2 capabilities
MCCS version: 2.2
Commands:
   Command: 01 (VCP Request)
   Command: 02 (VCP Response)
   Command: 03 (VCP Set)
   Command: 06 (Timing Reply )
   Command: 07 (Timing Request)
   Command: e3 (Capabilities Reply)
   Command: f3 (Capabilities Request)
VCP Features:
   Feature: 02 (New control value)
   Feature: 04 (Restore factory defaults)
   Feature: 05 (Restore factory brightness/contrast defaults)
   Feature: 08 (Restore color defaults)
   Feature: 12 (Contrast)
   Feature: 13 (Backlight control)
   Feature: 14 (Select color preset)
      Values:
         04: 5000 K
         05: 6500 K
         08: 9300 K
         0b: User 1
   Feature: 16 (Video gain: Red)
   Feature: 18 (Video gain: Green)
   Feature: 1A (Video gain: Blue)
   Feature: AC (Horizontal frequency)
   Feature: AE (Vertical frequency)
   Feature: B2 (Flat panel sub-pixel layout)
   Feature: B6 (Display technology type)
   Feature: C8 (Display controller type)
   Feature: C9 (Display firmware level)
   Feature: CC (OSD Language)
      Values:
         01: Chinese (traditional, Hantai)
         02: English
         03: French
         04: German
         05: Italian
         06: Japanese
         07: Korean
         08: Portuguese (Portugal)
         09: Russian
         0a: Spanish
         0d: Chinese (simplified / Kantai)
         14: Dutch
         1e: Polish
         26: Unrecognized value
   Feature: D6 (Power mode)
      Values:
         01: DPM: On,  DPMS: Off
         04: DPM: Off, DPMS: Off
   Feature: DF (VCP Version)
   Feature: E1 (manufacturer specific feature)
      Values: 00 01 02 (interpretation unavailable)
   Feature: E2 (manufacturer specific feature)
      Values: 01 02 (interpretation unavailable)
   Feature: EF (manufacturer specific feature)
      Values: 00 01 02 03 (interpretation unavailable)
   Feature: F2 (manufacturer specific feature)

I didn't play much with these settings, since I found the factory setup sufficient. I did try out setting white balance and contrast and the display responded as expected. For example, to set the white balance to 6500K:

$ ddcutil --bus=2 setvcp 14 5

Behind the USB connection is a hub and two devices connected to it:

$ lsusb -s 1:
Bus 001 Device 017: ID 0403:6014 Future Technology Devices International, Ltd FT232H Single HS USB-UART/FIFO IC
Bus 001 Device 019: ID 056a:0390 Wacom Co., Ltd 
Bus 001 Device 016: ID 056a:0395 Wacom Co., Ltd 
$ dmesg
...
usb 1-6: new high-speed USB device number 20 using ehci-pci
usb 1-6: New USB device found, idVendor=056a, idProduct=0395, bcdDevice= 1.00
usb 1-6: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 1-6: Product: Cintiq 16 HUB
usb 1-6: Manufacturer: Wacom Co., Ltd.
hub 1-6:1.0: USB hub found
hub 1-6:1.0: 2 ports detected
usb 1-6.2: new high-speed USB device number 21 using ehci-pci
usb 1-6.2: New USB device found, idVendor=0403, idProduct=6014, bcdDevice= 9.00
usb 1-6.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 1-6.2: Product: Single RS232-HS
usb 1-6.2: Manufacturer: FTDI
ftdi_sio 1-6.2:1.0: FTDI USB Serial Device converter detected
usb 1-6.2: Detected FT232H
usb 1-6.2: FTDI USB Serial Device converter now attached to ttyUSB0
usb 1-6.1: new full-speed USB device number 22 using ehci-pci
usb 1-6.1: New USB device found, idVendor=056a, idProduct=0390, bcdDevice= 1.01
usb 1-6.1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 1-6.1: Product: Cintiq 16
usb 1-6.1: Manufacturer: Wacom Co.,Ltd.
usb 1-6.1: SerialNumber: ...
input: Wacom Cintiq 16 Pen as /devices/pci0000:00/0000:00:1a.7/usb1/1-6/1-6.1/1-6.1:1.0/0003:056A:0390.000D/input/input62
on usb-0000:00:1a.7-6.1/input0

056a:0395 is the hub and 056a:0390 is the Human Interface Device class device that provides the actual pen input. When the tablet is off but connected to power, the HID device disconnects but the other two USB devices are still present on the bus. I'm not sure what the UART is for. This thread on GitHub suggests that it offers an alternative way of interfacing with the I2C bus for adjusting the display settings.

On the stock 4.9.130 kernel that comes with Stretch the pen input wasn't working. However, the 4.19.12 kernel from stretch-backports correctly recognizes the devices and as far as I can see works perfectly.

$ cat /proc/version
Linux version 4.19.0-0.bpo.1-amd64 (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)) #1 SMP Debian 4.19.12-1~bpo9+1 (2018-12-30)

I'm using the stock GNOME 3.22 desktop that comes with Stretch. After upgrading the kernel, the tablet correctly showed up in the Wacom Tablet panel in gnome-control-center. Pen settings there also work and I was able to calibrate the input using the Calibrate button. The calibration procedure instructs you to press the pen onto targets shown on the display and looks exactly like the one in the official software on Mac OS.

Update: It seems that GNOME sometimes messes up the tablet-to-screen mapping. If the stylus suddenly starts to move the cursor on some other monitor instead of underneath the stylus on the Wacom display, check the key /org/gnome/desktop/peripherals/tablets/056a:0390/display in dconf-editor. It should contain something like ['WAC', 'Cintiq 16', '...']. If it doesn't, delete the key and restart gnome-settings-daemon.

Wacom Tablet panel in the GNOME Control Center.

In GIMP I had to enable the new input devices under Edit, Input Devices and set them to Screen mode, same as with other tablets. You get two devices: one for the pen tip and another one for when you turn the pen around and use the other eraser-like end. These two behave like independent devices in GIMP, so each remembers its own tool settings.

Configure Input Devices dialog in GIMP.

As far as I can see, pen pressure, tilt angle and the two buttons work correctly in GIMP. The only problem I had is that it's impossible to do a right-click on the canvas to open a menu. This was already unreliable on the Intuos and I suspect it has to do with the fact that the pen always moves slightly when you press the button (so GIMP registers click-and-drag rather than a click). With the higher resolution of the Cintiq it makes sense that it's even harder to hold the pen still enough.

Otherwise, GIMP works mostly fine. I found GNOME to be a bit stubborn about where it wants to place new dialog windows. If I have a normal monitor connected alongside the tablet, file and color chooser dialogs often end up on the monitor instead of the tablet. Since I can't use the pen to click on something on the monitor, it forces me to reach for the mouse, which can be annoying.

I've noticed that GIMP sometimes lags behind the pen, especially when dragging the canvas with the middle-click. I didn't notice this before, but I suspect it has to do with the higher pen resolution or update rate of the Cintiq. The display also has a higher resolution than my monitor, so there are more pixels to push around. In any case, my i5-750 desktop computer will soon be 10 years old and is way overdue for an upgrade.


In conclusion, I'm very happy with it even if it was quite an expensive gadget to buy for an afternoon hobby. After a few weeks I'm still getting used to drawing on the screen and tweaking the tool dynamics in GIMP. The range of pressures the pen registers feels much wider than on the Intuos, although that is probably very subjective. To my untrained eyes the display looks just amazing. The screen is actually bigger than I thought and since it's not easily disconnectable it is forcing me to rethink how to organize my desk. In the end however, my only worry is that my drawing skills are often not on par with owning such a powerful tool.

Posted by Tomaž | Categories: Life | Comments »

Google index coverage

08.02.2019 11:57

It recently came to my attention that Google has a new Search Console where you can see the status of your web site in Google's search index. I checked out what it says for this blog and I was a bit surprised.

Some things I expected, like the number of pages I've blocked in the robots.txt file to prevent crawling (however I didn't know that blocking a URL there means it can still appear in search results). Other things were weirder, like this old post being recognized as a soft 404 (Not Found) response. My web server is properly configured and quite capable of sending correct HTTP response codes, so ignoring standards in that regard is just craziness on Google's part. But the thing that caught my eye the most was the number of Excluded pages on the Index Coverage pane:

Screenshot of the Index Coverage pane.

Considering that I have less than a thousand published blog posts this number seemed high. Diving into the details, it turned out that most of the excluded pages were redirects to canonical URLs and Atom feeds for post comments. However, at least 160 URLs were permalink addresses of actual blog posts (there may be more, because the CSV export only contains the first 1000 URLs).

Index coverage of blog posts versus year of publication.

All of these were in the "crawled, not indexed" category. In their usual hand-waving way, Google describes this as:

The page was crawled by Google, but not indexed. It may or may not be indexed in the future; no need to resubmit this URL for crawling.

I read this as "we know this page exists, there's no technical problem, but we don't consider it useful to show in search results". The older the blog post, the more likely it was to be excluded. Google's index apparently contains only around 60% of my content from 2006, but 100% of that published in the last couple of years. I've tried searching for some of these excluded blog posts and indeed they don't show up in the results.

I have no intention of complaining about my early writings not being shown to Google's users. As long as my web site complies with generally accepted technical standards I'm happy. I write about things that I find personally interesting and what I earnestly believe might be useful information in general. I don't feel entitled to be shown in Google's search results, and what they include in their index or not is their own business.

That said, it did make me think. I'm using Google Search almost exclusively to find information on the web. I suspected that they heavily prioritize new over old, but I had never seriously considered that Google might be intentionally excluding parts of the web from their index altogether. I often hear the sentiment that the old web is disappearing. That the long tail of small websites is as good as gone. Some old one-person web sites may indeed be gone for good, but as this anecdote shows, some such content might just not be discoverable through Google.

All this made me switch my default search engine in Firefox to DuckDuckGo. Granted, I don't know what they include or exclude from their search either. I have yet to see how well it works, but maybe it isn't such a bad idea to go back to the time when trying several search engines for a query was standard practice.

Posted by Tomaž | Categories: Life | Comments »

Jahresrückblick

19.01.2019 20:07

And so another year rushed by. In the past twelve months I've published 19 blog posts, written around 600 notebook pages and read 13 books.

Perhaps the largest change last year was that I left my position at the Department of Communication Systems at the Jožef Stefan Institute. After 7 years behind the same desk I really needed a change of environment, and I had already switched to working only part-time the previous fall. This year I only planned to handle closing work on an EU project that was spinning down. I found it hard to do meaningful research work while working on other things most of the week anyway.

"Kein Jammern" poster.

Even though I led my project work to official (and successful) completion I feel like I left a lot of things undone there. There are interesting results left unpublished, hardware forgotten, and not least my PhD work, which got kind of derailed over the course of the last project and was left in some deep limbo with no clear way of getting out. I thought that by stepping away from it all for a few months I would get a clearer perspective on what exactly I want to do with all of this. I miss research work, I don't miss the politics and I have yet to come to any conclusion about if and how to proceed.

As a kind of ironic twist, I also unexpectedly got first authorship of a scientific paper last year. According to git log, it took almost exactly 4 years and around 350 commits and it tells a story quite unlike what I initially had in my mind. After countless rejections from various journals I basically gave up on the last submission going through. It was accepted for publication pending more experimental work, which caused a crazy month of hunting down all the same equipment from years ago, spending weekends and nights in the lab and writing up the new results.

A history of the number of journal pages written per month.

I spent most of the rest of my work days at Klevio wearing an electrical engineer's hat. Going from a well equipped institute back to a growing start-up brought new challenges. I was doing some programming and a lot of electronics design and design for manufacture, a field I barely touched with my electronics work at the Institute. In contrast to my previous RF work, here I brushed up on my analog audio knowledge and acoustics. I discovered the joy of meeting endless electromagnetic compatibility requirements for consumer devices.

Not surprisingly, after this I was not doing a lot of electronics in my spare time. I have several hardware projects in a half-finished state still left over from a year ago. I wish to work more on them this year and hopefully also write up some interesting blog posts. Similarly, I was not doing a lot of open source work, apart from some low-effort maintenance of my old projects. Giving my talk about developing GIMP plug-ins was an absolute pleasure and definitely the most fun presentation to prepare last year.

Dapper

Drawing has remained my favorite pastime and a way to fight anxiety, although I sometimes feel conflicted about it. I did countless sketches and looking back I'm happy to see my drawing has improved. I made my first half-way presentable animated short. It was nice to do such a semi-long project from start to completion, although it sometimes started to feel too much like yet another afternoon job. I have some more ideas and with everything I learned last year I think it would be fun to try my hand at animating something more original, if only I could manage a more relaxed schedule for it.

All in all, looking back at my notes suggests it wasn't such a bad year. Except maybe December, which tends to be the most depressing month for me anyway. As last time, I'm not making any big plans for this year. I'm sure it will be again too short to clear out my personal backlog of interesting things to do and everything else that will want to happen before 2020. I only hope to waste less of it on various time-sinks like Hacker News and other addictive brain candy web sites. These seem to be starting to eat up my days despite my trying to keep my distance.

Posted by Tomaž | Categories: Life | Comments »

Going to 35c3

12.12.2018 19:03

Just a quick note that I've managed to get a ticket for this year's Chaos Communication Congress. So I'll be in Leipzig at the end of the month, in case anyone wants to have a chat. I don't have anything specific planned yet. I haven't been at the congress since 2015 and I don't know how much has changed in the past years since it moved from Hamburg. I've selected some interesting talks on Halfnarp, but I have a tentative plan to spend more time talking with people than camping in the lecture halls. Drop me an email or possibly tweet to @avian2 if you want to meet.

35c3 refreshing memories logo.

Posted by Tomaž | Categories: Life | Comments »

Notes on using GIMP and Kdenlive for animation

08.12.2018 12:01

This summer I made a 60-second hand-drawn animation using GIMP and Kdenlive. I thought I might write down some notes about the process I used and the software problems I had to work around. Perhaps someone else considering a similar endeavor will find them useful.

For the sake of completeness, the hardware I used for image and video editing was a Debian GNU/Linux system with 8 GB of RAM and an old quad-core i5 750 CPU. For drawing I used a 7" Wacom Intuos tablet. For some rough sketches I also used an old iPad Air 2 with Animation Creator HD and a cheap rubber finger-type stylus. Animation Creator was the only proprietary software I used.

Screenshot of the GIMP Onion Layers plug-in.

I drew all the final animation cels in GIMP 2.8.18, as packaged in Debian Stretch. I used my Onion Layers plug-in extensively and added several enhancements to it as work progressed: it now has the option to slightly color-tint the next and previous frames to emphasize which direction the outlines are moving in. I also added new shortcuts for adding a layer to all frames and adding a new frame. I found that GIMP will quite happily work with images with several hundred layers at 1080p resolution and in general I didn't encounter any big problems with it. The only thing I missed was an easier way to preview the animation with proper timings, although flipping through frames by hand using keyboard shortcuts worked quite well.

For some of the trickier sequences I used the iPad to draw initial sketches. The rubber stylus was much too imprecise to do final outlines with it, but I found drawing directly on the screen more convenient for experimenting. I later imported those sketches into GIMP using my Animation Creator import plug-in and then drew over them.

Comparison of different methods of coloring cels in GIMP.

A cel naively colored using Bucket Fill (left), manually using the Paintbrush Tool (middle) and using the Color Layer plug-in (right).

I've experimented quite a bit with how to efficiently color in the finished outlines. Normal Bucket Fill leaves transparent pixels if used on soft anti-aliased outlines. The advice I got was to color the outlines manually with the Paintbrush Tool, but I found that much too time-consuming. In the end I made a quick-and-ugly plug-in that helped automate the process somewhat. It used thresholding to create a sharp outline on a layer beneath the original outline. I then used Bucket Fill on that layer. This somewhat preserved the quality of the initial outline.
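The plug-in isn't reproduced here, but the core of the approach could be sketched in GIMP 2.8 Python-Fu along these lines (a rough sketch only, not the actual Color Layer plug-in; it leans on the stock Threshold Alpha plug-in to harden the copied outline, and the procedure name and menu path are made up):

from gimpfu import *

def add_fill_layer(image, drawable):
    # put a hard-edged copy of the outline layer directly beneath it, so
    # Bucket Fill can be used on the copy without leaving semi-transparent
    # gaps, while the soft original outline stays on top
    outline = image.active_layer
    fill = pdb.gimp_layer_copy(outline, TRUE)
    fill.name = outline.name + " fill"
    position = pdb.gimp_image_get_item_position(image, outline)
    pdb.gimp_image_insert_layer(image, fill, None, position + 1)
    # turn the soft anti-aliased alpha channel into a hard 0/255 mask
    pdb.plug_in_threshold_alpha(image, fill, 127)

register(
    "python-fu-add-fill-layer",
    "Add a hard-edged fill layer under the active outline layer",
    "Add a hard-edged fill layer under the active outline layer",
    "", "", "2018",
    "<Image>/Filters/Animation/Add fill layer",
    "RGBA",
    [], [],
    add_fill_layer)

main()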

I mostly used one GIMP XCF file per scene for the cels and one file for the backgrounds (some scenes had multiple planes of backgrounds for parallax scrolling). I exported the individual cels using Export Layers into transparent-background PNG files. For the scene where I had two characters I later wished I had used one XCF file per character, since that would have made it easier to adjust timing.

I imported the background and cels into Kdenlive. Backgrounds were simple static Image Clips. For cels I mostly used Slideshow Clips with a small frame duration (e.g. a 3-frame duration for 8 frames per second). For some scenes I instead imported the individual cels separately as images and dragged them manually onto the timeline when I wanted to adjust the timing of individual cels. That was quite time-consuming. The cels and backgrounds were composited using a large number of Composite & Transform transitions.

Kdenlive timeline for a single scene.

I was first using Kdenlive 16.12.2 as packaged by Debian but later found it too unstable. I switched to using the 18.04.1 AppImage release from the Kdenlive website. Switching versions was painful, since the project file didn't import properly in the newer version. Most transitions were wrong, so I had to redo much of the editing process.

I initially wanted to do everything in one monolithic Kdenlive project, but this proved increasingly inconvenient as work progressed. Even 18.04.1 was getting unstable with a huge number of clips and tracks on the timeline. I was also having trouble getting nice-looking dissolves between scenes that involved multiple tracks: sometimes seemingly independent Composite & Transforms affected each other in unpredictable ways. So in the end I mostly settled on one Kdenlive project per scene. I rendered each scene to lossless H.264 and then imported the rendered scenes into a master Kdenlive project for the final edit.

Regarding Kdenlive stability, I had the feeling that it doesn't like Color Clips for some reason, or transitions to blank tracks. I'm not sure if there's really some bug or it was just my imagination (I couldn't reliably reproduce any crash that I encountered), but it seemed that the frequency of crashes went down significantly when I put one track with an all-black PNG at the bottom of my projects.

In general, the largest problem I had with Kdenlive was an issue with scaling. My Kdenlive project was 720p, but all my cels and backgrounds were in 1080p. It appeared that Composite & Transform would sometimes use 720p screen coordinates and sometimes 1080p coordinates in unpredictable ways. I think the renderer implicitly scales the bottom-most track down to the project size if at some point in time it sees a frame larger than the project size. In the end I couldn't figure out the exact logic behind it and had to resort to experimenting until it worked. Having separate, simpler project files for each scene helped significantly here.

Comparison of Kdenlive scaling quality.

Frame scaled explicitly from 1080p to 720p using Composite & Transform (left) and scaled implicitly during rendering (right).

Another thing I noticed was that the implicit scaling of video tracks seemed to use a lower-quality scaling algorithm than Composite & Transform, resulting in annoying visible changes in image quality. In the end, I forcibly scaled all tracks to 720p using an extra Composite & Transform, even where one was not strictly necessary.

Comparison of luminosity on transition to black using Composite & Transform and Fade to black.

Comparison of luminosity during a one second transition from a dark scene to black between Fade to black effect and Composite & Transform transition. Fade to black reaches black faster than it should, but luminosity does not jump up and down.

I was initially doing transitions between scenes with Composite & Transform, because I found adjusting the alpha values through keyframes more convenient than setting the lengths of Fade to/from black effects. However, Composite & Transform seems to have some kind of rounding issue, and transitions made with it showed a lot of banding and flickering in the final render. In the end I switched to Dissolves and Fade to/from black, which looked better.

Finally, a few minor details I learned about video encoding. The H.264 quality setting in Kdenlive (CRF) is inverted: lower values mean higher quality. H.264 uses the YUV color space, while my drawings were in RGB (GIMP and PNG only support RGB), and after rendering the final video some shades of gray got a slight green tint due to the color space conversion in Kdenlive. As far as I know there is no way around that. In any case, it was quite minor and I was assured that I only noticed it because I had been staring at these drawings for so long.

"Home where we are" title card.

How to sum this up? Almost every artist I talked with recommended using proprietary animation software and was surprised when I told them what I was using. I think that is reasonable advice in general (although the prices of such software seem anything but reasonable for a summer project). I was happy to spend some evenings writing code and learning GIMP internals instead of drawing, but I'm pretty sure that's more an exception than the rule.

There were certainly some annoyances that made me doubt my choice of software. Redoing all the editing after switching Kdenlive versions was one of those. Others were perhaps just me being too focused on minor technical details. In any case, I remember doing some projects with Cinelerra many years ago and I think Kdenlive is quite a significant improvement over it in terms of user interface and stability. Of course, neither GIMP nor Kdenlive were specifically designed for this kind of work (though I've heard Krita has gained native support for animation, so it might be worth checking out). If anything, the fact that it was possible for me to do this project shows how flexible open source tools can be, even when they are used in ways they were not meant for.

Posted by Tomaž | Categories: Life | Comments »

A summer animation project

30.11.2018 22:19

In December last year I went to an evening workshop on animation that was part of the Animateka festival in Ljubljana. It was fascinating to hear how a small group of local artists made a short animated film, from writing the script and making a storyboard to the final video. Having previously experimented with drawing a few seconds' worth of animation in GIMP, I was tempted to go through this entire process at least once and try to animate some kind of story. Not that I hoped to make anything remotely on the level of what was presented there; they were professionals winning awards, I just wanted to do something for fun.

Unfortunately I was unable to attend the following workshops to learn more. At the time I was working for the Institute, wrapping up final months of an academic project, and at the same time just settling into an industry job. Between two jobs it was unrealistic to find space for another thing on my schedule.

I played with a few ideas here and there and thought I might prepare something to discuss at this year's GalaCon. During the Easter holidays I took a week off and, being somewhat burned out, I didn't have anything planned. So I used the spare time to step away from the life of an electrical engineer and come up with a short script based on a song I had stumbled upon on YouTube some weeks before. I tried to be realistic and stick as much as possible to things I felt confident I could actually draw. By the end of the week I had a very rough 60-second animatic that I nervously shared around.

One frame of the animatic.

At the time I doubted I would actually do the final animation. I was encouraged by the responses I got to the script, but I hadn't previously taken notes on how much time it took me to draw one frame of animation, so I wasn't sure how long a project like this would take to complete. It just looked like an enormous pile of cels to draw. And then I found a mail in my inbox saying that the scientific paper I had been unsuccessfully trying to publish for nearly 4 years had been accepted pending a major revision, and everything else was put on hold. It was another mad dash to repeat the experiments, process the measurements and catch the re-submission deadline.

By June my work at the Institute had ended and I felt very tired and disappointed and wanted to do something different. With a part of my week freed up, I decided to spend the summer evenings working on the animation project I had come up with during Easter. I went through some on-line tutorials to refresh my knowledge and, on a recommendation, bought the Animator's Survival Kit book, which naturally flew completely over my head. By August I was able to bother some con-goers in Ludwigsburg with a more detailed animatic and one fully animated scene. I was very grateful for the feedback I got there and found it encouraging that people were seeing some sense in all of it.

Wall full of animation scraps.

At the end of the summer I had the wall above my desk full of scraps and of course I was nowhere near finished. I had underestimated how much work was needed and was too optimistic in thinking that I would be able to dedicate more than a few hours per week to the project. I scaled back my expectations a bit and made a new, more elaborate production spreadsheet. The spreadsheet said I would finish in November, which in the end turned out to be a rather good estimate: the final render landed on my hard disk on the evening of 30 October.

Production spreadsheet for my animation project.

So here it is: a 60-second video that is the result of about 6 months of on-and-off evening work by a complete amateur. Looking at it one month later, it has an unoriginal character design obviously inspired by the My Little Pony franchise; I guess not enough to appeal to fans of that show and enough to repel everybody else. There's a cheesy semblance of a story. But it means surprisingly much to me, and I probably would never have finished it if I had gone with something more ambitious and original.

I've learned quite a lot about animation and video editing. I also wrote quite a bit of code for making this kind of animation possible using a completely free software stack and when time permits I plan to write another post about the technical details. Perhaps some part of my process can be reused by someone with a bit more artistic talent. In the end it was a fun, if eventually somewhat tiring, way to blow off steam after work and reflect on past decisions in life.

Home where we are

(Click to watch Home where we are video)

Posted by Tomaž | Categories: Life | Comments »

Analyzing PIN numbers

13.10.2018 12:17

Since I already had a dump from haveibeenpwned.com on my drive from my earlier password check, I thought I could use this opportunity to do some more analysis on it. Six years ago DataGenetics blog posted a detailed analysis of 4-digit numbers that were found in password lists from various data breaches. I thought it would be interesting to try to reproduce some of their work and see if their findings still hold after a few years and with a significantly larger dataset.

DataGenetics didn't specify the source of their data, except that it contained 3.4 million four-digit combinations. Guessing from the URL, their analysis was published in September 2012. I've done my analysis on the pwned-passwords-ordered-by-hash.txt file downloaded from haveibeenpwned.com on 6 October (inside the 7-Zip archive the file had a timestamp of 11 July 2018, 02:37:47). The file contains 517,238,891 SHA-1 hashes with associated frequencies. By searching for the SHA-1 hashes that correspond to the 4-digit numbers from 0000 to 9999, I found that all of them were present in the file. The total sum of their frequencies was 14,479,676 (see my previous post for the method I used to search the file). My dataset was hence roughly 4 times the size of DataGenetics'.
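
For reference, the per-PIN frequencies can be collected roughly like this, reusing the look-based binary search described in my previous post (the helper name and the hard-coded file name here are only for illustration):

import hashlib
import subprocess

DUMP = "pwned-passwords-ordered-by-hash.txt"

def lookup_frequency(s):
    # Return the number of occurrences of the string s in the dump, 0 if absent.
    h = hashlib.sha1(s.encode('ascii')).hexdigest().upper()
    try:
        out = subprocess.check_output(["./look", "-b", "%s:" % (h,), DUMP])
    except subprocess.CalledProcessError:
        return 0
    return int(out.split(b':')[1])

freqs = {"%04d" % n: lookup_frequency("%04d" % n) for n in range(10000)}
total = sum(freqs.values())
print("total:", total)

# The 20 most common PINs with their relative frequencies.
for pin, f in sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)[:20]:
    print("%s %.1f%%" % (pin, 100.0 * f / total))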

Here are the top 20 most common numbers appearing in the dump, compared to the rank on the top 20 list from DataGenetics:

rank (new)   rank (old)   PIN    frequency
     1            1       1234      8.6%
     2            2       1111      1.7%
     3            -       1342      1.1%
     4            3       0000      1.0%
     5            4       1212      0.5%
     6            8       4444      0.4%
     7            -       1986      0.4%
     8            5       7777      0.4%
     9           10       6969      0.4%
    10            -       1989      0.4%
    11            9       2222      0.3%
    12           13       5555      0.3%
    13            -       2004      0.3%
    14            -       1984      0.2%
    15            -       1987      0.2%
    16            -       1985      0.2%
    17           16       1313      0.2%
    18           11       9999      0.2%
    19           17       8888      0.2%
    20           14       6666      0.2%

This list looks similar to the results published by DataGenetics. The first two PINs are the same, but the distribution is a bit less skewed: in their results the four most popular PINs accounted for 20% of all PINs, while here they only make up about 12%. It also seems that numbers that look like years (1986, 1989, 2004, ...) have become more popular; the only two such numbers in their top 20 list were 2000 and 2001.
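
The share of the top PINs is easy to reproduce from the freqs dictionary in the sketch above:

# Share of all hits covered by the four most common PINs.
top4 = sum(sorted(freqs.values(), reverse=True)[:4])
print("top 4 PINs: %.0f%% of all hits" % (100.0 * top4 / total))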

Cumulative frequency of PINs

DataGenetics found that the number 2580 ranked highly, in position 22. They concluded that this is an indicator that a lot of these PINs were originally devised on devices with numerical keypads such as ATMs and phones (on those keypads, 2580 runs straight down the middle column of keys), even though the source of their data were compromised websites, where users would more commonly use a 104-key keyboard. In the haveibeenpwned.com dataset 2580 ranks at position 65, so slightly lower, but still in the top quarter by cumulative frequency.

Here are the 20 least common numbers appearing in the dump, again compared to their rank on the bottom 20 list from DataGenetics:

rank (new)   rank (old)   PIN    frequency
  9981            -       0743   0.00150%
  9982            -       0847   0.00148%
  9983            -       0894   0.00147%
  9984            -       0756   0.00146%
  9986            -       0934   0.00146%
  9985            -       0638   0.00146%
  9987            -       0967   0.00145%
  9988            -       0761   0.00144%
  9989            -       0840   0.00142%
  9991            -       0835   0.00141%
  9990            -       0736   0.00141%
  9993            -       0742   0.00139%
  9992            -       0639   0.00139%
  9994            -       0939   0.00132%
  9995            -       0739   0.00129%
  9996            -       0849   0.00126%
  9997            -       0938   0.00125%
  9998            -       0837   0.00119%
  9999          9995      0738   0.00108%
 10000            -       0839   0.00077%

Not surprisingly, most numbers don't appear in both lists. Since these have the lowest frequencies, even the smallest changes significantly alter the ordering. The least common number in DataGenetics' dump, 8068, here ranks in place 9302, so still pretty much at the bottom. I guess not many people choose their PINs after the iconic Intel CPU.

Here is a grid plot of the distribution, drawn in the same way as in DataGenetics' post. The vertical axis shows the right two digits and the horizontal axis the left two digits. The color shows the relative frequency on a log scale (blue - least frequent, yellow - most frequent).

Grid plot of the distribution of PINs
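
A similar plot can be drawn with matplotlib from the freqs dictionary in the earlier sketch (the colormap here is an arbitrary blue-to-yellow choice):

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm

# Rows are the right two digits, columns the left two digits.
grid = np.zeros((100, 100))
for pin, f in freqs.items():
    grid[int(pin[2:]), int(pin[:2])] = f

plt.imshow(grid, origin='lower', norm=LogNorm(), cmap='viridis')
plt.xlabel('left two digits')
plt.ylabel('right two digits')
plt.colorbar(label='frequency')
plt.show()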

Many of the same patterns discussed in the DataGenetics' post are also visible here:

  • The diagonal line shows popularity of PINs where left two and right two digits repeat (pattern like ABAB), with further symmetries superimposed on it (e.g. AAAA).

  • The area in lower left corner shows numbers that can be interpreted as dates (vertically MMDD and horizontally DDMM). The resolution is good enough that you can actually see which months have 28, 30 or 31 days.

  • The strong vertical lines at 19 and 20 show numbers that can be interpreted as years. The 2000s are more common in this dump. Not surprising, since we're further into the 21st century than when DataGenetics' analysis was done.

  • Interestingly, there is a significant shortage of numbers that begin with 0, which can be seen as a dark vertical stripe on the left. A similar pattern can be seen in DataGenetics' dump although they don't comment on it. One possible explanation would be if some proportion of the dump had gone through a step that stripped leading zeros (such as a conversion from string to integer and back, maybe even an Excel table?).

In conclusion, the findings from DataGenetics' post still mostly seem to hold. Don't use 1234 for your PIN. Don't choose numbers that have symmetries in them or have years or dates in them. These all significantly increase the chances that someone will be able to guess them. And of course, don't re-use your SIM card or ATM PIN as a password on websites.

Another thing to note: DataGenetics concluded that their analysis was only possible because of leaks of clear-text passwords. However, PINs have a very small search space of only 10000 possible combinations, so it was trivial to perform this analysis even though the haveibeenpwned.com dump only provides SHA-1 hashes and not clear-text passwords. With a warmed-up disk cache, the binary search took only around 30 seconds for all 10000 combinations.

Posted by Tomaž | Categories: Life | Comments »

Checking passwords against haveibeenpwned.com

07.10.2018 11:34

haveibeenpwned.com is a useful website that aggregates database leaks with user names and passwords. They have an on-line form where you can check whether your email has been seen in publicly available database dumps. The form also tells you in which dumps they have seen your email. This gives you a clue which of your passwords has been leaked and should be changed immediately.

For a while now haveibeenpwned.com has reported that my email appears in a number of combined password and spam lists. I didn't find that surprising: my email address is publicly displayed on this website and is routinely scraped by spammers. If there was any password associated with it, I assumed it had come from the 2012 LinkedIn breach. I knew my old LinkedIn password had been leaked, since the scam mails I get commonly mention it.

Screenshot of a scam email.

However, it came to my attention that some email addresses are in the LinkedIn leak, but not in these combo lists I appear in. This seemed to suggest that my appearance in those lists might not only be due to the old LinkedIn breach and that some of my other passwords could have been compromised. I thought it might be wise to double-check.

haveibeenpwned.com also provides an on-line form where they can directly check your password against their database. This seems like a really bad practice to encourage, regardless of their assurances of security, and I was not going to start sending off my passwords to a third party. Luckily the same page also provides a dump of SHA-1 hashed passwords that you can download and check locally. (Update: as multiple readers have pointed out, haveibeenpwned.com also offers an on-line search by partial hash, which seems like a reasonable alternative if you don't want to download the whole database.)
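
For completeness, the partial-hash search sends only the first five characters of the password's SHA-1 hash to the server, which replies with all known hash suffixes for that prefix. A minimal sketch using the requests library (check the current pwnedpasswords.com API documentation for the details):

import hashlib
import requests

def check_range(passwd):
    # Only the first 5 hex digits of the hash leave the machine.
    h = hashlib.sha1(passwd.encode('ascii')).hexdigest().upper()
    prefix, suffix = h[:5], h[5:]
    r = requests.get("https://api.pwnedpasswords.com/range/" + prefix)
    r.raise_for_status()
    # The response is a list of "SUFFIX:count" lines for this prefix.
    for line in r.text.splitlines():
        s, _, count = line.partition(':')
        if s == suffix:
            return int(count)
    return 0

print(check_range('password'))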

I used Transmission to download the dump over BitTorrent. After uncompressing the 7-Zip file I ended up with a 22 GB text file with one SHA-1 hash per line:

$ head -n5 pwned-passwords-ordered-by-hash.txt
000000005AD76BD555C1D6D771DE417A4B87E4B4:4
00000000A8DAE4228F821FB418F59826079BF368:2
00000000DD7F2A1C68A35673713783CA390C9E93:630
00000001E225B908BAC31C56DB04D892E47536E0:5
00000006BAB7FC3113AA73DE3589630FC08218E7:2

I chose the ordered-by-hash version of the file, so the hashes are sorted alphabetically. The number after the colon appears to be the number of occurrences of that hash in their database: more popular passwords have a higher number.

The alphabetical order of the file makes it convenient to do an efficient binary search on it as-is. I found the hibp-pwlookup tool for searching the password database, but it requires you to import the data into PostgreSQL, so it seems its author was not aware of this convenience. In fact, there's already a BSD command-line tool that knows how to do a binary search on such text files: look (it's in the bsdmainutils package on Debian).

Unfortunately the current look binary in Debian is somewhat broken and bails out with a File too large error on files larger than 2 GB. It needs to be recompiled with a simple patch to its Makefile to work on this huge password dump. After I fixed this, I was able to quickly look up hashes:

$ ./look -b 5BAA61E4 pwned-passwords-ordered-by-hash.txt
5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8:3533661

By the way, Stephane on the Debian bug tracker also mentions a method for searching the file without uncompressing it first. Since I already had it uncompressed on disk, I didn't bother. Anyway, the next step was to automate the process. I used a Python script similar to the following; the check() function returns True if the password given to it is present in the database:

import subprocess
import hashlib

def check(passwd):
    # Look for a line starting with "<SHA-1 of passwd>:" in the dump.
    s = passwd.encode('ascii')
    h = hashlib.sha1(s).hexdigest().upper()
    p = "pwned-passwords-ordered-by-hash.txt"

    try:
        o = subprocess.check_output([
            "./look",
            "-b",
            "%s:" % (h,),
            p])
    except subprocess.CalledProcessError as exc:
        if exc.returncode == 1:
            # look exits with status 1 when no matching line was found.
            return False
        else:
            raise

    # The matching line has the form "HASH:count".
    l = o.split(b':')[1].strip()

    print("%s: %d" % (
        passwd,
        int(l.decode('ascii'))))

    return True

def main():
    check('password')

main()

Before I was able to check the passwords I keep stored in Firefox, I stumbled upon another hurdle. Recent versions of Firefox do not provide any facility for exporting passwords from the password manager. There are some third-party tools for that, but I found them hard to trust. I also had my doubts about how complete they are: Firefox has switched between several mechanisms for storing passwords over time, and re-implementing the retrieval for all of them seems like a non-trivial task.

In the end, I opted to get them from the Browser Console using the following JavaScript, entered line by line into the console (I adapted this code from a post by cor-el on the Mozilla forums):

// Unlock the key database; this prompts for the master password if one is set.
var tokendb = Cc["@mozilla.org/security/pk11tokendb;1"].createInstance(Ci.nsIPK11TokenDB);
var token = tokendb.getInternalKeyToken();
token.login(true);

// Fetch all stored logins and serialize them to a JSON string.
var passwordmanager = Cc["@mozilla.org/login-manager;1"].getService(Ci.nsILoginManager);
var signons = passwordmanager.getAllLogins({});
var json = JSON.stringify(signons, null, 1);

I simply copy-pasted the value of the json variable from the console into a text editor and saved it to a file on an encrypted volume, so the passwords didn't end up in clear-text on the drive.
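
To tie the two parts together, the exported JSON can then be fed to the check() function from the script above. Note that the field names below are an assumption; check what the exported file actually contains and adjust accordingly:

import json

with open("signons.json") as f:
    logins = json.load(f)

for login in logins:
    # "password" and "hostname" are assumed field names, see the note above.
    if check(login["password"]):
        print("leaked password is used for", login["hostname"])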

I hope this will help anyone else to check their passwords against the database of leaks without exposing them to a third-party in the process. Fortunately for me, this check didn't reveal any really nasty surprises. I did find more hits in the database than I'm happy with, however all of them were due to certain poor password policies about which I can do very little.

Posted by Tomaž | Categories: Life | Comments »