Meta meta talk

22.01.2012 18:12

I've noticed that a lot of talks and presentations I attend, especially in the more academic circles, begin with the speaker showing a slide with the outline of the talk in the form of a bulleted list and talking through it, explaining what he will talk about. With short talks this can actually take up a significant part of the total time, especially if the speaker returns to that slide after each part, showing how far into the talk he has progressed.

What is the point of that, apart from giving the audience an opportunity for one last glance at their email client? The purpose of such an overview in printed articles is to give the reader some idea of the contents. It means you can skip reading the article altogether and move on to the next one if the table of contents doesn't give you enough confidence that it will be of use to you. Via page numbering it also allows you to skip directly to the interesting part, should only a part of the document be useful.

None of these uses apply to the spoken word though. You can't fast-forward to the interesting bit, and if you find after the introduction that the talk won't be worth your time, it's usually considered quite rude to walk out at that point. As is browsing the web while waiting for the part that interests you.

Some may argue that the table of contents makes the slide stack more readable in stand-alone form. I don't buy that - slides are part of the presentation and I don't think any consideration should be given to how useful they might be without the accompanying speech. It's 2012 after all and making at least an audio recording with cheap equipment shouldn't be science fiction anymore.

Posted by Tomaž | Categories: Ideas | Comments »

On black magic

27.09.2011 17:42

A while ago, during lunch break at a BarCamp, I sat at a table with some of my colleagues from Zemanta. There was an iPhone 4 on the table and conversation touched its infamous antenna issue. Gašper said "You know, I heard antenna design is kind of a black magic" to which I replied that somewhere an RF engineer is probably saying to her colleagues "You know, I heard software development is kind of a black magic".

While I was only half serious at the time, it did get me thinking. I believe that design in all fields of engineering has parts that can be calculated analytically: parts where the laws of nature, either directly or through simplification and abstraction, permit parameters to be calculated with formulas and algorithms to some useful engineering precision. There are also parts where the only solution is an experiment. Those are the parts that involve hard-to-predict parameters, chaotic systems, or simply systems so complex that an analytical solution either doesn't exist or costs more to obtain than a series of experiments that will lead to the same result.

I understand black magic as a problem that favors an extremely experiment-oriented approach. A problem where a solution can only be obtained through a heuristic process. One that requires the intuition of a designer who has over the years accumulated and internalized a large number of failed and successful experiments. A designer with enough experience that, when presented with a problem, he can intuitively recall what approach worked well in a similar case, even if he can't explain why.

The two approaches, the purely analytical and the purely experimental, overlap to a large degree. For example, when designing analog circuits you can go pretty far calculating exact values down to the last resistor in the system, counting in stray capacitances and so on. Or you can take a more experimental approach: calculate some basic parameters with wide enough error margins, build a prototype circuit or a simulation and tweak it until it works to your satisfaction. I would argue that the sweet spot lies somewhere between the two extremes, both in terms of effort and the quality of the solution. Of course, hitting that sweet spot - knowing when you have gone too deep into theory - is also where intuition and experience come into play. It's a different kind of intuition though, and arguably one that is more broadly applicable than the black-magic kind mentioned above.

Moving work from design to implementation.

Original by Paul A. Hoadley CC BY-SA 2.5

It's quite the same with software engineering. Just replace pages upon pages of differential equations with design documentation in a waterfall model. Or incremental building and optimization with the coveted agile approach.

In fact my growing dissatisfaction with the software development world is that the mindset is moving ever more towards extremely experimental design. This is most apparent in the fresh field of web application programming. There is little to no strict documentation for much of the modern frameworks that hold the web together. The few well-defined interfaces have specifications that are too complex to fully understand and are riddled with implementation bugs. It's a field that favors the magicians with enough mileage to recite individual web browser CSS bugs by heart in the middle of the night, ordered backwards by issue number. With continuous deployment and a growing disdain for version numbers and release cycles, this bias is only becoming more prevalent throughout the software industry.

It's an interesting contrast to the inherently analytical nature of computer programs and it certainly doesn't look like the fictional RF engineer's statement would be any less well grounded than my colleague's.

Posted by Tomaž | Categories: Ideas | Comments »

The most average blog

17.06.2011 21:49

Here is another interesting result extracted from the dataset of 150,000 blog screenshots I mentioned in my previous post. I summed the pixel values of all the images and created the screenshot of an average blog:

Browser window averaged over 150 thousand blogs

Actually I made this on a whim after remembering a beautiful average face that decorated a NewScientist cover a while back. It took only around 40 lines of Python code using NumPy and the Python Imaging Library, and a few hours of processing time. I wouldn't say the result is cover-page material, but it's interesting nonetheless.
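For the curious, the averaging step is simple enough to sketch in a few lines of NumPy. This is my own minimal reconstruction, not the original script: tiny synthetic images stand in for real screenshots, and the file loading is omitted.

```python
import numpy as np

def average_image(images):
    """Pixel-wise mean of a list of equally-sized uint8 RGB images."""
    # Accumulate in float64 so that summing thousands of uint8 frames
    # doesn't overflow.
    acc = np.zeros(images[0].shape, dtype=np.float64)
    for img in images:
        acc += img
    return (acc / len(images)).astype(np.uint8)

# A black and a white 4x4 "screenshot" average out to medium gray.
black = np.zeros((4, 4, 3), dtype=np.uint8)
white = np.full((4, 4, 3), 255, dtype=np.uint8)
avg = average_image([black, white])
```

The float accumulator is the only real trick; summing uint8 arrays directly would wrap around long before 150,000 images.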

I guess everyone can draw their own conclusions from it. The most prominent feature is the Firefox notification bar, which is the artifact of my screenshotting method - the browser I used didn't have Adobe Flash installed. There are methods suggested in the comments to my original post on how to do web page screenshots properly, which I will definitely use should I want to repeat a survey like this.

In hindsight, this bar might have affected the HSV histograms a bit. It is quite possible that on pages with a patterned background the yellow background of the notification bar would be picked up as the dominant color on the page. However I think the effect isn't significant, since this would have resulted in a single-color spike on the dominant color histogram in the yellow part and the spike observed there covers at least two histogram bars.

Posted by Tomaž | Categories: Ideas | Comments »

United colors of the blogosphere

10.06.2011 21:57

Several months ago we had a discussion in the office about the icons that Zemanta automatically adds to the footer of blog posts that contain suggested content. The conversation mostly revolved around how aesthetically pleasing they are when combined with the various web site designs out there.

What bothered me is that most of the arguments were based on guesses and anecdotal evidence. It made me curious about the actual prevailing colors used on web sites out there. So I dumped the list of blogs Zemanta knows about, threw together a bunch of really simple shell scripts and let a machine crawl the blogs of the world. Of course it wasn't that simple: the machine wasted a week making screen shots of a Firefox error window before I noticed and fixed the bug. The whole machinery grew pretty complex towards the end, mostly because it turns out that modern desktop software just isn't up to such a task (and I refused to go through the process of embedding an HTML rendering engine into some custom software). When you are visiting tens of thousands of pages, a browser instance is good for at best one page load and an X server instance survives maybe a thousand browser restarts.

Collage of screen shots of a few blogs.

After around two months and a bit over 150,000 visited blogs I ended up with 50 GB of screen shots, which hopefully make a representative sample of the world's blogger population.

So far I extracted two numbers from each of those files: the average color (the mean red, green and blue values for each page) and the dominant color (the red, green and blue value for the color that is present in the most pixels on the page). The idea is that the dominant color should generally be equal to the background color (except for pages that use a patterned background), while the average color is also affected by the content of the page.
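In code, these two extractions could look roughly like this. A sketch only: the function names and the NumPy approach are mine, not necessarily what the original scripts used.

```python
import numpy as np

def average_color(img):
    # Mean R, G and B values over all pixels.
    return tuple(img.reshape(-1, 3).mean(axis=0))

def dominant_color(img):
    # The R, G, B triplet present in the most pixels.
    pixels = img.reshape(-1, 3)
    colors, counts = np.unique(pixels, axis=0, return_counts=True)
    return tuple(int(c) for c in colors[counts.argmax()])

# A 2x2 test image: three white pixels and one red one.
img = np.array([[[255, 255, 255], [255, 255, 255]],
                [[255, 255, 255], [255, 0, 0]]], dtype=np.uint8)
```

On this test image the dominant color is white, while the single red pixel pulls the average color towards pink - the same way page content pulls the average away from the background color.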

Here is how the histograms of those values look when converted to the HSV color model. Let's start with the dominant colors:

Histogram of dominant color hue used in blog themes.

You can see pretty well defined peaks around orange, blue and a curious sharp peak around green. Note that this graph only shows hue, so that orange peak also includes pages with, for instance, light brown background.

I excluded pages where the dominant color had zero saturation (meaning shades of gray from black to white) and as such had an undefined hue.
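That filtering can be sketched with the standard library's colorsys module; the bin count below is an arbitrary choice of mine, not the one used for the plots.

```python
import colorsys

def hue_histogram(colors, bins=36):
    """Histogram of hues for (r, g, b) tuples in the 0-255 range,
    skipping achromatic colors whose hue is undefined."""
    hist = [0] * bins
    for r, g, b in colors:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        if s == 0:
            continue  # shades of gray carry no hue information
        hist[min(int(h * bins), bins - 1)] += 1
    return hist

# Red lands in the first bin; the pure gray sample is dropped.
hist = hue_histogram([(255, 0, 0), (128, 128, 128), (0, 0, 255)])
```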

Histogram of dominant color saturation used in blog themes.

The saturation histogram is weighted heavily towards unsaturated colors (note that the peak at zero is much higher and is cut off in this picture). This is pretty reasonable. Saturated backgrounds are a bad choice for blogs, which mainly publish written content and should focus on the legibility of the text.

Histogram of dominant color value used in blog themes.

Again this result is pretty much what I expected. Peaks at very light colors and very dark ones. Backgrounds in the middle of the scale don't leave much space for text contrast.

Moving on to histograms of average colors:

Histogram of average hues used in blog themes.

Average color hues are pretty much equivalent to dominant color hues, which increases my confidence in these distributions. Still we have high peaks around orange and blue, although they are a bit more spread out. That is expected, since average colors are affected by content on the site and different blogs using the same theme but publishing different content will have a slightly different average color.

Histogram of average color saturation used in blog themes.

Again, weighted strongly towards unsaturated colors.

Histogram of average color value used in blog themes.

Now this is interesting. The peak around black has disappeared completely! This suggests that the black peak in dominant colors was an artifact, probably due to the black color of the text being dominant over any single background color (say in a patterned background). The white peak is again very spread out, probably due to light background colors mixing with dark text in the foreground.

Conclusions at this point would be that light backgrounds are in majority over dark backgrounds, most popular colors are based on orange and blue and most bloggers have the common sense to use desaturated colors in their designs.

I'm sure there are loads of other interesting metrics that can be extracted from this dataset, so any suggestions and comments are welcome as always. I also spent this Zemanta Hack Day working on a fancy interactive visualization, which will be a subject of a future blog post.

Posted by Tomaž | Categories: Ideas | Comments »

Aiming for the center

11.04.2011 21:08

Software that exposes its user interface over the web offers limitless opportunities for monitoring the behavior of the people using it. You can literally watch each move of the mouse and each key press. And it's so tempting to gather all this information and try to make sense of it that some are pushing an idea of data-centered design, where each decision must come from a bump in a graph or two. From my point of view as an engineer (UX is not in my verbal vocabulary) I find that kind of approach shortsighted at best.

You are invariably optimizing for an average user, even if you divide people into neat little numbered bins. Consider an analogy from the physical world: the center of gravity is the mean location of mass in an object. In many objects, like this ring for instance, it is also a point devoid of any mass. Optimizing something for the average user can mean you simply make no one happy and everyone equally miserable.

Ring with a center of gravity mark

AB testing and similar ideas are just hill-climbing optimization algorithms for some variable that ultimately depends on processes in a wet human brain. Such simple approaches fall short even in the physical world, whose laws are infinitely less complex. How can they be the main guide for development in user interface design, where a linear law is replaced by a chaotic mess of firing neurons? I don't doubt that some time in the future we'll be able to understand and model it (and the model certainly won't be a two-dimensional graph with vaguely defined axes). Until then it will take a human to understand a human.

Some may argue that at large scales groups of people are surprisingly predictable. My answer is that it's also well known that entropy increases in the universe. That doesn't bring you any closer to designing an internal combustion engine.

I'll stop myself here and won't go into the impossibility of doing controlled experiments with people or the moral problem I see in choosing profit as the universal optimization variable for all design questions. I'll offend enough liberal arts students as it is.

Measurements are important in evaluation, but anything involving a human in the loop is one significant figure at best. Modifying things based on that second decimal increase in metric instead of going with the gut feeling of what is best just takes everyone a little further from happiness.

Even if you reap mad profits from that bouncing blinking purchase button.

Posted by Tomaž | Categories: Ideas | Comments »

Treacherous waters

09.02.2011 21:44

These days (or rather months) my daily work at Zemanta often takes me to parts of the web I would not normally visit. Its shadier parts, so to speak. And it turns out that those are surprisingly crowded these days.

I'm sure you've been there. Probably when you were searching for some useful piece of information and such sites cluttered the top of the result list, and you had to sort through piles and piles of fluff before you found what you were looking for. Or maybe one was recommended by a friend through one of the many channels such recommendations travel in the age of the social web. Perhaps you even had to deal with a bug report because a piece of your web-facing software, while compliant with all relevant standards, didn't perform up to some user's satisfaction when dealing with such a web site.

Dark tunnel that is HTML 5

Imagine for a second the stereotypical web site of this class: a fixed-width design with unreadably small gray type on a white background. A top-left logo in saturated colors and gray gradients, courtesy of web-two-point-oh. Probably the definitive destination for some wildly popular topic right off the first page of your typical yellow press (celebrities, health, cars, shopping) or for an emerging interest (say, Android development). At most two paragraphs of actual content, with the rest filled with ads, user comments and everything in between. And of course at least 10 ways to share the page on all the social networks you know about, plus 10 more that you don't.

Considering that serving those abominations of the web is the only thing the companies behind them do, they are surprisingly incompetent at it. Pages won't validate or will throw a hundred Javascript errors from tens of different scripts that load behind the curtains. The little content there is was scraped from Wikipedia, or it looks like someone from a less fortunate country was hired to copy-and-paste a few statements on a prescribed topic from all around the Internet. Everything under a CC license is considered free-for-all (but don't you dare break their lengthy personal-use-only terms of use!). Nobody cared that there is anything other than ASCII encoding, or about the subtleties of XML parsing, or for that matter that the description of a software product and an image showing a porn star of the same name do not refer to the same thing. As long as half a dopamine-starved human brain is able to decode it, it's good enough.

What's puzzling at first is that such sites seem to be getting a shocking amount of traffic (and probably revenue as well). Of course, opinions about their quality differ. Even among my colleagues some consider such sites valuable destinations. They have comment buttons and you can share them on Twitter! They're way more fun to visit than some tired old Wikipedia that doesn't even have Facebook integration - never mind that any user-contributed discussion is as devoid of actual content as the site itself.

What I see in such pages is an evolution of link farming. Social farming if you will. Search engines have gotten better at detecting content that has just been blatantly and automatically copied from somewhere. So an up-to-date spammer, er, vertical influencer has switched from a website copying bot to a few mechanical turks producing syntactically unique but semantically carbon-copied content. The network effect of modern social networks brings more and more people to the site, producing worthless comments that again give the appearance of a respectable site. At this point they are trying to trick an algorithm by introducing living people into the content copying process.

Therefore you can hear a lot about how the traditional search engines will in time be completely replaced by your social network, introducing wetware on the other side as well. The idea is that natural language processing and information retrieval won't be able to distinguish between what you would consider a reputable site and a link-farmed site that approximately copied content from that reputable site. But your friends in a social network will. First because they are (hopefully) human and can understand things AI can't and second because you share their interests and trust what they trust. They can therefore in theory push more useful information in your direction than some algorithm, even when it is intimately familiar with your click- and search history.

However I think the sites I described above are a perfect example of why this scheme won't reduce the fluff you will need to go through before getting to the information you want. At this point you don't even have to fool search engines anymore, because once you've got people clicking those little "share" buttons they will bring more visitors to your site and push your coupon codes and endorsements and whatnot regardless. In the end it's just as easy to subvert a human social network into passing around unsolicited advertisements as it is a software algorithm. You just need a different kind of engineering.

Posted by Tomaž | Categories: Ideas | Comments »


10.06.2010 20:12

After seeing the beautiful x0xb0x front panel on Keith's blog I spent some time thinking about the design of the exterior of my new lab power supply.

Power supply front panel wireframe

Compared to your average (computer) graphical user interface there is really not much to it. I want to keep it simple, with knobs and old-fashioned analogue panel meters.

I chose this wide, low case so the power supply will fit under my oscilloscope on the table.

I would love to have that brushed-metal finish. I've seen front panel adhesive films on which you can print with a laser printer. However the case is wider than A4 paper format, which makes it tricky to print on in one piece.

All my previous projects that used the "double-sided-tape + laser print on paper + transparent protective adhesive foil" approach got wrinkly with time. So I'm more than willing to try something new this time.

Posted by Tomaž | Categories: Ideas | Comments »

Camera in a bathroom

07.01.2010 21:53

Hot water in my apartment is supplied by a storage-type electrical boiler. As uneconomical as it is to use electricity for heating in a big city I can't really do much about that. However what I did notice is that the insulation on the boiler is pretty good - the water will remain hot enough to use even if the heater has been turned off for a couple of days. So maybe I could optimize the time the heater is working. Instead of a simple thermostat I could add a timer that would only allow the heater to work at night when electrical energy costs around half as much as during the day.

However, before I start tearing the boiler apart I want to have at least a rough approximation of how much energy the boiler actually uses and how much such a modification would save me per month in lower electricity bills.

Unfortunately I don't have the proper equipment at hand to directly measure and record electrical power (plus that would mean more messing with the wiring in the bathroom, and that's a can of worms I want to open as rarely as possible).

So I used another approach. I took my EeePC, put it on a high shelf across from the boiler and trained the camera on the indicator that lights up when the heater is operating. From the recording it will be obvious at what times the heater was on, and since I know the approximate power I'll be able to calculate the energy consumption and cost.
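Once the on-times are read off the recording, the cost estimate itself is trivial. The power and tariff figures below are placeholders for illustration, not the actual values for my boiler:

```python
def monthly_cost(hours_on_per_day, heater_kw, price_per_kwh):
    # Energy used over 30 days, in kWh, times the price of a kWh.
    return hours_on_per_day * heater_kw * 30 * price_per_kwh

# Hypothetical example: 2 hours per day at 2 kW and 0.10 per kWh.
cost = monthly_cost(2.0, 2.0, 0.10)
```

Comparing this number computed with the day tariff against the same usage billed at the night tariff gives the potential monthly saving.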

Of course, the thought has crossed my mind that having a camera operating in a bathroom can be a really bad idea. Even more so if it's connected to a Wi-Fi capable laptop. So I took some extra precautions to prevent anyone ending up on YouTube (i.e. I disabled the wireless adapter).

I used the following GStreamer pipeline to record the video:

gst-launch -v \
v4l2src ! video/x-raw-yuv,width=640,height=480,framerate=5/1 ! ffmpegcolorspace ! \
videocrop bottom=112 left=288 right=288 top=112 ! \
videoscale ! videorate ! video/x-raw-yuv,width=128,height=512,framerate=1/4 ! \
clockoverlay ! \
jpegenc ! avimux ! filesink location=boiler.avi

Everything up to ffmpegcolorspace appears to be necessary for the camera to operate properly. videocrop crops the video so that only the boiler can be seen on the image (makes for a boring video, but that's what I'm aiming for here). videorate drops 19 frames out of 20 to save disk space.

I also added a clock overlay, so that each frame would have the exact time recorded, just in case I couldn't later deduce the time by other, more sophisticated means. I had to scale the video width up by 2 (hence the videoscale element) to make it wide enough for the clock display to fit onto it.

Video is saved in the simple motion-JPEG format. Not really the most efficient encoding out there, but everything else seemed to produce unusable results. I guess the default settings for Theora and friends don't work very well at 0.25 fps.

I left this setup running for a week and it produced around 3.5 MB of video data per hour - no problem for EeePC's somewhat limited storage capabilities.

If played back at the normal 25 frames per second, this video shows a day in about 15 minutes, so in case I can't extract anything automatically I will still be able to watch it and record the times manually (however, I'm confident it won't come to that).
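The 15-minute figure is easy to verify from the pipeline's frame rates:

```python
record_fps = 0.25      # one frame every four seconds, per the videorate caps
playback_fps = 25.0    # normal playback speed
speedup = playback_fps / record_fps         # 100x time compression
minutes_per_day = 24 * 60 / speedup         # about 14.4 minutes of video per day
```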

Posted by Tomaž | Categories: Ideas | Comments »

Reverse biased LED

25.11.2009 16:10

Here's an interesting question: If you force current through a LED in the reverse direction (i.e. by causing a breakdown in the junction), will it emit light, and if not, why?

Fear the glowing spheres

Let's start with an experiment. I took an old low-intensity 3 mm yellow LED (I'm guessing GaAsP chemistry) - I want to limit this discussion only to simple P-N junction devices and ignore for the moment complicated new LEDs with quantum wells and such.

I applied a high reverse bias through a large resistor (1 MΩ, to limit power dissipation) and visually checked for any light. The diode started conducting a significant current at 185 V (which, by the way, is surprisingly high, as all datasheets I've seen rate such LEDs at a maximum of 5 V reverse bias). At that reverse voltage I let 0.1 mA of current through it (which gave a total power of around the rated 20 mW). I couldn't see any light coming from the LED.
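The dissipation figure checks out:

```python
v_reverse = 185.0    # volts across the diode at breakdown
i_reverse = 0.1e-3   # amps forced through the junction
power_mw = v_reverse * i_reverse * 1000.0   # close to the 20 mW rating
```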

As a control I then reversed the polarity and let 0.1 mA flow in the forward direction. In that case I could clearly see the LED lit up in a darkened room. So, the LED wasn't destroyed in the experiment and 0.1 mA was enough to produce a visible effect.

So, the experiment confirms that a LED will not light up when the current flows in the reverse direction. But what is the theory behind it?

Visible photons emitted by the LED are generated by electron-hole recombinations in the semiconductor. The material has just the right energy gap that an electron in the conduction band can give away its excess energy as a visible photon when it fills a hole in the valence band. In forward operation most of these recombinations happen after charge carriers travel through the depletion region and diffuse into the N or P material as minority carriers.

When the junction is in breakdown, a similar thing happens (from the high reverse voltage I measured I'm guessing the avalanche mode of breakdown). However this time carriers don't diffuse through the junction but are generated there through impact ionization. But because the voltage is reversed, electrons enter the N material as majority carriers (same goes for holes in the P material). So gone are the recombinations near the depletion region and no significant number of photons is produced.

Interestingly though, carriers from an avalanche breakdown have significantly more energy than thermal carriers in forward operation. And they do in fact emit light after they pass through the junction (e.g. through bremsstrahlung and hot-carrier recombinations). However, the light's wavelength is no longer defined by the material's band gap and its spectrum is completely different from that of forward operation. So the LED may well have lit up in my experiment, but with an intensity and at wavelengths invisible to the unaided eye.

Posted by Tomaž | Categories: Ideas | Comments »

Google Translate idea

03.08.2009 18:40

Google Translate is currently pretty useless for any serious translation (regardless of what translators of user manuals for consumer electronics think).

However, I sometimes find myself searching for, say, a Slovene equivalent to a specific English technical term or phrase. I'm fluent in both languages, it's just that the term may be too far out of my expertise to know the correct translation - although I can usually spot the correct one when I see it in context.

Scientific terminology is usually very strict, with one specific name for a phenomenon. And if you want the translation to appear correct in the eyes of someone knowledgeable in that field, a simple by-the-dictionary translation of that name may make you a popular target of in-jokes (take note, authors of subtitles on Slovenian television networks).

So, if I don't have someone fluent in that terminology at hand, what I usually do is first check for equivalent pages in different languages of Wikipedia. More often than not, that approach fails miserably. Then I'm off to Google, where I search for the term I want to translate (in quotes) plus some terms in the target language that I estimate must appear near the translation.

However, that's not really a good task for a general search engine. The original and the translation may, for example, appear on separate web pages (sites often keep their English and Slovenian sections apart). My search will only return results when a single page contains both the English and the Slovene text, so a lot of potentially useful results are missed.

Here's my idea: machine translation tools like Google Translate already recognize pairs of texts on the web that are direct translations of each other. That's the input for their machine learning algorithms, where they learn how to (badly) translate free text.

Wouldn't it be nice if you could enter just an English phrase and select a language, and get back a list of English texts containing that phrase, plus the matching texts in Slovene? It's not even necessary to point out where exactly in the text the equivalent phrases are - paragraph-level precision would be more than enough. I can find the exact spot myself while reading the context (which I must do anyway, to make sure I'm using the correct translation).

So in effect you would be using the infrastructure that most likely already exists, but for human instead of machine learning. I'm sure that would be a most useful tool for people translating technical texts. At least until machine translation becomes a little more accurate.

Posted by Tomaž | Categories: Ideas | Comments »

Entering PIN without confirmation, part 2

20.06.2009 19:44

In my last post I was trying to figure out an optimal strategy for guessing a combination for a lock that only checks the last m symbols entered.

It turned out my venture into graph theory was a bit of a dead end. As Luka Rahne pointed out to me (on Facebook, no less), there's a different branch of mathematics that offers a much better approach to this problem and even has a connection to my home field of electronics.

Linear feedback shift registers are actually direct solutions to such a problem, albeit only if you limit yourself to two symbols (n = 2). These are digital circuits that cycle a shift register of length m through all 2^m - 1 different non-zero states (i.e. at each step they insert a single digit at the beginning of the register and drop one at the end). The exception is the all-zero state, but that is easy to correct with a special case.

It turns out that there exists a generalization of the LFSR to n states. Basically you replace the XOR operations in the feedback loop with multiplication and addition modulo n. There is quite a bit of theory behind this, of course, but I had trouble finding a higher-level description of it on the internet. A great deal of it also appears to be patented, judging by the number of patents that came up in my searches.

Just like with ordinary LFSRs, where you need to find the correct locations of the feedback taps, you need to find the correct set of m+1 multiplicative constants in order to get the maximum length sequence (or M-sequence) out of it. This is not really straightforward, as it involves finding a primitive polynomial for the finite (Galois) field GF(n^m). Luka posted his Mathematica code to calculate the polynomial's coefficients, but I found it easier to use COS.
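To make the idea concrete, here is a small Fibonacci-style LFSR over GF(n) in Python. This is my own illustration; the binary example below uses the primitive polynomial x^3 + x + 1, not one of the polynomials computed by Luka's code.

```python
def lfsr(coeffs, n, state):
    """Generate the output stream of a linear feedback shift register
    over GF(n): each step emits the first cell, then shifts in the
    feedback value sum(coeffs[i] * state[i]) mod n."""
    state = list(state)
    while True:
        yield state[0]
        fb = sum(c * s for c, s in zip(coeffs, state)) % n
        state = state[1:] + [fb]

# Binary case: x^3 + x + 1 is primitive over GF(2), which gives the
# recurrence s[k+3] = s[k+1] + s[k] and the maximum period 2^3 - 1 = 7.
gen = lfsr([1, 1, 0], 2, [1, 0, 0])
bits = [next(gen) for _ in range(7)]
```

Read cyclically, those 7 output bits contain every non-zero 3-bit window exactly once - precisely the property that makes such a sequence an efficient guessing strategy for the lock.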

Diagram of a non-binary Galois LFSR

Here S_i are the memory cells holding the register state and a_j are the coefficients of the polynomial:

a_4 x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0

My knowledge of this field is a bit superficial and I don't completely understand the connection between this polynomial and the fact that its coefficients will produce the M-sequence. I admit I only skimmed the most interesting parts of an introductory book yesterday. Even my curiosity has limits.

There was only one obstacle left: a finite field of order n only exists when n is a prime power, which means that you can't get an M-sequence for, say, 10 symbols directly. The answer is to factor n and produce an M-sequence for each prime factor (2 and 5 for 10 symbols). You then run the two sequences in parallel and assign a symbol to each unique ordered pair you get this way. For ordered pairs where the first element can take one of 2 different values and the second one of 5, you get 10 different pairs.

It's easy to show that two sequences run in parallel with periods

n_1^m \qquad \mathrm{and} \qquad n_2^m

have a composite period of

(n_1 n_2)^m

if n_1 and n_2 are distinct primes (which makes the two periods coprime).
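The pairing trick is easy to verify on a toy case. Instead of LFSRs, the sketch below uses the textbook recursive de Bruijn construction (so the component sequences have period n^m rather than n^m - 1, with the zero state already included), combining a binary and a base-5 sequence into a decimal one:

```python
def de_bruijn(k, m):
    """Textbook recursive construction of a de Bruijn sequence B(k, m):
    every length-m word over k symbols appears exactly once per cycle."""
    a = [0] * k * m
    seq = []
    def db(t, p):
        if t > m:
            if m % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)
    db(1, 1)
    return seq

seq2 = de_bruijn(2, 2)     # period 4, covers all 2-digit binary words
seq5 = de_bruijn(5, 2)     # period 25, covers all 2-digit base-5 words

# Periods 4 and 25 are coprime, so running both in parallel and
# encoding each (binary, base-5) pair as one decimal digit gives a
# combined sequence with period 4 * 25 = 100.
combined = [seq2[i % 4] * 5 + seq5[i % 25] for i in range(100)]

# Every 2-digit decimal combination shows up as a cyclic window.
windows = {(combined[i], combined[(i + 1) % 100]) for i in range(100)}
assert len(windows) == 100
```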

So, the final result, a sequence of 10003 keypresses that checks all 4-digit combinations, is below. C code that calculated it and checked for its correctness is here if anyone wants to play with it.


In conclusion, it turned out that there is more than one optimal way to guess the combination (as each finite field has multiple primitive polynomials and each polynomial yields a unique M-sequence). Still, you won't hack into a beer store anytime soon with this, as any practical implementation surely locks you out after some number of tries or adds a delay.

Posted by Tomaž | Categories: Ideas | Comments »

Entering PIN without confirmation

16.06.2009 19:52

A brand of electronic locks marketed by a certain security company in Slovenia has a peculiar way of recognizing the correct code.

To unlock it, you need to enter a four digit PIN. However, instead of the usual way, where you have to press a confirm key after entering four digits, this one unlocks as soon as you enter the correct sequence.

So basically, if the correct code is 1-2-3-4, it will unlock after pressing 9-8-1-2-3-4, or whatever sequence of digits comes before the 1-2-3-4.

It's certain that such a design reduces the time needed to guess the right combination. In theory you can check a new combination with every keypress, where a normal lock would require four plus one for confirmation. For example, entering 9-8-1-2-3-4 checks 9-8-1-2, 8-1-2-3 and 1-2-3-4 with 6 presses, while a normal lock would require 15 presses.

I'm wondering. What is the optimum strategy for guessing the combination? Is the theoretical one combination per key possible? Or is there necessarily some overlap between sequences entered in this way?
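The keypress counting above can be sketched as a sliding window (a toy model of such a lock, not any particular product):

```python
def combos_tested(presses, m=4):
    """The set of m-digit combinations a no-confirmation lock checks
    while the given digits are keyed in (a sliding window)."""
    return {tuple(presses[i:i + m]) for i in range(len(presses) - m + 1)}

# The example from the text: 6 presses test 3 combinations.
assert len(combos_tested([9, 8, 1, 2, 3, 4])) == 3
```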

In the general case of n digits and m places I'm picturing this as a directed graph with n^m vertices, one for each possible combination, where each vertex has n outgoing edges (corresponding to the n possible keys you can press next). So the problem translates to finding a Hamiltonian path in such a graph (which is an NP-complete problem in the general case).

Hamiltonian path for n=5 m=2

Path for n=5 m=2

Hamiltonian path for n=2 m=3

Path for n=2 m=3

I have been able to find nicely symmetrical paths for the simple case of m = 2. At least from those examples it seems that you can construct the path incrementally, by first exhausting all combinations with n digits before moving to n + 1.

It appears possible that an analogous approach exists for more places, but I'm having problems drawing three and four-dimensional grids on paper.
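For small cases the graph is small enough that a brute-force backtracking search settles the question (a sketch of the graph model described above, not tied to the drawings):

```python
def hamiltonian_path(n, m):
    """Backtracking search for a Hamiltonian path in the keypress graph:
    vertices are m-digit combinations, and pressing key k moves from
    (d1, ..., dm) to (d2, ..., dm, k)."""
    total = n ** m
    def extend(path, seen):
        if len(path) == total:
            return path
        for k in range(n):
            nxt = path[-1][1:] + (k,)
            if nxt not in seen:
                found = extend(path + [nxt], seen | {nxt})
                if found:
                    return found
        return None
    start = (0,) * m
    return extend([start], {start})

# For n = 2, m = 3 the search visits all 8 combinations, i.e. one new
# combination per keypress after the first m presses.
path = hamiltonian_path(2, 3)
assert path is not None and len(path) == 8
```

Such paths in fact exist for every n and m (these are the de Bruijn graphs), so the theoretical optimum of one combination per keypress, m + n^m - 1 presses in total, is achievable.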

Posted by Tomaž | Categories: Ideas | Comments »

Angels, Demons and antimatter

18.05.2009 21:17

On Saturday I went to see Angels & Demons. It was entertaining to watch, although it was apparent at times that whoever wrote the script was more at home in medieval myths than physics. I guess I got used to the weird things characters tend to say in Hollywood movies whenever the topic of conversation shifts to science.

One thing that did make me curious is a scene towards the end, when they had to get rid of an anti-matter explosive device in the middle of Rome with 5 minutes to spare. So a guy jumps into a helicopter and flies straight up. When the time is up, the helicopter is high enough that the explosion only causes a spectacular fireworks display and shatters some frescoes.

At first thought that seemed like a really bad idea to me. They say that the explosion had an equivalent yield of 5 kT of TNT, which is comparable to a small nuclear bomb. This kind of device does the greatest damage when it's exploded at some height above the target, so is it really feasible to save a city in the way shown in the movie?

The first question is how high you can get in an average helicopter in 5 minutes. After sampling some random technical specs on Wikipedia it appears that the rate of climb of a helicopter varies from 6 m/s in civilian to 13 m/s in military craft. In the movie they use a police or a medical helicopter. It's also empty, so take an average, say 8 m/s, which rounds up to a nice figure of 500 m/min.
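The back-of-the-envelope altitude is then just:

```python
climb_rate = 8.0                       # m/s, the averaged figure above
altitude = climb_rate * 5 * 60         # five minutes of climbing
print(altitude)                        # 2400.0 m, call it 2500 m
```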

Second question: how dangerous is a 5 kT explosion at 2500 m if you're standing right below it (like the crowd does in the movie)?

Unfortunately the Strangelove Slide Rule doesn't provide this kind of calculation. I did some more searching and found a couple of old DOS programs from the US Defense Nuclear Agency that answer exactly this kind of question (they're complete with warnings to reset the computer after entering classified data - I guess they hadn't yet come to shredding hard disks at that stage).

Angels and demons blast effects

The result shows that a 5 kT explosion will cause a shock wave with 10 kPa of maximum pressure at ground zero directly below. That's a little under 1.5 PSI, which is, according to the previously mentioned slide rule, not so bad for any bystanders. Such a shock will shatter windows, but won't knock down houses or even damage most people's hearing.
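The unit conversion there checks out (using the standard factor of roughly 6.895 kPa per psi):

```python
KPA_PER_PSI = 6.894757                 # standard conversion factor
overpressure_psi = 10 / KPA_PER_PSI
print(round(overpressure_psi, 2))      # "a little under 1.5 PSI"
```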

Surprisingly, that's exactly what you see in the movie. So this may be one case where they actually got it right.

Posted by Tomaž | Categories: Ideas | Comments »

Having given both a whirl

29.04.2009 20:02

I'm fluent in both Perl and Python. I've been writing Perl scripts of various sizes since I began using Linux and more recently I've also worked on larger projects using Perl (from G-Tablix to Wikiprep). On the other hand my work at Zemanta has largely been about writing software in Python. Large amounts of complicated, production critical software in fact.

There's this very obvious philosophical difference between these two languages: Perl is all about "there is more than one way to do it", while in Python there is (preferably) only One.

Recently I've rewritten software for this blog in Perl. My colleagues were surprised when I mentioned this. Why didn't I do it in Python, when I had plenty of examples at work of how easy it is to set up a web page with Django for example?

Perl and Python

In my experience with both languages the fundamental difference in design philosophy shows in an interesting way: Writing software in Perl is fun by itself. The multiplicity of solutions to a single problem that Perl offers gives you the opportunity to express yourself. You can stop for a moment to contemplate the most elegant way of doing something. It's about writing software in your own, distinct style. To match your thought patterns. Not surprisingly, Perl is often compared to poetry.

Writing Python on the other hand is dull. If you already know in advance the algorithm you are about to write, writing code in Python is pure drudgery. You only have this limited number of valid expressions. You are forced to do indentation just right. Chances are that you'll write exactly the same code as the other guy if he was given the same specification. It's boring because it's simple and straightforward.

You better be discovering something new on the algorithmic level when using Python, or it won't be very enjoyable. On the other hand writing Perl code itself is fun.

Python is for experiments on a logical level. Perl is for experiments on the syntax level.

Then there's the different approach to errors: in the majority of cases Perl will do what you want (unless what you want is consistency, as the perlfunc(1) author remarks), maybe giving you a warning that you can later research and fix if you're a perfectionist. But in most cases, if an undef gets into the stream, it's a one-in-a-thousand occurrence that you can safely ignore.

Python on the other hand will usually throw exceptions in the most unexpected places, and a single uncaught exception will kill your program (which was probably 5 minutes away from completing that 100 hour task).

It's with the bugs in the code that the situation turns around. Fixing bugs in Python is fun. You get a descriptive backtrace by default on all errors, the code is easy to read even when it's not yours, and it's satisfying to follow that stray None value as it traverses half a dozen modules until it gets to some function that actually tries to do arithmetic on it.

On the other hand, when something does go seriously wrong in Perl, you better be the author of the code you are looking at and be familiar with the interpreter's guts. Since functions will silently pass around invalid values and wrongly encoded strings, the root cause of the error you're hunting may be thousands of lines separated from its effects.

Now it's probably clear why I wrote my blogging software in Perl. It's an application that will most likely never be used, maintained or extended by anyone but me. I was reinventing the wheel. There is no opportunity to be creative with the software design of yet another blog. It's a straightforward job of telling the computer what you want. So at least I was able to get some satisfaction from the act of expressing my wishes by jotting them down in Perl.

Having said that, I still firmly believe Python is a way better language than Perl for any kind of cooperative work. Most software is at some point worked on by more than one person, so on average Python is probably a better bet when starting a project from scratch.

On second thought, even going through your own code a few years later may be considered working together with your younger self. So far I haven't had any problems understanding my oldest Perl scripts. Then again, maybe they're just not old enough.

Posted by Tomaž | Categories: Ideas | Comments »

Get Satisfaction suggestion

18.04.2009 0:52

It's not my duty to read and reply to complaints that get posted to Zemanta's corner of Get Satisfaction public support forums. However I do get notified about them and usually read a few that look like they may have something to do with my department.

Surprisingly many of them rank pretty high on the level of politeness and usefulness in regard to solving the problem. However, there are some that make you wonder which evolutionary step the writer of that particular post missed. For those gems, I propose a minor change to the usual I'm feeling happy/sad/whatever cuteness, which would at least Give Satisfaction to the replier, since the original poster is obviously incapable of getting any.

Get Satisfaction UI proposal

* Not his real name. No offense meant to the chimpanzees.

Posted by Tomaž | Categories: Ideas | Comments »