## Image recognition 101

11.01.2010 19:14

The camera setup I mentioned in my last post left me with a week's worth of video of the electric water heater in my bathroom. Instead of watching the boring footage myself, I used a simple program to extract a single bit of information from each frame: is the heater operating or not?

Here are four typical frames from the video - all four combinations of states for the heater and the light. The light is obviously the source of the largest undesirable changes in the image, which I needed to filter out.

The first obstacle was how to actually get the raw image data back from the MJPEG-encoded frames stored in the AVI container. Raw video is surprisingly hard to get at, since the whole GStreamer framework seems to be built around the idea of keeping you away from it. I finally found the fdsink element which, while not perfect, was good enough for the job. After some trial and error I came up with the following pipeline:

gst-launch-0.10 --gst-debug-disable filesrc location=boiler.avi ! \
jpegdec ! video/x-raw-yuv,width=128,height=512,format="(fourcc)I420" ! \
fdsink fd=1 | ./get_pixel


This decodes the MJPEG stream (jpegdec element) into a YUV format variant called I420 and pipes it through GStreamer's standard output (file descriptor 1) into the standard input of my program (get_pixel).

The YUV format is exactly what I needed for this purpose, since I'm only interested in luminance (Y) values.

Here's the C source for get_pixel:

#include <stdio.h>
#include <stdlib.h>

#define HEIGHT  512
#define WIDTH   128

/* Luminance (Y) value of the pixel at (x, y); the Y plane is the
 * first WIDTH * HEIGHT bytes of an I420 frame. */
static unsigned char get_yvalue(unsigned char *frame, int x, int y)
{
    return frame[x + y * WIDTH];
}

int main()
{
    size_t yplane_size = WIDTH * HEIGHT * sizeof(unsigned char);
    size_t uvplane_size = WIDTH * HEIGHT * sizeof(unsigned char) / 4;

    /* One I420 frame: a full-size Y plane plus quarter-size U and V planes. */
    size_t frame_size = yplane_size + uvplane_size * 2;
    unsigned char *frame = malloc(frame_size);

    /* fread(frame, 26, 1, stdin); */

    while (fread(frame, frame_size, 1, stdin) == 1) {
        unsigned char b = get_yvalue(frame, 97, 246);  /* reference: boiler surface */
        unsigned char l = get_yvalue(frame, 97, 446);  /* heater indicator light */
        printf("%d\t%d\n", b, l);
    }

    free(frame);
    return 0;
}


This reads the input frame by frame into a buffer. From each frame it samples two luminance values: one centered on the heater's indicator light (l) and one, for reference, on the white boiler surface (b), and writes both values to the standard output.

The commented-out fread() call is there because GStreamer sometimes writes 26 bytes of garbage at the start (it actually looks like a debug message that occasionally doesn't end up in its proper standard error stream). It's probably a bug somewhere in GStreamer, but I didn't look into it much since this is such a one-off program.

So, this program gave me a list of luminance values, and it was then trivial to find a function that discriminates between the heater-on and heater-off states regardless of the ambient light.

Here's what the luminance values look like for a typical 24 hour day:

The red dots show the reference value (b) and blue dots show the heater light luminance (l). The shaded areas show the time intervals where heater was detected to be operating.

The function used was:

f(b, l) = \left\{ \begin{array}{ll}1 & \textrm{if $l - \frac{b}{2} > 35$} \\ 0 & \textrm{if $l - \frac{b}{2} \le 35$}\end{array}\right.

Where the b and l values range from 16 to 235 (and my question of why that is so still stands).

As you can see, the bathroom is mostly dark during the day and only around noon does some daylight manage to shine into it. I manually checked some random points, and precision seems nearly perfect. The only exceptions appear to be the first frames after the light has been turned on or off, when the camera hasn't yet adapted and the frames are grossly over- or underexposed.
