CubieTruck Perl performance

23.01.2014 22:57

Two months ago I bought a CubieTruck, one of the many cheap, bare-bone ARM-based computers that keep popping-up everywhere these days. My idea was to replace the aging x86 server that is running this website with something more power-efficient. So I was looking for a reasonably powerful board with a proper SATA interface and a decent amount of RAM. Raspberry Pi was out of the question, but the latest incarnation of CubieBoard with a dual-core 1 GHz ARM Cortex-A7, 2 GB of RAM, SATA 2.0 and Gigabit Ethernet seemed to fit the bill.

Unfortunately I could not find any reliable benchmarks I could use to estimate how ARM SoCs perform in comparison with my existing setup. So before I decided to migrate I took a while to do some performance tests and get to know this hardware.


The software setup I'm interested in benchmarking is somewhat archaic in these days of Node.js and NoSQL. I'm using Perl 5 with HTML::Template doing most of the heavy lifting (at least according to Devel::NYTProf profiler). Most parts are statically generated and some are dynamic using a handful of Speedy CGI Perl 5 scripts. These are combined into a consistent website you see here with a somewhat convoluted Apache configuration using the threaded worker.

In the following benchmarks I'm comparing:

  • An AMD Duron at 700 MHz, 1.2 GB RAM running stock x86 Debian Squeeze. Root filesystem is mounted from an IDE hard drive.
  • A CubieTruck A20 running armhf Debian Wheezy and the kernel supplied for the CubieTruck Ubuntu Server installation. Root filesystem is mounted from an SD card.

Both machines were connected through a 100 Mb/s Ethernet switch to a laptop which was running the remote end of the benchmarks.

First, to see how fast the static part of the web site is generated, I ran the full (single threaded) HTML rebuild. I measured the required user space CPU time with the time utility. This is the fastest run of three on each machine:

AMD DuronCubieTruck
CPU time to rebuild static pages45.3 s61.8 s

Then, to check if network was operating at the bit rate I thought it was, I ran iperf to measure TCP throughput between the server and the laptop:

AMD DuronCubieTruck
iperf throughput test94.0 Mb/s94.5 Mb/s

Finally, I ran a suite of tests using the Apache benchmarking tool. I measured how many requests per minute a server can handle for different types of content and different number of concurrent requests. Numbers in parentheses show size of HTTP body (without headers).

CubieTruck requests per second for a static HTML page.

CubieTruck requests per second for a dynamic HTML page.

CubieTruck requests per second for an image.

CubieTruck requests per second for API call.

The site rebuild is somewhat disappointingly almost one-third slower than on a 10 year old PC. However the single threaded Apache performance is on par with it. In the case of more concurrent users the CubieTruck of course has an advantage because of an additional CPU core. Actually in both cases with static content CubieTruck managed to saturate the line when there was more than one concurrent request.

I tried to make these tests in a way that the slow SD card in the CubieTruck would minimally affect their outcome. All of data should fit into the buffer cache, which is why in the first test I only took into account the fastest run and only user space CPU time. However I now suspect that the SD card still affected the numbers somehow (the rebuild operation is the heaviest of the tests regarding filesystem I/O). I don't know for sure how kernel computes the time returned by the time utility.

These results are good enough that I can't dismiss CubieTruck based on performance. If a proper SATA drive wouldn't speed it up, I could probably parallelize the build process with not much work. That should cut down on time if it's really Perl performance on ARM that is slowing it down. On the other hand I'm having some other concerns about using CubieTruck as a personal server so I'm not completely decided yet about putting it on my rack.

Posted by Tomaž | Categories: Digital


I had performance problems due I/O while having the system running on the nand and on the SD card (even with a class 10) but using a SATA SSD dramatically improves performance. Haven't tried it but i think that with a SATA normal HD the performance would be also very good.
If you still want to use the SD card you can move your /var directory to tmpfs, i found that the log file writes on the SD card decrease the overall system performance.

i hope this helps...

Ivan, thanks for your comment. I'm now using CubieTruck with a SATA drive and I'm fairly happy with overall performance.

However the time to rebuild static pages didn't improve significantly when moving away from the SD card. That confirms my initial assumption that I was benchmarking the CPU and not the SD card.

Posted by Tomaž

They have a really weak FPU which shows in cases like this.
I..e. the CPU in my Nexus7 (gen1) tablet was already much faster than a cubietruck (I enjoy using those a lot!)
Comparing to the Duron700 was a great idea, most people can understand that nicely.

Add a new comment

(No HTML tags allowed. Separate paragraphs with a blank line.)