CubieTruck UDMA CRC errors

18.10.2014 20:07

Last year I bought a CubieTruck, a small, low-powered ARM computer, to host this web site and a few other things. Combined with a Samsung 840 EVO SSD on the SATA bus, it proved to be a relatively decent replacement for my aging Intel box.

One thing that has been bothering me right from the start though is that every once in a while, there were problems with the SATA bus. Occasionally, isolated error messages like these appeared in the kernel log:

kernel: ata1.00: exception Emask 0x10 SAct 0x2000000 SErr 0x400100 action 0x6 frozen
kernel: ata1.00: irq_stat 0x08000000, interface fatal error
kernel: ata1: SError: { UnrecovData Handshk }
kernel: ata1.00: failed command: WRITE FPDMA QUEUED
kernel: ata1.00: cmd 61/18:c8:68:0e:49/00:00:02:00:00/40 tag 25 ncq 12288 out
kernel:          res 40/00:c8:68:0e:49/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
kernel: ata1.00: status: { DRDY }
kernel: ata1: hard resetting link
kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
kernel: ata1.00: supports DRM functions and may not be fully accessible
kernel: ata1.00: supports DRM functions and may not be fully accessible
kernel: ata1.00: configured for UDMA/133
kernel: ata1: EH complete

At the same time, the SSD reported increased UDMA CRC error count through the SMART interface:

UDMA CRC weekly error count on CubieTruck.

These errors were mostly benign. Apart from the cruft in the log files they did not appear to have any adverse effects. Only once or twice in the last 10 months or so did they cause the kernel to remount filesystems on the SSD as read-only, which required some manual intervention to get the CubieTruck back on-line.

I've seen some forum discussions that suggested this might be caused by a bad power supply. However, checking the power lines with an oscilloscope did not show anything suspicious. On the other hand, I did notice during this test that the errors seemed to occur when I was touching the SATA cable. This made me think that the cable or the connectors on it might be the culprit - something that was also suggested in the forums.

Originally, CubieTruck comes with a custom SATA cable that combines both power and data lines for the hard drive and has special connectors (at least considering what you usually see in the context of the SATA cabling) on the motherboard side.

Last few weeks it appeared that the errors were getting increasingly more common, so I decided to try replacing the cable. Instead of ordering a new CubieTruck SSD kit I improvised a bit: I didn't have proper connectors for CubieTruck's power lines at hand, so I just soldered the cables directly to the motherboard. On the SSD drive I used the standard 15-pin SATA power connector.

For the data connection, I used an ordinary SATA data cable. The shortest one I could find was about three times as long as necessary, so it looks a bit uglier now. The connector on the motherboard side also needed some work with a scalpel to fit into CubieTruck's socket. The original connector on the cable that came with CubieTruck is thinner than those on standard SATA cables I tried.

Replacement SATA cables for CubieTruck.

So far it seems this fixed the CRC errors. In the past few days since I replaced the cable I haven't seen any new errors pop up, but I guess it will take a month or so to be sure.

Posted by Tomaž | Categories: Digital

Add a new comment


(No HTML tags allowed. Separate paragraphs with a blank line.)