Another SD card postmortem
I was recently restoring a Raspberry Pi at work that was running a Raspbian system off a SanDisk Ultra 8 GB micro SD card. It was powered on continuously and managed to survive almost exactly 6 months since I last set it up. I don't know when this SD card first started showing problems, but when the problem became apparent I couldn't log in and Linux didn't even boot up anymore after a power cycle.
I had a working backup of the system, however I was curious how well ddrescue would be able to recover the contents of the failed card. To my surprise, it did quite well, restoring 99.9% of the data after about 30 hours of run time. I've only ran the copy and trim phase (--no-scrape
). Approximately 8 MB out of 8 GB of data remained unrecovered.
This was enough that fsck was able to recover the filesystem to a good enough state so that it could be mounted. Another interesting thing in the recovered data was the write statistic that is kept in ext4 superblock. The system only had one partition on the SD card:
$ dumpe2fs /dev/mapper/loop0p2 | grep Lifetime dumpe2fs 1.43.4 (31-Jan-2017) Lifetime writes: 823 GB
On one hand, 823 GB of writes after 6 months was more than I was expecting. The system was setup in a way to avoid a lot of writes to the SD card and had a network mount where most of the heavy work was supposed to be done. It did have a running Munin master though and I suspect that was where most of these writes came from.
On the other hand, 823 GB on a 8 GB card is only about 100 write cycles per cell, if the card is any good at doing wear leveling. That's awfully low.
In addition to a raw data file, ddrescue also creates a log of which parts of the device failed. Very likely a controller in the SD card itself is doing a lot of remapping. Hence a logical address visible from Linux has little to do with where the bits are physically stored in silicon. So regardless of what the log says, it's impossible to say whether errors are related to one failed physical area on a flash chip, or if they are individual bit errors spread out over the entire device. Still, I think it's interesting to look at this visualization:
This image shows the distribution of unreadable sectors reported by ddrescue over the address space of the SD card. The address space has been sliced into 4 MB chunks (8192 blocks of 512 bytes). These slices are stacked horizontally, hence address 0 is on the bottom left and increases up and right in a saw-tooth fashion. The highest address is on the top right. Color shows the percentage of unreadable blocks in that region.
You can see that small errors are more or less randomly distributed over the entire address space. Keep in mind that summed up, unrecoverable blocks only cover 0.10% of the space, so this image exaggerates them. There are a few hot spots though and one 4 MB slice in particular at around 4.5 GB contains a lot of more errors than other regions. It's also interesting that some horizontal patterns can also be seen - the upper half of the image appears more error free than the bottom part. I've chosen 4 MB slices exactly because of that. While internal memory organization is a complete black box, it does appear that 4 MB blocks play some role in it.
Just for comparison, here is the same data plotted using a space-filling curve. The black area on the top-left is part of the graph not covered by the SD card address space (the curve covers 224 = 16777216 blocks of 512 bytes while the card only stores 15523840 blocks or 7948206080 bytes). This visualization better shows grouping of errors, but hides the fact that 4 MB chunks seem to play some role:
I quickly also looked into whether failures could be predicted by something like SMART. Even though it appears that some cards do support it, none I tried produced any useful data with smartctl. Interestingly, plugging the SanDisk Ultra into an external USB-connected reader on a laptop does say that the device has a SMART capability:
$ smartctl -d scsi -a /dev/sdb smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.9.0-12-amd64] (local build) Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Vendor: Generic Product: STORAGE DEVICE Revision: 1206 Compliance: SPC-4 User Capacity: 7 948 206 080 bytes [7,94 GB] Logical block size: 512 bytes scsiModePageOffset: response length too short, resp_len=4 offset=4 bd_len=0 Serial number: 000000001206 Device type: disk scsiModePageOffset: response length too short, resp_len=4 offset=4 bd_len=0 Local Time is: Thu May 14 16:36:47 2020 CEST SMART support is: Available - device has SMART capability. SMART support is: Enabled Temperature Warning: Disabled or Not Supported === START OF READ SMART DATA SECTION === SMART Health Status: OK Current Drive Temperature: 0 C Drive Trip Temperature: 0 C Error Counter logging not supported scsiModePageOffset: response length too short, resp_len=4 offset=4 bd_len=0 Device does not support Self Test logging
However I suspect this response comes from the reader, not the SD card. Multiple cards I tried produced the same 1206 serial number. Both a new and a failed card had the "Health Status: OK" line, so that's misleading as well.
This is a second time I was replacing the SD card in this Raspberry Pi. The first time it lasted around a year and a half. It further justifies my opinion that SD cards just aren't suitable for unattended systems or those running continuously. In fact, I suggest avoiding them if at all possible. For example, newer Raspberry Pis support booting from USB-attached storage.
Update: For further reading on SD card internals, Peter mentions an interesting post in his comment below. The flashbench tool appears to be still available here. The Linaro flash survey seems to have been deleted from their Wiki. A copy is available from archive.org.
Hi Tomaž,
I found this info explaining how flash drives work (might be useful):
http://3gfp.com/wp/2014/07/formatting-sd-cards-for-speed-and-lifetime/
And yes, I was also having trouble with SD card stability. I ended up using btrfs RAID 1 over SD card and USB disk.
But now after reading the above, I'm wondering if some other filesystem might be better suited for such purpose (e.g. f2fs), since btrfs writes metadata all over the place rather than at the beginning of the drive.
Thanks for your blog. I love following it.