Re: [Bug-ddrescue] Suggestion / feature request - bad head mapping


From: Scott Dwyer
Subject: Re: [Bug-ddrescue] Suggestion / feature request - bad head mapping
Date: Thu, 4 Jan 2018 17:21:58 -0500

I wrote some advanced information on ddrescue in a forum a few years back. I think I will post it directly here, breaking it up into a couple of parts as I did when I originally posted it. It mentions how to attempt to skip out of a bad head reasonably quickly. I am just going to copy and paste from my original Word documents, so the formatting may not be the best.

Part 1

Ddrescue: Advanced Understanding

This thread is meant to be a place to discuss GNU ddrescue, both how it works and how to use it to its full potential. I will be adding to it as an ongoing process. There is far too much to cover in just one post (or even a few posts).

First, an explanation of what ddrescue is: ddrescue is free, open-source disk-cloning software. Its purpose is to copy data from a failing drive, and it does this at the sector level. Its algorithm does the best it can to get the most easily recoverable data first, before trying really hard at the bad areas. In my opinion, it is the best freeware option for the job.

What ddrescue does not do: it does not recover specific files. It does not care what the file system is; it just copies data at the sector level, so it never processes files, only the raw drive. Ddrescue also does not use any direct disk commands. It uses generic read commands, which allows it to be compiled and run on different POSIX-based systems. I do have a patch for it that allows the use of ATA passthrough commands on Linux, but that will be discussed later.

Now let’s take a look at the algorithm. I am going to focus on the most current version, which at the time of this writing is 1.19. I feel that 1.19 is far better than previous versions, and the previous versions do not have this same algorithm. There are three phases to the recovery: the copy phase, the trimming phase, and the scraping phase. The copy phase itself makes three passes. The first pass is forwards. If you just run a default command such as “ddrescue /dev/sda image.dd image.log”, it will read the default of 128 sectors at a time (65536 bytes). When it finds a drive error, it marks that block as non-trimmed, skips the next 65536 bytes (by default), which are marked as non-tried, and then attempts to continue reading. If the next read is also bad, the skip size is doubled, and it keeps doubling until it hits the maximum of 1GB or 1% of the drive size, whichever is smaller. When it reaches the end of the drive, it then does the same thing backwards (pass 2), reading only the areas that were marked as non-tried (skipped).
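
For reference, here is a sketch of what that default invocation looks like with the cluster size spelled out explicitly. The device and file names are placeholders for your own:

    # -b is the sector size in bytes and -c is the number of sectors read per
    # cluster, so this matches the defaults described above (128 x 512 = 65536
    # bytes per read). Assumes a 512-byte-sector drive.
    ddrescue -b 512 -c 128 /dev/sda image.dd image.log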

Before we get into copy pass 3, let’s look at the first two passes. The first pass is designed to skip out of bad areas as fast as possible. However, as the skip size grows, it is possible to skip past a big chunk of good data before reading starts again. Since the second pass does the same thing, only backwards, it should normally catch most of the good data that was at the end of the bad areas from the first pass. You may notice that the reverse reads are much slower than the forward reads. This is because drives normally have a look-ahead feature that reads ahead and stores the data in an internal buffer, and this only works when reading forwards. If you send a special command to the drive to turn off this feature, you will find that forward and reverse reads run at about the same speed.
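
As an aside, on Linux the drive’s read look-ahead feature can usually be toggled with hdparm. This is my own example, not something ddrescue does for you, and it assumes an ATA drive that honors the command:

    # Disable read look-ahead (run as root); re-enable it afterwards with -A1.
    hdparm -A0 /dev/sda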

Now it would help to understand how the data is stored on the platters. A typical disk can have between 1 and 4 platters, and 2 to 8 heads. The data is actually stored in small groups that could be anywhere from 100MB or less up to 1GB or more, depending on the drive. For example, if the group size were exactly 100MB, then on a 2-platter, 4-head drive the first 0-100MB would be read from head 1, 100-200MB from head 2, 200-300MB from head 3, and 300-400MB from head 4. The next 400-500MB would go back to head 1, and so on. So as you can see, the data is not all in straight-line order.

There are normally two basic hard drive errors (ones that can be worked with using ddrescue). The first is a damaged area on one of the platters. The size of this error can vary, and the error can span multiple groups on the head. A damaged platter can also cause head damage (or further head damage) when the head passes over it, so the less time spent in this area, the better. The second common error is a weak or damaged head. This will affect reads across the entire disk, and I have seen more than one logfile that shows it: there are usually many small errors spaced a bit apart, and usually there is also somewhat of a pattern (one that can only be seen by examining the logfile). You can use ddrescueview to get a visual picture of the errors caused by the bad head, and you can also use it to get an idea of the group size of the head.
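
Besides ddrescueview, you can also dump the bad-block positions straight from the logfile with the ddrescuelog tool that ships with ddrescue. A sketch, assuming the --list-blocks option as documented in its manual (check ddrescuelog --help on your version; image.log is a placeholder):

    # Print the positions (in 512-byte blocks) of blocks marked bad ('-').
    # Regular spacing in these numbers can hint at a per-head pattern.
    ddrescuelog --block-size=512 --list-blocks=- image.log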

So how can we best deal with this? I like to think that the skip-out-fast method is usually the best. This method involves using the --skip-size option to set both the initial skip size and the maximum skip size. By default the skip size is 64KiB and the max is either 1GiB or 1% of the drive size, whichever is smaller. For example, if we use ddrescueview (or examine the logfile) early in the rescue and estimate from the error pattern that the data group size is about 100MB, then we might go with something like a 5Mi skip size and a 10Mi max ("--skip-size=5Mi,10Mi"), as in the sketch below. We want to keep skipping out of the bad head as fast as possible on the first pass, but we don't want to skip way too far if we can help it. The untried area that is skipped beyond the bad head will be processed by the reverse pass (a nice benefit of the reverse pass). This means we can skip out big and fast if we want, but remember that reverse reads are usually slower than forward reads. You also don't want to allow skipping more than halfway to the next bad read, or good data could be missed on the reverse pass and would have to wait for the third, no-skip pass. The skip-out-fast method will also work for a damaged area on the platter, although in that case you will likely not know the group size in advance. The big benefit of this method is getting the most good data as fast as possible before working on the problem areas.
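
Putting that together, a first run with the skip-out-fast settings from the example above might look like this. The sizes are the illustrative values just discussed, not a general recommendation, and the device and file names are placeholders:

    # Start with a 5 MiB skip size, capped at 10 MiB, based on the estimated
    # ~100MB head group size seen in the error pattern.
    ddrescue --skip-size=5Mi,10Mi /dev/sda image.dd image.log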

We have only covered the first two copy passes, but that is enough for the first post (I am losing focus). More to come soon…



On 1/4/2018 6:02 AM, Peter Clifton wrote:
Hi,

I've been dumping a disk with ddrescue for a friend, and it occurred to me that one feature present in hardware-based / proprietary recovery tools (as far as I could discern from watching YouTube videos of professional recoveries) is bad-head mapping.

The pattern of slow / bad reads from this particular disk appears to be 75% good, 25% bad, in a fairly regular pattern. I know the disk has 2x platters and 4x heads, so this suggests (possibly) a damaged region on one platter face, or one read head that is worn or damaged more significantly than the others.

I was curious whether you had suggestions on how (or interest in adding a feature) to have ddrescue focus on the 3/4 of the disk which is more readily accessible.



