From: Robert Backhaus
Subject: Re: [Bug-ddrescue] Suggestion to improve recovery rate during splitting phase.
Date: Wed, 23 Jan 2013 08:01:34 +1000
Hello Caius Severus,

Optimality has more than one dimension. :-)
kwb78 wrote:
> I think that in a situation where the drive has many bad sectors
> across the whole disk, as opposed to contained within a few areas,
> ddrescue does not approach the problem optimally.
Yes. This is point 4 of "Algorithm" in ddrescue's manual[1].
> If I have understood the way the splitting phase occurs at present, the
> drive is read sequentially starting from the first unsplit area forwards
> until a certain number of bad sectors have been encountered, and then it
> jumps an arbitrary distance ahead. It then repeats this, gradually breaking
> down the unsplit areas until it has read every sector and recovered the
> data or marked it bad.
[1] http://www.gnu.org/software/ddrescue/manual/ddrescue_manual.html#Algorithm

Ddrescue 1.14 skipped after a minimum of 2 consecutive errors, but this produced logfiles that were too large (see point 4 mentioned above). The algorithm of ddrescue is a compromise between splitting the larger areas first and keeping the logfile size under control. Of course this compromise can be improved.
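The splitting pass described above can be illustrated with a small simulation. This is a hypothetical sketch, not ddrescue's actual code: the sector model, error threshold, and skip distance are all illustrative assumptions.

```python
# Sketch of a splitting pass: read forward through an unsplit area,
# and after a threshold of consecutive read errors, skip ahead.
# Not ddrescue's real implementation; parameters are illustrative.

def split_pass(disk, start, end, error_threshold=8, skip=64):
    """Return (recovered, bad) sector positions.
    disk[pos] is True for a readable sector, False for a bad one."""
    recovered, bad = [], []
    pos = start
    consecutive_errors = 0
    while pos < end:
        if disk[pos]:                    # sector read fine
            recovered.append(pos)
            consecutive_errors = 0
        else:                            # read error (slow on a real drive)
            bad.append(pos)
            consecutive_errors += 1
            if consecutive_errors >= error_threshold:
                pos += skip              # give up here, jump ahead
                consecutive_errors = 0
                continue
        pos += 1
    return recovered, bad
```

Note that every skip leaves an unread gap behind, and each gap becomes another non-split entry in the logfile, which is why skipping earlier (after fewer errors) makes the logfile grow faster.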
> When there are only a few areas of bad sectors this approach works quite
> well, but with larger numbers of bad sectors it is painfully slow. The
> reason for this is that the time penalty for reading a bad sector can be
> on the order of seconds for each one. When it is attempting to read 8 or
> more consecutive sectors before skipping, this means that it can spend a
> minute or more between skips.
There are no such areas. After trimming, all non-split areas are flanked by bad sectors, or else they would have been fully copied or trimmed out.
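The timing claim quoted above can be checked with quick arithmetic. The per-error delay below is an assumed figure; real drives vary widely depending on their retry behaviour.

```python
# Back-of-envelope check of the quoted timing claim.
# The per-error delay is an assumption, not a measured value.
seconds_per_bad_read = 7.5        # assumed average retry delay per bad sector
errors_before_skip = 8            # consecutive errors required before skipping
time_between_skips = seconds_per_bad_read * errors_before_skip
print(time_between_skips)         # 60.0 seconds: about a minute per skip
```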
> My suggested algorithm is as follows:
> Following trimming,
> 1. Examine the log file to locate the largest unsplit area on the disk
> that is directly adjacent to a known good area.
> [...]
> 3. Upon encountering 2 bad sectors in a row, stop (since the probability
> is that the next sector will also be bad).
These steps make the logfile grow fast. The only way they can be implemented is by alternating them with full reads of the smallest non-split areas, so that the logfile size is kept reasonable.
> 5. When there are no remaining unsplit areas next to good areas, choose
> the largest unsplit area and begin reading from the middle, not the edge.
Currently ddrescue avoids splitting areas if there are more than 1000 blocks in the logfile. Maybe this threshold could be set with a command line option (--max-logfile-size?).

Or until the logfile has grown beyond a given size.
> 6. Keep doing this until the unsplit areas are all below an arbitrary
> minimum size, at which point go back to reading linearly.
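The area-selection part of the proposal (steps 1 and 5 above) could be sketched as follows. `pick_next_read` and its data structures are hypothetical, purely to make the suggested strategy concrete; they are not part of ddrescue.

```python
# Hypothetical sketch of the suggested selection strategy.
# Areas are (start, end) half-open ranges of sectors; `good_edges` is a
# set of positions known to border already-recovered data.

def pick_next_read(unsplit, good_edges):
    """Choose the next area to attack and the position to start reading.
    Returns (area, start_pos)."""
    # Step 1: prefer the largest unsplit area touching a known-good area.
    adjacent = [a for a in unsplit
                if a[0] in good_edges or a[1] in good_edges]
    if adjacent:
        area = max(adjacent, key=lambda a: a[1] - a[0])
        # read inward from whichever edge borders the good data
        start = area[0] if area[0] in good_edges else area[1] - 1
        return area, start
    # Step 5: otherwise split the largest remaining area from its middle.
    area = max(unsplit, key=lambda a: a[1] - a[0])
    return area, (area[0] + area[1]) // 2
```

Each read attempt would then run until 2 consecutive bad sectors are hit (step 3), updating `unsplit` and `good_edges` before the next call, so each call tends to bisect the worst remaining area.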
Thank you for sharing your observations. I'll try to optimize splitting a little more. :-)
Regards,
Antonio.
_______________________________________________
Bug-ddrescue mailing list
address@hidden
https://lists.gnu.org/mailman/listinfo/bug-ddrescue