Re: [Bug-ddrescue] Feature requests - ddrescue
From: Dave Burton
Subject: Re: [Bug-ddrescue] Feature requests - ddrescue
Date: Wed, 08 Aug 2007 08:43:58 -0400 (EDT)
address@hidden (Antonio Diaz Diaz) wrote:
> Dave Burton wrote:
> > However, if you simply overwrite a failing drive with all zero
> > sectors (which is what the manufacturers' utilities do), the
> > drive might remap the bad sectors, and it might (temporarily)
> > appear to be a good drive.
>
> John Gilmore uses GNU ddrescue to remap the bad sectors just as you
> suggest. See it here http://www.toad.com/gnu/sysadmin/index.html#ddrescue
John Gilmore's suggestion to overwrite bad sectors is:
cp logfile my-logfile-copy
ddrescue -r1 /dev/zero /dev/baddisk my-logfile-copy
Note that, as always with ddrescue, the operation can be
stopped at any time and resumed later, and it will pick
up where it left off. (I love that about ddrescue!)
To zero-out all the GOOD sectors before returning a bad
drive to the manufacturer, an alternative to using
my ddrwipe.pl script would be to "invert" the logfile
with my very simple ddrlognot.pl ("DDRescue LOGical NOT")
Perl script, and then use John Gilmore's technique:
perl ddrlognot.pl -g logfile inverted-logfile
ddrescue -r1 /dev/zero /dev/baddisk inverted-logfile
That will overwrite all the good ("+") and unknown
("/" and "?") sectors with zeros. You could omit the
"-g" option to leave alone sectors of unknown status
("/" and "?" in the logfile).
Again, the ddrescue operation is completely resumable,
which is fortunate since it can take a long time to wipe a
large hard disk drive.
Here's the "help" screen for ddrlognot.pl:
--------------( begin help screen )--------------
ddrlognot.pl ("DDRescue LOGfile logical .NOT.") v.3, 13-Sep-06
This is a Perl program which reads a GNU ddrescue 1.x "log file" (really,
a sector list file), and creates from it an "inverted" or "negated"
version, in which all the '-' (bad/unrecovered) sectors are listed as
'+' (good/recovered), and vice-versa.
By default, '/' and '?' sectors are treated as '-' (bad/unrecovered),
so such sectors will be listed as '+' in the output file. You may
add the '-g' (good) option to treat them as '+' (good/recovered),
or you may add the '-k' (keep-as-is) option to leave them as-is in
the output file.
Usage:
perl -w ddrlognot.pl {options} inputfile outputfile
{options} can be:
-k (keep-as-is, leave '/' and '?' lines alone)
-g (treat-as-good, convert '/' and '?' lines to '-' in outputfile)
E.g., to preserve '/' and '?' lines in outputfile:
perl -w ddrlognot.pl -k inputfile outputfile
Note that the parameter order is like 'cp' (input file first), which is
the OPPOSITE of ddrlogand.pl and ddrlogor.pl!
--------------( end help screen )--------------
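The inversion rules in that help screen can be sketched in a few lines of Python. This is not the ddrlognot.pl source, only an illustration of the rules as described above. The function names and the (pos, size, status) tuple layout are assumptions made for the example, and parsing of the actual logfile is left out because the file layout is not shown in this message.

```python
def invert_status(status, unknown="bad"):
    """Return the inverted status character for one logfile entry.

    unknown="bad"  : treat '/' and '?' as '-' (the default in the help text)
    unknown="good" : treat '/' and '?' as '+' (like the '-g' option)
    unknown="keep" : leave '/' and '?' unchanged (like the '-k' option)
    """
    if status in ("/", "?"):
        if unknown == "keep":
            return status
        status = "+" if unknown == "good" else "-"
    if status == "+":
        return "-"
    if status == "-":
        return "+"
    raise ValueError("unexpected status character: %r" % (status,))


def invert_entries(entries, unknown="bad"):
    """Invert (pos, size, status) tuples, merging adjacent equal results."""
    out = []
    for pos, size, status in entries:
        new = invert_status(status, unknown)
        # Merge with the previous entry when it is contiguous and has the
        # same inverted status, so the output list stays compact.
        if out and out[-1][2] == new and out[-1][0] + out[-1][1] == pos:
            out[-1] = (out[-1][0], out[-1][1] + size, new)
        else:
            out.append((pos, size, new))
    return out
```

With the default setting, a '+' area followed by a '-' area and a '?' area comes out as one '-' area followed by a single merged '+' area.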
Like ddrwipe.pl, ddrlognot.pl is part of my collection of
about a dozen Perl scripts for use with ddrescue, here:
http://www.burtonsys.com/download/ddr2sr.zip
Complementing the "logical NOT" function, there are also
scripts to combine multiple logfiles in various ways, such
as "logical AND" and "logical OR".
I've also just added to ddr2sr.zip a little "C" program
called "sparsecopy," which copies files "sparsely," sector
by sector. In other words, if any 512-byte sector contains
all zeros, it isn't copied. If you think about that for a
minute, you'll realize that it means that you can use it to
combine two ddrescue image files into one. (I wrote this
program after I accidentally used different image file
names for two successive ddrescue passes.)
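To make the idea concrete, here is a rough Python sketch of the same technique. It is not the "sparsecopy" C program from ddr2sr.zip; the function name and return value are assumptions for illustration. It assumes the destination image already exists, and where both images have data in the same sector, the source sector wins.

```python
SECTOR = 512  # sector size in bytes, as described above


def sparse_copy(src_path, dst_path, sector=SECTOR):
    """Copy every non-zero sector of src into dst at the same offset.

    All-zero sectors in src are skipped, so whatever dst already holds
    at those offsets is preserved. Returns the number of sectors written.
    """
    zero = bytes(sector)
    written = 0
    # "r+b" opens dst for update without truncating it.
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        offset = 0
        while True:
            block = src.read(sector)
            if not block:
                break
            # Compare against a zero block of the same length, so a short
            # final block is handled too.
            if block != zero[:len(block)]:
                dst.seek(offset)
                dst.write(block)
                written += 1
            offset += len(block)
    return written
```

Calling sparse_copy("pass2.ima", "pass1.ima") (hypothetical file names) would fold the sectors rescued in the second pass into the first image.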
> > So I wanted a disk wiper program which would zero out only the
> > good sectors, and leave the bad sectors alone. That way, the
> > drive will still test bad (i.e., with unreadable sectors).
[...discussion of ddrwipe.pl elided...]
>
> It seems it would be useful to make ddrescue able to overwrite good or
> bad sectors in the original drive or, as Christian Franke suggested here
> http://lists.gnu.org/archive/html/bug-ddrescue/2007-06/msg00002.html, in
> the copy (to make bad sectors easily recognizable when examined with an
> hex editor).
>
> I propose the addition of a new option to ddrescue, "--fill", with an
> argument telling it what sectors to fill. For example:
>
> ddrescue --fill=+ /dev/zero /dev/hdb logfile
>
> would write with zeros the sectors of /dev/hdb listed as good (+) in
> logfile, and:
>
> echo 'BAD SECTOR' > tmpfile
> ddrescue --fill=- tmpfile drive_copy logfile
>
> would fill with the string "BAD SECTOR" the sectors of drive_copy listed
> as bad (-) in logfile.
>
> What do you think?
That seems like it would be a handy option, but it is not clear
to me what happens to the logfile. Would it be updated,
and if so how?
For "--fill=-" and "--fill=-/?" you could make ddrescue update
the logfile as with any other ddrescue operation, and the
operation would be resumable. However, for "--fill=+" I don't
see how you can make the operation be resumable without using
a second logfile.
----
Changing the subject... I've been mulling over, for some time,
other ideas for enhancements to ddrescue. What follows is a
pair of ideas which are closely related to each other.
Basically, I'd like to have options to make ddrescue "smarter"
(or more flexibly programmable) when it encounters read errors,
before it gets to "splitting bad blocks" mode.
(Note: I might be confused about the best way to use ddrescue,
and specifically the "-c" parameter. Am I right in believing
that -c sets both the read chunk size and the amount by which
ddrescue skips ahead on errors, and that these are always the same
number?)
When I have a drive that I think might be mostly good, I like
to start out by running ddrescue with big block sizes in
buffered mode, for maximum speed:
# buffered-big-block mode
hdparm -d1 /dev/hdc
hdparm -a256 /dev/hdc
hdparm -A1 /dev/hdc
blktool /dev/hdc read-ahead 256
ddrescue -B -n /dev/hdc drive.ima drive.log
The problem is that when it hits bad sectors, it becomes
excruciatingly slow. It appears to me that either the
Linux disk buffering code or else the drive firmware
tries to read every sector in the big blocks, even after
one of those sectors fails. Can that be right?
I don't know, but, in any event, when an error is encountered,
often MINUTES elapse before ddrescue even sees the error.
I need to do some experiments, because I'm not sure
whether this is due to the "-c" setting, or due to the
Linux and/or drive read-ahead setting, or some combination.
But even ctrl-C can take minutes to be noticed by ddrescue.
So if there are errors, I like to shrink the block size
to 4K (the Linux buffer size). This is not quite as fast
when the drive is error-free, but it is still reasonably fast
and it is much more robust in the event of errors:
# buffered-4K-block mode
hdparm -a8 /dev/hdc
hdparm -A0 /dev/hdc
blktool /dev/hdc read-ahead 0
ddrescue -B -c 8 /dev/hdc drive.ima drive.log
Unfortunately, if there are many errors, even that is slow.
In that case, it is best to read one sector at a time, which
in Linux requires raw mode or use of the raw device (to bypass
the OS's buffering). Unfortunately, raw mode in Linux is very,
very slow: if there are no errors, it takes about 10x as
long to read a drive in raw mode as it would take using the
normal buffered modes (above). But when there are lots of
errors, raw mode at least makes steady progress.
(BTW, Antonio, I love the new "-d" option that you added in
ddrescue 1.4!)
Here's how I read the drive in raw, single-sector mode:
# raw-single-sector mode
hdparm -a0 /dev/hdc
hdparm -A0 /dev/hdc
hdparm -k1 /dev/hdc
hdparm -K1 /dev/hdc
# mknod /dev/raw/raw9 c 162 9 2>/dev/null
# raw /dev/raw/raw9 /dev/hdc
# ddrescue -B -c 1 /dev/raw/raw9 drive.ima drive.log
# "-d" takes the place of using the raw device (starting w/ ddrescue 1.4):
ddrescue -B -c 1 -d /dev/hdc drive.ima drive.log
What I find, in practice, is that I start ddrescue in the
fastest mode, and watch it for a minute or two. If there are
no immediate errors I walk away. After a while I check on
its progress. If it has been getting a few errors I ctrl-C
and restart it with smaller blocks. If it is stuck and has
obviously been making little progress, I ctrl-C (and wait
a long time for it to exit), then restart it reading one
sector at a time.
Periodically I check on its progress. If it has been reading
steadily in raw-single-sector mode, and getting no errors
for quite a while, I ctrl-C and restart it in buffered mode
again. Etc.
In other words, I "keep an eye on it," and when it is getting
no errors I switch it to buffered mode, but when it is getting
lots of errors I switch it to one-sector-at-a-time/raw mode.
(Then, when it is all done, I usually run ddrescue one more
time in raw-single-sector mode, with the "-r 1" option added,
to retry all the bad blocks one more time.)
SUGGESTION #1:
So what I'd like is a way to make ddrescue switch modes
automatically, based on the progress it is (or is not) making.
This amounts to a "smart" (automatically self-adjusting) setting
for the "-c" (and "-d") parameters. I envision a flow chart
something like this:
After each read...
1) If in buffered-big-block mode then
     If read was unsuccessful then
       switch to raw-single-sector mode
     Else
       stay in buffered-big-block mode
     Endif
   Endif
2) If in buffered-4K-block mode then
     If read was unsuccessful then
       switch to raw-single-sector mode
     Else If we've been 10 minutes without a read error then
       switch to buffered-big-block mode
     Else
       stay in buffered-4K-block mode
     Endif
   Endif
3) If in raw-single-sector mode then
     If read was unsuccessful then
       stay in raw-single-sector mode
     Else If we've been 5 minutes without a read error then
       switch to buffered-4K-block mode
     Else
       stay in raw-single-sector mode
     Endif
   Endif
That's obviously a little state machine. I'm not sure to
what extent it needs to be parameterized, to let the user
adjust the block sizes and/or wait times. I don't know
whether what is best on one machine is apt to be best on
all machines.
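For what it's worth, the state machine is small enough to sketch in Python. This is only an illustration, not ddrescue code. The mode names come from the flow chart, while the function name, the dictionaries and the "seconds since last error" argument are made up for the example. It reads step 3 as switching to buffered-4K-block mode after five error-free minutes, and it keeps the wait times in one table so they could be made user-adjustable.

```python
BIG = "buffered-big-block"
SMALL = "buffered-4K-block"
RAW = "raw-single-sector"

# Error-free seconds required before stepping up to a faster mode.
# These are the 5 and 10 minute figures from the flow chart.
PROMOTE_AFTER = {RAW: 5 * 60, SMALL: 10 * 60}

# Which mode each mode is promoted to; BIG is already the fastest.
FASTER = {RAW: SMALL, SMALL: BIG}


def next_mode(mode, read_ok, secs_since_error):
    """Return the mode to use for the next read."""
    if not read_ok:
        # Any failed read drops (or keeps) us in raw-single-sector mode.
        return RAW
    wait = PROMOTE_AFTER.get(mode)
    if wait is not None and secs_since_error >= wait:
        return FASTER[mode]
    return mode
```

Since the error-free timer keeps running across a promotion, a drive that stays clean goes from raw to 4K blocks after 5 minutes and on to big blocks once 10 minutes have passed without an error.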
SUGGESTION #2:
Also, I think it would be useful to be able to have a "smart"
setting for the skip-ahead sectors between reads.
If a read (whether large, or 4K, or one sector) is successful,
the next read should immediately follow it. However, if
several successive reads fail, it would be useful to have
ddrescue be able to "skip ahead" by progressively larger
and larger numbers of sectors, trying to pass over the bad
area as quickly as possible, to find and read as much of the
good areas as possible as quickly as possible.
I envision a parameter which supplements the "-c" parameter.
If it is specified, then under error conditions a skip
factor could be added to the "-c" parameter after an error
to determine the starting sector number for the next read
attempt. Perhaps something like this:
--skipahead=delay,min,accel,max
where:
"delay" is the number of successive read errors before skipahead
commences (default: infinity),
"min" is the starting skip factor, usually expressed in blocks (sectors)
(default: 0),
"accel" is the amount by which the skip factor increases with each error
(default: min), and
"max" is the maximum skip factor (default: 1/10 of the drive size).
Example:
While reading in raw-single-sector mode, if sectors 1-100 are
bad, and --skipahead=4,4b then ddrescue would try to read
the following sectors: 0,1,2,3,8,16,28,40,60,84,112
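One possible reading of the four parameters, sketched in Python. The function and argument names are invented for the example; this is a proposal sketch, not an existing ddrescue option. It assumes skipping commences on the error that reaches "delay", and that the skip factor is added to the "-c" chunk size to get the next starting sector. The 1/10-of-drive-size default for "max" is replaced by "no limit" because the sketch does not know the drive size. Under other readings the exact sequence of sectors will differ, so this is not meant to reproduce the example sequence digit for digit.

```python
import math


def skip_factor(errors, delay=math.inf, min_skip=0, accel=None,
                max_skip=math.inf):
    """Skip factor (in sectors) after `errors` successive failed reads."""
    if accel is None:
        accel = min_skip  # proposed default: accel equals min
    if errors < delay:
        return 0  # skip-ahead has not commenced yet
    # Grow linearly with each further error, capped at max_skip.
    return min(min_skip + accel * (errors - delay), max_skip)


def next_start(pos, chunk, errors, **opts):
    """Starting sector of the next read after a failed read at `pos`."""
    return pos + chunk + skip_factor(errors, **opts)
```

With delay=4 and min=4 the skip factor stays at 0 for the first three errors and then grows as 4, 8, 12, 16 and so on, until "max" caps it.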
What do you think of these ideas, Antonio?
As always, thank you for making such a terrific tool, Antonio!
Regards,
-Dave