Re: [PATCH] Add wipename option to shred


From: Pádraig Brady
Subject: Re: [PATCH] Add wipename option to shred
Date: Thu, 27 Jun 2013 18:06:06 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 06/13/2013 05:13 PM, Joseph D. Wagner wrote:
> On 06/13/2013 8:35 am, Pádraig Brady wrote:
> 
>> On 06/13/2013 12:51 AM, Joseph D. Wagner wrote:
>>
>>> ## perchar ##
>>> real    678m33.468s
>>> user    0m9.450s
>>> sys    3m20.001s
>>>
>>> ## once ##
>>> real    151m54.655s
>>> user    0m3.336s
>>> sys    0m32.357s
>>>
>>> ## none ##
>>> real    107m34.307s
>>> user    0m2.637s
>>> sys    0m21.825s
>>>
>>> perchar: 11 hours 18 minutes 33.468 seconds
>>> once: 2 hours 31 minutes 54.655 seconds
>>>  * a 346% improvement over perchar
>>> none: 1 hour 47 minutes 34.307 seconds
>>>  * a 530% improvement over perchar
>>>  * a 41% improvement over once
>>
>> Whoa, so this creates 23s CPU work
>> but waits for 1 hour 47 mins on the sync!
>> What file system and backing device are you using here
>> as a matter of interest?
> 
> ext4 data=ordered (default) + 7200 SATA
> 
> Just to be clear, the times also include shredding the data part of the files.
>
> For my test I used 16 character file names and 100,000 files each 4k in size,
> which comes to:
> perchar: (1 data fsync + 16 name fsync) * 100,000 files = 1,700,000 fsync
> once: (1 data fsync + 1 name fsync) * 100,000 files = 200,000 fsync
> none: (1 data fsync + 0 name fsync) * 100,000 files = 100,000 fsync
> 
> I included the exact script I used to generate those statistics in a previous
> email.  Feel free to replicate my experiment on your own equipment, using my
> patched version of shred of course.
> 
> Alternatively, if you still have reservations about adopting my patch,
> would you be more open to a --no-wipename option?  This would be the
> equivalent of my proposed --wipename=none.  It would not imply any
> additional security; to the contrary, it implies less security.  Yet, it
> would still give me the optional performance boost I am trying to
> achieve.

Yes, these sync latencies really add up.

I timed this simple test script on ext4, on an SSD
and on a traditional disk in my laptop:

  # open a directory fd and repeatedly fdatasync() it,
  # to measure the latency of a single directory sync
  import os
  d = os.open(".", os.O_DIRECTORY | os.O_RDONLY)
  for i in range(1000):
    os.fdatasync(d)
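
A rough way to reproduce the measurement, assuming the snippet is saved
as e.g. syncloop.py (filename just for illustration), is to time the
whole run and divide by the 1000 iterations:

  # per-fdatasync latency ≈ real (wall-clock) time / 1000
  time python syncloop.py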

That gave 2ms and 12ms per sync operation respectively.
This seems to be independent of dir size and whether any
changes were made to the dir, which is a bit surprising.
It seems a sync-only-on-change optimization might be possible there.
Anyway...

So with the extra 1.6M syncs above on spinning rust
(1,600,000 × 12 ms ≈ 19,200 s), that would add an extra
5.3 hours by my calc.
Your latencies seemed to be nearly double that,
but fair enough, same ball park.

Now we could handle this outside of shred, if we only
wanted to choose between wiping names and a simple delete.
Given the above latencies, the overhead of starting a
process per file is insignificant:

find /files -type f | xargs -I{} sh -c 'shred "$1" && rm "$1"' sh {}

But yes, this is a bit awkward.
Also, if you did want to wipe the name but avoid the explicit syncs,
because you knew your file system already had synchronous metadata
updates, then we couldn't support that combination with this scheme.
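
For example (just as an illustration), on Linux a tree mounted with the
generic dirsync option already does directory updates synchronously,
so the explicit directory syncs would be largely redundant there:

  # illustrative: make directory updates under /files synchronous
  mount -o remount,dirsync /files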

So I'm leaning a bit towards adding control through shred options.
So how about this interface:

-u, --remove[=HOW]
    truncate and remove file after overwriting.
    HOW indicates how to remove the directory entry:
    unlink   => just call unlink
    wipe     => also first obfuscate the name
    wipesync => also sync each obfuscated character to disk (the default)
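
Usage would then look something like this (hypothetical, since the
option doesn't exist yet):

  shred --remove=unlink f1    # overwrite the data, then just unlink the name
  shred --remove=wipe f2      # also obfuscate the name before unlinking
  shred --remove=wipesync f3  # also sync each obfuscated name to disk
  shred -u f4                 # same as the default, i.e. wipesync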

thanks,
Pádraig.



