Re: Memory consumption with cp -l (fwd)
From: Jim Meyering
Subject: Re: Memory consumption with cp -l (fwd)
Date: Mon, 10 Mar 2003 18:39:02 +0100
...
> Actually I had a hard time reproducing the bug on Computer 2: When I
> copied part of the original material the mem usage got large quickly. But
> when I copied the copy the mem usage was fairly low. This got me thinking:
> What is unique about the original? And there _is_ something unique: More
> than 99% of the files have > 10 hardlinks. It seems _this_ is the cause.
Thank you for investigating that!
Knowing the bit about hard links, the increased memory footprint is
understandable. The additional memory usage comes from the part of copy.c
that I've included below. The overhead is incurred only when the link
count is 2 or greater.
There's probably a way to save some space in cases like yours.
If there were (is there?) a way to make a hard link given only a dev/inode pair,
we could save the destination dev/inode instead of the destination file name.
Jim
--------------
      /* Associate the destination path with the source device and inode
         so that if we encounter a matching dev/ino pair in the source tree
         we can arrange to create a hard link between the corresponding names
         in the destination tree.

         Sometimes, when preserving links, we have to record dev/ino even
         though st_nlink == 1:
         - when using -H and processing a command line argument;
           that command line argument could be a symlink pointing to another
           command line argument.  With `cp -H --preserve=link', we hard-link
           those two destination files.
         - likewise for -L except that it applies to all files, not just
           command line arguments.

         Also record directory dev/ino when using --recursive.  We'll use that
         info to detect this problem: cp -R dir dir.  FIXME-maybe: ideally,
         directory info would be recorded in a separate hash table, since
         such entries are useful only while a single command line hierarchy
         is being copied -- so that separate table could be cleared between
         command line args.  Using the same hash table to preserve hard
         links means that it may not be cleared.  */

      if ((x->preserve_links
           && (1 < src_sb.st_nlink
               || (command_line_arg
                   && x->dereference == DEREF_COMMAND_LINE_ARGUMENTS)
               || x->dereference == DEREF_ALWAYS))
          || (x->recursive && S_ISDIR (src_type)))
        {
          earlier_file = remember_copied (dst_path, src_sb.st_ino, src_sb.st_dev);
        }