bug-tar
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Bug-tar] [PATCH] tar: live within system-supplied limits on file descri


From: Paul Eggert
Subject: [Bug-tar] [PATCH] tar: live within system-supplied limits on file descriptors
Date: Sun, 12 Sep 2010 14:43:19 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.12) Gecko/20100826 Thunderbird/3.0.7

I installed the following into GNU tar.  It follows up on
Jim Meyering's suggestion to put GNU tar on a file descriptor diet
when recursing into deeply-nested directories.

This patch uses a different strategy from coreutils.  Coreutils
limits itself to a small number of file descriptors, whereas
with this patch, tar keeps opening and caching file descriptors
until it runs into any system-imposed limits, and then it backs
off and closes cached descriptors as needed.

* NEWS: Note the change.  Mention dirfd and fdopendir.
* gnulib.modules: Add dirfd and fdopendir.  The code was already
using fdopendir; dirfd is a new need.
* src/common.h (open_searchdir_flags, get_directory_entries):
(subfile_open, restore_parent_fd, tar_stat_close): New decls.
(check_exclusion_tags): Adjust signature to match code change.
* src/create.c (IMPOSTOR_ERRNO): New constant.
(check_exclusion_tags): First arg is now a struct tar_stat_info
const *, not an fd.  All callers changed.
(dump_regular_file, dump_file0): A zero fd represents an unused
slot, so play it safe if the fd member is zero here.  A negative
fd represents the negation of an errno value, so play it safe and
do not assign -1 to fd merely because an open fails.
(open_failure_recover, get_directory_entries, restore_parent_fd):
(subfile_open): New functions.  These help to recover from file
descriptor exhaustion.
(dump_dir, dump_file0): Use them.
(dump_file0): Use tar_stat_close instead of rolling our own close.
* src/incremen.c (scan_directory): Use get_directory_entries,
subfile_open, etc., to recover from file descriptor exhaustion.
* src/names.c (add_hierarchy_to_namelist): Likewise.
(collect_and_sort_names): A negative fd represents the negation
of an errno value, so play it safe and do not assign -1 to fd.
* src/tar.c (decode_options): Set open_searchdir_flags.
Add O_CLOEXEC to all the open flags.
(tar_stat_close): New function, which knows how to deal with
new convention for directory streams and file descriptors.
Diagnose 'close' failures.
(tar_stat_destroy): Use it.
* src/tar.h (struct tar_stat_info): New member dirstream.
fd now has the negative of an errno value, not merely -1, if
the file could not be opened, so that failures to reopen directories
are better-diagnosed later.
* tests/Makefile.am (TESTSUITE_AT): Add extrac11.at.
* tests/testsuite.at: Likewise.
* tests/extrac11.at: New file.
---
 NEWS               |   14 ++-
 gnulib.modules     |    2 +
 src/common.h       |    9 ++-
 src/create.c       |  247 ++++++++++++++++++++++++++++++++++++++++++---------
 src/incremen.c     |   41 ++++-----
 src/names.c        |   54 ++++++++----
 src/tar.c          |   38 +++++++--
 src/tar.h          |   18 +++-
 tests/Makefile.am  |    1 +
 tests/extrac11.at  |   77 ++++++++++++++++
 tests/testsuite.at |    1 +
 11 files changed, 399 insertions(+), 103 deletions(-)
 create mode 100644 tests/extrac11.at

diff --git a/NEWS b/NEWS
index b305f26..667fad8 100644
--- a/NEWS
+++ b/NEWS
@@ -1,4 +1,4 @@
-GNU tar NEWS - User visible changes. 2010-09-06
+GNU tar NEWS - User visible changes. 2010-09-12
 Please send GNU tar bug reports to <address@hidden>
 
 
@@ -14,10 +14,14 @@ time stamps to the full resolution.
 ** More reliable directory traversal when creating archives
 
 Tar now checks for inconsistencies caused when a file system is
-modified while tar is creating an archive.  The new checks are
-implemented via the openat, fstatat, and readlinkat calls standardized
-by POSIX.1-2008.  On an older system that lacks these calls, tar
-emulates them at some cost in efficiency and reliability.
+modified while tar is creating an archive.  In the new approach, tar
+maintains a cache of file descriptors to directories, so it uses more
+file descriptors before, but it gracefully adjusts to system limits on
+the number of file descriptors.  The new checks are implemented via
+the openat, dirfd, fdopendir, fstatat, and readlinkat calls
+standardized by POSIX.1-2008.  On an older system where these calls do
+not exist or do not return useful results, tar emulates the calls at
+some cost in efficiency and reliability.
 
 ** Spurious error diagnostics on broken pipe.
 
diff --git a/gnulib.modules b/gnulib.modules
index 6d25072..6530923 100644
--- a/gnulib.modules
+++ b/gnulib.modules
@@ -8,10 +8,12 @@ argp-version-etc
 backupfile
 closeout
 configmake
+dirfd
 dirname
 error
 exclude
 exitfail
+fdopendir
 fileblocks
 fnmatch-gnu
 fseeko
diff --git a/src/common.h b/src/common.h
index 5be282c..e908854 100644
--- a/src/common.h
+++ b/src/common.h
@@ -357,8 +357,9 @@ struct name
 GLOBAL dev_t ar_dev;
 GLOBAL ino_t ar_ino;
 
-/* Flags for reading and fstatatting arbitrary files.  */
+/* Flags for reading, searching, and fstatatting files.  */
 GLOBAL int open_read_flags;
+GLOBAL int open_searchdir_flags;
 GLOBAL int fstatat_flags;
 
 GLOBAL int seek_option;
@@ -452,6 +453,7 @@ enum dump_status
 void add_exclusion_tag (const char *name, enum exclusion_tag_type type,
                        bool (*predicate) (int));
 bool cachedir_file_p (int fd);
+char *get_directory_entries (struct tar_stat_info *st);
 
 void create_archive (void);
 void pad_archive (off_t size_left);
@@ -466,9 +468,11 @@ union block * write_extended (bool global, struct 
tar_stat_info *st,
 union block *start_private_header (const char *name, size_t size, time_t t);
 void write_eot (void);
 void check_links (void);
+int subfile_open (struct tar_stat_info const *dir, char const *file, int 
flags);
+void restore_parent_fd (struct tar_stat_info const *st);
 void exclusion_tag_warning (const char *dirname, const char *tagname,
                            const char *message);
-enum exclusion_tag_type check_exclusion_tags (int dirfd,
+enum exclusion_tag_type check_exclusion_tags (struct tar_stat_info const *st,
                                              const char **tag_file_name);
 
 #define OFF_TO_CHARS(val, where) off_to_chars (val, where, sizeof (where))
@@ -685,6 +689,7 @@ void usage (int);
 int confirm (const char *message_action, const char *name);
 
 void tar_stat_init (struct tar_stat_info *st);
+bool tar_stat_close (struct tar_stat_info *st);
 void tar_stat_destroy (struct tar_stat_info *st);
 void usage (int) __attribute__ ((noreturn));
 int tar_timespec_cmp (struct timespec a, struct timespec b);
diff --git a/src/create.c b/src/create.c
index e137325..5e2171b 100644
--- a/src/create.c
+++ b/src/create.c
@@ -26,6 +26,10 @@
 #include "common.h"
 #include <hash.h>
 
+/* Error number to use when an impostor is discovered.
+   Pretend the impostor isn't there.  */
+enum { IMPOSTOR_ERRNO = ENOENT };
+
 struct link
   {
     dev_t dev;
@@ -72,13 +76,13 @@ exclusion_tag_warning (const char *dirname, const char 
*tagname,
 }
 
 enum exclusion_tag_type
-check_exclusion_tags (int fd, char const **tag_file_name)
+check_exclusion_tags (struct tar_stat_info const *st, char const 
**tag_file_name)
 {
   struct exclusion_tag *tag;
 
   for (tag = exclusion_tags; tag; tag = tag->next)
     {
-      int tagfd = openat (fd, tag->name, open_read_flags);
+      int tagfd = subfile_open (st, tag->name, open_read_flags);
       if (0 <= tagfd)
        {
          bool satisfied = !tag->predicate || tag->predicate (tagfd);
@@ -1038,7 +1042,7 @@ dump_regular_file (int fd, struct tar_stat_info *st)
            memset (blk->buffer + size_left, 0, BLOCKSIZE - count);
        }
 
-      count = (fd < 0) ? bufsize : safe_read (fd, blk->buffer, bufsize);
+      count = (fd <= 0) ? bufsize : safe_read (fd, blk->buffer, bufsize);
       if (count == SAFE_READ_ERROR)
        {
          read_diag_details (st->orig_file_name,
@@ -1159,7 +1163,7 @@ dump_dir0 (struct tar_stat_info *st, char const 
*directory)
       char *name_buf;
       size_t name_size;
 
-      switch (check_exclusion_tags (st->fd, &tag_file_name))
+      switch (check_exclusion_tags (st, &tag_file_name))
        {
        case exclusion_tag_all:
          /* Handled in dump_file0 */
@@ -1224,21 +1228,93 @@ ensure_slash (char **pstr)
   (*pstr)[len] = '\0';
 }
 
+/* If we just ran out of file descriptors, release a file descriptor
+   in the directory chain somewhere leading from DIR->parent->parent
+   up through the root.  Return true if successful, false (preserving
+   errno == EMFILE) otherwise.
+
+   Do not release DIR's file descriptor, or DIR's parent, as other
+   code assumes that they work.  On some operating systems, another
+   process can claim file descriptor resources as we release them, and
+   some calls or their emulations require multiple file descriptors,
+   so callers should not give up if a single release doesn't work.  */
+
 static bool
-dump_dir (struct tar_stat_info *st)
+open_failure_recover (struct tar_stat_info const *dir)
+{
+  if (errno == EMFILE && dir && dir->parent)
+    {
+      struct tar_stat_info *p;
+      for (p = dir->parent->parent; p; p = p->parent)
+       if (0 < p->fd && (! p->parent || p->parent->fd <= 0))
+         {
+           tar_stat_close (p);
+           return true;
+         }
+      errno = EMFILE;
+    }
+
+  return false;
+}
+
+/* Return the directory entries of ST, in a dynamically allocated buffer,
+   each entry followed by '\0' and the last followed by an extra '\0'.
+   Return null on failure, setting errno.  */
+char *
+get_directory_entries (struct tar_stat_info *st)
 {
-  char *directory = 0;
-  int dupfd = dup (st->fd);
-  if (0 <= dupfd)
+  DIR *dirstream;
+  while (! (dirstream = fdopendir (st->fd)) && open_failure_recover (st))
+    continue;
+
+  if (! dirstream)
+    return 0;
+  else
     {
-      directory = fdsavedir (dupfd);
-      if (! directory)
+      char *entries = streamsavedir (dirstream);
+      int streamsavedir_errno = errno;
+
+      int fd = dirfd (dirstream);
+      if (fd < 0)
        {
-         int e = errno;
-         close (dupfd);
-         errno = e;
+         /* The dirent.h implementation doesn't use file descriptors
+            for directory streams, so open the directory again.  */
+         char const *name = st->orig_file_name;
+         if (closedir (dirstream) != 0)
+           close_diag (name);
+         dirstream = 0;
+         fd = subfile_open (st->parent,
+                            st->parent ? last_component (name) : name,
+                            open_searchdir_flags);
+         if (fd < 0)
+           fd = - errno;
+         else
+           {
+             struct stat dirst;
+             if (! (fstat (fd, &dirst) == 0
+                    && st->stat.st_ino == dirst.st_ino
+                    && st->stat.st_dev == dirst.st_dev))
+               {
+                 close (fd);
+                 fd = - IMPOSTOR_ERRNO;
+               }
+           }
        }
+
+      st->fd = fd;
+      st->dirstream = dirstream;
+      errno = streamsavedir_errno;
+      return entries;
     }
+}
+
+/* Dump the directory ST.  Return true if successful, false (emitting
+   diagnostics) otherwise.  Get ST's entries, recurse through its
+   subdirectories, and clean up file descriptors afterwards.  */
+static bool
+dump_dir (struct tar_stat_info *st)
+{
+  char *directory = get_directory_entries (st);
   if (! directory)
     {
       savedir_diag (st->orig_file_name);
@@ -1247,6 +1323,7 @@ dump_dir (struct tar_stat_info *st)
 
   dump_dir0 (st, directory);
 
+  restore_parent_fd (st);
   free (directory);
   return true;
 }
@@ -1307,21 +1384,19 @@ create_archive (void)
                    {
                      if (! st.orig_file_name)
                        {
-                         st.orig_file_name = xstrdup (p->name);
-                         st.fd = open (st.orig_file_name,
-                                       ((open_read_flags - O_RDONLY
-                                         + O_SEARCH)
-                                        | O_DIRECTORY));
-                         if (st.fd < 0)
+                         int fd = open (p->name, open_searchdir_flags);
+                         if (fd < 0)
                            {
                              open_diag (p->name);
                              break;
                            }
-                         if (fstat (st.fd, &st.stat) != 0)
+                         st.fd = fd;
+                         if (fstat (fd, &st.stat) != 0)
                            {
                              stat_diag (p->name);
                              break;
                            }
+                         st.orig_file_name = xstrdup (p->name);
                        }
                      if (buffer_size < plen + qlen)
                        {
@@ -1492,6 +1567,74 @@ check_links (void)
     }
 }
 
+/* Assuming DIR is the working directory, open FILE, using FLAGS to
+   control the open.  A null DIR means to use ".".  If we are low on
+   file descriptors, try to release one or more from DIR's parents to
+   reuse it.  */
+int
+subfile_open (struct tar_stat_info const *dir, char const *file, int flags)
+{
+  int fd;
+
+  static bool initialized;
+  if (! initialized)
+    {
+      /* Initialize any tables that might be needed when file
+        descriptors are exhausted, and whose initialization might
+        require a file descriptor.  This includes the system message
+        catalog and tar's message catalog.  */
+      initialized = true;
+      strerror (ENOENT);
+      gettext ("");
+    }
+
+  while ((fd = openat (dir ? dir->fd : AT_FDCWD, file, flags)) < 0
+        && open_failure_recover (dir))
+    continue;
+  return fd;
+}
+
+/* Restore the file descriptor for ST->parent, if it was temporarily
+   closed to conserve file descriptors.  On failure, set the file
+   descriptor to the negative of the corresponding errno value.  Call
+   this every time a subdirectory is ascended from.  */
+void
+restore_parent_fd (struct tar_stat_info const *st)
+{
+  struct tar_stat_info *parent = st->parent;
+  if (parent && ! parent->fd)
+    {
+      int parentfd = openat (st->fd, "..", open_searchdir_flags);
+      struct stat parentstat;
+
+      if (parentfd < 0)
+       parentfd = - errno;
+      else if (! (fstat (parentfd, &parentstat) == 0
+                 && parent->stat.st_ino == parentstat.st_ino
+                 && parent->stat.st_dev == parentstat.st_dev))
+       {
+         close (parentfd);
+         parentfd = IMPOSTOR_ERRNO;
+       }
+
+      if (parentfd < 0)
+       {
+         int origfd = open (parent->orig_file_name, open_searchdir_flags);
+         if (0 <= origfd)
+           {
+             if (fstat (parentfd, &parentstat) == 0
+                 && parent->stat.st_ino == parentstat.st_ino
+                 && parent->stat.st_dev == parentstat.st_dev)
+               parentfd = origfd;
+             else
+               close (origfd);
+           }
+       }
+
+      parent->fd = parentfd;
+    }
+}
+
 /* Dump a single file, recursing on directories.  ST is the file's
    status info, NAME its name relative to the parent directory, and P
    its full name (which may be relative to the working directory).  */
@@ -1508,10 +1651,11 @@ dump_file0 (struct tar_stat_info *st, char const *name, 
char const *p)
   struct timespec original_ctime;
   struct timespec restore_times[2];
   off_t block_ordinal = -1;
-  int fd = -1;
+  int fd = 0;
   bool is_dir;
-  bool top_level = ! st->parent;
-  int parentfd = top_level ? AT_FDCWD : st->parent->fd;
+  struct tar_stat_info const *parent = st->parent;
+  bool top_level = ! parent;
+  int parentfd = top_level ? AT_FDCWD : parent->fd;
   void (*diag) (char const *) = 0;
 
   if (interactive_option && !confirm ("add", p))
@@ -1523,15 +1667,24 @@ dump_file0 (struct tar_stat_info *st, char const *name, 
char const *p)
 
   transform_name (&st->file_name, XFORM_REGFILE);
 
-  if (fstatat (parentfd, name, &st->stat, fstatat_flags) != 0)
+  if (parentfd < 0 && ! top_level)
+    {
+      errno = - parentfd;
+      diag = open_diag;
+    }
+  else if (fstatat (parentfd, name, &st->stat, fstatat_flags) != 0)
     diag = stat_diag;
   else if (file_dumpable_p (&st->stat))
     {
-      fd = st->fd = openat (parentfd, name, open_read_flags);
+      fd = subfile_open (parent, name, open_read_flags);
       if (fd < 0)
        diag = open_diag;
-      else if (fstat (fd, &st->stat) != 0)
-       diag = stat_diag;
+      else
+       {
+         st->fd = fd;
+         if (fstat (fd, &st->stat) != 0)
+           diag = stat_diag;
+       }
     }
   if (diag)
     {
@@ -1600,7 +1753,7 @@ dump_file0 (struct tar_stat_info *st, char const *name, 
char const *p)
          ensure_slash (&st->orig_file_name);
          ensure_slash (&st->file_name);
 
-         if (check_exclusion_tags (fd, &tag_file_name) == exclusion_tag_all)
+         if (check_exclusion_tags (st, &tag_file_name) == exclusion_tag_all)
            {
              exclusion_tag_warning (st->orig_file_name, tag_file_name,
                                     _("directory not dumped"));
@@ -1608,12 +1761,15 @@ dump_file0 (struct tar_stat_info *st, char const *name, 
char const *p)
            }
 
          ok = dump_dir (st);
+
+         fd = st->fd;
+         parentfd = top_level ? AT_FDCWD : parent->fd;
        }
       else
        {
          enum dump_status status;
 
-         if (fd != -1 && sparse_option && ST_IS_SPARSE (st->stat))
+         if (fd && sparse_option && ST_IS_SPARSE (st->stat))
            {
              status = sparse_dump_file (fd, st);
              if (status == dump_status_not_implemented)
@@ -1641,14 +1797,26 @@ dump_file0 (struct tar_stat_info *st, char const *name, 
char const *p)
 
       if (ok)
        {
-         if ((fd < 0
-              ? fstatat (parentfd, name, &final_stat, fstatat_flags)
-              : fstat (fd, &final_stat))
-             != 0)
+         if (fd < 0)
            {
-             file_removed_diag (p, top_level, stat_diag);
+             errno = - fd;
              ok = false;
            }
+         else if (fd == 0)
+           {
+             if (parentfd < 0 && ! top_level)
+               {
+                 errno = - parentfd;
+                 ok = false;
+               }
+             else
+               ok = fstatat (parentfd, name, &final_stat, fstatat_flags) == 0;
+           }
+         else
+           ok = fstat (fd, &final_stat) == 0;
+
+         if (! ok)
+           file_removed_diag (p, top_level, stat_diag);
        }
 
       if (ok)
@@ -1669,16 +1837,7 @@ dump_file0 (struct tar_stat_info *st, char const *name, 
char const *p)
            utime_error (p);
        }
 
-      if (0 < fd)
-       {
-         if (close (fd) != 0)
-           {
-             close_diag (p);
-             ok = false;
-           }
-         st->fd = 0;
-       }
-
+      ok &= tar_stat_close (st);
       if (ok && remove_files_option)
        queue_deferred_unlink (p, is_dir);
 
diff --git a/src/incremen.c b/src/incremen.c
index afd19af..c6d4f4c 100644
--- a/src/incremen.c
+++ b/src/incremen.c
@@ -426,7 +426,6 @@ procdir (const char *name_buffer, struct tar_stat_info *st,
 {
   struct directory *directory;
   struct stat *stat_data = &st->stat;
-  int fd = st->fd;
   dev_t device = st->parent ? st->parent->stat.st_dev : 0;
   bool nfs = NFS_FILE_STAT (*stat_data);
 
@@ -566,7 +565,7 @@ procdir (const char *name_buffer, struct tar_stat_info *st,
     {
       const char *tag_file_name;
 
-      switch (check_exclusion_tags (fd, &tag_file_name))
+      switch (check_exclusion_tags (st, &tag_file_name))
        {
        case exclusion_tag_all:
          /* This warning can be duplicated by code in dump_file0, but only
@@ -680,8 +679,7 @@ struct directory *
 scan_directory (struct tar_stat_info *st)
 {
   char const *dir = st->orig_file_name;
-  int fd = st->fd;
-  char *dirp = 0;
+  char *dirp = get_directory_entries (st);
   dev_t device = st->stat.st_dev;
   bool cmdline = ! st->parent;
   namebuf_t nbuf;
@@ -689,18 +687,6 @@ scan_directory (struct tar_stat_info *st)
   struct directory *directory;
   char ch;
 
-  int dupfd = dup (fd);
-  if (0 <= dupfd)
-    {
-      dirp = fdsavedir (dupfd);
-      if (! dirp)
-       {
-         int e = errno;
-         close (dupfd);
-         errno = e;
-       }
-    }
-
   if (! dirp)
     savedir_error (dir);
 
@@ -734,19 +720,29 @@ scan_directory (struct tar_stat_info *st)
            *entry = 'N';
          else
            {
+             int fd = st->fd;
              void (*diag) (char const *) = 0;
              struct tar_stat_info stsub;
              tar_stat_init (&stsub);
 
-             if (fstatat (fd, entry + 1, &stsub.stat, fstatat_flags) != 0)
+             if (fd < 0)
+               {
+                 errno = - fd;
+                 diag = open_diag;
+               }
+             else if (fstatat (fd, entry + 1, &stsub.stat, fstatat_flags) != 0)
                diag = stat_diag;
              else if (S_ISDIR (stsub.stat.st_mode))
                {
-                 stsub.fd = openat (fd, entry + 1, open_read_flags);
-                 if (stsub.fd < 0)
+                 int subfd = subfile_open (st, entry + 1, open_read_flags);
+                 if (subfd < 0)
                    diag = open_diag;
-                 else if (fstat (stsub.fd, &stsub.stat) != 0)
-                   diag = stat_diag;
+                 else
+                   {
+                     stsub.fd = subfd;
+                     if (fstat (subfd, &stsub.stat) != 0)
+                       diag = stat_diag;
+                   }
                }
 
              if (diag)
@@ -762,7 +758,10 @@ scan_directory (struct tar_stat_info *st)
                  else if (directory->children == ALL_CHILDREN)
                    pd_flag |= PD_FORCE_CHILDREN | ALL_CHILDREN;
                  *entry = 'D';
+
+                 stsub.parent = st;
                  procdir (full_name, &stsub, pd_flag, entry);
+                 restore_parent_fd (&stsub);
                }
              else if (one_file_system_option && device != stsub.stat.st_dev)
                *entry = 'N';
diff --git a/src/names.c b/src/names.c
index 2fc751d..c38ccb6 100644
--- a/src/names.c
+++ b/src/names.c
@@ -818,6 +818,7 @@ add_hierarchy_to_namelist (struct tar_stat_info *st, struct 
name *name)
            {
              struct name *np;
              struct tar_stat_info subdir;
+             int subfd;
 
              if (allocated_length <= name_length + string_length)
                {
@@ -841,21 +842,32 @@ add_hierarchy_to_namelist (struct tar_stat_info *st, 
struct name *name)
 
              tar_stat_init (&subdir);
              subdir.parent = st;
-             subdir.fd = openat (st->fd, string + 1,
-                                 open_read_flags | O_DIRECTORY);
-             if (subdir.fd < 0)
-               open_diag (namebuf);
-             else if (fstat (subdir.fd, &subdir.stat) != 0)
-               stat_diag (namebuf);
-             else if (! (O_DIRECTORY || S_ISDIR (subdir.stat.st_mode)))
+             if (st->fd < 0)
                {
-                 errno = ENOTDIR;
-                 open_diag (namebuf);
+                 subfd = -1;
+                 errno = - st->fd;
                }
              else
+               subfd = subfile_open (st, string + 1,
+                                     open_read_flags | O_DIRECTORY);
+             if (subfd < 0)
+               open_diag (namebuf);
+             else
                {
-                 subdir.orig_file_name = xstrdup (namebuf);
-                 add_hierarchy_to_namelist (&subdir, np);
+                 subdir.fd = subfd;
+                 if (fstat (subfd, &subdir.stat) != 0)
+                   stat_diag (namebuf);
+                 else if (! (O_DIRECTORY || S_ISDIR (subdir.stat.st_mode)))
+                   {
+                     errno = ENOTDIR;
+                     open_diag (namebuf);
+                   }
+                 else
+                   {
+                     subdir.orig_file_name = xstrdup (namebuf);
+                     add_hierarchy_to_namelist (&subdir, np);
+                     restore_parent_fd (&subdir);
+                   }
                }
 
              tar_stat_destroy (&subdir);
@@ -976,16 +988,20 @@ collect_and_sort_names (void)
        }
       if (S_ISDIR (st.stat.st_mode))
        {
-         st.fd = open (name->name, open_read_flags | O_DIRECTORY);
-         if (st.fd < 0)
+         int dir_fd = open (name->name, open_read_flags | O_DIRECTORY);
+         if (dir_fd < 0)
            open_diag (name->name);
-         else if (fstat (st.fd, &st.stat) != 0)
-           stat_diag (name->name);
-         else if (O_DIRECTORY || S_ISDIR (st.stat.st_mode))
+         else
            {
-             st.orig_file_name = xstrdup (name->name);
-             name->found_count++;
-             add_hierarchy_to_namelist (&st, name);
+             st.fd = dir_fd;
+             if (fstat (dir_fd, &st.stat) != 0)
+               stat_diag (name->name);
+             else if (O_DIRECTORY || S_ISDIR (st.stat.st_mode))
+               {
+                 st.orig_file_name = xstrdup (name->name);
+                 name->found_count++;
+                 add_hierarchy_to_namelist (&st, name);
+               }
            }
        }
 
diff --git a/src/tar.c b/src/tar.c
index ab53cec..6fd117c 100644
--- a/src/tar.c
+++ b/src/tar.c
@@ -2465,13 +2465,17 @@ decode_options (int argc, char **argv)
   if (recursive_unlink_option)
     old_files_option = UNLINK_FIRST_OLD_FILES;
 
-  /* Flags for accessing files to be copied into.  POSIX says
+  /* Flags for accessing files to be read from or copied into.  POSIX says
      O_NONBLOCK has unspecified effect on most types of files, but in
      practice it never harms and sometimes helps.  */
-  open_read_flags =
-    (O_RDONLY | O_BINARY | O_NOCTTY | O_NONBLOCK
-     | (dereference_option ? 0 : O_NOFOLLOW)
-     | (atime_preserve_option == system_atime_preserve ? O_NOATIME : 0));
+  {
+    int base_open_flags =
+      (O_BINARY | O_CLOEXEC | O_NOCTTY | O_NONBLOCK
+       | (dereference_option ? 0 : O_NOFOLLOW)
+       | (atime_preserve_option == system_atime_preserve ? O_NOATIME : 0));
+    open_read_flags = O_RDONLY | base_open_flags;
+    open_searchdir_flags = O_SEARCH | O_DIRECTORY | base_open_flags;
+  }
   fstatat_flags = dereference_option ? 0 : AT_SYMLINK_NOFOLLOW;
 
   if (subcommand_option == TEST_LABEL_SUBCOMMAND)
@@ -2684,9 +2688,31 @@ tar_stat_init (struct tar_stat_info *st)
   memset (st, 0, sizeof (*st));
 }
 
+/* Close the stream or file descriptor associated with ST, and remove
+   all traces of it from ST.  Return true if successful, false (with a
+   diagnostic) otherwise.  */
+bool
+tar_stat_close (struct tar_stat_info *st)
+{
+  int status = (st->dirstream ? closedir (st->dirstream)
+               : 0 < st->fd ? close (st->fd)
+               : 0);
+  st->dirstream = 0;
+  st->fd = 0;
+
+  if (status == 0)
+    return true;
+  else
+    {
+      close_diag (st->orig_file_name);
+      return false;
+    }
+}
+
 void
 tar_stat_destroy (struct tar_stat_info *st)
 {
+  tar_stat_close (st);
   free (st->orig_file_name);
   free (st->file_name);
   free (st->link_name);
@@ -2694,8 +2720,6 @@ tar_stat_destroy (struct tar_stat_info *st)
   free (st->gname);
   free (st->sparse_map);
   free (st->dumpdir);
-  if (0 < st->fd)
-    close (st->fd);
   xheader_destroy (&st->xhdr);
   memset (st, 0, sizeof (*st));
 }
diff --git a/src/tar.h b/src/tar.h
index c35ba5d..ce9850c 100644
--- a/src/tar.h
+++ b/src/tar.h
@@ -322,12 +322,20 @@ struct tar_stat_info
      file is at the top level.  */
   struct tar_stat_info *parent;
 
+  /* Directory stream.  If this is not null, it is in control of FD,
+     and should be closed instead of FD.  */
+  DIR *dirstream;
+
   /* File descriptor, if creating an archive, and if a directory or a
-     regular file or a contiguous file.  This is AT_FDCWD if it is the
-     working directory, which is possible only for a dummy parent node
-     just above the top level.  It may be -1 if the file could not be
-     opened.  Zero represents an otherwise-uninitialized value;
-     standard input is never used here.  */
+     regular file or a contiguous file.
+
+     It is zero if no file descriptor is available, either because it
+     was never needed or because it was open and then closed to
+     conserve on file descriptors.  (Standard input is never used
+     here, so zero cannot be a valid file descriptor.)
+
+     It is negative if it could not be reopened after it was closed.
+     Negate it to find out what errno was when the reopen failed.  */
   int fd;
 };
 
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 4b70000..ea1c6ae 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -77,6 +77,7 @@ TESTSUITE_AT = \
  extrac08.at\
  extrac09.at\
  extrac10.at\
+ extrac11.at\
  filerem01.at\
  filerem02.at\
  gzip.at\
diff --git a/tests/extrac11.at b/tests/extrac11.at
new file mode 100644
index 0000000..9456695
--- /dev/null
+++ b/tests/extrac11.at
@@ -0,0 +1,77 @@
+# Process this file with autom4te to create testsuite. -*- Autotest -*-
+
+# Test suite for GNU tar.
+# Copyright (C) 2010 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# written by Paul Eggert
+
+# Check that 'tar' works even in a file-descriptor-limited environment.
+
+AT_SETUP([scarce file descriptors])
+AT_KEYWORDS([extract extrac11])
+
+AT_TAR_CHECK([
+dirs='a
+      a/b
+      a/b/c
+      a/b/c/d
+      a/b/c/d/e
+      a/b/c/d/e/f
+      a/b/c/d/e/f/g
+      a/b/c/d/e/f/g/h
+      a/b/c/d/e/f/g/h/i
+      a/b/c/d/e/f/g/h/i/j
+      a/b/c/d/e/f/g/h/i/j/k
+'
+files=
+mkdir $dirs dest1 dest2 dest3 || exit
+for dir in $dirs; do
+  for file in X Y Z; do
+    echo $file >$dir/$file || exit
+    files="$files $file"
+  done
+done
+
+# Check that "ulimit" itself works.
+((ulimit -n 100 &&
+  tar -cf archive1.tar a 3<&- 4<&- 5<&- 6<&- 7<&- 8<&- 9<&- &&
+  tar -xf archive1.tar -C dest1 a 3<&- 4<&- 5<&- 6<&- 7<&- 8<&- 9<&-
+ ) &&
+ diff -r a dest1/a
+) >/dev/null 2>&1 ||
+   AT_SKIP_TEST
+
+# Another test that "ulimit" itself works:
+# tar should fail when completely starved of file descriptors.
+((ulimit -n 4 &&
+  tar -cf archive2.tar a 3<&- 4<&- 5<&- 6<&- 7<&- 8<&- 9<&- &&
+  tar -xf archive2.tar -C dest2 a 3<&- 4<&- 5<&- 6<&- 7<&- 8<&- 9<&-
+ ) &&
+ diff -r a dest2/a
+) >/dev/null 2>&1 &&
+   AT_SKIP_TEST
+
+# Tar should work when there are few, but enough, file descriptors.
+((ulimit -n 10 &&
+  tar -cf archive3.tar a 3<&- 4<&- 5<&- 6<&- 7<&- 8<&- 9<&- &&
+  tar -xf archive3.tar -C dest3 a 3<&- 4<&- 5<&- 6<&- 7<&- 8<&- 9<&-
+ ) &&
+ diff -r a dest3/a >/dev/null 2>&1
+) || { diff -r a dest3/a; exit 1; }
+],
+[0],[],[],[],[],[gnu])
+
+AT_CLEANUP
diff --git a/tests/testsuite.at b/tests/testsuite.at
index 8866ed0..2b2bd6b 100644
--- a/tests/testsuite.at
+++ b/tests/testsuite.at
@@ -149,6 +149,7 @@ m4_include([extrac07.at])
 m4_include([extrac08.at])
 m4_include([extrac09.at])
 m4_include([extrac10.at])
+m4_include([extrac11.at])
 
 m4_include([label01.at])
 m4_include([label02.at])
-- 
1.7.2




reply via email to

[Prev in Thread] Current Thread [Next in Thread]