grub-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[RFC] [PATCH] Generate stable device names in device.map on Linux


From: Colin Watson
Subject: [RFC] [PATCH] Generate stable device names in device.map on Linux
Date: Mon, 21 Jun 2010 10:34:30 +0100
User-agent: Mutt/1.5.18 (2008-05-17)

I have far too many Debian bugs that amount to problems with Linux
device names being fundamentally unstable: whether a disk is /dev/sda or
/dev/sdb depends on bus probing order, and whether a disk is /dev/sda or
/dev/hda depends on whether the Linux kernel was built with or without
libata.  Of course, GRUB has already committed to the idea that it
should be as independent of this as reasonably possible; that's why we
use UUIDs everywhere.

There is still one significant place in GRUB where we use traditional
device names, though: device.map.  UUIDs don't apply to disk devices,
only filesystems, and so the mechanism for finding stable device names
here is inevitably kernel-specific.  Some almost certainly have simpler
device architectures and don't need any of this.  But on Linux, it's
vitally important.

We can actually mostly work without a device.map nowadays, but there are
still some warts (e.g. LVM/RAID probing requires nasty teardown/setup of
the upper disk layers if you don't have a device.map, and we haven't yet
got round to removing 'set root=' statements in grub.cfg if they're
going to be meaningless at boot time due to including OS device names).
I don't really want to reopen that discussion just now, but I would like
to make sure that *if* you have a device.map then it uses stable device
names if possible.

This is important because if you have a device.map but it doesn't
include the disk you're trying to boot from (e.g. because your kernel
just started using libata, so device.map lists /dev/hda but you only
have /dev/sda), then grub-probe will fail and you're screwed unless you
know which files to edit by hand.  I'm OK with the odd user who's doing
something weird having to edit files by hand, but right now most Debian
users upgrading from lenny to squeeze will have to do this which is
really too much.

On Linux, the sane choices for stable names for disk devices are those
in /dev/disk/by-id/.  They're symlinks to the real disk devices, and
they're constructed using information such as the serial number of the
disk so they only change if the physical disk changes.  They're built in
userspace by udev rules, so are not necessarily available for all
devices (I hear there are problems with virtual disks under VMware, for
instance), but they're pretty widely available.  There may be multiple
symlinks for any one disk - for example, my laptop disk has both ata-*
and scsi-SATA_* by-id names - but this can be handled pretty stably by
just sorting the list of by-id names and picking the first.

So, I propose the following.  On Linux, use by-id names if we have them,
but if we don't, fall back to what we did before.  Call
canonicalize_file_name on every device name and make sure we never emit
the same one twice.  'grub-probe --target=device' still emits
traditional device names (i.e. the symlink target), but that's OK and in
fact it's probably best to leave this alone.

This means that a Debian package can convert people's device.map files
to stable device names on upgrade, and that it only needs to do this
once.  Thereafter, people only need to change it if they're adding or
removing disks that are relevant for booting, but other than that they
can leave it alone.  The same Debian upgrade can also start remembering
the disk on which grub-install should be run in terms of by-id names,
which should sort out the vast bulk of upgrade bugs we have.

The result looks something like this:

  $ sudo ./grub-mkdevicemap -m -
  (hd0)   /dev/disk/by-id/ata-FUJITSU_MHW2080BJ_G2_K30TT772603R
  (hd1)   
/dev/disk/by-id/usb-WD_My_Book_AV_1020_574341563535393535373637202020-0:0
  $ sudo ./grub-mkdevicemap -m - | sudo ./grub-probe 
--device-map=/proc/self/fd/0 --target=device /
  /dev/sda5
  $ sudo ./grub-mkdevicemap -m - | sudo ./grub-probe 
--device-map=/proc/self/fd/0 --target=drive /
  (hd0,msdos5)
  $ sudo ./grub-mkdevicemap -m - | sudo ./grub-probe 
--device-map=/proc/self/fd/0 --target=fs /
  ext2

What do people think?

2010-06-21  Colin Watson  <address@hidden>

        Change grub-mkdevicemap to emit /dev/disk/by-id/ names where
        possible on Linux.

        * util/deviceiter.c (check_device): Maintain and check a list of
        which devices (by canonicalized name) have already been seen.
        (clear_seen_devices): New function.
        (compare_file_names) [__linux__]: New function.
        (grub_util_iterate_devices): Clear the list of seen devices on exit
        and (just in case) on entry.
        (grub_util_iterate_devices) [__linux__]: Iterate over non-partition
        devices in /dev/disk/by-id/, in sorted order.  Remove DM-RAID
        seen-devices list, superseded by general code in check_device.

=== modified file 'util/deviceiter.c'
--- util/deviceiter.c   2010-06-11 20:31:16 +0000
+++ util/deviceiter.c   2010-06-21 08:54:07 +0000
@@ -28,6 +28,7 @@
 #include <errno.h>
 #include <fcntl.h>
 #include <limits.h>
+#include <dirent.h>
 
 #include <grub/util/misc.h>
 #include <grub/util/deviceiter.h>
@@ -345,18 +346,37 @@ get_xvd_disk_name (char *name, int unit)
 }
 #endif
 
+static struct seen_device
+{
+  struct seen_device *next;
+  const char *name;
+} *seen;
+
 /* Check if DEVICE can be read. If an error occurs, return zero,
    otherwise return non-zero.  */
 static int
 check_device (const char *device)
 {
+  char *real_device;
   char buf[512];
   FILE *fp;
+  struct seen_device *seen_elt;
 
   /* If DEVICE is empty, just return error.  */
   if (*device == 0)
     return 0;
 
+  /* Have we seen this device already?  */
+  real_device = canonicalize_file_name (device);
+  if (! real_device)
+    return 0;
+  if (grub_named_list_find (GRUB_AS_NAMED_LIST (seen), real_device))
+    {
+      grub_dprintf ("deviceiter", "Already seen %s (%s)\n",
+                   device, real_device);
+      goto fail;
+    }
+
   fp = fopen (device, "r");
   if (! fp)
     {
@@ -379,7 +399,7 @@ check_device (const char *device)
          break;
        }
       /* Error opening the device.  */
-      return 0;
+      goto fail;
     }
 
   /* Make sure CD-ROMs don't get assigned a BIOS disk number
@@ -387,7 +407,7 @@ check_device (const char *device)
 #ifdef __linux__
 # ifdef CDROM_GET_CAPABILITY
   if (ioctl (fileno (fp), CDROM_GET_CAPABILITY, 0) >= 0)
-    return 0;
+    goto fail;
 # else /* ! CDROM_GET_CAPABILITY */
   /* Check if DEVICE is a CD-ROM drive by the HDIO_GETGEO ioctl.  */
   {
@@ -395,14 +415,14 @@ check_device (const char *device)
     struct stat st;
 
     if (fstat (fileno (fp), &st))
-      return 0;
+      goto fail;
 
     /* If it is a block device and isn't a floppy, check if HDIO_GETGEO
        succeeds.  */
     if (S_ISBLK (st.st_mode)
        && MAJOR (st.st_rdev) != FLOPPY_MAJOR
        && ioctl (fileno (fp), HDIO_GETGEO, &hdg))
-      return 0;
+      goto fail;
   }
 # endif /* ! CDROM_GET_CAPABILITY */
 #endif /* __linux__ */
@@ -410,7 +430,7 @@ check_device (const char *device)
 #if defined(__FreeBSD_kernel__) || defined(__NetBSD__) || defined(__OpenBSD__)
 # ifdef CDIOCCLRDEBUG
   if (ioctl (fileno (fp), CDIOCCLRDEBUG, 0) >= 0)
-    return 0;
+    goto fail;
 # endif /* CDIOCCLRDEBUG */
 #endif /* __FreeBSD_kernel__ || __NetBSD__ || __OpenBSD__ */
 
@@ -418,21 +438,43 @@ check_device (const char *device)
   if (fread (buf, 1, 512, fp) != 512)
     {
       fclose (fp);
-      return 0;
+      goto fail;
     }
 
+  /* Remember that we've seen this device.  */
+  seen_elt = xmalloc (sizeof *seen_elt);
+  seen_elt->name = real_device; /* steal memory */
+  grub_list_push (GRUB_AS_LIST_P (&seen), GRUB_AS_LIST (seen_elt));
+
   fclose (fp);
   return 1;
+
+fail:
+  free (real_device);
+  return 0;
+}
+
+static void
+clear_seen_devices (void)
+{
+  while (seen)
+    {
+      struct seen_device *seen_elt = seen;
+      seen = seen->next;
+      free (seen_elt);
+    }
+  seen = NULL;
 }
 
 #ifdef __linux__
-# ifdef HAVE_DEVICE_MAPPER
-struct dmraid_seen
+/* Like strcmp, but doesn't require a cast for use as a qsort comparator.  */
+static int
+compare_file_names (const void *a, const void *b)
 {
-  struct dmraid_seen *next;
-  const char *name;
-};
-# endif /* HAVE_DEVICE_MAPPER */
+  const char *left = *(const char **) a;
+  const char *right = *(const char **) b;
+  return strcmp (left, right);
+}
 #endif /* __linux__ */
 
 void
@@ -441,6 +483,8 @@ grub_util_iterate_devices (int NESTED_FU
 {
   int i;
 
+  clear_seen_devices ();
+
   /* Floppies.  */
   for (i = 0; i < floppy_disks; i++)
     {
@@ -453,10 +497,56 @@ grub_util_iterate_devices (int NESTED_FU
       /* In floppies, write the map, whether check_device succeeds
         or not, because the user just may not insert floppies.  */
       if (hook (name, 1))
-       return;
+       goto out;
     }
 
 #ifdef __linux__
+  {
+    DIR *dir = opendir ("/dev/disk/by-id");
+
+    if (dir)
+      {
+       struct dirent *entry;
+       char **names;
+       size_t names_len = 0, names_max = 1024, i;
+
+       names = xmalloc (names_max * sizeof *names);
+
+       /* Dump all the directory entries into names, resizing if
+          necessary.  */
+       for (entry = readdir (dir); entry; entry = readdir (dir))
+         {
+           /* Skip partition entries.  */
+           if (strstr (entry->d_name, "-part"))
+             continue;
+           if (names_len >= names_max)
+             {
+               names_max *= 2;
+               names = xrealloc (names, names_max * sizeof *names);
+             }
+           names[names_len++] = xasprintf (entry->d_name);
+         }
+
+       qsort (names, names_len, sizeof *names, &compare_file_names);
+
+       closedir (dir);
+
+       /* Now add all the devices in sorted order.  */
+       for (i = 0; i < names_len; ++i)
+         {
+           char *path = xasprintf ("/dev/disk/by-id/%s", names[i]);
+           if (check_device (path))
+             {
+               if (hook (path, 0))
+                 goto out;
+             }
+           free (path);
+           free (names[i]);
+         }
+       free (names);
+      }
+  }
+
   if (have_devfs ())
     {
       i = 0;
@@ -476,10 +566,10 @@ grub_util_iterate_devices (int NESTED_FU
            {
              strcat (name, "/disc");
              if (hook (name, 0))
-               return;
+               goto out;
            }
        }
-      return;
+      goto out;
     }
 #endif /* __linux__ */
 
@@ -492,7 +582,7 @@ grub_util_iterate_devices (int NESTED_FU
       if (check_device (name))
        {
          if (hook (name, 0))
-           return;
+           goto out;
        }
     }
 
@@ -506,7 +596,7 @@ grub_util_iterate_devices (int NESTED_FU
       if (check_device (name))
        {
          if (hook (name, 0))
-           return;
+           goto out;
        }
     }
 
@@ -519,7 +609,7 @@ grub_util_iterate_devices (int NESTED_FU
       if (check_device (name))
        {
          if (hook (name, 0))
-           return;
+           goto out;
         }
     }
 
@@ -532,7 +622,7 @@ grub_util_iterate_devices (int NESTED_FU
       if (check_device (name))
        {
          if (hook (name, 0))
-           return;
+           goto out;
        }
     }
 #endif /* __linux__ */
@@ -546,7 +636,7 @@ grub_util_iterate_devices (int NESTED_FU
       if (check_device (name))
        {
          if (hook (name, 0))
-           return;
+           goto out;
        }
     }
 
@@ -569,7 +659,7 @@ grub_util_iterate_devices (int NESTED_FU
            if (check_device (name))
              {
                if (hook (name, 0))
-                 return;
+                 goto out;
              }
          }
       }
@@ -590,7 +680,7 @@ grub_util_iterate_devices (int NESTED_FU
            if (check_device (name))
              {
                if (hook (name, 0))
-                 return;
+                 goto out;
              }
          }
       }
@@ -611,7 +701,7 @@ grub_util_iterate_devices (int NESTED_FU
            if (check_device (name))
              {
                if (hook (name, 0))
-                 return;
+                 goto out;
              }
          }
       }
@@ -632,7 +722,7 @@ grub_util_iterate_devices (int NESTED_FU
            if (check_device (name))
              {
                if (hook (name, 0))
-                 return;
+                 goto out;
              }
          }
       }
@@ -650,7 +740,7 @@ grub_util_iterate_devices (int NESTED_FU
        if (check_device (name))
          {
            if (hook (name, 0))
-             return;
+             goto out;
          }
       }
   }
@@ -664,7 +754,7 @@ grub_util_iterate_devices (int NESTED_FU
       if (check_device (name))
        {
          if (hook (name, 0))
-           return;
+           goto out;
        }
     }
 
@@ -685,7 +775,6 @@ grub_util_iterate_devices (int NESTED_FU
       unsigned int next = 0;
       void *top_handle, *second_handle;
       struct dm_tree_node *root, *top, *second;
-      struct dmraid_seen *seen = NULL;
 
       /* Build DM tree for all devices.  */
       tree = dm_tree_create ();
@@ -721,7 +810,6 @@ grub_util_iterate_devices (int NESTED_FU
            {
              const char *node_name, *node_uuid;
              char *name;
-             struct dmraid_seen *seen_elt;
 
              node_name = dm_tree_node_get_name (second);
              dmraid_check (node_name, "dm_tree_node_get_name failed\n");
@@ -733,40 +821,21 @@ grub_util_iterate_devices (int NESTED_FU
                  goto dmraid_next_child;
                }
 
-             /* Have we already seen this node?  There are typically very few
-                DM-RAID disks, so a list should be fast enough.  */
-             if (grub_named_list_find (GRUB_AS_NAMED_LIST (seen), node_name))
-               {
-                 grub_dprintf ("deviceiter", "Already seen DM device %s\n",
-                               node_name);
-                 goto dmraid_next_child;
-               }
-
              name = xasprintf ("/dev/mapper/%s", node_name);
              if (check_device (name))
                {
                  if (hook (name, 0))
                    {
                      free (name);
-                     while (seen)
-                       {
-                         struct dmraid_seen *seen_elt = seen;
-                         seen = seen->next;
-                         free (seen_elt);
-                       }
                      if (task)
                        dm_task_destroy (task);
                      if (tree)
                        dm_tree_free (tree);
-                     return;
+                     goto out;
                    }
                }
              free (name);
 
-             seen_elt = xmalloc (sizeof *seen_elt);
-             seen_elt->name = node_name;
-             grub_list_push (GRUB_AS_LIST_P (&seen), GRUB_AS_LIST (seen_elt));
-
 dmraid_next_child:
              second = dm_tree_next_child (&second_handle, top, 1);
            }
@@ -774,12 +843,6 @@ dmraid_next_child:
        }
 
 dmraid_end:
-      while (seen)
-       {
-         struct dmraid_seen *seen_elt = seen;
-         seen = seen->next;
-         free (seen_elt);
-       }
       if (task)
        dm_task_destroy (task);
       if (tree)
@@ -787,5 +850,8 @@ dmraid_end:
     }
 # endif /* HAVE_DEVICE_MAPPER */
 #endif /* __linux__ */
+
+out:
+  clear_seen_devices ();
 }
 

Thanks,

-- 
Colin Watson                                       address@hidden



reply via email to

[Prev in Thread] Current Thread [Next in Thread]