bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#21459: Race condition in tests/tail-2/assert.sh


From: Pádraig Brady
Subject: bug#21459: Race condition in tests/tail-2/assert.sh
Date: Fri, 11 Sep 2015 21:34:25 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

On 11/09/15 17:23, Ludovic Courtès wrote:
> Hello,
> 
> We have observed intermittent failures of tests/tail-2/assert.sh
> (Coreutils 8.24, libc 2.22), especially showing up on relatively slow
> machines (armhf and mips64el.)
> 
> The failure is with ‘tail --follow=name’, which, in inotify mode, would
> fail to report that file ‘foo’ has been deleted.
> 
> The strace of a correct execution (x86_64) is like this:
> 
> --8<---------------cut here---------------start------------->8---
> lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> inotify_init()                          = 5
> write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25
> inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
> inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2
> inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
> inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) 
> = 3
> open("a", O_RDONLY|O_NONBLOCK)          = 6
> lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> fstat(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, 
> f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, 
> f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0
> close(6)                                = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> open("foo", O_RDONLY|O_NONBLOCK)        = 6
> lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, 
> f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, 
> f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0
> close(6)                                = 0
> fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16
> open("foo", O_RDONLY|O_NONBLOCK)        = -1 ENOENT (No such file or 
> directory)
> lstat("foo", 0x7ffee174c3b0)            = -1 ENOENT (No such file or 
> directory)
> write(2, "tail: ", 6)                   = 6
> write(2, "foo", 3)                      = 3
> write(2, ": No such file or directory", 27) = 27
> write(2, "\n", 1)                       = 1
> close(4)                                = 0
> read(5, "\3\0\0\0\0\4\0\0\0\0\0\0\0\0\0\0", 20) = 16
> inotify_rm_watch(5, 3)                  = -1 EINVAL (Invalid argument)
> --8<---------------cut here---------------end--------------->8---
> 
> For a *failing* execution (armhf), we get:
> 
> --8<---------------cut here---------------start------------->8---
> lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> inotify_init()                          = 5
> write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25
> inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
> inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2
> inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
> inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) 
> = 3
> open("a", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
> lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> fstat64(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, 
> f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, 
> f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) 
> = 0
> close(6)                                = 0
> fstat64(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
> lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, 
> f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, 
> f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) 
> = 0
> close(6)                                = 0
> fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16
> open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
> lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, 
> f_bfree=36178938, f_bavail=34274467, f_files=9502720, f_ffree=9405759, 
> f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) 
> = 0
> close(6)                                = 0
> read(5,
> --8<---------------cut here---------------end--------------->8---
> 
> In both cases, reading from the inotify file descriptor (number 5)
> returns a notification for watch #3 (corresponding to ‘foo’), with mask
> IN_ATTRIB (value 4).
> 
> However, the open("foo") call that immediately follows does *not* return
> ENOENT in the failing case: The file is still there.
> 
> The kernel’s ‘vfs_unlink’ goes like this:
> 
>         fsnotify_link_count(target);                    /* IN_ATTRIB */
>         d_delete(dentry); /* fsnotify_nameremove → IN_DELETE for “.” */
> 
> So, ‘tail’ first receives the IN_ATTRIB notification corresponding to
> the link count decrease on ‘foo’; at that point, ‘foo’ is still
> available.  And then, ‘tail’ should receive the IN_DELETE_SELF
> notification, at which point ‘foo’ would have been actually been
> unlinked.  But in practice we don’t seem to be receiving IN_DELETE_SELF,
> even in the succeeding case.
> 
> I think the problem happens when ‘tail’ opens ‘foo’ right in between of
> the two notifications: ‘foo’ is still there, and so ‘tail’ doesn’t
> report anything.
> 
> Does that make sense?

Thanks for looking into it in such detail.
IN_DELETE_SELF is only sent when the inode is removed,
not on the unlink itself (as tail has the file open).
One of the awkward gotchas with inotify being inode oriented.

I.E. I wonder is this a cache coherency issue,
with tail referencing the stale inode in the open()
before the dcache is updated.

It'll be a while before I can look at this.

thanks again,
Pádraig.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]