qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] block/gluster: glfs_lseek() workaround


From: Jeff Cody
Subject: Re: [Qemu-devel] [PATCH] block/gluster: glfs_lseek() workaround
Date: Wed, 24 May 2017 12:26:14 -0400
User-agent: Mutt/1.5.24 (2015-08-30)

On Wed, May 24, 2017 at 11:02:02AM +0200, Niels de Vos wrote:
> On Tue, May 23, 2017 at 01:27:50PM -0400, Jeff Cody wrote:
> > On current released versions of glusterfs, glfs_lseek() will sometimes
> > return invalid values for SEEK_DATA or SEEK_HOLE.  For SEEK_DATA and
> > SEEK_HOLE, the returned value should be >= the passed offset, or < 0 in
> > the case of error:
> > 
> > LSEEK(2):
> > 
> >     off_t lseek(int fd, off_t offset, int whence);
> > 
> >     [...]
> > 
> >     SEEK_HOLE
> >               Adjust  the file offset to the next hole in the file greater
> >               than or equal to offset.  If offset points into the middle of
> >               a hole, then the file offset is set to offset.  If there is no
> >               hole past offset, then the file offset is adjusted to the end
> >               of the file (i.e., there is  an implicit hole at the end of
> >               any file).
> > 
> >     [...]
> > 
> >     RETURN VALUE
> >               Upon  successful  completion,  lseek()  returns  the resulting
> >               offset location as measured in bytes from the beginning of the
> >               file.  On error, the value (off_t) -1 is returned and errno is
> >               set to indicate the error
> > 
> > However, occasionally glfs_lseek() for SEEK_HOLE/DATA will return a
> > value less than the passed offset, yet greater than zero.
> > 
> > For instance, here are example values observed from this call:
> > 
> >     offs = glfs_lseek(s->fd, start, SEEK_HOLE);
> >     if (offs < 0) {
> >         return -errno;          /* D1 and (H3 or H4) */
> >     }
> > 
> > start == 7608336384
> > offs == 7607877632
> > 
> > This causes QEMU to abort on the assert test.  When this value is
> > returned, errno is also 0.
> > 
> > This is a reported and known bug to glusterfs:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1425293
> > 
> > Although this is being fixed in gluster, we still should work around it
> > in QEMU, given that multiple released versions of gluster behave this
> > way.
> 
> Versions of GlusterFS 3.8.0 - 3.8.8 are affected, 3.8.9 has the fix
> already and was released in February this year. We encourage users to
> update to recent versions, and provide a stable (bugfix only) update
> each month.
> 
> The Red Hat Gluster Storage product that is often used in combination
> with QEMU (+oVirt) does unfortunately not have an update where this is
> fixed.
> 
> Using an unfixed Gluster storage environment can cause QEMU to segfault.
> Although fixes exist for Gluster, not all users will have them
> installed. Preventing the segfault in QEMU due to a broken storage
> environment makes sense to me.
> 
> > This patch treats the return case of (offs < start) the same as if an
> > error value other than ENXIO is returned; we will assume we learned
> > nothing, and there are no holes in the file.
> > 
> > Signed-off-by: Jeff Cody <address@hidden>
> > ---
> >  block/gluster.c | 18 ++++++++++++++++--
> >  1 file changed, 16 insertions(+), 2 deletions(-)
> > 
> > diff --git a/block/gluster.c b/block/gluster.c
> > index 7c76cd0..c147909e 100644
> > --- a/block/gluster.c
> > +++ b/block/gluster.c
> > @@ -1275,7 +1275,14 @@ static int find_allocation(BlockDriverState *bs, 
> > off_t start,
> >      if (offs < 0) {
> >          return -errno;          /* D3 or D4 */
> >      }
> > -    assert(offs >= start);
> > +
> > +    if (offs < start) {
> > +        /* This is not a valid return by lseek().  We are safe to just 
> > return
> > +         * -EIO in this case, and we'll treat it like D4. Unfortunately 
> > some
> > +         *  versions of libgfapi will return offs < start, so an assert 
> > here
> > +         *  will unneccesarily abort QEMU. */
> 
> This is not really correct, the problem is not in libgfapi, but in the
> protocol translator on the server-side. The version of libgfapi does not
> matter here, it is the version on the server. But that might be too much
> detail for the comment.
> 
> > +        return -EIO;
> > +    }
> >  
> >      if (offs > start) {
> >          /* D2: in hole, next data at offs */
> > @@ -1307,7 +1314,14 @@ static int find_allocation(BlockDriverState *bs, 
> > off_t start,
> >      if (offs < 0) {
> >          return -errno;          /* D1 and (H3 or H4) */
> >      }
> > -    assert(offs >= start);
> > +
> > +    if (offs < start) {
> > +        /* This is not a valid return by lseek().  We are safe to just 
> > return
> > +         * -EIO in this case, and we'll treat it like H4. Unfortunately 
> > some
> > +         *  versions of libgfapi will return offs < start, so an assert 
> > here
> > +         *  will unneccesarily abort QEMU. */
> > +        return -EIO;
> > +    }
> >  
> >      if (offs > start) {
> >          /*
> > -- 
> > 2.9.3
> > 
> 
> You might want to explain the problem a little different in the commit
> message. It is fine too if you think it would become too detailed, my
> explanation is in the archives now in any case.


Thanks.  When applying it, I'll update the comments to mention glusterfs
server, and add the gluster version information you provided to the commit
message (and also fix the typo Eric pointed out).


> 
> Reviewed-by: Niels de Vos <address@hidden>



reply via email to

[Prev in Thread] Current Thread [Next in Thread]