Re: [Gluster-devel] Re; Load balancing ...
From: Gareth Bult
Subject: Re: [Gluster-devel] Re; Load balancing ...
Date: Mon, 28 Apr 2008 10:37:41 +0100 (BST)
Hi,
>Gordon is right here. Self-healing on the fly is very much dependent on
>lookup(). So it is inevitable to do lookup() on all the subvols. Also
>we use the results of the lookup() call for subsequent operations on that
>file/directory. But it is not a bad idea to compromise consistency for
>speed (with the read-subvolume option), as some users might prefer that. We
>can provide this as an option and let admins handle the
>inconsistencies that would arise from this compromise. We shall keep
>this in the TODO list.
Sounds good to me ... :)
If it were a 10% speed difference, I wouldn't even mention it.
But when it's potentially 30x, it's a serious issue.
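For anyone wanting to experiment once this lands, I'd expect it to end up as
something like the following in the client volfile (a sketch only .. the
option name and the subvolume names here are my guesses, not a committed
interface):
    volume afr0
      type cluster/afr
      subvolumes remote1 remote2
      # guesswork: serve all reads from remote1 (e.g. the local or
      # fastest brick) instead of consulting both subvolumes
      option read-subvolume remote1
    end-volume
Admins who care more about speed than strict consistency could then just
point reads at the nearest brick.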
Regards,
Gareth.
On Sat, Apr 26, 2008 at 3:51 AM, Gareth Bult <address@hidden> wrote:
> >You're expecting a bit much here - for any shared/clustered FS. DRBD
> >might come close if your extents are big enough, but that's a whole
> >different ball game...
>
> I was quoting a real-world / live-data scenario; DRBD handles it just fine,
> but it is a different mechanism to gluster.
>
>
> >Sounds like a reasonably sane solution to me.
>
> It is. It also makes Gluster useless in this scenario.
>
>
> >Why would the cluster effectively be down? Other nodes would still be
> >able to serve that file.
>
> Nope, it won't replicate the file while another node has it locked .. which
> means you effectively need to close all files in order to kick off the
> replication process, and the OPEN call will not complete until the file has
> replicated .. so effectively (a) you need to restart all your processes to
> make them close and re-open their files (or HUP them .. or whatever), and
> then (b) those processes will all freeze until the files they are trying to
> open have replicated.
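>
> (The usual trick to kick self-heal off without restarting anything is to
> force a read of the first byte of every file .. assuming the volume is
> mounted at /mnt/glusterfs, something like:
>
>     find /mnt/glusterfs -type f -exec head -c1 {} \; > /dev/null
>
> .. but each open in that sweep still blocks until its file has re-synced,
> so it doesn't help with files your processes already hold open.)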
>
>
> >Or are you talking about the client-side AFR?
>
> Mmm, it's been a while; I'm not entirely sure whether I tested the issue on
> the client side, the server side, or both.
> Are you telling me that server-side will work quite happily and it's only
> client-side that has all these issues?
>
>
> >I have to say, a one-client/multiple-servers scenario sounds odd.
> >If you don't care about downtime (you have just one client node so that's
> >the only conclusion that can be reached), then what's the problem with a
> >bit more downtime?
>
> My live scenario was 4 (2x2) AFR servers with ~ 12 clients.
>
> Obviously this setup is no longer available to me as it proved to be useless
> in practice.
>
> I'm currently revisiting Gluster with another "new" requirement (as per my
> last email) .. currently I'm testing a 2 x server + 1 x client setup with
> regard to load balancing and use over a slow line. Obviously (!) both
> servers can also act as clients, so I guess to be pedantic you'd call it 2
> servers + 3 clients. My point was I have 1 machine with no server.
>
>
> Gareth.
>
> --
> Managing Director, Encryptec Limited
> Tel: 0845 5082719, Mob: 0785 3305393
> Email: address@hidden
> Statements made are at all times subject to Encryptec's Terms and Conditions
> of Business, which are available upon request.
>
>
> ----- Original Message -----
> From: "Gordan Bobic" <address@hidden>
> To: address@hidden
>
> Sent: Friday, April 25, 2008 9:40:00 PM GMT +00:00 GMT Britain, Ireland,
> Portugal
> Subject: Re: [Gluster-devel] Re; Load balancing ...
>
> Gareth Bult wrote:
>
>
> >> If you have two nodes and the 20 GB file
> >> only got written to node A while node B was down and
> >> node B comes up the whole 20 GB is resynced to node B;
> >> is that more network usage than if the 20 GB file were
> >> written immediately to node A & node B?
> >
> > Ah. Let's say you have both nodes running with a 20GB file synced.
> > Then you have to restart glusterfs on one of the nodes.
> > While it's down, let's say the other node appends 1 byte to the file.
> > When it comes back up and looks at the file, the other node will see it's
> > out of date and re-copy the entire 20GB.
>
> You're expecting a bit much here - for any shared/clustered FS. DRBD
> might come close if your extents are big enough, but that's a whole
> different ball game...
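>
> (For what it's worth, AFR makes that decision from whole-file version
> counters kept in extended attributes on the backend export; it has no idea
> which blocks changed, only that the file did. In the 1.3 series these are
> the trusted.afr.* attributes, if I remember right, and you can inspect them
> on a server's export directory with something like:
>
>     getfattr -d -m trusted -e hex /export/data/disk.img
>
> .. the path is illustrative. A version mismatch means a full re-copy.)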
>
> >> Perhaps the issue is really that the cost comes at an
> >> unexpected time, on node startup instead of when the
> >> file was originally written? Would a startup
> >> throttling mechanism help here on resyncs?
> >
> > Yes, unfortunately you can't open a file while it's syncing .. so when you
> > reboot your gluster server, downtime is the length of time it takes to
> > restart glusterfs (or the machine, either way ..) PLUS the amount of time
> > it takes to re-copy every file that was written to while one node was down ...
>
> Sounds like a reasonably sane solution to me.
>
> > Take a Xen server, for example, serving disk images off a gluster partition.
> > 10 images at 10GB each gives you a 100GB copy to do.
>
> If they are static images, why would they have changed? What you are
> describing would really be much better accomplished with SAN+GFS, or with
> Coda, which is specifically designed to handle disconnected operation at
> the expense of other things.
>
> > Wait, it gets better .. it will only re-sync the file on opening, so you
> > actually have to close all the files, then try to re-open them, then wait
> > while it re-syncs the data (during this time your cluster is effectively
> > down), then the file open completes and you are back up again.
>
> Why would the cluster effectively be down? Other nodes would still be
> able to serve that file. Or are you talking about the client-side AFR?
> I have to say, a one-client/multiple-servers scenario sounds odd. If you
> don't care about downtime (you have just one client node so that's the
> only conclusion that can be reached), then what's the problem with a bit
> more downtime?
>
> > Yet there is a claim in the FAQ that there is no single point of failure
> > .. and yet, to upgrade gluster, for example, you effectively need to shut
> > down the entire cluster in order to get all files to re-sync ...
>
> Wire protocol incompatibilities are, indeed, unfortunate. But on one hand
> you speak of manual failover and SPOF clients, and on the other you speak
> of unwanted downtime. If this bothers you, have enough nodes that you
> could shut down half (leaving half running), upgrade the downed ones,
> bring them up and migrate the IPs (heartbeat, RHCS, etc.) to the upgraded
> ones, and then upgrade the remaining nodes. The downtime should be seconds
> at most.
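>
> Roughly, on each server in the pair (a sketch; the heartbeat helper paths
> and the glusterfsd init script name will vary by distribution):
>
>     /usr/lib/heartbeat/hb_standby      # hand the floating IP to the peer
>     /etc/init.d/glusterfsd stop
>     # ... upgrade the glusterfs packages here ...
>     /etc/init.d/glusterfsd start
>     /usr/lib/heartbeat/hb_takeover     # reclaim the IP once healthy
>
> Clients following the floating IP should see a pause of seconds at most.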
>
> > Effectively, storing anything like a large file on AFR is pretty unworkable
> > and makes split-brain issues pale into insignificance ... or at least that's
> > my experience of trying to use it...
>
> I can't help but think that you're trying to use the wrong tool for the
> job here. A SAN/GFS solution sounds like it would fit your use case better.
>
>
>
> Gordan
>
>
> _______________________________________________
> Gluster-devel mailing list
> address@hidden
> http://lists.nongnu.org/mailman/listinfo/gluster-devel