Re: [Pan-users] Article fetch fallback strategy


From: Duncan
Subject: Re: [Pan-users] Article fetch fallback strategy
Date: Thu, 2 Oct 2014 10:48:25 +0000 (UTC)
User-agent: Pan/0.140 (Chocolate Salty Balls; GIT 81929d0 /m/p/portage/src/egit-src/pan2)

Rhialto posted on Sun, 14 Sep 2014 00:46:21 +0200 as excerpted:

[So this is a couple weeks old, but I had it saved in my unread queue to 
reply to later as the last few weeks have been /way/ too busy...]

> I have the impression that the multi-server fetch strategy of Pan is as
> follows. It keeps headers of its various servers, and remembers which
> server claims to have which articles.
> 
> When a multi-part article is to be saved, it looks at all the parts and
> assigns which server is going to supply which. Then, it tries to
> download them from those pre-determined servers.

Not exactly.  That wouldn't make good use of the available bandwidth when 
one server has much faster connections than another.

I believe it's more like this.  Assuming servers are at the same rank 
(primary or first/second/etc backup), as it gets to a particular article 
in the queue, pan will try to fetch it from whatever server has a free 
connection first, thus trying to keep all the connections active.

When a server isn't very good and is missing a lot of articles, pan will 
naturally get way ahead on that server, since it's skipping so many 
articles that aren't there.  Which is fine, since that means it'll get 
every article it can from that server, leaving the more reliable servers 
to fill in the gaps; pan will be further behind on those since they carry 
more of the content, and it will be picking up from them whatever the 
first server didn't have.

Of course the server with lots of holes will finish first as it's 
skipping ahead because so many are missing, but that means pan will 
naturally get what it can from it, and then idle those connections since 
it doesn't have anything else it can get from that server.
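
For the curious, here's a minimal sketch of that "first free connection 
wins" idea.  To be clear, this is NOT pan's actual code; the types and 
names here are invented purely for illustration:

// Hypothetical sketch of same-rank, first-free-connection scheduling.
// NOT pan's implementation; all names are made up.
#include <set>
#include <string>
#include <vector>

struct Server {
  std::string name;
  int rank;                   // 1 = primary, 2 = first backup, ...
  int free_connections;       // connections currently idle
  std::set<std::string> has;  // message-ids this server claims to carry
};

// Among the lowest-rank servers that claim to have the article and have
// a free connection right now, pick one.  Returns nullptr if the article
// has to keep waiting in the queue for a connection to free up.
Server* pick_server(std::vector<Server>& servers, const std::string& msgid) {
  Server* best = nullptr;
  for (Server& s : servers) {
    if (s.free_connections == 0 || !s.has.count(msgid)) continue;
    if (!best || s.rank < best->rank) best = &s;
  }
  return best;
}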

> However, my news server quite often claims to have some article, but
> then when Pan tries to fetch it, it doesn't have it.

If a server /claims/ to have an article and then doesn't, that would 
indeed throw a monkey wrench in the works, since pan would try to get the 
articles from that server, and would end up waiting for articles it 
claimed to have that would never show up.

However, that would end up stalling those connections waiting for 
nothing, which would /normally/ mean that pan would end up getting pretty 
much everything else from the other, still active, connections.
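
If somebody did want to code an explicit failover for that case, the 
rough shape might be something like this.  Again, a hypothetical sketch 
under my assumptions above, not what task-article.cc actually does:

// Hypothetical failover: when a server 430s ("no such article") or
// stalls on a part it claimed to have, stop believing it and requeue
// the part for whatever servers remain.  Names are invented.
#include <set>
#include <string>

struct Part {
  std::string msgid;
  std::set<std::string> candidates;  // servers still worth trying
};

// Called when a fetch of 'part' from 'server' fails or times out.
// Returns true if the part should be requeued for another server,
// false if it's now unfetchable and should be reported missing.
bool on_fetch_failed(Part& part, const std::string& server) {
  part.candidates.erase(server);  // this server lied or stalled
  return !part.candidates.empty();
}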

> However there is now no failover to another server. In the meantime,
> the article stays Queued at, for instance, 95% and never finishes.

This is probably waiting for the stalled connections.

This actually sounds like something rather different: the infamous TCP 
dropped-packet congestion issue.  TCP has a problem when connections get 
unreliable and regularly drop enough packets.  The problem is that TCP 
interprets those dropped packets as congestion and throttles its speed 
accordingly.  But when too many packets are getting dropped, it cuts back 
on speed, and cuts back and cuts back, until the connection is 
effectively just sitting there doing nothing. =:^(

Unfortunately, there's little to be done at that point except reset the 
connection and start over.  And at some point, TCP will automatically try 
that as well.  But if the reset packets get dropped too... then the 
connection basically gets frozen until either one side or the other times 
it out.
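
A toy model makes the death spiral easy to see.  This is just a back-of-
the-envelope illustration of TCP's additive-increase/multiplicative-
decrease behavior, nothing like real kernel code:

// Toy AIMD model: the window grows by one segment per clean round trip
// and is halved on every loss.  With one loss every few round trips the
// window never climbs far off the floor, so throughput goes to nothing.
#include <cstdio>

int main() {
  double cwnd = 32.0;  // congestion window, in segments
  for (int rtt = 1; rtt <= 20; ++rtt) {
    bool lost = (rtt % 4 == 0);  // stand-in for losing 1 round trip in 4
    if (lost)
      cwnd /= 2;  // multiplicative decrease on loss
    else
      cwnd += 1;  // additive increase on success
    if (cwnd < 1.0) cwnd = 1.0;
    std::printf("rtt %2d  cwnd %5.2f segments%s\n", rtt, cwnd,
                lost ? "  (loss)" : "");
  }
  return 0;
}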

The gotcha for services with a limited number of connections, like many 
ISPs' own news servers back in the day when they actually still had them 
(at least in the US few ISPs include news service any longer; your 
headers suggest .nl, which seems to be more news-friendly and may not 
have dropped them like ISPs in the US did), is this: if their timeouts 
are long enough (like a day), they can still be counting those long-dead 
connections against the connections allowed from your IP or login, and 
refuse to let you connect any more because they're registering you at max 
connections.  If that happens, you may have to try to get a new IP (on 
many DHCP systems, you may be able to get a different IP if you change 
your MAC address, an arguably somewhat technical trick, but possible for 
those who know how), or if that's not possible, whether due to lack of 
know-how or to having a statically assigned IP address, you may have to 
have the NSP "reset" your connection count manually.
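
On Linux with iproute2 and a dhclient-style DHCP client, that MAC trick 
looks roughly like this (interface name and MAC address are placeholders, 
and some NICs and networks won't allow it):

# release the lease, change the MAC, then request a fresh lease
dhclient -r eth0
ip link set dev eth0 down
ip link set dev eth0 address 02:de:ad:be:ef:01
ip link set dev eth0 up
dhclient eth0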

I know that from experience, unfortunately, tho not having an ISP-
supplied news service at all, which is the case these days, is even 
worse. =:^(

I guess that's one reason some of the big NSPs offer 20 or 50 connections 
per paid account these days.  Paid accounts often don't cap per-
connection speed and there's no reason to actually /use/ that many 
connections, but having that many available /will/ mean less trouble if 
connections get stuck "on" for some reason, so it means significantly 
lower support costs and/or less user unhappiness and fewer dropped 
accounts. =:^)

Anyway...

> I found a tedious workaround. What I can do is to edit the news servers'
> priorities. Then I need to remove the articles from the download queue
> and re-save them. Only then do the new server priority settings take
> effect.

Question:  Does quitting and restarting pan without doing the server-
priority switchup thing help?  When pan is shut down, does netstat or 
whatever open connection reporting tool still report open connections to 
that server and/or does pan refuse to die?  If so, a reboot may be 
necessary to get rid of the stalled connections, but if I'm correct, 
getting them out of the picture should allow you to download the articles 
without redoing server priority.
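
Something like the following should show any leftovers (ss is the newer 
replacement for netstat; match on your server's address instead of the 
port if you prefer):

# after quitting pan, look for lingering NNTP connections (119/563)
netstat -tn | grep -E ':(119|563)'
# or, with ss:
ss -tn '( dport = :119 or dport = :563 )'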

Do you have tcptraceroute available as a troubleshooting tool?  It can 
be used to check the route using actual TCP packets of an appropriate 
size (1500 bytes, normally) on the appropriate TCP port, in case the 
results are different for it than they are for normal ICMP or UDP 
traceroute packets.  What about mtr (Matt's traceroute)?  It uses normal 
traceroute packets but does continuous tracing and nice graphing of the 
results, instead of just 1-3 shots per hop.  I'm wondering whether they 
register any packet loss; I'd guess they do when you notice the problem.
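
For example (news.example.com standing in for your server; 119 is plain 
NNTP, 563 if you connect over SSL):

# trace the route using real TCP packets to the NNTP port
tcptraceroute news.example.com 119
# mtr in TCP mode on the same port, continuous per-hop statistics
mtr --tcp --port 119 news.example.com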

> If this happens for a significant number of articles it is a lot of work
> to do, and if you wait for all downloads to cease, maybe not everything
> that is downloaded but not decoded is in the cache anymore.

That is indeed a problem.  You can try increasing your cache size... to 
16 GiB max as discussed on a different thread recently.  The default 
cache size is 10 MiB, which is a bit small if you're having issues, or if 
you prefer to download to cache and then browse when everything's local 
and thus instantly available, as I tend to do.

> I'm looking in the code (starting at task-article.cc which seems related
> to this) to see where this happens and if I can change it, but I'm not
> familiar with the code and it's quite complex. Maybe somebody with more
> experience / knowledge can have a look?

Well, I'm not a coder so am unlikely to be of help there.  I can do 
limited analysis and come up with the occasional patch if things aren't 
too complex, but I guess you're either at that point or beyond, so...

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



