pan-devel

Re: [Pan-devel] Hacking pan for Giganews


From: Duncan
Subject: Re: [Pan-devel] Hacking pan for Giganews
Date: Wed, 10 Aug 2011 06:18:57 +0000 (UTC)
User-agent: Pan/0.135 (Tomorrow I'll Wake Up and Scald Myself with Tea; GIT 9996aa7 branch-master)

Conrad J. Sabatier posted on Tue, 09 Aug 2011 19:20:54 -0500 as excerpted:

> On Tue, 09 Aug 2011 16:40:37 -0700 walt
> <address@hidden> wrote:
> 
>> On 08/09/2011 02:35 PM, Conrad J. Sabatier wrote:
>> > Hi.  I'm in the process of hacking the pan source code to take fuller
>> > advantage of my Giganews Diamond account (50 connections!, whoo
>> > boy!).
>> > 
>> > I've already modified pan/gui/server-ui.cc to allow me to set the
>> > server's config to 50 connections (that part was easy), but I'm
>> > looking now for how to hack pan's task management to do more
>> > simultaneous header downloads, binary file saves, etc.
>> > 
>> > Can anyone point me in the right direction?  I'll send back to the
>> > list anything I come up with, if anyone's interested.
>> 
>> Hi Conrad, and welcome to the pan mailing list.  (Forgive my senility
>> if you've posted here before :)
> 
> Thank you, and no, I'm new to this list.  :-)

I haven't seen that name in some time!  Welcome! =:^)

First things first.  You don't need to modify the sources to allow more 
than the standard four connections.

Rather, pan is GNKSA (Good Netkeeping Seal of Approval) certified, 100%,
and GNKSA has as a MUST that a compliant news client allow the user to
set no more than the old standard four connections per server, which was
a common netiquette limit back in the day regardless of protocol (FWIW,
I've seen that four-connections-per-server netiquette limit in
discussions on at least HTTP, NNTP and FTP).

*HOWEVER*, with the pan rewrite into C++ starting with 0.90, Charles 
Kerr, the longtime pan primary developer (and a big reason pan is GNKSA 
compliant, he was QUITE strict on that), deliberately set that as the max 
that the *GUI* spinner would allow (per server), but ALSO quite 
deliberately, did *NOT* set a check for the number when read-in from the 
config file (servers.xml, since it's one of the per-server variables), 
which pan uses as-is.  So *PAN* doesn't let you set more, thus complying 
with GNKSA, but if you happen to hand-edit the config file itself, pan 
doesn't check that value against the GNKSA limit when it reads it in, and 
you can set what you want and pan will honor it, thus nicely bypassing 
the GNKSA limit for those who have sufficient motivation to do so.

So, set however many connections you wish in servers.xml itself (in 
~/.pan2 by default, but see the next point), and pan will try its best to 
honor that setting -- you just have to edit the relevant config file 
manually to do it.
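
For anyone who'd rather script the hand-edit than open the file in an editor, here's a minimal sketch of the idea.  Note the assumptions: the element names ("server", "host", "connection-limit") are what I'd expect to find in servers.xml, not something confirmed above, so check them against your own file first, and the host name in the usage line is only a placeholder.  I'd also run it with pan closed, on the assumption that pan may rewrite the file when it exits.

```python
import shutil
import xml.etree.ElementTree as ET


def set_connection_limit(servers_xml, host, limit,
                         host_tag="host", limit_tag="connection-limit"):
    """Set the connection limit for the server entry whose host matches.

    host_tag, limit_tag and the "server" element name are ASSUMED;
    verify them against your own servers.xml before relying on this.
    Returns True if a matching server entry was updated.
    """
    servers_xml = str(servers_xml)
    tree = ET.parse(servers_xml)
    updated = False
    for server in tree.getroot().iter("server"):
        if (server.findtext(host_tag) or "").strip() != host:
            continue
        node = server.find(limit_tag)
        if node is None:
            # No limit stored yet for this server: add the element.
            node = ET.SubElement(server, limit_tag)
        node.text = str(limit)
        updated = True
    if updated:
        # Keep a copy of the untouched file next to the original.
        shutil.copy2(servers_xml, servers_xml + ".bak")
        tree.write(servers_xml, encoding="utf-8", xml_declaration=True)
    return updated
```

Usage would be something like set_connection_limit("/path/to/servers.xml", "news.example.com", 50), with the path pointing into your pan data dir.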

But there's a number of related points to make about this as well.

0) (Really, I thought this was on the user list.  I now see it's on devel.  
But since I have the post already pretty much done...  But cross-posted, 
and Fup2 set to the user list... aka group, on gmane, see point 5...)

1) Server connections isn't the only "undocumented no-GUI" config option 
available.  There's a number of others, some of which can be /quite/ 
useful.

1a) While ~/.pan2 is pan's default data dir, pan checks to see if the 
PAN_HOME environmental variable is set when it starts and uses the 
directory path found there instead, if it is.

In addition to the obvious ability to set some other dir if you don't 
like the ~/.pan2 location, this allows a very useful trick, which I've 
been using here for years: set up wrapper scripts to set this to different 
dirs before starting pan, and you now have the ability to set up multiple 
independent pan profiles. =:^)  Here, I use text, bin and test (the 
latter since pan remembers data like read message sequence numbers from 
every group you visit, whether you're subscribed or not; I like to be 
able to check out a group and blow away the config without it screwing up 
my permanent subscribed groups profiles), but you're perfectly free to 
set up mp3 groups in one, tv program groups in another, pr0n in a third, 
and the disney/nature/whatever groups for the kids in a fourth (obviously 
this one would be the default, while you'd have to run your porn instance 
from the command line... if the kids weren't on an entirely separate user 
account... =:^), if it fits your needs better.
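
A wrapper along those lines can be quite short.  The sketch below is only one way to do it: the ~/pan-profiles base directory and the function names are made up for illustration, and it assumes the binary is on your PATH as "pan".  The only thing taken from pan itself is the PAN_HOME variable.

```python
import os
import subprocess
from pathlib import Path


def profile_env(profile, base_dir, environ=None):
    """Return a copy of the environment with PAN_HOME set for one profile."""
    env = dict(os.environ if environ is None else environ)
    env["PAN_HOME"] = str(Path(base_dir).expanduser() / profile)
    return env


def launch_pan(profile, base_dir="~/pan-profiles", pan_binary="pan"):
    """Create the profile directory if needed and start pan using it.

    base_dir and pan_binary are hypothetical defaults; adjust to taste.
    """
    env = profile_env(profile, base_dir)
    Path(env["PAN_HOME"]).mkdir(parents=True, exist_ok=True)
    return subprocess.Popen([pan_binary], env=env)
```

Calling launch_pan("text") and launch_pan("bin") would then give two independent instances, each with its own data dir.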

I then have another dir called globals that I keep files like the 
scorefile and accels.txt in, since I want the same setup for those 
features across all pan instances, with symlinks from the individual 
profiles to the files in the globals dir, where appropriate.
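
The symlinking is easy enough to do by hand, but if you have several profiles a small helper saves some typing.  This is a sketch under my own naming; the function and the directory layout are illustrative, and which files you share (accels.txt, your scorefile, whatever) is up to you.

```python
from pathlib import Path


def link_shared_files(profile_dir, globals_dir, names):
    """Symlink each named file from globals_dir into profile_dir.

    Existing regular files in the profile are left alone; existing
    symlinks are replaced.  Returns the list of names that were linked.
    """
    profile = Path(profile_dir)
    shared = Path(globals_dir)
    linked = []
    for name in names:
        target = shared / name
        link = profile / name
        if not target.exists():
            continue                  # nothing to share yet
        if link.is_symlink():
            link.unlink()             # refresh an existing link
        elif link.exists():
            continue                  # don't clobber a real file
        link.symlink_to(target.resolve())
        linked.append(name)
    return linked
```

For example, link_shared_files("/path/to/text", "/path/to/globals", ["accels.txt"]) would point the text profile at the shared accelerator file.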

1b) Also in servers.xml along with the connections settings, the 
expiration time is actually set in days, and you can set it to whatever 
you want; you're not bound by the arbitrary choices in the GUI.

1c) Still in servers.xml, you can have three or more tiers of server 
rank, if desired, you're not limited to the GUI choices of primary (1) 
and backup (2, IIRC, or is it 0 and 1... I'm too lazy to go look ATM).
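
Since I can't recall the exact rank values offhand, the safest first step before changing expiry or rank is to look at what's already stored.  Here's a read-only sketch that lists the per-server values.  The element names in the default field list are my guesses at the file layout, not confirmed anywhere above; anything it can't find comes back as None, which tells you the name is wrong for your file.

```python
import xml.etree.ElementTree as ET


def server_settings(servers_xml,
                    fields=("host", "connection-limit",
                            "expire-articles-n-days", "rank")):
    """Return one dict per <server> entry with the requested child values.

    All element names here are ASSUMED; missing ones come back as None.
    Nothing is written, so this is safe to run against a live config.
    """
    rows = []
    for server in ET.parse(str(servers_xml)).getroot().iter("server"):
        rows.append({name: server.findtext(name) for name in fields})
    return rows
```

Printing each row from server_settings("/path/to/servers.xml") shows the expiry in days and the rank number actually in use, which you can then edit by hand.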

1d) In preferences.xml, it's possible to change the default cache size.  
Charles was obviously infected with the GNOME mindset of "users are 
dummies and can be easily confused if the config gets too complex", and 
he deliberately omitted this option from the GUI for that reason.  While 
the default cache settings are a measly 10 MB, for different reasons, I 
have both my binary and text profiles set to several gigs.
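
If you want to script that change, something like the following would do it.  Big caveat: I'm assuming preferences.xml stores integers as <int name="..." value="..."/> elements and that the cache setting is called "cache-size-megs"; both are from memory rather than from anything stated here, so look at your own file before trusting either.

```python
import xml.etree.ElementTree as ET


def set_int_pref(preferences_xml, name, value):
    """Set <int name=NAME value=VALUE/> in a preferences file.

    The <int name= value=> layout is an ASSUMPTION about the file format.
    The element is added under the root if no entry with that name exists.
    """
    preferences_xml = str(preferences_xml)
    tree = ET.parse(preferences_xml)
    root = tree.getroot()
    for node in root.iter("int"):
        if node.get("name") == name:
            node.set("value", str(value))
            break
    else:
        # Loop finished without finding the entry: create it.
        ET.SubElement(root, "int", {"name": name, "value": str(value)})
    tree.write(preferences_xml, encoding="utf-8", xml_declaration=True)
```

A 5 GB cache would then be set_int_pref("/path/to/preferences.xml", "cache-size-megs", 5 * 1024), assuming the value really is in megabytes as the name suggests.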

For the text groups, I have never-expire set on the servers and keep the 
messages around more or less permanently.  (I still have the cox.* 
messages from the time I switched to new-pan, until the final highwinds-
media cox user access shutdown at the end of the year after cox 
terminated the service.)  For the groups (lists, including this one, as 
newsgroups) I follow on gmane.org, since it archives too, I have at least 
some of the lists/groups (including this one) as complete as the gmane 
archive for them... back to 2002, which might be about the time gmane 
started, since several unrelated lists go back to about the same date in 
2002 and no further.  As such, I believe I set something like a 5 GB 
cache size for the text groups, altho current size was still well under a 
gig (700 or so, IIRC, but I downloaded the back-archive of a couple gmane 
groups since then), last I checked.

For binary groups, instead of saving files directly as they're 
downloaded, I much prefer to download to cache (well, I'll often download 
a sample or two first, if it's something that can be reasonably sampled, 
to see whether it's something I want to bother downloading at all, 
deleting it straight-out if not, but other than that...), go and do 
something else (let it run while I sleep, or work, or whatever), and come 
back when they're done downloading and I can sort thru and save them off 
to the file locations, or delete them if I decide I don't want them (and 
delete the cached articles as I save them off, otherwise), without 
waiting for the downloads since they're locally cached.

I still remember in 2003 or so when I first started using (then /very/ 
old, 0.11, gnome-1 version) pan, and got VERY frustrated when I tried 
that, only to find out pan had deleted most of them pretty much as fast 
as it downloaded them, because of the stupidly tiny (for my usage) 10 MB 
default cache size.

So I have a dedicated partition, 12 gig (and I've not done binaries in 
ages, I might well make it 100 gig or so if I decided to get active again 
and re-did my partitions for it), just for my binary pan instance 
cache.

There may be more such config-file-only settings, but unless I've 
forgotten one, those are the ones I know about.  So, on to the next 
point...

2) Pan isn't actually as efficient as it might be in managing 
connections.  Apparently, when decoding and saving files off directly, it 
does that in the same thread as the download, so that thread actually 
stops downloading temporarily, while the decode-and-save takes place.

As a result of this, if you use direct-save, you probably want more 
connections than you can actually use to full capacity, since at any 
given time, a couple of them are likely to be doing decode-and-save.
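
As a back-of-the-envelope version of that: if each connection spends some fraction f of its time in decode-and-save, then on average only (1 - f) of the open connections are downloading, so to keep about N busy you'd open roughly N / (1 - f).  The fraction is something you'd have to estimate for your own machine; this is only a sketch of the arithmetic under that simple model, not anything pan itself computes.

```python
import math


def connections_needed(busy_target, decode_fraction):
    """Connections to open so that about busy_target are downloading at once.

    decode_fraction is the share of its time (0 <= f < 1) that a connection
    spends blocked in decode-and-save instead of downloading.  This is a
    simple averaging model, an assumption rather than a measurement.
    """
    if not 0 <= decode_fraction < 1:
        raise ValueError("decode_fraction must be in [0, 1)")
    return math.ceil(busy_target / (1 - decode_fraction))
```

So if your line saturates at about 8 active downloads and you guess a quarter of each connection's time goes to decoding, connections_needed(8, 0.25) suggests opening 11.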

Of course, if you download to cache first, it not only eliminates this 
problem, but ALSO reduces disk I/O contention and head-seeking as the 
load is all write-load while actually downloading, instead of write to 
cache, then read all the parts in from cache for decode, do the decode 
(CPU intensive), then write the decoded binary out again... on top of 
whatever else you're doing at the same time, of course.

So if you set a big enough cache, then have pan download while you're 
sleeping or at work or whatever, then come back and work with what's 
downloaded, the download itself should perform rather better, utilizing 
connection and available bandwidth rather more efficiently -- and that 
has certainly been my experience, as well as that of a couple others that 
have tested it.

Of course, if you're all SSD, you may instead wish to reverse that, 
keeping the default 10 MB cache size and direct-downloading, but putting 
the (small) cache on tmpfs or some such, so it never hits disk at all, 
thus avoiding write-cycles on your SSD by only writing the permanent data.

FWIW, the differing cache sizes (and locations, keeping the temp binaries 
cache separate from the generally permanent text cache) is one of the big 
reasons I use separate binary and text profiles... beyond simply not 
wanting the two subscribed-groups lists mixed with each other, anyway.  

That also makes it easier to catch up on the trivial-bandwidth text 
groups while I'm waiting for the binary download-to-cache to finish, if I 
want... without getting distracted when I'm trying to write a nice tech 
post by all of those binaries, since they're two entirely different pan 
instances and main-windows.

3) Charles Kerr retired from pan some years ago.  He says he no longer 
does news, at least not regularly, but last I knew was still active in 
the gtk/gnome community, working on transmission (the gtk-based bittorrent 
client), among other things.  That's why (official, at least) pan 
development was dormant for several years.  

However, we've made it thru that rough spot, when I'm sure I wasn't the 
only one debating shutting out the lights on this list and going home, as 
it seemed pan was destined to only get more hoary and stale with age... 
until the distros couldn't maintain it against newer versions of gcc and 
its dependencies any more...

Actually, now, pan seems to have a more active developer /community/ than 
I believe it /ever/ had with Charles at the helm, at least since I came 
on board to try to help with the lists, etc, in 2003 or so.

It worked out like this...  

KHaley (lostcoder on github) is the new primary community developer, but 
for whatever personal reasons, doesn't appear to be interested in the 
gnome side of things *AT* *ALL*, so for some time, while he was the first 
developer to start taking an interest in pan again, cloning the gnome repo 
to his github account and at first, simply assembling most of the various 
distro patches floating around, then gradually beginning to add his own 
stuff, he has never been interested AT ALL in working (himself) with the 
official gnome repo, so mostly, it was only a core group of regulars here 
that could clone his repo and build it locally, that got the benefits.

Then PKovar (Petr, but I don't know that KHaley has ever revealed what 
the K is for, so it's PKovar using a similar name) came along.  He's 
almost the exact reverse of KHaley, as AFAIK he doesn't really claim to 
be a (code) developer at all.  But he was an existing Gnome translator 
and thus had Gnome repo access rights, and was interested in pan, too.

So together, KHaley who codes and codes pan, but isn't interested in the 
gnome side of things, and PKovar, who has gnome access and is interested 
in pan but doesn't code, make an interesting team! =:^)  That's how pan 
came to have the two official releases already this year, 0.134 being 
mostly a wrapup of existing distro patches, etc, and 0.135, beginning to 
cautiously introduce new code, mostly tweaking around the edges, the line-
wrapping code (which Charles struggled and struggled with, but never 
/did/ get quite right), etc, nothing major, yet.  (FWIW, PKovar contacted 
Charles too, and Charles was quite happy to give the new team his 
blessing and get that one more bit of unfinished business off his plate.  
Charles and PKovar and the rebelbase.com folks apparently arranged for 
PKovar to take over pan.rebelbase.com, as well, so PKovar is the official 
maintainer of the entire project now, just that the code he commits to 
the public gnome repo is pretty much straight from KHaley's stable 
branch.  Like I said, they seem to make a great team! =:^)

But KHaley is apparently rather time constrained and even if he weren't, 
he seems to naturally be quite conservative in his changes, so were it 
just those two (and me the dominant list regular), pan would continue to 
move rather slowly, mostly in maintenance mode, with a few tweaks here 
and there but possibly never again a major new feature.  That's *WAY* 
better than slowly sinking into disrepair and irrelevance, as it was, and 
really, that's all pan needs now, and pretty much all we expected.

But then HMueller (Heinrich) burst on the scene, only this year, just a 
few months ago, and what an entrance he made! =:^)  While KHaley seems to 
be a near ideal "conservative policy release manager", don't break what 
people are already depending on, HMueller's the "Damn the torpedoes!  
Full speed ahead!" type. =:^)

His big entrance was an announcement that he was working on full-feature 
binary posting, for pan!!  This wasn't some half-ass single-part binary 
feature, either, but power-post material!  You know how many *YEARS* 
Charles had been talking about that?  Certainly since at least the 
gnome-1-based 0.11 series, in 2003!  And pan's name was originally an 
initialism for Pimp-Ass Newsreader (with the -Assed bit toned down over 
time, with PAN replacing Pimp-Assed Newsreader and gradually, pan 
replacing PAN, until few here now probably knew it at all, until I 
brought it up again recently, in exactly this binary-posting context, 
matter of fact), so it obviously had big ambitions well before I was 
around, too, possibly tracing to before Charles took over (I don't know 
if the name was his or inherited from the original dev(s), still in the 
credits but that's about all I know of them as it was before my time and 
I've /no/ idea when Charles took over or under what circumstances).

The binary-posting feature is still far from stable, AFAIK and in fact, 
remains quite experimental, with quite heavy changes still going on.  As 
such, I don't believe it's likely to make it into an official release in 
the /immediate/ future (as I'd say is appropriate, KHaley's steady hand 
is appreciated here), but I expect that it might make a release either 
late this year (Christmas present! =:^) or by spring, next.

He's also working on another long discussed feature.  If you were an old-
pan user, you may remember the "rules" it had, allowing automated mark-
read, delete or download, among other things.  Charles always thought it 
was too complicated, and I agree, to the extent that back then, how to 
set up pan for auto-mark-read and/or auto-downloading, was a very heavily 
asked FAQ, despite the feature being right there, if a bit complex.  For 
that reason, he never implemented them in new-pan.  There has often been 
discussion of a much simpler and quite reasonable alternative (score-
based, linked to the same scoring categories used in the view menu and 
for color-coding, see the archives for the details, or...), and Charles 
even commented that he agreed with the idea at one point, but by then, he 
was slowing down on pan coding, and that never got implemented either.

Which has been quite irritating for those of us that keep certain 
messages marked unread to come back to later, so we can't use the auto-
mark-read at group exit and/or at pan close options, because currently, 
scoring-ignored remains marked unread, and if they're not set to be 
viewed either, there's no way to MARK them read without marking the ones 
read that one wants to keep marked unread to come back to later!

And of course, being able to auto-download either all watched, or all 
scored at whatever level, would be a huge improvement as well!  The 
ability to auto-* based on rules is the one big feature that old-pan had 
that new-pan never had... until HMueller came along, anyway!

If pan gets those features, it'll be very very close to feature-complete, 
at least in my book.  About the only other big feature I know of that it 
could use, would be some decent documentation, but eh... this group/list 
needs SOME reason to exist. =:^)

So there's some VERY BIG changes coming down the pike, but right now, in 
order to taste 'em, you gotta pull from git and build your own!

Meanwhile, point...

4) With Charles now out of the picture, I knew it'd happen eventually.  
Someone would come along with a suggestion that if implemented, would 
kill pan's GNKSA certification.  Connections was of course the most 
likely trigger.

And so it happened.  HMueller, having no inkling about GNKSA or what it 
meant to Charles, pan's primary dev for so many years, and consequently, 
to at least some of the people who had continued to use pan all these 
years, AND, as you, not realizing that the 4-connections-limit was only 
enforced in the GUI server-settings and that it was DELIBERATELY possible 
to edit servers.xml directly and set whatever number of connections one 
liked and have pan honor it, WITHOUT forcing pan to take down that GNKSA 
seal that many of us respect, made what he thought was the logical change.

As I had been wondering when it might happen, and debating whether or how 
I should bring up the discussion myself, I of course immediately jumped 
on the opportunity poor innocent HMueller afforded me! =:^)

So, the regulars here had a little debate on the topic.  What, /indeed/ 
about GNKSA?

Once he realized the significance, HMueller did the right thing (at least 
for the time being) and reverted that bit of that commit.  So AFAIK, the 
GUI is back the way it was, and you can still set as many connections as 
you like in servers.xml, directly.

IMO, part of the problem with that particular bit of the GNKSA is that in 
reality, it was anachronistic from the very time it was introduced, with 
GNKSA 2.0 in (IIRC) the late '90s.

I'm almost positive that point, which stresses both opening no more than 
four connections *AND* making efficient use of the connections you *DO* 
open, was added to cover abuses by the likes of MS Outlook Express, which 
would open up multiple connections, use them for one thing, and then just 
let them sit there.  It would use one connection for its batch downloads, 
and if one were clever, one could force it to use a second by queuing up 
messages to read interactively, but that's all the connections you could 
get it to use efficiently, despite the fact that it opened another, 
dedicated ONLY to periodically checking for and updating the list of 
unread messages in all subscribed groups, and I /think/ another for 
something similar (maybe only for the initial automatic check to see if 
the newsgroup list had changed, tho I never confirmed that, but I *DO* 
know that various ISP support folks have told me that if they set less 
than four connections per user, support calls from OE users went up 
*DRAMATICALLY*).

But OE was already breaking all sorts of other netiquette rules and simply 
declaring another in GNKSA wasn't going to make its MS devs or its users 
care any more than (in general) they already didn't.

Further, by that time most if not all commercial and ISP news services 
were server-side tracking and enforcing connection limits already, 
because they HAD 
to, and again, whatever violators there were that forced this, simply 
weren't going to care what GNKSA had to say about it.  So by the time it 
was even introduced, the rule, applying as GNKSA does to the news CLIENT, 
was anachronistic, since everywhere it mattered, server-side connection 
tracking and limitation was already enforced.

Plus, by the simple mechanism of classic capitalistic competition, as 
soon as the number of connections was server-side enforced, it became a 
bullet point in the feature competition between news providers (and to a 
lesser extent, for a time, between ISPs).  So allowing (comparatively) 
huge numbers of connections very quickly became a distinguishing mark of 
the paid news providers, something users were paying good money for, yet 
GNKSA disallowed compliant clients letting their users take full 
advantage of what they were paying for!

But, here's the problem, despite that one IMO very bad choice, making it 
a MUST, no less, not just a SHOULD, MOST if not all of the rest of the 
GNKSA provokes FAR less controversy, and while /some/ users might quibble 
at the likes of top-posting restrictions, line-length restrictions, etc, 
I and from our little discussion here on the list, most other pan 
regulars (at least to the extent that they ARE on this list), believe 
having a sort of checklist to make sure nobody "accidentally" steps over 
the line, is a rather GOOD thing.

Meanwhile, the GNKSA folks themselves seem to believe its time has 
passed.  There was some discussion here on whether their reply indicated 
that they'd be willing to update it, if an appropriate suggestion were to 
come, or whether they considered it a closed chapter of their lives, save 
for perhaps keeping the now vanity domain name, perhaps for email they've 
long had, etc.

FWIW, the way it was left was that someone from here volunteered to be 
the liaison with the GNKSA folks so the whole list didn't bother them at 
once, and they figured they'd already bothered them enough for the time 
being, so were going to wait a month or so before getting clarification 
on further GNKSA updates, etc.

I think it has been about that long; perhaps we need to bump our liaison 
and see what he reports. =:^)

That was only a couple months ago.  If you're at all interested, please 
do go back and read the archive, and add your own thoughts if desired.

The alternatives include updating GNKSA if we're interested (presumably 
with either the GNKSA folks' cooperation, perhaps even giving us partial 
control of that bit of the web site, tho it's early to presume, or at 
least not vetoing our action, if they'd rather just leave it behind 
them), simply dropping all mention of it from the pan site and forgetting 
about GNKSA entirely, including the rest of it, or taking the rest of it 
and changing it into a public pledge, "pan is built with these goals in 
mind", with the same list minus the connections point and not calling it 
GNKSA (but presumably linking it in the acknowledgments).

Finally...

5) I've mentioned it in passing several times already, but this mailing 
list (and many others) is (are) carried by gmane.org in both newsgroup 
form and web forum form.  You can find the list archives there, as well 
as, I /think/, on the gnome site.  If you're already using pan for text 
groups, you may find that server a useful addition, for whatever lists 
you follow.

You can read the lists as newsgroups by pointing pan at news.gmane.org .  
Before replying to anything using gmane news, however, you should 
probably read up on how it all works, the way the confirmation mail 
works, etc, at http://gmane.org.

(The rest snipped.  The post is long enough, and you can reformulate and 
repost any questions you may still have after digesting the above. =:^)

(Again, "this list" thruout the above refers to the user list, for the 
most part.  The dev list doesn't get much traffic.  Please followup on 
user.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



