Re: [Pan-devel] Hacking pan for Giganews
From: Duncan
Subject: Re: [Pan-devel] Hacking pan for Giganews
Date: Wed, 10 Aug 2011 06:18:57 +0000 (UTC)
User-agent: Pan/0.135 (Tomorrow I'll Wake Up and Scald Myself with Tea; GIT 9996aa7 branch-master)
Conrad J. Sabatier posted on Tue, 09 Aug 2011 19:20:54 -0500 as excerpted:
> On Tue, 09 Aug 2011 16:40:37 -0700 walt
> <address@hidden> wrote:
>
>> On 08/09/2011 02:35 PM, Conrad J. Sabatier wrote:
>> > Hi. I'm in the process of hacking the pan source code to take fuller
>> > advantage of my Giganews Diamond account (50 connections!, whoo
>> > boy!).
>> >
>> > I've already modified pan/gui/server-ui.cc to allow me to set the
>> > server's config to 50 connections (that part was easy), but I'm
>> > looking now for how to hack pan's task management to do more
>> > simultaneous header downloads, binary file saves, etc.
>> >
>> > Can anyone point me in the right direction? I'll send back to the
>> > list anything I come up with, if anyone's interested.
>>
>> Hi Conrad, and welcome to the pan mailing list. (Forgive my senility
>> if you've posted here before :)
>
> Thank you, and no, I'm new to this list. :-)
I haven't seen that name in some time! Welcome! =:^)
First things first. You don't need to modify the sources to allow more
than the standard four connections.
Rather, pan is 100% GNKSA (Good Netkeeping Seal of Approval) certified,
and GNKSA has as a MUST that a compliant news client let the user set no
more than the old standard four connections per server. That was a
common netiquette limit back in the day regardless of protocol (FWIW,
I've seen that four-connections-per-server netiquette limit in
discussions on at least HTTP, NNTP and FTP).
*HOWEVER*, with the pan rewrite into C++ starting with 0.90, Charles
Kerr, the longtime pan primary developer (and a big reason pan is GNKSA
compliant, he was QUITE strict on that), deliberately set that as the max
that the *GUI* spinner would allow (per server), but ALSO quite
deliberately, did *NOT* set a check for the number when read-in from the
config file (servers.xml, since it's one of the per-server variables),
which pan uses as-is. So *PAN* doesn't let you set more, thus complying
with GNKSA, but if you happen to hand-edit the config file itself, pan
doesn't check that value against the GNKSA limit when it reads it in, and
you can set what you want and pan will honor it, thus nicely bypassing
the GNKSA limit for those who have sufficient motivation to do so.
So, set however many connections you wish in servers.xml itself (in
~/.pan2 by default, but see the next point), and pan will try its best to
honor that setting -- you just have to edit the relevant config file
manually to do it.
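FWIW, the hand-edit can be scripted if you'd rather not open the file in
an editor. What follows is only a minimal Python sketch. It assumes the
per-server limit is stored in an element named connection-limit; that tag
name is my assumption, so look at your own servers.xml and adjust it if
yours differs. It's probably safest to run it while pan is closed, and it
keeps a .bak copy either way.

```python
import shutil
import xml.etree.ElementTree as ET

# Assumption: the per-server limit lives in an element with this tag.
# Check your own servers.xml and change it if the name differs.
LIMIT_TAG = "connection-limit"


def set_limit(root, limit, tag=LIMIT_TAG):
    """Set the text of every <tag> element under root; return how many changed."""
    changed = 0
    for elem in root.iter(tag):
        elem.text = str(limit)
        changed += 1
    return changed


def update_file(path, limit, tag=LIMIT_TAG):
    """Back up the file at path to path + '.bak', then rewrite it in place."""
    shutil.copy2(path, path + ".bak")
    tree = ET.parse(path)
    if set_limit(tree.getroot(), limit, tag) == 0:
        raise ValueError("no <%s> element found; wrong tag name?" % tag)
    tree.write(path, encoding="utf-8", xml_declaration=True)
```

Something like update_file(os.path.expanduser("~/.pan2/servers.xml"), 50)
would then set every server in the file to 50 connections. If you have
several servers and only want to change one, edit by hand instead.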
But there's a number of related points to make about this as well.
0) (Really, I thought this was on the user list. I now see it's on devel.
But since I have the post already pretty much done... But cross-posted,
and Fup2 set to the user list... aka group, on gmane, see point 5...)
1) Server connections isn't the only "undocumented no-GUI" config option
available. There's a number of others, some of which can be /quite/
useful.
1a) While ~/.pan2 is pan's default data dir, pan checks to see if the
PAN_HOME environmental variable is set when it starts and uses the
directory path found there instead, if it is.
In addition to the obvious ability to set some other dir if you don't
like the ~/.pan2 location, this allows a very useful trick, which I've
been using here for years: set up wrapper scripts to set this to different
dirs before starting pan, and you now have the ability to set up multiple
independent pan profiles. =:^) Here, I use text, bin and test (the
latter since pan remembers data like read message sequence numbers from
every group you visit, whether you're subscribed or not; I like to be
able to check out a group and blow away the config without it screwing up
my permanent subscribed groups profiles), but you're perfectly free to
set up mp3 groups in one, tv program groups in another, pr0n in a third,
and the disney/nature/whatever groups for the kids in a fourth (obviously
this one would be the default, while you'd have to run your porn instance
from the command line... if the kids weren't on an entirely separate user
account... =:^), if it fits your needs better.
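A wrapper along those lines might look like the Python sketch below. The
only thing taken from pan itself is the PAN_HOME variable; the profile
directory you pass in (say ~/pan-profiles/text) and the assumption that
the binary is simply called "pan" on your PATH are illustrative.

```python
import os
import subprocess


def profile_env(profile_dir, environ=None):
    """Return a copy of the environment with PAN_HOME set to profile_dir."""
    env = dict(os.environ if environ is None else environ)
    env["PAN_HOME"] = os.path.expanduser(profile_dir)
    return env


def launch_pan(profile_dir, pan_binary="pan"):
    """Start pan against the given profile directory without waiting for it."""
    # Create the directory up front so a fresh profile has somewhere to live.
    os.makedirs(os.path.expanduser(profile_dir), exist_ok=True)
    return subprocess.Popen([pan_binary], env=profile_env(profile_dir))
```

One tiny script per profile, each calling launch_pan() with a different
directory, gives you the text/bin/test split described above.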
I then have another dir called globals that I keep files like the
scorefile and accels.txt in, since I want the same setup for those
features across all pan instances, with symlinks from the individual
profiles to the files in the globals dir, where appropriate.
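Setting up those symlinks is a one-off job, but here's a rough Python
sketch of it anyway. The directory layout and the function name are made
up for illustration; only accels.txt is a filename mentioned above, so
pass in whatever other shared files (the scorefile, etc.) you actually
use.

```python
import os


def link_shared(globals_dir, profile_dir, names=("accels.txt",)):
    """Symlink each named file from globals_dir into profile_dir.

    An existing symlink is replaced. An existing regular file is left
    alone and its name is returned in the list of skipped entries, so
    nothing gets overwritten by accident.
    """
    skipped = []
    for name in names:
        target = os.path.join(globals_dir, name)
        link = os.path.join(profile_dir, name)
        if os.path.islink(link):
            os.remove(link)
        elif os.path.exists(link):
            skipped.append(name)
            continue
        os.symlink(target, link)
    return skipped
```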
1b) Also in servers.xml along with the connections settings, the
expiration time is actually set in days, and you can set it to whatever
you want; you're not bound by the arbitrary choices in the GUI.
1c) Still in servers.xml, you can have three or more tiers of server
rank, if desired, you're not limited to the GUI choices of primary (1)
and backup (2, IIRC, or is it 0 and 1... I'm too lazy to go look ATM).
1d) In preferences.xml, it's possible to change the default cache size.
Charles was obviously infected with the GNOME mindset of "users are
dummies and can be easily confused if the config gets too complex", and
he deliberately omitted this option from the GUI for that reason. While
the default cache settings are a measly 10 MB, for different reasons, I
have both my binary and text profiles set to several gigs.
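If you'd rather script that change too, here's a minimal Python sketch.
Note the assumptions: that preferences.xml keeps integer settings as
<int name="..." value="..."/> entries, and that the cache setting is
named cache-size-megs. Neither is spelled out above, so check your own
file first and adjust the names to match what you find there.

```python
import xml.etree.ElementTree as ET

# Assumptions, not taken from the post: integer prefs are stored as
# <int name="..." value="..."/> and the cache key is "cache-size-megs".
CACHE_KEY = "cache-size-megs"


def set_int_pref(root, name, value):
    """Set the matching <int> entry under root, adding one if it is missing."""
    for elem in root.iter("int"):
        if elem.get("name") == name:
            elem.set("value", str(value))
            return elem
    return ET.SubElement(root, "int", {"name": name, "value": str(value)})


def set_cache_size(path, megabytes, key=CACHE_KEY):
    """Rewrite the preferences file at path with the new cache size in MB."""
    tree = ET.parse(path)
    set_int_pref(tree.getroot(), key, megabytes)
    tree.write(path, encoding="utf-8", xml_declaration=True)
```

Under those assumptions, a 5 GB cache would be set_cache_size(path, 5120).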
For the text groups, I have never-expire set on the servers and keep the
messages around more or less permanently. (I still have the cox.*
messages from the time I switched to new-pan, until the final highwinds-
media cox user access shutdown at the end of the year after cox
terminated the service.) For the groups (lists, including this one, as
newsgroups) I follow on gmane.org, since it archives too, I have at least
some of the lists/groups (including this one) as complete as the gmane
archive for them... back to 2002, which might be about the time gmane
started, since several unrelated lists go back to about the same date in
2002 and no further. As such, I believe I set something like a 5 GB
cache size for the text groups, altho current size was still well under a
gig (700 or so, IIRC, but I downloaded the back-archive of a couple gmane
groups since then), last I checked.
For binary groups, instead of saving files directly as they're
downloaded, I much prefer to download to cache (well, I'll often download
a sample or two first, if it's something that can be reasonably sampled,
to see whether it's something I want to bother downloading at all,
deleting it straight-out if not, but other than that...), go and do
something else (let it run while I sleep, or work, or whatever), and come
back when they're done downloading and I can sort thru and save them off
to the file locations, or delete them if I decide I don't want them (and
delete the cached articles as I save them off, otherwise), without
waiting for the downloads since they're locally cached.
I still remember in 2003 or so when I first started using (then /very/
old, 0.11, gnome-1 version) pan, and got VERY frustrated when I tried
that, only to find out pan had deleted most of them pretty much as fast
as it downloaded them, because of the stupidly tiny (for my usage) 10 MB
default cache size.
So I have a dedicated partition, 12 gig (and I've not done binaries in
ages, I might well make it 100 gig or so if I decided to get active again
and re-did my partitions for it), just for my binary pan instance
cache.
There may be more such config-file-only settings, but unless I've
forgotten one, those are the ones I know about. So, on to the next
point...
2) Pan isn't actually as efficient as it might be in managing
connections. Apparently, when decoding and saving files off directly, it
does that in the same thread as the download, so that thread actually
stops downloading temporarily, while the decode-and-save takes place.
As a result of this, if you use direct-save, you probably want more
connections than you can actually use to full capacity, since at any
given time, a couple of them are likely to be doing decode-and-save.
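To put a rough number on "more connections than you can actually use",
here's a back-of-envelope model in Python. It's my own simplification,
not anything measured from pan: assume each connection spends some fixed
fraction of its time in decode-and-save rather than downloading, and ask
how many connections it takes to keep a target number downloading on
average. The fraction itself is something you'd have to estimate for
your own setup.

```python
import math


def connections_needed(busy_target, decode_fraction):
    """Connections to configure so that busy_target are downloading on average.

    decode_fraction is the (assumed) share of time each connection spends
    decoding and saving instead of downloading, from 0 up to but not
    including 1.
    """
    if not 0 <= decode_fraction < 1:
        raise ValueError("decode_fraction must be in [0, 1)")
    return math.ceil(busy_target / (1 - decode_fraction))
```

So if a connection were stalled a quarter of the time and your line
saturates at six active downloads, connections_needed(6, 0.25) suggests
configuring eight. Downloading to cache corresponds to a decode fraction
of zero, where the answer is simply the target itself.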
Of course, if you download to cache first, it not only eliminates this
problem, but ALSO reduces disk I/O contention and head-seeking as the
load is all write-load while actually downloading, instead of write to
cache, then read all the parts in from cache for decode, do the decode
(CPU intensive), then write the decoded binary out again... on top of
whatever else you're doing at the same time, of course.
So if you set a big enough cache, then have pan download while you're
sleeping or at work or whatever, then come back and work with what's
downloaded, the download itself should perform rather better, utilizing
connection and available bandwidth rather more efficiently -- and that
has certainly been my experience, as well as that of a couple others that
have tested it.
Of course, if you're all SSD, you may instead wish to reverse that,
keeping the default 10 MB cache size and direct-downloading, but putting
the (small) cache on tmpfs or some such, so it never hits disk at all,
thus avoiding write-cycles on your SSD by only writing the permanent data.
FWIW, the differing cache sizes (and locations, keeping the temp binaries
cache separate from the generally permanent text cache) is one of the big
reasons I use separate binary and text profiles... beyond simply not
wanting the two subscribed-groups lists mixed with each other, anyway.
That also makes it easier to catch up on the trivial-bandwidth text
groups while I'm waiting for the binary download-to-cache to finish, if I
want... without getting distracted when I'm trying to write a nice tech
post by all of those binaries, since they're two entirely different pan
instances and main-windows.
3) Charles Kerr retired from pan some years ago. He says he no longer
does news, at least not regularly, but last I knew was still active in
the gtk/gnome community, working on transmission (the gtk-based bittorrent
client), among other things. That's why (official, at least) pan
development was dormant for several years.
However, we've made it thru that rough spot, when I'm sure I wasn't the
only one debating shutting out the lights on this list and going home, as
it seemed pan was destined to only get more hoary and stale with age...
until the distros couldn't maintain it against newer versions of gcc and
its dependencies any more...
Actually, now, pan seems to have a more active developer /community/ than
I believe it /ever/ had with Charles at the helm, at least since I came
on board to try to help with the lists, etc, in 2003 or so.
It worked out like this...
KHaley (lostcoder on github) is the new primary community developer, but
for whatever personal reasons, doesn't appear to be interested in the
gnome side of things *AT* *ALL*, so for some time, while he was the first
developer to start taking an interest in pan again, cloning the gnome repo
to his github account and at first, simply assembling most of the various
distro patches floating around, then gradually beginning to add his own
stuff, he has never been interested AT ALL in working (himself) with the
official gnome repo, so mostly, it was only a core group of regulars here
that could clone his repo and build it locally, that got the benefits.
Then PKovar (Petr, but I don't know that KHaley has ever revealed what
the K is for, so it's PKovar using a similar name) came along. He's
almost the exact reverse of KHaley, as AFAIK he doesn't really claim to
be a (code) developer at all. But he was an existing Gnome translator
and thus had Gnome repo access rights, and was interested in pan, too.
So together, KHaley who codes and codes pan, but isn't interested in the
gnome side of things, and PKovar, who has gnome access and is interested
in pan but doesn't code, make an interesting team! =:^) That's how pan
came to have the two official releases already this year, 0.134 being
mostly a wrapup of existing distro patches, etc, and 0.135, beginning to
cautiously introduce new code, mostly tweaking around the edges, the line-
wrapping code (which Charles struggled and struggled with, but never
/did/ get quite right), etc, nothing major, yet. (FWIW, PKovar contacted
Charles too, and Charles was quite happy to give the new team his
blessing and get that one more bit of unfinished business off his plate.
Charles and PKovar and the rebelbase.com folks apparently arranged for
PKovar to take over pan.rebelbase.com, as well, so PKovar is the official
maintainer of the entire project now, just that the code he commits to
the public gnome repo is pretty much straight from KHaley's stable
branch. Like I said, they seem to make a great team! =:^)
But KHaley is apparently rather time constrained and even if he weren't,
he seems to naturally be quite conservative in his changes, so were it
just those two (and me the dominant list regular), pan would continue to
move rather slowly, mostly in maintenance mode, with a few tweaks here
and there but possibly never again a major new feature. That's *WAY*
better than slowly sinking into disrepair and irrelevance, as it was, and
really, that's all pan needs now, and pretty much all we expected.
But then HMueller (Heinrich) burst on the scene, only this year, just a
few months ago, and what an entrance he made! =:^) While KHaley seems to
be a near ideal "conservative policy release manager", don't break what
people are already depending on, HMueller's the "Damn the torpedoes!
Full speed ahead!" type. =:^)
His big entrance was an announcement that he was working on full-feature
binary posting, for pan!! This wasn't some half-ass single-part binary
feature, either, but power-post material! You know how many *YEARS*
Charles had been talking about that? Certainly since at least the
gnome-1-based 0.11 series, in 2003! And pan's name was originally an
initialism for Pimp-Ass Newsreader (with the -Assed bit toned down over
time, with PAN replacing Pimp-Assed Newsreader and gradually, pan
replacing PAN, until few here now probably knew it at all, until I
brought it up again recently, in exactly this binary-posting context,
matter of fact), so it obviously had big ambitions well before I was
around, too, possibly tracing to before Charles took over (I don't know
if the name was his or inherited from the original dev(s), still in the
credits but that's about all I know of them as it was before my time and
I've /no/ idea when Charles took over or under what circumstances).
The binary-posting feature is still far from stable, AFAIK and in fact,
remains quite experimental, with quite heavy changes still going on. As
such, I don't believe it's likely to make it into an official release in
the /immediate/ future (as I'd say is appropriate, KHaley's steady hand
is appreciated here), but I expect that it might make a release either
late this year (Christmas present! =:^) or by spring, next.
He's also working on another long discussed feature. If you were an old-
pan user, you may remember the "rules" it had, allowing automated mark-
read, delete or download, among other things. Charles always thought it
was too complicated, and I agree, to the extent that back then, how to
setup pan for auto-mark-read and/or auto-downloading, was a very heavily
asked FAQ, despite the feature being right there, if a bit complex. For
that reason, he never implemented them in new-pan. There has often been
discussion of a much simpler and quite reasonable alternative (score-
based, linked to the same scoring categories used in the view menu and
for color-coding, see the archives for the details, or...), and Charles
even commented that he agreed with the idea at one point, but by then, he
was slowing down on pan coding, and that never got implemented either.
Which has been quite irritating for those of us that keep certain
messages marked unread to come back to later, so we can't use the auto-
mark-read at group exit and/or at pan close options, because currently,
scoring-ignored remains marked unread, and if they're not set to be
viewed either, there's no way to MARK them read without marking the ones
read that one wants to keep marked unread to come back to later!
And of course, being able to auto-download either all watched, or all
scored at whatever level, would be a huge improvement as well! The
ability to auto-* based on rules is the one big feature that old-pan had
that new-pan never had... until HMueller came along, anyway!
If pan gets those features, it'll be very very close to feature-complete,
at least in my book. About the only other big feature I know of that it
could use, would be some decent documentation, but eh... this group/list
needs SOME reason to exist. =:^)
So there's some VERY BIG changes coming down the pike, but right now, in
order to taste 'em, you gotta pull from git and build your own!
Meanwhile, point...
4) With Charles now out of the picture, I knew it'd happen eventually.
Someone would come along with a suggestion that if implemented, would
kill pan's GNKSA certification. Connections was of course the most
likely trigger.
And so it happened. HMueller, having no inkling about GNKSA or what it
meant to Charles, pan's primary dev for so many years, and consequently,
to at least some of the people who had continued to use pan all these
years, AND, like you, not realizing that the 4-connections-limit was only
enforced in the GUI server-settings and that it was DELIBERATELY possible
to edit servers.xml directly and set whatever number of connections one
liked and have pan honor it, WITHOUT forcing pan to take down that GNKSA
seal that many of us respect, made what he thought was the logical change.
As I had been wondering when it might happen, and debating whether or how
I should bring up the discussion myself, I of course immediately jumped
on the opportunity poor innocent HMueller afforded me! =:^)
So, the regulars here had a little debate on the topic. What, /indeed/
about GNKSA?
Once he realized the significance, HMueller did the right thing (at least
for the time being) and reverted that bit of that commit. So AFAIK, the
GUI is back the way it was, and you can still set as many connections as
you like in servers.xml, directly.
IMO, part of the problem with that particular bit of the GNKSA is that in
reality, it was anachronistic from the very time it was introduced, with
GNKSA 2.0 in (IIRC) the late '90s.
I'm almost positive that point, which stresses both opening no more than
four connections *AND* making efficient use of the connections you *DO*
open, was added to cover abuses by the likes of MS Outlook Express, which
would open up multiple connections, use them for one thing, and then just
let them sit there. It would use one connection for its batch downloads,
and if one were clever, one could force it to use a second by queuing up
messages to read interactively, but that's all the connections you could
get it to use efficiently, despite the fact that it opened another,
dedicated ONLY to periodically checking for and updating the list of
unread messages in all subscribed groups, and I /think/ another for
something similar (maybe only for the initial automatic check to see if
the newsgroup list had changed, tho I never confirmed that, but I *DO*
know that various ISP support folks have told me that if they set less
than four connections per user, support calls from OE users went up
*DRAMATICALLY*).
But OE was already breaking all sorts of other netiquette rules and simply
declaring another in GNKSA wasn't going to make its MS devs or its users
care any more than (in general) they already didn't.
Further, by that time most if not all commercial and ISP news services
were server-side tracking and enforcing connection limits already,
because they HAD
to, and again, whatever violators there were that forced this, simply
weren't going to care what GNKSA had to say about it. So by the time it
was even introduced, the rule, applying as GNKSA does to the news CLIENT,
was anachronistic, since everywhere it mattered, server-side connection
tracking and limitation was already enforced.
Plus, by the simple mechanism of classic capitalistic competition, as
soon as the number of connections was server-side enforced, it became a
bullet point in the feature competition between news providers (and to a
lesser extent, for a time, between ISPs). So allowing (comparatively)
huge numbers of connections very quickly became a distinguishing mark of
the paid news providers, something users were paying good money for, yet
GNKSA disallowed compliant clients letting their users take full
advantage of what they were paying for!
But, here's the problem, despite that one IMO very bad choice, making it
a MUST, no less, not just a SHOULD, MOST if not all of the rest of the
GNKSA provokes FAR less controversy, and while /some/ users might quibble
at the likes of top-posting restrictions, line-length restrictions, etc,
I and from our little discussion here on the list, most other pan
regulars (at least to the extent that they ARE on this list), believe
having a sort of checklist to make sure nobody "accidentally" steps over
the line, is a rather GOOD thing.
Meanwhile, the GNKSA folks themselves seem to believe its time has
passed. There was some discussion here on whether their reply indicated
that they'd be willing to update it, if an appropriate suggestion were to
come, or whether they considered it a closed chapter of their lives, save
for perhaps keeping the now vanity domain name, perhaps for email they've
long had, etc.
FWIW, the way it was left was that someone from here volunteered to be
the liaison with the GNKSA folks so the whole list didn't bother them at
once, and they figured they'd already bothered them enough for the time
being, so were going to wait a month or so before getting clarification
on further GNKSA updates, etc.
I think it has been about that long; perhaps we need to bump our liaison
and see what he reports. =:^)
That was only a couple months ago. If you're at all interested, please
do go back and read the archive, and add your own thoughts if desired.
The alternatives include updating GNKSA if we're interested (presumably
with either the GNKSA folks' cooperation, perhaps even giving us partial
control of that bit of the web site, tho it's early to presume, or at
least not vetoing our action, if they'd rather just leave it behind
them), simply dropping all mention of it from the pan site and forgetting
about GNKSA entirely, including the rest of it, or taking the rest of it
and changing it into a public pledge, "pan is built with these goals in
mind", with the same list minus the connections point and not calling it
GNKSA (but presumably linking it in the acknowledgments).
Finally...
5) I've mentioned it in passing several times already, but this mailing
list (and many others) is (are) carried by gmane.org in both newsgroup
form and web forum form. You can find the list archives there, as well
as, I /think/, on the gnome site. If you're already using pan for text
groups, you may find that server a useful addition, for whatever lists
you follow.
You can read the lists as newsgroups by pointing pan at news.gmane.org .
Before replying to anything using gmane news, however, you should
probably read up on how it all works, the way the confirmation mail
works, etc, at http://gmane.org.
(The rest snipped. The post is long enough, and you can reformulate and
repost any questions you may still have after digesting the above. =:^)
(Again, "this list" thruout the above refers to the user list, for the
most part. The dev list doesn't get much traffic. Please followup on
user.)
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman