pan-users

Re: [Pan-users] where does pan cache it's incompleted downloads>


From: Duncan
Subject: Re: [Pan-users] where does pan cache it's incompleted downloads>
Date: Thu, 4 Jan 2018 12:50:35 +0000 (UTC)
User-agent: Pan/0.144 (Time is the enemy; 28ab3baf7)

Pedro posted on Thu, 04 Jan 2018 16:39:35 +1100 as excerpted:

> hi, sorry about that but I have no way of knowing about any reference
> headers in an email.
> 
> a. I deleted the title and replaced it with a new one thinking this was
> making a new thread.
> 
> b. I replied but changed the recipient to what I assume is this 'groups'
> email address. - address@hidden
> 
> c. I deleted the contents of the replied to thread and I cannot see any
> other information that I am sending.

?etouq erofeb ylper ,noitseuq erofeb rewsna ,thguoht drawkcab referp 
yllaer uoy oD

(Do you really prefer backward thought, answer before question, reply 
before quote?)

.senil desrever ,llits retteb eb lliw siht ebyaM

(Maybe this will be better still, reversed lines. [1])

Seriously, there's a reason pan strongly discourages posting the reply 
before the quote, warning you if you try to send a post that has the 
reply first.  It wants contextual replies instead: quote what you are 
replying to, then reply to it.  Quotes followed by replies in context 
are simply more natural to follow, make clear what part of the post the 
reply was intended to reply to, and make it far easier for others to 
reply in turn in the same way.

It's unfortunate that Thunderbird and other popular mail and/or news 
clients aren't similarly strict, warning about such things but still 
letting posters continue if they insist.

(FWIW, it's also part of GNKSA, the Good Net-Keeping Seal of Approval, 
with pan one of the few news clients that gets a 100% GNKSA pass.  There 
have been on-list discussions in the past about whether pan should 
continue that on one point or another, but the general feeling has been 
that while one or another point might be inconvenient at times, most pan 
users strongly support GNKSA in general, strongly enough to support 
continued 100% compliance even if there are one or two individual points 
they don't entirely agree with.  Once pan lost that 100% compliance mark 
on a single point, it would be all too easy to slip further and further 
away in other areas as well, until GNKSA was ignored entirely and pan 
had lost much of the reason many people prefer it, even for text groups.)

But it's not /all/ bad! At least you spared us posts in html! =:^)

To answer your points, however...

[FWIW, this is a long post and a lot of information, but I imagine others 
besides you might find some of it interesting as well, so I think/hope 
it's worth the several hours I put into composing it. That's the 
advantage of newsgroups/mailinglists, where more than the person replied 
to can see the post. =:^) ]

Most clients don't display /all/ headers, unless you specifically ask 
them to.  In this case, the normally hidden references header, and the 
similar in-reply-to header, list the message-id of the post they're 
replying to, with in-reply-to listing only the immediate parent, while 
references, more common on news, lists all or most of the upthread.
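
To look at those two headers outside a mail or news client, here is a 
minimal sketch using Python's standard email parser.  The message text, 
the addresses and the message-ids are all made up for illustration.

```python
from email import message_from_string

# A made-up reply; every address and message-id here is hypothetical.
RAW = """\
From: someone@example.com
Subject: Re: example thread
Message-ID: <reply3@example.com>
In-Reply-To: <reply2@example.com>
References: <root1@example.com> <reply2@example.com>

Body text.
"""

msg = message_from_string(RAW)

# In-Reply-To names only the immediate parent.
parent = msg["In-Reply-To"]

# References lists the upthread, oldest first, separated by whitespace.
ancestors = msg["References"].split()

print("parent:   ", parent)
print("ancestors:", ancestors)
```

A reply composed from scratch with a new subject carries neither header, 
which is how clients tell a new thread from a retitled reply.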

Of course the message-id header itself is another such header, designed 
to uniquely identify individual messages, so they can be referred to in 
the above headers, listed in *.nzb files, etc.  Since message-ids are 
intended to be unique, pan uses them (with a few substitutions to make 
them work better as filenames) as the name of the individual message 
files in its cache, as well.
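
As a rough illustration of the idea only: the substitution rule and the 
".msg" suffix below are assumptions made for this sketch, not pan's 
actual mapping, and the function name is hypothetical.

```python
import re

def message_id_to_filename(message_id: str) -> str:
    """Hypothetical mapping from a message-id to a safe filename.

    This is NOT pan's actual rule set, only an illustration of the idea:
    strip the angle brackets and replace anything awkward in a filename.
    """
    core = message_id.strip().strip("<>")
    # Keep letters, digits, dot, dash, underscore and @; replace the rest.
    return re.sub(r"[^A-Za-z0-9._@-]", "_", core) + ".msg"

# A made-up message-id containing characters that are awkward in filenames.
print(message_id_to_filename("<part1of9$abc/def@example.com>"))
```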

As mentioned, most clients will show you all headers on received messages 
if you tell them to.  (Sent messages are another matter, since some 
headers are added in transit or not assigned until you actually send the 
message.  See the lines and size headers, as well as the received headers 
added by mail servers as they get and pass on the message, for instance, 
and the similar entries in the path header for news.)

In pan the default hotkey to toggle view full headers on a received 
message is (IIRC, I've customized many of my hotkeys) the "H" key.  In 
claws, my mail client, it's the "V" key (view raw message, including 
headers).  Presumably, thunderbird has a similar option, which may or may 
not be assigned a hotkey.

> d. what strictly usenet group is this anyway? I can use pan.

While I don't believe I specifically said it was a newsgroup, it /does/ 
happen to be one, but a somewhat longer explanation may be necessary...

First, news and mail use the same general message format and many of the 
same headers, specified originally for mail messages in the RFCs. (RFCs 
are Internet Requests for Comment, the documents that eventually become 
the Internet Standards defining the protocols and formats required for 
interoperation of the great variety of different machines and software on 
the internet.)

News makes use of this originally mail message format, putting it to use 
for a one-to-many purpose, where a post is sent to many readers who 
connect to news servers to download it, as opposed to internet mail's 
normally one-to-one format, sent from one person to another specific 
person, tho it may be CCed to specific others as well.

Because the news and mail message formats are extremely similar, 
technically the same but for a few headers, which can be ignored by the 
other one, once someone has software handling one of the two, it's quite 
easy to add a little bit more code to handle the other one as well, 
almost "for free", since it's very little more code once one has the 
other one working properly.
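
A small sketch of that "almost for free" point, using Python's standard 
email parser on one mail message and one news article.  Both messages, 
and the summarize() helper, are made up for illustration.

```python
from email import message_from_string

# Two made-up messages; all names and ids are hypothetical.
MAIL = """\
From: alice@example.com
To: bob@example.com
Subject: hello
Message-ID: <m1@example.com>

Hi Bob.
"""

NEWS = """\
From: alice@example.com
Newsgroups: example.test
Path: news.example.com!not-for-mail
Subject: hello
Message-ID: <n1@example.com>

Hi all.
"""

def summarize(raw: str) -> dict:
    """Parse either kind of message with the same code."""
    msg = message_from_string(raw)
    return {
        "subject": msg["Subject"],
        "message_id": msg["Message-ID"],
        # Headers the message doesn't carry simply come back as None.
        "to": msg["To"],
        "newsgroups": msg["Newsgroups"],
        "body": msg.get_payload().strip(),
    }

print(summarize(MAIL))
print(summarize(NEWS))
```

The parser doesn't care which of the two it was handed; the few headers 
specific to one side are simply absent on the other.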

This is why it's so common for clients that handle mail, to also handle 
news, altho not necessarily /well/, since news, particularly binaries, 
tends to be far higher message volume and bandwidth, and handling that 
many more messages as fast as necessary for efficient binary processing 
isn't easy.

But it also makes it reasonably easy to "gateway" news to mail, and mail 
to news, where that might be useful.

Which is where gmane comes in.  It's a service that "gateways" messages 
between mailing lists and newsgroups that match them, on a news server it 
hosts.

Now gmane has some interesting history, including a web-accessed version 
of the news service that was transferred to a new owner in the middle of 
2016, that never really came up again after the transfer, so the web side 
doesn't work, but the news side still does, and there's some 
(unfortunately limited) still /somewhat/ useful info about the news side 
on the website as well -- as long as you understand that some of it is 
stale and doesn't refer to current news-side status and conditions.

But you can get some idea, anyway...

https://gmane.org

Meanwhile, the news server is:

news.gmane.org

You can connect to it using the normal unencrypted news port, 119, or the 
encrypted snews port, 563. (Assuming you have pan built with gnutls 
support to enable secure connections, just set the security/TLS setting 
in the server properties to use secure connections, and it should 
automatically set the port to 563.)

It's still run by Lars, the guy who originally started the whole thing, 
with a few volunteer admins helping a bit.

The gmane.* hierarchy is the gated mailing-lists, with the pan-users 
mailing list as...

gmane.comp.gnome.apps.pan.user

Particularly since the info that was formerly on the website about 
posting, how to request new lists be added, etc, is now outdated or 
missing, you'll almost certainly want to subscribe as well to

gmane.discuss


That should let you /read/ the groups/lists you want, assuming they're 
carried by gmane.  Posting is rather more complicated.  I used to refer 
people to the gmane website's posting instructions, but that page is 
broken now, so that doesn't work...

Basically, the first time you post to a "group" via gmane, gmane will 
email the address you used to post (which thus must be valid) asking that 
you confirm your intent to post.  Once you confirm, gmane forwards the 
message to the mailing list, and it's up to the mailing list what to do 
with it.

Once gmane confirms an email address for a specific list, gmane will 
forward any new messages it gets from that address for that list.  But 
send to a different list and you'll need to confirm again, for the new 
list.

What the mailing list does with the message once gmane has forwarded it 
is out of gmane's hands and depends on the list.  Some lists are "open" 
and will, possibly after a few automated anti-spam checks, post the 
message to the list.  Others are private, and only accept posts from 
users who have subscribed to that list. (Note that this is "subscribed" 
from the list-serv's perspective, and is entirely independent of whether 
you've "subscribed" to the gmane group in pan or whatever other news 
client you may use.)  Then there are the "moderated-posting" groups, 
which will normally automatically post your message if you are 
subscribed, but will put non-subscriber posts in a moderation queue for a 
human to approve before they're actually posted to the list.  Finally, 
there are some "read-only/announcement" type lists that don't accept 
posts from normal subscribers at all, only a few admins.

Meanwhile, once a message has actually been posted to the list, it 
doesn't matter whether via gmane or directly to the list, gmane gets it 
via gmane's own subscription to the list, and will gateway that post to 
the newsgroup that matches that list.

Again, gmane only actually posts messages it gets from its subscription 
to a list, regardless of how they were posted to the list.  Confirming 
your email address to gmane won't necessarily get it posted to the gmane 
group, it'll just get the message forwarded to the list.  If the list 
rejects it, it won't get posted to gmane either, because gmane only posts 
to the group what it gets from its subscription to the corresponding list.

So posting via gmane is actually a two-step process, all behind the 
scenes once you're setup to do it, but for that first post to a group/
list, requiring first a confirmation for the gmane challenge that you 
actually want it posted, then, depending on the list's own policy, a 
possible wait for moderation or subscription there, as well.
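
The two steps can be modelled with a toy class.  This is purely 
illustrative: the class, its methods and the policy names are invented 
for the sketch and say nothing about how gmane is actually implemented.

```python
class ToyGateway:
    """Toy model of the two-step posting flow described above."""

    def __init__(self, list_policies):
        # list name -> "open" or "subscribers-only" (invented policy names)
        self.list_policies = list_policies
        self.confirmed = set()      # (address, list) pairs confirmed to the gateway
        self.subscribers = set()    # (address, list) pairs known to the list-serv

    def confirm(self, address, list_name):
        # Simplification: the real flow forwards the held message on
        # confirmation; here the caller just posts again afterwards.
        self.confirmed.add((address, list_name))

    def post(self, address, list_name):
        # Step 1: the gateway challenges the first post per address per list.
        if (address, list_name) not in self.confirmed:
            return "challenge emailed"
        # Step 2: the list-serv applies its own policy to the forwarded mail.
        policy = self.list_policies[list_name]
        if policy == "open" or (address, list_name) in self.subscribers:
            return "posted to list, then gated to the newsgroup"
        return "forwarded, but held or rejected by the list"


gw = ToyGateway({"list-a": "open", "list-b": "subscribers-only"})
print(gw.post("me@example.com", "list-a"))    # first post: challenged
gw.confirm("me@example.com", "list-a")
print(gw.post("me@example.com", "list-a"))    # confirmed for list-a
print(gw.post("me@example.com", "list-b"))    # a different list: challenged again
```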

Note that if you do want or have to subscribe to the list, once you're 
using gmane you may want to put your actual list subscription into either 
digest mode, so you only get periodic messages including everything to 
the list for the period instead of each individual message, or vacation 
mode, which lets you post, but won't actually send you the messages via 
mail, useful when you're reading them via gmane anyway.

Meanwhile, there have been a few complaints to gmane.discuss that the 
usual gmane first-post authorization process is broken in some cases, and 
people never get that first-post challenge from gmane to respond to, to 
authorize gmane to forward the message to the list-serv.

For this sort of case it's worth keeping in mind that replying via gmane 
isn't your only choice.  Since gmane is actually gating the posts from a 
mailing list, you can actually reply via mail to the mailing list itself 
as well, bypassing gmane entirely for the reply.  Since you're replying 
via mail to the list itself, the list-serv will apply its normal policy 
and forward the mail to the list if you're a member, or to a moderator if 
it's a moderated list and you're not a member, and gmane won't see the 
message at all unless/until the message is actually posted to the list 
and gmane sees it via its list subscription.

Finally, it's worth noting that for some lists, including the pan list, 
gmane obfuscates email addresses, or indeed, anything that looks like an 
email address (of the form user @ example.com , without the spaces, of 
course).  The obfuscated form will have a domain name of public.gmane.org, 
while the username portion looks like nonsense.  However, to gmane these 
obfuscations are reversible, and while I haven't actually tried it since 
the web side split off, at least before that, if you actually sent 
someone a mail via the obfuscated public.gmane.org address, gmane's mail 
server would, after ensuring it didn't look too spammy and gmane wasn't 
getting a whole bunch of them for other addresses at the same time, 
forward them to the real email address.

So it was, and should be still tho I've not tried it recently, actually 
possible to use your obfuscated @public.gmane.org address as a forwarding 
address to your real address.  Quite cool, actually! Maybe I should try 
it again one of these days to see if it still works.  =:^)

Meanwhile...

I'm not actually sure if gmane still hosts the gwene.* groups too, I've 
not looked since all the changes, but if so, they're similarly gated from 
web feeds (atom or rss) to newsgroups.  However, since those are XML 
based, you'll likely find pan doesn't work well with them.  But 
thunderbird likely will.

> that said, thanks for the warning and the information.. I'll check on
> it, those cache directories are there and I can see the cache settings
> in pan.
> 
> I doubt I will get what I want though as the msg files are scrambled. I
> wanted to start watching movies whilst downloading them.

That's not "scrambled" per se, it's "encoded".  Here's the deal.

The original email format was designed for 7-bit ASCII text only and it 
wasn't, and isn't, possible to send binary files directly.  There's 
actually quite some history behind the three major plus additional minor 
encoding schemes, but the general idea of all of them is to encode 8-bit 
binary into some form that can be transmitted over mail and news paths 
without corrupting the files.

The earliest major format was UUE, Unix-to-Unix-Encoding.  This chose 64 
ASCII characters that could transmit cleanly using 7-bit ASCII, using 
them to encode 6 bits (64=2^6) of the binary for each encoded ASCII 
character.  Since each ASCII character is transmitted using 8 bits, but 
it only encoded 6 bits of the binary plus overhead (line terminations, 
etc), this encoding form is slightly more than 33% larger than the 
original binary file.

That is, a UUE message takes a bit over 4 KiB to encode 3 KiB of binary 
file.
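
The arithmetic can be checked with Python's binascii module, which 
still ships a uuencode primitive.  The uuencode_body() helper is a name 
chosen for this sketch, and the begin/end wrapper lines are left out, 
so a real post would be slightly larger still.

```python
import binascii

def uuencode_body(data: bytes) -> bytes:
    """Encode data as uuencoded lines (without the begin/end wrapper)."""
    lines = []
    # binascii.b2a_uu accepts at most 45 input bytes per call, giving
    # one length character, up to 60 encoded characters and a newline.
    for i in range(0, len(data), 45):
        lines.append(binascii.b2a_uu(data[i:i + 45]))
    return b"".join(lines)

raw = bytes(3 * 1024)            # 3 KiB of dummy binary data
encoded = uuencode_body(raw)

print(len(raw), "bytes in,", len(encoded), "bytes out")
print("ratio: %.3f" % (len(encoded) / len(raw)))
```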

But UUE was never entirely standardized, and a new, more standardized 
scheme was eventually chosen, MIME's Base64.  MIME itself stands for 
Multipurpose Internet Mail Extensions, and was a larger set of 
specifications that standardized a bunch of related stuff, including 
headers that specified the mime-type of a message part, the encoding used 
for it (including not only Base64, better for binary, but quoted-
printable, better for mostly text which came thru mostly unchanged, but 
could handle occasional binary characters), ways to specify how non-ASCII 
characters were encoded in otherwise ASCII-only headers so they wouldn't 
break backward compatibility, etc.

But while much more formally standardized, MIME Base64, as the name 
suggests, also used 64 ASCII characters (tho slightly different ones than 
UUE) to encode 6 bits of binary, so it had the same 33%+ encoding 
overhead (with MIME quoted-printable better for mostly text, but much 
much worse for mostly non-text binary).

So Base64 is also 4 KiB encoded for 3 KiB of binary file.
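
The same check for Base64, using Python's standard base64 module.  The 
bare encoding comes out at 4 KiB for 3 KiB of input; the MIME-style 
variant adds a newline per output line on top of that.

```python
import base64

raw = bytes(3 * 1024)                     # 3 KiB of dummy binary data

bare = base64.b64encode(raw)              # no line breaks
wrapped = base64.encodebytes(raw)         # MIME-style lines, newline-terminated

print(len(raw), "->", len(bare), "bytes bare")
print(len(raw), "->", len(wrapped), "bytes with line breaks")
```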


Then, over a decade later, along came yenc, aka yEnc, aka y-encoding.  
This was primarily pushed by binary news users, actually mostly posters 
since they define how they post and downloaders have to use something 
that understands it if they want to do anything with the download, who 
were fed-up with the 33% overhead of UUE and Base64, since by then news 
was /almost/, tho not entirely, 8-bit clean, and yenc could take 
advantage of that, for news only, to be much more efficient.

As a primarily news-poster-enforced choice, yEnc wasn't nearly as 
standardized as MIME/base64, and remains rather less mainstream than the 
old UUE as well, particularly since many primarily mail clients that also 
support news don't support it, and not all mail servers or paths support 
it.

But the one thing yenc *DOES* have going for it is the much better 
encoding efficiency, only ~5% overhead, compared to the 33% overhead of 
the other two major types of encoding!  It does this by using all the 
8-bit character values it can get away with -- the CRLF sequence is 
reserved as the standard line termination, and a few others are 
reserved as well; these must be escape-encoded -- and by using much 
longer lines, so there's less overhead.  To eke out slightly better 
efficiency, yenc is also "shifted" by a fixed offset from standard 8-bit 
ASCII.  But it's only "guaranteed" to work over direct news transfers, 
not when gated to other formats, and not for mail.
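
A minimal sketch of the core transform, as I understand the yEnc draft 
spec.  The specifics (a shift of 42, escaping NUL, LF, CR and "=" with a 
leading "=" plus a further 64) come from that draft rather than from 
anything stated above, so treat them as assumptions.  The =ybegin/=yend 
wrapper lines and line wrapping are left out.

```python
# Byte values that must be escaped after the shift: NUL, LF, CR and '='.
CRITICAL = {0x00, 0x0A, 0x0D, 0x3D}

def yenc_encode(data: bytes) -> bytes:
    """Core yEnc transform, without headers or line wrapping."""
    out = bytearray()
    for b in data:
        e = (b + 42) % 256              # the "shift"
        if e in CRITICAL:
            out.append(0x3D)            # escape marker '='
            e = (e + 64) % 256
        out.append(e)
    return bytes(out)

def yenc_decode(payload: bytes) -> bytes:
    """Inverse of yenc_encode for an unwrapped payload."""
    out = bytearray()
    escaped = False
    for e in payload:
        if escaped:
            out.append((e - 64 - 42) % 256)
            escaped = False
        elif e == 0x3D:
            escaped = True
        else:
            out.append((e - 42) % 256)
    return bytes(out)

sample = bytes(range(256))              # every byte value exactly once
encoded = yenc_encode(sample)
assert yenc_decode(encoded) == sample
print(len(sample), "->", len(encoded), "bytes before line terminators")
```

Only the escaped bytes and the line terminators add size, which is 
where the low overhead comes from.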

But in an age where some posters and many downloaders were still 
connecting using dialup at 56kbps or less down, 33kbps or less up, 5% 
overhead compared to 33% overhead shaved enough time off the uploads and 
downloads that it was WORTH IT, for many.
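
To put rough numbers on that, here is a back-of-the-envelope sketch.  
The 10 MiB file size is a made-up example, the line rate is the nominal 
one, and protocol overhead is ignored.

```python
def transfer_seconds(payload_bytes: int, overhead: float,
                     bits_per_second: float) -> float:
    """Time to move payload_bytes when encoding inflates it by `overhead`."""
    return payload_bytes * (1 + overhead) * 8 / bits_per_second

size = 10 * 1024 * 1024          # a hypothetical 10 MiB binary
rate = 56_000                    # nominal 56 kbps downstream

slow = transfer_seconds(size, 0.33, rate)   # UUE / Base64
fast = transfer_seconds(size, 0.05, rate)   # yenc, using the figure above

print("33%% overhead: %.0f min" % (slow / 60))
print(" 5%% overhead: %.0f min" % (fast / 60))
print("time saved:   %.0f%%" % (100 * (1 - fast / slow)))
```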

There were a few other minor advantages as well, such as the fact that 
yenc checksums the encoding, so if it's corrupted, that's known 
immediately.
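
As far as I know the checksum in question is a CRC-32 carried in the 
yenc trailer line; treat that detail as an assumption here.  A minimal 
sketch of how a single changed byte shows up, using Python's zlib:

```python
import zlib

def crc32_hex(data: bytes) -> str:
    """CRC-32 of data as 8 lowercase hex digits."""
    return "%08x" % (zlib.crc32(data) & 0xFFFFFFFF)

original = b"123456789"
damaged = b"123456780"          # one byte changed in transit

print(crc32_hex(original))
print(crc32_hex(damaged))
print("intact:", crc32_hex(original) == crc32_hex(damaged))
```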

And within a couple years, no news client could seriously claim good 
binary handling if it didn't handle yenc, because that's what most of the 
binary posters were posting in!


So your "scrambled" is actually encoding, probably yenc, if some of the 
"characters" don't appear as standard ASCII text, or mime/base64 or 
possibly uue, if it's all ASCII.
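
A rough heuristic along those lines, for poking at a raw cached 
message.  The marker strings are the usual ones ("=ybegin" for yenc, a 
"begin <mode> <name>" line for uue, a Content-Transfer-Encoding header 
for base64), but the guess_encoding() function itself is a hypothetical 
helper, not anything pan provides.

```python
import re

def guess_encoding(raw: bytes) -> str:
    """Rough guess at how a cached article body is encoded."""
    if b"=ybegin " in raw:
        return "yenc"
    if re.search(rb"^begin [0-7]{3,4} \S", raw, re.MULTILINE):
        return "uue"
    if re.search(rb"^Content-Transfer-Encoding:\s*base64", raw,
                 re.MULTILINE | re.IGNORECASE):
        return "base64"
    # Fallback in the spirit of the paragraph above: bytes outside 7-bit
    # ASCII suggest yenc, otherwise we cannot tell from this alone.
    if any(b > 0x7F for b in raw):
        return "probably yenc"
    return "unknown"

print(guess_encoding(b"=ybegin line=128 size=3 name=a.bin\r\nabc\r\n"))
print(guess_encoding(b"begin 644 a.bin\n#86)C\n`\nend\n"))
print(guess_encoding(b"plain text only\n"))
```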

These days, clients including pan normally transparently encode when 
posting and decode when saving binaries, so a user not familiar with the 
old days may not even be aware of that encoding, but back in the early 
days when it was UUE, posters had to run their binaries thru a separate 
encoder to encode them as 7-bit ASCII, then copy the encoded version into 
their emails as text.  Similarly, downloaders had to feed the raw text 
messages, or even edit them first to remove the wrappers so it was just 
the bare encoded binary, to the separate decoder, which would then output 
the decoded binary which could then be saved as a real binary once again.

And when yEnc first appeared, people who had been used to clients that 
could by then handle mime/base64 and UUE transparently, again found 
themselves having to manually feed the raw messages to their yenc-decoder 
before they could save them.  But it was still worth it to shave an hour 
out of every four off their download time, and even if it wasn't, that's what 
the posters were posting in (yenc-power-post was one of the first ones 
that handled large-scale posting in yenc, and it quickly became popular, 
because at 28-33k, uploads were even slower than 48-56k downloads, on 
dialup!), so they had little choice.

Today any self-respecting news binary downloader, including pan, can 
handle yenc, but that doesn't mean you can't still use the separate 
encoder/decoders if need be.

If interested, try uudeview or one of the similar tools.  That should let 
you manually decode the binaries found in the raw cached messages.

Since you're specifically interested in decoding incomplete posts, you'll 
be interested in uudeview's -d/desperate option, which allows just that.  
However, with most file types you'll need at least the first part 
correct, or your viewer/player app probably won't be able to tell what 
codecs it needs to view/play it properly.

For that matter, you should be able to force pan to save an incomplete 
file as well, tho I think it'll try to download the missing parts first 
if they're on the server, and will only save the incomplete file if it 
can't see or get the parts to download.  And as I said, you'll probably 
need the first part, regardless, or it'll save but won't play.

(I haven't done binaries in quite some time now, but when I did, for 
movies I found I needed at least the first part.  For most movie types, 
after that I could skip some parts and play the others I had, with some 
temporary corruption where parts were missing until the next full anchor 
frame, but it would play.)

> On 03/01/18 16:28, Duncan wrote:
>> Pedro posted on Wed, 03 Jan 2018 13:26:35 +1100 as excerpted:
>>

---
[1] Reversed lines:  Courtesy of the "rev" command, part of the util-linux 
package.  See the rev (1) manpage.  Here's the line I used for the second 
one:

echo "Maybe this will be better still, reversed lines." | rev

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



