Re: [nmh-workers] mhshow: invalid BASE64 encoding in --


From: Valdis Klētnieks
Subject: Re: [nmh-workers] mhshow: invalid BASE64 encoding in --
Date: Sun, 17 Mar 2019 22:04:17 -0400

On Sun, 17 Mar 2019 20:43:40 -0400, David Levine said:

> Note the "in base64-encoded data".  The characters in the footer are after
> the end of the base64-encoded data, per the use of "end" here:
>
>    Special processing is performed if fewer than 24 bits are available
>    at the end of the data being encoded.  A full encoding quantum is
>    always completed at the end of a body.

From the very next paragraph:

   Because it is used only for padding at the end of the data, the
   occurrence of any "=" characters may be taken as evidence that the
   end of the data has been reached (without truncation in transit).  No
   such assurance is possible, however, when the number of octets
   transmitted was a multiple of three and no "=" characters are
   present.

In other words, you only have a 2/3 chance of detecting that you've hit the
intentional end of the input by looking for an '='.  If you're ignoring line
breaks and illegal characters, that means you've got a 1 in 3 chance of
continuing to pull in and decode stuff that wasn't intended to be decoded.
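
To make the 2/3 figure concrete, here is a rough sketch that encodes runs of
1 to 6 zero bytes.  The helper name encode_zeros is made up for illustration,
and it assumes a base64(1) and a head(1) that accepts -c.  Only the lengths
that are a multiple of three come out with no trailing '='.

```shell
# Hypothetical helper: base64-encode N zero bytes.
encode_zeros() {
    head -c "$1" /dev/zero | base64
}

# Print each length next to its encoding; lengths 3 and 6 carry no '='.
for n in 1 2 3 4 5 6; do
    printf '%d %s\n' "$n" "$(encode_zeros "$n")"
done
```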

> If we do want to handle this input, I don't think that we should make the
> base64 decoder more lenient.  Instead, I think that mhfixmsg should transform
> it such that the rest of nmh could handle it.

That still doesn't address the *real* question, which is what form that
transform should take.  Deciding whether the detection of an issue should
be in the base64 decoder or elsewhere is bikeshedding compared to trying
to decide what semantics you want.

> I think that it is what I meant.  But I also suspect that I'm missing your 
> point.

The point is that accidentally glomming non-base64 data into the decoder is
almost certainly *not* what you meant by 'see all of the content'. 

Particularly if the data wasn't ASCII - if it was a base64 of a zip file that
got a bunch of basically random bytes appended to the end, you're
going to have a *really* hard time figuring out why it was corrupted.

Feel free to give this a try:

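cd /tmp
echo ab > shortfile
gzip shortfile
# Append a line of plain text to the base64 output, then strip everything
# outside the base64 alphabet, as a lenient decoder would.
(base64 shortfile.gz; echo "I think that it is what I meant.  But I also suspect that I'm missing your point") |
    tr -dc 'A-Za-z0-9+/\n' | base64 -d > short2.gz
gunzip short2.gz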

At *best*, "see all the content" means you get handed a bunch of decoded bytes
that were never encoded, so you get random trash splatted out.  At worst, you
get difficult-to-diagnose data corruption.
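
A smaller, text-only variant of the same experiment may make the effect easier
to see.  This is only a sketch: lenient_decode_hex is a made-up helper, not
anything in nmh, and it assumes base64(1) accepts -d.  It takes "abc" (three
octets, so no '=' in the encoding), tacks on a footer, drops every character
outside the base64 alphabet, and dumps the decoded bytes as hex.

```shell
# Hypothetical helper: decode base64 text ($1) followed by a plain-text
# footer ($2) the way a decoder that skips non-alphabet characters would.
lenient_decode_hex() {
    { printf '%s\n' "$1"; printf '%s\n' "$2"; } |
        tr -dc 'A-Za-z0-9+/=' |
        base64 -d | od -An -tx1 | tr -d ' \n'
}

# "abc" encodes to YWJj with no padding, so nothing marks where it ends.
lenient_decode_hex "$(printf 'abc' | base64)" "see point"
echo
```

The output starts with 616263 ("abc") and is followed by six more bytes that
nobody ever encoded.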