emacs-devel

Re: libnettle/libhogweed WIP


From: Lars Ingebrigtsen
Subject: Re: libnettle/libhogweed WIP
Date: Fri, 21 Apr 2017 20:45:58 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.50 (gnu/linux)

Ted Zlatanov <address@hidden> writes:

> The KEY is secret and ideally would come from a file and never be
> seen at the Lisp level. But tests and other use cases may need it from a
> buffer (more secure but still accessible to Lisp) or a string (visible
> to all as a function parameter).

Hm...  Having a file that just has a passphrase in it sounds like an
unusual use case.  I think in Emacs these tokens would normally come
from auth-source in most applications.  At least that's what I see when I
salivate at use cases.  :-)
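
A minimal sketch of the auth-source route, assuming the key is stored as
an ordinary auth-source entry.  The names `my/secret-from-entry' and
`my/lookup-key' are made up for illustration; `auth-source-search' and
its :host, :user and :max keywords are the existing API.

```elisp
(require 'auth-source)

(defun my/secret-from-entry (entry)
  "Return the secret stored in auth-source ENTRY, or nil.
auth-source may hand back the secret wrapped in a function so that
it does not show up when the entry is printed; call it in that case."
  (let ((secret (plist-get entry :secret)))
    (if (functionp secret)
        (funcall secret)
      secret)))

(defun my/lookup-key (host user)
  "Return the secret for USER at HOST from auth-source, or nil.
HOST and USER are whatever the application stored the key under."
  (let ((entry (car (auth-source-search :host host :user user :max 1))))
    (and entry (my/secret-from-entry entry))))
```

The secret still ends up as a Lisp string once the function is called, so
this only covers where the key comes from, not how it is protected
afterwards.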

> Getting the INPUT from a file enables large files (not in the first
> version probably) and other interesting use cases.

Emacs buffers are surprisingly efficient at handling large files:
They're basically just (sort of) contiguous areas of memory with some
structs describing their contents.  Here's how long it takes this
machine to put a 4GB .iso file into a buffer (and then kill Emacs):

address@hidden ~]$ time emacs -batch --eval "(with-temp-buffer
  (set-buffer-multibyte nil)
  (let ((coding-system-for-read 'binary))
    (insert-file-contents \"~/Downloads/debian-8.6.0-amd64-DVD-1.iso\")
    (message \"%s\" (buffer-size))))"
3994091520

real    0m1.008s
user    0m0.012s
sys     0m0.988s

To compare, this is how long it takes this machine to just output it all
to /dev/null:

address@hidden ~]$ time cat ~/Downloads/debian-8.6.0-amd64-DVD-1.iso > /dev/null
 
real    0m0.294s
user    0m0.000s
sys     0m0.292s

So the Emacs primitives are definitely competitive in the "read a huge
file" stakes.  I don't think asking Emacs to encrypt a 4GB file will be a
very common use case, but it's doable without creating special handling.
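
For reference, the same read written as a reusable function rather than a
one-off --eval string.  This is only a sketch; the name
`my/file-size-via-buffer' is hypothetical.

```elisp
(defun my/file-size-via-buffer (file)
  "Return the number of bytes in FILE after reading it into a buffer.
The temporary buffer is unibyte and the file is read with the
`binary' coding system, so no decoding takes place."
  (with-temp-buffer
    (set-buffer-multibyte nil)
    (let ((coding-system-for-read 'binary))
      (insert-file-contents file)
      (buffer-size))))
```

Wrapping the call in `benchmark-run' would give an in-Emacs timing to set
against the shell `time' figures.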

If I understand the code correctly (and I may definitely not be doing
that; I've just skimmed it very, very briefly), you may be able to point
the encryption code at the Emacs buffer contents directly without
copying it anywhere beforehand, and then (since the results are usually
of very similar length) back to the same Emacs buffer afterwards.

4GB Emacs buffer -> encrypted to 4GB GnuTLS buffer -> 4GB Emacs buffer

instead of

4GB Emacs buffer -> copy to 4GB gnutls.c buffer -> encrypted to 4GB
GnuTLS buffer -> made into Emacs string or something

so you save at least one 4GB buffer by just taking the data directly
from the buffer and putting it back in the same place.  (So 8GB total
memory footprint instead of 12GB or even possibly 16GB in the current code.)

> LI> In any case, the `file' case you're discussing here doesn't really feel
> LI> that useful, but also makes things more complicated.  If the user wants
> LI> to encrypt a file, then it's more flexible to just have the caller
> LI> insert the file into a buffer and call the function as normal
>
> Absolutely. It would be nice if the Emacs C core had "readers" like Java
> or Go because then this discussion would be really simple: "did you use
> a reader" - "yes" - "good" :)

I guess what I'm saying is that Emacs has readers, and we call those
"Emacs buffers".  :-)
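
As an illustration of a primitive that already takes a buffer as its
input, here is a sketch using the existing `secure-hash', which accepts
either a string or a buffer object.  The wrapper name `my/sha256-of-file'
is made up, and nothing here says how `secure-hash' handles the data
internally.

```elisp
(defun my/sha256-of-file (file)
  "Return the SHA-256 hex digest of the raw bytes of FILE.
The file is read into a temporary unibyte buffer and the buffer
itself, not a string copy made in Lisp, is handed to `secure-hash'."
  (with-temp-buffer
    (set-buffer-multibyte nil)
    (let ((coding-system-for-read 'binary))
      (insert-file-contents file))
    (secure-hash 'sha256 (current-buffer))))
```

An encryption primitive could follow the same calling convention: the
caller fills a buffer however it likes and passes the buffer along.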

The other problem with having a special file handler in the GnuTLS code
is that users will expect to be able to encrypt all files that they see
visible from Emacs, including the ones from Tramp, and application
writers will also have differing opinions on whether encrypting a .gz
file means encrypting the contents of the file or the file itself.  That
is, Emacs has a very rich file handler jungle, and it would be nice if it
still worked when you ask Emacs to encrypt something.

You'd have to handle

(file "~/foo")
(file "c:/foo/bar")
(file "Héllo") ; in iso-8859-1
(file "/ssh:host:/tmp/foo")

both as input and output specifiers if you never want the file contents
to hit Elisp Land...

It all sounds a bit daunting.  To me, at least.  :-)
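
To get a feel for the jungle, this sketch asks Emacs which handler, if
any, would take over `insert-file-contents' for a given name.
`find-file-name-handler' is the existing function; the helper name
`my/describe-file-handlers' is hypothetical, and which handlers show up
depends on what is enabled in the running Emacs.

```elisp
(defun my/describe-file-handlers (names)
  "Return an alist mapping each of NAMES to its `insert-file-contents' handler.
A nil handler means Emacs reads that file with the built-in primitive;
anything else is a Lisp function that intercepts the operation."
  (mapcar (lambda (name)
            (cons name
                  (find-file-name-handler name 'insert-file-contents)))
          names))

;; Example call (results vary with the Emacs configuration):
;; (my/describe-file-handlers
;;  '("/tmp/foo" "/tmp/foo.gz" "/ssh:host:/tmp/foo"))
```

A file reader living inside the GnuTLS code would bypass every non-nil
entry in that list.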

Instead we have most of the primitives we need for safe handling of
secrets in Emacs already; a few more should be added.  But I think this
pattern for handling secret files could be tweaked and macroised after
some code review:

(with-temp-buffer
  (set-buffer-multibyte nil)
  (let ((coding-system-for-read 'binary)
        (coding-system-for-write 'binary))
    (unwind-protect
        (progn
          (insert-file-contents "My DVD.iso")
          (gnutls-encrypt ... ... (current-buffer))
          (write-region ...))
      (clear-buffer (current-buffer))))) ;; New function that runs memset
                                         ;; over the buffer area
    
Or something.  We have to look at what buffers write-region creates and
stuff, but in the 'binary case, I don't think it creates copies of the
Emacs buffer anywhere.  Of course, if these files are read and written
via Tramp or a complex file handler, we can't guarantee that those don't
leave a buffer anywhere, but...
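
One way the macroised version might look, as a rough sketch only.  The
macro name `my/with-binary-file-buffer' is made up, and since the
memset-style `clear-buffer' does not exist yet, `erase-buffer' stands in
for it here; that only empties the buffer and does NOT overwrite the
underlying memory.

```elisp
(defmacro my/with-binary-file-buffer (file &rest body)
  "Insert FILE as raw bytes into a temporary unibyte buffer and run BODY.
The value of the last form in BODY is returned.  The buffer is
emptied afterwards even if BODY signals an error."
  (declare (indent 1) (debug t))
  `(with-temp-buffer
     (set-buffer-multibyte nil)
     (let ((coding-system-for-read 'binary)
           (coding-system-for-write 'binary))
       (unwind-protect
           (progn
             (insert-file-contents ,file)
             ,@body)
         ;; Stand-in for the proposed `clear-buffer': this does not
         ;; zero the memory, it only removes the text.
         (erase-buffer)))))
```

The encrypt and write-region steps would go in BODY, so the caller never
has to remember the cleanup.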

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


