[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[nongnu] elpa/hyperdrive e103791cbc 20/26: Merge: Fix text encoding issu
From: |
ELPA Syncer |
Subject: |
[nongnu] elpa/hyperdrive e103791cbc 20/26: Merge: Fix text encoding issues |
Date: |
Fri, 30 Aug 2024 16:00:12 -0400 (EDT) |
branch: elpa/hyperdrive
commit e103791cbcc455aece720d9b09b3eee0e62da2f2
Merge: 56c201e497 0702f40645
Author: Joseph Turner <joseph@ushin.org>
Commit: Joseph Turner <joseph@ushin.org>
Merge: Fix text encoding issues
---
CHANGELOG.org | 3 +++
doc/hyperdrive.org | 2 ++
doc/hyperdrive.texi | 2 ++
hyperdrive-dir.el | 2 ++
hyperdrive-lib.el | 40 ++++++++++++++++++++++++++++++-------
hyperdrive.el | 52 ++++++++++++++++++++++++++++++++----------------
tests/org links.org | 4 ++--
tests/test-hyperdrive.el | 2 +-
8 files changed, 80 insertions(+), 27 deletions(-)
diff --git a/CHANGELOG.org b/CHANGELOG.org
index b28eb92cd3..4144df07bf 100644
--- a/CHANGELOG.org
+++ b/CHANGELOG.org
@@ -65,6 +65,7 @@ space by "forgetting" your copy of a file
(~hyperdrive-forget-file~)!
** Fixed
+- Fix text encoding when reading and writing hyperdrive files.
- Fixed link targets/fragments for non-Org documents, like HTML.
- More reliably kill intermediate buffers when generating a diff.
- Avoid unnecessarily updating ~hyperdrive-version-ranges~.
@@ -78,12 +79,14 @@ space by "forgetting" your copy of a file
(~hyperdrive-forget-file~)!
- Improved reading of hyperdrive URLs.
- Improved visible buttons in ~hyperdrive-menu-bar-mode~.
- Avoid unnecessary prompt to save hyperdrive EWW buffers.
+- More accurately calculate file size when writing a hyperdrive file.
** Internal
- Refactor to accommodate
[[https://ushin.org/hyperdrive/hyperdrive-manual.html#Org_002dtransclusion-integration][hyperdrive-org-transclusion]].
- History-related functions now accept ~hyperdrive-entry~ structs.
- Update hyperdrive ~latest-version~ metadata when deleting directories.
+- Reduce memory usage in ~hyperdrive-write-buffer~.
- Added Org 9.7.6 as a dependency.
- Bumped package dependencies.
diff --git a/doc/hyperdrive.org b/doc/hyperdrive.org
index 5c0f92a716..7986be2752 100644
--- a/doc/hyperdrive.org
+++ b/doc/hyperdrive.org
@@ -1323,6 +1323,8 @@ world of p2p as well as the development of
~hyper-gateway~.
[[https://protesilaos.com][Protesilaos Stavrou]] for design input and
user-testing ~hyperdrive.el~.
+[[https://savannah.gnu.org/users/eliz][Eli Zaretskii]] for guidance about text
encoding systems.
+
[[https://emacs.ch/@yantar92][Ihor Radchenko]] for guidance about security and
Org mode integration.
[[https://karl-voit.at/][Karl Voit]] for his feedback which inspired the
design of ~hyperdrive-mirror~.
diff --git a/doc/hyperdrive.texi b/doc/hyperdrive.texi
index 546bc64297..a153101b45 100644
--- a/doc/hyperdrive.texi
+++ b/doc/hyperdrive.texi
@@ -1797,6 +1797,8 @@ world of p2p as well as the development of
@code{hyper-gateway}.
@uref{https://protesilaos.com, Protesilaos Stavrou} for design input and
user-testing @code{hyperdrive.el}.
+@uref{https://savannah.gnu.org/users/eliz, Eli Zaretskii} for guidance about
text encoding systems.
+
@uref{https://emacs.ch/@@yantar92, Ihor Radchenko} for guidance about security
and Org mode integration.
@uref{https://karl-voit.at/, Karl Voit} for his feedback which inspired the
design of @code{hyperdrive-mirror}.
diff --git a/hyperdrive-dir.el b/hyperdrive-dir.el
index f11458e722..99620803ca 100644
--- a/hyperdrive-dir.el
+++ b/hyperdrive-dir.el
@@ -388,8 +388,10 @@ see Info node `(elisp)Yanking Media'."
;; TODO: Extend this to other media types?
(cl-assert (and h/current-entry
(h//entry-directory-p h/current-entry)))
+ (cl-assert (not (multibyte-string-p image)))
(let ((entry (h/read-entry (h//context-hyperdrive :predicate #'h/writablep)
:latest-version t)))
+ (setf (he/size entry) (string-bytes image))
(he/api 'put entry :body image
;; TODO: Pass MIME type in a header? hyper-gateway detects it for us.
:then (lambda (_res) (h/open entry))
diff --git a/hyperdrive-lib.el b/hyperdrive-lib.el
index caa9a0f7ea..da9d318b29 100644
--- a/hyperdrive-lib.el
+++ b/hyperdrive-lib.el
@@ -64,8 +64,8 @@ Passes ARGS to `format-message'."
;; the name, we store the name as-is because, for one thing, the name
;; could theoretically contain a slash, and `file-name-nondirectory'
;; would return the wrong value in that case.
- (name nil :documentation "Decoded filename of entry (excluding leading
slash).")
- (path nil :documentation "Decoded path (including leading slash).")
+ (name nil :documentation "URL-decoded filename of entry without leading
slash.")
+ (path nil :documentation "URL-decoded path with leading slash.")
(headers nil :documentation "HTTP headers from request.")
(mtime nil :documentation "Last modified time.")
(size nil :documentation "Size of file.")
@@ -238,7 +238,18 @@ metadata from the response."
Sets ENTRY's hyperdrive to the persisted version of the drive if
it exists. Persists ENTRY's hyperdrive. Invalidates ENTRY display."
(pcase-let*
- (((map link allow content-length content-type last-modified x-drive-size
+ ((encoding
+ ;; TODO: After the resolution of <https://todo.sr.ht/~ushin/ushin/211>,
+ ;; use the encoding specified in the 'Content-Type' header. For now,
we
+ ;; rely on the guesswork of `detect-coding-region'.
+ (if-let ((filename-encoding (auto-coding-alist-lookup (he/name
entry))))
+ filename-encoding
+ (pcase (detect-coding-string (plz-response-body response) t)
+ ((or 'undecided 'undecided-dos 'undecided-mac 'undecided-unix)
+ ;; Default to UTF-8
+ 'utf-8)
+ (detected-encoding detected-encoding))))
+ ((map link allow content-length content-type last-modified x-drive-size
x-drive-version x-file-block-length
x-file-block-length-downloaded)
(plz-response-headers response))
;; RESPONSE is guaranteed to have a "Link" header with the public key,
@@ -247,6 +258,16 @@ it exists. Persists ENTRY's hyperdrive. Invalidates
ENTRY display."
(match-string 1 link)))
;; NOTE: Don't destructure `persisted-hyperdrive' with `pcase' here
since it may be nil.
(persisted-hyperdrive (gethash public-key h/hyperdrives)))
+ ;; Decode response body.
+ (unless (eq 'no-conversion encoding)
+ (cl-callf decode-coding-string (plz-response-body response) encoding))
+ ;; TODO: Once we can get hyperdrive file contents as a buffer
+ ;; (<https://github.com/alphapapa/plz.el/issues/61>), we should use
+ ;; `decode-coding-region' instead of `decode-coding-string'.
+ ;; `decode-coding-region' will set the buffer-local value of
+ ;; `buffer-file-coding-system' to the correct encoding. Currently,
+ ;; hyperdrive file buffers always have `buffer-file-coding-system' to the
+ ;; global default, `utf-8' on my machine.
(when persisted-hyperdrive
;; ENTRY's hyperdrive already persisted: merge domains into persisted
@@ -281,6 +302,10 @@ it exists. Persists ENTRY's hyperdrive. Invalidates
ENTRY display."
(setf (he/size entry)
(ignore-errors (cl-parse-integer content-length))))
(when content-type
+ ;; FIXME: `content-type' for 'text/plain' always has 'charset=utf-8',
+ ;; which may not be correct. Since charset in `hyperdrive-entry-type' is
+ ;; not used anywhere, this should not result in any bugs. This FIXME can
+ ;; be removed when <https://todo.sr.ht/~ushin/ushin/211> is resolved.
(setf (he/type entry) content-type))
(when last-modified
(setf (he/mtime entry) (encode-time (parse-time-string last-modified))))
@@ -1021,11 +1046,10 @@ Call ELSE if request fails."
(cl-defun h/write (entry &key body then else queue)
"Write BODY to hyperdrive ENTRY's URL.
-THEN and ELSE are passed to `hyperdrive-entry-api', which see."
+BODY should be an encoded string or buffer. THEN and ELSE are
+passed to `hyperdrive-entry-api', which see."
(declare (indent defun))
(he/api 'put entry
- ;; TODO: plz accepts buffer as a body, we should refactor calls to h/write
- ;; to pass in a buffer instead of a buffer-string.
:body body :then then :else else :queue queue))
(cl-defun h//format-entry-url
@@ -1327,7 +1351,9 @@ DEFAULT and INITIAL-INPUT are passed to `read-string'
as-is."
(declare (indent defun))
(let ((entry (he/create :hyperdrive hyperdrive
:path "/.well-known/host-meta.json")))
- (h/write entry :body (json-encode (h/metadata hyperdrive))
+ ;; TODO: Is it okay to always encode the JSON object as UTF-8?
+ (h/write entry :body (encode-coding-string
+ (json-encode (h/metadata hyperdrive)) 'utf-8)
:then then)
hyperdrive))
diff --git a/hyperdrive.el b/hyperdrive.el
index f05cf54108..76552b2d45 100644
--- a/hyperdrive.el
+++ b/hyperdrive.el
@@ -490,8 +490,9 @@ overwrite.
With universal prefix argument \\[universal-argument], overwrite
without prompting.
-This function is for interactive use only; for non-interactive
-use, see `hyperdrive-write'."
+This function is for interactive use only since it calls
+`select-safe-coding-system', which may prompt for input.
+For non-interactive use, see `hyperdrive-write'."
(interactive (list (h/read-entry (h/read-hyperdrive #'h/writablep)
:default-path
(or (and (buffer-file-name)
@@ -501,10 +502,22 @@ use, see `hyperdrive-write'."
(he/path h/current-entry)))
:latest-version t)
current-prefix-arg))
- (pcase-let (((cl-struct hyperdrive-entry hyperdrive name) entry)
- (url (he/url entry))
- (buffer (current-buffer))
- (buffer-visiting-entry (h//find-buffer-visiting entry)))
+ (pcase-let* (((cl-struct hyperdrive-entry hyperdrive name) entry)
+ (url (he/url entry))
+ (orig-buffer (current-buffer))
+ (coding-system
+ (with-current-buffer orig-buffer
+ ;; Detect coding in orig buffer so
`buffer-file-coding-system'
+ ;; is prioritized.
+ (select-safe-coding-system (point-min) (point-max))))
+ (encoded-buffer
+ (with-current-buffer
+ (generate-new-buffer
+ (format " *hyperdrive-encoded %s*" orig-buffer) t)
+ (insert-buffer-substring orig-buffer)
+ (encode-coding-region (point-min) (point-max) coding-system)
+ (current-buffer)))
+ (buffer-visiting-entry (h//find-buffer-visiting entry)))
(unless (or overwritep (not (he/at nil entry)))
(unless (y-or-n-p
(format "File %s exists; overwrite?" (h//format-entry entry)))
@@ -514,11 +527,10 @@ use, see `hyperdrive-write'."
(h//format-entry entry)))
(h/user-error "Aborted"))))
(h/write entry
- :body (without-restriction
- (buffer-substring-no-properties (point-min) (point-max)))
+ :body encoded-buffer
:then (lambda (response)
- (when (buffer-live-p buffer)
- (with-current-buffer buffer
+ (when (buffer-live-p orig-buffer)
+ (with-current-buffer orig-buffer
(unless h/mode
(h//clean-buffer)
(when (eq 'unknown (h/safe-p hyperdrive))
@@ -531,15 +543,16 @@ use, see `hyperdrive-write'."
(he//fill entry (plz-response-headers response))
;; PUT responses only include ETag and Last-Modified
;; headers, so we need to set other entry metadata manually.
- ;; FIXME: For large buffers, `buffer-size' returns a
different
- ;; value than the gateway's Content-Length header.
- (setf (he/size entry) (buffer-size))
+ (setf (he/size entry) (with-current-buffer encoded-buffer
+ (bufferpos-to-filepos (point-max))))
;; FIXME: Will entry type ever be anything besides
text/plain?
;; /.well-known/host-meta.json ?
- (setf (he/type entry) "text/plain; charset=utf-8")
+ (setf (he/type entry)
+ (format "text/plain; charset=%s"
+ (coding-system-base coding-system)))
(setq-local h/current-entry entry)
(setf buffer-file-name nil)
- (unless (eq buffer buffer-visiting-entry)
+ (unless (eq orig-buffer buffer-visiting-entry)
(when (buffer-live-p buffer-visiting-entry)
(kill-buffer buffer-visiting-entry))
(rename-buffer (h//generate-new-buffer-name entry)))
@@ -550,9 +563,13 @@ use, see `hyperdrive-write'."
;; and lets us avoid making another request for
;; metadata.
(set-visited-file-modtime (current-time))))
- (h/message "Wrote: %S to \"%s\"" name url))
+ (h/message "Wrote: %S to \"%s\"" name url)
+ (when (buffer-live-p encoded-buffer)
+ (kill-buffer encoded-buffer)))
:else (lambda (plz-error)
- (h/message "Unable to write: %S: %S" name plz-error)))
+ (h/message "Unable to write: %S: %S" name plz-error)
+ (when (buffer-live-p encoded-buffer)
+ (kill-buffer encoded-buffer))))
(h/message "Saving to \"%s\"..." url)
;; TODO: Reload relevant hyperdrive-dir buffers after writing buffer (if
ewoc buffers display version, then possibly all ewoc buffers for a given
hyperdrive should be reloaded)
))
@@ -723,6 +740,7 @@ After successful upload, call THEN. When QUEUE, use it."
;; "%a, %-d %b %Y %T %Z"
(file-attribute-modification-time
(file-attributes filename)) t))))
+ (setf (he/size entry) (file-attribute-size (file-attributes filename)))
(he/api 'put entry :queue queue
:body `(file ,filename)
:headers `(("Last-Modified" . ,last-modified))
diff --git a/tests/org links.org b/tests/org links.org
index ce606a60e9..10d8bddb41 100644
--- a/tests/org links.org
+++ b/tests/org links.org
@@ -9,7 +9,7 @@
+ *Notes*
+ This file should be loaded from a hyperdrive in order to verify that these
links work correctly for an Org file loaded from a hyperdrive.
- + To decode these links with Org, use ~(org-element-context)~ with point on
the link. The resulting element/plist is used in function
~hyperdrive--org-open-at-point~.
+ + To URL-decode these links with Org, use ~(org-element-context)~ with point
on the link. The resulting element/plist is used in function
~hyperdrive--org-open-at-point~.
** Link types
@@ -54,7 +54,7 @@ Here are various link types which we want to test.
+ *Notes:*
- + Spaces and such are required to NOT be encoded :: Spaces in bracketed,
Org-style links must /not/ have spaces, etc. URL-encoded.
+ + Spaces and such are required to NOT be URL-encoded :: Spaces in bracketed,
Org-style links must /not/ have spaces, etc. URL-encoded.
+ We do this so that existing repositories of Org files can be mirrored to
a hyperdrive as-is and the relative links between them will still work.
**** Links without targets
diff --git a/tests/test-hyperdrive.el b/tests/test-hyperdrive.el
index 77e1d96c79..c055683f12 100644
--- a/tests/test-hyperdrive.el
+++ b/tests/test-hyperdrive.el
@@ -25,7 +25,7 @@
;; TODO: Consider having a "round-trip" test that makes a new
;; hyperdrive, writes files of various names, then reads the
;; hyperdrive back and ensures that the entries names and paths are
-;; correctly encoded/decoded.
+;; correctly URL-encoded/URL-decoded.
;;; Code:
- [nongnu] elpa/hyperdrive updated (56c201e497 -> c05904f585), ELPA Syncer, 2024/08/30
- [nongnu] elpa/hyperdrive e4fecbf3ea 05/26: Meta: Update changelog, ELPA Syncer, 2024/08/30
- [nongnu] elpa/hyperdrive be372fcd82 09/26: Fix: (h/write-buffer) More accurately calculate file size, ELPA Syncer, 2024/08/30
- [nongnu] elpa/hyperdrive 9dfcc25a0a 18/26: Add: (h/dir--yank-media-image-handler) String multibyte assertion, ELPA Syncer, 2024/08/30
- [nongnu] elpa/hyperdrive 23cb37756b 04/26: Fix: (hyperdrive-write-buffer) Encode buffer contents when writing, ELPA Syncer, 2024/08/30
- [nongnu] elpa/hyperdrive e103791cbc 20/26: Merge: Fix text encoding issues,
ELPA Syncer <=
- [nongnu] elpa/hyperdrive 6a582d3cff 24/26: Docs: Note about non-UTF-8 encoded text, ELPA Syncer, 2024/08/30
- [nongnu] elpa/hyperdrive bccee74f7a 02/26: Meta: Update changelog, ELPA Syncer, 2024/08/30
- [nongnu] elpa/hyperdrive 740c21afdb 01/26: Change: (hyperdrive-write-buffer) Don't allocate unnecessary string, ELPA Syncer, 2024/08/30
- [nongnu] elpa/hyperdrive 8187adc021 07/26: Change: Distinguish URL-encoding from text-encoding in docstrings, ELPA Syncer, 2024/08/30
- [nongnu] elpa/hyperdrive 7a1d35cfb4 10/26: Fix: (h/upload-file) Calculate file size, ELPA Syncer, 2024/08/30
- [nongnu] elpa/hyperdrive 7db0f02d6e 08/26: Fix: (he//api-then) Decode response bodies, ELPA Syncer, 2024/08/30
- [nongnu] elpa/hyperdrive c4fe98415b 13/26: Meta: Acknowledge Eli Zaretskii, ELPA Syncer, 2024/08/30
- [nongnu] elpa/hyperdrive 0ee751c591 14/26: Fix: (h/write-buffer) Fix charset in entry type, ELPA Syncer, 2024/08/30
- [nongnu] elpa/hyperdrive 4ef6496965 11/26: Fix: (h/dir--yank-media-image-handler) Calculate file size, ELPA Syncer, 2024/08/30
- [nongnu] elpa/hyperdrive 6874248ed5 16/26: Fix: (hyperdrive-write-buffer) Clean up encoded buffer, ELPA Syncer, 2024/08/30