Re: [Duplicity-talk] UnicodeEncodeError during backup and restore


From: Michael Terry
Subject: Re: [Duplicity-talk] UnicodeEncodeError during backup and restore
Date: Mon, 23 Dec 2013 21:54:30 -0500

Hello!  Yes, there are some known problems with how duplicity handles filenames.  It doesn't always handle utf8 well (as you've found), and it certainly doesn't handle non-utf8 filenames well.

The patch you point to is a partial fix, but it doesn't handle all the cases I believe are lurking in the code.  I'm going to work on expanding the patch at the end of the year.

-mt


On 18 December 2013 09:58, Milan Bouchet-Valat <address@hidden> wrote:
Hi!

I'd like to beg some attention for a terrible bug that has plagued my
usage of duplicity for more than a year. Basically, I always need to
apply a patch to be able to back up and restore. If this also happens to
users who really need their backups, it may have dramatic
consequences.

This bug manifests itself as a few encoding-related errors, like:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in
position 16: ordinal not in range(128)
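For readers who have not met this message before, here is a minimal
sketch of how the 'ascii' codec produces it. It is not duplicity's
code path; the path name is hypothetical, chosen so that the accented
character sits at position 16, and the helper name is mine. It is
written for Python 3, where the text is the same apart from the u prefix.

```python
def ascii_encode_error(name):
    """Return the UnicodeEncodeError raised by encoding `name` as ASCII,
    or None if the name is pure ASCII. Hypothetical helper for illustration."""
    try:
        name.encode("ascii")
    except UnicodeEncodeError as exc:
        return exc
    return None


# Hypothetical filename: valid text, but not representable in ASCII.
filename = u"/home/user/docs/\xe9.txt"

err = ascii_encode_error(filename)
if err is not None:
    # err.start is the index of the first character the codec rejected.
    print(err.encoding, err.start, err.reason)
```

The filename itself is perfectly valid; the failure comes only from
forcing it through the ASCII codec.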

The reference of the bug report is:
https://bugs.launchpad.net/ubuntu/+source/duplicity/+bug/989496

Debian has applied a hack recently to work around this:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=682837


Here's a copy of the explanation I provided back in August, with the
patch I'm using available from here:
https://bugs.launchpad.net/ubuntu/+source/duplicity/+bug/989496/+attachment/3764630/+files/duplicity.patch

The bug does not happen only with invalid UTF-8 filenames; all you
need are UTF-8 filenames and a UTF-8 locale.

For example, in collections.py:810, there is:
log.Debug(_("File %s is not part of a known set; creating new set") % (filename,))

On my system, when this fails (see the error above), the _() string is a
str object encoded in UTF-8, while filename is a unicode object. The
error happens while Python encodes filename into an ASCII str object. If
the _() string is a unicode object too, no encoding into a str object
happens at this stage, and everything works. This can be achieved by
setting gettext up differently in __init__.py, by passing unicode=True
to gettext.install(). This is the solution recommended by the author of
gettext for Python:
http://www.wefearchange.org/2012/06/the-right-way-to-internationalize-your.html
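As a rough sketch of that setup (not the actual patch, which is at the
link above): the wrapper name below is hypothetical, and I am assuming
the translation domain is "duplicity". The unicode=True argument only
exists in Python 2's gettext.install(); on Python 3, _() always returns
text strings, so the argument is dropped there.

```python
import gettext
import sys


def install_unicode_gettext(domain, localedir=None):
    """Install _() in builtins so that it returns unicode text.

    Hypothetical wrapper for illustration; not duplicity's own code.
    """
    if sys.version_info[0] < 3:
        # Python 2: ask for ugettext() so _() returns unicode objects.
        gettext.install(domain, localedir, unicode=True)
    else:
        # Python 3: _() already returns str (text); no flag needed.
        gettext.install(domain, localedir)


install_unicode_gettext("duplicity")  # domain name is an assumption

# With both sides of the % being text, no implicit ASCII step is involved.
filename = u"caf\xe9.txt"  # hypothetical filename
message = _("File %s is not part of a known set; creating new set") % (filename,)
```

If no message catalog is found, gettext.install() falls back to
returning the untranslated string, so the sketch also runs on a system
without duplicity's translations installed.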

The patch still applies today to Duplicity 0.6.21.


Thanks for your time!
