[Pan-devel] [PATCH] 8 bit characters in header
From: Sam Solon
Subject: [Pan-devel] [PATCH] 8 bit characters in header
Date: 20 Jul 2002 20:27:01 -0400

Although the proper answer is "they're wrong according to RFC 977", there
seem to be a number of postings that use 8-bit characters in the header
-- particularly for the subject. This seems most common in binary
newsgroups and is probably an attempt to disguise a copyright violation.

Since Pan uses the current locale to convert to UTF-8, the conversion
can fail, leaving the subject blank. At least, that's what it does on
my system, with the default "C" locale.

I think it's better to get something rather than nothing, so I propose
the following patch.

If the conversion using the default locale fails, it is tried again with
"ISO-8859-1" explicitly specified (maybe there's something better?). If
that fails too, the beginning of the string up to the conversion failure
point is used.

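The fallback order described above can be sketched in plain C without
GLib. This is only an illustration, not the patch itself:
ascii_to_utf8, latin1_to_utf8 and header_to_utf8 are hypothetical names,
and the strict ASCII step merely stands in for a locale conversion under
the "C" locale. Because a byte-wise ISO-8859-1 mapping accepts every byte
value, the third step (keeping the prefix up to the failure point) never
triggers in this sketch and is left out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a "C" locale conversion: accepts only 7-bit
 * ASCII. Returns a malloc'd copy, or NULL at the first byte >= 0x80. */
static char *ascii_to_utf8(const char *str, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if ((unsigned char)str[i] >= 0x80)
            return NULL;

    char *out = malloc(len + 1);
    if (!out)
        return NULL;
    memcpy(out, str, len);
    out[len] = '\0';
    return out;
}

/* Hypothetical helper: ISO-8859-1 to UTF-8. Each input byte is the code
 * point U+0000..U+00FF, so the only possible failure is allocation. */
static char *latin1_to_utf8(const char *str, size_t len)
{
    char *out = malloc(2 * len + 1);   /* worst case: 2 bytes per input byte */
    if (!out)
        return NULL;

    size_t o = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)str[i];
        if (c < 0x80) {
            out[o++] = (char)c;                     /* ASCII passes through */
        } else {
            out[o++] = (char)(0xC0 | (c >> 6));     /* lead byte */
            out[o++] = (char)(0x80 | (c & 0x3F));   /* continuation byte */
        }
    }
    out[o] = '\0';
    return out;
}

/* Try the strict conversion first, then fall back to ISO-8859-1.
 * The caller frees the result. */
static char *header_to_utf8(const char *str, size_t len)
{
    char *out = ascii_to_utf8(str, len);
    if (!out)
        out = latin1_to_utf8(str, len);
    return out;
}
```

With this sketch, the four bytes "caf\xE9" come back as "caf\xC3\xA9"
rather than as an empty subject.
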
I find it disconcerting to have lots of blank subject lines in the
article-list -- not that *I* would ever download a file that violates a
copyright. ;-)

Index: pan-glib-extensions.c
===================================================================
RCS file: /cvs/gnome/pan/pan/base/pan-glib-extensions.c,v
retrieving revision 1.25
diff -u -u -r1.25 pan-glib-extensions.c
--- pan-glib-extensions.c 23 Jun 2002 11:28:11 -0000 1.25
+++ pan-glib-extensions.c 21 Jul 2002 00:14:01 -0000
@@ -855,7 +855,25 @@
                 gssize len,
                 char ** g_freeme)
 {
-        return pan_g_convert_to_utf8 (str, g_freeme, len, NULL, NULL, NULL);
+        const char * retval
+                = pan_g_convert_to_utf8 (str, g_freeme, len, NULL, NULL, NULL);
+
+        if (!retval) {
+                gsize bytes_read;
+                gsize bytes_written;
+
+                retval = *g_freeme = g_convert(str,
+                                               len,
+                                               "UTF-8",
+                                               "ISO-8859-1",
+                                               &bytes_read,
+                                               &bytes_written,
+                                               NULL);
+                if (!retval)
+                        retval = *g_freeme = g_strndup(str, bytes_read);
+        }
+
+        return retval;
 }

 const char*