From: | Vasudev Kamath |
Subject: | [silpa-discuss] [bug #29678] Word loss during reading of english dictionary |
Date: | Mon, 26 Apr 2010 16:46:05 +0000 |
User-agent: | Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100414 Iceweasel/3.5.9 (like Firefox/3.5.9) |
URL:
  <http://savannah.nongnu.org/bugs/?29678>

                 Summary: Word loss during reading of english dictionary
                 Project: Silpa Indian Language Processing Project
            Submitted by: vasudev
            Submitted on: Mon 26 Apr 2010 04:46:04 PM GMT
                Category: None
                Severity: 2 - Minor
              Item Group: None
                  Status: Confirmed
                 Privacy: Public
             Assigned to: vasudev
             Open/Closed: Open
         Discussion Lock: Any

    _______________________________________________________

Details:

In the spell checker module, reading dictionaries using codecs.open with
utf-8 encoding causes word loss for the en_US.dic file. This is essentially
because en_US.dic is not a UTF-8 text file; it is an ISO-8859 text file.

I'm attaching a patch file, which is the difference between the original
dictionary and the file produced by reading the dictionary with utf-8
encoding and saving it via a Python script.

    _______________________________________________________

File Attachments:

-------------------------------------------------------
Date: Mon 26 Apr 2010 04:46:04 PM GMT  Name: difference_dic.patch  Size: 7kB
By: vasudev
Difference between original dictionary and dictionary saved via python script
<http://savannah.nongnu.org/bugs/download.php?file_id=20337>

    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?29678>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/
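
The encoding mismatch described under Details can be reproduced with a small
sketch. This is not the Silpa spell checker code: the sample words, the helper
names and the temporary file are all hypothetical, and the built-in open() is
used here in place of codecs.open (both accept encoding and errors arguments).
It also assumes the loss shows up either as a decoding error or, when errors
are ignored, as silently dropped characters; the report does not say which.

```python
import os
import tempfile

# Hypothetical sample entries; the real en_US.dic contents are not reproduced here.
SAMPLE_WORDS = ["cafe", "caf\u00e9", "na\u00efve", "word"]


def write_latin1_dictionary(path, words):
    """Write one word per line, encoded as ISO-8859-1 (like en_US.dic in the report)."""
    with open(path, "w", encoding="iso-8859-1") as handle:
        handle.write("\n".join(words) + "\n")


def read_dictionary(path, encoding, errors="strict"):
    """Read one word per line using the given text encoding."""
    with open(path, "r", encoding=encoding, errors=errors) as handle:
        return [line.strip() for line in handle if line.strip()]


def demo():
    fd, path = tempfile.mkstemp(suffix=".dic")
    os.close(fd)
    try:
        write_latin1_dictionary(path, SAMPLE_WORDS)

        # Strict UTF-8 decoding rejects the ISO-8859-1 bytes for the accented letters.
        try:
            read_dictionary(path, "utf-8")
            strict_failed = False
        except UnicodeDecodeError:
            strict_failed = True

        # Ignoring errors hides the failure but drops the undecodable bytes.
        lossy = read_dictionary(path, "utf-8", errors="ignore")

        # Reading with the file's actual encoding keeps every word intact.
        correct = read_dictionary(path, "iso-8859-1")
    finally:
        os.remove(path)
    return strict_failed, lossy, correct


if __name__ == "__main__":
    strict_failed, lossy, correct = demo()
    print("strict utf-8 read failed:", strict_failed)
    print("utf-8 with errors ignored:", lossy)
    print("iso-8859-1:", correct)
```

Opening the dictionary with the encoding it was actually saved in is the
simplest remedy under these assumptions; converting the file to UTF-8 once and
keeping the utf-8 reader would be another option.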