[bug-gnu-libiconv] iconv fails on large Greek files
From: W. Wesley Groleau (伟思礼)
Subject: [bug-gnu-libiconv] iconv fails on large Greek files
Date: Sat, 1 Oct 2022 12:36:08 -0700
In a large file of Greek text with no known decomposed characters, iconv fails
when it is used to compose any decomposed letters (“just in case”), that is,
when converting from UTF8-MAC to UTF-8.
It does not fail in the reverse direction (or at least it didn’t this time). The
failure usually occurs after processing approximately 4000 bytes, but
occasionally after approximately 8000. If only the line named in the error
message and a few lines before and after it are processed, it doesn’t fail. I
thought it might be related to a buffer size, but the exact number of bytes
varies. Also, if the source file is NOT changed, the failure position varies
(but it is always close to 4000 or 8000).
The failure also occurs when the file does have known decomposed characters.
WGroleau@MBP ~ % iconv --version
iconv (GNU libiconv 1.16)
WGroleau@MBP ~ % uname -a
Darwin MBP.local 21.6.0 Darwin Kernel Version 21.6.0: Mon Aug 22 20:17:10 PDT 2022; root:xnu-8020.140.49~2/RELEASE_X86_64 x86_64
WGroleau@MBP el % wc el.txt
179 975 8621 el.txt
WGroleau@MBP el % iconv -f UTF8-MAC -t UTF-8 el.txt > /tmp/tmp
iconv: el.txt:90:16: cannot convert
WGroleau@MBP el % wc /tmp/tmp
89 457 4093 /tmp/tmp
WGroleau@MBP el % iconv -f UTF-8 -t UTF8-MAC el.txt > /tmp/tmp
WGroleau@MBP el % wc /tmp/tmp
179 1029 9537 /tmp/tmp
WGroleau@MBP el % iconv -f UTF8-MAC -t UTF-8 el.txt > /tmp/tmp
iconv: el.txt:90:16: cannot convert
WGroleau@MBP el % wc el.txt
179 975 8621 el.txt
WGroleau@MBP el % tail -$((179-90+2)) el.txt > el+.txt
WGroleau@MBP el % wc el+.txt
90 522 4558 el+.txt
WGroleau@MBP el % iconv -f UTF8-MAC -t UTF-8 el+.txt > /tmp/tmp
iconv: el+.txt:84:36: cannot convert
WGroleau@MBP el % wc /tmp/tmp
83 469 4093 /tmp/tmp
WGroleau@MBP el % iconv -f UTF-8 -t UTF8-MAC el.txt > /tmp/tmp
WGroleau@MBP el % iconv -f UTF8-MAC -t UTF-8 /tmp/tmp > temp.txt
iconv: /tmp/tmp:161:7: cannot convert
WGroleau@MBP el % wc temp.txt
160 835 7390 temp.txt
WGroleau@MBP el % wc /tmp/tmp
179 1029 9537 /tmp/tmp
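
One way to check the buffer-size suspicion is to turn the line number from the
error message into a byte offset and see how close it falls to a multiple of a
candidate buffer size. The following is only a sketch, assuming a POSIX shell;
line_start_offset is a hypothetical helper name (not part of iconv), and 4096
is a guess at the buffer size, not a value confirmed from the libiconv source.

```shell
# Hypothetical helper: print the byte offset at which line $2 of file $1 starts.
line_start_offset() {
    file=$1
    line=$2
    if [ "$line" -le 1 ]; then
        # Line 1 always starts at offset 0; this also avoids "head -n 0",
        # which some head implementations reject.
        echo 0
    else
        # Count the bytes in all lines before the requested one.
        # tr strips the leading spaces that some wc implementations print.
        head -n "$((line - 1))" "$file" | wc -c | tr -d ' '
    fi
}

# Hypothetical helper: print offset modulo an assumed buffer size (default 4096).
offset_in_buffer() {
    echo "$(( $1 % ${2:-4096} ))"
}
```

For the first failure above, `line_start_offset el.txt 90` gives the number of
input bytes before line 90; the column from the error message then adds a
little on top of that. Passing the result to offset_in_buffer shows how far
into an assumed 4096-byte block the failure lies.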
--
Wes Groleau
伟思礼
You can make many plans,
but the Lᴏʀᴅ’s purpose will prevail.
http://biblehub.com/proverbs/19-21.htm