| From: | Jose Da Silva |
| Subject: | [aspell-devel] aspell 0.60 prezip.c & compress.c improvements |
| Date: | Fri, 17 Sep 2004 17:41:03 -0700 |
| User-agent: | KMail/1.6.1 |
Hi,
Please accept these additional fixes.
Explanations for each, below.
thanks.
---1--prezip.c---
Speed improvement:
There is no need to test w[l] != '\0' once p[l] != '\0' has been tested,
because the next test is p[l] == w[l]: if p[l] is non-NUL and equal to w[l],
then w[l] is non-NUL as well.
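A minimal sketch of the shortened test, not the actual prezip.c loop. The
helper name common_prefix_len is made up for illustration; p and w stand for
the previous and current words as in the note above.

```c
#include <stddef.h>

/* Hypothetical helper (not from prezip.c): length of the prefix shared by
   the previous word p and the current word w. */
static size_t common_prefix_len(const char *p, const char *w)
{
    size_t l = 0;
    /* p[l] != '\0' together with p[l] == w[l] already implies
       w[l] != '\0', so a separate test on w[l] is redundant. */
    while (p[l] != '\0' && p[l] == w[l])
        ++l;
    return l;
}
```

If w is shorter than p, the loop stops at the NUL in w because p[l] == w[l]
fails there, so the index never runs past the end of either string.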
---2--prezip.c---
An fflush is needed to flush out the remaining binary output, since there is
no trailing '\n'.
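A rough sketch of the idea, assuming the output goes through a stdio stream.
The helper write_all_and_flush is hypothetical and is not the code in the
attached prezip.c; it only shows an explicit flush after the last bytes.

```c
#include <stdio.h>

/* Hypothetical helper (not from prezip.c): write n raw bytes, then flush
   explicitly. Returns 0 on success, -1 on failure. */
static int write_all_and_flush(FILE *out, const unsigned char *buf, size_t n)
{
    if (fwrite(buf, 1, n, out) != n)
        return -1;
    /* The binary output does not end in '\n', so nothing here would
       trigger a line-buffered flush; fflush returns EOF on a write error. */
    if (fflush(out) == EOF)
        return -1;
    return 0;
}
```

Checking the return value of fflush also gives the program a chance to
report a failed write instead of exiting silently.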
---3--prezip.c---
Autosense and removal of trailing CR.
Currently, compress is immune to the carriage-return/line-feed differences
between DOS-based text lists and Linux/Mac/other text lists, but prezip
treats the carriage return as part of the word. If you mix lists, DOS-based
versions of prezip will therefore create error-filled or dirty text lists.
The added code should hopefully take care of input from mixed DOS-based and
non-DOS-based lists, while still being able to work on word lists that use an
internal CR as a valid character.
Test samples: one file with CR and one without. Both produce 60-byte files:
prezip -z <q_cr.txt >q1.pwl
prezip -z <q_no_cr.txt >q2.pwl
diff q1.pwl q2.pwl = no differences = what you want :-)
prezip -d <q1.pwl >q.txt = 84 bytes for linux-based aspell
prezip -d <q1.pwl >q.txt = 91 bytes for DOS-based aspell
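One way the trailing-CR removal could look, as a sketch only. The helper
chomp_crlf is an assumption for illustration and is not the autosense code
in the attached diff; it strips a CR only when it is the last character
before the line end, so a CR inside a word is left alone.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from prezip.c): remove one trailing '\n' and
   then one trailing '\r' in place. Returns the new length. */
static size_t chomp_crlf(char *line)
{
    size_t n = strlen(line);
    if (n > 0 && line[n - 1] == '\n')
        line[--n] = '\0';
    /* Only a CR at the very end of the line is dropped; a CR in the
       middle of a word stays part of the word. */
    if (n > 0 && line[n - 1] == '\r')
        line[--n] = '\0';
    return n;
}
```

With this, "word\r\n" and "word\n" both become "word", which matches the
identical q1.pwl and q2.pwl above. A word that really ends in CR would lose
it, so this simple version is not a full autosense.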
---1--compress.c---
compress.c needs additional fix:
#define BUFSIZE 256
should become something like:
/* BUFSIZE must be 256 to work correctly */
#define BUFSIZE 256
...so that future modifications don't change BUFSIZE to anything other than
256, since that would introduce potential errors:
(1) a number higher than 256 will introduce an error when compressing words
longer than 255 chars, since the length is encoded as 1 char = {0...255}
(2) a number lower than 256 will be a potential problem when uncompressing
long words. For example, BUFSIZE = 10 will fail on a word that is 20 chars
long.
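Beyond the comment, the value could also be pinned at compile time. This is
a sketch that assumes a C11 compiler (for _Static_assert) and assumes, as
described above, that the length is stored in a single unsigned char; the
helper fits_length_byte is hypothetical and not part of compress.c.

```c
#include <limits.h>
#include <stddef.h>

/* BUFSIZE must be 256 to work correctly */
#define BUFSIZE 256

/* C11 compile-time check: the build fails if BUFSIZE is edited. */
_Static_assert(BUFSIZE == 256, "BUFSIZE must be 256");

/* Hypothetical helper: can this length be stored in one unsigned char? */
static int fits_length_byte(size_t len)
{
    return len <= UCHAR_MAX;
}
```

On platforms with 8-bit chars, fits_length_byte(255) holds and
fits_length_byte(256) does not, which is case (1) above.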
---other, misc.---
The Canadian English spelling list is missing "blonde".
q_no_cr.txt
Description: Text document
q_cr.txt
Description: Text document
diff_prezip.txt
Description: Text document
prezip.c
Description: Text Data