help-gnu-emacs

Re: How to circumvent warning in batch mode


From: Kevin Rodgers
Subject: Re: How to circumvent warning in batch mode
Date: Fri, 09 Oct 2009 07:43:40 -0600
User-agent: Thunderbird 2.0.0.23 (Macintosh/20090812)

Decebal wrote:
I have the following code:
emacs -batch -nw --eval='
  (let (
        (match-length)
        (reg-exp "^ +")
        (substitute-str "@")
        )
    (find-file "input")
    (goto-char (point-min))
    (while (re-search-forward "^ +" nil t)
      (setq match-length (- (point) (match-beginning 0)))
      (while (> match-length (length substitute-str))
        (setq substitute-str (concat substitute-str substitute-str)))
      (replace-match (substring substitute-str 0 match-length))
    )
    (write-file "outputEmacs")
  )
'
I have several questions about it.
The input file is quite big and I get:
    File input is large (31MB), really open? (y or n)
Is there a way to circumvent this?

let-bind large-file-warning-threshold to nil around the call to find-file.
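
As a minimal sketch of that suggestion: binding large-file-warning-threshold to nil disables the size check for the duration of the let, so find-file should not prompt. The function name my-find-file-no-size-prompt is hypothetical and used only for illustration.

```elisp
;; Hypothetical helper: visit FILE without the "really open?" prompt.
(defun my-find-file-no-size-prompt (file)
  "Visit FILE with the large-file check disabled; return its buffer."
  ;; nil means "never warn about file size".
  (let ((large-file-warning-threshold nil))
    (find-file file)
    (current-buffer)))

;; In the original script, the call would replace (find-file "input"):
;; (my-find-file-no-size-prompt "input")
```

The binding is local to the let, so the threshold keeps its usual value everywhere else.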

Is there a way to do this more efficiently? This script needs about 20
seconds. When doing it with a Perl script, it takes about 6 seconds.

1. Put the code in a file (FILE.el) and byte-compile it.  Then instead of
   --eval 'CODE' on the command line, use --load FILE.elc

2. It looks like you are doing a lot of unnecessary string allocation with
   concat and substring:

   For every character after the first character in the match, you double the
   length of the replacement string until it is at least as long as the length
   of the match string, then you only use the number of characters that were in
   the match string anyway.  Change the loop to:

    (while (re-search-forward "^ +" nil t)
      (setq match-length (- (point) (match-beginning 0)))
      (if (> match-length 1)
          (replace-match (make-string match-length ?@))
        (replace-match "@")))

   That could be improved further by caching each replacement string of length
   > 1, so it is only allocated once... But now, I can see that my version
   using make-string does the same amount of string allocation as yours using
   substring, and that your use of concat is infrequent (only needed when the
   match string jumps to a larger length than has been seen so far).  So caching
   the replacement string (in an array, indexed by its length) is the way to go.
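
The caching idea could be sketched as follows. This is only an illustration: it uses a hash table keyed by length rather than the array mentioned above, and the names my-replacement-cache, my-replacement-of-length and my-replace-leading-spaces are hypothetical.

```elisp
;; Hypothetical cache: match length -> replacement string of that length.
(defvar my-replacement-cache (make-hash-table :test 'eql)
  "Maps a match length to the cached replacement string of that length.")

(defun my-replacement-of-length (n)
  "Return a string of N `@' characters, allocating it only on first use."
  (or (gethash n my-replacement-cache)
      ;; puthash returns the stored value, so this also yields the string.
      (puthash n (make-string n ?@) my-replacement-cache)))

(defun my-replace-leading-spaces ()
  "Replace each run of leading spaces in the current buffer with `@'s."
  (goto-char (point-min))
  (while (re-search-forward "^ +" nil t)
    ;; FIXEDCASE and LITERAL are both t: insert the string exactly as is.
    (replace-match
     (my-replacement-of-length (- (match-end 0) (match-beginning 0)))
     t t)))
```

Each distinct indentation width then costs one make-string call in total, no matter how many lines share it. Whether that makes a measurable difference on the 31MB input has not been measured here.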

Instead of the '@' or chr$(64) I would like to use a nbsp or chr$(160).
But then the script needs almost 3 minutes. Also every space
is replaced by two characters chr$(194) + chr$(160).
What is going wrong here?

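In UTF-8, NBSP (code point U+00A0) is encoded as 2 bytes: decimal 194 160,
aka hex C2 A0.  So the file contains one NBSP character per space; it just
takes two bytes on disk.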
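
The encoding can be checked from Lisp. A small sketch, assuming an Emacs in which character 160 is U+00A0 (Emacs 23 or later); my-nbsp-utf8-bytes is a hypothetical name:

```elisp
;; Hypothetical helper: show how one NBSP character encodes in UTF-8.
(defun my-nbsp-utf8-bytes ()
  "Return (CHARS BYTES): the character count of NBSP and its UTF-8 bytes."
  (let* ((nbsp (string #xA0))                        ; one character
         (bytes (encode-coding-string nbsp 'utf-8))) ; unibyte string
    ;; (append STRING nil) turns the unibyte string into a list of bytes.
    (list (length nbsp) (append bytes nil))))

;; (my-nbsp-utf8-bytes)  ; => (1 (194 160))
```

If a single byte 160 is wanted in the output file, writing it with a Latin-1 coding system, for instance by binding coding-system-for-write to 'latin-1 around write-file, is one option. That is an assumption about the desired output, not something stated in the question.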

--
Kevin Rodgers
Denver, Colorado, USA