emacs-pretest-bug

Emacs hangs on long words in .html files


From: Chris Moore
Subject: Emacs hangs on long words in .html files
Date: Sun, 03 Dec 2006 06:36:39 +0100

I just made a new Firefox profile and was trying to copy my favourite
bookmarks from my old bookmarks.html into the new one.  I opened the
bookmarks.html file in Emacs, did a C-s (isearch-forward) for
"wikipedia", and hit C-s a few more times to find the bookmark I was
looking for.  After a few keypresses Emacs hung, using all
available CPU.  Here's a backtrace:

  (gdb) where
  #0  0x081399b6 in re_match_2_internal (bufp=0x83088ac,
      string1=<value optimized out>, size1=0,
      string2=0xb7175008 "<!DOCTYPE NETSCAPE-Bookmark-file-1>\n<!--
      This is an automatically generated file.\n     It will be read
      and overwritten.\n     DO NOT EDIT! -->\n<META
      HTTP-EQUIV=\"Content-Type\" CONTENT=\"text/html; charse"...,
      size2=266793, pos=18697, regs=0x82feb28,
      stop=30897) at regex.c:5515
  #1  0x0813c45d in re_search_2 (bufp=0x83088ac,
      str1=0xb7175008 "<!DOCTYPE NETSCAPE-Bookmark-file-1>\n<!-- This
      is an automatically generated file.\n     It will be read and
      overwritten.\n     DO NOT EDIT! -->\n<META
      HTTP-EQUIV=\"Content-Type\" CONTENT=\"text/html; charse"...,
      size1=266793, str2=0xb71b69ec "", size2=0,
      startpos=18697, range=<value optimized out>, regs=0x82feb28,
      stop=30897) at regex.c:4433
  #2  0x08134da9 in search_buffer (string=<value optimized out>,
      pos=<value optimized out>,
      pos_byte=15941, lim=30876, lim_byte=30898, n=1, RE=1, trt=139281348,
      inverse_trt=139373300, posix=0) at search.c:1177
  #3  0x08135ae7 in search_command (string=149270987,
      bound=<value optimized out>,
      noerror=137455865, count=137455817, direction=1, RE=1, posix=0)
      at search.c:977
  #4  0x0815a197 in Ffuncall (nargs=4, args=0xbf9c3b20) at eval.c:3007
  #5  0x0818498a in Fbyte_code (bytestr=136500331, vector=136500348,
      maxdepth=72) at bytecode.c:679
  #6  0x08159bd4 in funcall_lambda (fun=136500268, nargs=3,
      arg_vector=0xbf9c3c64) at eval.c:3184
  #7  0x08159feb in Ffuncall (nargs=4, args=0xbf9c3c60) at eval.c:3054
  #8  0x0818498a in Fbyte_code (bytestr=136497035, vector=136497052,
      maxdepth=40) at bytecode.c:679
  #9  0x08159bd4 in funcall_lambda (fun=136496988, nargs=3,
      arg_vector=0xbf9c3d94) at eval.c:3184
  #10 0x08159feb in Ffuncall (nargs=4, args=0xbf9c3d90) at eval.c:3054
  #11 0x0818498a in Fbyte_code (bytestr=136495859, vector=136495876,
      maxdepth=32) at bytecode.c:679
  #12 0x08159bd4 in funcall_lambda (fun=136495804, nargs=2,
      arg_vector=0xbf9c3f48) at eval.c:3184
  #13 0x08159feb in Ffuncall (nargs=3, args=0xbf9c3f44) at eval.c:3054
  #14 0x0815b621 in run_hook_with_args (nargs=3, args=0xbf9c3f44,
      cond=to_completion) at eval.c:2656
  #15 0x0815a303 in Ffuncall (nargs=4, args=0xbf9c3f40) at eval.c:2978
  #16 0x0818498a in Fbyte_code (bytestr=136511131, vector=136511156,
      maxdepth=32) at bytecode.c:679
  #17 0x08159688 in Feval (form=136511117) at eval.c:2334
  #18 0x0815bcc1 in internal_lisp_condition_case (var=137785529,
      bodyform=136511117, handlers=136511189) at eval.c:1426
  #19 0x08183c89 in Fbyte_code (bytestr=136510795, vector=136510812,
      maxdepth=64) at bytecode.c:869
  #20 0x08159bd4 in funcall_lambda (fun=136510740, nargs=2,
      arg_vector=0xbf9c4274) at eval.c:3184
  #21 0x08159feb in Ffuncall (nargs=3, args=0xbf9c4270) at eval.c:3054
  #22 0x0818498a in Fbyte_code (bytestr=136510475, vector=136510492,
      maxdepth=72) at bytecode.c:679
  #23 0x08159bd4 in funcall_lambda (fun=136510436, nargs=1,
      arg_vector=0xbf9c44d4) at eval.c:3184
  #24 0x08159feb in Ffuncall (nargs=2, args=0xbf9c44d0) at eval.c:3054
  #25 0x081588c8 in internal_condition_case_2 (bfun=0x8159e40 <Ffuncall>,
      nargs=2, args=0xbf9c44d0, handlers=137455865,
      hfun=0x8075f60 <safe_eval_handler>) at eval.c:1580
  #26 0x0807f9de in safe_call (nargs=2, args=0xbf9c44d0) at xdisp.c:2339
  #27 0x0807fa25 in safe_call1 (fn=139462281, arg=125064) at xdisp.c:2359
  #28 0x0808a692 in handle_fontified_prop (it=0xbf9c4d34) at xdisp.c:3292
  #29 0x08073e3e in handle_stop (it=0xbf9c4d34) at xdisp.c:3045
  #30 0x0807972d in next_element_from_buffer (it=0xbf9c4d34) at xdisp.c:6241
  #31 0x080760bf in get_next_display_element (it=0xbf9c4d34) at xdisp.c:5499
  #32 0x08076f18 in move_it_in_display_line_to (it=0xbf9c4d34,
      to_charpos=47995, to_x=-1, op=8) at xdisp.c:6443
  #33 0x080792c0 in move_it_to (it=0xbf9c4d34, to_charpos=47995, to_x=-1,
      to_y=-1, to_vpos=-1, op=8) at xdisp.c:6806
  #34 0x08080bfb in move_it_vertically_backward (it=0xbf9c5c70, dy=344)
      at xdisp.c:6913
  #35 0x08084bf3 in redisplay_window (window=144431460, just_this_one_p=0)
      at xdisp.c:13254
  #36 0x080879a3 in redisplay_window_0 (window=144431460) at xdisp.c:11764
  #37 0x081589d1 in internal_condition_case_1 (
      bfun=0x8087980 <redisplay_window_0>, arg=144431460,
      handlers=137442597, hfun=0x80651e0 <redisplay_window_error>)
      at eval.c:1529
  #38 0x080741e7 in redisplay_windows (window=149176637) at xdisp.c:11743
  #39 0x080881bf in redisplay_internal (
      preserve_echo_area=<value optimized out>) at xdisp.c:11303
  #40 0x08088ca7 in redisplay_preserve_echo_area (from_where=2) at xdisp.c:11550
  #41 0x0805589e in Fredisplay (force=137455817) at dispnew.c:6565
  #42 0x0815a137 in Ffuncall (nargs=1, args=0xbf9c6960) at eval.c:2997
  #43 0x0818498a in Fbyte_code (bytestr=136083891, vector=136083908,
      maxdepth=40) at bytecode.c:679
  #44 0x08159bd4 in funcall_lambda (fun=136083828, nargs=1,
      arg_vector=0xbf9c6a94) at eval.c:3184
  #45 0x08159feb in Ffuncall (nargs=2, args=0xbf9c6a90) at eval.c:3054
  #46 0x0818498a in Fbyte_code (bytestr=136558067, vector=136558084,
      maxdepth=32) at bytecode.c:679
  #47 0x08159bd4 in funcall_lambda (fun=136558012, nargs=0,
      arg_vector=0xbf9c6bb4) at eval.c:3184
  #48 0x08159feb in Ffuncall (nargs=1, args=0xbf9c6bb0) at eval.c:3054
  #49 0x0818498a in Fbyte_code (bytestr=136544515, vector=136544532,
      maxdepth=40) at bytecode.c:679
  #50 0x08159bd4 in funcall_lambda (fun=136544492, nargs=0,
      arg_vector=0xbf9c6ce4) at eval.c:3184
  #51 0x08159feb in Ffuncall (nargs=1, args=0xbf9c6ce0) at eval.c:3054
  #52 0x0818498a in Fbyte_code (bytestr=136548035, vector=136548052,
      maxdepth=32) at bytecode.c:679
  #53 0x08159bd4 in funcall_lambda (fun=136548004, nargs=1,
      arg_vector=0xbf9c6e04) at eval.c:3184
  #54 0x08159feb in Ffuncall (nargs=2, args=0xbf9c6e00) at eval.c:3054
  #55 0x0818498a in Fbyte_code (bytestr=136548283, vector=136548300,
      maxdepth=16) at bytecode.c:679
  #56 0x08159bd4 in funcall_lambda (fun=136548252, nargs=0,
      arg_vector=0xbf9c6f24) at eval.c:3184
  #57 0x08159feb in Ffuncall (nargs=1, args=0xbf9c6f20) at eval.c:3054
  #58 0x0815b9f9 in apply1 (fn=139755201, arg=137455817) at eval.c:2738
  #59 0x08157187 in Fcall_interactively (function=139755201,
      record_flag=137455817, keys=137496332) at callint.c:406
  #60 0x080f7ac3 in Fcommand_execute (cmd=139755201, record_flag=137455817,
      keys=137455817, special=137455817) at keyboard.c:9867
  #61 0x0810316a in command_loop_1 () at keyboard.c:1858
  #62 0x08158c0b in internal_condition_case (bfun=0x8102df0 <command_loop_1>,
      handlers=137500521, hfun=0x80fd800 <cmd_error>) at eval.c:1481
  #63 0x080fcbde in command_loop_2 () at keyboard.c:1326
  #64 0x08158ccc in internal_catch (tag=137496729,
      func=0x80fcbb0 <command_loop_2>, arg=137455817) at eval.c:1222
  #65 0x080fd64e in command_loop () at keyboard.c:1305
  #66 0x080fd9d8 in recursive_edit_1 () at keyboard.c:1003
  #67 0x080fdac6 in Frecursive_edit () at keyboard.c:1064
  #68 0x080f3b82 in main (argc=0, argv=0xbf9c77b4) at emacs.c:1794

  Lisp Backtrace:
  "re-search-forward" (0x8e5b1cb)
  "font-lock-fontify-keywords-region" (0x1e320)
  "font-lock-default-fontify-region" (0x1e320)
  "font-lock-fontify-region" (0x1e320)
  "run-hook-with-args" (0x8491b79)
  "byte-code" (0x822fe9b)
  "jit-lock-fontify-now" (0x1e888)
  "jit-lock-function" (0x1e888)
  "redisplay" (0x0)
  "sit-for" (0x0)
  "isearch-lazy-highlight-new-loop" (0x83168c9)
  "isearch-update" (0x6e60)
  "isearch-repeat" (0x838f269)
  "isearch-repeat-forward" (0x83168c9)
  "call-interactively" (0x8547ec1)
  (gdb)

I tried cutting the bookmarks.html file down to make a minimal test
case, and found that this does the trick:

  bash$ i=0; while ((i<2000)); do printf abcdefghij; ((i++)); done >| /tmp/file.html; echo >> /tmp/file.html
  bash$ /usr/local/bin/emacs -Q -nw /tmp/file.html
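
For trying other word lengths, the one-liner above can be wrapped in a
small function.  This is only a sketch: the name make_long_word_file and
its two arguments are made up for this example, and it writes the same
content as the command above (COUNT copies of "abcdefghij" plus one
newline).

```shell
#!/usr/bin/env bash
# make_long_word_file COUNT FILE
# Hypothetical helper (not part of Emacs): write COUNT copies of
# "abcdefghij" followed by a single newline to FILE, i.e. one "word"
# of COUNT*10 characters on one line.
make_long_word_file() {
    local count=$1 file=$2
    local i
    for ((i = 0; i < count; i++)); do
        printf abcdefghij
    done >| "$file"      # >| overwrites even if noclobber is set
    echo >> "$file"
}

# Regenerate the file from the report (2000 repetitions).
make_long_word_file 2000 /tmp/file.html

# Then open it as in the report:
#   emacs -Q -nw /tmp/file.html
```

Calling it with a different COUNT and a different file name (for example
one that does not end in .html) gives comparable files for other tests.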

It turns out that my bookmarks.html had very long lines in it due to
inline favicons, which look like this:

  <DT><A HREF="http://news.gmane.org/gmane.linux.debian.user" ADD_DATE="1161386011" \
  LAST_VISIT="1164119611" LAST_MODIFIED="1161386035" SHORTCUTURL="duser" ICON="data:\
  image/x-icon;base64,AAABAAMAEBAAAAEACABoBQAANgAAACAgAAABAAgAqAgAAJ4FAAAwMAAAAQAIAK\
  gOAABGDgAAKAAAABAAAAAgAAAAAQAIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACxsbEAra2tAKWl\
  pQChoaEAm5ubAJWVlQCLi4sAh4eHAGlpaQBZWVkA/v7+APz8/ABJSUkA+vr6APb29gD09PQAQUFBAPLy8g\
  A/Pz8APT09AO7u7gAxMTEALy8vAODg4AAtLS0A2NjYACEhIQDQ0NAAzs7OABsbGwDGxsYAERERAA8PDwDA\
  wMAADQ0NALi4uAAFBQUAAwMDAAEBAQCqqqoAqKioAKampgCampoAlpaWAJCQkACIiIgAgICAAH5+fgBycn\
  IAaGhoAFpaWgBQUFAA////AExMTAD9/f0ASkpKAPv7+wBISEgA+fn5APf39wBEREQAQkJCAO3t7QDn5+cA\
  4+PjAOHh4QAqKioA19fXACIiIgDR0dEAHBwcAM3NzQAaGhoAy8vLAMfHxwAUFBQAEBAQAAwMDAC9vb0ACg\
  oKAAgICAAGBgYABAQEAAICAgCzs7MAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\
  AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\
  AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\
  AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\
  [and so on]

The favicon.ico is encoded as one long base64 string, which in places
contains long runs of 'A' characters.
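
A likely reason for the runs of 'A' (an assumption, I have not decoded
this particular icon): base64 encodes every three zero bytes as "AAAA",
and icon data tends to contain many zero bytes.  A quick check of the
encoding itself, using a made-up helper name zero_bytes_base64:

```shell
#!/usr/bin/env bash
# zero_bytes_base64 N
# Hypothetical helper: base64-encode N zero bytes and print the result
# on one line (tr removes any line wrapping added by base64).
zero_bytes_base64() {
    head -c "$1" /dev/zero | base64 | tr -d '\n'
}

# 30 zero bytes become a run of 'A' characters with no other letters.
zero_bytes_base64 30
echo
```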

I wasn't able to use C-g to get control back and ended up having to
kill Emacs.  I don't know how long it would have taken for Emacs to
recover.

I did a few tests to see how the startup time depends on the length of
the word:

  length  time
    2000    1s
    4000    3s
    6000    8s
    8000   11s
   10000   20s
   12000   29s
   14000   46s
   16000   67s

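This is at least O(n^2) in the length of the word: 8 times the length
(2000 to 16000) takes 67 times as long, and the last doubling (8000 to
16000) takes about 6 times as long.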
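
A rough way to check the growth rate is to assume t ~ n^k and estimate k
from pairs of measurements as log(t2/t1) / log(n2/n1).  The sketch below
does that with awk on the timings above; the helper names
overall_exponent and step_exponents are made up for this example, and
since the timings are whole seconds the per-step values are noisy.

```shell
#!/usr/bin/env bash
# Timings copied from the table in the report: "length seconds" per line.
timings='2000 1
4000 3
6000 8
8000 11
10000 20
12000 29
14000 46
16000 67'

# overall_exponent (hypothetical helper): estimate k in t ~ n^k from the
# first and last rows only: k = log(t_last/t_first) / log(n_last/n_first).
overall_exponent() {
    awk 'NR == 1 { n0 = $1; t0 = $2 }
         { n = $1; t = $2 }
         END { printf "%.2f\n", log(t / t0) / log(n / n0) }'
}

# step_exponents (hypothetical helper): the same estimate for each pair
# of consecutive rows.
step_exponents() {
    awk 'NR > 1 { printf "%d -> %d: %.2f\n", pn, $1, log($2 / pt) / log($1 / pn) }
         { pn = $1; pt = $2 }'
}

printf '%s\n' "$timings" | overall_exponent
printf '%s\n' "$timings" | step_exponents
```

An exponent of 2 corresponds to quadratic growth.  The first-to-last
estimate comes out at about 2.0; per-step values above 2 on the longer
words are what would point to something worse than quadratic.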

This seems to be specific to .html files.



