
bug#74547: 31.0.50; igc: assertion failed in buffer.c


From: Geza Herman
Subject: bug#74547: 31.0.50; igc: assertion failed in buffer.c
Date: Wed, 4 Dec 2024 20:11:18 +0100
User-agent: Mozilla Thunderbird


On 12/1/24 22:15, Pip Cet wrote:
> "Geza Herman" <geza.herman@gmail.com> writes:
>> On 12/1/24 16:48, Pip Cet wrote:
>>> Gerd Möllmann <gerd.moellmann@gmail.com> writes:

>> Back then, the future of the new GC was an open question, so Gerd said
>> (https://lists.gnu.org/archive/html/emacs-devel/2024-03/msg00544.html)
>> "Please don't take my GC efforts into consideration. That may succeed
>> or not. But this is also a matter of good design, using the stack,
>> (which BTW pdumper does, too), vs. bad design." That's why we went with
>> the fastest implementation, one that doesn't use Lisp vectors for
>> storage. But we suspected that this JSON parser design would likely
>> cause a problem with the new GC. So even if it turns out that the
>> current problem was not caused by the parser, I still think something
>> should be done about this JSON parser design to eliminate the
>> potential problem. The Lisp vector based approach was reverted because
>> it added extra pressure on the GC. For large JSON messages that
>> doesn't matter too much, but when the JSON is small, the extra GC time
>> made the parser measurably slower. However, as far as I remember, that
>> version didn't have the small internal storage optimization yet. If we
>> convert back to the vector based approach, the extra GC pressure will
>> be smaller than with the original vector based approach (without the
>> internal storage), as for smaller sizes the vector won't actually be
>> used.
>>
>> Géza
> Thank you for the summary, that makes sense. Is there a standard corpus
> of JSON documents that you use to benchmark the code? That would be
> very helpful, I think, since, as Eli correctly points out, JSON parsing
> performance is critical.

I'm not aware of such a corpus. When I developed the new JSON parser, the performance difference was so large that it was obvious the new parser was faster. But I did run benchmarks on JSONs that were generated by LSP communication (maybe I can share this corpus, if there is interest, but I would need to anonymize it first), and I also ran a benchmark on all the JSONs I found on my computer.

But this time the performance difference is expected to be smaller: using Lisp vectors shouldn't have a very large effect on performance. I'd check the performance with small JSONs, but ones large enough that the (non-internal) object_workspace actually gets used (make sure to run a lot of iterations, so that the amortized GC time is included in the result). For larger JSONs we shouldn't see a difference, as all the other allocations (which store the actual result of the parsing) should hide the additional cost of the Lisp vector allocation. At least, this is my theory.
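(To make the "small internal storage optimization" and the "non-internal" workspace concrete: the idea is roughly the pattern below, with hypothetical names; the real code in src/json.c differs in detail. Values first go into a fixed inline array, and only JSONs with more values than fit inline touch the separately allocated storage, which is the part that could become a Lisp vector.)

/* Sketch of a workspace with small internal storage; hypothetical
   names, not the actual json.c code.  Small JSONs never leave the
   inline array, so they never pay for a separate allocation.  */
#include <stdlib.h>
#include <string.h>

typedef void *value;                /* stand-in for Emacs's Lisp_Object */

enum { WS_INLINE_CAPACITY = 64 };   /* hypothetical size */

struct workspace
{
  value inline_slots[WS_INLINE_CAPACITY];
  value *slots;                     /* inline_slots, or heap once grown */
  size_t count, capacity;
};

static void
ws_init (struct workspace *ws)
{
  ws->slots = ws->inline_slots;
  ws->count = 0;
  ws->capacity = WS_INLINE_CAPACITY;
}

static void
ws_push (struct workspace *ws, value v)
{
  if (ws->count == ws->capacity)
    {
      /* Grow only when the inline storage overflows.  If this block
         were a Lisp vector instead of malloc'ed memory, the GC could
         trace it; that is the conversion being discussed.  (Error
         handling omitted for brevity.)  */
      size_t cap = 2 * ws->capacity;
      value *p = malloc (cap * sizeof *p);
      memcpy (p, ws->slots, ws->count * sizeof *p);
      if (ws->slots != ws->inline_slots)
        free (ws->slots);
      ws->slots = p;
      ws->capacity = cap;
    }
  ws->slots[ws->count++] = v;
}

So for the benchmark, the interesting small JSONs are the ones with more values than the inline capacity; anything below it never allocates extra storage at all.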


> My gut feeling is that we should get rid of the object_workspace
> entirely, instead modifying the general Lisp code to avoid performance
> issues (and sacrifice some memory in the process, on most systems).
The object_workspace is grown only once during a single parse. Once it has grown to the needed size, the only extra cost when parsing a value is copying the data from the object_workspace to its final place. A truncate-based solution does the same copy, but it also needs to grow the hash table/array for each value, so it executes more allocations and copies than the current solution. So I'd prefer to keep the object_workspace. If the only solution is to convert it to a Lisp vector, then I think we should do that. But again, this is just my theory: if we try the truncate-based solution and it turns out not to be significantly slower, then it can be a good solution as well.
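(Reusing the hypothetical struct workspace from the sketch above: the parser remembers ws->count when an array opens, and the "copy to its final place" when the array closes is a single exact-size allocation plus one memcpy. Again only a sketch, not the actual json.c code:)

/* Called when a JSON array closes: copy the values collected since
   START into their final, exact-size home (in Emacs this would be a
   Lisp vector; plain malloc stands in here), then pop them off the
   workspace so nesting works.  This is the single extra copy
   mentioned above.  */
static value *
ws_finish_array (struct workspace *ws, size_t start, size_t *len)
{
  *len = ws->count - start;
  value *result = malloc (*len * sizeof *result);
  memcpy (result, ws->slots + start, *len * sizeof *result);
  ws->count = start;              /* pop this array's values */
  return result;
}

A truncate-based solution would instead append each value directly into a growable result object and shrink it to size at the end: that is the same final copy, plus a grow and copy every time the result object overflows along the way, which is the extra cost described above.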
