emacs-orgmode

Re: [O] Exporting large documents


From: Nicolas Goaziou
Subject: Re: [O] Exporting large documents
Date: Mon, 06 May 2013 21:17:50 +0200

Hello,

Achim Gratz writes:

> Lawrence Mitchell writes:
>> org-element--current-element takes (on my machine) 0.0003 seconds per
>> call.  However, when exporting 128x the orgmanual introduction, it's
>> called around 250000 times giving ~ 80 seconds total time (out of ~200
>> total).
>
> I've traced this a bit and the question does warrant further
> investigation.  Exporting the introduction without any duplications
> already shows some interesting things: the property drawer for the
> introduction is scanned a whopping 137 times, followed by 134 times the
> cindex entry following it, followed by 125 times the "Summary" headline.
> The header options feature prominently with around 100 scans each as
> well.
>
> The rest of the calls have mostly just a single invocation, but there
> are some instances where parts of the tree are traversed multiple times
> in succession to apparently adjust the :end property to the leaf element
> in small increments or decrements.  If elements are mutable during
> parsing then caching is more difficult as well, obviously.
>
>> So it sort of feels like actually what is needed is micro-optimisations
>> of the bits of the export engine that are called the most.
>
> Looking at the traces I'd think if we could eliminate the repeated
> backtracking to adjust the leaves or at least skip over those elements in
> a backtrack that are already fully parsed instead of parsing them again,
> that would be a good start.

Actually, the situation is a bit different: parsing doesn't backtrack.
Profile `org-element-parse-buffer' with elp and you will see that each
element is parsed only once.

The problem comes from `org-element-at-point'. To identify the element at
point, it has to move back to the current headline and parse the buffer
again from there. That means the first element after the headline (often
a property drawer) is parsed again every time we need information about
something within the section.
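
To illustrate why the first element after a headline dominates the call
counts, here is a toy model in Python. It is not Org's actual code:
`element_at_point` and the section layout are hypothetical stand-ins, and
the only assumption is that each lookup restarts at the headline and parses
elements in order until it reaches the one containing point.

```python
from collections import Counter

# Hypothetical section: (name, start, end) for each element after the headline.
SECTION = [
    ("property-drawer", 0, 10),
    ("cindex-keyword", 10, 20),
    ("paragraph-1", 20, 50),
    ("paragraph-2", 50, 80),
]

parse_counts = Counter()


def element_at_point(pos):
    """Toy lookup: restart at the headline and parse forward until POS is reached."""
    for name, start, end in SECTION:
        parse_counts[name] += 1  # each visit stands for one parser call
        if start <= pos < end:
            return name
    return None


# Ask for the element at several positions spread over the section.
for pos in (5, 15, 25, 55, 60, 75):
    element_at_point(pos)

for name, _, _ in SECTION:
    print(name, parse_counts[name])
```

In this model the property drawer is parsed once per lookup, and every later
element slightly less often, which matches the shape of the decreasing
counts in the trace quoted above.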

A very good improvement for the exporter and, more importantly, for the
parser, would be to cache results from `org-element--current-element'.
However, this cache would also need to be refreshed after each buffer
modification. This is the tricky part.

One solution would be to use `after-change-functions' and
`before-change-functions' to store intervals of modified areas in the
buffer. Then, during idle time, a `maphash' over the cache could update
the boundaries of cached values, or remove them completely, according to
those intervals.
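
The idea can be sketched as follows. This is a minimal model in Python, not
an implementation for Org: `ElementCache`, `record_change` and `sync` are
hypothetical names. `record_change` stands in for what functions on the
change hooks would do (only remember the modified interval), and `sync`
stands in for the idle-time pass over the hash table. As a conservative
assumption, any cached element that overlaps or touches a change is dropped
rather than adjusted.

```python
from dataclasses import dataclass


@dataclass
class Element:
    kind: str
    begin: int
    end: int  # exclusive


class ElementCache:
    def __init__(self):
        self.elements = {}  # begin position -> Element
        self.pending = []   # recorded changes: (begin, old_end, new_end)

    def put(self, element):
        self.elements[element.begin] = element

    def record_change(self, begin, old_end, new_end):
        """Cheap hook body: only remember the modified interval."""
        self.pending.append((begin, old_end, new_end))

    def sync(self):
        """Idle-time pass: keep, shift or drop cached entries per change."""
        for begin, old_end, new_end in self.pending:
            delta = new_end - old_end
            updated = {}
            for el in self.elements.values():
                if el.end < begin:
                    # Strictly before the change: unaffected.
                    updated[el.begin] = el
                elif el.begin > old_end:
                    # Strictly after the change: shift both boundaries.
                    shifted = Element(el.kind, el.begin + delta, el.end + delta)
                    updated[shifted.begin] = shifted
                # Otherwise the element overlaps or touches the change:
                # drop it so that it gets parsed again on demand.
            self.elements = updated
        self.pending.clear()


cache = ElementCache()
cache.put(Element("property-drawer", 0, 10))
cache.put(Element("paragraph", 12, 40))
cache.put(Element("paragraph", 42, 80))

cache.record_change(20, 20, 25)  # five characters inserted at position 20
cache.sync()

for begin in sorted(cache.elements):
    print(cache.elements[begin])
```

Changes are applied in the order they were recorded, so each interval is
interpreted against positions that already reflect the earlier changes. A
new table is built on each pass instead of modifying the one being
iterated over.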


Regards,

-- 
Nicolas Goaziou
