emacs-orgmode

Re: [O] Discussion request: 15m tangle time, details follow


From: Eric Schulte
Subject: Re: [O] Discussion request: 15m tangle time, details follow
Date: Wed, 18 Jun 2014 16:59:16 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Aaron Ecay <address@hidden> writes:

> Hi Grant,
>
> On 17 June 2014, Grant Rettke wrote:
>> 
>> Good evening,
>> 
>> Over the past few months I've been working on the same literate
>> document. It has been a learning
>> experience for me; trial and error has abounded. The key tenet that
>> I've adhered to, though, is to truly
>> embrace literate programming, and the more I learn the more it makes
>> sense. The document has
>> grown quite organically and it has been and continues to be a
>> wonderful experience. What I need
>> help, feedback, discussion, and more on is the build time.
>> 
>> The average build takes 15m. 
>
> Here you mean time to tangle, correct?  (As opposed to exporting to
> HTML/LaTeX/etc.)
>
> I can confirm very long times to tangle a document with a structure like
> yours.  I ran the emacs profiler
> <https://www.gnu.org/software/emacs/manual/html_node/elisp/Profiling.html>
> while tangling the document for 30 secs, then interrupted with C-g and
> generated a report.  That is attached.
>
>
>
> I did two non-standard things to this profile.  The first was:
>
> (setq profiler-report-cpu-line-format
>       '((100 left)
>         ;; The 100 above is increased from the default of 50
>         ;; to allow the deeply nested call tree to be seen
>         (24 right ((19 right)
>                    (5 right)))))
>
> The second was to convert an anonymous lambda found in
> org-babel-params-from-properties into a named function, so that it would
> show up in the profiling results on its own line:
>
> (defun org-babel-params-from-properties-inner1 (header-arg)
>   (let (val)
>     (and (setq val (org-entry-get (point) header-arg t))
>          (cons (intern (concat ":" header-arg))
>                (org-babel-read val)))))
>
> The profile shows that most of the slowdown is in org-entry-get.  Indeed,
> org-babel-params-from-properties calls this function ~30 times per source
> block.  When called with the inherit arg set to t (as here), this function
> takes time (at least) proportional to the number of headings dominating
> the source block, which in your document can be up to 5.
>

Thanks for taking the time to profile this.  It's nice to have more
evidence that the use of properties is definitely the culprit here.
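
For anyone who wants to repeat the measurement on their own file, here is
a minimal sketch of a profiling run like the one described above.  The
command name my/profile-tangle is hypothetical; the profiler calls are
the ones documented in the Elisp manual.  It assumes the Org buffer to
tangle is current.

```elisp
(require 'profiler)
(require 'ob-tangle)

(defun my/profile-tangle ()
  "Profile `org-babel-tangle' in the current buffer and show a CPU report.
Hypothetical helper, not part of Org.  The report is still produced
when the tangle is interrupted with C-g."
  (interactive)
  (profiler-start 'cpu)
  (unwind-protect
      (org-babel-tangle)
    ;; Runs on normal exit and on quit, so a partial run can be inspected.
    (profiler-report)
    (profiler-stop)))
```

Run it with M-x my/profile-tangle from the literate document and expand
the call tree in the report buffer to see where the time goes.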

>
> I think there are two problems here.  The first is the situation where
> babel needs to fetch 30 properties per source block.  Indeed, this is
> marked “deprecated” in the source, in favor of a system where there is
> only one header arg.  This has been marked deprecated for almost exactly
> a year in the code (Achim’s commit 90b16870 of 2013-06-23), but I don’t
> know of any prominent announcement of the deprecation.  So I doubt the
> old slow code could be removed without breaking many people’s setups,
> although possibly a customization variable could be introduced to allow
> users to opt in to the new, faster system.  You’d then have to update
> your file:
>
>   :PROPERTIES:
>   :exports: none
>   :tangle: no
>   :END:
>
> becomes
>
>   :PROPERTIES:
>   :header-args: :exports none :tangle no
>   :END:
>
> The new system is also a bit inferior, in that it doesn’t allow header
> arg inheritance as easily.  So with the one-prop-per-arg system the
> following works as expected:
>
>   * foo
>     :PROPERTIES:
>     :exports: none
>     :END:
>   ** bar
>      :PROPERTIES:
>      :tangle: no
>      :END:
>
>   (src block here)
>
> On the other hand, in the new system there’s no way to specify some
> header args at foo and some at bar; the lowest header-args property
> wins.  (At least as far as I can see)
>

As I recall this inheritance issue is the wall that we ran up against.
The deprecation comment in the code was premature.
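
To make the inheritance difference concrete, here is a small sketch that
builds the foo/bar outline from above in a temporary buffer and does the
inherited lookups by hand.  The function name my/inherited-props-demo is
made up for illustration, and both property styles are put in the same
drawers only so that they can be compared side by side.

```elisp
(require 'org)

(defun my/inherited-props-demo ()
  "Return inherited lookups for a block nested under two headings.
Hypothetical demo.  The result is a list of the values found for
\"exports\", \"tangle\" and \"header-args\", in that order."
  (with-temp-buffer
    (insert "* foo\n"
            ":PROPERTIES:\n"
            ":exports: none\n"
            ":header-args: :exports none\n"
            ":END:\n"
            "** bar\n"
            ":PROPERTIES:\n"
            ":tangle: no\n"
            ":header-args: :tangle no\n"
            ":END:\n"
            "#+begin_src sh\n"
            "echo hi\n"
            "#+end_src\n")
    (org-mode)
    (goto-char (point-min))
    (search-forward "echo hi")
    ;; Each call walks up from the block until it finds the property.
    (list (org-entry-get (point) "exports" t)
          (org-entry-get (point) "tangle" t)
          (org-entry-get (point) "header-args" t))))
```

The first two lookups each find their own property at a different level,
while the header-args lookup stops at the nearest heading that sets it,
so what foo specified is not seen from inside bar.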

>
> The second issue is that it’s desirable to memoize calls to
> org-entry-get.  Probably the easiest way to do this is to use the
> org-element cache.  Indeed, a quick and hacky test that I did seemed to
> confirm that this yields some speedup.  There are conceptual issues
> though – org-element forces all property keys to be uppercase, whereas
> org-entry-get (as near as I can tell...) follows the user’s
> customization of case-fold-search to determine its case sensitivity.  So
> one has to think carefully about how a rewrite to use org-element might
> affect the case-sensitivity of the property API (although code relying
> on the API to be sensitive to case of property keys might be rare in
> practice).
>

Thanks, it does sound like the org-element cache could be useful here.  I
don't believe it existed the last time we wrestled with this performance
issue.
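
Independent of org-element, the memoization idea can be sketched with a
plain hash table.  This is only an illustration, not what Org does: the
names my/entry-get-cache, my/cached-entry-get and
my/with-entry-get-cache are hypothetical, and the cache is only valid
under the assumption that the buffer is not modified while it is active.
Keys use the property string exactly as given, so the sketch takes no
position on the case-sensitivity question.

```elisp
(require 'org)

(defvar my/entry-get-cache nil
  "Hash table memoizing inherited property lookups, or nil when disabled.")

(defun my/cached-entry-get (property)
  "Like (org-entry-get (point) PROPERTY t), memoized per enclosing heading.
Hypothetical helper.  Falls back to a plain lookup when no cache is active."
  (if (not my/entry-get-cache)
      (org-entry-get (point) property t)
    (let* ((heading (save-excursion
                      (condition-case nil
                          (progn (org-back-to-heading t) (point))
                        ;; Before the first heading there is no entry.
                        (error nil))))
           (key (cons heading property))
           (hit (gethash key my/entry-get-cache 'miss)))
      (if (not (eq hit 'miss))
          hit
        ;; `puthash' returns the stored value, including nil results.
        (puthash key (org-entry-get (point) property t)
                 my/entry-get-cache)))))

(defmacro my/with-entry-get-cache (&rest body)
  "Run BODY with a fresh lookup cache that is discarded afterwards."
  (declare (indent 0))
  `(let ((my/entry-get-cache (make-hash-table :test #'equal)))
     ,@body))
```

With something like this, all source blocks under the same heading would
share one lookup per property for the duration of a single tangle, at
the cost of having to trust that nothing edits the buffer in between.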

The only other options I can think of are:

- introduce a customization variable that limits property lookup for
  code blocks, either disabling it entirely or restricting the search
  to a limited set of properties

- possibly extend org-entry-get (or provide an alternative) so that it
  takes multiple keys (which may be more efficient depending on the
  implementation)
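
As a rough sketch of the multi-key idea, the function below walks up the
heading tree once and picks up every wanted property on the way, instead
of doing one full walk per property.  The name my/entry-get-many is
hypothetical.  The sketch makes several assumptions that differ from the
real lookup: names are matched case-insensitively, and file-level
#+PROPERTY lines and global properties are ignored.

```elisp
(require 'org)

(defun my/entry-get-many (properties)
  "Return an alist (PROPERTY . VALUE) for PROPERTIES, with inheritance.
Hypothetical helper.  PROPERTIES is a list of property name strings.
Only property drawers of the enclosing headings are consulted."
  (let ((wanted properties)
        found)
    (save-excursion
      (when (condition-case nil
                (progn (org-back-to-heading t) t)
              (error nil))
        (catch 'done
          (while t
            (let ((local (org-entry-properties nil 'standard))
                  remaining)
              (dolist (prop wanted)
                (let ((cell (assoc-string prop local t)))
                  (if cell
                      (push (cons prop (cdr cell)) found)
                    (push prop remaining))))
              (setq wanted (nreverse remaining)))
            ;; Stop once everything is found or the top level is reached.
            (unless (and wanted (org-up-heading-safe))
              (throw 'done nil))))))
    (nreverse found)))
```

A call such as (my/entry-get-many '("exports" "tangle")) from inside a
source block would then replace two separate inherited lookups.  Whether
this is actually faster depends on how many properties are requested and
how deep the outline is, so it would need to be profiled.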

>
> TL;DR:
>
> 1. I see the same slowness you report
> 2. It seems like an architectural issue rather than one of
>    (mis)configuration
> 3. There are broad fixes available, but they require potentially
>    compatibility-breaking changes to Org
> 4. (maybe with this analysis someone can come up with a more targeted
>    fix for your use case)
>
> Hope this is helpful,

Very helpful, thanks for providing both empirical data and useful
analysis.

Best,
Eric

-- 
Eric Schulte
https://cs.unm.edu/~eschulte
PGP: 0x614CA05D (see https://u.fsf.org/yw)
