Re: [Orgmode] Re: Custom entry IDs in HTML export


From: Carsten Dominik
Subject: Re: [Orgmode] Re: Custom entry IDs in HTML export
Date: Fri, 17 Apr 2009 06:11:49 +0200


On Apr 17, 2009, at 12:37 AM, Sebastian Rose wrote:

Carsten Dominik <address@hidden> writes:
On Apr 16, 2009, at 10:50 PM, Sebastian Rose wrote:

Carsten Dominik <address@hidden> writes:
Hi Sebastian,

On Apr 16, 2009, at 3:14 PM, Sebastian Rose wrote:

Hm - counter arguments?

The only counter argument is, that hand made IDs for links are prone to
error. But that risk should be up to the user.

Yes, and during the export I can actually check and throw a warning or an
error if the same custom ID shows up twice.
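
A minimal sketch of such a duplicate check, written in Python rather than in the exporter's Emacs Lisp. The function name and the sample IDs are hypothetical; the thread only says that a warning or error could be raised.

```python
from collections import Counter


def find_duplicate_ids(custom_ids):
    """Return every custom ID that occurs more than once, in first-seen order."""
    counts = Counter(custom_ids)
    # Counter keeps insertion order, so duplicates come out in document order.
    return [cid for cid, n in counts.items() if n > 1]


# Hypothetical IDs collected from the headlines of one document.
ids = ["isearch-in-links", "install", "isearch-in-links", "faq"]
for duplicate in find_duplicate_ids(ids):
    print("Warning: custom ID used more than once:", duplicate)
```

An exporter could run a check like this once per file, before any HTML is written, and decide there whether a duplicate is a warning or a hard error.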


I actually changed my mind a little in this concern.

If the user clicks a section link in the TOC to jump to a section, he can
bookmark the page with exactly that jump target. If the jump target
(the ID) is human readable, the bookmark is more descriptive.

Yes, this is really the best application. Also, when hovering over internal
links, it is helpful if the link displays the human-readable form.

Just one wish:

The containers should reflect that change (HRID = human readable id):

<div   id="outline-container-HRID">
<h4  id="HRID">                   headline    </h4>
<div id="outline-text-HRID">
 sections content...
</div>
</div>


Sure, we can do this.  I would then add sec-xxx as one
of the alternative anchors as well.

However:  If I make the structure as you indicate above,
do I understand correctly that the structure of a section without a
human-readable id should be changed to this:

<div   id="outline-container-sec-1.1">
<h4  id="sec-1.1">                   headline    </h4>
<div id="outline-text-sec-1.1">
 sections content...
</div>
</div>


Note the "sec-" which is added to the stuff that currently
defines the structure.



I considered the `sec-' part of the automatic IDs.

In either case I'd have to adjust org-info.js. So why not go for the
human readable IDs without `sec-'?


Right now we have:

<div id="outline-container-2" class="outline-2">
<h2 id="sec-2"><span class="section-number-2">2</span> Things I want to find
out </h2>
<div class="outline-text-2" id="text-2">

The `sec-' part is in the headlines ID only.


Why? Because this introduces a parsing inconsistency for you between
automatic and custom IDs: for the automatic ones, you need to strip "sec-"
to retrieve the correct suffix for the container etc. names. With the
custom IDs, no such stripping should be done. Does this not make things
harder?

- Carsten


That's the way it is _now_. The structure above is taken from one of my
exported org-files. But it's not that hard to strip `sec-' :)

Now the scanning considers `sec-' a prefix - just like
`outline-container-' and `outline-text-'.


But in the future:


If we now plan to use human readable IDs in the TOC, those IDs would be
the IDs of the section heading. That's why those IDs should have no
`sec-' prefix.

Otherwise, bookmark URLs would not be what we want them to be:

  http://orgmode.org/org-faq.php#sec-isearch-in-links

instead of

  http://orgmode.org/org-faq.php#isearch-in-links



Automatic IDs on the other hand must have a prefix, since an ID may
_not_ start with a number.


So wouldn't it make sense, to change the IDs of the containers this way:

 Case _automatic_:

      <div id="outline-container-sec-1.1" ... >
        <h3 id="sec-1.1"> .... </h3>
        <div id="outline-text-sec-1.1" ... >
        ....
        </div>
      </div>

 Case _human-readable_:

      <div id="outline-container-isearch-in-links" ... >
        <h3 id="isearch-in-links"> .... </h3>
        <div id="outline-text-isearch-in-links" ... >
        ....
        </div>
      </div>

Yes, it does make sense. It only introduces one tiny restriction: a
human-readable ID may not be something like sec-555. But that is
reasonable; we can document and enforce this.
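
The scheme agreed on above can be sketched as follows. This is an illustration in Python, not the exporter's code: the helper names are hypothetical, and the pattern for automatic IDs (`sec-` followed by dotted numbers) is an assumption based on the examples in this thread.

```python
import re

# Assumed shape of automatic IDs: sec-1, sec-1.1, sec-555, ...
AUTOMATIC_ID = re.compile(r"sec-\d+(?:\.\d+)*")


def is_valid_custom_id(custom_id):
    """A custom ID must be non-empty and must not look like an automatic ID."""
    return bool(custom_id) and AUTOMATIC_ID.fullmatch(custom_id) is None


def section_ids(heading_id):
    """Derive the container and text IDs from the heading ID, with no stripping."""
    return {
        "container": "outline-container-" + heading_id,
        "heading": heading_id,
        "text": "outline-text-" + heading_id,
    }


print(section_ids("sec-1.1"))
print(section_ids("isearch-in-links"))
print(is_valid_custom_id("sec-555"))
```

Because the same derivation applies to both kinds of ID, a parser never has to know which kind it is looking at in order to find the matching containers.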

OK. This is what I have done now. You need to use the property CUSTOM_ID.
Please do some testing, and then I will document this change.

Daniel, could you help testing, please?

- Carsten


??


 Sebastian






 Sebastian




That way the script would keep working with older pages.
Automatic IDs and human readable ones could be mixed.


The '<a id="">' anchors are scanned anyway, as are all jump targets in
the page.

Yes, you implemented that some time ago, I remember.


Maybe this is even the point to re-work the parser of org-info.js to
become entirely independent of the TOC. The script could search for
headings instead. That's more work, but the script would then work for
all HTML pages with a structure similar to the org-export's one:

So this would mean we could read web pages with your JavaScript
support even if those web pages were not created with Org?
Pretty cool.

<div id=""><hx id=""></hx><div>content</div></div>

but I could postpone this, if you fulfill my wish above.
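
A rough sketch of the heading-based scan described above, assuming only that each section heading is an `<h1>`..`<h6>` element carrying an `id`. It uses Python's standard `html.parser` for illustration; org-info.js itself is JavaScript and works on the browser DOM, and the class name and sample page here are hypothetical.

```python
from html.parser import HTMLParser


class HeadingScanner(HTMLParser):
    """Collect (level, id, title) for every <h1>..<h6> that carries an id."""

    def __init__(self):
        super().__init__()
        self.sections = []
        self._open = None  # [level, id, text fragments] of the heading being read

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            heading_id = dict(attrs).get("id")
            if heading_id:
                self._open = [int(tag[1]), heading_id, []]

    def handle_data(self, data):
        if self._open is not None:
            self._open[2].append(data)

    def handle_endtag(self, tag):
        if self._open is not None and tag == "h%d" % self._open[0]:
            level, heading_id, parts = self._open
            # Collapse runs of whitespace in the collected heading text.
            title = " ".join("".join(parts).split())
            self.sections.append((level, heading_id, title))
            self._open = None


page = """
<div id="outline-container-isearch-in-links">
<h3 id="isearch-in-links"> Isearch in   links </h3>
<div id="outline-text-isearch-in-links">content</div>
</div>
<div id="outline-container-sec-1.1">
<h3 id="sec-1.1"><span class="section-number-3">1.1</span> Other</h3>
<div id="outline-text-sec-1.1">content</div>
</div>
"""

scanner = HeadingScanner()
scanner.feed(page)
for level, heading_id, title in scanner.sections:
    print(level, heading_id, title)
```

Nothing in the scan refers to the table of contents, so it would apply equally to pages that have none.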


Best wishes

- Carsten



Best wishes

Sebastian




Carsten Dominik <address@hidden> writes:
On Apr 16, 2009, at 10:50 AM, Sebastian Rose wrote:

Carsten Dominik <address@hidden> writes:
Hi Sebastian,

I kind of like the idea to have a property that can be
used to set an ID, as an alternative to the <<target>>
notation.  Actually, using a property seems a lot cleaner,
thanks for coming up with this idea, Daniel.

I can also follow the reasoning that it is useful to have
the table of contents link to the human-readable id, because
it provides a general, simple workflow to retrieve a link that
will persist through changes of the document.  This workflow
was described also by Bernt earlier in this thread.

Finally, I also agree that the main id in the <h3> tag
should be the automatically generated one because this is
best for automatic processing and because of all the arguments
you have presented.

Would it cause problems for org-info.js if the toc points to
a user specified anchor in the headline, instead of the main
ID that is inside the <h3> tag?  This would really be the only
required change.


I'll have to test this before I can give a final answer to this
question.

But regardless of the results, I will adjust the script to reflect that change. The script should not rule the HTML export and it will be an
easy thing to do.

But I do want to hear any counter arguments you might have....

- Carsten


Sebastian



- Carsten


On Mar 30, 2009, at 1:49 PM, Daniel Clemente wrote:

El dv, mar 27 2009, Sebastian Rose va escriure:

What we have now, just as Carsten said:

# <<human-readable>>
* Section B

Creates this headline in HTML:

<h2 id="sec-2"><a name="human-readable" id="human-readable"></a>2 Section B
</h2>

This is enough for all the use cases I can think of.


Yes, this is enough except for two things:
1. The TOC still links to #sec-2 and the user can't change that.
2. Your syntax doesn't fold very well in the outliner. I mean: if you use

# <<human-readable>>
* Section B

then the comment appears at the end of the previous section, and you can
miss it when you are viewing the heading „Section B“. I would swap both
lines (solution 1):

* Section B
# <<human-readable>>

But since there are already LOGBOOK drawers under the heading, it would
be a lot clearer to use a property, like EXPORT_ID (solution 2):

* Section B
:PROPERTIES:
:EXPORT_ID: human-readable
:END:


In this way, the TOC can reliably find the EXPORT_ID, and then generate:
<h2 id="sec-2"><a name="human-readable" id="human-readable"></a>2 Section B
</h2>
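
To make solution 2 concrete, here is a small Python sketch that reads the proposed EXPORT_ID property from an entry's property drawer and emits a headline of the form shown above. It is only an illustration: both function names are hypothetical, the drawer parsing is deliberately naive, and the real exporter is written in Emacs Lisp.

```python
import re
from html import escape


def read_export_id(entry_lines):
    """Return the EXPORT_ID found in the entry's :PROPERTIES: drawer, or None."""
    in_drawer = False
    for line in entry_lines:
        stripped = line.strip()
        if stripped == ":PROPERTIES:":
            in_drawer = True
        elif stripped == ":END:":
            in_drawer = False
        elif in_drawer:
            match = re.match(r":EXPORT_ID:\s+(\S+)$", stripped)
            if match:
                return match.group(1)
    return None


def headline_html(level, number, title, export_id=None):
    """Build the headline; the automatic sec- ID stays on the <hN> tag."""
    anchor = ""
    if export_id:
        safe = escape(export_id, quote=True)
        anchor = '<a name="{0}" id="{0}"></a>'.format(safe)
    return '<h{0} id="sec-{1}">{2}{1} {3}\n</h{0}>'.format(
        level, number, anchor, escape(title)
    )


entry = [
    "* Section B",
    ":PROPERTIES:",
    ":EXPORT_ID: human-readable",
    ":END:",
]
print(headline_html(2, "2", "Section B", read_export_id(entry)))
```

Entries without the property fall back to the plain automatic headline, so existing documents would export unchanged.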

(You could also leave *just* the human-readable id, but having two is
not bad.)


I would prefer solution 1, but I don't because I'm not sure that the TOC
can find the ID if it is written as a comment anywhere under the heading
(and together with other things).

Solution 2 thus involves a new property to specify the human-readable
entry ID, which will be used to link to the entry. The automatic ID
(#sec-2) will still work for all entries.



* Distinguishing automatic and human readable IDs

One thing I like is, that we now _can_ distinguish the
`human-readable-target' (human readable) from the `sec-2' (not human
readable and not context related) using a regular expression.

In org-info.js, I can now prefer the human readable ID in <a> over an
automatically created one, and thus use that to create the links for
`l' and `L'. The same holds true for other programming languages and
parsers.

If we open the <h3>'s ID for user defined values (bad), we cannot
distinguish those IDs using a regular expression and there is no way to
detect the human readable one. There will be no way to _know_ that the
<a>'s ID is the preferred one used for human readable links.
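
As a sketch of the regular-expression test described here: the pattern below is an assumption about what automatic IDs look like (`sec-` plus dotted numbers), and the function name is hypothetical. org-info.js is JavaScript; Python is used only to show the idea.

```python
import re

# Assumed pattern for automatically generated IDs such as sec-2 or sec-1.1.
AUTOMATIC_ID = re.compile(r"sec-\d+(?:\.\d+)*")


def preferred_link_id(heading_id, anchor_ids):
    """Prefer the first human readable <a> ID; fall back to the heading's ID."""
    for candidate in anchor_ids:
        if AUTOMATIC_ID.fullmatch(candidate) is None:
            return candidate
    return heading_id


print(preferred_link_id("sec-2", ["human-readable"]))
print(preferred_link_id("sec-2", []))
```

The fallback keeps sections without a human readable anchor linkable through their automatic ID.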


Solution 2 doesn't break the parsing techniques you use; in fact it can
also make clearer which ID is the human readable one and which one not.


This is not extremely important; just useful:
- for pages with many incoming links from external sites
- to ensure link integrity (now you can't assure that links will still
  work in 1 year ... or in some weeks)
- to avoid that HTML visitors get directed to a wrong section and can't
  find what they searched


Greetings,
Daniel


_______________________________________________
Emacs-orgmode mailing list
Remember: use `Reply All' to send replies to the list.
address@hidden
http://lists.gnu.org/mailman/listinfo/emacs-orgmode


--
Sebastian Rose, EMMA STIL - mediendesign, Niemeyerstr.6, 30449 Hannover
Tel.:  +49 (0)511 - 36 58 472
Fax:   +49 (0)1805 - 233633 - 11044
mobil: +49 (0)173 - 83 93 417
Email: address@hidden, address@hidden
Http:  www.emma-stil.de






