Re: [O] Smart Quotes Exporting
From: Mark E. Shoulson
Subject: Re: [O] Smart Quotes Exporting
Date: Fri, 01 Jun 2012 18:41:56 -0400
User-agent: Mozilla/5.0 (X11; Linux i686; rv:12.0) Gecko/20120430 Thunderbird/12.0.1
On 06/01/2012 01:11 PM, Nicolas Goaziou wrote:
Hello,
"Mark E. Shoulson"<address@hidden> writes:
Oh, certainly; they're all a disaster. I think I said that in the
writeup at the top. This is just proof of concept, nothing is in the
right place, nothing is properly documented. They have to be
defcustoms, there needs to be a good :type in the defcustom as well as
a proper docstring. You'll get no argument from me about the lack (or
inaccuracy) of docstrings and such. I hadn't gotten that far yet.
I said the patch was only if you wanted to tinker with the development
as this progresses.
No worries, I was just making some comments before forgetting about
them.
Ah, ok. Good! Thanks.
+(defun org-e-latex--quotation-marks (text info)
+ (org-export-quotation-marks text info org-e-latex-quote-replacements))
+ ;; (mapc (lambda(l)
+ ;; (let ((start 0))
+ ;; (while (setq start (string-match (car l) text start))
+ ;; (let ((new-quote (concat (match-string 1 text) (cdr l))))
+ ;; (setq text (replace-match new-quote t t text))))))
+ ;; (cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
+ ;; ;; Falls back on English.
+ ;; (assoc "en" org-e-latex-quotes))))
+ ;; text)
Use `org-e-latex-quote-replacements' directly in the code, then.
Not sure I understand this comment.
Since `org-e-latex--quotation-marks' just calls
`org-export-quotation-marks', you can remove the former completely from
"org-export.el" and use the latter instead.
Well, that was done on purpose, and maybe the reason will make sense.
As I see it, each exporter should be able to have its own smartifier
function, and the export engine should make no assumptions about that:
just call the individual exporter's function. On the other hand, many
(but perhaps not all!) of the exporters may find themselves using
essentially the same code just with different replacement strings. So I
thought that the "general-purpose" function should be in org-export.el, just for the
convenience of exporters should they choose to make use of it. So, many
of the exporters' smartifier functions will really just be calls to the
more general-purpose function.
Does that make sense?
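
A minimal sketch of that split, for illustration only. The names
`my-export-quotation-marks', `my-latex-quote-replacements' and
`my-latex-quotation-marks' are hypothetical and are not the functions from
the patch; the two replacement rules are a deliberately tiny English-only
set, and the INFO argument used in the patch is left out to keep the
example self-contained.

```elisp
;; Hypothetical general-purpose helper, the kind of thing that could live
;; in the export engine for back-ends that want it.
(defun my-export-quotation-marks (text replacements)
  "Return TEXT with each rule in REPLACEMENTS applied in order.
REPLACEMENTS is a list of (REGEXP . NEW) pairs.  Whatever group 1
of REGEXP matched, if anything, is kept in front of NEW."
  (dolist (pair replacements text)
    (let ((start 0))
      (while (string-match (car pair) text start)
        (let ((new (concat (or (match-string 1 text) "") (cdr pair))))
          ;; Resume after the inserted text so a replacement is never
          ;; matched again by the same rule.
          (setq start (+ (match-beginning 0) (length new)))
          (setq text (replace-match new t t text)))))))

;; Hypothetical per-back-end data: only the replacement strings differ.
(defvar my-latex-quote-replacements
  '(("\\(^\\|[ \t\n([]\\)\"" . "``")   ; quote after blank or opener: opening
    ("\"" . "''"))                     ; any quote still left: closing
  "Illustrative (REGEXP . NEW) rules for LaTeX double quotes.")

;; The back-end's own smartifier is then just a thin wrapper.
(defun my-latex-quotation-marks (text)
  "Return TEXT with ASCII double quotes turned into LaTeX quotes."
  (my-export-quotation-marks text my-latex-quote-replacements))
```

With these rules, (my-latex-quotation-marks "He said \"hi\" now.") returns
"He said ``hi'' now.". A back-end needing something cleverer could supply
a completely different function instead of the wrapper.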
So... there's the filter-parse-tree-functions hook, which gets applied to
the parse tree... so a back-end can add a function to that list which
looks over the parse-tree and watches for these border cases (and also
the ones within ordinary strings). Looks like it's going to be tough
to work in any flexibility to define further per-language or
per-backend cleverness to handle anything beyond the "canonical set"
of open-double, close-double, open-single, close-single, and mid-word.
To be sure, anything we do will most assuredly fail even on some
fairly reasonable input, in which case the users are pretty much on
their own and will have to do things the hard way. And I could use
that as the answer here, that, "well, it'll work only within
plain-text strings" (and I might possibly still have to use that
answer), but I would rather include the situations you bring up in the
supported set and not throw up my hands at it. So, yes, will look at
that.
Actually it isn't very hard to handle this problem. But it will be
different than the fontification used in an Org buffer.
Yes, the fontification on-screen is different, and uses a rather
different function--but if I can help it, the same regexps! So things
work the same everywhere.
I also started thinking a little about what you write below, how we can
inspect the characters just after or before quotes at the very beginning
or end of each chunk. It would be nice if it could all be encapsulated
neatly in the regexp(s).
As a first approximation, I can imagine a function accepting an element,
an object or a secondary string and returning an equivalent element,
object or secondary string, with its quotes "smartified". The algorithm
could go like this:
Walk element/object/secondary-string's contents.
Need it be element/object/secondary-string? At the bottom level it's
always about strings; the higher levels don't affect the processing of
each string in isolation. Do we need to intercept it at the element
level or just wait to grab things in the plain-text filter, since we
have access at that point too?
(Might also be that my understanding of the process and the nature of
elements is faulty or limited. Will have to see what works.)
1. When a string is encountered:
   1. If it has a quote as its first or last position, check for
      objects before or after the string to guess its status. An
      object never starts with a white space, but you may have to
      check the :post-blank property in order to know if the previous
      object had white space at its end.
Hmm, this may in fact answer my question above: you need to be able to
get at the object level to test the post-blank. I'll experiment.
   2. For each quote everywhere else in the string, your regexp can
      handle it fine.
2. When an object belonging to `org-element-recursive-objects' is
encountered, apply the function to this object.
3. Accumulate returned strings or objects.
Use accumulated data as the contents of the new object to return (i.e.
just add the type and the same properties at the beginning of this list
if it was an object or an element, return it as-is if that was
a secondary string).
On the elements side, only paragraphs, verse-blocks and table-rows can
directly contain quotes. Also, headline, inlinetask, item and
footnote-reference have secondary strings containing quotes.
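
The walk described above could be sketched roughly as follows. This is a
simplified illustration under stated assumptions, not code from
org-element.el or from the patch: objects are modelled as plain lists of
the form (TYPE PROPERTIES CONTENTS...), `my-recursive-types' is a
hypothetical stand-in for `org-element-recursive-objects', and a quote is
classified by looking only at what precedes it, so only a quote in first
position needs to consult the previous object.

```elisp
(defvar my-recursive-types '(bold italic underline)
  "Hypothetical stand-in for `org-element-recursive-objects'.")

(defun my-smartify-string (string prev)
  "Return STRING with ASCII double quotes turned into `` or ''.
PREV is the string or object just before STRING, or nil if none."
  (let ((after-blank
         ;; Non-nil when STRING starts the contents or follows blank.
         (cond ((null prev) t)
               ((stringp prev) (string-match-p "[ \t\n]\\'" prev))
               (t (> (or (plist-get (nth 1 prev) :post-blank) 0) 0))))
        (result ""))
    (dotimes (i (length string))
      (let ((char (aref string i)))
        (if (/= char ?\")
            (setq result (concat result (string char)))
          ;; First position: rely on the neighbour.  Elsewhere: look at
          ;; the previous character, as a regexp would.
          (let ((opening (if (= i 0)
                             after-blank
                           (memq (aref string (1- i))
                                 '(?\s ?\t ?\n ?\( ?\[)))))
            (setq result (concat result (if opening "``" "''")))))))
    result))

(defun my-smartify-contents (contents)
  "Return a copy of CONTENTS, a list of strings and objects, smartified."
  (let ((prev nil) (acc nil))
    (dolist (item contents (nreverse acc))
      (push (cond ((stringp item) (my-smartify-string item prev))
                  ((memq (car item) my-recursive-types)
                   ;; Same type and properties, new contents.
                   (append (list (car item) (nth 1 item))
                           (my-smartify-contents (nthcdr 2 item))))
                  (t item))           ; other objects are left alone
            acc)
      (setq prev item))))
```

For example, on the contents
("\"" (bold (:post-blank 0) "Really") "\" he said.") the first quote opens
(nothing precedes it) and the one after the bold object closes (the object
has no trailing blank). Note that inside a recursive object this sketch
treats the first string as if nothing preceded it, which a real
implementation would probably want to refine.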
I also haven't yet worked in smarts (especially in the on-screen
fontifier) for things like not fontifying inside comments or verbatim
strings, etc. That'll come in time.
I'm not sure yet where and how to install such a function, but I will
think about it when it is implemented.
Uuum... Maybe org-export-filter-parse-tree-functions?
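
Roughly what I have in mind, as a sketch only. It assumes, as I understand
the export engine, that parse-tree filters are called with the tree, the
back-end symbol and the info plist, and must return the (possibly
modified) tree. `my-smart-quotes-filter' is a hypothetical name and the
body is a placeholder that returns the tree unchanged.

```elisp
;; Declared here only so the sketch loads without the export library;
;; `defvar' leaves any existing value untouched.
(defvar org-export-filter-parse-tree-functions nil)

(defun my-smart-quotes-filter (tree backend info)
  "Hypothetical parse-tree filter: return TREE, possibly modified.
BACKEND and INFO are ignored in this placeholder."
  (ignore backend info)
  ;; A real version would walk TREE here and smartify the quotes.
  tree)

;; Register the filter so the exporter would call it on the parse tree.
(add-hook 'org-export-filter-parse-tree-functions
          #'my-smart-quotes-filter)
```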
~mark