From: Jambunathan K
Subject: Re: [O] bug in odt export via mathml of equations containing '&'
Date: Wed, 09 Nov 2011 02:47:36 +0530
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.91 (windows-nt)
Hello Myles,

The example that you have cited runs into issues at every step along the way: plastex, mathtoweb and odt. I have tried my best to be useful here.

I sincerely appreciate you exercising the LaTeX to MathML conversion facilities included in Org. I hope we get robust LaTeX->MathML converters *ultimately*. I see that there is plenty of scope for the LaTeX to MathML converters to improve and mature.

This is going to be a long mail. Read on.
#+TITLE:     improvements.org
#+AUTHOR:    Jambunathan K
#+EMAIL:     address@hidden
#+DATE:      2011-11-09 Wed
#+DESCRIPTION:
#+KEYWORDS:
#+LANGUAGE:  en
#+OPTIONS:   H:3 num:t toc:t \n:nil @:t ::t |:t ^:t -:t f:t *:t <:t
#+OPTIONS:   TeX:t LaTeX:t skip:nil d:nil todo:t pri:nil tags:not-in-toc
#+EXPORT_SELECT_TAGS: export
#+EXPORT_EXCLUDE_TAGS: noexport
#+LINK_UP:
#+LINK_HOME:
#+XSLT:

* Improvements to LaTeX->MathML handling in ODF exporter

Firstly, I felt a need for some support infrastructure for working with LaTeX fragments in the ODT exporter. I am listing a few things that I have added since our last interaction:

1. Dvipng images and math formulae created from LaTeX fragments will now carry the LaTeX fragment as metadata, i.e., in LibreOffice you can see the LaTeX source via Image/Equation->Right Click->Description.

2. New interactive commands: M-x org-export-as-odf and M-x org-export-as-odf-and-open.

   With these commands you can mark a LaTeX fragment and export it as an odf (OpenDocument formula) document. The MathML source will be available in the kill ring after the export. (See the docstrings.)

3. Embed an OpenDocument formula within the exported document by providing a link to a *.mathml or *.odf file as below.

   #+CAPTION: cases with MathJaX
   [[./mathjax-cases.odf]]

   A link with neither a caption nor a label will be formatted inline, while one with either or both of these attributes will be formatted as display.
> If an org file contains a latex equation with a '&' in it then when it is
> exported to odt it makes dodgy xml. Unzipping the odt, opening the
> content.xml and doing M-x rng-first-error gives the message:
>
> `&' that is not markup must be entered as `&amp;'
>
> To reproduce, insert this:
>
> \begin{equation}
> \delta_{mn} =
> \begin{cases}
> 1& \text{if $n=m$}\\
> 0& \text{if $n\nem$}
> \end{cases}
> \end{equation}
>
> (which I got from here
> http://www.mathtoweb.com/cgi-bin/mathtoweb_users_guide.pl , search for
> 'cases')
>
> into the file math-to-web-with-plastex.org in this post:
> http://permalink.gmane.org/gmane.emacs.orgmode/48815 and export as per
> instructions.
>
> There may be a similar error with equations containing '<', '>'.
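To make the failure quoted above concrete, here is a minimal Python sketch. It is not the exporter's code (that is Emacs Lisp); it only illustrates why a raw '&' in a text node is rejected by an XML parser and how escaping '&', '<' and '>' avoids the problem. The `<p>` wrapper element and the helper names are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Part of the LaTeX fragment from the report; it contains bare '&' characters.
LATEX = r"1& \text{if $n=m$}\\ 0& \text{if $n\nem$}"


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def wrap_raw(fragment):
    """Embed the fragment verbatim (hypothetical <p> element): the buggy way."""
    return "<p>" + fragment + "</p>"


def wrap_escaped(fragment):
    """Embed the fragment after escaping '&', '<' and '>'."""
    return "<p>" + escape(fragment) + "</p>"


if __name__ == "__main__":
    print("raw fragment well-formed:    ", is_well_formed(wrap_raw(LATEX)))
    print("escaped fragment well-formed:", is_well_formed(wrap_escaped(LATEX)))
```

Escaping is lossless for the reader of the document: parsing the escaped element gives back the original LaTeX text.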
#+TITLE:     diagnosis.org
#+AUTHOR:    Jambunathan K
#+EMAIL:     address@hidden
#+DATE:      2011-11-09 Wed
#+DESCRIPTION:
#+KEYWORDS:
#+LANGUAGE:  en
#+OPTIONS:   H:3 num:t toc:t \n:nil @:t ::t |:t ^:t -:t f:t *:t <:t
#+OPTIONS:   TeX:t LaTeX:t skip:nil d:nil todo:t pri:nil tags:not-in-toc
#+EXPORT_SELECT_TAGS: export
#+EXPORT_EXCLUDE_TAGS: noexport
#+LINK_UP:
#+LINK_HOME:
#+XSLT:
#+STARTUP: hideblocks

* Diagnosis

1) User provided LaTeX fragment

   #+begin_src latex
     \begin{equation}
     \delta_{mn} =
     \begin{cases}
     1& \text{if $n=m$}\\
     0& \text{if $n\nem$}
     \end{cases}
     \end{equation}
   #+end_src

2) Output from Plastex

   Note that the plastex output includes an extraneous SPACE in two places:
   - within `\text{blah}'
   - within `m$'

   #+begin_src latex
     \begin{equation}
     \delta _{mn} =
     \begin{cases}
     1& \text {if $n=m$}\\
     0& \text {if $n\nem $}
     \end{cases}
     \end{equation}
   #+end_src

   #+begin_src latex
     \begin{equation}
     \delta _{mn} =
     \begin{cases}
     1& \text{if $n=m$}\\
     0& \text{if $n\nem$}
     \end{cases}
     \end{equation}
   #+end_src

3) Output from MathToWeb

   The extraneous SPACE is unacceptable to MathToWeb and it complains.

   #+begin_src text
     Checking Syntax:
     ** -- found 1 syntax error(s) -- **
     (em) Nesting Error: $..$ can not be nested inside \begin{equation}..\end{equation} unless it is within a \text{..} environment.
     line: 1
     \begin{equation} \delta _{mn} ...
     ^
   #+end_src

   If I remove the extraneous SPACEs by hand, MathToWeb crashes.

   #+begin_src text
     Checking Syntax:
     *** -- no errors --
     >> stand-alone math environments: [1]
     Converting: 1Exception in thread "Thread-3" java.lang.ArrayIndexOutOfBoundsException: 1
         at MathToWeb.convertEachLatexMatrixToAMathMLExpression(MathToWeb.java:15511)
         at MathToWeb.doMatrixConversions(MathToWeb.java:3495)
         at MathToWeb.convertLatexToMathML(MathToWeb.java:2106)
         at ConvertLatexToMathMLThread.run(ConvertLatexToMathMLThread.java:64)
   #+end_src

   Question: Is the snippet output from plastex valid LaTeX? Depending on the answer, there is a bug in either plastex or mathtoweb.
   The moral of the story is that pre-processing the LaTeX fragment with plastex, while it may help in circumventing the ncf limitations of MathToWeb, may create side-effects that MathToWeb is allergic to.

4) How ODT handles LaTeX->MathML failures

   If ODT doesn't receive a <math>...</math> element, it assumes failure and tries to embed the LaTeX fragment verbatim into the exported document.

   There was a bug in embedding the LaTeX fragment as plain text in the ODT file, which you have reported as below. I have fixed this.

   #+begin_src text
     > If an org file contains a latex equation with a '&' in it then when it is
     > exported to odt it makes dodgy xml. Unzipping the odt, opening the
     > content.xml and doing M-x rng-first-error gives the message:
     >
     > `&' that is not markup must be entered as `&amp;'

     > There may be a similar error with equations containing '<', '>'.
   #+end_src

* Some comments on "cases"

** Bug in MathToWeb wrt cases

A comparison of [fn:1] and [fn:2] and a little experimentation with LibreOffice show the following issues with MathToWeb's handling of \begin{cases}...\end{cases}, to which LibreOffice is allergic.

1. MathJax uses:
   - <mfenced open="{" close="">...</mfenced>
   while MathToWeb uses:
   - <mo>{</mo> and <mphantom> } </mphantom>

2. MathJax limits the scope of <mtext>...</mtext> to just the "if" while MathToWeb extends the scope to the entire "sub-equation".

3. MathJax uses &#xA0; while MathToWeb uses &nbsp; for non-breaking space.

If 1, 2 and 3 are "hand-fixed" in the MathToWeb output, then LibreOffice not only opens the MathToWeb-produced formula fine but also displays it correctly[fn:3].

** A near-equivalent of MathToWeb's cases that is LibreOffice-friendly

The snippet below is a near-equivalent of the "cases" formulation that is also LibreOffice-friendly. See the attached "workable-alternative-to-cases.odf".
#+srcname: workable-alternative-to-cases
#+begin_src latex
  \begin{equation*}
  \delta_{mn} = \left\{
  \begin{smallmatrix}
  1 & \text{if } n=m \\
  0 & \text{if } n\nem
  \end{smallmatrix}
  \right\}
  \end{equation*}
#+end_src

The best alternative in LaTeX would be to use the \left\{ and \right. (note the "dot") construct as below[fn:4]. Unfortunately, MathToWeb fails miserably on it while MathJax succeeds with flying colors.

#+srcname: exact-equivalent-of-cases
#+begin_src latex
  \begin{equation*}
  \delta_{mn} = \left\{
  \begin{smallmatrix}
  1 & \text{if } n=m \\
  0 & \text{if } n\nem
  \end{smallmatrix}
  \right.
  \end{equation*}
#+end_src

* Workarounds

** Use plastex with discretion and consider MathJax as a potential option?

An example scenario where plastex creates undesirable side-effects has been seen earlier.

Interestingly, the original LaTeX fragment DOES NOT rely on any user-defined newcommand for interpretation and can be passed on to MathToWeb directly.

When the original snippet is exported with M-x org-export-as-odf-and-open RET, the export to odf goes through fine but LibreOffice fails to open the resulting formula[fn:1]. Also see the attached file "mathtoweb-cases.odf".

If I open the resulting odf file and overwrite "content.xml" with the MathML produced by MathJax[fn:2][fn:5][fn:6] (see the attached "mathjax-cases.odf"), LibreOffice is happy.

** Provide the MathML or OpenDocument formula directly in the Org file

One can provide the "right" MathML or OpenDocument formula directly in the Org file. The formula could either be created with LibreOffice's StarMath directly, or by using the output from LaTeX to MathML converters as a first cut[fn:7] and improving the results subsequently with LibreOffice.
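The "overwrite content.xml" step described under Workarounds can be scripted instead of done by hand. The following Python sketch assumes only what the text above states, namely that the .odf file is an archive containing a member named content.xml; the function name `replace_content_xml` is hypothetical. Every other member is copied unchanged and in the original order.

```python
import io
import zipfile


def replace_content_xml(odf_bytes, new_content):
    """Return a copy of the archive with content.xml replaced by new_content.

    odf_bytes   -- the bytes of the original .odf (zip) file
    new_content -- the replacement MathML document, as bytes
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as src, \
         zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():
            if info.filename == "content.xml":
                data = new_content
            else:
                data = src.read(info.filename)
            # Reusing the ZipInfo keeps each member's name, order and
            # compression method as they were in the source archive.
            dst.writestr(info, data)
    return out.getvalue()


if __name__ == "__main__":
    import sys
    # Usage: python replace_content.py in.odf new-content.xml out.odf
    src_path, xml_path, dst_path = sys.argv[1:4]
    with open(src_path, "rb") as f:
        original = f.read()
    with open(xml_path, "rb") as f:
        mathml = f.read()
    with open(dst_path, "wb") as f:
        f.write(replace_content_xml(original, mathml))
```

Whether LibreOffice accepts the result still depends on the MathML itself, as the comparison of [fn:1] and [fn:2] shows.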
* Footnotes

[fn:1]

#+srcname: output-from-mathtoweb-for-cases
#+begin_src nxml
  <?xml version="1.0" encoding="UTF-8"?>
  <math xmlns="http://www.w3.org/1998/Math/MathML">
    <mrow>
      <mspace width="1.00em" />
      <msub>
        <mi>δ</mi>
        <mrow>
          <mi>m</mi>
          <mi>n</mi>
        </mrow>
      </msub>
      <mo>=</mo>
      <mrow>
        <mo>{</mo>
        <mtable class="m-cases" columnalign="left">
          <mtr>
            <mtd>
              <mn>1</mn>
            </mtd>
            <mtd>
              <mtext>if  <math xmlns="http://www.w3.org/1998/Math/MathML">
                  <mrow>
                    <mi>n</mi>
                    <mo>=</mo>
                    <mi>m</mi>
                  </mrow>
                </math>
              </mtext>
            </mtd>
          </mtr>
          <mtr>
            <mtd>
              <mn>0</mn>
            </mtd>
            <mtd>
              <mtext>if  <math xmlns="http://www.w3.org/1998/Math/MathML">
                  <mrow>
                    <mi>n</mi>
                    <mo>≠</mo>
                    <mi>m</mi>
                  </mrow>
                </math>
              </mtext>
            </mtd>
          </mtr>
        </mtable>
        <mphantom> } </mphantom>
      </mrow>
    </mrow>
  </math>
#+end_src

[fn:2]

#+srcname: output-from-mathjax-for-cases
#+begin_src nxml
  <math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
    <msub>
      <mi>δ<!-- δ --></mi>
      <mrow>
        <mi>m</mi>
        <mi>n</mi>
      </mrow>
    </msub>
    <mo>=</mo>
    <mfenced open="{" close="">
      <mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
        <mtr>
          <mtd>
            <mn>1</mn>
          </mtd>
          <mtd>
            <mtext>if </mtext>
            <mrow>
              <mi>n</mi>
              <mo>=</mo>
              <mi>m</mi>
            </mrow>
          </mtd>
        </mtr>
        <mtr>
          <mtd>
            <mn>0</mn>
          </mtd>
          <mtd>
            <mtext>if </mtext>
            <mrow>
              <mi>n</mi>
              <mtext mathcolor="red">\nem</mtext>
            </mrow>
          </mtd>
        </mtr>
      </mtable>
    </mfenced>
  </math>
#+end_src

[fn:3] Would you like to forward this as a bug report to the MathToWeb team?

[fn:4] For LaTeX, see the last example here: http://www.maths.tcd.ie/~dwilkins/LaTeXPrimer/Matrices.html

[fn:5] MathJax doesn't seem to handle \ne well.

[fn:6] Is there a command-line interface to MathJax? This would make MathJax a potential alternative to MathToWeb. If there is no command-line converter, can someone reverse-engineer the MathJax javascript and see what magic it does over the network or cloud?

[fn:7] In the case of matrices, MathToWeb produces MathML which displays fine save for some characters that are displayed as "questions".
Jambunathan K.
diagnosis.org
Description: Text Data
improvements.org
Description: Text Data
workable-alternative-to-cases.odf
Description: application/vnd.oasis.opendocument.formula
mathjax-cases.odf
Description: application/vnd.oasis.opendocument.formula
mathtoweb-cases.odf
Description: application/vnd.oasis.opendocument.formula