Re: XML tools for Octave

octave-maintainers


From:	Andy Adler
Subject:	Re: XML tools for Octave
Date:	Thu, 29 Jun 2006 12:14:31 -0400 (EDT)

On Thu, 29 Jun 2006, Bill Denney wrote:

On Thu, 29 Jun 2006, Andy Adler wrote:
On Thu, 29 Jun 2006, Bill Denney wrote:
Andy Adler wrote:
I'm trying to write this. My idea is that XML like this
<a b="c" d="e"> text <f g="h"/> more text <f>data</f> </a>
Shouldn't it parse as something more like

v.a.ATTS.b = "c"
v.a.ATTS.d = "e"
v.a.CHILD{1} = " text "
v.a.CHILD{2}.f.ATTS.g => "h"
v.a.CHILD{3} = " more text "
v.a.CHILD{4}.f.CHILD{1} => "data"
My concern is that this output makes writing software to parse the XML output really frustrating - you need to loop through the CHILD vectors to find what you're looking for. This would result in people taking shortcuts that make the code fragile.
But not doing it this way would make an incorrect representation for nested structures: what about just parsing XHTML like
<p>abcd <i>efg</i> jklm</p>

would turn into

v.p.TEXT = 'abcd jklm';
v.p.i.TEXT = 'efg';
which would not reverse correctly because all of these could turn into the above:
<p>abcd<i>efg</i> jklm</p>
<p><i>efg</i>abcd jklm</p>
<p>abcd jklm<i>efg</i></p>
...
To me, it should be a reversible transformation. Also, without keeping the order, you would lose the ability to do full XSLT interpretation (what do you do about sibling commands?).
This is actually a big debate in the XML semantics community - the fact that XML does not map easily to data structures.
I realize that it doesn't map easily, but it should map reversibly.

I honestly don't know how to address this. Your suggestion would
basically mean that we expose the full DOM API. Matlab did this with
a thin wrapper over Java's DOM.

However, this is a really bad idea. It means that all the effort of
XML parsing falls on the user - who will either make mistakes or
take shortcuts, resulting in really fragile code.

For example a user will parse
     <data><item> 1 </item></data>

using
   v.data.CHILD{1}.item.TEXT{1}

But this will break when you have
     <data> <item> 1 </item></data>

or
     <data><metadata/><item> 1 </item></data>


So I don't think that pushing all the complexity to the user is right.
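
A minimal Octave sketch of this failure mode. No parser exists yet, so the structs are built by hand to mimic what a CHILD-style parser might return for the three documents above. The exact layout and the helper name find_child are assumptions made for illustration only.

```octave
1;  % mark this file as a script so the function below can be defined inline

% Hypothetical helper (not part of any existing package): return the first
% child element called NAME, skipping text nodes and other elements.
% Returns [] when no such child exists.
function el = find_child (children, name)
  el = [];
  for k = 1:numel (children)
    c = children{k};
    if (isstruct (c) && isfield (c, name))
      el = c.(name);
      return;
    endif
  endfor
endfunction

% Assumed parser output for the element <item> 1 </item>
item = struct ("TEXT", {{" 1 "}});

% <data><item> 1 </item></data>
v1.data.CHILD = {struct("item", item)};
% <data> <item> 1 </item></data>   (the leading blank becomes a text node)
v2.data.CHILD = {" ", struct("item", item)};
% <data><metadata/><item> 1 </item></data>
v3.data.CHILD = {struct("metadata", struct ()), struct("item", item)};

% Positional access works for v1 only; for v2 the first child is a string.
try
  x = v2.data.CHILD{1}.item.TEXT{1};
catch err
  disp (err.message);
end_try_catch

% Name-based lookup copes with all three documents.
docs = {v1, v2, v3};
for k = 1:numel (docs)
  el = find_child (docs{k}.data.CHILD, "item");
  printf ("doc %d: item text = '%s'\n", k, el.TEXT{1});
endfor
```

Every user of the CHILD layout would need a loop like find_child, which is the work that the positional shortcut skips.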

Somehow it should be easy to do easy things, but possible to do correct things.


How about:
 <a b="c" d="e"> text <f g="h"/> more text <f>data</f> </a>

 v.a.ATTS.b      : "c"
 v.a.ATTS.d      : "e"
 v.a.TEXT{1}     : " text "
 v.a.TEXT{2}     : " more text "
 v.a.TEXT{3}     : " "
 v.a.f{1}.g      : "h"
 v.a.f{2}.TEXT{1}: "data"

and some extra information in:

 v.a.NAMESPACE
 v.a.ORDEREDELEMS
 v.a.UTFNAMES
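
A rough Octave sketch of how this layout could serve both goals, with the struct built by hand. The message does not say what ORDEREDELEMS contains, so the N-by-2 cell of {field name, index} pairs used here is purely an assumption, chosen because it is enough to recover document order.

```octave
% Hand-built struct for
%   <a b="c" d="e"> text <f g="h"/> more text <f>data</f> </a>
a.ATTS.b = "c";
a.ATTS.d = "e";
a.TEXT = {" text ", " more text ", " "};
a.f = {struct("g", "h"), struct("TEXT", {{"data"}})};

% ASSUMPTION: one row per child, in document order: {field name, index}
a.ORDEREDELEMS = {"TEXT", 1; "f", 1; "TEXT", 2; "f", 2; "TEXT", 3};
v.a = a;

% Easy things stay easy: direct access by name, no looping over children.
disp (v.a.f{2}.TEXT{1});

% Correct things stay possible: walk ORDEREDELEMS to recover the
% original interleaving of text and elements.
order = v.a.ORDEREDELEMS;
seq = cell (1, rows (order));
for k = 1:rows (order)
  name = order{k, 1};
  idx = order{k, 2};
  if (strcmp (name, "TEXT"))
    seq{k} = sprintf ("text(%s)", v.a.TEXT{idx});
  else
    seq{k} = sprintf ("<%s #%d>", name, idx);
  endif
endfor
disp (strjoin (seq, " | "));
```

With the order stored separately, whitespace or extra siblings do not change how v.a.f{2} is addressed, while a serializer can still reproduce the input sequence.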

This is starting to look like a big project ;-<


--
Andy Adler <address@hidden> 1(613)562-5800x6218
