emacs-orgmode

Re: [O] Parsing Org-mode in Python


From: Brett Viren
Subject: Re: [O] Parsing Org-mode in Python
Date: Thu, 09 Jan 2014 09:13:39 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Hi Daniel,

Daniel Clemente <address@hidden> writes:

>   Are there already Python parsers for it?

Parsing generic JSON is fairly trivial in Python.

  import json
  data = json.load(open('file.json'))

The resulting "data" is then a bunch of Python lists and/or dicts
matching whatever structure was output from org and is in the .json
file.  The schema in these three contexts is (will be) identical.
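
A slightly fuller sketch of the loading step, for illustration only.  The
sample document and the helper name load_org_json are made up here, and
the real exporter output may be shaped differently:

```python
import json
import os
import tempfile

# Hypothetical sample standing in for exporter output; the real schema may differ.
SAMPLE = '["org-data", {}, ["headline", {"raw-value": "Hello", "level": 1}]]'


def load_org_json(path):
    """Read a JSON file and return the equivalent Python lists and dicts."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Write the sample to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".json", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(SAMPLE)

data = load_org_json(tmp.name)
os.remove(tmp.name)

# "data" is now plain Python: a list whose third item is itself a list.
headline = data[2]
print(headline[0], headline[1]["level"])
```

Note that json.load reads from a file object, while json.loads parses a
string that is already in memory.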

At this point, Pythonistas can do what they want with "data".  Although,
as I mentioned, I'd like to put another layer on this "raw" data
structure which expresses/enforces the org schema as understood by the
org-exporter.  If I can figure out how to dump a representation of this
schema from org I'll express it as a set of generated
collections.namedtuple instances.  We'll see.
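
One way such a layer might look, as a rough sketch.  The SCHEMA dict
below is invented for illustration (the real one would be dumped from
org), and build is a hypothetical helper, not existing code:

```python
from collections import namedtuple

# Hypothetical schema: element type -> allowed property names.
# The real schema would have to come from org itself.
SCHEMA = {
    "headline": ["level", "raw_value"],
    "paragraph": ["begin", "end"],
}

# Generate one namedtuple class per element type, each with an extra
# "children" field for nested elements.
CLASSES = {
    name: namedtuple(name, fields + ["children"])
    for name, fields in SCHEMA.items()
}


def build(element_type, properties, children=()):
    """Construct a typed element from raw data.

    Raises KeyError for an unknown element type and TypeError when the
    properties do not match the schema (missing or unexpected names).
    """
    cls = CLASSES[element_type]
    return cls(children=tuple(children), **properties)


para = build("paragraph", {"begin": 10, "end": 42})
head = build("headline", {"level": 1, "raw_value": "Hello"}, [para])
print(head.raw_value, head.children[0].end)
```

The enforcement here is only as strong as namedtuple's constructor:
it checks field names, not value types.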

>   Should ox-json's output be as raw as possible (e.g. what your code
> produces now) or transformed to simpler JSON?
>   (I think both formats should coexist).

I suppose it may be useful to "winnow down" the structure.  One
thing I'm thinking about here is the narrowing done to support the "blog
from anywhere" feature of Karl's lazyblorg mentioned in this thread.

That can be done either on the emacs side or Python side (or both, in
principle).  However, my intention is to modify the org document
structure as little as possible on the emacs side in order to preserve
details that may possibly be interesting on the Python-side in the
future.  Also, I'm still learning LISP but know Python fairly well so
would rather do as much processing as possible on the Python side. :)

So far the only thing I see that needs to be stripped is the :parent
property (and the :structure, which really should be resolved as a copy
instead of being stripped) which cause the emacs-side data structure to
become a Circular Object and thus break the emacs JSON dumper.  
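
To show why the back-reference is a problem, here is a Python analogue.
The actual stripping has to happen on the emacs side before dumping;
this sketch only mirrors the idea.  The key name "parent" and the helper
strip_keys are assumptions for illustration:

```python
import json


def strip_keys(obj, drop=("parent",)):
    """Return a copy of obj with the named dict keys removed at every level.

    Dropped keys are never descended into, so a cycle that runs only
    through those keys is broken.
    """
    if isinstance(obj, dict):
        return {k: strip_keys(v, drop) for k, v in obj.items() if k not in drop}
    if isinstance(obj, (list, tuple)):
        return [strip_keys(v, drop) for v in obj]
    return obj


# Build a small circular structure: the child points back at its parent.
root = {"type": "headline", "children": []}
child = {"type": "paragraph", "parent": root}
root["children"].append(child)

try:
    json.dumps(root)
    circular_detected = False
except ValueError:
    # json.dumps checks for circular references by default.
    circular_detected = True

clean = strip_keys(root)
text = json.dumps(clean, sort_keys=True)
print(circular_detected, text)
```

Only "parent" is dropped by default here, since :structure would be
better resolved as a copy than removed.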

I just noticed that Python's JSON dumper can do this kind of stripping
implicitly and in general.  It might be nice if someone were to add such
a feature to the emacs JSON dumper but I don't plan to try this.

-Brett.
