Re: [Chicken-users] Cryptic SSAX error message
From: Peter Bex
Subject: Re: [Chicken-users] Cryptic SSAX error message
Date: Thu, 19 Mar 2015 09:07:27 +0100
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, Mar 18, 2015 at 08:56:06PM -0600, Matt Gushee wrote:
> On Tue, Mar 17, 2015 at 2:13 AM, Peter Bex <address@hidden>
> wrote:
> > You shouldn't parse HTML with an XML parser.
>
> Not in general, no. But wouldn't you agree that, regardless of what is
> wrong with the input file and why it is wrong, it would be good if SSAX
> output something that would actually be useful in troubleshooting? That was
> my main point.
Oh, I definitely agree. I just wanted to point out that parsing HTML
with an XML parser is never going to work, in case you were expecting
it to.
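To illustrate why, here is a minimal sketch in Python (standard library
only, used as a stand-in; it says nothing about how SSAX or the
html-parser egg behave internally). The snippet and the helper names
xml_parse_error, TagCollector and html_tags are made up for this
example. An unclosed <br> is perfectly ordinary HTML, but it is not
well-formed XML:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical input: valid HTML, but <br> is never closed.
HTML_SNIPPET = "<html><body><p>Hello<br>world</p></body></html>"
# The same content written as well-formed XML (XHTML style).
XHTML_SNIPPET = "<html><body><p>Hello<br/>world</p></body></html>"


def xml_parse_error(text):
    """Return the XML parser's error message, or None if text is well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None


class TagCollector(HTMLParser):
    """Permissive HTML parser that just records the start tags it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tags(text):
    """Return the start tags found by the HTML parser, in document order."""
    parser = TagCollector()
    parser.feed(text)
    parser.close()
    return parser.tags


if __name__ == "__main__":
    print("XML parser on HTML: ", xml_parse_error(HTML_SNIPPET))
    print("XML parser on XHTML:", xml_parse_error(XHTML_SNIPPET))
    print("HTML parser on HTML:", html_tags(HTML_SNIPPET))
```

The XML parser gives up at the </p> that does not match the open <br>,
while the HTML parser walks through the same input without complaint.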
> And of course, as I mentioned, I'm well aware that desirable
> != doable, but I didn't (and don't) know if this is a known issue, so I
> thought I should say something.
And I, at least, appreciate the bug report :) But see my PS at the end.
> > Since you're using CHICKEN,
> > you could try the html-parser CHICKEN egg, which is more permissive.
>
> But that's not the goal. Perhaps you recall this discussion from 2 years
> ago?
>
> >> [Matt]
> >> Finally, an idea has occurred to me. What about a templating system where
> >> what actually gets used at runtime is SXML, but designers could create
> >> templates in XHTML, then when they are satisfied with the design, use a
> >> preprocessing tool to convert them to SXML? That would at least ensure
> >> well-formed markup.
>
> > [Peter]
> > Yep, that would be good. Representation and surface syntax don't
> > necessarily need to be equivalent, though the Lisper in me disagrees
> > about that being a good idea :)
> REF:
> http://lists.nongnu.org/archive/html/chicken-users/2013-03/msg00058.html
That's a rather different point I was making at the time; I was arguing
for using SXML directly. The point I'm making now is that if you want
to parse HTML, you should use an HTML parser. But IIUC you're saying that
this was a bit of a mistake in one particular template you made, and you
really mean for the templates to be strict XML?
> So Civet is the templating system I created pursuant to that conversation.
> The templates are supposed to be well-formed XML (in practice, mainly
> XHTML), and presumably created by a developer who knows what they're doing
> - though the current issue may call that into question ;-).
:)
> I certainly don't believe my approach is ideal from a purely technical
> standpoint. But given that the meta-goal of my projects is to "use Scheme
> to create web development tools that might be used by people who don't know
> Scheme" (as opposed to "use Scheme to develop websites"), I think it's
> about as good a compromise as can be expected. If I were creating Civet
> today, I think I would look for a different approach - but mainly because
> it is now clear (maybe it was in 2013 and I just didn't know it) that HTML5
> (in non-XML syntax) is becoming dominant, and the never-popular XHTML is
> dying, if not dead. But I still stand by the fundamental reasoning that led
> to Civet as it is (and BTW, it works pretty well within its limitations -
> you should try it ;-)
I don't create websites for nontechnical people anymore, and I'm very
glad about it; it's pretty thankless and unsatisfying work. However,
one of the things that made the job more painful than it had to be was
the shitty CMSes out there (we used Drupal, but the other systems out
there suck in a variety of different ways), so I'm sure a CMS in Scheme
is going to make the lives of people like my former self more pleasant.
> > I *think* XHTML Strict is a proper XML application, but I'm not 100% sure,
>
> I'm not 100% sure either, but if the W3C says it's XML, they most likely
> mean it is completely well-formed. One thing I know is that it prohibits
> inline CSS and JavaScript - and now I understand why.
Yeah, there are lots of hairy issues, especially with inline JavaScript
where it breaks the "normal" parsing rules, as some people noticed
in #chicken the other day. It's no wonder the web is so riddled with
security issues!
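A small sketch of the inline JavaScript problem, again in Python with
only the standard library (the script content and the helper name
is_well_formed are invented for illustration). In HTML the body of
<script> is raw text, so a bare "<" or "&&" is fine there; to an XML
parser the same characters are markup, unless the script is wrapped in
a CDATA section:

```python
import xml.etree.ElementTree as ET

# Hypothetical inline script, as it would typically be written in HTML.
RAW = "<script>if (a < b && c) { run(); }</script>"
# The same script wrapped in CDATA so that it is well-formed XML.
WRAPPED = "<script><![CDATA[if (a < b && c) { run(); }]]></script>"


def is_well_formed(text):
    """Return True if the XML parser accepts text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("raw script well-formed?    ", is_well_formed(RAW))
    print("CDATA-wrapped well-formed? ", is_well_formed(WRAPPED))
    print("recovered script text:     ", ET.fromstring(WRAPPED).text)
```

The raw form is rejected by the XML parser, and the CDATA form is
accepted with the script text coming back unchanged.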
PS: I replied using my sourceforge address; for some reason I thought
you were reporting this to the SSAX mailing list. That would be the
correct place to report this bug; we just import the upstream code as-is.
Cheers,
Peter