Re: Parsing entities in HTML input
From: Kacper Gutowski
Subject: Re: Parsing entities in HTML input
Date: Fri, 17 Apr 2020 19:04:35 +0200
On Thu, Apr 16, 2020 at 03:29:08PM +0200, Dr. Jürgen Sauermann wrote:
> fixed in SVN 1262.
Thanks!
Angle brackets are now correctly converted at the end of lines too.
But parsing numeric entities decimally was actually correct. Now
doing a )DUMP-HTML followed by )COPY or )LOAD changes all ampersands
into a digit eight. The )DUMP-HTML encodes "&" as "&#38;" which
is a correct, decimal representation of it. At r1262, this gets
incorrectly parsed as hexadecimal yielding "8".
As far as HTML goes, an ampersand could also be encoded as "&amp;"
(which is the most common) or hexadecimally as "&#x26;" (note the "x").
As a side note, numeric references could be of any length, not just
two digits (it could be "&#0038;" as well), but that doesn't matter
as long as the subset that )DUMP-HTML produces can be parsed.
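
To illustrate the decimal/hexadecimal distinction, here is a minimal sketch in Python. It is not GNU APL's actual parser: the names decode_refs, _NAMED and _REF are hypothetical, and the table of named entities is a small assumed subset. A reference of the form "&#NNN;" is read as decimal, "&#xHHH;" as hexadecimal, and the digit count is not limited.

```python
import re

# Hypothetical helper for illustration only; not taken from GNU APL.
# Assumed minimal set of named entities.
_NAMED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"'}

# Hex form (&#x...;), decimal form (&#...;), or a named entity (&name;).
_REF = re.compile(r"&(#[xX][0-9A-Fa-f]+|#[0-9]+|[A-Za-z]+);")

def decode_refs(text):
    """Replace character references in text with the characters they name."""
    def repl(match):
        body = match.group(1)
        if body[0] == "#":
            if body[1] in "xX":
                # "&#x26;": digits after the "x" are hexadecimal.
                return chr(int(body[2:], 16))
            # "&#38;": no "x", so the digits are decimal.
            return chr(int(body[1:], 10))
        # Unknown names are left untouched. No range check is done on
        # numeric values in this sketch.
        return _NAMED.get(body, match.group(0))
    return _REF.sub(repl, text)

print(decode_refs("&#38; &#x26; &amp; &#0038;"))

# The misparse described above: reading the digits of "&#38;" as hex.
print(chr(int("38", 16)))
```

Reading "38" as hexadecimal gives code point 56, which is the digit "8"; reading it as decimal gives code point 38, which is the ampersand.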
> If you like the )DUMP-HTML command then you may like the ]DOXY command as
> well:
>
> https://www.gnu.org/software/apl/apl.html#Section-3_002e8
Oh yes, I do. It's pretty nice, especially for exploring how larger
workspaces like the Toronto Toolkit work.
I just noticed that ]DOXY gets some dependencies wrong: in the Toolkit,
there's a function "julian" which is shown to be calling "date", but in
fact it named its right argument "date" and doesn't call the function.
The Toronto Toolkit didn't contain any ampersands ;)
-k