Re: GLR ambiguity
From: Alessandro Di Marco
Subject: Re: GLR ambiguity
Date: Thu, 14 Jun 2007 15:09:31 +0200
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1.50 (gnu/linux)
Hans Aberg <address@hidden> writes:
On 14 Jun 2007, at 14:46, Alessandro Di Marco wrote:
> In American English, sentences like the following are quite common:
>
> 1) "blah blah blah". Some more blah...
> 2) "blah blah blah." Some more blah...
> 3) "blah blah blah. Some more blah...
>
> Now, the third one causes the problem. For example, here is an excerpt
> that fools my parser:
>
> Party chairwoman Hazel Blears was accused by the Conservatives of
> scapegoating immigrants after saying in an Independent on Sunday newspaper
> interview: "We have got areas in Salford where private landlords are
> letting properties with 10 and 12 people in there. "Now, the community
> doesn't object to the people - they object to the exploitation and the
> fact that that leads to people being on the street drinking, anti-social
> behaviour." Welsh Secretary Peter Hain, meanwhile accused Home Secretary
> John Reid of "fanning up" last week's row over stop-and-question powers
> possibly being rolled out across the UK.
One way around this is to feed a UTF-8 .ly file to Flex and require that the
proper Unicode quotation marks “...” be used, i.e. U+201C and U+201D. When
U+201C arrives in the lexer, start parsing a quotation string. If the closing
U+201D has not arrived by the time the paragraph closes (or whatever enclosing
block the quotation cannot outlive), issue an error.
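To make the idea concrete, here is a rough sketch of that check in Python rather than in Flex. It is only an illustration: the function name check_quotations is hypothetical, a paragraph is assumed to end at a blank line, and nested or stray quotation marks are simply ignored.

```python
OPEN_QUOTE = "\u201c"   # LEFT DOUBLE QUOTATION MARK
CLOSE_QUOTE = "\u201d"  # RIGHT DOUBLE QUOTATION MARK


def check_quotations(text):
    """Return (paragraph_index, offset) for each quotation still open
    when its paragraph ends.

    Assumption: paragraphs are separated by a blank line.
    """
    errors = []
    for para_index, paragraph in enumerate(text.split("\n\n")):
        open_at = None  # offset of the currently open quotation, if any
        for offset, ch in enumerate(paragraph):
            if ch == OPEN_QUOTE and open_at is None:
                open_at = offset      # start of a quotation string
            elif ch == CLOSE_QUOTE:
                open_at = None        # quotation closed (stray closes ignored)
        if open_at is not None:
            errors.append((para_index, open_at))  # paragraph ended first
    return errors
```

Calling check_quotations on a text whose second paragraph opens a quotation and never closes it returns one entry pointing at that opening mark.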
Thanks for the suggestion; unfortunately it is not viable because the text is
plain ASCII. By looking at the spaces around the quotes I could get a similar
effect, but there should be something better... isn't there?
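For reference, the whitespace heuristic could look roughly like the sketch below. It is not a real solution, just an illustration: the names classify_quotes and find_continuations are made up, and the rule (a quote after a space is opening, a quote before a space or punctuation is closing) is an assumption that will misfire on badly spaced text.

```python
def classify_quotes(text):
    """Label each ASCII double quote as 'open', 'close' or 'ambiguous',
    looking only at the characters immediately around it."""
    labels = []
    for i, ch in enumerate(text):
        if ch != '"':
            continue
        before = text[i - 1] if i > 0 else " "
        after = text[i + 1] if i + 1 < len(text) else " "
        gap_before = before.isspace() or before in "([{"
        gap_after = after.isspace() or after in ".,;:!?)]}"
        if gap_before and not gap_after:
            labels.append((i, "open"))
        elif gap_after and not gap_before:
            labels.append((i, "close"))
        else:
            labels.append((i, "ambiguous"))
    return labels


def find_continuations(text):
    """Return offsets of opening quotes seen while a quotation is already
    open, i.e. case 3: the quotation continues without having been closed."""
    inside = False
    continuations = []
    for pos, kind in classify_quotes(text):
        if kind == "open":
            if inside:
                continuations.append(pos)
            inside = True
        elif kind == "close":
            inside = False
    return continuations
```

On a shortened version of the excerpt, this marks the quote before "Now" as a continuation, while the quotes around "fanning up" pair up normally.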
Thanks again.
Alessandro
--
The best inheritance a parent can give his children is a few minutes of his
time each day. - O. A. Battista