[Grammatica-users] Parsing data out of an html file

I have a long and complex HTML file that contains a small piece of well-formatted data. I need to write a grammar of the following form:

Skip over the garbage

Read what I need

Skip the rest of the garbage

Here is an example:

</div>

<!--

<div class="endOfDay"> END OF PLAY REPORTS

</div>

-->

</div>

I want to parse the line <div class="endOfDayLeft"><a href="" and ignore everything else.
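
To make the goal concrete, here is a minimal sketch of the skip / read / skip behaviour written in plain Python rather than as a Grammatica grammar. It is only meant to show the result I am after, not how Grammatica should do it. The pattern comes from the line quoted above; the function name, the sample text, and the href value report.html are hypothetical and exist only for illustration.

```python
import re
from typing import List

# Opening of the wanted line, as quoted in this post. The capture group
# grabs whatever sits between the quotes of the href attribute.
WANTED = re.compile(r'<div class="endOfDayLeft"><a href="([^"]*)"')


def extract_hrefs(html: str) -> List[str]:
    """Return the href of every wanted line; all other text is skipped."""
    # finditer scans forward through the text, so anything that does not
    # match the pattern is passed over without being tokenised at all.
    return [m.group(1) for m in WANTED.finditer(html)]


# Hypothetical sample: the surrounding markup from the example, plus one
# made-up line of the kind I want to read.
sample = """</div>
<!--
<div class="endOfDay"> END OF PLAY REPORTS
</div>
-->
<div class="endOfDayLeft"><a href="report.html">Report</a>
</div>
"""

print(extract_hrefs(sample))
```

The search never has to describe the garbage, only the piece of interest, which is the effect I am hoping to reproduce with tokens in the grammar.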

I have tried defining a “skip everything” token and then special-casing what I want:

SKIP_EVERYTHING = <<.*>> %ignore%

WHITESPACE = <<[ \t\n\r\d]+>> %ignore%

But I keep getting a parsing exception at the end of the file.

Thanks in advance for any help.

Andrew

From:	Andrew Smellie
Subject:	[Grammatica-users] Parsing data out of an html file
Date:	Mon, 7 Feb 2011 00:26:05 -0500