From: Andrew Smellie
Subject: [Grammatica-users] Parsing data out of an html file
Date: Mon, 7 Feb 2011 00:26:05 -0500

Hi,

I have a long and complex HTML file that contains a small piece of well-formatted data inside it. I need to write a grammar of the type:

  Skip over the garbage
  Read what I need
  Skip the rest of the garbage

Here is an example:

  </div>
  <!--
  <div class="endOfDay">
  END OF PLAY REPORTS
  <div class="endOfDayLinks">
  <div class="endOfDayLeft"><a href="">
  <div class="endOfDayRight"><a href="">
  </div>
  </div>
  -->
  </div>

I want to parse the line

  <div class="endOfDayLeft"><a href=""

and ignore everything else.

I have tried defining a “skip everything” token and then special-casing what I want:

  SKIP_EVERYTHING = <<.*>> %ignore%
  WHITESPACE = <<[ \t\n\r\d]+>> %ignore%

But I keep getting a parsing exception at the end of the file.

Thanks in advance for any help,
Andrew
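
For illustration only, the skip / read / skip behaviour asked for above can be sketched outside Grammatica with a plain regular-expression scan. This is not a Grammatica grammar and not a fix for the parsing exception; it is a minimal Python sketch of the intended result. The names `TARGET`, `extract_hrefs`, and `sample` are hypothetical, and the sample text is the example from the message.

```python
import re

# Hypothetical pattern: match only the tag of interest and capture its href value.
TARGET = re.compile(r'<div class="endOfDayLeft"><a href="([^"]*)"')


def extract_hrefs(html):
    """Return the href of every endOfDayLeft link; all other text is skipped."""
    # finditer scans forward and silently passes over anything that does not match,
    # which plays the role of "skip the garbage" before and after each hit.
    return [match.group(1) for match in TARGET.finditer(html)]


sample = """</div>
<!--
<div class="endOfDay">
END OF PLAY REPORTS
<div class="endOfDayLinks">
<div class="endOfDayLeft"><a href="">
<div class="endOfDayRight"><a href="">
</div>
</div>
-->
</div>"""

# The href in the example is empty, so the single result is an empty string.
print(extract_hrefs(sample))
```

The scan looks for the wanted construct directly instead of first defining a token that matches everything, so the surrounding markup never has to be described.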