
[Grammatica-users] Parsing data out of an html file


From: Andrew Smellie
Subject: [Grammatica-users] Parsing data out of an html file
Date: Mon, 7 Feb 2011 00:26:05 -0500

Hi,

I have a long and complex HTML file that contains a small piece of well-formatted data. I need to write a grammar that does the following:

Skip over the garbage
Read what I need
Skip the rest of the garbage

Here is an example:

 

        </div>
          <!--
          <div class="endOfDay"> END OF PLAY REPORTS
            <div class="endOfDayLinks">
              <div class="endOfDayLeft"><a href="">
              <div class="endOfDayRight"><a href="">
            </div>
          </div>
          -->
        </div>

 

I want to parse the line <div class="endOfDayLeft"><a href=""> and ignore everything else.
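
To make the intent concrete, here is a rough Java sketch of the skip/read/skip behaviour I am after, written with plain java.util.regex rather than Grammatica. The class name EndOfDayExtractor and the method findHref are hypothetical names used only for illustration, and the sketch assumes the target line always has exactly the form shown in the example.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: illustrates the desired behaviour, not a Grammatica grammar.
class EndOfDayExtractor {
    // Matches only the line of interest and captures the href value.
    // Assumes the attribute order and quoting shown in the example.
    private static final Pattern TARGET =
        Pattern.compile("<div class=\"endOfDayLeft\"><a href=\"([^\"]*)\"");

    // Returns the href of the first endOfDayLeft link, or empty if none is found.
    static Optional<String> findHref(String html) {
        Matcher m = TARGET.matcher(html);
        // find() scans forward past non-matching text, so the surrounding
        // "garbage" before and after the match is never examined in detail.
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String html =
            "<div class=\"endOfDay\"> END OF PLAY REPORTS\n"
          + "  <div class=\"endOfDayLeft\"><a href=\"\">\n"
          + "  <div class=\"endOfDayRight\"><a href=\"\">\n"
          + "</div>\n";
        System.out.println(findHref(html).isPresent());
    }
}
```

What I am looking for is the equivalent of this in a Grammatica grammar, where the tokenizer discards everything except the one construct I care about.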

 

I have tried defining a “skip everything” token and then special-casing what I want:

 

SKIP_EVERYTHING         = <<.*>> %ignore%

WHITESPACE              = <<[ \t\n\r\d]+>> %ignore%

 

But I keep getting a parsing exception at the end of the file.

 

Thanks in advance for any help.

 

Andrew

