Re: [Chicken-users] New project


From: Peter Bex
Subject: Re: [Chicken-users] New project
Date: Sat, 10 Sep 2011 22:11:20 +0200
User-agent: Mutt/1.4.2.3i

On Sat, Sep 10, 2011 at 01:02:05PM -0700, Steve Graham wrote:
> Although I have programmed for decades, I am new to Scheme.  I thought that I 
> would use a project I've been contemplating as a help in learning the 
> language.
> 
> There is a website of scriptures which I wish to download.  Some of the 
> webpages, of course, are indices into books and then chapters.  So I would 
> need to follow the links until I got to the actual text, which I would then 
> file into a database by volume, book, chapter and verse.
> 
> I would appreciate any hints as to how to do this.  I'm thinking I would need 
> some pointers with downloading web pages, stripping HTML and saving to a 
> database.

Well, that's an extremely broad and open-ended question.  You probably
want to start by becoming familiar with the libraries that can help
you with this task.

For downloading web pages you can use the http-client egg.  For parsing
HTML into SXML you can use html-parser or htmlprag, and for stripping the
tags you can hand-roll something or use one of the sxml-transforms or
sxpath eggs.
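
To give an idea of what "hand-rolling" the tag stripping could look
like, here is a minimal sketch in plain Scheme.  It assumes the usual
SXML shape, (tag (@ attributes...) children...), and the names
skipped-tags, sxml->strings and sxml->text are made up for this
example; they are not part of any egg.  The commented-out lines at the
end assume CHICKEN 5 import syntax (CHICKEN 4 would say "use" instead),
a placeholder URL, and that html->sxml reads from the current input
port when called without arguments, so check the egg documentation
before relying on them.

```scheme
;; Nodes whose contents should not end up in the extracted text.
;; (Hypothetical choice for this sketch: attributes, comments,
;; processing instructions, and script/style elements.)
(define skipped-tags '(@ *COMMENT* *PI* *DECL* script style))

;; Return a list of all text strings found in an SXML node, in
;; document order, ignoring tags and attribute lists.
(define (sxml->strings node)
  (cond ((string? node) (list node))           ; text node: keep it
        ((not (pair? node)) '())               ; symbols, numbers, '()
        ((memq (car node) skipped-tags) '())   ; attributes etc.: drop
        (else
         ;; An element starts with its tag symbol, so skip the tag;
         ;; a plain list of nodes is walked as-is.
         (apply append
                (map sxml->strings
                     (if (symbol? (car node)) (cdr node) node))))))

;; Concatenate the collected strings into one string.
(define (sxml->text node)
  (apply string-append (sxml->strings node)))

;; Small hand-written SXML fragment, standing in for a parsed page.
(define verse
  '(p (@ (class "verse")) "In the " (b "beginning") "."))

(display (sxml->text verse))   ; prints: In the beginning.
(newline)

;; With the http-client and html-parser eggs installed, the tree could
;; come from a live page instead (assumptions noted above):
;;
;;   (import http-client html-parser)
;;   (define page
;;     (with-input-from-request "http://example.com/" #f html->sxml))
;;   (display (sxml->text page))
```

For anything beyond flat text (say, picking out only the verse
paragraphs or the links to follow), sxpath is likely less work than
extending a walker like this one.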

To store it in a database, you should first decide which database you
want to use.  Once you know that, picking the matching egg is an easy
choice.

If you have more specific questions, feel free to ask.  There are
many helpful people here on the mailing list and also on Freenode's
#chicken IRC channel!

Cheers,
Peter
-- 
http://sjamaan.ath.cx
--
"The process of preparing programs for a digital computer
 is especially attractive, not only because it can be economically
 and scientifically rewarding, but also because it can be an aesthetic
 experience much like composing poetry or music."
                                                        -- Donald Knuth


