
Re: [Chicken-users] Parsing HTML, best practice with Chicken


From: Kooda
Subject: Re: [Chicken-users] Parsing HTML, best practice with Chicken
Date: Mon, 29 Dec 2014 12:12:22 +0100
User-agent: Mutt/1.5.23 (2014-03-12)

Hi,

On Mon, Dec 29, 2014 at 03:28:15AM +0100, mfv wrote:
> So far, I have been getting the site with http-client, the raw html to sxml
> with html-parser, and trying to process the resulting list with
> matchable/srfi-13. I am not sure how much good it will do to use regex on
> those lists. Are there any packages like Python's Beautifulsoup in the
> Chicken arsenal?

Sxml-transform and the other SXML-related eggs can certainly help you here,
but I don’t know them well enough to give you specifics.
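
Since html-parser gives you SXML, which is just nested lists, you can
already walk it with ordinary list operations. Here is a rough sketch that
collects every href value; `collect-hrefs` and the sample document are made
up for illustration, and it assumes attributes appear in the usual SXML
form (@ (name "value") ...). The dedicated eggs should do this more
conveniently.

```scheme
;; Sketch only: collect-hrefs is a hypothetical helper, not part of any egg.
(define (collect-hrefs node)
  (cond
    ((not (pair? node)) '())            ; text or symbol: nothing to collect
    ((and (eq? (car node) 'href)        ; an (href "value") attribute
          (pair? (cdr node))
          (string? (cadr node)))
     (list (cadr node)))
    (else                               ; element or attribute list: visit children
     (apply append (map collect-hrefs (cdr node))))))

;; Hand-written SXML standing in for the parser's output.
(define doc
  '(*TOP* (html (body (p "See "
                         (a (@ (href "http://example.org/a")) "A")
                         " and "
                         (a (@ (class "x") (href "/b")) "B"))))))

(collect-hrefs doc) ; => ("http://example.org/a" "/b")
```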


> ;; saving function
> ;; * display form is more suitable, for it evaluates all those \n and other
> ;; * specials characters
> ;; * might be good to remove these things from regex
> ;; * processing, too. 
> (define (savedata somedata filename)
>   (call-with-output-file filename
>     (lambda (p)
>       (let f ((ls somedata))
>       (unless (null? ls)
>         (display (car ls) p)   ; changed: display->write
>         (newline p)
>         (f (cdr ls)))))))

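Here you can simply call `write` on the whole list instead of looping over
it yourself. The output then differs from yours (one S-expression with
quoted strings rather than one displayed item per line), but it can be
loaded back with `read`. `pp` can also be useful if you want to read the
resulting file in a text editor.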
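
A minimal sketch of what I mean, assuming you are happy with the file
holding a single S-expression. The file name and `loaddata` are invented
for the example.

```scheme
;; Hypothetical file name, used only for this demonstration.
(define demo-file "savedata-demo.scm")

;; Write the whole list as one S-expression.
(define (savedata somedata filename)
  (call-with-output-file filename
    (lambda (p)
      (write somedata p)
      (newline p))))

;; Read it back; read returns the first datum found in the file.
(define (loaddata filename)
  (call-with-input-file filename read))

(savedata '("line one\nline two" "10.1002" 42) demo-file)
(loaddata demo-file) ; an equal? copy of the list that was saved
```

Because `write` keeps the string quotes and escapes, the data survives the
round trip, which `display` would not guarantee.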


> ;; --- member? returns #t if element x is in list lst.
> ;; --- ref:
> ;; --- http://stackoverflow.com/questions/14668616/scheme-fold-map-and-filter-functions
> ;; --- use: (member? "a" (list "a" 1)) --> #t
> (define (member? x lst)
>   (fold (lambda (e r)
>           (or r (equal? e x)))
>         #f lst))

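This function already exists: it’s called `member` and is part of standard
Scheme (srfi-1 only extends it with an optional comparison predicate). Note
that it returns the tail of the list starting at the match rather than #t,
which still counts as true in a test.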
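
For instance, with the list from your own comment. The wrapper at the end
is only needed if you really want a strict boolean, and its name is just
borrowed from your code:

```scheme
(member "a" (list "a" 1))   ; => ("a" 1), the tail starting at the match
(member "z" (list "a" 1))   ; => #f

;; Hypothetical wrapper for when a strict #t/#f is wanted.
(define (member? x lst)
  (and (member x lst) #t))

(member? "a" (list "a" 1))  ; => #t
```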


> ;; --- string-contains/m returns #t if all strings of list lsstr are in
> ;; --- string str. 
> ;; --- case insensitive string matching. 
> ;; --- does not check if lsstr is empty. This would return #t. 
> ;; --- use: (string-contains/m "Somestring" '("10.1002" "issuetoc"))
> (define (string-contains/m str lsstr)
>   (if (string? str) 
>       (if (not (member? #f (map (lambda (x) (string-contains-ci str x))
>                                  lsstr)))
>           #t)))

This looks wrong to me: an `if` without an else branch returns an
unspecified value when its test fails, so your function does not reliably
return #f. Try this instead:

(define (string-contains/m str lsstr)
  (and (string? str) 
       (not (member? #f (map (lambda (x) (string-contains-ci str x))
                             lsstr)))))
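
Another way to write it, as a sketch only, is with `every` from srfi-1,
which stops at the first substring that is not found instead of building
an intermediate list. I am assuming CHICKEN 5 with the srfi-1 and srfi-13
eggs installed here; the import line is different on CHICKEN 4.

```scheme
;; Assumption: CHICKEN 5 with the srfi-1 and srfi-13 eggs installed.
;; On CHICKEN 4, write (use srfi-1 srfi-13) instead.
(import srfi-1 srfi-13)

(define (string-contains/m str lsstr)
  (and (string? str)
       ;; every stops at the first substring that is not found
       (every (lambda (x) (string-contains-ci str x)) lsstr)
       ;; string-contains-ci returns an index, so normalise to #t
       #t))

(string-contains/m "Somestring 10.1002 IssueTOC" '("10.1002" "issuetoc")) ; => #t
(string-contains/m "Somestring" '("10.1002" "issuetoc"))                  ; => #f
(string-contains/m 42 '("10.1002"))                                       ; => #f
```

As in your version, an empty lsstr still gives #t.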


I hope this will help you.

-- 
Sent from my GameBoy.


