[Chicken-users] Chicken vs Perl
From: Sascha Ziemann
Subject: [Chicken-users] Chicken vs Perl
Date: Tue, 20 Sep 2011 14:11:41 +0200
I tried using Chicken for a job I would normally use Perl for, to find
out whether Chicken might be a useful alternative.
The job is: go through a web site mirror and report a unique list of
all domains from all hrefs.
This is my Perl version:
#! /usr/bin/perl
use warnings;
use strict;
use File::Find;
my $dir = $ARGV[0] || '.';
my @files;
my %urls;
find ({wanted => sub { push @files, $_ if -f $_; },
       no_chdir => 1}, $dir);
foreach my $file (@files) {
    open (HTML, $file) || die "Can not open file '$file'";
    while (<HTML>) {
        while (/href="(http:\/\/[^"\/?]+)(["\/?].*)/i) {
            $urls{lc $1} = 1;
            $_ = $2; } }
    close (HTML); }
foreach my $url (sort keys %urls) {
    print $url, "\n"; }
The Perl version takes about two seconds on my test tree:
real 0m1.810s
user 0m1.664s
sys 0m0.140s
And this is my Chicken version:
#! /usr/local/bin/csi -s
(require-extension posix regex srfi-69)
(define dir (let ((args (command-line-arguments)))
              (if (pair? args)
                  (car args)
                  ".")))
(define files (find-files dir regular-file?))
(define urls (make-hash-table))
(define href (regexp "href=\"(http://[^\"/?]+)([\"/?].*)" #t))
(for-each
 (lambda (filename)
   (with-input-from-file filename
     (lambda ()
       (let next-line ((line (read-line)))
         (if (not (eof-object? line))
             (let next-href ((found (string-search href line)))
               (if found
                   (begin
                     (hash-table-set! urls (string-downcase (cadr found)) #t)
                     (next-href (string-search href (caddr found)))))
               (next-line (read-line))))))))
 files)
(for-each
 (lambda (arg)
   (printf "~a\n" arg))
 (sort (hash-table-keys urls) string<?))
And now hold on tight! It takes more than one minute for the same data:
real 1m16.540s
user 1m14.849s
sys 0m0.664s
And compiling it gives almost no performance boost:
real 0m1.810s
user 0m1.664s
sys 0m0.140s
So the questions are:
- What is wrong with the Chicken code?
- How can I profile the code?
- Why is there no difference between csi and csc?