Re: [Chicken-users] Chicken vs Perl
From: Daishi Kato
Subject: Re: [Chicken-users] Chicken vs Perl
Date: Tue, 20 Sep 2011 22:08:16 +0900
User-agent: Wanderlust/2.14.0 (Africa) Emacs/21.4 Mule/5.0 (SAKAKI)
Hi,
My situation is pretty similar to yours: I used to use Perl
and later started using Chicken for my job.
Running your scripts on my machine produced a similar result
(about a factor of 10 difference).
The -unsafe option in csc-4.6.0 had no effect (no change in run time).
-unsafe-libraries in csc-4.0.0 did help (a little faster),
but it's not available in csc-4.6.0 (does anybody know why?).
I also tried csc-4.7.0 and, guess what, it's a little slower
(at least on my test data; I partially crawled wiki.call-cc.org).
Peter, how could this happen?
My guess is that read-line is slower than <> in Perl
(I think <> is heavily optimized in Perl).
This is only a guess and I can't guarantee it,
but how about comparing read-all in Chicken against $/=undef in Perl?
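
For the Perl side of that comparison, here is a rough sketch of what I mean. It is not a tested benchmark: hosts_in_file is a hypothetical helper name, and I take the file names from @ARGV instead of using File::Find to keep it short. Setting $/ to undef makes <$fh> return the whole file in one read, so the regex runs once over the full contents with /g rather than once per line.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper (not part of the original script): return the sorted,
# unique, lower-cased http:// hosts found in the href attributes of one file.
sub hosts_in_file {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Cannot open file '$file': $!";
    local $/ = undef;            # slurp mode: one read returns the whole file
    my $content = <$fh>;
    close($fh);
    return () unless defined $content;

    my %urls;
    # /g walks through all matches in the slurped string, /i ignores case.
    while ($content =~ /href="(http:\/\/[^"\/?]+)["\/?]/gi) {
        $urls{lc $1} = 1;
    }
    return sort keys %urls;
}

# Print the hosts of every file named on the command line (per file, so a
# host can repeat across files; merge into one hash for a global list).
print "$_\n" for map { hosts_in_file($_) } @ARGV;
```

If the gap between the two languages shrinks a lot with whole-file reads on both sides, that would point at line reading rather than the regex or the hash table.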
Best,
Daishi
At Tue, 20 Sep 2011 14:11:41 +0200,
Sascha Ziemann wrote:
>
> I tried to use Chicken for a job I would normally use Perl for, to find
> out whether Chicken might be a useful alternative.
>
> The job is: go through a web site mirror and report a unique list of
> all domains from all hrefs.
>
> This is my Perl version:
>
> #! /usr/bin/perl
>
> use warnings;
> use strict;
> use File::Find;
>
> my $dir = $ARGV[0] || '.';
> my @files;
> my %urls;
>
> find ({wanted => sub { push @files, $_ if -f $_; },
>        no_chdir => 1}, $dir);
>
> foreach my $file (@files) {
>     open (HTML, $file) || die "Can not open file '$file'";
>     while (<HTML>) {
>         while (/href="(http:\/\/[^"\/?]+)(["\/?].*)/i) {
>             $urls{lc $1} = 1;
>             $_ = $2; } }
>     close (HTML); }
>
> foreach my $url (sort keys %urls) {
>     print $url, "\n"; }
>
> The Perl version takes for my test tree about two seconds:
>
> real 0m1.810s
> user 0m1.664s
> sys 0m0.140s
>
> And this is my Chicken version:
>
> #! /usr/local/bin/csi -s
>
> (require-extension posix regex srfi-69)
>
> (define dir (let ((args (command-line-arguments)))
>               (if (pair? args)
>                   (car args)
>                   ".")))
> (define files (find-files dir regular-file?))
> (define urls (make-hash-table))
> (define href (regexp "href=\"(http://[^\"/?]+)([\"/?].*)" #t))
>
> (for-each
>  (lambda (filename)
>    (with-input-from-file filename
>      (lambda ()
>        (let next-line ((line (read-line)))
>          (if (not (eof-object? line))
>              (let next-href ((found (string-search href line)))
>                (if found
>                    (begin
>                      (hash-table-set! urls (string-downcase (cadr found)) #t)
>                      (next-href (string-search href (caddr found)))))
>                (next-line (read-line))))))))
>  files)
>
> (for-each
>  (lambda (arg)
>    (printf "~a\n" arg))
>  (sort (hash-table-keys urls) string<?))
>
> And now hold on tight! It takes more than one minute for the same data:
>
> real 1m16.540s
> user 1m14.849s
> sys 0m0.664s
>
> And there is almost no significant performance boost by compiling it:
>
> real 0m1.810s
> user 0m1.664s
> sys 0m0.140s
>
> So the questions are:
>
> - What is wrong with the Chicken code?
> - How can I profile the code?
> - Why is there no difference between csi and csc?
>
> _______________________________________________
> Chicken-users mailing list
> address@hidden
> https://lists.nongnu.org/mailman/listinfo/chicken-users