[Pan-devel] article rescoring during download
From: eto
Subject: [Pan-devel] article rescoring during download
Date: Fri, 21 Dec 2012 21:54:03 -0800
User-agent: Mutt/1.5.20 (2009-06-14)
Hi all,
I was curious what the latest Pan looked like, so I installed the Debian
unstable package (0.139-1). Compared to 0.133, it was remarkably slower and
used much more CPU when pulling binaries from large groups.
I grabbed the latest git code and did a little profiling. Eventually I
found a curious call being made in pan/gui/header-pane.cc, line 2527:
void
HeaderPane :: on_cache_added (const Quark& message_id)
{
  quarks_t q;
  q.insert(message_id);
  _data.rescore_articles ( _group, q ); // here is slow
  rebuild_article_action (message_id);
}
That rescore_articles call eventually gets down to a std::set_intersection
over what I believe to be the entire list of articles in the group of
interest.
This on_cache_added call is made fairly often, possibly each time an
article's contents are retrieved.
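
To illustrate the cost pattern described above, here is a minimal sketch. It
is not Pan's code: Id, IdSet, and both function names are hypothetical
stand-ins, and it assumes the group's article ids sit in a sorted std::set.
It contrasts a std::set_intersection against a one-element set with a direct
lookup that returns the same result.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

// Hypothetical stand-ins; Pan's real Quark and quarks_t are not reproduced here.
using Id = std::string;
using IdSet = std::set<Id>;

// Intersect the group's ids with the changed ids. std::set_intersection
// advances through both sorted ranges element by element, so with a single
// changed id it may step over every group id that sorts before it.
IdSet matches_by_intersection(const IdSet& group, const IdSet& changed)
{
    IdSet out;
    std::set_intersection(group.begin(), group.end(),
                          changed.begin(), changed.end(),
                          std::inserter(out, out.end()));
    return out;
}

// Same result via direct lookup: one logarithmic find per changed id.
IdSet matches_by_find(const IdSet& group, const IdSet& changed)
{
    IdSet out;
    for (const Id& id : changed)
        if (group.find(id) != group.end())
            out.insert(id);
    return out;
}
```

With one changed id per call, the first version costs time proportional to
the group size in the worst case, while the second costs O(log n). Whether
this maps onto rescore_articles depends on Pan's actual containers.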
I couldn't see anything terribly valuable in this call, since I don't use
scoring other than to plonk authors, so I commented out the three related
lines and saw my bandwidth get fully utilized again.
As for a fix: the code is doing an intersection with a single-element set
(the one quark), which is effectively a find. There might be a faster find
if these articles are already hashed. I don't know enough about the system
to suggest a complete new codepath. There might be a "user doesn't use
scoring" flag that avoids this call outright.
Eric Ortega