
[Pan-users] Re: performance ...


From: Duncan
Subject: [Pan-users] Re: performance ...
Date: Tue, 9 Dec 2008 15:17:51 +0000 (UTC)
User-agent: Pan/0.133 (House of Butterflies)

address@hidden posted
address@hidden, excerpted below, on 
Tue, 09 Dec 2008 13:41:28 +0100:


>> But what you're seeing is normal.  Keep in mind that if pan is saying a
>> million articles, that's after combining multiparts.  In some groups,
>> that could mean ten or fifty million actual single-part articles.
> 
> I was referring to the total article count (not the thread count).  My
> largest groups file is 900 MB.

But how are you counting articles?  Are you using the pan unread count or 
something else?  If you're using pan unread count, it's counting multiple 
parts (not threads, parts of a single multipart message, aka multi-
segment) as a single entry.  If you look you can see it, 15/15 or 
whatever, or sometimes a missing part, 14/15, with the corresponding 
broken puzzle icon instead of the full puzzle icon.

With old-pan you could separate the parts into the individual pieces, 
which then showed up as "threads".  With new-pan, it's displayed only 
once, tho actual replies are still shown threaded.

> Perhaps using memory maps might speed up things?  Also the data seems to
> be written in ASCII format, requiring a rescan/reparse every time. 
> Perhaps saving in binary, which allows even more efficient use of memory
> maps, might be useful (option only for large groups perhaps?).  It might
> not reduce the size of the file but it will avoid having to convert lots
> of integers (like line numbers, sizes, dates etc).  Also it would allow
> to read in blocks without having to process those blocks.

Perhaps as an option.  Note that binary is a much more opaque format, 
much harder to repair manually if necessary.
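
To make the tradeoff concrete, here is a minimal sketch of the binary-plus-mmap idea.  It is NOT pan's actual on-disk format; the record layout (article number, size, line count, date) and the function names are assumptions made up for illustration.  With fixed-width records, any entry can be read by offset without converting the rest of the file from text, but the file is no longer something you can fix in a text editor.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-width record: article number, byte size, line count, POSIX date.
# "<" means little-endian with no padding, so each record is exactly 24 bytes.
RECORD = struct.Struct("<QIIq")


def write_records(path, records):
    """Write (number, size, lines, date) tuples as packed binary records."""
    with open(path, "wb") as f:
        for rec in records:
            f.write(RECORD.pack(*rec))


def read_record(path, index):
    """Fetch one record by index via mmap, without parsing any other record."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return RECORD.unpack_from(m, index * RECORD.size)


def demo():
    """Round-trip two made-up records through a temporary file."""
    records = [(1, 500_000, 7800, 1228835871), (2, 498_112, 7770, 1228835900)]
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_records(path, records)
        return read_record(path, 1)
    finally:
        os.remove(path)
```

Calling demo() reads back only the second record; the kernel pages in just the part of the file that is touched.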

> that is true but when you know you might need to treat 1G of data, you
> start managing the data cleverly.  Generally you try to save the work
> that you did for later purposes.  E.g. if you have already figured out
> certain things, you store that info so that you don't have to figure it
> out later on.

As I said, pan now does save its work.  Old-pan used to re-thread every 
time you loaded the group.

>> Meanwhile, how do you monitor CPU usage?  Are you monitoring it per
>> core, or overall only?  Most of new-pan is single-threaded, because
>> Charles had gone with multi-threaded in old-pan and found the
>> complexity and thread-race bugs just not worth it for the limited
>> increase in performance. Instead, new-pan now hatches threads only in
>> limited performance critical sections (like when starting multiple
>> connections at once, one place I know it's used as I remember Charles
>> fixing a bug I had with it).  So pan will likely be using near 100% of
>> a single core, but the others should remain mostly idle, I /think/. 
>> (It has been awhile since I did binaries and IDR for sure.)
> 
> No when it is busy doing stuff and blocking other apps from doing
> something I ran top and it showed pan using about 80% cpu, constantly
> for a certain time.

Yes, but what are your top settings?  Are you showing each individual 
core separately or are you only showing the combined, and are you using 
irix or solaris mode?  I'm asking because depending on setting, using all 
four at 100% each it could call that 400% or 100%, with 100% of a single 
core being correspondingly 100% or 25%, all depending on how you have top 
set.
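
The arithmetic behind that, as a small sketch.  The function name and parameters are made up for illustration; it only models how the same load gets reported under the two modes, it doesn't read anything from top itself.

```python
def top_cpu_percent(busy_cores, n_cores, irix_mode=True):
    """Percentage top would report for a given load.

    busy_cores is the load expressed in fully busy cores
    (1.0 = one core at 100%, 4.0 = four cores at 100% each).
    Irix mode sums the per-core percentages; Solaris mode
    divides that sum by the number of cores.
    """
    per_core_sum = busy_cores * 100.0
    return per_core_sum if irix_mode else per_core_sum / n_cores
```

So on a quad-core, top_cpu_percent(4, 4) gives 400 in Irix mode and top_cpu_percent(4, 4, irix_mode=False) gives 100, while a single pegged core is 100 or 25 respectively.  An 80% reading therefore means quite different things depending on the mode.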

You're using Kubuntu so you should be able to set up a ksysguard graph if 
you like.  I don't know if you're on KDE 3.5 or 4.x but 4.x is still 
broken for daily use for me (4.2 should fix most of it AFAIK), so I'm 
using 3.5.10 still, with a ksysguard kicker applet at the top of my 
screen.  Its first four graphs are user/system/nice CPU on each of the 
four cores, so when I'm in KDE 3 anyway, I get a live updating graph of 
activity on each of the four cores.  (FWIW, next is load, then memory, 
then swap which is normally zero, then up and down network traffic, then 
multiplexed disk activity, then the four CPU core-temps, then two 
additional system temps.  I'm running two 1920x1200 LCDs stacked for 
1920x2400, with the ksysguard applet taking up nearly 1500 px width at 
the max 300 px kicker panel height, on the top LCD.)
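
If you'd rather not set up a graph, per-core numbers can also be worked out from two snapshots of /proc/stat taken a second or so apart.  A rough sketch follows; the function names are my own, and it assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal), counting idle and iowait as not busy.

```python
def parse_proc_stat(text):
    """Map each core ("cpu0", "cpu1", ...) to (busy_ticks, total_ticks)."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Per-core lines look like "cpu0 ..."; the bare "cpu" line is the aggregate.
        if parts and parts[0].startswith("cpu") and parts[0] != "cpu":
            # Only the first eight fields; guest time is already included in user.
            ticks = [int(x) for x in parts[1:9]]
            idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
            counters[parts[0]] = (sum(ticks) - idle, sum(ticks))
    return counters


def per_core_usage(before, after):
    """Percent busy per core between two parse_proc_stat() snapshots."""
    usage = {}
    for core, (busy1, total1) in after.items():
        busy0, total0 = before[core]
        elapsed = total1 - total0
        usage[core] = 100.0 * (busy1 - busy0) / elapsed if elapsed else 0.0
    return usage
```

Read /proc/stat, sleep a second, read it again, and pass both parsed snapshots to per_core_usage().  A single-threaded pan that is CPU-bound should show one core near 100 and the rest mostly idle.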


>> Also, it may be disk I/O related, if you have a single disk only and
>> that group's data isn't in cache yet.  I run a dual dual-core Opteron
>> 290 (2.8 GHz) here, so have four cores too, but I'm running
>> Gentoo/~amd64 with everything compiled to my specific hardware, which
>> will help some (BTW, you didn't mention whether you were running 32-bit
>> or 64-bit kubuntu, 4 gigs on 32-bit is going to be less efficient than
>> 4 gigs on 64-bit), and I run a 4-disk kernel/md RAID, with pan's data
>> on RAID-6, which means it's two-way striped.  RAID striping really
>> /does/ help, and not just with pan; you might be surprised how much.
> 
> Yes I have been considering switching since
> 
> 1. my 4 GB is not used (because of memory of graphics card)

FWIW you should be able to configure the 32-bit kernel for 64-gig mode if 
you like, or probably download one so configured from Ubuntu (possibly 
named 686-bigmem, the Debian name AFAIK).  If the BIOS will remap the 
memory, you should then get the memory ordinarily covered by the legacy 
32-bit PCI I/O hole (typically half a gig or so, sometimes a full gig) 
mapped above the 4-gig boundary then.  This works using PAE mode, AFAIK.

Here's a bit about it.  The title says a gig, but it talks about both the 
HIGHMEM-4G and HIGHMEM-64G options.

http://www.linux.com/feature/119287

But in 32-bit mode that's less efficient as it has to effectively page 
the memory into a window it can actually address.  64-bit of course 
eliminates that.  And... not all BIOS support it, 32-bit or 64-bit, 
unfortunately, altho most of the newer ones will in 64-bit at least.
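
For the numbers involved, a simplified sketch.  It assumes the I/O hole sits directly under the 4-gig mark and that remapping, where the BIOS and kernel both support it, recovers all of the hidden memory; the function names are made up for illustration.

```python
GIB = 2**30


def addressable_bytes(address_bits):
    """Bytes of physical memory reachable with the given address width."""
    return 2**address_bits


def visible_ram_gib(installed_gib, hole_gib, remapped):
    """RAM the OS can see, in GiB, under the simplified model above."""
    if remapped or installed_gib + hole_gib <= 4:
        return installed_gib
    return 4 - hole_gib
```

Plain 32-bit addressing gives addressable_bytes(32) // GIB == 4, and the 36-bit physical addresses used by PAE give 64, which is where the 4G and 64G option names come from.  With 4 gigs installed and a half-gig hole, visible_ram_gib(4, 0.5, False) is 3.5, and 4 once the hole is remapped.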

> 2. indeed my disk seems to be the bottleneck.

> However I need to completely upgrade my box and that is a hard job. 
> Also I have no experience setting up RAID (donno even if my mobo
> supports it)

FWIW, I had no experience with it either, until I had two drives go out 
in two years and decided I needed a bit more reliability than that.  So I 
upgraded to 4xSATA drives (my mobo supported it in firmware RAID mode but 
that sucks in Linux since it's really software RAID anyway, and proper 
kernel RAID is more reliable, so I set it to straight SATA mode and used 
the kernel RAID) and after some research and planning it all out, set it 
up.

If you decide to do it and know nothing of RAID, you'll want to google 
for the free chapter of O'Reilly's Linux RAID book.  It's an excellent 
intro, explaining the difference between hardware, firmware and kernel/md 
RAID, and the various RAID levels.  That's where I started as I knew very 
little about it before that.
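
As a quick illustration of how the common levels differ, here is a sketch of how many disks' worth of data each one holds.  The function name is made up, and it only covers capacity and stripe width, not read/write performance, which depends on a lot more than this.

```python
def data_disks(level, n_disks):
    """Number of disks' worth of usable data for a few common RAID levels."""
    if level == 0:
        usable = n_disks        # pure striping, no redundancy
    elif level == 1:
        usable = 1              # every disk holds a full copy
    elif level == 5:
        usable = n_disks - 1    # one disk's worth of parity
    elif level == 6:
        usable = n_disks - 2    # two disks' worth of parity
    else:
        raise ValueError("level not covered by this sketch")
    if usable < 1:
        raise ValueError("not enough disks for this level")
    return usable
```

With four disks, data_disks(6, 4) is 2, which is why a 4-disk RAID-6 is effectively two-way striped, while data_disks(1, 4) is 1 for a four-way mirror such as a small /boot.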

After that, you'll want to read the kernel's md.txt document (in your 
kernel Documentation subdir) and look at the HOWTOs.  In particular, keep 
in mind that if you're going to boot off the RAID, you need a small 
RAID-1/mirrored partition to hold /boot, since RAID-1 is all either LILO 
or GRUB understand.  When I set mine up, there wasn't a lot of info out there 
about mdp/partitioned-RAID yet, as it was still pretty new, but I managed 
to find what I needed.  You will also want to consider LVM2 on top of 
RAID, the way people handled it before partitioned RAID, but while you 
can boot directly to RAID using an appropriate kernel command line, 
unfortunately LVM2 requires an initramfs/initrd.  I chose not to use 
that, so I put my root filesystem and a backup on partitioned RAID, and 
almost everything else I wanted to keep redundant on an LVM on top of 
RAID, set up so I could load LVM after I had my rootfs on the partitioned 
RAID already going.

That's the 10 km high overview at a few hundred km/hr! =:^)  It sounds 
confusing condensed like that, but take it a step at a time as I did, and 
you should be fine, as I was/am. =:^)  If you get stuck, you know someone 
to mail for help. =:^)

Not to pressure you if you don't believe you're ready, but really, if 
you're already running quad-core and 4 gig RAM, a single spindle hard 
drive IS the bottleneck, and you'll find the system not only faster, but 
much more responsive, once you effectively get that millstone off your 
neck.  I just think it's such a shame to have a nice system bogged down 
like that.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman




