chicken-users

[Chicken-users] nsample and benchmarking in general


From: Brandon J. Van Every
Subject: [Chicken-users] nsample and benchmarking in general
Date: Thu, 24 Aug 2006 14:25:19 -0700
User-agent: Thunderbird 1.5.0.5 (Windows/20060719)

felix winkelmann wrote:

> - nsample is gone. I still think benchmarking for nursery size makes
>  sense, but...

The right way to do nsample (the program that tries to determine a good stack size for Chicken) is under controlled benchmarking conditions. Certainly when we did OpenGL benchmarking back at DEC, we didn't take just 3 samples, and we didn't do it on machines where we were answering e-mail. In fact, we'd strip those benchmarking machines to the bare minimum, as much as we could honestly get away with. Revenues for our OpenGL boxes depended on good benchmarks, and the OpenGL ARB did have ways of punishing people for cheating.

If you want to work on a controlled benchmark for nsample and include it in the benchmark directory, I can help with that, or actually do it at some later date. I just don't have time right now. But, my stack-size.cmake code does do the basics of multiple sampling. For maintainability, it would be best to rewrite it in Scheme. Benchmarking shouldn't be done at build time anyways. I consider CMake to be appropriate for "canonical build problems," not any old scripting problem.

A proper benchmark script would:
- do lotsa runs
- warn the user not to use their machine during the benchmarking
- measure the variance of the runs and declare them invalid if they're way off. This could be due to the user spacing out and checking his e-mail, other network and driver activity, or the architecture just may not have a good value for the stack size.
- have a method of submitting the runs to a centralized database at the Chicken website
- have server-side metrics for the variance of submitted runs
- merge runs deemed "high quality" into a .txt file database of believed-good stack size values
- evaluate all of this with respect to target CPU, machine, memory configs, etc. and include all that info as part of the submission
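
To make the first few points concrete, here is a minimal sketch of the sampling and variance check. It is in Python purely for illustration (a real version would more likely be written in Scheme), and everything named in it is an assumption: the function names, the default of 30 runs, and the 5% cutoff on the coefficient of variation are placeholders, not anything nsample or stack-size.cmake actually does.

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=30):
    """Run cmd repeatedly and return the wall-clock time of each run in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the program's output; only the elapsed time matters here.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples, max_cv=0.05):
    """Return the mean, the coefficient of variation, and a validity verdict.

    max_cv is an assumed cutoff: a set of runs whose standard deviation
    exceeds 5% of the mean is treated as too noisy to trust.
    """
    mean = statistics.mean(samples)
    cv = statistics.stdev(samples) / mean
    return {"mean": mean, "cv": cv, "valid": cv <= max_cv}
```

A driver would print the "please leave this machine alone" warning, call something like summarize(time_command(["./some-benchmark"])) where the program name is whatever benchmark is being measured, and refuse to submit any result whose "valid" field is false.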

This kind of mechanism would be generally useful for evaluating, improving, and promoting the performance of Chicken in general. Conceivably, all the benchmarks could use the same mechanism, and submit to the same database.

Wrinkle: it would have to be very easy to recompile for a different stack size. If the stack size can be set at runtime, that would be ideal. If it's not easy for a user to try different stack sizes, they're not going to. At least, not in any widespread way. You'll get a few speed freaks here and there trying it out, and your benchmark submissions will reflect the systems of speed freaks, not of systems in general.
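
If trying a stack size were cheap, the sweep itself would be trivial. The sketch below, again in Python only for illustration, assumes a caller-supplied measure(size) function that runs the benchmark once at the given stack size and returns the elapsed seconds. How the size actually reaches the program (a recompile or a runtime setting) is deliberately left inside that function, since that is exactly the open question. The names pick_stack_size and measure are hypothetical.

```python
import statistics


def pick_stack_size(candidates, measure, runs=10):
    """Return (best_size, medians), where medians maps each size to its median time.

    measure(size) is a caller-supplied function (an assumption of this
    sketch) that runs the benchmark once with the given stack size and
    returns the elapsed time in seconds.
    """
    medians = {}
    for size in candidates:
        # The median is less sensitive than the mean to one disturbed run.
        medians[size] = statistics.median(measure(size) for _ in range(runs))
    best = min(medians, key=medians.get)
    return best, medians
```

The point of the design is that the sweep costs the user nothing beyond waiting. If measure has to trigger a full rebuild per candidate size, few people will run it.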


Cheers,
Brandon




