From: Emilio G. Cota
Subject: Re: [Qemu-devel] [PATCH v6 12/15] qht: add qht-bench, a performance benchmark
Date: Fri, 3 Jun 2016 07:41:53 -0400
User-agent: Mutt/1.5.23 (2014-03-12)

On Sun, May 29, 2016 at 23:45:23 +0300, Sergey Fedorov wrote:
> On 25/05/16 04:13, Emilio G. Cota wrote:
> > diff --git a/tests/qht-bench.c b/tests/qht-bench.c
> > new file mode 100644
> > index 0000000..30d27c8
> > --- /dev/null
> > +++ b/tests/qht-bench.c
> > @@ -0,0 +1,474 @@
> (snip)
> > +static void do_rw(struct thread_info *info)
> > +{
> > + struct thread_stats *stats = &info->stats;
> > + uint32_t hash;
> > + long *p;
> > +
> > + if (info->r >= update_threshold) {
> > + bool read;
> > +
> > + p = &keys[info->r & (lookup_range - 1)];
> > + hash = h(*p);
> > + read = qht_lookup(&ht, is_equal, p, hash);
> > + if (read) {
> > + stats->rd++;
> > + } else {
> > + stats->not_rd++;
> > + }
> > + } else {
> > + p = &keys[info->r & (update_range - 1)];
> > + hash = h(*p);
>
> The previous two lines are common to both "if" branches. Let's move
> them above the "if".
Not quite. The mask uses lookup_range above, and update_range below.
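To make the difference concrete, here is a minimal sketch of the masking idiom. It assumes both ranges are powers of two, which the `& (range - 1)` form requires; `pick_index()` and `same_key()` are hypothetical helpers for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: map a random value onto [0, range) with a mask.
 * Only valid when range is a nonzero power of two.
 */
static uint64_t pick_index(uint64_t r, uint64_t range)
{
    assert(range != 0 && (range & (range - 1)) == 0);
    return r & (range - 1);
}

/* Returns nonzero if the same r selects the same key under both ranges. */
static int same_key(uint64_t r, uint64_t lookup_range, uint64_t update_range)
{
    return pick_index(r, lookup_range) == pick_index(r, update_range);
}
```

With lookup_range = 1024 and update_range = 256, r = 1000 selects index 1000 for a lookup but index 232 for an update, so the two branches cannot share the computation of p unless the ranges happen to be equal.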
> > + if (info->write_op) {
> > + bool written = false;
> > +
> > + if (qht_lookup(&ht, is_equal, p, hash) == NULL) {
> > + written = qht_insert(&ht, p, hash);
> > + }
> > + if (written) {
> > + stats->in++;
> > + } else {
> > + stats->not_in++;
> > + }
> > + } else {
> > + bool removed = false;
> > +
> > + if (qht_lookup(&ht, is_equal, p, hash)) {
> > + removed = qht_remove(&ht, p, hash);
> > + }
> > + if (removed) {
> > + stats->rm++;
> > + } else {
> > + stats->not_rm++;
> > + }
> > + }
> > + info->write_op = !info->write_op;
> > + }
> > +}
> > +
> > +static void *thread_func(void *p)
> > +{
> > + struct thread_info *info = p;
> > +
> > + while (!atomic_mb_read(&test_start)) {
> > + cpu_relax();
> > + }
> > +
> > + rcu_register_thread();
>
> Shouldn't we do this before checking for 'test_start'?
From a correctness point of view it doesn't matter. But yes, it
is better to do it earlier. Changed.
> > +
> > + rcu_read_lock();
>
> Why don't we do rcu_read_lock()/rcu_read_unlock() inside the loop?
Because that would slow down the benchmark unnecessarily (throughput
for single-threaded runs with default opts goes down from 38M/s to 35M/s).
For this benchmark we want to measure QHT's performance, not RCU's.
And really we're not allocating/deallocating elements dynamically,
so from a memory usage viewpoint calling this inside or outside
of the loop doesn't matter.
> > + while (!atomic_read(&test_stop)) {
> > + info->r = xorshift64star(info->r);
> > + info->func(info);
> > + }
> > + rcu_read_unlock();
> > +
> > + rcu_unregister_thread();
> > + return NULL;
> > +}
> > +
> > +/* sets everything except info->func */
> > +static void prepare_thread_info(struct thread_info *info, int i)
> > +{
> > + /* seed for the RNG; each thread should have a different one */
> > + info->r = (i + 1) ^ time(NULL);
> > + /* the first update will be a write */
> > + info->write_op = true;
> > + /* the first resize will be down */
> > + info->resize_down = true;
> > +
> > + memset(&info->stats, 0, sizeof(info->stats));
> > +}
> > +
> > +static void
> > +th_create_n(QemuThread **threads, struct thread_info **infos, const char *name,
> > + void (*func)(struct thread_info *), int offset, int n)
>
> 'offset' is not used in this function.
Good catch! Changed now:
+ prepare_thread_info(&info[i], offset + i);
The offset is passed so that each created thread has a unique
RNG seed.
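A minimal sketch of why the offset matters, assuming th_create_n() is called once per group of threads. `make_seed()` is a hypothetical helper that mirrors the seed expression in prepare_thread_info(); the xorshift64star() body below is the commonly published xorshift64* step and is an assumption, not copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Commonly published xorshift64* step; the state must be nonzero. */
static uint64_t xorshift64star(uint64_t x)
{
    assert(x != 0);
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    return x * UINT64_C(2685821657736338717);
}

/*
 * Hypothetical helper mirroring "(i + 1) ^ time(NULL)".
 * 'now' is passed in so the result is reproducible.
 */
static uint64_t make_seed(int index, uint64_t now)
{
    return (uint64_t)(index + 1) ^ now;
}
```

Without the offset, thread 0 of each group would get make_seed(0, now) and therefore generate the same random sequence; with it, the second group starts at make_seed(offset + 0, now). In the benchmark itself `now` would be `(uint64_t)time(NULL)`.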
> > +{
> > + struct thread_info *info;
> > + QemuThread *th;
> > + int i;
> > +
> > + th = g_malloc(sizeof(*th) * n);
> > + *threads = th;
> > +
> > + info = qemu_memalign(64, sizeof(*info) * n);
> > + *infos = info;
> > +
> > + for (i = 0; i < n; i++) {
> > + prepare_thread_info(&info[i], i);
> > + info[i].func = func;
> > + qemu_thread_create(&th[i], name, thread_func, &info[i],
> > + QEMU_THREAD_JOINABLE);
> > + }
> > +}
> > +
> (snip)
> > +
> > +static void run_test(void)
> > +{
> > + unsigned int remaining;
> > + int i;
> > +
>
> Are we sure all the threads are ready at this point? Otherwise why
> bother with 'test_start' flag?
Good point. Added the following:
diff --git a/tests/qht-bench.c b/tests/qht-bench.c
index 885da9c..c1ed9b9 100644
--- a/tests/qht-bench.c
+++ b/tests/qht-bench.c
@@ -43,6 +43,7 @@ static unsigned long lookup_range = DEFAULT_RANGE;
static unsigned long update_range = DEFAULT_RANGE;
static size_t init_range = DEFAULT_RANGE;
static size_t init_size = DEFAULT_RANGE;
+static size_t n_ready_threads;
static long populate_offset;
static long *keys;
@@ -190,6 +191,7 @@ static void *thread_func(void *p)
rcu_register_thread();
+ atomic_inc(&n_ready_threads);
while (!atomic_mb_read(&test_start)) {
cpu_relax();
}
@@ -387,6 +389,9 @@ static void run_test(void)
unsigned int remaining;
int i;
+ while (atomic_read(&n_ready_threads) != n_rw_threads + n_rz_threads) {
+ cpu_relax();
+ }
atomic_mb_set(&test_start, true);
do {
remaining = sleep(duration);
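The same handshake can be sketched outside of QEMU with C11 atomics and pthreads. This is an illustration under stated assumptions, not the benchmark's code: `N_THREADS`, `worker()`, `run_handshake()` and the counter names are hypothetical, and sched_yield() stands in for cpu_relax().

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define N_THREADS 4 /* hypothetical thread count */

static atomic_size_t n_ready;   /* workers that reached the start line */
static atomic_bool start_flag;  /* set once by the coordinator */
static atomic_size_t n_ok;      /* workers that saw everyone ready at start */

static void *worker(void *arg)
{
    (void)arg;
    atomic_fetch_add(&n_ready, 1); /* announce readiness */
    while (!atomic_load(&start_flag)) {
        sched_yield(); /* stand-in for cpu_relax() */
    }
    /* The flag is only set after all workers have announced themselves. */
    if (atomic_load(&n_ready) == N_THREADS) {
        atomic_fetch_add(&n_ok, 1);
    }
    return NULL;
}

/* Returns how many workers started only after all of them were ready. */
static size_t run_handshake(void)
{
    pthread_t th[N_THREADS];
    int i, rc;

    atomic_store(&n_ready, 0);
    atomic_store(&n_ok, 0);
    atomic_store(&start_flag, false);

    for (i = 0; i < N_THREADS; i++) {
        rc = pthread_create(&th[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    /* Wait for every worker before starting the measurement. */
    while (atomic_load(&n_ready) != N_THREADS) {
        sched_yield();
    }
    atomic_store(&start_flag, true);

    for (i = 0; i < N_THREADS; i++) {
        pthread_join(th[i], NULL);
    }
    return atomic_load(&n_ok);
}
```

Because the coordinator sets the start flag only after the counter reaches the thread count, the timed interval does not include thread creation or registration, so run_handshake() is expected to return N_THREADS.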
Thanks,
Emilio