Re: bugs in test-lock


From: Torvald Riegel
Subject: Re: bugs in test-lock
Date: Thu, 05 Jan 2017 21:57:12 +0100

On Thu, 2017-01-05 at 19:42 +0100, Bruno Haible wrote:
> Torvald Riegel wrote [in private email, should have CCed bug-gnulib]:
> 
> [about the use of a lock in 'struct atomic_int' in test-lock.c:]
> 
> > > The delivered code, with a lock, is correct, however, right?
> > 
> > Not quite, unfortunately, because locks do not make any guarantees
> > regarding fairness.  Thus, there is no guarantee that critical sections
> > querying the flag do not starve a critical section that wants to set
> > the flag.
> > Good implementations will try to avoid starvation to some extent, but
> > that's not simple without potentially decreasing overall throughput.
> > Thus, in practice, you can't expect to never observe starvation.
> 
> Good point. Yes, if I have multiple threads executing a loop
> 
>   for (;;)
>     {
>       pthread_mutex_lock (&lock);
>       do something;
>       pthread_mutex_unlock (&lock);
>       do very little other things;
>     }
> 
> it is very possible that a specific thread that also wants this lock
> will have to wait way too long.
> 
> That's why I inserted the yield statements in the lock_checker_thread:
> 
>   for (;;)
>     {
>       pthread_mutex_lock (&lock);
>       do something;
>       pthread_mutex_unlock (&lock);
>       do very little other things;
>       pthread_yield ();
>     }

Why do you think this works?  We don't have uniprocessors anymore;
there's more affecting execution order than just the OS scheduler.

> It should have (and in my experiments, actually has) the effect that each
> waiting thread has an average chance to get the lock -- this relies on a
> fair scheduler in the kernel, of course -- and thus limits the execution
> time of test-lock.

There's no guarantee that this will give fairness to calls before or
after it.  The OS scheduler may let other runnable processes run first,
but modern synchronization primitives try to involve the OS as little as
possible because of the overheads this typically involves.  The hardware
itself, cache misses, etc. have a huge influence on the interleaving of
executions.

> EXPLICIT_YIELD = 0  ->  3.2 ... 3.5 sec  (especially slow in test_once).
> EXPLICIT_YIELD = 1  ->  1.8 sec
> 
> > POSIX makes no explicit guarantee regarding forward progress (ie, things
> > like guaranteeing no starvation), but the specification gives indication
> > that concurrent sem_trywait or sem_wait should not starve a sem_post
> > (IOW, simplified, sem_post should complete eventually).  I'd also say
> > that good-quality implementations of semaphores would not starve
> > sem_post, simply because you'd want to implement sem_trywait as just
> > atomic reads internally, and those won't starve concurrent atomic writes
> > (as is, for example, explicitly required in the C11/C++11 memory model).
> > 
> > Therefore, I'd suggest using semaphores for the completion flags.
> 
> What I actually need here, in lock_checker_done, is a notification mechanism
> from the main thread to the checker threads.
> I used 'volatile' - turned out to be very slow.
> Now I use a pthread_mutex_t, which is a bit of an abuse.
> sem_t may well be better than pthread_mutex_t, but is still a bit of an abuse:
> semaphores are about taking and releasing access to a resource, without the
> notion of "owner".

The notification is not tied to a particular owner.
sem_trywait allows you to query whether a notification is available.
sem_post allows you to make a notification available.
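
A minimal sketch of that approach (the identifiers are illustrative, not
the actual test-lock.c names; with more than one checker thread per flag
you would need one semaphore per checker, or one sem_post per checker,
because sem_trywait consumes the token):

  #include <pthread.h>
  #include <semaphore.h>

  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
  static sem_t checker_done;     /* sem_init (&checker_done, 0, 0) in main */

  static void *
  lock_checker_thread (void *arg)
  {
    for (;;)
      {
        /* Non-blocking query: has the main thread posted the notification?  */
        if (sem_trywait (&checker_done) == 0)
          break;
        pthread_mutex_lock (&lock);
        /* check the invariant */
        pthread_mutex_unlock (&lock);
        /* do very little other things */
      }
    return NULL;
  }

  /* In the main thread, to send the notification:  */
  static void
  notify_checker_done (void)
  {
    sem_post (&checker_done);
  }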

> For a real notification mechanism, probably a pthread_cond_t would be the
> right thing.

But you don't want to wait for a condition; you want to query whether a
condition holds.  The latter is not supported by a condvar.

> The problem here is: it's a unit test. If it fails, I don't want to search
> for bugs in the condition variable subsystem, semaphore subsystem, AND the
> lock subsystem. I want the test to rely essentially only on the lock 
> subsystem.

Locks are the wrong mechanism.  Locks give you mutual exclusion.  If you
want a simple notification that you can query without blocking, use a
semaphore.

> > Second, the mutex test is incorrect because the checker thread's
> > critical sections are allowed to starve the mutator threads' critical
> > sections.  I'd suggest to either use a fixed number of iterations or a
> > fixed duration for both mutators and checkers, or to move the setting of
> > the completion flag to before the pthread_join for the mutators.
> 
> I think the yield () calls handle this danger; see above.

No, unfortunately not; see above.
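
For the second of those options, the sketch from above would be driven
like this (again with made-up identifiers; the point is only the ordering
of sem_post relative to the joins):

  #include <pthread.h>
  #include <semaphore.h>

  #define N_MUTATORS 10                     /* made-up count */
  static pthread_t mutator_threads[N_MUTATORS];
  static pthread_t checker_thread;
  static sem_t checker_done;                /* as in the sketch further up */

  static void
  join_all_threads (void)
  {
    int i;
    /* Post the completion notification *before* joining the mutators, so
       the checker is guaranteed to stop even if its critical sections
       would otherwise starve the mutators' critical sections.  */
    sem_post (&checker_done);
    for (i = 0; i < N_MUTATORS; i++)
      pthread_join (mutator_threads[i], NULL);
    pthread_join (checker_thread, NULL);
  }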

> > Third, the rwlock test is also incorrect because it is, in the general
> > case, implementation-defined whether readers or writers take preference.
> > I have explained why here:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1410052
> > Fixing this requires either similar steps as for the exclusive/recursive
> > mutex tests, or using something like
> > PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP.
> 
> This was being discussed in the other thread. The glthread_rwlock_* functions
> now give preference to the writer - on all platforms, not only on some as it
> was before (Solaris, Mac OS X did it right already).

I disagree with that design choice, as I explained in the other thread.
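
For reference, requesting writer preference with the glibc-specific
attribute mentioned in the quoted text would look roughly like this (a
sketch only; it is glibc-specific, so not something a portable test could
rely on unconditionally):

  #define _GNU_SOURCE
  #include <pthread.h>

  static pthread_rwlock_t rwlock;

  static void
  init_writer_preferring_rwlock (void)
  {
    pthread_rwlockattr_t attr;
    pthread_rwlockattr_init (&attr);
    /* glibc extension: a waiting writer blocks additional readers, so
       readers cannot starve the writer.  */
    pthread_rwlockattr_setkind_np (&attr,
                                   PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
    pthread_rwlock_init (&rwlock, &attr);
    pthread_rwlockattr_destroy (&attr);
  }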



