classpath

Re: Note on PushbackInputStream


From: Dalibor Topic
Subject: Re: Note on PushbackInputStream
Date: Sun, 19 Aug 2001 21:44:41 +0200

On Saturday, 18 August 2001 19:35, John Keiser wrote:
> On 10 Aug 2001 17:25:58 +1200, Bryce McKinlay wrote:
> > Tom Tromey wrote:
> > >Suppose a thread calls PushbackInputStream.read and there are no bytes
> > >to be read.  So the thread blocks.
> > >
> > >Now suppose another thread calls PushbackInputStream.unread.
> > >
> > >Ideally you'd expect the first thread to wake up and read the
> > >pushed-back bytes.  However, this doesn't happen.  And in fact it
> > >isn't easy to make it happen.  I think you could do it by making a new
> > >thread inside PushbackInputStream.read.  That is awfully heavy though.
> >
> > In general, I don't think it's useful/safe to have more than one thread
> > reading from any given InputStream, so in practice this probably isn't a
> > problem.

As Elliotte Rusty Harold put it in his book "Java I/O", p. 136:

"Never allow two threads to share a stream."

He has a couple of nice paragraphs on how hard thread safety for streams is. 
Using his own example class, he points out some flaws, and says that Sun's 
implementation suffers from similar issues. So he concludes with the line 
above.

Comparing InputStream to Reader, you'll see that, among other differences, 
Readers provide a protected lock field to synchronize on, so multi-threading 
was on the designers' minds when those classes were designed. There is no 
such lock in classes extending InputStream, so you could get the impression 
that these classes were designed to be used in a single-threaded fashion.
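
To make the difference concrete, here is a minimal sketch of how a Reader 
subclass can use the inherited lock field. CountingReader and countChars are 
made-up names for illustration, not part of any library; only the protected 
Reader(Object lock) constructor and the lock field come from java.io.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical Reader wrapper that counts the characters it hands out.
class CountingReader extends Reader {
    private final Reader in;
    private long count; // guarded by 'lock'

    CountingReader(Reader in) {
        super(in); // synchronize on the wrapped reader, as FilterReader does
        this.in = in;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        synchronized (lock) { // the protected field inherited from Reader
            int n = in.read(buf, off, len);
            if (n > 0) {
                count += n;
            }
            return n;
        }
    }

    long count() {
        synchronized (lock) {
            return count;
        }
    }

    @Override
    public void close() throws IOException {
        synchronized (lock) {
            in.close();
        }
    }

    // Illustration only: read a string to the end and report the count.
    static long countChars(String text) {
        try (CountingReader r = new CountingReader(new StringReader(text))) {
            while (r.read() != -1) {
                // discard the character; we only care about the count
            }
            return r.count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An InputStream subclass has no such inherited field, so it has to pick its 
own lock object, typically "this".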

Unfortunately, the API documentation is silent on that issue. And even if you 
decide that your streams will only support single-threaded usage, you can't 
elegantly prohibit multi-threaded usage of a class in Java. [1]

So, without an effective way to prevent it, you have to tolerate (or even 
expect) multi-threaded usage of your streams. All you can do, and should do, 
is make that experience as safe as possible. I think Tom agrees with me on 
that.

> > Besides, even if there was a way to unblock another reading
> > thread, there would still be consistency issues. If thread A reads a
> > chunk of data, then thread B reads a chunk, then thread A decides to
> > push back a byte, that byte would no longer be in the correct position
> > on the stream.

Not really. From thread A's point of view, all it has done so far is:
I read a chunk of data.
I pushed back a byte.

What should a programmer using thread A expect now?
a) There are some bytes available for reading.
b) (a), and the next byte I read will be the byte I pushed back.
c) (b), and the next chunk I read afterwards will be the chunk 
previously read by thread B.

An optimistic programmer would expect (c), a pessimistic one none of the 
above. 

From the pessimistic programmer's point of view, another thread could read 
all of the pushback stream's content before thread A gets a turn again. Or it 
could close the pushback stream. So the pessimistic programmer would 
synchronize on the pushback stream for as long as he needs exclusive access. 
Assuming that instances of streams internally synchronize on themselves to 
achieve multi-thread safety for relevant operations, his strategy would work 
out nicely. [2]
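
A minimal sketch of that pessimistic style, using a made-up helper class Peek 
(peek and demo are illustrative names, not library methods). Note the 
assumption from the paragraph above: holding the stream's monitor only keeps 
out threads that also synchronize on the same stream object, and as far as I 
know the API documentation does not promise that the stream's own methods do.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: look at the next byte without consuming it.
class Peek {
    // Returns the next byte, or -1 at end of stream, leaving it unread.
    static int peek(PushbackInputStream in) throws IOException {
        synchronized (in) { // hold the stream's monitor across both calls
            int b = in.read();
            if (b != -1) {
                in.unread(b); // put it back before anyone else can read
            }
            return b;
        }
    }

    // Illustration only: peek at a two-byte stream, then read from it.
    static int[] demo() {
        try {
            PushbackInputStream in = new PushbackInputStream(
                    new ByteArrayInputStream(new byte[] {'A', 'B'}));
            int peeked = peek(in);
            int read = in.read();
            return new int[] {peeked, read};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Every other thread touching the stream has to follow the same convention, 
otherwise the read/unread pair is not atomic after all.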

I assume that a pessimistic programming style is better suited for 
multi-threaded environments :-)

> I can't see a way out of the situation either, without creating some
> kind of dual block (block on *either* the pushback or the underlying
> stream) or possibly setting a time interval to periodically come back
> from the underlying stream and check the pushback.

I came up with the following pseudo-code:

public int read() throws IOException {
        for (;;) {
                synchronized (this) {
                        // Throws if another thread closed the stream.
                        checkIfStillOpen();

                        if (availableInPushbackBuffer() > 0) {
                                return readSingleByteFromPushbackBuffer();
                        }

                        synchronized (in) {
                                // Only read when the read cannot block.
                                if (in.available() > 0) {
                                        return readSingleByteFromInputStream();
                                }
                        }
                }

                // Nothing to read yet: let other threads run, then retry.
                Thread.yield();
        }
}

There are four ways to exit this method: 
1. checkIfStillOpen() throws an IOException (the "other thread closed the 
stream I'm blocking on"-case)
2. readSingleByteFromPushbackBuffer() returns a byte (the "there's something 
in the pushback buffer"-case)
3. readSingleByteFromInputStream() throws an IOException (the "let's 
propagate input stream IOException"-case)
4. readSingleByteFromInputStream() returns a byte (the "pushback buffer is 
empty, but there's something left on the input stream"-case)

It will block until a single byte can be read or an IOException is thrown. It 
will check if there is something available before attempting to read from the 
resource. It will obtain a lock on the resource before checking it.

That is necessary to prevent situations like this:
thread a:
        pushback buffer has a byte available!
thread b:
        pushback buffer has a byte available!
thread b:
        return readSingleByteFromPushbackBuffer(); pushback buffer is now empty
thread a:
        try to readSingleByteFromPushbackBuffer() and fail miserably.

You need to synchronize access to the input stream as well, for example to 
prevent another thread from closing it. And, of course, to prevent another 
thread from consuming the available bytes between the available() check and 
the read, which would leave your thread blocked in the input stream.

The yield() is added as multi-threading "sugar", to avoid burning cycles 
waiting on the resource if some other threads can do something useful. It 
should be very useful if the library is used in a cooperative threading 
environment.

The appropriate unread method should look like this:

public void unread(int oneByte) throws IOException {
        synchronized(this) {
                pushback(lowOrderByte(oneByte));
        }
}

This method can be exited in two ways:
1. pushback throws an IOException (the "pushback buffer is full"-case)
2. pushback works alright.

Coming back to Tom's hypothetical case:

Thread A tries to read a single byte, and blocks. It keeps acquiring and 
releasing the lock for the PushbackInputStream while it blocks.

Thread B tries to unread a single byte, but has to wait until thread A 
releases the lock. It does so eventually, and thread B unreads the byte into 
the pushback buffer. Then thread B releases the lock.

Now thread A tries to get the lock again, and finally acquires it. The 
PushbackInputStream is still open, and suddenly thread A discovers there is 
one byte available in the pushback buffer. It reads the byte, and returns 
(and releases the lock).
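
The scenario above can be tried out with a runnable sketch along these lines. 
PollingPushbackStream and demo are made-up names, the pushback buffer is a 
plain byte array, and the pipe that never receives data merely stands in for 
"an underlying stream with nothing to read". Treat it as an illustration of 
the pseudo-code under those assumptions, not as a finished implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// Hypothetical polling variant of a pushback stream.
class PollingPushbackStream extends InputStream {
    private final InputStream in;
    private final byte[] buf;
    private int pos;        // next pushed-back byte; buf.length means empty
    private boolean closed;

    PollingPushbackStream(InputStream in, int size) {
        this.in = in;
        this.buf = new byte[size];
        this.pos = size;
    }

    @Override
    public int read() throws IOException {
        for (;;) {
            synchronized (this) {
                if (closed) {
                    throw new IOException("Stream closed");
                }
                if (pos < buf.length) {
                    return buf[pos++] & 0xff; // pushed-back byte wins
                }
                synchronized (in) {
                    if (in.available() > 0) {
                        return in.read();     // will not block
                    }
                }
            }
            Thread.yield(); // lock released: unread() can get in here
        }
    }

    public synchronized void unread(int b) throws IOException {
        if (closed) {
            throw new IOException("Stream closed");
        }
        if (pos == 0) {
            throw new IOException("Pushback buffer is full");
        }
        buf[--pos] = (byte) b;
    }

    @Override
    public synchronized void close() throws IOException {
        closed = true;
        in.close();
    }

    // Tom's case: thread A polls in read(), the caller (thread B) unreads.
    static int demo() {
        try {
            PipedOutputStream idle = new PipedOutputStream(); // never written
            PollingPushbackStream s =
                    new PollingPushbackStream(new PipedInputStream(idle), 1);
            int[] result = {-2};
            Thread a = new Thread(() -> {
                try {
                    result[0] = s.read();
                } catch (IOException e) {
                    result[0] = -3;
                }
            });
            a.start();
            Thread.sleep(50);  // give thread A time to start polling
            s.unread('x');
            a.join(5000);
            return result[0];
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Two caveats with this sketch: available() typically reports 0 at end of 
stream, so read() keeps polling there instead of returning -1, and the loop 
burns CPU for as long as nothing arrives.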

If you've got this far, I'd like to hear comments :-) I think that in a 
cooperative multi-threading environment the yield() is necessary, since 
otherwise thread B might not be able to get any CPU time to unread the byte, 
but I'm not sure.

have fun,

Dalibor Topic

[1] Well, you could cache the constructing thread, check for equality on each 
method call, and throw an exception if the test fails. But that would be a 
very uncommon and wasteful interpretation of the streams API. Besides that, 
it would not be elegant at all.
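
For what it's worth, the check described in [1] might look roughly like the 
sketch below. ThreadConfinedInputStream and rejectsOtherThread are made-up 
names, the choice of IllegalStateException is my assumption, and only read 
and close are guarded here.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stream that may only be used by the thread that created it.
class ThreadConfinedInputStream extends FilterInputStream {
    private final Thread owner = Thread.currentThread(); // cached at construction

    ThreadConfinedInputStream(InputStream in) {
        super(in);
    }

    private void checkOwner() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException(
                    "stream used from a thread other than its creator");
        }
    }

    @Override
    public int read() throws IOException {
        checkOwner();
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        checkOwner();
        return super.read(b, off, len);
    }

    @Override
    public void close() throws IOException {
        checkOwner();
        super.close();
    }

    // Illustration only: true if a second thread is turned away.
    static boolean rejectsOtherThread() {
        InputStream s = new ThreadConfinedInputStream(
                new ByteArrayInputStream(new byte[] {1, 2, 3}));
        boolean[] rejected = {false};
        Thread other = new Thread(() -> {
            try {
                s.read();
            } catch (IllegalStateException e) {
                rejected[0] = true;
            } catch (IOException e) {
                // not expected for an in-memory stream
            }
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return rejected[0];
    }
}
```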

[2] Actually, a really pessimistic programmer could extend the base classes 
to his own single-threadable variants and just use those. But that's not the 
point I'm trying to prove.
