
Re: [really patch] Re: HashMap putAll/putAllInternal bug


From: Stuart Ballard
Subject: Re: [really patch] Re: HashMap putAll/putAllInternal bug
Date: Mon, 29 Sep 2003 10:57:40 -0400
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3.1) Gecko/20030527 Debian/1.3.1-2

Bryce McKinlay wrote:
> size() is used here because, obviously, it is generally more efficient to call it once rather than calling hasNext() many times. I believe that the current implementation is within spec according to the collections documentation. If your collections are returning an inaccurate size() then I'd argue they are not valid implementations of Map.

Sure: as I noted, my argument is that Sun's implementation can handle such invalid implementations of Map, so people might rely on it, as I did.

I could fix up my implementation of Map to guarantee that size() is correct, but that would make this operation much *slower* than using hasNext() would. To get an accurate size() out of my data structure would require iterating across it fully every time size() is called.
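To give a rough idea of why (this is only a sketch of the general shape, not nrdo's actual code): imagine the entries living in something like a linked chain that keeps no count, so the only way to report an accurate size is to walk every record:

// Sketch only - not nrdo's actual code. The chain keeps no count, so
// an accurate size() means traversing every record, every time.
class RecordChain {
  static class Node {
    Object key, value;
    Node next;
  }

  Node head; // first record, or null if the chain is empty

  // A Map wrapper's size() would have to delegate to this: a full walk
  // of the chain on every call, which is exactly the cost I can't afford.
  int size() {
    int count = 0;
    for (Node pos = head; pos != null; pos = pos.next)
      count++;
    return count;
  }
}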

Furthermore, I'd argue that the very existence of a hasNext() method suggests that Sun didn't intend people to make this "optimization". If they expected that calling size() and maintaining your own counter would always be more efficient, why didn't they just leave hasNext() out and recommend coding that way?

Note too that our current implementations leave the collections in an invalid state if next() causes a ConcurrentModificationException (or other RuntimeException) part way through, because they set the size variable in advance without waiting to ensure that all those elements could in fact be added. It would be possible to fix this problem without using hasNext(), and even if my patch isn't accepted I still think we should at least fix that.
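Roughly, the pattern I mean looks like this (just a sketch with hypothetical names like addEntry - not the literal Classpath code):

import java.util.Iterator;
import java.util.Map;

// Sketch of the problem only; size and addEntry stand in for the real
// internals, they aren't the literal Classpath names.
abstract class FragileMapSketch {
  int size; // the collection's cached size field

  // stands in for "link this entry into the buckets"
  abstract void addEntry(Object key, Object value);

  void putAllInternal(Map m) {
    Iterator itr = m.entrySet().iterator();
    // size is committed before a single entry has been copied...
    size = m.size();
    for (int n = size; n > 0; n--) {
      // ...so if next() throws part way through (a
      // ConcurrentModificationException, say), we unwind with size
      // claiming entries that never made it into the table.
      Map.Entry e = (Map.Entry) itr.next();
      addEntry(e.getKey(), e.getValue());
    }
  }
}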

> Of course, if there are real applications out there that rely on the way Sun implements it, then we may have to change to using hasNext(). But we should consider this carefully. If we must change it, then the addAll/putAll implementations should change throughout the collections classes for consistency - not just HashMap/Hashtable. As you noticed, in some cases this could make things significantly less efficient.

Firstly, there is at least one real application that relies on this :) If I could see a viable workaround, I'd modify nrdo to provide accurate size() information so that Classpath wouldn't need to change (although I still think that it's important to be bug-for-bug compatible with Sun's implementation, so I'd still be in favor of this change even if I weren't personally relying on it). But in the case of my particular data structure it's impossible to do that without slowing everything down, and that isn't acceptable for my application.

I think it's perfectly possible to implement all our collections to give the correct behavior without making things less efficient in the case where size() is correct - at least, the only difference should be the cost of repeated calls to hasNext() versus a single call to size(), and the cost of hasNext() should *always* be small enough that that difference falls into the realm of micro-optimization.

The example I gave was ArrayList: the optimization is to pre-allocate space for size() worth of elements using ensureCapacity(), and then copy elements directly into the backing array. It would be perfectly possible to implement it something like this, which optimizes for when size() is correct but is robust if it's not:

int csize = c.size();
ensureCapacity(size() + csize);
Iterator i = c.iterator();
for ( ; csize-- > 0 && i.hasNext(); ) {
  // put i.next() directly into the array and increment size,
  // exactly as it's done now.
}
while (i.hasNext()) {
  add(i.next()); // use the standard add method, which will
                 // grow the array further if needed.
}

In this version, the for() loop checks i.hasNext() as well as csize so that it's robust against the actual size being smaller than csize. The while() loop uses a slower path for extra elements in case the actual size is larger than csize. And the only additional cost of this algorithm is extra calls to hasNext(), which should always be cheap.

I don't think there are any ways where size() could be used as an optimization that aren't susceptible to a similar approach - optimize for the case where size() is correct, but still be prepared in case it's not.
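For instance, a putAll-style loop could treat size() purely as a capacity hint and let hasNext() decide when it's actually done (again only a sketch with hypothetical internals, not a drop-in patch):

import java.util.Iterator;
import java.util.Map;

// Sketch only; size, addEntry and ensureCapacityFor are hypothetical
// stand-ins for the real internals.
abstract class RobustMapSketch {
  int size;

  abstract void addEntry(Object key, Object value); // fast internal insert
  abstract void ensureCapacityFor(int expected);    // one-shot grow/rehash

  void putAllInternal(Map m) {
    // size() is only a hint: if it's accurate we grow the table exactly
    // once, if it's wrong we merely grow again later.
    ensureCapacityFor(size + m.size());
    Iterator itr = m.entrySet().iterator();
    while (itr.hasNext()) {
      Map.Entry e = (Map.Entry) itr.next();
      addEntry(e.getKey(), e.getValue());
      size++; // only count entries that were really added
    }
  }
}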

Stuart.

--
Stuart Ballard, Senior Web Developer
FASTNET - Web Solutions
(215) 283-2300, ext. 126
www.fast.net




