
Does the specification guarantee that all operations on sequential Java Streams are executed in the current thread (except for "forEach" and "forEachOrdered")?

I explicitly ask for the specification, not what the current implementation does. I can look into the current implementation myself and don't need to bother you with that. But the implementation might change and there are other implementations.

I'm asking because of ThreadLocals: I use a framework which uses ThreadLocals internally. Even a simple call like company.getName() eventually uses a ThreadLocal. I cannot change how that framework is designed, at least not within a sane amount of time.
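To make the concern concrete, here is a minimal sketch of why the executing thread matters for ThreadLocals. The class name and the CONTEXT variable are hypothetical stand-ins for the framework's internal state, not part of any real framework. A value set in one thread is not visible to another thread, so a lambda that ran elsewhere would only see the initial value:

```java
final class ThreadLocalVisibility {
    // Hypothetical stand-in for state that a framework keeps in a ThreadLocal.
    static final ThreadLocal<String> CONTEXT = ThreadLocal.withInitial(() -> "unset");

    // Reads CONTEXT from a freshly started thread and returns what that thread saw.
    static String readInOtherThread() {
        final String[] seen = new String[1];
        Thread other = new Thread(() -> seen[0] = CONTEXT.get());
        other.start();
        try {
            other.join(); // join() makes the write to seen[0] visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        CONTEXT.set("request-1");
        System.out.println(CONTEXT.get());       // prints "request-1"
        System.out.println(readInOtherThread()); // prints "unset"
    }
}
```

If any stream operation were allowed to run in a thread other than the caller's, code such as company.getName() would be in the position of readInOtherThread() here.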

The specification seems confusing here. The documentation of the package "java.util.stream" states:

If the behavioral parameters do have side-effects, unless explicitly stated, there are no guarantees as to the visibility of those side-effects to other threads, nor are there any guarantees that different operations on the "same" element within the same stream pipeline are executed in the same thread.

...

Even when a pipeline is constrained to produce a result that is consistent with the encounter order of the stream source (for example, IntStream.range(0,5).parallel().map(x -> x*2).toArray() must produce [0, 2, 4, 6, 8]), no guarantees are made as to the order in which the mapper function is applied to individual elements, or in what thread any behavioral parameter is executed for a given element.
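The example from the quoted documentation can be run as a small sketch. The class name and the MAPPER_THREADS set are assumptions added for illustration. The resulting array is fixed by the encounter order, while the set of thread names that executed the mapper is deliberately left without an expected value, because the documentation makes no promise about it:

```java
import java.util.Arrays;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

final class EncounterOrderDemo {
    // Names of all threads that executed the mapper function (thread-safe set).
    static final Set<String> MAPPER_THREADS = ConcurrentHashMap.newKeySet();

    static int[] doubled() {
        return IntStream.range(0, 5)
                .parallel()
                .map(x -> {
                    // Side effect used only to observe which thread ran the mapper.
                    MAPPER_THREADS.add(Thread.currentThread().getName());
                    return x * 2;
                })
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(doubled())); // prints "[0, 2, 4, 6, 8]"
        System.out.println(MAPPER_THREADS);             // varies between runs and machines
    }
}
```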

I would interpret that as: Every operation on a stream can happen in a different thread. But the documentation of "forEach" and "forEachOrdered" explicitly states:

For any given element, the action may be performed at whatever time and in whatever thread the library chooses.

That statement would be redundant if every stream operation could happen in an unspecified thread. Is the opposite therefore true: all operations on a sequential stream are guaranteed to be executed in the current thread, except for "forEach" and "forEachOrdered"?

I have googled for an authoritative answer about the combination of "Java", "Stream" and "ThreadLocal" but found nothing. The closest thing was an answer by Brian Goetz to a related question here on Stack Overflow, but it is about the order, not the thread, and it is only about "forEach", not the other stream methods: Does Stream.forEach respect the encounter order of sequential streams?

  • The forEach documentation starts with "For parallel stream pipelines...", so this only concerns parallel processing, even though there are two sentences.
    – Eugene
    Commented May 23, 2018 at 14:56
  • About the "For parallel stream pipelines...": In the other question I have linked, Brian Goetz states that the restriction "For parallel stream pipelines" only applies to this one sentence. The sentence that I quoted is not restricted by it. (If I understand him correctly.)
    – user194860
    Commented May 23, 2018 at 15:11
  • Excellent question indeed. Note that this sentence has also been added to the JavaDocs of iterate in Java 9, which did not contain it in Java 8.
    – Hulk
    Commented May 24, 2018 at 8:03
  • Related: stackoverflow.com/q/45871618/2513200 (perhaps even a duplicate), but I'll admit that the only answer there doesn't really convince me
    – Hulk
    Commented May 24, 2018 at 8:45
  • @Hulk I think it's only up to Stuart Marks or Brian Goetz to answer this, eagerly waiting...
    – Eugene
    Commented May 24, 2018 at 11:55

1 Answer

I believe the answer you are looking for is not well defined, because it depends on the consumer and/or spliterator and their characteristics.

Before reading the main quote:

https://docs.oracle.com/javase/8/docs/api/java/util/Collection.html#stream

default Stream<E> stream(): Returns a sequential Stream with this collection as its source. This method should be overridden when the spliterator() method cannot return a spliterator that is IMMUTABLE, CONCURRENT, or late-binding. (See spliterator() for details.)

https://docs.oracle.com/javase/8/docs/api/java/util/Spliterator.html#binding

Despite their obvious utility in parallel algorithms, spliterators are not expected to be thread-safe; instead, implementations of parallel algorithms using spliterators should ensure that the spliterator is only used by one thread at a time. This is generally easy to attain via serial thread-confinement, which often is a natural consequence of typical parallel algorithms that work by recursive decomposition. A thread calling trySplit() may hand over the returned Spliterator to another thread, which in turn may traverse or further split that Spliterator. The behaviour of splitting and traversal is undefined if two or more threads operate concurrently on the same spliterator. If the original thread hands a spliterator off to another thread for processing, it is best if that handoff occurs before any elements are consumed with tryAdvance(), as certain guarantees (such as the accuracy of estimateSize() for SIZED spliterators) are only valid before traversal has begun.

Spliterators and consumers have their own set of characteristics, and those define the guarantee. Suppose you are operating on a stream. Because spliterators are not expected to be thread-safe and may hand elements over to other spliterators that may run in another thread, whether the stream is sequential or not, there is no guarantee. However, if no split occurs, the quotes lead to the following: under a single spliterator, the operations remain in the same thread. Any event that leads to a split invalidates that assumption; otherwise it holds.

  • How does this answer the question?
    – Eugene
    Commented May 23, 2018 at 19:48
  • Spliterators and consumers have their own set of characteristics, and those define the guarantee. Suppose you are operating on a stream. Because spliterators are not expected to be thread-safe and may hand elements over to other spliterators that may run in another thread, whether the stream is sequential or not, there is no guarantee. However, if no split occurs, the quotes lead to the following: under a single spliterator, the operations remain in the same thread. Any event that leads to a split invalidates that assumption; otherwise it holds.
    – Victor
    Commented May 24, 2018 at 13:05
  • Thanks, I will look further into this and try to explain it better when I get a break at work. For now I just copied my comment there.
    – Victor
    Commented May 24, 2018 at 13:35
  • I guess you could enforce all elements to be processed by a single thread by building your stream on an iterator that refuses to split (i.e. always returns null from trySplit), but I still don't see a reason why this single thread could not be some other thread than the one that invokes the terminal operation.
    – Hulk
    Commented May 24, 2018 at 13:54
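The idea of a source that refuses to split can be sketched as a thin wrapper around an existing spliterator. The NonSplitting class, the THREADS set, and the doubleAll method are hypothetical names used only for illustration. The sketch shows how to prevent splitting; it does not demonstrate any guarantee about which thread ends up doing the traversal, which is exactly the open point of the comment above:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.Spliterator;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

final class NonSplittingDemo {
    // Names of the threads that executed the mapper, recorded for observation only.
    static final Set<String> THREADS = ConcurrentHashMap.newKeySet();

    // Delegates everything to the wrapped spliterator but never hands out a split.
    static final class NonSplitting<T> implements Spliterator<T> {
        private final Spliterator<T> delegate;

        NonSplitting(Spliterator<T> delegate) {
            this.delegate = delegate;
        }

        @Override public boolean tryAdvance(Consumer<? super T> action) {
            return delegate.tryAdvance(action);
        }

        @Override public Spliterator<T> trySplit() {
            return null; // refuse to split, so there is nothing to distribute
        }

        @Override public long estimateSize() {
            return delegate.estimateSize();
        }

        @Override public int characteristics() {
            return delegate.characteristics();
        }

        @Override public Comparator<? super T> getComparator() {
            return delegate.getComparator();
        }
    }

    static List<Integer> doubleAll(List<Integer> source) {
        Spliterator<Integer> sp = new NonSplitting<>(source.spliterator());
        // A parallel stream is requested on purpose; the source still cannot be split.
        return StreamSupport.stream(sp, true)
                .map(x -> {
                    THREADS.add(Thread.currentThread().getName());
                    return x * 2;
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(doubleAll(Arrays.asList(1, 2, 3))); // prints "[2, 4, 6]"
        System.out.println(THREADS); // which thread(s) ran the mapper is not specified
    }
}
```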
  • @Hulk even then, it would be valid if tryAdvance is invoked on the same instance by different threads which coordinate their access, in other words, establish a happens-before relationship between these two invocations. Only concurrent access to the same spliterator is forbidden.
    – Holger
    Commented May 24, 2018 at 16:00
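A minimal sketch of such a coordinated handoff, assuming Thread.start() and Thread.join() as the means of establishing the happens-before relationship (the class and method names are hypothetical). Two different threads call tryAdvance on the same spliterator instance, but never concurrently:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

final class SpliteratorHandoff {
    // Advances the same spliterator once from each of two threads, one after the other.
    static List<String> advanceFromTwoThreads(List<String> source) {
        Spliterator<String> sp = source.spliterator();
        List<String> seen = new ArrayList<>();
        runAndJoin(() -> sp.tryAdvance(e -> seen.add(e + "@first")));
        runAndJoin(() -> sp.tryAdvance(e -> seen.add(e + "@second")));
        return seen;
    }

    // start() and join() order each thread's work before the next step of the caller.
    private static void runAndJoin(Runnable work) {
        Thread t = new Thread(work);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints "[a@first, b@second]"
        System.out.println(advanceFromTwoThreads(Arrays.asList("a", "b", "c")));
    }
}
```

Because the second thread is started only after the first has been joined, the two tryAdvance calls are ordered and no thread-safety requirement of the spliterator is violated, even though the elements are consumed in different threads.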
