6

Suppose I have a tail recursive method that loops over the elements of a Stream, like this (simplified code, not tested):

@tailrec
def loop(s: Stream[X], acc: Y): Y = 
  s.headOption match {
    case None => acc
    case Some(x) => loop(s.tail, accumulate(x, acc))
  }

Am I keeping references to the head (and all other elements) of the stream while iterating, which I know should be avoided? If so, then what is a better way to achieve the same thing?

The code that calls this is (I hope) not keeping a reference. Suppose list is a List[X] then the code is calling

loop(list.sliding(n).toStream, initialY)

EDIT: I know this can be done easily without tail recursion (e.g. using foldLeft) but the non-simplified code is not looping exactly one element at a time (sometimes s is used instead of s.tail and sometimes s.tail.dropWhile(...) is used. So I'm looking to find out how to correctly use Stream.

14
  • 1
    What does accumulate doing here? From the code, I think loop is probably a wrong name here. Looks like Y is a collection here and if so then what you have done is perfectly alright.
    – Jatin
    Commented Jan 15, 2014 at 16:01
  • 1
    @Jatin what accumulate does and what Y is (fictional type) is not important here, just wanted to show that something is being done with the current element and the accumulated result. I think loop is fine since I'm looping over all the elements of the stream.
    – herman
    Commented Jan 15, 2014 at 16:12
  • @herman: thank you! It's the best question I have ever answered!
    – senia
    Commented Jan 15, 2014 at 20:10
  • I'm curious -- why did you make the function take a Stream as opposed to an Iterator? list.sliding(n) is already an Iterator, so the code would be simpler. It would probably be somewhat more efficient to use an Iterator as well. Also, Iterator is a more general interface, and (from what I can see here) all you need. And by avoiding Streams you would eliminate the memory concerns you speak of. If you really wanted to call it with a Stream, no problem -- Streams can give you Iterators.
    – AmigoNico
    Commented Jan 16, 2014 at 2:34
  • @AmigoNico: Iterator is mutable - it could lead to strange errors.
    – senia
    Commented Jan 16, 2014 at 6:40

2 Answers 2

8

tl;dr: your method loop is correct, there will be no reference to the head of the stream. You could test it on infinite Stream.

Let's simplify your code sample to the limit:

class Test {
  private[this] var next: Test = _

  final def fold(): Int = {
    next = new Test
    next.fold()
  }
}

Note that your loop method is also a method of some object.

The method is final (just like Stream#foldLeft) - it's very important.

With scalac -Xprint:all test.scala after tail recursion optimization you'll get this:

final def fold(): Int = {
  <synthetic> val _$this: Test = Test.this;
  _fold(_$this: Test){
    ({
      Test.this.next = new Test();
      _fold(Test.this.next)
    }: Int)
  }
};

And this code will not help you to understand what is going on.

The only road to the magical land of understanding is the java bytecode.

But you should remember one thing: there is no such thing as method of object. All methods are "static". And this is just the first parameter of the method. If the method is virtual there is such thing as vtable, but our method is final, so there will be no dynamic dispatch in this case.

Also note that there is no such thing as parameter: all parameters are just variables, initialized before method execution.

So this is just the first variable (index 0) of the method.

Let's take a look at the bytecode (javap -c Test.class):

public final int fold();
  Code:
     0: aload_0       
     1: new           #2                  // class Test
     4: dup           
     5: invokespecial #16                 // Method "<init>":()V
     8: putfield      #18                 // Field next:LTest;
    11: aload_0       
    12: getfield      #18                 // Field next:LTest;
    15: astore_0      
    16: goto          0

Let's write this method in pseudo scala-like code:

static foo(var this: Test): Int {
  :start // label for goto jump

  // place variable `this` onto the stack:
  //   0: aload_0       

  // create new `Test`
  //   1: new           #2                  // class Test
  // invoke `Test` constructor
  //   4: dup           
  //   5: invokespecial #16                 // Method "<init>":()V

  // assign `this.next` field value
  //   8: putfield      #18                 // Field next:LTest;

  this.next = new Test

  // place `this.next` onto the stack
  //  11: aload_0       
  //  12: getfield      #18                 // Field next:LTest;

  // assign `this.next` to variable `this`!
  //  15: astore_0      
  this = this.next // we have no reference to the previous `this`!

  //  16: goto          0
  goto :start
}

After this = this.next we have no reference to the previous this on stack or in the first variable. And the previous this can be removed by GC!

So tail.foldLeft(...) in Stream#foldLeft will be replaced with this = this.tail, ...; goto :start. And since this is just a firs argument of method @tailrec before foldLeft declaration makes sense.

And now we can finally understand scalac -Xprint:all test.scala result:

final def method(a: A, b: B, ...): Res = {
  <synthetic> val _$this: ThisType = ThisType.this;
  _method(_$this: Test, a: A, b: B, ...){
    ({
      // body
      _method(nextThis, nextA, nextB, ...)
    }: Res)
  }
};

means:

final def method(var this: ThisType, var a: A, var b: B, ...): Res = {
  // _method(_$this: Test, a: A, b: B, ...){
  :start

  // body

  //   _method(nextThis, nextA, nextB, ...)
  this = nextThis
  a = nextA
  b = nextB
  ...
  goto :start
};

And this is exactly what you'll get after scalac -Xprint:all on your loop method, but body will be huge. So in your case:

...
case Some(x) =>
  this = this
  s = s.tail
  acc = accumulate(x, acc)
  goto :start
...

After s = s.tail you have no reference to the head of the stream.

2
  • This looks good. In my case the method was not final but inside another method which works as well. Thanks for the research! Maybe I should have asked "how does tail call optimization work". I knew about not keeping a reference in the calling code, but now I'm confident that passing it as an argument doesn't pose a problem.
    – herman
    Commented Jan 15, 2014 at 22:24
  • @herman: private method is also not overridable, just like final. Note also that all methods of object are final.
    – senia
    Commented Jan 17, 2014 at 10:16
1

In most cases where this question comes up, the more important question is "does the calling code hang onto the head of stream?"

What really matters is that you passed the output from another method directly to loop, rather than assigning it to a val first.

That said, I would just avoid all possible confusion by using a simpler approach:

list.sliding(n).foldLeft(initialY)(accumulate)
14
  • The Stream is not infinite. But it's big enough that I don't want to cache the values. I've updated the question with code similar to the actual calling code. Basically sliding over a million elements using a window size n that could potentially be several hundreds of elements. So I wouldn't like to cache n * 1000000 elements.
    – herman
    Commented Jan 15, 2014 at 16:18
  • @Senia - well I never... I guess this is why I just use Iterators. Commented Jan 15, 2014 at 16:19
  • @senia hmm, I guess you're right: @tailrec confused me, it appears to be valid on calls that are not really recursive and I guess that is what helps with the stack overflow. I'm still puzzled about how the Stream object on which foldLeft is initially called can already be garbage collected even while foldLeft` is still executing.
    – herman
    Commented Jan 15, 2014 at 18:03
  • @herman At a guess, it's because tail recursive functions hang on to their arguments, but you don't need to pass the stream in as an argument when the method you're calling is a member of the Stream class Commented Jan 15, 2014 at 18:08
  • 1
    @herman There's no such thing as an expert, just people who've learned enough to truly appreciate how much more there is that they could possibly learn. Commented Jan 15, 2014 at 18:18

Not the answer you're looking for? Browse other questions tagged or ask your own question.