29

Consider the following Python code:

b = [1,2,3,4,5,6,7]
a = iter(b)
for x in a :
    if (x % 2) == 0 :
        print(next(a))

This prints 3, 5, and 7. Is calling next on the iterator being looped over a reliable construct (you may assume that a StopIteration exception is not a concern or will be handled), or does advancing the iterator inside the loop that consumes it violate some principle?

5
  • A question to ponder: what happens if you skip the if condition and always call next(a)? Commented Dec 13, 2018 at 14:41
  • 6
    That's fine as long as you know what you're getting into.
    – timgeb
    Commented Dec 13, 2018 at 14:41
  • 4
    Ok, but worth commenting if others are going to be using/reading the code. Commented Dec 13, 2018 at 14:47
  • Other than academic interest, I fail to understand why one would write code like this?
    – copper.hat
    Commented Dec 13, 2018 at 21:15
  • @copper.hat You can see here - stackoverflow.com/questions/53762253/… - for the motivating example. The aim is to find a line in a file too large to read into memory and process the next line. The pairwise recipe in timgeb's answer is a clearer way to do this. Commented Dec 14, 2018 at 9:49

3 Answers

27

There's nothing wrong here protocol-wise or in theory that would stop you from writing such code. Once an iterator is exhausted, every subsequent call to its __next__ method raises StopIteration, so the for loop technically won't mind if you exhaust the iterator with a next/__next__ call in the loop body.
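To make the protocol argument concrete, here is a rough sketch of what a for loop does with an iterator. The helper names for_each and body are made up for illustration, and this is a simplification of the real loop machinery, not its actual implementation.

```python
def for_each(iterator, body):
    """Rough, simplified equivalent of `for x in iterator: body(x)`."""
    while True:
        try:
            x = next(iterator)   # the loop itself calls next() once per pass
        except StopIteration:
            break                # exhaustion ends the loop quietly
        body(x)                  # the body may also advance the same iterator


printed = []
a = iter([1, 2, 3, 4, 5, 6, 7])


def body(x):
    # Same logic as the question: on an even x, consume the following element.
    if x % 2 == 0:
        printed.append(next(a))


for_each(a, body)
print(printed)                   # [3, 5, 7]

# An exhausted iterator stays exhausted; the default avoids StopIteration here.
print(next(a, 'exhausted'))      # exhausted
```

Both the loop and the body pull from the same stream, which is why any element taken by the body is never seen by the loop.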

I advise against writing such code because the program will be very hard to reason about. If the scenario gets a little more complex than what you are showing here, at least I would have to go through some inputs with pen and paper and work out what's happening.

In fact, your code snippet possibly does not even behave like you think it behaves, assuming you want to print every number that is preceded by an even number.

>>> b = [1, 2, 4, 7, 8]
>>> a = iter(b)
>>> for x in a:
...     if x % 2 == 0:
...         print(next(a, 'stop'))
...
4
stop

Why is 7 skipped although it's preceded by the even number 4?

>>> a = iter(b)
>>> for x in a:
...     print('for loop assigned x={}'.format(x))
...     if x % 2 == 0:
...         nxt = next(a, 'stop')
...         print('if popped nxt={} from iterator'.format(nxt))
...         print(nxt)
...
for loop assigned x=1
for loop assigned x=2
if popped nxt=4 from iterator
4
for loop assigned x=7
for loop assigned x=8
if popped nxt=stop from iterator
stop

Turns out x = 4 is never assigned by the for loop because the explicit next call popped that element from the iterator before the for loop had a chance to look at the iterator again.

That's something I'd hate to work out the details of when reading code.


If you want to iterate over an iterable (including iterators) in "(element, next_element)" pairs, use the pairwise recipe from the itertools documentation.

from itertools import tee                                         

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..." 
    a, b = tee(iterable) 
    next(b, None) 
    return zip(a, b) 

Demo:

>>> b = [1,2,3,4,5,6,7]
>>> a = iter(b)
>>>
>>> for x, nxt in pairwise(a): # pairwise(b) also works
...     print(x, nxt)
...
1 2
2 3
3 4
4 5
5 6
6 7

In general, itertools together with its recipes provides many powerful abstractions for writing readable iteration-related code. Even more useful helpers can be found in the more_itertools module, including an implementation of pairwise.
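As a sketch of how the recipe applies to the original problem, the following collects every element that directly follows an even number. The function name after_even is made up for illustration. On Python 3.10 and later, itertools.pairwise is available in the standard library and could replace the recipe.

```python
from itertools import tee


def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)


def after_even(iterable):
    """Return each element that directly follows an even element."""
    return [nxt for x, nxt in pairwise(iterable) if x % 2 == 0]


# The input that tripped up the next()-in-loop version: 7 is no longer skipped.
print(after_even(iter([1, 2, 4, 7, 8])))   # [4, 7]
```

Because every element shows up once as x and once as nxt, no element can be silently consumed the way 4 was in the earlier trace.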

8
  • The example above is, of course, a toy, you can see the motivating example here: stackoverflow.com/questions/53762253/… where the asker wants to search for a particular sequence in one line and print the next. Commented Dec 13, 2018 at 16:30
  • @JackAidley Usually, in cases like these, I believe the pattern to zip the list with itself, offset by one is interesting. For instance, you could do : for previous, current in zip(my_list, my_list[1:]):. This prevents you from doing fancy tricks with next, and is quite readable and elegant. Commented Dec 13, 2018 at 16:37
  • @VincentSavard This doesn't work when reading lines from a file, however. Commented Dec 13, 2018 at 16:51
  • @JackAidley Why wouldn't it work? You can read the first two lines to create the first element, then it's just a matter of reading the file line by line. Commented Dec 13, 2018 at 16:54
  • 5
    @VincentSavard I think Jack means that you cannot slice the iterator over lines you get from opening a file. pairwise from the itertools docs handles that.
    – timgeb
    Commented Dec 13, 2018 at 16:56
4

It depends on what you mean by 'safe'. As others have commented, it is okay, but you can imagine contrived situations that might catch you out. For example, consider this code snippet:

b = [1,2,3,4,5,6,7]
a = iter(b)
def yield_stuff():
    for item in a:
        print(item)
        print(next(a))
    yield 1

list(yield_stuff())

On Python <= 3.6 it runs and outputs:

1
2
3
4
5
6
7

On Python 3.7 and later, however, it raises RuntimeError: generator raised StopIteration. This is expected under PEP 479, and if you handle StopIteration yourself you may never encounter it. Still, use cases for calling next() inside a for loop are rare, and there is normally a clearer way to refactor the code.
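One way to keep such a generator working across versions is to catch StopIteration explicitly around the inner next() call. The sketch below is a hypothetical variant (the name in_twos is invented here) that yields the items in pairs instead of printing them.

```python
def in_twos(iterator):
    """Yield (item, following) tuples; pad with None if the input runs out."""
    for item in iterator:
        try:
            following = next(iterator)
        except StopIteration:
            # Handle exhaustion here instead of letting it escape the
            # generator, which PEP 479 would turn into a RuntimeError.
            yield (item, None)
            return
        yield (item, following)


result = list(in_twos(iter([1, 2, 3, 4, 5, 6, 7])))
print(result)   # [(1, 2), (3, 4), (5, 6), (7, None)]
```

Passing a default, as in next(iterator, None), would work as well, provided None cannot be a legitimate element.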

1
  • Nice catch on a subtle difference between versions. Commented Dec 14, 2018 at 9:56
4

If you modify your code to see what happens to iterator a:

b = [1,2,3,4,5,6,7]
a = iter(b)

for x in a:
    print('x', x)
    if x % 2 == 0:
        print('a', next(a))  # consumes the element that follows x

You will see the printout:

x 1
x 2
a 3
x 4
a 5
x 6
a 7

This shows that each call to next(a) advances the iterator, so the for loop never sees the element that was consumed. If you need (or will later need) to do something else with iterator a, you will have problems. For complete safety, use the recipes from the itertools module. For example:

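from itertools import tee

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)   # two independent iterators over the same data
    next(b, None)          # advance the second one by one element
    return zip(a, b)       # zip is lazy in Python 3 (izip in Python 2)

b = [1,2,3,4,5,6,7]
a = iter(b)
c = pairwise(a)

for x, next_x in c:
    if x % 2 == 0:
        print(next_x)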

Note that here you have access to both the current element and the next one at every point in the loop.
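For the motivating case mentioned in the comments (find a matching line in a large file and process the line after it), another option is to carry a flag across iterations and never call next() at all. This is only a sketch: lines_after and the marker text are made up for illustration, and io.StringIO stands in for a real file opened with open().

```python
import io


def lines_after(lines, marker):
    """Lazily yield each line that directly follows a line containing marker."""
    previous_matched = False
    for line in lines:
        if previous_matched:
            yield line
        # Remember the match so the next pass of the loop can act on it.
        previous_matched = marker in line


# io.StringIO iterates line by line, just like a file object.
fake_file = io.StringIO("a\nMARK one\nb\nMARK two\nMARK three\nc\n")
found = [line.rstrip("\n") for line in lines_after(fake_file, "MARK")]
print(found)   # ['b', 'MARK three', 'c']
```

Only one line is held at a time, and consecutive matching lines are handled because the loop itself sees every line.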
