Pythonic way of splitting loop over list in two parts with one iterator

Question

I am processing a text file with an irregular structure: a header followed by data in several sections. I want to walk through a list and jump to the next section once a certain character is encountered. A simplified example is below. What is an elegant way of dealing with this problem?

lines = ['a','b','c','$', 1, 2, 3]

for line in lines:
    if line == '$':
        print("FOUND END OF HEADER")
        break
    else:
        print("Reading letters")

# Here, I start again, but I would like to continue with the actual
# state of the iterator, in order to only read the remaining elements.
for line in lines:
    print("Reading numbers")

can't you just call lines.index('$') to get the separator position? — EdChum, Commented Mar 16, 2018 at 13:11
The use case is more complex than that, I am just searching for a general way of doing this. — Chiel, Commented Mar 16, 2018 at 13:34
it would be productive if you post your real problem rather than this trivial one then — EdChum, Commented Mar 16, 2018 at 13:35
The real problem has a header first, then a separator, and then a set of arbitrary sections, each with a case-specific number of lines. — Chiel, Commented Mar 16, 2018 at 13:42

Olivier Melançon · Accepted Answer · 2018-03-16 13:28:09Z

3

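You can use one iterator for both loops by creating it outside the for loop with the built-in function iter. A for loop over the list itself builds a fresh iterator each time, which is why the second loop in the question starts over. With a shared iterator, the first loop only partially exhausts it, and the second loop continues from where the first one stopped.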

lines = ['a','b','c','$', 1, 2, 3]

iter_lines = iter(lines)  # This creates an iterator over lines

for line in iter_lines:
    if line == '$':
        print("FOUND END OF HEADER")
        break
    else:
        print("Reading letters")

for line in iter_lines:
    print("Reading numbers")

The above prints this result.

Reading letters
Reading letters
Reading letters
FOUND END OF HEADER
Reading numbers
Reading numbers
Reading numbers
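
The same idea extends to more than two sections, which is the situation described in the question's comments. Below is a minimal sketch, assuming each section ends at a known separator. The helper name read_section, the '#' separator, and the extra sample data are hypothetical and not part of the original answer.

```python
def read_section(iterator, separator):
    """Collect items from a shared iterator until `separator` is seen.

    The separator itself is consumed and not returned. If the iterator
    runs out first, everything that was left is returned.
    """
    section = []
    for item in iterator:
        if item == separator:
            break
        section.append(item)
    return section


lines = ['a', 'b', 'c', '$', 1, 2, 3, '#', 'x', 'y']
iter_lines = iter(lines)  # one iterator shared by every call below

header = read_section(iter_lines, '$')
numbers = read_section(iter_lines, '#')
rest = list(iter_lines)  # whatever remains after the last separator

print(header)   # ['a', 'b', 'c']
print(numbers)  # [1, 2, 3]
print(rest)     # ['x', 'y']
```

A file object is already its own iterator, so a helper like this should also work on an open file without calling iter. Lines read from a file keep their trailing newline, so they would need to be stripped (or the separator adjusted) before comparing.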

edited Mar 16, 2018 at 13:28

answered Mar 16, 2018 at 13:17

This is the most python answer. +1
– John Coleman
Commented Mar 16, 2018 at 13:42
This is exactly what I was looking for
– Chiel
Commented Mar 16, 2018 at 13:43

John Coleman · Answer · 2018-03-16 13:16:15Z

1

You could use enumerate to keep track of where you are in the iteration:

lines = ['a','b','c','$', 1, 2, 3]

for i, line in enumerate(lines):
    if line == '$':
        print("FOUND END OF HEADER")
        break
    else:
        print("Reading letters")

print(lines[i+1:])  # prints [1, 2, 3]

But, unless you actually need to process the header portion, the idea of @EdChum to simply use index is probably better.
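
For comparison, here is a minimal sketch of that index-based alternative. It assumes only the first occurrence of the separator matters (list.index returns the first match) and treats a missing separator as "everything is header". The variable names and that fallback choice are illustrative assumptions.

```python
lines = ['a', 'b', 'c', '$', 1, 2, 3]

try:
    sep = lines.index('$')  # position of the first '$'
except ValueError:
    # list.index raises ValueError when the value is absent;
    # in that case this sketch treats the whole list as header.
    sep = len(lines)

header = lines[:sep]      # everything before the separator
body = lines[sep + 1:]    # everything after it (empty if no separator)

print(header)  # ['a', 'b', 'c']
print(body)    # [1, 2, 3]
```

Unlike the shared-iterator approach, this needs a list (or another sequence with index and slicing), and each slice is a copy.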

answered Mar 16, 2018 at 13:16

Austin · Answer · 2018-03-16 13:30:39Z

0

A simpler and perhaps more Pythonic way is to slice the list after the separator:

lines = ['a','b','c','$', 1, 2, 3]
print(lines[lines.index('$')+1:])
# [1, 2, 3]

If you want to read each element after $ into its own variable, try this:

lines = ['a','b','c','$', 1, 2, 3]
a, b, c = lines[lines.index('$')+1:]
print(a, b, c)
# 1 2 3

Or, if you don't know how many elements follow $, you could use starred unpacking:

lines = ['a','b','c','$', 1, 2, 3, 4, 5, 6]
a, *b = lines[lines.index('$')+1:]
print(a, *b)
# 1 2 3 4 5 6

edited Mar 16, 2018 at 13:30

answered Mar 16, 2018 at 13:19

sciroccorics · Answer · 2018-03-16 14:22:19Z

If you have more than one kind of separator, the most generic solution would be to build a mini state machine to parse your data:

def state0(line):
  pass # processing function for state0

def state1(line):
  pass # processing function for state1

# and so on...

states = (state0, state1, ...)     # tuple grouping all processing functions
separators = {'$':1, '#':2, ...}   # linking separators and states
state = 0                          # initial state

for line in text:
  if line in separators:
    print('Found separator', line)
    state = separators[line]       # change state
  else:
    states[state](line)            # process line with associated function

This solution can correctly process an arbitrary number of separators, in arbitrary order, with an arbitrary number of repetitions. The only constraint is that a given separator is always followed by the same kind of data, which can be processed by its associated function.
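
The outline above leaves the handlers and the input unspecified. Here is one runnable sketch of the same state-machine idea, under assumptions of my own: the handler names (read_header, read_numbers, read_names), the result dictionary, and the sample data are all hypothetical and only serve to make the example self-contained.

```python
def read_header(line, result):
    result['header'].append(line)    # state 0: lines before any separator

def read_numbers(line, result):
    result['numbers'].append(line)   # state 1: lines following '$'

def read_names(line, result):
    result['names'].append(line)     # state 2: lines following '#'

# Handlers indexed by state number; state 0 is the initial state.
states = (read_header, read_numbers, read_names)
# Each separator switches to the state that handles the data after it.
separators = {'$': 1, '#': 2}

def parse(lines):
    result = {'header': [], 'numbers': [], 'names': []}
    state = 0
    for line in lines:
        if line in separators:
            state = separators[line]      # change state
        else:
            states[state](line, result)   # process line in current state
    return result

# '$' appears twice to show that sections may repeat in any order.
data = ['a', 'b', '$', 1, 2, '#', 'x', '$', 3]
parsed = parse(data)
print(parsed)
# {'header': ['a', 'b'], 'numbers': [1, 2, 3], 'names': ['x']}
```

Passing a result container to each handler is just one way to collect output. The handlers could equally print, write to a file, or update an object.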
