How would you split one iterator into two without iterating twice or using additional memory to store all the data?
Solution when you can store everything in memory:
l = [{'a': i, 'b': i * 2} for i in range(10)]

def a(iterator):
    for item in iterator:
        print(item)

def b(iterator):
    for item in iterator:
        print(item)

a([li['a'] for li in l])
b([li['b'] for li in l])
Or, if you can iterate twice:
class SomeIterable(object):
    def __iter__(self):
        for i in range(10):
            yield {'a': i, 'b': i * 2}

def a(some_iterator):
    for item in some_iterator:
        print(item)

def b(some_iterator):
    for item in some_iterator:
        print(item)

s = SomeIterable()
a((si['a'] for si in s))
b((si['b'] for si in s))
But how would I do it if I only want to iterate once?
If `a` must complete before `b` begins, this is literally impossible. If that isn't a requirement, the problem is still a huge pain; you either need to rewrite `a` and `b`, or you need to use threads.
`itertools.tee`?
Your `SomeIterator` class is not actually an iterator. An iterator has a `__next__` (or `next` in Python 2) method, in addition to an `__iter__` method that returns itself. Your class is more appropriately described as an "iterable".
I assume the `a` and `b` that you're showing are simple examples of what you want and not the real ones, but does `a` really need to go through the whole dataset before `b` runs? Would it be OK to supply just one element to `a`, then to `b`, and then go to the next element in the generator?
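One way to read the last suggestion, as a minimal sketch rather than a definitive answer: if `a` and `b` can be rewritten to handle a single value at a time, one loop over the source can feed both without storing anything. The names `make_items`, `handle_a`, `handle_b`, and `dispatch` are hypothetical, and `make_items` only stands in for the question's one-shot data source.

```python
def make_items():
    """Hypothetical stand-in for a data source that can only be read once."""
    for i in range(10):
        yield {'a': i, 'b': i * 2}


def handle_a(value):
    """Per-item version of the question's a(): processes one value."""
    print('a:', value)


def handle_b(value):
    """Per-item version of the question's b(): processes one value."""
    print('b:', value)


def dispatch(items, on_a, on_b):
    """Walk `items` exactly once, passing each field to its handler."""
    for item in items:
        on_a(item['a'])  # a sees this element first...
        on_b(item['b'])  # ...then b, before the next element is fetched


if __name__ == '__main__':
    dispatch(make_items(), handle_a, handle_b)
```

This only works when neither consumer needs to see the whole stream before the other starts, which is the condition the first comment points out.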
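For the `itertools.tee` suggestion, a hedged sketch: `tee` does read the source only once, but it buffers every item that one branch has consumed and the other has not. The example assumes the two streams can be consumed in lockstep (here via `zip`), which keeps that buffer to about one item. `make_items` and `lockstep_pairs` are hypothetical names, not part of the question.

```python
import itertools


def make_items():
    """Hypothetical stand-in for a data source that can only be read once."""
    for i in range(10):
        yield {'a': i, 'b': i * 2}


def lockstep_pairs(source):
    """Split one pass over `source` into an 'a' stream and a 'b' stream."""
    left, right = itertools.tee(source)  # two iterators over the same pass
    a_values = (item['a'] for item in left)
    b_values = (item['b'] for item in right)
    # zip pulls one value from each stream in turn, so the two tee branches
    # never drift more than one item apart and the buffer stays small.
    return zip(a_values, b_values)


if __name__ == '__main__':
    for a_value, b_value in lockstep_pairs(make_items()):
        print(a_value, b_value)
```

If `a` drains its branch completely before `b` starts, `tee` ends up holding every item, which is the store-everything-in-memory solution again.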