Repeater
-- Dave Beazley, "Iterations of Evolution: The Unauthorized Biography of the For-Loop"
Using for loops and list comprehensions in Python is basic and quite common, right? But how does iteration in Python actually work “under the hood”? The words “iterator” and “iterable” each occur over 500 times in the Python documentation, but what does an iterator actually do, as opposed to an iterable? And how do they do it? Learn the details as we turn the iteration protocol inside out, with live coded demonstrations along the way.
This talk will start from the way Python iterates over a sequence, in comparison with iterating by index, as in C. The key point of iterating over a sequence is that something needs to track which item in the sequence is next, and that is what Python’s iteration protocol manages.
The iterable section will demonstrate creating a simple object that returns items by index (e.g., a Fibonacci series), showing that __getitem__() is really all you need for an iterable, since an iterator is created for such objects when they are iterated over. But this doesn’t answer the question of how Python keeps track of which item is next.
The iterator section answers that question by converting the iterable just created to an iterator: adding __iter__() and __next__() methods and showing how the iterator saves state and essentially drives the iteration protocol.
Having an accurate understanding of the iteration protocol will help developing Pythonistas reason better about both iterating over existing objects and creating their own iterables and iterators.
Consider the following:
In Python we'd normally use a for loop to access each element ...
for temp in temp_readings:
    print(temp)
all_ids = [cust_id for cust_id in customers]
product_gen = (product for product in
               csv.reader(open("product_file.csv")))
It wasn't always so obvious...
for loops
The for statement in Python differs a bit from what you may be used to in C or Pascal. Rather than always iterating over an arithmetic progression of numbers (like in Pascal), or leaving the user completely free in the iteration test and step (as C), Python's for statement iterates over the items of any sequence (e.g., a list or a string), in the order that they appear in the sequence.
-- Python V 1.1 Docs, 1994
for loop (C style)
for (int i = 0; i < list_len; i++) {
    printf("%d\n", a_list[i]);
}

int i = 0;
while (i < list_len) {
    printf("%d\n", a_list[i]);
    i++;
}
while version
for loop
# for loop (C style)
a_list = [1, 2, 3, 4]
for i in range(len(a_list)):
    print(a_list[i])
Except it's not the same: Python generates a range object (another series) and iterates over it to get the index values.
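A quick way to see this, as a minimal sketch: range(len(a_list)) builds a range object, which is itself a sequence, and the for loop iterates over that object to obtain each index. The names indexes and collected are illustrative only.

```python
a_list = [1, 2, 3, 4]

# range() returns a range object, not a counter variable as in C
indexes = range(len(a_list))
print(type(indexes))   # <class 'range'>
print(list(indexes))   # [0, 1, 2, 3]

# The loop iterates over the range object, then uses each value as an index
collected = []
for i in indexes:
    collected.append(a_list[i])
print(collected)       # [1, 2, 3, 4]
```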
for Loop
# for loop (Python style)
a_list = [1, 2, 3, 4]
for item in a_list:
    print(item)
for key in a_dictionary:
for char in a_string:
for record in query_results:
for line in a_file:
etc...
How does a for loop know the “next” item?
How can for loops use so many different types?
... for loop?
An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() method or with a __getitem__() method that implements Sequence semantics.
Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), …). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop. See also iterator, sequence, and generator.
--Python glossary
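What the glossary describes can be sketched by hand. The snippet below spells out roughly what a for statement does with an iterable; hidden_iterator is a named stand-in for the unnamed temporary variable, and the sample data is made up for illustration.

```python
temp_readings = [68.5, 70.1, 69.8]  # illustrative data

# Roughly what `for temp in temp_readings: print(temp)` does
hidden_iterator = iter(temp_readings)   # the loop asks the iterable for an iterator
seen = []
while True:
    try:
        temp = next(hidden_iterator)    # ask the iterator for the next item
    except StopIteration:               # the iterator signals that it is exhausted
        break
    seen.append(temp)
    print(temp)
```

The list seen exists only so the result can be inspected afterwards; a real for loop keeps no such record.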
list, str, tuple (sequence types)
__iter__() method that returns iterator
__getitem__() with sequence semantics
for statement creates an unnamed iterator from iterable automatically
An iterable must return an iterator when the iter() function is called on it.
__getitem__() method with Sequence semantics, i.e., access items by integer index in [ ].
__iter__() method that returns an iterator (more on this soon)
Does it have an __iter__() method?
# check with hasattr
a_list = [1, 2, 3, 4]
hasattr(a_list, "__iter__")
Does it have a __getitem__() that is sequence compliant? (harder to decide)
i.e., does calling iter() on it return an iterator, or raise an exception?
is_it_iterable = ["asd", 1, open("Iteration Inside Out.ipynb"), {"one": 1, "two": 2}]
for item in is_it_iterable:
    try:
        # iter() raises TypeError if the object is not iterable
        an_iterator = iter(item)
    except TypeError as e:
        print(f"Not Iterable: {e}\n")
    else:
        print(f"Iterable: {an_iterator} is {type(an_iterator)}\n")
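The same check can be wrapped in a small helper. This is only a sketch: is_iterable and OnlyGetitem are hypothetical names, not part of the talk, and the class exists to show why hasattr(obj, "__iter__") alone can give the wrong answer.

```python
def is_iterable(obj):
    """Return True if iter(obj) succeeds (hypothetical helper)."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


class OnlyGetitem:
    # No __iter__ method, but __getitem__ with sequence semantics
    def __getitem__(self, index):
        if 0 <= index < 3:
            return index
        raise IndexError


print(is_iterable("asd"))                   # True
print(is_iterable(1))                       # False
print(hasattr(OnlyGetitem(), "__iter__"))   # False
print(is_iterable(OnlyGetitem()))           # True
```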
Repeater
An object that can be iterated over and returns the same value a specified number of times.
repeat = Repeater("hello", 4)
for i in repeat:
    print(i)
hello
hello
hello
hello
__getitem__()
class Repeater:
    def __init__(self, value, limit):
        self.value = value
        self.limit = limit

    def __getitem__(self, index):
        if 0 <= index < self.limit:
            return self.value
        else:
            raise IndexError
repeat = Repeater("hello", 4)
# does it have an __iter__ method?
hasattr(repeat, "__iter__")
# __getitem__ with sequence semantics?
repeat[0]
# can the iter() function return an iterator?
iter(repeat)
# for loop
for item in repeat:
    print(item)
# list comprehension
[x for x in repeat]
repeat object
class Repeater:
    def __init__(self, value, limit):
        self.value = value
        self.limit = limit

    def __getitem__(self, index):  # The bit we need for an iterable
        if 0 <= index < self.limit:
            return self.value
        else:
            raise IndexError  # only needed if we want iteration to end
Only a __getitem__() method was needed
The Python for loop relies on being able to get a next item, but...
An iterator has a __next__() method (next() in Python 2) that tracks and returns the next item in the series, and you use the next() function to return the next item for iteration.
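One way to picture how an object with only __getitem__() still gets a "next" item: iter() wraps it in an iterator that remembers the next index. The class below is a hypothetical pure-Python stand-in for that built-in iterator (the real one is implemented inside the interpreter), and Repeater is repeated here so the sketch runs on its own.

```python
class Repeater:
    def __init__(self, value, limit):
        self.value = value
        self.limit = limit

    def __getitem__(self, index):
        if 0 <= index < self.limit:
            return self.value
        raise IndexError


class SequenceIteratorSketch:
    """Hypothetical stand-in for the iterator that iter() builds."""

    def __init__(self, sequence):
        self.sequence = sequence
        self.index = 0          # the saved state: which item is next

    def __iter__(self):
        return self

    def __next__(self):
        try:
            item = self.sequence[self.index]   # calls __getitem__(index)
        except IndexError:
            raise StopIteration from None      # IndexError ends the iteration
        self.index += 1
        return item


sketch_items = list(SequenceIteratorSketch(Repeater("hello", 4)))
builtin_items = list(iter(Repeater("hello", 4)))
print(sketch_items == builtin_items)
```

The state lives in the iterator (self.index), not in the Repeater, which is why the same Repeater can be looped over again from the start.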
__next__() method
__next__() method (next() function) return successive items
StopIteration when no more data
StopIteration
__iter__() method, which returns self
An object representing a stream of data. Repeated calls to the iterator’s __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again...
...Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container.
--Python glossary
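The "multiple passes" exception is easy to see with a plain list. This is a small sketch with illustrative variable names: the list hands out a fresh iterator for each pass, while the iterator returns itself and is empty the second time around.

```python
a_list = [1, 2, 3]

# A container hands out a fresh iterator on every pass
first_pass = [item for item in a_list]
second_pass = [item for item in a_list]
print(first_pass, second_pass)            # [1, 2, 3] [1, 2, 3]

# An iterator returns itself from iter(), so a second pass finds it exhausted
an_iterator = iter(a_list)
print(iter(an_iterator) is an_iterator)   # True
iter_first_pass = [item for item in an_iterator]
iter_second_pass = [item for item in an_iterator]
print(iter_first_pass, iter_second_pass)  # [1, 2, 3] []
```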
RepeatIterator
__next__() method to return next item
__iter__() method to return itself
class RepeatIterator:
    def __init__(self, value, limit):
        self.value = value
        self.limit = limit
        self.count = 0

    def __next__(self):
        if self.count < self.limit:
            self.count += 1
            return self.value
        else:
            raise StopIteration

    def __iter__(self):
        return self
repeat_iter = RepeatIterator("Hi", 4)
# __getitem__ with sequence semantics?
repeat_iter = RepeatIterator("Hi", 4)
# does it have an __iter__ method?
hasattr(repeat_iter, "__iter__")
# does it return next item using next() function?
next(repeat_iter)
# calling iter on it, returns object itself
print(repeat_iter)
repeat_iter_iter = iter(repeat_iter)
print(repeat_iter_iter)
# calling iter() on iterable always returns new iterator
print(id(repeat))
old_repeat_iter = iter(repeat)
print(id(old_repeat_iter))
# after 1 next(), how many repetitions left?
for item in repeat_iter:
    print(item)
# Let's loop again
for item in repeat_iter:
    print(item)
# one more next?
next(repeat_iter)
__next__() method
__iter__() method that returns self
def repeat_gen(value, limit):
    for i in range(limit):
        yield value

for i in repeat_gen("hi", 4):  # iterator returns itself
    print(i)
# or use a generator expression
value = "hi"
limit = 4
repeat_gen_expr = (value for x in range(limit))
for item in repeat_gen_expr:
    print(item)
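To confirm that a generator is an iterator in the sense described earlier, it can be checked against the same protocol. This is a minimal sketch; repeat_gen is redefined so the block runs on its own, and gen and exhausted are illustrative names.

```python
def repeat_gen(value, limit):
    for _ in range(limit):
        yield value


gen = repeat_gen("hi", 2)

# A generator object has both halves of the iterator protocol
print(hasattr(gen, "__iter__"), hasattr(gen, "__next__"))  # True True
print(iter(gen) is gen)   # True: iter() returns the generator itself

print(next(gen))          # hi
print(next(gen))          # hi

# After the last value, next() raises StopIteration like any other iterator
exhausted = False
try:
    next(gen)
except StopIteration:
    exhausted = True
print(exhausted)          # True
```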
This notebook is available at http://projects.naomiceder.tech/talks/iteration-inside-out/