A Python generator is a simple way of creating an iterator.
Building an iterator by hand in Python takes a fair amount of boilerplate. We have to implement a class with __iter__() and __next__() methods, keep track of internal state, and raise StopIteration when there are no more values to return.
An iterator in Python is an object that can be iterated upon: it returns its data one element at a time. Any object that can be used in a for ... in loop is an iterable, meaning that iter() can produce an iterator over it.
Python lists, tuples, dicts and sets are all examples of built-in iterables.
A Python iterator object must implement two special methods, __iter__() and __next__(). The __iter__() method returns the iterator object itself, and the __next__() method returns the next item. We use the next() function to manually step through the items of an iterator. When we reach the end and there is no more data to be returned, it raises StopIteration.
Example:
# define a list
my_list = [4, 7, 0, 3]
# get an iterator using iter()
my_iter = iter(my_list)
# iterate through it using next()
# prints 4
print(next(my_iter))
# prints 7
print(next(my_iter))
# next(obj) is same as obj.__next__()
# prints 0
print(my_iter.__next__())
# prints 3
print(my_iter.__next__())
# This will raise StopIteration, no items left
next(my_iter)
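The iter()/next()/StopIteration protocol above is also what a for loop relies on. As a rough sketch (the helper name manual_for is made up for illustration, and this is a simplified model rather than the interpreter's actual implementation), a for loop over an iterable behaves roughly like this:

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next()."""
    collected = []
    iterator = iter(iterable)        # a for loop first asks for an iterator
    while True:
        try:
            item = next(iterator)    # fetch the next element
        except StopIteration:
            break                    # the iterator is exhausted; leave the loop
        collected.append(item)       # stands in for the loop body
    return collected


print(manual_for([4, 7, 0, 3]))  # [4, 7, 0, 3]
```

This is why StopIteration never surfaces in an ordinary for loop: the loop catches it and simply ends.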
In the next example we implement an iterator class that gives us the next power of 2 on each iteration. The exponent starts at zero and goes up to a user-set number.
class PowTwo:
    """Class to implement an iterator of powers of two"""

    def __init__(self, max=0):
        self.max = max

    def __iter__(self):
        # reset the exponent each time iteration starts
        self.n = 0
        return self

    def __next__(self):
        if self.n <= self.max:
            result = 2 ** self.n
            self.n += 1
            return result
        else:
            raise StopIteration

obj = PowTwo(4)
it = iter(obj)  # named "it" so the built-in iter() is not shadowed
print(next(it))  # prints 1
print(next(it))  # prints 2
print(next(it))  # prints 4
All the overhead we mentioned above is handled automatically by generators in Python. Simply speaking, a generator is a function that returns an object (an iterator) which we can iterate over, one value at a time.
It is fairly simple to create a generator in Python. It is as easy as defining a normal function with a yield statement instead of a return statement. If a function contains at least one yield statement, it becomes a generator function.
Both yield and return hand a value back to the caller. The difference is that a return statement terminates the function entirely, while a yield statement pauses the function, saving all of its state, and execution later continues from that point on successive calls.
Example:
We have a generator function named my_gen() with several yield statements.
# Generator function
def my_gen():
    n = 1
    print('print 1')
    # Generator function contains yield statements
    yield n
    n += 1
    print('print 2')
    yield n
    n += 1
    print('print 3 - last')
    yield n

# create iterator object
obj = my_gen()
# iterate through the items using next() function
print(next(obj))
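To make the pause-and-resume behaviour more concrete, here is a small sketch that drives a generator all the way to exhaustion. The names steps and log are hypothetical and exist only for this illustration; the list records when each part of the generator body actually runs.

```python
def steps(log):
    """Hypothetical generator that records in `log` each time its body runs."""
    log.append("started")
    yield 1
    log.append("resumed after first yield")
    yield 2
    log.append("resumed after second yield")
    yield 3


log = []
gen = steps(log)      # calling the function runs none of its body yet
print(log)            # []

print(next(gen))      # 1
print(next(gen))      # 2
print(next(gen))      # 3

try:
    next(gen)         # the body finishes, so StopIteration is raised
except StopIteration:
    print("generator exhausted")

print(len(log))       # 3
```

Note that creating the generator object does not execute any of the function body; the code up to the first yield only runs on the first call to next().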
Normally, generator functions are implemented with a loop having a suitable terminating condition.
Example:
def rev_str(my_str):
    length = len(my_str)
    for i in range(length-1, -1, -1):
        yield my_str[i]

for char in rev_str("hello"):
    print(char)
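For comparison with the PowTwo class shown earlier, the same sequence can be sketched as a generator function built around a loop. The name pow_two_gen and its max_exp parameter are chosen here for illustration only.

```python
def pow_two_gen(max_exp=0):
    """Yield 2**0, 2**1, ..., 2**max_exp, mirroring the PowTwo class."""
    n = 0
    while n <= max_exp:
        yield 2 ** n   # pause here and hand the value to the caller
        n += 1


for value in pow_two_gen(4):
    print(value)       # prints 1, 2, 4, 8, 16 on separate lines
```

There is no __iter__(), no __next__(), and no explicit StopIteration: the generator ends when the loop condition fails and the function returns.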
- Easy to Implement
Since generators keep track of iteration state automatically, they can be written far more clearly and concisely than their iterator-class counterparts.
- Memory Efficient
A normal function that returns a sequence creates the entire sequence in memory before returning the result. This is overkill if the number of items in the sequence is very large. A generator implementation of such a sequence is memory friendly and is preferred, since it produces only one item at a time.
- Represent Infinite Stream
Generators are an excellent medium for representing an infinite stream of data. An infinite stream cannot be stored in memory, but since a generator produces only one item at a time, it can represent such a stream.
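As a minimal sketch of the infinite-stream point, the generator below never terminates on its own; the caller decides how many items to take. The name all_even is hypothetical, while itertools.islice is part of the standard library.

```python
from itertools import islice


def all_even():
    """Hypothetical infinite generator: yields 0, 2, 4, ... without end."""
    n = 0
    while True:
        yield n
        n += 2


# islice() stops after five items, so the infinite loop is never a problem
first_five = list(islice(all_even(), 5))
print(first_five)  # [0, 2, 4, 6, 8]
```

Calling list(all_even()) directly would never finish, which is exactly why such a sequence has to be consumed lazily.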