Generators vs Lists
Comparing generators and lists in Python, followed by a speed test
Differences
Feature | Generators | Lists |
Definition | A generator is a lazy iterator, returned by a generator function or a generator expression. It produces each item only when one is requested. | A list is a mutable sequence type that can group together several items, possibly of different types, in a single object. |
Creation | Using the yield keyword in functions; using generator expressions | Using a list literal such as [1, 2, 3] or the list() constructor; using list comprehensions |
Iteration | They can be iterated over only once. To repeat the iteration, you have to create a new generator. | They can be iterated over several times. |
Accessing elements | Using next() or loops | Using indexes or loops |
Storage & Memory efficiency | They have lazy execution (producing items only when asked for). For this reason, a generator is much more memory efficient than an equivalent list. | Lists store all their elements in memory. |
Speed | Depends on the use case. | Depends on the use case. |
Slicing & Addition | They don't support slicing or concatenation with +. | Lists support slicing and can be concatenated with other lists using +. |
Suitability | Suitable for very large (and even infinite) sequences which need to be iterated over only once. | More suitable if the items need to be iterated over several times. |
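The rows of the table above can be illustrated with a short sketch. The names `count_up_to`, `squares_gen` and `squares_list` are made up for this example. Note that `sys.getsizeof` reports only the size of the container object itself (for a list, its array of references, not the integers it refers to), so treat the memory comparison as a rough indication only.

```python
import sys
from itertools import islice


def count_up_to(limit):
    """Generator function (hypothetical example): yields 0, 1, ..., limit - 1."""
    n = 0
    while n < limit:
        yield n
        n += 1


# Creation
squares_gen = (n * n for n in range(5))   # generator expression
squares_list = [n * n for n in range(5)]  # list comprehension

# Accessing elements: next() for generators, indexes for lists
first = next(squares_gen)   # consumes the first item of the generator
third = squares_list[2]     # lists allow random access

# Iteration: a generator is exhausted after one pass
rest = list(squares_gen)    # the items that were not consumed yet
again = list(squares_gen)   # nothing left, so this is empty

# Slicing: lists slice directly; a generator needs a helper such as islice
list_head = squares_list[:3]
gen_head = list(islice(count_up_to(10), 3))

# Memory: the generator object stays small no matter how long the sequence is
gen_size = sys.getsizeof(n for n in range(100_000))
list_size = sys.getsizeof([n for n in range(100_000)])

print(first, third, rest, again)
print(list_head, gen_head)
print(f"generator: {gen_size} bytes, list: {list_size} bytes")
```

`itertools.islice` is used here only to take the first few items of a generator; it consumes them, unlike slicing a list, which leaves the list untouched.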
Speed test:
Let's compare the performance of generators and lists.
- Here we create a generator and a list of 10 million (1 crore) integers, without iterating over them.
from time import perf_counter

def time_using_generators(length):
    start = perf_counter()
    result = (i for i in range(0, length))  # generator expression
    return perf_counter() - start

def time_using_lists(length):
    start = perf_counter()
    result = [i for i in range(0, length)]  # list comprehension
    return perf_counter() - start

total_time = 0
for i in range(0, 10):
    total_time += time_using_generators(10000000)
print(f"Average time taken using generators: {round(total_time/10*1000, 2)} ms")

total_time = 0
for i in range(0, 10):
    total_time += time_using_lists(10000000)
print(f"Average time taken using lists: {round(total_time/10*1000, 2)} ms")

"""
Output:
Average time taken using generators: 0.0 ms
Average time taken using lists: 507.48 ms
"""
As you can see from the output above, creating the generator takes almost no time, because no items are produced until it is iterated over. Creating the list takes measurable time, because all 10 million items are built up front.
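To see why creating a generator is so cheap, here is a minimal sketch that records when each item is actually produced. The helper `tracked` and the `calls` list are hypothetical names introduced only for this illustration.

```python
calls = []


def tracked(i):
    """Hypothetical helper: remembers every value it was asked to produce."""
    calls.append(i)
    return i


# Creating the generator does not call tracked() at all
lazy = (tracked(i) for i in range(3))
calls_after_generator = len(calls)

# Creating the list calls tracked() once per item, immediately
eager = [tracked(i) for i in range(3)]
calls_after_list = len(calls)

# The generator only does work when an item is requested
first = next(lazy)
calls_after_next = len(calls)

print(calls_after_generator, calls_after_list, calls_after_next)
```

The first number printed is the count of calls made by creating the generator alone. The list comprehension accounts for the jump in the second number, and the single `next()` call adds exactly one more.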
- Here we initialize a generator and a list with a large number (10 million) of integers and then iterate over them.
from time import perf_counter

def time_using_generators(length):
    start = perf_counter()
    for x in (i for i in range(0, length)):
        pass
    return perf_counter() - start

def time_using_lists(length):
    start = perf_counter()
    for x in [i for i in range(0, length)]:
        pass
    return perf_counter() - start

total_time = 0
for j in range(0, 10):
    total_time += time_using_generators(10000000)
print(f"Average time taken using generators: {round(total_time/10*1000, 2)} ms")

total_time = 0
for j in range(0, 10):
    total_time += time_using_lists(10000000)
print(f"Average time taken using lists: {round(total_time/10*1000, 2)} ms")

"""
Output:
Average time taken using generators: 367.19 ms
Average time taken using lists: 604.76 ms
"""
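As you can see from the output above, creating and iterating over the generator took less time than creating and iterating over the list. Note that the list timing includes building all 10 million items before the loop starts.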
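Because the list timing above mixes building the list with looping over it, it can help to measure the two separately. The following is a rough sketch using the standard `timeit` module. The helper `consume`, the constant `LENGTH` and the smaller size of 100,000 items are choices made for this example so that it runs quickly. The numbers will vary between machines and Python versions.

```python
from timeit import timeit

LENGTH = 100_000  # assumption: smaller than the 10 million used above
RUNS = 10


def consume(iterable):
    """Hypothetical helper: loops over every item and discards it."""
    for _ in iterable:
        pass


# A list built ahead of time, so that only the iteration is measured below
prebuilt = [i for i in range(LENGTH)]

# Create and iterate a generator
gen_total = timeit(lambda: consume(i for i in range(LENGTH)), number=RUNS) / RUNS

# Create and iterate a list (what the test above measures)
list_total = timeit(lambda: consume([i for i in range(LENGTH)]), number=RUNS) / RUNS

# Iterate over an already existing list
list_iter_only = timeit(lambda: consume(prebuilt), number=RUNS) / RUNS

print(f"Generator, create + iterate: {gen_total * 1000:.2f} ms")
print(f"List, create + iterate:      {list_total * 1000:.2f} ms")
print(f"List, iterate only:          {list_iter_only * 1000:.2f} ms")
```

If the list already exists and has to be iterated over several times, the "iterate only" figure is the relevant one, which is why the table lists speed as depending on the use case.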
- Here we initialize a generator and a list with a small number (100) of integers and then iterate over them.
from time import perf_counter

def time_using_generators(length):
    start = perf_counter()
    for x in (i for i in range(0, length)):
        pass
    return perf_counter() - start

def time_using_lists(length):
    start = perf_counter()
    for x in [i for i in range(0, length)]:
        pass
    return perf_counter() - start

total_time = 0
for j in range(0, 10):
    total_time += time_using_generators(100)
print(f"Average time taken using generators: {round(total_time/10*1000, 2)} ms")

total_time = 0
for j in range(0, 10):
    total_time += time_using_lists(100)
print(f"Average time taken using lists: {round(total_time/10*1000, 2)} ms")

"""
Output:
Average time taken using generators: 0.01 ms
Average time taken using lists: 0.01 ms
"""
From the output, you can see that for a small number of items there is no measurable difference, at this precision, between the time taken with the generator and with the list.