Generators vs Lists

Comparing generators and lists in Python, followed by a speed test

Differences

| | Generators | Lists |
|---|---|---|
| Definition | Generators are lazy iterators, produced by a generator function or a generator expression. They return an item only when it is requested. | Lists are a sequence type that can group together several items, possibly of different types, in a single variable. |
| Creation | Using the yield keyword in functions; using generator expressions | Using assignment to directly assign the list; using list comprehensions |
| Iteration | They can be iterated over only once. If you want to repeat the iteration, you have to create a new generator. | They can be iterated over several times. |
| Accessing elements | Using next() or loops | Using indexes or loops |
| Storage & Memory efficiency | They execute lazily (producing items only when asked for). For this reason, a generator is much more memory efficient than an equivalent list. | Lists store all their elements in memory. |
| Speed | Depends on the use case. | Depends on the use case. |
| Slicing & Addition | They don't support slicing or addition. | Lists support slicing and also support addition with other lists. |
| Suitability | Suitable for very large (and even infinite) sequences which need to be iterated over only once. | More suitable if the items need to be iterated over several times. |
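
Several of the rows above can be checked with a short sketch. The function name `count_up_to` and the variable names are made up for illustration. Note that `sys.getsizeof` reports only the size of the container object itself (not the integers it refers to), so treat the numbers as a rough indication rather than a full memory profile.

```python
import sys
from itertools import islice

def count_up_to(limit):
    """Hypothetical generator function: yields 0, 1, ..., limit - 1."""
    for n in range(limit):
        yield n

# Iteration: a generator can be consumed only once.
gen = count_up_to(5)
first = next(gen)      # access a single element with next()
rest = list(gen)       # consumes the remaining items
again = list(gen)      # the generator is now exhausted, so this is empty

# Memory: the list holds references to all 1000 items, the generator holds none.
squares_list = [n * n for n in range(1000)]   # list comprehension
squares_gen = (n * n for n in range(1000))    # generator expression
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(f"list: {list_size} bytes, generator: {gen_size} bytes")

# Slicing: lists support it directly; for a generator, itertools.islice
# is one standard-library way to take a slice-like prefix.
head_of_list = squares_list[:3]
head_of_gen = list(islice(squares_gen, 3))
```

Writing `squares_gen[:3]` instead would raise a `TypeError`, because generator objects are not subscriptable.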

Speed test:

Let's compare the performance of generators and lists.

  1. Here we create a generator and a list of 10 million (1 crore) integers, without iterating over them.
from time import perf_counter

def time_using_generators(length):
    start = perf_counter()
    result = (i for i in range(0,length))  # generator expression
    return perf_counter() - start

def time_using_lists(length):
    start = perf_counter()
    result = [i for i in range(0,length)]  # list comprehension
    return perf_counter() - start

total_time = 0
for i in range(0, 10):
    total_time += time_using_generators(10000000)
print(f"Average time taken using generators: {round(total_time/10*1000,2)} ms")

total_time = 0
for i in range(0, 10):
    total_time += time_using_lists(10000000)
print(f"Average time taken using lists: {round(total_time/10*1000, 2)} ms")

"""
Output:
Average time taken using generators: 0.0 ms
Average time taken using lists: 507.48 ms
"""

As the output above shows, creating the generator takes almost no time, because no items are produced until it is iterated over. Creating the list takes noticeably longer, because all 10 million items are built and stored up front.
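
To make the laziness visible rather than just timing it, here is a minimal sketch that records every value that actually gets computed. The helper `record` and the list `produced` are hypothetical names used only for this illustration.

```python
produced = []

def record(n):
    """Hypothetical helper: remembers every value that is actually computed."""
    produced.append(n)
    return n

lazy = (record(n) for n in range(3))    # generator expression: nothing computed yet
count_before = len(produced)

first_item = next(lazy)                 # asking for one item computes exactly one
count_after_next = len(produced)

eager = [record(n) for n in range(3)]   # list comprehension: computes all items now
count_after_list = len(produced)

print(count_before, count_after_next, count_after_list)
```

Creating the generator expression calls `record` zero times, while the list comprehension calls it once per item as soon as it is created.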

  2. Here we initialize a generator and a list with a large number (10 million) of integers and then iterate over them.
from time import perf_counter

def time_using_generators(length):
    start = perf_counter()
    for x in (i for i in range(0,length)):
        pass
    return perf_counter() - start

def time_using_lists(length):
    start = perf_counter()
    for x in [i for i in range(0,length)]:
        pass
    return perf_counter() - start

total_time = 0
for j in range(0, 10):
    total_time += time_using_generators(10000000)
print(f"Average time taken using generators: {round(total_time/10*1000,2)} ms")

total_time = 0
for j in range(0, 10):
    total_time += time_using_lists(10000000)
print(f"Average time taken using lists: {round(total_time/10*1000, 2)} ms")

""" 
Output:
Average time taken using generators: 367.19 ms
Average time taken using lists: 604.76 ms
"""

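As the output above shows, the generator version finishes sooner than the list version. Note that the list timing includes building the full list in memory before the loop starts, whereas the generator produces its items one at a time during the loop.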
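
To see where the list version spends its time, one option is to time the build step and the loop separately. This is a rough sketch and not part of the test above; the function name `time_list_build_and_iterate` is an assumption made for illustration, and the numbers will vary by machine and Python version.

```python
from time import perf_counter

def time_list_build_and_iterate(length):
    """Hypothetical helper: returns (build_time, iterate_time) in seconds."""
    start = perf_counter()
    items = [i for i in range(length)]   # build the whole list up front
    build_time = perf_counter() - start

    start = perf_counter()
    for _ in items:                      # iterate over the already-built list
        pass
    iterate_time = perf_counter() - start
    return build_time, iterate_time

build_time, iterate_time = time_list_build_and_iterate(1_000_000)
print(f"Build list:   {build_time * 1000:.2f} ms")
print(f"Iterate list: {iterate_time * 1000:.2f} ms")
```

If the list already exists and is reused several times, only the second figure is paid on each pass, which is one reason the table lists speed as depending on the use case.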

  3. Here we initialize a generator and a list with a small number (100) of integers and then iterate over them.
from time import perf_counter

def time_using_generators(length):
    start = perf_counter()
    for x in (i for i in range(0,length)):
        pass
    return perf_counter() - start

def time_using_lists(length):
    start = perf_counter()
    for x in [i for i in range(0,length)]:
        pass
    return perf_counter() - start

total_time = 0
for j in range(0, 10):
    total_time += time_using_generators(100)
print(f"Average time taken using generators: {round(total_time/10*1000,2)} ms")

total_time = 0
for j in range(0, 10):
    total_time += time_using_lists(100)
print(f"Average time taken using lists: {round(total_time/10*1000, 2)} ms")

"""
Output:
Average time taken using generators: 0.01 ms
Average time taken using lists: 0.01 ms
"""

From the output, you can see that for a small number of items there is no measurable difference, at this precision, between the time to iterate over the generator and the list.
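
For inputs this small, a single `perf_counter` measurement is close to the timer's noise level. One possible refinement, sketched below, is to use the standard-library `timeit` module, which runs the code many times. The function names and the `RUNS` constant are assumptions for illustration; no expected timings are given because they depend on the machine.

```python
import timeit

def iterate_generator(length):
    for _ in (i for i in range(length)):   # generator expression
        pass

def iterate_list(length):
    for _ in [i for i in range(length)]:   # list comprehension
        pass

RUNS = 1_000  # calls per measurement (assumed value; tune as needed)

# timeit.repeat returns one total time per repetition; take the best of 5
# and divide by RUNS to get an approximate time per call.
gen_best = min(timeit.repeat(lambda: iterate_generator(100), number=RUNS, repeat=5)) / RUNS
list_best = min(timeit.repeat(lambda: iterate_list(100), number=RUNS, repeat=5)) / RUNS

print(f"Generator: {gen_best * 1e6:.2f} µs per call")
print(f"List:      {list_best * 1e6:.2f} µs per call")
```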