Python Yield vs Return: What's the Real Difference?

The difference between the yield and the return statement in Python is:

Return. A function that returns a value runs from start to finish each time you call it. The return statement hands back a value and exits the function altogether.
Yield. A function that yields values can be resumed repeatedly. The yield statement pauses the execution of the function and hands a value to the caller. When the next value is requested, the function continues from where the previous yield left off. A function that yields values is known as a generator.

I’m sure this raises more questions than it answers.

Let’s go through yield in Python in detail. Our goal is to understand why and when you should use yield over the return statement in Python.

Generators in Python

Any function that yields values is known as a generator in Python. A function that returns values is naturally just a function.

Thus, the question “yield vs. return in Python” can be rephrased as “functions vs. generators in Python”.

Here is a great illustration of functions vs. generators in Python:

Figure: Visualizing yielding vs. returning in Python. A function runs from start to end and may return a value. A generator, in turn, yields values and pauses execution periodically.

A generator function returns a generator object, which is a kind of iterator. The iterator produces one value at a time and does not keep the values it has already produced. This makes a generator memory-efficient.

For example, you can use a generator to loop through values without storing any of them in memory. Later on, you will find out why and when this is useful.
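To make the pausing behavior concrete, here is a minimal sketch. The greet() function and the log list are made-up names used only for illustration; the log records the order in which the generator body and the consuming loop actually run.

```python
from collections.abc import Iterator

log = []  # hypothetical helper list that records the order of events

def greet():
    log.append("body started")
    yield "hello"
    log.append("body resumed")
    yield "world"

gen = greet()
print(log)                        # [] : calling the function ran none of its body
print(isinstance(gen, Iterator))  # True
print(iter(gen) is gen)           # True : a generator object is its own iterator

for word in gen:
    log.append(f"got {word}")

print(log)
# ['body started', 'got hello', 'body resumed', 'got world']
```

The interleaved log entries show that the function body stops at each yield and only continues when the loop asks for the next value.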

How to Replace ‘return’ with ‘yield’ in Python

Replacing return statements with yield statements in Python means you turn a function into a generator.

Example

Let’s create a square() function that squares an input list of numbers. This is a regular function that returns the whole list as a result:

def square(numbers):
    result = []
    for n in numbers:
        result.append(n ** 2)
    return result
    
numbers = [1, 2, 3, 4, 5]
squared_numbers = square(numbers)

print(squared_numbers)

Output:

[1, 4, 9, 16, 25]

Let’s then convert this function into a generator. Instead of storing the squared numbers in a list, you can yield the values one at a time without storing them:

def square(numbers):
    for n in numbers:
        yield n ** 2

numbers = [1, 2, 3, 4, 5]
squared_numbers = square(numbers)

print(squared_numbers)

Output:

<generator object square at 0x7f685175b510>

Now you no longer get the list of squared numbers. This is because the result squared_numbers is a generator object.

But how can you access the values then?

Let’s talk about the next() function with which you ask generators to produce values.

Call next() to Yield Values from a Generator

A generator object doesn’t hold numbers in memory. Instead, it computes and yields them one at a time. It does this only when you ask for the next value using the next() function.

Let’s ask the generator to compute the first squared number:

print(next(squared_numbers))

Output:

1

Let’s make it compute the rest of the numbers by calling next() four more times:

print(next(squared_numbers))
print(next(squared_numbers))
print(next(squared_numbers))
print(next(squared_numbers))

Output:

4
9
16
25

Now the generator has squared all the numbers. If you call next() one more time:

print(next(squared_numbers))

This time, an error occurs:

Traceback (most recent call last):
  File "<string>", line 13, in <module>
StopIteration

This error lets you know there are no more numbers to be squared. In other words, the generator is exhausted.
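If you do call next() by hand, you usually want to deal with exhaustion rather than let the program crash. The following is a small sketch of two common ways to do that, using a shortened input list; the variable names exhausted and fallback are illustrative.

```python
def square(numbers):
    for n in numbers:
        yield n ** 2

squared_numbers = square([1, 2])

print(next(squared_numbers))  # 1
print(next(squared_numbers))  # 4

# Option 1: catch the StopIteration exception explicitly
try:
    next(squared_numbers)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)              # True

# Option 2: pass a default value, which next() returns instead of raising
fallback = next(squared_numbers, None)
print(fallback)               # None
```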

Now you understand how a generator works and how to make it compute values.

Forget about Calling next() with Generators

Using the next() function demonstrates how generators work.

In reality, you don’t need to call the next() function.

Instead, you can use a for loop with the same syntax you would use on a list.

Under the hood, the for loop calls next() for you and stops quietly when the generator raises StopIteration.

For instance, let’s repeat the generator example using a for loop:

def square(numbers):
    for n in numbers:
        yield n ** 2

numbers = [1, 2, 3, 4, 5]
squared_numbers = square(numbers)

for n in squared_numbers:
    print(n)

Result:

1
4
9
16
25

This demonstrates the syntactical power generators have. Even though you do not store the values, you can still use the same for-loop syntax you would use on any other iterable.
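What the for loop does with a generator can be approximated by hand. The sketch below is a rough equivalent written with a while loop, not the interpreter's actual implementation; the collected list is only there to make the result easy to inspect.

```python
def square(numbers):
    for n in numbers:
        yield n ** 2

squared_numbers = square([1, 2, 3, 4, 5])

collected = []
iterator = iter(squared_numbers)  # for a generator, this returns the generator itself
while True:
    try:
        n = next(iterator)        # ask for the next value
    except StopIteration:
        break                     # a for loop ends silently at this point
    collected.append(n)
    print(n)
```

This prints the same five squares as the for loop version, one per line.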

Yield an Infinite Stream of Values

As you now know, the generator object yields one value at a time. It does not store any of those values.

This makes it possible to create an infinite stream of values. You can loop through the infinite stream with the same syntax you would use to loop through a list, because a for loop only ever asks for one value at a time.

Example

Let’s create an infinite generator that produces every integer from a starting point onward:

def infinite_values(start):
    current = start
    while True:
        yield current
        current += 1

This generator produces values from start to infinity.

Let’s loop through these values (warning: this is an infinite loop):

infinite_nums = infinite_values(0)

for num in infinite_nums:
    print(num)

As a result, you see an infinite loop that prints values indefinitely:

0
1
2
3
4
...

But infinite loops are bad, aren’t they?

Yes, they are. But the point is to demonstrate how syntactically it looks as if you were able to loop through an infinite collection of values.

Look at the code—you can literally write for num in infinite_nums and it works! This is all thanks to generators and the fact that they do not store values.
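In practice you would take only a finite slice of such a stream. Here is a minimal sketch of two ways to do that: itertools.islice from the standard library, and a plain break. The variable names first_five and small are illustrative.

```python
from itertools import islice

def infinite_values(start):
    current = start
    while True:
        yield current
        current += 1

# Take only the first five values from the infinite stream
first_five = list(islice(infinite_values(0), 5))
print(first_five)  # [0, 1, 2, 3, 4]

# Or stop the loop yourself once a condition is met
small = []
for num in infinite_values(10):
    if num > 12:
        break
    small.append(num)
print(small)       # [10, 11, 12]
```

Because the generator computes values only on demand, neither version ever tries to produce the whole infinite sequence.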

Yield vs. Return—Runtime Comparison

Let’s perform a runtime comparison between yielding and returning in Python.

In this example, there’s a list of ten numbers and two functions:

A data_list() function that randomly selects a number from the list n times.
A data_generator() generator function that also randomly selects a number from the list n times.

This code compares how long it takes to call each of them with n set to 1 million:

import random
import timeit
from math import floor

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def data_list(n):
    result = []
    for i in range(n):
        result.append(random.choice(numbers))
    return result

def data_generator(n):
    for i in range(n):
        yield random.choice(numbers)

t_list_start = timeit.default_timer()
rand_list = data_list(1_000_000)
t_list_end = timeit.default_timer()

t_gen_start = timeit.default_timer()
rand_gen = data_generator(1_000_000)
t_gen_end = timeit.default_timer()

t_gen = t_gen_end - t_gen_start
t_list = t_list_end - t_list_start

print(f"List creation took {t_list} Seconds")
print(f"Generator creation took {t_gen} Seconds")

print(f"The generator is {floor(t_list / t_gen)} times faster to create")

Result:

List creation took 0.6045370370011369 Seconds
Generator creation took 3.48799949279055e-06 Seconds
The generator is 173319 times faster to create

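This shows how a generator is way faster to create. When you call data_list(), all one million numbers are selected and stored in memory before the function returns. When you call data_generator(), none of the function body runs yet: you only get a generator object back, and the numbers are selected one at a time as you iterate over it. Keep in mind that this measures creation only, not the time it takes to actually produce all the values.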
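Since the benchmark above times only the creation step, it can help to also time full consumption and to look at memory. The following is a rough sketch under a few assumptions: n is reduced to 100,000 to keep it quick, sum() is used as a stand-in for "do something with every value", and sys.getsizeof() reports only the size of the container object itself, not of the numbers it refers to. Timings vary by machine, so no expected numbers are shown.

```python
import random
import sys
import timeit

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def data_list(n):
    result = []
    for i in range(n):
        result.append(random.choice(numbers))
    return result

def data_generator(n):
    for i in range(n):
        yield random.choice(numbers)

n = 100_000  # smaller than in the article, to keep the sketch fast

# Time creating AND consuming every value in both variants
t0 = timeit.default_timer()
total_list = sum(data_list(n))
t_list = timeit.default_timer() - t0

t0 = timeit.default_timer()
total_gen = sum(data_generator(n))
t_gen = timeit.default_timer() - t0

print(f"Summing via list took {t_list:.4f} seconds")
print(f"Summing via generator took {t_gen:.4f} seconds")

# Compare the size of the container objects themselves
size_list = sys.getsizeof(data_list(n))
size_gen = sys.getsizeof(data_generator(n))
print(f"List object: {size_list} bytes, generator object: {size_gen} bytes")
```

Both variants still have to call random.choice() the same number of times, so the work of producing the values does not disappear. The clearer difference is in memory: the list grows with n, while the generator object stays small regardless of n.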

When to Use Yield in Python?

Ask yourself, “Do I need multiple items at the same time?”

If the answer is “No”, use a generator.
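Conversely, if the answer is “Yes”, a list is usually the better fit. The sketch below reuses the square() generator from earlier to illustrate what a generator cannot do on its own: it has no length, and it can be looped through only once. The variable names are illustrative.

```python
def square(numbers):
    for n in numbers:
        yield n ** 2

gen = square([3, 1, 2])

# A generator does not know how many values it will produce
try:
    len(gen)
    has_len = True
except TypeError:
    has_len = False
print(has_len)                # False

# A generator can be consumed only once
first_pass = list(gen)
second_pass = list(gen)
print(first_pass)             # [9, 1, 4]
print(second_pass)            # []

# Operations that need all items at once work on the stored list
ordered = sorted(first_pass)
print(ordered)                # [1, 4, 9]
```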

Let’s go back to the example of squaring numbers. This function takes a list of numbers, squares them, and returns the results as a new list.

def square(numbers):
    result = []
    for n in numbers:
        result.append(n ** 2)
    return result

If you only want to print the squared numbers, you can use a generator. This is because the numbers do not depend on one another: you can square any number without knowing the next one. Thus, there is no need to store all the squared numbers anywhere.

def square(numbers):
    for n in numbers:
        yield n ** 2
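As a small illustration of consuming values without storing them, the generator can be fed straight into a function that needs only one value at a time, such as sum() or max(). The last line uses a generator expression, which is Python's inline shorthand for a simple generator like this one.

```python
def square(numbers):
    for n in numbers:
        yield n ** 2

numbers = [1, 2, 3, 4, 5]

total = sum(square(numbers))    # adds the squares up one at a time
largest = max(square(numbers))  # a fresh generator is created for each call
print(total)                    # 55
print(largest)                  # 25

# Generator expression: the same idea without a separate function
total_expr = sum(n ** 2 for n in numbers)
print(total_expr)               # 55
```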

As another, perhaps more practical example, think about looping through a file with a billion strings (e.g. passwords).

Storing a billion strings in a single list would take an enormous amount of memory, possibly more than your machine has. In this case, you can use a generator to loop through the strings one by one without storing them.

The best part is that syntactically it looks as if you really stored the values in a list and read them from there.

# You could write something like this

for word in billion_strings:
    check(word)
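A runnable version of this idea might look like the sketch below. Everything in it is an assumption made for illustration: read_lines() is a hypothetical generator that reads a text file one line at a time, check() is a placeholder rule, and a tiny temporary file stands in for the billion-string file.

```python
import os
import tempfile

def read_lines(path):
    """Yield one line at a time, without loading the whole file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")

def check(word):
    # Placeholder rule (an assumption): accept strings of 8+ characters
    return len(word) >= 8

# Create a small sample file so the sketch is self-contained
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("hunter2\ncorrecthorse\npassword123\n")
    path = tmp.name

# Only one line is held in memory at a time while looping
accepted = [word for word in read_lines(path) if check(word)]
print(accepted)  # ['correcthorse', 'password123']

os.remove(path)  # clean up the sample file
```

With a real file, the loop would look the same; only the path would change.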

Conclusion

In Python, return is for regular functions, and yield is for generators.

The return statement gives back a value and ends the function, while yield turns a function into a generator, which produces one value at a time and pauses until the next value is requested.

Generators are memory-efficient because they don’t store their values, yet you can loop through them with the same syntax as a list. They’re useful when you don’t need all the elements at once.

Thanks for reading. Happy coding!