In Python, a shallow copy is a “one-level-deep” copy. The copied object contains references to the child objects of the original object.
A deep copy is completely independent of the original object. It constructs a new collection object by recursively populating it with copies of the child objects.
Copying or Not
In Python, you might use the = operator to create a copy of an object. It is tempting to think this creates a new object, but it doesn’t. Instead, it creates a new variable that refers to the original object. This means that changing a value through the new variable changes the original object as well.
Let’s use lists to demonstrate this:
numbers = [1, 2, 3]
new_numbers = numbers      # no copy is made; both names refer to one list
new_numbers[0] = 100

print('numbers:', numbers)
print('new_numbers:', new_numbers)
Output:
numbers: [100, 2, 3]
new_numbers: [100, 2, 3]
As numbers and new_numbers now point to the same list, updating one always updates the other.
To verify that both names really refer to the same object, you can also use the built-in id() function. Every Python object has an ID that is unique for as long as the object exists. If you check the IDs of numbers and new_numbers, you can see they are exactly the same:
print(id(numbers))
print(id(new_numbers))
Output:
139804802851400
139804802851400
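Besides comparing id() values, the same identity check can be written with the is operator. The snippet below is a minimal sketch of that idea; other_numbers is a hypothetical extra list introduced only to contrast identity with equality.

```python
numbers = [1, 2, 3]
new_numbers = numbers        # plain assignment: a second name for the same list
other_numbers = [1, 2, 3]    # hypothetical separate list with equal contents

# "is" compares identity (the same object), "==" compares contents
same_object = new_numbers is numbers
equal_but_distinct = (other_numbers == numbers) and (other_numbers is not numbers)

print(same_object)           # True
print(equal_but_distinct)    # True
```

Two lists can therefore be equal without being the same object, which is exactly what a copy is supposed to give you.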
How To Copy in Python
Let’s go over a common example: You want to take a copy of a list in such a way that the original list remains unchanged when updating the copied list. There are two ways to copy in Python:
- Shallow copy
- Deep copy
Both of these are implemented in the copy module. To use them, you need to import the copy module into your project.
- To take a shallow copy, call copy.copy(object).
- To take a deep copy, call copy.deepcopy(object).
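As a quick side-by-side, the two calls might be compared as in the sketch below. The list named original and the two boolean variables are made up for illustration; only copy.copy() and copy.deepcopy() come from the standard library.

```python
import copy

original = [[1, 2], [3, 4]]       # hypothetical nested list

shallow = copy.copy(original)     # new outer list, inner lists are shared
deep = copy.deepcopy(original)    # new outer list and new inner lists

# Check whether the first inner list is the very same object as in the original
shallow_shares_children = shallow[0] is original[0]
deep_shares_children = deep[0] is original[0]

print(shallow_shares_children)    # True
print(deep_shares_children)       # False
```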
Let’s dig deeper into the details.
Shallow Copy vs. Deep Copy
Shallow copy
A shallow copy is shallow because it only copies the object itself but not its child objects. Instead, the copy holds references to the original object’s child objects.
This can be demonstrated by the following example:
import copy

# Lists inside a list
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Let's take a shallow copy of groups
new_groups = copy.copy(groups)
Here, you have groups, which is a list of lists of numbers. And you have a shallow copy called new_groups. Let’s examine these:
1. The ID of groups is not the same as the ID of new_groups, which is reasonable, as new_groups is a copy of groups.
print(id(groups), id(new_groups))
Output:
140315092110600 140315092266184
It seems as if new_groups were a completely independent copy of groups. But let’s take it a step further to see that it’s not.
2. Here is where the shallowness becomes evident. The IDs of the lists in new_groups are equal to the IDs of the lists in the original groups:
print(id(groups[0]), id(new_groups[0]))
Output:
140315092110664 140315092110664
The shallowness means only the “outer” list gets copied. The inner lists are still the very same lists that the original list contains. Due to this, changing a number inside one of the copied list’s inner lists affects the original list:
new_groups[0][0] = 100000

print(new_groups[0])
print(groups[0])
Output:
[100000, 2, 3]
[100000, 2, 3]
3. The “outer” list of new_groups is a “real” copy of the original groups. Thus you can add new elements to it or even replace existing ones. These changes won’t affect the original groups list.
For example, let’s replace the first list in new_groups with a string. This should not affect groups.
new_groups[0] = "Something else"

print(new_groups)
print(groups)
Output:
['Something else', [4, 5, 6], [7, 8, 9]]
[[100000, 2, 3], [4, 5, 6], [7, 8, 9]]
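For lists specifically, copy.copy() is not the only way to get a shallow copy. The sketch below shows three built-in alternatives (list.copy(), a full slice, and the list() constructor) and checks that each one behaves like the shallow copy described above. The variable names are chosen here for illustration.

```python
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

by_method = groups.copy()         # list.copy()
by_slice = groups[:]              # full slice
by_constructor = list(groups)     # list() constructor

shallow_copies = [by_method, by_slice, by_constructor]

# Each result is a new outer list...
outer_is_new = all(c is not groups for c in shallow_copies)
# ...but the inner lists are still shared with the original
inner_is_shared = all(c[0] is groups[0] for c in shallow_copies)

print(outer_is_new)       # True
print(inner_is_shared)    # True
```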
Deep copy
A deep copy creates a completely independent copy of the original object.
This is pretty straightforward. But for the sake of completeness, let’s repeat the experiments above with a deep copy too:
import copy

# Lists inside a list
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Let's take a deep copy of groups
new_groups = copy.deepcopy(groups)
Here, you have groups, which is a list that contains lists of numbers. And you take a deep copy, new_groups. Let’s examine these:
1. The IDs of groups and new_groups do not match:
print(id(groups), id(new_groups))
Output:
140416138566984 140416138739016
2. The IDs of the lists in new_groups are not equal to the IDs of the lists in groups:
print(id(groups[0]), id(new_groups[0]))
Output:
140416138567048 140416138566728
Changing a number in new_groups doesn’t change that value in the original groups.
new_groups[0][0] = 100000

print(new_groups[0])
print(groups[0])
Output:
[100000, 2, 3]
[1, 2, 3]
3. new_groups is an independent copy of groups. Thus, changes made to new_groups are never visible in the original groups.
new_groups[0] = "Something else"

print(new_groups)
print(groups)
Output:
['Something else', [4, 5, 6], [7, 8, 9]]
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
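Because a deep copy walks the object recursively, one might wonder what happens when an object refers to itself. copy.deepcopy() keeps track of the objects it has already copied, so such a structure is copied once and the cycle is preserved. The sketch below illustrates this with a hypothetical dictionary called node; the names are assumptions made for the example.

```python
import copy

# Hypothetical self-referencing structure: the dict appears in its own list
node = {"name": "root", "children": []}
node["children"].append(node)

clone = copy.deepcopy(node)

clone_is_new = clone is not node
# The cycle inside the clone points back to the clone, not to the original
cycle_preserved = clone["children"][0] is clone

print(clone_is_new)       # True
print(cycle_preserved)    # True
```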
Conclusion
A shallow copy is a “one-level-deep” copy.
It constructs a new outer object, but the child objects inside it are the same objects that the original holds. Thus, its behavior may seem a bit “strange” at first.
A deep copy is the “real copy.” It is an independent copy of the original object.
When an object contains nested mutable objects that you want to change independently of the original, the deep copy is what you want.
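One case where a shallow copy is already enough is a flat collection of immutable values such as numbers or strings, since there are no nested mutable children to share. A minimal sketch, using a hypothetical scores list:

```python
import copy

scores = [10, 20, 30]             # flat list of immutable ints
scores_copy = copy.copy(scores)   # shallow copy

scores_copy[0] = 99               # replaces an element in the copy only

print(scores)         # [10, 20, 30]
print(scores_copy)    # [99, 20, 30]
```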