Python Lists, Shallow and Deep Copy

In Python, copying a list does not behave like copying a variable that holds a simple type such as an integer. Let us look at the different situations we come across when copying lists.

Simple lists also behave differently from nested lists when they are copied. First, let's see what actually happens when we copy an integer variable in Python.

>>> x = 4
>>> y = x
>>> print(x, y)
4 4
>>>

What actually happens here is that Python creates a name y that refers to the same value 4. To understand this better, we use Python's id() function.

The id() function returns the identity of an object. This is an integer which is guaranteed to be unique for that object and to remain constant during its lifetime.

>>> x = 4
>>> y = x
>>> print(x, y)
4 4
>>> print(id(x), id(y))
30503184 30503184
>>>

So Python does not create a new integer object for y. It creates a name y that refers to the same object, 4.
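The same point can be checked with the `is` operator, which compares identities directly instead of printing raw id() values. A minimal sketch; the name `same_object` is only illustrative:

```python
# Assignment binds a second name to the same object; nothing is copied.
x = 4
y = x
same_object = x is y          # both names refer to one int object

# Rebinding y points it at a different object and leaves x untouched.
y = 5
print(same_object, x, y)
```

Rebinding a name never changes the object the other name refers to, which is why integers never seem to cause copying surprises.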

Now let's see what happens if we take the same example with a list instead of an integer.

>>> list1 = ['a', 'b', 'c', 'd']
>>> list2 = list1
>>> print(list1, list2)
['a', 'b', 'c', 'd'] ['a', 'b', 'c', 'd']
>>> print(id(list1), id(list2))
140118925400256 140118925400256
>>> list2 = ['e', 'f', 'g', 'h']
>>> print(list1)
['a', 'b', 'c', 'd']
>>> print(list2)
['e', 'f', 'g', 'h']
>>>

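Strictly speaking, this is not yet a copy of any kind: plain assignment only binds a second name to the same list object, as the identical ids show. Python never copies an object on assignment. To get a separate list we must copy explicitly, and an explicit copy can be either a shallow copy (a new outer list whose elements are shared with the original) or a real, deep copy.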

The difference between a real copy and a shallow copy is shown in the image below.

[Image: real copy versus shallow copy]

Assigning a new list to "list2" has no effect on "list1", because Python simply binds "list2" to a new list object. The problem starts when we want to change just a single element of the list instead of the whole list. Continuing with the above example:

>>> list1 = ['a', 'b', 'c', 'd']
>>> list2 = list1
>>> print(list1, list2)
['a', 'b', 'c', 'd'] ['a', 'b', 'c', 'd']
>>> print(id(list1), id(list2))
140118925400256 140118925400256
>>> list2[1] = 'z'
>>> print(list2)
['a', 'z', 'c', 'd']
>>> print(list1)
['a', 'z', 'c', 'd']
>>>

As you can see, both "list1" and "list2" are changed even though we only updated a value through "list2". This is because we never assigned a new object to "list2": we modified the existing list in place, and both names still refer to that same list object.
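A quick way to detect this situation is to test for identity before mutating. The sketch below uses a hypothetical helper, `is_alias`, which is not part of Python; it only wraps the `is` operator.

```python
def is_alias(a, b):
    """Return True when both arguments are the very same object."""
    return a is b

list1 = ['a', 'b', 'c', 'd']
list2 = list1                  # alias, not a copy
list2[1] = 'z'                 # the mutation is visible through both names

print(is_alias(list1, list2))
print(list1)
```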

Slice operator:

We can avoid the above problems by using the slice operator to copy lists.

>>> list1 = ['a', 'b', 'c', 'd']
>>> list2 = list1[:]
>>> print(list1, list2)
['a', 'b', 'c', 'd'] ['a', 'b', 'c', 'd']
>>> list2[1] = 'z'
>>> print(list1, list2)
['a', 'b', 'c', 'd'] ['a', 'z', 'c', 'd']
>>>
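The slice is one of several equivalent ways to make a shallow copy of a list. The following sketch, written for Python 3, compares them; the variable names are made up for illustration.

```python
import copy

list1 = ['a', 'b', 'c', 'd']

by_slice = list1[:]
by_constructor = list(list1)
by_method = list1.copy()       # list method available in Python 3
by_module = copy.copy(list1)

copies = [by_slice, by_constructor, by_method, by_module]
# Every copy is equal to the original but is a distinct list object.
print(all(c == list1 and c is not list1 for c in copies))
```

Which spelling to use is mostly a matter of taste; all four produce a new outer list whose elements are the same objects as in the original.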

This works fine with simple lists. What happens if we perform similar operations on nested lists?

>>> list1 = ['a', 'b', ['x', 'y']]
>>> list2 = list1[:]
>>> print(list1)
['a', 'b', ['x', 'y']]
>>> print(list2)
['a', 'b', ['x', 'y']]
>>> list2[0] = 'cow'
>>> print(list2)
['cow', 'b', ['x', 'y']]
>>> print(list1)
['a', 'b', ['x', 'y']]
>>>

If we change the element at index 0 or 1 of "list2", there is no effect on "list1". But if we change a value inside the sublist, it affects "list1" as well, because the slice made only a shallow copy: both outer lists hold a reference to the same sublist.

>>> list2[2][1] = 'wow'
>>> print(list1)
['a', 'b', ['x', 'wow']]
>>> print(list2)
['cow', 'b', ['x', 'wow']]
>>>
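The reason becomes visible when the identities of the outer lists and of the sublists are compared separately. A minimal sketch; `outer_shared` and `inner_shared` are illustrative names:

```python
list1 = ['a', 'b', ['x', 'y']]
list2 = list1[:]               # shallow copy: new outer list only

outer_shared = list1 is list2        # the outer lists are different objects
inner_shared = list1[2] is list2[2]  # the sublist is one shared object

list2[2].append('z')           # so this change shows up in list1 too
print(outer_shared, inner_shared, list1)
```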

DeepCopy

To avoid all these problems, Python's copy module provides a function called deepcopy. We will repeat the above operations using deepcopy and see what happens.

>>> from copy import deepcopy
>>> list1 = ['a', 'b', ['x', 'y']]
>>> list2 = deepcopy(list1)
>>> print(list1)
['a', 'b', ['x', 'y']]
>>> print(list2)
['a', 'b', ['x', 'y']]
>>> print(id(list1), id(list2))
140118925529168 140118925479880
>>>

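Using deepcopy, Python creates a new list object for "list2". We can see that from the differing identifiers of the two lists. We can also check the ids of the sublists and see that deepcopy, unlike a shallow copy, creates a new object for the sublist of "list2" too. The strings inside the sublists still share an id; strings are immutable, so sharing them is harmless.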

>>> print(id(list1[2]), id(list2[2]))
140118925529240 140118925583136
>>> print(id(list1[2][1]), id(list2[2][1]))
140118925706504 140118925706504
>>>
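The same checks can be written with the `is` operator instead of comparing printed ids by eye. A minimal sketch with illustrative variable names:

```python
from copy import deepcopy

list1 = ['a', 'b', ['x', 'y']]
list2 = deepcopy(list1)

outer_is_new = list1 is not list2          # new outer list
inner_is_new = list1[2] is not list2[2]    # new inner list as well
values_equal = list1 == list2              # contents still compare equal

print(outer_is_new, inner_is_new, values_equal)
```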

We can also check by assigning a value to an element of the sublist of "list2" and seeing what happens to "list1".

>>> list2[2][1] = "wow"
>>> print(list1)
['a', 'b', ['x', 'y']]
>>> print(list2)
['a', 'b', ['x', 'wow']]
>>>

Deepcopy provides an elegant solution for copying lists, but note that it takes up extra space even in cases when that is not needed. Use deepcopy with care in order to reduce memory usage.
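When the nesting is only one level deep, as in a list of rows, a cheaper middle ground is to copy just that level by hand. This is a sketch under that assumption; the names `grid` and `grid_copy` are made up for illustration, and the approach would not be sufficient for deeper nesting.

```python
grid = [[1, 2], [3, 4]]

# Copy the outer list and each row, but nothing deeper.
grid_copy = [row[:] for row in grid]

grid_copy[0][0] = 99           # only the copy changes
print(grid)
print(grid_copy)
```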
