<p><strong>Python Lists, Shallow and Deep Copy</strong></p>
<p>By ganesh, 2017-07-21. CoWhite Software Blog: http://django.cowhite.com/blog/python-lists-shallow-and-deep-copy/</p>
<p>In Python, copying a list is not as simple as copying a variable that holds an immutable value such as an integer. Let us look at the different situations we come across when copying lists.</p>
<p>Also, simple (flat) lists behave differently from nested lists when they are copied. Let's first see what actually happens when we copy an integer variable in Python.</p>
<div class="codehilite"><pre>>>> x = 4
>>> y = x
>>> print(x, y)
4 4
>>>
</pre></div>
<p>What actually happens here is that Python binds a new name, y, to the same object 4 that x already refers to. To see this, we can use Python's built-in id() function.</p>
<p>The id() function returns the identity of an object: an integer that is unique to that object and remains constant during its lifetime.</p>
<div class="codehilite"><pre>>>> x = 4
>>> y = x
>>> print(x, y)
4 4
>>> print(id(x), id(y))
30503184 30503184
>>>
</pre></div>
<p>So Python is not creating a new integer object for y. It is creating a name y that refers to the same object, 4, which is why both ids are identical.</p>
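<p>The same check can be written with the <em>is</em> operator, which compares identities directly. The following is a minimal sketch (the names same_before and same_after are only illustrative) that also shows that rebinding y later leaves x alone:</p>

```python
x = 4
y = x

# Both names refer to the same integer object.
same_before = x is y

# Rebinding y points it at a different object; x is unaffected.
y = 5
same_after = x is y

print(same_before, same_after, x, y)
```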
<p>Now let's see what happens if we take the same example with a list instead of an integer.</p>
<div class="codehilite"><pre>>>> list1 = ['a','b','c','d']
>>> list2 = list1
>>> print(list1, list2)
['a', 'b', 'c', 'd'] ['a', 'b', 'c', 'd']
>>> print(id(list1), id(list2))
140118925400256 140118925400256
>>> list2 = ['e','f','g','h']
>>> print(list1)
['a', 'b', 'c', 'd']
>>> print(list2)
['e', 'f', 'g', 'h']
>>>
</pre></div>
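<p>Strictly speaking, no copy is made by the assignment "list2 = list1": it only makes "list2" a second name for the same list object, which is why both ids are identical. Python never copies an object on assignment; we need to write code explicitly to get a copy. Such a copy can be shallow (a new outer list whose elements are shared with the original) or deep (a fully independent copy of everything inside).</p>
<p>The difference between a shallow copy and a deep copy can be represented as shown in the image below.</p>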
<p><img alt="alt text" src="https://django.cowhite.com/static/media/uploads/shallow%20and%20deep%20copy/python-shallow-and-deep.png" /></p>
<p>Assigning a new list to "list2" has no effect on "list1", because Python simply binds "list2" to a new list object. The problem starts when we want to change just a single element of the list instead of replacing the whole list. Continuing with the above example:</p>
<div class="codehilite"><pre>>>> list1 = ['a','b','c','d']
>>> list2 = list1
>>> print(list1, list2)
['a', 'b', 'c', 'd'] ['a', 'b', 'c', 'd']
>>> print(id(list1), id(list2))
140118925400256 140118925400256
>>> list2[1] = 'z'
>>> print(list2)
['a', 'z', 'c', 'd']
>>> print(list1)
['a', 'z', 'c', 'd']
>>>
</pre></div>
<p>As you can see, both "list1" and "list2" have changed even though we only updated an element through "list2". This is because we never assigned a new object to "list2": both variables still refer to the same list object, so a change made through one name is visible through the other.</p>
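<p>To tell whether two names refer to the same list or merely to equal lists, we can compare them with <em>is</em> and with ==. A small sketch, where the names alias and lookalike are hypothetical and chosen only for this illustration:</p>

```python
list1 = ['a', 'b', 'c', 'd']
alias = list1                      # second name for the same list object
lookalike = ['a', 'b', 'c', 'd']   # separate list with equal contents

print(alias is list1)       # True: same object
print(lookalike is list1)   # False: different object
print(lookalike == list1)   # True: equal contents

alias[1] = 'z'              # mutate the list through the alias
print(list1)                # ['a', 'z', 'c', 'd']
print(lookalike)            # ['a', 'b', 'c', 'd']
```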
<p><strong>Slice operator</strong>:</p>
<p>We can avoid the above problem by using the slice operator, which creates a new list containing the same elements.</p>
<div class="codehilite"><pre>>>> list1 = ['a','b','c','d']
>>> list2 = list1[:]
>>> print(list1, list2)
['a', 'b', 'c', 'd'] ['a', 'b', 'c', 'd']
>>> list2[1] = 'z'
>>> print(list1, list2)
['a', 'b', 'c', 'd'] ['a', 'z', 'c', 'd']
>>>
</pre></div>
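<p>The slice is not the only way to get this kind of one-level copy. As a sketch, assuming Python 3, the built-in list() constructor, the list.copy() method and copy.copy() from the standard library behave the same way for a flat list (the dictionary copies is just a convenience for this example):</p>

```python
import copy

list1 = ['a', 'b', 'c', 'd']

# Four ways to build a new outer list with the same elements.
copies = {
    'slice': list1[:],
    'list()': list(list1),
    'list.copy()': list1.copy(),
    'copy.copy()': copy.copy(list1),
}

for name, new_list in copies.items():
    new_list[1] = 'z'                        # change the copy only
    print(name, new_list is list1, new_list)

print(list1)                                 # the original is untouched
```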
<p>This works fine with simple (flat) lists. What happens if we perform similar operations on nested lists?</p>
<div class="codehilite"><pre>>>> list1 = ['a','b',['x', 'y']]
>>> list2 = list1[:]
>>> print(list1)
['a', 'b', ['x', 'y']]
>>> print(list2)
['a', 'b', ['x', 'y']]
>>> list2[0] = 'cow'
>>> print(list2)
['cow', 'b', ['x', 'y']]
>>> print(list1)
['a', 'b', ['x', 'y']]
>>>
</pre></div>
<p>If we change the element at index 0 or 1 of "list2", there won't be any effect on "list1". But if we change a value inside the sublist, it affects "list1" as well, because the slice copied only the outer list and both lists still share the same inner list object.</p>
<div class="codehilite"><pre>>>> list2[2][1] = 'wow'
>>> print(list1)
['a', 'b', ['x', 'wow']]
>>> print(list2)
['cow', 'b', ['x', 'wow']]
>>>
</pre></div>
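<p>The sharing can be confirmed with the <em>is</em> operator. The sketch below starts again from a fresh pair of lists and also shows that replacing the whole sublist in "list2", rather than changing a value inside it, does not touch "list1":</p>

```python
list1 = ['a', 'b', ['x', 'y']]
list2 = list1[:]

print(list2 is list1)         # False: the outer lists are different objects
print(list2[2] is list1[2])   # True: the inner list is shared

list2[2].append('z')          # mutate the shared inner list
print(list1)                  # ['a', 'b', ['x', 'y', 'z']]

list2[2] = ['p', 'q']         # rebind the slot in list2 only
print(list1)                  # ['a', 'b', ['x', 'y', 'z']]
print(list2)                  # ['a', 'b', ['p', 'q']]
```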
<p><strong>DeepCopy</strong></p>
<p>To avoid all these problems, Python's copy module provides a function called <em>deepcopy</em>. We will perform the above operations using <em>deepcopy</em> and see what happens.</p>
<div class="codehilite"><pre><span class="o">>>></span> <span class="kn">from</span> <span class="nn">copy</span> <span class="kn">import</span> <span class="n">deepcopy</span>
<span class="o">>>></span> <span class="n">list1</span> <span class="o">=</span> <span class="p">[</span><span class="s">'a'</span><span class="p">,</span><span class="s">'b'</span><span class="p">,[</span><span class="s">'x'</span><span class="p">,</span> <span class="s">'y'</span><span class="p">]]</span>
<span class="o">>>></span> <span class="n">list2</span> <span class="o">=</span> <span class="n">deepcopy</span><span class="p">(</span><span class="n">list1</span><span class="p">)</span>
<span class="o">>>></span> <span class="nb">print</span><span class="p">(</span><span class="n">list1</span><span class="p">)</span>
<span class="p">[</span><span class="s">'a'</span><span class="p">,</span> <span class="s">'b'</span><span class="p">,</span> <span class="p">[</span><span class="s">'x'</span><span class="p">,</span> <span class="s">'y'</span><span class="p">]]</span>
<span class="o">>>></span> <span class="nb">print</span><span class="p">(</span><span class="n">list2</span><span class="p">)</span>
<span class="p">[</span><span class="s">'a'</span><span class="p">,</span> <span class="s">'b'</span><span class="p">,</span> <span class="p">[</span><span class="s">'x'</span><span class="p">,</span> <span class="s">'y'</span><span class="p">]]</span>
<span class="o">>>></span> <span class="nb">print</span><span class="p">(</span><span class="nb">id</span><span class="p">(</span><span class="n">list1</span><span class="p">),</span> <span class="nb">id</span><span class="p">(</span><span class="n">list2</span><span class="p">))</span>
<span class="mi">140118925529168</span> <span class="mi">140118925479880</span>
<span class="o">>>></span>
</pre></div>
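<p>Using deepcopy, Python creates a new list object for "list2". We can see that by comparing the ids of the two lists. We can also check the ids of the sublists and see that deepcopy, unlike a shallow copy, creates a new object for the sublist of "list2" as well. The strings inside the sublists still share an id: strings are immutable, so deepcopy has no need to duplicate them.</p>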
<div class="codehilite"><pre>>>> print id(list1[2]), id(list2[2])
140118925529240 140118925583136
>>> print id(list1[2][1]), id(list2[2][1])
140118925706504 140118925706504
>>>
</pre></div>
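<p>Since raw id values differ from run to run, the same comparison is easier to read with <em>is</em>. A minimal sketch, assuming CPython's standard copy module; the three result names are made up for this example:</p>

```python
from copy import deepcopy

list1 = ['a', 'b', ['x', 'y']]
list2 = deepcopy(list1)

outer_shared = list2 is list1                 # new outer list
inner_shared = list2[2] is list1[2]           # new inner list as well
string_shared = list2[2][1] is list1[2][1]    # immutable string is reused

print(outer_shared, inner_shared, string_shared)
```

<p>Reusing the immutable strings is safe, because nothing can change them in place.</p>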
<p>We can also check by assigning a value inside the sublist of "list2" and seeing what happens to "list1".</p>
<div class="codehilite"><pre>>>> list2[2][1] = "wow"
>>> print(list1)
['a', 'b', ['x', 'y']]
>>> print(list2)
['a', 'b', ['x', 'wow']]
>>>
</pre></div>
<p>Deepcopy provides an elegant solution for copying lists, but note that it takes up extra space even in cases when it is not needed. Use deepcopy with care in order to reduce memory usage.</p>
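<p>When the shape of the data is known in advance, a lighter copy is sometimes enough. The sketch below assumes a list of flat lists holding only immutable values (the names grid, by_deepcopy and by_slices are hypothetical); in that case, copying each row with a slice gives the same independence as deepcopy for this particular structure:</p>

```python
from copy import deepcopy

grid = [[1, 2, 3], [4, 5, 6]]

by_deepcopy = deepcopy(grid)            # general, handles any nesting depth
by_slices = [row[:] for row in grid]    # copies exactly two levels

# Changes to either copy do not reach the original.
by_slices[0][0] = 99
by_deepcopy[1][1] = 77
print(grid)
print(by_slices)
print(by_deepcopy)
```

<p>This shortcut stops being safe as soon as the rows themselves contain mutable objects, which is where deepcopy is the better choice.</p>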