Mutability in Python

What is a mutable type in Python?

A mutable type is one whose objects can be changed after they are created. A list is a mutable sequence and can therefore be changed in place.

Let's refresh our memory on how strings can't be changed after they are created:

>>> s = "my string"
>>> s[0] = "b"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment

We get an error. Take a look back at the chapter on strings (Chapter 6) and revisit the diagram in that chapter.

I'm going to introduce the is keyword now. The is keyword compares object IDs (identities): it checks whether two variables refer to the very same object. This is different from ==, which checks whether the objects referred to by the variables have the same content (equality).

To give an analogy, let's say a clothing company is mass producing a certain type of shirt. Let's take two shirts as they are coming off the line.

We would say shirt_1 == shirt_2 evaluates to True (they are both the same color and size, and have the same graphic printed on them).

We would say shirt_1 is shirt_2 evaluates to False (even though they look identical, they are not the same shirt).

Identity and equality are quite like that, and the distinction is key to understanding mutability and immutability.
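The shirt analogy can be sketched in Python. The Shirt class and its fields below are made up purely for illustration; they are not part of this chapter's code.

```python
from dataclasses import dataclass


@dataclass
class Shirt:
    # Hypothetical fields, chosen only to mirror the analogy.
    color: str
    size: str
    graphic: str


shirt_1 = Shirt("blue", "M", "rocket")
shirt_2 = Shirt("blue", "M", "rocket")

# A dataclass generates an __eq__ that compares the fields.
same_content = shirt_1 == shirt_2
# The is operator asks whether both names refer to one and the same object.
same_object = shirt_1 is shirt_2

print(same_content)  # True
print(same_object)   # False
```

Two shirts with equal content are still two separate objects, which is exactly the difference between == and is.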

Let's look at some more Python examples:

>>> s = "Hello"
>>> t = s
>>> t is s
True
>>> t == s
True

These variables, s and t, are referencing the same object (the string "Hello"). What if we try to update s?

>>> s = "Hello"
>>> t = s
>>> t is s
True
>>> s = "World"
>>> t
'Hello'

The variable s is now referencing a different object, and t still references "Hello". The assignment rebinds the name s; it does not alter the string. As strings are immutable, there is no way to change "Hello" in place: any "updated" string is a new object.
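As a rough sketch of the same idea outside the interactive prompt, the built-in id() function can be used to check which object a name refers to before and after the reassignment. The variable name id_before is just an illustrative choice.

```python
s = "Hello"
t = s
id_before = id(s)    # identity of the "Hello" object

s = "World"          # rebinds the name s; the "Hello" object is untouched

print(t)                    # Hello
print(id(t) == id_before)   # True: t still refers to the original object
print(s is t)               # False: s now refers to a different object
```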

With lists, however, things operate a little differently.

>>> a = [1, 3, 7]
>>> b = a
>>> b is a
True
>>> a.append(9)
>>> a
[1, 3, 7, 9]
>>> b
[1, 3, 7, 9]
>>> a is b
True

The behavior has changed here, and that is because the object pointed to by a and b is mutable. Modifying it does not create a new object, and the reference in a is not overwritten to point to a new object. Instead, we write through that reference to modify the mutable object it points to.
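One way to convince yourself that append() changes the existing list, rather than creating a new one, is to compare id() before and after. The sketch below also shows, as an aside, that making a copy with list() is one way to get an independent list when sharing is not wanted. The names id_before, same_object and c are illustrative.

```python
a = [1, 3, 7]
b = a
id_before = id(a)

a.append(9)                        # mutates the one shared list in place
same_object = id(a) == id_before   # the list object itself is unchanged
print(same_object)                 # True
print(b)                           # [1, 3, 7, 9]

c = list(a)                        # shallow copy: a new, independent list
c.append(11)
print(a)                           # [1, 3, 7, 9]
print(c)                           # [1, 3, 7, 9, 11]
```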

There are, however, some little tricks thrown in with this behavior. Consider the following code:

>>> a = [1, 3, 7]
>>> b = a
>>> a = a + [9]
>>> a
[1, 3, 7, 9]
>>> b
[1, 3, 7]

What's going on here? That goes against what we just talked about, doesn't it? Well, no; the Python developers wouldn't leave a bug that big in the language. What's happening here is that we executed a = a + [9]. This doesn't write through a to modify the list it references. Instead, a new list is created from the concatenation of the list a and the list [9].

A reference to this new list overwrites the original reference in a. The list referenced by b is hence unchanged.
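The rebinding can be made visible with id(). This is only a small sketch; old_id is an illustrative name.

```python
a = [1, 3, 7]
b = a
old_id = id(a)

a = a + [9]               # builds a brand-new list, then rebinds the name a

print(id(a) == old_id)    # False: a now refers to the new list
print(id(b) == old_id)    # True: b still refers to the original list
print(a)                  # [1, 3, 7, 9]
print(b)                  # [1, 3, 7]
```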

That's not all the little tricks thrown in. There's one more subtlety. Consider the following code:

>>> a = [1, 3, 7]
>>> b = a
>>> a += [9]
>>> a
[1, 3, 7, 9]
>>> b
[1, 3, 7, 9]

It turns out x += y isn't always shorthand for x = x + y. With immutable types they give the same result, but with lists they behave very differently. The += operator, when applied to a list, modifies the list in place. This means we write through the reference in a and extend the list with the contents of [9], so the change is visible through b as well.
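To see the contrast between a mutable and an immutable type, here is a small sketch that applies += to a list and to a tuple. The variable names are illustrative.

```python
nums = [1, 3, 7]
nums_alias = nums
nums += [9]                   # list: extended in place
print(nums_alias)             # [1, 3, 7, 9]
print(nums is nums_alias)     # True: still one shared list

point = (1, 3, 7)
point_alias = point
point += (9,)                 # tuple: a new tuple is built and point is rebound
print(point)                  # (1, 3, 7, 9)
print(point_alias)            # (1, 3, 7)
print(point is point_alias)   # False: two different tuples now
```

For the tuple, += behaves like point = point + (9,), because a tuple cannot be changed in place.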

Writing through references is illustrated below.

Firstly consider the code, then the diagram:

>>> a = [1, 3]
>>> b = a
>>> a[1] = 9

** BEFORE UPDATING A **

         NAME LIST                   LIST OBJECT
        ===========                ===============
                                       |-----|          _____
               _____                   |     |          | 1 |
             a | # |------------------>|  #  |--------->|---|
               |---|                   |     |          _____
               _____                   |  #  |--------->| 3 |
             b | # |------------------>|_____|          |---|
               |---|
             
** AFTER UPDATING A **


         NAME LIST                   LIST OBJECT
        ===========                ===============
                                       |-----|           _____
               _____                   |     |           | 1 |
             a | # | ----------------->|  #  |---------> |---|
               |---|                   |     |           _____
               _____                   |  #  |-----|     | 3 |
             b | # | ----------------->|_____|     |     |---|
               |---|                               |
                                                   |     |---|
                                                   |---> | 9 |
                                                         |---|

So what's happening in the second part of this diagram? The #'s represent references.

a is a reference to a list object which contains references to integers.

b references the same list object. When we "change" a with a[1] = 9, we are not changing which object a refers to; we are updating a reference inside the list. Since b points to that same list, the change is visible through b as well, and b's own reference is not affected.
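The diagram can be checked with a few lines of code. This is a minimal sketch; old_item is an illustrative name used to hold on to the object that the second slot referenced before the update.

```python
a = [1, 3]
b = a
old_item = a[1]             # the object the second slot references (3)

a[1] = 9                    # replaces the reference stored in the second slot

print(b)                    # [1, 9]: b sees the change, it is the same list
print(a is b)               # True: neither name was rebound
print(a[1] is old_item)     # False: the slot now references a different object
print(old_item)             # 3: the old object itself was never modified
```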

We can still see the 3 floating around in the second part of the diagram. Once nothing references it, it can be picked up by something called the garbage collector (a garbage collector keeps memory clean and stops it from becoming flooded with unused objects like the unreferenced 3).

It's important to keep a mental image of this.


