In this article, we look at how Python variables refer to the data they store, and then at how Python manages memory. As background, you may want to revisit the article on Variables and Constants in Python.
1. Memory of Variables in Python
In languages such as C/C++ or Java, a variable must be declared with a type before it can be used. For example, declaring a variable and then assigning it a value in C++:
int a; // declare the variable
a = 5; // assign a value to the variable
You can read more in the article Understanding Memory of Variables in C++ to learn about storing data for variables in C++.
In Python, we only need to assign a value to the variable name to use the variable. For example:
x = 5
name = "Python"
salary = 60.50
1.1. How does Python handle creating a variable?
Every value in Python is an object, and each object occupies its own location in memory. When a value is needed, Python creates an object that represents it and then binds the variable name to that object. We say that the variable references the object; the variable itself does not contain the value.
Example: when we execute x = 5, Python handles it as follows:
1. Python creates an object to represent the value 5 in memory (Python determines the data type as an integer).
2. Python creates a name identifier x (variable name).
3. Python creates a link between the variable x and object 5, called a reference.
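The three steps above can be observed from Python itself. The sketch below is only an illustration using the built-ins type() and id(); the number returned by id() changes from run to run, and the fact that it corresponds to a memory address is a CPython detail rather than a language guarantee.

```python
x = 5

# Step 1: an object of class int represents the value 5
print(type(x).__name__)   # int

# Steps 2 and 3: the name x is bound to that object;
# id() returns the identity of the object that x references
identity = id(x)
print(identity)           # some integer, different on each run
```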
1.2. How Python handles changing a variable's value
Let’s take a look at the example below.
x = 5 # It's an integer
x = "amazon" # It's a string
x = 40.25 # It's a float
In the example above, when executing x = 5, Python creates an object of the int class to store the value 5. Then, Python makes x reference this object.
When executing x = "amazon", Python creates an object of the str class to store the value "amazon" and makes x reference it. If no other name references the object storing the value 5, that object is now unreachable, and Python can automatically release its memory. In CPython this is done by reference counting: the memory is released as soon as the last reference disappears. Note that CPython also keeps small integers such as 5 in a cache, so for this particular value the release is conceptual.
When executing x = 40.25, Python creates an object of the float class to store the value 40.25 and makes x reference it. In the same way, the object storing the value "amazon" is no longer referenced by any variable name, so Python can automatically release its memory.
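The release described above can be watched with the standard weakref module. This is a minimal sketch, assuming CPython: the Payload class is a hypothetical stand-in, used because int and str objects do not support weak references and may be cached by the interpreter.

```python
import weakref


class Payload:
    """Hypothetical stand-in for 'some value stored in an object'."""


x = Payload()
watcher = weakref.ref(x)   # observes the object without keeping it alive
print(watcher() is x)      # True: the object is alive

x = "amazon"               # rebinding x drops the only reference to the Payload object
print(type(x).__name__)    # str
print(watcher() is None)   # True on CPython: the object was released immediately
```

On other Python implementations the object may be released later, so the last line can print False there.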
1.3. Multiple reference variables pointing to the same object in Python
Let’s look at the example below.
x = 5
y = x
We see that the variable x refers to the object of the int class storing the value 5. When executing the statement y = x, both y and x refer to the object storing the value 5.
Let’s look at another example below.
x = 5
y = x
x = "amazon"
When executing the statement x = "amazon", the variable x refers to a new object of the str class storing the value "amazon". Therefore, x no longer refers to the object storing the value 5, while y still does.
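Both examples can be checked with the is operator, which tests whether two names refer to the same object. A short sketch:

```python
x = 5
y = x
print(x is y)    # True: both names refer to the same int object

x = "amazon"     # only x is rebound; y is untouched
print(x)         # amazon
print(y)         # 5
print(x is y)    # False: the names now refer to different objects
```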
1.4. Understanding reference counters in Python
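The reference counter of an object counts how many references (for example, variable names) currently point to that object. Conceptually, we can describe an object with the following data:
- Data type
- Data value
- Reference counter
- The variables referring to it (listed here only for illustration; the object stores the count, not the names)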
Example:
a = 100
b = 100
a = 5
b = 7
After lines 1 and 2 execute, the object storing the value 100 can be described by the following table:
Field | Value |
---|---|
Data type | int |
Value | 100 |
Reference counter | 2 |
References | a, b |
After line 3 executes, the object storing the value 100 has only one reference, from b. After line 4, it has no references left. At this point, Python can automatically release the memory of the object. Note that CPython keeps small integers such as 100 in a cache, so this particular object actually survives; the sequence applies exactly as described to ordinary objects.
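The counter can be inspected with sys.getrefcount(). The sketch below assumes CPython and uses a fresh object() instead of the integer 100, so that the interpreter's caches do not distort the numbers. Because the absolute count includes temporary references held by the call itself, only differences from a baseline are printed.

```python
import sys

value = object()                      # a fresh object with no hidden extra references
baseline = sys.getrefcount(value)     # includes a temporary reference from the call

a = value
b = value
after_two_names = sys.getrefcount(value)

a = 5                                 # rebinding a drops one reference
after_rebinding_a = sys.getrefcount(value)

print(after_two_names - baseline)     # 2: a and b were added
print(after_rebinding_a - baseline)   # 1: only b remains
```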
1.5. Python program illustrating variable memory addresses
a = 100
#Output 2699180379472
print("address of a referred object 100:", id(a))
b = 100
#Output 2699180379472
print("address of b referred object 100:", id(b))
a = 200
#Output 2699180382672
print("address of a referred object 200:", id(a))
b = 500
#Output 2699181414608
print("address of b referred object 500:", id(b))
#Output 2699180379472
print("address of object 100:", id(100))
Result
address of a referred object 100: 2699180379472
address of b referred object 100: 2699180379472
address of a referred object 200: 2699180382672
address of b referred object 500: 2699181414608
address of object 100: 2699180379472
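In the result above, id(100) is unchanged even after a and b were rebound. This is consistent with CPython caching small integers (from -5 to 256) for the lifetime of the interpreter. The sketch below illustrates that implementation detail; the integers are built at run time with int() so that the compiler cannot merge equal constants. Programs should not rely on this behavior.

```python
small_a = int("100")
small_b = int("100")
large_a = int("500")
large_b = int("500")

print(small_a is small_b)   # True on CPython: 100 comes from the small-integer cache
print(large_a == large_b)   # True: the values are equal
print(large_a is large_b)   # typically False on CPython: two separate objects
```

This is why values should be compared with ==, and is reserved for identity checks.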
2. Memory management mechanism in Python
2.1. Stack and heap memory in Python
The memory allocation for Python programs is divided into stack memory and heap memory. Stack memory stores function calls, arguments passed to functions, and reference variables. Heap memory stores all objects in Python programs.
Example of a Python program:
def bar(a):
    a = a - 1  # rebinds the local name a to a new int object
    return a


def foo(a):
    a = a * a  # the local a now references the object storing 4
    b = bar(a)
    return b


def main():
    x = 2
    y = foo(x)
    print("x =", x)
    print("y =", y)


if __name__ == "__main__":
    main()
Result
x = 2
y = 3
The diagram below illustrates the process of storing data in stack and heap memory in the Python program above.
All objects are stored in heap memory. Function calls and all reference variables are stored in stack memory. This is because, in Python, all variables are references to data storage objects.
The order of data storage in the stack is as follows:
(1) The main() function will be executed first and pushed onto the stack. The x variable is pushed onto the stack and references the object storing the value 2.
(2) The main() function calls the foo() function. The parameter a of foo() is pushed onto the stack; it first references the object storing the value 2 and, after a = a * a, the object storing the value 4.
(3) The foo() function calls the bar() function. The parameter a of bar() is pushed onto the stack; it first references the object storing the value 4 and, after a = a - 1, the object storing the value 3.
(4) The b variable is pushed onto the stack and references the object storing the value 3 returned by the bar() function.
(5) The y variable is pushed onto the stack and references the object storing the value 3 returned by the foo() function.
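The order in which the calls are stacked can be inspected with the standard inspect module. The sketch below is a variation of the program above, not part of it: bar() additionally returns the names of the functions that are active at the moment it runs, innermost first. The name call_chain is introduced here only for illustration.

```python
import inspect


def bar(a):
    # Function names of the active frames, innermost first
    call_chain = [info.function for info in inspect.stack()]
    return a - 1, call_chain


def foo(a):
    a = a * a
    return bar(a)


def main():
    x = 2
    y, call_chain = foo(x)
    return x, y, call_chain


x, y, call_chain = main()
print(x, y)             # 2 3
print(call_chain[:3])   # ['bar', 'foo', 'main']
```

Each function has its own local name a, which is why changing a inside foo() or bar() never affects x in main().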
2.2. Garbage Collection in Python
Garbage Collection (GC) frees memory that is no longer in use so that it can be reused for other objects. The collection process is triggered based on thresholds. The program below prints the current thresholds.
import gc

# Get the current collection thresholds as a tuple
print("Garbage collection thresholds:", gc.get_threshold())
Result
Garbage collection thresholds: (700, 10, 10)
The GC divides objects into 3 generations: youngest generation, older generation, and oldest generation. The criterion for dividing objects is based on the lifetime of objects (or the number of times the garbage collection process has been performed while the object still exists).
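The youngest generation consists of newly created objects. When the number of object allocations minus the number of deallocations since the last collection exceeds the first threshold (700 in the output above), the GC runs on the youngest generation. The remaining thresholds (10 and 10) control how often the older generations are collected: roughly, the older generation is examined after the youngest generation has been collected 10 times, and the oldest generation after the older generation has been collected 10 times. The default values and the exact rules can differ between Python versions.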
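The generational collector exists mainly for objects that reference each other in a cycle, because their reference counters never drop to zero on their own. The sketch below assumes CPython; the Node class is hypothetical and serves only to build such a cycle. Automatic collection is paused during the demonstration so that the result does not depend on timing.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only to build a reference cycle."""

    def __init__(self):
        self.partner = None


was_enabled = gc.isenabled()
gc.disable()                  # pause automatic collection for the demonstration

a = Node()
b = Node()
a.partner = b                 # a references b ...
b.partner = a                 # ... and b references a: a cycle
probe = weakref.ref(a)        # observes a without keeping it alive

del a, b                      # no names are left, but the cycle keeps both objects alive
alive_before = probe() is not None

unreachable = gc.collect()    # run a full collection; returns the number of unreachable objects found
alive_after = probe() is not None

if was_enabled:
    gc.enable()

print(alive_before)           # True: reference counting alone cannot free the cycle
print(alive_after)            # False: the collector found and freed it
```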
Reference: pythoneasy, geeksforgeeks.