Jan 12, 2017

Django Under The Hood 2016: Performance Debugging

After Django validation, debugging was another interesting presentation from the Django Under the Hood conference by Aymeric Augustin.

The introduction covered the Browser Console and the Django Debug Toolbar: both show the Page load time of a website, split into multiple stages. We use the Django Debug Toolbar pretty often, however mostly for inspecting SQL queries. It is easy to get too many SQL queries without the proper use of select_related and prefetch_related, which are very popular and almost necessary Django methods on Querysets. At this point, we have no need to improve our Page load time but the detailed stages description and DOM loading in Django were useful concepts to learn about.

It is only advised to use the Debug Toolbar in development, which is why it is disabled on production by default. It was not mentioned in the presentation but for production timings/SQL queries we use Opbeat, which looks like this:

Aymeric had an interesting statement supported by various tests from his blog and Facebook loading times:

The backend makes ~15% of a page load time.

UX and front-end was not a priority for us, which will probably change in year 2017. We have very little JavaScript modules in our app and did not need to deal with page speed. Aymeric explained about page loading stages from Django’s point of view pretty well.

Fetching related data

Comparison of select_related and prefetch_related was interesting. Assuming you know what these do, Aymeric made a nice comparison of both methods. Select related is used on Foreign keys (FK) or One-to-One (forwards and backwards). Select related can be called without any arguments and it will follow all non-null foreign keys , but it’s not recommended since it can return more data than is actually needed.

Select related only fetches non-null foreign keys.

  • If the FK is non-nullable, Django does an INNER JOIN.
  • If the FK is nullable, Django does LEFT OUTER JOIN, which is much slower in some circumstances. Django Admin automatically uses select_related for non-nullable FKs.

Prefetch related can be used in M2M relationships. It does not do joins between tables, but fetches original objects in the first SQL query and related objects in the second SQL query.
It can therefore be used on any kind of relationship: one-to-one, ForeignKey or ManyToMany.

It uses an extra SQL query, but as Aymeric demonstrated during a comparison in his presentation, sometimes prefetch_related is quicker than select_related and sometimes it is the other way around.
Django 1.10 introduces prefetch_related_objects, which prefetches objects that do not come from Queryset.

ORM optimization

The logical way to use Django Querysets is to get only the database columns we actually need using values() or values_list().
For example, let’s say we have an Employee model and want to get idfirst_name and team.name of each employee. (the test DB contains 1000 Employee objects and ~90 Team objects)

Let’s compare all()values() and values_list():

Usage of all()

>>> for e in Employee.objects.select_related('team').all():
... print(e.id, e.first_name, e.team.name)
215 Jennifer Marketing
...

Time:

import time
start = time.time()
for e in Employee.objects.select_related('team').all():
print(e.id, e.first_name, e.team.name)end = time.time()
print(end-start)
0.24029803276062012

Memory:

import tracemalloctracemalloc.start()
for e in Employee.objects.select_related('team').all():
print(e.id, e.first_name, e.team.name)tracemalloc.get_traced_memory()
# (266853, 6029062) -> (260,6KB current, 5.75MB peak)

Usage of values()

>>> for e in Employee.objects.select_related('team')\
... .values('id', 'first_name', 'team__name'):
print(e['id'], e['first_name'], e['team__name'])
215 Jennifer Marketing
...

Time:

import time
start = time.time()
for e in Employee.objects.select_related('team')\
.values('id', 'first_name', 'team__name'):
print(e['id'], e['first_name'], e['team__name'])

end = time.time()
print(end-start)0.04195404052734375

Memory:

import tracemalloctracemalloc.start()
for e in Employee.objects.select_related('team').all():
print(e.id, e.first_name, e.team.name)tracemalloc.get_traced_memory()
# (175740, 613556) -> (171,6KB current, 599,2KB peak)

Usage of values_list()

>>> for e in Employee.objects.select_related(‘team’)\
... .values_list(‘id’, ‘first_name’, ‘team__name’):
... print(e[0], e[1], e[2])
215 Jennifer Marketing
...

Time:

import time
start = time.time()
for e in Employee.objects.select_related('team')\
.values_list('id', 'first_name', 'team__name'):
print(e[0], e[1], e[2])

end = time.time()
print(end-start)
0.03514695167541504

Memory:

import tracemalloc
tracemalloc.start()
for e in Employee.objects.select_related('team')\
.values_list('id', 'first_name', 'team__name'):
print(e[0], e[1], e[2])

tracemalloc.get_traced_memory()
# (232395, 385494)-> (227KB current, 376,5KB peak)

Overview:

╔═══════════════╦═════════╦═══════╦═════════════╗
║ ║ usage ║ time ║ memory peak ║
╠═══════════════╬═════════╬═══════╬═════════════╣
║ all() ║ e.id ║ 0.240 ║ 5.75MB ║
║ values() ║ e['id'] ║ 0.041 ║ 599,2KB ║
║ values_list() ║ e[0] ║ 0.035 ║ 376,5KB ║
╚═══════════════╩═════════╩═══════╩═════════════╝

So value_list is better in terms of time and memory, however working with just a list of results is often not enough. Personally e.team.name is more readable than e[‘team__name’] or e[‘1’].

When creating models, attributes should be coupled by use case. If some attributes are not often needed with others, they should be normalised/moved to a separate model (especially columns with large amounts of data) and connected with JOINs (either select_related or prefetch_related) instead of creating one large model. If you have a large model, using all() method takes a lot of memory and time to load.

In addition to all()values() and values_list() Django provides only() which fetches specified columns and defer() does the opposite. Unfortunately while it is a lot faster in the database, it is actually slower because of the Python managing overhead.

╔═════════════════════╦══════╦══════════════════════╗
║ ║ time ║ memory(current/peak) ║
╠═════════════════════╬══════╬══════════════════════╣
║ .only('team__name') ║ 0.23 ║ 250KB/1,29MB ║
║ all() ║ 0.25 ║ 260,6KB/5.75MB ║
╚═════════════════════╩══════╩══════════════════════╝

Django instances are expensive (calling signals after every object creation)!
Instead of grouping and calculating in Python, the obvious way to go is aggregation and annotation in Django.

Iterator

The last one which is rarely used is iterator. If you have a database with good response times and you prefer to save RAM when doing a number of SQL queries, you might want to check out iterator. It iterates over instances but does not cache results in RAM. If results are needed again, they are not loaded from RAM but another SQL query is fired up. This avoids keeping entire Querysets in RAM. It is used in thedjango-import-export library for example.

Inspect SQL query

If you’re still struggling and raw SQL doesn’t make your blood run cold, printing raw SQL query in the inspect further (EXPLAIN command is easy to use and has tons of information):

print(queryset.query)

>>> print(Team.objects.filter(id=1).query)
SELECT “employees_team”.”id”, “employees_team”.”name” FROM “employees_team” WHERE “employees_team”.”id” = 1

Conclusion

In summary, if you want to know about the stages of page loading from Django’s point of view, similar speed tests on bigger databases, how to speed up your Django page, or you just wanna see cool pictures of flower gardens and bugs, watch the whole debugging presentation!

Search
Share
Featured articles
Don’t Fix Bad Data, Do This Instead
The Relevance Of Tech Conferences In A Post Pandemic World