Handling some practical Issues With Django querysets

(Comments)

In this post I would like address some the problems that developers regularly face with Querysets.


Before we set out to solve our queryset problems I will remind you that we can view the SQL queries that will actually be executed in the database with the query attribute of the queryset.

In [1]: from movies.models import Movie

In [2]: all_movies = Movie.objects.all()

In [3]: print(all_movies.query)
SELECT "movies_movie"."id", "movies_movie"."created", "movies_movie"."updated", "movies_movie"."title", "movies_movie"."slug", "movies_movie"."image", "movies_movie"."year" FROM "movies_movie"

Excluding records - the NOT operator

SQL queries have the NOT operator or one or more of the == or <> or != relational operators to use in filtering the records. Django developers widely use the filter() method to SELECT rows of the database. The WHERE clauses of SQL use the inequality operators (or NOT sometimes) to exclude records form the result. Django's querysets provide a exclude() method to exclude object from the result. For example:

In [2]: all_movies = Movie.objects.all()

In [3]: print(all_movies)
<QuerySet [<Movie: Inception (2010)>, <Movie: Interstellar (2014)>]>

The all_movies queryset has all the movies in its result set. We can use the filter() method of the queryset to include in the result only the movies that satisfy the given condition.

In [8]: latest_movies = all_movies.filter(year__gte=2012)

In [9]: print(latest_movies)
<QuerySet [<Movie: Interstellar (2014)>]>

For example, the filter() method will modify the queryset to include only those records (movies) that have year field value greater than or equal to 2012. To simplify, filter() method will specify the records to include in the result. Similarly we can use the exclude() method to specify what models to exclude from the result.

In [13]: non_thriller = all_movies.exclude(genres__in=Genre.objects.filter(name='Thriller'))

In [14]: print(non_thriller)
<QuerySet [<Movie: Interstellar (2014)>]>

We have used the exlude() method to filter out the movies that belong to the genre named 'Thriller'. As you can see the result, the movie Inception is filtered out.

Using the OR SQL operator with querysets

The filter() method of the querysets only "AND"s its keyword arguments. Django does not provide a direct way of using the OR operator in filter, but provides the Q object to build such complex queries.


Django's official documentation says:

A Q object (django.db.models.Q) is an object used to encapsulate a collection of keyword arguments. These keyword arguments are specified as in “Field lookups” above.

A Q object can be used to specify the keyword arguments in the filter method. For example,

Movie.objects.filter(Q(year__gte=2015))

is equivalent to

Movie.objects.filter(year__gte=2015)

If there are multiple Q objects passed to filter() method as positional arguments, then they will be "AND"ed by the method. For example,

Movie.objects.filter(Q(year__gte=2015), Q(genres__in=Genre.objects.filter(name='Thriller')))

is equivalent to

Movie.objects.filter(year__gte=2015, genres__in=Genre.objects.filter(name='Thriller'))

If we want to perform OR operation in the SQL query, we combine the Q objects with the | operator as follows:

Movie.objects.filter(Q(year__gte=2015) | Q(genres__in=Genre.objects.filter(name='Thriller')))

The above statement gives all the Movies that are either released in 2015 or before OR that are of genre Thriller. We can combine the Q objects with & (for ANDing) and | (for ORing) operators to form complex queries. As a side note, the above query can result in model object duplicates. To retrieve only the DISTINCT objects, use the distinct() method of querysets.

Aggregation

The list of aggregation functions supported by Django can be found here. Aggregation using the aggregate() function on a queryset returns a dictionary, not a queryset.

q = Movie.objects.aggregate(Count('title'))

q
{'title__count': 2}

While this is useful, often aggregate functions are used to append another column with the result of aggregate function, along with the returned queryset. This SQL operation can be achieved with the annotate() method as follows.

In [41]: q = Movie.objects.annotate(Count('genres'))

In [42]: q
Out[42]: <QuerySet [<Movie: Interstellar (2014)>, <Movie: Inception (2010)>]>

In [43]: q[0].genres__count
Out[43]: 2

In [44]: q[1].genres__count
Out[44]: 3

The resulting column name of the annotate() method can be specified by making the aggregate function a keyword argument with keyword being the name of the resulting column. Er... what I meant is that if in the above query we want to change the column name (of genres__count) to num_genres, we can achieve that with:

In [45]: q = Movie.objects.annotate(num_genres=Count('genres'))

In [46]: q[1].num_genres
Out[46]: 3

Because the result of annotate method is a queryset, all the fitering can be applied on the result as with a general queryset.

Handling NULL values in database

Another common issue developers face is working with NULL values in the database fields. If we want to select (filter) the records one of whose columns is NULL, our intuitive mind would suggest something like Movie.objects.filter(year=None). But that does not work in case of Django querysets. The way to do it is:

q = Movie.objects.filter(year__isnull=True)

Most often, the field values are also blank. To get objects with empty values, we can use:

q = Movie.objects.filter(year='')

To combine both do:

q = Movie.objects.filter(Q(year__isnull=True) | Q(year=''))

Comments

Recent Posts

Archive

2022
2021
2020
2019
2018
2017
2016
2015
2014

Tags

Authors

Feeds

RSS / Atom