# Python : Django : ORM : Aggregations ## Cheatsheet ```python from django.db.models import Avg, Count, Max, Q Author: name = Char( 100 ) age = Integer() Publisher: name = Char( 300 ) Book: name = Char( 300 ) pages = Integer() price = Decimal( 10.2 ) rating = Float() authors = ManyToMany( Author ) publisher = ForeignKey( Publisher ) pubdate = Date() Store: name = Char( 300 ) books = ManyToMany( Book ) Total number of books. Book.objects.count() -> 2452 Total number of books with publisher = BaloneyPress. Book.objects.filter( publisher__name = "BaloneyPress" ).count() -> 73 Average price across all books, provide default to be returned instead of None if no books exist. Book.objects.aggregate( Avg( "price", default = 0 ) ) -> { 'price__avg': 34.35 } Max price across all books, provide default to be returned instead of None if no books exist. Book.objects.aggregate( Max( "price", default = 0 ) ) -> { 'price__max': Decimal( '81.20' ) } Difference between the highest priced book and the average price of all books. Book.objects.aggregate( price_diff = Max( "price", output_field = FloatField() ) - Avg( "price" ) ) -> { 'price_diff': 46.85 } All the following queries involve traversing the Book <-> Publisher foreign key relationship backwards. Each publisher, each with a count of books as a "num_books" attribute. pubs = Publisher.objects.annotate( num_books=Count( "book" ) ) -> <QuerySet [ <Publisher: BaloneyPress>, <Publisher: SalamiPress>, ... ]> pubs[ 0 ].num_books -> 73 Each publisher, with a separate count of books with a rating above and below 5. above_5 = Count( "book", filter=Q( book__rating__gt = 5 ) ) below_5 = Count( "book", filter=Q( book__rating__lte = 5 ) ) pubs = Publisher.objects.annotate( below_5 = below_5 ).annotate( above_5 = above_5 ) pubs[ 0 ].above_5 -> 23 pubs[ 0 ].below_5 -> 12 The top 5 publishers, in order by number of books. pubs = Publisher.objects.annotate( num_books = Count( "book" ) ).order_by( "-num_books" )[ :5 ] pubs[ 0 ].num_books -> 1323 ``` ## Step-by-Step #### Generating Aggregates Over a QuerySet ```python Book..aggregate( Avg( "price" ) ) -> { "price__avg": 34.35 } Book..aggregate( foo = Avg( "price" ) ) -> { "foo": 34.35 } Book..aggregate( AGG1, ... ) # multiple aggregates ``` `aggregate()` is a terminal clause that returns a dictionary of name-value pairs. #### Generating Aggregates for Each Item in a QuerySet ```python Book..annotate( Count( "authors" ) ) -> QuerySet[ i ].authors__count Book..annotate( foo = Count( "authors" ) ) -> QuerySet[ i ].foo BAD: Book..annotate( AGG1, ... ) # multiple annotations ``` Combining multiple aggregations with annotate() will yield the wrong results because joins are used instead of subqueries. #### Joins and Aggregates ```python Store..annotate( min_price = Min( "books__price" ), max_price = Max( "books__price" ) ) Store..aggregate( min_price = Min( "books__price" ), max_price = Max( "books__price" ) ) Publisher..annotate( Count( "book" ) ) -> QuerySet[ i ].book__count ``` #### Aggregations and Other QuerySet Clauses ##### Filtering on Annotations Annotated values can also be filtered. The alias for the annotation can be used in `filter()` and `exclude()` clauses in the same way as any other model field. ```python Book..annotate( num_authors = Count( "authors" ) ).filter( num_authors__gt = 1 ) ``` If you need two annotations with two separate filters, you can use the filter argument with any aggregate. ```python highly_rated = Count( "book", filter = Q( book__rating__gte = 7 ) ) Author..annotate( num_books = Count( "book" ), highly_rated_books = highly_rated ) ``` ##### Order of annotate() and filter() Clauses If you filter on a relation before an annotation involving that relation, the filter will affect the set of related objects for the purpose of the annotation. ```python Publisher.objects.filter( book__rating__gt = 3.0 ).annotate( num_books = Count( "book" ) ) ``` The results will be wrong - it will include all publishers with at least one >= 3.0 book, and the book counts for each publisher will include only the >= 3.0 books. ##### values() When a `values()` clause is used to constrain the columns that are returned in the result set, the original results are grouped according to the unique combinations of the fields. An annotation is then provided for each unique group; the annotation is computed over all members of the group. ```python Author. .annotate( average_rating = Avg( "book__rating" ) ) vs Author..values( "name" ).annotate( average_rating = Avg( "book__rating" ) ) ``` Authors will be grouped by name, so you'll get one annotated result for each unique Author name. ##### order_by() Fields that are mentioned in the `order_by()` part of a queryset are used when selecting the output data, even if they are not otherwise specified in the `values()` call. These extra fields are used to group “like” results together and they can make otherwise identical result rows appear to be separate. ##### Aggregating Annotations ```python Book..annotate( num_authors = Count( "authors" ) ).aggregate( Avg( "num_authors" ) ) ``` #### Aggregation Functions ```python Params expressions # Strings that reference fields on the model, transforms of the field, or query expressions. output_field # An optional argument that represents the model field of the return value. filter # An optional Q object that’s used to filter the rows that are aggregated. default # Value to return when queryset or group is empty (instead of None). (Doesn't apply to Count.) **extra Avg( expression, output_field=None, distinct=False, filter=None, default=None, **extra ) Count( expression, distinct=False, filter=None, **extra ) Max( expression, output_field=None, filter=None, default=None, **extra ) Min( expression, output_field=None, filter=None, default=None, **extra ) StdDev( expression, output_field=None, sample=False, filter=None, default=None, **extra ) Sum( expression, output_field=None, distinct=False, filter=None, default=None, **extra ) Variance( expression, output_field=None, sample=False, filter=None, default=None, **extra ) ``` ### References - https://docs.djangoproject.com/en/5.0/topics/db/aggregation/ - https://docs.djangoproject.com/en/5.0/ref/models/querysets/#aggregation-functions - https://docs.djangoproject.com/en/5.0/ref/models/expressions/#django.db.models.Aggregate - https://docs.djangoproject.com/en/5.0/ref/models/conditional-expressions/#conditional-aggregation