Caching counts for Django's built-in pagination
If you've ever used the Django debug toolbar on a Django app with a large database, chances are you've seen that counts can be quite slow and are generally something to avoid. You'll probably want to either cache or guesstimate them, and you may have even written some code to do it yourself. If you've ever gotten stuck on Django's object_list still triggering .count(), then I have a solution for you.

The catch is that you can monkey-patch your queryset's .count() method and it will still trigger a database call. Why? object_list clones your queryset, and the clone gets the original .count() method back. You'll need to monkey-patch the ._clone() method so that it re-patches .count() on every clone it returns. Then things will magically work, and you won't get the FBI or CIA knocking on your door asking why you were tampering with querysets unsuccessfully.
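To see why patching .count() alone isn't enough, here's a minimal sketch using a toy stand-in class. ToyQuerySet is hypothetical and only imitates the one behavior that matters here: cloning builds a fresh object, so anything patched onto the old instance is left behind.

```python
class ToyQuerySet:
    """Hypothetical stand-in for a Django QuerySet; not the real class."""

    def __init__(self, rows):
        self.rows = rows

    def _clone(self):
        # Builds a brand-new object, so instance-level patches are not copied.
        return ToyQuerySet(self.rows)

    def count(self):
        return len(self.rows)


qs = ToyQuerySet([1, 2, 3])

# Attempt 1: patch count() on this one instance only.
qs.count = lambda: "cached"
patched_direct = qs.count()          # the patch works on the original...
lost_on_clone = qs._clone().count()  # ...but the clone uses the class's count()

# Attempt 2: patch _clone() too, so every clone gets count() re-patched.
original_clone = qs._clone


def patched_clone():
    clone = original_clone()
    clone.count = lambda: "cached"
    return clone


qs._clone = patched_clone
kept_on_clone = qs._clone().count()  # the patch survives the clone
```

The real Django version has the same shape, with a cache lookup in place of the lambda.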
Here's the code:
import hashlib

from django.core.cache import cache


class CachedCountCloneProxy(object):
    ''' Replaces a QuerySet's ._clone() so every clone gets a cached count().
        object_list calls ._clone(), which would otherwise discard a
        monkey-patched count(), so we have to monkey-patch ._clone() first...
    '''

    def __init__(self, queryset):
        self._queryset = queryset
        # Keep the real _clone around so we can still call it.
        self._queryset._clone_original = self._queryset._clone

    def __call__(self, *args, **kwargs):
        # Pass through any arguments, since _clone's signature varies by Django version.
        queryset = self._queryset._clone_original(*args, **kwargs)
        queryset.count = CachedCountProxy(queryset)
        return queryset


class CachedCountProxy(object):
    ''' This allows us to monkey-patch count() on QuerySets so we can cache it and speed things up.
    '''

    def __init__(self, queryset):
        self._queryset = queryset
        self._queryset._original_count = self._queryset.count
        # Render the query's SQL with its parameters; the cache key is derived from it.
        sql, params = self._queryset.query.get_compiler(self._queryset.db).as_sql()
        self._sql = sql % params

    def __call__(self):
        ''' 1. Check cache
            2. Return cache if it's set
            3. If it's not set, call the original count()
            4. Cache that for 5 minutes
        '''
        # hashlib needs bytes on Python 3, so encode the SQL first.
        key = "paginator_count_%s" % hashlib.sha224(self._sql.encode("utf-8")).hexdigest()
        count = cache.get(key)
        if count is None:
            count = self._queryset._original_count()
            cache.set(key, count, 300)
        return count


# To use:
# queryset._clone = CachedCountCloneProxy(queryset)  # monkeypatch solution to cache the count for performance
One thing to note: this is a generic solution that caches based on the SQL that is generated, so any queryset that produces the same SQL shares the same cached count. There's probably a better way, but anything is faster than .count() on a large database table.
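If you want to see the key derivation on its own, here's a rough sketch that needs no Django at all. The helper name count_cache_key and the sample SQL are made up for illustration; the prefix and the sha224 hashing mirror the code above. Hashing keeps the key short and free of whitespace, which matters for backends like memcached that reject keys containing spaces or longer than 250 bytes.

```python
import hashlib


def count_cache_key(sql, params=(), prefix="paginator_count_"):
    """Hypothetical helper: build a cache key from a SQL template and its parameters."""
    # Fill the parameters into the SQL so different filter values get different keys.
    rendered = sql % tuple(params)
    digest = hashlib.sha224(rendered.encode("utf-8")).hexdigest()
    return prefix + digest


# Sample SQL, made up for illustration.
query = "SELECT * FROM book WHERE author_id = %s"

key_a = count_cache_key(query, (1,))
key_a_again = count_cache_key(query, (1,))
key_b = count_cache_key(query, (2,))
```

Here key_a and key_a_again are identical, so the second lookup would hit the cache, while key_b differs because the parameter changed.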
Yes, I'm aware this is a hack. The best way to fix this would probably be patching Django or something. I didn't bother, so sue me.