Case-insensitive ordering with Django and PostgreSQL
When the Django Gigs site first went live we noticed the ordering of developers by name was not right. Those starting with an uppercase letter were coming before those starting with a lowercase letter.
PostgreSQL and the locale
PostgreSQL has a locale setting which is configured when the cluster is created. Among other things, this affects the ordering of results when you use the SQL order by
clause.
The local on my server was set to “C” which means it uses byte-level comparisons, rather than following more complex rules for a given culture. Although this is apparently good for performance, it means order by
will be case sensitive – e.g. “Zebra” comes before “apple”.
Depending on how your system is set up, you may have locales such as en_GB. The locale can’t easily be changed in PostgreSQL because indexes and other data depends on it. To change locale, you need to start a new cluster and move databases to it.
Django and case-sensitivity
Django provides the order_by() function on QuerySets, but does not have an option for case insensitive ordering. Instead this is left to your database configuration.
When using SQL directly, you can sort case-insensitively using the PostgreSQL lower()
function.
e.g.
select * from developer order by lower(name)
One way to do this in Django is to use extra to call the lower()
function, creating a virtual column which you can then order by.
e.g.
Developer.objects.all().extra( select={'lower_name': 'lower(name)'}).order_by('lower_name')
Using SQL functions could tie you to a particular database, though in this case the lower()
function is standard and should work with most databases. Some other databases do case-insensitive comparisons so wouldn’t need it.