Saturday, July 30, 2016

PostgreSQL + SQLAlchemy: accent and case insensitive search

Ignore accents & case

Maybe you know what it's like: you store UTF-8 encoded text/strings in your database but you want to be able to execute an ASCII based query ignoring all those pesky diacritics. It's not always straightforward.

The following tutorial is a compilation of 3 StackOverflow posts and it describes how to perform an accent and case insensitive search with PostgreSQL and SQLAlchemy.


Ingredients

We will assume that you have PostgreSQL already installed on your system. To add full-text search support you have to install a PostgreSQL extension unaccent. But, there is one precondition for this.

First you need to install the postgresql-contrib-9.x package:

[sudo] apt-get install postgresql-contrib-9.x
Adapt the version to match PostgreSQL version you have installed on you system. Run the following command if you are not sure about the version:
psql --version
Source: Error when creating unaccent extension on PostgreSQL

No accents shall pass!

Now that you have everything necessary in place you can use the unaccent extension. Run the following command:

echo 'CREATE EXTENSION unaccent' | psql <database> <username>
This should be run once per database. You can execute it also in PgAdmin, it's really up to you.

Source: Does PostgreSQL support “accent insensitive” collations?

SQLAlchemy support

So you can use the unaccent extension in psql and PgAdmin but how do you use it with SQLAlchemy?

The trick is to create a helper class named exactly unaccent

from sqlalchemy.sql.functions import ReturnTypeFromArgs

class unaccent(ReturnTypeFromArgs):
    pass
and use it as a function in your code
unaccented(mytable.mycolumn).ilike('%myvalue%'))
Note that I am using the ilike operator here instead of like as ilike ignores case.

Source: SQLALCHEMY ignore accents on query

The End

As stated in the intro, I take no credit for this tutorial. I simply put related links and content all together. Hope it helps!