Django: creating an "AutoField", but not ForeignKey

A model was giving me nighmares, in the form of "500: duplicate key" mails. A field needed to be an Integer with autoincrement, but not a PrimaryKey because the field is NULL by default. Django explicitly forbids this (an AutoField must be the primary key), so I had to adapt.

This story started slowly: the model had a true PrimaryKey (an Uuid) that works without troubles, and a second field we can call Ticket that is NULL at first, but then turns into a positive integer when the Model is saved/modified the first time. After that, the Ticket never changes again.

In hindsight

I should have splitted the model in two models: one with the PrimaryKey, and a second model with an IntegerField as its PrimaryKey linked one-to-one against the first. When the first model is saved, it gets-or-create the linked model, and the Django PrimaryKey deals with the autoincrement.

How it started

The first requirement was the model needs a TicketField which will be a random number between 1 and 999_999. It needs to be unique, but not consecutive, and more importantly: it is pre-loaded, so NULL is not a posibility. I defined the field as:

class Table(models.Model):
    pk = UUIDField(primary_key=True, default=uuid.uuid4, editable=False
    ticket = PositiveIntegerField(unique=True, editable=False)

The idea was to pre-load data from a Csv into the fields, so "ticket" can be calculated on the fly with any random generator. The meat of the model was in the other fields (not shown), this was only a human readable tag for the Model.

How it evolved

Somebody didn't feel right with the "random" part of this story, so it demanded to be sequential. No problem, we could give sequential values even easier than random values. But in the next cycle, it was decided that everything should start empty, and only after saving or updating, the field "ticket" should get a value. And not a random value, but the next integer following the higher integer in the table.

I though I could get away with this, even knowing that it contains a race condition, but I though it would almost never trigger on a low-usage app:

ticket = PositiveIntegerField(unique=True, editable=False, null=True)
...
def save(...):
    if (
        latest := Table.objects.filter(ticket__isnull=False)
         .order_by("-ticket")
         .first()
        ):
        if latest.ticket:
            self.ticket = latest.ticket + 1
     super().save(...)

Translated: give me the Table object with the highest "ticket" value, and I will use that value + 1 to create this ticket.

This worked reasonably well at first, but some days in the app load rose a bit, and some 5-10 "ticket is duplicated" errors started to pile in my inbox. Damn. In the time that goes between the "filter().order_by().first()" and the "super().save()", another user got the same "self.ticket" and triggered the race clash. The end user sees a "500 Error, please retry", and life goes on. But I decided to get it better with this:

latest = Table.objects.aggregate(Max("ticket")).get("ticket__max")
self.ticket = latest + 1

The timings it the database shown this takes 10 times less than the previous "filter().order_by().first()" (3 ms vs 25 ms), so I thought I could get away with it. But reality is harsh, usage kept growing and I was still receiving 3 to 6 mails per day with the same damn error.

How it ended

I decided to go for the sequence, a table that gives you the next integer. Just what I needed. Unfortunately, I couldn't find anyway for Django to use sequences directly, so I did it manually. First, I built the sequence in a migration:

$ python manage.py makemigrations appname --empty
Migrations for 'appname':
  appname/migrations/00XX_auto_20231030_1648.py

In the migration, create the sequence table and initializa it with the highest number for "ticket":

from django.db import connection, migrations
from django.db.models import Max


def forwards(apps, schema_editor):
    TableModel = apps.get_model("appname", "Table")
    if not (
        max_ticket := TableModel.objects.aggregate(Max("ticket"))
        .get("ticket__max")
    ):
        max_ticket = 0

    with connection.cursor() as cursor:
        cursor.execute(
            "CREATE SEQUENCE ticket_sequence "
            + f"OWNED BY appname_table.ticket START {max_ticket + 1}"
        )


def reverse(apps, schema_editor):
    with connection.cursor() as cursor:
        cursor.execute("DROP SEQUENCE IF EXISTS ticket_sequence")


class Migration(migrations.Migration):
    dependencies = [
        ("appname", "00XX_previous_migration"),
    ]

    operations = [
        migrations.RunPython(forwards, reverse),
    ]

Translated: get the maximum value for the Table model, ticket field, and create a new sequence starting with that value, with zero otherwise. If the migration is undone, drop the sequence.

Then proceed to use the recently created sequence in the model:

def save(...):
    with connection.cursor() as c:
        c.execute("SELECT nextval('ticket_sequence')")
        self.ticket = c.fetchone()[0]

    super().save()

And solved! As soon as the Model gets the "nextval", it doesn't matter if it takes forever to save() the model and meanwhile other models get saved. So far, now new mails with 500.

Takeaway

  1. Think your models as clearly as possible from the begining.
  2. Don't let anyone take decisions on database schema, unless they have more expertise than you and want to take the whole matter and problems that should arise in their hands. This effectively means never going to happen.