mediagoblin-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[GMG-Devel] Adventures in database migrations


From: ayleph
Subject: [GMG-Devel] Adventures in database migrations
Date: Wed, 2 Dec 2015 22:56:40 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0

I recently upgraded a moderate-sized database from migration 26 to
migration 43. I encountered a few issues along the way and had to attack
this in steps. I wanted to share my experience in case we can make
things easier for others.

Migration 29 caused me a couple of issues.

Issue 29-1: I had numerous errors of the type "AttributeError:
'NoneType' object has no attribute 'id'. These were caused by entries in
my core__activities table which had a media_id column entry that no
longer existed in core__media_entries. My core__activities table had
roughly 9000 entries, and over 1300 of those referenced media entries
which no longer existed on the server.

Solution 29-1: I did this the long way. I added a line to
mediagoblin/db/migrations.py to print the id of each entry in
core__activities that returned an error. I collected all of these and
ran them through a bash script that connected to the psql database and
ran "delete from core__activities where id=$id". There's probably a
smarter/quicker way to do this in sql statements.

Issue 29-2: Migration 29 ate my hard disk. This migration iterates
through core__activities (which, as I mentioned, was about 9000 items on
this instance). As the migration chugged through this giant for loop, it
kept eating disk space until my /var partition was full. My entire
database only takes up about 28M in a psql dump, but the live database
grew to over 1GB during the migration.

Solution 29-2: I modified mediagoblin/db/migrations.py to perform
db.commit() after each entry in the gigantic for loop instead of once at
the very end. This probably created a pretty big IO hit, but I'll take
that over a full partition and a failed migration.

Migration 35 caused the same disk issue as 29-2. Again, I resolved this
by moving the db.commit() statement inside the for loop.

Migration 42 caused an issue similar to 29-1. I got errors saying "null
value in column 'obj_pk' violates not-null constraint." In this case,
the offending entries were rows inside core__reports_on_media whose
media_entry_id was null and rows inside core__reports whose object_id
was null. To get past this, I just nuked all the offending rows.

If any of these sound like bugs, I'd be happy to file reports on the
issue tracker. It's highly likely that some of my problems were caused
by failed account deletions that left things behind in the database,
such as errors deleting accounts that have created blogs [1]. I feel
like the db.commit issues need a better solution than "get a bigger hard
drive" though.

Something else I should mention is that I had problems trying to upgrade
using a series of commits instead of jumping straight to current master.
To debug the migration failures, I would checkout a specific commit and
see what happened during the migration. I found that in some cases,
choosing the wrong commit would cause issues. For example, at some point
in time the core__activities table definition changes from having a
column named "object" to having one named "object_id". If you run a
migration from an earlier commit, you end up with the "object" column
which causes problems down then road. Just something to be aware of for
anyone else debugging in steps like I tried to do.

[1] https://issues.mediagoblin.org/ticket/5308

-- 
ayleph


reply via email to

[Prev in Thread] Current Thread [Next in Thread]