Sunday, June 17, 2018

Alembic migrations in Django with SQLAlchemy Database Toolkit

In order to understand Alembic migrations in Django, we first need to understand the need of SQLAlchemy when Django already provides its own ORM (If you only want to read about migrations using Alembic then please skip this section).

Why use SQLAlchemy instead of Django's ORM?


Django ORM uses what is called Active Record implementation, what it means is that each record in the database is directly mapped to an python object in the code and vice-versa. The Django ORM allows developers to quickly deploy business-level data relationships, basic CRUD (Create, Read, Update and Delete) functions are very easy to implement using Django ORM.
SQLAlchemy on the other uses Data Mapper implementation and allows writing complex queries within the application. SQLAlchemy makes it possible to write complex join queries between multiple tables based on foreign key relationships, data filters can also be applied on the tables involved in the joins. Now, it is possible to write such complex queries in Django ORM as well, but it requires some degree coding to manipulate the python objects to get the desired result (in this case complex joins).
The project GraphSpace (under NRNB - National Resource for Network Biology) which I'm working on this summer as a GSOC '18 Student makes use of SQLAlchemy to carry out the all the CRUD operations as well as many complex joins & filtering.

What is Alembic?

Alembic is a lightweight database migration tool for usage with the SQLAlchemy Database Toolkit for Python. - Official, description
If you are familiar with Django ORM then you might have noticed the Migrations directory which maintains a list of data model changes made over the entire development period. These migration files are used internally by Django to apply data model changes in the database. Alembic is the migration tool for SQLAlchemy,  similar to Django migrations it also maintains a list of data model changes. Alembic also supports upgrading and downgrading between different states of data model using the migration files.

Creating a Alembic script

[I have assumed that the Alembic environment has been set up and that you have switched into the virtual environment you might be using for your project].
To create a new alembic script we use the alembic revision 
$ alembic revision -m "create user table"
Generating /path/to/yourproject/alembic/versions/840db85c5bce_create_user_table.py...done;

The new file 840db85c5bce_create_user_table.py 
"""create_user_table

Revision ID: 840db85c5bce
Revises: 
Create Date: 2018-06-23 02:27:29.434000

"""
from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision = '840db85c5bce'
down_revision = 'f8f6ba9712df'
branch_labels = None
depends_on = None


def upgrade():
    pass
 
def downgrade():
    pass

We will update the upgrade() and downgrade() functions to match the set of changes in our database.

CRUD changes in migration script


Lets say we have defined a table named app_version with the following columns :
Table app_version{
id                     TYPE Integer, Primary Key,
name                TYPE String,  Constraints[Not-Null, Unique],
version_id        TYPE Integer, Constraints[Not-Null], FOREIGN KEY references app.id,
owner_email    TYPE String,  Constraints[Not-Null], FOREIGN KEY references user.email,
json_data          TYPE String,  Constraints[Not-Null],
description        TYPE String
}
There are two foreign key references 
  • version_id references id column of the app table
  • owner_email references email  column of the user  table

The upgrade() and downgrade() function for the above table will look as follows -
upgrade():
    op.create_table(
  'app_version',
  sa.Column('id', sa.Integer, primary_key=True),
  sa.Column('name', sa.String, nullable=False, unique=True),
  sa.Column('app_id', sa.Integer, nullable=False),
  sa.Column('owner_email', sa.String, nullable=False),
  sa.Column('json_data', sa.String, nullable=False),  
  sa.Column('description', sa.String, nullable=True),
 )
op.add_column('app_version', sa.Column('created_at', sa.TIMESTAMP, server_default=sa.func.current_timestamp()))
op.add_column('app_version', sa.Column('updated_at', sa.TIMESTAMP, server_default=sa.func.current_timestamp()))

# Create New Index
op.create_index('app_version_idx_name', 'app_version', ['name'], unique=True)

# Add new foreign key reference
op.execute('ALTER TABLE app_version ADD CONSTRAINT app_version_id_fkey 
            FOREIGN KEY (app_id) REFERENCES "app" (id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE;')
op.execute('ALTER TABLE app_version ADD CONSTRAINT app_version_owner_email_fkey 
            FOREIGN KEY (owner_email) REFERENCES "user" (email) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE;')

downgrade():
    op.drop_table('app_version')
Now, let us breakdown the code to understand what is actually happening

Adding the created_at and updated_at columns


op.add_column('app_version', sa.Column('created_at', sa.TIMESTAMP, server_default=sa.func.current_timestamp()))
op.add_column('app_version', sa.Column('updated_at', sa.TIMESTAMP, server_default=sa.func.current_timestamp()))
The line above creates the created_at and updated_at columns (which are created by default when using Django ORM) We have created an Index on the name column of the table, as follows :

Creating Index


op.create_index('app_version_idx_name', 'app_version', ['name'], unique=True)
If we need to create an index over multiple column then we can just append the name of the column in the 3rd parameter like so - ['name', 'id'] will create an Index over columns Name and Id.


Creating Foreign Key References and other SQL operations


For more complex operations like adding foreign keys with one or many constraints we can use the op.execute() function and pass a custom SQL statement. In our case it looks like -
op.execute('ALTER TABLE app_version ADD CONSTRAINT app_version_id_fkey 
            FOREIGN KEY (app_id) REFERENCES "app" (id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE;')
op.execute('ALTER TABLE app_version ADD CONSTRAINT app_version_owner_email_fkey 
            FOREIGN KEY (owner_email) REFERENCES "user" (email) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE;')
Here, we have added foreign key references with the following constraints - ondelete = CASCADE & onupdate = CASCADE By writing custom SQL statements to the op.execute() function we can execute complex operations on our table.

No comments:

Post a Comment