In Chapter 5, we presented an introduction to Django’s database layer – how to define models and how to use the database API to create, retrieve, update and delete records. In this chapter, we’ll introduce you to some more advanced features of this part of Django.
When we introduced the syncdb command in Chapter 5, we noted that syncdb merely creates tables that don’t yet exist in your database – it does not sync changes in models or perform deletions of models. If you add or change a model’s field, or if you delete a model, you’ll need to make the change in your database manually. This section explains how to do that.
When dealing with schema changes, it’s important to keep a few things in mind about how Django’s database layer works:
Making schema changes is a matter of changing the various pieces – the Python code and the database itself – in the right order.
When adding a field to a table/model in a production setting, the trick is to take advantage of the fact that Django doesn’t care if a table contains columns that aren’t defined in the model. The strategy is to add the column in the database, and then update the Django model to include the new field.
However, there’s a bit of a chicken-and-egg problem here, because in order to know how the new database column should be expressed in SQL, you need to look at the output of Django’s manage.py sqlall command, which requires that the field exist in the model. (Note that you’re not required to create your column with exactly the same SQL that Django would, but it’s a good idea to do so, just to be sure everything’s in sync.)
The solution to the chicken-and-egg problem is to use a development environment instead of making the changes on a production server. (You are using a testing/development environment, right?) Here are the detailed steps to take.
First, take these steps in the development environment (i.e., not on the production server):
Then on the production server perform these steps:
For example, let’s walk through what we’d do if we added a num_pages field to the Book model from Chapter 5. First, we’d alter the model in our development environment to look like this:
class Book(models.Model): title = models.CharField(max_length=100) authors = models.ManyToManyField(Author) publisher = models.ForeignKey(Publisher) publication_date = models.DateField() num_pages = models.IntegerField(blank=True, null=True) def __unicode__(self): return self.title
(Note: Read the section “Making Fields Optional” in Chapter 6, plus the sidebar “Adding NOT NULL Columns” below for important details on why we included blank=True and null=True.)
Then we’d run the command manage.py sqlall books to see the CREATE TABLE statement. Depending on your database backend, it would look something like this:
CREATE TABLE "books_book" ( "id" serial NOT NULL PRIMARY KEY, "title" varchar(100) NOT NULL, "publisher_id" integer NOT NULL REFERENCES "books_publisher" ("id"), "publication_date" date NOT NULL, "num_pages" integer NULL );
The new column is represented like this:
"num_pages" integer NULL
Next, we’d start the database’s interactive shell for our development database by typing psql (for PostgreSQL), and we’d execute the following statements:
ALTER TABLE books_book ADD COLUMN num_pages integer;
Adding NOT NULL Columns
There’s a subtlety here that deserves mention. When we added the num_pages field to our model, we included the blank=True and null=True options. We did this because a database column will contain NULL values when you first create it.
However, it’s also possible to add columns that cannot contain NULL values. To do this, you have to create the column as NULL, then populate the column’s values using some default(s), and then alter the column to set the NOT NULL modifier. For example:
BEGIN; ALTER TABLE books_book ADD COLUMN num_pages integer; UPDATE books_book SET num_pages=0; ALTER TABLE books_book ALTER COLUMN num_pages SET NOT NULL; COMMIT;
If you go down this path, remember that you should leave off blank=True and null=True in your model (obviously).
After the ALTER TABLE statement, we’d verify that the change worked properly by starting the Python shell and running this code:
>>> from mysite.books.models import Book >>> Book.objects.all()[:5]
If that code didn’t cause errors, we’d switch to our production server and execute the ALTER TABLE statement on the production database. Then, we’d update the model in the production environment and restart the Web server.
Removing a field from a model is a lot easier than adding one. To remove a field, just follow these steps:
Remove the field from your model and restart the Web server.
Remove the column from your database, using a command like this:
ALTER TABLE books_book DROP COLUMN num_pages;
Make sure to do it in this order. If you remove the column from your database first, Django will immediately begin raising errors.
Because many-to-many fields are different than normal fields, the removal process is different:
Remove the ManyToManyField from your model and restart the Web server.
Remove the many-to-many table from your database, using a command like this:
DROP TABLE books_book_authors;
As in the previous section, make sure to do it in this order.
Removing a model entirely is as easy as removing a field. To remove a model, just follow these steps:
Remove the model from your models.py file and restart the Web server.
Remove the table from your database, using a command like this:
DROP TABLE books_book;
Note that you might need to remove any dependent tables from your database first – e.g., any tables that have foreign keys to books_book.
As in the previous sections, make sure to do it in this order.
In the statement Book.objects.all(), objects is a special attribute through which you query your database. In Chapter 5, we briefly identified this as the model’s manager. Now it’s time to dive a bit deeper into what managers are and how you can use them.
In short, a model’s manager is an object through which Django models perform database queries. Each Django model has at least one manager, and you can create custom managers in order to customize database access.
There are two reasons you might want to create a custom manager: to add extra manager methods, and/or to modify the initial QuerySet the manager returns.
Adding extra manager methods is the preferred way to add “table-level” functionality to your models. (For “row-level” functionality – i.e., functions that act on a single instance of a model object – use model methods, which are explained later in this chapter.)
For example, let’s give our Book model a manager method title_count() that takes a keyword and returns the number of books that have a title containing that keyword. (This example is slightly contrived, but it demonstrates how managers work.)
# models.py from django.db import models # ... Author and Publisher models here ... class BookManager(models.Manager): def title_count(self, keyword): return self.filter(title__icontains=keyword).count() class Book(models.Model): title = models.CharField(max_length=100) authors = models.ManyToManyField(Author) publisher = models.ForeignKey(Publisher) publication_date = models.DateField() num_pages = models.IntegerField(blank=True, null=True) objects = BookManager() def __unicode__(self): return self.title
With this manager in place, we can now do this:
>>> Book.objects.title_count('django') 4 >>> Book.objects.title_count('python') 18
Here are some notes about the code:
Why would we want to add a method such as title_count()? To encapsulate commonly executed queries so that we don’t have to duplicate code.
A manager’s base QuerySet returns all objects in the system. For example, Book.objects.all() returns all books in the book database.
You can override a manager’s base QuerySet by overriding the Manager.get_query_set() method. get_query_set() should return a QuerySet with the properties you require.
For example, the following model has two managers – one that returns all objects, and one that returns only the books by Roald Dahl.
from django.db import models # First, define the Manager subclass. class DahlBookManager(models.Manager): def get_query_set(self): return super(DahlBookManager, self).get_query_set().filter(author='Roald Dahl') # Then hook it into the Book model explicitly. class Book(models.Model): title = models.CharField(max_length=100) author = models.CharField(max_length=50) # ... objects = models.Manager() # The default manager. dahl_objects = DahlBookManager() # The Dahl-specific manager.
With this sample model, Book.objects.all() will return all books in the database, but Book.dahl_objects.all() will only return the ones written by Roald Dahl. Note that we explicitly set objects to a vanilla Manager instance, because if we hadn’t, the only available manager would be dahl_objects.
Of course, because get_query_set() returns a QuerySet object, you can use filter(), exclude() and all the other QuerySet methods on it. So these statements are all legal:
Book.dahl_objects.all() Book.dahl_objects.filter(title='Matilda') Book.dahl_objects.count()
This example also pointed out another interesting technique: using multiple managers on the same model. You can attach as many Manager() instances to a model as you’d like. This is an easy way to define common “filters” for your models.
class MaleManager(models.Manager): def get_query_set(self): return super(MaleManager, self).get_query_set().filter(sex='M') class FemaleManager(models.Manager): def get_query_set(self): return super(FemaleManager, self).get_query_set().filter(sex='F') class Person(models.Model): first_name = models.CharField(max_length=50) last_name = models.CharField(max_length=50) sex = models.CharField(max_length=1, choices=(('M', 'Male'), ('F', 'Female'))) people = models.Manager() men = MaleManager() women = FemaleManager()
This example allows you to request Person.men.all(), Person.women.all(), and Person.people.all(), yielding predictable results.
If you use custom Manager objects, take note that the first Manager Django encounters (in the order in which they’re defined in the model) has a special status. Django interprets this first Manager defined in a class as the “default” Manager, and several parts of Django (though not the admin application) will use that Manager exclusively for that model. As a result, it’s often a good idea to be careful in your choice of default manager, in order to avoid a situation where overriding of get_query_set() results in an inability to retrieve objects you’d like to work with.
Define custom methods on a model to add custom “row-level” functionality to your objects. Whereas managers are intended to do “table-wide” things, model methods should act on a particular model instance.
This is a valuable technique for keeping business logic in one place – the model.
An example is the easiest way to explain this. Here’s a model with a few custom methods:
from django.contrib.localflavor.us.models import USStateField from django.db import models class Person(models.Model): first_name = models.CharField(max_length=50) last_name = models.CharField(max_length=50) birth_date = models.DateField() address = models.CharField(max_length=100) city = models.CharField(max_length=50) state = USStateField() # Yes, this is U.S.-centric... def baby_boomer_status(self): "Returns the person's baby-boomer status." import datetime if datetime.date(1945, 8, 1) <= self.birth_date <= datetime.date(1964, 12, 31): return "Baby boomer" if self.birth_date < datetime.date(1945, 8, 1): return "Pre-boomer" return "Post-boomer" def is_midwestern(self): "Returns True if this person is from the Midwest." return self.state in ('IL', 'WI', 'MI', 'IN', 'OH', 'IA', 'MO') def _get_full_name(self): "Returns the person's full name." return u'%s %s' % (self.first_name, self.last_name) full_name = property(_get_full_name)
The last method in this example is a “property.” Read more about properties at http://www.python.org/download/releases/2.2/descrintro/#property
And here’s example usage:
>>> p = Person.objects.get(first_name='Barack', last_name='Obama') >>> p.birth_date datetime.date(1961, 8, 4) >>> p.baby_boomer_status() 'Baby boomer' >>> p.is_midwestern() True >>> p.full_name # Note this isn't a method -- it's treated as an attribute u'Barack Obama'
Sometimes you’ll find that the Django database API can only take you so far, and you’ll want to write custom SQL queries against your database. You can do this very easily by accessing the object django.db.connection, which represents the current database connection. To use it, call connection.cursor() to get a cursor object. Then, call cursor.execute(sql, [params]) to execute the SQL and cursor.fetchone() or cursor.fetchall() to return the resulting rows. For example:
>>> from django.db import connection >>> cursor = connection.cursor() >>> cursor.execute(""" ... SELECT DISTINCT first_name ... FROM people_person ... WHERE last_name = %s""", ['Lennon']) >>> row = cursor.fetchone() >>> print row ['John']
connection and cursor mostly implement the standard Python “DB-API,” which you can read about at http://www.python.org/peps/pep-0249.html. If you’re not familiar with the Python DB-API, note that the SQL statement in cursor.execute() uses placeholders, "%s", rather than adding parameters directly within the SQL. If you use this technique, the underlying database library will automatically add quotes and escaping to your parameter(s) as necessary.
Rather than littering your view code with these django.db.connection statements, it’s a good idea to put them in custom model methods or manager methods. For example, the above example could be integrated into a custom manager method like this:
from django.db import connection, models class PersonManager(models.Manager): def first_names(self, last_name): cursor = connection.cursor() cursor.execute(""" SELECT DISTINCT first_name FROM people_person WHERE last_name = %s""", [last_name]) return [row for row in cursor.fetchone()] class Person(models.Model): first_name = models.CharField(max_length=50) last_name = models.CharField(max_length=50) objects = PersonManager()
And sample usage:
>>> Person.objects.first_names('Lennon') ['John', 'Cynthia']
In the next chapter, we’ll show you Django’s “generic views” framework, which lets you save time in building Web sites that follow common patterns.