Feb 03

Thread-Safe Object-Oriented Views in Django

An emerging design pattern in Django is the use of class-based views. Writing views as classes is possible because Python lets classes and objects behave as callables, just like functions. Class-based views can help organize view code and promote reuse by offering a greater level of customization. However, callable objects can have thread-safety issues that developers are often unaware of.

Update 2/2/11: This post is now mostly irrelevant, because Django will soon solve this problem with the newly-refactored generic view module in Django 1.3, which is due for release this month. Since I wrote this post a year ago, massive arguments over the design of class-based views erupted on the mailing list. It definitely got out of hand, but a lot was learned, and I'm happy with the solution that was committed. I still encourage people to understand how to avoid threading side-effects. I would also suggest reading the new documentation on class-based generic views.


The most common approach to class-based views is to create a callable object: an instance of a class that defines a __call__ method. The view is instantiated either as a module-level variable in the views file or in the urlconf. Jacob Kaplan-Moss has written a series of class-based generic views that follow this model.

The Problem

When writing persistent class-based views, you must be careful not to introduce stateful information in your object. From what I can tell, Jacob's implementation seems thread-safe, because the view's state appears to only be altered on initialization. The problem arises when you store request-specific variables on the view object. Because the object is only instantiated once per Python process, it persists while running multiple HTTP requests, for the life of the process. Stateful information can cause side effects including security problems.

Here is a simple example illustrating this effect:

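from django.http import HttpResponse

class MyView(object):
    thing = 0
    def __call__(self, request):
        # Mutates the one shared instance, so the count survives across requests.
        self.thing += 1
        return HttpResponse('%s' % (self.thing,))

# Instantiated once, when the module is first imported.
my_view = MyView()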

URL patterns:

urlpatterns = patterns('',
    url(r'^my_view/$', 'my_app.views.my_view', name='my_view'),
)

Every time you refresh your browser, you will see the number increment. This bug is present even in single-threaded (prefork) environments, because the counter lives on an object that outlasts any single request. Multi-threaded environments are susceptible to even scarier problems with object state. For example, you might be tempted to store the request object as an attribute on the view object:

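from django.http import HttpResponse

class MyView(object):
    def __call__(self, request):
        # Not thread-safe: every thread writes to the same shared instance.
        self.request = request
        return self.create_response()

    def create_response(self):
        return HttpResponse('Welcome, %s' % (self.request.user,))

my_view = MyView()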

If the view is called at roughly the same time in two threads, one thread may end up reading the other thread's request object, because the view object is shared between the threads. This bug might surface only rarely, but it would be difficult to track down, and it is conceivably a security problem: one user's response could be built from another user's request.
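To see the failure without waiting for an unlucky interleaving, here is a minimal sketch in plain Python 3. FakeRequest and the barrier are assumptions added purely for the demonstration (they are not Django APIs). The barrier simply forces both threads to store their request before either one reads it back.

```python
import threading

class FakeRequest(object):
    """Hypothetical stand-in for Django's request; only carries a user name."""
    def __init__(self, user):
        self.user = user

class SharedView(object):
    """Callable view that (incorrectly) stores the request on itself."""
    def __init__(self, barrier):
        self.barrier = barrier

    def __call__(self, request):
        self.request = request
        # Wait until both threads have stored their request. This forces the
        # interleaving that would otherwise happen only occasionally.
        self.barrier.wait()
        return self.create_response()

    def create_response(self):
        return 'Welcome, %s' % (self.request.user,)

def run_demo():
    view = SharedView(threading.Barrier(2))  # one instance shared by both threads
    responses = {}

    def handle(user):
        responses[user] = view(FakeRequest(user))

    threads = [threading.Thread(target=handle, args=(user,))
               for user in ('alice', 'bob')]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return responses

responses = run_demo()
```

Both entries in responses come out identical, which means one of the two users was greeted with the other's name.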

Reducing shared state

If you are having thread-safety issues because of class-based views, the first thing you can do is enable a prefork worker module in your web server. This uses more memory, but it handles concurrency with separate processes instead of threads. This is only a workaround, however, and there are ways to keep class-based views thread-safe. A quick fix for a code base that already suffers from this bug is to drop in a __new__ method that creates a new view instance per request. This should be as simple as adding the following to your view class:

def __new__(cls, *args_static, **kwargs_static):
    def view_wrapper(request, *args, **kwargs):
        view = object.__new__(cls)
        view.__init__(*args_static, **kwargs_static)
        return view(request, *args, **kwargs)
    return view_wrapper

If this is added to MyView above, it magically becomes thread-safe because the view is wrapped in such a way that every time it gets called, a new MyView instance is created for the request.
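The effect can be checked without Django. The sketch below is only an illustration under a few assumptions: PerRequestMixin is a hypothetical name that packages the same __new__ wrapper as a mixin, the views return plain strings instead of HttpResponse objects, and None stands in for the request.

```python
class PerRequestMixin(object):
    """Hypothetical mixin wrapping the __new__ trick shown above."""
    def __new__(cls, *args_static, **kwargs_static):
        def view_wrapper(request, *args, **kwargs):
            # Build a brand-new view object for every call.
            view = object.__new__(cls)
            view.__init__(*args_static, **kwargs_static)
            return view(request, *args, **kwargs)
        return view_wrapper

class SharedCounter(object):
    """The stateful view from earlier, returning a string for simplicity."""
    thing = 0
    def __call__(self, request):
        self.thing += 1
        return '%s' % (self.thing,)

class FreshCounter(PerRequestMixin, SharedCounter):
    pass

shared_view = SharedCounter()   # a single long-lived instance
fresh_view = FreshCounter()     # actually the view_wrapper function

shared_results = [shared_view(None) for _ in range(3)]
fresh_results = [fresh_view(None) for _ in range(3)]
```

Here shared_results keeps counting up across calls, while fresh_results starts from scratch each time because every call gets its own instance.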

This method may be useful for existing apps, but a better-designed stateful class-based view might drop __call__ altogether in favor of using a class (not an instance) as the view itself.

Michael Malone suggested in this Django ticket using __init__ as an alternative to __call__. By subclassing HttpResponse, you can treat the class as a view because calling the class creates an HttpResponse object. This creates a new view object for every request coming in. His suggestion did not appeal to me because you lose control over the resulting HttpResponse object and it's not possible to instantiate the view without creating an HttpResponse object. An alternative I've come up with is to override __new__ to return an HttpResponse.
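To make the __init__ idea concrete, here is a rough sketch of how I read it. This is not the code from the ticket. The HttpResponse class below is a minimal stand-in so the example runs without Django, and GreetingView is a hypothetical name.

```python
class HttpResponse(object):
    """Minimal stand-in for django.http.HttpResponse (assumption)."""
    def __init__(self, content='', status=200):
        self.content = content
        self.status_code = status

class GreetingView(HttpResponse):
    """Hypothetical view: calling the class builds the response itself."""
    def __init__(self, request, name):
        # Storing the request is safe here, because this object only
        # lives for the duration of a single request.
        self.request = request
        super(GreetingView, self).__init__('Hello, %s' % (name,))

first = GreetingView(None, 'world')
second = GreetingView(None, 'world')
```

Each call produces a separate object that is both the view and the response, which is exactly why the two cannot be created independently.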

Creating the response in __new__

Remember that any callable can be used as a view. This can mean functions, callable objects, and even classes. When you call a class, its __new__ method is called and returns something, usually an instance of that class. However, it can return anything you want, including an HttpResponse object. When __new__ returns something that is not an instance of the class, Python skips the automatic call to __init__. A new approach to thread-safe class-based views might be something like this:

class BaseView(object):
    def __new__(cls, request, *args, **kwargs):
        view = cls.new(request, *args, **kwargs)
        return view.create_response()

    @classmethod
    def new(cls, *args, **kwargs):
        obj = object.__new__(cls)
        obj.__init__(*args, **kwargs)
        return obj

    def __init__(self, request, *args, **kwargs):
        raise NotImplementedError()

    def create_response(self):
        raise NotImplementedError()

class MyView(BaseView):
    template = 'path/to/template.html'

    def __init__(self, request):
        self.request = request

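    def create_response(self):
        # render_response is not part of Django; it is assumed here to be a
        # project helper that renders a template into an HttpResponse.
        return render_response(self.request, self.template, {'title': 'Home'})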

This would be added to your URL patterns directly (no need to instantiate it):

urlpatterns = patterns('',
    url(r'^my_view/$', 'my_app.views.MyView', name='my_view'),
)

A key difference is that the view is not a subclass of HttpResponse, but when you attempt to instantiate it, it will create an HttpResponse. In case you want to create a view object for testing or otherwise, the BaseView class has a factory method called "new" that makes new view instances.
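The two ways of using such a class can be tried outside Django. The sketch below repeats the core of BaseView and assumes a minimal stand-in HttpResponse, a hypothetical TitleView, and plain strings in place of request objects.

```python
class HttpResponse(object):
    """Minimal stand-in for django.http.HttpResponse (assumption)."""
    def __init__(self, content=''):
        self.content = content

class BaseView(object):
    def __new__(cls, request, *args, **kwargs):
        # Calling the class returns a response, not a view instance.
        view = cls.new(request, *args, **kwargs)
        return view.create_response()

    @classmethod
    def new(cls, *args, **kwargs):
        # Factory that bypasses __new__ above and returns a real instance.
        obj = object.__new__(cls)
        obj.__init__(*args, **kwargs)
        return obj

class TitleView(BaseView):
    """Hypothetical view used only for this demonstration."""
    def __init__(self, request):
        self.request = request

    def create_response(self):
        return HttpResponse('Home for %s' % (self.request,))

response = TitleView('request-1')   # what Django does when it calls the view
view = TitleView.new('request-2')   # a view instance, e.g. for a unit test
```

In a test, view.create_response() can then be called and inspected separately from the construction of the view.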

The main thing to get out of all of this is that you need to be careful and aware of shared state whenever it exists. I prefer to minimize the possibility of side effects by avoiding module-level variables, singletons, and globals as they are almost always the road to hell. But if you do have a persistent view object, make sure that you don't store request-specific data on it.
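For the persistent case, one way to follow that rule is to pass the request along as an argument instead of storing it. This is a minimal sketch rather than a recommendation from the Django project: StatelessView is a hypothetical name, a dict stands in for the request, and a string stands in for the response.

```python
class StatelessView(object):
    """Hypothetical persistent view: configured once, never mutated afterwards."""
    def __init__(self, greeting):
        # Configuration is set once at import time and only read later.
        self.greeting = greeting

    def __call__(self, request):
        # The request travels through arguments, never through self.
        return self.create_response(request)

    def create_response(self, request):
        return '%s, %s' % (self.greeting, request['user'])

my_view = StatelessView('Welcome')

first = my_view({'user': 'alice'})
second = my_view({'user': 'bob'})
state_after_calls = dict(vars(my_view))
```

After any number of calls the instance still holds only its configuration, so there is nothing request-specific for threads to trample on.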