Feb 06

Evented Django part one: Socket.IO and gevent

The asynchronous, real-time web has been getting more and more attention lately, and for good reason. The old paradigm of thick servers and thin clients is becoming outdated as the new web demands rich, fast, asynchronous, full-duplex messaging. The technologies that enable server-to-browser asynchronous messaging have been given the umbrella term "Comet," and the number of ways to provide Comet services is growing constantly. The transport options include XHR-multipart, WebSockets, and Adobe Flash Sockets, among others. Socket.IO was created to provide a unified interface for server-browser messaging, so that the developer does not have to worry about inconsistent browser support. In this post, I'm going to explain how to use Django with Socket.IO.

Socket.IO was developed with a Node.JS server implementation, but work is under way to add server implementations in a variety of languages. Two such servers exist for Python: tornadio and gevent-socketio. I'm a big fan of gevent, so I will use gevent-socketio, but tornadio looks well-written and very promising.

Why you should be thinking about gevent

Socket.IO runs great under Node.JS, but I want to highlight why I think Python and gevent deserve more attention (feel free to skip ahead if you have already drunk the gevent Kool-Aid). Node.JS (and its underlying V8 JavaScript engine) is a pinnacle achievement for the world of JavaScript. It has done two especially important things: it helped show the world that evented application servers can handle very large numbers of concurrent connections quickly, and it helped promote JavaScript as a serious language, opening the doors for powerful tools such as testing frameworks, a package manager, and better community code standards. Its popularity is not surprising: it's built on top of one of the world's most well-known programming languages.

The Python community is a bit more fragmented, with several concurrent networking libraries -- among them: twisted, tornado, gevent, eventlet, and concurrence. It's certainly harder to know where to start without a "clear winner" like we see in the JavaScript community. Personally, I have quickly come to prefer gevent for writing asynchronous applications. I think Python with gevent wins over Node.JS in two important ways:

  1. It's Python, a sane and concise language with an awesome standard library and community.
  2. It uses greenlets instead of callbacks to provide concurrency.

Gevent, like Node.JS, is built on libevent (Update: Node actually uses libev. Thanks to Travis Cline for correcting me there), an underlying C library that provides a high-speed event loop. Node's concurrency model relies on callbacks to handle values from asynchronous I/O calls. This, combined with JavaScript's highly nestable syntax, invites programmers to nest functions within other function calls. That makes callback passing a piece of cake, but I've seen it produce ugly, unreadable nested code, and I've seen programmers pull their hair out while trying to get things synchronized and avoid race conditions. In my experience, debugging an app with heavy use of callbacks is nearly impossible. Greenlet changes the game, because you can write simple "blocking" code that only blocks the current greenlet instead of the entire interpreter, allowing you to maintain stacks, along with beautiful Python stack traces.
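
To make the greenlet point concrete, here is a minimal sketch, assuming gevent is installed. The names fetch, run_all, and completion_order are made up for illustration, and gevent.sleep stands in for a blocking network call. Each function reads like ordinary sequential code, yet the two greenlets overlap instead of running one after the other:

```python
import gevent

# Records the order in which the greenlets finish (illustration only).
completion_order = []


def fetch(name, delay):
    # gevent.sleep stands in for blocking I/O. It suspends only this
    # greenlet and hands control back to the event loop.
    gevent.sleep(delay)
    completion_order.append(name)
    return name.upper()


def run_all():
    # Start both greenlets, then wait for both to finish.
    jobs = [
        gevent.spawn(fetch, 'slow', 0.2),
        gevent.spawn(fetch, 'fast', 0.05),
    ]
    gevent.joinall(jobs)
    # Each greenlet's return value is available as .value once it is done.
    return [job.value for job in jobs]
```

Calling run_all() returns the values in the order the greenlets were spawned, while completion_order shows that the "fast" greenlet finished first. There are no callbacks anywhere: the return value of fetch comes back like any other function result.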

Running Django on gevent-socketio

Gevent-socketio comes with one important caveat: you must use the gevent pywsgi server. This means you can't serve your WSGI app out of Apache like you might be used to doing (it should be possible to proxy requests from a front-end load balancer, although I haven't experimented with proxying web sockets). There's a pretty good reason for this: WSGI doesn't inherently allow web sockets. The only way to support them is to hook into the raw socket using the hooks provided by the pywsgi server.

Gevent-socketio works by creating a SocketIO handler and adding it to the WSGI "environ" dictionary before executing your WSGI app. When Django handles a request, it creates a WSGIRequest object and assigns it the environ dictionary created by pywsgi. So, if we are running Django under gevent-socketio, our SocketIO handler is available by accessing "request.environ['socketio']". I've demonstrated this by porting the gevent-socketio example chatroom application to Django. My ported code is available on GitHub.
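
The environ mechanism itself is plain WSGI and can be sketched with the standard library alone. The following is not gevent-socketio's code: FakeSocketIOHandler, inject_handler, demo_app, and call_once are hypothetical names used only to show how a wrapper can place an object in environ so that the wrapped app finds it there.

```python
from wsgiref.util import setup_testing_defaults


class FakeSocketIOHandler(object):
    """Hypothetical stand-in for the handler gevent-socketio creates."""

    def __init__(self):
        self.sent = []

    def send(self, message):
        # Record the message instead of writing it to a real socket.
        self.sent.append(message)


def inject_handler(app):
    """Wrap a WSGI app so every request's environ has a 'socketio' key."""
    def wrapped(environ, start_response):
        environ['socketio'] = FakeSocketIOHandler()
        return app(environ, start_response)
    return wrapped


def demo_app(environ, start_response):
    # In Django, this same dictionary is exposed as request.environ.
    handler = environ['socketio']
    handler.send({'announcement': 'hello'})
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok']


def call_once(app):
    """Call a WSGI app once with a synthetic environ."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers):
        captured['status'] = status

    body = b''.join(app(environ, start_response))
    return captured['status'], body, environ
```

Calling call_once(inject_handler(demo_app)) runs one request through the wrapper. The app never constructs the handler itself; it only looks it up in environ, which is the same pattern a Django view uses with request.environ['socketio'].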

Installation

I always choose to work in a virtualenv, and I've created a pip requirements file that should cover what you need to get started. To run my example, clone the code from GitHub and install the requirements with pip:

git clone git://github.com/codysoyland/django-socketio-example.git
cd django-socketio-example
easy_install pip
pip install virtualenv
virtualenv .
source ./bin/activate
pip install -r pip_requirements.txt

Note the contents of pip_requirements.txt: I'm using the "tip" versions of both gevent-websocket and gevent-socketio. Both are still beta-quality software, so we are using development versions. Expect bugs!

A chat server request handler

The Socket.IO calls come in like normal requests and can be handled by a view, but your view code can actually contain a long-running event loop, sending and receiving messages from your web client. Here is the view that handles Socket.IO requests:

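from django.http import HttpResponse

# Rolling history of the most recent chat messages, shared by all connections.
buffer = []

def socketio(request):
    # gevent-socketio places its handler in the WSGI environ for each request.
    socketio = request.environ['socketio']
    if socketio.on_connect():
        # Replay recent history to the new client, then announce the arrival.
        socketio.send({'buffer': buffer})
        socketio.broadcast({'announcement': socketio.session.session_id + ' connected'})

    while True:
        # Blocks only this greenlet until something arrives.
        message = socketio.recv()

        if len(message) == 1:
            message = message[0]
            message = {'message': [socketio.session.session_id, message]}
            buffer.append(message)
            # Keep at most 15 messages of history.
            if len(buffer) > 15:
                del buffer[0]
            socketio.broadcast(message)
        else:
            # Anything else may mean the client has gone away.
            if not socketio.connected():
                socketio.broadcast({'announcement': socketio.session.session_id + ' disconnected'})
                break

    return HttpResponse()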
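
As a side note, the 15-message history kept by the view above could also be written with collections.deque from the standard library. This is only a sketch of an alternative, not what the example project does; MAX_HISTORY, history, and remember are names invented here.

```python
from collections import deque

# Same limit as the view: keep the 15 most recent messages.
MAX_HISTORY = 15

# A deque with maxlen drops its oldest item automatically when full.
history = deque(maxlen=MAX_HISTORY)


def remember(message):
    """Store a message and return the current history as a plain list."""
    history.append(message)
    # Convert to a list before handing it to anything that serializes
    # to JSON, since the json module does not accept a deque.
    return list(history)
```

With this approach the explicit length check and "del buffer[0]" disappear, at the cost of converting to a list whenever the history is sent to a client.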

The view is plugged into your site like any other view:

urlpatterns += patterns('views',
    (r'^socket\.io', 'socketio'),
)

Running the example

Run the example by starting the server:

./run_example.py

Then point your browser to http://localhost:9000/.

If you run the example, you should see the same result as running the gevent-socketio example: a multi-client chatroom. The beauty of greenlet is at play in the line containing "socketio.recv()". This line blocks the greenlet and allows the server to keep processing other requests until a new Socket.IO message is ready to be processed. As soon as a new message is ready, the greenlet is re-awakened and the message is processed.

Note that we can't use our good old friend "./manage.py runserver" for this example. This is because we need to run the SocketIO server, which we import from gevent-socketio. Here is the example runner:

PORT = 9000

import os

# Tell Django which settings module to use before it handles any request.
os.environ['DJANGO_SETTINGS_MODULE'] = 'settings'

import django.core.handlers.wsgi
# The standard Django WSGI application.
application = django.core.handlers.wsgi.WSGIHandler()

from socketio import SocketIOServer

if __name__ == '__main__':
    print('Listening on port %s and on port 843 (flash policy server)' % PORT)
    # Serve Django through gevent-socketio's server instead of runserver.
    SocketIOServer(('', PORT), application, resource="socket.io").serve_forever()

This is all it takes to hook up gevent-socketio to the Django WSGIHandler. Even a monkey could turn this into a custom management command if desired.

Further reading

In my next post, I will explain how to scale our chatroom example to multiple web servers using ZeroMQ. Until then, I recommend checking out the following resources:

I would like to extend a special thanks to Jeffrey Gelens and other contributors for writing gevent-websocket and gevent-socketio.