
Porting MooTools to jQuery

8 July 2010

Last night I ported some JavaScript code that used MooTools to use jQuery instead. In many ways I prefer MooTools to jQuery, but jQuery is easier to integrate in code that uses other libraries (e.g. Prototype).

Here are a few quick hints for things you need to change if you’re doing the same.

$('element_id')

You need a leading hash: $('#element_id')

No need for two dollars to use selectors: $('#element_id form input.special') works.

There’s a difference of approach to what happens when you use $(...): MooTools adds stuff to the DOM element. jQuery gives you a new object which wraps the element. This means you can’t just call DOM stuff on the object jQuery gives you. e.g.

MooTools:
$$('a.special_link')[0].href – access the normal DOM properties

jQuery:
$('a.special_link').attr('href') – ask the jQuery object to access the underlying DOM element

A jQuery object can represent a list of elements, not just a single one. You can access the underlying element using index zero: $('a.special_link')[0].href

Class

jQuery doesn’t do classes. There’s code and plugins around, or you can use a standard JavaScript approach like:

var MyClass = function(blah) {
    // this is your constructor
    this.blah = blah;
};
MyClass.prototype.doStuff = function(name) {
    // this is a member function
    return this.blah + name;
};

// instantiate and call like normal
var thing = new MyClass('hello');
thing.doStuff('monty');

bind(this)

Oh this one hurts. When you have a function, perhaps a callback, that you want to reference your class instance as “this”, in MooTools you use something like:

new Request.JSON({url: '...', onSuccess: function(data){
    this.doStuff(data.name);
}.bind(this)}).send();

The equivalent is to use JavaScript’s apply method, which lets you pass an object to use for “this”. You might also want “arguments”, which is an array-like object holding all the arguments passed to the function. e.g.

var self = this;
$.getJSON('...', function(){return function(data){
    this.doStuff(data.name);
}.apply(self, arguments)});

addEvent

This is what jQuery calls bind. e.g.

$('#element_id a').bind('click', function(e){..});

each

Instead of MooTools’ $$('a.special').each(function(elem){...})

Try $.each($('a.special'), function(index, elem){...}). Note the first param to your function is index, not the element.

Request.JSON

As above, try $.getJSON(url, func). You can also use $.post if it’s a POST request (seems to decode the JSON response automatically).

get and set

Instead of elem.get('href') try elem.attr('href'). Instead of elem.set('text', 'blah') there’s elem.text('blah').

That’s my braindump for now.

Filed under: JavaScript — Scott @ 10:49 am

Staplefish joins Red Robot Studios

18 March 2010

I’ve been working as a freelancer under the Staplefish business name for over four years now. Since mid-2008, I’ve also been working with Andrew at our Red Robot Studios business.

I’m moving all my Staplefish work under the Red Robot Studios brand. The aim is to simplify some business things (accounting, tax, invoicing, etc) and offer better service by teaming up with Andrew (e.g. one of us can cover while the other goes on holiday). It also marks our decision to focus on Red Robot Studios and offer great Django development and mobile development services.

To all Staplefish clients: Please be assured Scott is still working for you, just under a different business name and your websites will not be affected. Feel free to contact me with any questions or concerns.

Filed under: Business — Scott @ 12:58 pm

Emulating Django blocks with Smarty capture

2 January 2010

Django blocks considered addictive

I do Django development for Red Robot Studios and one of the many great things about the Django web framework is the template system.

Using blocks and inheritance, repetitive html is kept to a minimum. For example you can do:

base.html

...
<title>{% block title %}Default Title{% endblock %}</title>
...
<div id="content">
{% block content %}{% endblock %}
</div>
...

home.html

{% extends 'base.html' %}{% block title %}Home Page Title{% endblock %}
{% block content %}
Home page content here
{% endblock %}

The content of the blocks in home.html is plugged into the block placeholders in base.html.

Smarty doesn’t have blocks like Django

Smarty is a template system for PHP. I was using it recently for a client and wished I could use Django-style blocks. Smarty doesn’t have blocks and inheritance, but does have capture and include. Here’s how I was able to achieve a similar result.

Using Smarty capture to emulate blocks

Using Smarty’s capture and include functions, here’s how the templates look:

header.tpl

...
<title>{if $smarty.capture.title}{$smarty.capture.title}{else}Default Title{/if}</title>
...
<div id="content">
{$smarty.capture.content}
</div>
...

home.tpl

{capture name='title'}Home Page Title{/capture}
{capture name='content'}
Home page content here
{/capture}

{include file='header.tpl'}

Simple, no? Remember to do these captures before including the file.

It’s not as powerful as Django template inheritance, but it’s a reasonable attempt to use Django-style blocks in Smarty templates.

Filed under: Web Development — Scott @ 12:34 am

Migrating Postgresql Databases the Easy Way

23 June 2009

When you upgrade PostgreSQL to a new major version (e.g. 8.1 to 8.3), all databases need to be dumped from the old version and loaded into the new version. It’s not difficult, but on Debian there’s a really easy way.

Debian has pg_createcluster, pg_dropcluster and pg_upgradecluster (plus a few others). The one I’m referring to here is pg_upgradecluster.

It takes the version and cluster name of the databases you want to upgrade.

e.g. if you’ve installed Postgresql 8.3 and have databases in 8.1, just run:

# pg_upgradecluster 8.1 main

This upgrades the databases in the “main” cluster under version 8.1 and puts them under “main” in version 8.3. If you already have a “main” cluster in 8.3, you’ll need to drop it first.

This little tool not only dumps and loads the database, but it also changes the config so 8.3 runs on the standard port previously used by 8.1 (or whatever your older version was). A painless way to upgrade.

Filed under: Postgresql — Scott @ 11:45 am

Gvim menu bar missing

9 March 2009

I just opened gvim on my Ubuntu (Hardy Heron) box and found there was no menu bar (File, Edit, etc).

After messing with some guioptions and getting nowhere I ran gvim as root (using sudo) and the menu bar was there. The answer came from a forum post by “Marko”:

Delete the file ~/.gnome2/Vim

It will be recreated when you run gvim again. With luck, the menu will be displayed again.

Filed under: Uncategorized — Scott @ 4:04 am

Layoff Talent – Django project just launched

11 December 2008

Andrew and I spent a few days this week putting together a new Django project.

It’s called Layoff Talent and it’s a place for people in the tech industry who have been laid off and are looking for a new job. They can add a simple profile and then hopefully be picked up by employers looking for new talent.

It’s similar in some ways to Django People or the Djangogigs developer listings, but specifically for people who have been laid off and not restricted to Django developers.

There’s nothing groundbreaking from a development point of view, but it’s another example of how Django makes it easy to put out a full-featured site in a short time. Of course, we’ll be adding more features as the site gets popular.

If you know someone who has been laid off, please tell them about layofftalent.com.

Filed under: Uncategorized — Scott @ 6:05 pm

Get User from session key in Django

4 December 2008

Error emails contain session key

When you get an error email from your Django app telling you someone got a server error, it’s not always easy to tell which user had a problem. It might help your debugging to know or you might want to contact the user to tell them you have fixed the problem.

Assuming the user is logged in when they get the error, the email will contain the session key for that user’s session. The relevant part of the email looks like:

<WSGIRequest
GET:<QueryDict: {}>,
POST:<QueryDict: {}>,
COOKIES:{ 'sessionid': '8cae76c505f15432b48c8292a7dd0e54'},
...
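
If you get a lot of these emails, pulling the key out by hand gets tedious. Here is a minimal sketch in plain Python that extracts it from the email text. The function name extract_session_key and the regular expression are my own illustration (not part of Django), and it assumes the cookie is printed in the same format as the excerpt above.

```python
import re

# Matches the sessionid entry as it appears in the COOKIES line of the email.
SESSION_RE = re.compile(r"'sessionid':\s*'([0-9a-f]+)'")


def extract_session_key(email_body):
    """Return the session key found in an error email, or None if absent."""
    match = SESSION_RE.search(email_body)
    return match.group(1) if match else None


sample = """<WSGIRequest
GET:<QueryDict: {}>,
POST:<QueryDict: {}>,
COOKIES:{ 'sessionid': '8cae76c505f15432b48c8292a7dd0e54'},
"""

print(extract_session_key(sample))
```

The returned string is what you would use as the session key when looking up the session.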

Finding the user from the session

If the session still exists we can find it, unpickle the data it contains and get the user id. Here’s a short script to do just that:

from django.contrib.sessions.models import Session
from django.contrib.auth.models import User

session_key = '8cae76c505f15432b48c8292a7dd0e54'

session = Session.objects.get(session_key=session_key)
uid = session.get_decoded().get('_auth_user_id')
user = User.objects.get(pk=uid)

print user.username, user.get_full_name(), user.email

There it is. Pass in the session key (sessionid cookie) and get back the user’s name and email address.

Plug: Get your own job board at Fuselagejobs

Filed under: Django — Scott @ 9:43 pm

Dynamic upload paths in Django

25 August 2008

For a while I’ve been using the CustomImageField as a way to specify an upload path for images. It’s a hack that lets you use ids or slugs from your models in the upload path, e.g.:

/path/to/media/photos/1234/flowers.jpg
or
/path/to/media/photos/scotland-trip/castle.jpg

CustomImageField no more

Since the FileStorageRefactor was merged into trunk in r8244, it’s no longer necessary to use the custom field. Other recent changes to trunk mean that it doesn’t work any more in its current state, so this is a good time to retire it.

Pass a callable in upload_to

It is now possible for the upload_to parameter of the FileField or ImageField to be a callable, instead of a string. The callable is passed the current model instance and uploaded file name and must return a path. That sounds ideal.

Here’s an example:

import os
from django.db import models

def get_image_path(instance, filename):
    # instance.id is an integer, so convert it before joining the path
    return os.path.join('photos', str(instance.id), filename)

class Photo(models.Model):
    image = models.ImageField(upload_to=get_image_path)

get_image_path is the callable (in this case a function). It simply gets the id from the instance of Photo and uses that in the upload path. Images will be uploaded to paths like:

photos/1/kitty.jpg

You can use whatever fields are in the instance (slugs, etc), or fields in related models. For example, if Photo models are associated with an Album model, the upload path for a Photo could include the Album slug.
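
As a rough sketch of the Album case, the callable might look like the following. The album attribute, the slug field and the function name get_album_image_path are assumptions for illustration. The objects at the bottom are simple stand-ins so the function can be tried without Django; in a real project they would be saved model instances.

```python
import os
from types import SimpleNamespace


def get_album_image_path(instance, filename):
    """Build an upload path of the form photos/<album slug>/<file name>."""
    # basename() drops any directory parts that arrive with the file name.
    return os.path.join('photos', instance.album.slug, os.path.basename(filename))


# Stand-ins for a Photo linked to an Album (hypothetical data).
album = SimpleNamespace(slug='scotland-trip')
photo = SimpleNamespace(album=album)

print(get_album_image_path(photo, 'castle.jpg'))
```

You would pass this function as upload_to in the same way as get_image_path.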

Note that if you are using the id, you need to make sure the model instance was saved before you upload the file. Otherwise, the id hasn’t been set at that point and can’t be used.

For reference, here’s what the main part of the view might look like:

...
    if request.method == 'POST':
        form = PhotoForm(request.POST, request.FILES)
        if form.is_valid():
            photo = Photo.objects.create()
            image_file = request.FILES['image']
            photo.image.save(image_file.name, image_file)
...

This is much simpler than the hacks used in CustomImageField and provides a nice flexible way to specify file or image upload paths per-model instance.

Note: If you are using ModelForm, when you call form.save() it will save the file – no need to do it yourself as in the example above.

Filed under: Django — Scott @ 10:38 pm

Extending the Django User model with inheritance

21 August 2008

Extra fields for Users

Most of the Django projects I’ve worked on need to store information about each user in addition to the standard name and email address held by the contrib.auth.models.User model.

The old way: User Profiles

The solution in the past was to create a “user profile” model which is associated 1-to-1 with the user. Something like:

the model

from django.contrib.auth.models import User
from django.db import models

class UserProfile(models.Model):
    user = models.ForeignKey(User, unique=True, related_name='profile')
    timezone = models.CharField(max_length=50, default='Europe/London')

config in settings.py

AUTH_PROFILE_MODULE = 'accounts.UserProfile'

usage

profile = request.user.get_profile()
print profile.timezone

It works ok, but it’s an extra database query for each request that uses the profile (it’s cached during the request so each call to get_profile() is not a query). Also, the information about the user is stored in two separate models, so you need to display and update fields from both the User and the UserProfile models.

The new way: Model Inheritance

As part of the great work done on the queryset-refactor by Malcolm et al, Django now has model inheritance.

If you’re using trunk as of revision 7477 (26th April 2008), your model classes can inherit from an existing model class. Additional fields are stored in a separate table which is linked to the table of the base model. When you retrieve your model, the query uses a join to get the fields from it and the base model.

Inheriting from User

Instead of creating a User Profile class, why don’t we inherit from the normal User class and add some fields?

from django.contrib.auth.models import User, UserManager
from django.db import models

class CustomUser(User):
    """User with app settings."""
    timezone = models.CharField(max_length=50, default='Europe/London')

    # Use UserManager to get the create_user method, etc.
    objects = UserManager()

Now each instance of CustomUser will have the normal User fields and methods, as well as our additional fields and methods. Pretty handy, no?

We add UserManager as the default manager so that the standard methods are available. In particular, to create a user, we really want to say:

user = CustomUser.objects.create(...)

If we just created the user from the User class, we wouldn’t get a row in the CustomUser table. Creation needs to be done in the derived class.

You can still get and update the underlying User model, no problem, but it won’t have the additional fields and methods found in our CustomUser class.

Getting the CustomUser class by default

So far, there’s one problem. When you get request.user, it’s an instance of User, not an instance of CustomUser, so you don’t get the extra fields and methods.

What we want is for Django to retrieve the CustomUser instance transparently. It turns out to be quite easy.

Users come from authentication backends

The default authentication backend gets the User model from the database, checks the password is correct then returns the User. You can write your own authentication backend, for example to check the username and password against some other data source or to use the email address instead of username.

In our case, we can use an authentication backend to return an instance of CustomUser instead of User.

the authentication backend in auth_backends.py

from django.conf import settings
from django.contrib.auth.backends import ModelBackend
from django.core.exceptions import ImproperlyConfigured
from django.db.models import get_model

class CustomUserModelBackend(ModelBackend):
    def authenticate(self, username=None, password=None):
        try:
            user = self.user_class.objects.get(username=username)
            if user.check_password(password):
                return user
        except self.user_class.DoesNotExist:
            return None

    def get_user(self, user_id):
        try:
            return self.user_class.objects.get(pk=user_id)
        except self.user_class.DoesNotExist:
            return None

    @property
    def user_class(self):
        if not hasattr(self, '_user_class'):
            self._user_class = get_model(*settings.CUSTOM_USER_MODEL.split('.', 2))
            if not self._user_class:
                raise ImproperlyConfigured('Could not get custom user model')
        return self._user_class

config in settings.py

AUTHENTICATION_BACKENDS = (
    'myproject.auth_backends.CustomUserModelBackend',
)
...

CUSTOM_USER_MODEL = 'accounts.CustomUser'

That’s it. Now when you get request.user, it’s an instance of the CustomUser class with whatever additional fields or methods you have added.

P.S. Looking for Django hosting? I’d recommend WebFaction for shared hosting and Slicehost for a VPS.

Filed under: Django — Scott @ 8:40 pm

Django performance testing – a real world example

28 April 2008

About a week ago Andrew and I launched a new Django-powered site called Hey! Wall. It’s a social site along the lines of “the wall” on social networks and gives groups of friends a place to leave messages, share photos, videos and links.

We wanted to gauge performance and try some server config and code changes to see what steps we could take to improve it. We tested using httperf and doubled performance by making some optimisations.

Server and Client

The server is a Xen VPS from Slicehost with 256MB RAM running Debian Etch. It is located in the US Midwest.

For testing, the client is a Xen VPS from Xtraordinary Hosting, located in the UK. Our normal Internet access is via ADSL which makes it difficult to make enough requests to the server. Using a well-connected VPS as the client means we can really hammer the server.

Server spec caveats

It’s hard to say exactly what the server specs are. The VPS has 256MB RAM and is hosted with similar VPSes, probably on a quad core server with 16GB RAM. That’s a maximum of 64 VPSes on the physical server, assuming it is full of 256MB slices. If the four processors are 2.4GHz, that’s 9.6GHz total, divided by 64 gives a minimum of 150MHz of CPU.
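
The same back-of-envelope sum in Python, in case you want to plug in different guesses. All the host figures are the assumptions from the paragraph above, not known specs.

```python
# Guessed host spec: quad core, 2.4GHz per core, 16GB RAM.
HOST_RAM_MB = 16 * 1024
SLICE_RAM_MB = 256
CORES = 4
CORE_MHZ = 2400

# How many 256MB slices fit on the host if it is full.
max_slices = HOST_RAM_MB // SLICE_RAM_MB

# Worst-case CPU share per slice if every slice is busy at once.
min_cpu_mhz = CORES * CORE_MHZ / max_slices

print(max_slices, min_cpu_mhz)
```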

On a Xen VPS, you get a fixed allocation of memory and CPU without contention, but usually any available CPU on the machine can be used. If other VPSes on the same box are idle, your VPS can make use of more of the CPU. This probably means more CPU was used during testing and perhaps more for some tests than for others.

Measuring performance with httperf

There are various web performance testing tools around including ab (from Apache), Flood and httperf. We went with httperf for no particular reason.

An httperf command looks something like:

httperf --hog --server=example.com --uri=/ --timeout=10 --num-conns=200 --rate=5

In this example, we’re requesting http://example.com/ 200 times, with up to 5 requests per second.

Testing Plan

Some tools support sessions and try to emulate users performing tasks on your site. We went with a simple brute-force test to get an idea of how many requests per second the site could handle.

The basic approach is to make a number of requests and see how the server responds: a status 200 is good, a status 500 is bad. Increase the rate (the number of requests made per second) and try again. When it starts returning lots of 500s, you’ve reached a limit.
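
To save retyping the command for each rate, something like this sketch can generate them. It only builds and prints the commands, using the same httperf options shown earlier. The function name httperf_command and the list of rates are made up for illustration.

```python
def httperf_command(server, uri, rate, num_conns=200, timeout=10):
    """Return the httperf command line for one test run as a list of arguments."""
    return [
        'httperf',
        '--hog',
        '--server=%s' % server,
        '--uri=%s' % uri,
        '--timeout=%d' % timeout,
        '--num-conns=%d' % num_conns,
        '--rate=%d' % rate,
    ]


# Example rates to step through; pick values that suit your own site.
for rate in [5, 10, 20, 40]:
    print(' '.join(httperf_command('example.com', '/', rate)))
```

Run each printed command in turn and stop increasing the rate once the 500s start to appear.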

Monitoring server resources

The other side is knowing what the server is doing in terms of memory and CPU use. To track this, we run top and log the output to a file for later review. The top command is something like:

top -b -d 3 -U www-data > top.txt

In this example we’re logging information on processes running as user www-data every three seconds. If you want to be more specific, instead of -U username you can use -p 1,2,3 where 1, 2 and 3 are pids (the process ids of the processes you want to watch).

The web server is Lighttpd with Python 2.5 running as FastCGI processes. We didn’t log information on the database process (PostgreSQL), though that could be useful.

Another useful tool is vmstat, particularly the swap columns which show how much memory is being swapped. Swapping means you don’t have enough memory and is a performance killer. To repeatedly run vmstat, specify the number of seconds between checks. e.g.

vmstat 2

Authenticated requests with httperf

httperf makes simple GET requests to a URL and downloads the html (but not any of the media). Requesting public/anonymous pages is easy, but what if you want a page that requires login?

httperf can pass request headers. Django authentication (from django.contrib.auth) uses sessions which rely on a session id held in a cookie on the client. The client passes the cookie in a request header. You see where this is going.

Log in to the site and check your cookies. There should be one like sessionid=97d674a05b2614e98411553b28f909de. To pass this cookie using httperf, use the --add-header option. e.g.

httperf ... --add-header='Cookie: sessionid=97d674a05b2614e98411553b28f909de\n'

Note the \n after the header. If you miss it, you will probably get timeouts for every request.

Which pages to test

With this in mind we tested two pages on the site:

  1. home: anonymous request to the home page
  2. wall: authenticated request to a “wall” which contains content retrieved from the database

Practically static versus highly dynamic

The home page is essentially static for anonymous users and just renders a template without needing any data from the database.

The wall page is very dynamic, with the main data retrieved from the database. The template is rendered specifically for the user with dates set to the user’s timezone, “remove” links on certain items, etc. The particular wall we tested has about 50 items on it and before optimisation made about 80 database queries.

For the first test we had two FastCGI backends running, able to accept requests for Django.

Home: 175 req/s (i.e. requests per second).
Wall: 8 req/s.

Compressed content

The first config optimisation was to enable gzip compression of the output using GZipMiddleware. Performance improved slightly, but not a huge difference. Worth doing for the bandwidth savings in any case.

Home: 200 req/s.
Wall: 8 req/s.

More processes, shorter queues

Next we increased the number of FastCGI backends from two to five. This was an improvement with fewer 500 responses as more of the requests could be handled by the extra backends.

Home: 200 req/s.
Wall: 11 req/s.

Mo processes, mo problems

The increase from two to five was good, so we tried increasing FastCGI backends to ten. Performance decreased significantly.

Checking with vmstat on the server, I could see it was swapping. Too many processes, each using memory for Python, had caused the VPS to run out of memory and swap memory to and from disk.

Home: 150 req/s.
Wall: 7 req/s.

At this point we set the FastCGI backends back down to five for further tests.

Profiling – where does the time go

The wall page had disappointing performance, so we started to optimise. The first thing we did was profile the code to see where time was being spent.

Using some simple profiling middleware it was clear the time was being spent in database queries. The wall page had a lot of queries and they increased linearly with the number of items on the wall. On the test wall this caused around 80 queries. No wonder its performance was poor.

Optimise this

By optimising how media attached to items is handled we were able to drop one query per item straight away. This slightly reduced how long the request took and so increased the number of requests handled per second.

Wall: 12 req/s.

Another inefficiency was the way several filters were applied to the content of each item whenever the page was requested. We changed it so the html output from the filtered content was stored in the item, saving some processing each time the page was viewed. This gave another small increase.

Wall: 13 req/s.

Back to reducing database queries, we were able to eliminate one query per item by changing how user profiles were retrieved (used to show who posted the item to the wall). Another worthwhile increase came from this change.

Wall: 15 req/s.

The final optimisation for this round of testing was to further reduce the queries needed to retrieve media attached to items. Again, we shed some queries and slightly increased performance.

Wall: 17 req/s.

Next step: caching

Having reduced queries as much as we can, the next step would be to do some caching. Retrieving cached data is usually much quicker than hitting the database, so we’d expect a good increase in performance.

Caching the output of complete pages is not useful because each page is heavily personalised to the user requesting it. It would only be a cache hit if the user requested the same page twice with nothing changing on it in the meantime.

Caching data such as lists of walls, items and users is more useful. The cached data could be used for multiple requests from a single user and shared to some degree across walls and different users. It’s not necessarily a huge win because each wall is likely to have a very small number of users, so the data would need to stay in cache long enough to be retrieved by others.

Our simplistic httperf tests would be very misleading in this case. Each request is made as the same user so cache hits would be practically 100% and performance would be great! This does not reflect real-world use of the site, so we’d need some better tests.
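
A toy illustration of why, using a plain dictionary as the cache. This is not how Django’s cache works and it ignores expiry. The user and wall names are invented. It only shows how the hit rate depends on who is making the requests.

```python
def hit_rate(requests):
    """Return the fraction of requests that were served from the cache."""
    cache = {}
    hits = 0
    for key in requests:
        if key in cache:
            hits += 1
        else:
            # Cache miss: pretend we queried the database and stored the result.
            cache[key] = 'data for %s/%s' % key
    return hits / len(requests)


# What our httperf test does: the same user requests the same wall 100 times.
single_user = [('scott', 'wall-1')] * 100

# The opposite extreme: 100 different users each request their own wall once.
many_users = [('user-%d' % i, 'wall-%d' % i) for i in range(100)]

print(hit_rate(single_user))  # 0.99
print(hit_rate(many_users))   # 0.0
```

Real traffic would fall somewhere between the two, which is why the test would need a mix of users and walls.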

We haven’t made use of caching yet as the site can easily handle its current level of activity, but if Hey! Wall becomes popular, it will be our next step.

How many users is 17 req/s?

Serving 17 req/s still seems fairly low, but it would be interesting to know how this translates to actual users of the site. Obviously, this figure doesn’t include serving any media such as images, CSS and JavaScript files. Media files are relatively large but should be served fast as they are handled directly by Lighttpd (not Django) and have Expires headers to allow the client to cache them. Still, it’s some work the server would be doing in addition to what we measured with our tests.

It’s too early to tell what the common usage pattern would be, so I can only speculate. Allow me to do that!

I’ll assume the average user has access to three walls and checks each of them in turn, pausing for 10 or 20 seconds on each to read new comments and perhaps view some photos or open links. The user does this three times per day.

Looking specifically at the wall page and ignoring media, that means our user is making 9 requests per day for wall pages. Each user only makes one request at a time, so 17 users can be doing that at any second in time. Within a minute the user only makes three requests so is only counted within the 17 concurrent users for 3 seconds out of 60 (or 1 in 20).

If the distribution of user requests over time was perfectly balanced (hint: it won’t be), that means 340 users (17 * 20) could be using the site each minute. To continue with this unrealistic example, we could say there are 1440 minutes in a day and each user is on the site for three minutes per day, so the site could handle about 163,000 users. That would be very good for a $20/month VPS!
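
Here is that arithmetic as a short Python sketch so the guesses are easy to change. Every input is an assumption from the paragraphs above, and the variable names are mine.

```python
REQUESTS_PER_SECOND = 17      # measured limit for the wall page
REQUESTS_PER_VISIT = 3        # one request for each of the user's three walls
VISITS_PER_DAY = 3            # each visit is assumed to take about a minute
SECONDS_PER_MINUTE = 60
MINUTES_PER_DAY = 24 * 60

# Users that can be served in one minute if requests are spread evenly.
users_per_minute = REQUESTS_PER_SECOND * SECONDS_PER_MINUTE // REQUESTS_PER_VISIT

# Each user occupies three of those one-minute slots per day.
users_per_day = users_per_minute * MINUTES_PER_DAY // VISITS_PER_DAY

print(users_per_minute)  # 340
print(users_per_day)     # 163200
```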

To rein in those numbers a bit, let’s say we handle 200 concurrent users in a minute for 6 hours per day, 100 concurrent users for another 6 hours and 10 concurrent users for the remaining 12 hours. That’s 115,200 user-minutes, so at three minutes per user it’s still around 38,000 users the site could handle in a day given the maximum load of 17 requests per second.

I’m sure these numbers are somewhere between unrealistic and absurd. I’d be interested in comments on better ways to estimate or any real-world figures.

What we learned

To summarise:

  1. Testing the performance of your website may yield surprising results
  2. Having many database queries is bad for performance (duh)
  3. Caching works better for some types of site than others
  4. An inexpensive VPS may handle a lot more users than you’d think
Filed under: Django — Scott @ 2:58 pm