<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Flapping Head &#187; Django</title>
	<atom:link href="http://scottbarnham.com/blog/category/django/feed/" rel="self" type="application/rss+xml" />
	<link>http://scottbarnham.com/blog</link>
	<description>Code and comments on web development, Django, Python and things (un)related.</description>
	<lastBuildDate>Mon, 16 May 2011 19:22:44 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Django static media always returning 404 not found</title>
		<link>http://scottbarnham.com/blog/2010/10/06/django-static-media-always-returning-404-not-found/</link>
		<comments>http://scottbarnham.com/blog/2010/10/06/django-static-media-always-returning-404-not-found/#comments</comments>
		<pubDate>Wed, 06 Oct 2010 21:19:55 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Django]]></category>

		<guid isPermaLink="false">http://scottbarnham.com/blog/?p=74</guid>
		<description><![CDATA[I spent too long tonight figuring out a weird problem.
On a dev server, I was using django.views.static.serve to serve media files.  But it was returning 404 (not found) for any file.
The requests for media files weren&#8217;t even showing up in Django&#8217;s built-in server&#8217;s output.  That had me baffled until I dug deep enough [...]]]></description>
			<content:encoded><![CDATA[<p>I spent too long tonight figuring out a weird problem.</p>
<p>On a dev server, I was using <code>django.views.static.serve</code> to serve media files.  But it was returning 404 (not found) for any file.</p>
<p>The requests for media files weren&#8217;t even showing up in Django&#8217;s built-in server&#8217;s output.  That had me baffled until I dug deep enough in Django&#8217;s code to figure it out.</p>
<p>The <code>ADMIN_MEDIA_PREFIX</code> was the same as <code>MEDIA_URL</code>.  That was it.</p>
<p>Django&#8217;s built-in server doesn&#8217;t log requests for admin media, so that&#8217;s why there was no log output.</p>
<p>The built-in server also handles admin media separately, so when I tried to request a media file, it intercepted and looked for it in the admin media.</p>
<p>The solution is for the <code>ADMIN_MEDIA_PREFIX</code> to be different from <code>MEDIA_URL</code>, e.g. <code>MEDIA_URL</code> set to <code>/media/</code> and <code>ADMIN_MEDIA_PREFIX</code> set to <code>/media/admin/</code>.</p>
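<p>As a rough sketch, the settings could look like the fragment below.  The helper function is hypothetical (it is not part of Django) and assumes the built-in server matches admin media by a simple prefix check on the request path, as described above.</p>

```python
# settings.py fragment (sketch). The values are the example paths from the post.
MEDIA_URL = '/media/'
ADMIN_MEDIA_PREFIX = '/media/admin/'


def admin_prefix_shadows_media(media_url, admin_media_prefix):
    """Return True when every request under media_url also matches the admin prefix."""
    # Hypothetical check, assuming the dev server does a plain prefix match.
    return media_url.startswith(admin_media_prefix)


print(admin_prefix_shadows_media(MEDIA_URL, ADMIN_MEDIA_PREFIX))  # False
print(admin_prefix_shadows_media('/media/', '/media/'))           # True
```

<p>With identical values every media request falls under the admin prefix, which matches the 404s described in this post.</p>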
]]></content:encoded>
			<wfw:commentRss>http://scottbarnham.com/blog/2010/10/06/django-static-media-always-returning-404-not-found/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Get User from session key in Django</title>
		<link>http://scottbarnham.com/blog/2008/12/04/get-user-from-session-key-in-django/</link>
		<comments>http://scottbarnham.com/blog/2008/12/04/get-user-from-session-key-in-django/#comments</comments>
		<pubDate>Thu, 04 Dec 2008 20:43:42 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Django]]></category>

		<guid isPermaLink="false">http://scottbarnham.com/blog/2008/12/04/get-user-from-session-key-in-django/</guid>
		<description><![CDATA[Error emails contain session key
When you get an error email from your Django app telling you someone got a server error, it&#8217;s not always easy to tell which user had a problem.  It might help your debugging to know or you might want to contact the user to tell them you have fixed the [...]]]></description>
			<content:encoded><![CDATA[<h2>Error emails contain session key</h2>
<p>When you get an error email from your Django app telling you someone got a server error, it&#8217;s not always easy to tell which user had a problem.  It might help your debugging to know or you might want to contact the user to tell them you have fixed the problem.</p>
<p>Assuming the user is logged in when they get the error, the email will contain the session key for that user&#8217;s session.  The relevant part of the email looks like:</p>
<pre>&lt;WSGIRequest
GET:&lt;QueryDict: {}&gt;,
POST:&lt;QueryDict: {}&gt;,
COOKIES:{ 'sessionid': '8cae76c505f15432b48c8292a7dd0e54'},
...</pre>
<h2>Finding the user from the session</h2>
<p>If the session still exists we can find it, unpickle the data it contains and get the user id.  Here&#8217;s a short script to do just that:</p>
<pre>from django.contrib.sessions.models import Session
from django.contrib.auth.models import User

session_key = '8cae76c505f15432b48c8292a7dd0e54'

session = Session.objects.get(session_key=session_key)
uid = session.get_decoded().get('_auth_user_id')
user = User.objects.get(pk=uid)

print user.username, user.get_full_name(), user.email</pre>
<p>There it is.  Pass in the session key (sessionid cookie) and get back the user&#8217;s name and email address.</p>
<p><strong>Plug:</strong> Get your own job board at <a href="http://www.fuselagejobs.com/">Fuselagejobs</a></p>
]]></content:encoded>
			<wfw:commentRss>http://scottbarnham.com/blog/2008/12/04/get-user-from-session-key-in-django/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Dynamic upload paths in Django</title>
		<link>http://scottbarnham.com/blog/2008/08/25/dynamic-upload-paths-in-django/</link>
		<comments>http://scottbarnham.com/blog/2008/08/25/dynamic-upload-paths-in-django/#comments</comments>
		<pubDate>Mon, 25 Aug 2008 21:38:27 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Django]]></category>

		<guid isPermaLink="false">http://scottbarnham.com/blog/2008/08/25/dynamic-upload-paths-in-django/</guid>
		<description><![CDATA[For a while I&#8217;ve been using the CustomImageField as a way to specify an upload path for images.  It&#8217;s a hack that lets you use ids or slugs from your models in the upload path, e.g.:
/path/to/media/photos/1234/flowers.jpg
or
/path/to/media/photos/scotland-trip/castle.jpg
CustomImageField no more
Since the FileStorageRefactor was merged in to trunk r8244, it&#8217;s no longer necessary to use the custom [...]]]></description>
			<content:encoded><![CDATA[<p>For a while I&#8217;ve been using the <a href="http://scottbarnham.com/blog/2007/07/31/uploading-images-to-a-dynamic-path-with-django/">CustomImageField</a> as a way to specify an upload path for images.  It&#8217;s a hack that lets you use ids or slugs from your models in the upload path, e.g.:</p>
<p><code>/path/to/media/photos/1234/flowers.jpg</code><br />
or<br />
<code>/path/to/media/photos/scotland-trip/castle.jpg</code></p>
<h2>CustomImageField no more</h2>
<p>Since the <a href="http://code.djangoproject.com/wiki/FileStorageRefactor">FileStorageRefactor</a> was merged in to trunk <a href="http://code.djangoproject.com/changeset/8244">r8244</a>, it&#8217;s no longer necessary to use the custom field.  Other recent changes to trunk mean that it doesn&#8217;t work any more in its current state, so this is a good time to retire it.</p>
<h2>Pass a callable in <code>upload_to</code></h2>
<p>It is now possible for the <code>upload_to</code> parameter of the <code><a href="http://www.djangoproject.com/documentation/model-api/#filefield">FileField</a></code> or <code>ImageField</code> to be a callable, instead of a string.  The callable is passed the current model instance and uploaded file name and must return a path.  That sounds ideal.</p>
<p>Here&#8217;s an example:</p>
<pre>import os
from django.db import models

def get_image_path(instance, filename):
    return os.path.join('photos', str(instance.id), filename)

class Photo(models.Model):
    image = models.ImageField(upload_to=get_image_path)</pre>
<p><code>get_image_path</code> is the callable (in this case a function).  It simply gets the id from the instance of <code>Photo</code> and uses that in the upload path.  Images will be uploaded to paths like:</p>
<p><code>photos/1/kitty.jpg</code></p>
<p>You can use whatever fields are in the instance (slugs, etc), or fields in related models.  For example, if <code>Photo</code> models are associated with an <code>Album</code> model, the upload path for a <code>Photo</code> could include the <code>Album</code> slug.</p>
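<p>Here is a minimal sketch of the related-model case.  It assumes <code>Photo</code> has an <code>album</code> foreign key and that <code>Album</code> has a <code>slug</code> field; both names are made up for illustration.  It uses <code>posixpath</code> rather than <code>os.path</code> so the result has forward slashes on any platform, and stand-in objects so it runs without a database.</p>

```python
import posixpath
from types import SimpleNamespace


def get_image_path(instance, filename):
    # Use the slug of the related album rather than the photo's own id.
    # "album" and "slug" are assumed field names, not taken from a real model.
    return posixpath.join('photos', instance.album.slug, filename)


# Stand-in objects so the function can be tried without a database.
photo = SimpleNamespace(album=SimpleNamespace(slug='scotland-trip'))
print(get_image_path(photo, 'castle.jpg'))  # photos/scotland-trip/castle.jpg
```

<p>In a real model the function would be passed as <code>upload_to</code> in the same way as <code>get_image_path</code> in the example above.</p>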
<p>Note that if you are using the id, you need to make sure the model instance was saved before you upload the file.  Otherwise, the id hasn&#8217;t been set at that point and can&#8217;t be used.</p>
<p>For reference, here&#8217;s what the main part of the view might look like:</p>
<pre>...
    if request.method == 'POST':
        form = PhotoForm(request.POST, request.FILES)
        if form.is_valid():
            photo = Photo.objects.create()
            image_file = request.FILES['image']
            photo.image.save(image_file.name, image_file)
...</pre>
<p>This is much simpler than the hacks used in <code>CustomImageField</code> and provides a nice flexible way to specify file or image upload paths per-model instance.</p>
<p><strong>Note:</strong> If you are using ModelForm, when you call <code>form.save()</code> it will save the file &#8211; no need to do it yourself as in the example above.</p>
<p><strong>Update:</strong> Using the id of the instance doesn&#8217;t work any more because it&#8217;s probably not set when your function is called.  Try using a field such as a slug instead, or the id of a parent object (e.g. if the <code>Photo</code> is in an <code>Album</code>, use the <code>Album</code>&#8217;s id).</p>
]]></content:encoded>
			<wfw:commentRss>http://scottbarnham.com/blog/2008/08/25/dynamic-upload-paths-in-django/feed/</wfw:commentRss>
		<slash:comments>31</slash:comments>
		</item>
		<item>
		<title>Extending the Django User model with inheritance</title>
		<link>http://scottbarnham.com/blog/2008/08/21/extending-the-django-user-model-with-inheritance/</link>
		<comments>http://scottbarnham.com/blog/2008/08/21/extending-the-django-user-model-with-inheritance/#comments</comments>
		<pubDate>Thu, 21 Aug 2008 19:40:43 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Django]]></category>

		<guid isPermaLink="false">http://scottbarnham.com/blog/2008/08/21/extending-the-django-user-model-with-inheritance/</guid>
		<description><![CDATA[Extra fields for Users
Most of the Django projects I&#8217;ve worked on need to store information about each user in addition to the standard name and email address held by the contrib.auth.models.User model.
The old way: User Profiles
The solution in the past was to create a &#8220;user profile&#8221; model which is associated 1-to-1 with the user.  [...]]]></description>
			<content:encoded><![CDATA[<h2>Extra fields for Users</h2>
<p>Most of the Django projects I&#8217;ve worked on need to store information about each user in addition to the standard name and email address held by the <code>contrib.auth.models.User</code> model.</p>
<h2>The old way: User Profiles</h2>
<p>The solution in the past was to create a &#8220;user profile&#8221; model which is associated 1-to-1 with the user.  Something like:</p>
<h4>the model</h4>
<pre>class UserProfile(models.Model):
    user = models.ForeignKey(User, unique=True, related_name='profile')
    timezone = models.CharField(max_length=50, default='Europe/London')</pre>
<h4>config in <code>settings.py</code></h4>
<pre>AUTH_PROFILE_MODULE = 'accounts.UserProfile'</pre>
<h4>usage</h4>
<pre>profile = request.user.get_profile()
print profile.timezone</pre>
<p>It works ok, but it&#8217;s an extra database query for each request that uses the profile (it&#8217;s cached during the request so each call to <code>get_profile()</code> is not a query).  Also, the information about the user is stored in two separate models, so you need to display and update fields from both the <code>User</code> and the <code>UserProfile</code> models.</p>
<h2>The new way: Model Inheritance</h2>
<p>As part of the great work done on the <a href="http://code.djangoproject.com/wiki/QuerysetRefactorBranch">queryset-refactor</a> by <a href="http://www.pointy-stick.com/about/">Malcolm</a> et al, Django now has <a href="http://www.djangoproject.com/documentation/model-api/#model-inheritance">model inheritance</a>.</p>
<p>If you&#8217;re using trunk as of revision 7477 (26th April 2008), your model classes can inherit from an existing model class.  Additional fields are stored in a separate table which is linked to the table of the base model.  When you retrieve your model, the query uses a join to get the fields from it and the base model.</p>
<h3>Inheriting from User</h3>
<p>Instead of creating a User Profile class, why don&#8217;t we inherit from the normal <code>User</code> class and add some fields?</p>
<pre>from django.contrib.auth.models import User, UserManager

class CustomUser(User):
    """User with app settings."""
    timezone = models.CharField(max_length=50, default='Europe/London')

    # Use UserManager to get the create_user method, etc.
    objects = UserManager()</pre>
<p>Now each instance of <code>CustomUser</code> will have the normal <code>User</code> fields and methods, as well as our additional fields and methods.  Pretty handy, no?</p>
<p>We add <code>UserManager</code> as the default manager so that the standard methods are available.  In particular, to create a user, we really want to say:</p>
<pre>user = CustomUser.objects.create(...)</pre>
<p>If we just created the user from the <code>User</code> class, we wouldn&#8217;t get a row in the <code>CustomUser</code> table.  Creation needs to be done in the derived class.</p>
<p>You can still get and update the underlying <code>User</code> model, no problem, but it won&#8217;t have the additional fields and methods found in our <code>CustomUser</code> class.</p>
<h2>Getting the <code>CustomUser</code> class by default</h2>
<p>So far, there&#8217;s one problem.  When you get <code>request.user</code>, it&#8217;s an instance of <code>User</code>, not an instance of <code>CustomUser</code>, so you don&#8217;t get the extra fields and methods.</p>
<p>What we want is for Django to retrieve the <code>CustomUser</code> instance transparently.  It turns out to be quite easy.</p>
<h3>Users come from authentication backends</h3>
<p>The default authentication backend gets the <code>User</code> model from the database, checks the password is correct then returns the <code>User</code>.  You can <a href="http://www.djangoproject.com/documentation/authentication/#writing-an-authentication-backend">write your own authentication backend</a>, for example to check the username and password against some other data source or to use the email address instead of username.</p>
<p>In our case, we can use an authentication backend to return an instance of <code>CustomUser</code> instead of <code>User</code>.</p>
<h4>the authentication backend in <code>auth_backends.py</code></h4>
<pre>from django.conf import settings
from django.contrib.auth.backends import ModelBackend
from django.core.exceptions import ImproperlyConfigured
from django.db.models import get_model

class CustomUserModelBackend(ModelBackend):
    def authenticate(self, username=None, password=None):
        try:
            user = self.user_class.objects.get(username=username)
            if user.check_password(password):
                return user
        except self.user_class.DoesNotExist:
            return None

    def get_user(self, user_id):
        try:
            return self.user_class.objects.get(pk=user_id)
        except self.user_class.DoesNotExist:
            return None

    @property
    def user_class(self):
        if not hasattr(self, '_user_class'):
            self._user_class = get_model(*settings.CUSTOM_USER_MODEL.split('.', 2))
            if not self._user_class:
                raise ImproperlyConfigured('Could not get custom user model')
        return self._user_class</pre>
<h4>config in <code>settings.py</code></h4>
<pre>AUTHENTICATION_BACKENDS = (
    'myproject.auth_backends.CustomUserModelBackend',
)
...

CUSTOM_USER_MODEL = 'accounts.CustomUser'</pre>
<p>That&#8217;s it.  Now when you get <code>request.user</code>, it&#8217;s an instance of the <code>CustomUser</code> class with whatever additional fields or methods you have added.</p>
<p>P.S. Looking for Django hosting?  I&#8217;d recommend <a href="http://www.webfaction.com/shared_hosting?affiliate=sgb79">WebFaction</a> for shared hosting and <a href="https://manage.slicehost.com/customers/new?referrer=107490507">Slicehost</a> for a VPS.</p>
]]></content:encoded>
			<wfw:commentRss>http://scottbarnham.com/blog/2008/08/21/extending-the-django-user-model-with-inheritance/feed/</wfw:commentRss>
		<slash:comments>84</slash:comments>
		</item>
		<item>
		<title>Django performance testing &#8211; a real world example</title>
		<link>http://scottbarnham.com/blog/2008/04/28/django-performance-testing-a-real-world-example/</link>
		<comments>http://scottbarnham.com/blog/2008/04/28/django-performance-testing-a-real-world-example/#comments</comments>
		<pubDate>Mon, 28 Apr 2008 13:58:55 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Django]]></category>
		<category><![CDATA[httperf]]></category>
		<category><![CDATA[profiling]]></category>

		<guid isPermaLink="false">http://scottbarnham.com/blog/2008/04/28/django-performance-testing-a-real-world-example/</guid>
		<description><![CDATA[About a week ago Andrew and I launched a new Django-powered site called Hey! Wall.  It&#8217;s a social site along the lines of &#8220;the wall&#8221; on social networks and gives groups of friends a place to leave messages, share photos, videos and links.
We wanted to gauge performance and try some server config and code [...]]]></description>
			<content:encoded><![CDATA[<p>About a week ago <a href="http://tangerinesmash.com/">Andrew</a> and <a href="http://www.staplefish.com/">I</a> launched a new Django-powered site called <a href="http://heywall.com/">Hey! Wall</a>.  It&#8217;s a social site along the lines of &#8220;the wall&#8221; on social networks and gives groups of friends a place to leave messages, share photos, videos and links.</p>
<p>We wanted to gauge performance and try some server config and code changes to see what steps we could take to improve it.  We tested using <code>httperf</code> and doubled performance by making some optimisations.</p>
<h3>Server and Client</h3>
<p>The server is a <a href="https://manage.slicehost.com/customers/new?referrer=107490507">Xen VPS from Slicehost</a> with 256MB RAM running Debian Etch.  It is located in the US Midwest.</p>
<p>For testing, the client is a <a href="http://www.xtrahost.net/xenvps/">Xen VPS from Xtraordinary Hosting</a>, located in the UK.  Our normal Internet access is via ADSL which makes it difficult to make enough requests to the server.  Using a well-connected VPS as the client means we can really hammer the server.</p>
<h4>Server spec caveats</h4>
<p>It&#8217;s hard to say exactly what the server specs are.  The VPS has 256MB RAM and is hosted with similar VPSes, probably on a <a href="http://www.slicehost.com/questions/#users">quad core server with 16GB RAM</a>.  That&#8217;s a maximum of 64 VPSes on the physical server, assuming it is full of 256MB slices.  If the four processors are 2.4GHz, that&#8217;s 9.6GHz total, divided by 64 gives a minimum of 150MHz of CPU.</p>
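<p>The arithmetic behind that estimate, as a quick sketch.  The figures are the ones quoted above, and the 2.4GHz clock speed is still only a guess.</p>

```python
# Figures quoted in the post; the clock speed is the post's own "if".
HOST_RAM_MB = 16 * 1024
SLICE_RAM_MB = 256
CORES = 4
CORE_MHZ = 2400

max_slices = HOST_RAM_MB // SLICE_RAM_MB        # slices that fit in RAM
total_mhz = CORES * CORE_MHZ                    # combined clock of all cores
min_mhz_per_slice = total_mhz / max_slices      # even share if the host is full

print(max_slices, total_mhz, min_mhz_per_slice)  # 64 9600 150.0
```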
<p>On a Xen VPS, you get a fixed allocation of memory and CPU without contention, but usually any <a href="http://www.slicehost.com/questions/#cpu-scheduling">available CPU on the machine can be used</a>.  If other VPSes on the same box are idle, your VPS can make use of more of the CPU.  This probably means more CPU was used during testing and perhaps more for some tests than for others.</p>
<h3>Measuring performance with httperf</h3>
<p>There are various web performance testing tools around including <a href="http://httpd.apache.org/docs/2.0/programs/ab.html">ab (from Apache)</a>, <a href="http://httpd.apache.org/test/flood/">Flood</a> and <a href="http://www.hpl.hp.com/research/linux/httperf/">httperf</a>.  We went with httperf for no particular reason.</p>
<p>An httperf command looks something like:</p>
<pre>httperf --hog --server=example.com --uri=/ --timeout=10 --num-conns=200 --rate=5</pre>
<p>In this example, we&#8217;re requesting <code>http://example.com/</code> 200 times, with up to 5 requests per second.</p>
<h3>Testing Plan</h3>
<p>Some tools support sessions and try to emulate users performing tasks on your site.  We went with a simple brute-force test to get an idea of how many requests per second the site could handle.</p>
<p>The basic approach is to make a number of requests and see how the server responds: a status 200 is good, a status 500 is bad.  Increase the rate (the number of requests made per second) and try again.  When it starts returning lots of 500s, you&#8217;ve reached a limit.</p>
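<p>One way to script that loop is to generate an <code>httperf</code> command per rate.  This is only a sketch: the function names are made up, the options are the ones from the example command above, and the commands are printed rather than run.</p>

```python
def httperf_command(server, uri, rate, num_conns=200, timeout=10):
    """Build an httperf argument list using the options shown in the post."""
    return [
        'httperf', '--hog',
        '--server=%s' % server,
        '--uri=%s' % uri,
        '--timeout=%d' % timeout,
        '--num-conns=%d' % num_conns,
        '--rate=%d' % rate,
    ]


def rate_sweep(server, uri, rates):
    """Return one command per request rate, lowest rate first."""
    return [httperf_command(server, uri, rate) for rate in sorted(rates)]


for command in rate_sweep('example.com', '/', [5, 10, 20]):
    print(' '.join(command))
```

<p>Each list could be handed to <code>subprocess.run</code> on a machine with <code>httperf</code> installed, checking the reply status counts after each run before moving to the next rate.</p>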
<h4>Monitoring server resources</h4>
<p>The other side is knowing what the server is doing in terms of memory and CPU use.  To track this, we run <code>top</code> and log the output to a file for later review.  The top command is something like:</p>
<pre>top -b -d 3 -U www-data > top.txt</pre>
<p>In this example we&#8217;re logging information on processes running as user <code>www-data</code> every three seconds.  If you want to be more specific, instead of <code>-U username</code> you can use <code>-p 1,2,3</code> where 1, 2 and 3 are pids (the process ids of the processes you want to watch).</p>
<p>The web server is Lighttpd with Python 2.5 running as FastCGI processes.  We didn&#8217;t log information on the database process (PostgreSQL), though that could be useful.</p>
<p>Another useful tool is <code>vmstat</code>, particularly the swap columns which show how much memory is being swapped.  Swapping means you don&#8217;t have enough memory and is a performance killer.  To repeatedly run <code>vmstat</code>, specify the number of seconds between checks.  e.g.</p>
<pre>vmstat 2</pre>
<h4>Authenticated requests with httperf</h4>
<p><code>httperf</code> makes simple <code>GET</code> requests to a URL and downloads the html (but not any of the media).  Requesting public/anonymous pages is easy, but what if you want a page that requires login?</p>
<p><code>httperf</code> can pass request headers.  Django authentication (from <code>django.contrib.auth</code>) uses sessions which rely on a session id held in a cookie on the client.  The client passes the cookie in a request header.  You see where this is going.</p>
<p>Log in to the site and check your cookies.  There should be one like <code>sessionid=97d674a05b2614e98411553b28f909de</code>.  To pass this cookie using httperf, use the <code>--add-header</code> option.  e.g.</p>
<pre>httperf ... --add-header='Cookie: sessionid=97d674a05b2614e98411553b28f909de\n'</pre>
<p>Note the <code>\n</code> after the header.  If you miss it, you will probably get timeouts for every request.</p>
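<p>If you build the command from a script rather than typing it in a shell, the same option could be put together like this.  It is a sketch with a made-up helper name, and it assumes <code>httperf</code> itself expands the <code>\n</code> escape, so the argument has to contain a literal backslash followed by <code>n</code>.</p>

```python
def session_cookie_header(session_id):
    # The doubled backslash yields two literal characters, backslash and n.
    # The assumption is that httperf turns that escape into the line ending
    # that terminates the header.
    return '--add-header=Cookie: sessionid=%s\\n' % session_id


arg = session_cookie_header('97d674a05b2614e98411553b28f909de')
print(arg)
```

<p>Passed as one element of an argument list, this needs no extra shell quoting.</p>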
<h4>Which pages to test</h4>
<p>With this in mind we tested two pages on the site:</p>
<ol>
<li><strong>home</strong>: anonymous request to the home page</li>
<li><strong>wall</strong>: authenticated request to a &#8220;wall&#8221; which contains content retrieved from the database</li>
</ol>
<h3>Practically static versus highly dynamic</h3>
<p>The home page is essentially static for anonymous users and just renders a template without needing any data from the database.</p>
<p>The wall page is very dynamic, with the main data retrieved from the database.  The template is rendered specifically for the user with dates set to the user&#8217;s timezone, &#8220;remove&#8221; links on certain items, etc.  The particular wall we tested has about 50 items on it and before optimisation made about 80 database queries.</p>
<p>For the first test we had two FastCGI backends running, able to accept requests for Django.</p>
<p>Home: 175 req/s (i.e. requests per second).<br />
Wall: 8 req/s.</p>
<h3>Compressed content</h3>
<p>The first config optimisation was to enable gzip compression of the output using <code>GZipMiddleware</code>.  Performance improved slightly, but not a huge difference.  Worth doing for the bandwidth savings in any case.</p>
<p>Home: 200 req/s.<br />
Wall: 8 req/s.</p>
<h3>More processes, shorter queues</h3>
<p>Next we increased the number of FastCGI backends from two to five.  This was an improvement with fewer 500 responses as more of the requests could be handled by the extra backends.</p>
<p>Home: 200 req/s.<br />
Wall: 11 req/s.</p>
<h3>Mo processes, mo problems</h3>
<p>The increase from two to five was good, so we tried increasing FastCGI backends to ten.  Performance <em>decreased</em> significantly.</p>
<p>Checking with <code>vmstat</code> on the server, I could see it was swapping.  Too many processes, each using memory for Python, had caused the VPS to run out of memory and swap memory to and from disk.</p>
<p>Home: 150 req/s.<br />
Wall: 7 req/s.</p>
<p>At this point we set the FastCGI backends back down to five for further tests.</p>
<h3>Profiling &ndash; where does the time go</h3>
<p>The wall page had disappointing performance, so we started to optimise.  The first thing we did was profile the code to see where time was being spent.</p>
<p>Using some simple <a href="http://www.djangosnippets.org/snippets/727/">profiling middleware</a> it was clear the time was being spent in database queries.  The wall page had a lot of queries and they increased linearly with the number of items on the wall.  On the test wall this caused around 80 queries.  No wonder its performance was poor.</p>
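<p>To show why per-item queries hurt, here is a toy illustration.  It is not the Hey! Wall code: the class is a stand-in for a database that just counts queries, and the 50 items and two authors are made-up test data.</p>

```python
class CountingStore:
    """Stand-in for a database that counts how many queries it receives."""

    def __init__(self, profiles):
        self.profiles = profiles
        self.queries = 0

    def get_profile(self, user_id):
        # One query per call, as when each item looks up its own author.
        self.queries += 1
        return self.profiles[user_id]

    def get_profiles(self, user_ids):
        # One query for all the distinct ids at once.
        self.queries += 1
        return dict((uid, self.profiles[uid]) for uid in set(user_ids))


profiles = {1: 'Scott', 2: 'Andrew'}
item_authors = [1, 2] * 25  # 50 wall items posted by two users

per_item = CountingStore(profiles)
names = [per_item.get_profile(uid) for uid in item_authors]

batched = CountingStore(profiles)
lookup = batched.get_profiles(item_authors)
batched_names = [lookup[uid] for uid in item_authors]

print(per_item.queries, batched.queries)  # 50 1
```

<p>Both approaches produce the same list of names, but the first one grows by one query for every item added to the wall.</p>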
<h3>Optimise this</h3>
<p>By optimising how media attached to items is handled we were able to drop one query per item straight away.  This slightly reduced how long the request took and so increased the number of requests handled per second.</p>
<p>Wall: 12 req/s.</p>
<p>Another inefficiency was the way several filters were applied to the content of each item whenever the page was requested.  We changed it so the html output from the filtered content was stored in the item, saving some processing each time the page was viewed.  This gave another small increase.</p>
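<p>The idea of filtering once at save time rather than on every view can be sketched in plain Python.  The class, the field names and the whitespace-collapsing filter are placeholders, not the real model or filters.</p>

```python
class Item:
    """Toy item that stores its filtered output when saved."""

    def __init__(self, content):
        self.content = content
        self.content_html = None
        self.filter_runs = 0

    def apply_filters(self, text):
        # Placeholder for the real filters; this one only collapses whitespace.
        self.filter_runs += 1
        return ' '.join(text.split())

    def save(self):
        # Do the filtering once and keep the result with the item.
        self.content_html = self.apply_filters(self.content)


item = Item('Hello   wall\n')
item.save()
views = [item.content_html for _ in range(3)]  # three page views, no filtering
print(item.filter_runs, views[0])  # 1 Hello wall
```

<p>The trade-off is that the stored output has to be regenerated whenever the content or the filters change.</p>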
<p>Wall: 13 req/s.</p>
<p>Back to reducing database queries, we were able to eliminate one query per item by changing how user profiles were retrieved (used to show who posted the item to the wall).  Another worthwhile increase came from this change.</p>
<p>Wall: 15 req/s.</p>
<p>The final optimisation for this round of testing was to further reduce the queries needed to retrieve media attached to items.  Again, we shed some queries and slightly increased performance.</p>
<p>Wall: 17 req/s.</p>
<h3>Next step: caching</h3>
<p>Having reduced queries as much as we can, the next step would be to do some caching.  Retrieving cached data is usually much quicker than hitting the database, so we&#8217;d expect a good increase in performance.</p>
<p>Caching the output of complete pages is not useful because each page is heavily personalised to the user requesting it.  It would only be a cache hit if the user requested the same page twice with nothing changing on it in the meantime.</p>
<p>Caching data such as lists of walls, items and users is more useful.  The cached data could be used for multiple requests from a single user and shared to some degree across walls and different users.  It&#8217;s not necessarily a huge win because each wall is likely to have a very small number of users, so the data would need to stay in cache long enough to be retrieved by others.</p>
<p>Our simplistic <code>httperf</code> tests would be very misleading in this case.  Each request is made as the same user so cache hits would be practically 100% and performance would be great!  This does not reflect real-world use of the site, so we&#8217;d need some better tests.</p>
<p>We haven&#8217;t made use of caching yet as the site can easily handle its current level of activity, but if <a href="http://heywall.com/">Hey!&nbsp;Wall</a> becomes popular, it will be our next step.</p>
<h3>How many users is 17 req/s?</h3>
<p>Serving 17 req/s still seems fairly low, but it would be interesting to know how this translates to actual users of the site.  Obviously, this figure doesn&#8217;t include serving any media such as images, CSS and JavaScript files.  Media files are relatively large but should be served fast as they are handled directly by Lighttpd (not Django) and have <code>Expires</code> headers to allow the client to cache them.  Still, it&#8217;s some work the server would be doing in addition to what we measured with our tests.</p>
<p>It&#8217;s too early to tell what the common usage pattern would be, so I can only speculate.  <em>Allow me to do that!</em></p>
<p>I&#8217;ll assume the average user has access to three walls and checks each of them in turn, pausing for 10 or 20 seconds on each to read new comments and perhaps view some photos or open links.  The user does this three times per day.</p>
<p>Looking specifically at the wall page and ignoring media, that means our user is making 9 requests per day for wall pages.  Each user only makes one request at a time, so 17 users can be doing that at any second in time.  Within a minute the user only makes three requests so is only counted within the 17 concurrent users for 3 seconds out of 60 (or 1 in 20).</p>
<p>If the distribution of user requests over time was perfectly balanced (hint: it won&#8217;t be), that means 340 users (17 * 20) could be using the site each minute.  To continue with this unrealistic example, we could say there are 1440 minutes in a day and each user is on the site for three minutes per day, so the site could handle about 163,000 users.  That would be very good for a $20/month VPS!</p>
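<p>Spelling out that back-of-the-envelope sum, with the same unrealistic assumptions (perfectly even load, one request occupying one of the 17 slots for a second):</p>

```python
REQUESTS_PER_SECOND = 17
SECONDS_BUSY_PER_MINUTE = 3      # three wall requests in a one-minute visit
MINUTES_PER_DAY = 24 * 60
MINUTES_ON_SITE_PER_USER = 3     # three one-minute visits per day

# Each user occupies a slot for 3 seconds in 60, so 20 users share one slot.
users_per_minute = REQUESTS_PER_SECOND * (60 // SECONDS_BUSY_PER_MINUTE)
users_per_day = users_per_minute * MINUTES_PER_DAY // MINUTES_ON_SITE_PER_USER

print(users_per_minute, users_per_day)  # 340 163200
```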
<p>To rein in those numbers a bit, let&#8217;s say we handle 200 concurrent users in a minute for 6 hours per day, 100 concurrent users for another 6 hours and 10 concurrent users for the remaining 12 hours.  That&#8217;s around 115,000 user-minutes per day or, at three minutes per user, still about 38,000 users the site could handle in a day given the maximum load of 17 requests per second.</p>
<p>I&#8217;m sure these numbers are somewhere between unrealistic and absurd.  I&#8217;d be interested in comments on better ways to estimate or any real-world figures.</p>
<h3>What we learned</h3>
<p>To summarise:</p>
<ol>
<li>Testing the performance of your website may yield surprising results</li>
<li>Having many database queries is bad for performance (duh)</li>
<li>Caching works better for some types of site than others</li>
<li>An inexpensive VPS may handle a lot more users than you&#8217;d think</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://scottbarnham.com/blog/2008/04/28/django-performance-testing-a-real-world-example/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>ImageField and edit_inline revisited</title>
		<link>http://scottbarnham.com/blog/2008/02/24/imagefield-and-edit_inline-revisited/</link>
		<comments>http://scottbarnham.com/blog/2008/02/24/imagefield-and-edit_inline-revisited/#comments</comments>
		<pubDate>Sun, 24 Feb 2008 20:12:49 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Django]]></category>

		<guid isPermaLink="false">http://scottbarnham.com/blog/2008/02/24/imagefield-and-edit_inline-revisited/</guid>
		<description><![CDATA[A while back I wrote about using edit inline with image and file fields.  Specifically, I suggested adding an uneditable BooleanField as the core field of the related model.  This means you don&#8217;t have to set the ImageField or FileField to be core (which would cause confusing behaviour).
Removing the related model
The downside to [...]]]></description>
			<content:encoded><![CDATA[<p>A while back I wrote about <a href="http://scottbarnham.com/blog/2007/08/22/edit-inline-with-imagefield-or-filefield-in-django-admin/">using edit inline with image and file fields</a>.  Specifically, I suggested adding an uneditable <code>BooleanField</code> as the <code>core</code> field of the related model.  This means you don&#8217;t have to set the <code>ImageField</code> or <code>FileField</code> to be <code>core</code> (which would cause confusing behaviour).</p>
<h2>Removing the related model</h2>
<p>The downside to having an uneditable core field is that you can&#8217;t remove the related model instance using admin.  At the time, I wasn&#8217;t troubled by this so I just left it.  In a recent project I needed to associate photos with articles, use <code>edit_inline</code> for the photos and be able to remove them.  So here&#8217;s an extended workaround.</p>
<p>As well as the uneditable <code>BooleanField</code> (&#8220;keep&#8221;) which keeps the <code>ArticlePhoto</code> from being deleted, we now have a &#8220;remove&#8221; <code>BooleanField</code> which the user can tick in admin to cause the <code>ArticlePhoto</code> to be deleted.  The check for this is in the <code>save()</code> method.</p>
<pre>class ArticlePhoto(models.Model):
    article = models.ForeignKey(Article, related_name='photos', edit_inline=models.TABULAR, min_num_in_admin=5)
    keep = models.BooleanField(core=True, default=True, editable=False)
    remove = models.BooleanField(default=False)
    image = CustomImageField()

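    def save(self):
        # Nothing to save for an empty inline row that was never stored.
        if not self.id and not self.image:
            return
        if self.remove:
            # Only rows that already exist in the database can be deleted.
            if self.id:
                self.delete()
        else:
            super(ArticlePhoto, self).save()</pre>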
<p>It&#8217;s a pretty easy way to work around the problem and gives a sensible looking &#8220;remove&#8221; checkbox in the admin interface.  The database table will have a &#8220;remove&#8221; column that never gets used, but it&#8217;s a pretty small price to pay.</p>
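<p>As a rough sketch of the decision <code>save()</code> is making, here is the same logic as a plain function with no Django involved.  The function name and its return values are made up for illustration, and skipping an unsaved row that has &#8220;remove&#8221; ticked is an assumption on top of the model above.</p>
<pre>def inline_photo_action(has_id, has_image, remove):
    """Return 'skip', 'delete' or 'save' for one inline row."""
    if not has_id and not has_image:
        # Blank extra row in the admin form: nothing to do.
        return 'skip'
    if remove:
        # An unsaved row has nothing to delete, so it is skipped.
        return 'delete' if has_id else 'skip'
    return 'save'

print(inline_photo_action(False, False, False))  # skip
print(inline_photo_action(True, True, True))     # delete
print(inline_photo_action(False, True, False))   # save</pre>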
]]></content:encoded>
			<wfw:commentRss>http://scottbarnham.com/blog/2008/02/24/imagefield-and-edit_inline-revisited/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Case-insensitive ordering with Django and PostgreSQL</title>
		<link>http://scottbarnham.com/blog/2007/11/20/case-insensitive-ordering-with-django-and-postgresql/</link>
		<comments>http://scottbarnham.com/blog/2007/11/20/case-insensitive-ordering-with-django-and-postgresql/#comments</comments>
		<pubDate>Tue, 20 Nov 2007 19:38:15 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Django]]></category>

		<guid isPermaLink="false">http://scottbarnham.com/blog/2007/11/20/case-insensitive-ordering-with-django-and-postgresql/</guid>
		<description><![CDATA[When the Django Gigs site first went live we noticed the ordering of developers by name was not right.  Those starting with an uppercase letter were coming before those starting with a lowercase letter.
PostgreSQL and the locale
PostgreSQL has a locale setting which is configured when the cluster is created.  Among other things, this [...]]]></description>
			<content:encoded><![CDATA[<p>When the <a href="http://djangogigs.com/">Django Gigs</a> site first went live we noticed the ordering of developers by name was not right.  Those starting with an uppercase letter were coming before those starting with a lowercase letter.</p>
<h3>PostgreSQL and the locale</h3>
<p>PostgreSQL has a <a href="http://www.postgresql.org/docs/8.2/interactive/locale.html">locale setting</a> which is configured when the cluster is created.  Among other things, this affects the ordering of results when you use the SQL <code>order by</code> clause.</p>
<p>The locale on my server was set to &#8220;C&#8221;, which means it uses byte-level comparisons rather than following more complex rules for a given culture.  Although this is apparently good for performance, it means <code>order by</code> will be case sensitive &#8211; e.g. &#8220;Zebra&#8221; comes before &#8220;apple&#8221;.</p>
<p>Depending on how your system is set up, you may have locales such as en_GB.  The locale can&#8217;t easily be changed in PostgreSQL because indexes and other data depend on it.  To change locale, you need to start a new cluster and move databases to it.</p>
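<p>Python&#8217;s default string sort compares code points, which behaves much like the &#8220;C&#8221; locale does here, so it makes a handy illustration.  The list of names is made up.</p>
<pre>names = ['apple', 'Zebra', 'mango', 'Banana']

# Comparing raw code points puts every uppercase letter before any lowercase one.
byte_order = sorted(names)

# Folding case before comparing gives the order most people expect.
folded_order = sorted(names, key=str.lower)

print(byte_order)    # ['Banana', 'Zebra', 'apple', 'mango']
print(folded_order)  # ['apple', 'Banana', 'mango', 'Zebra']</pre>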
<h3>Django and case-sensitivity</h3>
<p>Django provides the <a href="http://www.djangoproject.com/documentation/db-api/#order-by-fields">order_by()</a> function on QuerySets, but does not have an option for case insensitive ordering.  Instead this is left to your database configuration.</p>
<p>When using SQL directly, you can sort case-insensitively using the PostgreSQL <code>lower()</code> function.</p>
<p>e.g. </p>
<pre>select * from developer order by lower(name)</pre>
<p>One way to do this in Django is to use <a href="http://www.djangoproject.com/documentation/db-api/#extra-select-none-where-none-params-none-tables-none">extra</a> to call the <code>lower()</code> function, creating a virtual column which you can then order by.</p>
<p>e.g.</p>
<pre>Developer.objects.all().extra(
select={'lower_name': 'lower(name)'}).order_by('lower_name')</pre>
<p>Using SQL functions could tie you to a particular database, though in this case the <code>lower()</code> function is standard and should work with <a href="http://en.wikibooks.org/wiki/SQL_dialects_reference/Functions_and_expressions/String_functions">most databases</a>.  Some other databases do case-insensitive comparisons so wouldn&#8217;t need it.</p>
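<p>To see the same <code>lower()</code> trick outside PostgreSQL, here is a small sketch using the SQLite module that ships with Python.  SQLite compares text byte by byte by default, so it shows the same problem.  The table contents are made up for the example.</p>
<pre>import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('create table developer (name text)')
conn.executemany('insert into developer (name) values (?)',
                 [('apple',), ('Zebra',), ('mango',)])

# Plain ordering is case sensitive with the default collation.
plain = [row[0] for row in conn.execute(
    'select name from developer order by name')]

# Ordering by lower(name) ignores case.
folded = [row[0] for row in conn.execute(
    'select name from developer order by lower(name)')]

print(plain)   # ['Zebra', 'apple', 'mango']
print(folded)  # ['apple', 'mango', 'Zebra']
conn.close()</pre>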
]]></content:encoded>
			<wfw:commentRss>http://scottbarnham.com/blog/2007/11/20/case-insensitive-ordering-with-django-and-postgresql/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Django developers: We are the world</title>
		<link>http://scottbarnham.com/blog/2007/09/29/django-developers-we-are-the-world/</link>
		<comments>http://scottbarnham.com/blog/2007/09/29/django-developers-we-are-the-world/#comments</comments>
		<pubDate>Sat, 29 Sep 2007 12:03:19 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Django]]></category>

		<guid isPermaLink="false">http://scottbarnham.com/blog/2007/09/29/django-developers-we-are-the-world/</guid>
		<description><![CDATA[An informal survey of the Django community
This week, Andrew and I launched the Django Gigs website to help employers find Django developers.  Andrew wrote about it and thanks to the Django Community feed aggregator we had quite a few visitors in the first couple of days.
It&#8217;s clear that Django is catching on and growing [...]]]></description>
			<content:encoded><![CDATA[<h3>An informal survey of the Django community</h3>
<p>This week, <a href="http://tangerinesmash.com/">Andrew</a> and I launched the Django Gigs website to help employers <a href="http://djangogigs.com/">find Django developers</a>.  Andrew <a href="http://www.tangerinesmash.com/writings/2007/sep/26/djangogigscom-idea-release-6-hours/">wrote about it</a> and thanks to the <a href="http://www.djangoproject.com/community/">Django Community</a> feed aggregator we had quite a few visitors in the first couple of days.</p>
<p>It&#8217;s clear that Django is catching on and growing in popularity.  The djangoproject.com site is getting <a href="http://www.djangoproject.com/weblog/2007/sep/17/home/">close to 8 million hits each month</a>.  I thought it would be interesting to analyse my logs and see what I could tell about the Django community, or at least the section of it that read the blog and visited the <a href="http://djangogigs.com/">Django Gigs</a> website.</p>
<h3>Visitors</h3>
<p>1280 unique IP addresses</p>
<p>The number of IP addresses seems a pretty good indication of how many unique visitors we had in about two days.</p>
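<p>Counting unique addresses can be done with a few lines of Python.  This is only a sketch: it assumes an access log where the client IP is the first field on each line, and the sample lines and function name are made up.</p>
<pre>def unique_ips(log_lines):
    """Return the set of client IPs, taken as the first field of each line."""
    ips = set()
    for line in log_lines:
        fields = line.split()
        if fields:
            ips.add(fields[0])
    return ips

sample_log = [
    '203.0.113.5 - - [27/Sep/2007:10:00:00 +0000] "GET / HTTP/1.1" 200 5120',
    '198.51.100.7 - - [27/Sep/2007:10:00:02 +0000] "GET /feed/ HTTP/1.1" 200 2048',
    '203.0.113.5 - - [27/Sep/2007:10:00:09 +0000] "GET /about/ HTTP/1.1" 200 1024',
]
print(len(unique_ips(sample_log)))  # 2</pre>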
<h3>Platforms</h3>
<table class="simple" cellspacing="0">
<tr>
<td class="number">510</td>
<td>Windows</td>
</tr>
<tr>
<td class="number">373</td>
<td>Mac OS X (including 4 iPhones)</td>
</tr>
<tr>
<td class="number">312</td>
<td>Linux</td>
</tr>
<tr>
<td class="number">85</td>
<td>Other (mostly bots, feed aggregator sites, a handful of BSD)</td>
</tr>
</table>
<p>The platforms are a fairly even split among Windows, Mac and Linux.  Given the dominance of Windows on the desktop, this suggests Django is disproportionately popular with Mac OS X and Linux users.  I suspect this is the case with Python in general, but I don&#8217;t have any stats to back that up.</p>
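<p>A platform breakdown like this can be approximated by looking for a few keywords in each user agent string.  The rules below are a crude guess for illustration, not the exact method behind the table, and the sample strings and function name are assumptions.</p>
<pre>from collections import Counter

def platform_of(user_agent):
    """Very rough platform guess from a user agent string."""
    ua = user_agent.lower()
    if 'windows' in ua:
        return 'Windows'
    if 'mac os x' in ua or 'iphone' in ua:
        return 'Mac OS X'
    if 'linux' in ua:
        return 'Linux'
    return 'Other'

sample_agents = [
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7',
    'Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7',
    'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7',
    'Feedfetcher-Google; (+http://www.google.com/feedfetcher.html)',
]

# Tally one platform per user agent string.
counts = Counter(platform_of(ua) for ua in sample_agents)
print(counts['Windows'], counts['Mac OS X'], counts['Linux'], counts['Other'])  # 1 1 1 1</pre>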
<h3>Browsers</h3>
<table class="simple" cellspacing="0">
<tr>
<td class="number">875</td>
<td>Firefox</td>
</tr>
<tr>
<td class="number">148</td>
<td>Safari</td>
</tr>
<tr>
<td class="number">40</td>
<td>IE</td>
</tr>
<tr>
<td class="number">36</td>
<td>Camino</td>
</tr>
<tr>
<td class="number">13</td>
<td>Konqueror</td>
</tr>
<tr>
<td class="number">168</td>
<td>Other (mostly bots or feed readers like NetNewsWire)</td>
</tr>
</table>
<p>No big surprise here: Firefox is the daddy.</p>
<p>One thing that surprised me was the number of different user agents.  There were 408 unique user agent strings!  Of course, most of them were from different versions of the same software.  IE on Windows likes to report versions of the .NET framework and various browser extensions installed on the machine.</p>
<h3>Countries</h3>
<table class="simple" cellspacing="0">
<tr>
<td class="number">423</td>
<td>United States</td>
</tr>
<tr>
<td class="number">133</td>
<td>France</td>
</tr>
<tr>
<td class="number">126</td>
<td>United Kingdom</td>
</tr>
<tr>
<td class="number">67</td>
<td>Germany</td>
</tr>
<tr>
<td class="number">61</td>
<td>Canada</td>
</tr>
<tr>
<td class="number">45</td>
<td>Russian Federation</td>
</tr>
<tr>
<td class="number">42</td>
<td>Brazil</td>
</tr>
<tr>
<td class="number">34</td>
<td>Australia</td>
</tr>
<tr>
<td class="number">33</td>
<td>Netherlands</td>
</tr>
<tr>
<td class="number">23</td>
<td>Italy</td>
</tr>
<tr>
<td class="number">16</td>
<td>Belgium</td>
</tr>
<tr>
<td class="number">16</td>
<td>China</td>
</tr>
<tr>
<td class="number">16</td>
<td>Spain</td>
</tr>
<tr>
<td class="number">15</td>
<td>Poland</td>
</tr>
<tr>
<td class="number">15</td>
<td>Sweden</td>
</tr>
<tr>
<td class="number">14</td>
<td>India</td>
</tr>
<tr>
<td class="number">13</td>
<td>Norway</td>
</tr>
<tr>
<td class="number">13</td>
<td>Singapore</td>
</tr>
<tr>
<td class="number">13</td>
<td>Switzerland</td>
</tr>
<tr>
<td class="number">13</td>
<td>Austria</td>
</tr>
<tr>
<td class="number">13</td>
<td>Japan</td>
</tr>
<tr>
<td class="number">9</td>
<td>Ireland</td>
</tr>
<tr>
<td class="number">8</td>
<td>Ukraine</td>
</tr>
<tr>
<td class="number">8</td>
<td>New Zealand</td>
</tr>
<tr>
<td class="number">7</td>
<td>Finland</td>
</tr>
<tr>
<td class="number">7</td>
<td>Portugal</td>
</tr>
<tr>
<td class="number">6</td>
<td>Czech Republic</td>
</tr>
<tr>
<td class="number">5</td>
<td>Saudi Arabia</td>
</tr>
<tr>
<td class="number">5</td>
<td>Iceland</td>
</tr>
</table>
<p>Honourable mentions (1-5 visitors): Slovenia, Denmark, Romania, Greece, Republic of Korea, Serbia and Montenegro, Indonesia, Hong Kong, Philippines, Israel, Croatia, Estonia, Colombia, Peru, Slovakia, Thailand, Turkey, Malaysia, Chile, Puerto Rico, Latvia, Hungary, Belarus, Mexico, Kenya, Kuwait, Nigeria, Lithuania, Argentina, Bolivia, Europe, Iran, Islamic Republic of, Dominican Republic, Moldova, Republic of, Bulgaria, Jamaica, Egypt, United Arab Emirates, Kazakhstan.</p>
<p>I used the free version of <a href="http://www.maxmind.com/app/geolitecountry">GeoIP from MaxMind</a> to look up countries from IP addresses.  It&#8217;s not totally accurate, but good enough.</p>
<p>It&#8217;s very easy to use from Python, assuming you have the library installed:</p>
<pre>import GeoIP

# Load the country database into memory for fast lookups.
geo = GeoIP.new(GeoIP.GEOIP_MEMORY_CACHE)
print(geo.country_name_by_addr('4.4.4.4'))</pre>
<p>It&#8217;s not surprising that North America and Western Europe are well represented, but Russia, Brazil and Australia seem to have a good Django following also.</p>
<h3>We are the world</h3>
<p>Obviously this is just a sample of the Django community and may not be representative, but it does give an indication that Django developers are spread across the world and across the major platforms.  That can only be a good thing for the continued growth and success of the framework.</p>
]]></content:encoded>
			<wfw:commentRss>http://scottbarnham.com/blog/2007/09/29/django-developers-we-are-the-world/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Edit inline with ImageField or FileField in Django admin</title>
		<link>http://scottbarnham.com/blog/2007/08/22/edit-inline-with-imagefield-or-filefield-in-django-admin/</link>
		<comments>http://scottbarnham.com/blog/2007/08/22/edit-inline-with-imagefield-or-filefield-in-django-admin/#comments</comments>
		<pubDate>Wed, 22 Aug 2007 14:08:55 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Django]]></category>

		<guid isPermaLink="false">http://scottbarnham.com/blog/2007/08/22/edit-inline-with-imagefieldfilefield-in-django-admin/</guid>
		<description><![CDATA[Django admin lets you edit related model objects &#8220;inline&#8221;.  For example when editing a Recipe you can add/edit a group of Ingredient models.
Core fields for edit_inline
The related model being edited inline must specify one or more &#8220;core&#8221; fields using core=True.  If the core fields are filled in, the related model is added.  [...]]]></description>
			<content:encoded><![CDATA[<p>Django admin lets you edit related model objects &#8220;inline&#8221;.  For example when editing a <code>Recipe</code> you can add/edit a group of <code>Ingredient</code> models.</p>
<h3>Core fields for edit_inline</h3>
<p>The related model being edited inline must specify one or more &#8220;core&#8221; fields using <code>core=True</code>.  If the core fields are filled in, the related model is added.  If the core fields are empty, the related model is removed.</p>
<p>This works great for normal objects with <code>CharField</code>s, etc, but not so well if you want to have images or files uploaded using inline editing.  If the only core field is a <code>FileField</code> or <code>ImageField</code>, you&#8217;ll get strange behaviour like the file/image being removed when you edit an existing model in the admin.</p>
<h3>Using inline editing with ImageField or FileField</h3>
<p>In a recent project I wanted to have an item with title and description and zero or more photos.  The <code>Photo</code> model just has an <code>ImageField</code>.  To make it easy to edit, I wanted the photos set to <code>edit_inline</code>.</p>
<p>Here&#8217;s my first attempt:</p>
<pre>class Item(models.Model):
    title = models.CharField(max_length=100)
    description = models.TextField()

    class Admin:
        pass

class Photo(models.Model):
    item = models.ForeignKey(Item, related_name='photos', edit_inline=models.STACKED)
    image = models.ImageField(blank=False, upload_to='items', core=True)</pre>
<p>Notice that the <code>ImageField</code> in <code>Photo</code> has <code>core=True</code> to make it a core field.</p>
<p>This worked ok in the Django admin interface for adding a <code>Photo</code> to a new <code>Item</code>, but if I edited that <code>Item</code>, the <code>Photo</code> would be deleted.</p>
<p>This is a known issue (see <a href="http://code.djangoproject.com/ticket/2534">Ticket #2534</a>), but it&#8217;s marked as &#8220;pending design decision&#8221; and may be ignored for now since the Django admin is being rewritten to use <code>newforms</code>.</p>
<p>In the meantime I needed a workaround.</p>
<h3>Workaround using a different core field</h3>
<p>Instead of having the <code>ImageField</code> as a core field, we need something else.  If you&#8217;ve got some other natural data, such as a caption, that would work fine.</p>
<p>In my case, I didn&#8217;t want to add any other fields to the interface, so I went with a <code>BooleanField</code> that is not editable.</p>
<p>Here&#8217;s the revised <code>Photo</code> model:</p>
<pre>class Photo(models.Model):
    item = models.ForeignKey(Item, related_name='photos', edit_inline=models.STACKED)
    image = models.ImageField(blank=False, upload_to='items')
    keep = models.BooleanField(default=True, editable=False, core=True)

    def save(self):
        # Don't save if there is no image (since core field is always set).
        if not self.id and not self.image:
            return
        super(Photo, self).save()</pre>
<p>The keep field has been added and set to be core instead of the image field.  Since there is a core field and it&#8217;s not empty, the main <code>Item</code> model can be edited without the <code>Photo</code> models being deleted.</p>
<h3>Don&#8217;t save the empties</h3>
<p>The core field always has a value which means the <code>Photo</code> model is told to save even when its <code>ImageField</code> is empty.  To prevent creating these empty objects, the <code>Photo</code> model overrides <code>save()</code> and checks if an image was uploaded to the <code>ImageField</code>.  If not, it returns without saving.</p>
<p>The remaining issue is that you can&#8217;t delete a <code>Photo</code> using the Django admin interface.  You can replace the image, but will need some other method for deleting.  For me this isn&#8217;t a big problem, so the workaround solved the problem for now.</p>
<p>Note that this is just a workaround to the problem which hopefully will be fixed in Django at some point in the future.  Ideally, the admin interface would properly handle having an <code>ImageField</code> or <code>FileField</code> as the only core field of the related model and optionally put a &#8220;remove&#8221; checkbox in the UI to allow removing the image/file.</p>
]]></content:encoded>
			<wfw:commentRss>http://scottbarnham.com/blog/2007/08/22/edit-inline-with-imagefield-or-filefield-in-django-admin/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Uploading images to a dynamic path with Django</title>
		<link>http://scottbarnham.com/blog/2007/07/31/uploading-images-to-a-dynamic-path-with-django/</link>
		<comments>http://scottbarnham.com/blog/2007/07/31/uploading-images-to-a-dynamic-path-with-django/#comments</comments>
		<pubDate>Tue, 31 Jul 2007 20:25:46 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Django]]></category>

		<guid isPermaLink="false">http://scottbarnham.com/blog/2007/07/31/uploading-images-to-a-dynamic-path-with-django/</guid>
		<description><![CDATA[
Update: There&#8217;s a new method you should try first.  See: Dynamic upload paths in Django

Django makes it easy to upload images by adding an ImageField to your model.  The images are uploaded to your media path in a subdirectory specified with the upload_to parameter which can contain a date/time pattern like %Y/%m/%d.
class Photo(models.Model):
 [...]]]></description>
			<content:encoded><![CDATA[<div class="alert">
<strong>Update:</strong> There&#8217;s a new method you should try first.  See: <a href="http://scottbarnham.com/blog/2008/08/25/dynamic-upload-paths-in-django/">Dynamic upload paths in Django</a>
</div>
<p>Django makes it easy to upload images by adding an <code>ImageField</code> to your model.  The images are uploaded to your media path in a subdirectory specified with the <code>upload_to</code> parameter which can contain a date/time pattern like <code>%Y/%m/%d</code>.</p>
<pre>class Photo(models.Model):
    caption = models.CharField(blank=True, maxlength=100)
    image = models.ImageField(upload_to='photos/%Y/%m/%d')</pre>
<p>In this example, images will be uploaded to a path like:</p>
<p><code>/path/to/media/photos/2007/07/31/flowers.jpg</code>.</p>
<p>Sometimes you want to keep related images together, rather than spreading them over multiple date directories.  But if you have a lot of images, you won&#8217;t want them all stored in a single directory.</p>
<p>It would be nice if there was a way to upload to a directory specific to the model, perhaps a path incorporating the model object&#8217;s id or a unique slug.  Something like: </p>
<p><code>/path/to/media/photos/1234/flowers.jpg</code><br />
or<br />
<code>/path/to/media/photos/scotland-trip/castle.jpg</code><br />
<code>/path/to/media/photos/scotland-trip/bonnie-purple-heather.jpg</code></p>
<p>Django doesn&#8217;t have a standard way to do this at the moment (it&#8217;s pending a design decision according to <a href="http://code.djangoproject.net/ticket/4113">ticket #4113</a>).</p>
<p>I needed to do this in a project recently and tried various different approaches.  Here&#8217;s what I tried  &#8211; <a href="#working_solution">skip to the working solution</a> if you&#8217;re not interested in the failed attempts.</p>
<p><strong>Attempt 1</strong>: Specify <code>upload_to</code> dynamically</p>
<p>Why not make <code>upload_to</code> include the id of the model?</p>
<pre>class Photo(models.Model):
    caption = models.CharField(blank=True, maxlength=100)
    image = models.ImageField(upload_to='photos/%d' % self.id)</pre>
<p>Because there is no <code>self</code>; that&#8217;s why.  Django builds the model with the fields we specify, but at that point, there is no instance of the model, so <code>self</code> is meaningless.  Whatever we put in <code>upload_to</code> here will apply to all instances of the model.</p>
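<p>You can see the same thing in plain Python without Django.  In this minimal sketch the class body runs once, at definition time, and trips over the missing <code>self</code>.  The <code>Photo</code> class here is just a bare stand-in.</p>
<pre>try:
    class Photo(object):
        # The class body runs once, when the class is defined.
        # No instance exists yet, so the name "self" is undefined here.
        upload_to = 'photos/%d' % self.id
except NameError as error:
    message = str(error)

print(message)  # name 'self' is not defined</pre>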
<p><strong>Attempt 2</strong>: Set <code>upload_to</code> on save</p>
<p>How about overriding the <code>save</code> method of the model and setting <code>upload_to</code> on the image field at that point.  Something like:</p>
<pre>class Photo(models.Model):
    caption = models.CharField(blank=True, maxlength=100)
    image = models.ImageField(upload_to='photos')

    def save(self):
        for field in self._meta.fields:
            if field.name == 'image':
                field.upload_to = 'photos/%d' % self.id
        super(Photo, self).save()</pre>
<p>It&#8217;s a bit icky having to iterate through <code>self._meta.fields</code> to find the right one.  But a bigger problem is the image file may well be written to the path before <code>save</code> is called.</p>
<p>Usually you would call <code>photo.save_image_file(filename, content)</code> to save the file content then <code>photo.save()</code> to save the model&#8217;s fields, including the path in the image field.  By the time we set <code>upload_to</code>, it&#8217;s too late.</p>
<p><strong>Attempt 3</strong>: Override <code>ImageField</code> and pass a callable for <code>upload_to</code></p>
<p>Taking a different approach, how about making a new class that derives from ImageField and takes either a new parameter or a callable for the <code>upload_to</code> parameter.  Something like:</p>
<pre>class SpecialImageField(ImageField):

    def get_directory_name(self):
        if callable(self.upload_to):
            return self.upload_to()
        else:
            return super(SpecialImageField, self).get_directory_name()</pre>
<p>We override the <code>get_directory_name</code> method which is actually defined in <code>FileField</code> (from which <code>ImageField</code> inherits).</p>
<p>The problem is, we&#8217;re still having to pass something (a callable) for the <code>upload_to</code> parameter.  Again, at the time the <code>ImageField</code> is created, we are not in an instance of the model, so we can&#8217;t pass it a model method.  We could pass a module-level function, but that&#8217;s not enough information; we want to set the path using something in the model instance.</p>
<p><strong id="working_solution">Attempt 4 &#8211; the one that worked</strong>: Override <code>ImageField</code>, get the model instance and ask it</p>
<p>After going round in circles and learning a few things on the way, I came across <a href="http://code.djangoproject.com/wiki/CustomUploadAndFilters">this page in the Django wiki</a> (mental note: check wiki first in future).</p>
<p>It shows how a custom field can get hold of the model instance using dispatcher.  I wrote <code>CustomImageField</code> to get the model instance when the model instance is initialised and ask it to supply a new <code>upload_to</code> path.</p>
<p>The field looks like this:</p>
<pre>from django.db.models import ImageField, signals
from django.dispatch import dispatcher

class CustomImageField(ImageField):
    """Allows model instance to specify upload_to dynamically.

    Model class should have a method like:

        def get_upload_to(self, attname):
            return 'path/to/%d' % self.id

    Based on: http://code.djangoproject.com/wiki/CustomUploadAndFilters
    """
    def contribute_to_class(self, cls, name):
        """Hook up events so we can access the instance."""
        super(CustomImageField, self).contribute_to_class(cls, name)
        dispatcher.connect(self._post_init, signals.post_init, sender=cls)

    def _post_init(self, instance=None):
        """Get dynamic upload_to value from the model instance."""
        if hasattr(instance, 'get_upload_to'):
            self.upload_to = instance.get_upload_to(self.attname)

    def db_type(self):
        """Required by Django for ORM."""
        return 'varchar(100)'</pre>
<p>Note the <code>db_type</code> method which replaces the <code>get_internal_type</code> method in Django trunk.  It is used when you run <code>manage.py syncdb</code> to know what field type to create in the database.</p>
<p>The field is used in a model like this:</p>
<pre>class Photo(models.Model):
    caption = models.CharField(blank=True, maxlength=100)
    image = CustomImageField(upload_to='photos')

    def get_upload_to(self, field_attname):
        """Get upload_to path specific to this photo."""
        return 'photos/%d' % self.id</pre>
<p><code>get_upload_to</code> is passed the <code>attname</code> of the field in that model (in this case &#8220;image&#8221;).  This is so the model can distinguish between multiple custom image fields.</p>
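<p>If the signal plumbing seems like magic, here is a stripped-down imitation in plain Python.  It is not Django&#8217;s dispatcher: the <code>Signal</code> and <code>DynamicPathField</code> classes are toy stand-ins invented for this sketch, just to show how a field defined on the class can get hold of each instance.</p>
<pre>class Signal(object):
    """Tiny stand-in for a signal: receivers are registered per sender class."""
    def __init__(self):
        self.receivers = []

    def connect(self, receiver, sender):
        self.receivers.append((receiver, sender))

    def send(self, sender, instance):
        for receiver, wanted in self.receivers:
            if wanted is sender:
                receiver(instance=instance)

post_init = Signal()

class DynamicPathField(object):
    """Stand-in for the custom field: one object shared by the whole class."""
    def __init__(self, attname):
        self.attname = attname
        self.upload_to = 'dummy'

    def contribute_to_class(self, cls):
        # Ask to be told whenever an instance of cls is initialised.
        post_init.connect(self._post_init, sender=cls)

    def _post_init(self, instance=None):
        if hasattr(instance, 'get_upload_to'):
            self.upload_to = instance.get_upload_to(self.attname)

class Photo(object):
    image = DynamicPathField('image')

    def __init__(self, pk):
        self.id = pk
        post_init.send(sender=Photo, instance=self)

    def get_upload_to(self, attname):
        return 'photos/%d' % self.id

Photo.image.contribute_to_class(Photo)

Photo(12345)
print(Photo.image.upload_to)  # photos/12345
Photo(7)
print(Photo.image.upload_to)  # photos/7</pre>
<p>Notice that in this sketch the field object belongs to the class, so its <code>upload_to</code> always reflects whichever instance was initialised most recently.</p>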
<p>Ok, so the bit I&#8217;ve glossed over is that the model may not have an id at the time <code>get_upload_to</code> is called.  If the model is new and hasn&#8217;t been saved you&#8217;ll need to save it or work something else out before you can return the dynamic <code>upload_to</code> path.  But that was always the case, so I&#8217;m not taking the blame.</p>
<p>In my case, the <code>Photo</code> model was related to some other model (e.g. <code>Room</code>) which I called to get the path.  It didn&#8217;t matter that <code>Photo</code> didn&#8217;t have an id because it was related to something that did.</p>
<p>So now I get to save images to paths like:</p>
<p><code>/path/to/media/photos/12345/front.jpg</code><br />
<code>/path/to/media/photos/12345/rooms/kitchen.jpg</code><br />
etc</p>
<p><strong>Update &#8211; 20 May 2008:</strong></p>
<p>Here&#8217;s a small update to the <code>CustomImageField</code> class.  The version above listens for the <code>post_init</code> signal and uses it to get the dynamic upload path.  This works fine when you use it like:</p>
<pre>photo = Photo.objects.create(...)</pre>
<p>Calling <code>create</code> saves the object and loads it so that <code>post_init</code> gets called.  However, if you create the model object and upload a file before saving, it will not know about the dynamic upload path.</p>
<p>The version below listens for the <code>pre_save</code> signal instead and gets the dynamic upload path at that point.  You can use it like:</p>
<pre>photo = Photo()
photo.save_image_file(filename, content)</pre>
<p>Note that you may still need to save the model before uploading if your dynamic path includes the model id (which is not set until it is saved).</p>
<p>Here is the new version of the field:</p>
<pre>from django.db.models import ImageField, signals
from django.dispatch import dispatcher

class CustomImageField(ImageField):
    """Allows model instance to specify upload_to dynamically.

    Model class should have a method like:

        def get_upload_to(self, attname):
            return 'path/to/%d' % self.id

    Based on: http://code.djangoproject.com/wiki/CustomUploadAndFilters
    """
    def __init__(self, *args, **kwargs):
        if 'upload_to' not in kwargs:
            kwargs['upload_to'] = 'dummy'
        # Take our own option out before the parent class sees the kwargs.
        self.prime_upload = kwargs.pop('prime_upload', False)
        super(CustomImageField, self).__init__(*args, **kwargs)

    def contribute_to_class(self, cls, name):
        """Hook up events so we can access the instance."""
        super(CustomImageField, self).contribute_to_class(cls, name)
        if self.prime_upload:
            dispatcher.connect(self._get_upload_to, signals.post_init, sender=cls)
        dispatcher.connect(self._get_upload_to, signals.pre_save, sender=cls)

    def _get_upload_to(self, instance=None):
        """Get dynamic upload_to value from the model instance."""
        if hasattr(instance, 'get_upload_to'):
            self.upload_to = instance.get_upload_to(self.attname)

    def db_type(self):
        """Required by Django for ORM."""
        return 'varchar(100)'</pre>
<p>In some cases you will still want the path to be specified when the model is initialised rather than saved.  If you are editing a model and want to be able to save a new image without saving the model first, it needs to get the dynamic upload path when the <code>post_init</code> signal is raised.</p>
<p>This new <code>CustomImageField</code> takes an optional <code>prime_upload</code> argument.  If true, it will also listen for the <code>post_init</code> event and get the dynamic upload path.  You can use it like:</p>
<pre>class Photo(models.Model):
    image = CustomImageField(prime_upload=True)</pre>
<pre>photo = Photo.objects.get(pk=photo_id)
photo.save_image_file(filename, content)</pre>
<p>This is all a bit fiddly still, but it does the job until Django has a native way to specify the upload path per-instance.</p>
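<p>Later versions of Django do have that: <code>upload_to</code> can be a callable which is passed the model instance and the original filename.  Here is a minimal sketch of such a callable, written as a plain function so it can be tried without Django.  The function name and the <code>FakePhoto</code> class are made up for illustration.</p>
<pre>def photo_upload_to(instance, filename):
    """Build a per-instance path from the model instance and the filename."""
    return 'photos/%d/%s' % (instance.id, filename)

class FakePhoto(object):
    """Stand-in for a saved model instance (illustration only)."""
    id = 12345

print(photo_upload_to(FakePhoto(), 'front.jpg'))  # photos/12345/front.jpg

# In a real model this would be wired up as:
#     image = models.ImageField(upload_to=photo_upload_to)</pre>
<p>As before, the instance needs an id (or something else unique) before the path can be built.</p>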
]]></content:encoded>
			<wfw:commentRss>http://scottbarnham.com/blog/2007/07/31/uploading-images-to-a-dynamic-path-with-django/feed/</wfw:commentRss>
		<slash:comments>21</slash:comments>
		</item>
	</channel>
</rss>


