Serving websites from svn checkout considered harmful

22 April 2008

Serving from a working copy

A simple way to update sites is to serve them from Subversion working copies. Check out the code on the server, develop and commit changes, then run svn update on the server when you’re ready to release.

Security concerns

There’s a potential security problem with this. Subversion keeps track of meta-data and original versions of files by storing them in .svn directories in the working copy. If your web server allows requests that include these .svn directories, anything within them could be served to whoever requests it.

Requests would look like:

http://example.com/stuff/.svn/entries
http://example.com/stuff/.svn/text-base/page.php.svn-base
http://example.com/stuff/.svn/text-base/settings.py.svn-base

The first one would reveal some meta-data about your project, such as file paths, repository URLs and usernames.

The second one may be interpreted as a PHP script, in which case there’s little risk. Or it may return the PHP source file, which is a much bigger risk.

The third one (assumed to be a Django project) should never happen. Requests can only reach files within the web server’s document root, and with Django the code itself doesn’t need to be there; only media files do.

Alternatives

Instead of serving sites from a working copy, you can use svn export to get a “clean” copy of the site which does not include .svn directories. If you svn export from the repository, you must export the complete site, rather than just update the changed files, which could be a lot more data.

However, you can svn export from a working copy on the server. It’s still a complete export, but you don’t have to trouble the repository, so it’s typically much quicker.

An alternative is to update a working copy which is stored on the server, but not in the web document root, then use rsync or some file copying to update the “clean” copy in the web document root. In this case, only changed files are affected.
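The copy step can also be sketched in Python. This is only a rough sketch, not a replacement for rsync: the function name export_clean_copy and the paths are hypothetical, and it recopies the whole tree each time rather than touching only changed files.

```python
import shutil
from pathlib import Path


def export_clean_copy(working_copy, document_root):
    """Copy working_copy to document_root, leaving out .svn directories.

    Hypothetical helper: it replaces document_root wholesale, so unlike
    rsync it does not limit itself to changed files.
    """
    source = Path(working_copy)
    target = Path(document_root)
    if target.exists():
        shutil.rmtree(target)
    # ignore_patterns skips any file or directory named ".svn" at every level.
    shutil.copytree(source, target, ignore=shutil.ignore_patterns(".svn"))
    return target
```

Because the old document root is deleted before the copy, the site is briefly missing while it runs, so treat it as an illustration of filtering out .svn rather than a deployment tool.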

Protection through web server config

If you do serve from working copies, you should configure the web server to block all requests which include .svn in the URL. Here’s how to do it for some popular web servers:

Apache

<LocationMatch ".*\.svn.*">
    Order allow,deny
    Deny from all
</LocationMatch>

Lighttpd

$HTTP["url"] =~ ".*\.svn.*" {
  url.access-deny = ("")
}

Nginx

Use the location directive, which must appear inside a server block.

server {
    location ~ \.svn { deny all; }
    ...
}
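
To see which requests a rule like these would catch, the same pattern can be tried out in Python. This is just a sketch for experimenting with the regular expression; is_blocked is a made-up name and nothing here talks to a web server.

```python
import re

# Same idea as the server rules above: refuse any URL path containing ".svn".
SVN_PATTERN = re.compile(r"\.svn")


def is_blocked(url_path):
    """Return True if the path contains ".svn" and should be denied."""
    return SVN_PATTERN.search(url_path) is not None


for path in ("/stuff/.svn/entries", "/stuff/page.php"):
    print(path, is_blocked(path))
```

Note the pattern matches ".svn" anywhere in the path, so a legitimate file with ".svn" in its name would be blocked too.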
Filed under: Hosting,Security — Scott @ 9:48 pm

ImageField and edit_inline revisited

24 February 2008

A while back I wrote about using edit inline with image and file fields. Specifically, I suggested adding an uneditable BooleanField as the core field of the related model. This means you don’t have to set the ImageField or FileField to be core (which would cause confusing behaviour).

Removing the related model

The downside to having an uneditable core field is that you can’t remove the related model instance using admin. At the time, I wasn’t troubled by this so I just left it. In a recent project I needed to associate photos with articles, use edit_inline for the photos and be able to remove them. So here’s an extended workaround.

As well as the uneditable BooleanField (“keep”) which keeps the ArticlePhoto from being deleted, we now have a “remove” BooleanField which the user can tick in admin to cause the ArticlePhoto to be deleted. The check for this is in the save() method.

class ArticlePhoto(models.Model):
    article = models.ForeignKey(Article, related_name='photos', edit_inline=models.TABULAR, min_num_in_admin=5)
    keep = models.BooleanField(core=True, default=True, editable=False)
    remove = models.BooleanField(default=False)
    image = CustomImageField()

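    def save(self):
        # Skip blank inline rows: not saved before and no image uploaded.
        if not self.id and not self.image:
            return
        # The "remove" checkbox was ticked in admin, so delete instead of saving.
        if self.remove:
            self.delete()
        else:
            super(ArticlePhoto, self).save()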

It’s a pretty easy way to work around the problem and gives a sensible looking “remove” checkbox in the admin interface. The database table will have a “remove” column that never gets used, but it’s a pretty small price to pay.
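
The three outcomes of that save() method can be written out as a plain function, which makes the order of the checks easier to see. This is a sketch without Django; inline_photo_action and its arguments are hypothetical stand-ins for self.id, self.image and self.remove.

```python
def inline_photo_action(is_saved, has_image, remove):
    """Mirror the branch order of ArticlePhoto.save() above."""
    # A blank inline row: never saved and nothing uploaded.
    if not is_saved and not has_image:
        return "skip"
    # The user ticked the "remove" checkbox.
    if remove:
        return "delete"
    return "save"


print(inline_photo_action(is_saved=False, has_image=False, remove=False))
print(inline_photo_action(is_saved=True, has_image=True, remove=True))
print(inline_photo_action(is_saved=True, has_image=True, remove=False))
```

The blank-row check comes first, so an empty extra row in admin is ignored whatever the state of its “remove” checkbox.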

Filed under: Django — Scott @ 9:12 pm

getSelection() returns empty in Google Mail

21 February 2008

Getting selected text in a Firefox extension

I’m developing a Firefox extension for a client which does something with the currently selected text in the browser window.

The standard way to get the selection is with window.content.getSelection().

Selection is empty in Gmail

Some users reported that selected text in Gmail messages wasn’t being found by the extension. I suspect the issue is with content added using JavaScript, but I haven’t investigated.

An alternative way to getSelection

The standard Search Google for “whatever” contextual menu item does work in Gmail, so obviously it gets the selection another way.

I found a function getBrowserSelection() in the browser.js file in Firefox’s chrome. It is used by Firefox for the contextual menu search.

This is how it gets the selection:

var focusedWindow = document.commandDispatcher.focusedWindow;
var selection = focusedWindow.getSelection();

I don’t know what the difference is, but I am now using this code in my Firefox extension and it is working well.

Filed under: Uncategorized — Scott @ 12:23 pm

Which web hosting company is best

30 January 2008

Choosing the best web host

Sometimes friends and clients ask me to recommend a web hosting company. For the past couple of years I’ve done my own hosting on a VPS, so I don’t spend much time with shared web hosting accounts. But there are a few I’ve used or heard good things about, so here’s what I normally recommend.

Big web hosting companies

Some of the big boys are:

Hosting Facebook apps

I recently had a bad experience with DreamHost. I developed a Facebook app for a client and suggested DreamHost because I’d heard they were pretty good. The server was slow in responding at times which caused Facebook to show an error.

Facebook gives app servers about 10 seconds to respond and if they don’t, it tells the user there’s a problem. That seems fair enough; I like my websites to respond in about 1 second. But whereas a determined user can wait for a slow website to respond, they don’t get the option of waiting for a slow Facebook app. For Facebook apps, responsiveness counts.

The Facebook app I developed is now on a VPS and is much more responsive.

Overselling

Most web hosting companies oversell resources. This means they give customers lots of disk space and bandwidth on the assumption most won’t use anywhere near the amount. If everyone actually used that amount, there’s no way the host could deliver.

The poster child for overselling is probably DreamHost, which currently offers 500 GB of disk space and 5 TB of monthly bandwidth for a fistful of dollars.

Overselling is part of the business and not much can be inferred from the numbers. DreamHost is probably no better or worse than another host that offers more or less disk space and bandwidth. There’s just no guarantee of what you are getting. Your account is lumped in with hundreds of others and the performance you get depends on what these neighbours are doing.

Smaller webhosts try harder

I would consider a smaller hosting company like:

I’ve heard good things about A Small Orange.

WebFaction has support for Rails and Django apps and generally seems a bit more savvy and flexible than the big boys.

VPS hosting for speed and flexibility

An alternative is to have your own VPS (Virtual Private Server). You have full control and usually very good performance, but need more geek skills.

Having a VPS is just like having your own dedicated server, but instead of your own machine, there are several virtual machines running on one physical server. Split the resources and split the cost.

Some VPS packages come with a control panel such as Webmin or cPanel. So if you know what you’re doing, but are not geek enough to do everything on the command line, a VPS may still be an option for you.

Dedicated resources with Xen

There are different virtualisation packages that allow hosting companies to split a physical server into multiple virtual machines. Two of the big ones are Xen and Virtuozzo.

I recommend going for a Xen VPS. With Xen, a fixed amount of memory is assigned to each virtual machine. It’s not possible for the hosting company to oversell resources. This effectively limits how many VPSes can be run on one physical server, which gives you a much better idea of the resources dedicated to you.

Here are some good VPS hosts:

I’ve had a UK based VPS from Xtraordinary Hosting for about 18 months and I’ve been delighted with it. Rock solid servers, very good performance and responsive and helpful technical staff. I highly recommend them if you need a VPS in the UK.

RimuHosting offers Xen VPSes mainly hosted in the US, but with an option to host in the UK or Australia. When DreamHost wasn’t delivering the goods for a Facebook app, I moved it to a VPS on RimuHosting. It too has been fast and reliable.

Slicehost has great prices and has been generating a lot of positive buzz. The servers are hosted in the US, so if you’re looking for a US based Xen VPS, consider Slicehost.

When good hosts go bad

Sometimes good web hosting companies start to suck. If that happens to your web host, you might need to jump ship. The golden rule is to always register your domain names yourself using a domain registrar and not get them as part of your web hosting package. That way, moving to another host is just a matter of re-pointing your domains.

Read the Reviews

It’s worth reading some web hosting reviews to see what other customers say about a hosting company. But choose your reviews site carefully as some are full of shill reviews or are even operated by the hosting companies themselves!

Filed under: Hosting — Scott @ 11:30 pm

SysLogHandler not writing to syslog with Python logging

1 January 2008

Logging to syslog in Python

I was trying to use the standard Python logging module to write messages to syslog. The logging module has a SysLogHandler class which can log to a local or remote syslog daemon.

With no host specified, SysLogHandler uses localhost which is what I wanted. I tried to use SysLogHandler, but it just wouldn’t work. There was no error when I called the logging methods, but my messages didn’t show up in /var/log/syslog.

syslog module works

Python also has a standard syslog module. I tried it and it worked fine; my messages were written to the syslog file.

For example:

import syslog
syslog.syslog('test')

syslogd isn’t listening

After running Wireshark I found the SysLogHandler was correctly sending a UDP packet to localhost on port 514. I could also see there was an ICMP response indicating the UDP packet was not received on that port. syslog wasn’t listening!

Use /dev/log

Instead of sending to localhost, I wanted SysLogHandler to pass the message to syslog on the local machine in the same way the syslog Python module was doing.

The solution is to pass /dev/log as the address parameter to SysLogHandler. It’s not well documented, but it works.

For example:

import logging
from logging.handlers import SysLogHandler

logger = logging.getLogger()
logger.setLevel(logging.INFO)
syslog = SysLogHandler(address='/dev/log')
formatter = logging.Formatter('%(name)s: %(levelname)s %(message)s')
syslog.setFormatter(formatter)
logger.addHandler(syslog)

Easy when you know how.
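
If the same code has to run on machines that may not have /dev/log, one option is to check for the socket first. This is a sketch under my own assumptions: pick_syslog_address and build_syslog_logger are hypothetical helper names, and the UDP fallback will fail just as silently as before if syslogd isn’t listening on port 514.

```python
import logging
import os
from logging.handlers import SysLogHandler


def pick_syslog_address(socket_path="/dev/log"):
    """Return the local socket path if it exists, else the UDP default."""
    if os.path.exists(socket_path):
        return socket_path
    # Same target SysLogHandler uses when no address is given.
    return ("localhost", 514)


def build_syslog_logger(name, address):
    """Create a logger that writes to syslog at the given address."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = SysLogHandler(address=address)
    handler.setFormatter(
        logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

Usage would be along the lines of build_syslog_logger("myapp", pick_syslog_address()).info("test"), where "myapp" is a placeholder name.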

Filed under: Python — Scott @ 11:21 pm

Case-insensitive ordering with Django and PostgreSQL

20 November 2007

When the Django Gigs site first went live we noticed the ordering of developers by name was not right. Those starting with an uppercase letter were coming before those starting with a lowercase letter.

PostgreSQL and the locale

PostgreSQL has a locale setting which is configured when the cluster is created. Among other things, this affects the ordering of results when you use the SQL order by clause.

The locale on my server was set to “C” which means it uses byte-level comparisons, rather than following more complex rules for a given culture. Although this is apparently good for performance, it means order by will be case sensitive – e.g. “Zebra” comes before “apple”.
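
The effect is easy to reproduce outside the database. Python’s default string sort compares code points, which for plain ASCII names gives the same result as the byte-level comparison described above. The names below are made up for illustration.

```python
names = ["apple", "Zebra", "mango", "Banana"]

# Default sort: every uppercase letter sorts before every lowercase letter.
byte_order = sorted(names)

# Sorting on a lowercased key ignores case, like "order by lower(name)".
folded_order = sorted(names, key=str.lower)

print(byte_order)
print(folded_order)
```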

Depending on how your system is set up, you may have locales such as en_GB. The locale can’t easily be changed in PostgreSQL because indexes and other data depend on it. To change locale, you need to start a new cluster and move databases to it.

Django and case-sensitivity

Django provides the order_by() function on QuerySets, but does not have an option for case insensitive ordering. Instead this is left to your database configuration.

When using SQL directly, you can sort case-insensitively using the PostgreSQL lower() function.

e.g.

select * from developer order by lower(name)
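
The query can be tried without a PostgreSQL server by using Python’s built-in sqlite3 module. This is a stand-in, not the setup from this post: SQLite’s default collation also compares bytes, so for ASCII names it shows the same behaviour as the “C” locale. The table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table developer (name text)")
conn.executemany(
    "insert into developer (name) values (?)",
    [("apple",), ("Zebra",), ("mango",)],
)

# Plain ordering is case sensitive: "Zebra" sorts first.
plain = [row[0] for row in
         conn.execute("select name from developer order by name")]

# Ordering on lower(name) ignores case.
folded = [row[0] for row in
          conn.execute("select name from developer order by lower(name)")]

print(plain)
print(folded)
```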

One way to do this in Django is to use extra to call the lower() function, creating a virtual column which you can then order by.

e.g.

Developer.objects.all().extra(
    select={'lower_name': 'lower(name)'}
).order_by('lower_name')

Using SQL functions could tie you to a particular database, though in this case the lower() function is standard and should work with most databases. Some other databases do case-insensitive comparisons so wouldn’t need it.

Filed under: Django — Scott @ 8:38 pm

Django developers: We are the world

29 September 2007

An informal survey of the Django community

This week, Andrew and I launched the Django Gigs website to help employers find Django developers. Andrew wrote about it and thanks to the Django Community feed aggregator we had quite a few visitors in the first couple of days.

It’s clear that Django is catching on and growing in popularity. The djangoproject.com site is getting close to 8 million hits each month. I thought it would be interesting to analyse my logs and see what I could tell about the Django community, or at least the section of it that read the blog and visited the Django Gigs website.

Visitors

1280 unique IP addresses

The number of IP addresses seems a pretty good indication of how many unique visitors we had in about two days.

Platforms

510 Windows
373 Mac OS X (including 4 iPhones)
312 Linux
85 Other (mostly bots, feed aggregator sites, a handful of BSD)

The platforms are split fairly evenly among Windows, Mac and Linux, which, given the dominance of Windows on the desktop, suggests Django is disproportionately popular with Mac OS X and Linux users. I suspect this is the case with Python in general, but I don’t have any stats to back that up.

Browsers

875 Firefox
148 Safari
40 IE
36 Camino
13 Konqueror
168 Other (mostly bots or feed readers like NetNewsWire)

No big surprise here: Firefox is the daddy.

One thing that surprised me was the number of different user agents. There were 408 unique user agent strings! Of course, most of them were from different versions of the same software. IE on Windows likes to report versions of the .NET framework and various browser extensions installed on the machine.
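
Counts like the unique IP addresses and user agent strings can be pulled from an access log with a few lines of Python. This is a sketch of one way to do it, not the script used for the numbers above. It assumes the common “combined” log format, where the line starts with the client IP and ends with the quoted user agent; summarise is a hypothetical name.

```python
import re
from collections import Counter

# First field is the client IP; the last quoted string is the user agent.
LOG_LINE = re.compile(r'^(?P<ip>\S+) .*"(?P<agent>[^"]*)"$')


def summarise(lines):
    """Return (set of unique IPs, Counter of user agent strings)."""
    ips = set()
    agents = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that are not in the expected format
        ips.add(match.group("ip"))
        agents[match.group("agent")] += 1
    return ips, agents
```

Calling summarise(open("access.log")) on a real log (the file name is a placeholder) gives the set of addresses and a Counter whose length is the number of distinct user agent strings.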

Countries

423 United States
133 France
126 United Kingdom
67 Germany
61 Canada
45 Russian Federation
42 Brazil
34 Australia
33 Netherlands
23 Italy
16 Belgium
16 China
16 Spain
15 Poland
15 Sweden
14 India
13 Norway
13 Singapore
13 Switzerland
13 Austria
13 Japan
9 Ireland
8 Ukraine
8 New Zealand
7 Finland
7 Portugal
6 Czech Republic
5 Saudi Arabia
5 Iceland

Honourable mentions (1-5 visitors): Slovenia, Denmark, Romania, Greece, Republic of Korea, Serbia and Montenegro, Indonesia, Hong Kong, Philippines, Israel, Croatia, Estonia, Colombia, Peru, Slovakia, Thailand, Turkey, Malaysia, Chile, Puerto Rico, Latvia, Hungary, Belarus, Mexico, Kenya, Kuwait, Nigeria, Lithuania, Argentina, Bolivia, Europe, Iran (Islamic Republic of), Dominican Republic, Moldova (Republic of), Bulgaria, Jamaica, Egypt, United Arab Emirates, Kazakhstan.

I used the free version of GeoIP from MaxMind to look up countries from IP addresses. It’s not totally accurate, but good enough.

It’s very easy to use from Python, assuming you have the library installed:

import GeoIP
geo = GeoIP.new(GeoIP.GEOIP_MEMORY_CACHE)
print geo.country_name_by_addr('4.4.4.4')

It’s not surprising that North America and Western Europe are well represented, but Russia, Brazil and Australia seem to have a good Django following also.

We are the world

Obviously this is just a sample of the Django community and may not be representative, but it does give an indication that Django developers are spread across the world and across the major platforms. That can only be a good thing for the continued growth and success of the framework.

Filed under: Django — Scott @ 1:03 pm

VMWare on Ubuntu Linux with bridged network to XP

23 August 2007

XP on Linux

I run Ubuntu on my main machine, but needed Windows for a project for one of my clients. I installed the free VMWare Server from the Ubuntu commercial repository and installed Windows XP Pro on a virtual machine.

VMWare networking modes

There are three different networking modes in VMWare to give the virtual machine network access:

Host-only
A private network between the host and VM. The VM can’t be accessed by other machines on the network.
NAT
The VM shares the IP address of the host.
Bridged
The VM has its own IP address and can be accessed by the host and other machines on the network as if it was a separate box.

I wanted to keep my virtual Windows box as isolated as possible for security reasons – Windows boxes get compromised so easily. I used bridged networking to give the virtual machine its own IP address and blocked outgoing Internet access for that machine on my router firewall.

Networking problems with Samba/SMB

I wanted to share files between the Linux host and the Windows virtual machine. I used Samba on Linux to share some directories then tried to connect to them from the Windows VM. It couldn’t connect and just timed out without a helpful error message.

After messing with Samba for a while and reading the VMWare Samba docs I was no further forward. I tried using IE on Windows to connect to the web server on my Linux box. No dice. It timed out as well.

It’s the network card settings

I read some discussion in the VMWare forums about similar problems using bridged networking, but working fine with NAT.

This led to the answer on Launchpad – the problem was the network card. Apparently some network cards optimise by discarding packets they have already seen. Because the networking is effectively between two machines on the same network card, some of the data was getting lost.

The solution is to disable these settings on the Ethernet card using:

ethtool -K eth0 sg off rx off tx off

or

ethtool -K eth0 sg off rx off tx off tso off

depending on the settings supported by your network card.

I ran this command and it worked immediately. Note that when you reboot you will need to issue this command again. You could add it to /etc/rc.local or similar to have it issued automatically.

Filed under: Linux,VMWare — Scott @ 1:18 pm

Edit inline with ImageField or FileField in Django admin

22 August 2007

Django admin lets you edit related model objects “inline”. For example, when editing a Recipe you can add or edit a group of Ingredient models.

Core fields for edit_inline

The related model being edited inline must specify one or more “core” fields using core=True. If the core fields are filled in, the related model is added. If the core fields are empty, the related model is removed.

This works great for normal objects with CharFields, etc, but not so well if you want to have images or files uploaded using inline editing. If the only core field is a FileField or ImageField, you’ll get strange behaviour like the file/image being removed when you edit an existing model in the admin.

Using inline editing with ImageField or FileField

In a recent project I wanted to have an item with title and description and zero or more photos. The Photo model just has an ImageField. To make it easy to edit, I wanted the photos set to edit_inline.

Here’s my first attempt:

class Item(models.Model):
    title = models.CharField(max_length=100)
    description = models.TextField()

    class Admin:
        pass

class Photo(models.Model):
    item = models.ForeignKey(Item, related_name='photos', edit_inline=models.STACKED)
    image = models.ImageField(blank=False, upload_to='items', core=True)

Notice that the ImageField in Photo has core=True to make it a core field.

This worked ok in the Django admin interface for adding a Photo to a new Item, but if I edited that Item, the Photo would be deleted.

This is a known issue (see Ticket #2534), but it’s marked as “pending design decision” and may be ignored for now since the Django admin is being rewritten to use newforms.

In the meantime I needed a workaround.

Workaround using a different core field

Instead of having the ImageField as a core field, we need something else. If you’ve got some other natural data, such as a caption, that would work fine.

In my case, I didn’t want to add any other fields to the interface, so I went with a BooleanField that is not editable.

Here’s the revised Photo model:

class Photo(models.Model):
    item = models.ForeignKey(Item, related_name='photos', edit_inline=models.STACKED)
    image = models.ImageField(blank=False, upload_to='items')
    keep = models.BooleanField(default=True, editable=False, core=True)

    def save(self):
        # Don't save if there is no image (since core field is always set).
        if not self.id and not self.image:
            return
        super(Photo, self).save()

The keep field has been added and set to be core instead of the image field. Since there is a core field and it’s not empty, the main Item model can be edited without the Photo models being deleted.

Don’t save the empties

The core field always has a value which means the Photo model is told to save even when its ImageField is empty. To prevent creating these empty objects, the Photo model overrides save() and checks if an image was uploaded to the ImageField. If not, it returns without saving.

The remaining issue is that you can’t delete a Photo using the Django admin interface. You can replace the image, but will need some other method for deleting. For me this isn’t a big problem, so the workaround solved the problem for now.

Note that this is just a workaround to the problem which hopefully will be fixed in Django at some point in the future. Ideally, the admin interface would properly handle having an ImageField or FileField as the only core field of the related model and optionally put a “remove” checkbox in the UI to allow removing the image/file.

Filed under: Django — Scott @ 3:08 pm

Replacing door seal on Hoover WDM-130 Washing Machine

19 August 2007

I recently replaced the rubber door gasket on a Hoover washer dryer. It’s the first time I’ve done any repairs on a washing machine and I didn’t find much information online. I haven’t read the Haynes Washing Machine Manual but it might have more information.

I worked on a Hoover WDM-130, but the procedure will likely be the same for other Hoover and Candy washing machines and possibly other makes.

Mouldy door seal and brown marks on clothes

The door seal, also known as the door gasket, is the rubber that goes between the drum and the door and allows the drum to bounce around without any water escaping. It has lots of folds and these can get mouldy, start to smell and possibly leave stains on clothes.

There’s some good advice from UK Whitegoods on how to prevent it happening, such as leaving the door open to allow the seal to dry and doing a monthly “service wash” (run it through empty at a high temperature).

Once the door seal has mould, it’s difficult to get rid of. My washing machine was leaving brown marks on clothes after they had been through a wash, so I had to do something about it.

Replacing the door seal is not super difficult, but it’s quite awkward to get to. If you’re fairly handy and have suitable tools, you should be ok.

Replacement rubber gasket

I bought a replacement door seal from UK Whitegoods (part number 91620118 – Candy Hoover Door Seal). It was a good price and delivered quickly. The part looks green in their picture, but it is grey as you’d expect!

Tools you will need

  • 7 mm socket
  • 7 mm spanner
  • 10 mm socket
  • 13 mm socket (maybe)
  • long T-bar handle for 7 mm socket (about 30 cm long) – or other way of turning it (I used a screwdriver)

If you don’t already have the tools, you might consider the Draper socket set and Draper spanner set.

Getting inside the Hoover washing machine

Remember to disconnect the electricity from the washer dryer before opening it up.

I expected the whole front panel to come off, but the front is welded on. The sides and back are one piece, so the only ways in are from the top and through the door hole. For other jobs there is a removable hatch at the back and you might get to some things through the bottom, but most access will be through the top.

Remove the three screws along the top at the back of the machine to remove the plastic trim. The top (wood/melamine) slides out the back.

Inside top view

You’ll see a metal duct that goes into a sleeve at the top of the rubber door seal. I found it easiest to remove this to get better access to the drum for removing the old door seal and fitting the new one. It is held on by four screws (10 mm socket). You might need to remove the block of concrete (top right in picture) to be able to lift up the duct (this needs the 10 mm and 13 mm sockets).

Removing the old door seal

The door seal is held to the front of the machine by a white plastic ring. Remove it by gently prising it away from the rubber and pushing the rubber in through the hole. Be careful not to break it as you will need it to fit the new seal.

White plastic ring

With the outside removed, you can get access to where the rubber attaches to the drum. The rubber seal is held to the drum by a wire ring that is tightened with a nut and bolt.

Wire ring

Tightening bolt for wire ring

You need to loosen the wire ring by undoing the bolt. There’s not much room around it, so it’s quite difficult to get it turning.

Accessing the bolt

I pulled back the rubber and put the 7 mm socket on the bolt. I then held the nut in place with a 7 mm spanner. There wasn’t enough room to use the socket driver, so I needed to turn the socket by reaching through the open top of the machine. A long T-shaped socket driver would be useful, but I used a long screwdriver to turn the socket.

Bolt adjustment top view

Once you’ve loosened the bolt enough, the wire ring will come over the rubber and you will be able to remove the rubber door seal from the drum.

This is the old seal I removed, with lots of lovely mould stains:

Old seal with mould

Fitting the new door seal to the drum

The new door seal just needs to go on in place of the old one. The tricky bit is getting all the folds of rubber in to the right places. I’ll attempt to illustrate with some photos.

The bigger side of the seal goes around the drum. The seal folds under the edge of the drum and has a groove where the wire ring fits to hold it in place. This is it fitting around the outside edge of the drum:

Seal around edge of drum

This is the other side, where it meets the basket (the rotating metal basket where the clothes go):

Seal meets metal drum

It took me a few goes to get it right. I found it easiest to use one hand inside the seal pushing it in place, and the other hand on the outside of the seal pulling the folds open to get it to wrap around the edge of the drum. You’ll want to get this right or the machine might leak water out the bottom.

Make sure the rubber sleeve is in the right place at the top so that the metal duct can plug into it. You can see it on the left of this photo. The duct is sitting on its side where I disconnected it to make room.

Sleeve at top of seal

Put the wire ring in to the groove so that it will hold the rubber seal securely to the drum. Tighten the bolt the same way you got it off (e.g. with spanner and socket).

Fitting the duct in to the top of the door seal

The new door seal comes with an inner sleeve that attaches to the duct and then goes in to the sleeve at the top of the door seal. Both are held on to the duct with a springy strap that fits in to a groove on the door seal sleeve.

Inner sleeve that fits on duct

Inner sleeve fitted to duct

With the inner sleeve on the duct, push it in to the sleeve part at the top of the door seal. Make sure it’s fitted in well and put the springy strap around the outer sleeve. You can fit the duct back in place and screw it down.

Fitting the door seal to the front of the washing machine

The other side of the seal needs to be fitted to the door hole in the front of the machine. I found this a bit easier, but there are still various folds that fit either side.

Seal around edge of door hole at front of machine

Once it’s on, there should be a groove around the outside in to which the white plastic ring can be fitted. The ring is joined by a toothed connector that lets you tighten it in place.

White plastic ring tightener

You’ve done it

With the white plastic ring holding the door seal firmly in place, you can put the top back on the washing machine and do a little victory dance.

I hope this is helpful to anyone embarking on the same job. If nothing else, at least you know what you are getting in to and whether to tackle it yourself or pay a washing machine engineer.

Filed under: Washing Machine — Scott @ 6:31 pm