For location-based stuff, it’s useful to take a UK postcode and get the lat/long to plot it on a map (such as Google Maps).
The Google Maps API only has postcodes down to sector-level (e.g. NW1 4), so its results are approximate.
Ordnance Survey released the Code Point dataset free as part of OpenData. Among other things, it includes full postcodes with grid references.
The grid references are OSGB36, rather than lat/long. Converting them is more difficult than you’d think. Here’s the solution I used, based on what I cobbled together from forum posts and the like.
Get the data
Download the Code Point data. It’s a series of CSV files, one for each postcode area.
You might want to concatenate them into one file. Or, if you only want London postcodes, try the e, ec, n, nw, se, sw, w and wc files.
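Here is a minimal sketch of the concatenation step in Python. It assumes the downloaded files sit in one directory and are named after their postcode area in lower case (e.csv, ec.csv and so on). The function name, the directory and the output file name are made up for illustration.

```python
from pathlib import Path

# Postcode areas covering London, as listed above.
LONDON_AREAS = ["e", "ec", "n", "nw", "se", "sw", "w", "wc"]


def concatenate(csv_dir, out_path, areas=None):
    """Append per-area CSV files from csv_dir into one file at out_path.

    If areas is None, every *.csv file in csv_dir is used.
    Returns the list of source files that were copied, in order.
    """
    csv_dir = Path(csv_dir)
    out_path = Path(out_path)
    if areas is None:
        sources = sorted(csv_dir.glob("*.csv"))
    else:
        sources = [csv_dir / f"{area}.csv" for area in areas]
    # Never read the output file back into itself.
    sources = [p for p in sources if p.resolve() != out_path.resolve()]
    with open(out_path, "wb") as out:
        for source in sources:
            data = source.read_bytes()
            out.write(data)
            # Keep rows from adjacent files on separate lines.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    return sources
```

For the London subset, a call would look like `concatenate("codepoint", "london.csv", LONDON_AREAS)`.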
Convert OSGB36 to WGS84 Lat/Lng
The grid references need to be transformed to latitude/longitude on the WGS84 system. There are a few ways to do this, but I used PROJ.4’s cs2cs program.
The command is:
cs2cs -f '%.7f' +proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +towgs84=446.448,-125.157,542.060,0.1502,0.2470,0.8421,-20.4894 +units=m +no_defs +to +proj=latlong +ellps=WGS84 +towgs84=0,0,0 +no_defs
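To make the command's input and output format concrete, here is a rough Python sketch. cs2cs reads one "easting northing" pair per line on stdin and, for a latlong target, writes "longitude latitude height" per line on stdout. Note the order: longitude comes first. The parameters are the ones from the command above, with the final flag spelled +no_defs. The function names are my own, and convert() assumes cs2cs is on your PATH.

```python
import subprocess

CS2CS_ARGS = [
    "cs2cs", "-f", "%.7f",
    "+proj=tmerc", "+lat_0=49", "+lon_0=-2", "+k=0.9996012717",
    "+x_0=400000", "+y_0=-100000", "+ellps=airy",
    "+towgs84=446.448,-125.157,542.060,0.1502,0.2470,0.8421,-20.4894",
    "+units=m", "+no_defs",
    "+to", "+proj=latlong", "+ellps=WGS84", "+towgs84=0,0,0", "+no_defs",
]


def format_input(points):
    """Turn (easting, northing) pairs into the text cs2cs reads on stdin."""
    return "".join(f"{easting} {northing}\n" for easting, northing in points)


def parse_output(text):
    """Turn cs2cs output lines ("lon lat height") into (lat, lon) floats."""
    results = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            # cs2cs prints longitude first, so swap to (lat, lon).
            results.append((float(parts[1]), float(parts[0])))
    return results


def convert(points):
    """Run cs2cs once for a whole batch of points."""
    done = subprocess.run(CS2CS_ARGS, input=format_input(points),
                          capture_output=True, text=True, check=True)
    return parse_output(done.stdout)
```

Sending a batch of points through a single cs2cs process is much cheaper than starting the program once per postcode.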
There is a PROJ wrapper for Python (pyproj), but I wasn’t smart enough to figure out how to do the options, so instead I spawned the cs2cs program from a Python script.
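For comparison, here is roughly how the same conversion looks with a recent pyproj (version 2 or later), which lets you name the coordinate systems by EPSG code instead of spelling out the options. This is a sketch, not what I ran: it assumes pyproj is installed, and because PROJ picks its own transformation for these codes, the results may differ slightly from the cs2cs command above. The function names are made up.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def _transformer():
    # Imported lazily so a script can still load without pyproj installed.
    from pyproj import Transformer
    # EPSG:27700 is the British National Grid (OSGB36); EPSG:4326 is WGS84.
    # always_xy=True fixes the axis order: (easting, northing) in,
    # (longitude, latitude) out.
    return Transformer.from_crs("EPSG:27700", "EPSG:4326", always_xy=True)


def osgb36_to_wgs84(easting, northing):
    """Convert a British National Grid point to (lat, lon) in WGS84."""
    lon, lat = _transformer().transform(easting, northing)
    return lat, lon
```

The transformer is built once and reused, since constructing it is the expensive part.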
Here’s the script. It’s not great, but it does the job.
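import csv
import subprocess
import sys

# Column positions in the Code Point CSV files. These are an assumption:
# check them against the column list that ships with your download.
POSTCODE_COL, EASTING_COL, NORTHING_COL = 0, 2, 3

def main(argv=None):
    if argv is None:
        argv = sys.argv
    if len(argv) != 3:
        print('\nUsage: %s input_file.csv output_file.csv\n' % argv[0])
        return 2
    with open(argv[1], newline='') as infile, open(argv[2], 'w') as output:
        input_fields = []
        for index, row in enumerate(csv.reader(infile), 1):
            postcode, easting, northing = row[POSTCODE_COL], row[EASTING_COL], row[NORTHING_COL]
            input_fields.append((postcode, easting, northing))
            # Convert in batches of 1000 so cs2cs isn't spawned once per row.
            if index % 1000 == 0:
                process(input_fields, output)
                input_fields = []
                print('processed', index)
        # Convert whatever is left over after the last full batch.
        if input_fields:
            process(input_fields, output)
    return 0

def process(input_fields, output):
    args = ['cs2cs', '-f', '%.7f', '+proj=tmerc', '+lat_0=49', '+lon_0=-2', '+k=0.9996012717', '+x_0=400000', '+y_0=-100000', '+ellps=airy', '+towgs84=446.448,-125.157,542.060,0.1502,0.2470,0.8421,-20.4894', '+units=m', '+no_defs', '+to', '+proj=latlong', '+ellps=WGS84', '+towgs84=0,0,0', '+no_defs']
    # cs2cs reads "easting northing" lines and writes "lon lat height" lines.
    coords = ''.join('%s %s\n' % (easting, northing) for _, easting, northing in input_fields)
    result = subprocess.run(args, input=coords, capture_output=True, text=True, check=True)
    for (postcode, easting, northing), line in zip(input_fields, result.stdout.splitlines()):
        lon, lat = line.split()[:2]
        output.write('%s,%s,%s,%s,%s,%s\n' % (postcode.replace(' ', ''), format_postcode(postcode), easting, northing, lat, lon))

def format_postcode(postcode):
    # Normalise to "outward inward" form: the last three characters are the inward code.
    postcode = postcode.replace(' ', '')
    return '%s %s' % (postcode[0:-3], postcode[-3:])

if __name__ == '__main__':
    sys.exit(main())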
Run it with your Code Point csv file as input. The output contains the postcode, OS grid refs, latitude and longitude.
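Before loading the output anywhere, it's worth a quick sanity check. The sketch below assumes each output row has six columns in this order: postcode without spaces, formatted postcode, easting, northing, latitude, longitude. It builds a lookup keyed on the compact postcode and sets aside any row whose coordinates fall outside a rough bounding box for Great Britain. The box limits and the function name are my own choices, not part of the Code Point data.

```python
import csv
import io

# Rough bounding box for Great Britain, used only as a sanity check.
LAT_RANGE = (49.0, 61.0)
LON_RANGE = (-9.0, 2.5)


def load_postcodes(lines):
    """Return ({compact_postcode: (lat, lon)}, [rejected postcodes])."""
    lookup = {}
    rejected = []
    for row in csv.reader(lines):
        compact, _formatted, _easting, _northing, lat, lon = row[:6]
        lat, lon = float(lat), float(lon)
        in_box = (LAT_RANGE[0] <= lat <= LAT_RANGE[1]
                  and LON_RANGE[0] <= lon <= LON_RANGE[1])
        if in_box:
            lookup[compact] = (lat, lon)
        else:
            rejected.append(compact)
    return lookup, rejected


# Made-up sample rows, just to show the call.
sample = io.StringIO("AB12CD,AB1 2CD,1,2,51.5,-0.1\n"
                     "ZZ99ZZ,ZZ9 9ZZ,0,0,0.0,0.0\n")
lookup, rejected = load_postcodes(sample)
```

With a real file you would pass an open file object instead, for example `load_postcodes(open("output.csv", newline=""))`.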
Bonus: if you’re using PostgreSQL, here’s how to get the CSV data into your database.
Create a table matching the csv file layout:
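create temporary table postcode (
    -- column names are placeholders; the order matches the script's output
    postcode_compact text,
    postcode text,
    easting integer,
    northing integer,
    latitude double precision,
    longitude double precision
);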
Load it in:
copy postcode from '/path/to/your/output.csv' with delimiter ',' csv;
Then copy the fields you want to your real table.