March 9, 2010
The OpenStreetMap experience
What do people find difficult about cycling to work? Why don’t they do it?
We could ask them. Actually, because we take an interest in these things – because we already go out and talk to people – we largely know. The roads are perceived as dangerous. Where there are safe routes, people don’t know about them. Helmet hair. Nowhere to park the bike. It rains.
What should we be doing?

Look! I fixed it for you! Everyone will start cycling to work now!
What?
March 8, 2010
Cropping Illustrator CS artwork when saved as a PDF
Dear old Illustrator does have its quirks, and one of them is that when you save a PDF, the bounding box of the PDF is the bounding box of your artwork – not anything sensible like the artboard or anything like that. I won’t bore you with the many things that should work but don’t, but suffice it to say they involve crop marks, page tiling, maximum paper sizes and heartache.
So here, after much searching and head-scratching, is something that worksforme(tm). This post is here more as an aide-memoire than anything else, but someone else may find it useful.
Disclaimer: I’m still on Illustrator CS; all of this will no doubt be different in CS27 or whatever other overpriced piece of crap Adobe have come up with this week.
- Download this PPD file and save it somewhere memorable. (I chose the Illustrator app folder.)
- Go to File ≫ Print. Curse Adobe for not using the proper OS X print dialogue like everyone else does.
- Change Printer to ‘Adobe PostScript® File’.
- Change PPD to ‘Other…’, then select the PPD file from above.
- Select your page size from the Media popup, e.g. A1. I guess if you were producing a funny-sized leaflet you could manually edit the PPD file to have your size in it – it’s just plain text.
- Go to Setup and check that the Origin X and Origin Y are both 0mm.
- Click ‘Save’ and call the file artwork.ps or something like that.
- Open the resulting .ps file in Preview (context-click ≫ Open With ≫ Preview).
- Breathe a sigh of relief that you are now in a properly designed Apple app and don’t have to undergo the indignities of Adobe again.
- Use File ≫ Save As… to save it as a PDF.
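On the "manually edit the PPD" idea for funny-sized leaflets: PPD page sizes are given in PostScript points (72 to the inch), so the fiddly part is mostly arithmetic. Here is a minimal Python sketch that converts millimetres to points and prints the sort of entries a PPD normally carries for each page size. The keywords (`*PageSize`, `*PageRegion`, `*ImageableArea`, `*PaperDimension`) are standard PPD ones, but the exact entries in the PPD file linked above are an assumption, so compare the output against an existing size such as A1 in that file before pasting anything in. The function names are made up for illustration.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72  # PostScript points


def mm_to_points(mm):
    """Convert millimetres to PostScript points."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


def ppd_entries(name, width_mm, height_mm):
    """Build candidate PPD lines for a page size (a sketch, not a full PPD)."""
    w = round(mm_to_points(width_mm))
    h = round(mm_to_points(height_mm))
    set_page = f'"<</PageSize [{w} {h}]>> setpagedevice"'
    return "\n".join([
        f"*PageSize {name}: {set_page}",
        f"*PageRegion {name}: {set_page}",
        # Imageable area set to the full sheet, i.e. no unprintable margin.
        f'*ImageableArea {name}: "0 0 {w} {h}"',
        f'*PaperDimension {name}: "{w} {h}"',
    ])
```

Calling `ppd_entries("A1", 594, 841)` gives dimensions of 1684 by 2384 points, which is a handy sanity check against whatever the A1 entry in the real file says.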
January 13, 2010
Ordnance Survey consultation
I’ve finally finished my response to the consultation on Ordnance Survey data. It’s here (five-page PDF, 64k).
The tl;dr* version:
Generally good. Don’t release 1:25k and 1:50k rasters, that’s just gratuitous and we don’t need them. Provide an aerial photography API for rights-free tracing. Use PD or a database rights-aware licence, not CC-BY. kthxbye
* link almost certainly Not Suitable For Work, or indeed anywhere else
January 12, 2010
Latitude scale. Or, paleos may know some stuff after all
Every month I spend beyond ridiculous amounts of time drawing maps for the WW cruising guides. Along the neatlines (sides) of each map are black and white bars. Each bar represents one mile. It looks like this.

I do it like that because Old Charts do it like that too; because there’s a deliberately slightly retro feel to the WW maps (they’re essentially strip maps, like cruising guides used to have); and because I think it’s a great signifier of “we care about cartography” rather than “we have just thrown an OS map at our designer and told him to trace it, though not too obviously”.
Old Charts do it for a proper reason.
Roughly every two months someone complains either a) that the scale on OpenStreetMap is wrong or b) that OpenStreetMap doesn’t have a scale. (To be precise, a) happens, then the scale gets removed to stop the whingeing, then two months later b) happens.) You can read typical threads here and here but, in brief, the reason is that the scale isn’t consistent across latitudes, and that’s quite significant when you’re zoomed out a lot.
A bunch of us were talking about that this weekend, which is what just made me remember this.
On Old Charts, one bar represents one minute of latitude (or longitude). So the length of the latitude bars actually differs across the map.

The above is from a 3Mb university course presentation which blethers on about it for a while.
It’s an excellent solution. I wouldn’t imagine there’s any likelihood of OpenLayers et al adding it any time soon. But doesn’t it look great?
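To put rough numbers on the latitude problem, here is a minimal Python sketch. It assumes the spherical Mercator projection and 256-pixel tiles that most web slippy maps use; the function names are made up for illustration and are not part of OpenLayers or any other library.

```python
import math

EQUATOR_M = 40075016.686  # equatorial circumference in metres (WGS84 radius)
TILE_PX = 256             # assumed tile size in pixels


def metres_per_pixel(lat_deg, zoom):
    """Ground distance covered by one pixel at a given latitude and zoom."""
    return EQUATOR_M * math.cos(math.radians(lat_deg)) / (TILE_PX * 2 ** zoom)


def mercator_y(lat_deg):
    """Mercator northing for a latitude, in units of the earth's radius."""
    phi = math.radians(lat_deg)
    return math.log(math.tan(math.pi / 4 + phi / 2))


def minute_bar_length(lat_deg):
    """Length on the chart of the one-minute latitude bar starting at lat_deg."""
    return mercator_y(lat_deg + 1 / 60) - mercator_y(lat_deg)
```

At 60 degrees north a pixel covers half the ground it does at the equator, so a single fixed scale bar cannot be right for both. The chart-style bar sidesteps this: `minute_bar_length(60)` comes out about twice `minute_bar_length(0)`, which is exactly the stretching you see in the latitude bars on Old Charts.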
The best tablet yet
The usual sources are going haywire about an impending Apple Tablet.
I used to have a tablet computer. It was a beauty. It was A4 sized and had a full, moving-parts keyboard. The word-processor was speedy and yet powerful; but it was a proper computer, too. You could even program it. And it cost about £150.
Ah, Amstrad NC100 Notepad, how I miss thee.
It was such a great little machine. I’d tuck it under my arm then head off to the University Library in Cambridge to research some article or other I was writing for Keyboard Review. Then, at home, I’d connect up the serial-to-AppleTalk lead, run ZTerm, and fire the documents across. Bar a bit of search-and-replace to get rid of extraneous control characters, that was it. Done.
(The word-processing software was actually Protext, with which I was of course very familiar. Protext was the WP of choice on the Amstrad CPC, but also a ridiculously fast raw text editor. It came on a 16k sideways ROM; on another ROM I had Maxam 1.5, Arnor’s assembler. You’d write your source code in Protext then type ‘asm’, and Maxam would assemble it. To this day, the reason I indent code with hard tabs rather than spaces is Protext’s doing; a tab took up one byte; the same indent in spaces took up six. When you only had 38k for text, that was a big difference.
At one point I wrote 90% of the code, but only 30% of the UI, for a wizard hack that brought WYSIWYG editing and embedded graphics to Protext (CPC). It was going to be called Fidelity, after the utterly superb Durutti Column album. I never finished it but you should still buy the album.)
The NC100 used four real AA batteries, which is almost always better than any proprietary solution. (Maybe excepting Sony’s digital camera batteries.)
It had two failings. One is that Amstrad had always, from the CPC on, skimped on the keyboard decoding circuitry. If you pressed three keys together, a fourth character would also result. For the fast typist this is a real problem. On the NC100 there was one very common three-key combination (might have been I-O-N, I can’t remember) where the ghostly fourth key was cursor-up. So you’d be touch-typing at eight billion words a minute, and would briefly look down at the screen, only to see that three sentences ago you’d unwittingly ‘pressed’ cursor-up and all your subsequent text had been inserted into the previous line. This happened to me so many times.
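For the curious, a phantom fourth key is what you would expect from a keyboard wired as a plain row-and-column matrix without diodes, usually called ghosting. The sketch below is a toy Python model of that effect under that assumption. The key positions are hypothetical, since the NC100's actual matrix layout isn't given here, and `scan_matrix` is a name invented for this example.

```python
def scan_matrix(pressed):
    """Return the (row, col) keys a diode-less matrix would report as down.

    Each pressed key shorts its row wire to its column wire, so a key looks
    pressed whenever its row and column are electrically connected by any
    chain of closed switches.
    """
    parent = {}

    def find(wire):
        parent.setdefault(wire, wire)
        while parent[wire] != wire:
            parent[wire] = parent[parent[wire]]  # path halving
            wire = parent[wire]
        return wire

    # Join the row and column wires of every key that is really down.
    for row, col in pressed:
        parent[find(("row", row))] = find(("col", col))

    rows = {row for row, _ in pressed}
    cols = {col for _, col in pressed}
    return {
        (row, col)
        for row in rows
        for col in cols
        if find(("row", row)) == find(("col", col))
    }
```

Press three keys on the corners of a rectangle, say `{(0, 0), (0, 1), (1, 0)}`, and the scan also reports `(1, 1)`: the ghost. Two keys on a diagonal produce no ghost, which is why slower typists never noticed.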
The other failing is that Amstrad was enormously value-conscious. Which is generally a good thing, but I’d have preferred a £180 armour-plated NC100 to a £150 one where I broke the power socket twice (which Rob Scott could fix) and the screen once (which he couldn’t).
If Apple were to build one of these, maybe with Mobile Safari and 3G networking, would I buy one? Hell yes.
November 18, 2009
Ordnance Survey goes free – some initial thoughts
How about that, then? Or as the Map Room succinctly put it, “Holy shit.”
Good news for:
- Google, Yahoo, Microsoft. Free maps, and unlike the US, good-quality free maps which they can start using right out the box.
- Ordnance Survey. I wrote here previously that OS’s best chance of surviving was to open up street name/geometries, boundaries, postcodes, peaks, rivers and PROWs, and to keep charging for the large-scale stuff. This seems to be pretty much what’s promised. I still believe that it’s absolutely the right decision for them. (Also, I am rather smug.)
- The Guardian. Launching a campaign is a risky business for any publication, especially a fairly obscure and, at times, seemingly fruitless campaign like ‘Free Our Data’. It has paid off – and of all the organisations campaigning for this, the Guardian is the only one that anyone has ever heard of.
- Apple et al. Insofar as Apple ever gives a shit about anything that happens outside the US, they no longer have to depend on anyone for UK iPhone maps. Not Google, not Tele Atlas. No-one. (Incidentally, if UK mobile carriers had any brains, they would now write their own mapping app and bundle it with their iPhone contracts. Fortunately they don’t.)
- Cartographers. Maps will now compete on cartography, not on data. This is an absolute shot in the arm for skilled cartographers and could go a long way to reviving the craft in the UK. With my Waterways World hat on, I’m delighted: our cruising guide maps can get better than they are now, yet anyone wanting to compete still has to learn how to produce lovely maps.
- Developers. Same applies. I am really looking forward to what people come up with. If I were an iPhone dev I would start writing that killer app now, ready to release when the data arrives.
- Wider Government. Full release instantly becomes the standard for public data. There is now absolutely no excuse for, say, the Environment Agency to withhold its fisheries data. That means more third-party sites that do funky things with public data. I suspect that will help in breaking the stranglehold of evil big outsourcers on Government IT projects.
- This blog because I can stop writing about boring map copyright law and start writing about fun things, like canals, organs and the new William Orbit album.
Possibly good news for:
- OpenStreetMap. I don’t think it’s a stretch to say this wouldn’t have happened without OSM. The inevitability that OSM would, in time, catch up with OS small-scale mapping absolutely vindicates the project. And, hey, complete data for the whole UK – what could be cooler?
But on the other hand, everyone else has it, too. How do ongoing changes get integrated into the OSM database? Will the UK community survive a sudden change in tack from surveying the basemap to becoming a provider of ‘added value’? Will smaller public domain mapping projects create an informal, developer-led community without OSM’s harsh share-alike restriction? Will UK OSM developers (who lead the project) get bored of it now there’s not such a unique need? How many questions can I get in one paragraph?
Oh, and there’s the licence. I dread to think what would happen if the chosen licence wasn’t compatible with OSM.
Bad news for:
- Tele Atlas and Navteq. See G-Y-M above. On the up side, their parent companies no longer have to bother collecting UK data for their satnavs/mobile phones. But that’s like saying Tesco giving free food away is good news for Sainsbury’s, which can now take it and resell it for 1p.
November 4, 2009
The mysterious data mines of Argleton-on-Google
There’s been a bunch of online chatter today about Argleton, the mystery town on Google Maps that has never really existed.

“Maybe it’s a trap street,” people have speculated. Google itself appears to be pinning the blame on Tele Atlas, telling the Telegraph: “People can report an issue to the data provider directly and this will be updated at a later date.”
The Telegraph goes on to say: “The data for the programme was provided by Dutch company Tele Atlas. A spokesman said it would now wipe the non-existent town from the map.”
Update: Originally I suggested here that, by reference to extra map data showing up elsewhere in Britain, this looked like something that had been ‘mined’ by Google from web sources. From a couple of comments below based on other Tele Atlas mapping, it does actually appear that this is a superfluous Tele Atlas town, not an invention of Google’s data mining. Nonetheless the data mining story is interesting in itself, so…
The canary starts to wobble
We know that, even before their recent go-it-alone expedition in the States, Google was mining the web and integrating the results into its map data. Wikipedia is the best-known example; Wikipedia articles with co-ordinates have long appeared as ‘active POIs’ on Google Maps. But as time goes on, Google has mined more and more directories, and other web content, to make the maps richer than the raw Tele Atlas data can offer.
It’s a really clever idea.
But sometimes, the parsing fails. Google Maps FAIL has a good example. Google has found a source of addresses somewhere on the web, and pulled out various data from it. But either the source data is dodgy, or more likely, it’s not formatted quite as consistently as Google’s algorithms would like.
So in Google Maps FAIL’s example, the sizeable town of Cirencester has moved to a little village halfway towards Northleach “inhabited by two sheep and a squirrel”, and the historic city of Gloucester has navigated upriver 20 miles and is sitting in a watermeadow outside Tewkesbury.
This [edit] was my original guess as to what’s happened at Argleton: dodgy data mining. My guess was that the mined data was in fact a badly OCRed address, meant to be “Aughton” but transcribed as “Argleton”. We already know that Google is OCRing PDFs as it crawls them; or maybe it was OCRed before being uploaded to the web. No matter.

If we need any more proof that they’re mining some fairly imperfect sources, then three miles to the west we find “Downhollnad”. A couple of months ago I was drawing a map of the Leeds & Liverpool Canal there, and I’m pretty sure that it’s called Downholland. It’s spelled correctly on other Tele Atlas-derived mapping, too, such as Multimap’s.

The canary falls over
How endemic is this faulty mining?
My home-town of Charlbury is well-known as the world centre of innovation in collaborative mapping, especially as performed by ninjas. I was just coming back from church the other day and I met that Artem ‘Mapnik’ Pavlenko walking down the street. So let’s have a look at the data Google has mined for Charlbury.

This is a good start. St Mary’s, where I play the organ (badly), is labelled as ‘Charlbury RC Church’. St Mary’s is not Roman Catholic. It’s Church of England. People have been firebombed in Ulster for less. Charlbury’s Catholic church is, as the full address suggests, a few streets away on Fisher’s Lane. (Incidentally, thank you to my Twitter followers for suggesting that maybe RC meant Radio-Controlled. It could make baptisms a whole lot more fun.)
You can also see that the Bell is in the right place, but the Bull, which should be at the corner, is closer to where the Three Horseshoes actually is.

(Incidentally, there’s a little sponsored link beside the wee Bull for Millie Benjamin Bridal Wear. Curiously, when I looked at this earlier, this in turn triggered a foot-of-page ad for ‘Milly Dress at Shopbop’. So buying one sponsored ad alerts Google to place potential competitors’ ads at the same place? That’s an… interesting loyalty tactic.)

The ‘Cotswold View’ campsite has been placed on a little unpaved street called Cotswold View. As the full address again makes clear, it’s not there. It’s actually on the road to Enstone. Whether it’s actually on ‘Enstone Rd’ is debatable – I’d have said Banbury Hill, and so does Tele Atlas.
Note the non-standard space in the middle of the phone number. A Google search for “Cotswold View” “Enstone Road” “810 314” only returns a few results, two of which are at 192.com (once described as Britain’s most invasive website in a shock-horror exposé, and no strangers to data mining themselves). I’m guessing that Google is either mining 192.com or has licensed the same data.
This is also interesting in that Google clearly aren’t doing a postcode lookup, which would be easy technically but horrible legally. A postcode lookup would put the icon in the right place.

The Fiveways Takeaway appears on the wrong side of the road. Well, big deal. But again, the only result for Fiveways “Sturt Road” “811 555” is 192.com.
(Curious decision on Google’s part not to show ‘Takeaway’ as part of the name, but yet also not to use a custom icon. Fiveways is originally the name of the junction you see just to the south-west. “Turn left at Fiveways” is a common direction in Charlbury. If you took that literally while looking at this map, you’d drive up Sturt Close.)

This one just made me giggle. The problem with having good satellite imagery, as again Google Maps FAIL points out, is that it shows up the inadequacies of the rest of your data. There is clearly a bowls club in this picture but it ain’t where the icon is.
This is a dead canary
So. A small Oxfordshire town, only a handful of mined icons, and around half of them are faulty in some way. Data is being conflated which shouldn’t be (‘Cotswold View’ caravan site on ‘Cotswold View’ street, ‘Charlbury RC Church’ located at a church in Charlbury). Positional accuracy is iffy, at best. How endemic is this faulty mining? It’s pretty endemic.
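The Cotswold View mix-up is easy to reproduce with a toy matcher. This Python sketch is emphatically not Google's method, which isn't public; it just assumes a geocoder that looks for known street names anywhere in a listing, and compares it with one that only looks at the address. The street list and function names are invented for the example, using names from this post.

```python
# A tiny stand-in gazetteer of Charlbury street names (illustrative only).
STREETS = ["Cotswold View", "Enstone Road", "Sturt Road"]


def first_street_in(text):
    """Return the gazetteer street that appears earliest in text, if any."""
    lowered = text.lower()
    hits = [
        (lowered.find(street.lower()), street)
        for street in STREETS
        if street.lower() in lowered
    ]
    return min(hits)[1] if hits else None


def place_by_any_match(name, address):
    """Naive conflation: match street names against name and address alike."""
    return first_street_in(f"{name}, {address}")


def place_by_address_only(name, address):
    """Safer: ignore the business name and match only the address."""
    return first_street_in(address)
```

Feed both of them the campsite, name `"Cotswold View"` and address `"Enstone Road, Charlbury"`. The naive matcher drops the icon on Cotswold View; the address-only one picks Enstone Road. Same data, different amounts of trust in the wrong field.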
Even getting to this stage is, of course, a display of awesome technical ability. And there is no doubt that the logic will iterate like every other Google product, becoming more accurate each time.
But it does also point out the limitations of applying search-engine technologies to mapping. If you search Google for something non-trivial, you don’t expect the top result to be the one that answers your question. You hope you’ll find it in the top 10, and if not, you’ll turn the page until you get the answer. It’s fuzzy like that and people accept this.
Map data isn’t fuzzy. You have to get it right, first time. Charlbury Bowls Club’s location is approximate, but nonetheless, wrong. St Mary’s is a church in Charlbury but it’s not the Charlbury RC Church.
Data mining gets you worldwide coverage fast, but takes a long time to get to 95% accuracy: you could argue it never will. Crowdsourcing, OpenStreetMap-style, gets you to 95% accuracy fast, but takes a long time to approach worldwide coverage. Professional surveying a la Tele Atlas gets you both, at a huge cost.
All of this is especially interesting in the light of the superb Mike Dobson interview at SearchEngineLand. If you only read one article about webmapping this year, make it that one. He’s the only commentator I’ve seen who appreciates how much data mining Google is doing:
“It is clear to me that conflation and data mining across redundant sources are major components of [Google’s] update process.”
He then suggests that the strategy is to start with data mining, then refine it via crowdsourcing.
“One of the tenets of crowd sourcing is that the frequency of errors decreases with increased inspection. So, Google might make a wrong change from time to time, but the odds are that someone will correct it.” [See also his later comment on Tele Atlas and GDT.]
In other words, Google’s strategy is to get worldwide coverage via mining, then refine it until it’s accurate by crowdsourcing. That makes a lot of sense. But it remains to be seen whether their reputation can withstand the Telegraph story that will inevitably accompany each excursion into the mines.
Drums. Drums in the deep. They Are Coming.
October 19, 2009
Dear Royal Mail

I do understand your present predicament and sympathise so, so feeply with you. (That was meant to be “deeply”, but I liked the typo.)
It must be very hard for you to even consider opening up a teensy little bit of the postcode data. You run a very tight ship where every penny counts. After all, every year, you make £1.5m in profit from the Postcode Address File. This income can then be used to offset approximately one thousandth of the total loss to the British economy incurred by your Genghis Khan-like approach to industrial relations.
Indeed, I feel a particular kinship with the founder of the Post Office, Mr P. Pat (pictured). Our new house looks a little like some of those on Mr Pat’s round, and it is only two doors away from the Post Office. Also, since we moved in earlier this year, I suspect our cat has been beating up Mr P. Pat’s black and white cat regularly.
However, may I kindly request, bearing in mind your careful, some may even say jealous, stewardship of the Postcode Address File, that you start to actually use the fucking thing and deliver the fuck some mail to us that is meant for 11 Market Street, Charlbury OX7 3PH, rather than 11 Market Street Chipping fucking Norton which has an entirely different fucking postcode.
Thank you.
September 29, 2009
The neogeographers are coming for your children
Paleo or neo – which are you?
I have a degree in Anglo-Saxon Norse & Celtic, I play the church organ, I work for a printed magazine about an 18th century technology, and I have several shelves of Ordnance Survey maps (excluding the unfolded ones, which are elsewhere):

Which I guess makes me a paleotard. Glad we’ve cleared that one up.
This was originally going to be a nice, friendly, fuzzy post about how ‘paleotard’, ‘neogeographer’, ‘freetard’ and the like don’t actually mean anything. How it’s quite possible to love the Ordnance Survey’s beautiful maps yet still think that maybe, maybe, their licensing isn’t 100% perfect. How it’s possible to be an OpenStreetMap activist yet still happily use Landrangers and Explorers, the AA Close-Up Atlas, Multimap, etc. etc. etc.
Then I saw Gary Gale’s excellent rant from the AGI Geocommunity event (where paleo vs neo appears to have been the main topic of conversation). He makes the same point. We are, to misquote Moby, all made of tards.

Also, I figured out something more interesting to say.
Let’s talk about scale for a moment. That’s a lovely old-fashioned concept, isn’t it? When was the last time you heard a freetard talk about a 1:50,000 map, or better still, one inch to the mile? I must be a paleo after all.
But scale is really, really interesting.
Ordnance Survey makes most of its money from very large scale (i.e. very detailed) mapping. That’s what government, utilities, property developers and insurance companies pay for. OS’s accounts are frustratingly vague by comparison to, say, British Waterways’, and when you’re holding up BW as an example of How To Do Transparency Right then you know something’s amiss. But a simple read-through of the space allocated to small-scale and large-scale products in the 08-09 Annual Report demonstrates that large-scale is where the money is.
Yet the majority of ‘neo’ stuff only needs small-scale data, and that’s especially true of public interest projects from CycleStreets to MySociety – the sort of thing Governments of any stripe should be encouraging. Even OpenStreetMap, which has an ever-advancing army of buildingtards doing bonkers micro-mapping stuff (hey, London guys, you should sort out your road numbers first. They suck), acknowledges this by only mapping road centrelines. OS MasterMap it ain’t.
Ordnance Survey clearly knows this but won’t admit it. It does claim to be adopting a “hybrid” model, but at heart, there’s no movement from: our data, our terms, you use it only as we see fit, we own everything.
So the notion that the data should be let out without these terms keeps gaining ground, from the Guardian’s Free Our Data campaign to the very learned Power of Information Review report.
Interestingly, though, OpenStreetMap contributors aren’t pushing for it (which rather kiboshes an otherwise entertaining rant from the AGI). That, I think, is because they’re interested in the small-scale data, too. OSM will have a complete small-scale map of the UK in a few years’ time, and as the backlash against data imports shows, they’re having fun making it. There are now hundreds of UK towns and villages where OSM is more complete and up-to-date than OS Landranger and Streetview. Even the most trivial analysis of OSM’s growth rate shows a quick trajectory towards national coverage.
So if the large-scale stuff doesn’t matter, and OSM is advancing towards small-scale completion, why am I bothering to write any of this?
Because, I guess, I really am a paleotard – well, a little. Like lots of people, I love Ordnance Survey maps: that’s why I collected a whole edition of them. I don’t want to see the Landranger relegated to being an expensive niche product used only by those who don’t have satnavs. I don’t want to see a generation of map users believing that there’s nothing better than rounded-end Internet cartography.
But that’s exactly what’s happening. I lost count of the number of head-in-hands moments during the Ordnance Survey presentations at the Society of Cartographers’ Summer School. “Extremely popular sites nick all the tiles” was the most quotable. Geovation was the most tragic – a whole initiative devoted to encouraging ideas, when there’s no shortage of ideas or even of coders, only of data. The bar chat about OpenSpace usage limits, and Landrangers-for-iPhone that are more expensive than the paper maps – really – demolished any pleas of “listen to us, we’re trying to work with you” from the presenters. It was one long exposition of the fact that the OS still doesn’t get it.

I was reminded of a dyed-in-the-wool BW section engineer telling his boss he couldn’t cope with the new (1990s) way of doing things. “I’ve spent all my life trying to keep people away from the canals. Now you’re telling me you want to get as many people onto them as possible.” Ordnance Survey is still at the “trying to keep people away from the data” stage.
So a whole generation of people is growing up without the Ordnance Survey. I would love to see statistics on the average number of OS maps bought by 25-year-olds today, compared to 10 and 20 years ago. The (yes) epic fail that is OpenSpace is the most obvious example: no-one can ever name more than one site made with it, and even that site (Where’s The Path) is essentially a comparison tool for Google aerial imagery.
This is why OS must open up its small-scale data. Not because of any doctrine that facts should be free. Nor because we need the data – we do, but OSM is filling that niche. But because today’s maps and tools are being built without OS data. Today we’ve had the announcement that Flickr photos can now be linked directly to OSM data. If OS had opened its data three years ago, maybe we’d have seen Flickr linkage to OS IDs instead.
The Ordnance Survey, simply put, is heading into irrelevance.
All it can do is release small-scale data, unencumbered by anything but attribution, and hope it’s not too late. Right now it probably isn’t. In two years’ time it will be. Off the top of my head:
- Street names and geometries
- Public rights of way
- Postcode->co-ordinate lookup
- Administrative boundary data, including electoral divisions, Access Land etc.
- Major natural feature names and geometries (peaks, main rivers/water areas)
Yes, it would result in some initial loss of income for OS. I’ve already explained the long-term benefits for OS in ensuring its continued relevance. Many will also make the case that the benefits to the wider economy will more than make up for it.
There are two more political considerations.
One is obvious. Releasing small-scale data would remove the need to waste money on nonsense such as OpenSpace, Geovation, lawyers’ letters to Google (and others) about derived data, and today’s five-minute wonder, OS VectorMap Local, which has thus far failed to set the world on fire.
The second is less obvious. Ordnance Survey is regularly talked about as a candidate for privatisation, helping to fill that modest hole in the public finances. (It’s usually somewhere in the list between BW and the Royal Mint.) Let’s say the small-scale business dies and OS is no longer a household name, though still earning much the same from its large-scale data sales to utilities and local authorities. All of a sudden, the ‘national interest’ argument for keeping it in the public sector is much smaller – and the political fallout from selling it much less. And I don’t get the impression that OS wants to be privatised.
But hey, I guess the Stasi is pretty much immune from privatisation.
September 11, 2009
Vote for the Data Liberation Front to tackle aerial imagery
Google has a really enlightened team called the Data Liberation Front. Its role is to make it easy for people to get their data out of Google – rather than it being locked in.
Usually, people are locked in by the lack of an export feature, or an obscure file format. In mapping, people are locked in by licences.
In Google Maps’ case, you can create your own work by tracing over aerial imagery. But you can’t use this work elsewhere, because of the licences and terms of use. (The phrase “derived work” usually crops up around now.)
Google could fix this by saying that tracing from their imagery is OK – just like Yahoo have done. Several posts ago, I looked into the legalities of this and concluded there’s nothing in law stopping them from doing so. It’s entirely their decision.
So – please vote for the Data Liberation Front to fix this! Click here, sign in with a Google Account, and tick the box. And tell your friends.