11 March 2006
ShareAlike considered harmful for geodata
Why the 'derivative work' provision of GPL/ShareAlike licences makes them unsuitable for geodata and mapping works.
Free, or open, geodata has deservedly been receiving a lot of attention recently. More and more people are calling for state agencies, such as the Ordnance Survey, to make their existing data freely distributable: the cause is sufficiently well-known that the Guardian has begun to champion it. Open mapping projects, the most notable of which is Openstreetmap, are using cheap GPS receivers to create 'free' datasets which can be used without payment of any sort. And an increasingly computer-literate society is making ever more inventive uses of existing free geodata, notably that published by the US Government.
This explosion in interest has not been matched by a suitable legal or licensing framework. Instead, because most of the proponents come from a free software background, they are using the licences with which they are familiar - generally, the Creative Commons ShareAlike licence (CC-SA), a creative-arts equivalent of the ubiquitous GNU General Public Licence.
This licence is not designed for geodata. Its widespread adoption threatens to hinder the takeup of free geodata; to lead to the development of multiple free geodata projects with incompatible licences; and to strengthen the ability of state agencies to keep charging for public data.
The derivative work problem
This is a result of the 'ShareAlike' clause in the CC-SA licence. It is summarised thus:
Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.
This has two results:
Applying this to free geodata, we find:
Because this is a discussion of principles rather than the policies of existing mapping sites, these three examples refer to a fictitious ShareAlike-licensed project called Freemappingworld.
Example 1
I run the community website for the town of Charlbury, www.charlbury.info. The website is entirely non-profit-making and performs a valuable social service.
I would like to use Freemappingworld data to draw a town plan for the website. Not all of Charlbury's streets have yet been surveyed, so I cycle round with my GPS, map them, and contribute the result back to Freemappingworld.
In practice, though, I cannot use Freemappingworld data to draw my map.
This is because, on the Charlbury website, we post PDFs of the Charlbury Chronicle - a quarterly town newsletter issued free to all households in Charlbury. None of its content has been expressly licensed as public domain or ShareAlike: in fact, if you asked any of the authors what licence they had chosen for their content, you'd just get blank stares. The editor is a very sweet Catholic lady and town councillor, who would not have the first idea of the difference between free as in beer (or as in Communion wine) and free as in speech.
(I, however, can imagine circumstances in which the Chronicle might contain decidedly non-free content. For example, a local poet might have published their first collection. One of their poems is reproduced in the Chronicle as a taster - but it's not public domain.)
By including Freemappingworld data, the Charlbury website becomes a Derivative Work under the provisions of the ShareAlike licence. It is "building upon" the Freemappingworld data. This means it, too, has to be available as ShareAlike.
But since the Charlbury Chronicle contains non-PD/ShareAlike content, it can't be distributed under these terms.
This prevents this non-profit, community-focused website from using Freemappingworld data. Though this is unlikely to be the intention of the Freemappingworld creators, it's the effect of the licence they've chosen.
Example 2
Fred runs a free site enabling people to plan their cruises around Britain's canal network.
Fred has spent a lot of time on the software, all of which is released under the GPL. But the site also benefits from the exceptionally high quality of the data it uses, covering the 4,000 miles of navigable waterways in Britain. Much of this has been contributed by individuals, but some is licensed to him by forward-thinking navigation authorities on a one-off basis.
Fred doesn't make any money from his site, and a lot of people benefit from it. Like so many enthusiast developers, he wants to keep making it better, and he decides one way to do this is to include some better maps. So he looks to a ShareAlike geodata project for the source material.
But once again, because not all the data on his site is expressly licensed as ShareAlike or public domain, the Derivative Work provision prevents him from using this geodata in any way, shape or form.
Example 3
For my work as editor of Waterways World, I'd like to draw a map of the waterways of Britain.
This is unashamedly a commercial project, make no bones about it. Waterways World Ltd is a company like any other, whose prime aim is to sell magazines, not to run a charity for the benefit of content creators everywhere.
This map would include a coastline, hill shading, built-up areas, waterways (of course), and motorways... all put together in a beautifully designed whole.
The first three are all available from public domain (US Government) sources. I already have the waterway lines, either from GPS tracks or traced from out-of-copyright maps. I don't, however, have tracks for every motorway in Britain - probably only 80% (no M62 east of Manchester, for example).
To fill the gap, I would like to use the motorway data from Freemappingworld. In return, Freemappingworld would get all the waterway lines to use in its own dataset as it wishes - 4,000 miles of high-resolution data. It would also get (under the 'Attribution' clause of the licence) a hulking great big credit on the final, printed map, telling thousands of people about the good work it does and directing them to www.freemappingworld.org to find out more.
Unfortunately, the ShareAlike clause is not content with this, and requires that the entire map design - not just the content - be licensed under ShareAlike terms. Any other magazine, or publisher, or boat-hire company would then be free to copy the map without any further reference to us. That's unacceptable both to me and, principally, to the company I work for.
(To take it further still, if I get British Waterways' endorsement for this fabulous map, I can't feature their bridge-and-bulrushes logo on the map - because the logo is copyrighted art.)
What will they do?
1. For the Charlbury Website, I will cycle a little further round Charlbury, and gather all the GPS tracks myself. I can then create the map I want without any restriction on what else is included on the website. Do I contribute the resultant tracks to Freemappingworld? Well, maybe. But when the project has proved unsuitable for my needs, there's absolutely no reason or moral imperative for me to do so.
2. Fred will probably use the Google Maps API. That way, he gets lovely maps, completely free of charge, and can use them without having to relicense his whole site. Score: proprietary geodata 1, free geodata 0.
Where this gets really messy is when Fred introduces a feature that lets you label the map by clicking on it - so you can add pubs to the database, for example. As is already well-established, co-ordinates obtained by clicking on Google Maps are copyright of Google's map providers. But if they had been obtained by clicking on a free map, there would be no such restriction.
The result? Because Google provides attractive cost-free maps which are usable in circumstances that the ShareAlike maps aren't, the sum total of proprietary geodata in the world actually increases.
3. Very, very simple. I'll buy the motorways from the Ordnance Survey instead; 20% of the motorway network at low resolution won't cost much at all. The ShareAlike project loses out on both the publicity and the data I was willing to contribute.
The ghetto effect
To summarise, the ShareAlike clause "ghettoises" free geodata. It is only usable where the entire project - geodata, map design, and all surrounding information - is licensed on the same terms.
For most end-users, this makes ShareAlike geodata less attractive than the free-beer alternative offered by the Google API. For most commercial users, it reinforces the viability of the Ordnance Survey alternative - where you can do what the hell you like, as long as you pay for it.
ShareAlike geodata may gain traction among the existing free software community - it could be distributed with open-source GIS packages and operating systems, for example - but it is unlikely to reach beyond this relatively small market.
UK geodata is currently only accessible to those with money; ShareAlike projects will make it also accessible to those in the free software community; but the rest of us are still shut out. Many, many existing data sources - such as the Hansard excerpts behind www.theyworkforyou.com, licensed under Parliamentary Copyright - do not have licences compatible with ShareAlike provisions.
As a result, ShareAlike geodata will have virtually no effect on commercial suppliers such as Ordnance Survey. Ordnance Survey CTO Ed Parsons has publicly stated several times that he believes projects like Openstreetmap can and should coexist with proprietary OS geodata, and I'd guess this is why: because existing and most prospective OS customers simply won't be able to use the ShareAlike data.
The lesson of the LGPL
Curiously, the free software community learned this lesson a long time ago, and provided the "Lesser General Public License". The LGPL effectively recognises that, for everyone except programmers, library code is a component of a work rather than a work in itself. Therefore the objectives of the free software movement may sometimes be best achieved by permitting derivative works. To quote:
Substitute 'geodata' for 'library' and the arguments hold.
Rather than using the blunderbuss term of "the Derivative Work", the LGPL draws a useful distinction between a "work based on the library" and "a work that uses the library". The former applies to modified versions of the library code, which are still subject to ShareAlike provisions. The latter applies to programs that call on the library, which aren't.
We need to assert this difference for free geodata.
A "work based on the geodata" means an expanded set of geodata - taking the free data and adding extra lines to it. But "a work that uses the geodata" means a map, a website, a book - anything where the creative input isn't just geodata. Creating a geodata LGPL would solve the problems in each of the three examples above.
The Collective Work clause - a solution?
The CC-SA licence already has the beginnings of such a solution. It defines a "Collective Work", from which the ShareAlike provisions are exempt, as follows:
"Collective Work" means a work, such as a periodical issue, anthology or encyclopedia, in which the Work in its entirety in unmodified form, along with a number of other contributions, constituting separate and independent works in themselves, are assembled into a collective whole. A work that constitutes a Collective Work will not be considered a Derivative Work (as defined below) for the purposes of this License.
This is fairly unambiguous when the Work is, say, an encyclopaedia article. Wikipedia uses the similar GFDL (GNU Free Documentation Licence), and it is accepted that you can include a Wikipedia article verbatim on your website, given certain provisos, without affecting the copyright of the rest of the site.
However, the clause has no easy relevance to geodata.
The majority view in the Openstreetmap community (discussion) is that 'derivative work' should be taken in its widest sense, forbidding any non-ShareAlike use of the geodata in any way, shape or form; and that the Collective Work provision doesn't affect that.
How to avoid the problem
The simplest way to avoid all these problems is to use a BSD-style licence, or an unambiguous declaration of 'public domain'. These do not restrict the works that can use the data.
This may have two practical disadvantages. One is that some contributors are implacably opposed to BSD-style licences; another is that it will lead to the development of two distinct free geodata streams, one ShareAlike and one PD/BSD.
The long-term solution, then, must be to create a dedicated LGPL-like geodata licence that addresses the problems above, setting out what you can and can't do in terms that are unambiguously relevant to geodata and mapping.
How to mitigate the problem
If you absolutely, positively have to use an existing ShareAlike licence, here are two steps you can take to mitigate the adverse effects.
1. Assign all copyright to a 'trust', rather than to individual contributors. That way, the trustees can choose to act collectively and grant permission outside the terms of the licence, if they deem it in the best interests of the project or society at large.
2. Expand the licence by adding a definition of 'Collective Work' which specifically explains how you intend it to apply to your geodata.
Richard, you make excellent and illuminating points, and I'm in general agreement over the similarities behind the need for LGPL and the need for a similar geodata license. We're in the process of producing an alternative data set, not a brand new thing that doesn't exist elsewhere. Good point.
But... the last two bullet points on "The derivative work problem" are wrong, I think.
This is revisiting some of our mailing list discussion from last month, but hopefully in a way that doesn't come across as badly as I did last time :)
Example 1 is a total nonsense. Charlbury.info wouldn't be required to distribute ALL its content under a ShareAlike license, just any maps that were derived from Freemappingworld. ShareAlike does propagate, but not that wildly. The PDFs of the Charlbury Chronicle would be safe. (Compare, for example, if I license one photo on Flickr as ShareAlike. It doesn't mean that all the other photos on my account - or even the whole of Flickr - are now part of a derived work and are forced to be distributed the same way!)
Example 2 is unfortunate but true - it doesn't look like you could mix proprietary licensed waterway data with Freemappingworld data that is under a ShareAlike license. But that's to treat waterways as fundamentally different from streets, isn't it? Freemappingworld contributors wouldn't think a ShareAlike license unworkable just because we couldn't mix existing proprietary street data with new ShareAlike data, they'd just plan to map the whole lot eventually.
Example 3 is pretty much true, except where you say "in return, Freemappingworld would get...". Freemappingworld specifically wouldn't *directly* get anything in return, the only thing that happens is that its contributors would have a right to use the map you distributed under a ShareAlike license if they wanted to. If the map were only distributed in print, Freemappingworld's contributors would be left with the task of projecting it back into their system and tracing it, much as they do for GPS and not really any simpler. They might be able to quickly copy map names, but that's all. I don't think you'd be obliged to distribute the raw geodata of the map (source code?) so it's hard to see how you'd actually be disadvantaged. Although this distinction does start to support your argument that ShareAlike isn't much use, albeit from the point of view that it isn't much use to the Free/Open mapping project rather than that it restricts the uses you suggest.
The logo is a very interesting point - I wonder if CC ShareAlike licensed works with a logo on are already in the wild, and if that puts the logo at risk. Perhaps the logo would/should be protected under trademark legislation as well as copyright?
Your idea about assigning copyright to a trust is a good one, though I believe our German friends have legal difficulties in signing away rights?
Posted by Tom Carden on 11.3.06 13:44
I agree that an LGPL-like geodata licence would be great but, like Tom, I think you overestimate the infectiousness of the CC-SA and underestimate the protection offered by the 'collective work' provision.
My reading is this: if you build on a Freemappingworld map (the original work) to produce a Charlbury map (a derived work), and include that in the Charlbury website (a collective work), then the Charlbury map would indeed have to be CC-SA licensed. The Charlbury website, however, would be a collective work which includes "the Work in its entirety in unmodified form", where 'the Work' refers to the derived Charlbury map rather than the 'original' work that it was based upon.
Posted by Peter Ferne on 11.3.06 16:05
Really interesting posts, both (and Jo at mappinghacks.com, too) - thanks.
I'd agree with you for Flickr, Tom, but there are two differences with the geodata situation. One is technical: by uploading a pic to Flickr (or Wikipedia, or whatever), you can't virally infect their site (work) with ShareAlike. They can only do that by downloading your pic, thereby consenting to your licensing conditions, and incorporating that in their own work. That's closer to the example.
But the main difference is that Flickr and Wikipedia are in known territory for "collective works" and geodata isn't. On Wikipedia, for example, it's accepted that you can use an article anywhere you like as long as it's presented distinctly: the ShareAlike infection doesn't seep beyond the div which the text is in (it's not quite that specific about HTML markup, but you know what I mean!). Similarly, on Flickr, the work is the JPEG (/PNG/whatever).
It's not quite that simple with geodata, which doesn't have a physical appearance in itself. If I have a map of Charlbury in a JPEG, made from Freemappingworld data, then I think we're all in agreement that it's a derived work. If I then post a single transparent GIF next to it, of the same proportions but 100 pixels away, that might be a separate/collective work. But if I change the CSS so the same GIF is on top of the same JPEG, to all intents and purposes looking like one single map, is that derived ("building upon")? Or still separate? My brain hurts.
It sounds contrived, but it's not; geowiki v1 works by superimposing images onto a base map tile, and I'd love to use OSM data for geowiki v2 if the licensing permitted it. (Another real-world example: on my cartography portfolio site, I have little decorative faded-out strips of map image at the top of each page. If these were from ShareAlike data, is the GIF the only bit that's infected - or as an integral part of a wider design, does the whole page come with it? I really don't know the answer on that one.)
I suspect I've not been clear enough on example 2, where "waterway data" refers more to point data ("there's a lock here") and metadata ("this waterway takes boats up to 7ft wide") than the actual line data. So it's not so much whether waterways and streets are different - and I agree, they're not - but, again, about the scope of the ShareAlike clause.
And for the "in return" part of example 3, sure, there wouldn't be any legal requirement for me to contribute the waterway line "source" back to Freemappingworld - the letter of the licence, as you note, just says they're left to trace it if they want. But I'd do it anyway. I'd spend a day or two extracting the lines from Illustrator, degeneralising them and turning them into lovely GPXs or shapefiles to upload. If FMW has been kind to me, I'm kind back to it.
(The discussion could very easily get GPL/BSD from there, so I'll stop.)
Away from the specifics, Jo's posting at mappinghacks is really useful partly because it expresses and rephrases everything in very clear English, but also because it introduces the question of data vs. metadata. This could be a beautiful starting point for a serious attempt to tackle the problem.
I'll return to this at some point, probably at a less silly time in the morning. Perhaps, though, the best way to test this would be if I were to get all the geowiki data together in a LGPL-like dataset and see how it flies...
Posted by Richard on 14.3.06 00:41
Dear Richard and others,
I've been following your discussions for a few weeks now and find the issues raised quite interesting, since they're close to our concerns (more on this at some later time, maybe).
Now a question: you mention that "As is already well-established, co-ordinates obtained by clicking on Google Maps are copyright of Google's map providers". Where does this stem from? I couldn't work it out from the GMap API terms.
Posted by Manolis Koutlis on 17.3.06 08:13
Under UK copyright law, at least, anything that you create by direct reference to these maps (e.g. by clicking on them) is a derived work. Therefore partial copyright rests with TeleAtlas, Navteq etc. If this wasn't the case, then you'd be legally entitled to trace along all of the roads depicted on the maps and make a copy... so you can see the reasoning behind it.
Posted by Richard on 17.3.06 11:47