About Data Incubator

When it comes to demonstrating the power of open, linked data, a "show, don't tell" policy is often the best approach. But until data is available in open, easily re-usable formats, it's not feasible to show that power. And without access to data, it's not possible to show that there is a community eager, willing, and able to do interesting things with that data. Without those demonstrations it can be hard for the owner or curator of a dataset to justify investment in opening up that data. It is a Catch-22 situation.

In many cases this problem has been addressed by interested individuals or groups scraping data from websites and re-publishing the raw data. There have been many successes, but these efforts generally aim to solve a specific problem or need rather than opening up data for wider reuse. They also don't necessarily provide the original data owners with the means to build on this work, e.g. by supplying code that could be audited or reused, which would let the data become "officially" sanctioned by the original owners.

Attribution and recognition of the original sources of data will be an important community norm amongst the DataIncubator.org developers. All data exposed through the website will link back to and reference its original sources.

The ultimate goal for all datasets hosted on DataIncubator.org is that they be (re-)adopted by their original owners. In some cases the demonstration of community interest may be all the original owner needs. In others the owner may want or need to draw on the modelling work done by the community. In still others, even the conversion code and portions of the website infrastructure may be adopted by the owners, allowing them to fast-track the production of official datasets. To support this, all code and documentation, including RDF vocabularies, published through DataIncubator.org will be released as open source under liberal reuse licenses.

When data is re-adopted by its owner, redirects will be put in place to ensure that re-users of the data are directed to the authoritative copy wherever it may end up.

The goal is not to steal data, but to show that there is a better way.