Research Data: A Quickly Vanishing Resource
By placing data in a well-preserved public space, researchers could in time eradicate the problem of data accessibility.
An article in Current Biology ("The Availability of Research Data Declines Rapidly with Article Age" [subscription required]) uncovered an alarming issue with scientific publishing: data underlying published manuscripts disappear rather quickly. And even worse, it is sometimes impossible to contact any authors of older papers to request raw data in the first place.
In some cases, published conclusions and figures are sufficient to drive further research. Still, in this era of increased focus on reproducibility and on informed commentary about published papers, it is imperative to have access to the data and not just the graphs. But just how accessible are these data?
In the Current Biology study, Vines et al. sampled 516 articles published between 2 and 22 years earlier. For each article, they attempted to contact the original authors by e-mail to ask about the status of the data that formed the basis of the results. Based on their regression analysis, the odds of finding an extant data set (whether or not the authors were willing to share it) fell by 17% for each year since publication. This drop was not related to the response rate of the authors who were contacted or to the ability to find working e-mail addresses for them, both of which remained steady across the entire range of article ages. In some cases, the original authors reported that the data were lost or stored on media that they could no longer use (e.g., floppy disks).
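To get a feel for what a 17% yearly drop in odds means, the decline can be compounded over an article's age. The short Python sketch below assumes the drop applies uniformly every year, and it uses a hypothetical starting probability of 90% purely for illustration; that baseline, along with the function names, is an assumption and is not taken from the study.

```python
def odds_multiplier(years, annual_drop=0.17):
    """Factor by which the odds shrink after `years`,
    assuming they fall by `annual_drop` every year."""
    return (1.0 - annual_drop) ** years


def probability_after(years, baseline_probability, annual_drop=0.17):
    """Convert a baseline probability to odds, apply the yearly
    decline, and convert the result back to a probability."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    odds = baseline_odds * odds_multiplier(years, annual_drop)
    return odds / (1.0 + odds)


# Hypothetical baseline: a 90% chance that the data exist at publication.
# This number is an illustration only; it does not come from the study.
for years in (2, 10, 22):
    multiplier = odds_multiplier(years)
    probability = probability_after(years, baseline_probability=0.9)
    print(f"{years:>2} years: odds x {multiplier:.3f}, probability {probability:.2f}")
```

Note that the 17% figure applies to odds rather than to probability, so the chance of finding a data set does not fall by 17 percentage points a year; the conversion in `probability_after` makes that distinction explicit.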
One good solution is to place data in a public archive upon publication, as the authors both recommend and put into practice. Several journals (such as The American Naturalist and BMJ) have already put such policies in place. Even if the journal will not host raw data, a number of third-party services are available. Both figshare and Dryad will host any data set for free, even providing DOIs that make the information citable and uniquely identifiable. By placing data in a well-preserved public space, authors no longer have to worry about meeting requests for the data.
The next time you write a research article, consider placing raw data or additional results that will not be published elsewhere in a repository like Dryad or figshare. Make sure that future studies can build on your results; don't let them become lost with time.