When people talk about data management, they are often referring to the shiny, exciting new data promised by a newly funded grant project. Sometimes they mean a nearly completed research project for which they are interested in data stewardship; less often they are talking about preserving data stored on some nearly obsolete medium at a researcher's request. Rarely, however, are people talking about data sets captured in print as part of the corpus of reports and grey literature on our library shelves.
Advances in technology and in the mass digitization of sources of grey literature, such as government documents that contain data sets, mean that researchers can more easily discover, acquire, extract, reformat and re-analyze data from past experiments. The concept of replication (sharing an experiment’s data and the description of its exact processes so that others can replicate and test the discoveries) has always been at the heart of the scientific method. With the advent of mass digitization efforts, this newly available “hidden” data may be a gold mine for researchers, especially in areas of research where data cannot be easily reproduced or where researchers want to validate past findings using new techniques and analyses.
The Technical Report Archive & Image Library, also known as TRAIL, contains digitized versions of scientific and technical reports of research performed by and for the US federal government. Many of the more than 42,000 reports available in TRAIL contain data sets that may be useful to researchers today. As researchers discover, extract and reformat this data, one would hope they will be willing to redeposit their new data files for others to use.
In a competitive research market, “data sharing” may sound threatening. While more policies and mandates are being enacted to prevent data withholding, demonstrating the value of data sharing with a body of research for which there is less of a sense of ownership may help show the greater scientific value of the whole and help data sharing become more accepted. In other words, as data management matures and researchers become more open and willing to share data, these revitalized data will serve as a great test bed for demonstrating how data sharing, data archiving and data stewardship benefit the research process.