The Center for Informatics Research in Science and Scholarship in the UI's Graduate School of Library and Information Science will receive about $2.9 million as a partner in the Data Conservancy project, a $20 million initiative led by the Johns Hopkins University Sheridan Libraries. The five-year award, one of the first two in the National Science Foundation's DataNet program, will build infrastructure for managing the ever-increasing amounts of digital research data.
The principal investigator is Sayeed Choudhury, Hodson Director of the Digital Research and Curation Center and associate dean of university libraries at Johns Hopkins. The sub-award to the UI is led by co-principal investigator Carole L. Palmer, director of CIRSS and a professor of information science. Other CIRSS researchers include GSLIS faculty members Melissa Cragin, Allen Renear, John MacMullen and David Dubin, and Michael Welge from the National Center for Supercomputing Applications.
The project will begin with data from astronomy, the life sciences, earth sciences and social sciences, developing a framework to understand current data practices more fully and arriving at a model for curation that allows ease of access within and across disciplines.
The Illinois team will contribute to multiple aspects of the project, conducting studies of scientists' data practices and needs, and analyzing how best to represent complex units of data in the repository. "We will be conducting a systematic analysis of the data curation requirements across the disciplines served by the Data Conservancy," Palmer said. "Our primary interest is in the 'long tail of small science,' and how to support collecting and sharing of the highly variable types of data produced by individual scientists and small research groups. Our results will determine data-curation and preservation requirements but also policies to guide how the Data Conservancy and other large, cross-disciplinary data repositories are developed and used."
The research led by Renear will develop formal terminology and identity conditions for fundamental data concepts. "Many of the key cross-cutting concepts of scientific data organization remain poorly defined," Renear said. "Our work will provide the foundation for standardizing how Data Conservancy datasets are identified, described, related and organized."
The CIRSS research activities and other Data Conservancy efforts will feed directly into two professional training programs at GSLIS: the Data Curation specialization in the master's program in library and information science, and the Biological Information Specialists master's in the campuswide bioinformatics program. The award also will support professional development in data-curation principles, processes and technologies.