West-Life has delivered an "Assessment of the life cycle of structural data and comparison with other scientific data".
Research data are acquired, interpreted, published, reused, and sometimes eventually discarded. A better understanding of this life cycle will help in developing appropriate infrastructure services that make it easier for researchers to preserve, share, and find data.
Structural biology has a strong tradition of data sharing, expressed by the founding of the Protein Data Bank (PDB) in 1971 (PDB, 1971). The culture of structural biology is therefore already in line with the perspective of the European Commission that data from publicly funded research projects are public data (COM(2011) 882 final).
This report is based on the data life cycle as defined by the UK Data Archive, which identifies six stages: creating data, processing data, analysing data, preserving data, giving access to data, and re-using data. Each is discussed below, with ‘preserving data’ and ‘giving access to data’ treated together. The report also adds a final stage to the life cycle, ‘discarding data’.
Changes in research goals and methods have changed the requirements for IT infrastructure. A common data infrastructure is required, giving a simple user interface and simple programmatic access to scattered data. Progress on these tasks will support the development of workflows that facilitate the use of datasets from different facilities and techniques. The automatic acquisition of metadata can help. Large experimental centres already provide a highly professional data infrastructure. For smaller centres this is onerous, so it is desirable to provide a standard package that enables them to use the European e-infrastructure resources in a way that integrates with other structural biology resources. The West-Life collaboration is addressing these challenges.