Guide to preservation in DTU Data

DTU Data aims to be a trusted repository, where preservation practices are well defined and understood by users.    

By publishing your data in a trusted repository such as DTU Data, you initiate the preservation process and take the first step toward ensuring that your research outputs remain accessible and reusable.  
DTU Library manages DTU Data and is responsible for maintaining all documentation and user support on DTU Data. DTU Library aims to make stored metadata and data findable, accessible and understandable in the long term.

DTU Library collaborates with the IT department, which is responsible for the technical infrastructure, hardware migration and IT security measures.
Data are often created with one specific purpose in mind. In the future, however, they may retain their value only if they were saved in open formats, and they may also prove valuable for reuse in other contexts.

Data preservation only creates value if it is supported by documentation of data provenance and of the software and hardware requirements for reusing the data. DTU Data staff will review the metadata of your submitted item to increase findability, interoperability and reusability for data users.

This is done in collaboration with the depositor. As depositor, you must also evaluate the item in terms of scientific quality and context. These are important aspects when appraising data for preservation actions.

Metadata for DOI

Digital Object Identifiers (DOIs) in DTU Data are registered with DataCite according to the DataCite Metadata Schema. DTU Library is therefore obliged to ensure persistent access to the metadata, either through DTU Data or through another openly accessible repository.

Preservation of item files

DTU Data applies two levels of file support:

  1. Bit-level preservation: Access to the file in its submission format is provided.
  2. Full preservation: Files are kept usable through actions such as migration, normalization and conversion.

With bit-level preservation, DTU Data guarantees access to files for a minimum of 10 years (see Curation level 2 below). Extended access to item files may therefore depend on the files being uploaded in preferred formats. See our list of preferred formats. For files stored outside DTU Data, see Curation level 3 below.
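Bit-level preservation means that the stored bytes stay unchanged, which a depositor or data user can check independently with checksums. The following is a minimal sketch in Python, not a description of how DTU Data verifies files internally: the function names (sha256_of_file, build_manifest, verify_manifest) are hypothetical, and the choice of SHA-256 is an assumption made for illustration. It records a checksum for every file in a folder before upload and reports any file that is missing or has changed when the folder is checked again later.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file path (relative to folder) to its SHA-256 digest."""
    root = Path(folder)
    return {
        p.relative_to(root).as_posix(): sha256_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(folder, manifest):
    """Return the relative paths that are missing or whose content has changed."""
    root = Path(folder)
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of_file(target) != expected:
            problems.append(rel_path)
    return problems
```

A typical use would be to call build_manifest on the folder before deposit, keep the resulting mapping with the item documentation, and call verify_manifest on a downloaded copy later. An empty list means every file is bit-for-bit identical to what was deposited. It says nothing about whether the file format can still be opened, which is the concern of full preservation.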

Curation levels describe the different degrees of care and preservation actions applied to research data. They help clarify what level of accessibility, usability, and long-term security can be expected, depending on how the data are stored and maintained.

 

Level 1

 
Availability for more than 10 years must be determined at research group or department level according to local guidelines. Actions include curating variables and files for long-term preservation and converting data into long-term formats. Documentation of the variables and of the metadata standards used is preserved together with other documentation. Long-term accessibility is also ensured by hardware migration. Data may be suitable for reporting to the National Archives; if reporting is not already required by Danish legislation, this is determined together with the research group or department. Examples are time-series data, cohort studies and databases, as well as research data of high value for the technical sciences or at the national level.
 

Level 2

Data are available in the same format as deposited. Data are available for a minimum of 10 years, but are only backed up at bit level. Depositing files in preferred file formats is necessary for usability beyond 10 years. Secondary use of data is possible, for example for validation of published results. Examples are data and code supporting publications, models, videos and images, as well as data with insufficient documentation.
 

Level 3

No file preservation actions are carried out for files stored outside DTU Data, for example when data are stored elsewhere and linked from a metadata record published in DTU Data. This can include data archived in department archives, such as confidential data (including personal data), or data stored on other web services. Some data that cannot be shared openly may still need long-term preservation according to Danish legislation.

Some research data and outputs may be assessed as having such high value that long-term preservation, i.e. for more than 10 years, is desired (see Curation level 1 above).

Some research data must be reported to the National Archives. After reporting, the National Archives will evaluate which data should be preserved for more than 100 years. Costs for preservation actions are borne by the research project or department. If your research data are exempt from reporting, you can still ask the National Archives to evaluate whether the research data can be transferred for long-term preservation.

It is possible to use DTU Data for transferring data to the National Archives.

It is possible to use DTU Data for long-term preservation of highly valuable data that are not preserved by the National Archives (see Curation level 1). Such full preservation actions require dedicated resources for file format migration and for hardware migration. Data and preservation actions also need regular reappraisal. Therefore, an individual agreement on actions and expenses is needed with DTU Library and the IT department.

Contact datamanagement@dtu.dk before you prepare your research data for long-term preservation.

Danish law requires that certain data are reported and delivered for long-term preservation at the National Archives. Long-term preservation means archiving for longer than the required 5 years after publication; the data must be thoroughly documented and stored in a future-proof open format.

However, data published in DTU Data are exempt from reporting, as they are already harvested for archiving at Netarkivet.dk.

Some research data are assessed to be of such high value that they must be preserved for more than 100 years. In Denmark, the National Archives is responsible for deciding which digital research data fall into this category.