Preferred formats in DTU Data

DTU Data accepts data in all file formats. However, to ensure long-term accessibility and readability, you should choose open file formats for deposited files.

Open file formats have openly published specifications and are therefore usually non-proprietary. Saving data in open file formats will

  1. make it easier for you and others to use the files in the future, when the proprietary software needed to process them may no longer be available, and
  2. make the file content more accessible to others who do not have access to proprietary software.

If you aim for long-term preservation, data must be uploaded in an open or preferred file format. All files must have a valid file extension, e.g. .txt or .pdf. If your data cannot be stored in a preferred format, they can still be published in their original format, but in that case DTU Data only commits to preserving the data at bit level in the long term (i.e. access to the file in its submission format is provided). If appropriate, files may also be archived in their original file format in addition to the preferred format(s).
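Bit-level preservation means that the stored bytes stay identical to what was submitted. As a rough illustration only, a depositor could record a checksum before upload and compare it with the checksum of a later download. The sketch below uses Python's standard hashlib module; the function name sha256_of is hypothetical, and nothing here describes how DTU Data itself verifies files.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks.

    Hypothetical helper for illustration; not part of any DTU Data tooling.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the digest of the downloaded file equals the digest recorded before upload, the two files are identical at bit level.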

Read more about preservation in DTU Data here, or contact datamanagement@dtu.dk for more information.

File formats suitable for full preservation (examples)
 Containers: TAR, GZIP, ZIP
 Databases: XML, CSV
 Geospatial: GeoTIFF, NetCDF
 Sounds: WAVE, AIFF, MP3
 Statistics: ASCII, DTA, POR, SAS, SAV
 Video: MPEG-4
 Images: TIFF, JPEG, PDF/A, PNG
 Tabular data: CSV, tab-delimited values
 Text (slides, illustrations): PDF/A (and original file)
 Text: plain text, XML, PDF/A
 Array data: NetCDF

 Full preservation: Keeping files usable will require actions such as migration, normalization and conversion. Preferred formats are a prerequisite for such actions.

The table gives examples of preferred formats for long-term preservation in DTU Data. It is neither exhaustive nor exclusive. Contact datamanagement@dtu.dk before you prepare your research data for long-term preservation.
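Before depositing, it can help to check which files in a dataset already carry an extension that matches one of the example formats. The following is a minimal sketch, assuming a hand-written mapping from the format names in the table to common file extensions. The set PREFERRED_EXTENSIONS and the function classify are hypothetical and are not an official DTU Data list.

```python
from pathlib import Path

# Assumed extensions for the example formats in the table. The table lists
# format names only, so this mapping is an illustration, not an official list.
# Note that a .pdf extension does not prove that a file is PDF/A.
PREFERRED_EXTENSIONS = {
    ".tar", ".gz", ".zip",            # containers
    ".xml", ".csv", ".tsv", ".txt",   # databases, tabular data, plain text
    ".tif", ".tiff", ".nc",           # GeoTIFF/TIFF, NetCDF
    ".wav", ".aiff", ".mp3",          # sound
    ".dta", ".por", ".sav",           # statistics
    ".mp4",                           # MPEG-4 video
    ".jpg", ".jpeg", ".png", ".pdf",  # images and documents
}


def classify(filenames):
    """Group file names by whether their extension is in the assumed list."""
    report = {"preferred": [], "other": [], "no_extension": []}
    for name in filenames:
        suffix = Path(name).suffix.lower()  # compare case-insensitively
        if not suffix:
            report["no_extension"].append(name)
        elif suffix in PREFERRED_EXTENSIONS:
            report["preferred"].append(name)
        else:
            report["other"].append(name)
    return report


if __name__ == "__main__":
    print(classify(["results.csv", "notes.docx", "README"]))
```

Files reported under "other" or "no_extension" are candidates for conversion, renaming, or an accompanying readme.txt. The check looks at extensions only and does not validate file contents.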

Proprietary formats are file formats owned and controlled by a company or organization. They often require specific (and sometimes paid) software to open and edit.

When saving and publishing data in proprietary formats, consider including an explanatory readme.txt file with the data. Include the name, version and original use of the software used to generate the files. This information may be needed to handle the files in the future.
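A readme of this kind can be written by hand, but generating it from a small script keeps the fields consistent across datasets. The sketch below is one possible layout, not a DTU Data template. The function build_readme, the field labels, and the example values are all hypothetical.

```python
from pathlib import Path


def build_readme(software, version, original_use, files):
    """Return readme text describing the software behind proprietary files.

    Hypothetical layout for illustration; adapt the fields to your dataset.
    """
    lines = [
        "README",
        "",
        f"Software: {software}",
        f"Version: {version}",
        f"Original use: {original_use}",
        "",
        "Files generated with this software:",
    ]
    lines += [f"  - {name}" for name in files]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values are placeholders, not real dataset metadata.
    text = build_readme(
        software="Adobe Photoshop",
        version="(fill in the version you used)",
        original_use="editing microscopy figures",
        files=["figure1.psd"],
    )
    Path("readme.txt").write_text(text, encoding="utf-8")
```

Writing the file as UTF-8 plain text keeps the readme itself in an open format.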

Examples of proprietary formats:

  • .docx (from Microsoft Word)
  • .xlsx (from Microsoft Excel)
  • .psd (from Adobe Photoshop)

Alternative open formats (e.g., .txt, .csv, .pdf) can usually be used across programs and without a license. However, we understand that there may be context and features in proprietary formats that are essential within specific research disciplines. Contact us at datamanagement@dtu.dk if you have questions regarding your file formats.