
The Research Data Management Facility is running a pilot project with some departments to include data management plans in the assessments that PhD students go through towards or at the end of their first year. We are asking students to complete a brief data management plan (DMP), and asking supervisors and assessors to check that the student has thought about all the issues and that their responses are reasonable. The Research Data Management Facility will provide support for any students, supervisors or assessors who need it.

This page lists some sources of information that will help students to complete their data management plans.

Data Formats

In your DMP you should specify the formats that your data will be collected in or made available in. It is best to share your data in open, non-proprietary formats as this increases the reusability of the data. For example, if you save your spreadsheet in a Microsoft Excel format, this can create problems if the person wishing to look at your data doesn't have access to Excel. In this case it would be much better to save it as a .csv file. You might also want to think about the long-term preservation potential of different formats. For example, if you have an image, .tif is considered to be a better format than .jpg when it comes to preservation. For lists of recommended file formats see:

UK Data Service Recommended File Formats

Library of Congress Recommended Formats Statement
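As an illustration of the spreadsheet example above, the following is a minimal sketch of writing tabular data to a plain .csv file using only Python's standard library. The function names, column names and values are hypothetical and chosen purely for illustration.

```python
import csv
from pathlib import Path


def save_rows_as_csv(rows, header, path):
    """Write a header and data rows to a UTF-8 encoded .csv file."""
    path = Path(path)
    # newline="" lets the csv module control line endings itself.
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return path


def load_csv(path):
    """Read a .csv file back as a list of rows (every value is a string)."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    out = save_rows_as_csv(
        rows=[["S1", 4.2], ["S2", 5.1]],
        header=["sample_id", "ph"],
        path="measurements.csv",
    )
    print(load_csv(out))
```

Because a .csv file is plain text, it can be opened in any spreadsheet program or text editor. Note that formatting, formulas and multiple sheets are not preserved, so anything of that kind that matters should be described in your documentation.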

Short-Term Storage

Good research data management practices include backing up your data in at least two locations. You might use the University's network for this, an external hard drive or a cloud solution. Using private cloud solutions can be problematic because of their terms and conditions, and you should never put personal or sensitive data on a private cloud. However, the University offers several cloud solutions that students can use with their @cam email address, which have different terms and conditions. The University Information Service also offers paid-for storage options for researchers working with large volumes of data (>2TB). For information about University provided cloud solutions see:

UIS Data Storage for Individuals

For information about the paid-for large capacity storage options please see:

UIS Research Data Storage Services
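The advice above about keeping backups in at least two locations could be sketched as a small script. This is only an illustration under stated assumptions: the destination folders are hypothetical placeholders (for example a mounted network drive and an external hard drive), and the checksum comparison is an extra safeguard added here rather than something the guidance requires.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, destinations):
    """Copy one file into each destination folder and verify every copy."""
    source = Path(source)
    if len(destinations) < 2:
        raise ValueError("provide at least two backup locations")
    expected = sha256_of(source)
    copies = []
    for folder in destinations:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / source.name
        shutil.copy2(source, target)  # copy2 also keeps file timestamps
        if sha256_of(target) != expected:
            raise IOError(f"checksum mismatch for {target}")
        copies.append(target)
    return copies
```

A call might look like `back_up("results.csv", ["/mnt/network_drive/backup", "/media/external/backup"])`, where both paths are made up for this example. The same restrictions apply to scripted copies as to manual ones: personal or sensitive data should not be copied to a private cloud folder.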

Data Management

When considering your data management, think about all the day-to-day things you do with your data and digital files. How will you organise your folders and files on your computer? Will you use a file naming convention (e.g. this file naming convention) to make it easy to locate files later on and keep track of the latest version? If you are collecting data in the field, will you need to digitise your notes? If so, when will you do this and how will you back it up if you are away from Cambridge? If you are doing lab work, you might want to consider using an electronic laboratory notebook as this means all your notes and experimental data are digital from the start.
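To make the idea of a file naming convention concrete, here is a minimal sketch. The pattern used (date, project, description and a zero-padded version number) is an assumption made for illustration and is not the convention linked above; the function names are likewise hypothetical.

```python
import re
from datetime import date


def slug(text):
    """Lower-case the text and replace runs of other characters with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def make_filename(project, description, version, extension, when=None):
    """Build a name such as 20240517_soil-survey_ph-readings_v03.csv."""
    when = when or date.today()
    return (
        f"{when:%Y%m%d}_{slug(project)}_{slug(description)}"
        f"_v{version:02d}.{extension.lstrip('.')}"
    )


def latest_version(filenames):
    """Return the name with the highest _vNN suffix, or None if none match."""
    pattern = re.compile(r"_v(\d+)\.[^.]+$")
    best_name, best_version = None, -1
    for name in filenames:
        match = pattern.search(name)
        if match and int(match.group(1)) > best_version:
            best_name, best_version = name, int(match.group(1))
    return best_name
```

Putting the date first in year-month-day order means files sort chronologically in a folder listing, and zero-padding the version number keeps v10 from sorting before v02.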

Ethics and Intellectual Property

Hopefully, if there are any ethical considerations to your PhD, you will have picked them up before the end of your first year. If you still need advice on ethics you should visit the Research Ethics Website. Many departments have their own ethics committees and advice, so it is worth checking if your department does. If you think there might be Intellectual Property issues with your research, please visit the Cambridge Enterprise pages for more information. If there are ethical or IP considerations for your data, you should say what you have done or are planning to do to address these.

Data Sharing and Reuse

When thinking about data sharing you should first consider what your funder requires you to do. Provided you can share your data (there are exceptions for personal/sensitive data) you will need to think about:

a) what data will you share. You should share any data that is needed to support the arguments you make in your publications. This might be the raw data or processed data - you are expected to make this judgement in line with what is most useful to others working in your discipline. You should also share any documentation that goes with the data, e.g. protocols or information about collection. Code should also be shared if this was used in the creation or manipulation of data. If you are going to share anonymised data, do you know how to carry out the anonymisation?

b) where to share your data. A subject specific repository is best. If you don't know what that is for your discipline you can look it up on the re3data registry. If a discipline specific repository doesn't exist you can use a general repository, such as Apollo, the University's repository.

c) when you will share your data. Data should be shared at the same time that any publication it supports is published. Some funders will allow you to delay the sharing of the data if another paper is imminent but you shouldn't delay sharing it indefinitely.

d) is there any cost to sharing your data? The University repository allows you to deposit datasets up to 20GB for free but anything larger is charged at £4/GB (one-off charge).

If you cannot share your data you should state the reasons for this in your plan.
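For item d) above, a rough cost estimate can be sketched as below. The 20GB free allowance and the £4/GB one-off rate come from this page. What the page does not say is whether a larger dataset is charged on its full size or only on the amount above 20GB, so the function makes the caller choose explicitly; check with the repository before relying on either figure. The function and constant names are hypothetical.

```python
FREE_ALLOWANCE_GB = 20  # free deposit size stated on this page
CHARGE_PER_GB = 4       # one-off charge in pounds per GB, stated on this page


def deposit_cost_gbp(size_gb, *, charge_whole_dataset):
    """Estimate the one-off deposit charge in pounds for a dataset.

    charge_whole_dataset=True assumes the rate applies to the full size
    once the allowance is exceeded; False assumes only the excess is
    charged. Which rule applies is an open assumption, not a confirmed
    policy.
    """
    if size_gb <= FREE_ALLOWANCE_GB:
        return 0
    if charge_whole_dataset:
        chargeable_gb = size_gb
    else:
        chargeable_gb = size_gb - FREE_ALLOWANCE_GB
    return chargeable_gb * CHARGE_PER_GB


if __name__ == "__main__":
    for size in (15, 50):
        print(size, "GB:",
              deposit_cost_gbp(size, charge_whole_dataset=True), "or",
              deposit_cost_gbp(size, charge_whole_dataset=False), "pounds")
```

Either way, it is worth estimating the final size of your dataset early so that any charge can be included in a funding request.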

Thesis Sharing

It is now a requirement of graduation that a digital copy of your thesis is given to the University repository. This doesn't mean that your thesis has to be openly available, although it is encouraged. Think about any issues there might be with making your thesis available in the repository. The Office of Scholarly Communication provides detailed advice about the new requirement to deposit an electronic copy of your thesis.