 
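This page provides guidance for dataset depositors on who can submit data, what can be submitted, the information required in the submission form, and what happens after submission.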
 
 
Visit this page to deposit your dataset via Symplectic Elements.
 

Who can submit data to the Repository?

Anyone with a valid Cambridge CRSid can upload data by logging in to Symplectic Elements. Guidance for first-time users is available in several forms. Anyone who has a legitimate reason to submit data to the University of Cambridge data repository, but who does not have a valid CRSid, should e-mail us to request external user access.

 

What can be submitted?

Only research data connected to the University of Cambridge can be deposited in the repository. This includes datasets created by current or former University of Cambridge researchers, research students or staff members; datasets resulting from research conducted at the University of Cambridge; datasets that appear in a journal published, or a conference hosted, by the University; or datasets resulting from research undertaken using University facilities.

 

Information required in the data submission form

Information about the nature of the data

When you submit your data, we will ask whether they contain any personal/sensitive, commercially sensitive or other forms of confidential/restricted information, and whether you have the rights to share these data via the repository.

Information about personal/sensitive information 

For more about personal/sensitive information, see here. If you have any doubts about this question, please consult the University of Cambridge Ethics pages or e-mail the University’s Research Governance and Integrity Officer.

Information about other forms of confidential/restricted information 

Examples of other forms of confidential/restricted information include data covered by confidentiality or publication restrictions in sponsorship or collaboration agreements, and research data subject to UK export control law.

Rights to share the data

Depositors must confirm that they have the authority or permission to deposit data. This includes obtaining permission to share the data from any third parties who might hold rights over this dataset.

If you have any doubts about your rights to deposit and share your data, please consult (in the given order):

  • the Principal Investigator responsible for this study (if you are not the Principal Investigator)
  • your Departmental Administrator (or equivalent)
  • the appropriate Contracts Manager at the Research Operations Office.

Compulsory information about your dataset

When you submit your data to us we will ask for:

  • Title of the dataset
  • The authors of the dataset
  • Information about the publication (or thesis) associated with your dataset (if applicable):
    • If your data support a publication, we will also ask you for details of the publication (its title and DOI)
  • Embargo options:
    • If you wish, you will be able to select the option to embargo your dataset. Please note that we will only embargo datasets until the associated publication has been published. Data files will not be publicly available while an embargo is in place, but the metadata will be publicly visible.
  • Description of the data
    • This is important contextual information about the dataset, and documenting your data comprehensively is an essential part of sharing your data well. Where are the data from? How were they generated? Give details that would help someone else understand your data. You may wish to embed discipline-specific metadata in your dataset documentation – the Digital Curation Centre provides examples of disciplinary metadata. We recommend that you deposit a Readme file alongside your dataset, or multiple Readme files if your dataset consists of multiple files and folders.
  • Keywords – to make your data discoverable via search engines
  • Sponsorship information and grant IDs
  • Information on file formats:
    • We will ask you to list all file formats in your dataset.
    • You are strongly encouraged to submit your dataset in open formats, to facilitate long-term preservation and accessibility of your data. However, we recognise that it is not always possible to export all data files into open formats. Research data in proprietary file formats will therefore also be accepted by the repository, but we will ask you to provide information about the software needed to read and process your files.
    • Guidance on choosing file formats is available here.
  • Software:
    • Information about the software needed to read your files or any other information that someone might find useful when trying to open/process your data files.
  • Licence for your data:
    • You will be required to indicate what type of licence you would like to be applied to your research data. You can read information about available licences here and a nice graphic explaining licence types can be viewed here.
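
The compulsory items listed above can be checked before you open the submission form. The following is a minimal sketch only: the field names (title, authors, funding and so on) are hypothetical labels chosen for this example and are not the names used in Symplectic Elements, and the example record contains placeholder values.

```python
# Hypothetical field names for the compulsory items listed above.
# These are illustrative labels, not the actual form fields.
COMPULSORY_FIELDS = [
    "title",
    "authors",
    "description",
    "keywords",
    "funding",        # sponsorship information and grant IDs
    "file_formats",
    "software",
    "licence",
]


def missing_fields(record):
    """Return the compulsory fields that are absent or empty in record."""
    missing = [name for name in COMPULSORY_FIELDS if not record.get(name)]
    # Publication details are only needed if the data support a publication.
    if record.get("supports_publication") and not record.get("publication_title"):
        missing.append("publication_title")
    return missing


# Placeholder record for illustration only.
example_record = {
    "title": "Example dataset",
    "authors": ["A. Researcher"],
    "description": "",  # left empty on purpose
    "keywords": ["example"],
    "funding": "Example funder, grant ID placeholder",
    "file_formats": [".csv"],
    "software": "Any text editor",
    "licence": "Name of the chosen licence",
    "supports_publication": True,
}

print(missing_fields(example_record))
```

A check like this only confirms that something has been entered for each item; it says nothing about whether the description is detailed enough for someone else to understand the data.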

Optional information

  • Related resources
    • If you would like us to link your dataset with other relevant existing resources (for example, your other existing publications, other datasets, external reports, webpages, news articles etc.), please provide URLs here.
  • ORCID ID
    • Open Researcher and Contributor ID (ORCID) provides each academic with a unique identifier, and is increasingly required by publishers and by data repositories at the stage of research output submission. The use of ORCID ensures that each academic’s research activities are distinguished from those of others with similar names.
  • Name of the Principal Investigator – if you are not the Principal Investigator, you will be asked to indicate who was the Principal Investigator
  • Additional information:
    • You can provide any additional information about your data here, for example, if you need the link to your data urgently (we normally process data submissions and provide you with a link within three working days), if this is a placeholder record, if you wish this dataset to be visible only to the reviewers etc.

Compulsory administrative information

We will also ask for some administrative information about you in order to process and preserve your research data:

  • Your name and surname
  • Your e-mail address
  • Your department/institute

Data files

Finally, you will be asked to upload your files. Note that the maximum file size for individual files is 2GB. If the total size of your dataset exceeds 20GB, there will be an additional charge of £4 for each GB over 20GB. Please contact us if you would like to submit individual files that exceed 2GB, or if your submission is larger than 20GB in total.
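
To see how the limits above apply to a dataset before uploading, a short script can flag files over 2GB and estimate the additional charge. This is a rough sketch under stated assumptions: it treats 1GB as 1024^3 bytes and charges part of a GB pro rata, neither of which is specified on this page, and the function names are made up for this example. Please contact us for the actual charge.

```python
from pathlib import Path

GB = 1024 ** 3                 # assumption: the page does not define GB
MAX_FILE_BYTES = 2 * GB        # per-file limit stated above
FREE_TOTAL_BYTES = 20 * GB     # total size accepted without a charge
CHARGE_PER_EXTRA_GB = 4        # pounds per GB above 20GB


def oversized_files(sizes):
    """Return the names of files larger than the 2GB per-file limit."""
    return sorted(name for name, size in sizes.items() if size > MAX_FILE_BYTES)


def estimated_charge(total_bytes):
    """Estimate the charge in pounds (assumption: part GBs charged pro rata)."""
    extra_bytes = max(0, total_bytes - FREE_TOTAL_BYTES)
    return CHARGE_PER_EXTRA_GB * extra_bytes / GB


def folder_sizes(folder):
    """Map each file under folder (recursively) to its size in bytes."""
    root = Path(folder)
    return {
        str(path.relative_to(root)): path.stat().st_size
        for path in root.rglob("*")
        if path.is_file()
    }


# Example with made-up sizes: one 3GB file and eleven 2GB files (25GB in total).
sizes = {"scan.tif": 3 * GB}
sizes.update({f"part_{i}.csv": 2 * GB for i in range(11)})
print(oversized_files(sizes))                  # ['scan.tif']
print(estimated_charge(sum(sizes.values())))   # 20.0
```

For a real dataset, `folder_sizes("path/to/dataset")` would supply the sizes; the path here is a placeholder.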

Data submitters are also responsible for consulting the guidance on file formats before submitting their data to the data repository.
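
Because the submission form asks you to list every file format in your dataset, it can help to generate that list from the dataset folder. The sketch below is one possible way to do this; it identifies formats by file extension only, which is an assumption of the example and may not reflect the true format of every file.

```python
from collections import Counter
from pathlib import Path


def count_formats(filenames):
    """Count files per lower-cased extension; files without one go under '(none)'."""
    counts = Counter()
    for name in filenames:
        suffix = Path(name).suffix.lower()
        counts[suffix or "(none)"] += 1
    return dict(sorted(counts.items()))


def formats_in_folder(folder):
    """Count file extensions for every file under folder, recursively."""
    return count_formats(p.name for p in Path(folder).rglob("*") if p.is_file())


# Example with made-up file names.
print(count_formats(["results.csv", "RAW.CSV", "notes.txt", "README"]))
```

The resulting list can be compared against the guidance on file formats to spot proprietary formats that need accompanying software information.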

 

What happens after we receive your data submission?

The following image summarises the data submission process. Please note that Step 3 is only required when submitting a placeholder (i.e. draft) dataset; otherwise, you can submit the final version of your dataset if it is ready to be archived and shared publicly in the repository.

[Image: Data submission process]

We will respond within three working days of your dataset submission. If you have submitted your dataset as a placeholder record, you will automatically receive the DOI of your dataset to cite in your publication. If you have specified in the placeholder record that your dataset contains sensitive information, or if we suspect sensitive content based on the information provided, we will contact you for more information. Otherwise, we will wait for you to finalise your dataset before going any further. We expect datasets to be finalised before, or at around the same time as, the publisher releases the associated manuscript.

If you have submitted your dataset as the final version (or after you have finalised your placeholder dataset), we then review your data submission before uploading it to the repository. Please note that our DOI policy states that datasets cannot be changed (e.g. files added, removed or amended) after they have been approved into the repository. Only finalised datasets will be processed into the repository. If you wish to submit a dataset in draft form to be finalised later (e.g. after peer review) then choose to submit your dataset as a placeholder record. 

When we review a finalised dataset, we check the following before approving the dataset into the repository: 

  • Is this dataset submitted by (or on behalf of) a current/former University of Cambridge researcher, research student, or staff member? 
  • Does the submitter have the rights to share the data via the Repository? See ‘Responsibilities of Depositors’ in the Repository Terms of Use.
  • Does the dataset contain any confidential/restricted information?  
  • Do the files open without errors? 
  • Is the submission accompanied by an appropriate metadata description? This includes keywords, a detailed dataset description, software instructions and, if applicable, readme file(s), variable definitions, units and a codebook.
  • Are the file formats suitable for long-term preservation of data files? If not, could the files be exported to a different file format, more suitable for preservation? 
  • If applicable, has the title of the publication associated with the dataset been provided? 
  • Has an external email address (i.e. not a cam.ac.uk address) been provided? 

If your dataset contains information pertaining to human participants (including de-identified data) we will contact you to ask if you have the correct consent to allow your data to be made publicly available in the Repository. We will also ask you to send us a copy of the consent forms and/or participant information sheets for our review. All data deposited in the Repository are made publicly available. 

We expect datasets to have appropriate metadata supplied so that the contents of the dataset can be understood and reused by others. If we think that any information is missing, we will get in touch with you to request the missing information.  

If all the information is provided, we will upload your data into the repository and send you the DOI of your dataset. Your dataset will be linked to the DOI of the associated publication, either at the point of deposit in the repository or at a later date if unavailable beforehand. We will also link your dataset to the corresponding manuscript in the repository (e.g. the accepted manuscript submitted to the Open Access team) and vice versa. These steps help to increase the findability of your dataset, enhancing opportunities for dataset citation. The recommended citation for your dataset (author(s), publication date, title and DOI) is provided on the page for your dataset in the repository.  

If you have chosen to embargo your dataset, access to the data files while they are under embargo can be granted only by the dataset author, via our request a copy service. Although the data files are not publicly available while a dataset is embargoed, please note that the metadata for your dataset are publicly visible and findable via search engines. Embargoes on datasets associated with articles are removed as soon as we are aware that the article has been published.

 

Questions

If you have any questions about data submissions or data sharing, please review the frequently asked questions. If you cannot find the answer to your question, please contact us.
