Leeds University Library

Data Management Planning (DMP)


What is a Research Data Management Plan? Why would I develop one for my research?

A research data management plan (DMP) is evidence of your commitment to valuing the research data generated by you or your project. It also ensures that you, as a researcher, and your institution are better positioned to meet the requirements of our research funders. See Bidding.

Why develop a research data management plan (DMP)?

  • You can find and understand your data when you need to use it
  • There is continuity if project staff leave or new researchers join
  • You can avoid unnecessary duplication e.g. re-collecting or re-working data
  • The data underlying publications are maintained, allowing for validation of results
  • Data sharing leads to more collaboration and advances research
  • Your research is more visible and has greater impact
  • Publishers and research funders may require that you share your data so it is worth investing time to plan for effective data management. Several funders ask for data plans as part of grant proposals.
  • Other researchers can cite your data so you gain credit

Did you know that the University of Leeds Research Data Management Policy states...?

  • A data management plan that explicitly addresses the capture, management, integrity, confidentiality, preservation, sharing and publication of research data must be created for each proposed research project or funding application.
  • Sufficient metadata shall also be created and stored to aid discovery and re-use.
  • Data management plans should take account of and ensure compliance with relevant legislative frameworks which may limit public access to the data (for example, in the areas of data protection, intellectual property and human rights).

Please see the FAQs supporting the University of Leeds Research Data Management Policy.  This resource has been developed to help understand the policy requirements whilst also being a standalone resource for information and guidance on research data management.


How would I start to structure my research data management plan (DMP)?

A typical research data management plan (DMP) provides information on:

  1. The research project and research context, including what data will be created by the research project and how
  2. Data Types, Formats, Standards and Capture Methods
  3. Ethics and intellectual property - including what actions are appropriate given the nature of the data and any restrictions that may need to be applied.
  4. Access, Data Sharing and Re-use including an outline plan for the sharing and preservation of the research project outputs/data
  5. Short-term storage and data management
  6. Deposit and long-term storage
  7. Resourcing
  8. Training and Support                                                         

This DCC structure gives you the generic headings/sections that inform a data management plan.  The information in the extended tabs below will help you consider what content to include as part of these headings/sections:


DMP - Creating and Organising my data

1. DMP - The Research Project & Research context. Consider the following:

  1. Details on the Research project funded including research aims, methodology, contact details etc
  2. Research funder grant requirements, eg how long are you required to preserve your research data beyond the life of the research?

For certain areas of research it may be difficult, due to time and cost constraints, to undertake primary research or analysis. Obtaining information from secondary sources may help you find the data you need. Data that has already been published, ie in books, journals and periodicals, provides a useful source of secondary data - see the list below.

      • Government records 
      • Census Data/population statistics
      • Health records
      • e-journals
      • General websites 
      • Weblogs
      • Diaries
      • Letters

Remember your research funder may ask you to outline the steps you have taken to satisfy yourself that the research data to be generated has not already been funded and published.


& a peer review/funder perspective?

  • What secondary sources of data have been considered and evaluated? 
  • Is the project creating new data? If so, why could existing data resources not be re-used?
  • If existing data are used, have issues such as copyright or IPR of such data been considered, and has any necessary copyright clearance been obtained so that the data, or data derived from it, can be shared?

2. DMP - Data Types, Formats, Standards and Capture Methods. Consider the following:

  1. The approximate size of your research data / number of data files that represent your research
  2. How your data is organised, described and annotated
  3. See What is research data? and File formats

For example:

To help you keep track of your data files, it is important to adopt a file naming and versioning convention.
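
A convention of this kind can be sketched in a few lines of Python. The pattern used here (project, description, date, two-digit version) and the function names `build_filename` and `next_version` are illustrative assumptions, not a University of Leeds standard; adapt the elements to suit your own project.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<description>_<YYYYMMDD>_v<NN>.<ext>
PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_(?P<desc>[a-z0-9-]+)_"
    r"(?P<date>\d{8})_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def slug(text: str) -> str:
    """Lower-case the text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project: str, description: str, created: date,
                   version: int, ext: str) -> str:
    """Build a file name that follows the illustrative convention above."""
    project_part = slug(project).replace("-", "")
    return (f"{project_part}_{slug(description)}_"
            f"{created:%Y%m%d}_v{version:02d}.{ext.lower()}")


def next_version(filename: str) -> str:
    """Return the same file name with the version number increased by one."""
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError(f"{filename!r} does not follow the naming convention")
    parts = match.groupdict()
    new_version = int(parts["version"]) + 1
    return (f"{parts['project']}_{parts['desc']}_{parts['date']}"
            f"_v{new_version:02d}.{parts['ext']}")
```

Keeping the date in year-month-day order means files sort chronologically in a directory listing, and saving each revision under a new version number rather than overwriting the old file preserves the history of your changes.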


& a peer review/funder perspective?
  • In line with the research and methodology proposed in the application, is the information on the data to be produced adequate and realistic?
  • Is there evidence that the plan covers all data that is planned to be generated from the research?
  • Is sufficient information given on how data will be collected and in which formats (eg Open Document Format, tab-delimited, Excel etc)?
  • How will data be documented, analysed and stored?
  • Have any data collection or quality assurance procedures been stated? This could include methods for data validation or standards applied during data collection and data entry, codes of research practice adhered to, transcription templates used, etc.
  • Does the proposed research create a clear need for quality assurance procedures to be outlined?

DMP - Accessing and Looking after my data

3. DMP - Ethics and intellectual property

Ethics in research relates to issues such as:

  • consent
  • confidentiality
  • privacy
  • security
  • disclosure
  • protection
  • safeguarding
  • risk
  • human rights

These issues apply to all areas of your research. Hence the principles of good research practice encourage those involved in research to consider the wider consequences of their research and to engage with the practical, ethical and intellectual challenges inherent in high-quality research.

The university provides information and guidance on good practice in ethics and ethical review.

Data Protection

The eight principles of the Data Protection Act state that personal information should be:

  1. Fairly and lawfully processed
  2. Processed for limited purposes
  3. Adequate, relevant and not excessive
  4. Accurate and up to date
  5. Not kept for longer than is necessary
  6. Processed in line with your rights
  7. Secure
  8. Not transferred to other countries without adequate protection

For more detailed information, guidance and advice, and for answers to any queries you may have on data protection, please go to: Data Protection.

Copyright and IPR issues?

Intellectual property rights, very broadly, are rights granted to creators and owners of works that are the result of human intellectual creativity. The main intellectual property rights are: copyright, patents, trade marks, design rights, protection from passing off, and the protection of confidential information (JISC Legal).

It is important to clarify intellectual property rights at the beginning of the research process, otherwise there may be unexpected restrictions on how data may be used and reused by yourself and others. Research funders may expect IPR ownership to be addressed in your data management plan.

Other resources:

What is Freedom of Information?

    • Freedom of Information gives everyone the right to request information held by public sector organisations under the Freedom of Information Act (Directgov).

JISC, the UK expert on information and digital technologies for education and research, advises the following:

    • Research data can be the subject of Freedom of Information requests, as recent high profile cases have shown. As a researcher, how should you respond if faced with such a request? This document sets out to answer that question and some others you may have. Details of particular circumstances can make a major difference, so conclusions reached in an individual case may well differ from those suggested here.

Information, guidance and support from the University of Leeds is available at:

The Information Commissioner's Office (ICO) is the UK's independent authority set up to uphold information rights in the public interest, promoting openness by public bodies and data privacy for individuals.

FOI and Data Examples


& a peer review/funder perspective?
  • Is copyright of research data (both existing sources of data used or created) agreed or clarified? What about collaborative research or where various sources of data are combined?
  • Are plans in place for copyright clearance for data sharing (if possible)?

4. DMP - Access, Data Sharing and Re-use. Consider the following:

Many funders of research require the studies they support to be made freely available - this is referred to as open access. Where possible, the cost of publishing under open access arrangements should be included in project costs. The University of Leeds Researcher@Library website contains useful information on open access publishing, including links to information on the open access policies of many funders and publishers.

The Digital Curation Centre (DCC) provides the following guidance regarding research data sharing, access and re-use.

  • Anticipate and plan for data reuse: It can help to envisage which users your data would be of value to, and address their needs when deciding how to make the data available. Data centres may also ask you to meet minimum quality standards to make sure your data can be understood and reused by other researchers.
  • Provide specific details on access: Reassure funders by being very clear about where, when and how your data will be made available. The DCC offers guidance on how to license your data to make clear who can use it and for what purpose. Funders often state expected timeframes for release, such as making data available on publication. If you can't meet these expectations or need to impose any restrictions, try to demonstrate that you have considered various means of overcoming these challenges.
  • Use existing infrastructure: Where possible select an appropriate disciplinary database, data centre or institutional repository. If you are unsure which services are available to you, check the repository list collated by DataCite, BioMed Central and the DCC.  If access to your data needs to be restricted, look for secure data services or data enclaves.

What do I need to take into account when planning to share research files and data with collaborators?

  1. If newly generated data cannot be shared, adequate justification should be given.
  2. It may be the case that sensitive parts of the data cannot be shared. In these cases the plan should provide evidence that the data has been assessed from all angles and that no element of it can be shared in any format whatsoever, whether re-formatted, anonymised or otherwise.
  3. Non-sharing of data is normally considered as an exception, and often funders requiring data to be deposited will reserve the right to refuse waivers on sharing where there is insufficient evidence that the applicant has fully explored all strategies to enable data sharing and archiving.

Further advice and guidance is available from:

University of Leeds Research Support:

& a peer review/funder perspective?
  • What data will be shared and with whom? How will other researchers be able to access the data?
  • What are the further intended and/or foreseeable research uses for the completed dataset(s)?
  • How will you make the resource accessible to the potential audience(s) identified?
  • Where will you make the data available?
  • Will a data sharing agreement be required?
  • What is the timescale for public release of the data?
  • State any expected difficulties in data sharing, along with causes and possible measures to overcome these difficulties.
  • How will data sharing provide opportunities for coordination or collaboration?
  • Do any enhanced security controls need to exist?
  • Have all obstacles to sharing data been considered? Have strategies been considered for dealing with these issues? For example, by discussing data sharing and re-use with interviewees, gaining specific consent from participants to share research data, and:
    o anonymising data to remove personal and disclosive information
    o regulating access to data

5. DMP - Short term storage and data management. Consider the following:

  1. Identification of the person responsible for the immediate day-to-day management, storage and backup of your research data arising from your research
  2. Who will be responsible for your research data once your research has concluded?

If you have any questions about research data backup and storage you should contact your Faculty IT Manager.

The Research Support service web pages have advice on the University approach to storage and backup of research data.

The IT Help desk web page holds information on file storage.

Always store your crucial data in more than one place.
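
One way to check that a second copy really matches the original is to compare checksums after copying. The following is a minimal sketch using the Python standard library; the function names and the idea of passing a list of backup directories are assumptions for illustration, and it is not a substitute for the University's managed storage and backup services.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Dict, Iterable


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, backup_dirs: Iterable[Path]) -> Dict[Path, str]:
    """Copy `source` into each backup directory and confirm the copies match."""
    expected = sha256_of(source)
    results = {}
    for directory in backup_dirs:
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)  # copy2 also preserves file timestamps
        if sha256_of(target) != expected:
            raise IOError(f"Checksum mismatch for {target}")
        results[target] = expected
    return results
```

Recording the checksum alongside each copy also lets you re-check the files later to detect damage to the storage media.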

Useful resources


& a peer review/funder perspective?
  • Is the data back-up procedure described fit for purpose? eg considering back-up procedures for all institutions/collaborations
  • Are methods of version control described? (ie making sure that if the information in one file is altered, the related information in other files is also updated, as well as keeping track of the number of versions and their locations)

6. DMP - Deposit, long-term storage/preservation, access and citation. Consider the following:

  1. What happens to your data longer term?
  2. Do you deposit to a data centre or similar?
  3. For how long will you embargo your research data before it is published for others to see and use?

Why is preservation important for my research?

Regular backups of data are not enough to ensure long term accessibility and reusability. For example, file formats may become obsolete, storage media may become damaged or your data may not be understandable if it lacks appropriate description and documentation to enable reuse.

Long-term preservation guidance is available from the University of Edinburgh. If you deposit your data with a data centre or repository, check their policies on data retention and preservation. Your research funder may have a requirement that your data is retained for a specific period of time.

All research data should be offered and assessed for deposit in an appropriate University, national or international data service or domain repository, unless specified otherwise in the data management plan.  See the University of Leeds Research Data Management Policy FAQs for further information.

If there is a recognised data centre or 'digital repository' for your subject discipline, this should be your first port of call for long term data deposit. Your funder may stipulate which repository should be used to house your data. Be aware of any collaborative agreements you have signed with research partners and whether these contain requirements for data deposit.

To assess whether the repository is a suitable home for your data you could consider:
  • will others be able to find your data - how?
  • are data sets in the service easy to cite? Do they have persistent identifiers, e.g. a handle or DOI (digital object identifier)? See below for further details.
  • under what licence terms are data sets made available for reuse?
  • can you apply an access embargo period if you need to?
  • are others depositing data with the service? Is there a community to support the repository?
  • what is the growth rate for data deposit?
  • what file formats does the repository support?
  • what metadata requirements are there? It can be helpful to know this early on, before you generate your data.
  • for how long can/will your data be housed in the repository?
  • how will the data be curated? Will any action be taken to facilitate long term access and reuse?
  • is the repository seen as a best practice exemplar? Have you seen favourable references to it?
  • does the repository meet the terms of the University's Research Data Management policy?

(criteria partly taken from the Draft Framework for Assessing the Dryad Data Repository)

Is there a Leeds data repository?

The University of Leeds is scoping a research data management repository service as part of the RoaDMaP project (Leeds Research Data Management Pilot). The project will make recommendations for a service in 2013.

In the meantime, the holding advice is to apply good research data management principles to your data and ensure it is securely stored ready for eventual deposit in the local data repository.
 
If you are unsure about where to deposit your data, email roadmap@leeds.ac.uk  for advice.

What is data citation and why is data citation relevant for my research data?

"Data citation refers to the practice of providing a reference to data in the same way as researchers routinely provide a bibliographic reference to printed resources.  The need to cite data is starting to be recognised as one of the key practices underpinning the recognition of data as a primary research output rather than as a by-product of research.  While data has often been shared in the past, it is rarely, if ever, cited in the same way as a journal article or other publication might be.  If datasets were cited, they would achieve a validity and significance within the cycle of activities associated with scholarly communications and recognition of scholarly effort."
Australian National Data Service

The exact format data citation should take is currently under debate, though common practice is emerging as more datasets are made available online. If you use data from an online data repository, there may be a recommended citation format; any journal you publish with may have a dataset citation house style. A Digital Curation Centre guide collates current practice on citing datasets.

There is some evidence that publications with associated datasets are more highly cited.

What is a Unique Identifier?

For robust citation, datasets should have a unique identifier which persists over time. Several identification systems are in use. A well known example is the digital object identifier (DOI) which is commonly used to identify journal articles but which can also be used to identify datasets. More information on persistent identifiers is available from DataCite and the Australian National Data Service.
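
As an illustration only, the elements that dataset citations commonly contain (creator, year, title, publisher and a persistent identifier) can be assembled as in the sketch below. The ordering and punctuation are an assumption made for this example, not an agreed standard; where a repository or journal recommends a format, follow that instead. The `Dataset` class and `cite` function are hypothetical, and the DOI in the usage note is made up and does not identify a real dataset.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dataset:
    creators: List[str]
    year: int
    title: str
    publisher: str
    doi: str  # the bare DOI, without a resolver prefix


def cite(dataset: Dataset) -> str:
    """Format a dataset citation using one illustrative ordering of elements."""
    authors = "; ".join(dataset.creators)
    return (f"{authors} ({dataset.year}). {dataset.title}. "
            f"{dataset.publisher}. https://doi.org/{dataset.doi}")
```

For example, `cite(Dataset(["Smith, A.", "Jones, B."], 2013, "Example interview transcripts", "Example Data Repository", "10.1234/example.5678"))` produces a single line ending in the DOI expressed as a link, so that readers can go straight from the citation to the data.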

Further information on citing data and getting cited is available from: 


& a peer review/funder perspective?

  • Are the plans for preparing and documenting data for sharing and archiving with the selected repository appropriate? 
  • Is there evidence that data will be well documented during research to provide high-quality contextual information and/or structured metadata for secondary users? eg documenting the method of data collection, origin, circumstances, processing and analysis of data.
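
One simple way to keep this contextual information with the data is a small metadata record saved next to each data file. The sketch below checks that a record covers the items named in the bullet above before it is written out as JSON. The field names and the functions `missing_fields` and `to_sidecar_json` are assumptions for illustration; a repository or discipline may require a specific metadata standard instead.

```python
import json

# Hypothetical minimum set of fields, based on the items listed above.
REQUIRED_FIELDS = ("title", "collection_method", "origin", "processing", "analysis")


def missing_fields(metadata: dict) -> list:
    """List the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS
            if not str(metadata.get(field, "")).strip()]


def to_sidecar_json(metadata: dict) -> str:
    """Serialise the record for storage next to the data file it describes."""
    gaps = missing_fields(metadata)
    if gaps:
        raise ValueError(f"Missing metadata fields: {', '.join(gaps)}")
    return json.dumps(metadata, indent=2, sort_keys=True)
```

Writing the record at the time the data is collected is usually easier than reconstructing the details at the point of deposit.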

7. DMP - Resourcing and responsibilities. Consider the following:

  1. Have all relevant data management costs been built into your research bids?
  2. Who will be responsible for your data, once you have left your present research group?

Further support in the area of costing your data management plan is available from:

1. The UK Data Archive

2. UK Data Service

3. JISC Keeping Research Data Safe



& a peer review/funder perspective?

  • Have data management responsibilities been allocated to named individuals? 
  • Is there evidence that the data management plan will be followed throughout the course of the project?
  • Has consideration been given to the variety of data management tasks that may be required for the research? 
  • For collaborative research, are data management responsibilities allocated at each partner organisation (if needed for the research) or has the coordination of data management responsibilities across partners been considered?

8. DMP - Where can I get Data Management training and support?

Internally

SDDU offers a range of courses relevant to research data management.

External training providers

Online

For further information, advice and guidance on a research data management plan (DMP) for your research project, please contact:

  1. Research data management enquiries: roadmap@leeds.ac.uk
  2. Research support: researchsupport@leeds.ac.uk
  3. Library

Also see:

  1. Wellcome Trust
  2. MRC
  3. BBSRC