
1.2 Research data management

Unit overview

Unit study time

  • 20 minutes

Intended Learning Outcome

By the end of the unit, you will be able to ...

  • Explain what research data management (RDM) is
  • Recognise the issues caused by poor RDM
  • Explain the reasons for, and benefits of, good research data management
  • Recognise the role of data management plans in RDM and what should be included in one

Using research data

To use data as evidence in research, we need to collect and analyse it. This may be primary data (data we collect ourselves) or secondary data (data collected by others). However, if we don't organise and manage our data, we may hinder our ability to do this.

If we have disorganised and unmanaged data, we may ...

  • Not understand our data, as datasets are not labelled and abbreviations are not explained
  • Lose data, as it's not securely saved
  • Produce 'dirty' data that contains human errors and inconsistencies which invalidate our findings

If our data is unorganised or 'dirty', we limit our ability to produce effective findings.

Poorly managed research data doesn't just affect the person who originally created the data. It also affects other researchers after the project has finished.

In this clip, one researcher would like to find out more about another researcher's project. However, it is not as straightforward as it seems. [1]

What are some examples of poor research data management explored in this video?

When you watch the video, can you spot what problems arise?

Snafu in Three Short Acts

Examples of poor research data management
  • Data is not properly cited or linked in publications
  • Links to objects/resources are broken, e.g. when you follow a link in a citation, the page no longer exists or cannot be reached
  • Limited context to understand the data
  • Limited information about the project, such as when it was conducted, where it was conducted, who collected the data, how it was collected and why
  • Limited information about what the dataset contains, such as what type of data was collected, what abbreviations mean
  • No contact details of a person / organisation who collected the data to ask further questions about the data
  • Data is unusable or only available in proprietary software
  • No information about the data sharing policy
  • Unclear data access process that doesn't specify how long it takes or the cost

Poor research data management

What is the impact of poor research data management?

For published research data, poor data management impacts the discovery, understandability and (re)use of that data.

Reflecting on the video and your own experience, what would the consequences be if your research data were poorly managed?

Expand the boxes below to explore some of the consequences of poor research data management.

The reach and use of research data is limited

Other people are less likely to trust your data as it's harder to understand and may contain errors and inconsistencies. This means they're less likely to cite your work or reuse your data in secondary research or cross-study comparisons. This limits its impact and, in the long run, it could affect your chances of getting future funding or support.

Research data is less secure

If there are no clear data usage or sharing policies, your data could be misused. If data is not stored in a secure location, it could be lost or accessed by people without permission. This is especially risky if your data is sensitive and needs to be kept secure.

Time is wasted trying to understand data retrospectively

You will have to spend more time trying to find and organise poorly managed data into a meaningful structure so it can be understood and used. If that's not possible, you may need to repeat a study to correct errors caused by poor data management, doubling the time and resources needed to obtain the data. If your study is not documented and made discoverable, people might not know it exists and it could be unnecessarily duplicated by others.

Access to future research funding could be impacted

If data is poorly managed and/or inaccessible, institutions are less likely to fund future research.

Poor data management causes inefficiency and devalues the quality and usability of your research. Poorly managed data affects both the person who has created the data and others who are interested in using the data.

Research data management (RDM)

To avoid the consequences of poor data management highlighted in the previous section, research data needs to be organised and managed. This practice is called research data management (RDM).

CODATA defines research data management as ...

"Storage, access and preservation of data created or collected in the course of research. Research data management practices cover the entire lifecycle of the data, from planning the investigation to conducting it, and from backing up data as it is created and used to long term preservation of data deliverables after the research investigation has concluded." [2]

Who's responsible for implementing good research data management?

Everyone involved in a research project is responsible for implementing good RDM. Larger projects may assign different roles and share responsibilities. People outside of the research project are also responsible for good RDM when they cite or re-use data.

Implementing good RDM

Select the tabs below to explore how we can implement good research data management in our own research.

A data management plan (DMP) is a document that outlines how a project will manage its data. This includes assigning roles in RDM, and specifying where and how data will be stored and documented. A DMP should be created at the beginning of the project, and it is recommended that all projects have one. [3]

What should a DMP include?

  • DMPs should address the whole research lifecycle, covering how data will be collected, processed, analysed, described, preserved, and shared during the course of a research project and beyond (we will explore the research lifecycle in unit 2.2).
  • A DMP should be a living document that is updated as processes are refined, or as new ones are created when the need arises.
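
One way to check that a draft DMP covers the whole lifecycle is to compare it against the stages listed above. The Python sketch below does this for a plan held as a simple dictionary. The stage names come from the list above; the dictionary layout, the function name and the example entries are assumptions made purely for illustration, not part of any DMP template.

```python
# Lifecycle stages a DMP should cover, taken from the list above.
LIFECYCLE_STAGES = ["collected", "processed", "analysed", "described", "preserved", "shared"]


def missing_stages(plan: dict) -> list:
    """Return the lifecycle stages that the draft plan does not yet describe."""
    return [stage for stage in LIFECYCLE_STAGES if not plan.get(stage, "").strip()]


# Hypothetical draft plan: keys are stages, values are short free-text descriptions.
draft_plan = {
    "collected": "Interviews recorded on encrypted devices",
    "processed": "Transcribed and anonymised by the research team",
    "described": "README file and data dictionary kept with each dataset",
    "shared": "",  # still to be decided, so it counts as missing
}

print(missing_stages(draft_plan))
```

Because a DMP is a living document, a check like this could be re-run each time the plan is updated.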

To help guide research data management across all disciplines, the FAIR data principles were established as a hallmark of best practice in research. The principles help researchers avoid common problems that come from poor research data management and encourage data to be Findable, Accessible, Interoperable and Re-usable (FAIR). We'll unpack these principles in the next unit, 1.3.

Creating structured documentation that describes your data and your data collection processes can help you organise your research data effectively. For example, you can record what different datasets contain and the security levels of the data. You can also share this documentation with other researchers so they can understand your data without having to access the data files (note that this documentation does not need to contain any sensitive information). Metadata is a type of structured documentation that allows you to do this. We'll cover what metadata is and how it is used in research in unit 2.1.
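
As a rough illustration of structured documentation, the Python sketch below records what a dataset contains, its security level and what its abbreviations mean, then turns the record into JSON that could be shared without the data files themselves. The field names, the example project and all of the values are hypothetical. This is not a formal metadata standard.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DatasetRecord:
    """Hypothetical documentation record for one dataset (not a formal standard)."""
    title: str
    description: str
    collected_by: str
    collection_method: str
    security_level: str  # e.g. "open", "restricted" or "confidential"
    abbreviations: dict = field(default_factory=dict)


def to_json(record: DatasetRecord) -> str:
    """Serialise the record so it can be shared separately from the data files."""
    return json.dumps(asdict(record), indent=2)


# Illustrative values only; replace them with details from your own project.
record = DatasetRecord(
    title="Household survey, wave 1",
    description="Responses to a questionnaire on household energy use",
    collected_by="Example research team",
    collection_method="Online questionnaire",
    security_level="restricted",
    abbreviations={"HH": "household", "kWh": "kilowatt-hour"},
)

print(to_json(record))
```

The record describes the data without containing any of it, so it can be passed to other researchers even when the dataset itself is restricted.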

RDM case studies

For real-life examples of data management, see the four case studies put together by the University of Birmingham. They come from different departments (Birmingham University Imaging Centre (BUIC), College for Engineering and Physical Sciences, Department of English Literature, School of Sport, Exercise and Rehabilitation Sciences) and explore how each approached research data management in its project. You can explore the case studies here. [5]

Benefits of RDM

What do you think are the benefits of good research data management?

How do you think good research data management can help the research process and data quality?

Expand the box below to explore the benefits of good research data management.

Benefits of good research data management

Efficiency

  • Saves time and resources by establishing robust documentation, storage and preservation processes
  • Reduces the risk of losing data and ensures long-term preservation of data
  • Reduces the potential for research duplication
  • Improves internal communication and workflows within a project

Quality and trustworthiness of your data

  • Enables other researchers to use your research and encourages cross-study research
  • Provides proof of transparent and valid research methods and data
  • Improves the accuracy and security of your data

Encourages re-use and makes your data go further

  • Facilitates the re-use of data by yourself and others
  • It is required to obtain funding and/or meet funding bodies' requirements

[6]

Assess your RDM approach

How strong is your research data management?

Checklists can help you make sure you've considered all areas of research data management in your research. These are available online, for example ...

  • DCC provide a PDF, 'Research data management plan checklist', which has a list of questions to help you assess your RDM approach.
  • OpenAIRE gives a checklist for different stakeholders in the research process: the research group, institution, repository, research infrastructure, funder and national level. On each page, it outlines what each group should consider to implement good RDM.
  • Stanford Medicine provide an online data management checklist that allows you to tick off different RDM activities once you've completed them.

If you're currently working on a project, try using one of the lists above to assess your research data management approach.


Test your knowledge

1. When in the research process should you create a data management plan?
2. Which stages of the research lifecycle should a data management plan address?
3. Who should create a data management plan?
4. Who is responsible for carrying out a data management plan?

Answers
  1. At the start of the project in the research design stage.

  2. A data management plan (DMP) should address all stages of the research lifecycle, including after the research has finished.

  3. The lead researcher or research team, often with input from data managers or institutional support. If it is a large research team, you may delegate different areas of the DMP to different members to complete.

  4. Everyone in the research team is responsible for carrying out the DMP. You may also assign specific tasks to relevant members.


Further RDM training and resources

If you want to explore research data management further, you can examine some of the resources below.

Research data management online training

Resources for developing data management plans

If you want to find out more about data management plans (DMPs), explore the resources below.

  • UGent Open Science have created a short video explaining the importance of DMPs and how to go about developing one. Watch it here.
  • Research Data Oxford provide an overview of DMPs such as what they should include and how to develop them. Explore the guidance here.
  • The Digital Curation Centre (DCC) provides the guide 'How to Develop a Data Management and Sharing Plan' and a DMP checklist.
  • UK Data Service provides a Data Management Costing Tool, a table listing data management activities and a column to estimate the cost of each activity. You can include this spending forecast in your DMP so you can assess whether you are on track as you go along. You can see the costing tool here.

Explore data management plan templates and examples

If you want to create a DMP, you can view other projects' DMPs to help inform how you make your own. Explore the sites below, which store data management plans and offer templates and guidance.

  • On the ARGOS platform, you can view previous DMPs. It also provides guidance and tools to create your own plan.
  • Developed by the Digital Curation Centre (DCC), DMPonline facilitates the creation, review, and sharing of data management plans. The online, open resource contains DMP templates and public DMPs for reference.
  • DMP Tool contains public DMPs for reference as well as a list of requirements from different funders to ensure DMPs meet the criteria of the funding body.
  • Data Stewardship Wizard is an open-source platform for collaborative and living data management plans trusted worldwide. You can access the tool here.

References