Unit 2.7 Summary
Unit overview
Unit study time
- 10 minutes
Outline
- Identify the key takeaways from the Introduction Course
- Explain the main benefits of metadata for data owners, data users and research as a whole
- Identify tools that help metadata creation and management
Defining metadata
CODATA definition of metadata
Data about data. Defines and describes the characteristics of other data. It is used to improve the understanding and use of the data.
In contrast to other forms of data documentation, metadata is machine readable and machine actionable.
How do we use metadata in research?
Metadata makes research data easier to find by providing searchable descriptions, and helps others understand it through clear context and documentation. It enables researchers to assess the quality and relevance of data, and ultimately to act on it by reusing, sharing, or building upon it with confidence.
How does metadata help us find data?
Metadata powers data catalogues, which are a key discovery tool in research. These sites list a large number of studies so people can find past studies for their own work, whether to reference or to use as secondary data. Without metadata, we wouldn't be able to search or filter these sites; instead, we would have to scroll through hundreds of pages and click into every individual record to find studies that are relevant to our work. Because metadata is machine readable and machine actionable, it helps organise all the studies in a data catalogue and provides the information we need to filter and search.
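As a rough illustration of this point, the sketch below treats a data catalogue as a list of machine-readable metadata records and filters it by keyword, country and year. The record fields (`title`, `keywords`, `country`, `year`), the sample studies and the `search_catalogue` function are all invented for this example; real catalogues use richer schemas and dedicated search tools.

```python
# Hypothetical catalogue: each study is described by a small, structured metadata record.
CATALOGUE = [
    {"title": "Household Energy Use Survey", "keywords": ["energy", "housing"], "country": "UK", "year": 2019},
    {"title": "Youth Employment Panel", "keywords": ["employment", "education"], "country": "DE", "year": 2021},
    {"title": "Rural Housing Conditions Study", "keywords": ["housing", "health"], "country": "UK", "year": 2022},
]


def search_catalogue(records, keyword=None, country=None, min_year=None):
    """Return the records that match every filter that was supplied."""
    results = []
    for record in records:
        # Skip the record as soon as one supplied filter does not match.
        if keyword is not None and keyword.lower() not in record["keywords"]:
            continue
        if country is not None and record["country"] != country:
            continue
        if min_year is not None and record["year"] < min_year:
            continue
        results.append(record)
    return results


if __name__ == "__main__":
    # Find UK studies about housing collected in 2020 or later.
    for study in search_catalogue(CATALOGUE, keyword="housing", country="UK", min_year=2020):
        print(study["title"])
```

Because every record uses the same fields, the filter can be applied without anyone opening and reading each study individually, which is what a catalogue's search box does on a much larger scale.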
How does metadata help us understand data?
As a structured form of data documentation, metadata gives the context of a study. It explains the who, what, where, why and how of the data. Without this, we wouldn't know what data a study has collected or, if we have the study's dataset, we would struggle to interpret the data, as it would just appear to be numbers or text with no meaning. By providing extra information that wouldn't otherwise be captured in the dataset, metadata helps researchers understand how the data was produced and how to interpret the data once they have the dataset.
How does metadata help us reuse data?
By helping people discover and understand existing research, metadata helps people reuse existing data. This is particularly important when finding secondary data or checking what studies have already been conducted in our area. Without metadata, we might not know that a study in our area of interest has been conducted, and we run the risk of duplicating research. By making these studies findable and understandable, metadata helps us reuse data, saving us time, money and resources.
A dataset created for one research question might also contain data valuable for a completely different enquiry. This may not be obvious when reading about an overall study; however, metadata such as variable descriptions makes these connections visible. People can then identify these opportunities and repurpose existing data, maximising its potential. By documenting the provenance of data, metadata helps maintain the credibility and trustworthiness of data so it can be used reliably to support other research enquiries.
Why create metadata?
Creating metadata is essential for good research data management and implementing the FAIR data principles in research.
FAIR and sustainable research
Metadata is not a nice-to-have; it is an asset that can be used to your advantage, making your data go further and maximising the potential of research. Metadata is a key part of making data FAIR (Findable, Accessible, Interoperable, and Reusable). When we talk about FAIR data, we usually think only about the data itself, but the metadata also needs to be FAIR. If your metadata is well-structured and follows FAIR principles, it makes your data easier to discover, understand, and reuse, giving you and others the full benefit.
Metadata helps us manage our data during the research process. By capturing information about file formats, versioning and provenance of the data as we go along, metadata documents the research process clearly. It enables us to organise our data and make sure it's secure, reducing the risk of losing data or accidentally making restricted data accessible. As such, metadata improves our research data management, making our data more trustworthy and helping us preserve it for future use. It may also be needed as part of a funding or journal requirement as evidence of data sharing and well-managed data.
Research data management
Metadata helps you manage your data in a consistent and structured way. This is useful during the research process to make sure you keep your data secure and correctly organised. It can also help you in the future when you return to a previous study after many years, as metadata can give you the context and information to understand your data and how it was collected, so you don't have to rely on memory.
Make your data discoverable and re-usable
If you're sharing your data, metadata makes your data easier to find, understand, and reuse by other people. For example, by having metadata, your data can become findable on platforms such as data catalogues and repositories. Strong provenance metadata can also help people trust your data, as your research processes are clear and transparent. This means your data is more likely to be cited and re-used, increasing your visibility as a researcher and the impact of your research.
Good quality metadata
What are the features of good quality metadata?
- Accurate descriptions that provide the appropriate level of information necessary to understand a project
- Consistent use of terms within a project's metadata to avoid confusion or misinformation
- Standardised terms that are commonly used within the relevant discipline and are unlikely to become redundant
- Interoperability with other studies' metadata, so it can be compared across resources and deposited on centralised platforms such as data catalogues where it can be searched and filtered
What tools can help us create good quality metadata?
- Controlled vocabularies
- Metadata standards and schemas
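To make these two tools more concrete, here is a minimal sketch of how they might be applied when checking a metadata record. The required fields, the `SUBJECT_VOCABULARY` terms and the `validate_record` function are hypothetical and chosen for illustration only; in practice you would rely on a published metadata standard or schema and an established controlled vocabulary from your discipline.

```python
# Hypothetical mini-schema: the fields every record must contain.
REQUIRED_FIELDS = ("title", "creator", "date_collected", "subject")

# Hypothetical controlled vocabulary: the only terms allowed in "subject".
SUBJECT_VOCABULARY = {"education", "employment", "health", "housing"}


def validate_record(record):
    """Return a list of problems found; an empty list means the record passes."""
    problems = []
    # Schema check: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing required field: {field}")
    # Vocabulary check: every subject term must come from the agreed list.
    for term in record.get("subject") or []:
        if term not in SUBJECT_VOCABULARY:
            problems.append(f"subject term not in controlled vocabulary: {term}")
    return problems


if __name__ == "__main__":
    record = {
        "title": "Rural Housing Conditions Study",
        "creator": "Example Research Group",
        "subject": ["housing", "wellbeing"],
    }
    for problem in validate_record(record):
        print(problem)
```

Checks like these support the quality features listed above: the schema keeps records consistent and comparable, while the vocabulary keeps terms standardised so records can be searched and filtered together.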