Unit 4.1 Considerations for what metadata to create
Overview
Unit study time: 20 minutes
Intended Learning Outcome
By the end of the unit, you will ...
- Understand how metadata creation varies across projects depending on scope, size and aims of the research
- Know how to decide what metadata to create for a project
- Know what software you can use to create and store metadata
- Identify tools you can use to support metadata creation and implement best practice
There is no single, universally agreed set of minimum metadata requirements. Available guidance offers a variety of suggestions, which are often discipline- or resource-specific. Extensive metadata makes your data more discoverable, understandable and (re)usable. However, the amount of metadata you create is a trade-off between time, resources and value.
Metadata elements that are commonly used across disciplines and different levels of resources describe the who, what, where, when and how of the research at a project or study level. Beyond this, what metadata you create depends on the scope and aims of your research project.
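As a minimal sketch, the who, what, where, when and how of a study could be captured in a small structured record like the one below. The class name, field names and example values are illustrative assumptions and are not taken from any particular metadata standard.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class StudyMetadata:
    """Project-level metadata; field names are hypothetical, not from a standard."""
    who: str    # creator or research team
    what: str   # title or short description of the study
    where: str  # geographic coverage or collection site
    when: str   # collection period, written as ISO 8601 dates
    how: str    # data collection method


# Made-up example values, for illustration only.
record = StudyMetadata(
    who="Example Research Team",
    what="Survey of commuting habits",
    where="United Kingdom",
    when="2024-01-15/2024-03-31",
    how="Online questionnaire",
)

# Serialise to JSON so the record can be stored alongside the data files.
record_json = json.dumps(asdict(record), indent=2)
print(record_json)
```

A plain spreadsheet or text file with the same five headings would serve the same purpose; the point is only that each element is recorded explicitly and in a consistent place.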
Considerations
Data sharing
- Do you plan to share your (meta)data, or is it for personal reference only?
- If you're planning to share your (meta)data, how widely will you share it?
- Will the people who interact with your (meta)data have direct contact with you and/or your research team?
(Note: sharing metadata does not mean the data has to be open.)
If you're not planning to share your data, you will be the main user of your metadata. You can choose metadata elements that help you manage and preserve your data. The aim of this metadata is to support efficient research practice and help your future self understand the data you collected.
If you're sharing (meta)data with external users who do not have direct contact with you, your metadata needs to act as a standalone guide to your research and data. In this scenario, metadata also helps make your project discoverable and understandable to others.
Because users may not be able to contact you easily to clarify missing or confusing information, it is important that your metadata is clear, standardised, and comprehensive from the outset. This usually means creating more metadata than for a project that does not share its (meta)data. Tools such as metadata standards and controlled vocabularies are also fundamental in ensuring your metadata is interoperable and reusable by both you and others.
Project size
- What is the scale of your research?
- How much data will you collect? How many different data collection methods will you use?
- How many researchers are working on the project?
- How many people will be creating the metadata?
- What time and resources do you have available?
- What software do you have available to use?
- Do you have a data management budget?
If you're working on a large research project that handles a vast amount of data and many data collection methods, you will need to create more metadata to describe the different types of data and the processes used to collect them. A small research project will need less metadata.
If you're working with a large research team, it is important to establish a clear metadata schema. When multiple people create, manage and use the metadata, following a shared schema or standard keeps it consistent, standardised and easy to understand. If you are doing a solo research project, you may want to create your own metadata template. However, it can still be useful to follow a metadata schema or standard and use controlled vocabularies to make sure your metadata is high quality and interoperable.
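To illustrate how a shared template and a controlled vocabulary can keep metadata consistent when several people create it, here is a rough sketch of a record check. The required fields, the vocabulary terms and the `validate_record` helper are all hypothetical; a real project would take them from its chosen schema or standard.

```python
# Hypothetical shared template: required fields and one controlled vocabulary.
REQUIRED_FIELDS = {"title", "creator", "collection_method", "date_collected"}
CONTROLLED_VOCAB = {
    "collection_method": {"interview", "questionnaire", "observation"},
}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    # Report any required field that is absent from the record.
    for field in sorted(REQUIRED_FIELDS - set(record)):
        problems.append(f"missing field: {field}")
    # Report values that are not drawn from the agreed vocabulary.
    for field, allowed in CONTROLLED_VOCAB.items():
        value = record.get(field)
        if value is not None and value not in allowed:
            problems.append(f"{field}: '{value}' is not in the controlled vocabulary")
    return problems


# Made-up records from two team members, for illustration only.
good = {
    "title": "Wave 1 interviews",
    "creator": "Researcher A",
    "collection_method": "interview",
    "date_collected": "2024-02-01",
}
bad = {
    "title": "Wave 1 interviews",
    "creator": "Researcher B",
    "collection_method": "Interviews",  # wrong capitalisation and plural
}

print(validate_record(good))
print(validate_record(bad))
```

The second record shows the kind of small inconsistency (a differently spelled term, a forgotten date) that a shared schema is meant to catch before it spreads through a dataset.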
If you're working on a large research project, you may have more time and resources to create metadata. Larger projects are more likely to have dedicated budgets for data management, team members with experience in metadata, and requirements to submit formal data management plans. You could use specific metadata tools to help create and manage your metadata. If you are doing a small, solo project, you may not have the capacity to create extensive metadata or access to metadata tools. In this scenario, it's important to prioritise the high-value metadata that supports the aims of your project.
Research value
- Does your data have long‑term importance?
- Does your data contribute to a wider body of research?
- How long should the data be preserved for?
- Will others want to reuse or replicate your research?
The amount of metadata you create should also reflect the value of your data, both to you and to potential future users. If your data has long‑term importance, contributes to a wider body of research, or is likely to be reused by others, it is worth investing in richer, more detailed metadata. High‑value data often benefits from metadata that captures context, methods, provenance, and structure in depth so that it remains understandable and reusable over time. If your data has a more limited scope, is exploratory, or is unlikely to be reused outside your own project, extensive metadata may not be needed. In this case, prioritise documenting the essential information you need to understand and manage the data.
One-off or repeated study
- Is it a one-off study or a repeated study?
- If it is a repeated study, have there already been completed studies? What will the upcoming studies cover?
If you're creating metadata for a repeated study, it is important to consider what metadata was captured for previous studies, including what schemas or standards were used. You should reuse the previous metadata structures for any new research in the series so that your data remains interoperable across studies. If it's the first study of a series, it's important to create a robust metadata schema that can be reused for future studies.
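One simple way to check a new study's metadata against an earlier one is to compare the sets of fields each uses. The sketch below assumes each study's metadata fields can be listed as plain names; the field names and the `compare_fields` helper are hypothetical.

```python
def compare_fields(previous: set[str], proposed: set[str]) -> dict[str, set[str]]:
    """Compare the metadata fields of an earlier study with a proposed template."""
    return {
        "kept": previous & proposed,     # fields shared by both studies
        "dropped": previous - proposed,  # fields the new template would lose
        "added": proposed - previous,    # fields that are new in this study
    }


# Made-up field lists for two studies in a series, for illustration only.
wave_1_fields = {"title", "creator", "date_collected", "sample_size"}
wave_2_fields = {"title", "creator", "date_collected", "collection_mode"}

report = compare_fields(wave_1_fields, wave_2_fields)
for status in ("kept", "dropped", "added"):
    print(status, sorted(report[status]))
```

Any field listed as dropped is worth a second look, since removing it may make the new study harder to combine with earlier ones.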
For a one-off study, you have more flexibility about what metadata you create and/or what schema you use as it only needs to serve the aims of that individual project. However, it is still important to consider how you can make your (meta)data interoperable with other research projects in your field.
Metadata standards and schemas
- What common metadata standards are used in your discipline?
- What metadata schemas are widely available?
- Are you planning to deposit your (meta)data in a data repository and/or data catalogue? Do they specify or provide a schema?
Data repositories and catalogue schemas
If you're depositing your (meta)data in data catalogues or repositories, they may specify which metadata standards and controlled vocabularies to use. They may also provide guidance or a metadata template or model, for example:
- The CESSDA Data Model maps out the metadata elements required to deposit a dataset in the CESSDA data catalogue
- UK Data Service provides guidance on how to prepare metadata and documentation
Even if you're not planning to deposit your data in a repository or catalogue, these resources can provide accessible guidance on what metadata elements to capture and a ready‑made structure. You can use their templates or models as a basis for your metadata creation.
Metadata standards and schemas
It is good practice to identify common guidelines for metadata creation in your discipline or for a specific resource. This includes identifying widely used metadata standards and/or schemas. These can either provide the structure for your metadata creation or inform the approach you take.
Using a metadata schema and standard is essential when planning to share your (meta)data or deposit it in a repository and/or catalogue. Schemas and standards ensure metadata is interoperable and can be integrated into centralised systems (such as repositories and catalogues). They also help standardise terminology across research teams, reducing confusion and miscommunication.
Even if you are not planning to share your (meta)data, metadata standards and schemas can save time by guiding what metadata to capture and how to structure it. You do not need to follow a standard in its entirety; you can use it as a starting point. Standards also provide definitions, valid inputs, and controlled vocabularies, helping ensure metadata is clear and consistent.
While some cross-disciplinary standards exist, many are discipline-specific. It is helpful to identify a suitable standard before you begin creating metadata. You can refer to a list of common standards for different disciplines provided by the University of Texas. The search at rdamsc.bath.ac.uk/search is also useful for discovering available standards.
For more information, see Unit 2.5 of the Introduction to Metadata training course.