Getting started

Welcome to the Foundation Metadata training course.

What will I learn from this course?

This Foundation course builds on the core principles introduced in the Introduction to Metadata training, which covered what metadata is, how it is used, and the benefits it provides to you, your collaborators, and the wider research community. In this course, we introduce additional foundational metadata principles that are key to managing both metadata and data effectively. You will then learn how to make informed decisions about what metadata to create and how to create it, before moving on to practical guidance on producing metadata for your own research.

[NOTE] HM - Add LO once finalised.

Who's this course for?

This course is for anyone who engages with research data as part of their work, for example by using, analysing or collecting it.

This course may be particularly relevant for:

  • Masters and PhD students
  • Researchers
  • Data stewards and data managers
  • People working for funding bodies of academic research
  • People working in policy areas that use data and academic research in their work

No advanced technical experience is required.

What level of information do I need to know before starting the course?

You do not need specialist knowledge to begin this course. However, you should understand the basics of metadata and its role, along with the essentials of research data and the FAIR principles, all of which are covered in the Introduction to Metadata training. The course then builds from those concepts toward more practical skills in deciding what metadata to create and how to create it effectively.

The following topics were covered in the 'Introduction to Metadata' training course. If you are unfamiliar with any of the topic areas or would like a refresher then please go to the relevant units as linked below.

Research data

  • Understanding the characteristics and purpose of research data (unit 1.1)
  • Conceiving the research lifecycle as concept --> measure --> representation (unit 1.1)
  • Research data management (unit 1.2)
  • FAIR data principles (unit 1.3)

Metadata

  • Definition of metadata and the role of metadata in research (unit 2.1)
  • How metadata relates to data management and FAIR principles (unit 2.1)
  • Metadata in the data lifecycle (unit 2.2)
  • The role of metadata in data repositories and catalogues (unit 2.3)

Metadata components

Using metadata activity

  • Evaluating a dataset using metadata (unit 3.0) (add link)

How do I take the course?

The Foundation course is split into 10 units. These are:

  • 1 Introduction
  • 1.1 Who should create metadata?
  • 1.2 Why create metadata?
  • 2 Core metadata elements
  • 2.1 Unit type, Population and Universe
  • 2.2 Question metadata
    • 2.2.1 Codes and Categories
  • 2.3 Variable metadata
  • 2.4 Concept metadata
  • 3 Metadata relationships
  • 4 What metadata should you create?
  • 4.1 What study metadata
  • 4.2 What dataset metadata
  • 4.3 What variable metadata
  • 5 Creating your own metadata activity

[!NOTE] BO: Will need to check and update the unit names once finalised (e.g. 1.2).

Creating metadata in the Foundation course

In the Foundation course, we will explore creating metadata for a small research project that has produced tabular data (data found in tables). We will use an Excel template to practise creating metadata.

We will look at creating metadata predominantly for quantitative, tabular data, and use examples focused on metadata for data collection instruments such as questionnaires and surveys.

While metadata can be created for any research project and data type, we will be focusing on metadata for tabular data. If you have a tabular dataset you would like to create metadata for, have it open while completing this course. If you do not regularly work with tabular data, the metadata elements and concepts this course covers are still relevant across disciplines and data types.

In the Foundation course, we will mainly focus on creating metadata for small-scale research projects, where the metadata is intended for personal use or for sharing with project collaborators.

The metadata elements we explore will also be relevant to those working on large-scale projects whose (meta)data will be widely shared. For these projects, however, you will have to take into account further contextual considerations such as discipline-specific metadata standards, controlled vocabularies and data repository or catalogue requirements.

To practise metadata creation in the Foundation course, we will use an Excel metadata template. However, depending on your research project and the tools you choose to use, you may create and manage metadata using different software.

While Excel is not a machine-readable format itself, an Excel file can be exported or transformed into machine-readable formats if it follows a clear structure, the data is clean, and it doesn't rely on extra formatting such as colour, merged cells or notes.
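As a minimal sketch of why structure matters, the example below assumes a metadata sheet has been saved from Excel as CSV with a single header row and one record per row. The column names (`variable_name`, `label`, `data_type`) and the two example rows are hypothetical and are not taken from the course template. Because the table is clean and rectangular, Python's standard `csv` and `json` modules can turn it into JSON without any manual tidying.

```python
import csv
import io
import json

# Hypothetical metadata sheet saved from Excel as CSV: one header row,
# one record per row, no merged cells, and no meaning carried by colour.
CSV_TEXT = """variable_name,label,data_type
age,Age of respondent in years,integer
region,Region of residence,string
"""


def csv_to_records(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


# Each row becomes one JSON object, a format other software can read directly.
records = csv_to_records(CSV_TEXT)
metadata_json = json.dumps(records, indent=2)
print(metadata_json)
```

If the sheet relied on merged cells, colour coding or cell notes, that information would be lost in the CSV export, and a conversion like this would produce incomplete or misleading records.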

You can download the metadata template for the Foundation course here (add link)

How long will the course take?

The training course is estimated to take about 6 hours 30 minutes to complete. This estimate is based on:

  • An average reading speed of 300 words per minute
  • Additional learning time for comprehension of technical concepts (between 20% and 70% added time)
  • Time for exercises, examples, and quizzes

These are estimated times. Your own time may vary depending on experience, background knowledge, and whether you complete all activities.
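To illustrate how an estimate of this kind can be built up, here is a rough sketch. The word count and activity time are hypothetical figures chosen for the example, and applying the comprehension uplift to reading time only is our assumption about how the factors combine, not the exact method used for this course.

```python
def estimate_minutes(word_count, comprehension_uplift, activity_minutes,
                     words_per_minute=300):
    """Estimate study time in minutes for one unit.

    Reading time is word_count / words_per_minute. The uplift (for example
    0.20 for 20%) is applied to reading time only, which is an assumption.
    """
    reading_minutes = word_count / words_per_minute
    return reading_minutes * (1 + comprehension_uplift) + activity_minutes


# Hypothetical unit: 3,000 words of text and 30 minutes of activities.
low = estimate_minutes(3000, 0.20, 30)   # less technical content
high = estimate_minutes(3000, 0.70, 30)  # more technical content
print(f"Estimated time: {low:.0f} to {high:.0f} minutes")
```

The spread between the lower and upper figures shows why your own time may differ from the published estimate.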

Terminology

As the way we understand and use metadata is evolving, it's important to note that some terms may be used differently by different people or disciplines. The way some concepts are defined in this course may differ from how you currently understand them. So that you have a clear understanding of the course content, we'll define key terms and concepts as we go along.

To maintain a level of consistency in our definitions, where possible we will refer to the Research Data Management (RDM) Terminology Bank developed by CODATA[1]. We may also provide further definitions from other sources to give a more rounded description of a concept.

The RDM Terminology Bank is a community-reviewed, cross-discipline vocabulary that defines key concepts in RDM. As it is regularly updated, it reflects the latest understanding of RDM and aims to offer a single source of truth for terms that can be used in different ways. You can access the RDM Terminology Bank here, and we recommend having it open while you go through the course so you can look up any terms that are new to you or that you want to clarify[2].

Starting the course

If you're starting the course from the beginning, head to unit 1.1, where we'll recap what metadata is and why different types of research projects require different approaches. You may want to download all the files used in the course before you get started (add link). If you need more of a refresher, head to the Introduction to Metadata training course first.