1.3 FAIR Data principles
Unit overview
Unit study time
- 20 minutes
Intended Learning Outcomes
By the end of the unit, you will be able to ...
- Define the four FAIR principles (Findable, Accessible, Interoperable, Reusable)
- Recognise the role of FAIR in research and its benefits
- Identify research activities and tools that support the implementation of the FAIR principles
FAIR data principles
In 2016, Mark D. Wilkinson, Michel Dumontier, IJsbrand Jan Aalbersberg and a host of other contributors from academia, industry, funding agencies and scholarly publishers published an article called 'The FAIR Guiding Principles for scientific data management and stewardship' in Scientific Data.[1]
FAIR was presented as a guiding set of principles to support the sharing and reuse of data, tackling some of the research data management (RDM) challenges we encountered in unit 1.2. It highlights the importance of technology in supporting the RDM process, giving advice about how we can document our data in a way that can be processed by computers and software. There is also growing advice around how we can make our metadata ready for machine learning and AI processes.[2]
The FAIR principles outline the features of good-quality data.
FAIR proposes data should be...
- Findable: other people can discover your data and data documentation.
- Accessible: your data and data documentation are accessible to other people (in line with data permissions).
- Interoperable: your data and data documentation should be easily integrated and combined with other studies.
- Re-usable: your data can be re-used by other people, serving multiple research purposes beyond the original project.
Note, FAIR data does not mean open data. FAIR suggests data should be 'as open as possible, as closed as necessary'.
Since being created, the FAIR data principles have become a core component in research data management and should be considered in any research project.
You can watch a short video to gain an overview of FAIR in research.[3]
Implementing the FAIR principles
Who implements FAIR?
Everyone can contribute to the implementation of the FAIR principles in research activities.
Research infrastructures play a key role in supporting the implementation of FAIR principles. This includes data repositories, data catalogues, standards, guidelines, and software designed to manage and store data. We will examine some of these tools in the upcoming units of the Introduction course.
As a researcher or data steward, you can also apply FAIR principles directly within your own work.
The sections below suggest what you, as a researcher, can do to implement each principle in your research.
Findable
Other people can easily find your data and the documentation around your data and research project.
What you could do ...
- Deposit your data in a trusted repository or archive so people can find it when searching these sites. The repository will typically create a Digital Object Identifier (DOI) for your data, which is a persistent and unique link to your work. The DOI can be used to cite your data, making it easy to find. The repository will also use metadata to organise different datasets so they are discoverable (we will look at data repositories in more detail in unit 2.3).[6]
- Some repositories may create a Persistent Identifier (PID) for individual people. However, you can also create and manage a PID for yourself using an ORCID iD. This helps ensure that other researchers attribute your work to you correctly. You should also include your ORCID iD when documenting your own work.
- Use a DOI to cite other publications and datasets referenced in your research.[7]
- Create metadata that describes your project and deposit it in data repositories and/or catalogues so that it can be readily accessed alongside your data.
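As a rough illustration of the identifiers described above, the sketch below turns a DOI into a resolvable link and checks that an ORCID iD is well formed. The function names, the record fields and the DOI `10.1234/example-dataset` are hypothetical. The ORCID iD is the sample iD that ORCID publishes for its fictional researcher Josiah Carberry, and the check digit follows ISO 7064 MOD 11-2, the scheme ORCID documents for its identifiers.

```python
import re


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the iD has the 0000-0000-0000-000X shape and a matching check digit."""
    if not re.fullmatch(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]", orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


def doi_to_url(doi: str) -> str:
    """Turn a bare DOI into an https://doi.org/ link."""
    return "https://doi.org/" + doi.strip()


# Hypothetical discovery record: field names and the DOI are made up for illustration.
record = {
    "title": "Example survey dataset",
    "creator_orcid": "0000-0002-1825-0097",  # ORCID's published sample iD
    "identifier": doi_to_url("10.1234/example-dataset"),
}

if not is_valid_orcid(record["creator_orcid"]):
    raise ValueError("ORCID iD looks mistyped")
```

A check like this only catches typing errors. It does not confirm that the iD or DOI is registered, which you would verify with ORCID or the repository itself.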
Resources to help you implement the 'Findable' principle in your work...
- Explore guidance on deciding where to deposit your data on DCC here.
- Find a trusted repository to deposit your data by using this guide from OpenAIRE.
- Follow DCC's guidance on 'How to Cite Datasets and Link to Publications' here.
- Create a Persistent Identifier for yourself with ORCID iD.
Accessible
Your data and data documentation are accessible by other people. Note, this does not mean data has to be 'open'. Even when data is restricted, documentation that outlines what your data describes and what the data sharing policies are should be available. FAIR states data should be 'as open as possible, as closed as necessary'.
What you could do...
- Use technology and software that is readily available and widely used in your discipline so people can access your data easily, if required. Where possible, use open software rather than proprietary software that some people may not have the resources to use. Alternatively, you can use proprietary software in your research processes and then save and share your data in an open format such as .csv.
- Use the latest version of your software to store your data. This reduces the risk of your data being stuck in an outdated format (the best software will depend on the data you have collected).
- Provide your data to repositories in their preferred format so your data can be found on their webpages and catalogues.
- Create metadata that outlines how other people can access your data.
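The suggestion to save and share data in an open format such as .csv can be sketched with Python's standard `csv` module. The column names, rows and function name are invented for illustration; in practice the rows would come from whatever software you used to collect the data.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialise a list of records as CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up example records; a comma inside a value is quoted automatically.
rows = [
    {"participant_id": "P001", "age_group": "18-24", "response": "Agree"},
    {"participant_id": "P002", "age_group": "25-34", "response": "Disagree, strongly"},
]
csv_text = rows_to_csv(rows, ["participant_id", "age_group", "response"])
```

To share the result, the text could be written to a file opened with `encoding="utf-8"` and `newline=""`, so that the file opens consistently across operating systems and tools.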
Resources to help you implement the 'Accessible' principle in your work...
- Determine the best file format for your data by using this guide from UK Data Service.
Interoperable
Data and data documentation should be easily integrated and combined with other datasets and documentation.
What you could do...
- Ensure your data and data documentation are clean and well structured, identifying and correcting any input or format errors. This will make sure your data is ready to interoperate with applications or workflows for analysis, storage, and processing. It can then also be integrated with other data should data comparison and re-use take place.
- Use tools to standardise your data and data documentation, such as controlled vocabularies, classifications and ontologies (we will explore these tools in unit 2.4).
- Create metadata that complies with relevant metadata standards and schemas (we will cover metadata standards and schemas in more depth in unit 2.5). Metadata records that use the same standard are more interoperable and can easily be compared and integrated in centralised systems (such as data and metadata catalogues).
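One way to picture standardising values against a controlled vocabulary is the sketch below. The vocabulary, the synonym table and the function name are hypothetical and are not drawn from any published standard; a real project would use the vocabulary adopted in its discipline.

```python
# Hypothetical controlled vocabulary for a "data collection method" field.
PREFERRED_TERMS = {"interview", "questionnaire", "focus group"}

# Hypothetical variants and misspellings mapped to their preferred term.
SYNONYMS = {
    "survey": "questionnaire",
    "questionaire": "questionnaire",
    "focus-group": "focus group",
}


def standardise(values):
    """Map free-text values to preferred terms and collect anything unrecognised."""
    cleaned, unrecognised = [], []
    for raw in values:
        # Normalise case and collapse stray whitespace before looking the value up.
        key = " ".join(raw.strip().lower().split())
        term = key if key in PREFERRED_TERMS else SYNONYMS.get(key)
        if term is None:
            unrecognised.append(raw)
        cleaned.append(term)
    return cleaned, unrecognised


cleaned, unrecognised = standardise(["Survey ", "Focus-Group", "interview", "diary"])
```

Here `cleaned` holds the preferred term for each value (or `None` where no match was found), and `unrecognised` lists the values that need a manual decision rather than being silently dropped.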
Resources to help you implement the 'Interoperable' principle in your work...
- Clean your dataset with open source tools such as Open Data Editor or OpenRefine, correcting any errors found in the dataset.
- We will cover controlled vocabularies and metadata standards later in the course. Head there for further guidance on how to use them.
Re-usable
Data should be available for future research, study and processing, for example in secondary research and cross-study comparisons.
What you could do...
- Select a clear data usage and sharing license.
- Use terms that are used widely in your discipline and/or standardised by an academic community in a formal vocabulary (more information about standardisation tools such as controlled vocabularies and metadata standards in unit 2.4 and unit 2.5).
- Create metadata to describe the context and contents of your study and its data so other researchers can assess whether they can re-use data for their study.
Resources to help you implement the 'Re-usable' principle in your work...
- Find the data sharing license best suited to your research by using Creative Commons' guidance and their license chooser tool.
FAIR and Metadata
What was a recurring feature across all four principles?
Metadata
Documenting your data with metadata is one of the most effective ways of implementing the FAIR principles in your research. In units 2.1 and 2.2, we'll explore what metadata is and how it's used in research.
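To make the link between metadata and the four principles concrete, the sketch below checks a metadata record for one field per principle. The field names and the one-field-per-principle mapping are simplifying assumptions made for illustration, not an official FAIR assessment; the checklists listed under 'Further resources' are far more thorough.

```python
# Hypothetical mapping from each principle to one illustrative metadata field.
FIELDS_BY_PRINCIPLE = {
    "Findable": "identifier",
    "Accessible": "access_conditions",
    "Interoperable": "metadata_standard",
    "Re-usable": "licence",
}


def missing_principles(record):
    """List the principles whose illustrative field is absent or empty."""
    return [
        principle
        for principle, field in FIELDS_BY_PRINCIPLE.items()
        if not str(record.get(field) or "").strip()
    ]


# Made-up record: the DOI is not real, and two fields are left incomplete.
record = {
    "identifier": "https://doi.org/10.1234/example-dataset",
    "access_conditions": "Available on request from the repository",
    "metadata_standard": "",
}

print(missing_principles(record))  # prints ['Interoperable', 'Re-usable']
```

A passing check only shows that the fields are filled in. Whether the values are accurate and useful to another researcher still needs human judgement.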
Benefits of FAIR
Some of the benefits of FAIR data are outlined below. These benefits are some of the many reasons FAIR is increasingly required by funders and scientific journals.
Maximises the value and use of data
Adopting FAIR practices ensures the long-term preservation and use of data. By using widely available software and depositing data in trusted repositories, data is discoverable and accessible for future use. This means data can outlive an individual project and be used as secondary data or in cross-study comparisons. Having strong documentation that uses standardised terms helps others to find, understand and use data, even if they don't have any contact with the original research team. For researchers who created the data, their data reaches more people and can be cited more widely, increasing their visibility without adding any further work.
Saves money, time and resources
By enabling data to be re-used in secondary research, cross-research comparison and other outputs such as policy briefings, FAIR supports sustainable research practices. It reduces the risk of duplicating research and saves the time, money and resources that would be spent on separate research projects collecting data on the same subject. It also ensures that data is not produced only to sit in isolation, but is used across the research community.
Improves the quality, accuracy and security of data
Creating strong documentation around your data and datasets as you go along reduces the risk of losing or duplicating data. If you need to produce reports about your research process, all the information is ready to go and you don't need to hunt down missing datasets. This saves you the challenge of having to document data retrospectively at the end of a project.
Makes research processes transparent and trustworthy
Documenting your dataset and research methods in line with FAIR principles provides evidence of transparent and valid conduct, giving people confidence in your data and research. Having clear and robust documentation builds trust, fosters collaboration, and strengthens the credibility of research findings.
If you want to explore some real-life examples, the University of Sheffield has created case studies of FAIR data in different disciplines.[8]
Test your knowledge
True or false...
- FAIR data must always be open and free to access.
- Search engines and registries play a key role in making data findable.
- I can make my data FAIR without capturing metadata for my study.
- Data does not need a unique identifier to be considered findable.
- FAIR principles encourage the use of community standards for metadata.
- Interoperability means data can be used only within the original software environment.
- Even if access to data is restricted, metadata should remain openly accessible.
- Reusability means data can be reused without any conditions or context.
Answers
- FAIR data must always be open and free to access. FALSE
- Search engines and registries play a key role in making data findable. TRUE
- I can make my data FAIR without capturing metadata for my study. FALSE
- Data does not need a unique identifier to be considered findable. FALSE
- FAIR principles encourage the use of community standards for metadata. TRUE
- Interoperability means data can be used only within the original software environment. FALSE
- Even if access to data is restricted, metadata should remain openly accessible. TRUE
- Reusability means data can be reused without any conditions or context. FALSE
Further resources
FAIR resources and training
If you want to learn more about the FAIR data principles, explore the resources and training material available below.
- GO FAIR is a stakeholder-driven initiative that aims to implement the FAIR data principles. They provide information on each principle as well as suggestions on how to go FAIR. Check out their guidance on the FAIR Principles here.
- 'How to FAIR' takes a deep dive into what FAIR is and how you can implement it. You can explore the website here.
- The University of Mannheim's Research Data Center has produced PDF guides for each of the FAIR principles, giving guidance on how to implement them.
- The Australian Research Data Commons offers a self-guided FAIR Data 101 training with practical activities. You can explore the training here.
- MANTRA offers an interactive training module on FAIR sharing and access here.
- OpenAIRE provides online guidance on how to make your data FAIR. You can explore the guidance here.
- The University of Oxford explains FAIR in 60 seconds. You can watch the video here.
- Parthenos offers an online training course on FAIR, research data management and sharing your data. You can view the online course here.
Different organisations have created checklists to help researchers assess how FAIR their data is. Check a few out here ...
- ARDC offers an online FAIR data assessment tool.
- UK Data Service offers a FAIR PDF checklist to review your data.
- OpenAIRE offers a PDF checklist.
References
- [1] Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016)
- [2] GO FAIR (2025) FAIR Principles
- [3] Martínez-Lavanchy, P.M., Hüser, F.J., Buss, M.C.H., Andersen, J.J., Begtrup, J.W. (2019). ‘FAIR Principles’. In: Holmstrand, K.F., den Boer, S.P.A., Vlachos, E., Martínez-Lavanchy, P.M., Hansen, K.K. (Eds.), Research Data Management (eLearning course) Online video
- [4] Open University (2025) FAIR Principles
- [5] OpenAIRE (2025) How to make your data FAIR
- [6] Whyte, A. (2015). ‘Where to keep research data: DCC checklist for evaluating data repositories’ v.1.1 Edinburgh: Digital Curation Centre. Available online: www.dcc.ac.uk/resources/how-guides
- [7] DCC (2015) 'How to Cite Datasets and Link to Publications DCC Guide' Available online: https://www.dcc.ac.uk/guidance/how-guides/cite-datasets
- [8] University of Sheffield (2025) FAIR case studies: good practice on FAIR data and software www.sheffield.ac.uk/openresearch/faircasestudies