Find, reuse and cite data
Data sources
There are thousands of data repositories worldwide, housing datasets from a wide range of research areas. Some repositories hold subject-specific data, while others contain data from several disciplines. Here are some suggested starting points for finding data; the list is by no means exhaustive.
Find repositories in your research area
Search re3data (the Registry of Research Data Repositories) to find relevant repositories.
re3data is a DataCite service. It provides information on over 2,000 research data repositories from every domain and in every country. You can browse by subject, country, or content type, and search by any combination of 41 different attributes.
DataCite search
DataCite is a global non-profit organisation providing persistent identifiers (DOIs) for research data. Data DOIs help researchers locate, identify, and cite research data with confidence.
Use DataCite Commons to find datasets in your topic area.
re3data and DataCite search are complementary services:
- re3data will help you find discipline-specific repositories.
- DataCite will search across multiple repositories to find datasets related to your subject.
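A DataCite search can also be scripted. The sketch below is a minimal illustration, assuming the public DataCite REST API at https://api.datacite.org/dois and its `query`, `resource-type-id` and `page[size]` parameters. The function names are hypothetical, and the response fields read here (`doi`, `titles`, `publisher`) should be checked against the current API documentation before you rely on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the public DataCite REST API for DOI records.
DATACITE_API = "https://api.datacite.org/dois"


def build_datacite_url(keywords, page_size=5):
    """Build a search URL limited to records whose resource type is 'dataset'."""
    params = {
        "query": keywords,
        "resource-type-id": "dataset",
        "page[size]": page_size,
    }
    return f"{DATACITE_API}?{urlencode(params)}"


def search_datasets(keywords, page_size=5):
    """Return (DOI, title, publisher) tuples for matching datasets.

    Needs network access. Missing fields come back as None.
    """
    with urlopen(build_datacite_url(keywords, page_size), timeout=30) as response:
        payload = json.load(response)
    results = []
    for record in payload.get("data", []):
        attrs = record.get("attributes", {})
        titles = attrs.get("titles") or [{}]
        results.append((attrs.get("doi"), titles[0].get("title"), attrs.get("publisher")))
    return results
```

Calling `search_datasets("air quality")` would return a short list of dataset DOIs with their titles and publishers. Each DOI can then be followed to the repository that holds the data.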
Search multi-disciplinary repositories
Searching multi-disciplinary repositories can help you find more datasets in your area because of their broad content and coverage.
Dryad is a repository governed by a non-profit membership organisation. Several publishers partner with Dryad to coordinate the submission of manuscripts with the submission of data to Dryad.
UK Data Archive is the UK's largest collection of digital research data in the social sciences and humanities. It is funded by the Economic and Social Research Council (ESRC), Jisc and the European Union (EU).
UK Data Service is part of the UK Data Archive. It is the UK’s largest collection of social, economic and population data resources.
Figshare is a third-party repository owned by Digital Science. It is free for researchers to use.
Zenodo is a data repository for EU-funded research, developed in partnership with CERN. Anyone can use it free of charge, irrespective of funder. It is also used for software and code via an integration with GitHub.
Google Dataset Search is a search engine for datasets. Using a simple keyword search, users can discover datasets hosted in thousands of repositories across the web.
Wellcome Trust approved data repositories
The Wellcome Trust approves a number of data repositories, which you may find useful to search depending on your research area.
Local and regional data sources
Data Mill North: open datasets from multiple sectors across the City of Leeds and the North of England.
OpenInnovations: an initiative that uses open data to drive social, economic, and environmental change.
How to search data repositories
Each data repository works in a different way, so to get the best results you should adapt your search to each one. For example, some repositories are better suited to browsing by topic, while others work better with keyword searches. Where possible, consult the repository’s user guide to learn the most effective way to find datasets.
When searching for data, you should assess the quality of the data source as you would any other scholarly resource.
Find out how to check the quality of external repositories
Data DOI links in articles
It is becoming more common to see dataset links in full-text articles. You may find some datasets when reviewing your full-text papers in the final stages of your literature search.
For example, in the life sciences repository Europe PMC, you can use advanced search to find articles that cite data DOIs.
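The same kind of search can be run programmatically. The sketch below is an illustration only, assuming the Europe PMC RESTful search service at https://www.ebi.ac.uk/europepmc/webservices/rest/search. It also assumes that the search field `ACCESSION_TYPE:doi` restricts results to articles that cite data DOIs, so check this against the Europe PMC search syntax reference. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Europe PMC RESTful search service.
EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_query(topic):
    """Combine the assumed data-DOI filter with a quoted topic phrase."""
    return f'ACCESSION_TYPE:doi AND "{topic}"'


def build_search_url(topic, page_size=10):
    """Build a search URL that asks for JSON results."""
    params = {"query": build_query(topic), "format": "json", "pageSize": page_size}
    return f"{EUROPE_PMC_SEARCH}?{urlencode(params)}"


def find_articles_citing_data(topic, page_size=10):
    """Return (title, article DOI) pairs for matching articles.

    Needs network access. Missing fields come back as None.
    """
    with urlopen(build_search_url(topic, page_size), timeout=30) as response:
        payload = json.load(response)
    hits = payload.get("resultList", {}).get("result", [])
    return [(hit.get("title"), hit.get("doi")) for hit in hits]
```

For instance, `find_articles_citing_data("malaria")` would list articles on malaria that cite data DOIs. The datasets themselves are linked from each article's record in Europe PMC.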