Open data

Everything you need to know about sharing your research data openly

Why choose open data?

Open data plays a vital role in the research landscape by accelerating the pace of discovery, promoting data reuse, and enabling the testing and validation of research findings. Over the past decade, data has become a priority for academic stakeholders, including governments, funders, institutions, and publishers worldwide.

Today nearly all major scientific journals have an open data policy. However, according to the 2022 State of Open Data report, many academics find sharing data difficult.

On this page, we’ll provide information and guidance to help you get to grips with open data. We’ll also answer your open data questions, including:

What is open data?

What are the different types of research data?

What are the benefits of sharing my research data openly?

How do I share my research data?

Why is it important to make my data FAIR?

DOWNLOAD OUR FREE eBook

Open data demystified: the essential toolkit for researchers

Fill in the form below for expert guidance on how to collect, store, format, share and publish your data.

What is open data?

Open data is data that is available for everyone to access, use, and share. For researchers, this refers to any datasets collected or created as part of your research project.

In some cases, data sharing is not appropriate for legal, ethical, data protection, or confidentiality reasons. F1000 recommends researchers strive to make their data as open as possible and as closed as necessary. This means you should only restrict access to data where essential, for example, for security or confidentiality reasons.

There are many types of research data, both quantitative and qualitative. These include:

Survey results
Software
Models and algorithms
Interviews and transcripts
Images, videos, and audio files
Genome sequences

Benefits of open data

Sharing your data benefits your career, other researchers, and society. In recent years, open data has become a priority for academic stakeholders globally. So how can you benefit from open data?

Benefits for your career

Increase the discoverability of your research: linking your open data and published research outputs can increase the readership of your research.
Increase citations: research shows that articles with links to datasets shared in repositories generated up to 25% more citations than articles that did not share data in repositories.
Enhance the credibility of your work: when the data supporting your findings is openly available, others can replicate your work to validate your results and conclusions.
Establish ownership and get credit for your data: uploading it to a repository allows you to establish ownership through a persistent identifier so other researchers can cite it.
Facilitate collaboration and new partnerships: researchers in your field and beyond can access and use your data, leading to greater collaboration and new research projects.

Benefits for the community

Supports reproducibility: open data enhances research rigor by making it easier for others to validate, replicate, and reproduce your findings.
Reduces research waste: when data is openly available, research becomes more efficient because other researchers do not need to duplicate the effort of collecting it.
Enables others to reuse your data: sharing data can lead to reuse by providing a foundation for others to build on.
Preserves data more securely over time: data hosted in a repository is more secure than data hosted on a website or kept in personal files.

Benefits for society

Gives greater visibility over results of publicly funded research: open data offers a chance to make research results openly available as a public good, as research is often publicly funded.
Can lead to real-world impact: when data is open, we can accelerate the pace of research discovery to solve societal challenges in real time.
Fosters trust in research: transparency and accountability help to foster public trust in the research process and results.

How to share your research data

Write a data management plan before your project begins

Planning how you will manage and share your data makes it much easier to open your data at the end of your project. Before research begins, create a detailed Data Management Plan (DMP). A DMP is a living document that describes how your research outputs will be generated, stored, used, and shared, and it can change and evolve throughout your research project. Even where your funder or publisher doesn’t require a DMP, creating one helps to ensure efficient data management and makes it easier to make your data FAIR.

Prepare the data for sharing

You’ve collected your data; now it’s time to prepare it for sharing. While some restrictions may make it impossible to share your dataset, in other cases, you can share sensitive data provided you take the necessary precautions to protect the confidentiality of research participants. Once you’ve determined the extent to which you can share your data, you’ll need to format your data, label your files for sharing, and prepare any additional materials needed to understand and use the data. For example, you may include a data dictionary and details of any software needed to process the data. Different disciplines and data repositories may have different standards around formatting data, so research this before you get started.
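As a rough illustration of the data dictionary mentioned above, the following Python sketch drafts one entry per column of a CSV file. The function names, the sample columns, and the simple type-guessing rule are all assumptions made for this example. Your discipline or repository may expect a different layout.

```python
import csv
import io


def infer_type(values):
    """Guess a simple type label from a list of non-empty strings."""
    if not values:
        return "unknown"
    for label, cast in (("integer", int), ("number", float)):
        try:
            for value in values:
                cast(value)
            return label
        except ValueError:
            continue
    return "text"


def build_data_dictionary(csv_text, descriptions=None):
    """Return one draft data dictionary entry per column of a CSV file."""
    descriptions = descriptions or {}
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        # Treat empty cells as missing values.
        values = [row[name] for row in rows if row[name] not in ("", None)]
        entries.append({
            "variable": name,
            "description": descriptions.get(name, "TODO: describe this variable"),
            "type": infer_type(values),
            "missing": len(rows) - len(values),
            "example": values[0] if values else "",
        })
    return entries


# Hypothetical survey extract, used only to demonstrate the function.
SAMPLE = "participant_id,age,response\n1,34,agree\n2,,disagree\n3,29,agree\n"

dictionary = build_data_dictionary(
    SAMPLE, {"age": "Age in years at time of survey"}
)
for entry in dictionary:
    print(entry)
```

The output is only a starting point: the descriptions, units, and coding schemes still need to be written by someone who knows the dataset.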

Deposit your data in a repository

A repository is an online storage infrastructure for researchers to store data, code, and other research outputs. Depositing your data in a publicly accessible, recognized repository ensures that your dataset continues to be available to both humans and machines in a usable form. Uploading data to a repository helps preserve it more securely over time than hosting it on a website. Plus, you’ll receive a persistent identifier (PID) to establish ownership and enable others to cite the data. Your institutional librarian, funder, and colleagues can likely guide you in choosing a repository relevant to your discipline.

Apply an open license to the data

Apply an open license to your data to permit others to reuse it with minimal restrictions. Permitting reuse supports reproducibility and transparency in research and allows others to build on your findings. The Creative Commons Public Domain Dedication (CC0) and the Creative Commons Attribution (CC BY) license are popular examples. Both allow reusers to distribute, remix, adapt, and build upon the materials in any medium or format. The critical difference is that CC0 has no requirement for attribution, while CC BY requires reusers to credit the original creator.

Make your data easy to find

Always cite your dataset in your published article and include a data availability statement. A data availability statement is a short section of text that tells the reader how, where, and under what conditions the data associated with your research can be accessed and reused. Once your research is published, some repositories allow you to add the article’s Digital Object Identifier (DOI) to the metadata of your dataset to establish a permanent link between these two outputs of your research. You can also choose to publish a Data Note to maximize the potential of your research data. A Data Note is a peer-reviewed article type that describes why and how your data was collected, analyzed, and validated.
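To make the "how, where, and under what conditions" structure concrete, here is a minimal Python sketch that assembles a data availability statement from its parts. The wording, the function name, the repository name, and the DOI are placeholders invented for this example. Journals usually publish their own required template, so check that first.

```python
def data_availability_statement(repository, title, identifier,
                                license_name, restrictions=None):
    """Assemble a short statement: where the data is, and on what terms."""
    statement = (
        f"The data underlying this article are available in {repository}: "
        f"{title}, {identifier}. "
        f"Data are available under the terms of the {license_name} license."
    )
    if restrictions:
        # Explain any access conditions, for example for sensitive data.
        statement += f" {restrictions}"
    return statement


# All values below are hypothetical placeholders, not real records.
statement = data_availability_statement(
    repository="Example Repository",
    title="Example survey dataset",
    identifier="https://doi.org/10.1234/placeholder",
    license_name="CC BY 4.0",
)
print(statement)
```

Passing a `restrictions` sentence lets you state clearly why and how access is limited when part of the dataset cannot be shared openly.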

What is FAIR data?

The FAIR guiding principles for scientific data management and stewardship were developed in 2016 to ensure research data is:

Findable

Data should be deposited in a repository that assigns a persistent identifier (PID), such as a digital object identifier (DOI). Use metadata to give a detailed description of your data.

Accessible

The repository must use a standard protocol, such as HTTP or HTTPS. The repository must continue to provide a landing page and the metadata even if the dataset is removed.

Interoperable

The metadata used to describe the data should be based on standard subject vocabularies and should be machine-readable. You can find the subject standards at FAIRsharing.org.

Reusable

The metadata that describes the data is accurate and relevant. An explicit data license has been applied to the data, explaining what other users can and cannot do.
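As an informal illustration of machine-readable metadata, the Python sketch below stores a dataset description as JSON and checks that a few core fields are filled in. The field names, the list of required fields, and the record itself are assumptions made for this example rather than a formal metadata standard. In practice, your repository will define the schema it expects.

```python
import json

# Fields this sketch treats as essential; a real schema will differ.
REQUIRED_FIELDS = ("identifier", "title", "description", "license", "creators")


def missing_metadata(record):
    """Return the required fields that are absent or empty in a record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Hypothetical record; the DOI is a placeholder, not a real identifier.
record = {
    "identifier": "https://doi.org/10.1234/placeholder",
    "title": "Example survey dataset",
    "description": "Responses to a hypothetical survey.",
    "license": "CC0-1.0",
    "creators": [{"name": "Researcher, A."}],
    "keywords": ["survey", "example"],
}

print(json.dumps(record, indent=2))
print("Missing fields:", missing_metadata(record))
```

A check like this can run before deposit to catch a missing identifier or license, two of the items the FAIR principles above call for.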
