Researcher's guide to responsible and open science

Opening your data

When you archive your data in a data repository, it will be stored there until further notice, so the archiving period is not defined in years. If the data only needs to be stored for a limited time after the project to allow the results to be verified, and it cannot be shared (for example for data protection reasons), your own working storage space may be the most suitable solution.

The openness of research data is a key part of responsible and ethical research activity. The degree of openness may range from fully open through partially open to available only with permission. Assess which parts of the data should or can be opened, and which parts should be permanently destroyed. When sharing and opening data, follow the principle "as open as possible, as closed as necessary". If you cannot open the actual data, at least publish its metadata, i.e. descriptive information.

The timing of opening your data depends largely on you. At what point do you want to open your data for others to use, and when are you able to? You can choose whether to open your data during or after the research project, but check whether your funder sets a schedule. For example, the Research Council of Finland states: "Research data must be made freely available as soon as possible after the research results have been published."

Making research data open access benefits you, the scientific community, and society at large. Once opened, the data remains available to you and others even decades from now, your research gains more visibility, and your publication may receive more citations. Add information about the open data to your CV, as open data is a merit for a researcher. Opening the data also increases the transparency and reproducibility of research.

Many parties encourage making research data open access. As a rule, the research data, scientific publications, and theses produced at Tampere University and Tampere University of Applied Sciences, as well as the research methods used, are shared and open.

Opening data in a data repository

Research data should be made open access primarily in a national or international data repository or storage service that is specific to your field of science. The Finnish Social Science Data Archive (FSD) and the Language Bank of Finland are Finnish data repositories with a Core Trust Seal certification. The certificate indicates that the organisation stores electronic data reliably and enables its reuse efficiently. International multidisciplinary data repositories include Zenodo and Figshare, among others. You can search for repositories in the re3data register.

When choosing a data repository, check that:

  • the repository assigns a persistent identifier to the data, such as DOI or URN.
  • the repository publishes machine-readable metadata and uses a well-known metadata standard.
  • the repository has clear policies for data access and use.
  • the repository allows you to choose the terms of use (e.g. an embargo, which can be used to technically limit when the data becomes available) and licenses (e.g. CC-BY licences) for reuse of your research data.
  • the repository has a certificate of operational reliability (e.g. Core Trust Seal or ISO 16363).

A repository may also have specific requirements for the archived data, so contact a repository suitable for your research data right at the beginning of your research project.

Repositories offer different options for opening your data to a limited or a wide group of users. For example, the Language Bank of Finland has three access categories: PUB for fully public distribution, ACA for research and teaching use, and RES for use that requires permission from the rights holder.

FSD's access categories are very similar: (A) openly available data; (B) data available for research, teaching, and study; (C) data available for research and theses only (including Master's, doctoral and Polytechnic/University of Applied Sciences Master's theses); and (D) data available only by permission from the data depositor/creator.

The metadata of your data is publicly visible in a data repository. This is a benefit, because the metadata alone makes your research more visible and promotes your work.

Making data open access as part of a research article

Many scientific publications require research data to be opened to verify research results and to increase transparency. Including research data as supplementary material to a journal article does not necessarily make the data open if access to it requires e.g. a paid subscription to the journal or the purchase of the article. However, these days many publishers recommend using data repositories.

Opening/describing data in a data article

A data article describes the open research dataset(s) thoroughly, i.e. describes the content of the dataset, its collection, processing, and the tools used. Data articles do not usually contain research results or discussions (an exception may be discussions regarding different data collection methods). Data articles can be published in data journals or in a separate "Data Papers" section of a scientific journal. Data-focused journals include Scientific Data (Springer Nature), Data in Brief (Elsevier), and Research Data Journal for the Humanities and Social Sciences (Brill).

Making metadata open access

During archiving, the research data is also described, and this metadata becomes public. Data repositories often use metadata standards, which means that the metadata is structured and thus easily machine-readable. It is advisable to make a public metadata record of the research data even when the data cannot be made open access. The metadata describes the content and access rights of the data, and it also shows relevant identifiers.
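
As a rough illustration of what a structured, machine-readable metadata record can look like, the sketch below builds a minimal record in Python and serialises it to JSON. The field names, the list of required fields, and the DOI are placeholders invented for this example; they do not follow the schema of any particular repository or metadata standard.

```python
import json

# Hypothetical minimal metadata record. Field names and values are
# illustrative placeholders, not a real repository schema.
record = {
    "title": "Example interview dataset",
    "creators": ["Researcher, Example"],
    "publication_year": 2024,
    "identifier": {"type": "DOI", "value": "10.1234/example"},
    "access_rights": "restricted",
    "license": "CC-BY-4.0",
    "description": "Placeholder description of content, collection and processing.",
}

# Assumed set of fields a record should always carry (for this sketch only).
REQUIRED_FIELDS = ("title", "creators", "identifier", "access_rights")


def missing_fields(rec):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not rec.get(field)]


# Serialise the record so that other software can read it.
metadata_json = json.dumps(record, indent=2, ensure_ascii=False)
```

Note that the record states the access rights even though the data itself is restricted, which is the situation described above: the metadata can be public when the data is not.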

Qvain, produced and maintained by CSC, is a suitable tool for describing your research data when publishing it. Qvain supports the use of controlled vocabularies, provides a comprehensive list of licenses, and creates a permanent identifier for your metadata. On Qvain's website, you will find field-specific description instructions. In addition, Qvain allows you to publish your metadata directly in the national Etsin search service for research datasets.

Checklist for anyone planning to make data open access:

  • Plan the opening of the data in advance! A data management plan will be helpful.
  • Remember to inform the research participants if the data will be made open access and ask for their consent. FSD has excellent instructions for informing participants about archiving.
  • Describe your research data carefully so that others can also understand what your data is about.
  • Check that ethical or legal issues do not prevent the publication and opening of the research data.
  • Prefer a data repository that provides a permanent identifier for the data (e.g. DOI, URN) and thus allows referencing of the data.
  • Define clear terms of use for the research data, for example with licences or the repository’s own terms of use.

Funders' requirements for making data open access

These days many research funders require that research data is made open access. When applying for funding it is a good idea to check the funder’s up-to-date requirements. Here are examples from a few funders:

  • Bill & Melinda Gates Foundation:
    • the research data underlying the published research results must be open access immediately.
    • ethical and legal reasons can prevent the data from being made open access.
    • it is recommended that the data is stored primarily in a discipline-specific archive of one's own field of science.
  • Business Finland:
    • recommends making research data open access where applicable.
  • Horizon Europe:
    • the data must be made open access as soon as possible, and the schedule specified in the data management plan must be followed.
    • a reliable data repository must be used when making the data open access (check if there are any special requirements for the repository in the funding programme).
    • the terms of use of the data must be defined with a Creative Commons license or equivalent. Recommended Creative Commons licenses are CC BY and CC0. Usage rights to the metadata for the opened research data must be defined with a CC0 licence.
    • for justified reasons, the data can be left closed. Legitimate reasons include IP rights, for example.
  • Kone Foundation:
    • recommends that data collected in connection with research is archived for reuse.
  • NordForsk:
    • the data must be made open access at the end of the project.
    • the data must be made open access in a certified data repository.
    • for justified reasons, the data can be left closed. Justified reasons include, for example, restrictions imposed by legislation and agreements.
  • Research Council of Finland:
    • the data must be made open access as soon as possible after the publication of the research results.
    • the data must be made open access in a national or international data repository important to the research organisation or field of science.
    • for justified reasons, the data may have different levels of openness.
    • the party making the data open access must ensure that the publication does not violate the Act on the Openness of Government Activities, the Data Protection Act, or the Copyright Act.
    • the licensing of the data must be ensured.
  • Sherpa/Juliet Service is a search service that provides the latest information on funders' policies and requirements related to open science, publishing, and archiving.

Making data open access for peer review

Article manuscript reviewers may request access to research data. Through Zenodo, you can share a link to your research data without compromising the double-blind review process. Note that Zenodo's link to the data is temporary, and valid only for one month after the data is added to the repository.

Adding data to Zenodo for reviewers (pdf)

Data availability statement (DAS)

It is a good idea to include a Data Availability Statement (DAS) as part of the research publication. You can use a DAS to explain the data behind your publication and its availability. A DAS promotes transparency in the research process and brings visibility and discoverability to your research data. It also supports the implementation of the FAIR principles in your research. It is equally important to state the availability of your code. You can do this in connection with the DAS or with a separate Code Availability Statement.

How to write a Data Availability Statement

  • Read the publisher’s and funder’s instructions.
  • Write a DAS even if the publisher or funder doesn't require it.
  • If you use template phrases provided by the publisher, ensure that they reflect your data. Edit them if necessary. Data Availability Statement templates are provided by, for example, Springer Nature and Taylor & Francis. PLOS provides Code Availability Statement templates.
  • Explain availability on a data-by-data basis.
    • Explain where the data is available and if there are any conditions related to its availability.
    • State the name of the repository and the persistent identifier link, and the license if one applies.
    • If the data is available from the researcher, explain how and where the data can be accessed. Consider how long you as a researcher can store the data.
    • If there are restrictions on the use of the data, explain the reasons for the restrictions and the conditions under which the data could be used.
    • If the data is not available, explain why.
  • If no data has been used in the publication, indicate this in the DAS.
  • If you used data provided by someone else, cite it appropriately.
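
The checklist above can be sketched as a small Python helper that composes a statement dataset by dataset. This is only an illustration under assumptions: the function names, parameters, and sentence wording are invented for this example, and the DOI is a placeholder. Always adapt the final wording to your publisher's and funder's instructions.

```python
def availability_sentence(name, repository=None, identifier=None,
                          license_name=None, restriction=None):
    """Compose one availability sentence for a single dataset."""
    if repository and identifier:
        # Openly deposited data: name the repository, identifier and license.
        sentence = f"{name} is available in {repository} at {identifier}"
        if license_name:
            sentence += f" under the {license_name} license"
        return sentence + "."
    if restriction:
        # Data that cannot be shared: give the reason.
        return f"{name} is not openly available because {restriction}."
    # Fallback wording (an assumption of this sketch): data held by the researcher.
    return f"{name} is available from the corresponding author on request."


def data_availability_statement(datasets):
    """Join the per-dataset sentences into one statement."""
    if not datasets:
        return "No data were used in this publication."
    return " ".join(availability_sentence(**dataset) for dataset in datasets)


statement = data_availability_statement([
    {"name": "The survey data", "repository": "Zenodo",
     "identifier": "https://doi.org/10.1234/example",
     "license_name": "CC BY 4.0"},
    {"name": "The interview data",
     "restriction": "it contains personal data"},
])
```

Treating each dataset separately mirrors the advice to explain availability on a data-by-data basis, and the empty case covers publications that use no data.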

Citing research data

Citing research data acknowledges the original researcher as the creator or collector of the data used. Citation practices are guided by the Copyright Act, the guidelines for the data repository being used, and the norms of the research community.

Data storage services have their own more general guidelines, and the metadata of individual datasets may also contain citation instructions. If specific citation instructions are not available, the data should be cited in the same way as any other publication.
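
When no specific instructions exist, a data citation can be assembled from the dataset's metadata in much the same way as a publication reference. The Python sketch below shows one possible pattern (creators, year, title, optional version, publisher, identifier). The function name, the element order, and all example values are assumptions for illustration; follow the repository's or journal's citation style where one is given.

```python
def format_data_citation(creators, year, title, publisher, identifier,
                         version=None):
    """Build a citation string from basic dataset metadata."""
    authors = "; ".join(creators)
    parts = [f"{authors} ({year}).", f"{title}."]
    if version:
        # The version is optional but helps readers find the exact data used.
        parts.append(f"Version {version}.")
    parts.append(f"{publisher}.")
    # The persistent identifier goes last so it can be followed as a link.
    parts.append(identifier)
    return " ".join(parts)


citation = format_data_citation(
    creators=["Researcher, E."],
    year=2024,
    title="Example dataset",
    publisher="Zenodo",
    identifier="https://doi.org/10.1234/example",  # placeholder DOI
    version="1.0",
)
```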

Policies for the responsible evaluation of research increasingly emphasise that data citations should be taken into account as part of recognising the diversity of research outputs. Responsibility for establishing data citation practices lies with research organisations, science policy bodies, funders, and publishers, as well as individual researchers.

CrossCite is a tool that helps you cite data.

Long-term preservation of data (PAS)

Long-term preservation (PAS, from the Finnish pitkäaikaissäilytys) ensures that digital research data remains understandable and usable for several decades or even centuries. Tampere University has its own PAS process, in which significant research data is first identified and then transferred to the PAS service. Contact Research Data Services (researchdata@tuni.fi) if you think your data is suitable for long-term preservation.

Research participants must be informed at the data collection stage how long the data concerning them will be stored as part of the research. In any case, it is a good idea to plan the storage of the data as early as possible. If the data turns out to need long-term preservation, proactive planning (e.g. of storage locations and file formats suited to long-term preservation) will facilitate the transfer of the data to storage after the research has concluded.

Making source code open access and licenses

Copyleft and permissive licenses

Open licenses can be roughly divided into two categories: copyleft and permissive licenses. In both cases, anyone is free to use or copy the original software. The difference in licenses is reflected in the licensing of derivative works. 

  • A copyleft license means that modified versions of the software must be licensed under the same conditions as the original software. If program A is licensed under GPL (so that it can be freely copied), its derivatives A¹ and A² must also be licensed under GPL (i.e. they can also be freely copied).
  • A permissive license allows you to modify and distribute the software, also for commercial purposes. For example, if program B is licensed under BSD, commercial versions B¹, B², etc. can be made. A derivative of B could also be licensed under GPL, in which case any further derivatives of that version would have to be licensed under GPL.

Permissive licenses may be of more interest to commercial operators, while copyleft licenses may be of interest to independent software developers. Neither license type imposes direct restrictions on commercial use of the software. Free software can be used in private business, and for example software support services can be sold. A company can pay the software developer to implement additional features.

Choosing a license for source code

Source code licensing must be taken care of appropriately. Independent commercialisation of the software can be a reason for choosing the license used. When choosing, you must also take into account case-by-case requirements for future licensing needs. The choice of licence should also be discussed with your supervisor, laboratory leader, or dean. If there are commercial interests involved with the software, please contact the University's Innovation Services at inventions@tuni.fi.

If you continue developing existing source code, use the same license as in the original program. Using the same license may be mandatory, and in any case, it is almost always sensible. If you're starting from scratch, choose a well-known license such as MIT, GNU GPL, or BSD. Do not limit the use of the program to a specific purpose. Limitations often prevent the code from being integrated into a larger program.
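
Once a license is chosen, one common way to state it in each source file is a short machine-readable comment using an SPDX license identifier (for example MIT, GPL-3.0-or-later, or BSD-3-Clause). The Python sketch below prepends such a line to a piece of source text if none is present. The function name and the default comment prefix are assumptions made for this example, and a header line does not replace including the full license text in the project.

```python
SPDX_TAG = "SPDX-License-Identifier:"


def with_license_header(source, spdx_id, comment_prefix="#"):
    """Return the source text with an SPDX license line prepended.

    If the text already contains an SPDX tag, it is returned unchanged,
    so the existing license declaration is never overwritten.
    """
    if SPDX_TAG in source:
        return source
    return f"{comment_prefix} {SPDX_TAG} {spdx_id}\n{source}"


original = "print('hello')\n"
licensed = with_license_header(original, "MIT")
```

Leaving an existing tag untouched matches the advice above: when you continue developing existing code, keep the license of the original program.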

Creative Commons licences

With licenses, you define terms of use for the data you have made open access. Creative Commons licenses are suitable for research data and metadata. Most CC licenses require people using your work to credit you as the original author in the manner you specify, while allowing your work to be shared and edited according to the conditions you set. CC0 is the exception: it waives these conditions, including the attribution requirement. The Creative Commons licences recommended for research data are CC BY and CC0. CC0 is recommended for metadata.

For more information on the different licensing options, see the Copyright section of this guide.
