When you archive your data in a data repository, it is stored there until further notice, so the archiving period is not defined in years. If the data only needs to be stored for a limited time after the project in order to verify the results, and it cannot be shared, for example for data protection reasons, your own working storage space may be the most suitable solution.
The openness of research data is a key part of responsible and ethical research activity. The degree of openness may range from fully open, through partially open, to available only with permission. Assess which part of the data should or can be opened, and which part of the data should be permanently destroyed. When sharing and opening data, follow the principle "as open as possible, as closed as necessary". If you cannot open the actual data, at least publish its metadata, i.e. descriptive information.
The timing of making your data open access depends largely on you. At what point are you willing and able to open your data for others to use? It is your choice whether you open your data during or after the research project. For example, the Research Council of Finland states: "Research data must be made freely available as soon as possible after the research results have been published."
Making research data open access benefits you, the scientific community, and society at large. With the opening of research data, the data will be available to you and others even after decades, your research will gain more visibility, and your publication will receive more citations. Add information about the open data to your CV, as open data is a merit for a researcher. Opening the data also increases the transparency and reproducibility of research.
Many parties encourage making research data open access. As a rule, the research data, scientific publications, and theses produced at Tampere University and Tampere University of Applied Sciences, as well as the research methods used, are shared and open.
Opening data in a data repository
Research data should be made open access primarily in a national or international data repository or storage service that is specific to your own field of science. The Finnish Social Science Data Archive (FSD) and the Language Bank of Finland are Finnish data repositories with CoreTrustSeal certification. The certification indicates that the organisation stores electronic data reliably and enables the data to be reused efficiently. International multidisciplinary data repositories include Zenodo and Figshare, among others. You can search for repositories in the re3data register.
When choosing a data repository, pay attention to the following:
Repositories offer different options for how a limited or large group of people can use your data. For example, the Language Bank of Finland has three access categories: the PUB category for fully public distribution, the ACA category for research and teaching use, and the RES category for use that requires permission from the rights holder.
FSD's access rights categories are very similar: (A) openly available data, (B) data available for research, teaching, and study, (C) data available for research and theses only (including Master's, doctoral and Polytechnic/University of Applied Sciences Master's theses), and (D) data available only by permission from the data depositor/creator.
The metadata of your data is publicly visible in a data repository. This is a benefit rather than a drawback, because the metadata alone makes your research more visible and promotes your work.
Making data open access as part of a research article
Many scientific publications require research data to be opened to verify research results and to increase transparency. Including research data as supplementary material to a journal article does not necessarily make the data open if access to it requires e.g. a paid subscription to the journal or the purchase of the article. However, these days many publishers recommend using data repositories.
Opening/describing data in a data article
A data article describes the open research dataset(s) thoroughly, i.e. describes the content of the dataset, its collection, processing, and the tools used. Data articles do not usually contain research results or discussions (an exception may be discussions regarding different data collection methods). Data articles can be published in data journals or in a separate "Data Papers" section of a scientific journal. Data-focused journals include Scientific Data (Springer Nature), Data in Brief (Elsevier), and Research Data Journal for the Humanities and Social Sciences (Brill).
Making metadata open access
During archiving, the research data is also described, and this metadata becomes public. Data repositories often use metadata standards, which means that the metadata is structured and thus easily machine-readable. It is advisable to make a public metadata record of the research data even when the data cannot be made open access. The metadata describes the content and access rights of the data, and it also shows relevant identifiers.
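To illustrate what a structured, machine-readable metadata record might look like, here is a minimal sketch in Python. The field names, the `missing_fields` helper, and the example values (including the DOI) are assumptions made for illustration; they do not reproduce the schema of Qvain, FSD, or any other repository, each of which defines its own fields.

```python
import json

# Illustrative field names only; a real repository defines its own schema.
REQUIRED_FIELDS = (
    "title",
    "creators",
    "description",
    "access_rights",
    "license",
    "identifier",
)


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Hypothetical record for a dataset that cannot itself be opened.
example_record = {
    "title": "Example interview dataset",
    "creators": ["Researcher, Example"],
    "description": "Transcribed interviews; the data itself is not openly available.",
    "access_rights": "restricted",
    "license": "CC0-1.0",  # license of the metadata record, not of the data
    "identifier": "https://doi.org/10.1234/example",  # placeholder, not a real DOI
}

if __name__ == "__main__":
    # Serialising to JSON is what makes the record easy for machines to read.
    print(json.dumps(example_record, indent=2, ensure_ascii=False))
    print("Missing fields:", missing_fields(example_record))
```

Because the record is structured, a simple check such as `missing_fields` can confirm that the content, access rights, and identifier are all described before the record is published.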
Qvain, which is produced and maintained by CSC, is a suitable tool for describing metadata when publishing research data. Qvain supports the use of controlled vocabularies, provides a comprehensive list of licenses, and creates a permanent identifier for your metadata. On Qvain's website, you will find field-specific description instructions. In addition, Qvain allows you to publish your metadata directly in the national Etsin search service for research datasets.
Checklist for anyone planning to make data open access:
These days many research funders require that research data is made open access. When applying for funding it is a good idea to check the funder’s up-to-date requirements. Here are examples from a few funders:
Article manuscript reviewers may request access to research data. Through Zenodo, you can share a link to your research data without compromising the double-blind review process. Note that Zenodo's link to the data is temporary, and valid only for one month after the data is added to the repository.
It is a good idea to include a Data Availability Statement (DAS) as part of the research publication. You can use a DAS to explain the data behind your publication and its availability. A DAS promotes transparency in the research process and brings visibility and discoverability to your research data. It also supports the implementation of the FAIR principles in your research. It is also important to state the availability of your code. You can do this in connection with the DAS or with a separate Code Availability Statement.
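As a rough illustration of the elements a DAS typically combines (what the data is, where it is, and on what terms it is available), the following Python sketch assembles a statement from those parts. The function name, the three access levels, and the wording of the sentences are assumptions for illustration only; publishers and journals usually provide their own templates, which take precedence.

```python
def data_availability_statement(
    title: str,
    repository: str,
    identifier: str,
    access: str = "open",
    reason: str = "",
) -> str:
    """Build a short DAS. The wording is illustrative, not a publisher template."""
    location = f'"{title}" in {repository} ({identifier})'

    if access == "open":
        return f"The data supporting this publication are openly available as {location}."

    # Any non-open statement should say why access is limited.
    if not reason:
        raise ValueError("a reason is required when the data are not open")

    if access == "restricted":
        return (
            f"The data supporting this publication are available on request as {location}. "
            f"Access is restricted because {reason}."
        )
    if access == "metadata_only":
        return (
            f"The data supporting this publication cannot be shared because {reason}. "
            f"A metadata record is available as {location}."
        )
    raise ValueError(f"unknown access level: {access}")


if __name__ == "__main__":
    print(
        data_availability_statement(
            title="Example interview dataset",
            repository="Example Repository",
            identifier="https://doi.org/10.1234/example",  # placeholder, not a real DOI
            access="metadata_only",
            reason="the interviews contain personal data",
        )
    )
```

The `metadata_only` case mirrors the advice to publish at least a metadata record when the data itself cannot be opened.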
How to write a Data Availability Statement
Citing research data acknowledges the original researcher as the creator or collector of the data used. Citation practices are guided by the Copyright Act, the guidelines of the data repository being used, and the norms of the research community.
Data storage services have their own more general guidelines, and the metadata of individual datasets may also contain citation instructions. If specific citation instructions are not available, the data should be cited in the same way as any other publication.
The policies for responsible evaluation of research increasingly emphasise that citations of research data should be taken into account when considering the diversity of research outputs. Responsibility for establishing data citation practices lies with research organisations, science policy bodies, funders, publishers, and individual researchers.
CrossCite is a tool that helps you cite data.
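When no specific citation instructions are available, a data citation can be assembled from the same elements as any other reference. The sketch below shows one common ordering (creators, year, title, version, publisher, identifier). The function name, the ordering, and the example values are assumptions for illustration; the repository's own instructions or the required reference style should be followed where they exist.

```python
from typing import Optional, Sequence


def format_data_citation(
    creators: Sequence[str],
    year: int,
    title: str,
    publisher: str,
    identifier: str,
    version: Optional[str] = None,
) -> str:
    """Join the basic elements of a dataset reference into one line."""
    names = "; ".join(creators)
    parts = [f"{names} ({year}).", f"{title}."]
    if version:
        # The version matters for data, because datasets may be updated.
        parts.append(f"Version {version}.")
    parts.append(f"{publisher}.")
    parts.append(identifier)
    return " ".join(parts)


if __name__ == "__main__":
    print(
        format_data_citation(
            creators=["Researcher, Example"],
            year=2024,
            title="Example dataset",
            publisher="Example Repository",
            identifier="https://doi.org/10.1234/example",  # placeholder, not a real DOI
            version="1.0",
        )
    )
```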
Long-term preservation (PAS, from the Finnish pitkäaikaissäilytys) ensures that digital research data remains understandable and usable for several decades or even centuries. Tampere University has its own PAS process, in which significant research data is first identified and then transferred to the PAS service. Contact Research Data Services (researchdata@tuni.fi) if you think your data is suitable for long-term preservation.
Research participants must be informed at the data collection stage how long the data concerning them will be stored for the purposes of the research. In any case, it is a good idea to plan the storage of the data as early as possible at the start of the research. If, for example, the data needs long-term preservation, proactive storage planning (e.g. storage locations and file formats suited to long-term preservation) will make it easier to transfer the data to storage after the research has concluded.
Copyleft and permissive licenses
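Open licenses can be roughly divided into two categories: copyleft and permissive licenses. In both cases, anyone is free to use or copy the original software. The difference lies in the licensing of derivative works: a copyleft license requires derivative works to be distributed under the same license terms, whereas a permissive license allows them to be distributed under different terms, including proprietary ones.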
Permissive licenses may be of more interest to commercial operators, while copyleft licenses may be of interest to independent software developers. Neither license type imposes direct restrictions on commercial use of the software. Free software can be used in private business, and for example software support services can be sold. A company can pay the software developer to implement additional features.
Choosing a license for source code
Source code licensing must be taken care of appropriately. Plans to commercialise the software independently can affect the choice of license. When choosing, you must also take into account case-by-case requirements for future licensing needs. The choice of license should also be discussed with your supervisor, laboratory leader, or dean. If there are commercial interests involved with the software, please contact the University's Innovation Services at inventions@tuni.fi.
If you continue developing existing source code, use the same license as in the original program. Using the same license may be mandatory, and in any case, it is almost always sensible. If you're starting from scratch, choose a well-known license such as MIT, GNU GPL, or BSD. Do not limit the use of the program to a specific purpose. Limitations often prevent the code from being integrated into a larger program.
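Once a license has been chosen, it helps to state it in the source files themselves. One widely used convention, which this guide does not prescribe and which is introduced here as an assumption, is a one-line SPDX license identifier comment at the top of each file. The Python sketch below adds such a line; the function name `add_spdx_header` is hypothetical, and the identifier passed in (for example `MIT`, `GPL-3.0-or-later`, or `BSD-3-Clause`) should match the license you actually selected.

```python
def add_spdx_header(source: str, license_id: str, comment_prefix: str = "#") -> str:
    """Return the source text with an SPDX license identifier line added.

    The text is returned unchanged if it already contains an SPDX identifier.
    A leading shebang line is kept as the first line of the file.
    """
    if "SPDX-License-Identifier:" in source:
        return source

    header = f"{comment_prefix} SPDX-License-Identifier: {license_id}\n"

    if source.startswith("#!"):
        # Keep the shebang first so the script stays directly executable.
        first_line, _, rest = source.partition("\n")
        return f"{first_line}\n{header}{rest}"

    return header + source


if __name__ == "__main__":
    original = "#!/usr/bin/env python3\nprint('hello')\n"
    print(add_spdx_header(original, "MIT"))
```

A full copy of the license text would normally still be included in the project, for example in a LICENSE file; the header only makes the choice visible in each file.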
With licenses, you define the terms of use for the data you have made open access. Creative Commons licenses are suitable for research data and metadata. CC licenses other than CC0 require people using your work to credit you as the original author in the manner you specify, while your work can be shared and edited according to the conditions you set. The Creative Commons licenses recommended for research data are CC BY and CC0. CC0 is recommended for metadata.
For more information on the different licensing options, see the Copyright section of this guide.