Open your data
As stated in the Tampere higher education community's Open Science and Research policy (pdf), research data related to research results is open by default and meant for cooperative use. However, confidentiality should not be compromised, so sharing and opening your data should follow the principle: as open as possible, as closed as necessary. When closing a project, evaluate which materials should be preserved and for how long, and which materials should be disposed of permanently. Opening your data has several benefits.
Go through all your data types and answer the following questions:
- What part of the data will be opened or published?
- Where will the data be opened? Name the repositories, if known
- When will the data be available?
- Explain if your data or part of it cannot be opened and give reasons for that. Tell where the project metadata will be opened.
How to open your data?
There are different ways to open your data. Your preference may depend on the customs in your discipline or on the expectations of your funder. Some publishers also have requirements for how long data related to a publication must be preserved.
- Data repository or an archive
- Data journals publish detailed descriptions of research datasets, including the methods used to collect the data and technical analysis supporting the quality of the measurements. Some data journals publish only data papers whereas others publish different types of articles including data papers. Examples of data journals:
- Peer-reviewed scientific article. In order to verify your research findings, many journals encourage or require publishing the data set related to your article as supplementary material. Please remember that such data is not necessarily considered open data if it can only be accessed through a subscription-based journal's website. However, many publishers also accept depositing your data in general-purpose repositories. See examples of publishers' requirements:
It is also recommended that source code be shared and distributed where it is most appropriate, as determined by funding and other factors. The Open source software and code tab in this guide offers more information about the topic.
When opening your data:
- Plan beforehand! Having a data management plan helps you.
- Document your data well and create adequate metadata. Remember to publish the metadata as well!
- Check that there are no ethical or legal issues that prevent you from publishing your data
- Check that the terms of reuse of your data are clear (licenses)
- The CC0 waiver is the most efficient way of facilitating reuse of your data, but a Creative Commons 4.0 license can also be recommended for open data. See the guide on How to select a Creative Commons License
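The checklist above asks for well-documented data, published metadata and clear terms of reuse. As a minimal sketch, the Python below builds a machine-readable metadata record and reports which fields are still empty. The field names, the list of required fields and the example values are illustrative assumptions, not a schema prescribed by this guide or by any particular repository.

```python
import json

# Hypothetical minimal field set; real repositories define their own schemas.
REQUIRED_FIELDS = ("title", "creators", "description", "license")


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Example record with placeholder values only.
record = {
    "title": "Example interview dataset",
    "creators": ["Researcher, Example"],
    "description": "Placeholder description of how the data was collected.",
    "license": "CC0-1.0",
}

if __name__ == "__main__":
    # List any required fields that still need to be filled in.
    print(missing_fields(record))
    # Serialise the record so that it can be published alongside the data.
    print(json.dumps(record, indent=2))
```

A check like this could be run before depositing, but the repository's own metadata form or schema remains the authoritative list of what is required.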
Does your research include processing sensitive or confidential data?
Data containing personal information can be published only once it has been anonymised, at which point it is no longer subject to data protection legislation. Pseudonymised data is still personal data and cannot be opened without explicit consent for that purpose.
In some cases, personal information can be shared, if the original processing purpose allows it. However, if the original consent form does not refer to the further use of the data, opening the dataset may require requesting new consent from the data subjects. If you plan to share data which includes personal information, please contact firstname.lastname@example.org.
Please remember that you should still be able to open the metadata of data that holds personal information, even if the data itself cannot be opened.
Archiving and long-term preservation
The aim of long-term preservation is to keep data usable and comprehensible for tens or even hundreds of years. If your data has long-term value, answer the following questions:
Briefly describe what part of your data you will preserve and for how long. Categorise your data sets according to the anticipated preservation period:
A) Data to be destroyed when the project ends
B) Data to be archived in a non-curated archive (e.g. Zenodo) for 20 years
C) Data to be archived by a curated facility for future generations for tens or hundreds of years.
- Describe the access policy to the archived data.
- Are there some costs related to archiving? Who takes care of them?
Tampere University has determined the process for identifying research data that will retain its value for a longer period and transferring it to Digital Preservation Service for Research Data, Fairdata-PAS. If you think your data will be suitable for the service, please contact email@example.com.
Tips for best practices
- When you start your project and begin to collect or produce your data, consider also how long each data set should be preserved.
- Submit your data selected for long-term preservation to a certified data repository or data archive, such as Finnish Social Science Data Archive, Language Bank of Finland or Mendeley Data provided by Elsevier.
- Remember to check publisher, funder, disciplinary or national recommendations for data repositories, data archives or data banks, and their preservation time requirements.
Does your research include processing sensitive or confidential data?
Traditionally, it has been recommended to destroy all sensitive data after the research project has ended, as storing it is risky and requires special arrangements. However, depending on research permits, datasets containing sensitive personal data may also be stored in the Fairdata-PAS service. Importantly, research participants must be informed about the preservation of the data and the basis for the preservation period. The data must be minimised before storage. Further processing of such data requires a research permit.
- Remember also to plan the safe disposal of the data.
- Please remember that the anonymisation and disposal or archiving of data must be carried out by the expiry of the relevant research permit.
- Genuine anonymisation requires that both direct and indirect identification are made impossible, in addition to which the identification key must be destroyed.
- Check the five steps in deciding what data to keep (DCC, UK)
- Use firstname.lastname@example.org to contact the research data specialists at Tampere University.
Data citation gives credit to the data creator and facilitates tracking the usage and impact of the data. The researcher's position as the creator or collector of research data can be acknowledged by citing the research data appropriately. Data citation practices are guided by copyright laws, data archive guidelines and the general rules of the scientific community.
Data storage services have their own general guidelines on how to cite data. Additionally, individual datasets may have citation guides. If there are no specific citation guidelines, data should be cited just like any other publication. Crosscite is a tool that helps you format your data citation.
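When no specific citation guideline is available, a data citation can be assembled from the same elements as any other publication. The sketch below assumes a common "Creators (Year). Title. Version. Publisher. Identifier" pattern; the function name, the element order and the example values (including the placeholder DOI) are assumptions for illustration, so check the guidelines of the storage service before relying on it.

```python
def format_data_citation(creators, year, title, publisher, identifier, version=None):
    """Build a citation string: Creators (Year). Title. [Version.] Publisher. Identifier"""
    # Join multiple creators with a semicolon and attach the publication year.
    parts = [f"{'; '.join(creators)} ({year})", title]
    if version:
        # The version is optional and only included when given.
        parts.append(f"Version {version}")
    parts.extend([publisher, identifier])
    return ". ".join(parts)


if __name__ == "__main__":
    # Placeholder values; the DOI below is not a real identifier.
    print(
        format_data_citation(
            creators=["Researcher, Example"],
            year=2024,
            title="Example interview dataset",
            publisher="Example Repository",
            identifier="https://doi.org/10.1234/example",
            version="1.0",
        )
    )
```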
Research Data Services assist the staff and students of the Tampere higher education community in matters related to research data management. What we do:
- We organise research data management and data protection training covering topics such as describing your data, data protection, data storage services and sharing your data. The content of training sessions and workshops can be tailored to meet your needs. More detailed information on training is updated on our website. Don't hesitate to contact us!
- We provide you with this Data management guide and other instructions and resources for the planning, organising, storing and sharing of research data.
- We comment on data management plans
Please email email@example.com and let’s solve your problem together!
When choosing a repository
- Check the recommendations of the publishers, learned societies, and funders of your own field of science. Where have you or your colleagues in the same field published data?
- The repository publishes machine-readable metadata and uses a known metadata standard. This helps search engines and other databases find the data.
- Trustworthy repositories are assessed against a certification such as CoreTrustSeal or the ISO 16363 standard, which indicates that they have transparent and properly documented policies and procedures.
- Some repositories allow you to set restrictions (such as embargoes, technological access restrictions, data use agreements and licenses) on how your data can be reused. However, these may have unwanted consequences and may eventually prevent others from using your data.
- Some repositories may also have specific requirements concerning deposited data. To make sure your data meets all requirements, please contact the repository at an early stage of your research project.
- The repository assigns persistent identifiers (PID), such as DOI or URN, to your data. A persistent identifier is a long-lasting reference to a digital resource and makes your data easier to cite. Check the recommendation for research data sets.
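A DOI is usually cited as a resolvable link of the form https://doi.org/ followed by the DOI itself. As a minimal sketch, the helper below turns a DOI written in a few common ways into that link form. The function name and the pattern used to recognise a DOI are assumptions for illustration (the pattern is a rough heuristic, not an official validation rule), and the example DOI is a placeholder.

```python
import re

# Rough heuristic for the "10.prefix/suffix" shape of a DOI (an assumption,
# not an official validation rule).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi):
    """Return a resolvable https://doi.org/ link for a DOI string."""
    cleaned = doi.strip()
    lowered = cleaned.lower()
    # Remove a leading resolver address or "doi:" label if one is present.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if lowered.startswith(prefix):
            cleaned = cleaned[len(prefix):].strip()
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"Does not look like a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


if __name__ == "__main__":
    # Placeholder DOI, not a real dataset.
    print(doi_to_url("doi:10.1234/example-dataset"))
```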
List of data repositories
The list consists of well-known data repositories. Many of the services accept any type of data and file, but some specialise in specific kinds of data. If there is no established repository in your field, use of these services is highly recommended.
- Aila Data Service – Research data deposited in the Finnish Social Science Data Archive (FSD), with extensive metadata both in Finnish and in English
- Zenodo – Easy to use and suitable for any kind of data files smaller than 50 GB. Provided by CERN and OpenAIRE.
- IDA – Offers a free quota for Finnish Universities for storing data during the active phase of the research in an immutable state, sharing data within the project group, and publishing the data as a dataset. Part of the Finnish Fairdata Services.
- EUDAT – EU Horizon 2020 funded service for storing, opening, sharing and browsing data.
- Figshare – Multidisciplinary data archive.
- The Language Bank of Finland Fin-Clarin – Comprehensive text and speech corpora. Basic use is free for academic researchers and students.
- ArrayExpress EMBL-EBI – Functional genomics data from microarray and sequencing platforms.
- Dryad – Data archive focused on natural sciences and medicine. Contains research data linked to scientific publications.
- GitHub – Service for opening and sharing code.
- Worldwide Protein Data Bank PDB – 3D structure data and metadata of proteins, nucleic acids, and complex assemblies.
Other data services
- Re3data.org – Registry of Research Data Repositories gathers information on different data archives from different subjects that offer long-term data storage.
- Etsin – Research data finder, which contains descriptive information (metadata). Part of the Fairdata Services.
- Qvain – Research Dataset Description Tool. Part of the Fairdata Services.
- Paituli spatial data service – Includes datasets from the following data providers: Finnish Meteorological Institute, National Land Survey, Traffic Agency, Agency for rural affairs, Statistics Finland and Finnish Environment Institute (SYKE). Part of Avaa research data portal.
- Scientific Data – Research data journal's list of recommended data repositories.
- See also Finnish Biobanks.
Deleting and emptying the recycle bin containing the deleted files is not an irreversible way to destroy unnecessary data. Deleted data can be recovered even after reformatting the hard disk. Use special file deletion software in order to overwrite the data or demagnetise the hard disk. Storage devices can also be mechanically crushed into an unreadable state.
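To illustrate what overwriting means in practice, the sketch below replaces the contents of a file with random bytes before removing it. It is an illustration of the idea only, not a substitute for dedicated deletion software: the function name and parameters are assumptions, and overwriting in place is not reliable on SSDs, copy-on-write or journaling file systems, or where backups and synchronised copies exist.

```python
import os


def overwrite_and_delete(path, passes=1, chunk_size=1024 * 1024):
    """Overwrite a file's contents with random bytes, then delete the file."""
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(passes):
            # Start each pass from the beginning of the file.
            handle.seek(0)
            remaining = size
            while remaining > 0:
                block = os.urandom(min(chunk_size, remaining))
                handle.write(block)
                remaining -= len(block)
            # Ask the operating system to push the new bytes to the device.
            handle.flush()
            os.fsync(handle.fileno())
    os.remove(path)
```

For storage media that held sensitive data, follow your organisation's disposal instructions rather than relying on a script like this.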
Funders' requirements about Open Data