Research Data Management: Data management guidelines for students

About data management

Responsible data management is an essential part of both thesis work and working life skills. This guideline summarises what students should take into account when managing their research data.

Data management refers to the collection, processing, description and storage of the data used in the thesis. In the data management plan (DMP), these processes are described in writing before data collection begins, and the plan is updated during the project. The purpose of the DMP is to anticipate potential data protection risks and to ensure that the data remains usable and in safe hands throughout the project. Careful data management aims to improve the reliability of research and the repeatability of the results, and to promote the further use of the research data.

Use the template to draft your DMP

What is research data?

Research data refers to the data you use in your thesis and on which your thesis is based. You can collect the data yourself, for example through surveys, interviews, measurements or field notes. In addition, already existing data can be used as research data. This can include, for example, archive sources, audio recordings, YouTube videos, photos, literature as a research subject, movies, websites, forum threads, medical imaging and simulations. You may also use research data obtained from your supervisor or from a data archive. For example, you can search the Finnish Social Science Data Archive (FSD) for survey and interview data to use in your own thesis.

Your data may also include classifications, categorisations, tables, databases, visualisations, notes, etc. based on the data listed above. In this context, research data does not mean source literature or research publications.

Describe your data

  • Briefly describe what kind of data you will collect or produce, and what kind of existing data you will use.
  • How much storage space do you need for your data?
  • Make sure that the data you collect is of sufficient quality for your research. For example, collect a small test data set before the actual data collection: conduct a test interview or test the survey questionnaire.
  • If you use already existing data, please familiarise yourself with it carefully and assess whether it will enable you to answer your research questions.
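One rough way to answer the storage question is to measure a small test data set and scale it up. The Python sketch below assumes that the test files sit in a single folder and that storage grows roughly linearly with the number of interviews or responses. Both function names are illustrative and do not come from the DMP template.

```python
from pathlib import Path


def folder_size_bytes(folder: str) -> int:
    """Sum the sizes of all regular files below `folder`, including subfolders."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


def estimate_total_bytes(pilot_bytes: int, pilot_units: int, planned_units: int) -> int:
    """Scale the size of a pilot data set to the planned number of units.

    A "unit" is whatever you count in your study, e.g. one interview.
    Linear growth is an assumption; video or imaging data may behave differently.
    """
    if pilot_units <= 0:
        raise ValueError("pilot_units must be positive")
    return pilot_bytes * planned_units // pilot_units
```

For instance, if two test interviews take 300 MB in total, twenty interviews of similar length would need about 3 GB, before counting transcripts and backup copies.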

Rights related to data

Maria Rehbinder from Aalto University describes intellectual property rights and data protection issues related to research data in a YouTube video. In particular, consider any copyrights or terms of use related to data you have received from elsewhere. Remember also that not all material available online may be used as such in your own work. Read the terms and conditions of the service provider to determine whether research use of the material is permitted.

If your data contains images, texts or other works, they are likely to be protected by copyright. In particular, if you copy images or texts to your computer, make sure that you are entitled to do so. Sometimes the terms and conditions for images and texts are defined in licenses. If you use social media messages as your data, make sure that you are allowed to copy the messages to your computer if necessary.

If you receive the data from your supervisor, it is advisable to agree in advance on who is allowed to use the data, what happens to it after the thesis has been completed, and what happens if you cannot finish the thesis. The same applies if you are doing your thesis in cooperation with a company: will the data remain at the company's disposal, will you be allowed to use the material later, or will the material have to be disposed of?

Documentation and metadata of the data

The data, as well as its collection and processing, should be described accurately enough that someone other than you can understand what the data is about. It is advisable to describe the data so accurately that if you had to pause your thesis for a year or two for any reason, you could still reliably continue from where you left off.

Describe the collection and processing of data

  • What do the folders and data files contain?
  • When and where was the data collected?
  • Why was the data collected?
  • How has the data been processed or modified?
  • In tabular data, describe the column and row labels, data encodings (including the codes used for missing data), units of measure, etc.
  • Describe the abbreviations you are using.
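For tabular data, the description can be kept as a small codebook with one entry per column. The Python sketch below is only an illustration: the column names, labels and the missing-data code `-9` are hypothetical, and the helper simply reports which columns of a CSV file still lack a description.

```python
import csv
import io

# Hypothetical codebook: one entry per column of the data table.
CODEBOOK = {
    "resp_id": {"label": "Respondent code (no names)", "unit": "", "missing": ""},
    "age": {"label": "Age at the time of the survey", "unit": "years", "missing": "-9"},
    "q1": {"label": "Satisfaction with studies (1 = low, 5 = high)", "unit": "scale 1-5", "missing": "-9"},
}


def undocumented_columns(csv_text: str, codebook: dict) -> list[str]:
    """Return the column names in the CSV header that the codebook does not describe."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [name for name in header if name not in codebook]


# Example: the column "q2" has no codebook entry yet.
sample = "resp_id,age,q1,q2\nR001,34,4,2\n"
print(undocumented_columns(sample, CODEBOOK))
```

Storing the codebook next to the data, for example as a README or a separate CSV file, keeps the description and the data together.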

File naming and folder structure

  • Pay attention to the names and order of your files and folders.
  • Smart file names are short, informative, and consistent. For example, write dates in the standard YYYYMMDD format.
  • Use version numbering in file names and keep the original file separate from the edited files.
  • Even a single interview easily generates three files: the original audio file, the text transcription file, and the interview notes.
  • Do not use personal identifiers in the file names.
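The naming advice above can be sketched as a small helper that builds file names from a date, a short description and a version number. The pattern `YYYYMMDD_description_vNN.ext` and the function name are assumptions made for illustration; any pattern works as long as it is applied consistently.

```python
import re
from datetime import date


def make_file_name(description: str, day: date, version: int, extension: str) -> str:
    """Build a name such as 20240305_interview-03-transcript_v02.txt.

    The description is lower-cased and everything except a-z and 0-9 is
    collapsed into hyphens, so the name stays short and free of spaces.
    Use interview or participant codes in the description, never names.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{day:%Y%m%d}_{slug}_v{version:02d}.{extension.lstrip('.')}"


print(make_file_name("Interview 03 transcript", date(2024, 3, 5), 2, "txt"))
# prints: 20240305_interview-03-transcript_v02.txt
```

Because the date comes first and is zero-padded, sorting the files by name also sorts them chronologically.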

Data storage and data security

What do I do with my data after the thesis is completed?

Data containing personal information is usually destroyed after the thesis has been completed and approved. Inform the research participants if you intend to use the data after your thesis, for example in postgraduate studies or in an article based on your thesis. Sometimes it may be necessary to keep the data for the purpose of verifying the results. If you work in a company or other organisation, consider the potential interests of the client of your thesis in the data.

In order to share the data for further use, you need permission from your research participants. Data is shared through archives, such as the Finnish Social Science Data Archive (FSD) or Zenodo. To be archived in FSD, the data should be sufficiently extensive for the needs of comparative research, methodological education or new research. Typically, data collected for a master's thesis does not meet these criteria. Read more about the materials received by the Data Archive.

The EU-funded Zenodo is suitable for storing many types of data. If necessary, the visibility of the data can be set so that only you can access it, but it is also possible to share your data through Zenodo.

Anticipate the further use of the data even before the data collection. For example, contact the Data Archive and ask about the possibilities of sharing. It is essential to inform research participants about the further use of the data. Access to the university's storage solutions ends after you have graduated, so make sure to move the data elsewhere if you do not destroy it.

Data protection

Data protection means protecting personal data. It is a fundamental right that safeguards the rights and freedoms of data subjects when their personal data is processed. Data protection legislation sets out the principles for the lawful processing of personal data. The processing of personal data must always have a basis in law.

What is personal data?

Personal data is any information that makes a person identifiable, either directly or indirectly.

For example, when conducting an interview, you will almost certainly process personal data, as a person's voice is considered identifying information. In addition, the participation consents collected from the research participants are personal data. So is the contact information of the research participants.

Answers to a survey questionnaire may also be identifiable. Background information, such as occupation, city of residence, age and gender, can make a person recognisable when these pieces of information are combined. Note that file and folder names can also contain identifying information.
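The effect of combining background variables can be checked with a simple count of how many respondents share each combination. The Python sketch below is a minimal illustration under the assumption that the answers are available as a list of dictionaries; the function name is hypothetical. A combination that occurs only once points to a respondent who may be recognisable.

```python
from collections import Counter


def unique_combinations(rows, keys):
    """Return the combinations of background values that occur exactly once.

    `rows` is a list of dictionaries (one per respondent) and `keys` lists
    the background variables to combine, e.g. occupation and city.
    """
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]
```

If the function returns many combinations, consider coarsening the variables, for example by recording age groups instead of exact ages or regions instead of cities.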

Use quotes or citations carefully, especially if you use social media data or data collected from forums. Direct quotations are likely to constitute identifiable information. It may also come as a surprise to the user of the service that quotes are extracted from their messages and presented in a research publication or thesis, taken out of the original context. Marko Ahteensuu discusses the ethical problems of social media data in his article.

Read more about the definition of personal data in FSD’s Data Management Guidelines

Risk assessment

A basic risk assessment must always be carried out before the processing of personal data. Risks are assessed from the perspective of the research participant. The risk assessment shall be documented.

A data protection impact assessment (DPIA) must be carried out if the risk assessment indicates that the processing of personal data poses a high risk to the participants. See the model template and guidelines for risk assessment and impact assessment.

Informing research participants and research permits

Informing the research participants about your research and requesting their ethical consent (informed consent) are part of good scientific practice. In addition, participants should be informed about the processing of their personal data as required by the Data Protection Act (1050/2018).

Privacy Notice

The purpose of the Privacy Notice is to inform the research participants about the processing of their personal data.

Information sheet

The information sheet provides the research participants with basic information about the project. It is essential that it clearly indicates what the participant is agreeing to.

Informed consent

Prior to participation in the study, the research participants should give you their ethical consent. In the human sciences, this can be, for example, a signed consent form, an email message, or oral consent given at the beginning of an interview. If the study interferes with physical integrity or involves processing sensitive personal data, we recommend a signed consent form. Please note that ethical consent is not the same as consent as a legal basis for processing personal data.

Further use of data

In general, the research participants are asked to participate only in a specific, limited research project. The data may then be used only for research that falls within that scope. If the data is to be archived and used for other research purposes, the participants must be informed of this.

Research permits

If your thesis deals with employees or students in an organisation, you probably need a research permit from that organisation. For example, cities and municipalities typically have their own practices for research permits. Learn more about research permits from the University's intra.

How do I do this in practice?

Before collecting the data, plan how you will inform the research participants (information sheet and privacy notice) and how you will ask for their consent to participate. For example, you can attach your information sheet to an e-mail message, include it as part of a questionnaire, or hand it out on paper. The format is free as long as the research participants receive the relevant information about your project.

You can also request consent in different ways. For example, in interviews, consent can be requested orally at the beginning of the interview. In a survey, respondents can be asked to tick a check box confirming that they have read the information sheet.

Avoid processing direct identifiers, such as names, for example by pseudonymising the data: remove direct identifiers or replace names with codes.
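Replacing names with codes can be sketched in a few lines of Python. This is a minimal illustration, assuming the records are dictionaries with a `name` field; the function name, the field names and the `P001` code format are hypothetical. The returned key table links codes back to names, so it should be stored separately from the data and protected, or destroyed when it is no longer needed.

```python
def pseudonymise(records, name_field="name", prefix="P"):
    """Replace the name in each record with a running code such as P001.

    Returns the cleaned records and the key table (name -> code).
    The same person always receives the same code.
    """
    key = {}
    cleaned_records = []
    for record in records:
        name = record[name_field]
        if name not in key:
            key[name] = f"{prefix}{len(key) + 1:03d}"
        # Copy every field except the name, then add the code.
        cleaned = {k: v for k, v in record.items() if k != name_field}
        cleaned["code"] = key[name]
        cleaned_records.append(cleaned)
    return cleaned_records, key
```

Note that pseudonymised data is still personal data as long as the key table exists or the remaining background information allows someone to be identified.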

Do not use identifying information in file names either.

