
Research Data Management: Describe your data

Categorize your data

Start writing your data management plan by briefly answering the following questions:

  • What data will you use or produce in the project?
  • In which file formats will the data be?
  • How much data will you have (approximately)?
  • Will you use or develop special software or code to analyse your data?
  • Do you have personal or confidential information?

Your answers to these questions form a general structure for your data management plan. Categorise your data in such a way that you can refer to it later in the plan. For example:

  1. Data collected or produced by you or your research group
  2. Data collected by other researchers
  3. Data from other sources such as registers, statistics, measuring stations etc.
  4. Other materials needed to use and understand the data, such as code, software, lab notebooks etc.

The categorisation follows the license policy of your data sets. For example, briefly describe the license under which you are entitled to (re)use the data.

In your DMP, describe the required disk space, not how many informants participated in the project. A rough estimate of the size of the data is sufficient, e.g. less than 100 GB, approx. 1 TB, or several petabytes.
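If the data (or a pilot sample of it) already sits in a folder, a rough size estimate can be obtained with a short script. The following is a minimal sketch in Python, not a prescribed tool: the function names are illustrative, and it assumes decimal units (1 GB = 10**9 bytes) and that symbolic links should not be counted.

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of all regular files below root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def human_readable(num_bytes):
    """Format a byte count using decimal units (assumption: 1 GB = 10**9 bytes)."""
    value = float(num_bytes)
    for unit in ("B", "kB", "MB", "GB", "TB"):
        if value < 1000:
            return f"{value:.1f} {unit}"
        value /= 1000
    return f"{value:.1f} PB"
```

For example, `human_readable(directory_size_bytes("project_data"))` would report the size of a hypothetical `project_data` folder. For data that has not been collected yet, multiplying the size of a pilot sample by the expected number of samples gives a comparable rough figure.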

Tips for best practices 
  • When categorising your data, use bullet points for a concise way of presenting
    • the data types
    • the file formats. The formats used during the research project (for example, .csv, .txt, .docx, .xlsx, .tiff) may differ from those used in archiving the data. List both. The file format is a primary factor in the accessibility and reusability of your data in the future. Favour software and formats based on open standards to enable data reuse, interoperability and sharing.
    • the size of the data sets
    • the software used (especially if the software is coded in your project)
    • other relevant information related to your data sets.
  • AVOID OVERLAPPING WITH THE RESEARCH PLAN! Data analysis and methodological issues related to data and materials should be described in your research plan.
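To gather the file formats and data set sizes for the bullet list suggested in the tips above, an existing data folder can be summarised by file extension. This is a minimal sketch in Python under stated assumptions: the function name `format_inventory` is illustrative, and the file extension is treated as a stand-in for the file format, which is only an approximation.

```python
from collections import defaultdict
from pathlib import Path


def format_inventory(root):
    """Return {extension: (file_count, total_bytes)} for all files below root."""
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Lower-case the extension so that .CSV and .csv are counted together.
            ext = path.suffix.lower() or "(no extension)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return {ext: (counts[ext], sizes[ext]) for ext in sorted(counts)}
```

Calling `format_inventory("project_data")` on a hypothetical `project_data` folder returns one entry per extension, which can be copied into the plan as a short list of formats with approximate sizes.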

Non-digital research data

Some research data are still gathered and handled non-digitally. Examples of such data include paper-based data, biospecimens, fossil specimens, art samples, artefacts and other concrete objects that either cannot be converted to digital form at all or would require too much labour or other resources to convert.

Regardless of whether research data are digital or non-digital, proper data management is always crucial. Non-digital data require different approaches, methods and tools for preservation and management than digital data. Non-digital research data might require e.g. filing cabinets, archive-friendly filing systems, physical storage solutions, specific climate conditions and other special tools and instructions for preservation.

Metadata production is a key element in non-digital research data management. Different types of metadata (e.g. descriptive, structural, process-related and administrative) are required to ensure the proper care and preservation of non-digital research data. The principles of data documentation are explained in the Metadata & Documentation section. In addition, Tampere University Archives gives further instructions on metadata for non-digitally preserved research data.

For more information and advice, please contact:

  • Tampere University Laboratory Services provide health-related research projects with services regarding e.g. laboratory equipment and sample storage
  • Tampere University Archives: The records management team helps researchers with questions related to storing and preserving research materials. Tampere University’s Information Management Plan (Tiedonohjaussuunnitelma TOS) provides guidelines for handling research-related and other administrative records by specifying e.g. records publicity and records preservation schedules. Contact tau@tuni.fi

Contact us

Is there something you did not find in this guide? Or is some important information missing? You can always contact us for further information, and we will help you with research data management.

File formats

If possible, choose file formats that support long-term access, and use formats that are in common use in the research community. Favour the following properties:

  • Interoperability among diverse platforms and applications
  • Availability without fees or restrictions
  • Implementable by multiple software providers without any intellectual property restrictions

You may have to choose certain formats during data collection and analysis, and others for long-term preservation. The formats you choose can depend on how you plan to analyse your data or on software compatibility. You may need to convert your data files to a preservation file format at some point in your research.
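As an illustration of such a conversion, the sketch below rewrites a tab-delimited text file as UTF-8 encoded CSV using only the Python standard library. It is an example under stated assumptions rather than a recommended workflow: the function name `tsv_to_csv` is illustrative, the working format is assumed to be tab-delimited text, and conversions from other formats usually need format-specific tools.

```python
import csv


def tsv_to_csv(src_path, dst_path, src_encoding="utf-8"):
    """Rewrite a tab-delimited text file as comma-separated UTF-8 CSV.

    Returns the number of rows written, so the result can be checked
    against the row count of the source file.
    """
    rows = 0
    with open(src_path, newline="", encoding=src_encoding) as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst)  # quotes fields that contain commas
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows
```

After any conversion it is worth comparing the converted file with the original (for instance row counts and a few sample values) and keeping the original until the preservation copy has been verified.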

Some preferred file formats for long-term preservation:

  • Containers: TAR, GZIP, ZIP
  • Databases: XML, CSV
  • Geospatial: SHP, DBF, GeoTIFF, NetCDF
  • Moving images: MOV, MPEG (MPEG-1/2, MPEG-4), AVI, MXF
  • Sounds: WAV, AIFF, MP3, MXF
  • Statistics: ASCII, DTA, POR, SAS, SAV
  • Still images: TIFF, JPEG 2000, PDF/A, PNG, GIF, BMP
  • Tabular data: CSV
  • Text: XML (ODT, DOCX), PDF/A, HTML, ASCII (RTF, TXT)

More information about file formats is available in the Data Management Guidelines by the Finnish Social Science Data Archive (FSD) and in Recommended formats by the UK Data Service.
