Managing your data in doctoral research

Doing research involves making decisions about research data: where it is stored, who it is shared with and how it is documented. Planning these at the start of a project can save time and avoid hassle further down the line. If you are a part of a research group, there might already be a data management plan in place which will provide you with answers on how data is handled in your team.

What is your data?

Research data is any material that informs your research and validates its findings. This might include results of experiments, measurements, observations from fieldwork, survey results, interview recordings, images or code. This data might be collected as a part of the research (e.g. measurements), produced within it (e.g. code) or it might be existing data that is reused for the purposes of your research project (e.g. census data). To understand how best to manage your data, take a moment to list exactly what you are working with:

  1. All key data types
  2. Origin of each data type
    If the data is collected in the project, what instrument is used for the collection?
    If the data is reused, what is the source of the data and what are the conditions of access?
  3. File formats
  4. Rough estimate of size (the main point here is to consider whether you might need extra storage or computational capacity)

Gathering this information into a table gives a good overview of your data:

Data type | Origin of data | File format | File size
Interviews | Collected in project, audio recordings | mp3 for audio recordings; txt after transcribing | 2GB raw audio; 150 text
Census data | Existing data reused, Statistics Finland, open under CC-BY license | csv | 500KB
Measurement spectra | Collected in project, XPS | vms | 3-5MB
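
The same overview can also be kept as a small machine-readable file next to the data. The following is a minimal sketch in Python, not an Aalto tool: the file name data_inventory.csv, the column names and the helpers write_inventory and read_inventory are assumptions made for this illustration, and the rows simply repeat the example entries from the table.

```python
import csv
from pathlib import Path

# Hypothetical column names mirroring the four columns of the example table.
FIELDS = ["data_type", "origin", "file_format", "file_size"]

# Example rows copied from the table; replace them with your own data types.
INVENTORY = [
    {
        "data_type": "Interviews",
        "origin": "Collected in project, audio recordings",
        "file_format": "mp3 for audio recordings; txt after transcribing",
        "file_size": "2GB raw audio; 150 text",
    },
    {
        "data_type": "Census data",
        "origin": "Existing data reused, Statistics Finland, open under CC-BY license",
        "file_format": "csv",
        "file_size": "500KB",
    },
    {
        "data_type": "Measurement spectra",
        "origin": "Collected in project, XPS",
        "file_format": "vms",
        "file_size": "3-5MB",
    },
]


def write_inventory(rows, path):
    """Write the inventory rows to a CSV file and return its path."""
    path = Path(path)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return path


def read_inventory(path):
    """Read the inventory back as a list of dictionaries, one per data type."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    out = write_inventory(INVENTORY, "data_inventory.csv")
    print(f"Wrote {len(INVENTORY)} entries to {out}")
```

A plain CSV file is used here because it can be opened without special software and is easy to update as the project grows.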

Where is your data stored?

Most often research data will be in digital form and stored on servers. The storage should be big enough to fit your material, and it should create automatic backups of your work to prevent data loss. The storage should also be secure and protected against unauthorised access to your work. Exactly how secure depends on what type of data you are working with.

Data can be divided into four groups based on who is allowed to have access to it: public, internal, confidential and secret. Health data, for example, is secret and will require strict safety measures and access control. In addition to choosing a storage option that is safe enough, consider also where your data is stored during the gathering and analysis stages. Gathering or analysing data might require uploading it into software, e.g. collecting interviews on Zoom and uploading them into Atlas.ti for decoding. The software solutions used should be approved by Aalto IT for your data type. Details of what the four groups of data classification mean in practice and what software is safe for which data type.

It is also worth considering whether you might want a storage option that allows secure sharing of your data with your supervisor or collaborators. Aalto University's IT services offer many storage options to choose from.

How do you keep track of your data?

When you are gathering your data it is useful to keep notes of how exactly the data was gathered. This might be noting down the calibration of equipment used for the measurements, so that you can replicate the experiment later. If you are working in a group, other team members need to understand how the data was created to interpret it correctly. Consider what information is integral to understanding your data and consider where this information might be stored. 

Tips for keeping track of your data:

  • Have a clear folder structure with separate folders for raw data and processed data to avoid confusion and overwriting files.
  • Name your files so that you know their content without having to open them, e.g. 20230228_XRD_Sample01 is an X-Ray Diffraction measurement on sample 01 taken on 28th Feb 2023.
  • Include README files in your folders. These are a useful place to document things needed to understand and reuse your data (the Who, What, When, Where, Why and How).
  • For some research fields, electronic lab notebooks are very useful for keeping track of processes, for example Aalto Notebook.
  • Where possible, store the data in file formats that can be opened without special software that might require payment.
  • Find out if your field uses any standard vocabularies or metadata formats for research data, and make use of these to ensure your notes are in a format that others in your field understand.
  • Consider if there is anything else needed to understand, use or reproduce your results. These might be software, algorithms, models. If possible, include these or links to them along with your data.
  • Be vigilant about version control. For software, is a great help in keeping track of changes to code as version control is built in.
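
The folder and naming tips above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an Aalto tool: the folder names raw and processed, the README.txt stub and the helper names build_filename and create_project_skeleton are hypothetical choices made for this example. Only the date_method_sample pattern comes from the file name example in the list.

```python
from datetime import date
from pathlib import Path

# Headings for the README stub, taken from the "Who, What, When, Where,
# Why and How" tip; the exact layout is an assumption for this sketch.
README_TEMPLATE = "Who:\nWhat:\nWhen:\nWhere:\nWhy:\nHow:\n"


def build_filename(measured_on, method, sample_number, extension=""):
    """Return a name such as 20230228_XRD_Sample01 (date, method, sample)."""
    stem = f"{measured_on:%Y%m%d}_{method}_Sample{sample_number:02d}"
    return f"{stem}.{extension}" if extension else stem


def create_project_skeleton(root):
    """Create separate raw/ and processed/ folders, each with a README stub."""
    root = Path(root)
    for name in ("raw", "processed"):
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        readme = folder / "README.txt"
        if not readme.exists():  # never overwrite notes that already exist
            readme.write_text(README_TEMPLATE, encoding="utf-8")
    return root


if __name__ == "__main__":
    # X-Ray Diffraction measurement on sample 01 taken on 28 Feb 2023.
    print(build_filename(date(2023, 2, 28), "XRD", 1))
```

Putting the date first in year-month-day order means that files sort chronologically when listed by name, and generating names with a helper keeps them consistent across a team.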

What rules apply to its use?

There are rules that define how certain data types can be used and handled, and it is essential that you know whether these apply to your data. These rules might be set in agreements with corporate partners or in legislation, for example on the handling of personal data. There are also rules on which kinds of research setups with human subjects require ethical pre-review. Similarly, ethical pre-reviews are required for medical research and research with animals. Find out whether your research will need an ethical pre-review.

In general, research at Aalto University follows national guidelines on good research practice. These outline practices like agreeing on authorship early on and reporting research findings in an honest and transparent way. In addition to the national guidelines, EU projects follow their own guidelines, which are closely aligned.

Have a look through the examples below and consider if you are working with this type of data. If so, set aside some time to familiarise yourself with the rules and what exactly they mean for your work, e.g. preparing a privacy notice or checking that your storage solutions are secure enough. These may be a bit daunting, but Aalto has an entire team of experts on hand to help you meet the legal and ethical demands of your data. You can reach them at researchdata@aalto.fi.

What happens to the data after the project?

Whilst the end of your project might feel far away, it is worth considering what will happen to your data early on as it affects e.g. the information in your privacy notice. Looking at your list of data, consider where your various data types fall in the categories below:

  1. To be deleted after the project: unnecessary personal data, personal data after the retention period set in the privacy notice, data with no reuse potential, internal notes, …
  2. To be kept for a verification period: e.g., for 5 years; may be stored internally or in repositories.
  3. To be archived for reuse: e.g., for 15-20 years; stored in trustworthy data repositories.
  4. To be submitted for digital preservation: especially significant data, will be kept usable for decades or even longer; stored for example in the Aalto Repository.

Note that once your time at Aalto ends, so does access to cloud services and other storage solutions.

Data Management Plan (DMP)

Once these have been considered, you have in essence created a data management plan, or DMP for short. A DMP is a document that specifies how data will be managed in a research project, often with a focus on publishing datasets. Funders increasingly require researchers to prepare a DMP as assurance of good data management practices and to encourage the opening of datasets, with the aim of increasing the openness of science and benefiting the wider research community.

Funders have their own DMP templates and requirements. Funder requirements often mention FAIR data, which is an acronym for data that is findable, accessible, interoperable and reusable. The FAIR principles provide good guidance for data management in general, but are most useful when creating datasets that will be shared with others.

Aalto University has a that includes questions on data management as well as guidance for answering these.

Further information

Research Data Management (RDM) and Open Science

Aalto University offers comprehensive services, guidance, and support to help you manage your data efficiently. Explore our collection of resources and external links to boost your research.

Data Management Plan (DMP)

Create a Data Management Plan (DMP) to ensure your research data is high-quality and FAIR: findable, accessible, interoperable, and reusable.

Data Agents

Meet your Data Agents — researchers offering hands-on support on data management.

This service is provided by:

Research and Innovation Services
