2. Project data management documentation
Warning
This document is a work in progress. Its contents were created to demonstrate how readthedocs can be used to structure the results of the plant data management workgroup.
2.1. Data Management Plan
- Project data management can be guided by a Data Management Plan. Consult the guidelines for designing a data management plan on the intranet. Several templates are available on DMPOnline. The ILVO template should be used for any ILVO project and is also the template for project proposals for the Flemish government (FWO, VLAIO,…). To access the template(s):
navigate your browser to DMPOnline
select ILVO as your host institution on the home page
log in using your ILVO account
click the Reference tab at the top of the page
click DMP Templates
During the project kick-off meeting with the scientific directors, the data management plan should be reviewed.
2.2. Working environment (Temporary storage)
It is very important to think carefully about where you will store the data during the project. At ILVO, there are multiple options for personal and collaborative workspaces. The ‘My Documents’ folder on your PC is synchronised to the ILVO server via folder redirection. This provides a continuous backup (if you are connected to the ILVO network or using the ILVO VPN), and allows you to work on any computer connected to the ILVO network. To collaborate with other ILVO colleagues on the same project, the ILVO IT department can set up a dedicated project network drive. These drives are, however, not accessible to external project partners.
- Alternatively, Microsoft 365 provides more options, such as OneDrive, SharePoint and Teams. These different possibilities may seem a bit overwhelming, but, in short:
SharePoint is the backbone of the system, on which all data are stored. You can make shared libraries for specific projects, to which all project partners can be given access.
Teams is an interface to the shared libraries that are on the SharePoint drive. It allows you to make shared libraries (or ‘Teams’), invite project collaborators and communicate with them.
OneDrive is your personal space (1 TB) on the ILVO SharePoint drive, to use as your personal working directory.
- Interesting tutorials on deciding which working environment to use (local PC, OneDrive or SharePoint) are available:
- Hands-on guidelines to properly install and use are available for:
2.3. Folder structure
You have to create a folder for each running project you are involved in. Ideally, you describe the contents of this folder in your data management plan. Always add a number in front of the folder name, so folders are always sorted in the same way. The subfolders that are advised will depend on the project, but here, we provide some inspiration:
- Subfolder per work package, with next level subfolders per experiment (or group of experiments):
Keep an overview file (metadata file or readme file) for each experiment. This file should describe the materials and methods in detail, list contact details of partner institutes or of persons from whom you have obtained material (plant material, fungal isolates, …), and list the main conclusions. When your experiments include pictures, DNA gels, sequencing data, tables or other relevant files, list where these files are stored so that they can be retrieved later. If applicable, refer to the notebooks (OneNote) of the laboratory staff who assisted you.
For small experiments that include, e.g., only one Excel file, the overview can be given in the first tab sheet of the Excel file.
For data processing, use the tab sheets in Excel (as many as needed) and give each of them a clear name.
Do not keep only the data from trials that went well; also keep data from failed or exploratory trials. Always add a conclusion to these files to describe the lessons learned. These trials may be of interest to researchers working on the same topic after you.
Subfolder for project administration.
Subfolder for project meetings (powerpoint presentations, meeting reports, …).
Subfolder for project output. This will be a working folder; finalised papers, presentations for conferences, … can be moved to the appropriate general folder.
Subfolder Data Management Plan
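The folder layout described above can be set up with a short script. The following is a minimal sketch only, not an ILVO standard: the subfolder names, the `README.txt` file name, the template fields and the function `create_project_skeleton` are all hypothetical and should be adapted to your own data management plan.

```python
from pathlib import Path

# Hypothetical subfolder names, loosely following the suggestions above.
SUBFOLDERS = [
    "work_packages",
    "project_administration",
    "project_meetings",
    "project_output",
    "data_management_plan",
]

# Hypothetical overview (readme) template for one experiment.
README_TEMPLATE = """\
Experiment: {name}
Materials and methods:
Contacts (partner institutes, sources of material):
Location of related files (pictures, gels, sequencing data, tables):
Related laboratory notebooks:
Main conclusions / lessons learned:
"""


def create_project_skeleton(root, project_name, experiments):
    """Create numbered project subfolders and one readme stub per experiment.

    `experiments` maps a work package name to a list of experiment names.
    Returns the list of top-level subfolders, in sorted (numbered) order.
    """
    project = Path(root) / project_name
    created = []
    # A two-digit prefix keeps the folders sorted in the same way everywhere.
    for index, name in enumerate(SUBFOLDERS, start=1):
        folder = project / f"{index:02d}_{name}"
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)

    wp_root = created[0]  # the work package folder
    for wp_index, (wp_name, exp_names) in enumerate(experiments.items(), start=1):
        for exp_index, exp_name in enumerate(exp_names, start=1):
            exp_dir = wp_root / f"{wp_index:02d}_{wp_name}" / f"{exp_index:02d}_{exp_name}"
            exp_dir.mkdir(parents=True, exist_ok=True)
            readme = exp_dir / "README.txt"
            if not readme.exists():  # never overwrite notes that already exist
                readme.write_text(README_TEMPLATE.format(name=exp_name), encoding="utf-8")
    return created
```

For example, `create_project_skeleton(".", "01_my_project", {"WP1": ["pilot_trial", "field_trial"]})` would create the numbered subfolders in the current directory, with one readme stub per experiment under the first work package.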
2.4. Designing a Tidy data set
Very often, a lot of effort is required to get data into a good shape ready for analysis. After Wickham (2014), we call well-structured data tidy data. Tidy datasets are easy to manipulate, model and visualise. So there is a high return on investment.
- The tidy data framework offers a coherent rationale to organise the data. It is based on three simple principles only:
each variable is a column,
each observation is a row, and
each type of observational unit is a table.
The framework makes it easy to tidy messy datasets. Yet, some practice is necessary to master the principles. Inspired by Wickham (2014), the presentation illustrates the main ideas: what can go wrong and how to solve it. You can find examples of tidy data sets here.
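As a small illustration of the three principles, the sketch below reshapes a "wide" table, in which measurement dates are spread over columns, into a tidy table with one observation per row. The plant IDs, dates and heights are made-up values, and the helper function `tidy` is hypothetical; it only uses the Python standard library.

```python
# Messy layout: one row per plant, one column per measurement date.
# All values below are made up for illustration.
messy = [
    {"plant_id": "P1", "2023-05-01": 12.0, "2023-05-08": 15.5},
    {"plant_id": "P2", "2023-05-01": 10.5, "2023-05-08": 14.0},
]


def tidy(rows, id_column, variable_name, value_name):
    """Turn wide rows into long rows: one observation per row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every output row
            long_rows.append({
                id_column: row[id_column],
                variable_name: column,  # former column header becomes a value
                value_name: value,
            })
    return long_rows


# Each variable (plant_id, date, height_cm) is now a column,
# and each measurement is a row.
tidy_rows = tidy(messy, "plant_id", "date", "height_cm")
for row in tidy_rows:
    print(row)
```

If you work with pandas, `DataFrame.melt` performs the same kind of reshaping on a data frame.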
Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59(10), 1-23.
2.5. Archiving projects (Permanent storage)
Data from finished projects (link to Personal data) should be archived on a permanent drive at ILVO. Whenever possible, you are also encouraged to archive the data in a publicly accessible repository, either a generic one (e.g., Zenodo, figshare, …) or a field-specific one (check out this overview).