Ready, Set, Go: We Would Like to Submit a New Project
Projects and project funding move science and research forward. The data steward's role is to find the overlap between the team's needs and the project's requirements and, through this, to support the project's successful acquisition.
As part of research data management, you may encounter various requirements from funding agencies, which often include Data Management Plans (DMPs). To make it easier to navigate their expectations, you can refer to the table that summarizes DMP requirements by individual projects and funding providers.
We Got a Project: The Challenges Ahead
One of the first challenges is the DMP, which can help you avoid a number of potential problems later in the project. Luckily, several tools can guide you through creating one. A widely used option is the Data Stewardship Wizard, but you can also try DMPonline or any generic DMP template.
When creating a DMP, you will also address the long-term storage of the research data generated in the project. The re3data repository registry can help you make the right choice. Ideally, store your data in a disciplinary repository tailored to your specific data type. If you cannot find a suitable subject repository, check whether you can use, for example, an institutional repository; as a last resort, there are the Czech Data Repo (launched in a pilot version) and the internationally used Zenodo.
When choosing a repository, you will also run into metadata and other standards. FAIRsharing will help you find the discipline-specific ones. With rich metadata, standardized nomenclature, and a clear data structure, you can ensure that your data is understood by colleagues around the world.
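To make "rich metadata" concrete, here is a minimal sketch in Python of a machine-readable record loosely modelled on the DataCite Metadata Schema. The field names follow DataCite conventions, but the dataset, author, DOI, and repository are all invented for illustration:

```python
import json

# A minimal, DataCite-style metadata record (illustrative values only).
record = {
    "identifier": {"identifier": "10.1234/example.5678", "identifierType": "DOI"},
    "creators": [{"name": "Novak, Jana", "affiliation": "Example University"}],
    "titles": [{"title": "Soil moisture measurements, 2023 field campaign"}],
    "publisher": "Example Data Repository",
    "publicationYear": "2024",
    "resourceType": {"resourceTypeGeneral": "Dataset"},
    "subjects": [{"subject": "soil science"}],
    "rightsList": [{"rights": "CC BY 4.0"}],
}

# Fields DataCite marks as mandatory; checking them early catches
# incomplete records before they ever reach a repository.
REQUIRED = ["identifier", "creators", "titles", "publisher",
            "publicationYear", "resourceType"]
missing = [field for field in REQUIRED if field not in record]
print("missing required fields:", missing)  # -> missing required fields: []
print(json.dumps(record["titles"][0], indent=2))
```

A record like this is what a repository form ultimately produces; writing it down explicitly makes it easy to review and reuse across submissions.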
Once the project is launched, the first data will start coming in, and you will need somewhere to store and process it. The services of e-INFRA CZ (CESNET, IT4Innovations and CERIT-SC) can help with this, and there is a dedicated SensitiveCloud solution for sensitive data. If you plan to process big data, think ahead about how much computing capacity you will need. The range of e-INFRA CZ services is wide, and some are available free of charge to researchers, students, and academics. Others (e.g. large-scale computing at IT4Innovations) require a project application, and the approval process can take time, so don't leave the selection of services to the last minute.
Are Discipline-Specific Requirements Coming into the Spotlight?
Adapt your toolkit to the needs of your discipline: depending on its specifics, different types of tools can help you. For data processing and cleaning, look for tools that make your job easier in programming languages such as R and Python, or, if it suits you better, use the open-source tool OpenRefine. You may also find the following useful:
- Electronic Laboratory Notebooks (ELN)
An ELN is a digital registry for systematically recording, organizing, and sharing research notes, experiments, and data in scientific work.
- Humanities Toolkit
If you work in the humanities, here is a list of tools that you may find useful in your work.
- Working with microscope image data
The community around microscopy, microscope data, and scientific image data processing has developed a set of tools and guidelines, presented for example in articles published in Nature Methods (2021) and Nature Methods (2023). Besides tools for setting up metadata, these include checklists for publishing image data, images/visualizations, and software/scripts for data processing. Some of this is summarized in this Jupyter Book.
- More open-source tools and tips
Further tips and guidelines from the open-source world can be found, for example, on the Galaxy Community Hub or WorkflowHub.
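As an illustration of the kind of cleanup mentioned above (the sort of job OpenRefine or a few lines of R/Python handle well), here is a toy Python sketch that trims whitespace, unifies common "missing value" spellings, and drops duplicate rows. The sample rows are invented:

```python
# Toy cleanup of tabular records: trim whitespace, normalize missing
# values, and remove exact duplicates (invented sample data).
MISSING = {"", "na", "n/a", "null", "-"}

def clean_cell(cell):
    """Strip whitespace; map common 'missing' spellings to None."""
    cell = cell.strip()
    return None if cell.lower() in MISSING else cell

def clean_rows(rows):
    """Normalize every cell, then keep only the first copy of each row."""
    seen, out = set(), []
    for row in rows:
        cleaned = tuple(clean_cell(cell) for cell in row)
        if cleaned not in seen:  # deduplicate after normalization
            seen.add(cleaned)
            out.append(cleaned)
    return out

raw = [
    ["  Sample A ", "12.5", "N/A"],
    ["Sample A", "12.5", "n/a"],  # duplicate once normalized
    ["Sample B", "", "7.1"],
]
cleaned = clean_rows(raw)
print(cleaned)
# -> [('Sample A', '12.5', None), ('Sample B', None, '7.1')]
```

Real datasets need far more (type checks, unit normalization, clustering of near-duplicates), but even a small script like this makes the cleaning step reproducible, which a manual spreadsheet edit is not.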

Getting Closer to the Goal: We Have Data to Publish
The project has been successfully completed and the research outputs published; now it's time to package the resulting data, provide it with rich metadata, and store it for long-term use and reference. Return to the repository you selected earlier and use it to upload and store your data. By choosing a license, you also set who can see your data and under what conditions they can use it. Now your data is safe and should comply with the FAIR principles. You can do a final check here.
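Part of that final check can be automated. Below is a toy sketch, assuming your record is a plain Python dict with field names of our own choosing (`doi`, `license`, `description`, `access_url`); it verifies only a few FAIR-flavoured basics and is no substitute for a proper FAIR assessment tool:

```python
def fair_smoke_test(record):
    """Return a list of human-readable problems with a dataset record.

    Checks only a few basics inspired by the FAIR principles; the
    field names used here are an assumed convention, not a standard.
    """
    problems = []
    if not record.get("doi"):
        problems.append("no persistent identifier (Findable)")
    if not record.get("access_url"):
        problems.append("no access location (Accessible)")
    if not record.get("description"):
        problems.append("no description, hard to interpret (Interoperable/Reusable)")
    if not record.get("license"):
        problems.append("no license, reuse conditions unclear (Reusable)")
    return problems

record = {
    "doi": "10.1234/demo.1",            # hypothetical DOI
    "license": "CC BY 4.0",
    "description": "Example dataset from a 2023 field campaign",
    "access_url": "https://example.org/datasets/1",
}
print(fair_smoke_test(record))  # -> []
print(fair_smoke_test({}))      # lists all four problems
```

Running a check like this before hitting "publish" catches the most common omissions (a missing license is a classic one) while the record is still easy to fix.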
Three final tips:
- If you're looking for another open science platform, try osf.io.
- For a quick overview of useful tools, bookmark the services page on our website eosc.cz.
- If you or someone on your team is dealing with questions related to legal aspects or publishing, don’t hesitate to reach out to your open science methodologists. They can help guide you through these topics. You can find the key contacts in the second part of our series.
Build Your Toolkit Based on the 7 Phases of the Data Lifecycle
You can try a different approach and build your toolkit based on the research data lifecycle. This method allows you to focus on the specific needs of your data at each stage – from planning to reuse.
- Plan
Do we know how we are going to work with our data during and after the project? How do we guarantee that the data will be reproducible? In this phase, we answer these questions and more.
- Collect
For data collection, we look at data quality control and the legal and ethical requirements that need to be followed.
- Process
When processing data, keep it secure: provide proper documentation, organize the data appropriately, and give it a clear structure.
- Analyse
Monitoring and control play a key role here.
- Preserve
Deciding where and how to store data is the most important issue at this stage of the data lifecycle. When choosing a repository, focus on whether it meets the FAIR principles.
- Share
For data sharing, set up the correct access to the data. When publishing, check that the data meets the FAIR principles; a persistent identifier (PID) and a license are then assigned.
- Reuse
Reuse is ensured through proper data accessibility, metadata descriptions, and the other activities mentioned in the data lifecycle. This allows science to move forward.

To Be Continued...
Don’t miss the next episode of our series, Guardians of Data: Mission FAIR – Follow us on social media and be among the first to know when the next episode is released!