Guidance Data Management Plan

1. General Information

1.3 Name and contact details of the lead researcher including their ORCID

The lead researcher can be the principal investigator or supervisor of the project. Principal Investigator (PI) refers here to the holder of an independent grant and is usually the lead researcher for the grant project.

ORCID

ORCID provides a persistent digital identifier (an ORCID iD) that you own and control, and that distinguishes you from every other researcher. You can connect your iD with your professional information — affiliations, grants, publications, peer review, and more. You can use your iD to share your information with other systems, ensuring you get recognition for all your contributions, saving you time and hassle, and reducing the risk of errors.
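When an ORCID iD is entered in a DMP, its form can be checked before it is recorded. The sketch below assumes the publicly documented structure of an ORCID iD (16 characters in four hyphen-separated groups, where the last character is an ISO 7064 MOD 11-2 check digit that may be "X"). The function names are illustrative and not part of any ORCID tooling, and a passing check only shows that the iD is well formed, not that it belongs to the researcher.

```python
import re

# Four groups of four characters; the final character may be the check digit "X".
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check digit for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True when the iD has the right shape and a matching check digit."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


print(is_valid_orcid("0000-0002-1825-0097"))
```
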

1.4 Name and Contact Details of the Primary Point of Contact / Executive Researcher / Coordinating Investigator, including ORCID

Contact Details

This may be a single individual (e.g. the researcher collecting the data or the coordinating investigator), but multiple names can also be listed to enhance the findability of the research output (such as data, research protocols, publications, etc.) during and after the project.

ORCID

ORCID provides a persistent digital identifier (an ORCID iD) that you own and control, and that distinguishes you from every other researcher. You can connect your iD with your professional information — affiliations, grants, publications, peer review, and more. You can use your iD to share your information with other systems, ensuring you get recognition for all your contributions, saving you time and hassle, and reducing the risk of errors.

2. Legislation

2.1b I confirm that I am aware of and compliant with laws and regulations concerning privacy sensitive data

General Data Protection Regulation (GDPR):

Every research project that processes personal data must first be registered in the GDPR Processing Activities Register. The registration is used to review and assess the principles of the data processing (remember to obtain informed consent from data subjects) and the technical and organisational measures taken to protect personal data. If third parties are involved in processing personal data, a data processing agreement is mandatory.

You can find more information on the privacy page of Maastricht University.

For any privacy related concerns, you can contact one of the Data Protection Officers:

2.1c My research project is registered at my institution by the Local Information and Security Officer (LISO).

Registration Requirement

If you are processing personal data, you are required to register your project.

Local Information and Security Officer

Registration within the UM is handled by the Local Information and Security Officer. In most cases this is the Information Manager of your faculty.

To register research projects within the Faculty of Health and Life Sciences for GDPR compliance, please use the following form: General Data Protection Regulation - Jira Service Management.

Alternatively, you can initiate this registration by clicking the option provided at the end of this question. If you use this method, the form will become accessible via the Data Portal. You can then finalize it through the Data Portal by clicking on your Profile and then on Requests.

Within the azM the registration is provided by CTCM. For studies at azM the registration is done via the Panama System (https://qsmumc.ctcm.nl/Home/algemeen).

2.1d My study research file (e.g. protocol) has been or will be submitted to the relevant Ethics Committee.

The Medical Ethics Committee (METC)

Ethical review is obligatory by law for all research that is subject to the WMO. The METC acts as an accredited independent Ethics Committee for review and approval of all scientific research with human participants subject to the WMO. Before the start of each research project subject to the WMO that is performed at Maastricht UMC+, the Executive Board of Maastricht UMC+ requires the METC to review and approve the project.

For non-WMO research with patients from the academic hospital, the Executive Board of Maastricht UMC+ requires you to submit your research plan to the METC for assessment.

Ethics Review Committee Health, Medicine and Life Sciences (FHML-REC)

Researchers undertaking work with human participants that falls outside the scope of the WMO and does not include patients from the academic hospital can submit their research proposal to the FHML Research Ethics Committee (FHML-REC) for ethics review.

Ethics Review Committee Inner City faculties (ERCIC)

ERCIC encourages researchers to submit their study protocols involving human participants or personally identifiable data for ethical review before the start of research activities. Review by ERCIC at the moment takes place on a voluntary basis.

Ethics Review Committee Psychology and Neuroscience (ERCPN)

Ethical review of scientific research involving human participants or personally identifiable data is carried out by the Ethics Review Committee Psychology and Neuroscience (ERCPN). If studies fall under the Medical Research Involving Human Subjects Act (WMO), review by the accredited review committee METC is mandatory.

Animal Ethics Committee (DEC)

The Wet op de Dierproeven (WOD) was amended as of 18-12-2014, which also changed the role of the Animal Ethics Committees (DECs). The DEC-UM now reviews projects not only ethically but also scientifically, and offers its opinion to the Central Commission Animal testing (CCD). The CCD is the only authority in the Netherlands that can grant licenses for animal testing. For information about applying for a project authorisation, visit the website of the CCD.

2.1f The 'Act medical scientific research with human beings' applies to my project and I will comply with the 'Quality Assurance for Research Involving Human Subjects'

The 'Act medical scientific research with human beings' (Dutch: 'Wet medisch-wetenschappelijk onderzoek met mensen', WMO) applies to my project and I will comply with the 'Quality Assurance for Research Involving Human Subjects' (Dutch: 'Kwaliteitsborging Mensgebonden Onderzoek').

https://metc.mumc.nl/beslisboom-wmo-plichtig-niet-wmo-plichtig-onderzoek

If you intend to submit a new research proposal to the METC, it is first important to determine whether your research falls within the scope of the Act Medical Scientific Research with Human Beings (WMO).

2.1g I will be doing research involving human subjects and I am aware that informed consent is required from the participants for collecting or reusing their data.

Informed Consent

Consent must be freely given, informed, specific, and unambiguous. Consent must be a statement or clear affirmative action signifying agreement to the processing. When conducting research that involves personal or sensitive data, obtaining informed consent is generally essential.

Within all types of (medical) scientific (clinical) research informed consent should be the basis for the use of personal data. However, based on article 24 of the Dutch implementation law of the GDPR, there are some (cumulative) exceptions. The Ethics Committee (EC) needs to approve this deviation, based on solid argumentation in the protocol. The following criteria need to be taken into account:

  • the processing is absolutely necessary for scientific or historical research or statistical purposes

  • the research mentioned above serves a common interest

  • requesting explicit permission proves impossible or takes a disproportionate effort

  • the execution of the research has been provided with safeguards such that the privacy of the subject is not disproportionately harmed

Exceptions to Informed Consent

Nevertheless, there are circumstances where exceptions to informed consent may be permissible, determined either by ethical review committees or through legal provisions like those found in the GDPR. Ethical grounds for waiving consent can include situations where seeking it might cause participant distress or confusion, or certain observational studies where the consent process itself could interfere with the data or outcome. Under the GDPR, processing personal data is lawful only if a valid legal basis is identified; consent is one such basis, but not the only one. Other lawful bases include, for example, the necessity of processing for vital interests. Specific to the Netherlands, the UAVG introduces particular rules and potential derogations for processing data for purposes like research. Such processing often relies on lawful bases such as public interest or the dedicated research conditions under GDPR Article 9, contingent upon the implementation of appropriate safeguards, and is separate from the 'vital interests' basis, which applies in emergency scenarios.

Informed consent WMO

In case of WMO research the information letter and informed consent form should be written according to a specific format. The format of the MUMC+ ethical review committee is the same as the National CCMO format and can be found here: https://www.ccmo.nl/onderzoekers/standaardonderzoeksdossier/e-informatie-onderzoeksdeelnemers.

Informed consent non-WMO

There is no specific format for participant information and informed consent forms in case the research does not fall under the WMO. The researcher should adhere as much as possible to the formats of participant information and informed consent mentioned in the section above (section Informed consent WMO). We suggest starting with the complete format, to consider all elements mentioned, and to keep all elements that are relevant to the study.

2.1k I will be doing research involving human subjects and I have taken privacy protection measures.

Anonymisation

Under the GDPR, anonymization refers to processing personal data in such a way that the data subject is not or no longer identifiable (as indicated in Recital 26, which states that anonymous information is outside the scope of the Regulation). This is an irreversible process; individuals should not be re-identifiable by using all means reasonably likely to be used. When data is truly anonymized, it is no longer considered personal data and is therefore not subject to the GDPR.

Pseudonymisation

EU law defines pseudonymisation in Article 4 of the GDPR as “the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person”. Unlike anonymization, pseudonymization is reversible because re-identification is still possible by linking the pseudonymized data back to the original identifiers using that separate additional information. The GDPR requires that this additional information be kept separately and subject to technical and organisational measures to ensure non-attribution. For research data, managing pseudonymization often involves keeping the original, identifiable data (e.g., in a secure or encrypted key folder) separate from the pseudonymized dataset. Pseudonymisation is mandatory when conducting scientific medical research with human subjects.
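The separation between the pseudonymised dataset and the additional information can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the field names (patient_id, name), the "P-" prefix and the function name are hypothetical, and in practice the key table would be written to a separate, access-restricted and encrypted location rather than kept in memory next to the data.

```python
import secrets


def pseudonymise(records, id_field="patient_id", direct_identifiers=("name",)):
    """Replace the identifier with a random pseudonym and drop direct identifiers.

    Returns the pseudonymised records and the key table that links each
    original identifier to its pseudonym. The key table is the "additional
    information" that must be stored separately from the dataset.
    """
    key_table = {}
    pseudonymised = []
    for record in records:
        original_id = record[id_field]
        if original_id not in key_table:
            # Random pseudonym: it cannot be derived from the original identifier.
            key_table[original_id] = "P-" + secrets.token_hex(8)
        cleaned = {
            field: value
            for field, value in record.items()
            if field != id_field and field not in direct_identifiers
        }
        cleaned["pseudonym"] = key_table[original_id]
        pseudonymised.append(cleaned)
    return pseudonymised, key_table


# Hypothetical example records, invented for illustration only.
records = [
    {"patient_id": "12345", "name": "A. Jansen", "systolic_bp": 128},
    {"patient_id": "12345", "name": "A. Jansen", "systolic_bp": 131},
    {"patient_id": "67890", "name": "B. Peeters", "systolic_bp": 117},
]
dataset, key_table = pseudonymise(records)
```

Because the pseudonyms are random, re-identification is only possible through the key table, which is why the dataset remains personal data for as long as that table exists.
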

2.2a In collecting new data, I will be collaborating with other parties.

The following options could apply:

  • Yes, the new data will be (partly) provided by a project partner or supplier = Other parties, such as participating centers, will provide data

  • Yes, I will collect the new data in conjunction with other researchers or research groups = Multiple partners or research institutions work closely together in designing the study and collecting the data, e.g. within a consortium. Data will be shared among these partners

  • Yes, we have reached agreements on the user rights of the data used in the project = This answer almost always applies when data is transferred or shared and is NOT anonymous but pseudonymised.

2.2c I am a member of a consortium of 2 or more partners

Consortium Agreement

A consortium agreement is an agreement in which more than two parties collaborate for research, development or other purposes. It includes the provisions needed for the execution of the project: grants, confidentiality, intellectual property (IP), co-authorship, and the conditions for reusing data and publications. In contrast to a cooperation agreement, a consortium agreement is more comprehensive and describes, among other subjects, the management structure.

If a project is executed in collaboration with other academic partners (for instance researchers from another university) and/or peripheral hospitals, this collaboration is called a consortium cooperation. Other projects might also involve external parties, in which case a consortium agreement is also advised. In all of these cases it is very important to record the arrangements in a consortium agreement.

When the project is carried out by different departments within the same institute, a consortium agreement, agreements on internal management and processor agreements are not mandatory. However, it is advisable to make mutual agreements and to record them in writing. You should always contact the relevant legal department when drawing up a consortium agreement:

2.2d Agreements have been made regarding research data management and intellectual property recorded in a collaboration or consortium agreement.

Consortium Cooperation and Personal Data

If there is a consortium agreement and personal data will be processed during the research, it is important to think carefully about the relationships and roles of the consortium partners with regard to these personal data. In other words: who is responsible for the data management? Is one party responsible, or several?

Consortium Agreements and the DMP

Legal agreements, such as a consortium agreement, should not be included in a DMP. A DMP is a living document that serves another purpose, namely describing the many aspects of data management. Because it can be altered, it is not suitable for irrevocable, binding legal agreements. In the case of a consortium, a reference to the consortium agreement should be recorded in the DMP.

Co-authorship

In accordance with the ethical guidelines, (rules of the Vancouver convention: Davidoff F et al., Sponsorship, authorship and accountability, NEJM 345:825-826, 2001), agreements regarding co-authorship can be made.

If additional information regarding intellectual property law is needed, expert Eliza Malathouni from the University Library may be contacted at:

3. Data Preparation and Collection

3.1a In collecting data for my project, I will be reusing or combining existing data

Reuse of Data and Informed Consent (IC)

Data reuse means using data for purposes other than those it was originally collected for. Reuse of data is particularly important in science, as it allows different researchers to analyse and publish findings based on the same data independently of one another. Reusability is one key component of the FAIR principles. However, the principles of anonymisation and pseudonymisation according to the GDPR should be kept in mind when dealing with the reuse of research data.

For an explanation regarding anonymous data versus pseudonymous data see section 2.1k.

If you opt to pseudonymise data, you will often need the permission of the subjects to reuse the data. The data may be linked only once subjects have given their explicit consent. If the data are used for another purpose in the future (further use, or ‘secondary processing’), the researcher must again seek the permission of the subjects before linking the data.

3.1b I have the owner's permission for reusing the data

Reusing Data

If data from former projects or databases is used, it should be clear that the subjects already consented to the reuse during the original collection of their data. When using data from other authors or sources, take into account that you first have to acquire permission or deal with user agreements and licenses.

Here it is important to take the definition of consent into account. Article 7 of the General Data Protection Regulation (GDPR) sets out the conditions for consent. The controller (the party that determines the purposes and means of the processing) must be able to demonstrate that the data subjects have given consent. The consent should be a written declaration, but there are exceptions to this. The request for consent should be written in clear and plain language suited to the intended audience. The data subject has the right to withdraw consent at any time and must be informed of this right before giving consent.

Even when reusing data, you have to take into account the retention period that applies to your particular type of research. You are sometimes required to delete the data you are reusing immediately after your research.

There are some legal bases on which no permission is required, for instance when the information is in the public domain. Please check the legal bases on which the data was collected.

3.3a The following type of data will be used/collected

Standard Personal Data

Any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person. NB: National identification numbers cannot be used without a legal obligation!

Special Categories of Personal Data (Sensitive Data)

Data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health or data concerning a natural person’s sex life or sexual orientation.

Clarification: data collected within clinical scientific research qualifies as a special category of personal data, even when it is collected in pseudonymised form!

GDPR art. 5 Principles relating to processing of personal data
GDPR art. 9 Processing of special categories of personal data

Anonymous Data

This is only possible when the data a) was already anonymous when extracted from an existing database or b) will be irreversibly anonymised before providing the data to the researcher. Consent is not needed.

If you are uncertain which category the data belongs to, you should get in touch with your faculty contact person (Local Information and Security Officer or Information Manager). You can also send an email to privacy@maastrichtuniversity.nl for UM or secretariaat.kwaliteitenveiligheid@mumc.nl for MUMC+ research.

3.5 Please select the tools, instruments or other means you intend to use for collecting, processing or storing data

Tools for Data Collection and Processing

Collecting:

The following tools might be used to collect Interview or other Qualitative Inputs Online:

Processing:

The following tools might also be used to process Qualitative Data:

  • NVivo

  • ATLAS.ti

  • Dragoncitation

  • MS Teams

  • electronic lab journals (FHML)

  • OneNote

Tools for Data Storage

Effective data storage is crucial throughout the research lifecycle. This section outlines recommended tools for both the dynamic (active) and static (archival) phases of your research. See section 4.1a for more specifics on each tool.

Dynamic or Active Phase:

Static Phase of Research (Archival Data)

Once the active research phase concludes, it's vital to preserve and archive your data for long-term access, reusability, and compliance.

  • MDR (Maastricht Data Repository): Used for internal data storage in compliance with the FAIR Principles. Assistance with setup and usage is available via the Data Stewards.

  • DataverseNL: A national repository suitable for data archival after research completion. Note that the archival of non-anonymized data is not permitted; however, pseudonymized data may be archived here. You can find more information and access the repository via the Data Stewards.

3.6a I will select an ontology/terminology for recording my data that allows my dataset to be linked or integrated with other datasets

FAIR Principle I2

Proper use of terminology involves describing individual variables and their values using standardized ontologies. Variable names should be linked to formal ontological concepts that define their meaning unambiguously. For example, a variable labeled systolic_bp could be mapped to a concept from the SNOMED CT or LOINC ontology representing "systolic blood pressure." Similarly, value labels, such as the categories "male" and "female", should reference controlled terms from established vocabularies like SNOMED CT, LOINC and MedDRA (for medical conditions). This semantic annotation allows automated tools and researchers to interpret, integrate, and analyze data across studies, regardless of linguistic or structural differences in the original data sources. Ontology-based metadata enhances the FAIRness of data (Findable, Accessible, Interoperable, and Reusable) and is particularly valuable in clinical trials, observational research, and public health monitoring.
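A codebook that records this mapping could look like the sketch below. The structure and the function name are assumptions made for illustration, and the concept codes shown (LOINC 8480-6 for systolic blood pressure, SNOMED CT 248153007 and 248152002 for male and female) should be verified in an ontology lookup service before they are used in a real dataset.

```python
# Hypothetical codebook: each variable is linked to an ontology concept,
# and coded answer categories are linked to controlled terms.
codebook = {
    "systolic_bp": {
        "label": "Systolic blood pressure",
        "concept": {"system": "LOINC", "code": "8480-6"},
    },
    "sex": {
        "label": "Sex",
        "values": {
            "male": {"system": "SNOMED CT", "code": "248153007"},
            "female": {"system": "SNOMED CT", "code": "248152002"},
        },
    },
}


def concept_for_value(variable, value):
    """Look up the controlled term for a coded answer category."""
    values = codebook[variable].get("values", {})
    if value not in values:
        raise KeyError(f"{variable}={value!r} has no controlled term in the codebook")
    return values[value]


print(codebook["systolic_bp"]["concept"])
print(concept_for_value("sex", "female"))
```

Keeping the mapping in one machine-readable place means the same codebook can later be exported alongside the dataset as variable-level metadata.
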

It is recommended to make use of metadata standards at an early stage in the research. Within the MUMC+ the following metadata standards are advised:

For pharmaceutical clinical trials CDISC is often used.

Ontology Lookup Service: https://www.ebi.ac.uk/ols4/

COVID CRF: https://www.who.int/publications/i/item/global-covid-19-clinical-platform-case-report-form-(crf)-for-post-covid-conditions-(post-covid-19-crf-)

3.7 Give an estimation of the size of the data collection

Estimating Size

This can be a rough estimate made before the start of your project. Use the following reference table to estimate the size.

Type of data                              | Content     | Format | Volume
Text                                      | 10 pages    | PDF    | <1MB
Audio                                     | 10 minutes  | MP3    | ~10MB
Audio                                     | 10 minutes  | MKA    | <100MB
Video (1080p)                             | 10 minutes  | MKV    | ~850MB
Image (50 MP)                             | 10 photos   | JPG    | ~120MB
Image (MRI, EEG, Xray, PET, Ultrasound)   | 1 image     | DICOM  | 1-30MB
Statistical data, spreadsheets, databases | 100 records | CSV    | <10MB
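The reference values can be scaled to the planned amount of data with simple proportional arithmetic. The sketch below is only a rough aid: the dictionary keys and the function name are made up for illustration, and where the table gives an upper bound or a range, the highest value is used so that the estimate errs on the high side.

```python
# (reference amount, reference volume in MB), taken from the reference table.
REFERENCE = {
    "text_pdf": (10, 1),      # 10 pages, <1MB
    "audio_mp3": (10, 10),    # 10 minutes, ~10MB
    "audio_mka": (10, 100),   # 10 minutes, <100MB
    "video_mkv": (10, 850),   # 10 minutes of 1080p video, ~850MB
    "image_jpg": (10, 120),   # 10 photos of 50 MP, ~120MB
    "image_dicom": (1, 30),   # 1 image, 1-30MB
    "records_csv": (100, 10), # 100 records, <10MB
}


def estimate_mb(kind, amount):
    """Scale the reference volume linearly to the planned amount."""
    reference_amount, reference_mb = REFERENCE[kind]
    return amount / reference_amount * reference_mb


# Hypothetical project: 20 interviews of 60 minutes (MP3) and 2 hours of video.
plan = {"audio_mp3": 20 * 60, "video_mkv": 2 * 60}
total_mb = sum(estimate_mb(kind, amount) for kind, amount in plan.items())
print(f"Estimated size: {total_mb / 1000:.1f} GB")
```
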

All the material (i.e. data, code, software) needed for the reproduction of the research must be saved.

4. Data Processing and Analysis

4.1a During the project, I will have access to sufficient storage capacity/sites and a backup of my data will be available

There are several options for storing data during the research project. It is highly recommended that you use UM/MUMC+ infrastructure whenever possible. Check with your faculty whether agreements have already been made with regard to storage and whether any procedures are in place.

Effective data storage is crucial throughout the research lifecycle. This section outlines recommended tools for the dynamic (active) phase of your research.

Dynamic Phase of Research (Active Data)

During the active phase of your research, reliable and accessible storage solutions are essential for ongoing data collection, processing, and analysis.

  • Internal Drive P-Drive: Primarily for internal storage of FHML research data. Access management is handled by the FHML ICT Department fhml-ict-support@maastrichtuniversity.nl

  • Internal Drive L-Drive: For internal storage of MUMC+ research data. Access management is handled by MIT azM klantenservice.mit@mumc.nl.

  • SURFdrive: A cloud storage option managed by the ICT department at FHML: fhml-ict-support@maastrichtuniversity.nl

  • DSRI (Data Science Research Infrastructure): A robust cluster of servers designed to deploy workspaces and applications for data science. It operates by launching workspaces and applications within Docker containers, which are automatically deployed to powerful servers on the cluster using Kubernetes, a container orchestration system.

  • Andrea: A cloud-based workspace to share and actively work with data. For access, contact info-memic@maastrichtuniversity.nl

  • SURF Research Drive: A secure, cloud-based storage service specifically designed for researchers, students, and information professionals. It facilitates the storage, sharing, and collaborative work on research data. Key features include:

    • Scalable Storage: Ideal for managing large datasets common in research.

    • Advanced Security: Ensures sensitive and valuable research data is safeguarded through robust protocols.

    • Seamless Collaboration: Enables efficient data sharing across institutions while maintaining control over data access.

    • Integration: Integrates effectively with other research tools such as SURF Research Cloud and SURF Sharekit Link, enhancing workflow efficiency and supporting compliance with data management regulations.

4.2a I will ensure that the data and their documentation will be of sufficient quality to allow other researchers to interpret and reuse them (in a replication package).

FAIR Principle R1

Meta(data) are richly described with a plurality of accurate and relevant attributes.

Documentation of the Research Process

Document the research process with, for example, study protocols, CVs, orientation schedules and certificates of the persons working on the research, standard operating procedures (SOPs), old and new versions of the information for test subjects, and methods of blinding. Store digital information on a secured server and all paper information in a safely locked cabinet, and specify its location.

Quality Control during Data Collection

Incorporate restrictions and validation rules as much as possible when using online data collection tools like Castor EDC, apps or tailor-made software, in order to prevent data entry mistakes and to increase the quality of the data.

Examples of restrictions and validation rules are:

  • automated routing: only relevant questions will be available

  • indicate ranges

  • use mainly fixed response questions and try to avoid open text questions

  • mandatory questions
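The kinds of rules listed above can be sketched in a few lines of code. This is a generic illustration and not the way Castor EDC or any other tool implements its checks; the field names, ranges and answer options are hypothetical.

```python
# Hypothetical rules: mandatory questions, ranges, fixed responses and routing.
RULES = {
    "age": {"required": True, "min": 18, "max": 110},
    "sex": {"required": True, "choices": {"male", "female", "other"}},
    "pregnant": {
        "required": True,
        "choices": {"yes", "no"},
        # Routing: this question is only presented when sex is "female".
        "only_if": lambda record: record.get("sex") == "female",
    },
}


def validate(record):
    """Return a list of human-readable problems found in one entered record."""
    errors = []
    for field, rule in RULES.items():
        applies = rule.get("only_if", lambda record: True)
        if not applies(record):
            continue  # question is not relevant for this participant
        value = record.get(field)
        if value is None:
            if rule.get("required"):
                errors.append(f"{field}: value is required")
            continue
        if "choices" in rule and value not in rule["choices"]:
            errors.append(f"{field}: {value!r} is not an allowed answer")
        if "min" in rule and not rule["min"] <= value <= rule["max"]:
            errors.append(
                f"{field}: {value} is outside the range {rule['min']}-{rule['max']}"
            )
    return errors


print(validate({"age": 17, "sex": "male"}))
```
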

All data collection tools must have an audit trail: all data changes are logged (by whom, what and when).

Quality control after data collection

For additional quality control, a data-control and data-cleaning plan can be developed.

In combination with the codebook you can determine which controls should be carried out.

The controls described are translated into a script (e.g. SPSS syntax). With this script an error summary can be generated, and corrections can be made based on that summary.

These corrections are included in a script and documented in the data cleaning plan. This method ensures that:

  • controls and corrections are made in a uniform manner

  • all steps are documented

In this way the reproducibility of the data is guaranteed.
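The workflow of scripted controls, an error summary and scripted corrections can be sketched as follows. The guidance mentions SPSS syntax as an example; Python is used here purely for illustration, and the codebook ranges, example rows and the correction entry are all hypothetical.

```python
from collections import Counter

# Hypothetical codebook: allowed range per variable.
CODEBOOK = {"age": (0, 120), "systolic_bp": (50, 300)}


def run_controls(rows):
    """Apply the codebook controls and return (row number, variable, problem)."""
    errors = []
    for number, row in enumerate(rows, start=1):
        for variable, (low, high) in CODEBOOK.items():
            value = row.get(variable)
            if value is None:
                errors.append((number, variable, "missing"))
            elif not low <= value <= high:
                errors.append((number, variable, "out_of_range"))
    return errors


def summarise(errors):
    """Count the problems per variable and problem type (the error summary)."""
    return Counter((variable, problem) for _, variable, problem in errors)


def apply_corrections(rows, corrections):
    """Apply documented corrections to a copy; the raw data stays untouched."""
    cleaned = [dict(row) for row in rows]
    for correction in corrections:
        cleaned[correction["row"] - 1][correction["variable"]] = correction["new_value"]
    return cleaned


raw = [
    {"age": 34, "systolic_bp": 120},
    {"age": 430, "systolic_bp": 118},
    {"age": 51, "systolic_bp": None},
]
# Each correction records its reason, so the list doubles as documentation.
corrections = [
    {"row": 2, "variable": "age", "new_value": 43,
     "reason": "typing error, checked against the source document"},
]

print(summarise(run_controls(raw)))
cleaned = apply_corrections(raw, corrections)
print(summarise(run_controls(cleaned)))
```

Rerunning the same script on the same raw data always produces the same cleaned dataset, which is the point of scripting the corrections instead of editing values by hand.
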

4.3a All data processing and analyses will be programmed in syntax or script files

Effective version control in statistical software syntax—such as R, SAS, Stata, or SPSS—is crucial for ensuring reproducibility, clarity, and collaboration. A best practice is to maintain a clear and consistent file-naming convention that includes version numbers or dates (e.g., analysis_v1.R, model_2025-07-07.do). Scripts should include detailed comments explaining the purpose and logic of each part of the code. Keeping a changelog or version history as a comment in the script file can help track modifications over time. It's also important to avoid overwriting original data or scripts; instead, create backups or use copies for experimentation. Regularly saving and organizing files in structured directories (e.g., data/, scripts/, output/) further enhances project manageability and transparency.
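The naming convention and directory structure described above can be generated rather than typed by hand, which helps keep them consistent. The sketch below is one possible way to do this; the function names are illustrative, and the folder names follow the example in the text.

```python
from datetime import date
from pathlib import Path


def versioned_name(stem, extension, on=None):
    """Build a file name that carries an ISO date, e.g. model_2025-07-07.do."""
    on = on or date.today()
    return f"{stem}_{on.isoformat()}.{extension}"


def create_project_layout(root):
    """Create the data/, scripts/ and output/ folders if they do not exist yet."""
    root = Path(root)
    for subfolder in ("data", "scripts", "output"):
        (root / subfolder).mkdir(parents=True, exist_ok=True)
    return sorted(path.name for path in root.iterdir() if path.is_dir())


print(versioned_name("model", "do", date(2025, 7, 7)))
```

ISO dates (year-month-day) sort chronologically in a file browser, which is why they are preferred over formats such as 07-07-2025.
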

4.7a Data will be shared and transferred in a secure way

SURFfilesender

Within the UM/MUMC+ the use of SURFfilesender is recommended. This allows you to send files up to 500GB for free. With SURFfilesender, your files are sent securely. The uploaded files are stored in the Netherlands for no more than 21 days. Although SURFfilesender is already secure as it is, you can also opt for additional security in the form of encryption. Files up to 2GB can be sent using encryption. All you have to do is send the recipient a ‘key’ via a second channel: telephone or SMS, for example. The recipient then enters this key, which allows them to download the file. This way, you can determine who is allowed access to your valuable research data or confidential files.

SURFdrive

Store, synchronise and share your documents easily with SURFdrive. SURFdrive is a personal cloud service for Dutch education and research. Your documents are kept safe in the SURF community cloud.

SharePoint

SharePoint facilitates the production of joint research papers. Multiple collaborators from different institutions and locations can edit shared documents within SharePoint.

A Virtual Research Environment (VRE) has a secured place on a UM server, so collaboration takes place in a secure environment that is accessible anytime and anywhere and is fully integrated with Microsoft. VREs facilitate online communication and sharing sources of information with integrated tools like wikis, blogs, shared calendars and discussion forums.

Andrea

A workspace to share and actively work with data. For access to it, contact info-memic@maastrichtuniversity.nl

Tools for Encryption

7-Zip and WinZip are tools that support several different data compression, encryption and pre-processing algorithms. The contents of the files that you want to protect are encrypted based on a password that you specify. In order to later extract the original contents of the encrypted files, the correct password must again be supplied.
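Password-based encryption with 7-Zip can also be scripted. The sketch below assumes that the 7-Zip command-line program is installed and available under the name 7z, which differs between systems; the function names and file names are illustrative. The password is not passed as an argument: a bare -p switch makes 7-Zip ask for it interactively, so it does not end up in scripts or in the command history.

```python
import shutil
import subprocess


def build_7z_command(archive, source):
    """Return the 7-Zip command for creating an encrypted archive.

    "a" adds files to an archive, "-t7z" selects the 7z format,
    "-mhe=on" also encrypts the file names inside the archive, and
    "-p" without a value makes 7-Zip prompt for the password.
    """
    return ["7z", "a", "-t7z", "-mhe=on", "-p", archive, source]


def encrypt_folder(archive, source):
    """Run 7-Zip if it is installed; the user types the password at the prompt."""
    if shutil.which("7z") is None:
        raise RuntimeError("7z executable not found on PATH")
    subprocess.run(build_7z_command(archive, source), check=True)


print(" ".join(build_7z_command("export.7z", "export_folder")))
```

As with SURFfilesender, the password should reach the recipient through a second channel and not together with the archive.
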

5. Data Archiving and Open Access

5.1a I will select a data format, which will allow other researchers and their computers (machine actionable) to read my data collection

FAIR Principle I1

(Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.

To ensure long-term usability, accessibility and preservation of data, it is recommended that you use a ‘preferred’ file format. DANS has made an extensive list of preferred formats.

DANS is confident that preferred formats will offer the best long-term guarantees in terms of usability, accessibility and sustainability.

5.2a I will use a metadata scheme for the description of my data collection (for describing the dataset as a whole)

Metadata: The Key to Understandable and Reusable Data

Metadata, literally "data about data," provides essential descriptive information that defines and contextualizes your research data. It typically describes the content, context, and provenance of a dataset in a standardized and highly structured way. Think of metadata as the "footprint" you leave for the future; it allows others (and your future self!) to understand what your data is about, enabling them to locate, access, and potentially reuse it effectively.

Why is metadata important?

High-quality metadata is fundamental to good data management and is crucial for adhering to the FAIR principles (Findable, Accessible, Interoperable, Reusable). Specifically, metadata:

  • Enhances Discoverability: Ensures your data can be easily found by relevant stakeholders, preventing duplication of effort and promoting collaboration.

  • Facilitates Understanding: Provides the necessary context for users to correctly interpret your data, including methodologies, definitions, and collection circumstances.

  • Promotes Reusability: Equips future researchers with the information needed to reliably re-analyze, validate, or build upon your work, extending the impact of your research.

  • Supports Long-Term Preservation: Guarantees that data remains interpretable and usable even as technologies and personnel change over time.

Types of Metadata

Metadata can be categorized based on the type of information it conveys:

  • Descriptive Metadata: Information primarily for discovery and identification.

    • Examples: Title, author, abstract, keywords, publication date, data type, project title. This is the kind of metadata used to describe a dataset in a repository such as DataverseNL.

  • Structural Metadata: Describes how a dataset's internal components are organized and relate to each other.

    • Examples: Relationships between files within a larger dataset, dataset table of contents, folder structures.

  • Administrative Metadata: Information required to manage and preserve the data effectively.

    • Examples: Creation date, file format, access rights, intellectual property rights, licenses, dataset version, contact information for the data owner.

  • Technical Metadata: Details about the data's technical characteristics, often useful for processing and software compatibility.

    • Examples: Encoding (e.g., UTF-8), file size, software used for data creation or analysis, sensor settings.

  • Preservation Metadata: A specific subset of administrative metadata focused on actions taken to ensure the long-term integrity and accessibility of the data.

    • Examples: Checksums, migration history, software dependencies for opening the data.
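
As an illustration of the checksums mentioned under preservation metadata, the following is a minimal sketch of how a checksum could be computed and later re-checked with Python's standard library. The choice of SHA-256 and the function names `sha256_checksum` and `verify_checksum` are assumptions made for this example; your repository may prescribe a different algorithm.

```python
import hashlib
from pathlib import Path


def sha256_checksum(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that large data files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: Path, expected: str) -> bool:
    """Check a file against a checksum recorded earlier (hypothetical helper)."""
    return sha256_checksum(path) == expected.strip().lower()
```

Recording the digest alongside the dataset at deposit time makes it possible to confirm later that a file has not been altered or corrupted.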

Registering Metadata

FAIR Principle F2

Data are described with rich metadata.

To make your dataset (re)usable, you should use a metadata scheme to describe your data. Start doing so early in your project; adding metadata after the project has ended is generally much harder and more time-consuming. Multiple standards are already available for different fields of research.

Metadata Standards and Schemas

A key component of metadata is the schema. A metadata schema provides the overall structure for the metadata: it describes how the metadata is set up and usually specifies standards for common components such as dates, names, and places.

One of the most generic and widely used metadata schemas is Dublin Core, which you can always apply if you are unsure which schema to select.
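
To give an impression of what a Dublin Core description looks like, here is a minimal sketch that builds a small record as XML using Python's standard library. The 15 element names and the namespace come from the Dublin Core Metadata Element Set; the wrapper element `metadata`, the function name `build_dublin_core` and all field values are hypothetical and only serve as an example.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1
DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 elements defined by the Dublin Core Metadata Element Set
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def build_dublin_core(fields: dict) -> ET.Element:
    """Build an XML record from a mapping of element name -> list of values."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"Not a Dublin Core element: {name}")
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return root


# All values below are made up for illustration.
record = build_dublin_core({
    "title": ["Example survey dataset"],
    "creator": ["Doe, Jane"],
    "date": ["2024-01-31"],
    "format": ["text/csv"],
    "rights": ["CC BY 4.0"],
})
xml_text = ET.tostring(record, encoding="unicode")
```

Repositories usually offer a web form that collects these fields for you, so a script like this is mainly useful when many datasets have to be described in bulk.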

List of Metadata Standards by the Digital Curation Centre

The Digital Curation Centre provides a list with links to information about discipline-specific metadata standards, including profiles, tools to implement the standards, and use cases of data repositories currently implementing them.

FAIRsharing

FAIRsharing.org provides a more extensive catalogue of standards used in biomedical research.

When you register your dataset in an online catalogue, you are in fact already asked to provide a set of metadata. The metadata requested by an online catalogue mainly serve to improve the findability and accessibility of your data.

However, the metadata in schemes like those listed on FAIRsharing.org are far more detailed and domain-specific. These specific metadata schemes are aimed at improving the interoperability and reusability of the data.

5.3a I will make the following end products available for further research and verification

NWO Requirements

NWO expects you to preserve the data resulting from your project for at least ten years, unless legal provisions or discipline-specific guidelines dictate otherwise. As much as possible, research data should be made publicly available for re-use. As a minimum, NWO requires that the data underpinning research papers should be made available to other researchers at the time of the article’s publication, unless there are valid reasons not to do so. The guiding principle here is 'as open as possible, as closed as necessary.' Due consideration is given to aspects such as privacy, public security, ethical limitations, property rights and commercial interests. In relation to research data, NWO recognizes that software (algorithms, scripts and code developed by researchers in the course of their work) may be necessary to access and interpret data. In such cases, the data management plan will be expected to address how information about such items will be made available.

Availability

Conditions for the availability of the data can be laid down in terms of use. Terms of use can be included in the cooperation agreement or consortium agreement.

Verification

For verification, data will be made available to people who have the necessary permissions/rights. These permissions/rights can be laid down in the terms of use.

End Products of the Project

Give a brief description of the end products available for follow up research and verification:

  • raw data

  • processed data, e.g. SPSS files

  • data documentation, for instance codebooks with metadata

  • documentation about the data control and data cleaning process: description of the findings in a Word document

    • TMF (Trial Master File) and separate ISFs (Investigator Site Files): files documenting the research process (including digital documentation about the data and syntaxes) and signed informed consent forms (ICs). These will be kept safely in a locked cabinet at the appropriate center for 15 years.

    • study flow and logistics

    • study protocol

  • scripts/syntaxes for data review and data cleaning, analyses

  • questionnaires and eCRFs

  • other products such as text documents, spreadsheets, (lab) logbooks, models and algorithms, transcripts, samples, artifacts, and other data files such as literature review files, email archives, etc.

5.5a The data collection of my project will be findable for subsequent research

FAIR principle F4

(Meta)data are registered or indexed in a searchable resource.

In addition to the persistent identifier, the findability of your dataset is enhanced by registering it in an online catalogue or web portal with a search engine, so that other potential users can find it. Research Data Alliance has formulated a clear definition.

You can register the dataset on such a catalogue. This does not mean however that the data themselves are stored there. Rather, you register information about your dataset (metadata), and provide a reference through a persistent identifier.

The information about the dataset may include title, description, research goal and contact information and conditions for getting access to the data, etc. It may be generic information, or specific information aimed at a research community.

DataverseNL

DataverseNL is an online research data repository to register, store and share research data in accordance with the FAIR principles. Every researcher at UM can make use of DataverseNL during the research period and up to the prescribed term of ten years after the last publication based on the data. For questions and support, contact your faculty data steward or the University Library Research Data Management specialists.

Maastricht Data Repository

The Maastricht Data Repository, offered by DataHub, is available within the MUMC+ for storage and archiving of metadata as well as research data, taking into account the established guidelines. The research data will be stored according to the FAIR principles. DataHub meets the requirements of current and future legislation.

If a dataset is stored in the Maastricht Data Repository, a Unique Persistent Identifier (PID) is automatically generated for this dataset. This unique PID refers to the corresponding data set stored in the Maastricht Data Repository. The dataset receives a Handle PID.

DataHub also provides the ability to transfer your metadata or even your complete dataset to DataverseNL. For more information please contact DataHub:

5.6a I will be using a persistent identifier as a permanent link to my data collection

FAIR Principle F1

(Meta)data are assigned a globally unique and persistent identifier.

A Persistent Identifier (PID) is a permanent online reference to a digital object that is independent of its storage location. The digital object in this case is the dataset itself, or metadata that describe what the dataset is about (see key item 7). This PID is a unique ‘label’ (usually in the form of a code) and is created by a (certified) data archive or repository. With a PID, the dataset, or a description of it, can always be found on the internet, even when its name or location has changed since its creation. PIDs are essential for ensuring findability and sustainable archiving of your dataset. In addition, a PID enables you to cite your data in publications. Examples of PIDs are DOI, Handle, URN or ARK.

For more information on PIDs, see the web pages of the International DOI Foundation (IDF) and DataCite (TU Delft).

The RDNL course is also a good way to learn more about storing, managing, archiving and sharing data.

DataHub/DataverseNL

If a dataset is stored in the Maastricht Data Repository or in DataverseNL, a Unique Persistent Identifier (PID) is automatically generated for this dataset. This unique PID refers to the corresponding dataset stored in the Maastricht Data Repository or in DataverseNL. The dataset receives a Handle PID.

Resulting publication(s) related to the dataset are identified with a Digital Object Identifier (DOI) registered by the publisher. The publication(s) will probably be required to include a reference to the corresponding dataset's Handle PID.

DOI

The Digital Object Identifier is a unique and stable identifier that ensures that a digital object can be permanently found on the World Wide Web, regardless of changes in the URL where the object is found. A central registry ensures that the user of a DOI will be referred to its current location.
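
The way a PID is turned into a clickable link can be sketched as follows. The sketch assumes the public resolver services at doi.org (for DOIs) and hdl.handle.net (for Handles); the function name `pid_to_url` and the identifiers used are hypothetical examples, not real datasets.

```python
from urllib.parse import quote

# Public resolver services; the resolver redirects to the current location
# of the object, so the link stays valid when the object is moved.
RESOLVERS = {
    "doi": "https://doi.org/",
    "handle": "https://hdl.handle.net/",
}


def pid_to_url(pid: str, scheme: str) -> str:
    """Build a resolvable link for a DOI or Handle (hypothetical helper)."""
    try:
        base = RESOLVERS[scheme.lower()]
    except KeyError:
        raise ValueError(f"Unsupported PID scheme: {scheme}") from None
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return base + quote(pid.strip(), safe="/")


# Made-up identifiers, for illustration only.
doi_link = pid_to_url("10.1234/example.5678", "doi")
handle_link = pid_to_url("21.12345/abc-001", "handle")
```

Citing the full resolver link rather than the repository page address means the reference keeps working even if the repository reorganises its website.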

5.7a Once the associated article is published and/or the project has ended, (part of) my data will be accessible for further research and verification

FAIR principle A1

(Meta)data are retrievable by their identifier using a standardised communications protocol.

Restricted Access – Embargo Period

By determining an embargo period, the accessibility of data can be delayed. One of the reasons for delayed access to data can be the marketing of the acquired knowledge.

A distinction in accessibility can be made between the access of raw data and aggregated research data (for scientific publication). Agreements on embargo period and restricted access should be captured in Terms of use.

5.7c Once the project has ended, my data collection will be publicly accessible

FAIR principle A1

(Meta)data are retrievable by their identifier using a standardised communications protocol.

Accessibility of data is a key item for ZonMw. Requirements for the accessibility of data can be laid down in terms of use. Open access is not required, but the data must be findable via an online catalogue or metadata catalogue that also provides the terms of use.

5.7e I have a set of terms of use available to me, which I will use to define the requirements of access to my data collection once the project has ended

FAIR Principle R1.1

(Meta)data are released with a clear and accessible data usage license.

UM/MUMC+ encourages researchers to create FAIR data. FAIR data, however, are not necessarily open to everybody. With the ‘A’ of accessible in FAIR, you as a researcher can state the conditions under which the data will be shared.

If the reuse of your dataset is bound to specific conditions, or in other words there is restricted access to your dataset, other researchers must be able to view the terms of use. These must be findable online, e.g. through the website of your institute, or the catalogue or repository.

The terms of use have to be made available by your institute or research group and should not be personal. Other researchers should be able to find out who they need to contact if they want to make use of the data.

The legal status of the licenses and conditions for reusing the data have to be clear. You can use international standards for the terms of use, or you can formulate them yourself together with a legal advisor.

Terms of Use - UM/MUMC+

A Terms of Use Agreement sets the rules to which users must agree in order to use your data collection. A useful element of a terms of use agreement can be a clause covering intellectual property.

At the moment no sample Terms of Use template is available within UM and MUMC+. Terms of use should therefore be drawn up by a legal advisor. Within the UM you can contact:

Creative Commons License

Using a Creative Commons license, the researcher can easily capture basic terms of use for their data. The licenses come in six different flavors. The online tool makes it easy to select the license that best fits your project.
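
How the six Creative Commons licenses follow from a few yes/no choices can be sketched as below. The function `choose_cc_license` and its parameter names are hypothetical and only loosely mirror the kind of questions a license chooser asks; the sketch is no substitute for the online tool or for legal advice.

```python
def choose_cc_license(allow_commercial: bool,
                      allow_derivatives: bool,
                      share_alike: bool = False) -> str:
    """Map three choices onto one of the six Creative Commons licenses.

    Attribution (BY) is part of all six licenses. ShareAlike (SA) only
    makes sense when derivatives are allowed, so that combination is rejected.
    """
    if share_alike and not allow_derivatives:
        raise ValueError("ShareAlike requires that derivatives are allowed")
    parts = ["BY"]
    if not allow_commercial:
        parts.append("NC")   # NonCommercial
    if not allow_derivatives:
        parts.append("ND")   # NoDerivatives
    elif share_alike:
        parts.append("SA")   # ShareAlike
    return "CC " + "-".join(parts)
```

For example, allowing commercial use and derivatives without further conditions yields CC BY, while forbidding both yields CC BY-NC-ND.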

5.7f In the terms of use restricting access to my data, I have included at least the following:

FAIR Principle R1.1

(Meta)data are released with a clear and accessible data usage license.

Examples of Terms of Use for Reusing Data

  • An appointed committee will decide on the approval of future data requests and will consist of the following persons, being XXXXXX;

  • The purpose of the data access is scientific and not in any way commercial;

  • The purpose of the data request needs to be of social interest. The applicant's research questions need to be related to the research topic of the original research;

  • The dataset may not be linked to an external dataset (due to privacy);

  • Use of the dataset is not unlimited, but only permitted for a predetermined period: the time required for the analyses, up to a maximum of 3 months (and one time only, for this specific question by this applicant);

  • If the purpose of data access is scientific publication, prior agreements must be made regarding co-authorship, access to the final manuscript and the possibility to stop publication of the data for a period of up to 3 months if no agreement can be reached between the researchers and the applicant.

5.8a I will select an archive or repository for (certified) long-term archiving of my data collection once the project has ended

FAIR Principle A1

(Meta)data are retrievable by their identifier using a standardised communications protocol.

At the end of your project, you are required to deposit your data sustainably in a data repository (or a data archive). Preferably, this should be a certified repository (CoreTrustSeal). A certified repository ensures that data can be shared in the long run.

It is recommended that you use a repository provided by UM/MUMC+. If you are required by a co-funder or a scientific journal to use an international repository, you are free to do so. However, it is preferred that you register your dataset in at least one of the repositories provided by your institution. Contact the data steward of your faculty or department for more information.

DataverseNL

DataverseNL is an online research data repository to register, store and share research data. Every researcher at UM can make use of DataverseNL during the research period and up to the prescribed term of ten years after the last publication based on the data. For questions and support, contact the University Library Research Data Management specialists.

Maastricht Data Repository

Datasets from your project can be deposited in the Maastricht Data Repository, offered by DataHub Maastricht. The Maastricht Data Repository provides services on sustainability and access during and after the research project. For long-term storage, an agreement will be made on storing the data at DataHub Maastricht, possibly in combination with an external domain-specific repository. Data in the Maastricht Data Repository will be stored in accordance with funder and university data policies.

The Maastricht Data Repository has an access management layer available within the infrastructure. Only people with the correct permissions will be able to upload new data and access existing data. After an agreed period, the data will be made available to the DataHub Maastricht community or beyond. Sensitive personal data on human subjects will not be made available, for ethical/privacy reasons. For more information please contact DataHub:

5.9a Once the project has ended, I will ensure that all data (digital and paper), software codes and research materials, published or unpublished, are managed and securely stored.

UM Code of Conduct

Maastricht University has established an Integrity Code of Conduct based on the codes of conduct applied by the Association of Universities in the Netherlands (VSNU) in the fields of education, research and management.

To the extent that a longer term is not required by a law, rule, contract, subsidy or faculty guideline, all research results must be stored for a period of at least ten years after the final publication of the relevant data.

MUMC+ Research Code

If your research falls outside the scope of medical scientific research with human beings (WMO), it is recommended to store your data for 10 years. If your research falls under the WMO, you have to keep the data for 15 years after the last publication based on those data.

Informed Consent

You are required to keep the informed consent forms for at least 15 years after the inclusion of the final participant, or for as long as your dataset is stored.
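
The retention terms for research data described above (10 years, or 15 years for WMO research, counted from the last publication) can be worked out with a small calculation. The sketch below is only an illustration: the function names are hypothetical, it covers the data and not the informed consent forms, and any longer term imposed by law, contract or funder takes precedence.

```python
from datetime import date


def add_years(start: date, years: int) -> date:
    """Return the same calendar day a number of years later."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 28 February.
        return start.replace(year=start.year + years, day=28)


def retention_end(last_publication: date, wmo_research: bool) -> date:
    """Earliest date on which the minimum retention term has passed:
    15 years for WMO research, otherwise 10 years."""
    years = 15 if wmo_research else 10
    return add_years(last_publication, years)


# Hypothetical publication date, for illustration only.
non_wmo_until = retention_end(date(2024, 3, 1), wmo_research=False)
wmo_until = retention_end(date(2024, 3, 1), wmo_research=True)
```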

5.10a Once the project has ended and the data have been selected, I can make an estimate of the size of the data collection (in GB/TB) to be preserved for storage or archival

The volume of data is determined by the type of data, for example:

  • (alpha)numeric data

  • video footage, images

  • audio files

The number of participants is also of great importance. In addition, you have to take into account the reproducibility of the research and data: all the material needed to reproduce the research (data, code and software) must be saved. An estimate is sufficient, for instance <1 GB.
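
If the selected data are already gathered in one folder, a rough size estimate can be obtained with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and sizes are reported in decimal units (1 GB = 1000³ bytes), which is an assumption; some storage providers count in binary units instead.

```python
from pathlib import Path


def collection_size_bytes(root: Path) -> int:
    """Sum the sizes of all files below a folder, including subfolders."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def format_size(num_bytes: int) -> str:
    """Express a byte count as a rough estimate in GB or TB."""
    gigabytes = num_bytes / 1000**3
    if gigabytes < 1:
        return "<1 GB"
    if gigabytes < 1000:
        return f"{gigabytes:.1f} GB"
    return f"{gigabytes / 1000:.1f} TB"
```

Calling `format_size(collection_size_bytes(Path("path/to/selected/data")))` with the location of your own data folder gives a figure that can be entered directly in the plan.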

5.12a Once the project has ended and the data have been selected, I can make an estimation of the costs involved for storage

Answer when using the Maastricht Data Repository

The cost for data stored in the Maastricht Data Repository is calculated on a price per GB per year. Another factor that influences the cost is the number of replicas that you require for your data.

We currently offer data replication to two geographically separated storage backends. The latest information with regards to costs can be found on the Maastricht Data Repository website (scroll down almost to the bottom of the page).
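
A rough cost estimate can then be sketched as size × price per GB per year × number of replicas × number of years. Note the assumptions: that the cost scales linearly with the number of replicas is a guess made for this example, and the price used below is made up; always use the actual figures from the Maastricht Data Repository website.

```python
def estimate_storage_cost(size_gb: float,
                          price_per_gb_year: float,
                          replicas: int,
                          years: int) -> float:
    """Estimate total storage cost (hypothetical helper).

    Assumes every replica is charged at the same price per GB per year.
    """
    if size_gb < 0 or price_per_gb_year < 0 or replicas < 1 or years < 0:
        raise ValueError("Inputs must be non-negative, with at least one replica")
    return size_gb * price_per_gb_year * replicas * years


# Made-up figures: 200 GB, 0.50 per GB per year, 2 replicas, 10 years.
total = estimate_storage_cost(200, 0.50, replicas=2, years=10)
```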