Appendix 1: Criteria for data storage, archiving and registration, and their classification as Must-have (M), Should-have (S), Could-have (C), or Won’t-have (W) according to the MoSCoW method.
Criteria | MoSCoW |
Data storage | |
No unauthorised access is possible. | M |
A strong password policy is in place. | M |
The storage solution allows the encryption of sensitive data files. | M |
The storage solution has an access rights system, with user accounts, roles, and authentication. | M |
The stored data is protected against physical access and disasters (e.g. fire), and the storage solution has an emergency power system. | M |
Data that is removed by accident can be retrieved for one year (e.g. via a recycle bin, previous versions, or a daily back-up that is kept for a longer period). | M |
Data can be recovered in case of a disaster (fire, flood, hardware crash, etc.). | M |
The storage solution allows virus scanning. | S |
The storage solution offers an availability guarantee of >99.5%. | M |
The storage solution can safely keep data of all four information confidentiality levels: open, internal, confidential, and secret. | M |
The storage solution allows the storage of files up to terabytes (TB) in size. | M |
The system can group data files visibly into collections (datasets, collections of datasets, etc.), and it must be possible to store a file in more than one collection. | C |
The storage solution enables the sharing of data files. | M |
It is possible for an administrator to give other individuals certain roles with rights. | M |
Admins can link files to other users. | M |
The storage solution provides a version control system. | S |
The data are accessible from any operating system (Windows, Linux, Mac OS). | M |
The data are accessible from any smart device (Smartphone, Tablet). | C |
The data can be accessed at any time. | M |
The data can be accessed from anywhere. | M |
It is possible to do a full-text search of the content. | C |
It is possible to work together on data files simultaneously. | W |
Data archiving | |
Preferred formats are: .pdf .txt .sgm(l) .xml .jpg .tif .wav .shp (and other ESRI) .csx .tab .nc (and .cdf). | M |
File formats are migrated in case of obsolescence. | S |
Linked Data format can be used. | C |
MD5-type checksums are carried out at data file and metadata file level, and files are replaced in case of degradation. | M |
The archive has a back-up and recovery system in place. | M |
The archive provides datasets with persistent identifiers. | M |
It is possible to establish links between datasets and publications. | M |
Preservation time is at least 10 years. | M |
The archive provides metadata fields so that datasets can be described and found. | M |
Data documentation can be added to the dataset (a description of methods and techniques used to collect and analyse the data). | M |
Published data are indexed in Google (and in other search engines). | S |
Restricted access to datasets with end user licence is possible. | M |
The archive offers an access rights system, with user accounts, roles, and authentication. | M |
OAI harvesting of metadata is possible. | S |
The archive is accurate, complete, authentic and reliable. | M |
The archive provides clear guidelines for data citation. | S |
The archive shows the number of downloads of datasets. | C |
The archive shows who downloaded which datasets. | C |
Quality of data entry, data storage, and data processing is controlled. | M |
Data registration | |
The registration system allows the establishing of links between articles and datasets. | M |
It is possible to set up links between the datasets and other organizational records. | M |
It is possible to provide URLs to the location of the datasets. | M |
Bibliographical metadata can be added to datasets. | M |
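
The checksum criterion under Data archiving (MD5-type checksums at data file and metadata file level) can be illustrated with a minimal sketch. It is not taken from any of the evaluated archives: the function names `md5_of_file` and `find_degraded` and the manifest structure (file path mapped to the checksum recorded at ingest) are assumptions made for illustration only.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks so large files fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_degraded(manifest):
    """Return the paths whose current checksum differs from the recorded one.

    `manifest` is a hypothetical mapping of file path -> MD5 recorded at ingest.
    It can list data files and metadata files alike. Missing files are also
    reported, since they need to be restored as well.
    """
    degraded = []
    for path, recorded in manifest.items():
        if not Path(path).is_file() or md5_of_file(path) != recorded:
            degraded.append(path)
    return degraded
```

An archive would run such a check periodically and restore every reported file from a back-up copy; the restore step depends on the back-up system in use and is left out of the sketch.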
Appendix 2: The interview questions used.
Category | Question |
General questions | What kind of research do you do? |
| What kind of data do you work with (raw and analysed)? |
| What file formats do you use for these data? |
| Do you work with sensitive data? If so, how do you deal with this (e.g. anonymisation)? |
| Do you use data of external sources? If so, have you made certain agreements as to how you can use this data? |
| What is the confidentiality level of your data (open, internal, confidential, or secret)? |
Storage of data | Where do you store your data? Is there a difference in the storage solution used for raw data, analysed data, and data of external sources? Do you store all the data that you generate/use? |
| How much capacity do you need for storing your data, both during and after the research? |
| How often do you make back-ups during the research? How and where do you do this? |
| How long do you store data during the research (‘active data’)? |
| How long do you keep the data after the research has finished? |
| Do you ever destroy data after the research has finished? If so, how? |
| Do you also take the value of data into account when you decide on data storage, e.g. more back-ups or safer storage solutions for data that are more difficult to reproduce? If so, how? |
| Is it possible for other people to access and change the data after the research has finished? If so, how? |
| What kind of documentation, if any, do you add to your data? |
Accessibility of data | How many people work on one dataset during the research? Do you also share data externally, or only internally? |
| How does this sharing work (e.g. cloud storage, e-mail)? |
| Do you use metadata to describe datasets? If so, what kind (discipline-specific or general standards)? |
| Once the research has finished, do you save the data or metadata somewhere where other people can find them? If so, where? And are the data and/or metadata searchable there? |
| Do you want your data and/or metadata to be findable for everybody, or only researchers in your own discipline? |
| Are you ever asked to share your data with external parties? If so, how do you do this? |
| Do you use licences or other agreements when you share your data with others? |
| Who has ownership rights over the data? If you collaborate with other (commercial) organisations and/or countries, how does this influence ownership of the data? |
General RDM support questions | Are there any data management practices that are going well, or not so well? |
| Are you aware of the available data support provided by Wageningen University & Research, i.e. in terms of data storage, archiving and registration? |
| If so, how (if at all) do you use these services? |