Appendices

Appendix 1: Criteria for data storage, archiving and registration and whether these criteria were considered as Must-have, Should-have, Could-have, or Won’t-have (MOSCOW).


Criteria | MOSCOW

Data storage
No unauthorised access is possible. M
A strong password policy is in place. M
The storage solution allows the encryption of sensitive data files. M
The storage solution has an access rights system, with user accounts, roles, and authentication. M
The stored data are protected against physical access and disasters (e.g. fire), and the storage solution has an emergency power system. M
Data removed by accident can be retrieved for one year (e.g. from a recycle bin, previous versions, or daily back-ups that are kept for a longer period). M
Data can be recovered in case of a disaster (fire, flood, hardware crash, etc.). M
The storage solution allows virus scanning. S
The storage solution offers an availability guarantee of >99.5%. M
The storage solution can safely keep data of all four information confidentiality levels: open, internal, confidential, and secret. M
The storage solution allows the storage of files with sizes in the terabyte (TB) range. M
The system can group data files visibly into collections (datasets, collections of datasets, etc.), and it must be possible to store a file in more than one collection. C
The storage solution enables the sharing of data files. M
It is possible for an administrator to assign roles with specific rights to other individuals. M
Administrators can link files to other users. M
The storage solution provides a version control system. S
The data are accessible from any operating system (Windows, Linux, Mac OS). M
The data are accessible from any smart device (smartphone, tablet). C
The data can be accessed at any time. M
The data can be accessed from anywhere. M
It is possible to do a full-text search of the content. C
It is possible to work together on data files simultaneously. W
Data archiving
Preferred formats are: .pdf, .txt, .sgm(l), .xml, .jpg, .tif, .wav, .shp (and other ESRI formats), .csx, .tab, .nc (and .cdf). M
File formats are migrated when they become obsolete. S
Linked Data format can be used. C
MD5-type checksums are carried out at data file and metadata file level, and degraded files are replaced. M
The archive has a back-up and recovery system in place. M
The archive provides datasets with persistent identifiers. M
It is possible to establish links between datasets and publications. M
Preservation time is at least 10 years. M
The archive provides metadata fields so that datasets can be described and found. M
Data documentation can be added to the dataset (a description of methods and techniques used to collect and analyse the data). M
Published data are indexed in Google (and in other search engines). S
Restricted access to datasets with end user licence is possible. M
The archive offers an access rights system, with user accounts, roles, and authentication. M
OAI harvesting of metadata is possible. S
The archive is accurate, complete, authentic and reliable. M
The archive provides clear guidelines for data citation. S
The archive shows the number of downloads of datasets. C
The archive shows who downloaded which datasets. C
The quality of data entry, data storage, and data processing is controlled. M
Data registration
The registration system allows the establishing of links between articles and datasets. M
It is possible to set up links between the datasets and other organisational records. M
It is possible to provide URLs to the location of the datasets. M
Bibliographical metadata can be added to datasets. M
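
As an illustration of the checksum criterion under Data archiving (MD5-type checksums at data file and metadata file level), the following is a minimal sketch of how such a fixity check could work. It is not taken from any archive's actual implementation: the function names, the manifest structure (a mapping from relative file path to the MD5 digest recorded at deposit), and the chunk size are assumptions made for the example. Replacing a degraded file from a back-up copy is left out.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks so large files fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_degraded_files(manifest: dict[str, str], root: Path) -> list[str]:
    """Return the relative paths that are missing or whose MD5 differs from the manifest.

    `manifest` is a hypothetical structure: relative path -> digest stored at deposit.
    """
    degraded = []
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file() or md5_of_file(target) != expected:
            degraded.append(relative_path)
    return degraded
```

In such a set-up, the manifest would cover both the data files and the metadata files of a dataset, and any path returned by find_degraded_files would be a candidate for replacement from a back-up.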




Appendix 2: The interview questions used.


Category | Question

General questions
What kind of research do you do?
What kind of data do you work with (raw and analysed)?
What file formats do you use for these data?
Do you work with sensitive data? If so, how do you deal with this (e.g. anonymisation)?
Do you use data from external sources? If so, have you made agreements as to how you can use these data?
What is the confidentiality level of your data (open, internal, confidential, or secret)?
Storage of data
Where do you store your data? Is there a difference in the storage solution used for raw data, analysed data, and data from external sources? Do you store all the data that you generate/use?
How much capacity do you need for storing your data, both during and after the research?
How often do you make back-ups during the research? How and where do you do this?
How long do you store data during the research (‘active data’)?
How long do you keep the data after the research has finished?
Do you ever destroy data after the research has finished? If so, how?
Do you also take the value of data into account when you decide on data storage, e.g. more back-ups or safer storage solutions for data that are more difficult to reproduce? If so, how?
Is it possible for other people to access and change the data after the research has finished? If so, how?
What kind of documentation, if any, do you add to your data?
Accessibility of data
How many people work on one dataset during the research? Do you also share data externally, or only internally?
How does this sharing work (e.g. cloud storage, e-mail)?
Do you use metadata to describe datasets? If so, what kind (discipline-specific or general standards)?
Once the research has finished, do you save the data or metadata somewhere where other people can find them? If so, where? And are the data and/or metadata searchable there?
Do you want your data and/or metadata to be findable for everybody, or only researchers in your own discipline?
Are you ever asked to share your data with external parties? If so, how do you do this?
Do you use licences or other agreements when you share your data with others?
Who has ownership rights over the data? If you collaborate with other (commercial) organisations and/or countries, how does this influence ownership of the data?
General RDM support questions
Are there any data management practices that are going well, or not so well?
Are you aware of the available data support provided by Wageningen University & Research, i.e. in terms of data storage, archiving and registration?
If so, how (if at all) do you use these services?