Appendix 1: Criteria for data storage, archiving and registration and whether these criteria were considered as Must-have, Should-have, Could-have, or Won’t-have (MOSCOW).
| Criteria | MOSCOW |
| Data storage | |
| No unauthorised access is possible. | M |
| A strong password policy is in place. | M |
| The storage solution allows the encryption of sensitive data files. | M |
| The storage solution has an access rights system, with user accounts, roles, and authentication. | M |
| The stored data is protected against physical access and disasters (e.g. fire), and has an emergency power system. | M |
| Data removed by accident can be retrieved for one year (through a recycle bin, previous versions, a daily back-up that is kept for a longer period, etc.). | M |
| Data can be recovered in case of a disaster (fire, flood, hardware crash, etc.). | M |
| The storage solution allows virus scanning. | S |
| The storage solution offers an availability guarantee of >99.5%. | M |
| The storage solution can safely keep data of all four information confidentiality levels: open, internal, confidential, and secret. | M |
| The storage solution allows the storing of files up to terabytes (TBs) in size. | M |
| The system can group data files visibly into collections (datasets, collections of datasets, etc.), and it must be possible to store a file in more than one collection. | C |
| The storage solution enables the sharing of data files. | M |
| It is possible for an administrator to give other individuals certain roles with rights. | M |
| Admins can link files to other users. | M |
| The storage solution provides a version control system. | S |
| The data are accessible from any operating system (Windows, Linux, Mac OS). | M |
| The data are accessible from any smart device (Smartphone, Tablet). | C |
| The data can be accessed at any time. | M |
| The data can be accessed from anywhere. | M |
| It is possible to do a full-text search of the content. | C |
| It is possible to work together on data files simultaneously. | W |
| Data archiving | |
| Preferred formats are: .pdf .txt .sgm(l) .xml .jpg .tif .wav .shp (and other ESRI) .csx .tab .nc (and .cdf). | M |
| File formats are migrated in case of obsoleteness. | S |
| Linked Data format can be used. | C |
| MD5-type checksums are carried out at data file and metadata file level, and replacements are made in case of degradation. | M |
| The archive has a back-up and recovery system in place. | M |
| The archive provides datasets with persistent identifiers. | M |
| It is possible to establish links between datasets and publications. | M |
| Preservation time is at least 10 years. | M |
| The archive provides metadata fields so that datasets can be described and found. | M |
| Data documentation can be added to the dataset (a description of methods and techniques used to collect and analyse the data). | M |
| Published data are indexed in Google (and in other search engines). | S |
| Restricted access to datasets with end user licence is possible. | M |
| The archive offers an access rights system, with user accounts, roles, and authentication. | M |
| OAI harvesting of metadata is possible. | S |
| The archive is accurate, complete, authentic and reliable. | M |
| The archive provides clear guidelines for data citation. | S |
| The archive shows the number of downloads of datasets. | C |
| The archive shows who downloaded which datasets. | C |
| Quality of data entry, data storage, and data processing is controlled. | M |
| Data registration | |
| The registration system allows the establishing of links between articles and datasets. | M |
| It is possible to set up links between the datasets and other organizational records. | M |
| It is possible to provide URLs to the location of the datasets. | M |
| Bibliographical metadata can be added to datasets. | M |
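
The checksum criterion under Data archiving can be illustrated with a minimal sketch. It assumes that the archive keeps a manifest mapping each data file and metadata file to the MD5 digest recorded at deposit; the function names `md5_of_file` and `find_degraded` and the manifest layout are hypothetical and do not describe any particular archive. Files that are missing or whose digest no longer matches are reported, so that they can be replaced from a back-up.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read fixed-size chunks so that very large files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_degraded(manifest: dict[str, str], root: Path) -> list[str]:
    """Return the manifest entries that are missing or no longer match.

    `manifest` maps a path relative to `root` to the digest recorded at
    deposit (an assumed layout, used here only for illustration).
    """
    degraded = []
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file() or md5_of_file(target) != expected:
            degraded.append(relative_path)
    return sorted(degraded)
```

Run periodically over both the data files and their metadata files, a check of this kind produces the list of files to restore. MD5 detects accidental degradation; it is not suited to detecting deliberate tampering.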
Appendix 2: The interview questions used.
| Category | Question |
| General questions | What kind of research do you do? |
| | What kind of data do you work with (raw and analysed)? |
| | What file formats do you use for these data? |
| | Do you work with sensitive data? If so, how do you deal with this (e.g. anonymisation)? |
| | Do you use data from external sources? If so, have you made certain agreements as to how you can use these data? |
| | What is the confidentiality level of your data (open, internal, confidential, or secret)? |
| Storage of data | Where do you store your data? Is there a difference in the storage solution used for raw data, analysed data, and data from external sources? Do you store all the data that you generate/use? |
| | How much capacity do you need for storing your data, both during and after the research? |
| | How often do you make back-ups during the research? How and where do you do this? |
| | How long do you store data during the research (‘active data’)? |
| | How long do you keep the data after the research has finished? |
| | Do you ever destroy data after the research has finished? If so, how? |
| | Do you also take the value of data into account when you decide on data storage, e.g. more back-ups or safer storage solutions for data that are more difficult to reproduce? If so, how? |
| | Is it possible for other people to access and change the data after the research has finished? If so, how? |
| | What kind of documentation, if any, do you add to your data? |
| Accessibility of data | How many people work on one dataset during the research? Do you also share data externally, or only internally? |
| | How does this sharing work (e.g. cloud storage, e-mail)? |
| | Do you use metadata to describe datasets? If so, what kind (discipline-specific or general standards)? |
| | Once the research has finished, do you save the data or metadata somewhere where other people can find them? If so, where? And are the data and/or metadata searchable there? |
| | Do you want your data and/or metadata to be findable for everybody, or only researchers in your own discipline? |
| | Are you ever asked to share your data with external parties? If so, how do you do this? |
| | Do you use licences or other agreements when you share your data with others? |
| | Who has ownership rights over the data? If you collaborate with other (commercial) organisations and/or countries, how does this influence ownership of the data? |
| General RDM support questions | Are there any data management practices that are going well, or not so well? |
| | Are you aware of the available data support provided by Wageningen University & Research, i.e. in terms of data storage, archiving and registration? |
| | If so, how (if at all) do you use these services? |