Data Production |
- What type(s) of data and datasets will be produced? Will it be video data, traditional numerical data, electronic lab notebooks, software, other kinds of datasets?
- Will the data include human data? Will deidentification and/or anonymization be required?
- In what file format(s) will the data be saved? Are those file formats proprietary? Will they degrade or become unreadable over time?
- Will the data be reproducible?
- Do you need tools or software to create/process/visualize the data?
|
Data Size |
- How much data will be gathered, and at what growth rate?
- How often will the data change?
|
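The two Data Size questions can be turned into a rough storage estimate. The sketch below is illustrative only: the function names and the figures in the usage note are hypothetical, and it assumes growth is either a constant amount per month or a constant percentage per month, which real projects may not follow.

```python
def project_storage(initial_gb: float, monthly_growth_gb: float, months: int) -> float:
    """Linear projection: starting volume plus a constant amount added each month."""
    return initial_gb + monthly_growth_gb * months


def project_storage_compound(initial_gb: float, monthly_rate: float, months: int) -> float:
    """Compound projection: volume grows by a fixed fraction (e.g. 0.05 = 5%) each month."""
    return initial_gb * (1 + monthly_rate) ** months
```

For example, a project that starts with 500 GB and adds 50 GB per month would need about `project_storage(500, 50, 36)` = 2300 GB after three years, before counting backup copies.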
Data Transfer |
- How will the datasets be moved from local storage to long-term storage or from lab servers to other types of storage?
|
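Whatever transfer method is chosen, it helps to confirm that files arrive unchanged. One common approach, shown here as a minimal sketch rather than a prescribed procedure, is to compare checksums before and after the move. The helper names `sha256_of` and `verify_transfer` are hypothetical, and the sketch assumes both copies are reachable as local file paths.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(source: Path, destination: Path) -> bool:
    """Return True when the source and destination files have identical contents."""
    return sha256_of(source) == sha256_of(destination)
```

Recording the digest alongside each dataset also gives future users a way to check that a downloaded copy is intact.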
Data Usage |
- Who will potentially be using your data, both now and later?
|
Data Retention |
- How long should the data be retained (e.g., 3-5 years, 10-20 years, permanently)?
- Does your institution have a data retention policy?
- What is your long-term plan for your data, especially once the research is concluded?
|
Privacy and Security |
- Does your data have any special privacy or security requirements? (e.g., human data, personal data, and high-security data are all restricted types of data).
|
Data Sharing |
- Are there any sharing requirements (e.g., funder data sharing policy, federal requirements such as the NIH guidelines)?
- Have you chosen a repository in which to archive your data?
- If your data is sensitive (e.g., human data, personal data), can the repository properly handle that data?
|
Data Management Plan |
- Does your funding agency require a data management plan in the grant proposal?
|
Costs |
- Do you need to include the following costs in your Data Management Plan?
  - Library data management assistance, up to and including an embedded Data Management Librarian in your research team
  - Repository fees (e.g., uploading your data, long-term curation)
  - Anonymization and deidentification fees
|
Data Documentation |
- How will you be documenting your data and project?
- What directory and file naming convention will be used?
- What project and data identifiers will be assigned?
- Is there a schema, ontological, or other metadata standard in your field for sharing data with others?
- Do you have a proper README file to explain all of your datasets, code, codebooks, and other files?
- Are all abbreviations, terms and labels defined so that future researchers can identify all parts of your data?
- Do you have a file that documents all of the repositories and other places where your datasets and associated files are stored, including any needed software to access the datasets and files?
- Is everything documented clearly enough that a future researcher, with no knowledge of your work, would be able to duplicate your work, with all of the same processes, variables, and constraints, get the same results (within an acceptable margin of error), and easily explain any differences in results?
|
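A file naming convention is easier to keep when it can be checked automatically. The sketch below assumes a purely hypothetical convention, `<project>_<YYYYMMDD>_<description>_v<NN>.<ext>`, in lowercase; the pattern and the function name `parse_filename` are illustrations, not a standard, and should be adapted to whatever convention your group adopts.

```python
import re
from typing import Optional

# Hypothetical convention: project_YYYYMMDD_description_vNN.ext
# Example: soilstudy_20240315_ph-readings_v02.csv
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_"      # short project identifier
    r"(?P<date>\d{8})_"              # eight digits; calendar validity is not checked
    r"(?P<description>[a-z0-9-]+)_"  # hyphen-separated description
    r"v(?P<version>\d{2})"           # two-digit version number
    r"\.(?P<ext>[a-z0-9]+)$"         # file extension
)


def parse_filename(name: str) -> Optional[dict]:
    """Return the named parts of a conforming file name, or None if it does not conform."""
    match = NAME_PATTERN.match(name)
    return match.groupdict() if match else None
```

Running `parse_filename` over a directory listing before depositing data gives a quick list of files that need renaming, and the parsed fields can be reused when writing the README.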
Storage and Backup |
- What are the strategies for storage and backup of the data?
- Are you aware of the backup support available to you?
- Which repositories will you use for your data? Can they handle the type of datasets that you need stored?
- Are you using one repository or several (e.g., Dryad, GitHub, Vivli)?
|
Training |
- Will the team need training in data management best practices, working with metadata, making the datasets sharable and reproducible, or other data management topics?
|
Publication |
- When and where will the work be published?
|
Responsibility |
- Who in the research group will be responsible for data management?
- Who controls the data (PI, student, lab, institution, funder)?
|