
Data Management

NYU Health Sciences Library (2012).  Data sharing and management snafu in 3 short acts [Video].  https://youtu.be/66oNv_DJuPc?si=GTMhs1TKJCQCHUm0

As the video shows, good data management practices and a solid Data Management Plan (DMP) are vital to reproducible research and to supporting new research. Researchers undertake data management planning before a project begins by creating a workflow for how data will be gathered, stored, archived, and eventually weeded. The DMP lays out this plan clearly: it describes the data that will be collected, how it will be gathered, the software or technology used for its analysis, and who is responsible for its stewardship.

Templates and Examples of DMPs

Templates


Examples of Data Management Plans

Data Management Checklist

The following questions can help you begin thinking about how you will manage your data, and the answers will be useful for developing the content of a quality data management plan.

Data Production
  • What type(s) of data and datasets will be produced? Will it be video data, traditional numerical data, electronic lab notebooks, software, other kinds of datasets?
  • Will the data include human data? Will deidentification and/or anonymization be required?
  • What file format(s) will the data be saved as? Are those file formats proprietary? Will they degrade?
  • Will the data be reproducible?
  • Do you need tools or software to create/process/visualize the data?
Data Size
  • How much data will be gathered, and at what growth rate?
  • How often will the data change?
Data Transfer
  • How will the datasets be moved from local storage to long-term storage or from lab servers to other types of storage?
Data Usage
  • Who might use your data, both now and later?
Data Retention
  • How long should the data be retained (e.g., 3-5 years, 10-20 years, permanently)?
  • Does your institution have a data retention policy?
  • What is your long-term plan for your data, especially once the research is concluded?
Privacy and Security
  • Does your data have any special privacy or security requirements? (Human data, personal data, and high-security data are all restricted types of data.)
Data Sharing
  • Are there any sharing requirements (e.g., a funder data sharing policy, federal requirements such as the NIH guidelines)?
  • Have you chosen a repository in which to archive your data?
  • If your data is sensitive (e.g. human data, personal data), can the repository properly handle that data?
Data Management Plan
  • Does your funding agency require a data management plan in the grant proposal?
Costs
  • Do you need to include the following costs in your Data Management Plan?
    • Library data management assistance, up to and including an embedded Data Management Librarian in your research team
    • Repository fees (e.g. uploading your data, long term curation)
    • Anonymization and deidentification fees
Data Documentation
  • How will you be documenting your data and project?
  • What directory and file naming convention will be used?
  • What project and data identifiers will be assigned?
  • Is there a schema, ontological, or other metadata standard in your field for sharing data with others?
  • Do you have a proper README file to explain all of your datasets, codes, codebooks, and other files?
  • Do you have a file that documents all of the repositories and other places where your datasets and associated files are stored, including any needed software to access the datasets and files?
Storage and Backup
  • What are the strategies for storage and backup of the data?
  • Are you aware of the backup support available to you?
  • Which repositories will you use for your data? Can they handle the type of datasets that you need stored?
  • Are you using one repository or several (e.g., Dryad, GitHub, Vivli)?
Training
  • Will the team need training in data management best practices, working with metadata, making the datasets sharable and reproducible, or other data management topics?
Publication
  • When and where will the work be published?
Responsibility
  • Who in the research group will be responsible for data management?
  • Who controls the data (PI, student, lab, institution, funder)?
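
Two of the checklist questions above, how much data will be gathered and at what growth rate, and what file naming convention will be used, can be worked through with a short script. The following is a minimal sketch in Python, not a recommended standard. The function names, the linear growth assumption, and the naming pattern (project ID, description, collection date, version) are all hypothetical and should be adapted to your field and your institution's policies.

```python
from datetime import date


def projected_storage_gb(initial_gb, monthly_growth_gb, collection_months, copies=1):
    """Estimate total storage needed once data collection ends.

    Assumes (hypothetically) that data grows by a fixed amount each month
    and that every backup copy is a full copy of the data.
    """
    total_one_copy = initial_gb + monthly_growth_gb * collection_months
    return total_one_copy * copies


def build_filename(project_id, description, collected_on, version, extension):
    """Build a file name following one possible (hypothetical) convention:

    <project_id>_<description>_<YYYYMMDD>_v<NN>.<extension>
    """
    # Lowercase the description and join its words with hyphens
    # so the name contains no spaces.
    slug = "-".join(description.lower().split())
    return f"{project_id}_{slug}_{collected_on:%Y%m%d}_v{version:02d}.{extension}"


if __name__ == "__main__":
    # Example values are illustrative only.
    needed = projected_storage_gb(
        initial_gb=50, monthly_growth_gb=20, collection_months=36, copies=3
    )
    print(f"Estimated storage: {needed} GB")

    name = build_filename("PROJ01", "Soil Samples", date(2024, 1, 15), 2, "csv")
    print(name)
```

An estimate like this can inform the Costs and Storage and Backup questions, and a naming function like this can be shared with the whole team so that files are named consistently.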

 

Source:  Florida Institute of Technology, Evans Library; Used with permission.

Last updated on Dec 2, 2024 9:53 AM