A Data Management Plan (DMP) serves two main purposes:
These are the general components of a data management plan:
Best practices for dealing with data on a daily basis:
Most public funding agencies require a DMP of some sort in the grant application process.
NIH and NSF policies
Supplemental information for new NIH Data Sharing Policy
What kind of data is going to be recorded?
There are lots of possible kinds of data: recorded observations, text files from a corpus of texts, image files, geospatial data, audio and video files, even physical specimens. Give some context for where the data is coming from and how you are recording it.
What format is the data in?
Your data might also be in any of a number of formats. We encourage you to use open formats, which many programs can open, rather than formats tied to proprietary software (more info in the Standards section). Note which formats you are using.
A DMP shows your funders that you are going to follow good procedures, and it is also a chance to think through those procedures before you start and, hopefully, avoid mistakes!
What format(s) will your data be collected and stored in?
Explain what the data will be, when it is going to be collected, and why you are choosing these formats. Open formats are better than proprietary ones because more programs can open and edit them. For instance, for storing tabular or spreadsheet data, .csv files are preferred over .xlsx files (from Excel), because .csv files can be opened by both text editors and spreadsheet programs.
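To illustrate why .csv is easy to work with, here is a minimal sketch that writes a small table to CSV text and reads it back using only Python's standard library. The rows and column names (`sample_id`, `collected_on`, `temperature_c`) are hypothetical placeholders, not a recommended schema.

```python
import csv
import io

# Hypothetical observations; the column names are illustrative only.
rows = [
    {"sample_id": "S001", "collected_on": "2024-03-01", "temperature_c": 21.5},
    {"sample_id": "S002", "collected_on": "2024-03-02", "temperature_c": 19.8},
]


def to_csv_text(records):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv_text(rows)
print(csv_text)  # plain text: readable in any text editor

# Reading it back needs nothing beyond the standard library.
# Note that every value comes back as a string.
parsed = list(csv.DictReader(io.StringIO(csv_text)))
```

Because CSV stores every value as text, it helps to record the intended type and unit of each column in your documentation.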
What volume of data (MB/GB/TB) do you expect to collect?
Include estimates of raw data, processed data and other outputs.
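A rough estimate is usually enough for a DMP. The sketch below shows one way to do the arithmetic, assuming the raw data arrives as files of roughly uniform size. The function name, its parameters, and all the numbers are hypothetical; substitute figures from your own project.

```python
def estimate_volume_gb(files_per_session, mb_per_file, sessions,
                       processed_ratio=0.5, outputs_gb=0.0):
    """Estimate storage needs in GB (using 1 GB = 1000 MB).

    processed_ratio is an assumed size of processed data relative to
    raw data; outputs_gb covers figures, reports, and other outputs.
    """
    raw_gb = files_per_session * mb_per_file * sessions / 1000
    processed_gb = raw_gb * processed_ratio
    return {
        "raw_gb": raw_gb,
        "processed_gb": processed_gb,
        "outputs_gb": outputs_gb,
        "total_gb": raw_gb + processed_gb + outputs_gb,
    }


# Hypothetical project: 40 sessions of 200 image files at about 25 MB each.
estimate = estimate_volume_gb(
    files_per_session=200, mb_per_file=25, sessions=40,
    processed_ratio=0.5, outputs_gb=10,
)
print(estimate)
```

Reporting the raw, processed, and output figures separately makes it easier to revise one part of the estimate later.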
What metadata will you provide to augment and inform your data?
Plan to include documentation of your methodology and analysis, a data dictionary, and a record of the hardware and software used.
The Research Data Alliance has a directory of metadata standards you can use as a template.
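A data dictionary can be as simple as a structured file that describes each column. The following is a minimal sketch, not a formal metadata standard: the field names (`type`, `unit`, `description`) and the example columns are assumptions chosen for illustration. It also includes a small, hypothetical helper that flags columns missing from the dictionary.

```python
import json

# Hypothetical data dictionary: one entry per column in the dataset.
data_dictionary = {
    "sample_id": {
        "type": "string",
        "unit": None,
        "description": "Unique identifier assigned to each sample.",
    },
    "temperature_c": {
        "type": "float",
        "unit": "degrees Celsius",
        "description": "Air temperature at the time of collection.",
    },
}


def undocumented_columns(header, dictionary):
    """Return the columns in a file header that have no dictionary entry."""
    return [column for column in header if column not in dictionary]


# Save the dictionary alongside the data as plain JSON text.
dictionary_json = json.dumps(data_dictionary, indent=2)
print(dictionary_json)

# Check a (hypothetical) file header against the dictionary.
missing = undocumented_columns(
    ["sample_id", "temperature_c", "humidity"], data_dictionary
)
print(missing)
```

Storing the dictionary in an open text format such as JSON keeps it readable by the same wide range of tools as the data itself.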
Explain how the data will be stored during the research project.
If there are different policies at different sites or stages of the project, explain each of them.
If you are collecting Personal Health Information (PHI), how are you ensuring that it is stored securely and that participants' privacy is protected?
Note how data will be backed up and who will be responsible for that.
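One common way to confirm that a backup is intact is to compare checksums of the original and the copy. The sketch below uses SHA-256 from Python's standard library; the helper names (`sha256_of`, `backup_matches`) and the file names are hypothetical, and the demonstration runs in a temporary directory rather than on real project data.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original, backup):
    """Return True when both files have identical contents."""
    return sha256_of(original) == sha256_of(backup)


# Demonstration with throwaway files.
with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "observations.csv"
    backup = Path(workdir) / "observations_backup.csv"
    original.write_text("sample_id,temperature_c\nS001,21.5\n")

    shutil.copyfile(original, backup)
    intact = backup_matches(original, backup)

    # Simulate a damaged backup by changing its contents.
    backup.write_text("sample_id,temperature_c\nS001,99.9\n")
    corrupted_detected = not backup_matches(original, backup)

print(intact, corrupted_detected)
```

A matching checksum only shows that the copy is identical to the original at that moment, so the check is worth repeating on a schedule by whoever is responsible for backups.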
Explain how the data will be shared after publication, for instance in a subject-specific or institutional repository. If PHI or other ethical issues are involved, how will they inform your data sharing plan?
Make clear who owns the copyright and intellectual property rights to the data.
As with storage and security, consider how the data will be stored and shared securely and safely.
Who can reuse it and how?
How will the data be formatted and shared to make it useful for reuse?
Are you using non-proprietary data formats (.csv, for example)? Are you including instructions on how to interpret the data?
Explain how the data will be preserved for future use.
Are you depositing the data in a repository that will provide preservation services?
How are you preserving physical data specimens and records?
Are you requesting money in your grant to cover preservation costs?
It's better to ask for money up front than to scramble at the end of the project for access to good preservation of your data, which many funders require.