4 Data Management Plan


Figure 3.1: Data management plan in the research project life cycle

4.1 History and purpose

Since 2013, and even earlier in the case of the National Science Foundation, most federal agencies that education researchers work with have required a data management plan (DMP) as part of their funding application. Although these plans focus mostly on the eventual outcome of data sharing, a DMP is also a means of ensuring that researchers plan thoughtfully for a study whose data can be shared with confidence, free from errors, uncertainty, or violations of confidentiality. President Obama’s May 2013 Executive Order declared that “the default state of new and modernized government information resources shall be open and machine readable.”48 In August of 2022, the Office of Science and Technology Policy (OSTP) reinforced its data sharing policy and issued a memorandum stating that all federal agencies must update their public access policies no later than December 31, 2025, to make federally funded publications and their supporting data accessible to the public with no embargo on their release.49 Even sooner than this, organizations like the National Institutes of Health have mandated that grant applicants, beginning January 2023, must submit a plan for both managing and sharing project data.50

4.1.1 Why are DMPs important?

Funding agencies see DMPs as important in maximizing scientific outputs from investments and increasing transparency. Mandating data sharing for federally funded projects leads to many benefits, including accelerated discovery, greater collaboration, and greater trust among data creators and users. In addition to the benefits that funders see, there are intrinsic benefits that come from having to write a data management plan. Planning thoughtfully, and being transparent about that plan, leads to better data management. Knowing that they will eventually share their data and documentation with others outside of their team can motivate researchers to think hard about how to organize their data management practices in a way that produces data they trust to share with the outside world.51

4.2 What is it?

Generally, a data management plan is a supplemental 2-5 page document, submitted with your grant application, that contains details about how you plan to store, manage, and share your research data products. For most funders these DMPs are not part of the scoring process, but they are reviewed by a panel or program officer. Some funders may provide feedback or ask for revisions if they believe your plan and/or your budget and associated costs are not adequate.

4.2.1 What to include?

What to include in a DMP varies somewhat across funding agencies. While you should check each funding agency’s site for their specific DMP requirements, there are typically 10 common categories covered in a data management plan, plus an eleventh that is less commonly required.52 Those categories are:

  1. Roles and responsibilities
    • What are the staff roles in management and long-term preservation of data?
    • Who ensures accessibility, reliability, and quality of data?
    • Is there a plan if a core team member leaves the project or institution?
  2. Types of data
    • How is data captured? (Ex: surveys, assessments, observations)
    • Will data be item-level and summary scores?
    • Will you share raw data and clean data?
    • What is the expected number of files? What is the expected number of rows in each file?
  3. Format of data
    • Will data be in an electronic format?
    • Will it be provided in a non-proprietary format? (Ex: csv)
    • Will more than one format be provided? (Ex: sav and csv)
    • Are there any tools needed to manipulate shared data?
  4. Documentation
    • What metadata will you create? (Consider project level, dataset level and variable level metadata)
    • What format will your documentation be in? (Ex: xml, csv, pdf)
    • What other documentation do you plan to include when sharing data? (Ex: code, data collection instruments, protocols)
  5. Standards
    • Are there any data or documentation standards being used? (Ex: DDI)
  6. Method of data sharing
    • How will you share your data? (Ex: Institutional archive, data repository, PI website)
    • Will data be restricted and is a data enclave required?
    • Is a data use agreement required?
    • How will you license your data?
    • Will your data have persistent unique identifiers?
  7. Circumstances preventing data sharing
    • Do you have any data covered by FERPA/HIPAA that doesn’t allow data sharing?
    • Do you work with any partners that do not allow you to share data? (Ex: School districts, tribal regulations)
    • Are you working with proprietary data?
  8. Privacy and rights of participants
    • How will you prevent disclosure of personally identifiable information when you share data? How will you anonymize data (if applicable)?
    • Do participants sign informed consent agreements? Does the consent communicate how participant data are expected to be used and shared?
  9. Data security
    • How will you maintain participant privacy and confidentiality during your project?
    • How will you prevent unauthorized access of data?
    • Consider IRB requirements here.
  10. Schedule for data sharing
    • When will you share your study data and for how long?
  11. Pre-registration (less commonly required)
    • Where and when will you pre-register your study?
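
One way to keep track of these categories while drafting is a simple checklist. The sketch below is only an illustration, not a funder requirement: the category names come from the list above, while the function name `missing_categories`, the `draft` dictionary, and the decision to treat pre-registration as optional are assumptions made for the example.

```python
# Category names are taken from the list above; everything else is illustrative.
DMP_CATEGORIES = [
    "Roles and responsibilities",
    "Types of data",
    "Format of data",
    "Documentation",
    "Standards",
    "Method of data sharing",
    "Circumstances preventing data sharing",
    "Privacy and rights of participants",
    "Data security",
    "Schedule for data sharing",
    "Pre-registration",
]

# Assumption: pre-registration is treated as optional because the list above
# describes it as "less commonly required". Check your funder's requirements.
OPTIONAL_CATEGORIES = {"Pre-registration"}


def missing_categories(draft: dict[str, str]) -> list[str]:
    """Return required categories that have no text (or only whitespace) in the draft."""
    return [
        name
        for name in DMP_CATEGORIES
        if name not in OPTIONAL_CATEGORIES and not draft.get(name, "").strip()
    ]


# Hypothetical partial draft, keyed by category name.
draft = {
    "Roles and responsibilities": "The PI oversees data quality and preservation.",
    "Types of data": "Item-level student surveys and assessments.",
    "Format of data": "   ",  # whitespace only, so still counted as missing
}

for name in missing_categories(draft):
    print(f"Still to write: {name}")
```

A checklist like this only confirms that each category has some text; it says nothing about whether the text satisfies a particular funder.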

Again, the specifics of what should be included in each category will vary by funder. The following sites describe the DMP requirements of the four most common federal funders of education research.

  • Institute of Education Sciences53
  • National Institutes of Health54
  • National Institute of Justice55
  • National Science Foundation56

4.3 Getting help

When constructing your DMP, it may be important to enlist help. If you have a data manager or data team, you will almost certainly want to consult with them when writing your plan. If you work for a university system, your research data librarians are also excellent resources with a wealth of knowledge about writing comprehensive data management plans. If you plan to share your final data with a repository or institutional archive, you will want to contact that repository when writing your plan as well. The repository may have its own requirements for how and when data must be shared, and it is helpful to outline those guidelines in your data management plan at the time of submission. You can also name your repository specifically in your data management plan. Last, you may want to ask your colleagues for help. They have likely written DMPs before, and many people are willing to share their plans as a way to help others better understand what to include.

Your DMP is a living document, and you can update your plan at any point during your project or after its completion. It may be helpful to keep in contact with your program officer regarding any potential changes throughout your project.

If you are looking for guidance in writing a DMP, a variety of generic DMP templates for different federal agencies are available from the University of Virginia Library57. There is also a well-known free online application called the DMPTool58 that guides you in constructing a data management plan for many of the large funding agencies you might work with. The DMPTool site also has many searchable public DMPs that you can review for inspiration.

4.4 Budgeting

As briefly mentioned above, funding agencies acknowledge that there are costs associated with implementing your data management plan and allow you to explain these costs in your budget narrative. Costs associated with the entire data life cycle should be considered and may include data management personnel costs, fees, infrastructure, or tools needed to organize, document, store, and share study data.59 Make sure to review your funder’s documentation for information about allowable costs. Examples of potential allowable costs include:60

  • Costs associated with curating and de-identifying data
  • Costs associated with developing data documentation
  • Fees associated with depositing data for long-term sharing in a repository
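
As a rough illustration of how such line items might be tallied for a budget narrative, here is a minimal sketch. The `CostItem` class, the `budget_total` function, and every hour count, rate, and fee are made-up placeholders for the example, not real estimates or funder guidance.

```python
from dataclasses import dataclass


@dataclass
class CostItem:
    """One data management line item: personnel time, a flat fee, or both."""
    label: str
    hours: float = 0.0
    hourly_rate: float = 0.0
    flat_fee: float = 0.0

    @property
    def total(self) -> float:
        return self.hours * self.hourly_rate + self.flat_fee


def budget_total(items: list[CostItem]) -> float:
    """Sum the totals of all line items."""
    return sum(item.total for item in items)


# Placeholder figures only; replace with estimates for your own project.
items = [
    CostItem("Curating and de-identifying data", hours=40, hourly_rate=30.0),
    CostItem("Developing data documentation", hours=25, hourly_rate=30.0),
    CostItem("Repository deposit fee", flat_fee=500.0),
]

for item in items:
    print(f"{item.label}: ${item.total:,.2f}")
print(f"Total: ${budget_total(items):,.2f}")
```

Separating personnel time from flat fees keeps the assumptions behind each figure visible, which can make the budget narrative easier to justify.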

It can be difficult to estimate every cost associated with managing data. Luckily, a few organizations have developed resources to aid in estimating those costs. The UK Data Service61, the University of Twente62, Utrecht University63, and DataOne64 have put together checklists to help you think through your various potential data management costs.