10 Data Collection

Figure 10.1: Data collection in the research project life cycle

When collecting original data as part of your study (i.e., you are administering your own survey, assessment, or observation), data management best practices should be interwoven throughout your data collection process. The number one way to ensure the integrity of your data is to spend time planning your data collection efforts (Northern Illinois University n.d.). Not only does planning minimize errors, it also keeps your data secure and valid, and it relieves future data cleaning headaches.

If you have ever created a data collection instrument and expected it to export data that looks like the image on the left (Figure 10.2), but instead exported data that looks like the image on the right, then you know what I mean. Collecting quality data doesn’t just happen because you create an instrument. It takes careful consideration, structure, and planning on the part of the entire team.

Figure 10.2: A comparison of data collected without planning and data collected with planning

10.1 Planning

Planning for data collection begins in the Data Management Plan phase, when you first choose the measures you plan to collect as well as the instruments you plan to use to collect those measures. Then, during the planning and documentation phase, you can begin to flesh out the exact items you plan to collect, how those items will be recorded (e.g., type, name, values), and how they will be collected (e.g., online survey, paper form). Last, in the create instruments phase, you choose the tools you will build your instruments in and begin constructing those instruments. As shown in Figure 10.1, creating data collection instruments is typically a collaborative effort between the project management and data management team members.
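To make "how those items will be recorded (e.g., type, name, values)" concrete, here is a minimal sketch of how planned items might be written down in a machine-readable form before an instrument is built. Everything in it is an assumption made for illustration: the `ItemSpec` class, the `conforms` helper, and the item names (`stu_id`, `grade_level`, `consent`) are hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemSpec:
    """Planned definition of one item on an instrument (hypothetical structure)."""
    name: str                      # variable name expected in the exported data
    dtype: type                    # expected type of a recorded value
    allowed: tuple = ()            # permitted values; empty means unrestricted
    mode: str = "online survey"    # how the item will be collected


# Hypothetical items, for illustration only
ITEMS = [
    ItemSpec("stu_id", int),
    ItemSpec("grade_level", int, allowed=(6, 7, 8)),
    ItemSpec("consent", str, allowed=("yes", "no"), mode="paper form"),
]


def conforms(spec: ItemSpec, value) -> bool:
    """Return True if a recorded value matches the planned type and allowed values."""
    if not isinstance(value, spec.dtype):
        return False
    return not spec.allowed or value in spec.allowed
```

Writing the plan down this way means the same specification can later be compared against exported data, for example `conforms(ITEMS[1], 9)` would flag a grade level that was never planned for.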

However, planning involves more than carefully designing your data collection instruments. It also means considering the data collection and handling process itself. While the project management team is typically in charge of overseeing data collection, there are still best practices that can be implemented during this time to collect better data. In this chapter we will review best practices for building your data collection instruments, and we will also discuss our two other lines of defense: data management in the field and ongoing data checks that catch errors early on (DIME Analytics n.d.).
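As a rough illustration of what an ongoing data check might look like, the sketch below scans one batch of incoming records for missing or duplicate identifiers and for out-of-range values. It is not taken from the cited source; the function name `check_batch`, the field names, and the example range are all hypothetical.

```python
from collections import Counter


def check_batch(records, id_field="stu_id", ranges=None):
    """Return a list of human-readable problems found in one batch of records.

    records: list of dicts, one per collected form (hypothetical layout)
    ranges:  optional mapping of variable name -> (low, high) inclusive bounds
    """
    ranges = ranges or {}
    problems = []

    # Flag missing and duplicate identifiers
    counts = Counter(r.get(id_field) for r in records)
    for rid, n in counts.items():
        if rid is None:
            problems.append(f"{n} record(s) missing {id_field}")
        elif n > 1:
            problems.append(f"duplicate {id_field}: {rid} appears {n} times")

    # Flag values outside the planned range (missing values are skipped here)
    for r in records:
        for var, (low, high) in ranges.items():
            value = r.get(var)
            if value is not None and not (low <= value <= high):
                problems.append(
                    f"{id_field} {r.get(id_field)}: {var}={value} outside {low}-{high}"
                )
    return problems
```

Running a check like this after each wave of collection, for example `check_batch(batch, ranges={"score": (0, 50)})`, surfaces problems while it is still possible to follow up in the field.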

Before we dive into planning for building data collection instruments, it’s important to first review the ethical and legal considerations of your data collection effort. When working with human subjects it is likely that the Institutional Review Board (IRB) will need to review and approve all of your data collection instruments as well as any agreement forms that will be collected as part of your study. Our next section will provide an overview of the IRB and its requirements as well as best practices for creating agreement forms for participants and partners.

10.2 Institutional Review Board

The IRB is a formal organization designated to review and monitor human participant research and ensure that the welfare, rights, and privacy of research participants are maintained throughout the project (Oregon State University 2012). More likely than not, if you are conducting education research with human participants you will have some interaction with, and oversight from, the IRB. Before reviewing potential requirements, let’s review the history of this administrative body.

10.2.1 Background

In 1974 the IRB was established as part of the National Research Act in response to a long history of unethical research that had been conducted with human participants (Qiao 2018). In 1979, the Belmont Report established a set of ethical principles for doing research with human participants. Those ethical principles included the following (Duru and Sautmann n.d.; Huisman n.d.; Office for Human Research Protections 2018):

  1. Respect for persons
    • This included both protecting autonomy of participants by acquiring consent as well as providing a plan to protect participant privacy
      • In practice this meant acquiring consent in a way that ensures participants can comprehend what is being asked of them, understand that their participation is voluntary, and understand the plan to protect their privacy
  2. Beneficence
    • This involved maximizing good and minimizing harm in the study, for both participants and society at large
      • In practice this meant taking time to assess risk and benefits of your study for both the intervention itself as well as the data collection efforts (e.g., how burdensome is the survey)
  3. Justice
    • This included providing additional care and consideration when working with vulnerable populations (e.g., children, prisoners), making sure your practices are non-exploitative and there is fair distribution of costs and benefits across all participants
      • In practice this involved fairness in the selection of participants

Heavily influenced by the Belmont Report, in 1991 the Federal Policy for the Protection of Human Subjects was published, establishing core procedures for human subject protections. The policy, 45 CFR part 46 (Office for Human Research Protections 2016), included four subparts. Subpart A, known as the “Common Rule” for the 15 federal departments and agencies that codified the policy in separate regulations, provided a set of protections for human subjects research including informed consent, review by an IRB, and compliance monitoring (National Institute of Justice 2007; Office for Human Research Protections 2009).

In 2018 the Common Rule was revised in order to better protect research participants and to reduce administrative burden (Office for Human Research Protections 2018; U.S. Department of Health and Human Services n.d.). While many revisions were made, some changes that are applicable to education researchers include the following (Fordham University n.d.):

  • Revisions and additions to exempt categories, many of which are applicable to research conducted in educational settings
  • Reduced burden of continuing review, particularly for exempt and expedited studies
  • Clarifications on how informed consent should be organized, written, and provided

10.2.2 Requirements

While each institution’s IRB submission process is different, typically if your study involves working with human subjects you are required to submit an application to the IRB. As part of your application you will be asked to state what review category your study falls under (Lafayette College n.d.; Northwestern University n.d.; University of California Berkeley 2022).

  1. Exempt
    • These studies usually involve minimal risk and fit within categories predefined by your IRB (e.g., evaluating the use of accepted or revised standardized tests). They typically go through a shorter, quicker review process than non-exempt studies.
  2. Expedited
    • These studies also involve minimal risk but do not meet criteria for exempt status (e.g., collection of voice, video, or image data from non-vulnerable populations).
  3. Full Review
    • If a study does not fall into one of the two categories above (e.g., collection of information about illegal behavior), it requires full review, in which the full board discusses the study at a convened meeting.

As part of your application, common documents you may be required to submit include the following (Cabrini University n.d.; Duru and Sautmann n.d.).

  1. Certificates from human subjects training (e.g., CITI training)
  2. Research protocol (see Chapter 7)
    • When writing your protocol, make sure to review your IRB’s rules around data handling and include this information in your plan. IRBs typically have specific rules for things such as how paper and electronic data must be stored and backed up, how long data should be retained, how data can be transferred and shared, and how data should be anonymized (Filip n.d.).
  3. Study materials (e.g., recruitment materials)
  4. Copies of your instruments (e.g., surveys, interview guides)
    • Note that these will need to be created before you can submit to your IRB so make sure to consider timing and start building your instruments early enough to give you time to submit to your IRB before data collection
  5. Copy of informed consent/assent forms
    • Same as above, give yourself plenty of time to submit before you start participant recruitment
  6. If collecting data from sites (such as school districts) or sharing data between sites, supporting documentation from those partners may be required (MOUs, data use/sharing agreements, letters of support, confidentiality agreements)
  7. If partnering with other institutions, IRB approval letters from partner institutions may also be required

The review process can take several weeks and it is common for the IRB to request revisions to materials. Make sure to review your timeline and give yourself plenty of time to work through this process before you need to begin recruitment and data collection.

10.3 Agreements

There are several types of agreements that may be required for your research study for both ethical and legal reasons. Here we will discuss the most common types of agreements, informed consent and assent, as well as other agreements used when working with external partners, including data sharing agreements, memoranda of understanding, and confidentiality agreements.

10.3.2 Other agreements

As we discussed in Chapter 7, a data use agreement (DUA) is a contractual document that lays out expectations for how data will be shared between two or more parties. While the terms data use agreement and data sharing agreement (DSA) are often used interchangeably, I want to differentiate between the two documents. Data use agreements are typically legally binding agreements that provide terms and conditions for working with restricted use data. DUAs are commonly written for data sharing with school districts. In this case, a DUA may include the terms for sharing, working with, and storing identifiable district-level data.

When working with de-identified, non-sensitive data, a data sharing agreement is a good option. A DSA is a less formal agreement but is still beneficial if you want to provide terms for how data is used, such as limiting the types of projects that use the data (LDbase n.d.a). We will talk more about these types of agreements in Chapter 14.

Another type of agreement, commonly signed when working with partners such as school districts, is a memorandum of understanding (MOU), which establishes the framework for collaboration (National Center for Education Statistics n.d.; REL West n.d.). This document is typically not legally binding, but it establishes agreements around things such as responsibilities, communication, and expectations (Duru and Kopper n.d.b). An MOU can be a standalone document or can include a DSA or DUA as part of the document.

Last, confidentiality agreements and non-disclosure agreements (NDAs) are other types of agreements that may be needed. These legally enforceable documents restrict the use of proprietary or confidential information (University of Washington n.d.).

Templates and Resources

Source | Resource | Footnote
Amy O’Hara | Sample text for data sharing agreements | 50
Florida State University | Example data use agreement | 51
REL West | Data use agreement checklist | 52
University of North Carolina | Data use agreement decision making flow chart | 53
Wilhelmina van Dijk, Sara Hart | Example data sharing agreement | 54

10.4 General instrument creation considerations

With a solid understanding of the ethical and legal requirements needed to plan data collection, we are now ready to discuss best practices for creating data collection instruments.

10.5 Electronic data collection instruments

10.6 Paper data collection instruments

10.7 Interviews/focus groups