Research and evaluation projects succeed or fail long before data collection begins. The decisions made during the design phase (what questions to ask, how to ask them, and how responses will be analysed) determine whether a project will deliver meaningful insights or simply generate noise. Yet too often, researchers and evaluators approach data collection in an ad hoc manner, adding questions without a clear line of sight to their overarching research objectives or analytical approach.
That's why we've developed a practical tool: the Data Collection Matrix. This spreadsheet-based resource supports researchers and evaluators in designing data collection instruments that are intentional, ethical, and analytically sound from the outset.
Introduction
In our work reviewing research applications and supporting evaluators, we've observed recurring challenges in how data collection is planned and executed:
Planning for analysis is an afterthought. Researchers and evaluators are at risk of designing surveys or interview guides without considering how they'll analyse the data. This leads to questions that are interesting but unanalysable, or data that doesn't align with the statistical approaches needed to answer research questions.
The link between questions and objectives is unclear. Ethics committees and peer reviewers struggle to understand how specific questions contribute to project goals. This isn't just a documentation problem: it may reflect a deeper disconnect in project design that can undermine the entire set of activities.
Multi-modal and multi-stakeholder research becomes unwieldy. When a project involves surveys of different stakeholder groups, interviews, focus groups, and document analysis, keeping track of which questions are asked of whom, in what format, and how they connect quickly becomes complex.
These challenges have ethical implications. Collecting data without a clear purpose violates the principle of respect in the National Statement. Participants should only be asked questions that serve a legitimate research need. Unfocused data collection increases the burden on participants and can lead to scope creep that undermines consent processes.
A tool for intentional design
The Data Collection Matrix provides a structured framework for planning data collection at the question level. By working through the matrix during the design phase, researchers and evaluators create a single source of truth that links every question to its purpose, presentation, and analytical approach.
How it works
The matrix is organised around individual questions, each assigned a unique question identifier (QID). For each question, researchers document:
Collection context: Which data collection methods will include this question (e.g., stakeholder survey, staff interviews), and what question number it corresponds to in each instrument. For multi-modal research, this allows you to see at a glance that the same question appears as Q4 in the survey but Q6 in the interview guide. For multi-stakeholder research, you can specify that a question appears in the community survey but not the provider survey.
Question structure: The full question text, question type (single choice, multiple choice, Likert scale, verbatim, matrix, etc.), and all response options. For matrix questions, you can specify both row and column elements. If you're using questions from validated instruments, you can note the source instrument and question number to maintain methodological integrity.
Logic and presentation: Any branching or piping logic that determines whether participants see the question or what options are presented. For example, a question about preferences for public transport might only be shown to participants who indicated they use public transport in an earlier question. You can also document whether response options are presented in a fixed order, randomised, or filtered based on previous responses.
Research alignment: For each question, you articulate how it relates to your key research or evaluation questions. This forces clarity about purpose and helps identify questions that don't serve your objectives.
Analytical approach: How you plan to analyse responses, from simple summary statistics to complex statistical modelling. You can also specify planned crosstabulations (or "crosstabs" for short). For instance, if you intend to examine responses by age cohort or geographic region, you can specify this here and note which question provides the demographic variable for these analyses.
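To make the five groups of fields above concrete, one row of the matrix could be represented as a small record. This is a minimal sketch in Python, not the layout of the actual spreadsheet: the class name MatrixRow, every field name, the identifiers (QID07, KEQ2) and the example question are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MatrixRow:
    """One question in the matrix (all field names are illustrative assumptions)."""
    qid: str                     # unique question identifier
    text: str                    # full question wording
    qtype: str                   # e.g. "single choice", "Likert scale"
    options: list[str] = field(default_factory=list)
    # Collection context: instrument name -> question number in that instrument
    instruments: dict[str, str] = field(default_factory=dict)
    display_logic: str = ""      # branching or piping rule, if any
    option_order: str = "fixed"  # "fixed", "randomised" or "filtered"
    research_questions: list[str] = field(default_factory=list)
    analysis: str = ""           # planned analytical approach
    crosstabs: list[str] = field(default_factory=list)  # QIDs to cross-tabulate by


# Hypothetical example row
row = MatrixRow(
    qid="QID07",
    text="How satisfied are you with local public transport?",
    qtype="Likert scale",
    options=["Very dissatisfied", "Dissatisfied", "Neutral",
             "Satisfied", "Very satisfied"],
    instruments={"community survey": "Q4", "staff interviews": "Q6"},
    display_logic="Show only if QID03 is 'Yes' (uses public transport)",
    research_questions=["KEQ2"],
    analysis="Summary statistics",
    crosstabs=["QID01"],
)

# See at a glance where the same question sits in each instrument
for instrument, number in row.instruments.items():
    print(f"{row.qid} appears as {number} in the {instrument}")
```

Keeping the instrument-to-question-number mapping on the row itself is what lets a single QID stand for the same question across several instruments.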
Applications across different methods of research and evaluation
While the matrix, with its emphasis on question types, response options, and statistical analysis, is particularly powerful for quantitative research, it's equally valuable for qualitative work. Interview guides and focus group protocols benefit from the same intentionality: documenting which topics will be explored with which participant groups, how questions connect to research objectives, and what analytical approach (thematic analysis, framework method, grounded theory) will be applied.
For mixed-methods research, the matrix becomes indispensable. You can use it to ensure that survey questions and interview topics address the same research questions from different angles, identify where quantitative and qualitative data will be integrated in analysis, and maintain consistency in how concepts are operationalised across methods.
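If the matrix is exported to a script-friendly form, the mixed-methods check described above can be sketched as a simple coverage table. This is an illustrative sketch only: the row layout, the method names and the identifiers (QID01, KEQ1 and so on) are hypothetical and are not taken from the spreadsheet itself.

```python
from collections import defaultdict

# Hypothetical rows: (QID, method, key research questions addressed)
rows = [
    ("QID01", "survey", ["KEQ1"]),
    ("QID02", "survey", ["KEQ1", "KEQ2"]),
    ("QID03", "interview", ["KEQ2"]),
    ("QID04", "focus group", ["KEQ3"]),
]


def methods_by_research_question(rows):
    """Map each key research question to the set of methods that address it."""
    coverage = defaultdict(set)
    for _qid, method, research_questions in rows:
        for rq in research_questions:
            coverage[rq].add(method)
    return dict(coverage)


coverage = methods_by_research_question(rows)

# Research questions that only one method speaks to
single_method = sorted(rq for rq, methods in coverage.items() if len(methods) == 1)

for rq in sorted(coverage):
    print(rq, "->", ", ".join(sorted(coverage[rq])))
print("Addressed by one method only:", single_method)
```

A research question addressed by only one method is not necessarily a problem, but it is a prompt to confirm that this was a deliberate choice.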
Ethical and practical benefits
Using the Data Collection Matrix delivers benefits throughout the research lifecycle:
Preventing scope creep: By documenting the purpose and analytical approach for every question at the design stage, you create accountability. It becomes much harder to add "nice to know" questions that don't serve research objectives and simply increase burden on participants.
Supporting ethical review: Ethics committees can quickly assess whether data collection is proportionate and aligned with research aims. The matrix provides transparency about what will be asked, to whom, and why, making it easier to evaluate whether the benefits justify the risks and burdens.
Enabling efficient data collection: For survey research, the matrix serves as a complete specification for survey programming. It documents all branching logic, randomisation requirements, and question dependencies in one place. This reduces errors in survey setup and makes quality assurance more straightforward, especially if you are engaging market research panel providers.
Facilitating analysis planning: By thinking through analytical approaches during design rather than after data collection, you can identify whether your questions will actually support the analyses you need. You might realise that you need to adjust response options, add screening questions, or modify question wording to enable planned statistical tests or qualitative coding frameworks.
Improving research transparency: The completed matrix serves as documentation that can be included in research protocols and evaluation frameworks, shared with ethics committees, and archived alongside datasets. It creates an audit trail showing the intentionality behind every element of data collection.
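Some of these checks lend themselves to light automation once the matrix is filled in. The sketch below assumes the rows have been exported as dictionaries in question order, and flags questions with no linked research question, no recorded analytical approach, or display logic that depends on a question not asked earlier. The key names (qid, research_questions, analysis, shown_if) and the sample data are hypothetical, not the spreadsheet's column names.

```python
# Hypothetical rows in question order; keys are illustrative assumptions.
rows = [
    {"qid": "QID01", "research_questions": ["KEQ1"],
     "analysis": "Summary statistics", "shown_if": None},
    {"qid": "QID02", "research_questions": [],
     "analysis": "", "shown_if": None},
    {"qid": "QID03", "research_questions": ["KEQ2"],
     "analysis": "Thematic analysis", "shown_if": "QID09"},
]


def audit(rows):
    """Return (qid, problem) pairs for rows that need attention."""
    problems = []
    seen = set()  # QIDs that appear earlier in the instrument
    for row in rows:
        if not row["research_questions"]:
            problems.append((row["qid"], "no research question linked"))
        if not row["analysis"].strip():
            problems.append((row["qid"], "no analytical approach recorded"))
        dependency = row["shown_if"]
        if dependency is not None and dependency not in seen:
            problems.append(
                (row["qid"],
                 f"display logic refers to {dependency}, which is not asked earlier")
            )
        seen.add(row["qid"])
    return problems


for qid, problem in audit(rows):
    print(f"{qid}: {problem}")
```

A flagged row is a prompt for a design conversation rather than an automatic rejection: the question may be worth keeping once its purpose and analysis are written down.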
Getting started
The Data Collection Matrix is available for download here.
We've designed it to be flexible enough for small pilot studies and complex multi-site evaluations alike. The comment on each field explains its purpose and how it can be filled out.
As you work with the matrix, we encourage you to adapt it to your needs. You may find that certain columns aren't relevant to your work, or you might want to add fields for specific documentation requirements. The tool is a starting point, not a constraint (or, for that matter, a requirement of the application process).
We also welcome feedback. If you discover ways to improve the matrix or have suggestions for additional functionality, please contact us. Like research design itself, this tool benefits from iterative refinement based on real-world use.
Summary
Good research design requires intention at every level, from the overarching research questions down to the wording of individual survey items. The Data Collection Matrix supports this intentionality by providing a structured way to document the what, why, and how of every question you ask.
By linking data collection to research and evaluation objectives and analytical approaches from the start, you create research that is more ethical, more efficient, and more likely to deliver meaningful insights. And in an era where participants are rightly demanding that their time and data be used responsibly, that intentionality is not just good practice; it's an ethical imperative.
AI Disclosure: An initial draft of this article was developed with the assistance of a Large Language Model. Final editing and approval, as well as the design and development of the tool itself, were delivered by a human, consistent with the Iris Ethics AI Policy.