Development Partner Resource Guide

5. Monitoring, Evaluation, Research, and Learning

Learning outcomes are affected by a broad range of factors, which may differ significantly between contexts. Despite the inherent challenges, there are several approaches to studying the critical components of reforms and interventions. There is also evidence on the gains that can be achieved under suitable conditions (including political, technical, material, and individual conditions).

There are numerous tools for foundational learning assessment, especially in the areas of literacy and numeracy. Many countries are developing or improving national assessments that are administered periodically. Following best practices, schools in some low- and middle-income countries (LMICs) are now also using tools to track learner progress and the need for remediation. Furthermore, teachers use assessments to adjust their instruction to learner needs.

Fidelity of implementation assessments and research are a means to determine whether interventions are being carried out as planned, how contextual factors affect implementation, and whether strong implementation leads to better outcomes (learning outcomes or intermediate outcomes; for example, whether training leads to better instruction). A range of tools has also been developed to measure instructional quality. It may be challenging to obtain objective responses if observers do not have sufficient technical background in the subject-specific instructional methods appropriate to the grade and aligned with the methods used in the particular program. This has led to the development of different types of tools: for example, highly technical tools used by experts with a smaller school sample, or simpler tools for a larger number of observers, featuring multiple-choice responses and a focus on general pedagogy rather than subject-specific practices. Progress toward sustainability can also be measured.

The resources in this section include some of the more common and validated tools and practices that can provide information and evidence throughout the life cycle of a reform or program.


Setting Outcomes and Indicators and Using Log Frames

Metrics and tools used to monitor foundational learning performance, measure learning outcomes, provide information for adaptive management, and ensure accountability.

Questions
  • What have other foundational learning projects in similar contexts and at a similar scale achieved in terms of learning outcomes, and with what resources?
  • What indicators are most directly connected to the theory of change?
  • What are the previous trends in outcomes over time, and which factors appear to have the greatest impact, especially those that may have shifted since the latest outcome data were available?
  • Is the theory of change updated as part of the learning and adaptation process? Do some indicators and expected outcomes need to shift due to changes in the theory of change?
  • Does the government have a system or external support for tracking results at the end of both lower and upper primary education levels?
  • Is the government interested and prepared to be included in Sustainable Development Goal (SDG) 4.1.1 reporting on foundational literacy and numeracy outcomes? If not, are other global assessments being considered?

Resources

Is it Possible to Improve Learning at Scale? Reflections on the Process of Identifying Large-Scale Successful Education Interventions

Overview: A blog on insights from a global meta-analysis that examined rigorous analyses of learning outcomes.

Host: The Center for Global Development (CGD)

Description: This blog reports and reflects on the availability (and gaps in availability) of evidence with regard to rigorously evaluated learning outcomes in foundational learning programs worldwide.

Foundational Literacy and Numeracy Academy - Module 6 - Implementation (ENGLISH) (5 August 2021)

Overview: A two-hour webinar covering implementation in a variety of programmatic contexts, with activities and indicators.

Host: FLN Hub

Description: This webinar covers the importance of implementation to programmatic success, how activities can be prioritized and selected, and the setting and use of different types of indicators. It includes cases from multiple contexts and insights from expert presenters.

Learning at Scale: Final Report

Overview: A report on a meta-analysis of robustly evaluated large-scale foundational literacy programs and findings from top performers.

Hosts: Learning at Scale and the Bill & Melinda Gates Foundation

Description: An October 2023 report on the top-performing foundational literacy programs globally. Table 4 provides the specific measured learning gains in these top-performing programs.

Numeracy at Scale: Final Report

Overview: A report on a meta-analysis of robustly evaluated large-scale foundational numeracy programs and findings from top performers.

Hosts: Learning at Scale and the Bill & Melinda Gates Foundation

Description: An October 2023 report on the top-performing foundational numeracy programs globally. Impact information is contained in the program descriptions on pages 33 to 53.

The Global Alliance to Monitor Learning (GAML)

Overview: A website with links to learning measurement resources.

Host: UNESCO

Description: The GAML is an institutional platform that oversees efforts to measure learning and harmonize measurement standards. It provides resources developed from 2017 onward, including concept papers, theories of change, result frameworks, and governance models related to SDG 4.6.

Reporting Learning Outcomes in Basic Education: Country’s Options for Indicator 4.1.1

Overview: A document describing options for countries to take part in global reporting on SDG 4.1.1.

Host: UNESCO

Description: This document provides options for countries to assess and report progress toward SDG 4.1.1, which covers lower primary, upper primary, and lower secondary education levels, focusing on students’ ability to meet globally identified minimum proficiency standards in reading and mathematics.

Mapping of SDG Indicators in Learning Assessments

Overview: A webpage with a repository of the most widespread learning assessment indicators.

Host: UNESCO Global Alliance to Monitor Learning

Description: This webpage contains a repository of the most widespread learning assessment programs, focusing on their potential to monitor progress toward SDG 4. The information maps cognitive and non-cognitive questionnaires against SDG 4 indicators.

Student Performance Assessment

Includes guidance related to formative and summative foundational learning assessment, as well as system-level assessments.

Questions
  • What is the purpose of the assessment? Different purposes require different designs. For example, formative assessments may change according to the curriculum, whereas tracking improvement in learners’ key skills over time requires assessments that remain stable and of consistent difficulty. Determining the effects of a new program may require an experimental approach, and school support based on outcomes may require a census-based approach.
  • Research or focused studies may provide a deeper understanding of the different dimensions of a problem compared with tools used to track broader program implementation or monitor school inspections. Are there particular factors likely to affect outcomes? Can research studies be carried out with a focus on those factors?
  • Complex, large-scale studies can be expensive. Has the team weighed the investment in research against the investment in program inputs?
  • What capacities do organizations and individuals need to effectively carry out the desired research or study?

Resources

Key Early Grade Reading Skills and Strategies for Effective Instruction and Assessment

Overview: The third part of a five-part online professional development series on early grade reading (EGR) program design and implementation.

Host: Science of Teaching

Description: This two-hour “short course” focuses on the EGR and writing skills children need to learn and the strategies and activities for effectively teaching them, as well as classroom-based assessments to inform instruction. It includes considerations for planning, monitoring, and evaluating instruction. A suite of relevant resources is also provided.

Early Grade Mathematics Assessment (EGMA) Toolkit

Overview: A toolkit with detailed information about the EGMA.

Host: RTI International

Description: The EGMA Toolkit provides detailed information about the EGMA. The first chapter provides an introduction to the instrument and summarizes the purposes of the assessment. Chapter 2 discusses the development of the EGMA, including the theoretical foundations of the instrument. Chapter 3 details the technical adequacy of the EGMA. Chapter 4 provides information on adaptation and training.

Early Grade Reading Assessment (EGRA) Toolkit: Second Edition

Overview: A toolkit with detailed information about the EGRA.

Host: RTI International

Description: The EGRA Toolkit, or user manual, serves as a guide for countries beginning to work with EGRA in areas such as the local adaptation of the instrument, fieldwork, and the analysis of results. This toolkit is intended for use by Ministry or Department of Education staff, donor staff, practitioners, and professionals in the field of education development.

FLN Hub Formative Assessments

Overview: A series of training modules and case studies explaining how to administer foundational literacy and numeracy (FLN) assessments.

Host: FLN Hub

Description: This landing page contains modules on formative assessments for literacy and numeracy. The page includes videos and reading materials. These assessments can be administered to all children in the 5–16 age group and can be conducted both in and out of school.

Annual Status of Education Report (ASER)

Overview: A simple but rigorous assessment of literacy and numeracy that began as a citizen survey in India but has since been adapted and used for various purposes, including grouping learners by ability level for remediation.

Host: ASER

Description: This site contains an ASER overview, process documents, tools, and do-it-yourself instructions. Other related topics, such as research on the tool, publications resulting from using the tool, and current partnerships, are also provided.

Systems Implications for Core Instructional Support Lessons from Sobral (Brazil), Puebla (Mexico), and Kenya

Overview: A document highlighting five tips produced from a review of four case studies.

Host: Research on Improving Systems of Education (RISE) Programme

Description: This document walks readers through five tips for building coherence across their education sub-systems (curriculum design, textbook design, assessment tools, and teacher training). These tips were derived from four case studies.

Assessment of Literacy and Foundational Learning in Developing Countries

Overview: A report on the range and quality of tools that are used to measure literacy and foundational learning in developing countries.

Host: Foreign, Commonwealth and Development Office (FCDO)

Description: The Health & Education Advice & Resource Team (HEART) reviewed the range and quality of tools used to measure literacy and foundational learning in developing countries. The team analyzed tools that assess language and literacy skills in children from age 3 to 14 (preschool to Grade 8). The analysis included assessment tools from studies published between 1990 and 2014, rated as “Moderate” or “High” in terms of methodological quality, such as the EGRA and the Peabody Picture Vocabulary Test (PPVT). This systematic overview explains best practices for different measures of emergent literacy and language. It covers a wide range of cultural/linguistic adaptations for common assessments.

Global Toolkits for FLN Assessments

Overview: A report providing a framework for different foundational literacy and numeracy (FLN) assessments.

Host: Sattva Knowledge Institute

Description: This report provides a detailed framework of the major assessment toolkits used within the FLN space. It describes 15 significant toolkits that fall within the following categories: (i) Early Years Assessments (0–6 years), (ii) Formal Schooling Years Assessments (7–14 years), (iii) Assessments for Differently-Abled Individuals, and (iv) Assessments for Adolescents and Adults (15–80 years). The final section of the document provides an overview of gaps through the lens of an Indian nongovernmental organization (NGO), namely, a lack of standardized nationwide assessments, limited inclusivity of tools, and challenges with supporting teachers to play a greater role in the administration and interpretation of these tools.

Mapping of SDG Indicators in Learning Assessments

Overview: A webpage with a repository of the most widespread learning assessment indicators.

Host: The Global Alliance to Monitor Learning (UNESCO)

Description: This webpage contains a repository of the most widespread learning assessment programs, with a special focus on their potential ability to monitor progress toward SDG 4. The information is based on a mapping of cognitive and non-cognitive questionnaires against SDG 4 indicators, and it is organized by listing all assessments with questions that could report against each specific indicator.

Measuring Instructional Quality

Resources to evaluate the quality and equity of foundational literacy, numeracy, and social and emotional learning instruction, as well as time on task and the fidelity of a particular instructional program.

Questions
  • What aspects of instruction are being considered (subject matter instruction or general pedagogy, which is also required for high-quality teaching and learning)?
  • Are the available tools research-based and of high quality? Are they likely to generate data that correspond to key questions, education system needs, or school instructional leadership requirements?
  • Are measurements primarily based on adherence to a standardized program or instructional best practices?
  • Has the capacity of data collectors to recognize research-based, subject-specific instructional practices and pedagogical approaches been assessed? Are their judgements reliable and consistent when compared to one another?
  • Are factors such as the time available for instruction and time use considered? Factors that affect time could include the school calendar, tardiness and absenteeism, effects of service provision such as school feeding programs, teacher time on task while in the classroom, and the level of student engagement.
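
The question above about whether data collectors' judgements are consistent with one another can be checked numerically. The following is a minimal sketch, not a method prescribed by any of the tools listed here: it assumes two observers have rated the same lessons on the same categorical item and computes Cohen's kappa, a common agreement statistic that corrects for agreement expected by chance. The function name, rating labels, and sample data are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two observers who rated the same lessons.

    Both arguments are equal-length sequences of category labels,
    one label per observed lesson (hypothetical data layout).
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("Both observers must rate the same non-empty set of lessons.")

    n = len(ratings_a)

    # Share of lessons on which the two observers gave the same rating.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance, from each observer's rating frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    # If both observers used a single identical category throughout,
    # chance agreement is total and kappa is undefined; report 1.0.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1 - expected)


# Hypothetical ratings of four lessons on one observation item.
observer_1 = ["yes", "yes", "no", "no"]
observer_2 = ["yes", "no", "no", "no"]
kappa = cohens_kappa(observer_1, observer_2)
print(round(kappa, 2))  # 0.5
```

A value near 1 indicates strong agreement, while a value near 0 indicates agreement no better than chance. What counts as acceptable depends on the tool and on how the data will be used, so any threshold should come from the tool's own guidance rather than from this sketch.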

Resources

World Bank Coach Tools and Resources Map

Overview: A compendium of tools and a resources map for teacher professional development (TPD).

Host: World Bank Group

Description: This compendium contains resource guides, repositories with links to existing resources, sample TPD materials, guides on design choice, implementation approaches, TPD monitoring and evaluation approaches, slides, and video guides.

Continuous Professional Development in Early Grade Reading Programs

Overview: The fifth part of a five-part online professional development series called Early Grade Reading Program Design and Implementation.

Host: Science of Teaching

Description: This two-hour “short course” focuses on the types, characteristics, and content of effective continuous professional development (CPD) for teachers, educators, and administrators responsible for EGR improvement. The webinar includes an overview of the knowledge and skills EGR actors need to have, as well as considerations when planning, implementing, monitoring, and evaluating teacher and educator CPD. A suite of relevant resources is also provided.

International Common Assessment of Numeracy (ICAN)

Overview: A webpage for the ICAN tool and relevant reports and documents.

Host: People’s Action for Learning Network (PAL)

Description: This webpage features the ICAN tool, as well as supporting documents and reports. ICAN is an open-source, robust, and easy-to-use assessment tool, available in 11 languages, that offers international comparability of results aligned to SDG 4.1.1 (a).

Teach Primary: Helping Countries to Measure Effective Teaching Practices

Overview: A classroom observation tool.

Host: World Bank Group

Description: The TEACH tools are a seminal resource for the sector, providing classroom observation tools for early childhood education (ECE) and primary and secondary level education. The tools provide a holistic view of classroom practices and support the use of inclusive teaching strategies. They are open source and are accompanied by comprehensive resources to aid implementation; these resources are categorized for ease of access. The TEACH primary tools examine time on task, teaching practices that respond to the cognitive and social-emotional needs of learners, accessibility of the learning environment, and inclusive practices.

The PSS-SEL Toolbox

Overview: A set of resources to inform locally led social and emotional learning (SEL) programming.

Host: Inter-agency Network for Education in Emergencies (INEE) and Harvard EASEL Lab

Description: A set of online and downloadable resources to support stakeholders working on psychosocial support and SEL in global settings. The resources include a guided process for developing locally led, participatory, and inclusive SEL frameworks, policies, and curricula. The toolbox also has interactive online tools to explore PSS-SEL approaches, as well as a set of additional resources, including assessments.

Fidelity of Implementation

Fidelity of implementation can be defined in as many ways as education programs can be carried out. Fidelity in the core areas of foundational learning closest to the classroom includes adherence to a book distribution plan, whether teachers follow lesson plans designed with program staff support, and whether teachers use learning materials appropriately in class. At a management level, fidelity of implementation can also be related to other factors, such as whether head teachers ensure sufficient learning time per national policy, whether sector education staff make regular school support visits, and the degree of fidelity from national to local levels in administering national assessments.

Questions
  • What are the most important components or milestones in implementation?
  • What are the intermediate and final outcomes that will help in understanding how certain degrees of fidelity may or may not affect the change pathway? For example, if a behavior change method intended to support frequent teacher community of practice meetings is implemented and later removed from the program, but teachers continue to meet, the intervention may no longer be needed. Alternatively, if teachers stop meeting, this could indicate that the behavior change intervention was an important part of the change pathway.
  • What factors are most likely to influence the fidelity of implementation, and are those factors considered in data collection and analysis?
  • If the program will be adaptively managed and the implementation model changes, does the fidelity of implementation measurement also change? What needs to remain the same to learn about change over time, regardless of the implementation model?

Resources

Systems Implications for Core Instructional Support

Overview: A document highlighting five tips produced from a review of four case studies.

Host: RISE Programme

Description: This document walks readers through five tips for building coherence across their education sub-systems (curriculum design, textbook design, assessment tools, and teacher training). These tips were derived from four case studies.

Guidance Note on Using Implementation Research in Education

Overview: A document to provide guidance and support on understanding how and in what context an intervention or reform works.

Host: UNICEF

Description: This document provides information on how iterative research can be used to better understand what does and does not work in an intervention or reform, whom the intervention or reform does and does not benefit, and under what circumstances an intervention or reform is effective. This information can be used to understand contextual factors that affect implementation and effectiveness in a context or system.

Teach Primary: Helping Countries to Measure Effective Teaching Practices

Overview: A classroom observation tool.

Host: World Bank Group

Description: The TEACH tools are a seminal resource for the sector, providing classroom observation tools for ECE and primary and secondary level education. The tools provide a holistic view of classroom practices and support the use of inclusive teaching strategies. They are open source and are accompanied by comprehensive resources to aid implementation; these resources are categorized for ease of access. The TEACH primary tools examine time on task, teaching practices that respond to the cognitive and social-emotional needs of learners, accessibility of the learning environment, and inclusive practices.

Targeted Research to Improve Program Effectiveness

Targeted research includes implementation research, which is intended for rapid learning and program improvement, as well as research focused on a very specific component or set of components to understand their added value in a reform or program.

Questions
  • How broad and complex, or specific and technical, is the question to be researched?
  • Does the research focus on people and behaviors, education systems, technical inputs, or other aspects of programming?
  • Who is best positioned to understand and interpret the results? Are those individuals involved in helping to define or enrich the research question and the necessary approaches?
  • What can be learned from similar research studies that may have been carried out in the same or a different context? What reflections might the research designers have on learning from their own efforts, and how could they best undertake a similar study?

Resources

Guidance Note on Using Implementation Research in Education

Overview: A document to provide guidance and support on understanding how and in what context an intervention or reform works.

Host: UNICEF

Description: This document provides information on how iterative research can be used to better understand what does and does not work in an intervention or reform, whom the intervention or reform does and does not benefit, and under what circumstances an intervention or reform is effective. This information can be used to understand contextual factors that affect implementation and effectiveness in a context or system.

The What Works Hub for Global Education

Overview: A website with working papers and blogs that focus on policy, implementation, and evidence.

Host: The What Works Hub for Global Education

Description: This website focuses on synthesizing, curating, and translating evidence on improving literacy, numeracy, and other key skills for children. The site offers working papers and blogs that focus on sharing evidence-based ideas, government policy, and the large-scale and day-to-day implementation of reforms.

Assessment of Literacy and Foundational Learning in Developing Countries

Overview: A report on the range and quality of tools that are used to measure literacy and foundational learning in developing countries.

Host: FCDO

Description: HEART reviewed the range and quality of tools used to measure literacy and foundational learning in developing countries. The team analyzed tools that assess language and literacy skills in children from age 3 to 14 (preschool to Grade 8). The analysis included assessment tools from studies published between 1990 and 2014, rated as “Moderate” or “High” in terms of methodological quality, such as the EGRA and the Peabody Picture Vocabulary Test (PPVT). This systematic overview explains best practices for different measures of emergent literacy and language. It considers a wide range of cultural/linguistic adaptations to common assessments.

Global Toolkits for FLN Assessments

Overview: A report providing a framework for different foundational literacy and numeracy (FLN) assessments.

Host: Sattva Knowledge Institute

Description: This report provides a detailed framework of the major assessment toolkits used within the FLN space. It describes 15 significant toolkits that fall within the following categories: (i) Early Years Assessments (0–6 years), (ii) Formal Schooling Years Assessments (7–14 years), (iii) Assessments for Differently-Abled Individuals, and (iv) Assessments for Adolescents and Adults (15–80 years). The final section of the document provides an overview of gaps through the lens of an Indian NGO, namely, a lack of standardized nationwide assessments, limited inclusivity of tools, and challenges with supporting teachers to play a greater role in the administration and interpretation of these tools.

Setting and Evaluating Progress/Milestones Toward Sustainability Goals

Sustainability goals are distinct from programmatic goals in that they focus on the environmental conditions that enable a reform or program to continue. They may include, for example, documenting a replicable technical program, ensuring a working model does not exceed a particular budget to remain cost effective within the current government funding capacity, establishing strong systems for procurement or management, or implementing actions that demonstrate political commitment.

In addition to tools mentioned in other sections, including instructional quality and implementation fidelity, interim goals and milestones can include achievement measures as represented in these resources.

Questions
  • Is the outcome — specifically, what stakeholders hope to sustain — explicitly and specifically defined? Has the outcome been defined by national leaders, or jointly defined with development partners, to ensure a commitment to sustainability?
  • Have people impacted by the reform (including officials, implementing staff and partners, and beneficiaries) had an opportunity to comment on their role in the reform or its potential outcomes and impacts?
  • Has a plan of action been jointly defined, with roles and responsibilities, a timeline, and milestones?
  • Are there check-in points for interim achievements (milestones) toward sustainability?
  • Are there incentives to take action toward milestones?

Resources

Global Alliance to Monitor Learning

Overview: A website with linked resources (conceptual framework, tools, toolkits, and metadata).

Host: UNESCO

Description: The GAML is an institutional platform to oversee the coordination of efforts to measure learning and the harmonization of standards for measuring learning. This platform provides a range of resources developed from 2017 onward that outline how to report against the SDG indicators to produce reliable and internationally comparable learning outcomes data. The landing page outlines the rationale for the resources, providing a concept paper, theory of change, result framework, and governance of the GAML. SDG 4.6 provides definitions and measurement options, a global framework for reporting on literacy and numeracy, and suggested indicators.

Mapping of SDG Indicators in Learning Assessments

Overview: A webpage with a repository of the most widespread learning assessment indicators.

Host: UNESCO Global Alliance to Monitor Learning

Description: This webpage contains a repository of the most widespread learning assessment programs, with a special focus on their potential ability to monitor progress toward SDG 4. The information is based on a mapping of cognitive and non-cognitive questionnaires against SDG 4 indicators, and it is organized by listing all assessments with questions that could report against each specific indicator.

TIPS FOR DEVELOPMENT PARTNERS

✅ The value of building onto existing system tools

When possible, FLN program tools should be adapted from or carefully aligned with those already existing in the education system (if not working directly through the system itself). Program implementers frequently develop their own tools to track progress on specific indicators, which helps them report on their results but does little to support or integrate with the Ministry of Education’s data systems. This approach not only leads to duplication but also contributes to fragmentation and a lost opportunity for knowledge generation, as datasets across programs are rarely consolidated. Many governments already face challenges with data centralization, and this lack of alignment further complicates efforts to create a unified, accessible data repository. By aligning program implementer tools with those of the ministry, we can improve data consistency, support centralized data management, and facilitate better long-term planning and decision-making.


✅ Dashboards are valuable tools, but they do not automatically lead to data-informed decision-making

While dashboards have become popular over the years as platforms for programs and system actors to make more informed choices around resource allocation, support needs, and intervention priorities, they are often not used as intended. For example, in the context of FLN programs, dashboards designed to help district coaches review student and teacher observation data theoretically enable targeted support, such as coaching visits or other interventions. However, in practice, dashboards are rarely consulted effectively. One reason is ingrained habits: many professionals are not accustomed to using dashboards as part of their workflow, so adopting them requires a significant shift in how they approach their work. Another is data interpretation, as even relatively common visualization techniques can be difficult for many to interpret accurately. This is not to say that dashboards should be discarded; rather, it underscores the need for careful design and implementation to make them genuinely useful.