By Luis Crouch, Chair of the UNESCO Institute for Statistics (UIS) Governing Board, and Silvia Montoya, Director of the UIS
By late 2023, few countries were measuring and reporting on SDG indicator 4.1.1a, the proportion of children in grades 2/3 achieving at least a minimum proficiency level in reading and mathematics. Last October, the Inter-Agency and Expert Group on SDG Indicators (IAEG-SDGs) therefore ‘demoted’ the indicator from Tier I to Tier II, risking its status at the 2025 review of the SDG monitoring framework. This blog describes actions taken by the UIS, working with IAEG-SDGs, to restore the indicator’s status. The road ahead, though, demands collective action: important technical and institutional issues remain.
How did the UIS respond to protect the indicator?
Since 2016, the UIS has been assembling the building blocks for SDG indicator 4.1.1a: the minimum proficiency level (MPL) definition agreed in 2018 and, later, the related Global Proficiency Framework (GPF). Both provide guidance on learning progression across grades in reading and mathematics.
Figure: UIS contributions to indicator 4.1.1 development, 2016–2024
To address the ‘demotion’, a meeting of Global Alliance to Monitor Learning experts and stakeholders, coordinated by the UIS, took place in Paris on 6–7 December. It was noted that well-known but newer efforts to measure learning (EGRA, FLM, and the PAL Network tools) are generating data, but were designed primarily for advocacy and for programme design, monitoring and evaluation, not for global reporting and comparison. Importantly, they were not explicitly aligned to the MPL/GPF, and their properties were not well documented (see here and here). Thus, data generated by these assessments were not being used by the UIS for global reporting. The meeting requested the UIS to prepare eligibility criteria (psychometric and procedural) that, if met, would allow these assessments to be used for reporting. The UIS shared the document in February. Hundreds of stakeholder comments were received, to which the UIS responded in writing.
On 4–6 March, the UIS convened a Technical Advisory Group (TAG) meeting to analyse the feedback, further refine eligibility criteria, and define steps forward. On 25 March, the UIS shared the revised version of the eligibility document along with the TAG recommendations, which included further data analysis to define relevant skills and benchmarks. A call was also issued to share databases for the analysis.
Based on the foregoing, the UIS proposed to the IAEG-SDGs to unpack the reporting of SDG indicator 4.1.1a to address two important measurement issues that are particularly relevant to low- and lower-middle-income countries:
- Language matters more in the early grades, as home languages are often used for instruction at that stage. (Most school systems no longer teach in the home language by the end of primary, where SDG 4.1.1b is relevant.) This poses measurement and benchmarking challenges. For instance, progress in a language with transparent, phonetic spelling differs from progress in languages where the sound-to-print correspondence is complex or the script itself is complex.
- Many children have not mastered ‘reading to learn’ by grade 2 or 3; they are only starting to master the most basic or ‘precursor’ elements of reading. This makes their progress harder to assess in traditional ways.
Accordingly, the UIS proposed benchmarks for reading precursor skills (see figure and explanation here). Specifically, the UIS would encourage countries where children are not mastering reading comprehension to also measure and report on children meeting benchmarks for precursor skills. This would help countries assess progress towards the minimum proficiency level, which is defined in terms of comprehension. It would require setting numerical benchmarks for those skills, an innovation that calls for analysis and agreement.
Fortunately, many of the assessments under discussion address precursor skills. Their data were used for the benchmark analysis carried out in April. On 14–16 May, the UIS convened the second TAG meeting to discuss the results of the analysis. The following summary conclusions were shared with stakeholders:
- If an assessment has high-quality data and large enough samples, it is possible to establish benchmarks by language or language group (a simplified illustration follows this list). Experts from South Africa and Kenya also informally shared national benchmark-setting methods.
- But more steps are needed before benchmarks can be set:
- The TAG called for more analysis on the reliability of proposed methods.
- For some languages, data were available from too few countries. More data will be called for, and benchmarks may initially be established only for some languages as data come in.
- There has not yet been a similar analysis for numeracy and mathematics. Mathematics data will be called for.
- Clearer definition of the levels of difficulty of the assessments should be provided as guidance.
- It would be useful to have more national experiences of country-set benchmarks, as was the case of South Africa and Kenya.
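To make the idea more concrete, here is a minimal sketch in Python, run on synthetic data, of one possible way a precursor-skill benchmark could be derived for each language group: find the lowest precursor score (here an invented oral reading fluency measure) at which a high share of children also meet the comprehension MPL. The variable names, the 80% target and the data are illustrative assumptions only, not the method the TAG has agreed; a real exercise would also involve the reliability checks and standard-setting judgements the TAG called for.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Purely synthetic, illustrative data: one row per assessed child, with a
# hypothetical language group, a precursor score (e.g. correct words read
# per minute) and whether the child met the comprehension MPL.
n = 5000
lang = rng.choice(["Group A (transparent spelling)", "Group B (complex spelling)"], size=n)
fluency = np.clip(rng.normal(45, 20, size=n), 0, None)
# Assumption for illustration: comprehension becomes likely at a lower fluency
# level where the spelling system is transparent (see the language point above).
midpoint = np.where(lang == "Group A (transparent spelling)", 40.0, 55.0)
p_mpl = 1.0 / (1.0 + np.exp(-(fluency - midpoint) / 8.0))
df = pd.DataFrame({
    "language_group": lang,
    "precursor_score": fluency,
    "meets_mpl": (rng.random(n) < p_mpl).astype(int),
})

def precursor_benchmark(group: pd.DataFrame, target: float = 0.80) -> float:
    """Lowest precursor score such that at least `target` of the children at or
    above that score meet the comprehension MPL (one simple possible rule)."""
    g = group.sort_values("precursor_score", ascending=False)
    share_at_or_above = g["meets_mpl"].expanding().mean()
    ok = share_at_or_above >= target
    if not ok.any():
        return float("nan")
    return float(g.loc[ok, "precursor_score"].min())

# Benchmarks differ by language group because the underlying relationship does.
print(df.groupby("language_group").apply(precursor_benchmark))
```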
Mathematics deserves special mention, since much of the world’s attention has focused on foundational reading. Expert guidance divides early mathematics skills into two broad categories: numbers and operations (e.g., counting, addition) and other mathematics domains (e.g., shapes, early algebraic skills). Reading has a single overarching goal: reading with understanding. In mathematics, by contrast, numbers and operations and other skills, such as working with shapes, are equally important goals. For purely practical reasons, the TAG called for an initial reporting focus on numbers and operations, while research on benchmarks for the other mathematics components is conducted.
Current status and next steps
The TAG meetings in March and May reaffirmed that all assessments meeting the eligibility criteria can be used for reporting, including more traditional assessments, newer ones that also measure precursor skills, and national assessments. The TAG recommended that, in reading, the metric for reporting be the percentage of children answering enough comprehension questions correctly and, in mathematics, for now, the percentage of children answering correctly on numbers and operations. Assessments of any kind must therefore include at least a minimum number of items on comprehension and on numbers and operations. Countries, especially those where few children read with understanding, could also report on precursor skills in reading and on skills beyond numbers and operations in mathematics. These can be compared to the benchmarks to assess progress even in countries where most children are not yet mastering reading for meaning or broader mathematical concepts.
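As a purely illustrative aid, the sketch below shows how a reporting metric of this kind might be computed from child-level records: a child counts as meeting the MPL if they answer at least a minimum number of comprehension items correctly, and the indicator is the sample-weighted share of such children. The column names, the four-out-of-five cut-off and the weights are hypothetical; actual item counts and cut-scores would follow from the eligibility criteria and the benchmark-setting work described above.

```python
import numpy as np
import pandas as pd

# Illustrative assumption only: a child meets the MPL if at least 4 of 5
# comprehension items are answered correctly.
MIN_CORRECT = 4

# One row per assessed child: item-level comprehension scores (0/1) and a
# survey sampling weight (all values invented for this example).
children = pd.DataFrame({
    "comp_item_1": [1, 1, 0, 1, 0],
    "comp_item_2": [1, 0, 0, 1, 1],
    "comp_item_3": [1, 1, 0, 1, 0],
    "comp_item_4": [0, 1, 0, 1, 1],
    "comp_item_5": [1, 1, 1, 1, 0],
    "weight":      [1.2, 0.8, 1.0, 0.9, 1.1],
})

item_cols = [c for c in children.columns if c.startswith("comp_item_")]
correct = children[item_cols].sum(axis=1)
meets_mpl = correct >= MIN_CORRECT

# Weighted share of children meeting the minimum proficiency level.
share = np.average(meets_mpl, weights=children["weight"])
print(f"SDG 4.1.1a (reading, illustrative): {share:.1%}")
```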
But it is important to note that there is much more work to be done, which the UIS will lead and which will require close collaboration among key stakeholders.
Apart from those already mentioned above, another technical task is to provide guidance on language groupings. These technical tasks have institutional implications: agencies and countries will need to respond quickly and effectively to calls for more data so that the benchmarks can be developed collaboratively.
Institutional innovations are needed to make the learning assessment ecosystem more sustainable and useful (see here and here), more efficient, fairer, and less of a transaction-heavy burden on both international agencies and countries. For instance, it has been common practice for agencies to fund the particular assessments they favour in particular countries. The UIS advocates moving to a system where funding is untied from particular assessments, provided the assessments meet quality criteria.
In brief, the UIS sees these priorities:
- In line with the decision of the SDG 4 High-level Steering Committee at its June meeting, a virtual fund or coordinating mechanism needs to be established. Under this mechanism, donors would make funding available to countries that choose to conduct an assessment that meets the criteria, without specific donors funding only specific assessments in specific countries. A similar approach partially accounts for the success of initiatives such as the Vaccine Alliance (GAVI). Details of how such a fund would work must still be worked out.
- To assist with implementation of this decision, the UIS will prepare a concept note and will commission a ‘buyer’s guide’ that will consider how an assessment deals with issues of purpose, local capacity, sustainability, and cost. Countries could use the ‘buyer’s guide’ to decide which assessment suits their needs best.
- A small team of independent experts must be put in place to evaluate particular assessment applications from particular countries against the eligibility criteria for reporting and to recommend their use to the UIS. This team will be coordinated by the UIS and overseen by an independent steering committee to vouch for the impartiality and quality of the work.
The UIS welcomes the involvement of partners to help create the virtual fund and the vetting mechanism for certifying assessment applications by countries to report data on indicator 4.1.1a. The UIS will continue to coordinate and support plans to increase the number of countries reporting, will communicate these to the IAEG-SDGs and the UN Statistical Commission, and will keep all parties informed.