
The World Bank Group Outcome Orientation at the Country Level

Chapter 5 | Toward Stronger Outcome Orientation

Highlights

The current country-level results system relies on specific principles of accountability that do not fit well with the way the World Bank Group pursues outcomes at the country level.

A renewed country-level results system could conceive accountability differently, based on evidence of achievements and failures and descriptions of learning and adaptation, rather than on reaching predefined indicators.

The current country-level results system is too reliant on results frameworks and could benefit from introducing new approaches, such as monitoring, evaluation, and learning plans, that are better suited to country programs. Under these plans, approaches would change in the following ways:

  • Monitoring approaches would focus on selectively tracking critical country outcomes to assess progress on objectives and to support adaptive management.
  • Evaluation approaches would focus on the World Bank Group’s contribution to country outcomes, scale back ratings, and emphasize selectivity over coverage.
  • Learning approaches would focus on creating safe spaces for teams to engage and discuss evaluative evidence.

Changes to the country-level results system would need to be supplemented by changes in signals and incentives to center on results achievement and management rather than overemphasizing approvals, commitments, and output delivery. Incentives could better match staff’s intrinsic motivations.

This chapter rethinks the country-level results system’s accountability principles, tools, and incentives. To improve the accuracy, utility, and outcome orientation of the country-level results system, its principles, tools, and incentives should better capture the Bank Group’s contribution to outcomes. The evaluation lays out the contours of a renewed country-level results system based on the findings and good practices presented above; a detailed review of the literature on good practices in other development and government agencies; and an exercise to brainstorm solutions with a cross-section of Bank Group staff.

The challenges presented in the report are well known, but past corrections have entrenched rather than resolved these shortcomings, so a new vision is needed. Over time, the Bank Group, including IEG, has correctly identified some of the shortcomings of the current results system, including the overemphasis on lending, short-termism, and output focus, and the mismatch between objectives and indicators in the CPF results frameworks (World Bank 2016f, 2017a, 2017d). However, past recommendations and corrections to the system have doubled down on attribution and added more performance measures, creating a cascade of indicators that have become ever less useful without solving the underlying problems of the results system. This evaluation proposes to profoundly rethink the notion of accountability that underlies the current system; the tools for monitoring, evaluating, and learning from country engagements; and the incentives for staff to learn from experience and to prioritize development results. In changing the system, there are inevitably trade-offs to consider. Some of the main trade-offs are laid out in box 5.1 at the end of the chapter.

Rethinking Accountability

The current country-level results system relies on specific principles of accountability that do not fit well with the way the Bank Group pursues outcomes at the country level. The system is premised on a narrow view of accountability that equates it with counting results that can be translated into metrics and ratings. The accountability system is hierarchical, based on the notion that accountability comes from senior management and the Board observing whether targets have been met and exerting positive or negative pressure on subordinates based on results. Yet this model is not functioning effectively at the country level, because staff do not find that the country-level results system substantially drives their incentives. The country-level results system requires rating effectiveness against forecasted indicators. It measures results that can be quantified and delivered over a short time frame and that the Bank Group has direct control over achieving. These features are not suitable at the country level, where the Bank Group pursues results through indirect pathways (such as institution building, demonstration effects, market creation, and others) and multiple instruments (such as partner convening, capacity building, policy dialogue, and analytical work) whose influence could be evaluated but rarely quantified or translated into indicators with targets and baselines. Moreover, these indirect pathways to outcomes are not fully within the Bank Group’s control and so are not captured by the results system, which only considers results that are directly attributable to Bank Group interventions. Consequently, the results system does not answer key questions of how well the Bank Group’s interventions, including ASA and IFC upstream efforts, build institutions, create markets, or shape policies and under what conditions these institutions contribute to sustained positive changes to societies’ well-being. This evaluation has shown that this stringent notion of accountability constrains country teams’ outcome measurement and management practices, thereby limiting the Bank Group’s knowledge of its influence over country outcomes through indirect pathways.

A renewed country-level results system could conceive accountability differently. To enable a more strategic and collective focus on outcomes at the country level, a different model of accountability could be envisioned—one that is based on the idea of mutual accountability, collective learning, informed risk taking, and maintaining trust through both rewards and effective challenge mechanisms. This would start by (i) acknowledging that the Bank Group can influence country outcomes but does not have total control over them; (ii) recognizing that country teams can make informed decisions to adapt country engagements to evolving risks and to complex environments during implementation rather than deciding all targets and objectives at the design stage; and (iii) realizing that capturing contributions to country outcomes and assessing the cumulative effects of multiple interventions requires dedicated evaluation inquiries, not just measurement of indicators. As such, country teams should be held accountable for providing well-evidenced descriptions of achievements and failures and for learning from them and adapting accordingly, not for reaching predefined indicators. Senior management or Board oversight (including independent evaluation) could make their own informed judgments on whether contributions have been substantive, given the context and available quantitative and qualitative evidence, and they could ensure that these judgments meaningfully incentivize behavior. The country-level results system should thus relax its current focus on metrics, attribution, and time-boundedness and adopt a more differentiated approach to capturing results—based on plausible contributions, an adequate evidence base, time-appropriateness, selectivity, and contestability mechanisms.

Rethinking the Tool Kit

The current country-level results system is too reliant on CPF results frameworks. The Bank Group uses results frameworks to monitor country programs in much the same way as it monitors individual projects, without much adaptation. However, results frameworks are best suited for measuring outcome areas where indicators are a reasonable proxy for success and are collected routinely as part of sector work, where theories of change are well established and change is rather linear, and where there is relative certainty about interventions’ efficacy within a specified context. Conversely, results frameworks are less effective when the measurability of outcomes is contested, data are scarce, or multiple and diverse interventions contribute to a broader range of objectives, which is typical of country programs (Andrews 2013; Brinkerhoff and Brinkerhoff 2015; Mansbridge 2014; Muller 2018; Radin 2006). Also, country teams make little use of the results framework to engage clients or guide their adaptive management decisions.

A renewed country-level results system could introduce new monitoring, evaluation, and learning (MEL) approaches. Instead of focusing on results frameworks and their rating tools, the country-level results system could develop MEL plans that would enable country teams to use a wider set of distinct MEL approaches that best fit the type of outcome pursued and the type of results information needed to support learning and decision-making. Several other development partners, including the United States Agency for International Development, have adopted the practice of MEL plans at the country level. This could also improve the evidence on the Bank Group’s contribution to country outcomes and encourage evidence-informed adaptive management practices. Distinguishing between MEL approaches is necessary because one tool cannot meet all purposes. Table 5.1 shows how a MEL plan could address some of the limitations of the current country-level results system. A more elaborate table in appendix D provides more details on what a prototype MEL plan could entail.

Monitoring approaches would focus on selectively tracking critical country outcomes to assess progress on CPF objectives and to support adaptive management. A monitoring system could track key country outcomes to inform strategic perspectives on whether progress on CPF objectives is moving in the right direction but without expectations that these outcomes could be attributed to the Bank Group or that specific targets would be achieved. It would provide information that could be tracked and updated frequently and visualized and analyzed easily to provide useful and timely feedback for adaptation. The adaptive management literature highlights the importance of management information systems that work for the users. Such systems are simple to use, access a variety of relevant information sources, visualize progress toward objectives, and can be operated within existing project data platforms. During brainstorming sessions, many country teams highlighted two unmet needs that could inform their decision-making: (i) a portal that links project-level information to the CPF results framework for easier, more dynamic, and more accurate updates; and (ii) long-term tracking of key country outcomes that the Bank Group hopes to influence.

Evaluation approaches would focus on the Bank Group’s contribution to country outcomes, scale back ratings, and emphasize selectivity over coverage. Each of these revisions to the current evaluation approach is described in table 5.1.

Table 5.1. How a Monitoring, Evaluation, and Learning Plan Could Address Current Limitations

| Intended Use | Current System | MEL Plan |
| --- | --- | --- |
| Capturing contribution to country outcomes | Results frameworks’ indicators do not capture country outcomes well because of expectations of quantification and attribution | The MEL plan and its terminal evaluation would report on the contribution of the World Bank Group to key CPF objectives and their influence on country outcomes |
| Capturing the contribution of multiple instruments and indirect development pathways | Results frameworks do not properly capture the contribution of multiple instruments, IFC and Multilateral Investment Guarantee Agency, complementarities, or indirect pathways | A MEL plan that embraced selectivity could allow deeper inquiry into critical areas of interest, including these issues |
| Capturing contributions over time | Results frameworks primarily capture past operations and do not capture medium- to long-term effects | Monitoring would take place across Country Partnership Framework cycles, and evaluation exercises would be timed to capture long-term effects |
| Reporting on corporate priorities | Reporting on corporate priorities at the country level is carried out separately from the country results system | The monitoring plan would report on corporate priorities and International Development Association results frameworks |
| Adaptive management | The country-level results system does not effectively support adaptive management | Country teams would shape the MEL plan in part so that it fits their own adaptive management and learning needs |
| Knowledge of World Bank Group contribution to high-level outcomes | The country-level results system prioritizes accountability rather than knowledge generation | Increased knowledge would come from the combination of monitoring outcomes over the long term and specific evaluations of critical areas |
| Organizational learning | Accountability and reporting requirements crowd out learning and reflection | Relaxing reporting requirements would reduce the compliance burden, and MEL plans would place greater emphasis on evaluation that meets the needs of country teams and supports their learning |
| Incentives | Teams engage with the system out of compliance, not from a sense of utility or interest | Decentralization and an emphasis on learning would improve country teams’ ownership of and interest in the system |

Source: Independent Evaluation Group.

Note: CPF = Country Partnership Framework; IFC = International Finance Corporation; MEL = monitoring, evaluation, and learning.

At the country level, the evaluation approach could focus on understanding the Bank Group’s contribution to country outcomes. There is an inherent conflict between expecting results to be attributable to the Bank Group and expecting the achievement of high-level outcomes that are not completely within the Bank Group’s control. It is, however, possible to evaluate the Bank Group’s contributions to country outcomes, including the magnitude, uniqueness, and catalytic nature of the Bank Group’s influence. Figure 5.1 shows that assessing the Bank Group’s contribution to country outcomes brings in right-to-left causal reasoning, rather than just the traditional left-to-right reasoning. This requires understanding all the factors that contribute to changes in country outcomes. Much of the Bank Group’s research and analytical work already studies changes in sectors and explains their causes, though this information is rarely used to discern the Bank Group’s contributions to those changes. Also, other evaluation methods and techniques can assess the Bank Group’s contribution to outcomes, including techniques to measure objectives that are hard to quantify (Vaessen, Lemire, and Befani 2020).

Figure 5.1. Alternative Causal Reasoning at the Country Level


Source: Independent Evaluation Group.

Note: “Left-to-right causal reasoning” asks the following question: “Given what is known about the World Bank Group’s portfolio of interventions, how likely is it that the World Bank Group contributed to changes in country outcomes?” Answering this question requires understanding and verifying theories of change, evidence of what works—for whom and under what circumstances—and information on what the Bank Group actually delivered. Conversely, “right-to-left” causal reasoning asks, “Given what is known about how country outcomes have changed over time and the factors that contributed to these changes, how likely is it that the Bank Group contributed to these changes?” Answering this question requires data on changes in outcomes and research or analytics that explain what effected these changes.


The evaluation approach could scale back its use of ratings. Currently, the CLR and CLRR focus on arriving at and justifying a single development outcome rating for the entire country engagement that is based on deviations from the anticipated targets and indicators. There are arguments in favor of ratings: they provide discipline to the assessment, are simple to communicate, are attractive to shareholders, allow aggregation and a certain level of comparability, and provide a forum for contestability. However, there are also clear shortcomings to the current ratings system. The current ratings approach has methodological limitations because it combines several performance metrics that are not all logically linked, it faces uneven data quality and availability, and it requires aggregating into a single rating a very diverse set of objectives and achievements. Country teams are largely rated on whether they achieved indicators or not, rather than on whether they substantially influenced country outcomes. Ratings can divert attention from the substance, create “gaming” behaviors, and detract from learning (World Bank 2016f).

The evaluation approach could be more selective to achieve greater depth and to match specific learning and accountability needs. A country-level evaluation approach should provide sufficient evidence for teams to strategically reflect on how the Bank Group works with governments and development partners to contribute to medium- and long-term changes in client countries. A more selective approach would yield more valid and useful evidence of the Bank Group’s outcome contributions and help the organization bridge key knowledge gaps. As part of this, country teams could propose the evaluations that should be carried out during or after the engagement cycle, depending on the team’s learning and accountability needs. For example, evaluations might be warranted to inform key decision-making moments (like whether to continue or adjust an approach in a key sector or modality), or to assess the cumulative effectiveness of a substantial long-term engagement. Evaluations could be programmed to assess the effectiveness of critical indirect development pathways, which are difficult to capture with simple quantified indicators. Review and approval of the evaluation plan and selection of evaluations by Regions or the Board could ensure that strategically important interventions are covered by evaluation.

Learning approaches would focus on creating safe spaces for teams to engage and discuss the evaluative evidence. The PLR, CLR, and CLRR all include “learning” as a core objective. For the CLR, teams are asked to draw out lessons from the previous CPF’s implementation, which IEG also does when validating CLRs. Lessons captured in CLRs and CLRRs span a wide range of topics, but country teams often find them too generic and not substantiated enough to be useful. More important, teams question the effectiveness of using reports to share lessons when these reports are primarily geared toward external reporting to IEG and the Board and are rarely used by country teams themselves. Instead, country teams find that what facilitates learning is the process of coming together as country teams, sometimes with clients, to discuss the evaluative evidence and to reflect on the evidence’s implications for current and future engagements. The external literature on adaptive management shows that routine feedback meetings, used as safe spaces for teams to genuinely engage with results data and evidence, are a cornerstone of organizational learning. They are also an effective mechanism to collectively reflect on the risks and rewards of alternative courses of action in changing environments.

Rethinking Incentives

Changing the tool kit will not lead to changes in practice and behavior if the underlying incentives are not aligned with the pursuit of outcomes. Country teams’ practices and behaviors are shaped by their intrinsic motivation to achieve impact in client countries and by incentives that can either support or hinder this intrinsic motivation. Incentives emerge from what is measured, tracked, and rated in managerial dashboards, regional vice presidents’ and country directors’ performance agreements, and Corporate Scorecards. Incentives also come from informal signals that staff receive from management, the Board, and other members of the authorizing environment, including IEG. Signals are less tangible than incentives and include what managers informally value or praise, what managers spend their time on, and what leadership prioritizes in meetings—including ROCs, Board discussions, and unit meetings. These signals and incentives are not always aligned with outcome orientation. IEG’s evaluations have shown that results systems will not uphold accountability if they are not embedded in an organizational setting where teams are encouraged, through incentives and signals, to use the system to learn, course correct, and achieve results. A results system without such incentives and signals is performative and fuels a façade of accountability (World Bank 2013b, 2015h, 2016g, 2017a).

Current signals and incentives center on approvals, commitments, and output delivery, not results achievement and management. Through interviews and country cases, IEG gathered evidence on the types of incentives and signals that drive staff behavior. Interviewees felt that incentives of various types—carrots, sticks, and sermons—largely centered on the importance of receiving approvals and of increasing volumes, disbursements, and output delivery (and, in the case of IFC, financial performance). As one interviewee put it, “What really matters is getting new deals approved by the Board and making sure that those disbursement ratios in CD [country director] and Vice President’s dashboards don’t go into the red.” Regional agreements contain indicators on the number of CPFs submitted to the Board but nothing on the quality of PLRs or CLRs. The Board reviews new projects and CPFs at approval but pays less attention at closure. Staff recognition and promotion come from getting a project approved by the Board, not from the results of that project. IFC’s incentives are for the most part based on deal volumes that drive individual and sectoral behavior.

Although approvals, commitments, and output delivery are important to the Bank Group’s success, an overemphasis on these aspects can overshadow the drive for results. Staff interviews show concerns that pursuing these incentives can divert attention from quality and results. For example, for certain objectives, output delivery—such as operating newly built hydropower plants or delivering vaccines—is a good proxy for results. However, in other projects that involve analytical work or capacity or system building, output delivery is neither necessary nor sufficient to achieve results. In one theoretical example, a team that seeks to change a public agency’s management practices could lead a very successful reform process without ever delivering a specified output—or could deliver that output but gain little traction on reform. Teams’ performance is also tracked by loan disbursements. In most cases, this is a good proxy for successful implementation and encourages action to unlock operational bottlenecks, but not always. For example, in policy reform loans, coordinating and building consensus among divided stakeholder groups or administrative layers can indicate progress toward objectives, even if disbursement is slow. Disbursement pressures encourage teams to unlock delivery bottlenecks but may discourage teams from applying strategic changes or drastically rethinking approaches to achieve greater impact.

A renewed country-level results system could experiment with new incentive mechanisms for better outcome management. Although reshaping incentives in the Bank Group goes well beyond what can be achieved through a renewed results system, some incentives could be corrected through changes in practice. For example, Regions could experiment with changing country leadership’s incentives, including adjusting agreements between the managing directors and regional vice presidents and between regional vice presidents and their country directors as well as other formal incentive mechanisms to de-emphasize meeting disbursement targets while emphasizing outcome management. The right incentive need not take the form of a metric indicator or ratings but could stem from an assessment from senior management or the Board on whether sufficient progress was achieved on priority outcomes, based on compelling evaluative evidence.

Incentives could better match staff’s intrinsic motivations to achieve outcomes, their learning methods, and the signals that matter most to them. The current results system relies on ratings and their independent validation as a “stick” and “reward” instrument but also as the primary “challenge mechanism” to ensure the contestability and credibility of the reporting. However, this report has shed light on a range of problems with the rating practices. A renewed system could envision alternative “challenge mechanisms.” For example, the practice of the ROC for the CPF or IFC high-profile meetings with the chief operations officer and chief executive officer for country strategies could be extended to the evaluation stage. The Bank Group could experiment with dedicated high-profile meetings to discuss evaluative evidence on results achieved at multiple levels of the organization, including the regional vice presidential unit and the Board. The Regional Updates to the Board could more prominently feature evidence of results and their discussion. A challenge mechanism could be built into these meetings to ensure contestability of the evidence and decision-making. This might require the presence of designated “provocateurs” and external voices with independent judgment who are assigned to ask questions that challenge tenuous assumptions and evidence.

Mechanisms to mitigate the short-termism created by rapid staff turnover and short country cycles could be envisioned. Country teams could improve staff succession plans and handover strategies to ensure the continuity of outcome management. For example, exiting country leaders and teams could prepare knowledge packages that contain qualitative information on drivers of success and failure in the specific country political economy, to be passed on to the incoming country teams. Ensuring that results systems serve as a platform to capture evaluative evidence, including qualitative information, detailed lessons, and feedback on what works, what does not, and why in a given context, could mitigate the knowledge losses caused by staff turnover. Creating “results teams” related to key country outcomes—which already exist in some country offices—gives GP staff a leadership role on those outcomes. These teams could also better recognize and leverage the unique role played by local staff, who provide critical continuity to the Bank Group’s engagement across CPF cycles and have an acute sense of the unique development challenges, institutional capacities, and political constraints of the clients, whom they support for years on end to achieve outcomes.

Box 5.1. Trade-Offs in Adopting a New Country-Level Results System

The evaluation recognizes that there are trade-offs in changing some of the features of the current country-level results system. Some of the most important trade-offs include the following:

  • External reporting versus internal use. The current results system privileges information that is geared toward external reporting to the authorizing environment, with ratings that can be aggregated and are relatively easy to communicate to shareholders, and with reporting mechanisms that favor transparency. There is, however, little evidence of actual use by the authorizing environment of the information generated by the system. A renewed system would rebalance toward internal use of results information, privileging internal accountability and learning without losing the capacity to provide critical results information to the Board, clients, and the general public.
  • Standardization versus customization. The current results system uses a common methodology and approach for all country engagements, independent of the type of objectives pursued and the nature of the portfolio and interventions, leading to some critical mismatches between what outcomes are pursued and what results are captured, especially through indirect development pathways. A renewed system would offer more flexibility to country teams to select the monitoring, evaluation, and learning approaches, as well as the types of data, that are most suitable to the country engagements and objectives pursued.
  • Centralized versus decentralized. The current results system is centrally designed and managed, and the ownership of the system primarily lies with Operations Policy and Country Services and the Independent Evaluation Group. A renewed system would empower Regions and Country Management Units to play a more active role in designing, governing, and using the system. This is in line with the Bank Group’s ongoing reform agenda to realign the matrix organization around Regional vice presidency units, enhance the global footprint, and move decision-making closer to clients.
  • Comprehensive versus selective. The current results system seeks to cover all Country Partnership Framework objectives and provide a broad picture of the overall portfolio of activities; at the same time, it inevitably provides a superficial picture of outcomes and does not adequately cover nonlending and indirect development pathways. A renewed system would have a more selective approach that would privilege outcome focus, utility for decision-making, and adequacy of the evidence base. Governance mechanisms can be put in place to avoid the risk of cherry picking.

Source: Independent Evaluation Group.

Recommendation

Based on the vision described above, IEG makes the following concrete recommendation:

Reform the country-level results system to ensure that it accurately captures the Bank Group’s contribution to country outcomes and usefully informs decision-making on country engagements. The Bank Group should keep a country engagement model that articulates clear outcome-level objectives and lays out the pathways that will be pursued to achieve them, conducts periodic reviews to take stock of progress, and includes an end-of-cycle review of evidence and learning. These reviews need not be a Board deliverable but should be carried out in time to meaningfully inform course correction and the next strategy. The Bank Group should discontinue the reliance on results frameworks in its country strategies and midterm and terminal reviews, and it should adopt MEL plans for its country engagements (box 5.2 and appendix D).

Box 5.2. How Could Monitoring, Evaluation, and Learning Plans Work?

Appendix D presents additional ideas on what a prototype monitoring, evaluation, and learning (MEL) plan could entail.

  • A monitoring plan would describe how the country team will monitor the country engagement’s performance, changes to the country outcomes that the World Bank Group seeks to influence, and contextual risks that are relevant to specific Country Partnership Framework (CPF) objectives.
  • An evaluation plan would define the team’s criteria for selecting evaluation activities, the evaluation’s purpose and expected use, possible evaluation questions, and the evaluation modality (conducted internally or commissioned externally). The evaluation plan would also propose a terminal evaluation that would synthesize available information on the Bank Group’s performance, including the results from evaluative activities under the MEL plan, portfolio information from project results systems, and stakeholder feedback surveys or narrative accounts.
  • A learning plan would identify knowledge gaps that the country team intends to fill during the CPF period. It would also plan activities for country teams to periodically reflect on certain activities, processes, and findings related to evaluations, information monitoring, and stakeholder engagements. The MEL plan should include one or more midterm reviews, where the country team engages in an exercise to collectively consider whether progress toward objectives is on the right track and to identify lessons or opportunities for changing direction.
  • Country teams would propose a MEL plan at the CPF design stage and allocate resources for its implementation. The plan could be proposed by the Country Director and approved by the Region’s vice president. However, the plan should not be comprehensive or fixed at approval, and country teams should be expected to update it over the life of the country engagement as part of the portfolio review process.

Source: Independent Evaluation Group.