Evaluation of International Development Interventions

Appendix A. Glossary of Key Terms

The purpose of the glossary is to define key terms used in this guide. To the extent possible, definitions from the glossary of key evaluation terms of the Development Assistance Committee of the Organisation for Economic Co-operation and Development (OECD-DAC) or from Independent Evaluation Group publications are provided.

Activity. Actions taken or work performed through which inputs, such as funds, technical assistance, and other types of resources, are mobilized to produce specific outputs (OECD-DAC).

Alternative explanation. A plausible or reasonable explanation for changes in an outcome variable caused by factors other than the program under evaluation.

Analytical generalization. A nonstatistical approach for generalization of findings based on a theoretical analysis of the program and contextual factors producing outcomes.

Analytical technique. An approach used to process and interpret information as part of an evaluation (OECD-DAC).

Assumption. A hypothesis about factors or risks that could affect the progress or success of a development intervention (OECD-DAC).

Attribution. The ascription of a causal link between changes observed or expected to be observed and a specific intervention (OECD-DAC).

Baseline. A measure describing the situation before a development intervention, against which progress can be assessed or comparisons made (OECD-DAC).

Bayesian updating. A technique for refining the probability that a hypothesis or theory is true (or false) as more information becomes available.
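
For illustration, the updating step can be written out with Bayes' rule. The sketch below is a minimal example, not a prescribed procedure; the function name and all probabilities are hypothetical values chosen only to show the arithmetic.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return the probability that a hypothesis is true after observing evidence.

    prior: probability assigned to the hypothesis before the evidence.
    p_evidence_if_true: probability of seeing the evidence if the hypothesis holds.
    p_evidence_if_false: probability of seeing the evidence if it does not hold.
    """
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / denominator


# Hypothetical numbers: start undecided, then observe two independent pieces
# of evidence that are each four times more likely if the hypothesis is true.
belief = 0.5
belief_after_first = bayes_update(belief, 0.8, 0.2)
belief_after_second = bayes_update(belief_after_first, 0.8, 0.2)

print(round(belief_after_first, 3))
print(round(belief_after_second, 3))
```

Each posterior becomes the prior for the next piece of evidence, which is what "as more information becomes available" refers to in the definition.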

Big data. Data characterized by high volume, velocity (real time), and variety (wide range of information).

Causal description. Determining the outcome(s) attributable to a program.

Causal explanation. Clarifying the mechanisms through which a program generates the outcome(s).

Comparison/control group. The group of individuals in an experiment (control) or quasi-experiment (comparison) who do not receive the treatment program.

Contribution. A program effect that is difficult to isolate from other co-occurring causal factors.

Counterfactual. The situation or condition that would hypothetically have materialized for individuals, organizations, or groups if no development intervention had been implemented.

Data analytics. An umbrella term for analytical techniques and processes used to extract information from data, including data collected with emerging technologies and big data (see definition above).

Data collection method. An approach used to identify information sources and collect information during an evaluation (OECD-DAC).

Discount rate. The interest rate used in cost-benefit analysis to adjust the value of past or future cash flows to their present value.
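
As a worked illustration, the sketch below discounts a stream of cash flows and sums them into a net present value. The cash flows and the two rates are hypothetical, and the helper names are assumptions made for this example.

```python
def present_value(cash_flow, rate, years_from_now):
    """Discount a single cash flow to its value today."""
    return cash_flow / (1.0 + rate) ** years_from_now


def net_present_value(cash_flows, rate):
    """Sum the discounted cash flows; cash_flows[0] occurs today (year 0)."""
    return sum(present_value(cf, rate, t) for t, cf in enumerate(cash_flows))


# Hypothetical project: a cost of 1,000 today, then benefits of 400 in each
# of the next three years.
flows = [-1000.0, 400.0, 400.0, 400.0]

npv_at_5_percent = net_present_value(flows, 0.05)
npv_at_10_percent = net_present_value(flows, 0.10)

print(round(npv_at_5_percent, 2))
print(round(npv_at_10_percent, 2))
```

With these numbers the project shows a positive net present value at a 5 percent rate and a negative one at 10 percent, which illustrates why the choice of discount rate can change the conclusion of a cost-benefit analysis.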

Doubly decisive test. A type of test (made famous by process tracing literature) that is both strong and symmetrical; that is, it can either strengthen or weaken the hypothesis considerably, depending on whether the test is positive or negative, respectively.

Effect. An intended or unintended change attributable directly or indirectly to an intervention (OECD-DAC).

Effect size. A quantitative measure of the outcome difference between a treatment group and a comparison/control group.
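
One common effect size measure is the standardized mean difference (Cohen's d). The sketch below computes it with a pooled standard deviation; it is only one of several possible measures, and the outcome scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_t, n_c = len(treatment), len(control)
    pooled_variance = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_variance)


# Hypothetical outcome scores for the two groups.
treatment_scores = [12.0, 14.0, 16.0, 18.0, 20.0]
control_scores = [10.0, 12.0, 14.0, 16.0, 18.0]

d = cohens_d(treatment_scores, control_scores)
print(round(d, 3))
```

Expressing the difference in standard deviation units makes results comparable across outcomes measured on different scales.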

Effectiveness. The extent to which the development intervention’s objectives were or are expected to be achieved, taking into account their relative importance (OECD-DAC).

Efficiency. A measure of how economically resources/inputs (funds, expertise, time, and so on) are converted to results (OECD-DAC).

Ex ante evaluation (also known as prospective evaluation). An evaluation that is performed before implementation of a development intervention (OECD-DAC).

Ex post evaluation (also known as retrospective evaluation). Evaluation of a development intervention after it has been completed (OECD-DAC).

External validity. The extent to which findings from an evaluation can be generalized to other, perhaps broader, settings and groups.

Evaluation theory. Approaches to evaluation that prescribe a specific role and purpose for the evaluation.

Hoop test. A type of test (made famous by process tracing literature) that is strong but not symmetrical: it can substantially weaken the theory or hypothesis if negative but cannot strengthen it if positive.

Impact. Positive and negative, primary and secondary long-term effects produced by a development intervention, directly or indirectly, intended or unintended (OECD-DAC).

Independent evaluation. An evaluation carried out by entities and persons free of the control of those responsible for the design and implementation of the development intervention (OECD-DAC).

Influencing factor. An aspect of the program implementation context that affects the program outcomes, either qualitatively or quantitatively.

Input. The financial, human, and material resources used for the development intervention (OECD-DAC).

Internal validity. The credibility or “truth value” of a causal connection between a program and specific outcome.

Judgmental matching. The use of nonstatistical techniques to establish comparable treatment and comparison groups for the purposes of net effect estimation.

Logic model. A depiction (often in tabular form) of the activities, outputs, and outcomes of a program. The term is often used interchangeably with program theory. Many logic models differ from program theories in that they merely list program activities, outputs, and outcomes instead of explaining how they are connected.

Logical framework (logframe). A management tool used to improve the design of interventions, most often at the project level. It involves identifying strategic elements (inputs, outputs, outcomes, impact) and their causal relationships, indicators, and the assumptions or risks that may influence success and failure. It thus facilitates planning, execution, and evaluation of a development intervention (OECD-DAC).

Mechanism. The underlying processes generating an outcome.

Outcome. The likely or achieved short-term and medium-term effects of an intervention’s outputs (OECD-DAC). Outcomes are usually in the form of behavioral or organizational changes.

Output. The products, capital goods, and services that result from a development intervention; it may also include changes resulting from the intervention that are relevant to the achievement of outcomes (OECD-DAC).

Pipeline approach. A technique for comparison group selection, where the comparison group is composed of individuals who have been selected (are eligible) to participate but have not (yet) been involved in or benefited from intervention activities.

Program. A set of activities and outputs intended to advance positive outcomes for a specific group of people, used here as a generic term for a policy intervention.

Program theory. A visual and narrative description of how the activities and outputs of a program are expected to generate one or more outcomes.

Propensity score matching. The use of statistical (regression-based) techniques to establish a comparison group that is equivalent to the treatment group for purposes of net effect estimation.

Power (statistical). The probability that a statistical test (based on a sample) will detect differences (when they truly exist in the population).
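
To make the notion concrete, the sketch below approximates the power of a two-sided comparison of two group means using the normal distribution. It assumes equal group sizes and a standardized effect size; the function name and the input values are hypothetical, and dedicated power analysis tools should be preferred in practice.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sided, two-group mean comparison.

    effect_size: true difference in means divided by the standard deviation.
    n_per_group: number of units in each of the two groups (assumed equal).
    alpha: significance level of the test.
    """
    standard_normal = NormalDist()
    critical_value = standard_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = effect_size * sqrt(n_per_group / 2.0)
    # Probability that the test statistic falls in either rejection region.
    return standard_normal.cdf(shift - critical_value) + standard_normal.cdf(
        -shift - critical_value
    )


# Hypothetical design: a medium effect (0.5 standard deviations).
power_small_sample = approximate_power(0.5, 20)
power_large_sample = approximate_power(0.5, 64)

print(round(power_small_sample, 2))
print(round(power_large_sample, 2))
```

Under these assumptions, power rises with sample size and with the size of the true effect; when the true effect is zero, the function returns the significance level itself.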

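Purposive sampling. A nonrandom sampling procedure in which units are selected deliberately on the basis of characteristics relevant to the evaluation.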

Random allocation. The random assignment of participants to the treatment group and the comparison group, whereby selection bias from observed and unobserved characteristics is eliminated in expectation.

Random sample. A sample drawn from a population where each unit has an equal probability of being selected.
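
Random sampling and random allocation are sometimes confused: the first decides who is studied, the second decides who receives the program. The sketch below shows both steps on a hypothetical list of household identifiers; the function names and seeds are assumptions made for this example.

```python
import random


def draw_random_sample(population, sample_size, seed=None):
    """Select units so that each has an equal probability of being chosen."""
    rng = random.Random(seed)
    return rng.sample(list(population), sample_size)


def randomly_allocate(units, seed=None):
    """Split units at random into a treatment group and a comparison group."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical population of 100 households, identified by number.
household_ids = list(range(1, 101))

sample = draw_random_sample(household_ids, 20, seed=42)
treatment_group, comparison_group = randomly_allocate(sample, seed=7)

print(len(sample), len(treatment_group), len(comparison_group))
```

Fixing the seed makes the draw reproducible, which can help when the selection procedure needs to be documented or audited.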

Regression. A statistical procedure for predicting the values of a dependent variable based on the values of one or more independent variables.
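
The simplest case, with one independent variable fitted by ordinary least squares, can be sketched as follows. The data are hypothetical and deliberately lie on a straight line so that the result is easy to verify by hand; the variable and function names are assumptions made for this example.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line through the points."""
    x_bar, y_bar = mean(x), mean(y)
    covariance_sum = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    variance_sum = sum((xi - x_bar) ** 2 for xi in x)
    slope = covariance_sum / variance_sum
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: years of schooling and an income index.
years_of_schooling = [0, 2, 4, 6, 8]
income_index = [10.0, 14.0, 18.0, 22.0, 26.0]

intercept, slope = fit_simple_ols(years_of_schooling, income_index)

# Predict the dependent variable for a new value of the independent variable.
predicted_income = intercept + slope * 5
print(intercept, slope, predicted_income)
```

Real evaluation data will not fit a line exactly, and applied work typically uses several independent variables and reports uncertainty around the estimates.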

Reliability. Consistency or dependability of data and evaluation judgments, with reference to the quality of the instruments, procedures, and analyses used to collect and interpret evaluation data (OECD-DAC).

Sample. A subset of units (for example, individuals or households) drawn from a larger population of interest.

Sampling. A process by which units (for example, individuals or households) are drawn from a larger population of interest. See also random sample and purposive sampling.

Selection bias. Bias introduced when specific individuals or groups tend to take part in the program (treatment group) more than other groups, resulting in a treatment group that is systematically different from the control group.

Sensitivity analysis. An analysis that determines how sensitive the findings are to changes in the data sources or the data collection and analysis procedures.

Smoking gun test. A type of test (made famous by process tracing literature) that is strong but not symmetrical; it can considerably strengthen the theory or hypothesis if positive but cannot weaken it if negative.

Stock and flow diagram. A visual depiction of causal relationships in a system modeled on the basis of one or more stocks (for example, the total number of rural farmers under the poverty line) and the flows between them that change the stock values (for example, job growth, currency inflation).
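
The logic behind such a diagram can be sketched as a small simulation with one stock, one inflow, and one outflow. The numbers below are hypothetical and the model is far simpler than those used in system dynamics practice.

```python
def simulate_stock(initial_stock, inflow_per_period, outflow_rate, periods):
    """Track one stock over time.

    inflow_per_period: units entering the stock each period (a constant here).
    outflow_rate: share of the current stock that leaves each period.
    """
    stock = initial_stock
    history = [stock]
    for _ in range(periods):
        outflow = outflow_rate * stock
        stock = stock + inflow_per_period - outflow
        history.append(stock)
    return history


# Hypothetical example: 1,000 rural farmers under the poverty line; each year
# 50 more fall below the line and 10 percent of those below it move above it.
history = simulate_stock(1000.0, 50.0, 0.10, 3)
print(history)
```

In this sketch the stock declines toward the level at which inflow and outflow balance (the inflow divided by the outflow rate), which is the kind of system behavior a stock and flow diagram is meant to expose.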

Straw-in-the-wind test. A type of test (made famous by process tracing literature) that is both weak and symmetrical; it can neither substantially strengthen nor substantially weaken the theory or hypothesis.
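
One way to see how the four process tracing tests in this glossary differ is to express each as a pair of probabilities and apply Bayes' rule. The sketch below is an illustrative reading, not a definition taken from the process tracing literature; every probability in it is a hypothetical value chosen to mimic the pattern described in the corresponding entry.

```python
def posterior(prior, p_found_if_true, p_found_if_false, evidence_found):
    """Probability that the hypothesis is true after the test result is known."""
    if evidence_found:
        like_true, like_false = p_found_if_true, p_found_if_false
    else:
        like_true, like_false = 1.0 - p_found_if_true, 1.0 - p_found_if_false
    return like_true * prior / (like_true * prior + like_false * (1.0 - prior))


# Hypothetical (P(evidence | hypothesis true), P(evidence | hypothesis false)).
tests = {
    "hoop": (0.95, 0.80),
    "smoking gun": (0.20, 0.05),
    "doubly decisive": (0.95, 0.05),
    "straw-in-the-wind": (0.55, 0.45),
}

prior = 0.5
results = {
    name: (
        posterior(prior, p_true, p_false, evidence_found=True),
        posterior(prior, p_true, p_false, evidence_found=False),
    )
    for name, (p_true, p_false) in tests.items()
}

for name, (if_positive, if_negative) in results.items():
    print(f"{name}: positive -> {if_positive:.2f}, negative -> {if_negative:.2f}")
```

With these assumed values, a failed hoop test lowers confidence sharply while a passed one barely raises it, the smoking gun test shows the reverse pattern, the doubly decisive test moves confidence strongly in both directions, and the straw-in-the-wind test moves it only slightly either way.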