Image

POVERTY AND EQUITY:
MEASUREMENT, POLICY AND
ESTIMATION WITH DAD

ECONOMIC STUDIES IN INEQUALITY, SOCIAL EXCLUSION AND WELL-BEING

Editor:
Jacques Silber, Bar Ilan University

Editorial Advisory Board:
John Bishop, East Carolina University, USA.
Satya Chakravarty, Indian Statistical Institute, India.
Conchita D' Ambrosio, University of Milano-Bicocca, Italy.
David Gordon, University of Bristol, The United Kingdom.
Jaya Krishnakumar, University of Geneva, Switzerland.

This series will publish volumes that go beyond the traditional concepts of consumption, income or wealth and will offer a broad, inclusive view of inequality and well-being. Specific areas of interest will include Capabilities and Inequalities, Discrimination and Segregation in the Labor Market, Equality of Opportunities, Globalization and Inequality, Human Development and the Quality of Life, Income and Social Mobility, Inequality and Development, Inequality and Happiness, Inequality and Malnutrition, Income and Social Mobility, Inequality in Consumption and Time Use, Inequalities in Health and Education, Multidimensional Inequality and Poverty Measurement, Polarization among Children and Elderly People, Social Policy and the Welfare State, and Wealth Distribution.

Volume 1
de Janvry, Alain and Kanbur, Ravi
Poverty, Inequality and Development: Essays in Honor of Erik Thorbecke

Volume 2
Duclos, Jean-Yves and Araar, Abdelkrim
Poverty and Equity: Measurement, Policy and Estimation with DAD

POVERTY AND EQUITY:
MEASUREMENT, POLICY AND
ESTIMATION WITH DAD

By

JEAN-YVES DUCLOS
Interuniversity Centre on Risk, Economic Policies and Employment (CIRPEE)
Universite Laval, Québec, Canada

ABDELKRIM ARAAR
CIRPEE and Poverty and Economic Policy (PEP) network,
Universite Laval, Québec, Canada

Image

Jointly published by
Springer
233 Spring Street
New York, NY 10013

and the
International Development Research Centre
PO Box 8500, Ottawa, ON, Canada K1G 3H9
info@idrc.ca / www.idrc.ca

Library of Congress Control Number: 2006923024

ISBN:10: 0-387-25893-0 (HB)
ISBN-13: 978-0387-25893-5 (HB)

e-ISBN-10: 0-387-33318-5
e-ISBN-13: 978-0387-33318-2

© 2006 Springer Science+Business Media, LLC
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use in this publication of trade names, trademarks, service marks and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed in the United States of America.

9 8 7 6 5 4 3 2 1

springer.com

To Marie-Chantal,
Étienne, Clémence and
Antoine. Without their
love, I could not be
such a happy Dad.

Yves

To Syham, Abdou and
Aymen.
To my mother and my
father


Araar Abdelkrim

This page intentionally left blank.

Contents

Dedication

v

List of Figures

xv

List of Tables

xviii

Preface

xix

Part I Conceptual and methodological issues

 

1 WELL-BEING AND POVERTY

3

1.1 The welfarist approach

3

1.2 Non-welfarist approaches

5

1.2.1 Basic needs and functionings

5

1.2.2 Capabilities

7

1.3 A graphical illustration

8

1.3.1 Exercises

11

1.4 Practical measurement difficulties for the non-welfarist approaches

12

1.5 Poverty measurement and public policy

15

1.5.1 Poverty measurement matters

15

1.5.2 Welfarist and non-welfarist policy implications

16

2 THE EMPIRICAL MEASUREMENT OF WELL-BEING

19

2.1 Survey issues

19

2.2 Income versus consumption

21

2.3 Price variability

22

2.3.1 Exercises

26

2.4 Household heterogeneity

27

2.4.1 Estimating equivalence scales

28

2.4.2 Sensitivity analysis

29

2.4.3 Household decision-making and within-household inequality

32

2.4.4 Counting units

33

2.5 References for Chapters 1 and 2

33

Part II Measuring poverty and equity

 

3 INTRODUCTION AND NOTATION

39

3.1 Continuous distributions

40

3.2 Discrete distributions

41

3.3 Poverty gaps

42

3.4 Cardinal versus ordinal comparisons

43

4 MEASURING INEQUALITY AND SOCIAL WELFARE

49

4.1 Lorenz curves

49

4.2 Gini indices

53

4.2.1 Linear inequality indices and S-Gini indices

53

4.2.2 Interpreting Gini indices

57

4.2.3 Gini indices and relative deprivation

59

4.3 Social welfare and inequality

60

4.4 Social welfare

63

4.4.1 Atkinson indices

63

4.4.2 S-Gini social welfare indices

65

4.4.3 Generalized Lorenz curves

65

4.5 Statistical and descriptive indices of inequality

66

4.6 Decomposing inequality by population subgroups

67

4.6.1 Generalized entropy indices of inequality

67

4.6.2 A subgroup Shapley decomposition of inequality indices

69

4.7 Appendix: the Shapley value

71

4.8 References

72

5 MEASURING POVERTY

81

5.1 Poverty indices

81

5.1.1 The EDE approach

81

5.1.2 The poverty gap approach

82

5.1.3 Interpreting FGT indices

83

5.1.4 Relative contributions to FGT indices

84

5.1.5 EDE poverty gaps for FGT indices

86

5.2 Group-decomposable poverty indices

87

5.3 Poverty and inequality

88

5.4 Poverty curves

89

5.5 S-Gini poverty indices

90

5.6 Normalizing poverty indices

91

5.7 Decomposing poverty

92

5.7.1 Growth-redistribution decompositions

92

5.7.2 Demographic and sectoral decomposition of differences in FGT indices

95

5.7.3 The impact of demographic changes

96

5.7.4 Decomposing poverty by income components

97

5.8 References

98

6 ESTIMATING POVERTY LINES

103

6.1 Absolute and relative poverty lines

103

6.2 Social exclusion and relative deprivation

105

6.3 Estimating absolute poverty lines

106

6.3.1 Cost of basic needs

106

6.3.2 Cost of food needs

106

6.3.3 Non-food poverty lines

110

6.3.4 Food energy intake

113

6.3.5 Illustration for Cameroon

114

6.4 Estimating relative and subjective poverty lines

116

6.4.1 Relative poverty lines

116

6.4.2 Subjective poverty lines

119

6.4.3 Subjective poverty lines with discrete information

120

6.5 References

121

7 MEASURING PROGRESSIVITY AND VERTICAL EQUITY

127

7.1 Taxes and transfers

127

7.2 Concentration curves

128

7.3 Concentration indices

130

7.4 Decomposition of inequality into income components

130

7.4.1 Using concentration curves and indices

130

7.4.2 Using the Shapley value

132

7.5 Progressivity comparisons

132

7.5.1 Deterministic tax and benefit systems

132

7.5.2 General tax and benefit systems

135

7.6 Tax and income redistribution

135

7.7 References

137

8 HORIZONTAL EQUITY, RERANKING AND REDISTRIBUTION

141

8.1 Ethical and other foundations

141

8.2 Measuring reranking and redistribution

143

8.2.1 Reranking

144

8.2.2 S-Gini indices of equity and redistribution

144

8.2.3 Redistribution and vertical and horizontal equity

145

8.3 Measuring classical horizontal inequity and redistribution

147

8.3.1 Horizontally-equitable net incomes

147

8.3.2 Change-in-inequality approach

149

8.3.3 Cost-of-inequality approach

149

8.3.4 Decomposition of classical horizontal inequity

151

8.4 References

151

Part III Ordinal comparisons

 

9 DISTRIBUTIVE DOMINANCE

155

9.1 Ordering distributions

155

9.2 Sensitivity of poverty comparisons

155

9.3 Ordinal comparisons

156

9.4 Ethical judgements

158

9.4.1 Dominance tests

158

9.4.2 Paretian judgments

158

9.4.3 First-order judgments

159

9.4.4 Higher-order judgments

160

9.5 References

162

10. POVERTY DOMINANCE

165

10.1 Primal approach

167

10.1.1 Dominance tests

167

10.1.2 Nesting of dominance tests

171

10.2 Dual approach

173

10.2.1 First-order poverty dominance

173

10.2.2 Second-order poverty dominance

174

10.2.3 Higher-order poverty dominance

175

10.3 Assessing the limits to dominance

176

10.4 References

177

11 WELFARE AND INEQUALITY DOMINANCE

181

11.1 Ethical welfare judgments

181

11.2 Tests of welfare dominance

182

11.3 Inequality judgments

184

11.4 Tests of inequality dominance

186

11.5 Inequality and progressivity

187

11.6 Social welfare and Lorenz curves

188

11.7 The distributive impact of benefits

189

11.8 Pro-poor growth

190

11.8.1 First-order pro-poor judgements

190

11.8.2 Second-order pro-poor judgements

192

11.9 References

194

Part IV Policy and growth

 

12 POVERTY ALLEVIATION: POLICY AND GROWTH

199

12.1 The impact of targeting

199

12.1.1 Group-targeting a constant amount

200

12.1.2 Inequality-neutral targeting

201

12.2 The impact of changes in the poverty line

204

12.3 Price changes

207

12.4 Tax and subsidy reforms

209

12.5 Income-component and sectoral growth

211

12.5.1 Absolute poverty impact

212

12.5.2 Poverty elasticity

213

12.6 Overall growth elasticity of poverty

213

12.7 The Gini elasticity of poverty

216

12.7.1 Inequality and poverty

216

12.7.2 Increasing bi-polarization and poverty

217

12.8 The impact of policy and growth on inequality

218

12.8.1 Growth, fiscal policy, and price shocks

218

12.8.2 Tax and subsidy reforms

220

12.9 References

222

13 TARGETING IN THE PRESENCE OF REDISTRIBUTIVE COSTS

225

13.1 Poverty alleviation, redistributive costs and targeting

226

13.2 Costly targeting

228

13.2.1 Minimizing the headcount

228

13.2.2 Minimizing the average poverty gap

228

13.2.3 Minimizing a distribution-sensitive poverty index

231

13.2.4 Optimal redistribution

233

13.3 References

234

Part V Estimation and inference

 

14 AN INTRODUCTION TO DAD: A SOFTWARE FOR DISTRIBUTIVE ANALYSIS

241

14.1 Introduction

241

14.2 Loading, editing and saving databases in DAD

243

14.3 Inputting the sampling design information

245

14.4 Applications in DAD: basic procedures

246

14.5 Curves

248

14.6 Graphs

251

14.7 Statistical inference: sampling distributions, confidence intervals and hypothesis testing

252

14.8 References

253

15 NON-PARAMETRIC ESTIMATION IN DAD

259

15.1 Density estimation

259

15.1.1 Univariate density estimation

259

15.1.2 Statistical properties of kernel density estimation

262

15.1.3 Choosing a window width

263

15.1.4 Multivariate density estimation

265

15.1.5 Simulating from a density estimate

265

15.2 Non-parametric regressions

267

15.3 References

269

16 ESTIMATION AND STATISTICAL INFERENCE

271

16.1 Sampling design

271

16.2 Sampling weights

273

16.3 Stratification

274

16.4 Multi-stage sampling

275

16.5 Impact of sampling design on sampling variability

278

16.5.1 Stratification

278

16.5.2 Clustering

280

16.5.3 Finite population corrections

281

16.5.4 Weighting

283

16.5.5 Summary

284

16.6 Estimating a sampling distribution with complex sample designs

284

16.7 References

287

17 STATISTICAL INFERENCE IN PRACTICE

289

17.1 Asymptotic distributions

289

17.2 Hypothesis testing

291

17.3 p-values and confidence intervals

293

17.4 Statistical inference using a non-pivotal bootstrap

294

17.5 Hypothesis testing and confidence intervals using pivotal bootstrap statistics

295

17.6 References

297

Part VI Exercises

 

18 EXERCISES

303

18.1 Household size and living standards

303

18.2 Aggregative weights and poverty analysis

303

18.3 Absolute and relative poverty

304

18.4 Estimating poverty lines

304

18.5 Descriptive data analysis

306

18.6 Decomposing poverty

306

18.7 Poverty dominance

306

18.8 Fiscal incidence, growth, equity and poverty

307

18.9 Sampling designs and sampling distributions

316

18.10 Equivalence scales and statistical units

317

18.11 Description of illustrative data sets

319

References

329

Symbols

376

Authors

382

Subjects

390

List of Figures

1.1

Capabilities, achievements and consumption

10

1.2

Capabilities and achievements under varying preferences

10

1.3

Capability sets and achievement failures

14

1.4

Minimum consumption needed to escape capability poverty

14

2.1

Price adjustments and well-being with two commodities

25

2.2

Equivalence scales and reference well-being

30

3.1

Quantile curve for a continuous distribution

45

3.2

Quantile curve for a discrete distribution

46

3.3

Incomes and poverty at different percentiles

47

4.1

Lorenz curve

51

4.2

The weighting function k(p;ρ)

54

4.3

The weighting function ω(p;ρ)

54

4.4

Mean income and inequality for constant social welfare

74

4.5

Homothetic social evaluation functions

75

4.6

Social utility and incomes

76

4.7

Marginal social utility and incomes

77

4.8

Atkinson social evaluation functions and the cost of inequality

78

4.9

Inequality aversion and the cost of inequality

79

4.10

Generalized Lorenz curve

80

5.1

Contribution of poverty gaps to FGT indices

85

5.2

The relative contribution of the poor to FGT indices

100

5.3

Socially-representative poverty gaps for the FGT indices

101

5.4

The cumulative poverty gap curve

102

6.1

Engel curves and cost-of-basic-needs baskets

108

6.2

Food preferences and the cost of a minimum calorie intake

111

6.3

Food, non-food and total poverty lines

122

6.4

Expenditure and calorie intake

123

6.5

Subjective poverty lines

124

6.6

Estimating a subjective poverty line with discrete subjective information

125

10.1

Primal stochastic dominance curves

168

10.2

s-order poverty dominance

172

10.3

Poverty indices and ethical judgements

178

10.4

Poverty dominance and income distributions

179

10.5

Classes of poverty indices and upper bounds for poverty lines

180

11.1

Inequality and social welfare dominance

196

12.1

Growth elasticity of the poverty headcount

215

13.1

Targeting and redistributive costs

229

13.2

Optimal set of benefit recipients and levels of state expenditure with α = 0

236

13.3

Optimal set of benefit recipients and levels of state expenditure with α = 1

236

13.4

Optimal set of benefit recipients and levels of state expenditure with α = 2

237

14.1

The spreadsheet for handling and visualizing data in DAD

244

14.2

The Set Sample Design window in DAD

246

14.3

Application window for estimating the FGT poverty index - one distribution.

254

14.4

Choosing between configurations of one or two distributions

255

14.5

Lorenz curves for two distributions

255

14.6

Differences in Lorenz curves drawn by DAD

256

14.7

The dialogue box for graphical options

256

14.8

The STD option

257

15.1

Histograms and density functions

261

List of Tables

2.1

Equivalent income and price changes

27

2.2

Equivalent income and price changes

27

3.1

Incomes and quantiles in a discrete distribution

42

6.1

Estimated poverty lines in Cameroon according to different methods

117

6.2

Headcount index headcount according to alternative measurement methods and for different regions in Cameroon

117

6.3

Distribution of the poor according to calorie, food and total expenditures poverty

118

9.1

Sensitivity of poverty comparisons to choice of poverty indices and poverty lines

156

9.2

Sensitivity of differences in poverty to choice of indices

157

12.1

The impact on poverty of targeting a constant marginal amount to everyone in the population or in a group k

202

12.2

The impact on poverty of inequality-neutral targeting within the entire population or within a group k

205

14.1

DAD's applications

249

14.2

DAD's applications (continued)

250

14.3

Available formats to save DAD's graphs

252

14.4

Curves and stochastic dominance

252

17.1

Confidence intervals and pvalues associated to the usual hypothesis tests

294

17.2

Confidence intervals and p values for the usual hypothesis tests, using non-pivotal bootstrap statistics

296

17.3

Confidence intervals and p values for the usual hypothesis tests, using pivotal bootstrap statistics

296

18.1

The distribution of households in ECAM (1996)

319

18.2

The distribution of households in ESAM (1995)

324

Preface

The publication of this book is in large part due to the role of Canada's International Development Research Center (IDRC) in encouraging policy-relevant research in the fields of poverty and equity. The book has in particular benefitted much from IDRC's support of two significant ventures, the Micro Impacts of Macro Economic and Adjustment Programs (MIMAP) and the Poverty and Economic Policy (PEP) international research network. We are most grateful to the IDRC for their continued and inspiring dedication, professionalism and vision in the field of development research. The Secrétariat d'appui institutionnel à la recherche économique en Afrique (SISERA), the World Bank Institute and the African Economic Research Consortium (AERC) have also partially supported to the production of this book. The fundamental research was further financed by grants from the Social Sciences and Humanities Research Council of Canada (SSHRC) and from the Fonds Québécois de Recherche sur la Société et la Culture (FQRSC) of the Province of Québec.

This book is mostly targeted to senior undergraduate and graduate students in economics as well as to researchers and analytical policy makers. More generally, it is intended for social scientists and statisticians. Some of its content can also be instructive to less specialized readers, such as those in the general public wishing to introduce themselves to the challenges posed and the insights generated by distributive analysis.

The book covers a relatively wide range of material. Part I deals with some of the conceptual, methodological and empirical issues and difficulties that arise in the assessment of well-being and poverty. Part II presents a number of measures on poverty, inequality, social welfare, and vertical and horizontal equity. Part III considers some of the methods that can establish whether a distribution of well-being or a policy "dominates" another in terms of some generally-defined ethical criteria. Part IV develops tools that can be used to understand and predict how targeting, price changes, growth and fiscal policy can affect poverty and equity. Part V introduces some of the statistical tech­niques that can help depict the distribution of living standards and help protect against the presence of sampling errors in making poverty and equity comparisons. Part V also introduces DAD and shows how that software can be used to apply the book's measurement and statistical techniques to micro data. Part VI contains a number of exercises (with illustrative datasets) that can be used to learn to implement some of the measurement and statistical techniques described in the book.

We certainly cannot pretend the book to be a comprehensive survey of the methods used to analyze poverty and equity. There is an obvious tendency for an author's exposition of a subject to be biased in favor of the work he knows best—and thus in favor of the work most closely related to his own work. This book is a clear example of this bias. One advantage of such a bias, however, is that it tends to unify the exposition. Such a unification, we have tried to enforce as much as we could throughout the various parts of the book. This helped present in a single text a unified treatment of distributive analysis from a conceptual, methodological, policy, statistical and practical point of view.

Most of the book's footnotes refer to applications programmed in DAD. These footnotes can thus guide the reader to where to go in DAD to test and implement many of the measurement and statistical tools exposed in the book. In the margins appear the exercise numbers which can be used to learn more about the book's tools. Most of these exercises involve the use of DAD. The solutions to the exercises can be found on DAD's official web page,

www.mimap.ecn.ulaval.ca.

The illustrative datasets used are briefly described at the end of the exercises. An index of the symbols used can be found starting on page 376. An author and a subject index are also provided at the end of the book.

To ease exposition within the main text, we endeavored to limit as much possible references to the literature, except when such references were clearly improving readability. Instead, each chapter is followed by a reference section in which the chapter's appropriate bibliographic references are mentioned and linked to each other.

This book and the accompanying software are certainly perfectible. I suppose it is the plight of all book writers to feel that their product is never satisfactorily finished. We hope to correct some of this version's shortcomings in future editions. For this, any comments on this first edition will be gratefully received.

I wish to thank my co-authors and former students, Sami Bibi, Philippe Grégoire, Vincent Jalbert, Paul Makdissi and Martin Tabi for their insights and dedication. I am also very grateful to my co-authors on distributive analysis papers — Russell Davidson, Damien Échevin, Carl Fortin, Peter Lambert, Magda Mercader, David Sahn, Steve Younger and Quentin Wodon — for their friendship and fruitful collaboration. Work at Université Laval was made productive and particularly enjoyable by the encouragement of my colleagues — among whom Bernard Decaluwé and Bernard Fortin feature prominently as former heads of CRÉFA — and more generally by the support of the Department of Economics and CIRPÉE (formerly CRÉFA). My thanks also extend to MIMAP and PEP co-workers, inter alia Touhami Abdelkhalek, Louis-Marie Asselin, Dorothée Boccanfuso, John Cockburn, Anyck Dauphin, Yazid Dissou, Samuel Kaboré, Jean Bosco Ki, Marie-Claude Martin, Damien Mededji, Abena Oduro, Luc Savard, Randy Spence, and to the teams of IDRC and AERC administrators and researchers with whom we have had the pleasure and privilege to work in the last decade. They provided much of the motivation and inspiration for writing this book. I am also grateful to my co-author, Abdelkrim Araar, for the trust and dedication he put into building DAD and this book's material over the last years, despite the uncertainty that initially clouded the project. I finally wish to thank Bill Carman of IDRC and Marilea Polk Fried of Springer-Kluwer for their efforts in bringing the publication of this book to full completion.

JEAN-YVES DUCLOS

Developing the DAD software, conducting fundamental research in distributive analysis and assisting researchers in developed countries have been my main activities for the last several years. The expertise that I have acquired in distributive analysis is the result of the continued support that I have received from Jean-Yves Duclos, who was also the director of my Ph.D. thesis. I am also grateful to all the researchers with whom I have worked for their collaboration and assistance.

ARAAR ABDELKRIM

This page intentionally left blank.

PART I
CONCEPTUAL AND METHODOLOGICAL ISSUES

This page intentionally left blank.

Chapter 1
WELL-BEING AND POVERTY

The assessment of well-being for poverty analysis is traditionally characterized according to two main approaches, which, following Ravallion (1994), we will term the welfarist and the non-welfarist approaches. The first approach tends to concentrate in practice mainly on comparisons of "economic well-being", which we will also call "standard of living" or "income" (for short). As we will see, this approach has strong links with traditional economic theory, and it is also widely used by economists in the operations and research work of organizations such as the World Bank, the International Monetary Fund, and Ministries of Finance and Planning of both developed and developing countries. The second approach has historically been advocated mainly by social scientists other than economists and partly in reaction to the first approach. This second approach has nevertheless also been recently and increasingly advocated by economists and non-economists alike as a multidimensional complement to the unidimensional standard of living approach.

1.1 The welfarist approach

The welfarist approach is strongly anchored in classical micro-economics, where, in the language of economics, "welfare" or "utility" are generally key in accounting for the behavior and the well-being of individuals. Classical microeconomics usually postulates that individuals are rational and that they can be presumed to be the best judges of the sort of life and activities which maximize their utility and happiness. Given their initial endowments (including time, land, and physical, financial and human capital), individuals make production and consumption choices using their set of preferences over bundles of consumption and production activities, and taking into account the available production technology and the consumer and producer prices that prevail in the economy.

Under these assumptions and constraints, a process of individual and rational free choice will maximize the individuals' utility; under additional assumptions (including that markets are competitive, that agents have perfect information, and that there are no externalities — assumptions that are thus restrictive), a society of individuals all acting independently under this freedom of choice process will also lead to an outcome known as Pareto -efficient, in that no one's utility could be further improved by government intervention without decreasing someone else's utility.

Underlying the welfarist approach to poverty, there is a premise that good note should be taken of the information revealed by individual behavior when it comes to assessing poverty. More precisely, the assessment of someone's well-being should be consistent with the ordering of preferences revealed by that person's free choices. For instance, a person could be observed to be poor by the total consumption or income standard of a poverty analyst. That same person could nevertheless be able (i.e., have the working capacity) to be non-poor. This could be revealed by the observation of a deliberate and free choice on the part of the individual to work and consume little, when the capability to work and consume more nevertheless exists. By choosing to spend little (possibly for the benefit of greater leisure), the person reveals that he is happier than if he worked and spent more. Although he could be considered poor by the standard of a (non-welfarist) poverty analyst, a comprehensive utility judgement would conclude that this person is not poor. As we will discuss later, this can have important implications for the design and the assessment of public policy.

A pure welfarist approach faces important practical problems. To be operational, pure welfarism requires the observation of sufficiently informative revealed preferences. For instance, for someone to be declared poor or not poor, it is not enough to know that person's current characteristics and income status: it must also be inferred from that person's actions whether he judges his utility status to be above a certain poverty utility level.

A related problem with the pure welfarist approach is the need to assess levels of utility or "psychic happiness". How are we to measure the actual pleasure derived from experiencing economic well-being? Moreover, it is highly problematic to attempt to compare that level of utility across individuals — it is well known that such a procedure poses serious ethical difficulties, preferences are heterogeneous, personal characteristics, needs and enjoyment abilities are diverse, households differ in size and composition, and prices vary across time and space. More generally, because economic well-being (in particular, utility) is typically seen as a subjective concept, most economists believe that interpersonal comparisons of economic well-being do not make much sense.

Supposing that these criticisms are resolved, the welfarist approach would classify as poor an individual who is materially well-off but not content, and as not poor an individual materially deprived but nevertheless content. It is not clear that we should accept as ethically significant such individual feelings of utilities. Said differently, why should a difficult-to-satisfy rich person be judged less well-off than an easily-contented poor person? Or, in the words of Sen (1983), p. 160, why should a "grumbling rich" be judged "poorer" than a "contented peasant"?

Hence, welfarist comparisons of poverty almost invariably use imperfect but objectively observable proxies for utilities, such as income or consumption. The "working" definition of poverty for the welfarist approach is therefore a lack of command over commodities, measured by low income or consumption. These money-metric indicators are often adjusted for differences in needs, prices, and household sizes and compositions, but they clearly represent far-from-perfect indicators of utility and well-being. Indeed, economic theory tells us little about how to use consumption or income to make consistent interpersonal comparisons of well-being. Besides, the consumption and income proxies are rarely able to take full account of the role for well-being of public goods and non-market commodities, such as safety, liberty, peace, health. In principle, such commodities can be valued using reference or "shadow" prices. In practice, this is difficult to do accurately and consistently.

1.2 Non-welfarist approaches

1.2.1 Basic needs and functionings

There are two major non-welfarist approaches, the basic-needs approach and the capability approach. The first focuses on the need to attain some basic multidimensional outcomes that can be observed and monitored relatively easily. These outcomes are usually (explicitly or implicitly) linked with the concept of functionings, a concept largely developed in Amartya Sen's influential work:

Living may be seen as consisting of a set of interrelated 'functionings', consisting of beings and doings. A person's achievement in this respect can be seen as the vector of his or her functionings. The relevant functionings can vary from such elementary things as being adequately nourished, being in good health, avoiding escapable morbidity and premature mortality, etc., to more complex achievements such as being happy, having self-respect, taking part in the life of the community, and so on (Sen 1992, p.39).

In this view, functionings can be understood to be constitutive elements of well-being. One lives well if he enjoys a sufficiently large level of functionings. The functioning approach would generally not attempt to compress these multidimensional elements into a single dimension such as utility or happiness. Utility or happiness is viewed as a reductive aggregate of functionings, which are multidimensional in nature. The functioning approach usually focuses instead on the attainment of multiple specific and separate outcomes, such as the enjoyment of a particular type of commodity consumption, being healthy, literate, well-clothed, well-housed, socially empowered, etc..

The functioning approach is closely linked to the well-known basic needs approach, and the two are often difficult to distinguish in practice. Functionings, however, are not synonymous with basic needs. Basic needs can be understood as the physical inputs that are usually required for individuals to achieve functionings. Hence, basic needs are usually defined in terms of means rather than outcomes, for instance, as living in the proximity of providers of health care services (but not necessarily being in good health), as the number of years of achieved schooling (but not necessarily being literate), as living in a democracy (but not necessarily participating in the life of the community), and so on. In other words,

Basic needs may be interpreted in terms of minimum specified quantities of such things as food, shelter, water and sanitation that are necessary to prevent ill health, undernourishment and the like (Streeten, Burki, Ul Haq, Hicks, and Stewart 1981).

Unlike functionings, which can be commonly defined for all individuals, the specification of basic needs depends on the characteristics of individuals and of the societies in which they live. For instance, the basic commodities required for someone to be in good health and not to be undernourished will depend on the climate and on the physiological characteristics of individuals. Similarly, the clothes necessary for one not to feel ashamed will depend on the norms of the society in which he lives, and the means necessary to travel, on whether he is handicapped or not. Hence, although the fulfillment of basic needs is an important element in assessing whether someone has achieved some functionings, this assessment must also use information on one's characteristics and socio-economic environment. Human diversity is such that equality in the space of basic needs generally translates into inequality in the space in functionings.

Whether unidimensional or multidimensional in nature, most applications of both the welfarist and the non-welfarist approaches to poverty measurement do recognize the role of heterogeneity in characteristics and in socio-economic environments in achieving well-being. Streeten, Burki, UI Haq, Hicks, and Stewart (1981) and others have nevertheless argued that the basic needs approach is less abstract than the welfarist approach in recognizing that role. Indeed, as mentioned above, assessing the fulfillment of basic needs can be seen as a useful practical and operational step towards appraising the achievement of the more abstract "functionings".

Clearly, however, there are important degrees in the multidimensional achievements of basic needs and functionings. For instance, what does it mean precisely to be "adequately nourished"? Which degree of nutritional adequacy is relevant for poverty assessment? Should the means needed for adequate nutritional functioning only allow for the simplest possible diet and for highest nutritional efficiency? These problems also crop up in the estima­tion of poverty lines in the welfarist approach. A multidimensional approach extends them to several dimensions.

In addition, how ought we to understand such functionings as the functioning of self-respect? The appropriate width and depth of the concept of basic needs and functionings is admittedly ambiguous, as there are degrees of functionings which make life enjoyable in addition to making it purely sustainable or satisfactory. Furthermore, could some of the dimensions be substitutes in the attainment of a given degree of well-being? That is, could it be that one could do with lower needs and functionings in some dimensions if he has high achievements in the other dimensions? Such possibilities of substitutability are generally ignored (and are indeed hard to specify precisely) in the multidimensional non-welfarist approaches.

1.2.2 Capabilities

A second alternative to the welfarist approach is called the capability approach, also pioneered and advocated in the last three decades by the work of Sen. The capability approach is defined by the capacity to achieve functionings, as defined above. In Sen (1992)'s words,

the capability to function represents the various combinations of functionings (beings and doings) that the person can achieve. Capability is, thus, a set of vectors of functionings, reflecting the person's freedom to lead one type of life or another, (p.40)

What matters for the capability approach is the ability of an individual to function well in society; it is not the functionings actually achieved by the person per se. Having the capability to achieve "basic" functionings is the source of freedom to live well, and is thereby sufficient in the capability approach for one not to be poor or deprived.

The capability approach thus distances itself from achievements of specific outcomes or functionings. In this, it imparts considerable value to freedom of choice: a person will not be judged poor even if he chooses not to achieve some functionings, so long as he would be able to achieve them if he so chose. This distinction between outcomes and the capability to achieve these outcomes also recognizes the importance of preference diversity and individuality in determining functioning choices. It is, for instance, not everyone's wish to be well-clothed or to participate in society, even if the capability is present.

An interesting example of the distinction between fulfilment of basic needs, functioning achievement and capability is given in Townsend (1979)'s (Table 6.3) deprivation index. This deprivation index is built from answers to questions such as whether someone "has not had an afternoon or evening out for entertainment in the last two weeks", or "has not had a cooked breakfast most days of the week". It may be, however, that one chooses deliberately not to have time out for entertainment (he prefers to watch television), or that he chooses not to have a cooked breakfast (he does not want to spend the time to prepare it), although he does have the capacity to have both. That person therefore achieves the functioning of being entertained without meeting the basic need of going out once a fortnight, and he does have the capacity to achieve the functioning of having a cooked breakfast, although he chooses not to have one.

The difference between the capability and the functioning/basic needs approaches is in fact somewhat analogous to the difference between the use of income and consumption as indicators of living standards. Income shows the capability to consume, and "consumption functioning" can be understood as the outcome of the exercise of that capability. There is consumption only if a person chooses to enact his capacity to consume a given income. In the basic needs and functioning approach, deprivation comes from a lack of direct consumption or functioning experience; in the capability approach, poverty arises from the lack of incomes and capabilities, which are imperfectly related to the functionings actually achieved.

Although the capability set is multidimensional, it thus exhibits a parallel with the unidimensional income indicator, whose size determines the size of the "budget set":

Just as the so-called 'budget set' in the commodity space represents a person's freedom to buy commodity bundles, the 'capability set' in the functioning space reflects the person's freedom to choose from possible livings (Sen 1992, p. 40).

This shows further the fundamental distinction between the extents of freedoms and capabilities, the space of achievements, and the resources required to generate these freedoms and to attain these achievements.

1.3 A graphical illustration

To illustrate the relationships between the main approaches to assessing poverty, consider Figure 1.1. Figure 1.1. shows in four quadrants the links between income, consumption of two goods — transportation T and clothing C goods — and the functionings associated to the consumption of each of these two goods. The northeast quadrant shows a typical budget set for the two goods and for a budget constraint Y1. The curve U1 shows the utility indifference curve along which the consumer chooses his preferred commodity bundle, which is here located at point A.

The northwestern and the southeastern quadrants then transform the consumption of goods T and C into associated functionings FT and FC. This is done through the functioning Transformation Curves TCT and TCC, for transformation of consumption of T and C into transportation and clothing functionings, respectively. The curves TCT and TCC appear respectively in the northwest and the southeast quadrants respectively. These curves thus bring us from the northeastern space of commodities, {C, T}, into the southwestern space of functionings, {FC, FT}. Using these transformation functions, we can draw a budget constraint S1 in the space of functionings using the traditional commodity budget constraint, Y1. Since the consumer chooses point A in the space of commodities, he enjoys B's combination of functionings. But all of the functionings within the constraint S1 can also be attained by the consumer. The triangular area between the origin and the line S1 thus represents the individual's capability set. It is the set of functionings which he is able to achieve.

Now assume that functioning thresholds of zC and zT must be exceeded (or must be potentially exceeded) for one not to be considered poor by non- wel-farist analysts. Given the transformation functions TCT and TCF, a budget constraint Y1 makes the individual capable of not being poor in the functioning space. But this does not guarantee that the individual will choose a combination of functionings that will exceed zC and zT: this also depends on the individual's preferences. At point A, the functionings achieved are above the minimum functioning thresholds fixed in each dimension. Other points within the capability set would also surpass the functioning thresholds: these points are shown in the shaded triangle to the northeast of point B. Since part of the capability set allows the individual to be non-poor in the space of functionings, the capability approach would also declare the individual not to be poor.

So would conclude, too, the functioning approach since the individual chooses functionings above zC and zT. Such a concordance between the two approaches does not always prevail, however. To see this, consider Figure 1.2. The commodity budget set and the functioning Transformation Curves have not changed, so that the capability set has not changed either. But there has a been a shift of preferences from U1 to U2, so that the individual now prefers point D to point A, and also prefers to consume less clothing than before. This makes his preferences for functionings to be located at point E, thus failing to exceed the minimum clothing functioning zC required. Hence, the person would be considered non-poor by the capability approach, but poor by the functioning approach. Whether an individual with preferences U2 is really poorer than one with preferences U1 is debatable, of course, since the two have exactly the same "opportunity sets", that is, have access to exactly the same commodity and capability sets.

An important allowance in the capability approach is that two persons with the same commodity budget set can face different capability sets. This is illustrated in Figure 1.3, where the functioning Transformation Curve for transportation has shifted from TCT to Image. This may due to the presence of a handicap, which makes it more costly in transportation expenses to generate a given level of transportation functioning (disabled persons would need to expend more to go from one place to another). This shift of the TCT curve moves the capability constraint to S1' and thus contracts the capability set.

Figure 1.1.: Capabilities, achievements and consumption

Image

Figure 1.2.: Capabilities and achievements under varying preferences

Image

With the handicap, there is no point within the new capability set that would surpass both functioning thresholds zC and zT. Hence, the person is deemed poor by the capability approach and (necessarily so) by the functioning approach. Whether the welfarist approach would also declare the person to be poor would depend on whether it takes into account the differences in needs implied by the difference between the TCT and the Image curves.

For the welfarist approach to be reasonably consistent with the functioning and capability approaches, it is thus essential to consider the role of transformation functions such as the TC curves. If this is done, we may (in our simple illustration at least) assess a person's capability status either in the commodity or in the functioning space.

To see this, consider Figure 1.4. Figure 1.4 is the same as Figure 1.1 except for the addition of the commodity budget constraint Y2 which shows the minimum consumption level needed for one not to be poor according to the capability approach. According to the capability approach, the capability set must contain at least one combination of functionings above zC and zT, and this condition is just met by the capability constraint S2 that is associated with the commodity budget Y2. Hence, to know whether someone is poor according to the capability approach, we may simply check whether his commodity budget constraint lies below Y2.

Even if the actual commodity budget constraint lies above Y2, the individual may well choose a point outside the non-poor functioning set, as we discussed above in the context of Figure 1.2. Clearly then, the minimum total consumption needed for one to be non-poor according to the functioning or basic needs approach generally exceeds the minimum total consumption needed for one to be non poor according to the capability approach. More problematically, this minimum total consumption depends in principle on the preferences of the individuals. On Figure 1.2, for instance, we saw that the individual with preference U2 was considered poor by the functioning approach, although another individual with the same budget and capability sets but with preferences U1 was considered non-poor by the same approach.

1.3.1 Exercises

1 Show on a figure such as Figure 1.1 the impact of an increase in the price of the transportation commodity on the commodity budget constraint and on the capability constraint.

2 On a figure such as Figure 1.4, show the minimal commodity budget set that ensures that the person

(a) is just able to attain one of the two minimum levels of functionings zC or zT;

(b) chooses a combination of functionings such that one of them exceeds the corresponding minimum level of functionings zC or zT;

(c) is just able to attain both minimum levels of functionings zC and zT;

(d) chooses a combination of functionings such that both exceed the corresponding minimum level of functionings zC and zT.

(e) How do these four minimal commodity budget constraints compare to each other? Which one corresponds to the different approaches to assessing poverty seen above?

1.4 Practical measurement difficulties for the non-welfarist approaches

The measurement of capabilities raises various problems. Unless a person chooses to enact them in the form of functioning achievements, capabilities are not easily inferred. Achievement of all basic functionings implies non-deprivation in the space of all capabilities; but a failure to achieve all basic functionings does not imply capability deprivation. This makes the monitoring of functioning and basic needs an imperfect tool for the assessment of capability deprivation.

Besides, and as for basic needs, there are clearly degrees of capabilities, some basic and some deeper. It would seem improbable that true well-being be a discontinuous function of achievements and capabilities. For most of the functionings assessed empirically, there are indeed degrees of achievements, such as for being healthy, literate, living without shame, etc... It would seem important to think of varying degrees of well-being in assessing and comparing achievements and capabilities, and not only to record dichotomic 0/1 answers to multidimensional qualitative criteria.

The multidimensional nature of the non-welfarist approaches also raises problems of comparability across dimensions. How should we assess adequately the well-being of someone who has the capability to achieve two functionings out of three, but not the third? Is that person necessarily "better off" than someone who can achieve only one, or even none of them? Are all capabilities of equal importance when we assess well-being?

The multidimensionality of the non-welfarist criteria also translates into greater implementation difficulties than for the usual proxy indicators of the welfarist approach. In the welfarist approach, the size of the multidimensional budget set is ordinarily summarized by income or total consumption, which can be thought of as a unidimensional indicator of freedom. Although there are many different combinations of consumption and functionings that are compatible with a unidimensional money-metric poverty threshold, the welfarist approach will generally not impose multidimensional thresholds. For instance, the welfarist approach will usually not require for one not to be poor that both food and non-food expenditures be larger than their respective food and non-food poverty lines. A similar transformation into a unidimensional indicator is more difficult with the capability and basic needs approaches.

One possible solution to this comparability problem is to use "efficiency-income units reflecting command over capabilities rather than command over goods and services" (Sen 1985, p.343), as we illustrated above when discussing Figure 1.4. This, however, is practically difficult to do, since command over many capabilities is hard to translate in terms of a single indicator, and since the "budget units" are hardly comparable across functionings such as well-nourishment, literacy, feeling self-respect, and taking part in the life of the community. On Figure 1.4, anyone with an income below Y2 would be judged capability-poor. But by how much does poverty vary among these capability-poor? A natural measure would be a function of the budget constraint. It is more difficult to make such measurements and comparisons within the non-money-metric capability set.

Figure 1.3.: Capability sets and achievement failures

Image

Figure 1.4.: Minimum consumption needed to escape capability poverty

Image

1.5 Poverty measurement and public policy

1.5.1 Poverty measurement matters

The measurement of well-being and poverty plays a central role in the discussion of public policy. It is used, among other things, to identify the poor and the non-poor, to design optimal poverty targeting schemes, to estimate the errors of exclusion and inclusion in the targeting set (also known as Type I and Type II errors), and to assess the equity of poverty alleviation policies. Is growth "pro-poor"? How do indirect taxes and relative price changes affect the poor? What should the target groups be for socially-improving government interventions? What impact do transfers have on poverty? Is it the poorest of the poor who benefit most from public policy?

An important example of the central role of poverty measurement in the setting of public policy is the optimal selection of safety net targeting indicators. The theory of optimal targeting suggests that it will commonly be best to target individuals on the basis of indicators that are as easily observable and as exogenous as possible, while being as correlated as possible with the true poverty status of the individuals. Indicators that are not readily observable by program administrators are of little practical value. Indicators that can be changed effortlessly by individuals will be distorted by the presence of the program, and will lose their poverty-informative value. Whether easily observable and sufficiently exogenous indicators are sufficiently correlated with the deprivation of individuals in a population is given by a poverty profile. The value of this profile will naturally be highly dependent on the approach used and the particular assumptions made to measure well-being and poverty.

Estimation of inclusion and exclusion errors is also a product of poverty profiling and measurement. These errors are central in the trade-off involved in choosing between a wide coverage of the population — at relatively low administrative and efficiency costs — and a narrower coverage — with more generous support for the fewer beneficiaries. Indeed, as van de Walle (1998a) puts it, a narrower coverage of the population, with presumably smaller errors of inclusion of the non-poor, does not inevitably lead to a more equitable treatment of the poor:

Concentrating solely on errors of leakage to the non-poor can lead to policies which have weak coverage of the poor (p.366).

The terms of this trade-off are again given by a poverty assessment exercise.

Another lesson of optimal redistribution theory is that it is usually better to transfer resources from groups with a high level of average well-being to those with a lower one. What matters more, however, is the distribution of well-being within each of the groups. For instance, equalizing mean well-being across groups does not ordinarily eliminate poverty since there generally exist within-group inequalities. Even within the richer group, for instance, there normally will be found some deprived individuals, whom a rich-to-poor cross-group redistributive process would clearly not take out of poverty. The within- and between-group distribution of well-being that is required for devising an optimal redistributive scheme can again be revealed by a comprehensive poverty profile.

1.5.2 Welfarist and non-welfarist policy implications

The distinction between the welfarist and non-welfarist approaches to poverty measurement often matters (implicitly or explicitly) for the assessment and the design of public policy. As described above, a welfarist approach holds that individuals are the best judges of their own well-being. It would thus in principle avoid making appraisals of well-being that conflict with the poor's views of their own situation. A typical example of a welfarist public policy would be the provision of adequate income-generating opportunities, letting individuals decide and reveal whether these opportunities are utility maximizing, keeping in mind the other non-income-generating opportunities that are available to them.

A non-welfarist policy analyst would argue, however, that providing income opportunities is not necessarily the best policy option. This is partly because individuals are not necessarily best left to their own resolutions, at least in an intertemporal setting, regarding educational and environmental choices for example. The poor's short-run preoccupations may, for instance, harm their long-term self-interest. Individuals may choose not to attend skill-enhancing programs because they deceivingly appear overly time costly in the short-run, and because they are not sufficiently aware or convinced of their long-term benefits. Besides, if left to themselves, the poor will not necessarily spend their income increase on functionings that basic-needs analysts would normally consider a priority, such as good nutrition and health.

Thus, fulfilling "basic needs" cannot be satisfied only by the generation of private income, but may require significant amounts of targeted and in-kind public expenditures on areas such as education, public health and the environment. This would be so even (and especially) if the poor did not presently believe that these areas were deserving of public expenditures. Furthermore, social cohesion concerns are arguably not well addressed by the maximization of private utility, and raising income opportunities will not fundamentally solve problems caused by adverse intra-household distributions of well-being, for instance.

An objection to the basic needs approach is that it is clearly paternalistic since it supposes that it must be in the absolute interests of all to meet a set of often arbitrarily specified needs. Indeed, as emphasized above, non-welfarist approaches generally use criteria for identifying and helping the poor that may conflict both with the poor's preferences and with their utility maximizing choices. The welfarist school conversely emphasizes that individuals are generally better placed to judge what is good for them. For instance,

To conclude that a person was not capable of living a long life we must know more than just how long she lived: perhaps she preferred a short but merry life. (Lipton and Ravallion 1995)

To force that person to live a long but boring life might thus go against her preferences.

For poverty alleviation purposes, the prescriptions of non-welfarist approaches could in principle go as far as, for instance, enforced enrolment in community development programs, forced migration, or forced family planning. This may not only conflict with the preferences of the poor, but would also clearly undermine their freedom to choose. Freedom to choose is, however, arguably one of the most important basic capabilities that contribute fundamentally to well-being.

A further example of the possible tension between the welfarist and non-welfarist influences on public policy comes from optimal taxation theory, which is linked to the theory of optimal poverty alleviation. In the tradition of classical microeconomics, which values leisure in the production and labor market decisions of individuals, pure welfarists would incorporate the utility of leisure in the overall utility function of workers, poor and non-poor alike. In its support to the poor, the government would then take care of minimizing the distortion of their labor/leisure choices so as not to create overly high "deadweight losses". Classical optimal taxation theory then shows that being concerned with such things as labor/leisure distortions implies a generally lower benefit reduction rates on the income of the poor than otherwise. Taking into account such abstract things as "deadweight losses" is, however, less typical of the basic needs and functioning approaches. Such approaches would, therefore, usually target program benefits more sharply on the poor, and would exact steeper benefit reduction rates as income or well-being increases.

Relative to the pure welfarist approach, non-welfarist approaches are also typically less reluctant to impose utility-decreasing (or "workfare") costs as side effects of participation in poverty alleviation schemes. These side effects are in fact often observed in practice. For instance, it is well-known that income support programs frequently impose participation costs on benefit claimants. These are typically non-monetary costs. Such costs can be both physical and psychological: providing manual labor, spending time away from home, sacrificing leisure and home production, finding information about application and eligibility conditions, corresponding and dealing with the benefit agency, queuing, keeping appointments, complying with application conditions, revealing personal information, feeling "stigma" or a sense of guilt, etc...

Although non-monetary, these costs impact on participants' net utility from participating in the programs. When they are negatively correlated with unobserved (or difficult to observe) entitlement indicators, they can provide self-selection mechanisms that enhance the efficiency of poverty alleviation programs, for welfarists and non-welfarists alike. One unfortunate effect of these costs is, however, that many truly-entitled and truly deserving individuals may shy away from the programs because of the costs they impose. Although program participation could raise their income and consumption above a money-metric poverty line, some individuals will prefer not to participate, revealing that they find apparent poverty utility greater than that of program participation. Welfarists would in principle take these costs into account when assessing the merits of the programs. Non-welfarists would usually not do so, and would therefore judge such programs more favorably.

Finally, the width of the definition of functionings is also important for the design and the assessment of public policy. For instance, public spending on education is often promoted on the basis of its impact on future productivity and growth. But education can also be seen as a means to attain the functioning of literacy and participation in the community. This then provides an additional support for public expenditures on education. Analogous arguments also apply, for instance, to public expenditures on health, transportation, and the environment.

Chapter 2
THE EMPIRICAL MEASUREMENT OF WELL-BEING

The empirical assessment of poverty and equity is customarily carried out using data on households and individuals. These data can be administrative (i.e., stored in government files and records), they can come from censuses of the entire population, or (most commonly) they can be generated by probabilistic surveys on the socio-demographic characteristics and living conditions of a population of households or individuals. We focus on this latter case in this chapter.

2.1 Survey issues

There are several aspects of the surveying process that are important for assessment of poverty and equity. First, there is the coverage of the survey: does it contain representative information on the entire population of interest, or just on some socio-economic subgroups? Whether the representativeness of the data is appropriate depends on the focus of the assessment. A survey containing observations drawn exclusively from the cities of a particular country may be perfectly fine if the aim is to design poverty alleviation schemes within these cities; its representativeness will, however, be clearly insufficient if the objective is to investigate the optimal allocation of resources between the country's urban and rural areas.

Then, there is the sample frame of the survey. Surveys are usually stratified and multi-staged, and are therefore made of stratified and clustered observations. Stratification ensures that a certain minimum amount of information is obtained from each of a given number of "areas" within a population of interest. Population strata are often geographically defined and typically represent different regions or provinces of a country. Clustering facilitates the interviewing process by concentrating sample observations within particular population subgroups or geographic locations. They thus make it more cost effective to collect more observations. Strata are thus often divided into a number of different levels of clusters, representing, say, cities, districts, neighborhoods, and households. A complete listing of first-level clusters in each stratum is used to select randomly within each stratum a given number of clusters. The initial clusters can then be subjected to further stratification or clustering, and the process continues until the last sampling units (usually households or individuals) have been selected and interviewed. This therefore leads to both stratification and multi-stage sampling.

Fundamental in the use of survey data is the role of the randomness of the information that is generated by the sampling process. Because households and individuals are not all systematically interviewed (unlike in the case of censuses), the information generated from survey data will depend on the particular selection of households and individuals that is made from a population. In other words, a poverty/equity assessment of a population will vary according to the sample drawn from that population. For that reason, distributive assessments carried out using survey data will be subject to so-called "sampling errors", that is, to sampling variability. When carrying out distributive using sample data, it is therefore important to recognise and assess the importance of sampling variability.

By ensuring that a minimum amount of information (typically, a minimum number of observations) is obtained from each of a number of strata, stratification decreases the extent of sampling errors. A similar effect is obtained by increasing the total size of the sample: the greater the number of households surveyed, the greater on average is the sampling precision of the estimates obtained. Conversely, by bundling observations around common geographic or socio-economic indicators, clustering tends to reduce the informative content of the observations drawn and thus also tends to increase the size of the sampling errors (for a given number of observations). The sampling structure of a survey also impacts on its ability to provide accurate information on certain population subgroups. For instance, if the clusters within a stratum represent geographical districts, and between-district variability is large, it would be unwise to use the information generated by the selected regions to depict poverty in the other, non-selected, regions.

Survey data are also fraught with measurement and other "non-sampling" errors. For instance, even though they may have been selected to belong to a sample, some households may end up not being interviewed, either because they cannot be reached or because they refuse to be interviewed. Such "non-response" will raise difficulties for distributive assessments if it is correlated with observable and non-observable household characteristics. Even if interviewed, households will sometimes misreport their characteristics and living conditions, either because of ignorance, misunderstanding or mischief. This tends to make distributive assessments built from survey data diverge systematically from the true (and unobserved) population distributive assessment that would be carried out if there were no non-sampling errors. Clearly, such a shortcoming can bias the understanding of poverty and equity and the subsequent design of public policy.

The empirical analysis of vulnerability and poverty dynamics is particularly "data demanding". In general, it requires longitudinal (or panel) surveys, surveys that follow each other in time and that interview the same final observational units. Because they link the same units across time, longitudinal data contain more information than transversal (or cross-sectional) surveys, and they are particularly useful for measuring vulnerability and for understanding poverty dynamics — in addition to facilitating the assessment of the temporal effects of public policy on well-being. Note, however, that measurement errors are particularly problematic for the analysis of vulnerability and mobility.

2.2 Income versus consumption

It is frequently argued that consumption is better suited than income as an indicator of living standards, at least in many developing countries. One reason is that consumption is believed to vary more smoothly than income, both within a given year and across the life cycle. Income is notoriously subject to seasonal variability, particularly in developing countries, whereas consumption tends to be less variable. Life-cycle theories also predict that individuals will try to smooth their consumption across their low- and high-income years (in order to equalize their "marginal utility of consumption" across time), through appropriate borrowing and saving behavior. In practice, however, consumption smoothing is far from perfect, in part due to imperfect access to commodity and credit markets and to difficulties in estimating precisely one's "permanent" or life-cycle income. Using short-term vs longer-term consumption or income indicators can therefore change the assessment of well-being.

For the non-welfarist interested in outcomes and functionings, consumption is also preferred over income because it is deemed to be a more "direct" indicator of achievements and fulfilment of basic needs. A caveat is, however, that consumption is an outcome of individual free choice, an outcome which may differ across individuals of the same income and ability to consume, just like actual functionings vary across people of the same capability sets. At a given capability to spend, some individuals may choose to consume less (or little), preferring instead to give to charity, to vow poverty, or to save in order to leave important bequests to their children.

Consumption is also held to be more readily observed, recalled and measured than income (at least in developing countries, although even then this is not always the case), to suffer less from underreporting problems, This is not to say that consumption is easy to measure accurately. Sources of income are typically far more limited than types of expenditures, which can make it easier to collect income information. The periodicity of expenses on various goods varies, and different recall periods are therefore needed to ensure adequate expenditure coverage.

Moreover, consumption does not equal expenditures. The value of consumption equals the sum of the expenditures on the goods and services purchased and consumed in a given period, plus the value of goods and services consumed but not purchased (such as those received as gifts and produced by the household itself), plus the consumption or service value of assets and durable goods owned. Unlike expenditures, therefore, consumption includes the value of own-produced goods. The value of these goods is not easily assessed, since it has not been transacted in a market. Distinguishing consumption from investment is also very difficult, but failure to do so properly can lead to double-counting in the consumption measure. For instance, a $1 expenditure on education or machinery should not be counted as current consumption if the returns and the utility of such expenditure will only accrue later in the form of higher future utility and earnings.

Similarly, and as just mentioned, the value of the services provided by those durable goods owned by individuals ought also to enter into a complete consumption indicator, but the cost of these durable goods should feature in the consumption aggregate of the time at which the good was purchased. An important example of this is owner-occupied housing. Further measurement difficulties arise in the assessment of the value of various non-market goods and services – such as those provided freely by the government – and the value of intangible benefits such as the quality of the environment, the benefit of peace and security, and so on.

2.3 Price variability

Whether it is income or consumption expenditures that are measured and compared, an important issue is how to account for the variability of prices across space and time. Conceptually, this also encompasses variability in quality and in quantity constraints. Failure to account for such variability can distort comparisons of well-being across time and space. In Ecuador, for instance (Hentschel and Lanjouw 1996), and in many other countries, some households have free access to water, and tend to consume relatively large quantities of it with zero water expenditure. Others (often peri-urban dwellers) need to purchase water from private vendors and consequently consume a lower quantity of it at necessarily higher total expenditures. Ranking households according to water expenditures could wrongly suggest that those who need to buy water are better off and derive greater utility from water consumption (since they spend more on it).

Microeconomic theory suggests that we may wish to account for price variability by comparing real as opposed to nominal consumption expenditures (or income). Several procedures can be followed to enable such comparisons. A first procedure estimates the parameters of consumers' indirect utility functions. Let these parameters be denoted by ν and the indirect utility function be defined by V(y, q, ν), where q is the price vector and y is total nominal expenditure (we abstract from savings). Suppose that reference prices are given by qR. Equivalent consumption expenditure is then given implicitly by yR:

Image

Inversion of the indirect utility function yields an equivalent expenditure function e, which indicates how much expenditure at reference prices is needed to be equivalent to (or to generate the same utility as) the expenditure observed at current prices q:

Image

Distributive analysis would then proceed by comparing the real incomes defined in terms of the reference prices qR.

An alternative procedure deflates by a cost-of-living index the level of total nominal consumption expenditures. One way of defining such a cost-of-living index is to ask what expenditure is needed just to reach a poverty level of utility vz at prices q. This is given by e (q,ν,υz). A similar computation is carried out for the expenditure needed to attain υz at prices qR: this is e (qR,ν,υz). The ratio

Image

is then a cost-of-living index. Dividing y by (2.3) yields real consumption expenditure.

In practice, cost-of-living indices are often taken to be those aggregate consumer price indices routinely computed by national statistical agencies. These consumer price indices usually vary across regions and time, but not across levels of income (e.g., across the poor and the non-poor). In some circumstances (i.e., for homothetic utility functions and when consumer preferences are identical), all of the above procedures are equivalent. In general, however, they are not the same.

The fact that utility functions are not generally homothetic, and that preferences are highly heterogeneous, has important implications for distributive analysis and public policy. First, the true cost-of-living index would normally be different across the poor and the rich. Using the same price index for the two groups may distort comparisons of well-being. An example is the effect of an increase in the price of food on economic welfare. Since the share of food in total consumption is usually higher for the poor than for the rich, this increase should hurt disproportionately more the poor. Deflating nominal consumption by the same index for the entire population will, however, suggest that the burden of the food price increase is shared proportionately by all.

Spatial disaggregation is also important if consumption preferences and price changes vary systematically across regions. In few developing countries, however, are consumer price indices available or sufficiently disaggregated spatially. The alternative for the analyst is then to produce different poverty lines for different regions (based on the same or different consumption baskets, but using different prices) or construct separate price indices. In both cases, the analyst would usually be using regional price information derived from consumption survey data. The resulting indices would then be interpreted as cost-of-living indices, and could help correct for spatial price variation and regional heterogeneity in preferences.

To see why these adjustments are necessarily in part arbitrary, and to see why they can matter in practice, consider the case of Figure 2.1. It shows 3 indifference curves U1, U2 and U3, for three consumers, 1, 2 and 3. Two of these consumers have relatively strong preferences for meat as opposed to fish, and the third (represented by U3) has strong preferences for fish. Also shown are two budget constraints, one using relative prices qc (c for coastal area), where the price of fish is relatively low, and the other with qm (m for mountainous area), where the price of fish is high compared to the price of meat.

How is the standard of living for individuals 1, 2 or 3 to be compared? One way to answer this question is to "cost" the consumption of the three individuals. For this, we may use either qc or qm. If we use the mountains' relative price, then the consumption bundles chosen by individuals 1 and 3 are equivalent in terms of value: they lie on the same budget constraint of value B in terms of meat (the numéraire). Individual 2 is clearly then the worst off of all three. If instead we use the coastal area's relative price, then the consumption bundles chosen by individuals 2 and 3 are equivalent, with a common value of A in terms of meat – and individual 1 is the best off.

Hence, choosing reference prices to assess and compare living standards can matter significantly. If we knew a priori that individuals 1 and 3 had equivalent living standards, then reference prices qm would be the right ones (conversely: qc would be the correct reference prices if 2 and 3 could be assumed to be equally well off). But such information is generally not available. In some circumstances, such as in comparing 1 and 2, we can be fairly certain that one individual is better off than another, whatever the choice of reference prices, but even then, the extent of the quantitative difference in well-being can can vary to a large extent with the choice of reference prices.

The choice of reference prices and reference preferences will also matter for estimating the impact of price changes on well-being and equity. Consider

Figure 2.1: Price adjustments and well-being with two commodities

Image

again Figure 2.1.. Suppose that we wish to measure the impact on consumers' well-being of an increase in the price of fish. Assume for simplicity that this change in relative prices is captured by a move from qm to qc. If we were to choose as a reference bundle the bundle of meat and fish chosen by individuals 1 and 2 to capture the impact of this change, then the price impact would be estimated to be fairly low. The reason is that both individuals consume little of fish. For instance, take meat as the numéraire and assess the real income value of being at U1. Under qm, this is given by B and under qc by D. Using instead the preferences of individual 3 as reference tastes (and thus U3 as reference well-being), real consumption would move from A to B, a much greater change.

Furthermore, even if 3 were deemed better off than 1 before the increase in the price of fish, it could well be that 3's strong preferences for fish would make him less well off than 1 after the price change. Hence, when consumer preferences are heterogeneous, price changes can reverse rankings of well-being. Indeed, in Figure 2.1., the increase in the price of fish is visibly much more costly for fish eaters than for meat ones. This warns again against the use of a common price index across all regions as well as across all socio-economic groups – rich and poor.

2.3.1 Exercises

Suppose the following direct utility function over the two goods x1 and x2,

Image

with υ= 1/3, and let prices q1and q2be set to 1.

1 What is the expenditure needed to attain a poverty level of utility of 158.74 at the reference prices Image=1 and Image=1? (Call this zR.)

answer: see table 2.1, zR = 300 for U = 159.78$

2 What are the quantities of goods 1 and 2 that are consumed at the poverty level of utility?

answer: see table 2.1, x1 = 100 and x2 = 200 for U = 159.78$

3 Suppose that the price of good 2 is increased from 1 to 3. What is the new cost of the level of poverty utility? (Call this z.)

answer: see table 2.2, z = 624 for U = 159.78$

4 Using definitions (2.1) and (2.2), prove the following:

yR/zR = y/z.

What does it imply?

answer: When preferences are homothetic, poverty measures are the same for two following methods that one can use to adjust the nominal income:

Image Equivalent expenditure method: y* = e(qR, xR(q,x,y))

Image Welfare ratio method: y* = y/z

5 Suppose now that a poverty analyst does not believe that consumption of goods 1 and 2 will adjust following good 2's price increase. What is the poverty line z that he would then obtain? (Hint: compute the cost of the initial commodity basket using the new prices.)

answer: z = 700$

6 Using indifference curves and budget constraints, show the difference that taking account of changes in behavior can make for the computation of price indices and the assessment of poverty.

Table 2.1: Equivalent income and price changes

i

y

x1

x2

U

y/z

1

150.00

50.00

100.00

79.37

0.50

2

210.00

70.00

140.00

111.12

0.70

3

300.00

100.00

200.00

158.74

1.00

4

380.00

126.67

253.33

201.07

1.27

5

500.00

166.67

333.33

264.57

1.67

6

510.00

170.00

340.00

269.86

1.70

7

550.00

183.33

366.67

291.02

1.83

8

600.00

200.00

400.00

317.48

2.00

9

800.00

266.67

533.33

423.31

2.67

10

1000.00

333.33

666.67

529.13

3.33

 

Table 2.2: Equivalent income and price changes

i

y

x1

x2

U

yR

y/z

yR/zR

y/700

1

160.00

53.33

35.56

40.70

76.92

0.26

0.26

0.23

2

200.00

66.67

44.44

50.88

96.15

0.32

0.32

0.29

3

500.00

166.67

111.11

127.19

240.37

0.80

0.80

0.71

4

624.00

208.00

138.67

158.74

300.00

1.00

1.00

0.89

5

1100.00

366.67

244.44

279.82

528.82

1.76

1.76

1.57

6

1240.00

413.33

275.56

315.43

596.13

1.99

1.99

1.77

7

1300.00

433.33

288.89

330.70

624.97

2.08

2.08

1.86

8

1500.00

500.00

333.33

381.57

721.12

2.40

2.40

2.14

9

1600.00

533.33

355.56

407.01

769.20

2.56

2.56

2.29

10

2770.00

923.33

615.56

704.64

1331.68

4.44

4.44

3.96

2.4 Household heterogeneity

A fundamental problem arises when comparing the well-being of individuals who live in households of differing sizes and composition. Differences in household size and composition can indeed be expected to create differences in household "needs". It is essential to take these differences in needs into account when comparing the well-being of individuals living in differing households. This is usually done using equivalence scales. With these scales, the needs of a household of a particular size and composition are compared to those of a reference household, usually one made of one reference adult.

2.4.1 Estimating equivalence scales

Strategies for the estimation of equivalence scales are all contingent on the choice of comparable indicators of well-being. The choice of any such indicator is, however, intrinsically arbitrary. A popular example is food share in total consumption: at equal household food shares, individuals of various household types are assumed to be equally well-off. But, at equal well-being, one household type could certainly choose a food share that differs from that of other household types. This would be the case, for instance, for households of smaller sizes for which it could make perfect sense to spend more on food than on those goods for which economies of scale are arguably larger, such as housing. Failing to take this differential price effect into account would lead to an overestimation of the needs of small households.

Another difficulty arises when household size and composition are the result of a deliberate free choice. It may be argued, for instance, that a couple which elects freely to have a child cannot perceive this increase in household size to be utility decreasing. This would be so even if the household's total consumption remained unchanged after the birth of the child (or even if it fell), despite the fact that most poverty analysts would judge this birth to increase household "needs". Another difficulty lies in the fact that the intra-household decision-making process can distort the allocation of resources across household members, and thereby lead to wrong inferences of comparative needs. This is the case, for instance, when more is spent on boys than on girls, not because of greater boy needs, but because of differential gender preferences on the part of the household decision-maker. Such observations can lead analysts to overestimate the real needs of boys relative to those of girls. In turn, this would underestimate on average the level of deprivation experienced by girls and their households, since it would be wrongly assumed that girls are less "needy". An analogous analytical difficulty arises when the household decision-maker is a man, and the consumption of his spouse is observed to be smaller than his own. Is this due to gender-biased household decision-making, or to gender-differentiated needs?

To illustrate these issues, consider Figure 2.2, which graphs consumption of a reference good xr(y,q) against household income y. The predicted consumption of the reference good is plotted for two households, the first composed of only one man, and the other made of a couple (i.e., a man and a woman). A common procedure in the equivalence scales literature is the estimation of the total household income at which a reference consumption of a reference good is equal for all household types. The basic argument is that when the consumption of that reference good is the same across households, the well-being of household members should also be the same across households. Reference goods are often goods consumed exclusively by some members of the household, such as adult clothing.

For Figure 2.2., take for instance the case of men's clothing for xr(y,q). Suppose that the reference level of that good is given by x0. Leaving aside issues of consumption heterogeneity within households of the same type at a given income level, one would estimate that the one-member household would need an income yc in order to consume x0 (at point c), and that the two-member household would require total household income yd to reach that same reference consumption level. Hence, following this line of argument, the second household would need yd/yc as much income as the first one to be "as well off in terms of consumption of men's clothing. Said differently, the second household's needs would be yd/yc that of the single man household. The number of "equivalent adults" in the second household would then be said to be yd/yc. When applied to different household types, this procedure provides a full equivalence scale, expressing the needs of various household types as a function of those of a reference household.

Such a procedure faces many problems, however, most of which are very difficult to resolve. First, there is the choice of the reference level of xr(y,q). If a reference level of x1 instead of x0 were chosen in Figure 2.2., the number of adult equivalents in the second household would fall from yd/yc to yf/ye. There is little that can be done in general to determine which of these two scales is the right one. In such cases, one cannot use a welfare-independent equivalence scale – the equivalence scale ratios must depend on the levels of the households' reference well-being.

Equivalence scale estimates also generally depend on the choice of the reference good. For instance, the choice of adult clothing versus that of tobacco, alcohol or other adult commodities will generally matter in trying to compare the needs, say, of households with and without children. This is in part because preferences for these goods are not independent of – and do not depend in the same manner on – household composition. One additional problem is the issue of the price dependence of equivalence scale estimates. Choosing a different q in Figure 2.2., for instance, would usually lead to the estimation of different equivalence scale ratios.

2.4.2 Sensitivity analysis

In view of these difficulties, the literature has often emphasized that the choice of a particular scale inevitably introduces value judgements and some

Figure 2.2: Equivalence scales and reference well-being

Image

arbitrariness. It would therefore seem important to recognize explicitly such difficulties when measuring and comparing poverty and inequality levels.

Allowing the assessment of needs to vary turns out to be especially relevant in cross-country comparative analyses, particularly when those countries compared differ significantly in their socio-economic composition. There is in this case the added issue that not only can the appropriate scale rates be uncertain in a given country, but they may also be different between countries. Testing the sensitivity of inequality and poverty results to changes in the incorporation of needs would seem particularly important for those comparisons whose results can influence redistributive policies, e.g., through the transfer of resources from some regions or household types to others.

To see how to carry out such sensitivity analysis, define an equivalence scale E as a function of household needs. This function will typically depend on the characteristics of the M different household members, such as their sex and age, and on household characteristics, such as location and size. Because E is normalized by the needs of a single adult, it can be interpreted as a number of "equivalent adults", viz, household needs as a proportion of the needs of a single adult. A "parametric" class of equivalence scales is often defined as a function of one or of a few relevant household characteristics, with parameters indicating how needs are modified as these characteristics change.

A survey of Buhmann, Rainwater, Schmaus, and Smeeding (1988) reported 34 different scales from 10 countries, which they summarized as

Image

with s being a single parameter summarizing the sensitivity of E to household size M. This needs elasticity, s, can be expected to vary between 0 and 1. For s = 0, no account is taken of household size. For s = 1, adult-equivalent income is equal to per capita household income. The larger the value of s, the smaller are the economies of scale in the production of well-being that are implicitly assumed by the equivalence scale, and the greater is the impact of household size upon household needs.

An obvious limitation of a simple function such as (2.4) is its dependence solely on household size and not on household composition or other relevant socio-demographic characteristics. Most equivalence scales do indeed distinguish strongly between the presence of adults and that of children, and some like that of McClements (1977) even discriminate finely between children of different ages. An example of a class of equivalence scales that is more flexible than the above was suggested by Cutler and Katz (1992) – this class takes separately into account the importance of the MA adults and the M - MAchildren:

Image

where c is a constant reflecting the resource cost of a child relative to that of an adult, and s is now an indicator of the degree of overall economies of scale within the household. When c = 1, children count as adults (which is the assumption made in (2.4)); otherwise, adults and children are assumed to have different needs.

2.4.3 Household decision-making and within-household inequality

Finally, and as elsewhere in distributive analysis, there is the practically insoluble difficulty of having to make interpersonal comparisons of well-being across individuals – compounded by the fact that individuals here are heterogeneous in their household composition. On the basis of which observable variable can we really make interpersonal comparisons of well-being? Again, note that the assumption that well-being for the man is the same as well-being for the couple when xr(y,q) is equalized in Figure 2.2. is a very strong one. Furthermore, apart from influencing preferences and commodity consumption, household formation is as indicated above itself a matter of choice and is presumably the source of utility in its own right. Preferences for household composition are themselves heterogeneous, and so is the utility derived from a certain household status. All of this makes comparisons of well-being across heterogeneous individuals and the use of equivalence scales the source of arbitrariness and significant measurement errors.

An additional problem in measuring individual living standards using survey data comes from the presence of intrahousehold inequality. The final unit of observation in surveys is customarily the household. Little information is typically generated on the intrahousehold allocation of well-being (e.g., on the individual benefits stemming from total household consumption). Because of this, the usual procedure is to assume that adult-equivalent consumption (once computed) is enjoyed identically by all household members.

This, however, is at best an approximation of the true distribution of economic well-being in a household. If the nature of intrahousehold decision-making leads to important disparities in well-being across individuals, assuming equal sharing will underestimate inequality and aggregate poverty. Not being able to account for intra-household inequities will also have important implications for profiling the poor, and also for the design of public policy. For instance, a poverty assessment that correctly showed the deprivation effects of unequal sharing within households could indicate that it would be relatively inefficient to target support at the level of the entire household – without taking into account how the targeted resources would subsequently be allocated within the household. Instead, it might be better to design public policy such as to self-select the least privileged individuals within the households, in the form of specific in-kind transfers or specially designed incentive schemes.

2.4.4 Counting units

A final and related difficulty concerns who we are counting in aggregating poverty: is it individuals or households? Although this distinction is fundamental, it is often surprisingly hidden in applied poverty profiles and poverty measurement papers. The distinction matters since there is usually a strong positive correlation between household size and a household's poverty status. Said differently, poverty is usually found disproportionately among the larger households. Because of this, counting households instead of individuals will typically underestimate the true proportion of individuals in poverty.

2.5 References for Chapters 1 and 2

The literature investigating the foundations and the impact of alternative approaches to measuring well-being is large and (yet) rapidly increasing.

Influential discussions of the conceptual foundations can be found in Das-gupta (1993), Sen (1981), Sen (1983), Sen (1985), Streeten, Burki, Ul Haq, Hicks, and Stewart (1981) and Townsend (1979).

Papers considering the impact of the accounting period (e.g., short-term vs long-term incomes) on the distribution of well-being include Aaberge, Bjorklund, Jantti, Palme, Pedersen, Smith, and Wennemo (2002), Arkes (1998), Bjorklund (1993), Burkhauser, Frick, and Schwarze (1997), Burkhauser and Poupore (1997), Coronado, Fullerton, and Glass (2000), Creedy (1997), Creedy (1999a), Gibson, Huang, and Rozelle (2001), Harding (1993) and Parker and Siddiq (1997).

The comprehensiveness of income concepts can also make a difference. A good introduction to the general methodological issues is Hentschel and Lanjouw (1996). The impact of the difference between cash and more comprehensive measures of income is studied inter alia in Formby, Kim, and Zheng (2001), Gustafsson and Makonnen (1993), Gustafsson and Shi (1997), Harding (1995), Jenkins and O'Leary (1996), Smeeding, Saunders, Coder, Jenkins, Fritzell, Hagenaars, Hauser, and Wolfson (1993), Smeeding and Weinberg (2001), Van den Bosch (1998), and Yates (1994). The role of public services is also discussed in Anand and Ravallion (1993); see also Propper (1990) and ?) for adjusting the value of public services for the costs of accessing them.

The sensitivity of the measurement of well-being to the choice between consumption and income measures is analyzed in Barrett, Crossley, and Worswick (2000b), Barrett, Crossley, and Worswick (2000a), Blacklow and Ray (2000), Blundell and Preston (1998), Cutler and Katz (1992), Jorgenson (1998) Mitrakos and Tsakloglou (1998), O'Neill and Sweetman (2001), Slesnick (1993) and Zaidi and de Vos (2001).

Choosing the units of analysis, be they individuals, households or equivalent adults, also influences distributive analysis, as studied by Bhorat (1999), Carlson and Danziger (1999), Ebert (1999), and Sutherland (1996). This is closely related to the growing concerns expressed about the role of income pooling/sharing within families and households; see for instance Cantillon and Nolan (2001), Haddad and Kanbur (1990), Jenkins (1991), Kanbur and Haddad (1994), Lazear and Michael (1988), Lundberg, Pollak, and Wales (1997), Phipps and Burton (1995), Quisumbing, Haddad, and Pena (2001), and Woolley and Marshall (1994). Ebert and Moyes (2003) explore the normative implications of a concern for equality in living standards for the use of equivalence scales in applied studies.

Price adjustments can also be important to making consistent comparisons of well-being across time, space and socio-economic groups, and for measuring equity and poverty properly. A good introduction to the methodological literature is given by Donaldson (1992). Empirical evidence can be found in Araar (2002), Bodier and Cogneau (1998), Deaton (1988), Erbas and Sayers (1998), Finke, Chern, and Fox (1997), Idson and Miller (1999), Muller (2002), Pendakur (2002), Rao (2000), Ruiz Castillo (1998) and Slesnick (2002).

Justification and examples of the use by economists of non-money-metric measures of well-being can be found inter alia in De Gregorio and Lee (2002) (for a link between education and income inequality), Haveman and Bershadker (2001) (self-reliance), Jensen and Richter (2001) (children's health), Klasen (2000) and Layte, Maitre, Nolan, and Whelan (2001) (deprivation), Sahn and Stifel (2000) (a composite welfare index), Sefton (2002) (fuel poverty) and Skoufias (2001) (calorie intake). The wealth distribution is also often of interest: see for instance Wolff (1998) for a review of the American evidence. Alternative measures of well-being are also explored in Davies, Joshi, and Clarke (1997) (for a construction of a deprivation index), Desai and Shah (1988) (for estimates of relative deprivation), Hagenaars (1986) (for perceptions of poverty), and Narayan and Walton (2000) (for participatory evidence on the living conditions and views of more than 20,000 poor people).

Survey measurement problems are numerous. See for instance Fields (1994) for a general discussion, Juster and Kuester (1991) for wealth measurement, and Lanjouw and Lanjouw (2001) for the estimation of food and non-food consumption expenditures.

The sensitivity of distributive analysis to the "equivalization" of incomes has been the focus of much work in the last 15 years. This includes Banks and Johnson (1994), Bradbury (1997), Buhmann, Rainwater, Schmaus, and Smeeding (1988), Burkhauser, Smeeding, and Merz (1996), Coulter, Cowell, and Jenkins (1992b), Coulter, Cowell, and Jenkins (1992a), de Vos and Zaidi (1997), Duclos and Mercader Prats (1999), Jenkins and Cowell (1994), Lancaster and Ray (2002), Lanjouw and Ravallion (1995), Lyssiotou (1997), Meenakshi and Ray (2002), Phipps (1993), and Ruiz Castillo (1998).

The econometric and theoretical difficulties involved in the estimation of equivalence scales are formidable, and these are discussed inter alia in Blundell and Lewbel (1991), Blundell (1998), and Pollak (1991). Estimation of equivalence scales is performed in Bosch (1991), Nicol (1994), Pendakur (1999) (where they are found to be "base-independent"), Pendakur (2002) (where they are found to be price-dependent), Phipps and Garner (1994) (where they are found to be different across Canada and the United States), Phipps (1998), and Radner (1997) (where they depend on the types of income considered).

An attempt to identify and estimate unconditional preferences for goods and demographic characteristics is Ferreira, Buse, and Chavas (1998). Whether equivalence scales should be income-dependent, and what happens if they are, is studied among others in Aaberge and Melby (1998), Blackorby and Donaldson (1993), and Conniffe (1992).

The normative issues raised by the presence of heterogeneity in the population – heterogeneity other than in the dimension of income – are numerous, and some of them are examined in Ebert and Moyes (2003), Fleurbaey, Hagneré, and Trannoy (2003), Glewwe (1991), and Lewbel (1989).

This page intentionally left blank.

PART II
MEASURING POVERTY AND EQUITY

This page intentionally left blank.

Chapter 3
INTRODUCTION AND NOTATION

In what follows in this book, we will denote living standards by the variable y. The indices we will use will sometimes require these living standards to be strictly positive, and, for expositional simplicity, we may assume that this is always the case. Strictly positive values of y are required, for instance, for the Watts poverty index and for many of the decomposable inequality indices. It is of course reasonable to expect indicators of living standards such as consumption or expenditures to be strictly positive. This assumption is less natural for other indicators, such as income, for which capital losses or retrospective tax payments can generate negative values. Also recall that, for expositional simplicity, we will also usually refer to living standards as incomes.

Let p = F(y) be the proportion of individuals in the population who enjoy a level of income that is less than or equal to y. F(y) is called the cumulative distribution function (cdf) of the distribution of income; it is non-decreasing in y, and varies between 0 and 1, with F(0) = 0 and F(∞)=11. For expositional simplicity, we will sometimes implicitly assume that F(y) is continuously differentiable and strictly increasing in y. These are reasonable approximations for large-population distributions of income. They are also reasonable assumptions from the point of view of describing the data generating processes that generate the distributions of income observed in practice. The density function, which is the first-order derivative of the cdf, is denoted as f(y) = F'(y) and is strictly positive when F(y) is assumed to be strictly increasing in y.

 

1DAD: Distribution|Distribution Function.

3.1 Continuous distributions

A useful tool throughout the book will be "quantile functions". The use of quantiles will help simplify greatly the exposition and the computation of several distributive measures. Quantiles will also sometimes serve as direct tools to analyze and compare distributions of living standards (to check first-order dominance for instance). The quantile function Q(p) is defined implicitly as F(Q(p)) = p, or using the inverse distribution function, as Q(p)= F(−1(p)2. Q(p) is thus the living standard level below which we find a proportion p of the population. Alternatively, it is the income of that individual whose rank — or percentile — in the distribution is p. A proportion p of the population is poorer than he is; a proportion 1 – p is richer than him.

These tools are illustrated in Figure 3.1. The horizontal axis shows percentiles p of the population. The quantiles Q(p) that correspond to different p values are shown on the vertical axis. The larger the rank p, the higher the corresponding income Q(p). Alternatively, incomes y appear on the vertical axis of Figure 3.1, and the proportions of individuals whose income is below or equal to those y are shown on the horizontal axis. At the maximum income level, ymax, that proportion F(ymax) equals 1. The median is given by Q(0.5), which is the income value which splits the distribution exactly in two halves.

Note that an important expositional advantage of working with quantiles is to normalize the population size to 1. This also means that everyone's income and contribution to this book's poverty and equity analysis can then appear on an interval of percentiles ranging from 0 to 1. In a sense, the population size is thus scaled to that of a socially representative individual. Normalizing all population sizes to 1 also makes comparisons of poverty and equity accord with the population invariance principle. This principle says that adding an exact replicate of a population to that same population should not change the value of its distributive indices. Putting everyone's income within a common total population scale of 1 is a handy descriptive way of comparing populations of different sizes. It also ensures that adding exact replications to these populations will not change the distributive picture.

We will define most of the distributive measures (indices and curves) in terms of integrals over a range of percentiles. This is a familiar procedure in the context of continuous distributions. We will see below why this procedure is also generally valid in the context of discrete distributions, even though the use of summation signs is often more familiar in that context. Using integrals will make the definitions and the exposition simpler, and will help focus on what matters more, namely, the interpretation and the use of the various measures.

 

2DAD: Curves|Quantile.

The most common summary index of a distribution is its mean. Using integrals and quantiles, it is defined simply as:

Image

μ is therefore the area underneath the quantile curve. This corresponds to the grey area shown on Figure 3.1. Since the horizontal axis varies uniformly from 0 to 1, μ is also the average height of the quantile curve Q(p), and this is given by μ on the vertical axis, μ is thus the income of the population's "average individual".

The computation of the average income μ gives equal weight to all incomes in the population. We will see later in the book alternative weighting schemes for computing socially representative incomes. As for most distributions of income, the one shown on Figure 3.1 is skewed to the right, which gives rise to a mean μ that exceeds the median Q(p). Said differently, the proportion of individuals whose income falls underneath the mean, F(μ), exceeds one half.

3.2 Discrete distributions

To see how to rewrite the above definitions using familiar summation signs for discrete distributions, we need a little more notation. Say that we are interested in a distribution of n incomes. We first order the n observations of yi in increasing values of y, such that y1≤ y2≤ y3≤ … ≤ yn−1≤ yn. We then associate to these n discrete quantiles over the interval of p between 0 and 1. For p such that (i − 1)/n < pi/n, we then have Q(p) = yi. Technically, this is equivalent to defining quantiles as Q(p) = min{y|F(y)≥ p}. This is illustrated in Table 3.1 for n = 3 and where the three income values are 10, 20 and 30. Figure 3.2 graphs those quantiles as a function of p. p values between 0 and 1/3 give a quantile of 10, the second income, 20, covers percentile 1/3 to 2/3, and the highest incomes, 30, covers percentile 2/3 to 1.

The formulae for discrete distributions are then computed in practice by replacing the integral sign in the continuous case by a summation sign, by summing across all quantiles, and by dividing that sum by the number of observations n. Thus, the mean μ of a discrete distribution can be expressed as:

Image

Thus, whenever an expression like (3.1) arises, we can think of the integral sign as standing for a summation sign and of dp as standing for 1/n.

Using (3.2), the mean of the discrete distribution of Table 3.1, which is 20, is then simply the integral of the quantile curve shown on Figure 3.2. In other words, it is the sum of the area of the three boxes each of length 1/3 that can be found underneath the filled curve. For completeness, we will mention from time to time how indices and curves can be estimated using the more familiar summation signs. For more information, the reader can also consult DAD's User Guide, where the estimation formulae shown use summation signs and thus apply to discrete distributions.

Table 3.1: Incomes and quantiles in a discrete distribution

I

i/n

Q(i/n) = yi

1

0.33

10

2

0.66

20

3

1

30

3.3 Poverty gaps

For poverty comparisons, we will also need the concept of quantiles censored at a poverty line z. These are denoted by Q*(p; z) and defined as:

Image

Censored quantiles are therefore just the incomes Q(p) for those in poverty (below z) and z for those whose income exceeds the poverty line. This is illustrated on Figure 3.3, which is similar to Figure 3.1. Quantiles Q(p) and censored quantiles Q*(p; z) are identical up to p = F(z), or up to Q(p) = z. After this point, censored quantiles equal a constant z and therefore diverge from the quantiles Q(p).

The mean of the censored quantiles is denoted as μ* (z):

Image

This is the area underneath the curve of censored incomes Q* (p; z). Censoring income at z helps focus attention on poverty, since the precise value of those living standards that exceed z is irrelevant for poverty analysis and poverty comparisons (at least so long as we consider absolute poverty).

The poverty gap at percentile p, g(p; z), is the difference between the poverty line and the censored quantile at p, or, equivalently, the shortfall (when applicable) of living standard Q(p) from the poverty line. Let f+ = max (f, 0).

E:18.7.6

Poverty gaps can then be defined as3:

Image

When income at p exceeds the poverty line, the poverty gap equals zero. A shortfall g(q; z) at rank q is shown on Figure 3.3 by the distance between z and Q(q). The larger one's rank p in the distribution — the higher up in the distribution of income — the lower the poverty gap g(p; z). The proportion of individuals with a positive poverty gap is given by F(z). The average poverty gap then equals μg(z):

Image

μg(z) is then the size of the area in grey shown on Figure 3.3.

3.4 Cardinal versus ordinal comparisons

There are two types of poverty and equity comparisons: cardinal and ordinal ones. Cardinal comparisons involve comparing numerical estimates of poverty and equity indices. Ordinal comparisons rank broadly poverty and equity across distributions, without attempting to quantify the precise differences in poverty and equity that exist between these distributions. They can often say where poverty and equity is larger or smaller, but not by how much.

Consider for instance the case of cardinal poverty comparisons. Numerical poverty estimates attach a single number to the extent of aggregate poverty in a population, e.g., 40% or $200 per capita. But calculating cardinal poverty estimates requires making a number of very specific assumptions. These include, inter alia, assumptions on the form of the poverty index, the definition of the indicator of well-being, the choice of equivalence scales, the value of the poverty line, and how that poverty line varies precisely across space and time.

Once these assumptions are made, cardinal poverty estimates can tell, for instance, that the consumption expenditures of 30% of the individuals in a population lie underneath a poverty line, but that a proposed government program could decrease that proportion to 25%. Cardinal poverty estimates can also be used to carry out a money-metric cost-benefit analysis of the effects of social programs. Thus, if the above government program involved yearly expenditures of $500 million, then we would know immediately that a 1% fall in the proportion of the poor would cost on average $100 million to the government. That amount could then be compared to the poverty alleviation cost of other forms of government policy.

The main advantage of cardinal estimates of poverty and equity is their ease of communication, their ease of manipulation, and their (apparent) lack of

 

3DAD: Curves|Poverty gap.

ambiguity. Government officials and the media often want the results of distributive comparisons to be produced in straightforward and seemingly precise terms, and will often feel annoyed when this is not possible. As hinted above, cardinal estimates of poverty and equity are, however, necessarily (and often highly) sensitive to the choice of a number of arbitrary measurement assumptions.

It is clear, for example, that choosing a different poverty line will almost always change the estimated numerical value of any index of poverty. The elasticity of the poverty headcount index with respect to the poverty line is, for example, often significantly larger than 1 (see Section 12.2). This implies that a variation of 10% in the poverty line will then change by more than 10% the estimated proportion of the poor in the population; this sensitivity is substantial, especially since poverty lines are rarely convincingly bounded within a narrow interval.

Another source of cardinal variability comes from the choice of the form of a distributive index. Many procedures have been proposed for instance to aggregate individual poverty. Depending on the chosen procedure, numerical estimates of aggregate poverty will end up larger or lower. As we will see later, for instance, identifying a "socially representative poverty gap" will hinge particularly on the relative weight given to the more deprived among the poor. There is little objective guidance in choosing that weight; the greater its value, however, the greater the socially representative poverty gap, and the greater the estimate of aggregate poverty.

Ordinal comparisons, on the other hand, do not attach a precise numerical value to the extent of poverty or equity, but only try to rank poverty and equity across all indices that obey some generally-defined normative (or ethical) principles. This can be useful when it suffices to know which of two policies will better alleviate poverty, or which of two distributions has more inequality, but not precisely by how much. Because of this lower information requirement, ordinal rankings can prove robust to the choice of a number of measurement assumptions. For instance, ordinal poverty orderings can often rank poverty over general classes of possible poverty indices and wide ranges of possible poverty lines.

It is thus useful to consider in turn cardinal and ordinal comparisons of poverty and equity. We first see how to construct aggregate cardinal distributive indices. Ordinal comparisons are considered in Part III.

Figure 3.1: Quantile curve for a continuous distribution

Image

Figure 3.2: Quantile curve for a discrete distribution

Image

Figure 3.3: Incomes and poverty at different percentiles

Image

This page intentionally left blank.

Chapter 4
MEASURING INEQUALITY AND SOCIAL WELFARE

4.1 Lorenz curves

The Lorenz curve has been for several decades the most popular graphical tool for visualizing and comparing income inequality. As we will see, it provides complete information on the whole distribution of incomes relative to the mean. It therefore gives a more comprehensive description of relative incomes than any one of the traditional summary statistics of dispersion can give, and it is also a better starting point when looking at income inequality than the computation of the many inequality indices that have been proposed. As we will see, its popularity also comes from its usefulness in establishing orderings of distributions in terms of inequality, orderings that can then be said to be "ethically robust".

The Lorenz curve is defined as follows1:

Image

The numerator Image sums the incomes of the bottom p proportion (the poorest 100p%) of the population. The denominator Image sums the incomes of all. L(p) thus indicates the cumulative percentage of total income held by a cumulative proportion p of the population, when individuals are ordered in increasing income values. For instance, if L(0.5) = 0.3, then we know that the 50% poorest individuals hold 30% of the total income in the population.

 

1DAD: Curves|Lorenz.

A discrete formulation of the Lorenz curve is easily provided. Recall that the discrete income values yi are ordered such that y1 ≤ y2 ≤... ≤ yn, with percentiles pi = i/n such that Q(pi) = yi. For i = 1,...n, the discrete Lorenz curve is then defined as:

Image

If needed, other values of L(p) in (4.2) can be obtained by interpolation.

The Lorenz curve has several interesting properties. As shown in Figure 4.1, it ranges from L(0) = 0 to L(1) = 1, since a proportion p = 0 of the population necessarily holds a proportion of 0% of total income, and since a proportion p = 1 of the population must hold 100% of aggregate income. L(p) is increasing as p increases, since more and more incomes are then added up. This is also seen by the fact that the derivative of L(p) equals Q(p)/μ:

Image

This is positive if incomes are positive, as we are assuming throughout. Hence by observing the slope of the Lorenz curve at a particular value of p, we also know the p-quantile relative to the mean, or, in other words, the income of an individual at rank pas a proportion of mean income. An example of this can be seen on Figure 4.1 for p = 0.5. The slope of L(p) at that point is Q(0.5)/μ, the ratio of the median to the mean. The slope of L(p) thus portrays the whole distribution of mean-normalized incomes.

The Lorenz curve is also convex in p, since as p increases, the new incomes that are being added up are greater than those that have already been counted. This is clear from equation (4.3) since Q(p) is increasing in p. Mathematically, a curve is convex when its second derivative is positive, and the more positive that second derivative, the more convex is the curve. Formally, the second-order derivative of the Lorenz curve equals

Image

Note that by definition that pF(Q(p)). Differentiating this identity with respect to p, we have that 1 ≡ f(Q(p)) d(Q(p))/dp. Thus,

Image

and we therefore have that

Image

Figure 4.1: Lorenz curve

Image

The larger the density of income f(Q(p)) at a quantile Q(p), the less convex the Lorenz curve at L(p). The convexity of the Lorenz curve is thus revealing of the density of incomes at various percentiles. On Figure 4.1, this density is thus visibly larger for lower values of p since this is where the slope of the L(p) changes less rapidly as p increases.

Some measures of central tendency can also be identified by a look at the Lorenz curve. In particular, the median (as a proportion of the mean) is given by Q(0.5)/μ, and thus, as mentioned above, by the slope of the Lorenz curve at p = 0.5. Since many distributions of incomes are skewed to the right, the mean often exceeds the median and Q(p = 0.5)/μ will typically be less than one. The mean income in the population is found at that percentile at which the slope of L(p) equals 1, that is, where Q(p) = μ and thus at percentile F(μ) (as shown on Figure 4.1). Again, this percentile will often be larger than 0.5, the median income's percentile. The percentile of the mode (or modes) is where L(p) is least convex, since by equation (4.4) this is where the density f(Q(p)) is highest.

Simple summary measures of inequality can readily be obtained from the graph of a Lorenz curve. The share in total income of the bottom p proportion of the population is given by L(p); the greater that share, the more equal is the distribution of income. Analogously, the share in total income of the richest p proportion of the population is given by 1 – L(p); the greater that share, the more unequal is the distribution of income. These two simple indices of inequality are often used in the literature.

An interesting but less well-known index of inequality is given by the proportion of total income that would need to be reallocated across the population to achieve perfect equality in income. This proportion is given by the maximum value of p – L(p), which is attained where the slope of L(p) is 1 (i.e., at L(p = F(μ))). It is therefore equal to F(μ) – L(F(μ)). This index is usually called the Schutz coefficient.

Mean-preserving equalizing transfers of income are often called Pigou-Dalton transfers. In money-metric terms, they involve a marginal transfer of $1, say, from a richer person (of percentile r, say) to a poorer person (of percentile q < r)that keeps total income constant. All indices of inequality which do not increase (and sometimes fall) following any such equalizing transfers are said to obey the Pigou-Dalton principle of transfers. These equalizing transfers also have the consequence of moving the Lorenz curve unambiguously closer to the line of perfect equality. This is because such transfers do not affect the value of L(p) for all p up to q and for all p greater than r, but they increase L(p) for all p between q and r.

Hence, let the Lorenz curve LB(P) of a distribution B be everywhere above the Lorenz curve LA(P) of a distribution A. We can think of B as having been obtained A through a series of equalizing Pigou-Dalton transfers applied to an initial distribution A. Hence, inequality indices which obey the principle of transfers will unambiguously indicate more inequality in A than in B. We will come back to this important result in Chapter 11 when we discuss how to make ethically robust comparisons of inequality.

4.2 Gini indices

If all had the same income, the cumulative percentage of total income held by any bottom proportion p of the population would also be p. The Lorenz curve would then be L(p) = p: population shares and shares of total income would be identical. A useful informational content of a Lorenz curve is thus its distance, p – L(p), from the line of perfect equality in income. Compared to perfect equality, inequality removes a proportion p – L(p) of total income from the bottom 100 .p % of the population. The larger that "deficit", the larger the inequality of income.

If we were then to aggregate that deficit between population shares and income shares in income across all values of p between 0 and 1, we would get half the well-known Gini index2:

Image

The Gini index implicitly assumes that all "share deficits" across p are equally important. It thus computes the average distance between cumulated population shares and cumulated income shares.

4.2.1 Linear inequality indices and S-Gini indices

One can, however, also think of other weights to aggregate the distance p–L(p). The class of linear inequality indices is given by applying percentile-dependent weights to those distances. Let those weights be defined by κ(P). A popular one-parameter functional specification for such weights is given by

Image

and depends on the value of a single "ethical" parameter ρ That parameter must be greater than 1 for the weights κ(P; p) to be positive everywhere. The shape of κ(p;ρ) is shown on Figure 4.2 for values of ρ equal to 1.5, 2 and 3. The larger the value of ρ the larger the value of κ(P;ρ) for small p.

 

2 DAD: Inequality | Gini/S-Gini Index.

Figure 4.2: The weighting function κ(p;ρ)

Image

Figure 4.3: The weighting function w(p;ρ)

Image

Using (4.8) then gives what is called the class of S-Gini (or "Single-Parameter" Gini) inequality indices, I(ρ)3:

Image

E:18.8.2

Note4 that I(2) is the standard Gini index. This is because κ(p;ρ = 2) ≡ 2, which then gives equal weight to all distances p – L(p). When 1 < ρ < 2, relatively more weight is given to the distances occurring at larger values of p, as shown by Figure 4.2. Conversely, when ρ > 2, relatively more weight is given to the distances found at lower values of p. Changing ρ thus changes the "ethical" concern which is felt for the "share deficits" at various cumulative proportions of the population.

Let ω(p;ρ) be defined as

Image

The shape of ω(p; ρ) is shown on Figure 4.3 for ρ equal to 1.5, 2 and 3. Note that ω(p; ρ) > 0 and that dω(p;ρ)/dp < 0 when ρ > 1. Since Image for any value of ρ the area under each of the three curves on Figure 4.3 equals 1 too. Using (4.10) and integrating by parts equation (4.9), we can then show that5:

E:18.8.31

Image

This says that I(p) weights deviations of incomes from the mean by weights which fall with the ranks of individuals in the population. Since, in equation (4.11), I(p) is a (piece-wise) linear function of the incomes Q(p), it is a member of the class of linear inequality measures, a feature which will prove useful later in measuring progressivity and vertical equity. The usual Gini index is then given simply by:

Image

Yaari (1988) defines "an indicator for the policy maker's degree of equality mindedness at p" as –ω(1)(p;ρ)/ω(p;ρ), where ω(1)(p;ρ) is the first-order

 

3DAD: Inequality|Gini/S-Gini Index.

4DAD: Curves|Lorenz.

5DAD: Inequality|Gini/S-Gini Index.

derivative of ω(p; ρ) with respect to p. This indicator thus captures the speed at which the weights ω(p;ρ) decrease with the ranks p. It gives:

Image

Thus, the local degree of "equality mindedness" for ω(p;ρ) is a proportional function of the single parameter ρ. As definition (4.13) makes clear, this degree of inequality aversion is defined at a particular rank p in the distribution of income, independently of the precise value that income takes at that rank. The larger the value of ρ, the larger the local degree of equality mindedness, and the faster the fall of the weights ω(p;ρ) with an increase in the rank p. Therefore, the greater the value of ρ, the more sensitive is the social decision-maker to differences in ranks when it comes to granting ethical weights to individuals.

The functions κ(p;ρ) and ω(p;ρ) can also be given an interpretation in terms of densities of the poor. Assume that r individuals are randomly selected from the population. The probability that the income of all of these r individuals will exceed Q(p) is given by [1 - F(Q(p))] r. The probability of finding an income below Q(p) in such samples is then 1 - [1 - F(Q(p))] r = 1 - [1 - p] r. 1 - [1 - p] r is thus the distribution function of the lowest income in samples of r individuals. The density of the lowest income rank in a sample of r randomly selected incomes is the derivative of that distribution with respect to p, which is

Image

This helps interpret the weights κ(p;ρ) and ω(p;ρ). By equation (4.8), κ(P; ρ) is ρ times the density of the lowest income in a sample of ρ- 1 randomly selected individuals; analogously, by equation (4.10), ω(p;ρ) is the density of the lowest income in a sample of ρ randomly selected individuals.

We might be interested in determining the impact of some inequality-changing process on the inequality indices of type (4.11). One such process that can be handled nicely spreads income away from the mean by a proportional factor λ, and thus corresponds to some form of bi-polarization of incomes away from the mean (loosely speaking). This bi-polarization process is equivalent to adding (λ - 1)(Q(p) - μ)to Q(p), since

Image

does indeed spread income away from the mean by a proportional factor λ. As can be checked from equation (4.11), this changes I(p) proportionally by λ:

Image

Equation (4.16) also says that the elasticity of I(p) with respect to λ, when λ equals 1 initially, is equal to 1 whatever the value of the parameter Image

Such bi-polarization away from the mean is also equivalent to a process that increases the distance p - L(p) by a factor λ. That this gives the same change in I(ρ) can be checked from equation (4.9). This bi-polarization process thus increases the deficit p - L(p) between population shares p and income shares L (p) by a constant factor λ across all p. We will see later in Chapter 12 how this distance-increasing process leads to a nice illustration of the possible impact of changes in inequality on poverty.

As shown on Figure 4.3 and in equation 4.11, the larger the value of ρ, the greater the weight given to the deviation of low incomes from the mean. When ρ becomes very large, the index I(ρ) equals the proportional deviation from the mean of the lowest income. When ρ = 1, the same weight ω(p; ρ = 1) ≡ 1 is given to all deviations from the mean, which then makes the inequality index I(ρ = 1) always equal to 0, regardless of the income distribution under consideration. Thus, S-Gini indices range between 0 (when all incomes are equal to the mean or when the ethical parameter ρ is set to 1) and 1 (when total income is concentrated in the hands of only one individual, or when ρ is large and the lowest income is close to 0). Since the Lorenz curve moves towards p when a Pigou-Dalton equalizing transfer is implemented, the value of the S-Gini indices also naturally decreases with such transfers.

Hence, ρ is a parameter of "inequality aversion" that captures our concern for the deviation of quantiles from the mean at various ranks in the population. In this sense, it is analogous to the parameter ε of relative inequality aversion which we will discuss below in the context of the Atkinson indices. For the standard Gini index of inequality, we have that ρ = 2 and thus that ω(p;ρ = 2) = 2(1-p); hence in assessing the standard Gini, the weight on the deviation of one's income from the mean decreases linearly with one's rank in the distribution of income. In a discrete formulation, the weights ω(p; ρ) take the form of:

Image

4.2.2 Interpreting Gini indices

The S-Gini indices can also be shown to be equal to the covariance formula

Image

a formula which can simplify their computation with common spreadsheet or statistical softwares. The traditional Gini is then simply:

Image

and is just a proportion of the covariance between incomes and their ranks. Note here the interesting analogy of (4.19) with the variance, given by

Image

A further useful interpretive property of the standard Gini index is that it equals half the mean-normalized average distance between all incomes:

Image

Thus, if we find that the Gini index of an income distribution equals 0.4, then we know that the average distance between the incomes of that distribution is of the order of 80% of the mean. Again, note the interesting link of (4.21) with another definition of the variance, which is var Image

The Gini index can also be computed as the integral of a simple transformation of the familiar cumulative distribution function. Recall that F(y) and 1 - F(y) are simply the proportions of individuals with incomes below and above y. If we integrate the product of these proportions across all possible values of y, we again obtain the Gini coefficient:

Image

Note also that F(y)(1 – F(y)) is largest at F(y) = 0.5, which also explains why the Gini index is often said to be most sensitive to changes in incomes occurring around the median income.

Now suppose that society can be split into two classes, and that income is equally distributed within each class.

1 Assume that those in the first class hold no income. The Gini index of the total population is then given by the population share of that zero-income class.

2 Assume that the population share of each group is 0.5. The Gini index of the total population is then given by 0.5 – L(0.5). In other words, the income share of the bottom class is 0.5 minus the Gini coefficient.

3 Assume that the population share of each group is again 0.5. Denote the incomes of those in the richer class by yR and of those in the poorer class by yp We then have:

Image

or alternatively

Image

which gives a simple relationship between incomes and the Gini coefficient. For instance, if yR = λyP, then the Gini index is simply (λ – 1)/(2λ + 2); for λ = 2, we thus have I(ρ = 2) = 1/6.

4.2.3 Gini indices and relative deprivation

A final interesting interpretation of the Gini index is in terms of relative deprivation, which has been linked in the sociological and psychological literature to subjective well-being, social protest and political unrest. Runciman (1966) defines it as follows:

The magnitude of a relative deprivation is the extent of the difference between the desired situation and that of the person desiring it (as he sees it), (p.10)

Sen (1973), Yitzhaki (1979) and Hey and Lambert (1980) follow Runciman's lead to propose for each individual an indicator of relative deprivation that measures the distance between his income and the income of all those relative to whom he feels deprived. For instance, let the relative deprivation of an individual with income Q(p), when comparing himself to another individual with income Q(q), be given by:

Image

The expected relative deprivation of an individual at rank p is then Image6:

Image

As we did for the "shares deficits" above, we can aggregate the relative deprivation at every percentile p by applying the weights κ(p;ρ). We can show that this gives the S-Gini indices of inequality:

 

6DAD: Curves|Relative Deprivation.

Image

Hence, the S-Gini indices are also a weighted average of the average relative deprivation felt in a population. By equations (4.8), (4.14) and (4.27), they equal the expected relative deprivation of the poorest individual in a sample of ρ – 1 randomly selected individuals. The greater the value of ρ the more important is the relative deprivation of the poorer in computing I(ρ).

4.3 Social welfare and inequality

We now introduce the concept of a social welfare function. Unlike relative inequality, which considers incomes relative to the mean, social welfare aggregates absolute incomes. We will see that under some popular conditions on the shape of social welfare functions, the measurement of inequality and social welfare can often be nicely linked and integrated, and that the tools used for the two concepts are then similar. This will explain why some inequality indices are sometimes called "normative".

The social welfare functions we consider take the form of:

Image

where for expositional simplicity we restrict ω(p) to be of the special form ω(p;ρ) defined by equation (4.10). U(Q(p)) is a "utility function" of income Q(p). Social welfare is then the expected utility of the poorest individual in a sample of (ρ - 1) individuals.

Another requirement that we wish to impose on the form of W is that it be homothetic. Homotheticity of W is analogous to the requirement for consumer utility functions that the expenditure shares of the different consumption goods be constant as income increases, or the requirement for production functions that the ratios of the marginal products of inputs stay constant as output is increased. For social welfare measurement, homotheticity implies that the ratio of the marginal social utilities (the marginal utility being given by U'(Q(p)) ω(p)) of any two individuals in a population stays the same when all incomes are changed by the same proportion. For (4.28) to be homothetic, we need U(Q(p)) to take the popular form of U(Q(p); ε), which is defined as

Image

Hence, W in equation (4.28) will depend on the parameters ρ and on ε and we will denote this as W(ρ,ε)7:

Image

Homotheticity of a social welfare function has an important advantage: the social welfare function can then easily be used to measure relative inequality. To see how this can be done, define ξ(ρ, ε) as the equally distributed income that is equivalent, in terms of social welfare, to the actual distribution of income. We will refer to ξ as the EDE income, the equally distributed equivalent income. ξ(ρ,ε) is implicitly defined as:

Image

Since Image is also such that

Image

or, alternatively,

Image

where Image is the inverse utility function:

Image

The index of inequality I corresponding to the social welfare function W is then defined as the distance between the EDE and the mean incomes, expressed as a proportion of mean income:

Image

Using ξ(ρ,ε) in (4.35) gives I(ρ, ε): I(ρ, ε) = 1 - ξ(ρ,ε)/μ 8.

Clearly, then, the EDE income is a simple function of average income and inequality, with

Image

 

7DAD: Welfare|S-Gini Index.

8DAD: Inequality|Atkinson-Gini Index.

Compared to W, ξ has the advantage of being money metric and thus of being easily interpreted. It can, for instance, be compared to other economic indicators that are also expressed in money-metric terms.

To increase social welfare, we can either increase μ or we can increase equality of income 1 - I by decreasing inequality I. Two distributions of income can display the same social welfare even with different average incomes if these differences are offset by differences in inequality. This is shown in Figure 4.4, starting initially with two different levels of mean income μ0 and μ1 and common zero inequality. We then have that ξ = μ0 and ξ = μ1 To preserve the same level of social welfare in the presence of inequality, mean income must be higher: this is shown by the positive slope of the constant ξ functions. Furthermore, as inequality becomes larger, further increases in I must be matched by higher and higher increases in mean income for social welfare not to fall.

Defined as in (4.35), inequality has an interesting interpretation: it measures the difference between

Image the mean level of actual income

Image and the (lower) level that would instead be needed to achieve the same level of social welfare were income distributed equally across the population.

This difference being expressed as a proportion of mean income, I thus shows the per capita proportion of income that is "wasted" in social welfare terms because of its unequal distribution. Society as a whole would be just as well-off with an equal distribution of a proportion of just 1 - I of total actual income. I can thus be interpreted as a unit-free indicator of the social cost of inequality.

Let a distribution B of income be a proportional re-scaling of a distribution A. In other words, for a constant λ > 0, let QB(P) = λQA(P) for all p. If the social welfare function used for the computation of I is homothetic, it must be that IA = IB This is illustrated in Figure 4.5 for the case of two incomes Image and Image for an initial distribution A, and two incomes Image and Image for a "scaled-up" distribution B (since λ > 1). Social welfare in A is given by WA. The social indifference curve WA shown in Figure 4.5 also depicts the many other combinations of incomes that would yield the same level of social welfare. The combinations at point F correspond to a situation of equality of income where both individuals enjoy ξAA is therefore the equally distributed income that is socially equivalent to the distribution (Image, Image).

The average income in A is given by μA, which leads to point G = (μA,μA)in Figure 4.5. Hence two distributions of income, one made of the vector (Image, Image) and the other of the vector (ξA, ξA), generate the same level WA of social welfare, the first with an unequally distributed average income μA and the other with an equally distributed average income ξA. Hence, the vertical (or horizontal) distance between point F and point G in Figure 4.5 can be understood as the "cost of inequality" in A's distribution of income. Taking that distance as a proportion of μA (see equation (4.35)) gives the index of inequality IA

That Image for the same λ can be seen from the fact that the two vectors of income lie along the same ray from the origin. If the function W is homothetic, then inequality in A must be the same as inequality in B. In other words, the distance between points D and E as a proportion of the distance OE must be the same as the distance between points F and G as a proportion of the distance OG.

4.4 Social welfare

4.4.1 Atkinson indices

Two special cases of W(ρ,ε) are of particular interest in assessing social welfare and relative inequality. The first is when income ranks are not important per se in computing social welfare: this is obtained with ρ = 1, and it yields the well-known Atkinson additive social welfare function, W(ε)9

Image

This Atkinson social welfareatk function has had two major interpretations: 1) first, as a utilitarian social welfare function, where U(Q(p);ε) is an individual utility function displaying decreasing marginal utilities of income, and 2) second, as a concave social evaluation of a concave individual utility of income.

It can be argued, however, that "it is fairly restrictive to think of social welfare as a sum of individual welfare components", and that one might feel that "the social value of the welfare of individuals should depend crucially on the levels of welfare (or incomes) of others" (Sen 1973, pp.30 and 41). The unrestricted form W(ρ,ε) allows for such interdependence and may therefore be thought more flexible than the Atkinson additive formulation. In the light of the above, we can indeed interpret W(ρ,ε) as the expected utility of the poorest individual in a group of ρ randomly selected individuals, or the expected social valuation of the utility of such individuals. This interpretation of the social evaluation function W(ρ,ε) confirms why it is not additive or separable in individual welfare: the social welfare weight on U(Q(p);ε) depends on the rank p of the individual in the whole distribution of income. It is only when ε = 1 that W(ρ,ε) gives the average utility U(Q(p);ε) weighted by a function of ranks.

 

9DAD: Welfare|Atkinson Index.

Figure 4.6 shows the shape of the utility functions U(y,ε) for different values of ε10 Incomes are shown on the horizontal axis as a proportion of their mean, and utility U(y;ε) can be read on the vertical axis. The normalization U(μ;ε) = 1 has been applied for graphical convenience. Although for all values of ε the slope of U(y;ε) is positive, that slope is not constant. This is made more explicit on Figure 4.7 which shows the marginal social utility of income U(1)(y;ε) for different values of ε. Again, a normalization of U(1)(μ) = 1 is applied. For ε = 0, the marginal social utility is constant: increasing by a given amount a poor person's income has the same social welfare impact as increasing by the same amount a richer person's income. For ε > 0, however, increasing the poor's income is socially more desirable than increasing the rich's. The larger the value of ε, the faster marginal social utility falls with y.

By (4.33) and (4.35), the Atkinson inequality index is then given by11:

Image

The Atkinson indices are said to exhibit constant relative inequality aversion since the elasticity of U(l)(Q(p);ε) with respect to Q(p) is constant and equal to ε:

Image

The parameter ε is thus usually called the Atkinson parameter of relative inequality aversion.

Figure 4.8 illustrates graphically the link between the Atkinson social evaluation functions W(ε) and their associated inequality indices. For this, suppose a population of only two individuals, with incomes y1 and y2 as shown on the horizontal axis. Mean income is given by μ = (y1 + y2)/2 (the middle point between y1 and y2). The utility function U(y;ε) has a positive but decreasing slope. W(ε) is then given by (U(y1) + U(y2))/2, the average height of U(y1) and U(y2).

If equally distributed, a mean income of ξ would be sufficient to generate that same level of social welfare, since on Figure 4.8 we have that W(ε) = U(ξ;ε). The cost of inequality is thus given by the distance between μ and ξ, shown as C on Figure 4.8. Inequality is the ratio C/μ.

Graphically, the more "concave" the function U(y;ε), the greater the cost of inequality and the greater the inequality indices I(ε). This can be seen on

 

10This paragraph draws from Cowell (1995), pp.40-41.

11DAD: Inequality|Atkinson Index.

Figure 4.9 where two functions U(y;ε) have been drawn, with different relative inequality aversion parameters ε0 < ε1. We have that W(ξ) = U (ξ0; ε0) and W(ξ) = U(ξ11). The difference in relative inequality aversion parameters nevertheless leads to ξ0 > ξ1, and therefore to I(ε0) < I(ε1). A specification with greater inequality aversion leads to a greater inequality index, and to the judgement that inequality costs socially a greater proportion of average income.

4.4.2 S-Gini social welfare indices

The second special case of W(ρ,ε) is obtained when the utility functions U(Q (p);ε) are linear in the levels of income, and thus when ε = 0. This yields the class of S-Gini social welfaresgini functions, W(ρ)12:

Image

Social welfare is then the expected income of the poorest individual in a group of ρ randomly selected individuals. By (4.33), this is also the EDE income. Hence, the associated inequality indices are given by:

Image

Image

which is seen by (4.11) to be the same as the S-Gini inequality indices I(ρ). Hence, social welfare and the EDE income equal per capita income corrected by the extent of relative deprivation in those incomes:

Image

4.4.3 Generalized Lorenz curves

A useful curve for the analysis of the distribution of absolute incomes is the Generalized Lorenz curve. It is defined as GL(p)13:

Image

and is illustrated on Figure 4.10. The Generalized Lorenz curve has all of the attributes of the Lorenz curve, except for the fact that it does not normalize

 

12DAD: Welfare|S-Gini Index.

13DAD: Curves|Generalized Lorenz.

incomes by their mean. GL(p) gives the absolute contribution to per capita income of the bottom p proportion (the 100p% poorest) of the population. GL(p) is thus also the per capita income that would be available if society could rely only on the income of the bottom p proportion of the population. Assume for instance that μ = $20000 and that GL(0.5) = $5000. Then, per capita income would be only $5000 if we assumed that the richest 50% of the population were suddenly to retire and earn no income... Note also that GL(p)/p gives the average income of the bottom p proportion of the population. In the example just provided, the average income of the 50% poorest would be $10,000, half the level of overall average income.

Combining (4.9), (4.35) and (4.40) further shows that the Generalized Lorenz curve has a nice graphical link to the S-Gini indices of social welfare:

Image

4.5 Statistical and descriptive indices of inequality

A popular descriptive index of inequality is the quantile ratio. This is simply the ratio of two quantiles, Q(p2)/Q(p1) using percentiles p1 and p214. Popular values of p1 and p2 include p1 = 0.25 and p2 = 0.75 (the quartile ratio), as well as p1 = 0.10 and p2 = 0.90 (the decile ratio). Note that these values of p1 and p2 are often reversed. Median income is also a popular choice for Q(p1). Observe also that these ratios are by definition insensitive to changes that affect quantiles other than Q(p1) and Q(p2). Moreover, none of them is consistent with Lorenz inequality orderings: it can be that the Lorenz curve for a distribution A is always above that of distribution B, but that quantile ratios suggest that B has less inequality than A. For inequality analysis, an arguably better choice for normalizing Q(p2) is mean income — an index such as Q(p2)/μ can indeed be shown to be consistent with first-order (restricted) inequality dominance (we discuss this in Chapter 11).

The coefficient of variation is the ratio of the standard deviation to the mean of income. It is given by15:

Image

and is therefore a function of the squared distance between incomes and the mean.

 

14DAD: Inequality|Quantiles Ratio.

15DAD: Inequality|Coefficient of Variation.

Two other popular measures of inequality use distances in logarithms of income. The first one, which we can call the logarithmic variance, is defined as16

Image

and the second, the variance of logarithms, as17

Image

These two last measures do not, however, always obey the Pigou-Dalton principle of transfers — that is, they will sometimes increase following a spread-reducing transfer of income between two individuals.

Finally, the relative mean deviation is the average absolute deviation from mean income, normalized by mean income18:

Image

Note that this measure is insensitive to transfers made between individuals whose income lies on the same side of the mean.

4.6 Decomposing inequality by population subgroups

A frequent goal is to explain the total amount of inequality in a distribution by the extent of inequality found among socio-economic groups ("intra" or "within" group inequality) and across them ("inter" or "between" group inequality). There are several ways to do this. One method uses the class of inequality indices that are exactly decomposable into terms that account for within- and between-groups inequality. Although that class can be given a justification in terms of social welfare functions, this exercise is less transparent and intuitive than for the classes of relative inequality indices considered hitherto. Another method applies the Shapley decomposition to any type of inequality indices. We discuss these two methods in turn.

4.6.1 Generalized entropy indices of inequality

For most practical purposes, we can express these decomposable inequality indices as Generalized entropy indices. We denote them as I(θ)19:

 

16DAD: Inequality|Logarithmic Variance.

17DAD: Inequality|Variance of Logarithms.

18DAD: Inequality|Relative Mean Deviation.

19DAD: Inequality|Entropy Index.

Image

Some special cases of (4.50) are worth noting. First, if we constrain θ to be no greater than 1 and let θ = 1 - ε, I(θ) becomes ordinally equivalent to the family of Atkinson indices. This simply means that if an Atkinson index I(ε) indicates that there is more inequality in a distribution A than in a distribution B, then the index I(θ) with θ = 1 - ε will also necessarily indicate more inequality in A than in B. Second, the special case I(θ = 0) gives the Mean Logarithmic Deviation, since I(θ = 0) can also be expressed as

Image

that is, as the average deviation between the logarithm of the mean and the logarithms of incomes. I(θ = 1) gives the well-known Theil index of inequality. I(θ = 2) is half the square of the coefficient of variation (see (4.46)) since I(θ = 2) can be rewritten as

Image

Now assume that we can split the population into K mutually exclusive population subgroups, k = 1,...,K. The indices in (4.50) can then be decomposed as follows20:

Image

where φ(k) is the proportion of the total population that belongs to subgroup k and μ(k) is the mean income of subgroup k.

Image I(k;θ) is inequality within subgroup k, defined in exactly the same way as in (4.50) for the total population. The first term in (4.53) can thus be interpreted as a weighted sum of the within-group inequalities in the distribution of income.

 

20DAD: Decomposition|Entropy: Decomposition by Groups.

Image Image is total population inequality when each individual in subgroup k is given the mean income μ,(k) of his subgroup (namely, when within subgroup inequality has been eliminated). I(θ) can thus be interpreted as the contribution of between-group inequality to total inequality.

Note, however, that only when θ = 0 is it the case that the within-group inequality contributions do not depend on mean income in the groups; the terms I(k;θ = 0) are then strictly population-weighted. Otherwise, the within-group inequalities are weighted by weights which depend on the mean income in the subgroups k. Depending on the context, this can make I(θ = 0) a more attractive decomposable index than for other values of θ.

4.6.2 A subgroup Shapley decomposition of inequality indices

This decomposition involves two steps. The first one is to decompose total inequality into global between-group and within-group contributions. The second step is to the express global within-group contribution as a sum of the within-group contribution of each of the groups.

For each of these two steps, we want to assess by how much inequality would be reduced if we removed one of the "factors" that contribute to inequality. Take for instance the first step. It has two factors, within-group and between-group inequality. By how much would inequality fall if between group inequality were eliminated? One estimate would be given by the difference between initial inequality and inequality after the mean income of all groups has been equalized. Another estimate would be given by the inequality that remains once within-group inequality is removed and all that there is left is between-group inequality. These two estimates, however, will generally differ. Which one is better? Since there is no right answer to this question, an alternative is to use the average of the two estimates. Note that the first estimate gives the effect of the first factor when the second factor has not been removed, while the second estimate gives the effect of the first factor after the second factor has been eliminated.

Using the average marginal effect of removing a factor across all factor elimination sequences is what is implied by the choice of the Shapley value as a decomposition procedure. The procedure is detailed in an appendix found below in Section 4.7.

As mentioned above, applying the Shapley decomposition procedure to our sub-group inequality decomposition problem involves two steps. In the first step, we suppose that the two Shapley factors are between-group and within-group inequality. The basic rules followed to compute the marginal contribution of each of these factors are:

1 first, to eliminate within-group inequality and to calculate between-group inequality, we use a vector of incomes in which each observation is assigned the average income μ(k) of the observation's group k;

2 to eliminate between-group inequality and to calculate within-group inequality, we use a vector of incomes where each observation has its income multiplied by the ratio μ(k)/μ of its group k.

To be more precise, let an inequality index I depend on the incomes of individuals in k = 1,..., K groups, each group with n(k) individuals. Let y(k) be the n(k)-vector of incomes of group k. We want to express total inequality I as a sum of between- and within- group inequality21:

Image

To compute the contribution of between-group inequality, we compute the fall of inequality observed when the mean incomes of the groups are equalized. This can be done either before or after within-group inequality has been removed. Hence, the Shapley contribution of between-group inequality is given by:

Image

where l(k) is a unit vector of size nk. The within-group contribution is then given as

Image

The second step consists in decomposing total within-group inequality as a sum of within-group inequality across groups. To do this, we proceed by replacing the incomes of those in a group k by μ(k) in order to eliminate group k's contribution to total within-group inequality. The fall in inequality induced by this equalization of incomes is the contribution of group k to total within-group inequality. We compute this for each group. Given that this computation depends on the sequence ordering of the groups, we compute the average contribution of a group k over all possible orderings of groups. This gives the Shapley value of group k's contribution to total within-group inequality.

 

21DAD: Decomposition|S-Gini: Decomposition by Groups.

To formalize this, suppose that there are only two groups, k = 1, 2. The first group's contribution to total within-group inequality is given as

Image

and symmetrically for the second group.

4.7 Appendix: the Shapley value

The Shapley value is a solution concept often employed in the theory of cooperative games. Consider a set S of s players who must divide some surplus among themselves. The question to resolve is: how can we divide the surplus between the s players? To see how, suppose that the s players can form coalitions Image (these coalitions are subsets of S) to extract a part of the surplus and redistribute it between their σ members. Suppose that the function V determines the extracting force of the coalition, viz, that amount of the surplus that it can extract without resorting to an agreement with those players that are outside of the coalition. The value of an additional player I in a coalition Image is given by

Image

The term MV(Image,i) equals the marginal value added by player i after his adhesion to the coalition Image What will then be the expected marginal contribution of player i over the different possible coalitions that can be formed and which he can join? Note that the number of possible permutations of the s players equals s!. Note also that the size of coalitions Image is limited to σ ε {0, 1,...s - 1}. Out of s! possible permutations of players, the number of times that the same first σ players are located in a same coalition Image is given by the number of possible permutations of the σ players in coalition Image that is, by σ!. For every permutation in the coalition Image we find (s – σ –1)! permutations for the players that complement the coalition Image (excluding player i). The Shapley value gives the expected marginal value that player i generates after his adhesion to a coalition Image of any possible size σ. It is thus given by:

Image

This decomposition procedure has two useful properties. The first is symmetry, ensuring that the contribution of each factor is independent of the order in which it appears in the initial list or sequence of factors. The second property is exactness and additivity, from which the total surplus is given by Image

For decompositions of inequality or poverty indices, say, applying a Shapley procedure consists in computing the marginal effect on such indices of removing each contributing factor (between or within group inequality, inequality in income component, differences in mean income, etc.) in a given sequence of elimination. Repeating the computation for all possible elimination sequences, we estimate the mean of the marginal effects for each factor. This mean provides the contribution of each such factor. The contribution of all factors yield an exact, additive decomposition of distributive indices and variations in them into s contributions.

4.8 References

The literature on the measurement of inequality and social welfare is very large. General references include Atkinson (1983), Atkinson and Bourguignon (2000), Atkinson and Micklewright (1992), Bishop, Formby, and Smith (1993), Chakravarty (1990), Champernowne and Cowell (1998), Cowell (1995), Cowell (2000), Essama Nssah (2000), Foster and Sen (1997), Johnson and Shipp (1997), Lambert (2001), Sen (1973), Sen (1992), Sen (1992), and Saunders (1994).

Applications to real data are very numerous too — among the most influential recent ones feature Bourguignon and Morrisson (2002), Danziger and Gottschalk (1995), Gottschalk and Smeeding (1997), Gottschalk and Smeeding (2000), Jantti (1997) and Milanovic (2002).

Seminal work on inequality measurement and Lorenz curves include Atkinson (1970), Blackorby and Donaldson (1978), Dalton (1920), Dasgupta, Sen, and Starret (1973), Gini (1914) (see Gini 2005 for a recent English translation), Hainsworth (1964), Kakwani (1977a), Kolm (1969), Lorenz (1905) and Rothschild and Stiglitz (1973). Aaberge (2000) rationalizes the use of "moments of Lorenz curves" as measures of inequality, and Aaberge (2001a) presents axiomatic bases for the use of Lorenz curve orderings. Foster and Ok (1999) analyze the concordance of the variance of logarithms with Lorenz dominance.

Discussion and interpretation of linear (or rank-dependent) indices of inequality can be found in Aaberge (1997), Aaberge (2000), Anand (1983), Barrett and Salles (1995), Ben Porath and Gilboa (1994), Blackburn (1989), Blackorby, Bossert, and Donaldson (1994), Bossert (1990), Chakravarty (1988), Chew and Epstein (1989), Donaldson and Weymark (1980) and Donaldson and Weymark (1983) (for S-Ginis), Duclos (1997a), Weymark (1981), Yaari (1988), Yitzhaki (1983) (for extended Ginis, equivalent to S-Ginis — see also Kakwani (1980)), and Wang and Tsui (2000). The most popular member of the class of linear inequality indices is the Gini index: it is discussed in detail in Deutsch and Silber (1997), Milanovic (1994b), Milanovic (1997), Subramanian (2002) and Yitzhaki (1998).

The theory and the economic measurement of relative deprivation is explored inter alia in Berrebi and Silber (1985), Chakravarty and Chakraborty (1984), Clark and Oswald (1996), Davis (1959), Duclos (2000), Ebert and Moyes (2000), Festinger (1954), Hey and Lambert (1980), Merton and Rossi (1957), Paul (1991), Podder (1996), Runciman (1966), Silver (1994), Wang and Tsui (2000), Yitzhaki (1979), Yitzhaki (1982a) and Nolan and Whelan (1996).

Discussion and use of the Theil index appears inter alia in Beblo and Knaus (2001), Duro and Esteban (1998) and Goerlich Gisbert (2001).

Other inequality indices are discussed in Araar and Duclos (2003) and Berrebi and Silber (1981) (a combination of Atkinson and Gini inequality indices), Chakravarty (2001) (a defense of the use of the variance), del Rio and Ruiz Castillo (2001) (for "intermediate inequality measures"), and Foster and Shneyerov (2000) (for "path-independent decomposable measures").

Decomposition of inequality across population subgroups has also been the focus of a large literature. This has mostly involved using additive and Generalized entropy indices — see, for instance, Bourguignon (1979), Cowell (1980), Foster and Shneyerov (1999), Mookherjee and Shorrocks (1982), Shorrocks (1980), Shorrocks (1984), Schwarze (1996) and Zandvakili (1999). Decompositions of the Gini and rank-dependent inequality indices are investigated in Dagum (1997), Deutsch and Silber (1999a), Deutsch and Silber (1999b), Milanovic and Yitzhaki (2002), Sastry and Kelkar (1994), Tsui (1998) and Yitzhaki and Lerman (1991). A money-metric cost-of-inequality approach to decomposing inequality across subpopulations is derived in Blackorby, Donaldson, and Auersperg (1981), Duclos and Lambert (2000) and Ebert (1999). Alternative decomposition approaches are also explored in Cowell and Jenkins (1995), Fields and Yoo (2000), Fournier (2001), Hyslop (2001), Jenkins (1995), Parker (1999), and Schultz (1998).

The Shapley value was introduced by Shapley (1953). See also Owen (1977) for how a two-stage decomposition procedure can be applied to the Shapley value, as well as Shorrocks (1999) and Chantreuil and Trannoy (1999) for its use in distributive analysis.

Figure 4.4: Mean income and inequality for constant social welfare ξ

Image

Figure 4.5: Homothetic social evaluation functions

Image

Figure 4.6: Social utility and incomes

Image

Figure 4.7: Marginal social utility and incomes

Image

Figure 4.8: Atkinson social evaluation functions and the cost of inequality

Image

Figure 4.9: Inequality aversion and the cost of inequality

Image

Figure 4.10: Generalized Lorenz curve

Image

Chapter 5
MEASURING POVERTY

5.1 Poverty indices

Two approaches have been used to devise cardinal indices of poverty. The first uses the concept of equally distributed equivalent (EDE) incomes, and applies it to distributions whose incomes have been censored at the poverty line. It then compares those EDE incomes to the poverty line. The second approach transforms incomes and the poverty line into poverty gaps, and aggregates these gaps using social-welfare like functions. We look at these two approaches in turn.

5.1.1 The EDE approach

For the EDE approach to building poverty indices, we start with the distribution of income Q(p). Since, for poverty comparisons, we want to focus on those incomes that fall below the poverty line (the "focus axiom"), the incomes Q(p) are censored at the poverty line z to give Q*(p; z). The censored incomes are then aggregated using one of the many social welfare functions that have been proposed in the literature, such as the Atkinson or S-Gini ones. A poverty index is obtained by taking the difference between the poverty line and the EDE income. For instance, for the social welfare functions proposed in section 4.3, this procedure leads to the following class of poverty indices:

Image

where ξ*(z; p, ε) is the EDE income of the distribution of censored income Q*(p; z) and where we need ρ ≥ 1 and ε ≥ 0 for the Pigou-Dalton transfer principle not to be violated. P(z; ρ, ε) can then be interpreted as the "socially representative" or EDE poverty gap.

Examples of such poverty indices include a transformation of the Clark, Hemming and Ulph's (CHU) second class of poverty indices, given by P(z; ε) = P(z; ρ = 1, ε)1:

Image

The CHU indices are then obviously closely related to the Atkinson social welfareatk functions and inequality indices. When ε = 1, the CHU poverty index is also the EDE poverty gap corresponding to the Watts poverty index, an index which is defined as2:

Image

For 0 ≤ ε < 1, the CHU indices also correspond to the EDE poverty gap of the class of poverty indices proposed by Chakravarty, PC(z; ε):

Image

Moreover, if we choose ε = 0 for the class of indices defined in (5.1), we obtain the class of S-Gini indices of poverty, P(z; ρ)3:

Image

P(z; ρ = 2) is then a "Gini-like" index of poverty.

5.1.2 The poverty gap approach

The second approach to constructing poverty indices uses the distribution of poverty gaps, g(p; z) = z - Q*(p; z). Once this distribution is known, no other use of the poverty line is needed for the aggregation of poverty. Because of this, the poverty gap approach to constructing poverty indices is slightly more restrictive and also puts more structure on the shape of the allowable poverty indices than the previous EDE approach. After the distribution of poverty gaps has been computed, we may use aggregating functions analogous to those used in Section 4.3 for the analysis of social welfare. Like social

 

1DAD: Poverty|CHU Index.

2DAD: Poverty|Watts Index.

3DAD: Poverty|S-Gini Index.

welfare functions, where we normally want an increase in someone's income to increase social welfare, we would normally wish the poverty indices to be increasing in poverty gaps. Unlike social welfare functions, however, where an equalizing Pigou-Dalton transfer would often increase the value of a social welfare function, we would typically wish a poverty index to decrease when such an equalizing transfer of income takes place.

A popular class of poverty gap indices that can obey these axioms is known as the Foster-Greer-Thorbecke (FGT) class. It differentiates its members using an ethical parameter α ≥ 0 and is generally defined as4

E: 18.7.4

Image

for the normalized FGT poverty indices and as

Image

for the un-normalized version (which can sometimes be more useful than the more usual normalized form). Note that poverty gap indices other than the FGT ones can also be easily proposed, simply by using other aggregating functions of poverty gaps that obey some of the desirable axioms (such as that of being increasing and convex in poverty gaps) discussed in the literature.

5.1.3 Interpreting FGT indices

When α = 0, the FGT index gives the simplest and most commonly used poverty index. It is called the poverty headcount ratio, and is simply the proportion of a population that is in poverty (those with a positive poverty gap), F(z) 5. The shorter expression "poverty headcount" is sometimes meant to indicate the absolute (as opposed to the relative) number of the poor in the population. Since our population size is normalized ton 1 in the this book, we will use the two expressions "headcount" and "headcount ratio" interchangeably.

E:18.1.1

The next simplest and most commonly used index, μg(z), is given by the average poverty gap, P (z; α = 1), and is the average shortfall of income from the poverty line:

Image

To see how to interpret the form of the FGT indices for general values of α consider Figure 5.1. It shows the (absolute) contributions to total poverty

 

4DAD: Poverty|FGT Index.

5DAD: Poverty|FGT Index.

Image(z; α) of individuals at different ranks p. These contributions are given by (g(p; z)/z)α. For α = 0, the contribution is a constant 1 for the poor and 0 for the rich (those whose rank exceeds F(z) on the Figure, or equivalently those whose income Q(p) exceeds z). The headcount is then the area covered by the dotted rectangle on Figure 5.1. For α = 1, the contribution of someone at p equals his normalized poverty gap, g(p; z)/z. Poverty is then the area underneath the g(p; z)/z curve drawn on Figure 5.1. The same reasoning is valid for higher values of α For instance, the absolute contribution to Image(z; α = 3) of individuals at rank p is given by (g(p; z)/z)3 on Figure 5.1, and Image(z; α = 3) equals the area underneath the (g(p; z)/z)3 curve.

Notwithstanding the above, interpreting the numerical value of FGT indices for α different from 0 and 1 can be problematic. We can easily understand what is meant by a proportion of the population in poverty or by an average poverty gap, but what, for instance, can a squared-poverty-gap index actually signify? And how to explain it to a government Minister?... A further difficulty with such indices emerges from a closer look at Figure 5.1, which indicates that the absolute contribution of poverty gaps to poverty decreases with α — the contribution curves (g(p)/z)α move down as α rises. This also implies that the normalized FGT indices necessarily fall as α increases. This is paradoxical since it is usually argued that the higher the value of α, the greater the focus on those who suffer most "severely" from poverty. It would thus be more natural if an increase in α also increased Image(z; a).

5.1.4 Relative contributions to FGT indices

One partial solution to these interpretive problems is to switch one's focus from the absolute to the relative contribution to an FGT index of individuals with different poverty gaps. Such a relative contribution is depicted on Figure 5.2 for α = 0, 1 and 2. It shows the ratio of the absolute contributions g(p)α to total poverty P(z; α) — these ratios are the same for normalized and unnormalized FGT indices. Since this graph shows relative contributions to total poverty, the area underneath each of the three curves must in all cases equal 1.

For α = 0, each poor contributes relatively the same constant 1/F(z) to the poverty headcount. The poor's relative contribution to the average poverty gap increases with their own poverty gap, as shown by the curve g(p)/P(z; α = 1). That relative contribution equals 1 for those individuals whose own poverty gap is precisely equal to the average poverty gap. The rank of such individuals is given by F(μg(z)), as is also shown on Figure 5.2. Thus, those located at p = F(μg(z)) have a poverty gap that is representative of the average poverty gap in the population. Increasing α from 1 to 2 decreases the relative contribution of the not-so-poor, but inversely increases the contribution of those with the highest poverty gaps as shown by the curve g(p; z)/P(z; α = 2). This then becomes consistent with the general view that, in the aggregation of

Image

Figure 5.1: Contribution of poverty gaps to FGT indices

individual poverty, higher values of α put more emphasis on those who suffer most severely from poverty — those with lower values of p and higher values of g(p;z).

5.1.5 EDE poverty gaps for FGT indices

Figure 5.2 does not, however, solve the main interpretation problems associated with the FGT indices. As mentioned above, explaining to non-technicians or policymakers the practical meaning of FGT indices for general values of α is difficult since these indices are averages of powers of poverty gaps. They are also neither unit-free nor money-metric (except for α = 0 and 1). An another already-mentioned difficulty is that the usual FGT indices will generally fall with an increase in the value of their poverty aversion parameter, α.

A simple solution to these two problems is to transform the FGT indices into EDE poverty gaps. An EDE poverty gap is that poverty gap which — if it were assigned equally to all individuals — would yield the same aggregate poverty index as that which is currently observed. An EDE poverty gap can then usefully be interpreted as a socially-representative poverty gap. This transformation provides a money-metric measure of poverty which can be usefully compared across different poverty indices and/or across different values of α. As we will see later, it also allows the analyst to determine the impact of poverty-gap inequality upon the level of poverty. For the un-normalized FGT indices, the EDE poverty gap is given simply by (for α > 0)6

Image

For the normalized FGT indices, it is just ξ-9 (z; a) = ξg (z; a)/z. An EDE poverty gap cannot be defined for α = 0.

Figure 5.3 shows such socially-representative poverty gaps ξg (z; α) for different values of α. In each case, we obtain a socially-weighted money-metric indicator of the distribution of deprivation in the population. This summary aggregate indicator can also be compared to the individual distribution of poverty, given by the g(p; z) curve. Those whose g(p; z) exceeds ξg (z; α) experience more poverty than the socially representative average. Those exactly at ξg (z; α) are located exactly at the socially representative poverty gap. Those representative individuals are thus found at the ranks given by Fg (z; α)), which are also shown on Figure 5.3 for different values of α.

An important point to note is that an increase in α moves the socially-representative poverty gap closer to that experienced by the poorest individuals. This is since ξg(z; α + 1) ≥ ξg(z; α) for any α > 0. (This is unlike the usual definition of the FGT indices, for which we have P (z; α + 1) ≤ P(z;α)

 

6DAD: Poverty|FGT Index.

for any α > 0.) Hence, we can readily interpret increases in α as leading to increases in the socially-representative poverty gap, and thus in the relative weight given to the poorer of the poor. The larger the value of α, the more important are the most severe cases of deprivation in computing a socially-representative aggregate level of poverty.

Note finally that, besides being already in an EDE poverty gap form, the S-Gini index of poverty also has the property of being a poverty gap index. Indeed, by (5.5), we have that

Image

5.2 Group-decomposable poverty indices

Much of the early literature on the construction of poverty indices focussed on whether indices were decomposable across population subgroups. This has led to the identification of a subgroup of poverty indices known as the "class of decomposable poverty indices". These indices have the property of being expressible as a weighted sum (more generally, as a separable function) of the same poverty indices assessed across population subgroups. They most commonly include the FGT and the Chakravarty classes of indices as well as the Watts index.

Let the population be divided into K mutually exclusive population subgroups, where φ(k) is the share of the population found in subgroup k. For the FGT indices, we then have that:

Image

where P(k; z; α) is the FGT poverty index of subgroup k7. The Watts and Chakravarty indices are expressible as a sum of the poverty indices of each subgroup in exactly the same way as for the FGT indices in (5.11).

E: 18.6

To illustrate the practical implications of the group-decomposition property, consider the following two-group (K = 2) example. Let the first group contain 40% of the total population, and let poverty in group 1 be 0.8 and that of group 2 be 0.4. Poverty in the total population is then a simple weighted mean of group poverty, and is immediately computable as 0.4 · 0.8 + 0.6 · 0.4 = 0.56. Estimates of total poverty in a population can then be constructed in a decentralized manner, first by estimating poverty within communities or regions, and then by averaging over these decentralized estimates, without there being a need for all of the micro data to be regrouped in one single register.

 

7DAD: Decomposition|FGT: Decomposition by Groups.

Subgroup decomposability also implies that an income improvement in one of the subgroups will necessarily improve aggregate poverty if the incomes in the other groups have not changed. It will also mean that the optimal design of social safety nets and benefit targeting within any given group can be assessed independently of the income distribution in the other groups: only the distributive characteristics of the relevant group matter for the exercise. If targeting succeeds in decreasing poverty at a local level, then it must also succeed at the aggregate level.

Subgroup decomposability is therefore useful, although it is certainly not imperative for poverty analysis. In particular, it is not because an index facilitates poverty profiling and targeting analysis that this index is necessarily ethically fine. Ease of computation and ethical soundness are also two different an potentially conflicting criteria. Among other things, imposing the decomposability and additivity property can mean sacrificing some important ethical features in the aggregation of poverty. In that context, Ravallion (1994) notes that when measuring poverty "one possible objection to additivity is that it attaches no weight to one aspect of a poverty profile: the inequality between subgroups in the extent of poverty". This can be an important flaw if for instance between-group relative deprivation is considered ethically significant.

5.3 Poverty and inequality

Expressing poverty indices in the form of EDE poverty gaps enables the decomposition of poverty as a sum of average poverty and inequality in poverty. Let ξg (z) be the EDE poverty gap and Imageg(z) be the cost of inequality in poverty gaps. We then have:

Image

or, alternatively,

Image

For instance, for the popular FGT indices, we have that the cost of inequality in poverty gaps is given by:

Image

When α = 1, we have that the socially representative poverty gap ξg(z) is just the average poverty gap μg(z); inequality in poverty gaps is thus not taken into account in assessing poverty. The poverty cost of inequality is then nil. Since μg(z) is insensitive to α, and since ξg(z; α) is increasing in α, it follows that Imageg(z; α) is also increasing in α; the larger the value of α, the larger the impact of inequality on the level of aggregate poverty. This can be checked on Figure 5.3. We can thus interpret α as a parameter of inequality aversion in measuring poverty. For 0 < α < 1, we have that ξg(z; α) < μg(z), and inequality in poverty is then deemed to reduce poverty: Imageg(z, α) < 0. Ceteris paribus, we then have that the greater the level of inequality, the lower the socially representative level of poverty. For α > 1, we have that Imageg(z; α) > 0 and inequality has therefore a positive poverty cost.

A similar decomposition can be done using (5.1) and the EDE level of censored income. The EDE poverty gap corresponding to that approach is defined as

Image

where Image*(z; ρ, ε) = μ*(z) · I*(z; ρ, ε) is the cost of inequality in censored income and where I*(z; ρ.ε) is the index of inequality in censored income.

5.4 Poverty curves

It is often informative to portray the whole distribution of poverty gaps on a simple graph, in a way which shows both the incidence and the inequality of income deprivation. Particularly useful is the poverty gap curve, which plots g(p; z) as a function of p — see again Figure 5.3. The curve naturally decreases with the rank p in the population, and reaches zero at the value of p equal to the headcount. The integral under the curve gives the average poverty gap, and its steepness indicates the degree of inequality in the distribution of poverty gaps.

Another percentile-based curve that is graphically informative and that is useful for the measurement and comparison of poverty is called the Cumulative Poverty Gap (CPG) curve (also sometimes referred to as the inverse Generalized Lorenz curve, the "TIP" curve, or the poverty profile curve). The CPG curve cumulates the poverty gaps of the bottom p proportion of the population. It is defined as:8

E:18.7.8

Image

A CPG curve is drawn on Figure 5.4. The slope of G(p; z) at a given value of p shows the poverty gap g(p; z). Since g(p; z) is non-negative, G(p; z) is non-decreasing. G(p = 1; z) equals the average poverty gap μg(z). The percentile at which G(p; z) becomes horizontal (where g(p; z) becomes zero) yields the poverty headcount. Furthermore, since the higher his rank p in the population, the richer is an individual, and therefore the lower is his poverty gap, G(p; z) is therefore concave in p. Because of this, the CPG curve exhibits

 

8DAD: Curves|CPG.

for poverty analysis the same descriptive interest as the Lorenz and Generalized Lorenz curves for the analysis of inequality and social welfare. The distance of G(p; z) from the line of perfect equality of poverty gaps (namely, the line 0B in Figure 5.4) shows the inequality of poverty gaps among the total population. The distance of G(p; z) from the line of perfect equality of poverty gaps among the poor (namely, the line 0A in Figure 5.4) displays the inequality of poverty gaps among the poor. Finally, the concavity of G(p; z) is inversely related to the density of poverty gaps at p.

5.5 S-Gini poverty indices

When weighted by K(p; ρ), the area underneath the CPG curve generates the class of S-Gini poverty indices9:

Image

Recall that K(p; ρ) = ρ(ρ–1) (1 – p)ρ–2 · P(z; ρ = 1) thus equals the average poverty gap, μg(z), P(z; ρ = 2) is the poverty index that is analogous to the standard Gini index of inequality, and the well-known Sen index of poverty is given by:

Image

An interesting feature of the P(z; ρ) indices is their link with absolute and relative deprivation. Let absolute deprivation, AD(z), be given by the average shortfall from the poverty line, that is, by μg(z). Recalling (4.25) and (4.26), we can define relative deprivation in censored income at percentile p as:

Image

Average relative deprivation across the whole population is then:

Image

It is then possible to show that:

Image

The larger the value of ρ, the larger is relative deprivation, RD(z; ρ), and the larger are P(z;ρ) and the contribution of relative deprivation and inequality to poverty. This provides an alternative link between inequality and poverty.

 

9DAD: Poverty|S-Gini Index.

5.6 Normalizing poverty indices

Most of the poverty indices discussed above have initially been introduced in the literature in a normalized form, that is, by dividing censored income and poverty gaps by the poverty line. The FGT indices, for instance, are generally expressed as10:

Image

(see (5.6)). Normalizing poverty indices will make no substantial difference and little expositional difference for poverty analysis when the distributions of income being compared have identical poverty lines. This will typically be the case, for instance, when incomes are expressed in real (or constant) values, and when the focus is on absolute poverty with constant real poverty lines. Normalizing poverty indices by the poverty line will

Image make the EDE poverty gap lie between 0 and 1,

Image make poverty indices insensitive to and independent of the monetary units (e.g., dollars or cents) used in assessing income, and

Image make the indices invariant to an equi-proportionate change in all incomes and in the poverty line.

Normalizing poverty indices is particularly useful if the poverty lines serve as price indices, and thus used to enable comparisons of nominal income across time and space (recall that price indices are used to convert nominal incomes into base-year real incomes).

Normalized poverty indices are usually referred to as "relative poverty indices"; changing all incomes and the poverty line by the same proportion will not affect the value of relative poverty indices. FGT and other poverty gap indices that are not normalized are often called "absolute" poverty indices; it can be checked that equal absolute additions to all incomes and to the poverty line will not affect their value. Increasing all incomes and the poverty line by the same proportion will, however, increase the value of such absolute poverty indices.

When poverty lines are different across distributions, and when their ratio across time or space cannot be interpreted simply as a ratio of price indices, the normalization of poverty indices by these poverty lines can, however, be problematic, and is surely open to debate. This is the case, for instance, when we are interested in comparing the absolute shortfalls of "real" income from a "real" poverty line, when these real poverty lines vary across populations

 

10DAD: Poverty|FGT Index.

or population subgroups. Examples can arise, inter alia, in comparing the poverty of families of different sizes and composition, or in comparing poverty across distributions with different social or cultural bases for the definition of a poverty line.

To see this more clearly, consider the following example in which all incomes and poverty lines are expressed in real terms (namely, they have been adjusted for differences in the cost of living, and they are therefore comparable). In country A, the poverty line is $1,000, and a poor person i has an income of $500. Because, say, of cultural and/or sociological differences (these differences may exist across time or space), the poverty line in country B is larger and is equal to $2,000, and a poor person j in it has an income equal to $1,100. Who of i and j is poorer? If we adopt the relative view to building poverty indices, i will be considered the poorer since as a proportion of the respective poverty lines he is farther away from it than j. If, instead, absolute poverty indices are used, j will be deemed the poorer since his absolute poverty gap ($900) is by far larger than that of i ($500). Which of these two views should prevail is then open to debate.

5.7 Decomposing poverty

5.7.1 Growth-redistribution decompositions

It is often useful to determine whether it is mean-income growth or changes in the relative income shares accruing to different parts of the population that are responsible for the evolution of poverty across time. Investigating this can also help assess whether these two factors, mean-income changes and inequality changes, work in the same or in opposite directions when it comes to the behavior of aggregate poverty. Similarly, we may wish to assess whether differences in poverty across countries or regions are due to differences in inequality or to differences in mean levels of income.

There are several ways to do this. To illustrate them, assume that we wish to compare distributions A and B to determine if it is the difference in their mean income (" growth") or the difference in their income inequality (" redistribution") that accounts for their difference in poverty. The common feature of all existing growth- redistribution decomposition procedures is

1 first, to scale the two distributions A and B such that they have the same mean, and interpret the difference in poverty across these two scaled distributions as the impact on poverty of their difference in inequality;

2 and second, to interpret the difference in poverty between one of the distributions (say, A) and that same distribution scaled to the mean income of the other distribution (B) as the impact on poverty of their difference in mean income.

Starting from this, the precise growth- redistribution decomposition procedures that are chosen differ by the solution they apply to a basic problem known generally in the national-accounts literature as the "index problem". Specifically here, should we scale A to the mean of B, or B to the mean of A, to assess the impact of differences in inequality? And, in estimating the impact of differences in mean incomes, should we compare A with A-scaled-to-the-mean-of-B, or B with B-scaled-to-the-mean-of-A?

The first paper that implemented a growth- redistribution decomposition of poverty differences (Datt and Ravallion 1992) used the initial distribution as a reference "anchor point". To see how, it is easiest to use the normalized FGT indices Image(z; a) defined in (5.6), although the growth- redistribution decomposition methodologies can be used with any relative poverty indices, additive or not. The change in poverty between A and B is expressed as a sum of a "growth" (difference in mean income) effect and of a "redistributive" (difference in relative income shares) effect, plus an error term that originates from the above-mentioned index problem. This gives11:

Image

The first expression in the first term on the left of (5.23), ImageAImage, is poverty in A after A's incomes have been scaled by μB/μA to yield a distribution with mean μB and inequality unchanged. Image is thus the difference between two distributions with the same relative income shares but with (possibly) different mean incomes. When μB > μA, this growth term is negative — this simply says that growth reduces poverty. The first expression in the second term, ImageB Image, is poverty in B after B's incomes have been scaled by μA/μB to yield a distribution with mean μA. Image is thus the difference between two distributions with identical mean incomes but with (possibly) different inequality. When the Lorenz curve for B is everywhere above the Lorenz curve for A, this redistribution term is necessarily negative when α ≥ 1, but it can also be positive when α < 1.

The error term in (5.23) can be expressed as:

 

11DAD: Decomposition|FGT: Growth & Redistribution.

Image

This error term can be shown to be either the difference between the growth effect measured using B as a reference distribution and that using A as the reference distribution,

Image

or the difference between the redistribution effect measured using B as the reference distribution and the redistribution effect using A as the reference distribution,

Image

An alternative decomposition uses the posterior distribution B as the reference distribution for assessing the growth and redistribution effects. This yields:

Image

Clearly, a middle way between these two alternative decomposition procedures is to measure the growth effect as the average of the two growth effects, in (5.23) and (5.27), and likewise to measure the redistribution effect as the average of the two redistribution effects. Proceeding this way has the advantage of eliminating the error term in the poverty decomposition, since the error terms of each of the two alternative decompositions sum to zero. This middle way is in fact what would be given by the use of the Shapley value to perform a growth- redistribution decomposition — see the Appendix 4.7 for more details on the Shapley value. This leads to the following growth- redistribution decomposition12:

Image

5.7.2 Demographic and sectoral decomposition of differences in FGT indices

Equation (5.11) shows how poverty can be expressed as a sum of the poverty contributions of the various subgroups that make a population. Each subgroup contributes in proportion to its share in the population and to the level of poverty found in that subgroup. Hence, we may wish to express changes in poverty across time or space as a function of differences in these factors. More precisely, we want to see whether differences in poverty across distributions can be attributed to differences in demographic or sectoral composition across these distributions, or to differences in poverty across these demographic or sectoral groups. We may express this as follows13:

Image

Note that the decomposition in (5.29) suffers from the same index number problem as the earlier one in (5.23). For example, one could prefer to use φB(k) instead of φA(k) to compute the within-group poverty effects. It may

 

12DAD: Decomposition|FGT: Growth & Redistribution.

13DAD: Decomposition|FGT: Sectoral.

also seem more convenient to weight the within-group poverty effects by the average population shares, and to weight the demographic and sectoral effects by the average poverty index. This yields 14:

Image

where Image(k)= 0.5 (φA (k) + φB (k)) and Image(k; z; α) = 0.5 (ImageA(k;z;α) +ImageB(k;z;α)). Note from (5.30) that this decomposition procedure removes the error term. Depending on the context, the decomposition in (5.30) could serve to show, for instance, how variations in the size and in the poverty of various sectors of the economy account for variations of total poverty across economies, how differences in the size and in the poverty of various demographic groups explain differences in total poverty across societies, how migration and differential poverty across regions account for changes in poverty across time, etc..

5.7.3 The impact of demographic changes

An alternative use of the decomposition in (5.11) computes the impact of a change in the proportion of the population that is found in a group k, this change being accompanied by an exactly offsetting change in the proportion of the other groups. This may be useful, for instance, if one wishes to predict the impact of migration or demographic changes on national poverty, keeping out within-group poverty. Let the population share of a group t, φ(t), increase by a proportion λ to φ(t)(1 + λ), with a proportional fall in the other groups' population share from φ(k) to φ(k) (1 – φ(t)λ/(1 – φ(t))). Note that the new population shares will add up to 1 since

Image

The net impact of this on poverty is then15

Image

 

14DAD: Decomposition|FGT: Sectoral.

15DAD: Poverty|Impact of Demographic Change.

We may instead wish to predict the impact of an absolute increase in the population share of a group t. Let this change be from φ(t) to φ(t) + λ, with a corresponding fall in the other groups' population share that is proportional to their initial share (a fall from φ(k) to φ(k) (1 – λ/(1 – φ(t)))). The resulting change in poverty is analogously given as

Image

Note that the only difference between (5.31) and (5.32) comes from the size in the increase in φ(t), which is φ(t)λ in (5.31) and λ in (5.32).

5.7.4 Decomposing poverty by income components

Let C income components add up to total income X(p), with X(p) = Image and Image being the expected value of income component c at rank p in the distribution of total income. Image can be, for instance, agricultural or capital income, or the income of those living in some geographic area, or some type of expenditure that enters total expenditure X 16.

We may wish to know by what amount total poverty is reduced by the presence of an income component. Clearly, we would expect those components with a large mean μX(c) to be more effective in helping to alleviate total poverty. But we must also take into account the distribution of Image. Suppose for instance that urban capital income is larger than rural capital income, but that poverty is low in urban areas because urban labor income is large there. Then, it is unclear whether relatively high capital income in urban areas is more effective at alleviating poverty than the relatively low capital income in rural areas, where poverty is more concentrated.

The contribution of an income component c to poverty alleviation can be given by the fall in poverty after Image is added to initial income. But this fall depends on what this initial income is. Does it include some of the other income components? This path-dependency difficulty can again be circumvented by the use of the Shapley value. We start by assuming maximum poverty, that is, poverty when total income is nil for everyone. We then estimate the contribution of component c to poverty alleviation as the expected value of its marginal contribution when it is added to anyone of the various subsets of income components that one can choose from the set of all the components.

 

16DAD: Decomposition|FGT: Decomposition by Sources.

When a component is missing from that set for an individual, we assume that its value is 0.

5.8 References

Rowntree (1901) predated by far the modern quantitative approach to poverty measurement. General and recent references include Chen and Ravallion (2001) (for wide empirical evidence on poverty), Constance and Michael (1995) (for the US debate on poverty measurement), Deaton (2001) (for the empirical difficulties associated with "counting the poor"), Glewwe (2001) (for a very extensive coverage of the nature, evolution, and causes of poverty), Jantti and Danzinger (2000) (for poverty in more advanced countries), Lipton and Ravallion (1995) (for poverty and policy), Ravallion (1994) and Ravallion (1996) (for a non-technical overview and discussion of poverty measurement issues), Smeeding, Rainwater, and O'Higgins (1990) (for early results using Luxembourg Income Study data) and Zheng (1997) (for a review of poverty indices).

The papers by Watts (1968), Sen (1976) and Foster, Greer, and Thorbecke (1984) influenced greatly much of the subsequently large literature on poverty indices. Relatively early contributions on poverty measurement are found in Anand (1977), Blackorby and Donaldson (1980), Chakravarty (1983a), Chakravarty (1983b), Clark, Hamming, and Ulph (1981), Donaldson and Weymark (1986), Foster (1984), Hagenaars (1987), Kakwani (1980), Kundu and Smith (1983), Takayama (1979), and Thon (1979). More recent works include Chakravarty (1997), Myles and Picot (2000), Osberg and Xu (2000) and Shorrocks (1995) on a revisited and improved form of the Sen (1976) poverty index; Duclos and Gregoire (2002) on the link between linear poverty indices and relative deprivation; Morduch (1998) and Zheng (1993) on the Watts index; Pattanaik and Sengupta (1995) on the original Sen index; and Shorrocks (1998) on "deprivation profiles".

Applied poverty studies using these developments have been almost innumerable. A small subset of the studies that have been published includes Coulombe and McKay (1998) (Mauritania), Coulombe and McKay (1998) (Ghana), Davidson and Duclos (2000) (using LIS data), Gustafsson and Nivorozhkina (1996) (Northern countries), Grootart and Kanbur (1995) (Côte d'Ivoire), Gustafsson and Shi (2002) (China), Hagenaars and De Vos (1988) (the Netherlands), Hill and Michael (2001) (US), Iceland, Short, Garner, and Johnson (2001) (US), Milanovic (1992) (Poland), Osberg and Xu (1999) (Canada), Osberg (2000) (Canada and the US), Pendakur (2001) (Canada), Rady (2000) (Egypt), Ravallion and Bidani (1994) (Indonesia), Ravallion and Chen (1997) (67 less developed countries), Rodgers and Rodgers (2000) (Australia), and Szulc (1995) (Poland).

The empirical links between growth, poverty and inequality have also often been analyzed in recent years. Studies on whether growth is beneficial to the poor, both absolutely and relatively speaking, include Bigsten and Shimeles (2003) (for Ethiopian evidence), Datt and Ravallion (2002) (for a survey of the Indian evidence), Dollar and Kraay (2002) (for an influential study of the experience of 42 countries over 4 decades), Essama Nssah (1997) (for Madagascar evidence), Ravallion and Chen (1997) (where growth is found to decrease inequality as often as it increases it), Ravallion (2001) (where a warning against the use of cross-country regressions is made), and Ravallion and Datt (2002) (for differential evidence across Indian states). De Janvry and Sadoulet (2000), Deininger and Squire (1998) and Ravallion (1998a) also apply causal tests to determine whether inequality favors or impedes growth. See also Ravallion and Chen (2003) and Tsui (1996) for the use of the average poverty gap and the Watts index as indices of whether growth is beneficial to the poor.

Image

Figure 5.2: The relative contribution of the poor to FGT indices

Image

Figure 5.3: Socially-representative poverty gaps for the FGT indices

Image

Figure 5.4: The cumulative poverty gap (CPG) curve

Chapter 6
ESTIMATING POVERTY LINES

Three major issues arise in the estimation and in the use of poverty lines. First, we must define the space in which well-being is to be measured. As discussed in Chapter 1, this can be the space of utility, incomes, "basic needs", functionings, or capabilities. Second, we must determine whether we are interested in an absolute or in a relative poverty line in the space considered. Third, we must choose whether it is by someone's "capacity to function" or by someone's "actual functioning" that we will judge if that person is poor. We consider first the issue of the choice between an absolute and a relative poverty line.

6.1 Absolute and relative poverty lines

An absolute poverty line can be interpreted as fixed in any one of the spaces in which we wish to assess well-being. Conversely, a relative poverty line would depend on the distribution of well-being (including the utilities, living standards, functionings or capabilities) found in a society and would therefore vary across societies. Considerable controversy exists on whether absoluteness or relativity is a better property for a poverty threshold. Most analysts would probably agree that a poverty threshold defined in the space of functionings and capabilities should be absolute (but even on this there is no unanimity). An absolute threshold in these spaces would, however, generally imply relativity of the corresponding thresholds in the space of the commodities and in the level of basic needs required to achieve these functionings.

There are two main reasons for this. First, the relative prices and the availability of commodities depend on the distribution of incomes. For instance, as a society initially develops, rising numbers of people need to travel to work and to trade, without first being able to afford the costs of private transportation. Because of increasing returns to scale in the provision of public transportation, the affordability and accessibility of public transportation usually also first increases during that development stage. As societies become richer on average, however, their citizens start making increasing use of private forms of transportation, a phenomenon which causes a fall in the supply and availability of public transportation, leading to an increase in its relative price. This makes the capacity to travel (arguably an important capacity) more or less costly, depending on the state of economic development.

Second, not to be deprived of some capability may require the absence of relative deprivation in the space of some commodities. In support of this, there is Adam Smith's famous statement that the commodities needed to go without shame (an oft-mentioned basic functioning) can be to some extent relative to the distribution of such commodities in a society:

By necessaries I understand not only the commodities which are indispensably necessary for the support of life, but whatever the custom of the country renders it indecent for creditable people, even of the lowest order, to be without. A linen shirt, for example, is, strictly speaking, not a necessary of life. The Greeks and Romans lived, I suppose, very comfortably though they had no linen. But in the present times, through the greater part of Europe, a creditable day-laborer would be ashamed to appear in public without a linen shirt, the want of which would be supposed to denote that disgraceful degree of poverty which, it is presumed, nobody can well fall into without extreme bad conduct. Custom, in the same manner, has rendered leather shoes a necessary of life in England. The poorest creditable person of either sex would be ashamed to appear in public without them. In Scotland, custom has rendered them a necessary of life to the lowest order of men; but not to the same order of women, who may, without any discredit, walk about barefooted. In France they are necessaries neither to men nor to women, the lowest rank of both sexes appearing there publicly, without any discredit, sometimes in wooden shoes, and sometimes barefooted. Under necessaries, therefore, I comprehend not only those things which nature, but those things which the established rules of decency have rendered necessary to the lowest rank of people. (Smith 1776, Book 5, Chapter 2)

Sen (1985), reinforces this by distinguishing clearly the two dimensions of capabilities and commodities:

I would like to say that poverty is an absolute notion in the space of capabilities but very often it will take a relative form in the space of commodities and characteristics (Sen 1985, p.335).

This view is in fact also consistent with the World Bank's influential definition of poverty, which says that poverty is the inability to attain a minimal standard of living (World Bank 1990). This minimal standard consists of

of nutrition and other basic necessities and a further amount that varies from country to country, reflecting the cost of participating in the everyday life of society. (World Bank 1990, p. 26)

This has led some writers (particularly in developed countries) to conclude that attempts to preserve some degree of absoluteness in the space of commodities are untenable:

In summary, it does not seem possible to develop an approach to poverty measurement which is linked to absolute standards. While some analysts are uneasy with relativist concepts of poverty on the grounds that they are difficult to comprehend and can be seen as somewhat arbitrary and open to manipulation, no real practical alternative to relativist concepts exists. (Saunders 1994, p. 227)

6.2 Social exclusion and relative deprivation

Complete relativity of the poverty line in the space of commodities would nevertheless draw poverty analysis very close to the analysis of social exclusion (as exemplified by Rodgers, Gore, and Figueiredo 1995 at the International Labor Organization) and relative deprivation (as propounded for instance by Townsend 1979). Social exclusion entails "the drawing of inappropriate group distinctions between free and equal individuals which deny access to or participation in exchange or interaction" (Silver 1994, p.557). This includes participation in property, earnings, public goods, and in the prevailing consumption level (Silver 1994, p.541). Relative deprivation focuses on the inability to enjoy living standards and activities that are ordinarily observed in a society. Townsend (1979) defines it as a situation in which

Individuals, families and groups in the population (...) lack the resources to obtain the types of diet, participate in the activities and have the living conditions and amenities which are customary or at least widely encouraged or approved, in the society to which they belong, (p.30)

Equating absolute deprivation in the space of capabilities with relative deprivation in the space of commodities can, however, be a source of confusion in poverty comparisons. First, it tends to blur the operational and conceptual distinction between poverty and inequality. Second, it can hinder the identification of "core" or absolute poverty in any of the spaces. The identification of core poverty is, indeed, probably the most important input into the design of public policy in developing countries. Third, although the ethical appeal of Sen's capability approach has variously been invoked to justify the use of an entirely relative poverty line in the space of commodities, Sen himself does not accept this:

Indeed, there is an irreducible core of absolute deprivation in our idea of poverty, which translates reports of starvation, malnutrition and visible hardship into a diagnosis of poverty without having to ascertain first the relative picture. Thus the approach of relative deprivation supplements rather than supplants the analysis of poverty in terms of absolute dispossession (Sen 1981, p. 17)).

Furthermore,

(...) considerations of relative deprivation are relevant in specifying the 'basic' needs, but attempts to make relative deprivation the sole basis of such specification is doomed to failure since there is an irreducible core of absolute deprivation in the concept of poverty (Sen 1981, p.17).

Given the measurement difficulties involved in estimating relative poverty lines that correspond to absolute poverty lines in the space of functionings and capabilities, analysts often find most transparent to use the space of living standards as the space in which to define an absolute threshold. If this is done, however, it must subsequently be admitted that the procedure will imply a set of thresholds in the space of functionings and capabilities that depend at least partly on the conditions of the society in which an individual lives. Indeed, for a given absolute level of living standard in the space of commodities, an individual's capabilities are generally relative, that is, they depend on his social and economic environment, at least for functionings such as shamelessness and participation in the life of the community.

6.3 Estimating absolute poverty lines

Methodologies for the estimation of poverty lines have been most developed in the context of the fulfillment of basic physiological needs. Although such methodologies have often been set in a welfarist framework, they also matter for the basic needs, functioning or capability approaches since these approaches are also concerned with basic physiological achievements. These methodologies have recently been most often applied to developing country contexts.

6.3.1 Cost of basic needs

The estimation of the "cost of basic needs" (CBN) usually involves two steps. First, an estimation is made of the minimal food expenditures that are necessary for living in good health; we will denote this by zF. Second, an analogous estimate of the required non-food expenditures, ZNF, is computed and added to zF to yield a total poverty line, ZT We consider now in some detail each of these two steps.

6.3.2 Cost of food needs

The first step in the computation of a global poverty line is usually to estimate a food poverty line. The determination of a food poverty line generally proceeds by asking what amount of food expenditures is required to achieve some minimal required level of food-energy intake (or nutrient intake, such as proteins, vitamins, fat, or minerals. Early examples of the application of this approach include Rowntree (1901) and Orshansky (1965). A basket of food commodities is designed or estimated by "food specialists" such as to provide those minimally required levels of food-energy intake. The cost of that basket yields the food poverty line zF.

To illustrate how this exercise can be carried out in practice, consider Figure 6.1, which plots consumption x1(p) and x2(p) of two goods, goods 1 and 2, over a range of percentiles p. For simplicity, Figure 6.1 supposes that good 1 is "income-inelastic" (x1(p) is constant) but that the consumption of good 2 increases with the rank in the distribution of income (it is income elastic). The idea then is to select a combination of x1(p) and x2(p) that provides a given level of minimal calorie intake. For the purposes of this illustration, assume that this minimum energy intake is 3000 calories per day, and that 1 unit of good 1 and 2 provides 2000 and 1000 calories each respectively. Also assume that each unit of good 1 and 2 costs q$.

The cheapest way to achieve the minimum calorie intake would be to consume only of good 1, since good 1 is the most calorie-efficient (we can think of good 1 as "cereals" and good 2 as "meat"). Indeed, each calorie provided by the consumption of good 1 costs q$/2000, whereas each calorie provided by the consumption of good 2 costs twice as much, that is, q$/1000. 1.5 units of good 1 (1.5 units *2000 calories/unit =3000 calories) would then be required for the minimal energy intake to be met, and zF would then equal 1.5q$.

This, however, would suppose a food commodity basket that no individual in Figure 6.1 would be observed to consume. Even at the very bottom of the distribution of income, individuals consume indeed at least some of good 2 at the expense of a diminished consumption of the more calorie-efficient good 1. We should presumably take this information into account if we wished to respect at least to some extent the cultural and culinary preferences of those whose well-being we aim to evaluate. This raises the obvious question of which preferences we should consider. Note that the preferred ratio of good 2 over good 1 increases continuously with p in Figure 6.1. For convenience, denote that ratio by ρ(p) = x2(p)/x1(p). Simple algebra then shows that the cost of attaining the minimum calorie intake is given by zF(p) = 3q$(1 + ρ(p))/(2 + ρ(p)), where zF(p) indicates that zF depends on the rank p of those whose preferences we use to build the commodity basket and to compute the food poverty line.

Figure 6.1 plots zF(p) and shows that it is not neutral to the choice of p. Using the preferences of the poorest, we obtain zF(p = 0) = 1.8q$, but if we use the preferences of the median population, we get zF(p = 0.5) = 2.1q$. This is in fact just one example of a more general standard observation in the literature on poverty lines that the choice of reference parameters matters for the estimation of poverty lines. In Figure 6.1, the farther are the preferences ρ(p) from the most calorie-efficient choice, the more costly is the estimated food poverty line zF(p). Arguably, the preferences ρ(p) should be those of the individuals that are close to the total poverty line, but this is a (partly) circular argument since ρ(p) is itself a determinant of that total poverty line. In practice, an arbitrary value of p is often chosen, reflecting some a priori belief on the position of those at the edge of the total poverty line. A more common (though arguably less commendable) procedure is to compute and

Image

Figure 6.1: Engel curves and cost-of-basic-needs baskets

use an average value of x2(p)/x1(p) over a range of p, such as the bottom 25% or 50% individuals of a population.

Even if we were to agree on the position p at which we wish to observe preferences such as ρ(p), there still remains the awkward fact that preferences will often vary significantly even at this given value of p. Said differently, there are in practice many different actual consumption patterns for a group of "typical poor". One solution is simply to ignore these differences and estimate the typical poor's average consumption patterns. Following this line of argument, consumption expenditures on various food items are regressed against income and the estimated parameters of these regressions are then used to predict the consumption patterns of the "typical poor". These regressions have often been parametric — assuming for instance that expenditures on cereals and meat are globally quadratic or log-linear in total expenditures. It is unlikely, however, that such parametric forms fit appropriately at all income levels, low and high alike. A better statistical procedure would probably be to regress consumption expenditures non parametrically on total expenditures, which would allow for a better fit of the preferences of those around the "typical poor".

An additionally important issue then is whether variations in culinary tastes and food habits across socio-economic characteristics should be taken into account. If no account of such variations are taken, then we can choose as a reference group that group whose diet minimizes food cost while providing the minimum required level of food-energy intake. This would typically generate an unreasonably low level of expenditures for many other groups, with an implied dietary basket of food commodities that could again be very different from those they typically consume.

If, however, full account of diversity in culinary tastes were to be taken, a serious risk would exist of overestimating the poverty lines of those individuals and groups of individuals with a greater taste for expensive foods (e.g., of higher quality or better taste). This is commonly the case, for instance, for urban households, who customarily have more sophisticated culinary tastes than rural dwellers (for the same overall living standards), and have also greater access to a larger variety of imported and expensive foods. This procedure would then assign greater poverty lines to urban versus rural individuals. It would also mean that the utility equivalents of individual food poverty lines would depend on the peculiarities of the individuals' food preferences. This would generally lead to inconsistent comparisons of well-being across urban and rural inhabitants, and would exaggerate the degree of poverty in the urban as compared to the rural areas.

We can illustrate this using Figure 6.2. Figure 6.2 shows baskets of two food commodities, x1 and x2, with three food budget constraints of total food consumption equal to Y0, Yl, and Y2 (these total budgets are expressed in units of x1). Figure 6.2 also shows a "minimum calorie constraint", along which the total calories provided by the consumption of x1 and x2 equal the required minimum level of calorie intake. If no account whatsoever were taken of preferences, Y0 would yield the food poverty line. But along the food budget constraint Y0, there is only one point which meets the minimum calorie constraint (the point at which x1 = Y0 and x2 = 0, and it is of course unlikely that individuals will choose a food basket to be precisely at that corner. An individual with preferences U0 and budget Y0, for instance, would not locate himself on the minimum calorie constraint. It is only with the more generous budget constraint Y1 that this individual will consume the minimally required level of calorie intake, as shown on Figure 6.2.

But not all individuals will necessarily choose to be "calorie-sufficient" even with a total food budget of Y1. Individuals with greater preferences — as in the case of U2 — for the less-calorie efficient good x2 will not choose a food basket on or above the minimum calorie constraint. Individual with preferences U2 will instead need Y2 to be calorie-sufficient. Yet, whether individuals with preferences U1 and budget Y1 are just as well off as individuals with preferences U2 and budget Y2 is debatable. Such would be the assumption, however, if we used two distinct poverty lines Y1 and Y2 for the two different tastes.

As mentioned above, such comparability assumptions are often implicitly made in practice when individuals living in different regions, rural or urban for instance, are assigned different poverty lines for reasons independent of differences in needs or prices. As illustrated in Figure 6.2, this supposes that an individual with "sophisticated" preferences (an urban dweller who has been accustomed to food variety) needs a higher budget to be as "well off" as an individual with less expensive preferences (a rural dweller who is content with eating basic food types). Probably more convincing, however, would be the view that U2 with Y2 in Figure 6.2 provides greater utility and well-being than U1 with Y1. Assigning different poverty lines Y1 and Y2 would then lead to inconsistent and biased poverty estimates.

Minimally required food expenditures can also be (and are often) adjusted for differences in climate, sex, or age, when such differences impact on needs rather than on tastes (as we discussed above). These expenditures can also be adjusted for variations in activity levels, although activity levels depend on the level of one's well-being, and thus on one's poverty status. Activity-level adjustments would thus generate a poverty line that evolves endogenously with the standard of living of individuals, a slightly awkward feature for comparing poverty.

6.3.3 Non-food poverty lines

The subsequent step is usually to estimate the non-food component of the total poverty line. The most popular method for doing this is simply to go

Image

Figure 6.2: Food preferences and the cost of a minimum calorie intake

straight to an estimate of the total poverty line by dividing the food poverty line by the share of food in total expenditures. The intuition behind this is as follows. The larger the food share in total expenditures, the closer the food poverty line should be to the total poverty line. Therefore, the smaller should be the necessary adjustment to the food poverty line (the closer to 1 should be the denominator that divides the food poverty line). Indeed, dividing ZF by ZF/ZT (the food share) gives ZT. The problem of which food share to use is of course an important issue. It is a problem analogous to the one discussed above on what the food basket should be for computing a food poverty line. Popular practices vary, but often make use of:

A- the average food share of those whose total expenditures equal the food poverty line;

E: 18.4.5

B- the average food share of those whose food expenditures equal the food poverty line;

E: 18.4.3

C- the average food share of a bottom proportion of the population (e.g., the 25% or 50% poorest).

In addition to this, another popular method

D- adds to zF the non-food expenditures of those whose total expenditures equal ZF

E: 18.4.7

To see how methods A, B and D work and differ from each other, consider Figure 6.3. Figure 6.3 shows (predicted) total expenditures against various levels of food expenditures. The regression can be done parametrically, but a generally better approach would be to predict total expenditures using a non-parametric regression on food expenditures.1 On each of the two axes is shown the level of the (previously estimated) food poverty line zF. These two levels meet at the 45 degree line.

As indicated above, method A makes use of the average food share of those whose total expenditures equal the food poverty line. Total expenditures equal the food poverty line, zF, at point E on Figure 6.3. The food share at point E is given by the inverse of the slope of the line OE that goes from the origin to point E. The total poverty line according to method A is therefore given by the height of a line OE that extends to just above a level of food expenditures zF. This gives the vertical height of point A as the total poverty line according to method A.

Method B makes use of the average food share of those whose food expenditures equal the food poverty line. Those who consume zF in food are located

 

1 DAD: Distribution|Non-Parametric Regression.

at point B on Figure 6.3. Their food share is given by the inverse of the slope of the straight line that would extend from point O to point B. Hence, dividing zF by that food share brings us back to point B, which is therefore the total poverty line according to method B.

The total poverty line according to method B is more generous than that according to method A since the food share used for B is lower than that used for A. Indeed, method A focusses on the food share of a rather deprived population: those who, in total, only spend the food poverty line. Method B focusses on the food share of a less deprived population: those who, on food only, spend the food poverty line. Since food shares tend to decline with standards of living, method B's food share is usually lower than method A's.

Finally, method D considers the non-food expenditures of those whose total expenditures equal zF. As for method A, these individuals are found at point E on Figure 6.3. Their non-food expenditures are given by the length of line EG on the Figure. Adding these non-food expenditures to zF yields a total poverty line given by the height of point D.

The choice of methods and food shares and the estimation of the non-food poverty lines is rather arbitrary, and the resulting estimate of the total poverty line will also be somewhat arbitrary. Moreover, and perhaps more worryingly, some of the estimates will also vary with the distribution of living standards, as in the case of method C where the food share is an average over a range of individuals. To avoid inconsistencies in poverty comparisons, it would therefore seem preferable to use the same food share across the distributions being compared, and to use methods that do not make estimates overly dependent on a particular distribution of living standards.

6.3.4 Food energy intake

A slightly different method for estimating poverty lines that is popular in the literature is the so-called Food-Energy-Intake (FEI) method. Estimates of observed calorie intakes are first computed and then graphed against observed (total or food) expenditures. The analyst then estimates the expenditures of those whose calorie intake is just at the minimum required for healthy subsistence. When these expenditures are on food, this provides a food poverty line, which can then be used as described above in Section 6.3.3 to provide an estimate of a global poverty line. When the expenditures are total expenditures, the FEI method provides a direct link between a minimum calorie intake and a total poverty line2.

E:18.4.1

Figure 6.4 illustrates how this method works. The curve shows the level of expenditure (measured on the vertical axis) that is observed (on average) at

 

2DAD: Distribution|Non-Parametric Regression.

a given level of calorie intake (shown on the horizontal axis). The curve is increasing and convex, since calorie intake is usually expected to increase at a diminishing rate with food or total expenditures. Above zk, the minimum calorie intake recommended for a healthy life, we read z, the food or total poverty line according to the FEI method.

As just exposed, the FBI method may appear straightforward and simple to implement. A number of conceptual and measurement problems are hidden, however, behind this apparent simplicity. Note for instance that the line traced on Figure 6.4 is the expected link between expenditure and calorie intake; there is in real life a significant amount of variability around this line. How are we to interpret this variability? If it is due to measurement errors, then we may perhaps ignore it. If it is due to variability in preferences, then we may wish to model the calorie-intake-expenditure relationship separately for different groups of the population, as is often done in practice, for urban and rural areas for instance. As in the cost-of-basic-needs method, however, we then run the risk of estimating higher poverty lines for those groups that have more expensive or more sophisticated tastes for food. This would lead to inconsistent comparisons of well-being and poverty, as discussed in Section 6.3.2.

To compute expected expenditure (given the variability of actual observed spending) at a given calorie intake, we can estimate the parameters of a parametric regression linking expenditures to calorie intake. Again, the regression is often postulated to be log-linear or quadratic. This parametric specification supposes, however, that the functional relationship between expenditures and calorie intake is known by the analyst, up to some unknown parameter values. This is unlikely to be true everywhere, especially for those far from the level of calorie intake of interest (e.g., those at the lower and upper tails of the distribution of spending and calorie intake). In such cases, the parametric procedure will make the estimated expenditure poverty line affected by the presence of "outliers" that are relatively far from the minimum level of calorie intake. This procedure will then generate a biased estimator of the "true" poverty line. A more flexible and arguably better approach would be to estimate the link between expenditures and calorie intake non parametrically.

6.3.5 Illustration for Cameroon

To see whether differences in some of the methodologies described above can matter, consider the case of 1996 Cameroon. Table 6.1 shows the result of estimating food, non-food and total poverty lines for the whole of Cameroon and for each of its 6 regions separately. Note that the figures are in Francs CFA adjusted for price differences, with Yaoundé being the reference region. The food poverty line was estimated using the FEI method at 2400 calories per day per adult equivalent. A non-parametric regression using DAD was performed for the whole of Cameroon and separately for each of the 6 regions. The lower non-food poverty line was obtained (non parametrically) using method D in section 6.3.3, and the upper non-food poverty line using method B. Again, the relevant regressions were carried out for the whole of Cameroon and separately for each of its 6 regions.

As can be seen, the link between calorie intake and food expenditures varies systematically across regions. Expected food expenditure at 2400 calories per day is significantly higher in urban areas (Yaoundé, Douala and Other cities) than in the rural ones. In Douala, for instance, a household would need 408 Francs CFA per day per adult equivalent to reach an intake of 2400 calories per day. In the Highlands, no more than 170 Francs CFA would on average be needed. The link between food and total expenditures also varies across Cameroon's regions. Combined with the different estimates for the food poverty lines, this leads to very significant variations across regions in the total poverty lines. Using method D, a lower total poverty line of 589 Francs CFA is obtained for Douala, but that same poverty line is only 235 Francs CFA for the Highlands. Note also that the choice of method B vs method D has a very significant impact on the estimate of the total poverty line. For the whole of Cameroon, the lower and the upper total poverty lines are respectively 373 and 534 Francs CFA, a difference of 43%.

Unsurprisingly, these large differences across regions and across methods have a large impact on national poverty estimates and on regional poverty comparisons. This is illustrated in Table 6.2, which shows the proportion of individuals underneath various poverty lines for various indicators of well-being. "Calorie poverty" (first line) is relatively constant across Cameroon. In the whole of Cameroon, 68.1% of the population was observed to consume less than 2400 calories per day per adult equivalent. This proportion varies between 59.9% (for Other cities) and 86.5% (for Forests) across regions. Roughly the same limited variability and the same poverty rankings appear when food poverty is estimated using for each region its own food poverty line (third line). However, when a common food poverty line is used to assess food poverty in each region (second line), national poverty stays roughly unchanged at around 69% but urban regions now appear significantly less poor than the rural ones. For instance, the poverty headcount in Douala (42.0%) is now only half that of the Highlands (82.5%).

The rest of Table 6.2 confirms these lessons. When a common poverty line is used to compare the regions, rural areas are significantly poorer than urban ones. When region-specific poverty lines are used, these differences are much reduced, and the regional rankings are often even reversed. For example, using a common lower total poverty line (fourth line), the Highlands have a head-count ratio more than three times that of the urban regions. When regional lower total poverty lines are used instead, the Highlands become prominently the least poor of all regions. Setting common as opposed to regional poverty lines can thus have a crucial impact on poverty rankings and the setting of subsequent poverty alleviation policies. The choice of a lower as against an upper total poverty line also makes a difference. For the whole of Cameroon, the proportion of the Cameroonian population in poverty increases from 43.9% to 68.0% when we move from a common lower total poverty line (fourth line) to a common upper total poverty line (sixth line). Clearly, this changes significantly one's understanding of the incidence of poverty in Cameroon.

These results also implicitly warn that the choice of well-being indicators is not neutral to the identification of the poor. In our context, this is because the correlation between calorie intake, food expenditure and total expenditure is imperfect. Table 6.3 indicates, for example, that in bidimensional poverty analyses using any two of these three indicators of well-being, around 20% to 25% of the population is characterized as poor in one dimension but non poor in the other. In the first part of 6.3, we note for instance that 11.2% of the population would be judged poor in terms of calorie intake but not poor in terms of food expenditure. Conversely, 9.6% of the population would be deemed non poor in terms of calorie intake but poor in terms of food expenditure. These proportions are slightly higher for the other bidimensional poverty analyses, which compare food with total expenditure poverty, and calorie with total expenditure poverty, respectively.

6.4 Estimating relative and subjective poverty lines

6.4.1 Relative poverty lines

There are two other popular methodologies for the estimation of poverty lines. The first deals with purely relative poverty lines, which, as we saw above, can be useful to determine the commodities needed for "living without shame" and for participating in the "prevailing consumption level". A relative poverty line is typically set as a somewhat arbitrary proportion of the mean or of some income quantile (often the median). Clearly, such a poverty line will vary with the central tendency of the income distribution, and will not be the same in constant terms across space and time. One possibly awkward feature of the use of a relative poverty line approach is that a policy which raises the income of all, but proportionately more those of the rich, will increase poverty, although the absolute incomes of the poor have risen. Conversely, a natural catastrophe which hurts absolutely everyone will decrease poverty if the rich are proportionately the most hurt3.

E:18.3

Another possibly awkward feature of the use of relative poverty lines is that an improvement in the absolute incomes of some of the poor, with no change

 

3DAD: Poverty|FGT Index.

Table 6.1: Estimated poverty lines in Cameroon according to different methods (Francs CFA/day/adult equivalent), for the whole of Cameroon and separately for its 6 regions

 

FEI food poverty line

Lower non-food poverty line

Lower total CBN poverty line

Upper non-food poverty line

Upper total CBN poverty line

Cameroon

256

117

373

278

534

Yaoundé

337

143

480

412

749

Douala

408

181

589

588

995

Other cities

347

152

499

385

732

Forests

259

134

393

214

473

Highlands

170

65

235

186

357

Savana

204

78

282

190

394

 

Table 6.2: Headcount according to alternative measurement methods and for different regions in Cameroon (% of the population)

 

Yaoundé

Douala

Other cities

Forests

Highlands

Savana

Cameroon

Calorie poverty using common calorie poverty line

73.4

67.3

59.9

86.5

64.6

61.1

68.1

Food poverty using common food poverty line

53.1

42

44.5

82.5

82.5

74

69.5

Food poverty using regional food poverty lines

67.9

67.5

63.2

82.5

61.1

61.2

66.4

Total expenditure poverty using common lower CBN poverty line

19.2

16.5

16

57.7

58.7

49

43.9

Total expenditure poverty using regional lower CBN poverty line

34.7

38.1

31.8

62.6

19

29.7

33.9

Total expenditure poverty using common upper CBN poverty line

41.6

33.4

36.5

83.8

81.1

78.7

68

Total expenditure poverty using regional upper CBN poverty line

59.6

59

58.8

78.1

53.1

55.8

60.1

Proportion of region in total population

7.1

9.6

12.7

18.5

27.8

24.2

100

 

Table 6.3: Distribution of the poor according to calorie, food and total expenditures poverty (% of the population)

 

Calorie poor

Calorie non-poor

Poor in food expenditure

58.5 %

9.6 %

Non poor in food expenditure

11.2 %

20.7 %

 

Poor in total expenditure

Non poor in total expenditure

Poor in food expenditure

56.6 %

9.8 %

Non poor in food expenditure

11.3%

22.2%

 

Poor in total expenditure

Non poor in total expenditure

Calorie poor

55.8 %

12.3 %

Calorie non poor

12.2 %

19.7 %

in the incomes of the others, may in fact increase poverty. To see why, let η and ς be small positive values and let an income distribution be defined as Q(p) + η(p), with

Image

and with η set initially to 0. Choose z = λμ. The un-normalized FGT index is then given by

Image

Note that Image, which also says that the relative poverty line λμ increases with an increase in ς. We may then check how increases in η affect overall poverty, for a small ς. For the headcount index, we find

Image

which says that the headcount necessarily increases whenever someone's income increases, regardless of whether that person is poor or rich. When α > 0,

Image

The term A on the right-hand side of (6.4) is positive: an increase in incomes increases the relative poverty line and thus tends to increase poverty. When p0 > - Fμ), the increase in income is beneficial to the rich: the term B is then nil, and poverty then necessarily increases with η. When p0 < Fμ), the increase in income benefits some of those below the poverty line, and this increase in their absolute living standards explains why the term B is then negative. Whether it is sufficiently negative to offset the positive term A depends 1) on how far below the poverty line these poor are, and 2) on the value of the ethical parameter α. Hence, even with α > 0, relative poverty may increase when growth is beneficial to the poor4.

E:18.3

6.4.2 Subjective poverty lines

An alternative poverty line methodology relies uses subjective information on the link between living standards and well-being. One source of information comes from interviews on what is perceived to be a sound poverty line, using a question found for instance in Goedhart, Halberstadt, Kapteyn, and Van Praag (1977):

We would like to know which net family income would, in your circumstances, be the absolute minimum for you. That is to say, that you would not be able to make both ends meet if you earned less, (p.510)

The answers are subsequently regressed on the incomes of the respondents. The subjective poverty line is given by the point at which the predicted answer to the minimum income question equals the income of the respondents. The basic intuition for this is that unless someone earns that poverty line, he will not truly know that it is indeed the appropriate minimum income needed to "make both ends meet".

This method is illustrated in some detail on Figure 6.5. Each point represents a separate answer to the above query, namely, the minimum income judged to be needed to make both ends meet as a function of the actual income of the respondents. The filled line shows the predicted response of individuals at a given level of income. For low income levels, this predicted minimum subjective income is well above the respondents' income. The predicted minimum subjective income increases with actual income, but not as fast as income itself. Those with below z* answer that they need more than their own income. Those with income above z* answer that they need on average less than their own income. At z*, which is also where the 45-degree line crosses the line of predicted minimum subjective income, that predicted minimum subjective income equals actual income. The subjective poverty line would therefore be estimated here as z*.

One difficulty with the subjective approach is the sensitivity of poverty line estimates to the formulation of the interview questions. Another problem

 

4DAD: Poverty|FGT Index.

comes from the considerable variability in the answers provided, even within groups of relatively socio-economically homogeneous respondents. The presence of this variability is apparent on Figure 6.5 with points sometimes quite far away from the predicted response line. This variability has some awkward consequences. On Figure 6.5, for instance, an individual at point a is someone who would be judged poor according to the subjective income method since his income falls below z*. An individual at a feels, however, that his income exceeds the minimum income he feels to be needed (point a is to the right of the 45-degree line). He would therefore feel that he is not poor. Conversely, someone at point b feels that he is poor, since his reported minimum income exceeds his actual income, but he would be judged not to be poor by the subjective poverty line method.

How, therefore, ought we to interpret this variability? Is it due to measurement errors? If so, then we may probably best ignore it. Is it rather that the link between living standards and true well-being varies systematically even within homogeneous groups of people? If so, then we might not want to use incomes or other direct or indirect indicators of well-being to classify the poor and the non poor. Instead, we should take individuals at their word on whether they declare themselves to be poor or not. But then, this would clearly raise important practical and incentive problems for the design and the implementation of public policy.

6.4.3 Subjective poverty lines with discrete information

An alternative approach to estimating subjective poverty lines is to ask respondents whether they feel that their income is below the poverty line, without directly asking what the value of that poverty line should be. Answers are coded 0 or 1 — according to whether respondents feel that they are poor or not — alongside the respondents' incomes. The estimate of the poverty line is that which best reconciles the distribution of those answers with that of the respondents' incomes.

This is illustrated in Figure 6.6. Each "dot" is an observation of whether a respondent of a certain income level felt poor (1) or not (0). The working assumption is that respondents compare their income to a common subjective poverty line z*. z* is unobserved and must be estimated. One estimation procedure for z* would be to maximize the likelihood that the respondents' declarations of poverty status correspond to that which would be inferred by comparing z* to their incomes. Said differently, the estimator of z* would minimize the likelihood of observing observations within the ellipses of Figure 6.6. Not everyone with an income below z* says that he is poor; conversely, not everyone above z* says that he is not poor. These "classification errors" would be explained by measurement and/or misreporting errors. Hence, on Figure 6.6, there are "false poor" and "false rich", as shown within the ellipses at the bottom left and at the top right of the Figure. Again, this would run into difficulties if individual preference or need heterogeneity were the true explanation for the "classification errors".

6.5 References

The literature on the estimation of poverty lines is both significant and varied. Note that there is often a sharp distinction in tone and in content between those works which focus on poverty in less developed countries and those which address poverty in more developed economies.

Early reviews of the literature include Goedhart, Halberstadt, Kapteyn, and Van Praag (1977) and Hagenaars and Van Praag (1985). An excellent and comprehensive recent review can be found in Ravallion (1998b) — this chapter has been much influenced by it. Greer and Thorbecke (1986) has been influential in establishing the FEI method of estimating a poverty line. A method based on "basic needs budget" is described in Renwick and Bergmann (1993). The differential effects for poverty measurement of choosing FEI vs CBN methods for estimating poverty lines can be found inter alia in Ravallion and Bidani (1994) and in Wodon (1997a).

Barrington (1997), Fisher (1992), Glennerster (2000) and Orshansky (1988) provide critical reviews of the literature on the setting of the official poverty line in the United States.

The consequences and the issues that surround the choice between absolute and relative poverty lines are discussed in Blackburn (1998) (on the empirical sensitivity of poverty comparisons to that choice), de Vos and Zaidi (1998) (on whether poverty lines should be country specific), Foster (1998) and Zheng (1994) (on the consequences for the choice of poverty indices), and Fisher (1995) and Madden (2000) (on the empirical income elasticity of poverty lines).

Subjective methods for setting poverty lines are discussed and explored in de Vos and Garner (1991) (for comparisons of results between the US and the Netherlands), Pradhan and Ravallion (2000) (on perceived consumption adequacy), Stanovnik (1992) (for an application to Slovenia), Van den Bosch, Callan, Estivill, Hausman, Jeandidier, Muffels, and Yfantopoulos (1993) (for a comparison across 7 European countries), Blanchflower and Oswald (2000) (for reported levels of happiness in Great Britain and in the US), and Ravallion and Lokshin (2002) (for perceptions of well-being in Russia).

Figure 6.3: Food, non-food and total poverty lines

Image

Figure 6.4: Expenditure and calorie intake

Image

Figure 6.5: Subjective poverty lines

Image

Figure 6.6: Estimating a subjective poverty line with discrete subjective information

Image

This page intentionally left blank.

Chapter 7
MEASURING PROGRESSIVITY AND VERTICAL EQUITY

As is well-known, the assessment of tax and transfer systems draws mainly on two fundamental principles: efficiency and equity. The former relates to the presence of distortions in the economic behavior of agents, while the latter focuses on distributive justice. Vertical equity as a principle of distributive justice is rarely questioned as such, although the extent to which it must be precisely weighted against efficiency is a matter of intense disagreement among policy analysts. A principle of redistributive justice which gathers even greater support is that of horizontal equity, the equal treatment of equals. The HE principle is often seen as a consequence of the fundamental moral principle of the equal worth of human beings, and as a corollary of the equal sacrifice theories of taxation. This chapter and the next cover in turn the measurement of each of these principles.

7.1 Taxes and transfers

Let X and N represent respectively gross and net incomes, and let T be taxes net of transfers — the net tax for short. Gross income is pre-tax and/or pre-transfer income, and net income is post-tax and/or post-transfer income, that is, N = X - T. For expositional simplicity, we assume in this chapter that gross incomes are exogenous. This is a common assumption in the literature on the measurement of the impact of taxes and transfers, although it can fail to capture the true impact of tax and transfer policies on well-being when these taxes and transfers are non-marginal.

We can expect a part of the net tax to be a function of the value of gross income X. Otherwise, taxes would be lump sum and orthogonal to gross income. We denote this deterministic part by T(X). For several reasons, we also expect T to be stochastically linked to X. In real life, taxes and transfers depend on a number of variables other than gross incomes, such as family

size and composition, age, sex, area of residence, sources of income, type of consumption and savings behavior, and the ability to avoid taxes or claim transfers. Thus, we can think of T as being a stochastic function of X, with

Image

where v is a stochastic tax determinant.

We denote by FX,N(.,.) the joint cumulative distribution function (cdf) of gross and net incomes. Let Qx(p), QN(p) and QT(p) be the p-quantile functions for gross incomes, net incomes and net taxes, respectively. Let FN \ x(.) be the cdf of N conditional on gross income being equal to x. The q-quantile function for net incomes conditional on a p-quantile value for gross incomes is then technically defined as QN(q\p) = inf{s ≥ o|FN|Qx(p)(s) ≥ q} for q ∈ [0, 1], assuming that net incomes are non-negative. QN(q|p) thus gives the net income of the individual whose net income rank is q among all those with gross income equal to Qx(p).

The expected net income of those with Qx(p) is then given by1

Image

and the expected net tax of those with Qx(p) is obtained as

Image

7.2 Concentration curves

An important descriptive and normative tool for capturing the impact of tax and transfer policies is the concentration curve. As we will see, concentration curves can help capture the horizontal and vertical equity of existing tax and transfer systems. They can also serve to predict the impact of reforms to these systems.

E:18.8.11

The concentration curve for T is2:

Image

where Image is average taxes across the population. C T (P) shows the proportion of total taxes paid by the p bottom proportion of the population.

In practice, concentration curves are usually estimated by ordering a finite number n of sample observations (X1, N1),..., (Xn, Nn) in increasing values

 

1DAD: Distribution|Non-Parametric Regression.

2DAD: Curves|Concentration.

of gross incomes, such that X1X2 ≤.... ≤ Xn, with percentiles pi = i/n, i = 1,....., n and with Ti = Xi- Ni. For i = 1,...n, the sample (or "empirical") concentration curve for taxes (Ti = Xi - Ni)is then defined as

Image

As for the empirical Lorenz curves, other values of CT(P) can be estimated by interpolation.

The concentration curve CN(p) for net incomes is analogously defined as

Image

and typically estimated as

Image

where the Nj have been ordered in increasing values of the associated gross incomes Xj. Note that CN(P) is different from the Lorenz curve of net incomes, LN(p), which is defined as:

Image

Empirically, the Lorenz curve for net income is typically estimated as

Image

but where the observations have been re-ordered in increasing values of net incomes, with N1 ≤ N2 ≤...≤ Nn. Thus, CN(p) sums up the expected value of net incomes up to gross income percentile p. LN(p) however, sums up net incomes up to a net income percentile p.

Denote as t the average tax as a proportion of average gross income, with t = μT/μX. When ≠0,we can show that

Image

For a positive t, this indicates that the more concentrated are the taxes among the poor (the smaller the difference Lx(p) - CT(P))the less concentrated among the poor will net incomes be. The reverse is true for transfers (negative t): the more concentrated they are among the poor, the more concentrated net income is among the poor. This link will prove useful later in defining indices of tax progressivity.

7.3 Concentration indices

As for the Lorenz curves and the S-Gini indices of inequality introduced earlier, we can aggregate the distance between p and the concentration curves C(p) to obtain summary indices of concentration. These indices of concentration are useful to compute aggregate indices of progressivity and vertical equity. More generally, they can also serve to decompose the inequality in total income or total consumption into a sum of the concentration of the components of that total income or consumption, such as different sources of income (different types of earnings, interests, dividends, capital gains, taxes, transfers, etc.) or different types of consumption (of food, clothing, housing, etc.).

To define indices of concentration, we can simply weight the distance p - C(p) by an ethical weight κ(P), of which a popular form is again given by κ(p; p) in equation (4.8). This gives the following class of S-Gini indices of concentration, IC(p) 3:

Image

7.4 Decomposition of inequality into income components

7.4.1 Using concentration curves and indices

An S-Gini inequality index for a variable can easily be decomposed as a sum of the concentration indices of the component variables that add up to that variable. This can be useful, for instance, for decomposing total income inequality as a sum of concentration indices for the different sources of income (employment, capital, transfers, etc.), or total expenditure inequality as a sum of concentration indices for food and non-food expenditures, say. For example, let X(1) and X(2) be two types of expenditures, and let X = X(1) + X(2) be total consumption. Let Cx(1) (p) and Cx(2) (p) be the concentration curves of each of the two types of consumption (using X as the ordering variable). The concentration indices for X(c), ICx(c) (p)), c = 1, 2, are as follows:

Image

 

3DAD: Redistribution|Coefficient of Concentration.

Inequality in X can then be decomposed as a sum of the inequality in X(1) and in X(2). The Lorenz curve for total consumption is given by:

Image

which is a simple weighted sum of the concentration curves for each of the two types of consumption. The index of inequality in total consumption is similarly a simple weighted sum of the concentration indices of each of the two types of consumption4:

Image

For given μX(1) and μX(2), the higher the concentration indices ICX(1)(P) and ICX(2)(ρ), the larger the S-Gini index of inequality in total consumption. Moreover, the higher the share μX(C)/μX of the more highly concentrated expenditure, the higher the inequality in total expenditures5.

E:18.8.32

One possible difficulty with the above is that a component which has the same value for all will be judged by the decompositions in (7.13) and (7.14) to have a zero contribution to total inequality. This is because CX(c) (p) = p for all p and ICX(c) = 0 if component c is equally distributed across all individuals. It may be argued, however, that in such a case contribution c should be seen as contributing negatively to total inequality. Being the same for all, component c indeed decreases the inequality introduced by other components. One way to capture this is to rewrite the decompositions (7.13) and (7.14) in reference to LX(p) and IX(ρ).This gives:

Image

and

Image

The two terms on the left of each of these last two expressions give respectively the contributions of components 1 and 2 to the Lorenz curve and the inequality index of total expenditure X. Those conditions must sum to zero.

 

4DAD: Decomposition|S-Gini: Decomposition by Sources.

5DAD: Decomposition|S-Gini: Decomposition by Sources.

7.4.2 Using the Shapley value

An alternative approach uses the Shapley value to express inequality in total income as a sum of the contributions of inequality in individual income components. For expositional simplicity, assume again that there are only two income components, X(1) and X(2). Total inequality is then given by I (X(1), X(2)). Suppose that we replace the two income components X(1) and X(2) by their mean value μx1 and μx(2), to yield IX(1), μX(2)). Clearly, inequality would be zero after such a substitution. Total inequality can then be expressed as:

Image

An estimate of the contribution of component 1 to total inequality would be given by the second line, and the third line would indicate the contribution of component 2. These estimated contributions are in general dependent upon the order in which the components are replaced by their mean value. The contribution of component 1 could for instance be estimated alternatively as I (X(1),μx2). To solve this order dependency problem, we can use the Shapley value to define the contribution of a component c to total inequality as its expected contribution to inequality reduction when it is added randomly to anyone of the various subsets of components that one can choose from the set of all components. With two components, this gives 6:

Image

7.5 Progressivity comparisons

7.5.1 Deterministic tax and benefit systems

Let us for a moment assume that the tax system is non-stochastic (or deterministic), namely, that v equals a constant zero. Suppose also for now that this deterministic tax system does not rerank individuals, or equivalently that

 

6DAD: Decomposition|S-Gini: Decomposition by Sources.

T(1)(X) ≤ 1. Furthermore, denote the average rate of taxation at gross income X by t(X) with t(X) = T(X)/X7. Assuming no reranking, a net tax

E:18.8.9

(possibly including a transfer or subsidy) T(X) is said to be

Image locally progressive at X = x if the average rate of taxation increases with X, that is, if t(1)(x) >0;

Image locally proportional at X = x if the average rate of taxation stays constant with X, that is, if t(1)(x)=0;

Image and locally regressive at X = x if the average rate of taxation decreases with X, that is, if t(1)(x) < 0.

E: 18.8.10

There are two popular "local" measures to capture the change in taxes and net income as gross income increases. One is the elasticity of taxes with respect to X, also called Liability Progression, LP(X):

Image

LP(X) is simply the ratio of the marginal tax rate over the average tax rate at X. It is possible to show that a tax system is everywhere progressive (namely, t(1)(X) >0 everywhere) if LP(X) > 1 everywhere. The larger this measure at every X, the more concentrated among the richer are the taxes.

One problem with LP(X) is that it is not defined when T(X) = 0, and that it is awkward to interpret when a net tax is sometimes negative and sometimes positive across gross income. Another problem is that it is linked to the relative distribution of taxes, not with the relative distribution of the associated net incomes.

These problems are avoided by the use of a second local measure of progression, called Residual Progression (RP(X)), which is the elasticity of net income with respect to gross income:

Image

Unlike LP(X), RP(X) is well defined and easily interpretable even when taxes are sometimes negative, positive or zero, so long as gross and net incomes are strictly positive. It is then possible to show that a tax system is everywhere progressive (again, this means that t(1)(X)> 0 everywhere) if RP(X) <1 everywhere.

There is a nice link between these measures of progressivity and the redistributive impact of taxes.

 

7DAD: Distribution|Non-Parametric Regression.

Progressivity and inequality reduction

Assuming no reranking, the following conditions are equivalent:

1 t(1)(x)> 0 for all X;

2 LP(X) > 1 for all X (assuming T(X) > 0);

3 RP(X)< 1 for all x;

4 Lx(p) > CT(P) for all p and for any distribution FX of gross income (assuming μT > 0);

5 LN(p) > LX(P) for all p and for any distribution FX of gross income.

Progressive taxation will thus make the distribution of net incomes unambiguously more equal than the distribution of gross incomes, regardless of that actual distribution of gross incomes. Moreover, if the residual progression for a tax system A is always lower than that of a tax system B, whatever the value of X, then the tax system A is said to be everywhere more residual -progressive than the tax system B, and the distribution of net incomes will always be more equal under A than under B, again regardless of the distribution of gross incomes.

Hence, an important distributive consequence of progressive taxation is to make the inequality of net incomes lower than that of gross income. Analogously, proportional taxation will not change inequality, and regressive taxation will increase inequality. The more progressive the tax system, the more inequality-reducing it is. To check whether a deterministic tax system is progressive, proportional or regressive, we may thus simply plot the average tax rate as a function of X and observe its slope. Alternatively, we may estimate and graph its Liability progression or its residual progression at various values of X. To check whether a tax system is more residual -progressive (and thus more redistributive) than another one, we simply plot and compare the elasticity of net incomes with respect to gross incomes. All of this can be done using non-parametric regressions of T(X) and N against X.8

Another informative descriptive approach is to compare the share in taxes and benefits to the share in the population of individuals at various ranks in the distribution of gross income. This is most easily done by plotting on a graph the ratios T(X)/μT orImage for various values of X or p. If these ratios exceed 1, then those individuals with those incomes or ranks pay a greater share of total taxes than their population share. A similar intuition applies when T(·) is a benefit: a ratio T(X)/nr or Image that exceeds 1 indicates that the benefit share exceeds the population share. If T(X) or Image increases

 

8DAD: Distribution|Non-Parametric Regression.

proportionately faster than X or Qx(p), then the tax system is everywhere locally progressive.

A competing descriptive tool is to plot the ratio of taxes over gross income, that is, T(X)/X, perhaps assessed at some rank p to give Image Such a graph shows how the average tax rate evolves with gross income or ranks. When these ratios increase everywhere with X, the tax is everywhere locally progressive.

7.5.2 General tax and benefit systems

Although graphically informative, the above simple descriptive approaches present three main problems. First, if T(1) (X) > 1, the tax system will induce reranking, even if it is a deterministic function of X. As we will see below, reranking (and, more generally, horizontal inequity) decreases the redistributive effect of taxation, besides being of significant ethical concern in its own right.

Second, and more importantly in empirical applications, taxes are typically not a deterministic function of gross income, and randomness in taxes will introduce greater variability and inequality in net incomes than the above deterministic approach would predict. X - T(X) may then be an unreliable guide to the distribution of net incomes, and the above theorems relating local progression measures to global redistributive impact lose a great part of their practical usefulness. Randomness in taxes will also introduce further reranking. These features will reduce the redistributive effect of the tax, and may even in the most extreme cases increase inequality even when the "deterministic trend" of the tax is progressive - even when t(1) (X) > 0.

Third, the actual redistribution effected by taxes depends on the distribution of gross incomes, and not only on the shape of the tax function T. Said differently, the actual redistributive effect of Liability or residual progression will depend on the actual distribution of gross incomes. Arguably, the actual redistribution operated by a tax system is probably of greater interest than its potential impact. A tax may be very locally progressive over some ranges of gross income, but the actual redistributive impact will depend on the interaction of this local progression with the distribution of gross incomes.

7.6 Tax and income redistribution

To deal with these difficulties, we can use the actual distribution of taxes T and net incomes N (instead of their predicted values T(X) and X - T(X)) to determine whether the actual tax system is really progressive and inequality-reducing. This amounts to combining the local measures of progressivity with the distribution of gross incomes to generate global measures of progressivity.

There are two leading approaches for this exercise. The first is the Tax-redistribution (TR) approach, and the second is the Income- redistribution (IR) approach. The global definitions of tax progressivity associated to each of these approaches are as follows.

E: 18.8.2

1 For TR progressivity:

(a) A tax T is TR-progressive if9

Image

(b) A benefit B is TR-progressive if 10

Image

(c) A tax T(1) is more TR-progressive than a tax T(2) if 11

Image

(d) A benefit B(1)is more TR-progressive than a benefit B(2) if 12

Image

(e) A tax T is more TR -progressive than a benefit B if 13

Image

E:18.8.3

2 For IR progressivity:

(a) A net tax T is IR-progressive if 14

Image

(b) A net tax T(1) is more IR-progressive than a tax (and/or a transfer) T(2) if 15

Image

These two TR and IR approaches are consistent with the use above of Liability and residual progression in a deterministic tax system. If v = 0 in

 

9DAD: Redistribution|Tax or Transfer.

10DAD: Redistribution|Tax or Transfer.

11DAD: Redistribution Tax/Transfer vs Tax/Transfer.

12DAD: Redistribution Tax/Transfer vs Tax/Transfer.

13DAD: Redistribution Transfer vs Tax.

14DAD: Redistribution|Tax or Transfer.

15DAD: Redistribution|Tax/Transfer vs Tax/Transfer.

(7.1), and if t(1) (X) > 0 and T(1) (X) ≤ 1 (namely, no reranking), then, whatever the actual distribution of gross incomes, T(X) is both TR- and IR-progressive. Furthermore, if LP(1)(X) > LP(2)(X) at all values of X, then the tax system 1 is necessarily more TR progressive than the tax system 2. And if RP(1)(X) < RP(2)(X) at all values of X, then the tax system 1 is necessarily more IR progressive than the tax system 2.

Note that these progressivity comparisons have as a reference point the initial Lorenz curve. In other words, a tax is progressive if the poorest individuals bear a share of the total tax burden that is less than their share in total gross income. As mentioned above, an alternative reference point would be the cumulative shares in the population. This is often argued in the context of state support — the reference point to assess the equity of public expenditures is population share. The analytical framework above can easily allow for this alternative view — for instance, simply by replacing LX (p) by p in the above definitions of TR progressivity. This will make more stringent the conditions to declare a benefit to be progressive, but it will also make it easier for a tax to be declared progressive — to see this, compare (7.21) and (7.22).

7.7 References

Many of the classical texts on the concept, the role and the measurement of tax progressivity date from the 1950's but they are still very relevant today — they include Blum and Kahen Jr. (1963), Musgrave and Thin (1948), Slitor (1948) and Vickrey (1972). See also Okun (1975) for an influential discussion of the interaction between efficiency and equity issues, as well as Pechman (1985) on incidence analysis.

The measurement of progressivity and vertical equity moved forward significantly in the middle of the 1970's following the slightly earlier advances on the measurement of inequality — see, for instance, Fellman (1976), Jakobsson (1976) and Kakwani (1977a) for the link between progressivity and inequality reduction, and Kakwani (1977b), Suits (1977) and Reynolds and Smolensky (1977) for influential indices of tax progressivity and vertical equity. Reviews of the literature can be found in Lambert (1993) and Lambert (2001).

For papers that address general links between progressivity and inequality, see Davies and Hoy (2002) (for the inequality-reducing properties of "flat taxes"), Latham (1993) (for how to assess whether one tax is more progressive than another), Liu (1985) (for tax progressivity and Lorenz dominance), Moyes and Shorrocks (1998) (on the difficulties that arise for the measurement of progressivity when households differ in needs), and Thistle (1988) (for residual progression and progressivity).

Numerous versions of other specific tax progressivity indices have been discussed and presented over the years. These include, for example, Baum (1987), for "relative share adjustment" indices; Blackorby and Donaldson (1984) and

Kiefer (1984), for normatively-based indices of progressivity; Duclos (1995a), Duclos and Tabi (1996) and Duclos (1997b), for indices of the "social performance" of tax progressivity; Duclos (1998), for normative foundations for the Suits progressivity index; Hayes, Slottje, and Lambert (1992), for effective tax progression across percentiles; and Zandvakili (1994), Zandvakili (1995) and Zandvakili and Mills (2001) for the use of progressivity indices derived from Generalized entropy and Atkinson inequality indices.

Linear indices of progressivity derive from the class of linear inequality indices introduced in Mehran (1976). They are discussed inter alia Duclos (2000), Kakwani (1987), Pfahler (1983) and Pfahler (1987).

Some of the literature has also tended to focus on the tension and on the links between local and global progressivity. See, for instance, Baum (1998), Cassady, Ruggeri, and Van Wart (1996), Formby, Seaks, and Smith (1984), Formby, Smith, and Thistle (1987), Formby, Smith, and Thistle (1990) and Formby, Smith, and Sykes (1986). See also Duclos (1995a) for a method for estimating the average residual progression of unevenly progressive tax and benefit systems, and Keen, Papapanagos, and Shorrocks (2000) and Le Breton, Moyes, and Trannoy (1996) for the impact of changes in tax components (such as sizes of allowances) on the progressivity of the overall tax system.

The influence of the "initial distribution" (that of gross incomes) on progressivity measurement is studied in Dardanoni and Lambert (2002) (for a "transplant-and-compare" procedure) and in Lambert and Pfahler (1992) - see also the comment by Milanovic (1994a). Yardsticks for assessing the effectiveness of tax and benefit policies in reducing initial inequality are proposed in Fellman, Jäntti, and Lambert (1999) and Fellman (2001).

How income is measured is also of importance for the measurement of progressivity and redistribution. See in particular Altshuler and Schwartz (1996) (for the annual vs a "time-exposure" incidence of the US child care tax credit), Caspersen and Metcalf (1994) (for the annual vs lifetime incidence of value-added taxes), Creedy and van de Ven (2001) (for the annual vs lifetime incidence of the Australian tax and benefit system), Lyon and Schwab (1995) (for the annual vs lifetime incidence of taxes on cigarettes and alcohol), Metcalf (1994) (for the lifetime incidence of US state and local taxes), Nelissen (1998) (for the lifetime incidence of Dutch social security),

Empirical studies of progressivity and redistribution have been very numerous over the last three decades. They include Bishop, Chow, and Formby (1995a) (redistribution in six LIS countries), Borg, Mason, and Shapiro (1991) (regressivity of taxes on casino gambling), Davidson and Duclos (1997) (progressivity in Canada), Decoster and Van Camp (2001) (the redistributive effect of a shift from direct to indirect taxation in Belgium), Dilnot, Kay, and Norris (1984) (progressivity in the UK between 1948 and 1982), Duclos and Tabi (1999) (redistribution in Canada), Giles and Johnson (1994) (redistri­bution in the UK), Gravelle (1992) (the redistributive effect of the 1986 US tax reform), Hanratty and Blank (1992) (the comparative poverty effect of redistributive policies in the US and in Canada), Heady, Mitrakos, and Tsak-loglou (2001) (the redistributive effect of social transfers in the European Union), Hills (1991) (the redistributive effect of British housing subsidies), Howard, Ruggeri, and Van Wart (1994) (the redistributive effect of taxes in Canada), Khetan and Poddar (1976) (redistribution in Canada), Loomis and Revier (1988) (the redistributive effect of excise taxes), Mercader Prats (1997) (redistribution in Spain, 1980-1994), Milanovic (1995) (the redistributive effect of transfers in Eastern Europe and in Russia), Morris and Preston (1986) (redistribution in the UK), Norregaard (1990) (tax progressivity in the OECD countries), O'higgins and Ruggles (1981) (redistribution in the UK), O'higgins, Schmaus, and Stephenson (1989) (comparative redistribution of taxes and transfers in seven countries), Persson and Wissen (1984) (the impact of tax evasion on redistribution), Price and Novak (1999) (the regressivity of implicit taxes on lottery games), Ruggeri, Van Wart, and Howard (1994) (the redistributive impact of government spending in Canada), Ruggles and O'higgins (1981) (the redistributive impact of government spending in the US), Schwarz and Gustafsson (1991) (redistribution in Sweden), Smeeding and Coder (1995) (redistribution in 6 LIS countries), van Doorslaer, Wagstaff, van der Burg, Christiansen, Citoni, Di Biase, Gerdtham, Gerfin, Gross, and Hakinnen (1999) (the redistributive impact of health care financing in 12 OECD countries), Vermaeten, Gillespie, and Vermaeten (1995) (the redistributive impact of taxes in Canada, 1951-1988), Wagstaff and van Doorslaer (1997) (the redistributive impact of health care financing in the Netherlands), Wagstaff, van Doorslaer, Hattem, Calonge, Christiansen, Citoni, Gerdtham, Gerfin, Gross, and Hakinnen (1999) (the redistributive impact of personal income taxation in 12 OECD countries), Younger, Sahn, Haggblade, and Dorosh (1999) (tax incidence in Madagascar).

Benefit incidence analysis is also regularly carried out in less developed economies - see, for instance, Lanjouw and Ravallion (1999) for the role of differentiated "program capture" in explaining the evolution of the incidence of benefits, Sahn, Younger, and Simler (2000) for a dominance analysis of benefit incidence in Romania, van de Walle (1998a) for a discussion of general issues, and Wodon and Yitzhaki (2002) for the role of program allocation rules in the study of benefit incidence.

There have been numerous papers decomposing the Gini indices into sums of contributions of income sources. These include Aaberge, Bjorklund, Jantti, Pedersen, Smith, and Wennemo (2000), Achdut (1996), Cancian and Reed (1998), Gustafsson and Shi (2001), Keeney (2000), Leibbrandt, Woolard, and Woolard (2000), Lerman (1999), Lerman and Yitzhaki (1985), Morduch and Sicular (2002), Podder (1993), Podder and Mukhopadhaya (2001), Podder and Chatterjee (2002), Reed and Cancian (2001), Shorrocks (1982), Silber (1989), Silber (1993), Silber (1989), Sotomayor (1996), Wodon (1999), and Yao (1997).

Chapter 8
HORIZONTAL EQUITY, RERANKING AND REDISTRIBUTION

In this chapter, we examine in more detail a more neglected aspect of the notion of redistributive justice: horizontal equity (HE) in taxation (including negative taxation).1 Two main approaches to the measurement of HE are found in the literature, which has evolved substantially in the last thirty years. The classical formulation of the HE principle prescribes the equal treatment of individuals who share the same level of welfare before government intervention. HE may also be viewed as implying the absence of reranking: for a tax to be horizontally equitable, the ranking of individuals on the basis of pre-tax welfare should not be altered by a fiscal system. Most of the analysis below will involve ethical indices. We will see that, depending on the choice of the underlying social welfare function or inequality index, horizontal inequity will be captured either by a "classical" horizontal inequity index or by a "reranking" one.

8.1 Ethical and other foundations

Why should concerns for horizontal equity influence the design of an optimal tax and transfer system? Several answers have been provided, using either of two approaches. The traditional or "classical" approach defines HE as the equal treatment of equals (see Musgrave (1959)). While this principle is generally well accepted, different rationales are advanced to support it. First, a tax which discriminates between comparable individuals is liable to create resentment and a sense of insecurity, possibly also leading to social unrest.

Second, the principles of progressivity and income redistribution, which are key elements of most tax and transfer systems, are generally undermined

 

1This chapter draws extensively from Duclos, Jalbert, and Araar (2003), where more details can be found.

by horizontal inequity (HI) - as we shall see in our own treatment below. This has indeed been one of the main themes in the development of the reranking approach in the last decades. Hence, a desire for HE may simply derive from a general aversion to inequality, without any further appeal to other normative criteria. HI may moreover suggest the presence of imperfections in the operation of the tax and transfer system, such as an imperfect delivery of social welfare benefits, attributable to poor targeting or to incomplete take-up. It can also signal tax evasion, which can inter alia cost the government significant losses of tax revenue.

Third, HE can be argued to be an ethically more robust principle than VE. VE asks for the reduction of welfare gaps between unequal individuals. Depending on the retained specification of distributive fairness, the strength of the requirements of vertical justice can vary considerably, while the integrity of the principle of horizontal equity remains essentially invariant. This has led several authors to advocate that HE be treated as a separate principle from VE, and thus that HE be one of the objectives over which optimal trade-offs are assessed for the setting of tax policy.

The theory of relative deprivation also suggests that people often specifically compare their relative individual fortune with that of others in similar or close circumstances. The first to formalize the theory of relative deprivation, Davis (1959), expressly allowed for this by suggesting how comparisons with similar vs dissimilar others lead to different kinds of emotional reactions; he used the expression "relative deprivation" for "in-group" comparisons (i.e., for HI), and "relative subordination" for "out-group" comparisons (i.e., for VE) (Davis 1959, p.283). Moreover, in the words of Runciman (1966), another important contributor to that theory, "people often choose reference groups closer to their actual circumstances than those which might be forced on them if their opportunities were better than they are" (p.29).

In a discussion of the post-war British welfare state, Runciman also notes that "the reference groups of the recipients of welfare were virtually bound to remain within the broadly delimited area of potential fellow-beneficiaries. It was anomalies within this area which were the focus of successive grievances, not the relative prosperity of people not obviously comparable" (p.71). Finally, in his theory of social comparison processes, Festinger (1954) also argues that "given a range of possible persons for comparison, someone close to one's own ability or opinion will be chosen for comparison" (p.121). In an income redistribution context, it is thus plausible to assume that comparative reference groups are established on the basis of similar gross incomes and proximate pre-tax ranks, and that individuals subsequently make comparisons of post-tax outcomes across these groups. Individuals would then assess their relative redistributive ill-fortune in reference groups of comparables by monitoring inter alia how they fare compared to similar others, and by assessing whether they

are overtaken by or overtake these comparables in income status, thus providing a plausible "micro-foundation" for the use of HE as a normative criterion.

This suggests that comparisons with close individuals (but not necessarily exact equals) would be at least as important in terms of social and psychological reactions as comparisons with dissimilar individuals, and thus that analysis of HI and reranking in that context should be at least as important as considerations of VE. It also says that, although classical HI and reranking are both necessary and sufficient signs of HI, they are (and will be perceived as) different manifestations of violations of the HE principle.

The value of studying classical HI has nonetheless been questioned by a few authors, who reject the premise that the initial distribution is necessarily just, or who point out that utilitarianism and the Pareto principle may justify the unequal treatment of equals (as discussed above). A number of authors have also expressed dissatisfaction with the classical approach to HE because of the implementation difficulties it was seen to present. Indeed, since no two individuals are ever exactly alike in a finite sample, it was argued that analysis of equals had to proceed on the basis of groupings of unequals which were ultimately arbitrary. The proposed alternative was then to link HI and reranking and to note that the absence of reranking implies the classical requirement of HE. For instance, Feldstein (1976), p.94, argues that

the tax system should preserve the utility order, implying that if two individuals would have the same utility level in the absence of taxation, they should also have the same utility level if there is a tax.

Various other ethical justifications have also been suggested for the requirement of no-reranking. For instance, King (1983) argues in favor of adding (for normative consistency) the qualification "and treating unequals accordingly" to the classical definition of HE. It then becomes clear that classical HE also implies the absence of reranking. Indeed, if two unequals are reranked by some redistribution, then it could be argued at a conceptual level that at a particular point in that process of redistribution, these two unequals became equals and were then made unequal (and reranked), thus violating classical HE. Hence, from the above, it would seem that (quoting again from King 1983, p. 102) "a necessary and sufficient condition for the existence of horizontal inequity is a change in ranking between the ex ante and the ex post distributions". We thus follow each of the approaches in turn, starting with reranking.

8.2 Measuring reranking and redistribution

We first show how to decompose the net redistributive effect of taxes and transfers into vertical equity (VE) and reranking (RR) components. The VE effect measures the tendency of a tax system to "compress" the distribution of net incomes, which is linked to the progressivity of the tax system. The RR term contributes negatively to the net redistributive effect of the tax system.

The use of Lorenz and concentration curves and of the associated S-Gini indices of inequality and redistribution will enable this integration of reranking and horizontal inequity.

8.2.1 Reranking

Recall first the definition of a concentration curve for net income in (7.6). We can show that CN(p) will never be lower than the Lorenz curve LN(P), and will be strictly greater than LN(p) for at least one value of p if there is "reranking" in the redistribution of incomes. (In a continuous distribution, a sufficient condition for reranking is that v in (7.1) is not degenerate, namely, that it is not a constant.) Intuitively, CN(p) cumulates some net incomes whose percentiles in the net income distribution exceed p. These are net incomes that exceed Image and QN(P) Such high incomes are nevertheless possible, however, due to the stochastic term v in (7.1). LN(p) only cumulates the net incomes which equal QN(P) or less. Hence, CN(P) ≥ LN(p). This can also be seen by comparing the estimators in equations (7.7) and (7.9). In (7.9), the observations of Nj are cumulated in increasing values of Nj, but in (7.7), the observations of Nj ate cumulated in increasing values of Xj, which means that some higher values of Nj may be cumulated before some lower ones.

It is therefore straightforward to conclude that a net tax T will cause reranking (and hence horizontal inequity) if and only if CN(p) > LN(P) for at least one value of p ε]0, 1[. The distance CN(p) - LN(P) can therefore be used

E: 18.8.5

as an indicator of reranking2. A natural S-Gini index of rerankingind is then obtained as a weighted distance between the two curves:

Image

Denoting ICN(p) as the index of concentration of net incomes (recall (7.11)), this index of reranking can also be obtained as

Image

8.2.2 S-Gini indices of equity and redistribution

As for comparisons of inequality and concentration, it is often useful to summarize the progressivity, vertical equity, horizontal inequity as well as the redistributive effect of taxes and transfers into summary indices. We can do this by weighting the differences expressed above by the weights k(p; p) of the S-Gini indices to obtain S-Gini indices of TR- progressivity (IT(p)), IR- progressivity and vertical equity (IV(p)), reranking (RR(p)), and redistribution (IR(p)):

 

2DAD: Curves|Lorenz and DAD: Curves|Concentration.

Image

Image

Image

Image

These indices can also be computed as differences between S-Gini indices of inequality and concentration:

Image

Image

Image

Image

Many of these indices have first been proposed with p = 2, which corresponds to the case of the standard Gini index. IT(p = 2) is known as the Kakwani index of TR progressivity3, IV(p = 2) is known as the Reynolds-

E:18.8.4

Smolensky index of IR progressivity and vertical equity, and RR(p = 2) is known as the Atkinson-Plotnick index of reranking.

8.2.3 Redistribution and vertical and horizontal equity

The difference between the Lorenz curve of net and gross incomes is given by:

Image

The larger this difference, the more redistributive is the tax and benefit system. Alternatively, the net redistribution can be expressed in terms of S-indices4:

E:18.8.6

Image

 

3DAD: Inequality|Gini/S-Gini Index and DAD: Redistribution|Coefficient of Concentration.

4DAD: Inequality|Gini/S-Gini Index and DAD: Redistribution|Goefficient of Concentration.

The first term VE in each of the above two expressions is clearly linked to the definition of IR- progressivity in equation (7.26). As shown in equation (7.10), it can also be expressed in terms of TR- progressivity when t ≠ 0:

Image

and, using S-indices,

Image

Furthermore, if there is more than one tax and/or benefit that make up T, we can decompose total VE as a sum of the IR and TR progressivity of each tax and transfer. Say that there are J such taxes or benefits. Let t(j) be the (overall) average tax rate of the tax T(j) with j = 1,..., J, such that Image and let CT(j) (p) and CN(j) (p) be the concentration curves of net income and taxes corresponding to tax T(j), with N(j) = XT(j). Then, we have

Image

and

Image

CN(j) (p)— LX(P) and IX(P)—ICN(p) capture the vertical equity of tax or transfer j at percentile p, and again can be easily seen to be an element of the definition of IR- progressivity. Each of these VE contributions can also be expressed as a function of TR progressivity at p (when t(j) ≠ 0):

Image

or, using S-Gini indices of IR progressivity, as a function of S-Gini indices of TR progressivity:

Image

The second term on the right-hand side of (8.11) and (8.12) is the redistribution-reducing reranking effect. As is well known from the literature on reranking (see Atkinson, 1979, and Plotnick, 1981, for instance), taking into account reranking when using rank-dependent inequality indices increases measured inequality and decreases the redistributive effect of taxation,

and this explains why IN generally exceeds ICN,and also why the difference can be interpreted as the impact of reranking on the net redistributive effect of taxation.

To interpret that second term, we may also think of individuals resenting being outranked by others, but enjoying outranking others, and then assess their net feeling of resentment by the amount by which the net income of the richer (than themselves) actually exceeds what the net income of the richer class would have been had no "new rich" displaced "old rich" in the distribution of net incomes. We can then show that μN(IN(P) — ICN(p)) is the expected net income resentment of the poorest person in samples of p — 1 randomly selected individuals, and thus that RR(p) is an ethically-weighted indicator of such net resentment in the population.

8.3 Measuring classical horizontal inequity and redistribution

We now turn to the measurement of classical horizontal equity, defined again as "the equal treatment of equals".

8.3.1 Horizontally-equitable net incomes

One natural avenue for measuring whether equals are treated equally is to estimate the variability of taxes and net incomes conditional on some initial value of gross income. We may, for instance, wish to estimate the conditional variability of T at some value of X. Alternatively, and perhaps better for expositional purposes, we may want to show that conditional variability over a range of percentiles p of gross income X, and we may thus want to estimate for example the conditional variance of T at gross incomeImage:5

E:18.8.7

Image

Recent work has, however, attempted to make the measurement of classical HI flow from ethical (as opposed to descriptive or statistical) foundations. We show how this can be done using the popular Atkinson social welfareatk function W(t) introduced in (4.37). For the distribution of net incomes, this social welfare function equals:

Image

Recall that the expected net income of those at rank p in the distribution of gross income is given by Image Hence, if the tax system were horizontally

 

5DAD: Distribution|Conditional Standard Deviation.

equitable and if all individuals at rank p in the distribution of gross income were granted Image in net incomes, the local level of utility would be U (N(p); e) and net-income social welfare would equal

Image

The expected net income utility of those at rank p in the distribution of gross income is, however, equal to

Image

If, instead of U(N(p); e), we assigned individuals at rank p their expected net income utility U(p; e), social welfare would equal

Image

Image is social welfare using ex ante expected net income;Image is social welfare using ex ante expected net income utility. By the concavity of the utility function, we have that Image and this difference captures the local utility cost of net income uncertainty at p. Hence, we also have that Image a feature which we can use to capture the global social welfare cost of HI and its impact on redistribution.

To show the social welfare cost of HI and its impact on redistribution, we can follow either of two approaches. Recall that we have just provided two locally horizontally-equitable tax systems:

Image one in which each individual at rank p in the distribution of gross incomes receives JV(p) and utility U(N(p); e),

Image and one in which each of these individuals receives U(p; e).

In the first case, Image but mean income is the same under the two distributions N(p) and Image since Image Hence, a consequence of HI is to increase inequality and to decrease the redistributive fall in inequality brought about by tax and benefit systems. This is further developed in Section 8.3.2.

The second case imposes a horizontally-equitable local distribution of utility U(p) that equals the ex ante expected local utility. Compared to the actual distribution of net incomes, this reduces inequality but maintain the overall level of social welfare. Hence, it must be that average income under U(p) is lower than under N(p). It also implies that the cost of inequality is lower with U(p). This is further developed in Section 8.3.3.

8.3.2 Change-in-inequality approach

Let the equally distributed equivalent (EDE) incomes for WN(∈), Image and Image be Image, Image respectively. As before, inequality can be measured by the differences between those ξ and the corresponding μ, as a proportion of μ Now observe that

Image

sinceImage and Image. Hence, .HI increases inequality. The overall redistributive change in inequality that results from the effect of taxes and transfers can then be expressed as

Image

Note also that, by (4.35), (8.25) is equivalent to Image when the means of X and N are the same.

Hence, using (8.25) we obtain the following decomposition of the net redistributive change iri inequality6:

Image

VE represents the decrease in inequality yielded by a tax which treats equals equally. Thus, VE can be interpreted as a measure of the underlying vertical equity of horizontally-equitable net taxes Image measures the fall in redistribution attributable to the unequal post-tax treatment of pre-tax equals. The excess of Image over Image is due to the appearance of post-tax income inequality within groups of pre-tax equals.

8.3.3 Cost-of-inequality approach

In the above change-in-inequality approach, average income is kept the same while comparing distributions of actual and horizontally equitable net incomes. Social welfare and inequality do, however, vary across the distributions of N(p) and Image In the second approach, the cost-of-inequality approach, social welfare is kept the same across the distributions being compared but the mean income required to attain this level of welfare varies. Each element of the decomposition in this section thus corresponds to a difference in means at equal social welfare Image

 

6DAD: Redistribution|Duclos & Lambert (1999) and DAD: Redistribution|Duclos, Jalbert & Araar (2003).

The cost of inequality in the distribution of net income can be expressed as:

Image

Recall that Image represents the level of per capita net income that society could use for the elimination of inequality with no loss of social welfare.

Let Image represent the cost of inequality subsequent to a flat (or proportional, and thus inequality neutral) tax on gross incomes that generates the same level of social welfare as the distribution of net incomes. Denote the average income under this welfare-neutral flat tax by μF The net effect of redistribution on the cost of inequality then becomes:

Image

Since Image and since Image we also have

Image

which is positive if Image The more progressive the net tax system, the greater the value of Image. If the net tax system is progressive, the greater the value of e, the greater the redistributive fall in the cost of inequality.

We then write the decomposition of the total variation in the cost of inequality as 7:

Image

The redistributive fall in the cost of inequality then decomposes into two effects.

First, Image is the cost of inequality under a (horizontally-equitable) certainty-equivalent level of net income at all ranks p. This certainty-equivalent net income is given by Image at rank p. Hence, for constant social welfare, an horizontally-equitable tax system corresponds to a distribution of Image to each individual at pre-tax percentile p.

Second, Image in (8.30) measures the difference in the cost of inequality of two horizontally equitable tax systems, the first being a flat tax system, and the second granting everyone his certainty equivalent level of net income, with both systems yielding the same level of social welfare WN. Image is positive if the tax system is progressive in an ex ante,

 

7 DAD: Redistribution|Duclos & Lambert (1999) and DAD: Redistribution|Duclos, Jalbert & Araar (2003).

certainty-equivalent, sense. In such a case, the distribution across percentiles of the certainty-equivalent net incomes is less inequality costly than the distribution of gross incomes.

8.3.4 Decomposition of classical horizontal inequity

We may also wish to know at which percentile or for which population group HI is more pronounced, and by how much it contributes to total classical HI. For this, define the local cost of classical violations of HE at p as:

Image

This is the "risk-premium" of net income uncertainty at percentile p, and it is thus a money-metric cost of local classical HI at p. It is then possible to show that aggregating (8.31) using population weights yields the global index of total classical HI in (8.30):

Image

8.4 References

The literature on horizontal inequity has evolved very significantly over the last 25 years. Recent literature surveys can be found in Jenkins and Lambert (1999), Lambert and Ramos (1997a) and Lambert (2001) (see also the comment by Plotnick (1999) and the earlier reviews of Musgrave (1990) and Plotnick (1985)). See also Balcer and Sadka (1986), Feldstein (1976), Hettich (1983), Lambert and Yitzhaki (1995) and Stiglitz (1982) for a treatment of horizontal equity as a separate principle from vertical equity, and Kaplow (1989), Kaplow (1995) and Kaplow (2000) for a critique of the principle of horizontal inequity.

The early reranking approach was much influenced by Atkinson (1979), Plotnick (1981) and Plotnick (1982) (for the RR(2) index), and King (1983) (for a normative link between inequality, mobility and reranking). See also Chakravarty (1985) for normative links between inequality and reranking, Dardanoni and Lambert (2001) for a statistically-based look at the association between gross and net incomes, Duclos (1993) for the general form of the IR(p) indices, Jenkins (1988a) for a "within-group" horizontal equity focus, Kakwani and Lambert (1999) for a Hi-related analysis of tax discrimination, Kakwani and Lambert (1998) for an axiomatic construction of equity measures, Rosen (1978) for a (rare) utility-based evaluation of horizontal inequity, and Lerman and Yitzhaki (1995) for reasons for which reranking may decrease inequality.

Classical horizontal equity has seen extensive developments particularly in the last 10 years: see, for instance, Aronson, Johnson, and Lambert (1994), Aronson and Lambert (1994), Aronson, Lambert, and Trippeer (1999) and van de Ven, Creedy, and Lambert (2001), for the use of the Gini for calculating both reranking and classical horizontal inequity; Duclos and Lambert (2000), for a cost-of-inequality approach; and Auerbach and Hassett (2002) and Lambert and Ramos (1997b), for a change-in-inequality approach.

Empirical enquiries into the extent of horizontal inequity have also been relatively numerous. They include inter alia Ankrom (1993) for comparative Swedish, British and American evidence, Berliant and Strauss (1985) for the US federal income tax system, Bishop, Formby, and Lambert (2000) for the effects of noncompliance and tax evasion, Creedy (2001) and Creedy (2002) for the impact of non-uniform indirect taxes on horizontal inequity in Australia, Creedy and van de Ven (2001) for the impact on measured horizontal inequity of using different equivalence scales and of using annual vs lifetime income, Decoster, Schokkaert, and Van Camp (1997) for indirect taxation and horizontal inequity in Belgium, Duclos (1995b) for the role of imperfections in poverty alleviation programs, Jenkins (1988b) and Nolan (1987) for the extent of reranking in the UK, Sa Aadu, Shilling, and Sirmans (1991) for whether the treatment of capital gains on owner-occupied housing matters for horizontal inequity, and Stranahan and Borg (1998) for whether an implicit "lottery tax" is a source of horizontal inequity.

The advances in the measurement of horizontal inequity have also led to a desire to decompose the overall measurement of redistribution as a function of progressivity, vertical equity, reranking and classical horizontal inequity. This is done inter alia in Duclos (1993) (with the S-Gini), Duclos (1995b) (with redistributive imperfections), Kakwani (1984) and Kakwani (1986) by using the Gini index but not attempting to measure classical horizontal inequity; and in Aronson, Johnson, and Lambert (1994), Aronson and Lambert (1994), van Doorslaer, Wagstaff, van der Burg, Christiansen, Citoni, Di Biase, Gerdtham, Gerfin, Gross, and Hakinnen (1999) (for health financing in 12 OECD countries), Wagstaff and van Doorslaer (1997) (for health financing in the Netherlands), Wagstaff, van Doorslaer, Hattem, Calonge, Christiansen, Citoni, Gerdtham, Gerfin, Gross, and Hakinnen (1999) (for personal income taxes in 12 OECD countries), all using the Gini index and incorporating both reranking and classical horizontal inequity. See also Wagstaff and van Doorslaer (2001) for a decomposition of total tax progressivity in components such as the progressivity of tax credits, marginal tax rates, allowances and deductions.

PART III
ORDINAL COMPARISONS OF POVERTY AND EQUITY

This page intentionally left blank.

Chapter 9
DISTRIBUTIVE DOMINANCE

9.1 Ordering distributions

We have, up to now, focussed mostly on measuring and comparing cardinal indices of poverty and equity. As discussed in Chapter 4, this has several expositional advantages. The greatest of these advantages is probably that of focussing on only one (or a few) numerical assessments of poverty and equity. It is then relatively straightforward to compare poverty and equity across distributions just by comparing the values of these cardinal indices. The conclusions are then (seemingly) "clear-cut".

There are, however, important reasons to consider instead ordinal comparisons of poverty and equity. The most important one is that comparisons of cardinal poverty and equity indices (comparisons across time, regions, sociodemographic groups, or comparisons of policy regimes, for instance) may be disturbingly sensitive to the choice of indices and poverty lines. For instance, we might find for some poverty lines and indices that poverty is greater in a region A than in a region B, but we then find the opposite for other lines and indices. We could support the introduction of a particular fiscal policy or macroeconomic adjustment program for some social welfare indices, but could be in doubt as to whether the same support would be warranted with other indices. Since there is rarely unanimity as to the right choice of poverty lines and distributive indices, it is clear that such sensitivity can seriously undermine one's confidence in comparing distributions or in making policy recommendations.

9.2 Sensitivity of poverty comparisons

To see this better in the context of poverty comparisons, consider the hypothetical example of Table 9.1. The second, third and fourth lines in the table show the incomes of three individuals in two hypothetical distributions, A and B. Thus, distribution A contains three incomes of 4, 11 and 20 respectively. The bottom 3 lines of the table show the value of the two most popular indices of poverty, the headcount F(z) and the average poverty μ,g(z) indices, at two alternative poverty lines, z = 5 and z = 10. Recall from Section 5.1.2 that the poverty headcount gives the proportion of individuals in a population whose income falls underneath a poverty line. At a poverty line of 5, there is only one such person in poverty in distribution A, and the headcount is thus equal to 0.33. The average poverty gap index is the sum of the distances of the poor's incomes from the poverty line, divided by the total number of people in the population. For instance, at a poverty line of 10, there are 2 people in poverty in B, and the sum of their distances from the poverty line is (10-6)+(10-9)=5. Divided by 3, this gives 1.66 as the average poverty gap in B for a poverty line of 10.

At a poverty line of 5, the headcount in A is clearly greater than in B, but this ranking is spectacularly reversed if we consider instead the same headcount index but at a poverty line of 10. The ranking changes again if we use the same poverty line of 10 but now focus on the average poverty gap μg(z):Image Clearly, the poverty ranking A and B can be quite sensitive to the precise choice of measurement assumptions.

Table 9.1: Sensitivity of poverty comparisons to choice of poverty indices and poverty lines

 

Distribution A

Distribution B

First individual's income

4

6

Second individual's income

11

9

Third individual's income

20

20

F(5)

0.33

0

F(10)

0.33

0.66

μy(1O)

2

1.66

9.3 Ordinal comparisons

The alternative to comparing the value of one or a few cardinal indices is to check whether rankings of poverty and equity are valid for a class of ethical judgments. These classes are defined over classes of indices as well as over ranges of poverty lines (for poverty comparisons). In other words, we do not wish to quantify poverty or equity. We only want to determine whether poverty and equity is higher or lower in one distribution than in another, for a class of ethical judgments. When inferred, an ordinal ranking of poverty and equity across distributions or policies establishes the sign of the differences across these distributions or policies of everyone of the cardinal poverty and equity indices of that class. Note that it can say only whether poverty and equity is higher in one distribution or for one policy than for another, but not by how much. In the article in which he introduces his famous inequality index (or "concentration ratio"), Gini (1914) criticizes the curve introduced earlier by Lorenz (1905) exactly along those lines:

This graphical approach presented two drawbacks (...):

a) it does not provide a precise measurement of concentration

b) it does not allow to assess, not even in some circumstances, when or where concentration is stronger. In fact, if two curves cross each other (...), it is not always possible to say if one denotes a stronger concentration than the other, (translated in Gini 2005, p. 24.)

Ordinal comparisons of poverty do not, therefore, provide precise numerical values to compare with numerical indicators of other aspects or effects of government policy, such as the policy's administrative or efficiency cost. This is seemingly their main defect. It is arguably also their greatest advantage. As seen above in the context of Table 9.1, differences in simple poverty indices can be deceptive when it comes to ranking distributions. They can also quantify deceptively differences across distributions. To illustrate this, consider Table 9.2 with distributions A and B and a poverty line z = 1. The three FGT poverty indices Image agree that poverty has not increased in moving from A to B. But the quantitative change in poverty varies significantly with the value of α. With the poverty headcount, poverty remains the same, but the average poverty gap falls by 33% and the Image index falls by 56%.

Table 9.2: Sensitivity of differences in poverty to choice of indices

Distributions

Firsta

Secondb

Image

Image

Image

A

0.25

2

0.5

0.375

0.28125

B

0.5

2

0.5

0.25

0.125

Differencec

 

 

no change

fall of 33%

fall of 56%

aFirst individual's income.
bSecond individual's income.
cChanges in poverty from A to B.

A focus on ordinal comparisons can save most of the considerable energy and time often spent on selecting poverty lines and poverty indices. It can avoid inter alia the difficult debate on the choice of appropriate theoretical and econometric models for estimating poverty lines. It can also escape arguments on the relative merits and properties of the many distributive indices that have been proposed in the social welfare literature, and of which the previous chapters introduced only a few. Again, this is because of ordinal distributive comparisons simply order distributions, and for this, differences in numerical indices do not need to be estimated. For instance, we will see later in Section 10.1 that we can order robustly distributions A and B in Table 9.1 for all "distribution-sensitive" poverty indices and for any choice of poverty line. If such an ordering is considered sufficiently strong and informative, then, in comparing A and B, we can effectively stop quibbling on whether we should use the Watts index or the average poverty gap as a poverty index, and on whether the poverty line should be 5 or 10.

In short, ordinal poverty comparisons can sometimes be robust to the choice of measurement assumptions, since they will sometimes be valid for wide classes and ranges of such assumptions. When the problem is simply of resolving which of two policies will better alleviate poverty, or determining which of two distributions displays the greatest level of social welfare, or assessing which of two distributions is the most equal, ordinal comparisons can sometimes be sufficiently informative, and cardinal estimates will then not be needed.

9.4 Ethical judgements

9.4.1 Dominance tests

As we will see in detail below, ordinal comparisons of poverty and equity involve using classes of distributive indices. It is useful to define these classes by referring to "orders of normative (or ethical) judgements", an order being denoted as s = 0, 1, 2,.... An ethical judgement of order s thus serves to define a class of indices also of order s. Whether an ordering of poverty and equity is valid for all of the indices that are members of a class of order s is empirically tested through dominance tests, which happen to be convenient variants of well-known stochastic dominance tests also of order s. When two dominance curves of a given order do not intersect, all indices that obey the ethical principles associated to this order of dominance then rank identically the two distributions. Hence, a dominance test of order s serves to test whether some distributive ranking is valid for all of the indices of a class of order s, and that class of order s can be interpreted through the use of ethical judgements of the same order s.

9.4.2 Paretian judgments

A first natural property of normative judgements is that a society should be judged improved whenever the income of one of its members increases and no one else's income decreases. For poverty, this would mean that indices of poverty should (weakly) fall whenever someone's income increases, everything else being the same. ("Weakly fall" means that the index should at the very least not increase following the change, and conversely for "weakly increase". This caveat applies to all of the ethical statements considered in this book.) For social welfare comparisons, this would imply that social welfare indices should increase following this improvement in someone's income. Such indices thus obey the Pareto principle: they must respond favorably to Pareto-improving changes in the distribution of income.

To see this formally, consider the case of a social welfare function, W(y), that depends on a vector y = (y1,..., yn) of n income levels.

Pareto principle

Let y = (y1, ..., yn),η > 0 be any positive constant, and Image = (y1,..., yj + η,..yn). Then the social welfare function W obeys the Pareto principle if and only if W (y) ≤ W (Image) for all possible pairings of y and Image.

Because the ethical condition imposed by the Pareto principle is very weak, we can consider all of the indices that obey that principle to be members of a class of ethical order 0. The poverty indices belonging to a class of order 0 would for instance all fall whenever someone's income increases, everything else being the same. Note that the case of relative poverty might seem to provide an exception to this principle, since an increase in someone's income could increase the relative poverty line and possibly also increase the poverty index. To deal with this possible exception, it is best to think of the poverty line as constant in the current discussion of ethical principles.

All of the indices which obey the Pareto ethical condition then belong to (poverty or social welfare) classes of order s = 0. It has, however, long been recognized that searches for strict Pareto improvements in distributions of incomes are generally doomed to failure, because of fundamental randomness in economic status and because of strong heterogeneity in preferences, endowments and markets. For a distributive change to be strictly Pareto improving, it must indeed not decrease anyone's income, whatever one's peculiar circumstances. This is unlikely ever to be empirically observable, even if we were to focus only on those with incomes below some poverty line. Besides, checking for Pareto -improving temporal changes would require the use of panel data in order to observe individual-specific changes in incomes. Such panel data are rare, and even if we had access to them, they would still not enable us to infer Pareto -improvements over an entire population (as opposed to only over an available sample). To be valid, searches for strict Pareto improvements also plausibly require no change in population size and composition, a difficulty with which we deal below through the use of the anonymity and population invariance principles.

9.4.3 First-order judgments

It is thus natural and logical to consider ethical principles of order higher than that of the Pareto principle. In the light of the above, a plausible higher-order ethical judgement would require that the distributive indices be anony­mous in the incomes of the individuals. That is, ceteris paribus, whether it is an individual named α rather than b that enjoys some level of income should not affect the value of a distributive index. It also follows from this property that interchanging two income levels should not affect distributive indices: these indices thus obey the symmetry or anonymity principle. Formally, we have (for a social welfare function W):

Anonymity principle

Let M be an n × n permutation matrix (a permutation matrix is composed of 0's and 1's, with each row and each column summing to 1) and let Image = My'. Then the social welfare function W obeys the anonymity principle if and only if W (y) = W (Image) for all possible pairings of y and Image

Clearly, this principle would not be acceptable for an index of horizontal equity, but it would seem relatively uncontroversial for comparing inequality, social welfare or poverty across anonymous distributions.

There is another principle that we have implicitly imposed since the beginning of this book and that also goes beyond the Pareto principle. It is usually called the population invariance principle, and it simply states that adding an exact replicate of a population to that same population should not affect distributive comparisons. For a social welfare function W, we thus have:

Population invariance principle

Let Image be a vector of size 2n, with Image and with yj = yj, j = 1,..., n. Then the social welfare function W obeys the population invariance principle if and only if W (y) = W (Image) for all possible pairings of y and Image.

As indicated on page 40, imposing this principle simplifies exposition significantly by enabling the use of quantiles and the normalization of population size to 1. The population invariance principle is thus implicitly imposed everywhere throughout the book.

First-order classes of distributive indices then regroup all of the indices that show a social improvement when the income at some percentile in the population increases and when no other income changes. These indices have properties that are analogous to those of Paretian indices: ceteris paribus, the larger the individual incomes, the better off is society. They are in addition symmetric in income since they obey the anonymity principle.

9.4.4 Higher-order judgments

Even with the above anonymity constraint, it is likely that some of the first-order distributive indices will clash in their distributive ranking. Some of the first-order poverty indices could declare a policy reform to worsen poverty, while others might indicate that the reform improves poverty. To resolve this ambiguity, we may move to a second-order class of distributive indices. As above, this is done by constraining distributive indices to obey additional ethical principles.

To do this, assume that distributive indices must show a social improvement whenever a mean-preserving redistributive transfer from a richer to a poorer individual occurs. This corresponds to imposing the well-known Pigou-Dalton principle on social judgements. To see this formally, consider again the case of a social welfare function W(y).

Pigou-Dalton principle

Let η > 0 be any positive constant, and let Image = (y1,... , yj + η,..., yk – η,..., yn), with yj + η ≤ yk – η. Then the social welfare function W obeys the Pigou-Dalton principle if and only if W (y) ≤ W (Image) for all possible pairings of y and Image.

The second-order classes of distributive indices thus contain those indices that have a greater ethical preference for the poorer than for the richer. They display a preference for equality of income and are therefore said to be distribution-sensitive. For instance, all other things being the same, the more equal the distribution of income among the poor, the lower the level of poverty. Ceteris paribus, if a transfer from a richer to a poorer person takes place, all second-order social welfare indices will increase and all second-order inequality and poverty indices will fall. Note again that all indices that belong to a second-order class of poverty and welfare indices also belong to the first-order class of relevant indices.

There are often sound ethical reasons to be socially more sensitive to what happens towards the bottom of the distribution of income than higher up in it. We may thus be less concerned about a "bad" disequalizing transfer higher up in the distribution of income than lower down. To make this more precise, imagine four levels of income, for individuals 1, 2, 3, and 4, such that y2y1 = y4 - y3 > 0 and y1 < y3. Let a marginal transfer of $1 of income be made from individual 2 to individual 1 (an equalizing transfer) at the same time as an identical marginal $1 is transferred from individual 3 to individual 4 (a disequalizing transfer). This is called in the literature a "favorable composite transfer".

Note that the equalizing transfer is made lower down in the distribution of income than the disequalizing transfer. This can be seen by the fact the recipient of the first transfer, individual 1, has a lower income than the donor of the second transfer, individual 3, since y3 > y1. For a given distance between recipients and donors, the social improvement effect of equalizing transfers is decreasing in the income of the recipient. Said differently, Pigou-Dalton transfers lose their social improvement effects when recipients are more affluent.

Second-order indices which respond favorably to such a "favorable composite transfer" obey the transfer-sensitivity principle and therefore belong to the third-order class of indices. Again, such a favorable composite transfer is made of a beneficial Pigou-Dalton transfer within a lower part of the distribution, coupled with a reverse Pigou-Dalton transfer within an upper part of the distribution. Third-order welfare indices will increase following this change, and third-order poverty and inequality indices will fall. Formally, we have (for a social welfare function W):

Transfer-sensitivity principle

Let η > 0 and yj - yi = yi - yk > 2η with yi < yk.

Also let Image = (y1,..., yi + η,..., yj - η,..., yk - η,..., yi + η,..., yn). Then the social welfare function W obeys the transfer-sensitivity principle if and only if W (y) ≤ W (Image) for all possible pairings of y and Image.

Note that the favorable composite transfer considered above involves no change in the variance of the distribution since yj – yi = yl – yk

We can, if we wish, define subsequent classes of indices in an analogous manner. To define fourth-order indices, for instance, we consider a combination of two exactly opposite and symmetric composite transfers, the first one being favorable and occurring within a lower part of the distribution, and the second one being unfavorable and occurring within a higher part of the distribution. The indices that respond favorably to this combination of composite transfers can then belong to the class of fourth-order indices.

As can be seen, higher-order transfer principles essentially postulate that, as the order increases, the relative ethical weight assigned to the effect of income changes occurring at the bottom of the distribution also increases. Thus, as the order s of the class of distributive indices increases, the indices become more and more sensitive to the distribution of income among the poorest. At the limit, as s becomes very large, only the income of the poorest individual matters in comparing poverty and social welfare across two distributions. In that sense, the poverty and social welfare indices become more and more Rawlsian as s increases.

9.5 References

Much of normative welfare economics has been influenced by the philosophical work of Nozick (1974), Rawls (1971) (see Rawls 1974 for a very short synthesis addressed to economists) and Sen (1982). The combined work of Kolm (1976a) and Kolm (1976b) was the first to introduce the transfer-sensitivity condition into the inequality literature, and Kakwani (1980) subsequently adapted it to poverty measurement. See also Davies and Hoy (1994) (who describe that condition as a Rawlsian extension of the Lorenz criterion), Shorrocks (1987) for a complete characterization of the transfer-sensitivity principle, and Zheng (1997) for an informative discussion of it. Higher-order principles can be interpreted using the generalized transfer principles of Fishburn and Willig (1984) — see also Blackorby and Donaldson (1978) for a description of these principles as becoming "more Rawlsian". surveys of the normative and axiomatic foundations of modern inequality measurement can be found in Blackorby, Bossert, and Donaldson (1999) and Chakravarty (1999).

Other papers which explore variations to the normative principles typically used in distributive analysis are Mosler and Muliere (1996) (for an alternative principle of transfers), Ok (1995) (for a "fuzzy" measurement of inequality), Ok (1997) (for ranking over opportunity sets), Salas (1998) (for marginal population invariance), Zoli (1999) (for a positional transfer principle when Lorenz curves intersect), and Tam and Zhang (1996) (for an alternative Pareto principle defined in terms of growth over the poor).

Experimental evidence on the normative attitudes of individuals and societies towards the measurement of poverty and equity has also grown fast in the last decades. Methods and results can be found in Amiel and Cowell (1992) (on attitudes to inequality — which question the acceptability of transfer and decomposability principles), Amiel and Cowell (1999) (on attitudes to poverty, social welfare and inequality), Amiel and Cowell (1997) (on attitudes towards poverty measurement), and in Amiel, Creedy, and Hurn (1999) (on quantifying inequality aversion using Okun (1975)'s "leaky bucket experiment"). A survey of such attitudes can be found in Corneo and Gruner (2002).

Fong (2001) tests whether normative attitudes can be explained by self-interest or by values about distributive justice. Dolan and Robinson (2001) further explore whether there is a "reference point" problem in such studies, and Ravallion and Lokshin (2002) reports that expectations about future levels of well-being can influence individuals' desire for redistributive policies.

See also Stodder (1991) for empirical evidence as to why inequality aversion can matter for ranking distributions, and Christiansen and Jansen (1978) for an example of the estimation of social preferences using the revealed structure of an existing tax system (the Norwegian one).

A number of studies have recently also attempted to distinguish between attitudes towards inequality and towards risk aversion: see inter alia Amiel, Cowell, and Polovin (2001), Beck (1994), Cowell and Schokkaert (2001), and Kroll and Davidovitz (2003).

This page intentionally left blank.

Chapter 10
POVERTY DOMINANCE

To see how the material of Chapter 9 can be used practically to test for the robustness of poverty comparisons, we focus for simplicity on classes of additive poverty indices denoted as IIs (z+), where s stands again for the "ethical order" of the class and where z+ will stand for the upper bound of the range of all of the poverty lines that can reasonably be envisaged. The additive poverty indices P(z) that are members of that class can be expressed as

Image

where z is a poverty line and π(Q(p);z) is an indicator of the poverty status of someone with income Q(p).

We can also think of the function π(Q(p); z) as the contribution of an individual with income Q(p) to overall poverty P(z). Hence, we can also assume that π(Q(p); z) = 0 if Q(p) > z. This ensures that the poverty indices fulfill the well-known poverty focus principle, which simply states that changes in the incomes of the rich should not affect the poverty measure. The use of quantiles in equation (10.1) also ensures that the poverty indices P(z) obey the anonymity (see page 160) and population invariance principles (see page 160). For expositional simplicity, also assume that π(Q(p);z) is continuously differentiable in Q(p) between 0 and z up to an appropriate order, and denote the ith-order derivative of π(Q(p); z) with respect to Q(p) as π(i)(Q(p);z).

The first class of poverty indices (denoted by II1 (z+)) then regroups all of the poverty indices

Image that decrease when someone's income increases

Image and whose poverty line does not exceed z+.

Formally, indices within II1 (z+) are such that:

Image

where the Pareto principle (page 159) appears through the form of a non-positive first-order derivative π(1)(Q(p);z).

The second class of poverty indices, II2(z+), contains those first-order indices that have a greater ethical preference for the poorer among the poor - recall the Pigou-Dalton principle of page 161. Increasing the income of a poorer individual is better for poverty reduction that increasing by the same amount the income of a richer person. The absolute value of the first-order derivative is therefore decreasing with Q(p), and the indices are thus convex in income. This class II2 (z+) is then:

Image

We will discuss further below the role of the continuity condition π(z, z) = 0. Clearly, II2(z+) ⊂ II1(z+), but not the reverse.

Technically, obeying the "transfer-sensitivity" principle requires for the P(z) indices that their second-order derivative π(2)(Q(p);z) be decreasing in Q(p). Poverty indices belonging to the third-order class of poverty indices II3(z+ are then defined as:

Image

As before, II3(z+)⊂ II2(z+).

Subsequent classes of poverty indices are defined in an analogous manner. Generally speaking, poverty indices P(z) will be members of class IIs(z+) if (-l)s π(s) (Q(p);z) ≤ 0 and if π(i)(z,z) = 0 for i = 0, l, 2..., s - 2. As the order s of the class of poverty indices increases, the indices become more and more sensitive to the distribution of income among the poorest. At the limit, and as mentioned above, only the income of the poorest individual matters in comparing poverty across two distributions. Increasing the order s makes us focus on smaller subsets of poverty indices, in the sense that IIs (z+) ⊂ IIs-l(z+).

All poverty indices seen in Chapter 5 fit into some of the classes defined above. The poverty headcount F(z) clearly belongs to II1 (z+) (whenever z ≤ z+). As we will see, it also plays a crucial role in tests of first-order dominance. But it does not belong to the higher-order classes since it is not continuous at the poverty line. The average poverty gap belongs to II1(z+) and to II2(z+), but not to the higher-order classes. The square of the poverty gaps index belongs to II1 (z+), II2(z+) and II3(z+), but not to II4(z+). More generally, the FGT indices, for which π(Q(p); z) = g(p; z)a, belong to IIs(z+) when a α ≥ s - 1 and z ≤ z+. The Watts index belongs to IIi(z+) and to II2(z+), but not to II3(z+) since it does not obey the π1(z,z) = 0 restriction. A transformation of the Watts index, by which π(Q(p); z) = g(p; z) [ln(z) - In (Q*(p))], would, however, belong to II3(z+). The Chakravarty and Clark et al. indices belong to II1 (z+) and II2 (z+), and so do as well the S-Gini indices of poverty.

We can now see how to determine whether poverty in A is greater than in B for all indices that are members of any one of these classes. For this, there exist two approaches: a primal and a dual one. We consider them in turn.

10.1 Primal approach

10.1.1 Dominance tests

We are interested in whether we may assert confidently that poverty in a distribution A, as measured by PA(z), is larger than poverty in a distribution B, PB(z), for all of the poverty indices P(z) belonging to one of the classes of poverty indices defined above. We are therefore interested in checking whether the following difference in poverty indices ΔP(z) = PA(z) - PB(z) is positive:

Image

where on the second line a change of variable has been effected and where Δf(y) is the difference in the densities of income. To demonstrate the dominance conditions, we will make repetitive use of integration by parts of (10.5). This process will involve the use of stochastic dominance curves Ds(z), for orders of dominance s = 1, 2, 3,.... D1(z) is simply the cdf, F(z), namely, the proportion of individuals underneath the poverty line z. The higher order curves are iteratively defined as

Image

Thus, D2(z) is simply the area underneath the cdf curve for a range of incomes between 0 and z. This is illustrated in Figure 10.1. The curve shows the cdf F(y) at different values of y. The grey-shaded area underneath that curve (up to z) thus gives D2(z).

Defined as in (10.6), dominance curves may seem complicated to calculate. Fortunately, there is a very useful link between the dominance curves and the popular FGT indices, a link that greatly facilitates the computation of Ds(z).

Image

Figure 10.1: Primal stochastic dominance curves

We can indeed show that

Image

where c = 1/(s - 1)! is a constant that can be basically ignored. Therefore, to compute the dominance curve of order s, we need only compute the FGT index at α = s - 1, which is P(z; α = s - 1) (see (5.7)). Recall that P(z;α = 1) is the average poverty gap. Hence, the dominance curve of order 2 is simply the average poverty gap at different poverty lines. This can also be seen on Figure 10.1. The distance between z and y gives (when it is positive) the poverty gap at a given value of income y. For y = y1, for instance, Figure 10.1 shows that distance z - y1. dF(y1) — as measured on the vertical axis — gives the density of individuals at that level of income. The rectangular area given by the product of (z - y1) and dF(yl) then shows the contribution of those with income y1 to the population average poverty gap. Integrating all such positive distances between y and z across the population thus amounts to calculating the average poverty gap — again, this is the sum of individual rectangles of lengths (z - y) and heights dF(y), or simply the grey-shaded area of Figure 10.1.

Let us now integrate by parts equation (10.5). This gives:

Image

where ΔDs(y) is defined as DAA(y)-DsB(y). If we wish to ensure that ΔP(z) is positive for all of the indices that belong to II1(z+), we need to ensure that (10.8) is positive for all of the poverty indices that satisfy the conditions in (10.2), whatever the values of their first-order derivative π1 (y;z), so long as that derivative is everywhere non-positive between 0 and z+. For this to hold, we simply need that (recall that D1(y) = F(y))1:

Image

We refer to this as first-order poverty dominance of B over A. The result can be summarized as follows:

First-order poverty dominance (primal):

Image

The dominance condition in (10.10) is relatively stringent: it requires the headcount index in A never to be lower than the headcount in B, for all possible poverty lines between 0 and z+. If, however, the condition is found to hold

 

1DAD: Dominance|Poverty Dominance.

in practice, a very robust poverty ordering is obtained: we can then unambiguously say that poverty is higher in A than in B for all of the poverty indices in II1(z+) (including the headcount index). Since (almost) all of the poverty indices that have been proposed obey this restriction, this is a very powerful conclusion indeed. Note again that this ordering is valid for any choice of poverty line up to z+.

Moving to second-order poverty dominance, we integrate equation (10.8) once more by parts and find that:

Image

Recall that the indices that are members of II2(z+) are such that π2(Q(p);z)≥ 0 when Q(p) ≤ z and with π(z,z) = 0. Hence, if we wish ΔP(z) to be positive for all of the indices that belong to II2(z+), we must have that:

Image

This is second-order poverty dominance of B over A; it can be summarized as:

Second-order poverty dominance (primal)2:

Image

Recall from 10.7 that D2(z) = P(z; α = 1). Second-order poverty dominance thus requires the average poverty gap in A to be always larger than the average poverty gap in B, for all of the poverty lines between 0 and z+. If the condition in (10.13) is found to hold in practice, then we can say that poverty is higher in A than in B for all of the poverty indices that are continuous at the poverty line and that are equality preferring (their second-order derivative is positive). That, of course, also includes the average poverty gap itself. Most of the indices found in the literature fall into that category, a major exception being the headcount and the Sen index. And that ordering is again valid for any choice of poverty line between 0 and z+.

We can repeat this process for any arbitrarily higher order of dominance, by successive integration by parts and by determining the conditions under which all of the poverty indices P(z) that are members of a class IIs (z+) will indicate more poverty in A than in B, and this for all of the poverty lines z between 0 and z+. This gives the following general formulation of sth order poverty dominance:

 

2DAD: Dominance|Poverty Dominance.

sth-order poverty dominance (primal)3:

Image

This condition is illustrated in Figure 10.2 for general s-order dominance, where dominance holds until z+, but would not hold if z+ exceeded zs. Checking poverty dominance is thus conceptually straightforward. For first-order dominance, we use what has been termed "the poverty incidence curve", which is the headcount index as a function of the range of poverty lines [0, z+]. For second-order dominance, we use the "poverty deficit curve", which is the area underneath the poverty incidence curve or more simply the average poverty gap, again as a function of the range of poverty lines [0, z+]. Third-order dominance makes use of the area underneath the poverty deficit curve, or the square-of-poverty-gaps index (also called the poverty severity curve) for poverty lines between 0 and z+. Dominance curves for greater orders of dominance simply aggregate greater powers of poverty gaps, graphed against the same range of poverty lines [0, z+].

10.1.2 Nesting of dominance tests

The condition (10.13) for second-order dominance is less stringent than (10.10) for first-order poverty dominance. To see why, consider (10.6) again. When first-order dominance over [0, z+] holds, then second-order dominance over [0, z+] must also hold. Hence, when we find that the poverty indices in II1(z+) show more poverty in A, we also know that the poverty indices in II2(z+) will do the same. That is of course consistent with the fact that II2(z+) ⊂ II1(z+}.

Suppose, however, that we have that ΔD2(y) > 0 for all y ∈ [0, z+], but not that ΔD1(y) > 0 for all y ∈ [0, z+}. Hence, we have first-order, but not second-order, dominance. Poverty is larger in A for all of the indices in II2(z+) but not for all those in II1(z+). This is possible since II1(z+) is a larger set than II2(z+).

These relationships are in fact sequentially valid for higher orders as well. This is illustrated in Figures 10.3 and 10.4. Figure 10.3 shows that a class of indices Hs+l(z+) is a subset of the lower-class of indices IIs(z+). Whenever an ordering is made over IIs(z+), it is also necessarily valid over the subset IIs+1(z+). Figure 10.4 analogously illustrates the size of the sets of distributions (A, B) that can be ordered by the dominance condition in (10.14). The greater the value of s, the more likely can a couple (A, B) fall into those sets, and therefore the more likely can they be compared unambiguously by that

 

3DAD: Dominance|Poverty Dominance.

Image

Figure 10.2: s-order poverty dominance

dominance condition. Taken jointly, Figures 10.3 and 10.4 show the trade-off that exists between wishing to assert whether A really has more poverty than B (Figure 10.4), and wishing to assert this for as large a class of poverty indices and poverty lines as possible (Figure 10.3).

For a simple illustration of these relationships, consider a comparison of distributions A and B in Table 9.1 on page 156. The first-order dominance condition (10.10) only holds if z+ is lower than 9. Hence, we can conclude that A has more poverty than B for any choice of first-order indices so long as the poverty line is less than 9. Indeed, it is not hard to find some first-order indices that will show more poverty in B when z exceeds 9: the headcount between 9 and 11 clearly shows more poverty in B. We can, however, verify that the second-order condition is obeyed for any choice of z+. This then implies that all second-order indices (those that are members of II2(z+)) will show more poverty in A, regardless of the choice of poverty line. This is quite a robust statement, since it is valid for all distribution-sensitive poverty indices (the headcount is not distribution-sensitive, hence it does not always indicate more poverty in A) and again for any choice of poverty line. Again, as mentioned above, second-order poverty dominance is a criterion that is less stringent to check in practice than first-order dominance. The price of this, however, is that the set of indices over which poverty dominance is checked is smaller for second-order dominance than for first-order dominance.

10.2 Dual approach

There exists a dual approach to testing first-order and second-order poverty dominance, which is sometimes called a p, percentile, or quantile approach. Whereas the primal approach makes use of curves that censor the population's income at varying poverty lines, the dual approach makes use of curves that truncate the population at varying percentile values. The dual approach has interesting graphical properties, which makes it useful and informative in checking poverty dominance.

10.2.1 First-order poverty dominance

To illustrate this second approach, we focus on indices that aggregate poverty gaps using weights that are functions of p:

Image

Note that using aggregates of poverty gaps as in (10.15) is more restrictive than using functions π(Q(p); z) defined separately over Q(p) and z, as is done in (10.1). When the poverty lines are the same across distributions (as was implicitly assumed above for the primal approach, and as is almost always assumed to be the case in practice), the dominance rankings are, however, the same for the two approaches, as we will see below.

Membership in the (dual) first-order class II1(z+) of poverty indices only requires that the weights ω(p) be non-negative functions of p:

Image

If we want to check whether ΔΓ(z) = ΓA(z) - FB(z) is positive for all of the indices that belong to Ill(z+), we need only assess whether gA(p;z+) ≥ gB(p;z+) for all p ∈ [0,1]. This yields the following dual first-order poverty dominance:

First-order poverty dominance (dual)4:

Image

Condition (10.17) requires poverty gaps to be nowhere lower in A than in B, whatever the percentiles p considered. It thus amounts to ordering the poverty gap curves. It is not difficult to show that this is also equivalent to checking the primal first-order poverty dominance condition in (10.10). In other words, if we can order poverty over II1 (z+), then we can also do so overII1(z+), and vice versa. In fact, first-order poverty dominance (primal or dual) implies ordering all poverty indices (additive or otherwise) that are (weakly) decreasing in income. To check for such a wide degree of ethical robustness, we can use either the primal or the dual first-order poverty dominance condition.

First-order poverty dominance

The following conditions are equivalent:

1 Poverty is higher in A than in B for any of the poverty indices that obey the focus (see p.165), the anonymity (p.160), the population invariance (p.160) and the Pareto (p.159) principles and for any choice of poverty line between 0 and z+;

2 PA(z;α = 0) ≥ PB(z;α = 0) for all z between 0 and z+;

3 gA(p;z+) ≥ gB(p;z+) for all p between 0 and 1.

10.2.2 Second-order poverty dominance

Membership in the dual second-order class II2(z) of poverty indices requires that the weights ω(p) be positive and decreasing functions of the ranks

 

4DAD: Curves|Poverty Gap.

p:

Image

To show what dominance condition applies to (10.18), recall that G(p;z) is the Cumulative Poverty Gap (CPG) curve, and integrate by parts (10.15):

Image

For ΔΓ(z) to be positive for all of the indices that belong to II2(z+) (and therefore also for all poverty lines zz+), we need to order the CPG curves. The result is summarized as:

Second-order poverty dominance (dual)5:

Image

Again, we can show that the condition in (10.20) is equivalent to the primal second-order poverty dominance condition in (10.13). In other words, if and only if ΓA(z) - ΓB(Z) ≥ 0 for all p(z) ∈ II2(z+), then PA(z) - PB(z) > 0 for all P(z) e II2(2+). Thus, to check robustness of poverty ordering over all distribution-sensitive poverty indices, we can use either the primal or the dual second-order poverty dominance condition. This is summarized as follows:

Second-order poverty dominance

The following conditions are equivalent:

1 Poverty is higher in A than in B for any of the poverty indices that obey the focus (see p.165), the anonymity (p.160), the population invariance (p. 160), the Pareto (p. 159) and the Pigou-Dalton (p. 161) principles and for any choice of poverty line between 0 and z+;

2 PA(z; α = 1) ≥ PB(z; α = 1) for all z between 0 and z+;

3 GA(p; z+) ≥ GB(p; z+) for all p between 0 and 1.

10.2.3 Higher-order poverty dominance

Dual conditions for higher-order poverty dominance are not as convenient and simple as those just stated for first- and second-order dominance. It is therefore usual to check higher-order dominance using the primal conditions of (10.14). Stated in terms of ethical principles, third-order dominance reads for instance as:

 

5DAD: Curves|CPG.

Third-order poverty dominance

The following conditions are equivalent:

1 Poverty is higher in A than in B for any of the poverty indices that obey the focus (see p.165), the anonymity (p.160), the population invariance (p.160), the Pareto (p.159), the Pigou-Dalton (p.161) and the transfer-sensitivity principles (p.161) and for any choice of poverty line between 0 and z+;

2 PA(Z; α = 2) ≥ PB(Z; α = 2) for all z between 0 and z+.

10.3 Assessing the limits to dominance

Whether we use the primal or the dual approach, testing for poverty dominance involves specifying an upper bound z+ for the ordering of the dominance curves. This bound can presumably be obtained from empirical or ethical work on what reasonable range of poverty lines should be used to compare poverty. It can of course also be specified arbitrarily by the researcher. An alternative strategy is to use the available sample information and estimate directly from that information the upper bound up to which a distributive comparison can be inferred to be robust. We can then interpret this statistics as a "critical" bound. In the light of the results above, this critical bound will limit the range of poverty lines over which we will be able to order poverty across A and B.

Assume for instance that a primal poverty dominance curve DsA(y) for A is initially higher than that for B for low values of y, but that this ranking is reversed for higher values of y. Let Image be the first crossing point of the curves, such that Image Distribution B then has less poverty than distribution A for all of the poverty indices in IIs(z+), so long as Image As the notation implies, this calculation can be done for any desired order s of poverty dominance.

It may be, however, that we feel (for some order s) that Image is too low. Said differently, being able to order poverty only over a relatively narrow range Image may seem unsatisfactory. We may change this by moving to a higher order of dominance. Indeed, we can show that Image is increasing in s, with Image, whenever Image for some Image and Image for all Image. We may thus increase the range of poverty lines over which a poverty ranking is robust by moving up to a higher class of indices.

This is illustrated in Figure 10.5, where z+ < z++. For the sake of illustration, suppose that the first-order dominance curves of A and B cross somewhere between 0 and z++. It is then impossible to order poverty over all of the indices that belong to IIl(z++). Assume, however, that decreasing the upper bound from z++ to z+ does rank the distributions over II1(z+), and that increasing the order of dominance from 1 to 2 while maintaining the up­per bound at z++ also ranks the distributions in terms of poverty. In either case, poverty is now ordered, but over different sets. The alternative is then to choose between an ordering on indices that are ethically more restrictive (such as II2(z++)), and an ordering on indices with a more restrictive range of poverty lines (such as II1 (z+)).

10.4 References

Methods for testing poverty dominance are relatively recent, and postdate much of the literature on inequality and social welfare dominance. One of the early influential papers is Atkinson (1987) — that paper also introduced the idea of "restricted" dominance. The theoretical poverty dominance conditions have been further and rigorously explored in Foster and Shorrocks (1988b) and Foster and Shorrocks (1988c). Bounds to poverty dominance are discussed in Davidson and Duclos (2000). Zheng (2000a) provides a different approach based on "minimum distribution-sensitivity" poverty indices.

The Pigou-Dalton principle has been framed alternatively as a strong and as a weak axiom for the study of poverty indices (see Donaldson and Weymark (1986) and Zheng (1999a)). In the weak version, the axiom says that the poverty index must increase following a transfer from one individual to another wealthier individual, providing that both are initially below the poverty line and that the transfer does not lift the wealthier person above this threshold. The strong axiom postulates that the index must increase even if this transfer pushes the higher-income recipient above the poverty line. The strong formulation of the axiom is usually preferred.

Del Rio and Ruiz Castillo (2001), Jenkins and Lambert (1998a), Jenkins and Lambert (1998a), Jenkins and Lambert (1998b) and Spencer and Fisher (1992) discuss the use of CPG (or "TIP") curves (initially proposed by Jenkins and Lambert 1997) for second-order poverty dominance, surveys and integrative reviews of the literature can be found in Zheng (1999a), Zheng (2000b) and Zheng (2001a). US applications include Bishop, Formby, and Zeager (1996) (for the marginal impact of food stamps on US poverty) and Zheng, Gushing, and Chow (1995) (for another US application).

Image

Figure 10.3: Poverty indices and ethical judgements — The sets of poverty indices that belong to the classes IIi(z+). i = 1, 2, 3

Image

Figure 10.4: Poverty dominance and income distributions — The sets of distributions that are ordered by the dominance conditions ΔDi(y) ≥ 0, y ≤ z+, and i = 1, 2, 3

Image

Figure 10.5: Classes of poverty indices and upper bounds for poverty lines

Chapter 11
WELFARE AND INEQUALITY DOMINANCE

11.1 Ethical welfare judgments

As for poverty, we may wish to determine if the ranking of two distributions of income in terms of social welfare is robust to the choice of social welfare indices. Of course, one way to check such robustness would be to verify the welfare ranking of the two distributions for a large number of the many social welfare indices that have been proposed in the literature. This, however, would certainly be a tedious task. Besides, new social welfare indices can always be designed.

A simpler and more powerful alternative is to apply tests of welfare dominance. Unlike for poverty, welfare dominance tests take into account the whole distributions of income, as opposed to just the censored distributions used for poverty comparisons.

As for poverty dominance, there are two testing approaches, a primal (income-censoring) and a dual (percentile-truncating) one. The primal approach has the advantage of being applicable to any desired (however large) order of dominance, and uses curves of the well-known FGT indices for an infinite range of "poverty lines" or income censoring points. The dual approach is practically convenient only for first and second order dominance, but it uses curves that are graphically instructive and that have been documented extensively in the literature. As for poverty dominance, if, for first and second order dominance, a welfare ranking is obtained using one of these two testing approaches, the same ranking will be obtained using the other approach. In other words, the two approaches are equivalent in terms of their ability to rank distributions robustly over classes of first- and second-order social welfare indices.

As for poverty dominance, for both of these approaches we will make use of classes of social welfare indices defined by the reactions of indices to changes in or reallocations of income. These social welfare indices do not need to be additive, but for expositional convenience assume that they are defined in the simple rank-dependent utilitarian format of W in (4.28):

Image

The first-order class of social welfare indices regroups all of the symmetric (or anonymous) social welfare indices that are increasing in income. In terms of (11.1), this can be formulated as the class Ω1 with

Image

The second-order class of social welfare indices regroups all of the first-order indices that are increasing in mean-preserving equalizing transfers. Recall that such transfers redistribute one dollar of income from a richer to a poorer person. These indices thus obey the Pigou-Dalton principle of transfers. Using (11.1) again, this suggests the class of Ω2 indices:

Image

The third-order class of social welfare indices includes all of the second-order indices that further obey the transfer-sensitivity principle — requiring that equalizing transfers have a greater impact on social welfare when they occur lower down in the distribution of income. Expressed in terms of (11.1), this requirement forces ω(p) to be a constant and requires the concavity of individual utility functions to be decreasing in income. This suggests Ω3:

Image

As hinted above on page 162, higher orders of classes can be defined analogously. Generally speaking, membership in a higher-order class of social welfare indices requires these indices to be more sensitive to the income of the very poor. Membership in Ωs implies membership in Ωs-1, and for s-order additive welfare indices, we also need that (-1)(i)U(i)(Q(p)) ≤ 0 for i=1,..., s.

11.2 Tests of welfare dominance

As for poverty dominance, both primal and dual conditions can be used for testing first- and second-order welfare dominance. The two types of tests order social welfare on exactly the same class of indices.

First-order welfare dominance

The following conditions are equivalent:

1 Social we!lfare is larger in B than in A for any of the social welfare indices that obey the Pareto (p.159), the anonymity (p.160) and the population invariance principles (p.160);

2 WB - WA ≥ 0 for all W ε Ω1;

3 PA(z;α = 0) ≥ PB(z; α = 0) for all z between 0 and infinity;

4 Image for all z between 0 and infinity;

5 QA(p) ≤ QB(p) for all p between 0 and 11.

First-order welfare dominance can thus be checked by verifying whether the headcount index is higher for A than for B for all poverty lines z. There is therefore a useful analogue between tests of poverty and welfare dominance. Ordering two distributions of incomes over the first-order class of social welfare indices can also be done by comparing the incomes of the two distributions over the entire range of percentiles. Graphically, it requires checking that "Pen's parade of dwarfs and giants" be everywhere higher in B than in A, whatever the percentiles being compared. The two distributions "parade" simultaneously alongside each other, and the distributive analyst observes if one parade dominates everywhere the other.

A similar result can be stated for second-order welfare dominance. To see this, first recall the definition of the Generalized Lorenz curve GL(p) (see (4.44) on page 65):

Image

The Generalized Lorenz curve sums all incomes up to quantile Q(p), and is therefore the cumulative Pen's parade. We then obtain:

Second-order welfare dominance

The following conditions are equivalent:

1 Social welfare is larger in B than in A for any of the social welfare indices that obey the Pareto (p.159), the anonymity (p.160), the population invariance (p.160) and the Pigou-Dalton (p.161) principles;

2 WB - WA ≥ 0 for all W ε Ω2;

3 PA(z;α = 1) ≥ PB(z; α = 1) for all z between 0 and infinity;

 

1DAD: Curves|Quantile.

4 Image for all z between 0 and infinity;

5 GLA(p) ≤ GLB(p) for all p between 0 and 1 2.

An exactly similar result applies for higher-order welfare dominance. As for poverty dominance, the dual conditions are less convenient and are omitted here.

Higher-order welfare dominance

The following conditions are equivalent:

1 WB - WA ≥ 0 for all W ε Ωs;

2 P