Statistical inference for inequality measures based on semi-parametric estimators

Thesis (PhD)--Stellenbosch University, 2011. === ENGLISH ABSTRACT: Measures of inequality, also used as measures of concentration or diversity, are very popular in economics and especially in measuring the inequality in income or wealth within a population and between populations. However, they ha...

Bibliographic Details
Main Author:	Kpanzou, Tchilabalo Abozou
Other Authors:	De Wet, Tertius
Format:	Others
Language:	en_ZA
Published:	Stellenbosch : Stellenbosch University 2011
Subjects:	Extreme value theory; Semi-parametric estimation; Confidence intervals; Theses > Statistics and actuarial science; Dissertations > Statistics and actuarial science
Online Access:	http://hdl.handle.net/10019.1/17882

id	ndltd-netd.ac.za-oai-union.ndltd.org-sun-oai-scholar.sun.ac.za-10019.1-17882
record_format	oai_dc
collection	NDLTD
language	en_ZA
format	Others
sources	NDLTD
topic	Extreme value theory Semi-parametric estimation Confidence intervals Theses -- Statistics and actuarial science Dissertations -- Statistics and actuarial science
spellingShingle	Extreme value theory Semi-parametric estimation Confidence intervals Theses -- Statistics and actuarial science Dissertations -- Statistics and actuarial science Kpanzou, Tchilabalo Abozou Statistical inference for inequality measures based on semi-parametric estimators
description	Thesis (PhD)--Stellenbosch University, 2011. === ENGLISH ABSTRACT: Measures of inequality, also used as measures of concentration or diversity, are very popular in economics, especially for measuring inequality in income or wealth within a population and between populations. However, they have applications in many other fields, e.g. in ecology, linguistics, sociology, demography, epidemiology and information science. A large number of measures have been proposed to measure inequality. Examples include the Gini index, the generalized entropy, the Atkinson and the quintile share ratio measures. Inequality measures are inherently dependent on the tails of the population (underlying distribution), and their estimators are therefore typically sensitive to data from these tails (nonrobust). For example, income distributions often exhibit a long tail to the right, leading to the frequent occurrence of large values in samples. Since the usual estimators are based on the empirical distribution function, they are usually nonrobust to such large values. Furthermore, heavy-tailed distributions often occur in real life data sets, so remedial action needs to be taken in such cases. The remedial action can be either a trimming of the extreme data or a modification of the (traditional) estimator to make it more robust to extreme observations. In this thesis we follow the second option, modifying the traditional empirical distribution function as estimator to make it more robust. Using results from extreme value theory, we develop more reliable distribution estimators in a semi-parametric setting. These new estimators of the distribution then form the basis for more robust estimators of the measures of inequality. These estimators are developed for the four most popular classes of measures, viz. Gini, generalized entropy, Atkinson and quintile share ratio. Properties of these estimators are studied, mainly via simulation. Using limiting distribution theory and the bootstrap methodology, approximate confidence intervals are derived. Through the various simulation studies, the proposed estimators are compared to the standard ones in terms of mean squared error, relative impact of contamination, confidence interval length and coverage probability. In these studies the semi-parametric methods show a clear improvement over the standard ones. The theoretical properties of the quintile share ratio have not been studied much. Consequently, we also derive its influence function as well as the limiting normal distribution of its nonparametric estimator. These results have not previously been published. In order to illustrate the methods developed, we apply them to a number of real life data sets. Using such data sets, we show how the methods can be used in practice for inference. In order to choose between the candidate parametric distributions, we use a measure of sample representativeness from the literature. These illustrations show that the proposed methods can be used to reach satisfactory conclusions in real life problems.
=== AFRIKAANSE OPSOMMING (English translation): Measures of inequality, which are also used as measures of concentration or diversity, are very popular in economics, especially for quantifying inequality in income or wealth within a population and between populations. They also have applications in many other disciplines, for example ecology, linguistics, sociology, demography, epidemiology and information science. Several measures of inequality already exist. Examples include the Gini index, the generalized entropy measure, the Atkinson measure and the quintile share ratio. Measures of inequality are inherently dependent on the tails of the population (underlying distribution), and their estimators are therefore typically sensitive to data from such tails (nonrobust). Income distributions, for example, often have long right tails, which can lead to the occurrence of large values in samples. The traditional estimators are based on the empirical distribution function and are therefore usually nonrobust to such large values. Since heavy-tailed distributions often occur in real data, corrections must be made in such cases. These corrections can consist of either trimming the extreme data or adapting the traditional estimators to make them more robust against extreme values. In this thesis the second option is followed: the traditional empirical distribution function as estimator is adapted to make it more robust. Using results from extreme value theory, more reliable estimators of distributions are developed in a semi-parametric setting. These new estimators of the distribution then form the basis for more robust estimators of measures of inequality. These estimators are developed for the four most popular classes of measures, namely Gini, generalized entropy, Atkinson and quintile share ratio. Properties of these estimators are studied, especially by means of simulation studies. Approximate confidence intervals are developed using limiting distribution theory and the bootstrap methodology. The proposed estimators are compared with traditional estimators by means of various simulation studies. The comparison is made in terms of mean squared error, relative impact of contamination, confidence interval length and coverage probability. In these studies the semi-parametric methods show a clear improvement over the traditional methods. The theoretical properties of the quintile share ratio have not yet received much attention in the literature. Consequently, we derive its influence function as well as the asymptotic distribution of its nonparametric estimator. In order to illustrate the methods developed, they are applied to a number of real data sets. These applications show how the methods can be used for inference in practice. A method from the literature for sample representativeness is proposed and used to choose between the candidate parametric distributions. These examples show that the proposed methods can be used fruitfully to reach satisfactory conclusions in practice.
author2	De Wet, Tertius
author_facet	De Wet, Tertius Kpanzou, Tchilabalo Abozou
author	Kpanzou, Tchilabalo Abozou
author_sort	Kpanzou, Tchilabalo Abozou
title	Statistical inference for inequality measures based on semi-parametric estimators
title_short	Statistical inference for inequality measures based on semi-parametric estimators
title_full	Statistical inference for inequality measures based on semi-parametric estimators
title_fullStr	Statistical inference for inequality measures based on semi-parametric estimators
title_full_unstemmed	Statistical inference for inequality measures based on semi-parametric estimators
title_sort	statistical inference for inequality measures based on semi-parametric estimators
publisher	Stellenbosch : Stellenbosch University
publishDate	2011
url	http://hdl.handle.net/10019.1/17882
work_keys_str_mv	AT kpanzoutchilabaloabozou statisticalinferenceforinequalitymeasuresbasedonsemiparametricestimators
_version_	1718163151400206336
spelling	ndltd-netd.ac.za-oai-union.ndltd.org-sun-oai-scholar.sun.ac.za-10019.1-17882 2016-01-29T04:02:32Z Statistical inference for inequality measures based on semi-parametric estimators Kpanzou, Tchilabalo Abozou De Wet, Tertius Neethling, Ariane Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science. Extreme value theory Semi-parametric estimation Confidence intervals Theses -- Statistics and actuarial science Dissertations -- Statistics and actuarial science Thesis (PhD)--Stellenbosch University, 2011. 2011-10-18T08:21:46Z 2011-12-05T13:07:50Z 2011-10-18T08:21:46Z 2011-12-05T13:07:50Z 2011-12 Thesis http://hdl.handle.net/10019.1/17882 en_ZA Stellenbosch University xxv, 241 p. : ill. Stellenbosch : Stellenbosch University
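
The abstract refers to the "standard" estimators based on the empirical distribution function and to bootstrap confidence intervals. As a rough illustration only, the following Python sketch computes the usual plug-in Gini index, a simplified quintile share ratio, and a percentile bootstrap interval. It does not reproduce the thesis's semi-parametric estimators; the function names, the crude `n // 5` quintile cut-off, the percentile interval and the simulated Pareto sample are all assumptions made for this example.

```python
import random


def gini(values):
    """Plug-in (empirical) Gini index of a sample of non-negative values."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need a non-empty sample with a positive total")
    # G = 2 * sum_i(i * x_(i)) / (n * sum_i x_(i)) - (n + 1) / n, with i = 1..n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def quintile_share_ratio(values):
    """Total held by the top 20% of observations divided by the bottom 20% total.

    Simplification (assumption): each quintile is the n // 5 smallest or
    largest observations, with no interpolation at the quantile boundaries.
    """
    xs = sorted(values)
    n = len(xs)
    k = n // 5
    if k == 0:
        raise ValueError("need at least 5 observations")
    bottom = sum(xs[:k])
    top = sum(xs[n - k:])
    if bottom <= 0:
        raise ValueError("bottom quintile total must be positive")
    return top / bottom


def bootstrap_ci(values, statistic, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for statistic(values)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement n_boot times and sort the replicated statistics.
    stats = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    alpha = 1.0 - level
    lower = stats[int((alpha / 2.0) * n_boot)]
    upper = stats[min(n_boot - 1, int((1.0 - alpha / 2.0) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical heavy-tailed "income" sample drawn from a Pareto distribution.
    rng = random.Random(1)
    incomes = [rng.paretovariate(2.5) for _ in range(500)]
    print("Gini:", gini(incomes), bootstrap_ci(incomes, gini))
    print("QSR:", quintile_share_ratio(incomes),
          bootstrap_ci(incomes, quintile_share_ratio))
```

Because both statistics depend heavily on the largest observations, rerunning the sketch with a heavier tail (a smaller Pareto shape parameter) typically widens the bootstrap intervals, which is the sensitivity the abstract describes.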
