Summary: | Thesis (PhD)--Stellenbosch University, 2011. === ENGLISH ABSTRACT: Measures of inequality, also used as measures of concentration or diversity, are very popular in economics,
especially for measuring inequality in income or wealth within a population and between
populations. However, they have applications in many other fields, e.g. in ecology, linguistics, sociology,
demography, epidemiology and information science.
Many measures of inequality have been proposed. Examples include the Gini index, the generalized
entropy measure, the Atkinson measure and the quintile share ratio. Inequality measures
are inherently dependent on the tails of the population (underlying distribution) and therefore their
estimators are typically sensitive to data from these tails (nonrobust). For example, income distributions
often exhibit a long tail to the right, leading to the frequent occurrence of large values in samples. Since
the usual estimators are based on the empirical distribution function, they are usually nonrobust to such
large values. Furthermore, because heavy-tailed distributions often occur in real-life data sets,
remedial action is needed in such cases.
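The sensitivity described above can be illustrated with a minimal sketch of the usual plug-in (empirical) estimators of the four measures named in this abstract. The function names, the simplified quintile share ratio (sum of the top fifth of the ordered sample over the sum of the bottom fifth) and the toy income values are assumptions made for illustration only; they are not taken from the thesis.

```python
import math


def gini(x):
    """Plug-in Gini index computed from the order statistics."""
    xs = sorted(x)
    n = len(xs)
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(xs, start=1))
    return weighted / (n * sum(xs))


def generalized_entropy(x, alpha):
    """Generalized entropy GE(alpha); requires strictly positive values."""
    n = len(x)
    mu = sum(x) / n
    if alpha == 0:  # mean log deviation
        return -sum(math.log(v / mu) for v in x) / n
    if alpha == 1:  # Theil index
        return sum((v / mu) * math.log(v / mu) for v in x) / n
    return sum((v / mu) ** alpha - 1 for v in x) / (n * alpha * (alpha - 1))


def atkinson(x, eps):
    """Atkinson index with inequality-aversion parameter eps > 0."""
    n = len(x)
    mu = sum(x) / n
    if eps == 1:  # limit case: one minus geometric mean over arithmetic mean
        return 1 - math.exp(sum(math.log(v) for v in x) / n) / mu
    ede = (sum(v ** (1 - eps) for v in x) / n) ** (1 / (1 - eps))
    return 1 - ede / mu


def quintile_share_ratio(x):
    """Simplified QSR: total of the top fifth over total of the bottom fifth."""
    xs = sorted(x)
    k = len(xs) // 5  # assumes at least five observations
    return sum(xs[-k:]) / sum(xs[:k])


# Hypothetical incomes, then the same sample with one extreme value.
incomes = [12, 15, 18, 20, 22, 25, 27, 30, 35, 40]
contaminated = incomes[:-1] + [4000]

for name, data in (("clean", incomes), ("contaminated", contaminated)):
    print(name, gini(data), generalized_entropy(data, 1),
          atkinson(data, 0.5), quintile_share_ratio(data))
```

Every estimate moves substantially when a single observation is replaced by an extreme value, which is the nonrobustness the abstract refers to.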
The remedial action can be either a trimming of the extreme data or a modification of the (traditional)
estimator to make it more robust to extreme observations. In this thesis we follow the second option:
we modify the traditional estimator, the empirical distribution function, to make it more robust. Using results
from extreme value theory, we develop more reliable distribution estimators in a semi-parametric
setting. These new estimators of the distribution then form the basis for more robust estimators of the
measures of inequality. These estimators are developed for the four most popular classes of measures,
viz. Gini, generalized entropy, Atkinson and quintile share ratio. Properties of these estimators
are studied mainly via simulation. Using limiting distribution theory and the bootstrap methodology,
approximate confidence intervals are derived. Through the various simulation studies, the proposed
estimators are compared to the standard ones in terms of mean squared error, relative impact of contamination,
confidence interval length and coverage probability. In these studies the semi-parametric
methods show a clear improvement over the standard ones. The theoretical properties of the quintile
share ratio have not been studied much. Consequently, we also derive its influence function as well as
the limiting normal distribution of its nonparametric estimator. These results have not previously been
published.
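As a rough illustration of the semi-parametric idea, the sketch below keeps the empirical distribution for the body of the sample and replaces the top k observations with quantiles of a Pareto tail fitted above a threshold. The Hill estimator, the choice of threshold as the (n - k)-th order statistic, the plotting positions (j - 0.5)/k and all function names are assumptions chosen for this example; the abstract does not specify the thesis's actual estimators, which may differ.

```python
import math


def gini(x):
    """Plug-in Gini index computed from the order statistics."""
    xs = sorted(x)
    n = len(xs)
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(xs, start=1))
    return weighted / (n * sum(xs))


def hill_estimator(xs_sorted, k):
    """Hill estimate of the tail index from the k largest observations."""
    t = xs_sorted[-k - 1]  # threshold: the (n - k)-th order statistic
    return sum(math.log(v / t) for v in xs_sorted[-k:]) / k


def semiparametric_sample(x, k):
    """Empirical body plus k quantiles of the fitted Pareto tail."""
    xs = sorted(x)
    t = xs[-k - 1]
    gamma = hill_estimator(xs, k)
    # Conditional Pareto quantiles above t at probabilities (j - 0.5) / k.
    tail = [t * (1 - (j - 0.5) / k) ** (-gamma) for j in range(1, k + 1)]
    return xs[:-k] + tail


def semiparametric_gini(x, k):
    """Gini index of the sample with its tail replaced by the fitted tail."""
    return gini(semiparametric_sample(x, k))


# Deterministic Pareto-type sample (tail index 0.5) with one gross outlier.
clean = [(1 - (i - 0.5) / 100) ** -0.5 for i in range(1, 101)]
dirty = clean[:-1] + [clean[-1] * 100]

print("empirical Gini:      ", gini(dirty))
print("semi-parametric Gini:", semiparametric_gini(dirty, 10))
```

In this toy sample the outlier influences the estimate only through the fitted tail index rather than entering the Gini sum directly, so its impact is dampened. The Hill estimator is itself sensitive to extreme values and to the choice of k, so this is a sketch of the general approach rather than a recommended procedure.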
To illustrate the methods developed, we apply them to a number of real-life data sets. Using
these data sets, we show how the methods can be used in practice for inference. To choose
between the candidate parametric distributions, we use a measure of sample representativeness
from the literature. These illustrations show that the proposed methods can be used to reach
satisfactory conclusions in real-life problems. === AFRIKAANSE OPSOMMING: Maatstawwe van ongelykheid, wat ook gebruik word as maatstawwe van konsentrasie of diversiteit,
is baie populêr in ekonomie en veral vir die kwantifisering van ongelykheid in inkomste of welvaart
binne ’n populasie en tussen populasies. Hulle het egter ook toepassings in baie ander dissiplines,
byvoorbeeld ekologie, linguistiek, sosiologie, demografie, epidemiologie en inligtingskunde.
Daar bestaan reeds verskeie maatstawwe vir die meet van ongelykheid. Voorbeelde sluit in die Gini
indeks, die veralgemeende entropie maatstaf, die Atkinson maatstaf en die kwintiel aandeel verhouding.
Maatstawwe van ongelykheid is inherent afhanklik van die sterte van die populasie (onderliggende
verdeling) en beramers daarvoor is tipies dus sensitief vir data uit sodanige sterte (nierobuust). Inkomste
verdelings het byvoorbeeld dikwels lang regtersterte, wat kan lei tot die voorkoms van groot
waardes in steekproewe. Die tradisionele beramers is gebaseer op die empiriese verdelingsfunksie, en
hulle is gewoonlik dus nierobuust teenoor sodanige groot waardes. Aangesien swaarstert verdelings
dikwels voorkom in werklike data, moet regstellings gemaak word in sulke gevalle.
Hierdie regstellings kan bestaan uit of die afknip van ekstreme data of die aanpassing van tradisionele
beramers om hulle meer robuust te maak teen ekstreme waardes. In hierdie tesis word die
tweede opsie gevolg deurdat die tradisionele empiriese verdelingsfunksie as beramer aangepas word
om dit meer robuust te maak. Deur gebruik te maak van resultate van ekstreemwaardeteorie, word
meer betroubare beramers vir verdelings ontwikkel in ’n semi-parametriese opset. Hierdie nuwe beramers
van die verdeling vorm dan die basis vir meer robuuste beramers van maatstawwe van ongelykheid.
Hierdie beramers word ontwikkel vir die vier mees populêre klasse van maatstawwe, naamlik
Gini, veralgemeende entropie, Atkinson en kwintiel aandeel verhouding. Eienskappe van hierdie
beramers word bestudeer, veral met behulp van simulasie studies. Benaderde vertrouensintervalle
word ontwikkel deur gebruik te maak van limietverdelingsteorie en die skoenlus metodologie. Die
voorgestelde beramers word vergelyk met tradisionele beramers deur middel van verskeie simulasie
studies. Die vergelyking word gedoen in terme van gemiddelde kwadraat fout, relatiewe impak van
kontaminasie, vertrouensinterval lengte en oordekkingswaarskynlikheid. In hierdie studies toon die
semi-parametriese metodes ’n duidelike verbetering teenoor die tradisionele metodes. Die kwintiel
aandeel verhouding se teoretiese eienskappe het nog nie veel aandag in die literatuur geniet nie.
Gevolglik lei ons die invloedfunksie asook die asimptotiese verdeling van die nie-parametriese beramer
daarvoor af.
Ten einde die metodes wat ontwikkel is te illustreer, word dit toegepas op ’n aantal werklike datastelle.
Hierdie toepassings toon hoe die metodes gebruik kan word vir inferensie in die praktyk. ’n Metode
in die literatuur vir steekproefverteenwoordiging word voorgestel en gebruik om ’n keuse tussen die
kandidaat parametriese verdelings te maak. Hierdie voorbeelde toon dat die voorgestelde metodes
met vrug gebruik kan word om bevredigende gevolgtrekkings in die praktyk te maak.
|