Commit bb24ba7

committed
misc
1 parent 44c6902 commit bb24ba7

File tree

1 file changed

+73
-62
lines changed

lectures/inequality.md

+73-62
@@ -25,8 +25,8 @@ In this section we
2525
Many historians argue that inequality played a key role in the fall of the
2626
Roman Republic.
2727

28-
After defeating Carthage and invading Spain, money flowed into Rome and
29-
greatly enriched those in power.
28+
Following the defeat of Carthage and the invasion of Spain, money flowed into
29+
Rome from across the empire, greatly enriching those in power.
3030

3131
Meanwhile, ordinary citizens were taken from their farms to fight for long
3232
periods, diminishing their wealth.
@@ -40,26 +40,23 @@ with Octavian (Augustus) in 27 BCE.
4040
This history is fascinating in its own right, and we can see some
4141
parallels with certain countries in the modern world.
4242

43-
Many recent political debates revolve around inequality.
43+
Let's now look at inequality in some of these countries.
4444

45-
Many economic policies, from taxation to the welfare state, are
46-
aimed at addressing inequality.
4745

4846
### Measurement
4947

48+
49+
Political debates often revolve around inequality.
50+
5051
One problem with these debates is that inequality is often poorly defined.
5152

5253
Moreover, debates on inequality are often tied to political beliefs.
5354

54-
This is dangerous for economists because allowing political beliefs to
55-
shape our findings reduces objectivity.
56-
57-
To bring a truly scientific perspective to the topic of inequality we must
58-
start with careful definitions.
55+
This is dangerous for economists because allowing political beliefs to shape our findings reduces objectivity.
5956

60-
In this lecture we discuss standard measures of inequality used in economic research.
57+
To bring a truly scientific perspective to the topic of inequality we must start with careful definitions.
6158

62-
For each of these measures, we will look at both simulated and real data.
59+
Hence we begin by discussing ways that inequality can be measured in economic research.
6360

6461
We will need to install the following packages
6562

@@ -91,7 +88,7 @@ In this section we define the Lorenz curve and examine its properties.
9188

9289
The Lorenz curve takes a sample $w_1, \ldots, w_n$ and produces a curve $L$.
9390

94-
We suppose that the sample $w_1, \ldots, w_n$ has been sorted from smallest to largest.
91+
We suppose that the sample has been sorted from smallest to largest.
9592

9693
To aid our interpretation, suppose that we are measuring wealth
9794

@@ -224,10 +221,10 @@ plt.show()
224221

225222
### Lorenz curves for US data
226223

227-
Next let's look at data, focusing on income and wealth in the US in 2016.
224+
Next let's look at US data for both income and wealth.
228225

229226
(data:survey-consumer-finance)=
230-
The following code block imports a subset of the dataset `SCF_plus`,
227+
The following code block imports a subset of the dataset `SCF_plus` for 2016,
231228
which is derived from the [Survey of Consumer Finances](https://en.wikipedia.org/wiki/Survey_of_Consumer_Finances) (SCF).
232229

233230
```{code-cell} ipython3
@@ -240,7 +237,7 @@ df_income_wealth = df.dropna()
240237
df_income_wealth.head(n=5)
241238
```
242239

243-
The following code block uses data stored in dataframe `df_income_wealth` to generate the Lorenz curves.
240+
The next code block uses data stored in dataframe `df_income_wealth` to generate the Lorenz curves.
244241

245242
(The code is somewhat complex because we need to adjust the data according to
246243
population weights supplied by the SCF.)
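
As a rough illustration of what adjusting for population weights involves, here is a minimal sketch of a weighted Lorenz curve. It is not the lecture's own code: the function name `weighted_lorenz` and its arguments are hypothetical, and it assumes each observation comes with a nonnegative survey weight saying how many households it represents.

```python
import numpy as np

def weighted_lorenz(values, weights):
    """Return cumulative population shares f and cumulative value shares s.

    Hypothetical helper for illustration; `weights` are assumed to be
    nonnegative survey weights, one per observation.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Sort observations from smallest to largest value
    order = np.argsort(values)
    v, wt = values[order], weights[order]

    # Share of the (weighted) population up to each observation
    cum_pop = np.cumsum(wt) / wt.sum()
    # Share of the (weighted) total held by that part of the population
    cum_val = np.cumsum(v * wt) / np.sum(v * wt)

    # Prepend the origin so the curve starts at (0, 0)
    f = np.concatenate(([0.0], cum_pop))
    s = np.concatenate(([0.0], cum_val))
    return f, s
```

With all weights equal, this reduces to the unweighted Lorenz curve of the sorted sample.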
@@ -289,6 +286,10 @@ l_vals_nw, l_vals_ti, l_vals_li = L_vals
289286
Now we plot Lorenz curves for net wealth, total income and labor income in the
290287
US in 2016.
291288

289+
Total income is the sum of all of a household's income sources, including labor income but excluding capital gains.
290+
291+
(All income measures are pre-tax.)
292+
292293
```{code-cell} ipython3
293294
---
294295
mystnb:
@@ -309,31 +310,26 @@ ax.legend()
309310
plt.show()
310311
```
311312

312-
Here all the income and wealth measures are pre-tax.
313313

314-
Total income is the sum of households' all income sources, including labor income but excluding capital gains.
314+
One key finding from this figure is that wealth inequality is more extreme than income inequality.
315+
315316

316-
One key finding from this figure is that wealth inequality is significantly
317-
more extreme than income inequality.
318317

319-
We will take a look at this trend over time {ref}`in a later section<compare-income-wealth-usa-over-time>`.
320318

321319
## The Gini coefficient
322320

323321
The Lorenz curve is a useful visual representation of inequality in a distribution.
324322

325-
Another popular measure of income and wealth inequality is the Gini coefficient.
326-
327-
The Gini coefficient is just a number, rather than a curve.
323+
Another way to study income and wealth inequality is via the Gini coefficient.
328324

329325
In this section we discuss the Gini coefficient and its relationship to the
330326
Lorenz curve.
331327

332328

329+
333330
### Definition
334331

335-
As before, suppose that the sample $w_1, \ldots, w_n$ has been sorted from
336-
smallest to largest.
332+
As before, suppose that the sample $w_1, \ldots, w_n$ has been sorted from smallest to largest.
337333

338334
The Gini coefficient is defined for the sample above as
339335

@@ -377,8 +373,14 @@ ax.legend()
377373
plt.show()
378374
```
379375

380-
Another way to think of the Gini coefficient is as a ratio of the area between the 45-degree line of
381-
perfect equality and the Lorenz curve (A) divided by the total area below the 45-degree line (A+B) as shown in {numref}`lorenz_gini2`.
376+
In fact the Gini coefficient can also be expressed as
377+
378+
$$
379+
G = \frac{A}{A+B}
380+
$$
381+
382+
where $A$ is the area between the 45-degree line of
383+
perfect equality and the Lorenz curve, while $B$ is the area below the Lorenz curve -- see {numref}`lorenz_gini2`.
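
The area interpretation can be checked numerically. The sketch below is an illustration rather than the lecture's implementation: the helper names `gini_from_areas` and `gini_pairwise` are hypothetical, the Lorenz curve is taken to be piecewise linear through the sample points, and the pairwise formula is the standard mean-absolute-difference form of the Gini coefficient, assumed here for comparison.

```python
import numpy as np

def gini_from_areas(w):
    """Gini coefficient as A / (A + B), using a piecewise-linear Lorenz curve."""
    w = np.sort(np.asarray(w, dtype=float))
    n = len(w)
    # Lorenz curve points, starting from the origin
    f = np.arange(n + 1) / n
    s = np.concatenate(([0.0], np.cumsum(w) / w.sum()))
    # B: area under the Lorenz curve (trapezoid rule)
    b = np.sum((f[1:] - f[:-1]) * (s[1:] + s[:-1]) / 2)
    # A: area between the 45-degree line and the Lorenz curve
    a = 0.5 - b
    return a / (a + b)

def gini_pairwise(w):
    """Standard pairwise-difference formula (assumed here for comparison)."""
    w = np.asarray(w, dtype=float)
    n = len(w)
    diffs = np.abs(w[:, None] - w[None, :])
    return diffs.sum() / (2 * n * w.sum())

sample = np.random.default_rng(0).lognormal(size=500)
print(gini_from_areas(sample), gini_pairwise(sample))
```

Under these assumptions the two numbers agree up to floating point error, since $A + B = 1/2$ and the trapezoid rule is exact for a piecewise-linear curve.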
382384

383385
```{code-cell} ipython3
384386
---
@@ -403,11 +405,7 @@ ax.legend()
403405
plt.show()
404406
```
405407

406-
$$
407-
G = \frac{A}{A+B}
408-
$$
409408

410-
It is an average measure of deviation from the line of equality.
411409

412410
```{seealso}
413411
The World in Data project has a [nice graphical exploration of the Lorenz curve and the Gini coefficient](https://ourworldindata.org/what-is-the-gini-coefficient)
@@ -417,7 +415,7 @@ The World in Data project has a [nice graphical exploration of the Lorenz curve
417415

418416
Let's examine the Gini coefficient in some simulations.
419417

420-
First the code below enables us to compute the Gini coefficient.
418+
The code below computes the Gini coefficient from a sample.
421419

422420
```{code-cell} ipython3
423421
@@ -521,9 +519,9 @@ wb.search("gini")
521519

522520
We now know the series ID is `SI.POV.GINI`.
523521

524-
Another, and often useful way to find series ID, is to use the [World Bank data portal](https://data.worldbank.org) and then use `wbgapi` to fetch the data.
522+
(Another way to find the series ID is to use the [World Bank data portal](https://data.worldbank.org) and then use `wbgapi` to fetch the data.)
525523

526-
Using `pandas` we can take a quick look across all countries and all years in the World Bank dataset.
524+
To get a quick overview, let's plot a histogram of Gini coefficients across all countries and all years in the World Bank dataset.
527525

528526
```{code-cell} ipython3
529527
---
@@ -547,8 +545,7 @@ ax.set_ylabel("frequency")
547545
plt.show()
548546
```
549547

550-
We can see in {numref}`gini_histogram` that across 50 years of data and all countries
551-
the measure only varies between 20 and 65.
548+
We can see in {numref}`gini_histogram` that across 50 years of data and all countries the measure varies between 20 and 65.
552549

553550
Let us fetch the data `DataFrame` for the USA.
554551

@@ -559,7 +556,8 @@ data.head(n=5)
559556
data.columns = data.columns.map(lambda x: int(x.replace('YR','')))
560557
```
561558

562-
**Note:** This package often returns data with year information contained in the columns. This is not always convenient for simple plotting with pandas so it can be useful to transpose the results before plotting
559+
(This package often returns data with year information contained in the columns. This is not always convenient for simple plotting with pandas so it can be useful to transpose the results before plotting.)
560+
563561

564562
```{code-cell} ipython3
565563
data = data.T # Obtain years as rows
@@ -583,10 +581,8 @@ ax.set_xlabel("year")
583581
plt.show()
584582
```
585583

586-
As can be seen in {numref}`gini_usa1` the Gini coefficient:
587-
588-
1. trended upward from 1980 to 2020 and then dropped slightly following at the start of the COVID pandemic
589-
2. moves slowly over time
584+
As can be seen in {numref}`gini_usa1`, the income Gini
585+
trended upward from 1980 to 2020 and then dropped at the start of the COVID pandemic.
590586

591587
(compare-income-wealth-usa-over-time)=
592588
### Gini coefficient for wealth (US data)
@@ -595,10 +591,9 @@ In the previous section we looked at the Gini coefficient for income using US da
595591

596592
Now let's look at the Gini coefficient for the distribution of wealth.
597593

598-
We can use the data collected above {ref}`survey of consumer finances <data:survey-consumer-finance>` to look at the Gini coefficient
594+
We can use the {ref}`Survey of Consumer Finances data <data:survey-consumer-finance>` to look at the Gini coefficient
599595
computed over the wealth distribution.
600596

601-
The Gini coefficient for net wealth and labour income is computed over many years.
602597

603598
```{code-cell} ipython3
604599
df_income_wealth.year.describe()
@@ -668,13 +663,24 @@ ax.set_ylabel("Gini coefficient")
668663
plt.show()
669664
```
670665

671-
The wealth time series exhibits a strong U-shape.
666+
The time series for the wealth Gini exhibits a U-shape, falling until the early
667+
1980s and then increasing rapidly.
668+
669+
670+
One possibility is that this change is mainly driven by technology.
671+
672+
However, we will see below that not all advanced economies experienced similar growth of inequality.
673+
674+
675+
676+
672677

673678
### Cross-country comparisons of income inequality
674679

675680
Earlier in this lecture we used `wbgapi` to get Gini data across many countries and saved it in a variable called `gini_all`
676681

677-
In this section we will compare a few Western economies and look at the evolution in their respective Gini coefficients
682+
In this section we will use this data to compare several advanced economies, and
683+
to look at the evolution in their respective income Ginis.
678684

679685
```{code-cell} ipython3
680686
data = gini_all.unstack()
@@ -683,7 +689,7 @@ data.columns
683689

684690
There are 167 countries represented in this dataset.
685691

686-
Let us compare three Western economies: USA, United Kingdom, and Norway
692+
Let us compare three advanced economies: the US, the UK, and Norway
687693

688694
```{code-cell} ipython3
689695
---
@@ -699,7 +705,9 @@ ax.legend(title="")
699705
plt.show()
700706
```
701707

702-
We see that Norway has a shorter time series so let us take a closer look at the underlying data
708+
We see that Norway has a shorter time series.
709+
710+
Let us take a closer look at the underlying data and see if we can rectify this.
703711

704712
```{code-cell} ipython3
705713
data[['NOR']].dropna().head(n=5)
@@ -724,15 +732,19 @@ ax.legend(title="")
724732
plt.show()
725733
```
726734

727-
From this plot we can observe that the USA has a higher Gini coefficient (i.e. higher income inequality) when compared to the UK and Norway.
735+
From this plot we can observe that the US has a higher Gini coefficient (i.e.
736+
higher income inequality) when compared to the UK and Norway.
737+
738+
Norway has the lowest Gini coefficient among the three economies and, moreover,
739+
the Gini coefficient shows no upward trend.
740+
728741

729-
Norway has the lowest Gini coefficient over the three economies and is substantially lower than the US.
730742

731743
### Gini Coefficient and GDP per capita (over time)
732744

733745
We can also look at how the Gini coefficient compares with GDP per capita (over time).
734746

735-
Let's take another look at the USA, Norway, and the United Kingdom.
747+
Let's take another look at the US, Norway, and the UK.
736748

737749
```{code-cell} ipython3
738750
countries = ['USA', 'NOR', 'GBR']
@@ -742,15 +754,15 @@ gdppc.columns = gdppc.columns.map(lambda x: int(x.replace('YR','')))
742754
gdppc = gdppc.T
743755
```
744756

745-
We can rearrange the data so that we can plot gdp per capita and the Gini coefficient across years
757+
We can rearrange the data so that we can plot GDP per capita and the Gini coefficient across years
746758

747759
```{code-cell} ipython3
748760
plot_data = pd.DataFrame(data[countries].unstack())
749761
plot_data.index.names = ['country', 'year']
750762
plot_data.columns = ['gini']
751763
```
752764

753-
Now we can get the gdp per capita data into a shape that can be merged with `plot_data`
765+
Now we can get the GDP per capita data into a shape that can be merged with `plot_data`
754766

755767
```{code-cell} ipython3
756768
pgdppc = pd.DataFrame(gdppc.unstack())
@@ -760,15 +772,14 @@ plot_data = plot_data.merge(pgdppc, left_index=True, right_index=True)
760772
plot_data.reset_index(inplace=True)
761773
```
762774

763-
Now using plotly to build a plot with gdp per capita on the y-axis and the Gini coefficient on the x-axis.
775+
Now we use Plotly to build a plot with GDP per capita on the y-axis and the Gini coefficient on the x-axis.
764776

765777
```{code-cell} ipython3
766778
min_year = plot_data.year.min()
767779
max_year = plot_data.year.max()
768780
```
769781

770-
771-
**Note:** The time series for all three countries start and stop in different years. We will add a year mask to the data to
782+
The time series for all three countries start and stop in different years. We will add a year mask to the data to
772783
improve clarity in the chart, including the different end years associated with each country's time series.
773784

774785
```{code-cell} ipython3
@@ -796,24 +807,24 @@ fig.show()
796807
This figure is built using `plotly` and is {ref}` available on the website <fig:plotly-gini-gdppc-years>`
797808
```
798809

799-
This plot shows that all three Western economies GDP per capita has grown over time with some fluctuations
800-
in the Gini coefficient.
810+
This plot shows that in all three Western economies GDP per capita has grown over
811+
time with some fluctuations in the Gini coefficient.
801812

802-
From the early 80's the United Kingdom and the US economies both saw increases in income
803-
inequality.
813+
From the early 80's the United Kingdom and the US economies both saw increases
814+
in income inequality.
804815

805816
Interestingly, since the year 2000, the United Kingdom saw a decline in income inequality while
806817
the US exhibits persistent but stable levels around a Gini coefficient of 40.
807818

819+
808820
## Top shares
809821

810822
Another popular way to measure inequality is via top shares.
811823

812-
Measuring specific shares is less complex than the Lorenz curve or the Gini
813-
coefficient.
814824

815825
In this section we show how to compute top shares.
816826

827+
817828
### Definition
818829

819830
As before, suppose that the sample $w_1, \ldots, w_n$ has been sorted from smallest to largest.
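
With a sorted sample, a top share can be sketched in a few lines. This is a minimal illustration under one common convention (the share of the total held by the largest fraction $p$ of observations); the function name `top_share` and the rounding rule for the cutoff index are assumptions, not taken from the lecture.

```python
import numpy as np

def top_share(w, p=0.1):
    """Share of the total held by the top fraction p of the sample.

    Hypothetical helper; the cutoff index uses floor(n * (1 - p)),
    which is one of several reasonable conventions.
    """
    w = np.sort(np.asarray(w, dtype=float))
    n = len(w)
    # Index of the first observation that belongs to the top group
    k = int(np.floor(n * (1 - p)))
    return w[k:].sum() / w.sum()

sample = np.random.default_rng(1).lognormal(size=1_000)
print(top_share(sample, p=0.1))
```

For a perfectly equal sample the top 10% hold 10% of the total, so values far above $p$ indicate concentration at the top.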
