Commit 5fb67a2

DOC: rewrite missing value includes to not require replacement
1 parent d5236ab commit 5fb67a2

4 files changed (+26 -18 lines changed)

doc/source/getting_started/comparison/comparison_with_sas.rst

+2-2
@@ -422,6 +422,8 @@ input frames.
 Missing data
 ------------

+Both pandas and SAS have a representation for missing data.
+
 .. include:: includes/missing_intro.rst

 One difference is that missing data cannot be compared to its sentinel value.
@@ -441,8 +443,6 @@ For example, in SAS you could do this to filter missing values.

 .. include:: includes/missing.rst

-.. |program| replace:: SAS
-

 GroupBy
 -------

doc/source/getting_started/comparison/comparison_with_stata.rst

+2-2
@@ -413,6 +413,8 @@ or the intersection of the two by using the values created in the
 Missing data
 ------------

+Both pandas and Stata have a representation for missing data.
+
 .. include:: includes/missing_intro.rst

 One difference is that missing data cannot be compared to its sentinel value.
@@ -427,8 +429,6 @@ For example, in Stata you could do this to filter missing values.

 .. include:: includes/missing.rst

-.. |program| replace:: Stata
-

 GroupBy
 -------

@@ -1,23 +1,31 @@
-This doesn't work in pandas. Instead, the :func:`pd.isna` or :func:`pd.notna` functions
-should be used for comparisons.
+In pandas, :meth:`Series.isna` and :meth:`Series.notna` can be used to filter the rows.

 .. ipython:: python

-   outer_join[pd.isna(outer_join["value_x"])]
-   outer_join[pd.notna(outer_join["value_x"])]
+   outer_join[outer_join["value_x"].isna()]
+   outer_join[outer_join["value_x"].notna()]

-pandas also provides a variety of methods to work with missing data -- some of which would be
-challenging to express in |program|. For example, there are methods to drop all rows with any
-missing values, replacing missing values with a specified value, like the mean, or forward filling
-from previous rows. See the :ref:`missing data documentation<missing_data>` for more.
+pandas provides :ref:`a variety of methods to work with missing data <missing_data>`. Here are some examples:
+
+Drop rows with missing values
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 .. ipython:: python

-   # Drop rows with any missing value
    outer_join.dropna()

-   # Fill forwards
+Forward fill from previous rows
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. ipython:: python
+
    outer_join.fillna(method="ffill")

-   # Impute missing values with the mean
+Replace missing values with a specified value
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Using the mean:
+
+.. ipython:: python
+
    outer_join["value_x"].fillna(outer_join["value_x"].mean())
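
For anyone wanting to try the rewritten examples outside the docs build, here is a minimal standalone sketch. The ``outer_join`` frame below is a hypothetical stand-in with made-up values, not the frame built earlier in the comparison pages, and it uses ``DataFrame.ffill()`` in place of ``fillna(method="ffill")``, since the ``method`` argument is deprecated in recent pandas releases.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the docs' outer_join frame (values are invented).
outer_join = pd.DataFrame(
    {
        "key": ["A", "B", "C", "D"],
        "value_x": [1.0, np.nan, 3.0, np.nan],
        "value_y": [10.0, 20.0, np.nan, 40.0],
    }
)

# Filter rows on whether value_x is missing.
missing_x = outer_join[outer_join["value_x"].isna()]
present_x = outer_join[outer_join["value_x"].notna()]

# Drop every row that has at least one missing value.
complete_rows = outer_join.dropna()

# Forward fill: carry the last seen value down each column.
forward_filled = outer_join.ffill()

# Replace missing value_x entries with the column mean (NaN is skipped by mean()).
mean_filled = outer_join["value_x"].fillna(outer_join["value_x"].mean())
```

Each result is a new object; ``outer_join`` itself is left unchanged by these calls.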

doc/source/getting_started/comparison/includes/missing_intro.rst

+3-3
@@ -1,6 +1,6 @@
-Both pandas and |program| have a representation for missing data -- pandas' is the special float
-value ``NaN`` (not a number). Many of the semantics are the same; for example missing data
-propagates through numeric operations, and is ignored by default for aggregations.
+pandas represents missing data with the special float value ``NaN`` (not a number). Many of the
+semantics are the same; for example missing data propagates through numeric operations, and is
+ignored by default for aggregations.

 .. ipython:: python

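
The semantics described in the intro text (propagation through numeric operations, aggregations skipping missing values by default, and the fact that missing data cannot be matched by comparing against the sentinel) can be checked with a small sketch. The ``values`` Series is a hypothetical example, not data from the docs.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing entry.
values = pd.Series([1.0, np.nan, 3.0])

# NaN propagates through arithmetic: the middle element stays NaN.
propagated = values + 1

# Aggregations skip NaN by default; skipna=False makes the result NaN instead.
total = values.sum()
total_strict = values.sum(skipna=False)

# Comparing against NaN never matches, so use isna() to find missing entries.
equal_to_nan = values == np.nan
is_missing = values.isna()
```

Here ``equal_to_nan`` is ``False`` everywhere, while ``is_missing`` flags the middle element, which is why the docs filter with ``isna`` and ``notna`` rather than an equality test.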
