@@ -1775,7 +1775,9 @@ def sort_index(self, axis=0, level=None, ascending=True, inplace=False,
  New labels / index to conform to. Preferably an Index object to
  avoid duplicating data
  method : {None, 'backfill'/'bfill', 'pad'/'ffill', 'nearest'}, optional
- Method to use for filling holes in reindexed DataFrame:
+ Method to use for filling holes in reindexed DataFrame.
+ Please note: this is only applicable to DataFrames/Series with a
+ monotonically increasing/decreasing index.
  * default: don't fill gaps
  * pad / ffill: propagate last valid observation forward to next valid
  * backfill / bfill: use next valid observation to fill gap
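The monotonicity note in this hunk can be illustrated with a minimal sketch, assuming a recent pandas release. The names `s` and `unsorted` are hypothetical and do not appear in the commit; the `ValueError` is the behavior expected from current pandas, not something this diff states.

```python
import pandas as pd

# Monotonically increasing index: fill methods are accepted.
s = pd.Series([10, 20, 30], index=[0, 2, 4])
padded = s.reindex(range(6), method='ffill')      # carry last label's value forward
backfilled = s.reindex(range(6), method='bfill')  # take the next label's value

# Non-monotonic index: the same call is expected to raise ValueError.
unsorted = pd.Series([10, 20, 30], index=[2, 0, 4])
try:
    unsorted.reindex(range(6), method='ffill')
    raised = False
except ValueError:
    raised = True

print(padded.tolist())
print(backfilled.tolist())
print(raised)
```

Sorting first (for example with `sort_index()`) is one way to make a fill method usable on such data.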
@@ -1799,7 +1801,118 @@ def sort_index(self, axis=0, level=None, ascending=True, inplace=False,
  Examples
  --------
- >>> df.reindex(index=[date1, date2, date3], columns=['A', 'B', 'C'])
+
+ Create a dataframe with some fictional data.
+
+ >>> index = ['Firefox', 'Chrome', 'Safari', 'IE10', 'Konqueror']
+ >>> df = pd.DataFrame({
+ ...     'http_status': [200, 200, 404, 404, 301],
+ ...     'response_time': [0.04, 0.02, 0.07, 0.08, 1.0]},
+ ...     index=index)
+ >>> df
+            http_status  response_time
+ Firefox            200           0.04
+ Chrome             200           0.02
+ Safari             404           0.07
+ IE10               404           0.08
+ Konqueror          301           1.00
+
+ Create a new index and reindex the dataframe. By default
+ values in the new index that do not have corresponding
+ records in the dataframe are assigned ``NaN``.
+
+ >>> new_index = ['Safari', 'Iceweasel', 'Comodo Dragon', 'IE10',
+ ...              'Chrome']
+ >>> df.reindex(new_index)
+                http_status  response_time
+ Safari                 404           0.07
+ Iceweasel              NaN            NaN
+ Comodo Dragon          NaN            NaN
+ IE10                   404           0.08
+ Chrome                 200           0.02
+
+ We can fill in the missing values by passing a value to
+ the keyword ``fill_value``. Because the index is not monotonically
+ increasing or decreasing, we cannot use arguments to the keyword
+ ``method`` to fill the ``NaN`` values.
+
+ >>> df.reindex(new_index, fill_value=0)
+                http_status  response_time
+ Safari                 404           0.07
+ Iceweasel                0           0.00
+ Comodo Dragon            0           0.00
+ IE10                   404           0.08
+ Chrome                 200           0.02
+
+ >>> df.reindex(new_index, fill_value='missing')
+                http_status  response_time
+ Safari                 404           0.07
+ Iceweasel          missing        missing
+ Comodo Dragon      missing        missing
+ IE10                   404           0.08
+ Chrome                 200           0.02
+
+ To further illustrate the filling functionality in
+ ``reindex``, we will create a dataframe with a
+ monotonically increasing index (for example, a sequence
+ of dates).
+
+ >>> date_index = pd.date_range('1/1/2010', periods=6, freq='D')
+ >>> df2 = pd.DataFrame({"prices": [100, 101, np.nan, 100, 89, 88]},
+ ...                    index=date_index)
+ >>> df2
+             prices
+ 2010-01-01     100
+ 2010-01-02     101
+ 2010-01-03     NaN
+ 2010-01-04     100
+ 2010-01-05      89
+ 2010-01-06      88
+
+ Suppose we decide to expand the dataframe to cover a wider
+ date range.
+
+ >>> date_index2 = pd.date_range('12/29/2009', periods=10, freq='D')
+ >>> df2.reindex(date_index2)
+             prices
+ 2009-12-29     NaN
+ 2009-12-30     NaN
+ 2009-12-31     NaN
+ 2010-01-01     100
+ 2010-01-02     101
+ 2010-01-03     NaN
+ 2010-01-04     100
+ 2010-01-05      89
+ 2010-01-06      88
+ 2010-01-07     NaN
+
+ The index entries that did not have a value in the original dataframe
+ (for example, '2009-12-29') are by default filled with ``NaN``.
+ If desired, we can fill in the missing values using one of several
+ options.
+
+ For example, to propagate the next valid value backward to fill the
+ ``NaN`` values, pass ``bfill`` as an argument to the ``method`` keyword.
+
+ >>> df2.reindex(date_index2, method='bfill')
+             prices
+ 2009-12-29     100
+ 2009-12-30     100
+ 2009-12-31     100
+ 2010-01-01     100
+ 2010-01-02     101
+ 2010-01-03     NaN
+ 2010-01-04     100
+ 2010-01-05      89
+ 2010-01-06      88
+ 2010-01-07     NaN
+
+ Please note that the ``NaN`` value present in the original dataframe
+ (at index value 2010-01-03) will not be filled by any of the
+ value propagation schemes. This is because filling while reindexing
+ does not look at dataframe values, but only compares the original and
+ desired indexes. If you do want to fill in the ``NaN`` values present
+ in the original dataframe, use the ``fillna()`` method.

  Returns
  -------
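The closing note of the new docstring, that reindex-time filling leaves pre-existing ``NaN`` values alone, can be checked with a short sketch. It rebuilds the docstring's `df2` and `date_index2` under the assumption of a recent pandas release. The names `reindexed` and `filled`, and the choice of `0` as the replacement value, are illustrative and not part of the commit.

```python
import numpy as np
import pandas as pd

# Rebuild the docstring's example frame and the wider target index.
date_index = pd.date_range('2010-01-01', periods=6, freq='D')
df2 = pd.DataFrame({'prices': [100, 101, np.nan, 100, 89, 88]},
                   index=date_index)
date_index2 = pd.date_range('2009-12-29', periods=10, freq='D')

# bfill fills only labels that are new in date_index2 and have a later
# label in the original index; the NaN stored at 2010-01-03 is untouched.
reindexed = df2.reindex(date_index2, method='bfill')

# fillna() operates on the values themselves, so it replaces every NaN.
filled = reindexed.fillna(0)

print(reindexed['prices'].isna().sum())
print(filled['prices'].isna().sum())
```

After the reindex, two ``NaN`` values should remain: the original one at 2010-01-03 and the trailing 2010-01-07, which has no later observation to backfill from.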