@@ -1777,7 +1777,9 @@ def sort_index(self, axis=0, level=None, ascending=True, inplace=False,
      New labels / index to conform to. Preferably an Index object to
      avoid duplicating data
  method : {None, 'backfill'/'bfill', 'pad'/'ffill', 'nearest'}, optional
-     Method to use for filling holes in reindexed DataFrame:
+     Method to use for filling holes in reindexed DataFrame.
+     Please note: this is only applicable to DataFrames/Series with a
+     monotonically increasing/decreasing index.
      * default: don't fill gaps
      * pad / ffill: propagate last valid observation forward to next valid
      * backfill / bfill: use next valid observation to fill gap
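
Note on the hunk above (not part of the commit): the monotonic-index restriction added to the ``method`` description can be sketched as follows. The integer labels and the column name ``value`` are hypothetical, chosen only for illustration, and the example assumes a reasonably recent pandas release.

```python
import pandas as pd

# Hypothetical data: the labels 3, 1, 2 are neither increasing nor decreasing.
df = pd.DataFrame({'value': [30, 10, 20]}, index=[3, 1, 2])

# Filling via ``method`` needs a monotonic index, so this call raises.
try:
    df.reindex([1, 2, 4], method='ffill')
    raised = False
except ValueError:
    raised = True

# After sorting, the index is monotonically increasing and ffill works:
# label 4 has no row of its own, so it takes the value stored at label 3.
filled = df.sort_index().reindex([1, 2, 4], method='ffill')
print(raised)
print(filled)
```

Sorting first is only one possible workaround; when the original label order matters, ``fill_value`` is the option the docstring itself points to.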
@@ -1801,7 +1803,118 @@ def sort_index(self, axis=0, level=None, ascending=True, inplace=False,

  Examples
  --------
- >>> df.reindex(index=[date1, date2, date3], columns=['A', 'B', 'C'])
+
+ Create a dataframe with some fictional data.
+
+ >>> index = ['Firefox', 'Chrome', 'Safari', 'IE10', 'Konqueror']
+ >>> df = pd.DataFrame({
+ ...     'http_status': [200, 200, 404, 404, 301],
+ ...     'response_time': [0.04, 0.02, 0.07, 0.08, 1.0]},
+ ...     index=index)
+ >>> df
+            http_status  response_time
+ Firefox            200           0.04
+ Chrome             200           0.02
+ Safari             404           0.07
+ IE10               404           0.08
+ Konqueror          301           1.00
+
+ Create a new index and reindex the dataframe. By default,
+ values in the new index that do not have corresponding
+ records in the dataframe are assigned ``NaN``.
+
+ >>> new_index = ['Safari', 'Iceweasel', 'Comodo Dragon', 'IE10',
+ ...              'Chrome']
+ >>> df.reindex(new_index)
+                http_status  response_time
+ Safari                 404           0.07
+ Iceweasel              NaN            NaN
+ Comodo Dragon          NaN            NaN
+ IE10                   404           0.08
+ Chrome                 200           0.02
+
+ We can fill in the missing values by passing a value to
+ the keyword ``fill_value``. Because the index is not monotonically
+ increasing or decreasing, we cannot use arguments to the keyword
+ ``method`` to fill the ``NaN`` values.
+
+ >>> df.reindex(new_index, fill_value=0)
+                http_status  response_time
+ Safari                 404           0.07
+ Iceweasel                0           0.00
+ Comodo Dragon            0           0.00
+ IE10                   404           0.08
+ Chrome                 200           0.02
+
+ >>> df.reindex(new_index, fill_value='missing')
+               http_status response_time
+ Safari                404          0.07
+ Iceweasel         missing       missing
+ Comodo Dragon     missing       missing
+ IE10                  404          0.08
+ Chrome                200          0.02
+
+ To further illustrate the filling functionality in
+ ``reindex``, we will create a dataframe with a
+ monotonically increasing index (for example, a sequence
+ of dates).
+
+ >>> date_index = pd.date_range('1/1/2010', periods=6, freq='D')
+ >>> df2 = pd.DataFrame({"prices": [100, 101, np.nan, 100, 89, 88]},
+ ...                    index=date_index)
+ >>> df2
+             prices
+ 2010-01-01     100
+ 2010-01-02     101
+ 2010-01-03     NaN
+ 2010-01-04     100
+ 2010-01-05      89
+ 2010-01-06      88
+
+ Suppose we decide to expand the dataframe to cover a wider
+ date range.
+
+ >>> date_index2 = pd.date_range('12/29/2009', periods=10, freq='D')
+ >>> df2.reindex(date_index2)
+             prices
+ 2009-12-29     NaN
+ 2009-12-30     NaN
+ 2009-12-31     NaN
+ 2010-01-01     100
+ 2010-01-02     101
+ 2010-01-03     NaN
+ 2010-01-04     100
+ 2010-01-05      89
+ 2010-01-06      88
+ 2010-01-07     NaN
+
+ The index entries that did not have a value in the original dataframe
+ (for example, '2009-12-29') are by default filled with ``NaN``.
+ If desired, we can fill in the missing values using one of several
+ options.
+
+ For example, to back-propagate the next valid value to fill the ``NaN``
+ values, pass ``bfill`` as an argument to the ``method`` keyword.
+
+ >>> df2.reindex(date_index2, method='bfill')
+             prices
+ 2009-12-29     100
+ 2009-12-30     100
+ 2009-12-31     100
+ 2010-01-01     100
+ 2010-01-02     101
+ 2010-01-03     NaN
+ 2010-01-04     100
+ 2010-01-05      89
+ 2010-01-06      88
+ 2010-01-07     NaN
+
+ Please note that the ``NaN`` value present in the original dataframe
+ (at index value 2010-01-03) will not be filled by any of the
+ value propagation schemes. This is because filling while reindexing
+ does not look at dataframe values, but only compares the original and
+ desired indexes. If you do want to fill in the ``NaN`` values present
+ in the original dataframe, use the ``fillna()`` method.

  Returns
  -------
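
Note on the closing paragraph of the added examples (not part of the commit): the difference between filling during ``reindex`` and filling with ``fillna()`` can be sketched as below. The data mirrors the ``df2`` example in the docstring; the fill constant ``0`` is an arbitrary assumption, and any value or strategy supported by ``fillna()`` could be substituted.

```python
import numpy as np
import pandas as pd

# Same shape of data as the docstring's ``df2`` example.
date_index = pd.date_range('2010-01-01', periods=6, freq='D')
df2 = pd.DataFrame({'prices': [100, 101, np.nan, 100, 89, 88]},
                   index=date_index)

# Reindex onto a wider range; bfill only fills labels that are new to the
# index, and only where a later original label exists to take a value from.
date_index2 = pd.date_range('2009-12-29', periods=10, freq='D')
reindexed = df2.reindex(date_index2, method='bfill')

# Two NaN values remain: the one already present at 2010-01-03 and the
# trailing 2010-01-07, which has no later observation to borrow from.
remaining = int(reindexed['prices'].isna().sum())

# fillna() looks at the values themselves, so it fills both.
# The constant 0 is an arbitrary choice for illustration.
filled = reindexed.fillna(0)
print(remaining)
print(filled)
```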