DOC: clean-up recent doc errors/warnings #23636

Merged: 4 commits, Nov 12, 2018
2 changes: 1 addition & 1 deletion doc/source/advanced.rst
@@ -702,7 +702,7 @@ Index Types

We have discussed ``MultiIndex`` in the previous sections pretty extensively.
Documentation about ``DatetimeIndex`` and ``PeriodIndex`` are shown :ref:`here <timeseries.overview>`,
and documentation about ``TimedeltaIndex`` is found :ref:`here <timedeltas.timedeltaindex>`.
and documentation about ``TimedeltaIndex`` is found :ref:`here <timedeltas.index>`.

In the following sub-sections we will highlight some other index types.

2 changes: 1 addition & 1 deletion doc/source/ecosystem.rst
@@ -140,7 +140,7 @@ which are utilized by Jupyter Notebook for displaying
(Note: HTML tables may or may not be
compatible with non-HTML Jupyter output formats.)

See :ref:`Options and Settings <options>` and :ref:`options.available <available>`
See :ref:`Options and Settings <options>` and :ref:`options.available`
for pandas ``display.`` settings.

`quantopian/qgrid <https://github.com/quantopian/qgrid>`__
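For context on the ``display.`` settings the hunk above points to, here is a minimal sketch using pandas' public options API (the option names are real; the values are arbitrary):

.. code-block:: python

    import pandas as pd

    # Read, change, and restore a display option
    pd.get_option("display.max_rows")        # default is 60
    pd.set_option("display.max_rows", 10)    # truncate reprs to 10 rows
    pd.reset_option("display.max_rows")      # back to the default

    # Or scope the change with a context manager
    with pd.option_context("display.max_columns", 5):
        print(pd.DataFrame({"c%d" % i: [0] for i in range(8)}))
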
4 changes: 3 additions & 1 deletion doc/source/timeseries.rst
@@ -2372,7 +2372,8 @@ can be controlled by the ``nonexistent`` argument. The following options are available:
* ``shift``: Shifts nonexistent times forward to the closest real time

.. ipython:: python
dti = date_range(start='2015-03-29 01:30:00', periods=3, freq='H')

dti = pd.date_range(start='2015-03-29 01:30:00', periods=3, freq='H')
# 2:30 is a nonexistent time

Localization of nonexistent times will raise an error by default.
@@ -2385,6 +2386,7 @@ Localization of nonexistent times will raise an error by default.
Transform nonexistent times to ``NaT`` or the closest real time forward in time.

.. ipython:: python

dti
dti.tz_localize('Europe/Warsaw', nonexistent='shift')
dti.tz_localize('Europe/Warsaw', nonexistent='NaT')
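To round out the hunks above: with the default ``nonexistent='raise'``, localizing a nonexistent wall time fails. A minimal sketch (the exception class comes from ``pytz``, which pandas used for timezone handling at the time):

.. code-block:: python

    import pandas as pd

    dti = pd.date_range(start='2015-03-29 01:30:00', periods=3, freq='H')
    try:
        dti.tz_localize('Europe/Warsaw')  # 02:30 does not exist (spring-forward)
    except Exception as err:
        print(type(err).__name__)  # NonExistentTimeError
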
114 changes: 60 additions & 54 deletions doc/source/whatsnew/v0.24.0.txt
@@ -14,11 +14,10 @@ New features
~~~~~~~~~~~~
- :func:`merge` now directly allows merge between objects of type ``DataFrame`` and named ``Series``, without the need to convert the ``Series`` object into a ``DataFrame`` beforehand (:issue:`21220`)
- ``ExcelWriter`` now accepts ``mode`` as a keyword argument, enabling append to existing workbooks when using the ``openpyxl`` engine (:issue:`3441`)
- ``FrozenList`` has gained the ``.union()`` and ``.difference()`` methods. This functionality greatly simplifies groupby's that rely on explicitly excluding certain columns. See :ref:`Splitting an object into groups
<groupby.split>` for more information (:issue:`15475`, :issue:`15506`)
- ``FrozenList`` has gained the ``.union()`` and ``.difference()`` methods. This functionality greatly simplifies groupby's that rely on explicitly excluding certain columns. See :ref:`Splitting an object into groups <groupby.split>` for more information (:issue:`15475`, :issue:`15506`).
- :func:`DataFrame.to_parquet` now accepts ``index`` as an argument, allowing
the user to override the engine's default behavior to include or omit the
dataframe's indexes from the resulting Parquet file. (:issue:`20768`)
the user to override the engine's default behavior to include or omit the
dataframe's indexes from the resulting Parquet file. (:issue:`20768`)
- :meth:`DataFrame.corr` and :meth:`Series.corr` now accept a callable for generic calculation methods of correlation, e.g. histogram intersection (:issue:`22684`)
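
Hedged sketches of two of the features listed above, assuming pandas 0.24 with ``openpyxl`` installed (the file names are hypothetical): merging a ``DataFrame`` with a named ``Series`` directly, and appending a sheet to an existing workbook via ``ExcelWriter(mode='a')``.

.. code-block:: python

    import pandas as pd

    df = pd.DataFrame({"key": ["a", "b", "c"], "val": [1, 2, 3]})
    bonus = pd.Series([10, 20, 30], name="bonus")

    # Previously the Series had to be wrapped in .to_frame() first
    pd.merge(df, bonus, left_index=True, right_index=True)

    # Append a second sheet to an existing workbook (openpyxl engine)
    df.to_excel("report.xlsx", sheet_name="first")
    with pd.ExcelWriter("report.xlsx", engine="openpyxl", mode="a") as writer:
        df.to_excel(writer, sheet_name="second")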


@@ -227,7 +226,7 @@ Other Enhancements
- :class:`Series` and :class:`DataFrame` now support :class:`Iterable` in constructor (:issue:`2193`)
- :class:`DatetimeIndex` gained :attr:`DatetimeIndex.timetz` attribute. Returns local time with timezone information. (:issue:`21358`)
- :meth:`round`, :meth:`ceil`, and meth:`floor` for :class:`DatetimeIndex` and :class:`Timestamp` now support an ``ambiguous`` argument for handling datetimes that are rounded to ambiguous times (:issue:`18946`)
- :meth:`round`, :meth:`ceil`, and meth:`floor` for :class:`DatetimeIndex` and :class:`Timestamp` now support a ``nonexistent`` argument for handling datetimes that are rounded to nonexistent times. See :ref:`timeseries.timezone_nonexsistent` (:issue:`22647`)
- :meth:`round`, :meth:`ceil`, and meth:`floor` for :class:`DatetimeIndex` and :class:`Timestamp` now support a ``nonexistent`` argument for handling datetimes that are rounded to nonexistent times. See :ref:`timeseries.timezone_nonexistent` (:issue:`22647`)
- :class:`Resampler` now is iterable like :class:`GroupBy` (:issue:`15314`).
- :meth:`Series.resample` and :meth:`DataFrame.resample` have gained the :meth:`Resampler.quantile` (:issue:`15023`).
- :meth:`pandas.core.dtypes.is_list_like` has gained a keyword ``allow_sets`` which is ``True`` by default; if ``False``,
@@ -237,7 +236,7 @@ Other Enhancements
- Compatibility with Matplotlib 3.0 (:issue:`22790`).
- Added :meth:`Interval.overlaps`, :meth:`IntervalArray.overlaps`, and :meth:`IntervalIndex.overlaps` for determining overlaps between interval-like objects (:issue:`21998`)
- :func:`~DataFrame.to_parquet` now supports writing a ``DataFrame`` as a directory of parquet files partitioned by a subset of the columns when ``engine = 'pyarrow'`` (:issue:`23283`)
- :meth:`Timestamp.tz_localize`, :meth:`DatetimeIndex.tz_localize`, and :meth:`Series.tz_localize` have gained the ``nonexistent`` argument for alternative handling of nonexistent times. See :ref:`timeseries.timezone_nonexsistent` (:issue:`8917`)
- :meth:`Timestamp.tz_localize`, :meth:`DatetimeIndex.tz_localize`, and :meth:`Series.tz_localize` have gained the ``nonexistent`` argument for alternative handling of nonexistent times. See :ref:`timeseries.timezone_nonexistent` (:issue:`8917`)
- :meth:`read_excel()` now accepts ``usecols`` as a list of column names or callable (:issue:`18273`)
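
A brief sketch of the new ``usecols`` forms for :func:`read_excel` mentioned in the last bullet above ("data.xlsx" and its column names are hypothetical):

.. code-block:: python

    import pandas as pd

    # Select columns by name
    pd.read_excel("data.xlsx", usecols=["name", "score"])

    # Or keep only the columns a callable approves of
    pd.read_excel("data.xlsx", usecols=lambda c: c.lower().startswith("q"))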

.. _whatsnew_0240.api_breaking:
@@ -283,37 +282,37 @@ and replaced it with references to `pyarrow` (:issue:`21639` and :issue:`23053`)
.. _whatsnew_0240.api_breaking.csv_line_terminator:

`os.linesep` is used for ``line_terminator`` of ``DataFrame.to_csv``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:func:`DataFrame.to_csv` now uses :func:`os.linesep` rather than ``'\n'``
for the default line terminator (:issue:`20353`).
for the default line terminator (:issue:`20353`).
This change only affects when running on Windows, where ``'\r\n'`` was used for line terminator
even when ``'\n'`` was passed in ``line_terminator``.

Previous Behavior on Windows:

.. code-block:: ipython

In [1]: data = pd.DataFrame({
...: "string_with_lf": ["a\nbc"],
...: "string_with_crlf": ["a\r\nbc"]
...: })
In [1]: data = pd.DataFrame({
...: "string_with_lf": ["a\nbc"],
...: "string_with_crlf": ["a\r\nbc"]
...: })

In [2]: # When passing file PATH to to_csv, line_terminator does not work, and csv is saved with '\r\n'.
...: # Also, this converts all '\n's in the data to '\r\n'.
...: data.to_csv("test.csv", index=False, line_terminator='\n')
In [2]: # When passing file PATH to to_csv, line_terminator does not work, and csv is saved with '\r\n'.
...: # Also, this converts all '\n's in the data to '\r\n'.
...: data.to_csv("test.csv", index=False, line_terminator='\n')

In [3]: with open("test.csv", mode='rb') as f:
...: print(f.read())
b'string_with_lf,string_with_crlf\r\n"a\r\nbc","a\r\r\nbc"\r\n'
In [3]: with open("test.csv", mode='rb') as f:
...: print(f.read())
b'string_with_lf,string_with_crlf\r\n"a\r\nbc","a\r\r\nbc"\r\n'

In [4]: # When passing file OBJECT with newline option to to_csv, line_terminator works.
...: with open("test2.csv", mode='w', newline='\n') as f:
...: data.to_csv(f, index=False, line_terminator='\n')
In [4]: # When passing file OBJECT with newline option to to_csv, line_terminator works.
...: with open("test2.csv", mode='w', newline='\n') as f:
...: data.to_csv(f, index=False, line_terminator='\n')

In [5]: with open("test2.csv", mode='rb') as f:
...: print(f.read())
b'string_with_lf,string_with_crlf\n"a\nbc","a\r\nbc"\n'
In [5]: with open("test2.csv", mode='rb') as f:
...: print(f.read())
b'string_with_lf,string_with_crlf\n"a\nbc","a\r\nbc"\n'


New Behavior on Windows:
@@ -322,54 +321,54 @@ New Behavior on Windows:
- The value of ``line_terminator`` only affects the line terminator of CSV,
so it does not change the value inside the data.

.. code-block:: ipython
.. code-block:: ipython

In [1]: data = pd.DataFrame({
...: "string_with_lf": ["a\nbc"],
...: "string_with_crlf": ["a\r\nbc"]
...: })
In [1]: data = pd.DataFrame({
...: "string_with_lf": ["a\nbc"],
...: "string_with_crlf": ["a\r\nbc"]
...: })

In [2]: data.to_csv("test.csv", index=False, line_terminator='\n')
In [2]: data.to_csv("test.csv", index=False, line_terminator='\n')

In [3]: with open("test.csv", mode='rb') as f:
...: print(f.read())
b'string_with_lf,string_with_crlf\n"a\nbc","a\r\nbc"\n'
In [3]: with open("test.csv", mode='rb') as f:
...: print(f.read())
b'string_with_lf,string_with_crlf\n"a\nbc","a\r\nbc"\n'


- On Windows, the value of ``os.linesep`` is ``'\r\n'``,
so if ``line_terminator`` is not set, ``'\r\n'`` is used for line terminator.
- Again, it does not affect the value inside the data.

.. code-block:: ipython
.. code-block:: ipython

In [1]: data = pd.DataFrame({
...: "string_with_lf": ["a\nbc"],
...: "string_with_crlf": ["a\r\nbc"]
...: })
In [1]: data = pd.DataFrame({
...: "string_with_lf": ["a\nbc"],
...: "string_with_crlf": ["a\r\nbc"]
...: })

In [2]: data.to_csv("test.csv", index=False)
In [2]: data.to_csv("test.csv", index=False)

In [3]: with open("test.csv", mode='rb') as f:
...: print(f.read())
b'string_with_lf,string_with_crlf\r\n"a\nbc","a\r\nbc"\r\n'
In [3]: with open("test.csv", mode='rb') as f:
...: print(f.read())
b'string_with_lf,string_with_crlf\r\n"a\nbc","a\r\nbc"\r\n'


- For files objects, specifying ``newline`` is not sufficient to set the line terminator.
You must pass in the ``line_terminator`` explicitly, even in this case.

.. code-block:: ipython
.. code-block:: ipython

In [1]: data = pd.DataFrame({
...: "string_with_lf": ["a\nbc"],
...: "string_with_crlf": ["a\r\nbc"]
...: })
In [1]: data = pd.DataFrame({
...: "string_with_lf": ["a\nbc"],
...: "string_with_crlf": ["a\r\nbc"]
...: })

In [2]: with open("test2.csv", mode='w', newline='\n') as f:
...: data.to_csv(f, index=False)
In [2]: with open("test2.csv", mode='w', newline='\n') as f:
...: data.to_csv(f, index=False)

In [3]: with open("test2.csv", mode='rb') as f:
...: print(f.read())
b'string_with_lf,string_with_crlf\r\n"a\nbc","a\r\nbc"\r\n'
In [3]: with open("test2.csv", mode='rb') as f:
...: print(f.read())
b'string_with_lf,string_with_crlf\r\n"a\nbc","a\r\nbc"\r\n'

.. _whatsnew_0240.api_breaking.interval_values:

@@ -777,17 +776,20 @@ Previous Behavior:
df = pd.DataFrame(arr)

.. ipython:: python

# Comparison operations and arithmetic operations both broadcast.
df == arr[[0], :]
df + arr[[0], :]

.. ipython:: python

# Comparison operations and arithmetic operations both broadcast.
df == (1, 2)
df + (1, 2)

.. ipython:: python
:okexcept:

# Comparison operations and arithmetic opeartions both raise ValueError.
df == (1, 2, 3)
df + (1, 2, 3)
@@ -797,8 +799,9 @@ Previous Behavior:

DataFrame Arithmetic Operations Broadcasting Changes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:class:`DataFrame` arithmetic operations when operating with 2-dimensional
``np.ndarray`` objects now broadcast in the same way as ``np.ndarray``s
``np.ndarray`` objects now broadcast in the same way as ``np.ndarray``
broadcast. (:issue:`23000`)

Previous Behavior:
@@ -817,11 +820,13 @@ Previous Behavior:
*Current Behavior*:

.. ipython:: python

arr = np.arange(6).reshape(3, 2)
df = pd.DataFrame(arr)
df

.. ipython:: python

df + arr[[0], :] # 1 row, 2 columns
df + arr[:, [1]] # 1 column, 3 rows

@@ -888,7 +893,7 @@ Current Behavior:
...
OverflowError: Trying to coerce negative values to unsigned integers

.. _whatsnew_0240.api.crosstab_dtypes
.. _whatsnew_0240.api.crosstab_dtypes:

Crosstab Preserves Dtypes
^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -1007,6 +1012,7 @@ Current Behavior:

.. ipython:: python
:okwarning:

per = pd.Period('2016Q1')
per + 3

14 changes: 7 additions & 7 deletions pandas/_libs/tslibs/timedeltas.pyx
@@ -1111,14 +1111,14 @@ class Timedelta(_Timedelta):
Parameters
----------
value : Timedelta, timedelta, np.timedelta64, string, or integer
unit : string, {'Y', 'M', 'W', 'D', 'days', 'day',
'hours', hour', 'hr', 'h', 'm', 'minute', 'min', 'minutes',
'T', 'S', 'seconds', 'sec', 'second', 'ms',
'milliseconds', 'millisecond', 'milli', 'millis', 'L',
'us', 'microseconds', 'microsecond', 'micro', 'micros',
'U', 'ns', 'nanoseconds', 'nano', 'nanos', 'nanosecond'
'N'}, optional
unit : str, optional
Denote the unit of the input, if input is an integer. Default 'ns'.
Possible values:
{'Y', 'M', 'W', 'D', 'days', 'day', 'hours', hour', 'hr', 'h',
'm', 'minute', 'min', 'minutes', 'T', 'S', 'seconds', 'sec', 'second',
'ms', 'milliseconds', 'millisecond', 'milli', 'millis', 'L',
'us', 'microseconds', 'microsecond', 'micro', 'micros', 'U',
'ns', 'nanoseconds', 'nano', 'nanos', 'nanosecond', 'N'}
days, seconds, microseconds,
milliseconds, minutes, hours, weeks : numeric, optional
Values for construction in compat with datetime.timedelta.
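
For reference, a minimal sketch of the ``unit`` keyword documented above, alongside the ``datetime.timedelta``-style keywords:

.. code-block:: python

    import pandas as pd

    pd.Timedelta(1, unit='h')        # one hour
    pd.Timedelta(90, unit='s')       # ninety seconds
    pd.Timedelta(days=1, hours=2)    # keyword form, as in datetime.timedelta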
3 changes: 3 additions & 0 deletions pandas/core/frame.py
@@ -3409,13 +3409,15 @@ def assign(self, **kwargs):
Berkeley 25.0

Where the value is a callable, evaluated on `df`:

>>> df.assign(temp_f=lambda x: x.temp_c * 9 / 5 + 32)
temp_c temp_f
Portland 17.0 62.6
Berkeley 25.0 77.0

Alternatively, the same behavior can be achieved by directly
referencing an existing Series or sequence:

>>> df.assign(temp_f=df['temp_c'] * 9 / 5 + 32)
temp_c temp_f
Portland 17.0 62.6
@@ -3424,6 +3426,7 @@ def assign(self, **kwargs):
In Python 3.6+, you can create multiple columns within the same assign
where one of the columns depends on another one defined within the same
assign:

>>> df.assign(temp_f=lambda x: x['temp_c'] * 9 / 5 + 32,
... temp_k=lambda x: (x['temp_f'] + 459.67) * 5 / 9)
temp_c temp_f temp_k
14 changes: 7 additions & 7 deletions pandas/core/generic.py
@@ -6506,16 +6506,16 @@ def interpolate(self, method='linear', axis=0, limit=None, inplace=False,

def asof(self, where, subset=None):
"""
Return the last row(s) without any `NaN`s before `where`.
Return the last row(s) without any NaNs before `where`.

The last row (for each element in `where`, if list) without any
`NaN` is taken.
In case of a :class:`~pandas.DataFrame`, the last row without `NaN`
NaN is taken.
In case of a :class:`~pandas.DataFrame`, the last row without NaN
considering only the subset of columns (if not `None`)

.. versionadded:: 0.19.0 For DataFrame

If there is no good value, `NaN` is returned for a Series or
If there is no good value, NaN is returned for a Series or
a Series of NaN values for a DataFrame

Parameters
@@ -6524,7 +6524,7 @@ def asof(self, where, subset=None):
Date(s) before which the last row(s) are returned.
subset : str or array-like of str, default `None`
For DataFrame, if not `None`, only use these columns to
check for `NaN`s.
check for NaNs.

Notes
-----
@@ -6560,7 +6560,7 @@ def asof(self, where, subset=None):
2.0

For a sequence `where`, a Series is returned. The first value is
``NaN``, because the first element of `where` is before the first
NaN, because the first element of `where` is before the first
index value.

>>> s.asof([5, 20])
@@ -6569,7 +6569,7 @@ dtype: float64
dtype: float64

Missing values are not considered. The following is ``2.0``, not
``NaN``, even though ``NaN`` is at the index location for ``30``.
NaN, even though NaN is at the index location for ``30``.

>>> s.asof(30)
2.0
3 changes: 2 additions & 1 deletion pandas/core/indexes/datetimelike.py
@@ -90,6 +90,8 @@ class TimelikeOps(object):
:ref:`frequency aliases <timeseries.offset_aliases>` for
a list of possible `freq` values.
ambiguous : 'infer', bool-ndarray, 'NaT', default 'raise'
Only relevant for DatetimeIndex:

- 'infer' will attempt to infer fall dst-transition hours based on
order
- bool-ndarray where True signifies a DST time, False designates
@@ -98,7 +100,6 @@ class TimelikeOps(object):
- 'NaT' will return NaT where there are ambiguous times
- 'raise' will raise an AmbiguousTimeError if there are ambiguous
times
Only relevant for DatetimeIndex

.. versionadded:: 0.24.0
nonexistent : 'shift', 'NaT', default 'raise'
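
A hedged sketch of the ``ambiguous`` argument described above, applied to rounding near a fall-back DST transition (the timestamp and timezone are illustrative; requires pandas 0.24+):

.. code-block:: python

    import pandas as pd

    # 02:00-03:00 occurs twice in Europe/Warsaw on 2018-10-28 (fall-back)
    ts = pd.Timestamp("2018-10-28 03:30:00").tz_localize("Europe/Warsaw")
    ts.floor("2H", ambiguous=False)   # resolve the ambiguous result to the non-DST side
    ts.floor("2H", ambiguous="NaT")   # or return NaT instead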