Commit 8ca1de2

Merge remote-tracking branch 'upstream/master' into str_cat_set
2 parents e37016a + 4cac923 commit 8ca1de2

47 files changed

+1408
-862
lines changed

ci/code_checks.sh

+5-5
@@ -16,7 +16,7 @@

 echo "inside $0"
 [[ $LINT ]] || { echo "NOT Linting. To lint use: LINT=true $0 $1"; exit 0; }
-[[ -z "$1" || "$1" == "lint" || "$1" == "patterns" || "$1" == "doctests" ]] || { echo "Unkown command $1. Usage: $0 [lint|patterns|doctests]"; exit 9999; }
+[[ -z "$1" || "$1" == "lint" || "$1" == "patterns" || "$1" == "doctests" ]] || { echo "Unknown command $1. Usage: $0 [lint|patterns|doctests]"; exit 9999; }

 source activate pandas
 RET=0
@@ -122,22 +122,22 @@ fi
 if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then

     MSG='Doctests frame.py' ; echo $MSG
-    pytest --doctest-modules -v pandas/core/frame.py \
+    pytest -q --doctest-modules pandas/core/frame.py \
         -k"-axes -combine -itertuples -join -nlargest -nsmallest -nunique -pivot_table -quantile -query -reindex -reindex_axis -replace -round -set_index -stack -to_stata"
     RET=$(($RET + $?)) ; echo $MSG "DONE"

     MSG='Doctests series.py' ; echo $MSG
-    pytest --doctest-modules -v pandas/core/series.py \
+    pytest -q --doctest-modules pandas/core/series.py \
         -k"-nonzero -reindex -searchsorted -to_dict"
     RET=$(($RET + $?)) ; echo $MSG "DONE"

     MSG='Doctests generic.py' ; echo $MSG
-    pytest --doctest-modules -v pandas/core/generic.py \
+    pytest -q --doctest-modules pandas/core/generic.py \
         -k"-_set_axis_name -_xs -describe -droplevel -groupby -interpolate -pct_change -pipe -reindex -reindex_axis -resample -to_json -transpose -values -xs"
     RET=$(($RET + $?)) ; echo $MSG "DONE"

     MSG='Doctests top-level reshaping functions' ; echo $MSG
-    pytest --doctest-modules -v \
+    pytest -q --doctest-modules \
         pandas/core/reshape/concat.py \
         pandas/core/reshape/pivot.py \
         pandas/core/reshape/reshape.py \

doc/source/extending.rst

+16
@@ -135,6 +135,12 @@ There are two approaches for providing operator support for your ExtensionArray:
 2. Use an operator implementation from pandas that depends on operators that are already defined
    on the underlying elements (scalars) of the ExtensionArray.

+.. note::
+
+   Regardless of the approach, you may want to set ``__array_priority__``
+   if you want your implementation to be called when involved in binary operations
+   with NumPy arrays.
+
 For the first approach, you define selected operators, e.g., ``__add__``, ``__le__``, etc. that
 you want your ``ExtensionArray`` subclass to support.

@@ -173,6 +179,16 @@ or not that succeeds depends on whether the operation returns a result
 that's valid for the ``ExtensionArray``. If an ``ExtensionArray`` cannot
 be reconstructed, an ndarray containing the scalars returned instead.

+For ease of implementation and consistency with operations between pandas
+and NumPy ndarrays, we recommend *not* handling Series and Indexes in your binary ops.
+Instead, you should detect these cases and return ``NotImplemented``.
+When pandas encounters an operation like ``op(Series, ExtensionArray)``, pandas
+will
+
+1. unbox the array from the ``Series`` (roughly ``Series.values``)
+2. call ``result = op(values, ExtensionArray)``
+3. re-box the result in a ``Series``
+
 .. _extending.extension.testing:

 Testing Extension Arrays
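
Reviewer note on the second hunk above: the recommendation to return ``NotImplemented`` relies on Python's reflected-operand dispatch. The sketch below illustrates that mechanism with two toy classes. ``ToyArray`` and ``ToySeries`` are hypothetical stand-ins invented for this note, not pandas classes, and the real ``Series`` dispatch is more involved.

```python
import operator


class ToyArray:
    """Hypothetical stand-in for an ExtensionArray subclass."""

    def __init__(self, values):
        self.values = list(values)

    def _binop(self, other, op):
        # Decline to handle the container type; Python then tries the
        # container's reflected method instead.
        if isinstance(other, ToySeries):
            return NotImplemented
        if isinstance(other, ToyArray):
            other = other.values
        else:
            other = [other] * len(self.values)  # broadcast a scalar
        return ToyArray(op(a, b) for a, b in zip(self.values, other))

    def __add__(self, other):
        return self._binop(other, operator.add)

    def __radd__(self, other):
        return self._binop(other, lambda a, b: operator.add(b, a))


class ToySeries:
    """Hypothetical stand-in for a Series that boxes a ToyArray."""

    def __init__(self, array):
        self.array = array

    def __add__(self, other):
        values = self.array                # 1. unbox the array
        if isinstance(other, ToySeries):
            other = other.array
        result = values + other            # 2. op on the unboxed values
        return ToySeries(result)           # 3. re-box the result

    def __radd__(self, other):
        # Reached when ToyArray.__add__ returned NotImplemented.
        return ToySeries(other + self.array)


arr = ToyArray([10, 20, 30])
ser = ToySeries(ToyArray([1, 2, 3]))
left = ser + arr    # handled directly by ToySeries.__add__
right = arr + ser   # ToyArray declines, ToySeries.__radd__ takes over
```

Both ``left`` and ``right`` come back boxed in a ``ToySeries``, so the array class never needs to know how to build the container.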

doc/source/whatsnew/v0.24.0.txt

+97
@@ -235,6 +235,97 @@ If installed, we now require:
 | scipy           | 0.18.1          |          |
 +-----------------+-----------------+----------+

+.. _whatsnew_0240.api_breaking.csv_line_terminator:
+
+`os.linesep` is used for ``line_terminator`` of ``DataFrame.to_csv``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+:func:`DataFrame.to_csv` now uses :func:`os.linesep` rather than ``'\n'``
+for the default line terminator (:issue:`20353`).
+This change only affects when running on Windows, where ``'\r\n'`` was used for line terminator
+even when ``'\n'`` was passed in ``line_terminator``.
+
+Previous Behavior on Windows:
+
+.. code-block:: ipython
+
+    In [1]: data = pd.DataFrame({
+       ...: "string_with_lf": ["a\nbc"],
+       ...: "string_with_crlf": ["a\r\nbc"]
+       ...: })
+
+    In [2]: # When passing file PATH to to_csv, line_terminator does not work, and csv is saved with '\r\n'.
+       ...: # Also, this converts all '\n's in the data to '\r\n'.
+       ...: data.to_csv("test.csv", index=False, line_terminator='\n')
+
+    In [3]: with open("test.csv", mode='rb') as f:
+       ...: print(f.read())
+    b'string_with_lf,string_with_crlf\r\n"a\r\nbc","a\r\r\nbc"\r\n'
+
+    In [4]: # When passing file OBJECT with newline option to to_csv, line_terminator works.
+       ...: with open("test2.csv", mode='w', newline='\n') as f:
+       ...: data.to_csv(f, index=False, line_terminator='\n')
+
+    In [5]: with open("test2.csv", mode='rb') as f:
+       ...: print(f.read())
+    b'string_with_lf,string_with_crlf\n"a\nbc","a\r\nbc"\n'
+
+
+New Behavior on Windows:
+
+- By passing ``line_terminator`` explicitly, line terminator is set to that character.
+- The value of ``line_terminator`` only affects the line terminator of CSV,
+  so it does not change the value inside the data.
+
+.. code-block:: ipython
+
+    In [1]: data = pd.DataFrame({
+       ...: "string_with_lf": ["a\nbc"],
+       ...: "string_with_crlf": ["a\r\nbc"]
+       ...: })
+
+    In [2]: data.to_csv("test.csv", index=False, line_terminator='\n')
+
+    In [3]: with open("test.csv", mode='rb') as f:
+       ...: print(f.read())
+    b'string_with_lf,string_with_crlf\n"a\nbc","a\r\nbc"\n'
+
+
+- On Windows, the value of ``os.linesep`` is ``'\r\n'``,
+  so if ``line_terminator`` is not set, ``'\r\n'`` is used for line terminator.
+- Again, it does not affect the value inside the data.
+
+.. code-block:: ipython
+
+    In [1]: data = pd.DataFrame({
+       ...: "string_with_lf": ["a\nbc"],
+       ...: "string_with_crlf": ["a\r\nbc"]
+       ...: })
+
+    In [2]: data.to_csv("test.csv", index=False)
+
+    In [3]: with open("test.csv", mode='rb') as f:
+       ...: print(f.read())
+    b'string_with_lf,string_with_crlf\r\n"a\nbc","a\r\nbc"\r\n'
+
+
+- For files objects, specifying ``newline`` is not sufficient to set the line terminator.
+  You must pass in the ``line_terminator`` explicitly, even in this case.
+
+.. code-block:: ipython
+
+    In [1]: data = pd.DataFrame({
+       ...: "string_with_lf": ["a\nbc"],
+       ...: "string_with_crlf": ["a\r\nbc"]
+       ...: })
+
+    In [2]: with open("test2.csv", mode='w', newline='\n') as f:
+       ...: data.to_csv(f, index=False)
+
+    In [3]: with open("test2.csv", mode='rb') as f:
+       ...: print(f.read())
+    b'string_with_lf,string_with_crlf\r\n"a\nbc","a\r\nbc"\r\n'
+
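
Reviewer note on the ``to_csv`` section above: the difference between a row terminator and newline translation by the underlying stream can be reproduced with the standard library alone. The sketch below uses ``csv.writer`` and ``io.StringIO`` as stand-ins. That pandas behaved this way for the same reason is an assumption made for illustration, and the helper ``write_rows`` is hypothetical.

```python
import csv
import io

ROWS = [
    ["string_with_lf", "string_with_crlf"],
    ["a\nbc", "a\r\nbc"],
]


def write_rows(buffer, lineterminator):
    """Write ROWS to buffer and return the exact text it now holds."""
    csv.writer(buffer, lineterminator=lineterminator).writerows(ROWS)
    return buffer.getvalue()


# newline="" turns off stream-level translation, so only the writer's
# lineterminator decides how rows end; embedded newlines are untouched.
explicit_lf = write_rows(io.StringIO(newline=""), "\n")
explicit_crlf = write_rows(io.StringIO(newline=""), "\r\n")

# newline="\r\n" makes the stream rewrite every "\n" it receives,
# including those inside quoted fields, so "\r\n" becomes "\r\r\n".
translated = write_rows(io.StringIO(newline="\r\n"), "\n")

print(repr(explicit_lf))
print(repr(explicit_crlf))
print(repr(translated))
```

The first two results match the "New Behavior" outputs in the hunk, and the third matches the "Previous Behavior" output for a file path, which is consistent with (but does not prove) stream-level translation being the cause.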
 .. _whatsnew_0240.api_breaking.interval_values:

 ``IntervalIndex.values`` is now an ``IntervalArray``
@@ -714,6 +805,8 @@ Other API Changes
 - :class:`pandas.io.formats.style.Styler` supports a ``number-format`` property when using :meth:`~pandas.io.formats.style.Styler.to_excel` (:issue:`22015`)
 - :meth:`DataFrame.corr` and :meth:`Series.corr` now raise a ``ValueError`` along with a helpful error message instead of a ``KeyError`` when supplied with an invalid method (:issue:`22298`)
 - :meth:`shift` will now always return a copy, instead of the previous behaviour of returning self when shifting by 0 (:issue:`22397`)
+- :meth:`DataFrame.set_index` now allows all one-dimensional list-likes, raises a ``TypeError`` for incorrect types,
+  has an improved ``KeyError`` message, and will not fail on duplicate column names with ``drop=True``. (:issue:`22484`)
 - Slicing a single row of a DataFrame with multiple ExtensionArrays of the same type now preserves the dtype, rather than coercing to object (:issue:`22784`)
 - :class:`DateOffset` attribute `_cacheable` and method `_should_cache` have been removed (:issue:`23118`)

@@ -733,6 +826,7 @@ Deprecations
   many ``Series``, ``Index`` or 1-dimensional ``np.ndarray``, or alternatively, only scalar values. (:issue:`21950`)
 - :meth:`FrozenNDArray.searchsorted` has deprecated the ``v`` parameter in favor of ``value`` (:issue:`14645`)
 - :func:`DatetimeIndex.shift` and :func:`PeriodIndex.shift` now accept ``periods`` argument instead of ``n`` for consistency with :func:`Index.shift` and :func:`Series.shift`. Using ``n`` throws a deprecation warning (:issue:`22458`, :issue:`22912`)
+- The ``fastpath`` keyword of the different Index constructors is deprecated (:issue:`23110`).

 .. _whatsnew_0240.prior_deprecations:

@@ -750,6 +844,8 @@ Removal of prior version deprecations/changes
 - :meth:`Categorical.searchsorted` and :meth:`Series.searchsorted` have renamed the ``v`` argument to ``value`` (:issue:`14645`)
 - :meth:`TimedeltaIndex.searchsorted`, :meth:`DatetimeIndex.searchsorted`, and :meth:`PeriodIndex.searchsorted` have renamed the ``key`` argument to ``value`` (:issue:`14645`)
 - Removal of the previously deprecated module ``pandas.json`` (:issue:`19944`)
+- :meth:`SparseArray.get_values` and :meth:`SparseArray.to_dense` have dropped the ``fill`` parameter (:issue:`14686`)
+- :meth:`SparseSeries.to_dense` has dropped the ``sparse_only`` parameter (:issue:`14686`)

 .. _whatsnew_0240.performance:

@@ -875,6 +971,7 @@ Numeric
 - Bug in :meth:`DataFrame.apply` where, when supplied with a string argument and additional positional or keyword arguments (e.g. ``df.apply('sum', min_count=1)``), a ``TypeError`` was wrongly raised (:issue:`22376`)
 - Bug in :meth:`DataFrame.astype` to extension dtype may raise ``AttributeError`` (:issue:`22578`)
 - Bug in :class:`DataFrame` with ``timedelta64[ns]`` dtype arithmetic operations with ``ndarray`` with integer dtype incorrectly treating the narray as ``timedelta64[ns]`` dtype (:issue:`23114`)
+- Bug in :meth:`Series.rpow` with object dtype ``NaN`` for ``1 ** NA`` instead of ``1`` (:issue:`22922`).

 Strings
 ^^^^^^^
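
Reviewer note on the ``Series.rpow`` entry in the Numeric hunk above: the expected result ``1 ** NA == 1`` mirrors how plain Python floats treat a NaN exponent. The stdlib-only check below illustrates that convention. It says nothing about how pandas implements the fix, and the variable names are chosen for this note only.

```python
import math

nan = float("nan")

# One raised to any power is one, even when the exponent is NaN.
one_pow_nan = 1.0 ** nan

# math.pow documents the same special case.
one_math_pow_nan = math.pow(1.0, nan)

# Any other base propagates the NaN exponent.
two_pow_nan = 2.0 ** nan

print(one_pow_nan, one_math_pow_nan, two_pow_nan)
```

A base of one is the exception because the result does not depend on the exponent at all, so a missing exponent carries no uncertainty into the answer.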
