Commit 65add0f

Merge branch 'master' into orc-reader
2 parents 1ff1715 + 9333e3d commit 65add0f

59 files changed: +369 -1275 lines changed

.github/workflows/ci.yml

+9-6
Original file line number | Diff line number | Diff line change
@@ -80,15 +80,18 @@ jobs:
8080
git fetch upstream
8181
if git diff upstream/master --name-only | grep -q "^asv_bench/"; then
8282
asv machine --yes
83-
ASV_OUTPUT="$(asv dev)"
84-
if [[ $(echo "$ASV_OUTPUT" | grep "failed") ]]; then
85-
echo "##vso[task.logissue type=error]Benchmarks run with errors"
86-
echo "$ASV_OUTPUT"
83+
asv dev | sed "/failed$/ s/^/##[error]/" | tee benchmarks.log
84+
if grep "failed" benchmarks.log > /dev/null ; then
8785
exit 1
88-
else
89-
echo "Benchmarks run without errors"
9086
fi
9187
else
9288
echo "Benchmarks did not run, no changes detected"
9389
fi
9490
if: true
91+
92+
- name: Publish benchmarks artifact
93+
uses: actions/upload-artifact@master
94+
with:
95+
name: Benchmarks log
96+
path: asv_bench/benchmarks.log
97+
if: failure()
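
The added lines replace the captured `ASV_OUTPUT` variable with a streaming pipeline that marks failing lines and keeps a log file. A minimal sketch of that pattern follows, assuming benchmark output arrives on stdin; `annotate_failures`, `has_failures`, and the sample benchmark lines are hypothetical stand-ins, not part of the workflow or of `asv`.

```shell
# Prefix every line that ends in "failed" with the ##[error] marker,
# echo the stream to stdout, and keep a copy in the log file given as $1.
annotate_failures() {
    sed "/failed$/ s/^/##[error]/" | tee "$1"
}

# Succeed (exit status 0) when the log mentions at least one failure.
has_failures() {
    grep "failed" "$1" > /dev/null
}

# Illustrative run with made-up benchmark output.
demo_log=$(mktemp)
printf '%s\n' 'bench_a  1.2ms' 'bench_b  failed' | annotate_failures "$demo_log"
if has_failures "$demo_log"; then
    echo "at least one benchmark failed"
fi
```

Because `tee` writes the log while passing the text through, the same output is visible in the job log and available as a file for the artifact step.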

asv_bench/benchmarks/frame_methods.py

+1-1
@@ -565,7 +565,7 @@ def setup(self):
565565

566566
def time_frame_get_dtype_counts(self):
567567
with warnings.catch_warnings(record=True):
568-
self.df.get_dtype_counts()
568+
self.df._data.get_dtype_counts()
569569

570570
def time_info(self):
571571
self.df.info()

ci/code_checks.sh

+12-16
@@ -34,17 +34,13 @@ function invgrep {
3434
#
3535
# This is useful for the CI, as we want to fail if one of the patterns
3636
# that we want to avoid is found by grep.
37-
if [[ "$AZURE" == "true" ]]; then
38-
set -o pipefail
39-
grep -n "$@" | awk -F ":" '{print "##vso[task.logissue type=error;sourcepath=" $1 ";linenumber=" $2 ";] Found unwanted pattern: " $3}'
40-
else
41-
grep "$@"
42-
fi
43-
return $((! $?))
37+
grep -n "$@" | sed "s/^/$INVGREP_PREPEND/" | sed "s/$/$INVGREP_APPEND/" ; EXIT_STATUS=${PIPESTATUS[0]}
38+
return $((! $EXIT_STATUS))
4439
}
4540

46-
if [[ "$AZURE" == "true" ]]; then
47-
FLAKE8_FORMAT="##vso[task.logissue type=error;sourcepath=%(path)s;linenumber=%(row)s;columnnumber=%(col)s;code=%(code)s;]%(text)s"
41+
if [[ "$GITHUB_ACTIONS" == "true" ]]; then
42+
FLAKE8_FORMAT="##[error]%(path)s:%(row)s:%(col)s:%(code):%(text)s"
43+
INVGREP_PREPEND="##[error]"
4844
else
4945
FLAKE8_FORMAT="default"
5046
fi
@@ -198,15 +194,15 @@ if [[ -z "$CHECK" || "$CHECK" == "patterns" ]]; then
198194
invgrep -R --include="*.py" -P '# type: (?!ignore)' pandas
199195
RET=$(($RET + $?)) ; echo $MSG "DONE"
200196

197+
MSG='Check for use of foo.__class__ instead of type(foo)' ; echo $MSG
198+
invgrep -R --include=*.{py,pyx} '\.__class__' pandas
199+
RET=$(($RET + $?)) ; echo $MSG "DONE"
200+
201201
MSG='Check that no file in the repo contains trailing whitespaces' ; echo $MSG
202-
set -o pipefail
203-
if [[ "$AZURE" == "true" ]]; then
204-
# we exclude all c/cpp files as the c/cpp files of pandas code base are tested when Linting .c and .h files
205-
! grep -n '--exclude=*.'{svg,c,cpp,html,js} --exclude-dir=env -RI "\s$" * | awk -F ":" '{print "##vso[task.logissue type=error;sourcepath=" $1 ";linenumber=" $2 ";] Tailing whitespaces found: " $3}'
206-
else
207-
! grep -n '--exclude=*.'{svg,c,cpp,html,js} --exclude-dir=env -RI "\s$" * | awk -F ":" '{print $1 ":" $2 ":Tailing whitespaces found: " $3}'
208-
fi
202+
INVGREP_APPEND=" <- trailing whitespaces found"
203+
invgrep -RI --exclude=\*.{svg,c,cpp,html,js} --exclude-dir=env "\s$" *
209204
RET=$(($RET + $?)) ; echo $MSG "DONE"
205+
unset INVGREP_APPEND
210206
fi
211207

212208
### CODE ###
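
The rewritten `invgrep` relies on bash's `PIPESTATUS` array to recover the exit status of `grep` after its output has been piped through `sed`. A minimal sketch of that idea follows, assuming bash; `invgrep_demo`, `PREFIX`, `SUFFIX`, and the sample file are hypothetical names used only for illustration and mirror, but are not, the script's `INVGREP_PREPEND` and `INVGREP_APPEND`.

```shell
#!/usr/bin/env bash
# Inverted grep: return 1 when the pattern is found, 0 when it is absent.
# PREFIX and SUFFIX (hypothetical) decorate each matching line.
invgrep_demo() {
    grep -n "$@" | sed "s/^/${PREFIX:-}/" | sed "s/$/${SUFFIX:-}/"
    # PIPESTATUS[0] is grep's status; $? alone would report the last sed.
    local grep_status=${PIPESTATUS[0]}
    return $((! grep_status))
}

# Illustrative input: one acceptable line and one line with the unwanted pattern.
sample=$(mktemp)
printf '%s\n' 'x = type(foo)' 'y = foo.__class__' > "$sample"

PREFIX='##[error]'
if invgrep_demo '\.__class__' "$sample"; then
    echo "no unwanted pattern"
else
    echo "unwanted pattern found"
fi
```

One consequence of the `!` negation worth keeping in mind: any non-zero status from `grep`, including the status it uses for errors such as an unreadable file, is mapped to 0, so in this sketch a failing `grep` looks the same as "pattern not found".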

doc/redirects.csv

-181
Large diffs are not rendered by default.

doc/source/getting_started/basics.rst

+1-1
@@ -2006,7 +2006,7 @@ The number of columns of each type in a ``DataFrame`` can be found by calling
20062006
20072007
Numeric dtypes will propagate and can coexist in DataFrames.
20082008
If a dtype is passed (either directly via the ``dtype`` keyword, a passed ``ndarray``,
2009-
or a passed ``Series``, then it will be preserved in DataFrame operations. Furthermore,
2009+
or a passed ``Series``), then it will be preserved in DataFrame operations. Furthermore,
20102010
different numeric dtypes will **NOT** be combined. The following example will give you a taste.
20112011

20122012
.. ipython:: python

doc/source/reference/frame.rst

-2
@@ -28,7 +28,6 @@ Attributes and underlying data
2828
:toctree: api/
2929

3030
DataFrame.dtypes
31-
DataFrame.get_dtype_counts
3231
DataFrame.select_dtypes
3332
DataFrame.values
3433
DataFrame.get_values
@@ -363,7 +362,6 @@ Serialization / IO / conversion
363362
DataFrame.to_msgpack
364363
DataFrame.to_gbq
365364
DataFrame.to_records
366-
DataFrame.to_dense
367365
DataFrame.to_string
368366
DataFrame.to_clipboard
369367
DataFrame.style

doc/source/reference/indexing.rst

-4
@@ -32,7 +32,6 @@ Properties
3232
Index.has_duplicates
3333
Index.hasnans
3434
Index.dtype
35-
Index.dtype_str
3635
Index.inferred_type
3736
Index.is_all_dates
3837
Index.shape
@@ -42,9 +41,6 @@ Properties
4241
Index.ndim
4342
Index.size
4443
Index.empty
45-
Index.strides
46-
Index.itemsize
47-
Index.base
4844
Index.T
4945
Index.memory_usage
5046

doc/source/reference/series.rst

-6
@@ -33,16 +33,11 @@ Attributes
3333
Series.nbytes
3434
Series.ndim
3535
Series.size
36-
Series.strides
37-
Series.itemsize
38-
Series.base
3936
Series.T
4037
Series.memory_usage
4138
Series.hasnans
42-
Series.flags
4339
Series.empty
4440
Series.dtypes
45-
Series.data
4641
Series.name
4742
Series.put
4843

@@ -584,7 +579,6 @@ Serialization / IO / conversion
584579
Series.to_sql
585580
Series.to_msgpack
586581
Series.to_json
587-
Series.to_dense
588582
Series.to_string
589583
Series.to_clipboard
590584
Series.to_latex

doc/source/whatsnew/v0.25.1.rst

+1-1
@@ -9,7 +9,7 @@ including other versions of pandas.
99
I/O and LZMA
1010
~~~~~~~~~~~~
1111

12-
Some users may unknowingly have an incomplete Python installation lacking the `lzma` module from the standard library. In this case, `import pandas` failed due to an `ImportError` (:issue: `27575`).
12+
Some users may unknowingly have an incomplete Python installation lacking the `lzma` module from the standard library. In this case, `import pandas` failed due to an `ImportError` (:issue:`27575`).
1313
Pandas will now warn, rather than raising an `ImportError` if the `lzma` module is not present. Any subsequent attempt to use `lzma` methods will raise a `RuntimeError`.
1414
A possible fix for the lack of the `lzma` module is to ensure you have the necessary libraries and then re-install Python.
1515
For example, on MacOS installing Python with `pyenv` may lead to an incomplete Python installation due to unmet system dependencies at compilation time (like `xz`). Compilation will succeed, but Python might fail at run time. The issue can be solved by installing the necessary dependencies and then re-installing Python.

doc/source/whatsnew/v1.0.0.rst

+20-10
@@ -365,6 +365,7 @@ Deprecations
365365
is equivalent to ``arr[idx.get_loc(idx_val)] = val``, which should be used instead (:issue:`28621`).
366366
- :func:`is_extension_type` is deprecated, :func:`is_extension_array_dtype` should be used instead (:issue:`29457`)
367367
- :func:`eval` keyword argument "truediv" is deprecated and will be removed in a future version (:issue:`29812`)
368+
- :meth:`Categorical.take_nd` is deprecated, use :meth:`Categorical.take` instead (:issue:`27745`)
368369

369370
.. _whatsnew_1000.prior_deprecations:
370371

@@ -406,8 +407,9 @@ or ``matplotlib.Axes.plot``. See :ref:`plotting.formatters` for more.
406407
- Removed the previously deprecated :meth:`Index.summary` (:issue:`18217`)
407408
- Removed the previously deprecated "fastpath" keyword from the :class:`Index` constructor (:issue:`23110`)
408409
- Removed the previously deprecated :meth:`Series.get_value`, :meth:`Series.set_value`, :meth:`DataFrame.get_value`, :meth:`DataFrame.set_value` (:issue:`17739`)
409-
- Changed the the default value of `inplace` in :meth:`DataFrame.set_index` and :meth:`Series.set_axis`. It now defaults to False (:issue:`27600`)
410+
- Changed the the default value of `inplace` in :meth:`DataFrame.set_index` and :meth:`Series.set_axis`. It now defaults to ``False`` (:issue:`27600`)
410411
- Removed the previously deprecated :attr:`Series.cat.categorical`, :attr:`Series.cat.index`, :attr:`Series.cat.name` (:issue:`24751`)
412+
- Removed the previously deprecated "by" keyword from :meth:`DataFrame.sort_index`, use :meth:`DataFrame.sort_values` instead (:issue:`10726`)
411413
- Removed support for nested renaming in :meth:`DataFrame.aggregate`, :meth:`Series.aggregate`, :meth:`DataFrameGroupBy.aggregate`, :meth:`SeriesGroupBy.aggregate`, :meth:`Rolling.aggregate` (:issue:`18529`)
412414
- Passing ``datetime64`` data to :class:`TimedeltaIndex` or ``timedelta64`` data to ``DatetimeIndex`` now raises ``TypeError`` (:issue:`23539`, :issue:`23937`)
413415
- A tuple passed to :meth:`DataFrame.groupby` is now exclusively treated as a single key (:issue:`18314`)
@@ -418,6 +420,7 @@ or ``matplotlib.Axes.plot``. See :ref:`plotting.formatters` for more.
418420
- Removed :meth:`DataFrame.as_blocks`, :meth:`Series.as_blocks`, `DataFrame.blocks`, :meth:`Series.blocks` (:issue:`17656`)
419421
- :meth:`pandas.Series.str.cat` now defaults to aligning ``others``, using ``join='left'`` (:issue:`27611`)
420422
- :meth:`pandas.Series.str.cat` does not accept list-likes *within* list-likes anymore (:issue:`27611`)
423+
- :meth:`Series.where` with ``Categorical`` dtype (or :meth:`DataFrame.where` with ``Categorical`` column) no longer allows setting new categories (:issue:`24114`)
421424
- :func:`core.internals.blocks.make_block` no longer accepts the "fastpath" keyword(:issue:`19265`)
422425
- :meth:`Block.make_block_same_class` no longer accepts the "dtype" keyword(:issue:`19434`)
423426
- Removed the previously deprecated :meth:`ExtensionArray._formatting_values`. Use :attr:`ExtensionArray._formatter` instead. (:issue:`23601`)
@@ -438,6 +441,7 @@ or ``matplotlib.Axes.plot``. See :ref:`plotting.formatters` for more.
438441
- Removed previously deprecated :func:`pandas.tseries.plotting.tsplot` (:issue:`18627`)
439442
- Removed the previously deprecated ``reduce`` and ``broadcast`` arguments from :meth:`DataFrame.apply` (:issue:`18577`)
440443
- Removed the previously deprecated ``assert_raises_regex`` function in ``pandas.util.testing`` (:issue:`29174`)
444+
- Removed the previously deprecated ``FrozenNDArray`` class in ``pandas.core.indexes.frozen`` (:issue:`29335`)
441445
- Removed previously deprecated "nthreads" argument from :func:`read_feather`, use "use_threads" instead (:issue:`23053`)
442446
- Removed :meth:`Index.is_lexsorted_for_tuple` (:issue:`29305`)
443447
- Removed support for nexted renaming in :meth:`DataFrame.aggregate`, :meth:`Series.aggregate`, :meth:`DataFrameGroupBy.aggregate`, :meth:`SeriesGroupBy.aggregate`, :meth:`Rolling.aggregate` (:issue:`29608`)
@@ -451,13 +455,18 @@ or ``matplotlib.Axes.plot``. See :ref:`plotting.formatters` for more.
451455
- Removed the previously deprecated :attr:`DatetimeIndex.offset` (:issue:`20730`)
452456
- Removed the previously deprecated :meth:`DatetimeIndex.asobject`, :meth:`TimedeltaIndex.asobject`, :meth:`PeriodIndex.asobject`, use ``astype(object)`` instead (:issue:`29801`)
453457
- Removed previously deprecated "order" argument from :func:`factorize` (:issue:`19751`)
454-
- Removed previously deprecated "v" argument from :meth:`FrozenNDarray.searchsorted`, use "value" instead (:issue:`22672`)
455458
- :func:`read_stata` and :meth:`DataFrame.to_stata` no longer supports the "encoding" argument (:issue:`21400`)
456459
- In :func:`concat` the default value for ``sort`` has been changed from ``None`` to ``False`` (:issue:`20613`)
457460
- Removed previously deprecated "raise_conflict" argument from :meth:`DataFrame.update`, use "errors" instead (:issue:`23585`)
458461
- Removed previously deprecated keyword "n" from :meth:`DatetimeIndex.shift`, :meth:`TimedeltaIndex.shift`, :meth:`PeriodIndex.shift`, use "periods" instead (:issue:`22458`)
462+
- Removed the previously deprecated :meth:`Series.to_dense`, :meth:`DataFrame.to_dense` (:issue:`26684`)
463+
- Removed the previously deprecated :meth:`Index.dtype_str`, use ``str(index.dtype)`` instead (:issue:`27106`)
464+
- :meth:`Categorical.ravel` returns a :class:`Categorical` instead of a ``ndarray`` (:issue:`27199`)
465+
- Removed previously deprecated :meth:`Series.get_dtype_counts` and :meth:`DataFrame.get_dtype_counts` (:issue:`27145`)
466+
- Changed the default ``fill_value`` in :meth:`Categorical.take` from ``True`` to ``False`` (:issue:`20841`)
459467
- Changed the default value for the `raw` argument in :func:`Series.rolling().apply() <pandas.core.window.Rolling.apply>`, :func:`DataFrame.rolling().apply() <pandas.core.window.Rolling.apply>`,
460468
- :func:`Series.expanding().apply() <pandas.core.window.Expanding.apply>`, and :func:`DataFrame.expanding().apply() <pandas.core.window.Expanding.apply>` to ``False`` (:issue:`20584`)
469+
- Removed the previously deprecated :attr:`Series.base`, :attr:`Index.base`, :attr:`Categorical.base`, :attr:`Series.flags`, :attr:`Index.flags`, :attr:`PeriodArray.flags`, :attr:`Series.strides`, :attr:`Index.strides`, :attr:`Series.itemsize`, :attr:`Index.itemsize`, :attr:`Series.data`, :attr:`Index.data` (:issue:`20721`)
461470
- Changed :meth:`Timedelta.resolution` to match the behavior of the standard library ``datetime.timedelta.resolution``, for the old behavior, use :meth:`Timedelta.resolution_string` (:issue:`26839`)
462471
- Removed previously deprecated :attr:`Timestamp.weekday_name`, :attr:`DatetimeIndex.weekday_name`, and :attr:`Series.dt.weekday_name` (:issue:`18164`)
463472
- Removed previously deprecated ``errors`` argument in :meth:`Timestamp.tz_localize`, :meth:`DatetimeIndex.tz_localize`, and :meth:`Series.tz_localize` (:issue:`22644`)
@@ -488,7 +497,7 @@ Bug fixes
488497
Categorical
489498
^^^^^^^^^^^
490499

491-
- Added test to assert the :func:`fillna` raises the correct ValueError message when the value isn't a value from categories (:issue:`13628`)
500+
- Added test to assert the :func:`fillna` raises the correct ``ValueError`` message when the value isn't a value from categories (:issue:`13628`)
492501
- Bug in :meth:`Categorical.astype` where ``NaN`` values were handled incorrectly when casting to int (:issue:`28406`)
493502
- :meth:`DataFrame.reindex` with a :class:`CategoricalIndex` would fail when the targets contained duplicates, and wouldn't fail if the source contained duplicates (:issue:`28107`)
494503
- Bug in :meth:`Categorical.astype` not allowing for casting to extension dtypes (:issue:`28668`)
@@ -498,7 +507,7 @@ Categorical
498507
- Changed the error message in :meth:`Categorical.remove_categories` to always show the invalid removals as a set (:issue:`28669`)
499508
- Using date accessors on a categorical dtyped :class:`Series` of datetimes was not returning an object of the
500509
same type as if one used the :meth:`.str.` / :meth:`.dt.` on a :class:`Series` of that type. E.g. when accessing :meth:`Series.dt.tz_localize` on a
501-
:class:`Categorical` with duplicate entries, the accessor was skipping duplicates (:issue: `27952`)
510+
:class:`Categorical` with duplicate entries, the accessor was skipping duplicates (:issue:`27952`)
502511
- Bug in :meth:`DataFrame.replace` and :meth:`Series.replace` that would give incorrect results on categorical data (:issue:`26988`)
503512

504513

@@ -536,7 +545,7 @@ Timezones
536545
Numeric
537546
^^^^^^^
538547
- Bug in :meth:`DataFrame.quantile` with zero-column :class:`DataFrame` incorrectly raising (:issue:`23925`)
539-
- :class:`DataFrame` flex inequality comparisons methods (:meth:`DataFrame.lt`, :meth:`DataFrame.le`, :meth:`DataFrame.gt`, :meth: `DataFrame.ge`) with object-dtype and ``complex`` entries failing to raise ``TypeError`` like their :class:`Series` counterparts (:issue:`28079`)
548+
- :class:`DataFrame` flex inequality comparisons methods (:meth:`DataFrame.lt`, :meth:`DataFrame.le`, :meth:`DataFrame.gt`, :meth:`DataFrame.ge`) with object-dtype and ``complex`` entries failing to raise ``TypeError`` like their :class:`Series` counterparts (:issue:`28079`)
540549
- Bug in :class:`DataFrame` logical operations (`&`, `|`, `^`) not matching :class:`Series` behavior by filling NA values (:issue:`28741`)
541550
- Bug in :meth:`DataFrame.interpolate` where specifying axis by name references variable before it is assigned (:issue:`29142`)
542551
- Bug in :meth:`Series.var` not computing the right value with a nullable integer dtype series not passing through ddof argument (:issue:`29128`)
@@ -633,21 +642,22 @@ Groupby/resample/rolling
633642
-
634643
- Bug in :meth:`DataFrame.groupby` with multiple groups where an ``IndexError`` would be raised if any group contained all NA values (:issue:`20519`)
635644
- Bug in :meth:`pandas.core.resample.Resampler.size` and :meth:`pandas.core.resample.Resampler.count` returning wrong dtype when used with an empty series or dataframe (:issue:`28427`)
636-
- Bug in :meth:`DataFrame.rolling` not allowing for rolling over datetimes when ``axis=1`` (:issue: `28192`)
637-
- Bug in :meth:`DataFrame.rolling` not allowing rolling over multi-index levels (:issue: `15584`).
638-
- Bug in :meth:`DataFrame.rolling` not allowing rolling on monotonic decreasing time indexes (:issue: `19248`).
645+
- Bug in :meth:`DataFrame.rolling` not allowing for rolling over datetimes when ``axis=1`` (:issue:`28192`)
646+
- Bug in :meth:`DataFrame.rolling` not allowing rolling over multi-index levels (:issue:`15584`).
647+
- Bug in :meth:`DataFrame.rolling` not allowing rolling on monotonic decreasing time indexes (:issue:`19248`).
639648
- Bug in :meth:`DataFrame.groupby` not offering selection by column name when ``axis=1`` (:issue:`27614`)
640649
- Bug in :meth:`DataFrameGroupby.agg` not able to use lambda function with named aggregation (:issue:`27519`)
641650
- Bug in :meth:`DataFrame.groupby` losing column name information when grouping by a categorical column (:issue:`28787`)
642651
- Bug in :meth:`DataFrameGroupBy.rolling().quantile()` ignoring ``interpolation`` keyword argument (:issue:`28779`)
643652
- Bug in :meth:`DataFrame.groupby` where ``any``, ``all``, ``nunique`` and transform functions would incorrectly handle duplicate column labels (:issue:`21668`)
653+
- Bug in :meth:`DataFrameGroupBy.agg` with timezone-aware datetime64 column incorrectly casting results to the original dtype (:issue:`29641`)
644654
-
645655

646656
Reshaping
647657
^^^^^^^^^
648658

649659
- Bug in :meth:`DataFrame.apply` that caused incorrect output with empty :class:`DataFrame` (:issue:`28202`, :issue:`21959`)
650-
- Bug in :meth:`DataFrame.stack` not handling non-unique indexes correctly when creating MultiIndex (:issue: `28301`)
660+
- Bug in :meth:`DataFrame.stack` not handling non-unique indexes correctly when creating MultiIndex (:issue:`28301`)
651661
- Bug in :meth:`pivot_table` not returning correct type ``float`` when ``margins=True`` and ``aggfunc='mean'`` (:issue:`24893`)
652662
- Bug :func:`merge_asof` could not use :class:`datetime.timedelta` for ``tolerance`` kwarg (:issue:`28098`)
653663
- Bug in :func:`merge`, did not append suffixes correctly with MultiIndex (:issue:`28518`)
@@ -680,7 +690,7 @@ Other
680690
- :meth:`DataFrame.to_csv` and :meth:`Series.to_csv` now support dicts as ``compression`` argument with key ``'method'`` being the compression method and others as additional compression options when the compression method is ``'zip'``. (:issue:`26023`)
681691
- Bug in :meth:`Series.diff` where a boolean series would incorrectly raise a ``TypeError`` (:issue:`17294`)
682692
- :meth:`Series.append` will no longer raise a ``TypeError`` when passed a tuple of ``Series`` (:issue:`28410`)
683-
- :meth:`SeriesGroupBy.value_counts` will be able to handle the case even when the :class:`Grouper` makes empty groups (:issue: 28479)
693+
- :meth:`SeriesGroupBy.value_counts` will be able to handle the case even when the :class:`Grouper` makes empty groups (:issue:`28479`)
684694
- Fix corrupted error message when calling ``pandas.libs._json.encode()`` on a 0d array (:issue:`18878`)
685695
- Bug in :meth:`DataFrame.append` that raised ``IndexError`` when appending with empty list (:issue:`28769`)
686696
- Fix :class:`AbstractHolidayCalendar` to return correct results for
