Commit 5fba421

Merge pull request #88 from pandas-dev/master
Sync Fork from Upstream Repo
2 parents 517fba6 + 05925d2 commit 5fba421

15 files changed, +86 -53 lines changed

doc/source/getting_started/comparison/comparison_with_sql.rst

+1 -1
@@ -75,7 +75,7 @@ Filtering in SQL is done via a WHERE clause.
     LIMIT 5;
 
 DataFrames can be filtered in multiple ways; the most intuitive of which is using
-`boolean indexing <https://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing>`_.
+:ref:`boolean indexing <indexing.boolean>`
 
 .. ipython:: python
 

doc/source/user_guide/cookbook.rst

+1 -2
@@ -794,8 +794,7 @@ The :ref:`Resample <timeseries.resampling>` docs.
 `Time grouping with some missing values
 <https://stackoverflow.com/questions/33637312/pandas-grouper-by-frequency-with-completeness-requirement>`__
 
-`Valid frequency arguments to Grouper
-<https://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__
+Valid frequency arguments to Grouper :ref:`Timeseries <timeseries.offset_aliases>`
 
 `Grouping using a MultiIndex
 <https://stackoverflow.com/questions/41483763/pandas-timegrouper-on-multiindex>`__

doc/source/whatsnew/v1.0.2.rst

+19 -7
@@ -1,6 +1,6 @@
 .. _whatsnew_102:
 
-What's new in 1.0.2 (March 11, 2020)
+What's new in 1.0.2 (March 12, 2020)
 ------------------------------------
 
 These are the changes in pandas 1.0.2. See :ref:`release` for a full changelog
@@ -15,22 +15,34 @@ including other versions of pandas.
 Fixed regressions
 ~~~~~~~~~~~~~~~~~
 
-- Fixed regression in :meth:`DataFrame.to_excel` when ``columns`` kwarg is passed (:issue:`31677`)
-- Fixed regression in :meth:`Series.align` when ``other`` is a DataFrame and ``method`` is not None (:issue:`31785`)
+**Groupby**
+
 - Fixed regression in :meth:`groupby(..).agg() <pandas.core.groupby.GroupBy.agg>` which was failing on frames with MultiIndex columns and a custom function (:issue:`31777`)
 - Fixed regression in ``groupby(..).rolling(..).apply()`` (``RollingGroupby``) where the ``raw`` parameter was ignored (:issue:`31754`)
 - Fixed regression in :meth:`rolling(..).corr() <pandas.core.window.rolling.Rolling.corr>` when using a time offset (:issue:`31789`)
 - Fixed regression in :meth:`groupby(..).nunique() <pandas.core.groupby.DataFrameGroupBy.nunique>` which was modifying the original values if ``NaN`` values were present (:issue:`31950`)
 - Fixed regression in ``DataFrame.groupby`` raising a ``ValueError`` from an internal operation (:issue:`31802`)
-- Fixed regression where :func:`read_pickle` raised a ``UnicodeDecodeError`` when reading a py27 pickle with :class:`MultiIndex` column (:issue:`31988`).
-- Fixed regression in :class:`DataFrame` arithmetic operations with mis-matched columns (:issue:`31623`)
 - Fixed regression in :meth:`groupby(..).agg() <pandas.core.groupby.GroupBy.agg>` calling a user-provided function an extra time on an empty input (:issue:`31760`)
-- Fixed regression in joining on :class:`DatetimeIndex` or :class:`TimedeltaIndex` to preserve ``freq`` in simple cases (:issue:`32166`)
-- Fixed regression in the repr of an object-dtype :class:`Index` with bools and missing values (:issue:`32146`)
+
+**I/O**
+
 - Fixed regression in :meth:`read_csv` in which the ``encoding`` option was not recognized with certain file-like objects (:issue:`31819`)
+- Fixed regression in :meth:`DataFrame.to_excel` when the ``columns`` keyword argument is passed (:issue:`31677`)
+- Fixed regression in :class:`ExcelFile` where the stream passed into the function was closed by the destructor. (:issue:`31467`)
+- Fixed regression where :func:`read_pickle` raised a ``UnicodeDecodeError`` when reading a py27 pickle with :class:`MultiIndex` column (:issue:`31988`).
+
+**Reindexing/alignment**
+
+- Fixed regression in :meth:`Series.align` when ``other`` is a DataFrame and ``method`` is not None (:issue:`31785`)
 - Fixed regression in :meth:`DataFrame.reindex` and :meth:`Series.reindex` when reindexing with (tz-aware) index and ``method=nearest`` (:issue:`26683`)
 - Fixed regression in :meth:`DataFrame.reindex_like` on a :class:`DataFrame` subclass raised an ``AssertionError`` (:issue:`31925`)
+- Fixed regression in :class:`DataFrame` arithmetic operations with mis-matched columns (:issue:`31623`)
+
+**Other**
+
+- Fixed regression in joining on :class:`DatetimeIndex` or :class:`TimedeltaIndex` to preserve ``freq`` in simple cases (:issue:`32166`)
 - Fixed regression in :meth:`Series.shift` with ``datetime64`` dtype when passing an integer ``fill_value`` (:issue:`32591`)
+- Fixed regression in the repr of an object-dtype :class:`Index` with bools and missing values (:issue:`32146`)
 
 
 .. ---------------------------------------------------------------------------

doc/source/whatsnew/v1.1.0.rst

+5 -5
@@ -89,9 +89,9 @@ Backwards incompatible API changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 - :meth:`DataFrame.swaplevels` now raises a ``TypeError`` if the axis is not a :class:`MultiIndex`.
   Previously a ``AttributeError`` was raised (:issue:`31126`)
-- :meth:`DataFrameGroupby.mean` and :meth:`SeriesGroupby.mean` (and similarly for :meth:`~DataFrameGroupby.median`, :meth:`~DataFrameGroupby.std`` and :meth:`~DataFrameGroupby.var``)
+- :meth:`DataFrameGroupby.mean` and :meth:`SeriesGroupby.mean` (and similarly for :meth:`~DataFrameGroupby.median`, :meth:`~DataFrameGroupby.std` and :meth:`~DataFrameGroupby.var`)
   now raise a ``TypeError`` if a not-accepted keyword argument is passed into it.
-  Previously a ``UnsupportedFunctionCall`` was raised (``AssertionError`` if ``min_count`` passed into :meth:`~DataFrameGroupby.median``) (:issue:`31485`)
+  Previously a ``UnsupportedFunctionCall`` was raised (``AssertionError`` if ``min_count`` passed into :meth:`~DataFrameGroupby.median`) (:issue:`31485`)
 - :meth:`DataFrame.at` and :meth:`Series.at` will raise a ``TypeError`` instead of a ``ValueError`` if an incompatible key is passed, and ``KeyError`` if a missing key is passed, matching the behavior of ``.loc[]`` (:issue:`31722`)
 - Passing an integer dtype other than ``int64`` to ``np.array(period_index, dtype=...)`` will now raise ``TypeError`` instead of incorrectly using ``int64`` (:issue:`32255`)
 -
@@ -188,9 +188,9 @@ Performance improvements
 - Performance improvement in :class:`Timedelta` constructor (:issue:`30543`)
 - Performance improvement in :class:`Timestamp` constructor (:issue:`30543`)
 - Performance improvement in flex arithmetic ops between :class:`DataFrame` and :class:`Series` with ``axis=0`` (:issue:`31296`)
-- The internal :meth:`Index._shallow_copy` now copies cached attributes over to the new index,
-  avoiding creating these again on the new index. This can speed up many operations
-  that depend on creating copies of existing indexes (:issue:`28584`)
+- The internal index method :meth:`~Index._shallow_copy` now copies cached attributes over to the new index,
+  avoiding creating these again on the new index. This can speed up many operations that depend on creating copies of
+  existing indexes (:issue:`28584`, :issue:`32640`)
 
 .. ---------------------------------------------------------------------------
 

doc/sphinxext/announce.py

+7 -6
@@ -122,14 +122,15 @@ def build_string(revision_range, heading="Contributors"):
     components["uline"] = "=" * len(components["heading"])
     components["authors"] = "* " + "\n* ".join(components["authors"])
 
+    # Don't change this to an fstring. It breaks the formatting.
     tpl = textwrap.dedent(
-        f"""\
-    {components['heading']}
-    {components['uline']}
+        """\
+    {heading}
+    {uline}
 
-    {components['author_message']}
-    {components['authors']}"""
-    )
+    {author_message}
+    {authors}"""
+    ).format(**components)
     return tpl
 
 
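The comment added above warns against switching back to an f-string. A minimal sketch of the likely reason, using made-up contributor data (the names and message below are hypothetical, not taken from the commit): an f-string interpolates the values before `textwrap.dedent` runs, and the joined author list contains lines with no leading whitespace, so `dedent` finds no common indentation to strip.

```python
import textwrap

# Hypothetical sample data shaped like the `components` dict in build_string.
components = {
    "heading": "Contributors",
    "uline": "=" * len("Contributors"),
    "author_message": "A total of 2 people contributed.",
    "authors": "* " + "\n* ".join(["Alice", "Bob"]),
}

# f-string variant: interpolation happens first, so the text handed to
# dedent() contains the unindented line "* Bob" and nothing is stripped.
broken = textwrap.dedent(
    f"""\
    {components['heading']}
    {components['uline']}

    {components['author_message']}
    {components['authors']}"""
)

# Variant used in the commit: dedent the template first, then fill it in.
fixed = textwrap.dedent(
    """\
    {heading}
    {uline}

    {author_message}
    {authors}"""
).format(**components)

print(repr(broken.splitlines()[0]))  # heading keeps its leading spaces
print(repr(fixed.splitlines()[0]))   # heading starts at column 0
```

Dedenting the template before substitution keeps the result independent of whatever line breaks the substituted values contain.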

pandas/_libs/tslibs/np_datetime.pyx

+10 -7
@@ -1,12 +1,15 @@
 from cpython.object cimport Py_EQ, Py_NE, Py_GE, Py_GT, Py_LT, Py_LE
 
-from cpython.datetime cimport (datetime, date,
-                               PyDateTime_IMPORT,
-                               PyDateTime_GET_YEAR, PyDateTime_GET_MONTH,
-                               PyDateTime_GET_DAY, PyDateTime_DATE_GET_HOUR,
-                               PyDateTime_DATE_GET_MINUTE,
-                               PyDateTime_DATE_GET_SECOND,
-                               PyDateTime_DATE_GET_MICROSECOND)
+from cpython.datetime cimport (
+    PyDateTime_DATE_GET_HOUR,
+    PyDateTime_DATE_GET_MICROSECOND,
+    PyDateTime_DATE_GET_MINUTE,
+    PyDateTime_DATE_GET_SECOND,
+    PyDateTime_GET_DAY,
+    PyDateTime_GET_MONTH,
+    PyDateTime_GET_YEAR,
+    PyDateTime_IMPORT,
+)
 PyDateTime_IMPORT
 
 from numpy cimport int64_t

pandas/core/indexes/category.py

+4 -8
@@ -233,6 +233,7 @@ def _simple_new(cls, values: Categorical, name: Label = None):
 
         result._data = values
         result.name = name
+        result._cache = {}
 
         result._reset_identity()
         result._no_setting_name = False
@@ -242,14 +243,9 @@ def _simple_new(cls, values: Categorical, name: Label = None):
 
     @Appender(Index._shallow_copy.__doc__)
     def _shallow_copy(self, values=None, name: Label = no_default):
-        name = self.name if name is no_default else name
-
-        if values is None:
-            values = self.values
-
-        cat = Categorical(values, dtype=self.dtype)
-
-        return type(self)._simple_new(cat, name=name)
+        if values is not None:
+            values = Categorical(values, dtype=self.dtype)
+        return super()._shallow_copy(values=values, name=name)
 
     def _is_dtype_compat(self, other) -> bool:
         """

pandas/core/indexes/datetimelike.py

+4 -1
@@ -617,6 +617,7 @@ def _set_freq(self, freq):
 
     def _shallow_copy(self, values=None, name: Label = lib.no_default):
         name = self.name if name is lib.no_default else name
+        cache = self._cache.copy() if values is None else {}
 
         if values is None:
             values = self._data
@@ -635,7 +636,9 @@ def _shallow_copy(self, values=None, name: Label = lib.no_default):
             del attributes["freq"]
 
         attributes["name"] = name
-        return type(self)._simple_new(values, **attributes)
+        result = self._simple_new(values, **attributes)
+        result._cache = cache
+        return result
 
     # --------------------------------------------------------------------
     # Set Operation Methods
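The `_shallow_copy` changes in this commit follow one pattern: cached attributes are carried over only when the copy reuses the same values, and the cache starts empty when new values are supplied. The sketch below illustrates that rule with a hypothetical `TinyIndex` class; it is not pandas code, and `is_unique` stands in for any lazily cached property.

```python
class TinyIndex:
    """Hypothetical stand-in for an index class; not the pandas implementation."""

    def __init__(self, values, name=None):
        self._data = list(values)
        self.name = name
        self._cache = {}  # every new object starts with an empty cache

    @property
    def is_unique(self):
        # Compute once, then serve the result from the cache.
        if "is_unique" not in self._cache:
            self._cache["is_unique"] = len(set(self._data)) == len(self._data)
        return self._cache["is_unique"]

    def _shallow_copy(self, values=None, name=None):
        # Reuse cached results only if the underlying values are unchanged.
        cache = self._cache.copy() if values is None else {}
        if values is None:
            values = self._data
        result = type(self)(values, name=self.name if name is None else name)
        result._cache = cache
        return result


idx = TinyIndex([1, 2, 3], name="a")
idx.is_unique                              # populates idx._cache
renamed = idx._shallow_copy(name="b")      # same values: cache is copied
replaced = idx._shallow_copy(values=[1, 1])  # new values: cache is reset
```

Copying the dictionary rather than sharing it means later cache entries on one object do not leak into the other.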

pandas/core/indexes/datetimes.py

+1
@@ -268,6 +268,7 @@ def _simple_new(cls, values, name=None, freq=None, tz=None, dtype=None):
         result = object.__new__(cls)
         result._data = dtarr
         result.name = name
+        result._cache = {}
         result._no_setting_name = False
         # For groupby perf. See note in indexes/base about _index_data
         result._index_data = dtarr._data

pandas/core/indexes/interval.py

+8 -4
@@ -243,6 +243,7 @@ def _simple_new(cls, array: IntervalArray, name: Label = None):
         result = IntervalMixin.__new__(cls)
         result._data = array
         result.name = name
+        result._cache = {}
         result._no_setting_name = False
         result._reset_identity()
         return result
@@ -332,12 +333,15 @@ def from_tuples(
     # --------------------------------------------------------------------
 
     @Appender(Index._shallow_copy.__doc__)
-    def _shallow_copy(self, values=None, **kwargs):
+    def _shallow_copy(self, values=None, name: Label = lib.no_default):
+        name = self.name if name is lib.no_default else name
+        cache = self._cache.copy() if values is None else {}
         if values is None:
             values = self._data
-        attributes = self._get_attributes_dict()
-        attributes.update(kwargs)
-        return self._simple_new(values, **attributes)
+
+        result = self._simple_new(values, name=name)
+        result._cache = cache
+        return result
 
     @cache_readonly
     def _isnan(self):

pandas/core/indexes/period.py

+1
@@ -233,6 +233,7 @@ def _simple_new(cls, values: PeriodArray, name: Label = None):
         # For groupby perf. See note in indexes/base about _index_data
         result._index_data = values._data
         result.name = name
+        result._cache = {}
         result._reset_identity()
         return result
 

pandas/core/indexes/timedeltas.py

+1
@@ -180,6 +180,7 @@ def _simple_new(cls, values, name=None, freq=None, dtype=_TD_DTYPE):
         result = object.__new__(cls)
         result._data = values
         result._name = name
+        result._cache = {}
         # For groupby perf. See note in indexes/base about _index_data
         result._index_data = values._data
 

pandas/io/excel/_base.py

+4 -8
@@ -366,6 +366,9 @@ def _workbook_class(self):
     def load_workbook(self, filepath_or_buffer):
         pass
 
+    def close(self):
+        pass
+
     @property
     @abc.abstractmethod
     def sheet_names(self):
@@ -895,14 +898,7 @@ def sheet_names(self):
 
     def close(self):
         """close io if necessary"""
-        if self.engine == "openpyxl":
-            # https://stackoverflow.com/questions/31416842/
-            # openpyxl-does-not-close-excel-workbook-in-read-only-mode
-            wb = self.book
-            wb._archive.close()
-
-        if hasattr(self.io, "close"):
-            self.io.close()
+        self._reader.close()
 
     def __enter__(self):
         return self

pandas/io/excel/_openpyxl.py

+5
@@ -492,6 +492,11 @@ def load_workbook(self, filepath_or_buffer: FilePathOrBuffer):
             filepath_or_buffer, read_only=True, data_only=True, keep_links=False
         )
 
+    def close(self):
+        # https://stackoverflow.com/questions/31416842/
+        # openpyxl-does-not-close-excel-workbook-in-read-only-mode
+        self.book.close()
+
     @property
     def sheet_names(self) -> List[str]:
         return self.book.sheetnames
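Taken together, the `_base.py` and `_openpyxl.py` hunks move engine-specific cleanup out of the container's `close()` and into each reader, with a no-op default on the base reader. The sketch below mirrors that delegation with hypothetical classes (`FakeBook`, `BaseReader`, `BookReader`, `Container`); none of them are pandas or openpyxl classes, and the real `book.close()` may do more than set a flag.

```python
import io


class FakeBook:
    """Hypothetical workbook object that records whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class BaseReader:
    """Hypothetical base reader: close() is a no-op unless overridden."""

    def __init__(self, stream):
        self.book = self.load_workbook(stream)

    def load_workbook(self, stream):
        return None

    def close(self):
        pass


class BookReader(BaseReader):
    """Reader whose engine needs explicit cleanup of its workbook."""

    def load_workbook(self, stream):
        return FakeBook()

    def close(self):
        self.book.close()


class Container:
    """Plays the role of the file wrapper: it only delegates to its reader."""

    def __init__(self, stream, reader_cls):
        self._reader = reader_cls(stream)

    def close(self):
        self._reader.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()


stream = io.BytesIO(b"payload")
with Container(stream, BookReader) as container:
    pass

# The reader cleaned up its own workbook; the caller's stream is untouched.
print(container._reader.book.closed, stream.closed)
```

In this sketch the container never closes the stream it was given, so the caller keeps ownership of it, which matches the intent stated in the 1.0.2 release note for issue 31467.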

pandas/tests/io/excel/test_readers.py

+15 -4
@@ -629,6 +629,17 @@ def test_read_from_py_localpath(self, read_ext):
 
         tm.assert_frame_equal(expected, actual)
 
+    @td.check_file_leaks
+    def test_close_from_py_localpath(self, read_ext):
+
+        # GH31467
+        str_path = os.path.join("test1" + read_ext)
+        with open(str_path, "rb") as f:
+            x = pd.read_excel(f, "Sheet1", index_col=0)
+            del x
+            # should not throw an exception because the passed file was closed
+            f.read()
+
     def test_reader_seconds(self, read_ext):
         if pd.read_excel.keywords["engine"] == "pyxlsb":
             pytest.xfail("Sheets containing datetimes not supported by pyxlsb")
@@ -1020,10 +1031,10 @@ def test_excel_read_buffer(self, engine, read_ext):
         tm.assert_frame_equal(expected, actual)
 
     def test_reader_closes_file(self, engine, read_ext):
-        f = open("test1" + read_ext, "rb")
-        with pd.ExcelFile(f) as xlsx:
-            # parses okay
-            pd.read_excel(xlsx, "Sheet1", index_col=0, engine=engine)
+        with open("test1" + read_ext, "rb") as f:
+            with pd.ExcelFile(f) as xlsx:
+                # parses okay
+                pd.read_excel(xlsx, "Sheet1", index_col=0, engine=engine)
 
         assert f.closed
 
