
Commit 1869717

Author: MomIsBestFriend
Commit message: Merge remote-tracking branch 'upstream/master' into CI-unwanted-test-str-concat
Parents: 5e272bc + f9fb02e


68 files changed: 1691 additions, 1111 deletions

.travis.yml

5 deletions

@@ -48,17 +48,12 @@ matrix:
         - mysql
         - postgresql

-    # In allow_failures
     - env:
         - JOB="3.6, slow" ENV_FILE="ci/deps/travis-36-slow.yaml" PATTERN="slow" SQL="1"
       services:
         - mysql
         - postgresql

-  allow_failures:
-    - env:
-        - JOB="3.6, slow" ENV_FILE="ci/deps/travis-36-slow.yaml" PATTERN="slow" SQL="1"
-
 before_install:
   - echo "before_install"
   # set non-blocking IO on travis

ci/azure/windows.yml

1 addition, 1 deletion

@@ -34,7 +34,7 @@ jobs:
     - bash: |
         source activate pandas-dev
         conda list
-        python setup.py build_ext -q -i
+        python setup.py build_ext -q -i -j 4
         python -m pip install --no-build-isolation -e .
       displayName: 'Build'


ci/deps/azure-36-locale_slow.yaml

1 addition, 1 deletion

@@ -13,7 +13,7 @@ dependencies:
   - pytest-azurepipelines

   # pandas dependencies
-  - beautifulsoup4==4.6.0
+  - beautifulsoup4=4.6.0
   - bottleneck=1.2.*
   - lxml
   - matplotlib=2.2.2

doc/source/getting_started/10min.rst

2 additions, 1 deletion

@@ -697,8 +697,9 @@ Plotting

 See the :ref:`Plotting <visualization>` docs.

+We use the standard convention for referencing the matplotlib API:
+
 .. ipython:: python
-   :suppress:

    import matplotlib.pyplot as plt
    plt.close('all')
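
The convention the doc change now states explicitly is the usual ``plt`` alias for pyplot. A minimal sketch of a pandas plot using that convention; this example is not part of the commit and assumes only standard pandas/matplotlib behavior:

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    # Plot a random-walk Series using the plt alias referenced in the docs.
    ts = pd.Series(np.random.randn(100), index=pd.date_range("2000-01-01", periods=100))
    ts.cumsum().plot()
    plt.close("all")  # mirrors the cleanup call shown in the doc snippet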

doc/source/whatsnew/v1.0.0.rst

8 additions, 2 deletions

@@ -221,8 +221,8 @@ Other enhancements
 - DataFrame constructor preserve `ExtensionArray` dtype with `ExtensionArray` (:issue:`11363`)
 - :meth:`DataFrame.sort_values` and :meth:`Series.sort_values` have gained ``ignore_index`` keyword to be able to reset index after sorting (:issue:`30114`)
 - :meth:`DataFrame.to_markdown` and :meth:`Series.to_markdown` added (:issue:`11052`)
-
 - :meth:`DataFrame.drop_duplicates` has gained ``ignore_index`` keyword to reset index (:issue:`30114`)
+- Added new writer for exporting Stata dta files in version 118, ``StataWriter118``. This format supports exporting strings containing Unicode characters (:issue:`23573`)

 Build Changes
 ^^^^^^^^^^^^^

@@ -844,6 +844,7 @@ Interval

 - Bug in :meth:`IntervalIndex.get_indexer` where a :class:`Categorical` or :class:`CategoricalIndex` ``target`` would incorrectly raise a ``TypeError`` (:issue:`30063`)
 - Bug in ``pandas.core.dtypes.cast.infer_dtype_from_scalar`` where passing ``pandas_dtype=True`` did not infer :class:`IntervalDtype` (:issue:`30337`)
+- Bug in :class:`IntervalDtype` where the ``kind`` attribute was incorrectly set as ``None`` instead of ``"O"`` (:issue:`30568`)

 Indexing
 ^^^^^^^^

@@ -892,6 +893,7 @@ I/O
 - Bug in :func:`read_json` where default encoding was not set to ``utf-8`` (:issue:`29565`)
 - Bug in :class:`PythonParser` where str and bytes were being mixed when dealing with the decimal field (:issue:`29650`)
 - :meth:`read_gbq` now accepts ``progress_bar_type`` to display progress bar while the data downloads. (:issue:`29857`)
+- Bug in :func:`pandas.io.json.json_normalize` where a missing value in the location specified by `record_path` would raise a ``TypeError`` (:issue:`30148`)

 Plotting
 ^^^^^^^^

@@ -907,6 +909,7 @@ Plotting
 - :func:`set_option` now validates that the plot backend provided to ``'plotting.backend'`` implements the backend when the option is set, rather than when a plot is created (:issue:`28163`)
 - :meth:`DataFrame.plot` now allow a ``backend`` keyword argument to allow changing between backends in one session (:issue:`28619`).
 - Bug in color validation incorrectly raising for non-color styles (:issue:`29122`).
+- Allow :meth: `DataFrame.plot.scatter` to plot ``objects`` and ``datetime`` type data (:issue:`18755`, :issue:`30391`)
 - Bug in :meth:`DataFrame.hist`, ``xrot=0`` does not work with ``by`` and subplots (:issue:`30288`).

 Groupby/resample/rolling

@@ -929,6 +932,7 @@ Groupby/resample/rolling
 - Bug in :meth:`DataFrame.groupby` when using axis=1 and having a single level columns index (:issue:`30208`)
 - Bug in :meth:`DataFrame.groupby` when using nunique on axis=1 (:issue:`30253`)
 - Bug in :meth:`GroupBy.quantile` with multiple list-like q value and integer column names (:issue:`30289`)
+- Bug in :meth:`GroupBy.pct_change` and :meth:`SeriesGroupBy.pct_change` causes ``TypeError`` when ``fill_method`` is ``None`` (:issue:`30463`)

 Reshaping
 ^^^^^^^^^

@@ -971,13 +975,15 @@ Other
 - Bug in :meth:`Series.diff` where a boolean series would incorrectly raise a ``TypeError`` (:issue:`17294`)
 - :meth:`Series.append` will no longer raise a ``TypeError`` when passed a tuple of ``Series`` (:issue:`28410`)
 - Fix corrupted error message when calling ``pandas.libs._json.encode()`` on a 0d array (:issue:`18878`)
+- Bug in ``pd.core.util.hashing.hash_pandas_object`` where arrays containing tuples were incorrectly treated as non-hashable (:issue:`28969`)
 - Bug in :meth:`DataFrame.append` that raised ``IndexError`` when appending with empty list (:issue:`28769`)
 - Fix :class:`AbstractHolidayCalendar` to return correct results for
   years after 2030 (now goes up to 2200) (:issue:`27790`)
 - Fixed :class:`IntegerArray` returning ``inf`` rather than ``NaN`` for operations dividing by 0 (:issue:`27398`)
 - Fixed ``pow`` operations for :class:`IntegerArray` when the other value is ``0`` or ``1`` (:issue:`29997`)
 - Bug in :meth:`Series.count` raises if use_inf_as_na is enabled (:issue:`29478`)
-- Bug in :class:`Index` where a non-hashable name could be set without raising ``TypeError`` (:issue:29069`)
+- Bug in :class:`Index` where a non-hashable name could be set without raising ``TypeError`` (:issue:`29069`)
+

 .. _whatsnew_1000.contributors:
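
To illustrate one of the whatsnew entries above (the :issue:`30463` fix for ``GroupBy.pct_change``), a small hedged example; the expected values reflect the behavior described in the note and are not taken from the commit itself:

    import pandas as pd

    df = pd.DataFrame({"grp": ["a", "a", "b", "b"], "val": [1.0, 2.0, 4.0, 6.0]})
    # Before the fix this raised TypeError when fill_method is None;
    # afterwards it returns the per-group percentage change.
    print(df.groupby("grp")["val"].pct_change(fill_method=None))
    # Expected (per the note): NaN, 1.0 for group "a" and NaN, 0.5 for group "b".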

pandas/_libs/hashing.pyx

6 additions

@@ -70,6 +70,12 @@ def hash_object_array(object[:] arr, object key, object encoding='utf8'):
             # null, stringify and encode
             data = <bytes>str(val).encode(encoding)

+        elif isinstance(val, tuple):
+            # GH#28969 we could have a tuple, but need to ensure that
+            # the tuple entries are themselves hashable before converting
+            # to str
+            hash(val)
+            data = <bytes>str(val).encode(encoding)
         else:
             raise TypeError(f"{val} of type {type(val)} is not a valid type "
                             "for hashing, must be string or null")

pandas/_libs/intervaltree.pxi.in

33 additions, 8 deletions

@@ -6,12 +6,20 @@ WARNING: DO NOT edit .pxi FILE directly, .pxi is generated from .pxi.in

 from pandas._libs.algos import is_monotonic

-ctypedef fused scalar_t:
-    float64_t
-    float32_t
+ctypedef fused int_scalar_t:
     int64_t
     int32_t
+    float64_t
+    float32_t
+
+ctypedef fused uint_scalar_t:
     uint64_t
+    float64_t
+    float32_t
+
+ctypedef fused scalar_t:
+    int_scalar_t
+    uint_scalar_t

 # ----------------------------------------------------------------------
 # IntervalTree

@@ -128,7 +136,12 @@ cdef class IntervalTree(IntervalMixin):
         result = Int64Vector()
         old_len = 0
         for i in range(len(target)):
-            self.root.query(result, target[i])
+            try:
+                self.root.query(result, target[i])
+            except OverflowError:
+                # overflow -> no match, which is already handled below
+                pass
+
             if result.data.n == old_len:
                 result.append(-1)
             elif result.data.n > old_len + 1:

@@ -150,7 +163,12 @@ cdef class IntervalTree(IntervalMixin):
         missing = Int64Vector()
         old_len = 0
         for i in range(len(target)):
-            self.root.query(result, target[i])
+            try:
+                self.root.query(result, target[i])
+            except OverflowError:
+                # overflow -> no match, which is already handled below
+                pass
+
             if result.data.n == old_len:
                 result.append(-1)
                 missing.append(i)

@@ -202,19 +220,26 @@ for dtype in ['float32', 'float64', 'int32', 'int64', 'uint64']:
                                         ('neither', '<', '<')]:
         cmp_left_converse = '<' if cmp_left == '<=' else '<='
         cmp_right_converse = '<' if cmp_right == '<=' else '<='
+        if dtype.startswith('int'):
+            fused_prefix = 'int_'
+        elif dtype.startswith('uint'):
+            fused_prefix = 'uint_'
+        elif dtype.startswith('float'):
+            fused_prefix = ''
         nodes.append((dtype, dtype.title(),
                       closed, closed.title(),
                       cmp_left,
                       cmp_right,
                       cmp_left_converse,
-                      cmp_right_converse))
+                      cmp_right_converse,
+                      fused_prefix))

 }}

 NODE_CLASSES = {}

 {{for dtype, dtype_title, closed, closed_title, cmp_left, cmp_right,
-      cmp_left_converse, cmp_right_converse in nodes}}
+      cmp_left_converse, cmp_right_converse, fused_prefix in nodes}}

 cdef class {{dtype_title}}Closed{{closed_title}}IntervalNode:
     """Non-terminal node for an IntervalTree

@@ -317,7 +342,7 @@ cdef class {{dtype_title}}Closed{{closed_title}}IntervalNode:
     @cython.wraparound(False)
     @cython.boundscheck(False)
     @cython.initializedcheck(False)
-    cpdef query(self, Int64Vector result, scalar_t point):
+    cpdef query(self, Int64Vector result, {{fused_prefix}}scalar_t point):
         """Recursively query this node and its sub-nodes for intervals that
         overlap with the query point.
         """

pandas/_typing.py

13 additions, 5 deletions

@@ -23,21 +23,29 @@
     from pandas.core.indexes.base import Index  # noqa: F401
     from pandas.core.series import Series  # noqa: F401
     from pandas.core.generic import NDFrame  # noqa: F401
+    from pandas import Interval  # noqa: F401

+# array-like

 AnyArrayLike = TypeVar("AnyArrayLike", "ExtensionArray", "Index", "Series", np.ndarray)
 ArrayLike = TypeVar("ArrayLike", "ExtensionArray", np.ndarray)
+
+# scalars
+
+PythonScalar = Union[str, int, float, bool]
 DatetimeLikeScalar = TypeVar("DatetimeLikeScalar", "Period", "Timestamp", "Timedelta")
+PandasScalar = Union["Period", "Timestamp", "Timedelta", "Interval"]
+Scalar = Union[PythonScalar, PandasScalar]
+
+# other
+
 Dtype = Union[str, np.dtype, "ExtensionDtype"]
 FilePathOrBuffer = Union[str, Path, IO[AnyStr]]
-
 FrameOrSeries = TypeVar("FrameOrSeries", bound="NDFrame")
-Scalar = Union[str, int, float, bool]
 Axis = Union[str, int]
 Ordered = Optional[bool]
-JSONSerializable = Union[Scalar, List, Dict]
-
+JSONSerializable = Union[PythonScalar, List, Dict]
 Axes = Collection

 # to maintain type information across generic functions and parametrization
-_T = TypeVar("_T")
+T = TypeVar("T")
pandas/compat/pickle_compat.py

2 additions, 2 deletions

@@ -169,9 +169,9 @@ def __new__(cls) -> "DataFrame":  # type: ignore


 # our Unpickler sub-class to override methods and some dispatcher
-# functions for compat
-
+# functions for compat and uses a non-public class of the pickle module.

+# error: Name 'pkl._Unpickler' is not defined
 class Unpickler(pkl._Unpickler):  # type: ignore
     def find_class(self, module, name):
         # override superclass
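
The updated comment notes that this compat shim subclasses a non-public class of the pickle module. A hedged sketch of that general pattern; the mapping and helper names below are invented for illustration and are not the actual pandas compat table:

    import io
    import pickle as pkl

    # Hypothetical remapping of a moved class to its new location.
    _CLASS_LOCATIONS = {("old_pkg.frame", "DataFrame"): ("pandas", "DataFrame")}

    class CompatUnpickler(pkl._Unpickler):  # non-public class, as the comment warns
        def find_class(self, module, name):
            module, name = _CLASS_LOCATIONS.get((module, name), (module, name))
            return super().find_class(module, name)

    def compat_loads(data: bytes):
        return CompatUnpickler(io.BytesIO(data)).load()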

pandas/core/arrays/categorical.py

6 additions, 6 deletions

@@ -1,14 +1,14 @@
 import operator
 from shutil import get_terminal_size
-from typing import Type, Union, cast
+from typing import Dict, Hashable, List, Type, Union, cast
 from warnings import warn

 import numpy as np

 from pandas._config import get_option

 from pandas._libs import algos as libalgos, hashtable as htable
-from pandas._typing import ArrayLike, Dtype, Ordered
+from pandas._typing import ArrayLike, Dtype, Ordered, Scalar
 from pandas.compat.numpy import function as nv
 from pandas.util._decorators import (
     Appender,

@@ -511,7 +511,7 @@ def itemsize(self) -> int:
         """
         return self.categories.itemsize

-    def tolist(self) -> list:
+    def tolist(self) -> List[Scalar]:
         """
         Return a list of the values.

@@ -2067,7 +2067,7 @@ def __setitem__(self, key, value):
         lindexer = self._maybe_coerce_indexer(lindexer)
         self._codes[key] = lindexer

-    def _reverse_indexer(self):
+    def _reverse_indexer(self) -> Dict[Hashable, np.ndarray]:
         """
         Compute the inverse of a categorical, returning
         a dict of categories -> indexers.

@@ -2097,8 +2097,8 @@ def _reverse_indexer(self):
             self.codes.astype("int64"), categories.size
         )
         counts = counts.cumsum()
-        result = (r[start:end] for start, end in zip(counts, counts[1:]))
-        result = dict(zip(categories, result))
+        _result = (r[start:end] for start, end in zip(counts, counts[1:]))
+        result = dict(zip(categories, _result))
         return result

     # reduction ops #
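
For context on the two annotations tightened above, a short example of what those methods return; the printed values assume current pandas behavior and are not part of the commit:

    import pandas as pd

    cat = pd.Categorical(["a", "b", "a", "c", "a"])

    # tolist() -> List[Scalar]: plain Python scalars, one per element.
    print(cat.tolist())            # ['a', 'b', 'a', 'c', 'a']

    # _reverse_indexer() -> Dict[Hashable, np.ndarray]: category -> positions.
    print(cat._reverse_indexer())  # {'a': array([0, 2, 4]), 'b': array([1]), 'c': array([3])}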
