Commit 292bca8

Merge branch 'master' of https://github.com/pandas-dev/pandas into mcmali-black

2 parents: a17f582 + 07efdd4

44 files changed: +494 -584 lines
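The diffs below are part of pandas' incremental migration from `%`-interpolation and `str.format()` to f-strings (PEP 498). As a minimal sketch of the equivalence the conversions rely on (the value here is made up, not from the commit):

```python
# Three equivalent ways to build the same message; this commit converts
# the first two styles into the third (f-strings).
q = 1.5

percent_style = "'q' must be between 0 and 1. Got '%s' instead" % q
format_style = "'q' must be between 0 and 1. Got '{}' instead".format(q)
fstring_style = f"'q' must be between 0 and 1. Got '{q}' instead"

assert percent_style == format_style == fstring_style
```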

doc/source/getting_started/install.rst (+1 -1)

@@ -218,7 +218,7 @@ Recommended dependencies
 ``numexpr`` uses multiple cores as well as smart chunking and caching to achieve large speedups.
 If installed, must be Version 2.6.2 or higher.
 
-* `bottleneck <https://github.com/kwgoodman/bottleneck>`__: for accelerating certain types of ``nan``
+* `bottleneck <https://github.com/pydata/bottleneck>`__: for accelerating certain types of ``nan``
   evaluations. ``bottleneck`` uses specialized cython routines to achieve large speedups. If installed,
   must be Version 1.2.1 or higher.
doc/source/whatsnew/v0.21.0.rst (+2 -2)

@@ -20,7 +20,7 @@ Highlights include:
 - Integration with `Apache Parquet <https://parquet.apache.org/>`__, including a new top-level :func:`read_parquet` function and :meth:`DataFrame.to_parquet` method, see :ref:`here <whatsnew_0210.enhancements.parquet>`.
 - New user-facing :class:`pandas.api.types.CategoricalDtype` for specifying
   categoricals independent of the data, see :ref:`here <whatsnew_0210.enhancements.categorical_dtype>`.
-- The behavior of ``sum`` and ``prod`` on all-NaN Series/DataFrames is now consistent and no longer depends on whether `bottleneck <http://berkeleyanalytics.com/bottleneck>`__ is installed, and ``sum`` and ``prod`` on empty Series now return NaN instead of 0, see :ref:`here <whatsnew_0210.api_breaking.bottleneck>`.
+- The behavior of ``sum`` and ``prod`` on all-NaN Series/DataFrames is now consistent and no longer depends on whether `bottleneck <https://bottleneck.readthedocs.io>`__ is installed, and ``sum`` and ``prod`` on empty Series now return NaN instead of 0, see :ref:`here <whatsnew_0210.api_breaking.bottleneck>`.
 - Compatibility fixes for pypy, see :ref:`here <whatsnew_0210.pypy>`.
 - Additions to the ``drop``, ``reindex`` and ``rename`` API to make them more consistent, see :ref:`here <whatsnew_0210.enhancements.drop_api>`.
 - Addition of the new methods ``DataFrame.infer_objects`` (see :ref:`here <whatsnew_0210.enhancements.infer_objects>`) and ``GroupBy.pipe`` (see :ref:`here <whatsnew_0210.enhancements.GroupBy_pipe>`).
@@ -390,7 +390,7 @@ Sum/Prod of all-NaN or empty Series/DataFrames is now consistently NaN
 
 
 The behavior of ``sum`` and ``prod`` on all-NaN Series/DataFrames no longer depends on
-whether `bottleneck <http://berkeleyanalytics.com/bottleneck>`__ is installed, and return value of ``sum`` and ``prod`` on an empty Series has changed (:issue:`9422`, :issue:`15507`).
+whether `bottleneck <https://bottleneck.readthedocs.io>`__ is installed, and return value of ``sum`` and ``prod`` on an empty Series has changed (:issue:`9422`, :issue:`15507`).
 
 Calling ``sum`` or ``prod`` on an empty or all-``NaN`` ``Series``, or columns of a ``DataFrame``, will result in ``NaN``. See the :ref:`docs <missing_data.numeric_sum>`.

doc/source/whatsnew/v0.8.1.rst (+1 -1)

@@ -29,7 +29,7 @@ Performance improvements
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
 - Improved implementation of rolling min and max (thanks to `Bottleneck
-  <http://berkeleyanalytics.com/bottleneck/>`__ !)
+  <https://bottleneck.readthedocs.io>`__ !)
 - Add accelerated ``'median'`` GroupBy option (:issue:`1358`)
 - Significantly improve the performance of parsing ISO8601-format date
   strings with ``DatetimeIndex`` or ``to_datetime`` (:issue:`1571`)

doc/source/whatsnew/v1.0.0.rst (+1)

@@ -414,6 +414,7 @@ Plotting
 - Bug in the ``xticks`` argument being ignored for :meth:`DataFrame.plot.bar` (:issue:`14119`)
 - :func:`set_option` now validates that the plot backend provided to ``'plotting.backend'`` implements the backend when the option is set, rather than when a plot is created (:issue:`28163`)
 - :meth:`DataFrame.plot` now allow a ``backend`` keyword arugment to allow changing between backends in one session (:issue:`28619`).
+- Bug in color validation incorrectly raising for non-color styles (:issue:`29122`).
 
 Groupby/resample/rolling
 ^^^^^^^^^^^^^^^^^^^^^^^^

pandas/_libs/groupby.pyx (+1 -2)

@@ -753,8 +753,7 @@ def group_quantile(ndarray[float64_t] out,
     assert values.shape[0] == N
 
     if not (0 <= q <= 1):
-        raise ValueError("'q' must be between 0 and 1. Got"
-                         " '{}' instead".format(q))
+        raise ValueError(f"'q' must be between 0 and 1. Got '{q}' instead")
 
     inter_methods = {
         'linear': INTERPOLATION_LINEAR,

pandas/_libs/hashing.pyx (+5 -5)

@@ -47,8 +47,8 @@ def hash_object_array(object[:] arr, object key, object encoding='utf8'):
     k = <bytes>key.encode(encoding)
     kb = <uint8_t *>k
     if len(k) != 16:
-        raise ValueError("key should be a 16-byte string encoded, "
-                         "got {key} (len {klen})".format(key=k, klen=len(k)))
+        raise ValueError(f"key should be a 16-byte string encoded, "
+                         f"got {k} (len {len(k)})")
 
     n = len(arr)
 
@@ -67,9 +67,9 @@ def hash_object_array(object[:] arr, object key, object encoding='utf8'):
             data = <bytes>str(val).encode(encoding)
 
         else:
-            raise TypeError("{val} of type {typ} is not a valid type "
-                            "for hashing, must be string or null"
-                            .format(val=val, typ=type(val)))
+            raise TypeError(f"{val} of type {type(val)} is not a valid type "
+                            f"for hashing, must be string or null"
+                            )
 
         l = len(data)
         lens[i] = l

pandas/_libs/index.pyx (+7 -7)

@@ -109,7 +109,7 @@ cdef class IndexEngine:
             Py_ssize_t loc
 
         if is_definitely_invalid_key(val):
-            raise TypeError("'{val}' is an invalid key".format(val=val))
+            raise TypeError(f"'{val}' is an invalid key")
 
         if self.over_size_threshold and self.is_monotonic_increasing:
             if not self.is_unique:
@@ -556,8 +556,8 @@ cpdef convert_scalar(ndarray arr, object value):
             pass
         elif value is None or value != value:
            return np.datetime64("NaT", "ns")
-        raise ValueError("cannot set a Timestamp with a non-timestamp {typ}"
-                         .format(typ=type(value).__name__))
+        raise ValueError(f"cannot set a Timestamp with a non-timestamp "
+                         f"{type(value).__name__}")
 
     elif arr.descr.type_num == NPY_TIMEDELTA:
         if util.is_array(value):
@@ -573,8 +573,8 @@ cpdef convert_scalar(ndarray arr, object value):
             pass
         elif value is None or value != value:
             return np.timedelta64("NaT", "ns")
-        raise ValueError("cannot set a Timedelta with a non-timedelta {typ}"
-                         .format(typ=type(value).__name__))
+        raise ValueError(f"cannot set a Timedelta with a non-timedelta "
+                         f"{type(value).__name__}")
 
     if (issubclass(arr.dtype.type, (np.integer, np.floating, np.complex)) and
             not issubclass(arr.dtype.type, np.bool_)):
@@ -677,7 +677,7 @@ cdef class BaseMultiIndexCodesEngine:
            # Index._get_fill_indexer), sort (integer representations of) keys:
            order = np.argsort(lab_ints)
            lab_ints = lab_ints[order]
-            indexer = (getattr(self._base, 'get_{}_indexer'.format(method))
+            indexer = (getattr(self._base, f'get_{method}_indexer')
                        (self, lab_ints, limit=limit))
            indexer = indexer[order]
        else:
@@ -687,7 +687,7 @@ cdef class BaseMultiIndexCodesEngine:
 
    def get_loc(self, object key):
        if is_definitely_invalid_key(key):
-            raise TypeError("'{key}' is an invalid key".format(key=key))
+            raise TypeError(f"'{key}' is an invalid key")
        if not isinstance(key, tuple):
            raise KeyError(key)
        try:
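The `BaseMultiIndexCodesEngine` hunk above builds an attribute name with an f-string and fetches it via `getattr`. A small self-contained sketch of that dispatch pattern (the class and method names here are illustrative, not from pandas):

```python
class Base:
    """Hypothetical stand-in for the engine base class."""

    def get_pad_indexer(self, keys):
        return [0] * len(keys)

    def get_backfill_indexer(self, keys):
        return [1] * len(keys)


def dispatch(base, method, keys):
    # Same shape as the diff: the f-string builds the method name,
    # then getattr fetches the bound method to call.
    return getattr(base, f'get_{method}_indexer')(keys)


print(dispatch(Base(), 'pad', [10, 20]))       # [0, 0]
print(dispatch(Base(), 'backfill', [10, 20]))  # [1, 1]
```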

pandas/_libs/internals.pyx (+1 -1)

@@ -61,7 +61,7 @@ cdef class BlockPlacement:
         else:
             v = self._as_array
 
-        return '%s(%r)' % (self.__class__.__name__, v)
+        return f'{self.__class__.__name__}({v})'
 
     def __repr__(self) -> str:
         return str(self)

pandas/_libs/interval.pyx (+13 -17)

@@ -179,8 +179,8 @@ cdef class IntervalMixin:
         When `other` is not closed exactly the same as self.
         """
         if self.closed != other.closed:
-            msg = "'{}.closed' is '{}', expected '{}'."
-            raise ValueError(msg.format(name, other.closed, self.closed))
+            msg = f"'{name}.closed' is '{other.closed}', expected '{self.closed}'."
+            raise ValueError(msg)
 
 
 cdef _interval_like(other):
@@ -308,17 +308,16 @@ cdef class Interval(IntervalMixin):
         self._validate_endpoint(right)
 
         if closed not in _VALID_CLOSED:
-            msg = "invalid option for 'closed': {closed}".format(closed=closed)
+            msg = f"invalid option for 'closed': {closed}"
             raise ValueError(msg)
         if not left <= right:
             raise ValueError('left side of interval must be <= right side')
         if (isinstance(left, Timestamp) and
                 not tz_compare(left.tzinfo, right.tzinfo)):
             # GH 18538
-            msg = ("left and right must have the same time zone, got "
-                   "'{left_tz}' and '{right_tz}'")
-            raise ValueError(msg.format(left_tz=left.tzinfo,
-                                        right_tz=right.tzinfo))
+            msg = (f"left and right must have the same time zone, got "
+                   f"'{left.tzinfo}' and '{right.tzinfo}'")
+            raise ValueError(msg)
         self.left = left
         self.right = right
         self.closed = closed
@@ -359,8 +358,7 @@ cdef class Interval(IntervalMixin):
         name = type(self).__name__
         other = type(other).__name__
         op_str = {Py_LT: '<', Py_LE: '<=', Py_GT: '>', Py_GE: '>='}[op]
-        raise TypeError('unorderable types: {name}() {op} {other}()'
-                        .format(name=name, op=op_str, other=other))
+        raise TypeError(f'unorderable types: {name}() {op_str} {other}()')
 
     def __reduce__(self):
         args = (self.left, self.right, self.closed)
@@ -381,17 +379,15 @@ cdef class Interval(IntervalMixin):
 
         left, right = self._repr_base()
         name = type(self).__name__
-        repr_str = '{name}({left!r}, {right!r}, closed={closed!r})'.format(
-            name=name, left=left, right=right, closed=self.closed)
+        repr_str = f'{name}({left!r}, {right!r}, closed={self.closed!r})'
         return repr_str
 
     def __str__(self) -> str:
 
         left, right = self._repr_base()
         start_symbol = '[' if self.closed_left else '('
         end_symbol = ']' if self.closed_right else ')'
-        return '{start}{left}, {right}{end}'.format(
-            start=start_symbol, left=left, right=right, end=end_symbol)
+        return f'{start_symbol}{left}, {right}{end_symbol}'
 
     def __add__(self, y):
         if isinstance(y, numbers.Number):
@@ -477,8 +473,8 @@ cdef class Interval(IntervalMixin):
         False
         """
         if not isinstance(other, Interval):
-            msg = '`other` must be an Interval, got {other}'
-            raise TypeError(msg.format(other=type(other).__name__))
+            msg = f'`other` must be an Interval, got {type(other).__name__}'
+            raise TypeError(msg)
 
         # equality is okay if both endpoints are closed (overlap at a point)
         op1 = le if (self.closed_left and other.closed_right) else lt
@@ -529,8 +525,8 @@ def intervals_to_interval_bounds(ndarray intervals,
             continue
 
         if not isinstance(interval, Interval):
-            raise TypeError("type {typ} with value {iv} is not an interval"
-                            .format(typ=type(interval), iv=interval))
+            raise TypeError(f"type {type(interval)} with value "
+                            f"{interval} is not an interval")
 
         left[i] = interval.left
         right[i] = interval.right
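The `__unicode__` hunk above carries the `!r` conversions over from `.format()` to the f-string unchanged. A minimal sketch of why that works, using made-up values rather than a real `Interval`:

```python
# !r applies repr() to the interpolated value in both styles, so strings
# keep their quotes in the rendered output.
name = "Interval"
left, right, closed = 0.5, 1.5, 'right'

fstring = f'{name}({left!r}, {right!r}, closed={closed!r})'
format_style = '{name}({left!r}, {right!r}, closed={closed!r})'.format(
    name=name, left=left, right=right, closed=closed)

assert fstring == format_style == "Interval(0.5, 1.5, closed='right')"
```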

pandas/_libs/lib.pyx (+6 -9)

@@ -1219,8 +1219,7 @@ def infer_dtype(value: object, skipna: object=None) -> str:
             return value
 
         # its ndarray like but we can't handle
-        raise ValueError("cannot infer type for {typ}"
-                         .format(typ=type(value)))
+        raise ValueError(f"cannot infer type for {type(value)}")
 
     else:
         if not isinstance(value, list):
@@ -1497,9 +1496,8 @@ cdef class Validator:
         return self.is_valid(value) or self.is_valid_null(value)
 
     cdef bint is_value_typed(self, object value) except -1:
-        raise NotImplementedError(
-            '{typ} child class must define is_value_typed'
-            .format(typ=type(self).__name__))
+        raise NotImplementedError(f'{type(self).__name__} child class '
+                                  f'must define is_value_typed')
 
     cdef bint is_valid_null(self, object value) except -1:
         return value is None or util.is_nan(value)
@@ -1635,9 +1633,8 @@ cdef class TemporalValidator(Validator):
         return self.is_value_typed(value) or self.is_valid_null(value)
 
     cdef bint is_valid_null(self, object value) except -1:
-        raise NotImplementedError(
-            '{typ} child class must define is_valid_null'
-            .format(typ=type(self).__name__))
+        raise NotImplementedError(f'{type(self).__name__} child class '
+                                  f'must define is_valid_null')
 
     cdef inline bint is_valid_skipna(self, object value) except -1:
         cdef:
@@ -1926,7 +1923,7 @@ def maybe_convert_numeric(ndarray[object] values, set na_values,
                 seen.float_ = True
             except (TypeError, ValueError) as e:
                 if not seen.coerce_numeric:
-                    raise type(e)(str(e) + " at position {pos}".format(pos=i))
+                    raise type(e)(str(e) + f" at position {i}")
                 elif "uint64" in str(e):  # Exception from check functions.
                     raise

pandas/_libs/ops.pyx (+2 -4)

@@ -123,8 +123,7 @@ def vec_compare(object[:] left, object[:] right, object op):
         int flag
 
     if n != <Py_ssize_t>len(right):
-        raise ValueError('Arrays were different lengths: {n} vs {nright}'
-                         .format(n=n, nright=len(right)))
+        raise ValueError(f'Arrays were different lengths: {n} vs {len(right)}')
 
     if op is operator.lt:
         flag = Py_LT
@@ -224,8 +223,7 @@ def vec_binop(object[:] left, object[:] right, object op):
         object[:] result
 
     if n != <Py_ssize_t>len(right):
-        raise ValueError('Arrays were different lengths: {n} vs {nright}'
-                         .format(n=n, nright=len(right)))
+        raise ValueError(f'Arrays were different lengths: {n} vs {len(right)}')
 
     result = np.empty(n, dtype=object)

pandas/_libs/parsers.pyx (+14 -14)

@@ -637,19 +637,19 @@ cdef class TextReader:
                 source = zip_file.open(file_name)
 
             elif len(zip_names) == 0:
-                raise ValueError('Zero files found in compressed '
-                                 'zip file %s', source)
+                raise ValueError(f'Zero files found in compressed '
+                                 f'zip file {source}')
             else:
-                raise ValueError('Multiple files found in compressed '
-                                 'zip file %s', str(zip_names))
+                raise ValueError(f'Multiple files found in compressed '
+                                 f'zip file {zip_names}')
         elif self.compression == 'xz':
             if isinstance(source, str):
                 source = _get_lzma_file(lzma)(source, 'rb')
             else:
                 source = _get_lzma_file(lzma)(filename=source)
         else:
-            raise ValueError('Unrecognized compression type: %s' %
-                             self.compression)
+            raise ValueError(f'Unrecognized compression type: '
+                             f'{self.compression}')
 
         if b'utf-16' in (self.encoding or b''):
             # we need to read utf-16 through UTF8Recoder.
@@ -703,8 +703,8 @@ cdef class TextReader:
             self.parser.cb_io = &buffer_rd_bytes
             self.parser.cb_cleanup = &del_rd_source
         else:
-            raise IOError('Expected file path name or file-like object,'
-                          ' got %s type' % type(source))
+            raise IOError(f'Expected file path name or file-like object, '
+                          f'got {type(source)} type')
 
     cdef _get_header(self):
         # header is now a list of lists, so field_count should use header[0]
@@ -744,8 +744,8 @@ cdef class TextReader:
                     msg = "[%s], len of %d," % (
                         ','.join(str(m) for m in msg), len(msg))
                 raise ParserError(
-                    'Passed header=%s but only %d lines in file'
-                    % (msg, self.parser.lines))
+                    f'Passed header={msg} but only '
+                    f'{self.parser.lines} lines in file')
 
             else:
                 field_count = self.parser.line_fields[hr]
@@ -779,7 +779,7 @@ cdef class TextReader:
                 if not self.has_mi_columns and self.mangle_dupe_cols:
                     while count > 0:
                         counts[name] = count + 1
-                        name = '%s.%d' % (name, count)
+                        name = f'{name}.{count}'
                         count = counts.get(name, 0)
 
                 if old_name == '':
@@ -1662,7 +1662,7 @@ cdef _to_fw_string(parser_t *parser, int64_t col, int64_t line_start,
         char *data
         ndarray result
 
-    result = np.empty(line_end - line_start, dtype='|S%d' % width)
+    result = np.empty(line_end - line_start, dtype=f'|S{width}')
    data = <char*>result.data
 
    with nogil:
@@ -2176,8 +2176,8 @@ def _concatenate_chunks(list chunks):
    if warning_columns:
        warning_names = ','.join(warning_columns)
        warning_message = " ".join([
-            "Columns (%s) have mixed types." % warning_names,
-            "Specify dtype option on import or set low_memory=False."
+            f"Columns ({warning_names}) have mixed types."
+            f"Specify dtype option on import or set low_memory=False."
        ])
        warnings.warn(warning_message, DtypeWarning, stacklevel=8)
    return result
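One behavioral wrinkle worth flagging in the `_concatenate_chunks` hunk above: the old version passed two separate list elements to `" ".join(...)`, while the replacement writes two adjacent f-string literals, which Python concatenates at compile time into a single element, so the join separator no longer appears between the sentences. A minimal sketch of the difference (the `warning_names` value here is made up):

```python
warning_names = "0,3"

# Before: two list elements, joined with a space.
before = " ".join([
    "Columns (%s) have mixed types." % warning_names,
    "Specify dtype option on import or set low_memory=False."
])

# After: adjacent string literals fuse into ONE list element,
# so join() has nothing to put a space between.
after = " ".join([
    f"Columns ({warning_names}) have mixed types."
    f"Specify dtype option on import or set low_memory=False."
])

assert "types. Specify" in before
assert "types.Specify" in after  # the space between sentences is lost
```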
