Commit bce25c9

Merge remote-tracking branch 'origin/master' into fix-25557
* origin/master:
  DOC: clean bug fix section in whatsnew (pandas-dev#25792)
  DOC: Fixed PeriodArray api ref (pandas-dev#25526)
  Move locale code out of tm, into _config (pandas-dev#25757)
  Unpin pycodestyle (pandas-dev#25789)
  Add test for rdivmod on EA array (GH23287) (pandas-dev#24047)
  ENH: Support datetime.timezone objects (pandas-dev#25065)
  Cython language level 3 (pandas-dev#24538)
  API: concat on sparse values (pandas-dev#25719)
  TST: assert_produces_warning works with filterwarnings (pandas-dev#25721)
  make core.config self-contained (pandas-dev#25613)
  CLN: replace %s syntax with .format in pandas.io.parsers (pandas-dev#24721)
  TST: Check pytables<3.5.1 when skipping (pandas-dev#25773)
  DOC: Fix typo in docstring of DataFrame.memory_usage (pandas-dev#25770)
  Replace dicts with OrderedDicts in groupby aggregation functions (pandas-dev#25693)
  TST: Fixturize tests/frame/test_missing.py (pandas-dev#25640)
  DOC: Improve the docsting of Series.iteritems (pandas-dev#24879)
  DOC: Fix function name. (pandas-dev#25751)
  Implementing iso_week_year support for to_datetime (pandas-dev#25541)
  DOC: clarify corr behaviour when using a callable (pandas-dev#25732)
  remove unnecessary check_output (pandas-dev#25755)

# Conflicts:
#	doc/source/whatsnew/v0.25.0.rst
2 parents 5d40c93 + 4663951 commit bce25c9

77 files changed, +722 -373 lines changed

doc/source/reference/arrays.rst

+1 -1
@@ -259,7 +259,7 @@ Every period in a ``PeriodArray`` must have the same ``freq``.
 .. autosummary::
    :toctree: api/

-   arrays.DatetimeArray
+   arrays.PeriodArray
    PeriodDtype

 .. _api.arrays.interval:

doc/source/user_guide/timeseries.rst

+11 -5
@@ -2149,12 +2149,9 @@ Time Zone Handling
 ------------------

 pandas provides rich support for working with timestamps in different time
-zones using the ``pytz`` and ``dateutil`` libraries.
+zones using the ``pytz`` and ``dateutil`` libraries or :class:`datetime.timezone`
+objects from the standard library.

-.. note::
-
-   pandas does not yet support ``datetime.timezone`` objects from the standard
-   library.

 Working with Time Zones
 ~~~~~~~~~~~~~~~~~~~~~~~
@@ -2197,6 +2194,15 @@ To return ``dateutil`` time zone objects, append ``dateutil/`` before the string
                            tz=dateutil.tz.tzutc())
    rng_utc.tz

+.. versionadded:: 0.25.0
+
+.. ipython:: python
+
+   # datetime.timezone
+   rng_utc = pd.date_range('3/6/2012 00:00', periods=3, freq='D',
+                           tz=datetime.timezone.utc)
+   rng_utc.tz
+
 Note that the ``UTC`` time zone is a special case in ``dateutil`` and should be constructed explicitly
 as an instance of ``dateutil.tz.tzutc``. You can also construct other time
 zone objects explicitly first.
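For readers unfamiliar with the standard-library objects this documentation change refers to, here is a minimal sketch that uses only the `datetime` module. It does not call pandas, and the variable names and the +05:30 offset are illustrative assumptions, not part of the commit.

```python
import datetime

# The UTC singleton used in the documentation example above.
utc = datetime.timezone.utc

# A fixed-offset zone; the +05:30 offset is an arbitrary illustration.
offset = datetime.timezone(datetime.timedelta(hours=5, minutes=30))

# An aware datetime in UTC, then the same instant expressed in the other zone.
aware = datetime.datetime(2012, 3, 6, 0, 0, tzinfo=utc)
shifted = aware.astimezone(offset)

print(aware.isoformat())    # 2012-03-06T00:00:00+00:00
print(shifted.isoformat())  # 2012-03-06T05:30:00+05:30
```

According to the added documentation, objects of this kind can be passed as `tz` starting with 0.25.0.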

doc/source/whatsnew/v0.25.0.rst

+44 -4
@@ -19,6 +19,7 @@ including other versions of pandas.
 Other Enhancements
 ^^^^^^^^^^^^^^^^^^

+- Added support for ISO week year format ('%G-%V-%u') when parsing datetimes using :meth:`to_datetime` (:issue:`16607`)
 - Indexing of ``DataFrame`` and ``Series`` now accepts zerodim ``np.ndarray`` (:issue:`24919`)
 - :meth:`Timestamp.replace` now supports the ``fold`` argument to disambiguate DST transition times (:issue:`25017`)
 - :meth:`DataFrame.at_time` and :meth:`Series.at_time` now support :meth:`datetime.time` objects with timezones (:issue:`24043`)
@@ -27,6 +28,7 @@ Other Enhancements
 - :meth:`DatetimeIndex.union` now supports the ``sort`` argument. The behaviour of the sort parameter matches that of :meth:`Index.union` (:issue:`24994`)
 - :meth:`DataFrame.rename` now supports the ``errors`` argument to raise errors when attempting to rename nonexistent keys (:issue:`13473`)
 - :class:`RangeIndex` has gained :attr:`~RangeIndex.start`, :attr:`~RangeIndex.stop`, and :attr:`~RangeIndex.step` attributes (:issue:`25710`)
+- :class:`datetime.timezone` objects are now supported as arguments to timezone methods and constructors (:issue:`25065`)

 .. _whatsnew_0250.api_breaking:

@@ -65,6 +67,42 @@ is respected in indexing. (:issue:`24076`, :issue:`16785`)
    df = pd.DataFrame([0], index=pd.DatetimeIndex(['2019-01-01'], tz='US/Pacific'))
    df['2019-01-01 12:00:00+04:00':'2019-01-01 13:00:00+04:00']

+Concatenating Sparse Values
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When passed DataFrames whose values are sparse, :func:`concat` will now return a
+Series or DataFrame with sparse values, rather than a ``SparseDataFrame`` (:issue:`25702`).
+
+.. ipython:: python
+
+   df = pd.DataFrame({"A": pd.SparseArray([0, 1])})
+
+*Previous Behavior:*
+
+.. code-block:: ipython
+
+   In [2]: type(pd.concat([df, df]))
+   pandas.core.sparse.frame.SparseDataFrame
+
+*New Behavior:*
+
+.. ipython:: python
+
+   type(pd.concat([df, df]))
+
+
+This now matches the existing behavior of :func:`concat` on ``Series`` with sparse values.
+:func:`concat` will continue to return a ``SparseDataFrame`` when all the values
+are instances of ``SparseDataFrame``.
+
+This change also affects routines using :func:`concat` internally, like :func:`get_dummies`,
+which now returns a :class:`DataFrame` in all cases (previously a ``SparseDataFrame`` was
+returned if all the columns were dummy encoded, and a :class:`DataFrame` otherwise).
+
+Providing any ``SparseSeries`` or ``SparseDataFrame`` to :func:`concat` will
+cause a ``SparseSeries`` or ``SparseDataFrame`` to be returned, as before.
+
+
 .. _whatsnew_0250.api_breaking.deps:

 Increased minimum versions for dependencies
@@ -137,10 +175,7 @@ Performance Improvements

 Bug Fixes
 ~~~~~~~~~
-- Bug in :func:`to_datetime` which would raise an (incorrect) ``ValueError`` when called with a date far into the future and the ``format`` argument specified instead of raising ``OutOfBoundsDatetime`` (:issue:`23830`)
-- Bug in an error message in :meth:`DataFrame.plot`. Improved the error message if non-numerics are passed to :meth:`DataFrame.plot` (:issue:`25481`)
-- Bug in error messages in :meth:`DataFrame.corr` and :meth:`Series.corr`. Added the possibility of using a callable. (:issue:`25729`)
-- Bug in :meth:`Series.divmod` and :meth:`Series.rdivmod` which would raise an (incorrect) ``ValueError`` rather than return a pair of :class:`Series` object as result (:issue:`25557`)
+

 Categorical
 ^^^^^^^^^^^
@@ -152,6 +187,7 @@ Categorical
 Datetimelike
 ^^^^^^^^^^^^

+- Bug in :func:`to_datetime` which would raise an (incorrect) ``ValueError`` when called with a date far into the future and the ``format`` argument specified instead of raising ``OutOfBoundsDatetime`` (:issue:`23830`)
 -
 -
 -
@@ -175,6 +211,8 @@ Numeric

 - Bug in :meth:`to_numeric` in which large negative numbers were being improperly handled (:issue:`24910`)
 - Bug in :meth:`to_numeric` in which numbers were being coerced to float, even though ``errors`` was not ``coerce`` (:issue:`24910`)
+- Bug in error messages in :meth:`DataFrame.corr` and :meth:`Series.corr`. Added the possibility of using a callable. (:issue:`25729`)
+- Bug in :meth:`Series.divmod` and :meth:`Series.rdivmod` which would raise an (incorrect) ``ValueError`` rather than return a pair of :class:`Series` objects as result (:issue:`25557`)
 -
 -
 -
@@ -242,6 +280,7 @@ Plotting
 ^^^^^^^^

 - Fixed bug where :class:`api.extensions.ExtensionArray` could not be used in matplotlib plotting (:issue:`25587`)
+- Bug in an error message in :meth:`DataFrame.plot`. Improved the error message if non-numerics are passed to :meth:`DataFrame.plot` (:issue:`25481`)
 -
 -
 -
@@ -253,6 +292,7 @@ Groupby/Resample/Rolling
 - Bug in :meth:`pandas.core.groupby.DataFrameGroupBy.nunique` in which the names of column levels were lost (:issue:`23222`)
 - Bug in :func:`pandas.core.groupby.GroupBy.agg` when applying an aggregation function to timezone aware data (:issue:`23683`)
 - Bug in :func:`pandas.core.groupby.GroupBy.first` and :func:`pandas.core.groupby.GroupBy.last` where timezone information would be dropped (:issue:`21603`)
+- Ensured that ordering of outputs in ``groupby`` aggregation functions is consistent across all versions of Python (:issue:`25692`)


 Reshaping
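The ISO week year format string named in the whatsnew entry above (`'%G-%V-%u'`) can be illustrated with the standard library alone. This sketch uses `datetime.strptime`, which accepts the same directives; that pandas' `to_datetime` returns the same date for this input is an assumption, not something this diff shows.

```python
from datetime import datetime

# '%G' is the ISO year, '%V' the ISO week number, '%u' the ISO weekday (1 = Monday).
parsed = datetime.strptime("2019-01-1", "%G-%V-%u")

# ISO week 1 of 2019 starts on Monday 2018-12-31, so the calendar year
# differs from the ISO year.
print(parsed.date())                # 2018-12-31
print(tuple(parsed.isocalendar()))  # (2019, 1, 1)
```

The example shows why the ISO year directive matters: the ISO year and the calendar year can disagree around the new year.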

environment.yml

+1 -1
@@ -19,7 +19,7 @@ dependencies:
   - hypothesis>=3.82
   - isort
   - moto
-  - pycodestyle=2.4
+  - pycodestyle
   - pytest>=4.0.2
   - pytest-mock
   - sphinx

pandas/_config/__init__.py

+4
@@ -0,0 +1,4 @@
+"""
+pandas._config is considered explicitly upstream of everything else in pandas,
+and should have no intra-pandas dependencies.
+"""

pandas/_config/localization.py

+93
@@ -0,0 +1,93 @@
+"""
+Helpers for configuring locale settings.
+
+Name `localization` is chosen to avoid overlap with builtin `locale` module.
+"""
+from contextlib import contextmanager
+import locale
+
+
+@contextmanager
+def set_locale(new_locale, lc_var=locale.LC_ALL):
+    """
+    Context manager for temporarily setting a locale.
+
+    Parameters
+    ----------
+    new_locale : str or tuple
+        A string of the form <language_country>.<encoding>. For example to set
+        the current locale to US English with a UTF8 encoding, you would pass
+        "en_US.UTF-8".
+    lc_var : int, default `locale.LC_ALL`
+        The category of the locale being set.
+
+    Notes
+    -----
+    This is useful when you want to run a particular block of code under a
+    particular locale, without globally setting the locale. This probably isn't
+    thread-safe.
+    """
+    current_locale = locale.getlocale()
+
+    try:
+        locale.setlocale(lc_var, new_locale)
+        normalized_locale = locale.getlocale()
+        if all(x is not None for x in normalized_locale):
+            yield '.'.join(normalized_locale)
+        else:
+            yield new_locale
+    finally:
+        locale.setlocale(lc_var, current_locale)
+
+
+def can_set_locale(lc, lc_var=locale.LC_ALL):
+    """
+    Check to see if we can set a locale, and subsequently get the locale,
+    without raising an Exception.
+
+    Parameters
+    ----------
+    lc : str
+        The locale to attempt to set.
+    lc_var : int, default `locale.LC_ALL`
+        The category of the locale being set.
+
+    Returns
+    -------
+    is_valid : bool
+        Whether the passed locale can be set
+    """
+
+    try:
+        with set_locale(lc, lc_var=lc_var):
+            pass
+    except (ValueError, locale.Error):
+        # horrible name for an Exception subclass
+        return False
+    else:
+        return True
+
+
+def _valid_locales(locales, normalize):
+    """
+    Return a list of normalized locales that do not throw an ``Exception``
+    when set.
+
+    Parameters
+    ----------
+    locales : str
+        A string where each locale is separated by a newline.
+    normalize : bool
+        Whether to call ``locale.normalize`` on each locale.
+
+    Returns
+    -------
+    valid_locales : list
+        A list of valid locales.
+    """
+    if normalize:
+        normalizer = lambda x: locale.normalize(x.strip())
+    else:
+        normalizer = lambda x: x.strip()
+
+    return list(filter(can_set_locale, map(normalizer, locales)))
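The save, set, restore pattern behind the context manager above can be tried with the standard library alone. The sketch below is a hypothetical stand-in, not the pandas function: `temporary_locale` is an invented name, and it deliberately saves the previous setting with the query form of `locale.setlocale` rather than `locale.getlocale()`. It uses the `"C"` locale on the assumption that it is available on the running system.

```python
import locale
from contextlib import contextmanager


@contextmanager
def temporary_locale(new_locale, category=locale.LC_ALL):
    # Hypothetical helper, not the pandas function. Calling setlocale with
    # only a category returns the current setting without changing it.
    saved = locale.setlocale(category)
    try:
        # setlocale returns the name of the locale that is now active.
        yield locale.setlocale(category, new_locale)
    finally:
        # Restore the previous setting even if the body raised.
        locale.setlocale(category, saved)


before = locale.setlocale(locale.LC_TIME)
with temporary_locale("C", locale.LC_TIME) as active:
    inside = locale.setlocale(locale.LC_TIME)
after = locale.setlocale(locale.LC_TIME)

print(active, inside, before == after)
```

As the docstring in the diff notes for the original, changing the process locale this way is probably not thread-safe; the same caveat applies to this sketch.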

pandas/_libs/__init__.py

+1 -1
@@ -2,4 +2,4 @@
 # flake8: noqa

 from .tslibs import (
-    iNaT, NaT, Timestamp, Timedelta, OutOfBoundsDatetime, Period)
+    iNaT, NaT, NaTType, Timestamp, Timedelta, OutOfBoundsDatetime, Period)

pandas/_libs/groupby.pyx

+3 -3
@@ -57,10 +57,10 @@ cdef inline float64_t median_linear(float64_t* a, int n) nogil:
         n -= na_count

     if n % 2:
-        result = kth_smallest_c( a, n / 2, n)
+        result = kth_smallest_c( a, n // 2, n)
     else:
-        result = (kth_smallest_c(a, n / 2, n) +
-                  kth_smallest_c(a, n / 2 - 1, n)) / 2
+        result = (kth_smallest_c(a, n // 2, n) +
+                  kth_smallest_c(a, n // 2 - 1, n)) / 2

     if na_count:
         free(a)
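The `/` to `//` replacements in this file accompany the "Cython language level 3" change listed in the merge message, presumably because `/` then follows Python 3 true-division semantics. The distinction can be shown in plain Python; `median_of` below is a hypothetical stand-in that sorts its input instead of using a kth-smallest selection, and it assumes a non-empty input.

```python
def median_of(values):
    # Hypothetical pure-Python stand-in for the Cython routine; it sorts
    # instead of using a kth-smallest selection. Assumes a non-empty input.
    ordered = sorted(values)
    n = len(ordered)
    if n % 2:
        # Odd count: the middle element. n // 2 is an int, usable as an index.
        return ordered[n // 2]
    # Even count: average the two middle elements; the final '/' is
    # intentionally true division.
    return (ordered[n // 2] + ordered[n // 2 - 1]) / 2


print(7 / 2, 7 // 2)            # 3.5 3
print(median_of([5, 1, 3]))     # 3
print(median_of([4, 1, 3, 2]))  # 2.5
```

Note that the outer `/ 2` in the diff is left as true division, which is what an average of two floats needs.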

pandas/_libs/parsers.pyx

+2 -2
@@ -948,7 +948,7 @@ cdef class TextReader:
             status = tokenize_nrows(self.parser, nrows)

         if self.parser.warn_msg != NULL:
-            print >> sys.stderr, self.parser.warn_msg
+            print(self.parser.warn_msg, file=sys.stderr)
             free(self.parser.warn_msg)
             self.parser.warn_msg = NULL

@@ -976,7 +976,7 @@ cdef class TextReader:
             status = tokenize_all_rows(self.parser)

         if self.parser.warn_msg != NULL:
-            print >> sys.stderr, self.parser.warn_msg
+            print(self.parser.warn_msg, file=sys.stderr)
             free(self.parser.warn_msg)
             self.parser.warn_msg = NULL
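The hunks above swap the Python 2 `print >> stream, value` statement for the Python 3 function call with a `file` keyword. A minimal sketch of the Python 3 form follows; the warning text is made up for illustration and the capture step exists only to make the destination visible.

```python
import contextlib
import io
import sys

warn_msg = "Skipping line 3: expected 2 fields, saw 3"  # made-up message

# Python 3 form, as in the added lines: the stream is a keyword argument.
print(warn_msg, file=sys.stderr)

# Redirecting stderr into a buffer shows where the text goes.
buffer = io.StringIO()
with contextlib.redirect_stderr(buffer):
    print(warn_msg, file=sys.stderr)
captured = buffer.getvalue()
```

Because `sys.stderr` is looked up when `print` is called, the redirected call writes to the buffer rather than to the real stream.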

pandas/_libs/tslib.pyx

+1 -1
@@ -275,7 +275,7 @@ def format_array_from_datetime(ndarray[int64_t] values, object tz=None,
                                                    dts.sec)

             if show_ns:
-                ns = dts.ps / 1000
+                ns = dts.ps // 1000
                 res += '.%.9d' % (ns + 1000 * dts.us)
             elif show_us:
                 res += '.%.6d' % dts.us
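The arithmetic in this hunk combines a microsecond field and a picosecond field into a nine-digit nanosecond fraction. The following plain-Python sketch mirrors that expression; `fraction_suffix` and its parameter names are hypothetical, chosen to echo the `dts.us` and `dts.ps` fields in the diff.

```python
def fraction_suffix(us, ps):
    # Hypothetical mirror of the hunk's arithmetic: whole nanoseconds from
    # the picosecond field, added to the microseconds scaled to nanoseconds.
    ns = ps // 1000
    # '%.9d' pads the integer with leading zeros to at least nine digits.
    return '.%.9d' % (ns + 1000 * us)


print(fraction_suffix(123456, 789000))  # .123456789
print(fraction_suffix(1, 2000))         # .000001002
```

Floor division keeps `ns` an integer, so the value handed to the `%d` format is an exact count of nanoseconds.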

pandas/_libs/tslibs/__init__.py

+1 -1
@@ -2,7 +2,7 @@
 # flake8: noqa

 from .conversion import normalize_date, localize_pydatetime, tz_convert_single
-from .nattype import NaT, iNaT, is_null_datetimelike
+from .nattype import NaT, NaTType, iNaT, is_null_datetimelike
 from .np_datetime import OutOfBoundsDatetime
 from .period import Period, IncompatibleFrequency
 from .timestamps import Timestamp

pandas/_libs/tslibs/ccalendar.pyx

+4 -5
@@ -9,6 +9,8 @@ import cython
 from numpy cimport int64_t, int32_t

 from locale import LC_TIME
+
+from pandas._config.localization import set_locale
 from pandas._libs.tslibs.strptime import LocaleTime

 # ----------------------------------------------------------------------
@@ -159,7 +161,7 @@ cpdef int32_t get_week_of_year(int year, int month, int day) nogil:
     # estimate
     woy = (doy - 1) - dow + 3
     if woy >= 0:
-        woy = woy / 7 + 1
+        woy = woy // 7 + 1

     # verify
     if woy < 0:
@@ -206,7 +208,7 @@ cpdef int32_t get_day_of_year(int year, int month, int day) nogil:
     return day_of_year


-cpdef get_locale_names(object name_type, object locale=None):
+def get_locale_names(name_type: object, locale: object=None):
     """Returns an array of localized day or month names

     Parameters
@@ -218,9 +220,6 @@ cpdef get_locale_names(object name_type, object locale=None):
     Returns
     -------
     list of locale names
-
     """
-    from pandas.util.testing import set_locale
-
     with set_locale(locale, LC_TIME):
         return getattr(LocaleTime(), name_type)

pandas/_libs/tslibs/conversion.pyx

+2 -2
@@ -462,8 +462,8 @@ cdef _TSObject convert_str_to_tsobject(object ts, object tz, object unit,
                     dt = datetime(obj.dts.year, obj.dts.month, obj.dts.day,
                                   obj.dts.hour, obj.dts.min, obj.dts.sec,
                                   obj.dts.us, obj.tzinfo)
-                    obj = convert_datetime_to_tsobject(dt, tz,
-                                                       nanos=obj.dts.ps / 1000)
+                    obj = convert_datetime_to_tsobject(
+                        dt, tz, nanos=obj.dts.ps // 1000)
                     return obj

             else:

pandas/_libs/tslibs/fields.pyx

+2 -2
@@ -478,7 +478,7 @@ def get_date_field(int64_t[:] dtindex, object field):
                     continue

                 dt64_to_dtstruct(dtindex[i], &dts)
-                out[i] = dts.ps / 1000
+                out[i] = dts.ps // 1000
             return out
     elif field == 'doy':
         with nogil:
@@ -522,7 +522,7 @@ def get_date_field(int64_t[:] dtindex, object field):

                 dt64_to_dtstruct(dtindex[i], &dts)
                 out[i] = dts.month
-                out[i] = ((out[i] - 1) / 3) + 1
+                out[i] = ((out[i] - 1) // 3) + 1
             return out

     elif field == 'dim':
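The second hunk above derives a quarter number from a month number with floor division. A plain-Python sketch of the same expression, with the hypothetical name `quarter_of`, makes the mapping explicit:

```python
def quarter_of(month):
    # month is 1-12; floor division keeps the result an integer in 1-4.
    return (month - 1) // 3 + 1


print([quarter_of(m) for m in range(1, 13)])
# [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
```

With true division the same expression would yield fractional values such as `1.333...` for February, which is why the operator matters here.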

pandas/_libs/tslibs/nattype.pyx

+1 -1
@@ -353,7 +353,7 @@ class NaTType(_NaT):

         .. versionadded:: 0.23.0
         """)
-    day_name = _make_nan_func('day_name', # noqa:E128
+    day_name = _make_nan_func('day_name',  # noqa:E128
         """
         Return the day name of the Timestamp with specified locale.

pandas/_libs/tslibs/offsets.pyx

+1 -1
@@ -587,7 +587,7 @@ def shift_day(other: datetime, days: int) -> datetime:

 cdef inline int year_add_months(npy_datetimestruct dts, int months) nogil:
     """new year number after shifting npy_datetimestruct number of months"""
-    return dts.year + (dts.month + months - 1) // 12
+    return dts.year + (dts.month + months - 1) // 12


 cdef inline int month_add_months(npy_datetimestruct dts, int months) nogil:
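The year-shift expression in this hunk relies on floor division to move the year backwards as well as forwards. The sketch below is a plain-Python mirror of that expression using bare integers in place of the `npy_datetimestruct` fields. It illustrates Python semantics only and says nothing about how the compiled Cython behaved before the change, which depends on directives not shown in this diff.

```python
def year_add_months(year, month, months):
    # Plain-Python mirror of the expression in the diff.
    return year + (month + months - 1) // 12


print(year_add_months(2019, 1, -1))  # 2018: January minus one month
print(year_add_months(2019, 12, 1))  # 2020: December plus one month
print(year_add_months(2019, 6, 6))   # 2019: June plus six months is December

# Truncating toward zero instead of flooring would not step back a year:
print(2019 + int((1 - 1 - 1) / 12))  # 2019
```

The last line shows the case where rounding direction matters: a negative intermediate value must round down for the year to decrease.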
