Commit f58bb5a

Merge remote-tracking branch 'upstream/master' into hist_legend
2 parents 6aa3db8 + 3d4f9dc commit f58bb5a

55 files changed: 745 additions, 685 deletions

ci/deps/azure-37-numpydev.yaml

+2 -1
@@ -14,7 +14,8 @@ dependencies:
   - pytz
   - pip
   - pip:
-    - cython>=0.29.16
+    - cython==0.29.16
+    # GH#33507 cython 3.0a1 is causing TypeErrors 2020-04-13
     - "git+git://github.com/dateutil/dateutil.git"
     - "-f https://7933911d6844c6c53a7d-47bd50c35cd79bd838daf386af554a83.ssl.cf2.rackcdn.com"
     - "--pre"

doc/source/getting_started/intro_tutorials/03_subset_data.rst

+5 -5
@@ -23,7 +23,7 @@
 <div class="card-body">
 <p class="card-text">

-This tutorial uses the titanic data set, stored as CSV. The data
+This tutorial uses the Titanic data set, stored as CSV. The data
 consists of the following data columns:

 - PassengerId: Id of every passenger.
@@ -72,7 +72,7 @@ How do I select specific columns from a ``DataFrame``?
 <ul class="task-bullet">
 <li>

-I’m interested in the age of the titanic passengers.
+I’m interested in the age of the Titanic passengers.

 .. ipython:: python

@@ -111,7 +111,7 @@ the number of rows is returned.
 <ul class="task-bullet">
 <li>

-I’m interested in the age and sex of the titanic passengers.
+I’m interested in the age and sex of the Titanic passengers.

 .. ipython:: python

@@ -198,7 +198,7 @@ can be used to filter the ``DataFrame`` by putting it in between the
 selection brackets ``[]``. Only rows for which the value is ``True``
 will be selected.

-We now from before that the original titanic ``DataFrame`` consists of
+We know from before that the original Titanic ``DataFrame`` consists of
 891 rows. Let’s have a look at the amount of rows which satisfy the
 condition by checking the ``shape`` attribute of the resulting
 ``DataFrame`` ``above_35``:
@@ -212,7 +212,7 @@ condition by checking the ``shape`` attribute of the resulting
 <ul class="task-bullet">
 <li>

-I’m interested in the titanic passengers from cabin class 2 and 3.
+I’m interested in the Titanic passengers from cabin class 2 and 3.

 .. ipython:: python


doc/source/user_guide/timeseries.rst

+9
@@ -786,6 +786,15 @@ Furthermore, if you have a ``Series`` with datetimelike values, then you can
 access these properties via the ``.dt`` accessor, as detailed in the section
 on :ref:`.dt accessors<basics.dt_accessors>`.

+.. versionadded:: 1.1.0
+
+You may obtain the year, week and day components of the ISO year from the ISO 8601 standard:
+
+.. ipython:: python
+
+   idx = pd.date_range(start='2019-12-29', freq='D', periods=4)
+   idx.to_series().dt.isocalendar()
+
 .. _timeseries.offsets:

 DateOffset objects
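
The ISO 8601 components for the four dates used in the documentation example above can be cross-checked with the standard library alone. This is a minimal sketch, assuming only `datetime.date.isocalendar()`; the helper name `iso_components` is hypothetical and not part of pandas, and the output is a list of plain tuples rather than a ``DataFrame``.

```python
from datetime import date, timedelta


def iso_components(start, periods):
    """Return (ISO year, ISO week, ISO weekday) for `periods` consecutive days.

    Hypothetical helper for illustration only; it relies on the standard
    library, not on the pandas accessor added in this commit.
    """
    return [
        tuple((start + timedelta(days=offset)).isocalendar())
        for offset in range(periods)
    ]


# Same range as the docs example: 2019-12-29 plus the next three days.
rows = iso_components(date(2019, 12, 29), 4)
for row in rows:
    print(row)
# (2019, 52, 7)
# (2020, 1, 1)
# (2020, 1, 2)
# (2020, 1, 3)
```

The range straddles a year boundary on purpose: 2019-12-30 is a Monday and already belongs to ISO week 1 of 2020, so the ISO year differs from the calendar year for two of the four dates.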

doc/source/whatsnew/v1.1.0.rst

+6 -1
@@ -88,6 +88,7 @@ Other enhancements
 - :class:`Series.str` now has a `fullmatch` method that matches a regular expression against the entire string in each row of the series, similar to `re.fullmatch` (:issue:`32806`).
 - :meth:`DataFrame.sample` will now also allow array-like and BitGenerator objects to be passed to ``random_state`` as seeds (:issue:`32503`)
 - :meth:`MultiIndex.union` will now raise `RuntimeWarning` if the object inside are unsortable, pass `sort=False` to suppress this warning (:issue:`33015`)
+- :class:`Series.dt` and :class:`DatetimeIndex` now have an `isocalendar` method that returns a :class:`DataFrame` with year, week, and day calculated according to the ISO 8601 calendar (:issue:`33206`).
 - The :meth:`DataFrame.to_feather` method now supports additional keyword
   arguments (e.g. to set the compression) that are added in pyarrow 0.17
   (:issue:`33422`).
@@ -377,7 +378,7 @@ Performance improvements
   sparse values from ``scipy.sparse`` matrices using the
   :meth:`DataFrame.sparse.from_spmatrix` constructor (:issue:`32821`,
   :issue:`32825`, :issue:`32826`, :issue:`32856`, :issue:`32858`).
-- Performance improvement in reductions (sum, min, max) for nullable (integer and boolean) dtypes (:issue:`30982`, :issue:`33261`).
+- Performance improvement in reductions (sum, prod, min, max) for nullable (integer and boolean) dtypes (:issue:`30982`, :issue:`33261`, :issue:`33442`).


 .. ---------------------------------------------------------------------------
@@ -396,6 +397,7 @@ Categorical
 - Bug where :class:`Categorical` comparison operator ``__ne__`` would incorrectly evaluate to ``False`` when either element was missing (:issue:`32276`)
 - :meth:`Categorical.fillna` now accepts :class:`Categorical` ``other`` argument (:issue:`32420`)
 - Bug where :meth:`Categorical.replace` would replace with ``NaN`` whenever the new value and replacement value were equal (:issue:`33288`)
+- Bug where an ordered :class:`Categorical` containing only ``NaN`` values would raise rather than returning ``NaN`` when taking the minimum or maximum (:issue:`33450`)

 Datetimelike
 ^^^^^^^^^^^^
@@ -409,6 +411,8 @@ Datetimelike
 - Bug in :meth:`DatetimeIndex.searchsorted` not accepting a ``list`` or :class:`Series` as its argument (:issue:`32762`)
 - Bug where :meth:`PeriodIndex` raised when passed a :class:`Series` of strings (:issue:`26109`)
 - Bug in :class:`Timestamp` arithmetic when adding or subtracting a ``np.ndarray`` with ``timedelta64`` dtype (:issue:`33296`)
+- Bug in :meth:`DatetimeIndex.to_period` not inferring the frequency when called with no arguments (:issue:`33358`)
+

 Timedelta
 ^^^^^^^^^
@@ -463,6 +467,7 @@ Indexing
 - Bug in :meth:`DatetimeIndex.get_loc` raising ``KeyError`` with converted-integer key instead of the user-passed key (:issue:`31425`)
 - Bug in :meth:`Series.xs` incorrectly returning ``Timestamp`` instead of ``datetime64`` in some object-dtype cases (:issue:`31630`)
 - Bug in :meth:`DataFrame.iat` incorrectly returning ``Timestamp`` instead of ``datetime`` in some object-dtype cases (:issue:`32809`)
+- Bug in :meth:`DataFrame.at` when either columns or index is non-unique (:issue:`33041`)
 - Bug in :meth:`Series.loc` and :meth:`DataFrame.loc` when indexing with an integer key on a object-dtype :class:`Index` that is not all-integers (:issue:`31905`)
 - Bug in :meth:`DataFrame.iloc.__setitem__` on a :class:`DataFrame` with duplicate columns incorrectly setting values for all matching columns (:issue:`15686`, :issue:`22036`)
 - Bug in :meth:`DataFrame.loc:` and :meth:`Series.loc` with a :class:`DatetimeIndex`, :class:`TimedeltaIndex`, or :class:`PeriodIndex` incorrectly allowing lookups of non-matching datetime-like dtypes (:issue:`32650`)

pandas/_libs/lib.pyx

-3
@@ -1,9 +1,6 @@
 from collections import abc
 from decimal import Decimal
-from fractions import Fraction
-from numbers import Number

-import sys
 import warnings

 import cython

pandas/_libs/reshape.pyx

+1 -1
@@ -36,7 +36,7 @@ ctypedef fused reshape_t:

 @cython.wraparound(False)
 @cython.boundscheck(False)
-def unstack(reshape_t[:, :] values, uint8_t[:] mask,
+def unstack(reshape_t[:, :] values, const uint8_t[:] mask,
             Py_ssize_t stride, Py_ssize_t length, Py_ssize_t width,
             reshape_t[:, :] new_values, uint8_t[:, :] new_mask):
     """

pandas/_libs/tslibs/ccalendar.pxd

+2
@@ -2,9 +2,11 @@ from cython cimport Py_ssize_t

 from numpy cimport int64_t, int32_t

+ctypedef (int32_t, int32_t, int32_t) iso_calendar_t

 cdef int dayofweek(int y, int m, int d) nogil
 cdef bint is_leapyear(int64_t year) nogil
 cpdef int32_t get_days_in_month(int year, Py_ssize_t month) nogil
 cpdef int32_t get_week_of_year(int year, int month, int day) nogil
+cpdef iso_calendar_t get_iso_calendar(int year, int month, int day) nogil
 cpdef int32_t get_day_of_year(int year, int month, int day) nogil

pandas/_libs/tslibs/ccalendar.pyx

+43 -11
@@ -150,33 +150,65 @@ cpdef int32_t get_week_of_year(int year, int month, int day) nogil:
     -------
     week_of_year : int32_t

+    Notes
+    -----
+    Assumes the inputs describe a valid date.
+    """
+    return get_iso_calendar(year, month, day)[1]
+
+
+@cython.wraparound(False)
+@cython.boundscheck(False)
+cpdef iso_calendar_t get_iso_calendar(int year, int month, int day) nogil:
+    """
+    Return the year, week, and day of week corresponding to ISO 8601
+
+    Parameters
+    ----------
+    year : int
+    month : int
+    day : int
+
+    Returns
+    -------
+    year : int32_t
+    week : int32_t
+    day : int32_t
+
     Notes
     -----
     Assumes the inputs describe a valid date.
     """
     cdef:
         int32_t doy, dow
-        int woy
+        int32_t iso_year, iso_week

     doy = get_day_of_year(year, month, day)
     dow = dayofweek(year, month, day)

     # estimate
-    woy = (doy - 1) - dow + 3
-    if woy >= 0:
-        woy = woy // 7 + 1
+    iso_week = (doy - 1) - dow + 3
+    if iso_week >= 0:
+        iso_week = iso_week // 7 + 1

     # verify
-    if woy < 0:
-        if (woy > -2) or (woy == -2 and is_leapyear(year - 1)):
-            woy = 53
+    if iso_week < 0:
+        if (iso_week > -2) or (iso_week == -2 and is_leapyear(year - 1)):
+            iso_week = 53
         else:
-            woy = 52
-    elif woy == 53:
+            iso_week = 52
+    elif iso_week == 53:
         if 31 - day + dow < 3:
-            woy = 1
+            iso_week = 1
+
+    iso_year = year
+    if iso_week == 1 and doy > 7:
+        iso_year += 1
+
+    elif iso_week >= 52 and doy < 7:
+        iso_year -= 1

-    return woy
+    return iso_year, iso_week, dow + 1


 @cython.wraparound(False)

pandas/_libs/tslibs/fields.pyx

+41 -2
@@ -8,14 +8,14 @@ from cython import Py_ssize_t

 import numpy as np
 cimport numpy as cnp
-from numpy cimport ndarray, int64_t, int32_t, int8_t
+from numpy cimport ndarray, int64_t, int32_t, int8_t, uint32_t
 cnp.import_array()

 from pandas._libs.tslibs.ccalendar import (
     get_locale_names, MONTHS_FULL, DAYS_FULL, DAY_SECONDS)
 from pandas._libs.tslibs.ccalendar cimport (
     get_days_in_month, is_leapyear, dayofweek, get_week_of_year,
-    get_day_of_year)
+    get_day_of_year, get_iso_calendar, iso_calendar_t)
 from pandas._libs.tslibs.np_datetime cimport (
     npy_datetimestruct, pandas_timedeltastruct, dt64_to_dtstruct,
     td64_to_tdstruct)
@@ -670,3 +670,42 @@ cpdef isleapyear_arr(ndarray years):
                             np.logical_and(years % 4 == 0,
                                            years % 100 > 0))] = 1
     return out.view(bool)
+
+
+@cython.wraparound(False)
+@cython.boundscheck(False)
+def build_isocalendar_sarray(const int64_t[:] dtindex):
+    """
+    Given an int64-based datetime array, return the ISO 8601 year, week, and day
+    as a structured array.
+    """
+    cdef:
+        Py_ssize_t i, count = len(dtindex)
+        npy_datetimestruct dts
+        ndarray[uint32_t] iso_years, iso_weeks, days
+        iso_calendar_t ret_val
+
+    sa_dtype = [
+        ("year", "u4"),
+        ("week", "u4"),
+        ("day", "u4"),
+    ]
+
+    out = np.empty(count, dtype=sa_dtype)
+
+    iso_years = out["year"]
+    iso_weeks = out["week"]
+    days = out["day"]
+
+    with nogil:
+        for i in range(count):
+            if dtindex[i] == NPY_NAT:
+                ret_val = 0, 0, 0
+            else:
+                dt64_to_dtstruct(dtindex[i], &dts)
+                ret_val = get_iso_calendar(dts.year, dts.month, dts.day)
+
+            iso_years[i] = ret_val[0]
+            iso_weeks[i] = ret_val[1]
+            days[i] = ret_val[2]
+    return out
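
The loop in `build_isocalendar_sarray` writes `(0, 0, 0)` for missing values and the ISO components otherwise. A rough standard-library sketch of that behaviour follows. It makes several assumptions: timestamps are nanoseconds since 1970-01-01, the missing-value sentinel is the minimum int64 (the value of `NPY_NAT`), a list of tuples stands in for the structured array, and `datetime.date.isocalendar()` stands in for `get_iso_calendar`. The name `build_isocalendar_rows` is hypothetical.

```python
from datetime import date, timedelta

NPY_NAT = -2**63              # sentinel for a missing timestamp (min int64)
NS_PER_DAY = 86_400 * 10**9   # nanoseconds in one day
EPOCH = date(1970, 1, 1)


def build_isocalendar_rows(dtindex):
    """Map int64 nanosecond timestamps to (ISO year, ISO week, ISO weekday).

    Hypothetical stand-in for the Cython function in the diff above;
    missing values become (0, 0, 0), mirroring the NaT branch of the loop.
    """
    out = []
    for value in dtindex:
        if value == NPY_NAT:
            out.append((0, 0, 0))
        else:
            # floor division keeps pre-epoch timestamps on the correct day
            day = EPOCH + timedelta(days=value // NS_PER_DAY)
            out.append(tuple(day.isocalendar()))
    return out


# 18_259 days after the epoch is 2019-12-29.
stamps = [0, NPY_NAT, 18_259 * NS_PER_DAY]
print(build_isocalendar_rows(stamps))
# [(1970, 1, 4), (0, 0, 0), (2019, 52, 7)]
```

Because the result fields are unsigned integers, zero works as an in-band marker for missing rows; callers would presumably need to mask those rows before treating the values as real calendar components.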
