Commit de90d79

Merge remote-tracking branch 'pandas-dev/master' into time_grouper
2 parents 026223a + 21a3800 commit de90d79

35 files changed: +602 -208 lines changed

ci/requirements-3.6_WIN.run

-1
@@ -8,7 +8,6 @@ xlrd
 xlwt
 scipy
 feather-format
-pyarrow
 numexpr
 pytables
 matplotlib

doc/source/api.rst

+1
@@ -2025,6 +2025,7 @@ Upsampling
    Resampler.backfill
    Resampler.bfill
    Resampler.pad
+   Resampler.nearest
    Resampler.fillna
    Resampler.asfreq
    Resampler.interpolate
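
The `Resampler.nearest` entry added to the Upsampling list above fills each new bin with the value of the closest original observation. A minimal sketch, assuming a pandas version that includes this change; the series values and the 20-minute target frequency are illustrative and do not come from the commit:

```python
import pandas as pd

# Two hourly observations (illustrative data, not taken from the commit).
index = pd.date_range("2018-01-01 00:00", periods=2, freq="60min")
s = pd.Series([1, 2], index=index)

# Upsample to 20-minute bins; each new timestamp takes the value of the
# nearest original observation.
upsampled = s.resample("20min").nearest()
print(upsampled)
```

The 20-minute frequency is chosen so that no new timestamp is equidistant from two observations, which keeps the result independent of how ties are broken.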

doc/source/categorical.rst

+2
@@ -146,6 +146,8 @@ Using ``.describe()`` on categorical data will produce similar output to a `Seri
     df.describe()
     df["cat"].describe()

+.. _categorical.cat:
+
 Working with categories
 -----------------------

doc/source/io.rst

+1 -1
@@ -4492,7 +4492,7 @@ Several caveats.
 - The format will NOT write an ``Index``, or ``MultiIndex`` for the ``DataFrame`` and will raise an
   error if a non-default one is provided. You can simply ``.reset_index(drop=True)`` in order to store the index.
 - Duplicate column names and non-string columns names are not supported
-- Categorical dtypes are currently not-supported (for ``pyarrow``).
+- Categorical dtypes can be serialized to parquet, but will de-serialize as ``object`` dtype.
 - Non supported types include ``Period`` and actual python object types. These will raise a helpful error message
   on an attempt at serialization.

doc/source/timeseries.rst

+20 -5
@@ -175,12 +175,8 @@ you can pass the ``dayfirst`` flag:
    can't be parsed with the day being first it will be parsed as if
    ``dayfirst`` were False.

-.. note::
-   Specifying a ``format`` argument will potentially speed up the conversion
-   considerably and explicitly specifying
-   a format string of '%Y%m%d' takes a faster path still.
-
 If you pass a single string to ``to_datetime``, it returns single ``Timestamp``.
+
 Also, ``Timestamp`` can accept the string input.
 Note that ``Timestamp`` doesn't accept string parsing option like ``dayfirst``
 or ``format``, use ``to_datetime`` if these are required.
@@ -191,6 +187,25 @@ or ``format``, use ``to_datetime`` if these are required.

    pd.Timestamp('2010/11/12')

+Providing a Format Argument
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In addition to the required datetime string, a ``format`` argument can be passed to ensure specific parsing.
+It will potentially speed up the conversion considerably.
+
+For example:
+
+.. ipython:: python
+
+   pd.to_datetime('2010/11/12', format='%Y/%m/%d')
+
+   pd.to_datetime('12-11-2010 00:00', format='%d-%m-%Y %H:%M')
+
+For more information on how to specify the ``format`` options, see https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior.
+
+Assembling datetime from multiple DataFrame columns
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 .. versionadded:: 0.18.1

 You can also pass a ``DataFrame`` of integer or string columns to assemble into a ``Series`` of ``Timestamps``.
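
The new "Providing a Format Argument" section says that ``format`` ensures specific parsing. A small sketch of what that means in practice, reusing the string from the added docs; the comparison against the default (no ``format``) parse is an illustration added here and is not part of the commit:

```python
import pandas as pd

# Without a format, the ambiguous string is read month-first by default.
default = pd.to_datetime("12-11-2010 00:00")

# An explicit format removes the ambiguity: day, then month, then year.
explicit = pd.to_datetime("12-11-2010 00:00", format="%d-%m-%Y %H:%M")

print(default)
print(explicit)
```

The two calls return different timestamps for the same input, which is why an explicit ``format`` is the safer choice when the layout of the data is known.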

doc/source/whatsnew/v0.21.0.txt

+9 -3
@@ -28,6 +28,7 @@ New features
   and :class:`~pandas.ExcelWriter` to work properly with the file system path protocol (:issue:`13823`)
 - Added ``skipna`` parameter to :func:`~pandas.api.types.infer_dtype` to
   support type inference in the presence of missing values (:issue:`17059`).
+- :class:`~pandas.Resampler.nearest` is added to support nearest-neighbor upsampling (:issue:`17496`).

 .. _whatsnew_0210.enhancements.infer_objects:

@@ -113,7 +114,7 @@ Other Enhancements
 - :func:`pd.read_sas()` now recognizes much more of the most frequently used date (datetime) formats in SAS7BDAT files (:issue:`15871`).
 - :func:`DataFrame.items` and :func:`Series.items` is now present in both Python 2 and 3 and is lazy in all cases (:issue:`13918`, :issue:`17213`)
 - :func:`Styler.where` has been implemented. It is as a convenience for :func:`Styler.applymap` and enables simple DataFrame styling on the Jupyter notebook (:issue:`17474`).
-
+- :func:`MultiIndex.is_monotonic_decreasing` has been implemented. Previously returned ``False`` in all cases. (:issue:`16554`)


 .. _whatsnew_0210.api_breaking:
@@ -215,7 +216,7 @@ New Behaviour:

 Furthermore this will now correctly box the results of iteration for :func:`DataFrame.to_dict` as well.

-.. ipython:: ipython
+.. ipython:: python

    d = {'a':[1], 'b':['b']}
    df = pd.DataFrame(d)
@@ -363,7 +364,7 @@ Additionally, DataFrames with datetime columns that were parsed by :func:`read_s
 Consistency of Range Functions
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-In previous versions, there were some inconsistencies between the various range functions: func:`date_range`, func:`bdate_range`, func:`cdate_range`, func:`period_range`, func:`timedelta_range`, and func:`interval_range`. (:issue:`17471`).
+In previous versions, there were some inconsistencies between the various range functions: :func:`date_range`, :func:`bdate_range`, :func:`cdate_range`, :func:`period_range`, :func:`timedelta_range`, and :func:`interval_range`. (:issue:`17471`).

 One of the inconsistent behaviors occurred when the ``start``, ``end`` and ``period`` parameters were all specified, potentially leading to ambiguous ranges. When all three parameters were passed, ``interval_range`` ignored the ``period`` parameter, ``period_range`` ignored the ``end`` parameter, and the other range functions raised. To promote consistency among the range functions, and avoid potentially ambiguous ranges, ``interval_range`` and ``period_range`` will now raise when all three parameters are passed.

@@ -524,6 +525,10 @@ Plotting
 ^^^^^^^^
 - Bug in plotting methods using ``secondary_y`` and ``fontsize`` not setting secondary axis font size (:issue:`12565`)
 - Bug when plotting ``timedelta`` and ``datetime`` dtypes on y-axis (:issue:`16953`)
+- Line plots no longer assume monotonic x data when calculating xlims, they show the entire lines now even for unsorted x data. (:issue:`11310`)(:issue:`11471`)
+- With matplotlib 2.0.0 and above, calculation of x limits for line plots is left to matplotlib, so that its new default settings are applied. (:issue:`15495`)
+- Bug in ``Series.plot.bar`` or ``DataFramee.plot.bar`` with ``y`` not respecting user-passed ``color`` (:issue:`16822`)
+

 Groupby/Resample/Rolling
 ^^^^^^^^^^^^^^^^^^^^^^^^
@@ -568,6 +573,7 @@ Categorical
 - Bug in the categorical constructor with empty values and categories causing
   the ``.categories`` to be an empty ``Float64Index`` rather than an empty
   ``Index`` with object dtype (:issue:`17248`)
+- Bug in categorical operations with :ref:`Series.cat <categorical.cat>' not preserving the original Series' name (:issue:`17509`)

 PyPy
 ^^^^
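
The "Consistency of Range Functions" note in this file states that ``period_range`` raises when start, end and the period count are all passed. A hedged sketch of that behaviour, assuming a pandas version that includes this change; the helper name ``is_rejected`` and the dates are made up for illustration, and the keyword is spelled ``periods`` in the pandas API:

```python
import pandas as pd


def is_rejected(**kwargs):
    """Return True if period_range raises ValueError for these arguments."""
    try:
        pd.period_range(freq="M", **kwargs)
    except ValueError:
        return True
    return False


# All three of start, end and periods: ambiguous, so it is rejected.
all_three = is_rejected(start="2017-01", end="2017-06", periods=3)

# Only two of the three: unambiguous, so a PeriodIndex is built.
two_only = is_rejected(start="2017-01", periods=3)

print(all_three, two_only)
```

Only ``period_range`` is exercised here; the sketch makes no claim about how the other range functions treat the same combination in later pandas releases.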

pandas/_libs/algos.pyx

+11 -23
@@ -1,12 +1,12 @@
 # cython: profile=False

-from numpy cimport *
 cimport numpy as np
 import numpy as np

 cimport cython
+from cython cimport Py_ssize_t

-import_array()
+np.import_array()

 cdef float64_t FP_ERR = 1e-13

@@ -15,31 +15,19 @@ cimport util
 from libc.stdlib cimport malloc, free
 from libc.string cimport memmove

-from numpy cimport NPY_INT8 as NPY_int8
-from numpy cimport NPY_INT16 as NPY_int16
-from numpy cimport NPY_INT32 as NPY_int32
-from numpy cimport NPY_INT64 as NPY_int64
-from numpy cimport NPY_FLOAT16 as NPY_float16
-from numpy cimport NPY_FLOAT32 as NPY_float32
-from numpy cimport NPY_FLOAT64 as NPY_float64
-
-from numpy cimport (int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t,
-                    uint32_t, uint64_t, float16_t, float32_t, float64_t)
-
-int8 = np.dtype(np.int8)
-int16 = np.dtype(np.int16)
-int32 = np.dtype(np.int32)
-int64 = np.dtype(np.int64)
-float16 = np.dtype(np.float16)
-float32 = np.dtype(np.float32)
-float64 = np.dtype(np.float64)
+from numpy cimport (ndarray,
+                    NPY_INT64, NPY_UINT64, NPY_INT32, NPY_INT16, NPY_INT8,
+                    NPY_FLOAT32, NPY_FLOAT64,
+                    NPY_OBJECT,
+                    int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t,
+                    uint32_t, uint64_t, float16_t, float32_t, float64_t,
+                    double_t)
+

 cdef double NaN = <double> np.NaN
 cdef double nan = NaN

-cdef extern from "../src/headers/math.h":
-    double sqrt(double x) nogil
-    double fabs(double) nogil
+from libc.math cimport sqrt, fabs

 # this is our util.pxd
 from util cimport numeric, get_nat

pandas/_libs/groupby.pyx

+5 -4
@@ -1,16 +1,17 @@
 # cython: profile=False

-from numpy cimport *
-cimport numpy as np
+cimport numpy as cnp
 import numpy as np

 cimport cython

-import_array()
+cnp.import_array()

 cimport util

-from numpy cimport (int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t,
+from numpy cimport (ndarray,
+                    double_t,
+                    int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t,
                     uint32_t, uint64_t, float16_t, float32_t, float64_t)

 from libc.stdlib cimport malloc, free

pandas/_libs/hashtable.pyx

+1 -3
@@ -23,7 +23,7 @@ from khash cimport (
     kh_put_pymap, kh_resize_pymap)


-from numpy cimport *
+from numpy cimport ndarray, uint8_t, uint32_t

 from libc.stdlib cimport malloc, free
 from cpython cimport (PyMem_Malloc, PyMem_Realloc, PyMem_Free,
@@ -56,8 +56,6 @@ cdef extern from "datetime.h":

 PyDateTime_IMPORT

-cdef extern from "Python.h":
-    int PySlice_Check(object)

 cdef size_t _INIT_VEC_CAP = 128

pandas/_libs/interval.pyx

+1 -2
@@ -1,11 +1,10 @@
 cimport numpy as np
 import numpy as np
-import pandas as pd

 cimport util
 cimport cython
 import cython
-from numpy cimport *
+from numpy cimport ndarray
 from tslib import Timestamp

 from cpython.object cimport (Py_EQ, Py_NE, Py_GT, Py_LT, Py_GE, Py_LE,

pandas/_libs/intervaltree.pxi.in

+6 -2
@@ -4,11 +4,15 @@ Template for intervaltree
 WARNING: DO NOT edit .pxi FILE directly, .pxi is generated from .pxi.in
 """

-from numpy cimport int64_t, float64_t
-from numpy cimport ndarray, PyArray_ArgSort, NPY_QUICKSORT, PyArray_Take
+from numpy cimport (
+    int64_t, int32_t, float64_t, float32_t,
+    ndarray,
+    PyArray_ArgSort, NPY_QUICKSORT, PyArray_Take)
 import numpy as np

 cimport cython
+from cython cimport Py_ssize_t
+
 cimport numpy as cnp
 cnp.import_array()

pandas/_libs/join.pyx

+4 -19
@@ -1,34 +1,19 @@
 # cython: profile=False

-from numpy cimport *
 cimport numpy as np
 import numpy as np

 cimport cython
+from cython cimport Py_ssize_t

-import_array()
+np.import_array()

 cimport util

-from numpy cimport NPY_INT8 as NPY_int8
-from numpy cimport NPY_INT16 as NPY_int16
-from numpy cimport NPY_INT32 as NPY_int32
-from numpy cimport NPY_INT64 as NPY_int64
-from numpy cimport NPY_FLOAT16 as NPY_float16
-from numpy cimport NPY_FLOAT32 as NPY_float32
-from numpy cimport NPY_FLOAT64 as NPY_float64
-
-from numpy cimport (int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t,
+from numpy cimport (ndarray,
+                    int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t,
                     uint32_t, uint64_t, float16_t, float32_t, float64_t)

-int8 = np.dtype(np.int8)
-int16 = np.dtype(np.int16)
-int32 = np.dtype(np.int32)
-int64 = np.dtype(np.int64)
-float16 = np.dtype(np.float16)
-float32 = np.dtype(np.float32)
-float64 = np.dtype(np.float64)
-
 cdef double NaN = <double> np.NaN
 cdef double nan = NaN

pandas/_libs/reshape.pyx

+4 -19
@@ -1,34 +1,19 @@
 # cython: profile=False

-from numpy cimport *
 cimport numpy as np
 import numpy as np

 cimport cython
+from cython cimport Py_ssize_t

-import_array()
+np.import_array()

 cimport util

-from numpy cimport NPY_INT8 as NPY_int8
-from numpy cimport NPY_INT16 as NPY_int16
-from numpy cimport NPY_INT32 as NPY_int32
-from numpy cimport NPY_INT64 as NPY_int64
-from numpy cimport NPY_FLOAT16 as NPY_float16
-from numpy cimport NPY_FLOAT32 as NPY_float32
-from numpy cimport NPY_FLOAT64 as NPY_float64
-
-from numpy cimport (int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t,
+from numpy cimport (ndarray,
+                    int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t,
                     uint32_t, uint64_t, float16_t, float32_t, float64_t)

-int8 = np.dtype(np.int8)
-int16 = np.dtype(np.int16)
-int32 = np.dtype(np.int32)
-int64 = np.dtype(np.int64)
-float16 = np.dtype(np.float16)
-float32 = np.dtype(np.float32)
-float64 = np.dtype(np.float64)
-
 cdef double NaN = <double> np.NaN
 cdef double nan = NaN

pandas/_libs/src/reduce.pyx

-1
@@ -1,5 +1,4 @@
 #cython=False
-from numpy cimport *
 import numpy as np

 from distutils.version import LooseVersion

pandas/core/api.py

+2 -1
@@ -16,7 +16,8 @@
 PeriodIndex, NaT)
 from pandas.core.indexes.period import Period, period_range, pnow
 from pandas.core.indexes.timedeltas import Timedelta, timedelta_range
-from pandas.core.indexes.datetimes import Timestamp, date_range, bdate_range
+from pandas.core.indexes.datetimes import (Timestamp, date_range, bdate_range,
+                                           cdate_range)
 from pandas.core.indexes.interval import Interval, interval_range

 from pandas.core.series import Series

pandas/core/base.py

+1 -1
@@ -900,7 +900,7 @@ def tolist(self):

         See Also
         --------
-        numpy.tolist
+        numpy.ndarray.tolist
         """

         if is_datetimelike(self):

pandas/core/categorical.py

+5-3
Original file line numberDiff line numberDiff line change
@@ -2054,9 +2054,10 @@ class CategoricalAccessor(PandasDelegate, NoNewAttributesMixin):
20542054
20552055
"""
20562056

2057-
def __init__(self, values, index):
2057+
def __init__(self, values, index, name):
20582058
self.categorical = values
20592059
self.index = index
2060+
self.name = name
20602061
self._freeze()
20612062

20622063
def _delegate_property_get(self, name):
@@ -2075,14 +2076,15 @@ def _delegate_method(self, name, *args, **kwargs):
20752076
method = getattr(self.categorical, name)
20762077
res = method(*args, **kwargs)
20772078
if res is not None:
2078-
return Series(res, index=self.index)
2079+
return Series(res, index=self.index, name=self.name)
20792080

20802081
@classmethod
20812082
def _make_accessor(cls, data):
20822083
if not is_categorical_dtype(data.dtype):
20832084
raise AttributeError("Can only use .cat accessor with a "
20842085
"'category' dtype")
2085-
return CategoricalAccessor(data.values, data.index)
2086+
return CategoricalAccessor(data.values, data.index,
2087+
getattr(data, 'name', None),)
20862088

20872089

20882090
CategoricalAccessor._add_delegate_accessors(delegate=Categorical,
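
The accessor change above threads the Series name through ``_delegate_method``, so results of ``.cat`` methods should keep it. A minimal sketch of the user-visible effect, assuming a pandas version that includes this fix; the series contents and the name ``"letters"`` are illustrative:

```python
import pandas as pd

# A named categorical Series (illustrative data).
s = pd.Series(["a", "b", "a"], dtype="category", name="letters")

# A delegated Categorical method called through the .cat accessor.
result = s.cat.add_categories(["c"])

# With the fix, the returned Series carries the original name.
print(result.name)
```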
