Commit 6cb196d

Merge remote-tracking branch 'upstream/master'
2 parents 441f879 + 24ab22f commit 6cb196d

19 files changed, +598 -53 lines changed

doc/source/conf.py

+5-1
Original file line number | Diff line number | Diff line change
@@ -569,7 +569,11 @@ def linkcode_resolve(domain, info):
569569
return None
570570

571571
try:
572-
fn = inspect.getsourcefile(obj)
572+
# inspect.unwrap() was added in Python version 3.4
573+
if sys.version_info >= (3, 5):
574+
fn = inspect.getsourcefile(inspect.unwrap(obj))
575+
else:
576+
fn = inspect.getsourcefile(obj)
573577
except:
574578
fn = None
575579
if not fn:
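
A minimal sketch of why the change above unwraps the object before looking up its source file: a function decorated with `functools.wraps` carries a `__wrapped__` attribute, and `inspect.unwrap` follows that chain back to the original function, so `inspect.getsourcefile` would then report the file of the original rather than the file of the wrapper. The `logged` decorator and the `add` function are hypothetical names used only for illustration; they are not part of this commit.

```python
import functools
import inspect


def logged(func):
    """Hypothetical decorator; functools.wraps stores func as wrapper.__wrapped__."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def add(a, b):
    return a + b


decorated = logged(add)

# inspect.unwrap follows the __wrapped__ chain back to the undecorated function.
original = inspect.unwrap(decorated)
```

Here `original` is the same object as `add`, while `decorated` is the wrapper defined inside `logged`.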

doc/source/contributing.rst

+52
@@ -612,6 +612,54 @@ Alternatively, you can install the ``grep`` and ``xargs`` commands via the
612612
`MinGW <http://www.mingw.org/>`__ toolchain, and it will allow you to run the
613613
commands above.
614614

615+
.. _contributing.import-formatting:
616+
617+
Import Formatting
618+
~~~~~~~~~~~~~~~~~
619+
*pandas* uses `isort <https://pypi.org/project/isort/>`__ to standardise import
620+
formatting across the codebase.
621+
622+
A guide to import layout as per PEP 8 can be found `here <https://www.python.org/dev/peps/pep-0008/#imports/>`__.
623+
624+
A summary of our current import sections (in order):
625+
626+
* Future
627+
* Python Standard Library
628+
* Third Party
629+
* ``pandas._libs``, ``pandas.compat``, ``pandas.util._*``, ``pandas.errors`` (largely not dependent on ``pandas.core``)
630+
* ``pandas.core.dtypes`` (largely not dependent on the rest of ``pandas.core``)
631+
* Rest of ``pandas.core.*``
632+
* Non-core ``pandas.io``, ``pandas.plotting``, ``pandas.tseries``
633+
* Local application/library specific imports
634+
635+
Imports are alphabetically sorted within these sections.
636+
637+
638+
As part of :ref:`Continuous Integration <contributing.ci>` checks we run::
639+
640+
isort --recursive --check-only pandas
641+
642+
to check that imports are correctly formatted as per the `setup.cfg`.
643+
644+
If you see output like the below in :ref:`Continuous Integration <contributing.ci>` checks:
645+
646+
.. code-block:: shell
647+
648+
Check import format using isort
649+
ERROR: /home/travis/build/pandas-dev/pandas/pandas/io/pytables.py Imports are incorrectly sorted
650+
Check import format using isort DONE
651+
The command "ci/code_checks.sh" exited with 1
652+
653+
You should run::
654+
655+
isort pandas/io/pytables.py
656+
657+
to automatically format imports correctly. This will modify your local copy of the files.
658+
659+
The `--recursive` flag can be passed to sort all files in a directory.
660+
661+
You can then verify the changes look ok, then git :ref:`commit <contributing.commit-code>` and :ref:`push <contributing.push-code>`.
662+
615663
Backwards Compatibility
616664
~~~~~~~~~~~~~~~~~~~~~~~
617665
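
The import sections listed in the ``Import Formatting`` text above can be sketched as a small classifier. This is only an illustration of the ordering described in the documentation change, not how isort itself is implemented or configured; the function `import_section`, the helper `_in_package`, and the section labels are hypothetical, and treating every remaining ``pandas`` module as "local" is an assumption.

```python
# Hypothetical section labels, in the order given by the documentation text.
PANDAS_SECTIONS = [
    ("pandas base", ("pandas._libs", "pandas.compat", "pandas.errors")),
    ("pandas.core.dtypes", ("pandas.core.dtypes",)),
    ("rest of pandas.core", ("pandas.core",)),
    ("non-core pandas", ("pandas.io", "pandas.plotting", "pandas.tseries")),
]


def _in_package(module, package):
    """True if module is the package itself or one of its submodules."""
    return module == package or module.startswith(package + ".")


def import_section(module):
    """Return the (hypothetical) section label for a dotted module name."""
    if module == "__future__":
        return "future"
    # pandas.util._* belongs with the low-level pandas modules.
    if module.startswith("pandas.util._"):
        return "pandas base"
    for label, packages in PANDAS_SECTIONS:
        if any(_in_package(module, package) for package in packages):
            return label
    if _in_package(module, "pandas"):
        return "local"  # assumption: any other pandas module is treated as local
    return "standard library or third party"
```

For example, `import_section("pandas.core.dtypes.common")` gives `"pandas.core.dtypes"`, while `import_section("pandas.core.frame")` falls through to `"rest of pandas.core"` because the more specific section is checked first.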

@@ -1078,6 +1126,8 @@ or a new keyword argument (`example <https://github.com/pandas-dev/pandas/blob/v
10781126
Contributing your changes to *pandas*
10791127
=====================================
10801128

1129+
.. _contributing.commit-code:
1130+
10811131
Committing your code
10821132
--------------------
10831133

@@ -1122,6 +1172,8 @@ Now you can commit your changes in your local repository::
11221172

11231173
git commit -m
11241174

1175+
.. _contributing.push-code:
1176+
11251177
Pushing your changes
11261178
--------------------
11271179

doc/source/whatsnew/v0.24.0.txt

+2
@@ -1151,6 +1151,7 @@ Timezones
11511151
- Bug in :meth:`DatetimeIndex.unique` that did not re-localize tz-aware dates correctly (:issue:`21737`)
11521152
- Bug when indexing a :class:`Series` with a DST transition (:issue:`21846`)
11531153
- Bug in :meth:`DataFrame.resample` and :meth:`Series.resample` where an ``AmbiguousTimeError`` or ``NonExistentTimeError`` would raise if a timezone aware timeseries ended on a DST transition (:issue:`19375`, :issue:`10117`)
1154+
- Bug in :meth:`DataFrame.drop` and :meth:`Series.drop` when specifying a tz-aware Timestamp key to drop from a :class:`DatetimeIndex` with a DST transition (:issue:`21761`)
11541155

11551156
Offsets
11561157
^^^^^^^
@@ -1198,6 +1199,7 @@ Indexing
11981199
- The traceback from a ``KeyError`` when asking ``.loc`` for a single missing label is now shorter and more clear (:issue:`21557`)
11991200
- When ``.ix`` is asked for a missing integer label in a :class:`MultiIndex` with a first level of integer type, it now raises a ``KeyError``, consistently with the case of a flat :class:`Int64Index`, rather than falling back to positional indexing (:issue:`21593`)
12001201
- Bug in :meth:`DatetimeIndex.reindex` when reindexing a tz-naive and tz-aware :class:`DatetimeIndex` (:issue:`8306`)
1202+
- Bug in :meth:`Series.reindex` when reindexing an empty series with a ``datetime64[ns, tz]`` dtype (:issue:`20869`)
12011203
- Bug in :class:`DataFrame` when setting values with ``.loc`` and a timezone aware :class:`DatetimeIndex` (:issue:`11365`)
12021204
- ``DataFrame.__getitem__`` now accepts dictionaries and dictionary keys as list-likes of labels, consistently with ``Series.__getitem__`` (:issue:`21294`)
12031205
- Fixed ``DataFrame[np.nan]`` when columns are non-unique (:issue:`21428`)

pandas/_libs/lib.pyx

+4-1
@@ -57,7 +57,7 @@ from tslibs.conversion cimport convert_to_tsobject
5757
from tslibs.timedeltas cimport convert_to_timedelta64
5858
from tslibs.timezones cimport get_timezone, tz_compare
5959

60-
from missing cimport (checknull,
60+
from missing cimport (checknull, isnaobj,
6161
is_null_datetime64, is_null_timedelta64, is_null_period)
6262

6363

@@ -1181,6 +1181,9 @@ def infer_dtype(value: object, skipna: bool=False) -> str:
11811181
values = construct_1d_object_array_from_listlike(value)
11821182

11831183
values = getattr(values, 'values', values)
1184+
if skipna:
1185+
values = values[~isnaobj(values)]
1186+
11841187
val = _try_infer_map(values)
11851188
if val is not None:
11861189
return val
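
The effect of the `skipna` branch added above can be sketched in plain Python: NA-like entries are masked out before any inference runs, so they no longer influence the result. This is a simplified stand-in, not the Cython code in `lib.pyx`; `is_na` and `infer_kind` are hypothetical helpers, and the real `isnaobj` recognises more NA-like values than the two handled here.

```python
import math


def is_na(value):
    """Simplified NA check: None or a float NaN (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def infer_kind(values, skipna=False):
    """Return a coarse label for the element types found in values."""
    if skipna:
        # Mirrors the idea of values[~isnaobj(values)]: drop NA-like entries first.
        values = [v for v in values if not is_na(v)]
    kinds = {type(v).__name__ for v in values}
    if not kinds:
        return "empty"
    if len(kinds) == 1:
        return kinds.pop()
    return "mixed"
```

With `["a", None, "b"]` the sketch reports `"mixed"` by default and `"str"` once `skipna=True` removes the `None`.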

pandas/_libs/missing.pxd

+3
@@ -1,7 +1,10 @@
11
# -*- coding: utf-8 -*-
22

3+
from numpy cimport ndarray, uint8_t
4+
35
cpdef bint checknull(object val)
46
cpdef bint checknull_old(object val)
7+
cpdef ndarray[uint8_t] isnaobj(ndarray arr)
58

69
cdef bint is_null_datetime64(v)
710
cdef bint is_null_timedelta64(v)

pandas/_libs/missing.pyx

+1-1
@@ -124,7 +124,7 @@ cdef inline bint _check_none_nan_inf_neginf(object val):
124124

125125
@cython.wraparound(False)
126126
@cython.boundscheck(False)
127-
def isnaobj(ndarray arr):
127+
cpdef ndarray[uint8_t] isnaobj(ndarray arr):
128128
"""
129129
Return boolean mask denoting which elements of a 1-D array are na-like,
130130
according to the criteria defined in `_check_all_nulls`:

pandas/_libs/tslibs/conversion.pyx

+14-1
@@ -844,7 +844,20 @@ def tz_localize_to_utc(ndarray[int64_t] vals, object tz, object ambiguous=None,
844844
vals : ndarray[int64_t]
845845
tz : tzinfo or None
846846
ambiguous : str, bool, or arraylike
847-
If arraylike, must have the same length as vals
847+
When clocks moved backward due to DST, ambiguous times may arise.
848+
For example in Central European Time (UTC+01), when going from 03:00
849+
DST to 02:00 non-DST, 02:30:00 local time occurs both at 00:30:00 UTC
850+
and at 01:30:00 UTC. In such a situation, the `ambiguous` parameter
851+
dictates how ambiguous times should be handled.
852+
853+
- 'infer' will attempt to infer fall dst-transition hours based on
854+
order
855+
- bool-ndarray where True signifies a DST time, False signifies a
856+
non-DST time (note that this flag is only applicable for ambiguous
857+
times, but the array must have the same length as vals)
858+
- bool if True, treat all vals as DST. If False, treat them as non-DST
859+
- 'NaT' will return NaT where there are ambiguous times
860+
848861
nonexistent : str
849862
If arraylike, must have the same length as vals
850863
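
The arithmetic behind the ambiguous-time example in the docstring above can be checked with the standard library alone. This sketch uses fixed offsets instead of a real time zone database, so the names `CEST` and `CET` below are plain `datetime.timezone` objects chosen for illustration, and the date 2018-10-28 is taken from the `tz_localize` examples elsewhere in this commit.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for Central European summer and winter time.
CEST = timezone(timedelta(hours=2), "CEST")  # DST, UTC+02
CET = timezone(timedelta(hours=1), "CET")    # non-DST, UTC+01

# The wall-clock time 02:30 occurs twice on the night clocks go back.
wall = datetime(2018, 10, 28, 2, 30)

# Interpreted as DST, it is the first occurrence.
first_utc = wall.replace(tzinfo=CEST).astimezone(timezone.utc)

# Interpreted as non-DST, it is the second occurrence, one hour later in UTC.
second_utc = wall.replace(tzinfo=CET).astimezone(timezone.utc)
```

`first_utc` is 00:30 UTC and `second_utc` is 01:30 UTC, which is the pair of instants the `ambiguous` parameter chooses between.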

pandas/_libs/tslibs/nattype.pyx

+7
@@ -593,6 +593,13 @@ class NaTType(_NaT):
593593
None will remove timezone holding local time.
594594
595595
ambiguous : bool, 'NaT', default 'raise'
596+
When clocks moved backward due to DST, ambiguous times may arise.
597+
For example in Central European Time (UTC+01), when going from
598+
03:00 DST to 02:00 non-DST, 02:30:00 local time occurs both at
599+
00:30:00 UTC and at 01:30:00 UTC. In such a situation, the
600+
`ambiguous` parameter dictates how ambiguous times should be
601+
handled.
602+
596603
- bool contains flags to determine if time is dst or not (note
597604
that this flag is only applicable for ambiguous fall dst dates)
598605
- 'NaT' will return NaT for an ambiguous time

pandas/_libs/tslibs/timestamps.pyx

+7
@@ -1026,6 +1026,13 @@ class Timestamp(_Timestamp):
10261026
None will remove timezone holding local time.
10271027

10281028
ambiguous : bool, 'NaT', default 'raise'
1029+
When clocks moved backward due to DST, ambiguous times may arise.
1030+
For example in Central European Time (UTC+01), when going from
1031+
03:00 DST to 02:00 non-DST, 02:30:00 local time occurs both at
1032+
00:30:00 UTC and at 01:30:00 UTC. In such a situation, the
1033+
`ambiguous` parameter dictates how ambiguous times should be
1034+
handled.
1035+
10291036
- bool contains flags to determine if time is dst or not (note
10301037
that this flag is only applicable for ambiguous fall dst dates)
10311038
- 'NaT' will return NaT for an ambiguous time

pandas/core/arrays/datetimes.py

+40
@@ -614,6 +614,12 @@ def tz_localize(self, tz, ambiguous='raise', nonexistent='raise',
614614
Time zone to convert timestamps to. Passing ``None`` will
615615
remove the time zone information preserving local time.
616616
ambiguous : 'infer', 'NaT', bool array, default 'raise'
617+
When clocks moved backward due to DST, ambiguous times may arise.
618+
For example in Central European Time (UTC+01), when going from
619+
03:00 DST to 02:00 non-DST, 02:30:00 local time occurs both at
620+
00:30:00 UTC and at 01:30:00 UTC. In such a situation, the
621+
`ambiguous` parameter dictates how ambiguous times should be
622+
handled.
617623
618624
- 'infer' will attempt to infer fall dst-transition hours based on
619625
order
@@ -685,6 +691,40 @@ def tz_localize(self, tz, ambiguous='raise', nonexistent='raise',
685691
DatetimeIndex(['2018-03-01 09:00:00', '2018-03-02 09:00:00',
686692
'2018-03-03 09:00:00'],
687693
dtype='datetime64[ns]', freq='D')
694+
695+
Be careful with DST changes. When there is sequential data, pandas can
696+
infer the DST time:
697+
>>> s = pd.to_datetime(pd.Series([
698+
... '2018-10-28 01:30:00',
699+
... '2018-10-28 02:00:00',
700+
... '2018-10-28 02:30:00',
701+
... '2018-10-28 02:00:00',
702+
... '2018-10-28 02:30:00',
703+
... '2018-10-28 03:00:00',
704+
... '2018-10-28 03:30:00']))
705+
>>> s.dt.tz_localize('CET', ambiguous='infer')
706+
0 2018-10-28 01:30:00+02:00
707+
1 2018-10-28 02:00:00+02:00
708+
2 2018-10-28 02:30:00+02:00
709+
3 2018-10-28 02:00:00+01:00
710+
4 2018-10-28 02:30:00+01:00
711+
5 2018-10-28 03:00:00+01:00
712+
6 2018-10-28 03:30:00+01:00
713+
dtype: datetime64[ns, CET]
714+
715+
In some cases, inferring the DST is impossible. In such cases, you can
716+
pass an ndarray to the ambiguous parameter to set the DST explicitly
717+
718+
>>> s = pd.to_datetime(pd.Series([
719+
... '2018-10-28 01:20:00',
720+
... '2018-10-28 02:36:00',
721+
... '2018-10-28 03:46:00']))
722+
>>> s.dt.tz_localize('CET', ambiguous=np.array([True, True, False]))
723+
0 2018-10-28 01:20:00+02:00
724+
1 2018-10-28 02:36:00+02:00
725+
2 2018-10-28 03:46:00+01:00
726+
dtype: datetime64[ns, CET]
727+
688728
"""
689729
if errors is not None:
690730
warnings.warn("The errors argument is deprecated and will be "
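
The idea behind ``ambiguous='infer'`` in the docstring examples above can be sketched as an ordering heuristic: in chronologically ordered data, the point where the wall-clock time steps backward marks the switch from DST to non-DST. The function `infer_dst_flags` below is a hypothetical illustration of that idea only. It is not the pandas implementation, it knows nothing about real time zone rules, and it assumes a single fall-back transition.

```python
from datetime import datetime


def infer_dst_flags(wall_times):
    """Flag each entry True (DST) until the wall clock steps backward, then False."""
    flags = []
    seen_fall_back = False
    previous = None
    for current in wall_times:
        if previous is not None and current < previous:
            # The clock went backward: everything from here on is non-DST.
            seen_fall_back = True
        flags.append(not seen_fall_back)
        previous = current
    return flags


# The sequential data from the first docstring example.
sequential = [
    datetime(2018, 10, 28, 1, 30),
    datetime(2018, 10, 28, 2, 0),
    datetime(2018, 10, 28, 2, 30),
    datetime(2018, 10, 28, 2, 0),
    datetime(2018, 10, 28, 2, 30),
    datetime(2018, 10, 28, 3, 0),
    datetime(2018, 10, 28, 3, 30),
]
flags = infer_dst_flags(sequential)

# The sparse data from the second docstring example never steps backward.
sparse = [
    datetime(2018, 10, 28, 1, 20),
    datetime(2018, 10, 28, 2, 36),
    datetime(2018, 10, 28, 3, 46),
]
sparse_flags = infer_dst_flags(sparse)
```

For the sequential data the first three entries are flagged as DST and the remaining four as non-DST, matching the ``+02:00`` and ``+01:00`` offsets in the docstring output. The sparse data contains no backward step, so ordering alone gives no information about the ambiguous 02:36 entry, which is why the docstring passes an explicit boolean array in that case.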
