Commit 8592fbd

Merge remote-tracking branch 'upstream/master' into bisect
2 parents 7a404a7 + b5e4e2e commit 8592fbd

58 files changed

+484
-422
lines changed

.github/ISSUE_TEMPLATE/documentation_improvement.md

-22
This file was deleted.
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,40 @@
1+
name: Documentation Improvement
2+
description: Report wrong or missing documentation
3+
title: "DOC: "
4+
labels: [Docs, Needs Triage]
5+
6+
body:
7+
- type: checkboxes
8+
attributes:
9+
options:
10+
- label: >
11+
I have checked that the issue still exists on the latest versions of the docs
12+
on `master` [here](https://pandas.pydata.org/docs/dev/)
13+
required: true
14+
- type: textarea
15+
id: location
16+
attributes:
17+
label: Location of the documentation
18+
description: >
19+
Please provide the location of the documentation, e.g. "pandas.read_csv" or the
20+
URL of the documentation, e.g.
21+
"https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html"
22+
placeholder: https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
23+
validations:
24+
required: true
25+
- type: textarea
26+
id: problem
27+
attributes:
28+
label: Documentation problem
29+
description: >
30+
Please provide a description of what documentation you believe needs to be fixed/improved
31+
validations:
32+
required: true
33+
- type: textarea
34+
id: suggested-fix
35+
attributes:
36+
label: Suggested fix for documentation
37+
description: >
38+
Please explain the suggested fix and **why** it's better than the existing documentation
39+
validations:
40+
required: true

doc/source/development/contributing_codebase.rst

+2-2
@@ -181,7 +181,7 @@ run this command, though it may take longer::
181181

182182
git diff upstream/master --name-only -- "*.py" | xargs -r flake8
183183

184-
Note that on OSX, the ``-r`` flag is not available, so you have to omit it and
184+
Note that on macOS, the ``-r`` flag is not available, so you have to omit it and
185185
run this slightly modified command::
186186

187187
git diff upstream/master --name-only -- "*.py" | xargs flake8
@@ -244,7 +244,7 @@ Alternatively, you can run a command similar to what was suggested for ``black``
244244

245245
git diff upstream/master --name-only -- "*.py" | xargs -r isort
246246

247-
Where similar caveats apply if you are on OSX or Windows.
247+
Where similar caveats apply if you are on macOS or Windows.
248248

249249
You can then verify the changes look ok, then git :any:`commit <contributing.commit-code>` and :any:`push <contributing.push-code>`.
250250

doc/source/user_guide/timeseries.rst

+4-2
@@ -204,16 +204,18 @@ If you use dates which start with the day first (i.e. European style),
204204
you can pass the ``dayfirst`` flag:
205205

206206
.. ipython:: python
207+
:okwarning:
207208
208209
pd.to_datetime(["04-01-2012 10:00"], dayfirst=True)
209210
210211
pd.to_datetime(["14-01-2012", "01-14-2012"], dayfirst=True)
211212
212213
.. warning::
213214

214-
You see in the above example that ``dayfirst`` isn't strict, so if a date
215+
You see in the above example that ``dayfirst`` isn't strict. If a date
215216
can't be parsed with the day being first it will be parsed as if
216-
``dayfirst`` were False.
217+
``dayfirst`` were False, and in the case of parsing delimited date strings
218+
(e.g. ``31-12-2012``) then a warning will also be raised.
217219

218220
If you pass a single string to ``to_datetime``, it returns a single ``Timestamp``.
219221
``Timestamp`` can also accept string input, but it doesn't accept string parsing

doc/source/whatsnew/v1.4.0.rst

+14-3
@@ -103,10 +103,20 @@ Notable bug fixes
103103

104104
These are bug fixes that might have notable behavior changes.
105105

106-
.. _whatsnew_140.notable_bug_fixes.notable_bug_fix1:
106+
.. _whatsnew_140.notable_bug_fixes.inconsistent_date_string_parsing:
107107

108-
notable_bug_fix1
109-
^^^^^^^^^^^^^^^^
108+
Inconsistent date string parsing
109+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
110+
111+
The ``dayfirst`` option of :func:`to_datetime` isn't strict, and this can lead to surprising behaviour:
112+
113+
.. ipython:: python
114+
:okwarning:
115+
116+
pd.to_datetime(["31-12-2021"], dayfirst=False)
117+
118+
Now, a warning will be raised if a date string cannot be parsed accordance to the given ``dayfirst`` value when
119+
the value is a delimited date string (e.g. ``31-12-2012``).
110120

111121
.. _whatsnew_140.notable_bug_fixes.notable_bug_fix2:
112122

@@ -253,6 +263,7 @@ Categorical
253263
Datetimelike
254264
^^^^^^^^^^^^
255265
- Bug in :class:`DataFrame` constructor unnecessarily copying non-datetimelike 2D object arrays (:issue:`39272`)
266+
- :func:`to_datetime` would silently swap ``MM/DD/YYYY`` and ``DD/MM/YYYY`` formats if the given ``dayfirst`` option could not be respected - now, a warning is raised in the case of delimited date strings (e.g. ``31-12-2012``) (:issue:`12585`)
256267
-
257268

258269
Timedelta

pandas/__init__.py

+2-2
@@ -207,7 +207,7 @@ def __getattr__(name):
207207
warnings.warn(
208208
"The pandas.np module is deprecated "
209209
"and will be removed from pandas in a future version. "
210-
"Import numpy directly instead",
210+
"Import numpy directly instead.",
211211
FutureWarning,
212212
stacklevel=2,
213213
)
@@ -218,7 +218,7 @@ def __getattr__(name):
218218
elif name in {"SparseSeries", "SparseDataFrame"}:
219219
warnings.warn(
220220
f"The {name} class is removed from pandas. Accessing it from "
221-
"the top-level namespace will also be removed in the next version",
221+
"the top-level namespace will also be removed in the next version.",
222222
FutureWarning,
223223
stacklevel=2,
224224
)

pandas/_libs/parsers.pyx

+1-1
@@ -1000,7 +1000,7 @@ cdef class TextReader:
10001000
if col_dtype is not None:
10011001
warnings.warn((f"Both a converter and dtype were specified "
10021002
f"for column {name} - only the converter will "
1003-
f"be used"), ParserWarning,
1003+
f"be used."), ParserWarning,
10041004
stacklevel=5)
10051005
results[i] = _apply_converter(conv, self.parser, i, start, end)
10061006
continue

pandas/_libs/tslibs/nattype.pyx

+1-1
@@ -143,7 +143,7 @@ cdef class _NaT(datetime):
143143
return True
144144
warnings.warn(
145145
"Comparison of NaT with datetime.date is deprecated in "
146-
"order to match the standard library behavior. "
146+
"order to match the standard library behavior. "
147147
"In a future version these will be considered non-comparable.",
148148
FutureWarning,
149149
stacklevel=1,

pandas/_libs/tslibs/offsets.pyx

+2-2
@@ -696,15 +696,15 @@ cdef class BaseOffset:
696696

697697
def onOffset(self, dt) -> bool:
698698
warnings.warn(
699-
"onOffset is a deprecated, use is_on_offset instead",
699+
"onOffset is a deprecated, use is_on_offset instead.",
700700
FutureWarning,
701701
stacklevel=1,
702702
)
703703
return self.is_on_offset(dt)
704704

705705
def isAnchored(self) -> bool:
706706
warnings.warn(
707-
"isAnchored is a deprecated, use is_anchored instead",
707+
"isAnchored is a deprecated, use is_anchored instead.",
708708
FutureWarning,
709709
stacklevel=1,
710710
)

pandas/_libs/tslibs/parsing.pyx

+24
@@ -3,6 +3,7 @@ Parsing functions for datetime and datetime-like strings.
33
"""
44
import re
55
import time
6+
import warnings
67

78
from libc.string cimport strchr
89

@@ -81,6 +82,11 @@ class DateParseError(ValueError):
8182
_DEFAULT_DATETIME = datetime(1, 1, 1).replace(hour=0, minute=0,
8283
second=0, microsecond=0)
8384

85+
PARSING_WARNING_MSG = (
86+
"Parsing '{date_string}' in {format} format. Provide format "
87+
"or specify infer_datetime_format=True for consistent parsing."
88+
)
89+
8490
cdef:
8591
set _not_datelike_strings = {'a', 'A', 'm', 'M', 'p', 'P', 't', 'T'}
8692

@@ -168,10 +174,28 @@ cdef inline object _parse_delimited_date(str date_string, bint dayfirst):
168174
# date_string can't be converted to date, above format
169175
return None, None
170176

177+
swapped_day_and_month = False
171178
if 1 <= month <= MAX_DAYS_IN_MONTH and 1 <= day <= MAX_DAYS_IN_MONTH \
172179
and (month <= MAX_MONTH or day <= MAX_MONTH):
173180
if (month > MAX_MONTH or (day <= MAX_MONTH and dayfirst)) and can_swap:
174181
day, month = month, day
182+
swapped_day_and_month = True
183+
if dayfirst and not swapped_day_and_month:
184+
warnings.warn(
185+
PARSING_WARNING_MSG.format(
186+
date_string=date_string,
187+
format='MM/DD/YYYY'
188+
),
189+
stacklevel=4,
190+
)
191+
elif not dayfirst and swapped_day_and_month:
192+
warnings.warn(
193+
PARSING_WARNING_MSG.format(
194+
date_string=date_string,
195+
format='DD/MM/YYYY'
196+
),
197+
stacklevel=4,
198+
)
175199
if PY_VERSION_HEX >= 0x03060100:
176200
# In Python <= 3.6.0 there is no range checking for invalid dates
177201
# in C api, thus we call faster C version for 3.6.1 or newer
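The day/month swap and the two new warning branches in this hunk can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the Cython implementation: `resolve_day_month` is a hypothetical helper, it only handles `-`-delimited `XX-XX-YYYY` strings, and it leaves out the `can_swap` flag and the other layouts that `_parse_delimited_date` deals with. The constants and the message template are copied from the diff.

```python
import warnings

MAX_DAYS_IN_MONTH = 31
MAX_MONTH = 12

PARSING_WARNING_MSG = (
    "Parsing '{date_string}' in {format} format. Provide format "
    "or specify infer_datetime_format=True for consistent parsing."
)


def resolve_day_month(date_string, dayfirst):
    """Hypothetical helper: return (day, month, year), or None if not date-like."""
    first, second, year = (int(part) for part in date_string.split("-"))
    # Start from the month-first reading, as the hunk above does.
    month, day = first, second

    plausible = (
        1 <= month <= MAX_DAYS_IN_MONTH
        and 1 <= day <= MAX_DAYS_IN_MONTH
        and (month <= MAX_MONTH or day <= MAX_MONTH)
    )
    if not plausible:
        return None

    swapped_day_and_month = False
    # Swap when the month-first reading is impossible, or when the caller
    # asked for day-first and the swap still yields a valid month.
    if month > MAX_MONTH or (day <= MAX_MONTH and dayfirst):
        day, month = month, day
        swapped_day_and_month = True

    if dayfirst and not swapped_day_and_month:
        # dayfirst=True was requested but could not be honoured.
        warnings.warn(
            PARSING_WARNING_MSG.format(date_string=date_string, format="MM/DD/YYYY"),
            stacklevel=2,
        )
    elif not dayfirst and swapped_day_and_month:
        # dayfirst=False was requested but the string only parses day-first.
        warnings.warn(
            PARSING_WARNING_MSG.format(date_string=date_string, format="DD/MM/YYYY"),
            stacklevel=2,
        )
    return day, month, year


if __name__ == "__main__":
    print(resolve_day_month("04-01-2012", dayfirst=True))   # honoured, no warning
    print(resolve_day_month("31-12-2021", dayfirst=False))  # warns: DD/MM/YYYY
    print(resolve_day_month("01-14-2012", dayfirst=True))   # warns: MM/DD/YYYY
```

In this sketch a warning is emitted only when the requested `dayfirst` value and the reading actually used disagree, which matches the two `warnings.warn` branches added in the hunk.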

pandas/_libs/tslibs/timedeltas.pyx

+1-1
@@ -571,7 +571,7 @@ cdef inline timedelta_from_spec(object number, object frac, object unit):
571571
if unit in ["M", "Y", "y"]:
572572
warnings.warn(
573573
"Units 'M', 'Y' and 'y' do not represent unambiguous "
574-
"timedelta values and will be removed in a future version",
574+
"timedelta values and will be removed in a future version.",
575575
FutureWarning,
576576
stacklevel=2,
577577
)

pandas/_libs/tslibs/timestamps.pyx

+5-5
@@ -171,7 +171,7 @@ cdef class _Timestamp(ABCTimestamp):
171171
@property
172172
def freq(self):
173173
warnings.warn(
174-
"Timestamp.freq is deprecated and will be removed in a future version",
174+
"Timestamp.freq is deprecated and will be removed in a future version.",
175175
FutureWarning,
176176
stacklevel=1,
177177
)
@@ -235,8 +235,8 @@ cdef class _Timestamp(ABCTimestamp):
235235
# We follow the stdlib datetime behavior of never being equal
236236
warnings.warn(
237237
"Comparison of Timestamp with datetime.date is deprecated in "
238-
"order to match the standard library behavior. "
239-
"In a future version these will be considered non-comparable."
238+
"order to match the standard library behavior. "
239+
"In a future version these will be considered non-comparable. "
240240
"Use 'ts == pd.Timestamp(date)' or 'ts.date() == date' instead.",
241241
FutureWarning,
242242
stacklevel=1,
@@ -425,7 +425,7 @@ cdef class _Timestamp(ABCTimestamp):
425425
warnings.warn(
426426
"Timestamp.freq is deprecated and will be removed in a future "
427427
"version. When you have a freq, use "
428-
f"freq.{field}(timestamp) instead",
428+
f"freq.{field}(timestamp) instead.",
429429
FutureWarning,
430430
stacklevel=1,
431431
)
@@ -858,7 +858,7 @@ cdef class _Timestamp(ABCTimestamp):
858858
NaT
859859
"""
860860
if self.nanosecond != 0 and warn:
861-
warnings.warn("Discarding nonzero nanoseconds in conversion",
861+
warnings.warn("Discarding nonzero nanoseconds in conversion.",
862862
UserWarning, stacklevel=2)
863863

864864
return datetime(self.year, self.month, self.day,

pandas/core/arrays/datetimelike.py

+1-1
@@ -1186,7 +1186,7 @@ def _addsub_object_array(self, other: np.ndarray, op):
11861186

11871187
warnings.warn(
11881188
"Adding/subtracting object-dtype array to "
1189-
f"{type(self).__name__} not vectorized",
1189+
f"{type(self).__name__} not vectorized.",
11901190
PerformanceWarning,
11911191
)
11921192

pandas/core/arrays/datetimes.py

+4-4
@@ -744,7 +744,7 @@ def _add_offset(self, offset) -> DatetimeArray:
744744

745745
except NotImplementedError:
746746
warnings.warn(
747-
"Non-vectorized DateOffset being applied to Series or DatetimeIndex",
747+
"Non-vectorized DateOffset being applied to Series or DatetimeIndex.",
748748
PerformanceWarning,
749749
)
750750
result = self.astype("O") + offset
@@ -1186,8 +1186,8 @@ def to_perioddelta(self, freq) -> TimedeltaArray:
11861186
# Deprecaation GH#34853
11871187
warnings.warn(
11881188
"to_perioddelta is deprecated and will be removed in a "
1189-
"future version. "
1190-
"Use `dtindex - dtindex.to_period(freq).to_timestamp()` instead",
1189+
"future version. "
1190+
"Use `dtindex - dtindex.to_period(freq).to_timestamp()` instead.",
11911191
FutureWarning,
11921192
# stacklevel chosen to be correct for when called from DatetimeIndex
11931193
stacklevel=3,
@@ -1353,7 +1353,7 @@ def weekofyear(self):
13531353
warnings.warn(
13541354
"weekofyear and week have been deprecated, please use "
13551355
"DatetimeIndex.isocalendar().week instead, which returns "
1356-
"a Series. To exactly reproduce the behavior of week and "
1356+
"a Series. To exactly reproduce the behavior of week and "
13571357
"weekofyear and return an Index, you may call "
13581358
"pd.Int64Index(idx.isocalendar().week)",
13591359
FutureWarning,

pandas/core/arrays/sparse/array.py

+1-1
@@ -407,7 +407,7 @@ def __init__(
407407
if is_datetime64tz_dtype(data.dtype):
408408
warnings.warn(
409409
f"Creating SparseArray from {data.dtype} data "
410-
"loses timezone information. Cast to object before "
410+
"loses timezone information. Cast to object before "
411411
"sparse to retain timezone information.",
412412
UserWarning,
413413
stacklevel=2,

pandas/core/computation/align.py

+1-1
@@ -124,7 +124,7 @@ def _align_core(terms):
124124
w = (
125125
f"Alignment difference on axis {axis} is larger "
126126
f"than an order of magnitude on term {repr(terms[i].name)}, "
127-
f"by more than {ordm:.4g}; performance may suffer"
127+
f"by more than {ordm:.4g}; performance may suffer."
128128
)
129129
warnings.warn(w, category=PerformanceWarning, stacklevel=6)
130130

pandas/core/computation/expressions.py

+1-1
@@ -214,7 +214,7 @@ def _bool_arith_fallback(op_str, a, b):
214214
warnings.warn(
215215
f"evaluating in Python space because the {repr(op_str)} "
216216
"operator is not supported by numexpr for the bool dtype, "
217-
f"use {repr(_BOOL_OP_UNSUPPORTED[op_str])} instead"
217+
f"use {repr(_BOOL_OP_UNSUPPORTED[op_str])} instead."
218218
)
219219
return True
220220
return False

pandas/core/construction.py

+1-1
@@ -767,7 +767,7 @@ def _try_cast(
767767
f"Could not cast to {dtype}, falling back to object. This "
768768
"behavior is deprecated. In a future version, when a dtype is "
769769
"passed to 'DataFrame', either all columns will be cast to that "
770-
"dtype, or a TypeError will be raised",
770+
"dtype, or a TypeError will be raised.",
771771
FutureWarning,
772772
stacklevel=7,
773773
)

pandas/core/dtypes/cast.py

+1-1
@@ -2085,7 +2085,7 @@ def maybe_cast_to_integer_array(
20852085
warnings.warn(
20862086
f"Constructing Series or DataFrame from {arr.dtype} values and "
20872087
f"dtype={dtype} is deprecated and will raise in a future version. "
2088-
"Use values.view(dtype) instead",
2088+
"Use values.view(dtype) instead.",
20892089
FutureWarning,
20902090
stacklevel=find_stack_level(),
20912091
)

pandas/core/dtypes/common.py

+2-2
@@ -302,8 +302,8 @@ def is_categorical(arr) -> bool:
302302
True
303303
"""
304304
warnings.warn(
305-
"is_categorical is deprecated and will be removed in a future version. "
306-
"Use is_categorical_dtype instead",
305+
"is_categorical is deprecated and will be removed in a future version. "
306+
"Use is_categorical_dtype instead.",
307307
FutureWarning,
308308
stacklevel=2,
309309
)

pandas/core/frame.py

+3-3
@@ -4542,9 +4542,9 @@ def lookup(
45424542
The found values.
45434543
"""
45444544
msg = (
4545-
"The 'lookup' method is deprecated and will be"
4546-
"removed in a future version."
4547-
"You can use DataFrame.melt and DataFrame.loc"
4545+
"The 'lookup' method is deprecated and will be "
4546+
"removed in a future version. "
4547+
"You can use DataFrame.melt and DataFrame.loc "
45484548
"as a substitute."
45494549
)
45504550
warnings.warn(msg, FutureWarning, stacklevel=2)
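
The removed and added lines in this hunk differ only in trailing spaces. That matters because Python joins adjacent string literals with no separator. A small self-contained illustration follows; the variable names `old_msg` and `new_msg` are hypothetical, and only the message text comes from the diff.

```python
# Message as written before the change: no trailing spaces on the first three parts.
old_msg = (
    "The 'lookup' method is deprecated and will be"
    "removed in a future version."
    "You can use DataFrame.melt and DataFrame.loc"
    "as a substitute."
)

# Message after the change: each part that continues ends with a space.
new_msg = (
    "The 'lookup' method is deprecated and will be "
    "removed in a future version. "
    "You can use DataFrame.melt and DataFrame.loc "
    "as a substitute."
)

if __name__ == "__main__":
    print(old_msg)  # words run together at every literal boundary
    print(new_msg)
```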
