Skip to content

DEPR: lower/uppercase strings such as 'y', 'q', 'H', 'MIN', etc. denoting freqs/units for time series, period, and timedelta #56346

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

natmokval
Copy link
Contributor

@natmokval natmokval commented Dec 5, 2023

xref #52064, #54061

Deprecated lowercase strings in favour of uppercase strings denoting:

  • period aliases weekly, monthly and bigger frequency
  • offsets aliases representing DateOffset subclasses that are a week or bigger (Week, MonthBegin, MonthEnd, etc.)
  • timedelta units for week, month or bigger

Deprecated uppercase strings in favour of lowercase strings denoting:

  • period aliases hourly, minutely or smaller frequency
  • offsets aliases representing DateOffset subclasses that are an hour or smaller (Hour, Minute, etc.)
  • timedelta units for hour, minute or smaller

f"{name}, failed to parse with error message: {repr(err)}")
f"{name_orig}, failed to parse with error message: {repr(err)}")
)
if name != name_orig and name_orig != "d":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry could you remind me why we're special-casing 'd' please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, my mistake. I did this because I got failures in tests for freq="d", e.g.

pandas/tests/resample/test_base.py -k test_resample_empty_dtypes
pandas/tests/tslibs/test_to_offset.py -k test_to_offset_leading_plus

but I had to replace deprecated lowercase frequency 'd' with the uppercase 'D'. Am I correct, we want to use for Day uppercase 'D'?
I will remove the check name_orig != "d" and will correct tests.

Comment on lines 4677 to 4680
if is_period is False and name.upper() in c_OFFSET_DEPR_FREQSTR:
if name == "m" and (n > 0 or stride == "-"):
name = name
else:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is super-complicated, I can't understand it anymore

would it work to just keep this part as it is, and at the beginning of _get_offset, to add

    if name not in _lite_rule_alias and (name.upper() in _lite_rule_alias):
        warnings.warn(
            f"\'{name}\' is deprecated and will be removed in a future "
            f"version, please use \'{name.upper()}\' instead.",
            FutureWarning,
            stacklevel=find_stack_level(),
        )
    elif name not in _lite_rule_alias and (name.lower() in _lite_rule_alias):
        warnings.warn(
            f"\'{name}\' is deprecated and will be removed in a future "
            f"version, please use \'{name.lower()}\' instead.",
            FutureWarning,
            stacklevel=find_stack_level(),
        )

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the comment, I added it in _get_offset() and we do need this in some cases. But it only works for frequencies in _lite_rule_alias.
The problem I tried to resolve by doing the check

if name == "m" and (n > 0 or stride == "-"):

was for frequencies like "2h30m". When we pass this freq to to_offset() we show our warning: 'm' is deprecated and will be removed in a future version, please use 'ME' instead., but we shouldn't.

Comment on lines 4673 to 4675
name not in _lite_rule_alias
and (name.upper() in _lite_rule_alias)
and name != "ms"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separate to this PR, but can we move 'ms', 'us', 'ns', and 'min' from _lite_rule_alias? _lit_rule_alias is only ever used with .get(name, name), so I don't think they need to be in it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I did as you suggested and removed "min", ”ms”, “ns", and “us" from _lite_rule_alias in separate PR #56516. I added these aliases to the list _dont_uppercase, because otherwise we uppercase them in _get_offset(). could you please take a look at this PR?

I am not sure, do we need “Min” in _lite_rule_alias? We want to deprecate the alias “Min” and left only lowercase “min” for Minutes. Am I correct?

Comment on lines 4786 to 4787
if n > 0 or stride == "-":
name = name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks very complex, and not sure it's necessary anyway?

Copy link
Contributor Author

@natmokval natmokval Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but I think it's the only way to pass test_to_offset_invalid() in pandas/tests/tslibs/test_to_offset.py
The reason is: when we parse frequency like "2h20m" or "-m" we show our warning FutureWarning: 'm' is deprecated and will be removed in a future version, please use 'ME' instead.
But here 'm' is offsets.Minute(), not offsets.MonthEnd()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure but '2h20m' is already invalid to begin with?

I'm not sure we should worry about a false positive warning being raised for frequencies which were invalid to begin with

Copy link
Contributor Author

@natmokval natmokval Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, understood. I removed from to_offset() this unnecessary check and corrected parameters in test_to_offset_invalid to avoid raising the false positive warning

i2 = Period("1982", freq="MIN")
i2 = Period("1982-01-01 00:00")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why change this one completely? I'd suggest just to assert that the warning is raised and still check they're equal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why change this one completely?

I wanted to do assert and I thought this is proper way to define i2.

I'd suggest just to assert that the warning is raised and still check they're equal

thanks, it's a good idea. I did as you suggested.

@natmokval natmokval marked this pull request as ready for review December 21, 2023 09:59
@natmokval natmokval changed the title DEPR: lowercase strings denoting freq for Week, MonthBegin, MonthEnd, etc. DEPR: lower/uppercase strings such as 'y', 'q', 'H', 'MIN', etc. denoting freqs/units for time series, period, and timedelta Dec 21, 2023
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comment, but I think this looks good!

Deprecate lowercase strings ``w``, ``m``, ``q``, etc. and uppercase strings ``H``, ``MIN``, ``S``, etc. for time series, period, and timedelta
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Previously, both ``'h'`` and ``'H'`` would allowed for 'hour' offset/period alias. We now require the case to be correct - check the :ref:`offset aliases <timeseries.offset_aliases>` and the :ref:`period aliases <timeseries.period_aliases>` parts of the docs and make sure you're using the correct one (:issue:`56346`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"would allowed" -> "were allowed"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you, I corrected this.

@MarcoGorelli MarcoGorelli mentioned this pull request Dec 21, 2023
Comment on lines 4806 to 4816
elif (is_period is False and
name != name.upper() and
name.upper() in c_REVERSE_OFFSET_DEPR_FREQSTR):
warnings.warn(
f"\'{name}\' is deprecated and will be removed "
f"in a future version, please use "
f"\'{name.upper()}\' instead.",
FutureWarning,
stacklevel=find_stack_level(),
)
name = name.upper()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea here that if someone passes 'me' instead of 'ME', then it will warn?

I'm not sure this is the right place for this warning to be honest - like this it'll only show the warning for lowercase versions of aliases which have been renamed. For example:

In [5]: to_offset('qs')
Out[5]: <QuarterBegin: startingMonth=1>

doesn't emit any warning, and similarly for others which aren't part of that list

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for noticing this, it's my mistake. I fixed it and updated the PR.

@@ -4795,40 +4794,60 @@ cpdef to_offset(freq, bint is_period=False):

tups = zip(split[0::4], split[1::4], split[2::4])
for n, (sep, stride, name) in enumerate(tups):
if is_period is False and name in c_OFFSET_DEPR_FREQSTR:
if is_period is False and name.upper() in c_OFFSET_DEPR_FREQSTR:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, little nitpick, but if not is_period is generally preferred over if is_period is False (and similarly for the True counterparts)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, if not is_period looks much better indeed. I replaced if is_period is False with it (and if is_period is True with if is_period).

@natmokval natmokval added Deprecate Functionality to remove in pandas Frequency DateOffsets labels Jan 16, 2024
mps01060 added a commit to mps01060/intense-qc that referenced this pull request Feb 5, 2024
Update time offsets to reflect changes in more recent versions of
pandas (pandas-dev/pandas#56346).
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@MarcoGorelli
Copy link
Member

shall we mark this as draft until we've enforced the deprecations already in?

@natmokval natmokval marked this pull request as draft March 13, 2024 11:39
@natmokval
Copy link
Contributor Author

shall we mark this as draft until we've enforced the deprecations already in?

thank you, good idea. I marked this PR as draft.

@natmokval
Copy link
Contributor Author

partially closed via #59051 and #58998.
It's still allowed to use mixed cases such as: "Min", "Us", "uS", "Ns", etc.

@natmokval natmokval closed this Jun 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas Frequency DateOffsets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants