Skip to content

REGR: NotImplementedError: Prefix not defined when slicing offset with loc #47547

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Aug 23, 2022

Conversation

CloseChoice
Copy link
Member

@CloseChoice CloseChoice commented Jun 29, 2022

@pep8speaks
Copy link

pep8speaks commented Jun 29, 2022

Hello @CloseChoice! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-08-22 20:30:59 UTC

@CloseChoice
Copy link
Member Author

CloseChoice commented Jun 29, 2022

The problem in here is that we expect different types in this parsing.parse_time_string which are not consistent with the rule_code attribute, e.g.
pandas/tests/series/indexing/test_datetime.py::test_getitem_setitem_periodindex where the type of self.freq is pandas._libs.tslibs.offsets.Hour. This class implements the _prefix attribute.
But in the newly added test test_getitem_iloc_dateoffset_days self.freq has the type pandas._libs.tslibs.offsets.DateOffset which does inherit the _prefix method from BaseOffset as an NotImplementedError.

Is this on purpose? Shouldn't we be able to fix this issue by just implementing the _prefix method correctly?

@CloseChoice CloseChoice requested a review from jbrockmendel June 29, 2022 09:31
@CloseChoice CloseChoice changed the title fix regression FIX: NotImplementedErorr: Prefix not defined when slicing offset with loc Jun 29, 2022
@CloseChoice CloseChoice changed the title FIX: NotImplementedErorr: Prefix not defined when slicing offset with loc REGR: NotImplementedErorr: Prefix not defined when slicing offset with loc Jun 29, 2022
Copy link
Member

@phofl phofl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I think we should figure out first, if _prefix is supposed to be defined

@simonjayhawkins simonjayhawkins added Indexing Related to indexing on series/frames, not to indexes themselves Datetime Datetime data dtype labels Jul 2, 2022
@simonjayhawkins simonjayhawkins added this to the 1.4.4 milestone Jul 2, 2022
@CloseChoice CloseChoice changed the title REGR: NotImplementedErorr: Prefix not defined when slicing offset with loc REGR: NotImplementedErrorr: Prefix not defined when slicing offset with loc Jul 16, 2022
@CloseChoice CloseChoice changed the title REGR: NotImplementedErrorr: Prefix not defined when slicing offset with loc REGR: NotImplementedError: Prefix not defined when slicing offset with loc Jul 16, 2022
@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Aug 16, 2022
@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Aug 20, 2022

@CloseChoice thanks for the PR! (and sorry for that automated comment, it's not you who didn't update the PR, but it's a review that was lacking)

I merged in main to fix the conflicts

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

I think we can try to clean up the situation around rule_code in another PR that is not targeted for 1.4.x (should DateOffset have that defined? Should we actually pass something (greqstr/inferred_freq) to the parse function if rule_code isn't defined?)

if self.freq is None or hasattr(self.freq, "rule_code"):
freq = self.freq
except NotImplementedError:
freq = getattr(self, "freqstr", getattr(self, "inferred_freq", None))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice, I don't think we actually run into cases that need this. For example, freqstr for the DateOffset case gives:

In [3]: d.index.freq
Out[3]: <DateOffset: days=1>

In [4]: d.index.freqstr
Out[4]: '<DateOffset: days=1>'

This "<DateOffset: days=1>" string is never going to be useful. But in practice, it also seems that parsing.parse_time_string only actually uses freq in very few cases. From a quick look, it only seems to use it if it is exactly equal to "M", or in case of parsing quarters. Here I could create a snippet where passing the freq actually impacts the result:

In [6]: parsing.parse_time_string("4q2022", "A-DEC")
Out[6]: (datetime.datetime(2022, 10, 1, 0, 0), 'quarter')

In [7]: parsing.parse_time_string("4q2022", "A-NOV")
Out[7]: (datetime.datetime(2022, 9, 1, 0, 0), 'quarter')

But, so this only is for quarterly offsets, and those do define a rule_code, so you probably don't run into this issue with that kind of freqs.

This line freq = getattr(self, "freqstr", getattr(self, "inferred_freq", None)) is restoring what we did before #42149, so that's good for a regression fix in the bug-fix release. But we should maybe also see if we can just remove it later on (and just pass None here if rule_code isn't defined?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked passing None if rule_code isn't defined and a couple of tests fail in this case. Didn't look into it further though. But I think it should be possible to improve the situation around rule_code.

@CloseChoice
Copy link
Member Author

Looks good to me.

I think we can try to clean up the situation around rule_code in another PR that is not targeted for 1.4.x (should DateOffset have that defined? Should we actually pass something (greqstr/inferred_freq) to the parse function if rule_code isn't defined?)

First things first: Thanks once again for helping out and the very friendly tone of your comments. I really appreciate it.

Yes, I totally agree that we should clean up rule_code at some point, I am also very willing to dive into that.
When I investigated the situation around rule_code there are a couple of things we could do:

  1. One thing that came to my mind is to add rule_code for DateOffsets. This would help us get rid of the except part in this PR
  2. Refactor parse_date_string to not need this information at all. As you mentioned in the other freq is actually used in very limited cases (as I can see, it only used in the chain parse_time_string -> parse_datetime_string_with_reso -> _parse_dateabbr_string -> quarter_to_myear -> get_rule_month and in _parse_dateabbr_string where it is checked against M. So there might be a good way of working around this by finding the rule_month (as given by get_rule_month) and checking for M first and implementing a special case for this somewhere not so deep down in the chain.

There are also a couple todos written in the code, so I think we could improve here quite some things.

@CloseChoice CloseChoice force-pushed the 2022-06-29-FIX-46671 branch from e7c2a26 to 54bd158 Compare August 21, 2022 14:41
Copy link
Member

@simonjayhawkins simonjayhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @CloseChoice typo to fix otherwise lgtm

@jorisvandenbossche
Copy link
Member

Thanks @CloseChoice !

When I investigated the situation around rule_code there are a couple of things we could do:

Could you move that summary to a new issue to track this?

phofl pushed a commit that referenced this pull request Aug 23, 2022
… not defined when slicing offset with loc) (#48209)

Backport PR #47547: REGR: NotImplementedError: Prefix not defined when slicing offset with loc

Co-authored-by: Tobias Pitters <[email protected]>
noatamir pushed a commit to noatamir/pandas that referenced this pull request Nov 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: NotImplementedError: Prefix not defined when slicing offset with loc
6 participants