Skip to content

Catch SystemError in py3 Period.__richcmp__ #18205

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

jbrockmendel
Copy link
Member

First of ~3 to fix bugs reported in #17112. This just catches a py3-specific error in Period.__richcmp__ and raises the correct error instead. With a test that fails under the status quo.

if other.freq != self.freq:
msg = _DIFFERENT_FREQ.format(self.freqstr, other.freqstr)
raise IncompatibleFrequency(msg)
except SystemError:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think SystemError exists in py2?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least in 2.7.14 it does. I don't know about earlier off the top.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's there in 2.7.12 and 2.7.9

@@ -693,7 +693,12 @@ cdef class _Period(object):

def __richcmp__(self, other, op):
if is_period_object(other):
if other.freq != self.freq:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does other.freq != self.freq ever raise?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your guess is as good as mine. Googling turned up some similar errors in Django, a suggestion that the GIL is involved. Hopefully we'll get to the root of the problem eventually. For now I'll be happy to just raise the right error.

Ref #17112, or to reproduce in py3:

>>> ser = pd.Series([pd.Period('2017'), pd.Period('2016Q1')])
>>> ser.sort_values()
[...]
SystemError: <built-in function isinstance> returned a result with an error set

@gfyoung gfyoung added 2/3 Compat Error Reporting Incorrect or improved errors from pandas Period Period data type labels Nov 10, 2017
@jreback
Copy link
Contributor

jreback commented Nov 10, 2017

this is just patching over the fundamental error, which is the self.freq != other.freq.
let's systematically test just this and try to repro, then this will be more clear.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments

@jbrockmendel
Copy link
Member Author

this is just patching over the fundamental error, which is the self.freq != other.freq.
let's systematically test just this and try to repro, then this will be more clear.

I agree on the patching over, but have gotten nowhere on actually figuring out what's going on. Open to suggestions.

@jreback
Copy link
Contributor

jreback commented Nov 10, 2017

I agree on the patching over, but have gotten nowhere on actually figuring out what's going on. Open to suggestions.

print out the case where this happens (e.g. both values). then add a test which systematically goes thru all the possibilities (e.g. maybe its a string comparing an offset or something). Then we can see exactly where the error is. having systematic unit tests is very important for basic things like this.

@jbrockmendel
Copy link
Member Author

I think there may be a problem in the CI. Travis icon has said it is running for something like 2 days.

@jreback
Copy link
Contributor

jreback commented Nov 12, 2017

pls rebase on master and follow my comments above.

@jbrockmendel
Copy link
Member Author

I'll try to follow that suggestion. The CI thing might need human intervention.

@jbrockmendel
Copy link
Member Author

print out the case where this happens (e.g. both values). then add a test which systematically goes thru all the possibilities (e.g. maybe its a string comparing an offset or something).

AFAICT it isn't happening in scalar comparisons, only in index/series/frame sort_values (but not sort_index). The broken operations all seem to go through values.argsort().

In offsets if I wrap isinstance(other, compat.string_types) in a try/except that returns False on error, the bug is fixed. So the question is: what is it about np.argsort that messes with isinstance?

@jbrockmendel
Copy link
Member Author

Closing because I'm concerned the never-ending Travis build may be clogging up the pipeline for others. Will return to this bug eventually.

@jreback jreback mentioned this pull request Nov 29, 2017
4 tasks
@jbrockmendel jbrockmendel deleted the period_bugs branch April 5, 2020 17:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Error Reporting Incorrect or improved errors from pandas Period Period data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants