-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
REF: make BaseOffset a cdef class #34167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REF: make BaseOffset a cdef class #34167
Conversation
you can figure out what you need to patch by monkey patching the unpickler itself and stepping thru it’s going to be one of the dist instructions i think (i did this a number of years ago so don’t remember much more) |
Very tentative, but it looks like we can make a context to temporarily patch pickle to use the Unpickler in compat.pickle_compat during calls to |
Booyah green |
@@ -620,6 +628,8 @@ class _BaseOffset: | |||
TimeStamp | |||
Rolled timestamp if not on offset, otherwise unchanged timestamp. | |||
""" | |||
# TODO: try to avoid runtime/circular import | |||
from pandas import Timestamp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you not cimport this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need Timestamp (not cimport-able), not _Timestamp (cimport-able) here. We could re-instate a as_timestamp
func to only do the runtime import in one place, and after a (fast) isinstance check.
I asked on the cython mailing list yesterday about handling of circular cimports. It seems pretty hit-and-miss.
if group is None: | ||
raise KeyError(f"No object named {key} in the file") | ||
return self._read_group(group) | ||
with patch_pickle(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems pretty invasive to do this here. can we not do this in the pickle_compat.py itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the unpickling is done with a call to pickle.loads
from within pytables (not our pytables code, but pytables' own code). the effect of this patching here is to make it use our pickle_compat code instead of the stdlib version. if there's a less invasive way to do this thatd be great, but i haven't found it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok i understand, can you add some comments here to that effect
ok will have another look, pls rebase |
…f-liboffsets-cdefit
updated; npdev failure is unrelated AFAICT |
rebased+green |
pandas/_libs/tslibs/offsets.pyx
Outdated
@@ -111,6 +111,14 @@ def as_datetime(obj: datetime) -> datetime: | |||
return obj | |||
|
|||
|
|||
cdef ABCTimestamp as_timestamp(obj): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we removed this previously? what is the purpose of this now? to centrailize the import?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to centrailize the import?
Yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would only have 2 times where you actually need to import Timestamp (and you don't even need to check if its ABC); i would just do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated+green
if group is None: | ||
raise KeyError(f"No object named {key} in the file") | ||
return self._read_group(group) | ||
with patch_pickle(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok i understand, can you add some comments here to that effect
…f-liboffsets-cdefit
…f-liboffsets-cdefit
updated+green |
very nice |
* idxmaxmin: pandas-dev/pandas#32579 * upstream fix * first_last: pandas-dev/pandas#34167 * test_dropna: unclear
@jreback i could use help here with one last stubborn pickle/pytables test
test_read_py2_hdf_file_in_py3
Within this legacy file we have the substring (sub-bytes?)
blob = b"ccopy_reg\n_reconstructor\np1\n(cpandas.tseries.offsets\nBusinessDay\np2\nc__builtin__\nobject\np3\nNtRp4\n(dp5\nS'normalize'\np6\nI00\nsS'n'\nI1\nsS'kwds'\np7\n(dp8\nsS'offset'\np9\ncdatetime\ntimedelta\np10\n(I0\nI0\nI0\ntRp11\nsb."
AFAICT it is doing
pickle.loads(blob)
and getting:Then it is falling back and for reasons not clear to me is calling
to_offset(blob)
which raisesValueError
which is what pytest shows us.IIUC there is a way to register functions for unpickling, but none of the functions in pd.compat.pickle_compat are getting called in this process, so I'm at a loss for how to do this. Any ideas?