BUG: df.apply(...) method will sometimes return DatetimeIndex on first iteration #56747
Comments
There is also a related Stack Overflow post that, unfortunately, remains unresolved here: Why does pandas.apply return an index on the first iteration instead of the actual element? Thank you, team, for your work and looking into this strange and potentially widespread bug. 😊 |
So the output you posted above is not the actual output you get? In any case, can you try using |
It is the actual output I get while looping through an internally generated dataframe timestamp column. However, I ran into an unexpected error where I sometimes get a DatetimeIndex as the first element. When I save the dataframe to JSON and reinitialize it, I am unable to reproduce this error. So my assumption is that some internal temporary variable in pandas itself isn't being flushed correctly. To create a "reproducible" code snippet, I'd need to reveal sensitive parts of the codebase, which I cannot do. Although this isn't particularly helpful, I wanted to highlight the issue anyway, so that the staff is aware of an anomaly occurring in the apply(...) method.
Not sure if this is helpful, but the dataframe was flipped prior to calling
|
I suspect that saving to JSON is lossy. Can you try the following:
Also, just to be sure, what is the output of |
You've checked that you confirmed this bug exists on the main branch of pandas. Is that accurate? |
I get the following error
I get the following output
No, the actual function is doing something actually useful. It just blows up with a TypeError instead.
Yes. But don't break your neck trying to track it down. I have a feeling this would be one of the more elusive bugs. The point of this ticket was just to bubble up that there IS an anomaly going on and make a record of it. |
Sorry, could you please elaborate on this? How will it know when the method has failed and when to "fallback"? Does it consider the |
I think I was not clear. In the OP you posted the output
If you use the function
We check for ValueError, TypeError, and AssertionErrors. Returning Lines 1477 to 1480 in d8e9529
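The fallback behavior described in this comment can be illustrated with a small pure-Python sketch. This is not the pandas implementation: `map_with_fallback` and `next_day` are hypothetical names, a plain list stands in for the DatetimeIndex, and the sample dates are made up. It only shows why a function can see the whole container on its first call and individual elements afterwards.

```python
from datetime import datetime, timedelta


def map_with_fallback(values, func):
    """Hypothetical helper: try func on the whole container, else per element."""
    try:
        result = func(values)
        # Accept the whole-container result only if it is itself a list.
        if not isinstance(result, list):
            raise TypeError("whole-container call must return a list")
        return result
    except (ValueError, TypeError, AssertionError):
        # The first attempt failed, so call func once per element instead.
        return [func(v) for v in values]


seen_types = []


def next_day(value):
    # Record what the function actually receives on each call.
    seen_types.append(type(value).__name__)
    # Adding a timedelta to a list raises TypeError, which triggers the fallback.
    return value + timedelta(days=1)


dates = [datetime(2024, 1, 1), datetime(2024, 1, 2)]
shifted = map_with_fallback(dates, next_day)
print(seen_types)  # ['list', 'datetime', 'datetime']
```

In this sketch the first recorded type is the container, not an element, which mirrors the "index on the first iteration" symptom. A function that does not raise one of the caught exceptions on the whole container would have its whole-container result used or rejected without any element-wise calls.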
|
I was able to reproduce this on 2.0.x using the following reproducer:
The |
My apologies for any confusion. Thank you very much for your hard work. 🙌 |
Not a problem, thanks for all the information. If you find this is still happening on pandas 2.1 or later, please comment here and we can reopen! |
Pandas version checks
Reproducible Example
Create a file locally named example.json with the following contents:
Create a file locally named example.py with the following contents:
Issue Description
I currently have the following pandas dataframe object:
I get normal expected results when I loop through the Revenue column:
I get abnormal unexpected results when I loop through the Date column:
Why is this happening? Why do I get an index object at first?
Expected Behavior
I should be getting exclusively timestamp objects on iteration:
Installed Versions