BUG: Using pd.to_datetime function in groupby.apply function causes key value error #44026
Comments
I suspect you are seeing this because your test_apply function has no return value. pandas.apply() maps each group to whatever the function returns, and since you did not return anything, I believe that is why you get the same key for every group. Also, in your expected behaviour the date is the same, but that will not be true with the dt1.apply() function.
If I simply return data, there is no problem, but if I return processed data, the problem is still there. In addition, if I comment out the to_datetime line and return nothing, there is still no error.
I get the expected output when running the code in the OP on master.
In any case, mutating the pandas object within apply is not supported: However, this note in the |
@rhshadrach So is this just an issue that'll be fixed once the docs are updated?
Well, I think users sometimes overlook this detail. And this bug only occurs when the group has the same number of rows as before, which may make it difficult for users to track down.
result:
I do not know, I cannot reproduce the result on master. Can others? This needs to be sorted out first before we can understand what needs to be done to close this issue.
What does "this" refer to here?
I am not aware of any cases where this bug occurs on master; the OP reports this bug exists on master, can you confirm whether or not this is the case?
@rhshadrach
result:
@apache-chnsys: This is part of the documented behavior. See the Notes section here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.GroupBy.apply.html
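For anyone landing here later, a minimal sketch of two ways to avoid mutating the object that groupby hands to the function. This is not the OP's code: the frame is made up to match the printed output in this issue, and the names df, converted, keys_seen, convert_dates and per_group are hypothetical.

```python
import pandas as pd

# Hypothetical data shaped like the printed output in this issue.
df = pd.DataFrame(
    {
        "key": ["aa", "bb", "cc", "dd"],
        "date": ["2020-01-01"] * 4,
        "qty": [1.0, 1.0, 3.0, 3.0],
    }
)

# Option 1: convert once, before grouping. assign() returns a new frame,
# so nothing is modified while groupby is iterating.
converted = df.assign(date=pd.to_datetime(df["date"]))
keys_seen = [key for key, _ in converted.groupby("key")]


# Option 2: if the conversion has to happen per group, work on a copy
# so the group object passed in is never changed in place.
def convert_dates(group: pd.DataFrame) -> pd.DataFrame:
    out = group.copy()
    out["date"] = pd.to_datetime(out["date"])
    return out


per_group = pd.concat([convert_dates(group) for _, group in df.groupby("key")])
```

Either way the original df keeps its string dates, and each group carries its own key. Whether this matches what the OP needs depends on what test_apply is meant to return.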
It seems everything is resolved here, and additional tests are not needed. If I've missed something, please let me know by replying here and this issue can be reopened.
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the master branch of pandas.
Reproducible Example
Issue Description
In test_apply, I call pd.to_datetime() on the 'date' field (dtype object) of the input parameter data, and then print x inside the lambda function on each call. The expected result is that a different key is printed each time ("aa", "bb", "cc", "dd"), but the actual result is "aa", "aa", "aa", "aa".
key date qty
0 aa 2020-01-01 1.0
key date qty
1 aa 2020-01-01 1.0
key date qty
2 aa 2020-01-01 3.0
key date qty
3 aa 2020-01-01 3.0
Expected Behavior
key date qty
0 aa 2020-01-01 1.0
key date qty
1 bb 2020-01-01 1.0
key date qty
2 cc 2020-01-01 3.0
key date qty
3 dd 2020-01-01 3.0
Installed Versions