apply method, should return certain columns as datetime but returns them as int[64] instead #18700

Closed
jimbasquiat opened this issue Dec 8, 2017 · 3 comments
Labels
Dtype Conversions (unexpected or buggy dtype conversions), Duplicate Report (duplicate issue or pull request), Reshaping (concat, merge/join, stack/unstack, explode)

Comments

@jimbasquiat
jimbasquiat commented Dec 8, 2017

Code Sample, a copy-pastable example if possible

import pandas as pd

a = pd.DataFrame(data=list(range(10)))
def test(x):
    # Assign a Timestamp to the row Series that apply passes in
    x["date"] = pd.Timestamp("2017-01-01")
    return x
a.apply(test, axis=1)

returns:

    0   date
0   0   1483228800000000000
1   1   1483228800000000000
2   2   1483228800000000000
3   3   1483228800000000000
4   4   1483228800000000000
5   5   1483228800000000000
6   6   1483228800000000000
7   7   1483228800000000000
8   8   1483228800000000000
9   9   1483228800000000000

Problem description

This should return a DataFrame with timestamps in the date column. Instead, the column holds the numerical .value of the timestamp.

Please don't answer with something like: "use df['date'] = pd.Timestamp("2017-01-01")".

I simplified my problem a lot to focus on the datetime behavior with .apply. I need to use .apply in my real-life code.
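
For context on the integers in the output above: they are the timestamp's `.value`, i.e. nanoseconds since the Unix epoch. A minimal sketch, assuming the column came back as plain integers, of how they map back to datetimes with `pd.to_datetime` (the names `ts`, `ns`, and `recovered` are illustrative only):

```python
import pandas as pd

ts = pd.Timestamp("2017-01-01")
ns = ts.value  # integer nanoseconds since the Unix epoch

# pd.to_datetime interprets integers as nanoseconds by default (unit="ns")
recovered = pd.to_datetime(pd.Series([ns] * 3))
print(recovered)
```

This only repairs the column after the fact; it does not change what `.apply` returns.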

Expected Output

    0   date
0   0   2017-01-01 00:00:00
1   1   2017-01-01 00:00:00
2   2   2017-01-01 00:00:00
3   3   2017-01-01 00:00:00
4   4   2017-01-01 00:00:00
5   5   2017-01-01 00:00:00
6   6   2017-01-01 00:00:00
7   7   2017-01-01 00:00:00
8   8   2017-01-01 00:00:00
9   9   2017-01-01 00:00:00
@jimbasquiat jimbasquiat changed the title apply method, should return certain columns as datetime but returens them as int[64] instead apply method, should return certain columns as datetime but returns them as int[64] instead Dec 8, 2017
@jreback
Contributor

jreback commented Dec 9, 2017

You are trying to mutate the internal Series. This violates all kinds of guarantees. Pandas cannot infer everything a user might possibly do. I'm not even sure what you are trying to accomplish.

In [7]: a=pd.DataFrame(data=list(range(10)))
   ...: def test(x):
   ...:     return pd.Timestamp("2017-01-01")
   ...: a.apply(test,axis=1)
   ...: 
Out[7]: 
0   2017-01-01
1   2017-01-01
2   2017-01-01
3   2017-01-01
4   2017-01-01
5   2017-01-01
6   2017-01-01
7   2017-01-01
8   2017-01-01
9   2017-01-01
dtype: datetime64[ns]

@jreback jreback closed this as completed Dec 9, 2017
@jreback jreback added this to the won't fix milestone Dec 9, 2017
@jreback jreback added Dtype Conversions Unexpected or buggy dtype conversions Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Dec 9, 2017
@jimbasquiat
Author

'You are trying to mutate the internal Series.' I think you are referring here to altering a list while iterating over it. This is definitely a different problem, since the original dataframe is not modified at all by the .apply method.
'This violates all kinds of guarantees.' Could you point to documentation or information about that? I don't see anything related to type inference in the docs for .apply.
I understand that dealing with this kind of request is an annoying burden for you, but then I wonder why you would volunteer to take charge of the issue log if it is to dismiss problems out of hand like that?

@jreback
Contributor

jreback commented Dec 9, 2017

This is already reported in #15526.

Generally, mutating things inside an apply, while it might work, is not supported in any guaranteed way.
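
If `.apply` with `axis=1` really is required, one possible sketch is to work on a copy of the row instead of mutating the Series that pandas passes in, and then coerce the column explicitly. The helper name `add_date` is hypothetical, and the final `pd.to_datetime` call is a defensive assumption so the result does not depend on how a given pandas version infers the dtype:

```python
import pandas as pd

a = pd.DataFrame(data=list(range(10)))

def add_date(row):
    # Copy first so the Series handed in by apply is left untouched
    out = row.copy()
    out["date"] = pd.Timestamp("2017-01-01")
    return out

result = a.apply(add_date, axis=1)

# Coerce explicitly rather than relying on dtype inference inside apply;
# this handles integers (nanoseconds), Timestamp objects, and datetime64 alike
result["date"] = pd.to_datetime(result["date"])
print(result.dtypes)
```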

Applying row-by-row is an anti-pattern when you can easily do a vectorized operation. Sure, this may not be your real operation, but according to your SO post, your entire calculation can be vectorized.
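
For the simplified example in this issue, a vectorized version could look like the sketch below. It uses `DataFrame.assign` to add the column in one step without touching `a`; the name `b` is illustrative, and whether this fits the real calculation is an assumption:

```python
import pandas as pd

a = pd.DataFrame(data=list(range(10)))

# One scalar Timestamp is broadcast to every row; no row-by-row apply needed
b = a.assign(date=pd.Timestamp("2017-01-01"))
print(b.dtypes)
```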

@jreback jreback added the Duplicate Report Duplicate issue or pull request label Dec 9, 2017
@jreback jreback modified the milestones: won't fix, No action Dec 9, 2017