BUG: assignment in Float64Index #7586

Closed
jreback opened this issue Jun 27, 2014 · 8 comments · Fixed by #7587
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves
Comments

@jreback
Contributor

jreback commented Jun 27, 2014

http://stackoverflow.com/questions/24446500/pandas-valueerror-when-assigning-dataframe-entries-using-index-due-to-a-change

In [2]: A = pd.DataFrame(np.random.rand(10,4), index=np.random.rand(10))

In [3]: A.loc[A.index] = A.loc[A.index]
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
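
For anyone checking whether their installed version is affected, here is a minimal, self-contained sketch of the same reproduction. The fixed seed and the `before`/`raised` names are additions for illustration and are not part of the original report.

```python
import numpy as np
import pandas as pd

# Fixed seed so the run is repeatable (the report above used unseeded np.random.rand).
rng = np.random.default_rng(0)
A = pd.DataFrame(rng.random((10, 4)), index=rng.random(10))
before = A.copy()

# Assigning a label-based selection back onto itself should be a no-op.
try:
    A.loc[A.index] = A.loc[A.index]
    raised = None
except ValueError as exc:
    raised = exc

print("pandas", pd.__version__, "-> raised:", raised)
```

On a version with this bug, `raised` should hold the ValueError shown above; otherwise it stays `None` and `A` is unchanged.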

jreback added this to the 0.14.1 milestone Jun 27, 2014
@jreback
Contributor Author

jreback commented Jun 27, 2014

@cpcloud

@eldad-a

eldad-a commented Jun 27, 2014

You are using a Float64Index. Are you really sure this is what you want? easy enough to A = A.reset_index() then your operations will work normally. Float64Index is a very special use-case

@jreback
I may be missing something here...
If I reset_index I will not be able to use it as labels, right?
Then I lose the A.loc[B.index] = A.loc[B.index].add(B, fill_value=0) type functionality.

In my particular case the index results from timestamp differences, which I needed to convert to Float64 for another reason.
Clearly I could bypass the situation somehow but it worked great previously (pandas 0.13.1).
Is there a reason why this does not work any more?
Is there a more "pandas-onic" way to do this?

@jreback
Contributor Author

jreback commented Jun 27, 2014

This is a small bug caused by the implementation change in Float64Index.

Please give a full example. It's not clear what you are trying to do.

@eldad-a

eldad-a commented Jun 27, 2014

I apologise, I thought the context was clear from the stackoverflow post.

The following works great in pandas 0.13.1 but not in 0.14.0 :

In [1]: import numpy as np
In [2]: import pandas as pd
# say you have some big DataFrame
In [3]: A = pd.DataFrame(np.random.rand(10,4), index=np.random.randn(10))
# and some others that contain a subset of the labels (here only one is created)
In [4]: B = A[A.index>0].copy()
In [5]: B[:] = np.random.rand(*B.shape)
# now A accumulates the B's:
In [6]: A.loc[B.index] = A.loc[B.index].add(B, fill_value=0)

This case is not so unusual.
I used to experience bad performance when applying the add method directly, without first slicing with loc. I eventually found a similar discussion with this solution on stackoverflow (regrettably, I cannot locate that post any more).
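
To make the comparison concrete, here is a small sketch of the two approaches mentioned: the whole-frame `add` and the `loc`-sliced accumulation. The index values (`np.linspace(-1.35, 1.35, 10)`) and the constant values in `B` are made up for illustration so the result is deterministic; it only shows that both give the same values on a version where the assignment works, and says nothing about performance.

```python
import numpy as np
import pandas as pd

# Hypothetical float labels (5 negative, 5 positive) standing in for the random index.
labels = np.linspace(-1.35, 1.35, 10)
A = pd.DataFrame(np.arange(40, dtype=float).reshape(10, 4), index=labels)

# B holds a subset of A's labels; its values are arbitrary for the demo.
B = A[A.index > 0].copy()
B[:] = 1.0

# Whole-frame add: rows missing from B are treated as 0 via fill_value.
full = A.add(B, fill_value=0)

# Label-sliced accumulation, as in In [6] above.
sliced = A.copy()
sliced.loc[B.index] = sliced.loc[B.index].add(B, fill_value=0)

print(np.allclose(full.to_numpy(), sliced.to_numpy()))
```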

Please let me know in case this is still unclear.

@jreback
Contributor Author

jreback commented Jun 27, 2014

Here's a workaround until 0.14.1. This is a bug.

In [43]: A = pd.DataFrame(np.arange(4*10).reshape(-1,4),index=np.random.randn(10)).sort_index()

In [44]: mask = A.index>0

In [45]: B = A[mask]

In [46]: A
Out[46]: 
            0   1   2   3
-1.899137  24  25  26  27
-1.638630  16  17  18  19
-1.529521  12  13  14  15
-0.579667   0   1   2   3
-0.498161  28  29  30  31
-0.225297   8   9  10  11
-0.071840  36  37  38  39
 0.034835  20  21  22  23
 0.811921   4   5   6   7
 1.050503  32  33  34  35

In [47]: B
Out[47]: 
           0   1   2   3
0.034835  20  21  22  23
0.811921   4   5   6   7
1.050503  32  33  34  35

In [48]: A[mask] = A.add(B,fill_value=0)

In [49]: A
Out[49]: 
            0   1   2   3
-1.899137  24  25  26  27
-1.638630  16  17  18  19
-1.529521  12  13  14  15
-0.579667   0   1   2   3
-0.498161  28  29  30  31
-0.225297   8   9  10  11
-0.071840  36  37  38  39
 0.034835  40  42  44  46
 0.811921   8  10  12  14
 1.050503  64  66  68  70
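
The same idea as a self-contained sketch: select rows with a boolean mask instead of float labels, so the assignment never has to look labels up in the float index. This is a variation on the session above, not the exact code: it uses `.loc[mask]` rather than `A[mask]`, float data, and a made-up deterministic index in place of `np.random.randn`.

```python
import numpy as np
import pandas as pd

# Hypothetical sorted float index (5 negative, 5 positive labels).
labels = np.linspace(-1.35, 1.35, 10)
A = pd.DataFrame(np.arange(40, dtype=float).reshape(-1, 4), index=labels)
original = A.copy()

# Boolean mask over the rows, computed once from the index.
mask = A.index > 0
B = A[mask].copy()

# Positional (boolean) selection on the left; the right-hand frame is
# aligned on the index, so only the masked rows are written.
A.loc[mask] = A.add(B, fill_value=0)

print(A)
```

Rows where the mask is False are left as they were; the masked rows end up as the sum of A and B, which here doubles them because B is a copy of those rows.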

@cpcloud
Member

cpcloud commented Jun 27, 2014

Another case for TimedeltaIndex! :)

@cpcloud
Member

cpcloud commented Jun 27, 2014

@eldad-a if you want to give it a try that would be great! Thanks for the report

@eldad-a

eldad-a commented Jun 29, 2014

if you want to give it a try that would be great!

@cpcloud
Did you mean the TimedeltaIndex?
I no longer remember why I avoided it before. I think it was due to a technical issue on my side (the timestamps are generated by a camera and have some jitter which I needed to round off).
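
For what it is worth, a rough sketch of how jittery timestamps might be rounded into a timedelta index, assuming a pandas version that provides `pd.to_timedelta` and `TimedeltaIndex.round`. The timestamp values and the millisecond resolution are invented for illustration; they are not from my actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical camera timestamps in seconds, with sub-millisecond jitter.
seconds = np.array([0.0, 0.0400003, 0.0799998, 0.1200002])

# Convert to timedeltas and round the jitter away at millisecond resolution.
tdi = pd.to_timedelta(seconds, unit="s").round("ms")

A = pd.DataFrame(np.arange(8, dtype=float).reshape(4, 2), index=tdi)
B = A[A.index > pd.Timedelta(0)].copy()

# Same accumulation pattern as before, now keyed on exact timedelta labels.
A.loc[B.index] = A.loc[B.index].add(B, fill_value=0)

print(A)
```

Rounding can map two nearby timestamps to the same label, so it is worth checking `tdi.is_unique` before relying on label-based assignment.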

Thanks for the report

You are most welcome!
I am indebted to all the developers, as pandas allows me to use very intuitive tools in my research work, so thank you!
