Stacking MultiIndex DataFrame columns with Timestamps levels fails #8039
when you pass Use this to create a frame
And simply set the columns.
@TomAugspurger not a bug, but a usage issue.
yeah just read your comment.
@ldkge's problem was with
In [67]: result.stack(0)
Out[67]:
SomeColumnName
AnotherOne
1 2014-08-01 34204
2 2014-08-01 43580
3 2014-08-01 84329
5 2014-08-01 23485
In [68]: result.stack(1)
Out[68]:
Empty DataFrame
Columns: [(2014-08-01 00:00:00, AnotherOne)]
Index: []
would be different.
This works in master (recently added feature).
I am not sure what stack(1) would/should actually do. What would you expect?
I thought it should shift the
>>> df.stack(1)
2014-08-01
AnotherOne
1 SomeColumnName 34204
2 SomeColumnName 43580
3 SomeColumnName 84329
5 SomeColumnName 23485
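A minimal sketch of the behaviour expected above, for illustration only. The frame layout is an assumption pieced together from the printed output (a Timestamp in column level 0, a string in column level 1, and an index named AnotherOne); it is not the reporter's original code. On a pandas version where stacking works, level 1 moves into the row index and the Timestamp stays as the only column.

```python
import pandas as pd

# Assumed layout: level 0 of the columns is a Timestamp, level 1 is a string.
ts = pd.Timestamp("2014-08-01")
columns = pd.MultiIndex.from_tuples([(ts, "SomeColumnName")])

df = pd.DataFrame(
    [[34204], [43580], [84329], [23485]],
    index=pd.Index([1, 2, 3, 5], name="AnotherOne"),
    columns=columns,
)

# Move column level 1 (the string level) into the row index.
stacked = df.stack(1)

print(stacked)
print(stacked.columns)
```

If the result is empty or all NaN instead, the Timestamp level has most likely failed to match during the reshape, which is the symptom reported in this issue.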
cc @onesandzeroes what do you think?
ok, I think I agree; this could be a bug
I'll submit a PR once I figure out what's wrong.
@jreback it has to do with how the MultiIndex is storing the timestamp. Any idea offhand why with
In [6]: idx = pd.MultiIndex.from_tuples([(pd.datetime(2014, 1, 1), 'A', 'B')])
these two aren't equal?
In [10]: idx.values[0][0]
Out[10]: Timestamp('2014-01-01 00:00:00')
In [8]: idx.levels[0].values
Out[8]: array(['2013-12-31T18:00:00.000000000-0600'], dtype='datetime64[ns]')
edit: or even clearer, why isn't
equal to
In [33]: idx.levels[0][0]
Out[33]: Timestamp('2014-01-01 00:00:00')
I'm going to go digging in index.py
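The comparison above can be sketched as follows. This is a hedged illustration: it uses `datetime.datetime` in place of the `pd.datetime` alias from the session, and the variable names are made up for the example. The offset printed in Out[8] appears to be how older NumPy versions displayed `datetime64` values (in local time), so the raw value may well be the same instant stored as a different type rather than a different time.

```python
import datetime

import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_tuples([(datetime.datetime(2014, 1, 1), "A", "B")])

from_values = idx.values[0][0]   # scalar taken from the tuple view of the index
from_level = idx.levels[0][0]    # scalar taken from the level itself
raw = idx.levels[0].values[0]    # underlying numpy.datetime64 scalar

# Same point in time, but three different ways of reaching it.
same_scalar = from_values == from_level
same_instant = pd.Timestamp(raw) == from_level

print(type(from_values).__name__, type(from_level).__name__, type(raw).__name__)
print(same_scalar, same_instant)
```

The types are the interesting part here: a `numpy.datetime64` and a `Timestamp` can describe the same instant while still being different objects when used as labels or dictionary keys.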
where is this type of comparison?
(I think) they're compared when constructing the new dataframe in
ipdb> new_data
{(numpy.datetime64('2013-12-31T18:00:00.000000000-0600'), 'B'): array([1, 2, 3, 4])}
ipdb> new_columns
MultiIndex(levels=[[2014-01-01 00:00:00], ['B']],
labels=[[0], [0]])
ipdb> result = DataFrame(new_data, index=new_index, columns=new_columns)
ipdb> result
2014-01-01
B
0 C NaN
1 C NaN
2 C NaN
3 C NaN
I'll see why
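One way to picture the mismatch in the session above: the dictionary keys hold a `numpy.datetime64` while the target columns hold a `Timestamp`. The sketch below is not the fix that went into pandas; it only shows, under that assumption, that boxing the key as a `Timestamp` before building the frame lets the keys line up with the columns. The names `normalized` and `result` are hypothetical.

```python
import numpy as np
import pandas as pd

# Keys shaped like the ones in the ipdb session: a raw datetime64 plus a string.
new_data = {
    (np.datetime64("2014-01-01T00:00:00", "ns"), "B"): np.array([1, 2, 3, 4]),
}

# Box the datetime64 part of every key as a Timestamp (illustrative workaround).
normalized = {(pd.Timestamp(key[0]), key[1]): value for key, value in new_data.items()}

new_columns = pd.MultiIndex.from_tuples(list(normalized))
result = pd.DataFrame(normalized, columns=new_columns)

print(result)
```

With matching key types the values are picked up instead of being replaced by NaN.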
I can't see exactly where you are pointing to... levels should be using
@jreback I agree with TomAugspurger about what the expected behaviour of
You can see the bug in the following code:
We would expect the data to be unchanged, however the returned DataFrame is empty.
The Pandas version used was 0.11.0