Skip to content

Multiindexing slicing of dates #3843

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
hayd opened this issue Jun 11, 2013 · 9 comments
Closed

Multiindexing slicing of dates #3843

hayd opened this issue Jun 11, 2013 · 9 comments
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex

Comments

@hayd
Copy link
Contributor

hayd commented Jun 11, 2013

Should I be able to slice like follows:

In [1]: df
Out[1]:
          report_date  item_id  views category
0 2013-06-01 00:00:00        2      3        a
1 2013-06-01 00:00:00        2      2        b
2 2013-06-01 00:00:00        5     16        a
3 2013-06-01 00:00:00        2      4        c
4 2013-06-01 00:00:00        2      5        d

In [2]: pivot = df.pivot_table(values=['views'], rows=['report_date','item_id'], aggfunc='sum')

In [3]: pivot
Out[3]:
                     views
report_date item_id
2013-06-01  2           14
            5           16

In [4]: pivot.ix[('2013-6-01', 3):('2013-06-1', 6)]
Out[4]:
                     views
report_date item_id
2013-06-01  5           16

But if I try and slice from outside any of the ranges, I get an exception:

In [21]: pivot.ix[('2013-6-01',3):('2013-06-02', 6)]
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-21-390c2ec15fd2> in <module>()
----> 1 pivot.ix[('2013-6-01',3):('2013-06-02', 6)]

/Users/234BroadWalk/pandas/pandas/core/indexing.pyc in __getitem__(self, key)
     46             return self._getitem_tuple(key)
     47         else:
---> 48             return self._getitem_axis(key, axis=0)
     49
     50     def _get_label(self, label, axis=0):

/Users/234BroadWalk/pandas/pandas/core/indexing.pyc in _getitem_axis(self, key, axis)
    390         labels = self.obj._get_axis(axis)
    391         if isinstance(key, slice):
--> 392             return self._get_slice_axis(key, axis=axis)
    393         elif _is_list_like(key) and not (isinstance(key, tuple) and
    394                                          isinstance(labels, MultiIndex)):

/Users/234BroadWalk/pandas/pandas/core/indexing.pyc in _get_slice_axis(self, slice_obj, axis)
    652         else:
    653             try:
--> 654                 indexer = labels.slice_indexer(start, stop, slice_obj.step)
    655             except Exception:
    656                 if _is_index_slice(slice_obj):

/Users/234BroadWalk/pandas/pandas/core/index.pyc in slice_indexer(self, start, end, step)
   1199         This function assumes that the data is sorted, so use at your own peril
   1200         """
-> 1201         start_slice, end_slice = self.slice_locs(start, end)
   1202
   1203         # return a slice

/Users/234BroadWalk/pandas/pandas/core/index.pyc in slice_locs(self, start, end, strict)
   2276             if not isinstance(end, tuple):
   2277                 end = end,
-> 2278             end_slice = self._partial_tup_index(end, side='right')
   2279
   2280         return start_slice, end_slice

/Users/234BroadWalk/pandas/pandas/core/index.pyc in _partial_tup_index(self, tup, side)
   2293             if lab not in lev:
   2294                 if not lev.is_type_compatible(lib.infer_dtype([lab])):
-> 2295                     raise Exception('Level type mismatch: %s' % lab)
   2296
   2297                 # short circuit

Exception: Level type mismatch: 2013-06-02

see this SO question.

@jreback
Copy link
Contributor

jreback commented Jun 11, 2013

why don't u reset index, subset on the dates and/or id?

the problem with selecting via a tuple range is a) index must be sorted and b) the code is complicated with dates

mark as a bug for. 0.12 I guess

@hayd
Copy link
Contributor Author

hayd commented Jun 11, 2013

@jreback
Copy link
Contributor

jreback commented Jun 11, 2013

Can't seem to repru the isee you mention in the link just above

In [36]: df = DataFrame(1.,columns=['A'],index=date_range('2010-08-31 12:36:53',freq='5min',periods=10))

In [37]: df
Out[37]: 
                     A
2010-08-31 12:36:53  1
2010-08-31 12:41:53  1
2010-08-31 12:46:53  1
2010-08-31 12:51:53  1
2010-08-31 12:56:53  1
2010-08-31 13:01:53  1
2010-08-31 13:06:53  1
2010-08-31 13:11:53  1
2010-08-31 13:16:53  1
2010-08-31 13:21:53  1

In [38]: df['2010-07':'2010-10']
Out[38]: 
                     A
2010-08-31 12:36:53  1
2010-08-31 12:41:53  1
2010-08-31 12:46:53  1
2010-08-31 12:51:53  1
2010-08-31 12:56:53  1
2010-08-31 13:01:53  1
2010-08-31 13:06:53  1
2010-08-31 13:11:53  1
2010-08-31 13:16:53  1
2010-08-31 13:21:53  1

In [39]: df['2010-7':'2010-10']
Out[39]: 
                     A
2010-08-31 12:36:53  1
2010-08-31 12:41:53  1
2010-08-31 12:46:53  1
2010-08-31 12:51:53  1
2010-08-31 12:56:53  1
2010-08-31 13:01:53  1
2010-08-31 13:06:53  1
2010-08-31 13:11:53  1
2010-08-31 13:16:53  1
2010-08-31 13:21:53  1

@jreback
Copy link
Contributor

jreback commented Jun 11, 2013

@hayd I think the example indicated (the pivot) is actually not a bug, but a misuse; you can't slice on labels that are not included in the index, whether date or not (if they ARE STRINGS)

In [76]: pivot.ix[('2013-06-01',2):(Timestamp('20130602'),2)]
Out[76]: 
                     views
report_date item_id       
2013-06-01  2           14
            5           16

so I guess it is a buglet then; here treating datetimes different from the strings....hmmm

@hayd
Copy link
Contributor Author

hayd commented Jun 11, 2013

How fun! (I thought I'd played around with things like that.)

@jreback
Copy link
Contributor

jreback commented Sep 22, 2013

@hayd close this? or make a bug/enhancement out of it?

@hayd
Copy link
Contributor Author

hayd commented Sep 23, 2013

you can't slice on labels that are not included in the index, whether date or not (if they ARE STRINGS)

So you can do it with strings, even in a mi, just not when they are dates?

Should we pull out that as a someday feature?

@jreback
Copy link
Contributor

jreback commented Sep 23, 2013

it's a bug/feature just have to track down
move to 0.14

@jreback jreback modified the milestones: 0.15.0, 0.14.0 Mar 14, 2014
@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 3, 2015
@WillAyd WillAyd modified the milestones: Contributions Welcome, No action Jul 6, 2018
@WillAyd
Copy link
Member

WillAyd commented Jul 6, 2018

.ix. has been deprecated since 0.20 so won't fix

@WillAyd WillAyd closed this as completed Jul 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex
Projects
None yet
Development

No branches or pull requests

3 participants