AttributeError when slicing a datetime MultiIndex level #15928

Closed
toobaz opened this issue Apr 6, 2017 · 5 comments · Fixed by #37228
Labels
Bug, MultiIndex, Needs Tests (unit test(s) needed to prevent regressions)
Comments

@toobaz
Member

toobaz commented Apr 6, 2017

Code Sample, a copy-pastable example if possible

In [2]: df = pd.read_csv(pd.__path__[0]+'/tests/tools/data/quotes2.csv', parse_dates=['time']).set_index(['ticker', 'time']).sort_index()

In [3]: df.loc['AAPL'].loc[slice('2016-05-25 13:30:00'), :].head()
Out[3]: 
                           bid    ask
time                                 
2016-05-25 13:30:00.075  98.55  98.56
2016-05-25 13:30:00.076  98.55  98.56
2016-05-25 13:30:00.076  98.55  98.56
2016-05-25 13:30:00.076  98.55  98.56
2016-05-25 13:30:00.080  98.55  98.56

In [4]: df.loc[('AAPL', slice('2016-05-25 13:30:00')), :].head()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-552cfee2e501> in <module>()
----> 1 df.loc[('AAPL', slice('2016-05-25 13:30:00')), :].head()

/home/nobackup/repo/pandas/pandas/core/indexing.py in __getitem__(self, key)
   1322             except (KeyError, IndexError):
   1323                 pass
-> 1324             return self._getitem_tuple(key)
   1325         else:
   1326             key = com._apply_if_callable(key, self.obj)

/home/nobackup/repo/pandas/pandas/core/indexing.py in _getitem_tuple(self, tup)
    833     def _getitem_tuple(self, tup):
    834         try:
--> 835             return self._getitem_lowerdim(tup)
    836         except IndexingError:
    837             pass

/home/nobackup/repo/pandas/pandas/core/indexing.py in _getitem_lowerdim(self, tup)
    945         # we may have a nested tuples indexer here
    946         if self._is_nested_tuple_indexer(tup):
--> 947             return self._getitem_nested_tuple(tup)
    948 
    949         # we maybe be using a tuple to represent multiple dimensions here

/home/nobackup/repo/pandas/pandas/core/indexing.py in _getitem_nested_tuple(self, tup)
   1020 
   1021             current_ndim = obj.ndim
-> 1022             obj = getattr(obj, self.name)._getitem_axis(key, axis=axis)
   1023             axis += 1
   1024 

/home/nobackup/repo/pandas/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1542             # nested tuple slicing
   1543             if is_nested_tuple(key, labels):
-> 1544                 locs = labels.get_locs(key)
   1545                 indexer = [slice(None)] * self.ndim
   1546                 indexer[axis] = locs

/home/nobackup/repo/pandas/pandas/indexes/multi.py in get_locs(self, tup)
   2173                 # a slice, include BOTH of the labels
   2174                 indexer = _update_indexer(_convert_to_indexer(
-> 2175                     self._get_level_indexer(k, level=i, indexer=indexer)),
   2176                     indexer=indexer)
   2177             else:

/home/nobackup/repo/pandas/pandas/indexes/multi.py in _get_level_indexer(self, key, level, indexer)
   2054                 # note that the stop ALREADY includes the stopped point (if
   2055                 # it was a string sliced)
-> 2056                 return convert_indexer(start.start, stop.stop, step)
   2057 
   2058             elif level > 0 or self.lexsort_depth == 0 or step is not None:

AttributeError: 'int' object has no attribute 'start'

Problem description

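The two calls should be equivalent: indexing on both levels at once should select the same rows as indexing on each level separately (the only difference being that the first call drops the ticker level from the result).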
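
A self-contained variant of the report may make this easier to reproduce without quotes2.csv. The sketch below is only a stand-in: the synthetic quotes, the MSFT ticker, and the names two_step and one_step are invented for illustration and are not taken from the original data. On an affected version the one_step line is expected to raise the AttributeError shown above; where the bug is fixed, both selections should contain the same rows.

```python
import pandas as pd

# Synthetic stand-in for quotes2.csv (hypothetical values, not the real data).
times = pd.to_datetime(
    [
        "2016-05-25 13:29:59.500",
        "2016-05-25 13:30:00.075",
        "2016-05-25 13:30:00.076",
        "2016-05-25 13:30:00.080",
        "2016-05-25 13:30:01.200",
    ]
)
index = pd.MultiIndex.from_product(
    [["AAPL", "MSFT"], times], names=["ticker", "time"]
)
bids = [98.50 + 0.01 * i for i in range(len(index))]
df = pd.DataFrame(
    {"bid": bids, "ask": [b + 0.01 for b in bids]}, index=index
).sort_index()

stop = "2016-05-25 13:30:00"

# Index level by level: select the ticker, then slice the remaining time index.
two_step = df.loc["AAPL"].loc[slice(stop), :]

# Index both levels at once: this is the call that raised in the report.
one_step = df.loc[("AAPL", slice(stop)), :]

# one_step keeps the ticker level, so drop it before comparing.
print(one_step.droplevel("ticker").equals(two_step))
```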

By the way, if I swap the levels, I get a different error:

In [16]: df = pd.read_csv(pd.__path__[0]+'/tests/tools/data/quotes2.csv', parse_dates=['time']).set_index(['time', 'ticker']).sort_index()

In [26]: df.loc[:'2016-05-25 13:30:00'].head()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-26-a0298286991a> in <module>()
----> 1 df.loc[:'2016-05-25 13:30:00'].head()

/home/nobackup/repo/pandas/pandas/core/indexing.py in __getitem__(self, key)
   1325         else:
   1326             key = com._apply_if_callable(key, self.obj)
-> 1327             return self._getitem_axis(key, axis=0)
   1328 
   1329     def _is_scalar_access(self, key):

/home/nobackup/repo/pandas/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1503         if isinstance(key, slice):
   1504             self._has_valid_type(key, axis)
-> 1505             return self._get_slice_axis(key, axis=axis)
   1506         elif is_bool_indexer(key):
   1507             return self._getbool_axis(key, axis=axis)

/home/nobackup/repo/pandas/pandas/core/indexing.py in _get_slice_axis(self, slice_obj, axis)
   1353         labels = obj._get_axis(axis)
   1354         indexer = labels.slice_indexer(slice_obj.start, slice_obj.stop,
-> 1355                                        slice_obj.step, kind=self.name)
   1356 
   1357         if isinstance(indexer, slice):

/home/nobackup/repo/pandas/pandas/indexes/base.py in slice_indexer(self, start, end, step, kind)
   3226         """
   3227         start_slice, end_slice = self.slice_locs(start, end, step=step,
-> 3228                                                  kind=kind)
   3229 
   3230         # return a slice

/home/nobackup/repo/pandas/pandas/indexes/multi.py in slice_locs(self, start, end, step, kind)
   1741         # This function adds nothing to its parent implementation (the magic
   1742         # happens in get_slice_bound method), but it adds meaningful doc.
-> 1743         return super(MultiIndex, self).slice_locs(start, end, step, kind=kind)
   1744 
   1745     def _partial_tup_index(self, tup, side='left'):

/home/nobackup/repo/pandas/pandas/indexes/base.py in slice_locs(self, start, end, step, kind)
   3413         end_slice = None
   3414         if end is not None:
-> 3415             end_slice = self.get_slice_bound(end, 'right', kind)
   3416         if end_slice is None:
   3417             end_slice = len(self)

/home/nobackup/repo/pandas/pandas/indexes/multi.py in get_slice_bound(self, label, side, kind)
   1712         if not isinstance(label, tuple):
   1713             label = label,
-> 1714         return self._partial_tup_index(label, side=side)
   1715 
   1716     def slice_locs(self, start=None, end=None, step=None, kind=None):

/home/nobackup/repo/pandas/pandas/indexes/multi.py in _partial_tup_index(self, tup, side)
   1770                 start = start + section.searchsorted(idx, side='left')
   1771             else:
-> 1772                 return start + section.searchsorted(idx, side=side)
   1773 
   1774     def get_loc(self, key, method=None):

TypeError: unorderable types: int() > slice()

... while everything works fine if there is one level only.

Expected Output

The same as Out[3].

Output of pd.show_versions()

INSTALLED VERSIONS

commit: 4502e82
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.7.0-1-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.utf8
LOCALE: it_IT.UTF-8

pandas: 0.19.0+743.g4502e8208
pytest: 3.0.6
pip: 9.0.1
setuptools: 33.1.1
Cython: 0.25.2
numpy: 1.12.0
scipy: 0.18.1
xarray: 0.9.1
IPython: 5.1.0.dev
sphinx: 1.4.9
patsy: 0.3.0-dev
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.0
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: 3.7.1
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: 0.2.1

@jreback
Contributor

jreback commented Apr 6, 2017

can you change the above to a copy-pastable example.

@jorisvandenbossche
Member

jorisvandenbossche commented Apr 6, 2017

When providing an end to the slice it does work:

In [86]: IDX = pd.IndexSlice

In [87]: df.loc[IDX['AAPL', '2016-05-25 13:30:00':'2016-05-25 13:30:01'], :].head()
Out[87]: 
                                  bid    ask
ticker time                                 
AAPL   2016-05-25 13:30:00.075  98.55  98.56
       2016-05-25 13:30:00.076  98.55  98.56
       2016-05-25 13:30:00.076  98.55  98.56
       2016-05-25 13:30:00.076  98.55  98.56
       2016-05-25 13:30:00.080  98.55  98.56
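
For reference, pd.IndexSlice only builds the indexing key: subscripting it returns the tuple of labels and slice objects that could also be written by hand. A minimal sketch (the variable names are illustrative, not from the thread):

```python
import pandas as pd

IDX = pd.IndexSlice

# Both bounds given, as in the session above.
both_bounds = IDX["AAPL", "2016-05-25 13:30:00":"2016-05-25 13:30:01"]

# Open start, which is the form used in the original report.
open_start = IDX["AAPL", :"2016-05-25 13:30:00"]

print(both_bounds)
print(open_start)

# The same keys spelled out with the built-in slice().
print(both_bounds == ("AAPL", slice("2016-05-25 13:30:00", "2016-05-25 13:30:01")))
print(open_start == ("AAPL", slice("2016-05-25 13:30:00")))
```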

@toobaz
Member Author

toobaz commented Apr 6, 2017

> can you change the above to a copy-pastable example.

done (I hope)

> When providing an end to the slice it does work:

Just for the record, if I'm not mistaken you're actually providing a start; my original slice already had an end:

In [1]: slice('2016-05-25 13:30:00')
Out[1]: slice(None, '2016-05-25 13:30:00', None)

In [2]: slice('2016-05-25 13:30:00').stop
Out[2]: '2016-05-25 13:30:00'
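
The argument handling of the built-in slice() can be checked in plain Python, without pandas. A small sketch (the variable names are illustrative): a single argument is taken as the stop, so a start has to be passed explicitly.

```python
ts = "2016-05-25 13:30:00"

stop_only = slice(ts)                    # one argument: interpreted as stop
start_only = slice(ts, None)             # explicit start, open end
both = slice(ts, "2016-05-25 13:30:01")  # start and stop

# Show which attribute each argument ended up in.
for s in (stop_only, start_only, both):
    print(repr(s.start), repr(s.stop), repr(s.step))
```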

@jbrockmendel
Member

this works on master, though the file is now at pd.__path__[0] / "tests" / "tools" / "data" / "quotes2.csv"

jbrockmendel added the Needs Tests (unit test(s) needed to prevent regressions) label on Oct 18, 2020
@toobaz
Member Author

toobaz commented Oct 18, 2020

> this works on master

Indeed I suspect (or maybe just hope!) it was fixed by #19074 - as now indexing on two levels should be just like indexing separately on each level.
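
Given the Needs Tests label, a regression check along these lines could pin down that expectation. This is only a sketch under assumptions: make_quotes and check_matches_per_level are hypothetical helpers, the data is synthetic, and it is not the test that was actually added to pandas. It covers the two slice forms that appear in this thread (open start, and both bounds).

```python
import pandas as pd


def make_quotes():
    """Build a small (ticker, time) frame; values are made up for the test."""
    times = pd.to_datetime(
        [
            "2016-05-25 13:29:59.500",
            "2016-05-25 13:30:00.075",
            "2016-05-25 13:30:00.080",
            "2016-05-25 13:30:01.200",
            "2016-05-25 13:30:02.300",
        ]
    )
    index = pd.MultiIndex.from_product(
        [["AAPL", "MSFT"], times], names=["ticker", "time"]
    )
    data = {"bid": range(len(index)), "ask": range(1, len(index) + 1)}
    return pd.DataFrame(data, index=index).sort_index()


def check_matches_per_level(df, ticker, time_slice):
    """Indexing both levels at once should match indexing them one by one."""
    combined = df.loc[(ticker, time_slice), :].droplevel("ticker")
    separate = df.loc[ticker].loc[time_slice, :]
    pd.testing.assert_frame_equal(combined, separate)
    return len(combined)


quotes = make_quotes()
slices = [
    slice("2016-05-25 13:30:00"),                         # open start
    slice("2016-05-25 13:30:00", "2016-05-25 13:30:01"),  # both bounds
]
row_counts = [check_matches_per_level(quotes, "AAPL", s) for s in slices]
print(row_counts)
```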
