
MultiIndex slicing with partial string match of datetimes fails #25137


Closed
dmzio opened this issue Feb 4, 2019 · 1 comment
Labels: Duplicate Report (duplicate issue or pull request)


dmzio commented Feb 4, 2019

Slicing a MultiIndex fails when pd.IndexSlice is given a partial datetime string that matches a range of index entries rather than a single one.

Code Sample

import pandas as pd
from itertools import cycle
dt_index = pd.date_range('1999-07', '1999-11')
sec_index = ['black', 'red', 'white']
items = []
for d, c in zip(dt_index, cycle(sec_index)):
    items.append({
        'date': d,
        'kind': c,
        'content': '{} - {}'.format(d, c)
    })
df = pd.DataFrame(items).set_index(['date', 'kind'], drop=False)
print(len(df))

# works when the slice stop matches a single entry in the index
single_item_index = df.loc[pd.IndexSlice[:'1999-09-02', :], :]
print(len(single_item_index))

# but fails when the slice stop is a partial string matching a range of entries
multi_items_index = df.loc[pd.IndexSlice[:'1999-09', :], :]

Problem description

The code above raises:

.../python3.6/site-packages/pandas/core/indexes/multi.py in _get_level_indexer(self, key, level, indexer)
   2635                 # note that the stop ALREADY includes the stopped point (if
   2636                 # it was a string sliced)
-> 2637                 return convert_indexer(start.start, stop.stop, step)
   2638 
   2639             elif level > 0 or self.lexsort_depth == 0 or step is not None:

AttributeError: 'int' object has no attribute 'start'

As the code sample shows, the slice pd.IndexSlice[:'1999-09', :] fails because its 'stop' is a partial string that resolves to a range of index positions.

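The problem seems to arise from the OR logic in `if isinstance(start, slice) or isinstance(stop, slice):` in _get_level_indexer (core/indexes/multi.py). The check passes when only stop is a slice, but start is then a plain int (it defaults to 0 or points to a particular element), so start.start fails.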
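A minimal sketch of the type mismatch described above, assuming a daily DatetimeIndex like the one in the code sample. It only mimics the failing expression from the traceback: the names `start` and `stop` are borrowed from it, `level` and `bounds` are hypothetical, and none of this is the actual pandas implementation.

```python
import pandas as pd

level = pd.date_range('1999-07', '1999-11')

# A partial string covering many entries resolves to a slice of positions...
stop = level.get_loc('1999-09')
# ...while an omitted start falls back to a plain integer position.
start = 0

print(type(start).__name__, type(stop).__name__)

# The `or` check passes because one side is a slice,
# but the attribute access assumes both sides are slices.
error_message = None
try:
    if isinstance(start, slice) or isinstance(stop, slice):
        bounds = (start.start, stop.stop)
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

If this reading is right, any fix would need to handle the case where only one of the two bounds is a slice.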

Expected Output

I expect to get the slice of the DataFrame for the range [:'1999-09', :], the same way it works for a single-level index.
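For comparison, here is a sketch of the single-level behavior referred to above, next to one possible workaround for the two-level case. The workaround (a boolean mask on the level values) is an assumption on my side, not an official recommendation, and the names `single`, `multi`, `level_dates` and `workaround` are made up for the illustration.

```python
import pandas as pd

dates = pd.date_range('1999-07', '1999-11')

# Single-level index: the partial string '1999-09' includes all of September.
single = pd.Series(range(len(dates)), index=dates)
expected = single.loc[:'1999-09']

# Two-level index built along the lines of the one in the report.
kinds = ['black', 'red', 'white']
labels = [kinds[i % len(kinds)] for i in range(len(dates))]
multi = pd.DataFrame(
    {'content': range(len(dates))},
    index=pd.MultiIndex.from_arrays([dates, labels], names=['date', 'kind']),
)

# Possible workaround (an assumption): filter on the 'date' level values
# instead of slicing through pd.IndexSlice.
level_dates = multi.index.get_level_values('date')
workaround = multi[level_dates < pd.Timestamp('1999-10-01')]

print(len(expected), len(workaround))
```

Both selections should cover the same dates, from the start of the index through the end of September.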

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.7.final.0
python-bits: 64
OS: Linux
OS-release: 4.18.10-arch1-1-ARCH
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.0
pytest: None
pip: 18.0
setuptools: 39.2.0
Cython: 0.28.5
numpy: 1.14.3
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: None
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.6
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: 4.2.1
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: 1.2.8
pymysql: None
psycopg2: 2.7.4 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

chris-b1 (Contributor) commented Feb 5, 2019

Thanks - this is a duplicate of #15928 - PR / investigation welcome!

chris-b1 closed this as completed Feb 5, 2019
chris-b1 added the Duplicate Report label Feb 5, 2019
chris-b1 added this to the No action milestone Feb 5, 2019