.loc with date string and timestamp behave differently #15252

Closed
KwangmooKoh opened this issue Jan 28, 2017 · 3 comments
Labels
Bug, Datetime (Datetime data dtype), Indexing (Related to indexing on series/frames, not to indexes themselves), MultiIndex

Comments

@KwangmooKoh

Code Sample, a copy-pastable example if possible

>>> import pandas as pd
>>> t0 = pd.Timestamp('2010-01-01')
>>> t1 = pd.Timestamp('2012-02-02')
>>> df = pd.DataFrame([1, 2, 3], index=pd.MultiIndex.from_tuples([(t0, 'A'), (t0, 'B'), (t1, 'A')]), columns=['C'])

>>> # Compare the following results:
>>> print(df.ix['2010-01-01'])
#    C
# A  1
# B  2
>>> print(df.loc['2010-01-01'])
#               C
# 2010-01-01 A  1
#            B  2
>>> print(df.ix[t0])
#    C
# A  1
# B  2
>>> print(df.loc[t0])
#    C
# A  1
# B  2

Problem description

The result of df.loc['2010-01-01'] differs from that of df.ix['2010-01-01'] and df.loc[pd.Timestamp('2010-01-01')]: it contains an additional index level for the date. (df.ix[] returns the same data frame whether it is given a date string or a timestamp.)

Expected Output

----
   C
A  1
B  2
----
   C
A  1
B  2
----
   C
A  1
B  2
----
   C
A  1
B  2
----

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.2+0.g825876c.dirty
nose: 1.3.7
pip: 8.1.2
setuptools: 27.2.0
Cython: 0.24.1
numpy: 1.11.2
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.1.0
tables: 3.2.3.1
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.42.0
pandas_datareader: None

@jreback
Contributor

jreback commented Jan 28, 2017

pls read the docs: http://pandas-docs.github.io/pandas-docs-travis/timeseries.html#indexing

this is all as expected

strings yield a slice, while a timestamp is a point
.ix is deprecated (and never did proper multi index partial string slicing)
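
A minimal sketch of the slice-versus-point distinction described above, on a plain DatetimeIndex rather than a MultiIndex. The series `s` and its values are hypothetical and are not taken from the issue; the example assumes a string that is coarser than the index resolution (a month against daily data).

```python
import pandas as pd

# Hypothetical daily series covering January and February 2010 (59 days).
s = pd.Series(range(59), index=pd.date_range("2010-01-01", periods=59, freq="D"))

# A month-resolution string is coarser than the daily index,
# so it is treated as a slice over the whole month.
january = s.loc["2010-01"]

# A Timestamp is an exact label, so it selects a single point.
first_day = s.loc[pd.Timestamp("2010-01-01")]

print(len(january))  # number of rows selected by the string
print(first_day)     # scalar selected by the Timestamp
```

The string lookup returns a Series of every January row, while the Timestamp lookup returns one scalar.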

@jreback jreback closed this as completed Jan 28, 2017
@jreback jreback added Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex Datetime Datetime data dtype labels Jan 28, 2017
@jreback jreback added this to the No action milestone Jan 28, 2017
@jorisvandenbossche
Member

jorisvandenbossche commented Jan 28, 2017

@jreback Not sure if this is necessarily expected. It seems related to the discussion we had in #14826
cc @ischurov

Just looking at the .loc behaviour: the difference in return value seems to indicate that the string is interpreted as a slice, while the Timestamp is interpreted as an exact match. But the resolution of both the index and the string is 'day', so I think the string should be interpreted as an exact match as well?

Without the multi-index, both the string and the Timestamp return the same value (and are interpreted as an exact match):

In [2]: df2 = df.reset_index(level=1, drop=True)    

In [3]: df2
Out[3]: 
            C
2010-01-01  1
2010-01-01  2
2012-02-02  3

In [4]: df2.loc['2012-02-02']
Out[4]: 
C    3
Name: 2012-02-02 00:00:00, dtype: int64

In [5]: df2.loc[t1]
Out[5]: 
C    3
Name: 2012-02-02 00:00:00, dtype: int64

So at least there seems to be an inconsistency between index and multi-index?
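
The resolution argument in this comment can be inspected directly. The sketch below assumes the `resolution` property of `DatetimeIndex`; the `hourly` index is a hypothetical contrast case that does not appear in the issue.

```python
import pandas as pd

# Same dates as in the thread: every timestamp falls on midnight.
daily = pd.DatetimeIndex(["2010-01-01", "2010-01-01", "2012-02-02"])

# Hypothetical index with a non-midnight entry, for contrast.
hourly = pd.DatetimeIndex(["2010-01-01 00:00", "2010-01-01 06:00"])

# The resolution is inferred from the values, not from a declared frequency.
print(daily.resolution)
print(hourly.resolution)
```

A string such as '2010-01-01' also has day resolution, which is why it would be expected to act as an exact match against `daily` but as a slice against `hourly`.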

@mroeschke mroeschke added the Bug label Mar 31, 2020
@mroeschke mroeschke removed this from the No action milestone Oct 13, 2022
@mroeschke
Member

Looks like the code example in the OP is consistent now, and I believe we have tests for this, so closing
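
One way to check this on an installed pandas version is sketched below. It rebuilds the frame from the original report without the removed `.ix` accessor; the names `by_string`, `by_timestamp`, `same_values` and `same_index` are illustrative only. Whether the date level is dropped for the string lookup may depend on the pandas version, so the index comparison is printed rather than assumed.

```python
import pandas as pd

t0 = pd.Timestamp("2010-01-01")
t1 = pd.Timestamp("2012-02-02")
index = pd.MultiIndex.from_tuples([(t0, "A"), (t0, "B"), (t1, "A")])
df = pd.DataFrame([1, 2, 3], index=index, columns=["C"])

# Look up the same date once as a string and once as a Timestamp.
by_string = df.loc["2010-01-01"]
by_timestamp = df.loc[t0]

# Both lookups should select the same two rows of data.
same_values = by_string["C"].tolist() == by_timestamp["C"].tolist()

# True only if both results carry the same index (date level dropped in both).
same_index = by_string.index.equals(by_timestamp.index)

print(same_values, same_index)
```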
