BUG: partial slicing with a Timestamp on PeriodIndex #15920

Open
ocschwar opened this issue Apr 6, 2017 · 10 comments
Labels
Bug; Indexing (related to indexing on series/frames, not to indexes themselves); Period (Period data type)

Comments


ocschwar commented Apr 6, 2017

An example: a PeriodIndex with a freq of 300S. Indexing with a timestamp that falls in the first second of a period works.
A timestamp anywhere in the remaining portion of the interval raises a KeyError.

>>> DR = pd.period_range(datetime.datetime.now(), freq='300S',periods=22)
>>> S = pd.Series( [0.0]*22,index=DR) 
>>> now = datetime.datetime.now()
>>> S[now]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/pandas/core/series.py", line 603, in __getitem__
    result = self.index.get_value(self, key)
  File "/Library/Python/2.7/site-packages/pandas/tseries/period.py", line 757, in get_value
    return com._maybe_box(self, self._engine.get_value(s, key),
  File "pandas/index.pyx", line 98, in pandas.index.IndexEngine.get_value (pandas/index.c:3557)
  File "pandas/index.pyx", line 106, in pandas.index.IndexEngine.get_value (pandas/index.c:3240)
  File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:4279)
  File "pandas/src/hashtable_class_helper.pxi", line 404, in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:8564)
  File "pandas/src/hashtable_class_helper.pxi", line 410, in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:8508)
KeyError: 1491422454
>>> S.index[0]
Period('2017-04-05 20:00:23', '300S')
>>> S[datetime.datetime(2017,4,5,20,0,23)]
0.0
>>> S[datetime.datetime(2017,4,5,20,0,23,999)]
0.0
>>> DR = pd.period_range(datetime.datetime.now(), freq='5T',periods=22)
>>> S = pd.Series( [0.0]*22,index=DR)
>>> now = datetime.datetime.now()
>>> S[now]
0.0
>>> S.index[0]
Period('2017-04-05 20:03', '5T')
>>> S[datetime.datetime(2017,4,5,20,3,23,999)]
0.0
>>> S[datetime.datetime(2017,4,5,20,4,23,999)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/pandas/core/series.py", line 603, in __getitem__
    result = self.index.get_value(self, key)
  File "/Library/Python/2.7/site-packages/pandas/tseries/period.py", line 757, in get_value
    return com._maybe_box(self, self._engine.get_value(s, key),
  File "pandas/index.pyx", line 98, in pandas.index.IndexEngine.get_value (pandas/index.c:3557)
  File "pandas/index.pyx", line 106, in pandas.index.IndexEngine.get_value (pandas/index.c:3240)
  File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:4279)
  File "pandas/src/hashtable_class_helper.pxi", line 404, in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:8564)
  File "pandas/src/hashtable_class_helper.pxi", line 410, in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:8508)
KeyError: 24857044
pandas.__version__ u'0.19.2'
Contributor

jreback commented Apr 6, 2017

Can you make this example not dependent on datetime.datetime.now()? Your example is not copy-pastable.

Author

ocschwar commented Apr 6, 2017

>>> import pandas, datetime
>>> import pandas as pd
>>> DR = pd.period_range(datetime.datetime.(2017,1,1), freq='5T',periods=22)
  File "<stdin>", line 1
    DR = pd.period_range(datetime.datetime.(2017,1,1), freq='5T',periods=22)
                                           ^
SyntaxError: invalid syntax
>>> DR = pd.period_range(datetime.datetime(2017,1,1), freq='5T',periods=22)
>>> S = pd.Series( [0.0]*22,index=DR) 
>>> S[datetime.datetime(2017,1,1,0,0)]
0.0
>>> S[datetime.datetime(2017,1,1,0,1)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/pandas/core/series.py", line 603, in __getitem__
    result = self.index.get_value(self, key)
  File "/Library/Python/2.7/site-packages/pandas/tseries/period.py", line 757, in get_value
    return com._maybe_box(self, self._engine.get_value(s, key),
  File "pandas/index.pyx", line 98, in pandas.index.IndexEngine.get_value (pandas/index.c:3557)
  File "pandas/index.pyx", line 106, in pandas.index.IndexEngine.get_value (pandas/index.c:3240)
  File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:4279)
  File "pandas/src/hashtable_class_helper.pxi", line 404, in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:8564)
  File "pandas/src/hashtable_class_helper.pxi", line 410, in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:8508)
KeyError: 24720481
>>> S[datetime.datetime(2017,1,1,0,0,30)]
0.0
>>> 

Author

ocschwar commented Apr 6, 2017

Or:

import datetime
import pandas as pd
DR = pd.period_range(datetime.datetime(2017,1,1), freq='5T',periods=22)
S = pd.Series( [0.0]*22,index=DR) 
S[datetime.datetime(2017,1,1,0,0)]
S[datetime.datetime(2017,1,1,0,1)]
S[datetime.datetime(2017,1,1,0,0,30)]
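
One possible reading of the tracebacks above, offered as an assumption rather than a confirmed diagnosis: the KeyError values look like the key's ordinal in the base unit of the frequency (minutes since the Unix epoch for 5T, seconds for 300S), while the index only holds ordinals spaced 5 minutes apart. The stdlib-only sketch below reproduces that arithmetic for the example above. `minute_ordinal` is a hypothetical helper, not a pandas function.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1)


def minute_ordinal(ts):
    """Whole minutes between the Unix epoch and a naive datetime (hypothetical helper)."""
    return int((ts - EPOCH).total_seconds() // 60)


# Assumed model of the index: 22 ordinals, 5 minutes apart, starting at 2017-01-01 00:00.
start = minute_ordinal(datetime.datetime(2017, 1, 1))
index_ordinals = set(start + 5 * i for i in range(22))

keys = [
    datetime.datetime(2017, 1, 1, 0, 0),      # works in the report
    datetime.datetime(2017, 1, 1, 0, 0, 30),  # works in the report
    datetime.datetime(2017, 1, 1, 0, 1),      # KeyError: 24720481 in the report
]
for key in keys:
    ordinal = minute_ordinal(key)
    # Flooring onto the 5-minute grid gives the ordinal of the containing period.
    containing = start + ((ordinal - start) // 5) * 5
    print(key, ordinal, ordinal in index_ordinals, containing)
```

Under this model only keys whose minute ordinal coincides with a period start are found, which matches the reported behaviour: 00:00 and 00:00:30 succeed, 00:01 does not.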

Contributor

jreback commented Apr 6, 2017

Yeah, looks buggy. Period partial slicing is not fully developed; pull requests are welcome.
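
Until partial slicing handles this case, a lookup by containing period can be sketched outside the index's own `__getitem__`. This is a minimal workaround sketch, not a proposed fix for pandas: `lookup_containing_period` and its `step` parameter are hypothetical names, and it assumes a sorted PeriodIndex whose periods all span `step`. The frequency is written as `5min`, which is taken here to be equivalent to the `5T` used in the report.

```python
import datetime

import pandas as pd


def lookup_containing_period(series, ts, step):
    """Return the value whose period contains ts (hypothetical helper).

    Assumes series.index is a sorted PeriodIndex and every period spans `step`.
    """
    starts = series.index.start_time  # DatetimeIndex of period start times
    ts = pd.Timestamp(ts)
    # Position of the last period that starts at or before ts.
    pos = int(starts.searchsorted(ts, side="right")) - 1
    if pos < 0 or ts >= starts[pos] + step:
        raise KeyError(ts)
    return series.iloc[pos]


index = pd.period_range(datetime.datetime(2017, 1, 1), freq="5min", periods=22)
series = pd.Series(range(22), index=index)
step = pd.Timedelta(minutes=5)

print(lookup_containing_period(series, datetime.datetime(2017, 1, 1, 0, 1), step))
print(lookup_containing_period(series, datetime.datetime(2017, 1, 1, 0, 7), step))
```

The helper compares timestamps against period start times instead of hashing the key's ordinal, so a key anywhere inside a 5-minute interval resolves to that interval, and a key outside the indexed range raises KeyError.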

jreback added the Difficulty Intermediate, Indexing (related to indexing on series/frames, not to indexes themselves), Period (Period data type), and Datetime (Datetime data dtype) labels Apr 6, 2017
jreback added this to the Next Major Release milestone Apr 6, 2017
jreback changed the title from "PeriodIndex raising KeyError for periods that are N*freq, and keys that are n*freq, where 1<n<N" to "BUG: partial slicing with a Timestamp on PeriodIndex" Apr 6, 2017
Contributor

jreback commented Apr 6, 2017

This would need tests with datetimes & string slicing.

Contributor

jreback commented Apr 6, 2017

xref to #13429 (different though; no PeriodIndex involved).

Author

ocschwar commented Apr 8, 2017

If I understand the stack traces correctly, the bug is somewhere in pandas.core.common, right?

Contributor

jreback commented Apr 8, 2017

This is a little complicated, but you can have a look at: https://github.com/pandas-dev/pandas/blob/master/pandas/tseries/period.py#L725

Step through with a successful match and then the unsuccessful one.

@ocschwar
Author

I ran a git pull, and can confirm that the bug remains in the master branch.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/oschwarz/Desktop/git/pandas/pandas/core/series.py", line 597, in __getitem__
    result = self.index.get_value(self, key)
  File "/Users/oschwarz/Desktop/git/pandas/pandas/tseries/period.py", line 769, in get_value
    return com._maybe_box(self, self._engine.get_value(s, key),
  File "pandas/_libs/index.pyx", line 98, in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4363)
    cpdef get_value(self, ndarray arr, object key, object tz=None):
  File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4046)
    loc = self.get_loc(key)
  File "pandas/_libs/index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5085)
    return self.mapping.get_item(val)
  File "pandas/_libs/hashtable_class_helper.pxi", line 756, in pandas._libs.hashtable.Int64HashTable.get_item (pandas/_libs/hashtable.c:13913)
    cpdef get_item(self, int64_t val):
  File "pandas/_libs/hashtable_class_helper.pxi", line 762, in pandas._libs.hashtable.Int64HashTable.get_item (pandas/_libs/hashtable.c:13857)
    raise KeyError(val)
KeyError: 24720481
And I'm not quite able to wrap my head around the steps that get skipped in the stack trace.
Also, I'm wondering, what is the reason for grouping frequencies in the Period and PeriodIndex objects instead of just converting them to a time delta and storing them as such?

Contributor

jreback commented Apr 14, 2017

@ocschwar this is an open bug, not sure why you would think it's fixed.

mroeschke added the Bug label Mar 31, 2020
mroeschke removed the Datetime (Datetime data dtype) label May 11, 2020
mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022