Nanoseconds being truncated in asof #3375

Closed
schwallie opened this issue Apr 16, 2013 · 9 comments

Labels
Bug · Datetime (Datetime data dtype) · Dtype Conversions (Unexpected or buggy dtype conversions) · Indexing (Related to indexing on series/frames, not to indexes themselves)

Comments

schwallie commented Apr 16, 2013

match_time = get_info.index.asof(trade_time)
match_price = get_info[match_time]['Bid Price']

The actual index value:
<Timestamp: 2013-04-10 10:22:01.696815975-0500 CDT, tz=US/Central>

The error returned:
KeyError: u'no item named 2013-04-10 10:22:01.696815-05:00'

It seems to me that the last digits of the timestamp are being cut off in this example (.696815975 becomes .696815).
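
For reference, the truncation can be reduced to dateutil's fractional-second handling, which later comments in this thread identify as the root cause. A minimal sketch (dateutil keeps only the first six fractional digits):

from dateutil import parser

dt = parser.parse('2013-04-10 10:22:01.696815975')
print(dt.microsecond)  # 696815 -- the trailing 975 nanoseconds are dropped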

jreback (Contributor) commented Apr 16, 2013

can you show pandas version, platform, and get_info?

schwallie (Author)

pandas version = 0.10
platform = Linux

get_info is a huge, huge DataFrame that is indexed by a time series. I think we should be fine without it?

FYI, if I use

match_price = get_info[get_info.index[0]]

I get the same problem and error

jreback (Contributor) commented Apr 16, 2013

of course, just want to see a reproducible sample (you can put random data)

jreback (Contributor) commented Apr 16, 2013

related to #3060

On 0.11rc1:

In [14]: s = Series([pd.Timestamp('20130101')]).values.view('i8')[0]

In [15]: r = pd.DatetimeIndex([ s + 50 + i for i in range(100) ])

In [16]: x = Series(randn(100),index=r)

In [17]: x.asof(x.index[0])
Out[17]: -1.7275857659530389

In [20]: x.index.asof(x.index[0])
Out[20]: <Timestamp: 2013-01-01 00:00:00.000000050>

The errors:

In [21]: x['2013-01-01 00:00:00.000000050']
KeyError: '2013-01-01 00:00:00.000000050'
In [23]: x[pd.Timestamp('2013-01-01 00:00:00.000000050')]
KeyError: 1356998400000000000
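
The session above can be turned into a self-contained script. A sketch, assuming the usual imports (Series and randn come from the IPython pylab namespace in the original); on pandas versions after the fix the final lookup no longer raises:

import numpy as np
import pandas as pd

base = pd.Timestamp('20130101').value  # integer nanoseconds since the epoch
idx = pd.to_datetime([base + 50 + i for i in range(100)], unit='ns')
x = pd.Series(np.random.randn(100), index=idx)

print(x.asof(x.index[0]))        # value lookup via asof works
print(x.index.asof(x.index[0]))  # Timestamp('2013-01-01 00:00:00.000000050')

# On affected versions, string lookup fails because parsing drops the 50 ns:
try:
    x['2013-01-01 00:00:00.000000050']
except KeyError as exc:
    print('KeyError:', exc)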

wuan (Contributor) commented May 11, 2013

when writing

x[Timestamp(np.datetime64('2013-01-01 00:00:00.000000050+0000', 'ns'))]

the index access should work with #3060 now. The major problem here is that time strings are parsed with Python's datetime when creating a Timestamp object, and datetime ignores nanosecond values.

It may break a lot of things, but wouldn't it be better to use np.datetime64 to parse timestamp strings?
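
To illustrate the asymmetry (a sketch; Timestamp string parsing in later pandas versions handles nanoseconds directly):

import numpy as np
import pandas as pd
from dateutil import parser

s = '2013-01-01T00:00:00.000000050'

# numpy's datetime64 keeps the nanoseconds...
print(pd.Timestamp(np.datetime64(s, 'ns')))  # 2013-01-01 00:00:00.000000050

# ...while a Python datetime (what dateutil produces) cannot represent
# anything finer than microseconds, so the 50 ns are silently dropped:
print(parser.parse(s))  # 2013-01-01 00:00:00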

jreback (Contributor) commented May 11, 2013

http://comments.gmane.org/gmane.comp.python.pydata/688

the issue is dateutil, which truncates

can't use the numpy parser as it's broken in < 1.7
it's possible to try it first and then fall back to dateutil
might be a perf hit
also possible to detect, via some regexes, common formats that we know numpy can parse
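
A rough sketch of the fallback strategy (a hypothetical helper, not what pandas actually implements): try the nanosecond-aware numpy parser first, and fall back to dateutil only for formats numpy (>= 1.7) cannot handle; a regex pre-check for known formats would serve the same purpose:

import numpy as np
from dateutil import parser as du_parser

def parse_timestamp(text):
    try:
        # numpy >= 1.7 parses ISO-8601 strings at nanosecond resolution
        return np.datetime64(text, 'ns')
    except ValueError:
        # dateutil handles looser formats, but only down to microseconds
        return np.datetime64(du_parser.parse(text))

print(parse_timestamp('2013-01-01T00:00:00.000000050'))  # nanoseconds kept
print(parse_timestamp('Jan 1, 2013 12:00 PM'))           # dateutil fallback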

wuan (Contributor) commented May 18, 2013

Maybe the only (performant) way to improve the situation is to reimplement the dateutil parser, which is pure Python at the moment.

jreback (Contributor) commented May 18, 2013

it's possible that we could parse a few common formats in Cython/C and fall back to the generic parser,
but this would be some work

jreback modified the milestones: 0.15.0, 0.14.0 (Feb 26, 2014)
jreback (Contributor) commented Jan 26, 2015

IIRC this is fixed by some recent work in 0.15.0. If it's still not working, please open a new issue.

jreback closed this as completed Jan 26, 2015