TimedeltaIndex constructor gives wild output #9011

Closed
ahjulstad opened this issue Dec 5, 2014 · 5 comments
Labels
API Design Duplicate Report Duplicate issue or pull request Timedelta Timedelta data type

@ahjulstad
Contributor

import pandas as pd
a = pd.TimedeltaIndex(range(3), unit='S')
print(a[1])

gives output

11574 days 01:46:40

I would expect the same output as from

import pandas as pd
import numpy as np
npa = np.array(range(3),dtype='timedelta64[s]')
a = pd.TimedeltaIndex(npa)
print(a[1])

giving output

 0 days 00:00:01

This is pandas 0.15.1 as installed by conda update pandas from a fresh python 3.4 install of anaconda.


INSTALLED VERSIONS
------------------
commit: None
python: 3.4.1.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: nb_NO

pandas: 0.15.1
nose: 1.3.4
Cython: 0.21
numpy: 1.9.1
scipy: 0.14.0
statsmodels: 0.5.0
IPython: 2.2.0
sphinx: 1.2.3
patsy: 0.3.0
dateutil: 2.1
pytz: 2014.9
bottleneck: None
tables: 3.1.1
numexpr: 2.3.1
matplotlib: 1.4.0
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: None
xlsxwriter: 0.5.7
lxml: 3.4.0
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: 0.9.7
pymysql: None
psycopg2: None
@jreback
Contributor

jreback commented Dec 5, 2014

this is a dupe of #8886

you can do this

In [1]: pd.to_timedelta(range(3),unit='s')
Out[1]: 
<class 'pandas.tseries.tdi.TimedeltaIndex'>
['00:00:00', ..., '00:00:02']
Length: 3, Freq: None

the unit parameter should be accepted directly by TimedeltaIndex, with an implementation that passes it to to_timedelta.

care to do a pull-request?

@jreback jreback closed this as completed Dec 5, 2014
@jreback jreback added API Design Duplicate Report Duplicate issue or pull request Timedelta Timedelta data type labels Dec 5, 2014
@jreback jreback added this to the 0.16.0 milestone Dec 5, 2014
@ahjulstad
Contributor Author

From what I can understand, the unit parameter is already accepted by TimedeltaIndex and passed to to_timedelta. I checked the returned value: it is 1e18 ns, which indicates that the scaling factor between s and ns (1e9) is applied twice. The problem seems to be somewhere in the code here, which calls to_timedelta twice.

        if unit is not None:
            data = to_timedelta(data, unit=unit, box=False)
[snipped away]
        # convert if not already
        if getattr(data,'dtype',None) != _TD_DTYPE:
            data = to_timedelta(data,unit=unit,box=False)
            print('second time',data)

(This is code from 4f8af44#diff-c03d3accabc2e2f441d87b961f4425d7R172 )

I am not sure this is a duplicate, but then again I don't understand all the points in the original issue report.
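
To make the arithmetic above concrete, here is a minimal sketch in plain Python. It does not call pandas; `scale_to_ns` and `describe` are hypothetical helpers that only model a seconds-to-nanoseconds conversion, on the assumption that the factor of 1e9 is applied once per conversion call.

```python
from datetime import timedelta

NS_PER_S = 10**9  # nanoseconds per second


def scale_to_ns(value, unit_factor=NS_PER_S):
    """Hypothetical stand-in for one unit conversion (seconds -> nanoseconds)."""
    return value * unit_factor


def describe(ns):
    """Render a nanosecond count as a human-readable duration."""
    return str(timedelta(seconds=ns // NS_PER_S))


once = scale_to_ns(1)      # intended result: 1 s expressed in ns
twice = scale_to_ns(once)  # the same factor applied a second time

print(once, describe(once))
print(twice, describe(twice))
```

Applying the factor twice turns 1 s into 1e18 ns, which is 1e9 s, or 11574 days 01:46:40. That matches the value printed in the original report.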

@ahjulstad
Contributor Author

The following changes appear to fix my problem, but I do not fully understand why.

Link to branch in my github tree:
https://github.com/ahjulstad/pandas/tree/fix-9011

(I have added a test, as well.)

@jreback
Contributor

jreback commented Dec 6, 2014

I think you can just not pass unit on this line: https://github.com/pydata/pandas/blob/master/pandas/tseries/tdi.py#L184 (the data is already converted if a unit was passed).

Go ahead and take your test, make the change, and open a pull request. If you can do this shortly, we can include it in 0.15.2.
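
The suggestion above can be sketched as follows. This is a simplified model, not the pandas source and not necessarily the fix that was merged: `to_ns`, `build_buggy`, and `build_fixed` are hypothetical names, plain lists of integers stand in for arrays, and the dtype check is assumed to let already-converted data through to the second conversion.

```python
NS_PER_UNIT = {"ns": 1, "us": 10**3, "ms": 10**6, "s": 10**9}


def to_ns(values, unit="ns"):
    """Hypothetical stand-in for a unit-aware conversion to nanoseconds."""
    return [v * NS_PER_UNIT[unit] for v in values]


def build_buggy(data, unit=None):
    # First conversion: applied only when a unit was passed.
    if unit is not None:
        data = to_ns(data, unit=unit)
    # Second conversion: the unit is passed again, so the data is scaled twice.
    return to_ns(data, unit=unit or "ns")


def build_fixed(data, unit=None):
    if unit is not None:
        data = to_ns(data, unit=unit)
    # Second conversion without the unit: values are treated as nanoseconds.
    return to_ns(data)


print(build_buggy(range(3), unit="s"))
print(build_fixed(range(3), unit="s"))
```

In this model, dropping the unit from the second call leaves data that was already converted unchanged, while data passed without a unit is still interpreted as nanoseconds.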

@jreback
Contributor

jreback commented Dec 6, 2014

ideally also add some more of the test cases enumerated in #8886

ahjulstad added a commit to ahjulstad/pandas that referenced this issue Dec 8, 2014
@jreback jreback modified the milestones: 0.15.2, 0.16.0 Dec 9, 2014
ahjulstad added a commit to ahjulstad/pandas that referenced this issue Dec 9, 2014
yarikoptic added a commit to neurodebian/pandas that referenced this issue Jul 2, 2015
Version 0.15.2

* tag 'v0.15.2': (64 commits)
  RLS: 0.15.2 final
  DOC: fix-up docs for 0.15.2 release
  DOC: update release notes
  DOC: v0.15.2 editiing, removing several duplicated issues
  TST: period-like test for GH9012
  BUG: fix PeriodConverter issue when given a list of integers (GH9012)
  TST: fix related to dateutil test failure in test_series.py
  Return from to_timedelta is forced to dtype timedelta64[ns]. (Fixes pydata/pandas pandas-dev#9011)
  TST: dateutil fixes (GH8639)
  Fix timedelta json on windows
  DOC: expand docs on sql type conversion
  ENH: Infer dtype from non-nulls when pushing to SQL
  COMPAT: windows dtype compat w.r.t. GH9019
  COMPAT: dateutil fixups for 2.3 (GH9021, GH8639)
  DOC: fix categorical comparison example (GH8946)
  Clean up style a bit
  Fix timedeltas to work with to_json
  BUG: Fix plots showing 2 sets of axis labels when the index is a timeseries.
  API: update NDFrame __setattr__ to match behavior of __getattr__ (GH8994)
  Make Timestamp('now') equivalent to Timestamp.now() and Timestamp('today') equivalent to Timestamp.today() and pass tz to today().
  ...