Skip to content

COMPAT: periods needs coercion to np.int64 in _generate_regular_range #8943

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jreback opened this issue Nov 30, 2014 · 5 comments · Fixed by #9078
Closed

COMPAT: periods needs coercion to np.int64 in _generate_regular_range #8943

jreback opened this issue Nov 30, 2014 · 5 comments · Fixed by #9078
Labels
Compat pandas objects compatability with Numpy or Python functions
Milestone

Comments

@jreback
Copy link
Contributor

jreback commented Nov 30, 2014

https://github.com/pydata/pandas/blob/master/pandas/tseries/index.py#L1768

a simple np.int64(periods) will suffice

(and tests of course)

This needs to guaranteed to be int64 as a result (e.g. if this is in32 it will have an internal overflow which will make it have a valid but weird result)

@jreback jreback added Good as first PR Compat pandas objects compatability with Numpy or Python functions labels Nov 30, 2014
@jreback jreback added this to the 0.15.2 milestone Nov 30, 2014
@jreback jreback modified the milestones: 0.16.0, 0.15.2 Nov 30, 2014
@Garrett-R
Copy link
Contributor

Hey @jreback,

I'd like to try tackling this one as a first PR. Do you mind giving an example line of code that would cause _generate_regulate_range to have an overflow?

I thought maybe it was this line, but I couldn't find an example. I tried

pd.DatetimeIndex(start='2005', periods=np.int32(1e9), freq='s')

but in my case, stride was a Python int, and the result of periods*stride is a np.int64, so no overflow.

@jreback
Copy link
Contributor Author

jreback commented Dec 15, 2014

you actually need a bigger number

try np.iinfo(np.int32) + 1

@Garrett-R
Copy link
Contributor

Sorry, I'm probably being a bit dense here, but are you suggesting I try something like:

periods =np.iinfo(np.int32).max + 1
periods = np.int32(periods)
pd.DatetimeIndex(start='2005', periods=periods, freq='s')

In that case, the overflow happens on the second line. I can't think of how to produe an overflow inside _generate_regular_range...

@jreback
Copy link
Contributor Author

jreback commented Dec 15, 2014

look at @jorisvandenbossche example in the linked issue #8907
iirc that example generates the issue (in the index construction)

@Garrett-R
Copy link
Contributor

Cool, I think I know why I wasn't able to reproduce the bug. @jorisvandenbossche is on 32-bit Linux and I'm on 64-bit Linux, so for me, periods gets promoted to np.int64 before having a chance to overflow, whereas for @jorisvandenbossche, I think it probably stays as np.int32.

Alright, I'll do your proposed fix now.

Garrett-R added a commit to Garrett-R/pandas that referenced this issue Dec 17, 2014
In _generate_regular_range, we explicitly cast `periods` to np.int64 to prevent
overflow in the function tseries.index._generate_regular_range.
Note: we don't bother casting it for the function call generate_range since
there's no danger of overflow in that function.
shoyer added a commit that referenced this issue Dec 17, 2014
Closes #8943:  COMPAT: periods needs coercion to np.int64
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compat pandas objects compatability with Numpy or Python functions
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants