-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Use np.random's RandomState when seed is None #13161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use np.random's RandomState when seed is None #13161
Conversation
@@ -2072,7 +2072,7 @@ def _random_state(state=None): | |||
elif isinstance(state, np.random.RandomState): | |||
return state | |||
elif state is None: | |||
return np.random.RandomState() | |||
return np.random.mtrand._rand |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use something like this instead?
seed = np.random.randint(2 ** 32)
return np.random.RandomState(seed)
That way we don't need to use NumPy's private API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that will do the same?
As this is to catch a previously called np.random.seed()
, so feeding it another seed will not give the desired result I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want results to be reproducible after calling np.random.seed()
. But we don't need to reuse exactly the same seed -- it's OK to also use a seed derived from NumPy's seed.
Using the exact same seed would only be necessary if we want to promise the sample
makes the exact same choice as np.random.choice
. But that's really an implementation detail.
(Clearly a comment in the code would also be in order if we use my suggested approach.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, I see.
For me it does not really matter which approach we take (sklearn also uses _rand
, so although it is private it seems ok)
@ariddell pls also add a whatsnew entry. I would do it in API changes section. |
@jreback added. Thanks! |
@ariddell could you please fix the use of NumPy's private state as noted above? This is something simple that could avoid significant pain down the road. |
Fair enough. What should the seed be? Did you mean |
@ariddell Yes, I meant |
Using |
Let's just return |
@shoyer duck typing it is. |
@@ -2072,7 +2072,7 @@ def _random_state(state=None): | |||
elif isinstance(state, np.random.RandomState): | |||
return state | |||
elif state is None: | |||
return np.random.RandomState() | |||
return np.random |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please update the docstring, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Fixed.
On 05/17, Stephan Hoyer wrote:
@@ -2072,7 +2072,7 @@ def _random_state(state=None):
elif isinstance(state, np.random.RandomState):
return state
elif state is None:
return np.random.RandomState()
return np.random
can you please update the docstring, too?
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
https://github.com/pydata/pandas/pull/13161/files/c8a38f99637ad44dad0db9118b17fd4e3c8643f3#r63603099
you have a failing test |
@ariddell can you rebase / update |
The handle for numpy's current random state is ``np.random.mtrand._rand``. Rather than use the private API, return np.random, as the module makes available the same functions as an instance of RandomState. Compare https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/validation.py#L573
OK. I think I've got it. |
thanks @ariddell |
I'm not getting the behavior I'd expect from this fix in Pandas 0.19.2. Example:
Any thoughts on why |
@pydolan is there a missing |
@shoyer -- sorry about that; will update example with my actual code (i.e., with the one missing line) |
@pydolan It looks like you are calling |
If I move the
Note that this time, the first call to (Also note that I did verify that calling |
This is working as intended. Try just sampling random numbers from numpy -- it does the exact same thing. |
Good point; I completely spaced on the nature of random sampling. Sorry for the waste of time! |
git diff upstream/master | flake8 --diff
The handle for numpy's current random state is
np.random.mtrand._rand
.Compare https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/validation.py#L573