-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
CLN: replace pandas.compat.scipy.scoreatpercentile with numpy.percentile #6810
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -229,7 +229,7 @@ def test_timedelta_ops(self): | |||
|
|||
result = td.quantile(.1) | |||
# This properly returned a scalar. | |||
expected = to_timedelta('00:00:02.6') | |||
expected = np.timedelta64(2599999999,'ns') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a rounding issue yes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, Julian looked into the difference in rounding methods between pandas.compat.scipy.scoreatpercentile and numpy in a comment on #5824.. and also offered to update numpy. do you think this hard-coded expect should be removed and expect whatever numpy.percentile returns, in case they do change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, I think this is ok to change it, sort of go with np.percentile
results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you do a fuzzy comparison instead of equality? (I guess as its an integers almost_equal does not work)
I may still update numpy as this method saves a few precious cycles for small percentiles
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@juliantaylor this pr replace our original method so using np.percentile now fully
and ok with all numpy (incl numpy master)
if u do make a change in numpy master then we can change the test (to more of a allclose one)
pls add a release note referinc the issue (do it in API changes, even though just a compat change). |
side note, I am trying to figure out why @wesm included this in the first place, as I think |
@jreback, sorry for the delay, but I've added a second commit to this PR with:
No idea why np.percentile wasn't used originally, but will note that scipy.stats.scoreatpercentile existed first. Perhaps, when the scipy dependency was removed from pandas, the difference in rounding made copying scoreatpercentile the easiest way forward. Will also note that scipy has deprecated scoreatpercentile in favor of np.percentile. |
if com.is_datetime64_dtype(self): | ||
from pandas.tseries.tools import to_datetime | ||
values = _values_from_object(self).view('i8') | ||
result = to_datetime(_quantile(values, q * 100)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can just make this a call to Timestamp rather than use to_datetime
@gdraps looks good just a couple of minor comments timedeltas are thoroughly broken under numpy < 1.7 |
@jreback, updated the last commit of this PR based on your feedback -- let me know if you see anything else |
CLN: replace pandas.compat.scipy.scoreatpercentile with numpy.percentile
@gdraps thanks for the work on this gr8! |
PR to fix #5824.
compat.scipy.scoreatpercentile
withnumpy.percentile
.axis=0
in a few tests, becauseaxis=None
by default onnumpy.percentile
, which returns a scalar (i.e., operates on a flattened version of the array).np.timedelta64
directly sinceto_timedelta
doesn't support enough precision.)Let me know if you'd like to see any changes..