CLN: replace pandas.compat.scipy.scoreatpercentile with numpy.percentile #6810

gdraps · 2014-04-05T14:27:01Z

PR to fix #5824.

Replaces compat.scipy.scoreatpercentile with numpy.percentile.
Sets axis=0 in a few tests, because numpy.percentile defaults to axis=None, which operates on a flattened version of the array and therefore returns a scalar.
Fixes a test that fails after the switch due to differences in how fractions are computed. (Created a np.timedelta64 directly since to_timedelta doesn't support enough precision.)
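
For readers unfamiliar with the axis default mentioned above, here is a minimal sketch of the difference. The array is made-up illustration data, not taken from the pandas test suite.

```python
import numpy as np

# Made-up 3x2 array: two columns with very different scales.
arr = np.array([[1.0, 10.0],
                [2.0, 20.0],
                [3.0, 30.0]])

# Default axis=None: the array is flattened first, so one scalar comes back.
flat = np.percentile(arr, 50)

# axis=0: the percentile is computed down each column separately.
per_column = np.percentile(arr, 50, axis=0)

print(flat)        # median of all six values
print(per_column)  # one median per column
```

Tests that expected a per-column result therefore need axis=0 passed explicitly after the switch.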

Let me know if you'd like to see any changes.

jreback · 2014-04-05T14:54:08Z

pandas/tseries/tests/test_timedeltas.py

@@ -229,7 +229,7 @@ def test_timedelta_ops(self):

        result = td.quantile(.1)
        # This properly returned a scalar.
-        expected = to_timedelta('00:00:02.6')
+        expected = np.timedelta64(2599999999,'ns')


this is a rounding issue yes?

yes, Julian looked into the difference in rounding methods between pandas.compat.scipy.scoreatpercentile and numpy in a comment on #5824, and also offered to update numpy. do you think this hard-coded expected value should be removed, so that the test expects whatever numpy.percentile returns, in case they do change it?

hmm, I think this is ok to change it, sort of go with np.percentile results.

can you do a fuzzy comparison instead of equality? (I guess as it's an integer, almost_equal does not work)
I may still update numpy, as this method saves a few precious cycles for small percentiles

@juliantaylor this PR replaces our original method, so we are now using np.percentile fully
and it is ok with all numpy versions (incl. numpy master)

if you do make a change in numpy master then we can change the test (to more of an allclose one)
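
A fuzzy comparison of the kind discussed in this thread could look roughly like the sketch below. The helper `timedelta_close` and its `tol_ns` parameter are hypothetical names chosen for illustration; they are not part of pandas or numpy, and this is not the test that was merged.

```python
import numpy as np

# The two expected values from the diff above, both in nanoseconds.
old_expected = np.timedelta64(2600000000, 'ns')  # 00:00:02.6
new_expected = np.timedelta64(2599999999, 'ns')


def timedelta_close(a, b, tol_ns=1):
    """Hypothetical helper: True if a and b differ by at most tol_ns nanoseconds."""
    # Convert both values to integer nanosecond counts before comparing.
    a_ns = int(a.astype('m8[ns]').astype('i8'))
    b_ns = int(b.astype('m8[ns]').astype('i8'))
    return abs(a_ns - b_ns) <= tol_ns


exact_match = bool(old_expected == new_expected)
close_match = timedelta_close(old_expected, new_expected)
print(exact_match, close_match)
```

Comparing integer nanosecond counts with an explicit tolerance sidesteps the problem that float-oriented helpers such as almost_equal are not designed for integer-backed timedeltas.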

jreback · 2014-04-05T14:55:16Z

pls add a release note referencing the issue (do it in API changes, even though it's just a compat change).

jreback · 2014-04-05T15:15:50Z

side note, I am trying to figure out why @wesm included this in the first place, as I think np.percentile has been around a while? any idea?

gdraps · 2014-04-15T04:01:29Z

@jreback, sorry for the delay, but I've added a second commit to this PR with:

- added to release notes
- Series.quantile on datetime[ns] series changed to return a Timestamp object (with an explicit test)
- added a test for Series.quantile on timedelta64 series, which didn't require any special handling, though the test is skipped at numpy 1.6.x, because numpy.percentile throws the following exception with timedelta64 objects pre-1.7:
```
    TypeError: ufunc 'multiply' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule 'safe'
```
- removed timedelta64 handling (introduced by BUG: Series.quantile raising on an object dtype (GH6555) #6558) in Series.quantile, because not np.isscalar is always false and, in my testing, numpy.percentile properly returns timedelta64 objects on numpy 1.7 & up.

No idea why np.percentile wasn't used originally, but will note that scipy.stats.scoreatpercentile existed first. Perhaps, when the scipy dependency was removed from pandas, the difference in rounding made copying scoreatpercentile the easiest way forward. Will also note that scipy has deprecated scoreatpercentile in favor of np.percentile.

jreback · 2014-04-15T11:32:14Z

pandas/core/series.py

+        if com.is_datetime64_dtype(self):
+            from pandas.tseries.tools import to_datetime
+            values = _values_from_object(self).view('i8')
+            result = to_datetime(_quantile(values, q * 100))


you can just make this a call to Timestamp rather than use to_datetime
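
The integer-view approach in the diff above can be sketched with numpy alone, as below. The dates are made-up illustration values, and np.datetime64 stands in for the pandas Timestamp that the PR returns; pd.Timestamp also accepts an integer nanosecond count, so the last step would be analogous there.

```python
import numpy as np

# Made-up datetime64[ns] data, not taken from the pandas tests.
stamps = np.array(['2014-04-01', '2014-04-03', '2014-04-05'],
                  dtype='datetime64[ns]')

# View the datetimes as int64 nanoseconds since the epoch,
# mirroring the .view('i8') call in the diff.
as_int = stamps.view('i8')

# np.percentile works on the integers and returns a float64.
q = np.percentile(as_int, 50)

# Convert the result back to a datetime.
median = np.datetime64(int(q), 'ns')
print(median)
```

Because the intermediate result is a float64, not every nanosecond count can be represented exactly, which is in line with the rounding difference discussed earlier in this PR.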

jreback · 2014-04-15T11:35:47Z

@gdraps looks good

just a couple of minor comments

timedeltas are thoroughly broken under numpy < 1.7
so I gave up on trying to make them work a while ago

gdraps · 2014-04-16T04:35:34Z

@jreback, updated the last commit of this PR based on your feedback -- let me know if you see anything else


jreback · 2014-04-16T13:05:01Z

@gdraps thanks for the work on this gr8!

jreback reviewed Apr 5, 2014

jreback added Compat labels Apr 5, 2014

jreback added this to the 0.14.0 milestone Apr 5, 2014

jreback added the Algos label Apr 5, 2014

jreback reviewed Apr 15, 2014

gdraps added 2 commits April 16, 2014 00:10

CLN: replace pandas.compat.scipy.scoreatpercentile with numpy.percentile

6be8784

CLN/TST: return Timestamp for .quantile on datetime[ns] series

9d89f51

jreback added a commit that referenced this pull request Apr 16, 2014

Merge pull request #6810 from gdraps/replace-scoreatpercentile

f30278e

CLN: replace pandas.compat.scipy.scoreatpercentile with numpy.percentile

jreback merged commit f30278e into pandas-dev:master Apr 16, 2014

gdraps deleted the replace-scoreatpercentile branch April 16, 2014 15:48

jreback mentioned this pull request Apr 22, 2014

BUG: fix ndarray indexing with float in compat/scipy.py #6740

Closed
