BUG: Let DataFrame.quantile() handle datetime #7093

Merged
merged 1 commit into from
May 10, 2014

Conversation

TomAugspurger
Contributor

Closes #6965

@TomAugspurger
Contributor Author

@jreback I'm having trouble with a bit of this.

With a DataFrame

In [3]: df = DataFrame({'a': pd.to_datetime(['2010', '2011']), 'b': [0, 5]})

In [5]: df.dtypes
Out[5]: 
a    datetime64[ns]
b             int64
dtype: object

In [6]: df.quantile(.5, numeric_only=False)

For the implementation, I've got a function f that I apply to each column, essentially via df.apply.
When f gets the first column, a, the dtype is object, and I'm not able to convert from object back to datetime64.

ipdb> arr
0    2010-01-01 00:00:00
1    2011-01-01 00:00:00
Name: a, dtype: object
ipdb> _values_from_object(arr)
array([Timestamp('2010-01-01 00:00:00'), Timestamp('2011-01-01 00:00:00')], dtype=object)
ipdb> _values_from_object(arr).view('i8')
*** TypeError: Cannot change data-type for object array.
ipdb> arr.convert_objects(convert_dates=True)
*** TypeError: Cannot change data-type for object array.

What's the best way to get from these timestamps to viewing as i8 so that the quantiles can be computed?
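
For readers following along, here is a minimal sketch of one way to get from an object-dtype array of Timestamps to an i8 view. It is not taken from the PR; it assumes a recent NumPy and pandas, and the variable names (obj, dt, i8, roundtrip) are only illustrative.

```python
import numpy as np
import pandas as pd

# Object-dtype array of Timestamps, like the one shown in the ipdb session.
obj = np.array(
    [pd.Timestamp("2010-01-01"), pd.Timestamp("2011-01-01")], dtype=object
)

# Cast to a real datetime64[ns] array first; the object array itself
# cannot be reinterpreted as i8.
dt = obj.astype("datetime64[ns]")

# Reinterpret the datetime64[ns] data as nanoseconds since the Unix epoch.
i8 = dt.view("i8")

# The integers convert back to timestamps (interpreted as nanoseconds).
roundtrip = pd.to_datetime(i8)
```

The cast is the step that matters: once the data is datetime64[ns], the i8 view is just the same values read as integers, so any numeric routine can operate on them.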

@jreback
Contributor

jreback commented May 10, 2014

Just iterate over the columns with iteritems() and concat the results.

apply has a lot of heuristics that you don't need.
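
A rough sketch of the column-by-column approach suggested here, written against a recent pandas rather than the 0.14 code in this PR. Two assumptions to note: DataFrame.items() stands in for iteritems() (which newer pandas releases no longer provide), and Series.quantile is used per column instead of the PR's internal function f.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "a": pd.to_datetime(["2010-01-01", "2011-01-01"]),
        "b": [0, 5],
    }
)

# Walk the columns one at a time so each Series keeps its own dtype;
# nothing is coerced to object along the way.
per_column = {name: col.quantile(0.5) for name, col in df.items()}

# Collect the per-column results into a single Series.
result = pd.Series(per_column)
```

Because each column is handled as its own Series, the datetime column stays datetime64[ns] throughout, which sidesteps the object-dtype problem described above.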

@TomAugspurger
Contributor Author

Thanks. That's a lot better.

@jreback jreback added this to the 0.14.0 milestone May 10, 2014
quantiles = [[f(vals, x) for x in per]
             for (_, vals) in data.iteritems()]
result = DataFrame(quantiles, index=data._info_axis, columns=q).T
if len(is_dt_col) > 0:
Contributor

I don't think you actually need the if (if it's empty, it just won't iterate over anything)... but no biggie

Contributor Author

I left it out originally, but I hit an error somewhere, I can't remember exactly what.

Closes pandas-dev#6965

previously returned nonsense
@jreback
Contributor

jreback commented May 10, 2014

no prob...this looks fine otherwise

TomAugspurger pushed a commit that referenced this pull request May 10, 2014
BUG: Let DataFrame.quantile() handle datetime
@TomAugspurger TomAugspurger merged commit aa31fd1 into pandas-dev:master May 10, 2014
@TomAugspurger TomAugspurger deleted the quantile-datetime branch November 3, 2016 12:37
Labels
Bug
Datetime (Datetime data dtype)
Dtype Conversions (Unexpected or buggy dtype conversions)
Numeric Operations (Arithmetic, Comparison, and Logical operations)
Linked issue
BUG: DataFrame.quantile fails on datetime values
2 participants