Extra Bin with Pandas Resample in 0.11.0 #4076
Comments
This is curiously not the case if I pass how='count' -- no extra bin is returned. This makes me suspect a bug:
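One way to check the observation above is to compare the bin labels that a count and a mean return for the same data. This is only a sketch: the series is a hypothetical stand-in for the reporter's frame (28 daily values, start date assumed), and it uses the method-call spelling from later pandas releases, where `how='count'` became `.count()`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the reporter's data: 28 daily values,
# which split evenly into four 7-day bins. The start date is an assumption.
idx = pd.date_range("2001-05-04", periods=28, freq="D")
s = pd.Series(np.arange(28, dtype=float), index=idx)

counts = s.resample("7D").count()
means = s.resample("7D").mean()

# Both aggregations should report exactly the same bin labels.
same_bins = counts.index.equals(means.index)
print(counts)
print("same bins:", same_bins)
```

If `same_bins` comes out false, or the two results differ in length, the aggregations disagree about the bin edges, which is the symptom described in this comment.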
A somewhat related issue in master is that there are no longer zeros there; there are garbage values instead. This is a bug in how Python vs. cythonized methods work; for example, passing a lambda works
This also seems to act differently with different resample frequencies. With a frequency of 'AS', how='sum' yields the correct answer while how=lambda x: numpy.sum(x) does not:
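The built-in and lambda aggregations mentioned above can be compared directly at a year-start frequency. A minimal sketch, assuming a later pandas release where aggregations are method calls rather than `how=` arguments; the data is hypothetical, and `pd.offsets.YearBegin()` is used as the offset object that the 'AS' alias stands for.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two full calendar years of daily values, all ones,
# so each yearly sum equals the number of days in that year.
idx = pd.date_range("2000-01-01", "2001-12-31", freq="D")
s = pd.Series(np.ones(len(idx)), index=idx)

# YearBegin() is the offset behind the 'AS' alias used in the comment.
yearly = s.resample(pd.offsets.YearBegin())

builtin = yearly.sum()                      # built-in (cythonized) aggregation
custom = yearly.apply(lambda x: np.sum(x))  # Python-level callable

# The two paths should agree on both bin labels and values.
agree = builtin.index.equals(custom.index) and np.allclose(
    builtin.values, custom.values
)
print("sum and lambda agree:", agree)
```

A disagreement here would point at the binning used for the Python-level path rather than at the data.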
your last example is an issue with |
I have also been having issues with resample adding extra bins (also in 0.11.0), and just thought I'd add that I can also see it even when the number of bins is not evenly divisible:
>>> x = pandas.DataFrame(numpy.random.randn(9, 3), index=pandas.date_range('2000-1-1', periods=9))
>>> x
                   0         1         2
2000-01-01 -1.191405  0.645320  1.308088
2000-01-02  1.229103 -0.727613  0.488344
2000-01-03  0.885808  1.381995 -0.955914
2000-01-04 -1.013526 -0.225070 -0.163507
2000-01-05  0.670316 -0.828281 -0.233381
2000-01-06  1.357537  1.446020 -0.661463
2000-01-07  0.335799  0.952127  0.591679
2000-01-08 -0.083534  1.025077 -0.146682
2000-01-09 -1.338294  1.919551  0.446385
>>> x.resample('5D')
                   0         1              2
2000-01-01  0.116059  0.049270   8.872589e-02
2000-01-06  0.067877  1.335694   5.747979e-02
2000-01-11  0.591679  0.146682  3.952525e-322
I don't have any particular insight to add, but maybe this extra info will help...
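The nine-row case above can be turned into a self-checking script: nine daily rows at '5D' should fall into a five-row bin and a four-row bin, so the result can be compared against means computed by hand. This is a sketch that assumes a later pandas release (method-call API instead of `how=`) and uses freshly generated random values, not the numbers shown in the session.

```python
import numpy as np
import pandas as pd

# Fresh random data with the same shape as in the session above;
# the values themselves are not the ones from the comment.
rng = np.random.default_rng(0)
x = pd.DataFrame(
    rng.standard_normal((9, 3)),
    index=pd.date_range("2000-01-01", periods=9),
)

binned = x.resample("5D").mean()

# Bins computed by hand: rows 0-4 (Jan 1-5) and rows 5-8 (Jan 6-9).
expected_values = np.vstack(
    [x.iloc[:5].mean().values, x.iloc[5:].mean().values]
)

matches = binned.shape == (2, 3) and np.allclose(binned.values, expected_values)
print(binned)
print("matches hand-computed bins:", matches)
```

A third row labelled 2000-01-11, as in the output above, would make `matches` false.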
@cpcloud 0.13 or push?
like to do 0.13 but got a lot on my plate already ... let me see if there's anything else i can push to 0.14 in favor of this
up2u
pushing for now...can always pull back!
Ok
I've got a pandas data frame defined like this, using pandas 0.11.0:
and when I go to resample it:
I end up with an extra empty bin I wouldn't expect to see -- 2001-06-01. My 28 days divide evenly into the 7-day resample I'm performing, so that bin shouldn't be there. I've tried messing around with the closed kwarg, but I can't escape that extra bin. This seems like a bug, and it messes up my mean calculations when I try to do
as the numbers are being divided by 5 rather than 4. (I also wouldn't expect REST_KEY to show up in the aggregation columns as it's part of the groupby, but that's really a smaller problem.)
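The effect on the mean can be illustrated with a little arithmetic. The sketch below uses hypothetical data (one unit per day for 28 days, start date assumed from the 2001-06-01 label) and simulates the spurious fifth bin by hand, since the original frame is not shown. It then applies one possible guard, which is an assumption of this example and not a fix proposed in the thread: drop bins that received no rows before averaging.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one unit per day for 28 days (start date assumed).
idx = pd.date_range("2001-05-04", periods=28, freq="D")
s = pd.Series(np.ones(28), index=idx)

weekly = s.resample("7D").sum()    # weekly totals
sizes = s.resample("7D").count()   # rows per bin

# Simulate the reported symptom by appending an empty 2001-06-01 bin.
labels = weekly.index.union(pd.DatetimeIndex(["2001-06-01"]))
weekly_extra = weekly.reindex(labels, fill_value=0.0)
sizes_extra = sizes.reindex(labels, fill_value=0)

naive_mean = weekly_extra.mean()                      # 28 / 5 bins
guarded_mean = weekly_extra[sizes_extra > 0].mean()   # 28 / 4 bins

print("naive:", naive_mean, "guarded:", guarded_mean)
```

Filtering on the count rather than on the total keeps bins whose values legitimately sum to zero.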