-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: Added functionality in resample to resolve #10530 #10543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -423,6 +423,14 @@ def test_series_bin_grouper(): | |||
exp_counts = np.array([3, 3, 4], dtype=np.int64) | |||
assert_almost_equal(counts, exp_counts) | |||
|
|||
def test_resample_with_timedeltaindex(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be in pandas/tseries/tests/test_resample
. no need to import things, if they are not in the namespace import at the top of the file (but they are already in any event)
@jreback — thanks! |
@jreback — any comments on the updates? |
@@ -324,6 +326,11 @@ def _resample_timestamps(self, kind=None): | |||
|
|||
result.index = result.index + loffset | |||
|
|||
if kind == "timedelta" and len(result.index) > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can just do the calculation in _get_time_delta_bins
. base
is just added to the labels (but not the binner).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback: I had tried putting it in _get_time_delta_bins
to avoid cluttering the _resample_timestamp
function with type-specific code but couldn't get it working properly. I think the code that takes care of downsampling might have something to do with it.
I can try and get it working, but it'll take me a while...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np..glad you are working on it!
@jreback: I've been working on this and managed to move the code to Do you have any insights on this? Reproducibility Instructions
And the crash report:
|
try this: jreback@f593a22 your test is not right though. it should be the same length as resampling w/o the base. It just shifts the labels. |
@jreback: The documentation is a little vague on this and I haven't used
An explanation would be much appreciated as I'm not so savvy with TimeDeltaIndex! :) |
base should work the same for any of the datetimelikes it's just an add to every value it's not really s shift but it works the same way |
@jreback — Agreed. The documentation on it is pretty poor. Did another retrace of how |
I actually use |
@jreback: Made updates. Updated the tests a bit so that functionality aligns with what happens when you |
@@ -450,6 +451,7 @@ def test_resample_dup_index(self): | |||
# dup columns with resample raising | |||
df = DataFrame(np.random.randn(4,12),index=[2000,2000,2000,2000],columns=[ Period(year=2000,month=i+1,freq='M') for i in range(12) ]) | |||
df.iloc[3,:] = np.nan | |||
print(df) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should go away
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shoyer – Whoops! Thanks for catching it, missed that one!
looks good pls rebase on master and squash |
b6992aa
to
3a6143b
Compare
@jreback — squashed. |
@@ -122,7 +121,7 @@ def _get_binner_for_resample(self, kind=None): | |||
self.binner, bins, binlabels = self._get_time_delta_bins(ax) | |||
else: | |||
self.binner, bins, binlabels = self._get_time_period_bins(ax) | |||
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try not to add white space
pls add a note in whatsnew |
5d1945d
to
4a91eac
Compare
That outta do it, @jreback — let me know if it needs anything else. |
@@ -94,6 +94,8 @@ Other enhancements | |||
|
|||
- Enable `read_hdf` to be used without specifying a key when the HDF file contains a single dataset (:issue:`10443`) | |||
|
|||
- Added functionality to use `base` when resampling a ``TimeDeltaIndex`` (:issue:`10530`) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use double back ticks on base
Added tests for updated resample function Changed if-statement to be lower-bound inclusive Undid previous change to if statement Fixed typo in resample.py Fixed typo in _get_time_bins Updated _resample_timestamp function Updated condition in if-statement Updated exceptions raised in resample Moved test case into proper file Fixed typo in test case Updated tests for resampling fix ENH: Updated code for fixing pandas-dev#10530 Removed extraneous print statements from tests Moved code for fix to _get_time_delta_bins function Updated tests for resample TimeDeltaIndex with base Updated code for resample TimeDeltaIndex with base Removed print statements from test case Removed print statement in tests Added note to what's new Removed extra whitespace Removed addtional whitespace Removed whitespace Removed whitespace in resample.py Removed more whitespace in resample.py Removed more whitespace Fixed syntax in what's new
4a91eac
to
83a15af
Compare
merged via 4336020 thanks! |
closes #10530
I'm not 100% sure that I implemented this per the documentation or that my tests are complete but I'd be glad to make any necessary changes.