ENH: Initial pass at implementing DataFrame.asof, GH 2941 #10266
pls rebase on master and repush; had an issue with some builds
Sorry about that, there must have been some changes between my rebase and the travis build. Anyway, it's green now.
Implements `DataFrame.asof` with various possible logics for skipping missing elements. The default case is equivalent to `df.apply(lambda s: s.asof(where))`.
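For readers following along, here is a small sketch of what that default equivalence means in practice. The frame, the column names `a` and `b`, and the lookup time are made up for illustration; only `Series.asof` and `DataFrame.apply` are real pandas calls.

```python
import numpy as np
import pandas as pd

# Illustrative data (not from the PR): two columns with gaps in different places.
idx = pd.date_range("2015-01-01", periods=4, freq="D")
df = pd.DataFrame(
    {"a": [1.0, 2.0, np.nan, 4.0], "b": [10.0, np.nan, np.nan, 40.0]},
    index=idx,
)
where = pd.Timestamp("2015-01-03 12:00")

# Each column is looked up independently with Series.asof, so the values
# in the result may come from different rows of the frame.
per_column = df.apply(lambda s: s.asof(where))
print(per_column)
```

In this example `a` is taken from the 2015-01-02 row and `b` from the 2015-01-01 row, which is why the per-column default cannot be expressed as "pick one row".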
Reworked this using `take_2d_multi`, thoughts?
Is there a good reason why this is restricted to timestamps? I suspect not, and it would be nice to generalize things a bit. I recently cleaned up the related |
return Series(index=self.columns, data=np.nan)

s = self.iloc[loc, :].copy()
s.name = None
The name of this series should probably be the `where` argument. This will be necessary for the consistency check I describe below.
else:
    raise ValueError("skipna must be one of percolumn, none, any, all.")

if not hasattr(where, '__iter__'):
It might be simpler just to call `self.asof([where]).iloc[0]`, though you might need to disable the "setting item with copy" warning in that case.
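A rough sketch of that delegation pattern, written as a free function rather than as the method in this PR. The name `frame_asof` and the sample data are hypothetical; the list-like branch simply reuses `Series.asof` per column and is not the PR's implementation.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_list_like


def frame_asof(df, where):
    """Hypothetical helper: per-column asof for scalar or list-like `where`."""
    if not is_list_like(where):
        # Scalar case delegates to the list-like case and takes the only row.
        # The resulting Series is named after `where`.
        return frame_asof(df, [where]).iloc[0]
    where = pd.Index(where)
    # Series.asof with an Index returns a Series indexed by `where`.
    return pd.DataFrame({col: df[col].asof(where) for col in df.columns})


idx = pd.date_range("2015-01-01", periods=4, freq="D")
df = pd.DataFrame(
    {"a": [1.0, 2.0, np.nan, 4.0], "b": [10.0, np.nan, np.nan, 40.0]},
    index=idx,
)
where = pd.Timestamp("2015-01-03 12:00")

row = frame_asof(df, where)
print(row.name, row.to_dict())
```

Since nothing is assigned into the intermediate frame here, this sketch should not trigger the copy warning; a version that sets `s.name` on a slice might.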
@jreback Indeed, this does almost work already with |
no I get that this handles multiple versions, and maybe that's a reason to special case somewhat, however, this should be an internal type of method (meaning implemented in the |
Thanks for the feedback. The motivation for Making it a part of index functionality as @jreback suggests, seems a natural choice to make the behaviour available to both frame and series, and also to non timeseries indexes. There's really only two fundamental cases: either skip missings everywhere I'll take a stab at implementing a |
@bwillers I think you have it backwards -- Here's my mapping from this method to
Which case would |
@shoyer Ah, well I feel quite the idiot right now. For some reason I was under the impression that
I was expecting that bottom right So since all the Sorry for the cycles wasted on my poor understanding. |
@bwillers No worries! We would still appreciate documentation updates and/or cleanup/deprecation for |
Fixes #2941
This can almost certainly be made quicker; I'm still digging into the
internals to understand the various underlying indexers.
There are 4 distinct logics you can apply to how missing values are
handled (as opposed to the Series asof, where there are just two:
either skipna or don't skipna). The default is to be equivalent to
df.apply(lambda s: s.asof(where)).
Likely not ready to actually be merged, but at a stage where I could
use some feedback on both the logic and implementation.
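To make the four modes easier to discuss, here is one possible reading of them as a standalone sketch. The mode names come from the error message in the diff (`percolumn`, `none`, `any`, `all`), but what each one does is an assumption on my part, since the thread does not spell it out. The function name `asof_row` and the data are hypothetical, and it assumes a sorted index.

```python
import numpy as np
import pandas as pd


def asof_row(df, where, skipna="percolumn"):
    """Hypothetical sketch; the meaning of each mode is assumed, not confirmed."""
    past = df.loc[:where]  # rows at or before `where` (index must be sorted)
    if skipna == "percolumn":
        rows = past.ffill()  # last non-missing value of each column
    elif skipna == "none":
        rows = past  # last row as-is, missing values included
    elif skipna == "any":
        rows = past[past.notna().all(axis=1)]  # skip rows with any missing value
    elif skipna == "all":
        rows = past[past.notna().any(axis=1)]  # skip rows that are entirely missing
    else:
        raise ValueError("skipna must be one of percolumn, none, any, all.")
    if rows.empty:
        return pd.Series(np.nan, index=df.columns)
    return rows.iloc[-1]


idx = pd.date_range("2015-01-01", periods=4, freq="D")
df = pd.DataFrame(
    {"a": [1.0, 2.0, np.nan, 4.0], "b": [10.0, np.nan, np.nan, 40.0]},
    index=idx,
)
where = pd.Timestamp("2015-01-03 12:00")

results = {mode: asof_row(df, where, mode) for mode in ("percolumn", "none", "any", "all")}
for mode, row in results.items():
    print(mode, row.to_dict())
```

On this data the four modes give four different answers, which is a quick way to check whether the intended semantics match these assumptions.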