Split/Partition Master Issue #7387
Comments
Here's a way to get started I think. Since something like this:
and
So then can easily add some more keyword args to the current here is the current:
so I would propose this functionality:
something like this:
If
looks good, only suggestion i have is to use the
could pad if it's a tuple tho
I'll make some time to dig through the pandas questions on SO and look for use cases that we should cover while we're addressing this.
Since there are only 2 issues here, I think it's okay to track those issues separately, as this tracker isn't really used anymore. Closing.
As pointed out by @dsm054, there are multiple lurking split/partition API requests. Here are the issues and a short summary of what they would do (there are some duplicates here, I've checked off those issues/PRs that have been closed in favor of a related issue):
- `pd.rolling_mean(ts, window='30min')` and possibly even arbitrary windows using another column
- `split` method on pandas objects, playing around with ideas
- `n` samples of a bin.
- `np.array_split` style API where you can split a pandas object into a list of `k` groups of possibly unequal size (could be a thin wrapper around `np.array_split`, or more integrated into the pandas DSL). IMO, this issue provides the best starting point for an API. SO usage
- `groupby` to have `itertools.groupby` semantics (i.e., preserve the order of duplicated group keys), i.e., `'aabbaa'` would yield groups `['aa', 'bb', 'aa']` rather than `['aaaa', 'bb']`. There'd have to be some changes to the use of `dict` in the groupby backend, as noted by @y-p here: API for splitting pandas objects #4059 (comment).
The `toolz` library has a `partitionby` function that provides a nice way to do some of the splitting on sequences and might provide us with some insight on how to approach the API.
cc @jreback @jorisvandenbossche @hayd @danielballan
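For reference, here is a minimal sketch of the `np.array_split`-style behavior mentioned above: splitting a pandas object into `k` row-wise groups of possibly unequal size. The helper name `split_frame` is hypothetical and is not an existing pandas API. It splits the row positions with `np.array_split` and selects each chunk with `.iloc`, which is only one possible way to wrap it.

```python
import numpy as np
import pandas as pd


def split_frame(obj, k):
    """Hypothetical helper (not a pandas API): split `obj` row-wise into k chunks.

    Chunk sizes follow np.array_split: they differ by at most one row,
    with the larger chunks coming first.
    """
    # Split the integer row positions, then select each chunk positionally.
    positions = np.array_split(np.arange(len(obj)), k)
    return [obj.iloc[idx] for idx in positions]


df = pd.DataFrame({"x": range(7)})
chunks = split_frame(df, 3)
sizes = [len(chunk) for chunk in chunks]  # 7 rows over 3 chunks: [3, 2, 2]
```

Because the helper only uses `len` and `.iloc`, the same sketch should work for a `Series` as well as a `DataFrame`.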
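To make the `itertools.groupby` semantics concrete, the sketch below contrasts the two behaviors on the `'aabbaa'` example from the list above. The last part emulates run-based grouping in pandas by comparing each value with its predecessor and taking a cumulative sum. That is a commonly used workaround, shown here as an illustration and not as the API proposed in this issue.

```python
from itertools import groupby

import pandas as pd

letters = "aabbaa"

# itertools.groupby starts a new group every time the key changes.
runs = ["".join(group) for _, group in groupby(letters)]

s = pd.Series(list(letters))

# pandas groupby collects all rows with the same key into one group.
by_key = s.groupby(s, sort=False).agg("".join).tolist()

# Workaround: label each run of consecutive equal values, then group by that label.
run_id = (s != s.shift()).cumsum()
by_run = s.groupby(run_id).agg("".join).tolist()

print(runs)    # ['aa', 'bb', 'aa']
print(by_key)  # ['aaaa', 'bb']
print(by_run)  # ['aa', 'bb', 'aa']
```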