group_by produces 'minlength must be positive error' when applied to empty DataFrame #11699
Comments
cc @behzadnouri @Sereger13 I don't think there is an easy way around this w/o resorting to patching.
I see... We found that this code:
If you do decide to fix size(), is there any idea when the next version or patch is going to be available? Thanks.
will be fixed; 0.18.0, probably later in January
Thanks.
@Sereger13 my point about patching is that you can avoid any code changes. Note again that it is a 'hack', but it will work, e.g.
Great - thanks for your help.
This is more a bug in
diff --git a/pandas/core/groupby.py b/pandas/core/groupby.py
index e9aa906..d722ef8 100644
--- a/pandas/core/groupby.py
+++ b/pandas/core/groupby.py
@@ -1439,7 +1439,8 @@ class BaseGrouper(object):
         """
         ids, _, ngroup = self.group_info
         ids = com._ensure_platform_int(ids)
-        out = np.bincount(ids[ids != -1], minlength=ngroup)
+        mask = ids != -1
+        out = np.bincount(ids[mask], minlength=ngroup) if ngroup != 0 else []
         return Series(out, index=self.result_index, dtype='int64')
 
     @cache_readonly
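The guard in the patch above can be tried outside pandas. The following is only a sketch of the same idea, not the code that was merged: `group_sizes` is a hypothetical helper name, and it returns an empty integer array where the patch uses `[]`.

```python
import numpy as np


def group_sizes(ids, ngroup):
    """Count rows per group label, skipping the -1 'no group' marker.

    Hypothetical standalone helper that mirrors the guard in the patch:
    np.bincount is only called when there is at least one group.
    """
    ids = np.asarray(ids, dtype=np.intp)
    if ngroup == 0:
        # No groups at all: return an empty result instead of
        # passing minlength=0 to np.bincount.
        return np.array([], dtype=np.int64)
    mask = ids != -1
    return np.bincount(ids[mask], minlength=ngroup).astype(np.int64)


non_empty = group_sizes([0, 1, 1, -1, 2], ngroup=3)
empty = group_sizes([], ngroup=0)
```

Branching on `ngroup` before the call keeps the non-empty path identical to the original line, so only the empty case changes behaviour.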
Interesting... thanks for the update. Yes they could have made So it looks like simply setting ngroup to None should also do the trick:
Not sure this is more readable than @behzadnouri's solution though. Looking forward to a new pandas with the workaround!
This used to work fine in previous versions but appears to be broken in 0.17.1.
The following code:
Produces this error:
In v0.16.2 the same code produced an empty DataFrame. We'd really like to upgrade to 0.17.1, but we rely heavily on this functionality and so have to hold the upgrade. Checking for an empty DataFrame is not going to work for us either, as there are too many places where it can actually be empty.
If you can suggest a workaround in the meantime so that we can upgrade, that would be appreciated.
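The snippet from the original report is not preserved in this copy. As an illustration only, the kind of call being described (grouping a DataFrame that has columns but no rows, then asking for group sizes) might look like the sketch below. The column names `key` and `value` are made up, and this is not the reporter's code.

```python
import pandas as pd

# Hypothetical frame: two typed columns, zero rows.
df = pd.DataFrame({
    "key": pd.Series([], dtype="object"),
    "value": pd.Series([], dtype="float64"),
})

# On 0.17.1 a call of this shape is what the report says raised
# "minlength must be positive"; on versions containing the fix it
# should give an empty result instead.
sizes = df.groupby("key").size()
print(len(sizes))
```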
INSTALLED VERSIONS
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.18-238.9.1.el5
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US
pandas: 0.16.2
...