BUG AttributeError: 'DataFrameGroupBy' object has no attribute '_obj_with_exclusions' #11640

nbonnotte · 2015-11-18T14:27:07Z

I guess it will be clearer with an example. First, let's prepare the dataframe:

In [2]: df = pd.DataFrame(columns=['a','b','c','d'], data=[[1,'b1','c1',3], [1,'b2','c2',4]])

In [3]: df = df.pivot_table(index='a', columns=['b','c'], values='d').reset_index()

In [4]: df
Out[28]: 
b  a b1 b2
c    c1 c2
0  1  3  4

Now, the exception raised:

In [5]: df.groupby('a').mean()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-29-a830c6135818> in <module>()
----> 1 df.groupby('a').mean()

/home/nicolas/Git/pandas/pandas/core/groupby.py in mean(self)
    764             self._set_selection_from_grouper()
    765             f = lambda x: x.mean(axis=self.axis)
--> 766             return self._python_agg_general(f)
    767 
    768     def median(self):

/home/nicolas/Git/pandas/pandas/core/groupby.py in _python_agg_general(self, func, *args, **kwargs)
   1245                 output[name] = self._try_cast(values[mask], result)
   1246 
-> 1247         return self._wrap_aggregated_output(output)
   1248 
   1249     def _wrap_applied_output(self, *args, **kwargs):

/home/nicolas/Git/pandas/pandas/core/groupby.py in _wrap_aggregated_output(self, output, names)
   3529     def _wrap_aggregated_output(self, output, names=None):
   3530         agg_axis = 0 if self.axis == 1 else 1
-> 3531         agg_labels = self._obj_with_exclusions._get_axis(agg_axis)
   3532 
   3533         output_keys = self._decide_output_index(output, agg_labels)

/home/nicolas/Git/pandas/pandas/core/groupby.py in __getattr__(self, attr)
    557 
    558         raise AttributeError("%r object has no attribute %r" %
--> 559                              (type(self).__name__, attr))
    560 
    561     def __getitem__(self, key):

AttributeError: 'DataFrameGroupBy' object has no attribute '_obj_with_exclusions'

Maybe I'm doing something wrong, and it's not a bug, but then the exception raised should definitely be more explicit than a reference to an internal attribute :-)

This attribute, by the way, is (only) referenced in one file and in issue #5264. It might be connected, but the discussion is a bit long and technical.

I'll try to have a look at what's going on.

jreback · 2015-11-18T14:37:38Z

It should be a better error message, but you are grouping on something which is not a column: your
columns are a MultiIndex.

In [16]: df.columns
Out[16]: 
MultiIndex(levels=[[u'b1', u'b2', u'a'], [u'c1', u'c2', u'']],
           labels=[[2, 0, 1], [2, 0, 1]],
           names=[u'b', u'c'])

In [17]: df.index
Out[17]: Int64Index([0], dtype='int64')

In [18]: df.columns.values
Out[18]: array([('a', ''), ('b1', 'c1'), ('b2', 'c2')], dtype=object)
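
The same check can be written as a short script. This is a minimal sketch, assuming a recent pandas in which `pivot_table(...).reset_index()` still stores the former index under the tuple label `('a', '')`; the names `wide`, `is_multi` and `labels` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    columns=["a", "b", "c", "d"],
    data=[[1, "b1", "c1", 3], [1, "b2", "c2", 4]],
)

# Pivoting on two columns gives two-level column labels; reset_index()
# then moves 'a' back into the columns under a padded tuple label.
wide = df.pivot_table(index="a", columns=["b", "c"], values="d").reset_index()

# Every column label is now a tuple, so 'a' on its own is only a partial key.
is_multi = isinstance(wide.columns, pd.MultiIndex)
labels = list(wide.columns)

print(is_multi)
print(labels)
```

Printing the labels as tuples makes it easier to see which full key a later `groupby` or `drop` call has to name.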

what exactly are you trying to do?

nbonnotte · 2015-11-18T14:43:48Z

I'm trying to group according to the column a, or ('a',''). What would be the proper way?

jreback · 2015-11-18T14:48:48Z

In [27]: df = pd.DataFrame(columns=['a','b','c','d'], data=[[1,'b1','c1',3], [1,'b2','c2',4]])

In [28]: df
Out[28]: 
   a   b   c  d
0  1  b1  c1  3
1  1  b2  c2  4

In [29]: df.groupby('a').mean()
Out[29]: 
     d
a     
1  3.5

nbonnotte · 2015-11-18T15:12:44Z

But that's not the result I would expect: with my dumb example, I would like to get the same dataframe.

BTW, if df['a'] works whatever the status of a, wouldn't it be nice to be able to group according to a as well?

jreback · 2015-11-18T16:20:24Z

What are your expectations for a result here? Please show an example.

a is not a group in your example

nbonnotte · 2015-11-18T16:35:01Z

I would like a dataframe such as the one shown in Out[91] below to yield,

after grouping by a and taking the mean,

b b1   b2
c c1   c2
a        
1  4  4.5

where the first dataframe is for instance obtained with

In [88]: df = pd.DataFrame(columns=['a','b','c','d'], data=[[1,'b1','c1',3], [1,'b2','c2',4], [2,'b1','c1',5], [2,'b2','c2',5]]).pivot_table(index='a', columns=['b','c'], values='d').reset_index()

In [89]: df
Out[89]: 
b  a b1 b2
c    c1 c2
0  1  3  4
1  2  5  5

In [90]: df['a'] = 1

In [91]: df
Out[91]: 
b  a b1 b2
c    c1 c2
0  1  3  4
1  1  5  5

jreback · 2015-11-18T16:47:43Z

In [17]: df.groupby([('a','')]).mean()
Out[17]: 
b     b1   b2
c     c1   c2
(a, )        
1      4  4.5
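
A self-contained version of this call, as a sketch only. It assumes a recent pandas and builds the MultiIndex columns directly instead of pivoting; the frame mirrors Out[91] from the previous comment.

```python
import pandas as pd

# Column labels mirror the pivoted frame: ('a', '') plus two value columns.
cols = pd.MultiIndex.from_tuples(
    [("a", ""), ("b1", "c1"), ("b2", "c2")], names=["b", "c"]
)
df = pd.DataFrame([[1, 3, 4], [1, 5, 5]], columns=cols)

# The list holds exactly one key, and that key is the full column label.
result = df.groupby([("a", "")]).mean()

print(result)
```

Wrapping the tuple in a list makes it unambiguous that `('a', '')` is a single column label rather than two separate keys.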

nbonnotte · 2015-11-18T17:07:31Z

So that was that... I had tried

In [99]: df.groupby(('a', '')).mean()
Out[99]: 
b  a b1 b2
c    c1 c2
   1  5  5
a  1  3  4

(the result of which I don't quite understand, but never mind) but without enclosing it in brackets. Thanks!

jreback · 2015-11-18T17:27:05Z

gr8.

If you are interested in improving the error message for the above case, that would be great.

nbonnotte · 2015-11-19T08:53:33Z

Sure!

nbonnotte · 2015-11-24T13:08:45Z

@jreback digging into this issue, I think what is happening here is not so much a reporting problem as a real bug. Indeed, my example shows that issue #11185 was only partially solved by PR #11202:

In [3]: df = pd.DataFrame(columns=['a', 'b', 'c', 'd'],
                       data=[[1, 'b1', 'c1', 3]])

In [4]: df.groupby('z').mean()
Out[4]: <pandas.core.groupby.DataFrameGroupBy object at 0x7f57f363d510>

This should raise a KeyError. Because it does not, execution continues until it reaches the AttributeError that is the subject of this issue. The KeyError is missing because the list of keys passed (here ['z']) has the same length as the index, which in turn causes match_axis_length to be True in the following line:

https://github.com/pydata/pandas/blob/b07dd0cbd6d18c55aaa0043d85f42a483eab7dbb/pandas/core/groupby.py#L2210

I'll dig a bit deeper before making a PR
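
Independently of where pandas itself raises, calling code can check the key up front. The following is only a sketch: `groupby_existing` is a hypothetical helper written for illustration, not a pandas API, and it only handles a single column label.

```python
import pandas as pd


def groupby_existing(frame, key):
    """Group by a column label, failing early if the label is absent.

    Hypothetical helper for illustration; not part of pandas.
    """
    if key not in frame.columns:
        raise KeyError(key)
    return frame.groupby(key)


# Single-row frame, as in the example above.
df = pd.DataFrame(columns=["a", "b", "c", "d"], data=[[1, "b1", "c1", 3]])

try:
    groupby_existing(df, "z")
    raised = False
except KeyError:
    raised = True

# An existing column still groups as usual.
means = groupby_existing(df, "a")["d"].mean()

print(raised)
print(means)
```

The explicit membership test keeps the failure close to the typo instead of letting it surface later as an unrelated error.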

jreback · 2015-11-24T13:24:25Z

Hmm, that does look like a bug. I agree it should give a KeyError (though a bit lower down in the code than where you pointed).

nbonnotte · 2015-11-24T18:52:37Z

Well, this is quite interesting. I've found a fix for the last bug, which does not solve the first problem though. But digging a bit further, I've found another bug:

In [16]: df = pd.DataFrame(columns=['a', 'b', 'c', 'd'],
                       data=[[1, 'b1', 'c1', 3],
                             [1, 'b2', 'c2', 4]])

In [17]: dg = df.pivot_table(index='a', columns=['b', 'c'], values='d').reset_index()

In [18]: dg.drop('a', axis=1)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-18-90595ac9cb8f> in <module>()
----> 1 dg.drop('a', axis=1)

/home/nicolas/Git/pandas/pandas/core/generic.pyc in drop(self, labels, axis, level, inplace, errors)
   1615                 new_axis = axis.drop(labels, level=level, errors=errors)
   1616             else:
-> 1617                 new_axis = axis.drop(labels, errors=errors)
   1618             dropped = self.reindex(**{axis_name: new_axis})
   1619             try:

/home/nicolas/Git/pandas/pandas/core/index.py in drop(self, labels, level, errors)
   5011                 else:
-> 5012                     inds.extend(lrange(loc.start, loc.stop))
   5013             except KeyError:
   5014                 if errors != 'ignore':

AttributeError: 'numpy.ndarray' object has no attribute 'start'

Turns out, this is the AttributeError which is mistakenly displayed as

AttributeError: 'DataFrameGroupBy' object has no attribute '_obj_with_exclusions'

I've not checked yet if there is already an issue for this.
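
One way to sidestep the partial-key lookup is to name the level explicitly when dropping. This is a sketch under the assumption of a recent pandas; it has not been checked against the development version shown in the traceback above.

```python
import pandas as pd

df = pd.DataFrame(
    columns=["a", "b", "c", "d"],
    data=[[1, "b1", "c1", 3], [1, "b2", "c2", 4]],
)
dg = df.pivot_table(index="a", columns=["b", "c"], values="d").reset_index()

# With level=0, 'a' is matched against the first-level labels only,
# instead of being looked up as a partial key in the whole MultiIndex.
dropped = dg.drop("a", axis=1, level=0)

print(list(dropped.columns))
```

Passing the full tuple label, as in the groupby case, is another option worth trying when the level-based form does not fit.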

jreback added the Groupby and Error Reporting (Incorrect or improved errors from pandas) labels Nov 18, 2015

jreback added this to the Next Major Release milestone Nov 18, 2015

nbonnotte mentioned this issue Nov 28, 2015

TST in .drop and .groupby for dataframes with multi-indexed columns #11717

Closed

jreback modified the milestones: 0.18.0, Next Major Release Nov 29, 2015

This was referenced Jan 16, 2016

Unexpected behavior with groupby on single-row dataframe? #11741

Closed

BUG in .groupby for single-row DF #12063

Closed

jreback pushed a commit that referenced this issue Jan 17, 2016

BUG in .groupby for single-row DF, #11741

e9e8598

No KeyError was raised when grouping by a non-existant column Fixes #11741 Xref issue #11640, PR #11717

nbonnotte mentioned this issue Jan 18, 2016

Obscur AttributeError when dropping on a multi-index dataframe #12078

Closed

jreback added this to the 0.18.0 milestone Jan 29, 2016

jreback pushed a commit to jreback/pandas that referenced this issue Jan 29, 2016

TST drop and groupby on dataframes with non-lexsorted multi-index

001dab6

closes pandas-dev#11640 closes pandas-dev#11717

jreback closed this as completed in b291dd6 Jan 29, 2016

nbonnotte mentioned this issue Feb 3, 2016

ERR: better error message on invalid on with multi-index columns #9455

Closed
