BUG-24915 fix unhashble Series.name in df aggregation #24920

lofoyet · 2019-01-25T03:44:19Z

closes Cannot do agg if dataframe has column "name" #24915
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

Before creating a pd.Series and passing {name} as its name argument, check whether {name} is hashable. If it is not, fall back to the default, None.
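
For readers following along, the check described above can be sketched without pandas. This is only an illustration of the idea, not the code in this PR: `is_hashable` here is a small stand-in written for the example, and `name_or_none` is a hypothetical helper name.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds (stand-in written for this sketch)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


def name_or_none(name):
    """Hypothetical helper: keep a hashable name, otherwise fall back to None."""
    return name if is_hashable(name) else None


# A string or tuple is hashable and is kept; a list is not and becomes None.
print(name_or_none("total"))
print(name_or_none((1, 2)))
print(name_or_none([1, 2]))
```

The result of `name_or_none` is what would then be passed as the name argument, so an unhashable value silently becomes None instead of raising.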

lofoyet · 2019-01-25T03:44:48Z

@YaoquanYe @daisysu0310

lofoyet · 2019-01-25T03:45:44Z

not sure about what tests to add

lofoyet · 2019-01-25T03:46:42Z

@youhealthy and I are the same person

codecov · 2019-01-25T04:20:00Z

Codecov Report

Merging #24920 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #24920      +/-   ##
==========================================
+ Coverage   92.38%   92.38%   +<.01%     
==========================================
  Files         166      166              
  Lines       52404    52406       +2     
==========================================
+ Hits        48412    48414       +2     
  Misses       3992     3992

Flag	Coverage Δ
#multiple	`90.8% <100%> (ø)`	⬆️
#single	`42.9% <0%> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/base.py	`97.76% <100%> (+0.01%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 539c54f...b270371. Read the comment docs.

mroeschke

Needs:

Whatsnew entry for 0.24.1
Test: Add a test based on the example in the original issue. Construct the DataFrame/Series that replicates the issue and the expected DataFrame/Series

jreback

This needs a test. I am also not sure this is actually the correct fix. Construction of a Series from scalars with an invalid name will raise now. IOW this should be raising already, so I'm not sure this is a good remedy.

In [1]: Series([1,2],name=[1,2])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-360525e04c11> in <module>
----> 1 Series([1,2],name=[1,2])

~/pandas/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    266         generic.NDFrame.__init__(self, data, fastpath=True)
    267 
--> 268         self.name = name
    269         self._set_axis(0, index, fastpath=True)
    270 

~/pandas/pandas/core/generic.py in __setattr__(self, name, value)
   5077             object.__setattr__(self, name, value)
   5078         elif name in self._metadata:
-> 5079             object.__setattr__(self, name, value)
   5080         else:
   5081             try:

~/pandas/pandas/core/series.py in name(self, value)
    400     def name(self, value):
    401         if value is not None and not is_hashable(value):
--> 402             raise TypeError('Series.name must be a hashable type')
    403         object.__setattr__(self, '_name', value)
    404 

TypeError: Series.name must be a hashable type
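
The setter behaviour visible in the traceback can be reproduced in isolation. The sketch below is a dependency-free stand-in, not pandas code: `NamedValues` and `try_build` are hypothetical names, and the class only mirrors the rule "None or any hashable value is accepted, anything else raises TypeError".

```python
class NamedValues:
    """Hypothetical stand-in for an object with a validated `name` attribute."""

    def __init__(self, values, name=None):
        self.values = list(values)
        self.name = name  # goes through the property setter below

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        # Same rule as the setter in the traceback: None or a hashable value.
        if value is not None:
            try:
                hash(value)
            except TypeError:
                raise TypeError("name must be a hashable type") from None
        self._name = value


def try_build(name):
    """Return the error message if construction fails, else None."""
    try:
        NamedValues([1, 2], name=name)
    except TypeError as exc:
        return str(exc)
    return None


print(try_build("a"))      # accepted, no error message
print(try_build([1, 2]))   # rejected, the error message is returned
```

Because the constructor assigns through the setter, an unhashable name fails at construction time, which is the point being made here: the error already surfaces before any aggregation logic runs.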

lofoyet · 2019-01-28T17:45:45Z

@jreback I agree with you. You can refer to #24915 for the problem.

There are 2 ways that I have in mind to fix this:

1. This PR.
2. Change how __getattr__ works.

For option 2, what I looked at is:

# pandas/core/generic.py (excerpt from __getattr__)
if (name in self._internal_names_set or name in self._metadata or
        name in self._accessors):
    return object.__getattribute__(self, name)
else:
    if self._info_axis._can_hold_identifiers_and_holds_name(name):
        return self[name]
    return object.__getattribute__(self, name)

getattr(self, "name") goes to the else clause, and if "name" is a column, it returns from the inner if. But df.name is really an attribute rather than a column name, so it should skip that inner "if" and go to object.__getattribute__().

How do you propose to change it? @jreback
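
To make the lookup path concrete, here is a dependency-free sketch of attribute access falling back to column lookup. `TinyFrame` is a hypothetical class written for this example and is far simpler than the real pandas logic; it only shows why `getattr(obj, "name", None)` can hand back column data instead of None.

```python
class TinyFrame:
    """Hypothetical container whose columns are reachable as attributes."""

    def __init__(self, columns):
        self._columns = dict(columns)

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal attribute lookup fails.
        columns = self.__dict__.get("_columns", {})
        if name in columns:
            return columns[name]
        raise AttributeError(name)


with_name = TinyFrame({"name": ["a", "b"], "value": [1, 2]})
without_name = TinyFrame({"value": [1, 2]})

# With a "name" column, the lookup returns the column data (a list here).
print(getattr(with_name, "name", None))
# Without one, AttributeError is swallowed and the default None comes back.
print(getattr(without_name, "name", None))
```

In this sketch the value returned for the "name" column is a list, which is unhashable, so using it as a name downstream would trip a hashability check.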

lofoyet · 2019-02-08T00:41:37Z

@jreback any idea?

jreback · 2019-02-08T02:26:40Z

@lofoyet first thing you need here is a test. Then step through it until you get to the source of the error, and see if your patch fixes it.

I would rather shy away from changing the way getattr works. Also I am not sure your change is the right way.

WillAyd · 2019-02-28T00:34:12Z

Closing as stale. @lofoyet ping if you'd like to continue, though as mentioned above this needs a test first and foremost.

if df name is not hashable then use None in agg

b270371

mroeschke requested changes Jan 25, 2019

jreback requested changes Jan 26, 2019

jreback added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label Jan 26, 2019

WillAyd closed this Feb 28, 2019
