Validate that 'name' attribute is set only if hashable #9193

dr-leo · 2015-01-03T15:50:34Z

addresses part of issue #8263.

shoyer · 2015-01-04T19:51:23Z

a few points:

this needs a test to verify that it works.
the cleaner way to do this, rather than checking all attribute setting on every data structure, would be to make name a property on Series objects (the only type that currently has names in pandas). Something like:

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        try:
            hash(value)
        except TypeError:
            raise TypeError('name must be hashable')
        self._name = value

dr-leo · 2015-01-04T23:04:47Z

Thanks. Makes sense to me.

What next? Shall I try and make another commit on this branch while the
PR remains open, or do you want to reject it in which case (i) I could
try on a new branch and a new PR, or (ii) you could do it yourself?

jreback · 2015-01-04T23:58:19Z

@dr-leo just make a new commit on this PR

jorisvandenbossche · 2015-01-05T22:44:08Z

A pd.core.common.is_hashable() function was recently introduced in this PR: #8929. Maybe that can be used here to do the hash check.

dr-leo · 2015-01-12T20:40:20Z

@joris: sorry, don't know how to use code from another PR in my own.

@ALL: I've just committed what will hopefully do part of the trick. I've
added a name property in core.series.py as Stephan suggested. I've also
added a test for this in test_series.py. I couldn't find a good place
for it so I put it behind the test_constructor_map case.

The test suite produces some failures: in test_series, line 1999, e.g.,
Series.name is set to a list type. That's mean, so I've made it a
string. This does not break that test, which is about repr.

Still there are 13 failures and one error out of 8027 tests. At least
one failure relates to Series.name. Unfortunately I am fairly unfamiliar
with nose and unittest. Before attempting to fix these failures I
thought I'd show you what I've done so far.

Please let me know your views on the test results and what it would
take to accept the PR.

As this is my very first PR for a serious project, I am somewhat baffled
about how much work it takes to put together good software :-O.

Anyway, so far it has been fun to work on this and I've learnt quite a bit.

Thanks.

Leo

shoyer · 2015-01-12T20:43:59Z

pandas/core/series.py

+    def name(self, value):
+        try:
+            hash(value)
+        except TypeError:


You could replace this with just if not com.is_hashable(value):.
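A minimal sketch of what that replacement could look like. The is_hashable function below is a standalone stand-in written for this example (an assumption, not the pandas implementation), and NamedThing is a hypothetical class used in place of Series:

```python
def is_hashable(obj):
    """Stand-in helper: True if hash(obj) succeeds (not the pandas code)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


class NamedThing(object):
    """Hypothetical stand-in for Series, only to show the setter."""

    def __init__(self, name=None):
        self.name = name  # goes through the property setter below

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        # Reject unhashable names up front; leave the old name untouched.
        if not is_hashable(value):
            raise TypeError('name must be hashable')
        self._name = value
```

With this shape the setter body stays at two statements, and the hashability rule lives in one helper that other code paths could reuse.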

shoyer · 2015-01-12T20:44:39Z

@dr-leo The other PR is already merged, so you can just use the function it introduced directly.

Trust me, this gets easier with practice :)

shoyer · 2015-01-12T20:47:44Z

I'm pretty sure the test failures you're getting are because None (the default value for name) is not hashable. We definitely want None to remain a valid name, so you'll need to add a special case for that.

shoyer · 2015-01-12T20:48:46Z

Actually, that theory is wrong. None actually is hashable.
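A quick way to check this kind of claim, as a small sketch: hash_fails is a hypothetical helper introduced only for this example.

```python
def hash_fails(obj):
    """Return True if hash(obj) raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return True
    return False


# None, strings and floats hash fine; mutable containers such as
# list and dict do not.
results = {repr(obj): hash_fails(obj) for obj in (None, 'a', 1.5, [], {})}
```

So a hash-based check accepts the default name None without any special case.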

shoyer · 2015-01-12T20:55:51Z

OK, it looks like the trouble here is related to some strange business pandas does with overwriting __setattr__. It looks like replacing self._name = value with object.__setattr__(self, '_name', value) should do the trick.
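A rough sketch of why bypassing the override can help. Labeled is a hypothetical class whose __setattr__ reroutes ordinary assignments into a dict; this is an assumption made for illustration and not how pandas implements __setattr__:

```python
class Labeled(object):
    """Hypothetical class with a custom __setattr__ and a name property."""

    def __init__(self, name=None):
        # Write directly, since our own __setattr__ needs _items to exist.
        object.__setattr__(self, '_items', {})
        self.name = name

    def __setattr__(self, key, value):
        if isinstance(getattr(type(self), key, None), property):
            # Properties defined on the class use the default machinery,
            # which calls the property setter.
            object.__setattr__(self, key, value)
        else:
            # Everything else is captured as an item, not an attribute.
            self._items[key] = value

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        try:
            hash(value)
        except TypeError:
            raise TypeError('name must be hashable')
        # A plain "self._name = value" would be rerouted into _items by
        # __setattr__ above, so write the attribute directly instead.
        object.__setattr__(self, '_name', value)
```

In this toy class, a plain self._name = value inside the setter would leave _name unset as a real attribute, and the getter would then fail with AttributeError.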

jreback · 2015-01-12T23:55:07Z

needs a test to see if this survives pickling

dr-leo · 2015-01-13T09:10:23Z

Got you. Should all be doable.

dr-leo · 2015-01-15T19:17:49Z

On pickling: There is already a test case in test_series @411:

    def test_pickle_preserve_name(self):
        unpickled = self._pickle_roundtrip_name(self.ts)
        self.assertEqual(unpickled.name, self.ts.name)

So I don't see any need for another.

On common.is_hashable: It returns False for NP.float64 (see below). This
breaks a couple of tests. I suppose this is a bug in is_hashable. If
not, we cannot use it to check Series.name for hashability. We allow
float64 for index labels after all.

In [3]: import numpy as NP

In [5]: f=NP.float64(3.14)

In [8]: hash(f)
Out[8]: 1846836513

In [9]: import pandas

In [11]: from pandas.core.common import is_hashable

In [12]: is_hashable(f)
Out[12]: False

shoyer · 2015-01-15T22:34:57Z

@dr-leo what version of numpy are you running? I'm seeing a different result on numpy 1.9.1

dr-leo · 2015-01-16T07:30:32Z

numpy 1.9.1, Python 3.4 (32-bit), Windows 7 (64-bit).

shoyer · 2015-01-16T20:39:19Z

@dr-leo It's only a problem with Python 3. Just made a new issue: #9276

shoyer · 2015-02-17T02:23:45Z

@dr-leo can you rebase on master and give this another try? We just fixed the is_hashable bug in #9473.

dr-leo · 2015-02-17T06:26:40Z

Great!

However, I am unfamiliar with rebase. To make things worse, I work with
Mercurial using hg-git. It does have a rebase extension but fiddling
with history is not one of my passions.

My hope was that you could simply merge my little PR branch into
master... That said, if you give me a hint I could try to help.

Leo

jreback · 2015-02-17T06:46:34Z

pandas/core/series.py

+
+    @name.setter
+    def name(self, value):
+        if is_hashable(value):


do an if value is not None check here first, as None is the most common case

jreback · 2015-02-17T06:47:32Z

needs a perf check

jreback · 2015-02-17T07:03:09Z

pandas/tests/test_series.py

+
+    def test_constructor_unhashable_name(self):
+        def set_to_unhashable(s_):
+            s_.name = {}


needs a test on the constructor as well
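A sketch of what covering both paths could look like. Labeled is a hypothetical stand-in for Series (its constructor routes name through the validating setter, which is an assumption about the eventual design), and raises_type_error is a helper invented for this example:

```python
class Labeled(object):
    """Hypothetical stand-in for Series with a validated name."""

    def __init__(self, name=None):
        self.name = name  # the constructor reuses the setter's check

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        try:
            hash(value)
        except TypeError:
            raise TypeError('name must be hashable')
        self._name = value


def raises_type_error(func):
    """Return True if calling func() raises TypeError."""
    try:
        func()
    except TypeError:
        return True
    return False


def test_setter_unhashable_name():
    obj = Labeled()

    def set_to_unhashable():
        obj.name = {}

    assert raises_type_error(set_to_unhashable)


def test_constructor_unhashable_name():
    assert raises_type_error(lambda: Labeled(name={}))
    # Hashable names, including the default None, must still be accepted.
    assert Labeled().name is None
    assert Labeled(name=('a', 1)).name == ('a', 1)
```

Testing the constructor separately matters because a constructor could assign the private attribute directly and skip the setter's validation.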

jreback · 2015-05-09T15:59:50Z

closing; pls reopen if/when updated

Validate that 'name' attribute is set only if hashable. Addresses first part of issue #8263

4ca0a33

dr-leo mentioned this pull request Jan 3, 2015

ENH: set multi-index names as NamedTuples #8263

Closed

jreback added the Compat pandas objects compatability with Numpy or Python functions label Jan 4, 2015

property to validate that Series.name is hashable, add testcase for this

7fdf290

shoyer reviewed Jan 12, 2015

use is_hashable. Some unittests are broken as float64 is deemed unhashable.

c244273

shoyer mentioned this pull request Jan 16, 2015

BUG: common.is_hashable returns False for np.float64 on Python 3 #9276

Closed

jreback reviewed Feb 17, 2015

jreback closed this May 9, 2015

jorisvandenbossche added the Closed PR label May 14, 2015

This was referenced Mar 13, 2016

Bug: Pandas Series name attribute can be array #12610

Closed

BUG: ensure Series.name is hashable, #12610 #12612

Closed
