DOC: Correct uniqueness of index for Series #14344

Merged
merged 6 commits into from
Nov 25, 2016

@themrmax themrmax commented Oct 4, 2016

closes #7808

Just wanted to fix the docstring to reflect the fact that the index labels neither need to be unique nor hashable.

@chris-b1
Contributor

chris-b1 commented Oct 4, 2016

It is true that labels no longer need to be unique, but they still must be hashable?

@themrmax
Author

themrmax commented Oct 4, 2016

No, they don't currently need to be hashable:

In [5]: s = pd.Series([1, 2, 3], index=([1], [1], [1]))

In [6]: s
Out[6]:
[1]    1
[1]    2
[1]    3
dtype: int64

Do you think we should add a check for hashable?

@jreback
Contributor

jreback commented Oct 4, 2016

Try to index that and see what happens.

@themrmax
Author

themrmax commented Oct 5, 2016

@jreback lol, I don't know, maybe some people use Series without ever indexing them? Otherwise, do you think we should add a check for hashable values? Or maybe edit the docs to say "for the object to support indexing, the labels must be of a hashable type"?

@jreback
Contributor

jreback commented Oct 5, 2016

In [4]: s = pd.Series([1, 2, 3], index=([1], [2], [3]))

In [5]: s.loc[[1]]
TypeError: unhashable type: 'list'

In [7]: s.index.inferred_type
Out[7]: 'mixed'

So we do actually check for hashability, but the check is fairly lazy, because doing it eagerly is somewhat too expensive. The point is that this is non-idiomatic, and though it may work in parts of pandas, it is neither supported nor recommended in any way.

So if you'd like to update the docstring with a stronger warning, that would be great.
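
To make the cost of an eager check concrete: verifying hashability up front means calling `hash()` on every label once. The following is only a minimal sketch in plain Python. The helper `unhashable_labels` is hypothetical, is not part of pandas, and is not how pandas performs its own (lazy) check.

```python
def unhashable_labels(labels):
    """Return the labels that cannot be hashed.

    Hypothetical helper for illustration only; not a pandas API.
    """
    bad = []
    for label in labels:
        try:
            hash(label)  # raises TypeError for lists, dicts, sets, ...
        except TypeError:
            bad.append(label)
    return bad


# Lists are unhashable, so every label from the example above is flagged.
print(unhashable_labels([[1], [2], [3]]))   # [[1], [2], [3]]

# Strings, ints and tuples of hashable items are fine.
print(unhashable_labels(["a", 1, (1, 2)]))  # []
```

Calling `hash()` rather than testing `isinstance(label, collections.abc.Hashable)` also catches cases such as a tuple that contains a list, which passes the `isinstance` test but still fails to hash.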

@jorisvandenbossche
Member

@themrmax Can you update this? (leave out the hashable change for now, but you can keep the change regarding uniqueness)

RangeIndex(len(data)) if not provided. If both a dict and index
sequence are used, the index will override the keys found in the
dict.
Values must be hashable and the same length as data. Will default to
Contributor

spacing here

Author

Ah, sorry about that, I just learned how to use textwidth in vim.

@jreback jreback added this to the 0.19.2 milestone Nov 23, 2016
@codecov-io

codecov-io commented Nov 24, 2016

Current coverage is 85.21% (diff: 100%)

Merging #14344 into master will decrease coverage by 0.04%

@@             master     #14344   diff @@
==========================================
  Files           140        143     +3   
  Lines         50632      50800   +168   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43170      43289   +119   
- Misses         7462       7511    +49   
  Partials          0          0          

Powered by Codecov. Last update 96b364a...6462ef5

sequence are used, the index will override the keys found in the
dict.
Values must be hashable and the same length as data. Will default to
RangeIndex(len(data)) if not provided. If both a dict and index sequence
Member

This line is too long, which is why Travis is failing: we test for PEP8 compliance (line length < 80).

Author

Ah, sorry about this, I got my < confused with my ≤. I'll set my linewidth to 79.
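
For reference, the "< 80" rule means a line may hold at most 79 characters. A rough sketch of that check in plain Python follows; the helper name `overlong_lines` is made up for illustration, and this is not the linter the pandas CI actually runs.

```python
MAX_LINE_LENGTH = 79  # "line length < 80" is the same as "at most 79"


def overlong_lines(text, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) pairs for lines longer than `limit`.

    Hypothetical helper for illustration only.
    """
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]


# A 79-character line passes; an 80-character line is reported.
sample = "x" * 79 + "\n" + "y" * 80
print(overlong_lines(sample))  # [(2, 80)]
```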

@jorisvandenbossche jorisvandenbossche modified the milestones: 0.20.0, 0.19.2 Nov 24, 2016
@jorisvandenbossche jorisvandenbossche changed the title from "Update documentation for Series to remove spec for unique and hashable labels" to "DOC: Correct uniqueness of index for Series" Nov 25, 2016
@jorisvandenbossche jorisvandenbossche merged commit 6ad6e4e into pandas-dev:master Nov 25, 2016
@jorisvandenbossche
Member

@themrmax Thanks!
(I added one sentence from another PR for the same issue ("Non-unique index values are allowed") before merging)
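
A small sketch of what "Non-unique index values are allowed" means in practice, assuming pandas is installed and using hashable (string) labels. The variable names are illustrative only.

```python
import pandas as pd

# Two rows share the label "a"; construction succeeds without complaint.
s = pd.Series([1, 2, 3], index=["a", "a", "b"])

print(s.index.is_unique)    # False

# Selecting a duplicated label returns every matching row as a Series.
print(s.loc["a"].tolist())  # [1, 2]

# Selecting a label that occurs once returns a scalar.
print(s.loc["b"])           # 3
```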

@themrmax
Author

Cool, thanks @jorisvandenbossche, yes, that's clearer now!

Successfully merging this pull request may close these issues.

Conflicting documentation about index uniqueness
5 participants