BUG: setting new items with .loc doesn't work for constant sequences #7787

Closed
code-of-kpp opened this issue Jul 18, 2014 · 11 comments
Labels
API Design Indexing Related to indexing on series/frames, not to indexes themselves

Comments

@code-of-kpp

One can construct a DataFrame with frozensets in the index:

 >>> import pandas
 >>> a = frozenset([1])
 >>> b = frozenset([1, 2])
 >>> c = frozenset([1, 2, 3])
 >>> 
 >>> frame = pandas.DataFrame(index=[a, b], columns=[c])
 >>> frame
        (1, 2, 3)
 (1)          NaN
 (1, 2)       NaN

Setting new values for existing items works:

 >>> frame.loc[a] = 0
 >>> frame
        (1, 2, 3)
 (1)            0
 (1, 2)       NaN
 >>> frame[c] = 1
 >>> frame
         (1, 2, 3)
 (1)             1
 (1, 2)          1
 >>> frame[frozenset([4])] = 1
 >>> frame
         (1, 2, 3)  (4)
 (1)             1    1
 (1, 2)          1    1

But new items cannot be created:

 >>> frame.loc[frozenset([3])] = 1
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "venv/lib/python2.7/site-packages/pandas/core/indexing.py", line 118, in __setitem__
     indexer = self._convert_to_indexer(key, is_setter=True)
   File "venv/lib/python2.7/site-packages/pandas/core/indexing.py", line 1085, in _convert_to_indexer
     raise KeyError('%s not in index' % objarr[mask])
 KeyError: '[3] not in index'
@jreback
Contributor

jreback commented Jul 18, 2014

What are you using this for? A frozenset is really an iterable, yes? (But it doesn't have `__getitem__`.) So I'm not really sure what you would do with this. pandas has a lot of checks for iterable types (list/tuple/ndarray, etc.) in the indexing routines.
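To make the "iterable, but no getitem" point concrete, here is a small standard-library sketch. It only shows how a frozenset answers the usual abstract-base-class checks; it does not reproduce the checks pandas itself performs, which are assumed here to be broadly similar "is this list-like?" tests. That assumption would be consistent with the `KeyError: '[3] not in index'` above, where the key appears to have been unpacked into its elements.

```python
from collections.abc import Hashable, Iterable, Sequence

key = frozenset([1, 2])

# Hashable, so it can serve as a dict key (and as a label inside an index).
is_hashable = isinstance(key, Hashable)

# Iterable, which is what generic "list-like" checks tend to look for.
is_iterable = isinstance(key, Iterable)

# Not a Sequence: there is no positional access through __getitem__.
is_sequence = isinstance(key, Sequence)
has_getitem = hasattr(key, "__getitem__")

print(is_hashable, is_iterable, is_sequence, has_getitem)
```

So the same object looks like a single label to code that asks "is it hashable?" and like a collection of labels to code that asks "is it iterable?".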

@code-of-kpp
Author

I have a very long sequence and a function that produces a dict with several keys and values (features) for samples from that sequence.

[1, 2, 3 .... 100000000000]

func([1, 2, 3]) == {'a': 0.5, 'b': 0.6}
func([1, 3, 2]) == {'a': 0.5, 'b': 0.6}  # last two are equal

func([1, 3, 5, 9]) == {'a': 0.5, 'b': 0.7}
...

I wanted to make a DataFrame with 100 samples in rows and two columns ('a', 'b').
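A minimal sketch of that use case with plain dictionaries, before any DataFrame is involved. The body of `func` below is a hypothetical stand-in (the real feature function is not shown in this thread); the only property assumed is that it ignores order and duplicates, so samples with the same set of values collapse onto one frozenset key.

```python
def func(sample):
    # Hypothetical stand-in for the real feature function: it only looks at
    # the set of values, so order and duplicates do not change the result.
    values = set(sample)
    return {"a": len(values), "b": max(values)}


samples = [[1, 2, 3], [1, 3, 2], [1, 3, 5, 9]]

# One row of features per distinct set of values.
rows = {}
for sample in samples:
    key = frozenset(sample)
    if key not in rows:
        rows[key] = func(sample)

print(len(rows))
```

This collects the rows up front and does not touch `.loc`, so it sidesteps the enlargement problem rather than fixing it.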

@jreback
Contributor

jreback commented Jul 18, 2014

You can do this with tuples (as values in an index).

@code-of-kpp
Author

I have to consider (1, 2, 3), (1, 1, 2, 3) and (1, 3, 2) equal.

I solved this by encoding the keys (marshal.dumps and marshal.loads).

I'm just pointing out an inconsistency in the API.
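A rough sketch of what such an encoding workaround could look like. The helper names `encode_key` and `decode_key` are hypothetical, and the normalization step (sorted unique values) is an assumption about how (1, 2, 3), (1, 1, 2, 3) and (1, 3, 2) were made equal; the thread does not show the actual code. Marshal format version 2 is requested explicitly because it predates the object-instancing feature added in version 3, which keeps the bytes dependent on the values only.

```python
import marshal


def encode_key(sample):
    """Hypothetical helper: bytes key that ignores order and duplicates."""
    canonical = sorted(set(sample))  # assumes the values are orderable
    return marshal.dumps(canonical, 2)


def decode_key(key):
    """Hypothetical helper: recover the set of values from an encoded key."""
    return frozenset(marshal.loads(key))


k1 = encode_key((1, 2, 3))
k2 = encode_key((1, 1, 2, 3))
k3 = encode_key((1, 3, 2))

print(k1 == k2 == k3)
print(decode_key(k1) == frozenset([1, 2, 3]))
```

The resulting keys are plain `bytes` objects, which are hashable and not treated as a collection of labels, at the cost of an unreadable index.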

@code-of-kpp
Author

Tuples have a special meaning in .loc

@jreback
Contributor

jreback commented Jul 18, 2014

You would need a custom index type for this, then, and maybe some modifications to the indexing code. If you'd like to submit a pull request, that would be fine.

jreback added this to the Someday milestone Jul 18, 2014
@jreback
Contributor

jreback commented Jul 18, 2014

The indexing API does not make it clear that this is supported at all (in fact, IIRC, only int/label/tuple/list-like/slice are supported).

@code-of-kpp
Author

I think a warning should be raised when someone uses a value of another type in the index.
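One way such a warning could be sketched, purely as an illustration: `check_index_labels` is a hypothetical helper and not part of pandas, and the rule it applies (warn on labels that are hashable and iterable but not strings, bytes, or tuples) is an assumption about which types count as "another type" here.

```python
import warnings
from collections.abc import Hashable, Iterable


def check_index_labels(labels):
    """Hypothetical helper, not a pandas API: warn about ambiguous labels."""
    for label in labels:
        # Hashable *and* iterable values (e.g. frozenset) can be read either
        # as one label or as a collection of labels.
        ambiguous = (
            isinstance(label, Hashable)
            and isinstance(label, Iterable)
            and not isinstance(label, (str, bytes, tuple))
        )
        if ambiguous:
            warnings.warn(
                f"index label {label!r} of type {type(label).__name__} "
                "may be treated as a list-like by indexers",
                UserWarning,
                stacklevel=2,
            )
    return labels


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_index_labels([frozenset([1]), "a", 3, (1, 2)])

print(len(caught))
```

A per-label Python loop like this costs time proportional to the index length, which is relevant to any concern about speed.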

@jreback
Contributor

jreback commented Jul 18, 2014

If you can do it without sacrificing speed, and with appropriate tests, then we would take it. A frozenset, as I said, is a hybrid list-like, not very friendly to the indexing code.

@jreback
Contributor

jreback commented Jul 6, 2018

this is a very thorny problem and basically breaks indexing.

@mroeschke
Member

I believe we are moving away from making the indexing code more permissive to indexing types such as frozenset. Going to close as a "won't fix", but happy to reopen if there's more interest from the community.
