REGR: assigning scalar with a length no longer works #26333

Open
jorisvandenbossche opened this issue May 10, 2019 · 10 comments
Labels
Bug, Indexing, Nested Data, Regression

Comments

@jorisvandenbossche
Member

Assigning a value to a single location in a DataFrame (using .loc with scalar indexers) started to fail for values that have a length.

Consider the following example:

In [1]: df = pd.DataFrame({'a': [1, 2, 3], 'b': [(1, 2), (1, 2, 3), (3, 4)]})

In [2]: df
Out[2]: 
   a          b
0  1     (1, 2)
1  2  (1, 2, 3)
2  3     (3, 4)

In [3]: df.loc[0, 'b'] = (7, 8, 9)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-a2d59e11519a> in <module>
----> 1 df.loc[0, 'b'] = (7, 8, 9)

~/miniconda3/lib/python3.7/site-packages/pandas/core/indexing.py in __setitem__(self, key, value)
    187             key = com._apply_if_callable(key, self.obj)
    188         indexer = self._get_setitem_indexer(key)
--> 189         self._setitem_with_indexer(indexer, value)
    190 
    191     def _validate_key(self, key, axis):

~/miniconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value)
    604 
    605                     if len(labels) != len(value):
--> 606                         raise ValueError('Must have equal len keys and value '
    607                                          'when setting with an iterable')
    608 

ValueError: Must have equal len keys and value when setting with an iterable

This raises on 0.23.4 - master, but worked in 0.20.3 - 0.22.0 (the ones I tested):

In [1]: pd.__version__
Out[1]: '0.22.0'

In [2]: df = pd.DataFrame({'a': [1, 2, 3], 'b': [(1, 2), (1, 2, 3), (3, 4)]})

In [3]: df
Out[3]: 
   a          b
0  1     (1, 2)
1  2  (1, 2, 3)
2  3     (3, 4)

In [4]: df.loc[0, 'b'] = (7, 8, 9)

In [5]: df
Out[5]: 
   a          b
0  1  (7, 8, 9)
1  2  (1, 2, 3)
2  3     (3, 4)

Related to #25806

We don't have very robust support in general for list-like values, but, for the specific case above of updating a single value, I don't think there is anything ambiguous about it? You are updating a single value, so the passed value should simply be put in that place?

Note that the above uses tuples, but there are also custom objects, such as MultiPolygons, that represent a single object yet define a __len__.
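To make the proposed rule concrete, here is a minimal pure-Python sketch. It is not pandas code: `MultiShape`, `is_scalar_key` and `set_cell` are hypothetical names invented for illustration, and the "table" is just a dict of lists. It only shows the idea that the length comparison is meaningful when several cells are targeted, and can be skipped when both indexers are scalar.

```python
class MultiShape:
    """Hypothetical stand-in for a single object that also defines __len__."""

    def __init__(self, parts):
        self.parts = list(parts)

    def __len__(self):
        return len(self.parts)


def is_scalar_key(key):
    # Illustrative rule only: ints and strings count as scalar labels here.
    return isinstance(key, (int, str))


def set_cell(table, row, col, value):
    """Hypothetical helper: table maps column name -> list of cell values."""
    if is_scalar_key(row) and is_scalar_key(col):
        # Exactly one target cell, so the value is stored as-is,
        # regardless of whether it happens to have a length.
        table[col][row] = value
        return
    # Several target rows: only here is a length comparison meaningful.
    rows = list(row)
    if not isinstance(value, (list, tuple)) or len(rows) != len(value):
        raise ValueError(
            "Must have equal len keys and value when setting with an iterable"
        )
    for r, v in zip(rows, value):
        table[col][r] = v


table = {"a": [1, 2, 3], "b": [(1, 2), (1, 2, 3), (3, 4)]}
set_cell(table, 0, "b", (7, 8, 9))   # tuple stored whole in one cell
shape = MultiShape(["p1", "p2"])
set_cell(table, 1, "b", shape)       # sized custom object stored whole
```

With scalar row and column keys the value never reaches the length check, which is the behaviour argued for above.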

@jorisvandenbossche added the Indexing label May 10, 2019
@jorisvandenbossche
Member Author

There are several potential issues with assigning list-like values; see e.g. the issues linked from here: #19590 (comment)
However, most of those cases occur when assigning to multiple elements at once (where it is unclear whether the list should be unpacked), while here both arguments to loc are scalars.

Probably caused by #20732
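For the tuple example from the top of this issue, a possible way to express "both indexers are scalars" is the scalar accessor .at. The following is a sketch under assumptions: it uses a plain object-dtype column, it has not been checked against every pandas version, and it may not carry over to extension-array columns such as a GeoDataFrame geometry column.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [(1, 2), (1, 2, 3), (3, 4)]})

# .at addresses exactly one cell; for an object-dtype column the tuple
# is expected to be stored as a single value (assumption, see lead-in).
df.at[0, "b"] = (7, 8, 9)

print(df)
```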

@elfmanryan

I am getting the same issue, as the MultiPolygon is seen as equivalent to a list of polygons.
My workaround (to assign a MultiPolygon to a single row) is to wrap it in a list first.
I am running a try/except to catch the error, something like this:

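from shapely.geometry import MultiPolygon

# multipolygon is assumed to be an existing MultiPolygon instance
try:
    geopandas_dataframe_one_row['geometry'] = multipolygon
except ValueError:
    # wrap it in a list so it is a one-element sequence matching the one row
    geopandas_dataframe_one_row['geometry'] = [multipolygon]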

Hope that's helpful to someone.

@jorisvandenbossche added the Regression label Oct 31, 2019
@jklatt

jklatt commented Dec 4, 2019

Hi! Any solution in sight? I am also encountering the issue when trying to assign MultiPolygons, and neither the list workaround nor the ".values" workaround seems to help... Thankful for any hint!

@koshy1123

@jklatt I was able to resolve this by roundtripping a shapely geometry through geopandas:

gdf.loc[scalar_index_loc, 'geometry'] = geopandas.GeoDataFrame(geometry=[shapely_geo]).geometry.values

This seems to avoid the ValueError

@jklatt

jklatt commented Dec 5, 2019

Works like a charm! Thank you Thomas :)

@elizabethswkim

Thanks, @koshy1123 ! Your roundtripping shapely geo via geopandas worked for me!

@bramson

bramson commented Nov 4, 2021

I've tried various versions of the workaround for this problem, but I still can't get anything to work.

What should work:

someData.at[index,'geometry'] = thisGeom

Some workarounds attempted:

someData.loc[index, 'geometry'] = geopandas.GeoDataFrame(geometry=[thisGeom]).geometry.values
someData.loc[index, 'geometry'] = geopandas.GeoDataFrame(geometry=[thisGeom]).geometry.values[0]
someData.loc[index, 'geometry'] = [geopandas.GeoDataFrame(geometry=[thisGeom]).geometry.values[0]]

someData.loc[[index],'geometry'] = geopandas.GeoSeries([thisGeom]).values

All give

ValueError: Must have equal len keys and value when setting with an iterable

or

ValueError: Must have equal len keys and value when setting with an ndarray

Is there any updated solution or working workaround?

@bennlich

bennlich commented Jul 1, 2022

@bramson I think the one that worked for me is not in your list:

someData.loc[[index], 'geometry'] = geopandas.GeoDataFrame(geometry=[thisGeom]).geometry.values

😞

@hadim

hadim commented Jul 22, 2022

I am having the same issue as well. So far the only solution for me was to build a complex .apply(lambda x: xxx) based hack as a workaround.

@oboklob

oboklob commented Dec 5, 2022

I found that if the geopandas DataFrame contains only a geometry column, the problem goes away. As soon as an additional column has values, the above issue arises.

So my solution was to split the data into two DataFrames while populating the geometry: one with only the geometry and a second with the additional data (using the same indexes), and then to combine the two DataFrames afterwards. Not ideal, but a workaround if you are struggling.
