NaN is converted to strings when reassigning a column with .loc #28403

Closed
remidomingues opened this issue Sep 12, 2019 · 4 comments · Fixed by #45695
Labels
Dtype Conversions (unexpected or buggy dtype conversions); Indexing (related to indexing on series/frames, not to indexes themselves); Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate); Needs Tests (unit test(s) needed to prevent regressions)
Comments

@remidomingues

Hello team,

Here is a simple working example of value assignment on a DataFrame:

df = pd.DataFrame({'A': [np.nan, np.nan, 'b']})
df['A'].loc[[0, 1]] = ['a', np.nan]
df.notna()
Out[126]: 
       A
0   True
1  False
2   True

However, if we use loc to assign the entire column with an array including NaNs, all values are converted to strings, breaking notna():

df = pd.DataFrame({'A': [np.nan, np.nan, 'b']})
df['A'].loc[[0, 1, 2]] = ['a', np.nan, np.nan]
df['A'].notna()
Out[133]: 
0    True
1    True
2    True
Name: A, dtype: bool
@TomAugspurger
Contributor

Thanks for the report, probably a bug.

Note that this follows the NumPy behavior:

In [23]: np.array(['a', np.nan])
Out[23]: array(['a', 'nan'], dtype='<U3')

I suspect that when df['A'] is presently an object-dtype column, we don't convert the incoming list-like with the right sanitizer.
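The difference between the two conversions can be seen in isolation. The snippet below is only a minimal sketch (the variable names are illustrative, not taken from pandas internals): default NumPy inference turns the NaN into the text 'nan', while an explicit object dtype keeps it as a missing value.

```python
import numpy as np
import pandas as pd

values = ['a', np.nan]

# Default inference: NumPy picks a unicode dtype, so NaN becomes the text 'nan'.
coerced = np.array(values)

# An explicit object dtype stores the original Python objects, NaN included.
preserved = np.array(values, dtype=object)

print(coerced.tolist())    # ['a', 'nan']
print(pd.isna(coerced))    # [False False]
print(pd.isna(preserved))  # [False  True]
```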

Note that this wouldn't be a problem with a proper StringDtype: #27949
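As a rough illustration of that point, assuming a pandas version that ships the "string" extension dtype (1.0 or later), the same assignment on a string-dtype Series should keep the missing entries missing:

```python
import numpy as np
import pandas as pd

# Assumes pandas >= 1.0, where dtype="string" selects the StringDtype extension type.
ser = pd.Series([np.nan, np.nan, 'b'], dtype="string")

# Same assignment as in the report, applied to the Series directly.
ser.loc[[0, 1, 2]] = ['a', np.nan, np.nan]

# Only the first entry is expected to be non-missing.
missing = ser.isna().tolist()
print(missing)
```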

TomAugspurger added the Dtype Conversions, Indexing, Missing-data, and Difficulty Intermediate labels Sep 12, 2019
TomAugspurger added this to the Contributions Welcome milestone Sep 12, 2019
@jreback
Contributor

jreback commented Sep 12, 2019

#28176 might fix this, actually.

@jbrockmendel
Member

Works on master, could use a test.
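A regression test along these lines might look like the sketch below. It is not the test that was eventually merged; the function name is hypothetical, and it assumes a pandas release in which the fix has landed. It assigns through the Series directly rather than through df['A'].loc[...], since chained assignment may not write back to the parent frame when Copy-on-Write is enabled.

```python
import numpy as np
import pandas as pd


def test_loc_setitem_keeps_nan_missing():
    # Hypothetical test name; mirrors the second example from the report.
    ser = pd.Series([np.nan, np.nan, 'b'], dtype=object)
    ser.loc[[0, 1, 2]] = ['a', np.nan, np.nan]

    # NaN must stay a missing value rather than become the string 'nan'.
    expected = pd.Series([True, False, False])
    pd.testing.assert_series_equal(ser.notna(), expected)


test_loc_setitem_keeps_nan_missing()
```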

jbrockmendel added the Needs Tests label Jan 6, 2022
@NumberPiOso
Contributor

take

NumberPiOso added a commit to NumberPiOso/pandas that referenced this issue Jan 29, 2022
NumberPiOso added a commit to NumberPiOso/pandas that referenced this issue Jan 31, 2022
mroeschke modified the milestones: Contributions Welcome, 1.5 Feb 3, 2022
mroeschke pushed a commit that referenced this issue Feb 3, 2022
* TST: Nan must not be converted to string

Closes #28403

* TST: Add test specific to the issue  #28403

* TST: Parametrize multiple inputs change nan loc
phofl pushed a commit to phofl/pandas that referenced this issue Feb 14, 2022
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this issue Jul 13, 2022