When providing dtype=str to the DataFrame constructor, we're inconsistent about coercing values to strings.
When there's no overlap between the keys of data and columns, things are probably OK.
    In [4]: pd.DataFrame(index=[0, 1], columns=[0, 1], dtype=str)
    Out[4]:
         0    1
    0  NaN  NaN
    1  NaN  NaN
(those values are np.nan).
But when there is an overlap between keys of data and columns, the newly introduced values are coerced to strings.
    In [8]: pd.DataFrame({'A': [1, 2]}, index=[0, 1], columns=['A', 'B'], dtype=str)
    Out[8]:
       A    B
    0  1  nan
    1  2  nan
(everything in that dataframe is a string, like "1" or "nan")
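Because the printed output alone does not distinguish `np.nan` from the string "nan", a small check of the Python type of each value can help. The sketch below is only an illustration: `cell_types` is a hypothetical helper, not part of pandas, and the result will depend on the pandas version or branch in use, so no expected output is shown.

```python
import pandas as pd


def cell_types(df):
    """Map each column label to the sorted Python type names of its values.

    Hypothetical helper for this illustration; not a pandas API.
    """
    return {
        col: sorted({type(value).__name__ for value in df[col].tolist()})
        for col in df.columns
    }


# No overlap between the keys of data and columns.
no_overlap = pd.DataFrame(index=[0, 1], columns=[0, 1], dtype=str)

# 'A' comes from data, 'B' is newly introduced by columns.
overlap = pd.DataFrame({'A': [1, 2]}, index=[0, 1], columns=['A', 'B'], dtype=str)

print(pd.__version__)
print(cell_types(no_overlap))
print(cell_types(overlap))
```

A `float` entry for a column means its missing values are still `np.nan`; a column reporting only `str` has had them coerced to strings.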
That's because init_dict relies on arrays_to_mgr to coerce the values to the dtype, and arrays_to_mgr only gets a single dtype.
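As a rough illustration of why a single dtype applied to every array would turn missing values into "nan", here is a toy model written with NumPy only. It is not the pandas implementation: `toy_arrays_to_frame` and its parameters are made up for this sketch, and it assumes that missing columns are filled with NaN before one cast is applied to all columns.

```python
import numpy as np


def toy_arrays_to_frame(data, columns, n_rows, dtype=None):
    """Toy stand-in, NOT pandas internals.

    Fill columns missing from `data` with NaN, then cast every
    column to the one `dtype` that was passed in.
    """
    arrays = {}
    for col in columns:
        if col in data:
            arrays[col] = np.asarray(data[col], dtype=object)
        else:
            # Newly introduced column: all values missing.
            arrays[col] = np.full(n_rows, np.nan, dtype=object)
    if dtype is not None:
        # One dtype for everything, so NaN is stringified too.
        arrays = {col: arr.astype(dtype) for col, arr in arrays.items()}
    return arrays


result = toy_arrays_to_frame({'A': [1, 2]}, columns=['A', 'B'], n_rows=2, dtype=str)
print(result['A'].tolist())
print(result['B'].tolist())
```

In this toy model the cast cannot tell real data from filler, which is the kind of inconsistency described above.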
This is not fixed in #24387
I am not able to reproduce this in pandas 0.23.4:
    >>> pd.DataFrame({'A': [1, 2]}, index=[0, 1], columns=['A', 'B'], dtype=str)['B'][0] is np.nan
    True
Huh, I must have been on the wrong branch again. Thanks @FR4NKESTI3N.