read_excel with dtype=str converts empty cells to np.nan #20429
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -981,6 +981,8 @@ I/O | |
- :class:`Timedelta` now supported in :func:`DataFrame.to_excel` for all Excel file types (:issue:`19242`, :issue:`9155`, :issue:`19900`) | ||
- Bug in :meth:`pandas.io.stata.StataReader.value_labels` raising an ``AttributeError`` when called on very old files. Now returns an empty dict (:issue:`19417`) | ||
- Bug in :func:`read_pickle` when unpickling objects with :class:`TimedeltaIndex` or :class:`Float64Index` created with pandas prior to version 0.20 (:issue:`19939`) | ||
- Bug in :meth:`pandas.io.json.json_normalize` where subrecords are not properly normalized if any subrecords values are NoneType (:issue:`20030`) | ||
- Bug in :`read_excel` where it transforms np.nan to 'nan' if dtype=str is chosen. Now keeps np.nan as they are. (:issue:`20377`) | ||
Should be :func:`read_excel`
use double back-ticks around |
||
|
||
Plotting | ||
^^^^^^^^ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -679,6 +679,11 @@ def _parse_cell(cell_contents, cell_typ): | |
**kwds) | ||
|
||
output[asheetname] = parser.read(nrows=nrows) | ||
dtypes = output[asheetname].dtypes | ||
output[asheetname].replace('nan', np.nan, inplace=True) | ||
output[asheetname] = output[asheetname].astype(dtypes, | ||
copy=False) | ||
I worry about this patch being a performance hit against
The result from
True, though that doesn't change my opinion. The problematic part still likely stems from the engine parsing, which would also affect |
||
|
||
if names is not None: | ||
output[asheetname].columns = names | ||
if not squeeze or isinstance(output[asheetname], DataFrame): | ||
|
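To make the concern in the review comments concrete, here is a minimal pure-Python sketch. It is not pandas code, and every function name in it is hypothetical. It assumes the string cast behaves like Python's `str()`, which turns a float NaN into the text `'nan'`. Under that assumption it compares casting and then replacing `'nan'` afterwards (the idea in the patch above) with skipping missing cells during the cast.

```python
import math


def is_missing(value):
    """Hypothetical helper: True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def cast_naive(cells):
    # Casts every cell, so a missing cell becomes the literal string 'nan'.
    return [str(c) for c in cells]


def cast_then_replace(cells):
    # Mirrors the idea of the patch: cast first, then map 'nan' back to NaN.
    # Side effect: a cell whose real text is "nan" is also turned into NaN.
    return [float("nan") if s == "nan" else s for s in cast_naive(cells)]


def cast_skipping_missing(cells):
    # Alternative: leave missing cells untouched while casting the rest.
    return [c if is_missing(c) else str(c) for c in cells]


# One missing cell, one cell that really contains the text "nan".
cells = ["a", float("nan"), "nan", 1.5]
naive = cast_naive(cells)
patched = cast_then_replace(cells)
skipped = cast_skipping_missing(cells)
```

In this sketch the post-hoc replacement cannot tell a missing cell from a cell that genuinely holds the text "nan", while skipping missing values at cast time keeps the two apart. Whether the same holds inside the Excel parsing engine is not shown here.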
You are including some other changes here; please rebase on master.
It's not mine. I deleted it by mistake and added it back.
You can check master here: https://github.com/pandas-dev/pandas/blob/master/doc/source/whatsnew/v0.23.0.txt#L985
However, even after rebasing, I keep getting this conflict.
If you rebased off master and resolved the conflicts in the rebase then it should be ok. Did you fetch the current master before rebasing?
I did now.