Default values for dropna to "False" (issue 9382) #9484
@@ -130,6 +130,54 @@ methods (:issue:`9088`).
d 7
dtype: int64

- default behavior for HDF write functions with "table" format is now to keep rows that are all missing except for index. (:issue:`9382`)

Previously,

.. code-block:: python
In [1]:
df_with_missing = pd.DataFrame({'col1':[0, np.nan, 2], 'col2':[1, np.nan, 3]})
df_with_missing.to_hdf('file.h5', 'df_with_missing', format = 't')

df_without_missing = pd.DataFrame({'col1':[0, -1, 2], 'col2':[1, -1, 3]})
df_without_missing.to_hdf('file.h5', 'df_without_missing')

print(pd.read_hdf('file.h5', 'df_with_missing'))
print(pd.read_hdf('file.h5', 'df_without_missing'))

Out [1]:
col1 col2
0 0 1
2 2 3
col1 col2
0 0 1
1 -1 -1
2 2 3

New behavior: do
Review comment: remove the do

.. code-block:: python
In [1]:
Review comment: use a

df_with_missing = pd.DataFrame({'col1':[0, np.nan, 2], 'col2':[1, np.nan, 3]})
df_with_missing.to_hdf('file.h5', 'df_with_missing', format = 't')

df_without_missing = pd.DataFrame({'col1':[0, -1, 2], 'col2':[1, -1, 3]})
df_without_missing.to_hdf('file.h5', 'df_without_missing')

print(pd.read_hdf('file.h5', 'df_with_missing'))
Review comment: no need for prints or the output (it will be generated)

print(pd.read_hdf('file.h5', 'df_without_missing'))

Out [2]:
col1 col2
0 0 1
1 NaN NaN
2 2 3
col1 col2
0 0 1
1 -1 -1
2 2 3

Deprecations

@@ -4678,6 +4678,13 @@ def test_duplicate_column_name(self):
            other = read_hdf(path, 'df')
            tm.assert_frame_equal(df, other)

    def test_all_missing_values(self):
        df_with_missing = DataFrame({'col1':[0, np.nan, 2], 'col2':[1, np.nan, np.nan]})
Review comment: add the issue number as a comment

        with ensure_clean_path(self.path) as path:
            df_with_missing.to_hdf(path, 'df_with_missing', format = 't')
            reloaded = read_hdf(path, 'df_with_missing')
            tm.assert_frame_equal(df_with_missing, reloaded)

def _test_sort(obj):
    if isinstance(obj, DataFrame):

Review comment: you don't need print statements, paste the actual ipython output (which will have numbered In/Outs)
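
The behavior change described in the release note can be illustrated without an HDF5 store. The following is a minimal sketch, assuming the old table-format default is equivalent to dropping rows in which every column is missing (`dropna(how='all')`) before writing; the variable names `dropped` and `kept` are illustrative and do not come from the PR.

```python
import numpy as np
import pandas as pd

# Same frame as in the release note: row 1 is missing in every column.
df_with_missing = pd.DataFrame({'col1': [0, np.nan, 2],
                                'col2': [1, np.nan, 3]})

# Assumed emulation of the OLD default: all-missing rows are dropped,
# so the frame read back would lack index label 1.
dropped = df_with_missing.dropna(how='all')

# Assumed emulation of the NEW default: every row is kept,
# including the one that is missing in all columns.
kept = df_with_missing.copy()

old_index = list(dropped.index)
new_index = list(kept.index)
print(old_index)
print(new_index)
```

If the old behavior is still wanted, `DataFrame.to_hdf` accepts a `dropna` keyword for the table format, so it can be requested explicitly rather than relied on as a default.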