-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
Default values for dropna to "False" (issue 9382) #9484
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
PLEASE REVIEW: This is my commit to a major project, and would appreciate a quick once over! As per discussion in Issue 9382, changes all HDF functions from having default of dropping all rows with NA in all non-index rows.
all updates should be in this PR. |
odd that no tests fail. pls come up with a tests that fails w/o your change and succeeds with it. I guess this was not tested. |
OK, both on this branch. Will look into testing -- I'm not really sure what that means, sorry. |
Sorry Jeff, I think testing may be beyond my pay-grade. I thought I knew enough to contribute, but the testing wiki ( https://github.com/pydata/pandas/wiki/Testing ) is all greek to me. This is obviously not a big commit, so please feel free to just close the pull request and issue and ignore all this. Sorry. :/ |
Just write a function that matches the patters def test_all_missing_values That saves an reloads an array that would drop under the previous and won't drop now. Test should be here: https://github.com/pydata/pandas/blob/master/pandas/io/tests/test_pytables.py#L4680 |
Thanks @bashtage -- and where would I place this function? And sounds like it needs to raise an AssertionError with descriptions for source of problem if it the calls lead to the rows being dropped when they shouldn't be? |
The test should have the followign properties:
Item 3 would fail on master since the loaded df will be missing the all nan row in the original df. This will, however, not fail with your PR. See my link about where your function goes. If you want to check that it passes, you can run nosetests pandas.io.tests.test_pytables Assuming you are using |
Ah, ok great. And do I just stick this into pytables.py, or is there a special place for these? (Again, so sorry to need the hand holding!) |
There is almost always a |
Excellent, thanks. Will do now. |
Previously, | ||
|
||
.. ipython:: python | ||
In [1]: myFile = HDFStore('file.hdf') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- show these examples with a
DataFrame
with a row that has all-nans, and one that doesnt. - don't use camel case.
- use
to_hdf/read_hdf
(because for example you are not closing the store) - this needs to be a
code-block
OK -- I've made all the suggested updates, and I have a build failure. I'm not sure how to check if that's due to the test I added, but hopefully that's what we were looking for? |
Your code.
|
You should probably also include a test with a slightly more complicated layout, e.g. with a full row, a row with some mulls and a row with all null. |
@bashtage - thanks. Just found the traceback in Travis -- will check next build myself! |
df_without_missing = pd.DataFrame({'col1':[0, -1, 2], 'col2':[1, -1, 3]}) | ||
df_without_missing.to_hdf('file.h5', 'df_without_missing') | ||
|
||
print(pd.read_hdf('file.h5', 'df_with_missing')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need print statements, paste the actual ipython output (which will have numbered In/Outs)
closing pls reopen if/when updated |
Sorry, forgot about this. Now updated and rebased. How do I reopen? Or do I need to start a new PR? |
Seems straightforward and ready. |
PLEASE REVIEW: This is my commit to a major project, and would appreciate a quick once over!
As per discussion in Issue 9382, changes all HDF functions from having default of dropping all rows with NA in all non-index rows.
closes #9382