update io documentation #35720

Closed
wants to merge 1 commit into from
@usneil commented Aug 14, 2020

The HDF5 dropna=True parameter does not drop the NaN values. Perhaps this is a bug in core pandas, or simply a typo in the documentation.
@rhshadrach
Member

With the way you're changing the documentation, it would say that setting dropna=True keeps NA values. That isn't correct. I believe this is a bug, not an issue with the docs.

Member

@simonjayhawkins left a comment

Thanks @grrlic for the PR. See @rhshadrach's comment.

@usneil
Author

usneil commented Aug 15, 2020

Ah, glad this bug was found so soon. I kind of suspected the error came from the docs, but apparently that's not the case. Thanks to @XiaozhanYang's observations.

@usneil closed this Aug 15, 2020
@usneil
Author

usneil commented Aug 15, 2020

@simonjayhawkins Thanks for the quick updates with @rhshadrach. I am actually new to contributing to open source, though I'm quite familiar with Python. With regard to this bug, do you have any suggestions or tips that would help me contribute? Thanks in advance, really love pandas hahaha.

@simonjayhawkins
Member

@grrlic in the first instance, if you are unfamiliar with the codebase, I would probably bisect the regression (see #35685) to get an idea of where a fix should be applied.

Although for this case, it might be quite straightforward to follow the code from to_hdf (I notice from git blame that there were changes to the signature in #29957). It appears that dropna is not passed on to store.put in the if append ... else block.
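
To illustrate the kind of bug described in the comment above, here is a minimal sketch of a keyword argument that is forwarded on one branch but not the other. It is not the pandas source: `ToyStore`, `to_store_buggy`, and `to_store_fixed` are hypothetical names invented for this example, and `None` stands in for NaN so the sketch needs no third-party libraries.

```python
# Hypothetical stand-in for an HDF store; this is NOT the pandas implementation.
class ToyStore:
    def __init__(self):
        self.rows = []

    def _write(self, rows, dropna):
        # Drop rows whose values are all None (standing in for all-NaN rows).
        if dropna:
            rows = [r for r in rows if not all(v is None for v in r)]
        self.rows.extend(rows)

    def append(self, rows, dropna=False):
        self._write(rows, dropna)

    def put(self, rows, dropna=False):
        self.rows = []  # put overwrites any existing contents
        self._write(rows, dropna)


def to_store_buggy(store, rows, append=False, dropna=False):
    if append:
        store.append(rows, dropna=dropna)
    else:
        # Bug pattern: dropna is silently not forwarded on this branch.
        store.put(rows)


def to_store_fixed(store, rows, append=False, dropna=False):
    if append:
        store.append(rows, dropna=dropna)
    else:
        # Assumed shape of a fix: forward dropna here as well.
        store.put(rows, dropna=dropna)


rows = [(1, 2), (None, None), (3, None)]

buggy = ToyStore()
to_store_buggy(buggy, rows, dropna=True)

fixed = ToyStore()
to_store_fixed(fixed, rows, dropna=True)

print(len(buggy.rows))  # 3: the all-None row was kept
print(len(fixed.rows))  # 2: the all-None row was dropped
```

In this toy version the caller passes dropna=True in both cases, but only the fixed wrapper honours it when append is False. Whether the actual fix in pandas takes exactly this form is an assumption.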

Development

Successfully merging this pull request may close these issues.

REGR: pd.to_hdf(dropna=True) not dropping all nan rows
3 participants