read_fwf with passing converters and na_values at the same time #15144
Comments
cc @gfyoung |
So this is in fact a point of discussion for the API (and not a bug per se): When should Currently, as evidenced by the code (and personal testing) they are applied after the converters are applied. This is why there are no I don't see the harm in choosing one or the other, but we just need agreement and consistency (we have the latter currently). Regardless of how this issue goes, documentation updates will be needed to explain when |
If we choose, I would find it more logical that |
@jorisvandenbossche : Very well. So we have two people (@gitporst and @jorisvandenbossche) in favor of applying the |
@jorisvandenbossche : Argh...I forgot. We have this ugly bug that I reported in #13302. We might need to put a hold on resolving this and save it for a longer-term revamp of |
@gfyoung as long as we are consistent with the approach I think it's fine (IOW apply to be honest, not sure what this means
what exactly are you referring to? (do you have some plans for this)? |
@jreback : I was just saying that patching converters is not easy in the short-term because of the bug I pointed and the inconsistency I pointed out in the issue. However, at least we seem to have agreement that |
Yes, this indeed seems like a duplicate of #13302. @gfyoung can you move the necessary information? (just the agreement that I am also not sure why solving this or the bug in #13302 would need a 1.0. Do you mean that the logical way to solve it would have a lot of other consequences? |
@jorisvandenbossche : The consequences of changing it would be too big I think. I gave it a try before a couple of months ago, and things just broke. Though if someone wants to prove me wrong, be my guest! 😄 |
Duplicate of #13302
When reading via read_fwf and passing both na_values and converters at the same time, na_value detection does not work:
Code Sample, a copy-pastable example if possible
Assume a fwf-style text file like this:
Expected Output
It would be nice if this could be handled behind the scenes.
In pandas.io.parsers._clean_na_values, float representations of the na_values are produced by _floatify_na_values. Maybe passing the converters to this function might be a solution?
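Until the ordering of na_values and converters is settled, one possible workaround is to do the missing-value check inside the converter itself, so the result does not depend on which step pandas applies first. The following is only a minimal sketch: the sample data, the "-999" marker, the column layout and the helper name `to_float_or_nan` are all assumptions made for illustration, not the file or converter from this report.

```python
import io

import pandas as pd

# Hypothetical fixed-width sample; "-999" stands in for a missing value.
RAW = (
    "id  value\n"
    "1   10.5\n"
    "2   -999\n"
    "3   7.25\n"
)

# Assumed set of raw strings that should be treated as missing.
NA_MARKERS = {"-999"}


def to_float_or_nan(raw):
    """Hypothetical converter: check the raw field first, then convert."""
    text = str(raw).strip()
    if text in NA_MARKERS:
        return float("nan")
    return float(text)


df = pd.read_fwf(
    io.StringIO(RAW),
    colspecs=[(0, 4), (4, 9)],  # explicit column boundaries for the sample
    converters={"value": to_float_or_nan},
)

print(df)
```

Because the converter sees the raw string, the marker is matched before any conversion changes its representation. The drawback is that the missing-value markers have to be repeated in every converter instead of being declared once via na_values.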
Output of pd.show_versions()
pandas: 0.19.2
(...)