Enable read_csv to interpret "Infinity", "+Infinity" and "-Infinity" as floating point values #10065

Closed
cbare opened this issue May 5, 2015 · 3 comments · Fixed by #28181
Labels
Enhancement IO CSV read_csv, to_csv
Comments

cbare commented May 5, 2015

The strings "Infinity", "+Infinity" and "-Infinity" are commonly used to represent infinite floating point values. It would be great for pandas pd.read_csv to correctly interpret these strings as infinite floating point values or to take a parameter "inf_values" analogous to "na_values" that would allow the caller to specify what they require.

@jreback jreback added the IO CSV read_csv, to_csv label May 5, 2015
Contributor

jreback commented May 5, 2015

inf and -inf (case-insensitive) are already accepted. These are the standard spellings.

Author

cbare commented May 6, 2015

We're reading CSVs produced by Java, which stringifies infinite values as "Infinity" and "-Infinity".

Am I looking in the right place if I'm poking around in maybe_convert_numeric()?
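Until the parser itself accepts these spellings, one possible workaround is to convert the tokens in Python. The sketch below uses only the standard library and relies on the fact that Python's built-in `float()` already accepts "Infinity", "+Infinity" and "-Infinity" (case-insensitively). The helper name `java_float` and the sample data are made up for illustration.

```python
import csv
import io
import math


def java_float(token: str) -> float:
    """Parse a numeric token, including Java-style infinity spellings.

    Python's float() accepts "inf" and "infinity" in any case, with an
    optional leading sign, so no special-casing is needed here.
    """
    return float(token.strip())


# Illustrative CSV text, shaped like output from a Java program.
sample = "id,value\n1,Infinity\n2,-Infinity\n3,+Infinity\n4,1.5\n"

rows = list(csv.DictReader(io.StringIO(sample)))
values = [java_float(row["value"]) for row in rows]

assert values[0] == math.inf
assert values[1] == -math.inf
```

A function like this could probably also be passed to `pd.read_csv` through its `converters` argument for the affected columns, at the cost of a slower, per-value Python conversion.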

Contributor

jreback commented May 6, 2015

Here is where it would need to be changed: https://github.com/pydata/pandas/blob/master/pandas/parser.pyx#L1474

I think you might be able to short-circuit with a strncasecmp (comparing only the first n characters) and then check against a hash table of allowed values. C functions are used here because this is a very performance-sensitive part of the code (it is hit a lot), so any change would need testing to make sure performance stays OK.
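The two-step check suggested above can be sketched in Python to show the logic. This is not the parser's C code: the prefix slice stands in for `strncasecmp(s, "inf", 3)`, a `frozenset` stands in for the hash table, and the names `INF_TOKENS` and `match_infinity` are hypothetical.

```python
import math
from typing import Optional

# Hypothetical table of accepted spellings (lower-cased, sign removed).
INF_TOKENS = frozenset({"inf", "infinity"})


def match_infinity(token: str) -> Optional[float]:
    """Return +inf or -inf if token is an accepted spelling, else None."""
    sign = 1.0
    body = token
    # Peel off an optional leading sign.
    if body[:1] in ("+", "-"):
        if body[0] == "-":
            sign = -1.0
        body = body[1:]
    # Cheap short-circuit: most tokens fail here after three characters.
    if body[:3].lower() != "inf":
        return None
    # Full lookup only for the rare tokens that pass the prefix test.
    if body.lower() in INF_TOKENS:
        return sign * math.inf
    return None


assert match_infinity("-Infinity") == -math.inf
assert match_infinity("information") is None
```

The prefix test keeps the common case (ordinary numbers) cheap, which is the performance concern raised in the comment. Whether this holds up in the real C parser would still need benchmarking.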

@jreback jreback added this to the Next Major Release milestone May 6, 2015
@jreback jreback modified the milestones: Contributions Welcome, 1.0 Sep 2, 2019