BUG: read_csv not recognizing numbers appropriately when decimal is set #38420
@@ -314,3 +314,51 @@ def test_malformed_skipfooter(python_parser_only):
    msg = "Expected 3 fields in line 4, saw 5"
    with pytest.raises(ParserError, match=msg):
        parser.read_csv(StringIO(data), header=1, comment="#", skipfooter=1)


@pytest.mark.parametrize("thousands", [None, "."])
@pytest.mark.parametrize(
    "value, result_value",
    [
        ("1,2", 1.2),
        ("1,2e-1", 0.12),
        ("1,2E-1", 0.12),
        ("1,2e-10", 0.0000000012),
        ("1,2e1", 12.0),
        ("1,2E1", 12.0),
        ("-1,2e-1", -0.12),
        ("0,2", 0.2),
        (",2", 0.2),
    ],
)
def test_decimal_and_exponential(python_parser_only, thousands, value, result_value):
    # GH#31920
    data = StringIO(
        f"""a b
1,1 {value}
"""
    )
    result = python_parser_only.read_csv(
        data, "\t", decimal=",", engine="python", thousands=thousands
Sorry, wrong commit button above. C works perfectly here and already has tests for this. It would probably make sense to unify them as a follow-up.

IMO would do it in this PR, but a follow-up also works.

This is fine as a follow-up.
    )
    expected = DataFrame({"a": [1.1], "b": [result_value]})
    tm.assert_frame_equal(result, expected)


@pytest.mark.parametrize("thousands", [None, "."])
@pytest.mark.parametrize(
    "value",
    ["e11,2", "1e11,2", "1,2,2", "1,2.1", "1,2e-10e1", "--1,2", "1a.2,1", "1..2,3"],
)
def test_decimal_and_exponential_erroneous(python_parser_only, thousands, value):
    # GH#31920
    data = StringIO(
        f"""a b
1,1 {value}
"""
    )
    result = python_parser_only.read_csv(
        data, "\t", decimal=",", engine="python", thousands=thousands
    )
    expected = DataFrame({"a": [1.1], "b": [value]})
    tm.assert_frame_equal(result, expected)
this should say engine='python' right?
Yep
Changed