-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Enforce correct encoding in stata #15768
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Ensure StataReader and StataWriter have the correct encoding. Standardized default encoding to 'latin-1' closes pandas-dev#15723
f549481
to
2f02697
Compare
@bashtage side issue. We finally have 32-bit daily wheels. https://travis-ci.org/MacPython/pandas-wheels/jobs/213101390 so this one came up in testing (I am working fixing the rest). Unfort no easy way to test this as has to be on master. But if you have a fix can merge.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trivial comments. after its green you can push (and just ping me to merge)
pandas/io/stata.py
Outdated
def __init__(self, encoding='latin-1'): | ||
|
||
if encoding not in VALID_ENCODINGS: | ||
raise ValueError('Unknown encoding. Only latin-1 and ascii ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra space before ascii
pandas/tests/io/test_stata.py
Outdated
@@ -1276,3 +1276,9 @@ def test_out_of_range_float(self): | |||
original.to_stata(path) | |||
tm.assertTrue('ColumnTooBig' in cm.exception) | |||
tm.assertTrue('infinity' in cm.exception) | |||
|
|||
def test_invalid_encoding(self): | |||
original = self.read_csv(self.csv3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add the commit number as a comment
@jreback I rolled a fix for the 32bit issue into this PR. I can separate if you prefer. |
@bashtage totally fine here. thanks! ping on green. |
Codecov Report
@@ Coverage Diff @@
## master #15768 +/- ##
==========================================
- Coverage 91.01% 90.99% -0.02%
==========================================
Files 143 143
Lines 49377 49384 +7
==========================================
- Hits 44941 44938 -3
- Misses 4436 4446 +10
Continue to review full report at Codecov.
|
Fix use of 64-bit integers as keys in general string objects (GSO) by wrapping in strings when used as dictionary keys
a79a515
to
8278be7
Compare
thanks! always a pleasure @bashtage |
@bashtage looks like this fixed the 32-bit error as well! https://travis-ci.org/MacPython/pandas-wheels/jobs/213622884 thanks (of course other errors which I am working on ......) |
Ensure StataReader and StataWriter have the correct encoding. Standardized default encoding to 'latin-1' closes pandas-dev#15723 Author: Kevin Sheppard <[email protected]> Closes pandas-dev#15768 from bashtage/limit-stata-encoding and squashes the following commits: 8278be7 [Kevin Sheppard] BUG: Fix limited key range on 32-bit platofrms 2f02697 [Kevin Sheppard] BUG: Enforce correct encoding in stata
Ensure StataReader and StataWriter have the correct encoding.
Standardized default encoding to 'latin-1'
closes #15723
git diff upstream/master | flake8 --diff