BUG: Fix TypeError caused by GH13374 #17465
Hello @matthax! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on September 09, 2017 at 21:49 Hours UTC |
Codecov Report
@@ Coverage Diff @@
## master #17465 +/- ##
==========================================
- Coverage 91.16% 91.14% -0.02%
==========================================
Files 163 163
Lines 49590 49590
==========================================
- Hits 45209 45200 -9
- Misses 4381 4390 +9
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #17465 +/- ##
==========================================
- Coverage 91.15% 91.13% -0.02%
==========================================
Files 163 163
Lines 49534 49534
==========================================
- Hits 45153 45144 -9
- Misses 4381 4390 +9
Continue to review full report at Codecov.
|
pls add your example: #13374 (comment) as a test. |
doc/source/whatsnew/v0.21.0.txt
Outdated
@@ -411,6 +411,7 @@ I/O | |||
- Bug in :func:`read_csv` when called with a single-element list ``header`` would return a ``DataFrame`` of all NaN values (:issue:`7757`) | |||
- Bug in :func:`read_stata` where value labels could not be read when using an iterator (:issue:`16923`) | |||
- Bug in :func:`read_html` where import check fails when run in multiple threads (:issue:`16928`) | |||
- Bug in :func:`read_csv` where automatic delimiter detection caused a `TypeError` to be thrown when a bad line was encountered (:issue:`13374`) |
use double-back-ticks around TypeError
add:
rather than the correct error message.
Updated both. Looks like you're going to have to manually restart the Travis build, though; something went wrong there and I'm not able to see any log info.
…thon_parser_patch merge upstream into master
@jreback added the test and made the requested changes in whatsnew |
@@ -218,6 +218,23 @@ def test_multi_char_sep_quotes(self): | |||
self.read_csv(StringIO(data), sep=',,', | |||
quoting=csv.QUOTE_NONE) | |||
|
|||
def test_none_delimiter(self): | |||
# see gh-13374 and gh-17465 |
can you add a 1-line comment about what is happening here.
Is this only in the python parser as well?
Only the Python parser. I actually discovered the issue because I had been using the built-in CSV sniffer with the C parser for a while, but switched to the Python engine with the pandas sniffer because it did notably better on the data files I was using.
@gfyoung pls have a look. |
updated the documentation for the test method. Couldn't find a nice way to one-line it though
lgtm. @gfyoung pls review and merge. |
|
||
# We expect the third line in the data to be | ||
# skipped because it is malformed | ||
# but we do not expect any errors to occur |
Nits: add a comma after "malformed" + add a period at end of comment.
pandas/io/parsers.py
Outdated
@@ -2836,7 +2836,8 @@ def _rows_to_cols(self, content): | |||
for row_num, actual_len in bad_lines: | |||
msg = ('Expected %d fields in line %d, saw %d' % | |||
(col_len, row_num + 1, actual_len)) | |||
if len(self.delimiter) > 1 and self.quoting != csv.QUOTE_NONE: | |||
if self.delimiter and \ | |||
len(self.delimiter) > 1 and self.quoting != csv.QUOTE_NONE: |
General practice is to wrap the conditionals in parentheses rather than use a backslash continuation, which is a little nicer to read, i.e.:
if (self.delimiter and
len(self.delimiter) > 1...)
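For illustration, the parenthesized guard can be sketched outside of pandas as follows. This is a minimal sketch, not the code in `pandas/io/parsers.py`: the helper name `multichar_delimiter_hint` is hypothetical, and it assumes the delimiter is `None` when automatic detection is in use, as the test name `test_none_delimiter` suggests.

```python
import csv


def multichar_delimiter_hint(delimiter, quoting):
    """Hypothetical helper mirroring the guarded conditional from the diff.

    Returns True only when a multi-character delimiter is combined with a
    quoting mode other than csv.QUOTE_NONE. A falsy delimiter such as None
    short-circuits the expression before len() is ever called.
    """
    return bool(delimiter and
                len(delimiter) > 1 and
                quoting != csv.QUOTE_NONE)


# The unguarded form fails on None; this is the kind of TypeError
# the extra truthiness check is meant to avoid.
try:
    len(None) > 1
except TypeError as exc:
    unguarded_error = exc

print(multichar_delimiter_hint(None, csv.QUOTE_MINIMAL))
print(multichar_delimiter_hint(",,", csv.QUOTE_MINIMAL))
print(multichar_delimiter_hint(",,", csv.QUOTE_NONE))
```

Because `and` short-circuits, the truthiness check has to come first; the parentheses only replace the backslash continuation and do not change the evaluation order.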
…thon_parser_patch merge from upstream
Just an FYI: the flake8 check doesn't seem to be working on Windows. I tried altering the commands as suggested, but no dice. It doesn't error, but it doesn't show me any issues either. I swapped to using pylint and that seems to work OK, but perhaps that's something that needs to be looked into.
Hmmm...we've been having this issue off-and-on with Windows. Not sure why it wouldn't have been caught on the diff. |
Thanks @matthax ! |
0 failed, 9873 passed, 1955 skipped, 11 xfailed, 4 warnings in 1475.44 seconds
git diff upstream/master -u -- "*.py" | flake8 --diff
@gfyoung