-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Prevent UnicodeDecodeError in pivot_table under Py2 #17489
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@mpenkov : Will need a |
Codecov Report
@@ Coverage Diff @@
## master #17489 +/- ##
==========================================
- Coverage 91.15% 91.14% -0.02%
==========================================
Files 163 163
Lines 49534 49534
==========================================
- Hits 45153 45148 -5
- Misses 4381 4386 +5
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #17489 +/- ##
==========================================
- Coverage 91.15% 91.14% -0.02%
==========================================
Files 163 163
Lines 49534 49534
==========================================
- Hits 45153 45148 -5
- Misses 4381 4386 +5
Continue to review full report at Codecov.
|
@gfyoung Added the what's new. Please check. |
pandas/core/reshape/pivot.py
Outdated
@@ -145,7 +145,7 @@ def _add_margins(table, data, values, rows, cols, aggfunc, | |||
if not isinstance(margins_name, compat.string_types): | |||
raise ValueError('margins_name argument must be a string') | |||
|
|||
msg = 'Conflicting name "{name}" in margins'.format(name=margins_name) | |||
msg = u('Conflicting name "{name}" in margins').format(name=margins_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new parentheses can be removed.
pandas/tests/reshape/test_pivot.py
Outdated
def test_issue_13292(self): | ||
# The below shouldn't raise an exception anymore. | ||
frame = pd.DataFrame({'foo': [1, 2, 3]}) | ||
pd.pivot_table(frame, index=['foo'], aggfunc=len, margins=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need the comment, instead change the test name to something useful like: test_pivot_margins_name_unicode
and put a comment with the issue number
pandas/tests/reshape/test_pivot.py
Outdated
# The below shouldn't raise an exception anymore. | ||
frame = pd.DataFrame({'foo': [1, 2, 3]}) | ||
pd.pivot_table(frame, index=['foo'], aggfunc=len, margins=True, | ||
margins_name=u'\u0394\u03bf\u03ba\u03b9\u03bc\u03ae') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get the result = pd.pivot_table(....)
and compare to a constructed expected
with tm.assert_frame_equal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback Could you please suggest what the constructed expected
should look like? I'm a bit new and unsure of what should go there. Here's what I've got so far:
def test_pivot_margins_name_unicode(self):
# issue #13292
greek = u'\u0394\u03bf\u03ba\u03b9\u03bc\u03ae'
frame = pd.DataFrame({'foo': [1, 2, 3]})
table = pd.pivot_table(frame, index=['foo'], aggfunc=len, margins=True,
margins_name=greek)
expected = pd.DataFrame({}, columns=['foo'], index=[1, 2, 3, greek])
tm.assert_frame_equal(table, expected)
This is failing on the last line with DataFrame shape mismatch
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't pass the {}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion. This is what I have now:
expected = pd.DataFrame(columns=['foo'], index=[1, 2, 3, greek])
It's still failing:
E AssertionError: DataFrame are different
E
E DataFrame shape mismatch
E [left]: (4, 0)
E [right]: (4, 1)
pandas/util/testing.py:1105: AssertionError
What else could be wrong?
doc/source/whatsnew/v0.20.0.txt
Outdated
@@ -1705,6 +1705,7 @@ Reshaping | |||
- Bug in ``pd.concat()`` in which concatenating with an empty dataframe with ``join='inner'`` was being improperly handled (:issue:`15328`) | |||
- Bug with ``sort=True`` in ``DataFrame.join`` and ``pd.merge`` when joining on indexes (:issue:`15582`) | |||
- Bug in ``DataFrame.nsmallest`` and ``DataFrame.nlargest`` where identical values resulted in duplicated rows (:issue:`15297`) | |||
- Bug in ``pivot._add_margins` when ``margins_name`` is Unicode under Py2 (:issue:`13292`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
write this from a user perspective
Bug in :func:`pandas.pivot_table` incorrectly raising when passing unicode input for
margins keyword
pandas/tests/reshape/test_pivot.py
Outdated
def test_pivot_margins_name_unicode(self): | ||
# issue #13292 | ||
frame = pd.DataFrame({'foo': [1, 2, 3]}) | ||
pd.pivot_table(frame, index=['foo'], aggfunc=len, margins=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still like to compare this here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured out how to construct the expected object correctly. Please have a look.
thanks @mpenkov |
git diff upstream/master -u -- "*.py" | flake8 --diff