fix hashing string-casting error #21187

jbrockmendel · 2018-05-24T02:25:08Z

closes #21002 (Rendering Series[Categorical] raises UnicodeDecodeError)
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

jreback · 2018-05-24T02:29:10Z

pandas/tests/series/test_repr.py

+        # set sys.defaultencoding to ascii, then change it back after the test
+        enc = sys.getdefaultencoding()
+        reload(sys)  # noqa:F821
+        sys.setdefaultencoding('ascii')


there is a context manager for this i think

There is a context manager for locale in pd.util.testing. Can that be used here or do you have something else in mind? (I agree it would be prettier)

yes pls use that

Looks like tm.set_locale doesn't change sys.getdefaultencoding(). I could make a new contextmanager specifically for this (which I guess would be a no-op in py3?)
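For illustration, a minimal sketch of what such a context manager could look like. The name `set_defaultencoding` is borrowed from the later commit message "make set_defaultencoding context"; the no-op behaviour on Python 3 is an assumption taken from the suggestion above, and the helper actually added to pandas.util.testing may behave differently.

```python
import sys
from contextlib import contextmanager

PY2 = sys.version_info[0] == 2


@contextmanager
def set_defaultencoding(encoding):
    """Temporarily change sys.getdefaultencoding() on Python 2.

    Sketch only: on Python 3 this is assumed to be a no-op, because
    sys.setdefaultencoding does not exist there.
    """
    if not PY2:
        # Nothing to change on Python 3; just run the body of the with-block.
        yield
        return

    orig = sys.getdefaultencoding()
    # site.py deletes sys.setdefaultencoding at startup on Python 2;
    # reloading sys brings it back. `reload` is a Python 2 builtin.
    reload(sys)  # noqa: F821
    sys.setdefaultencoding(encoding)
    try:
        yield
    finally:
        # Restore the original encoding even if the body raised.
        sys.setdefaultencoding(orig)
```

A test would then wrap the rendering calls in `with set_defaultencoding('ascii'):` instead of calling `reload(sys)` and `sys.setdefaultencoding` inline, so the original encoding is restored even when the test fails.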

codecov · 2018-05-24T04:06:41Z

Codecov Report

Merging #21187 into master will decrease coverage by 0.01%.
The diff coverage is 11.11%.

@@            Coverage Diff             @@
##           master   #21187      +/-   ##
==========================================
- Coverage   91.92%    91.9%   -0.02%     
==========================================
  Files         153      153              
  Lines       49563    49572       +9     
==========================================
+ Hits        45559    45560       +1     
- Misses       4004     4012       +8

Flag	Coverage Δ
#multiple	`90.3% <11.11%> (-0.02%)`	⬇️
#single	`41.8% <11.11%> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/util/testing.py	`85.27% <11.11%> (-0.7%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b36b451...7ab9324.

jreback · 2018-05-24T11:51:44Z

pandas/tests/series/test_repr.py

@@ -202,6 +203,35 @@ def test_latex_repr(self):

 class TestCategoricalRepr(object):

+    @pytest.mark.skipif(compat.PY3, reason="Decoding failure only in PY2")


we need a test for py3 as well that uses utf8 as the encoding

Sure, couldn't hurt.


jreback · 2018-06-19T01:21:45Z

doc/source/whatsnew/v0.23.1.txt

@@ -100,6 +100,7 @@ Bug Fixes
 - Bug in :meth:`Series.str.replace()` where the method throws `TypeError` on Python 3.5.2 (:issue: `21078`)
 - Bug in :class:`Timedelta`: where passing a float with a unit would prematurely round the float precision (:issue: `14156`)
 - Bug in :func:`pandas.testing.assert_index_equal` which raised ``AssertionError`` incorrectly, when comparing two :class:`CategoricalIndex` objects with param ``check_categorical=False`` (:issue:`19776`)
+- Bug in rendering :class:`Series` with ``Categorical`` dtype in rare conditions under Python 2.7 (:issue:`21002`)


can you move to 0.23.2

jreback · 2018-06-19T01:22:07Z

pandas/tests/series/test_repr.py

+        # GH#21002 if len(index) > 60, sys.getdefaultencoding()=='ascii',
+        # and we are working in PY2, then rendering a Categorical could raise
+        # UnicodeDecodeError by trying to decode when it shouldn't
+        from pandas.core.base import StringMixin


can import at the top
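The general failure mode named in the GH#21002 comment can be reproduced without pandas. The sketch below is not the code path in pandas; it only shows that UTF-8 bytes for a non-ASCII label decode fine as UTF-8 but raise UnicodeDecodeError when decoded as ASCII, which is what an implicit decode does on Python 2 when sys.getdefaultencoding() is 'ascii'. The label and the helper name `decodes_with` are hypothetical.

```python
# Hypothetical non-ASCII label, written with an escape so the file stays ASCII.
name = u"San Sebasti\xe1n"

# The UTF-8 bytes for that label, i.e. what a Python 2 `str` would hold.
raw = name.encode("utf-8")


def decodes_with(raw_bytes, encoding):
    """Return True if raw_bytes can be decoded with the given encoding."""
    try:
        raw_bytes.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


utf8_ok = decodes_with(raw, "utf-8")   # decoding with the right codec
ascii_ok = decodes_with(raw, "ascii")  # decoding with the PY2 default codec
```

Here `utf8_ok` is True and `ascii_ok` is False, which is why the Python 2 test forces the default encoding to 'ascii' while the Python 3 test can rely on the utf8 default.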

jreback · 2018-06-19T01:24:14Z

pandas/tests/series/test_repr.py

+            str(ser)
+
+        else:
+            # set sys.defaultencoding to ascii, then change it back after


can you make this into a context manager in pandas.util.testing


jreback · 2018-06-21T10:19:16Z

thanks!

and soon enough we will blow away all PY2 code in any event! but nice patch.


jbrockmendel added 2 commits May 23, 2018 19:23

fix hashing string-casting error

0fab6a9

flake8 fixup

24a0f59

jreback requested changes May 24, 2018


jreback added the Bug and Output-Formatting (__repr__ of pandas objects, to_string) labels May 24, 2018

jbrockmendel added 4 commits June 11, 2018 18:07

Merge branch 'master' of https://github.com/pandas-dev/pandas into un…

a879d69

…icat

add test in py3, whatsnew note in 0.23.1

279a6e1

fixup remove unused import

3f8e9b2

Merge branch 'master' of https://github.com/pandas-dev/pandas into un…

35ca0cd

…icat

jreback requested changes Jun 19, 2018


jbrockmendel added 4 commits June 18, 2018 19:20

Merge branch 'master' of https://github.com/pandas-dev/pandas into un…

9a37725

…icat

make set_defaultencoding context

7f12013

Move note to 0.23.2

3b91c00

Merge branch 'master' of https://github.com/pandas-dev/pandas into un…

7ab9324

…icat

jreback added this to the 0.23.2 milestone Jun 21, 2018

jreback approved these changes Jun 21, 2018


jreback added the Needs Backport label Jun 21, 2018

jreback merged commit e24da6c into pandas-dev:master Jun 21, 2018

jbrockmendel deleted the unicat branch June 22, 2018 03:32

jorisvandenbossche removed the Needs Backport label Jun 29, 2018

jorisvandenbossche pushed a commit that referenced this pull request Jun 29, 2018

fix hashing string-casting error (#21187)

2411ad6

(cherry picked from commit e24da6c)

jorisvandenbossche pushed a commit that referenced this pull request Jul 2, 2018

fix hashing string-casting error (#21187)

4b1a687

(cherry picked from commit e24da6c)

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018

fix hashing string-casting error (pandas-dev#21187)

f4ef546
