DOC: make io.rst utf8 only #5926

ghost · 2014-01-13T18:30:02Z

@JanSchulz, can you test whether this solves the problem for you?

jorisvandenbossche · 2014-01-13T18:59:05Z

Maybe this can solve the Windows build issue (I will also test it), but as an aside: do we want this in the docs? The example itself works; it is only the docs build that fails (as far as I understand).

ghost · 2014-01-13T19:06:45Z

I think we want users and contributors to be able to build the docs, yes, even if they're on Windows or a different locale.
Do you feel the workaround clutter detracts much from the example?

It took a lot of effort to get pandas to play nice with unicode, and one lesson learned is not to mix encodings. The docs should be utf8-clean IMO.

jorisvandenbossche · 2014-01-13T20:35:59Z

I tried it, and it does not solve the issue. And in retrospect, that is maybe also logical: the problem in windows is in the building of the rst with unicode to html, and it is the output generated by the code example which causes this. With your changes, the output of the code example still contains special characters (which is also the point of the code example), and so causes the build on windows to stop.
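
As a rough illustration of that failure mode (not the actual Sphinx code path): text that contains special characters is harmless until something tries to push it through a codec that cannot represent it. The thread does not say which codec the Windows build ends up using, so `ascii` below is only a stand-in assumption, and `can_encode` is a hypothetical helper.

```python
# Hypothetical sketch: the same non-ASCII output succeeds or fails
# depending on the codec applied to it. "ascii" is an assumed stand-in
# for whatever narrower codec a build step might pick up from its locale.
example_output = "Tr\u00e4umen,7\nGr\u00fc\u00dfe,5"  # "Träumen", "Grüße"


def can_encode(text, codec):
    """Return True if `text` can be encoded with `codec`, else False."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


utf8_ok = can_encode(example_output, "utf-8")
ascii_ok = can_encode(example_output, "ascii")
print("utf-8:", utf8_ok)
print("ascii:", ascii_ok)
```

Changing the encoding of the source file does not change this picture, because the generated output still contains the same characters.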

I think @JanSchulz had another approach as a kind of hack: something along the lines of #5142 (comment). I also vaguely remember that the issue was fixed when using ipython's version of the ipython directive, but I should check that.

ghost · 2014-01-13T20:40:57Z

Then I misunderstood the issue. There is a definite difference:

s1='word,length\nTr\xc3\xa4umen,7\nGr\xc3\xbc\xc3\x9fe,5'
s2=s1.decode('utf8').encode('latin-1')

s1.decode('utf8')
Out[33]: u'word,length\nTr\xe4umen,7\nGr\xfc\xdfe,5'

s2.decode('utf8')
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-34-7c1601a98c33> in <module>()
----> 1 s2.decode('utf8')

/usr/lib64/python2.7/encodings/utf_8.pyc in decode(input, errors)
     14 
     15 def decode(input, errors='strict'):
---> 16     return codecs.utf_8_decode(input, errors, True)
     17 
     18 class IncrementalEncoder(codecs.IncrementalEncoder):

UnicodeDecodeError: 'utf8' codec can't decode byte 0xe4 in position 14: invalid continuation byte

And in the case of Python's "encoding utf8" preamble, that makes all the difference.
I expected sphinx to accept utf8 input; if it doesn't, that seems like a bug to me.
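
For readers on Python 3, here is a minimal sketch of the same difference, restating the Python 2 session above with explicit bytes literals. The helper `try_decode_utf8` is made up for illustration; everything else uses only the built-in `bytes.decode` and `str.encode`.

```python
# Python 3 restatement of the session above.
# utf8_bytes holds the UTF-8 encoding of the CSV text;
# latin1_bytes holds the same text re-encoded as latin-1.
utf8_bytes = b"word,length\nTr\xc3\xa4umen,7\nGr\xc3\xbc\xc3\x9fe,5"
text = utf8_bytes.decode("utf-8")
latin1_bytes = text.encode("latin-1")


def try_decode_utf8(data):
    """Return (decoded_text, None) on success or (None, error) on failure."""
    try:
        return data.decode("utf-8"), None
    except UnicodeDecodeError as exc:
        return None, exc


ok_text, ok_err = try_decode_utf8(utf8_bytes)
bad_text, bad_err = try_decode_utf8(latin1_bytes)

print(repr(ok_text))
# bad_err.start is the offset of the first byte that is not valid UTF-8,
# here the latin-1 byte 0xe4 for the "a" with umlaut.
print(bad_err.start, bad_err)
```

The latin-1 bytes are a valid encoding of the same text, but they are not valid UTF-8, so a reader that has been told to expect UTF-8 rejects them.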

Thanks for testing.

ghost · 2014-01-13T20:42:29Z

Btw, there was some decode action in our hacked version of ipython_directive, so #5925 may actually solve the problem by sheer coincidence.

jorisvandenbossche · 2014-01-13T20:46:18Z

See also here #5142 (comment). There was indeed a .decode('utf8') in our version of the ipython directive for some other reason, but that broke the building on windows.

jorisvandenbossche · 2014-01-13T20:46:40Z

I will try out the other PR with your rebase.

ghost · 2014-01-13T20:48:42Z

I'm seriously skimming past all the important bits today :), sorry.

DOC: make io.rst utf8 only

f9e1a5a

ghost closed this Jan 13, 2014

ghost deleted the PR_GH5142 branch January 13, 2014 20:41

jorisvandenbossche mentioned this pull request Jan 14, 2014

DOC: easing building of the docs for contributors #5934

Closed

5 tasks

This pull request was closed.
