read_stata failing with URL #9231

Closed
jseabold opened this issue Jan 12, 2015 · 4 comments · Fixed by #9245
Labels
IO Stata read_stata, to_stata Unicode Unicode strings
Comments

@jseabold
Contributor

I'm not entirely sure of the root cause yet. I think it has something to do with the default encoding. I'm pretty sure Stata uses iso-8859-1, not cp1252. Changing this inside pandas.io.maybe_read_encoded_stream works, but changing the encoding in StataReader breaks something else that I don't have time to debug.

Can someone sanity check me? Does this work on Python 2?

import pandas as pd

dta = pd.read_stata("http://www.stata-press.com/data/r13/fullauto.dta")
@TomAugspurger
Contributor

I get the following in Python 2:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2644: character maps to <undefined>
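
The byte named in this traceback, 0x9d, is one of the few values that Python's cp1252 codec leaves unassigned, while iso-8859-1 maps every byte from 0x00 to 0xff. The sketch below only illustrates that codec difference on a single byte. It does not read a .dta file and says nothing about which encoding Stata actually writes. The helper name `try_decode` is made up for illustration.

```python
# Byte 0x9d is the one reported in the traceback above.
raw = b"\x9d"


def try_decode(data, encoding):
    """Return the decoded text, or None if the codec rejects the bytes.

    Hypothetical helper, not part of pandas.
    """
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# cp1252 has no character assigned to 0x9d, so decoding fails.
cp1252_result = try_decode(raw, "cp1252")

# iso-8859-1 (latin-1) maps each byte to the code point of the same value.
latin1_result = try_decode(raw, "iso-8859-1")

print("cp1252:", repr(cp1252_result))
print("iso-8859-1:", repr(latin1_result))
```

This is consistent with the report that switching the default encoding makes the error go away, though it does not show whether decoding a binary stream as text is the right fix in the first place.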

@jreback
Contributor

jreback commented Jan 13, 2015

The default encoding is passed (even for a URL), so there should probably be a bit more inference here.

@jreback jreback added Unicode Unicode strings IO Stata read_stata, to_stata labels Jan 13, 2015
@jreback jreback added this to the 0.16.0 milestone Jan 13, 2015
@jreback jreback changed the title read_stata failing with URL read_stata failing with URL Jan 13, 2015
@bashtage
Contributor

Is there somewhere permanent to place a dta file at a URL for testing?

Or is there a reasonable method to mock a URL locally using a file?
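
One possible way to mock a URL locally is to serve a temporary directory with the standard library's `http.server` in a background thread. This is a minimal sketch using only the standard library, not the approach taken in the eventual fix. The file name `sample.dta`, the payload bytes, and the helpers `serve_directory` and `fetch_bytes` are all hypothetical, and the payload is not a valid Stata file.

```python
import functools
import http.server
import io
import os
import tempfile
import threading
import urllib.request


class QuietHandler(http.server.SimpleHTTPRequestHandler):
    """Static file handler that does not log each request to stderr."""

    def log_message(self, format, *args):
        pass


def serve_directory(directory):
    """Start a throwaway HTTP server for `directory` on a free local port."""
    handler = functools.partial(QuietHandler, directory=directory)
    server = http.server.HTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


def fetch_bytes(url):
    """Fetch `url` and return its body as a binary buffer, with no decoding."""
    # Bypass any proxy configured in the environment for this local request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as response:
        return io.BytesIO(response.read())


# Placeholder bytes standing in for a real .dta file (assumption).
payload = b"\x9d\x00binary payload, not text"

with tempfile.TemporaryDirectory() as tmpdir:
    with open(os.path.join(tmpdir, "sample.dta"), "wb") as fh:
        fh.write(payload)

    server = serve_directory(tmpdir)
    try:
        port = server.server_address[1]
        url = "http://127.0.0.1:%d/sample.dta" % port
        fetched = fetch_bytes(url).read()
    finally:
        server.shutdown()
        server.server_close()

print(fetched == payload)
```

In a real test, the local `url` could be passed to `pd.read_stata` in place of a remote address, with an actual .dta file written to the temporary directory. If the reader under test accepts `file://` URLs, building one with `pathlib.Path(...).as_uri()` may be a lighter alternative.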

@bashtage
Contributor

Fixed, waiting for final green.

bashtage added a commit to bashtage/pandas that referenced this issue Jan 16, 2015
Fix encoding so that StataReader can correctly read URLs

closes pandas-dev#9231