DataReader with google source returns DataFrame with incorrect index name #8967

femtotrader · 2014-12-02T14:53:07Z

Hello,

import pandas.io.data as web
import datetime

symbol = "AAPL"
end = datetime.datetime.now()
start = end - datetime.timedelta(days=10)
df = web.DataReader(symbol, 'google', start, end)

df.index.name

returns '\xef\xbb\xbfDate', but it should be just 'Date'.

The \xef\xbb\xbf bytes are the UTF-8 byte order mark (BOM); they are invisible when printed!
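
A minimal sketch of how such a prefix can end up in the first column name, assuming the response body begins with the UTF-8 byte order mark. The CSV payload below is made up for illustration and is not real Google Finance output. Under Python 3 the marker appears as '\ufeff' after decoding rather than as the three raw bytes; decoding with the "utf-8-sig" codec drops it.

```python
import codecs
import csv
import io

# Hypothetical payload: a small CSV body prefixed with the UTF-8 BOM.
raw = codecs.BOM_UTF8 + b"Date,Close\n2014-12-01,100.0\n"

# Plain UTF-8 decoding keeps the BOM as the character U+FEFF,
# so it becomes part of the first header field.
header_with_bom = next(csv.reader(io.StringIO(raw.decode("utf-8"))))

# The "utf-8-sig" codec removes a leading BOM if one is present.
header_clean = next(csv.reader(io.StringIO(raw.decode("utf-8-sig"))))

print(repr(header_with_bom[0]))  # '\ufeffDate'
print(repr(header_clean[0]))     # 'Date'
```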

A quick and dirty solution is to change

def _retry_read_url(url, retry_count, pause, name):
    for _ in range(retry_count):
        time.sleep(pause)

        # kludge to close the socket ASAP
        try:
            with urlopen(url) as resp:
                lines = resp.read()
        except _network_error_classes:
            pass
        else:
            rs = read_csv(StringIO(bytes_to_str(lines)), index_col=0,
                          parse_dates=True)[::-1]
            # Yahoo! Finance sometimes does this awesome thing where they
            # return 2 rows for the most recent business day
            if len(rs) > 2 and rs.index[-1] == rs.index[-2]:  # pragma: no cover
                rs = rs[:-1]
            return rs

    raise IOError("after %d tries, %s did not "
                  "return a 200 for url %r" % (retry_count, name, url))

to

def _retry_read_url(url, retry_count, pause, name):
    for _ in range(retry_count):
        time.sleep(pause)

        # kludge to close the socket ASAP
        try:
            with urlopen(url) as resp:
                lines = resp.read()
        except _network_error_classes:
            pass
        else:
            rs = read_csv(StringIO(bytes_to_str(lines)), index_col=0,
                          parse_dates=True)[::-1]
            rs.index.name = 'Date'
            # Yahoo! Finance sometimes does this awesome thing where they
            # return 2 rows for the most recent business day
            if len(rs) > 2 and rs.index[-1] == rs.index[-2]:  # pragma: no cover
                rs = rs[:-1]
            return rs

    raise IOError("after %d tries, %s did not "
                  "return a 200 for url %r" % (retry_count, name, url))

So adding rs.index.name = 'Date' fixes the problem, but it could have side effects for other data sources.
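
One way to limit that side effect, sketched here as an assumption rather than as the fix that was eventually merged, is to strip a leading BOM only when it is present instead of hard-coding the name. The helper name strip_bom is hypothetical and not part of pandas.

```python
import codecs

# The BOM as a single decoded character (U+FEFF).
_BOM_TEXT = "\ufeff"
# The three raw BOM bytes read as individual characters, which is how
# '\xef\xbb\xbf' looks when the bytes were never decoded as UTF-8.
_BOM_RAW = codecs.BOM_UTF8.decode("latin-1")


def strip_bom(name):
    """Return name without a leading BOM; anything else passes through unchanged."""
    if not isinstance(name, str):
        return name
    for marker in (_BOM_TEXT, _BOM_RAW):
        if name.startswith(marker):
            return name[len(marker):]
    return name


print(strip_bom("\ufeffDate"))        # Date
print(strip_bom("\xef\xbb\xbfDate"))  # Date
print(strip_bom("Timestamp"))         # Timestamp
```

With a helper like this, the added line could read rs.index.name = strip_bom(rs.index.name), so a source whose index is not called 'Date' keeps its own name.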

Kind regards


jreback · 2015-03-07T23:17:36Z

closed by #9026

femtotrader changed the title ~~DataReader with google source return DataFrame with incorrect index name~~ DataReader with google source returns DataFrame with incorrect index name Dec 2, 2014

jreback added the Bug, Data Reader, and Unicode labels Dec 2, 2014

jreback added this to the 0.16.0 milestone Dec 2, 2014

davidastephens mentioned this issue Dec 6, 2014

BUG: Datareader index name shows unicode characters #9026

Closed

jreback modified the milestones: 0.15.2, 0.16.0 Dec 6, 2014

jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015

jreback closed this as completed Mar 7, 2015
