Google Finance DataReader returns columns with object type instead of float64 #8980

femtotrader · 2014-12-03T09:07:13Z

Hello,

Google Finance DataReader returns columns with object type instead of float64

In [112]: import pandas.io.data as web

In [113]: import datetime

In [114]: start = datetime.datetime(2010, 1, 1)

In [115]: end = datetime.datetime(2013, 1, 27)

In [116]: f=web.DataReader("F", 'google', start, end)

In [117]: f.dtypes
Out[117]:
Open       object
High       object
Low        object
Close     float64
Volume      int64
dtype: object


jreback · 2014-12-03T10:56:00Z

So #8792 validated this for Close. I didn't realize that Open, High and Low were still having issues (it's because of a minus sign that I think is messed up). I will mark this as a bug. A PR to convert all of this data automatically on import would be appreciated.

femtotrader · 2014-12-03T17:09:40Z

I understand that you would be interested in a PR, but I have taken a totally different approach to DataReader, and pandas/io/data.py would need a lot of changes, so I don't know whether it could be accepted.

I wrote a DataReaderBase class.

DataReaderGoogleFinanceDaily inherits from DataReaderBase, as do the other readers (DataReaderGoogleFinanceIntraday, DataReaderYahooFinanceDaily, DataReaderFRED, ...).

I used a factory pattern: a DataReaderFactory class in which every DataReader is "registered".

The API changed slightly, and I'm sure you will not like this.

It looks like this:

data = MyDataReader("GoogleFinanceDaily").get(symbol, start, end)

I can probably work on aligning it with the current DataReader API.
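For readers unfamiliar with the pattern, here is a minimal sketch of a registry-based factory of the kind described above. It is not the code from the repository: the `register` decorator, the `_READERS` dict and the stubbed `get` body are hypothetical and only illustrate how a name such as "GoogleFinanceDaily" could be mapped to a reader class.

```python
from datetime import datetime


class DataReaderBase:
    """Hypothetical base class: subclasses implement get()."""

    def get(self, symbol, start, end):
        raise NotImplementedError


# Hypothetical registry mapping a source name to its reader class.
_READERS = {}


def register(name):
    """Class decorator that records a reader class under `name`."""
    def decorator(cls):
        _READERS[name] = cls
        return cls
    return decorator


@register("GoogleFinanceDaily")
class DataReaderGoogleFinanceDaily(DataReaderBase):
    def get(self, symbol, start, end):
        # A real reader would fetch and parse remote data here;
        # this stub only echoes its arguments.
        return {"symbol": symbol, "start": start, "end": end}


def MyDataReader(name):
    """Factory: look up the registered class and instantiate it."""
    try:
        return _READERS[name]()
    except KeyError:
        raise ValueError("unknown data source: %r" % name)


data = MyDataReader("GoogleFinanceDaily").get(
    "F", datetime(2010, 1, 1), datetime(2013, 1, 27))
```

Adding a new source then only requires defining one more decorated subclass; the factory function itself does not change.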

With this design it is now very easy to use requests (and to cache queries); see #8713.

For this particular issue, I just defined this function:

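import numpy as np

def to_float(x):
    """Convert x to float; return NaN if it cannot be parsed."""
    try:
        return float(x)
    except (TypeError, ValueError):
        return np.nan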

and applied it to the price columns:

DATE_COL = 'Date'

OPEN_COL = 'Open'
HIGH_COL = 'High'
LOW_COL = 'Low'
CLOSE_COL = 'Close'

VOLUME_COL = 'Volume'

LST_PRICE_COLS = [OPEN_COL, HIGH_COL, LOW_COL, CLOSE_COL]

for col in LST_PRICE_COLS:
    df[col] = df[col].map(to_float)
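As a possible alternative to mapping a Python function over each value, more recent pandas versions provide `pd.to_numeric`, which can do the same conversion column by column. The sketch below is an assumption about how that could look, not what was used in this thread; the sample rows are made up, with "-" standing in for a missing price.

```python
import pandas as pd

# Made-up rows; "-" stands in for a missing price.
df = pd.DataFrame({
    "Open": ["9.10", "-", "9.30"],
    "High": ["9.20", "-", "9.45"],
    "Low": ["9.00", "-", "9.25"],
    "Close": [9.15, 9.24, 9.40],
})

price_cols = ["Open", "High", "Low", "Close"]
for col in price_cols:
    # errors="coerce" replaces entries that cannot be parsed with NaN.
    df[col] = pd.to_numeric(df[col], errors="coerce")
```

After the loop, all four price columns are float64 and the "-" entries have become NaN.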

I now need to work on the DataReaderFamaFrench, DataReaderYahooFinanceOptions and DataReaderWorldBank classes.

I'm comparing the results of my code with those of the pandas DataReader using unit tests (nosetests), which is why I'm catching some bugs in DataReader.

jorisvandenbossche · 2014-12-03T20:13:09Z

@femtotrader Have you seen #8961?
It seems like you would be interested, given your work and the ideas you outline above!

femtotrader · 2014-12-03T21:58:40Z

Thanks for the link to this issue.

I can send what I have done.

I created a datareaders directory (with __init__.py) containing base.py (which defines DataReaderBase), one file per source (Google Finance, Yahoo Finance, FRED, ...) and a tools.py (everything else).

As every DataReader object inherits from DataReaderBase, it's quite easy to decide whether to fetch data using requests (and therefore requests_cache) or urlopen.

femtotrader · 2014-12-04T05:18:38Z

Another way to fix this issue is to pass na_values='-' to pd.read_csv.
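A small sketch of the effect, assuming the response is comma-separated text with a header row. The CSV below is made up for illustration, and "-" marks a missing price.

```python
import io
import pandas as pd

# Made-up CSV in the shape of the daily data; "-" marks a missing price.
raw = (
    "Date,Open,High,Low,Close,Volume\n"
    "2012-07-31,9.30,9.35,9.20,9.24,100\n"
    "2012-08-01,-,-,-,9.24,0\n"
)

# Without na_values, "-" is kept as text, so Open/High/Low are not numeric.
default = pd.read_csv(io.StringIO(raw), index_col="Date", parse_dates=True)

# With na_values="-", the dashes are parsed as NaN and the columns are floats.
fixed = pd.read_csv(io.StringIO(raw), index_col="Date", parse_dates=True,
                    na_values="-")
```

Comparing `default.dtypes` with `fixed.dtypes` shows the difference: only the second call yields float64 for Open, High and Low.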

femtotrader · 2014-12-04T05:36:17Z

Code is here https://github.com/femtotrader/pandas_datareaders

davidastephens · 2014-12-06T20:25:07Z

The problem is that F has some missing data:

2012-08-01 - - - 9.24 0

I'll submit a PR.

jreback added Bug Data Reader labels Dec 3, 2014

jreback added this to the 0.16.0 milestone Dec 3, 2014

jreback added the Google I/O label Dec 3, 2014

davidastephens mentioned this issue Dec 6, 2014

BUG: Fix Datareader dtypes if there are missing values from Google. #9025

Merged

jreback modified the milestones: 0.15.2, 0.16.0 Dec 6, 2014

jreback closed this as completed in #9025 Dec 7, 2014
