One file per datareader #58


Closed
femtotrader opened this issue Aug 21, 2015 · 7 comments

@femtotrader (Contributor)

Hello,

data.py is a very large file; too large, in my opinion.

I think we should have one file per datareader.

Moreover, this will help with #48.

On my fork I have created a branch where each datareader lives in its own file.

```
$ nosetests -s -v
```

shows

```
Ran 58 tests in 83.806s
FAILED (SKIP=7, errors=10, failures=3)
```

which is exactly the same result as on the current master branch.

I will send a PR.

Kind regards

@bashtage (Contributor)

That splitting seems too fine-grained, IMO. Why not split by data vendor only?

@bashtage (Contributor)

I think the utility functions, which means anything that isn't vendor-specific, should also be collected in a shared module, rather than having one file per function.

@femtotrader (Contributor, Author)

What utility functions would you put in "shared"?
Personally, I would move a function to "shared" only if it is actually shared.

I used this fine-grained split because it will make it easier for me to add caching (see #48), the way I did in https://github.com/femtotrader/pandas_datareaders_unofficial .

Moreover, it will be easier to see what is done and what still needs to be done.

@femtotrader (Contributor, Author)

A vendor can decide to deprecate one API but not the others.
Is Yahoo Finance Components deprecated?
With one file per reader it will, IMHO, be easier to tell, and it could also help avoid spaghetti code.

@bashtage (Contributor)

I've never come across a package that has one function per file (or anything close to it). With modern editors it doesn't matter how large a file is, since you can always use code folding to hide anything you don't want to see.

In terms of caching, is there some reason not to use a django-style @cache decorator that assumes the exact same set of inputs produces the same output? That way only the final result is cached, and the reader itself doesn't need to be modified.
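A minimal sketch of the decorator idea described above, assuming simple in-memory memoization keyed on the call arguments. The names `cache_reader` and `get_data` are purely illustrative and are not part of pandas-datareader or django:

```python
import functools

def cache_reader(func):
    """Memoize a reader: identical arguments return the stored result,
    so the wrapped reader function needs no modification."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Build a hashable key from positional and keyword arguments.
        key = (args, tuple(sorted(kwargs.items())))
        if key not in store:
            store[key] = func(*args, **kwargs)
        return store[key]
    return wrapper

@cache_reader
def get_data(symbol, start, end):
    # Stand-in for a real (slow, network-bound) datareader call.
    return {"symbol": symbol, "start": start, "end": end}

first = get_data("AAPL", "2015-01-01", "2015-08-01")
second = get_data("AAPL", "2015-01-01", "2015-08-01")
assert first is second  # second call is served from the cache
```

In real code, `functools.lru_cache` from the standard library provides the same behavior without a hand-rolled decorator, though a custom decorator leaves room for on-disk caching as in #48.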

@davidastephens (Member)

I'm fine with splitting by API. I agree with @bashtage, though: I don't think we need one file per shared function; just one shared file works.

Is there any benefit to using subpackages by vendor?
E.g. pandas_datareader/yahoo/options.py
E.g. pandas_datareader/yahoo/actions.py

@bashtage (Contributor)

The subpackage suggestion from @davidastephens looks reasonable and seems closer to canonical Python packaging.

So something like pdr/vendor/datatype, plus pdr/common.py, where common holds anything that isn't vendor-specific.
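Under that proposal, the package tree might look something like this (the vendor and module names are purely illustrative, not the final layout):

```
pandas_datareader/
    __init__.py
    common.py          # shared, non-vendor-specific helpers
    yahoo/
        __init__.py
        options.py
        actions.py
    google/
        __init__.py
        daily.py
```

Each vendor subpackage can then be deprecated or extended independently, which also addresses the per-API deprecation concern raised above.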
