Skip to content

ENH: Adding '.' as an na_value for FRED. #3469

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
4 commits merged into from
Apr 27, 2013
Merged

Conversation

TomAugspurger
Copy link
Contributor

The St. Louis Fed's FRED uses '.' as an NaN marker. I don't think the user has any way to specify the na_value via DataReader, so this just hard codes '.' as an na_value in the fred data-fetching function.

Before:

In [4]: from pandas.io.data import DataReader

In [5]: df = DataReader('DFII5', data_source="fred")

In [6]: df.head()
Out[6]:
           DFII5
DATE
2010-01-01     .
2010-01-04  0.52
2010-01-05  0.44
2010-01-06  0.44
2010-01-07  0.43

After

In [2]: from pandas.io.data import DataReader

In [3]: df = DataReader('DFII5', data_source="fred")

In [4]: df.head()
Out[4]:
            DFII5
DATE
2010-01-01    NaN
2010-01-04   0.52
2010-01-05   0.44
2010-01-06   0.44
2010-01-07   0.43

In [5]: pd.__version__
Out[5]: '0.11.0rc1'

Hopefully I've done everything correctly with the pull request. Let me know if anything needs fixing.

@ghost
Copy link

ghost commented Apr 27, 2013

I see you haven't got travis going, is this covered by the test suite? if not, please add a test to
monitor this in the future, after that I have no problem merging.

thanks.

@TomAugspurger
Copy link
Contributor Author

Thanks, I'll get a test going added and I'll try to get travis set up also. I've never used it before.

One question: I could be missing it, but I'm not sure the fred code is covered by any tests. I greped through the pandas/io/tests/ directory and I didn't see any references to fred. Should I start a new file similar to the ones for yahoo and wb? I can add some more tests while I'm in there too.

@ghost
Copy link

ghost commented Apr 27, 2013

Yeah, that's not a part of the code I'm familiar with, it's possible. In which case
you would be doing pandas an even greater service by throwing together some tests.

@ghost
Copy link

ghost commented Apr 27, 2013

CONTRIBUTING.MD has a link to the travis instructions, and other stuff if you
have the patience to read through it.

remember to tag your tests @network(+@slow?)

@ghost
Copy link

ghost commented Apr 27, 2013

related #3413

@TomAugspurger
Copy link
Contributor Author

OK I think Travis is testing my branch now. Assuming that passes, do you want me to try to stuff both my changes (handling the NaNs and adding the tests) into a single PR? Or should I do them separately?

@ghost
Copy link

ghost commented Apr 27, 2013

Same PR makes sense, feel free to add as many commits as you like to the PR branch and
push, just make it clear whether it's ready for merging or still WIP.

If your git-fu is good, you can rewrite and force push and all that in PR branches, that's
acceptable, if not, just add commits and push and I or another dev will squash things
down for you before merging, that's not a problem.

Thanks.

@TomAugspurger
Copy link
Contributor Author

I added a check for if the name you gave is a valid FRED series name, and a test that goes along with it. I'm worried that it will break if the St. Louis Fed decides to change their response to an invalid name. But the way I coded it, it should only occur if we already failed to return a dataframe. It's basically just a more informative error message.

I'd like to take a crack at #3413, but I might not have time until later this week. It should be something as simple as changing the fred branch of the main DataReader function to

names = list(name)  # if string is passed for a single series
return pd.concat([get_data_fred(name, start, end) for name in names])

I may let someone else handle the history rewriting. Last time I tried that over on statsmodels, I think I gave Joseph a headache. Travis is testing now. I'll add the final bit and mark it as ready to merge when it's done.

@TomAugspurger
Copy link
Contributor Author

OK This should be ready to go if you think it looks acceptable.

Thanks for the help y-p.

@ghost
Copy link

ghost commented Apr 27, 2013

One last thing, could you add a note to RELEASE.rst, to let users know you "fixed it ™"?

@TomAugspurger
Copy link
Contributor Author

Will do. I've just noticed that I'm on 0.11.0rc1 for my master, so I've got an out of date RELEASE.rst; CONTRIBUTING.md says "Don't merge upstream into a branch you're going to submit as a PR". Am I ok doing

git checkout master
git rebase upstream/master
git checkout fred_na  # my branch
git rebase master

And then adding the changes to RELEASE.rst and pushing again? Sorry.

@ghost
Copy link

ghost commented Apr 27, 2013

Exactly right. some people have a hard time grokking rebase. but that's the way
you should it. You can also just leave it in, and let us handle the conflicts if you don't
feel confident.

@TomAugspurger
Copy link
Contributor Author

Well I think I messed something up. After fetching and rebaseing my master onto upstream/master, when I did git checkout fred_na and get rebase master things seemed to work correctly, but now I'm told that

# Your branch and 'origin/fred_na' have diverged,
# and have 121 and 3 different commits each, respectively.
#   (use "git pull" to merge the remote branch into yours)

So I did git pull but now it looks like all of those other commits are going to be part of this PR when I push, which we don't want right?

@ghost
Copy link

ghost commented Apr 27, 2013

first, don't panic.

Most important git command in existence:

git reflog

Coupled with

git reset --hard commithash

@ghost
Copy link

ghost commented Apr 27, 2013

that error message is actually fine. origin has the previous state based on 0.11.0rc1,
your local branch has been rebased. so they have diverged.
your local is correct, and the remote needs to be clobbered.

write down the commit hash that's "safe" to go back to: a80acc4
then clobber the GH branch with your updated local branch (only allowed
on your own PR branches, not generally) with:

git push origin *branchname* --force

TomAugspurger added 4 commits April 27, 2013 15:03
This a very minimal, but it wasn't previously covered.
I've added a new to contain the tests and more or less copied code
from the test_yahoo file.
@TomAugspurger
Copy link
Contributor Author

I panicked :)

Seems to look ok now I think, once Travis is done?

ghost pushed a commit that referenced this pull request Apr 27, 2013
ENH: Adding '.' as an na_value for FRED.
@ghost ghost merged commit a254d35 into pandas-dev:master Apr 27, 2013
@ghost
Copy link

ghost commented Apr 27, 2013

Everything passes locally on my box. in it goes...

The log shows this is your first PR to pandas. good job! 👍

@TomAugspurger TomAugspurger deleted the fred_na branch November 3, 2016 12:37
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant