Yahoo Prices precision - pandas-datareader-0.9.0 #844

huvar opened this issue Jan 12, 2021 · 3 comments

Comments


huvar commented Jan 12, 2021

Hello

It seems that Yahoo is returning prices with more precision than it should. On the webpage they are displayed with two decimals, but in the HTML we can see extra decimals. The issue is that these extra decimals seem to be random.
Is there some way to fix it? Is it something that can be rounded safely? Is there a better free data source?

https://finance.yahoo.com/quote/GOOGL/history

from pandas_datareader import data
start, end = '2018-01-11', '2021-01-11'
df = data.DataReader('GOOGL', start=start, end=end, data_source='yahoo')['Adj Close']

df
Date
2018-01-11 1112.050049
2018-01-12 1130.650024
2018-01-16 1130.699951
2018-01-17 1139.099976
2018-01-18 1135.969971
...
2021-01-05 1740.050049
2021-01-06 1722.880005
2021-01-07 1774.339966
2021-01-08 1797.829956
2021-01-11 1756.290039
Name: Adj Close, Length: 755, dtype: float64


ikramersh commented Oct 10, 2021

This is caused by the way floating point numbers are stored in pandas dataframes.

There is a good discussion of the underlying cause in https://github.com/pandas-dev/pandas/issues/16452. It explains that this is a presentation issue and not an accuracy issue, which means the numbers can safely be rounded. The issue is not caused by the data source but by the conversion into floating point numbers within pandas dataframes.

A simple solution is to set how pandas displays floating point numbers. For example, to display four decimal places, use:

import pandas as pd
pd.options.display.float_format = '{:.4f}'.format
df = data.DataReader('GOOGL', start=start, end=end, data_source='yahoo')['Adj Close']

Note:

  1. This will apply to the display of all floating point numbers in your pandas dataframes.
  2. It does not affect the stored floating point values in the dataframe memory.
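The two notes above can be illustrated with a small sketch. It assumes a hand-built Series holding three of the values quoted earlier in this thread (so no network call is needed), and it uses `pd.option_context` to scope the display setting instead of changing it globally. The variable names are illustrative only.

```python
import pandas as pd

# Hypothetical sample: three "Adj Close" values copied from the output above.
prices = pd.Series(
    [1112.050049, 1130.650024, 1756.290039],
    index=pd.to_datetime(["2018-01-11", "2018-01-12", "2021-01-11"]),
    name="Adj Close",
)

# Display-only change, limited to this block so other output is unaffected.
with pd.option_context("display.float_format", "{:.2f}".format):
    shown = prices.to_string()
print(shown)

# The stored value still carries the extra decimals.
print(prices.iloc[0] == 1112.050049)

# Rounding, by contrast, returns a new Series with changed stored values.
rounded = prices.round(2)
print(rounded.tolist())
```

If the goal is to compare or export prices at two decimals, `round(2)` is probably the more appropriate of the two, since the display option only changes what is printed.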


huvar commented Oct 10, 2021

@ikramersh

Thanks for the data that shows the issue coming from the source.

The use of floating point numbers creates the extra decimals, so Yahoo is probably using floating point storage to process the original prices before serving them up.
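One way to check this hypothesis, as a sketch only: assume the source keeps each two-decimal price as a 32-bit float and the value is later widened to 64 bits. The helper `via_float32` below is a hypothetical name; it uses only the standard-library `struct` module to perform that round trip on three two-decimal prices corresponding to rows in the original output.

```python
import struct


def via_float32(x: float) -> float:
    """Round x to the nearest 32-bit float, then widen it back to a 64-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Two-decimal prices corresponding to three rows of the output in the first comment.
for price in (1112.05, 1130.65, 1756.29):
    widened = via_float32(price)
    # Show the original, the widened value at six decimals, and the value rounded back.
    print(f"{price:.2f} -> {widened:.6f} -> {round(widened, 2):.2f}")
```

Under that assumption the "random" trailing digits are deterministic: they are the difference between the two-decimal price and the nearest 32-bit float, and rounding to two decimals recovers the displayed price. This does not prove how Yahoo stores its data; it only shows that the observed digits are consistent with 32-bit storage.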
