
Displaying DataFrame with values from range [1e-7, 5e-7] shows 0 in some situations. #9764

Closed
tomazberisa opened this issue Mar 31, 2015 · 2 comments · Fixed by #9806
Labels
Output-Formatting __repr__ of pandas objects, to_string

@tomazberisa
Contributor

Hi,

For illustration purposes, I'll use the following dataset:

In [1]: d=pd.DataFrame({'col1':[9.999e-8, 1e-7, 1.0001e-7, 2e-7, 4.999e-7, 5e-7, 5.0001e-7, 6e-7]})

In [2]: d
Out[2]:
           col1
0  9.999000e-08
1  1.000000e-07
2  1.000100e-07
3  2.000000e-07
4  4.999000e-07
5  5.000000e-07
6  5.000100e-07
7  6.000000e-07

I've noticed the following behavior (in pandas 0.16 and 0.15.2):

  1. When values from range [1e-7, 5e-7] are displayed along with values that are less than 1e-7, the output is OK:
In [3]: d[0:6]
Out[3]:
           col1
0  9.999000e-08
1  1.000000e-07
2  1.000100e-07
3  2.000000e-07
4  4.999000e-07
5  5.000000e-07
  2. When only values from that range are displayed, the output is 0:
In [4]: d[1:6]
Out[4]:
   col1
1     0
2     0
3     0
4     0
5     0
  3. When values from that range are displayed along with values greater than 5e-7, the output for them is 0.000000:
In [5]: d[1:8]
Out[5]:
       col1
1  0.000000
2  0.000000
3  0.000000
4  0.000000
5  0.000000
6  0.000001
7  0.000001

I'm assuming it's a rounding / output formatting error.
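
The three outputs above are consistent with one simple rule, sketched below as a toy model. This is an assumption for illustration, not pandas' actual code: `toy_format` is a hypothetical helper, and the rule it encodes (switch to scientific notation only when some nonzero value is below 10**-precision, otherwise print six decimals in fixed notation) is inferred from the outputs in this report rather than confirmed anywhere in this thread.

```python
def toy_format(values, precision=7):
    """Toy model of the observed display behavior; NOT pandas' real code."""
    # Six digits after the decimal point, as seen in the outputs above.
    decimals = precision - 1
    # Assumed rule: only a nonzero value smaller than 10**-precision
    # triggers scientific notation for the whole column.
    threshold = float("1e-%d" % precision)
    if any(0 < abs(v) < threshold for v in values):
        return ["%.*e" % (decimals, v) for v in values]
    # Otherwise fixed notation is used, which rounds small values to zero.
    return ["%.*f" % (decimals, v) for v in values]


print(toy_format([9.999e-8, 2e-7]))  # ['9.999000e-08', '2.000000e-07']
print(toy_format([2e-7, 4.999e-7]))  # ['0.000000', '0.000000']
print(toy_format([2e-7, 6e-7]))      # ['0.000000', '0.000001']
```

The toy model does not trim trailing zeros, so it prints 0.000000 where case 2 above shows a bare 0. The point is only that values at or above 1e-7 never trigger scientific notation under this rule, yet are too small to survive rounding to six decimals.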

Also:

In [6]: pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.4.2.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-431.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.16.0
nose: 1.3.4
Cython: 0.21.2
numpy: 1.9.2
scipy: 0.14.0
statsmodels: 0.6.1
IPython: 2.3.1
sphinx: None
patsy: 0.3.0
dateutil: 2.4.1
pytz: 2015.2
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.4.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
@jreback
Contributor

jreback commented Apr 2, 2015

could be a bug or just an odd setting of options: http://pandas.pydata.org/pandas-docs/stable/options.html#list-of-options

want to see what is affecting the output here?

@jreback jreback added the Output-Formatting __repr__ of pandas objects, to_string label Apr 2, 2015
@jreback jreback added this to the Next Major Release milestone Apr 2, 2015
@tomazberisa
Contributor Author

I've looked at my options and all that are related to formatting floats are set to defaults (display.chop_threshold is None, display.float_format is None, display.precision is 7).
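
For anyone who wants to repeat that check, the three options can be read with `pd.get_option`. This is a minimal sketch: `float_display_options` and `FLOAT_OPTIONS` are hypothetical names introduced here, and the default for `display.precision` may differ between pandas versions, so the value printed on another install need not be 7.

```python
import pandas as pd

# The float-related display options mentioned in this thread.
FLOAT_OPTIONS = [
    "display.chop_threshold",
    "display.float_format",
    "display.precision",
]


def float_display_options():
    """Return the current value of each float-related display option."""
    return {name: pd.get_option(name) for name in FLOAT_OPTIONS}


for name, value in float_display_options().items():
    print(name, "=", value)
```

If `display.chop_threshold` and `display.float_format` both come back as None, no user setting is chopping or reformatting the values, which points at the default formatting path.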

It seems to be an issue in pandas/core/format.py - working on a fix.
