BUG: display.precision option seems off-by-one #10451

rosnfeld · 2015-06-27T12:39:39Z

I may very well be wrong on this, given how common an option it seems, but I am surprised that the "display.precision" option seems to limit the digits after the decimal to one less than specified.

In [1]: x = pd.Series(np.random.randn(5))

In [2]: x
Out[2]: 
0   -0.163960
1    1.016273
2    0.861317
3   -0.521916
4   -0.069322
dtype: float64

In [3]: pd.set_option('display.precision', 3)

In [4]: x
Out[4]: 
0   -0.16
1    1.02
2    0.86
3   -0.52
4   -0.07
dtype: float64

I can't see where in the code this is happening. At first glance, this looks like what the code is doing:

In [13]: fmt_str = '%% .%dg' % 3

In [14]: fmt_str % x[0]
Out[14]: '-0.164'

but clearly something else is happening.

numpy's precision seems fine/meets my expectations:

In [10]: np.set_printoptions(precision=3)

In [11]: np.random.randn(5)
Out[11]: array([ 0.569, -2.638,  0.707,  0.675,  1.191])

So is this a bug? (if so it's been around for a long time) Or are my expectations off?

This was tested on current pandas master (as of writing) with numpy 1.9.2 and python 3.4.


jreback · 2015-06-27T14:35:43Z

So this is done here: https://github.com/pydata/pandas/blob/master/pandas/core/format.py#L2024

Not really sure why it's using self.digits-1. You can try changing it and see what breaks; maybe you can then divine why it's doing this. It should be self.digits.
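To illustrate the effect being described, here is a minimal sketch using plain Python format strings. It is not the pandas formatter: `format_fixed` and its `off_by_one` flag are hypothetical names, and the only assumption is that the number of decimals is derived as `digits - 1`.

```python
def format_fixed(value, digits, off_by_one=True):
    """Hypothetical stand-in for the fixed-point formatter (not pandas code)."""
    # With off_by_one=True the decimals are digits - 1, mimicking the reported behaviour.
    decimals = digits - 1 if off_by_one else digits
    return '%.*f' % (decimals, value)


values = [-0.163960, 1.016273, 0.861317, -0.521916, -0.069322]

# precision 3 with the "- 1": two decimals, as in Out[4] of the report
print([format_fixed(v, 3) for v in values])

# precision 3 without the "- 1": three decimals
print([format_fixed(v, 3, off_by_one=False) for v in values])
```

The first list reproduces the two-decimal output from the report; the second shows what using the precision directly would give.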

rosnfeld · 2015-06-27T18:10:43Z

On a bit of digging I think it was just a convention, see Wes's comment on #726 - "I also changed the default number of decimal places (plus the first digit to the left of the decimal point) to 7, which is really just a suggestion as in R".

Pandas Options and Settings docs say "display.precision" means "Floating point output precision (number of significant digits)", so I see that accounting for a pre-decimal digit was probably intended, like in scientific notation. But the code is not really doing scientific notation, and is out of step with the numpy convention and standard C/python format strings.

I would vote to change it. I ran the test suite after changing it and the only errors were in test_format.py (3 of them), nothing too surprising, nothing too big to clean up. Not sure if this would mess up a chunk of the pandas userbase that had become accustomed to this behaviour, though.

jreback · 2015-06-28T12:04:07Z

I agree - want to prepare a pull request to change?

put a note in the API section of whatsnew as well

rosnfeld · 2015-06-28T12:13:56Z

Sure, I can do that. I'll make the docs a bit clearer as well.

kawochen · 2015-06-28T12:21:59Z

"[N]umber of significant digits" seems problematic. It sounds like significant figures, but it's not doing that.

rosnfeld · 2015-06-28T12:32:38Z

Right, I think we just give up on claiming that. Interestingly numpy makes a similar "digits of precision" claim in http://docs.scipy.org/doc/numpy/reference/generated/numpy.set_printoptions.html but basically has the same behaviour as pandas. It's just doing places after the decimal, which is what I think most people actually want to see. It's hard to scan/quickly interpret a table of numbers if they all have different exponents.
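The distinction between significant figures and places after the decimal can be seen with standard format strings alone. A small sketch, where `sig_figs` and `decimals` are hypothetical helper names introduced only for illustration:

```python
def sig_figs(value, n):
    # %g treats the precision as a count of significant digits
    return '%.*g' % (n, value)


def decimals(value, n):
    # %f treats the precision as a count of digits after the decimal point
    return '%.*f' % (n, value)


for v in (0.163960, 12.3456, 1234.5678):
    print(sig_figs(v, 3), decimals(v, 3))
```

With `%g` the number of digits after the decimal point varies with magnitude, and large values switch to exponent form, which is what makes a column of such numbers harder to scan.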

rosnfeld · 2015-07-02T21:46:07Z

Should we also change the default for "display.precision" to be 6 instead of 7, since this change will effectively increase the number of digits after the decimal place? Otherwise everyone's code will start outputting one more decimal.

rosnfeld · 2015-07-02T22:39:27Z

I should note that the current code does output "significant figures" when using scientific notation, which seems to be triggered by "large" and "small" values (a comment admits that the chosen thresholds are arbitrary, but "large" is currently 1e8, and "small" uses 10 ** (-self.digits + 1) ). The current master code handles this example from test_format.py as follows:

In [2]: pd.set_option('display.precision', 3)

In [3]: pd.DataFrame({'x': [0, 0.25, 3456.000, 12e+45, 1.64e+6, 1.7e+8, 1.253456, np.pi, -1e6]})
Out[3]: 
          x
0  0.00e+00
1  2.50e-01
2  3.46e+03
3  1.20e+46
4  1.64e+06
5  1.70e+08
6  1.25e+00
7  3.14e+00
8 -1.00e+06
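The behaviour above can be approximated with a rough sketch in plain Python. This is not the pandas implementation: `format_column` is a hypothetical name, the thresholds are the ones quoted above, and the use of a strict comparison for "large" and the omission of any other conditions are assumptions.

```python
import math


def format_column(values, digits, large=1e8):
    """Rough approximation of the described behaviour (not pandas code)."""
    small = 10 ** (-digits + 1)
    # Switch the whole column to scientific notation if any value is
    # "large" or is a non-zero "small" value.
    use_sci = any(abs(v) > large or 0 < abs(v) < small for v in values)
    # Both branches use digits - 1 decimals, so the scientific branch
    # ends up with exactly `digits` significant figures.
    fmt = ('%.{}e' if use_sci else '%.{}f').format(digits - 1)
    return [fmt % v for v in values]


column = [0, 0.25, 3456.000, 12e+45, 1.64e+6, 1.7e+8, 1.253456, math.pi, -1e6]
print(format_column(column, 3))
print(format_column([0, 0.1, 0.12, 0.123], 3))
```

Under these assumptions the first call gives three significant figures in exponent form, and the second stays in fixed-point form with two decimals.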

numpy's precision argument handles it differently:

In [5]: np.set_printoptions(precision=3)

In [6]: np.array([0, 0.25, 3456.000, 12e+45, 1.64e+6, 1.7e+8, 1.253456, np.pi, -1e6])
Out[6]: 
array([  0.000e+00,   2.500e-01,   3.456e+03,   1.200e+46,   1.640e+06,
         1.700e+08,   1.253e+00,   3.142e+00,  -1.000e+06])

So there is a discrepancy between numpy and pandas, but pandas is using "significant figures". Check this example out, also:

In [10]: pd.DataFrame({'x': [0, 0.1, 0.12, 0.123]})
Out[10]: 
      x
0  0.00
1  0.10
2  0.12
3  0.12

In [11]: np.array([0, 0.1, 0.12, 0.123])
Out[11]: array([ 0.   ,  0.1  ,  0.12 ,  0.123])

Here numpy hides trailing zeroes. I can see an argument for it, but I don't think we want that. But pandas also isn't showing "significant figures" since it's not using scientific notation.

Python's %.3f and %.3e format strings do the same as numpy: both end up using 3 places after the decimal, so with scientific notation you get 4 significant digits.
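A quick check of the standard format strings, as a sketch with an arbitrary sample value. Note that the precision goes after a dot; without the dot the number is a minimum field width and the default precision of 6 applies.

```python
value = 3456.0

fixed = '%.3f' % value       # three places after the decimal
sci = '%.3e' % value         # three places after the decimal, so four significant digits
width_only = '%3f' % value   # no dot: 3 is a minimum width, precision defaults to 6

print(fixed)
print(sci)
print(width_only)
```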

It seems pandas is the odd one out.

I just want to confirm what we want to do here. It's possible to leave the scientific notation behaviour alone, if so desired, rather than adding a digit to it.

rosnfeld · 2015-07-05T23:28:24Z

I made a call on these questions in #10513 - maybe best to continue the conversation there.

jreback added Bug Output-Formatting __repr__ of pandas objects, to_string labels Jun 27, 2015

jreback added this to the 0.17.0 milestone Jun 27, 2015

rosnfeld mentioned this issue Jul 5, 2015

BUG: display.precision options seems off-by-one (GH10451) #10513

Merged

jreback closed this as completed in #10513 Aug 2, 2015
