BUG: Fix FloatingArray output formatting #36800

Merged: 28 commits, Oct 16, 2020
Commits
ca6c979
BUG: Fix FloatingArray output formatting
dsaxton Oct 1, 2020
adcc26c
fixup
dsaxton Oct 1, 2020
4663e10
fixup
dsaxton Oct 1, 2020
406ed4d
CLN: Clean float / complex string formatting
dsaxton Oct 2, 2020
a1228c9
Fix
dsaxton Oct 2, 2020
04f4be8
Merge branch 'fix-is-numeric-helper' into nullable-float-array-string…
dsaxton Oct 2, 2020
77b88fc
Test
dsaxton Oct 2, 2020
8743b5b
Test
dsaxton Oct 2, 2020
a235611
Fixture
dsaxton Oct 2, 2020
d775c33
Update format.py
dsaxton Oct 11, 2020
00c15ba
Reapply
dsaxton Oct 11, 2020
06ef337
Update
dsaxton Oct 11, 2020
aedd7ab
Merge remote-tracking branch 'upstream/master' into nullable-float-ar…
dsaxton Oct 11, 2020
32264e9
Hack
dsaxton Oct 12, 2020
ba3ac46
Simplify
dsaxton Oct 12, 2020
f8a3216
Fix regex
dsaxton Oct 12, 2020
546b44e
Fix doc
dsaxton Oct 12, 2020
99d6905
Fix
dsaxton Oct 12, 2020
319529f
Again
dsaxton Oct 12, 2020
f65bcfe
Merge remote-tracking branch 'upstream/master' into nullable-float-ar…
dsaxton Oct 14, 2020
7a477dc
Fix merge
dsaxton Oct 14, 2020
13567f8
Merge remote-tracking branch 'upstream/master' into nullable-float-ar…
dsaxton Oct 14, 2020
253c94e
Merge remote-tracking branch 'upstream/master' into nullable-float-ar…
dsaxton Oct 14, 2020
2136494
Merge remote-tracking branch 'upstream/master' into nullable-float-ar…
dsaxton Oct 15, 2020
7d4452a
Type and doc
dsaxton Oct 15, 2020
80b0103
Oops
dsaxton Oct 15, 2020
faa053f
Merge remote-tracking branch 'upstream/master' into nullable-float-ar…
dsaxton Oct 15, 2020
8fd5a9f
Add tests
dsaxton Oct 15, 2020
6 changes: 3 additions & 3 deletions pandas/core/frame.py
@@ -2656,7 +2656,7 @@ def memory_usage(self, index=True, deep=False) -> Series:
         Examples
         --------
         >>> dtypes = ['int64', 'float64', 'complex128', 'object', 'bool']
-        >>> data = dict([(t, np.ones(shape=5000).astype(t))
+        >>> data = dict([(t, np.ones(shape=5000, dtype=int).astype(t))
         ...              for t in dtypes])
         >>> df = pd.DataFrame(data)
         >>> df.head()
@@ -2691,7 +2691,7 @@ def memory_usage(self, index=True, deep=False) -> Series:
         int64          40000
         float64        40000
         complex128     80000
-        object        160000
+        object        180000
         bool            5000
         dtype: int64

@@ -2790,7 +2790,7 @@ def transpose(self, *args, copy: bool = False) -> DataFrame:
         >>> df2_transposed
                       0     1
         name      Alice   Bob
-        score       9.5     8
+        score       9.5   8.0
         employed  False  True
         kids          0     0

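The move from 160000 to 180000 in the `object` row is consistent with the `dtype=int` change in the first hunk. A rough sketch of the arithmetic, assuming a 64-bit CPython build and assuming the deep figure is one pointer per row plus `sys.getsizeof` of each boxed element: `np.ones(shape=5000)` defaults to float64, so `.astype(object)` holds Python floats, whereas `dtype=int` yields Python ints, which are slightly larger. The helper name `estimate_deep_object_bytes` is hypothetical and is not part of pandas.

```python
import struct
import sys

# Bytes per object reference (8 on 64-bit builds).
POINTER_SIZE = struct.calcsize("P")


def estimate_deep_object_bytes(element, n_rows):
    """Hypothetical estimate: one pointer per row plus the size of each boxed element."""
    return n_rows * (POINTER_SIZE + sys.getsizeof(element))


# Object column built from float64 ones (old docstring) vs. int ones (new docstring).
old_estimate = estimate_deep_object_bytes(1.0, 5000)
new_estimate = estimate_deep_object_bytes(1, 5000)
print(old_estimate, new_estimate)
```

On a typical 64-bit CPython build, where a float takes 24 bytes and a small int takes 28, the two estimates line up with the old and new values in the docstring.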
36 changes: 21 additions & 15 deletions pandas/io/formats/format.py
@@ -1311,7 +1311,7 @@ def _format_strings(self) -> List[str]:
             float_format = get_option("display.float_format")
             if float_format is None:
                 precision = get_option("display.precision")
-                float_format = lambda x: f"{x: .{precision:d}g}"
+                float_format = lambda x: f"{x: .{precision:d}f}"
         else:
             float_format = self.float_format
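
The effect of switching the format code from `g` to `f` can be seen with plain Python string formatting. This is a small sketch, not pandas code; `precision = 6` is an assumed stand-in for the value of the `display.precision` option.

```python
# Assumed stand-in for get_option("display.precision").
precision = 6

# General format: drops trailing zeros, so 8.0 is rendered without a decimal part.
old_format = lambda x: f"{x: .{precision:d}g}"
# Fixed-point format: always emits `precision` digits after the decimal point.
new_format = lambda x: f"{x: .{precision:d}f}"

for value in (8.0, 9.5):
    print(repr(old_format(value)), repr(new_format(value)))
```

With `g`, `8.0` is rendered as `' 8'`, which loses the decimal point. With `f` it becomes `' 8.000000'`, which keeps the decimal point but leaves trailing zeros that then have to be trimmed.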

@@ -1372,6 +1372,8 @@ def _format(x):
                     tpl = " {v}"
                 fmt_values.append(tpl.format(v=_format(v)))

+        fmt_values = _trim_zeros_float(str_floats=fmt_values, decimal=".")
+
         return fmt_values


@@ -1891,27 +1893,31 @@ def _trim_zeros_float(
     Trims zeros, leaving just one before the decimal points if need be.
     """
     trimmed = str_floats
-    number_regex = re.compile(fr"\s*[\+-]?[0-9]+(\{decimal}[0-9]*)?")
+    number_regex = re.compile(fr"^\s*[\+-]?[0-9]+\{decimal}[0-9]*$")

-    def _is_number(x):
+    def is_number_with_decimal(x):
         return re.match(number_regex, x) is not None

-    def _cond(values):
-        finite = [x for x in values if _is_number(x)]
-        has_decimal = [decimal in x for x in finite]
+    def should_trim(values: Union[np.ndarray, List[str]]) -> bool:
+        """
+        Determine if an array of strings should be trimmed.

-        return (
-            len(finite) > 0
-            and all(has_decimal)
-            and all(x.endswith("0") for x in finite)
-            and not (any(("e" in x) or ("E" in x) for x in finite))
-        )
+        Returns True if all numbers containing decimals (defined by the
+        above regular expression) within the array end in a zero, otherwise
+        returns False.
+        """
+        numbers = [x for x in values if is_number_with_decimal(x)]
+        return len(numbers) > 0 and all(x.endswith("0") for x in numbers)

-    while _cond(trimmed):
-        trimmed = [x[:-1] if _is_number(x) else x for x in trimmed]
+    while should_trim(trimmed):
+        trimmed = [x[:-1] if is_number_with_decimal(x) else x for x in trimmed]

     # leave one 0 after the decimal points if need be.
-    return [x + "0" if x.endswith(decimal) and _is_number(x) else x for x in trimmed]
+    result = [
+        x + "0" if is_number_with_decimal(x) and x.endswith(decimal) else x
+        for x in trimmed
+    ]
+    return result


 def _has_names(index: Index) -> bool:
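The new trimming loop can be tried in isolation. The following is a standalone sketch adapted from the added lines in the `_trim_zeros_float` hunk, not the pandas function itself: the name `trim_zeros_float`, the plain `List[str]` signature, and the use of `re.escape` for the decimal separator are choices made for this example.

```python
import re
from typing import List


def trim_zeros_float(str_floats: List[str], decimal: str = ".") -> List[str]:
    """Strip trailing zeros shared by all decimal strings, keeping one digit."""
    # Only strings that are entirely "digits, separator, digits" count as numbers,
    # so entries such as "<NA>", "NaN" or "1.5e+10" are left untouched.
    number_regex = re.compile(rf"^\s*[\+-]?[0-9]+{re.escape(decimal)}[0-9]*$")

    def is_number_with_decimal(x: str) -> bool:
        return number_regex.match(x) is not None

    def should_trim(values: List[str]) -> bool:
        numbers = [x for x in values if is_number_with_decimal(x)]
        return len(numbers) > 0 and all(x.endswith("0") for x in numbers)

    trimmed = list(str_floats)
    # Drop one trailing zero at a time while every decimal string still ends in "0".
    while should_trim(trimmed):
        trimmed = [x[:-1] if is_number_with_decimal(x) else x for x in trimmed]

    # Leave one zero after a bare decimal separator, e.g. " 1." becomes " 1.0".
    return [
        x + "0" if is_number_with_decimal(x) and x.endswith(decimal) else x
        for x in trimmed
    ]


print(trim_zeros_float([" 0.000000", " 1.000000", " <NA>"]))
print(trim_zeros_float([" 9.500000", " 8.000000"]))
```

Because the regular expression is anchored and requires the separator, a missing-value marker such as `<NA>` no longer prevents the numeric entries around it from being trimmed.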
22 changes: 22 additions & 0 deletions pandas/tests/io/formats/test_format.py
@@ -3439,3 +3439,25 @@ def test_to_string_complex_number_trims_zeros():
     result = s.to_string()
     expected = "0    1.00+1.00j\n1    1.00+1.00j\n2    1.05+1.00j"
     assert result == expected
+
+
+def test_nullable_float_to_string(float_ea_dtype):
Contributor

I think we should move these to the EA tests and try to do these generically (for all of the numeric types); I guess ok here as well (but group them together: int & float, both with and w/o nulls)

Contributor

do we have int nullable tests for formatting?

Member Author

Not as far as I can tell, entirely possible I'm just missing them

Contributor

kk can you add (this PR would be great)

Member Author

Done. These tests could probably be cleaned up a bit also, this file is very big and testing lots of different things.

Contributor

oh yes, for sure. if you'd create an issue would be great.

+    # https://github.com/pandas-dev/pandas/issues/36775
+    dtype = float_ea_dtype
+    s = pd.Series([0.0, 1.0, None], dtype=dtype)
+    result = s.to_string()
+    expected = """0     0.0
+1     1.0
+2    <NA>"""
+    assert result == expected
+
+
+def test_nullable_int_to_string(any_nullable_int_dtype):
+    # https://github.com/pandas-dev/pandas/issues/36775
+    dtype = any_nullable_int_dtype
+    s = pd.Series([0, 1, None], dtype=dtype)
+    result = s.to_string()
+    expected = """0       0
+1       1
+2    <NA>"""
+    assert result == expected