BUG: fix col iteration in DataFrame.round, #11611 #11618

Closed
wants to merge 4 commits into from
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v0.17.1.txt
@@ -177,3 +177,5 @@ Bug Fixes
 - Bug in ``DataFrame.join()`` with ``how='right'`` producing a ``TypeError`` (:issue:`11519`)
 - Bug in ``Series.quantile`` with empty list results has ``Index`` with ``object`` dtype (:issue:`11588`)
 - Bug in ``pd.merge`` results in empty ``Int64Index`` rather than ``Index(dtype=object)`` when the merge result is empty (:issue:`11588`)
+- Bug in ``DataFrame.round()`` with non-unique column index producing a Fatal Python error (:issue:`11611`)
+- Bug in ``DataFrame.round()`` with ``decimals`` being a non-unique indexed Series producing extra columns (:issue:`11618`)
11 changes: 7 additions & 4 deletions pandas/core/frame.py
@@ -4382,17 +4382,20 @@ def round(self, decimals=0, out=None):
         from pandas.tools.merge import concat

         def _dict_round(df, decimals):
-            for col in df:
+            for col, vals in df.iteritems():
                 try:
-                    yield np.round(df[col], decimals[col])
+                    yield np.round(vals, decimals[col])
Contributor

Is the decimals lookup going to have the same problem if it's not unique here?

Contributor Author

The value of decimals is either a dict or a pandas.Series. A dict cannot have duplicate keys, but a pandas.Series can have duplicate index labels, and direct indexing then returns all values that share the label. So this case is still not handled.
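
The lookup behaviour described here can be reproduced in isolation. The following is a minimal sketch, not code from the pull request; the variable names `b_value` and `a_values` are made up for illustration, and the Series mirrors the one used in the test added by this change.

```python
import pandas as pd

# A Series whose index contains the label 'A' twice.
decimals = pd.Series([1, 0, 2], index=['A', 'B', 'A'])

# A label that occurs once gives back a single value ...
b_value = decimals['B']

# ... but a duplicated label gives back a Series holding every match,
# which is not a valid number of decimals for np.round.
a_values = decimals['A']

print(type(a_values).__name__, list(a_values))
```

Because `decimals[col]` stops being a scalar for a duplicated label, rounding a single column against it no longer yields a single column.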

Contributor

I think we should raise if the index of decimals is not unique (it should be a dict/Series).

Contributor Author

Had the same thought this morning.

                 except KeyError:
-                    yield df[col]
+                    yield vals

         if isinstance(decimals, (dict, Series)):
+            if isinstance(decimals, Series):
+                if not decimals.index.is_unique:
+                    raise ValueError("Index of decimals must be unique")
             new_cols = [col for col in _dict_round(self, decimals)]
         elif com.is_integer(decimals):
             # Dispatch to numpy.round
-            new_cols = [np.round(self[col], decimals) for col in self]
+            new_cols = [np.round(v, decimals) for _, v in self.iteritems()]
         else:
             raise TypeError("decimals must be an integer, a dict-like or a Series")
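
The switch from label lookup to pairwise iteration matters once column labels repeat. The sketch below illustrates the difference under the assumption of a recent pandas release, where `items()` is the available spelling (`iteritems()`, used in the diff, was removed in pandas 2.0). The names `by_label` and `pairs` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(6.0).reshape(3, 2), columns=['A', 'B'])
dfs = pd.concat((df, df), axis=1)  # column labels: A, B, A, B

# Label-based lookup: a duplicated label selects every matching column,
# so the result is a two-column DataFrame rather than a single Series.
by_label = dfs['A']

# Pairwise iteration: one (label, Series) pair per physical column,
# regardless of whether the labels repeat.
pairs = list(dfs.items())

print(type(by_label).__name__, by_label.shape, len(pairs))
```

Iterating over the pairs hands `np.round` exactly one Series per column, which is why the rewritten loops no longer depend on the labels being unique.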

13 changes: 13 additions & 0 deletions pandas/tests/test_frame.py
@@ -15821,6 +15821,19 @@ class SubclassedPanel(Panel):
                              dtype='int64')
         tm.assert_panel_equal(result, expected)

+    def test_round(self):
+        # GH11611
+
+        df = pd.DataFrame(np.random.random([3, 3]), columns=['A', 'B', 'C'],
+                          index=['first', 'second', 'third'])
+
+        dfs = pd.concat((df, df), axis=1)
+        rounded = dfs.round()
+        self.assertTrue(rounded.index.equals(dfs.index))
+
+        decimals = pd.Series([1, 0, 2], index=['A', 'B', 'A'])
+        self.assertRaises(ValueError, df.round, decimals)
+

 def skip_if_no_ne(engine='numexpr'):
     if engine == 'numexpr':
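
Seen from the public API, the two code paths touched by this change behave roughly as follows. This is a hedged sketch that assumes a recent pandas release in which the uniqueness check is present; the sample values and the names `per_column` and `raised` are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1.234, 2.345],
                   'B': [0.111, 0.999],
                   'C': [5.55, 6.66]})

# dict-like decimals: columns named in the mapping are rounded to the
# given precision; columns missing from it (here 'C') pass through as-is,
# which corresponds to the KeyError branch of _dict_round.
per_column = df.round({'A': 1, 'B': 0})

# Series decimals with a duplicated index label are rejected up front.
bad_decimals = pd.Series([1, 0, 2], index=['A', 'B', 'A'])
try:
    df.round(bad_decimals)
    raised = False
except ValueError:
    raised = True

print(per_column)
print("duplicate index rejected:", raised)
```

Raising early keeps the ambiguity out of the per-column loop instead of letting it surface as extra columns in the result.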