Commit 91cc505

BUG: GH25871 -- adapt test_groupby_levels_and_columns
This test adjusted the expected column order when 'observed=True' was set. That adjustment hid the fact that, with the parameter set, the data columns were not actually reordered: only the column group labels were (analogous to the index labels in pandas-dev#25871), which left the data columns in place and out of sync with their labels. This was not visible because the data consisted only of ones. I've made the test more sensitive, so that data columns falling out of sync with their labels will now be caught, and removed the special case for 'observed=True'. As there are no unobserved categories in this case, the result should not be influenced by this parameter.
1 parent 0aaa347 commit 91cc505
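
The reasoning in the commit message can be illustrated with a small sketch. It uses plain Python rather than pandas, and `sum_by_label` is a hypothetical helper written only for this illustration, not part of the test or of pandas. It shows why all-ones data cannot detect a mismatch between labels and data, while the new `[1, 2, 1, 2]` rows can.

```python
# Illustrative sketch only: plain Python, no pandas involved.
columns = ['A', 'B', 'A', 'B']
categories = ['B', 'A']  # ordered categories, so 'B' sorts before 'A'


def sum_by_label(row, labels, order):
    """Sum the values of one row per column label, emitted in `order`."""
    totals = {label: 0 for label in order}
    for value, label in zip(row, labels):
        totals[label] += value
    return [totals[label] for label in order]


old_row = [1, 1, 1, 1]  # one row of the old np.ones((5, 4), int) data
new_row = [1, 2, 1, 2]  # one row of the new test data

# Old data: every label sums to 2, so the output looks the same
# whichever order the labels are emitted in.
old_by_category = sum_by_label(old_row, columns, categories)
old_by_appearance = sum_by_label(old_row, columns, ['A', 'B'])

# New data: 'A' sums to 2 and 'B' sums to 4, so the two orders differ
# and data that is out of sync with its labels becomes visible.
new_by_category = sum_by_label(new_row, columns, categories)
new_by_appearance = sum_by_label(new_row, columns, ['A', 'B'])

print(old_by_category, old_by_appearance)
print(new_by_category, new_by_appearance)
```

With the category order `['B', 'A']`, the new row sums to `[4, 2]`, which matches the rows of `expected_data` in the updated test.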

1 file changed, +14 -15 lines changed

pandas/tests/groupby/test_grouping.py

@@ -253,28 +253,27 @@ def test_groupby_levels_and_columns(self):
         tm.assert_frame_equal(by_levels, by_columns)
 
     def test_groupby_categorical_index_and_columns(self, observed):
-        # GH18432
+        # GH18432, adapted for GH25871
         columns = ['A', 'B', 'A', 'B']
         categories = ['B', 'A']
-        data = np.ones((5, 4), int)
+        data = np.array([[1, 2, 1, 2],
+                         [1, 2, 1, 2],
+                         [1, 2, 1, 2],
+                         [1, 2, 1, 2],
+                         [1, 2, 1, 2]], int)
         cat_columns = CategoricalIndex(columns,
                                        categories=categories,
                                        ordered=True)
         df = DataFrame(data=data, columns=cat_columns)
         result = df.groupby(axis=1, level=0, observed=observed).sum()
-        expected_data = 2 * np.ones((5, 2), int)
-
-        if observed:
-            # if we are not-observed we undergo a reindex
-            # so need to adjust the output as our expected sets us up
-            # to be non-observed
-            expected_columns = CategoricalIndex(['A', 'B'],
-                                                categories=categories,
-                                                ordered=True)
-        else:
-            expected_columns = CategoricalIndex(categories,
-                                                categories=categories,
-                                                ordered=True)
+        expected_data = np.array([[4, 2],
+                                  [4, 2],
+                                  [4, 2],
+                                  [4, 2],
+                                  [4, 2]], int)
+        expected_columns = CategoricalIndex(categories,
+                                            categories=categories,
+                                            ordered=True)
         expected = DataFrame(data=expected_data, columns=expected_columns)
         assert_frame_equal(result, expected)
 