BUG: pd.crosstab(s1, s2) handles column index incorrectly when both series have tuple names #30978
@@ -2549,6 +2549,53 @@ def test_crosstab_tuple_name(self, names):
        result = pd.crosstab(s1, s2)
        tm.assert_frame_equal(result, expected)

    @pytest.mark.parametrize(
        "s1_data, s1_name, s2_data, s2_name, "
        "expected_index, expected_column, expected_data",
Review comment: What is the unique feature between these sets of parameters you are trying to test? Wasn't immediately clear to me; may be OK to do without it if it simplifies the test.

Reply: Yeah, removed. The unique feature was different shape of output, and with one of
        [
            (
                [1, 2, 3],
                ("a", "b"),
                [1, 2, 3],
                ("c", "d"),
                [1, 2, 3],
                [1, 2, 3],
                np.eye(3, dtype="int64"),
            ),
            ([1, 1, 1], ("a", "b"), [0, 1, 2], ("c", "d"), [1], [0, 1, 2], [[1, 1, 1]]),
            (
                [0, 1, 2],
                ("a", "b"),
                [1, 1, 1],
                ("c", "d"),
                [0, 1, 2],
                [1],
                [[1], [1], [1]],
            ),
        ],
    )
    def test_crosstab_both_tuple_names(
        self,
        s1_data,
        s1_name,
        s2_data,
        s2_name,
        expected_index,
        expected_column,
        expected_data,
    ):
        # GH 18321
        s1 = pd.Series(s1_data, name=s1_name)
        s2 = pd.Series(s2_data, name=s2_name)

        expected = pd.DataFrame(
            expected_data,
            index=pd.Index(expected_index, name=s1_name),
            columns=pd.Index(expected_column, name=s2_name),
        )
        result = crosstab(s1, s2)
        tm.assert_frame_equal(result, expected)

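The three parameter sets in `test_crosstab_both_tuple_names` differ mainly in the shape of the expected table: 3x3, 1x3, and 3x1. As a rough illustration of where those shapes come from, here is a standard-library-only sketch of the pair counting that a cross-tabulation performs. `contingency_table` is a hypothetical helper written for this note, not pandas' implementation, and it ignores the Series names that the actual test is checking.

```python
from collections import Counter


def contingency_table(row_values, col_values):
    """Count (row, column) pairs, like a bare-bones cross-tabulation.

    Hypothetical helper for illustration only; pandas' crosstab works
    differently internally and also carries index/column names.
    """
    row_labels = sorted(set(row_values))
    col_labels = sorted(set(col_values))
    counts = Counter(zip(row_values, col_values))
    # Counter returns 0 for pairs that never occur together.
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table


# The data from the three parameter sets, without the tuple names.
cases = [
    ([1, 2, 3], [1, 2, 3]),  # three distinct rows, three distinct columns
    ([1, 1, 1], [0, 1, 2]),  # one distinct row, three distinct columns
    ([0, 1, 2], [1, 1, 1]),  # three distinct rows, one distinct column
]
for s1_data, s2_data in cases:
    index, columns, data = contingency_table(s1_data, s2_data)
    print(index, columns, data)
```

Under these assumptions, the returned labels and counts line up with the `expected_index`, `expected_column`, and `expected_data` values in each parameter set.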
    def test_crosstab_unsorted_order(self):
        df = pd.DataFrame({"b": [3, 1, 2], "a": [5, 4, 6]}, index=["C", "A", "B"])
        result = pd.crosstab(df.index, [df.b, df.a])

Review comment: Can you add a comment explaining why you are doing this?