Commit 57f3ae0

Author: Mateusz Górski
BUG: resolved problem with DataFrame.equals() wrongly returning True (pandas-dev#28839)
DataFrame.equals() returned True in the case shown in the added test. The cause was that the Blocks of each DataFrame were sorted by type, and then by mgr_locs, before comparison. When two frames hold the same blocks at swapped column positions, that sort arranges the blocks in the same order on both sides, so the two sorted lists of blocks compare equal. Changing the sort key to (mgr_locs, type) resolves the problem without affecting the other aspects of the comparison.
1 parent 57490b1 commit 57f3ae0

2 files changed, +7 -1 lines changed


pandas/core/internals/managers.py

+1 -1
@@ -1399,7 +1399,7 @@ def equals(self, other):
         # blocks (say, Categorical) which can only be distinguished by
         # the iteration order
         def canonicalize(block):
-            return (block.dtype.name, block.mgr_locs.as_array.tolist())
+            return (block.mgr_locs.as_array.tolist(), block.dtype.name)
 
         self_blocks = sorted(self.blocks, key=canonicalize)
         other_blocks = sorted(other.blocks, key=canonicalize)
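
The effect of the key order can be illustrated with a small pure-Python model. This is only a sketch under stated assumptions, not the pandas implementation: `Block`, `old_key`, `new_key`, and `blocks_equal` are hypothetical stand-ins, and the pairwise comparison is assumed to look at block values only, as the commit message implies.

```python
from collections import namedtuple

# Hypothetical stand-in for a pandas Block: a dtype name, the column
# positions it occupies (mgr_locs), and its values.
Block = namedtuple("Block", ["dtype_name", "mgr_locs", "values"])


def old_key(block):
    # Sort key before this commit: dtype first, then positions.
    return (block.dtype_name, block.mgr_locs)


def new_key(block):
    # Sort key after this commit: positions first, then dtype.
    return (block.mgr_locs, block.dtype_name)


def blocks_equal(left, right, key):
    # Simplified model of the comparison: sort both block lists with the
    # same key, then compare values pairwise (positions are not compared).
    left_sorted = sorted(left, key=key)
    right_sorted = sorted(right, key=key)
    return len(left_sorted) == len(right_sorted) and all(
        a.values == b.values for a, b in zip(left_sorted, right_sorted)
    )


# Models df1 = {"a": [1, 2], "b": ["s", "d"]} and df2 with the data swapped.
df1_blocks = [Block("int64", [0], [1, 2]), Block("object", [1], ["s", "d"])]
df2_blocks = [Block("object", [0], ["s", "d"]), Block("int64", [1], [1, 2])]

print(blocks_equal(df1_blocks, df2_blocks, old_key))  # True (the bug)
print(blocks_equal(df1_blocks, df2_blocks, new_key))  # False
```

With the dtype-first key, both frames sort to (int64 block, object block), so the swapped columns line up and compare equal. With the position-first key, the block at position 0 of one frame is compared with the block at position 0 of the other, and the mismatch is detected.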

pandas/tests/internals/test_internals.py

+6
@@ -1297,3 +1297,9 @@ def test_make_block_no_pandas_array():
     result = make_block(arr.to_numpy(), slice(len(arr)), dtype=arr.dtype)
     assert result.is_integer is True
     assert result.is_extension is False
+
+
+def test_dataframe_not_equal():
+    df1 = pd.DataFrame({"a": [1, 2], "b": ["s", "d"]})
+    df2 = pd.DataFrame({"a": ["s", "d"], "b": [1, 2]})
+    assert df1.equals(df2) is False
