ENH: raise_assert_detail shows the difference between the columns #48390

Merged
merged 3 commits into from
Sep 12, 2022

@MarcoGorelli MarcoGorelli commented Sep 4, 2022

This idea came out of #48033, so I'm opening a PR to see if anyone else would like to see this.

Adding an extra line of the form "At index {idx}, diff: {left} != {right}", analogously to pytest's output.

Demo:

import pandas as pd
import numpy as np

a = np.random.randint(0, 10, size=200)
b = np.random.randint(0, 10, size=200)
df1 = pd.DataFrame({'a': a, 'b': b})
df2 = pd.DataFrame({'a': a, 'b': b})
df2.loc[97, 'b'] = 42
pd.testing.assert_frame_equal(df1, df2)

outputs

AssertionError: DataFrame.iloc[:, 1] (column name="b") are different

DataFrame.iloc[:, 1] (column name="b") values are different (0.5 %)
[index]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, ...]
[left]:  [1, 4, 3, 6, 4, 7, 2, 5, 3, 6, 6, 9, 8, 4, 4, 4, 7, 8, 4, 0, 3, 1, 4, 5, 7, 0, 3, 0, 9, 9, 6, 1, 7, 7, 4, 7, 6, 2, 3, 9, 1, 5, 1, 7, 2, 3, 8, 9, 5, 3, 6, 3, 7, 3, 4, 1, 7, 3, 6, 9, 0, 0, 7, 3, 6, 4, 4, 1, 0, 8, 0, 1, 8, 2, 8, 1, 8, 8, 8, 3, 8, 3, 8, 5, 7, 0, 0, 2, 8, 5, 3, 4, 7, 1, 6, 3, 5, 9, 7, 8, ...]
[right]: [1, 4, 3, 6, 4, 7, 2, 5, 3, 6, 6, 9, 8, 4, 4, 4, 7, 8, 4, 0, 3, 1, 4, 5, 7, 0, 3, 0, 9, 9, 6, 1, 7, 7, 4, 7, 6, 2, 3, 9, 1, 5, 1, 7, 2, 3, 8, 9, 5, 3, 6, 3, 7, 3, 4, 1, 7, 3, 6, 9, 0, 0, 7, 3, 6, 4, 4, 1, 0, 8, 0, 1, 8, 2, 8, 1, 8, 8, 8, 3, 8, 3, 8, 5, 7, 0, 0, 2, 8, 5, 3, 4, 7, 1, 6, 3, 5, 42, 7, 8, ...]
At positional index 97 diff: 9 != 42
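
To check whether an installed pandas already prints this extra line, the demo can be rerun with the error captured instead of propagated. This is only a sketch: the seeded generator and the names `message` and `last_line` are assumptions added for repeatability and are not part of the PR.

```python
import numpy as np
import pandas as pd

# Seeded generator (an assumption, not in the original demo) so reruns match.
rng = np.random.default_rng(0)
a = rng.integers(0, 10, size=200)
b = rng.integers(0, 10, size=200)

df1 = pd.DataFrame({"a": a, "b": b})
df2 = df1.copy()
df2.loc[97, "b"] = 42  # the only differing cell: 1 row out of 200, i.e. 0.5 %

try:
    pd.testing.assert_frame_equal(df1, df2)
except AssertionError as err:
    message = str(err)
else:
    message = ""  # not expected here, since the frames differ

# With this change the final line should report the first differing position;
# on older pandas versions the message ends with the [right] values instead.
last_line = message.splitlines()[-1] if message else ""
print(last_line)
```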

This is analogous to how pytest shows diffs between lists:

    def test_me():
        a = [1, 2, 3]
        b = [4, 5, 6]
>       assert a == b
E       assert [1, 2, 3] == [4, 5, 6]
E         At index 0 diff: 1 != 4
E         Use -v to get more diff

@mroeschke mroeschke added the Testing pandas testing functions or related to the test suite label Sep 6, 2022
@@ -159,12 +160,14 @@ cpdef assert_almost_equal(a, b,
except AssertionError:
is_unequal = True
diff += 1
if not first_diff:
first_diff = f"At index {i} diff: {a[i]} != {b[i]}"

Suggested change
- first_diff = f"At index {i} diff: {a[i]} != {b[i]}"
+ first_diff = f"At index {i}, first diff: {a[i]} != {b[i]}"

What do you think? I was thinking this would clarify that it may not be the only diff found during the comparison.
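
For readers following along, the bookkeeping in the hunk above can be sketched in plain Python. This is an illustration under stated assumptions, not the Cython code from the PR: `summarize_diffs` is a hypothetical helper, it compares elements with `!=` rather than pandas' tolerance-aware comparison, and it assumes both sequences have the same length. It shows why "first diff" is the more accurate wording: every mismatch is counted, but only the first one is described.

```python
def summarize_diffs(left, right):
    """Count mismatches and describe only the first one.

    Hypothetical helper for illustration; not part of pandas.
    """
    if len(left) != len(right):
        raise ValueError("sequences must have the same length")

    n_diffs = 0
    first_diff = ""
    for i, (lval, rval) in enumerate(zip(left, right)):
        if lval != rval:
            n_diffs += 1
            # Record the message once; later mismatches only bump the count.
            if not first_diff:
                first_diff = f"At index {i}, first diff: {lval} != {rval}"
    return n_diffs, first_diff


n_diffs, first_diff = summarize_diffs([1, 2, 3], [4, 5, 6])
print(n_diffs)     # 3
print(first_diff)  # At index 0, first diff: 1 != 4
```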

@mroeschke mroeschke added this to the 1.6 milestone Sep 12, 2022
@mroeschke mroeschke merged commit 28bf7f2 into pandas-dev:main Sep 12, 2022
@mroeschke

Thanks @MarcoGorelli

@mroeschke mroeschke modified the milestones: 1.6, 2.0 Oct 13, 2022
noatamir pushed a commit to noatamir/pandas that referenced this pull request Nov 9, 2022
…ndas-dev#48390)

* show first diff in assert_frame_equal

* reword diff -> first_diff