Skip to content

TST: change assertion messages in assert_frame_equal (#27023) #27068

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 28, 2019
Merged

TST: change assertion messages in assert_frame_equal (#27023) #27068

merged 1 commit into from
Jun 28, 2019

Conversation

ajwood
Copy link
Contributor

@ajwood ajwood commented Jun 26, 2019

@ajwood
Copy link
Contributor Author

ajwood commented Jun 27, 2019

Ouch, this is my first pandas PR, are all of these failing checks normal? The change is so tiny, I can't imagine what could have gone wrong.

@ajwood
Copy link
Contributor Author

ajwood commented Jun 27, 2019

Ah shoot, I see that there are tests that fail now, which match on the assertion text that I changed.
I'll take a closer look at whether the discrepant tests make sense to be updated. I don't totally understand the by_blocks test though, so any insight would be appreciated.

@WillAyd
Copy link
Member

WillAyd commented Jun 27, 2019

I don't totally understand the by_blocks test though, so any insight would be appreciated.

I think you are referring to the test_frame_equal_block_mismatch which is the one that is failing. This is parametrized with the by_blocks fixture, which basically means the test gets run twice (once where by_blocks is True and another time where it is False)

Haven't looked deep into the failure but hope the above helps

@WillAyd WillAyd added Error Reporting Incorrect or improved errors from pandas Testing pandas testing functions or related to the test suite labels Jun 27, 2019
@ajwood
Copy link
Contributor Author

ajwood commented Jun 27, 2019

If in the if by_blocks: block we change

assert_frame_equal(...
   obj='{obj}.blocks'.format(obj=obj))

to just

assert_frame_equal(...
   obj=obj)

the existing tests are happy. Note the original string is hard-coded 'DataFrame.blocks'. What I'm not sure of yet is if it makes sense to remove the .blocks part of the string, or if the test cases should be updated. Or neither or something else.

@WillAyd
Copy link
Member

WillAyd commented Jun 27, 2019

I don't think .blocks needs to necessarily be appended as the original hard coding of DataFrame.blocks is arguably arbitrary to begin with

@codecov
Copy link

codecov bot commented Jun 27, 2019

Codecov Report

Merging #27068 into master will decrease coverage by 50.17%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #27068       +/-   ##
===========================================
- Coverage   92.04%   41.86%   -50.18%     
===========================================
  Files         180      180               
  Lines       50714    50714               
===========================================
- Hits        46679    21233    -25446     
- Misses       4035    29481    +25446
Flag Coverage Δ
#multiple ?
#single 41.86% <100%> (-0.09%) ⬇️
Impacted Files Coverage Δ
pandas/util/testing.py 51.05% <100%> (-39.79%) ⬇️
pandas/io/formats/latex.py 0% <0%> (-100%) ⬇️
pandas/io/gcs.py 0% <0%> (-100%) ⬇️
pandas/io/sas/sas_constants.py 0% <0%> (-100%) ⬇️
pandas/core/groupby/categorical.py 0% <0%> (-100%) ⬇️
pandas/tseries/plotting.py 0% <0%> (-100%) ⬇️
pandas/tseries/converter.py 0% <0%> (-100%) ⬇️
pandas/io/s3.py 0% <0%> (-100%) ⬇️
pandas/io/formats/html.py 0% <0%> (-99.37%) ⬇️
pandas/io/sas/sas7bdat.py 0% <0%> (-91.16%) ⬇️
... and 133 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e955515...fc23941. Read the comment docs.

1 similar comment
@codecov
Copy link

codecov bot commented Jun 27, 2019

Codecov Report

Merging #27068 into master will decrease coverage by 50.17%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #27068       +/-   ##
===========================================
- Coverage   92.04%   41.86%   -50.18%     
===========================================
  Files         180      180               
  Lines       50714    50714               
===========================================
- Hits        46679    21233    -25446     
- Misses       4035    29481    +25446
Flag Coverage Δ
#multiple ?
#single 41.86% <100%> (-0.09%) ⬇️
Impacted Files Coverage Δ
pandas/util/testing.py 51.05% <100%> (-39.79%) ⬇️
pandas/io/formats/latex.py 0% <0%> (-100%) ⬇️
pandas/io/gcs.py 0% <0%> (-100%) ⬇️
pandas/io/sas/sas_constants.py 0% <0%> (-100%) ⬇️
pandas/core/groupby/categorical.py 0% <0%> (-100%) ⬇️
pandas/tseries/plotting.py 0% <0%> (-100%) ⬇️
pandas/tseries/converter.py 0% <0%> (-100%) ⬇️
pandas/io/s3.py 0% <0%> (-100%) ⬇️
pandas/io/formats/html.py 0% <0%> (-99.37%) ⬇️
pandas/io/sas/sas7bdat.py 0% <0%> (-91.16%) ⬇️
... and 133 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e955515...fc23941. Read the comment docs.

@ajwood
Copy link
Contributor Author

ajwood commented Jun 27, 2019

Merging #27068 into master will decrease coverage by 50.17%

Ouch, I'm not not having much luck with this! I don't understand where that drop in coverage could be coming from. Does this mean there is more digging I should be doing?

@WillAyd
Copy link
Member

WillAyd commented Jun 27, 2019

Don't worry about codecov it's sometimes spotty. Can you add a test in pandas/tests/util/test_assert_frame_equal.py that ensures this works correctly?

@ajwood
Copy link
Contributor Author

ajwood commented Jun 28, 2019

I'm not sure if I went overboard in the test script, by adding parametrization to all of the existing tests. Let me know, and I'd be happy do revert and redo, as a new test or two.

@@ -9,6 +9,11 @@ def by_blocks(request):
return request.param


@pytest.fixture(params=['aaa', 'zzz'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should also include DataFrame & Series as strings;

can you call this obj_fixture? (and rename by_blocks -> by_blocks_fixture)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. Is it enough to just include DataFrame and Series as strings in the fixture? Or should that somehow be made to disable specifying obj= in the tests, to test that expected default value is honoured?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think just DataFrame and Series as strings is fine ('aaa' and 'zzz' are kind of arbitrary). You'd have to then replace obj in your function signatures with obj_fixture and subsequently change invocations of these assert functions within those tests to obj=obj_fixture

Copy link
Contributor Author

@ajwood ajwood Jun 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. This is straining my knowledge of pep8 though!

    msg = """{obj}\\.index are different

{obj}\\.index values are different \\(33\\.33333 %\\)
\\[left\\]:  Index\\(\\['a', 'b', 'c'\\], dtype='object'\\)
\\[right\\]: Index\\(\\['a', 'b', 'd'\\], dtype='object'\\)""".format(obj=obj_fixture)

I'll read up to figure out how to shorten that last line, but if you see this in the meantime, could you help me out?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea you can just break after format(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

K, done and pushed. When I revise my PR, I've been amending, then force-push. This time I forgot to fetch and reapply to the current master though. Is there a preferred workflow for revising pull requests?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would do this before any change:

git fetch upstream
git merge upstream/master
# if any merge conflicts fix up and
git merge --continue

Then you can just make new commits without amend and push without force

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be squashed and rebased when it's merge to pandas-dev/pandas though, right? I'll have to force-push after that, but I guess your recipe will keep the history organized until the PR is finally accepted?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We squash before merge

@jreback jreback added this to the 0.25.0 milestone Jun 28, 2019
@jreback jreback merged commit d7d26be into pandas-dev:master Jun 28, 2019
@jreback
Copy link
Contributor

jreback commented Jun 28, 2019

thanks @ajwood

@ajwood
Copy link
Contributor Author

ajwood commented Jun 28, 2019

thanks @ajwood

My pleasure! @WillAyd thanks for the help :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Error Reporting Incorrect or improved errors from pandas Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

testing.assert_frame_equal obj parameter not always respected
3 participants