Correct assert_frame_equal doc string #22552

Merged
merged 7 commits into from
Sep 3, 2018
23 changes: 15 additions & 8 deletions pandas/util/testing.py
@@ -1306,33 +1306,35 @@ def assert_frame_equal(left, right, check_dtype=True,
check_categorical=True,
check_like=False,
obj='DataFrame'):
-"""Check that left and right DataFrame are equal.
+"""
+Check that left and right DataFrame are equal.

Parameters
----------
left : DataFrame
First frame to compare.
right : DataFrame
Second frame to compare.
check_dtype : bool, default True
Whether to check the DataFrame dtype is identical.
-check_index_type : bool / string {'equiv'}, default False
+check_index_type : bool / string {'equiv'}, default 'equiv'
Member

I think what we've been using in these cases is {'equiv'} or bool

Contributor Author

IMO that is more confusing. e.g. it might imply I pass a set. I'm not sure it's worth a special case for when there is only one possible string value.

Member

What is your suggestion: just 'equiv' or bool, default 'equiv', or something else? Besides being more consistent for the user, using the curly brackets in all cases would simplify parsing the types, adding validation, and extracting stats. But if you are strongly in favor of not using them, I'm happy to merge this as it is now and see later what's best when we implement that validation.

Contributor Author

I think as it is above is good. It's maybe a little verbose but is very clear.

check_index_type : bool / string {'equiv'}, default 'equiv'

Happy to revisit if a standard emerges.

Member

You have the standard here: https://pandas.pydata.org/pandas-docs/stable/contributing_docstring.html#parameter-types

If you can use the first format I suggested, that would be great.

Contributor Author

Thanks. I have used your suggestion.
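
For readers unfamiliar with the 'equiv' value debated in this thread, here is a minimal sketch of how it differs from True. It assumes a recent pandas release and imports from the public pandas.testing module rather than the pandas/util/testing.py path touched by this PR; the helper strict_index_check_fails and the frame names are hypothetical, introduced only for illustration.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Same data and same labels, but different index classes:
# a default RangeIndex versus an explicit integer Index.
range_indexed = pd.DataFrame({"a": [1, 2, 3]})
int_indexed = pd.DataFrame({"a": [1, 2, 3]}, index=pd.Index([0, 1, 2]))

# With the default check_index_type='equiv', a RangeIndex and an
# integer index holding the same values are treated as equivalent.
assert_frame_equal(range_indexed, int_indexed)


def strict_index_check_fails(left, right):
    """Hypothetical helper: True if check_index_type=True rejects the pair."""
    try:
        assert_frame_equal(left, right, check_index_type=True)
    except AssertionError:
        return True
    return False


strict_result = strict_index_check_fails(range_indexed, int_indexed)
```

With check_index_type=True the index classes must match exactly, so strict_result should be True for this pair, whereas the default call above raises nothing.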

Whether to check the Index class, dtype and inferred_type
are identical.
-check_column_type : bool / string {'equiv'}, default False
+check_column_type : bool / string {'equiv'}, default 'equiv'
Whether to check the columns class, dtype and inferred_type
are identical.
-check_frame_type : bool, default False
+check_frame_type : bool, default True
Whether to check the DataFrame class is identical.
check_less_precise : bool or int, default False
Specify comparison precision. Only used when check_exact is False.
5 digits (False) or 3 digits (True) after decimal points are compared.
-If int, then specify the digits to compare
+If int, then specify the digits to compare.
check_names : bool, default True
Whether to check that the `names` attribute for both the `index`
and `column` attributes of the DataFrame is identical, i.e.

* left.index.names == right.index.names
* left.columns.names == right.columns.names

by_blocks : bool, default False
Specify how to compare internal data. If False, compare by columns.
If True, compare by blocks.
@@ -1345,10 +1347,15 @@ def assert_frame_equal(left, right, check_dtype=True,
check_like : bool, default False
If True, ignore the order of index & columns.
Note: index labels must match their respective rows
-(same as in columns) - same labels must be with the same data
+(same as in columns) - same labels must be with the same data.
obj : str, default 'DataFrame'
Specify object name being compared, internally used to show appropriate
-assertion message
+assertion message.

+See Also
+--------
+assert_series_equal: equivalent method for asserting Series equality
+DataFrame.equals: check DataFrame equality
Member

Do you think it makes sense to add an example?

Contributor Author

I'm not sure an example is particularly useful in this case.

Member

We've got users of all levels, and things like comparing dataframes with different types are worth showing with examples.

Besides that, can you add a space before the colons in the See Also section, capitalize the first letter of each description, and end each description with a period? If you generate the HTML with ./doc/make.py html --single pandas.util.testing.assert_frame_equal, I don't think it renders correctly without the space.

Contributor Author

Added examples and corrected formatting as mentioned.
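
The examples added to the docstring are not visible in this view of the diff. As a rough sketch of the kind of comparison discussed above (frames with different dtypes), something like the following could serve. It assumes a recent pandas release and the public pandas.testing import path; frames_match and the frame names are hypothetical, introduced only for illustration.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

ints = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
floats = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})


def frames_match(left, right, **kwargs):
    """Hypothetical helper: True if assert_frame_equal raises nothing."""
    try:
        assert_frame_equal(left, right, **kwargs)
    except AssertionError:
        return False
    return True


# Equal values, but int64 versus float64 columns.
dtype_strict = frames_match(ints, floats)
dtype_relaxed = frames_match(ints, floats, check_dtype=False)

# Same labelled data with the columns in a different order.
reordered = ints[["b", "a"]]
order_strict = frames_match(ints, reordered)
order_relaxed = frames_match(ints, reordered, check_like=True)
```

Here dtype_strict and order_strict should be False while dtype_relaxed and order_relaxed should be True: check_dtype=False ignores the int/float difference, and check_like=True ignores the order of index and columns as long as each label still carries the same data.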

"""

# instance validation