BUG: pd.testing.assert_frame_equal(..., check_exact=True) raises assertion error if any columns are not numeric #35446
Comments
See pandas-dev/pandas#35446 for details. Basically broke how we compare dataframes in tests.
Agreed that … @ivirshup, are you interested in submitting a PR to update it?
Sure! I'm having some trouble finding where to put tests, though. Turns out … Could you point me to that?
\pandas\tests\util\test_assert_frame_equal.py
I am comparing 2 data frames indexed by timestamp and getting the following error (works perfectly well in pandas 1.0.5):
Can it be related?
It looked to me like the assert_equal methods got a refactor in 1.1. I don't think it's the same bug, just one that got introduced around the same time.
…ting

### What changes were proposed in this pull request?
Adjust the `check_exact` parameter for non-numeric columns to ensure pandas-on-Spark tests pass with all pandas versions.

### Why are the changes needed?
`pd.testing` utils are used in pandas-on-Spark tests. Due to pandas-dev/pandas#35446, `check_exact=True` for non-numeric columns doesn't work for older `pd.testing` utils, e.g. `assert_series_equal`. We wanted to adjust that to ensure pandas-on-Spark tests pass for all pandas versions.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing unit tests.

Closes #32772 from xinrong-databricks/test_util.

Authored-by: Xinrong Meng
Signed-off-by: Hyukjin Kwon
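One way to keep exact comparison for numeric columns without tripping over non-numeric ones is to compare column by column and request `check_exact=True` only where the dtype is numeric. The helper below is a hypothetical sketch of that idea, not the code from the change referenced above; `assert_frame_equal_mixed` is an invented name, and the sketch assumes unique column labels.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def assert_frame_equal_mixed(left, right):
    """Compare two frames column by column.

    Hypothetical helper (not part of pandas): numeric columns are compared
    with check_exact=True, all other columns with check_exact=False.
    Assumes column labels are unique.
    """
    # Column labels must match before per-column comparison makes sense.
    pd.testing.assert_index_equal(left.columns, right.columns)
    for name in left.columns:
        exact = is_numeric_dtype(left[name].dtype)
        # assert_series_equal also checks the index and dtype of each column.
        pd.testing.assert_series_equal(left[name], right[name], check_exact=exact)


left = pd.DataFrame({"n": [1.0, 2.0], "s": ["a", "b"]})
assert_frame_equal_mixed(left, left.copy())  # no exception for equal frames
```

Non-numeric columns are still compared for equality; they just never receive `check_exact=True`, which is the argument the affected pandas release rejected for non-numeric data.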
btw, if
@devProdigy's comment is also the problem I was running into (though not strictly related to the original issue). Looks like …
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.
Code Sample, a copy-pastable example
Problem description
Previously, the example above would not have thrown an error. The function is less useful if I have to start subdividing dataframes by type just to check that they are equal. Presumably `check_exact` should either have no effect when called on a non-numeric `Series`, or `assert_frame_equal` should not pass `check_exact=True` if the series to be compared isn't numeric.

It looks like this was introduced recently, in 08c6597
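The original code sample did not survive in this copy of the issue. As a rough sketch of the kind of call the report describes (the frame and its column names are made up for illustration), the following compares a mixed-dtype frame with a copy of itself using `check_exact=True` and reports what happens on the installed pandas version. The snippet does not assume either outcome; going by the report, the affected release would be expected to raise while 1.0.5 would not.

```python
import pandas as pd

# Hypothetical frame: one numeric column and one non-numeric column.
df = pd.DataFrame({"num": [1, 2, 3], "label": ["a", "b", "c"]})


def exact_compare_outcome(left, right):
    """Run assert_frame_equal with check_exact=True and describe the result."""
    try:
        pd.testing.assert_frame_equal(left, right, check_exact=True)
    except AssertionError as exc:
        return f"AssertionError: {exc}"
    return "passed"


outcome = exact_compare_outcome(df, df.copy())
print(pd.__version__, outcome)
```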
Expected Output
I expected it to work like it did previously, and not throw an error. Here's an example of test failures this is causing.
Output of `pd.show_versions()`