-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Add tests in methods.py #23261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tests in methods.py #23261
Conversation
Hello @thoo! Thanks for updating the PR.
Comment last updated on October 21, 2018 at 04:29 Hours UTC |
@@ -105,6 +105,13 @@ def test_factorize_equivalence(self, data_for_grouping, na_sentinel): | |||
tm.assert_numpy_array_equal(l1, l2) | |||
self.assert_extension_array_equal(u1, u2) | |||
|
|||
def test_fillna_copy(self, data_for_fillna): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you need to make a new fixture for this? you can just use data and the fill value can be the first value of the non-nulls
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was working on fillna returns a copy when there are no missing value
. data
has np.nan
values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about you re-use data_missing
and just do
arr = data.take([1, 1])
in the test to get an array with two valid values.
filled_val = df.iloc[0, 0] | ||
|
||
result = df.fillna(filled_val) | ||
assert df.values.base is not result.values.base |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that asserting df.A._values is data_for_fillna
is needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to test this for Series.fillna as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review. Sure I will. I just want to make sure I am on the right track.
3b95a39
to
00bca3f
Compare
@TomAugspurger I am not sure how to work on |
Something like
That will try filling a length-2 EA with a length-1 EA, which should raise. |
f8fe87a
to
7ab9458
Compare
Codecov Report
@@ Coverage Diff @@
## master #23261 +/- ##
==========================================
+ Coverage 92.18% 92.18% +<.01%
==========================================
Files 161 161
Lines 51184 51184
==========================================
+ Hits 47185 47186 +1
+ Misses 3999 3998 -1
Continue to review full report at Codecov.
|
This is not working on |
Yes, I think those don't implement |
result = df.fillna(filled_val) | ||
assert df.values.base is not result.values.base | ||
|
||
if isinstance(arr, pd.SparseArray): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than write like this, I would just override these tests for SparseArray
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
filled_val = df.iloc[0, 0] | ||
result = df.fillna(filled_val) | ||
|
||
assert df.values.base is not result.values.base |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is correct. I believe that for DataFrames with EAs, df.values
will all return a new ndarray, so this would be impossible to fail.
Rather, I think the best we can do is assert df.A.values is not result.A.values
.
result = df.fillna(filled_val) | ||
|
||
assert df.values.base is not result.values.base | ||
assert df.A._values is arr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be removed I think. That's asserting something different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! Thanks.
…y_tests * repo_org/master: (52 commits) ENH: Allow rename_axis to specify index and columns arguments (pandas-dev#20046) STY: proposed isort settings [ci skip] [skip ci] [ciskip] [skipci] (pandas-dev#23366) MAINT: Remove extraneous test.parquet file CLN: Follow-up comments to pandas-devgh-23392 (pandas-dev#23401) BUG GH23282 calling min on series of NaT returns NaT (pandas-dev#23289) unpin openpyxl (pandas-dev#23361) REF: collect ops dispatch functions in one place, try to de-duplicate SparseDataFrame methods (pandas-dev#23060) CLN: Remove pandas.tools module (pandas-dev#23376) CLN: Remove some dtype methods from API (pandas-dev#23390) CLN: Cleanup toplevel namespace shims (pandas-dev#23386) DOC: fixup whatsnew note for GH21394 (pandas-dev#23355) Fix import format at pandas/tests/extension directory (pandas-dev#23365) DOC: Remove Series.sortlevel from api.rst (pandas-dev#23395) API: Disallow dtypes w/o frequency when casting (pandas-dev#23392) BUG/TST/REF: Datetimelike Arithmetic Methods (pandas-dev#23215) STYLE: lint add np.nan* funcs to cython_table (pandas-dev#22109) Run Isort on tests/util single PR (pandas-dev#23347) BUG: Fix date_range overflow (pandas-dev#23345) Run Isort on tests/arrays single PR (pandas-dev#23346) ...
thanks @thoo |
Closes #23241