-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
API/TST: add tests for new copy/view behaviour #46979
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tests look good to me. As a very small nit on style, I think most tests are using +1 on tests/copy_view |
The two failing builds seem unrelated conda installation failures. |
cc @mroeschke if you can have a look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. 2 questions:
- Might have missed it, but is it worth testing subsetting by row and column (though it should already copy)
- I see you're mainly testing with integer dtype, just confirming that this behavior should be dtype independent?
That was indeed not yet explicitly tested. I added an additional test for subsetting with loc/iloc with both a column and row indexer (parametrized for a few different kind of indexers), and then generally testing that mutating the subset will not change the parent (indeed depending on the exact indexer, the result will actually already be a copy and will never update the parent anyway, but it is still an easy generic test to add).
Yeah, good question. In general it depends on the test I think. For tests that are ensuring that getitem operations are properly tracking references, I think it shouldn't matter in theory. What does matter is single vs multiple blocks, as that can take a different code path, and so I already parametrized a few of the tests to use homogeneous dtype vs mixed dtypes (I should probably expand this to more of the tests). |
I also added a helper function to do the equivalent of |
Thanks, yeah a follow up PR testing over more dtypes would be great. One typing issue otherwise LGTM
|
Any more comments here? |
Thanks @jorisvandenbossche |
And thanks for the review! |
* API/TST: add tests for new copy/view behaviour * fixes for non-CoW cases * move test file to tests/copy_view * split into multiple files * Fix tests for ArrayManager case * fix type checking * may_share_memory -> shares_memory * update pandas imports * add test for iloc/low with row+column indexer * user helper to get column values to assert np.shares_memory * fix typing * handle new iloc inplace deprecation warnings * handle new iloc deprecation warnings for ArrayManager
This is broken off from #46958 / #41878
This are the new tests that I wrote for the PRs implementing the proposed new copy/view semantics with Copy-on-Write (#36195). They are intended to be readable tests to follow and review the intended semantics.
They will partly be duplicating some other tests that also deal with copy/view details spread around in the test suite (see the required test edits in the abovementioned PRs), but I would say this is fine, as I think it is good to have a set of centralized tests focused on those copy/view semantics.
I broke of this part from the main PR because I think this part can be done as a pre-cursor (reducing the diff size for the main PR), and because it might be useful to review those tests as a way to review the proposed behaviours (cc @pandas-dev/pandas-core).
Some practical questions about the test organization:
/tests/indexing/test_copy_on_write.py
, but 1) it's not only about indexing, and 2) "copy_on_write" is more the internal implementation mechanism, while the user facing thing we are testing is copy/view behaviour. So maybe I could put them in a new top-leveltests/copy_view/
test directory? (which also allows more easily to split it in multiple files)