CoW: Return read-only array in Index.values #53704
Changes from 4 commits
3cc9846
6d050d8
4c2dac9
2e45177
6e19b8d
47d38a2
2f1623a
7fe0283
02632a9
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -327,7 +327,7 @@ def test_constructor_from_list_no_dtype(self): | |
index = Index([1, 2, 3]) | ||
assert index.dtype == np.int64 | ||
|
||
- def test_constructor(self, dtype): | ||
+ def test_constructor(self, dtype, using_copy_on_write): | ||
|
||
index_cls = Index | ||
|
||
# scalar raise Exception | ||
|
@@ -347,8 +347,12 @@ def test_constructor(self, dtype): | |
val = arr[0] + 3000 | ||
|
||
# this should not change index | ||
- arr[0] = val | ||
- assert new_index[0] != val | ||
+ if not using_copy_on_write: | ||
+     arr[0] = val | ||
+     assert new_index[0] != val | ||
Review comment: Similarly here, we want to test that the
Reply: makes sense |
||
+ else: | ||
+     with pytest.raises(ValueError, match="assignment"): | ||
+         arr[0] = val | ||
|
||
if dtype == np.int64: | ||
# pass list, coerce fine | ||
|
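The `else` branch above relies on NumPy refusing writes into a read-only array. A minimal NumPy-only sketch of that behaviour, assuming the array handed out under Copy-on-Write is simply flagged non-writeable (the `try_assign` helper is hypothetical and not part of pandas or its test suite):

```python
import numpy as np


def try_assign(arr, value):
    """Attempt arr[0] = value; return the error message, or None on success."""
    try:
        arr[0] = value
    except ValueError as exc:
        return str(exc)
    return None


arr = np.array([1, 2, 3], dtype=np.int64)
# Assumption: stand-in for the read-only array returned by Index.values under CoW.
arr.flags.writeable = False

message = try_assign(arr, arr[0] + 3000)
```

Here `message` holds NumPy's error text, which is what `match="assignment"` in the test is matched against, and `arr` is left unchanged.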
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -433,8 +433,12 @@ def test_read_columns(self, engine): | |
df, engine, expected=expected, read_kwargs={"columns": ["string"]} | ||
) | ||
|
||
- def test_write_index(self, engine): | ||
+ def test_write_index(self, engine, using_copy_on_write, request): | ||
  check_names = engine != "fastparquet" | ||
+ if using_copy_on_write and engine == "fastparquet": | ||
+     request.node.add_marker( | ||
+         pytest.mark.xfail(reason="fastparquet write into index") | ||
Review comment: What does fastparquet do exactly? Does it try to write into the array it gets from the index? Do you know why? That sounds like a bug in fastparquet, as it can change the DataFrame you are writing.
Reply: I think we can simply set the flag there; they seem to use the index to do a conversion. It's already on my to-do list, but I don't want to block this PR because of that. |
||
+     ) | ||
|
||
df = pd.DataFrame({"A": [1, 2, 3]}) | ||
check_round_trip(df, engine) | ||
|
@@ -1213,12 +1217,14 @@ def test_error_on_using_partition_cols_and_partition_on( | |
partition_cols=partition_cols, | ||
) | ||
|
||
+ @pytest.mark.skipif(using_copy_on_write(), reason="fastparquet writes into Index") | ||
def test_empty_dataframe(self, fp): | ||
# GH #27339 | ||
df = pd.DataFrame() | ||
expected = df.copy() | ||
check_round_trip(df, fp, expected=expected) | ||
|
||
+ @pytest.mark.skipif(using_copy_on_write(), reason="fastparquet writes into Index") | ||
def test_timezone_aware_index(self, fp, timezone_aware_date_list): | ||
idx = 5 * [timezone_aware_date_list] | ||
|
||
|
@@ -1328,6 +1334,7 @@ def test_invalid_dtype_backend(self, engine): | |
with pytest.raises(ValueError, match=msg): | ||
read_parquet(path, dtype_backend="numpy") | ||
|
||
+ @pytest.mark.skipif(using_copy_on_write(), reason="fastparquet writes into Index") | ||
def test_empty_columns(self, fp): | ||
# GH 52034 | ||
df = pd.DataFrame(index=pd.Index(["a", "b", "c"], name="custom name")) | ||
|
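The fastparquet skips and the xfail above exist because a consumer writes into the array it receives from the index. As a rough illustration of how such a consumer could avoid that, here is a NumPy-only sketch that copies when the input is read-only. This is an assumption for illustration, not what fastparquet does and not the flag-setting fix mentioned in the thread; `ensure_writable` is a hypothetical helper name.

```python
import numpy as np


def ensure_writable(values):
    """Hypothetical helper: return `values` itself if writable, else a writable copy."""
    if values.flags.writeable:
        return values
    return values.copy()


# Assumption: stand-in for the read-only array an Index hands out under CoW.
readonly = np.arange(3)
readonly.flags.writeable = False

scratch = ensure_writable(readonly)
scratch[0] = 99  # mutates the private copy only

writable = np.arange(3)
same = ensure_writable(writable)  # no copy needed for an already writable array
```

Copying only on the read-only path keeps the existing no-copy behaviour for writable inputs while leaving the caller's index data untouched.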
Review comment: This somewhat defeats the purpose of the original test (and I don't think we need to test here that assigning into a read-only NumPy array raises an error).
So maybe just remove it? (We already check `np.shares_memory`.) Or manually set the writeable flag to True and then assign the value.
But actually, now that we keep track of references to Index data as well, the original setitem doesn't really need to do a copy, I think? (That is for another issue/PR.)
Reply: Yeah, let's rip it out.
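For reference, the two alternatives raised in this thread (checking `np.shares_memory`, or re-enabling the writeable flag before assigning) can be sketched with plain NumPy. This is a sketch under assumptions: `backing`, `values` and `copied` are hypothetical stand-ins for the Index's internal data, the read-only `Index.values` result, and the data of an Index built with `copy=True`. It does not reproduce the pandas test itself.

```python
import numpy as np

# Assumption: stand-in for the data an Index holds internally.
backing = np.array([1, 2, 3], dtype=np.int64)

# Assumption: a read-only view, similar in spirit to Index.values under CoW.
values = backing.view()
values.flags.writeable = False

# Stand-in for constructing a new Index from `values` with copy=True.
copied = values.copy()

# Alternative 1: the memory check the thread says is already done.
shares = np.shares_memory(values, copied)

# Alternative 2: manually set the writeable flag, then assign.
# NumPy allows this here because the view's base array is itself writeable.
values.flags.writeable = True
values[0] = values[0] + 3000
```

After the assignment, `copied[0]` keeps its original value, which is the property the original test was meant to verify.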