TST: ro_replace changing dtype of series #50664

pooja-subramaniam · 2023-01-10T19:38:24Z

closes Different behavior of to_replace method between pandas version 0.23.4 and 0.24.2 by changing dtype of series #25797
Tests added and passed
@phofl and @noatamir


noatamir · 2023-01-11T12:17:49Z

👋 @pooja-subramaniam
The pre-commit errors will be fixed if you:

Replace your single quotes with double quotes
Add a new line at the end of the file (after your test)

If you want, you can have a look at the details link to see what the errors look like, so you can identify them next time 😉.

In the future, I recommend making a feature branch in your fork, rather than making a PR from your fork's main branch. You can find guidance for this in our documentation (here).

pooja-subramaniam · 2023-01-11T13:27:28Z

thanks for the points @noatamir!

pooja-subramaniam · 2023-01-11T16:26:15Z

@noatamir I don't understand the reason for the canceled check here. Do you know why that happened?

phofl

You can ignore cancelled pipelines if there is only one of them. This happens once in a while.

phofl · 2023-01-11T16:28:56Z

pandas/tests/series/methods/test_replace.py

+    def test_replace_change_dtype_series(self):
+        # GH#25797
+        df = pd.DataFrame.from_dict({"Test": ["0.5", True, "0.6"]})
+        print(df["Test"].dtype)


Some pointers: Please remove the print statements.

Generally, you can remove the first assert statement in each block. We want to compare the whole df in the second assert statement, i.e. define your expected object and then call assert_frame_equal:

expected = ...
tm.assert_frame_equal(df, expected)

If expected is the same as your initial df, you can define it at the beginning through

expected = df.copy()
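As an illustration of this suggestion (not part of the original review), here is a minimal sketch of the `expected = df.copy()` pattern. It assumes the public `pandas.testing.assert_frame_equal` in place of the `tm` alias used inside pandas' own test suite, and the value `"missing"` is a hypothetical placeholder chosen because it does not occur in the column.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Same mixed-type column as in the test under review.
df = pd.DataFrame.from_dict({"Test": ["0.5", True, "0.6"]})

# Snapshot the frame first. This is only valid because the operation
# below is expected to leave df unchanged.
expected = df.copy()

# "missing" is a hypothetical value that does not occur in the column,
# so replace() should have nothing to replace here.
df["Test"] = df["Test"].replace(["missing"], ["x"])

# Compares values, column labels, index and dtype in one call.
assert_frame_equal(df, expected)
```

Taking the copy before the operation matters: a copy taken afterwards would trivially equal the result and the assertion could never fail.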

pooja-subramaniam

Thanks! Made the changes.

Correct me if I am wrong, but here the assert needs to check whether the series dtype remains 'object', not whether the dataframes are equal. I've made changes accordingly.

phofl

tm.assert_frame_equal raises if the dtypes are different, so this is covered as well. In general, we want to check that the DataFrame is as expected (values, columns, index, dtype, ...). That is why we use assert_frame_equal.

If you only check the dtypes, the functions could modify the values unexpectedly and the test would not fail.

pooja-subramaniam

Ah okay! Thanks for explaining :)
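The point made in this thread can be sketched as follows. This is an illustration under stated assumptions, not code from the PR: `wrong_values` and `wrong_dtype` are hypothetical results invented for the comparison, `frames_match` is a hypothetical helper, and the public `pandas.testing.assert_frame_equal` stands in for `tm.assert_frame_equal`.

```python
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal


def frames_match(left, right):
    """Hypothetical helper: True if assert_frame_equal does not raise."""
    try:
        assert_frame_equal(left, right)
    except AssertionError:
        return False
    return True


# Expected result: object column with a missing value in the middle.
expected = pd.DataFrame(
    {"Test": pd.Series(["0.5", np.nan, "0.6"], dtype=object)}
)

# Hypothetical buggy result: dtype is still object, but one value changed.
wrong_values = pd.DataFrame(
    {"Test": pd.Series(["0.5", np.nan, "0.7"], dtype=object)}
)

# Hypothetical buggy result: the column was converted to float64.
wrong_dtype = pd.DataFrame({"Test": [0.5, np.nan, 0.6]})

# A dtype-only check cannot see the changed value.
dtype_only_passes = wrong_values["Test"].dtype == expected["Test"].dtype

# The full comparison flags both the changed value and the changed dtype.
values_caught = not frames_match(wrong_values, expected)
dtype_caught = not frames_match(wrong_dtype, expected)
```

Here the dtype-only check accepts `wrong_values`, while the frame comparison rejects both hypothetical results, which is why the review asks for `assert_frame_equal`.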


phofl · 2023-01-11T21:30:23Z

pandas/tests/series/methods/test_replace.py

+        # GH#25797
+        df = pd.DataFrame.from_dict({"Test": ["0.5", True, "0.6"]})
+        df["Test"] = df["Test"].replace([True], [np.nan])
+        expected = pd.DataFrame.from_dict({"Test": ["0.5", np.nan, "0.6"]})


One last comment: Could you define expected only once after line 681 and reuse it for all assert_frame_equal statements? It's not necessary to define it multiple times when it's the same object :)

pooja-subramaniam

Right! How did I miss that one! Thanks again ;)
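A rough sketch of what defining `expected` once and reusing it could look like. This is not the merged test: the function name, the `make_df` helper, and the choice of the three `replace` call forms are assumptions made for illustration, and `expected` is built with an explicit object dtype so the comparison does not depend on how a given pandas version infers string columns.

```python
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal


def make_df():
    # Hypothetical helper: a fresh copy of the input used in the PR's test.
    return pd.DataFrame.from_dict({"Test": ["0.5", True, "0.6"]})


def test_replace_keeps_object_dtype():
    # GH#25797 (hypothetical test name, illustrative only)
    # Defined once and reused by every assertion below.
    expected = pd.DataFrame(
        {"Test": pd.Series(["0.5", np.nan, "0.6"], dtype=object)}
    )

    # List-to-list form, as in the diff above.
    df = make_df()
    df["Test"] = df["Test"].replace([True], [np.nan])
    assert_frame_equal(df, expected)

    # Scalar form.
    df = make_df()
    df["Test"] = df["Test"].replace(True, np.nan)
    assert_frame_equal(df, expected)

    # Dict form.
    df = make_df()
    df["Test"] = df["Test"].replace({True: np.nan})
    assert_frame_equal(df, expected)
```

Because `expected` is never mutated, sharing one object across the assertions is safe and keeps the three blocks easy to compare.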

phofl · 2023-01-12T09:31:40Z

thx @pooja-subramaniam

pooja-subramaniam added 2 commits January 10, 2023 19:29

added test based on issue pandas-dev#25797 to check if changing dtype of series stays object

81c9858

Merge remote-tracking branch 'upstream/main'

6e50525

phofl added the Sprints Sprint Pull Requests label Jan 10, 2023

Merge branch 'main' of https://github.com/pandas-dev/pandas

d21d88e

pooja-subramaniam added 3 commits January 11, 2023 13:45

Merge branch 'pandas-dev:main' into main

b2621b1

changed single quotes to double and added new line at the end

83d9801

Merge branch 'main' of https://github.com/pooja-subramaniam/pandas

e269bb7

phofl reviewed Jan 11, 2023


pooja-subramaniam added 2 commits January 11, 2023 17:47

remove print statements and unnecessary assert, and use expected result format for assert

fe3e454

change assert statements to compare dataframes rather than only dtype

ca92c77

phofl reviewed Jan 11, 2023


remove redundant expected assignment statements

250c044

phofl approved these changes Jan 12, 2023


phofl changed the title ~~test added for issue #25797~~ TST: ro_replace changing dtype of series Jan 12, 2023

phofl added this to the 2.0 milestone Jan 12, 2023

phofl added the replace replace method label Jan 12, 2023

phofl merged commit dc7f851 into pandas-dev:main Jan 12, 2023
