Fix Series.astype() to return a Series[type] dependent on argument #372

Closed
Dr-Irv opened this issue Oct 7, 2022 · 3 comments · Fixed by #519

Comments

@Dr-Irv
Collaborator

Dr-Irv commented Oct 7, 2022

Deprecations in pandas 1.5 affecting Series.astype() revealed that the stubs for that method need work.

See #227 (comment)

Ideally, things like s.astype(int) and s.astype("int") would both return Series[int]. We have to deal with each of the possible types, as well as the possible strings.

@ramvikrams
Contributor

So we just have to change the return type to Series[S1]

@Dr-Irv
Collaborator Author

Dr-Irv commented Jan 24, 2023

> So we just have to change the return type to Series[S1]

No, it's more nuanced than that, because you can pass strings to astype() that correspond to dtypes, and there are dtype arguments that are not in the S1 typevar. So you need a lot of overloads to do the mappings properly. You also have to handle a variety of numpy types.
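To make the overload idea concrete, here is a minimal sketch using a toy class rather than the real stubs. `ToySeries`, its `values` attribute, and the particular string aliases chosen are assumptions made for illustration; the actual pandas-stubs overloads cover many more arguments, including numpy types.

```python
from typing import Generic, List, Literal, Type, TypeVar, Union, overload

S1 = TypeVar("S1")


class ToySeries(Generic[S1]):
    """Hypothetical stand-in for Series; not the pandas-stubs class."""

    def __init__(self, values: List[S1]) -> None:
        self.values = values

    # One overload per target element type. Each accepts both the Python
    # type object and the string spellings that map to that element type.
    @overload
    def astype(
        self, dtype: Union[Type[int], Literal["int", "int64"]]
    ) -> "ToySeries[int]": ...
    @overload
    def astype(
        self, dtype: Union[Type[float], Literal["float", "float64"]]
    ) -> "ToySeries[float]": ...
    @overload
    def astype(
        self, dtype: Union[Type[str], Literal["str"]]
    ) -> "ToySeries[str]": ...
    def astype(self, dtype):
        # Runtime behaviour of the toy only: look up a converter by argument.
        converters = {
            int: int, "int": int, "int64": int,
            float: float, "float": float, "float64": float,
            str: str, "str": str,
        }
        convert = converters[dtype]
        return ToySeries([convert(v) for v in self.values])


s = ToySeries([1.5, 2.5])
as_int = s.astype(int)        # intended static type: ToySeries[int]
as_int_str = s.astype("int")  # intended static type: ToySeries[int]
print(as_int.values, as_int_str.values)
```

The point of the sketch is that the type object and its string spelling land in the same overload, so a type checker can give both calls the same parameterized return type.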

I'd suggest making a table with the values of S1 in one column and, in the second column, all the possible arguments for astype() that could generate that value of S1. Then, from that table, you build a set of overloads.

There might be cases where there is no possible value for the astype() argument that gives a particular value in S1. We'll need to review them one by one.
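As a rough illustration of the table approach, the sketch below keeps the table as a plain dictionary, inverts it into an argument-to-type lookup (one entry per overload parameter), and lists the rows that have no known argument. The rows shown are a small, assumed subset, and `"placeholder_type"` is a made-up row standing in for any S1 member that turns out to have no matching astype() argument.

```python
# Hypothetical, partial table: S1 member name -> astype() arguments that
# are expected to produce it. The real table would be much longer.
ASTYPE_TABLE = {
    "bool": [bool, "bool"],
    "int": [int, "int", "int64"],
    "float": [float, "float", "float64"],
    "str": [str, "str"],
    "placeholder_type": [],  # made-up row with no known argument
}


def invert(table):
    """Map each astype() argument to the S1 member it should produce."""
    lookup = {}
    for target, args in table.items():
        for arg in args:
            if arg in lookup:
                # An argument listed under two rows would mean two
                # conflicting overloads, so fail loudly.
                raise ValueError(f"{arg!r} listed under two S1 members")
            lookup[arg] = target
    return lookup


def rows_needing_review(table):
    """Return the S1 members that no astype() argument maps to."""
    return [target for target, args in table.items() if not args]


lookup = invert(ASTYPE_TABLE)
needs_review = rows_needing_review(ASTYPE_TABLE)
print(lookup[int], lookup["float64"], needs_review)
```

Each non-empty row then corresponds to one overload whose parameter type is the union of that row's arguments, and the empty rows are the ones to go through by hand.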

@ramvikrams
Contributor

I'll get started on it now, then.

Dr-Irv pushed a commit that referenced this issue Feb 25, 2023
* adding overloads to astype

* Created the table for astype

* Update table.rst

* Updated the table and added numpy dtypes

* Update table.rst

* updated np.datetime64

* Update table.rst

* added types in Timedelta

* removed not required args in Dtype

* removed np.timedelta64 in Timedelta

* Removed timedelta64

* expanding series astype

* Added type in args in dtype

* corrected the args

* adding an overload for 'category' and normal changes

* added tests

* removed unused args

* corrected tests

* Delete table.rst

* added the bool overload to the top and made the required test changes

* added type_checker

* added types for check and did requested changes

* updated the check types

* added astype in dataframe and other changes

* Update test_series.py

* Update test_series.py

* added dict test for astype in datatest_frame and tests for ExtensionDtype in test_series

* commented out the decimal tests

* Update test_series.py

* updated dtype args in astype

* added any to list of args for astype

* changed dtype args
twoertwein pushed a commit to twoertwein/pandas-stubs that referenced this issue Apr 1, 2023