Feature - Adding Dataframe applicability to some Series string methods #22911
It's not clear what should happen when you have a mix of numeric and string dtypes. We typically recommend something like

    columns = data.select_dtypes("object").columns
    data[columns] = data[columns].apply(lambda s: s.str.replace(" ", ","))

for applying the same column-wise transformation to a subset of the columns.
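The recommendation above can be tried on a small frame. This is only a minimal sketch: the sample data and column names are made up, and `exclude="number"` is used here as a stand-in for selecting the string columns, on the assumption that every non-numeric column holds text.

```python
import pandas as pd

# Made-up sample data: two text columns and one numeric column.
data = pd.DataFrame({
    "name": ["a b", "c d"],
    "city": ["x y", "z"],
    "count": [1, 2],
})

# Pick the non-numeric columns (assumed here to all contain strings).
columns = data.select_dtypes(exclude="number").columns

# Apply the same Series.str method to each selected column.
data[columns] = data[columns].apply(
    lambda s: s.str.replace(" ", ",", regex=False)
)

print(data)
```

The numeric column is never touched, which sidesteps the question of what a string method should do with mixed dtypes.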
It's equally unclear what anyone would do with zfill on a Series of strings. In practice it is more of a numeric-Series method, yet it is only applicable to string Series. My point is that these methods should be made available either on numeric Series as well or, better, on the whole DataFrame.
(In reply to Tom Augspurger's comment of Mon, 1 Oct 2018, 1:21 am.)
Ah, in that case I think that #17211 may cover everything you're asking for here. Can you confirm?
Not really. I am proposing a parameter on the string methods so that they can be applied to DataFrames directly too. It is not about .astype() as such; .astype() is only one part of it, since the data would be converted to str before the string operation is applied and then converted back to its original dtype.
(In reply to Tom Augspurger's comment of Tue, 9 Oct 2018, 1:57 am.)
Many methods, zfill() for example, don't have much practical use on a string Series. Why would someone pad a string with zeros? In 90% of cases it is a numeric Series that gets converted to string, has the method applied, and is then converted back to int. Many similar methods are restricted to strings only and likewise have little practical use there.
Given that, an option to apply the method to a whole DataFrame, instead of doing it separately for each column, would help a lot.
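For reference, the manual round trip described here looks roughly like the following with the existing API. The Series values are invented for illustration.

```python
import pandas as pd

# Invented numeric identifiers.
ids = pd.Series([7, 42, 123])

# Convert to text, then left-pad with zeros to a width of 5.
padded = ids.astype(str).str.zfill(5)

# Converting back to the original integer dtype parses the digits again.
restored = padded.astype(ids.dtype)

print(padded.tolist())
print(restored.tolist())
```

Note that the padding only exists while the values are strings; once they are converted back to integers, the leading zeros are gone.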
(Following up on Kartikay Bhutani's reply of Tue, 9 Oct 2018, 2:14 am.)
That discussion moved on from `.astype` to a more dedicated `.format` method
(In reply to Kartikay Bhutani's comment of Mon, Oct 8, 2018, 3:48 PM.)
Hello,
Currently the string methods, such as replace, lower, zfill, and strip, are restricted to Series.
It would be good to add a parameter so that they can be used on DataFrames too. Methods like strip won't affect numeric columns, since those contain no whitespace to begin with. If a method could affect a numeric column, that column can be left out with an exclude parameter (which should also be added).
A simple way of doing it is demonstrated below.
In this example, the method works fine with Series of all dtypes, and after the method has been applied successfully, the columns are converted back to their original dtype.
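The original demonstration did not survive in this copy of the thread. The following is only a sketch of what such a round trip might look like, not the author's code: `apply_str_method` and its `exclude` argument are hypothetical names and are not part of pandas.

```python
import pandas as pd


def apply_str_method(df, method, *args, exclude=(), **kwargs):
    """Hypothetical helper (not a pandas API): apply a Series.str method
    to every column of df except those listed in exclude."""
    result = df.copy()
    for name in df.columns:
        if name in exclude:
            continue
        original_dtype = df[name].dtype
        # Convert the column to text so the .str accessor is available.
        as_text = df[name].astype(str)
        transformed = getattr(as_text.str, method)(*args, **kwargs)
        try:
            # Convert back to the dtype the column started with.
            result[name] = transformed.astype(original_dtype)
        except (ValueError, TypeError):
            # The transformed text no longer parses as the original dtype,
            # so keep it as text instead.
            result[name] = transformed
    return result


# Made-up example frame with one text and one integer column.
df = pd.DataFrame({"name": ["  Ann ", "Bob  "], "age": [31, 45]})

stripped = apply_str_method(df, "strip")
upper = apply_str_method(df, "upper", exclude=["age"])

print(stripped)
print(upper)
```

Falling back to text when the conversion fails is one possible design choice; an actual implementation would need to decide how to handle that case, as well as mixed dtypes.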
If this issue is approved, I would like to work on and contribute this feature.