BUG: DataFrame.replace is broken in 0.12-dev #4115
Comments
@cpcloud take a look?
sure while i'm fixing you can also do
ah i remember this being a bit tricky. was the last change i made before submitting the original pr. @vfilimonov thanks for the report
it would be cool to look at the pandas devs "reaction time" (time to submit a pr after a bug report)...i wonder if the gh api has such information
@cpcloud, thank you very much for taking care of this! Aside: the variant with Series seems to be much faster if there are a lot of columns of different types. When the number of columns is small the difference is not that large.
CPU times: user 0.16 s, sys: 0.01 s, total: 0.17 s
series is doing something a bit different...it uses a lot of cythonized internals to do the work at the expense of being slightly less general, e.g., it doesn't accept regex. this will most likely change in 0.13 when series becomes an ndframe since all of the replace stuff is happening at the
The reaction time is really incredible. This is the second time I have submitted a report and you guys fixed the bug in 30 minutes! Thanks a lot
@vfilimonov thanks for the kind words! we appreciate it.
@cpcloud honestly seems like it would be pretty easy to change Series internals... Only issue is that it (might be) a bit slower...
i agree w/ u. just commenting on why it's a bit faster. it uses masking via cython whereas i'm using
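To make the masking-versus-element-wise distinction concrete, here is a rough sketch in plain NumPy. It is not pandas' internal code: it assumes the "vectorize" discussed in this thread refers to `np.vectorize`, and the array and replacement value are made up for illustration.

```python
import numpy as np

# Hypothetical data: replace every 2 with 99.
values = np.array([1, 2, 3, 2, 1])

# Mask-based replacement: one boolean comparison over the whole array,
# then a single assignment through the mask.
masked = values.copy()
masked[values == 2] = 99

# Element-wise replacement: np.vectorize calls the Python function once
# per element, which is more general but typically slower.
swap = np.vectorize(lambda x: 99 if x == 2 else x)
elementwise = swap(values)
```

Both approaches give the same result here; the difference is that the mask does its work in compiled code, while the element-wise version goes through a Python-level call for each value.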
@cpcloud if pandas dropped the vectorize, it could use nearly the same functions for arithmetic for Series and Frame (and probably even for Panel).
@cpcloud that said, having code in 3 places is not a big deal...performance is more important.
there are only 3 places where vectorize is used in pandas
are you referring to use of things like
keep in mind this is all going to change with the series refactor
yes we should wait for the mother lode
@cpcloud the last 2 are because I saw u using it!
ha!
which is really just because using
DataFrame.replace seems to be broken in the 0.12 development version. This code:
Returns the correct result in 0.11.0:
and in 0.12.0.dev-795004d it does not work:
With Series.replace it works well, so the workaround for this bug is the following:
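A minimal sketch of what a column-wise workaround along these lines could look like. The frame and the replacement mapping below are hypothetical stand-ins, not the data from this report.

```python
import pandas as pd

# Hypothetical frame with mixed column types.
df = pd.DataFrame({"a": [0, 1, 2], "b": ["x", "y", "z"]})

# Hypothetical mapping of old values to new values.
to_replace = {0: 10, 2: 20}

# Call Series.replace on each column and rebuild the frame,
# instead of calling DataFrame.replace on the whole frame.
fixed = pd.DataFrame({name: df[name].replace(to_replace) for name in df.columns})
```

Rebuilding the frame from per-column results keeps the original column order and index, at the cost of one `replace` call per column.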