Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this writes into the underlying array, this is a change in behavior. Not sure if this is desireable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phofl Thanks for replying. per docstring under
update
that's what it's doing.Modify in place using non-NA values from another DataFrame. Aligns on indices. There is no return value.
it's just not doing in an efficient manner, since after 1.4.0 the setitem is being used for assignment, this needs to be optimized.
do you mind tagging other core contributor who you may think relavant to this code to take a look at this.
I really think this should be the way doing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phofl is right
this is an api change and cannot be back ported
nor is the right way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
repeated updates are not idiomatic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback thanks for replying, I don't want to just jump in and change the API fundamentally but I want to understand you. the thing I am trying to fix is to speed up the operation. but didn't mean to have other impacts.
to me,
df[col] = "a"
vsdf.loc[:, col]="a"
doesn't have much difference but the latter is much faster. this is only thing I meant to change. as regards torepeated updates are not idiomatic
, i think this is what the existing code is doing(in a for loop), and since docstring says it is to update it in place, isn't the change just right on its purpose? but thedf[col] = "a"
is doing copy under the hood and it also slow it down.Would like to hear you thoughts.