PERF: astype(str) on object dtypes GH8732 #8971

Closed
wants to merge 1 commit into from

@vikram vikram commented Dec 2, 2014

Closes #8732

In most cases it looks like we need to iterate over the array
and coerce each element, so that the appropriate exception
can be raised or nulls can be handled.
The original case of casting ints to strings therefore has to
work the way it does, unless we change the underlying behaviour.
When astype(str) is called on ints, each element is first
cast to a string and then stored in a numpy object array.
If we relied on numpy, it wouldn't cast the elements to strings;
it would just return them as objects. That would break existing behaviour.
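
A minimal sketch of the element-wise behaviour described above, using plain numpy. The helper name `astype_str_sketch` is hypothetical and this is not the pandas implementation; it only illustrates why a per-element loop yields an object array of Python strings, while numpy's own casts give different results.

```python
import numpy as np


def astype_str_sketch(values):
    """Hypothetical helper: cast each element to str, store in an object array."""
    arr = np.asarray(values)  # assumed to be 1-D
    out = np.empty(len(arr), dtype=object)
    for i, v in enumerate(arr):
        out[i] = str(v)  # per-element coercion, the slow path
    return out


ints = np.array([1, 2, 3])

looped = astype_str_sketch(ints)       # object array holding Python str values
as_object = ints.astype(object)        # object array, but elements are still ints
as_numpy_str = ints.astype(str)        # fixed-width unicode dtype, not object
```

Here `looped` has dtype `object` and holds `str` elements, `as_object` keeps the elements as integers, and `as_numpy_str` has a fixed-width unicode dtype, so neither numpy cast alone reproduces the loop's result.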

It is possible to bypass iterating over the array when we are
coercing to int, provided that there are no NaNs and the array
has a numeric dtype.
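
The bypass could look roughly like the sketch below. This is an illustration under stated assumptions, not the code in this PR: the function name `astype_int_sketch` is hypothetical, the input is assumed to be 1-D, and the target is fixed to `int64` for simplicity.

```python
import numpy as np


def astype_int_sketch(values):
    """Hypothetical helper: cast to int64, skipping the loop when it is safe."""
    arr = np.asarray(values)  # assumed to be 1-D

    # Fast path: integer dtypes can always be cast in a single numpy call.
    if arr.dtype.kind in "iu":
        return arr.astype(np.int64)
    # Fast path: floats are safe only when there are no NaNs (or infinities).
    if arr.dtype.kind == "f" and np.isfinite(arr).all():
        return arr.astype(np.int64)

    # Slow path: coerce element by element so bad values raise an exception.
    out = np.empty(len(arr), dtype=np.int64)
    for i, v in enumerate(arr):
        if v is None or (isinstance(v, float) and v != v):
            raise ValueError(f"cannot convert missing value at position {i} to int")
        out[i] = int(v)
    return out
```

For example, `astype_int_sketch([1.0, 2.0])` takes the fast path, while an object array such as `np.array(["3", "4"], dtype=object)` falls back to the loop, and an input containing NaN raises `ValueError`.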

@jreback
Contributor

jreback commented Dec 3, 2014

not sure what this is actually fixing. All of the tests you show work AFAICT in master.

@jorisvandenbossche
Member

@jreback #8732 is not a bug, but a performance issue

@vikram can you show the output of the benchmark?

@jreback jreback added the Performance and Strings labels Jan 18, 2015
@jreback
Contributor

jreback commented Jan 18, 2015

@vikram can you show the benchmark results?

@jreback jreback changed the title from "PERF:Partially fixes GH8732" to "PERF: astype(str) on object dtypes GH8732" Jan 18, 2015
@jreback
Contributor

jreback commented Mar 25, 2015

@vikram can you show the vbench results for this?

@jreback
Contributor

jreback commented May 9, 2015

closing pls reopen if/when updated

Labels
Performance, Strings
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PERF: directly astype with numpy if series is already nansafe
3 participants