
PERF: better dtype inference for perf gains (GH7332) #7342


Merged: 1 commit into pandas-dev:master on Jun 4, 2014

Conversation

jreback
Contributor

@jreback jreback commented Jun 4, 2014

closes #7332

For some reason int64 was not being looked up correctly in the _TYPE_MAP table.
Changing the lookup to work on the dtype name/kind fixes that and makes it more efficient.
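
To illustrate the idea of keying the lookup on a dtype's kind rather than on a type object, here is a minimal sketch. It is not the code from this PR: `_KIND_TO_LABEL` and `infer_label` are hypothetical names, and the labels are only assumed to resemble what an inference table might return.

```python
import numpy as np

# Hypothetical table keyed by dtype.kind (a one-character code),
# not by a NumPy type object. Names and labels are illustrative only.
_KIND_TO_LABEL = {
    "i": "integer",
    "u": "integer",
    "f": "floating",
    "c": "complex",
    "b": "boolean",
    "M": "datetime64",
    "m": "timedelta64",
}


def infer_label(values):
    """Return a coarse type label from the array's dtype alone.

    Returns None when the dtype kind is not in the table (for example
    object arrays), in which case a caller would have to inspect the
    elements instead.
    """
    dtype = np.asarray(values).dtype
    return _KIND_TO_LABEL.get(dtype.kind)
```

With a table like this, an int64 array is classified from `dtype.kind` in one dictionary lookup, without scanning the values.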

Separately, timedelta conversions from i8 (int64) were not being inferred well,
so handling them properly yields a big boost.
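
As a rough illustration of why recognizing an i8 input can be so much cheaper, the sketch below reinterprets an int64 array of nanosecond counts as timedelta64[ns] with a NumPy view instead of converting element by element. The helper name `i8_to_timedelta_ns` is hypothetical, and this is an assumption about the general technique, not the code path changed in this PR.

```python
import numpy as np


def i8_to_timedelta_ns(values):
    """Reinterpret int64 nanosecond counts as timedelta64[ns].

    Hypothetical helper: a view reuses the same 8-byte buffer, so no
    per-element conversion or copy takes place.
    """
    arr = np.asarray(values)
    if arr.dtype != np.dtype("int64"):
        raise TypeError("expected an int64 array")
    return arr.view("m8[ns]")


nanos = np.array([0, 1_000_000_000, 90_000_000_000], dtype="int64")
deltas = i8_to_timedelta_ns(nanos)
```

Here `deltas` shares memory with `nanos`, and the cost does not depend on inspecting each value.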

--------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms]  |  ratio   |
--------------------------------------------------------------------------------
dtype_infer_timedelta64_2                    |   3.9814 | 1765.3640 |   0.0023 |
dtype_infer_datetime64                       |   4.8450 | 1825.3207 |   0.0027 |
dtype_infer_int64                            |   0.7904 |   18.5820 |   0.0425 |
dtype_infer_float64                          |   0.8090 |    1.0493 |   0.7710 |
dtype_infer_float32                          |   0.3430 |    0.3823 |   0.8971 |
dtype_infer_int32                            |   0.3897 |    0.4017 |   0.9701 |
dtype_infer_uint32                           |   0.5450 |    0.5583 |   0.9762 |
dtype_infer_timedelta64_1                    | 118.2277 |  117.6037 |   1.0053 |
--------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster than the baseline.
Seed used: 1234

Target [e2f5a50] : PERF: use names/kinds for dtype inference on known types
Base   [87660ef] : Merge pull request #6968 from ahlmss/AHLMSS_0.13.1_ahl1
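
For readers unfamiliar with this output, the ratio column is consistent with head time divided by base time. The sketch below reproduces three of the reported ratios from the timings in the table; the `ratio` and `speedup` helpers are hypothetical names, not part of the benchmark tool.

```python
def ratio(head_ms, base_ms):
    """Head time over base time; below 1.0 means head is faster."""
    return head_ms / base_ms


def speedup(head_ms, base_ms):
    """Inverse of the ratio: how many times faster head is than base."""
    return base_ms / head_ms


# Timings (ms) copied from the benchmark table: (head, base).
timings = {
    "dtype_infer_timedelta64_2": (3.9814, 1765.3640),
    "dtype_infer_int64": (0.7904, 18.5820),
    "dtype_infer_float64": (0.8090, 1.0493),
}

for name, (head, base) in timings.items():
    print(f"{name}: ratio={ratio(head, base):.4f}, "
          f"speedup={speedup(head, base):.1f}x")
```

Reading the ratio as its inverse makes the size of the gain easier to see for the timedelta64 and datetime64 cases.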

PERF: recognize int64 to timedelta64[ns] conversions and perform faster

PERF: use names/kinds for dtype inference on known types

DOC: performance docs
@hayd
Contributor

hayd commented Jun 4, 2014

wow, quite a perf improvement!

jreback added a commit that referenced this pull request Jun 4, 2014
PERF: better dtype inference for perf gains (GH7332)
@jreback jreback merged commit 3b0c542 into pandas-dev:master Jun 4, 2014
@dsm054
Contributor

dsm054 commented Jun 4, 2014

FWIW it worked for me to fix this symptom on my 64-bit box; my 32-bit isn't accessible, but I'll give it a go later, just for sanity purposes.

@jreback
Contributor Author

jreback commented Jun 4, 2014

gr8; I merged into master. lmk if any other issues. thanks for the report!

Labels
Dtype Conversions (unexpected or buggy dtype conversions), Performance (memory or execution speed performance)
Development

Successfully merging this pull request may close these issues.

strange dtype behaviour as function of series length
3 participants