PERF: asv for select_dtypes #14588
Comments
You would have to show a full example and the output of pd.show_versions(), as the instructions indicate.
The slowest run took 13.72 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 3.41 ms per loop

INSTALLED VERSIONS
commit: None
pandas: 0.19.0
@simonm3 You might want to take a look at the definition in https://github.com/pandas-dev/pandas/blob/2e77536bdf90ef20fefd4eab751447918e07668f/pandas/core/frame.py and maybe do some profiling to see where the time is spent. Before you do any profiling or more benchmarking, make sure the results are equivalent. For example,
On master (post 0.19.1). @simonm3, your comparison is not apt, as you are selecting not just the column names but the actual data itself. All of that said, I will repurpose this issue to have some asv benchmarks for this.
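For anyone picking this up, here is a rough sketch of what an asv-style benchmark for select_dtypes could look like. It only follows the general asv convention (a class with `setup` and `time_*` methods, plus `params`). The class name, column counts, and frame layout are my own assumptions, not what ended up in the pandas benchmark suite.

```python
import numpy as np
import pandas as pd


class SelectDtypes:
    # Hypothetical benchmark class; asv runs each time_* method once per
    # value in params, passing that value to setup and to the method.
    params = [100, 1000]
    param_names = ["n_columns"]

    def setup(self, n_columns):
        # Assumed layout: alternating float and bool columns, few rows,
        # so column count dominates the cost rather than data volume.
        n_rows = 10
        data = {}
        for i in range(n_columns):
            if i % 2 == 0:
                data[f"num_{i}"] = np.random.randn(n_rows)
            else:
                data[f"flag_{i}"] = np.zeros(n_rows, dtype=bool)
        self.df = pd.DataFrame(data)

    def time_select_dtypes_number(self, n_columns):
        # Returns a new DataFrame holding the numeric columns.
        self.df.select_dtypes(include=[np.number])

    def time_column_names_only(self, n_columns):
        # Returns only the column labels; no data is selected.
        [c for c in self.df.columns if np.issubdtype(self.df[c].dtype, np.number)]
```

Keeping both methods in one class makes it easy to see how much of the gap comes from building the result frame rather than from the dtype check itself.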
Still extremely slow as of pandas 0.25.3.
take
Why is select_dtypes so slow?
%timeit [col for col in df.columns if np.issubdtype(df[col].dtype, np.number)]
453 microsecs per loop
%timeit df.select_dtypes(include=[np.number])
4.58 secs per loop
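Note that the two statements above do not return the same thing: the list comprehension yields column names, while select_dtypes yields a DataFrame. A minimal like-for-like sketch, assuming a small made-up frame (the frame behind the timings above is not shown), indexes with the names first and then checks that both results match before timing anything:

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration only.
df = pd.DataFrame(
    {
        "a": np.arange(5, dtype="int64"),
        "b": np.linspace(0.0, 1.0, 5),
        "c": np.array([True, False, True, False, True]),
        "d": pd.date_range("2016-01-01", periods=5),
    }
)

# Step 1: column names whose dtype is a NumPy numeric dtype.
numeric_cols = [col for col in df.columns if np.issubdtype(df[col].dtype, np.number)]

# Step 2: actually select the data, which is what select_dtypes also does.
via_names = df[numeric_cols]
via_select = df.select_dtypes(include=[np.number])

# Confirm the two approaches agree before comparing their timings.
pd.testing.assert_frame_equal(via_names, via_select)
```

With that in place, the fair comparison is `%timeit df[numeric_cols_expression]` against `%timeit df.select_dtypes(include=[np.number])`, where the first includes both the name filtering and the indexing step.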