-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
PERF: DataFrame(ndarr, copy=True) #52440
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PERF: DataFrame(ndarr, copy=True) #52440
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-1 on this without further investigation, this caused significant slowdowns in a lot of benchmarks when I looked at it a couple of weeks ago.
Please run them locally and report result
Edit: found my result
before after ratio
[26999ebe] [47ed11f5]
+ 763±6μs 1.68±0.1ms 2.20 arithmetic.Ops.time_frame_add(False, 1)
+ 768±30μs 1.67±0.08ms 2.18 arithmetic.Ops.time_frame_add(False, 'default')
+ 768±30μs 1.61±0.01ms 2.10 arithmetic.Ops.time_frame_mult(False, 'default')
+ 783±20μs 1.57±0.01ms 2.01 arithmetic.Ops.time_frame_mult(False, 1)
+ 784±30μs 1.33±0.03ms 1.69 arithmetic.MixedFrameWithSeriesAxis.time_frame_op_with_series_axis0('add')
+ 789±6μs 1.32±0.02ms 1.67 arithmetic.MixedFrameWithSeriesAxis.time_frame_op_with_series_axis0('sub')
+ 126±1μs 210±3μs 1.67 arithmetic.Ops2.time_frame_series_dot
+ 485±2μs 786±10μs 1.62 arithmetic.MixedFrameWithSeriesAxis.time_frame_op_with_series_axis0('eq')
+ 485±3μs 783±20μs 1.61 arithmetic.MixedFrameWithSeriesAxis.time_frame_op_with_series_axis0('ge')
+ 488±8μs 784±20μs 1.61 arithmetic.MixedFrameWithSeriesAxis.time_frame_op_with_series_axis0('le')
+ 495±10μs 789±30μs 1.59 arithmetic.MixedFrameWithSeriesAxis.time_frame_op_with_series_axis0('lt')
+ 491±10μs 767±10μs 1.56 arithmetic.MixedFrameWithSeriesAxis.time_frame_op_with_series_axis0('gt')
+ 514±3μs 785±20μs 1.53 arithmetic.MixedFrameWithSeriesAxis.time_frame_op_with_series_axis0('ne')
+ 1.42±0.02ms 2.09±0.02ms 1.46 arithmetic.Ops.time_frame_mult(True, 1)
+ 1.19±0.04ms 1.69±0.02ms 1.42 arithmetic.MixedFrameWithSeriesAxis.time_frame_op_with_series_axis0('truediv')
+ 971±40μs 1.34±0.04ms 1.38 arithmetic.MixedFrameWithSeriesAxis.time_frame_op_with_series_axis0('mul')
+ 1.74±0.01ms 2.39±0.1ms 1.37 arithmetic.Ops.time_frame_add(True, 1)
+ 43.7±0.2μs 56.0±0.2μs 1.28 arithmetic.NumericInferOps.time_add(<class 'numpy.uint8'>)
+ 43.6±0.1μs 55.4±0.2μs 1.27 arithmetic.NumericInferOps.time_multiply(<class 'numpy.int8'>)
+ 43.8±0.3μs 55.1±0.2μs 1.26 arithmetic.NumericInferOps.time_multiply(<class 'numpy.uint8'>)
+ 44.3±0.6μs 55.4±0.4μs 1.25 arithmetic.NumericInferOps.time_subtract(<class 'numpy.uint8'>)
+ 44.2±1μs 55.3±0.1μs 1.25 arithmetic.NumericInferOps.time_add(<class 'numpy.int8'>)
+ 967±20μs 1.20±0.08ms 1.24 arithmetic.Ops.time_frame_add(True, 'default')
+ 58.3±0.2μs 70.5±0.3μs 1.21 arithmetic.NumericInferOps.time_add(<class 'numpy.int16'>)
+ 58.5±0.1μs 70.4±0.5μs 1.20 arithmetic.NumericInferOps.time_multiply(<class 'numpy.int16'>)
+ 58.6±0.6μs 70.4±0.4μs 1.20 arithmetic.NumericInferOps.time_subtract(<class 'numpy.uint16'>)
+ 58.9±0.2μs 70.3±0.2μs 1.19 arithmetic.NumericInferOps.time_multiply(<class 'numpy.uint16'>)
+ 968±40μs 1.14±0.03ms 1.17 arithmetic.Ops.time_frame_mult(True, 'default')
+ 60.1±1μs 70.2±0.3μs 1.17 arithmetic.NumericInferOps.time_add(<class 'numpy.uint16'>)
- 45.9±2ms 35.3±0.4ms 0.77 arithmetic.BinaryOpsMultiIndex.time_binary_op_multiindex('add')
- 45.9±2ms 35.1±0.6ms 0.76 arithmetic.BinaryOpsMultiIndex.time_binary_op_multiindex('sub')
- 46.2±1ms 35.2±0.4ms 0.76 arithmetic.BinaryOpsMultiIndex.time_binary_op_multiindex('div')
- 46.1±2ms 35.0±0.4ms 0.76 arithmetic.BinaryOpsMultiIndex.time_binary_op_multiindex('mul')
- 2.91±0.2ms 2.20±0.03ms 0.76 arithmetic.MixedFrameWithSeriesAxis.time_frame_op_with_series_axis1('sub')
- 3.09±0.5ms 2.31±0.07ms 0.75 arithmetic.MixedFrameWithSeriesAxis.time_frame_op_with_series_axis1('mul')
- 5.81±0.09ms 4.05±0.4ms 0.70 arithmetic.Ops2.time_frame_int_mod
- 748±400μs 454±3μs 0.61 arithmetic.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function ne>)
this was only a subset of all benchmarks, so further investigation necessary
I've always had trouble using asv, IDK if'm doing it wrong? for example running How long does |
Not sure I've ever ran the groupby ones, the frame benchmarks for example take around 5-10 minutes (not totally sure, would lean more towards 5) on my m2 air |
So Did you install asv like in the docs i.e. ( (I need to try figuring it out, so appreciate any help) |
I think so yes, can re-run later today. It's definitely not terribly long I used mamba (came when installing environment.yml) |
5 was a bit optimistic, ran around 9ish minutes. 2 suggestions:
|
Great, thanks I will try that, alternatively do a new install using mamba. |
Ok, I've run asv tests, generally look good
Everything else looks good so I guess it's jitters. |
Generally, DataFrame construction with copy=True doesnøt seem to be suppported in the asv tests. Below are the main differencers that I've noticed. >>> arr = np.random.randn(10_000_000, 2)
>>> %timeit df2 = pd.DataFrame(arr, copy=True)
13.5 ms ± 157 µs per loop # main
21.5 ms ± 435 µs per loop # this PR
>>> df.sum()
160 ms ± 469 µs per loop # main
13.1 ms ± 57.9 µs per loop # this PR I.e. faster aggregation, but slower construction |
Yeah sorry, running them agains main doesn't make much sense. You probably have to change the default to True and then run your change against that commit to get a better idea how things would change |
Hmm, setting the default This PR doesn't do anything that I could add a test that does |
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this. |
I saw some very weird things when switching this for CoW. So I am not convinced that we should do this right now. I'd be more on-board with doing this in 3.0. I don't think that setting copy=True manually is a very common use-case anyway. |
Guess there's not much appetite for this before 3.0 so will mothball this PR in the meantime |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.Performance example: