TST: Benchmarks for CSV writing for various index variations #39318
Benchmarks derived from GH37484. This captures a peculiar difference in timings between a DataFrame originating from `df.set_index(...).head(...)` and one from `df.head(...).set_index(...)`.
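A minimal sketch of the two construction orders, for readers who want to reproduce the comparison locally. The column names and sizes here are illustrative assumptions, not the values used in the benchmark. Both frames hold the same rows and write the same CSV text; one observable difference is that slicing a MultiIndex keeps its unused levels, though this sketch does not establish whether that accounts for the timing gap.

```python
import numpy as np
import pandas as pd

ROWS, N_HEAD = 10_000, 100  # illustrative sizes (assumption, not the PR's values)

# Hypothetical column names, chosen only for this example.
df = pd.DataFrame(
    {
        "key_a": np.arange(ROWS),
        "key_b": np.arange(ROWS) % 7,
        "value": np.random.uniform(0, 100000.0, ROWS),
    }
)

# Same rows, two construction orders.
index_then_head = df.set_index(["key_a", "key_b"]).head(N_HEAD)
head_then_index = df.head(N_HEAD).set_index(["key_a", "key_b"])

# With no path argument, to_csv returns the CSV text as a string.
same_csv = index_then_head.to_csv() == head_then_index.to_csv()

# head() on a MultiIndex slices the codes but keeps the original levels,
# so the first frame still carries levels built from all ROWS rows.
levels_index_then_head = len(index_then_head.index.levels[0])
levels_head_then_index = len(head_then_index.index.levels[0])
```

Timing `to_csv` on each frame (for example with `timeit`) is then a direct way to check the difference on a given pandas version.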
can you show these in your PR (e.g.
Yes, here's the benchmark output:
asv_bench/benchmarks/io/csv.py
f"col{i}": np.random.uniform(0, 100000.0, ROWS) for i in range(COLS)
}
df = DataFrame({**index_cols, **data_cols})
self.df_standard_index = df.head(N_HEAD)
the HEAD is a bit confusing here. can you not just make ROWS=N_HEAD to achieve the same?
It's like that because the following two tests do take a head with a smaller number of rows, and I didn't want to duplicate the creation code. However, that's easily solved by extracting that into a function.
oh it was fine, but ok on the change. ping when green.
Hello @DanielFEvans! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2021-01-21 16:25:08 UTC
thanks @DanielFEvans
I've not included a whatsnew entry, as no section seems particularly relevant to internal-only test/benchmark changes. If it should go in, shout and I'll add a note somewhere.