TST: Benchmarks for CSV writing for various index variations #39318

Merged
merged 4 commits into from
Jan 21, 2021
Conversation

DanielFEvans
Contributor

@DanielFEvans DanielFEvans commented Jan 21, 2021

Benchmarks derived from GH37484 (#37484). This captures a peculiar difference in timings between a DataFrame originating from df.set_index(...).head(...) and one from df.head(...).set_index(...).

  • tests added / passed
  • Ensure all linting tests pass, see here for how to run them
  • whatsnew entry

I've not included a whatsnew entry, as no section seems particularly relevant to internal-only test/benchmark changes. If it should go in, shout and I'll add a note somewhere.
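For anyone wanting to reproduce the comparison outside asv, the two construction orders described above can be sketched roughly as follows. This is only an illustration, not the benchmark code from this PR: the helpers `make_frame` and `time_to_csv`, the index column names, and the row counts are all assumptions chosen to keep the example small.

```python
import io
import time

import numpy as np
import pandas as pd

KEYS = ["index1", "index2", "index3"]  # assumed index column names
N_HEAD = 1_000                         # assumed size, kept small for a quick run


def make_frame(rows, cols=5, seed=0):
    """Build a frame with three would-be index columns and `cols` float columns."""
    rng = np.random.default_rng(seed)
    data = {
        "index1": rng.integers(0, rows, rows),
        "index2": np.full(rows, 1, dtype=int),
        "index3": np.full(rows, 1, dtype=int),
    }
    data.update({f"col{i}": rng.uniform(0, 100000.0, rows) for i in range(cols)})
    return pd.DataFrame(data)


def time_to_csv(df):
    """Return the wall-clock seconds taken by one to_csv call into memory."""
    buf = io.StringIO()
    start = time.perf_counter()
    df.to_csv(buf)
    return time.perf_counter() - start


base = make_frame(rows=10 * N_HEAD)

# Same rows, different construction order.
index_then_head = base.set_index(KEYS).head(N_HEAD)
head_then_index = base.head(N_HEAD).set_index(KEYS)

print("set_index then head:", time_to_csv(index_then_head))
print("head then set_index:", time_to_csv(head_then_index))
```

Both frames serialise to the same CSV text, so any timing gap comes from how the frame was built rather than from what is written. A single `perf_counter` measurement is noisy; asv repeats the call and reports a spread, which is why the PR adds proper benchmarks.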

@jreback added the Benchmark (Performance (ASV) benchmarks) and IO CSV (read_csv, to_csv) labels Jan 21, 2021
@jreback
Contributor

jreback commented Jan 21, 2021

can you show these on your PR (e.g. with asv dev), just to see the timings?

@DanielFEvans
Contributor Author

Yes, here's the benchmark output:

$ asv run -e -E existing -b ^io.csv.ToCSVIndexes
· Discovering benchmarks
· Running 3 total benchmarks (1 commits * 1 environments * 3 benchmarks)
[  0.00%] ·· Benchmarking existing-py_home_jbanorthwest.co.uk_danielevans_venvs_pandas_dev_py39_bin_python3.9
[ 16.67%] ··· Running (io.csv.ToCSVIndexes.time_head_of_multiindex--)...
[ 66.67%] ··· io.csv.ToCSVIndexes.time_head_of_multiindex                                                  2.04±0.04s
[ 83.33%] ··· io.csv.ToCSVIndexes.time_multiindex                                                            818±60ms
[100.00%] ··· io.csv.ToCSVIndexes.time_standard_index                                                        645±60ms
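To put the three timings above on a common scale, the short calculation below divides each central value by the standard-index timing. It is a back-of-the-envelope sketch only: it ignores the reported ± spread, and the dictionary keys are simply the benchmark names from the output.

```python
# Central values from the asv output above, converted to seconds.
timings_s = {
    "time_head_of_multiindex": 2.04,
    "time_multiindex": 0.818,
    "time_standard_index": 0.645,
}

baseline = timings_s["time_standard_index"]
ratios = {name: round(t / baseline, 2) for name, t in timings_s.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x the standard-index time")
# time_head_of_multiindex: 3.16x the standard-index time
# time_multiindex: 1.27x the standard-index time
# time_standard_index: 1.0x the standard-index time
```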

@jreback added this to the 1.3 milestone Jan 21, 2021
f"col{i}": np.random.uniform(0, 100000.0, ROWS) for i in range(COLS)
}
df = DataFrame({**index_cols, **data_cols})
self.df_standard_index = df.head(N_HEAD)
Contributor

the HEAD is a bit confusing here. can you not just make ROWS=N_HEAD to achieve the same?

Contributor Author

It's like that because the following two tests do take a head with a smaller number of rows, and I didn't want to duplicate the creation code. However, that's easily solved by extracting that into a function.

Contributor

oh it was fine, but ok on the change. ping when green.
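The refactor discussed in this thread (moving the frame creation into a shared function so each benchmark can pick its own row count) might look something like the sketch below. It is not the merged code: the class name `ToCSVIndexesSketch`, the helper `_create_df`, the attribute names, the index column names, the sizes, and the temporary file path are all assumptions. Only the three `time_*` method names are taken from the asv output earlier in the conversation, and which frame each one writes is inferred from those names.

```python
import os
import tempfile

import numpy as np
from pandas import DataFrame

INDEX_KEYS = ["index1", "index2", "index3"]  # assumed index column names


class ToCSVIndexesSketch:
    # Hypothetical output path; asv benchmarks normally write to a scratch file.
    fname = os.path.join(tempfile.gettempdir(), "__to_csv_indexes_sketch__.csv")

    @staticmethod
    def _create_df(rows, cols):
        """Shared creation code, so no benchmark needs a confusing extra head()."""
        index_cols = {
            "index1": np.random.randint(0, rows, rows),
            "index2": np.full(rows, 1, dtype=int),
            "index3": np.full(rows, 1, dtype=int),
        }
        data_cols = {
            f"col{i}": np.random.uniform(0, 100000.0, rows) for i in range(cols)
        }
        return DataFrame({**index_cols, **data_cols})

    def setup(self):
        rows, cols, multiplier = 10_000, 5, 10  # assumed sizes
        # Plain frame built directly at the target size.
        self.df_standard_index = self._create_df(rows, cols)
        # Larger frame cut down first, then indexed.
        self.df_head_then_multiindex = (
            self._create_df(rows * multiplier, cols).head(rows).set_index(INDEX_KEYS)
        )
        # Larger frame indexed first, then cut down.
        self.df_multiindex_then_head = (
            self._create_df(rows * multiplier, cols).set_index(INDEX_KEYS).head(rows)
        )

    def teardown(self):
        if os.path.exists(self.fname):
            os.remove(self.fname)

    def time_standard_index(self):
        self.df_standard_index.to_csv(self.fname)

    def time_multiindex(self):
        self.df_head_then_multiindex.to_csv(self.fname)

    def time_head_of_multiindex(self):
        self.df_multiindex_then_head.to_csv(self.fname)
```

asv runs `setup` before timing each `time_*` method and `teardown` afterwards, so frame construction stays out of the measured time.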

@pep8speaks

pep8speaks commented Jan 21, 2021

Hello @DanielFEvans! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-01-21 16:25:08 UTC

@jreback merged commit 612cd05 into pandas-dev:master Jan 21, 2021
@jreback
Contributor

jreback commented Jan 21, 2021

thanks @DanielFEvans

Labels
Benchmark (Performance (ASV) benchmarks), IO CSV (read_csv, to_csv)

3 participants