WEB: Update benchmarks page #61289

Merged: 5 commits, Apr 19, 2025

Changes from 3 commits.
33 changes: 9 additions & 24 deletions web/pandas/community/benchmarks.md
@@ -11,7 +11,7 @@ kinds of benchmarks relevant to pandas:

pandas benchmarks are implemented in the [asv_bench](https://github.com/pandas-dev/pandas/tree/main/asv_bench)
directory of our repository. The benchmarks are implemented for the
-[airspeed velocity](https://asv.readthedocs.io/en/v0.6.1/) (asv for short) framework.
+[airspeed velocity](https://asv.readthedocs.io/en/latest/) (asv for short) framework.

The benchmarks can be run locally by any pandas developer. This can be done
with the `asv run` command, and it can be useful to detect if local changes have
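For context on what such a local run looks like, here is a minimal sketch (the regex and the comparison target are illustrative, not prescribed by this page; see the contributing guide for the exact workflow):

```
# compare the checked-out branch against upstream/main,
# reporting benchmarks that change by more than 10%
asv continuous -f 1.1 upstream/main HEAD

# restrict the run to benchmarks whose names match a regex
asv continuous -f 1.1 upstream/main HEAD -b ^groupby
```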
@@ -22,37 +22,22 @@ More information on running the performance test suite is found
Note that benchmarks are not deterministic, and running in different hardware or
running in the same hardware with different levels of stress have a big impact in
the result. Even running the benchmarks with identical hardware and almost identical
-conditions produces significant differences when running the same exact code.
+conditions can produce significant differences when running the same exact code.

-## pandas benchmarks servers
+## Automated benchmark runners

-We currently have two physical servers running the benchmarks of pandas for every
-(or almost every) commit to the `main` branch. The servers run independently from
-each other. The original server has been running for a long time, and it is physically
-located with one of the pandas maintainers. The newer server is in a datacenter
+We currently have two setups running the benchmarks of pandas for every
+(or almost every) commit to the `main` branch. One is run on GitHub actions
+in the [asv-runner](https://github.com/pandas-dev/asv-runner/) repository.
+The other is a physical server in a datacenter
kindly sponsored by [OVHCloud](https://www.ovhcloud.com/). More information about
pandas sponsors, and how your company can support the development of pandas is
available at the [pandas sponsors]({{ base_url }}about/sponsors.html) page.

Results of the benchmarks are available at:

-- GitHub Actions results: [asv](https://pandas-dev.github.io/asv-runner/)
-- OVH server: [asv](https://pandas.pydata.org/benchmarks/asv/)
-
-### Original server configuration
-
-The machine can be configured with the Ansible playbook in
-[tomaugspurger/asv-runner](https://github.com/tomaugspurger/asv-runner).
-The results are published to another GitHub repository,
-[tomaugspurger/asv-collection](https://github.com/tomaugspurger/asv-collection).
-
-The benchmarks are scheduled by [Airflow](https://airflow.apache.org/).
-It has a dashboard for viewing and debugging the results.
-You’ll need to setup an SSH tunnel to view them:
-
-```
-ssh -L 8080:localhost:8080 [email protected]
-```
+- asv-runner results: [asv](https://pandas-dev.github.io/asv-runner/)
+- OVH server results: [asv](https://pandas.pydata.org/benchmarks/asv/)
@rhshadrach (Member, Author), Apr 14, 2025:

The last run in the OVH results is from February 10th.

Member:

I think it would be good to indicate which of these benchmarks is more up to date or preferred to reference.

@datapythonista (Member):

I can have a look, but I'm not planning to maintain that server myself; I'm happy to let you, @rhshadrach, decide on the direction of this. In general, a dedicated server should give more accurate results. I reached a point tuning the hardware where results were quite consistent even without running the benchmarks multiple times. But if GitHub Actions is good enough, the maintenance burden will probably be lower.

The changes here look great. Once the direction is clear, I think it'd be valuable to answer Matt's question, but from my side I'm happy to add that in a follow-up if it's not immediately clear what users should look at.

@rhshadrach (Member, Author):

@datapythonista - agreed, the physical server has more accurate results, in part due to the precise configuration you set up in https://github.com/pandas-dev/pandas-benchmarks. I plan to take a look and see how much of that we can implement in asv-runner on the GitHub Actions workers (but if you're interested in doing this, by all means!).

However, I do not see how to establish a team-based way of working with the server (compare: https://github.com/pandas-dev/asv-runner/issues), and I worry about maintenance and usability. Since I don't think anyone is maintaining or looking at these benchmarks, it would make sense to me to shut them down and remove them from the docs. Does that sound good?

@datapythonista (Member):

I don't think these configurations are possible in a virtual machine or Docker, so I don't think they are possible on GitHub Actions workers (unless we self-host them on a physical machine, which we did consider). That's the reason we have the physical server. You may be able to set some of these options, but I don't think they'll have any effect.

Also, there are some sources of noise in the CI. I don't think we are even guaranteed to run the job on the same hardware. I don't think it matters much in practice, but as an example, if GitHub Actions has an old data center with i5 cores and a newer one with i7, the CI may sometimes run slower on the i5 and sometimes faster on the i7. I think the differences aren't so dramatic, but I guess we don't always get the exact same hardware or OS configuration. Also, whatever is happening in the rest of the server (in other VMs) has an effect: if it's idle, memory access will be faster and CPU cache misses will be lower, so benchmarks run faster than when a lot is going on in the server. Funnily enough, empirical results showed that on very busy servers (like the CI workers), this kind of worst-case scenario is more consistent than a mostly idle environment (I personally find this quite counter-intuitive).

Do you mean removing just the part that creates issues, or all of the GitHub Actions benchmarking?

Personally, I think having early performance-regression information would be amazing, and I was happy to put some time into it when there were funds. But I never checked the benchmarks much myself, and I can't afford to put too much effort into maintaining and improving them as a volunteer. So I'm happy with whatever you decide: removing those issues if nobody is looking at them seems fine, and leaving them if they are already working and need no maintenance seems fine too.
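For readers unfamiliar with the kind of host-level tuning being discussed, on bare metal it is typically along these lines (an illustrative sketch; whether the pandas-benchmarks playbook uses exactly these knobs is an assumption on my part):

```
# pin the CPU frequency governor so clock speed doesn't scale with load
sudo cpupower frequency-set --governor performance

# disable turbo boost so short bursts don't skew timings
# (illustrative; this path applies to the intel_pstate driver)
echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo

# disable SMT so a benchmark never shares a physical core
echo off | sudo tee /sys/devices/system/cpu/smt/control
```

None of these settings are exposed inside a shared VM, which is the point made above about GitHub Actions workers.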

@datapythonista (Member), Apr 16, 2025:

Btw, I checked the OVH benchmarks server, and it seems it wasn't set up to start running the benchmarks at startup. When OVH temporarily shut down our servers at the end of our agreement, we stopped generating results on that server. I've restarted the benchmarks now. I forget the details of how it's handled, but I think the lost history should slowly be backfilled when there are no new commits to benchmark.
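As a purely hypothetical illustration of the missing piece (the server's actual mechanism is not described here and may be systemd, cron, or something else entirely), a crontab entry like this would relaunch the benchmark loop at boot:

```
# hypothetical: start the benchmark runner whenever the machine boots
@reboot /home/pandas/run_benchmarks.sh
```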

@rhshadrach (Member, Author):

I think there might be some confusion (due to imprecise wording on my part). My proposal is to remove all references to the OVH benchmarking setup from this page, since it is not going to be maintained.

@datapythonista (Member):

I'm fine with that if it's not helpful.


### OVH server configuration
