Skip to content

Collected parse_results*.txt as artifacts from CI run #1867

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 28, 2021

Conversation

ganeshgore
Copy link
Contributor

Description

This update collects parse_results*.txt from CI regression run, which can be analyzed locally or used to replace existing golden results.

Related Issue

Addressing issue #1866

Motivation and Context

"It's common to have to update QoR files; we should streamline the process. This would also avoid issues where the CI machines are slower or faster or have libraries that take more memory etc. than someone's server, by generating the new results in CI itself." - from issue #1866

To update golden results, you might be able to do the following

  1. Download *_golden_results artifacts and uncompress
  2. *run rsync --recursive downloaded_artifact/ local_task_directory/
  3. *run parse_vtr_task.py --create_golden
  4. Create a new commit with updated golden results files
    [*Remember to clean local run directory, parse_vtr_task collects results from the most recent run, and Step. 2 will update run001 by default]

How Has This Been Tested?

You can check the test CI run here
https://github.com/ganeshgore/vtr-verilog-to-routing/actions/runs/1317429846#artifacts

Types of changes

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation
  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed

@github-actions github-actions bot added the infra Project Infrastructure label Oct 7, 2021
@ganeshgore
Copy link
Contributor Author

This PR is also ready, let me know if anything else is expected.

@tangxifan
Copy link
Contributor

@ganeshgore Compared to previous PR, what are the new packages here?

image

@ganeshgore
Copy link
Contributor Author

everything named as *_golden_results

@tangxifan
Copy link
Contributor

@vaughnbetz I did look into the changes and also download the artifacts. They do include all the parse_results.txt files. I attached the screenshot so that everyone can find the right place to download easily.

I am not the expert on the CI system. If you and @mithro knows anyone who is willing to review the codes here. It will help a lot.

@mithro
Copy link
Contributor

mithro commented Oct 18, 2021

@umarcor & @acomodi - PTAL?

@mithro mithro requested a review from acomodi October 18, 2021 17:43
@@ -149,6 +149,13 @@ jobs:
vtr_flow/**/*.net
vtr_flow/**/*.r

- name: Upload golden results
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's worth creating an additional step. I would add the file to the list in the existing upload-artifact step.
When these artifacts are used for some specific purpose, then it'd be pertinent to evaluate whether to split it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The *_regression_log artifact was getting bulky and the GitHub artifact download has low bandwidth.
That is why I created a separate artifact for golden results, maybe we can move more frequently needed files (like vtr_flow/**/*.log) also to this artifact. Just for quicker download and debug.
But if you think it is best to combine all in regression_log I will merge those two runs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is that artifacts are cumbersome to deal with. The later the worse. Hence, if whatever needs to be done with this txt can be sorted out without uploading it as an artifact, that's desirable. In case uploading as an artifact is required, it'd be desirable to use it in a different step of the same workflow. Having it be an isolated artifact to be manually downloaded should be the worst case approach.

@umarcor
Copy link
Contributor

umarcor commented Oct 20, 2021

To update golden results, you might be able to do the following

  1. Download *_golden_results artifacts and uncompress
  2. *run rsync --recursive downloaded_artifact/ local_task_directory/
  3. *run parse_vtr_task.py --create_golden
  4. Create a new commit with updated golden results files
    [*Remember to clean local run directory, parse_vtr_task collects results from the most recent run, and Step. 2 will update run001 by default]

I think that this procedure should be part of this PR. So, instead of uploading the .txt as an artifact, do create a branch/PR when the golden results need to be updated. That is doable in CI with the credentials copied from the checkout action. See, for instance: https://github.com/hdl/containers/blob/main/.github/workflows/doc.yml#L61.

@ganeshgore
Copy link
Contributor Author

I am not sure how to know when

... the golden results need to be updated ...

My understanding was it depends on the user to update or not.

@umarcor
Copy link
Contributor

umarcor commented Oct 20, 2021

I am not sure how to know when

... the golden results need to be updated ...

My understanding was it depends on the user to update or not.

@ganeshgore, see #1866:

Right now when we fail a QoR check in CI due to an expected QoR change, the developer needs to rerun the test on his/her machine and run parse_vtr_task.py --create_golden to make a new expected QoR file (golden_results.txt).

We can simplify this by bringing back the parse_results*.txt files (currently parse_results.txt and parse_results2.txt is also used in some tests), and then the developer could just run parse_vtr_task.py --create_golden using those files.

@vaughnbetz
Copy link
Contributor

  1. If the CI artifacts are slow when they get big, then we can just upload the short output files (don't do .blif, .net, .p or .r files as they are the big ones). Even better if you can split into a separate artifact
  2. I'd like updating golden results to be fast after a failure, and still allow a person to look at the golden results he/she is about to put back and edit them if necessary. Downloading the golden_results won't do that -- those are the old results. We need to download the new parsed results and use them to create a new golden result (or have a script or command that creates the new golden results automatically from the CI artifacts).

@umarcor
Copy link
Contributor

umarcor commented Oct 20, 2021

I'd like updating golden results to be fast after a failure,

@vaughnbetz can you please be more explicit about this? Assume that the reader is not aware of the internals of VTR. What does it mean "after a failure"? Which tests/jobs do we need to monitor in order to decide that "this is a failure that needs to trigger a golden results update" vs "this is a legit failure of the job/workflow which does not require further action in regard to golden results".

@vaughnbetz
Copy link
Contributor

I don't think we can make this decision automatically. A QoR update happens when:

  1. Someone looks at specific failures and concludes the QoR bounds were too tight or lucky (mostly on small circuits) and hence updating them is OK.
  2. Someone has changed an algorithm, and expects better QoR or a new trade-off (maybe a new feature for moderately higher CPU, or maybe a CPU reduction that takes us outside the current QoR bounds, ...). After collecting results, discussing with others, etc. the trade-off is ruled a good one and the QoR data is updated as part of the PR that puts the new code in. Easiest to update this by simply making a PR, seeing what the QoR failures are (if any) and getting the CI artifact data to make the new golden results.

@umarcor
Copy link
Contributor

umarcor commented Oct 20, 2021

@vaughnbetz, thanks!

@ganeshgore, according to the last explanations, I take back my previous suggestion. It is ok to have this file as a separated artifact.

@ganeshgore
Copy link
Contributor Author

Thank you @vaughnbetz and @umarcor for your comments.
I was planning to make the following 2 artifacts for each regression test, *_results will be fairly small for quick evaluation (faster download), users can download *_run_files for more elaborate debugging. Does that sound good to you all?
The above procedure to update golden results after downloading the artifacts (*_results) will still work fine.

*_run_files artifacts

vtr_flow/**/*.out
vtr_flow/**/*.blif
vtr_flow/**/*.p
vtr_flow/**/*.net
vtr_flow/**/*.r

*_results artifacts

vtr_flow/**/*.log
vtr_flow/**/parse_results*.txt

@vaughnbetz
Copy link
Contributor

Makes sense to me; thanks.

@ganeshgore ganeshgore merged commit a6e200e into verilog-to-routing:master Oct 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
infra Project Infrastructure
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants