Skip to content

Add a new API set_node_coordinates() to RRGraphBuilder #1853

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Sep 21, 2021
Merged

Add a new API set_node_coordinates() to RRGraphBuilder #1853

merged 5 commits into from
Sep 21, 2021

Conversation

hamzakhan-rs
Copy link
Contributor

Description

This PR focuses on updating routing resource graph builder functions, where we use the refactored data structure RRGraphBuilder to shadow the discreted data structure rr_graph_storage.
This PR aims to fully deprecate the direct use of the legacy API set_coordinates() from rr_node data structure.

After this PR, the set_coordinates API is fully deprecated and the set_node_coordinates from the refactored data structure RRGraphBuilder is the only way to use.

Checklist:

  • Removed the legacy API set_coordinates() from rr_node.cpp and rr_node.h
  • Added new APIs set_node_coordinates to data structures RRGraphBuilder, whose comments are Doxygen compatible
  • Replaced all the use of set_coordinates() in builder functions by set_node_coordinates()

Related Issue

This pull request is a follow-up PR on the routing resource graph refactoring effort #1805

Motivation and Context

This PR is the continuation to the refactoring effort with a focus on shadowing the rr_graph_storage APIs in RRGraphBuilder data structure.
This PR reworks the set_node_coordinates() API among the other APIs in #1847

How Has This Been Tested?

  • All vtr basic and strong tests are passing.

Types of changes

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation
  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed

@github-actions github-actions bot added the VPR VPR FPGA Placement & Routing Tool label Sep 17, 2021
Copy link
Contributor

@vaughnbetz vaughnbetz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://docs.verilogtorouting.org/en/latest/dev/developing/#evaluating-quality-of-result-qor-changes for how to evaluate a change like this that might affect cpu time. Please post the relevant spreadsheets (vtr benchmarks and titan benchmarks) showing no quality degradation; the key metrics to watch will be the routing times (min_channel_width_route_time and crit_path_route_time) and placement time.

I don't think this routine is called enough to affect cpu time; if in doubt, it can be inlined.

@@ -55,6 +55,9 @@ class RRGraphBuilder {
/** @brief Clear all the underlying data storage */
void clear();

/** @brief Set the node coordinate */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably isn't too cpu critical so can leave here. But should get a QoR spreadsheet to check we haven't slowed things down noticably; if there's a slowdown, marking this function as inline and adding the function body to the header should recapture the speed. For very short functions like this, inline works well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have shared the QoR sheets. Please let me know if this needs to be inlined after seenig QoR comparisons.

@tangxifan
Copy link
Contributor

@vaughnbetz This is the PR from my team at Rapidsilicon. They will keep contributing PR during our refactoring effort. Can you add @hamzakhan-rs as a contributor for now or trigger the kokoro manually? Many Thanks.

@vaughnbetz
Copy link
Contributor

Invitation sent.

@tangxifan
Copy link
Contributor

@hamzakhan-rs

  • Please accept the invitation from Vaughn so that we can trigger the kokoro tests.
  • Please address the comments as earliest as you can.

If you need help, feel free to ask.

@hamzakhan-rs
Copy link
Contributor Author

hamzakhan-rs commented Sep 21, 2021

See https://docs.verilogtorouting.org/en/latest/dev/developing/#evaluating-quality-of-result-qor-changes for how to evaluate a change like this that might affect cpu time. Please post the relevant spreadsheets (vtr benchmarks and titan benchmarks) showing no quality degradation; the key metrics to watch will be the routing times (min_channel_width_route_time and crit_path_route_time) and placement time.

I don't think this routine is called enough to affect cpu time; if in doubt, it can be inlined.

I am attaching the spreadsheets of comparison between baseline branch https://github.com/verilog-to-routing/vtr-verilog-to-routing/tree/test_sanity_basic and this PR for three different benchmark tests.

  1. Vtr_reg_basic_sanity_timing results
  2. vtr_reg_qor_chain_depop results
  3. titan_quick_qor

Please let me know this is acceptable tolerance in key metrics.

  1. https://docs.google.com/spreadsheets/d/1Hq37pwNZ8gIxVCaa63MwzI61kyljZDC0/edit?usp=sharing&ouid=111481362271934896678&rtpof=true&sd=true,
  2. https://docs.google.com/spreadsheets/d/1S1564DysaKgV1ftZtdP1Lrt6JRVqdI_J/edit?usp=sharing&ouid=111481362271934896678&rtpof=true&sd=true,
  3. https://docs.google.com/spreadsheets/d/1jWv6Heo3SwA7tXB_DeWng3m_KuMHRPXW/edit?usp=sharing&ouid=111481362271934896678&rtpof=true&sd=true

@hamzakhan-rs
Copy link
Contributor Author

hamzakhan-rs commented Sep 21, 2021

@hamzakhan-rs

  • Please accept the invitation from Vaughn so that we can trigger the kokoro tests.
  • Please address the comments as earliest as you can.

If you need help, feel free to ask.

I have accepted the invitation and joined this repo. I have added the label kokoro:force-run @tangxifan. Can you confirm it is triggered correctly?

@tangxifan
Copy link
Contributor

@hamzakhan-rs Thanks for the help. Kokoro is running well now.

Regarding the QoR check,

  • What we care most is the total flow runtime, critical path delay, max_vpr_mem. If you can summarize what you see on these metrics from the spreadsheet, it will be very helpful.
  • I see that the basic sanity test has 20% increase in flow runtime on average (most of them are on packer, where no code changes are introduced in this PR), while for VtR and Titan tests, I see -5% and +1% difference in flow runtime on average. I am not sure it is due to the workload change of the server or not.

@vaughnbetz
Copy link
Contributor

The basic_sanity designs are really small, and subject to high noise in ratios. So I wouldn't worry about them.
The titan and vtr designs show pretty close QoR data. Some of the later (in the table) Titan designs show a routing slowdown, but I suspect that's due to machine load as it doesn't make sense some slow down and some don't (trying to keep machine load consistent through QoR runs as much as you can is good).

A few designs get different placements, which is odd but has been happening in other PRs too. I suspect we have some behaviour (unrelated to this PR) where someone is making a decision that is address dependent (sorting or hashing on pointers) so we're getting different results with no significant change, but again I do not believe that is related to this PR.

So I think this is fine, but I agree with @tangxifan that it is good if you summarize the key QoR data / conclusions you see in the PR as well as it makes it easier to review.

@hamzakhan-rs
Copy link
Contributor Author

@hamzakhan-rs Thanks for the help. Kokoro is running well now.

Regarding the QoR check,

  • What we care most is the total flow runtime, critical path delay, max_vpr_mem. If you can summarize what you see on these metrics from the spreadsheet, it will be very helpful.
  • I see that the basic sanity test has 20% increase in flow runtime on average (most of them are on packer, where no code changes are introduced in this PR), while for VtR and Titan tests, I see -5% and +1% difference in flow runtime on average. I am not sure it is due to the workload change of the server or not.

I had run the more than one regression tests in parallel and some other people are also working on the same server so this slight irregularity might be due to the server workload change. In future, I will try to run the tests at time of lesser/even workload and one regression at a time so it dosent affect results too much due to load changes of server. Also I will summarize/conclude the metrics of each QoR results first thing in the morning.

@vaughnbetz vaughnbetz merged commit 616b31f into verilog-to-routing:master Sep 21, 2021
@tangxifan tangxifan deleted the set_node_coordinates_api branch September 21, 2021 21:31
@hamzakhan-rs
Copy link
Contributor Author

I am attaching the spreadsheets of comparison between baseline branch https://github.com/verilog-to-routing/vtr-verilog-to-routing/tree/test_sanity_basic and this PR for three different benchmark tests.

  1. Vtr_reg_basic_sanity_timing results
  2. vtr_reg_qor_chain_depop results
  3. titan_quick_qor

Please let me know this is acceptable tolerance in key metrics.

  1. https://docs.google.com/spreadsheets/d/1Hq37pwNZ8gIxVCaa63MwzI61kyljZDC0/edit?usp=sharing&ouid=111481362271934896678&rtpof=true&sd=true,
  2. https://docs.google.com/spreadsheets/d/1S1564DysaKgV1ftZtdP1Lrt6JRVqdI_J/edit?usp=sharing&ouid=111481362271934896678&rtpof=true&sd=true,
  3. https://docs.google.com/spreadsheets/d/1jWv6Heo3SwA7tXB_DeWng3m_KuMHRPXW/edit?usp=sharing&ouid=111481362271934896678&rtpof=true&sd=true

Here is a summary of the QoR results mentiond above. I will sumarize in this form for future PR as well.

Basic sanity test
Flow run time +22% on average.
No changes on critical path delay
Peak memory usage +4.7% on average

vtr_reg_qor_chain_depop test
Flow run time -4.7% on average.
critical path delay -0.13%
Peak memory usage -3% on average

titan_quick_qor test:
flow run time +1.2% on average.
No changes on critical path delay
Peak memory usage -1.7% on average

@tangxifan
Copy link
Contributor

@hamzakhan-rs Thanks. This is a good template to follow (reuse the template in next pull requests).

This PR has been merged. Good work.

@vaughnbetz
Copy link
Contributor

Thanks @hamzakhan-rs . I agree with @tangxifan : this is a good template to follow in terms of how to report QoR.

There is a small route time increase on the titan and vtr designs. It may be noise, but I think it's worth making a small change: inline this function (move the function body to the header and put the keyword inline in the function header) to see if it gets us a little bit of speed. If it does, it would be good to merge that change in a new PR.
You don't need to run the basic_sanity_test for qor; it's too small to be meaningful. The 2nd and 3rd runs above are enough, and if the machine loading can be kept as consistent as possible when running them it is best for cpu time comparisons.

@mithro
Copy link
Contributor

mithro commented Sep 22, 2021

@tangxifan -- It would be awesome if we could figure out a way to automatically generate your template in pull requests.

The team at @antmicro set this up for sv-tests, see an example @ chipsalliance/sv-tests#1719 (comment)

If we knew what the commands needed to be run to create them, then @antmicro could make that happen....

@tangxifan
Copy link
Contributor

@tangxifan -- It would be awesome if we could figure out a way to automatically generate your template in pull requests.

The team at @antmicro set this up for sv-tests, see an example @ chipsalliance/sv-tests#1719 (comment)

If we knew what the commands needed to be run to create them, then @antmicro could make that happen....

This is really useful. @hamzakhan-rs Can you look into it? It will make our life easier.

@hamzakhan-rs
Copy link
Contributor Author

Thanks @hamzakhan-rs . I agree with @tangxifan : this is a good template to follow in terms of how to report QoR.

There is a small route time increase on the titan and vtr designs. It may be noise, but I think it's worth making a small change: inline this function (move the function body to the header and put the keyword inline in the function header) to see if it gets us a little bit of speed. If it does, it would be good to merge that change in a new PR.
You don't need to run the basic_sanity_test for qor; it's too small to be meaningful. The 2nd and 3rd runs above are enough, and if the machine loading can be kept as consistent as possible when running them it is best for cpu time comparisons.

Thanks for the guidance. I will Inline this function and check if the route time and other metrics improve. If so, I will create the new PR with inline function.

@hamzakhan-rs
Copy link
Contributor Author

@tangxifan -- It would be awesome if we could figure out a way to automatically generate your template in pull requests.

The team at @antmicro set this up for sv-tests, see an example @ chipsalliance/sv-tests#1719 (comment)

If we knew what the commands needed to be run to create them, then @antmicro could make that happen....

@mithro https://docs.verilogtorouting.org/en/latest/dev/developing/#automated-qor-comparison-script this is the command needed to run to create a comparison xlsx sheet. qor_compare.py is in scripts folder https://github.com/verilog-to-routing/vtr-verilog-to-routing/blob/master/vtr_flow/scripts/qor_compare.py and parse_results1.txt is the baseline results and parse_results2.txt is the PR results. PR creater then need to upload both thses results txt files for all QoR tests.

@mithro
Copy link
Contributor

mithro commented Sep 23, 2021

@acomodi / @kgugala - Is this something that the people who did the sv-tests stuff make happen without to much effort?

@tgorochowik
Copy link
Contributor

@mithro some effort is needed to set this up, but once you know how to approach this it is relatively simple (initial testing is cumbersome, but nothing that can't be done).

Namely, it needs to be split into at least two workflows:

@hamzakhan-rs @tangxifan please let us know if this is enough for you to reproduce this here. We can also help you set up the comment posting mechanism, but it would be nice if you could get this to at least generate the desired .md file (or if the report is already visible in the CI logs somewhere, then just point to it directly so we know what needs to be uploaded)

@tangxifan
Copy link
Contributor

Link to #1868 So that we keep tracking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
VPR VPR FPGA Placement & Routing Tool
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants