Speed up Router's calculation of OveruseInfo #1376
… been rerouted, since they should not contribute to the total number of overused nodes
Thanks Bill. The code looks good.
(Idea that won't work below: I looked closer and pres_cost is also updated, which would need to be done for all nodes. To avoid that we'd have to compute pres_cost on the fly from occupancy and pres_fac, so it's probably not worthwhile, as the possible gains are reduced and the code churn is increased.)
Right now that routine goes through all routing resource nodes. If only a few nets (or even better, connections) are rerouted, you could instead go through only the nets that were rerouted, but you'd need to update the costs for all nodes in either the old routing of those nets or the new routings. If more than a few nets are rerouted then just going through all rr_nodes is probably faster though.
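To make the trade-off concrete, here is a minimal sketch of incremental overuse tracking. It is not VPR's actual code: `NodeState`, `OveruseTracker`, and all member names are hypothetical, and a route is reduced to a plain list of node indices. The overused count is adjusted only when a touched node crosses its capacity boundary, so the work is proportional to the nodes in the old and new routings rather than to all rr_nodes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for per-node routing state.
struct NodeState {
    int occupancy = 0;
    int capacity = 1;
};

// Hypothetical tracker that keeps the overused-node count up to date
// incrementally instead of rescanning every node.
struct OveruseTracker {
    std::vector<NodeState> nodes;
    std::size_t overused_count = 0;

    explicit OveruseTracker(std::size_t num_nodes) : nodes(num_nodes) {}

    // Apply an occupancy change to one node and adjust the count only
    // if the node crosses the capacity boundary.
    void change_occupancy(std::size_t node, int delta) {
        NodeState& s = nodes[node];
        const bool was_over = s.occupancy > s.capacity;
        s.occupancy += delta;
        const bool is_over = s.occupancy > s.capacity;
        if (was_over && !is_over) --overused_count;
        if (!was_over && is_over) ++overused_count;
    }

    // Remove a net's old routing: touches only the nodes it used.
    void rip_up(const std::vector<std::size_t>& old_route) {
        for (std::size_t n : old_route) change_occupancy(n, -1);
    }

    // Add a net's new routing: touches only the nodes it now uses.
    void commit(const std::vector<std::size_t>& new_route) {
        for (std::size_t n : new_route) change_occupancy(n, +1);
    }

    // Reference implementation: full scan over all nodes.
    std::size_t full_recount() const {
        std::size_t count = 0;
        for (const NodeState& s : nodes) {
            if (s.occupancy > s.capacity) ++count;
        }
        return count;
    }
};
```

The full scan is kept as a cross-check. As noted above, once many nets are rerouted in an iteration, the full scan may well be the faster option.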
I was thinking more about #2 above and I think it is worth investigating. It would require:
Nowadays bringing more data in from storage is usually more expensive than doing a small amount of computation, so shrinking rr_node_route_inf with this on-the-fly computation may itself be a good idea (or at least not a very bad idea). And that refactoring allows pathfinder_update_cost to become incremental in the same way as you made the overused_info computation. This should all be done in another issue / PR though -- once you have the QoR data for this PR you can merge and close it.
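As a rough illustration of computing pres_cost on the fly, the sketch below derives it from occupancy, capacity, and pres_fac. The helper name `present_cost` is hypothetical, and the formula is an assumption based on the common PathFinder-style present-congestion cost; it should be checked against the VPR sources before relying on it.

```cpp
// Hypothetical helper: derive the present congestion cost on demand
// instead of storing it per node. ASSUMPTION: the formula follows the
// usual PathFinder-style form; it is not copied from the VPR sources.
inline float present_cost(int occupancy, int capacity, float pres_fac) {
    // While the node can still take another user without overuse,
    // there is no congestion penalty.
    if (occupancy < capacity) {
        return 1.0f;
    }
    // Otherwise penalize in proportion to the overuse that adding one
    // more user would cause.
    return 1.0f + static_cast<float>(occupancy + 1 - capacity) * pres_fac;
}
```

With a function like this, changing pres_fac between iterations would not require rewriting a stored field on every node. The cost is a few arithmetic operations per lookup.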
I agree, getting QoR data for each change individually, and merging them as separate PRs seems like the right approach.
@vaughnbetz After some investigation, I found that pres_cost is referenced in many other places, so it might not be possible to eliminate this field entirely (it would probably require a lot of refactoring). But it does seem to me that there are a great many updates to this field, and some of them are probably unnecessary. What I think I'll try is to pick out all the seemingly unnecessary ones and comment them out to see how that affects the results.
Titan benchmarks (still running): https://drive.google.com/file/d/1cOlwYne3owPUApwVj3e4ihL9rGn8wpeo/view?usp=sharing
QoR data is looking good so far; looks like a 3% speedup or so.
Motivation and Context
This PR tries to speed up the VPR router by detecting its performance bottlenecks and removing them through optimization. It should not change any of the VPR router's functionality.
This PR is still undergoing changes. I will push more commits if I manage to optimize other areas of the code.
What Has Already Been Optimized?
Types of changes
Checklist: